Date: 2024-10-23

Key Takeaways

SynthID helps identify content generated by Google Gemini and potentially other AI models.

Content misattributed to human writers has been on the rise since powerful, modern LLMs rendered the Turing Test obsolete. On top of similar tools for identifying AI-generated images, music, and video, Google's DeepMind AI research subsidiary just released the beta version of SynthID, a method of watermarking and identifying text created using the Gemini model. Even better, DeepMind has open-sourced the tool, so other AI companies can use it to keep track of what their models create.

Keeping track of the robots

One imperceptible signature at a time

Source: Google DeepMind

DeepMind's earlier techniques embed watermarks in images, video, and audio that human eyes and ears can't detect. To sign LLM-generated text, researchers developed something a little different for SynthID.

It works by altering the model's probabilistic output, that is, by very slightly changing which words the model is likely to choose at each point in a passage. Based on the difference between the LLM's ordinary predicted word output and what the modified algorithm produces, the tool can reliably identify whether the content was written by Google Gemini.
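To make the idea concrete, here is a minimal sketch of probability-nudging watermarks in Python. It is not DeepMind's actual algorithm: it swaps in a much simpler keyed-bias scheme in which a secret key assigns each (previous word, candidate word) pair a pseudo-random bit, generation slightly favors words whose bit is 1, and detection counts how many such words a passage contains. The toy vocabulary, the uniform stand-in for a language model, the `bias` value, and every function name are assumptions made for illustration.

```python
import hashlib
import math
import random

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = [f"w{i}" for i in range(50)]
START = "<s>"  # hypothetical start-of-text marker


def g_value(key, prev_token, token):
    """Return a keyed pseudo-random bit (0 or 1) for a (context, token) pair."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1


def toy_model_probs(prev_token):
    """Stand-in for an LLM's next-token distribution (uniform, for simplicity)."""
    return [1.0 / len(VOCAB)] * len(VOCAB)


def generate(n_tokens, rng, key=None, bias=2.0):
    """Sample n_tokens; with a key, nudge sampling toward tokens whose bit is 1."""
    tokens = []
    prev = START
    for _ in range(n_tokens):
        weights = toy_model_probs(prev)
        if key is not None:
            # Multiply the weight of each "bit = 1" token by e**bias.
            weights = [
                w * math.exp(bias * g_value(key, prev, tok))
                for w, tok in zip(weights, VOCAB)
            ]
        prev = rng.choices(VOCAB, weights=weights, k=1)[0]
        tokens.append(prev)
    return tokens


def detect_z_score(tokens, key):
    """Z-score of the count of "bit = 1" tokens against the 50% chance baseline."""
    hits = 0
    prev = START
    for tok in tokens:
        hits += g_value(key, prev, tok)
        prev = tok
    n = len(tokens)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


rng = random.Random(0)
marked = generate(300, rng, key="secret")  # watermarked passage
plain = generate(300, rng)                 # unwatermarked passage
print("watermarked z-score:", detect_z_score(marked, "secret"))
print("plain z-score:", detect_z_score(plain, "secret"))
```

In this toy setup the watermarked passage should score far above the chance baseline while the plain passage stays near zero, and checking with the wrong key should reveal nothing. A larger `bias` makes detection easier but distorts the model's word choices more, which is the trade-off a real scheme has to balance.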

Of course, it can't alter the content too much, or the model's raw language abilities suffer. To make sure SynthID didn't go too far, researchers put it through a massive test. They pushed roughly 20 million Gemini-generated passages to users, some with and some without watermarks.

The results indicated that users found watermarked and unwatermarked text equally accurate and useful, or in essence, indistinguishable. The watermarking also didn't affect the LLM's speed in any noticeable way.

Pushing for industry-wide support

DeepMind isn't stopping at labeling only Google Gemini output. SynthID watermarking and detection have already been open-sourced and offered to developers of other AI models, to encourage their adoption across today's many competing LLMs.

Like every AI detection tool (and many encryption-breaking techniques, for that matter), SynthID could give unscrupulous AI developers another means of practicing how to obfuscate a text's LLM origins. Still, because DeepMind is a subsidiary of the world leader in data collection (that is, Google), its involvement means there are significant resources behind ensuring AI content is readily identifiable as such.