Date: 2024-05-14

Google has announced that it’s expanding the AI watermarks applied to content produced via Google Gemini and Veo to cover text and video, after originally launching the ‘SynthID’ watermarking system for images and audio.

It’s part of a whole song and dance Google’s doing around making ethical AI (just skip over the bit where it butchers websites via Search for AI-generated results).

The company has been quick to jump on education and awareness of the tech, such as with yesterday’s announcement of LearnLM, but in being such a big player, Google also has a responsibility to be ethical.

Image: Google

“SynthID isn’t a silver bullet for identifying AI-generated content, but is an important building block for developing more reliable AI identification tools and can help millions of people make informed decisions about how they interact with AI-generated content.

“Later this summer, we’re planning to open-source SynthID for text watermarking, so developers can build with this technology and incorporate it into their models,” Google said in a blog post.

The way Google is watermarking video is easy to explain: it incorporates an underlying key in each frame of the video, which a suitable detection system can identify as a sign that the video was artificially generated. It’s the same process that the company uses for AI-made images, and it allows online platforms to flag when a piece of content has been spat out of a generator.

It’s not impervious to misuse. In theory, nothing stops a bad actor with a sufficiently powerful tool from removing the watermark, but it’s a welcome move regardless.
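Google hasn’t published the details of how its video watermark works, so the following is only a toy illustration of the general idea of a keyed, per-frame signal, not SynthID itself. It assumes a frame is a flat list of 0 to 255 pixel values and uses a classic keyed-noise approach: a secret key plus the frame number seeds a pattern of tiny nudges, and a detector holding the same key checks whether the frame correlates with that pattern. The names `key_pattern`, `embed`, `detect` and the `strength` parameter are all made up for this sketch.

```python
import random

def key_pattern(key, frame_index, size):
    """Pseudo-random +1/-1 pattern derived from a secret key and the frame index."""
    rng = random.Random(f"{key}:{frame_index}")
    return [rng.choice((-1, 1)) for _ in range(size)]

def embed(frame, key, frame_index, strength=3):
    """Nudge each pixel slightly up or down, following the keyed pattern."""
    pattern = key_pattern(key, frame_index, len(frame))
    return [min(255, max(0, p + strength * s)) for p, s in zip(frame, pattern)]

def detect(frame, key, frame_index):
    """Correlate the frame with the keyed pattern.

    Frames marked with this key score close to `strength`;
    unmarked frames, or frames marked with another key, score close to 0.
    """
    pattern = key_pattern(key, frame_index, len(frame))
    mean = sum(frame) / len(frame)
    return sum((p - mean) * s for p, s in zip(frame, pattern)) / len(frame)

if __name__ == "__main__":
    rng = random.Random(0)
    frame = [rng.randint(64, 191) for _ in range(20000)]  # stand-in for real pixels
    marked = embed(frame, "secret", 0)
    print("unmarked score:", detect(frame, "secret", 0))
    print("marked score:  ", detect(marked, "secret", 0))
```

A scheme this simple also shows why removal is possible in principle: anything that scrambles the small per-pixel nudges, such as heavy re-encoding or added noise, weakens the correlation the detector relies on.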

The way text is watermarked is, on paper, similar. “SynthID is designed to embed imperceptible watermarks directly into the text generation process. It does this by introducing additional information in the token distribution at the point of generation by modulating the likelihood of tokens being generated — all without compromising the quality, accuracy, creativity or speed of the text generation.”

Translation: Google is saying that the word choices in a generated result carry hidden scores that can indicate whether it was AI-generated. Google, however, notes there are limitations to this.
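To make “modulating the likelihood of tokens” a little more concrete, here is a minimal sketch of one simple way a keyed bias can work. It is not Google’s algorithm, and every name in it (`VOCAB`, `score`, `generate`, `detect`, the `boost` factor) is an assumption made for illustration. A secret key and the previous token give every candidate token a pseudo-random score of 0 or 1; the generator quietly favours score-1 tokens, and a detector with the same key checks whether a text contains more of them than chance would predict. The “model” here is just a uniform pick from a toy vocabulary.

```python
import hashlib
import random

# Toy vocabulary standing in for a real model's token set (assumption).
VOCAB = [f"tok{i}" for i in range(50)]

def score(key, prev_token, token):
    """Keyed pseudo-random 0/1 score for a token, given the previous token."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1

def generate(length, rng, key=None, boost=4.0):
    """Sample tokens from a uniform toy 'model'.

    If a key is given, tokens whose keyed score is 1 are made `boost`
    times more likely, which is the watermark.
    """
    tokens = ["<s>"]
    for _ in range(length):
        weights = [
            boost if key is not None and score(key, tokens[-1], t) else 1.0
            for t in VOCAB
        ]
        tokens.append(rng.choices(VOCAB, weights=weights)[0])
    return tokens[1:]

def detect(tokens, key):
    """Mean keyed score of a non-empty text: about 0.5 if unmarked, higher if marked."""
    prev = ["<s>"] + tokens[:-1]
    return sum(score(key, p, t) for p, t in zip(prev, tokens)) / len(tokens)

if __name__ == "__main__":
    rng = random.Random(0)
    print("marked:  ", detect(generate(1000, rng, key="secret"), "secret"))
    print("unmarked:", detect(generate(1000, rng), "secret"))
```

Because the signal lives in the choice between plausible tokens, it fades when there are few plausible choices to pick from, which is the limitation Google goes on to describe.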

“SynthID text watermarking is less effective on responses to factual prompts because there are fewer opportunities to adjust the token distribution without affecting the factual accuracy. This includes prompts like ‘What is the capital of France?’ or queries where little or no variation is expected like ‘recite a William Wordsworth poem’,” Google added.
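A bit of arithmetic shows why factual prompts leave so little room. The sketch below is an illustration under stated assumptions, not SynthID’s method: it assumes a watermark that multiplies the probability of “score-1” tokens by a hypothetical `boost` factor, then measures how much the expected score rises above the unmarked baseline, averaged over every possible 0/1 scoring of a four-token vocabulary. The two probability lists are invented: one where four tokens are equally plausible, and one where a single answer (think “Paris”) holds 97% of the probability.

```python
from itertools import product

def expected_score(probs, scores, boost=4.0):
    """Expected 0/1 score of one sampled token after upweighting score-1 tokens."""
    weights = [p * (boost if s else 1.0) for p, s in zip(probs, scores)]
    return sum(w for w, s in zip(weights, scores) if s) / sum(weights)

def average_gain(probs, boost=4.0):
    """Average lift in expected score over the unmarked baseline,
    taken across every possible 0/1 score assignment for the vocabulary."""
    gains = []
    for scores in product((0, 1), repeat=len(probs)):
        baseline = sum(p for p, s in zip(probs, scores) if s)
        gains.append(expected_score(probs, scores, boost) - baseline)
    return sum(gains) / len(gains)

open_ended = [0.25, 0.25, 0.25, 0.25]  # many equally plausible next tokens
factual = [0.97, 0.01, 0.01, 0.01]     # one overwhelmingly likely answer

if __name__ == "__main__":
    print("open-ended gain:", average_gain(open_ended))
    print("factual gain:   ", average_gain(factual))
```

With these toy numbers the detectable lift for the factual case is a small fraction of the open-ended one. Pushing it higher would mean pulling probability away from the one correct answer, which is exactly the trade-off against accuracy that Google describes.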

“SynthID works effectively on its own, but it can also be combined with other AI detection approaches to give better coverage across content types and platforms. While this technique isn’t built to directly stop motivated adversaries like cyber attackers or hackers from causing harm, it can make it harder to use AI-generated content for malicious purposes.”

When it comes to deepfakes, one of the things companies should be doing, be they AI builders or online platforms, is introducing safeguards that prevent malicious use. Deepfakes could be misused to sway an election, or to instigate violence against a person or a group of people.

And that’s why watermarking content is so important.
