
PerTh Multimodal: Enterprise Multimodal Watermarking for Audio, Video, Image, and Text

Blog post from Resemble AI

Post Details
Company
Date Published
Author
-
Word Count: 2,402
Company Posts That Month: 11
Language: English
Hacker News Points: -
Summary

PerTh Multimodal is a watermarking solution that addresses the growing need for robust, imperceptible watermarks across audio, video, image, and text, driven in part by regulatory requirements such as the EU AI Act. Originally launched as an audio-only watermarker, PerTh has evolved into a multimodal platform that achieves near-perfect accuracy in media watermark recovery. It embeds tamper-resistant signatures using modality-specific encoding: psychoacoustic methods for audio, pixel-level modifications for images and video, and linguistic rewriting for text. The new version addresses limitations of the original PerTh audio model through a novel training attack system and curriculum design, which incorporate diverse data augmentations to ensure robustness against real-world transformations. ONNX support enables edge deployment and supports interoperability and compliance with legal mandates, while detailed threshold settings let users balance detection rates against false positives. The platform also complements metadata solutions like C2PA for dual-layer provenance, ensuring content integrity even after re-encoding or format changes. Despite these advances, PerTh's explicit watermarking model requires further evaluation on non-speech audio and in real-world settings before its potential and limitations are fully understood.
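The threshold trade-off mentioned in the summary can be illustrated with a minimal sketch. This is not PerTh's API: the function name `rates_at_threshold` and the confidence scores are hypothetical, chosen only to show how raising a detection threshold lowers false positives at the cost of missed watermarks.

```python
def rates_at_threshold(watermarked_scores, clean_scores, threshold):
    """Return (detection_rate, false_positive_rate) for a score threshold.

    A sample is flagged as watermarked when its score is >= threshold.
    """
    detected = sum(s >= threshold for s in watermarked_scores)
    false_pos = sum(s >= threshold for s in clean_scores)
    return detected / len(watermarked_scores), false_pos / len(clean_scores)


# Hypothetical detector confidence scores in [0, 1]; not real PerTh output.
watermarked = [0.97, 0.97, 0.88, 0.74, 0.61]  # content that carries a watermark
clean = [0.05, 0.12, 0.33, 0.48, 0.66]        # content that does not

for threshold in (0.5, 0.7, 0.9):
    detection, false_positive = rates_at_threshold(watermarked, clean, threshold)
    print(f"threshold={threshold:.1f} "
          f"detection={detection:.2f} false_positive={false_positive:.2f}")
```

On this toy data, the lowest threshold catches every watermarked sample but also flags one clean sample, while the highest threshold flags no clean samples but misses most watermarked ones. A real deployment would pick the operating point from measured score distributions.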

Trends Found in this Post
Trend     | Post Mentions | Total Month Mentions | Posts | Companies | MoM
Real-time | 2             | 5,457                | 1,338 | 238       | -5%