Smuggling Arbitrary Data Through an Emoji

Extreme TLDR: Paul Butler shows how to hide arbitrary data after an emoji by appending Unicode variation selectors, which render as nothing on their own, so the message stays invisible inside a normal-looking character. He demonstrates this with Rust code, highlights potential misuse (e.g., evading content filters, watermarking text), and notes that while LLMs can tokenize variation selectors, they typically don't decode the hidden data directly.

https://paulbutler.org/2025/smuggling-arbitrary-data-through-an-emoji/
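
To make the mechanism in the TLDR concrete, here is a minimal Rust sketch of the idea, not the code from the linked post. Unicode defines 256 variation selectors (U+FE00 to U+FE0F and U+E0100 to U+E01EF), which is enough for one selector per byte value. The function names (`byte_to_selector`, `selector_to_byte`, `encode`, `decode`) and the exact byte-to-selector ordering are assumptions made for illustration; see the post for the author's version.

```rust
// Assumed mapping: bytes 0..=15 use U+FE00..=U+FE0F (VS1 to VS16),
// bytes 16..=255 use U+E0100..=U+E01EF (VS17 to VS256).
fn byte_to_selector(byte: u8) -> char {
    let code = if byte < 16 {
        0xFE00 + byte as u32
    } else {
        0xE0100 + (byte as u32 - 16)
    };
    char::from_u32(code).expect("variation selectors are valid scalar values")
}

// Inverse of the mapping above; returns None for any other character.
fn selector_to_byte(c: char) -> Option<u8> {
    match c as u32 {
        cp @ 0xFE00..=0xFE0F => Some((cp - 0xFE00) as u8),
        cp @ 0xE0100..=0xE01EF => Some((cp - 0xE0100 + 16) as u8),
        _ => None,
    }
}

// Append one variation selector per payload byte after the visible base character.
fn encode(base: char, payload: &[u8]) -> String {
    let mut out = String::new();
    out.push(base);
    out.extend(payload.iter().map(|&b| byte_to_selector(b)));
    out
}

// Collect every variation selector found in the text and map it back to a byte.
// Caveat: a base emoji that already carries its own selector (e.g. U+FE0F)
// would contribute a stray byte in this simplified decoder.
fn decode(text: &str) -> Vec<u8> {
    text.chars().filter_map(selector_to_byte).collect()
}

fn main() {
    let hidden = encode('😊', b"hello");
    // The string displays as a single emoji but carries extra code points.
    println!("{} ({} chars, {} bytes)", hidden, hidden.chars().count(), hidden.len());
    assert_eq!(decode(&hidden), b"hello".to_vec());
}
```

The decoder here simply filters for selectors anywhere in the string, which keeps the sketch short; a stricter version could stop at the first non-selector character after the payload begins.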