Glitch Tokens Explained: How Weird Text Strings Break AI Models

Mar 21, 2025

Discover how “glitch tokens” can break large language models, just like Google Bombs once hacked search results. Learn why it matters for AI stability.

Remember when typing “miserable failure” into Google brought up a politician’s official page? That wasn’t an accident. It was a hack; people figured out how to trick Google’s ranking system and exploited it. We called it a Google Bomb.

Ah, the good old days of SEO mischief. LLMs have their own version of this trick: glitch tokens.

I recently read a paper about the equivalent in the world of Large Language Models (LLMs): glitch tokens. They're basically cheat codes that make the model lose its mind. When the model sees one of these tokens, it goes haywire, spitting out unrelated, nonsensical text. It's like a glitch in the Matrix. Some of these text sequences look harmless, like "TheNitrome"; others are weird combinations of special characters, like " [],,,,".
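You can see what makes these strings special by inspecting how they tokenize. Here's a minimal sketch using the tiktoken library against the GPT-2 vocabulary, where the best-known glitch tokens were first documented (the example strings below are ones reported in glitch-token write-ups; the IDs printed are simply whatever the encoder returns):

```python
import tiktoken

# GPT-2's BPE vocabulary, where the best-known glitch tokens live.
enc = tiktoken.get_encoding("gpt2")

# Reported glitch tokens: despite looking like arbitrary strings,
# each typically collapses into a single token ID in this vocabulary.
for s in [" SolidGoldMagikarp", "TheNitrome", " petertodd"]:
    ids = enc.encode(s)
    print(f"{s!r:24} -> {ids}")
```

That single-ID collapse hints at the usual explanation for the haywire behavior: these strings earned their own vocabulary entries when the tokenizer was built but almost never appeared in the model's training data afterward, leaving the model with an undertrained, essentially meaningless embedding for them.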

Why is this interesting? As LLMs are integrated into more applications, knowing where these “bugs” pop up is critical for stable behavior. Companies like OpenAI patch glitch tokens quickly once they're known, but the researchers behind the paper also built a tool called GlitchHunter that helps identify new candidates. Who knows, maybe you'll stumble on one and make ChatGPT tell you about spaghetti recipes when you ask about quantum physics.
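GlitchHunter's actual pipeline is described in the paper, but the spirit of a quick manual probe is easy to sketch: ask the model to repeat a candidate string and check whether it can. Below is an illustrative version of that classic repetition test using Hugging Face transformers. To be clear, this is not GlitchHunter's algorithm; the model name is a placeholder, and an instruction-following model will give far more meaningful results than a base model:

```python
# A sketch of the "repetition test" used in glitch-token research:
# a model that cannot echo a string back verbatim is a glitch candidate.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; swap in any causal LM you can run locally
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def looks_glitchy(candidate: str) -> bool:
    """Return True if the model fails to repeat the candidate string."""
    prompt = f'Repeat the string "{candidate}" exactly:'
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False,                 # greedy: we want the model's best echo
        pad_token_id=tok.eos_token_id,
    )
    # Decode only the newly generated continuation, not the prompt.
    reply = tok.decode(out[0][inputs["input_ids"].shape[1]:])
    return candidate.strip() not in reply

for cand in [" SolidGoldMagikarp", "hello world"]:
    verdict = "glitch candidate" if looks_glitchy(cand) else "echoed fine"
    print(f"{cand!r}: {verdict}")
```

A normal string sails through this test; a glitch token tends to come back mangled, replaced with something unrelated, or not at all.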

As AI moves into everything from search engines to coding assistants, finding and patching these glitches will only become more important. Until then, keep your eyes peeled. You never know what little “miserable failure” is hiding in the code.