Humans type for speed, tone, and habit. Tokenizers split text based on common patterns, and providers bill per token. That means ordinary habits like typos, shorthand, filler words, pasted IDs, and stray whitespace can change token counts without changing intent much.
I started noticing this on a tiny prompt: 5 words, 2 spelling mistakes, 13 tokens. I fixed the spelling and sent it again: 6 tokens, including the full stop.
Counts below use OpenAI’s tokenizer and Claude’s API-based token counter. In my usage, Claude generally produces more tokens than OpenAI for the same text. Counts here are for isolated strings; in real prompts, counts can shift slightly based on surrounding spaces, punctuation, and casing.
Swapped letters, dropped letters, doubled letters, nearby-key misses: all normal typing habits, all billable.
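The fragmentation is easy to simulate. Below is a minimal sketch of a greedy longest-match tokenizer over a tiny, hypothetical vocabulary — real BPE tokenizers are more involved, but the effect is the same: a common spelling matches one long vocabulary entry, while a typo falls back to pieces.

```python
# Toy greedy longest-match tokenizer. The vocabulary is hypothetical,
# chosen only to illustrate how a typo fragments into more tokens.
VOCAB = {"receive", "rec", "ie", "ve", "r", "e", "c", "i", "v"}

def tokenize(text: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest substring first
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])          # unknown character: its own token
            i += 1
    return tokens

print(tokenize("receive"))   # ['receive']          -> 1 token
print(tokenize("recieve"))   # ['rec', 'ie', 've']  -> 3 tokens
```

Same word, same intent, three times the tokens for one transposed pair of letters.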
Common spellings compress. Rarer spellings fragment. In code, this can compound quickly: the same misspelled identifier shows up in declarations, references, logs, error messages, and diffs.
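A quick back-of-envelope shows the compounding. The snippet and the per-use penalty below are hypothetical, not measured; the point is that the overhead scales with how often the identifier appears.

```python
# Count how often one misspelled identifier recurs in a code snippet,
# then estimate the total overhead. The 2-token penalty is an assumption.
snippet = """
def calcualte_total(items):
    total = sum(items)
    print("calcualte_total:", total)
    return total

result = calcualte_total([1, 2, 3])
"""

occurrences = snippet.count("calcualte_total")
extra_tokens_per_use = 2   # assumed penalty vs. the common spelling
print(occurrences, "uses ->", occurrences * extra_tokens_per_use, "extra tokens")
```

One typo in a declaration does not stay one typo; every call site, log line, and diff repeats the cost.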
When I type for work (code, prompts, texts, etc.), my left hand is slightly faster than my right, which results in some swapped letters. I never bothered to correct myself in Google searches or text messages. Now, apparently, that difference has a pricing model.
A tiny suffix looks harmless to a human. Tokenizers may split it very differently.
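The same toy longest-match idea shows the suffix effect. Again, the vocabulary here is hypothetical: the base word is a single entry, so a short suffix changes the split even though the text barely changed.

```python
# Toy greedy tokenizer with a hypothetical vocabulary, to show how a
# small suffix changes segmentation relative to the bare base word.
VOCAB = {"token", "izer", "s", "t", "o", "k", "e", "n", "i", "z", "r"}

def tokenize(text: str) -> list[str]:
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):   # longest match wins
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])          # fall back to single characters
            i += 1
    return tokens

print(tokenize("token"))      # ['token']
print(tokenize("tokens"))     # ['token', 's']
print(tokenize("tokenizer"))  # ['token', 'izer']
```

Whether a given suffix merges into one token or splits off depends entirely on what the tokenizer's vocabulary happens to contain.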
Human chat carries a lot of low-signal padding: pleasantries, hedges, filler words. These help tone. They rarely help the task.
Humans optimize for keystrokes. Tokenizers optimize for common text. Those are not the same thing.
Most of the time, a standard dictionary word will be one token, and it is almost always more explicit, clearer, and closer to the text models saw during training than shorthand.
Some things are not conversational, like pasted IDs, log output, and stray whitespace, but they show up in normal work and still inflate tokens.
The model may recover meaning from all of this. Billing does not.
Humans type by habit. Tokenizers bill by pattern.
Which is mildly annoying, because now even tempalte feels like a line item to rectify.