The village square "reeks of woodsmoke and goblin-stink." That line, from Claude Opus 4.7, is exactly the kind of sensory, specific prose that makes fiction feel alive. A game developer testing Claude models side by side found that newer versions produce noticeably more sanitized, flat output - and the difference isn't subtle. The description used for the newer model outputs: "LinkedIn-ish cringe MBA approved enterprise." It lands because it's accurate.
This pattern is familiar to anyone who writes fiction with AI assistance. Older checkpoints wrote with texture and specificity. Newer ones default to something closer to a product description.
What's Driving the Regression
This isn't random drift. It's almost certainly a consequence of RLHF - reinforcement learning from human feedback - where models are trained using ratings from human reviewers. Reviewers hired to flag harmful content also tend to flag anything that reads as edgy, violent, or vividly descriptive. The model learns to avoid those signals. The result is prose that's technically competent but emotionally flat.
The goblin-stink village square is exactly the kind of detail that gets smoothed away in this process. It isn't harmful. It has no offensive content. But it has the texture of something a safety reviewer might mark down as unnecessarily gritty, and models trained against that signal stop generating it.
Anthropic isn't alone here. The same drift shows up across GPT-4 versions, and developers building games, interactive fiction, or any narrative-heavy work have been tracking it for years. Each major model release trades some creative range for better scores on benchmarks that measure helpfulness and policy compliance - metrics that don't capture whether your goblin raiders actually feel dangerous.
The Practical Cost
For game developers using AI-assisted dialogue and narrative, this is a concrete production problem. A model that writes "the goblin raiders approached aggressively" instead of a scene with actual sensory weight produces content that reads like a Wikipedia summary. Players notice.
The workarounds are clunky: system prompts explicitly instructing the model to write with sensory detail and not sanitize mature content, or reverting to older model versions still accessible through the API. Some developers maintain access to older checkpoints specifically because newer releases have drifted toward safer defaults.
AI labs haven't publicly addressed this tradeoff directly. Anthropic publishes model cards and safety documentation but doesn't typically discuss how safety training affects creative range. The gap between what a model can technically produce and what it will produce by default has widened with each generation - and for anyone shipping creative applications, that gap now has real product consequences.
Testing your specific use case against multiple model versions before upgrading is worth the time. Benchmark scores don't measure whether your goblin raiders smell right.