
Why LLMs Get Distracted and How to Write Shorter Prompts

Blog post from PromptLayer

Post Details

Company: PromptLayer
Date Published: -
Author: Jared Zoneraich
Word Count: 893
Language: English
Hacker News Points: -
Summary

A study by Chroma, "Context Rot: How Increasing Input Tokens Impacts LLM Performance," shows that major large language models suffer from "context rot": accuracy degrades as prompts grow longer, contrary to the common assumption that more context yields better results. This degradation affects Retrieval-Augmented Generation (RAG) systems, chatbots that carry long conversation histories, and any application that feeds extensive context to an LLM. The researchers identify semantic distance between the query and the relevant content, distractors (plausible but irrelevant passages), and long structured narratives as contributors to the decay. To mitigate it, they recommend retrieving fewer, higher-similarity chunks, reranking results to eliminate distractors, and avoiding long narrative arcs in prompts. The findings frame context as a finite resource that requires deliberate management: shorter, more precise prompts yield more reliable responses, and real-world applications call for disciplined prompt engineering, increasingly referred to as "context engineering," to keep LLM performance high.
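The mitigation practices the summary lists can be sketched in a few lines. This is a minimal illustration, not PromptLayer's or Chroma's actual API: the chunk format, the `k` cutoff, and the `min_score` threshold are all assumptions made for the example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(query_vec, chunks, k=3):
    """Retrieve fewer, higher-similarity chunks: keep only the top k."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

def rerank_and_filter(query_vec, chunks, min_score=0.5):
    """Rerank and drop likely distractors below a similarity threshold."""
    return [c for c in chunks if cosine(query_vec, c["vec"]) >= min_score]

def build_prompt(question, chunks):
    """Assemble a short, precise prompt from the surviving chunks only."""
    context = "\n".join(c["text"] for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In a real system the similarity scores would come from an embedding model and the reranking step from a dedicated reranker; the point of the sketch is the shape of the pipeline: cut retrieval volume first, filter distractors second, and only then build the prompt.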