1. Why state is hard
- A plain LLM call is stateless: input → output, done. The moment you add a loop, tools, or multi-turn conversation, you have to answer
2. The taxonomy — five state types, by name
- 2.1 Working memory (a.k.a. scratchpad, short-term, in-task)
- 2.2 Conversational memory (a.k.a. session state, dialog history)
- 2.3 Long-term episodic memory
- 2.4 Long-term semantic memory (a.k.a. knowledge memory)
- 2.5 Procedural memory (less common, worth knowing the name)
"Agent state has five flavors: working memory for the current task trajectory, conversational memory for multi-turn dialog, long-term episodic memory for past interactions, long-term semantic memory for stable knowledge, and procedural memory for reusable skills. Each has different storage, lifespan, and retrieval patterns."
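To make the five names concrete, here is a minimal sketch of one container holding all five state types. The class name `AgentState`, the field names, and the plain in-memory lists and dicts are illustrative assumptions; in a real system each type would likely live in different storage with its own lifespan and retrieval path.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentState:
    """Hypothetical container; one field per state type in the taxonomy."""

    # Working memory: scratchpad for the current task trajectory.
    working: list[str] = field(default_factory=list)
    # Conversational memory: multi-turn dialog history for this session.
    conversation: list[dict[str, str]] = field(default_factory=list)
    # Long-term episodic memory: records of past interactions.
    episodic: list[dict[str, Any]] = field(default_factory=list)
    # Long-term semantic memory: stable knowledge, stored here as key -> fact.
    semantic: dict[str, str] = field(default_factory=dict)
    # Procedural memory: reusable skills, stored here as name -> instructions.
    procedural: dict[str, str] = field(default_factory=dict)

    def end_task(self) -> None:
        """Drop the scratchpad when a task finishes; longer-lived state stays."""
        self.working.clear()
```

Usage: append to `working` during a task and call `end_task()` when it finishes. The point of the sketch is only that the lifespans differ, which is why the scratchpad is the one field that gets cleared.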
3. The token budget problem (the central engineering challenge)
- The core tension:
- Strategy 1: Sliding window
- Strategy 2: Summarization
- Strategy 3: Selective retention (a.k.a. message filtering)
- Strategy 4: Externalized state with pointers
- Strategy 5: Hierarchical / multi-agent decomposition
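As an illustration of Strategy 1, here is a minimal sliding-window sketch that keeps the most recent messages that fit a token budget. Two assumptions are baked in: `count_tokens` is a crude whitespace word count standing in for a real tokenizer, and system messages are always kept (a small dose of Strategy 3). The function names and the `role`/`content` message shape are hypothetical.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace word count as a rough stand-in for a real tokenizer.
    return len(text.split())


def sliding_window(messages: list[dict[str, str]], budget: int) -> list[dict[str, str]]:
    """Keep all system messages, then as many of the newest messages as fit."""
    pinned = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    used = sum(count_tokens(m["content"]) for m in pinned)
    kept: list[dict[str, str]] = []
    # Walk from newest to oldest and stop at the first message that would overflow.
    for message in reversed(rest):
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost

    # Restore chronological order; pinned system messages are placed first.
    return pinned + kept[::-1]
```

Stopping at the first overflow, rather than skipping it and continuing, keeps the retained history contiguous. The cost is that one very long message can evict everything older than it.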
4. State persistence — surviving process boundaries
The token budget problem concerns state held in memory during a single run. Persistence is an orthogonal problem: what happens when the request handler dies mid-task?
- Production agents have to survive:
The pattern is checkpointing: after each meaningful step (each tool call, each turn), persist the agent's state to durable storage. On a new request, load the latest checkpoint and resume from there.
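The checkpoint pattern can be sketched with the standard-library `sqlite3` module. The class name `CheckpointStore`, the table layout, and the `thread_id`/`step` keys are assumptions for illustration, and the state is assumed to be JSON-serializable. The default `:memory:` path is only for trying the sketch out; surviving a process restart needs a file path or an external database.

```python
import json
import sqlite3
from typing import Optional


class CheckpointStore:
    """Hypothetical store: one row per (thread_id, step), state kept as JSON."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "thread_id TEXT NOT NULL, step INTEGER NOT NULL, state TEXT NOT NULL, "
            "PRIMARY KEY (thread_id, step))"
        )

    def save(self, thread_id: str, step: int, state: dict) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
                (thread_id, step, json.dumps(state)),
            )

    def load_latest(self, thread_id: str) -> Optional[tuple[int, dict]]:
        # Return (step, state) for the highest step, or None if nothing was saved.
        row = self.db.execute(
            "SELECT step, state FROM checkpoints WHERE thread_id = ? "
            "ORDER BY step DESC LIMIT 1",
            (thread_id,),
        ).fetchone()
        if row is None:
            return None
        return row[0], json.loads(row[1])
```

Usage: call `save(thread_id, step, state)` after each tool call or turn, and call `load_latest(thread_id)` at the start of each request. A `None` result means a fresh task; otherwise resume from the returned step.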