
A not-so-unexpected discovery about context


I’ve been very concerned about context size ever since I ran out of tokens for the first time after intensively using a multi-agent setup. So I wanted to confirm a suspicion: is it better to group requests or to make them one by one?

Spoiler: it’s better to group requests! But see for yourself if you’re a bit of a doubting Thomas.

I asked Cursor:

If I'm working on a given set of 3 files, which would be costlier in terms of context tokens?
- Asking for 3 modifications in three separate requests
- Asking for the same 3 modifications together in one request to the agent
- Same cost

And this was Cursor’s answer:

Great question about context token costs! Let me break this down:
The answer is: Asking for 3 modifications in three separate requests is MORE COSTLY.
Here's why:

Token Cost Breakdown for Three Separate Requests
Each request includes:
- System prompts (rules, agent instructions, skills, etc.) - sent every time
- File context (open/recently viewed files metadata) - repeated each time
- Git status - repeated each time
- User query - 3 different messages
- Agent responses - 3 separate responses
- Tool calls (reading files) - files may be re-read multiple times across requests
- Conversation history - grows with each exchange, carrying forward all previous messages

One Combined Request
Includes:
- System prompts - sent once
- File context - sent once
- Git status - sent once
- User query - one message (even if longer)
- Agent response - one response
- Tool calls - files read once and used for all modifications
- No conversation history overhead from multiple back-and-forth exchanges

Why the Difference Matters
1. System overhead is significant: The system prompts, rules, and context information I receive are quite substantial (as you can see from the structure above). This gets sent with every new request.
2. File re-reading: When you make separate requests, the agent often needs to re-read the same files to understand context, even if they were just read in the previous request.
3. Conversation history accumulation: With three separate requests, the second request includes the first exchange in history, and the third includes both previous exchanges, creating a compounding effect.

Best Practice
Batch related changes together in a single request when possible. This is:
- More cost-efficient
- Often more coherent (the agent sees the full picture)
- Faster overall execution

The only time separate requests might make sense is if the modifications are truly independent or if you need to verify one change before proceeding to the next.
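
To put rough numbers on that reasoning, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption of mine for illustration: the token counts are made up, the function names are hypothetical, and it models the worst case in which the agent re-reads all three files on every separate request and no prompt caching is applied. It is not a measurement of how Cursor actually bills tokens.

```python
# Back-of-the-envelope model. Every number here is an illustrative assumption,
# not a measurement of Cursor's real token usage.
SYSTEM_OVERHEAD = 3000  # system prompts, rules, file metadata, git status
FILE_READ = 1500        # one tool call reading one file
QUERY = 50              # one modification request from the user
RESPONSE = 400          # agent reply for one modification
N_FILES = 3


def separate_requests(n_mods: int) -> int:
    """Tokens when each modification is its own request in the same chat."""
    total = 0
    history = 0  # tokens carried forward from earlier exchanges
    for _ in range(n_mods):
        # Worst case: all files are re-read on every request.
        exchange = QUERY + N_FILES * FILE_READ + RESPONSE
        total += SYSTEM_OVERHEAD + history + exchange
        history += exchange  # this exchange is resent with every later request
    return total


def combined_request(n_mods: int) -> int:
    """Tokens when all modifications go into a single request."""
    return (
        SYSTEM_OVERHEAD
        + n_mods * QUERY        # one longer message
        + N_FILES * FILE_READ   # files read once
        + n_mods * RESPONSE     # one longer response
    )


if __name__ == "__main__":
    print("three separate requests:", separate_requests(3))
    print("one combined request:   ", combined_request(3))
```

With these made-up numbers, three separate requests come to 38,700 tokens against 8,850 for the combined request. The exact ratio depends entirely on the assumed values, but the shape holds: the fixed overhead is paid three times, and the history term grows with every exchange.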

