Skip to main content

Command Palette

Search for a command to run...

Everyone is still trying to prompt their agents into reliability. The tooling quietly moved the other way last week

Updated
2 min read

LangChain shipped interpreter skills. Agents now run real code modules, not just prompt instructions.

https://www.langchain.com/blog/interpreter-skills

Deepagents added a middleware that grades the agent's own work against a rubric.

The pattern under both is the same.

Stop asking the model to behave. Start giving it behavior it cannot get wrong.

A prompt is a request. Every response is probabilistic.

Code is a guarantee. The same input returns the same output every time.

I moved a piece of agent behavior out of the prompt and into a script, and the accuracy went up, because the output became deterministic.

That is the whole trade, and it is not free.

Writing your own logic is expensive. Asking the model is cheaper, and less accurate.

So the real question on every step is which one you can afford to get wrong: • if a step must be correct, encode it as code • if a step needs judgment, leave it to the prompt • if a step is right most of the time, wrap the prompt in a check

Skills that execute code is a bigger update than it looked. It moves the reliable part of an agent out of language and into logic.

Prompts make an agent sound right.

Code makes it be right.

#ai #agentic-ai #llm

More from this blog

B

Bhuvesh's Tech Space

27 posts