Bhuvesh's Tech Space

Work today feels less like building and more like operating a system

Bhuvesh Dhiman — Sat, 30 May 2026 07:00:14 GMT

Instead of writing everything from scratch, I find myself orchestrating multiple AI agents. Each one handles a specific part of the workflow. Code generation, debugging, documentation, testing.

I am not replacing engineering effort. I am redistributing it.

The role shifts from doing to directing.

You define intent clearly, break problems into structured pieces, and route them to the right agents. The quality of output depends less on typing speed and more on how well you design the system around these agents.

This changes what “being a good engineer” means.

It is no longer just about writing clean code.

It is about:
• Designing workflows
• Structuring problems
• Evaluating outputs critically
• Building feedback loops

The leverage is real, but only if you know how to control it.

Otherwise you just produce more, faster, with the same flaws.

The real shift is subtle.

You are no longer just building software.
You are building systems that build software.

I use AI in almost everything I do. But I don’t rely on it

Bhuvesh Dhiman — Sat, 30 May 2026 06:59:34 GMT

There’s a difference most engineers miss.

AI is a multiplier, not a replacement.

If your thinking is weak, AI amplifies noise.
If your thinking is strong, AI accelerates clarity.

I treat AI like an engineer with infinite speed.

It can draft, explore, and expand possibilities.
But direction, judgment, and trade-offs still come from me.

The core thinking stays human.
The execution gets accelerated.

This shift matters.

Engineers who outsource thinking to AI will plateau.
Engineers who use AI to extend thinking will compound.

The real skill now is not prompting.

It is structured thinking, taste, and decision-making.

AI just makes the gap more visible.

That’s the game.

I haven’t written a single line of code by myself since April 2025

Bhuvesh Dhiman — Sat, 30 May 2026 06:59:05 GMT

Everything I ship is AI-assisted.

Not autocomplete. Not snippets. Full workflows.

At first, it felt uncomfortable. Like I was skipping something important.

But the constraint in software engineering was never typing speed. It was clarity of thought.

AI changes the role.

I spend less time on boilerplate, wiring APIs, or fixing obvious bugs.

More time goes into:
• Defining clear specs
• Breaking problems into precise steps
• Designing scalable systems
• Reviewing and validating outputs

The core skill is no longer “can you code this?”

It is “Can you express this clearly enough for a system to build it correctly?”

I still debug. I still design from first principles.

But I do not default to writing code anymore.

I direct it.

If you are using AI like a faster IDE, you are underestimating the change.

This is not about speed.

It is about how software gets built.

Most engineers are trying to fix AI with better prompts. The Claude docs point to a different approach

Bhuvesh Dhiman — Sat, 30 May 2026 06:58:11 GMT

There are two layers: CLAUDE.md and Skills.

They look similar. Both use markdown. Both influence behavior. But they solve different problems.

CLAUDE.md is context.

It defines how Claude should behave inside a project:

Coding standards
Tool usage
Workflow guidance

This improves consistency, but it still relies on interpretation. Every response is probabilistic.

Skills are capabilities.

They package workflows into reusable units:

Structured inputs and outputs
Scripts and resources
Loaded only when relevant

Instead of explaining tasks repeatedly, you encode them once. Claude applies them when needed.

The difference is simple:

CLAUDE.md improves reasoning. Skills improve execution.

What this means in practice

With CLAUDE.md:

You are scaling prompt engineering
Behavior can drift

With Skills:

Workflows become reusable
Systems become more deterministic

The real shift

AI is moving from prompt design to system design.

The advantage is no longer better prompts. It is building clear interfaces and capabilities around the model.

That is what makes AI reliable in production.

Spec-driven development is quietly becoming one of the most powerful workflows in modern engineering

Bhuvesh Dhiman — Sat, 30 May 2026 06:57:00 GMT

Most teams still write code first and specifications later.

That worked when systems were smaller and requirements were simple. It breaks down quickly when software becomes distributed, AI-assisted, and product-driven.

A spec-driven workflow flips the order.

You start with a clear specification of the problem, constraints, behavior, and expected outcomes. Only after the thinking is complete does implementation begin.

This has several important effects.

First, it forces clarity. Ambiguous thinking becomes visible immediately because the spec cannot be written cleanly.

Second, it separates problem design from code execution. Engineers spend more time designing systems and less time debugging accidental complexity.

Third, it works extremely well with AI-assisted development. When the specification is precise, AI tools can generate large portions of the implementation with surprisingly high accuracy.

In practice, the workflow becomes simple:

Write the spec
Review the spec
Iterate on edge cases
Generate or implement the code
Validate against the spec

The spec becomes the single source of truth.

Not the code. Not the comments. Not tribal knowledge.

Good engineering has always been about reducing ambiguity. Spec-driven workflows simply make that discipline explicit.

As systems get larger and AI becomes part of the development process, this approach will likely move from a niche practice to a default engineering pattern.

Most simple AI agents fail for one reason

Bhuvesh Dhiman — Sat, 30 May 2026 06:56:25 GMT

They try to do everything in one step.

A single prompt.
A single response.
And we expect perfect results.

In reality, this approach is slow, brittle, and error-prone.

The model has to reason, decide, and produce the final output all at once.
When something goes wrong, the entire result breaks.

A better approach is to think in workflows.

Break the task into smaller steps.

Each step has a clear responsibility:
• understand the input
• plan the task
• call the right tool
• validate the result
• produce the final output

Now the system becomes easier to control and easier to debug.

Instead of hoping the model gets everything right, you design the process so that mistakes are contained and corrected along the way.

This is the difference between a demo and a reliable AI system.

Simple agents look impressive in prototypes.
Structured workflows are what actually scale in production.

The real shift in AI engineering is not just prompting models.

It is designing systems around them.

Three years ago, I had a simple thought. Today, it is a reality

Bhuvesh Dhiman — Sat, 30 May 2026 06:55:40 GMT

Back then, when building products, everything was clear in my head.
The requirements. The vision. The architecture.
I knew what to build and exactly where it should live in the system.

And I kept wondering:

If the solution is already designed in my mind, what if something could just write the code for me?

Now it can.

AI coding agents execute the implementation layer.
Boilerplate. Repetitive logic. Standard integrations.
The parts that consume time but do not require deep judgment.

My role has changed.

I focus on system design, tradeoffs, edge cases, and product alignment.
I invest my thinking where it actually creates leverage.

This is not about avoiding code.
It is about moving up the abstraction layer.

When AI handles execution, engineers can concentrate on architecture.

The advantage is no longer speed of typing.
It is clarity of thought.

For product-oriented engineers, this shift is structural.

Your value compounds when you design better systems and guide the agent with precision.

AI does not replace engineering.
It increases the return on strong engineering judgment.

That transition is reshaping how modern software gets built.

AI is generating more code than ever

Bhuvesh Dhiman — Sat, 30 May 2026 06:55:03 GMT

But we still version control only the output, not the reasoning

Entire.io (https://entire.io/) is addressing that gap

Their open source CLI, Entire, integrates with Git and captures full AI agent sessions alongside commits. These records, called Checkpoints, store prompts, transcripts, tool calls, and the resulting code changes

So what is the real use of this data?

• You can audit why a change was made
• You can review intent, not just diffs
• You can reproduce or refine past AI sessions
• You create long term organizational memory around decisions

Traditional Git answers what changed

Entire’s approach answers why it changed

As AI agents become real contributors to codebases, that distinction becomes critical

Security Considerations When Integrating LLMs

Bhuvesh Dhiman — Sat, 30 May 2026 06:54:27 GMT

LLMs are being integrated into products and internal systems at unprecedented speed. Security is often treated as secondary.

An LLM is not just another dependency.
It is an execution surface.

Integrating an LLM introduces a probabilistic system that reasons and generates actions based on context you do not fully control. Traditional security models assume deterministic behavior. LLMs break that assumption.

A common mistake is treating prompts as static input. Prompts are dynamic and shaped by user data, system state, and upstream outputs, making prompt injection a practical risk.

Another mistake is over-trusting model output. LLMs can produce confident but incorrect or unsafe responses. When this output flows directly into APIs, databases, or automation, failures become silent.

Data boundaries are also frequently overlooked. LLMs do not understand confidentiality. Sensitive context can be leaked or inferred outside intended controls.

Effective LLM security requires explicit guardrails and constraints

Assume the model is untrusted.
Constrain what it can see and do.
Validate every output.
Isolate execution paths.
Log aggressively.

LLMs amplify capability.
They also amplify mistakes.

Your architecture decides which one scales.

The software engineering lifecycle has shifted, and code is no longer the hardest part.

Bhuvesh Dhiman — Sat, 30 May 2026 06:53:46 GMT

AI has compressed the time it takes to translate ideas into working code. That advantage is now table stakes.

What still cannot be automated is product thinking.

In an AI-driven lifecycle, engineers are expected to reason about problems before they write anything. Understanding user intent, constraints, tradeoffs, and system impact matters more than syntax mastery.

This is why flexibility has become a core engineering skill.

Engineers who can adapt to a new lifecycle, where AI handles execution speed and humans handle direction, will compound their impact. They move faster not because they type faster, but because they think better.

Engineers who define themselves only by coding output will struggle. When code becomes abundant, judgment becomes scarce.

The role is evolving from writing code to shaping systems.

The takeaway is simple. In the AI era, growth comes from pairing engineering depth with product clarity.

Most real-world AI agent problems are coordination problems, not intelligence problems

Bhuvesh Dhiman — Sat, 30 May 2026 06:53:13 GMT

As agent-based systems grow, a single agent handling everything stops working.
You need multiple AI agents, each with a clear role, working toward one outcome.

That is where AI agents succeed or fail in production.

In practice:
One agent interprets intent.
Another plans the steps.
Others gather context, call tools, validate results, or enforce constraints.

An agent in isolation is rarely useful.
Value comes from how agents interact.

The hard part is not building AI agents.
It is orchestrating them.

Without orchestration, agents duplicate work, contradict each other, and fail unpredictably.

Orchestration defines:
Which agent acts.
In what sequence.
How handoffs happen.
How failures are handled.
When execution stops.

This is not a model quality problem.
It is a systems problem.

Strong orchestration makes even simple agents reliable.
Weak orchestration breaks even the best ones.

The real question is not how smart your AI agents are.
It is how well they are orchestrated.

Coding agents did not make product building faster. They moved where the time goes

Bhuvesh Dhiman — Sat, 30 May 2026 06:52:13 GMT

Earlier, a large part of engineering time was spent translating intent into code. That translation step is what has shrunk.

AI now converts a clear mental model into working code in minutes.

What did not shrink is the thinking.

Today, most time is spent deciding what should be built and explaining it precisely.

The agent is not slow. The bottleneck is clarity.

When you use coding agents seriously, you notice a clear shift. Implementation speed is no longer the constraint. Understanding is.

You spend more time on:

Defining the problem precisely
Breaking behavior into explicit rules
Making tradeoffs visible
Describing intent instead of syntax

Once this is done, the translation to code is almost instant.

AI did not eliminate engineering effort. It removed the mechanical step of turning thoughts into syntax.

Poor thinking now fails faster. Clear thinking now ships faster.

This shift rewards engineers who understand systems, products, and users, not just tools.

The real skill is no longer writing code. It is forming a precise intent that machines can execute.

AI did not change where value comes from. It just made that truth impossible to ignore.

Everyone is reaching for vector search. Few stop to ask if it is actually the right tool

Bhuvesh Dhiman — Sat, 30 May 2026 06:51:29 GMT

I was reading this Anthropic engineering article on building agents, and one section stood out for its clarity and honesty:
https://claude.com/blog/building-agents-with-the-claude-agent-sdk

The part on semantic search is especially worth attention.

Semantic search works by chunking context, embedding it as vectors, and retrieving results based on conceptual similarity. It is fast. It scales well. It looks clean on the architecture diagrams.

But it comes with real tradeoffs.

It is less accurate for complex reasoning.
It is harder to maintain as the context evolves.
It is less transparent, which makes debugging and trust difficult.

Anthropic makes a counterintuitive recommendation. Start with agentic search.

Agentic search allows the system to reason step by step, decide what to look for next, and adapt based on intermediate findings. It is slower, but it is explicit, debuggable, and closer to how real problem-solving works.

Semantic search should be added only when you truly need speed or broader variation. Not by default. Not because it is trendy.

This highlights a deeper principle in AI system design.

Correctness comes before performance.
Clarity comes before scale.
Product value comes before architectural elegance.

As AI native engineers, the goal is not to stack advanced tools. The goal is to build the smallest system that reliably works, then optimize with intent.

Start with reasoning. Optimize later.

Most RAG systems fail for a reason no one talks about

Bhuvesh Dhiman — Sat, 30 May 2026 06:50:30 GMT

Not because the model is weak.
Not because embeddings are bad.
But because the data feeding them is messy.

Retrieval Augmented Generation only works as well as the structure of the data behind it.

If your inputs are inconsistent, duplicated, or overloaded with irrelevant fields, your vector search degrades.
The model then reasons over noise and confidently produces the wrong answer.

This is where normalized data changes the game.

Normalization forces structure before intelligence.

It means converting raw, heterogeneous data into a consistent schema.
Same concepts, same fields, same semantics, regardless of where the data came from.

Why this matters in practice:

First, retrieval quality improves.
When similar entities share the same structure, embeddings cluster meaningfully.
Queries retrieve what you actually want, not approximate matches.

Second, hallucinations reduce.
The model sees cleaner, deduplicated context.
Less ambiguity in inputs leads to more deterministic reasoning.

Third, security and compliance get easier.
Normalization lets you explicitly exclude sensitive fields before embedding.
What never enters the vector store can never leak.

Fourth, RAG scales across systems.
If you are pulling data from multiple tools, products, or customers, normalization gives you one mental model.
One retrieval strategy.
One pipeline.

The key insight is simple.

RAG is not just an LLM problem.
It is a data architecture problem.

If you treat RAG as “add embeddings and prompt harder,” you will keep debugging symptoms.
If you treat it as “design a clean data contract for intelligence,” systems start to behave.

The takeaway for engineers building AI products:

Before optimizing prompts or models, normalize your data.
Structure first.
Intelligence later.

That is how RAG systems move from demos to dependable infrastructure.

Most engineers use “vector index” and “vector database” as if they mean the same thing

Bhuvesh Dhiman — Sat, 30 May 2026 06:49:05 GMT

They do not.

I see this confusion often when discussing RAG systems and semantic search, so here is a simple way to separate the two.

A vector index is a data structure.

Its job is narrow.
Given a vector, find the most similar vectors efficiently.

It stores embeddings and uses algorithms like HNSW or IVF to speed up similarity search.

That is all it does.

No metadata handling.
No persistence guarantees.
No APIs.
No lifecycle management.

A vector index is an algorithmic component, not a system.

A vector database is a system.

It stores vectors along with metadata.
It supports insert, update, delete, and query operations.
It handles persistence, scaling, and reliability.
It exposes stable APIs for applications.

Internally, a vector database uses one or more vector indexes to perform similarity search.

Here is the clean mental model.

A vector index answers.
How do I find similar vectors quickly?

A vector database answers.
How do I store and query vectors in a real application?

The takeaway.

Indexes solve search.
Databases solve systems.

I read a great piece on context engineering for AI agents, and it changed how I think about agent design

Bhuvesh Dhiman — Sat, 30 May 2026 06:47:47 GMT

The article made one thing very clear. Agents do not fail because models are weak. They fail because context is poorly managed.

Context engineering is about deciding what information lives inside the context window at every step of an agent’s journey.

Not everything. Just the right things.

Here are the key takeaways I've learned.

Think of context like working memory

An LLM is like a CPU. The context window is its RAM.

It is limited, expensive, and fragile. Your job is to curate it carefully.

Context engineering is the discipline of deciding what earns a spot there.

Agents suffer when context grows unchecked

Long-running agents accumulate tool outputs, intermediate thoughts, and feedback.

This leads to real failure modes:

Confusion from irrelevant details. Distraction from too many signals. Poisoning occurs when hallucinations are stored as facts. Clashes when different context pieces disagree.

More tokens often mean worse performance.

Four core strategies matter most

The article grouped real-world agent designs into four practical strategies.

Write context Persist useful information outside the context window, like scratchpads or memories, so agents can refer back without carrying everything in line.

Select context Pull only task-relevant memories, tools, or knowledge into the context window. Retrieval quality matters more than retrieval volume.

Compress context Summarize or trim aggressively. Keep what matters, drop the rest. Compression is not optional for long agent runs.

Isolate context Split responsibilities across sub-agents, sandboxes, or structured state. Isolation reduces interference and improves reasoning.

These patterns show up again and again across strong agent systems.

Tool output is context too

Tool responses are often the biggest token hog. If you do not post process or isolate them, they dominate the context window and drown out reasoning.

Good agents treat tool feedback as a structured state, not raw text.

Observability is part of context engineering

You cannot improve what you cannot see.

Tracking token usage, context growth, and agent behavior is essential to know where context helps and where it hurts.

The big takeaway for me was this.

Building agents is less about prompt cleverness and more about memory, selection, compression, and boundaries.

Context engineering is not a trick.

It is core infrastructure.

This post is inspired by an excellent deep dive from the team at LangChain on context engineering for agents.

If you are building or planning serious agent systems, I highly recommend reading the full article here: https://www.langchain.com/blog/context-engineering-for-agents

Worth your time if you care about production-grade AI

Unlearning becomes essential as you grow into senior roles

Bhuvesh Dhiman — Sat, 30 May 2026 06:46:15 GMT

Early in your career, doing many things at once feels like progress. You say yes to everything, context switch constantly, and measure impact by how busy you are.

At senior levels, that approach breaks.

The scope increases. Decisions get heavier. The cost of distraction becomes real.

You still handle multiple responsibilities, but you cannot work on all of them at the same time. The skill shifts from execution speed to intentional focus.

Prioritization is not about ignoring work. It is about sequencing it correctly.

One problem at a time. One decision with full attention. One outcome owned end-to-end.

This is not something frameworks fully teach. It develops through experience, mistakes, and reflection. You learn when to zoom in deeply and when to step back and align.

Focus becomes a leadership skill, not a productivity trick.

The real growth moment is when you stop trying to do everything and start choosing what deserves your best thinking.

What I learned after 3 years of working with LLMs

Bhuvesh Dhiman — Sat, 30 May 2026 06:44:43 GMT

Most people think prompting is about asking sharper questions.
My experience taught me something different.

LLMs perform at their best when you give them the right context. Not just hints or fragments. Real context that mirrors how a human would understand the situation.

When we use these models every day, we usually skip this part. We jump straight to the question and expect the model to fill in the gaps. That is where hallucinations and weak answers start appearing.

I used to do the same until I noticed a pattern.
Whenever I shared full context first and only then asked my questions, the responses became sharper, more accurate, and more aligned with what I needed.

So I changed my workflow.
My first step is always to prime the model with the background, constraints, and details of the task.
My second step is to ask the actual question.

This simple shift reduced hallucinations, improved quality, and saved a lot of time.

Takeaway
If you want reliable output from LLMs, treat context as the starting point, not an afterthought

#ai #llm #process