Multi Agent & Agentic AI
- Anand Nerurkar
- Nov 24
- 6 min read
1. Agentic AI (What it means)
Agentic AI refers to AI systems that can take autonomous actions, not just generate text. These systems perceive, reason, plan, act, and learn — like a digital worker that executes end-to-end tasks with minimal human intervention.
Key capabilities:
Autonomy: Takes decisions without being prompted every time
Planning: Breaks goals into sub-tasks
Tool use: Calls APIs, databases, models
Reasoning loops: Self-critique, refine output
Learning: Improves from feedback
Example for BFSI: An “Agentic Fraud Investigator” that reads transactions → flags anomalies → gathers supporting evidence → recommends action → updates case notes.
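As a rough, framework-agnostic sketch of this perceive → plan → act → learn cycle, the bounded loop below shows how such an agent might be structured; every name here (AgentLoopSketch, Tool, plan, isGoodEnough) is a hypothetical placeholder, not a real framework API.

// Hypothetical agentic loop: perceive -> plan -> act (tool use) -> self-critique -> learn
public final class AgentLoopSketch {

    interface Tool { String invoke(String input); }   // stands in for an API, database or model call

    public String run(String goal, java.util.Map<String, Tool> tools) {
        String observation = goal;                                   // perceive: start from the user goal
        for (int step = 0; step < 5; step++) {                       // bounded autonomy: at most 5 steps
            String nextSubTask = plan(observation);                  // planning: pick the next sub-task
            Tool tool = tools.get(nextSubTask);                      // tool use: choose API / DB / model
            String result = (tool != null) ? tool.invoke(observation)
                                           : "no tool available for: " + nextSubTask;
            if (isGoodEnough(result)) {                              // reasoning loop: self-critique the output
                return result;
            }
            observation = result;                                    // learning: feed the outcome back in
        }
        return "Escalating to a human reviewer";                     // guardrail: never loop forever
    }

    private String plan(String observation) { return "flag-anomalies"; }                       // placeholder planner
    private boolean isGoodEnough(String result) { return result.contains("recommended action"); }
}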
2. What Is a Multi-Agent System?
A multi-agent system is a team of multiple AI agents, each with specialized skills, working together to complete a complex business workflow.
It mimics an enterprise function where multiple teams collaborate.
Example Structure:
Classification Agent: Reads a document, classifies it into KYC/Loan/Invoice.
Extraction Agent: Extracts fields using DocAI.
Validation Agent: Checks data quality, RBI rules, business rules.
Decision Agent: Determines the next best action.
Orchestrator Agent: Coordinates all agents, maintains workflow state.
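A minimal sketch of how such a structure could be wired in plain Java; the Agent interface, WorkflowState record, and DocumentWorkflowOrchestrator class are illustrative names, not a specific framework API.

// Illustrative multi-agent pipeline: each agent owns one responsibility,
// and the orchestrator coordinates them and carries the workflow state.
public class DocumentWorkflowOrchestrator {

    interface Agent { WorkflowState handle(WorkflowState state); }

    record WorkflowState(String documentId, String docType, java.util.Map<String, String> fields,
                         boolean valid, String decision) { }

    private final Agent classificationAgent;   // KYC / Loan / Invoice
    private final Agent extractionAgent;       // field extraction (e.g. via DocAI)
    private final Agent validationAgent;       // data quality + RBI / business rules
    private final Agent decisionAgent;         // next best action

    public DocumentWorkflowOrchestrator(Agent c, Agent e, Agent v, Agent d) {
        this.classificationAgent = c;
        this.extractionAgent = e;
        this.validationAgent = v;
        this.decisionAgent = d;
    }

    public WorkflowState process(String documentId) {
        WorkflowState state = new WorkflowState(documentId, null, java.util.Map.of(), false, null);
        state = classificationAgent.handle(state);
        state = extractionAgent.handle(state);
        state = validationAgent.handle(state);
        return decisionAgent.handle(state);     // orchestrator owns ordering and state hand-off
    }
}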
Why use Multi-Agent architecture?
Scalability
Clear separation of responsibilities
Easier troubleshooting / governance
Can plug-and-play individual models
Enables enterprise-wide reuse
🧠 Agent vs Multi-Agent (Interviewer-friendly comparison)
Feature | Agentic AI | Multi-Agent System |
Definition | A single autonomous AI that acts | Multiple agents collaborating |
Scope | Solves one complex task end-to-end | Solves a large workflow as a team |
Example | Loan eligibility agent | Full digital lending agents for KYC → Eligibility → Decision → Agreement |
Strength | Completes tasks independently | Division of labor, specialization |
🚀 How to Explain in an Enterprise Architecture Interview
Sample Answer:
“Agentic AI gives us digital workers that can autonomously take actions — not just generate text. Multi-agent architecture extends this by introducing a set of specialized agents (Document Classifier, Extraction Agent, Compliance Agent, Decision Agent) orchestrated through an agent framework like LangGraph, AutoGen, or Spring AI Agents. This design fits naturally into large BFSI workflows where responsibilities are distributed. It also enforces governance, observability, and performance isolation — important for regulated industries. In my architecture, agents interact through events (Kafka) and use shared memory (vector DB + metadata store) to maintain context. This ensures transparency, auditability, and consistency across decision steps.”
⭐ What is an “I don’t know” loop in AI agents?
An “I don’t know” loop happens when an AI agent repeatedly returns uncertainty responses instead of progressing, due to missing context, failed retrieval, or wrong state transitions.
✅ Definition (Very Clear)
An “I don’t know” loop is when the LLM or agent keeps responding with variations of:
“I’m not sure.”
“I don’t know the answer.”
“I don’t have enough information.”
“Please provide more details.”
…but it repeats this in a cycle, because the orchestrator triggers the agent again without fixing the root cause.
✅ Why it happens
RAG retrieval returned no relevant context → similarity scores too low → the agent has nothing to reason with
Wrong agent/tool selected → agent can’t perform the task, so it returns “I don’t know”
Prompt missing required instructions → model stays uncertain
Temperature too high → random variations of “I’m not sure…”
Agent state machine stuck → orchestrator keeps re-calling the same agent
Guardrails instruct the model to be conservative → it keeps refusing rather than solving
🚫 How the loop looks in logs
Example log pattern:
Agent → Query: “Extract policy clause”
RAG → Returned 0 chunks (min similarity < 0.7)
LLM → “I don’t know, I do not have enough context.”
Orchestrator → Retry same agent
LLM → “I don’t know the answer.”
Orchestrator → Retry…
LLM → “I don’t have enough information.”
This is the I-don’t-know loop.
🔧 How to fix it
If similarity < threshold → return fail state instead of retry
Add fallback agent (e.g., “Missing Context Analyzer”)
Improve prompt → add required context or guidance
Stop agents after X retries
Add telemetry to detect repeated uncertain responses
Lower temperature to reduce randomness
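A minimal sketch of the retry cap and fail-state idea, assuming hypothetical RetryGuard and RetrievalResult types (the thresholds are illustrative, not fixed values):

// Hypothetical retry guard: fail fast on weak retrieval or repeated uncertainty instead of looping
public class RetryGuard {

    private static final int MAX_ATTEMPTS = 3;
    private static final double MIN_SIMILARITY = 0.70;

    record RetrievalResult(java.util.List<String> chunks, double topSimilarity) { }

    public String invokeWithGuard(java.util.function.Supplier<RetrievalResult> retrieval,
                                  java.util.function.Function<RetrievalResult, String> agentCall) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {          // stop agents after X retries
            RetrievalResult result = retrieval.get();
            if (result.topSimilarity() < MIN_SIMILARITY) {
                return "FAIL_STATE: no relevant context found";              // fail state instead of retry
            }
            String answer = agentCall.apply(result);
            if (!answer.toLowerCase().contains("i don't know")) {
                return answer;                                               // progressed, exit the loop
            }
        }
        return "FAIL_STATE: agent stayed uncertain after " + MAX_ATTEMPTS + " attempts";
    }
}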
⭐ One-line summary
An “I don’t know” loop happens when the agent repeatedly expresses uncertainty because retrieval, prompt, or state logic is broken — and the orchestrator keeps calling the same agent instead of resolving the root cause.
Example of an “I Don’t Know” Loop
User: Why did the API call return 404?
Model: I’m not sure.
User: The logs are above.
Model: I don’t know.
User: Path is /customers/123.
Model: I’m not sure.
User: Look at the payload.
Model: I don’t have enough information.
Here, the model gets stuck recycling uncertainty rather than reasoning.
Why We Track This in Hallucination Testing
“I don’t know” loops indicate:
Model uncertainty → Not reasoning deeply
Context loss → RAG or memory chain issues
Safety filter stuck in over-trigger mode
Breaking of agent tool-calling chain
Failure to maintain conversation grounding
In observability dashboards, you track them under:
➤ Uncertainty patterns
➤ Repetitive refusal segments
➤ Low-confidence response cluster
How to Detect “I Don’t Know” Loops (Telemetry + Observability)
You monitor:
A. Token-level repetition pattern
Repeated substrings like:
“I don’t know”
“I’m not sure”
“I cannot determine”
“I don’t have enough info”
B. Confidence scores
Low logit confidence clusters across turns.
C. Context window failures
Too many “missing context” responses indicate:
vector retrieval failure
wrong RAG pipeline
chunk mismatch
context is being truncated
D. Safety system triggers
The agent might be stuck in a safety fallback block.
E. Agent execution traces
Look at:
tool call failures
empty tool results
exceptions in chain
timeout on retrieval layer
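To make the trace checks concrete, here is a small sketch; the StepTrace record and TraceHealthCheck class are hypothetical stand-ins for whatever trace schema your agent framework actually emits.

import java.util.ArrayList;
import java.util.List;

// Illustrative trace inspection: surfaces the loop signals listed above from per-step execution traces
public class TraceHealthCheck {

    record StepTrace(String tool, boolean failed, boolean emptyResult, boolean timedOut, String exception) { }

    public List<String> findLoopSignals(List<StepTrace> steps) {
        List<String> signals = new ArrayList<>();
        for (StepTrace step : steps) {
            if (step.failed())            signals.add("tool call failed: " + step.tool());
            if (step.emptyResult())       signals.add("empty tool result: " + step.tool());
            if (step.exception() != null) signals.add("exception in chain: " + step.exception());
            if (step.timedOut())          signals.add("timeout on retrieval layer: " + step.tool());
        }
        return signals;
    }
}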
✅ 4. How to Fix “I Don’t Know” Loops
1. Fix Context Loss
Reduce chunk size
Improve chunk overlap
Upgrade retrieval strategy (MMR, hybrid search, semantic search)
Increase context window
Validate tool outputs
2. Fix Safety Misfiring
Adjust guardrail rules
Add structured exceptions
Create allow-list for safe enterprise terms
3. Fix Prompt Architecture
Add explicit fallback rules:
Bad: “I don’t know.”
Good (New Rule): “If context is missing, explicitly request information from the user or tool-call again; do NOT loop.”
4. Add Confidence-Based Routing
If logit score < threshold:
→ call RAG again
→ call secondary model
→ escalate to human fallback
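A hedged sketch of that routing rule; the confidence value stands in for whatever score your stack exposes (logit-based, reranker, or retrieval similarity), and the ConfidenceRouter class, threshold, and handler functions are illustrative assumptions.

import java.util.function.Function;

// Illustrative confidence-based routing: low-confidence answers are retried, re-routed, or escalated
public class ConfidenceRouter {

    private static final double CONFIDENCE_THRESHOLD = 0.6;   // assumed threshold

    public String route(String query, double confidence, String draftAnswer,
                        Function<String, String> ragRetry,
                        Function<String, String> secondaryModel,
                        Function<String, String> humanFallback) {
        if (confidence >= CONFIDENCE_THRESHOLD) {
            return draftAnswer;                                 // confident enough: return as-is
        }
        String retried = ragRetry.apply(query);                 // 1. call RAG again
        if (retried != null && !retried.isBlank()) {
            return retried;
        }
        String secondary = secondaryModel.apply(query);         // 2. call a secondary model
        if (secondary != null && !secondary.isBlank()) {
            return secondary;
        }
        return humanFallback.apply(query);                      // 3. escalate to human fallback
    }
}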
5. Add Observability Alerts
Trigger alerts when:
3 repetitive refusals
2 missing context errors
2 empty embeddings
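One way to feed such alerts is to emit counters from the orchestrator, for example with Micrometer; the metric names below are assumptions, and the actual alert rules (e.g. "3 refusals in a window") would live in your monitoring stack.

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

// Counts loop-related symptoms so dashboards and alert rules can fire on the thresholds above
@Component
public class LoopSymptomMetrics {

    private final Counter repetitiveRefusals;
    private final Counter missingContextErrors;
    private final Counter emptyEmbeddings;

    public LoopSymptomMetrics(MeterRegistry registry) {
        this.repetitiveRefusals   = registry.counter("agent.refusals.repetitive");
        this.missingContextErrors = registry.counter("agent.context.missing");
        this.emptyEmbeddings      = registry.counter("agent.embeddings.empty");
    }

    public void recordRefusal()        { repetitiveRefusals.increment(); }
    public void recordMissingContext() { missingContextErrors.increment(); }
    public void recordEmptyEmbedding() { emptyEmbeddings.increment(); }
}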
Sample summary answer:
“I don’t know” loops are repetitive uncertainty patterns that occur when the model is stuck in a guardrail fallback, loses context, or receives empty tool-chain outputs. I monitor them using telemetry — token repetition, context window checks, tool-call logs, and confidence scoring. I eliminate them with improved prompt architecture, better retrieval strategies, guardrail tuning, and fallback logic. This ensures the agent is reliable, predictable, and safe in production.
Failure Taxonomy for LLM Agents
(Hallucination vs Refusal vs Drift vs Looping — clear definitions + examples)
1. Hallucination
Definition: Model produces confident but incorrect information not grounded in retrieved data or the prompt.
Symptoms:
Fabricated RBI rules, customer names, policy clauses
Overconfident tone
Missing citations
Root Causes:
Weak retrieval grounding
Poor prompt constraints
Low-quality embeddings
2. Refusal Failure
Definition: Model declines to answer even when allowed or expected.
Symptoms:
“I cannot help with that.”
“This seems unsafe.”
Over-triggering internal safety rails
Root Causes:
Safety alignment overly strict
Prompt phrasing ambiguous
Incorrect classification of task as unsafe
3. Context Drift
Definition: Model gradually deviates from the original user goal due to deteriorating context over multiple turns.
Symptoms:
Topic slowly shifts
Incorrect memory carried forward
Agents responding based on earlier context, not latest
Root Causes:
Insufficient context rehydration
Old messages overweighted
Wrong retrieval chunks
4. Looping (a.k.a. “I don’t know” Loop)
Definition: Agent repeats the same uncertainty statement or fallback logic in a cycle.
Symptoms:
“I’m not sure, let me check…” → retrieves nothing → “I’m still not sure…”
Repetitive tool calls
Infinite chains in LangGraph/LangChain
Root Causes:
Similarity < threshold → retrieval fails
No fallback policy
Faulty planner/orchestrator
4. Architecture: How to Prevent Loops in Multi-Agent Systems
(Event-driven + Orchestrator-controlled)
🔹 High-Level Components
User Proxy Agent → Accepts query, normalizes intent.
Planner / Orchestrator Agent → Decides which agent to call next → Applies loop-detection & halt rules.
Domain-Specific Agents (Fraud, Credit, Policy, Data) → Perform tasks with tools (SQL, APIs, RAG).
Retrieval Layer (Vector DB + similarity threshold) → Sets: min similarity, max results, confidence bands.
Loop Detection & Guardrails Engine → Hard stop after N attempts → Tracks:
previous tool calls
previous final answers
repeated uncertainty patterns
Feedback Channel → Planner → If retrieval fails twice → escalate to “Fallback agent”.
🔹 Loop Prevention Logic
1. Similarity Threshold Bands
if similarity < 0.70 → no retrieval, trigger fallback agent
if 0.70–0.80 → partial retrieval + uncertainty weighting
if > 0.80 → normal retrieval
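A small sketch of these bands as a routing decision; the SimilarityBandRouter class and RouteDecision enum are illustrative, and the score is assumed to be the top similarity returned by the retrieval layer.

// Maps the top retrieval similarity score to a routing decision, following the bands above
public class SimilarityBandRouter {

    enum RouteDecision { FALLBACK_AGENT, PARTIAL_RETRIEVAL_WITH_UNCERTAINTY, NORMAL_RETRIEVAL }

    public RouteDecision route(double topSimilarity) {
        if (topSimilarity < 0.70) {
            return RouteDecision.FALLBACK_AGENT;                       // no retrieval, trigger fallback agent
        }
        if (topSimilarity <= 0.80) {
            return RouteDecision.PARTIAL_RETRIEVAL_WITH_UNCERTAINTY;   // keep chunks, weight answer as uncertain
        }
        return RouteDecision.NORMAL_RETRIEVAL;                         // normal grounded retrieval
    }
}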
2. Step-level Context Rehydration
Every agent call receives:
latest user question
top K retrieved chunks
last successful agent summary
error traces from previous step
Prevents drift.
3. Loop Detection
Track last 3 messages.
If pattern matches:
“not sure”,
“don’t have enough info”,
repeated tool calls with empty result…
→ Automatically terminate.
4. Orchestrator Fallback
Fallback agent options:
Ask Clarifying Question
Return Best-Effort Answer
Escalate to Human
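A small sketch of how the orchestrator might choose among these options; the FallbackPolicy class and its inputs are hypothetical simplifications.

// Hypothetical fallback selection inside the orchestrator
public class FallbackPolicy {

    enum FallbackAction { ASK_CLARIFYING_QUESTION, RETURN_BEST_EFFORT_ANSWER, ESCALATE_TO_HUMAN }

    public FallbackAction choose(boolean retrievalFailedTwice, boolean hasPartialAnswer) {
        if (retrievalFailedTwice && !hasPartialAnswer) {
            return FallbackAction.ASK_CLARIFYING_QUESTION;   // we simply lack information from the user
        }
        if (hasPartialAnswer) {
            return FallbackAction.RETURN_BEST_EFFORT_ANSWER; // return what we have, flagged as best-effort
        }
        return FallbackAction.ESCALATE_TO_HUMAN;             // nothing usable: hand off to a human
    }
}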
5. Spring AI Code Snippet to Detect and Break Loops
(Practical, production-ready sample for interviews)
A. Configure Similarity Threshold
@Bean
public SearchRequest searchRequest() {
return SearchRequest.builder()
.topK(5)
.similarityThreshold(0.75) // <--- minimum similarity score
.build();
}
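Assuming a VectorStore bean is available, the same thresholds can be applied at query time; similaritySearch(SearchRequest) is the standard Spring AI call, while the PolicyRetriever service and query wiring below are illustrative. The request is rebuilt per call because the query text changes per request.

import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

// Example usage: run a similarity search with the configured topK and threshold
@Service
public class PolicyRetriever {

    private final VectorStore vectorStore;

    public PolicyRetriever(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> retrieve(String policyQuery) {
        SearchRequest request = SearchRequest.builder()
                .query(policyQuery)
                .topK(5)
                .similarityThreshold(0.75)                    // chunks below the threshold are dropped
                .build();
        return vectorStore.similaritySearch(request);         // an empty list here should trigger the fallback path
    }
}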
B. Apply Loop Detection – Custom Interceptor
// Note: ResponseInterceptor here is a custom application hook, not a Spring AI interface.
@Component
public class LoopDetectionInterceptor implements ResponseInterceptor {

    // Sliding window of the last three agent responses
    private final Deque<String> lastResponses = new ArrayDeque<>();

    @Override
    public String intercept(String response) {
        lastResponses.add(response);
        if (lastResponses.size() > 3) {
            lastResponses.removeFirst();
        }
        // Declare a loop only once three consecutive responses all express uncertainty
        boolean isLoop = lastResponses.size() == 3 && lastResponses.stream()
                .allMatch(r -> r.contains("I don't know")
                        || r.contains("not sure")
                        || r.contains("cannot find"));
        if (isLoop) {
            return "I am unable to retrieve the right information. "
                    + "Let me switch to fallback logic.";
        }
        return response;
    }
}
C. Context Rehydration in Every Step
public Prompt buildPrompt(String userMessage, List<Document> retrievedDocs, String lastSummary) {
    String context = retrievedDocs.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n\n"));

    String template = """
            You are an enterprise agent.
            Last summary:
            {lastSummary}
            Retrieved context:
            {context}
            User query:
            {query}
            If you lack confidence, return: "NEED_FALLBACK".
            """;

    // PromptTemplate substitutes the {placeholders} with the supplied values
    return new PromptTemplate(template).create(Map.of(
            "query", userMessage,
            "context", context,
            "lastSummary", lastSummary));
}
D. Fallback Handling
if (response.contains("NEED_FALLBACK")) {
return fallbackAgent.handle(query);
}