
Grounded Context & Citation

  • Writer: Anand Nerurkar
  • 16 hours ago

Are citation and grounded context the same?

=====

Good question. They're related but not the same, and interviewers often use the two terms loosely, which causes confusion.

Short answer

Citation ≠ Grounded Context

Citation supports grounding, but grounding is broader.

1. What is Grounded Context?

Grounded context means the AI response is anchored to trusted, known data, not hallucinated.

The model is constrained to:

  • Enterprise documents

  • Policies

  • Contracts

  • Databases

  • Knowledge bases

  • Retrieved search results (RAG)

👉 Goal: Accuracy, relevance, and reduced hallucination

Example

“Based on your 2024 supplier contract (Clause 7.2), termination requires 60 days’ notice.”

Here, the answer is grounded because it came from a real contract.

2. What is a Citation?

A citation is the evidence of grounding: it shows where the information came from, so a reader or auditor can check it against the source.

It can be:

  • Document name

  • Section / clause number

  • Page number

  • URL / document ID

👉 Goal: Transparency and trust

Example

“Termination requires 60 days’ notice (Supplier_Contract_v3.pdf, Clause 7.2).”

That clause reference is the citation.
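As a minimal sketch, a citation can be modeled as a small value object that travels with the answer text. The `Citation` record and its `render` and `attachTo` methods are hypothetical names chosen for illustration; they are not part of any library.

```java
import java.util.Objects;

// Hypothetical value object: records where a statement came from.
record Citation(String documentName, String section) {

    Citation {
        Objects.requireNonNull(documentName, "documentName");
        Objects.requireNonNull(section, "section");
    }

    // Renders the reference in the "(document, section)" style used in the example.
    String render() {
        return "(" + documentName + ", " + section + ")";
    }

    // Places the rendered citation before the sentence's final period, if any.
    String attachTo(String sentence) {
        String trimmed = sentence.strip();
        if (trimmed.endsWith(".")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return trimmed + " " + render() + ".";
    }
}
```

Calling `new Citation("Supplier_Contract_v3.pdf", "Clause 7.2").attachTo(...)` on the answer sentence produces the cited form shown in the example. Page numbers or document IDs could be added as extra record fields in the same way.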

3. Relationship Between the Two

Aspect                  | Grounded Context                | Citation
What it is              | Data the model is restricted to | Evidence reference
Purpose                 | Prevent hallucinations          | Build trust & auditability
Required for accuracy   | ✅ Yes                          | ❌ Not strictly
Required for compliance | Sometimes                       | Often
Visible to user         | Not always                      | Usually

Key insight:

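A response can be grounded without showing citations, but a citation is only meaningful if the response really was grounded in the cited source. A model can also produce a citation that looks right but points to nothing, so citations should be checked against what was actually retrieved.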

4. In Enterprise / Procurement / BFSI Context

You’ll often say:

“We ground GenAI responses using enterprise-approved data sources and provide clause-level citations for auditability and compliance.”

That sentence shows:

  • You understand AI safety

  • You understand regulatory expectations

  • You know how to operationalize GenAI

5. One-Line Summary

“Grounded context ensures the model answers only from trusted enterprise data, while citations provide traceability and auditability by showing exactly which document or clause the answer came from.”

Sample Codebase

=======

Below is a simple reference codebase that demonstrates Grounded Context + Citations using:

  • Java + Spring Boot

  • Spring AI

  • Azure OpenAI

  • PostgreSQL + pgvector

  • RAG (Retrieval-Augmented Generation)

I’ll keep it clean, minimal, and explainable so you can walk through it confidently.

1️⃣ Architecture Flow (What You Explain Verbally)

User Question
   ↓
Query Embedding
   ↓
Vector Search (pgvector)
   ↓
Retrieve Relevant Documents (Grounded Context)
   ↓
Prompt with Context + Citation Metadata
   ↓
LLM Answer (with Citations)

Key idea:

The LLM is instructed NOT to answer from memory, only from the retrieved enterprise documents.
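The flow above can be sketched as three pluggable stages wired together. This is only an illustration under assumptions: `RagPipeline`, `Chunk`, and the three `Function` fields are hypothetical names, and in the real stack the stages would be backed by the embedding model, pgvector, and the chat model.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical wiring of the flow; each stage is injected so it can be stubbed.
final class RagPipeline {

    record Chunk(String content, String sourceDoc, String section) {}

    private final Function<String, float[]> embedder;        // question -> embedding
    private final Function<float[], List<Chunk>> retriever;  // embedding -> top chunks
    private final Function<String, String> llm;              // prompt -> answer

    RagPipeline(Function<String, float[]> embedder,
                Function<float[], List<Chunk>> retriever,
                Function<String, String> llm) {
        this.embedder = embedder;
        this.retriever = retriever;
        this.llm = llm;
    }

    String answer(String question) {
        // Query Embedding
        float[] queryEmbedding = embedder.apply(question);
        // Vector Search + Retrieve Relevant Documents
        List<Chunk> chunks = retriever.apply(queryEmbedding);
        // Prompt with Context + Citation Metadata
        StringBuilder context = new StringBuilder();
        for (Chunk c : chunks) {
            context.append(c.content())
                   .append(" [Source: ").append(c.sourceDoc())
                   .append(", ").append(c.section()).append("]\n");
        }
        String prompt = "Context:\n" + context + "Question: " + question;
        // LLM Answer
        return llm.apply(prompt);
    }
}
```

Injecting the stages as functions keeps the walkthrough easy to explain: each arrow in the diagram maps to one line in `answer`, and each stage can be replaced with a stub when testing.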

2️⃣ Database Schema (Grounded Data + Citation Metadata)

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE procurement_docs (
    id UUID PRIMARY KEY,
    content TEXT,
    source_doc VARCHAR(255),
    section VARCHAR(100),
    embedding VECTOR(1536)
);

Example row:

content                                  | source_doc               | section
"Termination requires 60 days notice..." | Supplier_Contract_v3.pdf | Clause 7.2

➡️ source_doc + section = citation

3️⃣ Document Ingestion (Embedding with Metadata)

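import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;

@Autowired
private EmbeddingClient embeddingClient;

@Autowired
private JdbcTemplate jdbcTemplate;

public void ingestDocument(String content, String source, String section) {

    // Spring AI 0.8.x API: embed(String) returns List<Double>.
    // Later releases rename this interface to EmbeddingModel.
    List<Double> embedding = embeddingClient.embed(content);

    // JDBC cannot bind a Java array to a pgvector column directly, so send
    // the vector as a text literal such as [0.1,0.2,...] and cast it in SQL.
    String vectorLiteral = embedding.stream()
        .map(String::valueOf)
        .collect(Collectors.joining(",", "[", "]"));

    jdbcTemplate.update("""
        INSERT INTO procurement_docs (id, content, source_doc, section, embedding)
        VALUES (?, ?, ?, ?, ?::vector)
        """,
        UUID.randomUUID(),
        content,
        source,
        section,
        vectorLiteral
    );
}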

🔹 Grounding starts here

🔹 Every chunk carries traceability
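The ingestion method expects each chunk to arrive with its section label already known. One way to get there, sketched below, is to split the contract text on clause headings before ingesting. `ClauseSplitter` is a hypothetical helper, and the heading format it looks for (a line starting with "Clause <number>.<number>") is an assumption about how the documents are laid out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits contract text into clause-level chunks.
final class ClauseSplitter {

    record Chunk(String section, String content) {}

    // Assumed heading format: a line that starts with "Clause 7.2" or similar.
    private static final Pattern HEADING =
        Pattern.compile("(?m)^(Clause \\d+\\.\\d+)\\b");

    static List<Chunk> split(String documentText) {
        List<Chunk> chunks = new ArrayList<>();
        Matcher m = HEADING.matcher(documentText);
        String section = null;
        int bodyStart = 0;
        while (m.find()) {
            // Close the previous clause when the next heading begins.
            if (section != null) {
                String body = documentText.substring(bodyStart, m.start()).strip();
                chunks.add(new Chunk(section, body));
            }
            section = m.group(1);
            bodyStart = m.end();
        }
        // The last clause runs to the end of the document.
        if (section != null) {
            chunks.add(new Chunk(section, documentText.substring(bodyStart).strip()));
        }
        return chunks;
    }
}
```

Each resulting chunk would then be passed to `ingestDocument(chunk.content(), "Supplier_Contract_v3.pdf", chunk.section())`, which is what makes clause-level citations possible later. Text before the first heading is ignored in this sketch.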

4️⃣ Vector Search (Grounded Context Retrieval)

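// questionEmbedding is the question's embedding serialized as a pgvector
// text literal, e.g. "[0.12,-0.03,...]", produced by the same embedding
// model that was used at ingestion time.
public List<Map<String, Object>> retrieveContext(String questionEmbedding) {

    // <-> is pgvector's Euclidean (L2) distance operator; the three closest
    // chunks come back together with their citation metadata.
    String sql = """
        SELECT content, source_doc, section
        FROM procurement_docs
        ORDER BY embedding <-> ?::vector
        LIMIT 3
        """;

    return jdbcTemplate.queryForList(sql, questionEmbedding);
}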

Only enterprise-approved documents are retrieved.
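To make the ranking step explainable without a database, here is a small in-memory sketch of what `ORDER BY embedding <-> ? LIMIT 3` does: compute the Euclidean (L2) distance from the query to every row and keep the closest ones. `NearestChunks` and `Row` are hypothetical names, and the sketch scans every row, which is only suitable for illustration.

```java
import java.util.Comparator;
import java.util.List;

// In-memory illustration of nearest-neighbour retrieval by L2 distance.
final class NearestChunks {

    record Row(String content, String sourceDoc, String section, double[] embedding) {}

    // Euclidean (L2) distance, the metric behind pgvector's <-> operator.
    static double l2(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Same idea as: ORDER BY embedding <-> :query LIMIT :k
    static List<Row> topK(List<Row> rows, double[] query, int k) {
        return rows.stream()
            .sorted(Comparator.comparingDouble((Row r) -> l2(r.embedding(), query)))
            .limit(k)
            .toList();
    }
}
```

Because every returned row still carries `sourceDoc` and `section`, the citation metadata survives the retrieval step unchanged.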

5️⃣ Prompt Construction (Grounded + Citation-Aware)

public String buildPrompt(
        String userQuestion,
        List<Map<String, Object>> docs) {

    StringBuilder context = new StringBuilder();

    for (Map<String, Object> doc : docs) {
        context.append("""
        Content:
        %s
        Source: %s, %s

        """.formatted(
            doc.get("content"),
            doc.get("source_doc"),
            doc.get("section")
        ));
    }

    return """
    Answer the question ONLY using the context below.
    If the answer is not present, say "Information not found."

    Context:
    %s

    Question:
    %s

    Include citations in your answer.
    """.formatted(context, userQuestion);
}

This is strict, prompt-level grounding:

  • No context → No answer (the model is told to reply "Information not found.")

  • Citations are requested in every answer
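The prompt asks the model to decline when the context lacks the answer, but the "no context, no answer" rule can also be enforced in code before the model is ever called. The sketch below shows one way to do that; `GroundedAnswerer` is a hypothetical class, and the `llm` parameter is a stand-in for the chat model call rather than a real Spring AI type.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical guard: refuses to call the model when retrieval found nothing.
final class GroundedAnswerer {

    static final String NOT_FOUND = "Information not found.";

    // llm is a stand-in for the chat model call (prompt -> answer).
    static String answer(String question,
                         List<Map<String, Object>> docs,
                         Function<String, String> llm) {
        // Enforce "No context -> No answer" in code: skip the model entirely.
        if (docs == null || docs.isEmpty()) {
            return NOT_FOUND;
        }
        StringBuilder context = new StringBuilder();
        for (Map<String, Object> doc : docs) {
            context.append("Content:\n").append(doc.get("content")).append('\n')
                   .append("Source: ").append(doc.get("source_doc"))
                   .append(", ").append(doc.get("section")).append("\n\n");
        }
        String prompt = "Answer the question ONLY using the context below.\n"
            + "If the answer is not present, say \"" + NOT_FOUND + "\"\n\n"
            + "Context:\n" + context
            + "Question:\n" + question + "\n\n"
            + "Include citations in your answer.";
        return llm.apply(prompt);
    }
}
```

With the guard in place, an empty retrieval result never reaches the model, so the refusal does not depend on the model following instructions.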

6️⃣ LLM Call (Azure OpenAI via Spring AI)

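@Autowired
private ChatClient chatClient;

// ChatClient.call(String) is the Spring AI 0.8.x API; later releases
// expose the same call on an interface named ChatModel.
public String answerQuestion(String prompt) {
    return chatClient.call(prompt);
}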

7️⃣ Final User Response (Grounded + Cited)

User Question:

“What is the supplier contract termination notice period?”

LLM Response:

“The supplier contract requires a termination notice period of 60 days (Supplier_Contract_v3.pdf, Clause 7.2).”

✅ Grounded ✅ Cited ✅ Auditable ✅ Compliance-ready
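Since the citation text is generated by the model, it is worth checking that each cited source was really among the retrieved chunks before showing the answer. The sketch below is one possible check; `CitationValidator` is a hypothetical class, and it assumes citations appear in the "(document, section)" format used in the sample response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-check: do the cited sources match what was retrieved?
final class CitationValidator {

    // Assumed citation format: "(<document>, <section>)".
    private static final Pattern CITATION =
        Pattern.compile("\\(([^(),]+),\\s*([^()]+)\\)");

    // Returns every "document, section" pair found in the answer.
    static List<String> extract(String answer) {
        List<String> found = new ArrayList<>();
        Matcher m = CITATION.matcher(answer);
        while (m.find()) {
            found.add(m.group(1).strip() + ", " + m.group(2).strip());
        }
        return found;
    }

    // True only if the answer cites something and every citation was retrieved.
    static boolean isSupported(String answer, Set<String> retrievedSources) {
        List<String> cited = extract(answer);
        return !cited.isEmpty() && retrievedSources.containsAll(cited);
    }
}
```

This only confirms that a citation points at a retrieved chunk. It does not confirm that the chunk actually supports the claim, which would need a separate check or human review.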

8️⃣ Summary

“We implement GenAI using a RAG-based grounded architecture. User queries are converted to embeddings and matched against enterprise-approved documents stored in pgvector. The retrieved content forms the only context passed to the LLM, which reduces the risk of hallucination. Each document chunk carries metadata such as document name and clause number, which we surface as citations in the response to support traceability, auditability, and regulatory compliance.”


 
 
 

©2024 by AeeroTech. Proudly created with Wix.com
