AI Knowledge Hub
- Anand Nerurkar
- Dec 2
- 11 min read
1️⃣ What is Your “Knowledge Hub” in Banking GenAI?
You defined it correctly:
Knowledge Hub (Permanent Vector Store) contains:
RBI circulars
Bank lending policies
Credit risk rules
AML/Sanctions SOPs
Legal templates and compliance rules
Product brochures and pricing rules
LLMOps Pipeline for Knowledge Hub (Permanent)
Document Upload
→ OCR
→ Chunking
→ Embedding
→ Indexing
→ Vector Store (pgvector / Pinecone / Azure AI Search)
→ RAG Layer
✅ This is long-lived
✅ Used across all customers
✅ Updated only when policies change
✅ Fully governed & audited
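As a rough illustration of the chunking, embedding, and indexing stages listed above, here is a minimal Python sketch. It is not a production design: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, and `IndexedChunk`, `ingest`, and the in-memory list are hypothetical names used only for illustration. A real hub would call an actual embedding model and write to pgvector, Pinecone, or Azure AI Search.

```python
import math
import zlib
from dataclasses import dataclass

DIM = 64  # toy embedding width (assumption); real models use far more dimensions


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str) -> list[float]:
    """Hashed bag-of-words vector, L2-normalised. Stand-in for a real embedding model."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


@dataclass
class IndexedChunk:
    doc_id: str
    version: str  # kept so that policy updates can be versioned and audited
    text: str
    vector: list[float]


def ingest(doc_id: str, version: str, text: str, store: list[IndexedChunk]) -> int:
    """Chunk, embed, and append one policy document to the store; returns chunks added."""
    chunks = chunk_text(text)
    for chunk in chunks:
        store.append(IndexedChunk(doc_id, version, chunk, toy_embed(chunk)))
    return len(chunks)
```

Carrying `doc_id` and `version` on every chunk is one simple way to support the "updated only when policies change" and "governed & audited" properties, since older versions can be filtered out or traced later.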
2️⃣ Loan Agreement is NOT Knowledge Hub Content
A loan agreement is:
Customer-specific
Confidential (PII + financial data)
Used only by:
That borrower
That underwriter
That relationship manager
So it should NOT be mixed with RBI policies or SOPs in the same permanent vector store.
Instead, it goes into a:
✅ Temporary / Ephemeral Vector Index
3️⃣ Correct End-to-End Flow for Loan Agreement + GenAI Explanation
Step A — Agreement is Generated (Traditional System)
Loan Agreement is generated via:
Rule engine + templates
Product configuration
Legal clauses
CBS + LOS APIs
This is NOT GenAI yet.
It is stored in:
Secure Object Store (Blob / S3 / MinIO)
loan-agreement-ramesh-12345.pdf
Step B — Event Triggers GenAI Processing
An event is fired:
Event: LoanAgreement.Generated
Payload:
{
"applicationId": "LN-12345",
"documentPath": "/agreements/loan-agreement-ramesh-12345.pdf",
"customerId": "CUST-9912",
"ttl": "48h"
}
This event starts the LLMOps pipeline for this document only.
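A consumer of this event might first validate the payload and turn the `ttl` string into a duration. The sketch below assumes the payload arrives as JSON with the field names shown above and that `ttl` uses a number followed by `m`, `h`, or `d`; the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class LoanAgreementGenerated:
    application_id: str
    document_path: str
    customer_id: str
    ttl: timedelta


def parse_ttl(value: str) -> timedelta:
    """Parse strings such as '48h', '30m', or '2d' (assumed format) into a timedelta."""
    units = {"m": "minutes", "h": "hours", "d": "days"}
    if len(value) < 2 or value[-1] not in units or not value[:-1].isdigit():
        raise ValueError(f"unsupported ttl: {value!r}")
    return timedelta(**{units[value[-1]]: int(value[:-1])})


def parse_event(raw: str) -> LoanAgreementGenerated:
    """Decode the JSON payload; a missing field raises KeyError before any processing."""
    data = json.loads(raw)
    return LoanAgreementGenerated(
        application_id=data["applicationId"],
        document_path=data["documentPath"],
        customer_id=data["customerId"],
        ttl=parse_ttl(data["ttl"]),
    )
```

Failing fast on a malformed `ttl` matters here, because a silently ignored TTL would leave customer embeddings without an expiry.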
Step C — Temporary LLMOps Pipeline for This Agreement
PDF from Blob
→ OCR (if scanned)
→ Chunking (clause-wise)
→ Embedding
→ TEMP Vector Index (Isolated Namespace)
→ TTL = 48 hours
This is NOT your permanent knowledge hub.
This is called:
✅ Ephemeral / Session-Based Vector Index
Used only for:
This one customer
This one agreement
For a limited time
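Clause-wise chunking can be sketched as a split at numbered clause headings. The example below assumes clauses start on their own line with a number such as `1.` or `2.1`; real agreements vary, so the pattern is only an assumption and `split_clauses` is a hypothetical helper.

```python
import re

# Assumed clause format: a line starting with "1.", "2)", or "2.1" followed by whitespace.
CLAUSE_START = re.compile(r"^(\d+(?:\.\d+)*)[.)]?\s+", re.MULTILINE)


def split_clauses(text: str) -> list[tuple[str, str]]:
    """Return (clause_number, clause_text) pairs, one per numbered clause."""
    matches = list(CLAUSE_START.finditer(text))
    clauses = []
    for i, match in enumerate(matches):
        # A clause runs until the next numbered heading or the end of the document.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = " ".join(text[match.end():end].split())  # collapse line breaks and spaces
        clauses.append((match.group(1), body))
    return clauses
```

Keeping the clause number with each chunk lets the assistant cite the exact clause it is explaining. A line of body text that happens to begin with a bare number would be misread as a heading, so a production splitter would need stricter rules.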
Step D — How GenAI Explains the Agreement
Now when Ram asks:
“Summarize my loan agreement in simple terms”
GenAI does:
Query the TEMP vector index
Retrieve only Ram’s agreement clauses
Combine with permanent Knowledge Hub (for interpretation rules)
Generate explanation
Example prompt to LLM:
You are a banking legal assistant.
Use the retrieved clauses from the agreement.
Explain risks in simple language for a retail borrower.
Output:
EMI structure
Prepayment penalty
Variable interest risk
Default consequences
Cooling-off period
✅ This explanation is real-time
✅ No data pollution
✅ Fully compliant
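To make Step D concrete, here is one possible way to assemble the prompt from the two retrieved contexts. The system text is the example prompt quoted above; the section labels and the `build_prompt` helper are assumptions for illustration, not a prescribed format.

```python
SYSTEM_PROMPT = (
    "You are a banking legal assistant.\n"
    "Use the retrieved clauses from the agreement.\n"
    "Explain risks in simple language for a retail borrower."
)


def build_prompt(question: str, agreement_clauses: list[str], policy_notes: list[str]) -> str:
    """Combine the system text, both retrieved contexts, and the customer's question."""
    clause_block = "\n".join(
        f"[Clause {i}] {clause}" for i, clause in enumerate(agreement_clauses, 1)
    )
    policy_block = "\n".join(
        f"[Policy {i}] {note}" for i, note in enumerate(policy_notes, 1)
    )
    return (
        f"{SYSTEM_PROMPT}\n\n"
        f"Agreement clauses:\n{clause_block}\n\n"
        f"Policy context:\n{policy_block}\n\n"
        f"Customer question: {question}"
    )
```

Labelling agreement clauses and policy notes separately keeps the customer's own terms distinct from general rules in the final answer.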
Step E — Automatic Expiry (TTL Control)
After 24–48 hours:
TEMP Vector Index → Auto Deleted
Original PDF → Only in core LMS document store
Embeddings → Destroyed
This ensures:
✅ Data privacy
✅ RBI compliance
✅ No accidental reuse
✅ No model contamination
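The expiry step can be pictured as a periodic sweep over temporary namespaces. The in-memory `EphemeralRegistry` below is a hypothetical stand-in for whatever store holds the temporary index; managed vector stores may offer their own expiry or delete mechanisms, which this sketch does not model.

```python
from datetime import datetime, timedelta


class EphemeralRegistry:
    """Tracks temporary namespaces and drops them once their TTL has passed."""

    def __init__(self) -> None:
        # namespace -> (expiry time, stored vectors)
        self._namespaces: dict[str, tuple[datetime, list[list[float]]]] = {}

    def create(self, namespace: str, ttl: timedelta, now: datetime) -> None:
        """Register an empty namespace that expires ttl after now."""
        self._namespaces[namespace] = (now + ttl, [])

    def add(self, namespace: str, vector: list[float]) -> None:
        """Store one embedding in an existing namespace."""
        self._namespaces[namespace][1].append(vector)

    def purge_expired(self, now: datetime) -> list[str]:
        """Delete every namespace whose expiry has passed; returns the deleted names."""
        expired = [
            name for name, (expires_at, _) in self._namespaces.items() if expires_at <= now
        ]
        for name in expired:
            del self._namespaces[name]  # embeddings are dropped with the namespace
        return expired

    def __contains__(self, namespace: str) -> bool:
        return namespace in self._namespaces
```

Returning the names of purged namespaces gives the caller something to write to an audit log, which fits the "logged but auto-expired" idea.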
4️⃣
“Loan agreement also goes to Knowledge Hub with TTL”
🔴 Correction: It should NOT go to the permanent Knowledge Hub.
✅ Correct Architecture is:
Store Type | Purpose | Retention |
Permanent Knowledge Hub | RBI, policies, SOPs | Years |
Temporary Vector Index | One customer’s agreement | 24–48 hrs |
This is the industry-grade BFSI setup.
5️⃣ Summary
“Loan agreements are not stored in the permanent knowledge hub. Once the agreement is generated, an event triggers a temporary LLMOps pipeline where the document is OCRed, chunked, embedded, and stored in an isolated, short-lived vector index with strict TTL. The GenAI assistant uses this ephemeral index combined with the bank’s permanent policy knowledge hub via RAG to explain clauses, risks, and EMI terms to the customer. After expiry, all embeddings are purged to meet RBI data-retention and privacy guidelines.”
This answer shows:
✅ LLMOps maturity
✅ BFSI compliance
✅ Architectural depth
✅ AI-First thinking
1️⃣ What is a Permanent Knowledge Hub (Long-Term Vector Store)?
This is your enterprise memory.
It contains:
RBI circulars
Bank lending policies
Credit risk frameworks
AML / Fraud SOPs
Product brochures
Underwriting manuals
Regulatory FAQs
Pipeline (One-time or Periodic)
Policy PDF → OCR → Chunking → Embedding → Indexing → Stored in PGVector / Pinecone
Characteristics
✅ Long-lived
✅ Versioned
✅ Audited
✅ Used by RAG for months/years
✅ Supports compliance
✅ Not deleted automatically
Examples of Tech
PGVector (Postgres)
Pinecone
Weaviate
Azure AI Search (vector mode)
This is what you correctly called the Knowledge Hub ✅
2️⃣ What is an Ephemeral (Temporary) Vector Index?
This is short-lived, request-scoped memory created only for a specific transaction or document, NOT for enterprise reuse.
Typical use cases:
Generated loan agreement
One customer’s KYC packet
One uploaded bank statement
One email thread
One support ticket document
Characteristics
Property | Ephemeral Vector Index |
Lifetime | Minutes / Hours / 1–2 Days |
Scope | Per customer / per transaction |
Storage | In-memory / temp Redis / temp vector DB |
Compliance | Auto-expiry (TTL) |
Reuse | ❌ Not reused globally |
Purpose | Only for contextual understanding |
Cost | Low |
Risk | Minimal data residency |
Typical Tech
In-memory FAISS
Redis Vector
Temporary PGVector schema
Temp Pinecone namespace with TTL
3️⃣ Your Key Doubt – “If Loan Agreement is NOT in Knowledge Hub, how does GenAI read it?”
✅ Correct: The loan agreement should NOT go to the permanent Knowledge Hub
✅ Correct: It must go to a temporary / ephemeral vector index
Why?
Because:
It contains PII + financial data
It is user-specific
It does not apply to other customers
Storing it permanently violates data minimization (RBI/DPDP Act)
4️⃣ Correct End-to-End Flow for Loan Agreement + GenAI Explanation
This is the production-grade flow you should explain in an interview:
🔹 Step 1: Loan Agreement Generated (Template-Based)
This is NOT GenAI — you are right.
Loan Service → Fills Template → Generates PDF → Stores in Secure Object Store
Example:
Azure Blob (private container)
S3 Private Bucket
Event Fired:
LoanAgreement.Generated
{
loanId,
customerId,
documentUrl
}
🔹 Step 2: Ephemeral LLMOps Pipeline Triggered
Event triggers a temporary GenAI pipeline:
LoanAgreement PDF
↓
OCR (if scanned)
↓
Chunking
↓
Embedding
↓
Ephemeral Vector Index (TTL = 24–48 hours)
⚠️ This does NOT go into your permanent Knowledge Hub
It goes into a temporary vector space scoped to:
sessionId + loanId + customerId
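One way to derive such a scoped namespace is to hash the three identifiers, so that raw customer IDs do not appear in index names. This is only a sketch: the `tmp-` prefix, the 16-character digest length, and the `namespace_for` helper are assumptions.

```python
import hashlib


def namespace_for(session_id: str, loan_id: str, customer_id: str) -> str:
    """Build a deterministic, non-reversible namespace name from the three identifiers."""
    # "|" is assumed not to occur inside the identifiers themselves.
    raw = "|".join((session_id, loan_id, customer_id))
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
    return f"tmp-{digest}"
```

Because the name is deterministic, the retrieval side can recompute it from the same three IDs instead of looking it up, and a different customer or session always lands in a different namespace.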
🔹 Step 3: Borrower Assistant Uses RAG with TWO Data Sources
When customer asks:
“Explain my loan agreement risks”
The GenAI does:
A) From Permanent Knowledge Hub
RBI rules
Prepayment penalty norms
Foreclosure regulations
Fair practice code
B) From Ephemeral Vector Index
Customer’s specific loan agreement
EMIs
Clauses
Interest rate
Penalties
Then it performs:
RAG (Policy Context + Agreement Context)
→ LLM Reasoning
→ Customer-friendly explanation
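The two-source retrieval can be sketched with plain cosine similarity over two small in-memory stores. Vectors are passed in directly here; in practice they would come from the same embedding model for both stores. All function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], store: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def dual_retrieve(
    query: list[float],
    agreement_index: list[tuple[str, list[float]]],
    knowledge_hub: list[tuple[str, list[float]]],
    k: int = 2,
) -> dict[str, list[str]]:
    """Search both stores separately and keep the results labelled by source."""
    return {
        "agreement": top_k(query, agreement_index, k),
        "policy": top_k(query, knowledge_hub, k),
    }
```

Searching the stores separately, rather than merging them into one ranking, ensures the customer's own clauses are always represented even when policy text scores higher.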
🔹 Step 4: TTL Expiry and Auto-Delete
After 24 or 48 hours:
✅ The ephemeral vector index is deleted automatically
✅ The original PDF remains in secure storage (for legal reasons)
✅ No embeddings of customer data remain in vector DB
✅ You remain RBI + DPDP compliant
5️⃣ So Your Understanding Is 100% Correct ✅
You said:
“Once loan agreement is generated and uploaded, based on event it will kick LLMOps → chunk → embed → index → then GenAI explains it”
✅ Correct
✅ Just one refinement:
It is indexed into a TEMPORARY vector index, not the permanent knowledge hub.
6️⃣
“Our Knowledge Hub is a permanent vector store containing RBI policies, lending rules, and SOPs. However, customer-specific documents like loan agreements are never stored there. Instead, when a loan agreement is generated, an event triggers an ephemeral LLMOps pipeline which creates a temporary vector index with a strict TTL. GenAI uses this temporary index together with the permanent knowledge hub for RAG-based explanation of clauses, risks, and EMIs. After TTL expiry, the temporary embeddings are auto-deleted to ensure regulatory compliance and data minimization.”
1️⃣ What is your Knowledge Hub (Permanent Vector Store)?
You are 100% correct:
Knowledge Hub = All RBI docs, lending policies, credit rules, SOPs, risk frameworks, product brochures, compliance norms
→ OCR → Chunk → Embed → Index → Store in Permanent Vector DB (pgvector / Pinecone / Weaviate)
This is:
✅ Long-term memory
✅ Used by RAG for:
Policy explanations
Eligibility rules
Compliance justifications
Customer FAQs
Underwriter reasoning
This data:
Never expires
Is versioned
Is governed
Is auditable
2️⃣ When Loan Agreement is Generated — Why It Is NOT in Knowledge Hub
You are also correct here:
Loan agreement is customer-specific, transaction-specific & short-lived
So it must NOT be mixed with permanent Knowledge Hub, because:
It is PII + Financial Contract
It is not reusable for other customers
It is legally sensitive
It must follow data minimization & retention policy
So it goes into a:
✅ Temporary / Ephemeral Vector Index
✅ With TTL (time-to-live), e.g., 24–72 hours
3️⃣ What is an Ephemeral Vector Index (Temporary Vector Store)?
Think of it as:
A short-term memory for GenAI, created only for one customer session or one document.
Examples:
In-memory FAISS
Redis Vector
Temp pgvector namespace
Encrypted short-lived namespace in Pinecone
Characteristics
Feature | Permanent Knowledge Hub | Ephemeral Vector Index |
Purpose | Policies, SOPs, RBI docs | One loan agreement |
Retention | Years | Minutes / Hours / Days |
PII Allowed | ❌ No | ✅ Yes |
Reusable | ✅ Yes | ❌ No |
Audited | ✅ Fully | ✅ Logged but auto-expired |
Used For | Governance & reasoning | Contract explanation & Q&A |
4️⃣
❓ “If loan agreement is generated, how does GenAI read it if it is NOT in Knowledge Hub?”
✅ Correct End-to-End Flow (Industry-Standard Design)
Step 1 — Loan Agreement Generated (Template Based)
This is NOT GenAI:
It is rule-based
It uses:
Loan amount
Tenure
Interest rate
Risk category
Regulatory clauses
✅ Output: loan_agreement.pdf
Step 2 — Event Fired
Event: LOAN_AGREEMENT_GENERATED
Payload:
{
loanId,
customerId,
agreementDocURI,
riskCategory
}
Step 3 — OCR + Text Extraction (Document AI)
PDF → OCR → Clean Text
Now we have raw contract text.
Step 4 — Chunk, Embed & Temporary Indexing (Ephemeral Store)
Even for temporary storage, we still:
✅ Chunk
✅ Embed
✅ Index
Why?
Because:
Sending a full 20–40 page contract to the LLM on every question is costly and tends to dilute answer precision
Chunking enables:
Clause-level retrieval
Risk-specific querying
Question-answering
So this becomes:
TempVectorIndex(loanId, customerSessionId, TTL=48 hrs)
Step 5 — GenAI Contract Explainer Uses:
Temp Vector Index → for this exact contract
Permanent Knowledge Hub → to interpret clauses using RBI & policy context
This is called:
✅ Dual RAG
One ephemeral
One permanent
Step 6 — Customer Asks:
“What is the penalty if I miss EMI?”
GenAI does:
Retrieve from temporary vector → Finds the actual penalty clause
Retrieve from knowledge hub → Finds RBI guideline & bank policy on penalties
Combines both → Generates explainable legal answer
✅ This is how GenAI explains a document that was never in Knowledge Hub permanently
Step 7 — TTL Expiry (Auto Deletion)
After 48–72 hours:
✅ Temp embeddings auto-deleted
✅ No customer contract remains in vector DB
✅ Only the original PDF remains in secure document vault (not vector DB)
This is RBI + GDPR + DPDP Act compliant.
5️⃣ Your Key Architectural Question:
❓ “Why not just scan & store text? Why chunk/embed/index even for temporary doc?”
Because:
Only Store Text | Chunk + Embed + Index |
Keyword search only | Semantic search |
No reasoning | Full clause reasoning |
Cannot answer follow-up questions | Can answer multi-step questions |
No risk scoring | Clause-wise risk extraction |
No explainability | Full explainable AI |
Not GenAI-ready | Fully GenAI-ready |
So to enable semantic retrieval for GenAI, embeddings are still needed, even for temporary memory.
6️⃣ Is Loan Agreement Generation GenAI?
✅ You correctly said:
“Loan agreement generation is template-based, not GenAI.”
That is absolutely correct.
Where GenAI is used instead:
Stage | GenAI Used? | Purpose |
Agreement creation | ❌ No | Rule engine + templates |
Clause explanation | ✅ Yes | Natural language explanation |
Risk summarization | ✅ Yes | What risks apply to this borrower |
EMI explanation | ✅ Yes | Personalized EMI understanding |
Legal jargon simplification | ✅ Yes | Customer-friendly language |
Multi-language explanation | ✅ Yes | Local language conversion |
So in an interview you should say:
“We do NOT use GenAI for contract creation — we use it for post-generation understanding, explanation, risk interpretation, and compliance-aware customer communication.”
That is a very strong architectural statement.
7️⃣
“In our AI-first digital lending platform, permanent knowledge like RBI policies, credit rules, and SOPs are stored in a governed vector Knowledge Hub. When a customer-specific loan agreement is generated using a template engine, we do not push that document into the permanent knowledge base. Instead, an event triggers an ephemeral LLMOps pipeline where the contract is OCR-processed, chunked, embedded, and indexed into a short-lived vector index with a strict TTL. During the review window, the GenAI Borrower Assistant uses a dual-RAG approach — retrieving actual clauses from the temporary vector index and interpreting them using the permanent policy knowledge hub. This allows us to explain penalties, risks, repayment terms, and legal clauses with full regulatory context. After the TTL expires, all embeddings are deleted to comply with RBI, DPDP, and data minimization laws.”
✅ Is knowledge hub permanent vector store?
Yes — pgvector / Pinecone / Weaviate.
✅ Is loan agreement also pushed there?
No — it goes to ephemeral vector index only.
✅ What is ephemeral vector index?
Short-term, session-based vector store with TTL.
✅ Why chunk & embed temporary docs?
Because raw text only supports keyword lookup. Embeddings enable semantic retrieval, so only the relevant clauses are passed to the LLM.
✅ How GenAI reads agreement without Knowledge Hub?
Through temporary vector index + dual RAG.
1️⃣ What is your Knowledge Hub (Permanent Vector Store)?
Knowledge Hub = Permanent Vector Store (pgvector / Pinecone / Weaviate) that contains:
RBI circulars
Bank lending policies
Credit risk rules
AML / KYC SOPs
Product terms & conditions
Historical regulatory documents
Pipeline for Knowledge Hub:
Document Upload
→ OCR / Text Extraction
→ Chunking
→ Embedding
→ Indexing
→ Stored permanently in Vector Store
→ Accessed via RAG
✅ This data is:
Long-living
Versioned
Audited
Used by both Borrower Assistant + Underwriter Copilot
2️⃣ What is a Temporary / Ephemeral Vector Index?
This is NOT your main knowledge hub.
It is a short-lived, session-based vector store used for user-specific / transaction-specific documents like:
Loan Agreement generated for Ram
Sanction letter
Offer letter
Signed consent document
These documents:
Are private to Ram
Are valid only for a short time
Must not pollute enterprise knowledge hub
Must be auto-deleted after TTL
So we create:
✅ Ephemeral Vector Index = Temporary, in-memory or short-lived vector store
Examples:
Redis Vector
pgvector table with TTL
In-memory FAISS
Weaviate TTL collections
3️⃣ Why Do We Still Need Chunking + Embedding + Indexing Even for Temporary Store?
“We can simply scan loan agreement and extract text, why chunk + embed?”
Because plain extracted text cannot be searched semantically
If you only store plain text:
The retrieval layer cannot do semantic search
Cannot do clause-level Q&A
Cannot retrieve specific risk clauses
Cannot compare against policies
Therefore even temporary documents go through:
Loan Agreement PDF
→ OCR / Text Extraction
→ Chunking (Clause-wise)
→ Embedding
→ Stored in Ephemeral Vector Index
→ Used by RAG
✅ Without embeddings, you only have string search
✅ With embeddings, you have semantic intelligence
4️⃣ How GenAI Explains a Loan Agreement
“If loan agreement is not in knowledge hub, how does GenAI read it?”
Here is the correct production flow:
✅ Step 1: Loan Agreement Generated (Non-GenAI)
Loan agreement is generated using:
Template Engine (Docx / HTML / PDF)
Rule-based parameter substitution
Example:
Loan Amount = ₹12,00,000
Tenure = 60 months
Interest = 11.5%
At this stage:
❌ No GenAI involved yet
✅ It is just a deterministic document
✅ Step 2: Agreement Upload Triggers Event
Event Fired → loan.agreement.generated
Payload:
{
loanId,
customerId,
documentPath,
timestamp
}
✅ Step 3: LLMOps Pipeline is Triggered
This document goes to Temporary LLMOps Pipeline:
OCR
→ Text Extraction
→ Chunking (Clause-wise)
→ Embedding
→ Stored in Ephemeral Vector Index (TTL = 48 hours)
This is NOT the permanent knowledge hub.
This is session-specific knowledge memory.
✅ Step 4: Borrower Assistant Uses RAG on Temporary + Permanent Stores
When Ram asks:
“Can you summarize my loan agreement?”
The Borrower Assistant does:
User Query
→ RAG Query
→ Search:
1. Temporary Vector Index (Ram’s agreement)
2. Permanent Knowledge Hub (Policy definitions)
→ Retrieved Chunks
→ LLM Generates:
- Summary
- Clause explanation
- Risk highlights
- EMI obligations
✅ This is how GenAI reads a document it never saw before
✅ Step 5: TTL Expiry (Auto Data Cleanup)
After:
T+48 hours OR
Agreement Signed OR
Cooling-off period ends
Ephemeral Vector Index
→ Auto purge
→ RAM data removed
→ Regulatory compliance maintained
✅ This avoids:
Data leakage
Vector store pollution
Compliance violations
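The three purge triggers in Step 5 (T+48 hours, agreement signed, cooling-off period ended) can be checked with a small decision function. The `AgreementSession` fields and the returned reason strings are hypothetical; the 48-hour default mirrors the TTL used in this flow.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AgreementSession:
    indexed_at: datetime
    signed: bool = False
    cooling_off_ends_at: Optional[datetime] = None


def purge_reason(
    session: AgreementSession,
    now: datetime,
    ttl: timedelta = timedelta(hours=48),
) -> Optional[str]:
    """Return why the temporary index should be purged, or None if it should be kept."""
    if session.signed:
        return "agreement_signed"
    if session.cooling_off_ends_at is not None and now >= session.cooling_off_ends_at:
        return "cooling_off_ended"
    if now - session.indexed_at >= ttl:
        return "ttl_elapsed"
    return None
```

Returning a reason rather than a bare boolean makes the purge easy to record in an audit trail.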
5️⃣ Does RAG Retrieve From Temporary Store Also?
✅ YES — RAG can retrieve from multiple vector sources
Typical RAG routing:
Source | Purpose |
Permanent Knowledge Hub | RBI rules, policies, risk definitions |
Ephemeral Vector Index | Ram’s loan document, sanction letter |
Feature Store (optional) | ML score explanations |
So RAG works like:
Query → Router →
[ Temporary Store + Permanent Store ] →
Merged Context → LLM → Response
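A router of this kind could be as simple as a few rules that decide which sources to search. The rules below are illustrative assumptions only (including the keyword check for score-related questions), and the source names are hypothetical labels for the three rows in the table above.

```python
from dataclasses import dataclass


@dataclass
class QueryContext:
    text: str
    has_active_agreement_index: bool  # True while the customer's temporary index exists


def route(query: QueryContext) -> list[str]:
    """Pick which sources to search for this query, in priority order."""
    sources = ["permanent_knowledge_hub"]  # policy context is always consulted
    if query.has_active_agreement_index:
        # The customer's own document takes priority while its index is alive.
        sources.insert(0, "ephemeral_vector_index")
    if "score" in query.text.lower():
        # Assumed trigger for ML score explanations from the optional feature store.
        sources.append("feature_store")
    return sources
```

Once the temporary index has expired, the same query falls back to policy-only answers without any change on the caller's side.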
6️⃣ Why Not Put Loan Agreement Directly in Knowledge Hub?
Because:
Risk | Why Not |
Privacy | Loan agreement is PII-heavy |
Data Explosion | Millions of agreements |
Compliance | Must be auto-expired |
Vector Pollution | Reduces search quality |
Legal | User-specific docs must not be searchable by others |
✅ Therefore: Ephemeral Index is mandatory in regulated BFSI systems
7️⃣ Final Clean Mental Model
“We maintain two vector layers. One is our permanent enterprise knowledge hub containing RBI regulations, product policies, credit risk rules, and SOPs. The second is a temporary, ephemeral vector index created only for customer-specific documents like loan agreements and sanction letters. When a loan agreement is generated through a template engine, an event triggers a short LLMOps pipeline where the document is OCR’d, chunked, embedded, and stored in a TTL-bound temporary vector index. The borrower assistant then uses RAG across both the permanent knowledge hub and this temporary vector store to generate summaries, explain clauses, highlight risks, and answer customer questions. After the cooling-off or signing window, this temporary vector data is automatically purged for compliance and privacy.”
This explanation alone shows true AI-first, BFSI-grade architecture maturity.
8️⃣
Question | Answer |
Is loan agreement stored in knowledge hub? | ❌ No — goes to ephemeral vector index |
Is chunk/embed required for temporary docs? | ✅ Yes — for semantic RAG |
Can GenAI read without storing? | ❌ Not for RAG: the text must be embedded for retrieval |
Does RAG query temporary store? | ✅ Yes |
What is ephemeral vector index? | ✅ TTL-based, session-scoped vector store |
Why not plain text? | ❌ No semantic search, poor accuracy |
What expires? | ✅ Only embeddings, not original PDF |