AI 1st Automation -Digital LEnding journey
- Anand Nerurkar
- Nov 23
- 9 min read
Updated: 2 days ago
⭐ End-to-End Digital Lending Architecture – Borrower Journey (Ramesh)
(Text Version – No diagrams)
Scenario
Borrower: RameshProduct: Personal Loan – ₹5 Lakhs
PHASE 1 — Borrower Interaction (GenAI + Frontend)
1. Ramesh logs in & applies for a personal loan
UI triggers Borrower GenAI Assistant (LLM-based conversational layer).
Ramesh asks:
“Am I eligible? What documents should I upload?”
GenAI retrieves policy rules from the RAG Layer (Policies, SOPs, KYC rules, eligibility matrix).
2. Upload of documents
Ramesh uploads:
Aadhaar
PAN
Salary slips
Bank statements (PDF)
Selfie (optional)
3. Application ID generated → Event emitted
application.receivedPayload stored in:
Blob Storage (Raw Zone)
Metadata in PostgreSQL
Document hashes in Cosmos DB
PHASE 2 — Document AI + MLOps Pipelines
4. Document AI (MLOps Pipeline #1)
Triggered event:documents.uploaded
AI model performs:
OCR + Layout understanding
Entity extraction (Name, DOB, PAN, Address, Salary, Employer name)
Document classification (KYC / Income / Bank statement / Noise docs)
Fraud signals: signature mismatch, tampering
Output stored in:
Curated zone (Blob)
Structured fields → PostgreSQL
Features → Feature Store
Note: Document AI is a trained model (MLOps pipeline, deployed on AKS via Azure ML runtime)
PHASE 3 — KYC/CDD/EDD
5. KYC Service consumes event
kyc.triggered
This performs:
Aadhaar XML / DigiLocker verification
PAN → NSDL
Face match (selfie vs Aadhaar)
Address consistency check
CDD → occupation, employer risk, geo-risk
EDD → high-risk occupation, mismatch in identity, multiple PAN matches
Fraud check → duplicate applications
6. If KYC fails
Event:kyc.failed
GenAI Borrower Assistant uses:
ContextAPI → timeline
RAG Layer → KYC SOPsto explain:
“Your KYC failed because your Aadhaar address does not match your PAN. Please upload updated Aadhaar.”
No policy or PII embedded — only policy text is in vector DB.
If Ramesh uploads corrections → pipeline restarts.
PHASE 4 — Parallel Risk Engines (Event-Driven)
Once KYC passes:kyc.completed
This triggers 3 parallel microservices:
A. Credit Risk Microservice (MLOps Pipeline #2)
Event: creditRisk.triggered
Actions:
Calls CIBIL/Experian API
Internal Credit ML model (PD, LGD estimation)
Stability of past liabilities
Delinquency prediction
Output → Feature Store + Timeline DB
B. Income Stability Service (MLOps Pipeline #3)
Event: incomeStability.triggered
Consumes data already extracted by OCR—no re-parsing.
Calculates:
Income-to-debt ratio
FOIR
Salary volatility
Employer risk score
Cash flow signal (from bank statements)
C. Fraud & AML/ Sanctions Service (MLOps Pipeline #4)
Event: fraudAndAML.triggered
Performs:
AML model scoring (internal)
Sanctions & PEP checks (API-based)
Hunter/Experian Fraud API
Anomaly detection (ML)
Device/browser fingerprint
Geo-location check
Outputs → Feature Store + Timeline
PHASE 5 — AI-Augmented Decisioning
Event: risk.allCompleted
Rule Engine + Model Fusion
Inputs:
Credit Score + ML PD
Income Stability Score
Fraud Score
AML Score
Policy constraints (interest rate caps, risk tiers)
Rule Engine outputs:
Auto-Approve
Auto-Reject
Manual Review
GenAI Underwriter Copilot
(LLM-based, for internal bank use)
Fetches:
All risk outputs via ContextAPI
Policies / SOP from RAG
AML/credit rules
Document AI results
And generates:
Risk summary
Policy deviations
Reasons for decision
Questions to ask borrower
Recommendation for final approval
The underwriter edits the summary →Human-in-the-loop feedback captured →Goes to LLMOps pipeline for reinforcement tuning.
PHASE 6 — Borrower Experience by GenAI
At every stage Ramesh can ask:
“Why is my loan delayed?”
“What is FOIR?”
“What happens after KYC?”
“Why did fraud score increase?”
GenAI responds using:
ContextAPI (application status, reasons)
RAG Layer (policy text)
Domain prompting (explain in simple terms)
No PII stored in vector DB.
PHASE 7 — Loan Agreement + e-Sign + CBS Account Creation
If approved:
Loan Agreement Generation
Uses traditional template engine
Optional: GenAI summary of agreement terms
EMI
Prepayment rules
Penalties
Tenure
Total cost of credit
Borrower reviews
Asks GenAI:
“Explain this loan agreement in simple terms.”
GenAI uses RAG over SOP/Policy + contextual loan data.
e-Sign Service
Event: loanAgreement.ready
OTP-based / Aadhaar eSign
Signed PDF → Blob Storage
CBS Integration
Event: esign.completed
CBS API creates loan account
Schedules repayment
Disbursal triggered automatically
Borrower Notification
SMS + email + app notification.
PHASE 8 — Analytics Layer (Bank Internal)
Operational Dashboards
Funnel drop-offs
TAT per step (KYC, ML, AML)
Fraud heatmap
Agent productivity
Risk Analytics
PD/LGD trends
NPA prediction
Early warning indicators
AML suspicious patterns
GenAI Governance Analytics
Prompt logs
Toxicity & bias monitoring
Red team insights
PHASE 9 — LLMOps Pipeline (Policies/SOP Only)
When a new regulatory policy arrives:
Ingestion
OCR + purification
Chunking
Embedding
Indexing into vector DB
Versioning + approval
Deploy updated RAG index
Red team testing
Promotion to production
(No PII is ever embedded.)
Summary — AI Models Used (Total 6 ML + 1 LLM)
ML Models (MLOps)
Document AI Model
Credit Risk Model
Income Stability Model
Fraud/Anomaly Model
AML Risk Model
Sanction/PEP ML Model
GenAI (LLMOps)
Borrower Assistant (LLM)
Underwriter Copilot (LLM)
A. End-to-end text architecture — one borrower journey (Ramesh)
Context: Ramesh logs in and applies for a personal loan. This is the full flow (event-driven). I name events and indicate which teams/infra own each step.
User action — Application created
Ramesh logs into portal → fills form → uploads Aadhaar, PAN, payslip, bank statement → application.created published.
Stored: raw files → ADLS Gen2 (raw); metadata + masked pointers → Postgres; timeline entry → Cosmos DB (context store).
Document ingestion & Document-AI
Event: docs.uploaded → Document-AI service consumes.
Document-AI (LayoutLM/ViT + NER + tamper & face-match models) extracts structured fields (name, dob, pan, salary, transactions) and produces confidences.
Outputs: docs.parsed (pointer to curated JSON in ADLS + masked fields in Postgres).
Owner: Feature/Data + Document AI team (MLOps owns model lifecycle for these models).
KYC / Identity validation
Event: kyc.triggered → KYC microservice validates against APIs (PAN / Aadhaar / CKYC) and checks liveness/face match.
Emits: kyc.completed with status {OK | SUSPICIOUS | FAIL_DEFINITE} and coded reasons (no raw PII in event).
If FAIL_DEFINITE → pipeline stops → decision.made = AUTO_REJECT. GenAI Borrower Assistant generates masked explanation and instructs Ramesh on next steps.
AML / Sanctions / PEP checks
Event: aml.triggered → AML microservice checks vendor lists (World-Check/Refinitiv), EU/UN/OFAC, PEP lists, adverse media.
Emits: aml.completed {CLEAR | POTENTIAL_HIT | HIGH_HIT} with reasonCodes and sourceRefs.
If HIGH_HIT → decision.made = AUTO_REJECT. If POTENTIAL_HIT → route to EDD (cdd.triggered).
Parallel predictive checks (after KYC+AML pass)
Orchestrator publishes simultaneously:
creditRisk.triggered → Credit microservice: calls CIBIL + calls Credit Risk ML endpoint → emits credit.completed (bureauScore, pdScore, modelVersion, shapTop).
fraudCheck.triggered → Fraud microservice: vendor call + Fraud ML endpoint → emits fraud.completed.
incomeStability.triggered → Income microservice: consumes parsed JSON → computes DTI, EMI capacity; optionally calls Income Stability ML → emits income.completed.
All model outputs are written to Feature Store (online) and snapshots to ADLS feature zone.
Owner: MLOps + application microservices.
Decision Engine (Rules + ML inputs)
Event: upon receiving credit.completed, fraud.completed, income.completed → Decision Engine (DMN/Drools) executes rules combining thresholds + ML scores.
Produces decision.made = {AUTO_APPROVE | AUTO_REJECT | MANUAL_REVIEW}. Includes ruleVersion, modelVersions, and evidencePointers (doc ids, shapTop).
Persisted to audit store (append-only) with traceId.
GenAI Underwriting Copilot & Borrower Assistant
If MANUAL_REVIEW or upon borrower request: Context API aggregates masked timeline + scores + evidence pointers (from Cosmos/Postgres) and calls LLMOps orchestrator.
RAG retrieves relevant SOP/policy chunks (policy KB stored in vector DB — NO PII).
LLM produces evidence-backed brief: summary, top risks, policy citations, recommended action. Emit underwriter.brief.created.
GenAI also drives the Borrower Assistant: Ramesh can ask “Why my KYC failed?” or “Explain the agreement”, and the assistant responds using Context API + RAG (masked info and SOPs).
Human-in-loop (if required)
Underwriter reviews the brief and documents, updates decision. Event: decision.confirmed (includes underwriterId, changes).
Edits/labels are stored for labeling pipeline.
Post-approval automation
If approved: agreement.generated via DocGen (templating); esign.triggered → Digital eSign provider returns esign.completed.
loan.account.create call to CBS (Finacle/Temenos) → loan.account.created → disbursement → notification to Ramesh.
Audit, Training & Monitoring
Every model inference, LLM prompt/response, and decision is logged (prompt + retrieved policy ids + LLM output) to immutable audit store for compliance.
MLOps monitors model drift, triggers retrain; LLMOps monitors retrieval quality, hallucination rates, and triggers red-team cycles.
Important operational notes for the journey:
PII never flows into vector DB. LLM sees only masked or derived context from Context API.
All events include traceId and auditPointer for full traceability.
Teams: App teams (microservices/orchestration), MLOps (training/serving), LLMOps (RAG/prompt ops), DevOps/SRE, Risk/Compliance.
B. The LLMOps pipeline (policy/SOP → RAG → reasoning)
Explain this sequence in interview terms:
SOP/Policy ingestion
Source: PDFs, DOCX, regulatory circulars, credit policy docs, SOPs.
Preprocess: clean, normalize (remove headers/footers), canonicalize.
Chunking (policy-aware)
Chunk by clause/section boundaries (preserve legal context).
Each chunk carries metadata: docId, sectionId, effectiveDate.
Embedding
Apply embedding model (governed & versioned). For in-house LLMs or Azure OpenAI embeddings.
Store vectors in a vector DB (Pinecone/Milvus/pgvector) with metadata.
Indexing
Build retrieval index and store mapping chunk → clause id → source.
RAG Retrieval
When Context API requests reasoning, LLM orchestrator:
Accepts masked context JSON (scores, reason codes, timeline).
Retrieves relevant policy chunks via vector DB + hybrid lexical checks (to guarantee precision).
Supplies context + snippets to LLM with system prompt that enforces citation & no hallucination.
Prompt orchestration & guardrails
Prompt templates are versioned by LLMOps.
Enforce rule: always cite policy chunk id(s) (evidence pins).
Enforce PII masking / safe-response templates.
Log prompt + retrieved snippets + response.
Response templating & audit
LLM output structured to include: summary, top risks, policy citations, recommended action.
Persist everything in audit store.
Monitoring / Feedback
Track retrieval recall/precision, hallucination incidents, response latency.
Run red-team tests and safety checks regularly.
C. Human-in-loop & retrain lifecycle
Human edits / approvals
Underwriters change decisions or annotate reasoning in the UI. Those edits become labeled data.
Label pipeline
Labeled cases are ingested into training datasets (feature store + label tables). Data is versioned and stored in ADLS (training zone).
Retraining & release
MLOps builds retrain pipelines, evaluates fairness/explainability (SHAP), runs validation, and stores candidate models in model registry (MLflow).
Models pass governance board before production rollout (canary/blue-green).
When to retrain
Retrain is triggered by: drift detection metrics, periodic schedule, or significant label accumulation (e.g., >X manual reviews for a cohort).
Impact of human edit on single application
If underwriter edits and resubmits, that application’s final decision is persisted immediately (no blocking), and it is stored as label for batch retrain. Optionally, a “fast re-score” can be triggered to update downstream counters or portfolio metrics.
D. Policy update (SOP/policy change) handling
Ingest new/updated policy into SOP ingestion pipeline (chunk → embed → index). This updates the vector DB and the mapping of clause ids.
Does NOT automatically re-run full upstream pipeline for all past applications by default (that’s expensive).
Re-evaluation strategies:
In-flight applications: Re-evaluate only open applications (re-query RAG & re-run Decision Engine if policy change affects thresholds). Emit decision.recheck.
Historical reprocessing: Run batch job to flag previously approved cases where compliance now requires review (audit use-case).
Audit: store policyVersion on all future decisions; retain old policy clause ids for historical auditability.
E. Where LLMs are deployed (deployment pattern)
Options depending on model choice and governance:
Managed cloud LLM (Azure OpenAI)
Pros: managed infra, lower ops, compliance contracts available.
Use when vendor models acceptable.
Private LLM (self-hosted) deployed via Azure ML / AKS
Deploy model container to AKS or Azure ML managed endpoints (KServe or Azure ML Real-time endpoints).
Use when tighter control/privacy required (on-prem/data residency).
LLMOps is responsible for container images, autoscaling, GPU scheduling, rate limiting, and prompt caching.
Hybrid
Use managed LLM for non-sensitive user interactions (templates) and private self-hosted smaller LLMs for sensitive, high-control reasoning.
Operational notes: LLM endpoints must be fronted by the LLM Gateway, which enforces prompt templates, quotas, PII masking, and logs every request/response for audit.
F. Red Team testing (what it is & why)
Red Team testing = adversarial testing of LLM/GenAI systems to surface safety and security failures:
Prompt injection tests (attempt to force LLM to reveal hidden data or ignore guards).
Data leakage tests (ensure LLM never reconstructs PII from masked input).
Hallucination benchmarks (give edge-case queries and measure factuality).
Adversarial content / jailbreak attempts (malicious queries to bypass policies).
Bias / fairness tests (measure differential outputs across demographic cohorts).
Load & failure tests (LLM under heavy load, fallback correctness).
Outcome: fix prompts, update guardrails, improve retrieval (RAG), patch model or fallback to templates. Red-team runs are a mandatory LLMOps task before production and periodically after.
G. Is AML+Sanction ML or API?
Hybrid pattern (practical):
Deterministic API checks against vendor lists (World-Check, Refinitiv, OFAC) for exact matches.
ML is used to score fuzzy matches, reduce false positives, detect linkages (graph ML linking aliases / shell companies) and to surface adverse media signals.
So AML service typically combines vendor API + ML for ranking/triage.
H. How many MLOps pipelines are required?
At minimum for this architecture:
Document-AI model pipeline (document classification + field extraction + face match).
Credit Risk model pipeline.
Fraud Detection model pipeline.
Income Stability model pipeline.
AML (if ML components used) pipeline.
(Optional) Behavioral/Intent model pipeline.
Monitoring / retrain pipeline (shared infra for all the above) — drift detection, data pipelines, labeling.
LLMOps is separate (prompt ops, RAG ingestion, embedding lifecycle) and has its own CI/CD hygiene but is not counted as a classic MLOps "model training" pipeline — still needs versioning and QA.
I. Summary — “AI-first automation in lending”
“Our platform is event-driven and AI-first: documents land in ADLS Gen2 and a Document-AI pipeline converts unstructured proofs into trusted structured features. Those features feed multiple MLOps-deployed models — credit, fraud, income stability and AML triage — which run in parallel and push results to a rule-based Decision Engine. The Decision Engine makes a deterministic judgement (auto-approve/reject/manual review) but every recommendation is accompanied by an evidence package: model scores, SHAP explainability, and policy references. For explainability and customer UX we use LLMOps: policies and SOPs are chunked, embedded into a RAG index, and a governed LLM synthesizes evidence-backed briefs for underwriters and natural-language explanations for customers. Human decisions are stored as labels for retrain; MLOps handles model lifecycle, drift detection and governance, while LLMOps manages prompt/versioning, red-team testing and retrieval quality. PII never goes into the vector store — the LLM only sees masked context via a Context API. This design delivers faster decisions, fewer false positives, measurable NPA improvements, and transparent explanations for customers and regulators.”
J. Quick Q&A bullets
Q: Does LLM ever see raw PII? — No. LLM reads masked context from Context API; vector DB only holds policies/SOPs.
Q: What triggers retrain? — drift detection, label accumulation from human edits, or periodic cadence + evaluation.
Q: What to do when policy updates? — ingest policy into RAG; re-evaluate in-flight applications selectively; batch reprocess historical cases if required for compliance.
Q: Where are LLMs hosted? — managed (Azure OpenAI) or private (AKS/AzureML endpoint) depending on governance and latency needs.
Q: How many MLOps pipelines? — minimum 5–7 (document AI, credit, fraud, income, AML + shared monitoring).
.png)

Comments