AI 1st Automation -Digital LEnding journey

Anand Nerurkar
Nov 23, 2025
9 min read

Updated: Dec 12, 2025

⭐ End-to-End Digital Lending Architecture – Borrower Journey (Ramesh)

(Text Version – No diagrams)

Scenario

Borrower: RameshProduct: Personal Loan – ₹5 Lakhs

PHASE 1 — Borrower Interaction (GenAI + Frontend)

1. Ramesh logs in & applies for a personal loan

UI triggers Borrower GenAI Assistant (LLM-based conversational layer).
Ramesh asks:
“Am I eligible? What documents should I upload?”
GenAI retrieves policy rules from the RAG Layer (Policies, SOPs, KYC rules, eligibility matrix).

2. Upload of documents

Ramesh uploads:

Aadhaar
PAN
Salary slips
Bank statements (PDF)
Selfie (optional)

3. Application ID generated → Event emitted

application.receivedPayload stored in:

Blob Storage (Raw Zone)
Metadata in PostgreSQL
Document hashes in Cosmos DB

PHASE 2 — Document AI + MLOps Pipelines

4. Document AI (MLOps Pipeline #1)

Triggered event:documents.uploaded

AI model performs:

OCR + Layout understanding
Entity extraction (Name, DOB, PAN, Address, Salary, Employer name)
Document classification (KYC / Income / Bank statement / Noise docs)
Fraud signals: signature mismatch, tampering

Output stored in:

Curated zone (Blob)
Structured fields → PostgreSQL
Features → Feature Store

Note: Document AI is a trained model (MLOps pipeline, deployed on AKS via Azure ML runtime)

PHASE 3 — KYC/CDD/EDD

5. KYC Service consumes event

kyc.triggered

This performs:

Aadhaar XML / DigiLocker verification
PAN → NSDL
Face match (selfie vs Aadhaar)
Address consistency check
CDD → occupation, employer risk, geo-risk
EDD → high-risk occupation, mismatch in identity, multiple PAN matches
Fraud check → duplicate applications

6. If KYC fails

Event:kyc.failed

GenAI Borrower Assistant uses:

ContextAPI → timeline
RAG Layer → KYC SOPsto explain:

“Your KYC failed because your Aadhaar address does not match your PAN. Please upload updated Aadhaar.”

No policy or PII embedded — only policy text is in vector DB.

If Ramesh uploads corrections → pipeline restarts.

PHASE 4 — Parallel Risk Engines (Event-Driven)

Once KYC passes:kyc.completed

This triggers 3 parallel microservices:

A. Credit Risk Microservice (MLOps Pipeline #2)

Event: creditRisk.triggered

Actions:

Calls CIBIL/Experian API
Internal Credit ML model (PD, LGD estimation)
Stability of past liabilities
Delinquency prediction

Output → Feature Store + Timeline DB

B. Income Stability Service (MLOps Pipeline #3)

Event: incomeStability.triggered

Consumes data already extracted by OCR—no re-parsing.

Calculates:

Income-to-debt ratio
FOIR
Salary volatility
Employer risk score
Cash flow signal (from bank statements)

C. Fraud & AML/ Sanctions Service (MLOps Pipeline #4)

Event: fraudAndAML.triggered

Performs:

AML model scoring (internal)
Sanctions & PEP checks (API-based)
Hunter/Experian Fraud API
Anomaly detection (ML)
Device/browser fingerprint
Geo-location check

Outputs → Feature Store + Timeline

PHASE 5 — AI-Augmented Decisioning

Event: risk.allCompleted

Rule Engine + Model Fusion

Inputs:

Credit Score + ML PD
Income Stability Score
Fraud Score
AML Score
Policy constraints (interest rate caps, risk tiers)

Rule Engine outputs:

Auto-Approve
Auto-Reject
Manual Review

GenAI Underwriter Copilot

(LLM-based, for internal bank use)

Fetches:

All risk outputs via ContextAPI
Policies / SOP from RAG
AML/credit rules
Document AI results

And generates:

Risk summary
Policy deviations
Reasons for decision
Questions to ask borrower
Recommendation for final approval

The underwriter edits the summary →Human-in-the-loop feedback captured →Goes to LLMOps pipeline for reinforcement tuning.

PHASE 6 — Borrower Experience by GenAI

At every stage Ramesh can ask:

“Why is my loan delayed?”
“What is FOIR?”
“What happens after KYC?”
“Why did fraud score increase?”

GenAI responds using:

ContextAPI (application status, reasons)
RAG Layer (policy text)
Domain prompting (explain in simple terms)

No PII stored in vector DB.

PHASE 7 — Loan Agreement + e-Sign + CBS Account Creation

If approved:

Loan Agreement Generation

Uses traditional template engine
Optional: GenAI summary of agreement terms
- EMI
- Prepayment rules
- Penalties
- Tenure
- Total cost of credit

Borrower reviews

Asks GenAI:

“Explain this loan agreement in simple terms.”

GenAI uses RAG over SOP/Policy + contextual loan data.

e-Sign Service

Event: loanAgreement.ready

OTP-based / Aadhaar eSign
Signed PDF → Blob Storage

CBS Integration

Event: esign.completed

CBS API creates loan account
Schedules repayment
Disbursal triggered automatically

Borrower Notification

SMS + email + app notification.

PHASE 8 — Analytics Layer (Bank Internal)

Operational Dashboards

Funnel drop-offs
TAT per step (KYC, ML, AML)
Fraud heatmap
Agent productivity

Risk Analytics

PD/LGD trends
NPA prediction
Early warning indicators
AML suspicious patterns

GenAI Governance Analytics

Prompt logs
Toxicity & bias monitoring
Red team insights

PHASE 9 — LLMOps Pipeline (Policies/SOP Only)

When a new regulatory policy arrives:

Ingestion
OCR + purification
Chunking
Embedding
Indexing into vector DB
Versioning + approval
Deploy updated RAG index
Red team testing
Promotion to production

(No PII is ever embedded.)

Summary — AI Models Used (Total 6 ML + 1 LLM)

ML Models (MLOps)

Document AI Model
Credit Risk Model
Income Stability Model
Fraud/Anomaly Model
AML Risk Model
Sanction/PEP ML Model

GenAI (LLMOps)

Borrower Assistant (LLM)
Underwriter Copilot (LLM)

A. End-to-end text architecture — one borrower journey (Ramesh)

Context: Ramesh logs in and applies for a personal loan. This is the full flow (event-driven). I name events and indicate which teams/infra own each step.

User action — Application created
- Ramesh logs into portal → fills form → uploads Aadhaar, PAN, payslip, bank statement → application.created published.
- Stored: raw files → ADLS Gen2 (raw); metadata + masked pointers → Postgres; timeline entry → Cosmos DB (context store).
Document ingestion & Document-AI
- Event: docs.uploaded → Document-AI service consumes.
- Document-AI (LayoutLM/ViT + NER + tamper & face-match models) extracts structured fields (name, dob, pan, salary, transactions) and produces confidences.
- Outputs: docs.parsed (pointer to curated JSON in ADLS + masked fields in Postgres).
- Owner: Feature/Data + Document AI team (MLOps owns model lifecycle for these models).
KYC / Identity validation
- Event: kyc.triggered → KYC microservice validates against APIs (PAN / Aadhaar / CKYC) and checks liveness/face match.
- Emits: kyc.completed with status {OK | SUSPICIOUS | FAIL_DEFINITE} and coded reasons (no raw PII in event).
- If FAIL_DEFINITE → pipeline stops → decision.made = AUTO_REJECT. GenAI Borrower Assistant generates masked explanation and instructs Ramesh on next steps.
AML / Sanctions / PEP checks
- Event: aml.triggered → AML microservice checks vendor lists (World-Check/Refinitiv), EU/UN/OFAC, PEP lists, adverse media.
- Emits: aml.completed {CLEAR | POTENTIAL_HIT | HIGH_HIT} with reasonCodes and sourceRefs.
- If HIGH_HIT → decision.made = AUTO_REJECT. If POTENTIAL_HIT → route to EDD (cdd.triggered).
Parallel predictive checks (after KYC+AML pass)
- Orchestrator publishes simultaneously:
  - creditRisk.triggered → Credit microservice: calls CIBIL + calls Credit Risk ML endpoint → emits credit.completed (bureauScore, pdScore, modelVersion, shapTop).
  - fraudCheck.triggered → Fraud microservice: vendor call + Fraud ML endpoint → emits fraud.completed.
  - incomeStability.triggered → Income microservice: consumes parsed JSON → computes DTI, EMI capacity; optionally calls Income Stability ML → emits income.completed.
- All model outputs are written to Feature Store (online) and snapshots to ADLS feature zone.
- Owner: MLOps + application microservices.
Decision Engine (Rules + ML inputs)
- Event: upon receiving credit.completed, fraud.completed, income.completed → Decision Engine (DMN/Drools) executes rules combining thresholds + ML scores.
- Produces decision.made = {AUTO_APPROVE | AUTO_REJECT | MANUAL_REVIEW}. Includes ruleVersion, modelVersions, and evidencePointers (doc ids, shapTop).
- Persisted to audit store (append-only) with traceId.
GenAI Underwriting Copilot & Borrower Assistant
- If MANUAL_REVIEW or upon borrower request: Context API aggregates masked timeline + scores + evidence pointers (from Cosmos/Postgres) and calls LLMOps orchestrator.
- RAG retrieves relevant SOP/policy chunks (policy KB stored in vector DB — NO PII).
- LLM produces evidence-backed brief: summary, top risks, policy citations, recommended action. Emit underwriter.brief.created.
- GenAI also drives the Borrower Assistant: Ramesh can ask “Why my KYC failed?” or “Explain the agreement”, and the assistant responds using Context API + RAG (masked info and SOPs).
Human-in-loop (if required)
- Underwriter reviews the brief and documents, updates decision. Event: decision.confirmed (includes underwriterId, changes).
- Edits/labels are stored for labeling pipeline.
Post-approval automation
- If approved: agreement.generated via DocGen (templating); esign.triggered → Digital eSign provider returns esign.completed.
- loan.account.create call to CBS (Finacle/Temenos) → loan.account.created → disbursement → notification to Ramesh.
Audit, Training & Monitoring
- Every model inference, LLM prompt/response, and decision is logged (prompt + retrieved policy ids + LLM output) to immutable audit store for compliance.
- MLOps monitors model drift, triggers retrain; LLMOps monitors retrieval quality, hallucination rates, and triggers red-team cycles.

Important operational notes for the journey:

PII never flows into vector DB. LLM sees only masked or derived context from Context API.
All events include traceId and auditPointer for full traceability.
Teams: App teams (microservices/orchestration), MLOps (training/serving), LLMOps (RAG/prompt ops), DevOps/SRE, Risk/Compliance.

B. The LLMOps pipeline (policy/SOP → RAG → reasoning)

Explain this sequence in interview terms:

SOP/Policy ingestion
- Source: PDFs, DOCX, regulatory circulars, credit policy docs, SOPs.
- Preprocess: clean, normalize (remove headers/footers), canonicalize.
Chunking (policy-aware)
- Chunk by clause/section boundaries (preserve legal context).
- Each chunk carries metadata: docId, sectionId, effectiveDate.
Embedding
- Apply embedding model (governed & versioned). For in-house LLMs or Azure OpenAI embeddings.
- Store vectors in a vector DB (Pinecone/Milvus/pgvector) with metadata.
Indexing
- Build retrieval index and store mapping chunk → clause id → source.
RAG Retrieval
- When Context API requests reasoning, LLM orchestrator:
  - Accepts masked context JSON (scores, reason codes, timeline).
  - Retrieves relevant policy chunks via vector DB + hybrid lexical checks (to guarantee precision).
  - Supplies context + snippets to LLM with system prompt that enforces citation & no hallucination.
Prompt orchestration & guardrails
- Prompt templates are versioned by LLMOps.
- Enforce rule: always cite policy chunk id(s) (evidence pins).
- Enforce PII masking / safe-response templates.
- Log prompt + retrieved snippets + response.
Response templating & audit
- LLM output structured to include: summary, top risks, policy citations, recommended action.
- Persist everything in audit store.
Monitoring / Feedback
- Track retrieval recall/precision, hallucination incidents, response latency.
- Run red-team tests and safety checks regularly.

C. Human-in-loop & retrain lifecycle

Human edits / approvals
- Underwriters change decisions or annotate reasoning in the UI. Those edits become labeled data.
Label pipeline
- Labeled cases are ingested into training datasets (feature store + label tables). Data is versioned and stored in ADLS (training zone).
Retraining & release
- MLOps builds retrain pipelines, evaluates fairness/explainability (SHAP), runs validation, and stores candidate models in model registry (MLflow).
- Models pass governance board before production rollout (canary/blue-green).
When to retrain
- Retrain is triggered by: drift detection metrics, periodic schedule, or significant label accumulation (e.g., >X manual reviews for a cohort).
Impact of human edit on single application
- If underwriter edits and resubmits, that application’s final decision is persisted immediately (no blocking), and it is stored as label for batch retrain. Optionally, a “fast re-score” can be triggered to update downstream counters or portfolio metrics.

D. Policy update (SOP/policy change) handling

Ingest new/updated policy into SOP ingestion pipeline (chunk → embed → index). This updates the vector DB and the mapping of clause ids.
Does NOT automatically re-run full upstream pipeline for all past applications by default (that’s expensive).
Re-evaluation strategies:
- In-flight applications: Re-evaluate only open applications (re-query RAG & re-run Decision Engine if policy change affects thresholds). Emit decision.recheck.
- Historical reprocessing: Run batch job to flag previously approved cases where compliance now requires review (audit use-case).
Audit: store policyVersion on all future decisions; retain old policy clause ids for historical auditability.

E. Where LLMs are deployed (deployment pattern)

Options depending on model choice and governance:

Managed cloud LLM (Azure OpenAI)
- Pros: managed infra, lower ops, compliance contracts available.
- Use when vendor models acceptable.
Private LLM (self-hosted) deployed via Azure ML / AKS
- Deploy model container to AKS or Azure ML managed endpoints (KServe or Azure ML Real-time endpoints).
- Use when tighter control/privacy required (on-prem/data residency).
- LLMOps is responsible for container images, autoscaling, GPU scheduling, rate limiting, and prompt caching.
Hybrid
- Use managed LLM for non-sensitive user interactions (templates) and private self-hosted smaller LLMs for sensitive, high-control reasoning.

Operational notes: LLM endpoints must be fronted by the LLM Gateway, which enforces prompt templates, quotas, PII masking, and logs every request/response for audit.

F. Red Team testing (what it is & why)

Red Team testing = adversarial testing of LLM/GenAI systems to surface safety and security failures:

Prompt injection tests (attempt to force LLM to reveal hidden data or ignore guards).
Data leakage tests (ensure LLM never reconstructs PII from masked input).
Hallucination benchmarks (give edge-case queries and measure factuality).
Adversarial content / jailbreak attempts (malicious queries to bypass policies).
Bias / fairness tests (measure differential outputs across demographic cohorts).
Load & failure tests (LLM under heavy load, fallback correctness).

Outcome: fix prompts, update guardrails, improve retrieval (RAG), patch model or fallback to templates. Red-team runs are a mandatory LLMOps task before production and periodically after.

G. Is AML+Sanction ML or API?

Hybrid pattern (practical):
- Deterministic API checks against vendor lists (World-Check, Refinitiv, OFAC) for exact matches.
- ML is used to score fuzzy matches, reduce false positives, detect linkages (graph ML linking aliases / shell companies) and to surface adverse media signals.
So AML service typically combines vendor API + ML for ranking/triage.

H. How many MLOps pipelines are required?

At minimum for this architecture:

Document-AI model pipeline (document classification + field extraction + face match).
Credit Risk model pipeline.
Fraud Detection model pipeline.
Income Stability model pipeline.
AML (if ML components used) pipeline.
(Optional) Behavioral/Intent model pipeline.
Monitoring / retrain pipeline (shared infra for all the above) — drift detection, data pipelines, labeling.

LLMOps is separate (prompt ops, RAG ingestion, embedding lifecycle) and has its own CI/CD hygiene but is not counted as a classic MLOps "model training" pipeline — still needs versioning and QA.

I. Summary — “AI-first automation in lending”

“Our platform is event-driven and AI-first: documents land in ADLS Gen2 and a Document-AI pipeline converts unstructured proofs into trusted structured features. Those features feed multiple MLOps-deployed models — credit, fraud, income stability and AML triage — which run in parallel and push results to a rule-based Decision Engine. The Decision Engine makes a deterministic judgement (auto-approve/reject/manual review) but every recommendation is accompanied by an evidence package: model scores, SHAP explainability, and policy references. For explainability and customer UX we use LLMOps: policies and SOPs are chunked, embedded into a RAG index, and a governed LLM synthesizes evidence-backed briefs for underwriters and natural-language explanations for customers. Human decisions are stored as labels for retrain; MLOps handles model lifecycle, drift detection and governance, while LLMOps manages prompt/versioning, red-team testing and retrieval quality. PII never goes into the vector store — the LLM only sees masked context via a Context API. This design delivers faster decisions, fewer false positives, measurable NPA improvements, and transparent explanations for customers and regulators.”

J. Quick Q&A bullets

Q: Does LLM ever see raw PII? — No. LLM reads masked context from Context API; vector DB only holds policies/SOPs.
Q: What triggers retrain? — drift detection, label accumulation from human edits, or periodic cadence + evaluation.
Q: What to do when policy updates? — ingest policy into RAG; re-evaluate in-flight applications selectively; batch reprocess historical cases if required for compliance.
Q: Where are LLMs hosted? — managed (Azure OpenAI) or private (AKS/AzureML endpoint) depending on governance and latency needs.
Q: How many MLOps pipelines? — minimum 5–7 (document AI, credit, fraud, income, AML + shared monitoring).