Modernization Legacy Mutual Fund
- Anand Nerurkar
- May 18
Updated: May 21
🏛️ Mutual Fund Platform Modernization: Enterprise-Scale Architecture
Business Vision
Modernize a legacy mutual fund transaction and investment management platform into a high-performance, resilient, cloud-native, intelligent system.
The platform supports 150K+ concurrent investors, 6000+ TPS, multi-tenant onboarding, SEBI/GST/FATCA compliance, real-time insights, and real-time NAV processing.
It enables distributor and admin operations via a secure and scalable platform.
It is built on a Spring Boot-based microservices architecture hosted on Azure Cloud using AKS, Istio, and Kafka.
Target Architecture Overview
Microservices: 40+ Spring Boot services
Cloud: Azure
Container Orchestration: AKS (Azure Kubernetes Service)
Service Mesh: Istio (traffic control, mTLS, policy enforcement)
Messaging: Kafka (event-driven processing)
Database: Azure SQL, Cosmos DB
Monitoring: Prometheus, Grafana, Azure Monitor
Logging: ELK Stack
Authentication: Azure AD B2C
CI/CD: Azure DevOps
Business Outcomes (Before vs. After Modernization)
Metric | Legacy System | Modernized Platform |
Concurrent Users Supported | ~10,000 | 150,000+ |
Transactions per Second | ~500 TPS | 6000+ TPS, burstable to 8000+ |
NAV Update Frequency | 3x/day | Every 5–15 mins, validated + cached |
Audit Readiness | Manual + fragmented | Immutable, append-only CosmosDB + SEBI auto-reports |
Deployment Downtime | High (2–4 hrs/month) | <10 mins, zero-downtime via Istio + Helm |
Support Resolution Time | Manual, 1–2 days | GenAI-assisted, reduced to hours |
Investor Experience | Static, transactional | Conversational, intelligent, and contextual |
Tech Strategy (Aligned to Business)
Business Objective | Tech Strategy |
Scale to 150K concurrent users and millions of investors | AKS with HPA, Kafka, Redis, CosmosDB |
High Throughput | Kafka event-driven microservices (Spring Boot) |
Regulatory compliance | Immutable logs in CosmosDB for audit, RBAC, Azure Key Vault, SEBI/FATCA export |
Investor intelligence | Embedded GenAI advisor, RAG-powered chat, recommendation engine |
Reduced time-to-market | Azure DevOps + Helm + Istio (canary/blue-green CI/CD) |
Real-time portfolio and NAV | Kafka streaming + Redis caching + CosmosDB historical logs |
Multi-tenant support | Istio VirtualServices + JWT claims + scoped RBAC |
🔹✅ Enterprise-Scale Architecture Principles
Event-Driven Microservices using Kafka
Real-Time Ingestion + Async Flow with fallback
Immutable Audit Logs in Cosmos DB
Azure AKS Active-Active (2 Regions) + Kafka MirrorMaker
DevSecOps Gates: CVE scans, key vault integration, RBAC policies
Observability: Prometheus (HPA, latency), Grafana, ELK
SEBI/FATCA Compliance: Scheduled, auditable reports
RBAC + Azure AD: Fine-grained access across personas
💡 Bonus Impact Metrics (Before vs. After)
Metric | Legacy | Modernized |
Order Processing Time | 10–12s | 3–4s |
NAV Refresh | 3x/day (batch) | Every 15 min |
Downtime | 5–7 hrs/year | <1 hr/year |
Deployment Risk | High | Zero-downtime |
SEBI Audit Cycle | Manual, 3–5 days | Automated, <2 hours |
Capability Map (Functional + GenAI)
🎯 Functional Capabilities
Investor Onboarding (via Admin/Distributor)
Risk Profiling
Fund Discovery & NAV Access
Transaction Management (Buy/Sell/Switch)
Payment Gateway Integration
Portfolio Management
NAV Feed Ingestion & Publication
Distributor Management
Commission Calculation
Support Ticketing (CRM)
Document Management (T&Cs, CAS)
Analytics & Insights
SEBI, FATCA, GST Compliance
Audit Trail Logging
Notifications (SMS, Email, App)
🤖 GenAI-Enhanced Capabilities
GenAI Conversational Assistant (Investor, Distributor)
NAV & Fund Comparison Chatbot (RAG)
KYC Auto-fill via OCR/NLP
Document Summarization (T&Cs, Factsheets)
Portfolio Health Advisor
Transaction Anomaly Explanation
Auto-generated SEBI/GST report summary
CRM Ticket Reply Drafting
AML Alert Explainer (compliance flow)
Capability to Microservice Mapping
Capability | Microservices Involved |
Investor Registration | InvestorService, KYCService, CredentialService |
Risk Profiling | RiskEngineService (questionnaire, risk scoring, MF suitability) |
Fund Discovery & NAV | ProductCatalogService, NAVPublisherService (fund explorer, NAV view, factsheets, offer documents) |
Order Management | OrderService, PaymentService, TransactionEngine (Buy, Sell, Switch, Modify, Cancel Orders) |
Portfolio Mgmt | HoldingService, ReportService |
NAV Feed Processing | NAVIngestorService, NAVCalculator, NAVPublisherService (Real-time NAV Ingestion, Validation, Publication, Redis Caching) |
Distributor Ops | DistributorService, LeadService, CRMAdapterService (Lead Management, Commission Payouts, Hierarchy, CRM Integration) |
Commission Mgmt | CommissionEngine, LedgerService (slab-wise, event-based, referral-based computation) |
Notifications | NotificationService, AlertService (SMS, email, alerts) |
Compliance & Reporting | ComplianceService, AuditTrailService - SEBI/FATCA Reporting, Audit Trail, RBAC, Immutable Logs |
GenAI Advice | GenAIAdvisorService, RAG Orchestrator |
Doc Summarization | DocSummarizerService |
Fraud & AML Detection | FraudDetectionService, AMLScreeningAdapter |
Document Management | Onboarding Docs, Fund Factsheet, Statement Storage (Blob) |
Analytics & BI | Fund Performance, Investor Behavior, Operational KPIs |
Support & Ticketing | CRM Integration (Zendesk, Freshdesk), Ticket Tracking |
Admin Configuration | Fund Setup, Limits, NAV triggers, Access Management |
Fraud Monitoring | Anomaly Detection, Velocity Check, Manual Override |
Loyalty & Referral | Referral Tracking, Campaigns, Rewards |
🔗 EXTERNAL SYSTEM INTEGRATIONS – Data Pipelines & APIs
Area | System/API | Kafka Topics & Flow | Purpose | Integration Mode |
KYC | UIDAI / NSDL | kyc.completed ← from KYC Service | Aadhaar eKYC, XML auth | REST + Secure Callback |
eSign | Digio, SignDesk | eSign callbacks → esign.completed | PAN Verification, e-Sign | API + Signed PDF exchange |
NAV Feed | CAMS / KFinTech (SFTP) | NAVIngestorService → nav.raw-feeds → nav.validated → nav.broadcast | NAV Feed, Folio Sync, Order Status | SFTP / API + file ingest |
Payments | UPI / BillDesk | Webhook → payment.success / payment.failed | Payment Collection (UPI, Mandate, NetBanking) | API + Webhook |
CRM | Distributor CRM APIs | Pull investor-lead mapping, lead status | Ticket sync, Distributor inquiries | API + OAuth |
SEBI/AMFI | Audit exports | compliance.report.generated / API export | Reporting API / XML Upload | Batch file + webhook |
Blob Storage | Azure Blob | Statements and reports from ReportService | Reporting API / XML Upload | Batch file + webhook |
Monitoring | Prometheus / Grafana | Scrapes Istio sidecars, shows TPS, CPU, NAV delay | | |
Audit | Cosmos DB | AuditTrailService consumes all key events | | |
System/API | Purpose | Integration Mode |
Income Tax Portal / PAN DB | PAN validation and linking with Aadhaar | REST / SOAP |
GSTN / E-Invoice Portal | Commission invoices, distributor GST compliance | REST API + GSP |
NSDL CAS (Consolidated Account Statement) | Consolidated investor holdings reporting | File Upload or API |
Banking APIs (ICICI/HDFC/Axis) | Account Aggregator / API Gateway | |
Azure Active Directory | Admin access control, RBAC, and Just-In-Time access | SAML / OAuth |
Azure AD B2C | Investor authentication and authorization | OAuth2 / OpenID |
CRM (e.g., Zendesk, Salesforce) | Ticketing, distributor support, lead management | REST API |
SMS / Email Gateways (MSG91, Twilio, SendGrid) | Notifications | REST / SMTP |
Analytics / BI Platform (Power BI, Azure Synapse) | Business reporting, fund insights | Data Export + ETL / Kafka Connect |
✅ Top 20 Enterprise Risks and Mitigations
# | Risk Description | Category | Mitigation Strategy |
1 | Incorrect NAV pricing | Business | NAV validation rules; Prometheus alerts for deviation beyond threshold |
2 | Duplicate order execution | Business | Enforce unique orderId; TransactionEngine idempotency logic |
3 | Commission miscalculation | Business | CommissionEngine logic + reconciliation audits; audit logs persisted in Cosmos DB |
4 | Delayed NAV feed | Operations | TTL check on NAV in Redis; fallback to Cosmos; alert via nav_age_seconds metric |
5 | Failed deployment causing downtime | Operations | Istio canary rollout, Helm rollback; Spring Boot health checks + synthetic testing |
6 | Manual SEBI reporting errors | Operations | Auto-generated SEBI-compliant reports with approval workflow and blob backups |
7 | Kafka consumer lag or topic overload | Technology | Partitioned topics, Prometheus lag alerts, autoscaler for consumer groups |
8 | Redis cache failure | Technology | Read-through fallback to Cosmos DB; Redis cluster with high-availability failover |
9 | Pod restarts due to memory/CPU spikes | Technology | Liveness/readiness probes; resource requests/limits; HPA tuning |
10 | PII data exposure | Security | AES-256 encryption at rest; TLS in transit; field masking; Key Vault tokenization |
11 | Public API abuse or denial-of-service | Security | Azure API Management throttling; rate limiting; JWT validation + RBAC |
12 | Hardcoded secrets or leaked credentials | Security | Use of Azure Key Vault + sealed secrets; pipeline security scanning |
13 | Audit trail tampering or loss | Governance | Write-once logs in Cosmos DB; append-only policy; RBAC-controlled access |
14 | SEBI/FATCA non-compliance | Governance | Automated scheduled exports; report APIs; policy-driven audit templates |
15 | Missing user activity logs | Governance | AuditTrailService with Kafka hooks + correlation ID logging |
16 | Admin/staff misuse of elevated privileges | People | Azure AD RBAC enforcement; scoped access levels; Just-In-Time role elevation |
17 | Fund configuration error | People | Admin UI with validations; dual-approval workflow for sensitive changes |
18 | Inconsistent CI/CD across teams | Process | Unified Azure DevOps pipeline templates; Helm-based release strategy |
19 | Missed disaster recovery drill | Process | Quarterly DR simulations; failover dashboards; observability alerts post-switch |
20 | Unauthorized access to audit data | Governance | Role-based export APIs; encryption of exports; audit logging of report generation |
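Risk 2 above leans on a unique orderId plus idempotency logic in the TransactionEngine. The following is a minimal sketch of that idea, assuming an in-memory dictionary stands in for a durable deduplication store; the class and method names are hypothetical and not taken from the platform.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentProcessor:
    """Executes each orderId at most once (in-memory stand-in for a durable store)."""
    processed: dict = field(default_factory=dict)

    def handle(self, order_id: str, amount: float) -> str:
        # A redelivered event with a known orderId is acknowledged
        # without executing the order a second time.
        if order_id in self.processed:
            return "duplicate-ignored"
        self.processed[order_id] = amount
        return "executed"


engine = IdempotentProcessor()
first = engine.handle("ORD-1001", 5000.0)
second = engine.handle("ORD-1001", 5000.0)  # same event delivered again
```

In a real deployment the lookup and the write would need to be atomic (for example a unique-key constraint), since Kafka consumers can receive the same event more than once.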
🔹 🔁 Inbound Data Pipelines (Batch + Real-Time)
Source System | Data Pipeline | Purpose |
CAMS / KFinTech (RTA) | NAV File (SFTP) → Kafka → Redis | NAV Updates every 15 mins |
Razorpay / PG | Webhook → Kafka payment.success | Real-time payment event ingestion |
UIDAI / NSDL | API Polling / Webhook → Kafka | eKYC and eSign event updates |
Distributor CRM | API to create lead.created | Lead sync from distributor CRM |
SEBI Data Pull | Scheduled batch download | Fund-level reports |
Kafka Ingestor for Events | Kafka → Data Lake (ELT jobs) | Analytics, compliance tracking |
Internal Scheduler | Cron → NAVPublisher | NAV push to Redis/Cosmos every 5–15 mins |
🔹 Real-Time Capabilities (Streaming + Cache)
Feed | Mechanism |
NAV Feed | Kafka + Redis + Cosmos DB |
Order Events | Kafka order.placed → transaction.completed |
Notifications | Kafka notification.sent + Async UI push |
Compliance Audit Logs | Kafka → Cosmos append-only |
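The append-only audit sink in the last row can be illustrated with a small in-memory log. The hash chaining used here is an added technique for tamper evidence, not something the article specifies, and the class name is hypothetical; treat it as a sketch of the append-only idea rather than the Cosmos DB implementation.

```python
import hashlib
import json


class AppendOnlyAuditLog:
    """Append-only event log; each entry is chained to the previous one by hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(event_type, payload, prev_hash):
        body = json.dumps({"type": event_type, "payload": payload, "prev": prev_hash},
                          sort_keys=True)
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def append(self, event_type, payload):
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        entry = {"type": event_type, "payload": payload, "prev": prev_hash,
                 "hash": self._digest(event_type, payload, prev_hash)}
        self._entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            expected = self._digest(entry["type"], entry["payload"], prev_hash)
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


log = AppendOnlyAuditLog()
log.append("order.placed", {"orderId": "ORD-1001", "amount": "10000"})
log.append("transaction.completed", {"orderId": "ORD-1001", "units": "181.4068"})
```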
📊 NAV Pipeline: Every 15 Minutes
NAV file drop (SFTP) → NAVIngestorService
File parsed → Kafka nav.raw-feeds
NAVCalculatorService computes final NAV → nav.validated
NAVPublisherService:
Cache in Redis (nav.current)
Persist in Cosmos DB
Kafka nav.broadcast to downstream
Alerts if delay > 900s → nav.alert.raised
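The 900-second delay check at the end of this pipeline can be sketched as follows. The function names are hypothetical; only the threshold and the nav.alert.raised topic come from the description above.

```python
from datetime import datetime, timedelta, timezone

NAV_MAX_AGE_SECONDS = 900  # threshold from the pipeline description


def nav_age_seconds(last_published, now):
    """Seconds elapsed since the last NAV publication."""
    return (now - last_published).total_seconds()


def should_raise_alert(last_published, now):
    """True when the NAV is older than the allowed delay (would emit nav.alert.raised)."""
    return nav_age_seconds(last_published, now) > NAV_MAX_AGE_SECONDS


now = datetime(2025, 5, 15, 11, 20, tzinfo=timezone.utc)
fresh = now - timedelta(minutes=10)
stale = now - timedelta(minutes=16)
```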
🧠 NAVPublisherService — Redis Update Strategy
⏱️ Update Interval: Every 15 minutes (configurable via cron/scheduler)
🔁 Flow Summary:
NAVIngestorService picks up NAV feed file from SFTP / Blob
Each NAV record is published to Kafka topic: nav.raw-feeds
NAVCalculatorService consumes and processes NAV values:
Applies rounding, fee rules, currency conversion
Publishes to nav.validated
NAVPublisherService listens to nav.validated:
Persists the NAV to Cosmos DB (for historical reference)
✅ Updates Redis cache with the latest NAV every 15 minutes
Key: nav:<fundCode>
TTL: e.g., 20 minutes to prevent stale reads
Publishes to Kafka nav.broadcast for real-time use (e.g., alerting, UI push)
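The publisher steps above (persist, cache with a TTL, broadcast) can be sketched with in-memory stand-ins. This is an illustration under stated assumptions: `TtlCache` and `publish_nav` are hypothetical names, plain lists replace Cosmos DB and Kafka, and time is passed in explicitly as seconds so the expiry behaviour is easy to follow.

```python
class TtlCache:
    """Tiny in-memory stand-in for a Redis SET with expiry (not a Redis client)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds, now):
        # Store the value together with its absolute expiry time.
        self._store[key] = (value, now + ttl_seconds)

    def get(self, key, now):
        item = self._store.get(key)
        if item is None or now >= item[1]:
            return None  # missing or expired
        return item[0]


def publish_nav(record, cache, history, broadcast, now, ttl_seconds=20 * 60):
    """Handle one nav.validated record: persist, cache with TTL, then broadcast."""
    history.append(record)  # historical store (Cosmos DB in the article)
    cache.set(f"nav:{record['fundCode']}", record, ttl_seconds, now)
    broadcast.append(("nav.broadcast", record))  # downstream consumers


cache, history, broadcast = TtlCache(), [], []
record = {"fundCode": "HDFC123", "nav": "55.1247", "currency": "INR",
          "timestamp": "2025-05-15T11:00:00Z"}
publish_nav(record, cache, history, broadcast, now=0)
```

With a 20-minute TTL and a 15-minute refresh, a healthy pipeline always overwrites the key before it expires; an expired key therefore signals a missed refresh.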
💡 Why Is the Redis Cache Updated Every 15 Minutes?
This design assumes the RTA feed (CAMS/KFinTech) delivers refreshed NAV values every 15 minutes, so the cache refresh follows the same cadence.
Redis provides low-latency access for:
Transaction Engine (unit calculation)
Portfolio Service (current valuation)
Investor Dashboard UI (live NAV)
Example Redis Entry:
Key: nav:HDFC123
Value: {
"fundCode": "HDFC123",
"nav": 55.1247,
"currency": "INR",
"timestamp": "2025-05-15T11:00:00Z"
}
🤝 Commission Calculation
On transaction.completed → CommissionEngine invoked
Payout calculated based on distributor mapping
LedgerService updated → Kafka commission.calculated
Distributor dashboard updated
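A slab-wise commission step, as mentioned in the capability mapping, might look like the sketch below. The slab thresholds and rates are invented for illustration, as are the function names and event fields; only the transaction.completed and commission.calculated topics come from the article.

```python
from decimal import Decimal

# Hypothetical slabs: (minimum transaction amount, commission rate)
SLABS = [
    (Decimal("0"), Decimal("0.0050")),
    (Decimal("100000"), Decimal("0.0075")),
    (Decimal("1000000"), Decimal("0.0100")),
]


def commission_for(amount):
    """Apply the rate of the highest slab whose threshold the amount reaches."""
    rate = SLABS[0][1]
    for threshold, slab_rate in SLABS:
        if amount >= threshold:
            rate = slab_rate
    return (amount * rate).quantize(Decimal("0.01"))


def on_transaction_completed(event, ledger):
    """Compute the payout, record it in the ledger, and return the outgoing event."""
    payout = commission_for(Decimal(event["amount"]))
    entry = {"distributorId": event["distributorId"],
             "orderId": event["orderId"],
             "commission": payout}
    ledger.append(entry)
    return ("commission.calculated", entry)


ledger = []
topic, entry = on_transaction_completed(
    {"orderId": "ORD-1001", "distributorId": "DIST-7", "amount": "250000"}, ledger)
```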
🛡️ Fraud Detection
TransactionEngine publishes transaction.completed
FraudDetectionService listens → applies velocity rule
If suspicious → Kafka fraud.alert.raised → Admin alert
Manual review triggered via Admin portal
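The velocity rule can be sketched as a sliding-window counter per investor. The window length and transaction limit below are assumed values, and `VelocityRule` is a hypothetical name; timestamps are plain seconds to keep the example deterministic.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60      # hypothetical window
MAX_TXNS_IN_WINDOW = 3   # hypothetical limit


class VelocityRule:
    """Flags an investor who completes too many transactions within the window."""

    def __init__(self):
        self._recent = defaultdict(deque)

    def is_suspicious(self, investor_id, timestamp):
        window = self._recent[investor_id]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) > MAX_TXNS_IN_WINDOW


rule = VelocityRule()
results = [rule.is_suspicious("INV-1", t) for t in (0, 10, 20, 30, 200)]
```

Here the fourth transaction within 60 seconds is flagged, which is the point at which the service would publish fraud.alert.raised.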
🔹 ✅ Enhanced Real-World Features
Category | Feature Example |
SLAs | NAV data freshness < 900 seconds, Order latency < 2s |
Auditability | Investor order trace from UI → Kafka → Transaction → DB |
SLA Breach Alert | NAV ingestion delay → Prometheus → PagerDuty + UI Banner |
NAV Fall-back | Redis → Cosmos DB fallback with alerting |
Data Sync | Folio reconciliation with RTA → SFTP file → Kafka ingestion |
Multi-tenant Ops | Separate fund house access + Istio gateway segmentation |
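The NAV fall-back row describes a read-through pattern: try the cache, fall back to the durable store, and raise an alert. A minimal sketch, assuming plain dictionaries stand in for Redis and Cosmos DB and that `get_nav` is a hypothetical helper:

```python
def get_nav(fund_code, cache, history_store, alerts):
    """Return the NAV from cache, falling back to the historical store on a miss."""
    key = f"nav:{fund_code}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Cache miss or expired entry: record an alert, then read the durable store.
    alerts.append(f"NAV cache miss for {fund_code}")
    record = history_store.get(fund_code)
    if record is not None:
        cache[key] = record  # repopulate so later reads are fast again
    return record


cache = {}
history_store = {"HDFC123": {"fundCode": "HDFC123", "nav": "55.1247"}}
alerts = []
first = get_nav("HDFC123", cache, history_store, alerts)   # miss, served from store
second = get_nav("HDFC123", cache, history_store, alerts)  # hit, served from cache
```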
🔂 CROSS-SERVICE EVENT MAP (Kafka Topics)
Topic Name | Produced By | Consumed By |
investor.registered | AdminService | KYCService, AccountService |
kyc.completed | KYCService | CredentialService |
credentials.issued | CredentialService | NotificationService |
fund.created | AdminService | ProductCatalogService |
order.placed | OrderService | PaymentService, TransactionEngine |
payment.success | PaymentService | TransactionEngine |
transaction.completed | TransactionEngine | HoldingService, CommissionEngine |
portfolio.updated | HoldingService | PortfolioService |
nav.raw-feeds | NAVIngestorService | NAVCalculatorService |
nav.validated | NAVCalculatorService | NAVPublisherService |
nav.broadcast | NAVPublisherService | Redis Cache, NotificationService |
notification.sent | NotificationService | N/A |
commission.calculated | CommissionEngine | DistributorDashboardService |
lead.created | DistributorService | CRMService |
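An event map like this is easy to keep as data and query in both directions. The sketch below copies a few rows from the table into a dictionary; the helper names are hypothetical.

```python
# Subset of the topic-to-consumer map from the table above.
SUBSCRIPTIONS = {
    "order.placed": ["PaymentService", "TransactionEngine"],
    "payment.success": ["TransactionEngine"],
    "transaction.completed": ["HoldingService", "CommissionEngine"],
    "portfolio.updated": ["PortfolioService"],
}


def consumers_of(topic):
    """Services that receive events published on the given topic."""
    return SUBSCRIPTIONS.get(topic, [])


def topics_consumed_by(service):
    """Topics a service subscribes to, useful for impact analysis."""
    return sorted(t for t, subs in SUBSCRIPTIONS.items() if service in subs)
```

For example, `topics_consumed_by("TransactionEngine")` lists every topic whose schema change would affect the TransactionEngine.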
🔹 🧩 Example Microservice Inventory (subset of the 40+ services)
Service Name | Domain Area |
InvestorService | Investor profile, preferences |
KYCService | Aadhaar/PAN validation, UIDAI/NSDL |
OrderService | Order placement, status tracking |
TransactionEngine | NAV allocation, validation, settlement |
NAVIngestorService | Ingest file from RTA (CAMS/KFinTech) |
NAVCalculatorService | Apply rounding, formula |
NAVPublisherService | Cache to Redis, store in Cosmos |
PaymentService | Mandate, UPI, webhook handlers |
HoldingService | Portfolio state, holding snapshot |
ReportService | Monthly, quarterly statements |
CommissionEngine | Event-driven commission calculator |
NotificationService | SMS, Email, App push |
DistributorService | Lead mgmt, hierarchy, commissions |
CRMAdapterService | Integrates Freshdesk/Zendesk |
AdminService | Admin access, user management |
AuthService | AuthZ/AuthN, Azure AD & AD B2C |
AuditTrailService | Kafka event logger to Cosmos DB |
ComplianceReportService | SEBI/FATCA audit generator |
FraudDetectionService | Velocity rules, anomaly alerts |
DataLakeIngestor | Kafka to data lake ingestion |
✅ Real-World Mutual Fund Platforms Have Multiple Data Pipelines, Not Just One
🔎 Why Multiple Pipelines?
Enterprise mutual fund platforms operate in a highly integrated, regulated, and data-rich ecosystem. Different types of data — with different SLAs, formats, sources, and consumers — demand specialized and decoupled pipelines for performance, compliance, and observability.
🔹 Examples of Independent Real-World Pipelines
Pipeline | Purpose | Characteristics |
NAV Feed Ingestion | Ingest fund NAV from RTA | SFTP/API → Kafka → Redis/Cosmos |
Transaction Audit Trail | Immutable logs for SEBI, FATCA | Kafka → CosmosDB append-only |
Commission Calculation | Track commission for each transaction | Kafka → CommissionEngine → Ledger |
SEBI Reporting | Periodic audit submission | Cosmos → CSV generator → Secure Upload |
Fraud Monitoring | Real-time fraud pattern detection | Kafka → FraudDetectionService → Alert |
Notification Pipeline | SMS, Email, Push for events | Kafka → NotificationService |
Analytics & BI Pipeline | PowerBI or Azure Synapse integration | Kafka → DataLake → BI Export |
CRM/Ticketing Feed | Support tickets, lead mgmt | CRM API → Kafka → TicketService |
Payment Events Feed | Razorpay/PG webhook events | API → Kafka → PaymentService |
✅ Each pipeline has unique:
SLAs (e.g., NAV < 15 min, alerts < 1 min, reports daily)
Data formats (CSV, JSON, binary)
Sources (SFTP, REST APIs, Webhooks)
Destinations (Redis, CosmosDB, Data Lake, Email/SMS)
🧩 Cluster Sizing Calculation (BFSI Standard)
✅ What’s the Standard TPS per Pod?
In BFSI-grade production environments, the typical sustained TPS per Spring Boot pod (with Kafka, Istio, Redis, logging, etc.) is:
Complexity of Microservice | Realistic TPS per Pod (Sustained) |
Lightweight stateless service | 150–200 TPS |
Medium complexity (with Kafka, Redis) | 80–120 TPS |
Heavy logic or I/O-bound (e.g., TransactionEngine) | 40–80 TPS |
✅ Cluster Sizing Principles (BFSI-Standard Aligned)
Parameter | Industry Standard / Best Practice |
TPS per Spring Boot Pod | 50–100 TPS depending on complexity |
Pods per Node (AKS) | 6–8 pods per node (max 10 in controlled use cases) |
Service Replication | 2–3 replicas minimum for HA (zone fault tolerance) |
System Overhead Pods | +40–60% for Istio, logging, Kafka, observability |
Node Sizing Buffer | Always round up for peak load + HPA headroom |
CPU/Memory Requests | Aligned to JVM heap sizing, Istio + metrics agents |
✅ Industry Standard Range: 50–100 TPS for core services under realistic latency + durability SLAs.
🔎 Industry Standard for Pod Density in BFSI Workloads
Context | Typical Pod Density (Pods per Node) |
BFSI-grade workloads with Istio, Kafka, monitoring, encryption | 6–8 pods per node (recommended) |
Lightweight stateless apps | 10–15 pods per node (rare in BFSI) |
📌 Especially with Istio sidecars, JVM-based Spring Boot apps, and heavy observability, 7 pods per node is the most reliable target for BFSI.
✅ Calculation Based on Realistic TPS/Pod
Let’s size the cluster using a conservative 75 TPS per pod, which sits at the upper end of the heavy-logic range above and is reasonable for BFSI-grade transaction microservices under load with Istio, Kafka, and security instrumentation.
AKS Cluster Sizing – BFSI Industry Standard
✅ Final AKS Cluster Sizing (Based on 7 Pods/Node)
Metric | Value | Explanation |
Concurrent Users Target | 150,000 | Investor + distributor workload |
TPS Target | 6,000 | Transaction per second goal |
TPS per Pod | 75 | Safe, BFSI-compliant throughput per pod |
Estimated Pods for TPS | 80 | 6000 / 75 |
Core Microservices | 30 | Business-domain aligned |
Replicas per Service | 3 | For HA, load distribution |
Adjusted Total Service Pods | 90 | 30 services * 3; exceeds the 80 pods needed for TPS, so HA replication sets the count |
Infra/System Pods (50%) | 45 | Kafka, Redis, Istio, logging, tracing, agents |
Total Pods Required | 135 | Core + infra |
Pods per Node (BFSI conservative) | 7 | Aligns with Istio overhead and JVM resource usage |
Final Estimated Node Count | 20 | 135 / 7 rounded up |
🧮 Step-by-Step AKS Cluster Sizing Calculation (BFSI Industry Standard)
Target: 150K+ concurrent users, 6000+ TPS, Spring Boot + Kafka + Istio on Azure AKS (active-active)
🔹 1. Define Core Input Metrics
Metric | Value | Justification |
Concurrent Users | 150,000 | Real-world scale during market open |
TPS Required | 6,000 | Order + NAV + notifications |
TPS per Pod (BFSI standard) | 75 | With Istio, Kafka, metrics, JVM |
No. of Core Microservices | 30 | Order, KYC, NAV, Transaction, etc. |
Replicas per Service (HA) | 3 | Zone-level HA standard |
Pods per Node (BFSI std) | 7 | After accounting for sidecars & infra overhead |
🔹 2. Calculate Required Pods
🧩 a. Pods needed for TPS
6000 TPS ÷ 75 TPS/pod = 80 pods needed to meet demand
🧩 b. Estimate App Pods (30 services x 3 replicas)
30 services × 3 replicas = 90 app pods (standard HA requirement; this already exceeds the 80 pods needed for TPS, so 90 is carried forward)
🧩 c. Add System/Infra Overhead (50% extra)
90 × 1.5 = 135 total pods (incl. Istio, Kafka, Redis, monitoring)
🔹 3. Calculate Node Requirement
135 pods ÷ 7 pods/node = ~19.3 → round up → 20 nodes per region
➡️ Final: 20 nodes/region × 2 regions (active-active) = 40 nodes total
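The same arithmetic can be written as a small function so the inputs are easy to vary. The function name and the choice to take the larger of the HA-driven and TPS-driven pod counts are assumptions made for this sketch; the default values are the ones used in the steps above.

```python
import math


def size_cluster(target_tps=6000, tps_per_pod=75, services=30, replicas=3,
                 infra_overhead=0.5, pods_per_node=7, regions=2):
    """Reproduce the sizing steps: TPS pods, app pods, infra overhead, nodes."""
    pods_for_tps = math.ceil(target_tps / tps_per_pod)
    # Use whichever is larger: the HA replica count or the pods needed for TPS.
    app_pods = max(services * replicas, pods_for_tps)
    total_pods = math.ceil(app_pods * (1 + infra_overhead))
    nodes_per_region = math.ceil(total_pods / pods_per_node)
    return {
        "pods_for_tps": pods_for_tps,
        "app_pods": app_pods,
        "total_pods": total_pods,
        "nodes_per_region": nodes_per_region,
        "total_nodes": nodes_per_region * regions,
    }


sizing = size_cluster()
```

Changing `tps_per_pod` to 50, the low end of the stated range, raises the TPS-driven pod count to 120, which then exceeds the 90 HA pods and drives the node count instead.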
🔹 4. Kafka, Redis, Istio Config Sizing
Component | Configuration |
Kafka Brokers | 5 brokers × 100 partitions each (active-active) |
Redis | Premium cache × 3 shards with geo-replication |
Istio | Enabled globally, sidecars auto-injected per pod |
Cosmos DB | Multi-region write, 10k RU/s per partition |
Ingress | Azure Front Door + Istio Gateway for multi-region routing |
✅ This sizing is consistent with the BFSI sizing principles above and is intended to provide:
Performance headroom
Predictable latency under load
HA/DR readiness
Compliance scalability (NAV, order processing, KYC)
🔹 Observability & Governance
Area | Detail |
Audit Log Retention | Cosmos DB, 7-year TTL, write-once policy |
Transaction Traceability | Correlation ID with logs per event |
Prometheus Metrics | nav_age_seconds, tx_latency, HPA_scale_trigger |
Alerts & Dashboards | Grafana (real-time), ELK, Teams/PagerDuty for ops alerts |
SEBI/FATCA Reporting | Auto CSV/JSON reports, API download, access logs enabled |
Deployment & DR Strategy
Feature | Implementation |
CI/CD | Azure DevOps + Helm + Istio + rollback |
Canary Deployment | 10% → 25% → 100% traffic shift via Istio |
Blue/Green for Core Services | Parallel clusters with manual cutover |
Active-Active Cluster | AKS South + West India, Kafka MM2 + Cosmos geo-write |
Monthly DR Drill | Redis rehydration, failover reroute, Cosmos DB failover validation |
🔹 BFSI Compliance Highlights
Aadhaar/PAN encrypted & masked
SEBI reports: auto-generated JSON/CSV via APIs
Cosmos DB with 7-year TTL for audit
Role-based views: Admin, Investor, Distributor
DR: Kafka MM2 + Cosmos geo-write + Redis HA
Why This is Enterprise-Grade
1. High Concurrency and TPS Ready
Designed for 150,000+ concurrent users (up from roughly 10,000 on the legacy system)
Built to handle 6,000+ transactions per second (TPS), burstable to 8,000+
Uses horizontal scaling via AKS + Istio with HPA (pods) and Cluster Autoscaler (nodes)
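For context on how HPA scales pods, the Kubernetes Horizontal Pod Autoscaler computes the desired replica count as ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies that formula with a clamp to minimum and maximum replicas; the bounds and metric values are illustrative assumptions, and the autoscaler's tolerance band and stabilization window are left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=3, max_replicas=20):
    """HPA-style scaling: scale in proportion to metric pressure, then clamp."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(3, current_metric=90, target_metric=60)    # CPU at 90% vs 60% target
scale_down = desired_replicas(3, current_metric=30, target_metric=60)  # clamped at the minimum
capped = desired_replicas(10, current_metric=200, target_metric=60)    # clamped at the maximum
```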
2. Event-Driven Architecture (Kafka Backbone)
Fully asynchronous, decoupled microservices
Each critical domain publishes/consumes Kafka events (e.g., transaction.completed, nav.broadcast, portfolio.updated)
Supports high-throughput processing with per-partition ordering and fault tolerance
3. Real-Time Data Flows
NAV updates every 5–15 minutes, flowing through SFTP → Kafka → Redis → UI in near real time
Portfolio values and alerts update live based on NAV changes
Commission tracking and distributor dashboards update in real time
4. Role-Based Separation & Flow Control
Clearly segmented flows:
Investor: login, fund browse, order, portfolio
Distributor: lead creation, commission calculation
Admin: registration, KYC, fund setup, audit, DR
Enables RBAC, observability, and scaling per role boundary
5. External System Integration
KYC APIs (UIDAI, NSDL), Payment Gateways, eSign (Digio), NAV Feed (CAMS, KFinTech)
CRM APIs, SEBI/AMFI reporting
Secured with Azure AD B2C / RBAC, Key Vault, and Istio mTLS
6. Compliance & Observability
Cosmos DB as immutable audit store
Prometheus + Grafana for detailed metrics (latency, TPS, resource usage)
ELK stack for centralized logging
SEBI compliance via automated report generation and trail visibility
7. Disaster Recovery & Multi-Region Setup
Active-Active AKS Clusters (e.g., South India + West India)
Kafka MirrorMaker2 ensures topic replication across regions
DBs (Azure SQL, Cosmos DB) in geo-redundant setup
Failover readiness tested via DR simulation flows
8. DevOps-Driven Continuous Delivery
Azure DevOps Pipelines for CI/CD
Helm + Kubernetes for deployment
Automated rollbacks, blue/green or canary releases supported
✅ BFSI Standards Alignment – Breakdown
🔹 1. Scalability & Performance
BFSI Expectation | This Architecture |
Handle 100K–200K concurrent users | ✔ Designed for 150K+ users, 6K+ TPS |
Multi-region HA | ✔ Active-active AKS in South & West India |
Zero-downtime deployment | ✔ Istio canary & blue/green with rollback |
Horizontal scaling | ✔ HPA & Cluster Autoscaler with proactive scaling |
🔹 2. Security & Compliance (SEBI / RBI / IRDAI)
BFSI Standard Requirement | This Approach |
7+ years of audit log retention | ✔ Cosmos DB + immutability + TTL enforcement |
Role-based access (RBAC) | ✔ Azure AD, Istio policies, scoped APIs |
PII Encryption | ✔ PAN, Aadhaar encrypted + masked via Key Vault |
DR readiness & tested failover | ✔ Kafka MM2, Cosmos geo-replication, monthly drills |
Secure deployments | ✔ DevSecOps gates (SonarQube, CVE scan, Key Vault) |
Immutable logs | ✔ Write-once, append-only Cosmos setup |
UIDAI / NSDL / Digio integration | ✔ External KYC/eSign services integrated securely |
🔹 3. Observability & Operations
BFSI Observability Practices | This Design |
Real-time infra + app monitoring | ✔ Prometheus + Grafana dashboards |
Log traceability | ✔ ELK with contextual enrichment |
Business SLA tracking (e.g., NAV) | ✔ nav_age_seconds + Prometheus alerts |
Region failover visibility | ✔ Redis + Cosmos + Kafka readiness and alerting |
🔹 4. BFSI Domain Patterns
Key Industry Pattern | Coverage |
Event-driven transaction systems | ✔ Kafka-based asynchronous orchestration |
Idempotent transaction engines | ✔ Unique orderId + deduplication logic |
Real-time portfolio updates | ✔ Redis + Kafka + NAV recalculations |
SEBI reporting | ✔ Scheduled compliant report generation |
Payment reconciliation workflows | ✔ Payment → Kafka → Transaction → Audit trail |
📌 Final Verdict: ✅ BFSI-Grade Architecture
✔ Complies with BFSI performance, resilience, and audit standards
✔ Built-in observability, compliance, and HA
✔ Well-positioned for SEBI inspections, cyber audits, and RBI IT governance reviews
✔ Aligns with architectures used by top AMCs, NBFCs, insurers, and banks
👥 Persona-Based Architecture Walkthrough (Mutual Fund Platform)
👤 1. Investor Persona
"Retail investor engaging with funds for investment, redemption, or viewing portfolio."
🔄 Key Journeys:
Login & View Funds
Place Order (Buy/Sell/Switch)
View Portfolio & Transaction History
Receive Notifications & Statements
🧩 Microservices Involved:
AuthService (Azure AD B2C)
ProductCatalogService
OrderService, PaymentService
TransactionEngine
HoldingService, ReportService
NotificationService, PreferenceService
🔁 Event Flow:
Login (OAuth via Azure AD B2C) → token with investorId
Views NAV → served by NAVPublisher via Redis
Places order → Kafka order.placed → PaymentService
PG callback → Kafka payment.success
TransactionEngine allocates units using NAV
Kafka transaction.completed → triggers:
Portfolio update
Commission payout (if referred)
Email/SMS from NotificationService
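The unit-allocation step in this flow follows the usual relationship units = invested amount / NAV. The sketch below shows it for a buy order; the function names, event fields, and the choice to truncate to four decimal places are assumptions for illustration.

```python
from decimal import Decimal, ROUND_DOWN


def allocate_units(amount, nav):
    """Units = invested amount / NAV, truncated to 4 decimal places (assumed precision)."""
    return (amount / nav).quantize(Decimal("0.0001"), rounding=ROUND_DOWN)


def on_payment_success(order, nav):
    """Build the transaction.completed event for a paid buy order."""
    units = allocate_units(Decimal(order["amount"]), nav)
    return {
        "topic": "transaction.completed",
        "orderId": order["orderId"],
        "investorId": order["investorId"],
        "units": units,
        "nav": nav,
    }


order = {"orderId": "ORD-1001", "investorId": "INV-42", "amount": "10000"}
event = on_payment_success(order, Decimal("55.1247"))
```

Using `Decimal` rather than floats avoids binary rounding surprises in money and unit calculations, and truncating ensures the allocated units never cost more than the amount paid.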
🔗 External Integrations:
Razorpay (payment)
MSG91/Twilio (notification)
UIDAI/NSDL (onboarding via distributor/admin)
NSDL CAS (CAS sync)
🤝 2. Distributor Persona
"Advisors or agencies facilitating investor onboarding and earning commission."
🔄 Key Journeys:
Add Investor Leads
Track Commission
Download Investor Reports
Submit Support Requests
🧩 Microservices Involved:
DistributorService
LeadService, InvestorService
CommissionEngine, LedgerService
CRMAdapterService
ReportService, AuthService
🔁 Event Flow:
Distributor logs in (Azure AD) → scoped RBAC
Adds lead → Kafka lead.created
Initiates registration → investor.registered
When investor transacts → transaction.completed triggers:
CommissionEngine → commission.calculated
LedgerService logs payout
Support ticket via CRM → CRMAdapterService sends to Zendesk
🔗 External Integrations:
GST Portal (commission invoice)
CRM (Zendesk/Salesforce)
SMS/Email
Aadhaar Vault (tokenized storage)
👨‍💼 3. Admin Persona
"Operations user managing fund setup, NAV, investor creation, and DR oversight."
🔄 Key Journeys:
Create Funds, Set NAV
Create Investors (Direct)
Monitor System Health
Manage Access & Reports
🧩 Microservices Involved:
AdminService, FundSetupService
NAVIngestorService, NAVPublisherService
KYCService, CredentialService
ComplianceReportService, AuditTrailService
DocumentService, AlertService
🔁 Event Flow:
Admin logs in (Azure AD) → creates fund → Kafka fund.created
NAV file drop → SFTP → NAVIngestorService
Kafka nav.raw-feeds → calculated → nav.validated
NAVPublisher pushes to Redis + Cosmos + Kafka nav.broadcast
Admin creates investor (back office) → kyc.initiated → kyc.completed
🔗 External Integrations:
RTA (CAMS/KFinTech) for NAV
UIDAI/NSDL
Aadhaar Vault
Azure Monitor/Log Analytics
SEBI Gateway
🛡️ 4. Compliance Officer Persona
"Responsible for regulatory reports, data audits, and access governance."
🔄 Key Journeys:
Access Audit Trails
Generate Regulatory Reports
Verify User Actions & Data Flows
🧩 Microservices Involved:
AuditTrailService (event-sink from Kafka)
ComplianceReportService
ReportService, AccessLogService
🔁 Event Flow:
Compliance logs in (Azure AD, scoped role)
Requests audit report → AuditTrailService pulls from Cosmos DB
SEBI/FATCA reports → auto-generated or on-demand → report.generated
Any breach (e.g., delayed NAV, failed KYC) → alert.raised → escalated
🔗 External Integrations:
SEBI Upload Portal
Cosmos DB (7-year retention)
Azure RBAC Logs
Power BI (for dashboards)
🔐 System-Wide Architecture Enforcement
Concern | Approach |
Security & RBAC | Azure AD/AD B2C, Istio AuthorizationPolicies, microservice-level auth |
Scalability | AKS + Kafka + Redis; HPA/CA + pod/node isolation |
Compliance | Cosmos DB audit, immutable logs, SEBI reporting automation |
Observability | Prometheus + Grafana + ELK + Azure Monitor |
Failover/DR | Active-active AKS, Kafka MM2, Cosmos geo-replication |
✅ Real-World Mutual Fund Platform – Full Sequence & Event-Driven Flow
👤 Investor Persona Flow – Place Order, View Portfolio
Step | Initiator | Action | Event Produced | Produced By | Consumed By | Outcome |
1 | Investor | Logs in via portal | — | — | AuthService (AD B2C) | JWT token with investorId, role, tenantId issued |
2 | Investor | Browses fund list | — | — | ProductCatalogService | NAV and fund metadata fetched from Redis/cache |
3 | Investor | Places order | order.placed | OrderService | PaymentService | PaymentService initiates PG request (e.g. Razorpay) |
4 | Payment Gateway | Sends callback (after investor pays) | payment.success | PGWebhookHandler | PaymentService | Validates payment, marks it success, sends next event |
5 | PaymentService | Validates order + payment | transaction.ready | PaymentService | TransactionEngine | TransactionEngine begins NAV validation and unit allocation |
6 | TransactionEngine | Allocates units | transaction.completed | TransactionEngine | HoldingService, AuditTrailService | Units credited, event recorded in audit DB |
7 | HoldingService | Updates portfolio | portfolio.updated | HoldingService | PortfolioService | Portfolio cache and DB updated |
8 | NotificationService | Sends alert | notification.sent | NotificationService | Twilio/MSG91/Email | Investor gets SMS/email |
9 | Investor | Asks: “Compare Fund A vs B” | — | GenAIChatOrchestrator | ProductCatalogService, NAVPublisher | Response using RAG (factsheet, NAV, perf) |
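Step 1 issues a JWT carrying investorId, role, and tenantId. A downstream service could use those claims for tenant-scoped authorization roughly as sketched below. The role-to-action mapping and the `is_allowed` helper are hypothetical, and the claims are assumed to come from a token whose signature has already been verified upstream.

```python
# Hypothetical role-to-action mapping; the article does not define one.
ROLE_ACTIONS = {
    "investor": {"order:place", "portfolio:view"},
    "distributor": {"lead:create", "commission:view"},
    "admin": {"fund:create", "audit:export"},
}


def is_allowed(claims, action, resource_tenant_id):
    """Check role and tenant scope on the payload of an already-verified JWT."""
    if claims.get("tenantId") != resource_tenant_id:
        return False  # never cross tenant boundaries
    return action in ROLE_ACTIONS.get(claims.get("role"), set())


claims = {"investorId": "INV-42", "role": "investor", "tenantId": "amc-a"}
```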
🤝 Distributor Persona Flow – Onboard Investor, Track Commission
Step | Initiator | Action | Event Produced | Produced By | Consumed By | Outcome |
1 | Distributor | Adds new lead | lead.created | LeadService | CRMAdapterService | Lead stored in CRM (Zendesk/Salesforce) |
2 | Distributor | Converts lead to investor | investor.registered | InvestorService | KYCService | Initiates KYC flow with Aadhaar/PAN |
3 | KYCService | Completes KYC | kyc.completed | KYCService | CredentialService | Login credentials issued via SMS/email |
4 | Investor | Places order | order.placed | OrderService | PaymentService | Starts order lifecycle |
5 | TransactionEngine | Order finalized | transaction.completed | TransactionEngine | CommissionEngine | Commission calculated, event published |
6 | CommissionEngine | Commission calculated | commission.calculated | CommissionEngine | LedgerService | Ledger updated, dashboard refreshed |
7 | Distributor | Asks: “Why is my payout low?” | — | GenAICommissionExplainer | LedgerService | GenAI explains based on ledger entries |
👨‍💼 Admin Persona Flow – Fund Setup, NAV, User Management
Step | Initiator | Action | Event Produced | Produced By | Consumed By | Outcome |
1 | Admin | Creates a new fund | fund.created | AdminService | FundSetupService | Fund metadata persisted, fund listed |
2 | RTA | Uploads NAV file (every 15 mins) | nav.raw-feeds | NAVIngestorService | NAVCalculatorService | Parses file, schema & threshold validation |
3 | NAVCalculator | Valid NAV published | nav.validated | NAVCalculatorService | NAVPublisherService | NAV pushed to Redis, Cosmos, Kafka |
4 | NAVPublisher | NAV made public | nav.broadcast | NAVPublisherService | UI, PortfolioService | Latest NAV visible in UI and used for order processing |
5 | Admin | Downloads audit trail | audit.export.triggered | AuditTrailService | CosmosDB Exporter | Audit logs (immutable) exported |
6 | Admin | Uploads document | document.signed | DocumentService | Blob Storage, UI | Signed docs stored in Blob with hash verification |
🛡️ Compliance Officer Persona Flow – AML, Reporting, Audit
Step | Initiator | Action | Event Produced | Produced By | Consumed By | Outcome |
1 | KYCService | Starts Aadhaar/PAN validation | kyc.initiated | KYCService | AMLScreeningService | AML/PEP scan started |
2 | AML API | Raises a match | aml.alert.raised | AMLScreeningAdapter | AlertService | Compliance officer notified, ticket created |
3 | Compliance | Runs SEBI/FATCA report | report.scheduled | ComplianceService | CosmosDB, ReportService | CSV/JSON generated and uploaded to SEBI gateway |
4 | Compliance | Reviews alert summary | — | GenAIAMLExplainer | Cosmos + AML flags | Human-readable GenAI explanation of AML match |
5 | Officer | Verifies admin logs | audit.accessed | AuditTrailService | CosmosDB + UI | Immutable audit trail checked |
📡 NAV Feed Pipeline Flow (Realtime)
Step | Action / Source | Event Produced | Produced By | Consumed By | Outcome |
1 | RTA SFTP Upload | File placed | — | NAVIngestorService | File picked from blob or SFTP |
2 | NAV file processed | nav.raw-feeds | NAVIngestorService | NAVCalculatorService | Validated, cleaned |
3 | NAV calculated | nav.validated | NAVCalculatorService | NAVPublisherService | Rounding, formula applied |
4 | NAV cached/broadcasted | nav.broadcast | NAVPublisherService | UI, PortfolioService, TransactionEngine | Real-time NAV available for display & transactions |
5 | Monitoring | Metrics emitted | Prometheus Exporter | Grafana, Alerting | Alert if NAV delayed (nav_age_seconds > 900) |
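The staleness rule in step 5 (alert when `nav_age_seconds > 900`) can be expressed as a small check. This is a sketch only: `NavFreshness` and its method names are hypothetical, and in the platform the comparison would more likely live in a Prometheus alert rule than in application code.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper mirroring the nav_age_seconds alert rule. */
final class NavFreshness {
    // Threshold taken from the alert condition in the table above (15 minutes).
    static final long MAX_NAV_AGE_SECONDS = 900;

    /** Seconds elapsed between the last NAV publication and "now". */
    static long navAgeSeconds(Instant lastPublished, Instant now) {
        return Duration.between(lastPublished, now).getSeconds();
    }

    /** True when the NAV is strictly older than the threshold. */
    static boolean isStale(Instant lastPublished, Instant now) {
        return navAgeSeconds(lastPublished, now) > MAX_NAV_AGE_SECONDS;
    }
}
```

Passing `now` in as a parameter, instead of calling `Instant.now()` inside the method, keeps the check deterministic and easy to test.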
👤 INVESTOR FLOW — Place Order and View Portfolio
🧭 Use Case: Investor places an order and tracks it end-to-end
Investor logs in → AuthService authenticates via Azure AD B2C → JWT token is issued with investorId, tenantId, and roles → No event (handled via stateless auth)
Investor searches for funds → UI calls ProductCatalogService → Fund metadata and NAV fetched from Redis (cached by NAVPublisher) → No event (real-time REST)
Investor places an order → OrderService validates inputs → Publishes order.placed to Kafka
PaymentService consumes order.placed → Initiates payment via Razorpay API → Registers webhook endpoint → Order marked as "Awaiting Payment"
Razorpay sends callback (payment success) → PGWebhookHandler receives response → Publishes payment.success event
PaymentService consumes payment.success → Validates signature and transaction ID → Publishes transaction.ready event
TransactionEngine consumes transaction.ready → Reads latest NAV from Redis → Allocates fund units → Publishes transaction.completed
HoldingService consumes transaction.completed → Updates portfolio state in DB and cache → Publishes portfolio.updated
NotificationService consumes portfolio.updated → Fetches user contact preferences → Sends SMS/Email → Publishes notification.sent (for audit trail)
Investor visits dashboard → PortfolioService queries updated holdings → Latest data shown in UI
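The step where PaymentService validates the callback signature can be sketched as follows. It assumes the gateway signs the raw callback body with HMAC-SHA256 using a shared webhook secret and sends the hex digest alongside the request; confirm the exact scheme and header name against the gateway's current documentation. `WebhookSignatureVerifier` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical verifier for HMAC-SHA256 signed webhook callbacks. */
final class WebhookSignatureVerifier {

    /** Computes the lowercase hex HMAC-SHA256 of the payload under the shared secret. */
    static String hmacSha256Hex(String payload, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    /** Compares expected and received signatures in constant time. */
    static boolean isValid(String payload, String receivedSignature, String secret) {
        byte[] expected = hmacSha256Hex(payload, secret).getBytes(StandardCharsets.UTF_8);
        byte[] received = receivedSignature.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, received);
    }
}
```

The signature must be computed over the raw request body exactly as received; re-serializing parsed JSON before hashing usually changes the bytes and breaks verification.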
🤝 DISTRIBUTOR FLOW — Onboard Investor and Track Commission
🧭 Use Case: Distributor creates a lead, completes KYC, tracks commissions
Distributor logs in (Azure AD) → Role: distributor, JWT token issued
Adds lead → LeadService creates entry → Publishes lead.created
CRMAdapterService consumes lead.created → Syncs with Zendesk/Salesforce via API → Lead created in CRM
Distributor initiates registration → InvestorService creates new investor → Publishes investor.registered
KYCService consumes investor.registered → Triggers Aadhaar/PAN via Digio/NSDL → On success, publishes kyc.completed
CredentialService consumes kyc.completed → Issues login credentials via SMS/email → Access activated
Investor later places a transaction → The investor flow above runs → On transaction.completed, CommissionEngine is invoked
CommissionEngine publishes commission.calculated → Includes payout type, hierarchy, tax → Consumed by LedgerService
LedgerService updates records → Dashboard refreshed
👨‍💼 ADMIN FLOW — Fund Setup, NAV Publication, DR & Logs
🧭 Use Case: Admin configures a new fund, uploads NAV, and audits logs
Admin logs in (Azure AD) → JWT with admin role
Creates fund → AdminService triggers fund creation → Publishes fund.created
FundSetupService consumes fund.created → Stores metadata, assigns default categories
NAV file dropped by CAMS/KFinTech (every 15 mins) → Blob trigger → NAVIngestorService reads file → Parses schema, checks timestamp → Publishes nav.raw-feeds
NAVCalculatorService consumes nav.raw-feeds → Computes final NAV using formula → Publishes nav.validated
NAVPublisherService consumes nav.validated → Pushes to Redis (for UI), Cosmos DB (history) → Publishes nav.broadcast
All relevant consumers (UI, PortfolioService, TransactionEngine) → Fetch updated NAV
Admin triggers audit export → AuditTrailService reads Cosmos append-only logs → Publishes audit.export.triggered → CSV/JSON made available for download
🛡️ COMPLIANCE FLOW — AML, SEBI Reporting, Audit Trail
🧭 Use Case: AML alert raised during KYC, reports submitted
KYCService triggers PAN/Aadhaar → Publishes kyc.initiated
AMLScreeningService consumes kyc.initiated → Checks with AML/PEP APIs → If flagged, publishes aml.alert.raised
AlertService consumes aml.alert.raised → Shows alert on Compliance dashboard → Ticket auto-generated
ComplianceService runs daily job → Reads audit logs + transactions → Publishes report.scheduled
ReportExporter consumes report.scheduled → Generates SEBI-compliant output → Secure file uploaded to SEBI gateway
Compliance Officer requests AML explanation → Query sent to GenAIAMLExplainerService → Returns plain-English justification from JSON + logs
📡 NAV FEED PIPELINE FLOW – Every 15 mins
NAV file dropped by RTA (CAMS/KFinTech) → Stored in blob or SFTP folder
NAVIngestorService picks up file → Validates schema, checksum → Publishes nav.raw-feeds
NAVCalculatorService consumes nav.raw-feeds → Applies rounding, threshold rules → Publishes nav.validated
NAVPublisherService consumes nav.validated → Updates:
Redis cache (UI)
Cosmos DB (audit)
Kafka nav.broadcast
Consumers (UI, Portfolio, TransactionEngine) → Subscribe to nav.broadcast → React in near real-time
Prometheus tracks nav_age_seconds → Alerts if >900 seconds old
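The rounding and threshold rules applied by NAVCalculatorService are not spelled out, so the sketch below fills them in with assumed values: 4-decimal half-up rounding and a maximum 10% move against the previous NAV. `NavValidator`, `NAV_SCALE`, and `MAX_CHANGE_PCT` are hypothetical; the real limits would come from the fund house and the regulator.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical validation step run before nav.validated is published. */
final class NavValidator {
    static final int NAV_SCALE = 4;                                  // assumed precision
    static final BigDecimal MAX_CHANGE_PCT = new BigDecimal("10");   // assumed threshold

    /** Rounds a raw NAV to the published precision. */
    static BigDecimal round(BigDecimal rawNav) {
        return rawNav.setScale(NAV_SCALE, RoundingMode.HALF_UP);
    }

    /** True when the new NAV moved no more than MAX_CHANGE_PCT from the previous one. */
    static boolean withinThreshold(BigDecimal previousNav, BigDecimal newNav) {
        if (previousNav.signum() <= 0 || newNav.signum() <= 0) {
            return false; // non-positive NAVs are rejected outright
        }
        BigDecimal changePct = newNav.subtract(previousNav).abs()
                .multiply(BigDecimal.valueOf(100))
                .divide(previousNav, 6, RoundingMode.HALF_UP);
        return changePct.compareTo(MAX_CHANGE_PCT) <= 0;
    }
}
```

A NAV that fails the threshold check would typically be held back for manual review rather than silently dropped, so that the staleness alert still fires if nothing valid arrives.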
GenAI Capabilities
===
🔹 1. GenAI Capabilities by Use Case
GenAI Capability | Target Persona | Purpose |
Conversational Assistant (RAG) | Investor, Distributor | Answer fund questions, NAV, portfolio insights |
Goal-based Investment Advisory | Investor | Recommend funds based on risk, age, goal |
Document Summarization (Offer Docs) | Investor | Simplify long PDFs (factsheets, T&Cs) |
Portfolio Health Analysis | Investor | Natural language explanation of holdings/performance |
Anomaly Detection & Root Cause | Admin, Ops | Explain abnormal system or NAV behavior using logs |
Auto-fill KYC & Form Validation | Admin, Distributor | Extract Aadhaar/PAN data and auto-fill during onboarding |
Email/Support Reply Drafting | CRM Agent | Draft contextual replies using past tickets and metadata |
AML/PEP Screening Summarizer | Compliance Officer | Explain alerts and matches from AML APIs |
Fund Comparison Bot | Investor | Chatbot to compare funds with reasoning (RAG + LLM) |
Commission Breakdown Explainer | Distributor | “Why did I get this commission?” — explainable GenAI |
🔹 2. New GenAI Microservices to Introduce
Microservice | Function |
GenAIAdvisorService | LLM-based fund advice and recommendations |
KYCDataExtractorService | Extract Aadhaar/PAN data via OCR/NLP |
DocSummarizerService | Summarize PDF factsheets, offer documents |
PortfolioExplainerService | Use GenAI to explain investor’s gains/losses |
ConversationOrchestrator | Multimodal chat interface orchestrating fund/NAV/portfolio lookups |
AnomalyExplainerService | Use logs/metrics + LLM to explain failures or spikes (e.g., NAV delay) |
SupportReplyGenerator | Generate email replies based on ticket + previous resolutions |
🔹 3. Integration Architecture (Text Version)
Investor Chat Journey:
Investor: “Show me best-performing funds in tech for 3 years”
Chat UI → ConversationOrchestrator
Orchestrator:
Hits RAG layer (Vector DB of fund factsheets)
Calls ProductCatalogService for NAV history
LLM composes explanation: “Fund A outperformed due to X, Y...”
GenAI reply sent to UI (with visual + text)
NAV Spike Investigation:
Ops observes NAV jump
Sends query: “Why was NAV for Fund X high yesterday?”
AnomalyExplainerService:
Pulls NAV feed logs, recent events, Redis trend
Calls LLM, which responds: “NAV spike due to large inflow from corporate investor X...”
Reply published to Alert Dashboard
🔹 4. GenAI Architecture Components
Component | Role |
LLM Backend | Azure OpenAI / private GPT / Claude / Gemini |
RAG Layer | Vector DB (e.g., Pinecone, Azure Search with Embeddings) |
Prompt Orchestrator | Dynamically compose prompts per use case |
Chat UI | Angular/React with session memory, contextual tool invocation |
Auth Hook | JWT-based tenant+user scoping for chat sessions |
🔹 5. Observability & Governance for GenAI
Concern | Control |
Data Leakage | Use private LLM or tokenized prompts; redact sensitive input |
Prompt Injection | Validate and limit prompt inputs |
Compliance Logging | Store prompt-response trace in Cosmos or Blob (7-year retention) |
Output Explainability | RAG citations + fund sources |
RBAC Access | Restrict chat tools per user role |
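The "redact sensitive input" control can be illustrated with a small pre-processing step that masks PAN and Aadhaar numbers before a prompt leaves the platform. This is a sketch under assumptions: `PromptRedactor` is a hypothetical class, and the patterns only cover the standard formats (PAN as five uppercase letters, four digits, one letter; Aadhaar as twelve digits, optionally grouped 4-4-4).

```java
import java.util.regex.Pattern;

/** Hypothetical redaction step applied to user input before it is sent to an LLM. */
final class PromptRedactor {
    // PAN: five letters, four digits, one letter (e.g. ABCDE1234F).
    private static final Pattern PAN = Pattern.compile("\\b[A-Z]{5}[0-9]{4}[A-Z]\\b");
    // Aadhaar: twelve digits, optionally grouped 4-4-4 with spaces or hyphens.
    private static final Pattern AADHAAR =
            Pattern.compile("\\b\\d{4}[ -]?\\d{4}[ -]?\\d{4}\\b");

    /** Replaces PAN and Aadhaar-shaped tokens with fixed placeholders. */
    static String redact(String prompt) {
        String withoutPan = PAN.matcher(prompt).replaceAll("[PAN]");
        return AADHAAR.matcher(withoutPan).replaceAll("[AADHAAR]");
    }
}
```

The Aadhaar pattern will also mask any other 12-digit number, which is an acceptable trade-off here: over-redaction costs a little context, while under-redaction leaks identity data. Lowercase PANs and other identifiers (phone numbers, bank accounts) would need additional patterns.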
✅ Summary: Business Value of GenAI Integration
Persona | Added Value |
Investor | Smarter, advisory experience with explainable insights |
Distributor | Faster commission clarity, lead scoring |
Admin | Self-explaining system behaviors, onboarding automation |
Compliance | AML/PEP rationalization, report drafting |
CRM/Support | Faster, accurate, personalized responses |



Resilience Strategy for External API Integration
To ensure high availability and fault tolerance when interacting with external systems (UIDAI, PAN DB, SEBI, CRM, LLM APIs), the platform implements the following resilience strategies:
🔄 Circuit Breaker Pattern
Prevents cascading failures when external services are slow or unavailable.
Implemented using Resilience4j at the service layer.
Automatically blocks requests when the failure threshold is breached.
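To make the state machine behind the pattern concrete, here is a plain-Java sketch. It is not Resilience4j's implementation (which evaluates a failure rate over a sliding window); `SimpleCircuitBreaker` is a hypothetical class that opens after a number of consecutive failures and allows a trial call once a wait period has elapsed.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, simplified circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so tests can control time
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Current state; an OPEN breaker moves to HALF_OPEN once the wait period has passed. */
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Runs the action unless the breaker is open; failures and rejections use the fallback. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // short-circuit: the external API is not called
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call, or too many consecutive failures, (re)opens the breaker.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

In production the equivalent behaviour would come from the library's configuration rather than hand-written code; the sketch is only meant to show why an open breaker protects callers from a slow dependency.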
🔁 Retry with Backoff
Uses Retry pattern with exponential backoff.
Configurable retry attempts and delay using Spring Boot + Resilience4j.
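The backoff schedule itself is simple arithmetic: each retry waits twice as long as the previous one, up to a cap. The following plain-Java sketch shows the idea without Resilience4j; `BackoffRetry`, its method names, and the parameter values in the usage note are all hypothetical.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper with capped exponential backoff. */
final class BackoffRetry {

    /** Delay before retry number `retry` (1-based): base * 2^(retry-1), capped at maxMillis. */
    static long delayMillis(int retry, long baseMillis, long maxMillis) {
        if (retry < 1) {
            throw new IllegalArgumentException("retry must be >= 1");
        }
        long delay = baseMillis * (1L << Math.min(retry - 1, 30));
        return Math.min(delay, maxMillis);
    }

    /** Calls the action up to maxAttempts times, sleeping between failed attempts. */
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts,
                               long baseMillis, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delayMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last; // every attempt failed: surface the final error
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted during backoff", e);
        }
    }
}
```

With a 100 ms base and a 5-second cap, the waits are 100 ms, 200 ms, 400 ms, and so on until they reach 5000 ms. Retries should only wrap idempotent calls, and adding random jitter to the delay helps avoid synchronized retry storms across pods.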
⏱ Timeout Handling
API calls are wrapped with a TimeLimiter to enforce 2–3 second limits.
Ensures calling services are not held up indefinitely by a slow external API.
🚨 Fallback Methods
If an external API still fails after retries, a graceful fallback method returns a default response.
Example: “We’re unable to verify PAN at the moment. Please try later.”
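The timeout and fallback ideas combine naturally: bound the wait, and return the default response if the bound is exceeded or the call fails. The sketch below uses only `CompletableFuture` from the JDK instead of Resilience4j's TimeLimiter; `TimedCall` and `callWithTimeout` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper: bounded wait on an external call, with a default response. */
final class TimedCall {

    static <T> T callWithTimeout(Supplier<T> action, long timeoutMillis, T fallback) {
        // Runs on the common ForkJoinPool; a real service would pass a dedicated executor.
        CompletableFuture<T> future = CompletableFuture.supplyAsync(action);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future cancelled; the underlying task is not interrupted.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the external call itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

For the PAN example, a call such as `TimedCall.callWithTimeout(() -> panClient.verify(pan), 3000, unavailableMessage)` would return the "unable to verify" message after 3 seconds; `panClient` and `unavailableMessage` are placeholders, not names from the platform.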
📥 Asynchronous Integration
Long-running workflows use a Kafka + webhook model.
API accepts request → processes async → external system calls back via webhook → flow resumes.
🧠 Caching Valid Responses
Responses from frequently called APIs whose results rarely change (such as PAN validation) are cached in Redis with a TTL.
Reduces dependency on external availability.
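The read-through-with-TTL behaviour can be sketched with an in-process map standing in for Redis. `TtlCache` is a hypothetical class used only to show the logic; in the platform the same effect would come from a Redis key written with an expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical in-memory stand-in for a Redis cache with per-entry TTL. */
final class TtlCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value if still fresh; otherwise calls the loader and caches the result. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.expiresAt > now) {
            return cached.value;
        }
        V fresh = loader.apply(key); // e.g. the external PAN validation call
        entries.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }
}
```

Only successful, definitive responses should be cached; caching an error or a timeout would extend an outage for the full TTL.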
🧱 Bulkhead Pattern
Isolates external API interaction within limited thread pools.
Prevents one API failure from starving other service resources.
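The isolation idea can be shown with a semaphore that caps concurrent calls to one external API. Note the swap: the text describes thread-pool isolation, while this sketch uses the simpler semaphore variant (Resilience4j offers both styles). `SemaphoreBulkhead` here is a hypothetical hand-written class, not the library's.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore-based bulkhead limiting concurrent calls to one dependency. */
final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the action if a permit is free; otherwise returns the rejection response at once. */
    <T> T call(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get(); // bulkhead full: fail fast instead of queueing
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always return the permit, even if the action throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Giving each external dependency (KYC, AML, CRM, LLM) its own bulkhead means a stall in one can exhaust only its own permits, leaving capacity for the others.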
📈 Monitoring & Alerting
API health metrics exposed via Prometheus.
Dashboards and alerts configured in Grafana/Azure Monitor.
Alerts triggered on high failure rate, open circuit breakers, or timeouts.
🧾 Queue Fallback for Retry
Failed external interactions are pushed to a Kafka retry topic.
Delayed consumers handle retries with backoff logic.
This ensures the platform remains responsive and stable even when integrated systems are degraded or unavailable.