Personal Banking - Capacity

Personal Banking System - Scalable Microservices Architecture on Azure (AKS + Kafka)

Use Case: Complex personal banking system
Microservices: 20 Spring Boot-based microservices
Cloud: Azure (Active-Active region setup)
Traffic:
- Average: 500 TPS
- Peak: 1000+ TPS
- Concurrent Users: 10,000 - 15,000
Core Modules:
- Customer Onboarding
- Account Management
- Fund Transfer (NEFT/RTGS/IMPS/UPI)
- Loan Management (Personal/Home/Car)
- Fraud Detection
- Statements & Notifications
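
As a rough sanity check on the traffic figures above, dividing concurrent users by throughput gives the average gap between two requests from the same user. This is a minimal sketch that assumes every concurrent user is active and that load is spread evenly, neither of which the profile states; the function name is illustrative.

```python
def seconds_between_requests(concurrent_users: int, tps: float) -> float:
    """Average gap, in seconds, between two requests from the same user."""
    return concurrent_users / tps

# Figures taken from the traffic profile above.
for users in (10_000, 15_000):
    for tps in (500, 1000):
        gap = seconds_between_requests(users, tps)
        print(f"{users} users at {tps} TPS -> one request per user every {gap:.0f} s")
```

Under these assumptions, the average of 500 TPS corresponds to one request per user every 20 to 30 seconds.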

Business Domain	Capability
Customer Management	Onboarding, KYC, Profile Update
Accounts	Creation, Closure, Linking
Transactions	NEFT, UPI, RTGS, IMPS
Loan Services	Eligibility, EMI Management
Fraud Detection	Real-time transaction analysis
Notifications	SMS, Email, In-app messages
Statements	Monthly/Annual Statements
Audit & Compliance	Logging, Traceability, Reporting

Capability	Microservice
Onboarding	customer-onboarding-service
KYC	kyc-service
Profile Management	profile-service
Account Management	account-service
Fund Transfer	fund-transfer-service
Payment Gateway	payment-gateway-service
Payment Routing (UPI/NEFT/RTGS/IMPS)	payment-routing-service
Loan Management	loan-service
Loan Evaluation	loan-evaluation-service
Fraud Detection	fraud-analytics-service
Notifications	notification-service
Statement Generation	statement-service
Authentication/Login	auth-service
Audit Logging	audit-service
User Session Management	session-service
Reporting	report-service
Document Storage	doc-storage-service
Config Management	config-service
API Gateway	api-gateway
Orchestration	orchestration-service

Cluster Count: 2 AKS clusters (Active-Active)
Node Pool Size (per cluster):
- Node Type: Standard_D8s_v5 (8 vCPU, 32 GB RAM)
- Node Count: 20 (scalable to 30)
Total CPU/RAM (per cluster): 160 vCPU / 640 GB RAM at 20 nodes (240 vCPU / 960 GB RAM at 30 nodes)
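
The per-cluster totals follow directly from the node size and node count. The sketch below reproduces that arithmetic for the baseline and the scaled-out pool; the constant and function names are illustrative, and the figures are raw node capacity, not what Kubernetes leaves allocatable after system reservations.

```python
VCPU_PER_NODE = 8     # D8s_v5, from the node pool description
RAM_GB_PER_NODE = 32  # D8s_v5, from the node pool description

def cluster_capacity(nodes: int) -> tuple[int, int]:
    """Return (total vCPU, total RAM in GB) for one cluster."""
    return nodes * VCPU_PER_NODE, nodes * RAM_GB_PER_NODE

for nodes in (20, 30):  # baseline and maximum node counts
    vcpu, ram_gb = cluster_capacity(nodes)
    print(f"{nodes} nodes: {vcpu} vCPU / {ram_gb} GB RAM per cluster")
```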

User Request reaches Azure Front Door
Routed to region via Traffic Manager
Passes through the API gateway (api-gateway)
Authentication via auth-service
Based on request type:
- Onboarding -> customer-onboarding-service
- UPI/NEFT -> fund-transfer-service
- Loan check -> loan-service + loan-evaluation-service
Transactions logged via audit-service
Data written to PostgreSQL + Kafka
Kafka triggers fraud-analytics-service
Notifications sent via notification-service
User receives asynchronous updates via WebSockets or push notifications
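
The routing step in the flow above can be sketched as a lookup table. The service names come from this document; the request-type keys and the `request_path` function are hypothetical labels for illustration, and the sketch covers only the gateway, authentication, and target services, not the asynchronous audit, Kafka, and notification steps.

```python
# Request type -> target services, taken from the request flow above.
# The keys are hypothetical labels; the values are the document's service names.
ROUTES = {
    "onboarding": ["customer-onboarding-service"],
    "fund-transfer": ["fund-transfer-service"],
    "loan-check": ["loan-service", "loan-evaluation-service"],
}

def request_path(request_type: str) -> list[str]:
    """Return the ordered components a request passes through after the edge."""
    if request_type not in ROUTES:
        raise ValueError(f"unknown request type: {request_type}")
    return ["api-gateway", "auth-service", *ROUTES[request_type]]

print(" -> ".join(request_path("loan-check")))
```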

Service Type	Replicas per Service (Avg)	Reason
Core Services	4-5	Consistent usage (auth/account)
High-Traffic (e.g. payments)	8-10	500+ TPS
Async Services	2-3	Notification, audit
Event Consumers	3-5	Kafka-based

Total Pods Estimate: ~100-120 per region (scalable)
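
The pod estimate can be cross-checked against the replica table. The replica ranges below come from that table, but the split of the 20 services across the four service types is an assumption made for illustration; the source does not give it.

```python
# (min, max) average replicas per service, from the replica table above.
REPLICAS = {
    "core": (4, 5),
    "high_traffic": (8, 10),
    "async": (2, 3),
    "event_consumer": (3, 5),
}

# ASSUMPTION: how the 20 services divide across the types (not from the source).
SERVICE_COUNT = {"core": 7, "high_traffic": 5, "async": 4, "event_consumer": 4}

def pod_range(counts: dict, replicas: dict) -> tuple[int, int]:
    """Return the (low, high) total pod count for one region."""
    low = sum(counts[kind] * replicas[kind][0] for kind in counts)
    high = sum(counts[kind] * replicas[kind][1] for kind in counts)
    return low, high

low, high = pod_range(SERVICE_COUNT, REPLICAS)
print(f"Estimated pods per region: {low}-{high}")
```

With this assumed split the range is 88 to 117 pods, which overlaps the ~100-120 estimate; a different split would shift the range.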

HPA (Horizontal Pod Autoscaler): CPU/memory/custom metrics
Cluster Autoscaler (node level): Enabled for burst traffic
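
The HPA's scaling decision follows the replica formula in the Kubernetes documentation: desired = ceil(current x currentMetric / targetMetric), clamped to the configured bounds. The sketch below applies it to a payments-style service; the 60% CPU target and the 8 to 16 replica bounds are assumptions, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math

def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Core HPA formula, clamped to the min/max replica bounds."""
    raw = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# ASSUMPTION: 60% target CPU utilisation, bounds of 8..16 replicas.
print(desired_replicas(8, 90, 60, 8, 16))   # load above target: scale out
print(desired_replicas(8, 30, 60, 8, 16))   # load below target: held at the minimum
```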
Node Pool Design:
- General Pool (80%) for business logic
- Dedicated Pool (20%) for Kafka & infra services
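
Applied to the node counts given earlier, the 80/20 split works out as follows. This is a sketch with an illustrative function name; it assumes the percentages apply to node count rather than to CPU or memory, which the source does not specify.

```python
VCPU_PER_NODE = 8  # D8s_v5

def split_pools(total_nodes: int, general_pct: int = 80) -> tuple[int, int]:
    """Return (general pool nodes, dedicated pool nodes)."""
    general = total_nodes * general_pct // 100
    return general, total_nodes - general

for nodes in (20, 30):  # baseline and maximum node counts
    general, dedicated = split_pools(nodes)
    print(f"{nodes} nodes: general={general} ({general * VCPU_PER_NODE} vCPU), "
          f"dedicated={dedicated} ({dedicated * VCPU_PER_NODE} vCPU)")
```

At the 20-node baseline this gives 16 general nodes (128 vCPU) and 4 dedicated nodes (32 vCPU) per cluster.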

Category	Risk	Priority	Mitigation
Business	User churn due to downtime	High	Active-Active, SLAs, SRE practices
Operations	Scaling delays during traffic bursts	High	Autoscaler with buffer nodes
Technology	Kafka message loss	High	Durable storage + replication
People	Misuse of elevated roles	Medium	RBAC, Zero Trust policies
Process	Inefficient loan processing	Medium	BPM, async workflows
Security	Transaction spoofing/fraud	High	Real-time fraud analytics, OTP, MFA
Compliance	Non-adherence to RBI/SEBI regulations	High	Audits, traceability, retention
Governance	No visibility into microservice health	Medium	Observability stack (ELK, Prometheus)
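
For the Kafka message-loss risk in the table above, "durable storage + replication" usually comes down to a few settings. The sketch below uses real Kafka setting names (`acks`, `enable.idempotence`, `min.insync.replicas`), but the values are a common durability-oriented choice and an assumption here, not something this document specifies; the helper functions are illustrative.

```python
# Producer settings (real Kafka names; values are an assumption).
PRODUCER_CONFIG = {"acks": "all", "enable.idempotence": True}

# Topic settings: replication factor is chosen at topic creation,
# min.insync.replicas is a topic/broker-level config. Values are an assumption.
REPLICATION_FACTOR = 3
MIN_INSYNC_REPLICAS = 2

def brokers_down_with_writes_ok(replication_factor: int, min_isr: int) -> int:
    """Replicas that can be offline while acks=all writes are still accepted."""
    if not 1 <= min_isr <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return replication_factor - min_isr

def replica_losses_survived(min_isr: int) -> int:
    """Replica losses an acknowledged (acks=all) record is guaranteed to survive."""
    return min_isr - 1

print(brokers_down_with_writes_ok(REPLICATION_FACTOR, MIN_INSYNC_REPLICAS))
print(replica_losses_survived(MIN_INSYNC_REPLICAS))
```

With a replication factor of 3 and `min.insync.replicas` of 2, one replica can be offline without blocking writes, and every acknowledged record exists on at least two replicas.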
