Personal Banking-Capacity
- Anand Nerurkar
- May 16
Personal Banking System - Scalable Microservices Architecture on Azure (AKS + Kafka)
Overview
Use Case: Complex personal banking system
Microservices: 20 Spring Boot-based microservices
Cloud: Azure (Active-Active region setup)
Traffic:
Average: 500 TPS
Peak: 1000+ TPS
Concurrent Users: 10,000 - 15,000
Core Modules:
Customer Onboarding
Account Management
Fund Transfer (NEFT/RTGS/IMPS/UPI)
Loan Management (Personal/Home/Car)
Fraud Detection
Statements & Notifications
Capability Map
Business Domain | Capability |
Customer Management | Onboarding, KYC, Profile Update |
Accounts | Creation, Closure, Linking |
Transactions | NEFT, UPI, RTGS, IMPS |
Loan Services | Eligibility, EMI Management |
Fraud Detection | Real-time transaction analysis |
Notifications | SMS, Email, In-app messages |
Statements | Monthly/Annual Statements |
Audit & Compliance | Logging, Traceability, Reporting |
Capability to Service Mapping
Capability | Microservice |
Onboarding | customer-onboarding-service |
KYC | kyc-service |
Profile Management | profile-service |
Account Management | account-service |
Fund Transfer | fund-transfer-service |
Payment Gateway | payment-gateway-service |
UPI/NEFT/RTGS | payment-routing-service |
Loan Management | loan-service |
Loan Evaluation | loan-evaluation-service |
Fraud Detection | fraud-analytics-service |
Notifications | notification-service |
Statement Gen | statement-service |
Auth/Login | auth-service |
Audit Logging | audit-service |
User Session Mgmt | session-service |
Reporting | report-service |
Document Storage | doc-storage-service |
Config Management | config-service |
API Gateway | api-gateway |
Orchestration | orchestration-service |
Resource Planning (Active-Active AKS + Kafka)
Performance Targets
Peak TPS: 1000+
Latency Target: < 200 ms for 95% of requests
Availability: 99.99% (across regions)
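To make these targets concrete, the sketch below converts them into two planning numbers: the approximate number of requests in flight at peak (Little's law) and the yearly downtime budget implied by 99.99%. It assumes the 200 ms target can stand in for the average time a request spends in the system, which is a conservative simplification rather than something the targets state.

```python
# Back-of-the-envelope checks for the performance targets.
# Assumption: the 200 ms target is used as a stand-in for the average
# time a request spends in the system (conservative simplification).

PEAK_TPS = 1000            # peak transactions per second
LATENCY_SECONDS = 0.200    # 200 ms latency target
AVAILABILITY = 0.9999      # 99.99% availability target

MINUTES_PER_YEAR = 365 * 24 * 60  # 365-day year


def in_flight_requests(tps: float, latency_s: float) -> float:
    """Little's law: average requests in flight = arrival rate * time in system."""
    return tps * latency_s


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per 365-day year the system may be unavailable."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    in_flight = in_flight_requests(PEAK_TPS, LATENCY_SECONDS)
    budget = downtime_budget_minutes(AVAILABILITY)
    print(f"Requests in flight at peak: {in_flight:.0f}")
    print(f"Downtime budget: {budget:.2f} minutes/year")
```

Under these assumptions the system holds roughly 200 requests in flight at peak and may be unavailable for about 52.6 minutes per year across both regions.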
AKS Cluster Planning
Cluster Count: 2 AKS clusters (Active-Active)
Node Pool Size (per cluster):
Node Type: D8s_v5 (8 vCPU, 32 GB RAM)
Node Count: 20 (scalable to 30)
Total CPU/RAM (per cluster): 160 vCPU / 640 GB RAM
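The per-cluster totals follow directly from node size times node count. A minimal sketch of that arithmetic, including the 30-node ceiling, is shown here; the `NodePool` class is a hypothetical helper introduced only for illustration. These are raw totals: the capacity actually allocatable to pods is somewhat lower, because each node reserves resources for the OS and Kubernetes components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Hypothetical helper describing a homogeneous node pool."""
    vcpu_per_node: int
    ram_gb_per_node: int
    node_count: int

    def total_vcpu(self) -> int:
        return self.vcpu_per_node * self.node_count

    def total_ram_gb(self) -> int:
        return self.ram_gb_per_node * self.node_count


# D8s_v5: 8 vCPU, 32 GB RAM per node
BASELINE = NodePool(vcpu_per_node=8, ram_gb_per_node=32, node_count=20)
SCALED_OUT = NodePool(vcpu_per_node=8, ram_gb_per_node=32, node_count=30)

if __name__ == "__main__":
    for label, pool in (("baseline", BASELINE), ("scaled out", SCALED_OUT)):
        print(f"{label}: {pool.total_vcpu()} vCPU / {pool.total_ram_gb()} GB RAM")
```

At the 30-node ceiling a single cluster reaches 240 vCPU and 960 GB RAM.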
Kafka Planning (Azure Event Hubs / Confluent on AKS)
Broker Nodes: 6 brokers x 2 regions
Partitions: 200 (across topics)
Replication Factor: 3 (high durability)
Throughput: 2 GBps provisioned capacity (sized to handle 1000+ TPS)
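A quick sanity check of the Kafka figures can be sketched as follows. It assumes the 200 partitions and 6 brokers describe one regional cluster, and it reads "2 GBps" as 2 gigabytes per second (decimal); both readings are assumptions, since the plan does not spell them out.

```python
# Rough Kafka sizing checks.
# Assumptions: 200 partitions and 6 brokers describe ONE regional cluster,
# and "2 GBps" means 2 * 10**9 bytes per second.

BROKERS = 6
PARTITIONS = 200
REPLICATION_FACTOR = 3
THROUGHPUT_BYTES_PER_S = 2_000_000_000
PEAK_TPS = 1000


def replicas_per_broker(partitions: int, rf: int, brokers: int) -> float:
    """Average partition replicas each broker hosts when spread evenly."""
    if rf > brokers:
        raise ValueError("replication factor cannot exceed broker count")
    return partitions * rf / brokers


def leaders_per_broker(partitions: int, brokers: int) -> float:
    """Average partition leaders per broker when leadership is balanced."""
    return partitions / brokers


def byte_budget_per_transaction(throughput_bytes: int, tps: int) -> float:
    """Bytes of Kafka throughput available per transaction at the given TPS."""
    return throughput_bytes / tps


if __name__ == "__main__":
    print(replicas_per_broker(PARTITIONS, REPLICATION_FACTOR, BROKERS))
    print(round(leaders_per_broker(PARTITIONS, BROKERS), 1))
    print(byte_budget_per_transaction(THROUGHPUT_BYTES_PER_S, PEAK_TPS))
```

Under these assumptions each broker carries about 100 partition replicas, and the throughput figure leaves roughly 2 MB of Kafka bandwidth per transaction at 1000 TPS. That is a generous budget even if every transaction fans out into several events and each event is written three times for replication.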
Other Resources
Azure SQL/PostgreSQL (Active Geo-Replication)
Blob Storage (RA-GRS)
Redis Cache (Active-Active)
Application Gateway + Front Door (Global Load Balancer)
Architecture Flow (Text Description)
User Request reaches Azure Front Door
Routed to region via Traffic Manager
Passes through the API gateway (api-gateway)
Authentication via auth-service
Based on request type:
Onboarding -> customer-onboarding-service
UPI/NEFT -> fund-transfer-service
Loan check -> loan-service + loan-evaluation-service
Transactions logged via audit-service
Data written to PostgreSQL + Kafka
Kafka triggers fraud-analytics-service
Notifications sent via notification-service
User gets async updates via websockets or push notifications
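The flow above can be summarised as a small routing table. The sketch below is illustrative only: the request-type keys (`"onboarding"`, `"upi"`, `"neft"`, `"loan-check"`) and the `plan_request` function are hypothetical names, while the service names come from the flow itself. In the real system this logic would live in the gateway and orchestration layers, not in one function.

```python
from typing import Dict, List

# Synchronous targets per request type (keys are hypothetical labels).
SYNC_ROUTES: Dict[str, List[str]] = {
    "onboarding": ["customer-onboarding-service"],
    "upi": ["fund-transfer-service"],
    "neft": ["fund-transfer-service"],
    "loan-check": ["loan-service", "loan-evaluation-service"],
}

# Services triggered asynchronously through Kafka after the write.
ASYNC_FOLLOW_UPS: List[str] = ["fraud-analytics-service", "notification-service"]


def plan_request(request_type: str, authenticated: bool) -> List[str]:
    """Return the ordered list of services a request passes through."""
    if not authenticated:
        raise PermissionError("auth-service rejected the request")
    if request_type not in SYNC_ROUTES:
        raise ValueError(f"no route for request type: {request_type!r}")
    return [
        "auth-service",
        *SYNC_ROUTES[request_type],
        "audit-service",
        *ASYNC_FOLLOW_UPS,
    ]


if __name__ == "__main__":
    print(" -> ".join(plan_request("loan-check", authenticated=True)))
```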
Pods & Replicas
Pod Planning
Total Microservices: 20
Average Resource per Pod: 500m CPU / 1 GB RAM
Service Type | Replicas (Avg) | Reason |
Core Services | 4-5 | Consistent usage (auth/account) |
High-Traffic (e.g. payments) | 8-10 | 500+ TPS |
Async Services | 2-3 | Notification, audit |
Event Consumers | 3-5 | Kafka-based |
Total Pods Estimate: ~100-120 per region (scalable)
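As a rough check, the pod estimate can be compared against the 20-node cluster (160 vCPU / 640 GB RAM) from the AKS plan. The sketch assumes the 500m CPU / 1 GB figures are per-pod resource requests, and that a regional failover doubles the pod count in the surviving region. Both are assumptions made for illustration, not statements from the plan.

```python
# Pod requests versus raw cluster capacity.
# Assumptions: 500m CPU / 1 GB RAM are per-pod REQUESTS, and a regional
# failover doubles the pod count in the surviving region.

POD_CPU = 0.5          # vCPU requested per pod
POD_RAM_GB = 1.0       # GB RAM requested per pod
CLUSTER_VCPU = 160     # 20 x D8s_v5
CLUSTER_RAM_GB = 640


def requested(pods: int) -> tuple:
    """Total (vCPU, GB RAM) requested by the given number of pods."""
    return pods * POD_CPU, pods * POD_RAM_GB


def utilisation(pods: int) -> tuple:
    """Fraction of raw cluster (CPU, RAM) consumed by pod requests."""
    cpu, ram = requested(pods)
    return cpu / CLUSTER_VCPU, ram / CLUSTER_RAM_GB


if __name__ == "__main__":
    for label, pods in (("normal (120 pods)", 120), ("failover (240 pods)", 240)):
        cpu_frac, ram_frac = utilisation(pods)
        print(f"{label}: CPU {cpu_frac:.1%}, RAM {ram_frac:.1%}")
```

At 120 pods the requests total 60 vCPU and 120 GB, about 37.5% of CPU. Under the doubling assumption, one region carrying both loads needs 120 vCPU (75% of the 20-node cluster), which still fits before the pool scales toward 30 nodes.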
Node Pool & Scaling Strategy
HPA (Horizontal Pod Autoscaler): CPU/memory/custom metrics
Node Autoscaler: Enabled for burst traffic
Node Pool Design:
General Pool (80%) for business logic
Dedicated Pool (20%) for Kafka & infra services
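To illustrate how the HPA reacts to load, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current x currentMetric / targetMetric), with its default 10% tolerance. It also splits a node count 80/20 between the two pools. The utilisation values in the example are hypothetical, and `split_nodes` is an illustrative helper rather than an AKS feature.

```python
import math
from typing import Tuple


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation: scale by the ratio of current to target metric.

    No change is made while the ratio stays within the tolerance band.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def split_nodes(total_nodes: int, general_share: float = 0.8) -> Tuple[int, int]:
    """Split nodes into (general pool, dedicated pool) by share."""
    general = round(total_nodes * general_share)
    return general, total_nodes - general


if __name__ == "__main__":
    # Hypothetical: 8 payment pods at 90% CPU against a 60% target.
    print(desired_replicas(8, current_metric=90, target_metric=60))
    print(split_nodes(20))
```

With those hypothetical inputs, 8 replicas at 90% CPU against a 60% target scale out to 12, and a 20-node cluster splits into 16 general and 4 dedicated nodes.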
Enterprise Risks & Mitigation
Category | Risk | Priority | Mitigation |
Business | User churn due to downtime | High | Active-Active, SLAs, SRE practices |
Operations | Scaling delays during traffic burst | High | Autoscaler with buffer nodes |
Technology | Kafka message loss | High | Durable storage + replication |
People | Misuse of elevated roles | Medium | RBAC, Zero Trust policies |
Process | Inefficient loan processing | Medium | BPM, async workflows |
Security | Transaction spoofing/fraud | High | Real-time fraud analytics, OTP, MFA |
Compliance | Non-adherence to RBI/SEBI regulations | High | Audits, traceability, retention |
Governance | No visibility into microservice health | Medium | Observability stack (ELK, Prometheus) |