Personal Banking-Capacity

  • Writer: Anand Nerurkar
  • May 16
  • 2 min read

Personal Banking System - Scalable Microservices Architecture on Azure (AKS + Kafka)

🌐 Overview

  • Use Case: Complex personal banking system

  • Microservices: 20 Spring Boot-based microservices

  • Cloud: Azure (Active-Active region setup)

  • Traffic:

    • Average: 500 TPS

    • Peak: 1000+ TPS

    • Concurrent Users: 10,000 - 15,000

  • Core Modules:

    • Customer Onboarding

    • Account Management

    • Fund Transfer (NEFT/RTGS/IMPS/UPI)

    • Loan Management (Personal/Home/Car)

    • Fraud Detection

    • Statements & Notifications
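
The traffic figures above also imply how often a single user hits the system. The snippet below is a rough sketch only: it assumes load is spread evenly across users, and it pairs the low end of the user range with average TPS and the high end with peak TPS. Both assumptions are illustrative and not stated in the sizing above.

```python
# Rough per-user load implied by the traffic figures above.
# Assumption (not from the source): load is spread evenly across users.

def seconds_between_requests(concurrent_users: int, tps: float) -> float:
    """Average gap between two requests from the same user, in seconds."""
    return concurrent_users / tps

# Assumed pairing: 10,000 users at average TPS, 15,000 users at peak TPS.
average_gap = seconds_between_requests(10_000, 500)
peak_gap = seconds_between_requests(15_000, 1000)

print(f"average load: one request per user every {average_gap:.0f} s")
print(f"peak load:    one request per user every {peak_gap:.0f} s")
```

Under these assumptions a user sends a request every 15 to 20 seconds, which is a useful starting point for load-test think times.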

🏋️ Capability Map

| Business Domain | Capability |
| --- | --- |
| Customer Management | Onboarding, KYC, Profile Update |
| Accounts | Creation, Closure, Linking |
| Transactions | NEFT, UPI, RTGS, IMPS |
| Loan Services | Eligibility, EMI Management |
| Fraud Detection | Real-time transaction analysis |
| Notifications | SMS, Email, In-app messages |
| Statements | Monthly/Annual Statements |
| Audit & Compliance | Logging, Traceability, Reporting |

⚖️ Capability to Service Mapping

| Capability | Microservice |
| --- | --- |
| Onboarding | customer-onboarding-service |
| KYC | kyc-service |
| Profile Management | profile-service |
| Account Management | account-service |
| Fund Transfer | fund-transfer-service |
| Payment Gateway | payment-gateway-service |
| UPI/NEFT/RTGS | payment-routing-service |
| Loan Management | loan-service |
| Loan Evaluation | loan-evaluation-service |
| Fraud Detection | fraud-analytics-service |
| Notifications | notification-service |
| Statement Gen | statement-service |
| Auth/Login | auth-service |
| Audit Logging | audit-service |
| User Session Mgmt | session-service |
| Reporting | report-service |
| Document Storage | doc-storage-service |
| Config Management | config-service |
| API Gateway | api-gateway |
| Orchestration | orchestration-service |
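
The mapping above can be kept as a simple lookup so that tooling (dashboards, ownership checks, runbooks) resolves a capability to its service by name. This is a minimal sketch: the dictionary contents come from the mapping, while the `service_for` helper is a hypothetical name introduced for illustration.

```python
# Capability-to-service lookup built from the mapping above.
CAPABILITY_TO_SERVICE = {
    "Onboarding": "customer-onboarding-service",
    "KYC": "kyc-service",
    "Profile Management": "profile-service",
    "Account Management": "account-service",
    "Fund Transfer": "fund-transfer-service",
    "Payment Gateway": "payment-gateway-service",
    "UPI/NEFT/RTGS": "payment-routing-service",
    "Loan Management": "loan-service",
    "Loan Evaluation": "loan-evaluation-service",
    "Fraud Detection": "fraud-analytics-service",
    "Notifications": "notification-service",
    "Statement Gen": "statement-service",
    "Auth/Login": "auth-service",
    "Audit Logging": "audit-service",
    "User Session Mgmt": "session-service",
    "Reporting": "report-service",
    "Document Storage": "doc-storage-service",
    "Config Management": "config-service",
    "API Gateway": "api-gateway",
    "Orchestration": "orchestration-service",
}


def service_for(capability: str) -> str:
    """Hypothetical helper: return the microservice that owns a capability."""
    try:
        return CAPABILITY_TO_SERVICE[capability]
    except KeyError:
        raise KeyError(f"no service mapped for capability {capability!r}") from None


print(len(CAPABILITY_TO_SERVICE), "capabilities mapped")
print("KYC ->", service_for("KYC"))
```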

📊 Resource Planning (Active-Active AKS + Kafka)

🌟 Performance Targets

  • Peak TPS: 1000+

  • Latency Target: <200 ms for 95% of requests

  • Availability: 99.99% (across regions)
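
These targets translate into two numbers that are handy during sizing: how many requests are in flight at once, and how much downtime the availability target allows. The sketch below applies Little's Law (in-flight = arrival rate x time in system) and treats the 200 ms target as if every request took that long, which is a deliberately pessimistic assumption; the function names are illustrative.

```python
# Back-of-the-envelope numbers derived from the performance targets above.

def in_flight_requests(tps: float, latency_s: float) -> float:
    """Little's Law: average concurrent requests = arrival rate * latency."""
    return tps * latency_s


def downtime_budget_minutes(availability: float, days: int) -> float:
    """Minutes of downtime allowed over `days` at the given availability."""
    return (1.0 - availability) * days * 24 * 60


# Assumption: every request takes the full 200 ms (upper bound on the target).
concurrent = in_flight_requests(tps=1000, latency_s=0.2)
yearly = downtime_budget_minutes(availability=0.9999, days=365)
monthly = downtime_budget_minutes(availability=0.9999, days=30)

print(f"in-flight requests at peak: ~{concurrent:.0f}")
print(f"downtime budget: ~{yearly:.1f} min/year, ~{monthly:.1f} min per 30 days")
```

At peak this works out to roughly 200 concurrent requests across the platform, and a 99.99% target leaves under an hour of downtime per year.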

🚀 AKS Cluster Planning

  • Cluster Count: 2 AKS clusters (Active-Active)

  • Node Pool Size (per cluster):

    • Node Type: D8s_v5 (8 vCPU, 32 GB RAM)

    • Node Count: 20 (scalable to 30)

  • Total CPU/RAM (per cluster): 160 vCPU / 640 GB RAM
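
The per-cluster totals follow directly from node size and node count. The sketch below reproduces that arithmetic for the 20-node baseline and the 30-node ceiling; the `NodePool` class is an illustrative name, and the figures are raw capacity, so the capacity Kubernetes can actually schedule will be somewhat lower once system reservations are subtracted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Raw capacity of a homogeneous node pool (illustrative model)."""
    vcpu_per_node: int
    ram_gb_per_node: int
    node_count: int

    def total_vcpu(self) -> int:
        return self.vcpu_per_node * self.node_count

    def total_ram_gb(self) -> int:
        return self.ram_gb_per_node * self.node_count


# D8s_v5 nodes: 8 vCPU, 32 GB RAM each.
baseline = NodePool(vcpu_per_node=8, ram_gb_per_node=32, node_count=20)
scaled_out = NodePool(vcpu_per_node=8, ram_gb_per_node=32, node_count=30)

for name, pool in (("baseline", baseline), ("scaled out", scaled_out)):
    print(f"{name}: {pool.total_vcpu()} vCPU / {pool.total_ram_gb()} GB RAM")
```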

Kafka Planning (Azure Event Hubs / Confluent on AKS)

  • Broker Nodes: 6 brokers per region x 2 regions

  • Partitions: 200 (across topics)

  • Replication Factor: 3 (high durability)

  • Throughput: 2 GBps (sized to handle the 1000+ TPS peak)
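
Two quick checks help sanity-test the Kafka figures: how many partition replicas each broker hosts, and how much bandwidth the throughput figure leaves per transaction. This is a sketch under stated assumptions: the 200 partitions are taken to be per regional cluster, replicas are assumed to spread evenly, and "2 GBps" is read as 2 x 10^9 bytes per second. None of these readings is confirmed by the plan itself.

```python
# Sanity checks on the Kafka sizing above (assumptions noted inline).

PARTITIONS = 200            # assumed to be per regional cluster
REPLICATION_FACTOR = 3
BROKERS_PER_REGION = 6
THROUGHPUT_BYTES_PER_S = 2 * 10**9   # "2 GBps" read as gigabytes per second
PEAK_TPS = 1000


def replicas_per_broker(partitions: int, replication_factor: int, brokers: int) -> float:
    """Partition replicas hosted per broker, assuming an even spread."""
    return partitions * replication_factor / brokers


def bytes_per_transaction(throughput_bytes_per_s: int, tps: int) -> float:
    """Bandwidth budget available to each transaction at the given TPS."""
    return throughput_bytes_per_s / tps


print("replicas per broker:",
      replicas_per_broker(PARTITIONS, REPLICATION_FACTOR, BROKERS_PER_REGION))
print("bytes per transaction:",
      bytes_per_transaction(THROUGHPUT_BYTES_PER_S, PEAK_TPS))
```

Read this way, the throughput figure allows about 2 MB of Kafka traffic per transaction, far more than typical payment events need, so it is best treated as headroom rather than expected load.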

🌀 Other Resources

  • Azure SQL/PostgreSQL (Active Geo-Replication)

  • Blob Storage (RA-GRS)

  • Redis Cache (Active-Active)

  • Application Gateway + Front Door (Global Load Balancer)

🔹 Architecture Flow (Text Description)

  1. User request reaches Azure Front Door

  2. Routed to region via Traffic Manager

  3. Passes through Azure API Gateway

  4. Authentication via auth-service

  5. Based on request type:

    • Onboarding -> customer-onboarding-service

    • UPI/NEFT -> fund-transfer-service

    • Loan check -> loan-service + loan-evaluation-service

  6. Transactions logged via audit-service

  7. Data written to PostgreSQL + Kafka

  8. Kafka triggers fraud-analytics-service

  9. Notifications sent via notification-service

  10. User gets async updates via websockets or push notifications
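
The ten steps above can be sketched as an ordered list of hops per request type. This is illustrative only: the request-type keys (`"onboarding"`, `"upi"`, `"neft"`, `"loan-check"`) and the `plan_request` function are hypothetical names, and the code models the order of the hops rather than any real network call.

```python
from typing import Dict, List

# Step 5: request type -> business services (keys are hypothetical labels).
ROUTES: Dict[str, List[str]] = {
    "onboarding": ["customer-onboarding-service"],
    "upi": ["fund-transfer-service"],
    "neft": ["fund-transfer-service"],
    "loan-check": ["loan-service", "loan-evaluation-service"],
}


def plan_request(request_type: str) -> List[str]:
    """Return the ordered hops for one request, following steps 1-10."""
    if request_type not in ROUTES:
        raise ValueError(f"unknown request type: {request_type!r}")
    synchronous = [
        "Azure Front Door",        # step 1
        "Traffic Manager",         # step 2
        "API Gateway",             # step 3
        "auth-service",            # step 4
        *ROUTES[request_type],     # step 5
        "audit-service",           # step 6
        "PostgreSQL + Kafka",      # step 7
    ]
    asynchronous = [
        "fraud-analytics-service",  # step 8, triggered by Kafka
        "notification-service",     # step 9
        "websocket/push update",    # step 10
    ]
    return synchronous + asynchronous


for hop in plan_request("loan-check"):
    print(hop)
```

Keeping the synchronous and asynchronous hops in separate lists mirrors the flow: everything up to the database and Kafka write sits on the request path, while fraud analysis and notifications happen after the write.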

🚀 Pods & Replicas

🧶 Pod Planning

  • Total Microservices: 20

  • Average Resource per Pod: 500m CPU / 1 GB RAM

| Service Type | Replicas (Avg) | Reason |
| --- | --- | --- |
| Core Services | 4-5 | Consistent usage (auth/account) |
| High-Traffic (e.g. payments) | 8-10 | 500+ TPS |
| Async Services | 2-3 | Notification, audit |
| Event Consumers | 3-5 | Kafka-based |

Total Pods Estimate: ~100-120 per region (scalable)
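
The pod estimate can be cross-checked by multiplying replica ranges by the number of services of each type. The plan does not say how the 20 services split across the four types, so the split below (6 core, 6 high-traffic, 4 async, 4 event consumers) is a hypothetical assumption chosen only to illustrate the arithmetic.

```python
from typing import Dict, Tuple

# Replica ranges per service type, from the pod planning figures above.
REPLICA_RANGES: Dict[str, Tuple[int, int]] = {
    "core": (4, 5),
    "high_traffic": (8, 10),
    "async": (2, 3),
    "event_consumer": (3, 5),
}

# Hypothetical split of the 20 services; the source gives no per-type counts.
SERVICES_PER_TYPE: Dict[str, int] = {
    "core": 6,
    "high_traffic": 6,
    "async": 4,
    "event_consumer": 4,
}


def pod_range() -> Tuple[int, int]:
    """Lowest and highest pod count implied by the replica ranges."""
    low = sum(SERVICES_PER_TYPE[t] * REPLICA_RANGES[t][0] for t in REPLICA_RANGES)
    high = sum(SERVICES_PER_TYPE[t] * REPLICA_RANGES[t][1] for t in REPLICA_RANGES)
    return low, high


def requested_resources(pods: int) -> Tuple[float, int]:
    """Total (vCPU, GB RAM) requested at 500m CPU / 1 GB RAM per pod."""
    return pods * 0.5, pods * 1


low, high = pod_range()
vcpu, ram_gb = requested_resources(120)
print(f"pods per region: {low}-{high}")
print(f"120 pods request {vcpu:.0f} vCPU / {ram_gb} GB RAM")
```

With this split the range brackets the ~100-120 estimate, and even 120 pods request well under half of a 160 vCPU / 640 GB cluster, leaving room for Kafka, infra services, and bursts.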

⚔ Node Pool & Scaling Strategy

  • HPA (Horizontal Pod Autoscaler): CPU/memory/custom metrics

  • Node Autoscaler: Enabled for burst traffic

  • Node Pool Design:

    • General Pool (80%) for business logic

    • Dedicated Pool (20%) for Kafka & infra services
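
For intuition on how the HPA reacts to load, the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current replicas x current metric / target metric), and then clamps the result to a replica range. It ignores the tolerance band and stabilization windows, the 60% CPU target is an assumed value, and `split_nodes` is a hypothetical helper for the 80/20 pool design.

```python
import math
from typing import Tuple


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """HPA-style scaling rule, clamped to [min_replicas, max_replicas]."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def split_nodes(total_nodes: int, general_pct: int = 80) -> Tuple[int, int]:
    """Split nodes into (general pool, dedicated pool) by percentage."""
    general = total_nodes * general_pct // 100
    return general, total_nodes - general


# A payments service at 8 replicas, 90% CPU against an assumed 60% target.
print("uncapped:", desired_replicas(8, 90, 60, min_replicas=8, max_replicas=16))
print("capped at 10:", desired_replicas(8, 90, 60, min_replicas=8, max_replicas=10))
print("node pools (general, dedicated):", split_nodes(20))
```

The capped case shows why the replica ceiling matters: once a service reaches its maximum, further load has to be absorbed by the node autoscaler adding capacity or by raising the limit.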

⚠️ Enterprise Risks & Mitigation

| Category | Risk | Priority | Mitigation |
| --- | --- | --- | --- |
| Business | User churn due to downtime | High | Active-Active, SLAs, SRE practices |
| Operations | Scaling delays during traffic burst | High | Autoscaler with buffer nodes |
| Technology | Kafka message loss | High | Durable storage + replication |
| People | Misuse of elevated roles | Medium | RBAC, Zero Trust policies |
| Process | Inefficient loan processing | Medium | BPM, async workflows |
| Security | Transaction spoofing/fraud | High | Real-time fraud analytics, OTP, MFA |
| Compliance | Non-adherence to RBI/SEBI | High | Audits, traceability, retention |
| Governance | No visibility on microservice health | Medium | Observability stack (ELK, Prometheus) |



