
Digital Banking Platform - Capacity

  • Writer: Anand Nerurkar
  • May 16

Cloud Architecture Review Document: Digital Banking Platform

1. Executive Summary

This document outlines the cloud-native architecture, capacity planning, and scaling strategy for the Digital Banking Platform, designed to support high concurrency, transaction throughput, and regulatory compliance using Microsoft Azure and Kubernetes.

2. Business Objectives

  • Modernize legacy banking systems into scalable microservices

  • Ensure high availability and disaster recovery

  • Sustain 500 to 1,200+ transactions per second (TPS) during peak load

  • Support 10,000+ concurrent users

  • Comply with BFSI regulatory mandates (e.g., RBI, SEBI)

3. Target Architecture Overview

  • Cloud Platform: Microsoft Azure

  • Orchestration: Azure Kubernetes Service (AKS)

  • Regions: Active-Active setup in North India and West India

  • Architecture Style: Microservices (Spring Boot), Event-Driven

  • Message Broker: Apache Kafka

  • API Gateway: Azure API Management + Istio Ingress Gateway

  • Observability: Prometheus, Grafana, ELK, Azure Monitor

  • Security: Azure Key Vault, Azure AD B2C, WAF, NSG, Pod Security Policies (removed in Kubernetes 1.25; Pod Security Admission is the built-in replacement)

4. Workload Summary

  • Microservices: 30 Spring Boot services

  • Replicas per service: 3

  • Kafka Brokers: 3 per region

  • Zookeeper Nodes: 3 per region

  • Istio Control Plane: 4 components

  • Observability Stack: Prometheus, Grafana, ELK (6 pods total)

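  • Sidecars: Envoy (1 per app pod; each sidecar runs as a container inside its app pod and is counted separately only for sizing in Section 5)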

5. Capacity Planning (Per Region)

Component                 Pods
Microservices (30 x 3)    90
Istio Sidecars            90
Kafka + Zookeeper         6
Istio Control Plane       4
Istio Ingress Gateway     2
Observability Stack       6
System/DaemonSets         10
Total Pods                208

  • Pods per Node (avg): 7

  • Min Nodes: 30 (208 / 7 ≈ 29.7, rounded up)

  • Max Nodes: 45 (min nodes + 50% headroom: 30 + 15)
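The node arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the figures from the Capacity Planning section (7 pods per node, 50% headroom); the function name `node_range` and its parameters are illustrative, not part of any Azure or Kubernetes tooling.

```python
import math

# Pod counts per region, copied from the Capacity Planning figures.
PODS_PER_REGION = {
    "Microservices (30 x 3)": 90,
    "Istio Sidecars": 90,
    "Kafka + Zookeeper": 6,
    "Istio Control Plane": 4,
    "Istio Ingress Gateway": 2,
    "Observability Stack": 6,
    "System/DaemonSets": 10,
}


def node_range(pod_counts, pods_per_node=7, headroom=0.5):
    """Return (total_pods, min_nodes, max_nodes) for one region."""
    total_pods = sum(pod_counts.values())
    # Round up: a fractional node still has to be a whole VM.
    min_nodes = math.ceil(total_pods / pods_per_node)
    # Headroom is expressed as a fraction of the minimum node count.
    max_nodes = min_nodes + math.ceil(min_nodes * headroom)
    return total_pods, min_nodes, max_nodes


total, min_nodes, max_nodes = node_range(PODS_PER_REGION)
print(total, min_nodes, max_nodes)  # 208 30 45
```

Changing `pods_per_node` or `headroom` shows how sensitive the node range is to those two planning assumptions.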


6. Scaling Strategy

  • Horizontal Pod Autoscaler (HPA): Scales out when CPU exceeds 70%, memory exceeds 75%, or Kafka consumer lag grows (lag has to be exposed as a custom or external metric)

  • Cluster Autoscaler (CA): Adds/removes nodes based on pod schedulability

  • KEDA: Optional for event-based auto-scaling (e.g., Kafka consumer lag)

  • Rate Limiting and Resilience: Istio + Azure API Management, with circuit breakers and retries
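To make the HPA thresholds above concrete, here is a rough sketch of the replica calculation that Kubernetes documents for the HPA: desired replicas = ceil(current replicas x current metric / target metric), skipped when the ratio is within a 10% tolerance of 1.0. The minimum of 3 replicas comes from the Workload Summary; the maximum of 10 and the lag target of 1,000 messages per replica are assumptions made up for this example.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=3, max_replicas=10, tolerance=0.1):
    """Replica count for one metric, following the documented HPA rule."""
    ratio = current_value / target_value
    # No scaling when the metric is close enough to its target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (max_replicas is a hypothetical value).
    return max(min_replicas, min(max_replicas, wanted))


def scale_decision(current_replicas, metrics, **limits):
    """With several metrics, the largest proposed replica count is used."""
    return max(desired_replicas(current_replicas, current, target, **limits)
               for current, target in metrics)


# CPU at 91% average against the 70% target: scale from 3 to 4.
print(desired_replicas(3, 91, 70))                  # 4
# CPU at 73% is within tolerance of 70%: stay at 3.
print(desired_replicas(3, 73, 70))                  # 3
# Average lag of 5000 per consumer vs. an assumed target of 1000: capped at 10.
print(desired_replicas(3, 5000, 1000))              # 10
# CPU (91 vs 70) and memory (60 vs 75) together: CPU drives the result.
print(scale_decision(3, [(91, 70), (60, 75)]))      # 4
```

The same formula covers the Kafka lag case once lag is published as a per-replica average, which is the role KEDA or a metrics adapter would play.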


7. High Availability & DR

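  • Active-Active Deployment: North India + West India, intended as Azure paired regions (Azure names its India regions Central India, South India, and West India, and pairs West India with South India, so the region choice needs to be mapped to those names)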

  • Multi-AZ within each region for zone redundancy

  • Geo-redundant Kafka topics (optional, via MirrorMaker 2 or Confluent Replicator)
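One consequence of an active-active setup is that each region has to absorb the other's traffic during a regional outage. The sketch below works through that arithmetic under two assumptions that are not stated in the document: traffic is split evenly across healthy regions, and the peak is the upper end of the 500 to 1,200+ TPS target. The function name `per_region_tps` is illustrative.

```python
def per_region_tps(peak_tps, active_regions):
    """Evenly split peak traffic across the regions still serving."""
    if active_regions < 1:
        raise ValueError("at least one region must be active")
    return peak_tps / active_regions


PEAK_TPS = 1200  # assumed peak, taken from the stated throughput target

normal = per_region_tps(PEAK_TPS, 2)    # both regions healthy
failover = per_region_tps(PEAK_TPS, 1)  # one region lost

print(normal, failover, failover / normal)  # 600.0 1200.0 2.0
```

Under these assumptions the surviving region sees twice its normal load, so the autoscaling ceilings (maximum replicas and maximum nodes) are worth validating against the failover figure rather than the normal one.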


8. Observability & Monitoring

  • Prometheus + Grafana: Real-time metrics (latency, throughput, resource usage)

  • Azure Monitor + Application Insights: Logs, traces, health checks

  • ELK Stack: Centralized logging, error analysis, audit trails


9. Security & Compliance

  • Authentication: Azure AD B2C, OAuth2/JWT

  • Secrets Management: Azure Key Vault

  • Network Security: NSG, WAF, private endpoints

  • Regulatory Compliance: Data residency, RBAC, ISO 27001, PCI-DSS alignment


10. Summary

This architecture provides a highly scalable, secure, and cloud-native foundation for digital banking, capable of handling 1,000+ TPS with 10,000 concurrent users while ensuring compliance and operational excellence. It follows Azure and CNCF best practices and is designed for resilience, observability, and agility.



  • Validate architecture with load testing and chaos engineering:

    • Use tools like JMeter, Locust, or k6 for load testing APIs and Kafka throughput.

    • Apply chaos engineering using Azure Chaos Studio or Litmus to test resiliency of pods, nodes, and services under failure scenarios.

  • Finalize node pool isolation (Kafka, app, system):

    • Create dedicated node pools for application services, Kafka workloads, and system components.

    • Use taints and tolerations to enforce workload placement and avoid resource contention.

  • Implement CI/CD with security gates:

    • Set up Azure DevOps pipelines with static code analysis (e.g., SonarQube), image scanning (e.g., Trivy), and secrets validation.

    • Enforce environment promotion via manual or approval gates for QA, UAT, and Prod.

  • Set up DR drills and compliance reviews:

    • Schedule quarterly DR drills simulating region failover and backup recovery.

    • Align reviews with ISO 27001, RBI MSP guidelines, and document audit readiness checkpoints.
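The node pool isolation item above relies on taints and tolerations. As a simplified model of how the Kubernetes scheduler applies them, the sketch below checks whether a pod's tolerations cover a node's taints. The taint key `workload` and value `kafka` are hypothetical names chosen for the example, and the matching logic omits details such as `tolerationSeconds`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty matches every effect


def tolerates(tol, taint):
    """True if this toleration matches the given taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        # An empty key with Exists matches any taint key.
        return tol.key == "" or tol.key == taint.key
    return tol.key == taint.key and tol.value == taint.value


def can_schedule(tolerations, node_taints):
    """A pod fits only if every hard taint on the node is tolerated."""
    hard = [t for t in node_taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


# Hypothetical Kafka node pool tainted workload=kafka:NoSchedule.
kafka_taint = Taint("workload", "kafka", "NoSchedule")
kafka_tol = Toleration(key="workload", value="kafka", effect="NoSchedule")

print(can_schedule([kafka_tol], [kafka_taint]))  # True: Kafka pod on Kafka pool
print(can_schedule([], [kafka_taint]))           # False: app pod kept off Kafka pool
print(can_schedule([], []))                      # True: untainted app pool
```

A toleration only permits placement on a tainted pool. Keeping Kafka pods off the application pool additionally needs a node selector or node affinity on the Kafka workloads.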


 
 
 

©2024 by AeeroTech. Proudly created with Wix.com
