Active-Active Setup Scenario
- Anand Nerurkar
- May 16
Scenario:
User 1 transacts in North India region
User 5 transacts in West India region
North India region fails unexpectedly
What Happens When North India Region Fails?
1. User 1’s Transaction During Failure
Because the North India AKS cluster is down, the traffic routing layer (Azure Front Door or Traffic Manager) detects the failure via health probes.
User 1’s requests are automatically redirected to the West India cluster.
User 1’s client reconnects to West India services; requests that were in flight when the region failed may need to be retried.
2. User 5’s Transaction in West India
User 5’s transaction continues uninterrupted because the West India cluster is healthy and remains fully operational.
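The routing decision described above can be sketched in a few lines. This is only an illustration of the idea, not how Front Door or Traffic Manager is implemented; the region names, the `Region` class and the `pick_region` helper are hypothetical, and the `healthy` flag stands in for the result of a real health probe.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    healthy: bool  # outcome of the latest health probe (assumed to be tracked elsewhere)


def pick_region(preferred: str, regions: list[Region]) -> str:
    """Return the preferred region if it is healthy, else the first healthy alternative."""
    by_name = {r.name: r for r in regions}
    if preferred in by_name and by_name[preferred].healthy:
        return preferred
    for region in regions:
        if region.healthy:
            return region.name
    raise RuntimeError("no healthy region available")


# North India has failed its probes; West India is still healthy.
regions = [Region("north-india", healthy=False), Region("west-india", healthy=True)]
user1_target = pick_region("north-india", regions)  # fails over to West India
user5_target = pick_region("west-india", regions)   # unaffected
```

User 1 ends up in West India only because the preferred region is marked unhealthy; User 5's routing never changes.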
How Does Cross-Region Data Replication Handle This?
1. Kafka Geo-Replication
Both regions have independent Kafka clusters.
Kafka topics are replicated bi-directionally using MirrorMaker 2 (MM2) or a similar tool.
All events produced by User 1 in North India are mirrored to West India and vice versa.
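One practical consequence of bi-directional mirroring: under MM2's default replication policy, a mirrored topic is named with the source cluster alias as a prefix, so a consumer in West India has to read both the local topic and its mirrored copy. The sketch below only builds the topic list; the cluster alias `north` and the helper name `topics_to_consume` are assumptions for illustration.

```python
def topics_to_consume(topic: str, remote_aliases: list[str], separator: str = ".") -> list[str]:
    """Local topic plus its mirrored copies, assuming MM2's default '<alias>.<topic>' naming."""
    return [topic] + [f"{alias}{separator}{topic}" for alias in remote_aliases]


# A West India consumer reads its own topic and the copy mirrored from the
# North India cluster (alias "north" is a hypothetical choice).
west_topics = topics_to_consume("loan.requested", remote_aliases=["north"])
```

If a custom replication policy keeps topic names identical in both clusters, this prefixing does not apply and the consumer would subscribe to the plain topic name only.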
2. Database Geo-Replication
Databases (e.g., Azure SQL) use active geo-replication.
Changes committed in North India DB are asynchronously replicated to West India DB.
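When North India fails, West India DB holds the most recent replicated copy; because replication is asynchronous, the last few changes committed in North India may not have arrived yet.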
Detailed Replication Flow:
User 1 → North India AKS Cluster → Kafka Broker (North) → MirrorMaker2 replicates events → Kafka Broker (West)
User 5 → West India AKS Cluster → Kafka Broker (West) → MirrorMaker2 replicates events → Kafka Broker (North)
North India DB changes replicate asynchronously to West India DB
On failure of North India:
- Traffic Manager routes User 1 to West India cluster
- West India services process User 1’s transaction using local Kafka and DB (current up to the last replicated change)
Key Points:
Replication is asynchronous but typically near real-time (seconds latency).
Kafka MM2 ensures event stream continuity across regions.
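Geo-replication applies only committed transactions to the secondary, so the West India copy is transactionally consistent even when it lags the primary.
Traffic routing fails over quickly to the healthy region with minimal user impact; requests in flight during the failure may need a retry.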
Application design should handle idempotency and eventual consistency due to asynchronous replication.
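To make the idempotency point concrete, here is a minimal sketch of a handler that applies each event at most once, so an event redelivered after failover does not double-post a transaction. The `IdempotentLedger` class and the event fields (`event_id`, `account`, `amount`) are hypothetical; a real service would keep the processed-ID record in a durable store, ideally updated in the same transaction as the balance.

```python
class IdempotentLedger:
    """In-memory stand-in for a service that deduplicates events by ID."""

    def __init__(self):
        self.balances = {}
        self.seen_event_ids = set()

    def apply(self, event: dict) -> bool:
        """Apply the event once; return False if it was already processed."""
        if event["event_id"] in self.seen_event_ids:
            return False
        account = event["account"]
        self.balances[account] = self.balances.get(account, 0) + event["amount"]
        self.seen_event_ids.add(event["event_id"])
        return True


ledger = IdempotentLedger()
event = {"event_id": "evt-1", "account": "acc-42", "amount": 100}
first = ledger.apply(event)   # processed normally
second = ledger.apply(event)  # same event redelivered, e.g. after failover
```

The second call is ignored, so the balance reflects the credit exactly once.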
Summary Table:
Component | Role in Failover Scenario |
Azure Front Door | Detects region failure and reroutes traffic |
AKS Cluster | Regional service availability |
Kafka MirrorMaker2 | Replicates event streams between regions |
Azure SQL Geo-Rep | Keeps transactional data in sync asynchronously |
Clients | Retry on connection failover |
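The client-side retry in the last row might look roughly like the following. It is a sketch under assumptions: `call_with_retry` and `flaky_send` are hypothetical names, the failure is simulated with `ConnectionError`, and the delays are illustrative rather than recommended values.

```python
import time


def call_with_retry(send, attempts=4, base_delay=0.2, sleep=time.sleep):
    """Call send(); on ConnectionError wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# Simulated endpoint: the first two calls hit the failed region.
calls = {"count": 0}


def flaky_send():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("region unavailable")
    return "ok"


delays = []  # record the back-off delays instead of actually sleeping
result = call_with_retry(flaky_send, sleep=delays.append)
```

For requests that change state, such as a payment, a retry like this is only safe if the server deduplicates them, for example with an idempotency key.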
The same replication flow, step by step, for a banking platform deployed active-active on AKS in the North India and West India regions:
🔄 Active-Active Replication Flow (Text Format)
🧍 Scenario:
User 1 is routed to North India
User 5 is routed to West India
Both perform banking transactions concurrently
North India fails during User 1's session
📍 Step-by-Step Replication Flow
✅ 1. User Request Handling
Azure Front Door / Traffic Manager uses health probes to route users:
User 1 → North India AKS Cluster
User 5 → West India AKS Cluster
✅ 2. Kafka Event Publishing
Each region has its own Kafka cluster.
User 1's actions (e.g., loan.requested) are published to a Kafka topic in North India Kafka.
User 5’s events (e.g., account.created) are published to West India Kafka.
✅ 3. Kafka Cross-Region Replication
Kafka MirrorMaker 2 (MM2) replicates topics:
From North India Kafka → West India Kafka
From West India Kafka → North India Kafka
This gives both clusters a copy of the full event stream, subject to replication lag.
✅ 4. Database Replication
Transactional data (e.g., loan details, account status) is stored in Azure SQL / Cosmos DB.
Azure SQL is configured with active geo-replication, and Cosmos DB with its multi-region replication:
Changes in North India DB are asynchronously replicated to West India DB.
Replication is near real-time (latency in seconds).
✅ 5. Application State and Configuration
Application config, secrets, and feature flags are stored in:
Azure App Configuration
Azure Key Vault
These are either manually replicated or synced using CI/CD GitOps pipelines to ensure both clusters are in sync.
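Whichever sync mechanism is used, a periodic drift check helps confirm that both clusters really are in sync. The sketch below compares two exported key-value snapshots; the keys, values and the `config_drift` helper are hypothetical, and it is meant for non-secret settings only (secrets would be compared by version or fingerprint, not by value).

```python
def config_drift(primary: dict, secondary: dict, ignore: frozenset = frozenset()) -> dict:
    """Return {key: (primary_value, secondary_value)} for keys that differ between regions."""
    drift = {}
    for key in sorted(primary.keys() | secondary.keys()):
        if key in ignore:
            continue  # settings that are expected to differ per region
        left, right = primary.get(key), secondary.get(key)
        if left != right:
            drift[key] = (left, right)
    return drift


north = {"feature.instant-loans": "on", "kafka.bootstrap": "north-kafka:9092"}
west = {"feature.instant-loans": "off", "kafka.bootstrap": "west-kafka:9092"}

# The Kafka endpoint legitimately differs per region, so it is excluded.
drift = config_drift(north, west, ignore=frozenset({"kafka.bootstrap"}))
```

Here the feature flag is reported as drifted, which is the kind of mismatch that would otherwise only surface after a failover.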
🚨 6. Region Failure Scenario: North India Down
Front Door detects that North India is unhealthy via probe failure.
User 1's requests are re-routed to West India.
West India AKS Cluster:
Uses the most recently replicated Kafka topics and database state.
Processes the transaction with minimal user disruption.
Kafka continues ingesting User 1’s events directly in West India Kafka.
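Database writes are committed to West India DB (with Azure SQL active geo-replication, after the West India secondary has been promoted to primary).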
🧠 Key Technologies Enabling This Flow
Function | Tool/Service |
Global load balancing | Azure Front Door |
Kafka topic replication | Kafka MirrorMaker 2 |
DB replication | Azure SQL Geo-Replication / Cosmos DB Multi-Region |
Secrets/config replication | GitOps (e.g., ArgoCD) + Azure App Config/Key Vault |
Health-based failover | Traffic Manager / Front Door probes |
🛡️ Failover Goals Achieved
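RTO: Near-zero; automatic failover is handled by Front Door and is bounded by how quickly health probes detect the failure
RPO: Low; Kafka and DB replication limit data loss to changes not yet replicated when the region failed
User experience: Near-seamless; clients are rerouted automatically, and only requests in flight at the moment of failure may need a retry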
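These goals can be turned into rough numbers with simple arithmetic. The sketch below is a back-of-the-envelope estimate only; every figure in it (probe interval, number of failed probes, replication lag, write rate) is an illustrative assumption, not a measured or documented value for this setup.

```python
def detection_time_s(probe_interval_s: float, failed_probes_required: int) -> float:
    """Worst-case time for health probes to mark a region unhealthy (a lower bound on RTO)."""
    return probe_interval_s * failed_probes_required


def writes_at_risk(replication_lag_s: float, writes_per_s: float) -> float:
    """Writes committed in the failed region but not yet replicated (the RPO exposure)."""
    return replication_lag_s * writes_per_s


# Illustrative inputs only.
detect = detection_time_s(probe_interval_s=30, failed_probes_required=3)
at_risk = writes_at_risk(replication_lag_s=5, writes_per_s=200)
```

With these assumed inputs, detection alone takes up to 90 seconds and about 1,000 writes could be missing from the surviving region, which is why probe settings and replication lag deserve monitoring.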