Microservices ii
- Anand Nerurkar
- 4 hours ago
SECTION 1 – Advanced Microservices Interview Q&A (Step-by-Step + BFSI Scenarios)
Topic 1: Service Mesh (Istio)
Q1. What is a service mesh and why use Istio in BFSI microservices?
Step-by-step answer:
Definition
Service Mesh = Layer that manages service-to-service communication (east-west traffic) in microservices without changing application code.
Istio = Popular service mesh using Envoy sidecars for observability, security (mTLS), and traffic control.
Key Features
Mutual TLS for internal API calls
Traffic shaping (Canary, A/B testing)
Circuit breaking, retries
Centralized metrics and tracing (Prometheus/Grafana, Jaeger)
Why BFSI Needs It
BFSI apps require secure internal traffic (e.g., KYC ↔ Loan Evaluation)
Compliance: PCI DSS, RBI/SEBI mandate encryption in transit
Zero downtime: Canary/Blue-Green deploys with Istio
BFSI Example
Digital Lending Platform: Loan, Credit Score, and Fraud Detection services communicate through Istio mesh with mTLS and traffic control during version upgrades.
Diagram placeholder (text):
[API Gateway]
↓
[Istio Ingress Gateway]
↓
[Microservices Mesh: KYC | Loan | Credit | Fraud]
(Each has Envoy proxy → mTLS, metrics, routing)
Q2. How does Istio help achieve zero-downtime deployments?
Answer:
Deploy new service version (v2) alongside old (v1) in mesh
Configure Istio VirtualService to route 5% traffic to v2, 95% to v1
Gradually shift traffic → monitor metrics/logs → cutover to 100% v2
Rollback instantly if errors spike
BFSI Example:
Loan Service v2 rolled out for new credit algorithm → 5% live traffic tested → safe scaling to 100% after validation.
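The traffic shift described above comes down to weighted routing plus a health gate. The following is a minimal Python sketch of that idea, not Istio's implementation (in Istio the weights live in a VirtualService). The names `STAGES`, `route`, and `next_weight`, and the 1% error threshold, are assumptions made for illustration.

```python
import random

# Hypothetical rollout schedule: percent of traffic sent to v2 at each stage.
STAGES = [5, 20, 50, 100]


def route(v2_weight: int, rng: random.Random) -> str:
    """Pick a version for one request; v2 receives v2_weight percent of traffic."""
    return "v2" if rng.random() * 100 < v2_weight else "v1"


def next_weight(current: int, error_rate: float, max_error_rate: float = 0.01) -> int:
    """Advance to the next stage if v2 looks healthy, otherwise roll back to 0."""
    if error_rate > max_error_rate:
        return 0  # rollback: all traffic returns to v1
    later = [w for w in STAGES if w > current]
    return later[0] if later else current


# Simulate 10,000 requests at the first stage (5% to v2).
rng = random.Random(42)
sample = [route(5, rng) for _ in range(10_000)]
v2_share = sample.count("v2") / len(sample)
```

In practice the error rate would come from mesh metrics (for example, Prometheus), and the weight change would be applied by updating the routing rule rather than in application code.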
Topic 2: API Gateway Patterns
Q3. What API Gateway patterns do you use in BFSI microservices?
Step-by-step answer:
Patterns
Aggregator Pattern: Combines multiple service calls → single response (e.g., KYC + Credit + Loan summary)
Proxy / Routing Pattern: Directs request to appropriate microservice
Backend-for-Frontend (BFF): Separate gateway for mobile vs web UIs
Security Enforcement: OAuth2/JWT validation at gateway level
Azure Implementation
Use Azure API Management:
Central entry point
JWT/OAuth validation
Rate limiting
Transformation policies (e.g., XML ↔ JSON)
BFSI Example
Mutual fund portal: API Gateway aggregates NAV (Fund Service) + Portfolio (Customer Service) + Compliance (SEBI Service) → single API for mobile app.
Diagram placeholder:
[Mobile App / Web]
↓
[Azure Front Door]
↓
[API Management (Gateway)]
↓
[Microservices: Fund | Portfolio | Compliance]
Topic 3: Distributed Caching (Redis)
Q4. Why is Redis used in BFSI microservices?
Step-by-step answer:
Purpose
Reduce DB load
Improve latency for frequently accessed data (e.g., customer KYC, NAV values)
Store session tokens in stateless services
Caching Strategies
Cache-aside (Lazy Loading): Check cache → DB if miss → populate cache
Write-through: Update cache + DB together
Write-behind: Update cache → flush later to DB
BFSI Example
Loan evaluation service caches credit score results for 5 minutes to speed up multiple queries during application processing.
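The cache-aside flow with a 5-minute TTL can be sketched as follows. This is an illustrative sketch only: `TTLCache` is an in-memory stand-in for Redis GET and SET-with-expiry, and `load_credit_score_from_db` is a hypothetical slow lookup that always returns a fixed score.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis GET / SET with an expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[float, int]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[int]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: int, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


db_calls = {"count": 0}  # counts how often the slow source is hit


def load_credit_score_from_db(customer_id: str) -> int:
    """Hypothetical slow lookup (credit bureau or database)."""
    db_calls["count"] += 1
    return 750


def get_credit_score(cache: TTLCache, customer_id: str) -> int:
    key = f"credit:{customer_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit
    score = load_credit_score_from_db(customer_id)  # miss: go to the source
    cache.set(key, score, ttl_seconds=300)  # populate for 5 minutes
    return score
```

Repeated calls for the same customer within five minutes hit the cache; the first call after expiry goes back to the source.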
Q5. How do you manage cache invalidation in BFSI scenarios?
Use TTL (Time-To-Live) = 5–10 minutes for fast-changing data (e.g., NAV)
Explicit invalidation on event (e.g., loan approval updates) via Kafka consumer → purge Redis key
Versioned keys (e.g., loan:v2:user123) during deployment upgrades
Topic 4: Event Sourcing vs CQRS
Q6. Explain Event Sourcing and CQRS with BFSI use case
Step-by-step answer:
Event Sourcing
Store state changes as events instead of current state
Example: LoanApplied, LoanApproved, LoanDisbursed
CQRS
Separate command (write) and query (read) models
Optimize read-heavy workloads with denormalized views
BFSI Example
Digital wallet: Event sourcing to maintain full transaction audit (required by SEBI/RBI)
CQRS: Transaction-history queries served from a separate, denormalized read model (e.g., a projection store or read replica) for fast response on mobile apps
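A small sketch of how the two ideas fit together for a wallet: the write side appends events, state is rebuilt by replaying them, and a separate read model is updated from the same events. All class and field names here (`WalletEvent`, `EventStore`, `BalanceReadModel`) are hypothetical, and the in-process subscriber stands in for what would normally be a message broker.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class WalletEvent:
    wallet_id: str
    kind: str    # "Credited" or "Debited"
    amount: int  # minor units (paise), to avoid float rounding


class EventStore:
    """Append-only log: the write side never updates or deletes events."""

    def __init__(self) -> None:
        self._events: List[WalletEvent] = []
        self._subscribers: List[Callable[[WalletEvent], None]] = []

    def subscribe(self, handler: Callable[[WalletEvent], None]) -> None:
        self._subscribers.append(handler)

    def append(self, event: WalletEvent) -> None:
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)

    def events_for(self, wallet_id: str) -> List[WalletEvent]:
        return [e for e in self._events if e.wallet_id == wallet_id]


def replay_balance(events: List[WalletEvent]) -> int:
    """Rebuild current state from the event history."""
    balance = 0
    for e in events:
        balance += e.amount if e.kind == "Credited" else -e.amount
    return balance


class BalanceReadModel:
    """Query side: a denormalized view kept up to date from events."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}

    def apply(self, event: WalletEvent) -> None:
        delta = event.amount if event.kind == "Credited" else -event.amount
        self.balances[event.wallet_id] = self.balances.get(event.wallet_id, 0) + delta


store = EventStore()
view = BalanceReadModel()
store.subscribe(view.apply)
store.append(WalletEvent("w1", "Credited", 10_000))
store.append(WalletEvent("w1", "Debited", 2_500))
```

The event list doubles as the audit trail, while queries read `view.balances` without replaying history.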
Topic 5: Zero-Downtime Deployments
Q7. How do you achieve zero downtime deployments in BFSI apps?
Approaches:
Blue-Green
Two identical environments → switch DNS/Front Door
Canary with Istio
Gradually shift traffic (5%, 20%, 50%, 100%)
Rolling Updates (Kubernetes)
Update pods sequentially
BFSI Example
New loan eligibility model deployed via canary (Istio VirtualService) to ensure minimal risk on live transactions.
Topic 6: Security & Compliance
Q8. How do you ensure PCI DSS, GDPR, and RBI compliance in microservices?
Step-by-step:
PCI DSS (Card Data)
Tokenize card numbers → no raw PAN stored
TLS 1.2+ for data in transit
Role-based access control (RBAC)
GDPR (Personal Data)
Implement Right to Erasure across all microservices
Data minimization → only store necessary fields
RBI/SEBI (India-specific BFSI)
Data localization (host data in Indian region of Azure)
Immutable audit logs for 7-year retention
BFSI Example
Mutual Fund platform ensures NAV and customer PII stored in India region; audit logs immutable (event sourcing) for SEBI compliance.
Topic 7: Advanced Saga Orchestration
Q9. How do you implement Sagas in BFSI loan processing workflows?
Step-by-step answer:
Problem:
Loan approval involves multiple services: KYC, Credit Score, Loan Evaluation, Fraud Check, Disbursement.
Need distributed transaction handling (a single ACID transaction cannot span the separate service databases).
Saga Types:
Choreography (Event-Driven): Services communicate via events (Kafka).
Orchestration: Central controller coordinates steps (e.g., Loan Orchestrator Service).
BFSI Example:
Loan Orchestrator sends commands:
Start KYC → Credit Score → Loan Evaluation → Disbursement.
If credit fails, compensating actions triggered (rollback partial approvals).
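An orchestrated saga can be sketched as a list of steps, each paired with a compensating action that runs in reverse order when a later step fails. This is a simplified, in-process illustration; the step functions, the 650 score threshold, and the `ctx` dictionary are hypothetical, and a real orchestrator would send commands over a broker and persist its state.

```python
from typing import Callable, List, Tuple

# (step name, action, compensating action)
Step = Tuple[str, Callable[[dict], None], Callable[[dict], None]]


def run_saga(steps: List[Step], ctx: dict) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action(ctx)
            completed.append(step)
            ctx["log"].append(f"done:{name}")
        except Exception as exc:
            ctx["log"].append(f"failed:{name}:{exc}")
            for done_name, _, compensate in reversed(completed):
                compensate(ctx)
                ctx["log"].append(f"compensated:{done_name}")
            return False
    return True


def verify_kyc(ctx: dict) -> None:
    ctx["kyc"] = "verified"


def release_kyc(ctx: dict) -> None:
    ctx["kyc"] = "released"


def check_credit(ctx: dict) -> None:
    if ctx["credit_score"] < 650:  # hypothetical threshold
        raise ValueError("credit score below threshold")


def noop(ctx: dict) -> None:
    pass  # a read-only step has nothing to undo


def evaluate_loan(ctx: dict) -> None:
    ctx["loan"] = "approved"


def cancel_loan(ctx: dict) -> None:
    ctx["loan"] = "cancelled"


LOAN_SAGA: List[Step] = [
    ("kyc", verify_kyc, release_kyc),
    ("credit", check_credit, noop),
    ("evaluation", evaluate_loan, cancel_loan),
]

ctx = {"credit_score": 600, "log": []}
ok = run_saga(LOAN_SAGA, ctx)
```

With a score of 600 the credit step fails, so only the KYC step is compensated and loan evaluation never runs.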
Topic 8: Multi-Tenancy in BFSI (SaaS for Banks)
Q10. How do you design multi-tenant BFSI microservices?
Patterns:
Database-per-tenant:
Each bank (HDFC, ICICI) has its own DB → better isolation (regulatory compliance).
Schema-per-tenant:
Single DB with separate schemas → moderate isolation.
Row-level multi-tenancy:
Single DB with tenant ID in rows → cheapest, lowest isolation.
BFSI Example:
Digital wallet platform serving multiple banks → schema-per-tenant as a cost/isolation trade-off; database-per-tenant where a bank's regulatory requirements call for stronger isolation.
Topic 9: Auto-Scaling with KEDA (Event-driven Scaling)
Q11. How do you scale BFSI microservices based on Kafka/Event Hubs load?
Step-by-step:
Use KEDA (Kubernetes Event-driven Autoscaling)
Define ScaledObject watching Kafka lag
Auto-scale consumers from 1 → N replicas when transaction volume spikes
BFSI Example:
Fraud detection service scales up during high UPI transaction spikes (festival sales) → scales back after load drop.
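The scaling decision can be approximated with simple arithmetic: target replicas grow with total consumer lag divided by a per-replica lag threshold, clamped to configured bounds. The function below is a rough sketch of that calculation, not KEDA's code; the parameter names are hypothetical, and capping at the partition count reflects the fact that extra consumers in a group beyond the number of partitions sit idle.

```python
import math


def desired_replicas(total_lag: int, lag_threshold: int,
                     min_replicas: int, max_replicas: int,
                     partitions: int) -> int:
    """Approximate replica count for a Kafka consumer scaled on lag."""
    wanted = math.ceil(total_lag / lag_threshold)
    wanted = min(wanted, partitions)  # more consumers than partitions would idle
    return max(min_replicas, min(max_replicas, wanted))
```

For example, with a threshold of 100 messages per replica and 12 partitions, a lag of 450 asks for 5 replicas, and a lag of 10,000 is capped at 12.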
Topic 10: Distributed Transactions Without 2PC
Q12. How do you ensure consistency without two-phase commit (2PC)?
Use outbox pattern (write the business row and an event row to an outbox table in the same DB transaction)
A relay (poller or CDC) publishes outbox rows to Kafka after commit
Downstream services consume and maintain eventual consistency
BFSI Example:
Mutual fund investment order placed → OrderCreated event saved in DB → Kafka → NAV calculation + compliance services update asynchronously.
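A minimal outbox sketch using SQLite from the standard library, as an illustration only. The table layout, `place_order`, and `relay_once` are hypothetical names, and the `publish` callback stands in for a Kafka producer.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, fund TEXT, amount INTEGER);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT,
        payload TEXT,
        published INTEGER DEFAULT 0
    );
""")


def place_order(order_id: str, fund: str, amount: int) -> None:
    """Write the order and its event in one local transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (order_id, fund, amount))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderCreated",
             json.dumps({"orderId": order_id, "fund": fund, "amount": amount})),
        )


def relay_once(publish) -> int:
    """Poll unpublished rows, publish each, then mark it as sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # stand-in for a broker client
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


published = []
place_order("ord-1", "BLUECHIP", 5000)
relay_once(lambda event_type, payload: published.append((event_type, payload)))
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next poll, so delivery is at-least-once and consumers should be idempotent.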
Topic 11: Resilience and Chaos Testing
Q13. How do you ensure resilience under failures?
Circuit breakers (Resilience4j)
Bulkhead isolation (limit concurrent requests)
Chaos testing (Simian Army/Gremlin) to simulate failures
BFSI Example:
Simulate KYC service failure during loan processing → ensure fallback to manual review queue.
Topic 12: Observability in BFSI
Q14. How do you implement end-to-end observability?
Logs: Centralized ELK stack
Metrics: Prometheus + Grafana dashboards (e.g., loan approval latency)
Tracing: OpenTelemetry → Jaeger (trace transaction across KYC → Credit → Loan)
Topic 13: Rate Limiting & Throttling
Q15. How do you handle API rate limiting for BFSI customers?
Use API Gateway (Azure API Management) policies:
Limit 1000 requests/min per customer
Redis used to maintain counters for high-frequency APIs
BFSI Example:
UPI wallet API throttled to prevent abuse/fraud during peak hours.
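The counter-based limit can be sketched as a fixed-window counter per customer, which mirrors the common Redis approach of incrementing a key per window and letting it expire. This in-memory version is illustrative only; `FixedWindowLimiter` is a hypothetical name, and it is not how Azure API Management implements its policies.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per customer in each time window."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # (customer, window index) -> request count.
        # Redis would expire old keys; this sketch never evicts them.
        self._counters: Dict[Tuple[str, int], int] = {}

    def allow(self, customer_id: str) -> bool:
        window = int(self._clock() // self.window_seconds)
        key = (customer_id, window)
        count = self._counters.get(key, 0) + 1
        self._counters[key] = count
        return count <= self.limit


limiter = FixedWindowLimiter(limit=1000, window_seconds=60)
first_request_allowed = limiter.allow("customer-42")
```

A fixed window permits short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more state.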
Topic 14: API Versioning Strategy
Q16. How do you manage versioning of BFSI microservice APIs?
URI path versioning by major version: /v1/loan, /v2/loan
Backward compatibility during rollout
API Gateway routes old clients to old version
Topic 15: Security – mTLS and Zero Trust
Q17. How do you enforce zero-trust security in BFSI microservices?
mTLS between services (Istio)
JWT/OAuth 2.0 for external traffic
Principle of least privilege (RBAC)
Topic 16: Hybrid Cloud & Cloud-Agnostic
Q18. How to design BFSI microservices for cloud-agnostic deployment?
Use Kubernetes + Istio (runs on Azure, AWS, GCP)
Externalize configs (12-factor) → Secrets in Vault
Use messaging (Kafka/NATS) → portable across clouds
BFSI Example:
Mutual fund app deployable on Azure (India) for regulatory compliance, and AWS for DR site.
Topic 17: Compliance Logging
Q19. How do you design compliance logging for BFSI?
Log every transaction event (Event Sourcing)
Immutable storage (WORM-compliant)
Retain for 7–10 years (RBI/SEBI)
Topic 18: Data Privacy & Masking
Q20. How do you mask PII in logs?
Implement log sanitizers (mask Aadhaar, PAN)
Example: XXXX-XXXX-1234 instead of full number
Required for GDPR + Indian DPDP compliance
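A log sanitizer along these lines can be sketched with regular expressions. The patterns below assume Aadhaar numbers appear as 12 digits (optionally grouped 4-4-4 with spaces or hyphens) and PAN as five letters, four digits, and one letter; both patterns and the masking format for PAN are assumptions for illustration and would need tuning against real log data.

```python
import re

# 12 digits, optionally grouped as 4-4-4 with spaces or hyphens.
AADHAAR = re.compile(r"\b(\d{4})[ -]?(\d{4})[ -]?(\d{4})\b")
# PAN format: five letters, four digits, one letter (e.g. ABCDE1234F).
PAN = re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")


def mask_pii(message: str) -> str:
    """Mask Aadhaar and PAN values, keeping only the last four characters."""
    message = AADHAAR.sub(lambda m: f"XXXX-XXXX-{m.group(3)}", message)
    message = PAN.sub(lambda m: "XXXXXX" + m.group(0)[-4:], message)
    return message


masked = mask_pii("KYC ok for aadhaar 1234 5678 9012 pan ABCDE1234F")
```

Any other 12-digit number in a log line would also be masked by this pattern, which is usually an acceptable false positive for compliance logging.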
Advanced Microservices Interview Questions (21–40) – BFSI Context
Topic 19: Polyglot Persistence
Q21. How do you decide when to use multiple databases (polyglot persistence)?
Step-by-step answer:
Different services may have different storage needs:
Loan Service → RDBMS (ACID)
Fraud Detection → NoSQL (real-time, high throughput)
Portfolio Analytics → Time-series DB (historical NAV trends)
BFSI Example:
Mutual Fund Platform uses Azure SQL for customer profiles, Redis for caching NAVs, and CosmosDB for large-scale transaction logs.
Topic 20: Event-Driven Architecture with Kafka
Q22. How does event-driven architecture benefit BFSI workflows?
Decouples services (KYC → Loan → Fraud)
Ensures scalability and fault tolerance
Real-time notifications (e.g., “Loan Approved” → SMS/email via Event Hub)
Example BFSI flow:
LoanApplied event → Credit Service consumes → CreditChecked event → Loan Evaluation continues asynchronously.
Topic 21: Idempotency in BFSI Transactions
Q23. How do you ensure idempotency for critical BFSI operations?
Generate idempotency key (e.g., txn-UUID)
Store processed transactions in Redis/DB
Ignore duplicate events with same key
BFSI Example:
Payment gateway ensures no double debit if UPI callback received twice.
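The three steps above can be sketched as a processor that records the result for each idempotency key and returns the stored result for duplicates. `PaymentProcessor` and its in-memory dictionary are hypothetical stand-ins; in production the check-and-record step must be atomic (for example, a unique constraint in the database or a Redis SET with NX).

```python
from typing import Dict


class PaymentProcessor:
    def __init__(self) -> None:
        self.balances: Dict[str, int] = {"acct-1": 10_000}
        self._processed: Dict[str, str] = {}  # idempotency key -> stored result

    def debit(self, idempotency_key: str, account: str, amount: int) -> str:
        if idempotency_key in self._processed:
            # Duplicate callback: return the earlier result, do not debit again.
            return self._processed[idempotency_key]
        self.balances[account] -= amount
        result = f"debited:{amount}"
        self._processed[idempotency_key] = result
        return result


processor = PaymentProcessor()
first = processor.debit("txn-123", "acct-1", 2_500)
second = processor.debit("txn-123", "acct-1", 2_500)  # same callback delivered twice
```

Both calls return the same result, and the account is debited only once.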
Topic 22: API Security – JWT and OAuth
Q24. How is JWT/OAuth2 implemented in BFSI APIs?
Step-by-step explanation
Concept
JWT = JSON Web Token for stateless auth
OAuth2 = Authorization framework for secure API access
Implementation
API Gateway validates JWT → forwards to microservices
Tokens carry scopes (e.g., read:loan, write:portfolio)
Refresh tokens kept server-side in an encrypted store (not in the client); signing keys and secrets in Key Vault
BFSI Example
Mutual fund dashboard:
User logs in → OAuth2 provider issues JWT
API Gateway checks scope before exposing NAV data
Security Enhancements
Short-lived access tokens
Rotate signing keys regularly (Key Vault + Istio integration)
mTLS for internal service calls
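The gateway-side check (verify signature, check expiry, check scope) can be sketched with the standard library using an HS256-signed token. This is a teaching sketch under stated assumptions: `issue_token` and `authorize` are hypothetical helpers, the shared secret is a demo value, and a production gateway would normally use a maintained JWT library with asymmetric keys (for example, RS256) published by the OAuth2 provider.

```python
import base64
import hashlib
import hmac
import json
from typing import List


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, subject: str, scopes: List[str],
                ttl: int, now: float) -> str:
    """Stand-in for the OAuth2 provider: sign a short-lived token."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "scope": " ".join(scopes), "exp": int(now) + ttl}
    payload = _b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def authorize(secret: bytes, token: str, required_scope: str, now: float) -> bool:
    """Gateway check: valid signature, not expired, and scope present."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return False
        if json.loads(_b64url_decode(header)).get("alg") != "HS256":
            return False
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return False  # malformed token
    return claims.get("exp", 0) > now and required_scope in claims.get("scope", "").split()


secret = b"demo-secret"  # hypothetical; real keys belong in a vault
token = issue_token(secret, "user-1", ["read:loan"], ttl=900, now=1000.0)
allowed = authorize(secret, token, "read:loan", now=1001.0)
```

The signature is verified before the payload is parsed, so tampered or malformed tokens are rejected without trusting their contents.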
Topic 23: Distributed Locking
Q25. How do you prevent race conditions in BFSI workflows?
Step-by-step explanation
Problem
Concurrent updates → double loan disbursement, duplicate SIP orders.
Solution
Redis distributed lock (SET with NX and an expiry; plain SETNX sets no TTL):
Acquire lock with expiration
Process transaction
Release lock
BFSI Example
Loan disbursement:
Lock on loanId
Ensure only one disbursement per loan, even if events are replayed
Considerations
Lock expiration for fail-safe
Retry policy if lock not acquired
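The acquire / process / release cycle can be sketched with an in-memory stand-in for Redis. `LockStore` imitates a set-if-absent with expiry and a release that only succeeds for the owner's token; it is a hypothetical class for illustration, and a real deployment would issue these operations against Redis (with the release done atomically, for example in a Lua script).

```python
import time
import uuid
from typing import Callable, Dict, List, Optional, Tuple


class LockStore:
    """In-memory stand-in for set-if-absent with expiry plus owner-checked release."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._locks: Dict[str, Tuple[str, float]] = {}  # key -> (token, expires_at)
        self._clock = clock

    def acquire(self, key: str, ttl_seconds: float) -> Optional[str]:
        holder = self._locks.get(key)
        if holder is not None and holder[1] > self._clock():
            return None  # another owner holds an unexpired lock
        token = uuid.uuid4().hex
        self._locks[key] = (token, self._clock() + ttl_seconds)
        return token

    def release(self, key: str, token: str) -> bool:
        holder = self._locks.get(key)
        if holder is not None and holder[0] == token:
            del self._locks[key]
            return True
        return False  # lock expired and may belong to someone else now


disbursed: List[str] = []


def disburse(store: LockStore, loan_id: str) -> bool:
    """Disburse a loan at most once, even if the event is replayed."""
    key = f"lock:loan:{loan_id}"
    token = store.acquire(key, ttl_seconds=30)
    if token is None:
        return False  # caller may retry later
    try:
        if loan_id in disbursed:
            return False  # already processed: replayed event
        disbursed.append(loan_id)
        return True
    finally:
        store.release(key, token)
```

The lock only serializes concurrent attempts; the "already disbursed" check inside the critical section is what makes a replay harmless after the lock has been released.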
Topic 24: Dead Letter Queues (DLQ)
Q26. How do you handle failed events in Kafka/Event Hub?
Step-by-step explanation
Concept
Failed messages routed to DLQ for manual review or reprocessing.
BFSI Use Case
Loan Disbursement events failing due to invalid bank details
Push failed events to DLQ → monitored by ops team
Implementation
Kafka DLQ topic, or a separate Event Hub used as the dead-letter destination (Event Hubs has no built-in DLQ)
Alerts triggered via Prometheus/Alertmanager
Compliance
DLQ must be auditable (store cause, payload, timestamp)
Operations
Retry with a bounded number of attempts before routing an event to the DLQ
Manual reconciliation for BFSI transactions, with the audit trail maintained
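A consumer-side sketch of retry followed by dead-lettering, recording the cause, payload, and timestamp so the entry is auditable. The `consume` function and the list used as a DLQ are hypothetical stand-ins for a Kafka consumer and a DLQ topic.

```python
import time
from typing import Callable, Dict, List


def consume(event: Dict, handler: Callable[[Dict], None],
            dlq: List[Dict], max_attempts: int = 3) -> bool:
    """Try the handler up to max_attempts times, then dead-letter the event."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
    dlq.append({
        "payload": event,
        "cause": last_error,
        "attempts": max_attempts,
        "failed_at": time.time(),
    })
    return False


def disburse_handler(event: Dict) -> None:
    """Hypothetical handler that rejects events with missing bank details."""
    if not event.get("ifsc"):
        raise ValueError("invalid bank details")


dead_letters: List[Dict] = []
consume({"loanId": "L1", "ifsc": ""}, disburse_handler, dead_letters)
```

A real consumer would usually add a delay or backoff between attempts and raise an alert when the DLQ grows.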
Topic 25: API Composition vs Choreography
Q27. Which pattern do you prefer in BFSI and why?
Step-by-step explanation
API Composition
API Gateway aggregates data from multiple services
Suitable for dashboards (low latency required)
Choreography
Services emit/consume events without central orchestrator
Suitable for workflows (loan processing, mutual fund SIP)
Hybrid BFSI Example
Loan application UI = API Composition (status from multiple services)
Loan approval process = Event Choreography (KYC → Credit → Fraud)
Topic 26: Feature Flags in BFSI
Q28. How do you use feature flags during BFSI releases?
Step-by-step explanation
Concept
Toggle features at runtime without redeploying
Gradual rollout (canary for premium users)
Implementation
Feature flag service (e.g., LaunchDarkly, custom DB flags)
Flags stored in Redis for low-latency reads
BFSI Example
New loan scoring model rolled out only for 10% traffic
If metrics degrade, toggle flag off instantly
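A percentage rollout is commonly implemented by hashing the user ID into a stable bucket, so the same user always gets the same answer and raising the percentage only adds users. The sketch below shows that idea; `in_rollout` and the flag name are hypothetical and not tied to any specific feature flag product.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage for the flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < percent


# 10% rollout of a hypothetical new scoring model.
users = [f"user-{i}" for i in range(10_000)]
enabled_share = sum(in_rollout("new-loan-scoring", u, 10) for u in users) / len(users)
```

Setting the percentage to 0 acts as the kill switch: no user is in the rollout, with no redeploy required.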
Topic 27: Distributed Configuration
Q29. How do you manage config across 50+ BFSI microservices?
Step-by-step explanation
Need
50+ microservices, configs differ per environment (UAT, Prod, DR)
Implementation
Central config (Spring Cloud Config, Azure App Configuration)
Secrets in Key Vault (not in config repo)
Auto-refresh via message bus
BFSI Example
API keys for Credit Bureau stored in Key Vault
Loan rate slabs configurable in central repo (no redeploy needed)
Considerations
Central config is versioned and encrypted; changes refresh dynamically without restarts
Topic 28: Blue-Green Deployments
Q30. How does Blue-Green deployment reduce downtime risk?
Step-by-step explanation
Concept
Two environments (Blue active, Green idle)
Deploy new version to Green → test → switch traffic
Benefits
Instant rollback by switching DNS/front-end routing
No downtime during upgrade
BFSI Example
Mutual Fund NAV calculation upgrade:
Blue = current version (serving live trades)
Green = upgraded version (tested internally)
Switch post-market hours for minimal risk
Topic 29: Rate-based Auto Scaling
Q31. How do you scale BFSI services based on request rate?
Step-by-Step Answer
Problem
Sudden traffic spikes (IPO subscriptions, loan campaigns).
Solution
Use KEDA for event-driven autoscaling:
Scale microservices based on Kafka topic lag or queue depth.
BFSI Example
Mutual fund SIP processing auto-scales from 5 to 100 pods based on order queue size.
KEDA works through the Horizontal Pod Autoscaler (HPA), so the same setup can scale the credit check service up during peak loan application periods
Topic 30: Circuit Breakers
Q32. How do circuit breakers protect BFSI workflows?
Prevent cascading failures if Credit Score API fails
Fallback → manual review queue for pending loans
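A circuit breaker with a fallback can be sketched as a small state machine: closed (calls pass through), open (calls fail fast to the fallback), and half-open (one trial call after a timeout). This is a simplified illustration, not Resilience4j; the class, thresholds, and the "manual-review" fallback value are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = "closed"

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout:
                return fallback()  # fail fast, do not touch the failing service
            self.state = "half_open"  # timeout elapsed: allow one trial call
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self.state == "half_open" or self._failures >= self.failure_threshold:
                self.state = "open"
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        self.state = "closed"
        return result


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=30.0)


def credit_score_api() -> int:
    """Hypothetical downstream call that is currently failing."""
    raise TimeoutError("credit bureau unavailable")


outcome = breaker.call(credit_score_api, lambda: "manual-review")
```

While the breaker is open, pending loans go straight to the fallback, which keeps threads from piling up behind a failing Credit Score API.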
Topic 31: Bulkhead Pattern
Q33. Why use bulkheads in BFSI microservices?
Isolate failures (e.g., Fraud Detection failure shouldn’t crash Loan Evaluation)
Separate thread pools for critical vs non-critical services
Topic 32: Chaos Engineering
Q34. How do you apply chaos engineering to BFSI systems?
Simulate API latency/failures
Validate resilience (fallbacks, retries)
BFSI-specific: Test UPI service outage recovery
Topic 33: API Monetization
Q35. How do BFSI APIs generate revenue?
Partner APIs (e.g., account verification, mutual fund NAV feeds)
API Gateway applies rate limiting + billing metrics
Topic 34: GraphQL vs REST
Q36. When to use GraphQL in BFSI?
GraphQL for custom dashboards (flexible queries: NAV + portfolio + transactions)
REST for core workflows (loan approval, payments)
Topic 35: Cost Optimization
Q37. How do you reduce cloud costs in BFSI microservices?
Right-size AKS nodes (scale-to-zero off-peak)
Use Redis TTL caching to cut DB queries
Tier storage for compliance logs (hot vs cold)
Topic 36: Security Token Lifecycle
Q38. How do you manage token lifecycle securely?
Step-by-Step Answer
Concept
Short-lived access tokens; refresh tokens for renewal.
Implementation
Store refresh tokens securely (encrypted DB).
Revoke via blacklist (Redis) or rotation.
BFSI Example
Mutual fund login: 15-min access token, 30-day refresh token.
On fraud detection, revoke tokens immediately (forced logout)
Topic 37: Compliance-Driven Event Retention
Q39. How do you retain BFSI events for 7–10 years (RBI/SEBI)?
Archive to immutable storage (WORM-compliant)
Cold storage in Azure Blob with lifecycle management
Topic 38: Auditability of Microservices
Q40. How do you ensure every BFSI transaction is auditable?
Step-by-Step Answer
Techniques
Event sourcing (immutable log).
Centralized audit service (collects from all microservices).
Tamper-proof storage (WORM).
BFSI Example
Every NAV calculation event stored with timestamp + approver ID.
Correlation IDs across services (trace the loan journey end-to-end)
Immutable audit trail required by SEBI/RBI
Topic 39: Disaster Recovery
Q41. How do you plan DR (Disaster Recovery) for BFSI microservices?
Step-by-Step Answer
Requirements
RTO (Recovery Time Objective): e.g., 1 hour.
RPO (Recovery Point Objective): e.g., 5 min.
Implementation
Active-active across regions (e.g., Azure Central India + South India).
Event replication (Kafka MirrorMaker).
DB replication with geo-failover.
BFSI Example
Loan system auto-fails over to DR region during Mumbai DC outage.