
Modernization Around Core CBS

  • Writer: Anand Nerurkar
  • Feb 18

Updated: Feb 26

🏦 Current State (Typical Bank Using Finacle Modules)

Finacle

Traditional setup:

Channel → Finacle LOS → Finacle LMS → Finacle CBS

Where:

  • LOS = Origination

  • LMS = Loan servicing

  • CBS = Ledger & accounting

Problem:

  • Tight vendor coupling

  • Limited agility

  • Slow product innovation

  • Heavy customization risk

  • Upgrade pain

🎯 Target: Digital Lending Modernization Around Core

You introduce a Digital Lending Platform (DLP) layer.

Mobile/Web
    ↓
Digital Lending Platform (Your Layer)
    ↓
LOS / LMS / CBS (as transactional engines)

This layer becomes:

✔ Journey orchestrator
✔ Product configurator
✔ Rules engine
✔ Eligibility engine
✔ Document workflow engine
✔ API façade

LOS/LMS/CBS become back-end processors.

🧠 What Should Your Digital Layer Own?

1️⃣ Customer Journey Orchestration

Instead of:

  • LOS controlling workflow

You control:

  • Step progression

  • Dynamic form rendering

  • Pre-approved offers

  • Consent management

  • Multi-product bundling

LOS only gets:

  • Final application payload

2️⃣ Credit Decisioning (Externalized)

Instead of:

  • Hardcoded LOS rules

You introduce:

  • Independent decision engine

  • Real-time bureau calls

  • Alternative data scoring

  • Risk-based pricing

This gives agility.

3️⃣ Product Configuration Outside CBS

Rather than:

  • Finacle product tables driving everything

You maintain:

  • Product catalog service

  • Rate computation service

  • Fee computation service

CBS just books:

  • Loan account

  • Schedule

  • Accounting entries

4️⃣ Event-Driven Loan Lifecycle

Instead of tight LOS-LMS coupling:

Loan Approved → Event
Loan Disbursed → Event
EMI Paid → Event
Delinquent → Event

Your digital layer subscribes and reacts:

  • Push notifications

  • Cross-sell triggers

  • Collection workflow updates
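
A minimal sketch of the subscribe-and-react idea, using an in-memory bus as a toy stand-in for Kafka / MQ. The class name `InMemoryEventBus`, the event names, and the handler actions are illustrative assumptions, not part of any vendor API.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryEventBus:
    """Toy stand-in for Kafka/MQ: routes each event to handlers by event type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


# Hypothetical reactions owned by the digital layer.
actions: List[str] = []
bus = InMemoryEventBus()
bus.subscribe("LOAN_DISBURSED", lambda e: actions.append(f"push:{e['loan_id']}"))
bus.subscribe("LOAN_DISBURSED", lambda e: actions.append(f"cross_sell:{e['loan_id']}"))
bus.subscribe("DELINQUENT", lambda e: actions.append(f"collections:{e['loan_id']}"))

bus.publish("LOAN_DISBURSED", {"loan_id": "LN-1"})
bus.publish("DELINQUENT", {"loan_id": "LN-1"})
```

The publisher never knows who reacts, so a new reaction is added by subscribing rather than by changing LOS or LMS.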

🧩 Architecture Blueprint

Channels
    ↓
API Gateway
    ↓
Digital Lending Microservices
    - Journey Service
    - Eligibility Service
    - Pricing Engine
    - Document Service
    - Offer Service
    ↓
Integration Layer
    ↓
LOS / LMS / CBS

CBS remains:
✔ Ledger
✔ GL posting
✔ Regulatory engine

LOS/LMS become:
✔ Transaction processors
✔ Data persistence engines

🚨 But Here’s the Critical Design Principle

Do NOT:

  • Rebuild accounting logic

  • Recalculate EMI independently of CBS

  • Duplicate amortization logic

  • Create shadow balances

Otherwise: You create a reconciliation nightmare.

Digital layer must be:

Orchestration + intelligence
NOT financial book of record

🔥 Why This Approach Is Powerful

1️⃣ Vendor Independence

If tomorrow you:

  • Replace Finacle LMS

  • Introduce new lending core

Only integration layer changes.

Channels remain untouched.

2️⃣ Faster Product Launch

Want to launch:

  • BNPL

  • Instant top-up

  • Co-lending product

You configure it in digital layer.

LOS just receives structured booking instruction.

3️⃣ Scalable Digital Traffic

Heavy:

  • Pre-eligibility checks

  • Simulations

  • EMI calculators

Should NOT hit CBS.

Digital layer handles it.
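
A minimal sketch of an indicative EMI calculator the digital layer could serve for simulations, assuming the standard reducing-balance formula. The function name and rounding are assumptions. The figure is for display only; the booked schedule still comes from CBS, in line with the "do not duplicate amortization logic" principle above.

```python
def indicative_emi(principal: float, annual_rate_pct: float, months: int) -> float:
    """Standard reducing-balance EMI; indicative only, not the booked schedule."""
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_rate_pct / 12 / 100  # monthly rate as a fraction
    if r == 0:
        return round(principal / months, 2)
    growth = (1 + r) ** months
    return round(principal * r * growth / (growth - 1), 2)


# Example: 1,00,000 at 12% p.a. over 12 months.
quote = indicative_emi(100_000, 12.0, 12)
```

Serving this from the digital layer keeps calculator traffic away from CBS, at the cost of small rounding differences that the UI should label as indicative.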

⚖ Trade-Offs (Important for CTO Discussion)

Benefit | Risk
Agility | Increased architecture complexity
Decoupling | Need strong governance
Innovation speed | Requires strong integration discipline
Vendor flexibility | Must avoid duplication of core logic

🎯 Strategic Positioning


“Can we modernize digital lending around LOS/LMS/CBS instead of using them end-to-end?”

“Yes. We can introduce a digital lending abstraction layer that owns journey orchestration, decisioning, and product configuration, while LOS/LMS/CBS remain transactional engines. This enables agility and vendor decoupling without compromising ledger integrity.”

That sounds enterprise-ready.

🏦 High-Level Architecture Vision

Core principle:

Digital layer owns journey + intelligence
Core banking owns accounting + regulatory ledger

Using:

  • Finacle / TCS BaNCS (CBS – account booking)

  • Fenergo (KYC / onboarding)

  • Actimize (Fraud / AML)

🧱 1️⃣ Macro Architecture

[ Mobile / Web App ]
        ↓
[ API Gateway ]
        ↓
[ Digital Lending Platform - Microservices Layer ]
        ↓
[ Event Bus (Kafka / MQ) ]
        ↓
[ Risk / ML / RegTech / CBS Integrations ]

Core components inside your abstraction layer:

  • Identity Service

  • Consent Service

  • Document Service

  • Application Service

  • Orchestration Service

  • Risk Decision Engine

  • Notification Service

  • Integration Adapter Layer

🚀 2️⃣ End-to-End Journey Flow (Event-Driven)

Let’s go step by step.

🔐 Step 1 – Login

  • OAuth2 / OIDC

  • MFA if needed

  • Customer profile fetch (cached, not direct CBS hit)

Event:

USER_AUTHENTICATED

📝 Step 2 – Consent Management

Customer accepts:

  • Data usage

  • Bureau pull

  • Income verification

  • AML screening

Stored in:

  • Consent DB (immutable store)

Event:

CONSENT_CAPTURED

Important: Consent must be auditable & tamper-proof.
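
One way to make a consent store tamper-evident is a hash chain, sketched below. This is an assumption about technique, not something the architecture above prescribes; the class `ConsentLog` and the record fields are hypothetical. Strictly, it detects tampering rather than preventing it, so it would sit on top of an append-only or WORM store.

```python
import hashlib
import json
from typing import List


class ConsentLog:
    """Append-only log where each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    @staticmethod
    def _digest(record: dict, prev_hash: str) -> str:
        body = json.dumps(record, sort_keys=True) + prev_hash
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def append(self, record: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        self.entries.append({"record": record, "prev": prev_hash,
                             "hash": self._digest(record, prev_hash)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = ""
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(entry["record"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


consent_log = ConsentLog()
consent_log.append({"customer": "C-1", "purpose": "BUREAU_PULL", "granted": True})
consent_log.append({"customer": "C-1", "purpose": "AML_SCREENING", "granted": True})
intact = consent_log.verify()
consent_log.entries[0]["record"]["granted"] = False  # simulate tampering
tamper_detected = not consent_log.verify()
```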

📂 Step 3 – Application Initiated

Customer enters:

  • Loan amount

  • Tenure

  • Employer details

  • Income details

System generates:

  • Application Reference ID

Event:

APPLICATION_INITIATED

This triggers the pipeline.

⚙️ 3️⃣ Event-Driven Processing Pipeline

Now the orchestration begins.

🧾 Stage 1 – OCR & Document Processing

Input:

  • Uploaded salary slip / bank statement

Process:

  • OCR extraction

  • Data normalization

  • Confidence scoring

Event:

OCR_COMPLETED

If confidence low: → Manual review queue

🪪 Stage 2 – KYC (via Fenergo)

Call: Fenergo API

Checks:

  • Identity verification

  • Sanction screening

  • PEP check

Event:

KYC_COMPLETED

If failed: → APPLICATION_REJECTED

📊 Stage 3 – Credit Risk Assessment

Parallel execution:

  1. Bureau pull

  2. Bank internal ML model:

    • Credit model

    • Income stability model

  3. Bank rule engine (DTI, FOIR etc.)

Event:

CREDIT_RISK_COMPLETED

Output:

  • Risk grade

  • Recommended limit

  • Pricing band
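
The parallel execution above can be sketched with a thread pool. The three check functions are stubs with made-up return values, and the FOIR threshold of 0.5 is an assumption for illustration; real bureau and model calls would replace them.

```python
from concurrent.futures import ThreadPoolExecutor


def bureau_pull(app: dict) -> dict:
    return {"bureau_score": 742}  # stubbed bureau response


def ml_models(app: dict) -> dict:
    return {"credit_pd": 0.03, "income_stability": 0.81}  # stubbed model outputs


def rule_engine(app: dict) -> dict:
    # FOIR = fixed obligations to income ratio; 0.5 cap is a hypothetical policy.
    foir = app["monthly_obligations"] / app["monthly_income"]
    return {"foir": round(foir, 2), "foir_ok": foir <= 0.5}


def assess_credit_risk(app: dict) -> dict:
    """Run the three checks concurrently and merge their outputs."""
    checks = (bureau_pull, ml_models, rule_engine)
    result: dict = {}
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = [pool.submit(check, app) for check in checks]
        for future in futures:
            result.update(future.result(timeout=5))  # bound the wait per check
    return result


risk = assess_credit_risk({"monthly_income": 100_000, "monthly_obligations": 35_000})
```

Total latency is then close to the slowest check rather than the sum of all three, after which CREDIT_RISK_COMPLETED can be emitted.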

🛡 Stage 4 – Fraud & AML Screening

Call: Actimize

Also:

  • Internal fraud ML model

  • Device fingerprinting

  • Velocity checks

Event:

FRAUD_RISK_COMPLETED
AML_SCREENING_COMPLETED

If high risk: → Escalate to manual investigation queue

🧠 4️⃣ Decision Engine (Central Brain)

Now orchestration service aggregates:

  • OCR output

  • KYC status

  • Credit score

  • Fraud score

  • AML status

  • Internal rule evaluation

Decision logic:

If all green → APPROVED
If moderate risk → REFER
If failed → REJECTED

Event:

LOAN_APPROVED
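
A minimal sketch of the green / moderate / failed logic. The grade labels, the fraud-score scale, the thresholds, and the `LOAN_REFERRED` event name are assumptions; only `LOAN_APPROVED` and `APPLICATION_REJECTED` appear in the flow above.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    kyc_passed: bool
    aml_clear: bool
    rules_passed: bool
    credit_grade: str   # hypothetical grades: "A", "B", "C"
    fraud_score: float  # hypothetical scale: 0.0 (safe) to 1.0 (risky)


def decide(s: Signals) -> str:
    # Hard failures first: any failed gate rejects outright.
    if not (s.kyc_passed and s.aml_clear and s.rules_passed):
        return "REJECTED"
    if s.credit_grade == "C" or s.fraud_score >= 0.8:
        return "REJECTED"
    # Moderate risk goes to a human underwriter.
    if s.credit_grade == "B" or s.fraud_score >= 0.4:
        return "REFER"
    return "APPROVED"


# Each outcome maps to its own event, so downstream consumers can react to all three.
EVENT_FOR = {"APPROVED": "LOAN_APPROVED", "REFER": "LOAN_REFERRED",
             "REJECTED": "APPLICATION_REJECTED"}
```

Keeping this as a pure function of the aggregated signals makes every decision easy to replay and explain later.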

📜 5️⃣ Loan Agreement & E-Sign

System generates:

  • Personalized agreement PDF

  • Risk-based pricing

  • EMI schedule preview

Customer signs via:

  • E-sign provider

Event:

LOAN_AGREEMENT_SIGNED

Only after signed → Disbursement allowed.

🏦 6️⃣ CBS Booking (Finacle API Call)

Now your Integration Adapter Layer calls:

Finacle API:

  • Create customer if new

  • Create loan account

  • Create repayment schedule

  • Post initial disbursement

Critical:

  • Idempotency key = Application ID

  • Retry-safe

  • Effectively exactly-once booking (at-least-once delivery + idempotent handling)

Event:

LOAN_ACCOUNT_CREATED
DISBURSEMENT_SUCCESS

CBS now becomes system of record.

🔔 7️⃣ Notification & Downstream Updates

Events trigger:

  • SMS / Email

  • CRM update

  • Analytics pipeline

  • Collection strategy assignment

  • Regulatory reporting update

Event:

NOTIFICATION_SENT

🧩 8️⃣ Event Bus Orchestration Pattern

Important:

Use choreography pattern where possible.

Only use orchestration for:

  • Critical approval decision

  • CBS booking coordination

Everything else async.

🛡 9️⃣ Resilience & Control Mechanisms

Must include:

✔ Circuit breaker for CBS
✔ Timeout control for Actimize / Fenergo
✔ Retry with exponential backoff
✔ Dead letter queue
✔ Manual review workflow
✔ Reconciliation microservice

Never allow: Duplicate disbursement.

📊 10️⃣ Data & Audit Layer

Maintain:

  • Immutable event store

  • Application state machine table

  • Full traceability (for regulator)

  • Risk decision explainability (model governance)

Critical for compliance.
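
The application state machine can be sketched as a transition table plus a history list. The state names and allowed transitions below are assumptions modelled on the journey described earlier, not a fixed specification.

```python
# Hypothetical transition table: current state -> states it may move to.
ALLOWED = {
    "INITIATED": {"KYC_COMPLETED", "REJECTED"},
    "KYC_COMPLETED": {"RISK_ASSESSED", "REJECTED"},
    "RISK_ASSESSED": {"APPROVED", "REFERRED", "REJECTED"},
    "REFERRED": {"APPROVED", "REJECTED"},
    "APPROVED": {"AGREEMENT_SIGNED"},
    "AGREEMENT_SIGNED": {"BOOKED"},
    "BOOKED": {"DISBURSED"},
}


class Application:
    def __init__(self, app_id: str) -> None:
        self.app_id = app_id
        self.state = "INITIATED"
        self.history = [("INITIATED", None)]  # (state, triggering event)

    def transition(self, new_state: str, event: str) -> None:
        """Move to new_state only if the table allows it; record the trigger."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"{self.state} -> {new_state} is not allowed")
        self.state = new_state
        self.history.append((new_state, event))
```

Rejecting illegal jumps (for example straight from KYC to disbursement) is what makes the history usable as audit evidence.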

🧠 11️⃣ Separation of Concerns

Layer | Responsibility
Digital Layer | Journey + Intelligence
RegTech (Fenergo) | Compliance
Risk (Actimize + ML) | Fraud & AML
CBS (Finacle) | Accounting
Event Bus | Decoupling
Notification | Customer communication

🎯 Final Architecture Summary

User
 ↓
Digital Lending Platform
 ↓
Event Pipeline
 ↓
Risk + RegTech + ML
 ↓
Decision Engine
 ↓
Finacle CBS (Ledger)
 ↓
Notifications + Analytics

CBS touched only once: → At booking & disbursement.

Everything else happens outside.

💎 Why This Architecture Is Strong

✔ Composable
✔ Vendor-independent
✔ Cloud scalable
✔ Upgrade safe
✔ AI-friendly
✔ Regulator auditable
✔ Future core replacement ready


1️⃣ How would you modernize the Finacle ecosystem?

Objective: Add agility, digital capabilities, and integration without disrupting the core CBS (Finacle).

Step-by-Step Approach:

  1. Assess Core Footprint

    • Identify Finacle modules in use: CBS, Treasury, Loan, Deposits.

    • Map upstream/downstream systems: LOS, LMS, Payments, AML.

  2. Define Bounded Contexts

    • Keep Finacle as system of record.

    • Create a Digital Lending / Digital Banking Layer as an abstraction layer.

  3. Event-Driven Integration

    • Introduce Kafka / Event Bus for all domain events.

    • Use Outbox pattern in digital services.

    • Digital services subscribe to events for orchestration and analytics.

  4. Adapter Layer

    • AxisCBSAdapter → abstracts Finacle APIs (REST / SOAP)

    • Handles protocol/message transformation, retries, throttling.

  5. Digital Services

    • Microservices for loan journey: KYC, Credit, Fraud, AML, Consent, Notifications.

    • Store projections in Postgres / Redis / CosmosDB.

  6. Data Platform

    • Raw → Curated → Analytics → Feature Store

    • Enable ML for credit, fraud, collections.

  7. Hybrid Modernization Pattern

    • Strangler pattern: gradually shift business logic to digital layer.

    • Avoid touching core CBS.

2️⃣ How do you ensure event consistency in banking?

Objective: Avoid mismatched states across digital layer and core systems.

Patterns and Approach:

  1. Outbox + Event Bus

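    • Every state change writes to an Outbox table in the same local transaction; a relay then publishes it to Kafka.

    • Guarantees at-least-once publishing (no lost events); consumers deduplicate to get an effectively-once outcome.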

  2. Idempotency

    • Every command includes idempotency key (loan ID, transaction ID).

    • Prevents duplicate processing in LOS/LMS/CBS.

  3. Saga (Orchestration or Choreography)

    • Use an orchestrated saga for multi-step flows (loan application → approval → account creation → disbursement).

    • If one step fails, compensating actions are triggered for the steps already completed.

  4. Acknowledgement & Retry

    • Adapter waits for system ACK before marking event processed.

    • Retries with exponential backoff, dead-letter queues for failed events.

  5. Reconciliation

    • Near real-time and end-of-day reconciliation service.

    • Detects and resolves mismatches across Digital ↔ LOS ↔ LMS ↔ CBS.

3️⃣ How do you design for 99.99% availability?

Objective: Ensure high reliability for critical banking operations.

Step-by-Step Design:

  1. Microservice Resilience

    • Stateless services with autoscaling.

    • Circuit breakers, bulkheads, rate limiting.

  2. Database Layer

    • Multi-AZ deployments, hot standby replicas.

    • Use a distributed cache (Redis) for low-latency reads; CosmosDB can serve as a low-latency, geo-replicated store.

  3. Message Bus

    • Kafka clusters with replication factor ≥3.

    • Multi-region setup if necessary.

  4. Core System Adapter

    • Retry, idempotency, failover endpoints.

  5. Monitoring & Alerts

    • SLA monitoring, anomaly detection.

    • PagerDuty integration.

  6. Disaster Recovery

    • Multi-region DR for Kafka, databases, core adapters.

    • Automatic failover with minimal RPO/RTO.
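
To make the 99.99% target concrete, the arithmetic below converts an availability percentage into a downtime budget and shows how availability multiplies across a serial call chain. The component figures in the chain are hypothetical, and the chain formula assumes independent failures.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of allowed downtime over `days` for a given availability target."""
    return (1 - availability_pct / 100) * days * 24 * 60


def serial_availability(*component_pcts: float) -> float:
    """Availability of a chain where every component must be up (independent failures)."""
    total = 1.0
    for pct in component_pcts:
        total *= pct / 100
    return total * 100


yearly = downtime_budget_minutes(99.99)            # about 52.6 minutes per year
monthly = downtime_budget_minutes(99.99, 30)       # about 4.3 minutes per 30 days
chain = serial_availability(99.99, 99.99, 99.95)   # gateway -> service -> adapter
```

A chain of individually strong components lands below 99.99% end to end, which is why redundancy and failover appear at every layer in the list above.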

4️⃣ How would you architect UPI at scale?

Requirements: Very high throughput (billions of transactions per month, thousands of TPS at peak), low latency (<1 sec), high reliability.

Step-by-Step Architecture:

  1. API Gateway

    • Terminals / apps call stateless UPI gateway.

  2. Event-Driven Transaction Processing

    • Transaction events → Kafka → Orchestrator → Core CBS/Ledger.

  3. High-Throughput Ledger

    • Use Finacle / TCS BaNCS with adapter layer.

    • Ensure atomic debit/credit.

  4. Concurrency & Idempotency

    • UPI transaction ID = idempotency key.

    • Prevent duplicate debits.

  5. Scaling

    • Horizontal scaling of gateways & microservices.

    • Kafka partitioning by VPA/Bank ID for throughput.

  6. Real-Time Settlement

    • NEFT/IMPS/UPI pipelines for inter-bank settlement.

    • Event-driven notifications for payer/payee.

  7. Fraud & Risk

    • Real-time ML scoring per transaction.
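
Partitioning by VPA can be sketched as a stable hash of the key modulo the partition count. This uses SHA-256 for illustration and is not Kafka's own default partitioner (which uses murmur2); the function name and the sample VPA are assumptions.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (illustrative, not Kafka's murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# The same payer VPA always lands on the same partition.
p1 = partition_for("alice@bank", 12)
p2 = partition_for("alice@bank", 12)
```

Because one key always maps to one partition, events for a given payer stay ordered; the trade-off is that a very hot key can overload a single partition.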

5️⃣ How do you manage cost vs performance tradeoff?

Principles:

  1. Use Cloud Elasticity

    • Autoscale for peaks (e.g., EMI dates, salary days).

    • Scale down in off-peak.

  2. Caching & Projection

    • Redis for real-time journey state.

    • Avoid hitting LOS/LMS/CBS for every UI request.

  3. Batch for Non-Critical Work

    • EOD batch reconciliation.

    • Analytics pipelines in Spark / Databricks.

  4. Prioritize SLAs

    • Critical flows (payments, loan approval) → synchronous, high-cost path.

    • Non-critical (analytics, dashboards) → asynchronous, low-cost.

6️⃣ How do you run ARB (Application Reconciliation Batch)?

Objective: Detect mismatches between digital layer and LOS/LMS/CBS.

Steps:

  1. Query all domain events in the day.

  2. Compare:

    • Loan application IDs, loan amount, status.

    • EMI schedule, disbursed amount, ledger entries.

  3. Generate reconciliation report:

    • Exceptions flagged

    • Operations ticket created

  4. Automate retries for minor issues.

  5. Persist ARB results in audit/logging system for compliance.
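
The compare step can be sketched as a keyed diff between the digital layer's view and the CBS view. The record shapes, field names, and issue codes are assumptions for illustration.

```python
from typing import Dict, List


def reconcile(digital: Dict[str, dict], cbs: Dict[str, dict],
              fields=("amount", "status")) -> List[dict]:
    """Return one exception per loan that is missing on a side or differs on a field."""
    exceptions: List[dict] = []
    for loan_id in sorted(set(digital) | set(cbs)):
        d, c = digital.get(loan_id), cbs.get(loan_id)
        if d is None or c is None:
            issue = "MISSING_IN_DIGITAL" if d is None else "MISSING_IN_CBS"
            exceptions.append({"loan_id": loan_id, "issue": issue})
            continue
        diffs = [f for f in fields if d.get(f) != c.get(f)]
        if diffs:
            exceptions.append({"loan_id": loan_id, "issue": "MISMATCH", "fields": diffs})
    return exceptions


digital_view = {"LN-1": {"amount": 500000, "status": "DISBURSED"},
                "LN-2": {"amount": 200000, "status": "DISBURSED"},
                "LN-3": {"amount": 100000, "status": "APPROVED"}}
cbs_view = {"LN-1": {"amount": 500000, "status": "DISBURSED"},
            "LN-2": {"amount": 250000, "status": "DISBURSED"}}
report = reconcile(digital_view, cbs_view)
```

Each exception row can then feed the report, the ops ticket, or an automated retry, depending on its issue code.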

7️⃣ How do you balance build vs buy?

Framework:

  • Buy: LOS, LMS, CBS (e.g., Finacle), i.e. the core transactional systems.

  • Build: Digital orchestration, ML scoring, event-driven pipelines, adapters.

  • Criteria:

    • Strategic differentiation → build (e.g., digital journey UX, ML models).

    • Commodity / stable → buy (e.g., core banking, payments clearing).

    • Regulatory compliance → buy unless internal expertise exists.

8️⃣ How do you handle regulatory audit scenario?

Steps:

  1. Immutable Audit Logs

    • Store raw events in Data Lake raw zone.

    • Append-only, timestamped, versioned.

  2. Lineage Tracking

    • All transformations logged: raw → curated → feature → decision.

  3. Reconciliation Evidence

    • EOD / ARB reports, exception handling, SLA adherence.

  4. Explainable ML

    • Keep feature + model version for each decision (RBI compliance).

  5. Access Control & Governance

    • SailPoint for roles.

    • Fine-grained audit of who accessed or modified data.

  6. Regulatory Reporting

    • Export dashboards and reports to regulator-ready formats.

Summary Table for Interview

Question | Key Answer Pillars
Finacle Modernization | Digital Layer + Adapters + Event-driven + Strangler pattern
Event Consistency | Outbox, Kafka, Saga, Idempotency, Reconciliation
99.99% Availability | Resilient microservices, multi-AZ DB, Kafka replication, DR
UPI at Scale | API Gateway, Event-driven, Idempotency, Real-time ML
Cost vs Performance | Autoscaling, caching, batch vs real-time, SLA prioritization
ARB | Event comparison, reconciliation reports, ops ticketing
Build vs Buy | Strategic differentiation → Build, Commodity → Buy
Regulatory Audit | Immutable logs, lineage, explainable ML, role-based access, reports

1️⃣ Hybrid Architecture (On-Prem Finacle + Cloud Microservices)

Objective: Modernize banking operations without disrupting Finacle, while leveraging cloud agility.

Step-by-Step Approach:

  1. Assess Core Footprint

    • On-prem Finacle = system of record, immutable business logic.

    • Identify modules to modernize: CBS, Loan, Treasury, Payments.

  2. Define Hybrid Boundary

    • Keep Finacle on-prem for critical core banking.

    • Move digital services, analytics, ML scoring, orchestration, notifications to cloud.

  3. Integration Layer

    • Introduce AxisCBSAdapter on bank side:

      • Handles REST / SOAP calls

      • Protocol transformation

      • Retry / throttling / idempotency

    • Adapter ensures cloud services don’t directly touch Finacle.

  4. Data Management

    • Cloud databases for digital layer:

      • Postgres (relational state)

      • Redis (caching, low-latency reads)

      • CosmosDB (geo-redundant, multi-region)

    • Event-driven updates ensure consistency with Finacle.

  5. Network & Security

    • Use secure VPN / ExpressRoute / PrivateLink for cloud ↔ on-prem traffic.

    • Apply zero-trust principles for microservices.

2️⃣ API Gateway Pattern

Objective: Centralized entry-point for all digital traffic.

Step-by-Step:

  1. Expose Microservices

    • All digital microservices (loan-svc, kyc-svc, fraud-svc, aml-svc) exposed via API Gateway.

  2. Functions of API Gateway

    • Authentication & authorization (Azure AD / OAuth2 / JWT)

    • Rate limiting & throttling

    • Routing to services

    • Aggregation of responses (for multi-service calls)

    • Protocol translation (HTTP → gRPC / REST)

  3. Benefits

    • Single entry-point for apps & UPI APIs.

    • Shield core Finacle from direct exposure.

    • Enables analytics on traffic (request counts, latencies).

3️⃣ Event-Driven Using Kafka

Objective: Loose coupling, scalability, eventual consistency.

Step-by-Step:

  1. Domain Event Publishing

    • Digital microservices emit domain events (loan-initiated, KYC-verified, loan-approved).

    • Outbox pattern ensures at-least-once delivery; idempotent consumers make the end result effectively exactly-once.

  2. Kafka Topics

    • One topic per domain event type:

      loan-initiated-event
      kyc-verified-event
      credit-score-verified-event
      loan-account-created-event

  3. Consumers

    • Orchestration services, audit service, data lake pipelines, and adapters subscribe to events.

    • Enables real-time processing, ML scoring, and reconciliation.

  4. Advantages

    • High throughput, fault-tolerant.

    • Microservices don’t call each other synchronously → avoids tight coupling.

4️⃣ Service Mesh

Objective: Simplify microservice communication, observability, and security in a cloud environment.

Step-by-Step:

  1. Deploy a Service Mesh (e.g., Istio / Linkerd) in Kubernetes / OpenShift:

    • Sidecar proxies manage service-to-service traffic.

    • Provides mTLS encryption.

    • Handles service discovery, retries, load balancing.

  2. Observability Features

    • Distributed tracing (Jaeger)

    • Metrics collection (Prometheus)

    • Traffic routing / blue-green / canary deployments

  3. Benefits

    • Decouples networking concerns from application logic.

    • Standardizes resilience policies across services (retry, timeout, circuit breaker).

5️⃣ Observability Stack

Objective: Monitor microservices and hybrid environment for performance and failures.

Step-by-Step:

  1. Metrics

    • Use Prometheus / Grafana to capture CPU, memory, request latency, error rates.

  2. Tracing

    • Distributed tracing (Jaeger / OpenTelemetry)

    • Trace events from UI → Digital Layer → Adapters → Finacle

  3. Logging

    • Centralized logging (ELK / EFK stack)

    • Include structured logs, correlation IDs, and request IDs.

  4. Alerting

    • SLA / SLO violations trigger alerts.

    • PagerDuty / OpsGenie integration.

  5. Business Observability

    • Track key banking KPIs (loan approvals, disbursements, failures) in real time.

6️⃣ SRE (Site Reliability Engineering) Model

Objective: Ensure 99.99% availability and operational excellence.

Step-by-Step:

  1. Define SLIs / SLOs / SLAs

    • Example: Loan approval API latency < 2s for 99.9% of requests.

    • Availability of digital layer = 99.99%.

  2. Automated Incident Management

    • Self-healing microservices with retry + circuit breaker.

    • Reconciliation service detects mismatches and triggers ops tickets automatically.

  3. Capacity Planning

    • Use auto-scaling based on load.

    • Pre-provisioned Kafka partitions & replicas for peak banking hours.

  4. Change Management

    • Canary releases / blue-green deployments.

    • Reduce risk for production changes in hybrid setup.

  5. Postmortems & Learning

    • Every outage / mismatch triggers blameless postmortem.

    • Feed improvements back to system design.

🔹 Combined Hybrid Architecture Flow (Axis Bank Style)

User Apps / UPI API
        ↓
   API Gateway
        ↓
Digital Layer Microservices
        ↓
Event Bus (Kafka) ←→ Service Mesh (Istio)
        ↓
Adapters (LOS / LMS / CBS)
        ↓
On-Prem Systems (Finacle / LOS / LMS)
        ↓
Data Lake → Analytics → Feature Store (ML)
        ↓
Observability Stack + SRE dashboards

✅ This ensures:

  • Hybrid integration (on-prem + cloud)

  • Event consistency

  • Observability

  • High availability

  • Cloud-native microservice patterns


🏦 Typical ABC Bank-Style Hybrid Deployment Model

🔹 On-Prem (ABC Bank Data Center)

This is where regulated, core, and legacy systems stay.

Deployed On-Prem:

  • CBS (like Finacle)

  • LOS

  • LMS

  • Enterprise Service Bus (if legacy)

  • Core payment switch

  • Core DB clusters

  • Core adapters (often)

  • Security & HSM modules

Why On-Prem?

  • Regulatory requirements

  • Data residency

  • Tight control on financial ledger

  • Lower latency to core systems

  • Vendor support model

CBS is always treated as System of Record and rarely moved fully to cloud in traditional banks.

☁️ Cloud (Digital Modernization Layer)

This is where innovation happens.

Deployed in Cloud:

  • Digital lending microservices (loan-svc, kyc-svc, fraud-svc)

  • Orchestration service

  • Kafka (managed cluster)

  • API Gateway

  • ML scoring services

  • Feature store

  • Data lake

  • Redis / Postgres / CosmosDB

  • Observability stack

  • Reconciliation engine

Why Cloud?

  • Elastic scaling (UPI peaks, EMI days)

  • Faster deployments

  • Microservices-friendly

  • Lower infra management overhead

  • AI/ML workloads

🔄 How They Connect (Very Important)

You NEVER expose Finacle directly to cloud.

Instead:

Cloud Digital Layer
        ↓
Secure VPN / Private Link / ExpressRoute
        ↓
On-Prem Adapter Layer
        ↓
Finacle CBS

🧩 Where Should Adapter Be Deployed?

There are two common models:

✅ Model 1 (Most Common in Banks)

Adapter deployed On-Prem

Cloud Digital
      ↓
Secure Channel
      ↓
AxisCBSAdapter (On-Prem)
      ↓
Finacle CBS

Why?

  • Keeps Finacle insulated

  • Better latency (adapter close to CBS)

  • Easier protocol transformation

  • Centralized throttling & control

  • Security boundary protection

This is the safest enterprise model.

⚡ Model 2 (Less Common)

Adapter in Cloud.

But this increases:

  • Security exposure

  • Network dependency risk

  • Compliance complexity

Most large banks prefer adapter near core.

📍 Final Deployment Architecture

On-Prem Zone

  • Finacle CBS

  • LOS

  • LMS

  • CBS Adapter

  • LOS Adapter

  • LMS Adapter

Cloud Zone

  • API Gateway

  • Digital Microservices

  • Kafka

  • ML Services

  • Data Lake

  • Reconciliation Engine

  • Monitoring stack

🛡 Security Considerations

Between Cloud ↔ On-Prem:

  • Mutual TLS

  • IP whitelisting

  • Private connectivity (not public internet)

  • Strict firewall rules

  • Rate limiting at adapter layer

🎯

Where would you deploy Finacle and adapters in hybrid setup?

“In a hybrid model, Finacle CBS remains on-prem as the financial system of record. We deploy the CBS adapter on-prem as well, close to Finacle, to handle protocol transformation, resiliency, and security control. The digital microservices layer runs in cloud for scalability and agility, connected via secure private network links.”



How to ensure Reliability in Banking

====


🎯 Step 1: Define What Reliability Means in Banking

In banking, reliability means:

  • No data loss

  • No duplicate transactions

  • No financial inconsistencies

  • High availability (99.99%+)

  • Predictable performance

  • Fast recovery from failure

So reliability = Correctness + Availability + Resilience + Recoverability

🏗 Step 2: Reliability at Each Layer

We ensure reliability across 7 layers.

1️⃣ Infrastructure Reliability

In Cloud (Digital Layer)

  • Multi-AZ deployment

  • Auto-scaling groups

  • Managed Kubernetes (AKS / EKS)

  • Load balancers with health checks

  • Distributed Kafka cluster (replication factor ≥ 3)

On-Prem (CBS Side)

  • Active-passive or active-active CBS

  • Database replication

  • Redundant network paths

  • Dual firewalls

2️⃣ Service-Level Reliability (Microservices)

Each microservice must:

✅ Be Stateless

No session stored locally.

✅ Use Circuit Breaker

If CBS is slow:

  • Stop calling

  • Return fallback response

  • Prevent cascading failure
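
A minimal circuit breaker sketch, assuming a simple consecutive-failure threshold and a fixed cool-down. The class names and default values are illustrative; production systems would usually take this from a resilience library or the service mesh.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling CBS while the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("CBS circuit open; use fallback")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

The caller catches `CircuitOpenError` and returns the fallback response, so a slow CBS stops consuming threads upstream.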

✅ Timeouts + Retries

  • Set strict timeout (e.g., 2s)

  • Retry with exponential backoff

  • Max retry threshold
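
The retry rules above can be sketched as follows. The default delays, the jitter range, and the choice of `TimeoutError` as the only retryable error are assumptions; retries are only safe when the downstream call is idempotent.

```python
import random
import time


def retry_with_backoff(fn, max_attempts: int = 4, base_delay_s: float = 0.2,
                       max_delay_s: float = 2.0, retry_on=(TimeoutError,),
                       sleep=time.sleep):
    """Call fn(); on a retryable error wait base * 2^attempt (capped, jittered), then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted; let the caller route to a DLQ
            delay = min(max_delay_s, base_delay_s * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids synchronized retries
```

Passing `sleep` as a parameter keeps the function testable without real waiting.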

✅ Bulkhead Pattern

Separate connection pools for:

  • CBS calls

  • LOS calls

  • LMS calls

Prevents one failure from affecting entire system.

3️⃣ Data Reliability

This is most critical in banking.

🔐 Idempotency

Every request includes:

X-Idempotency-Key = LoanID or TransactionID

Prevents duplicate loan creation.

🔄 Outbox Pattern

When service updates DB:

  1. Save business data

  2. Save event in outbox table (in the same local transaction as step 1)

  3. Background process publishes event to Kafka

Guarantees:

  • No event loss

  • No partial update
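
A minimal outbox sketch using an in-memory SQLite database as a stand-in for the service's Postgres. Table names, column names, and the `publish` callback are assumptions; a real relay would call a Kafka producer instead.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (loan_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)")


def approve_loan(loan_id: str) -> None:
    # One local transaction: both rows commit together or not at all.
    with conn:
        conn.execute("INSERT INTO loans VALUES (?, ?)", (loan_id, "APPROVED"))
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("LOAN_APPROVED", json.dumps({"loan_id": loan_id})))


def relay_once(publish) -> int:
    """Publish pending rows in order; mark each published only after the call succeeds."""
    rows = conn.execute("SELECT id, event_type, payload FROM outbox "
                        "WHERE published = 0 ORDER BY id").fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # may raise; the row then stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []
approve_loan("LN-1")
relay_once(lambda event_type, payload: sent.append((event_type, payload)))
```

If the relay crashes between publishing and marking the row, the event is sent again on the next pass. That is at-least-once delivery, so consumers still need to be idempotent.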

📦 Kafka Reliability

  • Replication factor ≥ 3

  • Producer acknowledgment level: acks=all

  • Dead letter queues for failed messages

  • Consumer offset tracking

4️⃣ Transaction Reliability (Saga Pattern)

Loan creation involves:

  1. Loan approval

  2. CBS account creation

  3. LMS loan schedule

  4. Disbursement

We use Orchestrated Saga Pattern:

If CBS fails:

  • Compensate → mark loan as FAILED

  • Do not proceed to LMS

This prevents inconsistent state.
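
An orchestrated saga can be sketched as an ordered list of steps, each with an optional compensation. The step names, the `run_saga` helper, and the simulated CBS failure are illustrative assumptions.

```python
class SagaFailed(Exception):
    pass


def run_saga(steps, ctx: dict) -> dict:
    """steps: list of (name, action, compensation). Undo completed steps in reverse on failure."""
    completed = []
    for name, action, compensate in steps:
        try:
            action(ctx)
            completed.append((name, compensate))
        except Exception as exc:
            for _done_name, undo in reversed(completed):
                if undo is not None:
                    undo(ctx)
            ctx["status"] = "FAILED"
            raise SagaFailed(f"step '{name}' failed: {exc}") from exc
    ctx["status"] = "COMPLETED"
    return ctx


trail = []

def approve(ctx): trail.append("approved")
def reverse_approval(ctx): trail.append("approval_reversed")
def create_cbs_account(ctx): raise RuntimeError("CBS unavailable")  # simulated failure
def create_lms_schedule(ctx): trail.append("lms_schedule")

loan_ctx = {"loan_id": "LN-1"}
saga_steps = [("approve", approve, reverse_approval),
              ("cbs_account", create_cbs_account, None),
              ("lms_schedule", create_lms_schedule, None)]
try:
    run_saga(saga_steps, loan_ctx)
except SagaFailed:
    pass  # loan_ctx["status"] is now "FAILED" and LMS was never called
```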

5️⃣ Hybrid Network Reliability

Between Cloud ↔ On-Prem:

  • Private connectivity (ExpressRoute / MPLS)

  • Mutual TLS

  • Retry logic at adapter

  • Secondary failover endpoint

If primary CBS endpoint fails:

  • Switch to secondary

6️⃣ Monitoring & Observability

Reliability without visibility is impossible.

Metrics

  • API latency

  • Error rate

  • CBS response time

  • Kafka lag

Tracing

Track full journey:

User → Gateway → LoanSvc → Adapter → CBS

Alerting

  • SLA breach alerts

  • Kafka consumer lag alerts

  • DB replication lag alerts

7️⃣ Recovery & Reconciliation

Even with best design, failures happen.

So we implement:

Near Real-Time Reconciliation

Digital vs LOS vs LMS vs CBS comparison.

End-of-Day ARB

Financial reconciliation.

Replay Capability

Kafka allows replaying events from offset.

Manual Override Dashboard

Ops team can:

  • Retry

  • Reconcile

  • Re-trigger events

🛡 Reliability Example Scenario

Scenario:

Loan account created in CBS but response lost.

Without reliability: → Duplicate loan risk.

With reliability:

  1. Adapter uses idempotency key.

  2. If retry happens:

    • CBS detects duplicate

    • Returns existing loan account

  3. Reconciliation confirms consistency.

No financial corruption.
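
The lost-response scenario can be sketched with a fake core that deduplicates on the idempotency key. `FakeCBS` is a hypothetical stand-in; whether the real core deduplicates natively is an assumption, and if it does not, the adapter would keep its own key store in front of it.

```python
import itertools


class FakeCBS:
    """Hypothetical stand-in for a core that deduplicates on an idempotency key."""

    def __init__(self) -> None:
        self._by_key = {}
        self._seq = itertools.count(1)

    def create_loan_account(self, idempotency_key: str, amount: int) -> dict:
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]  # duplicate: return the existing account
        account = {"account_no": f"LN{next(self._seq):06d}", "amount": amount}
        self._by_key[idempotency_key] = account
        return account


def book_with_lost_response(cbs: FakeCBS, application_id: str, amount: int) -> dict:
    cbs.create_loan_account(application_id, amount)         # booked, but the response is "lost"
    return cbs.create_loan_account(application_id, amount)  # retry with the same key


core = FakeCBS()
account = book_with_lost_response(core, "APP-42", 500_000)
```

Two calls, one account: the retry is safe because the key, not the request count, decides whether a booking happens.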

📊

How will you ensure reliability in hybrid banking architecture?

You say:

“I design reliability across infrastructure, service, data, and operational layers. I ensure stateless microservices with circuit breakers, idempotent APIs, outbox-based event publishing, Kafka replication, saga-based transaction management, and continuous reconciliation between digital and core systems. Additionally, we implement observability and automated recovery mechanisms to maintain 99.99% availability.”

That answer shows maturity.

🧠 Bonus: Reliability Pyramid

Infrastructure Stability
        ↓
Service Resilience
        ↓
Data Consistency
        ↓
Transaction Integrity
        ↓
Monitoring & Recovery

1️⃣ How Do You Run ARB? (Architecture Review Board)

Here ARB means the Architecture Review Board, not the reconciliation batch covered earlier. ARB is not just a meeting. It is a governance mechanism to control architecture quality, risk, cost, and alignment.

🎯 Step 1: Define ARB Charter

Clearly define:

  • Scope (All new systems? Only Tier-1 changes?)

  • Review triggers:

    • New platform

    • Major integration

    • Cloud adoption

    • Vendor onboarding

    • Regulatory-impacting change

Without scope clarity, ARB becomes chaos.

🎯 Step 2: Standardized Submission Template

Every proposal must include:

  1. Business objective

  2. Current architecture

  3. Proposed architecture diagram

  4. NFRs (availability, performance, RTO/RPO)

  5. Security model

  6. Data classification

  7. Integration points

  8. Cost estimate

  9. Build vs Buy analysis

  10. Risk assessment

This prevents emotional decisions.

🎯 Step 3: Structured Review Dimensions

ARB evaluates across 7 pillars:

Pillar | What We Check
Alignment | Does it align with enterprise target architecture?
Security | IAM, encryption, PII handling
Reliability | HA, DR, resiliency patterns
Integration | Event-driven? APIs? Tight coupling?
Data | Duplication? Data ownership?
Cost | Capex/Opex impact
Compliance | RBI/GDPR/SOX implications

🎯 Step 4: Decision Outcomes

ARB decisions should be:

  • Approved

  • Approved with conditions

  • Rework required

  • Rejected

And everything documented.

No informal approvals.

🎯 Step 5: Post-Approval Governance

ARB doesn’t end after approval.

You track:

  • Design compliance

  • Drift detection

  • Production alignment

Otherwise teams deviate later.

2️⃣ What Do You Evaluate?

This is critical.

🏗 Architecture Evaluation Areas

1️⃣ Technical Fit

  • Does it align with hybrid architecture?

  • Does it reuse shared services (Kafka, API Gateway)?

  • Is it introducing a new stack unnecessarily?

2️⃣ NFR Coverage

  • Availability target?

  • Scaling model?

  • Latency expectations?

  • DR plan?

3️⃣ Integration Strategy

  • REST vs Event?

  • Synchronous dependencies?

  • Risk of cascading failures?

4️⃣ Data Governance

  • Source of truth defined?

  • Data duplication?

  • Audit trail available?

5️⃣ Operational Readiness

  • Monitoring defined?

  • SLOs documented?

  • Support model defined?

6️⃣ Vendor Risk (If Buy)

  • Lock-in risk?

  • Exit strategy?

  • SLA commitment?

7️⃣ Long-Term Sustainability

  • Tech roadmap?

  • Skills availability?

  • Community support?

3️⃣ How Do You Prevent Tech Sprawl?

Tech sprawl = uncontrolled tools, frameworks, vendors.

It kills maintainability and increases cost.

🎯 Step 1: Define Approved Technology Stack

Example:

  • Backend: Java / .NET

  • Messaging: Kafka

  • Cache: Redis

  • DB: Postgres

  • Observability: Prometheus + Grafana

  • API Gateway: Standardized

No arbitrary new tools without ARB approval.

🎯 Step 2: Platform Engineering Model

Provide shared platforms:

  • Shared CI/CD

  • Shared Kafka cluster

  • Shared Kubernetes

  • Shared observability stack

When platform is easy, teams won’t build their own.

🎯 Step 3: Reuse-First Principle

Before approving new tech, ask:

  • Can existing platform solve this?

  • Is 80% solution acceptable?

🎯 Step 4: Periodic Rationalization

Every 6–12 months:

  • List all tools in use

  • Identify duplicates

  • Decommission low-usage systems

This prevents entropy.

4️⃣ How Do You Handle Deviations?

Deviations are inevitable.

What matters is how you control them.

🎯 Step 1: Categorize Deviation

Type | Example
Minor | Version mismatch
Medium | Using alternate DB
Major | Introducing new event bus

🎯 Step 2: Temporary vs Permanent

If deviation is needed:

  • Document reason

  • Define sunset timeline

  • Define rollback plan

Never allow undocumented permanent deviation.

🎯 Step 3: Risk Assessment

Evaluate:

  • Security impact?

  • Operational complexity?

  • Compliance violation?

  • Future integration risk?

🎯 Step 4: Compensating Controls

If a deviation is allowed:

  • Additional monitoring

  • Extra documentation

  • Restricted scope

  • Periodic review

🎯 Step 5: Governance Tracking

Maintain:

Architecture Deviation Register

Include:

  • System name

  • Deviation type

  • Risk level

  • Expiry date

  • Owner

Without tracking, deviations become architecture debt.
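A register with these five fields can be as simple as a typed record plus an expiry check. The sketch below assumes an in-memory list; the class name, field names, and sample entries are hypothetical, and a real register would more likely live in a governance tool or database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deviation:
    system: str
    deviation_type: str  # "Minor", "Medium" or "Major"
    risk_level: str
    expiry: date         # sunset date agreed when the deviation was approved
    owner: str


def overdue(register, today):
    """Return deviations whose expiry date has passed, oldest first."""
    return sorted((d for d in register if d.expiry < today), key=lambda d: d.expiry)


# Sample entries (hypothetical systems and owners).
register = [
    Deviation("loan-origination", "Medium", "High", date(2025, 3, 31), "team-a"),
    Deviation("collections", "Minor", "Low", date(2026, 1, 15), "team-b"),
]

for d in overdue(register, today=date(2025, 6, 1)):
    print(f"{d.system}: {d.deviation_type} deviation expired {d.expiry}, owner {d.owner}")
# → loan-origination: Medium deviation expired 2025-03-31, owner team-a
```

Running a check like this before each ARB session turns expired deviations into agenda items instead of silent debt.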

🎤 How do you run ARB and prevent sprawl?

“We operate ARB with structured submission templates and evaluate proposals across alignment, security, NFRs, integration, cost, and compliance. We enforce a standardized enterprise tech stack and promote reuse via platform engineering. Deviations are documented, risk-assessed, time-bound, and tracked in a deviation register to prevent architectural entropy.”



Modernization Around Mainframe

-------

🎯 Strong Leadership Answer Framework


1️⃣ Acknowledge the Reality

“Absolutely — mainframe teams are usually protective because they’ve built and maintained the system for 15–20 years. There is institutional knowledge and natural resistance.”

2️⃣ Address the “You Don’t Understand” Concern

“We never approached modernization with the mindset that we knew better. In fact, the first phase was reverse engineering and knowledge acquisition.”

Then list what you actually did:

  • Conducted application portfolio assessment

  • Identified COBOL modules, JCL jobs, CICS transactions

  • Mapped batch vs online flows

  • Created dependency maps

  • Captured data lineage

  • Built functional decomposition

This shows depth.

3️⃣ Did You Learn It Yourself?

This is where you differentiate as a leader.

Do NOT say “I learned COBOL myself.”

Say this:

“As a technology leader, I didn’t try to become a COBOL expert. Instead, I ensured we had hybrid squads — legacy SMEs + modern engineers — and I personally drove structured knowledge capture workshops.”

“However, I invested time to understand high-level mainframe architecture, batch scheduling, VSAM datasets, and integration touchpoints so that modernization decisions were informed.”

That balances hands-on awareness + leadership positioning.

4️⃣ How Did You Handle Non-Cooperation?


“Resistance reduced significantly when we reframed modernization from ‘replacement’ to ‘risk reduction and performance uplift’. We involved mainframe SMEs in design reviews and made them co-owners of the future state.”

Psychology > Technology.

Also mention:

  • Incentivized knowledge documentation

  • Formal KT sign-offs

  • Shadow runs

  • Parallel production runs

5️⃣ Who Validated If You Didn’t Know?

This is the most powerful part.

Say:

“Validation was not individual-driven — it was systemic.”

Then explain:

  • Automated regression testing

  • Data reconciliation scripts

  • Parallel run comparisons

  • Output parity validation

  • Audit logs

  • Business sign-off checkpoints

For BFSI especially:

  • Reconciliation at field level

  • Batch output comparison

  • T+1 reconciliation reports

That shows governance maturity.
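Field-level reconciliation during a parallel run can be illustrated with a short comparison routine. This is a sketch under assumptions: both systems are taken to export records as dictionaries keyed by `account_id`, and the field names and sample data are hypothetical. Amounts use `Decimal` so that comparisons are exact.

```python
from decimal import Decimal


def reconcile(legacy, modern, key="account_id"):
    """Compare two record sets field by field and return a list of mismatches."""
    legacy_by_key = {r[key]: r for r in legacy}
    modern_by_key = {r[key]: r for r in modern}
    issues = []
    for k in sorted(legacy_by_key.keys() | modern_by_key.keys()):
        old, new = legacy_by_key.get(k), modern_by_key.get(k)
        if old is None or new is None:
            # Record exists in only one of the two systems.
            issues.append((k, "<record>", old, new))
            continue
        for field in sorted(old.keys() | new.keys()):
            if old.get(field) != new.get(field):
                issues.append((k, field, old.get(field), new.get(field)))
    return issues


# Hypothetical batch outputs from the mainframe and the modernized system.
legacy = [
    {"account_id": "A1", "balance": Decimal("100.50"), "status": "ACTIVE"},
    {"account_id": "A2", "balance": Decimal("20.00"), "status": "CLOSED"},
]
modern = [
    {"account_id": "A1", "balance": Decimal("100.50"), "status": "ACTIVE"},
    {"account_id": "A2", "balance": Decimal("20.01"), "status": "CLOSED"},
    {"account_id": "A3", "balance": Decimal("5.00"), "status": "ACTIVE"},
]

for account, field, old, new in reconcile(legacy, modern):
    print(account, field, old, new)
```

Here the run reports a one-paisa balance difference on A2 and a record on A3 that exists only in the modern output. An empty result is what business sign-off would look for before cutover.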

6️⃣ Address Time Consumption

Say:

“Yes, modernization is time-consuming if done blindly. We reduced risk and timeline by using a strangler pattern approach rather than big-bang replacement.”

Mention:

  • API wrapping

  • Incremental module migration

  • Domain-driven segmentation

  • Event-driven extraction
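The strangler approach with API wrapping and incremental migration can be sketched as a routing facade: callers hit one entry point, and operations are cut over one at a time. The class, the handler functions, and the operation names below are hypothetical stand-ins, not a reference to any specific product.

```python
class StranglerFacade:
    """Routes each operation to the new service if migrated, else to the legacy wrapper."""

    def __init__(self, legacy_handler, migrated=None):
        self._legacy = legacy_handler
        self._migrated = dict(migrated or {})

    def migrate(self, operation, handler):
        """Cut one operation over to its new implementation."""
        self._migrated[operation] = handler

    def handle(self, operation, payload):
        handler = self._migrated.get(operation)
        if handler is not None:
            return handler(payload)
        return self._legacy(operation, payload)


def legacy_mainframe(operation, payload):
    # Stand-in for an API wrapper around a mainframe transaction (hypothetical).
    return f"legacy:{operation}"


def balance_service(payload):
    # Stand-in for a newly extracted microservice (hypothetical).
    return "modern:balance_inquiry"


facade = StranglerFacade(legacy_mainframe)
facade.migrate("balance_inquiry", balance_service)

print(facade.handle("balance_inquiry", {}))  # → modern:balance_inquiry
print(facade.handle("loan_booking", {}))     # → legacy:loan_booking
```

Because unmigrated operations keep flowing to the legacy path, each cutover is small and reversible, which is the risk argument against big-bang replacement.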


💎

“Mainframe modernization always comes with knowledge concentration and resistance. We addressed this by first building deep system understanding through reverse engineering workshops and dependency mapping. Rather than positioning modernization as replacement, we positioned it as risk reduction and performance uplift. We formed hybrid squads combining mainframe SMEs and cloud-native engineers. Validation was systematic — automated regression suites, parallel run comparisons, and reconciliation at data and transaction levels ensured parity. Instead of big-bang migration, we adopted a strangler approach — gradually extracting domains via APIs and microservices. My role was to ensure technical correctness, stakeholder alignment, and controlled risk — not to become a COBOL developer, but to orchestrate a structured transition.”

 
 
 

©2024 by AeeroTech. Proudly created with Wix.com
