Multi-tenant SaaS scenario
- Anand Nerurkar
- May 13
- 17 min read
Updated: May 19
How do you architect a scalable multi-tenant SaaS platform?
Architecting a scalable multi-tenant SaaS platform involves making design decisions that balance tenant isolation, scalability, security, cost-efficiency, and operational manageability. Here's a detailed breakdown across key architectural dimensions:
🔷 1. Core Design Considerations
Area | Options / Notes |
Tenant Isolation | 🔹 Shared everything (cheapest) 🔹 Shared app, separate DB (balanced) 🔹 Separate app & DB per tenant (most isolation) |
Scalability | Stateless services, horizontal scaling using Kubernetes (K8s), autoscaling, service mesh |
Extensibility | Plugin-based or feature-flag driven architecture |
Observability | Tenant-aware logging (ELK), monitoring (Prometheus/Grafana), tracing (Jaeger/OpenTelemetry) |
🔷 2. High-Level Architecture Diagram (Textual View)
markdown
┌───────────────┐    ┌────────────────────────┐
│ Load Balancer │───▶│      API Gateway       │
└───────────────┘    └───────────┬────────────┘
                                 │
        ┌────────────────────────┼────────────────────────┐
        ▼                        ▼                        ▼
    ┌────────┐             ┌────────────┐           ┌────────────┐
    │  Auth  │             │ Tenant Mgr │           │ Config Mgr │
    └────────┘             └────────────┘           └────────────┘
             ┌────────────────────────────────────────┐
             │      Microservices (Domain-wise)       │
             │ e.g., User, Billing, Notification, etc.│
             └────────────────────────────────────────┘
                  ▼              ▼              ▼
            ┌────────────┐   ┌────────┐     ┌────────┐
            │ DB Shards  │   │ Redis  │     │ Kafka  │
            └────────────┘   └────────┘     └────────┘
Observability: ELK | Prometheus | Grafana | Tracing
CI/CD: GitHub Actions / Azure DevOps / Jenkins
Platform: Kubernetes + Service Mesh (e.g., Istio)
🔷 3. Key Architecture Components
✅ 1. API Gateway
Performs routing, authentication, rate limiting
Tenant ID inferred from:
Subdomain (e.g., tenant1.app.com)
Custom headers / JWT claims
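As a rough illustration of the subdomain option, the sketch below pulls a tenant label out of a Host value such as tenant1.app.com. The class name SubdomainTenantResolver and the base-domain parameter are assumptions made for this example, not part of any gateway product; a real gateway would usually do this in its own routing rules.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: resolves a tenant ID from the Host header. */
final class SubdomainTenantResolver {
    private final String baseDomain; // assumed platform domain, e.g. "app.com"

    SubdomainTenantResolver(String baseDomain) {
        this.baseDomain = baseDomain.toLowerCase(Locale.ROOT);
    }

    /** Returns the tenant label for hosts like "tenant1.app.com", otherwise empty. */
    Optional<String> resolve(String host) {
        if (host == null) {
            return Optional.empty();
        }
        String h = host.toLowerCase(Locale.ROOT);
        int port = h.indexOf(':');
        if (port >= 0) {
            h = h.substring(0, port); // drop a ":8443" style port suffix
        }
        String suffix = "." + baseDomain;
        if (!h.endsWith(suffix)) {
            return Optional.empty(); // not a host under the platform domain
        }
        String label = h.substring(0, h.length() - suffix.length());
        // Accept a single DNS label only, so "a.b.app.com" is rejected.
        if (!label.matches("[a-z0-9]([a-z0-9-]*[a-z0-9])?")) {
            return Optional.empty();
        }
        return Optional.of(label);
    }
}
```

The resolved label is only a lookup key. It should still be checked against the tenant registry and against the tenant claim in the caller's token before it is trusted.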
✅ 2. Authentication & Authorization
Identity Provider (IdP) with OAuth2/OIDC (e.g., Auth0, Azure AD)
Multi-tenant RBAC/ABAC at tenant and user levels
✅ 3. Tenant Management Service
Manages tenant lifecycle: provisioning, onboarding, billing plan, metadata
Stores tenant configs in DB or centralized config store (e.g., Consul)
✅ 4. Data Isolation Strategies
Strategy | Description | Use Case |
Shared DB, tenant ID column | Most efficient, low isolation | SMBs |
Separate schema per tenant | Balance of isolation & efficiency | Mid-size |
Separate DB per tenant | High security/compliance needs | Enterprise |
✅ 5. Configuration Management
Centralized dynamic configuration per tenant
Feature flags per tenant (e.g., LaunchDarkly)
✅ 6. CI/CD
GitOps with tenant-aware deployment pipelines
Canary or blue/green deployments
🔷 4. Scalability & Performance
Stateless microservices with Kubernetes (AKS/EKS/GKE)
Horizontal Pod Autoscaler (HPA) & Cluster Autoscaler
Redis / Memcached for caching frequently used tenant-specific data
Kafka/NATS for asynchronous processing
🔷 5. Security
Network Security: VPC isolation, NSGs, API Gateway policies
Data Security:
Tenant-aware encryption at rest and in transit
Row-level security (RLS) or separate key per tenant
App Security:
Secure SDLC practices, automated scans, WAF
🔷 6. Monitoring & Observability
Logs indexed with tenant_id using ELK or Loki
Dashboards per tenant using Grafana
Distributed tracing with tenant context
Alerting per tenant SLA thresholds
🔷 7. Billing & Metering (Optional)
Usage metering per tenant (API calls, DB storage, active users)
Integration with billing platforms (Stripe, Razorpay)
🔷 8. Disaster Recovery & High Availability
Regional HA with load balancers & geo-replication
Backups per tenant (retention based on plan)
DR drills automated via Terraform/Ansible scripts
🔷 9. Tenant Onboarding Flow
mermaid
sequenceDiagram
Customer ->> SaaS Portal: Sign Up
SaaS Portal ->> Tenant Mgr: Provision Tenant
Tenant Mgr ->> Config Store: Add Tenant Config
Tenant Mgr ->> DB: Create Schema or DB
Tenant Mgr ->> Email Service: Send Welcome Email
What are SMBs?
SMB stands for Small and Medium-sized Business.
These are companies that typically have:
Small to moderate revenue
Limited IT resources
Smaller customer base or user volume compared to large enterprises
Why is this important in SaaS architecture?
In multi-tenant SaaS platforms:
SMBs are more likely to accept shared infrastructure (e.g., shared database with tenant ID column) to reduce cost.
They usually have lower security/isolation demands than enterprises.
They often prefer plug-and-play features with minimal customization.
So, if you're targeting SMBs, the architecture should prioritize:
Cost-efficiency
Faster onboarding
Simple tenant isolation
Scalable shared resources
In contrast, enterprises often demand:
High data isolation
Custom integrations
Compliance (like GDPR, HIPAA, etc.)
What are the trade-offs between isolation, cost, and complexity?
In a multi-tenant SaaS architecture, choosing the right level of tenant isolation is a critical decision that involves trade-offs between isolation, cost, and complexity. Here's a detailed explanation with a comparison table and examples:
🔄 Trade-offs: Isolation vs Cost vs Complexity
Isolation Model | Data Isolation | Operational Complexity | Cost Efficiency | Use Case Suitability |
Shared DB, Shared Schema | ❌ Low | ✅ Low | ✅ High | SMBs, Startups, MVPs |
Shared DB, Separate Schema | ⚠️ Medium | ⚠️ Medium | ⚠️ Medium | Growing SaaS with moderate demands |
Dedicated DB per Tenant | ✅ High | ❌ High | ❌ Low | Enterprises, Regulated industries |
Dedicated Infrastructure (App + DB) | ✅✅ Very High | ❌❌ Very High | ❌❌ Very Low | Premium tenants, strong SLAs, GovTech |
🔷 1. Shared Database (Single Schema)
Isolation: Minimal (tenants share tables, separated by a tenant_id column)
Cost: Very low (common compute/storage)
Complexity: Minimal (one schema to maintain)
✅ Best For: SMBs, MVPs, low-security environments
❌ Risk: One bug or bad query can leak data between tenants
🔷 2. Shared Database, Separate Schemas
Isolation: Better than shared schema; each tenant has its own schema
Cost: Medium (same DB instance, more schema management)
Complexity: Moderate (versioning, migrations, data lifecycle)
✅ Best For: Mid-size tenants, balance of cost and security
❌ Risk: Still some shared resources (I/O, CPU), can't fully isolate noisy tenants
🔷 3. Dedicated Database per Tenant
Isolation: Strong (data-level and sometimes performance isolation)
Cost: Higher (storage, backups, connections)
Complexity: High (automation of provisioning, updates, backups)
✅ Best For: Enterprise clients, compliance-heavy sectors (banking, healthcare)
❌ Trade-off: Cost increases with tenant scale; complex DB lifecycle management
🔷 4. Dedicated Infrastructure per Tenant
Isolation: Complete (compute, data, and networking isolation)
Cost: Very high (infra replicated for each tenant)
Complexity: Very high (DevOps, CI/CD, monitoring per tenant)
✅ Best For: Strategic/regulated clients with strict SLAs, tenant-specific customization
❌ Trade-off: Not scalable for 100s or 1000s of tenants
🎯 How to Choose?
Criteria | Recommendation |
Number of tenants | >100 → shared DB; <10 → dedicated DB |
Tenant size variance | Wide variance → hybrid model |
Compliance requirements | High → dedicated DB or schema |
Operational maturity | Low → shared DB, simple CI/CD |
SLA/customization needed | High → dedicated stack or services |
Monetization model | Tiered plans → map infra accordingly |
💡 Hybrid Strategy (Best of Both)
Many modern SaaS platforms use a hybrid approach:
SMBs on a shared schema, or on schema-per-tenant within a shared DB
Enterprises on dedicated DB or even dedicated instances
Feature flags/configs used to enable/disable premium features
📌 Summary
Isolation ↑ | Complexity ↑ | Cost ↑ |
Shared Schema < Shared DB w/ Schema < Dedicated DB < Dedicated Stack |
The goal is to optimize for your target market, allowing scalability for small tenants and compliance/customization for larger ones.
How do you manage tenant isolation in shared infrastructure?
Managing tenant isolation in shared infrastructure is one of the most critical and challenging aspects of building a multi-tenant SaaS platform, especially when you're aiming to balance cost efficiency, security, and operational simplicity.
Here’s how to effectively manage tenant isolation while using shared infrastructure:
🔐 1. Data Isolation
a. Logical Isolation
Use a tenant_id column in every multi-tenant table.
Enforce isolation at the data access layer (DAL) or through ORM-level filters.
Example: Hibernate filters, or Hibernate's multi-tenancy support used through Spring Data JPA
Apply Row-Level Security (RLS) in the database (e.g., PostgreSQL).
b. Schema-per-Tenant (Optional)
Still uses shared DB, but each tenant gets its own schema.
Allows different indexes, constraints per tenant.
Better for performance isolation than pure shared schema.
🔐 2. Authentication & Authorization
Use JWT tokens or OAuth2 with tenant_id claim embedded.
Validate tenant_id on each request against the token/user context.
Implement tenant-aware RBAC/ABAC:
Restrict data and actions based on roles within a tenant (e.g., Admin, Viewer)
🔐 3. Application-Level Isolation
a. Tenant Context Propagation
Extract tenant_id from subdomain, header, or token.
Use a thread-local or context-aware mechanism (e.g., RequestContextHolder in Spring) to pass tenant_id to services and repositories.
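The thread-local mechanism mentioned above could look roughly like the following plain-Java sketch. The class name TenantContext and its method names are illustrative assumptions; the important part is the finally block, which stops a pooled thread from carrying one tenant's ID into the next request.

```java
import java.util.function.Supplier;

/** Sketch of a thread-bound tenant holder; names are illustrative. */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    static void setCurrentTenant(String tenantId) {
        CURRENT.set(tenantId);
    }

    static String getCurrentTenant() {
        return CURRENT.get();
    }

    static void clear() {
        CURRENT.remove();
    }

    /** Runs work with the tenant bound, restoring the previous value afterwards. */
    static <T> T runAs(String tenantId, Supplier<T> work) {
        String previous = CURRENT.get();
        CURRENT.set(tenantId);
        try {
            return work.get();
        } finally {
            // Always restore, even if work throws, so nothing leaks to the next request.
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }
}
```

A repository could then call TenantContext.getCurrentTenant() when building its query filter. Note that a plain ThreadLocal does not follow work handed to another thread, so async code would need to pass the tenant ID explicitly.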
b. Service Configuration Isolation
Store per-tenant configuration (e.g., feature flags, branding, limits) in a config store like Consul, Redis, or a centralized config DB.
Dynamically load config on each request or cache it with TTL.
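To make the "cache it with TTL" option concrete, here is a minimal in-process sketch. TenantConfigCache, the loader function, and the injectable clock are all assumptions for illustration; in practice the loader would wrap a call to whichever config store is in use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Sketch: caches per-tenant config for a fixed time-to-live. */
final class TenantConfigCache<V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // assumed to call the config store
    private final long ttlMillis;
    private final LongSupplier clock;         // injectable so expiry can be tested

    TenantConfigCache(Function<String, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached config, reloading it once the entry has expired. */
    V get(String tenantId) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(tenantId, (id, old) ->
                (old != null && old.expiresAtMillis > now)
                        ? old
                        : new Entry<V>(loader.apply(id), now + ttlMillis));
        return entry.value;
    }

    /** Drops a tenant's entry, for example after its config was edited. */
    void invalidate(String tenantId) {
        entries.remove(tenantId);
    }
}
```

A production service would typically pass System::currentTimeMillis as the clock. A short TTL trades a little staleness for far fewer config-store calls per request.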
🔐 4. Resource Quotas & Limits
Implement rate limiting and quotas per tenant:
API Gateway or service mesh policies (e.g., Istio, Kong, Azure API Management)
Prevent noisy neighbors from affecting others
Limit DB usage:
Query limits, CPU quotas, memory thresholds
Use read replicas to offload reporting
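Per-tenant throttling is normally configured in the gateway or mesh, but the underlying idea can be sketched in a few lines. The example below is a simple fixed-window counter per tenant; the class name, the limits, and the injectable clock are assumptions, and real gateways often use token-bucket or sliding-window variants instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Sketch: fixed-window request counter kept separately for each tenant. */
final class TenantRateLimiter {
    private static final class Window {
        long startMillis;
        int count;

        Window(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();
    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock;

    TenantRateLimiter(int maxRequests, long windowMillis, LongSupplier clock) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if the tenant may proceed, false if it is over its quota. */
    boolean tryAcquire(String tenantId) {
        long now = clock.getAsLong();
        Window w = windows.computeIfAbsent(tenantId, id -> new Window(now));
        synchronized (w) {
            if (now - w.startMillis >= windowMillis) {
                // A new window has begun for this tenant: reset its counter.
                w.startMillis = now;
                w.count = 0;
            }
            if (w.count >= maxRequests) {
                return false;
            }
            w.count++;
            return true;
        }
    }
}
```

Because each tenant has its own counter, a noisy tenant exhausts only its own quota. In a multi-instance deployment the counters would have to live in a shared store such as Redis rather than in process memory.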
🔐 5. Logging, Monitoring, and Auditing
Tag all logs, metrics, and traces with tenant_id.
Centralized log management: ELK, Loki, or Azure Monitor
Enable per-tenant dashboards and alerts (Grafana, Datadog)
🔐 6. Feature Isolation
Use feature flags per tenant (e.g., LaunchDarkly, Unleash)
Allows gradual rollout, A/B testing, or premium-only features
🔐 7. Deployment and Code Isolation
Even in shared infra, maintain tenant-aware CI/CD:
Feature toggles, config overrides, or tenant-specific builds if needed
Use canary releases or blue/green deployments to limit risk by tenant segment
🔐 8. Security Isolation
Encrypt data at rest and in transit.
Use tenant-specific encryption keys (e.g., AWS KMS, Azure Key Vault)
Validate request source to ensure requests can't be spoofed across tenants.
🔐 9. Tenant Onboarding Automation
Automate:
Schema creation
Config generation
Role provisioning
Use Infrastructure as Code (Terraform, Helm) to deploy per-tenant resources where needed
🔄 Example: Tenant-Aware Middleware in Spring Boot
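java
// Imports assume Spring Boot 3 (jakarta.*); Spring Boot 2 uses javax.servlet.* instead.
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.HandlerInterceptor;

public class TenantInterceptor implements HandlerInterceptor {

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // Bind the tenant ID from the request header to the current thread.
        String tenantId = request.getHeader("X-Tenant-ID");
        TenantContext.setCurrentTenant(tenantId);
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        // Assumes TenantContext exposes clear(); prevents leaking the tenant to the next request on a pooled thread.
        TenantContext.clear();
    }
}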
🧠 Summary Table
Layer | Strategy |
Data | tenant_id, RLS, schema-per-tenant |
AuthN/AuthZ | JWT with tenant context |
App Logic | Tenant context + tenant-aware filters |
Resources | Quotas, throttling per tenant |
Config | Tenant-specific feature/config stores |
Logs/Monitoring | Tenant-tagged logs, metrics, alerts |
Security | Encryption, token validation, rate limits |
CI/CD | Config-driven deployments, feature toggles |
Do we maintain a GitHub repo per service per tenant in a multi-tenant SaaS?
In a multi-tenant SaaS architecture, you generally do NOT maintain a separate GitHub repository per service per tenant — that would be unscalable and operationally inefficient. Here's how it’s typically handled, along with when exceptions might apply:
✅ Best Practice: Single Codebase, Multi-Tenant Aware
✔️ One GitHub repo per microservice
All tenants share the same service logic
Tenant-specific behavior is handled through:
Configuration
Feature flags
Tenant-specific overrides (in DB, config store, or conditional logic)
✔️ Benefits:
Easier maintenance and updates
Single CI/CD pipeline
Faster onboarding of new tenants
Lower operational complexity
🔧 How to Handle Tenant-Specific Needs
Tenant Requirement | Strategy |
Custom config (limits, themes) | Config store (e.g., Consul, Spring Cloud Config) keyed by tenant_id |
Feature enablement | Feature flags (e.g., LaunchDarkly, Unleash) |
Custom branding or logic | Theming engine + tenant-aware conditional logic |
Isolated data | tenant_id in DB or separate schema/DB |
❗ When You Might Need Separate Repos
These are edge cases, typically for enterprise or premium tenants:
Scenario | Reason |
Tenant-specific logic/integration | e.g., regulatory customization, bank-specific workflows |
On-premise deployment | Code fork per tenant if SaaS is delivered on-prem |
Full white-labeled product | Custom branding, compliance, or business logic that significantly diverges |
Strict compliance/legal separation | When legal contracts demand complete codebase separation |
Even then, you would:
Use a private fork from the main repo (not a totally separate repo from scratch)
Automate sync from the upstream codebase
Clearly document deltas
🔄 Recommended GitHub Repo Strategy
plaintext
microservices/
├── user-service/
│ ├── src/
│ ├── config/
│ └── Dockerfile
├── payment-service/
│ ├── src/
│ ├── config/
│ └── Dockerfile
├── onboarding-service/
│ ├── src/
│ ├── config/
│ └── Dockerfile
Config per tenant stored in separate folders or in external config store (like Spring Cloud Config, S3, etc.)
configs/
├── tenantA/
│ └── application-tenantA.yml
├── tenantB/
│ └── application-tenantB.yml
🧠 Summary
Aspect | Recommended Approach |
GitHub Repo Structure | ✅ One repo per service |
Tenant Logic Handling | ✅ Config + flags + conditional logic |
Separate Repo Per Tenant | ❌ Avoid unless business case demands it |
Customization Strategy | ✅ Use modular design + config-driven behavior |
Here's a text-based flow showing how GitHub repos, tenant logic, and configuration work together in a CI/CD pipeline for a multi-tenant SaaS platform:
✅ 1. Code Repository (GitHub)
Structure:
plaintext
├── user-service/
├── payment-service/
├── notification-service/
└── onboarding-service/
Each microservice repo contains:
Shared logic for all tenants
Tenant-aware code paths (conditioned by tenant ID)
External config integration logic (to fetch per-tenant config)
✅ 2. Configuration Repository / Store
GitOps or external config store (e.g., Spring Cloud Config, S3, Vault, Consul)
Structure:
plaintext
configs/
├── tenantA/
│ └── application-tenantA.yml
├── tenantB/
│ └── application-tenantB.yml
└── tenantC/
└── application-tenantC.yml
Each file includes:
Rate limits, quotas
Branding, localization
Feature flags
External integration endpoints
✅ 3. CI Pipeline (Triggered by GitHub Commits)
Tools: GitHub Actions / Jenkins / Azure DevOps / CircleCI
plaintext
[Push to GitHub Repo] ──▶ [Run CI Pipeline]
├── Run tests
├── Build Docker image
└── Push to container registry
Shared image for all tenants with logic to fetch per-tenant config at runtime
✅ 4. CD Pipeline (Triggered After CI or on Config Change)
Tools: ArgoCD / Spinnaker / Azure DevOps / Helm
plaintext
[New Docker Image or Config Change] ──▶ [CD Pipeline]
├── Pull tenant-specific config
├── Template Helm/Kustomize manifests
├── Deploy to AKS/EKS/GKE clusters
└── Route to correct tenant subdomain
✅ 5. Runtime Behavior (Tenant Request)
Request comes in via:
https://tenantA.myapp.com/api/user
API Gateway parses tenant ID from:
Subdomain (tenantA)
Header (X-Tenant-ID)
Auth token (tenant_id claim)
Service logic does:
Load config for tenantA (from external config store)
Enable/disable features
Enforce rate limits, roles
Execute tenant-specific logic if needed
✅ 6. Monitoring and Observability
Logs, traces, and metrics tagged with tenant_id
Dashboards show:
Per-tenant API usage
Error rates
Resource consumption
🧠 Summary Flow
[GitHub Repo (shared logic)]
│
▼
[CI Pipeline → Docker Image]
│
▼
[CD Pipeline → Per-tenant Config Applied]
│
▼
[Runtime: Tenant Request Handling]
│
├── Identify tenant from request
├── Load tenant config
├── Apply business rules, flags
└── Execute logic securely in shared infra
✅ What Is Tenant-Specific Deployment?
Tenant-specific deployment means tailoring part of the deployment lifecycle (configs, resources, or even code behavior) based on individual tenants, even though most of the platform is shared.
🔧 When to Use Tenant-Specific Deployment?
Use Case | Action |
Feature toggling by tenant | Use config-based deployment with shared image |
Tenant-specific branding or limits | Use external config store or Helm templating |
Tenant-level integration or secrets | Inject via env variables or mounted secrets |
Enterprise tenant with SLA isolation | Deploy isolated instances (namespace/cluster) |
Premium tenant with custom code | Optional: fork or overlay service at build-time |
🏗️ Deployment Models
Model | Description | Example |
Shared Deployment | All tenants use the same services and infra | Most SaaS SMB tenants |
Config-Driven Deployment | One deployment, but each tenant has its own config | Separate values.yaml |
Namespace per Tenant | Each tenant runs in a separate Kubernetes namespace | Logical isolation |
Cluster per Tenant | Entire infra stack replicated per tenant (very rare) | High-compliance use cases |
🚀 Tenant-Specific Deployment Flow (Kubernetes + Helm)
Folder Structure:
plaintext
tenant-deployments/
├── tenantA/
│ └── values-tenantA.yaml
├── tenantB/
│ └── values-tenantB.yaml
└── helm-chart/
├── templates/
└── Chart.yaml
Sample values-tenantA.yaml
yaml
tenantId: tenantA
replicaCount: 2
resources:
  limits:
    cpu: "500m"
    memory: "512Mi"
featureFlags:
  newDashboard: true
branding:
  logoUrl: "https://cdn.tenantA.com/logo.png"
env:
  EXTERNAL_API_URL: "https://api.tenantA.com"
Helm Deployment Command
bash
helm upgrade --install tenant-a-user-svc ./helm-chart \
-f ./tenant-deployments/tenantA/values-tenantA.yaml \
--namespace tenant-a
🔐 Injecting Tenant-Specific Secrets
Use Kubernetes secrets or external secret managers:
Option 1: Kubernetes Secret
yaml
apiVersion: v1
kind: Secret
metadata:
  name: tenant-a-secret
  namespace: tenant-a
type: Opaque
data:
  db-password: <base64-encoded-password>
Option 2: External (e.g., Azure Key Vault, HashiCorp Vault)
Use CSI Driver or Vault Agent Injector to mount secrets dynamically.
⚙️ Automating with CI/CD (GitHub Actions / Azure DevOps / ArgoCD)
Example GitHub Actions Matrix Deploy:
yaml
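jobs:
  deploy:
    strategy:
      matrix:
        # Helm release names and Kubernetes namespaces must be lowercase,
        # so each tenant folder is paired with an assumed lowercase namespace.
        include:
          - tenant: tenantA
            namespace: tenant-a
          - tenant: tenantB
            namespace: tenant-b
          - tenant: tenantC
            namespace: tenant-c
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy Helm Chart
        run: |
          helm upgrade --install ${{ matrix.namespace }}-svc ./helm-chart \
            -f ./tenant-deployments/${{ matrix.tenant }}/values-${{ matrix.tenant }}.yaml \
            --namespace ${{ matrix.namespace }}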
📊 Observability and Tenant Isolation
Namespace: Logical isolation of metrics, logs
Service Mesh: Rate limits, retries, mTLS per tenant
Prometheus/Grafana: Dashboards per namespace or tenant_id label
Alerting: Per-tenant thresholds (e.g., error rate, API latency)
🧠 Summary
Area | Approach |
Codebase | Shared across tenants |
Config | Per-tenant values.yaml or config store |
Deployment | Helm/Kustomize templating per tenant |
Secrets | Kubernetes Secrets or external secret manager |
Namespace Isolation | Optional for SLAs and governance |
CI/CD Automation | Matrix deploy or pipeline per tenant |
What approach would you use for feature flagging in a multi-tenant platform?
For feature flagging in a multi-tenant SaaS platform, you need a strategy that enables:
Granular control (per tenant, per user, per segment)
Safe rollout (canary releases, A/B testing)
Dynamic toggling (without redeployments)
Auditability & traceability (for compliance, debugging)
✅ Recommended Approach: External Feature Flag Management
Component | Description |
Feature Flag Service | Central service to evaluate flags dynamically (e.g., LaunchDarkly, Unleash) |
Tenant-aware Logic | Flags are evaluated based on tenant_id, user_id, etc. |
Rollout Strategy | Gradual % rollout, custom rules, cohorts, or subscription tiers |
SDK Integration | Flags fetched at runtime using lightweight SDKs |
🏗️ Architecture Overview
plaintext
    +-------------------+
    | SaaS Application  |
    +-------------------+
              |
              v
  [ tenant_id | user_id ]
              |
              v
+--------------------------+
| Feature Flag Service API |
+--------------------------+
              |
              v
Evaluates flags via rules:
  - tenant_id == "icici"
  - plan == "premium"
  - region == "EU"
🛠️ Options for Feature Flag Platforms
Tool | Highlights | License |
LaunchDarkly | SaaS, SDKs in all major languages, audit logs | Commercial |
Unleash | Open-source, self-hosted, supports strategies | Open Source |
Flagsmith | Hosted & OSS, multivariate flags | Both |
Togglz | Java-based, Spring Boot integration | Open Source |
Custom | Use DB or config service with custom logic | Build effort |
🎛️ How to Define and Use Flags
Example: Flag Definition
json
{
  "flagKey": "new-dashboard",
  "enabled": true,
  "rules": [
    { "condition": "tenant_id == 'tenantA'", "enabled": true },
    { "condition": "plan == 'enterprise'", "enabled": true }
  ]
}
In Code (Pseudocode):
java
if (featureFlagService.isEnabled("new-dashboard", tenantId, userId)) {
    showNewDashboard();
} else {
    showLegacyDashboard();
}
🚀 Feature Rollout Strategies
Strategy | Description | Example |
Per tenant | Enable feature for tenantA only | Early feedback from premium clients |
Percentage rollout | Gradually roll out to % of users | Canary deployment |
Plan-based | Enable for "enterprise" plan only | Monetization differentiation |
Region-based | Enable only for EU customers | Compliance testing |
Custom rules | Based on internal metrics, date, user type | Advanced experiments |
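The per-tenant, plan-based, and percentage strategies in the table can be combined in one evaluation step. The sketch below is a hand-rolled illustration, not how LaunchDarkly or Unleash implement it: RolloutFlag and its fields are made-up names, and CRC32 is just one convenient way to give each user a stable bucket.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.zip.CRC32;

/** Sketch of a flag with tenant, plan, and percentage rules. */
final class RolloutFlag {
    private final String flagKey;
    private final Set<String> enabledTenants;
    private final Set<String> enabledPlans;
    private final int rolloutPercent; // 0..100, applied to everyone else

    RolloutFlag(String flagKey, Set<String> enabledTenants, Set<String> enabledPlans, int rolloutPercent) {
        this.flagKey = flagKey;
        this.enabledTenants = enabledTenants;
        this.enabledPlans = enabledPlans;
        this.rolloutPercent = rolloutPercent;
    }

    boolean isEnabled(String tenantId, String plan, String userId) {
        // Explicit tenant and plan rules win over the percentage rollout.
        if (enabledTenants.contains(tenantId) || enabledPlans.contains(plan)) {
            return true;
        }
        return bucket(userId) < rolloutPercent;
    }

    /** Maps a user to a stable bucket in the range 0..99 for this flag. */
    int bucket(String userId) {
        CRC32 crc = new CRC32();
        // Mixing in the flag key gives each flag its own independent split of users.
        crc.update((flagKey + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }
}
```

Hashing instead of random sampling means the same user gets the same answer on every request, so raising the percentage only ever adds users to the enabled group.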
🔐 Multi-Tenant Considerations
Concern | Strategy |
Isolation | Flag rules scoped per tenant_id |
Performance | Use local caching + periodic polling for flag updates |
Security | Don’t expose all tenant flags in client SDKs |
Auditing | Keep a change log of flag toggles per tenant |
⚙️ Deployment & CI/CD Integration
Keep flag logic outside the code deployments
CI/CD can tag or label flags during a release
Use feature flags for blue/green, A/B testing, and rollback
🧠 Summary
Area | Best Practice |
Platform | Use LaunchDarkly, Unleash, or Flagsmith |
Scoping | Per tenant, plan, region, or user |
Integration | SDK or API in backend services |
Config management | Flags fetched dynamically, cached locally |
Audit and control | Track changes, enable rollback |
How do you ensure tenant-aware authentication and authorization?
Ensuring tenant-aware authentication and authorization in a multi-tenant SaaS platform is critical for security, isolation, and custom access control. It requires identifying which tenant a user belongs to, enforcing boundaries between tenants, and applying tenant-specific permissions.
🔐 1. Tenant-Aware Authentication
✅ Goals:
Identify the user and the tenant context
Prevent cross-tenant access
Support different identity providers per tenant (optional for B2B SaaS)
🔑 Key Approaches:
Approach | How it works | Notes |
Subdomain-based | tenantA.myapp.com → infer tenant ID from hostname | Simple and intuitive |
Header-based | X-Tenant-ID header passed with each request | Common for internal APIs |
Token-based | JWT contains tenant_id claim | Best practice with OAuth2/OIDC |
Path-based | /api/tenantA/users → extract tenant from path | Used in REST APIs |
🔐 Example: JWT Payload
json
{
  "sub": "user123",
  "tenant_id": "tenantA",
  "roles": ["admin", "approver"],
  "email": "admin@tenantA.com"
}
✅ Auth Services & Tools:
Identity Provider | Features |
Auth0 | Multi-tenant SSO, tenant-specific branding |
Keycloak | Realms per tenant, custom flows |
Azure AD B2C | Tenant orgs, federation, MFA |
Okta | Tenant-aware login flows, user management |
🔒 2. Tenant-Aware Authorization
Once authenticated, apply fine-grained access control:
💡 Principles:
Use RBAC or ABAC scoped to the tenant
Always check if the resource belongs to the requesting tenant
Never trust the tenant context from the client — validate on server
✅ Authorization Techniques:
Layer | What to check | Example |
App/API | tenant_id in request matches JWT/DB record | loan.tenant_id == user.tenant_id |
DB Layer | Row-level security or filters | WHERE tenant_id = :tenant_id |
UI Layer | Disable/hide UI actions not allowed | Hide buttons/menus based on tenant role |
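The App/API check in the table (loan.tenant_id == user.tenant_id) can be centralized in one small guard so individual handlers cannot forget it. This is only a sketch: TenantGuard and CrossTenantAccessException are hypothetical names, and the tenant ID passed as the first argument is assumed to come from an already-verified token, never from a client-supplied field.

```java
/** Sketch of an API-layer ownership check. */
final class TenantGuard {

    /** Thrown when a caller touches a resource owned by another tenant. */
    static final class CrossTenantAccessException extends RuntimeException {
        CrossTenantAccessException(String message) {
            super(message);
        }
    }

    private TenantGuard() {}

    /**
     * Compares the tenant from the verified token with the tenant that owns
     * the loaded resource, and fails closed if either is missing or different.
     */
    static void requireSameTenant(String tokenTenantId, String resourceTenantId) {
        if (tokenTenantId == null || !tokenTenantId.equals(resourceTenantId)) {
            throw new CrossTenantAccessException("resource does not belong to the caller's tenant");
        }
    }
}
```

A handler would load the record, call TenantGuard.requireSameTenant(tokenTenantId, record tenant ID), and map the exception to a 403 or 404 response. The database-level filter remains a second line of defense.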
🔐 Example: DB-Level Isolation
sql
SELECT * FROM transactions WHERE tenant_id = 'tenantA' AND user_id = 'user123';
Or use RLS (Row-Level Security) in PostgreSQL:
sql
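-- Policies only take effect once RLS is enabled on the table.
ALTER TABLE transactions ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_policy ON transactions
  FOR SELECT
  USING (tenant_id = current_setting('app.tenant_id')::uuid);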
🏢 3. Tenant Isolation Models
Model | Description | Trade-off |
Shared DB | All tenants share DB; filter by tenant_id | Cheaper but needs strict checks |
Schema-per-tenant | Separate schema for each tenant | Moderate isolation, harder to scale |
DB-per-tenant | Complete DB isolation | Strongest isolation, costlier |
🛡️ 4. Best Practices for Secure Tenant-Aware Access
Area | Best Practice |
Token design | Always include tenant_id, role, user_id in JWT |
Tenant context | Derive tenant context on server side (not client-trusted) |
Access control | Enforce tenant filters at every layer (UI, API, DB) |
Testing | Add cross-tenant access test cases to catch leakages |
Auditing | Log tenant access events for traceability and compliance |
🧠 Summary
Layer | Tenant-Awareness Strategy |
Authentication | Use JWT with tenant ID, support SSO per tenant |
Authorization | Enforce RBAC/ABAC scoped to tenant |
API & Service | Validate tenant ID on every request/resource |
Database | Filter by tenant ID or use schema/DB isolation |
How do you manage schema upgrades across tenants?
Managing schema upgrades across tenants in a multi-tenant SaaS platform depends on your tenant isolation model (shared DB, schema-per-tenant, or DB-per-tenant) and your deployment strategy (monolith vs microservices, CI/CD, etc.).
Schema changes must be applied safely, consistently, and with minimal downtime.
✅ Key Considerations
Challenge | Mitigation Strategy |
Multi-tenant impact | Version-aware, backward-compatible changes |
Downtime risk | Zero-downtime migrations |
Tenant data isolation | Per-tenant control for schema-per or DB-per models |
Rollback complexity | Version control, idempotent migrations |
Tracking & auditability | Migration logs and audit trails |
🏗️ Isolation Model vs Upgrade Strategy
Model | Upgrade Approach |
Shared DB | Single schema upgrade with tenant-aware data |
Schema-per-tenant | Loop and apply schema changes per schema |
DB-per-tenant | Apply changes per DB (parallelized, staged rollout) |
🛠️ Tooling (Recommended)
Tool | Use Case |
Liquibase | Declarative DB change management, supports multiple schemas |
Flyway | Simple versioned migrations, per-schema/tenant support |
Alembic (Python) | For SQLAlchemy-based apps |
Liquibase Pro | Advanced features: rollbacks, audit trail |
🔁 CI/CD-Driven Upgrade Process
Maintain migration scripts in version control (/migrations)
Tag migration files with versions (e.g. V1.1__add_column.sql)
For shared DB: Run once
For schema-per-tenant or DB-per-tenant:
Loop through each tenant and apply in parallel or batches
Record success/failure per tenant
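The "loop through each tenant and record success/failure" step might be sketched as below. TenantMigrationRunner is an illustrative name, and the migrateOne callback is a placeholder for whatever actually applies the change (for example a Flyway run against that tenant's schema). This version runs sequentially; batching or parallelism could be layered on top.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Sketch: applies one migration step per tenant and records the outcome. */
final class TenantMigrationRunner {

    private TenantMigrationRunner() {}

    /** Returns tenant -> "OK" or "FAILED: message", in the order given. */
    static Map<String, String> migrateAll(List<String> tenants, Consumer<String> migrateOne) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String tenant : tenants) {
            try {
                migrateOne.accept(tenant); // placeholder for the real migration call
                results.put(tenant, "OK");
            } catch (RuntimeException e) {
                // Keep going so one failing tenant does not block the rest.
                results.put(tenant, "FAILED: " + e.getMessage());
            }
        }
        return results;
    }
}
```

The returned map can be written to a migration log or metadata table, which makes it straightforward to retry only the tenants that failed.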
🧪 Example: Flyway Multi-Tenant Upgrade (Schema-per-Tenant)
bash
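TENANTS=("tenant_a" "tenant_b" "tenant_c")
for TENANT in "${TENANTS[@]}"
do
  # Report a failed tenant and continue with the remaining ones.
  flyway -schemas="$TENANT" migrate || echo "Migration failed for $TENANT" >&2
done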
🛡️ Zero-Downtime Schema Upgrade Techniques
Technique | Description |
Add columns, don't drop | Make non-breaking changes first |
Dual-read/write | Write to both old and new structures during cutover |
Toggle via feature flags | Switch new features off until migration is safe |
Backfill asynchronously | Populate new columns in background |
Blue-green or canary | Apply changes gradually |
📊 Schema Version Tracking
Method | Description |
Metadata table | Track per-tenant schema version in DB |
Central registry | Store schema versions in a config database |
Migration logs | Keep upgrade history per tenant |
🧠 Best Practices Summary
Area | Best Practice |
Isolation model | Choose based on scalability and control needs |
Upgrade tools | Use Flyway, Liquibase, or similar |
Testing | Use pre-prod with tenant data snapshots |
Rollback | Plan safe fallback strategies with reversible scripts |
Monitoring | Log and alert on migration failures per tenant |
Governance | Audit who, when, and how changes were applied |
📌 Sample Flow (Schema-per-Tenant, CI/CD-Integrated)
Developer commits schema change to Git
CI pipeline:
Runs unit tests
Validates SQL syntax
Packages migration script
CD pipeline:
Fetch list of tenants (schemas)
Loops through each schema
Applies change using Flyway
Logs result per tenant
How do you handle tenant-specific customizations?
Handling tenant-specific customizations in a multi-tenant SaaS platform is critical for supporting business differentiation without fragmenting your core platform. The goal is to enable customization while keeping the codebase and operational overhead manageable.
✅ Common Types of Tenant-Specific Customizations
Customization Type | Examples |
UI | Branding, color schemes, layouts |
Business Logic | Approval rules, eligibility logic, pricing models |
Feature Access | Role-based modules, feature flags |
Workflow Config | Custom onboarding steps, form fields |
Data Models | Optional fields, reference data |
Integrations | External APIs (e.g., CRM, Payment Gateway) |
Locale/Language | Currency, language, timezone |
🧱 Architectural Patterns for Tenant Customizations
1. Configuration-Driven Customization
Store tenant-specific logic/config in a config DB or JSON/YAML file
Example: onboarding steps, limits, labels
json
{
  "tenantId": "tenantA",
  "config": {
    "approvalFlow": "manager -> compliance",
    "currency": "INR",
    "loanLimit": 500000
  }
}
Spring Boot Approach:
Use @ConfigurationProperties
Load tenant-specific configs at runtime using context
2. Feature Flags per Tenant
Enable/disable features for specific tenants
Tools: Unleash, LaunchDarkly, FF4J, or a custom DB-driven system
yaml
tenantA:
  features:
    kycEnabled: true
    autoApproval: false
3. Theming & Branding
Use dynamic CSS/themes per tenant
Serve static assets (logos, fonts) from tenant-specific buckets (e.g., S3)
4. Pluggable Strategy Pattern (Code-Level Hooks)
Implement tenant-specific logic via @Component-based strategies
java
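// In a real project each public type lives in its own file.
import org.springframework.stereotype.Component;

public interface LoanApprovalStrategy {
    boolean approveLoan(Application app);
}

@Component("tenantA")
public class TenantALoanApproval implements LoanApprovalStrategy {
    @Override
    public boolean approveLoan(Application app) {
        // Custom rule: approve only amounts below 500,000
        return app.amount < 500000;
    }
}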
Use Spring’s ApplicationContext to resolve strategy by tenant ID
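Resolving a strategy by tenant ID usually needs a fallback for tenants that have no custom rule. The framework-free sketch below shows that lookup; ApprovalStrategy and ApprovalStrategyRegistry are simplified stand-ins invented for this example, taking a plain amount rather than an Application object.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a tenant-specific business rule. */
interface ApprovalStrategy {
    boolean approve(long amount);
}

/** Sketch: looks up a tenant's strategy, falling back to a default. */
final class ApprovalStrategyRegistry {
    private final Map<String, ApprovalStrategy> byTenant = new HashMap<>();
    private final ApprovalStrategy fallback;

    ApprovalStrategyRegistry(ApprovalStrategy fallback) {
        this.fallback = fallback;
    }

    void register(String tenantId, ApprovalStrategy strategy) {
        byTenant.put(tenantId, strategy);
    }

    /** Returns the tenant's own strategy, or the default when none is registered. */
    ApprovalStrategy forTenant(String tenantId) {
        return byTenant.getOrDefault(tenantId, fallback);
    }
}
```

In a Spring application the map could plausibly be filled from the container, since Spring can inject a Map of bean name to bean, which would make the @Component names act as tenant keys. Keeping a default strategy avoids a null check in every caller.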
5. Multi-Tenant Data Model
Use optional/JSON fields for tenant extensions (in RDBMS or NoSQL)
Example: PostgreSQL jsonb, MongoDB schema-less
sql
ALTER TABLE customer ADD COLUMN custom_attributes JSONB;
6. Dynamic Workflows (BPM/Workflow Engine)
Use Camunda, Flowable, or Temporal to define workflows per tenant
Example: Tenant A requires additional legal approval step
7. Custom Extensions via Microservices
Allow tenants to plug in custom microservices via contract-driven APIs
Define API schemas and version contracts
Route calls to tenant-specific service endpoints
🔐 Security Considerations
Area | Strategy |
Authorization | Ensure tenant-specific logic doesn’t expose data |
Data access | Use tenant ID filter in DB query (row-level security) |
API exposure | Use RBAC and scopes per tenant in tokens |
🧪 CI/CD Considerations
Area | Practice |
Config management | Use GitOps or centralized config DB |
Test coverage | Maintain tests per tenant configuration |
Rollback | Use feature flag rollback or version fallback |
Observability | Monitor behavior per tenant (e.g., Prometheus + labels) |
🧠 Summary of Approaches
Method | Use Case |
Config-based customization | UI, workflow, rules |
Feature flags | Module-level enable/disable |
Strategy pattern | Business logic overrides |
Workflow engine | Tenant-specific steps |
Theming | Branding, static assets |
API plugins | External integrations per tenant |
How would you onboard a new tenant dynamically?
Onboarding a new tenant dynamically in a multi-tenant SaaS platform means provisioning everything the tenant needs — data isolation, configurations, default users, etc. — without downtime or manual effort.
✅ Key Goals of Dynamic Tenant Onboarding
Goal | Description |
Self-service or API-driven | No manual DB/script steps |
Isolated environment | Data isolation: schema, DB, or row-level |
Configuration-ready | Tenant-specific settings, branding, features |
Scalable & repeatable | Can be automated in CI/CD or backend flows |
Secure | Auth rules, keys, and tokens scoped per tenant |
🏗️ Step-by-Step: Dynamic Tenant Onboarding Flow
1. Tenant Creation Trigger
Via Admin Panel or API call:
http
POST /tenants

{
  "tenantName": "Acme Corp",
  "plan": "enterprise",
  "region": "us-east-1",
  "adminEmail": "admin@acme.com"
}
2. Provisioning Backend Logic (Spring Boot Example)
a. Generate Tenant Metadata
Assign unique ID, tenant code
Persist metadata in tenant_master table
b. Provision Database/Schema
Options:
Schema-per-tenant (Postgres, MySQL)
DB-per-tenant (high isolation)
Row-level (single schema, multi-tenant flag)
Use Flyway/Liquibase to migrate schema:
java
// dataSource: the DataSource of the tenant database (assumed to be injected)
Flyway.configure()
    .dataSource(dataSource)
    .schemas("tenant_acme")
    .locations("classpath:db/migration")
    .load()
    .migrate();
c. Create Default Users & Roles
Insert into users, roles, permissions tables
Hash passwords, assign scopes
d. Set Default Configurations
Save branding, workflows, limits in tenant_config
e. Register Tenant Context
In-memory map, config DB, or distributed cache (Redis)
Useful for tenant routing and dynamic datasource lookup
f. Emit Event
Publish onboarding event to Kafka/NATS for async provisioning (e.g., S3 bucket, email domain)
3. Tenant Context Routing in Runtime
Use a Tenant Resolver to route DB/logic based on request headers or token claims:
http
Authorization: Bearer eyJ...
The decoded JWT payload carries the tenant claim, e.g. { "tenant_id": "acme" }.
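A tenant resolver along these lines could read the claim from the bearer token. This is only a sketch: `TenantResolver` is a hypothetical class, the claim name `tenant_id` follows the example above, the regex-based claim extraction is a shortcut for a real JSON parser, and the token's signature is assumed to have been verified already by the security layer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical resolver; does NOT verify the JWT signature, which must happen upstream. */
class TenantResolver {
    private static final Pattern TENANT_CLAIM =
        Pattern.compile("\"tenant_id\"\\s*:\\s*\"([^\"]+)\"");

    /** Reads the tenant_id claim from an "Authorization: Bearer <jwt>" header value. */
    static Optional<String> fromAuthorizationHeader(String headerValue) {
        if (headerValue == null || !headerValue.startsWith("Bearer ")) {
            return Optional.empty();
        }
        // A JWT has three dot-separated parts: header.payload.signature
        String[] parts = headerValue.substring("Bearer ".length()).split("\\.");
        if (parts.length != 3) {
            return Optional.empty();
        }
        try {
            byte[] decoded = Base64.getUrlDecoder().decode(parts[1]);
            String payload = new String(decoded, StandardCharsets.UTF_8);
            Matcher m = TENANT_CLAIM.matcher(payload);
            return m.find() ? Optional.of(m.group(1)) : Optional.empty();
        } catch (IllegalArgumentException malformedBase64) {
            return Optional.empty();
        }
    }
}
```

Returning an empty `Optional` rather than throwing lets the caller decide whether a missing tenant means "reject with 401" or "fall back to subdomain resolution".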
4. CI/CD Support (Optional)
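Automate tenant onboarding from deploy pipelines by calling the onboarding API (replace <api-host> with your deployment's host):
bash
curl -X POST "https://<api-host>/tenants" \
  -H "Content-Type: application/json" \
  -d @new_tenant.json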
Add Helm chart or Terraform logic if provisioning isolated infra (K8s namespace, blob storage, etc.).
🔐 Security & Access Control
Area | Best Practice |
Authentication | Use tenant_id in JWT or OAuth2 claims |
Authorization | Validate user roles and tenant match |
Resource Access | Tag resources (e.g., files, records) with tenant ID |
Throttling | Apply rate limits per tenant if needed |
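Per-tenant throttling can be approximated with a fixed-window counter keyed by tenant ID. The sketch below is one simple option, not the only one (token buckets or gateway-level limits are common alternatives); `TenantRateLimiter` is a hypothetical class and the clock is passed in explicitly to keep it testable.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical fixed-window limiter: at most N requests per tenant per window. */
class TenantRateLimiter {
    private final int maxRequestsPerWindow;
    private final long windowMillis;
    // tenantId -> [windowStartMillis, requestCount]
    private final Map<String, long[]> windows = new HashMap<>();

    TenantRateLimiter(int maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed, false if the tenant exceeded its quota. */
    synchronized boolean tryAcquire(String tenantId, long nowMillis) {
        long[] state = windows.get(tenantId);
        if (state == null || nowMillis - state[0] >= windowMillis) {
            state = new long[] {nowMillis, 0}; // start a fresh window
            windows.put(tenantId, state);
        }
        if (state[1] >= maxRequestsPerWindow) {
            return false;
        }
        state[1]++;
        return true;
    }
}
```

Because each tenant has its own window, one noisy tenant exhausting its quota does not affect the others.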
🧪 Test Scenarios
Test Case | Description |
New tenant can login | Validate default admin access |
Schema provisioned successfully | DB objects created via Flyway |
Config overrides working | Check branding, limits, workflows |
Wrong tenant access blocked | Prevent cross-tenant data access |
🧠 Summary Table
Component | Strategy |
DB/Schema | Use Flyway to create schema dynamically |
Config | Load defaults from templates or JSON |
Feature Flags | Store flags per tenant in config DB |
Auth | Add tenant ID to claims or token scope |
CI/CD | Automate via pipelines or APIs |
Extensibility | Publish Kafka/NATS event on onboarding |
A Spring Boot code sample for dynamic onboarding? A PostgreSQL schema-per-tenant onboarding script? A Tenant onboarding API spec?
Below are all three, each tailored to dynamically onboarding a new tenant in a Spring Boot + PostgreSQL schema-per-tenant SaaS platform.
✅ 1. Spring Boot Code Sample – Dynamic Tenant Onboarding
java
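import lombok.RequiredArgsConstructor;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/tenants")
@RequiredArgsConstructor
public class TenantController {

    private final TenantService tenantService;

    // POST /tenants: provisions the schema and registers the tenant
    @PostMapping
    public ResponseEntity<String> onboardTenant(@RequestBody TenantRequest request) {
        tenantService.onboardTenant(request);
        return ResponseEntity.ok("Tenant onboarded successfully");
    }
}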
java
import lombok.Data;

@Data
public class TenantRequest {
    private String tenantName;
    private String schemaName;
    private String adminEmail;
}
java
import javax.sql.DataSource;
import lombok.RequiredArgsConstructor;
import org.flywaydb.core.Flyway;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class TenantService {

    private final DataSource defaultDataSource;
    private final JdbcTemplate jdbcTemplate;

    public void onboardTenant(TenantRequest request) {
        String schema = request.getSchemaName();

        // Identifiers cannot be bound as SQL parameters, so validate before concatenating
        if (schema == null || !schema.matches("[a-z_][a-z0-9_]{0,62}")) {
            throw new IllegalArgumentException("Invalid schema name");
        }

        // 1. Create new schema
        jdbcTemplate.execute("CREATE SCHEMA IF NOT EXISTS " + schema);

        // 2. Run Flyway migrations on new schema
        Flyway.configure()
            .dataSource(defaultDataSource)
            .schemas(schema)
            .locations("classpath:db/migration")
            .load()
            .migrate();

        // 3. Register tenant in central metadata table
        jdbcTemplate.update(
            "INSERT INTO tenant_master (tenant_name, schema_name, admin_email, onboarded_at) VALUES (?, ?, ?, now())",
            request.getTenantName(), schema, request.getAdminEmail()
        );

        // 4. (Optional) Send Kafka/NATS event for async provisioning
    }
}
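✅ Each tenant schema gets its own Flyway run, so make sure Spring Boot's default Flyway auto-migration is not also pointed at tenant schemas. With .schemas(schema), Flyway keeps that tenant's schema history table inside the tenant schema.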
✅ 2. PostgreSQL Schema-per-Tenant Onboarding Script
📄 V1__init_schema.sql (placed in resources/db/migration)
sql
-- User table for multi-tenancy (created inside each tenant schema)
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100) NOT NULL,
    password VARCHAR(100) NOT NULL, -- store a password hash, never plaintext
    role VARCHAR(50),
    created_at TIMESTAMP DEFAULT now()
);

-- Add more tenant-specific tables here
Use this to initialize each new tenant schema using Flyway.
✅ Create tenant_master in default schema:
sql
CREATE TABLE IF NOT EXISTS tenant_master (
    id SERIAL PRIMARY KEY,
    tenant_name VARCHAR(100) NOT NULL,
    schema_name VARCHAR(50) NOT NULL,
    admin_email VARCHAR(100),
    onboarded_at TIMESTAMP DEFAULT now()
);
✅ 3. Tenant Onboarding API Spec (OpenAPI/YAML)
yaml
openapi: 3.0.1
info:
  title: Tenant Onboarding API
  version: 1.0.0
paths:
  /tenants:
    post:
      summary: Onboard a new tenant
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/TenantRequest'
      responses:
        '200':
          description: Tenant onboarded successfully
components:
  schemas:
    TenantRequest:
      type: object
      properties:
        tenantName:
          type: string
        schemaName:
          type: string
        adminEmail:
          type: string
🧪 Optional Enhancements
Area | Tool / Pattern |
Auth & tenant claims | Spring Security + JWT |
Config per tenant | Load from DB/Redis on startup |
Multitenant routing | AbstractRoutingDataSource |
Async events | Kafka / NATS |
Testing onboarding | MockMvc + H2 or TestContainers |
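For the multitenant routing row, the piece that usually sits underneath a routing data source is a thread-bound tenant holder. The sketch below shows only that holder in plain Java; `TenantContext` and `callAs` are hypothetical names. In Spring, a subclass of `AbstractRoutingDataSource` would typically return `TenantContext.get()` from `determineCurrentLookupKey()`.

```java
import java.util.function.Supplier;

/** Hypothetical thread-bound holder for the tenant ID of the current request. */
class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String get() {
        return CURRENT.get();
    }

    /** Runs the work with the given tenant bound, then restores the previous value. */
    static <T> T callAs(String tenantId, Supplier<T> work) {
        String previous = CURRENT.get();
        CURRENT.set(tenantId);
        try {
            return work.get();
        } finally {
            // Always clean up: pooled threads would otherwise leak one tenant into the next request
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }
}
```

The `finally` block matters most in servlet containers, where worker threads are reused across requests from different tenants.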
How do you enforce audit logging per tenant?
Enforcing audit logging per tenant in a multi-tenant SaaS platform ensures traceability, compliance, and security — without mixing tenant data. Here’s how you can architect it:
✅ Goals of Per-Tenant Audit Logging
Goal | Why it matters |
Tenant isolation | Avoid data leakage across tenants |
Traceable operations | Who did what, when, and where |
Compliance-ready | Regulatory requirements (e.g., SOC 2, GDPR, RBI, etc.) |
Search & exportability | Easy filtering per tenant |
🔧 Architectural Options
1. Database Table per Tenant (Schema-per-tenant Model)
Each tenant’s schema has its own audit_logs table
Logged via triggers or application service
Example:
sql
-- In tenant_acme schema
CREATE TABLE audit_logs (
    id SERIAL PRIMARY KEY,
    actor VARCHAR(100),
    action VARCHAR(100),
    resource VARCHAR(100),
    timestamp TIMESTAMP DEFAULT now(),
    metadata JSONB
);
➡️ Use Flyway to create this table in every schema during onboarding.
2. Shared Audit Log Table with tenant_id (Row-level Model)
Centralized audit_logs table with tenant ID
Ideal for row-level or hybrid multitenancy
sql
CREATE TABLE audit_logs (
    id SERIAL PRIMARY KEY,
    tenant_id VARCHAR(50),
    actor VARCHAR(100),
    action VARCHAR(100),
    resource VARCHAR(100),
    timestamp TIMESTAMP DEFAULT now(),
    metadata JSONB
);
➡️ Apply row-level filtering via a WHERE tenant_id = ? clause for reads.
3. Logging to Kafka / ELK / Audit Service
Emit structured events from your app:
Topic: audit-logs
Partitioned by tenant_id
Consumers store it in Elasticsearch, MongoDB, or an audit service
json
{
  "tenant_id": "acme",
  "actor": "user1@acme.com",
  "action": "DELETE_ACCOUNT",
  "resource": "/accounts/12345",
  "timestamp": "2025-05-13T12:34:00Z",
  "ip": "10.10.10.5",
  "browser": "Chrome"
}
➡️ Use Kafka together with the ELK stack (Elasticsearch, Logstash, Kibana) for tenant-level dashboards and filtering.
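Partitioning the audit topic by tenant_id works because a keyed record always maps to the same partition, so one tenant's events stay in order. The sketch below only illustrates that idea: `TenantPartitioner` is a hypothetical helper, and it uses `String.hashCode()` for brevity, whereas Kafka's default partitioner hashes the serialized key with murmur2.

```java
/** Illustration only: deterministic key-to-partition mapping (not Kafka's actual hash). */
class TenantPartitioner {
    static int partitionFor(String tenantId, int partitionCount) {
        // floorMod keeps the result in [0, partitionCount) even for negative hash codes
        return Math.floorMod(tenantId.hashCode(), partitionCount);
    }
}
```

In practice this means setting the record key to the tenant ID when producing audit events; the broker-side mapping is handled by the client library.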
🛠 Spring Boot Implementation (Row-level Example)
java
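import jakarta.persistence.Entity; // javax.persistence on Spring Boot 2.x
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.Lob;
import jakarta.persistence.Table;
import java.time.Instant;
import lombok.Data;

@Data // generates the getters/setters used by logAction
@Entity
@Table(name = "audit_logs")
public class AuditLog {
    @Id
    @GeneratedValue
    private Long id;
    private String tenantId;
    private String actor;
    private String action;
    private String resource;
    private Instant timestamp = Instant.now();
    @Lob
    private String metadata; // JSON serialized as text
}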
java
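private final ObjectMapper objectMapper = new ObjectMapper(); // com.fasterxml.jackson.databind

public void logAction(String tenantId, String actor, String action,
                      String resource, Map<String, Object> metadata) {
    AuditLog log = new AuditLog();
    log.setTenantId(tenantId);
    log.setActor(actor);
    log.setAction(action);
    log.setResource(resource);
    try {
        log.setMetadata(objectMapper.writeValueAsString(metadata));
    } catch (JsonProcessingException e) {
        // writeValueAsString throws a checked exception; fail loudly rather than drop the record
        throw new IllegalStateException("Could not serialize audit metadata", e);
    }
    auditLogRepository.save(log);
}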
➡️ Extract tenantId from request header, token claim, or tenant context.
🔐 Security + Governance
Strategy | Detail |
Encrypt logs | Encrypt PII/PHI fields before storing |
Access control | RBAC on audit log APIs/views |
Retention policies | Delete/archive logs based on tenant plan or geography |
Immutable logs | Store audit logs in WORM storage or blockchain-based DB |
✅ Best Practices
Always include:
tenant_id, user_id, action, resource, timestamp, metadata
Use structured logs (JSON) for easy search
Support export or alerting on suspicious actions (e.g., multiple failed logins)
Make logs tenant-visible in the admin UI (with filters)
How do you support tenant lifecycle management?
Supporting tenant lifecycle management in a multi-tenant SaaS platform involves orchestrating the creation, activation, modification, suspension, and deletion of tenants — with strong automation, security, and governance.
🌀 Tenant Lifecycle Stages
Stage | Description |
Onboarding | Provision schema/resources, register metadata, setup default config |
Activation | Enable access, send credentials, set quotas |
Configuration | Customize features, branding, integrations |
Operation | Normal usage, scaling, logging, monitoring |
Upgrade | Apply schema/app updates per tenant or all |
Suspension | Temporarily disable tenant (e.g., for payment issues) |
Deletion | Remove resources/data (with retention/compliance policies) |
✅ Architecture Components for Lifecycle Management
1. Tenant Metadata Registry (Database)
A central table to track all tenants and their statuses.
sql
CREATE TABLE tenant_master (
    tenant_id VARCHAR PRIMARY KEY,
    schema_name VARCHAR,
    status VARCHAR, -- [ACTIVE, SUSPENDED, DELETED]
    plan VARCHAR,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);
2. Lifecycle Management API
Expose endpoints to manage tenant state transitions:
API | Purpose |
POST /tenants | Onboard new tenant |
PUT /tenants/{id}/activate | Activate tenant |
PUT /tenants/{id}/suspend | Suspend tenant |
PUT /tenants/{id}/upgrade | Update plan/features |
DELETE /tenants/{id} | Delete tenant (soft/hard) |
3. Provisioning Pipeline
Trigger internal processes:
Create DB schema (schema-per-tenant)
Run Flyway migrations
Setup default roles/configs
Register in service discovery
Notify admin (email/SMS)
4. Configuration Management
Use config DB, S3, or Git-backed configs
Fetch based on tenant_id
Override defaults for feature flags, branding, SLA
5. Monitoring and Quotas
Per-tenant metrics (CPU, memory, API rate limits)
Auto scale or alert based on plan
Use Prometheus/Grafana with tenant filters
6. Authentication & Access Control
Maintain tenant isolation:
Use tenant_id in JWT tokens
Enforce tenant scoping in backend services
7. Suspension Handling
Mark tenant as SUSPENDED in metadata
Block login & all request handling (e.g., filter/interceptor)
Show appropriate error: 403 – Tenant Suspended
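The check a filter or interceptor would run can be sketched as a small guard that maps tenant status to an HTTP status code. `TenantAccessGuard` and `TenantStatus` are hypothetical names, and returning 404 for unknown or deleted tenants is an assumption of this sketch rather than something prescribed above.

```java
import java.util.HashMap;
import java.util.Map;

enum TenantStatus { ACTIVE, SUSPENDED, DELETED }

/** Hypothetical guard; a servlet filter or Spring interceptor would call check() per request. */
class TenantAccessGuard {
    private final Map<String, TenantStatus> statusByTenant;

    TenantAccessGuard(Map<String, TenantStatus> statusByTenant) {
        this.statusByTenant = new HashMap<>(statusByTenant);
    }

    /** Returns the HTTP status to apply: 200 proceed, 403 suspended, 404 unknown or deleted. */
    int check(String tenantId) {
        TenantStatus status = statusByTenant.get(tenantId);
        if (status == null || status == TenantStatus.DELETED) {
            return 404; // assumption: do not reveal whether a deleted tenant ever existed
        }
        if (status == TenantStatus.SUSPENDED) {
            return 403; // "403 - Tenant Suspended"
        }
        return 200;
    }
}
```

Running this check before authentication logic keeps suspended tenants from consuming downstream resources at all.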
8. Deletion Process
Soft delete first (status = DELETED)
Optionally purge data after retention window
Secure wipe from:
DB (schema drop or row deletion)
Blob storage (S3)
Indexes (Elasticsearch)
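The "purge after retention window" rule reduces to a timestamp comparison. A minimal sketch, assuming the soft-delete time is recorded per tenant and the retention period comes from the tenant's plan (`TenantPurgePolicy` is a hypothetical name):

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical policy: a soft-deleted tenant becomes purgeable once retention has elapsed. */
class TenantPurgePolicy {
    private final Duration retention;

    TenantPurgePolicy(Duration retention) {
        this.retention = retention;
    }

    /** True when the tenant was soft-deleted at least `retention` before `now`. */
    boolean isPurgeable(Instant softDeletedAt, Instant now) {
        return softDeletedAt != null && !now.isBefore(softDeletedAt.plus(retention));
    }
}
```

A scheduled job could evaluate this for every tenant with status = DELETED and only then drop the schema, delete the blobs, and remove the index entries.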
🚦 State Machine (Simplified)
[REGISTERED] --> [ACTIVE] --> [SUSPENDED] --> [ACTIVE] | [DELETED]
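One way to enforce these transitions in code is an enum that knows its allowed successors. The diagram leaves open which states may move to DELETED; this sketch assumes both ACTIVE and SUSPENDED can, and treats DELETED as terminal. `TenantState` and its methods are hypothetical names.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical lifecycle enum; the DELETED transitions are an assumption of this sketch. */
enum TenantState {
    REGISTERED, ACTIVE, SUSPENDED, DELETED;

    Set<TenantState> allowedNext() {
        switch (this) {
            case REGISTERED: return EnumSet.of(ACTIVE);
            case ACTIVE:     return EnumSet.of(SUSPENDED, DELETED);
            case SUSPENDED:  return EnumSet.of(ACTIVE, DELETED);
            default:         return EnumSet.noneOf(TenantState.class); // DELETED is terminal
        }
    }

    /** Returns the new state, or throws if the transition is not allowed. */
    TenantState transitionTo(TenantState next) {
        if (!allowedNext().contains(next)) {
            throw new IllegalStateException(this + " -> " + next + " is not allowed");
        }
        return next;
    }
}
```

Centralizing the rules this way means the lifecycle API endpoints only need to call `transitionTo` and persist the result, instead of each re-implementing its own status checks.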
🔐 Governance & Compliance
Practice | Purpose |
Tenant SLAs | Define RTO, RPO, retention per plan |
GDPR/CCPA compliance | Right to be forgotten → data deletion |
Audit trail | Log all lifecycle changes |
RBAC for tenant admins | Limit what tenant admins can do |
🧠 Bonus Tips
Automate onboarding via CI/CD (e.g., provisioning jobs)
Support white-labeling per tenant (logo, theme, URL)
Use async workflows (e.g., Kafka/NATS) for provisioning
Track versioning: app version, schema version per tenant
Enable feature rollout per tenant (via flags)
What are SAST and DAST tools?
SAST (Static Application Security Testing) and DAST (Dynamic Application Security Testing) are two key types of application security testing tools used to identify vulnerabilities — but they work in different ways and at different stages of the development lifecycle.
🔐 SAST (Static Application Security Testing)
✅ What It Is:
SAST analyzes source code, bytecode, or binaries without executing the application. It scans for security issues like SQL injection, XSS, hardcoded secrets, and insecure APIs at rest, early in the SDLC.
🔍 How It Works:
Scans source code statically (pre-build)
Can be integrated in CI pipelines (shift-left)
Detects vulnerabilities before the code is deployed
🛠 Common SAST Tools:
Tool | Language Support |
SonarQube | Java, C#, JavaScript, etc. |
Checkmarx | Wide range |
Fortify SCA | Enterprise-grade |
Veracode SAST | Cloud-based |
Semgrep | Lightweight, fast |
🧪 DAST (Dynamic Application Security Testing)
✅ What It Is:
DAST scans running applications for vulnerabilities by simulating real attacks in a black-box manner. It doesn't look at the code — instead, it tests the app like a hacker would.
🔍 How It Works:
Executes HTTP requests and analyzes responses
Finds runtime issues like:
Authentication flaws
Server misconfigurations
Some business logic flaws (usually with manual or guided testing)
Broken access control
🛠 Common DAST Tools:
Tool | Key Features |
OWASP ZAP | Free, open source, powerful |
Burp Suite | Manual + automated scanning |
AppScan | Enterprise support |
Veracode DAST | Cloud-based, scalable |
Acunetix | Comprehensive vulnerability scanning |
🔄 SAST vs DAST – Quick Comparison
Feature | SAST | DAST |
When | Early (code phase) | Late (runtime phase) |
What it scans | Source code, binaries | Running application |
Speed | Fast (for small codebases) | Slower (depends on test cases) |
Coverage | Code logic, secrets, patterns | Runtime flaws, auth bugs |
Language-aware | Yes | No (black-box) |
Dev integration | IDE, GitHub, CI/CD | QA/Test environments |
🚧 Best Practice
Use both SAST + DAST for full security coverage:
SAST in CI/CD pipelines → to prevent vulnerable code from being merged
DAST in staging → to validate runtime behavior & detect real-world vulnerabilities