Pods vs Microservices
- Anand Nerurkar
- May 16
- 2 min read
How many services should run per pod?
In Kubernetes, the best practice is to run one service per pod — especially in microservices architecture using Spring Boot. Here's why:
✅ Why One Service Per Pod?
| Reason | Benefit |
| --- | --- |
| Isolation | Each service runs independently, so it is easier to monitor and debug. |
| Scalability | You can scale each service separately based on load. |
| Resilience | One service crashing doesn't affect the others. |
| Resource control | Allocate specific CPU/memory per service. |
| Security & least privilege | Fine-grained network policies, secrets, and role bindings. |
⚙️ What Runs in a Pod Besides the Service?
Each pod can include:
Main container: The Spring Boot microservice.
Sidecars (optional):
Envoy / Istio proxy for service mesh.
FluentBit for logging.
Prometheus exporter for metrics.
Kafka sidecar (if needed).
But still, only one Spring Boot service per pod.
👁️ Example: Pod Layout for a Service
| Pod Name | Containers |
| --- | --- |
| account-service-pod | account-service (main), istio-proxy (sidecar), fluentbit (sidecar) |
| loan-service-pod | loan-service (main), istio-proxy (sidecar) |
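The account-service pod layout above can be sketched as a Kubernetes manifest. This is a minimal illustration, not a production spec: the image names and resource numbers are made-up placeholders, and in practice you would define a Deployment rather than a bare Pod so Kubernetes can manage replicas for you.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: account-service-pod
  labels:
    app: account-service
spec:
  containers:
    # Main container: the one Spring Boot service in this pod
    - name: account-service
      image: myregistry/account-service:1.0   # hypothetical image name
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "250m"
          memory: "512Mi"
        limits:
          cpu: "500m"
          memory: "1Gi"
    # Logging sidecar
    - name: fluentbit
      image: fluent/fluent-bit:latest
  # Note: the istio-proxy sidecar is normally injected automatically
  # by Istio's sidecar injector, so it is not declared here.
```

Notice that per-container resource requests/limits are what make the "resource control" benefit in the earlier table concrete: each service gets its own CPU and memory budget.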
💡 Summary
✅ One Spring Boot service per pod is the industry standard for cloud-native microservices architecture.
If you're running 20 microservices, each with 4 replicas, you'll have:
20 × 4 = 80 pods running Spring Boot containers
Plus supporting infrastructure pods (Kafka, Istio, Zookeeper, monitoring, etc.)
What do you mean by "services with burst traffic"?
"Services with burst traffic" refers to microservices that experience sudden spikes in user requests or data processing load, often unpredictably or at specific peak times. These bursts are short-lived but high in intensity, and the system must handle them gracefully without crashing, slowing down, or timing out.
🔍 Examples in a Personal Banking System:
| Service | Example of Burst Scenario |
| --- | --- |
| 🔐 Auth Service | Many users logging in during office hours or on salary day |
| 💸 Fund Transfer | Festival days, salary credit day, or payment deadlines |
| 🧾 Statement Download | End of month/year, when users download bank statements |
| 📥 Loan Application | Marketing campaigns or interest rate drop announcements |
| 📤 Notification Service | Bulk notifications for updates, offers, or system alerts |
🧠 Characteristics of Burst Traffic Services:
Short duration but high throughput
Often occurs at predictable times (but not always)
Can overwhelm resources if not scaled dynamically
Usually backed by autoscaling (e.g., Kubernetes HPA or KEDA)
🚀 How to Handle Burst Traffic:
Horizontal Pod Autoscaler (HPA):
Increases/decreases replicas based on CPU, memory, or custom metrics.
Example: Scale from 2 to 10 pods during high login volume.
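The HPA scenario above (2 to 10 pods on high load) can be expressed as a manifest like the following. This is a sketch: the Deployment name and the 70% CPU target are assumptions for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: auth-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: auth-service        # hypothetical Deployment name
  minReplicas: 2              # baseline capacity
  maxReplicas: 10             # ceiling during a login burst
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when avg CPU exceeds 70%
```

For the HPA to compute utilization, the target pods must declare CPU resource requests.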
Kafka Backpressure Buffering:
For burst in event-driven services (e.g., fraud detection), Kafka can absorb high loads and allow consumers to process at their own pace.
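For Kafka-backed services, scaling on consumer lag rather than CPU often tracks bursts better; this is what KEDA (mentioned above) provides. A sketch using KEDA's Kafka scaler, with hypothetical topic, consumer group, and threshold values:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: fraud-detection-scaler
spec:
  scaleTargetRef:
    name: fraud-detection-service   # hypothetical Deployment name
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: fraud-detection
        topic: transactions
        lagThreshold: "100"   # add a replica per ~100 messages of lag
```

Because Kafka buffers the burst, consumers can scale out and drain the backlog at their own pace instead of dropping requests.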
Queue-based Async Processing:
Decouple the user action from processing (e.g., the user clicks "Transfer," but the actual processing goes to a queue).
Node Autoscaling:
AKS cluster auto-scales node pools based on pod resource demand.
Rate Limiting and Circuit Breakers:
Protect downstream services from overload during a burst.
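Since the post assumes an Istio service mesh, circuit breaking can be configured declaratively in a DestinationRule rather than in application code. A minimal sketch (the host name and thresholds are illustrative assumptions):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: loan-service
spec:
  host: loan-service        # hypothetical service name
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100   # cap queued requests during a burst
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5    # eject a pod after 5 consecutive 5xx errors
      interval: 30s
      baseEjectionTime: 60s
```

An application-level alternative is a library such as Resilience4j inside the Spring Boot service itself; the mesh approach keeps this policy out of the code.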
✅ Summary
"Services with burst traffic" need to be designed to scale quickly and recover gracefully. In banking, this is essential for services tied to customer interaction and money movement, which can become hotspots during specific times. Scaling, buffering, and async designs are key tools to handle it efficiently.