P95/P99
- Anand Nerurkar
- May 6
- 1 min read
⏱️ Latency (P95/P99 Response Time) is a key performance metric used to measure how fast a system or API responds under real-world load — not just on average, but at the “worst-case” tail of user experiences.
📌 Definitions:
- P95 (95th percentile latency): 95% of all requests are faster than this value — the slowest 5% take longer. ➤ A good balance between average performance and worst-case outliers.
- P99 (99th percentile latency): 99% of all requests are faster than this value — only 1% are slower. ➤ Highlights rare but impactful latency spikes.
🧠 Why It Matters:
| Metric | What It Tells You |
| --- | --- |
| Average latency | General performance under normal load |
| P95 latency | How most users experience your app |
| P99 latency | Worst-case response time under load — can expose outliers, contention, or scaling issues |
📊 Example:
Let’s say your API returns:
| Request # | Response Time |
| --- | --- |
| 1–90 | 100–200ms |
| 91–95 | 300–400ms |
| 96–99 | 800–1000ms |
| 100 | 2,000ms |
- P95 latency ≈ 400ms
- P99 latency ≈ 1,000ms
- Max latency = 2,000ms
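The example above can be reproduced with a few lines of Python. This is a minimal sketch using the nearest-rank percentile definition (the smallest value that at least p% of samples fall at or below); the `percentile` helper and the sample data are illustrative, not a production metrics pipeline.

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: smallest sample value such that
    at least p% of all samples are <= that value."""
    s = sorted(latencies_ms)
    k = math.ceil(p / 100 * len(s))  # rank of the percentile sample
    return s[k - 1]                  # convert 1-based rank to 0-based index

# Hypothetical data mirroring the table above:
# 90 fast requests, 5 moderate, 4 slow, 1 extreme outlier.
latencies = [150] * 90 + [400] * 5 + [1000] * 4 + [2000]

print(percentile(latencies, 95))  # 400  -> P95
print(percentile(latencies, 99))  # 1000 -> P99
print(max(latencies))             # 2000 -> max latency
```

Note how the single 2,000ms outlier never shows up in P95 or P99 — only the max reveals it, which is why dashboards often plot all three.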
✅ Target Ranges (Good Practice):
| API Type | P95 Target | P99 Target |
| --- | --- | --- |
| Internal APIs | < 300ms | < 500ms |
| External/Public APIs | < 500ms | < 1000ms |
| Critical real-time (e.g., payments) | < 100ms | < 250ms |
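Targets like these are easy to encode as an automated SLO check. The sketch below is a hypothetical helper (the `meets_slo` function and `TARGETS_MS` mapping are illustrative names, not from any standard library) that compares measured P95/P99 values against the table above.

```python
# Target thresholds from the table above, in milliseconds.
TARGETS_MS = {
    "internal": {"p95": 300, "p99": 500},
    "external": {"p95": 500, "p99": 1000},
    "realtime": {"p95": 100, "p99": 250},
}

def meets_slo(api_type, p95_ms, p99_ms):
    """Return True only if BOTH measured percentiles are under target."""
    t = TARGETS_MS[api_type]
    return p95_ms < t["p95"] and p99_ms < t["p99"]

print(meets_slo("internal", 250, 480))  # True: both under target
print(meets_slo("realtime", 120, 200))  # False: P95 exceeds 100ms
```

A check like this can gate a deployment pipeline: roll back or alert when a canary release pushes tail latency past the target.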