Building Blocks

Latency

The annoying delay

TL;DR

The delay between asking and getting an answer. When your video call freezes for 2 seconds — that's latency. Lower is always better.

The Plain English Version

You click a button. There's a pause. Then something happens. That pause? That's latency. It's the time delay between doing something and seeing the result. In an ideal world, latency would be zero. In reality, data has to travel through wires, get processed by servers, and travel back. That takes time.
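If you want to see that pause as a number, here is a minimal sketch that times a single action with Python's `time.perf_counter`. The `fake_button_click` function is a made-up stand-in (not from any real app) that just sleeps for 50ms to imitate "send request, wait, get reply".

```python
import time


def measure_latency_ms(action):
    """Run `action` once and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = action()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_button_click():
    # Hypothetical stand-in for a real request: pretend the round trip
    # to the server and back takes about 50 ms.
    time.sleep(0.05)
    return "done"


result, elapsed_ms = measure_latency_ms(fake_button_click)
print(f"{result} after {elapsed_ms:.1f} ms")
```

The printed number will wobble a little from run to run, which is also true of real-world latency.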

Think about a phone call. When you're talking to someone across the room, there's essentially no delay. Call someone in another country, and sometimes there's that awkward half-second gap where you both start talking at the same time. That gap is latency — the time it takes for your voice to travel there and their voice to travel back.

On the internet, latency is measured in milliseconds (ms). As a rough guide: under 50ms feels instant, 100-200ms is noticeable but fine, over 500ms starts to feel sluggish, and past a full second you're wondering if the page is broken. Tech companies obsess over reducing latency because even small delays make users give up and leave.
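Those rules of thumb can be written down as a tiny lookup. This is only a sketch: the function name `describe_latency` is made up for illustration, and the text leaves the 50-100ms and 200-500ms ranges unlabeled, so the code assumes they fall into "noticeable but fine".

```python
def describe_latency(ms: float) -> str:
    """Map a latency in milliseconds to a rough 'how it feels' label."""
    if ms < 0:
        raise ValueError("latency cannot be negative")
    if ms < 50:
        return "feels instant"
    if ms <= 500:
        # Assumption: the unlabeled gaps (50-100 ms and 200-500 ms)
        # are folded into this band.
        return "noticeable but fine"
    if ms <= 1000:
        return "sluggish"
    return "is the page broken?"


for sample_ms in (20, 150, 700, 1500):
    print(f"{sample_ms} ms: {describe_latency(sample_ms)}")
```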

Why Should You Care?

Because latency affects almost every digital experience you have. Slow-loading websites, laggy video games, glitchy video calls: latency plays a part in all of them. (Buffering videos are usually more of a bandwidth problem, but latency can contribute.) When AI takes a few seconds to respond, that's inference latency. Understanding it helps you diagnose why things feel slow and appreciate why tech companies build data centers all over the world (to get closer to you and reduce latency).
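To see why being physically closer helps, here is a back-of-the-envelope sketch. It assumes signals in fiber-optic cable travel at roughly 200,000 km per second (about two-thirds of the speed of light in a vacuum) and ignores routing, congestion, and server processing, so real latency will be higher. The distances are illustrative, not measurements.

```python
# Approximate signal speed in optical fiber (an assumption for this sketch).
SPEED_IN_FIBER_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds.

    Only counts the time the signal spends traveling there and back.
    """
    return 2 * distance_km * 1000 / SPEED_IN_FIBER_KM_PER_S


# Illustrative distances, not real measurements.
for label, km in (("data center in your city", 100),
                  ("data center across an ocean", 6000)):
    print(f"{label}: at least {min_round_trip_ms(km):.0f} ms round trip")
```

No amount of clever software beats this floor, which is why companies put servers near their users.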

The Nerd Version (if you dare)

Latency is the time delay in a system, usually reported in milliseconds. Network latency is typically measured as round-trip time (RTT) and depends on physical distance, routing hops, congestion, and protocol overhead. Application latency adds processing time, database queries, and serialization. Optimization strategies include CDNs, edge computing, connection pooling, caching, async processing, and newer protocols (HTTP/3, which runs over QUIC). In production, latency is summarized with percentiles (P50/P95/P99) rather than a single average, because a handful of very slow requests can hide behind a healthy-looking mean.
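Here is a minimal sketch of how P50/P95/P99 could be computed from a batch of measurements. It uses the nearest-rank method, which is only one of several common percentile definitions, and the sample latencies are invented for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (made up for this example).
latencies_ms = [12, 15, 14, 13, 16, 15, 14, 250, 13, 900]

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean: {mean_ms:.1f} ms")
for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

In this made-up data the typical request (P50) takes 14ms, while the mean is dragged above 100ms by two slow outliers that the P95 and P99 values expose directly.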

Related terms

Bandwidth, CDN, The Cloud
