Time to First Byte measures how long the browser waits between requesting a page and receiving the first byte of HTML. It's not a Core Web Vital itself, but it sets a floor under every other metric. A page with a 1.5s TTFB cannot have an LCP under 1.5s, because nothing can render before the first byte arrives. The math doesn't allow it.
The single biggest mistake in TTFB diagnosis is measuring it from a single fast location and assuming that's the user experience.
Three measurement sources, in order of how much you should trust them:
Real-user monitoring (RUM) via Cloudflare Web Analytics, Vercel Speed Insights, or your own beacon. RUM reports TTFB from actual visitors, weighted by your traffic distribution. This is ground truth.
Google Search Console Core Web Vitals. Reports the field-measured TTFB segment of LCP for your highest-traffic URLs. The data is a rolling 28-day aggregate, so a fix takes weeks to show up fully, but it comes from real Chrome users.
Synthetic tests (PageSpeed Insights, WebPageTest). Useful for reproducing a specific problem, useless for representing the population. Test from at least three regions if synthetic is all you have.
A site that passes TTFB when tested from Vercel's edge in California can miss the threshold by 3× on a 4G connection in São Paulo. Fix the population, not the single best case.
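One way to look at the population rather than the best case is to group RUM samples by region and compare the 75th percentile of each group. The sketch below is a minimal illustration, not a definitive implementation: the `TtfbSample` shape and the function names are hypothetical, the percentile uses the nearest-rank method, and the 800 ms and 1800 ms cut-offs are the commonly published "good" and "poor" TTFB thresholds.

```typescript
// Hypothetical sample shape; a real beacon would define its own fields.
interface TtfbSample {
  region: string;
  ttfbMs: number;
}

type Rating = "good" | "needs-improvement" | "poor";

interface RegionSummary {
  p75: number;
  rating: Rating;
}

// Nearest-rank percentile: the smallest sample with at least p% of
// all samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Assumed thresholds: 800 ms or less is good, above 1800 ms is poor.
function rate(ttfbMs: number): Rating {
  if (ttfbMs <= 800) return "good";
  if (ttfbMs <= 1800) return "needs-improvement";
  return "poor";
}

// Groups samples by region and summarizes each group by its p75.
function p75ByRegion(samples: TtfbSample[]): Record<string, RegionSummary> {
  const byRegion: Record<string, number[]> = {};
  for (const s of samples) {
    if (!byRegion[s.region]) byRegion[s.region] = [];
    byRegion[s.region].push(s.ttfbMs);
  }
  const summary: Record<string, RegionSummary> = {};
  for (const region of Object.keys(byRegion)) {
    const p75 = percentile(byRegion[region], 75);
    summary[region] = { p75, rating: rate(p75) };
  }
  return summary;
}
```

A region whose p75 lands in "needs-improvement" or "poor" is where the work is, even if the global number looks healthy.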
TTFB has four components. Each has different fixes.
| Component | What it measures | Typical fix |
| --- | --- | --- |
| DNS lookup | Resolving your domain to an IP | DNS provider with anycast |
| TCP + TLS handshake | Connection setup, certificate exchange | HTTP/3, edge termination |
| Request waiting | Server processing the request | Static generation, caching, faster DB |
| Response transfer | First byte traveling back to the browser | CDN, edge rendering, compression |
The Page Speed Grader reports the breakdown. Most sites have one dominant component — usually request waiting or response transfer. Fix the dominant one first.
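If you want the same breakdown from your own RUM data, the browser's Navigation Timing entry carries most of it. The sketch below is a rough illustration under stated assumptions: the `NavTiming` interface is a hand-picked subset of `PerformanceNavigationTiming` fields, and `breakDownTtfb` is a hypothetical helper name. One caveat: browser timings cannot separate server processing from the first byte's trip back, so both fall under `waiting` here.

```typescript
// Subset of the browser's PerformanceNavigationTiming fields (milliseconds).
interface NavTiming {
  startTime: number;
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
}

type Component = "dns" | "connection" | "waiting" | "other";

interface TtfbBreakdown {
  dns: number;        // DNS lookup
  connection: number; // TCP + TLS (or QUIC) handshake
  waiting: number;    // request sent until first byte received
  other: number;      // redirects, queueing, anything not covered above
  total: number;      // TTFB as the browser saw it
  dominant: Component;
}

function breakDownTtfb(t: NavTiming): TtfbBreakdown {
  const total = t.responseStart - t.startTime;
  const dns = t.domainLookupEnd - t.domainLookupStart;
  const connection = t.connectEnd - t.connectStart;
  const waiting = t.responseStart - t.requestStart;
  const other = total - dns - connection - waiting;

  // Pick the largest component; that is the one to fix first.
  const parts: Record<Component, number> = { dns, connection, waiting, other };
  let dominant: Component = "dns";
  for (const key of ["connection", "waiting", "other"] as const) {
    if (parts[key] > parts[dominant]) dominant = key;
  }
  return { dns, connection, waiting, other, total, dominant };
}
```

In a browser, the entry returned by `performance.getEntriesByType("navigation")[0]` has these fields and can be passed in directly.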
If DNS is the bottleneck (over 200ms), switch DNS providers. Cloudflare, Route 53, and NS1 all run anycast networks and typically answer queries in single-digit milliseconds from a nearby resolver. If TCP + TLS is the bottleneck (over 300ms), the site is probably not on HTTP/3 yet, or not behind a CDN that terminates TLS at the edge.
The fastest server response is no server response. A page served from a CDN's edge as pre-rendered HTML has a TTFB measured in tens of milliseconds, not hundreds.
Static generation works for anything that doesn't need to be personalized per request:
Marketing pages
Blog posts
Documentation
Product catalog pages where pricing and availability are cached
Comparison pages, landing pages, pricing pages
Frameworks make this a single config:
Next.js: export const dynamic = "force-static", or generateStaticParams() for dynamic routes
Astro: static by default
Hugo, Jekyll, Eleventy: static-only by design
A WordPress site running on a $20 VPS with a 1.2s TTFB can drop to around 80ms after a static-export plugin (WP2Static, Simply Static) puts the rendered HTML on Cloudflare Pages. The content is identical. The TTFB is 15× better.
When static isn't possible — authenticated dashboards, real-time data, A/B-tested pages — keep reading.
A CDN with edge presence near your users does three things that cut TTFB:
Terminates TLS at the edge (no round-trip back to the origin for the handshake)
Caches static assets and HTML where applicable
Often runs HTTP/3 (QUIC) automatically, which combines the transport and TLS handshakes into one round trip instead of the two or more that TCP plus TLS needs
Cloudflare and Fastly are the dominant choices. Vercel and Netlify bundle this into their hosting. CloudFront works but takes more configuration. Whatever you use, verify the edge is actually serving the request: a CDN that always proxies back to origin gives you the cost without the benefit.
To verify: curl -I https://yoursite.com/some-page and look at the response headers. CDNs add cf-cache-status, x-vercel-cache, x-cache, or similar. If you see MISS consistently on a page that should be cacheable, the CDN config is wrong.
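The same check can be scripted so it runs against a list of URLs. The sketch below is a minimal illustration: `cacheVerdict` is a hypothetical helper, it looks only at the three header names mentioned above, and it treats any value containing "hit" or "miss" as such. Other values (for example `DYNAMIC` or `BYPASS`) are reported as `other` rather than interpreted.

```typescript
type CacheVerdict = "hit" | "miss" | "other" | "no-cdn-header";

// Header names from the text above; other CDNs may use different ones.
const CACHE_HEADERS = ["cf-cache-status", "x-vercel-cache", "x-cache"];

function cacheVerdict(headers: Record<string, string>): CacheVerdict {
  // HTTP header names are case-insensitive, so normalize them first.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }

  for (const name of CACHE_HEADERS) {
    const value = lower[name];
    if (value === undefined) continue;
    if (/hit/i.test(value)) return "hit";
    if (/miss/i.test(value)) return "miss";
    return "other"; // header present, but neither a hit nor a miss
  }
  return "no-cdn-header";
}
```

A single `miss` is normal on the first request after a deploy or an expiry. Repeated misses on the same URL are the signal that the CDN config is wrong.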
When the request waiting component is the bottleneck, the cause is almost always one of three things:
A slow database query in the page render path
A synchronous external API call
A heavy framework startup (cold start on serverless)
For database queries:
Add an index. The 90% case for "this query takes 500ms" is a missing index on the foreign key being joined. Run EXPLAIN ANALYZE on the query, find the sequential scan, add the index.
Cache the result. If the data changes once an hour, cache it for an hour. Redis, Memcached, or the framework's built-in cache (Next.js unstable_cache, Django's cache framework) all work.
Move the query out of the render path. If the page renders without that data being immediately present, defer it — stream the response, fetch the data client-side, or render with a skeleton.
For external APIs: same logic. Cache the response, set a sensible TTL, never make a synchronous third-party API call during page render unless the page literally cannot exist without it.
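The caching advice for both database queries and external APIs comes down to the same pattern: wrap the slow call so its result is reused until a TTL expires. The sketch below is a minimal in-process version, assuming a single server instance; `withTtl` and its parameters are hypothetical names, and a multi-instance deployment would more likely put Redis or Memcached behind the same interface.

```typescript
// Wraps an async loader so its result is reused for ttlMs milliseconds.
// `now` is injectable so expiry can be tested without waiting.
function withTtl<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: Promise<T>; expiresAt: number } | undefined;

  return () => {
    const t = now();
    if (cached && t < cached.expiresAt) return cached.value;

    // Cache the promise itself, so concurrent requests share one load.
    const value = load();
    const entry = { value, expiresAt: t + ttlMs };
    cached = entry;

    // Do not keep a failed load around for the whole TTL.
    value.catch(() => {
      if (cached === entry) cached = undefined;
    });
    return value;
  };
}
```

Usage would look like `const getRates = withTtl(fetchRatesFromApi, 60 * 60 * 1000)`, where `fetchRatesFromApi` stands in for whatever slow call the page currently makes on every render.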
For cold starts on serverless: increase the memory allocation (many providers allocate CPU in proportion to memory), pre-warm critical functions, or move the route to an edge runtime where cold starts are measured in single-digit milliseconds.
If TTFB is fine on warm connections but bad on cold ones, two flags matter:
Enable Brotli compression. Brotli typically compresses HTML 15–25% better than gzip. Most CDNs negotiate it automatically; for self-hosted origins, the config is usually a few lines in Nginx or Apache once the Brotli module is available. Compression does not move the first byte much, but a smaller body needs fewer round trips to arrive in full, so the browser can start parsing and rendering sooner.
Verify HTTP/3 is on. HTTP/3 runs over QUIC, which folds the transport and TLS handshakes into a single round trip and lets returning visitors resume with zero round trips (0-RTT) where the server allows it. That shortens connection setup, which is the part of TTFB that hurts on cold connections. Cloudflare, Fastly, and CloudFront all support it. Verify with curl --http3 -I https://yoursite.com (requires curl 7.66+ built with HTTP/3 support); an HTTP/3 200 response confirms it's serving.
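Both flags are also visible in ordinary response headers, which makes them easy to check in a script. The sketch below is a rough illustration with a hypothetical `transferFlags` helper. It assumes the request was sent with `Accept-Encoding: br` (otherwise a server has no reason to answer with Brotli), and it only detects that HTTP/3 is advertised through `Alt-Svc`, not that a given request actually used it.

```typescript
interface TransferFlags {
  brotli: boolean;          // response body was Brotli-encoded
  advertisesHttp3: boolean; // Alt-Svc header offers an h3 endpoint
}

function transferFlags(headers: Record<string, string>): TransferFlags {
  // HTTP header names are case-insensitive, so normalize them first.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }

  // Content-Encoding may list several codings, e.g. "br" or "gzip, br".
  const codings = (lower["content-encoding"] || "")
    .toLowerCase()
    .split(",")
    .map((c) => c.trim());

  // Alt-Svc looks like: h3=":443"; ma=86400, h3-29=":443"; ma=86400
  const altSvc = lower["alt-svc"] || "";

  return {
    brotli: codings.indexOf("br") !== -1,
    advertisesHttp3: /(^|[\s,])h3(-\d+)?=/.test(altSvc),
  };
}
```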
The single most common TTFB failure is a database call to count something in the page header. "Number of users", "items in cart", "unread notifications". The count runs on every page load, blocks rendering, and adds 100–500ms to every request.
Two fixes:
Cache the count with a short TTL (60 seconds is usually fine)
Move the count out of the server render and fetch it client-side, after the page has rendered:
// Client component, fires after render
const userCount = useUserCount(); // SWR or React Query
return <Header userCount={userCount} />;
A single render-blocking database count can put 300ms on TTFB across every page in the application. Removing it is often the single largest TTFB win available.
Sub-800ms TTFB unlocks the rest of the performance metrics. LCP can pass, FCP can pass, INP starts mattering more than initial render. The LCP fix guide is the natural next step. Once that's clean, the INP guide handles the post-load metric.
TTFB is foundational. Every millisecond spent here is a millisecond unavailable to everything downstream.