Frontend Performance Optimization in 2025
Core Web Vitals after the INP migration, image strategy, cache layers, and the perf moves that still matter on modern React stacks.
Lighthouse on a fast laptop is theatre. The real measurement is p75 INP on a mid-range Android phone over a flaky 4G connection. Optimise for that and the rest follows.
The three numbers that matter
Core Web Vitals reduced the universe to three measurements. They are not perfect, but they correlate well enough with "the site feels good" that I treat them as the scoreboard.
- LCP (Largest Contentful Paint): < 2.5s
- INP (Interaction to Next Paint): < 200ms
- CLS (Cumulative Layout Shift): < 0.1
LCP is "how long until the main thing on screen renders." INP, which replaced FID in March 2024, is "the slowest interaction the user had on this page" (with a small allowance for outliers on pages with many interactions). CLS is "did things jump around while the user was reading."
You only need to get good at three categories of optimisation to hit all three: ship less, ship it sooner, and stop blocking the main thread.
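To make the scoreboard concrete, here is a minimal sketch of scoring field samples at p75 against the thresholds above. The function names are mine, the percentile uses the simple nearest-rank method, and treating the thresholds as inclusive is an assumption of the sketch.

```typescript
type Metric = "LCP" | "INP" | "CLS";

// Thresholds from the list above, treated as inclusive upper bounds (an assumption).
const GOOD_THRESHOLD: Record<Metric, number> = {
  LCP: 2500, // milliseconds
  INP: 200, // milliseconds
  CLS: 0.1, // unitless
};

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// A page "passes" a metric when its 75th-percentile sample is within the threshold.
function passesAtP75(metric: Metric, samples: number[]): boolean {
  return percentile(samples, 75) <= GOOD_THRESHOLD[metric];
}
```

The point of p75 is that one fast laptop in the sample cannot hide three slow phones: the slow tail decides the result.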
LCP: ship less, ship it sooner
The largest element on screen is almost always one of two things: a hero image or a block of text inside the main content card.
For images:
Serve the correct size
A 280px slot should not download a 2000px source. `next/image` with explicit `width` and `sizes` handles this. Without that, every responsive site secretly serves desktop images to phones.
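If you are not on `next/image`, the same idea can be sketched by hand. This assumes a hypothetical image service that accepts a `?w=` width parameter; the helper name, the path, and the widths are illustrative only.

```typescript
// Hypothetical URL scheme: the image service is assumed to resize via `?w=<pixels>`.
function buildSrcSet(src: string, widths: number[]): string {
  return [...widths]
    .sort((a, b) => a - b)
    .map((w) => `${src}?w=${w} ${w}w`)
    .join(", ");
}

// `sizes` tells the browser how wide the slot renders at each breakpoint,
// so it can pick the smallest candidate that still covers the slot.
const cardImage = {
  srcset: buildSrcSet("/img/card.jpg", [840, 280, 560]),
  sizes: "(max-width: 600px) 280px, 420px",
};
```

The two values go on the `srcset` and `sizes` attributes of the `<img>`. Without `sizes`, the browser assumes the image fills the viewport and tends to over-fetch.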
Serve a modern format
AVIF first, WebP fallback, original last. AVIF is often substantially smaller than WebP at visually similar quality, though the saving depends on the image and the encoder settings, so check it on your own assets.
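One way to implement that fallback order on the server is to read the request's `Accept` header. A minimal sketch, assuming you control the image endpoint; the function name is mine, and it deliberately ignores `q=` weights.

```typescript
type ImageFormat = "avif" | "webp" | "original";

// Picks the best format the client advertises: AVIF first, WebP second, original last.
function pickImageFormat(acceptHeader: string): ImageFormat {
  const accepted = acceptHeader
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());
  if (accepted.includes("image/avif")) return "avif";
  if (accepted.includes("image/webp")) return "webp";
  return "original";
}
```

If you negotiate this way, send `Vary: Accept` on the response so caches do not hand an AVIF to a client that cannot decode it.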
Preload the hero
The element that decides LCP gets `<link rel="preload" as="image" fetchpriority="high">`. Only that one element. Preloading everything cancels the benefit.
For text-driven LCP, the answer is usually server rendering. If the LCP element is "the headline of an article," the browser should not need to download a JS bundle to see it. Server-render the page shell, hydrate later.
INP: stop blocking the main thread
INP is the metric that exposes lazy React code. Every time a click handler runs synchronously for more than ~50ms, INP gets worse. The user feels the lag even if they cannot name it.
- 01. Use `useTransition` for expensive state updates. React 18 gave us transitions. Wrap state updates that trigger heavy re-renders, and the click handler returns immediately while the work happens at low priority. INP plummets.
- 02. Move work off the main thread. A 200ms sync filter on a list of 5,000 items will tank INP. The same filter in a Web Worker is invisible. `comlink` makes the worker boilerplate tolerable.
- 03. Profile before guessing. Chrome DevTools' Performance tab will tell you exactly which function call ate 180ms. Half the time it is a third-party script you forgot was there.
- 04. Yield to the browser. For long synchronous loops you cannot avoid, use `scheduler.yield()` (or a `setTimeout(0)` shim) every few iterations. It lets the browser process input between chunks.
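The yielding pattern is the one people most often get wrong, so here is a minimal sketch. It uses `scheduler.yield()` where the browser provides it and falls back to `setTimeout(0)` otherwise. The helper names and the default chunk size are assumptions, not recommendations.

```typescript
// Resolves on a later task so the browser can handle pending input first.
function yieldToMain(): Promise<void> {
  // `scheduler.yield()` is not available everywhere, so feature-detect it.
  const sched = (globalThis as any).scheduler as { yield?: () => Promise<void> } | undefined;
  if (sched && typeof sched.yield === "function") return sched.yield();
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Runs `handle` over every item, pausing after each full chunk.
// Returns how many times it yielded, which is handy when tuning chunkSize.
async function processInChunks<T>(
  items: T[],
  handle: (item: T) => void,
  chunkSize = 100, // assumed batch size; tune it against a real profile
): Promise<number> {
  let yields = 0;
  for (let i = 0; i < items.length; i += 1) {
    handle(items[i]);
    const done = i + 1;
    if (done % chunkSize === 0 && done < items.length) {
      await yieldToMain();
      yields += 1;
    }
  }
  return yields;
}
```

Pick the chunk size so that one chunk stays well under the ~50ms budget on the slowest device you care about, then confirm it in the Performance tab.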
CLS: reserve space for everything
CLS is the cheapest one to fix and the most embarrassing one to ship.
- Every image and video has explicit `width` and `height` attributes so the layout reserves space before the asset loads.
- Web fonts are loaded with `font-display: optional` or with a fallback that closely matches metrics (use `size-adjust` and `ascent-override` in your `@font-face`).
- Any element that mounts asynchronously (cookie banner, app shell update) is rendered inside a container with a reserved height, even if it ends up empty.
If you fix these three patterns, CLS is usually a solved problem.
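The reason explicit `width` and `height` work is plain arithmetic: the browser derives an aspect ratio from them and reserves the matching height for whatever width the slot renders at. A small sketch of that calculation, with a function name of my own:

```typescript
// Height a browser can reserve before the image loads, given the intrinsic
// width/height attributes and the width the slot actually renders at.
function reservedHeight(
  intrinsicWidth: number,
  intrinsicHeight: number,
  renderedWidth: number,
): number {
  if (intrinsicWidth <= 0) throw new Error("intrinsicWidth must be positive");
  return Math.round(renderedWidth * (intrinsicHeight / intrinsicWidth));
}
```

A 1600x900 source rendered in a 320px-wide slot gets 180px reserved, so nothing below it moves when the pixels arrive. The same arithmetic tells you what height to give a placeholder container for async content.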
Caching, in three layers
A request that does not happen is the fastest request.
Browser cache
Hash the filename. `/_next/static/chunks/main-a1b2c3.js` can be cached for a year because if the content changes, the hash changes. `Cache-Control: public, max-age=31536000, immutable`.
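As a sketch, the rule can live in one small function at the server or edge. The regular expression is an assumption modelled on the example path above (a dash, then at least six hex characters, then a static extension); real build tools format their hashes differently, so adjust it to your output.

```typescript
// Assumed naming scheme: `name-<6 or more hex chars>.<ext>`, as in `main-a1b2c3.js`.
const HASHED_ASSET = /-[0-9a-f]{6,}\.(?:js|css|woff2|png|jpg|avif|webp|svg)$/i;

function cacheControlFor(pathname: string): string {
  if (HASHED_ASSET.test(pathname)) {
    // Content changes imply a new filename, so the browser never needs to revalidate.
    return "public, max-age=31536000, immutable";
  }
  // Unhashed URLs can change in place; as a conservative default, make the browser revalidate.
  return "no-cache";
}
```

The important property is the asymmetry: a false negative costs a revalidation, while a false positive pins stale code in browsers for a year. Keep the pattern strict.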
CDN cache
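The HTML response can usually be cached at the edge for 60 seconds with stale-while-revalidate. On a busy page that can mean roughly 99% of requests hit the edge and only 1% hit your origin, though the exact split depends on traffic per URL. For the requests served from the edge, latency can drop by an order of magnitude.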
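A rough sketch of both halves: the header that expresses this policy, and a back-of-envelope model of the hit ratio. The 300-second stale window and the steady-traffic model are assumptions of mine. Real CDNs add per-location caches, evictions, and bursty traffic on top.

```typescript
// Edge policy for HTML: browsers revalidate, the CDN keeps it fresh for 60s,
// then may serve it stale while refetching. The 300s window is an assumed value.
const HTML_CACHE_CONTROL = "public, max-age=0, s-maxage=60, stale-while-revalidate=300";

// Rough model: under steady traffic, the origin sees about one fetch per TTL
// window per edge location; every other request in that window is an edge hit.
function estimatedEdgeHitRatio(
  requestsPerSecond: number,
  ttlSeconds: number,
  edgeLocations = 1,
): number {
  const requestsPerWindow = requestsPerSecond * ttlSeconds;
  if (requestsPerWindow <= 0) return 0;
  const originFetches = Math.min(edgeLocations, requestsPerWindow);
  return 1 - originFetches / requestsPerWindow;
}
```

At 2 requests per second and a 50-second TTL the model gives about 0.99. A URL that gets one hit every few minutes gets close to 0, which is why edge caching pays off on popular pages and does little for the long tail.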
Service worker cache
For repeat visits, a service worker that pre-caches the app shell lets the second visit render before the network even responds. The trade-off is complexity around invalidation. Worth it for content sites; usually not worth it for dashboards.
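The core of that service worker is a cache-first lookup plus a pre-cache step at install time. Here is a minimal sketch of the logic with the storage and network injected, so it runs outside a browser. In a real service worker the `AsyncCache` role is played by the Cache Storage API and the fetcher by `fetch`; the interface and function names here are hypothetical.

```typescript
// Minimal async cache shape standing in for the Cache Storage API.
interface AsyncCache<V> {
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

type Fetcher<V> = (key: string) => Promise<V>;

// Cache-first: answer from the cache when possible, otherwise fetch and store.
async function cacheFirst<V>(key: string, cache: AsyncCache<V>, fetcher: Fetcher<V>): Promise<V> {
  const hit = await cache.get(key);
  if (hit !== undefined) return hit;
  const fresh = await fetcher(key);
  await cache.set(key, fresh);
  return fresh;
}

// Pre-caching the app shell at install time makes the next visit's lookups hits.
async function precache<V>(keys: string[], cache: AsyncCache<V>, fetcher: Fetcher<V>): Promise<void> {
  await Promise.all(keys.map(async (key) => cache.set(key, await fetcher(key))));
}
```

Notice what is missing: nothing here ever evicts or refreshes an entry. That is the invalidation complexity mentioned above, and it is usually handled by versioning the cache name and deleting old caches when a new worker activates.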
What I no longer chase
A few things that used to matter and have stopped.
- Bundle size below some magic number. Bundle size matters, but the right metric is "time until interactive on a real phone," not "kilobytes of JS." A 200KB bundle that runs in 80ms is fine. A 100KB bundle that takes 400ms to parse and execute is not.
- Server-rendering everything. RSC is great where it fits, but rendering every interaction on the server adds round trips that hurt INP more than client rendering does.
- Premature code-splitting. Splitting routes is obvious. Splitting components below the route level usually trades one network request for one user-perceived delay. Measure first.
The fastest site is not the one with the best Lighthouse score. It is the one where the user never thinks about loading.