Pagefind: full-text search for static sites, no backend

Pagefind is a search library for static sites. It indexes the built HTML once at build time and serves the index as static files. Queries run in the visitor's browser against a small WASM binary - no search backend, no API key, no third-party JS.

This site uses it for the /blog/ index.

# How it works

Build step. After your site generator writes HTML, you run pagefind --site _site. Pagefind crawls the output, looks for data-pagefind-body (falling back to the whole <body> if no page sets it), tokenises the text, and writes:

- pagefind/pagefind.js - the loader.
- pagefind/pagefind-ui.js + pagefind-ui.css - the optional default UI (a search input + result list).
- pagefind/wasm.*.pagefind - the WASM binary that does the actual querying.
- pagefind/index/*.pf_index - the index, chunked by term prefix (typically a few hundred bytes to a few KB per chunk).
- pagefind/fragment/*.pf_fragment - per-page metadata (title, URL, excerpt source).

Page load. The browser loads pagefind-ui.js lazily. Nothing else is fetched yet.

First keystroke. Pagefind loads the WASM (~70KB gzipped), then fetches only the index chunks matching the prefix the user typed. For a query like podm, it pulls the pod chunk, not the whole index.

Result render. Matched fragment files are fetched to build excerpts. Each result costs one or two extra requests of roughly 500 bytes.
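
The last two steps map onto Pagefind's JavaScript API, where search() returns lightweight result handles and each handle's data() call is what fetches the fragment. The sketch below is a minimal illustration, not this site's code: topResults, SearchHit, the PagefindLike interface, and the limit parameter are names invented here, and the interfaces describe only the small part of the API the function touches.

```typescript
// Minimal shape of the Pagefind API surface used below (not the full typings).
interface PagefindFragment {
  url: string;
  excerpt: string;
  meta: Record<string, string | undefined>;
}

interface PagefindResult {
  id: string;
  data(): Promise<PagefindFragment>; // loads the fragment on demand
}

interface PagefindLike {
  search(query: string): Promise<{ results: PagefindResult[] }>;
}

interface SearchHit {
  url: string;
  title: string;
  excerpt: string;
}

// Hypothetical helper: run a query, then load fragments for the
// first `limit` hits only, so unrendered results cost nothing.
async function topResults(
  pagefind: PagefindLike,
  query: string,
  limit = 5,
): Promise<SearchHit[]> {
  const search = await pagefind.search(query);
  const page = search.results.slice(0, limit);
  const fragments = await Promise.all(page.map((result) => result.data()));
  return fragments.map((fragment) => ({
    url: fragment.url,
    title: fragment.meta.title ?? fragment.url, // fall back to the URL
    excerpt: fragment.excerpt,
  }));
}
```

In the browser, the object passed in would be the module loaded from the loader file, for example the result of await import("/pagefind/pagefind.js"), assuming the default pagefind/ output directory. Taking it as a parameter keeps the helper easy to exercise with a stub.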

The total network cost for a typical search is well under 200KB even on a multi-thousand-page site, because nothing is loaded until the user actually searches and only the relevant chunks come down.
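
As a rough sanity check on that figure, the arithmetic can be written out. The sizes below come from the numbers quoted above (a ~70KB WASM binary, index chunks of a few KB, fragment requests of ~500 bytes); the chunk and result counts are invented for the example, and estimateSearchBytes is a hypothetical helper, not part of Pagefind. It leaves out the loader and UI script, whose sizes are not given here.

```typescript
interface SearchCost {
  wasmBytes: number;         // one-off WASM download
  chunkCount: number;        // index chunks fetched for the query
  chunkBytes: number;        // assumed size of each chunk
  resultCount: number;       // results rendered
  requestsPerResult: number; // fragment requests per result
  fragmentBytes: number;     // assumed size of each fragment request
}

// Sum the three parts of a first search: WASM, index chunks, fragments.
function estimateSearchBytes(cost: SearchCost): number {
  const indexBytes = cost.chunkCount * cost.chunkBytes;
  const fragmentBytes =
    cost.resultCount * cost.requestsPerResult * cost.fragmentBytes;
  return cost.wasmBytes + indexBytes + fragmentBytes;
}

const firstSearch = estimateSearchBytes({
  wasmBytes: 70_000,
  chunkCount: 3,        // assumption: a short query touches a few chunks
  chunkBytes: 4_000,    // assumption: upper end of "a few KB"
  resultCount: 10,      // assumption: ten results rendered
  requestsPerResult: 2,
  fragmentBytes: 500,
});
// 70,000 + 12,000 + 10,000 = 92,000 bytes
```

Under these assumptions a first search comes to roughly 92KB, which leaves a wide margin below the 200KB mark even if the chunk count or result count doubles.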

# Why it beats a backend search service

For a static site or a personal KB, Pagefind cuts out a bunch of operational headaches:

- No server. Nothing to provision, monitor, rate-limit, patch, or pay for. It's bytes on a CDN. Cloudflare Pages, GitHub Pages, S3, and Netlify all serve it the same way they serve the rest of your site.
- No third party. Algolia, Typesense, and Elastic-as-a-service all add a domain to your CSP, a tracking surface, an SLA you depend on, and a key that can leak. Pagefind queries never leave the page - relevant if you write about anything you'd rather not hand to a vendor's logs.
- Private by design. No analytics. No "what did this user search for" telemetry to leak. There's no place for it to go, since the search runs entirely in the page.
- Builds in the same pipeline. It's one extra command after your site generator. No separate index-rebuild job, no webhook, no eventual-consistency window between "I published a page" and "it's searchable." When the site deploys, the index deploys.
- Trivial to roll back. A bad index ships with a bad deploy and reverts with a git revert + redeploy. With a hosted index, rollback means re-indexing.
- Free. Pagefind is MIT-licensed, and the hosting cost is whatever your static host charges (often $0).

# Where a backend still wins

- Tens of thousands of pages. Pagefind chunks the index, but the fragment count grows linearly with the page count. Past ~10k pages, the build step gets slow and the cold-start WASM + chunk fetches start to feel sluggish on mobile data. Backend search keeps a constant client cost.
- Per-user authorisation. If results depend on who's logged in (private docs, multi-tenant SaaS), the index can't be public, and shipping a per-user index isn't realistic. You need a server enforcing access at query time.
- Frequent index updates without a deploy. If content changes outside the build (user-generated, CMS-driven, real-time), you need an index that updates independently. Pagefind only updates when you rebuild.
- Fancy features. Synonyms, typo-tolerance with a learned language model, query analytics, A/B-tested ranking, federated search across heterogeneous sources - all things Algolia / Typesense / Meilisearch do well and Pagefind doesn't try to.

# Integration notes (Eleventy specifically)

Three things to remember:

- Pagefind has no watch mode. eleventy --serve keeps the build in memory and never invokes Pagefind. In dev, npm run dev shows an empty #search div with no UI; that is expected, not a bug. To actually exercise search, build to disk and serve the directory: npm run build && npm run serve.
- Indexing scope. By default Pagefind indexes everything inside <body>. Wrap your real content in <main data-pagefind-body> (or <article data-pagefind-body>) to exclude nav/footer chrome from search results. Worth doing - otherwise every result excerpt starts with "home blog Dark mode ☾".
- UI customisation through CSS variables. The default PagefindUI widget exposes --pagefind-ui-primary, --pagefind-ui-background, --pagefind-ui-border, etc. Set them on :root and the widget picks them up. The override block in css/style.css:326 is all this site does to theme it.
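
One way to make the dev-server case explicit is to guard the widget setup so it quietly does nothing when the Pagefind bundle has not been built. This is a sketch rather than this site's actual code: mountSearch is a hypothetical helper, and it assumes that pagefind-ui.js, when present, has defined the PagefindUI constructor. The element and showSubResults options are ones the default UI accepts.

```typescript
interface PagefindUIOptions {
  element: string;          // CSS selector for the container, e.g. "#search"
  showSubResults?: boolean; // also list matching sections within a page
}

type PagefindUIConstructor = new (options: PagefindUIOptions) => unknown;

// Returns true if the widget was mounted, false if the bundle is missing
// (for example under the dev server, where Pagefind has not run).
function mountSearch(
  PagefindUI: PagefindUIConstructor | undefined,
  selector = "#search",
): boolean {
  if (typeof PagefindUI !== "function") {
    return false; // leave the empty container alone
  }
  new PagefindUI({ element: selector, showSubResults: true });
  return true;
}
```

In the page this might be called from a DOMContentLoaded listener with the global constructor as the argument. Passing the constructor in, rather than reading the global inside the function, keeps the guard testable outside a browser.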
