
DuckDB WASM: Querying Data in the Browser, No Backend

January 1, 1970

duckdb, wasm, sql, architecture

For most of the history of web-based analytics tools, "query the data" has meant "send a request to a server, wait, get results back." DuckDB WASM breaks that assumption in a way that's easy to undersell: it runs an actual, complete SQL engine inside the browser tab, with no server round-trip required to execute a query.

What's actually running

DuckDB WASM compiles DuckDB — a real, embedded analytical database, not a toy SQL parser — to WebAssembly. That means genuine SQL: joins, window functions, aggregates, all executed locally against data already loaded into the browser's memory.

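import * as duckdb from "@duckdb/duckdb-wasm";

// One documented setup: pick the bundle that suits this browser, served from jsDelivr.
const bundle = await duckdb.selectBundle(duckdb.getJsDelivrBundles());
const workerUrl = URL.createObjectURL(
  new Blob([`importScripts("${bundle.mainWorker}");`], { type: "text/javascript" })
);
const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl));
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
URL.revokeObjectURL(workerUrl);
const conn = await db.connect();

// Assumes a quarterly_revenue table has already been loaded into this database.
const result = await conn.query(`
  select region, sum(revenue) as total
  from quarterly_revenue
  group by region
  order by total desc
`);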

No API endpoint. No server-side query planner. The query runs on whatever's already in the tab.
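
Because the engine is local, a UI control can map straight to a SQL string instead of an API call. Here is a minimal sketch of that mapping, assuming a hypothetical `Filters` shape and `buildFilterQuery` helper (neither is part of DuckDB WASM) and a table name that comes from trusted code rather than user input:

```typescript
// Hypothetical filter state for a dashboard; the field names are illustrative.
type Filters = { region?: string; minRevenue?: number };

// Escape a string for use as a SQL literal by doubling single quotes.
function quoteLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Build the SQL for the current filter state. The table name is assumed to be
// trusted (hard-coded by the page), so it is interpolated as-is.
function buildFilterQuery(table: string, filters: Filters): string {
  const clauses: string[] = [];
  if (filters.region !== undefined) {
    clauses.push(`region = ${quoteLiteral(filters.region)}`);
  }
  if (filters.minRevenue !== undefined) {
    if (!Number.isFinite(filters.minRevenue)) {
      throw new Error("minRevenue must be a finite number");
    }
    clauses.push(`revenue >= ${filters.minRevenue}`);
  }
  const where = clauses.length > 0 ? ` where ${clauses.join(" and ")}` : "";
  return `select * from ${table}${where}`;
}

const northQuery = buildFilterQuery("quarterly_revenue", { region: "North" });
// northQuery can be passed to conn.query(...) on an open connection.
```

For values typed in by a user, the prepared statements that DuckDB WASM documents are usually a safer choice than string building; the sketch only shows how little sits between a filter change and a query.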

Why this matters more than it sounds like

The obvious benefit is latency — no network round-trip for every filter change or drill-down. That's real, but it's not the interesting part.

The more significant shift is architectural: an analytical tool that queries client-side doesn't need a query-serving backend at all for read operations. The data can be loaded once — as an Arrow buffer, say — and every subsequent interaction is a local computation. That changes what "publishing a dashboard" even means. A published page doesn't need to proxy every interaction through a live server; it can ship the data and the query engine together, and let the browser do the rest.
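
The "loaded once" part is worth making explicit. A minimal sketch, assuming a hypothetical `loadOnce` helper (not part of DuckDB WASM) wrapped around whatever function fetches the snapshot, so that repeated calls within a session share a single request:

```typescript
// Hypothetical helper: wraps a loader so it runs at most once per session.
// Every caller receives the same promise, so the network is hit a single time.
function loadOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = loader();
    }
    return pending;
  };
}

// Stand-in loader that counts calls. A real one would fetch the Arrow file
// and return its bytes; the values here are placeholders, not a valid stream.
let snapshotRequests = 0;
const loadSnapshot = loadOnce(async () => {
  snapshotRequests += 1;
  return new Uint8Array([1, 2, 3]);
});

const firstLoad = loadSnapshot();
const secondLoad = loadSnapshot(); // reuses the promise from the first call
```

One design caveat: a failed load is cached too, so a production version would probably clear `pending` when the promise rejects.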

The caching question this raises

Running client-side doesn't mean ignoring freshness — it means being deliberate about when data gets re-fetched versus when it's safe to reuse what's already loaded. A reasonable default: load an Arrow-encoded snapshot on open, query against it locally for the session, and re-fetch on a defined interval or explicit refresh rather than on every interaction.

// snapshot loaded once per session, then queried locally —
// re-fetch is a deliberate action, not implicit per-query
await conn.insertArrowFromIPCStream(cachedArrowBuffer, {
  name: "quarterly_revenue",
});

// every filter change below is a local query, not a network call
const filtered = await conn.query(
  `select * from quarterly_revenue where region = 'North'`
);

This is close to the snapshot_then_requery pattern — cache what you have, serve it instantly, and requery deliberately rather than on a hair-trigger. It's a reasonable default for most analytical documents: fast by default, current on a schedule you control.
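
The refresh rule itself fits in a few lines. This is a sketch under assumptions: `needsRefetch`, `SnapshotState`, and the 15-minute interval are illustrative names and values, not anything DuckDB WASM provides.

```typescript
// Hypothetical policy sketch; none of these names come from DuckDB WASM.
type SnapshotState = { loadedAt: number | null }; // ms timestamp of the last load

function needsRefetch(
  snapshot: SnapshotState,
  now: number,
  maxAgeMs: number,
  explicitRefresh: boolean = false
): boolean {
  if (snapshot.loadedAt === null) return true; // nothing loaded yet
  if (explicitRefresh) return true; // the viewer asked for fresh data
  return now - snapshot.loadedAt >= maxAgeMs; // snapshot has aged out
}

const FIFTEEN_MINUTES_MS = 15 * 60 * 1000; // assumed refresh interval
const loadedSnapshot: SnapshotState = { loadedAt: 1000 };

// A filter change five seconds after load stays local.
const refetchOnFilterChange = needsRefetch(loadedSnapshot, 6000, FIFTEEN_MINUTES_MS);
// Once the interval has elapsed, the next check triggers a re-fetch.
const refetchAfterInterval = needsRefetch(loadedSnapshot, 1000 + FIFTEEN_MINUTES_MS, FIFTEEN_MINUTES_MS);
// An explicit refresh always re-fetches, however fresh the snapshot is.
const refetchOnRefreshClick = needsRefetch(loadedSnapshot, 6000, FIFTEEN_MINUTES_MS, true);
```

Keeping the decision in a pure function makes the policy easy to test and to tune per document, separately from the code that fetches and inserts the snapshot.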

Where this lands architecturally

The practical effect is that a published, embedded analytical page can behave like a lightweight app rather than a thin client to a server: genuinely interactive, without the hosting and maintenance burden of a backend serving every query. Infigure uses DuckDB WASM as its sole query engine for exactly this reason: it's the same engine whether you're actively editing on a canvas or someone else is viewing a published page, with an Arrow-encoded snapshot cached directly in the document itself.

Where the limits are

Client-side execution has a ceiling: very large datasets or genuinely heavy joins still benefit from doing the work server-side first and shipping a smaller, pre-aggregated result to the browser. The right read isn't "everything runs client-side now"; it's that a meaningful slice of analytical workloads that used to require a backend no longer does.
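
To make "pre-aggregated" concrete, here is a sketch of the roll-up a server might do before shipping data. The row shapes and the `preAggregate` name are hypothetical, and in practice this step would more likely be a SQL `group by` on the server; the TypeScript version only shows how detail rows collapse into a much smaller result.

```typescript
// Hypothetical row shapes, for illustration only.
type SaleRow = { region: string; revenue: number };
type RegionTotal = { region: string; total: number };

// Collapse detail rows into one total per region, largest first.
function preAggregate(rows: SaleRow[]): RegionTotal[] {
  const totals = new Map<string, number>();
  for (const row of rows) {
    totals.set(row.region, (totals.get(row.region) ?? 0) + row.revenue);
  }
  return Array.from(totals, ([region, total]) => ({ region, total }))
    .sort((a, b) => b.total - a.total);
}

// Made-up sample data: four detail rows become three summary rows.
const detailRows: SaleRow[] = [
  { region: "North", revenue: 120 },
  { region: "South", revenue: 80 },
  { region: "North", revenue: 30 },
  { region: "West", revenue: 200 },
];
const regionTotals = preAggregate(detailRows);
```

The browser then receives one row per region rather than one row per sale, which is the kind of payload a client-side engine handles comfortably.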

