Building unbg: how we run AI background removal 100% in the browser
Why we built unbg with ONNX Runtime Web and WebAssembly instead of a server, what we learned about running neural networks client-side, and why it changes the privacy story for image tools.

Every popular background-removal tool has the same problem. You upload your image to someone else's server. That is fine for a throwaway avatar, but it is a non-starter for product photography, personal photos, or anything under NDA. We built unbg to run the entire pipeline in the browser. No upload, no account, no watermark, no limits. This post walks through how.
The architecture in one diagram
The whole app is a single Next.js 16 page that loads an ONNX model into the browser and runs inference via WebAssembly.
User selects image
↓
Canvas API → Uint8Array
↓
@imgly/background-removal (ONNX Runtime Web + WASM)
↓
Transparent PNG → download or edit
↓
(optional) brush editor → final download
The image itself never goes through a fetch call, and there are no API keys. The first run downloads about 80 MB of model weights, after which everything is cached in IndexedDB and works offline.
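As a rough illustration of the flow in the diagram, here is a minimal sketch of the inference step. It is not the app's actual code. The `Segmenter` type and `runPipeline` function are hypothetical names, and the segmentation call is passed in as a parameter so the sketch stays self-contained. In the browser that parameter would be filled by the `removeBackground` function from @imgly/background-removal, which accepts an image source such as a Blob and resolves to a Blob.

```typescript
// Hypothetical signature for the segmentation step. In the browser this would
// be a thin wrapper around removeBackground from @imgly/background-removal.
type Segmenter = (image: Blob) => Promise<Blob>;

interface PipelineResult {
  cutout: Blob;      // image with transparent background, as returned by the segmenter
  elapsedMs: number; // wall-clock time spent in the segmentation call
}

// Runs one image through the pipeline entirely in memory. Nothing in this
// function touches the network; the image is only handed to the segmenter.
async function runPipeline(image: Blob, segment: Segmenter): Promise<PipelineResult> {
  if (!image.type.startsWith("image/")) {
    throw new Error(`Expected an image, got "${image.type || "unknown"}"`);
  }
  const start = Date.now();
  const cutout = await segment(image);
  return { cutout, elapsedMs: Date.now() - start };
}
```

Keeping the segmenter as a parameter also makes the surrounding UI code easy to test with a fake that returns a fixed Blob.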
Why ONNX Runtime Web
We considered TensorFlow.js and an in-house wrapper around MediaPipe, but ONNX Runtime Web won on three axes.
- Model portability. The team behind @imgly/background-removal ships a well-tuned ONNX segmentation model. Switching to a newer one is a file swap.
- WASM SIMD. On modern browsers we get near-native speed via WASM SIMD. A 1024x1024 image segments in about 1.5 seconds on an M-series MacBook and in around 3 seconds on a mid-range Android phone.
- No native dependencies. No CUDA, no node-gyp, no server. The same codebase runs identically on every device.
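Since the speed numbers above depend on WASM SIMD being available, it can be useful to check for it at startup. The sketch below uses a common feature-detection trick: ask the engine to validate a tiny module that contains SIMD instructions. The names `SIMD_PROBE` and `hasWasmSimd` are illustrative, and nothing here is taken from the unbg codebase.

```typescript
// A tiny WebAssembly module whose single function returns a v128 value built
// with i32.const 0, i8x16.splat, i8x16.popcnt. Engines without SIMD support
// reject it during validation.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,                  // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,                       // type section: () -> v128
  3, 2, 1, 0,                                   // function section: one function of type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11, // code section with the SIMD ops
]);

// True when the current engine accepts SIMD instructions.
function hasWasmSimd(): boolean {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}
```

An app could use the result to show a "this may be slow on your device" hint before the user starts a large batch.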
The hard part: mobile
Browser ML looks great in a desktop demo. Mobile is where it falls apart. Three issues dominated our testing.
- HEIC photos. iPhone galleries are full of HEIC files, which most browsers cannot decode natively. We added an automatic HEIC-to-JPEG conversion step before inference.
- Memory. A 4000x3000 image plus model weights will crash Safari on a 4GB iPhone. We downsample oversized inputs before inference and upscale the mask afterward.
- Touch input. The brush editor needed full touch support. That meant pointer events, coalesced events for smoothness, and palm rejection on the canvas.
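Two of the issues above reduce to small pure functions, sketched here under stated assumptions. `fitWithin` computes downsampled dimensions for an oversized input; the 1024-pixel cap used in the usage note is an assumed value, not a figure from unbg. `looksLikeHeic` sniffs the file header rather than trusting the reported MIME type, which can be empty for HEIC files on some platforms. The brand list is a common subset, not an exhaustive one. All names are hypothetical.

```typescript
interface FitResult {
  width: number;
  height: number;
  scale: number; // multiply original coordinates by this to get downsampled ones
}

// Shrinks dimensions so the longer side is at most maxSide, preserving aspect
// ratio. Images that already fit are returned unchanged (never upscaled).
function fitWithin(width: number, height: number, maxSide: number): FitResult {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
    scale,
  };
}

// Bytes needed for one decoded RGBA bitmap (4 bytes per pixel).
function rgbaBytes(width: number, height: number): number {
  return width * height * 4;
}

// ISO-BMFF files carry the box type "ftyp" at bytes 4..7, followed by a
// 4-byte major brand. These brands are commonly seen on HEIC/HEIF files.
const HEIF_BRANDS = new Set(["heic", "heix", "hevc", "hevx", "mif1", "msf1"]);

function looksLikeHeic(header: Uint8Array): boolean {
  if (header.length < 12) return false;
  const ascii = (from: number, to: number): string => {
    let text = "";
    for (let i = from; i < to; i++) text += String.fromCharCode(header[i]);
    return text;
  };
  return ascii(4, 8) === "ftyp" && HEIF_BRANDS.has(ascii(8, 12));
}
```

For the 4000x3000 example, `rgbaBytes(4000, 3000)` is 48,000,000 bytes (about 46 MiB) for a single decoded copy, while `fitWithin(4000, 3000, 1024)` gives 1024x768, which needs 3,145,728 bytes. The mask produced at the smaller size is then scaled back up by `1 / scale` before it is applied to the original.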
UX details that mattered
Things that moved the product:
- Before-and-after slider. The single feature that made users trust the output. Without it, they kept asking whether it had actually worked.
- Brush editor with erase and restore. AI gets edges wrong on hair and fur. A simple masking editor covers the gap without needing a better model.
- Batch mode. E-commerce sellers have 50 photos at a time, not one. Processing the queue sequentially rather than in parallel, to keep memory usage in check, made it usable.
- Warm model on first interaction. We preload the model as soon as the user drops a file, before they click Start. That shaves two seconds off perceived latency.
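The batch and warm-up ideas above can be sketched as two small helpers. This is an illustration under assumptions, not the unbg implementation: `processSequentially` and `once` are hypothetical names, and the actual model-loading call is left as a parameter.

```typescript
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Processes items strictly one at a time, so only one image's buffers are
// alive at any moment. A failed item is recorded and the queue keeps going.
async function processSequentially<T, R>(
  items: readonly T[],
  worker: (item: T, index: number) => Promise<R>,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = [];
  for (let i = 0; i < items.length; i++) {
    try {
      results.push({ ok: true, value: await worker(items[i], i) });
    } catch (error) {
      results.push({ ok: false, error });
    }
  }
  return results;
}

// Wraps an async loader so concurrent and repeated calls share one promise.
// If the load fails, the cached promise is cleared so a later call can retry.
function once<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) {
      pending = load().catch((error) => {
        pending = undefined;
        throw error;
      });
    }
    return pending;
  };
}
```

With a wrapper like `once`, the drop handler can call the warm-up function without awaiting it, and the Start handler can await the same function. Whichever runs first triggers the download, and the other simply joins it.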
What we gave up
Running everything client-side means we cannot do a few things.
- Use a bigger, more accurate model. We are capped by what will run on a mid-range phone.
- Charge for API access. The model lives in the user's browser, so the business model has to be something else. For unbg, we just give it away.
- Collect usage data beyond anonymous, aggregate analytics. No server means no ability to see who used what.
We think the tradeoff is worth it. For a privacy-sensitive, commoditized feature like background removal, the claim that everything runs on your own device is a strong enough differentiator that it does not need a pricing page.
Bigger picture: browser ML is ready
WASM SIMD, ONNX Runtime Web, and IndexedDB model caching together make a whole category of small-model, privacy-sensitive, latency-sensitive apps viable without a backend: background removal, OCR, audio transcription, pose estimation. All of these can run in a tab today.
If you are shipping something where users are already skeptical about uploading their data, moving inference into the browser removes the objection entirely. Your infrastructure bill collapses to a CDN.
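For readers building their own version, the model-caching piece mentioned above follows a simple cache-first pattern. The sketch below is generic and does not describe how @imgly/background-removal caches internally. `ByteStore`, `loadModel`, and `memoryStore` are hypothetical names. In a browser the store would be backed by IndexedDB; a Map-backed stand-in keeps the example self-contained.

```typescript
// Hypothetical minimal key-value interface for model bytes.
interface ByteStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

type Download = (url: string) => Promise<Uint8Array>;

// Returns cached bytes when present; otherwise downloads once and stores
// them, so later visits (including offline ones) skip the network.
async function loadModel(url: string, store: ByteStore, download: Download): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached) return cached;
  const bytes = await download(url);
  await store.set(url, bytes);
  return bytes;
}

// In-memory stand-in for an IndexedDB-backed store.
function memoryStore(): ByteStore {
  const map = new Map<string, Uint8Array>();
  return {
    get: async (key) => map.get(key),
    set: async (key, value) => {
      map.set(key, value);
    },
  };
}
```

Because the cache key is the URL, shipping a new model under a new file name is enough to invalidate the old entry.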
Try it or build one
unbg.tech is live and free. If you are thinking about running your own ML features in the browser and want to compare notes on model size, caching strategy, or mobile issues, get in touch.
