State-of-the-art image generation, in your browser.
Bonsai Image 4B is built to run entirely in the browser, on your own GPU over WebGPU: nothing leaves the device, there is no per-prompt server cost, and iteration involves no network round-trip. Its diffusion transformer fits in roughly 1 GB (0.93 GB for the 1-bit variant, 1.21 GB for ternary) and reaches 88–95% of FLUX.2 [klein] 4B in PrismML's launch benchmarks. Open weights, Apache-2.0.
Built from FLUX.2 [klein] 4B and re-quantized to binary / ternary weights, Bonsai is compact enough to download once, cache on the device, and run locally — laptops and phones included.
Requires a WebGPU browser. Chrome / Edge 113+ are the primary verified path. Safari 18+ is a target; on-device verification is pending.
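A page can check the WebGPU requirement before offering a download. The sketch below is one way to do that, not the studio's actual code: `hasWebGPU` and `canRunOnGpu` are illustrative names, and the small interfaces stand in for the browser's own types so the snippet compiles on its own. `navigator.gpu` and `requestAdapter()` are the standard WebGPU entry points.

```typescript
// Minimal structural types so the sketch compiles without WebGPU type packages.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Cheap synchronous check: does this environment expose the WebGPU entry point?
function hasWebGPU(nav: NavigatorLike | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined;
}

// Fuller check: the API can exist while no usable adapter is available,
// in which case requestAdapter() resolves to null.
async function canRunOnGpu(nav: NavigatorLike | undefined): Promise<boolean> {
  if (nav === undefined || nav.gpu === undefined) return false;
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}
```

In a page, the argument would be the global `navigator` object; a `false` result is the point at which to show the browser requirement instead of starting the download.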
Interface visualization — not a generated image
First-run download 3.43–3.89 GB · cached locally
Technology preview. The in-browser studio currently runs a reference mock while we vendor and verify the WebGPU runtime on-device. The technology is real today — see PrismML's live WebGPU demo.
A studio for a model that fits on your device
General Intelligence is a studio for Bonsai Image 4B — an open-weight, 4B-parameter diffusion model built from FLUX.2 [klein] 4B and re-quantized to binary or ternary weights so it is compact enough to run locally in a WebGPU browser. You pick a variant and write a prompt; once the WebGPU runtime is in place, generation runs on your own hardware — nothing uploaded, nothing metered. When a result is worth selling, you publish it to the marketplace with its full provenance attached.
Three steps, all on your machine
No account, no upload, no queue. The model loads into your browser and stays there.
- 01 Pick a variant
  Ternary for maximum quality, or binary for the smallest footprint. Each shows its exact download size up front.
- 02 Type a prompt
  Describe what you want. Set steps and a seed, or leave the defaults; the resolved seed is shown so any result is reproducible.
- 03 Generate on your GPU
  Once the WebGPU runtime is in place, generation runs locally, with no server round-trip, and nothing leaves your device until you publish.
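The seed behaviour in step 02 can be sketched as follows. This is a minimal illustration under stated assumptions, not the studio's implementation: `resolveSeed` and `initialNoise` are hypothetical names, mulberry32 is a small public-domain PRNG chosen only for brevity, and the uniform values stand in for the Gaussian latent noise a real diffusion sampler would draw.

```typescript
// Resolve the seed: keep a user-supplied value, otherwise draw a random 32-bit one.
function resolveSeed(userSeed?: number): number {
  if (userSeed !== undefined) return userSeed >>> 0; // coerce to unsigned 32-bit
  return Math.floor(Math.random() * 0x100000000);
}

// mulberry32: a small 32-bit PRNG, used here only for illustration.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Stand-in for initial latent noise: n pseudo-random values derived from the seed.
function initialNoise(seed: number, n: number): number[] {
  const next = mulberry32(seed);
  return Array.from({ length: n }, () => next());
}
```

Because the starting noise is a pure function of the resolved seed, showing that seed alongside the result is what lets the same prompt and settings be replayed later.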
First run downloads the model once (~3.4–3.9 GB) and caches it on your device; later runs load from cache.
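The download-once, load-from-cache behaviour can be sketched with the browser's Cache API. This is an assumption about how such caching might be done, not a description of the shipped runtime: `loadWeights` and `CacheLike` are hypothetical names, and the interface covers only the two `Cache` methods the sketch needs.

```typescript
// Minimal slice of the browser Cache interface used by this sketch.
interface CacheLike {
  match(url: string): Promise<Response | undefined>;
  put(url: string, response: Response): Promise<void>;
}

// Return the cached weights if present; otherwise fetch once and store a copy.
async function loadWeights(
  url: string,
  cache: CacheLike,
  fetcher: (url: string) => Promise<Response> = (u) => fetch(u),
): Promise<ArrayBuffer> {
  const hit = await cache.match(url);
  if (hit !== undefined) return hit.arrayBuffer();

  const response = await fetcher(url);
  if (!response.ok) throw new Error(`download failed: ${response.status}`);
  await cache.put(url, response.clone()); // clone: a response body can be read only once
  return response.arrayBuffer();
}
```

In a page, the `cache` argument would come from `await caches.open(name)` with a cache name of your choosing. A multi-gigabyte file would more likely be streamed and stored in shards than read into one buffer; the single buffer here keeps the sketch short.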
In this preview the studio runs a reference mock so you can walk through the full flow; real generation is enabled once the WebGPU runtime is verified on-device.
Two ways to trade footprint for fidelity
Bonsai re-quantizes the FLUX.2 [klein] diffusion transformer to binary or ternary weights. The result moves the quality–footprint frontier: per the benchmarks below, 88–95% of FLUX.2 [klein] 4B's score in roughly 12–16% of its 7.75 GB footprint.
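To make "binary or ternary weights" concrete, here is a generic sketch of both mappings. It is not Bonsai's quantization method, which the page does not describe: it uses sign quantization for binary and absmean rounding for ternary, two commonly used schemes, with one shared scale per group of weights. All function names are illustrative.

```typescript
// Mean absolute value of a weight group, used as its shared scale.
function absMean(weights: number[]): number {
  const sum = weights.reduce((acc, w) => acc + Math.abs(w), 0);
  return sum / weights.length;
}

// Binary: keep only the sign, so each weight becomes -1 or +1.
function quantizeBinary(weights: number[]): { q: number[]; scale: number } {
  const scale = absMean(weights);
  return { q: weights.map((w) => (w >= 0 ? 1 : -1)), scale };
}

// Ternary: round w / scale to the nearest of -1, 0, +1, so small weights become 0.
function quantizeTernary(weights: number[]): { q: number[]; scale: number } {
  const scale = absMean(weights);
  const q = weights.map((w) => {
    if (scale === 0) return 0;
    const r = Math.max(-1, Math.min(1, Math.round(w / scale)));
    return r === 0 ? 0 : r; // normalize -0 to 0
  });
  return { q, scale };
}

// Reconstruct approximate weights from the codes and the shared scale.
function dequantize(q: number[], scale: number): number[] {
  return q.map((v) => v * scale);
}
```

For the group `[0.9, -0.05, -1.1, 0.2]`, the binary codes are `[1, -1, -1, 1]` and the ternary codes are `[1, 0, -1, 0]`: the ternary mapping can drop the two small weights to zero, which the binary mapping cannot.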
Ternary Bonsai Image 4B
Quality · {−1, 0, +1} · 1.71 effective bits / weight
The quality default. The extra zero state buys representational flexibility — better visual quality and prompt fidelity.
First-run download 3.89 GB
1-bit Bonsai Image 4B
Footprint · {−1, +1} · 1.125 effective bits / weight
The footprint default. Brings the diffusion transformer below 1 GB — the right fit when memory and bandwidth are the constraint.
First-run download 3.43 GB
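The two bits-per-weight figures can be reproduced with simple arithmetic. The decomposition below is an assumption, since the page does not say how the figures are derived: it counts log2(states) bits for each weight's code plus one 16-bit scale shared across a group of 128 weights. `effectiveBits` is an illustrative name.

```typescript
// Effective bits per weight = bits for the code itself + amortized scale bits.
// ASSUMPTION: one 16-bit scale per group of 128 weights (not stated in the source).
function effectiveBits(states: number, scaleBits = 16, groupSize = 128): number {
  return Math.log2(states) + scaleBits / groupSize;
}

const binaryBits = effectiveBits(2);  // 1 + 0.125 = 1.125
const ternaryBits = effectiveBits(3); // log2(3) + 0.125, about 1.71

console.log(binaryBits.toFixed(3), ternaryBits.toFixed(2));
```

Under that assumption the extra zero state costs about 0.585 bits per weight, which is the gap between the two variants.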
Benchmarks
GenEval (object composition & attribute binding); HPSv3 (human-preference & aesthetic quality); DPG-Bench (dense prompt following & semantic faithfulness).
| Model | Footprint | GenEval | HPSv3 | DPG-Bench | vs [klein] |
|---|---|---|---|---|---|
| Ternary Bonsai Image 4B | 1.21 GB | 0.723 | 12.22 | 0.851 | 95% |
| 1-bit Bonsai Image 4B | 0.93 GB | 0.671 | 11.15 | 0.822 | 88% |
| FLUX.2 [klein] 4B | 7.75 GB | 0.819 | 12.84 | 0.853 | 100% |
| SDXL | 5.14 GB | 0.300 | 10.05 | 0.740 | 67% |
| Stable Diffusion 1.5 | 1.72 GB | 0.396 | 4.20 | 0.601 | 51% |
Source: PrismML launch benchmarks (GenEval / HPSv3 / DPG-Bench). Higher is better for the three benchmark scores; a smaller footprint is better.
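The page does not define the "vs [klein]" column. One plausible reading, offered here as an assumption only, is the mean of the three per-benchmark ratios against FLUX.2 [klein] 4B. With the scores as printed, that reading gives 88%, 67% and 51% for the 1-bit, SDXL and Stable Diffusion 1.5 rows, and about 94.4% for the ternary row, one point under the listed 95%, so the published figures may come from unrounded scores or a different definition. `relativeToBaseline` is an illustrative name.

```typescript
type Scores = { genEval: number; hpsv3: number; dpgBench: number };

// Baseline row from the table: FLUX.2 [klein] 4B.
const klein: Scores = { genEval: 0.819, hpsv3: 12.84, dpgBench: 0.853 };

// ASSUMED definition: mean of per-benchmark ratios to the baseline, as a percentage.
function relativeToBaseline(model: Scores, baseline: Scores): number {
  const ratios = [
    model.genEval / baseline.genEval,
    model.hpsv3 / baseline.hpsv3,
    model.dpgBench / baseline.dpgBench,
  ];
  const mean = ratios.reduce((a, b) => a + b, 0) / ratios.length;
  return 100 * mean;
}

// Rows copied from the table.
const rows: Record<string, Scores> = {
  "Ternary Bonsai Image 4B": { genEval: 0.723, hpsv3: 12.22, dpgBench: 0.851 },
  "1-bit Bonsai Image 4B": { genEval: 0.671, hpsv3: 11.15, dpgBench: 0.822 },
  "SDXL": { genEval: 0.3, hpsv3: 10.05, dpgBench: 0.74 },
  "Stable Diffusion 1.5": { genEval: 0.396, hpsv3: 4.2, dpgBench: 0.601 },
};

for (const [name, scores] of Object.entries(rows)) {
  console.log(name, relativeToBaseline(scores, klein).toFixed(1) + "%");
}
```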
Chart: Mean-active memory (after the text encoder is offloaded).
Chart: Generation speed (512² image, 4 denoising steps).
PrismML reports that, to its knowledge, Bonsai Image 4B is the first image model in its parameter class to run directly on an iPhone.
Image-making is iterative. The model should be too.
Cloud generation turns every prompt into a remote request — metered, billed, and gated behind latency. Once the model fits on the device, that friction disappears.
Private by default
Prompts and generated images stay on the device unless you choose to publish. Generation happens in the page, not on a server.
No marginal cost
Every cloud prompt carries a serving cost that someone has to pay. On-device generation runs on your own GPU, so there is no per-prompt server cost.
Instant iteration
Image-making is iterative — revise, compare, re-roll, discard. With no server round-trip per attempt, the creative loop sits inside the product.
Publish what you make. License it.
Keep everything local while you iterate. When a result is worth selling, publish it to the marketplace and offer it under a non-exclusive license — the full provenance of how it was generated travels with it.
Commercial use of generated images
The recorded model-stack finding is that Bonsai, its FLUX.2 [klein] 4B base, and its Qwen3-4B text encoder are all licensed under Apache-2.0, which imposes no non-commercial restriction on generated outputs. Generated images therefore carry no license restriction from the model stack.
- Listings are sold under a non-exclusive license — the creator keeps the right to relist or reuse.
- Each image ships with its full provenance (model, variant, base, seed, sampler) recorded at generation time.
- The model-stack finding above is informational, not legal advice.
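The provenance record described above can be sketched as a small immutable type. The five field names come from the list (model, variant, base, seed, sampler); the field types, the `recordProvenance` helper, and the `"euler"` sampler value in the usage note are assumptions made for illustration, not the marketplace's actual schema.

```typescript
type Variant = "ternary" | "1-bit";

// Field names follow the page; the types are assumed.
interface Provenance {
  readonly model: string;
  readonly variant: Variant;
  readonly base: string;
  readonly seed: number;
  readonly sampler: string;
}

// Build the record at generation time and freeze it so it cannot be edited later.
function recordProvenance(variant: Variant, seed: number, sampler: string): Provenance {
  return Object.freeze({
    model: variant === "ternary" ? "Ternary Bonsai Image 4B" : "1-bit Bonsai Image 4B",
    variant,
    base: "FLUX.2 [klein] 4B",
    seed,
    sampler,
  });
}
```

A call such as `recordProvenance("ternary", 42, "euler")` yields a plain object that serializes with `JSON.stringify`, so it can be attached to a listing when the image is published.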