🔭 Experimental

Vision Lab

Point your camera, ask a question. A vision-language model runs entirely on your device via WebGPU. Pick SmolVLM-500M for low-RAM machines or Moondream2 for richer answers. Frames never leave the browser.
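A minimal sketch of the kind of check a page like this could run before offering a model. It is not this page's actual code. The names `NavigatorLike`, `hasUsableWebGPU`, `suggestModel`, and the `LOW_RAM_GIB` threshold are hypothetical. `navigator.gpu.requestAdapter()` is the standard WebGPU entry point, and `navigator.deviceMemory` is a Chromium-only hint that reports at most 8 (GiB).

```typescript
// Minimal shapes for the parts of `navigator` this sketch reads, so the
// example does not depend on WebGPU type packages being installed.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;          // undefined in browsers without WebGPU
  deviceMemory?: number;  // Chromium-only hint in GiB, capped at 8
}

type ModelChoice = "SmolVLM-500M" | "Moondream2";

// Hypothetical threshold: below this, suggest the smaller model.
const LOW_RAM_GIB = 8;

// `navigator.gpu` can exist even when no adapter is available,
// so ask for an adapter before treating WebGPU as usable.
async function hasUsableWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false;
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}

// Fall back to the smaller model when the memory hint is missing or low.
function suggestModel(nav: NavigatorLike): ModelChoice {
  const mem = nav.deviceMemory;
  if (mem === undefined || mem < LOW_RAM_GIB) return "SmolVLM-500M";
  return "Moondream2";
}
```

In a real page you would pass the browser's `navigator` object (cast as needed) and treat the result only as a default; the user can still pick either model.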

One-time setup

Downloads the model from HuggingFace's CDN. This happens once.
Caches it in your browser's persistent storage (Origin Private File System).
Subsequent visits load in ~3 seconds with no network access.
Frames are processed on-device. The model's text answer is the only thing that leaves your browser, and only when you press "Find compatible skills".
Use a normal tab. Private/Incognito browsing wipes OPFS storage on reload, so the model would re-download every time.
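The download-once, cache-in-OPFS behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not this page's implementation. `loadCached`, `DirectoryLike`, `FileHandleLike`, and the `download` callback are hypothetical names. The methods they mirror (`getFileHandle`, `getFile`, `createWritable`) are the standard File System API calls available on the handle returned by `navigator.storage.getDirectory()`.

```typescript
// Minimal shapes mirroring the OPFS handles used below. The real
// FileSystemDirectoryHandle from navigator.storage.getDirectory() fits them.
interface FileHandleLike {
  getFile(): Promise<{ arrayBuffer(): Promise<ArrayBuffer> }>;
  createWritable(): Promise<{
    write(data: ArrayBuffer): Promise<void>;
    close(): Promise<void>;
  }>;
}
interface DirectoryLike {
  getFileHandle(name: string, options?: { create?: boolean }): Promise<FileHandleLike>;
}

// Cache-first load: return the stored bytes if present, otherwise
// download once, write them to the directory, and return them.
async function loadCached(
  root: DirectoryLike,
  name: string,
  download: () => Promise<ArrayBuffer>, // hypothetical fetch of the model file
): Promise<ArrayBuffer> {
  try {
    // Without { create: true } this rejects when the file does not exist.
    const existing = await root.getFileHandle(name);
    const file = await existing.getFile();
    return await file.arrayBuffer();
  } catch {
    // Assumption: any failure here is treated as a cache miss.
  }
  const bytes = await download();
  const handle = await root.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  await writable.write(bytes);
  await writable.close(); // data is only committed on close
  return bytes;
}
```

In a browser, `root` would come from `await navigator.storage.getDirectory()`, and `download` might wrap `fetch(url).then((r) => r.arrayBuffer())`. A page can also call `navigator.storage.persist()` to ask the browser not to evict the data, though the browser may decline.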

Model

Bandwidth tip: the download is large. On cellular it'll eat your data plan; prefer Wi-Fi.

Answer