Automatic transcription of institutional media fails most often on names and acronyms. The institution needed reliable captions with a tight human-review loop, all running locally.
13 · Production (Mac mini) · SDCCD Captioning System
Whisper hears it, a local LLM polishes it, a pattern-flagger catches the names — shipped as a real Mac app.
A production captioning service: whisper.cpp transcription, a local LLM polish pass, and a four-layer vocabulary pipeline for names and acronyms — packaged as a signed macOS app and used in production on a Mac mini.
whisper.cpp (large-v3-turbo) handles transcription, Ollama/Qwen2.5 polishes the transcript, and ffmpeg manages media. A four-layer vocabulary brain (a Whisper hint window, an LLM polish, a pattern flagger, and human review) catches the hard names. Captions render to the DCMP standard. The whole service is a FastAPI application packaged as a signed macOS app with closed-onboarding auth.
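To make the pattern-flagger layer concrete, here is a minimal sketch of how such a step might work. It is not the production code, which is private. The vocabulary entries other than SDCCD, the `Flag` type, the `flag_segment` function, and the 0.8 similarity cutoff are all assumptions made for illustration. The idea is to skip known terms, suggest a correction for near misses, and send unknown acronyms to human review.

```python
import difflib
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical vocabulary; "Okonkwo" is a made-up name used only for this example.
VOCAB = ["SDCCD", "Okonkwo"]

WORD = re.compile(r"[A-Za-z][A-Za-z'-]+")   # words of two or more characters
ACRONYM = re.compile(r"[A-Z]{2,6}")         # short all-caps tokens


@dataclass
class Flag:
    token: str                  # the token as it appears in the transcript
    suggestion: Optional[str]   # closest vocabulary entry, if any
    reason: str                 # why the token was flagged


def flag_segment(text: str, vocab: List[str], cutoff: float = 0.8) -> List[Flag]:
    """Flag tokens that look like misheard vocabulary terms or unknown acronyms."""
    known = {v.lower(): v for v in vocab}
    flags: List[Flag] = []
    for match in WORD.finditer(text):
        token = match.group()
        low = token.lower()
        if low in known:
            continue  # exact vocabulary hit, nothing to review
        close = difflib.get_close_matches(low, list(known), n=1, cutoff=cutoff)
        if close:
            flags.append(Flag(token, known[close[0]], "near miss"))
        elif ACRONYM.fullmatch(token):
            flags.append(Flag(token, None, "unknown acronym"))
    return flags


flags = flag_segment("The SDCD board welcomed Dr. Okonkwoe to SDCCD.", VOCAB)
for f in flags:
    print(f.token, "->", f.suggestion, f"({f.reason})")
```

In a design like this, the flagger never rewrites the transcript itself. It only produces a review queue, which keeps the final decision on names with the human reviewer.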
Architect and engineer — the transcription pipeline, the vocabulary brain, the caption renderer, and the macOS packaging.
Running in production on a Mac mini. (Private system — architecture and status only.)
Technical proof.
- whisper.cpp large-v3-turbo + Ollama/Qwen2.5 + ffmpeg transcription pipeline.
- 4-layer vocab brain: hint window → LLM polish → pattern flagger → human review.
- DCMP-standard caption rendering.
- FastAPI packaged as a signed macOS .app with closed-onboarding auth.
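The caption-rendering item above can be illustrated with a small line-breaking sketch. It assumes limits of 32 characters per line and two lines per caption, which is how the DCMP Captioning Key is commonly summarized; check those numbers against the current guideline before relying on them. The function name `to_caption_frames` and the sample sentence are hypothetical, and the real renderer also has to handle timing, which this sketch leaves out.

```python
import textwrap
from typing import List

MAX_CHARS = 32  # assumed per-line limit
MAX_LINES = 2   # assumed lines per caption frame


def to_caption_frames(text: str,
                      max_chars: int = MAX_CHARS,
                      max_lines: int = MAX_LINES) -> List[List[str]]:
    """Wrap text into lines, then group the lines into caption frames."""
    # break_long_words=False keeps names intact; a single word longer than
    # max_chars would therefore end up on an over-long line of its own.
    lines = textwrap.wrap(text, width=max_chars, break_long_words=False)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


sample = ("The board of trustees approved the new captioning "
          "service for all district media.")
for frame in to_caption_frames(sample):
    print("\n".join(frame))
    print("--")
```

Wrapping on word boundaries only is a deliberate simplification. A fuller renderer would also prefer breaks at phrase boundaries and would keep a name and its title on the same line.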