Voice Messages
lynox accepts voice input on two surfaces:
- Web UI (primary) — record in the browser, transcript is sent back for review or submitted directly as a message.
- Mail / Inbox — forward a voice attachment (m4a / mp3 / ogg) to your connected mailbox and lynox transcribes it before running the task.
Transcription happens server-side. Two providers are supported; you can choose which one runs.
Keyboard shortcut
In the Web UI, double-tap ⌘ (macOS) or Ctrl (Windows/Linux) to start or stop recording: no chord, just two quick taps on the bare modifier within 350 ms. The shortcut is designed to avoid collisions with browser and OS bindings, which rarely use a bare modifier as a hotkey, so it works the same in any focused field. Stop with the same gesture or click the microphone icon.
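The gesture can be sketched as a small state machine. This is a minimal illustration, not lynox's actual implementation: the `DoubleTapDetector` class is hypothetical, and it assumes the 350 ms window is measured between the two key releases and that any other key pressed in between cancels the gesture.

```typescript
// Hypothetical helper for illustration only.
const DOUBLE_TAP_WINDOW_MS = 350;

class DoubleTapDetector {
  private readonly modifier: string;
  private lastTapAt: number | null = null;
  private bare = false;

  constructor(modifier: string) {
    this.modifier = modifier; // "Meta" for ⌘, "Control" for Ctrl (KeyboardEvent.key values)
  }

  /** Call on every keydown with the event's key value. */
  keyDown(key: string): void {
    if (key === this.modifier) {
      this.bare = true; // a tap starts; it stays bare until another key joins
    } else {
      this.bare = false; // chord or ordinary typing: cancel any pending tap
      this.lastTapAt = null;
    }
  }

  /** Call on every keyup; returns true when a double tap completes. */
  keyUp(key: string, timeMs: number): boolean {
    if (key !== this.modifier) return false;
    const wasBare = this.bare;
    this.bare = false;
    if (!wasBare) {
      this.lastTapAt = null;
      return false;
    }
    if (this.lastTapAt !== null && timeMs - this.lastTapAt <= DOUBLE_TAP_WINDOW_MS) {
      this.lastTapAt = null; // consume the pair so a third tap starts fresh
      return true;
    }
    this.lastTapAt = timeMs;
    return false;
  }
}

// Browser wiring (sketch; toggleRecording is a hypothetical callback):
// const detector = new DoubleTapDetector("Meta");
// window.addEventListener("keydown", (e) => detector.keyDown(e.key));
// window.addEventListener("keyup", (e) => {
//   if (detector.keyUp(e.key, e.timeStamp)) toggleRecording();
// });
```

Keeping the detector free of DOM references makes the timing logic easy to unit-test with synthetic timestamps.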
Provider matrix
| Provider | Speed (60 s clip) | German WER (business speech) | Cost | Hosting | When to use |
|---|---|---|---|---|---|
| Mistral Voxtral | ~2 s | ~10 % on mixed DE/EN | $0.003/min | Mistral La Plateforme (Paris) | Default for cloud and BYOK setups. Fast, EU-hosted, no training on your audio. |
| whisper.cpp | several seconds (CPU) | ~23 % on mixed DE/EN | free | Local (your server) | Air-gapped self-host. OSS fallback when no Mistral key is set. |
Numbers come from the Phase 0 spike on ten self-recorded German-business-speech clips (see PRD voice-transcription-v2.md). whisper.cpp uses the ggml-base model on CPU.
Configuration
Provider selection
Set `transcription_provider` in your `~/.lynox/config.json`:

    { "transcription_provider": "auto" }

Values:
- `"auto"` (default): use Mistral if `MISTRAL_API_KEY` is set, otherwise whisper.cpp.
- `"mistral"`: force Mistral Voxtral. Transcription fails if the key is missing.
- `"whisper"`: force local whisper.cpp. Transcription fails if the binary or model is missing.
The LYNOX_TRANSCRIBE_PROVIDER environment variable overrides the config value.
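The precedence described above can be sketched as follows. This is an illustration under assumptions rather than lynox's actual code: `resolveProvider` is a hypothetical name, the environment variable is assumed to accept the same three values as the config key, and an empty variable is treated as unset.

```typescript
type Provider = "mistral" | "whisper";

interface ResolveInput {
  // Value of transcription_provider from ~/.lynox/config.json, if present.
  configValue?: string;
  // Environment, e.g. process.env in Node.
  env: Record<string, string | undefined>;
}

// Hypothetical helper: picks the provider that should run.
function resolveProvider(input: ResolveInput): Provider {
  const { configValue, env } = input;
  // The environment variable wins over the config file; "auto" is the default.
  const setting = env.LYNOX_TRANSCRIBE_PROVIDER || configValue || "auto";

  if (setting === "mistral") return "mistral";
  if (setting === "whisper") return "whisper";
  if (setting === "auto") {
    // Prefer Mistral when a key is available, otherwise fall back to whisper.cpp.
    return env.MISTRAL_API_KEY ? "mistral" : "whisper";
  }
  throw new Error(`Unknown transcription provider: ${setting}`);
}

// Config says "mistral", but the environment override selects whisper.cpp.
const chosen = resolveProvider({
  configValue: "mistral",
  env: { LYNOX_TRANSCRIBE_PROVIDER: "whisper" },
});
console.log(chosen); // → "whisper"
```

Note that resolution only selects a provider; a forced provider with a missing key or binary still fails later, at transcription time.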
Mistral Voxtral
Requires an API key from console.mistral.ai:

    export MISTRAL_API_KEY=...

lynox calls the `/v1/audio/transcriptions` endpoint with model `voxtral-mini-2602`. Only the documented parameters are sent (`file`, `model`, `language`). Your recording is transmitted to Mistral’s EU infrastructure; per Mistral’s terms, customer audio is not used to train models.
whisper.cpp
Requires the `whisper-cli` binary and a ggml model on the host:

    # macOS
    brew install whisper-cpp ffmpeg
    # ggml model (create the target directory first so curl can write to it)
    mkdir -p ~/.local/share/whisper
    curl -L -o ~/.local/share/whisper/ggml-base.bin \
      https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin

lynox defaults to the `base` model and falls back to `tiny` for clips under 10 seconds.
Glossary repair
Speech-to-text systems mishear proper nouns and product vocabulary in predictable ways. lynox applies a two-layer glossary to the raw transcript before returning it:
- Core glossary: lynox product vocabulary (`Setup Wizard`, `Go-Live`, `Knowledge Graph`, …). Seeded from known mishearings; extended via PR as new surface-area names ship.
- Session glossary: built at call time from your own context (CRM contact names, registered API/tool names, recent thread titles, Knowledge-Graph entity labels, custom workflow names). Tokens within edit distance ≤2 of a glossary term are rewritten to that term unless they appear in a common-language stop list, so `Rolland` becomes `Roland` but `rund` is left alone.
Both passes run in single-digit milliseconds. The glossary never leaves the lynox process; your vocabulary is never part of the audio-transcription API request.
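The session-glossary pass can be illustrated with a small sketch. It assumes plain Levenshtein distance and case-insensitive matching, which the description above does not specify. The function names, the contact name `Runa`, and the tiny stop list are hypothetical and exist only to show why the stop list matters.

```typescript
// Levenshtein distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, column j - 1
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value from the previous row, column j
      row[j] = Math.min(
        above + 1, // delete a[i - 1]
        row[j - 1] + 1, // insert b[j - 1]
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1), // substitute or match
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

// Hypothetical repair pass: rewrite each word to the closest glossary term
// within maxDistance, unless the word is in the stop list.
function repairTranscript(
  text: string,
  glossary: string[],
  stopWords: Set<string>,
  maxDistance = 2,
): string {
  return text.replace(/[A-Za-zÄÖÜäöüß]+/g, (token) => {
    const lower = token.toLowerCase();
    if (stopWords.has(lower)) return token;
    let best = token;
    let bestDistance = maxDistance + 1;
    for (const term of glossary) {
      const distance = editDistance(lower, term.toLowerCase());
      if (distance < bestDistance) {
        best = term;
        bestDistance = distance;
      }
    }
    return best;
  });
}

const glossary = ["Roland", "Runa"]; // hypothetical session terms
const stopWords = new Set(["rund", "um", "mit"]); // tiny illustrative stop list
console.log(repairTranscript("Termin mit Rolland rund um zehn", glossary, stopWords));
// → "Termin mit Roland rund um zehn"
```

Without the stop list, `rund` sits one edit away from the hypothetical contact `Runa` and would be rewritten, which is the failure the stop list guards against.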
Privacy
- Mistral-hosted audio: sent to Paris, not retained for training, not stored post-transcription per Mistral’s terms.
- whisper.cpp: audio never leaves the server; transient `/tmp` files are deleted after each transcription.
- The Web UI shows a short privacy hint under the voice button indicating which provider is in use.
- No audio is retained by lynox — only the final text goes into the thread history.
Troubleshooting
- “Transcription not available”: check that `MISTRAL_API_KEY` is set or that whisper.cpp is installed.
- Short clips mis-transcribed: whisper.cpp uses the `tiny` model for clips ≤10 s; record a slightly longer utterance to get the more accurate `base` model.
- Product term still wrong: if a new lynox product name ships and is misheard, add it to the core glossary (`src/core/transcribe/glossary/core-terms.ts`). Contact and tool names self-heal from the session glossary.