Hold a key anywhere on your machine, speak, release — your words land in the focused text field. A free, open-source, entirely local alternative to WisprFlow. And because Alexander AI Voice clones voices too, any AI agent can speak back in a voice you own.
macOS, Windows, Linux
Hold the shortcut, speak, release — a capture lands in the Captures tab. Replay the original audio, re-transcribe with a different model, refine with a local LLM, copy to clipboard, or send it straight to any MCP-aware agent. Nothing leaves your machine.
Base, Small, Medium, Large, and Turbo. Pick per-capture — 99 languages at every tier, all local, all downloadable from inside the app.
A local Qwen model removes filler words and self-corrections and fixes punctuation — without rephrasing. Keep raw and refined side-by-side; the original audio is always kept.
Every dictation keeps both the audio and the transcript. Search, re-run, or turn any capture into a voice sample for cloning from the Captures tab.
One tool call — voicebox.speak — and any MCP-aware agent can talk to you in a voice you’ve cloned. Claude Code, Cursor, Cline, or anything that speaks MCP.
```json
{
  "mcpServers": {
    "voicebox": {
      "url": "http://127.0.0.1:17493/mcp"
    }
  }
}
```

```javascript
// In any MCP-aware agent:
await voicebox.speak({
  text: "Deploy complete.",
  profile: "Morgan",
})
```

POST /speak for anything that doesn’t speak MCP — ACP, A2A, shell scripts, or custom harnesses.

Bind each MCP client to a voice profile. Claude Code in Morgan, Cursor in Scarlett — you know which agent is talking without looking.
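For harnesses without MCP, the plain HTTP endpoint can be called from any script. A minimal sketch, assuming the /speak payload mirrors the MCP tool's text and profile parameters — check the app's own docs for the exact field names:

```typescript
// Sketch: call voicebox's local HTTP endpoint directly.
// Assumption: /speak accepts JSON with the same fields as the MCP tool.
const BASE = "http://127.0.0.1:17493";

// Build the request body; field names are assumed, not confirmed.
function speakPayload(text: string, profile: string): string {
  return JSON.stringify({ text, profile });
}

// POST the payload to the local server and return the HTTP status.
async function speak(text: string, profile: string): Promise<number> {
  const res = await fetch(`${BASE}/speak`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: speakPayload(text, profile),
  });
  return res.status;
}
```

The same shape works from a shell script via curl, or from any agent framework that can make an HTTP call.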
Every agent-initiated speech surfaces the pill. No silent background TTS — you always see what’s coming out of your machine.
MCP ships day one. ACP, A2A, and anything else built on a tool-call primitive slots into the same endpoint.
Free, open-source, local. No account, no API keys, no per-character fees.