Mellon started as a dictation app: you speak, words appear. But what if your voice could do more than just type? What if you could command AI just by talking to it — and have it act on the text right in front of you?
That's Agent Mode. Say your trigger word followed by any instruction, and Mellon sends your command — along with your current text context — to an AI model. The result gets pasted directly into your app. No copy-pasting. No switching to ChatGPT. Just speak and it happens.
Two Ways to Activate Agent Mode
Agent Mode gives you two ways to trigger AI commands: voice trigger words during any dictation session, or a dedicated shortcut key for instant access.
Voice Trigger Words
While dictating in any mode, say your trigger word followed by a command. Mellon ships with a built-in "Mellon" trigger group (recognizing "Hey Mellon", "Mellon", "Hey melon", and other pronunciation variations), and you can add your own custom trigger word groups with as many variations as you like.
For example, you could add a "Jarvis" trigger group with variations like "Hey Jarvis", "Jarvis", and "Hey Jarvis please". When Mellon detects any of these phrases in your dictation, it treats the rest of your sentence as an AI command.
Trigger words work during any dictation session — whether you started recording with the dictation shortcut or the agent mode shortcut. You don't need a special mode to use them.
Dedicated Shortcut Key
You can also assign a dedicated modifier key for Agent Mode (e.g., Right Option, Right Command, or any other modifier). When you use this shortcut, the entire transcription is sent as an AI command — no trigger word needed. This is useful when you know upfront that you want to give an AI instruction.
Smart Tap vs Hold
Both the dictation shortcut and the agent mode shortcut support automatic tap-vs-hold detection. Mellon uses a 300ms threshold to determine your intent:
- Quick tap (under 300ms) — toggles recording on/off. A lock icon appears to indicate toggle mode is active. Tap again to stop.
- Hold down (300ms or longer) — records while held. Release to stop. Great for quick one-off commands.
You don't need to configure this — it just works. Use whichever feels natural in the moment.
How It Works
Agent Mode has two behaviours depending on whether you have text selected:
With Selected Text → Replace
If you've highlighted text in any app, Agent Mode treats that selection as the target. The AI sees your selected text and your spoken command, then replaces the selection with its output.
Examples:
- "Hey Mellon, make this more concise" — rewrites the selected paragraph to be shorter
- "Hey Mellon, translate this to French" — replaces the English text with French
- "Hey Mellon, fix the grammar" — corrects errors in the selected text
- "Hey Mellon, make this sound more professional" — adjusts tone while keeping meaning
Without Selected Text → Insert at Cursor
If nothing is selected, Agent Mode reads the text around your cursor position — what's before and after it — and inserts the AI's output right where your cursor is.
Examples:
- "Hey Mellon, write a follow-up sentence" — continues where you left off
- "Hey Mellon, add a bullet list of three key points" — generates a list at the cursor
- "Hey Mellon, write the conclusion for this email" — reads the email context and writes the ending
Works in Any App
Because Mellon operates at the system level, Agent Mode works wherever you can type: emails, documents, code editors, Slack, Notes, even browser text fields. You don't need a plugin or extension — Mellon reads the focused text field automatically.
Choose Your AI Provider
Agent Mode uses the same AI provider you've configured in Mellon's settings. You can choose from:
- Anthropic (Claude) — great for nuanced writing and following complex instructions
- OpenAI (GPT) — versatile, widely used
- Google Gemini — strong at reasoning and multilingual tasks
- Groq — fast inference for quick responses
- Custom endpoint — point to any OpenAI-compatible API (local LLMs, corporate proxies, etc.)
Switch providers or models any time in Settings → AI Enhancement → Post-processing. Agent Mode automatically uses whatever you've configured.
Setup (30 Seconds)
- Open Mellon → Settings → AI Enhancement → Post-processing
- Select your AI provider and enter your API key
- Go to the Agent Mode tab and toggle it on
- Optionally, assign a dedicated shortcut key for agent mode and add custom trigger word groups
- Start dictating — say your trigger word, or use the agent mode shortcut, and the rest becomes an AI command
Why Voice + AI Matters
Copy-pasting text into ChatGPT, waiting for a response, then pasting it back is slow. Agent Mode eliminates that entire loop. You stay in your app, in your flow. Select text, speak a command, and the result appears — all in a few seconds.
For anyone who writes a lot — emails, docs, code comments, messages — this turns your voice into an AI-powered editing tool. And since everything runs through your own API key, there's no Mellon subscription or middleman.
Want AI working across your business workflows? Book a free 30-min strategy call — we'll map out what to automate first.