Every clinician we onboard has already tried at least one of these: medical dictation (Nuance's Dragon and similar tools) or an ambient AI scribe. The two are often lumped together as "voice tools." They shouldn't be.

Dictation: active, structured, fast

Dictation is press-to-talk. You say the note out loud, usually inside a structured template, and software transcribes what you said in the format you want.

You're still authoring the note. The model just replaces typing with speaking.
Typically done between sessions or at end of day, not during.
Works well for clinicians who have a practiced, spoken "note voice."
Accuracy depends heavily on your microphone, your accent, and how well the model handles clinical vocabulary.

When dictation works, it's very fast. A well-trained clinician can dictate a complete SOAP note in 60 to 90 seconds.

Ambient: passive, session-wide, trust-dependent

Ambient scribing listens to the full session, transcribes both speakers, and drafts a structured note at the end.

You're not authoring. You're reviewing and editing a draft.
The capture happens during the session, not after. What's left afterward is a review pass.
You don't have to remember to say anything out loud. The note is built from what actually happened.
The note is only as good as the model's ability to separate clinically meaningful content from small talk, stalls, and detours.

When ambient works, you end the session, review the draft, and sign off before the next client walks in.

The side-by-side

| | Dictation | Ambient AI |
| ----------------------------- | --------------------------- | ------------------------ |
| When the note gets built | After the session | During the session |
| Who authors | You | Model drafts; you edit |
| Mic/setup sensitivity | High | Moderate |
| Handles rambling conversation | No (you filter) | Yes (model filters) |
| Works well for long sessions | Less so (recall fades) | Yes |
| Client has to consent | No | Yes |
| Learning curve for clinician | High (spoken note voice) | Low (review & edit) |

Where dictation still wins

Short, highly structured encounters. Medication management visits, for example, where the same six fields get dictated every time.
Clinicians who've been dictating for a decade and have the muscle memory.
Settings where client consent for ambient recording is hard to get.

Where ambient wins

Therapy sessions, where the clinical content is the conversation itself.
Clinicians who hate the feeling of "doing the note twice": once in their head during the session, once out loud afterward.
Anyone who has lost a note to a bad dictation day.

How we think about it at Aimé

Aimé is ambient-first. We built it because we think the note is a record of what happened, not a retelling, and the cleanest way to get an accurate record is to capture it in real time. We support dictation mode for the cases where it genuinely fits better. Some users turn it on for brief check-ins and leave ambient on for full sessions.

The right question isn't "ambient or dictation." It's "where in my week does each one save the most time?" For most behavioral health clinicians, the answer is ambient most of the time, dictation for the rest.

Ambient AI vs. dictation: two very different workflows