The medical models are tailored to recognize words that are common in medical settings, such as diagnoses, medications, symptoms, treatments, and conditions. If your audio contains this type of speech, using these models can improve your transcription results.
There are two medical models, each tailored to specific use cases:
medical_conversation: for conversations between a medical provider—for example, a doctor or nurse—and a patient. Use this model when both a provider and a patient are speaking. Words uttered by each speaker are automatically detected and labeled in the returned transcript.
medical_dictation: for dictated notes spoken by a single medical provider—for example, a doctor dictating notes about a patient's blood test results.
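When you call the Speech-to-Text API, you select a medical model by name in the recognition config. The following is a minimal sketch using the google-cloud-speech Python client; the Cloud Storage URI, encoding, and sample rate are placeholders to adapt to your own audio.

```python
from google.cloud import speech

client = speech.SpeechClient()

# Placeholder Cloud Storage URI; point this at your own audio file.
audio = speech.RecognitionAudio(uri="gs://your-bucket/dictation.wav")

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # Use "medical_conversation" for provider-patient conversations,
    # or "medical_dictation" for single-speaker dictated notes.
    model="medical_dictation",
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
```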
Use the medical models only with the Speech-to-Text features listed below; features omitted from these lists can't be used with either medical model. The automatic punctuation feature is enabled by default.
Both medical models support the following features (a configuration sketch follows the list):
Automatic punctuation
Alternate transcriptions
Word timestamps
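As a rough sketch of how these shared features map onto the recognition config in the Python client (field names from the standard v1 API; the audio values are placeholders), you might enable them like this:

```python
from google.cloud import speech

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    model="medical_dictation",
    enable_automatic_punctuation=True,  # on by default for medical models
    max_alternatives=3,                 # request alternate transcriptions
    enable_word_time_offsets=True,      # request word timestamps
)

response = speech.SpeechClient().recognize(
    config=config,
    audio=speech.RecognitionAudio(uri="gs://your-bucket/dictation.wav"),
)

# Word timestamps are returned on each word of the top alternative.
for result in response.results:
    for word in result.alternatives[0].words:
        print(word.word, word.start_time, word.end_time)
```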
The medical conversation model supports the following features:
Speaker diarization
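A minimal diarization sketch, assuming the standard v1 diarization fields and a two-speaker provider-patient conversation (the speaker counts and audio URI are placeholders, not required values):

```python
from google.cloud import speech

diarization_config = speech.SpeakerDiarizationConfig(
    enable_speaker_diarization=True,
    min_speaker_count=2,  # assumed: one provider and one patient
    max_speaker_count=2,
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    model="medical_conversation",
    diarization_config=diarization_config,
)

response = speech.SpeechClient().recognize(
    config=config,
    audio=speech.RecognitionAudio(uri="gs://your-bucket/consultation.wav"),
)

# The final result aggregates every word with its speaker tag.
for word in response.results[-1].alternatives[0].words:
    print(f"speaker {word.speaker_tag}: {word.word}")
```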
The medical dictation model supports the following features:
Spoken punctuation
Formatting commands
Spoken headings