Posted On: Apr 29, 2020

Amazon Transcribe Medical is a HIPAA-eligible automatic speech recognition (ASR) service that makes it easy for developers to add medical speech-to-text capabilities to their healthcare and life science applications. Starting today, you can give Amazon Transcribe Medical more information about how to process speech in your audio content by creating a custom vocabulary. A custom vocabulary is a list of specific words and phrases that you want Amazon Transcribe Medical to recognize. These are typically domain-specific terms, such as medicine names, healthcare brands, or names of procedures, that the service does not already recognize out of the box.

Using a custom vocabulary is straightforward. Create a list of custom terms or phrases in a plain text file and upload it to an Amazon S3 bucket. Then, before starting a transcription job with Amazon Transcribe Medical, point the service to that custom vocabulary. A custom vocabulary lets you add out-of-lexicon terms and also associate a custom pronunciation with each term by using the International Phonetic Alphabet (IPA). Additionally, you can now designate exactly how a custom term should be displayed when it is transcribed (e.g. “adenosine triphosphate” as “ATP”) by using the built-in custom display forms capability.
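As a rough illustration of the workflow described above, the following Python sketch builds a vocabulary file, uploads it to S3, and registers it with the service through boto3. Several details here are assumptions rather than part of this announcement: the tab-separated layout with Phrase, IPA, SoundsLike, and DisplayAs columns, the use of hyphens to join multi-word phrases, the `s3://` form of the file URI, and the hypothetical names `Term`, `build_vocabulary_table`, and `register_vocabulary`. Check the technical documentation for the exact file format before relying on it.

```python
from typing import Iterable, NamedTuple


class Term(NamedTuple):
    """One custom vocabulary entry (hypothetical helper type)."""
    phrase: str           # the term as spoken, e.g. "adenosine triphosphate"
    ipa: str = ""         # optional IPA pronunciation
    display_as: str = ""  # optional display form, e.g. "ATP"


# Assumed column layout of the vocabulary file; verify against the docs.
HEADER = ("Phrase", "IPA", "SoundsLike", "DisplayAs")


def build_vocabulary_table(terms: Iterable[Term]) -> str:
    """Return the vocabulary as tab-separated text, one term per row."""
    rows = ["\t".join(HEADER)]
    for term in terms:
        # Assumption: multi-word phrases are joined with hyphens.
        phrase = "-".join(term.phrase.split())
        # SoundsLike is left empty in this sketch.
        rows.append("\t".join((phrase, term.ipa, "", term.display_as)))
    return "\n".join(rows) + "\n"


def register_vocabulary(table: str, bucket: str, key: str, name: str) -> None:
    """Upload the table to S3 and create a medical vocabulary from it.

    Requires boto3 and AWS credentials; bucket, key, and name are
    placeholders supplied by the caller.
    """
    import boto3  # imported here so the table builder works without boto3

    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=table.encode("utf-8")
    )
    boto3.client("transcribe").create_medical_vocabulary(
        VocabularyName=name,
        LanguageCode="en-US",
        VocabularyFileUri=f"s3://{bucket}/{key}",  # assumed URI form
    )


if __name__ == "__main__":
    table = build_vocabulary_table(
        [Term("adenosine triphosphate", display_as="ATP")]
    )
    print(table, end="")
```

Once the vocabulary has finished processing, a batch job can reference it by name, for example by passing `Settings={"VocabularyName": name}` to the `start_medical_transcription_job` call in boto3.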

Custom vocabulary is available for both the streaming (real-time) API and the batch (asynchronous) API of Amazon Transcribe Medical. The feature is available in all AWS Regions where Amazon Transcribe Medical is offered. Try out the new custom vocabulary feature by visiting the Amazon Transcribe Medical service console, or learn more in the technical documentation.