AWS Machine Learning Blog

Create softer speech with the new Amazon Polly phonation tag

Speech Synthesis Markup Language (SSML) is a standardized markup language that enables developers to modify Text-to-Speech (TTS) audio. With SSML, you can control various vocal characteristics of TTS output, such as pronunciation, speech rate, and other elements, to produce a more natural-sounding voice experience.

Today, we are excited to announce a new phonation SSML tag that you can use with Amazon Polly. The new phonation tag enables you to produce softer-sounding speech.

Using the new phonation tag

The amazon:effect tag, used with the phonation="soft" attribute, allows Amazon Polly to generate softer speech. Notice in the sample below that amazon:effect requires a closing tag. In this case, the first portion of the synthesized speech is spoken in a normal voice, whereas the portion enclosed in the amazon:effect tag is spoken more softly.

<speak>
     This is Matthew speaking in my normal voice. <amazon:effect phonation="soft"> This is Matthew speaking in my softer voice. </amazon:effect>
</speak>

Copy the example above, paste it into the Amazon Polly console, and try it with any of the Amazon Polly voices.
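If you would rather call the service from code than use the console, the same SSML can be sent through the AWS SDK. The following is a minimal sketch, not an official sample: it assumes the boto3 package is installed and AWS credentials are configured, the helper names soft_ssml and synthesize_to_file and the output path soft.mp3 are made up for illustration, and it assumes the standard engine is the right choice for the phonation tag.

```python
from xml.sax.saxutils import escape


def soft_ssml(normal_text: str, soft_text: str) -> str:
    """Build an SSML document whose second part uses phonation="soft".

    Hypothetical helper for illustration; text is XML-escaped so that
    characters such as & or < do not break the markup.
    """
    return (
        "<speak>"
        f"{escape(normal_text)} "
        f'<amazon:effect phonation="soft">{escape(soft_text)}</amazon:effect>'
        "</speak>"
    )


def synthesize_to_file(ssml: str, path: str = "soft.mp3",
                       voice_id: str = "Matthew") -> None:
    """Send SSML to Amazon Polly and write the returned audio to a file.

    Assumes AWS credentials and a default region are already configured.
    """
    import boto3  # imported here so soft_ssml works without the SDK installed

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",      # tell Polly the input is SSML, not plain text
        VoiceId=voice_id,
        Engine="standard",    # assumption: use the standard engine for this tag
        OutputFormat="mp3",
    )
    # AudioStream is a streaming body; read it fully and save the bytes.
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Calling synthesize_to_file(soft_ssml("This is Matthew speaking in my normal voice.", "This is Matthew speaking in my softer voice.")) would then write the synthesized audio to soft.mp3.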

Amazon Polly supports standard SSML tags such as prosody, which enables you to control the volume, rate, and pitch of the spoken text. Amazon Polly also has its own tags for additional effects, such as whispered voice, dynamic range compression, and vocal tract length, which further enhance your ability to modify Amazon Polly voices to best suit your needs.
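To make the tags mentioned above concrete, here is a small sketch of how each one is written in SSML. The Python helper names (prosody, whispered, drc, vocal_tract_length, speak) are hypothetical and exist only for this illustration; the sketch only builds strings and does not call Amazon Polly. Which effects a given voice or engine accepts is not covered here, so treat the output as markup to try out rather than a guarantee.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, *, volume=None, rate=None, pitch=None):
    """Wrap text in a standard SSML prosody tag with the given attributes."""
    pairs = (("volume", volume), ("rate", rate), ("pitch", pitch))
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in pairs if value is not None
    )
    return f"<prosody{attrs}>{escape(text)}</prosody>"


def whispered(text):
    """Amazon Polly whispered-voice effect."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def drc(text):
    """Amazon Polly dynamic range compression effect."""
    return f'<amazon:effect name="drc">{escape(text)}</amazon:effect>'


def vocal_tract_length(text, change):
    """Amazon Polly vocal tract length effect, e.g. change="+15%"."""
    return (
        f"<amazon:effect vocal-tract-length={quoteattr(change)}>"
        f"{escape(text)}</amazon:effect>"
    )


def speak(*parts):
    """Join already-tagged fragments into one SSML document."""
    return "<speak>" + " ".join(parts) + "</speak>"


# Example: one document that uses each tag on its own sentence.
document = speak(
    prosody("Please slow down.", rate="slow", volume="loud"),
    whispered("This part is a secret."),
    drc("This part should carry in a noisy room."),
    vocal_tract_length("This part sounds like a bigger speaker.", "+15%"),
)
```

The resulting document string can be pasted into the console in the same way as the phonation sample.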


About the Author

Binny Peh is a Sr. Product Marketing Manager for AWS machine learning solutions. In her spare time, she indulges in too much television and is an aspiring foodie. Binny’s glass is always half-full, and she believes in the power of positive thinking.