AWS Machine Learning Blog
HAQM Polly Gives WordPress a Voice!
Today AWS, in partnership with WP Engine, is announcing the release of an HAQM Polly plugin for WordPress. The sample plugin enables WordPress creators to easily add Text-to-Speech capabilities to written content. As voice interaction becomes more common, it’s essential to provide your website’s content in audio formats. Visitors who are drawn to your websites by voice capabilities can now consume your content through new channels, such as inline audio players and mobile podcast applications. Your audience can listen to your posts even while they are away from the screen – driving, riding a bike, or jogging.
WordPress powers 29% of websites, and is well on its way to powering 50% of the web. With its ambitious goal of “democratizing publishing on the web,” WordPress thrives on simplifying the relationship between the creative and the technical. With this philosophy in mind, we at AWS knew we could provide a unique digital experience for consumers of web content by bringing an actual “voice” to WordPress websites.
The HAQM Polly plugin for WordPress lets consumers of the world’s most popular website management system “listen” to a natural, friendly narration of their websites’ content. With dozens of lifelike voices across a variety of languages, you can select the ideal voice and build other speech-enabled applications using the full-featured HAQM Polly service.
Begin voicing your WordPress content today. It’s easy to get started with this step-by-step tutorial.
How it Works
The experience begins when a WordPress site administrator (let’s assume that’s you) installs and configures the plugin using the native WordPress Install Plugin page. Then, you navigate to the HAQM Polly Settings page and connect the plugin to your AWS account. If the site is hosted on AWS, you can handle authentication by using IAM roles. Otherwise, provide the plugin with your AWS credentials, and you’ll be on your way to creating your first voice-enabled web application.
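The choice between IAM roles and explicit credentials can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's own code: the function names and the default region are assumptions, and it relies on boto3 falling back to its default credential chain (which includes IAM roles on EC2) when no keys are passed.

```python
from typing import Dict, Optional


def polly_client_kwargs(access_key: Optional[str],
                        secret_key: Optional[str],
                        region: str = "us-east-1") -> Dict[str, str]:
    """Build keyword arguments for boto3.client("polly", **kwargs).

    Hypothetical helper; the default region is an assumption.
    """
    access_key = (access_key or "").strip()
    secret_key = (secret_key or "").strip()
    kwargs = {"region_name": region}

    if not access_key and not secret_key:
        # Both fields blank: pass no keys, so boto3 uses its default
        # credential chain (for example, an IAM role attached to the instance).
        return kwargs

    if not access_key or not secret_key:
        raise ValueError("Provide both keys, or leave both blank to use an IAM role")

    kwargs["aws_access_key_id"] = access_key
    kwargs["aws_secret_access_key"] = secret_key
    return kwargs


def make_polly_client(access_key: Optional[str] = None,
                      secret_key: Optional[str] = None,
                      region: str = "us-east-1"):
    """Create a Polly client from the settings above."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    return boto3.client("polly", **polly_client_kwargs(access_key, secret_key, region))
```

Calling `make_polly_client()` with no arguments corresponds to leaving both key fields blank on a site that runs with an IAM role.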
Let’s look at how you configure settings. Many options are configurable, but we provide a default configuration to get your site speaking content as quickly as possible. The following screen shows the plugin’s typical configuration:
You can set these options:
- AWS Access Key and AWS Secret Key – AWS credentials, which allow the plugin to use HAQM Polly and HAQM S3. If you are hosting your WordPress site on HAQM EC2, you can use IAM roles. In that case, leave these two fields blank.
- Sample Rate – The sample rate of the audio files that will be generated (higher sampling rates mean higher-quality audio).
- Voice Name – The HAQM Polly voice to use to create the audio file.
- Player Position – Where to position the audio player on the website. You can put it before or after the post, or, if you want to use only podcast functionality, not use it at all.
- New Post Default – Specifies whether HAQM Polly should automatically be enabled for all new posts. If so, HAQM Polly uses the configuration settings to create an audio file for each new post.
- Autoplay – Specifies whether the audio player should automatically start playing the audio when a user visits a site for a specific post.
- Store audio in HAQM S3 – If you want audio files to be stored on HAQM S3, not on the server itself, choose this option. HAQM Polly creates the bucket automatically.
- HAQM CloudFront (CDN) Domain Name – If you want to broadcast your audio files with HAQM CloudFront, provide the CloudFront domain name. You need to create it yourself.
- iTunes email – The editorial contact for the podcast channel.
- iTunes Category – The category for your blog posts. Choosing a category makes it easier for podcast users to find your posts in the podcast catalog.
- iTunes Explicit – Specifies whether to enable HAQM Pollycast (podcast) functionality.
- Bulk Update All Posts – If you want to convert all posts using the current plugin settings, choose this option.
After you install and configure the plugin, it can create audio files for any new content. You can configure audio file creation to occur automatically upon publication or on a per-submission basis. If you have historical content, you can process it in batches to provide your site visitors with an enhanced experience with existing content.
Now, when users publish content, it’s sent to the HAQM Polly API for synthesis. By default, audio files are stored locally on disk on the web server. If you need to scale, you can integrate with HAQM S3 cloud storage and the CloudFront content delivery network. Long posts are split into blocks for processing and are recombined into a single audio file when all blocks have been processed.
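The split-and-recombine step can be illustrated with a short Python sketch. It is not the plugin's implementation: the 1,500-character block size, the function names, and the default voice and sample rate are assumptions chosen for illustration. The only AWS call used is boto3's `synthesize_speech`, and the audio is recombined by simple byte concatenation of the MP3 results.

```python
import re
from typing import Callable, List

MAX_CHARS = 1500  # assumed block size; check the current service limits


def split_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into blocks of at most max_chars, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than one block.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_post(text: str,
                    synthesize_chunk: Callable[[str], bytes],
                    max_chars: int = MAX_CHARS) -> bytes:
    """Synthesize each block in order and join the audio into one byte string."""
    return b"".join(synthesize_chunk(chunk) for chunk in split_text(text, max_chars))


def polly_synthesizer(voice_id: str = "Joanna", sample_rate: str = "22050"):
    """Return a function that turns one block of text into MP3 bytes."""
    import boto3  # imported lazily so the pure-Python helpers run without boto3

    client = boto3.client("polly")

    def synthesize_chunk(chunk: str) -> bytes:
        response = client.synthesize_speech(
            Text=chunk,
            OutputFormat="mp3",
            VoiceId=voice_id,
            SampleRate=sample_rate,
        )
        return response["AudioStream"].read()

    return synthesize_chunk
```

With AWS credentials available, `synthesize_post(post_text, polly_synthesizer())` would return the audio for a whole post. Passing the synthesizer in as a function keeps the splitting logic testable without network access.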
The following diagram shows how WordPress users can listen to audio content from your website:
- In the first method, the audio files are provided to users directly from the WordPress server.
- When you use S3, all audio files are stored and broadcast from HAQM S3.
- If you use CloudFront distribution, your files are stored in S3, but are broadcast using HAQM CloudFront (CDN).
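One way to picture the three delivery methods is as a function that decides which URL the audio player should point to. The sketch below is hypothetical: the `StorageSettings` fields, the example paths, and the S3 URL form are assumptions, not the plugin's actual settings or URL scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageSettings:
    """Hypothetical subset of the plugin settings that affects delivery."""
    site_url: str
    store_in_s3: bool = False
    bucket: Optional[str] = None
    cloudfront_domain: Optional[str] = None


def audio_url(settings: StorageSettings, key: str) -> str:
    """Return the URL from which a listener would fetch the audio file."""
    key = key.lstrip("/")
    if settings.store_in_s3:
        if settings.cloudfront_domain:
            # Stored in S3, delivered through the CloudFront domain.
            return f"https://{settings.cloudfront_domain}/{key}"
        if not settings.bucket:
            raise ValueError("S3 storage is enabled but no bucket is set")
        # Stored in and delivered from S3 (assumed virtual-hosted URL style).
        return f"https://{settings.bucket}.s3.amazonaws.com/{key}"
    # Default: delivered directly by the WordPress server.
    return f"{settings.site_url.rstrip('/')}/{key}"
```

For example, `audio_url(StorageSettings("https://example.com"), "audio/post-42.mp3")` resolves to a file on the WordPress server itself, while setting `store_in_s3` and `cloudfront_domain` switches the same key to the CDN.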
To highlight your website’s new capabilities, you can configure the HTML player to appear either above or below text. Or you can completely disable it. HAQM Polly automatically uses the default audio settings, but you can choose other options to customize individual posts and pages with custom voices and sample rates.
We also enable podcast capabilities through HAQM Pollycast feeds. These feeds are RSS 2.0 compliant and provide the necessary XML data for aggregation by popular podcast mobile applications and podcast directories, such as iTunes. HAQM Pollycast endpoints are added automatically to all WordPress archive URLs, which gives you the option to syndicate site-wide or targeted podcasts based on categories, tags, author, and so on. With HAQM Pollycast podcast syndication, you can now expand your audience to include users who are away from their screens, such as commuters and ambitious multi-taskers. For example, you could submit your WordPress podcast as an official iTunes podcast source to make it available to a wider audience!
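To give a sense of what an RSS 2.0 podcast feed contains, here is a minimal Python sketch that builds one with the standard library. It is not the feed that HAQM Pollycast generates: the function name and the episode dictionary keys are assumptions, and the iTunes-specific tags that podcast directories expect are left out for brevity. What it does show is the standard RSS 2.0 `enclosure` element, which is how a feed item points to its audio file.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable


def build_feed(title: str, link: str, description: str,
               episodes: Iterable[Dict[str, object]]) -> str:
    """Build a minimal RSS 2.0 feed with one <item> per audio episode."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")

    # RSS 2.0 requires these three channel elements.
    for tag, value in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = value

    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = str(episode["title"])
        ET.SubElement(item, "link").text = str(episode["link"])
        # The enclosure carries the audio URL, its size in bytes, and its MIME type.
        ET.SubElement(item, "enclosure", {
            "url": str(episode["audio_url"]),
            "length": str(episode["audio_bytes"]),
            "type": "audio/mpeg",
        })

    return ET.tostring(rss, encoding="unicode")
```

A podcast application that subscribes to such a feed reads each `enclosure` to find and download the audio for that post.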
The following screenshots show the WordPress site view and podcast view from a mobile phone.
Although our first priority was to provide immediate value with the plugin’s initial release, we also created a framework for future features. In the spirit of the open web, the plugin source code is available on our GitHub repository. Collaboration is both welcome and encouraged.
We look forward to hearing from you and seeing your WordPress sites soon! If you have questions or ideas for new features, use the comment section below to let us know.
About the Authors
Steven Word is the Innovation Program Manager at WP Engine, where he evaluates, recommends, and architects WordPress technology solutions for customers and partners. A WordPress veteran of 10 years and Core Contributor, Steven is passionate about providing technical and strategic guidance to content creators, developers, and product teams.
Tomasz Stachlewski is a Solutions Architect at AWS, where he helps companies of all sizes (from startups to enterprises) in their cloud journey. He is a big believer in innovative technology, such as serverless architecture, which allows companies to accelerate their digital transformation.