AWS Machine Learning Blog
Improve search accuracy with Spell Checker in HAQM Kendra
HAQM Kendra is an intelligent search service powered by machine learning. With HAQM Kendra Spell Checker, you can receive spelling suggestions for misspelled terms in your queries. By suggesting corrections for unrecognized terms, Spell Checker helps reduce how often queries return irrelevant results.
In this post, we explore how to use HAQM Kendra Spell Checker on the AWS Management Console, as well as how to enable Spell Checker in an HAQM Kendra-powered search application through the AWS Command Line Interface (AWS CLI) and AWS SDK.
Use HAQM Kendra Spell Checker on the console
You can automatically receive spelling suggestions for your misspelled HAQM Kendra queries when querying through the console.
On the HAQM Kendra console, choose your desired index, then choose Search indexed content in the navigation pane. Make sure that the selected index has ingested documents; in this post, we use the sample AWS documentation found in the Data sources section of the navigation pane.
On the HAQM Kendra search console, simply submit a query as you usually would. Misspelled terms in the query are substituted with suggested terms in the “Did you mean” section of the search console.
Choosing the suggested query submits a new query with the corrected spelling.
As you can see, the query results provided through the suggested query are significantly more relevant, thanks to Spell Checker!
Use HAQM Kendra Spell Checker in search applications
Search applications powered by HAQM Kendra can quickly and easily enable Spell Checker through the AWS CLI or AWS SDK, which we walk through in this section. Additionally, we go over an example of how to process the Spell Checker response.
AWS CLI
Let’s look at how AWS CLI users can opt in to HAQM Kendra Spell Checker to receive spelling suggestions for misspelled query terms. We use the AWS CLI to query HAQM Kendra as usual, with only one small change: we include the --spell-correction-configuration IncludeQuerySpellCheckSuggestions=true argument:
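As a minimal sketch of what such a call could look like, the following Python snippet assembles the command line and prints it in a shell-safe form so it can be pasted into a terminal. The index ID and query text are hypothetical placeholders, and the helper name build_query_command is introduced here purely for illustration; only the --spell-correction-configuration argument is taken from the text above.

```python
import shlex

# Hypothetical placeholders: substitute your own index ID and query text.
INDEX_ID = "your-index-id"
QUERY_TEXT = "how to querry kendra indx"


def build_query_command(index_id, query_text):
    """Return the argument list for an `aws kendra query` call with Spell Checker enabled."""
    return [
        "aws", "kendra", "query",
        "--index-id", index_id,
        "--query-text", query_text,
        # The one addition compared with a regular query:
        "--spell-correction-configuration", "IncludeQuerySpellCheckSuggestions=true",
    ]


command = build_query_command(INDEX_ID, QUERY_TEXT)
# shlex.join quotes arguments that contain spaces, such as the query text.
print(shlex.join(command))
```

The same list could also be passed to subprocess.run if you prefer to invoke the AWS CLI from a script, assuming the CLI is installed and configured with credentials.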
In addition to the normal query results, the response from HAQM Kendra now contains a SpellCorrectedQueries object, if there are any spelling suggestions for the query. For more information, see SpellCorrectedQuery.
AWS SDK
Next, let’s walk through how HAQM Kendra provides spell check functionality for AWS SDK users. For this example, we use Python 3. We submit a query with a few spelling errors, and print out the SpellCorrectedQueries object in the response:
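A minimal sketch of such a request with the AWS SDK for Python (Boto3) might look like the following. It assumes boto3 is installed and AWS credentials are configured; the index ID and query text are hypothetical placeholders, and the function names are chosen here for illustration rather than taken from the original post.

```python
import json


def fetch_spell_suggestions(kendra_client, index_id, query_text):
    """Query Kendra with Spell Checker enabled and return any spelling suggestions."""
    response = kendra_client.query(
        IndexId=index_id,
        QueryText=query_text,
        # Opt in to spelling suggestions for this query.
        SpellCorrectionConfiguration={"IncludeQuerySpellCheckSuggestions": True},
    )
    # The key is only present when Kendra has suggestions, so default to an empty list.
    return response.get("SpellCorrectedQueries", [])


def main():
    # Imported here so the helper above can be reused or tested without boto3.
    import boto3

    kendra = boto3.client("kendra")
    suggestions = fetch_spell_suggestions(
        kendra,
        index_id="your-index-id",                 # hypothetical placeholder
        query_text="how to querry kendra indx",   # hypothetical misspelled query
    )
    print(json.dumps(suggestions, indent=2))
```

Calling main() runs the query against your own index. Passing the client in as a parameter keeps the helper easy to exercise with a stub client in unit tests.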
The response from HAQM Kendra now contains the expected spelling suggestions:
Process the HAQM Kendra Spell Check response
Now that we’ve gone over how to programmatically get spelling suggestions through either the AWS CLI or AWS SDK, we can examine how we turn the response into a human-readable suggested query. For this example, we use the sample output from the previous section:
Each SpellCorrectedQuery has two keys: SuggestedQueryText and Corrections.
SuggestedQueryText maps to a string containing the updated query with the suggested spelling corrections.
Corrections maps to a list of Correction objects, each of which contains the beginning and ending offsets of the correction, as well as the original term from the query and the spelling suggestion for that term.
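To make the structure concrete, here is a small sketch that walks a SpellCorrectedQuery and lists each original term next to its suggestion. The sample dictionary is hand-written for illustration and is not output from the post; the Term and CorrectedTerm field names follow the Correction object in the HAQM Kendra API, and the helper name is hypothetical.

```python
# Hand-written, illustrative sample; not actual output from HAQM Kendra.
sample_query = {
    "SuggestedQueryText": "how to query kendra index",
    "Corrections": [
        {"BeginOffset": 7, "EndOffset": 12, "Term": "querry", "CorrectedTerm": "query"},
        {"BeginOffset": 20, "EndOffset": 25, "Term": "indx", "CorrectedTerm": "index"},
    ],
}


def describe_corrections(spell_corrected_query):
    """Return one 'original -> suggestion' string per Correction."""
    return [
        f'{correction["Term"]} -> {correction["CorrectedTerm"]}'
        for correction in spell_corrected_query["Corrections"]
    ]


for line in describe_corrections(sample_query):
    print(line)
```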
For our example, we want to show the suggested query text with the newly suggested terms italicized, similar to what is done on the HAQM Kendra console. To achieve this, we can add HTML italics opening tags <i> at the BeginOffset of each Correction and HTML italics closing tags </i> at the EndOffset of each Correction in the Corrections list. Note that BeginOffset and EndOffset are positions in SuggestedQueryText, so they are based on the length of the corrected terms, not the original terms.
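One way to sketch this tagging step in Python is shown below. The sample dictionary is hand-written for illustration, the function name is hypothetical, and the code assumes that EndOffset is exclusive (it points just past the last character of the corrected term); adjust the slicing if your responses show otherwise.

```python
# Hand-written, illustrative sample; not actual output from HAQM Kendra.
sample_query = {
    "SuggestedQueryText": "how to query kendra index",
    "Corrections": [
        {"BeginOffset": 7, "EndOffset": 12, "Term": "querry", "CorrectedTerm": "query"},
        {"BeginOffset": 20, "EndOffset": 25, "Term": "indx", "CorrectedTerm": "index"},
    ],
}


def italicize_corrections(spell_corrected_query):
    """Wrap each corrected term in SuggestedQueryText with <i>...</i> tags."""
    text = spell_corrected_query["SuggestedQueryText"]
    # Work from the end of the string backwards so that inserting tags
    # does not shift the offsets of corrections that come earlier.
    corrections = sorted(
        spell_corrected_query["Corrections"],
        key=lambda correction: correction["BeginOffset"],
        reverse=True,
    )
    for correction in corrections:
        begin = correction["BeginOffset"]
        end = correction["EndOffset"]  # assumed exclusive
        text = text[:begin] + "<i>" + text[begin:end] + "</i>" + text[end:]
    return text


print(italicize_corrections(sample_query))
```

For the sample above, this produces how to <i>query</i> kendra <i>index</i>. In a real application, you would typically also HTML-escape the query text before rendering it.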
Adding the italics tags to SuggestedQueryText gives us the following suggested query text:
As you can see, HAQM Kendra Spell Checker makes it simple to add spell check functionality to your search application.
Conclusion
Spell Checker is a new, powerful feature offered by HAQM Kendra. It is a simple, effective way to reduce the number of queries that return unhelpful results by providing spelling suggestions to end-users for misspelled terms.
Spell Checker is available in all AWS Regions where HAQM Kendra is available, and supports all languages currently supported by HAQM Kendra.
To learn more about HAQM Kendra, visit the HAQM Kendra product page.
About the Author
Matthew Peretick is a Software Development Engineer at HAQM Web Services based in New York City. Matthew is a member of the HAQM Kendra team focused on enhancing the HAQM Kendra query experience.