AWS Machine Learning Blog

Get started with the HAQM Kendra HAQM WorkDocs connector

HAQM Kendra is an intelligent search service powered by machine learning (ML). HAQM Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization.

With HAQM Kendra, you can search through troves of unstructured data and discover the right answers to your questions, when you need them. HAQM Kendra is a fully managed service, so there are no servers to provision, and no ML models to build, train, or deploy.

HAQM WorkDocs is a fully managed and secure content creation, storage, and collaboration service. With HAQM WorkDocs, you can easily create, edit, and share content. Moreover, because it’s stored centrally on AWS, you can access it from anywhere on any device.

In this post, we show how HAQM Kendra allows your users to search documents stored in HAQM WorkDocs.

Use case

For this post, we created a specific folder in HAQM WorkDocs containing a set of PDFs and Microsoft Word documents whose content we want to search. The HAQM WorkDocs connector also allows you to ingest comments for those documents.

The following screenshot shows the contents of a fictional WorkDocs folder called WorkdocsBlogpostDataset.

Create an HAQM WorkDocs connector

To create an HAQM WorkDocs connector, complete the following steps:

  1. On the HAQM Kendra console, choose Data sources.
  2. Choose Add data source.
  3. Under WorkDocs, choose Add connector.
  4. For Data source name, enter a name for your data source.
  5. Enter an optional description.
  6. Choose Next.
  7. In the Source section, choose the organization ID for your HAQM WorkDocs site.
  8. Create a new AWS Identity and Access Management (IAM) role for the data source.
  9. For Sync scope, select Crawl document comments and Use change logs.

For this post, we want HAQM Kendra to ingest the documents in the WorkdocsBlogpostDataset folder.

  1. In the Additional configuration section, enter WorkdocsBlogpostDataset as a path on the Include patterns tab.
  2. Choose Add.
  3. For Sync run schedule, choose Run on demand.
  4. Choose Next.
  5. In the WorkDocs field mapping section, use the default field mapping.
  6. Choose Next.
  7. Review the settings and choose Create.
  8. When the creation process is complete, choose Sync.
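
If you prefer to script these steps, the following is a minimal sketch rather than a definitive implementation. It assumes the AWS SDK for Python (boto3) and the HAQM Kendra operations create_data_source, describe_data_source, and start_data_source_sync_job. The function names, the index ID, the organization ID, the role ARN, and the data source name are placeholders of our own and do not come from the console walkthrough. How the API interprets an inclusion pattern compared with the console's path entry is also an assumption to verify for your site.

```python
import time


def build_workdocs_configuration(organization_id, include_patterns):
    """Build the WorkDocs part of a Kendra data source configuration."""
    return {
        "WorkDocsConfiguration": {
            "OrganizationId": organization_id,
            "CrawlComments": True,  # mirrors "Crawl document comments"
            "UseChangeLog": True,   # mirrors "Use change logs"
            "InclusionPatterns": list(include_patterns),
        }
    }


def create_and_sync(client, index_id, name, role_arn, configuration,
                    sleep=time.sleep, poll_seconds=10, max_polls=60):
    """Create the data source, wait until it is ACTIVE, then start a sync.

    `client` is expected to behave like boto3.client("kendra").
    No Schedule is passed, so the data source only syncs on demand.
    """
    created = client.create_data_source(
        IndexId=index_id,
        Name=name,
        Type="WORKDOCS",
        RoleArn=role_arn,
        Configuration=configuration,
    )
    data_source_id = created["Id"]

    # Creation is asynchronous, so poll until the data source is usable.
    for _ in range(max_polls):
        status = client.describe_data_source(
            Id=data_source_id, IndexId=index_id
        )["Status"]
        if status == "ACTIVE":
            break
        if status == "FAILED":
            raise RuntimeError("Data source creation failed")
        sleep(poll_seconds)
    else:
        raise TimeoutError("Data source did not become ACTIVE in time")

    client.start_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    return data_source_id


# Hypothetical usage (all identifiers below are placeholders):
#   import boto3
#   kendra = boto3.client("kendra")
#   config = build_workdocs_configuration("d-0123456789",
#                                         ["WorkdocsBlogpostDataset"])
#   create_and_sync(kendra, "my-index-id", "workdocs-blog",
#                   "arn:aws:iam::111122223333:role/my-kendra-role", config)
```

Passing the client as an argument keeps the sketch independent of credentials and lets you try it against a stub before pointing it at a real index.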

When the sync process completes, you can see how many documents were ingested.
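
The same numbers can also be read programmatically. The following is a small sketch, assuming a boto3 HAQM Kendra client and its list_data_source_sync_jobs operation; the function name latest_sync_summary and the shape of the returned dictionary are our own choices. It only looks at the first page of sync history, which we assume is enough for a newly created data source.

```python
def latest_sync_summary(client, index_id, data_source_id):
    """Return status and document counts for the most recent sync job.

    `client` is expected to behave like boto3.client("kendra").
    Returns None when the data source has no sync history yet.
    """
    response = client.list_data_source_sync_jobs(
        Id=data_source_id, IndexId=index_id
    )
    history = response.get("History", [])
    if not history:
        return None

    # Pick the newest job explicitly instead of relying on list order.
    latest = max(history, key=lambda job: job["StartTime"])
    metrics = latest.get("Metrics", {})

    # The service reports metric values as strings, so convert them.
    return {
        "status": latest.get("Status"),
        "added": int(metrics.get("DocumentsAdded", 0)),
        "modified": int(metrics.get("DocumentsModified", 0)),
        "deleted": int(metrics.get("DocumentsDeleted", 0)),
        "failed": int(metrics.get("DocumentsFailed", 0)),
    }
```

A status of SUCCEEDED with a nonzero added count corresponds to what the console shows after a first sync.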

Now your documents are ready to be searched by HAQM Kendra.

  1. In the navigation pane, choose Search console.

You can now submit some test queries, as shown in the following screenshots.
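
Test queries can also be sent outside the console. The sketch below assumes a boto3 HAQM Kendra client and its query operation; the function name search, the limit parameter, and the flattened result format are illustrative choices of ours, and the index ID and query text in the usage comment are placeholders.

```python
def search(client, index_id, query_text, limit=5):
    """Run a query against a Kendra index and flatten the results.

    `client` is expected to behave like boto3.client("kendra").
    """
    response = client.query(
        IndexId=index_id, QueryText=query_text, PageSize=limit
    )
    results = []
    for item in response.get("ResultItems", []):
        results.append({
            "type": item.get("Type"),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
        })
    return results


# Hypothetical usage (identifiers and query text are placeholders):
#   import boto3
#   for hit in search(boto3.client("kendra"), "my-index-id", "your question"):
#       print(hit["type"], hit["title"], hit["uri"])
```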

Also, with the HAQM WorkDocs connector, you can ingest feedback (comments) on your documents. For example, the following screenshot shows that this document has feedback.

The following screenshot shows what the feedback search experience looks like.

Conclusion

In this post, you created a data source and ingested your HAQM WorkDocs documents into your HAQM Kendra index. As a next step, you can try some more queries and see what kind of results you obtain. You can also dive deep into HAQM Kendra with the HAQM Kendra Essentials workshop or try the multilingual chatbot experience.


About the Author

Juan Bustos is an AI Services Specialist Solutions Architect at HAQM Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.

Vijai Gandikota is a Senior Product Manager at HAQM Web Services for HAQM Kendra.