Posted On: Dec 18, 2019
Amazon Textract is a machine learning service that makes it quick and easy to retrieve text and structured data, such as tables and forms, from documents using the DetectDocumentText or AnalyzeDocument APIs, without requiring any custom configuration or templates. One advantage of a managed service like Amazon Textract is that customers benefit from continuous improvement over time. Today, we are pleased to announce that Amazon Textract is now PCI DSS certified. This means that you can now use Amazon Textract for workloads that are subject to the Payment Card Industry Data Security Standard (PCI DSS), such as those that handle cardholder data (CHD) or sensitive authentication data (SAD). Also starting today, AWS has launched a set of quality enhancements that make Amazon Textract more accurate for its tables and forms features.
First, the tables model now works better with complex table structures that contain split cells and merged cells, which make it difficult to align cell values with the correct column header or row header. Next, Amazon Textract is better at identifying rows and columns for cells with wrapped text (text that spans multiple lines), even in tables without explicit boundaries. In particular, it more accurately distinguishes a cell whose content spans multiple lines from a new row that has no explicit boundary. Finally, Amazon Textract has improved the forms model to give more accurate results for key-value pair identification. These benefits apply to many types of documents, but are especially pronounced for documents where tables and key-value pairs appear on the same page. Amazon Textract now correctly identifies key-value pairs embedded within a table.
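As a rough illustration of how the tables and forms output might be consumed, the Python sketch below walks an AnalyzeDocument-style response and collects key-value pairs and table cells. It is a minimal sketch, not part of the announcement: the helper names (`analyze_file`, `extract_key_values`, `extract_tables`) are hypothetical, and it assumes the boto3 SDK with configured AWS credentials for the one function that calls the service. The parsing functions rely only on the block fields `Id`, `BlockType`, `EntityTypes`, `Relationships`, `Text`, `RowIndex`, and `ColumnIndex`.

```python
def analyze_file(path):
    """Call AnalyzeDocument on a local image (assumes boto3 and AWS credentials)."""
    import boto3  # imported here so the parsing helpers work without the SDK

    client = boto3.client("textract")
    with open(path, "rb") as f:
        return client.analyze_document(
            Document={"Bytes": f.read()},
            FeatureTypes=["TABLES", "FORMS"],
        )


def _child_text(block, blocks_by_id):
    """Join the text of all WORD blocks that are CHILD relations of `block`."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response):
    """Return {key text: value text} for every KEY block in the response."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    pairs = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    values.append(_child_text(blocks_by_id[value_id], blocks_by_id))
        pairs[_child_text(block, blocks_by_id)] = " ".join(values)
    return pairs


def extract_tables(response):
    """Return one {(row, column): cell text} dict per TABLE block."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    tables = []
    for block in response["Blocks"]:
        if block["BlockType"] != "TABLE":
            continue
        cells = {}
        for rel in block.get("Relationships", []):
            if rel["Type"] != "CHILD":
                continue
            for cell_id in rel["Ids"]:
                cell = blocks_by_id[cell_id]
                if cell["BlockType"] == "CELL":
                    position = (cell["RowIndex"], cell["ColumnIndex"])
                    cells[position] = _child_text(cell, blocks_by_id)
        tables.append(cells)
    return tables
```

With credentials in place, a call such as `extract_key_values(analyze_file("invoice.png"))` (the file name is a placeholder) would return the form fields found on the page. Because a wrapped cell arrives as several WORD children of a single CELL block, joining the children recovers the full cell text regardless of how many lines it occupied in the source document.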
You can learn more about these updates here.