AWS Machine Learning Blog

Category: Amazon SageMaker Ground Truth

Real-time data labeling pipeline for ML workflows using Amazon SageMaker Ground Truth

High-quality machine learning (ML) models depend on accurately labeled, high-quality training, validation, and test data. As ML and deep learning models are increasingly integrated into production environments, it’s becoming more important than ever to have customizable, real-time data labeling pipelines that can continuously receive and process unlabeled data. For example, you may want to create […]

zomato digitizes menus using Amazon Textract and Amazon SageMaker

This post is co-written by Chiranjeev Ghai, ML Engineer at zomato. zomato is a global food-tech company based in India. Are you the kind of person who has very specific cravings? Maybe when the mood hits, you don’t want just any kind of Indian food—you want Chicken Chettinad with a side of paratha, and nothing […]

Processing auto insurance claims at scale using Amazon Rekognition Custom Labels and Amazon SageMaker Ground Truth

Computer vision uses machine learning (ML) to build applications that process images or videos. With Amazon Rekognition, you can use pre-trained computer vision models to identify objects, people, text, activities, or inappropriate content. Our customers have use cases that span every industry, including media, finance, manufacturing, sports, and technology. Some of these use cases require […]

Streamlining data labeling for YOLO object detection in Amazon SageMaker Ground Truth

Object detection is a common task in computer vision (CV), and the YOLOv3 model is state-of-the-art in terms of accuracy and speed. In transfer learning, you obtain a model trained on a large but generic dataset and retrain the model on your custom dataset. One of the most time-consuming parts in transfer learning is collecting […]

Setting up human review of your NLP-based entity recognition models with Amazon SageMaker Ground Truth, Amazon Comprehend, and Amazon A2I

Update Aug 12, 2020 – New features: Amazon Comprehend adds five new languages (Spanish, French, German, Italian, and Portuguese). Amazon Comprehend increased the limit on the number of entities per custom entity model from 12 to 25. Organizations across industries have a lot of unstructured data that you can evaluate to get entity-based […]

Building a custom Angular application for labeling jobs with Amazon SageMaker Ground Truth

As a data scientist attempting to solve a problem using supervised learning, you usually need a high-quality labeled dataset before you start building your model. Amazon SageMaker Ground Truth makes building datasets for a range of tasks, like text classification and object detection, easier and more accessible to everyone. Ground Truth also helps you build […]

Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend

Update October 2020: Amazon Comprehend now supports Amazon SageMaker Ground Truth to help label your datasets for Comprehend’s custom model training. For Custom EntityRecognizer, check out the Annotations documentation for more details. For Custom MultiClass and MultiLabel Classifier, check out the MultiClass and MultiLabel documentation, respectively. Named entity recognition (NER) involves sifting through text data to locate noun phrases […]

Labeling data for 3D object tracking and sensor fusion in Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth now supports labeling 3D point cloud data. For more information about the launched feature set, see this AWS News Blog post. In this blog post, we specifically cover how to perform the required data transformations of your 3D point cloud data to create a labeling job in SageMaker Ground Truth for […]

Bring your own model for Amazon SageMaker labeling workflows with active learning

With Amazon SageMaker Ground Truth, you can easily and inexpensively build accurately labeled machine learning (ML) datasets. To decrease labeling costs, SageMaker Ground Truth uses active learning to differentiate between data objects (like images or documents) that are difficult and easy to label. Difficult data objects are sent to human workers to be annotated and […]

Identifying worker labeling efficiency using Amazon SageMaker Ground Truth

A critical success factor in machine learning (ML) is the cleanliness and accuracy of training datasets. Training with mislabeled or inaccurate data can lead to a poorly performing model. But how can you easily determine if the labeling team is accurately labeling data? One way is to manually sift through the results one worker at […]