Posted On: Nov 7, 2019
The AWS Step Functions Data Science Software Development Kit (SDK) is an open-source library that allows you to easily create workflows that pre-process data and then train and publish machine learning models using Amazon SageMaker and AWS Step Functions. You can create machine learning workflows in Python that orchestrate AWS infrastructure at scale, without having to provision and integrate the AWS services separately.
AWS Step Functions is a serverless orchestration service that allows you to build resilient workflows using AWS services such as Amazon SageMaker, AWS Glue, and AWS Lambda. Amazon SageMaker enables you to build, train, and deploy machine learning models quickly. Now with the new Data Science SDK, you can easily build workflows, also known as pipelines, on AWS infrastructure using the preferred tools of data scientists: Python and Jupyter Notebooks.
You can use the Data Science SDK to create and visualize end-to-end data science workflows that perform tasks such as data pre-processing on AWS Glue, and model training, hyperparameter tuning, and endpoint creation on Amazon SageMaker. You can reuse the workflows in production by exporting AWS CloudFormation templates.
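To make the shape of such a workflow concrete, the following is a minimal sketch of the kind of state machine definition a two-step pipeline corresponds to: a Glue pre-processing job followed by a SageMaker training job. It does not use the Data Science SDK itself; it builds the Amazon States Language definition as a plain Python dictionary using only the standard library. The function name `build_definition`, the state names, and the job names are hypothetical, and the `Parameters` blocks are deliberately incomplete: a real SageMaker training job also needs settings such as an algorithm specification, an IAM role, and input and output locations.

```python
import json


def build_definition(glue_job_name, training_job_name):
    """Return a two-step Amazon States Language definition as a plain dict.

    Illustrative sketch only: state names and job names are placeholders,
    and the training parameters are not sufficient to run a real job.
    """
    return {
        "Comment": "Sketch: pre-process with AWS Glue, then train with Amazon SageMaker",
        "StartAt": "PreprocessData",
        "States": {
            "PreprocessData": {
                "Type": "Task",
                # Step Functions service integration that waits for the Glue job to finish
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                "Next": "TrainModel",
            },
            "TrainModel": {
                "Type": "Task",
                # Step Functions service integration that waits for the training job to finish
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {"TrainingJobName": training_job_name},
                "End": True,
            },
        },
    }


definition = build_definition("example-preprocess-job", "example-training-job")
print(json.dumps(definition, indent=2))
```

With the SDK, you would typically describe these steps as Python objects and let the library generate and deploy a definition of this kind, rather than writing the JSON by hand.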
The Data Science SDK is included in AWS Step Functions pricing at no additional cost and is available in all regions where both AWS Step Functions and Amazon SageMaker are offered. The SDK can be used in conjunction with other services such as AWS Glue and AWS Lambda in their supported regions. For a complete list of regions and service offerings, see AWS Regions.
To get started with the AWS Step Functions Data Science SDK, download the Hello World notebook from GitHub, or open it from a notebook instance on Amazon SageMaker.
To learn more:
- Learn about the AWS Step Functions Data Science SDK in the technical documentation and GitHub.
- View the Step Functions Jupyter Sample Notebooks on Amazon SageMaker.