AWS Big Data Blog
Category: Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
Introducing Amazon MWAA larger environment sizes
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service for Apache Airflow that streamlines the setup and operation of the infrastructure to orchestrate data pipelines in the cloud. Customers use Amazon MWAA to manage the scalability, availability, and security of their Apache Airflow environments. As they design more intensive, complex, and ever-growing […]
Introducing Amazon MWAA support for Apache Airflow version 2.8.1
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that makes it straightforward to set up and operate end-to-end data pipelines in the cloud. Organizations use Amazon MWAA to enhance their business workflows. For example, C2i Genomics uses Amazon MWAA in their data platform to orchestrate the validation […]
Disaster recovery strategies for Amazon MWAA – Part 1
In the dynamic world of cloud computing, ensuring the resilience and availability of critical applications is paramount. Disaster recovery (DR) is the process by which an organization anticipates and addresses technology-related disasters. For organizations implementing critical workload orchestration using Amazon Managed Workflows for Apache Airflow (Amazon MWAA), it is crucial to have a DR plan […]
Orchestrate Amazon EMR Serverless Spark jobs with Amazon MWAA, and data validation using Amazon Athena
As data engineering becomes increasingly complex, organizations are looking for new ways to streamline their data processing workflows. Many data engineers today use Apache Airflow to build, schedule, and monitor their data pipelines. However, as the volume of data grows, managing and scaling these pipelines can become a daunting task. Amazon Managed Workflows for Apache […]
Introducing shared VPC support on Amazon MWAA
In this post, we demonstrate automating deployment of Amazon Managed Workflows for Apache Airflow (Amazon MWAA) using customer-managed endpoints in a VPC, providing compatibility with shared, or otherwise restricted, VPCs. Data scientists and engineers have made Apache Airflow a leading open source tool to create data pipelines due to its active open source community, familiar […]
Introducing Amazon MWAA support for Apache Airflow version 2.7.2 and deferrable operators
Today, we are announcing the availability of Apache Airflow version 2.7.2 environments and support for deferrable operators on Amazon MWAA. In this post, we provide an overview of deferrable operators and triggers, including a walkthrough of an example showcasing how to use them. We also delve into some of the new features and capabilities of Apache Airflow, and how you can set up or upgrade your Amazon MWAA environment to version 2.7.2.
Use Snowflake with Amazon MWAA to orchestrate data pipelines
This blog post is co-written with James Sun from Snowflake. Customers rely on data from different sources such as mobile applications, clickstream events from websites, historical data, and more to deduce meaningful patterns to optimize their products, services, and processes. With a data pipeline, which is a set of tasks used to automate the movement […]
Set up fine-grained permissions for your data pipeline using MWAA and EKS
This blog post shows how to improve security in a data pipeline architecture based on Amazon Managed Workflows for Apache Airflow (Amazon MWAA) and Amazon Elastic Kubernetes Service (Amazon EKS) by setting up fine-grained permissions, using HashiCorp Terraform for infrastructure as code.
Introducing Apache Airflow version 2.6.3 support on Amazon MWAA
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that makes it simple to set up and operate end-to-end data pipelines in the cloud. Trusted across various industries, Amazon MWAA helps organizations like Siemens, ENGIE, and Choice Hotels International enhance and scale their business workflows, while significantly improving security […]
Automate secure access to Amazon MWAA environments using existing OpenID Connect single-sign-on authentication and authorization
Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions developed using OpenID Connect (OIDC) providers with Amazon MWAA; this allows them to provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdP) […]