AWS Big Data Blog
Category: Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
Introducing in-place version upgrades with Amazon MWAA
Today, AWS is announcing the availability of in-place version upgrades for Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This enhancement allows you to seamlessly upgrade your existing Apache Airflow version 2.x environments to newer available versions while retaining the workflow run history and environment configurations. You can now take advantage of the latest capabilities […]
Simplify AWS Glue job orchestration and monitoring with Amazon MWAA
Organizations across all industries have complex data processing requirements for their analytical use cases across different analytics systems, such as data lakes on AWS, data warehouses (Amazon Redshift), search (Amazon OpenSearch Service), NoSQL (Amazon DynamoDB), machine learning (Amazon SageMaker), and more. Analytics professionals are tasked with deriving value from data stored in these distributed systems […]
What’s new with Amazon MWAA support for startup scripts
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service for Apache Airflow that lets you use the same familiar Apache Airflow environment to orchestrate your workflows and enjoy improved scalability, availability, and security without the operational burden of having to manage the underlying infrastructure. In April 2023, Amazon MWAA added support for […]
What’s new with Amazon MWAA support for Apache Airflow version 2.4.3
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that makes it simple to set up and operate end-to-end data pipelines in the cloud at scale. Amazon MWAA supports multiple versions of Apache Airflow (v1.10.12, v2.0.2, and v2.2.2). Earlier in 2023, we added support for Apache Airflow v2.4.3 […]
Improve observability across Amazon MWAA tasks
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that makes it simple to set up and operate end-to-end data pipelines in the cloud at scale. A data pipeline is a set of tasks and processes used to automate the movement and transformation of data between different systems. […]
Automate data lineage on Amazon MWAA with OpenLineage
In modern data architectures, datasets are combined across an organization using a variety of purpose-built services to unlock insights. As a result, data governance becomes a key component for data consumers and producers to know that their data-driven decisions are based on trusted and accurate datasets. One aspect of data governance is data lineage, which […]
How ZS created a multi-tenant self-service data orchestration platform using Amazon MWAA
This post is co-authored by Manish Mehra, Anirudh Vohra, Sidrah Sayyad, and Abhishek I S (from ZS), and Parnab Basak (from AWS). The team at ZS collaborated closely with AWS to build a modern, cloud-native data orchestration platform. ZS is a management consulting and technology firm focused on transforming global healthcare and beyond. We […]
How GE Proficy Manufacturing Data Cloud replatformed to improve TCO, data SLA, and performance
This post is co-authored by Jyothin Madari, Madhusudhan Muppagowni, and Ayush Srivastava from GE. GE Proficy Manufacturing Data Cloud (MDC), part of GE Digital’s Manufacturing Execution Systems (MES) suite of solutions, allows GED’s customers to quickly and easily increase the value derived from the MES by reliably bringing enterprise-wide manufacturing data into the […]
Persist and analyze metadata in a transient Amazon MWAA environment
Customers can harness sophisticated orchestration capabilities through the open-source tool Apache Airflow. Airflow can be installed on Amazon EC2 instances or can be dockerized and deployed as a container on AWS container services. Alternatively, customers can opt to use Amazon Managed Workflows for Apache Airflow (Amazon MWAA). Amazon MWAA is a fully managed service that […]
How ENGIE scales their data ingestion pipelines using Amazon MWAA
ENGIE—one of the largest utility providers in France and a global player in the zero-carbon energy transition—produces, transports, and deals in electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated lots […]