AWS Big Data Blog

Category: Application Integration

Monitor data pipelines in a serverless data lake

AWS serverless services, including but not limited to AWS Lambda, AWS Glue, AWS Fargate, Amazon EventBridge, Amazon Athena, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and Amazon Simple Storage Service (Amazon S3), have become the building blocks for any serverless data lake, providing key mechanisms to ingest and transform data […]

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

In the world of software engineering and development, organizations use project management tools like Atlassian Jira Cloud. Managing projects with Jira leads to rich datasets, which can provide historical and predictive insights about project and development efforts. Although Jira Cloud provides reporting capability, loading this data into a data lake will facilitate enrichment with other […]

Automate secure access to Amazon MWAA environments using existing OpenID Connect single-sign-on authentication and authorization

Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions developed using OpenID Connect (OIDC) providers with Amazon MWAA; this allows them to provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdP) […]

Introducing in-place version upgrades with Amazon MWAA

Today, AWS is announcing the availability of in-place version upgrades for Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This enhancement allows you to seamlessly upgrade your existing Apache Airflow version 2.x environments to newer available versions while retaining the workflow run history and environment configurations. You can now take advantage of the latest capabilities […]

Simplify AWS Glue job orchestration and monitoring with Amazon MWAA

Organizations across all industries have complex data processing requirements for their analytical use cases across different analytics systems, such as data lakes on AWS, data warehouses (Amazon Redshift), search (Amazon OpenSearch Service), NoSQL (Amazon DynamoDB), machine learning (Amazon SageMaker), and more. Analytics professionals are tasked with deriving value from data stored in these distributed systems […]

What’s new with Amazon MWAA support for startup scripts

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service for Apache Airflow that lets you use the same familiar Apache Airflow environment to orchestrate your workflows and enjoy improved scalability, availability, and security without the operational burden of having to manage the underlying infrastructure. In April 2023, Amazon MWAA added support for […]

What’s new with Amazon MWAA support for Apache Airflow version 2.4.3

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that makes it simple to set up and operate end-to-end data pipelines in the cloud at scale. Amazon MWAA supports multiple versions of Apache Airflow (v1.10.12, v2.0.2, and v2.2.2). Earlier in 2023, we added support for Apache Airflow v2.4.3 […]

Cross-account integration between SaaS platforms using Amazon AppFlow

Implementing an effective data sharing strategy that satisfies compliance and regulatory requirements is complex. Customers often need to share data between disparate software as a service (SaaS) platforms within their organization or across organizations. On many occasions, they need to apply business logic to the data received from the source SaaS platform before pushing it […]

Build event-driven data pipelines using AWS Controllers for Kubernetes and Amazon EMR on EKS

An event-driven architecture is a software design pattern in which decoupled applications can asynchronously publish and subscribe to events via an event broker. By promoting loose coupling between components of a system, an event-driven architecture leads to greater agility and can enable components in the system to scale independently and fail without impacting other services. […]

Synchronize your Salesforce and Snowflake data to speed up your time to insight with Amazon AppFlow

This post was co-written with Amit Shah, Principal Consultant at Atos. Customers across industries seek meaningful insights from the data captured in their Customer Relationship Management (CRM) systems. To achieve this, they combine their CRM data with a wealth of information already available in their data warehouse, enterprise systems, or other software as a service […]