AWS Big Data Blog

Category: AWS Glue

Accelerate your analytics with Amazon S3 Tables and Amazon SageMaker Lakehouse

Amazon SageMaker Lakehouse is a unified, open, and secure data lakehouse that now integrates with Amazon S3 Tables, the first cloud object store with built-in Apache Iceberg support. In this post, we show you how to use various analytics services through the integration of SageMaker Lakehouse with S3 Tables.

Build unified pipelines spanning multiple AWS accounts and Regions with Amazon MWAA

In this blog post, we demonstrate how to use Amazon MWAA for centralized orchestration, while distributing data processing and machine learning tasks across different AWS accounts and Regions for optimal performance and compliance.

Manage concurrent write conflicts in Apache Iceberg on the AWS Glue Data Catalog

This post demonstrates how to implement reliable concurrent write handling in Iceberg tables. We explore Iceberg’s concurrency model, examine common conflict scenarios, and provide practical implementation patterns for both automatic retry mechanisms and situations that require custom conflict resolution logic, so that you can build resilient data pipelines. We also cover how these patterns work alongside automatic compaction through AWS Glue Data Catalog table optimization.

Amazon Web Services named a Leader in the 2024 Gartner Magic Quadrant for Data Integration Tools

Amazon Web Services (AWS) has been recognized as a Leader in the 2024 Gartner Magic Quadrant for Data Integration Tools. We were positioned in the Challengers Quadrant in 2023. This recognition, we feel, reflects our ongoing commitment to innovation and excellence in data integration, demonstrating our continued progress in providing comprehensive data management solutions.

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

At Open Universities Australia (OUA), we empower students to explore a vast array of degrees from renowned Australian universities, all delivered through online learning. In this post, we show you how we used AWS services to replace our existing third-party ETL tool, improving the team’s productivity and producing a significant reduction in our ETL operational costs.

How MuleSoft achieved cloud excellence through an event-driven Amazon Redshift lakehouse architecture

In our previous thought leadership blog post, Why a Cloud Operating Model, we defined a COE Framework and showed why MuleSoft implemented it and the benefits they received from it. In this post, we’ll dive into the technical implementation, describing how MuleSoft used Amazon EventBridge, Amazon Redshift, Amazon Redshift Spectrum, Amazon S3, and AWS Glue to implement it.