AWS Big Data Blog

Category: Amazon Managed Streaming for Apache Kafka (Amazon MSK)

Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML

The importance of data warehouses and analytics performed on data warehouse platforms has been increasing steadily over the years, with many businesses coming to rely on these systems as mission-critical for both short-term operational decision-making and long-term strategic planning. Traditionally, data warehouses are refreshed in batch cycles, for example, monthly, weekly, or daily, so that […]

Analyze real-time streaming data in Amazon MSK with Amazon Athena

Recent advances in ease of use and scalability have made streaming data easier to generate and use for real-time decision-making. Coupled with market forces that have forced businesses to react more quickly to industry changes, more and more organizations today are turning to streaming data to fuel innovation and agility. Amazon Managed Streaming for Apache […]

Gain visibility into your Amazon MSK cluster by deploying the Conduktor Platform

This is a guest post by AWS Data Hero and co-founder of Conduktor, Stephane Maarek. Deploying Apache Kafka on AWS is now easier, thanks to Amazon Managed Streaming for Apache Kafka (Amazon MSK). In a few clicks, it provides you with a production-ready Kafka cluster on which you can run your applications and create data […]

How SOCAR built a streaming data pipeline to process IoT data for real-time analytics and control

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. SOCAR is the leading Korean mobility company with strong competitiveness in car-sharing. SOCAR has become a comprehensive mobility platform in collaboration with Nine2One, an e-bike sharing service, […]

Retain more for less with tiered storage for Amazon MSK

Organizations are adopting Apache Kafka and Amazon Managed Streaming for Apache Kafka (Amazon MSK) to capture and analyze data in real-time. Amazon MSK allows you to build and run production applications on Apache Kafka without needing Kafka infrastructure management expertise or having to deal with the complex overheads associated with running Apache Kafka on your […]

Use MSK Connect for managed MirrorMaker 2 deployment with IAM authentication

March 2025: This post was reviewed and updated for accuracy. MSK Replicator now makes it easier to set up cross-Region and same-Region replication without running MirrorMaker 2. Read AWS News Blog to learn more. In this post, we show how to use MSK Connect for MirrorMaker 2 deployment with AWS Identity and Access Management (IAM) authentication. We create […]

Split your monolithic Apache Kafka clusters using Amazon MSK Serverless

Today, many companies are building real-time applications to improve their customer experience and get immediate insights from their data before it loses its value. As a result, companies have been facing increasing demand to provide data streaming services such as Apache Kafka for developers. To meet this demand, companies typically start with a small- or […]

Reduce network traffic costs of your Amazon MSK consumers with rack awareness

Amazon Managed Streaming for Apache Kafka (Amazon MSK) runs Apache Kafka clusters for you in the cloud. Although using cloud services means you don’t have to manage racks of servers any more, we take advantage of rack aware features in Apache Kafka to spread risk across AWS Availability Zones and increase availability of Amazon MSK […]

Best practices to optimize cost and performance for AWS Glue streaming ETL jobs

AWS Glue streaming extract, transform, and load (ETL) jobs allow you to process and enrich vast amounts of incoming data from systems such as Amazon Kinesis Data Streams, Amazon Managed Streaming for Apache Kafka (Amazon MSK), or any other Apache Kafka cluster. It uses the Spark Structured Streaming framework to perform data processing in near-real […]

How Epos Now modernized their data platform by building an end-to-end data lake with the AWS Data Lab

Epos Now provides point of sale and payment solutions to over 40,000 hospitality and retail businesses across 71 countries. Their mission is to help businesses of all sizes reach their full potential through the power of cloud technology, with solutions that are affordable, efficient, and accessible. Their solutions allow businesses to leverage actionable insights, manage their […]