AWS Big Data Blog

Category: Kinesis Data Streams

Optimize write throughput for Amazon Kinesis Data Streams

Amazon Kinesis Data Streams is used by many customers to capture, process, and store data streams at any scale. This scale is enabled by dividing each data stream into multiple shards. Each shard in a stream has a write throughput limit of 1 MB per second or 1,000 records per second. Whether your data streaming […]
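As a rough illustration of the per-shard write limit described above (1 MB per second or 1,000 records per second), the following sketch estimates how many shards a provisioned stream would need for a given write load. The function name and the sample workloads are hypothetical and are not taken from the post; the estimate also ignores hot partition keys, which can throttle a single shard even when the total looks sufficient.

```python
import math

# Per-shard write limits for Kinesis Data Streams.
SHARD_MB_PER_SEC = 1.0
SHARD_RECORDS_PER_SEC = 1000


def shards_needed(mb_per_sec: float, records_per_sec: int) -> int:
    """Estimate the shard count for a write workload (hypothetical helper).

    Whichever limit is hit first (bytes or record count) decides the result.
    """
    by_bytes = math.ceil(mb_per_sec / SHARD_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


# A byte-bound workload: 3.5 MB/s in 2,000 records/s.
print(shards_needed(3.5, 2000))  # 4
# A record-bound workload: many small records.
print(shards_needed(0.5, 4500))  # 5
```

The second case shows why small records matter: the stream is well under the byte limit, but the record-count limit still drives the shard count.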

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 2: AI applications

Welcome back to our exciting exploration of architectural patterns for real-time analytics with Amazon Kinesis Data Streams! In this fast-paced world, Kinesis Data Streams stands out as a versatile and robust solution to tackle a wide range of use cases with real-time data, from dashboarding to powering artificial intelligence (AI) applications. In this series, we […]

Build Spark Structured Streaming applications with the open source connector for Amazon Kinesis Data Streams

Apache Spark is a powerful big data engine used for large-scale data analytics. Its in-memory computing makes it well suited to iterative algorithms and interactive queries. You can use Apache Spark to process streaming data from a variety of streaming sources, including Amazon Kinesis Data Streams, for use cases like clickstream analysis, fraud detection, and more. Kinesis Data Streams is a serverless streaming data service that makes it straightforward to capture, process, and store data streams at any scale.

With the new open source Amazon Kinesis Data Streams Connector for Spark Structured Streaming, you can use the newer Spark Data Sources API. The connector also supports enhanced fan-out for dedicated read throughput and faster stream processing. In this post, we dive deep into the internal details of the connector and show you how to use it to consume records from and produce records to Kinesis Data Streams using Amazon EMR.

Krones real-time production line monitoring with Amazon Managed Service for Apache Flink

Krones provides breweries, beverage bottlers, and food producers all over the world with individual machines and complete production lines. This post shows how Krones built a streaming solution to monitor their lines, based on Amazon Kinesis and Amazon Managed Service for Apache Flink. These fully managed services reduce the complexity of building streaming applications with Apache Flink. Managed Service for Apache Flink manages the underlying Apache Flink components that provide durable application state, metrics, logs, and more, and Kinesis enables you to cost-effectively process streaming data at any scale.

Exploring real-time streaming for generative AI applications

Foundation models (FMs) are large machine learning (ML) models trained on a broad spectrum of unlabeled and generalized datasets. FMs, as the name suggests, provide the foundation to build more specialized downstream applications, and are unique in their adaptability. They can perform a wide range of different tasks, such as natural language processing, classifying images, […]

Invoke AWS Lambda functions from cross-account Amazon Kinesis Data Streams

A multi-account architecture on AWS is essential for enhancing security, compliance, and resource management by isolating workloads, enabling granular cost allocation, and facilitating collaboration across distinct environments. It also mitigates risks, improves scalability, and allows for advanced networking configurations. In a streaming architecture, you may have event producers, stream storage, and event consumers in a […]

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

We’re living in the age of real-time data and insights, driven by low-latency data streaming applications. Today, everyone expects a personalized experience in any application, and organizations are constantly innovating to increase their speed of business operation and decision making. The volume of time-sensitive data produced is increasing rapidly, with different formats of data being […]

Run Kinesis Agent on Amazon ECS

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Kinesis Agent is a standalone Java software application that offers a straightforward way to collect and send data to Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. The agent continuously monitors […]

Amazon Kinesis Data Streams: celebrating a decade of real-time data innovation

Data is a key strategic asset for every organization, and every company is a data business at its core. However, in many organizations, data is typically spread across a number of different systems such as software as a service (SaaS) applications, operational databases, and data warehouses. Such data silos make it difficult to get unified […]