AWS Big Data Blog
Implement a Real-time, Sliding-Window Application Using HAQM Kinesis and Apache Storm
Rahul Bhartia is an AWS Solutions Architect
Streams of data are becoming ubiquitous today: clickstreams, log streams, event streams, and more. The need for real-time processing of high-volume data streams is pushing the limits of traditional data processing infrastructures. Building a clickstream monitoring system, for example, where data arrives as a continuous clickstream rather than as discrete data sets, requires continuous processing rather than ad hoc, one-time queries.
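To make the sliding-window idea concrete, here is a minimal, self-contained sketch of a trailing-window event counter of the kind a Storm bolt might maintain over a clickstream. The class and method names are illustrative only; this is not code from the application described here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window counter: counts events whose timestamps
// fall within the trailing window of the most recent event.
public class SlidingWindowCounter {
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record an event at the given time, evict anything that has
    // fallen out of the window, and return the current window count.
    public int add(long eventTimeMillis) {
        timestamps.addLast(eventTimeMillis);
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= eventTimeMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

A real Storm topology would hold a structure like this inside a bolt's `execute` method, keyed by page or user, and emit the count downstream on each window update.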
Developers can use Apache Storm and HAQM Kinesis to quickly and cost-effectively build an application that continuously processes very high volumes of streaming data. To help developers integrate Apache Storm with HAQM Kinesis, earlier this year we launched the HAQM Kinesis Storm Spout. Last week we released an update to the Spout to support Ack/Fail semantics. With this update, the Spout now re-emits failed messages up to the configured retry limit, making it easier to build reliable data processing applications. The updated HAQM Kinesis Storm Spout is available on GitHub.
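The retry bookkeeping behind ack/fail semantics can be sketched roughly as follows. This is a simplified illustration of the general pattern, not the Spout's actual API; the class and method names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative ack/fail retry tracker: counts failures per message ID
// and allows re-emission until a configured retry limit is exhausted.
public class RetryTracker {
    private final int retryLimit;
    private final Map<String, Integer> attempts = new HashMap<>();

    public RetryTracker(int retryLimit) {
        this.retryLimit = retryLimit;
    }

    // On ack, the message succeeded; its retry state can be discarded.
    public void ack(String messageId) {
        attempts.remove(messageId);
    }

    // On fail, return true if the message should be re-emitted,
    // false once the retry limit has been exhausted.
    public boolean fail(String messageId) {
        int tries = attempts.merge(messageId, 1, Integer::sum);
        if (tries > retryLimit) {
            attempts.remove(messageId); // give up on this record
            return false;
        }
        return true;
    }
}
```

In Storm, the spout's `ack` and `fail` callbacks would drive this state, with `fail` deciding whether to put the record back on the emit queue.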
Check out the white paper to learn how the entire stack works, from ingestion to visualization, and see our GitHub repository for instructions on building and deploying it yourself.
If you have questions or suggestions, please leave a comment below.
Do more with HAQM Kinesis!
Processing HAQM Kinesis Stream Data Using HAQM KCL for Node.js
Hosting HAQM Kinesis Applications on AWS Elastic Beanstalk
Snakes in the Stream! Feeding and Eating HAQM Kinesis Streams with Python