AWS Big Data Blog
Category: Serverless
Retain more for less with tiered storage for Amazon MSK
Organizations are adopting Apache Kafka and Amazon Managed Streaming for Apache Kafka (Amazon MSK) to capture and analyze data in real time. Amazon MSK allows you to build and run production applications on Apache Kafka without needing Kafka infrastructure management expertise or having to deal with the complex overheads associated with running Apache Kafka on your […]
Simplify data analysis and collaboration with SQL Notebooks in Amazon Redshift Query Editor V2.0
Amazon Redshift Query Editor V2.0 is a web-based analyst workbench that you can use to author and run queries on your Amazon Redshift data warehouse. You can visualize query results with charts, and explore, share, and collaborate on data with your teams in SQL through a common interface. With SQL Notebooks, Amazon Redshift Query Editor […]
How The Mill Adventure enabled data-driven decision-making in iGaming using Amazon QuickSight
This post is co-written with Darren Demicoli from The Mill Adventure. The Mill Adventure is an iGaming industry enabler offering customizable turnkey solutions to B2B partners and custom branding enablement for its B2C partners. They provide a complete gaming platform, including licenses and operations, for rapid deployment and success in iGaming, and are committed to […]
Deploy DataHub using AWS managed services and ingest metadata from AWS Glue and Amazon Redshift – Part 2
In the first post of this series, we discussed the need for a metadata management solution for organizations. We used DataHub as an open-source metadata platform for metadata management and deployed it using AWS managed services with the AWS Cloud Development Kit (AWS CDK). In this post, we focus on how to populate technical metadata […]
Deploy DataHub using AWS managed services and ingest metadata from AWS Glue and Amazon Redshift – Part 1
Many organizations are establishing enterprise data warehouses, data lakes, or a modern data architecture on AWS to build data-driven products. As the organization grows, the number of publishers and subscribers to data and the volume of data keep increasing. Additionally, different varieties of datasets are introduced (structured, semistructured, and unstructured). This can lead to metadata […]
How a blockchain startup built a prototype solution to solve the need of analytics for decentralized applications with AWS Data Lab
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. This post is co-written with Dr. Quan Hoang Nguyen, CTO at Fantom Foundation. Here at Fantom Foundation (Fantom), we have developed a high performance, highly scalable, and secure smart contract platform. It’s […]
Use MSK Connect for managed MirrorMaker 2 deployment with IAM authentication
March 2025: This post was reviewed and updated for accuracy. MSK Replicator now makes it easier to set up cross-Region and same-Region replication without running MirrorMaker 2. Read AWS News Blog to learn more. In this post, we show how to use MSK Connect for MirrorMaker 2 deployment with AWS Identity and Access Management (IAM) authentication. We create […]
Simplify semi-structured nested JSON data analysis with AWS Glue DataBrew and Amazon QuickSight
As the industry grows with more data volume, big data analytics is becoming a common requirement in data analytics and machine learning (ML) use cases. Data comes from many different sources in structured, semi-structured, and unstructured formats. For semi-structured data, one of the most common lightweight file formats is JSON. However, due to the complex […]
Automate Amazon Redshift Serverless data warehouse management using AWS CloudFormation and the AWS CLI
Amazon Redshift Serverless makes it simple to run and scale analytics without having to manage the instance type, instance size, lifecycle management, pausing, resuming, and so on. It automatically provisions and intelligently scales data warehouse compute capacity to deliver fast performance for even the most demanding and unpredictable workloads, and you pay only for what […]
Ingest VPC flow logs into Splunk using Amazon Kinesis Data Firehose
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. December 2023: This post was reviewed and updated to remove the dependency on the AWS Lambda function according to the latest version in Splunk AWS Add-on (7.3.0). In September 2017, during the […]