AWS Big Data Blog

Category: Compute

Building a Real World Evidence Platform on AWS

Deriving insights from large datasets is central to nearly every industry, and life sciences is no exception. To combat the rising cost of bringing drugs to market, pharmaceutical companies are looking for ways to optimize their drug development processes. They are turning to big data analytics to better quantify the effect that their drug compounds […]

Build a Serverless Architecture to Analyze Amazon CloudFront Access Logs Using AWS Lambda, Amazon Athena, and Amazon Kinesis Analytics

Nowadays, it’s common for a web server to be fronted by a global content delivery service, like Amazon CloudFront. This type of front end accelerates delivery of websites, APIs, media content, and other web assets to provide a better experience to users across the globe. The insights gained by analysis of Amazon CloudFront access logs […]

Build a Healthcare Data Warehouse Using Amazon EMR, Amazon Redshift, AWS Lambda, and OMOP

In the healthcare field, data comes in all shapes and sizes. Despite efforts to standardize terminology, some concepts (e.g., blood glucose) are still often depicted in different ways. This post demonstrates how to convert an openly available dataset called MIMIC-III, which consists of de-identified medical data for about 40,000 patients, into an open source data […]

Processing VPC Flow Logs with Amazon EMR

In this post, I show you how to gain valuable insight into your network by using Amazon EMR and Amazon VPC Flow Logs. The walkthrough implements a pattern often found in network equipment called ‘Top Talkers’, an ordered list of the heaviest network users, but the model can also be used for many other types of network analysis.

Simplify Management of Amazon Redshift Snapshots using AWS Lambda

NOTE: Amazon Redshift now supports creating an automatic snapshot schedule using the snapshot scheduler. For more information, please review this “What’s New” post. Ian Meyers is a Solutions Architecture Senior Manager with AWS. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data […]

Real-time in-memory OLTP and Analytics with Apache Ignite on AWS

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Babu Elumalai is a Solutions Architect with AWS. Organizations are generating tremendous amounts of data, and they increasingly need tools and systems that help them use this data to make decisions. The […]

From SQL to Microservices: Integrating AWS Lambda with Relational Databases

Bob Strahan is a Senior Consultant with AWS Professional Services. AWS Lambda has emerged as an excellent compute platform for modern microservices architecture, driving dramatic advancements in flexibility, resilience, scale, and cost effectiveness. Many customers can take advantage of this transformational technology from within their existing relational database applications. In this post, we explore how to […]

Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams

This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: “We are one of the world’s largest social platforms for giving that’s helped 26.1 million registered users in 196 countries raise $3.8 billion for over 27,000 good causes.” Introduction: As more devices, sensors […]