AWS Storage Blog
Seamless streaming to Amazon S3 Tables with StreamNative Ursa Engine
Organizations are modernizing data platforms to use generative AI by centralizing data from various sources and streaming real-time data into data lakes. A strong data foundation, such as scalable storage, reliable ingestion pipelines, and interoperable formats, is critical for businesses to discover, explore, and consume data. As organizations modernize their platforms, they often turn to […]
Connect Snowflake to S3 Tables using the SageMaker Lakehouse Iceberg REST endpoint
Organizations today seek data analytics solutions that provide maximum flexibility and accessibility. Customers need their data to be readily available through their preferred query engines, and they need to break down barriers across different computing environments. At the same time, they want a single copy of data to be used across these solutions, to track lineage, be cost […]
Build a managed Apache Iceberg data lake using Starburst and Amazon S3 Tables
Managing large-scale data analytics across diverse data sources has long been a challenge for enterprises. Data teams often struggle with complex data lake configurations, performance bottlenecks, and the need to maintain consistent data governance while enabling broad access to analytics capabilities. Today, Starburst announces a powerful solution to these challenges by extending their Apache Iceberg […]
Build a data lake for streaming data with Amazon S3 Tables and Amazon Data Firehose
Businesses are increasingly adopting real-time data processing to stay ahead of user expectations and market changes. Industries such as retail, finance, manufacturing, and smart cities are using streaming data for everything from optimizing supply chains to detecting fraud and improving urban planning. The ability to use data as it is generated has become a critical […]
Event-driven framework to integrate AWS Backup service with CSPM tools
Many organizations use third-party Cloud Security Posture Management (CSPM) tools (for example, Wiz.io) to continuously detect and remediate misconfigurations from build time to runtime across hybrid clouds such as AWS. CSPM tools often use AWS resource tags to enhance their security and compliance monitoring capabilities. Tags are key-value pairs that you can assign to AWS resources […]
Optimizing Amazon FSx for Lustre storage consumption using automatic data tiering with Amazon S3
Managing high-performance file storage can be a significant operational and cost challenge for many organizations, especially those running compute-intensive workloads such as high-performance computing (HPC) or data analytics. This is particularly true for organizations with existing data lakes on Amazon S3 who need POSIX-compliant, high-performance file system access. Amazon FSx for Lustre provides a scalable, […]
Unlock higher performance for file system workloads with scalable metadata performance on Amazon FSx for Lustre
Imagine a company like a movie studio, one that works with enormous volumes of video files, scripts, and animation assets. They store these files on a high-performance file system such as Amazon FSx for Lustre, a fully managed shared storage built on the world’s most popular high-performance file system. Each file has metadata, such as […]
Access data in Amazon S3 Tables using PyIceberg through the AWS Glue Iceberg REST endpoint
Modern data lakes integrate with multiple engines to meet a wide range of analytics needs, from SQL querying to stream processing. A key enabler of this approach is the adoption of Apache Iceberg as the open table format for building transactional data lakes. However, as the Iceberg ecosystem expands, the growing variety of engines and languages has […]
Protect Oracle Databases on Amazon EC2 using NetApp SnapCenter with Amazon FSx for NetApp ONTAP
Oracle databases typically see significant data growth, which in turn increases backup, restore, and database refresh times. The ability to quickly back up, restore, and refresh large-scale databases is important for ensuring data consistency, business continuity, and accelerating testing and development processes. As more businesses migrate their Oracle databases to Amazon Elastic Compute Cloud (EC2) instances, […]
Integrating custom metadata with Amazon S3 Metadata
Organizations of all sizes face a common challenge: efficiently managing, organizing, and retrieving vast amounts of digital content. From images and videos to documents and application data, businesses are inundated with information that needs to be stored securely, accessed quickly, and analyzed effectively. The ability to extract, manage, and use metadata from this content is […]