AWS Architecture Blog
Category: Resilience
Pilot light with reserved capacity: How to optimize DR cost using On-Demand Capacity Reservations
In this post, we explore an intermediate strategy between the pilot light and the warm standby strategies: pilot light with reserved capacity. You can use this strategy to reserve compute capacity in a secondary Region while also limiting cost.
Enhance the resilience of critical workloads by architecting with multiple AWS Regions
In this post, we will share how you can use multi-Region as an architectural approach to achieve higher resilience on Amazon Web Services (AWS). This approach relies on first operating a workload across multiple Availability Zones within an AWS Region, before expanding to achieve even higher resilience by using multiple Regions.
Know before you go – AWS re:Invent 2024 cloud resilience
If you’re attending AWS re:Invent 2024 with the goal of improving your organization’s cloud resilience operations, we will be offering valuable insights, best practices, and fun activities to improve your cloud resilience expertise. This year, we’re offering more than 100 resilience breakout sessions, workshops, chalk talks, builders’ sessions, and code talks. Find the complete list in the re:Invent 2024 session catalog and filter by “Resilience” in the area of interest field. In this post, we highlight must-see sessions for those building resilient applications and architectures on AWS. Reserved seating is now open, so act quickly to claim your seat. Be sure to also check out the vertical-specific re:Invent guides.
London Stock Exchange Group uses chaos engineering on AWS to improve resilience
This post was co-written with Luke Sudgen, Lead DevOps Engineer Post Trade, and Padraig Murphy, Solutions Architect Post Trade, from London Stock Exchange Group. In this post, we’ll discuss some failure scenarios that were tested by London Stock Exchange Group (LSEG) Post Trade Technology teams during a chaos engineering event supported by AWS. Chaos engineering […]
Journey to Adopt Cloud-Native Architecture Series: #3 – Improved Resilience and Standardized Observability
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. In the last blog, Maximizing System Throughput, we talked about design patterns you can adopt to address immediate scaling challenges to provide a better customer experience. In this blog, we talk about architecture patterns to improve system resiliency, why observability […]
IT Resilience Within AWS Cloud, Part II: Architecture and Patterns
In Part I of this two-part blog, we outlined best practices to consider when building resilient applications in hybrid on-premises/cloud environments. We also showed you how to adapt mindsets and organizational culture. In Part II, we’ll provide technical considerations related to architecture and patterns for resilience in AWS Cloud. […]
IT Resilience Within AWS Cloud, Part I: Mindset and Culture
As customers migrate to the cloud, many struggle to adapt business continuity and operational plans from their on-premises environments. This affects the resilience of critical business applications and can stall cloud adoption. This two-part blog series will provide guidance on implementing IT resilience strategies. In Part I, we’ll review challenges commonly experienced by executive builders. […]