AWS Availability How-To: Achieving Fault Tolerance and Redundancy in EC2

Feb 27, 2017

With millions of customers and a host of entry-level services that help companies move to the cloud quickly, it’s no surprise that organizations are shifting mission-critical apps and services onto Amazon Elastic Compute Cloud (EC2). But even Amazon isn’t immune to hardware failures, natural disasters or other system outages.

Before you spin up, review the following AWS availability primer for achieving fault tolerance and redundancy in EC2.

Fault Tolerance & Redundancy: The Same, But Different

First up? Defining the difference between redundant and fault-tolerant solutions. While the terms are certainly related — and often used interchangeably — they’re not exactly the same. And although there’s no hard-and-fast rule regarding the definitions, the commonly accepted answer goes like this:

  • Components — such as disks, racks or servers — are redundant.
  • Systems — such as disk arrays or cloud computing networks — are fault tolerant.

Put simply, redundant means having more than one of something in case the first one fails. Two disks in the same system that are kept in sync are redundant, since if one fails the other can pick up the slack. If the entire system fails, however, both disks are useless. This is the role of fault tolerance: keeping the system as a whole operating even if portions of it fail. So, how does this apply to EC2 and the Amazon cloud?
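One way to picture the distinction in code: the replicas are the redundant components, and the logic that falls back from one to the next is what makes the overall read fault tolerant. The sketch below is purely illustrative; `read_with_failover` and the two disk functions are hypothetical names, not part of any AWS API.

```python
from typing import Callable, Sequence


def read_with_failover(replicas: Sequence[Callable[[], str]]) -> str:
    """Return the first successful read, trying each redundant replica in order."""
    errors = []
    for read in replicas:
        try:
            return read()
        except OSError as exc:  # this replica failed; fall back to the next one
            errors.append(exc)
    # Every redundant component failed, so the system as a whole fails too.
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def failed_disk() -> str:
    raise OSError("disk 0 is offline")  # hypothetical failed component


def healthy_disk() -> str:
    return "payload"  # hypothetical healthy component


result = read_with_failover([failed_disk, healthy_disk])
print(result)
```

If both entries in the list were `failed_disk`, the call would raise instead: redundancy only helps while at least one copy is still reachable.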


Saving Grace

For many companies, the cloud acts as both a home for applications and a flexible disaster recovery (DR) service in the event of local systems failure. But what happens when the cloud itself goes down? Like all cloud providers, Amazon has experienced outages due to weather, power failures and other disasters; while the company promises 99.95 percent uptime for its compute instances, this still allows roughly 4.4 hours of downtime per year. Using Amazon as a DR solution is now both possible and recommended, but it isn’t perfect. To address this issue, EC2 comes with several tools that can help companies increase both their total redundancy and overall fault tolerance.
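To see where the downtime figure comes from, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the calculation assumes a 365-day year and treats the uptime percentage as applying evenly across the period.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted by an uptime percentage over a period."""
    return (1 - uptime_percent / 100) * period_hours


yearly = allowed_downtime_hours(99.95)
monthly_minutes = allowed_downtime_hours(99.95, period_hours=30 * 24) * 60

print(f"{yearly:.2f} hours per year")
print(f"{monthly_minutes:.1f} minutes per 30-day month")
```

At 99.95 percent, that works out to about 4.38 hours per year, or a little under 22 minutes in a 30-day month.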

Ramping Up AWS Redundancy

How do companies address the issue of redundancy in their EC2 instances? It starts with availability zones (AZs). AZs are grouped into regions, meaning that if you choose a region on the West Coast of the United States you’ll have a choice of multiple zones within it that are independently powered and cooled, and have their own network and security architectures. AZs are insulated from the failures of other zones in the region, making them a simple form of redundancy. By running copies of your EC2 instances in multiple AZs, you significantly reduce the chance of total outage or failure.
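A rough way to quantify that reduction is sketched below. It assumes zone failures are statistically independent, which is an idealization, and the 0.9995 per-zone availability is only a sample input chosen for illustration, not a published per-zone figure.

```python
def combined_availability(per_zone_availability: float, zones: int) -> float:
    """Availability of a service that stays up while at least one zone is up.

    Assumes zone failures are independent; region-wide events can still
    take out several zones at once, so treat the result as an upper bound.
    """
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return 1 - (1 - per_zone_availability) ** zones


for zones in (1, 2, 3):
    unavailable = 1 - combined_availability(0.9995, zones)
    print(f"{zones} zone(s): unavailable fraction ~ {unavailable:.3e}")
```

Under these assumptions, each added zone multiplies the chance of a total outage by the single-zone failure probability, which is why even two zones make such a difference.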

It’s worth noting that bandwidth across zone boundaries costs $0.01/GB, which is a fraction of the cost of Internet traffic at large but is important to consider when calculating cloud costs. It’s also important to remember that the speed of light puts a floor under transfer times: the farther apart your EC2 instances are housed, the more latency you may see, including when traffic shifts after a failure. AZs within a single region are close enough that this is usually small; it matters far more between regions.
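Both considerations are easy to estimate on the back of an envelope. The sketch below uses the $0.01/GB figure quoted above (check current pricing before relying on it) and assumes light travels through optical fiber at roughly 200,000 km/s, about two thirds of its speed in a vacuum. The function names, traffic volume and distances are hypothetical.

```python
CROSS_AZ_PRICE_PER_GB = 0.01     # USD, the figure quoted in the article
FIBER_SPEED_KM_PER_S = 200_000   # assumption: roughly 2/3 the speed of light


def cross_az_transfer_cost(gigabytes: float,
                           price_per_gb: float = CROSS_AZ_PRICE_PER_GB) -> float:
    """Cost in USD of moving `gigabytes` across a zone boundary."""
    return gigabytes * price_per_gb


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds.

    Real networks add routing, queuing and processing delay on top.
    """
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


print(f"500 GB replicated: ${cross_az_transfer_cost(500):.2f}")
for km in (100, 4000):
    print(f"{km} km apart: at least {min_round_trip_ms(km):.1f} ms round trip")
```

The propagation delay is only a floor, but it shows why nearby zones add little latency while a cross-country hop adds tens of milliseconds to every round trip.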

Finding Fault Tolerance

As noted by the AWS Reference Architecture for Fault Tolerance and High Availability, while higher-level services such as the Amazon Simple Storage Service (S3), Amazon SimpleDB, Simple Queue Service (SQS) and Elastic Load Balancing (ELB) are inherently fault-tolerant, EC2 instances come with a number of tools that must be properly used to achieve overall fault tolerance.

For example, employing ELB can help route traffic away from failed EC2 instances and ensure you’re not wasting resources, while creating an Auto Scaling group alongside an existing ELB load balancer will automatically terminate “unhealthy” instances and launch new ones. Also critical is the use of Elastic IP addresses, which are public IP addresses that can be mapped to any EC2 instance in the same region, since they’re associated with your AWS account and not the instance itself. In the event of a sudden EC2 failure, an Elastic IP lets you shift network requests and traffic to a replacement instance in under two minutes. It’s also a good idea to make use of snapshots in combination with S3: by taking regular point-in-time snapshots of the EBS volumes attached to your EC2 instance, which are stored in S3 and so replicated across multiple AZs, it’s possible to reduce the impact of unexpected or emerging faults.
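As an illustration of the Elastic IP remapping described above, here is a minimal sketch. It assumes a client object that behaves like boto3’s EC2 client and uses its `associate_address` call; the function name, the stand-in client and all IDs are hypothetical placeholders, so the example runs without touching AWS.

```python
from typing import Any


def fail_over_elastic_ip(ec2_client: Any, allocation_id: str,
                         standby_instance_id: str) -> str:
    """Point an Elastic IP at a standby instance and return the association ID.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=standby_instance_id,
        # Take the address over even if it is still attached elsewhere.
        AllowReassociation=True,
    )
    return response["AssociationId"]


class FakeEC2Client:
    """Stand-in that records calls, used only to exercise the function."""

    def __init__(self) -> None:
        self.calls = []

    def associate_address(self, **kwargs: Any) -> dict:
        self.calls.append(kwargs)
        return {"AssociationId": "eipassoc-example"}


fake = FakeEC2Client()
association_id = fail_over_elastic_ip(fake, "eipalloc-example", "i-standby-example")
print(association_id)
```

Against a real account you would pass `boto3.client("ec2")` in place of the stand-in, along with your own allocation and instance IDs. In practice this step would usually be triggered by a health check or monitoring alarm, not run by hand.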

Mission-critical workloads now have a place in Amazon’s EC2 offering. Ensuring the high availability these workloads demand, however, means making the best use of both the redundancy and the fault-tolerance tools available to any EC2 instance.


Adam Cady
Vice President of Operations

Adam Cady is Vice President of Operations at SingleHop. Cady leads SingleHop's Service First Support team and is responsible for developing cutting-edge managed services solutions and software tools.
