Welcome!

Java Authors: Roger Strukhoff, Barbara Porter, Pat Romanski, Jim Kaskade, Liz McMillan

Related Topics: Cloud Expo, Java, SOA & WOA, Virtualization, Big Data Journal, SDN Journal

Cloud Expo: Blog Feed Post

Cloud Conversations: AWS EBS, Glacier and S3 Overview | Part 1

Simple Storage Services (S3), Glacier and Elastic Block Storage (EBS) are part of the AWS cloud storage portfolio of services

Amazon Web Services (AWS) recently added EBS Optimized support for enhanced bandwidth EC2 instances (read more here). This industry trends and perspective cloud conversation is the first (looking at EBS) in a three part series companion to the AWS EBS optimized post found here. Part II is here (closer look at S3) and part III is here (tying it all together).

For those not familiar, Simple Storage Services (S3), Glacier and Elastic Block Storage (EBS) are part of the AWS cloud storage portfolio of services. There are several other storage and data related service for little data database (SQL and NoSql based) other offerings include compute, data management, application and networking for different needs shown in the following image.

AWS services console image
AWS Services Console via www.amazon.com

Simple Storage Service (S3) is commonly used in the context of cloud storage and object storage accessed via its S3 API. S3 can be used externally from outside AWS as well as within or via other AWS services. For example with Elastic Cloud Compute (EC2) including via the Amazon Storage Gateway (read more here and about EC2 here). Glacier is the AWS cold or deep storage service for inactive data and is a companion to S3 that you can read more about here.

S3 is well suited for both big and little data repositories of objects ranging from backup to archive to active video images and much more. In fact if you are using some of the different AaaS or SaaS services including backup or file and video sharing, those may be using S3 as its back-end storage repository. For example NetFlix leverages various AWS capabilities as part of its data and applications infrastructure (read more here).

AWS Basics
AWS consists of multiple regions that contain multiple availability zones where data and applications are supported from.

yyyy

Note that objects stored in a region never leave that region, for example data stored in the EU west never leave Ireland, or data in the US East never leaves Virginia.

AWS does support the ability for user controlled movement of data between regions for business continuance (BC), high availability (HA) and disaster recovery (DR). Read more here at the AWS Security and Compliance site and also in this AWS white paper.

What About EBS?
That brings us to Elastic Block Storage (EBS) that is used by EC2 (read more about EC2 and instances here) as storage for cloud and virtual machines or compute instances. In addition to using S3 as a persistent backing store or target for holding snapshots EBS can be thought of as primary storage. You can provision and allocate EBS volumes in the different data centers of the various AWS availability zones. As part of allocating your EBS volume you indicate the type (standard) or provisioned IOP's or the new EBS Optimized volumes. EBS Optimized volumes enables instances that support the feature to have better IO performance to storage.

The following image shows an EC2 instance with EBS volumes (standard and provisioned IOPS's) along with S3 volumes and snapshots. In the following example the instance and volumes are being served via the AWS US East region (Northern Virginia) using availability zone US East 1a. In addition, EBS optimized volumes are shown being used in the example to increase bandwidth or throughput performance between storage and the compute instance.

xxxxxxx

Using the above as a basis, you can build off of that to leverage multiple availability zones or regions for HA, BC and DR combined with application, network load balancing and other capabilities. Note that EBS volumes are protected for durability by being spread across different servers and storage in an availability zone. Additional protection is provided by using snapshots combined with S3. Additional BC and DR or HA protection can be accomplished by replicating data across availability zones.

SQL applications using cloud and object storage services

The above is an example of tying various components and services together. For example using different AWS availability zones, instances, EBS, S3 and other tools including those from third parties. Here is a link to a free chapter download from Cloud and Virtual Data Storage Networking (CRC Press) pertaining to data protection, BC and DR (available at Amazon here and Kindle here). In addition here is an AWS white paper pertaining to using their services for BC, HA and DR.

EBS volumes are created ranging in size from 1GByte to 1Tbyte in space capacity with multiple volumes being mapped or attached to an EC2 instances. EBS volumes appear as a virtual disk drive for block storage. From the EC2 instance and guest operating system you can mount, format and use the EBS volumes as any other block disk drive with your favorite tools and file systems. In addition to space capacity, EBS volumes are also provisioned with standard IO (e.g. disk based) performance or high performance Provisioned IOPS (e.g. SSD) for thousands of IOPS per instance. AWS states that a standard EBS volume should support about 100 IOP's on average, with about 2,000 IOPS for a provisioned IOP volume. Need more than 2,000 IOPS, then the AWS recommendation is to use multiple IOP provisioned volumes with data spread across those. Following is an example of AWS EBS volumes seen via the EC2 management interface.

Image of mapping AWS EBS to ECS instance
AWS EC2 and EBS configuration status

Note that there is a 10 to 1 ratio of space capacity to IOP's being provisioned. If you try to play a game of 1,000 IOPS provisioned on a 10GByte EBS volume to keep your costs down you are out of luck. Thus to get 1,000 IOPS's you would need to allocate at least a 100GByte EBS volume of which you will be billed for the actual space used on a monthly pro-rated basis. The following is an example of provisioning an AWS EBS volume using provisioned IOPS in the US East region in the 1a availability zone.

Image of AWS EBS provisioned IOPs
Provisioning IOPS with EBS volume

Standard and Provisioned IOPS EBS volumes
Standard EBS volumes are good for boot images or other application usage that are not IO performance intensive. For database or other active applications where more performance is needed, then EBS Provisioned IOPS volumes are your option. Note that the provisioned IOP rate is persistent for the specific volume during its life. Thus if you set it and forget it including not using it without turning it off, you will be billed for provisioning it.

Additional reading and related items:

Continue reading part II (closer look at S3) here and part III (tying it all together) here.

Ok, nuff said (for now)

Cheers gs

Greg Schulz - Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO All Rights Reserved

Read the original blog entry...

More Stories By Greg Schulz

Greg Schulz is founder of the Server and StorageIO (StorageIO) Group, an IT industry analyst and consultancy firm. Greg has worked with various server operating systems along with storage and networking software tools, hardware and services. Greg has worked as a programmer, systems administrator, disaster recovery consultant, and storage and capacity planner for various IT organizations. He has worked for various vendors before joining an industry analyst firm and later forming StorageIO.

In addition to his analyst and consulting research duties, Schulz has published over a thousand articles, tips, reports and white papers and is a sought after popular speaker at events around the world. Greg is also author of the books Resilient Storage Network (Elsevier) and The Green and Virtual Data Center (CRC). His blog is at www.storageioblog.com and he can also be found on twitter @storageio.