Amazon S3

S3 is an asset store: a repository of content with different rights for different users, and a formal SLA. It can be accessed remotely (with a fee per GB transferred), or from within the EC2 farms. From the EC2 farms, access to the US S3 store does not incur any bandwidth charges; access to the EU store is billed at both ends.

The S3 store is where AMIs are kept.

API

There is some coverage of S3 in the RESTful Web Services book. There is a SOAP API, but the REST API is the main one, as every uploaded asset can be accessed via an HTTP URL. BitTorrent support is an extra feature of the REST API.

Concepts

S3 Bucket

An S3 Bucket is something that holds data. Every bucket gets a hostname.

  1. billing is by the bucket
  2. the bucket name becomes the hostname
  3. bucket names must be unique across all users
  4. names exclude characters not allowed in hostnames. Only alphanumerics and ".-_" are allowed, and - and _ are discouraged; 3 <= length <= 255; IP-style names such as 123.456 are forbidden (why?)
  5. there is a limit on the number of buckets per account
  6. Access control is by the bucket. You can grant anonymous access, or access to named AWS users. Amazon account email addresses are used for naming.

As bucket names are global across all users (and visible to anyone with a web browser), Amazon warns users not to expect high performance from bucket creation and deletion. DNS updates may take a while too. Developers should not create and delete buckets wildly.
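
The naming rules above can be checked before you ever make a request. This is a minimal sketch, not Amazon's own validation: the function name is made up, "IP-style" is interpreted as dot-separated groups of digits, and the upper length limit is a parameter because it is an assumption here.

```python
import re

# Characters the list above allows: alphanumerics plus ".", "-" and "_".
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
# Assumption: "IP-style" means dot-separated groups of digits and nothing else.
_IP_STYLE = re.compile(r"[0-9]+(\.[0-9]+)+")


def check_bucket_name(name, max_len=255):
    """Return (errors, warnings) for a proposed bucket name.

    max_len is an assumption; pass whatever limit Amazon currently documents.
    """
    errors, warnings = [], []
    if not 3 <= len(name) <= max_len:
        errors.append("length must be between 3 and %d" % max_len)
    if not _ALLOWED.fullmatch(name):
        errors.append("only alphanumerics and '.-_' are allowed")
    if _IP_STYLE.fullmatch(name):
        errors.append("IP-style names are forbidden")
    if "-" in name or "_" in name:
        warnings.append("'-' and '_' are discouraged")
    return errors, warnings


print(check_bucket_name("smartfrog"))
print(check_bucket_name("123.456"))
```

A local check like this only catches malformed names; whether a name is already taken by another user can only be discovered by asking S3.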

S3 Object

An S3 Object is an asset stored in the repository.

  1. Every Object has a key which, with the bucket, forms a unique ID. (bucket,key) -> object
  2. Objects have names, which can include forward slashes
  3. Objects are referenced as resources under a bucket. This maps 1:1 with URLs
  4. Objects can have metadata.
  5. Maximum size of a single object: 5GB.
  6. Max metadata size: 2KB
  7. System metadata: metadata used by S3 (content-type, etc)
  8. User metadata: any custom metadata
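
The (bucket, key) to URL mapping can be sketched in a couple of lines. The function name is hypothetical, and the http://bucket.s3.amazonaws.com/ form is the hostname convention described in the bucket section; percent-encoding everything in the key except forward slashes is an assumption about what a well-behaved client should do.

```python
from urllib.parse import quote


def object_url(bucket, key):
    """Map a (bucket, key) pair to its URL; '/' in the key is kept as a path separator."""
    return "http://%s.s3.amazonaws.com/%s" % (bucket, quote(key, safe="/"))


print(object_url("smartfrog", "images/logo one.png"))
```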

Cost

      Storage
      $0.15 per GB-Month of storage used

      Data Transfer
      $0.10 per GB - all data transfer in
      $0.18 per GB - first 10 TB / month data transfer out
      $0.16 per GB - next 40 TB / month data transfer out
      $0.13 per GB - data transfer out / month over 50 TB

      Requests
      $0.01 per 1,000 PUT or LIST requests
      $0.01 per 10,000 GET and all other requests*
      * No charge for delete requests

The pricing is designed to deliver added value to the big users, the companies that use S3 for storing photos and videos. The per-request charge puts a slight penalty on people using S3 as the store for lots of small content, like icons and other artwork, because they build up a bill even if the amount of data is low. Given that every HTTP operation incurs server-side costs (electricity, CPU load, depreciation of servers), this probably makes sense.
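
To see how the per-request charge bites, here is a rough monthly bill calculator built from the price table above. It is a sketch only: the function names are made up, the prices are the ones quoted on this page and will date, and treating a TB as 1024 GB is an assumption because the table does not say.

```python
GB_PER_TB = 1024  # assumption: the table does not say whether a TB is 1000 or 1024 GB

# (tier size in GB, price per GB) for data transfer out, from the table above
OUT_TIERS = [(10 * GB_PER_TB, 0.18), (40 * GB_PER_TB, 0.16), (float("inf"), 0.13)]


def transfer_out_cost(gb):
    """Charge each slice of outbound traffic at the price of the tier it falls in."""
    cost, remaining = 0.0, gb
    for size, price in OUT_TIERS:
        used = min(remaining, size)
        cost += used * price
        remaining -= used
        if remaining <= 0:
            break
    return cost


def monthly_cost(stored_gb, in_gb, out_gb, put_list_requests, other_requests):
    """Monthly bill in dollars; other_requests excludes DELETEs, which are free."""
    return (stored_gb * 0.15
            + in_gb * 0.10
            + transfer_out_cost(out_gb)
            + put_list_requests / 1000 * 0.01
            + other_requests / 10000 * 0.01)


# A site serving lots of small icons: 1 GB stored, 5 GB out, ten million GETs.
print(round(monthly_cost(1, 0, 5, 0, 10_000_000), 2))
```

In that icon example the storage and bandwidth come to about a dollar, while the ten million GET requests add ten dollars on top.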

S3 Costs and EC2

For storing AMIs, you pay about $1-$2 to upload the image, then, assuming it is a small image, about $1/month in storage. You don't pay anything for loading the image in the EC2 farm. For using S3 as the persistence layer for your AMI, you only pay for the storage, though of course you may pay more if the files so uploaded are made directly available to customers. However, as all GET requests served directly to your customers bypass the EC2 servers, you save on server fees in such a situation.

As mentioned before, transfer between the EC2 servers and the EU S3 store is billed at both ends. However, EU customers get to experience lower latency with the data in this store. In some applications, it may be better to store data in the EU S3 store, especially if you have client applications that upload directly from the EU customers' machines.

GUI

The S3 Firefox Organiser provides a GUI to let you synchronise a local directory with the S3 store. You can use this to publish AMIs or to move other data to and fro. For $0.10+VAT you can even use it to back up your photo collection.

File system semantics

This is not your normal filesystem. It has caching built in, and is designed to scale out by taking away some of the things people expect from a local filesystem.

  • No locking. "If two puts are simultaneously made to the same key, the put with the latest time stamp wins."
  • After an operation has returned success, the data is stored.
  • The effects of an operation may take an undefined period of time to propagate. Until the propagation has completed, read operations may return the old view of the data (including its absence or presence)
  • There is no statement of how operations are ordered. The changes from the second operation in a sequence may be visible before the changes of the first

What they do appear to guarantee is that once a change has propagated, it will be what people see.
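
One way for a client to live with these semantics is to poll after a write until a read shows the new state. The helper below is a sketch under stated assumptions: the name wait_until, the retry count and the doubling delay are all made up, and read stands for whatever GET or HEAD call your library provides.

```python
import time


def wait_until(read, is_expected, attempts=5, delay=0.5, sleep=time.sleep):
    """Poll read() until is_expected(value) holds; return the value, or raise."""
    for attempt in range(attempts):
        value = read()
        if is_expected(value):
            return value
        if attempt < attempts - 1:
            sleep(delay * 2 ** attempt)  # back off: 0.5s, 1s, 2s, ...
    raise TimeoutError("change not visible after %d reads" % attempts)


# Simulated reads: the old view is returned twice before the change shows up.
views = iter(["old", "old", "new"])
print(wait_until(lambda: next(views), lambda v: v == "new", sleep=lambda s: None))
```

A successful poll only shows that this one read saw the change. Going by the rules above, it says nothing about what a different reader will see at the same moment.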

Developer API (RESTful)

There are two APIs, a SOAP API and a REST API. We will ignore the SOAP API, as it requires a feature (WS-Security) that is not in Alpine.

List

List all objects matching a specific key. Returns an XML document you can apply XPath to. The size of the return list does not significantly affect the time to create the list (i.e. linear or better). Note that parsing long lists can hurt the memory consumption of a DOM-style XML parser.
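
To keep memory down on long lists, the XML can be streamed instead of loaded into a DOM. The sketch below uses Python's xml.etree.ElementTree.iterparse. The sample document is invented for illustration, and the element names (ListBucketResult, Contents, Key) are as I recall them from the S3 list response, so check them against a real reply.

```python
import io
import xml.etree.ElementTree as ET

NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

# Invented sample; a real listing comes back as the body of a GET on the bucket.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>quotes</Name>
  <Prefix>N</Prefix>
  <IsTruncated>false</IsTruncated>
  <Contents><Key>Nelson</Key><Size>5</Size></Contents>
  <Contents><Key>Neo</Key><Size>4</Size></Contents>
</ListBucketResult>"""


def iter_keys(stream):
    """Yield object keys one at a time, discarding each entry once it is read."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == NS + "Contents":
            yield elem.findtext(NS + "Key")
            elem.clear()  # drop the children so the tree does not grow


print(list(iter_keys(io.BytesIO(SAMPLE))))
```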

Service Operations

GET

GET / 
host: s3.amazonaws.com

Get the list of buckets that belong to the authenticated user

Bucket Operations

Every bucket is a host, so you can operate on it by making requests. If a bucket does not exist, nslookup on its hostname still succeeds, but Amazon returns an error. For example, following a link to http://something.new.smartfrog.s3.amazonaws.com/ returns:

HTTP/1.1 404 Not Found
x-amz-request-id: 2D4A2D0DFCB1190C
x-amz-id-2: kFK/F3fWy2UcGCtlndSL1wQvIF2AI0p1oGBvF3frHN5HSMc5aKKk/6WjBoxdOhek
Content-Type: application/xml
Date: Fri, 23 Nov 2007 18:41:14 GMT
Connection: close
Server: AmazonS3

<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>NoSuchBucket</Code>
<Message>The specified bucket does not exist</Message>
<RequestId>11FFFDAC076F89E8</RequestId>
<BucketName>something.new.smartfrog</BucketName>
<HostId>
I3TmzpV4EAlReRF9Ap3AAA5lGv0hnEq5ycNTrhVMcqCBEdzOc832/SGfiRSsCFdU
</HostId>
</Error>

This shows some implementation aspects of how S3 works. DNS is set up so that any hostname under *.s3.amazonaws.com always resolves to their servers; there is no need to add or remove DNS entries as buckets come and go. This means that you cannot rely on hostname lookup to probe for a bucket; you have to check for a 404 response.
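
A probe for a missing bucket therefore has to look at the HTTP status and the error document. The sketch below parses the error body shown above; the function names are made up, and fetching the response is left to whatever HTTP library you use.

```python
import xml.etree.ElementTree as ET

# The error body from the 404 response above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>NoSuchBucket</Code>
<Message>The specified bucket does not exist</Message>
<RequestId>11FFFDAC076F89E8</RequestId>
<BucketName>something.new.smartfrog</BucketName>
<HostId>
I3TmzpV4EAlReRF9Ap3AAA5lGv0hnEq5ycNTrhVMcqCBEdzOc832/SGfiRSsCFdU
</HostId>
</Error>"""


def parse_error(body):
    """Return the children of an S3 <Error> document as a dict of stripped strings."""
    root = ET.fromstring(body)
    return {child.tag: (child.text or "").strip() for child in root}


def bucket_missing(status, body):
    """True only for a 404 whose error code says the bucket does not exist."""
    return status == 404 and parse_error(body).get("Code") == "NoSuchBucket"


print(parse_error(SAMPLE)["Code"], bucket_missing(404, SAMPLE))
```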

PUT

Creates a new bucket at the host identified.

PUT / 
Host: smartfrog.s3.amazonaws.com

This is REST at its best: an HTTP operation that creates a new resource (the bucket), and brings a new hostname to life as it does so!

GET

Get a list of objects matching the parameters passed in the query string.

GET /?prefix=N&marker=Ned&max-keys=40 HTTP/1.1
Host: quotes.s3.amazonaws.com
Date: Wed, 01 Mar 2006 12:00:00 GMT
Authorization: AWS 15B4D3461F177624206A:xQE0diMbLRepdf3YB+FIEXAMPLE=
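
Building the query string for a list request is easy to get wrong by hand, so here is a small sketch using the three parameters from the request above. The function name is hypothetical; as I understand it, marker is the key to start listing after, which is how you page through a long list.

```python
from urllib.parse import urlencode


def list_url(bucket, prefix=None, marker=None, max_keys=None):
    """Build the URL for a bucket listing, leaving out parameters that are not set."""
    params = []
    if prefix is not None:
        params.append(("prefix", prefix))
    if marker is not None:
        params.append(("marker", marker))
    if max_keys is not None:
        params.append(("max-keys", max_keys))
    query = urlencode(params)
    return "http://%s.s3.amazonaws.com/%s" % (bucket, "?" + query if query else "")


print(list_url("quotes", prefix="N", marker="Ned", max_keys=40))
```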

GET ?location

Returns the location of the bucket

GET /?location HTTP/1.1
Host: quotes.s3.amazonaws.com
Date: Tue, 09 Oct 2007 20:26:04 +0000
Authorization: AWS 1ATXQ3HHA59CYF1CVS02:JUtd9kkJFjbKbkP9f6T/tAxozYY=


<LocationConstraint xmlns="http://s3.amazonaws.com/doc/2006-03-01/">EU</LocationConstraint>

This is a bit non-RESTful. They should really have given every server a resource for metadata about the service itself, something like /services. Instead they've tacked on overlaying functionality based on the query string, probably because they added this later and didn't want to break anything by reserving content underneath the server.

DELETE

Deletes the bucket at the host identified.

     DELETE / HTTP/1.1
     Host: quotes.s3.amazonaws.com
     Date: Wed, 01 Mar 2006 12:00:00 GMT
     Authorization: AWS 15B4D3461F177624206A:xQE0diMbLRepdf3YB+FIEXAMPLE=

Results in something like

     HTTP/1.1 204 No Content
     x-amz-id-2: JuKZqmXuiwFeDQxhD7M8KtsKobSzWA1QEjLbTMTagkKdBX2z7Il/jGhDeJ3j6s80
     x-amz-request-id: 32FE2CEB32F5EE25
     Date: Wed, 01 Mar 2006 12:00:00 GMT
     Connection: close

204 is a special HTTP response that means 'successful, nothing interesting to provide'. The Connection: close header is important, as the server will no longer exist once this change propagates. The x-amz-request-id header is a unique request ID; these can be quoted in support calls with Amazon if things are going wrong.

Object operations

PUT

Adds an object to a bucket.

"The response indicates that the object has been successfully stored. Amazon S3 never stores partial
objects: if you receive a successful response, then you can be confident that the entire object was stored.
If the object already exists in the bucket, the new object overwrites the existing object. Amazon S3
orders all of the requests that it receives. It is possible that if you send two requests nearly
simultaneously, we will receive them in a different order than they were sent. The last request received
is the one which is stored in Amazon S3. Note that this means if multiple parties are simultaneously
writing to the same object, they may all get a successful response even though only one of them wins in
the end. This is because Amazon S3 is a distributed system and it may take a few seconds for one part of
the system to realize that another part has received an object update. In this release of Amazon S3, there
is no ability to lock an object for writing – such functionality, if required, should be provided at the
application layer."

Request headers can include content-type and cache metadata; response headers include the ETag, which is the MD5 sum of the object. You can use this to verify that the data was not corrupted or tampered with in transit, and in If-Match and If-None-Match operations.
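
Checking an upload against its ETag is a one-liner once you have the body and the header. This is a sketch with made-up function names; it assumes the ETag arrives as a quoted hex MD5 digest, so verify that against a real response before relying on it.

```python
import hashlib


def etag_matches(body, etag):
    """True if the ETag header value is the MD5 hex digest of body (quotes ignored)."""
    return hashlib.md5(body).hexdigest() == etag.strip('"').lower()


def conditional_headers(etag):
    """Headers for a GET or HEAD that only returns content if the object has changed."""
    return {"If-None-Match": etag}


body = b"hello"
etag = '"' + hashlib.md5(body).hexdigest() + '"'  # stand-in for the header S3 returns
print(etag_matches(body, etag), conditional_headers(etag))
```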

GET

Get the object. The ETag can be used for conditional GETs.

HEAD

Get the object's metadata. The ETag can be used for conditional HEAD operations, for example to poll for changes.

DELETE

Delete an object. The operation is idempotent; it is not an error to delete a nonexistent object.

Metadata

In the REST API, when you PUT a resource you can add name:value HTTP headers with the prefix x-amz-meta-. On a GET, the prefix is stripped and all duplicate entries are merged into a comma-separated list. Because the prefix is stripped, the data can be used to create HTTP headers for third party programs to handle. What you can't do is search for resources by metadata.
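
On the PUT side, turning a dictionary of user metadata into request headers might look like the sketch below. The function name is made up, and counting the 2KB limit as the UTF-8 bytes of the unprefixed names plus values is an assumption about how S3 measures it.

```python
META_PREFIX = "x-amz-meta-"
MAX_META_BYTES = 2 * 1024  # the 2KB limit; how S3 counts the bytes is an assumption here


def metadata_headers(metadata):
    """Turn {name: value} user metadata into prefixed HTTP headers for a PUT."""
    size = sum(len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
               for name, value in metadata.items())
    if size > MAX_META_BYTES:
        raise ValueError("user metadata is %d bytes, limit is %d" % (size, MAX_META_BYTES))
    return {META_PREFIX + name.lower(): str(value) for name, value in metadata.items()}


print(metadata_headers({"Author": "steve", "build": 42}))
```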

Request Security

You need to create a signed checksum of every request; the rules for this are quite complex. The solution is simple: delegate the work to libraries that implement it.
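
For the curious, the outline of the signing scheme behind the "Authorization: AWS id:signature" headers in the examples on this page is roughly as follows. This is a sketch from memory, not a reference implementation: the function names are made up, it assumes the signature is a base64-encoded HMAC-SHA1 over a newline-separated string, and it skips details such as sub-resources and multi-valued headers. Use a library for real work.

```python
import base64
import hashlib
import hmac


def string_to_sign(verb, resource, date, content_md5="", content_type="", amz_headers=None):
    """Assemble the text that gets signed; resource is '/bucket/key' (assumed form)."""
    lines = [verb, content_md5, content_type, date]
    # x-amz- headers are lower-cased and sorted by name (assumed canonical form).
    for name, value in sorted((k.lower(), v) for k, v in (amz_headers or {}).items()):
        lines.append("%s:%s" % (name, value.strip()))
    return "\n".join(lines) + "\n" + resource


def authorization_header(access_key, secret_key, to_sign):
    """Sign with HMAC-SHA1 and format the result as an Authorization header value."""
    digest = hmac.new(secret_key.encode("utf-8"), to_sign.encode("utf-8"), hashlib.sha1).digest()
    return "AWS %s:%s" % (access_key, base64.b64encode(digest).decode("ascii"))


# Placeholder credentials, for illustration only.
text = string_to_sign("GET", "/quotes/", "Wed, 01 Mar 2006 12:00:00 GMT")
print(authorization_header("EXAMPLEKEYID", "example-secret", text))
```

Because the date is part of the signed text, a signature cannot simply be cached and replayed on later requests, which is one more reason to let a library do this.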

Objects and buckets have ACL based security; you can grant rights to individual users, or groups of users.

BitTorrent Support

Every resource that is world readable can have a BitTorrent description. Just add ?torrent at the end of the resource URL.

If there are no peers, the torrent serves up the S3 resource: it is the seed. However, if there are peers serving the content, these may be picked up instead, depending upon networking settings. The result is that popular content may be downloaded faster, with some bandwidth costs saved.

The torrent file is created on demand on the first ?torrent request; the time to create it is O(file size), and can take several minutes for a big file. If you want to serve torrent content, it is best to make the first ?torrent request yourself, so that your customers do not have to wait.

When you delete an object, or remove anonymous access, S3 stops serving the torrent. This does not stop others continuing to serve the deleted file, though the .torrent may be harder to get hold of.

Logging

You can turn logging on for a bucket; the logs are delivered to a different bucket, and you get to pay for the storage.

Libraries

There's a good Ruby Library. For Java, the jetS3t (pronounced Jet-Set) library does the work. For SmartFrog we went with Restlet, which has support for the Amazon Web Services custom authentication protocol.

Analysis

S3 is a datastore for large volume data; its costs may be comparable to trying to run the datacentre yourself. Because you can feed up URLs directly to customers, you can embed content from the S3 store straight into the web browser or other HTTP-enabled tool.

The SOAP API should be viewed as obsolete; the fun stuff is RESTy. Even there, however, you can see the API doing things that are not 'pure' REST. It's pretty close though, and because it uses PUT and DELETE, is one of the key REST architectures.

The SLA is pretty good, and with its security model, you can use it as the back end for the non-database part of any application that combines database data with artifacts stored in a central repository. Traditionally, people use the filesystem for this, but having integrated with asset stores in the past, I can appreciate how hard it is to get all the details right. If you used S3, then right from the outset you'd be working against a long haul repository. This does mean you'd encounter connectivity and reliability issues early, and ramp up bills if you are not careful, but it also means that by the time you go live, you know what the costs will be, and you know your front end will be able to cope with unreliable connections and S3's concurrency rules.

The limitations are in the metadata and the (in)ability of the repository to look like a real filesystem. The metadata is good, but limited in size and usefulness. You'd have to walk every artifact and do HEAD requests if you were looking for a specific piece of metadata. You couldn't put something like an expiry date on an artifact and then search for all artifacts which had already expired. Nor, apparently, can you change metadata on an existing artifact. Because of these limitations, it is clearly just a way to set some custom HTTP headers on requests; all the real metadata would have to be stored in a database.

This is where it gets complex for EC2: S3 is the only way to store data. You don't have a shared filestore; you have the local image's disk (which has to be considered unreliable) and the S3 repository. It's only after you've got a 2XX response from the S3 store that you know that data is successfully written; that the transaction is complete.
