Pattern - Outsourced S3 Filestore
Here Amazon Web Services provide persistent storage in the form of Amazon S3, their REST-based filestore.
Features
- REST API to upload, list, retrieve and delete files.
- AWS authentication can be used to generate transitive requests: the request headers are put into a canonical order and signed with an HMAC-SHA1 keyed by the (secret) access key; the headers and the signature can then be passed to a third party, who can issue the request within a specific time period.
- Every uploaded file can be accessed as a BitTorrent feed.
- Non-standard, non-RESTy copy operation through header abuse on a PUT (the x-amz-copy-source header names the object to copy).
- Pay per GB of data stored per month, per GB uploaded or downloaded from the Internet, and per GET/HEAD operation.
- US and EU datastores.
- 'Buckets' can be made public, in which case anyone can download the data without authenticating.
- Every bucket is in fact a virtual host under s3.amazonaws.com.
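The time-limited, pass-to-a-third-party requests described above can be sketched as follows. This is a minimal illustration of the HMAC-SHA1 query-string scheme that S3 used when this pattern was written (later known as Signature Version 2), not a replacement for an AWS client library. The function names `sign` and `presign_get`, the bucket name and the credentials are all hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign(secret_key, string_to_sign):
    """Return the base64-encoded HMAC-SHA1 of string_to_sign, keyed by the secret key."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def presign_get(bucket, key, access_key_id, secret_key, expires_epoch):
    """Build a GET URL that a third party can use until expires_epoch (Unix seconds)."""
    encoded_key = quote(key)  # keeps "/" so the key path stays readable
    # Canonical string: verb, Content-MD5, Content-Type, expiry time, resource.
    # Content-MD5 and Content-Type are empty for a plain GET.
    string_to_sign = "GET\n\n\n%d\n/%s/%s" % (expires_epoch, bucket, encoded_key)
    signature = sign(secret_key, string_to_sign)
    return "https://%s.s3.amazonaws.com/%s?AWSAccessKeyId=%s&Expires=%d&Signature=%s" % (
        bucket, encoded_key, access_key_id, expires_epoch, quote(signature, safe=""))


if __name__ == "__main__":
    # Hypothetical bucket and credentials, for illustration only.
    print(presign_get("example-bucket", "photos/puppy.jpg",
                      "AKIDEXAMPLE", "not-a-real-secret", 1175139620))
```

The secret key never leaves the signer; only the access key id, the expiry time and the signature travel with the URL, which is what makes it safe to hand to a third party.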
Advantages
- Low storage and operational costs; it is hard to compete with their prices for large internet-visible datastores.
- Geo-location ensures high availability.
- EU datastore can be used to remain compliant with EU data protection legislation.
- RESTy interface is easy to use through third party libraries and tools.
- A public bucket can be used to serve up content direct to third parties; no need for any other hosting. All static content can be served this way.
- For EC2, data transfer is free; S3 acts as the persistent filestore.
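Because every bucket is a virtual host, the address of a publicly readable object follows directly from the bucket name and the object key. The sketch below builds such a URL, plus the `?torrent` variant that requests the BitTorrent feed for the same object. The helper names and the bucket are hypothetical, and the code assumes the bucket name is a valid DNS label.

```python
from urllib.parse import quote


def public_object_url(bucket, key, scheme="http"):
    """URL of an object in a public bucket, using the bucket's virtual host name."""
    return "%s://%s.s3.amazonaws.com/%s" % (scheme, bucket, quote(key))


def torrent_url(bucket, key, scheme="http"):
    """URL that asks S3 for the .torrent description of the same object."""
    return public_object_url(bucket, key, scheme) + "?torrent"


if __name__ == "__main__":
    print(public_object_url("example-bucket", "images/logo 1.png"))
    print(torrent_url("example-bucket", "downloads/release.zip"))
```

A static site can link to these URLs directly; no signing is needed for objects in a public bucket.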
Disadvantages
- As it is not WebDAV compatible, WebDAV clients cannot use it.
- Its non-standard authentication restricts secure access to AWS-enabled client libraries.
- Its non-standard authentication mechanism is brittle against client-side clock problems: if the client's clock is badly wrong (or the client is configured to be in a different timezone from where it really is), a request may fail.
- Dependent on Amazon AWS for providing high availability services; there was one outage in February 2008, related to authentication service overload rather than S3 itself.
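One way to catch the clock problem before it causes failed requests is to compare the local clock with the `Date` header of any HTTP response from the service. The sketch below assumes the caller has already obtained such a header; S3 rejects signed requests whose timestamp is more than 15 minutes away from its own clock. The function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# S3 refuses signed requests whose timestamp differs from server time by more than this.
MAX_SKEW_SECONDS = 15 * 60


def clock_skew_seconds(server_date_header, local_now=None):
    """Return local time minus server time, in seconds (positive if the local clock is ahead)."""
    server_time = parsedate_to_datetime(server_date_header)
    if server_time.tzinfo is None:
        # Treat a header without zone information as UTC.
        server_time = server_time.replace(tzinfo=timezone.utc)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def clock_is_usable(server_date_header, local_now=None):
    """True if the local clock is close enough to server time for signed requests."""
    return abs(clock_skew_seconds(server_date_header, local_now)) <= MAX_SKEW_SECONDS


if __name__ == "__main__":
    local = datetime(2008, 2, 15, 12, 20, 0, tzinfo=timezone.utc)
    print(clock_skew_seconds("Fri, 15 Feb 2008 12:00:00 GMT", local))
    print(clock_is_usable("Fri, 15 Feb 2008 12:00:00 GMT", local))
```

Working in UTC throughout sidesteps the misconfigured-timezone case: a machine whose wall clock looks right but whose timezone is wrong will show a skew of whole hours here.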
SmartFrog support
The sf-ec2 components provide access to the S3 filestore by way of the sf-restlet libraries. S3 buckets can be created on deployment and (optionally) terminated on undeployment. Files may be uploaded to the bucket and deleted from it as part of deployments and workflow deployments. The sf-www health checks may also be used to verify that publicly visible artifacts are indeed publicly visible.