Amazon.com is really the leader right now in so-called “cloud computing”, where there’s some anonymous cluster of servers that you somehow use for various reliable services.
They started with “S3”, meaning “Simple Storage Service”. S3 is a back-end web service that allows you to place, retrieve, and share files via a web interface. As I’m not really a web developer, that interface is beyond me for the moment. However, various programs have cropped up that act as a front-end to S3, my favorite being JungleDisk. Once JungleDisk is fired up, you have what appears to be a network disk with unlimited storage. It is, in fact, effectively unlimited, though it of course comes at a cost. There’s a bandwidth cost whenever you move data to or from S3, and an ongoing storage cost. The storage cost is something like a charge based on your “average daily balance” of storage used, and is currently a mere $0.15/GB/month. So, if you want to store 100 GB, that’s $15/mo. Not bad. Remember, too, that you’re never paying for storage that you don’t need, unlike some other services.
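To make the storage arithmetic concrete, here is a minimal Python sketch. It assumes the bill is simply the mean of your daily usage multiplied by the $0.15/GB/month rate quoted above, and it ignores bandwidth charges entirely. The function and constant names are made up for illustration and are not part of any Amazon API.

```python
# Storage-only estimate. The rate is the one quoted in the post; the
# "mean of daily usage" billing model is an assumption, and bandwidth
# charges are deliberately left out.
S3_RATE_PER_GB_MONTH = 0.15


def s3_storage_cost(daily_gb_used, rate=S3_RATE_PER_GB_MONTH):
    """Return the estimated monthly storage charge in dollars.

    daily_gb_used: one number per day of the month, in GB stored that day.
    """
    if not daily_gb_used:
        return 0.0
    average_gb = sum(daily_gb_used) / len(daily_gb_used)
    return average_gb * rate


if __name__ == "__main__":
    # 100 GB held for a whole 30-day month, the example from the post.
    steady = [100] * 30
    print(f"Steady 100 GB: ${s3_storage_cost(steady):.2f}")

    # 100 GB held for only the second half of the month.
    half_month = [0] * 15 + [100] * 15
    print(f"Half-month 100 GB: ${s3_storage_cost(half_month):.2f}")
```

The second case is the point of the “average daily balance” idea: under this assumed model, storing the data for only half the month roughly halves the charge.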
After they got S3 stabilized, Amazon.com came out with another beta “cloud” service, “EC2”, for “Elastic Compute Cloud”. Basically, they let you run virtual machine images, charged at an hourly rate. This was kind of interesting, and had some usefulness for a lot of people, but the machines were somewhat small and weak.
No more.
Within the last few days, they launched larger machine images with an x86_64 architecture, up to 4 virtual cores, and nearly 16 GB of available RAM. Now we’re talking something interesting for people like me, and our lab. As a test, I created my own custom image, containing our simulation software, and fired it up. For $0.80/hour, I can get the equivalent of 2 of our 2-core Opteron cluster nodes. It’s a little slower than that, but only just. It also probably doesn’t scale as well as our current system, and I know it won’t scale as well as the system that will be arriving on Monday, but something tells me that this will change. I’m not the only one out there who wants to run cluster applications on EC2, as a quick Google search will reveal.
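For a rough feel of what a simulation run costs at that rate, here is a small Python sketch. The $0.80/hour figure is the one quoted above; the rule that a partial hour is billed as a full hour is an assumption, and the function name is hypothetical.

```python
import math

# Hourly rate quoted in the post for the larger instances.
EC2_LARGE_RATE_PER_HOUR = 0.80


def ec2_run_cost(hours, instances=1, rate=EC2_LARGE_RATE_PER_HOUR):
    """Estimate the compute charge in dollars for one run.

    Assumption: each instance is billed for whole hours, so a partial
    hour is rounded up. Bandwidth and storage are not included.
    """
    billed_hours = math.ceil(hours)
    return billed_hours * instances * rate


if __name__ == "__main__":
    # A 2.5-hour job spread over 4 instances is billed as 3 hours each.
    print(f"${ec2_run_cost(2.5, instances=4):.2f}")
```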
Furthermore, this is a relatively cheap and ubiquitous platform that would allow someone to run high-performance applications without the overhead of purchasing a complete cluster. It would be good for, say, starting a business that required a large amount of computing power without having to purchase, store, feed, cool, and house all of that hardware up front. Once things got rolling it would be possible to use revenue to purchase and maintain such dedicated hardware.
The one major downside of EC2, as I see it, is that it doesn’t save any of the data on the machine once the machine is shut down. One has to ship the data off to S3 (no charge) or another machine (bandwidth charges apply) before shutting it down. Nonetheless, as the tools for interacting with S3 improve, I expect that this limitation will disappear as well.
I should note that I’m not affiliated with either Amazon.com or JungleDisk in any way, except as a happy user.
JungleDisk costs money. I use the free s3sync.rb to back up my files. Here’s the reference I used to set it up.
http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-using-s3sync/
Also, Dreamhost offers 500 GB backed up for $6/month. Therefore, if you plan on hosting over 40 GB on your backup server, Dreamhost beats out Amazon S3. Dreamhost also offers shell access and web hosting.
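The 40 GB figure is just the flat fee divided by the per-GB rate. A minimal Python sketch of that comparison follows, using only the prices quoted in this thread and looking at storage alone (no bandwidth). The names are invented for illustration.

```python
# Prices as quoted in this thread; storage only, bandwidth ignored.
S3_RATE_PER_GB_MONTH = 0.15   # dollars per GB per month
FLAT_FEE_PER_MONTH = 6.00     # dollars per month
FLAT_PLAN_CAP_GB = 500        # storage included in the flat plan


def break_even_gb(flat_fee=FLAT_FEE_PER_MONTH, per_gb_rate=S3_RATE_PER_GB_MONTH):
    """Storage size at which the per-GB bill equals the flat fee."""
    return flat_fee / per_gb_rate


def cheaper_option(gb):
    """Return 'per-GB' or 'flat' for a given amount of stored data."""
    if gb > FLAT_PLAN_CAP_GB:
        # The flat plan cannot hold this much, so it is not an option.
        return "per-GB"
    per_gb_cost = gb * S3_RATE_PER_GB_MONTH
    return "per-GB" if per_gb_cost < FLAT_FEE_PER_MONTH else "flat"


if __name__ == "__main__":
    print(f"Break-even: {break_even_gb():.0f} GB")
    for size in (10, 100, 600):
        print(size, "GB ->", cheaper_option(size))
```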
I actually use JungleDisk actively as my main working repository of project and reference files.
Also, Dreamhost makes lofty promises, but if you look into it a little bit you’ll find mountains of complaints that their hardware can’t match what they promise, and that they have tons of outages. I have 250 GB of storage on my a2hosting account, which I may start using for remote backups, and which can be accessed via SSL-encrypted WebDAV and SSH. However, my 5 GB or so of working files must stay on JungleDisk.
There are two reasons:
1) I can’t open files directly off of a normal WebDAV disk; I have to copy them over first.
2) JungleDisk also keeps a local cache, which makes things a lot faster.