June 20, 2007 - by jason
In a previous article, On Grids, the Ambitions of Amazon and Joyent, I made a few premises:
Let’s be clear, EC2 is fine when you’re doing batch, parallel things on data that’s sitting in S3. It is in line with the economics favoring compute being close to the data (Jim Gray’s Distributed Computing Economics), and it is definitely an improvement over other publicly available batch systems. I can see why it would be attractive to those working on science grids; one just has to overcome the issue of getting a large data set into proximity with the compute.
However, the promise of unlimited scalability (at least in the “scale up” direction) for normal web applications has no basis and is not technically possible beyond normal limits with EC2. A “normal” web application is one that has to always be up and persistent.
And I get a bit irritated when I come across sentences like Jinesh’s at RailsConf: “infinity auto-scalable on-demand computing resource” (here)
I know that it has no basis because it lacks at least one critical thing: real application switches.
Yes, I’m now calling load balancers application switches because I think one has to distinguish software on a general purpose server from dedicated, high-end switching hardware. For example, OpenBSD is a great operating system and has OpenBGPD, and while I could slap it onto a couple of one unit servers to function as my routers, I wouldn’t do that above a certain level.
There is a difference in horizontal scalability, measured by how many Rails processes you can hit in the backend (previous joyeur). Without real application switches, the limit is typically <1000 req/second and not that many mongrels, so it’s pretty easy to know that any Rails application on EC2 is not pushing a lot of traffic.
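To put rough numbers on the relationship between requests per second and mongrels, here is a back-of-the-envelope sketch. It assumes each mongrel serves one Rails request at a time and uses Little's law (concurrent requests = arrival rate × time per request). The function names and the 50 ms average response time are hypothetical, chosen only for illustration; they are not measurements from any real deployment.

```python
import math


def mongrels_needed(requests_per_second: int, avg_response_ms: int) -> int:
    """Estimate how many single-request backend processes are needed.

    Little's law: requests in flight = arrival rate * time per request.
    Assumes one request at a time per process (an assumption, see above).
    """
    return math.ceil(requests_per_second * avg_response_ms / 1000)


def max_throughput(mongrels: int, avg_response_ms: int) -> float:
    """Upper bound on requests/second for a given number of processes."""
    return mongrels * 1000 / avg_response_ms


if __name__ == "__main__":
    # Hypothetical figures: 1000 req/second at an assumed 50 ms per request.
    print(mongrels_needed(1000, 50))
    # The same estimate, read in the other direction.
    print(max_throughput(50, 50))
```

Under those assumed numbers the ceiling works out to about 50 busy mongrels, which is why a cap on requests per second at the front also caps how far the backend can usefully fan out.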
While you might think I’m biased because we do have the Accelerator product line, let me make it clear that we have that product line because we’ve also had to scale some of the oldest Rails applications around, and that constantly feeds back into the design.
I’ve also said what I think is valid about EC2, but let’s be clear about the list of deficiencies for multi-tiered applications:
(I’ll leave out ones like having to learn and program against a proprietary API and commands like “ec2-run-instances ami-5da964c3 -k websvr-key”; that’s usually what’s called vendor lock-in.)
In conclusion, EC2 is fine for batch on S3 data and for interacting with the Simple Queue Service (see the webmail.us example), but I wouldn’t put a multi-tiered web application on it.