Published in IT World
Restaurants, roller coasters and randomly failing web applications
You arrive at a downtown restaurant that does not take reservations. Straight ahead, a member of staff is standing behind a podium making notes on a booking sheet. To the left, happy diners who got there before you did munch through their meals. To the right, prospective diners who also got there before you did sit at the bar sipping cocktails, waiting for their table to be called. The member of staff behind the podium orchestrates the whole show. Her job is to keep the restaurant operating at maximum throughput without overloading it. To do this she juggles a variety of information in real time: sizes of parties awaiting tables, expected availability of free tables, time of day (lunch has a very different dynamic to dinner) and so on. You ask her the question she gets asked umpteen times a day. "How long is the wait?" "About 30 minutes", she replies.
You arrive at an uptown theme park to try out the roller coaster. The only way to access the roller coaster is via a ticket counter, railed on either side to create an orderly queue. You join the queue, envious of those ahead of you who got there before you did and gloating at those who are miles behind you. Overhead you can read an LED display. It says "Average wait from this point: 10 minutes."
You arrive at a popular website to book a flight/send an e-mail/book tickets for a concert. At first, everything goes well. You get part of the way through the transaction but then the trouble starts. The web browser seems to just hang. You cannot tell if it is doing something or in trouble. After a while, your patience wears out and you press the back button. Life is too short to second-guess what is going on. You conduct your business on a different website.
The glaring difference between the operating models of the restaurant, the roller coaster and the website is that the first two scale in a very different way from the third. To see why, let us re-cast the restaurant to operate like the website...
You arrive at a downtown restaurant - Chez Web - that does not take reservations. Straight ahead are lots of tables that the front page blurb claims are available right now at keen prices. You appear to be the only person waiting for service. You request a table from the member of staff who appears to be allocated exclusively to you. At first she smiles a lot but soon her face turns blank. She stops interacting with you, failing to respond to questions or even expletives. After a while, the discomfort of not knowing what is going on gets to you and you leave the restaurant.
Some weeks later, you bump into the restaurant owner and explain your experience. "Oh, we have fixed that", she says. "We have added more members of staff to arrange tables for customers. We have more tables and a bigger kitchen too." You try again some months later. Same bad experience. It would appear that all of the new capacity has been taken up with new business in the exploding dining market. Some months later, there is a drop in the volume of business in restaurants generally. You try for a third time, only to find that the restaurant has gone out of business, blaming high operating costs.
The key difference between the original restaurant and the re-cast one is the concept of a queue. More specifically, a customer-visible queue. By putting a queue in place, the restaurant can optimize its throughput. It can ensure that the key revenue-generating parts of its operation (the tables) do not get overloaded. Critically, it does this in an open and transparent fashion whereby prospective customers always know where they stand and do not get frustrated by delays. If the volume of business merits it, the restaurant can scale by adding extra tables/kitchen staff without changing how the front-office operates at all, thus keeping operating costs down.
Contrast this with the standard model for scaling web applications. More and more "threads" on more and more servers are required to maintain the illusion that each customer is the only one active at the moment. As the service scales, the customer experience stays the same. There is no way for a customer to know how busy the web application is, what the probable delay will be and so on. Of course, given enough money, you can keep adding more and more hardware - more running costs - to keep up the illusion. Whether or not this makes sense depends on the type of web application you are hosting.
If your application is pure publishing, for example, a search index, then scaling this way makes perfect sense. Users want instantaneous results and user-visible queues are out of the question. If, on the other hand, your application is transactional, then scaling this way makes less sense. The web has fantastic machinery to help pure publishing applications to scale well. Not so for transactional applications.
Another reason why you might want to look at the problem differently is the emerging paradigm of direct machine-to-machine operation of web sites - Web Services. Be in no doubt, consumers of web services exposed by your application can and will flood you with service requests. Imagine 10,000 hungry people arriving at your restaurant all at once!
"Form a queue now folks, take a ticket and watch the displays..."
Or, in web terminology: "this is a temporally decoupled web application folks, POST your required transaction, take an idempotent URI and poll for latest service information."
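To make the ticket-and-display idea concrete, here is a minimal in-memory sketch of that interaction. It is an illustration under stated assumptions, not a real web server: the class name `TicketQueue`, the `/tickets/<n>` URI scheme, the method names and the fixed `seconds_per_job` service time are all invented for this example. `post` plays the part of taking a ticket, `get` plays the part of reading the display, and `process_next` stands in for the back office working through the queue.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TicketQueue:
    """Hypothetical in-memory stand-in for a temporally decoupled web application."""

    seconds_per_job: float = 2.0  # assumed average service time per transaction
    _pending: deque = field(default_factory=deque)
    _results: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def post(self, transaction: str) -> str:
        """Accept a transaction straight away and hand back a status URI."""
        ticket = next(self._ids)
        self._pending.append((ticket, transaction))
        return f"/tickets/{ticket}"

    def get(self, uri: str) -> dict:
        """Poll a status URI. Repeating the call never changes server state."""
        ticket = int(uri.rsplit("/", 1)[1])
        if ticket in self._results:
            return {"status": "done", "result": self._results[ticket]}
        for position, (pending_ticket, _) in enumerate(self._pending, start=1):
            if pending_ticket == ticket:
                return {
                    "status": "queued",
                    "position": position,
                    "estimated_wait_seconds": position * self.seconds_per_job,
                }
        return {"status": "unknown"}

    def process_next(self) -> None:
        """Back-office worker: complete the oldest pending transaction, if any."""
        if self._pending:
            ticket, transaction = self._pending.popleft()
            self._results[ticket] = f"completed: {transaction}"


# Two customers arrive; each gets a ticket immediately instead of a hung browser.
app = TicketQueue()
first = app.post("book flight")
second = app.post("book concert tickets")

before = app.get(second)   # second customer can see they are behind someone
app.process_next()         # the back office serves the first customer
after_first = app.get(first)
after_second = app.get(second)
```

The front of house (`post` and `get`) does no transactional work itself, so in this sketch the back office could be given more workers without the customer-facing part changing at all, which is the scaling property the restaurant enjoys.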