COMMENTS
No offense, but you've been what, 3 years in the making with HubSpot? How does that put you in a position to recommend what a startup should or should not do? Without your $$, 3 years without a product out of the door would mean DEAD for any startup.
Actually, for the record, HubSpot did not officially start until June 2006 (which is when I got done with grad school).
By January of 2007, we had a beta product available that customers already started paying for. The product launched in November 2007. The company had about 150 paying customers at the time of launch. This is also when we made a major upgrade to the infrastructure and the software as there was sufficient evidence there was a market and the product was solving a problem people would pay to have solved.
Sorry if this sounds defensive (wasn't meant to be). We really didn't spend much money on scalability until revenues started coming in.
I've worked at a few startups and, as a marketing guy, the tradeoff always seems pretty clear: your technical team can work either on scalability or on innovation. And I think fast turnaround of new features and iterating quickly on product development based on customer/market feedback is critical. If I really can have both, I would of course want both. But I would rather iterate quickly on features and have some uptime/scalability issues early on. If you are upfront with customers that you are in beta and the uptime might not be 99.999%, then they usually understand and appreciate that you are listening to them on the feature side and are actually developing stuff they want in days, not months.
Mike and Dharmesh make an interesting point. Speaking as a consumer, if the application was innovative enough, I would work through the initial problems. This leads me to believe that startups should focus less on scalability and more on making a slick app.
Perfect example, I heard a lot about Mint.com. I was excited to sign up but kept getting an error when trying to connect to my Chase bank account. I tried it like 5 times and submitted 3 trouble tickets to no avail. I kept getting a generic auto-response answer. I checked their support forums and still couldn't figure it out. Finally, I just gave up.
My point isn't to bash Mint.com.... If an app is sexy enough, I will work through the initial bugs and keep trying to figure out a way to make it work. I'll contact the company (it's good for them b/c they're now finding bugs AND engaging with their users) until they fix it.
But what do I know? I'm the idiot that still can't connect his bank account to Mint.com
Raza Imam
BoycottSoftwareSweatshops.com
I think the points you are making, Dharmesh, are very sound. I know for sure that we have often debated whether we should get product out of the door and deal with whatever happens, and yes, we have the money now to do so. However, I also think it is wise to be prepared for scalability. I suppose the advice I would offer would be to guide your programmers towards the development of sites and structures that will be scalable without too much hassle. The thing to avoid is having to undo everything once you do start to gain more customers. Whilst gaining customers usually means you have cash to invest, I would never wish to waste it.
One point I didn't make in the article, but probably should have:
When making tradeoffs regarding scalability, you are at some level incurring technology debt. Debt is not always a bad thing -- it can often help you grow. The key is to make sure that the "interest rate" on the debt does not outweigh the benefit of the tradeoff.
So, if making a scalability tradeoff will likely cause you to rewrite the entire system, it's probably not worth it. But, if it's simply a matter of "Pay X now or 1.2X later", it might be better in some cases to just pay 1.2X later.
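The "interest rate" comparison can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names, the dollar figure, and the `p_needed` parameter (the chance that the scalability work is ever needed at all) are assumptions that do not appear in the comment above.

```python
def expected_deferred_cost(cost_now: float, interest: float, p_needed: float) -> float:
    """Expected cost of postponing the work.

    cost_now: price of doing the scalability work today (the "X").
    interest: multiplier paid if the work is done later (e.g. 1.2).
    p_needed: assumed probability that the work is ever needed.
    """
    return cost_now * interest * p_needed


def should_defer(cost_now: float, interest: float, p_needed: float) -> bool:
    """Defer when the expected later cost is below the certain cost today."""
    return expected_deferred_cost(cost_now, interest, p_needed) < cost_now


if __name__ == "__main__":
    x = 10_000.0  # hypothetical cost of doing the work now
    # "Pay X now or 1.2X later", with an assumed 50% chance it is ever needed
    print(should_defer(x, interest=1.2, p_needed=0.5))
    # A near-rewrite: 5X later with the same assumed odds
    print(should_defer(x, interest=5.0, p_needed=0.5))
```

Under these assumed numbers the 1.2X case favors deferring and the 5X case does not, which matches the rule of thumb above: small interest is tolerable, a full rewrite is not.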
I agree with Dharmesh's position here. A few job cycles ago I was heavily involved in a datacenter operation, where we sold everything from single U's to private cages. In that job I got to see how a lot of different people were handling their hosting infrastructure at various stages in the evolution of their business.
It was almost inevitable that the guys who would roll up with a full rack of load-balanced redundant-everything before they had traction would suffer endless headaches. Things wouldn't work right, or they'd forget to patch one of the redundant servers (causing every other page load to not work).
It takes time to get all the equipment working properly and fluidly. As you're writing (and debugging!) your application and trying to hunt down odd gremlins, making sure that the load-balancer, IPS, transparent proxy, or other active device is not the source of the problem takes more time.
Your application is also likely to morph over time. Where you first thought you'd need 5 load-balanced web servers you later find that you really need 3 database servers, or 10 web servers, or more geographic locations. Thus, the chance that you can really design the right model well in advance of your product getting some traction is not very great.
On top of that, once you factor in that these components generally get cheaper and better over time, implementing all this gear too early on is only burning capital needlessly.
Just for the record, I do recommend that you at least have some concept of where the weak spots might be in your architecture as you are going through the early development phase. As you progress, it's good to keep a plan for how you would address these various weak points, should they actually become realized in production. This often doesn't have to be a terribly formal process at first, just something like "if the DB server gets thrashed, we'll have to add RAM, or index differently" or "If we get noticed on [digg|reddit|news.yc] we'll make a static content page".
Things like this rarely "pop" overnight. Your server load will increase at a pace that gives you time to address these issues before they become tragic.
Jeff -
You mention the 3x scaling factor a couple of times. What's the magic formula behind 3x? I would think the scalability number would be a factor of current user base and expected growth. Also, why the concern over a 7-30 day span?
Is 3x just a good number to use as a calculation for growth potential?
Just curious, it sounds like you've developed these guidelines over time, so I'm interested in how you got there...
Great article. Planning for scalability and investing in scalability too soon are different things.
In my experience, a company's technical and business people need to absolutely be in harmony on what is an acceptable level of scalability risk and timing of that risk vs. current business levels.
If the business people tell the technical people "We can never crash" then the technical folks err on the side of technical caution but with increased financial risk based on the costs involved.
Not to discount the important work that folks like Jeff perform, but systems operations is not core to many startups' businesses, nor is it a competitive advantage. It's a cost center that needs to be managed in the most effective way as business dictates.
In my last company, we made major investments in technology and scalability based on our needs at our highest demand levels. We didn't crash but had more capacity than we needed when demand dropped. Looking back, I would have rather put that money to use by hiring another developer, salesperson, etc.
This time around, I don't want to own any equipment, lease any rack space or employ staff that need to maintain that equipment.
Rackspace and firms like them provide a good way to rent rather than own this capacity. Scalability, however, depends on what you agree to rent from them. Expect about $600 a month for one server.
From a business and technical perspective, the ultimate solution seems to be services like Amazon Web Services, both the Elastic Compute Cloud (EC2) and S3 storage.
Amazon is one of the most trafficked sites on the web and systems operations is core to their business. For a startup, you pay for what you use and scalability is a few clicks away. In the short term it's a cost savings -- pricing starts at ten cents an hour -- plus peace of mind and a growth path later.
What do the experts think about Amazon Web Services?
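For a rough sense of scale, the two prices quoted in this comment (about $600 a month for one rented server, and on-demand pricing starting at ten cents an hour) can be put on the same monthly basis. This is only a back-of-the-envelope sketch: it assumes a 30-day month and an instance that runs around the clock, it ignores bandwidth and storage charges, and the two offerings are not necessarily comparable hardware. The function names are made up for the example.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, instance never switched off


def monthly_cost_cents(hourly_rate_cents: int, instances: int = 1,
                       hours: int = HOURS_PER_MONTH) -> int:
    """Monthly bill, in cents, for instances billed by the hour."""
    return hourly_rate_cents * instances * hours


def break_even_instances(dedicated_monthly_cents: int, hourly_rate_cents: int) -> int:
    """How many always-on hourly instances fit within the dedicated server's monthly price."""
    return dedicated_monthly_cents // monthly_cost_cents(hourly_rate_cents)


if __name__ == "__main__":
    one_instance = monthly_cost_cents(10)       # ten cents an hour
    print(f"${one_instance / 100:.2f} per month for one always-on instance")
    print(break_even_instances(600 * 100, 10))  # versus about $600 a month
```

Working in integer cents keeps the arithmetic exact. The more interesting knob is `hours`: capacity that is only switched on during demand peaks costs proportionally less, which is the "pay for what you use" point made above.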
I'm with Jeff on this one. While I don't believe you need to go overboard with it, I do believe you should think about it and make some plans for it before it happens. An ounce of prevention ... Friendster remains the shining example of where this sort of philosophy kills you. Scalability, or rather the lack thereof, killed Friendster's chance at the really big time, and they've been running to play catch-up since. It doesn't take much effort to do some load testing, identify the parts that are going to break, and come up with a plan to deal with them at a later time. Just diving in head first to rush something out the door is how we ended up with children's toys painted in lead.
Ahh, the irony. "The system is currently experience a technical problem. Most of the time, the system will automatically recover and will be up and running within 5-10 minutes. If you continue to see this message, please email us at productsupport@hubspot.com. We apologize for the issue." ...
Anyway, I definitely think you need to think about scalability issues even if you're not going to implement them yet. It may prevent you from inadvertently painting yourself into a corner by introducing impossible-to-scale features. In order to make JIT scalability changes though, you need to know where the bottlenecks are, before they completely choke the system. This means having detailed profiling/metrics logged so that you can keep an eye on trends and spikes. The good thing is that's relatively easy to do.
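As a minimal sketch of what "keeping an eye on trends and spikes" could look like, the snippet below flags any logged sample that is well above the rolling average of the samples before it. The class name, window size, threshold factor, and response-time numbers are all assumptions for illustration, not anything the commenter described.

```python
from collections import deque
from statistics import mean


class SpikeDetector:
    """Flags samples that exceed `factor` times the mean of the previous `window` samples."""

    def __init__(self, window: int = 5, factor: float = 2.0):
        self.samples = deque(maxlen=window)  # oldest samples drop off automatically
        self.factor = factor

    def observe(self, value: float) -> bool:
        """Record one metric sample; return True if it looks like a spike."""
        window_full = len(self.samples) == self.samples.maxlen
        is_spike = window_full and value > self.factor * mean(self.samples)
        self.samples.append(value)
        return is_spike


if __name__ == "__main__":
    detector = SpikeDetector(window=3, factor=2.0)
    # Hypothetical response times in milliseconds, e.g. parsed from a request log
    for ms in [100, 110, 90, 105, 400, 120]:
        if detector.observe(ms):
            print(f"spike: {ms} ms")
```

Nothing is flagged until the window has filled, so the first few samples only establish a baseline. A real setup would likely track several metrics (query time, memory, queue depth) and persist them, but the same comparison against a recent baseline applies.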
When you say 'spend too much' maybe you're not spending efficiently. With Sun Microsystems' Startup Essentials Program, you can get an amazing discount on X64 servers, scalable solutions & tech support without wasting any. It pays to do your research and I did mine.
www.sun.com/startup
You have a point, but I guess it's natural that the first people to try a startup tool are those who are not afraid of failure. They are risk-takers. They like everything new and they are usually prepared for unfortunate events. Then after some time, when the tool proves to be stable, it is adopted by the masses. Those who are cautious just gotta wait. That's how it happened with our project management tool - Wrike http://www.wrike.com/. We tried it in beta, but started to seriously use it only a couple of months ago and we are very happy about it now.
Good article Dharmesh.
I've worked as an Operations Engineer for about four years and I would say that there are times when premature scaling is not the most beneficial thing you could do, but I also find a lot of favour with Jeff's point of view - you need to think about scaling as you build your app.
I believe the big problem is that many companies (as a whole) don't understand why you can't just add a nice pluggable "handle ten times more load" when you need it. Operations Engineers should really have some input into the development process. I've worked in one company in which I was shocked at the lack of provision to scale their systems. I repeatedly warned them in business terms (i.e., you'll lose money and customers) about the problems they would face down the line if these things were not addressed. They were not addressed until the customer experienced the problem, which was too late. They had to spend tens of thousands of pounds in labour costs dealing with the fallout and modifying code. This came at a time when they were losing money and redundancies were on the cards. They were also facing fierce competition who would jump at the chance to steal the unhappy customer.
The funny thing is that some of the issues would not even have been hard to fix if they had been considered earlier on. But when weaknesses are built into the system, it is expensive to change it afterwards.
I agree with Jon Gilkison totally - understand where the weak points of scalability lie in your app, and make provisions to deal with them. If you don't, when the problem hits - you might find yourself ill equipped to handle it.
Having worked on many large-scale systems in my professional life and having startup experience at BlueLithium / MingleNow / Burrp, I would say that keeping in mind that one day you will have to serve {BIG NUMBER HERE} of people is VERY important. However, being able to scale to that number on the first day you launch is not as important.
Like Jeff said, it can be very painful (money-wise and time-wise) to scale a system if you have not planned for it.
Here is a recent post from ZDNet that should shed more light on the subject of this posting. I particularly like one comment from vinnie mirchandani which I think should be taken into account here.
"When I’m at home using Twitter, a great example of cool consumer software, I want to be delighted, thrilled, entertained, and engaged. When I transfer money through my bank, which is certainly a non-sexy enterprise system, I demand the system work every time without fail. There’s a big difference between enterprise and consumer systems, a lesson I suspect Robert Scoble is about to learn."
Don’t weep for underappreciated enterprise software
http://blogs.zdnet.com/BTL/?p=7285&tag=nl.e539
This post resonates with me. In the spirit of the 20% rule someone mentioned here, it's so easy to jeopardize the chances of long-term success (ironically) by looking too far into the future. I'm thinking of the resources invested in the hypothetical scalability as features which are external to the basic value of an application as it currently exists. In my albeit limited experience, this is just lack of focus. It's good to see it pointed out.