One of the most common questions I hear in my work is “will it scale?”.
90% of the time the answer is yes. This is because of the #1 rule of software development:
If someone else has solved a problem, you don’t need to solve it yourself.
Think about Arnold Schwarzenegger. He scaled quite rapidly in his young days. He went from shrimpy kid to muscle man. There were a lot of people who wanted to scale like Arnold, and he made a lot of money selling them workout videos.
This is nothing like software. If Arnold’s body were software, he wouldn’t sell his workouts, he’d sell his muscles. Once Arnold engineered the perfect body (and accent), at scale, he’d copy his code and sell it directly. Everyone would be walking around looking and sounding like Arnold, without any time in the gym.
Unfortunately, Arnold Schwarzenegger is hardware, not software. But fortunately, Google is software, which means you can buy Google scale, even if you can’t buy Arnold’s muscles.
I was at Twitter when it scaled from 10m to 200m users. It sucked for us, but it doesn’t have to suck for you. I’ll talk about what scaling means, the main problem that inhibits scaling, and the things you can buy that let you scale with no work. Then I’ll talk about the scaling issues that you might still need to solve yourself.
What does it mean to scale?
Every application is just 0’s and 1’s, written by an engineer. If you want to run an application you need to get the application’s code from the engineer’s computer to yours.
Back in the day, an engineer would distribute their code on CDs or floppy disks.
Today, with the internet, things are easier. We engineers can put our code onto a “server”. A server is a computer that anyone can access, using the internet. When you visit a website you are copying code from a server to your computer.
Scaling means allowing lots of people to copy your application at once. Every second 100K people ask Twitter servers for parts of its application. Giving all of them code, quickly, is easier today than when we used CDs, but definitely not trivial.
Why is scaling hard: Vertical vs. Horizontal
If more people want your application than you can serve, you have a scaling problem. There are two types of scaling solutions: Vertical and Horizontal.
Vertical Scaling means building or buying bigger, better, and often more vertically tall, servers, which can send code to more people at once.
Horizontal Scaling refers to buying more of your existing, crappy servers. Obviously, you then place them all next to each other in a nice, little horizontal line.
Vertical scaling is relatively easy, but pretty limited. A single computer can only be so powerful.
Horizontal scaling offers the potential of unlimited scale. Serving more users simply means spending more money.
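As a rough sketch of that arithmetic in Python: the per-server capacity and the monthly price below are made-up numbers chosen for illustration, not figures from Twitter or any cloud provider.

```python
import math


def servers_needed(requests_per_second: int, capacity_per_server: int) -> int:
    """Number of identical servers required to handle the given load."""
    return math.ceil(requests_per_second / capacity_per_server)


# Hypothetical numbers: each server handles 2,000 requests per second
# and costs $100 per month to rent.
CAPACITY_PER_SERVER = 2_000
MONTHLY_COST_PER_SERVER = 100

for load in (10_000, 100_000, 1_000_000):
    count = servers_needed(load, CAPACITY_PER_SERVER)
    cost = count * MONTHLY_COST_PER_SERVER
    print(f"{load:>9,} req/s -> {count:>3} servers, ${cost:,}/month")
```

Under these assumptions, ten times the load simply means ten times the servers and ten times the bill, which is the whole appeal.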
Solving a scale problem means finding a way to solve the problem horizontally. However, this is hard if your application changes frequently.
Today’s applications don’t just change when engineers write new code. Every time you use Twitter you’ll see different Tweets. These tweets and other user-created data are just 0’s and 1’s you need to copy, same as the application itself. Each time someone you follow tweets, Twitter needs to get the 0’s and 1’s for these tweets to you.
When Twitter had just a few, powerful servers (vertical scaling), it was easy to make sure everyone’s tweets were available, up to date, on every server.
However, this was not true when we started scaling horizontally. When we had 50,000 servers there was no way we could copy every tweet onto every server. Instead we made software that tracked which server had which tweets, which was very hard when the data was changing 30,000 times per second.
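To make "tracking which server has which tweets" concrete, here is a minimal Python sketch. It is not how Twitter actually did it. It assumes one common technique, hashing the author's user ID to pick a server, and every name in it (`server_for`, `TweetStore`, `NUM_SERVERS`) is hypothetical. Each "server" is just an in-memory dictionary.

```python
import hashlib

NUM_SERVERS = 8  # hypothetical; a real fleet would be far larger


def server_for(user_id: str, num_servers: int = NUM_SERVERS) -> int:
    """Pick the one server that stores this user's tweets.

    A stable hash means every machine computes the same answer
    without having to ask a central directory.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


class TweetStore:
    """Toy stand-in for a fleet of servers, each holding a slice of the tweets."""

    def __init__(self, num_servers: int = NUM_SERVERS) -> None:
        self.num_servers = num_servers
        # One dictionary per pretend server: user_id -> list of tweets.
        self.servers = [{} for _ in range(num_servers)]

    def post(self, user_id: str, text: str) -> None:
        # Write the tweet to exactly one server, not to all of them.
        shard = server_for(user_id, self.num_servers)
        self.servers[shard].setdefault(user_id, []).append(text)

    def timeline(self, following: list[str]) -> list[str]:
        # Visit only the servers that hold tweets from followed users.
        tweets = []
        for user_id in following:
            shard = server_for(user_id, self.num_servers)
            tweets.extend(self.servers[shard].get(user_id, []))
        return tweets


store = TweetStore()
store.post("arnold", "I'll be back")
store.post("someone_else", "Will it scale?")
print(store.timeline(["arnold"]))
```

The sketch skips everything that made the real problem hard: servers failing, servers being added, and popular users whose tweets overload a single machine.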
When Twitter struggled with scale in 2010, money could not solve the problem. We had vertically scaled as much as we could and we didn’t have a way to scale horizontally. There was simply no way we could serve more users until, after 2 years of work, we built a way to scale horizontally.
Thankfully, as mentioned before, the solutions that Twitter, Facebook, Google, etc. used to scale horizontally can now be bought! Here are the easy solutions that let you reach Twitter scale with about an hour of work.
Scaling solutions:
Note: Amazon, Google and Microsoft offer similar solutions for nearly every scaling problem. I’ll call out my favorite products but you can Google search “<product name> alternative” to find the similar offerings from other providers.
Amazon EC2: Scaled Application Code
EC2, which stands for Elastic Compute Cloud, is a service that lets you put your code on as many servers as you want, with the press of a button. You rent the servers from Amazon per minute. About 95% of the web uses this or similar services to serve the application code that their engineers write. However, it won’t serve data, AKA 0’s and 1’s that users create.
Google Firestore: Scaled Data
As someone who worked on the backend for Twitter, this is probably my favorite software tool. Firestore lets you store data that is changing 100K+ times a second, for 500M+ users, and can be used by the most junior engineers. If this had existed in 2010, Twitter could have been created and scaled by a single junior engineer with $100K in funding. This technology is only a few years old, but is already used by companies like Lyft, The New York Times and Duolingo.
TensorFlow: Scaled Machine Learning
Scaled Machine Learning is still relatively new. There is not much prior art to be bought and sold, but we’re getting there! TensorFlow, which is one of the most popular Machine Learning frameworks, can be scaled horizontally, though it will cost you. The computers needed for training Machine Learning models are very expensive. Training the much-hyped GPT-3 model cost about $4.6M. Training GPT-4 will, in theory, cost $8.6BN and use more high-powered servers than currently exist in the world… Still, for most machine learning applications, TensorFlow combined with something like Amazon’s SageMaker lets you achieve scale.
Agora: Scaled real time Audio + Video
Real-time video calls were one of the hardest technological challenges of the last 2 decades. Video calls require a ton of data that is changing super fast. If I stream video I’m taking 30 pictures a second, then sending those pictures to every connected user. Making a single, live video call between 100 users is harder than serving Twitter to millions of people at once. Still, there has been a lot of progress in recent years! Agora is my favorite solution. Using Agora you can easily create video calls with up to 17 people sharing video and about 1M receiving video. If that’s not enough scale for you, have faith: anything Zoom does will be available within a year or two, after their engineers leave to launch a hip new startup.
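To put rough numbers on the "30 pictures a second" point, here is a small Python sketch. It assumes the naive model described above, where every picture from every sender is delivered to every other participant, and it ignores compression and any clever routing a real video service would use. The function name is hypothetical.

```python
FRAMES_PER_SECOND = 30  # the "30 pictures a second" from the text


def frame_deliveries_per_second(senders: int, participants: int,
                                fps: int = FRAMES_PER_SECOND) -> int:
    """Pictures that must be delivered each second in the naive model.

    Assumes every sender is also a participant, so each picture goes
    to everyone on the call except the person who took it.
    """
    return senders * (participants - 1) * fps


# One person presenting to a 100-person call.
print(frame_deliveries_per_second(senders=1, participants=100))

# All 100 people on the call sharing video at once.
print(frame_deliveries_per_second(senders=100, participants=100))
```

Going from one sender to everyone sending multiplies the work by 100, which is why calls where everyone shares video are so much harder than broadcasts.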
Solving scale problems vs. Identifying scale problems
The tools above solve hard problems that every app will have as it grows.
However, every app will also have its own specific product and engineering issues at scale. Once identified, these more specific issues usually have easy solutions. However, identifying the problems is hard: you won’t know what they are until your app is actually used at scale.
Engineers will sometimes “prematurely optimize”. They will try to guess all the problems that will be encountered at scale and fix them ahead of time. However, I recommend this: Use the above tools to solve the hard problems you know you will have, then see what specific issues emerge as you scale. Use your time and energy identifying the scale problems you actually have instead of solving the ones you think you’ll have. That means actually growing your user base to reach scale, which might not be as fun as solving a technical problem, but is much more productive.