The architecture of StackOverflow
One of the most interesting talks of recent weeks, and a rare insight into one of the most active sites on the web: Marco Cecconi of StackOverflow talks about the general server architecture, why they don’t unit-test (!), how they release (5 times a day), and shows some awesome server load screenshots. It’s fascinating that they run one of the most heavily trafficked sites (which also uses long-polling “real-time” messaging!) on just 25 servers, most of them at 10% load all the time. “We could run it on just 5 servers if needed”. Awesome. Nice statements regarding caching and using existing code, too.
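A quick back-of-the-envelope check of that “5 servers” remark, as a rough sketch only: it assumes all 25 servers are identical and all sit at 10% load (the talk only says “most of them”), and that load spreads evenly when machines are removed. The function name `projected_load` is made up for this illustration.

```python
def projected_load(servers_now: int, load_now: float, servers_after: int) -> float:
    """Average load per server if the same total work runs on fewer machines.

    Assumes identical servers and perfectly even load distribution,
    which is a simplification and not something stated in the talk.
    """
    total_work = servers_now * load_now  # work measured in "fully busy servers"
    return total_work / servers_after


# 25 servers at 10% load is about 2.5 servers' worth of work,
# so 5 servers would each sit at roughly half load.
print(f"{projected_load(25, 0.10, 5):.0%}")  # prints 50%
```

Under those assumptions the claim looks plausible: even on 5 machines there would still be headroom for traffic spikes.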
I really like the get-things-done attitude and the simple but productive view on workflow (use multiple monitors, don’t be the nerd sitting in front of a laptop). The code is not perfect (lots of static methods), they don’t even write tests, they have only a handful of developers (!), and still nearly no downtime. Ah yes, and they run one of the most successful sites in the history of the internet.
“Languages are just tools”. “You’ll be successful anyways, or fail anyways [it does not depend on the language].” I really like that guy. And by the way, they mainly use .NET for the site. Make sure you also check out the links below; Update #5 in particular shows the current tech stack used in the company.
And by the way, have you noticed that EXTREMELY huge presentation screen? Awesome! They obviously did this in a cinema or a university’s main lecture hall.
Update #1:
The slides of this talk:
https://speakerdeck.com/sklivvz/the-architecture-of-stackoverflow-developer-conference-2013
Update #2:
Thread on news.ycombinator.com regarding this topic; Marco Cecconi (and other StackOverflow IT guys) have joined the discussion:
https://news.ycombinator.com/item?id=7052835
Update #3:
Excellent article in the StackOverflow tech blog showing how StackExchange was built back in 2008 (lots of technical details):
http://blog.stackoverflow.com/2008/09/what-was-stack-overflow-built-with/
Update #4:
Official 2009 database dump (legally available directly on StackOverflow):
http://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump/
Update #5:
AWESOME! Full up-to-date list of software, technology, methods and servers used for StackExchange:
http://meta.stackoverflow.com/questions/10369/which-tools-and-technologies-are-used-to-build-the-stack-exchange-network
Update #6:
This excellent YouTube comment by Joseph Lust sums it up perfectly:
* Use static methods everywhere instead of OOP
* Write the least code possible
* Keep entire site compilation under 10s
* Cache every single object, it’s faster
* Design to scale Up before Out
* Use 368GB memory for your servers/db’s
* Don’t write tests, have your users find defects
* Don’t reinvent square wheels
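To make the “cache every single object” point a bit more concrete, here is a minimal sketch of a read-through cache with expiry. This is not StackOverflow’s code (their site runs on .NET); the class `TTLCache`, the method `get_or_load`, the key format and the loader function are all hypothetical names chosen for this illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory read-through cache with a time-to-live per entry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive lookup
        value = loader()  # e.g. a database query in a real application
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical usage: the "database" is hit once, the second read is served from memory.
db_hits = 0


def load_question() -> str:
    global db_hits
    db_hits += 1
    return "How do I exit Vim?"


cache = TTLCache(ttl_seconds=60)
cache.get_or_load("question:1", load_question)
cache.get_or_load("question:1", load_question)
print(db_hits)  # prints 1
```

The trade-off is the usual one: reads get cheap, but data can be stale for up to the TTL, and a single-process dictionary like this is neither shared between servers nor thread-safe.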