Full list of distributed systems articles
Real-world distributed architectures
- What Does 100 Million Users On A Google Service Mean?
- How Razorpay handled significant transaction bursts during events like IPL
- YouTube Architecture – How Does It Serve High-Quality Videos With Low Latency
- YouTube Database – How…
Master system design for your interviews or your web startup
The term system design has become pretty popular with software engineers ever since big names like Facebook, Google, and Palantir added an essential system design round to their interview process. I mean, that’s how I came across the term. Now the obvious question that would pop…
Product development: An insight into the process of developing new products
Hello, world. This write-up takes a deep dive into the process of developing a new product from the bare bones. I’ve been writing software for more than ten years, and the reason for this post is to share the little knowledge I’ve acquired about building…
Google Databases: How Do Google Services Store Petabyte-Exabyte Scale Data?
Google.com is the most visited website on the planet, followed by YouTube.com. Both services are owned by Google. Besides these two, Google owns multiple other online services, each with over a billion users, like Gmail, Google Ads, Google Play, Google Maps, Google…
YouTube Database – How Does It Store So Many Videos Without Running Out Of Storage Space?
YouTube is the second most popular website on the planet after Google. As of May 2019, more than 500 hours of video content are uploaded to the platform every single minute. With over 2 billion users, the video-sharing platform is generating billions of views with…
Distributed Systems and Scalability Feed
Facebook photo storage architecture
Facebook built Haystack, an object storage system designed for storing photos at large scale. The platform stores over 260 billion images, which amounts to over 20 petabytes of data. One billion new photos are uploaded each week, which is approximately 60 terabytes of data. At peak, the platform serves over one million images per second.
In the original NAS-based photo storage architecture, Facebook faced throughput and latency issues because looking up photos and their associated metadata on the NAS caused excessive disk operations, up to ten just to retrieve a single image.
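To put the figures above in perspective, here is a small back-of-the-envelope calculation. It is only a sketch: the source numbers are lower bounds ("over 260 billion", "over 20 petabytes"), and treating them as exact decimal values is an assumption made for illustration.

```python
# Figures quoted above, treated as exact decimal values (an assumption).
TOTAL_IMAGES = 260e9       # images stored
TOTAL_BYTES = 20e15        # 20 petabytes
WEEKLY_UPLOADS = 1e9       # new photos per week
WEEKLY_BYTES = 60e12       # 60 terabytes per week
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Average size of a stored image and of a newly uploaded photo, in kilobytes.
avg_stored_kb = TOTAL_BYTES / TOTAL_IMAGES / 1e3
avg_upload_kb = WEEKLY_BYTES / WEEKLY_UPLOADS / 1e3

# Average upload rate if uploads were spread evenly across the week.
uploads_per_second = WEEKLY_UPLOADS / SECONDS_PER_WEEK

print(f"average stored image: ~{avg_stored_kb:.0f} KB")
print(f"average uploaded photo: ~{avg_upload_kb:.0f} KB")
print(f"average upload rate: ~{uploads_per_second:.0f} photos/second")
```

Each object is small, so when a single read costs several disk operations, the overhead of locating the photo can easily outweigh the cost of reading the photo itself.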

Tail latency in distributed systems
Tail latency refers to the small percentage of responses from a system that are the slowest compared with the rest. It is usually expressed as the 98th or 99th percentile response time. This may seem insignificant at first, but for large applications like LinkedIn it has noticeable effects. For a page with a million views per day, the slowest 1% (everything beyond the 99th percentile) means 10,000 of those page views would experience the delay. Read how LinkedIn deals with longtail network latencies.
There can be multiple causes of tail latency: increasing load on the system, complex and distributed systems, application bottlenecks, slow network, slow disk access and more. Read more on it.
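As a rough illustration of how percentile response times are computed, here is a minimal sketch using the nearest-rank method. The latency distribution is synthetic and the function names (`percentile`, `requests_in_tail`) are made up for this example; a real system would read these numbers from its metrics pipeline.

```python
import math
import random


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def requests_in_tail(total_requests, p):
    """How many requests are slower than the p-th percentile."""
    return round(total_requests * (100 - p) / 100)


random.seed(42)
# Synthetic workload (an assumption): most responses take around 50 ms,
# and a small fraction take around 800 ms.
latencies_ms = [random.gauss(50, 5) for _ in range(9900)]
latencies_ms += [random.gauss(800, 100) for _ in range(100)]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
p999 = percentile(latencies_ms, 99.9)

print(f"p50={p50:.1f} ms  p99={p99:.1f} ms  p99.9={p999:.1f} ms")
print(requests_in_tail(1_000_000, 99))  # 10000 page views beyond the 99th percentile
```

The median tells you almost nothing about the slow requests here, which is why tail percentiles are tracked separately from averages.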
RobinHood: Tail latency-aware caching
RobinHood is a research caching system for application servers in large distributed systems with diverse backends. It dynamically partitions the cache space between the different backend services and continuously optimizes the partition sizes.
Microsoft Research has a talk on getting rid of long-tail latencies.
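The idea of dynamically repartitioning a cache between backends can be sketched as follows. This is a toy illustration, not RobinHood's actual implementation: the `rebalance` function, the `blame_counts` input (how often each backend was the slowest one for a slow request), and the 1% `tax_rate` are all assumptions made for the example.

```python
def rebalance(partitions, blame_counts, tax_rate=0.01):
    """Return new cache partition sizes after one rebalancing step.

    partitions:   dict of backend name -> cache size (for example in MB)
    blame_counts: dict of backend name -> how often that backend was the
                  slowest one for a slow request (hypothetical metric)
    tax_rate:     fraction of every partition reclaimed in each step
    """
    total_blame = sum(blame_counts.get(name, 0) for name in partitions)
    if total_blame == 0:
        # No backend is causing slow requests, so leave the sizes alone.
        return dict(partitions)

    # Reclaim a small slice of cache from every backend...
    taxed = {name: size * (1 - tax_rate) for name, size in partitions.items()}
    pool = sum(partitions.values()) - sum(taxed.values())

    # ...and hand it to the backends in proportion to their blame.
    return {
        name: taxed[name] + pool * blame_counts.get(name, 0) / total_blame
        for name in partitions
    }


sizes = {"users": 100.0, "search": 100.0, "ads": 100.0}
blame = {"search": 9, "ads": 1}
new_sizes = rebalance(sizes, blame)
print(new_sizes)
```

Running a step like this in a loop shifts cache space gradually toward whichever backends are currently responsible for slow requests, while the total cache size stays constant.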
Zero to Software Architect Learning Track - Starting from Zero to Designing Web-Scale Distributed Applications Like a Pro. Check it out.
Master system design for your interviews. Check out this blog post written by me.
Recent Posts
- How Actor model/Actors run in clusters facilitating asynchronous communication in distributed systems
- Understanding the Actor model to build non-blocking, high-throughput distributed systems
- Technical Consultant: How can I become one? Explained with an example
- IT consultant: How do I become one? Explained with a real-world use case
- Software architecture course – From zero to mastering the fundamentals