In the previous post, we saw how Facebook uses Haystack for object storage. But with the increase in the amount of blob (Binary large…
Paper Notes: Finding a needle in Haystack – Facebook’s photo storage
Photo storage is a key function of any social media platform, and Facebook is no exception. But the scale at which…
Implementing Viewstamped Replication protocol
Viewstamped Replication (VR) is a replication technique that handles failures when one or more nodes in a cluster crash. It works as…
Paper Notes: F1 – A Distributed SQL Database That Scales
F1 is a distributed relational database built at Google to support its AdWords business. It is built on top of Spanner, which we discussed…
Paper Notes: Spanner – Google’s Globally-Distributed Database
Spanner is a scalable, globally distributed database built at Google. It is the first database to replicate data globally while providing consistent distributed transaction…
Paper Notes: Kora – A Cloud-Native Event Streaming Platform For Kafka
With the growing demand for data, robust solutions for handling large-scale data streaming have become essential for organizations. In the cloud-native world, getting both scalable and…
Paper Notes: Megastore – Providing Scalable, Highly Available Storage for Interactive Services
Megastore is a storage system built at Google that provides the best of both database worlds. It provides the scalability of NoSQL along with strong consistency…
Paper Notes: Spark – Cluster Computing with Working Sets
In one of the previous posts, we looked into how MapReduce is used to perform large-scale computations on large datasets using commodity hardware. MapReduce…
Paper Notes: Distributed Transactions at Scale in Amazon DynamoDB
NoSQL databases come with a lot of benefits, such as high availability, high scalability and cloud-scale performance. But providing transaction support that doesn’t leave the data…
Paper Notes: MapReduce – Simplified Data Processing on Large Clusters
MapReduce is another paradigm shift in distributed computing, similar to the Google File System. It is a programming model for processing large sets of…