[Last Updated on 9/29/2025]
I have tons of bookmarks in my reading list that I mean to get back to, but they get mixed up and shifted around, and I naturally let my ADHD-infused thought process clutter the list and randomize my time. I am going to try to avoid that by dumping my reading list directly into the blog and forcing myself to get through everything here in the order I have laid out from the beginning, rather than constantly jumping between the endless things I eventually want to learn.
Distributed systems is the primary driver of this reading list, but other tidbits and tangents that I want to get back to are captured here as well.
- Raft Paper -> Currently re-reading this; this time I am implementing it by hand as well.
- Aleksey Charapko Fall25 Reading Group -> Reading a paper a week, trying to write about at least a few.
- Effective Rust -> Read bits and pieces here and there day-to-day, especially as I'm now going to start hacking on Rust at my day job.
Once the list above is exhausted, I will revisit this list and add some new topics from:
- Awesome DS
- Paper Trail DS Theory
- DynamoDB
- Murat's Foundational DS List
- Aleksey Charapko Reading Group Archive
- A DS Reading List
- Reynold Xin Database Papers
- Mark Brooker's Blog
- Christopher Meiklejohn DS Readings
- interpreter book and compiler book -> Some weekend reading when I get a chance to hack these together, preferably in Rust.
- Attention Is All You Need -> Trying to make more time for AI-related theory.
- Crafting Interpreters
I also want to use this list to archive some of the various things I have read before and may eventually want to revisit:
- Designing Data-Intensive Applications -> The primer book that I have learned most DS theory from, and what helps set me up to go further on any specific paper or implementation.
- What Every Programmer Should Know About Memory -> Ulrich Drepper's dictionary on the fundamentals of memory that every programmer should know. Super dense but full of tons of super useful things to revisit.
- CAP in Plain English -> The write-up on CAP that solidified it a lot for me, along with the DDIA description, roughly "In the face of P, we can strictly choose C or A."
- Jepsen's DS Class -> Jepsen's dictionary on distributed systems, good to revisit various terminology and the basics.
- Chord -> Pretty cool paper and sets the stage for what Dynamo talks about.
- Dynamo -> Not DynamoDB! This and Chord were some pretty interesting first papers I read and implemented.
- Google File System -> GFS's general design for distributed append-only storage chunking and relaxed consistency is a pattern I feel like I see all over the place elsewhere.
- Spanner -> The idea of planet-scale transactions and the ability to assign accurate timestamps is what makes this paper pretty awesome.
- FLP Paper -> More of a core idea to remember than a paper I constantly revisit: in an asynchronous system, no deterministic protocol can guarantee consensus if even one process can crash. Paper Trail's Summary of FLP is also good.
- Snowflake -> How Snowflake (similarly to Aurora) created a new mode of operation for cloud databases, this time elastically scaling data warehousing.
- Aurora -> The database I actively work on, and an extremely innovative idea that changed how databases operate in the cloud. In short: if storage can replay the log itself, then storage and compute can be separated into independently elastic resources. You know it's genius because everyone has copied it since.
- High Performance Browser Networking
- Cassandra -> How I learned about LSM trees and first started understanding storage system architectures better.
- Bitcask -> Simpler than LSM trees, another log-based append storage engine.
- ZooKeeper -> ZAB is another classic consensus protocol, and an interesting highlight of this read.
- Lakehouse -> More in the DB engine and architecture vein. An interesting read for understanding where data systems are going today: how can we get it all in terms of data warehousing, ACID, and decoupled storage and compute?
- MapReduce -> Probably the first DS or parallel computing paper I ever read.
- Kademlia -> A pretty cool and simple P2P system design.
- Lamport Clocks -> The OG on why distributed systems are not that easy.