Great list. Almost anything Jim Gray wrote, or any of Michael Stonebraker's C-store stuff, is worth reading.
Database Systems: The Complete Book by Garcia-Molina, Ullman, and Widom seems to be one of the few full books with decent coverage of implementing a database, even though anyone wanting to implement a modern database will have to go well beyond what it covers.
There are a ton of great papers out there easily accessible, though, at least in part thanks to the VLDB Endowment. The people at MS Research working on databases publish a lot of great papers, too.
Absolutely agree. Also for those interested in query processing and optimization I highly recommend some of the fundamental stuff by Goetz Graefe. http://dblife.cs.wisc.edu/person/Goetz_Graefe
I wanted a list of papers that would be essential in building database systems. It was a little sad to see many members of the community re-discovering and re-inventing wheels from 30+ years of relational database and systems research. This list of papers should help build a solid foundation.
It's a nice list, but I have the feeling you could read each of these articles and still not be able to write a halfway decent database, because there's nothing on how to actually write data to the disk efficiently on any operating system.
I for one would be very interested in that, although perhaps the decline of spinning disks has reduced the complexity of the issue.
"Halfway decent database" is a very vague notion. What would that even be? Something similar to SQLite, MySQL, Cassandra, VoltDB? Are you talking about write-ahead logging or flushing dirty pages from the buffer (i.e., sequential vs. random writes)?
All of these database systems have extremely different host requirements and write characteristics. Even SQLite and MySQL can be pretty different depending on the storage layer (SQLite can use BerkeleyDB).
The future isn't MySQL and spinning disks; the future is tens of different databases that do things and handle reliability in a variety of ways. I know that complicates the issue even more, but my point would be that writing a halfway decent database is possible (or impossible, depending on your perspective) within a lot of this literature.
( Also, spinning disks even have abstractions since mostly you are using a RAID controller with a decent size of cache to alleviate random writes. )
I completely agree with you about the future of databases; it will definitely involve running many different databases.
What I meant by a halfway decent database is one that can guarantee the data it committed is actually on the hard drive in a readable state, and that can sustain random writes at a rate close to the theoretical limit of the persistent storage device. I'm not judging any databases in particular. I'm asking how I can make a database that's as awesome as MySQL or Cassandra.
All these databases have already taught us a lot of what's in these articles, but one thing that isn't written about a lot is how they actually achieve high performance, not just in a theoretical way, but in a practical way. What system calls do they make, what schemes do they use to minimize seeks, how is the memory laid out. None of that high-level "do we use Paxos or Raft", that's kids' stuff.
You should check out Database Systems: The Complete Book.
Those papers have a lot of that information, but it's usually a bit buried.
> What system calls do they make
I think that's actually kind of out of scope for a paper, because already you are assuming an OS and a language. Many papers/articles assume neither.
> what schemes do they use to minimize seeks
Look for information on buffer management/buffer managers
> how is the memory laid out
Almost always in pages that correspond to disk blocks. Also it's dependent on lots of things (transactional model, index types, column vs row orientation, etc..) This is heavily tied into the buffer management stuff too.
Firmware on the SSD is actively in the data path, e.g. wear-levelling to ensure that many writes to a single file does not cause a particular block to wear out much earlier than other blocks, reducing overall capacity of the drive.
"This paper provides a summary of 35 years of data model proposals, grouped into 9 different eras. We discuss the proposals of each era, and show that there are only a few basic data modeling ideas, and most have been around a long time."
These lists (of lists ..) are reinventing early Yahoo/DMOZ/webrings. If lists were available in a parseable format with accurate metadata (title, author, date, publisher), one could monitor github RSS and generate a local master list for analysis.
Assuming you're right that this is a trend, I don't think it should be too hard to heuristically pick out likely examples. Assuming most of them are good quality, there should not be so many that it's impossible to pick through the results and do the final formatting manually. Waiting for a standard format to come along could take a while (seen the semantic web yet?).
The jekyll-scholar format already exists and is more "semantic" than Markdown. After people read these database papers, perhaps one will understand the value of indexing structured data and convert/fork the list repo to structure the source format. This is easy at the birth of a collaboratively edited project and impossible later.
At least MS/Google will thank humans for enriching their proprietary knowledge graphs / semantic webs. It may take a few more years for people to realize why they need "personal algos" as much as "personal computing devices". Humans can then feed delectable metadata to their private algos, in addition to feeding large public spiders.
I really loved reading "CAP Twelve Years Later: How the "Rules" Have Changed".
This sentence nailed what I thought was wrong with some early decisions in NoSQL systems: "because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned."
If you duplicate your data in locations a few hundred milliseconds apart (e.g. America, Europe and Asia [1]) you either need to tolerate inconsistency or have every write take a few hundred milliseconds.
If your system requires response times faster than that, the data centres you don't have time to contact behave sort of like they are partitioned all the time.
Me, I tolerate writes that take a bit longer to keep things simple :)
"Me, I tolerate writes that take a bit longer to keep things simple :)"
It's not just to keep things simple, it's to keep things consistent :-) Some NoSQL systems (e.g., MongoDB) introduced a new P in the equation (Performance) and used this to justify a lot of questionable decisions that affected consistency (and durability!). They have come a long way since these early decisions though.
If a system considers moderate latency a partition, it would need to perform partition recovery on every write. That's why I'm not sure treating latency as a partition is in line with the CAP theorem.
Maybe it is more related to availability though (not sure).
Caveat here is that you're duplicating data in a multi-master scenario. If you have a single write master and multiple read-only instances then consistency is easy to maintain (and suffering when you write from a distance) - obviously you're giving up strict read consistency, but that's enormously easier to model and understand.
edit: I guess this is written from a very web-tech point of view, where the assumption is often made that reads will dominate writes. Obviously that's not always the case, and the tradeoffs get a lot more complicated!
By separating the behavior under failure (Consistency or Availability) and the behavior without failures (Consistency or Latency), PACELC clearly communicates that these are different tradeoffs.
The Paxos paper is in the "Basics" category as if it were meant as a joke. Even the "Paxos Made Simple" paper is not easily understood by graduate students, as many studies have shown (see the Raft paper for one such study).
Nope, this is an illicit file-sharing site. O'Reilly publishes Date's book. uk-corp's whois record has a privacy stand-in. These could be legit copies or packaged with malware.
If you'd like to learn about databases (at least RDBMSs anyway), there's a paid course at Harvard's Extension School. It's offered online for graduate credit at $2k a pop.
I've read it. Actually, there are no video lectures on this topic at all (i.e., database implementation in general, not necessarily the paid course you mentioned).