Document VS Relational Databases
Complete Developer Podcast - En podcast af BJ Burns and Will Gant - Torsdage
Document databases are becoming increasingly common. Not only do they offer a lot of advantages over relational databases in certain contexts, but they also greatly reduce the friction of working with data when using object-oriented languages. Not only do these advantages often make it easier to work with your data, but they can make it easier to rapidly iterate on your design as you develop your software. However, much like agile principles, all this flexibility can get you into trouble if you aren’t disciplined. Simply being able to do more things doesn’t make all those things prudent, scalable, or sustainable. The structure of your data matters. This is still true when you are using NoSql data models. However, while many developers know how to build decent relational databases, many of us are at a loss the first time we encounter a database with a non-relational structure. Document databases are becoming increasingly common and important parts of modern software architecture. While there are a few spots where they are tricky to use well, they can often supplement or even replace older relational models. If nothing else, the simplicity of working with them (and the avoidance of the object-relational impedence mismatch) is often worth the trouble. Episode Breakdown Major concerns for relational databases. No duplication of data. Flat structures connected by joins. Focus is on entities and their relationships. Queries and indexes tend to be built after the data model is worked out. How document databases do it. Data structure is highly based around how it will be retrieved and structured based on how it will be read. While relationships between documents are supported to varying degrees,they aren’t treated with the importance that they are in a relational model. The data that is generally read together is generally stored together. Duplicate data is less of a concern than it would be in a relational model. One to one relationships In a relational database, you might model such relationships in a single table, or in two tables with a relationship between them. In general, in a document database, you would simply put it all in a single document, with nullable fields as required. This is a simple case, but gets rid of a join. Updating either record requires an update of the document. One to (not too) many relationships. In a relational database, this is going to be expressed as a parent table, with a child table that has a foreign key back to the parent. In the relational model, you’d retrieve the parent and children in one shot by either returning two recordsets, or by using a join to flatten the relationship. In a document database, if the children are not relevant without the parents, you’d simply have an array in the document being stored. Retrieval of this dataset would not require a join and would not require parsing, but updating a child record would require you to update the parent document. One to (a great) many relationships. In a relational database, this would again be expressed as two tables joined with a foreign key (and hopefully with some appropriate indices). In a document database, this typically would not all be stored in the manner previously outlined, because any retrieval of the parent brings back all the child records. Instead, you would typically store the child records as their own documents, with a reference back to the parent. That way you could grab sane recordsets without getting humongous amounts of data. This can be interesting as more child records pile up, but you could also use a hybrid approach. Many to Many relationships In a relational database model, this would be expressed by having two entity tables, with a join table between them.