Freelancer, Java Architect and Java Developer

Highly Scalable, Ultra-Fast and Lots of Choices

Document Store

Your application has a rich domain model and does not collect huge masses of data.

When data is stored in a relational database, a normalized entity-relationship schema is the typical way to go. A normalized data structure has many merits, e.g. it helps to guarantee the consistency of your business data. The more complex the domain model of an application becomes, the more tables are needed to store the data.

Given a rich and complex domain model, you need to efficiently search for and modify your domain entities. But mapping a complex domain model to a relational structure often leads to slow performance.

When a complex domain model is mapped to a normalized data structure in a relational database, you typically get a large number of tables with foreign key constraints between them. Querying such a data structure requires many join operations. With concurrent read and write access, too many joins quickly degrade the overall retrieval performance. Still, you want to keep your rich domain model instead of relying on a simple domain model as provided by a Key/Value Store.

The ability to query data by searching for occurrences of business values is important to your application, but following relations among entities that are stored in separate tables of a relational database is an expensive operation.

An expressive, preferably object-oriented domain model improves the understandability of the business domain, but mapping such a model to a fine-grained data structure may cause performance degradation. The result is often a large number of round-trips to the data store and therefore inefficient load operations.

* * *

Therefore:

Choose a Document Store, which manages hierarchically structured data records. Modifying and retrieving such records are cheap operations.

A Document Store keeps all information related to a single entity in one document. The complete data of a document can be stored and retrieved atomically. Whereas in a relational database, you need to create and fill multiple tables to model one-to-many relationships, you can easily store such data in single documents.
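As a rough illustration of this idea, the sketch below keeps an order together with its line items in one nested record. The order structure, the field names and the in-memory `store` dictionary are assumptions made for this example; the dictionary merely stands in for a real Document Store.

```python
import json

# Hypothetical order aggregate: the one-to-many relationship between an
# order and its line items is nested inside the document itself.
order = {
    "_id": "order-1001",
    "customer": {"name": "Alice Example", "city": "Berlin"},
    "items": [
        {"sku": "A-17", "quantity": 2, "price": 9.50},
        {"sku": "B-42", "quantity": 1, "price": 20.00},
    ],
}

# A plain dict stands in for the document store (an assumption of this
# sketch): one key maps to one serialized document.
store = {}
store[order["_id"]] = json.dumps(order)

# A single lookup returns the complete entity, including all line items.
loaded = json.loads(store["order-1001"])
total = sum(item["quantity"] * item["price"] for item in loaded["items"])
```

In a normalized relational schema, the same data would typically be spread over at least an order table and a line-item table, and loading the order would require a join or a second query.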

Typically, data records in Document Stores are saved in a JSON-like data structure. JSON, which is short for JavaScript Object Notation, can hold any kind of hierarchically structured data. Developers who have developed Rich Internet Applications are typically quite familiar with this data format.

Mapping an object graph to JSON and back is easy and fast, whether it is done manually or by a library. On the downside, a JSON-based structure cannot directly model many-to-many relationships; such relationships have to be expressed as references (e.g. by ID) that the application resolves itself. In addition, some details such as non-standardized data types (e.g. the format of dates) may cause problems when creating a rich data model based on JSON.
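A minimal sketch of such a manual mapping, using only the Python standard library, might look as follows. The `Order` and `LineItem` classes and the two helper functions are hypothetical names introduced for this example. Because JSON has no date type, the sketch stores the date as ISO 8601 text; that is a convention chosen here, not something JSON prescribes.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date
from typing import List


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    order_id: str
    placed_on: date
    items: List[LineItem]


def order_to_json(order: Order) -> str:
    # asdict converts nested dataclasses to plain dicts and lists.
    data = asdict(order)
    # JSON has no date type, so the date is written as ISO 8601 text.
    data["placed_on"] = order.placed_on.isoformat()
    return json.dumps(data)


def order_from_json(text: str) -> Order:
    data = json.loads(text)
    return Order(
        order_id=data["order_id"],
        # The reader must know that this string is meant to be a date.
        placed_on=date.fromisoformat(data["placed_on"]),
        items=[LineItem(**item) for item in data["items"]],
    )


original = Order("order-1001", date(2024, 3, 15), [LineItem("A-17", 2)])
restored = order_from_json(order_to_json(original))
```

Both sides of the mapping have to agree on the date convention, which is exactly the kind of detail that JSON itself leaves open.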

Document Stores work best if the entities of your domain model are mostly Aggregates, as advocated by Domain-Driven Design. An aggregate is a complex data structure that can only be referenced from the outside through a single point of reference, its root entity.

The ability to express queries for entities based on their internal data structure is an essential feature of all Document Stores. Still, not all such products provide a dynamic query language to execute search requests. Some products mandate that you develop queries outside your application code in a different programming language (mostly JavaScript) as Map-Reduce functions.
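The two query styles can be sketched in memory as follows. This is not the API of any particular product: the document list, `find_by_city`, `map_fn`, `reduce_fn` and `map_reduce` are all hypothetical, and the Map-Reduce part is written in Python here although real products mostly expect JavaScript functions.

```python
from collections import defaultdict

# Hypothetical order documents with nested customer and item data.
documents = [
    {"_id": "o1", "customer": {"city": "Berlin"},
     "items": [{"sku": "A-17", "quantity": 2}]},
    {"_id": "o2", "customer": {"city": "Hamburg"},
     "items": [{"sku": "B-42", "quantity": 1}]},
    {"_id": "o3", "customer": {"city": "Berlin"},
     "items": [{"sku": "A-17", "quantity": 1}, {"sku": "C-03", "quantity": 5}]},
]


def find_by_city(docs, city):
    # Dynamic-query style: filter on a value nested inside the document.
    return [doc for doc in docs if doc["customer"]["city"] == city]


def map_fn(doc):
    # Map step: emit one (key, value) pair per line item.
    for item in doc["items"]:
        yield item["sku"], item["quantity"]


def reduce_fn(key, values):
    # Reduce step: combine all values emitted for the same key.
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    # Group the emitted values by key, then reduce each group.
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


berlin_ids = [doc["_id"] for doc in find_by_city(documents, "Berlin")]
quantity_per_sku = map_reduce(documents, map_fn, reduce_fn)
```

The first style corresponds to an ad-hoc search request issued from application code; the second corresponds to a predefined view that has to be developed and maintained separately.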

As another downside, few products provide the capability to span queries over several data records, i.e. joining them. Also, you typically cannot easily prefetch data that is stored in multiple data records in a single call.
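The consequence can be sketched with a hypothetical in-memory store that counts its lookups. The class `InMemoryStore`, the document layout and the function `load_order_with_references` are assumptions made for this example; each `get` call stands for one round-trip to a real Document Store.

```python
class InMemoryStore:
    """Stand-in for a document store that only supports lookup by key."""

    def __init__(self, documents):
        self._documents = dict(documents)
        self.round_trips = 0

    def get(self, key):
        # Every lookup counts as one round-trip to the store.
        self.round_trips += 1
        return self._documents[key]


store = InMemoryStore({
    "order-1": {"customer_id": "cust-7", "product_ids": ["prod-1", "prod-2"]},
    "cust-7": {"name": "Alice Example"},
    "prod-1": {"title": "Keyboard"},
    "prod-2": {"title": "Mouse"},
})


def load_order_with_references(store, order_id):
    # The "join" happens in application code: one lookup for the order,
    # then one further lookup per referenced document.
    order = store.get(order_id)
    customer = store.get(order["customer_id"])
    products = [store.get(pid) for pid in order["product_ids"]]
    return {"order": order, "customer": customer, "products": products}


result = load_order_with_references(store, "order-1")
```

Loading one order with its customer and two products costs four lookups in this sketch, which is why data that is usually read together is better kept inside a single document.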

Document Stores can be seen as Key/Value Stores with extra capabilities to structure and query the data. In practice, the boundary between the two is fluid: some products that started as basic Key/Value Stores have gradually grown to become more like Document Stores.

Examples of Document Stores are MongoDB and CouchDB.