I. What is a NoSQL database?
- NoSQL (not only SQL) is a non-relational database design that provides flexible schemas for storing and retrieving data.
- Gaining attention and popularity because of cloud computing and big data.
- Chosen for scalability and performance.
II. NoSQL vs SQL databases
Non-relational database | Relational database |
---|---|
Non-relational design | Relational design (use tables to store data) |
Schema-less means more flexible | Table designs based on pre-defined schemas |
Can store structured, semi-structured, and unstructured data | Designed primarily for structured data |
Do not use SQL to query data | Use SQL for querying data |
Designed to run on low-cost commodity hardware | Maintaining an RDBMS is expensive |
Most NoSQL databases are not fully ACID compliant | Support ACID compliance, which ensures reliable transactions and crash recovery |
Newer technology, so documentation may be thinner and bugs more likely | Mature, well-documented |
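
The schema row of the table can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as the relational side and plain dictionaries as a stand-in for a schema-less store; the `users` table and the sample records are made up for illustration and do not come from any particular NoSQL product.

```python
import sqlite3

# Relational side: the table schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Alice"))

# Writing a column that the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO users (id, name, country) VALUES (?, ?, ?)",
        (2, "Bob", "Netherlands"),
    )
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# Schema-less side (plain dicts standing in for a NoSQL store):
# each record may carry its own set of fields.
records = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob", "country": "Netherlands"},
]
field_sets = [sorted(record) for record in records]
```

In the relational case the extra `country` field would require a schema change first; in the schema-less case it is simply stored alongside the other fields.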
III. Four types of NoSQL databases
Based on the model being used for storing data, there are 4 common types of NoSQL databases:
- Key-value store
- Document based
- Column based
- Graph based
1. Key-value store
- Data in a key-value database is stored as pairs of keys and values.
- A key represents an attribute of the data and is a unique identifier.
- Both keys and values can be anything from integers and strings to complex JSON. => Used for storing user session data, user preferences, and real-time recommendations.
Disadvantages:
- Not efficient for querying data by value.
- Not suited to storing relationships between data values.
- Not suited to data that needs multiple unique keys.
Example of key-value pairs, written as JSON:
{
  "name": "Alice",
  "age": 18,
  "country": {
    "name": "Netherlands",
    "region": "Europe"
  }
}
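
As a rough illustration of why key lookups are cheap and value lookups are not, here is a minimal in-memory sketch. The `KeyValueStore` class, the `session:...` keys, and the second record are hypothetical and only serve the example; real stores such as Redis expose their own APIs.

```python
from typing import Any, Callable, Dict, List, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # A key is unique: writing the same key again overwrites the value.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookup by key is a single hash-table access.
        return self._data.get(key)

    def keys_with_value(self, predicate: Callable[[Any], bool]) -> List[str]:
        # Lookup by value has no index to use, so every pair is scanned.
        return [k for k, v in self._data.items() if predicate(v)]


store = KeyValueStore()
store.put("session:42", {"name": "Alice", "age": 18,
                         "country": {"name": "Netherlands", "region": "Europe"}})
store.put("session:43", {"name": "Bob", "age": 30,
                         "country": {"name": "Japan", "region": "Asia"}})

alice = store.get("session:42")
europeans = store.keys_with_value(lambda v: v["country"]["region"] == "Europe")
```

`get` goes straight to the entry for its key, while `keys_with_value` has to visit every stored pair, which is the cost behind the first disadvantage listed above.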
Providers:
- Redis
- Memcached
- AWS DynamoDB
2. Document based database
- Each object, with all its information, is recorded in a single document.
- Multiple documents form a collection.
- Document-based databases enable flexible indexing and powerful ad hoc queries with analytics over collections. => Great for eCommerce platforms, medical storage, analytics platforms.
Disadvantages:
- Not well suited to complex search queries.
- Not well suited to multi-operation transactions.
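
The ideas of documents, collections, ad hoc queries, and indexing can be sketched with plain Python data structures. The `products` collection and the helper names `find` and `build_index` are assumptions made for this example, not the API of any document database.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]

# A collection is a list of documents; documents need not share fields.
products: List[Document] = [
    {"_id": 1, "name": "Laptop", "price": 1200, "tags": ["computer", "work"]},
    {"_id": 2, "name": "Mouse", "price": 25, "tags": ["computer"]},
    {"_id": 3, "name": "Desk", "price": 300, "dimensions": {"w": 140, "d": 70}},
]


def find(collection: List[Document],
         predicate: Callable[[Document], bool]) -> List[Document]:
    # Ad hoc query: any predicate over the document's fields.
    return [doc for doc in collection if predicate(doc)]


def build_index(collection: List[Document], field: str) -> Dict[Any, List[int]]:
    # Secondary index: maps each value of `field` to the ids of matching
    # documents. Documents that lack the field are simply skipped.
    index: Dict[Any, List[int]] = defaultdict(list)
    for doc in collection:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        for value in values:
            index[value].append(doc["_id"])
    return dict(index)


cheap = [doc["name"] for doc in find(products, lambda d: d["price"] < 500)]
by_tag = build_index(products, "tags")
```

Note that the third document has a `dimensions` field and no `tags`; neither the query nor the index needs the documents to have the same shape.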
Providers
3. Column-based Databases
- Data is grouped by columns, not by rows.
- A logical grouping of columns is referred to as a column family.
- All cells belonging to a column are stored as a contiguous disk entry, which makes search and access faster and more efficient. => Great for systems with heavy write loads and for storing time series data, weather data, and IoT data.
Disadvantages:
- Not well suited to complex queries.
- Not well suited to query patterns that change frequently.
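
The difference between grouping by rows and grouping by columns can be shown with a small in-memory sketch. The sensor readings and the `to_columns` helper are invented for illustration; a real column store also handles persistence, compression, and column families, which are not modeled here.

```python
from typing import Dict, List, Tuple

# Row-oriented layout: one tuple per reading (timestamp, sensor, temperature).
rows: List[Tuple[int, str, float]] = [
    (1, "s1", 20.5),
    (2, "s1", 21.0),
    (3, "s2", 19.0),
    (4, "s2", 18.5),
]


def to_columns(data: List[Tuple[int, str, float]]) -> Dict[str, list]:
    # Column-oriented layout: all values of one column sit together.
    timestamps, sensors, temperatures = zip(*data)
    return {
        "timestamp": list(timestamps),
        "sensor": list(sensors),
        "temperature": list(temperatures),
    }


columns = to_columns(rows)

# Aggregating one column reads a single contiguous list ...
avg_temp = sum(columns["temperature"]) / len(columns["temperature"])

# ... whereas the row layout walks every full row to reach the same values.
avg_temp_rows = sum(row[2] for row in rows) / len(rows)
```

Both computations give the same average; the column layout just touches less unrelated data to get there, which is the access-pattern advantage described above.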
Providers
4. Graph based Databases
- Use graph models (nodes and edges) to represent and store data.
- Useful for visualizing, analyzing, and finding connections between data. => Excellent for highly connected data, that is, data with many relationships. Typical use cases are social networks, recommendations, and fraud detection.
Disadvantages:
- Not well suited to high-volume transaction processing.
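
Finding connections between data, as in the social network use case, can be sketched with an adjacency list and a breadth-first search. The `follows` graph and the `shortest_path` function are hypothetical examples; graph databases provide their own query languages for this kind of traversal.

```python
from collections import deque
from typing import Dict, List, Optional

# Adjacency list: each node maps to the nodes it is connected to.
follows: Dict[str, List[str]] = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}


def shortest_path(graph: Dict[str, List[str]],
                  start: str, goal: str) -> Optional[List[str]]:
    # Breadth-first search: explores connections level by level, so the
    # first path that reaches `goal` has the fewest hops.
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no connection between start and goal


path = shortest_path(follows, "alice", "erin")
```

Because the edges here are directed, `shortest_path(follows, "erin", "alice")` finds no connection and returns `None`.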
Providers