I. What is a NoSQL database?

  • NoSQL (not only SQL) is a non-relational database design that provides flexible schemas for storing and retrieving data.
  • It has gained attention and popularity with the rise of cloud computing and big data.
  • It is typically chosen for scalability and performance.

II. NoSQL vs SQL databases

| Non-relational database (NoSQL) | Relational database (SQL) |
|---|---|
| Non-relational design | Relational design (uses tables to store data) |
| Schema-less, which makes it more flexible | Table designs are based on pre-defined schemas |
| Can store structured, semi-structured, and unstructured data | Can only store structured data |
| Generally do not use SQL to query data | Use SQL to query data |
| Specifically designed to run on low-cost hardware | Maintaining an RDBMS is expensive |
| Most NoSQL databases are not ACID compliant | Support ACID compliance, which ensures reliable transactions and crash recovery |
| Newer technology, so it is often less well documented and may not be bug-free | Mature and well documented |

III. Four types of NoSQL databases

Based on the model being used for storing data, there are 4 common types of NoSQL databases:

  • Key-value store
  • Document based
  • Column based
  • Graph based

1. Key-value store

  • Data in a key-value database is stored as pairs of keys and values.
  • A key represents an attribute of the data and is a unique identifier.
  • Both keys and values can be anything from integers and strings to complex JSON documents. => Used for storing user session data, user preferences, and real-time recommendations.

Disadvantages:

  • Querying data based on values is inefficient.
  • Storing relationships between data values is difficult.
  • Data that needs multiple unique keys is hard to model.

Example of key-value data, written as JSON:

{
    "name": "Alice",
    "age": 18,
    "country": {
        "name": "Netherlands",
        "region": "Europe"
    }
}
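
The trade-off between key lookups and value queries can be sketched in Python, using a plain in-memory dict as a stand-in for a real key-value store. The `KeyValueStore` class, its method names, the `user:N` key format, and the second sample user are all hypothetical, chosen for illustration; this is not the API of Redis, Memcached, or DynamoDB.

```python
from typing import Any, Callable, Dict, List, Optional


class KeyValueStore:
    """Toy in-memory key-value store (illustrative only, not a real product's API)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key overwrites the old value, so keys stay unique.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookup by key is a single hash-table access.
        return self._data.get(key)

    def find_keys_by_value(self, predicate: Callable[[Any], bool]) -> List[str]:
        # There is no index on values, so every entry has to be scanned.
        return [key for key, value in self._data.items() if predicate(value)]


store = KeyValueStore()
store.put("user:1", {"name": "Alice", "age": 18,
                     "country": {"name": "Netherlands", "region": "Europe"}})
# Hypothetical second record, added only to make the scan meaningful.
store.put("user:2", {"name": "Bob", "age": 25,
                     "country": {"name": "Japan", "region": "Asia"}})

alice = store.get("user:1")
europeans = store.find_keys_by_value(lambda v: v["country"]["region"] == "Europe")
```

Here `store.get("user:1")` goes straight to the record, while finding every user in Europe has to inspect each stored value in turn, which is the first disadvantage listed above.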

Providers:

  • Redis
  • Memcached
  • AWS DynamoDB

2. Document-based Databases

  • A single object and all of its information are recorded in a single document.
  • Multiple documents form a collection.
  • Document-based databases enable flexible indexing and powerful ad hoc queries and analytics over collections. => Great for eCommerce platforms, medical storage, and analytics platforms.
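
As a rough illustration, the sketch below models a collection as a Python list of dict documents, runs ad hoc queries over it, and builds a simple index on one field. The `find` and `build_index` helpers and the sample product documents are hypothetical, made up for this example; real document databases expose their own query languages and index types.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]

# A collection is a group of documents; documents need not share the same fields.
products: List[Document] = [
    {"_id": 1, "name": "Laptop", "price": 1200, "tags": ["electronics"]},
    {"_id": 2, "name": "Mug", "price": 8},
    {"_id": 3, "name": "Phone", "price": 650, "tags": ["electronics", "mobile"]},
]


def find(collection: List[Document],
         predicate: Callable[[Document], bool]) -> List[Document]:
    """Ad hoc query: return every document for which the predicate is true."""
    return [doc for doc in collection if predicate(doc)]


def build_index(collection: List[Document], field: str) -> Dict[Any, List[int]]:
    """Map each value of `field` to the ids of the documents that contain it."""
    index: Dict[Any, List[int]] = {}
    for doc in collection:
        if field in doc:  # documents without the field are simply skipped
            index.setdefault(doc[field], []).append(doc["_id"])
    return index


electronics = find(products, lambda d: "electronics" in d.get("tags", []))
cheap = find(products, lambda d: d["price"] < 100)
by_name = build_index(products, "name")
```

Note that the "Mug" document has no `tags` field at all, which is the kind of schema flexibility a document model allows.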

Disadvantages:

  • Not well suited to complex search queries.
  • Not well suited to multi-operation transactions.

Providers

3. Column-based Databases

  • Data is grouped by columns, not by rows.
  • A logical grouping of columns is referred to as a column family.
  • All cells belonging to a column are stored as a contiguous disk entry, which makes search and access faster and more efficient. => Great for systems with heavy write loads and for storing time series data, weather data, and IoT data.
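
To make the row-versus-column distinction concrete, here is a small Python sketch that stores the same sensor readings both ways. It is only an in-memory analogy: Python lists stand in for contiguous disk storage, and the field names, values, and helper functions are invented for the example.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"timestamp": 1, "sensor": "a", "temperature": 20.0},
    {"timestamp": 2, "sensor": "b", "temperature": 21.0},
    {"timestamp": 3, "sensor": "a", "temperature": 22.0},
]

# Column-oriented layout: all values of one column are kept together.
columns = {
    "timestamp": [1, 2, 3],
    "sensor": ["a", "b", "a"],
    "temperature": [20.0, 21.0, 22.0],
}


def to_columns(records):
    """Convert a list of row dicts into a dict of column lists."""
    result = {}
    for record in records:
        for field, value in record.items():
            result.setdefault(field, []).append(value)
    return result


def append_reading(cols, timestamp, sensor, temperature):
    """A new reading adds one value to the end of each column."""
    cols["timestamp"].append(timestamp)
    cols["sensor"].append(sensor)
    cols["temperature"].append(temperature)


# Averaging one column reads only that column's values,
# not the timestamp or sensor fields of every record.
average_temperature = sum(columns["temperature"]) / len(columns["temperature"])
```

With the row layout, the same average would require visiting every record and skipping over the fields that are not needed.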

Disadvantages:

  • Not well suited to complex queries.
  • Not well suited to workloads whose query patterns change frequently.

Providers

4. Graph-based Databases

  • Use graph models (nodes connected by edges) to represent and store data.
  • Useful for visualizing, analyzing, and finding connections between data. => Excellent for highly connected data, that is, data with many relationships. Typical use cases are social networks, recommendations, and fraud detection.
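
A minimal sketch of the idea, assuming a tiny hypothetical social network stored as an adjacency list in Python. The names and the `within_hops` and `friends_of_friends` helpers are invented for illustration; a real graph database would answer such questions with its own query language.

```python
from collections import deque

# Nodes are people; each edge is a friendship (stored in both directions).
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each node reachable from `start`
    in at most `max_hops` edges to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # the start node is not its own connection
    return distance


def friends_of_friends(graph, person):
    """People exactly two hops away: candidates for a friend recommendation."""
    return {p for p, d in within_hops(graph, person, 2).items() if d == 2}


suggestions = friends_of_friends(friends, "alice")
```

Following relationships hop by hop like this is the kind of traversal behind the social network and recommendation use cases mentioned above.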

Disadvantages:

  • Not well suited to processing high-volume transactions.

Providers