data model
NoSQL
- schema flexibility: schema on read (the structure is implicit and only interpreted by the application when the data is read)
- locality: a document is usually stored as one contiguous string, which gives a performance advantage when the application needs to access the entire document or a large portion of it. When updating, prefer writes that do not change the encoded size of the document; otherwise the entire document usually has to be rewritten.
- data model is close to the data structures the application already uses
- storage is less of a concern than compute, because the cost of storage keeps falling (loosely, Moore's law), so some duplicated data is acceptable
- awkward for highly interconnected data
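A minimal sketch of schema on read, assuming two hypothetical user documents that were written with different shapes. Nothing enforces a schema at write time; the reading code decides how to interpret each document. The field names and the `first_name` helper are made up for illustration.

```python
import json

# Two hypothetical user documents written at different times with different shapes.
raw_docs = [
    '{"id": 1, "name": "Ada Lovelace"}',
    '{"id": 2, "first_name": "Alan", "last_name": "Turing"}',
]


def first_name(doc: dict) -> str:
    """Interpret a document at read time, handling both the old and new shape."""
    if "first_name" in doc:
        return doc["first_name"]
    # Older shape: derive the first name from the full name.
    return doc["name"].split(" ")[0]


names = [first_name(json.loads(raw)) for raw in raw_docs]
print(names)  # ['Ada', 'Alan']
```

The cost of this flexibility is that every reader has to know about every shape that was ever written.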
SQL
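- schema on write: the schema is explicit and enforced when data is written. Schema changes (ALTER TABLE, or an UPDATE that backfills every row) can be slow on large tables and may require downtime, depending on the database; MySQL, for example, has historically copied the whole table on ALTER TABLE.
- better support for joins
- ACID transactions (atomicity, consistency, isolation, durability)
- handles many-to-one and many-to-many relationships well (which of these the data actually has depends on the requirements of the application logic)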
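A small sketch of schema on write and a many-to-one join, using Python's built-in `sqlite3` with an in-memory database. The `users` and `regions` tables are hypothetical; the point is only that a write violating the schema is rejected, and that a join resolves the reference at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        region_id INTEGER NOT NULL REFERENCES regions(id)
    );
""")
conn.execute("INSERT INTO regions VALUES (1, 'Greater Seattle Area')")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ada", 1), (2, "Alan", 1)],  # many users -> one region
)

# Schema on write: a row missing a required column is rejected at insert time.
try:
    conn.execute("INSERT INTO users (id, name) VALUES (3, 'Grace')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The join resolves each user's region at query time.
rows = conn.execute(
    """
    SELECT users.name, regions.name
    FROM users JOIN regions ON users.region_id = regions.id
    ORDER BY users.id
    """
).fetchall()
print(rejected, rows)
```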
some common tricks:
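- normalization: remove redundant data from the database by storing each fact once and referencing it by ID; when that data needs to be updated, only one place changes, and every reference picks up the new value at query time. Denormalization is the reverse trade-off: duplicate data to make reads faster, at the cost of having to update every copy.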
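To make the trade-off concrete, here is a sketch with plain Python dicts standing in for tables. The data and the rename are invented for illustration: the normalized layout needs one write to rename a region, while the denormalized layout has to rewrite every copy.

```python
# Normalized: users reference a region by ID; the name is stored once.
regions = {1: "Greater Seattle Area"}
users = [
    {"name": "Ada", "region_id": 1},
    {"name": "Alan", "region_id": 1},
]

# Denormalized: every user carries its own copy of the region name.
users_denorm = [
    {"name": "Ada", "region": "Greater Seattle Area"},
    {"name": "Alan", "region": "Greater Seattle Area"},
]

# Rename the region in the normalized layout: a single write.
regions[1] = "Seattle Metro"
resolved = [regions[u["region_id"]] for u in users]  # looked up at query time

# Rename in the denormalized layout: every copy must be found and rewritten.
writes = 0
for u in users_denorm:
    if u["region"] == "Greater Seattle Area":
        u["region"] = "Seattle Metro"
        writes += 1

print(resolved, writes)  # ['Seattle Metro', 'Seattle Metro'] 2
```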
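Why are SQL databases hard to scale?
- JOIN operations get expensive as tables grow, and more so once the joined tables live on different machines
- very hard to scale horizontally (across multiple machines)
- unbounded queries: a query with no limit may scan or return an arbitrarily large number of rows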
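A rough sketch of why horizontal scaling makes some queries hard, assuming simple hash partitioning over a hypothetical fixed number of shards (in-memory dicts here, separate machines in practice). A lookup by key touches one shard, but anything that spans keys, such as a count or a join, has to fan out to all of them. `NUM_SHARDS`, `shard_for`, `get_user` and `count_users` are illustrative names, not any real database's API.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key: str) -> int:
    """Map a key to a shard using a stable hash (not Python's randomized hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Each dict stands in for one machine.
shards = [dict() for _ in range(NUM_SHARDS)]
for user_id in ["u1", "u2", "u3", "u4", "u5", "u6"]:
    shards[shard_for(user_id)][user_id] = {"id": user_id}


def get_user(user_id: str):
    # Key lookup: touches exactly one shard.
    return shards[shard_for(user_id)].get(user_id)


def count_users() -> int:
    # Cross-key query: has to visit every shard and combine the results.
    return sum(len(shard) for shard in shards)


print(get_user("u3"), count_users())
```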
TODO:
fault tolerance
concurrency handling