ACID, CAP, BASE in Database Model

We often come across terms like ACID, CAP and BASE. They are easier to understand if we compare their purposes and demystify each letter in the acronyms.

Let's start with ACID

In ACID, a transaction is characterised by four properties. Relational (SQL) databases typically follow ACID:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Atomicity

  • A transaction should be atomic, i.e. if multiple data insertion operations are part of a transaction, then either all of them succeed or none of them do.
  • "Atomicity" is a commonly used, multi-purpose term in the programming world. In multi-threading terminology, atomicity signifies that a critical section can be executed by a thread without the fear of interleaving by another thread, whereas in database ACID terminology it implies a unit of work (all or none) done by a transaction. So it's important to read the term based on the context of the topic.
  • Another point to note: the "A" in CAP and the "A" in BASE mean something different. Please read below to understand the CAP and BASE concepts.
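
The all-or-none behaviour can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `users` table and the `insert_all_or_nothing` helper are made up for this example, and other databases expose transactions through their own APIs.

```python
import sqlite3

def insert_all_or_nothing(conn, rows):
    """Insert every row in one transaction; undo all of them if any insert fails."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()

ok = insert_all_or_nothing(conn, [(1, "alice"), (2, "bob")])
# The second batch reuses id 1, so the whole batch (including carol) is rolled back.
failed = insert_all_or_nothing(conn, [(3, "carol"), (1, "duplicate id")])
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(ok, failed, count)
```

Even though the row for carol was inserted successfully on its own, it does not survive, because the transaction it belonged to failed as a whole.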

Consistency

  • The consistency rule says that a transaction should always take the database from one consistent state to another.
  • So, what exactly does that mean? Let's simplify this with some examples:
    • "Referential integrity" constraints must be maintained between master and child tables.
    • If money is transferred from Account A to Account B, the total amount across both accounts after the transfer should equal the total before it.
    • Column value constraints must hold.
  • A point to note: the "C" in CAP is different. Please read below to understand CAP.
  • What is the relation of consistency with atomicity? If even one constraint gets violated, the entire transaction, along with all its modifications, should be rolled back.
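
The account-transfer example and the link between consistency and atomicity can be sketched as below. It is a minimal sketch using sqlite3; the `accounts` table, the `balance >= 0` rule and the `transfer` helper are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # column value constraint
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def total(conn):
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

def transfer(conn, src, dst, amount):
    """Move money between accounts; roll everything back if a constraint fails."""
    try:
        with conn:
            # Credit first, so a failed debit has an earlier change to undo.
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

total_before = total(conn)
good = transfer(conn, "A", "B", 30)   # allowed: A ends at 70, B at 80
bad = transfer(conn, "A", "B", 500)   # would make A negative, so it is rolled back
total_after = total(conn)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(good, bad, total_before, total_after, balances)
```

The failed transfer leaves no trace: the credit to B is undone together with the rejected debit, so the total stays the same.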

Isolation

  • The isolation rule says that concurrent transactions should run independently, without interfering with one another. Changes made by one transaction should not be visible to another transaction until they are committed.
  • Databases offer different isolation levels. The purpose of an isolation level is to define the concurrency-control strategy. Each isolation level trades off consistency against performance: the stricter the level, the more consistent the results and the lower the performance. It's an interesting and detailed topic, which can be covered in a separate article.
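
To see "not visible until committed" in action, here is a small sketch with two sqlite3 connections to the same file. It relies on SQLite's default behaviour as an assumption; the `orders` table and the `count_rows` helper are made up for the example, and other databases can be configured with weaker or stricter isolation levels.

```python
import os
import sqlite3
import tempfile

def count_rows(conn):
    # fetchall() finishes the query so the reader holds no lock afterwards
    return conn.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)  # a second, independent connection
    writer.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    writer.commit()

    # The INSERT implicitly opens a transaction on the writer connection.
    writer.execute("INSERT INTO orders (item) VALUES ('book')")
    seen_before_commit = count_rows(reader)  # the uncommitted row is hidden
    writer.commit()
    seen_after_commit = count_rows(reader)   # now the row is visible

    writer.close()
    reader.close()

print(seen_before_commit, seen_after_commit)
```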

Durability

  • Durability means that a committed transaction survives a power failure or crash. It is ensured by recording all completed transactions in non-volatile storage.
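
One common way to get this guarantee is to append each commit to a log file and force it to disk before acknowledging it. The sketch below is a toy version of that idea, not how any particular database implements it; the `commit` and `recover` functions and the JSON record format are assumptions for the illustration.

```python
import json
import os
import tempfile

def commit(log_path, record):
    """Append a record and force it to disk before reporting the commit as done."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()             # push Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the OS to flush its cache to the storage device

def recover(log_path):
    """Rebuild the state after a restart by replaying every committed record."""
    state = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                state[record["key"]] = record["value"]
    return state

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "commit.log")
    commit(log_path, {"key": "A", "value": 70})
    commit(log_path, {"key": "B", "value": 80})
    # Simulated restart: nothing is kept in memory, only the log file survives.
    recovered = recover(log_path)

print(recovered)
```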

Now let's jump to CAP Theorem

The CAP theorem describes the guarantees a database can provide in a distributed system with replication. It's important to keep the phrase "distributed system with replication" in mind while discussing CAP.

Consistency

  • Let's first compare this "C" against the "C" in ACID. The "C" in CAP specifically talks about consistency among replicated nodes: every node should show the same copy of the data to any transaction that reads it.

Availability

  • Availability means the distributed system should simply be available. Every node must be able to respond in a reasonable amount of time: either it returns success, or it reports that the operation cannot be completed.
  • If any node in the system is unresponsive, the system fails the "Availability" rule.

Partition-Tolerant

  • Network failure is inevitable. In other words, a cluster of nodes may get split into two or more partitions by a network failure. Partition tolerance means that the system can continue operating in such a case.
  • One point to note: during a network partition, nodes on the same side of the partition can still communicate with each other.

Types of Database based on CAP

Theoretically, a distributed system can achieve only two of the above three properties. Why? As network failure is unavoidable, partition tolerance cannot be given up, so during a partition it is not practically possible to have a database that is both available and consistent.

CP database

  • A CP database delivers consistency and partition tolerance at the expense of availability.
  • When a network partition occurs in a cluster, the system has 2 choices:
    1. Stop read/write operations on all the nodes ("strict consistency"; for example, banking systems need the highest consistency). All nodes become unavailable.
    2. One side of the partition goes unavailable/offline while the nodes on the other side (the master nodes) keep serving reads and writes ("sloppy consistency") until the partition is resolved. Only the nodes on one side of the partition become unavailable.
  • Example: MongoDB
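
The second choice is often implemented with a majority rule: only the side that can still reach more than half of the nodes keeps working. Here is a toy sketch of that rule. The `CPCluster` class and the node names are hypothetical, and this is not how MongoDB or any specific product is implemented.

```python
class CPCluster:
    """Toy cluster that serves requests only from the majority side of a partition."""

    def __init__(self, node_ids):
        self.node_ids = set(node_ids)
        self.data = {node: {} for node in node_ids}

    def _has_majority(self, reachable):
        return len(set(reachable) & self.node_ids) * 2 > len(self.node_ids)

    def write(self, reachable, key, value):
        """Apply the write to the reachable nodes, or refuse it on the minority side."""
        if not self._has_majority(reachable):
            return False  # give up availability rather than let the copies diverge
        for node in set(reachable) & self.node_ids:
            self.data[node][key] = value
        return True

cluster = CPCluster(["n1", "n2", "n3", "n4", "n5"])
# A network failure splits the cluster into {n1, n2, n3} and {n4, n5}.
majority_ok = cluster.write({"n1", "n2", "n3"}, "x", 1)
minority_ok = cluster.write({"n4", "n5"}, "x", 2)
print(majority_ok, minority_ok, cluster.data["n4"])
```

The minority side never accepts the conflicting value, so there is nothing to reconcile once the partition heals; the price is that clients connected to n4 and n5 get refused in the meantime.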

AP database

  • An AP database delivers availability and partition tolerance at the expense of consistency.
  • When a network partition occurs, all nodes remain available, but nodes on one side of the partition might return stale data. When the partition is resolved, AP databases typically re-sync data to the outdated nodes to repair the inconsistencies in the system.
  • Example: Cassandra
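
The stale read and the later repair can be sketched as follows. The example assumes a simple "last write wins" rule based on timestamps as the way to reconcile the two sides; that rule, the `merge_last_write_wins` helper and the data layout are assumptions for the illustration, and real AP databases differ in the details.

```python
def merge_last_write_wins(replica_a, replica_b):
    """For each key, keep the (timestamp, value) pair with the newer timestamp."""
    merged = dict(replica_a)
    for key, (timestamp, value) in replica_b.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged

# Both sides of the partition start with the same copy: key -> (timestamp, value).
side_1 = {"cart": (1, ["book"])}
side_2 = {"cart": (1, ["book"])}

# During the partition a write reaches only side 1.
side_1["cart"] = (2, ["book", "pen"])
stale_read = side_2["cart"][1]  # side 2 still answers, but with old data

# The partition is resolved: both sides converge on the merged state.
merged = merge_last_write_wins(side_1, side_2)
side_1, side_2 = dict(merged), dict(merged)
repaired_read = side_2["cart"][1]
print(stale_read, repaired_read)
```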

CA database

  • A CA database delivers consistency and availability across all nodes. This cannot be achieved if there is a partition between any two nodes in the system, so in practice this type of distributed database cannot exist.

What is BASE

With the rise of NoSQL databases, a new database model was designed to define transaction properties:

Basically Available

  • The system is "basically" available: it responds to requests, but without a consistency guarantee.

Soft state

  • The state of the database can change even without application interaction, because of eventual consistency. Read the next point to understand more.

Eventually consistent

  • Consistency is not guaranteed at the transaction level. The system becomes consistent some time after the application interaction: the data is replicated to the different nodes and eventually reaches a consistent state. So there is a time period (T) after the application interaction during which the data is still converging, which is why the state is called "soft".
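
The time period (T) can be made visible with a toy model of asynchronous replication. The `Replica` class, the `write` function and the pending queue are hypothetical names invented for this sketch; real systems replicate over the network and in the background.

```python
from collections import deque

class Replica:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # updates received but not yet applied

    def apply_pending(self):
        """Background replication step: no application request is involved."""
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

primary = {}
replicas = [Replica(), Replica()]

def write(key, value):
    """Acknowledge the write immediately; the replicas catch up later."""
    primary[key] = value
    for replica in replicas:
        replica.pending.append((key, value))

write("stock", 9)
read_during_window = replicas[0].data.get("stock")  # still missing on the replica

for replica in replicas:
    replica.apply_pending()  # the "soft" state changes on its own

read_after_window = replicas[0].data.get("stock")
converged = all(replica.data == primary for replica in replicas)
print(read_during_window, read_after_window, converged)
```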

What is the relation between CAP, ACID & BASE

  • The CAP theorem talks about consistency across nodes in a distributed system, whereas ACID and BASE define transaction properties in a database.
  • BASE is a more loosely consistent version of ACID.