Introduction to Cassandra
Introduction to Cassandra
Cassandra is a NoSQL column-based database that is highly scalable and big data ready. It is a distributed database that is highly fault-tolerant with no single point of failure.
Cassandra was developed by Facebook in 2008. To solve the inbox search problem Facebook developed it. It became the Apache open-source project in 2014.
Features of Cassandra
- Data is stored as tables and columns.
- It is open-source software.
- It is highly fault-tolerant.
- It does not support master-slave architecture.
- It is a column-based database.
- It has a limited SQL interface.
- Every table has a primary key.
- It provides a very fast read and writes.
Example of Cassandra query – ‘select tranid, custno from transaction where amount = 100000 Order By tranid’. Observe the syntax of the query it is quite similar to that of SQL in the RDBMS system.
Cassandra should be used when there is a lot of information and it needs to be stored quickly in the database and when we are expecting a large increase in data size, a highly fault-tolerant cluster is needed, and when high performance is needed for the reading and writing operations.
Advantages of Cassandra
- Cassandra does not have a single point of failure and it is highly fault-tolerant that means if any nodes fail without completing its job then other nodes will take over it and complete the task.
- Systems can be added and removed without any downtime.
- Fast writes allow real-time big data processing.
- Every node in the cluster is identical this means there is no master-slave concept in Cassandra.
Limitation of Cassandra
Cassandra is not a general-purpose database because of the following reasons:
- It does not provide data aggregation commands such as group by, sum(), max(), or min() like in relational databases.
- Joins of tables are not possible in Cassandra. Therefore, data has to be denormalized before storing it in Cassandra.
- It does not support additional search clauses or conditions.
- Sorting by non-key fields is not allowed in Cassandra.