
Summary of Bigtable

2021/11/28

A Bigtable is a sparse, distributed, persistent, multi-dimensional sorted map.

Features

Data Model

(Row, column, time) -> value.

Columns are divided into different families for access control and locality refinement (see below).

The timestamp distinguishes different versions of the same value.
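The data model above can be sketched as a small in-memory map. This is only a toy illustration: the class name `TinyTable` and its methods are hypothetical, and nothing here is distributed or persistent. It shows how one (row, column) cell holds several timestamped versions and how rows are scanned in sorted key order.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of the (row, column, timestamp) -> value map (hypothetical)."""

    def __init__(self):
        # row key -> column key ("family:qualifier") -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (latest if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, end_row):
        """Yield row keys in sorted order within [start_row, end_row)."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row
```

A real table keeps rows physically sorted, so a range scan does not need the `sorted()` call used here; the sketch only mirrors the observable behaviour.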

Building Blocks

Implementation

Three major components: a client library, one master server, and many tablet servers.

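A write operation goes through commit log -> memtable -> SSTable: the mutation is appended to the commit log, applied to the in-memory memtable, and later flushed to disk as an SSTable.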
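A minimal sketch of that write path, assuming a single tablet and in-memory stand-ins for the log and the files. The class `TinyTablet` and the `memtable_limit` parameter are hypothetical, and clearing the log on flush is a simplification of how a real system retires old log entries.

```python
class TinyTablet:
    """Toy write path: commit log -> memtable -> SSTable (hypothetical names)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only list standing in for the on-disk log
        self.memtable = {}     # recent writes, held in memory
        self.sstables = []     # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. record the mutation for recovery
        self.memtable[key] = value            # 2. apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. freeze the memtable into an SSTable

    def flush(self):
        if not self.memtable:
            return
        # A tuple of sorted (key, value) pairs models an immutable sorted file.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed entries need no replay

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

The read method shows why the number of SSTables matters: every extra file is one more place a lookup may have to check.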

SSTables are immutable, so stale and deleted data accumulate until compaction rewrites them. There are three types of compaction: minor, merging, and major.
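Compaction can be sketched as merging several sorted runs into one, with newer entries overriding older ones. The function `compact`, the `major` flag, and the `TOMBSTONE` marker below are hypothetical names for illustration; the sketch assumes each SSTable is a tuple of (key, value) pairs and that the input list is ordered oldest first.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


def compact(sstables, major=False):
    """Merge sorted runs (oldest first) into a single run; newer entries win."""
    merged = {}
    for table in sstables:            # oldest -> newest
        for key, value in table:
            merged[key] = value       # later tables overwrite earlier ones
    if major:
        # Only when every run is rewritten is it safe to drop deletion markers,
        # because no older run remains that could still hold the deleted value.
        merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return tuple(sorted(merged.items(), key=lambda kv: kv[0]))
```

Without the `major` flag the deletion markers are kept, since an older run outside the merge might still contain the deleted key.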

Refinements

Performance

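Random writes perform better than random reads: a write is appended to the commit log and applied to the memtable, while a random read may have to fetch an SSTable block from disk.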

Aggregate throughput grows as the cluster scales, but per-machine throughput degrades, especially for reads and writes that are not served from a RAM cache.

Lessons

Features related to Cassandra