# Bitalosdb
# Problem & Solution
Problem: A standard LSM-Tree suffers from read and write amplification, and the resource cost of that amplification grows with data size. How can a store serve more reads and writes at lower resource cost on large datasets?
Solution: Bitalostree addresses read amplification, Bithash implements KV separation to address write amplification, and hot and cold data separation further reduces memory and disk consumption.
# Key Technology
Bithash (KV separation technology) significantly reduces write amplification. Retrieval in Bithash has O(1) time complexity. Values and the index are decoupled, so GC can be completed independently.
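To make the idea of KV separation concrete, here is a minimal in-memory sketch. It is not Bithash's actual implementation or on-disk format: `kvStore`, `valueLoc`, and the byte-slice "log" are hypothetical names chosen for illustration. It only shows the general pattern, in which the index holds a small location record while values are appended to a separate log.

```go
package main

import (
	"errors"
	"fmt"
)

// valueLoc records where a value lives inside the value log (hypothetical type).
type valueLoc struct {
	offset int
	length int
}

// kvStore is a toy KV-separated store: the index holds only locations,
// and values are appended to a separate log.
type kvStore struct {
	index map[string]valueLoc // stand-in for the key index
	vlog  []byte              // stand-in for an append-only value file
}

var errNotFound = errors.New("key not found")

func newKVStore() *kvStore {
	return &kvStore{index: make(map[string]valueLoc)}
}

// Put appends the value to the log and stores only its location in the index.
func (s *kvStore) Put(key string, value []byte) {
	loc := valueLoc{offset: len(s.vlog), length: len(value)}
	s.vlog = append(s.vlog, value...)
	s.index[key] = loc
}

// Get performs one index lookup followed by one read from the log.
func (s *kvStore) Get(key string) ([]byte, error) {
	loc, ok := s.index[key]
	if !ok {
		return nil, errNotFound
	}
	return s.vlog[loc.offset : loc.offset+loc.length], nil
}

func main() {
	s := newKVStore()
	s.Put("user:1", []byte("alice"))
	s.Put("user:1", []byte("bob")) // the old value stays in the log as garbage
	v, _ := s.Get("user:1")
	fmt.Println(string(v), len(s.vlog))
}
```

Because the index entry stays small whatever the value size, rewriting the index moves far fewer bytes than rewriting keys and values together. The overwritten value left behind in the log is what a later GC pass has to reclaim.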
Bitalostree (high-performance compressed index technology) basically eliminates read amplification. When a B+ Tree has many large pages, write amplification becomes a severe problem. With a novel index compression technique, Bitalostree eliminates B+ Tree write amplification and improves read performance.
Bitalostable (cold and hot data separation technology) stores cold data, which is identified according to data scale and access frequency. The storage engine writes cold data to Bitalostable when QPS is low. This improves data compression, reduces index memory consumption, and achieves more rational resource utilization. (The open source stable edition has basic features; the enterprise edition supports more comprehensive hot and cold separation.)
# IO Architecture
# KV Separation
# Solution Analysis
- Solution A
- Solution B
- Solution C
- Analysis
- Conclusion
Solutions A and B need to query and update the index during vLog GC, which consumes additional CPU and IO resources.
Solution C triggers multiple random reads when reading the vLog, so its read performance can be further optimized.
Bitalosdb not only ensures high-performance reads of the vLog, but also performs GC inside the vLog without querying or updating the index.
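One way to see how GC can avoid touching the index is a level of indirection: the outer index stores only which vLog file holds a key, and each file keeps its own lookup table. The sketch below illustrates that general idea under stated assumptions and is not Bitalosdb's actual design. `vlogFile`, `record`, `slots`, and `gc` are hypothetical names, and the file is modelled as an in-memory slice.

```go
package main

import "fmt"

// record is one entry in a toy value-log file (hypothetical layout).
type record struct {
	key   string
	value string
	live  bool
}

// vlogFile keeps its own file-local table from key to record position,
// so the outer index never needs to know positions inside the file.
type vlogFile struct {
	records []record
	slots   map[string]int
}

func newVlogFile() *vlogFile {
	return &vlogFile{slots: make(map[string]int)}
}

// put appends a record and marks any older record for the same key as dead.
func (f *vlogFile) put(key, value string) {
	if old, ok := f.slots[key]; ok {
		f.records[old].live = false
	}
	f.slots[key] = len(f.records)
	f.records = append(f.records, record{key: key, value: value, live: true})
}

// get resolves the key through the file-local table only.
func (f *vlogFile) get(key string) (string, bool) {
	i, ok := f.slots[key]
	if !ok {
		return "", false
	}
	return f.records[i].value, true
}

// gc drops dead records and repoints the file-local table.
// It returns the number of reclaimed records; the outer index is never consulted.
func (f *vlogFile) gc() int {
	kept := make([]record, 0, len(f.slots))
	for _, r := range f.records {
		if r.live {
			f.slots[r.key] = len(kept)
			kept = append(kept, r)
		}
	}
	reclaimed := len(f.records) - len(kept)
	f.records = kept
	return reclaimed
}

func main() {
	outerIndex := map[string]int{} // key -> file ID; stand-in for the main index
	file := newVlogFile()

	for _, v := range []string{"v1", "v2", "v3"} {
		file.put("a", v)
		outerIndex["a"] = 0
	}
	file.put("b", "w1")
	outerIndex["b"] = 0

	reclaimed := file.gc() // outerIndex is neither read nor written here
	va, _ := file.get("a")
	vb, _ := file.get("b")
	fmt.Println(reclaimed, va, vb, outerIndex["a"], outerIndex["b"])
}
```

In this sketch, GC cost depends only on the contents of the file being compacted, and readers that hold a file ID from the outer index still resolve the key correctly after compaction.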
# Implementation details
- File structure
- Data writing
- Data reading
- Index write
# High performance compressed index
- Note: The open source version supports 90% of the compression formats in the figure, and the enterprise version supports all index compression formats.