bf-tree
· 1 min read
Bf-tree
decouple cache pages from disk pages, it no longer has to mirror disk 1:1
lets understand a little deeper:
- problem with B-tree, data lives in fixed size pages (4kb), buffer pool caches whole pages in RAM.
- to update a record -> read full page -> modify a few bytes -> write back 4kb
- but, cache doesnt require to mirror 1:1 disk so, in this paper they’ve introduced - “mini pages”
mini pages - a variable length in-mem fragments, so you dont need to hold the full 4kb page in buffer.

so instead of:
disk page ↔ identical in-memory page
we have:
disk page + 0/1 mini page (hot subset + unflushed changes) and buffer pool decides how big mini page is.
they have also introduced, circular buffer in RAM
- fixed total size (N bytes)
- tail pointer - where new mini page allocations are appended
- head pointer - where evicts starts
- when full, it evicts mini pages near head pointer and updates back to disk to free space.
so Bf-tree is basically trying to dominate the “triangle” of read, write, range-scan, it hits a sweet spot where both B-tree and LSM are compromises.
