<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Databases on Hrushikesh Dokala</title><link>https://hrushikesh.dev/tags/databases/</link><description>Recent content in Databases on Hrushikesh Dokala</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 30 Dec 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://hrushikesh.dev/tags/databases/index.xml" rel="self" type="application/rss+xml"/><item><title>building snaildb 🐌 : embedded, persistent key-value store written in rust</title><link>https://hrushikesh.dev/notes/snaildb/</link><pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/snaildb/</guid><description>&lt;p>i decided to experiment with building a key-value store in rust as a fun project to both learn the language and dive into writing the low-level data structures and algorithms used in databases. the inspiration was to see how far i could push the limits of a kv store compared to existing solutions like &lt;a href="https://rocksdb.org/">rocksdb&lt;/a> and &lt;a href="https://github.com/google/leveldb">leveldb&lt;/a>, especially after reading up on architectural concepts like bare-metal designs, object storage-based databases, and the &lt;a href="https://arxiv.org/pdf/2308.05462">bf-tree&lt;/a> paper. this project is my way of getting hands-on experience and satisfying my curiosity about database internals; it might turn out to be a production-ready db in the future. (start date: 02 december 2025, not really sure when this gets finished)&lt;/p></description></item><item><title>bf-tree</title><link>https://hrushikesh.dev/notes/bf-tree/</link><pubDate>Mon, 24 Nov 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/bf-tree/</guid><description>&lt;p>&lt;strong>Bf-tree&lt;/strong>&lt;/p>
&lt;blockquote>
&lt;p>decouple cache pages from disk pages: the cache no longer has to mirror disk 1:1&lt;/p>&lt;/blockquote>
&lt;p>let&amp;rsquo;s understand this a little deeper:&lt;/p>
&lt;ol>
&lt;li>problem with B-tree: data lives in fixed-size pages (4kb), and the buffer pool caches whole pages in RAM.&lt;/li>
&lt;li>to update a record -&amp;gt; read full page -&amp;gt; modify a few bytes -&amp;gt; write back 4kb&lt;/li>
&lt;li>but the cache doesn&amp;rsquo;t need to mirror disk 1:1, so the paper introduces &amp;ldquo;mini pages&amp;rdquo;&lt;/li>
&lt;/ol>
&lt;p>&lt;strong>mini pages&lt;/strong> - variable-length in-memory fragments, so you don&amp;rsquo;t need to hold the full 4kb page in the buffer.&lt;/p></description></item><item><title>vitess architecture</title><link>https://hrushikesh.dev/notes/vitess/</link><pubDate>Fri, 21 Nov 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/vitess/</guid><description>&lt;p>first of all, i&amp;rsquo;m very inspired by &lt;a href="https://x.com/samlambert">@samlambert&lt;/a>, ceo of planetscale. i was curious enough to explore &lt;a href="https://planetscale.com">planetscale.com&lt;/a> (among the fastest dbs available in the cloud, running on fast NVMe drives) and found something interesting: &lt;strong>vitess&lt;/strong>.&lt;/p>
&lt;p>there is a lot going on on their website, but &lt;strong>vitess&lt;/strong> allows &lt;em>&lt;strong>mysql dbs to scale horizontally through sharding&lt;/strong>&lt;/em>, which is very interesting, so i thought of digging deeper into it. one of the questions i had was - &amp;ldquo;what is the exact problem vitess solves for mysql?&amp;rdquo;.&lt;/p></description></item><item><title>SPFresh: incremental in-place updates for billion scale</title><link>https://hrushikesh.dev/notes/spfresh/</link><pubDate>Sat, 16 Aug 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/spfresh/</guid><description>&lt;p>inspiration: as you already know, i&amp;rsquo;m diving deep into &lt;a href="https://hrushikesh.dev/notes/vector-index">ann indexes&lt;/a>, and was looking into the turbopuffer architecture - which points me to 
&lt;a href="https://arxiv.org/pdf/2410.14452" class="link-red" target="_blank" rel="noopener noreferrer">SPFresh&lt;/a>
 making me very curious to know how it works.&lt;/p>
&lt;p>SPFresh is a disk-based, cluster-partitioned ANN index that supports in-place updates. It avoids global rebuilds, which are really expensive, by continuously rebalancing locally, even at billion-vector scale.&lt;/p>
&lt;p>&lt;strong>components&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>&lt;em>&lt;strong>LIRE&lt;/strong>&lt;/em> -&amp;gt; lightweight incremental re-balance protocol
&lt;ul>
&lt;li>A protocol that splits/merges partitions (postings) without rebuilding the global index&lt;/li>
&lt;li>During a split/merge, it only reassigns the boundary vectors that violate the NPA (nearest partition assignment) rule, which says a vector must be assigned to the partition whose centroid is nearest to it.&lt;/li>
&lt;li>&lt;strong>two conditions&lt;/strong> pick out the boundary vectors to check, so you don&amp;rsquo;t scan everything:
&lt;ul>
&lt;li>vectors from the split posting that are closer to the old centroid than to the new centroids (which might mean a neighbour posting is now closer)&lt;/li>
&lt;li>vectors in the neighbour postings need a check if a new centroid is now closer to them than the old centroid was.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;em>&lt;strong>algorithm&lt;/strong>&lt;/em>
&lt;ul>
&lt;li>
&lt;p>insert/delete: inserts append to the nearest posting (partition), deletes are only marked&lt;/p></description></item><item><title>ann (approximate nearest neighbor) indexes</title><link>https://hrushikesh.dev/notes/vector-index/</link><pubDate>Fri, 15 Aug 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/vector-index/</guid><description>&lt;p>inspiration - i&amp;rsquo;m working on vector and hybrid search for data catalogs and got curious about the different index structures used at scale.&lt;/p>
&lt;p>first, what is an ann index? an approximate nearest neighbor index is a data structure that stores your vectors in a way that lets you avoid comparing a query against all of them. while deep diving into ann, i&amp;rsquo;ve learnt about a few interesting indexes, chosen based on the scale of the data points. the 2 most popular kinds are &lt;strong>graph&lt;/strong> and &lt;strong>cluster&lt;/strong> based.&lt;/p></description></item><item><title>indexes</title><link>https://hrushikesh.dev/notes/indexes/</link><pubDate>Fri, 18 Jul 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/indexes/</guid><description>&lt;p>an index is a data structure that speeds up data retrieval by providing quick access to specific information without having to search through every piece of data. (e.g. a book&amp;rsquo;s index: look up the word, then navigate to that page)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>forward [ O(n) ]&lt;/p>
&lt;ul>
&lt;li>we have multiple documents (pages), and we extract the words from each document and store them in a data structure
&lt;ul>
&lt;li>doc 1 — “book”, “pen”, “student”&lt;/li>
&lt;li>doc 2 — “pen”&lt;/li>
&lt;li>doc 3 — “student”&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>a search needs a full scan across the docs to find which docs have the word.&lt;/li>
&lt;li>bring a pen -&amp;gt; do a full text scan on each doc -&amp;gt; Doc 1, 2&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>reverse/inverted [ O(1) ]&lt;/p></description></item><item><title>do you commit directly to the db? nah bro!</title><link>https://hrushikesh.dev/notes/wal/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://hrushikesh.dev/notes/wal/</guid><description>&lt;p>i thought i knew how databases work, but i was wrong until i read about the &lt;strong>wal&lt;/strong>.&lt;/p>
&lt;p>&lt;em>thought, this is how it works:&lt;/em>&lt;/p>
&lt;p>( user/api request ) -&amp;gt; ( operation (insert, update, delete) ) -&amp;gt; ( db (write to disk) ) -&amp;gt; ( respond 200 to user )&lt;/p>
&lt;p>&lt;em>learnt, it works like this:&lt;/em>&lt;/p>
&lt;p>( user/api request ) -&amp;gt; ( operation (insert, update, delete) ) -&amp;gt; ( wal (write to buffer in-memory) ) -&amp;gt; ( wal (flush to disk - fsync) ) + (starts async process) -&amp;gt; ( respond 200 to user )&lt;/p></description></item></channel></rss>