Pgmicro storage-format
SQLite file format, B-trees, pages, cells, overflow, freelist that is used in tursodb
install
source · Clone the upstream repo
git clone https://github.com/glommer/pgmicro
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/glommer/pgmicro "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/storage-format" ~/.claude/skills/glommer-pgmicro-storage-format && rm -rf "$T"
manifest:
.claude/skills/storage-format/SKILL.mdsource content
Storage Format Guide
Database File Structure
┌─────────────────────────────┐ │ Page 1: Header + Schema │ ← First 100 bytes = DB header ├─────────────────────────────┤ │ Page 2..N: B-tree pages │ ← Tables and indexes │ Overflow pages │ │ Freelist pages │ └─────────────────────────────┘
Page size: power of 2, 512-65536 bytes. Default 4096.
Database Header (First 100 Bytes)
| Offset | Size | Field |
|---|---|---|
| 0 | 16 | Magic: |
| 16 | 2 | Page size (big-endian) |
| 18 | 1 | Write format version (1=rollback, 2=WAL) |
| 19 | 1 | Read format version |
| 24 | 4 | Change counter |
| 28 | 4 | Database size in pages |
| 32 | 4 | First freelist trunk page |
| 36 | 4 | Total freelist pages |
| 40 | 4 | Schema cookie |
| 56 | 4 | Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE) |
All multi-byte integers: big-endian.
Page Types
| Flag | Type | Purpose |
|---|---|---|
| 0x02 | Interior index | Index B-tree internal node |
| 0x05 | Interior table | Table B-tree internal node |
| 0x0a | Leaf index | Index B-tree leaf |
| 0x0d | Leaf table | Table B-tree leaf |
| - | Overflow | Payload exceeding cell capacity |
| - | Freelist | Unused pages (trunk or leaf) |
B-tree Structure
Two B-tree types:
- Table B-tree: 64-bit rowid keys, stores row data
- Index B-tree: Arbitrary keys (index columns + rowid)
Interior page: [ptr0] key1 [ptr1] key2 [ptr2] ... │ │ │ ▼ ▼ ▼ child child child pages pages pages Leaf page: key1:data key2:data key3:data ...
Page 1 always root of
sqlite_schema table.
Cell Format
Table Leaf Cell
[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]
Table Interior Cell
[left_child_page: u32] [rowid: varint]
Index Cells
Similar but key is arbitrary (columns + rowid), not just rowid.
Record Format (Payload)
[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
Serial types:
| Type | Meaning |
|---|---|
| 0 | NULL |
| 1-4 | 1/2/3/4 byte signed int |
| 5 | 6 byte signed int |
| 6 | 8 byte signed int |
| 7 | IEEE 754 float |
| 8 | Integer 0 |
| 9 | Integer 1 |
| ≥12 even | BLOB, length=(N-12)/2 |
| ≥13 odd | Text, length=(N-13)/2 |
Overflow Pages
When payload exceeds threshold, excess stored in overflow chain:
[next_page: u32] [data...]
Last page has next_page=0.
Freelist
Linked list of trunk pages, each containing leaf page numbers:
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]
Turso Implementation
Key files:
- On-disk format,core/storage/sqlite3_ondisk.rs
enumPageType
- B-tree operations (large file)core/storage/btree.rs
- Page managementcore/storage/pager.rs
- Page cachingcore/storage/buffer_pool.rs
Debugging Storage
# Integrity check cargo run --bin tursodb test.db "PRAGMA integrity_check;" # Page count cargo run --bin tursodb test.db "PRAGMA page_count;" # Freelist info cargo run --bin tursodb test.db "PRAGMA freelist_count;"