| 1 | trees: Sketch of Storage / Networking Architecture
|
| 2 | ==================================================
|
| 3 |
|
| 4 | As usual, we try not to invent anything big or new, but instead focus on
|
| 5 | composing and rationalizing existing software and protocols:
|
| 6 |
|
| 7 | - Many good implementations of POSIX file systems (Linux ext4, ZFS, etc.)
|
| 8 | - git, a distributed version control system
|
| 9 | - in particular the packfile format
|
| 10 | - the ssh send/receive pattern
|
| 11 | - Static WWW file servers like Apache and nginx
|
| 12 | - tar files, gzip files
|
| 13 |
|
| 14 | ## Use Cases
|
| 15 |
|
| 16 | 1. Building CI containers faster with wedges
|
| 17 | - native deps: re2c, bloaty, uftrace, ...
|
| 18 | - Python deps, e.g. MyPy
|
| 19 | - R deps, e.g. dplyr
|
| 20 | - wedge source is a .treeptr tarball
|
| 21 | - wedge derived is a .treeptr file
|
| 22 | 2. CI serving `.wwz` files. We need fast random access.
|
| 23 | 3. Running benchmarks on multiple machines
|
| 24 | - `oils-for-unix` tarball from EVERY commit, sync'd to different CI tasks
|
| 25 | 4. Comparisons across distros, OSes, and hardware
|
| 26 | - building same packages on Debian, Ubuntu, Alpine
|
| 27 | - and FreeBSD
|
| 28 | - x86 / x86-64 / ARM
|
| 29 | 5. Web .log files can be .treeptr files
|
| 30 |
|
| 31 | ## Silo: Large Trees Managed Outside Git
|
| 32 |
|
| 33 | You can `git pull` and `git push` without paying for these large objects, e.g.
|
| 34 | container images.
|
| 35 |
|
| 36 | To start, trees use regular compression with `gzip`. Later, the Silo will introspect
|
| 37 | trees and take **hints** for **differential** compression.
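
To get a feel for what a differential-compression hint could buy, here is a minimal sketch that uses zlib's preset-dictionary feature as a stand-in for a real delta format. It is not the Silo design: the function names `compress_with_hint` and `decompress_with_hint` are hypothetical, and the "old version" simply plays the role of the hint.

```python
import hashlib
import zlib


def compress_with_hint(new: bytes, hint: bytes) -> bytes:
    """Compress `new`, using `hint` (e.g. an older version) as a preset dictionary."""
    c = zlib.compressobj(zdict=hint)
    return c.compress(new) + c.flush()


def decompress_with_hint(data: bytes, hint: bytes) -> bytes:
    """Inverse of compress_with_hint; the receiver must hold the same hint."""
    d = zlib.decompressobj(zdict=hint)
    return d.decompress(data) + d.flush()


# Incompressible stand-in for an old blob, and a slightly patched new one.
old = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(100))
new = old[:1600] + b"patched" + old[1600:]

plain = zlib.compress(new)            # no hint: the data barely shrinks
delta = compress_with_hint(new, old)  # with hint: mostly back-references

print(len(new), len(plain), len(delta))
```

zlib only looks back 32 KiB, so this trick does not scale to whole tarballs; it only illustrates why a client saying "I already have the old version" can shrink the transfer.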
|
| 38 |
|
| 39 | Related:
|
| 40 |
|
| 41 | - git annex
|
| 42 | - git LFS
|
| 43 |
|
| 44 | ### Data
|
| 45 |
|
| 46 | https://oilshell.org/
|
| 47 | deps.silo/
|
| 48 | objects/ # everything is a blob at first
|
| 49 | 00/ # checksums calculated with git hash-object
|
| 50 | 123456.gz # may be a .tar file, but silo doesn't know
|
| 51 | pack/ # like git, it can have deltas, and be repacked
|
| 52 | foo.pack
|
| 53 | foo.idx
|
| 54 | derived/ # DERIVED trees, e.g. different deltas,
|
| 55 | # different compression, SquashFS, ...
|
| 56 |
|
| 57 | ### Commands
|
| 58 |
|
| 59 | silo verify # blobs should have valid checksums
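
A minimal sketch of what `silo verify` might do, under two assumptions that the notes above don't settle: the checksum covers the uncompressed contents, and an object lives at `objects/<first 2 hex digits>/<remaining digits>.gz`. The helpers `git_blob_id`, `store`, and `verify_silo` are hypothetical names; the hash itself is the standard git blob ID, as printed by `git hash-object`.

```python
import gzip
import hashlib
import os
import tempfile


def git_blob_id(data: bytes) -> str:
    """Same ID that `git hash-object` prints for a blob with these contents."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def store(objects_dir: str, data: bytes) -> str:
    """Write `data` gzip-compressed under its blob ID; return the path."""
    oid = git_blob_id(data)
    subdir = os.path.join(objects_dir, oid[:2])
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, oid[2:] + ".gz")
    with gzip.open(path, "wb") as f:
        f.write(data)
    return path


def verify_silo(objects_dir: str) -> list:
    """Return paths of objects whose contents don't match their names."""
    mismatched = []
    for prefix in sorted(os.listdir(objects_dir)):
        subdir = os.path.join(objects_dir, prefix)
        for name in sorted(os.listdir(subdir)):
            path = os.path.join(subdir, name)
            with gzip.open(path, "rb") as f:
                data = f.read()
            expected = prefix + name[: -len(".gz")]
            if git_blob_id(data) != expected:
                mismatched.append(path)
    return mismatched


with tempfile.TemporaryDirectory() as tmp:
    store(tmp, b"hello\n")
    tampered_path = store(tmp, b"original")
    with gzip.open(tampered_path, "wb") as f:  # simulate corruption
        f.write(b"tampered")
    corrupt = verify_silo(tmp)
```

Because the name is derived from the contents, verification needs no separate manifest.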
|
| 60 |
|
| 61 | Existing tools:
|
| 62 |
|
| 63 | rsync # back up the entire thing
|
| 64 | rclone # ditto, but works with cloud storage
|
| 65 |
|
| 66 | ssh rm "$@" # a list of vrefs to delete can be calculated by 'medo reachable'
|
| 67 | scp # create a new silo from 'medo reachable' manifest
|
| 68 |
|
| 69 | du --si -s # Total size of the Silo
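
The deletion step can be sketched as a set difference: everything in the Silo that is not named by the reachable list is garbage. This is only an illustration; the function `unreachable_objects` is hypothetical, and it assumes the reachable list is a set of object IDs and that objects are laid out as `objects/<2 hex digits>/<rest>.gz`.

```python
import os
import tempfile


def unreachable_objects(objects_dir: str, reachable: set) -> list:
    """List object paths whose ID is not in the reachable set."""
    garbage = []
    for prefix in sorted(os.listdir(objects_dir)):
        for name in sorted(os.listdir(os.path.join(objects_dir, prefix))):
            oid = prefix + name.split(".", 1)[0]
            if oid not in reachable:
                garbage.append(os.path.join(objects_dir, prefix, name))
    return garbage


with tempfile.TemporaryDirectory() as tmp:
    # Fake objects with shortened IDs, just to exercise the function.
    for rel in ["00/aaaa.gz", "00/bbbb.gz", "ff/cccc.gz"]:
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "wb").close()

    reachable = {"00aaaa", "ffcccc"}  # stand-in for a reachable manifest
    garbage_names = [
        os.path.relpath(p, tmp) for p in unreachable_objects(tmp, reachable)
    ]
```

The resulting paths are what one would pass to `rm` on the Silo host.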
|
| 70 |
|
| 71 | ## Medo (meadow): Named and Versioned Subtrees in `git`
|
| 72 |
|
| 73 | To start, this will untar and uncompress blobs from a Silo. We can also:
|
| 74 |
|
| 75 | - Materialize a git `tree`, e.g. in a packfile
|
| 76 | - Mount a git `tree` directly with FUSE. I think the pack `.idx` does binary
|
| 77 | search, which makes this possible.
|
| 78 | - TODO: write prototype with pygit2 wrapping libgit2
|
| 79 | - [FUSE bindings seem in question](https://stackoverflow.com/questions/52925566/which-module-is-the-actual-interface-to-fuse-from-python-3)
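
For reference, the lookup that a pack `.idx` enables can be sketched as follows. A version 2 index starts with a 256-entry fan-out table keyed by the first byte of the object ID, followed by the IDs in sorted order; the fan-out narrows the range and a binary search finishes the job. The code below works on an in-memory list rather than parsing a real `.idx` file, and the names `build_fanout` and `find` are hypothetical.

```python
import bisect
import hashlib


def build_fanout(sorted_ids: list) -> list:
    """fanout[b] = number of IDs whose first byte is <= b (256 entries)."""
    counts = [0] * 256
    for oid in sorted_ids:
        counts[oid[0]] += 1
    fanout, total = [], 0
    for c in counts:
        total += c
        fanout.append(total)
    return fanout


def find(sorted_ids: list, fanout: list, oid: bytes) -> int:
    """Return the index of `oid` in sorted_ids, or -1 if absent."""
    first = oid[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    # Binary search only within the slice that shares the first byte.
    i = bisect.bisect_left(sorted_ids, oid, lo, hi)
    return i if i < hi and sorted_ids[i] == oid else -1


# 1000 fake 20-byte object IDs, sorted as they would be in an index.
ids = sorted(hashlib.sha1(str(n).encode()).digest() for n in range(1000))
fanout = build_fanout(ids)
```

In a real index the position found this way indexes into a parallel table of pack offsets, which is what would let a FUSE layer read one object without scanning the pack.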
|
| 80 |
|
| 81 | ### Data
|
| 82 |
|
| 83 | ~/git/oilshell/oil/
|
| 84 | deps/ # The layout of the 3 medos is arbitrary; they're
|
| 85 | # generally mounted in different places, and
|
| 86 | # used by different tools
|
| 87 |
|
| 88 | source.medo/ # Relocatable data
|
| 89 | SILO.json # Can point to multiple Silos
|
| 90 | Python-3.10.4.treeptr # with checksum and provenance (original URL)
|
| 91 |
|
| 92 | derived.medo/ # derived values, some are wedges with absolute paths
|
| 93 | SILO.json # Can point to multiple Silos
|
| 94 | debian/
|
| 95 | bullseye/
|
| 96 | Python-3.10.4.treeptr
|
| 97 | ubuntu/
|
| 98 | 20.04/
|
| 99 | Python-3.10.4.treeptr # derived data has provenance:
|
| 100 | # base layer, mounts of input / code, env / shell command
|
| 101 | 22.04/
|
| 102 | Python-3.10.4.treeptr
|
| 103 |
|
| 104 | opaque.medo/ # Opaque values that can use more provenance.
|
| 105 | SILO.json
|
| 106 | images/ # 'docker save' format. Make sure it can be imported.
|
| 107 | debian/
|
| 108 | bullseye/
|
| 109 | slim.treeptr
|
| 110 |
|
| 111 | layers/
|
| 112 | debian/
|
| 113 | bullseye/
|
| 114 | mypy-deps.treeptr # packages needed to build it
|
| 115 |
|
| 116 | ### Commands
|
| 117 |
|
| 118 | # Get files to build. This does uncompress/untar.
|
| 119 | medo expand deps/source.medo/Python-3.10.4.treeptr _tmp/source/
|
| 120 |
|
| 121 | # Or sync files that are already built. If they already exist, verify
|
| 122 | # checksums.
|
| 123 | medo expand deps/derived.medo/debian/bullseye/ /wedge/oilshell.org/deps
|
| 124 |
|
| 125 | # Combine SILO.json and the JSON in the .treeptr
|
| 126 | medo url-for deps/source.medo/Python-3.10.4.treeptr
|
| 127 |
|
| 128 | # Verify checksums.
|
| 129 | medo verify deps.medo/ /wedge/oilshell.org/deps
|
| 130 |
|
| 131 | # Makes a tarball and .treeptr that you can scp/rsync
|
| 132 | medo add /wedge/oilshell.org/bash-4.4/ deps.medo/ubuntu/18.04/bash-4.4.treeptr
|
| 133 |
|
| 134 | medo reachable deps.medo/ # first step of garbage collection
|
| 135 |
|
| 136 | medo mount # much later: FUSE mount
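
A minimal sketch of the `medo url-for` idea: combine the Silo locations from `SILO.json` with the checksum in a `.treeptr`. The JSON field names (`silo_urls`, `checksum`, `provenance`), the example hosts, and the `objects/<2 hex digits>/<rest>.gz` layout are all assumptions for illustration; neither file format is specified here. The checksum is shortened in the same way as in the Silo layout above.

```python
import json


def urls_for(silo_json: str, treeptr_json: str) -> list:
    """Return one candidate URL per Silo listed in SILO.json."""
    silo = json.loads(silo_json)
    ptr = json.loads(treeptr_json)
    oid = ptr["checksum"]
    return [
        "%s/objects/%s/%s.gz" % (base.rstrip("/"), oid[:2], oid[2:])
        for base in silo["silo_urls"]
    ]


# Hypothetical file contents.
SILO = """
{"silo_urls": ["https://silo-a.example/deps.silo/",
               "https://silo-b.example/deps.silo"]}
"""
TREEPTR = """
{"checksum": "00123456",
 "provenance": {"url": "https://upstream.example/Python-3.10.4.tar.xz"}}
"""

urls = urls_for(SILO, TREEPTR)
for u in urls:
    print(u)
```

Since `SILO.json` can point to multiple Silos, the function returns a list, and a client could try each URL in turn.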
|
| 137 |
|
| 138 | ## `/wedge`: A binary-centric "semi-distro" that works with OCI containers, and without
|
| 139 |
|
| 140 | A package exports one or more binaries, and is a `treeptr` value:
|
| 141 |
|
| 142 | - metadata is stored in a `.medo` directory
|
| 143 | - data is stored in a Silo
|
| 144 |
|
| 145 | The package typically lives in a subdirectory of `/wedge`. This is due to
|
| 146 | `configure --prefix=/wedge/...`.
|
| 147 |
|
| 148 | What can you do with it?
|
| 149 |
|
| 150 | - A wedge can be mounted, e.g. `--mount type=bind,...`
|
| 151 | - It can be copied into an image: `COPY ...`
|
| 152 | - for quick deployment to cloud services, like GitHub Actions or fly.io
|
| 153 | - It has provenance, like other treeptr values. The provenance is either:
|
| 154 | - the original URL, for source data
|
| 155 | - the code, data, and environment used to build it
|
| 156 |
|
| 157 | Related:
|
| 158 |
|
| 159 | - GNU Stow (symlinks)
|
| 160 | - GoboLinux
|
| 161 | - Distri (exchange dirs with FUSE)
|
| 162 | - Nix/Bazel: a wedge is a "purely functional" value
|
| 163 | - Docker: wedges are meant to be created in containers, and mounted in
|
| 164 | containers
|
| 165 |
|
| 166 | ### Data
|
| 167 |
|
| 168 | /wedge/ # an absolute path, for configure --prefix=/wedge/..
|
| 169 | oils-for-unix.org/ # scoped to domain
|
| 170 | pkg/ # arbitrary structure, for dev dependencies
|
| 171 | Python-3.10.4.treeptr # metadata
|
| 172 | Python-3.10.4/
|
| 173 | python # Executable, which needs a 'python3' symlink
|
| 174 |
|
| 175 | ## Design Notes
|
| 176 |
|
| 177 | ### Data and Metadata Formats
|
| 178 |
|
| 179 | Text:
|
| 180 |
|
| 181 | - JSON for .treeptr, MEDO.json, SILO.json
|
| 182 | - lockfile / "world" / manifest - what does this look like?
|
| 183 |
|
| 184 | Data:
|
| 185 |
|
| 186 | - `git`
|
| 187 | - blob
|
| 188 | - tree for FS metadata
|
| 189 | - no commit objects!
|
| 190 | - packfile for multiple objects
|
| 191 | - Archiving: `.tar`
|
| 192 | - OCI layers use `.tar`
|
| 193 | - Compression: `.gz`, `bzip2`, etc.
|
| 194 | - Encryption (although LUKS already covers the whole system)
|
| 195 |
|
| 196 | ### knot: Incremental, Parallel, Coarse-Grained, Containerized Builds with Ninja
|
| 197 |
|
| 198 | It's a wrapper like `ninja_lib.py`. Importantly, everything you build should
|
| 199 | be versioned, immutable, and cached, so it doesn't use timestamps!
|
| 200 |
|
| 201 | Distributed builds, too? Multiple workers can pull and publish intermediate
|
| 202 | values to the same Silo.
|
| 203 |
|
| 204 | Key ideas:
|
| 205 |
|
| 206 | - the knot worker pulls tasks and is pointed at source.medo and derived.medo
|
| 207 | directories.
|
| 208 | - All of this metadata is in git. The git repo is sync'd on worker
|
| 209 | initialization, and continually updated.
|
| 210 | - TODO: if 2 workers grab the same task, it should be OK. One of their git
|
| 211 | commits will fail?
|
| 212 | - The worker does a lazy 'medo sync'
|
| 213 | - The worker keeps a local cache of the Silo, according to the parts of the
|
| 214 | Medo it needs
|
| 215 | - It can give HINTS for differential compression, saying "I have
|
| 216 | Python-3.10.4, send me delta for Python-3.10.5"
|
| 217 | - If all metadata is local, it can be even smarter
|
| 218 |
|
| 219 | (Name: it's geometry like "wedge", and hopefully cuts a "Gordian knot.")
|
| 220 |
|
| 221 |
|
| 222 | ## TODO
|
| 223 |
|
| 224 | ### Research
|
| 225 |
|
| 226 | - shrub vs. blob?
|
| 227 | - a shrub is a subtree, unlike a git `tree` object which is like an inode
|
| 228 | - is all of the metadata like paths and sizes stored client side? Then the
|
| 229 | client can give repacking hints for differential compression, rather than
|
| 230 | the server doing anything smart.
|
| 231 | - medo explode? You change the reference client-side
|
| 232 | - or silo explode? It can redirect from blob to shrub
|
| 233 | - TODO: look at git tree format, and whether an entire subtree/shrub of
|
| 234 | metadata can be stored client-side. We want ONLY trees, and blobs should be
|
| 235 | DANGLING.
|
| 236 | - Use pack format, or maybe a text format.
|
| 237 |
|
| 238 | ```
|
| 239 | ~/git/oilshell/oil$ git cat-file -p master^{tree}
|
| 240 | 040000 tree 37689433372bc7f1db7109fe1749bff351cba5b0 .builds
|
| 241 | 040000 tree 5d6b8fdbeb144b771e10841b7286df42bfce4c52 .circleci
|
| 242 | 100644 blob 6385fd579efef14978900830e5fd74bbac907011 .cirrus.yml
|
| 243 | 100644 blob 343af37bf39d45b147bda8a85e8712b0292ddfea .clang-format
|
| 244 | 040000 tree 03400f57a8475d0cc696557833088d718adb2493 .github
|
| 245 | ```
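
As a starting point for that research, here is a small sketch that parses a listing like the one above and separates subtrees from blobs. It reads the pretty-printed output of `git cat-file -p`, not the binary tree format, and the names `Entry` and `parse_listing` are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    mode: str
    kind: str  # "tree" or "blob"
    oid: str
    name: str


def parse_listing(text: str) -> list:
    """Parse `git cat-file -p <tree>` output into Entry records."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on any whitespace; maxsplit keeps spaces inside file names.
        mode, kind, oid, name = line.split(None, 3)
        entries.append(Entry(mode, kind, oid, name))
    return entries


LISTING = """\
040000 tree 37689433372bc7f1db7109fe1749bff351cba5b0 .builds
040000 tree 5d6b8fdbeb144b771e10841b7286df42bfce4c52 .circleci
100644 blob 6385fd579efef14978900830e5fd74bbac907011 .cirrus.yml
100644 blob 343af37bf39d45b147bda8a85e8712b0292ddfea .clang-format
040000 tree 03400f57a8475d0cc696557833088d718adb2493 .github
"""

entries = parse_listing(LISTING)
subtrees = [e.name for e in entries if e.kind == "tree"]         # kept as metadata
dangling_blobs = [e.oid for e in entries if e.kind == "blob"]    # fetched lazily
```

A client that stores only the `tree` entries has paths and structure locally, while the blob IDs remain references into the Silo.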
|
| 246 |
|
| 247 | ### More
|
| 248 |
|
| 249 | - Analog for low level `runc`, `crun`
|
| 250 | - Analog for high level `docker run`, `podman run`
|
| 251 | - The equivalent of inotify() on a silo / medo.
|
| 252 | - could be a REST API on `https://app.oilshell.org/soil.medo/events/` for tarballs
|
| 253 | - it tells you what Silo to fetch from
|
| 254 | - Source browser for https://www.oilshell.org/deps.silo
|
| 255 |
|
| 256 | ## Ideas / Slogans
|
| 257 |
|
| 258 | - "Distributed OS without RPCs". We use the paradigms of state
|
| 259 | synchronization, dependency graphs (partial orders), and probably low-level
|
| 260 | "events".
|
| 261 | - Silo is the **data plane**; Medo is the **control plane**
|
| 262 | - Hay config files will also be a control plane
|
| 263 | - Silo is a **mechanism**; Medo is for **policy**
|
| 264 | - `/wedge` is a **middle ground** between Docker and Nix/Bazel
|
| 265 | - Nix / Bazel are purely functional, but require rewriting upstream build
|
| 266 | systems in their own language (to fully make use of them)
|
| 267 | - Concretely: I don't want to rewrite the R build system for the tidyverse.
|
| 268 | I want to use the Debian packaging that already works, and that core R
|
| 269 | developers maintain.
|
| 270 | - `/wedge` is purely functional in the sense that wedges are literally
|
| 271 | **values**. But like Docker, you can use shell commands that mutate layers
|
| 272 | to create them. You can run entire language package managers and build
|
| 273 | systems via shell.
|
| 274 | - Wedges compose with, and compose better than, Docker layers.
|