| 1 | ---
|
| 2 | default_highlighter: oils-sh
|
| 3 | ---
|
| 4 |
|
| 5 | Hay - Custom Languages for Unix Systems
|
| 6 | =======================================
|
| 7 |
|
| 8 | *Hay* lets you use the syntax of YSH to declare **data** and
|
| 9 | interleaved **code**. It allows the shell to better serve its role as
|
| 10 | essential **glue**. For example, these systems all combine Unix processes in
|
| 11 | various ways:
|
| 12 |
|
| 13 | - local build systems (Ninja, CMake, Debian package builds, Docker/OCI builds)
|
| 14 | - remote build services (VM-based continuous integration like sourcehut, GitHub
|
| 15 | Actions)
|
| 16 | - local process supervisors (SysV init, systemd)
|
| 17 | - remote process supervisors / cluster managers (Slurm, Kubernetes)
|
| 18 |
|
| 19 | Slogans:
|
| 20 |
|
| 21 | - *Hay Ain't YAML*.
|
| 22 | - It evaluates to [JSON][] + Shell Scripts.
|
| 23 | - *We need a better **control plane** language for the cloud*.
|
| 24 | - *YSH adds the missing declarative part to shell*.
|
| 25 |
|
| 26 | This doc describes how to use Hay, with motivating examples.
|
| 27 |
|
| 28 | As of 2022, this is a new feature of YSH, and **it needs user feedback**.
|
| 29 | Nothing is set in stone, so you can influence the language and its features!
|
| 30 |
|
| 31 |
|
| 32 | [JSON]: $xref:JSON
|
| 33 |
|
| 34 | <!--
|
| 35 | - although also Tcl, Lua, Python, Ruby
|
| 36 | - DSLs, Config Files, and More
|
| 37 | - For Dialects of YSH
|
| 38 |
|
| 39 | Use case examples
|
| 40 | -->
|
| 41 |
|
| 42 | <!-- cmark.py expands this -->
|
| 43 | <div id="toc">
|
| 44 | </div>
|
| 45 |
|
| 46 | ## Example
|
| 47 |
|
| 48 | Hay could be used to configure a hypothetical Linux package manager:
|
| 49 |
|
| 50 | # cpython.hay -- A package definition
|
| 51 |
|
| 52 | hay define Package/TASK # define a tree of Hay node types
|
| 53 |
|
| 54 | Package cpython { # a node with attributes, and children
|
| 55 |
|
| 56 | version = '3.9'
|
| 57 | url = 'https://python.org'
|
| 58 |
|
| 59 | TASK build { # a child node, with YSH code
|
| 60 | ./configure
|
| 61 | make
|
| 62 | }
|
| 63 | }
|
| 64 |
|
| 65 | This program evaluates to a JSON tree, which you can consume from programs in
|
| 66 | any language, including YSH:
|
| 67 |
|
| 68 | { "type": "Package",
|
| 69 | "args": [ "cpython" ],
|
| 70 | "attrs": { "version": "3.9", "url": "https://python.org" },
|
| 71 | "children": [
|
| 72 | { "type": "TASK",
|
| 73 | "args": [ "build" ],
|
| 74 | "code_str": " ./configure\n make\n"
|
| 75 | }
|
| 76 | ]
|
| 77 | }
|
| 78 |
|
| 79 | That is, a package manager can use the attributes to create a build
|
| 80 | environment, then execute shell code within it. This is a *staged evaluation
|
| 81 | model*.
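
As a rough illustration of the consuming side, the Python sketch below parses the JSON tree shown above and separates the attributes from the unevaluated task code. The helper name `split_package` is hypothetical, and the sketch assumes the field names from the example output (`attrs`, `children`, `code_str`) and that each `TASK` takes a single string argument.

```python
import json

# The JSON tree from the example above, as a consumer might receive it.
TREE_JSON = r'''
{ "type": "Package",
  "args": [ "cpython" ],
  "attrs": { "version": "3.9", "url": "https://python.org" },
  "children": [
    { "type": "TASK",
      "args": [ "build" ],
      "code_str": "  ./configure\n  make\n"
    }
  ]
}
'''


def split_package(node):
    """Return (attrs, tasks), where tasks maps a task name to its shell code.

    Hypothetical helper: assumes each TASK child has exactly one string
    argument, as in the example.
    """
    tasks = {}
    for child in node.get("children", []):
        if child["type"] == "TASK":
            tasks[child["args"][0]] = child["code_str"]
    return node["attrs"], tasks


package = json.loads(TREE_JSON)
attrs, tasks = split_package(package)
print(attrs["version"])   # data for setting up the build environment
print(tasks["build"])     # shell code to execute later, in that environment
```

The attributes are available as ordinary typed data, while the task body stays an opaque string until the application decides where to run it.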
|
| 82 |
|
| 83 | ## Understanding Hay
|
| 84 |
|
| 85 | A goal of Hay is to restore the **simplicity** of Unix to distributed systems.
|
| 86 | It's all just **code and data**!
|
| 87 |
|
| 88 | This means that it's a bit abstract, so here are a few ways of understanding
|
| 89 | it.
|
| 90 |
|
| 91 | ### Analogies
|
| 92 |
|
| 93 | The relation between Hay and YSH is like the relationship between these pairs
|
| 94 | of languages:
|
| 95 |
|
| 96 | - [YAML][] / [Go templates][], which are used in Helm config for Kubernetes.
|
| 97 | - YAML data specifies a **service**, and templates specify **variants**.
|
| 98 | - Two common ways of building C and C++ code:
|
| 99 | - [Make]($xref:make) / [Autotools]($xref:autotools)
|
| 100 | - [Ninja]($xref:ninja) / [CMake][]
|
| 101 | - Make and Ninja specify a **build graph**, while autotools and CMake detect
|
| 102 | a **configured variant** with respect to your system.
|
| 103 |
|
| 104 | Each of these is *70's-style macro programming* — a stringly-typed
|
| 105 | language generating another stringly-typed language, with all the associated
|
| 106 | problems.
|
| 107 |
|
| 108 | In contrast, Hay and YSH are really the same language, with the same syntax,
|
| 109 | and the same Python- and JavaScript-like dynamic **types**. Hay is just YSH
|
| 110 | that **builds up data** instead of executing commands.
|
| 111 |
|
| 112 | (Counterpoint: Ninja is intended for code generation, and it makes sense for
|
| 113 | YSH to generate simple languages.)
|
| 114 |
|
| 115 |
|
| 116 | [Go templates]: https://pkg.go.dev/text/template
|
| 117 | [CMake]: https://cmake.org
|
| 118 |
|
| 119 | ### Prior Art
|
| 120 |
|
| 121 | See the [Survey of Config Languages]($wiki) on the wiki, which puts them in
|
| 122 | these categories:
|
| 123 |
|
| 124 | 1. Languages for String Data
|
| 125 | - INI, XML, [YAML][], ...
|
| 126 | 1. Languages for Typed Data
|
| 127 | - [JSON][], TOML, ...
|
| 128 | 1. Programmable String-ish Languages
|
| 129 | - Go templates, CMake, autotools/m4, ...
|
| 130 | 1. Programmable Typed Data
|
| 131 | - Nix expressions, Starlark, Cue, ...
|
| 132 | 1. Internal DSLs in General Purpose Languages
|
| 133 | - Hay, Guile Scheme for Guix, Ruby blocks, ...
|
| 134 |
|
| 135 | Excerpts:
|
| 136 |
|
| 137 | [YAML][] is a data format that is (surprisingly) the de-facto control plane
|
| 138 | language for the cloud. It's an approximate superset of [JSON][].
|
| 139 |
|
| 140 | [UCL][] (universal config language) and [HCL][] (HashiCorp config language) are
|
| 141 | influenced by the [Nginx][] config file syntax. If you can read any of these
|
| 142 | languages, you can read Hay.
|
| 143 |
|
| 144 | [Nix][] has a [functional language][nix-lang] to configure Linux distros. In
|
| 145 | contrast, Hay is multi-paradigm and imperative.
|
| 146 |
|
| 147 | [nix-lang]: https://wiki.nixos.org/wiki/Nix_Expression_Language
|
| 148 |
|
| 149 | The [Starlark][] language is a dialect of Python used by the [Bazel][] build
|
| 150 | system. It uses imperative code to specify build graph variants, and you can
|
| 151 | use this same pattern in Hay. That is, if statements, for loops, and functions
|
| 152 | are useful in Starlark and Hay.
|
| 153 |
|
| 154 | [Ruby][]'s use of [first-class
|
| 155 | blocks](http://radar.oreilly.com/2014/04/make-magic-with-ruby-dsls.html)
|
| 156 | inspired YSH. They're used in systems like Vagrant (VM dev environments) and
|
| 157 | Rake (a build system).
|
| 158 |
|
| 159 | In [Lisp][], code and data are expressed with the same syntax, and can be
|
| 160 | interleaved.
|
| 161 | [G-Expressions](https://guix.gnu.org/manual/en/html_node/G_002dExpressions.html)
|
| 162 | in Guix use a *staged evaluation model*, like Hay.
|
| 163 |
|
| 164 | [YAML]: $xref:YAML
|
| 165 | [UCL]: https://github.com/vstakhov/libucl
|
| 166 | [Nginx]: https://en.wikipedia.org/wiki/Nginx
|
| 167 | [HCL]: https://github.com/hashicorp/hcl
|
| 168 | [Nix]: $xref:nix
|
| 169 |
|
| 170 | [Starlark]: https://github.com/bazelbuild/starlark
|
| 171 | [Bazel]: https://bazel.build/
|
| 172 |
|
| 173 | [Ruby]: https://www.ruby-lang.org/en/
|
| 174 | [Lisp]: https://en.wikipedia.org/wiki/Lisp_(programming_language)
|
| 175 |
|
| 176 |
|
| 177 | ### Comparison
|
| 178 |
|
| 179 | The biggest difference between Hay and [UCL][] / [HCL][] is that it's
|
| 180 | **embedded in a shell**. In other words, Hay languages are *internal DSLs*,
|
| 181 | while those languages are *external*.
|
| 182 |
|
| 183 | This means:
|
| 184 |
|
| 185 | 1. You can **interleave** shell code with Hay data. We'll discuss the many
|
| 186 | uses of this below.
|
| 187 | - On the other hand, it's OK to configure simple systems with plain data
|
| 188 | like [JSON][]. Hay is for when that stops working!
|
| 189 | 1. Hay isn't a library you embed in another program. Instead, you use
|
| 190 | Unix-style **process-based** composition.
|
| 191 | - For example, [HCL][] is written in Go, which may be hard to embed in a C
|
| 192 | or Rust program.
|
| 193 | - Note that a process is a good **security** boundary. It can be
|
| 194 | additionally run in an OS container or VM.
|
| 195 |
|
| 196 | <!--
|
| 197 | - Code on the **outside** of Hay blocks may use the ["staged programming" / "graph metaprogramming" pattern][build-ci-comments] mentioned above.
|
| 198 | - Code on the **inside** is *unevaluated*. You can execute it in another
|
| 199 | context, like a remote machine, Linux container, or virtual machine.
|
| 200 | -->
|
| 201 |
|
| 202 | The sections below elaborate on these points.
|
| 203 |
|
| 204 | [shell-pipelines]: https://www.oilshell.org/blog/2017/01/15.html
|
| 205 |
|
| 206 | <!--
|
| 207 | - YSH has an imperative programming model. It's a little like Starlark.
|
| 208 | - Guile / GNU Make.
|
| 209 | - Tensorflow.
|
| 210 | -->
|
| 211 |
|
| 212 |
|
| 213 | ## Overview
|
| 214 |
|
| 215 | Hay nodes have a regular structure:
|
| 216 |
|
| 217 | - They start with a "command", which is called the **type**.
|
| 218 | - They accept **string** arguments and **block** arguments. There must be at
|
| 219 | least one argument.
|
| 220 |
|
| 221 | ### Two Kinds of Nodes, and Three Kinds of Evaluation
|
| 222 |
|
| 223 | There are two kinds of node with this structure.
|
| 224 |
|
| 225 | (1) `SHELL` nodes contain **unevaluated** code, and their type is ALL CAPS.
|
| 226 | The code is turned into a string that can be executed elsewhere.
|
| 227 |
|
| 228 | TASK build {
|
| 229 | ./configure
|
| 230 | make
|
| 231 | }
|
| 232 | # =>
|
| 233 | # ... {"code_str": " ./configure\n make\n"}
|
| 234 |
|
| 235 | (2) `Attr` nodes contain **data**, and their type starts with a capital letter.
|
| 236 | They eagerly evaluate a block in a new **stack frame** and turn it into an
|
| 237 | **attributes dict**.
|
| 238 |
|
| 239 | Package cpython {
|
| 240 | version = '3.9'
|
| 241 | }
|
| 242 | # =>
|
| 243 | # ... {"attrs": {"version": "3.9"}} ...
|
| 244 |
|
| 245 | These blocks have a special rule to allow *bare assignments* like `version =
|
| 246 | '3.9'`. That is, you don't need keywords like `const` or `var`.
|
| 247 |
|
| 248 | (3) In contrast to these two types of Hay nodes, YSH builtins that take a block
|
| 249 | usually evaluate it eagerly:
|
| 250 |
|
| 251 | cd /tmp { # run in a new directory
|
| 252 | echo $PWD
|
| 253 | }
|
| 254 |
|
| 255 | Builtins are spelled with `lower` case letters, so `SHELL` and `Attr` nodes
|
| 256 | won't be confused with them.
|
| 257 |
|
| 258 | ### Two Stages of Evaluation
|
| 259 |
|
| 260 | So Hay is designed to be used with a *staged evaluation model*:
|
| 261 |
|
| 262 | 1. The first stage follows the rules above:
|
| 263 | - Tree of Hay nodes → [JSON]($xref) + Unevaluated shell.
|
| 264 | - You can use variables, conditionals, loops, and more.
|
| 265 | 2. Your app or system controls the second stage. You can invoke YSH again to
|
| 266 | execute shell inside a VM, inside a Linux container, or on a remote machine.
|
| 267 |
|
| 268 | These two stages are conceptually different, but use the **same** syntax and
|
| 269 | evaluator! Again, the evaluator runs in a mode where it **builds up data**
|
| 270 | rather than executing commands.
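
A second stage might look like the following Python sketch, which hands a `SHELL` node's `code_str` to a fresh interpreter process. It assumes a `ysh` binary on `PATH` that accepts code through `-c`. The names `run_shell_node` and `interpreter` are hypothetical, and the demo substitutes the Python interpreter so the snippet runs without YSH installed.

```python
import subprocess
import sys


def run_shell_node(node, interpreter=("ysh", "-c")):
    """Execute the unevaluated code of a SHELL node in a separate process.

    `interpreter` is the argv prefix that receives the code string. The
    default assumes a `ysh` binary on PATH (an assumption, not checked here).
    """
    argv = list(interpreter) + [node["code_str"]]
    return subprocess.run(argv, capture_output=True, text=True, check=True)


# Demo with a stand-in interpreter, so the sketch is runnable anywhere.
demo_node = {"type": "TASK", "args": ["build"], "code_str": "print('building')"}
done = run_shell_node(demo_node, interpreter=(sys.executable, "-c"))
print(done.stdout, end="")
```

In a real system the same call could be wrapped so that the process starts inside a container, a VM, or on a remote machine.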
|
| 271 |
|
| 272 | ### Result Schema
|
| 273 |
|
| 274 | Here's a description of the result of Hay evaluation (the first stage).
|
| 275 |
|
| 276 | # The source may be "cpython.hay"
|
| 277 | FileResult = (source Str, children List[NodeResult])
|
| 278 |
|
| 279 | NodeResult =
|
| 280 | # Package cpython { version = '3.9' }
|
| 281 | Attr (type Str,
|
| 282 | args List[Str],
|
| 283 | attrs Map[Str, Any],
|
| 284 | children List[NodeResult])
|
| 285 |
|
| 286 | # TASK build { ./configure; make }
|
| 287 | | Shell(type Str,
|
| 288 | args List[Str],
|
| 289 | location_str Str,
|
| 290 | location_start_line Int,
|
| 291 | code_str Str)
|
| 292 |
|
| 293 |
|
| 294 | Notes:
|
| 295 |
|
| 296 | - Except for user-defined attributes, the result is statically typed.
|
| 297 | - Shell nodes are always leaf nodes.
|
| 298 | - Attr nodes may or may not be leaf nodes.
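
One way to consume this schema from Python is a recursive walk that tells the two node kinds apart by the presence of `code_str`. This is a sketch under that assumption. `iter_shell_nodes` is a hypothetical helper, and the sample data (including the line numbers) is made up to match the schema above.

```python
def iter_shell_nodes(node, path=()):
    """Yield (path, node) for every Shell node under a file result or Attr node.

    `path` is a tuple of "type args..." labels from the root down to the node.
    """
    if "code_str" in node:            # Shell nodes are always leaves
        yield path, node
        return
    for child in node.get("children", []):
        label = " ".join([child["type"], *child["args"]])
        yield from iter_shell_nodes(child, path + (label,))


# Made-up sample shaped like FileResult / NodeResult.
file_result = {
    "source": "cpython.hay",
    "children": [
        {"type": "Package", "args": ["cpython"],
         "attrs": {"version": "3.9"},
         "children": [
             {"type": "TASK", "args": ["build"],
              "location_str": "cpython.hay", "location_start_line": 4,
              "code_str": "  ./configure\n  make\n"},
             {"type": "TASK", "args": ["test"],
              "location_str": "cpython.hay", "location_start_line": 8,
              "code_str": "  make test\n"},
         ]},
    ],
}

found = [path for path, _ in iter_shell_nodes(file_result)]
print(found)
```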
|
| 299 |
|
| 300 | ## Three Ways to Invoke Hay
|
| 301 |
|
| 302 | ### Inline Hay Has No Restrictions
|
| 303 |
|
| 304 | You can put Hay blocks and normal shell code in the same file. Retrieve the
|
| 305 | result of Hay evaluation with the `_hay()` function.
|
| 306 |
|
| 307 | # myscript.ysh
|
| 308 |
|
| 309 | hay define Rule
|
| 310 |
|
| 311 | Rule mylib.o {
|
| 312 | inputs = ['mylib.c']
|
| 313 |
|
| 314 | # not recommended, but allowed
|
| 315 | echo 'hi'
|
| 316 | ls /tmp/$(whoami)
|
| 317 | }
|
| 318 |
|
| 319 | echo 'bye' # other shell code
|
| 320 |
|
| 321 | const result = _hay()
|
| 322 | json write (result)
|
| 323 |
|
| 324 | In this case, there are no restrictions on the commands you can run.
|
| 325 |
|
| 326 | ### In Separate Files
|
| 327 |
|
| 328 | You can put Hay definitions in their own file:
|
| 329 |
|
| 330 | # my-config.hay
|
| 331 |
|
| 332 | Rule mylib.o {
|
| 333 | inputs = ['mylib.c']
|
| 334 | }
|
| 335 |
|
| 336 | echo 'hi' # allowed for debugging
|
| 337 | # ls /tmp/$(whoami) would fail due to restrictions on hay evaluation
|
| 338 |
|
| 339 | In this case, you can use `echo` and `write`, but the interpreter is
|
| 340 | **restricted** (see below).
|
| 341 |
|
| 342 | Parse it with `parseHay()`, and evaluate it with `evalHay()`:
|
| 343 |
|
| 344 | # my-evaluator.ysh
|
| 345 |
|
| 346 | hay define Rule # node types for the file
|
| 347 | const h = parseHay('my-config.hay')
|
| 348 | const result = evalHay(h)
|
| 349 |
|
| 350 | json write (result)
|
| 351 | # =>
|
| 352 | # {
|
| 353 | # "children": [
|
| 354 | # { "type": "Rule",
|
| 355 | # "args": ["mylib.o"],
|
| 356 | # "attrs": {"inputs": ["mylib.c"]}
|
| 357 | # }
|
| 358 | # ]
|
| 359 | # }
|
| 360 |
|
| 361 | ### In A Block
|
| 362 |
|
| 363 | Instead of creating separate files, you can also use the `hay eval` builtin:
|
| 364 |
|
| 365 | hay define Rule
|
| 366 |
|
| 367 | hay eval :result { # assign to the variable 'result'
|
| 368 | Rule mylib.o {
|
| 369 | inputs = ['mylib.c']
|
| 370 | }
|
| 371 | }
|
| 372 |
|
| 373 | json write (result) # same as above
|
| 374 |
|
| 375 | This is mainly for testing and demos.
|
| 376 |
|
| 377 | ## Security Model: Restricted != Sandboxed
|
| 378 |
|
| 379 | The "restrictions" are **not** a security boundary! (They could be, but we're
|
| 380 | not making promises now.)
|
| 381 |
|
| 382 | Even with `evalHay()` and `hay eval`, the config file is evaluated in the
|
| 383 | **same interpreter**. But the following restrictions apply:
|
| 384 |
|
| 385 | - External commands aren't allowed
|
| 386 | - Builtins other than `echo` and `write` aren't allowed
|
| 387 | - For example, the `.hay` file can't invoke `shopt` to change global shell
|
| 388 | options
|
| 389 | - A new stack frame is created, so the `.hay` file can't mutate your locals
|
| 390 | - However it can still mutate globals with `setglobal`!
|
| 391 |
|
| 392 | In summary, Hay evaluation is restricted to prevent basic mistakes, but your
|
| 393 | code isn't completely separate from the evaluated Hay file.
|
| 394 |
|
| 395 | If you want to evaluate untrusted code, use a **separate process**, and run it
|
| 396 | in a container or VM.
|
| 397 |
|
| 398 | ## Reference
|
| 399 |
|
| 400 | Here is a list of all the mechanisms mentioned.
|
| 401 |
|
| 402 | ### Shell Builtins
|
| 403 |
|
| 404 | - `hay`
|
| 405 | - `hay define` to define node types.
|
| 406 | - `hay pp` to pretty print the node types.
|
| 407 | - `hay reset` to delete both the node types **and** the current evaluation
|
| 408 | result.
|
| 409 | - `hay eval :result { ... }` to evaluate in restricted mode, and put the
|
| 410 | result in a variable.
|
| 411 | - Implementation detail: the `haynode` builtin is run when types like
|
| 412 | `Package` and `TASK` are invoked. That is, all node types are aliases for
|
| 413 | this same builtin.
|
| 414 |
|
| 415 | ### Functions
|
| 416 |
|
| 417 | - `parseHay()` parses a file, just as `bin/ysh` does.
|
| 418 | - `evalHay()` evaluates the parsed file in restricted mode, like `hay eval`.
|
| 419 | - `_hay()` retrieves the current result
|
| 420 | - It's useful for interactive debugging.
|
| 421 | - The name starts with `_` because it's a "register" mutated by the
|
| 422 | interpreter.
|
| 423 |
|
| 424 | ### Options
|
| 425 |
|
| 426 | Hay is parsed and evaluated with option group `ysh:all`, which includes
|
| 427 | `parse_proc` and `parse_equals`.
|
| 428 |
|
| 429 | <!--
|
| 430 |
|
| 431 | - The `parse_brace` and `parse_equals` options are what let us inside attribute nodes
|
| 432 | - `_running_hay`
|
| 433 |
|
| 434 | -->
|
| 435 |
|
| 436 |
|
| 437 | ## Usage: Interleaving Hay and YSH
|
| 438 |
|
| 439 | Why would you want to interleave data and code? One reason is to naturally
|
| 440 | express variants of a configuration. Here are some examples.
|
| 441 |
|
| 442 | **Build variants**. There are many variants of the YSH binary:
|
| 443 |
|
| 444 | - `dbg` and `opt`. The compiler optimization level, and whether debug symbols
|
| 445 | are included.
|
| 446 | - `asan` and `ubsan`. Dynamic analysis with Clang sanitizers.
|
| 447 | - `-D GC_EVERY_ALLOC`. Make a build that helps debug the garbage collector.
|
| 448 |
|
| 449 | So the Ninja build graph to produce these binaries is **shaped** similarly, but
|
| 450 | it **varies** with compiler and linker flags.
|
| 451 |
|
| 452 | **Service variants**. A common problem in distributed systems is how to
|
| 453 | develop and debug services locally.
|
| 454 |
|
| 455 | Do your service dependencies live in the cloud, or are they run locally? What
|
| 456 | about state? Common variants:
|
| 457 |
|
| 458 | - `local`. Part or all of the service runs locally, so you may pass flags like
|
| 459 | `--auth-service localhost:8001` to binaries.
|
| 460 | - `staging`. A complete copy of the service, in a different cloud, with a
|
| 461 | different database.
|
| 462 | - `prod`. The live instance running with user data.
|
| 463 |
|
| 464 | Again, these collections of services are all **shaped** similarly, but the
|
| 465 | flags **vary** based on where binaries are physically running.
|
| 466 |
|
| 467 | ---
|
| 468 |
|
| 469 | This model can be referred to as ["graph metaprogramming" or "staged
|
| 470 | programming"][build-ci-comments]. In YSH, it's done with dynamically typed
|
| 471 | data like integers and dictionaries. In contrast, systems like CMake and
|
| 472 | autotools are more stringly typed.
|
| 473 |
|
| 474 | [build-ci-comments]: https://www.oilshell.org/blog/2021/04/build-ci-comments.html
|
| 475 |
|
| 476 | The following **examples** are meant to be "evocative"; they're not based on
|
| 477 | real code. Again, user feedback can improve them!
|
| 478 |
|
| 479 | ### Conditionals
|
| 480 |
|
| 481 | Conditionals can go on the inside of a block:
|
| 482 |
|
| 483 | Service auth.example.com { # node taking a block
|
| 484 | if (variant === 'local') { # condition
|
| 485 | port = 8001
|
| 486 | } else {
|
| 487 | port = 80
|
| 488 | }
|
| 489 | }
|
| 490 |
|
| 491 | Or on the outside:
|
| 492 |
|
| 493 | Service web { # node
|
| 494 | root = '/home/www'
|
| 495 | }
|
| 496 |
|
| 497 | if (variant === 'local') { # condition
|
| 498 | Service auth-local { # node
|
| 499 | port = 8001
|
| 500 | }
|
| 501 | }
|
| 502 |
|
| 503 |
|
| 504 | ### Iteration
|
| 505 |
|
| 506 | Iteration can also go on the inside of a block:
|
| 507 |
|
| 508 | Rule foo.o { # node
|
| 509 | inputs = [] # populate with all .cc files except one
|
| 510 |
|
| 511 | # variables ending with _ are "hidden" from block evaluation
|
| 512 | for name_ in *.cc {
|
| 513 | if (name_ !== 'skipped.cc') {
|
| 514 | call inputs->append(name_)
|
| 515 | }
|
| 516 | }
|
| 517 | }
|
| 518 |
|
| 519 | Or on the outside:
|
| 520 |
|
| 521 | for name_ in *.cc { # loop
|
| 522 | Rule $(basename $name_ .cc).o { # node
|
| 523 | inputs = [name_]
|
| 524 | }
|
| 525 | }
|
| 526 |
|
| 527 |
|
| 528 | ### Remove Duplication with `proc`
|
| 529 |
|
| 530 | Procs can wrap blocks:
|
| 531 |
|
| 532 | proc myrule(name) {
|
| 533 |
|
| 534 | # needed for blocks to use variables higher on the stack
|
| 535 | shopt --set dynamic_scope {
|
| 536 |
|
| 537 | Rule dbg/$name.o { # node
|
| 538 | inputs = ["$name.c"]
|
| 539 | flags = ['-O0']
|
| 540 | }
|
| 541 |
|
| 542 | Rule opt/$name.o { # node
|
| 543 | inputs = ["$name.c"]
|
| 544 | flags = ['-O2']
|
| 545 | }
|
| 546 |
|
| 547 | }
|
| 548 | }
|
| 549 |
|
| 550 | myrule foo # call proc
|
| 551 | myrule bar # call proc
|
| 552 |
|
| 553 | Or they can be invoked from within blocks:
|
| 554 |
|
| 555 | proc set-port (port_num; out) {
|
| 556 | call out->setValue("localhost:$port_num")
|
| 557 | }
|
| 558 |
|
| 559 | Service foo { # node
|
| 560 | set-port 80 :p1 # call proc
|
| 561 | set-port 81 :p2 # call proc
|
| 562 | }
|
| 563 |
|
| 564 | ## More Usage Patterns
|
| 565 |
|
| 566 | ### Using YSH for the Second Stage
|
| 567 |
|
| 568 | The general pattern is:
|
| 569 |
|
| 570 | ./my-evaluator.ysh my-config.hay | json read :result
|
| 571 |
|
| 572 | The evaluator does the following:
|
| 573 |
|
| 574 | 1. Sets up the execution context with `hay define`
|
| 575 | 1. Parses `my-config.hay` with `parseHay()`
|
| 576 | 1. Evaluates it with `evalHay()`
|
| 577 | 1. Prints the result as JSON.
|
| 578 |
|
| 579 | Then a separate YSH process reads this JSON and executes application code.
|
| 580 |
|
| 581 | TODO: Show code example.
|
| 582 |
|
| 583 | ### Using Python for the Second Stage
|
| 584 |
|
| 585 | In Python, you would:
|
| 586 |
|
| 587 | 1. Use the `subprocess` module to invoke `./my-evaluator.ysh my-config.hay`.
|
| 588 | 2. Use the `json` module to parse the result.
|
| 589 | 3. Then execute application code using the data.
|
| 590 |
|
| 591 | TODO: Show code example.
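
Until an official example exists, the steps above can be sketched roughly as follows. The function names `load_hay` and `rule_inputs` are hypothetical, and the sketch assumes the evaluator prints the JSON shape shown earlier for `my-config.hay`. The demo swaps in a stand-in evaluator so that it runs without YSH installed.

```python
import json
import subprocess
import sys


def load_hay(config_path, evaluator=("./my-evaluator.ysh",)):
    """Run the first-stage evaluator on a .hay file and parse its JSON output."""
    proc = subprocess.run(
        [*evaluator, config_path],
        capture_output=True, text=True, check=True,  # raise if evaluation fails
    )
    return json.loads(proc.stdout)


def rule_inputs(result):
    """Application code: map each Rule's name to its list of inputs."""
    return {
        node["args"][0]: node["attrs"]["inputs"]
        for node in result["children"]
        if node["type"] == "Rule"
    }


# Stand-in evaluator for the demo: prints a fixed JSON result and ignores
# its argument. Replace it with the real evaluator script.
FAKE_OUTPUT = json.dumps({
    "children": [
        {"type": "Rule", "args": ["mylib.o"], "attrs": {"inputs": ["mylib.c"]}},
    ]
})
fake_evaluator = (sys.executable, "-c", f"print({FAKE_OUTPUT!r})")

result = load_hay("my-config.hay", evaluator=fake_evaluator)
print(rule_inputs(result))
```

Because `check=True` raises on a non-zero exit status, a failed Hay evaluation stops the application before it acts on partial data.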
|
| 592 |
|
| 593 | ### Locating Errors in the Original `.hay` File
|
| 594 |
|
| 595 | The YSH interpreter has 2 flags starting with `--location` that give you
|
| 596 | control over error messages.
|
| 597 |
|
| 598 | ysh --location-str 'foo.hay' --location-start-line 42 -- stage2.ysh
|
| 599 |
|
| 600 | Set them to the values of fields `location_str` and `location_start_line` in
|
| 601 | the result of `SHELL` node evaluation.
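
For instance, a Python second stage could assemble that command line from a node's fields, as in the sketch below. The helper `stage2_argv` is hypothetical, the node is made-up sample data, and how the node's `code_str` ends up in `stage2.ysh` is left to the application.

```python
import shlex


def stage2_argv(node, script_path):
    """Build a ysh command line that reports errors against the original file."""
    return [
        "ysh",
        "--location-str", node["location_str"],
        "--location-start-line", str(node["location_start_line"]),
        "--", script_path,
    ]


# Made-up SHELL node with the location fields from the result schema.
node = {
    "type": "TASK", "args": ["build"],
    "location_str": "foo.hay", "location_start_line": 42,
    "code_str": "  ./configure\n  make\n",
}
argv = stage2_argv(node, "stage2.ysh")
print(shlex.join(argv))
```

Passing the list to `subprocess.run()` rather than a joined string avoids quoting problems when the file name contains spaces.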
|
| 602 |
|
| 603 | ### Debian `.d` Dirs
|
| 604 |
|
| 605 | Debian has a pattern of splitting configuration into a **directory** of
|
| 606 | concatenated files. It's easier for shell scripts to add to a directory than
|
| 607 | add to a file.
|
| 608 |
|
| 609 | This can be done with an evaluator that simply enumerates all files:
|
| 610 |
|
| 611 | var results = []
|
| 612 | for path in myconfig.d/*.hay {
|
| 613 | const code = parseHay(path)
|
| 614 | const result = evalHay(code)
|
| 615 | call results->append(result)
|
| 616 | }
|
| 617 |
|
| 618 | # Now iterate through results
|
| 619 |
|
| 620 | ### Parallel Loading
|
| 621 |
|
| 622 | TODO: Example of using `xargs -P` to spawn processes with `parseHay()` and
|
| 623 | `evalHay()`. Then merge the JSON results.
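
As a stand-in for the `xargs -P` approach, the same idea can be sketched in Python with a thread pool that starts one evaluator process per file. This is not the `xargs` pipeline itself. The names `eval_one` and `load_all` are hypothetical, "merging" here only means collecting the per-file results into a list, and the demo uses a stand-in evaluator so it runs without YSH installed.

```python
import json
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def eval_one(path, evaluator):
    """Evaluate one .hay file in its own process and parse the JSON it prints."""
    proc = subprocess.run([*evaluator, path],
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def load_all(paths, evaluator=("./my-evaluator.ysh",), max_workers=4):
    """Evaluate files concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: eval_one(p, evaluator), paths))


# Stand-in evaluator: reports its argument as the "source" of an empty result.
fake = (sys.executable, "-c",
        "import json, sys; "
        "print(json.dumps({'source': sys.argv[1], 'children': []}))")

results = load_all(["a.hay", "b.hay", "c.hay"], evaluator=fake)
print([r["source"] for r in results])
```

Threads are enough here because each worker only waits on a child process; the parallel work happens in the evaluator processes.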
|
| 624 |
|
| 625 | ## Style
|
| 626 |
|
| 627 | ### Attributes vs. Procs
|
| 628 |
|
| 629 | Assigning attributes and invoking procs can look similar:
|
| 630 |
|
| 631 | Package grep {
|
| 632 | version = '1.0' # An attribute?
|
| 633 |
|
| 634 | version 1.0 # or call proc 'version'?
|
| 635 | }
|
| 636 |
|
| 637 | The first style is better for typed data like integers and dictionaries. The
|
| 638 | latter style isn't useful here, but it could be if `version 1.0` created
|
| 639 | complex Hay nodes.
|
| 640 |
|
| 641 | ### Attributes vs. Flags
|
| 642 |
|
| 643 | Hay nodes shouldn't take flags or `--`. Flags are for key-value pairs, and
|
| 644 | blocks are better for expressing such data.
|
| 645 |
|
| 646 | No:
|
| 647 |
|
| 648 | Package --version 1.0 grep {
|
| 649 | license = 'GPL'
|
| 650 | }
|
| 651 |
|
| 652 | Yes:
|
| 653 |
|
| 654 | Package grep {
|
| 655 | version = '1.0'
|
| 656 | license = 'GPL'
|
| 657 | }
|
| 658 |
|
| 659 | ### Dicts vs. Blocks
|
| 660 |
|
| 661 | Superficially, dicts and blocks are similar:
|
| 662 |
|
| 663 | Package grep {
|
| 664 | mydict = {name: 'value'} # a dict
|
| 665 |
|
| 666 | mynode foo { # a node taking a block
|
| 667 | name = 'value'
|
| 668 | }
|
| 669 | }
|
| 670 |
|
| 671 | Use dicts in cases where you don't know the names or types up front, like
|
| 672 |
|
| 673 | files = {'README.md': true, '__init__.py': false}
|
| 674 |
|
| 675 | Use blocks when there's a **schema**. Blocks are also different because:
|
| 676 |
|
| 677 | - You can use `if` statements and `for` loops in them.
|
| 678 | - You can call `TASK build; TASK test` within a block, creating multiple
|
| 679 | objects of the same type.
|
| 680 | - Later: custom validation
|
| 681 |
|
| 682 | ### YSH vs. Shell
|
| 683 |
|
| 684 | Hay files are parsed as YSH, not OSH. That includes `SHELL` nodes:
|
| 685 |
|
| 686 | TASK build {
|
| 687 | cp @deps /tmp # YSH splicing syntax
|
| 688 | }
|
| 689 |
|
| 690 | If you want to use POSIX shell or bash, use two arguments, the second of which
|
| 691 | is a multi-line string:
|
| 692 |
|
| 693 | TASK build '''
|
| 694 | cp "${deps[@]}" /tmp
|
| 695 | '''
|
| 696 |
|
| 697 | The YSH style gives you *static parsing*, which catches some errors earlier.
|
| 698 |
|
| 699 | ## Future Work
|
| 700 |
|
| 701 | - `hay proc` for arbitrary schema validation, including JSON schema
|
| 702 | - Examples of running hay in a secure process / container, in various languages
|
| 703 | - Sandboxing:
|
| 704 | - More fine-grained rules?
|
| 705 | - "restricted" could come with a security guarantee. I've avoided making
|
| 706 | such guarantees, but I think it's possible as YSH matures. The
|
| 707 | interpreter uses dependency inversion to isolate I/O.
|
| 708 | - More location info, including the source file.
|
| 709 |
|
| 710 | [Please send
|
| 711 | feedback](https://github.com/oilshell/oil/wiki/Where-To-Send-Feedback) about
|
| 712 | Hay. It will inform and prioritize this work!
|
| 713 |
|
| 714 | ## Links
|
| 715 |
|
| 716 | - Blog posts tagged #[hay]($blog-tag). Hay is a general mechanism, so it's
|
| 717 | useful to explain it with concrete examples.
|
| 718 | - [Data Definition and Code Generation in Tcl](https://trs.jpl.nasa.gov/bitstream/handle/2014/7660/03-1728.pdf) (2003, PDF)
|
| 719 | - Like Hay, it has the (Type, Name, Attributes) data model.
|
| 720 | - <https://github.com/oilshell/oil/wiki/Config-Dialect>. Design notes and related links on the wiki.
|