| 1 | ---
|
| 2 | in_progress: yes
|
| 3 | default_highlighter: oils-sh
|
| 4 | css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
|
| 5 | ---
|
| 6 |
|
| 7 | Word Language
|
| 8 | =============
|
| 9 |
|
| 10 | Recall that Oil is composed of three interleaved languages: **words**,
|
| 11 | [commands](command-language.html), and [expressions](expression-language.html).
|
| 12 |
|
| 13 | This doc describes words, but only the things that are **not** in:
|
| 14 |
|
| 15 | - [A Tour of the Oil Language](oil-language-tour.html)
|
| 16 | - The `#word-lang` section of [OSH Help
|
| 17 | Topics](osh-help-topics.html#word-lang)
|
| 18 | - The `#word-lang` section of [Oil Help
|
| 19 | Topics](oil-help-topics.html#word-lang)
|
| 20 |
|
| 21 | <div id="toc">
|
| 22 | </div>
|
| 23 |
|
| 24 | ## What's a Word?
|
| 25 |
|
| 26 | A word is an expression like `$x`, `"hello $name"`, or `{build,test}/*.py`. It
|
| 27 | evaluates to a string or an array of strings.
|
| 28 |
|
| 29 | Generally speaking, Oil behaves like a simpler version of POSIX shell / bash.
|
| 30 | Sophisticated users can read [Simple Word Evaluation](simple-word-eval.html)
|
| 31 | for a comparison.
|
| 32 |
|
| 33 | ## Contexts Where Words Are Used
|
| 34 |
|
| 35 | ### Words Are Part of Expressions and Commands
|
| 36 |
|
| 37 | Part of an expression:
|
| 38 |
|
| 39 | var x = ${y:-'default'}
|
| 40 |
|
| 41 | Part of a command:
|
| 42 |
|
| 43 | echo ${y:-'default'}
|
| 44 |
|
| 45 | ### Word Sequences: in for loops and array literals
|
| 46 |
|
| 47 | The three contexts where splitting and globbing apply are the ones where a
|
| 48 | **sequence** of words is evaluated (`EvalWordSequence`):
|
| 49 |
|
| 50 | 1. [Command]($help:simple-command): `echo $x foo`
|
| 51 | 2. [For loop]($help:for): `for i in $x foo; do ...`
|
| 52 | 3. [Array Literals]($help:array): `a=($x foo)` and `var a = :| $x foo |` ([oil-array]($help))
|
| 53 |
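As a rough illustration, the sketch below counts the words produced in each of the three contexts, using bash-compatible syntax. The helper names `count_args`, `loop_count`, and `array_count` are made up for this example, and it assumes the default `$IFS`.

```sh
x='a b'

# 1. Command: $x is split into two args, then foo is appended.
count_args() {
  echo "$#"
}
count_args $x foo

# 2. For loop: iterates over the same word sequence.
loop_count() {
  local n=0 i
  for i in $x foo; do
    n=$((n + 1))
  done
  echo "$n"
}
loop_count

# 3. Bash-style array literal: also a word sequence.
array_count() {
  local a=($x foo)
  echo "${#a[@]}"
}
array_count
```

Each call prints `3`, because `$x` is split into `a` and `b` before `foo` is added to the sequence.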
|
| 54 | ### Oil vs. Bash Array Literals
|
| 55 |
|
| 56 | Oil has a new array syntax, but it also supports the bash-compatible syntax:
|
| 57 |
|
| 58 | ```
|
| 59 | local myarray=(one two *.py) # bash
|
| 60 |
|
| 61 | var myarray = :| one two *.py | # Oil style
|
| 62 | ```
|
| 63 |
|
| 64 | ### Oil Discourages Context-Sensitive Evaluation
|
| 65 |
|
| 66 | Shell also has contexts where it evaluates words to a **single string**, rather
|
| 67 | than a sequence, like:
|
| 68 |
|
| 69 | ```sh
|
| 70 | # RHS of Assignment
|
| 71 | x="${not_array[@]}"
|
| 72 | x=*.py # not a glob
|
| 73 |
|
| 74 | # Redirect Arg
|
| 75 | echo foo > "${not_array[@]}"
|
| 76 | echo foo > *.py # not a glob
|
| 77 |
|
| 78 | # Case variables and patterns
|
| 79 | case "${not_array1[@]}" in
|
| 80 | "${not_array2[@]}")
|
| 81 | echo oops
|
| 82 | ;;
|
| 83 | esac
|
| 84 |
|
| 85 | case *.sh in # not a glob
|
| 86 | *.py) # a string pattern, not a file system glob
|
| 87 | echo oops
|
| 88 | ;;
|
| 89 | esac
|
| 90 | ```
|
| 91 |
|
| 92 | The behavior of these snippets varies widely across existing shells. That is,
|
| 93 | shells are buggy and poorly specified here.
|
| 94 |
|
| 95 | Oil disallows most of them. Arrays are considered separate from strings and
|
| 96 | don't randomly "decay".
|
| 97 |
|
| 98 | Related: the RHS of an Oil assignment is an expression, which can be of any
|
| 99 | type, including an array:
|
| 100 |
|
| 101 | ```
|
| 102 | var parts = split(x) # returns an array
|
| 103 | var python = glob('*.py') # ditto
|
| 104 |
|
| 105 | var s = join(parts) # returns a string
|
| 106 | ```
|
| 107 |
|
| 108 | ## Sigils
|
| 109 |
|
| 110 | This is a recap of [A Feel for Oil's Syntax](syntax-feelings.html).
|
| 111 |
|
| 112 | ### `$` Means "Returns One String"
|
| 113 |
|
| 114 | Examples:
|
| 115 |
|
| 116 | - All substitutions: var, command, arith
|
| 117 | - TODO: Do we have `$[a[x+1]]` as an expression substitution?
|
| 118 | - Or `$[ /pat+ /]`?
|
| 119 | - I don't think so.
|
| 120 |
|
| 121 | - Inline function calls, an Oil extension: `$[join(myarray)]`
|
| 122 |
|
| 123 | (C-style strings like `$'\n'` use `$`, but that's more of a bash anachronism.
|
| 124 | In Oil, `c'\n'` is preferred.)
|
| 125 |
|
| 126 | ### `@` Means "Returns An Array of Strings"
|
| 127 |
|
| 128 | Enabled with `shopt -s parse_at`.
|
| 129 |
|
| 130 | Examples:
|
| 131 |
|
| 132 | - `@myarray`
|
| 133 | - `@[arrayfunc(x, y)]`
|
| 134 |
|
| 135 | These are both Oil extensions.
|
| 136 |
|
| 137 | The array literal syntax also evaluates to an array of strings, although it's written with `:|` rather than `@`:
|
| 138 |
|
| 139 | ```
|
| 140 | var myarray = :| 1 2 3 |
|
| 141 | ```
|
| 142 |
|
| 143 | ## OSH Features
|
| 144 |
|
| 145 | ### Word Splitting and Empty String Elision
|
| 146 |
|
| 147 | OSH uses POSIX behavior for unquoted substitutions like `$x`:
|
| 148 |
|
| 149 | - The string value is split into args with `$IFS`.
|
| 150 | - If the string value is empty, no args are produced.
|
| 151 |
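A minimal sketch of both rules, assuming the default `$IFS` (space, tab, newline). The `count_args` helper is hypothetical and only reports how many args it received.

```sh
# Report how many arguments the word(s) expanded to.
count_args() {
  echo "$#"
}

x='a b  c'   # three fields separated by runs of spaces
empty=''

count_args $x          # unquoted: split on $IFS, so 3 args
count_args "$x"        # quoted: no splitting, so 1 arg
count_args $empty      # unquoted empty string is elided, so 0 args
count_args "$empty"    # quoted empty string is kept, so 1 arg
```

Quoting the substitution is what turns off both splitting and elision.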
|
| 152 | ### Implicit Joining
|
| 153 |
|
| 154 | Shell has odd "joining" semantics, which are supported in Oil but generally
|
| 155 | discouraged:
|
| 156 |
|
| 157 | set -- 'a b' 'c d'
|
| 158 | argv.py X"$@"X # => ['Xa b', 'c dX']
|
| 159 |
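If `argv.py` isn't at hand, the same joining can be observed with plain `printf`. This is only a sketch: `show_args` and `joining_demo` are hypothetical names, not part of Oil.

```sh
# Print each argument on its own line, in brackets, so boundaries are visible.
show_args() {
  printf '[%s]\n' "$@"
}

joining_demo() {
  set -- 'a b' 'c d'   # two positional parameters, local to this function call
  show_args X"$@"X     # X joins the first element, the trailing X joins the last
}

joining_demo
```

This prints `[Xa b]` and `[c dX]`: the elements of `"$@"` are not split, and the adjacent literals are glued onto the first and last elements.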
|
| 160 | In Oil, the RHS of an assignment is an expression, and joining only occurs
|
| 161 | within double quotes:
|
| 162 |
|
| 163 | # Oil
|
| 164 | var joined = $x$y # parse error
|
| 165 | var joined = "$x$y" # OK
|
| 166 |
|
| 167 | # Shell
|
| 168 | joined=$x$y # OK
|
| 169 | joined="$x$y" # OK
|
| 170 |
|
| 171 | <a name="extended-glob"></a>
|
| 172 | ### Extended Globs
|
| 173 |
|
| 174 | Extended globs in OSH are a "legacy syntax" modelled after the behavior of
|
| 175 | `bash` and `mksh`. This feature adds alternation, repetition, and negation to
|
| 176 | globs, giving them the power of regexes.
|
| 177 |
|
| 178 | You can use them to match strings:
|
| 179 |
|
| 180 | $ [[ foo.cc == *.@(cc|h) ]] && echo 'matches' # => matches
|
| 181 |
|
| 182 | Or produce lists of filename arguments:
|
| 183 |
|
| 184 | $ touch foo.cc foo.h
|
| 185 | $ echo *.@(cc|h) # => foo.cc foo.h
|
| 186 |
|
| 187 | There are some limitations and differences:
|
| 188 |
|
| 189 | - Extended globs are supported only when Oil is built with GNU libc.
|
| 190 | - GNU libc has the `FNM_EXTMATCH` extension to `fnmatch()`. Unlike bash and
|
| 191 | mksh, Oil doesn't implement its own extended glob matcher.
|
| 192 | - They're more **static**, like in `mksh`. When an extended glob appears in a
|
| 193 | word, we evaluate the word, match filenames, and **skip** the rest of the
|
| 194 | word evaluation pipeline. This means:
|
| 195 | - Automatic word splitting is skipped in something like
|
| 196 | `$unquoted/@(*.cc|h)`.
|
| 197 | - You can't use arrays like `"$@"` and extended globs in the same word, e.g.
|
| 198 | `"$@"_*.@(cc|h)`. This is usually nonsensical anyway.
|
| 199 | - OSH only accepts them in **contexts** that make sense.
|
| 200 | - For example, `echo foo > @(cc|h)` is a runtime error in OSH, but other
|
| 201 | shells will write a file literally named `@(cc|h)`.
|
| 202 | - OSH doesn't accept `${undef:-@(cc)}`. But it does accept `${x%@(cc)}`,
|
| 203 | since string strip operators like `%` accept a glob.
|
| 204 | - Extended globbing is always on in OSH, regardless of `shopt -s extglob`.
|
| 205 | - Trivia: `bash` can't parse some extended globs unless `extglob` is on. But
|
| 206 | it parses others when it's off.
|
| 207 | - Extended globs can't be used in the `PATTERN` in `${x//PATTERN/replace}`.
|
| 208 | This is because we only translate normal (non-extended) globs to regexes (in
|
| 209 | order to get the position information necessary for string replacement).
|
| 210 | - They're not supported when `shopt --set simple_word_eval` is on (Oil word
|
| 211 | evaluation).
|
| 212 | - For similar reasons, they're also not supported in assignment builtins.
|
| 213 | (This is a good thing!)
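The two string-matching uses mentioned above, `[[ ... ]]` and the `%` strip operator, can be sketched as follows. The function names `is_cxx_file` and `strip_ext` are made up for illustration. The `shopt -s extglob` line is there for bash, which needs it for the `%` operator. In OSH it should be redundant, since extended globbing is always on.

```sh
shopt -s extglob   # required by bash for ${name%...}; always on in OSH

# Succeed if the filename ends in .cc or .h
is_cxx_file() {
  [[ $1 == *.@(cc|h) ]]
}

# Strip a trailing .cc or .h using an extended glob with the % operator.
strip_ext() {
  local name=$1
  echo "${name%.@(cc|h)}"
}

is_cxx_file foo.cc && echo 'matches'
is_cxx_file foo.py || echo 'no match'
strip_ext foo.cc
strip_ext foo.py
```

Neither function touches the file system, so this avoids the static evaluation rules that apply when an extended glob is matched against filenames.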
|