---
default_highlighter: oils-sh
---

Known Differences Between OSH and Other Shells
==============================================

This document is for **sophisticated shell users**.

You're unlikely to encounter these incompatibilities in everyday shell usage.
If you do, there's almost always a **simple workaround**, like adding a space
or a backslash.

OSH is meant to run all POSIX shell programs, and most bash programs.

<!-- cmark.py expands this -->
<div id="toc">
</div>

<!--
TODO:

- `` as comments in sandstorm
  # This relates to comments being EOL or not

- Pipelines
  - PIPESTATUS only set when a pipeline is actually run.
  - zsh-like lastpipe semantics.

-->

## Numbers and Arithmetic

### printf '%d' and other numeric formats require a valid integer

In other shells, `printf %d invalid_integer` prints `0` and a warning. OSH
gives you a runtime error.

<!-- TODO: Probably should be strict_arith -->

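Scripts that relied on the lenient behavior can check the value before formatting it. The following is a minimal sketch, not something OSH provides: `is_int` and `fmt` are hypothetical helper names, and the check only accepts optionally signed decimal digits, so it is stricter than `printf` itself (it would reject hex such as `0x1F`).

```shell
# Hypothetical helper: succeed only if $1 is an optionally signed decimal integer.
is_int() {
  case ${1#[-+]} in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    *) return 0 ;;
  esac
}

# Hypothetical wrapper: format valid integers, fall back to 0 explicitly.
fmt() {
  if is_int "$1"; then
    printf '%d\n' "$1"
  else
    printf '%d\n' 0
  fi
}

good=$(fmt 42)
bad=$(fmt invalid_integer)
echo "good=$good bad=$bad"
```

The explicit fallback makes the choice of `0` visible in the script, so it no longer depends on how a given shell handles the invalid input.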
### Dynamically parsed command subs disallowed unless `shopt -s eval_unsafe_arith`

In shell, array locations are often dynamically parsed, and the index can have
command subs, which execute arbitrary code.

For example, if you have `code='a[$(echo 42 | tee PWNED)]'`, shells will parse
this data and execute it in many situations:

    echo $(( code ))  # dynamic parsing and evaluation in bash, mksh, zsh

    unset $code

    printf -v $code hi

    echo ${!code}

OSH disallows this by default. If you want this behavior, you can turn on
`shopt -s eval_unsafe_arith`.

Related: [A 30-year-old security problem](https://www.oilshell.org/blog/2019/01/18.html#a-story-about-a-30-year-old-security-problem)

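One way to stay clear of this class of problem in any shell is to validate untrusted data before it reaches an arithmetic context. This is only a sketch under stated assumptions: `is_safe_index` is a hypothetical helper, and accepting nothing but decimal digits is one possible policy, not a rule that OSH imposes.

```shell
# Untrusted data that would run a command if parsed as an array location.
code='a[$(echo 42 | tee PWNED)]'

# Hypothetical helper: accept only a plain non-negative decimal index.
is_safe_index() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
  esac
  return 0
}

results=''
for candidate in 3 "$code"; do
  if is_safe_index "$candidate"; then
    # Only validated digits ever reach the arithmetic expression.
    results="$results ok:$(( candidate + 1 ))"
  else
    results="$results rejected"
  fi
done
echo "$results"
```

Because the rejected string is never evaluated, the embedded `tee PWNED` command does not run and no `PWNED` file appears.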
## Static Parsing Differences

This section describes differences related to [static
parsing](http://www.oilshell.org/blog/2016/10/22.html). OSH avoids the
dynamic parsing of most shells.

(Note: This section should encompass all the failures from the [wild
tests](http://oilshell.org/cross-ref.html?tag=wild-test#wild-test) and [spec
tests](http://oilshell.org/cross-ref.html?tag=spec-test#spec-test).)

### Strings vs. Bare words in array indices

Strings should be quoted inside array indices:

No:

    "${SETUP_STATE[$err.cmd]}"

Yes:

    "${SETUP_STATE["$err.cmd"]}"

When unquoted, the period causes an ambiguity with respect to regular arrays
vs. associative arrays. See [Parsing Bash is
Undecidable](http://www.oilshell.org/blog/2016/10/20.html).


### Subshell in command sub

You can have a subshell in a command sub, but it usually doesn't make sense.

In OSH you need a space after `$(`. The characters `$((` always start an
arith sub.

No:

    $((cd / && ls))

Yes:

    $( (cd / && ls) )   # Valid but usually doesn't make sense.
    $({ cd / && ls; })  # Use {} for grouping, not (). Note trailing ;
    $(cd / && ls)       # Even better


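The forms listed under "Yes" can be tried in a small script. This sketch uses `pwd` instead of `ls` so the result is predictable; the variable names are made up for illustration.

```shell
start=$PWD

nested=$( (cd / && pwd) )   # note the space: "$( (" rather than "$(("
simpler=$(cd / && pwd)      # a command sub already runs in a subshell
sum=$((1 + 2))              # "$((" starts arithmetic

echo "nested=$nested simpler=$simpler sum=$sum"
```

Both command subs leave the working directory of the calling shell unchanged, which is why the extra parentheses add nothing.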
### Extended glob vs. Negation of boolean expression

The OSH parser distinguishes these two constructs with a space:

- `[[ !(a == a) ]]` is an extended glob.
- `[[ ! (a == a) ]]` is the negation of an equality test.

In bash, the parsing of such expressions depends on `shopt -s extglob`. In
OSH, `shopt -s extglob` is accepted, but doesn't affect parsing.

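A short sketch of the negation form, written with the space so that it reads as a negated test. It assumes bash with its default non-interactive settings (`extglob` off); the variable names are illustrative only.

```shell
# With a space after "!", the parentheses group an equality test.
if [[ ! (a == a) ]]; then same=no; else same=yes; fi
if [[ ! (a == b) ]]; then differ=yes; else differ=no; fi

echo "same=$same differ=$differ"
```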
### Here doc terminators must be on their own line

Lines like `EOF]` or `EOF)` don't end here docs. The delimiter must be on its
own line.

No:

    a=$(cat <<EOF
    abc
    EOF)

    a=$(cat <<EOF
    abc
    EOF  # this is not a comment; it makes the EOF delimiter invalid
    )

Yes:

    a=$(cat <<EOF
    abc
    EOF
    )  # this is actually a comment


### Spaces aren't allowed in LHS indices

Bash allows:

    a[1 + 2 * 3]=value

OSH only allows:

    a[1+2*3]=value

because it parses with limited lookahead. The first line would result in the
execution of a command named `a[1`.

### break / continue / return are keywords, not builtins

This means that they aren't "dynamic":

    b=break
    while true; do
      $b  # doesn't break in OSH
    done

Static control flow will allow static analysis of shell scripts.

(Test cases are in [spec/loop][]).

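A loop that chose its control flow through a variable can usually be rewritten so that the keyword appears literally and only the condition is dynamic. A minimal sketch, where `stop_at` and `count` are hypothetical names:

```shell
stop_at=3   # hypothetical condition, for illustration only
count=0

while true; do
  count=$((count + 1))
  if [ "$count" -ge "$stop_at" ]; then
    break   # written literally, so the parser sees the control flow
  fi
done

echo "count=$count"
```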
### OSH has more builtins, which shadow external commands

For example, `append` is a builtin in OSH, but not in `bash`. Use `env append`
or `/path/to/append` if you want to run an external command.

(Note that a user-defined proc `append` takes priority over the builtin
`append`.)

### OSH has more keywords, which shadow builtins, functions, and commands

In contrast with builtins, **keywords** affect shell parsing.

For example, `func` is a keyword in OSH, but not in `bash`. To run a command
named `func`, use `command func arg1`.

Note that all shells have extensions that cause this issue. For example, `[[`
is a keyword in `bash` but not in POSIX shell.

## Later Parsing Differences

These differences occur in subsequent stages of parsing, or in runtime parsing.

### Brace expansion is all or nothing

No:

    {a,b}{        # what does the second { mean?
    {a,b}{1...3}  # 3 dots instead of 2

Yes:

    {a,b}\{
    {a,b}\{1...3\}

bash will do a **partial expansion** in the former cases, giving you `a{ b{`
and `a{1...3} b{1...3}`.

OSH considers them syntax errors and aborts all brace expansion, giving you
the same thing back: `{a,b}{` and `{a,b}{1...3}`.

### Brackets should be escaped within Character Classes

Don't use ambiguous syntax for a character class consisting of a single bracket
character.

No:

    echo [[]
    echo []]

Yes:

    echo [\[]
    echo [\]]


The ambiguous syntax is allowed when we pass globs through to `libc`, but it's
good practice to be explicit.

### [[ -v var ]] doesn't allow expressions

In bash, you can use `[[` with `-v` to test whether an array contains an entry:

    declare -a array=('' foo)
    if [[ -v array[1] ]]; then
      echo 'exists'
    fi  # => exists

Likewise for an associative array:

    declare -A assoc=([key]=value)
    if [[ -v assoc['key'] ]]; then
      echo 'exists'
    fi  # => exists

OSH currently treats these expressions as strings, so the exit status is 1
(`false`).

Workaround:

    if [[ "${assoc['key']:+exists}" ]]; then
      echo 'exists'
    fi  # => exists

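Note that `:+` is true only for entries that are set and non-empty. If empty entries should also count as existing, the `+` form without the colon is closer to what `-v` tests in bash. The sketch below shows this in bash, and it assumes that OSH evaluates `${array[i]+word}` the same way; the `found` variable is illustrative only.

```shell
declare -a array=('' foo)
declare -A assoc=([key]=value)

found=''
if [[ "${array[0]+exists}" ]]; then found+='array[0] '; fi   # set but empty
if [[ "${array[1]+exists}" ]]; then found+='array[1] '; fi
if [[ "${array[5]+exists}" ]]; then found+='array[5] '; fi   # never set
if [[ "${assoc['key']+exists}" ]]; then found+='assoc[key] '; fi
if [[ "${assoc['nope']+exists}" ]]; then found+='assoc[nope] '; fi

echo "$found"
```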
In YSH, you can use:

    var d = { key: 42 }
    if ('key' in d) {
      echo 'exists'
    }  # => exists

## Data Structures

### Arrays aren't split inside ${}

Most shells split the entries of arrays like `"$@"` and `"${a[@]}"` here:

    echo ${undef:-"$@"}

In OSH, omit the quotes if you want splitting:

    echo ${undef:-$@}

I think OSH is more consistent, but it disagrees with other shells.

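When a script has to behave identically across shells, one option is to avoid putting `"$@"` inside `${}` at all and to make the fallback explicit. This is a sketch of that idea rather than a recommendation from the OSH authors; `show_args` is a hypothetical function.

```shell
# Hypothetical function: print each argument, or a default when there are none.
show_args() {
  if [ "$#" -eq 0 ]; then
    set -- 'default value'
  fi
  printf '<%s>' "$@"   # "$@" on its own keeps each argument intact
  echo
}

with_args=$(show_args 'a b' c)
without_args=$(show_args)
echo "$with_args $without_args"
```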
### Values are tagged with types, not locations (`declare -i -a -A`)

Even though there's a large common subset, OSH and bash have a different model
for typed data.

- In OSH, **values** are tagged with types, which is how Python and JavaScript
  work.
- In bash, **cells** (locations for values) are tagged with types. Everything
  is a string, but in certain contexts, strings are treated as integers or as
  structured data.

In particular,

- The `-i` flag is a no-op in OSH. See [Shell Idioms > Remove Dynamic
  Parsing](shell-idioms.html#remove-dynamic-parsing) for alternatives to `-i`.
- The `-a` and `-A` flags behave differently. They pertain to the value, not
  the location.

For example, these two statements are different in bash, but the same in OSH:

    declare -A assoc     # unset cell that will LATER be an assoc array
    declare -A assoc=()  # empty associative array

In bash, you can tell the difference with `set -u`, but there's no difference
in OSH.

### Indexed and Associative arrays are distinct

Here is how you can create arrays in OSH, in a bash-compatible way:

    local indexed=(foo bar)
    local -a indexed=(foo bar)            # -a is redundant
    echo ${indexed[1]}                    # bar

    local assoc=(['one']=1 ['two']=2)
    local -A assoc=(['one']=1 ['two']=2)  # -A is redundant
    echo ${assoc['one']}                  # 1

In bash, the distinction between the two is blurry, with cases like this:

    local -A x=(foo bar)                  # -A disagrees with literal
    local -a y=(['one']=1 ['two']=2)      # -a disagrees with literal

These are disallowed in OSH.

Notes:

- The `=` keyword is useful for gaining an understanding of the data model.
- See the [Quirks](quirks.html) doc for details on how OSH uses this cleaner
  model while staying compatible with bash.

## Assignment builtins

The assignment builtins are `export`, `readonly`, `local`, and
`declare`/`typeset`. They're parsed in 2 ways:

- Statically: to avoid word splitting in `declare x=$y` when `$y` contains
  spaces. bash and other shells behave this way.
- Dynamically: to handle expressions like `declare $1` where `$1` is `a=b`

### `builtin declare x=$y` is a runtime error

This is because the special parsing of `x=$y` depends on the first word
`declare`.

### Args aren't split or globbed

In bash, you can do unusual things with args to assignment builtins:

    vars='a=b x=y'
    touch foo=bar.py spam=eggs.py

    declare $vars *.py  # assigns at least 4 variables
    echo $a             # b
    echo $x             # y
    echo $foo           # bar.py
    echo $spam          # eggs.py

In contrast, OSH doesn't split or glob args to assignment builtins. This is
more like the behavior of zsh.

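If a script really needs to assign from a space-separated list, one approach is to do the splitting in a `for` loop and pass each `NAME=VALUE` pair to `declare` as a single argument. This sketch runs in bash; it assumes the dynamic path described above (`declare $1` where `$1` is `a=b`) also accepts the quoted form in OSH.

```shell
vars='a=b x=y'

# Split the string in the loop, not in the arguments to declare.
for pair in $vars; do
  declare "$pair"   # one NAME=VALUE pair per call
done

echo "a=$a x=$x"
```

Inside a function, `declare` creates local variables, so this loop would need to live at the scope where the variables are used.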
## Pipelines

### Last pipeline part may run in shell process (zsh, bash `shopt -s lastpipe`)

In this pipeline, the builtin `read` is run in the shell process, not a child
process:

    $ echo hi | read x
    $ echo x=$x
    x=hi  # empty in bash unless shopt -s lastpipe

If the last part is an external command, there is no difference:

    $ ls | wc -l
    42

This is how zsh behaves, and how bash (sometimes) behaves with `shopt -s
lastpipe`.

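Code that must set a variable the same way under either pipeline model can avoid the pipeline entirely. The sketch below shows two common alternatives, a here doc feeding `read` and a plain command sub; the variable names are illustrative.

```shell
# read runs in the current shell because no pipeline is involved.
read -r x <<EOF
$(echo hi)
EOF

# For a single value, a command sub is often enough.
y=$(echo hi)

echo "x=$x y=$y"
```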
### Pipelines can't be suspended with Ctrl-Z

Because the last part may be the current shell process, the entire pipeline
can't be suspended.

OSH and zsh share this consequence of the `lastpipe` semantics.

In contrast, bash's `shopt -s lastpipe` is ignored in interactive shells.

### `${PIPESTATUS[@]}` is only set after an actual pipeline

This makes it easier to check compound status codes without worrying about them
being "clobbered".

Bash will set `${PIPESTATUS[@]}` on every command, regardless of whether it's a
pipeline.

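A pattern that works under both behaviors is to copy the array immediately after the pipeline, before any other command has a chance to overwrite it. A minimal sketch, with `saved` as a hypothetical variable name:

```shell
false | (exit 3) | true
saved=("${PIPESTATUS[@]}")   # copy right away, before running anything else

echo "saved: ${saved[*]}"
```

The copy holds one status per pipeline part, so later commands can inspect it at leisure.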
## More Differences at Runtime

### Alias expansion

Almost all "real" aliases should work in OSH. But these don't work:

    alias left='{'
    left echo hi; }

(cases #33-#34 in [spec/alias][])

or

    alias a=
    a (( var = 0 ))

Details on the OSH parsing model:

1. Your code is statically parsed into an abstract syntax tree, which contains
   many types of nodes.
2. `SimpleCommand` nodes are the only ones that are further alias-expanded.

For example, these result in `SimpleCommand` nodes:

- `ls -l`
- `read -n 1` (normally a builtin)
- `myfunc foo`

These don't:

- `x=42`
- `declare -r x=42`
- `break`, `continue`, `return`, `exit`: as explained above, these are
  keywords and not builtins.
- `{ echo one; echo two; }`
- `for`, `while`, `case`, functions, etc.

### Extended globs are more static like `mksh`, and have other differences

That is, in OSH and mksh, something like `echo *.@(cc|h)` is an extended glob.
But `echo $x`, where `$x` contains the pattern, is not.

For more details and differences, see the [Extended Glob
section](word-language.html#extended-glob) of the Word Language doc.

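When the set of alternatives is known in advance, a `case` statement with `|` expresses the same match without an extended glob, and without storing a pattern in a variable. This is a sketch only; `classify` is a hypothetical function and the file names are made up.

```shell
# Hypothetical function: the patterns are written literally in the source.
classify() {
  case $1 in
    *.cc | *.h) echo source ;;
    *) echo other ;;
  esac
}

first=$(classify main.cc)
second=$(classify notes.txt)
echo "$first $second"
```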
### Completion

The OSH completion API is mostly compatible with the bash completion API,
except that it moves the **responsibility for quoting** out of plugins and onto
the shell itself. Plugins should return candidates as `argv` entries, not
shell words.

See the [completion doc](completion.html) for details.

## Interactive Features

### History Substitution Language

The rules for history substitution like `!echo` are simpler. There are no
special cases to avoid clashes with `${!indirect}` and so forth.

TODO: Link to the history lexer.

<!--
TODO: we want to make history more statically parsed. Should test the ZSH
parser.
-->

## Links

- [OSH Spec Tests](../test/spec.wwz/survey/osh.html) run shell snippets with OSH and other
  shells to compare their behavior.

External:

- This list may seem long, but compare the list of differences in [Bash POSIX
  Mode](https://www.gnu.org/software/bash/manual/html_node/Bash-POSIX-Mode.html).
  That page tells you what `set -o posix` does in bash.


[spec/command-sub]: ../test/spec.wwz/command-sub.html
[spec/loop]: ../test/spec.wwz/loop.html
[spec/alias]: ../test/spec.wwz/alias.html