---
default_highlighter: oils-sh
---

YSH vs. Shell Idioms
====================

This is an informal, lightly-organized list of recommended idioms for the
[YSH]($xref) language. Each section has snippets labeled *No* and *Yes*.

- Use the *Yes* style when you want to write in YSH, and don't care about
  compatibility with other shells.
- The *No* style is discouraged in new code, but YSH will run it. The [OSH
  language]($xref:osh-language) is compatible with
  [POSIX]($xref:posix-shell-spec) and [bash]($xref).

[J8 Notation]: j8-notation.html

<!-- cmark.py expands this -->
<div id="toc">
</div>

## Use [Simple Word Evaluation](simple-word-eval.html) to Avoid "Quoting Hell"

### Substitute Variables

No:

    local x='my song.mp3'
    ls "$x"  # quotes required to avoid mangling

Yes:

    var x = 'my song.mp3'
    ls $x  # no quotes needed

### Splice Arrays

No:

    local myflags=( --all --long )
    ls "${myflags[@]}" "$@"

Yes:

    var myflags = :| --all --long |
    ls @myflags @ARGV

### Explicitly Split, Glob, and Omit Empty Args

YSH doesn't split arguments after variable expansion.

No:

    local packages='python-dev gawk'
    apt install $packages

Yes:

    var packages = 'python-dev gawk'
    apt install @[split(packages)]

Even better:

    var packages = :| python-dev gawk |  # array literal
    apt install @packages                # splice array

---

YSH doesn't glob after variable expansion.

No:

    local pat='*.py'
    echo $pat

Yes:

    var pat = '*.py'
    echo @[glob(pat)]   # explicit call

---

YSH doesn't omit unquoted words that evaluate to the empty string.

No:

    local e=''
    cp $e other $dest            # cp gets 2 args, not 3, in sh

Yes:

    var e = ''
    cp @[maybe(e)] other $dest   # explicit call

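To see the implicit behavior that YSH opts out of, here is a minimal plain-shell sketch that counts the arguments a command receives after unquoted expansion. The helper name `count_args` is made up for illustration, and the default `IFS` is assumed.

```shell
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

packages='python-dev gawk'
split_count=$(count_args $packages)       # unquoted: split into 2 words
quoted_count=$(count_args "$packages")    # quoted: stays 1 word

e=''
omit_count=$(count_args $e other dest)    # unquoted empty word vanishes: 2 args
kept_count=$(count_args "$e" other dest)  # quoted empty word is kept: 3 args

echo "$split_count $quoted_count $omit_count $kept_count"
```

In YSH, `$packages` and `$e` always behave like the quoted forms, so splitting and omission only happen when you ask for them with `split()` and `maybe()`.
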
### Iterate a Number of Times (Split Command Sub)

No:

    local n=3
    for x in $(seq $n); do  # No implicit splitting of unquoted words in YSH
      echo $x
    done

OK:

    var n = 3
    for x in @(seq $n) {   # Explicit splitting
      echo $x
    }

Better:

    var n = 3
    for x in (1 .. n+1) {  # Range, avoids external program
      echo $x
    }

Note that `{1..3}` works in bash and YSH, but the numbers must be constant.

## Avoid Ad Hoc Parsing and Splitting

In other words, avoid *groveling through backslashes and spaces* in shell.

Instead, emit and consume [J8 Notation]($xref:j8-notation):

- J8 strings are [JSON]($xref) strings, with an upgrade for byte string
  literals
- [JSON8]($xref) is [JSON]($xref), with this same upgrade
- [TSV8]($xref) is TSV with this upgrade (not yet implemented)

Custom parsing and serializing should be limited to "the edges" of your YSH
programs.

### More Strategies For Structured Data

- **Wrap** and Adapt External Tools. Parse their output, and emit [J8 Notation][].
  - These can be one-off, "bespoke" wrappers in your program, or maintained
    programs. Use the `proc` construct and `flagspec`!
  - Example: [uxy](https://github.com/sustrik/uxy) wrappers.
  - TODO: Examples written in YSH and in other languages.
- **Patch** Existing Tools.
  - Enhance GNU grep, etc. to emit [J8 Notation][]. Add a
    `--j8` flag.
- **Write Your Own** Structured Versions.
  - For example, you can write a structured subset of `ls` in Python with
    little effort.

<!--
  ls -q and -Q already exist, but --j8 or --tsv8 is probably fine
-->

## The `write` Builtin Is Simpler Than `printf` and `echo`

### Write an Arbitrary Line

No:

    printf '%s\n' "$mystr"

Yes:

    write -- $mystr

The `write` builtin accepts `--` so it doesn't confuse flags and args.

### Write Without a Newline

No:

    echo -n "$mystr"  # breaks if mystr is -e

Yes:

    write --end '' -- $mystr
    write -n -- $mystr  # -n is an alias for --end ''

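The `echo -n` problem can be sketched in plain shell: when the data looks like a flag, `echo` may consume it as an option. What `echo` does here varies between shells (bash parses `-e` as an option, while some other shells print it), so this sketch only relies on the `printf` result. The variable names are illustrative.

```shell
mystr='-e'

# In bash, -e is parsed as an option here, so nothing is printed.
# Other shells may print "-e"; the result is not portable.
echo_out=$(echo -n "$mystr")

# printf never treats the data as a flag, because it is an argument to %s.
printf_out=$(printf '%s' "$mystr")

printf 'echo gave [%s], printf gave [%s]\n' "$echo_out" "$printf_out"
```
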
### Write an Array of Lines

    var myarray = :| one two three |
    write -- @myarray

## New Long Flags on the `read` builtin

### Read a Line

No:

    read line          # Mangles your backslashes!

Better:

    read -r line       # Still messes with leading and trailing whitespace

    IFS= read -r line  # OK, but doesn't work in YSH

Yes:

    read --raw-line    # Gives you the line, without trailing \n

(Note that `read --raw-line` is still an unbuffered read, which means it slowly
reads a byte at a time. We plan to add buffered reads as well.)

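The three shell variants above differ in ways that are easy to miss. This plain-shell sketch feeds the same line to each of them and brackets the result so whitespace is visible. The variable names are illustrative, and the default `IFS` is assumed.

```shell
input='  a\tb  '   # a literal backslash followed by t, padded with spaces

# Plain read: drops the backslash and strips the surrounding whitespace.
plain=$(printf '%s\n' "$input" | { read line; printf '[%s]' "$line"; })

# read -r: keeps the backslash, but still strips surrounding whitespace.
raw=$(printf '%s\n' "$input" | { read -r line; printf '[%s]' "$line"; })

# IFS= read -r: keeps the line exactly as written.
exact=$(printf '%s\n' "$input" | { IFS= read -r line; printf '[%s]' "$line"; })

printf '%s\n' "$plain" "$raw" "$exact"
```

Only the last form round-trips the line, which is the behavior `read --raw-line` gives you without the `IFS=` incantation.
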
### Read a Whole File

No:

    read -d ''           # harder to read, easy to forget -r

Yes:

    read --all           # sets $_reply
    read --all (&myvar)  # sets $myvar

### Read Lines of a File

No:

    # The IFS= idiom doesn't work in YSH, because of dynamic scope!
    while IFS= read -r line; do
      echo $line
    done

Yes:

    while read --raw-line {
      echo $_reply
    }
    # this reads a byte at a time, unbuffered, like shell

Yes:

    for line in (io.stdin) {
      echo $line
    }
    # this reads buffered lines, which is much faster

### Read a Number of Bytes

No:

    read -n 3            # slow because it respects -d delim
                         # also strips whitespace

Better:

    read -N 3            # good behavior, but easily confused with -n

Yes:

    read --num-bytes 3           # sets $_reply
    read --num-bytes 3 (&myvar)  # sets $myvar

### Read Until `\0` (consume `find -print0`)

No:

    # Obscure syntax that bash accepts, but not other shells
    read -r -d '' myvar

Yes:

    read -0 (&myvar)

## YSH Enhancements to Builtins

### Use `shopt` Instead of `set`

Using a single builtin for all options makes scripts easier to read:

Discouraged:

    set -o errexit
    shopt -s dotglob

Idiomatic:

    shopt --set errexit
    shopt --set dotglob

(As always, `set` can be used when you care about compatibility with other
shells.)

### Use `:` When Mentioning Variable Names

YSH accepts this optional "pseudo-sigil" to make code more explicit.

No:

    read -0 record < file.bin
    echo $record

Yes:

    read -0 (&record) < file.bin
    echo $record

### Consider Using `--long-flags`

Easier to write:

    test -d /tmp
    test -d / && test -f /vmlinuz

    shopt -u extglob

Easier to read:

    test --dir /tmp
    test --dir / && test --file /vmlinuz

    shopt --unset extglob

## Use Blocks to Save and Restore Context

### Do Something In Another Directory

No:

    ( cd /tmp; echo $PWD )  # subshell is unnecessary (and limited)

No:

    pushd /tmp
    echo $PWD
    popd

Yes:

    cd /tmp {
      echo $PWD
    }

### Batch I/O

No:

    echo 1 > out.txt
    echo 2 >> out.txt  # appending is less efficient
                       # because of the extra open() and close()

No:

    { echo 1
      echo 2
    } > out.txt

Yes:

    fopen > out.txt {
      echo 1
      echo 2
    }

The `fopen` builtin is syntactic sugar -- it lets you see redirects before the
code that uses them.

### Temporarily Set Shell Options

No:

    set +o errexit
    myfunc  # without error checking
    set -o errexit

Yes:

    shopt --unset errexit {
      myfunc
    }

### Use the `forkwait` builtin for Subshells, not `()`

No:

    ( cd /tmp; rm *.sh )

Yes:

    forkwait {
      cd /tmp
      rm *.sh
    }

Better:

    cd /tmp {  # no process created
      rm *.sh
    }

### Use the `fork` builtin for async, not `&`

No:

    myfunc &

    { sleep 1; echo one; sleep 2; } &

Yes:

    fork { myfunc }

    fork { sleep 1; echo one; sleep 2 }

## Use Procs (Better Shell Functions)

### Use Named Parameters Instead of `$1`, `$2`, ...

No:

    f() {
      local src=$1
      local dest=${2:-/tmp}

      cp "$src" "$dest"
    }

Yes:

    proc f(src, dest='/tmp') {   # Python-like default values
      cp $src $dest
    }

### Use Named Varargs Instead of `"$@"`

No:

    f() {
      local first=$1
      shift

      echo $first
      echo "$@"
    }

Yes:

    proc f(first, @rest) {  # @ means "the rest of the arguments"
      write -- $first
      write -- @rest        # @ means "splice this array"
    }

You can also use the implicit `ARGV` variable:

    proc p {
      cp -- @ARGV /tmp
    }

### Use "Out Params" instead of `declare -n`

Out params are one way to "return" values from a `proc`.

No:

    f() {
      local in=$1
      local -n out=$2

      out=PREFIX-$in
    }

    myvar='init'
    f zzz myvar         # assigns myvar to 'PREFIX-zzz'

Yes:

    proc f(in, :out) {  # : is an out param, i.e. a string "reference"
      setref out = "PREFIX-$in"
    }

    var myvar = 'init'
    f zzz :myvar        # assigns myvar to 'PREFIX-zzz'.
                        # colon is required

### Note: Procs Don't Mess With Their Callers

That is, [dynamic scope]($xref:dynamic-scope) is turned off when procs are
invoked.

Here's an example of shell functions reading variables in their caller:

    bar() {
      echo $foo_var  # looks up the stack
    }

    foo() {
      foo_var=x
      bar
    }

    foo

In YSH, you have to pass params explicitly:

    proc bar {
      echo $foo_var  # error, not defined
    }

Shell functions can also **mutate** variables in their caller! But procs can't
do this, which makes code easier to reason about.

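The mutation case can be sketched in plain shell. The function and variable names follow the example above. Note that `local` is not part of POSIX, but bash and dash both provide it, and that is an assumption of this sketch.

```shell
bar() {
  foo_var='changed by bar'   # no declaration: assigns to the caller's local
}

foo() {
  local foo_var=x            # local to foo, yet visible to anything foo calls
  bar
  echo "$foo_var"            # foo sees the value that bar wrote
}

result=$(foo)
printf '%s\n' "$result"
```

Nothing in the body of `foo` hints that `bar` can change `foo_var`, which is the kind of action at a distance that procs rule out.
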
## Use Modules

YSH has a few lightweight features that make it easier to organize code into
files. It doesn't have "namespaces".

### Relative Imports

Suppose we are running `bin/mytool`, and we want `BASE_DIR` to be the root of
the repository so we can do a relative import of `lib/foo.sh`.

No:

    # All of these are common idioms, with caveats
    BASE_DIR=$(dirname $0)/..

    BASE_DIR=$(dirname ${BASH_SOURCE[0]})/..

    BASE_DIR=$(cd $(dirname $0)/.. && pwd)

    BASE_DIR=$(dirname $(dirname $(readlink -f $0)))

    source $BASE_DIR/lib/foo.sh

Yes:

    const BASE_DIR = "$_this_dir/.."

    source $BASE_DIR/lib/foo.sh

    # Or simply:
    source $_this_dir/../lib/foo.sh

The value of `_this_dir` is the directory that contains the currently executing
file.

### Include Guards

No:

    # libfoo.sh
    if test -z "$__LIBFOO_SH"; then
      return
    fi
    __LIBFOO_SH=1

Yes:

    # libfoo.sh
    module libfoo.sh || return 0

### Taskfile Pattern

No:

    deploy() {
      echo ...
    }
    "$@"

Yes:

    proc deploy() {
      echo ...
    }
    runproc @ARGV  # gives better error messages

## Error Handling

[YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html) once and
for all! Here's a comprehensive list of error handling idioms.

### Don't Use `&&` Outside of `if` / `while`

Stopping at the first failed command is implicit, because `errexit` is on in
YSH.

No:

    mkdir /tmp/dest && cp foo /tmp/dest

Yes:

    mkdir /tmp/dest
    cp foo /tmp/dest

It also avoids the *Trailing `&&` Pitfall* mentioned at the end of the [error
handling](error-handling.html) doc.

### Ignore an Error

No:

    ls /bad || true  # OK because ls is external
    myfunc || true   # suffers from the "Disabled errexit Quirk"

Yes:

    try { ls /bad }
    try { myfunc }

### Retrieve A Command's Status When `errexit` is On

No:

    # set -e is enabled earlier

    set +e
    mycommand  # this ignores errors when mycommand is a function
    status=$?  # save it before it changes
    set -e

    echo $status

Yes:

    try {
      mycommand
    }
    echo $[_error.code]

### Does a Builtin Or External Command Succeed?

These idioms are OK in both shell and YSH (shown here with YSH braces; in
shell, use `then` and `fi`):

    if ! cp foo /tmp {
      echo 'error copying'  # any non-zero status
    }

    if ! test -d /bin {
      echo 'not a directory'
    }

To be consistent with the idioms below, you can also write them like this:

    try {
      cp foo /tmp
    }
    if failed {  # shortcut for (_error.code !== 0)
      echo 'error copying'
    }

### Does a Function Succeed?

When the command is a shell function, you shouldn't use `if myfunc` directly.
This is because shell has the *Disabled `errexit` Quirk*, which is detected by
YSH `strict_errexit`.

**No**:

    if myfunc; then  # errors not checked in body of myfunc
      echo 'success'
    fi

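The quirk behind this *No* example can be reproduced in plain shell. In the sketch below, the body of `myfunc` is made up for illustration: its first command fails, yet the function keeps running and the `if` reports success, so the output is `kept going` followed by `success`.

```shell
myfunc() {
  false              # fails; with errexit this would normally abort
  echo 'kept going'  # still runs: errexit is ignored inside a condition
}

# Run in a command substitution so that `set -e` doesn't leak out.
out=$(
  set -e
  if myfunc; then
    echo 'success'   # reached, because myfunc's last command succeeded
  fi
)
printf '%s\n' "$out"
```
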
**Yes**. The *`$0` Dispatch Pattern* is a workaround that works in all shells.

    if $0 myfunc; then  # invoke a new shell
      echo 'success'
    fi

    "$@"  # Run the function $1 with args $2, $3, ...

**Yes**. The YSH `try` builtin sets the special `_error` variable and returns
`0`.

    try {
      myfunc  # doesn't abort
    }
    if (_error.code === 0) {
      echo 'success'
    }

### Does a Pipeline Succeed?

No:

    if ps | grep python; then
      echo 'found'
    fi

This is technically correct when `pipefail` is on, but it's impossible for
YSH `strict_errexit` to distinguish it from `if myfunc | grep python` ahead
of time (the ["meta" pitfall](error-handling.html#the-meta-pitfall)). If you
know what you're doing, you can disable `strict_errexit`.

Yes:

    try {
      ps | grep python
    }
    if (_error.code === 0) {
      echo 'found'
    }

    # You can also examine the status of each part of the pipeline
    if (_pipeline_status[0] !== 0) {
      echo 'ps failed'
    }

### Does a Command With Process Subs Succeed?

Similar to the pipeline example above:

No:

    if ! comm <(sort left.txt) <(sort right.txt); then
      echo 'error'
    fi

Yes:

    try {
      comm <(sort left.txt) <(sort right.txt)
    }
    if failed {
      echo 'error'
    }

    # You can also examine the status of each process sub
    if (_process_sub_status[0] !== 0) {
      echo 'first process sub failed'
    }

(I used `comm` in this example because it doesn't have a true / false / error
status like `diff`.)

### Handle Errors in YSH Expressions

    try {
      var x = 42 / 0
      echo "result is $[42 / 0]"
    }
    if failed {
      echo 'divide by zero'
    }

### Test Boolean Statuses, like `grep`, `diff`, `test`

The YSH `boolstatus` builtin distinguishes **error** from **false**.

**No**, this is subtly wrong. `grep` has 3 different kinds of exit status: 0
for a match, 1 for no match, and greater than 1 for an error.

    if grep 'class' *.py {
      echo 'found'               # status 0 means found
    } else {
      echo 'not found OR ERROR'  # any non-zero status
    }

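The three kinds of status are easy to observe in plain shell. This sketch assumes `mktemp` is available (it is common but not POSIX), and the variable names are illustrative. The `|| var=$?` form records a failing status without tripping `errexit`.

```shell
tmp=$(mktemp)
echo 'class Foo' > "$tmp"

found=0;   grep -q 'class'   "$tmp" || found=$?     # match: status 0
missing=0; grep -q 'nomatch' "$tmp" || missing=$?   # no match: status 1

# A nonexistent path makes grep report an error rather than "not found".
err=0;     grep -q 'class' "$tmp.missing" 2>/dev/null || err=$?

rm -f "$tmp"
printf 'found=%s missing=%s error=%s\n' "$found" "$missing" "$err"
```
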
**Yes**. `boolstatus` aborts the program if `grep` doesn't return 0 or 1.

    if boolstatus grep 'class' *.py {  # may abort
      echo 'found'               # status 0 means found
    } else {
      echo 'not found'           # status 1 means not found
    }

More flexible style:

    try {
      grep 'class' *.py
    }
    case (_error.code) {
      (0)    { echo 'found' }
      (1)    { echo 'not found' }
      (else) { echo 'fatal' }
    }

## Use YSH Expressions, Initializations, and Assignments (var, setvar)

### Initialize and Assign Strings and Integers

No:

    local mystr=foo
    mystr='new value'

    local myint=42  # still a string in shell

Yes:

    var mystr = 'foo'
    setvar mystr = 'new value'

    var myint = 42  # a real integer

### Expressions on Integers

No:

    x=$(( 1 + 2*3 ))
    (( x = 1 + 2*3 ))

Yes:

    setvar x = 1 + 2*3

### Mutate Integers

No:

    (( i++ ))  # interacts poorly with errexit
    i=$(( i+1 ))

Yes:

    setvar i += 1  # like Python, with a keyword

### Initialize and Assign Arrays

Arrays in YSH look like `:| my array |` and `['my', 'array']`.

No:

    local -a myarray=(one two three)
    myarray[3]='THREE'

Yes:

    var myarray = :| one two three |
    setvar myarray[3] = 'THREE'

    var same = ['one', 'two', 'three']
    var typed = [1, 2, true, false, null]

### Initialize and Assign Dicts

Dicts in YSH look like `{key: 'value'}`.

No:

    local -A myassoc=(['key']=value ['k2']=v2)
    myassoc['key']=V

Yes:

    # keys don't need to be quoted
    var myassoc = {key: 'value', k2: 'v2'}
    setvar myassoc['key'] = 'V'

### Get Values From Arrays and Dicts

No:

    local x=${a[i-1]}
    x=${a[i]}

    local y=${A['key']}

Yes:

    var x = a[i-1]
    setvar x = a[i]

    var y = A['key']

### Conditions and Comparisons

No:

    if (( x > 0 )); then
      echo 'positive'
    fi

Yes:

    if (x > 0) {
      echo 'positive'
    }

### Substituting Expressions in Words

No:

    echo flag=$((1 + a[i] * 3))  # C-like arithmetic

Yes:

    echo flag=$[1 + a[i] * 3]    # Arbitrary YSH expressions

    # Possible, but a local var might be more readable
    echo flag=$['1' if x else '0']

## Use [Egg Expressions](eggex.html) instead of Regexes

### Test for a Match

No:

    local pat='[[:digit:]]+'
    if [[ $x =~ $pat ]]; then
      echo 'number'
    fi

Yes:

    if (x ~ /digit+/) {
      echo 'number'
    }

Or extract the pattern:

    var pat = / digit+ /
    if (x ~ pat) {
      echo 'number'
    }

### Extract Submatches

No:

    if [[ $x =~ foo-([[:digit:]]+) ]]; then
      echo "${BASH_REMATCH[1]}"  # first submatch
    fi

Yes:

    if (x ~ / 'foo-' <capture d+> /) {  # <> is capture
      echo $[_group(1)]                 # first submatch
    }

## Glob Matching

No:

    if [[ $x == *.py ]]; then
      echo 'Python'
    fi

Yes:

    if (x ~~ '*.py') {
      echo 'Python'
    }

No:

    case $x in
      *.py)
        echo Python
        ;;
      *.sh)
        echo Shell
        ;;
    esac

Yes (purely a style preference):

    case $x {          # curly braces
      (*.py)           # balanced parens
        echo 'Python'
        ;;
      (*.sh)
        echo 'Shell'
        ;;
    }

## TODO

### Distinguish Between Variables and Functions

- `$RANDOM` vs. `random()`
- `LANG=C` vs. `shopt --setattr LANG=C`

## Related Documents

- [Shell Language Idioms](shell-idioms.html). This advice applies to shells
  other than YSH.
- [What Breaks When You Upgrade to YSH](upgrade-breakage.html). Shell constructs that YSH
  users should avoid.
- [YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html). YSH fixes the
  flaky error handling in POSIX shell and bash.
- TODO: Go through more of the [Pure Bash
  Bible](https://github.com/dylanaraps/pure-bash-bible). YSH provides
  alternatives for such quirky syntax.
