---
default_highlighter: oils-sh
---

Guide to Procs and Funcs
========================
7
8YSH has two major units of code: shell-like `proc`, and Python-like `func`.
9
10- Roughly speaking, procs are for commands and **I/O**, while funcs are for
11 pure **computation**.
12- Procs are often **big**, and may call **small** funcs. On the other hand,
13 it's possible, but rarer, for funcs to call procs.
14- You can write shell scripts **mostly** with procs, and perhaps a few funcs.
15
16This doc compares the two mechanisms, and gives rough guidelines.
17
18<!--
19See the blog for more conceptual background: [Oils is
20Exterior-First](https://www.oilshell.org/blog/2023/06/ysh-design.html).
21-->
22
23<div id="toc">
24</div>
25
26## Tip: Start Simple
27
28Before going into detail, here's a quick reminder that you don't have to use
29**either** procs or funcs. YSH is a language that scales both down and up.
30
31You can start with just a list of plain commands:
32
33 mkdir -p /tmp/dest
34 cp --verbose *.txt /tmp/dest
35
36Then copy those into procs as the script gets bigger:
37
38 proc build-app {
39 ninja --verbose
40 }
41
42 proc deploy {
43 mkdir -p /tmp/dest
44 cp --verbose *.txt /tmp/dest
45 }
46
47 build-app
48 deploy
49
50Then add funcs if you need pure computation:
51
    func isTestFile(name) {
      return (name => endsWith('_test.py'))
    }

    if (isTestFile('my_test.py')) {
      echo 'yes'
    }
59
60## At a Glance
61
62### Procs vs. Funcs
63
64This table summarizes the difference between procs and funcs. The rest of the
65doc will elaborate on these issues.
66
67<style>
68 thead {
69 background-color: #eee;
70 font-weight: bold;
71 }
72 table {
73 font-family: sans-serif;
74 border-collapse: collapse;
75 }
76
77 tr {
78 border-bottom: solid 1px;
79 border-color: #ddd;
80 }
81
82 td {
83 padding: 8px; /* override default of 5px */
84 }
85</style>
86
87<table>
88 <thead>
89 <tr>
90 <td></td>
91 <td>Proc</td>
92 <td>Func</td>
93 </tr>
94 </thead>
95
96 <tr>
97 <td>Design Influence</td>
98<td>
99
100Shell-like.
101
102</td>
103<td>
104
105Python- and JavaScript-like, but **pure**.
106
107</td>
108 </tr>
109
110 <tr>
111 <td>Shape</td>
112
113<td>
114
115Procs are shaped like Unix processes: with `argv`, an integer return code, and
116`stdin` / `stdout` streams.
117
118They're a generalization of Bourne shell "functions".
119
120</td>
121<td>
122
123Funcs are shaped like mathematical functions.
124
125</td>
126 </tr>
127
128 <tr>
129<td>
130
131Architectural Role ([Oils is Exterior First](https://www.oilshell.org/blog/2023/06/ysh-design.html))
132
133</td>
134<td>
135
136**Exterior**: processes and files.
137
138</td>
139
140<td>
141
142**Interior**: functions and garbage-collected data structures.
143
144</td>
145 </tr>
146
147 <tr>
148 <td>I/O</td>
149 <td>
150
151Procs may start external processes and pipelines. Can perform I/O anywhere.
152
153</td>
154 <td>
155
156Funcs need an explicit `value.IO` param to perform I/O.
157
158</td>
159 </tr>
160
161 <tr>
162 <td>Example Definition</td>
163<td>
164
165 proc print-max (; x, y) {
166 echo $[x if x > y else y]
167 }
168
169</td>
170<td>
171
172 func computeMax(x, y) {
173 return (x if x > y else y)
174 }
175
176</td>
177 </tr>
178
179 <tr>
180 <td>Example Call</td>
181<td>
182
183 print-max (3, 4)
184
185Procs can be put in pipelines:
186
187 print-max (3, 4) | tee out.txt
188
189</td>
190<td>
191
192 var m = computeMax(3, 4)
193
194Or throw away the return value, which is useful for functions that mutate:
195
196 call computeMax(3, 4)
197
198</td>
199 </tr>
200
201 <tr>
202 <td>Naming Convention</td>
203<td>
204
205`kebab-case`
206
207</td>
208<td>
209
210`camelCase`
211
212</td>
213 </tr>
214
215 <tr>
216<td>
217
218[Syntax Mode](command-vs-expression-mode.html) of call site
219
220</td>
221 <td>Command Mode</td>
222 <td>Expression Mode</td>
223 </tr>
224
225 <tr>
226 <td>Kinds of Parameters / Arguments</td>
227 <td>
228
2291. Word aka string
2301. Typed and Positional
2311. Typed and Named
2321. Block
233
234Examples shown below.
235
236</td>
237 <td>
238
2391. Positional
2401. Named
241
242(both typed)
243
244</td>
245 </tr>
246
247 <tr>
248 <td>Return Value</td>
249 <td>Integer status 0-255</td>
250 <td>
251
252Any type of value, e.g.
253
254 return ([42, {name: 'bob'}])
255
256</td>
257 </tr>
258
259 <tr>
260 <td>Interface Evolution</td>
261<td>
262
263**Slower**: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.
264
265</td>
266<td>
267
268**Faster**: Funcs may be refactored internally.
269
270</td>
271 </tr>
272
273 <tr>
274 <td>Parallelism?</td>
275<td>
276
277Procs can be parallel with:
278
279- shell constructs: pipelines, `&` aka `fork`
280- external tools and the [$0 Dispatch
281 Pattern](https://www.oilshell.org/blog/2021/08/xargs.html): xargs, make,
282 Ninja, etc.
283
284</td>
285<td>
286
287Funcs are inherently **serial**, unless wrapped in a proc.
288
289</td>
290 </tr>
291
292 <tr>
293 <td colspan=3 style="text-align: center; padding: 3em">More <code>proc</code> features ...</td>
294 </tr>
295
296 <tr>
297 <td>Kinds of Signature</td>
298 <td>
299
300Open `proc p {` or <br/>
301Closed `proc p () {`
302
303</td>
304 <td>-</td>
305 </tr>
306
307 <tr>
308 <td>Lazy Args</td>
309<td>
310
311 assert [42 === x]
312
313</td>
314 <td>-</td>
315 </tr>
316
317</table>
318
319### Func Calls and Defs
320
321Now that we've compared procs and funcs, let's look more closely at funcs.
322They're inherently **simpler**: they have 2 types of args and params, rather
323than 4.
324
325YSH argument binding is based on Julia, which has all the power of Python, but
326without the "evolved warts" (e.g. `/` and `*`).
327
328In general, with all the bells and whistles, func definitions look like:
329
330 # pos args and named args separated with ;
331 func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
332 return (len(rest_pos) + len(rest_named))
333 }
334
335Func calls look like:
336
337 # spread operator ... at call site
338 var pos_args = [3, 4]
339 var named_args = {foo: 'bar'}
340 var x = f(1, 2, ...pos_args; n1=43, ...named_args)
341
342Note that positional args/params and named args/params can be thought of as two
343"separate worlds".
344
345This table shows simpler, more common cases.
346
347
348<table>
349 <thead>
350 <tr>
351 <td>Args / Params</td>
352 <td>Call Site</td>
353 <td>Definition</td>
354 </tr>
355 </thead>
356
357 <tr>
358 <td>Positional Args</td>
359<td>
360
361 var x = myMax(3, 4)
362
363</td>
364<td>
365
366 func myMax(x, y) {
367 return (x if x > y else y)
368 }
369
370</td>
371 </tr>
372
373 <tr>
374 <td>Spread Pos Args</td>
375<td>
376
377 var args = [3, 4]
378 var x = myMax(...args)
379
380</td>
381<td>
382
383(as above)
384
385</td>
386 </tr>
387
388 <tr>
389 <td>Rest Pos Params</td>
390<td>
391
392 var x = myPrintf("%s is %d", 'bob', 30)
393
394</td>
395<td>
396
397 func myPrintf(fmt, ...args) {
398 # ...
399 }
400
401</td>
402 </tr>
403
404 <tr>
405 <td colspan=3 style="text-align: center; padding: 3em">...</td>
406 </tr>
407
410
411 <tr>
412 <td>Named Args</td>
413<td>
414
415 var x = mySum(3, 4, start=5)
416
417</td>
418<td>
419
420 func mySum(x, y; start=0) {
421 return (x + y + start)
422 }
423
424</td>
425 </tr>
426
427 <tr>
428 <td>Spread Named Args</td>
429<td>
430
    var opts = {start: 5}
    var x = mySum(3, 4; ...opts)
433
434</td>
435<td>
436
437(as above)
438
439</td>
440 </tr>
441
442 <tr>
443 <td>Rest Named Params</td>
444<td>
445
446 var x = f(start=5, end=7)
447
448</td>
449<td>
450
451 func f(; ...opts) {
452 if ('start' not in opts) {
453 setvar opts.start = 0
454 }
455 # ...
456 }
457
458</td>
459 </tr>
460
461</table>
462
463### Proc Calls and Defs
464
465Like funcs, procs have 2 kinds of typed args/params: positional and named.
466
467But they may also have **string aka word** args/params, and a **block**
468arg/param. (The block param is passed as a typed, positional arg, although
469this detail usually doesn't matter.)
470
471In general, a proc signature has 4 sections, like this:
472
473 proc p (
474 w1, w2, ...rest_word; # word params
475 p1, p2, ...rest_pos; # pos params
476 n1, n2, ...rest_named; # named params
477 block # block param
478 ) {
479 echo 'body'
480 }
481
482In general, a proc call looks like:
483
484 var pos_args = [3, 4]
485 var named_args = {foo: 'bar'}
486 p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
487 echo 'block'
488 }
489
490<!--
491- Block is really last positional arg: `cd /tmp { echo $PWD }`
492-->
493
494Some simpler examples:
495
496<table>
497 <thead>
498 <tr>
499 <td>Args / Params</td>
500 <td>Call Site</td>
501 <td>Definition</td>
502 </tr>
503 </thead>
504
505 <tr>
506 <td>Word args</td>
507<td>
508
509 my-cd /tmp
510
511</td>
512<td>
513
514 proc my-cd (dest) {
515 cd $dest
516 }
517
518</td>
519 </tr>
520
521 <tr>
522 <td>Rest Word Params</td>
523<td>
524
525 my-cd -L /tmp
526
527</td>
528<td>
529
530 proc my-cd (...flags) {
531 cd @flags
532 }

</td>
  </tr>

  <tr>
    <td>Spread Word Args</td>
<td>

    var flags = :| -L /tmp |
    my-cd @flags

</td>
<td>

(as above)

</td>
  </tr>
551
552 <tr>
553 <td colspan=3 style="text-align: center; padding: 3em">...</td>
554 </tr>
555
556 <tr>
557 <td>Typed Pos Arg</td>
558<td>
559
560 print-max (3, 4)
561
562</td>
563<td>
564
565 proc print-max ( ; x, y) {
566 echo $[x if x > y else y]
567 }
568
569</td>
570 </tr>
571
572 <tr>
573 <td>Typed Named Arg</td>
574<td>
575
576 print-max (3, 4, start=5)
577
578</td>
579<td>
580
581 proc print-max ( ; x, y; start=0) {
582 # ...
583 }
584
585</td>
586 </tr>
587
588 <tr>
589 <td colspan=3 style="text-align: center; padding: 3em">...</td>
590 </tr>
591
592
593
594 <tr>
595 <td>Block Argument</td>
596<td>
597
598 my-cd /tmp {
599 echo $PWD
600 echo hi
601 }
602
603</td>
604<td>
605
606 proc my-cd (dest; ; ; block) {
607 cd $dest (block)
608 }
609
610</td>
611 </tr>
612
613 <tr>
614 <td>All Four Kinds</td>
615<td>
616
617 p 'word' (42, verbose=true) {
618 echo $PWD
619 echo hi
620 }
621
622</td>
623<td>
624
625 proc p (w; myint; verbose=false; block) {
626 = w
627 = myint
628 = verbose
629 = block
630 }
631
632</td>
633 </tr>
634
635</table>
636
637## Common Features
638
639Let's recap the common features of procs and funcs.
640
641### Spread Args, Rest Params
642
643- Spread arg list `...` at call site
644- Rest params `...` at definition
645
646### The `error` builtin raises exceptions
647
648The `error` builtin is idiomatic in both funcs and procs:
649
650 func f(x) {
651 if (x <= 0) {
652 error 'Should be positive' (status=99)
653 }
654 }
655
656Tip: reserve such errors for **exceptional** situations. For example, an input
657string being invalid may not be uncommon, while a disk full I/O error is more
658exceptional.
659
660(The `error` builtin is implemented with C++ exceptions, which are slow in the
661error case.)
662
663### Out Params: `&myvar` is of type `value.Place`
664
665Out params are more common in procs, because they don't have a typed return
666value.
667
668 proc p ( ; out) {
669 call out->setValue(42)
670 }
671 var x
672 p (&x)
673 echo "x set to $x" # => x set to 42
674
675But they can also be used in funcs:
676
677 func f (out) {
678 call out->setValue(42)
679 }
680 var x
681 call f(&x)
682 echo "x set to $x" # => x set to 42
683
684Observation: procs can do everything funcs can. But you may want the purity
685and familiar syntax of a `func`.
686
687---
688
689Design note: out params are a nicer way of doing what bash does with `declare
690-n` aka `nameref` variables. They don't rely on [dynamic
691scope]($xref:dynamic-scope).
692
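For comparison, here's a rough bash sketch of the `nameref` style that out params replace. The names `set_answer` and `out_ref` are made up for illustration, and `declare -n` needs bash 4.3 or later:

```shell
# Hypothetical example; requires bash 4.3+ for declare -n.
set_answer() {
  declare -n out_ref=$1   # out_ref becomes an alias for the variable named by $1
  out_ref=42              # assigns through the alias to the caller's variable
}

x=''
set_answer x
echo "x set to $x"        # x set to 42
```

The nameref is resolved by *name* at runtime. If the caller's variable also happened to be called `out_ref`, bash would warn about a circular name reference. Out params avoid this kind of collision because they don't rely on dynamic scope.
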
693## Proc-Only Features
694
695Procs have some features that funcs don't have.
696
697### Lazy Arg Lists `where [x > 10]`
698
699A lazy arg list is implemented with `shopt --set parse_bracket`, and is syntax
700sugar for an unevaluated `value.Expr`.
701
702Longhand:
703
    var my_expr = ^[42 === x]  # value of type Expr
    assert (my_expr)
706
707Shorthand:
708
709 assert [42 === x] # equivalent to the above
710
711### Open Proc Signatures bind `argv`
712
713TODO: Implement new `ARGV` semantics.
714
715When a proc signature omits `()`, it's called **"open"** because the caller can
716pass "extra" arguments:
717
718 proc my-open {
719 write 'args are' @ARGV
720 }
721 # All valid:
722 my-open
723 my-open 1
724 my-open 1 2
725
726Stricter closed procs:
727
728 proc my-closed (x) {
729 write 'arg is' $x
730 }
731 my-closed # runtime error: missing argument
732 my-closed 1 # valid
733 my-closed 1 2 # runtime error: too many arguments
734
735
An "open" proc is nearly identical to a shell function:
737
738 shfunc() {
739 write 'args are' @ARGV
740 }
741
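To make the comparison concrete, here's a sketch in plain POSIX-style shell. The names `show_args` and `one_arg` are hypothetical. The first function accepts any number of args, like an open proc. The second checks `$#` by hand, which is roughly the work a closed signature does for you:

```shell
# Like an open proc: no signature, so extra args are simply accepted.
show_args() {
  echo "count=$#"
  for arg in "$@"; do
    echo "arg=$arg"
  done
}

# Emulating a closed signature means checking the arity by hand.
one_arg() {
  if [ "$#" -ne 1 ]; then
    echo "one_arg: expected 1 argument, got $#" >&2
    return 2
  fi
  echo "arg is $1"
}

show_args          # count=0
show_args 1 2      # count=2, then arg=1 and arg=2
one_arg 1          # arg is 1
one_arg 1 2 || echo "failed with status $?"   # failed with status 2
```
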
742## Usage Notes
743
744### 3 Ways to Return a Value
745
746Let's review the recommended ways to "return" a value:
747
7481. `return (x)` in a `func`.
749 - The parentheses are required because expressions like `(x + 1)` should
750 look different than words.
7511. Pass a `value.Place` instance to a proc or func.
752 - That is, out param `&out`.
7531. Print to stdout in a `proc`
754 - Capture it with command sub: `$(myproc)`
755 - Or with `read`: `myproc | read --all; echo $_reply`
756
757Obsolete ways of "returning":
758
7591. Using `declare -n` aka `nameref` variables in bash.
7601. Relying on [dynamic scope]($xref:dynamic-scope) in POSIX shell.
761
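As a point of comparison, here's how the stdout approach and the dynamic scope approach look in plain shell. This is a sketch with made-up names (`compute_max`, `set_result`, `caller`). `local` isn't strictly POSIX, but bash and dash both support it:

```shell
# Recommended style: "return" by printing to stdout, capture with command sub.
compute_max() {
  if [ "$1" -gt "$2" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}
m=$(compute_max 3 4)
echo "max is $m"    # max is 4

# Obsolete style: the callee assigns to a variable it never declared.
# Under dynamic scope, the assignment lands in the caller's local.
set_result() {
  result=42
}
caller() {
  local result=''
  set_result
  echo "result is $result"
}
caller              # result is 42
```

The second style works, but `set_result` silently depends on what its caller named a variable. That hidden coupling is what out params are meant to replace.
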
762### Procs Compose in Pipelines / "Bernstein Chaining"
763
764Some YSH users may tend toward funcs because they're more familiar. But shell
765composition with procs is very powerful!
766
767They have at least two kinds of composition that funcs don't have.
768
769See #[shell-the-good-parts]($blog-tag):
770
7711. [Shell Has a Forth-Like
772 Quality](https://www.oilshell.org/blog/2017/01/13.html) - Bernstein
773 chaining.
7741. [Pipelines Support Vectorized, Point-Free, and Imperative
775 Style](https://www.oilshell.org/blog/2017/01/15.html) - the shell can
776 transparently run procs as elements of pipelines.
777
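Here's a small sketch of both kinds of composition, written as plain shell functions so it also runs under bash. The helper names (`in_dir`, `quietly`, `upper`) are hypothetical; in YSH you'd typically write them as procs:

```shell
# Bernstein chaining: each link consumes its own leading args, then runs
# the rest of argv as a command.
in_dir() {
  ( cd "$1" && shift && "$@" )
}
quietly() {
  "$@" 2>/dev/null
}

in_dir / quietly pwd        # prints /

# Pipelines: a function can be a pipeline element, reading stdin and
# writing stdout like any external process.
upper() {
  tr 'a-z' 'A-Z'
}
echo 'hello' | upper        # prints HELLO
```

Note that `in_dir` and `quietly` know nothing about each other. They compose only through `argv`, which is what makes the chain easy to extend.
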
778<!--
779
780In summary:
781
782* func signatures look like JavaScript, Julia, and Go.
783 * named and positional are separated with `;` in the signature.
784 * The prefix `...` "spread" operator takes the place of Python's `*args` and `**kwargs`.
785 * There are optional type annotations
786* procs are like shell functions
787 * but they also allow you to name parameters, and throw errors if the arity
788is wrong.
789 * and they take blocks.
790
791One issue is that procs take block arguments but not funcs. This is something
792of a syntactic issue. But I don't think it's that high priority.
793
794-->
795
796## Summary
797
798YSH is influenced by both shell and Python, so it has both procs and funcs.
799
800Many programmers will gravitate towards funcs because they're familiar, but
801procs are more powerful and shell-like.
802
Get the most out of your YSH programs by learning to use procs!
804
805## Appendix
806
807### Implementation Details
808
Procs and funcs both have these concerns:
810
8111. Evaluation of default args at definition time.
8121. Evaluation of actual args at the call site.
8131. Arg-Param binding for builtin functions, e.g. with `typed_args.Reader`.
8141. Arg-Param binding for user-defined functions.
815
816So the implementation can be thought of as a **2 &times; 4 matrix**, with some
817code shared. This code is mostly in [ysh/func_proc.py]($oils-src).
818
819### Related
820
821- [Block Literals](block-literals.html)
822
823<!--
824
825TODO: any reference topics?
826
827-->
828
829<!--
830OK we're getting close here -- #**language-design>Unifying Proc and Func Params**
831
832I think we need to write a quick guide first, not a reference
833
834
835It might have some **tables**
836
It might mention concrete use cases like the **flag parser** -- #**oil-dev>Progress on argparse**
838
839
840### Diff-based explanation
841
842- why not Python -- because of `/` and `*` special cases
843- Julia influence
844- lazy args for procs `where` filters and `awk`
845- out Ref parameters are for "returning" without printing to stdout
846
847#**language-design>N ways to "return" a value**
848
849
850- What does shell have?
851 - it has blocks, e.g. with redirects
852 - it has functions without params -- only named params
853
854
855- Ruby influence -- rich DSLs
856
857
858So I think you can say we're a mix of
859
860- shell
861- Python
862- Julia (mostly subsumes Python?)
863- Ruby
864
865
### Implementation-based explanation
867
868- ASDL schemas -- #**oil-dev>Good Proc/Func refactoring**
869
870
871### Big Idea: procs are for I/O, funcs are for computation
872
873We may want to go full in on this idea with #**language-design>func evaluator without redirects and $?**
874
875
876### Very Basic Advice, Up Front
877
878
879Done with #**language-design>value.Place, & operator, read builtin**
880
881Place works with both func and proc
882
883
884### Bump
885
886I think this might go in the backlog - #**blog-ideas**
887
888
889#**language-design>Simplify proc param passing?**
890
891-->
892
893
894
895<!-- vim sw=2 -->