1 | ---
|
2 | default_highlighter: oils-sh
|
3 | ---
|
4 |
|
5 | Variable Declaration, Mutation, and Scope
|
6 | =========================================
|
7 |
|
8 | This doc addresses these questions:
|
9 |
|
10 | - How do variables behave in YSH?
|
11 | - What are some practical guidelines for using them?
|
12 |
|
13 | <div id="toc">
|
14 | </div>
|
15 |
|
16 | ## YSH Design Goals
|
17 |
|
18 | YSH is a graceful upgrade to shell, and the behavior of variables follows from
|
19 | that philosophy.
|
20 |
|
21 | - OSH implements shell-compatible behavior.
|
22 | - YSH enhances shell with **new features** like expressions over typed data,
|
23 | which will be familiar to Python and JavaScript programmers.
|
24 | - It's a **stricter** language.
|
25 | - Procs (shell functions) are self-contained and modular. They're
|
26 | understandable by reading their signature.
|
27 | - We removed [dynamic scope]($xref:dynamic-scope). This mechanism isn't
|
28 | familiar to most programmers, and may cause accidental mutation (bugs).
|
29 | - YSH has variable **declarations** like JavaScript, which can prevent
|
30 | trivial bugs.
|
31 | - Even though YSH is stricter, it should still be convenient to use
|
32 | interactively.
|
33 |
|
34 | ## Keywords Are More Consistent and Powerful Than Builtins
|
35 |
|
36 | YSH has 5 keywords that affect shell variables. Unlike shell builtins, they're
|
37 | statically-parsed, and take dynamically-typed **expressions** on the right.
|
38 |
|
39 | ### Declare With `var` and `const`
|
40 |
|
41 | It looks like JavaScript:
|
42 |
|
43 | var name = 'Bob'
|
44 | const age = (20 + 1) * 2
|
45 |
|
46 | echo "$name is $age years old" # Bob is 42 years old
|
47 |
|
48 | Note that `const` is enforced by a dynamic check. It's meant to be used at the
|
49 | top level only, not within `proc` or `func`.
|
50 |
|
51 | const age = 'other' # Will fail because `readonly` bit is set
|
52 |
|
53 | ### Mutate With `setvar` and `setglobal`
|
54 |
|
55 | proc p {
|
56 | var name = 'Bob' # declare
|
57 | setvar name = 'Alice' # mutate
|
58 |
|
59 | setglobal g = 42 # create or mutate a global variable
|
60 | }
|
61 |
|
62 | ### "Return" By Mutating a `Place` (advanced)
|
63 |
|
64 | A `Place` is a more principled mechanism that "replaces" shell's dynamic scope.
|
65 | To use it:
|
66 |
|
67 | 1. Create a place with the `&` prefix operator.
|
68 | 1. Pass the place around as you would any other value.
|
69 | 1. Assign to the place with its `setValue(x)` method.
|
70 |
|
71 | Example:
|
72 |
|
73 | proc p (s; out) { # place is a typed param
|
74 | # mutate the place
|
75 | call out->setValue("prefix-$s")
|
76 | }
|
77 |
|
78 | var x
|
79 | p foo (&x) # pass a word arg, then a place as a typed arg
|
80 | echo x=$x # => x=prefix-foo
|
81 |
|
82 | - *Style guideline*: In some situations, it's better to "return" a value on
|
83 | stdout, and use `$(myproc)` to retrieve it.
|
84 |
|
85 | ### Comparison to Shell
|
86 |
|
87 | Shell and [bash]($xref) have grown many mechanisms for "declaring" and mutating
|
88 | variables:
|
89 |
|
90 | - "bare" assignments like `x=foo`
|
91 | - **builtins** like `declare`, `local`, and `readonly`
|
92 | - The `-n` "nameref" flag
|
93 |
|
94 | Examples:
|
95 |
|
96 | readonly name=World # no spaces allowed around =
|
97 | declare foo="Hello $name"
|
98 | foo=$((42 + a[2]))
|
99 | declare -n ref=foo # $foo can be written through $ref
|
100 |
|
101 | These constructs are all discouraged in YSH code.
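For comparison with the `Place` mechanism described earlier, here is a minimal sketch of the nameref idiom that bash scripts use for "out parameters". It assumes bash 4.3 or later; the names `add_prefix`, `out_ref`, and `result` are hypothetical and chosen only for illustration.

```shell
# Assumes bash 4.3+ (needed for 'local -n'). All names are illustrative.
add_prefix() {
  local s=$1
  local -n out_ref=$2     # out_ref becomes an alias for the caller's variable
  out_ref="prefix-$s"     # the write goes through the nameref
}

result=''
add_prefix foo result
echo "result=$result"
```

One known weakness of this idiom is that it works by name: if the caller's variable happened to be called `out_ref` too, bash would report a circular name reference.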
|
102 |
|
103 | ## Keywords Behave Differently at the Top Level (Like JavaScript)
|
104 |
|
105 | The "top-level" of the interpreter is used in two situations:
|
106 |
|
107 | 1. When using YSH **interactively**.
|
108 | 2. As the **global** scope of a batch program.
|
109 |
|
110 | Experienced YSH users may notice that `var` and `setvar` behave differently in
|
111 | the top-level scope vs. `proc` scope. This is caused by the tension between
|
112 | the interactive shell and the strictness of YSH.
|
113 |
|
114 | In particular, the `source` builtin is dynamic, so YSH can't statically know all the names
|
115 | defined at the top level.
|
116 |
|
117 | For reference, JavaScript's modern `let` keyword has similar behavior.
|
118 |
|
119 | ### Usage Guidelines
|
120 |
|
121 | Before going into detail on keyword behavior, here are some practical
|
122 | guidelines:
|
123 |
|
124 | - **Interactive** sessions: Use shell's `x=y`, or YSH `setvar`. You can think
|
125 | of `setvar` like Python's assignment operator: it creates or mutates a
|
126 | variable.
|
127 | - **Short scripts** (~20 lines) can also use this style.
|
128 | - **Long programs**: Refactor them into composable "functions", i.e. `proc`.
|
129 | - First wrap the **whole program** into `proc main { }`.
|
130 | - The top level should only have `const` declarations. (You can use `var`,
|
131 | but it has special rules, explained below.)
|
132 | - The body of `proc` and `func` should have variables declared with `var`.
|
133 | - Inside these code blocks, use `setvar` to mutate **local** variables, and
|
134 | `setglobal` to mutate **globals**.
|
135 |
|
136 | That's all you need to remember. The following sections explain the rationale
|
137 | for these guidelines.
|
138 |
|
139 | ### The Top-Level Scope Has Only Dynamic Checks
|
140 |
|
141 | The lack of static checks affects the recommended usage for both interactive
|
142 | sessions and batch scripts.
|
143 |
|
144 | #### Interactive Use: `setvar` only
|
145 |
|
146 | As mentioned, you only need the `setvar` keyword in an interactive shell:
|
147 |
|
148 | ysh$ setvar x = 42 # create variable 'x'
|
149 | ysh$ setvar x = 43 # mutate it
|
150 |
|
151 | Details on top-level behavior:
|
152 |
|
153 | - `var` behaves like `setvar`: It creates or mutates a variable. In other
|
154 | words, a `var` definition can be **redefined** at the top-level.
|
155 | - A `const` can also redefine a `var`.
|
156 | - A `var` can't redefine a `const` because there's a **dynamic** check that
|
157 | disallows mutation (like shell's `readonly`).
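The shell `readonly` check mentioned in the last item can be observed directly. This is a minimal sketch in plain shell, not YSH; the variable `age` and the helper `try_reassign` are hypothetical names.

```shell
readonly age=42

# Attempt the reassignment in a subshell, so that the failure
# cannot abort the surrounding script.
try_reassign() {
  ( age=43 ) 2>/dev/null
}

if try_reassign; then
  echo 'reassigned'
else
  echo 'rejected at runtime'
fi
echo "age=$age"
```

The assignment parses without complaint and is only rejected when it executes, which is what "dynamic check" means here.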
|
158 |
|
159 | #### Batch Use: `const` only
|
160 |
|
161 | It's simpler to use only constants at the top level.
|
162 |
|
163 | const USER = 'bob'
|
164 | const HOST = 'example.com'
|
165 |
|
166 | proc p {
|
167 | ssh $USER@$HOST ls -l
|
168 | }
|
169 |
|
170 | This way, you don't have to worry about a `var` being redefined by a statement
|
171 | like `source mylib.sh`. A `const` can't be redefined because it can't be
|
172 | mutated.
|
173 |
|
174 | It may be useful to put mutable globals in a constant dictionary, as it will
|
175 | prevent them from being redefined:
|
176 |
|
177 | const G = { mystate = 0 }
|
178 |
|
179 | proc p {
|
180 | setglobal G.mystate = 1
|
181 | }
|
182 |
|
183 | ### `proc` and `func` Scope Have Static Checks
|
184 |
|
185 | These YSH code units have additional **static checks** (parse errors):
|
186 |
|
187 | - Every variable must be declared once and only once with `var`. A duplicate
|
188 | declaration is a parse error.
|
189 | - `setvar` of an undeclared variable is a parse error.
|
190 |
|
191 | ## Procs Don't Use "Dynamic Scope"
|
192 |
|
193 | Procs are designed to be encapsulated and composable like processes. But the
|
194 | [dynamic scope]($xref:dynamic-scope) rule that Bourne shell functions use
|
195 | breaks encapsulation.
|
196 |
|
197 | Dynamic scope means that a function can **read and mutate** the locals of its
|
198 | caller, its caller's caller, and so forth. Example:
|
199 |
|
200 | g() {
|
201 | echo "f_var is $f_var" # g can see f's local variables
|
202 | }
|
203 |
|
204 | f() {
|
205 | local f_var=42; g
|
206 | }
|
207 |
|
208 | f
|
209 |
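The example above shows a read. Mutation works the same way under dynamic scope. Here is a minimal bash sketch, with the hypothetical function names `callee` and `caller`:

```shell
callee() {
  counter=$((counter + 1))   # no 'local': resolves to the caller's variable
}

caller() {
  local counter=41
  callee
  echo "counter is $counter"
}

caller
echo "global counter is '${counter-unset}'"
```

The callee changes `counter` in its caller's frame, and no global named `counter` is created.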
|
210 | YSH code should use `proc` instead. Inside a proc call, the `dynamic_scope`
|
211 | option is implicitly disabled (equivalent to `shopt --unset dynamic_scope`).
|
212 |
|
213 | ### Reading Variables
|
214 |
|
215 | This means that adding the `proc` keyword to the definition of `g` changes its
|
216 | behavior:
|
217 |
|
218 | proc g() {
|
219 | echo "f_var is $f_var" # Undefined!
|
220 | }
|
221 |
|
222 | This affects all kinds of variable references:
|
223 |
|
224 | proc p {
|
225 | echo $foo # look up foo in command mode
|
226 | var y = foo + 42 # look up foo in expression mode
|
227 | }
|
228 |
|
229 | As in Python and JavaScript, a local `foo` can *shadow* a global `foo`. Using
|
230 | `CAPS` for globals is a common style that avoids confusion. Remember that
|
231 | globals should usually be constants in YSH.
|
232 |
|
233 | ### Shell Language Constructs That Write Variables
|
234 |
|
235 | In shell, these language constructs assign to variables using dynamic
|
236 | scope. In YSH, they only mutate the **local** scope:
|
237 |
|
238 | - `x=val`
|
239 | - And variants `x+=val`, `a[i]=val`, `a[i]+=val`
|
240 | - `export x=val` and `readonly x=val`
|
241 | - `${x=default}`
|
242 | - `mycmd {x}>out` (stores a file descriptor in `$x`)
|
243 | - `(( x = 42 + y ))`
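To make the shell side of this concrete, the sketch below shows two of the constructs listed above writing into a caller's locals in bash. It illustrates the behavior that YSH procs avoid, under the assumption that bash is the shell in use; `set_defaults` and `run` are hypothetical names.

```shell
set_defaults() {
  : "${mode=fast}"        # assigns only because 'mode' is currently unset
  (( retries = 2 + 1 ))   # arithmetic assignment
}

run() {
  local mode retries=0    # 'mode' is declared here but has no value yet
  set_defaults
  echo "mode=$mode retries=$retries"
}

run
```

Both writes land in the locals of `run`, even though `set_defaults` never declares those names.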
|
244 |
|
245 | ### Builtins That Write Variables
|
246 |
|
247 | These builtins are also "isolated" inside procs, using local scope:
|
248 |
|
249 | - [read]($osh-help) (`$REPLY`)
|
250 | - [readarray]($osh-help) aka `mapfile`
|
251 | - [getopts]($osh-help) (`$OPTIND`, `$OPTARG`, etc.)
|
252 | - [printf]($osh-help) -v
|
253 | - [unset]($osh-help)
|
254 |
|
255 | YSH Builtins:
|
256 |
|
257 | - [compadjust]($osh-help)
|
258 | - [try]($oil-help) and `_status`
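For contrast, here is how two of the shell builtins above behave in a bash function, where dynamic scope is still in effect. This is a sketch that assumes bash (for `printf -v` and `<<<`); `fill` and `report` are hypothetical names.

```shell
fill() {
  printf -v label '%s-%03d' item 7             # writes 'label'
  read -r first rest <<< 'alpha beta gamma'    # writes 'first' and 'rest'
}

report() {
  local label first rest
  fill
  echo "$label | $first | $rest"
}

report
```

Here `fill` populates variables that belong to `report`. Inside a YSH `proc`, the same builtins would write to the proc's own local scope instead.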
|
259 |
|
260 | <!-- TODO: should YSH builtins always behave the same way? Isn't that a little
|
261 | faster? I think read --all is not consistent. -->
|
262 |
|
263 | ### Reminder: Proc Scope is Flat
|
264 |
|
265 | All local variables in shell functions and procs live in the same scope. This
|
266 | includes variables declared in conditional blocks (`if` and `case`) and loops
|
267 | (`for` and `while`).
|
268 |
|
269 | proc p {
|
270 | for i in 1 2 3 {
|
271 | echo $i
|
272 | }
|
273 | echo $i # i is still 3
|
274 | }
|
275 |
|
276 | This includes first-class YSH blocks:
|
277 |
|
278 | proc p {
|
279 | var x = 42
|
280 | cd /tmp {
|
281 | var x = 0 # ERROR: x is already declared
|
282 | }
|
283 | }
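The same flat-scope rule can be observed in a plain shell function. This is a minimal sketch with the hypothetical names `flat`, `last_seen`, and `flag`:

```shell
flat() {
  local i
  for i in 1 2 3; do
    local last_seen=$i    # declared inside the loop body
  done
  if true; then
    local flag=on         # declared inside a conditional
  fi
  # Everything declared above is still visible here.
  echo "i=$i last_seen=$last_seen flag=$flag"
}

flat
```

Neither the loop nor the conditional introduces a new scope, so all three names remain readable until the function returns.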
|
284 |
|
285 | ## More Details
|
286 |
|
287 | ### Examples of Place Mutation
|
288 |
|
289 | The expression to the left of `=` is called a **place**. These are basically
|
290 | Python or JavaScript expressions, except that you add the `setvar` or
|
291 | `setglobal` keyword.
|
292 |
|
293 | setvar x[1] = 2 # array element
|
294 | setvar d['key'] = 3 # dict element
|
295 | setvar d.key = 3 # syntactic sugar for the above
|
296 | setvar x, y = y, x # swap
|
297 |
|
298 | ### Bare Assignment
|
299 |
|
300 | [Hay](hay.html) allows `const` declarations without the keyword:
|
301 |
|
302 | hay define Package
|
303 |
|
304 | Package cpython {
|
305 | version = '3.12' # like const version = ...
|
306 | }
|
307 |
|
308 | ### Temp Bindings
|
309 |
|
310 | Temp bindings precede a simple command:
|
311 |
|
312 | PYTHONPATH=. mycmd
|
313 |
|
314 | They create a new namespace on the stack where each cell has the `export` flag
|
315 | set (`declare -x`).
|
316 |
|
317 | In YSH, the lack of dynamic scope means that they can't be read inside a
|
318 | `proc`. So they're only useful for setting environment variables, and can be
|
319 | replaced with:
|
320 |
|
321 | env PYTHONPATH=. mycmd
|
322 | env PYTHONPATH=. $0 myproc # using the ARGV dispatch pattern
|
323 |
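The export behavior of a temp binding, and the dynamic-scope read that YSH procs give up, can both be checked in bash. This sketch uses the made-up variable `DEMO_GREETING` and the hypothetical function `greet`, and assumes `DEMO_GREETING` is not already set in the environment.

```shell
# 1. A child process receives the binding, and it is gone afterwards.
DEMO_GREETING=hello sh -c 'echo "child sees: ${DEMO_GREETING-unset}"'
echo "afterwards: ${DEMO_GREETING-unset}"

# 2. A shell function can also read it, via dynamic scope.
greet() {
  echo "function sees: ${DEMO_GREETING-unset}"
}
DEMO_GREETING=hello greet
```

Only the first use carries over to YSH procs, which is why `env NAME=value cmd` is a sufficient replacement there.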
|
324 | ## Appendix A: More on Shell vs. YSH
|
325 |
|
326 | This section may help experienced shell users understand YSH.
|
327 |
|
328 | Shell:
|
329 |
|
330 | g=G # global variable
|
331 | readonly c=C # global constant
|
332 |
|
333 | myfunc() {
|
334 | local x=X # local variable
|
335 | readonly y=Y # local constant
|
336 |
|
337 | x=mutated # mutate local
|
338 | g=mutated # mutate global
|
339 | newglobal=G # create new global
|
340 |
|
341 | caller_var=mutated # dynamic scope (YSH doesn't have this)
|
342 | }
|
343 |
|
344 | YSH:
|
345 |
|
346 | var g = 'G' # global variable (discouraged)
|
347 | const c = 'C' # global constant
|
348 |
|
349 | proc myproc {
|
350 | var x = 'X' # local variable
|
351 |
|
352 | setvar x = 'mutated' # mutate local
|
353 | setglobal g = 'mutated' # mutate global
|
354 | setglobal newglobal = 'G' # create new global
|
355 | }
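To see the effects of the shell version, here is a trimmed, runnable variant of it that prints the resulting state. It leaves out the `readonly` and `caller_var` lines and adds a final `echo`; otherwise the names follow the listing above.

```shell
g=G                  # global variable

myfunc() {
  local x=X          # local variable
  x=mutated          # mutate local
  g=mutated          # mutate global
  newglobal=G        # create new global
}

myfunc
echo "g=$g newglobal=$newglobal x=${x-unset}"
```

After the call, the global has changed, a new global exists, and the local `x` is gone. In the YSH version, `setglobal` is what makes the two global writes explicit.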
|
356 |
|
357 | ## Appendix B: Problems With Top-Level Scope In Other Languages
|
358 |
|
359 | - Julia 1.5 (August 2020): [The return of "soft scope" in the
|
360 | REPL](https://julialang.org/blog/2020/08/julia-1.5-highlights/#the_return_of_soft_scope_in_the_repl).
|
361 | - In contrast to Julia, YSH behaves the same in batch mode vs. interactive
|
362 | mode, and doesn't print warnings. However, it behaves differently at the
|
363 | top level. For this reason, we recommend using only `setvar` in
|
364 | interactive shells, and only `const` in the global scope of programs.
|
365 | - Racket: [The Top Level is Hopeless](https://gist.github.com/samth/3083053)
|
366 | - From [A Principled Approach to REPL Interpreters](https://2020.splashcon.org/details/splash-2020-Onward-papers/5/A-principled-approach-to-REPL-interpreters)
|
367 | (Onward 2020). Thanks to Michael Greenberg (of Smoosh) for this reference.
|
368 | - The behavior of `var` at the top level was partly inspired by this
|
369 | paper. It's consistent with bash's `declare`, and similar to JavaScript's
|
370 | `let`.
|
371 |
|
372 | ## Related Documents
|
373 |
|
374 | - [Interpreter State](interpreter-state.html)
|
375 | - The shell has a stack of namespaces.
|
376 | - Each namespace contains {variable name -> cell} bindings.
|
377 | - Cells have a tagged value (string, array, etc.) and 3 flags (readonly,
|
378 | export, nameref).
|
379 | - [Guide to Procs and Funcs](proc-func.html)
|
380 |
|