1 | ---
2 | default_highlighter: oils-sh
3 | ---
4 |
5 | Variable Declaration, Mutation, and Scope
6 | =========================================
7 |
8 | This doc addresses these questions:
9 |
10 | - How do variables behave in YSH?
11 | - What are some practical guidelines for using them?
12 |
13 | <div id="toc">
14 | </div>
15 |
16 | ## YSH Design Goals
17 |
18 | YSH is a graceful upgrade to shell, and the behavior of variables follows from
19 | that philosophy.
20 |
21 | - OSH implements shell-compatible behavior.
22 | - YSH enhances shell with **new features** like expressions over typed data,
23 | which will be familiar to Python and JavaScript programmers.
24 | - It's a **stricter** language.
25 |   - Procs (shell functions) are self-contained and modular. They're
26 |     understandable by reading their signature.
27 |   - We removed [dynamic scope]($xref:dynamic-scope). This mechanism isn't
28 |     familiar to most programmers, and may cause accidental mutation (bugs).
29 |   - YSH has variable **declarations** like JavaScript, which can prevent
30 |     trivial bugs.
31 | - Even though YSH is stricter, it should still be convenient to use
32 | interactively.
33 |
34 | ## Keywords Are More Consistent and Powerful Than Builtins
35 |
36 | YSH has 5 keywords that affect shell variables. Unlike shell builtins, they're
37 | statically parsed, and take dynamically typed **expressions** on the right.
38 |
39 | ### Declare With `var` and `const`
40 |
41 | It looks like JavaScript:
42 |
43 | var name = 'Bob'
44 | const age = (20 + 1) * 2
45 |
46 | echo "$name is $age years old" # Bob is 42 years old
47 |
48 | Note that `const` is enforced by a dynamic check. It's meant to be used at the
49 | top level only, not within `proc` or `func`.
50 |
51 | const age = 'other' # Will fail because `readonly` bit is set
52 |
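Shell's `readonly` builtin enforces a comparable runtime check, which may help shell users picture how `const` behaves. The following is a minimal sketch in plain POSIX-style shell, not YSH syntax. The name `outcome` is only illustrative, and the assignment is attempted in a subshell so that the failure doesn't abort the script.

```sh
readonly age=42

# Try to overwrite the readonly variable in a subshell and record what happens.
if (age=other) 2>/dev/null; then
  outcome="assignment succeeded"
else
  outcome="assignment rejected"
fi

echo "$outcome; age is still $age"
```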
53 | ### Mutate With `setvar` and `setglobal`
54 |
55 | proc p {
56 | var name = 'Bob' # declare
57 | setvar name = 'Alice' # mutate
58 |
59 | setglobal g = 42 # create or mutate a global variable
60 | }
61 |
62 | ### "Return" By Mutating a `Place` (advanced)
63 |
64 | A `Place` is a more principled mechanism that "replaces" shell's dynamic scope.
65 | To use it:
66 |
67 | 1. Create a place with the `&` prefix operator.
68 | 1. Pass the place around as you would any other value.
69 | 1. Assign to the place with its `setValue(x)` method.
70 |
71 | Example:
72 |
73 | proc p (s; out) { # place is a typed param
74 | # mutate the place
75 | call out->setValue("prefix-$s")
76 | }
77 |
78 | var x
79 | p foo (&x) # pass 's' as a word arg and the place as a typed arg
80 | echo x=$x # => x=prefix-foo
81 |
82 | - *Style guideline*: In some situations, it's better to "return" a value on
83 | stdout, and use `$(myproc)` to retrieve it.
84 |
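The stdout style from the guideline above can be sketched as follows. This is an illustration written in POSIX-style shell function syntax so it can be tried in any shell; in YSH code the helper would more likely be a `proc`. The name `add_prefix` is hypothetical.

```sh
# Hypothetical helper: "returns" its result by printing it on stdout.
add_prefix() {
  printf 'prefix-%s\n' "$1"
}

x=$(add_prefix foo)   # command substitution captures stdout
echo "x=$x"           # prints: x=prefix-foo
```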
85 | ### Comparison to Shell
86 |
87 | Shell and [bash]($xref) have grown many mechanisms for "declaring" and mutating
88 | variables:
89 |
90 | - "bare" assignments like `x=foo`
91 | - **builtins** like `declare`, `local`, and `readonly`
92 | - The `-n` "nameref" flag
93 |
94 | Examples:
95 |
96 | readonly name=World # no spaces allowed around =
97 | declare foo="Hello $name"
98 | foo=$((42 + a[2]))
99 | declare -n ref=foo # $foo can be written through $ref
100 |
101 | These constructs are all discouraged in YSH code.
102 |
103 | ## Keywords Behave Differently at the Top Level (Like JavaScript)
104 |
105 | The top level of the interpreter is used in two situations:
106 |
107 | 1. When using YSH **interactively**.
108 | 2. As the **global** scope of a batch program.
109 |
110 | Experienced YSH users may notice that `var` and `setvar` behave differently in
111 | the top-level scope vs. `proc` scope. This is caused by the tension between
112 | the interactive shell and the strictness of YSH.
113 |
114 | In particular, the `source` builtin is dynamic, so YSH can't know all the names
115 | defined at the top level.
116 |
117 | For reference, JavaScript's modern `let` keyword has similar behavior.
118 |
119 | ### Usage Guidelines
120 |
121 | Before going into detail on keyword behavior, here are some practical
122 | guidelines:
123 |
124 | - **Interactive** sessions: Use shell's `x=y`, or YSH `setvar`. You can think
125 | of `setvar` like Python's assignment operator: it creates or mutates a
126 | variable.
127 | - **Short scripts** (~20 lines) can also use this style.
128 | - **Long programs**: Refactor them into composable "functions", i.e. `proc`.
129 |   - First wrap the **whole program** into `proc main { }`.
130 |   - The top level should only have `const` declarations. (You can use `var`,
131 |     but it has special rules, explained below.)
132 |   - The body of `proc` and `func` should have variables declared with `var`.
133 |   - Inside these code blocks, use `setvar` to mutate **local** variables, and
134 |     `setglobal` to mutate **globals**.
135 |
136 | That's all you need to remember. The following sections explain the rationale
137 | for these guidelines.
138 |
139 | ### The Top-Level Scope Has Only Dynamic Checks
140 |
141 | The lack of static checks affects the recommended usage for both interactive
142 | sessions and batch scripts.
143 |
144 | #### Interactive Use: `setvar` only
145 |
146 | As mentioned, you only need the `setvar` keyword in an interactive shell:
147 |
148 | ysh$ setvar x = 42 # create variable 'x'
149 | ysh$ setvar x = 43 # mutate it
150 |
151 | Details on top-level behavior:
152 |
153 | - `var` behaves like `setvar`: It creates or mutates a variable. In other
154 | words, a `var` definition can be **redefined** at the top level.
155 | - A `const` can also redefine a `var`.
156 | - A `var` can't redefine a `const` because there's a **dynamic** check that
157 | disallows mutation (like shell's `readonly`).
158 |
159 | #### Batch Use: `const` only
160 |
161 | It's simpler to use only constants at the top level.
162 |
163 | const USER = 'bob'
164 | const HOST = 'example.com'
165 |
166 | proc p {
167 | ssh $USER@$HOST ls -l
168 | }
169 |
170 | This is so you don't have to worry about a `var` being redefined by a statement
171 | like `source mylib.sh`. A `const` can't be redefined because it can't be
172 | mutated.
173 |
174 | It may be useful to put mutable globals in a constant dictionary, as it will
175 | prevent them from being redefined:
176 |
177 | const G = { mystate: 0 }
178 |
179 | proc p {
180 | setglobal G.mystate = 1
181 | }
182 |
183 | ### `proc` and `func` Scope Have Static Checks
184 |
185 | These YSH code units have additional **static checks** (parse errors):
186 |
187 | - Every variable must be declared once and only once with `var`. A duplicate
188 | declaration is a parse error.
189 | - `setvar` of an undeclared variable is a parse error.
190 |
191 | ## Procs Don't Use "Dynamic Scope"
192 |
193 | Procs are designed to be encapsulated and composable like processes. But the
194 | [dynamic scope]($xref:dynamic-scope) rule that Bourne shell functions use
195 | breaks encapsulation.
196 |
197 | Dynamic scope means that a function can **read and mutate** the locals of its
198 | caller, its caller's caller, and so forth. Example:
199 |
200 | g() {
201 | echo "f_var is $f_var" # g can see f's local variables
202 | }
203 |
204 | f() {
205 | local f_var=42; g # declare a local, then call g while it is in scope
206 | }
207 |
208 | f
209 |
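The example above only reads the caller's local. Mutation works the same way under dynamic scope, as this minimal sketch in plain shell suggests. The names `callee`, `caller`, `count`, and `result` are illustrative; note that the increment lands in the caller's local variable, and no global `count` is created.

```sh
callee() {
  # No 'local' here: the name resolves to the caller's local variable.
  count=$((count + 1))
}

caller() {
  local count=41
  callee
  result="caller's count is now $count"   # 'result' is a global, for inspection
}

caller
echo "$result"
echo "global count: ${count-unset}"
```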
210 | YSH code should use `proc` instead. Inside a proc call, the `dynamic_scope`
211 | option is implicitly disabled (equivalent to `shopt --unset dynamic_scope`).
212 |
213 | ### Reading Variables
214 |
215 | This means that adding the `proc` keyword to the definition of `g` changes its
216 | behavior:
217 |
218 | proc g() {
219 | echo "f_var is $f_var" # Undefined!
220 | }
221 |
222 | This affects all kinds of variable references:
223 |
224 | proc p {
225 | echo $foo # look up foo in command mode
226 | var y = foo + 42 # look up foo in expression mode
227 | }
228 |
229 | As in Python and JavaScript, a local `foo` can *shadow* a global `foo`. Using
230 | `CAPS` for globals is a common style that avoids confusion. Remember that
231 | globals should usually be constants in YSH.
232 |
233 | ### Shell Language Constructs That Write Variables
234 |
235 | In shell, these language constructs assign to variables using dynamic
236 | scope. In YSH, they only mutate the **local** scope:
237 |
238 | - `x=val`
239 | - And variants `x+=val`, `a[i]=val`, `a[i]+=val`
240 | - `export x=val` and `readonly x=val`
241 | - `${x=default}`
242 | - `mycmd {x}>out` (stores a file descriptor in `$x`)
243 | - `(( x = 42 + y ))`
244 |
245 | ### Builtins That Write Variables
246 |
247 | These builtins are also "isolated" inside procs, using local scope:
248 |
249 | - [read]($osh-help) (`$REPLY`)
250 | - [readarray]($osh-help) aka `mapfile`
251 | - [getopts]($osh-help) (`$OPTIND`, `$OPTARG`, etc.)
252 | - [printf]($osh-help) -v
253 | - [unset]($osh-help)
254 |
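For contrast, here is a sketch of how `read` behaves in an ordinary shell function, where dynamic scope is in effect. It is plain shell rather than YSH, and the function and variable names are hypothetical. The builtin writes into the caller's locals instead of a scope of its own.

```sh
read_first_word() {
  # 'read' assigns to 'word' and 'rest'; both resolve to the caller's locals.
  read -r word rest <<EOF
alpha beta gamma
EOF
}

caller() {
  local word rest
  read_first_word
  seen="$word"        # 'seen' is a global, for inspection
}

caller
echo "caller saw: $seen"
echo "global word: ${word-unset}"
```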
255 | YSH Builtins:
256 |
257 | - [compadjust]($osh-help)
258 | - [try]($oil-help) and `_status`
259 |
260 | <!-- TODO: should YSH builtins always behave the same way? Isn't that a little
261 | faster? I think read --all is not consistent. -->
262 |
263 | ### Reminder: Proc Scope is Flat
264 |
265 | All local variables in shell functions and procs live in the same scope. This
266 | includes variables declared in conditional blocks (`if` and `case`) and loops
267 | (`for` and `while`).
268 |
269 | proc p {
270 | for i in 1 2 3 {
271 | echo $i
272 | }
273 | echo $i # i is still 3
274 | }
275 |
276 | This includes first-class YSH blocks:
277 |
278 | proc p {
279 | var x = 42
280 | cd /tmp {
281 | var x = 0 # ERROR: x is already declared
282 | }
283 | }
284 |
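The same flat scoping can be observed in a plain shell function. In this sketch (shell syntax, with illustrative names), a variable declared inside an `if` within a loop is still visible after both constructs end.

```sh
p() {
  local i
  for i in 1 2 3; do
    if [ "$i" -eq 2 ]; then
      local found="set inside if, iteration $i"
    fi
  done
  # Both 'i' and 'found' are still in scope here.
  summary="i=$i found=$found"   # 'summary' is a global, for inspection
}

p
echo "$summary"
```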
285 | ## More Details
286 |
287 | ### Examples of Place Mutation
288 |
289 | The expression to the left of `=` is called a **place**. These are basically
290 | Python or JavaScript expressions, except that you add the `setvar` or
291 | `setglobal` keyword.
292 |
293 | setvar x[1] = 2 # array element
294 | setvar d['key'] = 3 # dict element
295 | setvar d.key = 3 # syntactic sugar for the above
296 | setvar x, y = y, x # swap
297 |
298 | ### Bare Assignment
299 |
300 | [Hay](hay.html) allows `const` declarations without the keyword:
301 |
302 | hay define Package
303 |
304 | Package cpython {
305 | version = '3.12' # like const version = ...
306 | }
307 |
308 | ### Temp Bindings
309 |
310 | Temp bindings precede a simple command:
311 |
312 | PYTHONPATH=. mycmd
313 |
314 | They create a new namespace on the stack where each cell has the `export` flag
315 | set (`declare -x`).
316 |
317 | In YSH, the lack of dynamic scope means that they can't be read inside a
318 | `proc`. So they're only useful for setting environment variables, and can be
319 | replaced with:
320 |
321 | env PYTHONPATH=. mycmd
322 | env PYTHONPATH=. $0 myproc # using the ARGV dispatch pattern
323 |
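A small sketch of the equivalence, in plain shell: both the temp binding and the `env` form make the variable visible to the child process only. `GREETING` is a hypothetical variable name, and `sh -c` stands in for `mycmd`.

```sh
unset GREETING

# Temp binding: exported to the child process only.
child_view=$(GREETING=hello sh -c 'echo "$GREETING"')

# The 'env' form has the same effect on the child.
env_view=$(env GREETING=hello sh -c 'echo "$GREETING"')

echo "temp binding: child sees $child_view"
echo "env command: child sees $env_view"
echo "parent afterwards: ${GREETING-unset}"
```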
324 | ## Appendix A: More on Shell vs. YSH
325 |
326 | This section may help experienced shell users understand YSH.
327 |
328 | Shell:
329 |
330 | g=G # global variable
331 | readonly c=C # global constant
332 |
333 | myfunc() {
334 | local x=X # local variable
335 | readonly y=Y # local constant
336 |
337 | x=mutated # mutate local
338 | g=mutated # mutate global
339 | newglobal=G # create new global
340 |
341 | caller_var=mutated # dynamic scope (YSH doesn't have this)
342 | }
343 |
344 | YSH:
345 |
346 | var g = 'G' # global variable (discouraged)
347 | const c = 'C' # global constant
348 |
349 | proc myproc {
350 | var x = 'L' # local variable
351 |
352 | setvar x = 'mutated' # mutate local
353 | setglobal g = 'mutated' # mutate global
354 | setglobal newglobal = 'G' # create new global
355 | }
356 |
357 | ## Appendix B: Problems With Top-Level Scope In Other Languages
358 |
359 | - Julia 1.5 (August 2020): [The return of "soft scope" in the
360 | REPL](https://julialang.org/blog/2020/08/julia-1.5-highlights/#the_return_of_soft_scope_in_the_repl).
361 | - In contrast to Julia, YSH behaves the same in batch mode vs. interactive
362 | mode, and doesn't print warnings. However, it behaves differently at the
363 | top level. For this reason, we recommend using only `setvar` in
364 | interactive shells, and only `const` in the global scope of programs.
365 | - Racket: [The Top Level is Hopeless](https://gist.github.com/samth/3083053)
366 | - From [A Principled Approach to REPL Interpreters](https://2020.splashcon.org/details/splash-2020-Onward-papers/5/A-principled-approach-to-REPL-interpreters)
367 | (Onward 2020). Thanks to Michael Greenberg (of Smoosh) for this reference.
368 | - The behavior of `var` at the top level was partly inspired by this
369 | paper. It's consistent with bash's `declare`, and similar to JavaScript's
370 | `let`.
371 |
372 | ## Related Documents
373 |
374 | - [Interpreter State](interpreter-state.html)
375 |   - The shell has a stack of namespaces.
376 |   - Each namespace contains {variable name -> cell} bindings.
377 |   - Cells have a tagged value (string, array, etc.) and 3 flags (readonly,
378 |     export, nameref).
379 | - [Guide to Procs and Funcs](proc-func.html)
380 |