1 | ---
|
2 | in_progress: yes
|
3 | body_css_class: width40 help-body
|
4 | default_highlighter: oils-sh
|
5 | preserve_anchor_case: yes
|
6 | ---
|
7 |
|
8 | YSH Expression Language
|
9 | ===
|
10 |
|
11 | This chapter in the [Oils Reference](index.html) describes the YSH expression
|
12 | language, which includes [Egg Expressions]($xref:eggex).
|
13 |
|
14 | <div id="toc">
|
15 | </div>
|
16 |
|
17 | ## Literals
|
18 |
|
19 | ### bool-literal
|
20 |
|
21 | YSH uses JavaScript-like spellings for these three "atoms":
|
22 |
|
23 | true false null
|
24 |
|
25 | Note that the empty string is a good "special" value in some cases. The `null`
|
26 | value can't be interpolated into words.
|
27 |
|
28 | ### int-literal
|
29 |
|
30 | var myint = 42
|
31 | var myfloat = 3.14
|
32 | var float2 = 1e100
|
33 |
|
34 | ### rune-literal
|
35 |
|
36 | #'a' #'_' \n \\ \u{3bc}
|
37 |
|
38 | ### ysh-string
|
39 |
|
40 | Double-quoted strings are identical to shell:
|
41 |
|
42 | var dq = "hello $world and $(hostname)"
|
43 |
|
44 | Single-quoted strings may be raw:
|
45 |
|
46 | var s = r'line\n' # raw string means \n is literal, NOT a newline
|
47 |
|
48 | Or escaped *J8 strings*:
|
49 |
|
50 | var s = u'line\n \u{3bc}' # unicode string means \n is a newline
|
51 | var s = b'line\n \u{3bc} \yff' # same thing, but also allows bytes
|
52 |
|
53 | Both `u''` and `b''` strings evaluate to the single `Str` type. The difference
|
54 | is that `b''` strings allow the `\yff` byte escape.
|
55 |
|
56 | ---
|
57 |
|
58 | There's no way to express a single quote in raw strings. Use one of the other
|
59 | forms instead:
|
60 |
|
61 | var sq = "single quote: ' "
|
62 | var sq = u'single quote: \' '
|
63 |
|
64 | Sometimes you can omit the `r`, e.g. where there are no backslashes and thus no
|
65 | ambiguity:
|
66 |
|
67 | echo 'foo'
|
68 | echo r'foo' # same thing
|
69 |
|
70 | The `u''` and `b''` strings are called *J8 strings* because the syntax in YSH
|
71 | **code** matches JSON-like **data**.
|
72 |
|
73 | var strU = u'mu = \u{3bc}' # J8 string with escapes
|
74 | var strB = b'bytes \yff' # J8 string that can express byte strings
|
75 |
|
76 | More examples:
|
77 |
|
78 | var myRaw = r'[a-z]\n' # raw strings are useful for regexes (not
|
79 | # eggexes)
|
80 |
|
81 | ### triple-quoted
|
82 |
|
83 | Triple-quoted string literals have leading whitespace stripped on each line.
|
84 | They come in the same variants:
|
85 |
|
86 | var dq = """
|
87 | hello $world and $(hostname)
|
88 | no leading whitespace
|
89 | """
|
90 |
|
91 | var myRaw = r'''
|
92 | raw string
|
93 | no leading whitespace
|
94 | '''
|
95 |
|
96 | var strU = u'''
|
97 | string that happens to be unicode \u{3bc}
|
98 | no leading whitespace
|
99 | '''
|
100 |
|
101 | var strB = b'''
|
102 | string that happens to be bytes \u{3bc} \yff
|
103 | no leading whitespace
|
104 | '''
|
105 |
|
106 | Again, you can omit the `r` prefix if there's no backslash, because it's not
|
107 | ambiguous:
|
108 |
|
109 | var myRaw = '''
|
110 | raw string
|
111 | no leading whitespace
|
112 | '''
|
113 |
|
114 | ### str-template
|
115 |
|
116 | String templates use the same syntax as double-quoted strings:
|
117 |
|
118 | var mytemplate = ^"name = $name, age = $age"
|
119 |
|
120 | Related topics:
|
121 |
|
122 | - [Str => replace](chap-type-method.html#replace)
|
123 | - [ysh-string](chap-expr-lang.html#ysh-string)
|
124 |
|
125 | ### list-literal
|
126 |
|
127 | Lists have a Python-like syntax:
|
128 |
|
129 | var mylist = ['one', 'two', 3]
|
130 |
|
131 | And a shell-like syntax:
|
132 |
|
133 | var list2 = :| one two |
|
134 |
|
135 | The shell-like syntax accepts the same words that a command can:
|
136 |
|
137 | ls $mystr @ARGV *.py {foo,bar}@example.com
|
138 |
|
139 | # Rather than executing ls, evaluate and store words
|
140 | var cmd = :| ls $mystr @ARGV *.py {foo,bar}@example.com |
|
141 |
|
142 | ### dict-literal
|
143 |
|
144 | {name: 'value'}
|
145 |
|
146 | ### range
|
147 |
|
148 | A range is a sequence of numbers that can be iterated over:
|
149 |
|
150 | for i in (0 .. 3) {
|
151 | echo $i
|
152 | }
|
153 | => 0
|
154 | => 1
|
155 | => 2
|
156 |
|
157 | As with slices, the last number isn't included. To iterate from 1 to n, use this idiom:
|
158 |
|
159 | for i in (1 .. n+1) {
|
160 | echo $i
|
161 | }
|
162 |
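Since ranges and slices are described as Python-like, the half-open convention can be illustrated with a small Python analogy. This is only a sketch in Python, not YSH code, and the helper name `closed_range` is hypothetical:

```python
def closed_range(start, stop):
    """Return start..stop inclusive by adding 1 to the exclusive upper bound."""
    return list(range(start, stop + 1))

# The upper bound 3 is excluded, as in the (0 .. 3) loop above.
half_open = list(range(0, 3))

# Same idea as the (1 .. n+1) idiom, here with n = 3.
one_to_n = closed_range(1, 3)

# Slices exclude their end index too, so 1:-1 drops the first and last items.
inner = ['a', 'b', 'c', 'd'][1:-1]

print(half_open, one_to_n, inner)
```

The point is that the `+ 1` lives in the upper bound only, so the number of iterations is always `stop - start`.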
|
163 | ### block-literal
|
164 |
|
165 | var myblock = ^(echo $PWD)
|
166 |
|
167 | ### expr-lit
|
168 |
|
169 | var myexpr = ^[1 + 2*3]
|
170 |
|
171 | ## Operators
|
172 |
|
173 | <h3 id="concat">concat <code>++</code></h3>
|
174 |
|
175 | The concatenation operator works on strings:
|
176 |
|
177 | var s = 'hello'
|
178 | var t = s ++ ' world'
|
179 | = t
|
180 | (Str) "hello world"
|
181 |
|
182 | and lists:
|
183 |
|
184 | var L = ['one', 'two']
|
185 | var M = L ++ ['three', '4']
|
186 | = M
|
187 | (List) ["one", "two", "three", "4"]
|
188 |
|
189 | String interpolation can be nicer than `++`:
|
190 |
|
191 | var t2 = "${s} world" # same as t
|
192 |
|
193 | Likewise, splicing lists can be nicer:
|
194 |
|
195 | var M2 = :| @L three 4 | # same as M
|
196 |
|
197 | ### ysh-compare
|
198 |
|
199 | a == b # Python-like equality, no type conversion
|
200 | 3 ~== 3.0 # True, type conversion
|
201 | 3 ~== '3' # True, type conversion
|
202 | 3 ~== '3.0' # True, type conversion
|
203 |
|
204 | ### ysh-logical
|
205 |
|
206 | not and or
|
207 |
|
208 | Note that these are distinct from `! && ||`.
|
209 |
|
210 | ### ysh-arith
|
211 |
|
212 | YSH supports most of the arithmetic operators from Python. Notably, `//` and `%`
|
213 | differ from Python as [they round toward zero, not negative
|
214 | infinity](https://www.oilshell.org/blog/2024/03/release-0.21.0.html#integers-dont-do-whatever-python-or-c-does).
|
215 |
|
216 | Use `+ - *` for `Int` or `Float` addition, subtraction and multiplication. If
|
217 | any of the operands are `Float`s, then the output will also be a `Float`.
|
218 |
|
219 | Use `/` and `//` for `Float` division and `Int` division, respectively. `/`
|
220 | will _always_ result in a `Float`, while `//` will _always_ result in an
|
221 | `Int`.
|
222 |
|
223 | = 1 / 2 # => (Float) 0.5
|
224 | = 1 // 2 # => (Int) 0
|
225 |
|
226 | Use `%` to compute the _remainder_ of integer division. The left operand must
|
227 | be an `Int` and the right a _positive_ `Int`.
|
228 |
|
229 | = 1 % 2 # => (Int) 1
|
230 | = -4 % 2 # => (Int) 0
|
231 |
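To make the rounding difference from Python concrete, here is a minimal Python sketch that emulates integer division and remainder rounding toward zero, next to Python's own flooring operators. The helper names `trunc_div` and `trunc_rem` are hypothetical and are not part of YSH or Python; the sketch only illustrates the rule stated above:

```python
def trunc_div(a, b):
    """Integer division that rounds toward zero instead of negative infinity."""
    q = abs(a) // abs(b)
    # The quotient is negative only when the operands have different signs.
    return q if (a < 0) == (b < 0) else -q

def trunc_rem(a, b):
    """Remainder paired with trunc_div, so that a == b * trunc_div(a, b) + trunc_rem(a, b)."""
    return a - b * trunc_div(a, b)

# Python floors: -7 // 2 is -4 and -7 % 2 is 1.
print(-7 // 2, -7 % 2)

# Rounding toward zero gives -3 and -1 instead.
print(trunc_div(-7, 2), trunc_rem(-7, 2))
```

The two conventions agree whenever the division is exact, which is why `-4 % 2` is `0` under both.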
|
232 | Use `**` for exponentiation. The left operand must be an `Int` and the right a
|
233 | _positive_ `Int`.
|
234 |
|
235 | All arithmetic operators may coerce either of their operands from strings to a
|
236 | number, provided those strings are formatted as numbers.
|
237 |
|
238 | = 10 + '1' # => (Int) 11
|
239 |
|
240 | Operators like `+ - * /` will coerce strings to _either_ an `Int` or `Float`.
|
241 | However, operators like `// ** %` and bit shifts will coerce strings _only_ to
|
242 | an `Int`.
|
243 |
|
244 | = '1.14' + '2' # => (Float) 3.14
|
245 | = '1.14' % '2' # Type Error: Left operand is a Str
|
246 |
|
247 | ### ysh-bitwise
|
248 |
|
249 | ~ & | ^
|
250 |
|
251 | ### ysh-ternary
|
252 |
|
253 | Like Python:
|
254 |
|
255 | var display = 'yes' if len(s) else 'empty'
|
256 |
|
257 | ### ysh-index
|
258 |
|
259 | Like Python:
|
260 |
|
261 | myarray[3]
|
262 | mystr[3]
|
263 |
|
264 | TODO: Does string indexing give you an integer back?
|
265 |
|
266 | ### ysh-slice
|
267 |
|
268 | Like Python:
|
269 |
|
270 | myarray[1 : -1]
|
271 | mystr[1 : -1]
|
272 |
|
273 | ### func-call
|
274 |
|
275 | Like Python:
|
276 |
|
277 | f(x, y)
|
278 |
|
279 | ### thin-arrow
|
280 |
|
281 | The thin arrow is for mutating methods:
|
282 |
|
283 | var mylist = ['bar']
|
284 | call mylist->pop()
|
285 |
|
286 | <!--
|
287 | TODO
|
288 | var mydict = {name: 'foo'}
|
289 | call mydict->erase('name')
|
290 | -->
|
291 |
|
292 | ### fat-arrow
|
293 |
|
294 | The fat arrow is for transforming methods:
|
295 |
|
296 | if (s => startsWith('prefix')) {
|
297 | echo 'yes'
|
298 | }
|
299 |
|
300 | If the method lookup on `s` fails, it looks for free functions. This means it
|
301 | can be used for "chaining" transformations:
|
302 |
|
303 | var x = myFunc() => list() => join()
|
304 |
|
305 | ### match-ops
|
306 |
|
307 | YSH has four pattern-matching operators: `~ !~ ~~ !~~`.
|
308 |
|
309 | Does a string match an **eggex**?
|
310 |
|
311 | var filename = 'x42.py'
|
312 | if (filename ~ / d+ /) {
|
313 | echo 'number'
|
314 | }
|
315 |
|
316 | Does a string match a POSIX regular expression (ERE syntax)?
|
317 |
|
318 | if (filename ~ '[[:digit:]]+') {
|
319 | echo 'number'
|
320 | }
|
321 |
|
322 | Negate the result with the `!~` operator:
|
323 |
|
324 | if (filename !~ /space/ ) {
|
325 | echo 'no space'
|
326 | }
|
327 |
|
328 | if (filename !~ '[[:space:]]' ) {
|
329 | echo 'no space'
|
330 | }
|
331 |
|
332 | Does a string match a **glob**?
|
333 |
|
334 | if (filename ~~ '*.py') {
|
335 | echo 'Python'
|
336 | }
|
337 |
|
338 | if (filename !~~ '*.py') {
|
339 | echo 'not Python'
|
340 | }
|
341 |
|
342 | Take care not to confuse glob patterns and regular expressions.
|
343 |
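The following Python sketch shows why the two pattern languages should not be mixed up: the same pattern text gives opposite answers when read as a glob and as a regular expression. It uses Python's `fnmatch` and `re` modules as stand-ins, not the YSH operators, and it matches the whole string in both cases so the results are comparable. The helper names are hypothetical:

```python
import fnmatch
import re

def glob_match(name, pattern):
    """Treat pattern as a glob: * matches any run of characters, . is literal."""
    return fnmatch.fnmatchcase(name, pattern)

def regex_match(name, pattern):
    """Treat pattern as a regex: x* means zero or more x, . matches any character."""
    return re.fullmatch(pattern, name) is not None

pattern = 'x*.py'
for name in ['x42.py', 'spy']:
    print(name, glob_match(name, pattern), regex_match(name, pattern))
```

Read as a glob, `x*.py` accepts `x42.py` and rejects `spy`; read as a regex, it rejects `x42.py` and accepts `spy`.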
|
344 | - Related doc: [YSH Regex API](../ysh-regex-api.html)
|
345 |
|
346 | ## Eggex
|
347 |
|
348 | ### re-literal
|
349 |
|
350 | An eggex literal looks like this:
|
351 |
|
352 | / expression ; flags ; translation preference /
|
353 |
|
354 | The flags and translation preference are both optional.
|
355 |
|
356 | Examples:
|
357 |
|
358 | var pat = / d+ / # => [[:digit:]]+
|
359 |
|
360 | You can specify flags passed to libc `regcomp()`:
|
361 |
|
362 | var pat = / d+ ; reg_icase reg_newline /
|
363 |
|
364 | You can specify a translation preference after a second semicolon:
|
365 |
|
366 | var pat = / d+ ; ; ERE /
|
367 |
|
368 | Right now the translation preference does nothing. It could be used to
|
369 | translate eggex to PCRE or Python syntax.
|
370 |
|
371 | - Related doc: [Egg Expressions](../eggex.html)
|
372 |
|
373 | ### re-primitive
|
374 |
|
375 | There are two kinds of eggex primitives.
|
376 |
|
377 | "Zero-width assertions" match a position rather than a character:
|
378 |
|
379 | %start # translates to ^
|
380 | %end # translates to $
|
381 |
|
382 | Literal characters appear within **single** quotes:
|
383 |
|
384 | 'oh *really*' # translates to regex-escaped string
|
385 |
|
386 | Double-quoted strings are **not** eggex primitives. Instead, you can use
|
387 | splicing of strings:
|
388 |
|
389 | var dq = "hi $name"
|
390 | var eggex = / @dq /
|
391 |
|
392 | ### class-literal
|
393 |
|
394 | An eggex character class literal specifies a set. It can have individual
|
395 | characters and ranges:
|
396 |
|
397 | [ 'x' 'y' 'z' a-f A-F 0-9 ] # 3 chars, 3 ranges
|
398 |
|
399 | Omit quotes on ASCII characters:
|
400 |
|
401 | [ x y z ] # avoid typing 'x' 'y' 'z'
|
402 |
|
403 | Sets of characters can be written as strings:
|
404 |
|
405 | [ 'xyz' ] # any of 3 chars, not a sequence of 3 chars
|
406 |
|
407 | Backslash escapes are respected:
|
408 |
|
409 | [ \\ \' \" \0 ]
|
410 | [ \xFF \u0100 ]
|
411 |
|
412 | Splicing:
|
413 |
|
414 | [ @str_var ]
|
415 |
|
416 | Negation always uses `!`:
|
417 |
|
418 | ![ a-f A-F 'xyz' @str_var ]
|
419 |
|
420 | ### named-class
|
421 |
|
422 | Perl-like shortcuts for sets of characters:
|
423 |
|
424 | [ dot ] # => .
|
425 | [ digit ] # => [[:digit:]]
|
426 | [ space ] # => [[:space:]]
|
427 | [ word ] # => [[:alpha:]][[:digit:]]_
|
428 |
|
429 | Abbreviations:
|
430 |
|
431 | [ d s w ] # Same as [ digit space word ]
|
432 |
|
433 | Valid POSIX classes:
|
434 |
|
435 | alnum cntrl lower space
|
436 | alpha digit print upper
|
437 | blank graph punct xdigit
|
438 |
|
439 | Negated:
|
440 |
|
441 | !digit !space !word
|
442 | !d !s !w
|
443 | !alnum # etc.
|
444 |
|
445 | ### re-repeat
|
446 |
|
447 | Eggex repetition looks like POSIX syntax:
|
448 |
|
449 | / 'a'? / # zero or one
|
450 | / 'a'* / # zero or more
|
451 | / 'a'+ / # one or more
|
452 |
|
453 | Counted repetitions:
|
454 |
|
455 | / 'a'{3} / # exactly 3 repetitions
|
456 | / 'a'{2,4} / # between 2 and 4 repetitions
|
457 |
|
458 | ### re-compound
|
459 |
|
460 | Sequence expressions with a space:
|
461 |
|
462 | / word digit digit / # Matches 3 characters in sequence
|
463 | # Examples: a42, b51
|
464 |
|
465 | (Compare `/ [ word digit ] /`, which is a set matching 1 character.)
|
466 |
|
467 | Alternation with `|`:
|
468 |
|
469 | / word | digit / # Matches 'a' OR '9', for example
|
470 |
|
471 | Grouping with parentheses:
|
472 |
|
473 | / (word digit) | \\ / # Matches a9 or \
|
474 |
|
475 | ### re-capture
|
476 |
|
477 | To retrieve a substring of a string that matches an Eggex, use a "capture
|
478 | group" like `<capture ...>`.
|
479 |
|
480 | Here's an eggex with a **positional** capture:
|
481 |
|
482 | var pat = / 'hi ' <capture d+> / # access with _group(1)
|
483 | # or Match => group(1)
|
484 |
|
485 | Captures can be **named**:
|
486 |
|
487 | <capture d+ as month> # access with _group('month')
|
488 | # or Match => group('month')
|
489 |
|
490 | Captures can also have a type **conversion func**:
|
491 |
|
492 | <capture d+ : int> # _group(1) returns Int
|
493 |
|
494 | <capture d+ as month: int> # _group('month') returns Int
|
495 |
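For readers who know Python's `re` module, the three capture styles above have rough counterparts there. The sketch below is a Python analogy only: it does not use the eggex syntax or `_group()`, the conversion is applied by hand with `int()`, and the function names are hypothetical:

```python
import re

def first_number(text):
    """Positional capture: group 1 is returned as a string, or None if no match."""
    m = re.search(r'hi (\d+)', text)
    return m.group(1) if m else None

def month_as_int(text):
    """Named capture followed by an explicit conversion to int."""
    m = re.search(r'(?P<month>\d+)', text)
    return int(m.group('month')) if m else None

print(first_number('hi 42'))
print(month_as_int('month 11'))
```

In the eggex form the conversion is declared inside the capture itself, so the caller does not have to remember to apply it.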
|
496 | Related docs and help topics:
|
497 |
|
498 | - [YSH Regex API](../ysh-regex-api.html)
|
499 | - [`_group()`](chap-builtin-func.html#_group)
|
500 | - [`Match => group()`](chap-type-method.html#group)
|
501 |
|
502 | ### re-splice
|
503 |
|
504 | To build an eggex out of smaller expressions, you can **splice** eggexes
|
505 | together:
|
506 |
|
507 | var D = / [0-9][0-9] /
|
508 | var time = / @D ':' @D / # [0-9][0-9]:[0-9][0-9]
|
509 |
|
510 | If the variable begins with a capital letter, you can omit `@`:
|
511 |
|
512 | var ip = / D ':' D /
|
513 |
|
514 | You can also splice a string:
|
515 |
|
516 | var greeting = 'hi'
|
517 | var pat = / @greeting ' world' / # hi world
|
518 |
|
519 | Splicing is **not** string concatenation; it works on eggex subtrees.
|
520 |
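The distinction matters because concatenating regex source text can change its meaning. The Python sketch below shows the pitfall with plain `re` pattern strings and the manual grouping needed to avoid it. It is an illustration of the problem that subtree splicing sidesteps, not a description of how eggex is implemented, and the helper names are hypothetical:

```python
import re

def splice_naive(left, right):
    """Plain string concatenation of two regex sources."""
    return left + right

def splice_grouped(left, right):
    """Wrap each part in a non-capturing group so it stays a single unit."""
    return '(?:' + left + ')(?:' + right + ')'

alt = 'a|b'

# 'a|bc' means "a" OR "bc", so 'ac' no longer matches.
print(re.fullmatch(splice_naive(alt, 'c'), 'ac'))

# '(?:a|b)(?:c)' means ("a" OR "b") followed by "c", so 'ac' matches.
print(re.fullmatch(splice_grouped(alt, 'c'), 'ac') is not None)
```

With string concatenation the grouping has to be added by hand at every splice point, and forgetting it silently changes what the pattern matches.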
|
521 | ### re-flags
|
522 |
|
523 | Valid ERE flags, which are passed to libc's `regcomp()`:
|
524 |
|
525 | - `reg_icase` aka `i` - ignore case
|
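526 | - `reg_newline` - changes four newline-related matching rules: any-character and negated-class matches stop matching a newline, and the start and end anchors also match just after and just before each newline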
|
527 |
|
528 | See `man regcomp`.
|
529 |
|
530 | ### re-multiline
|
531 |
|
532 | Multi-line eggexes aren't yet implemented. Splicing makes it less necessary:
|
533 |
|
534 | var Name = / <capture [a-z]+ as name> /
|
535 | var Num = / <capture d+ as num> /
|
536 | var Space = / <capture s+ as space> /
|
537 |
|
538 | # For variables named like CapWords, splicing @Name doesn't require @
|
539 | var lexer = / Name | Num | Space /
|