Notes on VM Opcodes
===================

2018 Bytecode assessment
------------------------

### Same

Stack Management Bytecodes that are the same:

- `POP_TOP`
- `DUP_TOP_TWO`
- `ROT_{TWO,THREE,FOUR}`

(We could switch to a register VM, but that would be a completely orthogonal
change.)

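As a reminder of what the stack management bytecodes do, here is a minimal sketch that models the value stack as a Python list whose last element is the top of the stack. The helper names (`pop_top`, `dup_top_two`, `rot_n`) are made up for illustration and are not taken from any interpreter.

```python
def pop_top(stack):
    """POP_TOP: discard the top-of-stack item."""
    stack.pop()


def dup_top_two(stack):
    """DUP_TOP_TWO: duplicate the two topmost items, keeping their order."""
    stack.extend(stack[-2:])


def rot_n(stack, n):
    """ROT_TWO / ROT_THREE / ROT_FOUR for n = 2, 3, 4.

    Moves the top item down to position n and lifts the items that were
    below it up by one position.
    """
    top = stack.pop()
    stack.insert(len(stack) - (n - 1), top)


stack = ["a", "b", "c"]   # "c" is the top of the stack
rot_n(stack, 3)           # "c" moves down to position three
dup_top_two(stack)        # the two topmost items are copied
pop_top(stack)            # the topmost copy is dropped again
```

A register VM would replace all of these with explicit operand slots, which is why that change is independent of the rest of the list.
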
Control Flow Bytecodes that are the same:

- `SETUP_LOOP`
- `SETUP_WITH`
- `WITH_CLEANUP`
- `SETUP_{EXCEPT,FINALLY}`
- `END_FINALLY`
- `POP_BLOCK`
- `JUMP_{FORWARD,ABSOLUTE,...}`
- `POP_JUMP_*`
- `{BREAK,CONTINUE}_LOOP`
- `RETURN_VALUE`
- `RAISE_VARARGS` -- although it's not as general
- `GET_ITER`

Data structure bytecodes that are likely the same:

- `{LIST,SET,MAP}_ADD`
- `STORE_MAP`
- `BUILD_{TUPLE,LIST,SET,MAP}`
- `SLICE_*`

At least in the beginning they are the same. Later we might have specialized
data structures, e.g. `Array<Str>`, which is extremely common in shell.

### Changed

Load / Store bytecodes that will take indices instead of names:

- `{LOAD,STORE}_NAME`
- fast variants go away: `{LOAD,STORE}_FAST`
- `{LOAD,STORE}_GLOBAL`
- `{LOAD,STORE}_ATTR` - for object members

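To make "indices instead of names" concrete, here is a rough sketch of a compile-time pass that assigns each name a slot index, plus a tiny interpreter loop that only ever sees the indices. Everything here (`resolve_names`, `run`, the tuple encoding of instructions) is hypothetical and only meant to show the idea, not a planned design.

```python
def resolve_names(ops):
    """Replace each name operand with a slot index, assigned on first sight."""
    slots = {}  # name -> index; only the compiler needs this table
    resolved = []
    for op, arg in ops:
        if op in ("LOAD_NAME", "STORE_NAME"):
            arg = slots.setdefault(arg, len(slots))
        resolved.append((op, arg))
    return resolved, len(slots)


def run(resolved, num_slots):
    """Execute the resolved program; variables live in a flat list."""
    variables = [None] * num_slots
    stack = []
    for op, arg in resolved:
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "LOAD_NAME":
            stack.append(variables[arg])   # index lookup, no dict lookup
        elif op == "STORE_NAME":
            variables[arg] = stack.pop()
        else:
            raise ValueError("unknown op: %s" % op)
    return variables


program = [
    ("LOAD_CONST", 1), ("STORE_NAME", "x"),   # x = 1
    ("LOAD_NAME", "x"), ("STORE_NAME", "y"),  # y = x
]
resolved, num_slots = resolve_names(program)
```

Once every load and store is an index, a separate "fast" variant has nothing left to optimize, which is why those can go away.
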
Highly changed based on language semantics:

- `CALL_FUNCTION_*` -- Instead of four variants, we may just have one more
  static kind.
  - It will support `f(msg, *args)` and `f(*args, **kwargs)`, but maybe not
    much else?

Bytecodes that can be type-specialized:

- `BINARY_*`
- `UNARY_*`
- `COMPARE_OP` -- or maybe just don't allow nonsensical comparisons

Maybe type-specialized:

- `FOR_ITER` -- iterating over the items in a list or the characters in a
  string could be compiled statically. In other words, the iterator protocol
  isn't quite necessary.

### Removed

Dynamic bytecodes that will go away, because names are statically resolved:

- `BUILD_CLASS`
- `MAKE_FUNCTION`
- `IMPORT_{NAME,STAR}`
- maybe: `MAKE_CLOSURE`: this should be done statically? Closures and classes
  should be the same? It's like calling a constructor.

Other Removed:

- `DELETE_NAME`: Namespaces are static
- Might be unnecessary for our purposes: `YIELD_FROM`
- `EXEC_STMT`: I want a different interface to the compiler, for
  metaprogramming purposes.

Deprecated:

- `PRINT_*` -- this should just be a normal function call

### Additions

- Bytecodes for ASDL structures?
- Bytecodes for shell?
- For parsing VM?

2017
----

This is an elaboration on:

https://docs.python.org/2/library/dis.html

I copy the descriptions and add my notes, based on what I'm working on.


`SETUP_LOOP(delta)`

Pushes a block for a loop onto the block stack. The block spans from the
current instruction with a size of delta bytes.

NOTES: compiler2 generates an extra `SETUP_LOOP`, for generator expressions,
along with `POP_BLOCK`.


`POP_BLOCK()`

Removes one block from the block stack. Per frame, there is a stack of blocks,
denoting nested loops, try statements, and such.


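Here is a small sketch of how I picture the per-frame block stack. It is a simplified model under my own assumptions: a block records where the loop ends and how deep the value stack was when the block was set up, and breaking out of the loop pops the block, trims the value stack back to that depth, and jumps to the end. The names `Block`, `Frame`, and `break_loop` are illustrative, not interpreter API.

```python
from collections import namedtuple

# kind: what pushed the block; handler: where to jump when leaving it;
# level: value-stack depth at the time the block was pushed.
Block = namedtuple("Block", "kind handler level")


class Frame:
    def __init__(self):
        self.stack = []        # value stack
        self.block_stack = []  # one entry per enclosing loop / try / with

    def setup_loop(self, next_pc, delta):
        """SETUP_LOOP(delta): the block ends delta bytes past next_pc."""
        self.block_stack.append(
            Block("loop", next_pc + delta, len(self.stack)))

    def pop_block(self):
        """POP_BLOCK: normal exit from the innermost block."""
        return self.block_stack.pop()

    def break_loop(self):
        """Leave the innermost block early; returns the pc to jump to."""
        block = self.block_stack.pop()
        del self.stack[block.level:]   # drop temporaries left by the loop
        return block.handler


frame = Frame()
frame.stack.append("outer value")
frame.setup_loop(next_pc=3, delta=10)
frame.stack.extend(["iterator", "temporary"])
target = frame.break_loop()
```

The recorded stack level is the reason a block is needed at all: a loop can be abandoned with temporaries such as the iterator still on the value stack.
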
`LOAD_CLOSURE(i)`

Pushes a reference to the cell contained in slot `i` of the cell and free
variable storage. The name of the variable is `co_cellvars[i]` if `i` is less
than the length of `co_cellvars`. Otherwise it is
`co_freevars[i - len(co_cellvars)]`.

NOTES: compiler2 generates an extra one of these.


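The slot-to-name rule above is short enough to write down directly. This sketch applies it to real code objects through their `co_cellvars` and `co_freevars` attributes. The helper name `deref_name` is made up, and the rule is the one quoted above, so other interpreter versions may number the slots differently.

```python
def deref_name(code, i):
    """Name of the variable in slot i of the cell and free variable storage."""
    num_cells = len(code.co_cellvars)
    if i < num_cells:
        return code.co_cellvars[i]
    return code.co_freevars[i - num_cells]


def outer():
    x = 1            # captured by inner, so it is a cell variable of outer

    def inner():
        return x     # x is a free variable of inner

    return inner


inner = outer()
name_in_outer = deref_name(outer.__code__, 0)
name_in_inner = deref_name(inner.__code__, 0)
```

The same variable shows up as a cell variable in the function that owns it and as a free variable in the function that captures it.
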
`MAKE_CLOSURE(argc)`

Creates a new function object, sets its `func_closure` slot, and pushes it on
the stack. `TOS` is the code associated with the function, `TOS1` the tuple
containing cells for the closure’s free variables. The function also has `argc`
default parameters, which are found below the cells.


`LOAD_DEREF(i)`

Loads the cell contained in slot `i` of the cell and free variable storage.
Pushes a reference to the object the cell contains on the stack.


`GET_ITER()`

Implements `TOS = iter(TOS)`.

NOTES: Hm how do I implement this? It turns it from a collection into an
iterator. Gah.

The C call is `PyObject *iter = PyObject_GetIter(iterable);`

- `objects/abstract.c` -
- `objects/iterobject.c` - `PySeqIter_New`
  - `PySeqIter_Type` has an `it_seq` field: the `PyObject` being iterated
    over. It maintains an index too.
  - How does `items()` work as an iterable then?

Then `iter_iternext()` calls `PySequence_GetItem(seq, it->it_index)`.


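To make the notes above concrete, here is a rough Python model of the lookup order as I understand it: an object that supplies its own iterator is asked for it directly, and only a plain sequence falls back to an index-based iterator that holds the sequence and a counter. `SeqIter` and `get_iter` are illustrative names. The field names mirror `it_seq` and `it_index` from the notes, and the code uses the Python 3 spelling `__next__`.

```python
class SeqIter(object):
    """Index-based iterator over anything that supports seq[i]."""

    def __init__(self, seq):
        self.it_seq = seq     # the object being iterated over
        self.it_index = 0     # next index to fetch

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None:
            raise StopIteration
        try:
            item = self.it_seq[self.it_index]
        except IndexError:
            self.it_seq = None          # exhausted; drop the reference
            raise StopIteration from None
        self.it_index += 1
        return item


def get_iter(obj):
    """Model of GET_ITER: prefer the object's own iterator, else fall back."""
    if hasattr(type(obj), "__iter__"):
        return iter(obj)
    if hasattr(type(obj), "__getitem__"):
        return SeqIter(obj)
    raise TypeError("%r object is not iterable" % type(obj).__name__)


class Squares(object):
    """A sequence with __getitem__ only, so it needs the fallback."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if i >= self.n:
            raise IndexError(i)
        return i * i


squares = list(get_iter(Squares(4)))
```

This also suggests an answer to the `items()` question: a container that provides its own iterator never reaches the index-based fallback.
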
`LOAD_FAST(var_num)`

Pushes a reference to the local `co_varnames[var_num]` onto the stack.

NOTES: This still does a named lookup? Generator expressions do
`LOAD_FAST 0 (.0)` since there is no formal parameter name.

Oh I see, there is a `PyObject** fastlocals` in EvalFrame.

It's initialized to `f->f_localsplus`, so the frame holds them. Oh I see, that's
where the frame setup is different! Don't need `inspect.getcallargs`.

FastCall populates fastlocals from `PyObject** args` and `nargs`.

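A sketch of the frame setup described in these notes, assuming positional arguments only: the frame owns a flat array of local slots, the call fills the first `nargs` slots straight from the argument list, and `LOAD_FAST` / `STORE_FAST` index into that array with no name lookup. `FastFrame` and the `_UNSET` sentinel are made-up names for illustration.

```python
_UNSET = object()  # marks a local slot that has not been assigned yet


class FastFrame(object):
    def __init__(self, varnames, args):
        # One slot per local variable; the leading slots are copied
        # directly from the positional arguments.
        self.varnames = varnames
        self.fastlocals = list(args) + [_UNSET] * (len(varnames) - len(args))
        self.stack = []

    def load_fast(self, var_num):
        value = self.fastlocals[var_num]
        if value is _UNSET:
            raise UnboundLocalError(
                "local variable %r referenced before assignment"
                % self.varnames[var_num])
        self.stack.append(value)

    def store_fast(self, var_num):
        self.fastlocals[var_num] = self.stack.pop()


# Roughly: def f(a, b): tmp = a + b
frame = FastFrame(("a", "b", "tmp"), (10, 20))
frame.load_fast(0)
frame.load_fast(1)
right = frame.stack.pop()
left = frame.stack.pop()
frame.stack.append(left + right)   # stands in for BINARY_ADD
frame.store_fast(2)
```

The names in `varnames` are only used for the error message here, which matches the observation that the `.0` argument of a generator expression needs no real parameter name.
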