| 1 | +# YARV Frame Layout |
| 2 | + |
| 3 | +This document is an introduction to the layout of the VM stack as calls happen. |
| 4 | +The code holds the ultimate truth for this subject, so beware that this |
| 5 | +document can become stale. |
| 6 | + |
| 7 | +We'll walk through the following program, with explanation at selected points |
| 8 | +in execution and abridged instruction sequence disassembly listings: |
| 9 | + |
| 10 | +```ruby |
| 11 | +def foo(x, y) |
| 12 | + z = x.casecmp(y) |
| 13 | +end |
| 14 | + |
| 15 | +foo(:one, :two) |
| 16 | +``` |
| 17 | + |
| 18 | +First, after arguments are evaluated and right before the `send` to `foo`: |
| 19 | + |
| 20 | +``` |
| 21 | + ┌────────────┐ |
| 22 | + putself │ :two │ |
| 23 | + putobject :one 0x2 ├────────────┤ |
| 24 | + putobject :two │ :one │ |
| 25 | +► send <:foo, argc:2> 0x1 ├────────────┤ |
| 26 | + leave │ self │ |
| 27 | + 0x0 └────────────┘ |
| 28 | +``` |
| 29 | + |
| 30 | +The `put*` instructions have pushed 3 items onto the stack. It's now time to |
| 31 | +add a new control frame for `foo`. The following is the shape of the stack |
| 32 | +after one instruction in `foo`: |
| 33 | + |
| 34 | +``` |
| 35 | + cfp->sp=0x8 at this point. |
| 36 | + 0x8 ┌────────────┐◄──Stack space for temporaries |
| 37 | + │ :one │ live above the environment. |
| 38 | + 0x7 ├────────────┤ |
| 39 | + getlocal x@0 │ < flags > │ foo's rb_control_frame_t |
| 40 | +► getlocal y@1 0x6 ├────────────┤◄──has cfp->ep=0x6 |
| 41 | + send <:casecmp, argc:1> │ <no block> │ |
| 42 | + dup 0x5 ├────────────┤ The flags, block, and CME triple |
| 43 | + setlocal z@2 │ <CME: foo> │ (VM_ENV_DATA_SIZE) form an |
| 44 | + leave 0x4 ├────────────┤ environment. They can be used to |
| 45 | + │ z (nil) │ figure out what local variables |
| 46 | + 0x3 ├────────────┤ are below them. |
| 47 | + │ :two │ |
| 48 | + 0x2 ├────────────┤ Notice how the arguments, now |
| 49 | + │ :one │ locals, never moved. This layout |
| 50 | + 0x1 ├────────────┤ allows for argument transfer |
| 51 | + │ self │ without copying. |
| 52 | + 0x0 └────────────┘ |
| 53 | +``` |
| 54 | + |
| 55 | +Given that locals have lower addresses than `cfp->ep`, it makes sense then that |
| 56 | +`getlocal` in `insns.def` has `val = *(vm_get_ep(GET_EP(), level) - idx);`. |
| 57 | +When accessing variables in the immediate scope, where `level=0`, it's |
| 58 | +essentially `val = cfp->ep[-idx];`. |
| 59 | + |
| 60 | +Note that this EP-relative index has a different basis than the index that comes |
| 61 | +after "@" in disassembly listings. The `@` index is relative to the 0th local |
| 62 | +(`x` in this case). |
| 63 | + |
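To make the two index bases concrete, here is a toy model of the second diagram, written as a sketch rather than as a description of the real VM. It treats the stack as a Ruby Array whose indexes stand in for addresses. The helpers `build_stack`, `ep_index`, and `getlocal` are hypothetical names invented for this illustration, and the conversion formula is inferred from the diagram above, not copied from the CRuby source.

```ruby
# Toy model of the stack after one instruction in `foo`.
# Array index N stands in for stack address 0xN in the diagram.
def build_stack
  [
    :self_obj, # 0x0  receiver of foo
    :one,      # 0x1  x (argument, now a local)
    :two,      # 0x2  y (argument, now a local)
    nil,       # 0x3  z
    :cme_foo,  # 0x4  <CME: foo>
    :no_block, # 0x5  <no block>
    :flags,    # 0x6  < flags >, cfp->ep points here
    :one       # 0x7  temporary pushed by `getlocal x@0`
  ]
end

# Hypothetical helper: convert the "@" index from a disassembly listing
# into the EP-relative idx. env_data_size models VM_ENV_DATA_SIZE (the
# CME, block, and flags triple). Formula inferred from the diagram.
def ep_index(at_index, local_table_size, env_data_size = 3)
  env_data_size + local_table_size - 1 - at_index
end

# Models `val = cfp->ep[-idx];` for the immediate scope (level 0).
def getlocal(stack, ep, idx)
  stack[ep - idx]
end

stack = build_stack
ep = 6
puts ep_index(0, 3)                               # 5, so x is ep[-5]
puts getlocal(stack, ep, ep_index(0, 3)).inspect  # :one
puts getlocal(stack, ep, ep_index(1, 3)).inspect  # :two
puts getlocal(stack, ep, ep_index(2, 3)).inspect  # nil
```

In this model `x@0` is the local farthest from EP and `z@2` is the closest, which is why the two numbering schemes run in opposite directions.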
| 64 | +## Q&A |
| 65 | + |
| 66 | +Q: It seems that the receiver is always at an offset relative to EP, |
| 67 | + like locals. Couldn't we use EP to access it instead of using `cfp->self`? |
| 68 | + |
| 69 | +A: Not all calls put the `self` in the callee on the stack. Two |
| 70 | + examples are `Proc#call`, where the receiver is the Proc object, but `self` |
| 71 | + inside the callee is `Proc#receiver`, and `yield`, where the receiver isn't |
| 72 | + pushed onto the stack before the arguments. |
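The `Proc#call` case can be observed from plain Ruby. The sketch below only shows the visible behavior, not the frame layout itself. `proc_capturing_self` is a hypothetical helper written for this illustration; `Binding#receiver` is used to inspect the `self` that the Proc captured.

```ruby
# Hypothetical helper: build a Proc whose body runs with `obj` as self.
def proc_capturing_self(obj)
  obj.instance_eval { proc { self } }
end

owner = Object.new
pr = proc_capturing_self(owner)

# The receiver of the `call` send is `pr`, but `self` inside the callee
# is the object captured when the Proc was created.
inner_self = pr.call
puts inner_self.equal?(owner)          # true
puts inner_self.equal?(pr)             # false
puts pr.binding.receiver.equal?(owner) # true
```

Since the `self` seen by the callee was never pushed as the receiver of this call, no fixed offset from EP could locate it.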
| 73 | + |
| 74 | +Q: Why have `cfp->ep` when it seems that everything is below `cfp->sp`? |
| 75 | + |
| 76 | +A: In the example, EP points to the stack, but it can also point to the GC heap. |
| 77 | + Blocks can capture and evacuate their environment to the heap. |
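A small Ruby program shows why an environment sometimes has to outlive its frame. This is a sketch of the observable behavior only: it does not show where the VM stores the environment. `make_counter` is a hypothetical method written for this illustration.

```ruby
# The lambda captures the local `count`. After make_counter returns, its
# control frame is gone, yet the lambda can still read and write `count`.
def make_counter
  count = 0
  -> { count += 1 }
end

counter = make_counter
puts counter.call # 1
puts counter.call # 2

# Each call to make_counter gets its own environment.
other = make_counter
puts other.call   # 1
```

Because `count` stays usable after `make_counter` has returned, its environment cannot remain in the stack slots of the popped frame, which matches the answer above: EP must be able to point somewhere other than the VM stack.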
| 78 | + |
| 79 | + |
| 80 | + |
| 81 | + |
| 82 | + |
| 83 | + |