Commit 86c4994

Add explainer for YARV frame layout
Just a first step. Have a read, and let's improve it together. Close: GH-6
1 parent 8405548 commit 86c4994

2 files changed

+160
-0
lines changed

doc/yarv_frame_layout.md

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
# YARV Frame Layout

This document is an introduction to the layout of the VM stack as calls happen.
The code holds the ultimate truth for this subject, so beware that this
document can become stale.

We'll walk through the following program, with explanation at selected points
in execution and abridged instruction sequence disassembly listings:

```ruby
def foo(x, y)
  z = x.casecmp(y)
end

foo(:one, :two)
```

First, after arguments are evaluated and right before the `send` to `foo`:

```
                             ┌────────────┐
   putself                   │    :two    │
   putobject :one        0x2 ├────────────┤
   putobject :two            │    :one    │
►  send <:foo, argc:2>   0x1 ├────────────┤
   leave                     │    self    │
                         0x0 └────────────┘
```
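
As a rough mental model (not how the VM is implemented), the caller-side listing above can be replayed against a plain Ruby array standing in for the VM stack. The names `stack` and `push`, and the `:main` placeholder for the top-level `self`, are illustrative assumptions.

```ruby
# Toy model of the caller's operand stack; index 0 is the bottom (address 0x0).
stack = []
push = ->(value) { stack.push(value) }

push.call(:main)  # putself         (placeholder for the top-level self)
push.call(:one)   # putobject :one
push.call(:two)   # putobject :two

# Right before `send <:foo, argc:2>`: the receiver sits below its two arguments.
argc = 2
receiver, *args = stack.last(argc + 1)
p receiver  # => :main
p args      # => [:one, :two]
```

The point of the model is only the ordering: the receiver is pushed first, so the arguments end up directly above it.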

The `put*` instructions have pushed 3 items onto the stack. It's now time to
add a new control frame for `foo`. The following is the shape of the stack
after one instruction in `foo`:

```
                                 cfp->sp=0x8 at this point.
                             0x8 ┌────────────┐◄──Stack space for temporaries
                                 │    :one    │   live above the environment.
                             0x7 ├────────────┤
   getlocal      x@0             │ < flags >  │   foo's rb_control_frame_t
►  getlocal      y@1         0x6 ├────────────┤◄──has cfp->ep=0x6
   send <:casecmp, argc:1>       │ <no block> │
   dup                       0x5 ├────────────┤  The flags, block, and CME triple
   setlocal      z@2             │ <CME: foo> │  (VM_ENV_DATA_SIZE) form an
   leave                     0x4 ├────────────┤  environment. They can be used to
                                 │  z (nil)   │  figure out what local variables
                             0x3 ├────────────┤  are below them.
                                 │    :two    │
                             0x2 ├────────────┤  Notice how the arguments, now
                                 │    :one    │  locals, never moved. This layout
                             0x1 ├────────────┤  allows for argument transfer
                                 │    self    │  without copying.
                             0x0 └────────────┘
```

Given that locals have lower addresses than `cfp->ep`, it makes sense that
`getlocal` in `insns.def` has `val = *(vm_get_ep(GET_EP(), level) - idx);`.
When accessing variables in the immediate scope, where `level=0`, it's
essentially `val = cfp->ep[-idx];`.
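
The EP-relative read can be sketched with a Ruby array in place of the VM stack. This is a toy model that mirrors the diagram above, not VM code: `stack`, `ep`, `sp`, and `getlocal` are hypothetical names here, the placeholder symbols stand in for the real flags, block, and CME slots, and the `idx` values are simply read off the diagram.

```ruby
# Slots 0x0..0x7 from the diagram, bottom of the stack first.
stack = [
  :self,      # 0x0 receiver pushed by the caller
  :one,       # 0x1 x (argument, now a local)
  :two,       # 0x2 y (argument, now a local)
  nil,        # 0x3 z (not assigned yet)
  :cme_foo,   # 0x4 placeholder for <CME: foo>
  :no_block,  # 0x5 placeholder for <no block>
  :flags,     # 0x6 placeholder for < flags >; cfp->ep points here
  :one        # 0x7 temporary pushed by `getlocal x@0`
]
ep = 0x6
sp = stack.size  # 0x8, the next free slot

# Model of `val = cfp->ep[-idx]` for level=0.
getlocal = ->(idx) { stack[ep - idx] }

p getlocal.call(5)  # => :one (x)
p getlocal.call(4)  # => :two (y)
p getlocal.call(3)  # => nil  (z)
```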

Note that this EP-relative index has a different basis than the index that
comes after "@" in disassembly listings. The "@" index is relative to the 0th
local (`x` in this case).
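
Reading the diagram, the two numbering schemes differ only in direction and offset. The conversion below is inferred from this example's layout (three locals sitting under a three-slot environment); `ep_index_for` is a hypothetical helper, and the formula should be treated as an assumption for any other frame shape.

```ruby
ENV_DATA_SIZE = 3   # flags, block, and CME slots (VM_ENV_DATA_SIZE in the diagram)
LOCALS = %i[x y z]  # foo's locals in "@" order

# Distance below EP for the local whose disassembly index is `at`.
def ep_index_for(at, local_count = LOCALS.size)
  ENV_DATA_SIZE + local_count - 1 - at
end

LOCALS.each_with_index do |name, at|
  puts "#{name}@#{at} -> ep[-#{ep_index_for(at)}]"
end
# x@0 -> ep[-5]
# y@1 -> ep[-4]
# z@2 -> ep[-3]
```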

## Q&A

Q: It seems that the receiver is always at an offset relative to EP,
like locals. Couldn't we use EP to access it instead of using `cfp->self`?

A: Not all calls put the `self` in the callee on the stack. Two
examples are `Proc#call`, where the receiver is the Proc object, but `self`
inside the callee is `Proc#receiver`, and `yield`, where the receiver isn't
pushed onto the stack before the arguments.

Q: Why have `cfp->ep` when it seems that everything is below `cfp->sp`?

A: In the example, EP points to the stack, but it can also point to the GC heap.
Blocks can capture and evacuate their environment to the heap.

doc/yarv_frames.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
# YARV Frame Layout

This document is an introduction to what happens on the VM stack as the VM
services calls. The code holds the ultimate truth for this subject, so beware
that this document can become stale.

We'll walk through the following program, with explanation at selected points
in execution and abridged disassembly listings:

```ruby
def foo(x, y)
  z = x.casecmp(y)
end

foo(:one, :two)
```

First, after arguments are evaluated and right before the `send` to `foo`:

```
                             ┌────────────┐
   putself                   │    :two    │
   putobject :one        0x2 ├────────────┤
   putobject :two            │    :one    │
►  send <:foo, argc:2>   0x1 ├────────────┤
   leave                     │    self    │
                         0x0 └────────────┘
```

The `put*` instructions have pushed 3 items onto the stack. It's now time to
add a new control frame for `foo`. The following is the shape of the stack
after one instruction in `foo`:

```
                                 cfp->sp=0x8 at this point.
                             0x8 ┌────────────┐◄──Stack space for temporaries
                                 │    :one    │   live above the environment.
                             0x7 ├────────────┤
   getlocal      x@0             │ < flags >  │   foo's rb_control_frame_t
►  getlocal      y@1         0x6 ├────────────┤◄──has cfp->ep=0x6
   send <:casecmp, argc:1>       │ <no block> │
   dup                       0x5 ├────────────┤  The flags, block, and CME triple
   setlocal      z@2             │ <CME: foo> │  (VM_ENV_DATA_SIZE) form an
   leave                     0x4 ├────────────┤  environment. They can be used to
                                 │  z (nil)   │  figure out what local variables
                             0x3 ├────────────┤  are below them.
                                 │    :two    │
                             0x2 ├────────────┤  Notice how the arguments, now
                                 │    :one    │  locals, never moved. This layout
                             0x1 ├────────────┤  allows for argument transfer
                                 │    self    │  without copying.
                             0x0 └────────────┘
```

Given that locals have lower addresses than `cfp->ep`, it makes sense that
`getlocal` in `insns.def` has `val = *(vm_get_ep(GET_EP(), level) - idx);`.
When accessing variables in the immediate scope, where `level=0`, it's
essentially `val = cfp->ep[-idx];`.

Note that this EP-relative index has a different basis than the index that
comes after "@" in disassembly listings. The "@" index is relative to the 0th
local (`x` in this case).
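
To compare the abridged listings with what an interpreter actually emits, CRuby exposes `RubyVM::InstructionSequence`. The following is a minimal sketch that assumes CRuby; instruction names and formatting vary between versions (specialized forms of `getlocal` and `send` may appear, for instance), so no output is shown here.

```ruby
def foo(x, y)
  z = x.casecmp(y)
end

# Fetch the compiled instruction sequence for `foo` and render it as text.
iseq = RubyVM::InstructionSequence.of(method(:foo))
listing = iseq.disasm
puts listing
```

Local variable operands in the printed listing carry the same "@" suffix discussed above.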

## Q&A

Q: It seems that the receiver is always at an offset relative to EP,
like locals. Couldn't we use EP to access it instead of using `cfp->self`?

A: Not all calls put the `self` in the callee on the stack. Two
examples are `Proc#call`, where the receiver is the Proc object, but `self`
inside the callee is `Proc#receiver`, and `yield`, where the receiver isn't
pushed onto the stack before the arguments.
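
The `Proc#call` case is easy to observe from Ruby itself. In this small illustration, `Widget` and `self_reporter` are made-up names; the receiver of `call` is the Proc, while `self` inside the block is the object that created it.

```ruby
class Widget
  # Returns a proc whose body reports the `self` it runs with.
  def self_reporter
    proc { self }
  end
end

widget = Widget.new
reporter = widget.self_reporter

# The receiver of `call` is the Proc, yet `self` inside the block is the Widget.
p reporter.call.equal?(widget)    # => true
p reporter.call.equal?(reporter)  # => false
```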

Q: Why have `cfp->ep` when it seems that everything is below `cfp->sp`?

A: In the example, `cfp->ep` points to the stack, but it can also point to the
GC heap. Blocks can capture and evacuate their environment to the heap.
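
The need for this shows up whenever a block outlives the frame that created it. The sketch below only demonstrates the observable behavior (the local `count` survives after `make_counter` returns); where the environment is stored is not visible from Ruby code, and `make_counter` is an illustrative name.

```ruby
def make_counter
  count = 0
  -> { count += 1 }  # the lambda captures make_counter's environment
end

counter = make_counter  # make_counter has already returned by now
p counter.call  # => 1
p counter.call  # => 2
```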
