This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Commit 710f870

Add .bitmask instruction family (#201)
* Add .bitmask instruction family

  i8x16.bitmask and i32x4.bitmask directly map to SSE movemask instructions; i16x8.bitmask can be synthesized using packs+movemask.

  These instructions are important to be able to do lane-wise processing after a vector comparison - for example, these can be used together with ctz to find the index of the first lane with the matching values after a comparison instruction.

* Update opcode tables with bitmask
1 parent 0cad3d2 commit 710f870
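
The use case named in the commit message (a comparison, then bitmask, then ctz) can be modelled in plain Python. This is a minimal sketch, not code from the proposal: lanes are ordinary Python integers, and the helper names `i8x16_eq`, `bitmask`, `ctz`, and `first_match` are hypothetical.

```python
def i8x16_eq(a, b):
    """Lane-wise equality: -1 (all bits set) where lanes match, else 0."""
    return [-1 if x == y else 0 for x, y in zip(a, b)]


def bitmask(lanes):
    """Collect the sign (high) bit of each lane; lane i becomes bit i."""
    result = 0
    for i, lane in enumerate(lanes):
        if lane < 0:
            result |= 1 << i
    return result


def ctz(x):
    """Count trailing zero bits of a nonzero integer."""
    return (x & -x).bit_length() - 1


def first_match(haystack, needle):
    """Index of the first lane equal to needle, or -1 if no lane matches."""
    mask = bitmask(i8x16_eq(haystack, [needle] * len(haystack)))
    return ctz(mask) if mask else -1
```

For a 16-byte input such as `list(b"hello, wasm simd")`, `first_match(..., ord("w"))` returns the lane index of the first `w`. The zero-mask check matters because ctz of zero has no meaningful lane index.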

4 files changed: 25 additions and 1 deletion


proposals/simd/BinarySIMD.md

Lines changed: 3 additions & 0 deletions
@@ -114,6 +114,7 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `i8x16.neg` | `0x61`| - |
 | `i8x16.any_true` | `0x62`| - |
 | `i8x16.all_true` | `0x63`| - |
+| `i8x16.bitmask` | `0x64`| - |
 | `i8x16.narrow_i16x8_s` | `0x65`| - |
 | `i8x16.narrow_i16x8_u` | `0x66`| - |
 | `i8x16.shl` | `0x6b`| - |
@@ -134,6 +135,7 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `i16x8.neg` | `0x81`| - |
 | `i16x8.any_true` | `0x82`| - |
 | `i16x8.all_true` | `0x83`| - |
+| `i16x8.bitmask` | `0x84`| - |
 | `i16x8.narrow_i32x4_s` | `0x85`| - |
 | `i16x8.narrow_i32x4_u` | `0x86`| - |
 | `i16x8.widen_low_i8x16_s` | `0x87`| - |
@@ -159,6 +161,7 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `i32x4.neg` | `0xa1`| - |
 | `i32x4.any_true` | `0xa2`| - |
 | `i32x4.all_true` | `0xa3`| - |
+| `i32x4.bitmask` | `0xa4`| - |
 | `i32x4.widen_low_i16x8_s` | `0xa7`| - |
 | `i32x4.widen_high_i16x8_s` | `0xa8`| - |
 | `i32x4.widen_low_i16x8_u` | `0xa9`| - |

proposals/simd/ImplementationStatus.md

Lines changed: 3 additions & 0 deletions
@@ -87,6 +87,7 @@
 | `i8x16.neg` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i8x16.any_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i8x16.all_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
+| `i8x16.bitmask` | `-munimplemented-simd128` | :heavy_check_mark: | | | |
 | `i8x16.narrow_i16x8_s` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i8x16.narrow_i16x8_u` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i8x16.shl` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
@@ -107,6 +108,7 @@
 | `i16x8.neg` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i16x8.any_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i16x8.all_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
+| `i16x8.bitmask` | `-munimplemented-simd128` | :heavy_check_mark: | | | |
 | `i16x8.narrow_i32x4_s` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i16x8.narrow_i32x4_u` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i16x8.widen_low_i8x16_s` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
@@ -132,6 +134,7 @@
 | `i32x4.neg` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i32x4.any_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i32x4.all_true` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
+| `i32x4.bitmask` | `-munimplemented-simd128` | :heavy_check_mark: | | | |
 | `i32x4.widen_low_i16x8_s` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i32x4.widen_high_i16x8_s` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |
 | `i32x4.widen_low_i16x8_u` | `-msimd128` | :heavy_check_mark: | | | :heavy_check_mark: |

proposals/simd/NewOpcodes.md

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@
 | i8x16.neg | 0x61 | i16x8.neg | 0x81 | i32x4.neg | 0xa1 | i64x2.neg | 0xc1 |
 | i8x16.any_true | 0x62 | i16x8.any_true | 0x82 | i32x4.any_true | 0xa2 | ---- | 0xc2 |
 | i8x16.all_true | 0x63 | i16x8.all_true | 0x83 | i32x4.all_true | 0xa3 | ---- | 0xc3 |
-| ---- bitmask ---- | 0x64 | ---- bitmask ---- | 0x84 | ---- bitmask ---- | 0xa4 | ---- | 0xc4 |
+| i8x16.bitmask | 0x64 | i16x8.bitmask | 0x84 | i32x4.bitmask | 0xa4 | ---- | 0xc4 |
 | i8x16.narrow_i16x8_s | 0x65 | i16x8.narrow_i32x4_s | 0x85 | ---- narrow ---- | 0xa5 | ---- | 0xc5 |
 | i8x16.narrow_i16x8_u | 0x66 | i16x8.narrow_i32x4_u | 0x86 | ---- narrow ---- | 0xa6 | ---- | 0xc6 |
 | ---- widen ---- | 0x67 | i16x8.widen_low_i8x16_s | 0x87 | i32x4.widen_low_i16x8_s | 0xa7 | ---- | 0xc7 |

proposals/simd/SIMD.md

Lines changed: 18 additions & 0 deletions
@@ -648,6 +648,24 @@ def S.all_true(a):
     return 1
 ```
 
+## Bitmask extraction
+
+* `i8x16.bitmask(a: v128) -> i32`
+* `i16x8.bitmask(a: v128) -> i32`
+* `i32x4.bitmask(a: v128) -> i32`
+
+These operations extract the high bit for each lane in `a` and produce a scalar
+mask with all bits concatenated.
+
+```python
+def S.bitmask(a):
+    result = 0
+    for i in range(S.Lanes):
+        if a[i] < 0:
+            result = result | (1 << i)
+    return result
+```
+
 ## Comparisons
 
 The comparison operations all compare two vectors lane-wise, and produce a mask
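
The `S.bitmask` pseudocode added in this hunk is written in the spec's own notation and does not run as Python. The following is a runnable sketch of the same semantics, assuming a `v128` is represented as 16 little-endian bytes. The names `LANE_FORMATS`, `bitmask`, and `i16x8_bitmask_via_packs` are hypothetical. The second function only models the packs+movemask idea from the commit message (signed saturating narrowing preserves each lane's sign, so a byte mask of the narrowed lanes yields the same bits); zeroing the upper half is an assumption made for the model, and real code generation may differ.

```python
import struct

# struct formats for each lane shape: little-endian, signed lanes.
LANE_FORMATS = {"i8x16": "<16b", "i16x8": "<8h", "i32x4": "<4i"}


def bitmask(shape, v128):
    """Set bit i of the result when lane i of v128 is negative (high bit set)."""
    lanes = struct.unpack(LANE_FORMATS[shape], v128)
    result = 0
    for i, lane in enumerate(lanes):
        if lane < 0:
            result |= 1 << i
    return result


def i16x8_bitmask_via_packs(v128):
    """Model of packs+movemask: saturate i16 lanes to i8, then take the byte mask."""
    lanes = struct.unpack("<8h", v128)
    # Signed saturation clamps to [-128, 127], so negative lanes stay negative.
    narrowed = [max(-128, min(127, lane)) for lane in lanes]
    # Assumption: the upper eight bytes are zero, so they contribute no mask bits.
    packed = struct.pack("<8b", *narrowed) + bytes(8)
    return bitmask("i8x16", packed)
```

As a usage check, both `bitmask("i16x8", v)` and `i16x8_bitmask_via_packs(v)` should agree for any 16-byte `v`, since saturation never changes a lane's sign.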
