- Refine the API design of Intel hardware intrinsics
- Implement remaining AVX2 intrinsics [rely on (1)]
- Implement remaining SSE4.2 intrinsics [rely on (1)]
- Enable containment analysis on more hardware intrinsic forms (e.g., imm, 1-arg, 3-arg, etc.)
- Implement FMA intrinsics [rely on (4)]
  - FMA intrinsic codegen differs from that of other ISAs: its instruction selection depends on where the operands reside (e.g., in registers or in memory)
- Implement other ISA classes (Bmi1, Bmi2, Aes, and Pclmulqdq)
- Fully support all Intel hardware intrinsics in the existing APIs
- Create non-trivial benchmarks for Intel hardware intrinsics
- Improve the CQ of Intel hardware intrinsics based on key scenarios [partially rely on (7)]
- Investigate the JIT throughput impact from hardware intrinsic recognition [rely on (7)]
- Identify candidates that can be optimized using HW intrinsics and implement them using intrinsics (CoreFX, mscorlib, HPC, ML, etc.)
- more...
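
To illustrate the FMA item above, here is a minimal sketch of how operand placement could drive the choice among the 132/213/231 instruction forms for `a * b + c`. The names `Loc`, `FmaForm`, and `SelectFmaForm` are hypothetical and are not taken from the JIT; the sketch only assumes the documented rule that the third instruction operand is the only one that may be a memory operand.

```cpp
// Hypothetical operand location as seen by a code generator.
enum class Loc { Register, Memory };

// The three encodings of a fused multiply-add (e.g., vfmadd132ps/213ps/231ps).
enum class FmaForm { Form132, Form213, Form231 };

// Pick a form for result = a * b + c.
// Only the third instruction operand (op3) may come from memory:
//   132: op1 = op1 * op3 + op2
//   213: op1 = op2 * op1 + op3
//   231: op1 = op2 * op3 + op1
// If more than one input lives in memory, only one can be folded;
// the others would have to be loaded into registers first.
constexpr FmaForm SelectFmaForm(Loc a, Loc b, Loc c)
{
    if (c == Loc::Memory)
    {
        return FmaForm::Form213; // fold the addend as op3
    }
    if (a == Loc::Memory || b == Loc::Memory)
    {
        return FmaForm::Form132; // fold one multiplicand as op3
    }
    return FmaForm::Form213; // all in registers: any form works, 213 is an arbitrary default
}
```

A real implementation would also have to account for which register may be overwritten as the destination, which is where the 231 form (destination holds the addend) becomes useful.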
category:cq
theme:intrinsics
skill-level:intermediate
cost:extra-large