Conversation
… score calculation in llama
… block fused -- 2.5 TPS
…straction with new iron/ structure
📊 Test Results for Small Benchmark/Test Suite0685306 (2026_02_06_13_53_22) IRONCLADTested on
📈 Trends (vs main branch) for Small Benchmark/Test Suite0685306 (2026_02_06_13_53_22) IRONCLAD Trendsaxpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_1_cols_2_channels_2048_tile_2048_3.0_0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0_0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0_0
axpy_8_cols_2_channels_2048_tile_256_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0_0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_1_channels_2048_tile_2048_0
dequant_1_cols_2_channels_2048_tile_1024
dequant_1_cols_2_channels_2048_tile_1024_0
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024_0
dequant_2_cols_2_channels_2048_tile_512
dequant_2_cols_2_channels_2048_tile_512_0
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512_0
dequant_4_cols_2_channels_2048_tile_256
dequant_4_cols_2_channels_2048_tile_256_0
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256_0
dequant_8_cols_2_channels_2048_tile_128
dequant_8_cols_2_channels_2048_tile_128_0
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_1792x896x1152_64x32x48_8cols_ccolmaj
gemm_192x384x64_48x96x16_4cols
gemm_192x384x64_48x96x16_4cols_bcolmaj_ccolmaj
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_1cols
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2cols_bcolmaj
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8cols_bcolmaj_ccolmaj
gemm_384x1536x1792_32x48x64_4cols_bcolmaj
gemm_896x1792x640_32x64x80_8cols_ccolmaj
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_128x128_32_1col0
matrix_vector_mul_128x128_32tsi_128tso_1col0
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_1col0
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_2col0
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_4col0
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_2048x8192_1_8col0
matrix_vector_mul_2048x8192_1tsi_1024tso_2col0
matrix_vector_mul_2048x8192_1tsi_2048tso_1col0
matrix_vector_mul_2048x8192_1tsi_256tso_8col0
matrix_vector_mul_2048x8192_1tsi_512tso_4col0
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_1col0
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_2col0
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_4col0
matrix_vector_mul_8192x2048_4_8col
matrix_vector_mul_8192x2048_4_8col0
matrix_vector_mul_8192x2048_4tsi_1024tso_1col0
matrix_vector_mul_8192x2048_4tsi_1024tso_2col0
matrix_vector_mul_8192x2048_4tsi_1024tso_4col0
matrix_vector_mul_8192x2048_4tsi_1024tso_8col0
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_16_cores_2_chans_2048_tile_128_False0
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_1_cores_1_chans_2048_tile_2048_False0
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_1_chans_2048_tile_1024_False0
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False0
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_1_chans_2048_tile_512_False0
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False0
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_1_chans_2048_tile_256_False0
mem_copy_8_cores_2_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False0
mha
mha0
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_1c_32rows_512cols_32arows_0m
rope_1c_32rows_512cols_8arows_0m
rope_2_cols_2_channels_4096_tile_2048_0
rope_2c_32rows_512cols_32arows_0m
rope_2c_32rows_512cols_8arows_0m
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
rope_8c_32rows_512cols_32arows_0m
rope_8c_32rows_512cols_8arows_0m
sigmoid_1_cols_1_channels_2048_tile_2048
sigmoid_2_cols_1_channels_2048_tile_1024
sigmoid_4_cols_1_channels_2048_tile_512
sigmoid_8_cols_1_channels_2048_tile_256
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_32768_tile_1024
softmax_2_cols_2_channels_32768_tile_512
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
softmax_4_cols_4_channels_32768_tile_2048
swigluNo metrics available. swiglu_decode_1x2048x2048
swiglu_decode_1x2048x2048_0
swiglu_prefill_256x2048x2048_0No metrics available. tanh_1_cols_1_channels_2048_tile_2048
tanh_2_cols_1_channels_2048_tile_1024
tanh_4_cols_1_channels_2048_tile_512
tanh_8_cols_1_channels_2048_tile_256
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s0
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s0
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
|
📊 Test Results for Test Example Applications0685306 (2026_02_06_13_57_42) IRONCLADTested on
📈 Trends (vs main branch) for Test Example Applications0685306 (2026_02_06_13_57_42) IRONCLAD Trendsllama_3.2_1b
llama_3.2_1b_prompt_13_tokens_1
llama_3.2_1b_prompt_13_tokens_40
llama_3.2_1b_prompt_2048_tokens_1
llama_3.2_1b_prompt_2048_tokens_40
|
📊 Test Results for Small Benchmark/Test Suiteb0ad401 (2026_02_06_16_24_33) IRONCLADTested on
📈 Trends (vs main branch) for Small Benchmark/Test Suiteb0ad401 (2026_02_06_16_24_33) IRONCLAD Trendsaxpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_1_cols_2_channels_2048_tile_2048_3.0_0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0_0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0_0
axpy_8_cols_2_channels_2048_tile_256_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0_0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_1_channels_2048_tile_2048_0
dequant_1_cols_2_channels_2048_tile_1024
dequant_1_cols_2_channels_2048_tile_1024_0
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024_0
dequant_2_cols_2_channels_2048_tile_512
dequant_2_cols_2_channels_2048_tile_512_0
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512_0
dequant_4_cols_2_channels_2048_tile_256
dequant_4_cols_2_channels_2048_tile_256_0
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256_0
dequant_8_cols_2_channels_2048_tile_128
dequant_8_cols_2_channels_2048_tile_128_0
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_1792x896x1152_64x32x48_8cols_ccolmaj
gemm_192x384x64_48x96x16_4cols
gemm_192x384x64_48x96x16_4cols_bcolmaj_ccolmaj
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_1cols
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2cols_bcolmaj
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8cols_bcolmaj_ccolmaj
gemm_384x1536x1792_32x48x64_4cols_bcolmaj
gemm_896x1792x640_32x64x80_8cols_ccolmaj
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_128x128_32_1col0
matrix_vector_mul_128x128_32tsi_128tso_1col0
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_1col0
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_2col0
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_4col0
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_2048x8192_1_8col0
matrix_vector_mul_2048x8192_1tsi_1024tso_2col0
matrix_vector_mul_2048x8192_1tsi_2048tso_1col0
matrix_vector_mul_2048x8192_1tsi_256tso_8col0
matrix_vector_mul_2048x8192_1tsi_512tso_4col0
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_1col0
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_2col0
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_4col0
matrix_vector_mul_8192x2048_4_8col
matrix_vector_mul_8192x2048_4_8col0
matrix_vector_mul_8192x2048_4tsi_1024tso_1col0
matrix_vector_mul_8192x2048_4tsi_1024tso_2col0
matrix_vector_mul_8192x2048_4tsi_1024tso_4col0
matrix_vector_mul_8192x2048_4tsi_1024tso_8col0
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_16_cores_2_chans_2048_tile_128_False0
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_1_cores_1_chans_2048_tile_2048_False0
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_1_chans_2048_tile_1024_False0
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False0
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_1_chans_2048_tile_512_False0
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False0
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_1_chans_2048_tile_256_False0
mem_copy_8_cores_2_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False0
mha
mha0
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_1c_32rows_512cols_32arows_0m
rope_1c_32rows_512cols_8arows_0m
rope_2_cols_2_channels_4096_tile_2048_0
rope_2c_32rows_512cols_32arows_0m
rope_2c_32rows_512cols_8arows_0m
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
rope_8c_32rows_512cols_32arows_0m
rope_8c_32rows_512cols_8arows_0m
sigmoid_1_cols_1_channels_2048_tile_2048
sigmoid_2_cols_1_channels_2048_tile_1024
sigmoid_4_cols_1_channels_2048_tile_512
sigmoid_8_cols_1_channels_2048_tile_256
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
swigluNo metrics available. swiglu_decode_1x2048x2048
swiglu_decode_1x2048x2048_0
tanh_1_cols_1_channels_2048_tile_2048
tanh_2_cols_1_channels_2048_tile_1024
tanh_4_cols_1_channels_2048_tile_512
tanh_8_cols_1_channels_2048_tile_256
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s0
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s0
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
|
📊 Test Results for Test Example Applicationsb0ad401 (2026_02_06_16_28_30) IRONCLADTested on
📈 Trends (vs main branch) for Test Example Applicationsb0ad401 (2026_02_06_16_28_30) IRONCLAD Trendsllama_3.2_1b
llama_3.2_1b_prompt_13_tokens_1
llama_3.2_1b_prompt_13_tokens_40
llama_3.2_1b_prompt_2048_tokens_1
llama_3.2_1b_prompt_2048_tokens_40
|
📊 Test Results for Small Benchmark/Test Suitedfee1e0 (2026_02_06_17_29_36) IRONCLADTested on
📈 Trends (vs main branch) for Small Benchmark/Test Suitedfee1e0 (2026_02_06_17_29_36) IRONCLAD Trendsaxpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_1_cols_2_channels_2048_tile_2048_3.0_0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0_0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0_0
axpy_8_cols_2_channels_2048_tile_256_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0_0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_1_channels_2048_tile_2048_0
dequant_1_cols_2_channels_2048_tile_1024
dequant_1_cols_2_channels_2048_tile_1024_0
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024_0
dequant_2_cols_2_channels_2048_tile_512
dequant_2_cols_2_channels_2048_tile_512_0
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512_0
dequant_4_cols_2_channels_2048_tile_256
dequant_4_cols_2_channels_2048_tile_256_0
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256_0
dequant_8_cols_2_channels_2048_tile_128
dequant_8_cols_2_channels_2048_tile_128_0
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_1792x896x1152_64x32x48_8cols_ccolmaj
gemm_192x384x64_48x96x16_4cols
gemm_192x384x64_48x96x16_4cols_bcolmaj_ccolmaj
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_1cols
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2cols_bcolmaj
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8cols_bcolmaj_ccolmaj
gemm_384x1536x1792_32x48x64_4cols_bcolmaj
gemm_896x1792x640_32x64x80_8cols_ccolmaj
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_128x128_32_1col0
matrix_vector_mul_128x128_32tsi_128tso_1col0
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_1col0
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_2col0
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_4col0
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_2048x8192_1_8col0
matrix_vector_mul_2048x8192_1tsi_1024tso_2col0
matrix_vector_mul_2048x8192_1tsi_2048tso_1col0
matrix_vector_mul_2048x8192_1tsi_256tso_8col0
matrix_vector_mul_2048x8192_1tsi_512tso_4col0
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_1col0
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_2col0
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_4col0
matrix_vector_mul_8192x2048_4_8col
matrix_vector_mul_8192x2048_4_8col0
matrix_vector_mul_8192x2048_4tsi_1024tso_1col0
matrix_vector_mul_8192x2048_4tsi_1024tso_2col0
matrix_vector_mul_8192x2048_4tsi_1024tso_4col0
matrix_vector_mul_8192x2048_4tsi_1024tso_8col0
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_16_cores_2_chans_2048_tile_128_False0
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_1_cores_1_chans_2048_tile_2048_False0
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_1_chans_2048_tile_1024_False0
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False0
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_1_chans_2048_tile_512_False0
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False0
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_1_chans_2048_tile_256_False0
mem_copy_8_cores_2_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False0
mha
mha0
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_1c_32rows_512cols_32arows_0m
rope_1c_32rows_512cols_8arows_0m
rope_2_cols_2_channels_4096_tile_2048_0
rope_2c_32rows_512cols_32arows_0m
rope_2c_32rows_512cols_8arows_0m
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
rope_8c_32rows_512cols_32arows_0m
rope_8c_32rows_512cols_8arows_0m
sigmoid_1_cols_1_channels_2048_tile_2048
sigmoid_2_cols_1_channels_2048_tile_1024
sigmoid_4_cols_1_channels_2048_tile_512
sigmoid_8_cols_1_channels_2048_tile_256
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
swigluNo metrics available. swiglu_decode_1x2048x2048
swiglu_decode_1x2048x2048_0
tanh_1_cols_1_channels_2048_tile_2048
tanh_2_cols_1_channels_2048_tile_1024
tanh_4_cols_1_channels_2048_tile_512
tanh_8_cols_1_channels_2048_tile_256
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s0
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s0
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
|
📊 Test Results for Small Benchmark/Test Suitee199ef2 (2026_02_06_17_42_34) IRONCLADTested on
📈 Trends (vs main branch) for Small Benchmark/Test Suitee199ef2 (2026_02_06_17_42_34) IRONCLAD Trendsaxpy_1_cols_2_channels_2048_tile_2048_3.0
axpy_1_cols_2_channels_2048_tile_2048_3.0_0
axpy_2_cols_2_channels_2048_tile_1024_3.0
axpy_2_cols_2_channels_2048_tile_1024_3.0_0
axpy_4_cols_2_channels_2048_tile_512_3.0
axpy_4_cols_2_channels_2048_tile_512_3.0_0
axpy_8_cols_2_channels_2048_tile_256_3.0
axpy_8_cols_2_channels_2048_tile_256_3.0_0
dequant_1_cols_1_channels_2048_tile_2048
dequant_1_cols_1_channels_2048_tile_2048_0
dequant_1_cols_2_channels_2048_tile_1024
dequant_1_cols_2_channels_2048_tile_1024_0
dequant_2_cols_1_channels_2048_tile_1024
dequant_2_cols_1_channels_2048_tile_1024_0
dequant_2_cols_2_channels_2048_tile_512
dequant_2_cols_2_channels_2048_tile_512_0
dequant_4_cols_1_channels_2048_tile_512
dequant_4_cols_1_channels_2048_tile_512_0
dequant_4_cols_2_channels_2048_tile_256
dequant_4_cols_2_channels_2048_tile_256_0
dequant_8_cols_1_channels_2048_tile_256
dequant_8_cols_1_channels_2048_tile_256_0
dequant_8_cols_2_channels_2048_tile_128
dequant_8_cols_2_channels_2048_tile_128_0
eltwise_add_1_cols_2_channels_2048_tile_2048
eltwise_add_2_cols_2_channels_2048_tile_1024
eltwise_add_4_cols_2_channels_2048_tile_512
eltwise_add_8_cols_2_channels_2048_tile_256
eltwise_mul_1_cols_2_channels_2048_tile_2048
eltwise_mul_2_cols_2_channels_2048_tile_1024
eltwise_mul_4_cols_2_channels_2048_tile_512
eltwise_mul_8_cols_2_channels_2048_tile_256
gelu_1_cols_1_channels_2048_tile_2048
gelu_1_cols_2_channels_2048_tile_1024
gelu_2_cols_1_channels_2048_tile_1024
gelu_2_cols_2_channels_2048_tile_512
gelu_4_cols_1_channels_2048_tile_512
gelu_4_cols_2_channels_2048_tile_256
gelu_8_cols_1_channels_2048_tile_256
gelu_8_cols_2_channels_2048_tile_128
gemm_1792x896x1152_64x32x48_8cols_ccolmaj
gemm_192x384x64_48x96x16_4cols
gemm_192x384x64_48x96x16_4cols_bcolmaj_ccolmaj
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x32_8_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_1cols
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_2_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_2cols_bcolmaj
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_0_bcolmaj_1_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8_cols_1_bcolmaj_0_ccolmaj_0_0
gemm_2048x2048x2048_64x64x64_8cols_bcolmaj_ccolmaj
gemm_384x1536x1792_32x48x64_4cols_bcolmaj
gemm_896x1792x640_32x64x80_8cols_ccolmaj
layer_norm_1_cols_1_channels_2048_tile_2048
layer_norm_1_cols_2_channels_2048_tile_1024
layer_norm_2_cols_1_channels_2048_tile_1024
layer_norm_2_cols_2_channels_2048_tile_512
layer_norm_4_cols_1_channels_2048_tile_512
layer_norm_4_cols_2_channels_2048_tile_256
layer_norm_8_cols_1_channels_2048_tile_256
layer_norm_8_cols_2_channels_2048_tile_128
matrix_vector_mul_128x128_32_1col
matrix_vector_mul_128x128_32_1col0
matrix_vector_mul_128x128_32tsi_128tso_1col0
matrix_vector_mul_2048x8192_1_1col
matrix_vector_mul_2048x8192_1_1col0
matrix_vector_mul_2048x8192_1_2col
matrix_vector_mul_2048x8192_1_2col0
matrix_vector_mul_2048x8192_1_4col
matrix_vector_mul_2048x8192_1_4col0
matrix_vector_mul_2048x8192_1_8col
matrix_vector_mul_2048x8192_1_8col0
matrix_vector_mul_2048x8192_1tsi_1024tso_2col0
matrix_vector_mul_2048x8192_1tsi_2048tso_1col0
matrix_vector_mul_2048x8192_1tsi_256tso_8col0
matrix_vector_mul_2048x8192_1tsi_512tso_4col0
matrix_vector_mul_8192x2048_4_1col
matrix_vector_mul_8192x2048_4_1col0
matrix_vector_mul_8192x2048_4_2col
matrix_vector_mul_8192x2048_4_2col0
matrix_vector_mul_8192x2048_4_4col
matrix_vector_mul_8192x2048_4_4col0
matrix_vector_mul_8192x2048_4_8col
matrix_vector_mul_8192x2048_4_8col0
matrix_vector_mul_8192x2048_4tsi_1024tso_1col0
matrix_vector_mul_8192x2048_4tsi_1024tso_2col0
matrix_vector_mul_8192x2048_4tsi_1024tso_4col0
matrix_vector_mul_8192x2048_4tsi_1024tso_8col0
mem_copy_16_cores_2_chans_2048_tile_128_False
mem_copy_16_cores_2_chans_2048_tile_128_False0
mem_copy_1_cols_1_channels_2048_tile_2048
mem_copy_1_cols_2_channels_2048_tile_1024
mem_copy_1_cores_1_chans_2048_tile_2048_False
mem_copy_1_cores_1_chans_2048_tile_2048_False0
mem_copy_2_cols_1_channels_2048_tile_1024
mem_copy_2_cols_2_channels_2048_tile_512
mem_copy_2_cores_1_chans_2048_tile_1024_False
mem_copy_2_cores_1_chans_2048_tile_1024_False0
mem_copy_2_cores_2_chans_2048_tile_1024_False
mem_copy_2_cores_2_chans_2048_tile_1024_False0
mem_copy_4_cols_1_channels_2048_tile_512
mem_copy_4_cols_2_channels_2048_tile_256
mem_copy_4_cores_1_chans_2048_tile_512_False
mem_copy_4_cores_1_chans_2048_tile_512_False0
mem_copy_4_cores_2_chans_2048_tile_512_False
mem_copy_4_cores_2_chans_2048_tile_512_False0
mem_copy_8_cols_1_channels_2048_tile_256
mem_copy_8_cols_2_channels_2048_tile_128
mem_copy_8_cores_1_chans_2048_tile_256_False
mem_copy_8_cores_1_chans_2048_tile_256_False0
mem_copy_8_cores_2_chans_2048_tile_256_False
mem_copy_8_cores_2_chans_2048_tile_256_False0
mha
mha0
relu_1_cols_1_channels_2048_tile_2048
relu_2_cols_1_channels_2048_tile_1024
relu_4_cols_1_channels_2048_tile_512
relu_8_cols_1_channels_2048_tile_256
rms_norm_1_cols_1_channels_2048_tile_2048
rms_norm_1_cols_2_channels_2048_tile_1024
rms_norm_2_cols_1_channels_2048_tile_1024
rms_norm_2_cols_2_channels_2048_tile_512
rms_norm_4_cols_1_channels_2048_tile_512
rms_norm_4_cols_2_channels_2048_tile_256
rms_norm_8_cols_1_channels_2048_tile_256
rms_norm_8_cols_2_channels_2048_tile_128
rope_1_cols_2_channels_4096_tile_4096_0
rope_1c_32rows_512cols_32arows_0m
rope_1c_32rows_512cols_8arows_0m
rope_2_cols_2_channels_4096_tile_2048_0
rope_2c_32rows_512cols_32arows_0m
rope_2c_32rows_512cols_8arows_0m
rope_4_cols_2_channels_4096_tile_1024_0
rope_8_cols_2_channels_4096_tile_512_0
rope_8c_32rows_512cols_32arows_0m
rope_8c_32rows_512cols_8arows_0m
sigmoid_1_cols_1_channels_2048_tile_2048
sigmoid_2_cols_1_channels_2048_tile_1024
sigmoid_4_cols_1_channels_2048_tile_512
sigmoid_8_cols_1_channels_2048_tile_256
silu_1_cols_1_channels_2048_tile_2048
silu_2_cols_1_channels_2048_tile_1024
silu_4_cols_1_channels_2048_tile_512
silu_8_cols_1_channels_2048_tile_256
softmax_1_cols_2_channels_4096_tile_2048
softmax_2_cols_2_channels_32768_tile_1024
softmax_2_cols_2_channels_32768_tile_512
softmax_2_cols_2_channels_4096_tile_1024
softmax_2_cols_2_channels_4096_tile_512
softmax_4_cols_4_channels_32768_tile_2048
swigluNo metrics available. swiglu_decode_1x2048x2048
swiglu_decode_1x2048x2048_0
swiglu_prefill_256x2048x2048_0No metrics available. tanh_1_cols_1_channels_2048_tile_2048
tanh_2_cols_1_channels_2048_tile_1024
tanh_4_cols_1_channels_2048_tile_512
tanh_8_cols_1_channels_2048_tile_256
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_1_channels_64_m_64_n_8_s0
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s
transpose_2048_M_64_N_1_cols_2_channels_64_m_64_n_8_s0
weighted_rms_norm_1_cols_2_channels_2048_weights_2048
weighted_rms_norm_2_cols_2_channels_2048_weights_1024
weighted_rms_norm_4_cols_2_channels_2048_weights_512
weighted_rms_norm_8_cols_2_channels_2048_weights_256
|
📊 Test Results for Test Example Applicationse199ef2 (2026_02_06_17_46_55) IRONCLADTested on
📈 Trends (vs main branch) for Test Example Applicationse199ef2 (2026_02_06_17_46_55) IRONCLAD Trendsllama_3.2_1b
llama_3.2_1b_prompt_13_tokens_1
llama_3.2_1b_prompt_13_tokens_40
llama_3.2_1b_prompt_2048_tokens_1
llama_3.2_1b_prompt_2048_tokens_40
|
The draftiest of draft PRs.
Added
Changed
Removed
PR Merge Checklist
develcommit and pointing todevel.