Fold negate operations on constants by moving UnaryConstProp to high priority #1733
base: main
…ivConstSimplify patterns Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
FAIL: //test/lit_tests:negate_const_prop.mlir.test (see /root/.bazel/execroot/__main__/bazel-out/aarch64-dbg/testlogs/test/lit_tests/negate_const_prop.mlir.test/test.log)
INFO: From Testing //test/lit_tests:negate_const_prop.mlir.test:
==================== Test output for //test/lit_tests:negate_const_prop.mlir.test:
-- Testing: 1 tests, 1 workers --
FAIL: Enzyme-JaX :: lit_tests/negate_const_prop.mlir (1 of 1)
******************** TEST 'Enzyme-JaX :: lit_tests/negate_const_prop.mlir' FAILED ********************
Exit Code: 1
Command Output (stderr):
--
enzymexlamlir-opt --enzyme-hlo-opt /root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir | FileCheck /root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir # RUN: at line 1
+ enzymexlamlir-opt --enzyme-hlo-opt /root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir
+ FileCheck /root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir
/root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir:46:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: %[[C:.*]] = stablehlo.constant dense<-2.000000e+00> : tensor<4x4xf32>
^
<stdin>:17:75: note: scanning from here
func.func @neg_div_const_rhs(%arg0: tensor<4x4xf32>) -> tensor<4x4xf32> {
^
<stdin>:25:2: note: possible intended match here
%cst = stablehlo.constant dense<-3.000000e+00> : tensor<4x4xf32>
^
Input file: <stdin>
Check file: /root/.bazel/sandbox/processwrapper-sandbox/13774/execroot/__main__/bazel-out/aarch64-dbg/bin/test/lit_tests/negate_const_prop.mlir.test.runfiles/__main__/test/lit_tests/negate_const_prop.mlir
-dump-input=help explains the following input dump.
Input was:
<<<<<<
.
.
.
12: %0 = stablehlo.divide %cst, %arg0 : tensor<4x4xf32>
13: return %0 : tensor<4x4xf32>
14: }
15: }
16: module {
17: func.func @neg_div_const_rhs(%arg0: tensor<4x4xf32>) -> tensor<4x4xf32> {
next:46'0 X error: no match found
18: %cst = stablehlo.constant dense<-5.000000e-01> : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19: %0 = stablehlo.multiply %arg0, %cst : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20: return %0 : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21: }
next:46'0 ~~~
22: }
next:46'0 ~~~
23: module {
next:46'0 ~~~~~~~~~~
24: func.func @neg_mul_const_rhs(%arg0: tensor<4x4xf32>) -> tensor<4x4xf32> {
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: %cst = stablehlo.constant dense<-3.000000e+00> : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:46'1 ? possible intended match
26: %0 = stablehlo.multiply %arg0, %cst : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27: return %0 : tensor<4x4xf32>
next:46'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
28: }
next:46'0 ~~~
29: }
next:46'0 ~~~
30: }
next:46'0 ~~
31:
next:46'0 ~
>>>>>>
--
********************
********************
Failed Tests (1):
Enzyme-JaX :: lit_tests/negate_const_prop.mlir
Testing Time: 0.10s
Total Discovered Tests: 1
Failed: 1 (100.00%)
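Reading the log above: FileCheck expected `stablehlo.constant dense<-2.000000e+00>` inside `@neg_div_const_rhs`, but the pass emitted `dense<-5.000000e-01>` followed by `stablehlo.multiply %arg0, %cst`. That looks like the divide by a constant being rewritten as a multiply by its reciprocal, in which case the output may be numerically equivalent and only the CHECK lines are stale. A minimal Python sketch of that equivalence follows. It assumes the expected form was a divide of `%arg0` by the `-2.0` constant (the log only shows the constant line, so the divide is an assumption), and the 4x4 input values are made up for illustration:

```python
# Hypothetical stand-in for the 4x4 tensor argument %arg0 in the test.
arg0 = [[float(4 * r + c) for c in range(4)] for r in range(4)]


def expected_form(x):
    # Assumed shape of what the CHECK lines anticipated:
    # divide by the folded constant -2.0.
    return [[v / -2.0 for v in row] for row in x]


def actual_form(x):
    # What the log shows the pass emitted: multiply by the constant -0.5.
    return [[v * -0.5 for v in row] for row in x]


# -0.5 is the exact reciprocal of -2.0 (a power of two), so both forms
# agree exactly in IEEE floating point for these inputs.
same = expected_form(arg0) == actual_form(arg0)
print(same)
```

If that reading is right, the fix is probably on the test side (updating the expected constant and op for `@neg_div_const_rhs`) rather than in the pattern itself, but that should be confirmed against the actual test file.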
… test conventions Co-authored-by: avik-pal <30564094+avik-pal@users.noreply.github.com>
EnzymeJAX Benchmarks
| Benchmark suite | Current: 461c2b5 | Previous: 5807b16 | Ratio |
|---|---|---|---|
actmtch / JaX / cpu / Primal |
0.000004001951400005055 s |
0.000004077052000002368 s |
0.98 |
actmtch / JaXPipe / cpu / Primal |
0.0000038023466000140617 s |
0.000003983729900028266 s |
0.95 |
actmtch / JaX / cpu / Forward |
0.000005954053499999646 s |
0.000006333651200020541 s |
0.94 |
actmtch / JaXPipe / cpu / Forward |
0.000007144875999983924 s |
0.000007304315300007147 s |
0.98 |
actmtch / JaX / cpu / BothRev |
0.000006406852800000707 s |
0.000006389050899997528 s |
1.00 |
actmtch / JaXPipe / cpu / PreRev |
0.000007333234299994729 s |
0.000007454925799993362 s |
0.98 |
actmtch / JaXPipe / cpu / PostRev |
0.000006071447700014687 s |
0.00000632967709998411 s |
0.96 |
actmtch / JaXPipe / cpu / BothRev |
0.000007547885299982226 s |
0.00000736418199999207 s |
1.02 |
actmtch / JaX / cpu / Primal |
0.000007211223104968667 s |
0.000004077052000002368 s |
1.77 |
actmtch / JaXPipe / cpu / Primal |
0.000007408289902377874 s |
0.000003983729900028266 s |
1.86 |
actmtch / JaX / cpu / Forward |
0.000011168615700444209 s |
0.000006333651200020541 s |
1.76 |
actmtch / JaXPipe / cpu / Forward |
0.000014182031998643651 s |
0.000007304315300007147 s |
1.94 |
actmtch / JaX / cpu / BothRev |
0.00001097114799777046 s |
0.000006389050899997528 s |
1.72 |
actmtch / JaXPipe / cpu / PreRev |
0.000013555501395603642 s |
0.000007454925799993362 s |
1.82 |
actmtch / JaXPipe / cpu / PostRev |
0.000010726586700184272 s |
0.00000632967709998411 s |
1.69 |
actmtch / JaXPipe / cpu / BothRev |
0.000014121868798974902 s |
0.00000736418199999207 s |
1.92 |
actmtch / JaX / gpu / Primal |
0.00007511262430343777 s |
0.00008572420809996401 s |
0.88 |
actmtch / JaXPipe / gpu / Primal |
0.00007649357540067286 s |
0.00007496612270006154 s |
1.02 |
actmtch / JaX / gpu / Forward |
0.000109652780299 s |
0.0001092814691999 s |
1.00 |
actmtch / JaXPipe / gpu / Forward |
0.0001102674426045 s |
0.0001033177317 s |
1.07 |
actmtch / JaX / gpu / BothRev |
0.0001048842000949 s |
0.0001036159211998 s |
1.01 |
actmtch / JaXPipe / gpu / PreRev |
0.0001049357293988 s |
0.0001102903179998 s |
0.95 |
actmtch / JaXPipe / gpu / PostRev |
0.0001047791151038 s |
0.0001044380528999 s |
1.00 |
actmtch / JaXPipe / gpu / BothRev |
0.0001053454142995 s |
0.0001112481207999 s |
0.95 |
actmtch / JaX / cpu / Primal |
0.000003858289000163495 s |
0.000004077052000002368 s |
0.95 |
actmtch / JaXPipe / cpu / Primal |
0.000003831351999906474 s |
0.000003983729900028266 s |
0.96 |
actmtch / JaX / cpu / Forward |
0.000005349009900055535 s |
0.000006333651200020541 s |
0.84 |
actmtch / JaXPipe / cpu / Forward |
0.0000065607209999143375 s |
0.000007304315300007147 s |
0.90 |
actmtch / JaX / cpu / BothRev |
0.000005582144999971206 s |
0.000006389050899997528 s |
0.87 |
actmtch / JaXPipe / cpu / PreRev |
0.000006140340000092692 s |
0.000007454925799993362 s |
0.82 |
actmtch / JaXPipe / cpu / PostRev |
0.000005662017900067439 s |
0.00000632967709998411 s |
0.89 |
actmtch / JaXPipe / cpu / BothRev |
0.0000064751920001072 s |
0.00000736418199999207 s |
0.88 |
actmtch / JaX / tpu / Primal |
0.0001354097843999 s |
0.0001585435907996 s |
0.85 |
actmtch / JaXPipe / tpu / Primal |
0.0001351169114001 s |
0.0001541092320003 s |
0.88 |
actmtch / JaX / tpu / Forward |
0.0002406727831001 s |
0.0002245106158996 s |
1.07 |
actmtch / JaXPipe / tpu / Forward |
0.0002360667031 s |
0.0002414041468997 s |
0.98 |
actmtch / JaX / tpu / BothRev |
0.0002051004260998 s |
0.0002466575215999 s |
0.83 |
actmtch / JaXPipe / tpu / PreRev |
0.0002150883980999 s |
0.0002274255777003 s |
0.95 |
actmtch / JaXPipe / tpu / PostRev |
0.0002229205501 s |
0.0002327662272997 s |
0.96 |
actmtch / JaXPipe / tpu / BothRev |
0.0002204400882001 s |
0.0002310955225002 s |
0.95 |
actmtch / JaX / cpu / Primal |
0.000005343041600008291 s |
0.000004077052000002368 s |
1.31 |
actmtch / JaXPipe / cpu / Primal |
0.000005105859899958887 s |
0.000003983729900028266 s |
1.28 |
actmtch / JaX / cpu / Forward |
0.000007803619100013749 s |
0.000006333651200020541 s |
1.23 |
actmtch / JaXPipe / cpu / Forward |
0.00000974114370001189 s |
0.000007304315300007147 s |
1.33 |
actmtch / JaX / cpu / BothRev |
0.000008214167899950553 s |
0.000006389050899997528 s |
1.29 |
actmtch / JaXPipe / cpu / PreRev |
0.00000899436700001388 s |
0.000007454925799993362 s |
1.21 |
actmtch / JaXPipe / cpu / PostRev |
0.000008080943500044669 s |
0.00000632967709998411 s |
1.28 |
actmtch / JaXPipe / cpu / BothRev |
0.000009917121600028622 s |
0.00000736418199999207 s |
1.35 |
add_one / JaX / cpu / Primal |
0.000004195664000008037 s |
0.000004221375999986776 s |
0.99 |
add_one / JaXPipe / cpu / Primal |
0.00000413621080001576 s |
0.000004248234299984688 s |
0.97 |
add_one / JaX / cpu / Forward |
0.000007141736099993068 s |
0.000007514756600039618 s |
0.95 |
add_one / JaXPipe / cpu / Forward |
0.000007346325800017439 s |
0.000007491655400008312 s |
0.98 |
add_one / JaX / cpu / BothRev |
0.000007792541900016658 s |
0.000007534473600026103 s |
1.03 |
add_one / JaXPipe / cpu / PreRev |
0.0000074870461000045905 s |
0.000007696040299970264 s |
0.97 |
add_one / JaXPipe / cpu / PostRev |
0.000007443607599998358 s |
0.000007473608000009335 s |
1.00 |
add_one / JaXPipe / cpu / BothRev |
0.000007483675200001016 s |
0.000007556833199987523 s |
0.99 |
add_one / JaX / cpu / Primal |
0.000007523475197376683 s |
0.000004221375999986776 s |
1.78 |
add_one / JaXPipe / cpu / Primal |
0.000007594567799242213 s |
0.000004248234299984688 s |
1.79 |
add_one / JaX / cpu / Forward |
0.00001161255449987948 s |
0.000007514756600039618 s |
1.55 |
add_one / JaXPipe / cpu / Forward |
0.000011762097198516133 s |
0.000007491655400008312 s |
1.57 |
add_one / JaX / cpu / BothRev |
0.00001242623750003986 s |
0.000007534473600026103 s |
1.65 |
add_one / JaXPipe / cpu / PreRev |
0.000012487537099514155 s |
0.000007696040299970264 s |
1.62 |
add_one / JaXPipe / cpu / PostRev |
0.000011810645600780844 s |
0.000007473608000009335 s |
1.58 |
add_one / JaXPipe / cpu / BothRev |
0.000012532111100153996 s |
0.000007556833199987523 s |
1.66 |
add_one / JaX / gpu / Primal |
0.00007877660290105268 s |
0.00007682557509997423 s |
1.03 |
add_one / JaXPipe / gpu / Primal |
0.00007641267340513878 s |
0.00007817539320003562 s |
0.98 |
add_one / JaX / gpu / Forward |
0.0001053381641977 s |
0.0001271991753999 s |
0.83 |
add_one / JaXPipe / gpu / Forward |
0.0001108687464031 s |
0.0001065110891 s |
1.04 |
add_one / JaX / gpu / BothRev |
0.0001132732081052 s |
0.0001098862513001 s |
1.03 |
add_one / JaXPipe / gpu / PreRev |
0.0001136388134036 s |
0.0001095640186998 s |
1.04 |
add_one / JaXPipe / gpu / PostRev |
0.0001125026933033 s |
0.0001090896196999 s |
1.03 |
add_one / JaXPipe / gpu / BothRev |
0.0001117896780022 s |
0.0001090583037999 s |
1.03 |
add_one / JaX / cpu / Primal |
0.0000038595030000578845 s |
0.000004221375999986776 s |
0.91 |
add_one / JaXPipe / cpu / Primal |
0.000003874652000013157 s |
0.000004248234299984688 s |
0.91 |
add_one / JaX / cpu / Forward |
0.000005881521900118969 s |
0.000007514756600039618 s |
0.78 |
add_one / JaXPipe / cpu / Forward |
0.000006208461999995052 s |
0.000007491655400008312 s |
0.83 |
add_one / JaX / cpu / BothRev |
0.000006244080999931612 s |
0.000007534473600026103 s |
0.83 |
add_one / JaXPipe / cpu / PreRev |
0.0000062736858999414834 s |
0.000007696040299970264 s |
0.82 |
add_one / JaXPipe / cpu / PostRev |
0.000006283968000025198 s |
0.000007473608000009335 s |
0.84 |
add_one / JaXPipe / cpu / BothRev |
0.000006262187999891467 s |
0.000007556833199987523 s |
0.83 |
add_one / JaX / tpu / Primal |
0.0001469214134 s |
0.0001558269189001 s |
0.94 |
add_one / JaXPipe / tpu / Primal |
0.0001547014994001 s |
0.0001560672649 s |
0.99 |
add_one / JaX / tpu / Forward |
0.0002400321549999 s |
0.000238849361 s |
1.00 |
add_one / JaXPipe / tpu / Forward |
0.0002212308541 s |
0.000238180681 s |
0.93 |
add_one / JaX / tpu / BothRev |
0.0002263192771 s |
0.0002457600515997 s |
0.92 |
add_one / JaXPipe / tpu / PreRev |
0.0002326260751 s |
0.0002196414550999 s |
1.06 |
add_one / JaXPipe / tpu / PostRev |
0.000247805594 s |
0.0002283411046002 s |
1.09 |
add_one / JaXPipe / tpu / BothRev |
0.0002434140109999 s |
0.0002264422466992 s |
1.07 |
add_one / JaX / cpu / Primal |
0.000005281977499998902 s |
0.000004221375999986776 s |
1.25 |
add_one / JaXPipe / cpu / Primal |
0.000005445990200041706 s |
0.000004248234299984688 s |
1.28 |
add_one / JaX / cpu / Forward |
0.000008289526400039904 s |
0.000007514756600039618 s |
1.10 |
add_one / JaXPipe / cpu / Forward |
0.00000882873179998569 s |
0.000007491655400008312 s |
1.18 |
add_one / JaX / cpu / BothRev |
0.000008960592899984477 s |
0.000007534473600026103 s |
1.19 |
add_one / JaXPipe / cpu / PreRev |
0.00000894251150002674 s |
0.000007696040299970264 s |
1.16 |
add_one / JaXPipe / cpu / PostRev |
0.000008496905399988463 s |
0.000007473608000009335 s |
1.14 |
add_one / JaXPipe / cpu / BothRev |
0.000009014558299986674 s |
0.000007556833199987523 s |
1.19 |
add_two / JaX / cpu / Primal |
0.00000431663680001293 s |
0.000004400090899980569 s |
0.98 |
add_two / JaXPipe / cpu / Primal |
0.000004423477499994988 s |
0.000004424161599990839 s |
1.00 |
add_two / JaX / cpu / Forward |
0.000007162587499988149 s |
0.000007389712399981362 s |
0.97 |
add_two / JaXPipe / cpu / Forward |
0.000007344418200000291 s |
0.000007342772600031821 s |
1.00 |
add_two / JaX / cpu / BothRev |
0.00000904953790000036 s |
0.000009861097500015603 s |
0.92 |
add_two / JaXPipe / cpu / PreRev |
0.00000921900450002795 s |
0.000009610602399970958 s |
0.96 |
add_two / JaXPipe / cpu / PostRev |
0.00000921385960000407 s |
0.000009470495699997628 s |
0.97 |
add_two / JaXPipe / cpu / BothRev |
0.000009197498200001065 s |
0.000009597478399973624 s |
0.96 |
add_two / JaX / cpu / Primal |
0.000007822382601443678 s |
0.000004400090899980569 s |
1.78 |
add_two / JaXPipe / cpu / Primal |
0.000008307919598883018 s |
0.000004424161599990839 s |
1.88 |
add_two / JaX / cpu / Forward |
0.000012051957298535854 s |
0.000007389712399981362 s |
1.63 |
add_two / JaXPipe / cpu / Forward |
0.000012645824800711123 s |
0.000007342772600031821 s |
1.72 |
add_two / JaX / cpu / BothRev |
0.000014892743504606188 s |
0.000009861097500015603 s |
1.51 |
add_two / JaXPipe / cpu / PreRev |
0.000014926284604007378 s |
0.000009610602399970958 s |
1.55 |
add_two / JaXPipe / cpu / PostRev |
0.000014953870500903576 s |
0.000009470495699997628 s |
1.58 |
add_two / JaXPipe / cpu / BothRev |
0.00001501466690097004 s |
0.000009597478399973624 s |
1.56 |
add_two / JaX / gpu / Primal |
0.00007811830929713323 s |
0.00007813301940004749 s |
1.00 |
add_two / JaXPipe / gpu / Primal |
0.00009294862030074 s |
0.00007844890699998359 s |
1.18 |
add_two / JaX / gpu / Forward |
0.0001098364292003 s |
0.0001064112938 s |
1.03 |
add_two / JaXPipe / gpu / Forward |
0.0001313591471 s |
0.0001067091177999 s |
1.23 |
add_two / JaX / gpu / BothRev |
0.0001331018486001 s |
0.0001228723885998 s |
1.08 |
add_two / JaXPipe / gpu / PreRev |
0.0001301270833995 s |
0.0001251365968999 s |
1.04 |
add_two / JaXPipe / gpu / PostRev |
0.0001292183883022 s |
0.0001232139204001 s |
1.05 |
add_two / JaXPipe / gpu / BothRev |
0.0001275435724004 s |
0.0001428660842999 s |
0.89 |
add_two / JaX / cpu / Primal |
0.000003936356000122032 s |
0.000004400090899980569 s |
0.89 |
add_two / JaXPipe / cpu / Primal |
0.000003938152999944577 s |
0.000004424161599990839 s |
0.89 |
add_two / JaX / cpu / Forward |
0.000006113740900036646 s |
0.000007389712399981362 s |
0.83 |
add_two / JaXPipe / cpu / Forward |
0.000006389966999995522 s |
0.000007342772600031821 s |
0.87 |
add_two / JaX / cpu / BothRev |
0.000006900791999942158 s |
0.000009861097500015603 s |
0.70 |
add_two / JaXPipe / cpu / PreRev |
0.000007337476999964565 s |
0.000009610602399970958 s |
0.76 |
add_two / JaXPipe / cpu / PostRev |
0.000007291504000022542 s |
0.000009470495699997628 s |
0.77 |
add_two / JaXPipe / cpu / BothRev |
0.000007316602999890165 s |
0.000009597478399973624 s |
0.76 |
add_two / JaX / tpu / Primal |
0.0001451543803999 s |
0.0001393057717999 s |
1.04 |
add_two / JaXPipe / tpu / Primal |
0.0001368121203999 s |
0.0001396214168002 s |
0.98 |
add_two / JaX / tpu / Forward |
0.0002196909280999 s |
0.0002332947023998 s |
0.94 |
add_two / JaXPipe / tpu / Forward |
0.0002215320731 s |
0.0002180160022995 s |
1.02 |
add_two / JaX / tpu / BothRev |
0.0002375053660998 s |
0.0002375591291995 s |
1.00 |
add_two / JaXPipe / tpu / PreRev |
0.0002567654219999 s |
0.0002436417567994 s |
1.05 |
add_two / JaXPipe / tpu / PostRev |
0.000251546787 s |
0.0002494659634998 s |
1.01 |
add_two / JaXPipe / tpu / BothRev |
0.0002591534979999 s |
0.0002444009566999 s |
1.06 |
add_two / JaX / cpu / Primal |
0.000005456997400051478 s |
0.000004400090899980569 s |
1.24 |
add_two / JaXPipe / cpu / Primal |
0.000006003660500027763 s |
0.000004424161599990839 s |
1.36 |
add_two / JaX / cpu / Forward |
0.000008542415600004461 s |
0.000007389712399981362 s |
1.16 |
add_two / JaXPipe / cpu / Forward |
0.000008553678399948694 s |
0.000007342772600031821 s |
1.16 |
add_two / JaX / cpu / BothRev |
0.000010429514799943718 s |
0.000009861097500015603 s |
1.06 |
add_two / JaXPipe / cpu / PreRev |
0.000010643983999943884 s |
0.000009610602399970958 s |
1.11 |
add_two / JaXPipe / cpu / PostRev |
0.000010140004700042482 s |
0.000009470495699997628 s |
1.07 |
add_two / JaXPipe / cpu / BothRev |
0.000011033762100032618 s |
0.000009597478399973624 s |
1.15 |
cache / JaX / cpu / Primal |
0.000003951801499988506 s |
0.000004257991299982677 s |
0.93 |
cache / JaXPipe / cpu / Primal |
0.000004212864800001626 s |
0.000003844736900009593 s |
1.10 |
cache / JaX / cpu / Forward |
0.000010397164599999087 s |
0.000010476668100000096 s |
0.99 |
cache / JaXPipe / cpu / Forward |
0.0000102472766000119 s |
0.000010631851300013296 s |
0.96 |
cache / JaX / cpu / BothRev |
0.000014821704099995258 s |
0.000014376339499995083 s |
1.03 |
cache / JaXPipe / cpu / PreRev |
0.000011427356799958944 s |
0.000011329544000000169 s |
1.01 |
cache / JaXPipe / cpu / PostRev |
0.000015117984800008344 s |
0.000014214122499970472 s |
1.06 |
cache / JaXPipe / cpu / BothRev |
0.000011250036899991755 s |
0.00001143866600000365 s |
0.98 |
cache / JaX / cpu / Primal |
0.000008239676500670612 s |
0.000004257991299982677 s |
1.94 |
cache / JaXPipe / cpu / Primal |
0.000007812538801226765 s |
0.000003844736900009593 s |
2.03 |
cache / JaX / cpu / Forward |
0.000015278380399104206 s |
0.000010476668100000096 s |
1.46 |
cache / JaXPipe / cpu / Forward |
0.000013208760804263876 s |
0.000010631851300013296 s |
1.24 |
cache / JaX / cpu / BothRev |
0.000022778479597764088 s |
0.000014376339499995083 s |
1.58 |
cache / JaXPipe / cpu / PreRev |
0.00001666057049878873 s |
0.000011329544000000169 s |
1.47 |
cache / JaXPipe / cpu / PostRev |
0.00002490646590013057 s |
0.000014214122499970472 s |
1.75 |
cache / JaXPipe / cpu / BothRev |
0.0000175956474035047 s |
0.00001143866600000365 s |
1.54 |
cache / JaX / gpu / Primal |
0.00007634646440274082 s |
0.00008125018259997888 s |
0.94 |
cache / JaXPipe / gpu / Primal |
0.00007679389030090533 s |
0.00007603169419999177 s |
1.01 |
cache / JaX / gpu / Forward |
0.0001023016846971 s |
0.0001001671495001 s |
1.02 |
cache / JaXPipe / gpu / Forward |
0.0001042002119997 s |
0.00010074912 s |
1.03 |
cache / JaX / gpu / BothRev |
0.0001064217045961 s |
0.0001040311058 s |
1.02 |
cache / JaXPipe / gpu / PreRev |
0.0001090583958954 s |
0.0001043637658 s |
1.04 |
cache / JaXPipe / gpu / PostRev |
0.0001054391376965 s |
0.0001085071256 s |
0.97 |
cache / JaXPipe / gpu / BothRev |
0.0001078403675986 s |
0.0001133617831999 s |
0.95 |
cache / JaX / cpu / Primal |
0.000003801018000012846 s |
0.000004257991299982677 s |
0.89 |
cache / JaXPipe / cpu / Primal |
0.000003955931999917084 s |
0.000003844736900009593 s |
1.03 |
cache / JaX / cpu / Forward |
0.000018434353900011045 s |
0.000010476668100000096 s |
1.76 |
cache / JaXPipe / cpu / Forward |
0.00001992385989997274 s |
0.000010631851300013296 s |
1.87 |
cache / JaX / cpu / BothRev |
0.00003135023690010712 s |
0.000014376339499995083 s |
2.18 |
cache / JaXPipe / cpu / PreRev |
0.000024523660900013057 s |
0.000011329544000000169 s |
2.16 |
cache / JaXPipe / cpu / PostRev |
0.00003168091789993923 s |
0.000014214122499970472 s |
2.23 |
cache / JaXPipe / cpu / BothRev |
0.0000289541909000036 s |
0.00001143866600000365 s |
2.53 |
cache / JaX / tpu / Primal |
0.0001438770563998 s |
0.0001431915185996 s |
1.00 |
cache / JaXPipe / tpu / Primal |
0.0001461339154 s |
0.0001409267946997 s |
1.04 |
cache / JaX / tpu / Forward |
0.0002260165069999 s |
0.0002235779169997 s |
1.01 |
cache / JaXPipe / tpu / Forward |
0.0002275460851 s |
0.0002276957656999 s |
1.00 |
cache / JaX / tpu / BothRev |
0.0002297797270999 s |
0.0002332378573999 s |
0.99 |
cache / JaXPipe / tpu / PreRev |
0.0002332944501 s |
0.0002316751524995 s |
1.01 |
cache / JaXPipe / tpu / PostRev |
0.0002416815371001 s |
0.0002288646747001 s |
1.06 |
cache / JaXPipe / tpu / BothRev |
0.0002236944601001 s |
0.0002285151126001 s |
0.98 |
cache / JaX / cpu / Primal |
0.0000051495003999662 s |
0.000004257991299982677 s |
1.21 |
cache / JaXPipe / cpu / Primal |
0.00000541519979997247 s |
0.000003844736900009593 s |
1.41 |
cache / JaX / cpu / Forward |
0.000009568636800031526 s |
0.000010476668100000096 s |
0.91 |
cache / JaXPipe / cpu / Forward |
0.000009701170100015588 s |
0.000010631851300013296 s |
0.91 |
cache / JaX / cpu / BothRev |
0.00001673910019999312 s |
0.000014376339499995083 s |
1.16 |
cache / JaXPipe / cpu / PreRev |
0.00001192026030003035 s |
0.000011329544000000169 s |
1.05 |
cache / JaXPipe / cpu / PostRev |
0.000016787037999984023 s |
0.000014214122499970472 s |
1.18 |
cache / JaXPipe / cpu / BothRev |
0.000012839818600059517 s |
0.00001143866600000365 s |
1.12 |
Concat / JaX / cpu / Primal |
0.000004348818599964943 s |
0.0000042229700999996565 s |
1.03 |
Concat / JaXPipe / cpu / Primal |
0.000004207863000010548 s |
0.0000041841809999823455 s |
1.01 |
Concat / JaX / cpu / Forward |
0.000007003276900013589 s |
0.000006966495999995459 s |
1.01 |
Concat / JaXPipe / cpu / Forward |
0.000006991883999990023 s |
0.000007061789099998351 s |
0.99 |
Concat / JaX / cpu / BothRev |
0.00000722551150001891 s |
0.000007513651099998242 s |
0.96 |
Concat / JaXPipe / cpu / PreRev |
0.00000718951769999876 s |
0.000007483798000021125 s |
0.96 |
Concat / JaXPipe / cpu / PostRev |
0.000007448926000006395 s |
0.000008010000500007663 s |
0.93 |
Concat / JaXPipe / cpu / BothRev |
0.000007376917399960803 s |
0.000007434162400022615 s |
0.99 |
Concat / JaX / cpu / Primal |
0.00000724749619839713 s |
0.0000042229700999996565 s |
1.72 |
Concat / JaXPipe / cpu / Primal |
0.000007552580698393285 s |
0.0000041841809999823455 s |
1.81 |
Concat / JaX / cpu / Forward |
0.000012621453899191692 s |
0.000006966495999995459 s |
1.81 |
Concat / JaXPipe / cpu / Forward |
0.000011960934702074155 s |
0.000007061789099998351 s |
1.69 |
Concat / JaX / cpu / BothRev |
0.000012347428395878523 s |
0.000007513651099998242 s |
1.64 |
Concat / JaXPipe / cpu / PreRev |
0.000012382514000637455 s |
0.000007483798000021125 s |
1.65 |
Concat / JaXPipe / cpu / PostRev |
0.000013196386100025848 s |
0.000008010000500007663 s |
1.65 |
Concat / JaXPipe / cpu / BothRev |
0.000012835210299817845 s |
0.000007434162400022615 s |
1.73 |
Concat / JaX / gpu / Primal |
0.00007785435670521111 s |
0.0000781766037000125 s |
1.00 |
Concat / JaXPipe / gpu / Primal |
0.00007845623129978776 s |
0.00008940246439997282 s |
0.88 |
Concat / JaX / gpu / Forward |
0.000105900755705 s |
0.0001067922133001 s |
0.99 |
Concat / JaXPipe / gpu / Forward |
0.0001050787222979 s |
0.0001069774111998 s |
0.98 |
Concat / JaX / gpu / BothRev |
0.00009838557160110212 s |
0.0001050804037 s |
0.94 |
Concat / JaXPipe / gpu / PreRev |
0.0000988751760974992 s |
0.0001099798328999 s |
0.90 |
Concat / JaXPipe / gpu / PostRev |
0.00009909993279725312 s |
0.00009848711129998264 s |
1.01 |
Concat / JaXPipe / gpu / BothRev |
0.00009994386929902248 s |
0.00009838990830012336 s |
1.02 |
Concat / JaX / cpu / Primal |
0.000003829360000054294 s |
0.0000042229700999996565 s |
0.91 |
Concat / JaXPipe / cpu / Primal |
0.000003949884000030579 s |
0.0000041841809999823455 s |
0.94 |
Concat / JaX / cpu / Forward |
0.000005833120999886887 s |
0.000006966495999995459 s |
0.84 |
Concat / JaXPipe / cpu / Forward |
0.0000062652849999722095 s |
0.000007061789099998351 s |
0.89 |
Concat / JaX / cpu / BothRev |
0.000006335181000031298 s |
0.000007513651099998242 s |
0.84 |
Concat / JaXPipe / cpu / PreRev |
0.0000064015889000074825 s |
0.000007483798000021125 s |
0.86 |
Concat / JaXPipe / cpu / PostRev |
0.000005823093000071822 s |
0.000008010000500007663 s |
0.73 |
Concat / JaXPipe / cpu / BothRev |
0.00000581689099999494 s |
0.000007434162400022615 s |
0.78 |
Concat / JaX / tpu / Primal |
0.0001352276734 s |
0.0001434232967003 s |
0.94 |
Concat / JaXPipe / tpu / Primal |
0.0001360965974999 s |
0.0001399360876996 s |
0.97 |
Concat / JaX / tpu / Forward |
0.0002147018262001 s |
0.0002256741008 s |
0.95 |
Concat / JaXPipe / tpu / Forward |
0.0002167828572 s |
0.0002252096558004 s |
0.96 |
Concat / JaX / tpu / BothRev |
0.0002168739751001 s |
0.0002347539602997 s |
0.92 |
Concat / JaXPipe / tpu / PreRev |
0.0002172337852 s |
0.0002347212182998 s |
0.93 |
Concat / JaXPipe / tpu / PostRev |
0.0002255690400999 s |
0.0002320430794003 s |
0.97 |
Concat / JaXPipe / tpu / BothRev |
0.0002232629920999 s |
0.0002329050583997 s |
0.96 |
Concat / JaX / cpu / Primal |
0.000005237469999974564 s |
0.0000042229700999996565 s |
1.24 |
Concat / JaXPipe / cpu / Primal |
0.00000531824810004764 s |
0.0000041841809999823455 s |
1.27 |
Concat / JaX / cpu / Forward |
0.000008725240300009317 s |
0.000006966495999995459 s |
1.25 |
Concat / JaXPipe / cpu / Forward |
0.000008253533200058882 s |
0.000007061789099998351 s |
1.17 |
Concat / JaX / cpu / BothRev |
0.000009067926899933808 s |
0.000007513651099998242 s |
1.21 |
Concat / JaXPipe / cpu / PreRev |
0.000009266092300003948 s |
0.000007483798000021125 s |
1.24 |
Concat / JaXPipe / cpu / PostRev |
0.000009131811699990069 s |
0.000008010000500007663 s |
1.14 |
Concat / JaXPipe / cpu / BothRev |
0.00000912269259997629 s |
0.000007434162400022615 s |
1.23 |
const_scatter / JaX / cpu / Primal |
0.000007149000020945096 s |
0.000007620199994562427 s |
0.94 |
const_scatter / JaXPipe / cpu / Primal |
0.000007182499984992319 s |
0.000007098800006133388 s |
1.01 |
const_scatter / JaX / cpu / Forward |
0.000008555100021112593 s |
0.000009316900013800478 s |
0.92 |
const_scatter / JaXPipe / cpu / Forward |
0.000010215300017080154 s |
0.000010165499998038285 s |
1.00 |
const_scatter / JaX / cpu / Primal |
0.000013782200403511524 s |
0.000007620199994562427 s |
1.81 |
const_scatter / JaXPipe / cpu / Primal |
0.000012659502681344748 s |
0.000007098800006133388 s |
1.78 |
const_scatter / JaX / cpu / Forward |
0.000015646801330149174 s |
0.000009316900013800478 s |
1.68 |
const_scatter / JaXPipe / cpu / Forward |
0.00001771179959177971 s |
0.000010165499998038285 s |
1.74 |
const_scatter / JaX / gpu / Primal |
0.0001304114994127 s |
0.0001249818998985 s |
1.04 |
const_scatter / JaXPipe / gpu / Primal |
0.0001184006046969 s |
0.0001129132000642 s |
1.05 |
const_scatter / JaX / gpu / Forward |
0.0001619735034182 s |
0.0001453847999073 s |
1.11 |
const_scatter / JaXPipe / gpu / Forward |
0.000146603299072 s |
0.0001423464000254 s |
1.03 |
const_scatter / JaX / cpu / Primal |
0.000006403999941539951 s |
0.000007620199994562427 s |
0.84 |
const_scatter / JaXPipe / cpu / Primal |
0.000005364000026020222 s |
0.000007098800006133388 s |
0.76 |
const_scatter / JaX / cpu / Forward |
0.000007315000038943253 s |
0.000009316900013800478 s |
0.79 |
const_scatter / JaXPipe / cpu / Forward |
0.000008067999988270459 s |
0.000010165499998038285 s |
0.79 |
const_scatter / JaX / tpu / Primal |
0.0001657139999224 s |
0.0001735550002194 s |
0.95 |
const_scatter / JaXPipe / tpu / Primal |
0.0001694029999271 s |
0.000176965999708 s |
0.96 |
const_scatter / JaX / tpu / Forward |
0.0002695240000321 s |
0.0002668229994014 s |
1.01 |
const_scatter / JaXPipe / tpu / Forward |
0.0002525710000554 s |
0.0002653520001331 s |
0.95 |
const_scatter / JaX / cpu / Primal |
0.00001148350002040388 s |
0.000007620199994562427 s |
1.51 |
const_scatter / JaXPipe / cpu / Primal |
0.000008727099975658348 s |
0.000007098800006133388 s |
1.23 |
const_scatter / JaX / cpu / Forward |
0.000010738999935711036 s |
0.000009316900013800478 s |
1.15 |
const_scatter / JaXPipe / cpu / Forward |
0.000012689499999396504 s |
0.000010165499998038285 s |
1.25 |
GenDot / JaX / cpu / Primal |
0.000004143327600013436 s |
0.000004305226599990419 s |
0.96 |
GenDot / JaXPipe / cpu / Primal |
0.000004241194499991252 s |
0.00000431635519998963 s |
0.98 |
GenDot / JaX / cpu / Forward |
0.000006235251599991898 s |
0.000006539181100015412 s |
0.95 |
GenDot / JaXPipe / cpu / Forward |
0.000007392489799985924 s |
0.000007373701500000607 s |
1.00 |
GenDot / JaX / cpu / BothRev |
0.000006495842999993329 s |
0.000006726316999993287 s |
0.97 |
GenDot / JaXPipe / cpu / PreRev |
0.000007183236399987436 s |
0.000007370045800007574 s |
0.97 |
GenDot / JaXPipe / cpu / PostRev |
0.000006465256399997088 s |
0.000006784010499995929 s |
0.95 |
GenDot / JaXPipe / cpu / BothRev |
0.000007169618899979468 s |
0.000007361278600001242 s |
0.97 |
GenDot / JaX / cpu / Primal |
0.000008005369198508561 s |
0.000004305226599990419 s |
1.86 |
GenDot / JaXPipe / cpu / Primal |
0.00000838477819925174 s |
0.00000431635519998963 s |
1.94 |
GenDot / JaX / cpu / Forward |
0.000011561298603191972 s |
0.000006539181100015412 s |
1.77 |
GenDot / JaXPipe / cpu / Forward |
0.000013929392403224484 s |
0.000007373701500000607 s |
1.89 |
GenDot / JaX / cpu / BothRev |
0.000011507908900966868 s |
0.000006726316999993287 s |
1.71 |
GenDot / JaXPipe / cpu / PreRev |
0.000014126023801509291 s |
0.000007370045800007574 s |
1.92 |
GenDot / JaXPipe / cpu / PostRev |
0.000012031119503080844 s |
0.000006784010499995929 s |
1.77 |
GenDot / JaXPipe / cpu / BothRev |
0.000014146934100426731 s |
0.000007361278600001242 s |
1.92 |
GenDot / JaX / gpu / Primal |
0.00007574738429975695 s |
0.00007917201839991322 s |
0.96 |
GenDot / JaXPipe / gpu / Primal |
0.00007511231700191274 s |
0.00007852207989999442 s |
0.96 |
GenDot / JaX / gpu / Forward |
0.0001025417061988 s |
0.0001161023423001 s |
0.88 |
GenDot / JaXPipe / gpu / Forward |
0.0001065684496017 s |
0.0001094103704001 s |
0.97 |
GenDot / JaX / gpu / BothRev |
0.0001039184699009 s |
0.0001026670191 s |
1.01 |
GenDot / JaXPipe / gpu / PreRev |
0.0001033065549039 s |
0.000106410687 s |
0.97 |
GenDot / JaXPipe / gpu / PostRev |
0.000105141177104 s |
0.0001020137320998 s |
1.03 |
GenDot / JaXPipe / gpu / BothRev |
0.0001083939514996 s |
0.0001150465489999 s |
0.94 |
GenDot / JaX / cpu / Primal |
0.0000037965499999700113 s |
0.000004305226599990419 s |
0.88 |
GenDot / JaXPipe / cpu / Primal |
0.000004044928999974217 s |
0.00000431635519998963 s |
0.94 |
GenDot / JaX / cpu / Forward |
0.000005284428999948432 s |
0.000006539181100015412 s |
0.81 |
GenDot / JaXPipe / cpu / Forward |
0.00000643703989990172 s |
0.000007373701500000607 s |
0.87 |
GenDot / JaX / cpu / BothRev |
0.000005875175999972271 s |
0.000006726316999993287 s |
0.87 |
GenDot / JaXPipe / cpu / PreRev |
0.000006470939000064391 s |
0.000007370045800007574 s |
0.88 |
GenDot / JaXPipe / cpu / PostRev |
0.000005910908999976527 s |
0.000006784010499995929 s |
0.87 |
GenDot / JaXPipe / cpu / BothRev |
0.00000641697800001566 s |
0.000007361278600001242 s |
0.87 |
GenDot / JaX / tpu / Primal |
0.0001383264714 s |
0.0001405656597999 s |
0.98 |
GenDot / JaXPipe / tpu / Primal |
0.0001393808234999 s |
0.0001406256967995 s |
0.99 |
GenDot / JaX / tpu / Forward |
0.0002216271491 s |
0.0002363347731996 s |
0.94 |
GenDot / JaXPipe / tpu / Forward |
0.0002202895512 s |
0.0002397689259996 s |
0.92 |
GenDot / JaX / tpu / BothRev |
0.0002212696111 s |
0.0002314764793998 s |
0.96 |
GenDot / JaXPipe / tpu / PreRev |
0.0002210281560999 s |
0.0002301387635001 s |
0.96 |
GenDot / JaXPipe / tpu / PostRev |
0.0002212252290999 s |
0.0002327623953999 s |
0.95 |
GenDot / JaXPipe / tpu / BothRev |
0.0002223233840999 s |
0.0002207230910993 s |
1.01 |
GenDot / JaX / cpu / Primal |
0.000005200530299953244 s |
0.000004305226599990419 s |
1.21 |
GenDot / JaXPipe / cpu / Primal |
0.000005694404599944391 s |
0.00000431635519998963 s |
1.32 |
GenDot / JaX / cpu / Forward |
0.000008354419799979951 s |
0.000006539181100015412 s |
1.28 |
GenDot / JaXPipe / cpu / Forward |
0.000009641825499966216 s |
0.000007373701500000607 s |
1.31 |
GenDot / JaX / cpu / BothRev |
0.000008562783300021693 s |
0.000006726316999993287 s |
1.27 |
GenDot / JaXPipe / cpu / PreRev |
0.000009638679999989107 s |
0.000007370045800007574 s |
1.31 |
GenDot / JaXPipe / cpu / PostRev |
0.000008411166200039589 s |
0.000006784010499995929 s |
1.24 |
GenDot / JaXPipe / cpu / BothRev |
0.000009408169699963765 s |
0.000007361278600001242 s |
1.28 |
hlo_ffi / JaX / cpu / Primal |
0.000005993340299983175 s |
0.000006007991599972229 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000005802912099989044 s |
0.000005891712699985874 s |
0.98 |
hlo_ffi / JaX / cpu / Forward |
0.000009384499599991614 s |
0.000009990135099997132 s |
0.94 |
hlo_ffi / JaXPipe / cpu / Forward |
0.00000939740730000267 s |
0.000010069322099980127 s |
0.93 |
hlo_ffi / JaX / cpu / BothRev |
0.000009198034400014876 s |
0.000009401051999975606 s |
0.98 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000009163638200016069 s |
0.000009514673700005006 s |
0.96 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000009380678899970007 s |
0.000010275410900021598 s |
0.91 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000009148576500001582 s |
0.000009341794300007677 s |
0.98 |
hlo_ffi / JaX / cpu / Primal |
0.000010567158000776544 s |
0.000006007991599972229 s |
1.76 |
hlo_ffi / JaXPipe / cpu / Primal |
0.00001020744260167703 s |
0.000005891712699985874 s |
1.73 |
hlo_ffi / JaX / cpu / Forward |
0.000015678820299217476 s |
0.000009990135099997132 s |
1.57 |
hlo_ffi / JaXPipe / cpu / Forward |
0.00001569188599823974 s |
0.000010069322099980127 s |
1.56 |
hlo_ffi / JaX / cpu / BothRev |
0.00001512076790095307 s |
0.000009401051999975606 s |
1.61 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000016439395799534396 s |
0.000009514673700005006 s |
1.73 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000015701155201531945 s |
0.000010275410900021598 s |
1.53 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00001570278209983371 s |
0.000009341794300007677 s |
1.68 |
hlo_ffi / JaX / cpu / Primal |
0.000005278587999964657 s |
0.000006007991599972229 s |
0.88 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000005103871999926924 s |
0.000005891712699985874 s |
0.87 |
hlo_ffi / JaX / cpu / Forward |
0.000007647225999971851 s |
0.000009990135099997132 s |
0.77 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000007994939899981546 s |
0.000010069322099980127 s |
0.79 |
hlo_ffi / JaX / cpu / BothRev |
0.000007433629000115615 s |
0.000009401051999975606 s |
0.79 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000007752975899893499 s |
0.000009514673700005006 s |
0.81 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000007674778899854574 s |
0.000010275410900021598 s |
0.75 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00000770532300011837 s |
0.000009341794300007677 s |
0.82 |
hlo_ffi / JaX / tpu / Primal |
0.00006591903180014924 s |
0.00004765109919972019 s |
1.38 |
hlo_ffi / JaXPipe / tpu / Primal |
0.00006532774380011688 s |
0.00004664374529966153 s |
1.40 |
hlo_ffi / JaX / tpu / Forward |
0.00008875270359985734 s |
0.0000632015112998488 s |
1.40 |
hlo_ffi / JaXPipe / tpu / Forward |
0.00008996883670006355 s |
0.00006532635120020132 s |
1.38 |
hlo_ffi / JaX / tpu / BothRev |
0.0000906247067001459 s |
0.00006411575919992174 s |
1.41 |
hlo_ffi / JaXPipe / tpu / PreRev |
0.00008940254959998128 s |
0.00007189345279984991 s |
1.24 |
hlo_ffi / JaXPipe / tpu / PostRev |
0.00008959583870000643 s |
0.00006192938829990453 s |
1.45 |
hlo_ffi / JaXPipe / tpu / BothRev |
0.00008962087069994595 s |
0.00006723617699972238 s |
1.33 |
hlo_ffi / JaX / cpu / Primal |
0.00000735779550004736 s |
0.000006007991599972229 s |
1.22 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000007255892000011954 s |
0.000005891712699985874 s |
1.23 |
hlo_ffi / JaX / cpu / Forward |
0.00001141210649993809 s |
0.000009990135099997132 s |
1.14 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000011703212399970651 s |
0.000010069322099980127 s |
1.16 |
hlo_ffi / JaX / cpu / BothRev |
0.000010955290499987314 s |
0.000009401051999975606 s |
1.17 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.00001138435020002362 s |
0.000009514673700005006 s |
1.20 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.00001095931740001106 s |
0.000010275410900021598 s |
1.07 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00001132224299999507 s |
0.000009341794300007677 s |
1.21 |
llama / JaXPipe / cpu / Primal |
0.000824935139999 s |
0.0009036038300018 s |
0.91 |
llama / JaX / cpu / Primal |
0.0008415341099998 s |
0.0008815707099984 s |
0.95 |
llama / HLOOpt / cpu / Primal |
0.0009008619299993 s |
0.0009545046899984 s |
0.94 |
llama / PartOpt / cpu / Primal |
0.0008179616699999 s |
0.0008596076999992 s |
0.95 |
llama / DefOpt / cpu / Primal |
0.0009068425399982 s |
0.0009428964700009 s |
0.96 |
llama / IPartOpt / cpu / Primal |
0.0008062835700002 s |
0.000862209490001 s |
0.94 |
llama / IDefOpt / cpu / Primal |
0.0008949994100021 s |
0.0009406553999997 s |
0.95 |
llama / JaXPipe / cpu / Forward |
0.0023009041099976 s |
0.0026777235899999 s |
0.86 |
llama / JaX / cpu / Forward |
0.0023170847900019 s |
0.0026008398599969 s |
0.89 |
llama / HLOOpt / cpu / Forward |
0.002304403600001 s |
0.0026227514299989 s |
0.88 |
llama / PartOpt / cpu / Forward |
0.0022801333200004 s |
0.0026131044399971 s |
0.87 |
llama / DefOpt / cpu / Forward |
0.0023247164999975 s |
0.002628777750001 s |
0.88 |
llama / IPartOpt / cpu / Forward |
0.0022912288800034 s |
0.0026456083599987 s |
0.87 |
llama / IDefOpt / cpu / Forward |
0.0023033075799958 s |
0.0026386267999987 s |
0.87 |
llama / JaXPipe / cpu / PreRev |
0.0021493058099986 s |
0.0023864121999986 s |
0.90 |
llama / JaXPipe / cpu / PostRev |
0.0020852593500012 s |
0.0023078690700003 s |
0.90 |
llama / JaXPipe / cpu / BothRev |
0.0021346292799989 s |
0.0024488058399992 s |
0.87 |
llama / JaX / cpu / BothRev |
0.0020689514900004 s |
0.0025524685099981 s |
0.81 |
llama / HLOOpt / cpu / PreRev |
0.002214866119998 s |
0.0023098340199976 s |
0.96 |
llama / HLOOpt / cpu / PostRev |
0.0022572395800034 s |
0.0026894748899985 s |
0.84 |
llama / HLOOpt / cpu / BothRev |
0.002208681940001 s |
0.0023117150400003 s |
0.96 |
llama / PartOpt / cpu / PreRev |
0.0021695562500008 s |
0.0023249103199987 s |
0.93 |
llama / PartOpt / cpu / PostRev |
0.0020799080699998 s |
0.0023054931600017 s |
0.90 |
llama / PartOpt / cpu / BothRev |
0.0025088324800026 s |
0.0027967864899983 s |
0.90 |
llama / DefOpt / cpu / PreRev |
0.0021707829299975 s |
0.0023714528499976 s |
0.92 |
llama / DefOpt / cpu / PostRev |
0.0021054122899977 s |
0.0026971864500001 s |
0.78 |
llama / DefOpt / cpu / BothRev |
0.0022069275400008 s |
0.002330606559999 s |
0.95 |
llama / IPartOpt / cpu / PreRev |
0.0021775689800006 s |
0.0028142271800015 s |
0.77 |
llama / IPartOpt / cpu / PostRev |
0.0020928142799994 s |
0.0023104740999997 s |
0.91 |
llama / IPartOpt / cpu / BothRev |
0.0022054364799987 s |
0.0028174292300036 s |
0.78 |
llama / IDefOpt / cpu / PreRev |
0.0021670563900033 s |
0.0023330732900012 s |
0.93 |
llama / IDefOpt / cpu / PostRev |
0.0021323167599985 s |
0.0028763271700017 s |
0.74 |
llama / IDefOpt / cpu / BothRev |
0.002209497990002 s |
0.0023699240599989 s |
0.93 |
llama / JaXPipe / gpu / Primal |
0.0004307166440412 s |
0.0004334704140019 s |
0.99 |
llama / JaX / gpu / Primal |
0.0004426389379659 s |
0.0004326484760022 s |
1.02 |
llama / HLOOpt / gpu / Primal |
0.000447169151972 s |
0.0004502818159999 s |
0.99 |
llama / PartOpt / gpu / Primal |
0.0004435960680712 s |
0.000431408815999 s |
1.03 |
llama / DefOpt / gpu / Primal |
0.0004332904539769 s |
0.0004453713379989 s |
0.97 |
llama / IPartOpt / gpu / Primal |
0.0004494451259961 s |
0.0004493406619985 s |
1.00 |
llama / IDefOpt / gpu / Primal |
0.0004407739960588 s |
0.0004547928200008 s |
0.97 |
llama / JaXPipe / gpu / Forward |
0.0007361826560227 s |
0.0007431619740018 s |
0.99 |
llama / JaX / gpu / Forward |
0.0006975018039811 s |
0.0006885440839978 s |
1.01 |
llama / HLOOpt / gpu / Forward |
0.0007515277700731 s |
0.0007418201320033 s |
1.01 |
llama / PartOpt / gpu / Forward |
0.0007431363799842 s |
0.0007191103040022 s |
1.03 |
llama / DefOpt / gpu / Forward |
0.0007191080839838 s |
0.0007279277580018 s |
0.99 |
llama / IPartOpt / gpu / Forward |
0.0007228287820471 s |
0.0007357419239997 s |
0.98 |
llama / IDefOpt / gpu / Forward |
0.0007204930920852 s |
0.0007442585519966 s |
0.97 |
llama / JaXPipe / gpu / PreRev |
0.0008279653480276 s |
0.0008011282060033 s |
1.03 |
llama / JaXPipe / gpu / PostRev |
0.0008084509460022 s |
0.0007745135199984 s |
1.04 |
llama / JaXPipe / gpu / BothRev |
0.000788576315972 s |
0.0008277438679979 s |
0.95 |
llama / JaX / gpu / BothRev |
0.0007654964359244 s |
0.0007886702680007 s |
0.97 |
llama / HLOOpt / gpu / PreRev |
0.0008041749439435 s |
0.0007771036399972 s |
1.03 |
llama / HLOOpt / gpu / PostRev |
0.000784096259973 s |
0.0008106862879976 s |
0.97 |
llama / HLOOpt / gpu / BothRev |
0.0008189095280831 s |
0.0007915957759978 s |
1.03 |
llama / PartOpt / gpu / PreRev |
0.0007699611899442 s |
0.0008056679940018 s |
0.96 |
llama / PartOpt / gpu / PostRev |
0.0007726757860509 s |
0.0008064865260021 s |
0.96 |
llama / PartOpt / gpu / BothRev |
0.000829560742015 s |
0.0008089085460014 s |
1.03 |
llama / DefOpt / gpu / PreRev |
0.0007892480340087 s |
0.0007689638479969 s |
1.03 |
llama / DefOpt / gpu / PostRev |
0.0007608652879716 s |
0.0007930275760008 s |
0.96 |
llama / DefOpt / gpu / BothRev |
0.0007888096760725 s |
0.000799012544001 s |
0.99 |
llama / IPartOpt / gpu / PreRev |
0.0007984262319514 s |
0.0007939320200021 s |
1.01 |
llama / IPartOpt / gpu / PostRev |
0.0007883514799177 s |
0.0007959353119986 s |
0.99 |
llama / IPartOpt / gpu / BothRev |
0.0008181157519575 s |
0.000793690112001 s |
1.03 |
llama / IDefOpt / gpu / PreRev |
0.000767420514021 s |
0.0007910991800017 s |
0.97 |
llama / IDefOpt / gpu / PostRev |
0.000832364451955 s |
0.0008137880879985 s |
1.02 |
llama / IDefOpt / gpu / BothRev |
0.0008191467680735 s |
0.0008149435499981 s |
1.01 |
llama / JaXPipe / tpu / Primal |
0.0003687210180032 s |
0.0003827782979933 s |
0.96 |
llama / JaX / tpu / Primal |
0.0003792302379988 s |
0.0003815366980124 s |
0.99 |
llama / HLOOpt / tpu / Primal |
0.0003656196380034 s |
0.000379335798003 s |
0.96 |
llama / PartOpt / tpu / Primal |
0.0003894623380001 s |
0.0003638919179938 s |
1.07 |
llama / DefOpt / tpu / Primal |
0.0003485623179985 s |
0.0003311642000044 s |
1.05 |
llama / IPartOpt / tpu / Primal |
0.0003828373780015 s |
0.0003546996400109 s |
1.08 |
llama / IDefOpt / tpu / Primal |
0.0003682044779998 s |
0.000355125838003 s |
1.04 |
llama / JaXPipe / tpu / Forward |
0.0005467106379983 s |
0.0005789292460103 s |
0.94 |
llama / JaX / tpu / Forward |
0.0006735050960014 s |
0.0007043117599969 s |
0.96 |
llama / HLOOpt / tpu / Forward |
0.0005443833579993 s |
0.0005613966660021 s |
0.97 |
llama / PartOpt / tpu / Forward |
0.0005332202179997 s |
0.0005685037660005 s |
0.94 |
llama / DefOpt / tpu / Forward |
0.0005366710380003 s |
0.0005808443260029 s |
0.92 |
llama / IPartOpt / tpu / Forward |
0.0005290443379999 s |
0.0005734330859995 s |
0.92 |
llama / IDefOpt / tpu / Forward |
0.000536582138 s |
0.0005726253879984 s |
0.94 |
llama / JaXPipe / tpu / PreRev |
0.0004122061979978 s |
0.0004065967579954 s |
1.01 |
llama / JaXPipe / tpu / PostRev |
0.0003573919580012 s |
0.0003579140399961 s |
1.00 |
llama / JaXPipe / tpu / BothRev |
0.0003917374579978 s |
0.0003915522780007 s |
1.00 |
llama / JaX / tpu / BothRev |
0.000357581117998 s |
0.0003576681200065 s |
1.00 |
llama / HLOOpt / tpu / PreRev |
0.0003917292179976 s |
0.0003918229980044 s |
1.00 |
llama / HLOOpt / tpu / PostRev |
0.0003907205200011 s |
0.0003908781179925 s |
1.00 |
llama / HLOOpt / tpu / BothRev |
0.0003916079200025 s |
0.0003915520779992 s |
1.00 |
llama / PartOpt / tpu / PreRev |
0.0003914220999977 s |
0.0003915564759954 s |
1.00 |
llama / PartOpt / tpu / PostRev |
0.0003723710800004 s |
0.0003723276780074 s |
1.00 |
llama / PartOpt / tpu / BothRev |
0.0003917286399992 s |
0.0003915616579906 s |
1.00 |
llama / DefOpt / tpu / PreRev |
0.0003915216579989 s |
0.0003916253180068 s |
1.00 |
llama / DefOpt / tpu / PostRev |
0.0003833071179979 s |
0.0003833049380045 s |
1.00 |
llama / DefOpt / tpu / BothRev |
0.0003921114980003 s |
0.0003914555379888 s |
1.00 |
llama / IPartOpt / tpu / PreRev |
0.0003915584979986 s |
0.0003914821779908 s |
1.00 |
llama / IPartOpt / tpu / PostRev |
0.0003723574380019 s |
0.0003730398379993 s |
1.00 |
llama / IPartOpt / tpu / BothRev |
0.0003916727779978 s |
0.0003914907760045 s |
1.00 |
llama / IDefOpt / tpu / PreRev |
0.0003915998580014 s |
0.000391751195988 s |
1.00 |
llama / IDefOpt / tpu / PostRev |
0.0003994921780031 s |
0.0003996628360036 s |
1.00 |
llama / IDefOpt / tpu / BothRev |
0.0003913268979995 s |
0.0003914849780121 s |
1.00 |
llama / JaXPipe / cpu / Primal |
0.0029949684999974 s |
0.0009036038300018 s |
3.31 |
llama / JaX / cpu / Primal |
0.0029538945399963 s |
0.0008815707099984 s |
3.35 |
llama / HLOOpt / cpu / Primal |
0.0021316563799973 s |
0.0009545046899984 s |
2.23 |
llama / PartOpt / cpu / Primal |
0.0035302787399996 s |
0.0008596076999992 s |
4.11 |
llama / DefOpt / cpu / Primal |
0.0025365524400058 s |
0.0009428964700009 s |
2.69 |
llama / IPartOpt / cpu / Primal |
0.0021493333299986 s |
0.000862209490001 s |
2.49 |
llama / IDefOpt / cpu / Primal |
0.0036753271700035 s |
0.0009406553999997 s |
3.91 |
llama / JaXPipe / cpu / Forward |
0.0059528150099959 s |
0.0026777235899999 s |
2.22 |
llama / JaX / cpu / Forward |
0.0060628400800032 s |
0.0026008398599969 s |
2.33 |
llama / HLOOpt / cpu / Forward |
0.0078330205200018 s |
0.0026227514299989 s |
2.99 |
llama / PartOpt / cpu / Forward |
0.0068258038899966 s |
0.0026131044399971 s |
2.61 |
llama / DefOpt / cpu / Forward |
0.0062134469700049 s |
0.002628777750001 s |
2.36 |
llama / IPartOpt / cpu / Forward |
0.0078365000600024 s |
0.0026456083599987 s |
2.96 |
llama / IDefOpt / cpu / Forward |
0.0071629679100078 s |
0.0026386267999987 s |
2.71 |
llama / JaXPipe / cpu / PreRev |
0.0055491245700068 s |
0.0023864121999986 s |
2.33 |
llama / JaXPipe / cpu / PostRev |
0.0055784771000071 s |
0.0023078690700003 s |
2.42 |
llama / JaXPipe / cpu / BothRev |
0.0060461192599996 s |
0.0024488058399992 s |
2.47 |
llama / JaX / cpu / BothRev |
0.0060065460999976 s |
0.0025524685099981 s |
2.35 |
llama / HLOOpt / cpu / PreRev |
0.0070191765900017 s |
0.0023098340199976 s |
3.04 |
llama / HLOOpt / cpu / PostRev |
0.0055341093299921 s |
0.0026894748899985 s |
2.06 |
llama / HLOOpt / cpu / BothRev |
0.0056468205500004 s |
0.0023117150400003 s |
2.44 |
llama / PartOpt / cpu / PreRev |
0.0062379898600011 s |
0.0023249103199987 s |
2.68 |
llama / PartOpt / cpu / PostRev |
0.0066571221000049 s |
0.0023054931600017 s |
2.89 |
llama / PartOpt / cpu / BothRev |
0.0058375406300001 s |
0.0027967864899983 s |
2.09 |
llama / DefOpt / cpu / PreRev |
0.0065214269999978 s |
0.0023714528499976 s |
2.75 |
llama / DefOpt / cpu / PostRev |
0.0056835927700012 s |
0.0026971864500001 s |
2.11 |
llama / DefOpt / cpu / BothRev |
0.0061230776100001 s |
0.002330606559999 s |
2.63 |
llama / IPartOpt / cpu / PreRev |
0.0059080841899958 s |
0.0028142271800015 s |
2.10 |
llama / IPartOpt / cpu / PostRev |
0.0054277143599938 s |
0.0023104740999997 s |
2.35 |
llama / IPartOpt / cpu / BothRev |
0.0063610445199992 s |
0.0028174292300036 s |
2.26 |
llama / IDefOpt / cpu / PreRev |
0.0060633469499953 s |
0.0023330732900012 s |
2.60 |
llama / IDefOpt / cpu / PostRev |
0.0057156380099968 s |
0.0028763271700017 s |
1.99 |
llama / IDefOpt / cpu / BothRev |
0.005609379790003 s |
0.0023699240599989 s |
2.37 |
scatter_sum / JaX / cpu / Primal |
0.000004800594100015588 s |
0.000004781286200022805 s |
1.00 |
scatter_sum / JaXPipe / cpu / Primal |
0.000004697060499984218 s |
0.000004852364199996373 s |
0.97 |
scatter_sum / JaX / cpu / Primal |
0.00000923346860217862 s |
0.000004781286200022805 s |
1.93 |
scatter_sum / JaXPipe / cpu / Primal |
0.000008999330102233217 s |
0.000004852364199996373 s |
1.85 |
scatter_sum / JaX / gpu / Primal |
0.00009032420809962789 s |
0.00007720302430006996 s |
1.17 |
scatter_sum / JaXPipe / gpu / Primal |
0.0000988431557023432 s |
0.00007867281300004834 s |
1.26 |
scatter_sum / JaX / cpu / Primal |
0.000004057899899999029 s |
0.000004781286200022805 s |
0.85 |
scatter_sum / JaXPipe / cpu / Primal |
0.000004421349000040209 s |
0.000004852364199996373 s |
0.91 |
scatter_sum / JaX / tpu / Primal |
0.0001385789583999 s |
0.0001361191269999 s |
1.02 |
scatter_sum / JaXPipe / tpu / Primal |
0.0001379335693998 s |
0.0001397103007999 s |
0.99 |
scatter_sum / JaX / cpu / Primal |
0.000006248686400067527 s |
0.000004781286200022805 s |
1.31 |
scatter_sum / JaXPipe / cpu / Primal |
0.000006229060999976354 s |
0.000004852364199996373 s |
1.28 |
slicing / JaX / cpu / Primal |
0.000003775441300012972 s |
0.0000038275482999779344 s |
0.99 |
slicing / JaXPipe / cpu / Primal |
0.000003755979900006423 s |
0.000003795791399988957 s |
0.99 |
slicing / JaX / cpu / Forward |
0.000005902646599997752 s |
0.000005939210299993647 s |
0.99 |
slicing / JaXPipe / cpu / Forward |
0.000005919343599998683 s |
0.00000603474350000397 s |
0.98 |
slicing / JaX / cpu / BothRev |
0.000006310669000004055 s |
0.0000064569756999844685 s |
0.98 |
slicing / JaXPipe / cpu / PreRev |
0.000006321630999991612 s |
0.000006422275499971874 s |
0.98 |
slicing / JaXPipe / cpu / PostRev |
0.0000063186929000039525 s |
0.000006377462199998263 s |
0.99 |
slicing / JaXPipe / cpu / BothRev |
0.000006182768000007855 s |
0.000006449023799996212 s |
0.96 |
slicing / JaX / cpu / Primal |
0.000006940125400433317 s |
0.0000038275482999779344 s |
1.81 |
slicing / JaXPipe / cpu / Primal |
0.000006997507100459188 s |
0.000003795791399988957 s |
1.84 |
slicing / JaX / cpu / Forward |
0.000010825557302450762 s |
0.000005939210299993647 s |
1.82 |
slicing / JaXPipe / cpu / Forward |
0.000010870730294846 s |
0.00000603474350000397 s |
1.80 |
slicing / JaX / cpu / BothRev |
0.00001102571309893392 s |
0.0000064569756999844685 s |
1.71 |
slicing / JaXPipe / cpu / PreRev |
0.000010960100800730288 s |
0.000006422275499971874 s |
1.71 |
slicing / JaXPipe / cpu / PostRev |
0.000010984865296632053 s |
0.000006377462199998263 s |
1.72 |
slicing / JaXPipe / cpu / BothRev |
0.000011715405900031328 s |
0.000006449023799996212 s |
1.82 |
slicing / JaX / gpu / Primal |
0.00009132031049812211 s |
0.00007418459909986268 s |
1.23 |
slicing / JaXPipe / gpu / Primal |
0.00008319183890125715 s |
0.00007490196130002005 s |
1.11 |
slicing / JaX / gpu / Forward |
0.0001114069067989 s |
0.0001017370442999 s |
1.10 |
slicing / JaXPipe / gpu / Forward |
0.000111134487699 s |
0.000100429285 s |
1.11 |
slicing / JaX / gpu / BothRev |
0.0001117891712987 s |
0.0001030170156998 s |
1.09 |
slicing / JaXPipe / gpu / PreRev |
0.0001111445893999 s |
0.0001024978495999 s |
1.08 |
slicing / JaXPipe / gpu / PostRev |
0.0001088574483001 s |
0.0001014396143 s |
1.07 |
slicing / JaXPipe / gpu / BothRev |
0.0001179066875018 s |
0.0001008703892999 s |
1.17 |
slicing / JaX / cpu / Primal |
0.0000037492480001674263 s |
0.0000038275482999779344 s |
0.98 |
slicing / JaXPipe / cpu / Primal |
0.0000037575709999146056 s |
0.000003795791399988957 s |
0.99 |
slicing / JaX / cpu / Forward |
0.000005372719899969525 s |
0.000005939210299993647 s |
0.90 |
slicing / JaXPipe / cpu / Forward |
0.000005361817999983032 s |
0.00000603474350000397 s |
0.89 |
slicing / JaX / cpu / BothRev |
0.000005763366000064707 s |
0.0000064569756999844685 s |
0.89 |
slicing / JaXPipe / cpu / PreRev |
0.000005365840000013122 s |
0.000006422275499971874 s |
0.84 |
slicing / JaXPipe / cpu / PostRev |
0.000005719399000008707 s |
0.000006377462199998263 s |
0.90 |
slicing / JaXPipe / cpu / BothRev |
0.000005757930999970995 s |
0.000006449023799996212 s |
0.89 |
slicing / JaX / tpu / Primal |
0.0001345490785 s |
0.0001431329665996 s |
0.94 |
slicing / JaXPipe / tpu / Primal |
0.0001357882114998 s |
0.0001405793777004 s |
0.97 |
slicing / JaX / tpu / Forward |
0.0002206666371001 s |
0.0002153926234001 s |
1.02 |
slicing / JaXPipe / tpu / Forward |
0.0002239213621 s |
0.0002333329244 s |
0.96 |
slicing / JaX / tpu / BothRev |
0.000232637939 s |
0.000238331913 s |
0.98 |
slicing / JaXPipe / tpu / PreRev |
0.000235042264 s |
0.0002370358310996 s |
0.99 |
slicing / JaXPipe / tpu / PostRev |
0.0002318694040999 s |
0.0002231315370001 s |
1.04 |
slicing / JaXPipe / tpu / BothRev |
0.0002329168391001 s |
0.0002277348777002 s |
1.02 |
slicing / JaX / cpu / Primal |
0.000004781864100004896 s |
0.0000038275482999779344 s |
1.25 |
slicing / JaXPipe / cpu / Primal |
0.000004855344999941736 s |
0.000003795791399988957 s |
1.28 |
slicing / JaX / cpu / Forward |
0.000007674727399989934 s |
0.000005939210299993647 s |
1.29 |
slicing / JaXPipe / cpu / Forward |
0.000007309626400001434 s |
0.00000603474350000397 s |
1.21 |
slicing / JaX / cpu / BothRev |
0.000008127842199974111 s |
0.0000064569756999844685 s |
1.26 |
slicing / JaXPipe / cpu / PreRev |
0.00000774250000004031 s |
0.000006422275499971874 s |
1.21 |
slicing / JaXPipe / cpu / PostRev |
0.000008209079199968983 s |
0.000006377462199998263 s |
1.29 |
slicing / JaXPipe / cpu / BothRev |
0.000007900181199966028 s |
0.000006449023799996212 s |
1.23 |
sum / JaX / cpu / Primal |
0.000004977693899991209 s |
0.000005023583999991387 s |
0.99 |
sum / JaXPipe / cpu / Primal |
0.0000049789522999617474 s |
0.000005372708499999135 s |
0.93 |
sum / JaX / cpu / Forward |
0.000008268766499986669 s |
0.00000857204870003443 s |
0.96 |
sum / JaXPipe / cpu / Forward |
0.000008294460600018283 s |
0.00000850996489998579 s |
0.97 |
sum / JaX / cpu / BothRev |
0.000007496940700002597 s |
0.000007576139799994053 s |
0.99 |
sum / JaXPipe / cpu / PreRev |
0.000007221554999978252 s |
0.000007303304700008084 s |
0.99 |
sum / JaXPipe / cpu / PostRev |
0.000007202196300022479 s |
0.000007272418800039304 s |
0.99 |
sum / JaXPipe / cpu / BothRev |
0.000007263560499995947 s |
0.000007323511799995686 s |
0.99 |
sum / JaX / cpu / Primal |
0.00001070254179649055 s |
0.000005023583999991387 s |
2.13 |
sum / JaXPipe / cpu / Primal |
0.00001007704889634624 s |
0.000005372708499999135 s |
1.88 |
sum / JaX / cpu / Forward |
0.000015844940301030875 s |
0.00000857204870003443 s |
1.85 |
sum / JaXPipe / cpu / Forward |
0.000015206830302486196 s |
0.00000850996489998579 s |
1.79 |
sum / JaX / cpu / BothRev |
0.0000145496271958109 s |
0.000007576139799994053 s |
1.92 |
sum / JaXPipe / cpu / PreRev |
0.00001402229039813392 s |
0.000007303304700008084 s |
1.92 |
sum / JaXPipe / cpu / PostRev |
0.000014591200498398393 s |
0.000007272418800039304 s |
2.01 |
sum / JaXPipe / cpu / BothRev |
0.000014362987101776523 s |
0.000007323511799995686 s |
1.96 |
sum / JaX / gpu / Primal |
0.00007268763929605485 s |
0.00007136468480002805 s |
1.02 |
sum / JaXPipe / gpu / Primal |
0.00007760397300007753 s |
0.00008296686279991263 s |
0.94 |
sum / JaX / gpu / Forward |
0.0001253076502995 s |
0.00009818792730002316 s |
1.28 |
sum / JaXPipe / gpu / Forward |
0.0001212270777032 s |
0.00009816430880000552 s |
1.23 |
sum / JaX / gpu / BothRev |
0.0001098445831972 s |
0.0001021368894 s |
1.08 |
sum / JaXPipe / gpu / PreRev |
0.0001078954984026 s |
0.0001037522160999 s |
1.04 |
sum / JaXPipe / gpu / PostRev |
0.0001043953743996 s |
0.0001033100450998 s |
1.01 |
sum / JaXPipe / gpu / BothRev |
0.0001029508496052 s |
0.0001022641701998 s |
1.01 |
sum / JaX / cpu / Primal |
0.000004457432900017011 s |
0.000005023583999991387 s |
0.89 |
sum / JaXPipe / cpu / Primal |
0.000004534379000142508 s |
0.000005372708499999135 s |
0.84 |
sum / JaX / cpu / Forward |
0.000007049877000099513 s |
0.00000857204870003443 s |
0.82 |
sum / JaXPipe / cpu / Forward |
0.000006631600999935472 s |
0.00000850996489998579 s |
0.78 |
sum / JaX / cpu / BothRev |
0.000006445161000010557 s |
0.000007576139799994053 s |
0.85 |
sum / JaXPipe / cpu / PreRev |
0.000006168611999964924 s |
0.000007303304700008084 s |
0.84 |
sum / JaXPipe / cpu / PostRev |
0.000006144911000046704 s |
0.000007272418800039304 s |
0.84 |
sum / JaXPipe / cpu / BothRev |
0.000006426393999936408 s |
0.000007323511799995686 s |
0.88 |
sum / JaX / tpu / Primal |
0.0001514888933999 s |
0.0001446025065997 s |
1.05 |
sum / JaXPipe / tpu / Primal |
0.0001501713162999 s |
0.0001434415376003 s |
1.05 |
sum / JaX / tpu / Forward |
0.0002286451790998 s |
0.0002228683070003 s |
1.03 |
sum / JaXPipe / tpu / Forward |
0.0002270517271001 s |
0.0002239085328998 s |
1.01 |
sum / JaX / tpu / BothRev |
0.0002050247700999 s |
0.0002242962139003 s |
0.91 |
sum / JaXPipe / tpu / PreRev |
0.0002046864092 s |
0.0002169565482996 s |
0.94 |
sum / JaXPipe / tpu / PostRev |
0.0002137984401999 s |
0.0002373330310998 s |
0.90 |
sum / JaXPipe / tpu / BothRev |
0.0002070249861999 s |
0.0002302827284998 s |
0.90 |
sum / JaX / cpu / Primal |
0.000006763747999957559 s |
0.000005023583999991387 s |
1.35 |
sum / JaXPipe / cpu / Primal |
0.000006712410100044508 s |
0.000005372708499999135 s |
1.25 |
sum / JaX / cpu / Forward |
0.000010158208499979082 s |
0.00000857204870003443 s |
1.19 |
sum / JaXPipe / cpu / Forward |
0.000010512850099985372 s |
0.00000850996489998579 s |
1.24 |
sum / JaX / cpu / BothRev |
0.000009701825000047392 s |
0.000007576139799994053 s |
1.28 |
sum / JaXPipe / cpu / PreRev |
0.000009251544300059322 s |
0.000007303304700008084 s |
1.27 |
sum / JaXPipe / cpu / PostRev |
0.000009572474000015064 s |
0.000007272418800039304 s |
1.32 |
sum / JaXPipe / cpu / BothRev |
0.000009252118899985364 s |
0.000007323511799995686 s |
1.26 |
jaxmd40 / JaXPipe / gpu / Primal |
0.0010341148998122 s |
0.0010188502999881 s |
1.01 |
jaxmd40 / JaX / gpu / Primal |
0.001012074801838 s |
0.0010181756999372 s |
0.99 |
jaxmd40 / HLOOpt / gpu / Primal |
0.0009600657969713 s |
0.0009588386999894 s |
1.00 |
jaxmd40 / PartOpt / gpu / Primal |
0.0009539607039187 s |
0.0009595328998329 s |
0.99 |
jaxmd40 / DefOpt / gpu / Primal |
0.0006844455027021 s |
0.0006881713999973 s |
0.99 |
jaxmd40 / IPartOpt / gpu / Primal |
0.0009528780996333 s |
0.0009497868000835 s |
1.00 |
jaxmd40 / IDefOpt / gpu / Primal |
0.000706940999953 s |
0.0007047128001431 s |
1.00 |
jaxmd40 / JaX / gpu / Forward |
0.0012602948001585 s |
0.0012650197999391 s |
1.00 |
jaxmd40 / JaXPipe / gpu / PostRev |
0.0040172839944716 s |
0.0040265638001073 s |
1.00 |
jaxmd40 / JaX / gpu / BothRev |
0.0040270674973726 s |
0.004030471400074 s |
1.00 |
jaxmd40 / HLOOpt / gpu / PostRev |
0.0040146662970073 s |
0.0040430701999866 s |
0.99 |
jaxmd40 / PartOpt / gpu / PostRev |
0.0041001567966304 s |
0.0041082302001086 s |
1.00 |
jaxmd40 / DefOpt / gpu / PostRev |
0.0022280031000263 s |
0.0022301827999399 s |
1.00 |
jaxmd40 / IPartOpt / gpu / PostRev |
0.0040864561975467 s |
0.0041123922001133 s |
0.99 |
jaxmd40 / IDefOpt / gpu / PostRev |
0.0025320045009721 s |
0.0025540420001561 s |
0.99 |
jaxmd40 / JaXPipe / tpu / Primal |
0.00008297889999084874 s |
0.00008552799990866334 s |
0.97 |
jaxmd40 / JaX / tpu / Primal |
0.0000955170000452199 s |
0.00009704799958853982 s |
0.98 |
jaxmd40 / HLOOpt / tpu / Primal |
0.00009611700006644242 s |
0.00009828399997786618 s |
0.98 |
jaxmd40 / PartOpt / tpu / Primal |
0.00009870300000329736 s |
0.0001040369999827 s |
0.95 |
jaxmd40 / DefOpt / tpu / Primal |
0.00009747099993546726 s |
0.00009933100009220652 s |
0.98 |
jaxmd40 / IPartOpt / tpu / Primal |
0.0001036760000715 s |
0.0000997780000034254 s |
1.04 |
jaxmd40 / IDefOpt / tpu / Primal |
0.0001012430000628 s |
0.00009857900004135444 s |
1.03 |
jaxmd40 / JaX / tpu / Forward |
0.0001864600000772 s |
0.0001789469999494 s |
1.04 |
jaxmd40 / JaXPipe / tpu / PostRev |
0.0001938380000865 s |
0.0001996730003156 s |
0.97 |
jaxmd40 / JaX / tpu / BothRev |
0.0002016369999182 s |
0.0002029668998147 s |
0.99 |
jaxmd40 / HLOOpt / tpu / PostRev |
0.0002042459998847 s |
0.0002013620003708 s |
1.01 |
jaxmd40 / PartOpt / tpu / PostRev |
0.0002094290000968 s |
0.0002020419997279 s |
1.04 |
jaxmd40 / DefOpt / tpu / PostRev |
0.0002069619999019 s |
0.0001945800002431 s |
1.06 |
jaxmd40 / IPartOpt / tpu / PostRev |
0.0002580809999926 s |
0.000194637000095 s |
1.33 |
jaxmd40 / IDefOpt / tpu / PostRev |
0.000202820999948 s |
0.0001989140000659 s |
1.02 |
jaxmd40 / JaXPipe / cpu / Primal |
0.0001034824000271 s |
0.0000745606999771553 s |
1.39 |
jaxmd40 / JaX / cpu / Primal |
0.0000579096000365098 s |
0.00004693930004577851 s |
1.23 |
jaxmd40 / HLOOpt / cpu / Primal |
0.00006811389994254569 s |
0.00006560740002896637 s |
1.04 |
jaxmd40 / PartOpt / cpu / Primal |
0.00008910030001061387 s |
0.0000610910000432341 s |
1.46 |
jaxmd40 / DefOpt / cpu / Primal |
0.00006508220003524912 s |
0.000056112699985533256 s |
1.16 |
jaxmd40 / IPartOpt / cpu / Primal |
0.00006922789998498047 s |
0.00007039359998088912 s |
0.98 |
jaxmd40 / IDefOpt / cpu / Primal |
0.00006374139993567952 s |
0.00006217230002221185 s |
1.03 |
jaxmd40 / JaX / cpu / Forward |
0.0001065703999302 s |
0.0001253925000128 s |
0.85 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.0001998273999561 s |
0.0001088297999558 s |
1.84 |
jaxmd40 / JaX / cpu / BothRev |
0.0001731671000015 s |
0.0001564086000144 s |
1.11 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.0001484636999521 s |
0.0001211591000355 s |
1.23 |
jaxmd40 / PartOpt / cpu / PostRev |
0.0002345550000427 s |
0.0001418984000338 s |
1.65 |
jaxmd40 / DefOpt / cpu / PostRev |
0.0001350610000372 s |
0.0001165472999673 s |
1.16 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.0001600910999513 s |
0.0001367823999316 s |
1.17 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.0002594933000182 s |
0.000118978699993 s |
2.18 |
This comment was automatically generated by workflow using github-action-benchmark.
…ng negate of constants Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
Changed to use a dedicated |
Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
};

// Fold negate of constant values immediately
struct NegConstFold final
@copilot you should reuse UnaryConstProp like UnaryConstProp<stablehlo::NotOp, stablehlo::notOp>.
Also while you're fixing this, update the copilot instructions as well
Changed to reuse UnaryConstProp<stablehlo::NegOp, stablehlo::negOp> in commit c52c974. Removed the custom NegConstFold pattern and added UnaryConstProp to the high-priority pattern list instead.
I didn't find any copilot instruction files in the repository to update. If you're referring to specific documentation or instruction files, please let me know where they are located.
Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
Co-authored-by: wsmoses <1260124+wsmoses@users.noreply.github.com>
we seem to already have this, I'm confused
Summary: Fix Negate of Constant Propagation

This PR fixes an optimization issue where patterns like NegMulConstSimplify and NegDivConstSimplify would create intermediate neg(const) operations that required an additional pass to constant-fold.

Changes
- Moved UnaryConstProp<stablehlo::NegOp, stablehlo::negOp> from standard patterns to high-priority pattern list (PatternBenefit 65000)
- [...] NegMulConstSimplify and NegDivConstSimplify to ensure immediate folding
- [...] negate_const_prop.mlir

Implementation Approach

Reused the existing UnaryConstProp template pattern infrastructure instead of creating a custom pattern. The UnaryConstProp<stablehlo::NegOp, stablehlo::negOp> pattern:
- Matches NegOp with constant operands
- Uses stablehlo::negOp from the reference library to compute negated constant values

By registering this pattern with high priority, when patterns like NegMulConstSimplify and NegDivConstSimplify create neg(const) operations, they are immediately folded in the same pass, eliminating the need for multiple optimization passes.

Test Fix

Fixed the test file format to match existing conventions in the repository: removed unnecessary module wrappers and used proper CHECK-LABEL directives.
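The folding behaviour described in the summary can be illustrated with a small standalone sketch. It is a toy model, not the Enzyme-JAX or MLIR implementation: the `Expr` node type, the `Pattern` struct, `simplify`, and `defaultPatterns` are all hypothetical names invented for this example, and only the pattern names and the benefit value 65000 are borrowed from the PR text. It shows a function-parameterized unary constant folder (in the spirit of reusing one template for several ops) and a driver that tries higher-benefit patterns first, so that the `neg(const)` produced by a `neg(mul(x, c))` rewrite is folded as soon as it is visited.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy expression IR (hypothetical; stands in for real MLIR operations).
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  enum Kind { Const, Var, Neg, Mul };
  Kind kind = Const;
  double value = 0.0;        // payload for Const nodes
  std::string name;          // payload for Var nodes
  std::vector<ExprPtr> ops;  // operands for Neg (1) and Mul (2)
};

ExprPtr makeExpr(Expr::Kind k) { auto e = std::make_shared<Expr>(); e->kind = k; return e; }
ExprPtr constant(double v) { auto e = makeExpr(Expr::Const); e->value = v; return e; }
ExprPtr var(const std::string &n) { auto e = makeExpr(Expr::Var); e->name = n; return e; }
ExprPtr neg(ExprPtr a) { auto e = makeExpr(Expr::Neg); e->ops = {a}; return e; }
ExprPtr mul(ExprPtr a, ExprPtr b) { auto e = makeExpr(Expr::Mul); e->ops = {a, b}; return e; }

// A rewrite returns the replacement node, or nullptr when it does not match.
using RewriteFn = ExprPtr (*)(const ExprPtr &);
struct Pattern {
  int benefit;        // higher benefit is tried first
  RewriteFn rewrite;
};

// Generic unary constant folder, parameterized by op kind and scalar function.
template <Expr::Kind K, double (*Fn)(double)>
ExprPtr unaryConstProp(const ExprPtr &e) {
  if (e->kind != K || e->ops[0]->kind != Expr::Const) return nullptr;
  return constant(Fn(e->ops[0]->value));
}

double negValue(double v) { return -v; }

// neg(mul(x, c)) -> mul(x, neg(c)); leaves an intermediate neg(const).
ExprPtr negMulConstSimplify(const ExprPtr &e) {
  if (e->kind != Expr::Neg) return nullptr;
  const ExprPtr &m = e->ops[0];
  if (m->kind != Expr::Mul || m->ops[1]->kind != Expr::Const) return nullptr;
  return mul(m->ops[0], neg(m->ops[1]));
}

// Simplify operands first, then try patterns in the given (sorted) order.
ExprPtr simplifyNode(ExprPtr e, const std::vector<Pattern> &patterns) {
  for (auto &op : e->ops) op = simplifyNode(op, patterns);
  for (const Pattern &p : patterns)
    if (ExprPtr r = p.rewrite(e)) return simplifyNode(r, patterns);
  return e;
}

ExprPtr simplify(ExprPtr e, std::vector<Pattern> patterns) {
  std::stable_sort(patterns.begin(), patterns.end(),
                   [](const Pattern &a, const Pattern &b) { return a.benefit > b.benefit; });
  return simplifyNode(e, patterns);
}

// Benefit values are illustrative; 65000 mirrors the number quoted in the PR.
std::vector<Pattern> defaultPatterns() {
  return {{1, negMulConstSimplify},
          {65000, unaryConstProp<Expr::Neg, negValue>}};
}
```

Calling `simplify(neg(mul(var("x"), constant(2.0))), defaultPatterns())` in this toy yields a `Mul` node whose second operand is the constant `-2.0`. Because the toy driver always runs to a fixpoint, the benefit values here only control which pattern is tried first; whether ordering changes the result of a real pass depends on how that pass's driver is configured.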