feat: remove more intermediate reshape operations #1870
680e48d to b2739f2
EnzymeJAX Benchmarks
| Benchmark suite | Current: 19db57f | Previous: 426a717 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000006742300001860713 s | 0.0000070261399650917154 s | 0.96 |
| actmtch / Jax / cpu / Primal | 0.000006291740005508473 s | 0.000007681720026084804 s | 0.82 |
| actmtch / HLOOpt / cpu / Primal | 0.000007618559995989927 s | 0.000008456099985778564 s | 0.90 |
| actmtch / PartOpt / cpu / Primal | 0.00000646157999881325 s | 0.000008354660039913142 s | 0.77 |
| actmtch / IPartOpt / cpu / Primal | 0.000006491059998552373 s | 0.000008498499983033981 s | 0.76 |
| actmtch / DefOpt / cpu / Primal | 0.000007149299999582582 s | 0.000008734280045246124 s | 0.82 |
| actmtch / IDefOpt / cpu / Primal | 0.000006879740008116641 s | 0.000008587439951952547 s | 0.80 |
| actmtch / JaXPipe / cpu / Forward | 0.000010864259993468294 s | 0.000012429179996615858 s | 0.87 |
| actmtch / Jax / cpu / Forward | 0.000009452840008634667 s | 0.000010656700023901068 s | 0.89 |
| actmtch / HLOOpt / cpu / Forward | 0.000010726260004503274 s | 0.000012589040015882348 s | 0.85 |
| actmtch / PartOpt / cpu / Forward | 0.000010353419997954916 s | 0.000011920439947061825 s | 0.87 |
| actmtch / IPartOpt / cpu / Forward | 0.000010768940007892523 s | 0.000012381419974190069 s | 0.87 |
| actmtch / DefOpt / cpu / Forward | 0.00001013312000395672 s | 0.000012271340056031477 s | 0.83 |
| actmtch / IDefOpt / cpu / Forward | 0.00001064875999418291 s | 0.000012316779984757888 s | 0.86 |
| actmtch / JaXPipe / cpu / PreRev | 0.000010897239983478355 s | 0.00001274931999432738 s | 0.85 |
| actmtch / JaXPipe / cpu / PostRev | 0.000009988120011712451 s | 0.000011261919989919989 s | 0.89 |
| actmtch / JaXPipe / cpu / BothRev | 0.000010810720007157214 s | 0.000012745980011459324 s | 0.85 |
| actmtch / Jax / cpu / BothRev | 0.00000941728000270814 s | 0.00001086691999262257 s | 0.87 |
| actmtch / HLOOpt / cpu / PreRev | 0.000010659940001005452 s | 0.000012499179983933571 s | 0.85 |
| actmtch / HLOOpt / cpu / PostRev | 0.000013300379998781864 s | 0.000014753979976376283 s | 0.90 |
| actmtch / HLOOpt / cpu / BothRev | 0.000010641840003700051 s | 0.00001177669998469355 s | 0.90 |
| actmtch / PartOpt / cpu / PreRev | 0.000010860399997909552 s | 0.000012723199997708434 s | 0.85 |
| actmtch / PartOpt / cpu / PostRev | 0.000009844079995673384 s | 0.000011315820029267342 s | 0.87 |
| actmtch / PartOpt / cpu / BothRev | 0.0000115475599977799 s | 0.000012766059990099166 s | 0.90 |
| actmtch / IPartOpt / cpu / PreRev | 0.000010369440003614729 s | 0.000012880159993073904 s | 0.81 |
| actmtch / IPartOpt / cpu / PostRev | 0.000009463919993777382 s | 0.000010472339963598645 s | 0.90 |
| actmtch / IPartOpt / cpu / BothRev | 0.000010794539998641994 s | 0.000012367339950287716 s | 0.87 |
| actmtch / DefOpt / cpu / PreRev | 0.000010389639987806733 s | 0.000012020940012007486 s | 0.86 |
| actmtch / DefOpt / cpu / PostRev | 0.000011129979998258933 s | 0.000012939700009155783 s | 0.86 |
| actmtch / DefOpt / cpu / BothRev | 0.000010952219993214385 s | 0.000012443320019883686 s | 0.88 |
| actmtch / IDefOpt / cpu / PreRev | 0.000010425560001294798 s | 0.000012142540026616188 s | 0.86 |
| actmtch / IDefOpt / cpu / PostRev | 0.000011277080000127171 s | 0.00001286107997657382 s | 0.88 |
| actmtch / IDefOpt / cpu / BothRev | 0.00001113764000365336 s | 0.000012399619981806609 s | 0.90 |
| actmtch / JaXPipe / cuda / Primal | 0.0000024 s | 0.000002016 s | 1.19 |
| actmtch / Jax / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / HLOOpt / cuda / Primal | 0.0000024 s | 0.000002016 s | 1.19 |
| actmtch / PartOpt / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / IPartOpt / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / DefOpt / cuda / Primal | 0.000002399 s | 0.000002015 s | 1.19 |
| actmtch / IDefOpt / cuda / Primal | 0.0000024 s | 0.000002016 s | 1.19 |
| actmtch / JaXPipe / cuda / Forward | 0.000010336 s | 0.000009887 s | 1.05 |
| actmtch / Jax / cuda / Forward | 0.000010336 s | 0.000009888 s | 1.05 |
| actmtch / HLOOpt / cuda / Forward | 0.000010336 s | 0.000010048 s | 1.03 |
| actmtch / PartOpt / cuda / Forward | 0.000010367 s | 0.000010112 s | 1.03 |
| actmtch / IPartOpt / cuda / Forward | 0.000010464 s | 0.00000992 s | 1.05 |
| actmtch / DefOpt / cuda / Forward | 0.000010272 s | 0.000009696 s | 1.06 |
| actmtch / IDefOpt / cuda / Forward | 0.000010432 s | 0.000010272 s | 1.02 |
| actmtch / JaXPipe / cuda / PreRev | 0.00001088 s | 0.000009215 s | 1.18 |
| actmtch / JaXPipe / cuda / PostRev | 0.000010464 s | 0.000010624 s | 0.98 |
| actmtch / JaXPipe / cuda / BothRev | 0.00001056 s | 0.00000992 s | 1.06 |
| actmtch / Jax / cuda / BothRev | 0.000010879 s | 0.000010271 s | 1.06 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010272 s | 0.000010144 s | 1.01 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010848 s | 0.00001024 s | 1.06 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010593 s | 0.000010304 s | 1.03 |
| actmtch / PartOpt / cuda / PreRev | 0.000010624 s | 0.00000992 s | 1.07 |
| actmtch / PartOpt / cuda / PostRev | 0.0000104 s | 0.00001008 s | 1.03 |
| actmtch / PartOpt / cuda / BothRev | 0.00001024 s | 0.000010015 s | 1.02 |
| actmtch / IPartOpt / cuda / PreRev | 0.000013056 s | 0.000010336 s | 1.26 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010592 s | 0.000010272 s | 1.03 |
| actmtch / IPartOpt / cuda / BothRev | 0.000012 s | 0.0000104 s | 1.15 |
| actmtch / DefOpt / cuda / PreRev | 0.000013088 s | 0.000010112 s | 1.29 |
| actmtch / DefOpt / cuda / PostRev | 0.000010944 s | 0.000011712 s | 0.93 |
| actmtch / DefOpt / cuda / BothRev | 0.000010624 s | 0.000010081 s | 1.05 |
| actmtch / IDefOpt / cuda / PreRev | 0.000010816 s | 0.000010177 s | 1.06 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010304 s | 0.000009696 s | 1.06 |
| actmtch / IDefOpt / cuda / BothRev | 0.000010976 s | 0.000010208 s | 1.08 |
| actmtch / JaXPipe / tpu / Primal | 5.6325e-7 s | 5.63175e-7 s | 1.00 |
| actmtch / Jax / tpu / Primal | 5.969000000000001e-7 s | 5.96825e-7 s | 1.00 |
| actmtch / HLOOpt / tpu / Primal | 0.0000021011 s | 0.000002097775 s | 1.00 |
| actmtch / PartOpt / tpu / Primal | 5.96725e-7 s | 5.9665e-7 s | 1.00 |
| actmtch / IPartOpt / tpu / Primal | 5.521000000000001e-7 s | 5.52675e-7 s | 1.00 |
| actmtch / DefOpt / tpu / Primal | 0.0000021594 s | 0.000002164025 s | 1.00 |
| actmtch / IDefOpt / tpu / Primal | 0.00000210305 s | 0.000002095375 s | 1.00 |
| actmtch / JaXPipe / tpu / Forward | 0.000003819525 s | 0.0000038255 s | 1.00 |
| actmtch / Jax / tpu / Forward | 0.0000012143 s | 0.0000012064999999999998 s | 1.01 |
| actmtch / HLOOpt / tpu / Forward | 0.000003936225 s | 0.0000039346750000000006 s | 1.00 |
| actmtch / PartOpt / tpu / Forward | 0.000003917849999999999 s | 0.00000392895 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.000003934375000000001 s | 0.00000393315 s | 1.00 |
| actmtch / DefOpt / tpu / Forward | 0.0000039066 s | 0.00000391195 s | 1.00 |
| actmtch / IDefOpt / tpu / Forward | 0.000003941225 s | 0.0000039288250000000005 s | 1.00 |
| actmtch / JaXPipe / tpu / PreRev | 0.0000034789750000000004 s | 0.00000348385 s | 1.00 |
| actmtch / JaXPipe / tpu / PostRev | 0.000001632725 s | 0.000001634375 s | 1.00 |
| actmtch / JaXPipe / tpu / BothRev | 0.0000034667500000000003 s | 0.000003499325 s | 0.99 |
| actmtch / Jax / tpu / BothRev | 0.0000016350999999999998 s | 0.00000163605 s | 1.00 |
| actmtch / HLOOpt / tpu / PreRev | 0.00000347495 s | 0.00000348265 s | 1.00 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003410075 s | 0.000003403275 s | 1.00 |
| actmtch / HLOOpt / tpu / BothRev | 0.000003473925 s | 0.00000347005 s | 1.00 |
| actmtch / PartOpt / tpu / PreRev | 0.0000034061000000000003 s | 0.00000340765 s | 1.00 |
| actmtch / PartOpt / tpu / PostRev | 0.0000016018 s | 0.0000015938749999999998 s | 1.00 |
| actmtch / PartOpt / tpu / BothRev | 0.0000034093750000000003 s | 0.000003406875 s | 1.00 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003480125 s | 0.00000347205 s | 1.00 |
| actmtch / IPartOpt / tpu / PostRev | 0.0000016339 s | 0.00000163595 s | 1.00 |
| actmtch / IPartOpt / tpu / BothRev | 0.0000034837249999999994 s | 0.000003471625 s | 1.00 |
| actmtch / DefOpt / tpu / PreRev | 0.000003405775 s | 0.000003417425 s | 1.00 |
| actmtch / DefOpt / tpu / PostRev | 0.000003415775 s | 0.00000340655 s | 1.00 |
| actmtch / DefOpt / tpu / BothRev | 0.000003416125 s | 0.000003411675 s | 1.00 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003487 s | 0.00000347155 s | 1.00 |
| actmtch / IDefOpt / tpu / PostRev | 0.000003415475 s | 0.000003411 s | 1.00 |
| actmtch / IDefOpt / tpu / BothRev | 0.000003469975 s | 0.000003475375 s | 1.00 |
| actmtch / JaXPipe / cpu / Primal | 0.000013009 s | 0.0000070261399650917154 s | 1.85 |
| actmtch / Jax / cpu / Primal | 0.000013267 s | 0.000007681720026084804 s | 1.73 |
| actmtch / HLOOpt / cpu / Primal | 0.000013973 s | 0.000008456099985778564 s | 1.65 |
| actmtch / PartOpt / cpu / Primal | 0.000013176 s | 0.000008354660039913142 s | 1.58 |
| actmtch / IPartOpt / cpu / Primal | 0.00001348 s | 0.000008498499983033981 s | 1.59 |
| actmtch / DefOpt / cpu / Primal | 0.000014176 s | 0.000008734280045246124 s | 1.62 |
| actmtch / IDefOpt / cpu / Primal | 0.000013974 s | 0.000008587439951952547 s | 1.63 |
| actmtch / JaXPipe / cpu / Forward | 0.000019276000000000003 s | 0.000012429179996615858 s | 1.55 |
| actmtch / Jax / cpu / Forward | 0.000017835 s | 0.000010656700023901068 s | 1.67 |
| actmtch / HLOOpt / cpu / Forward | 0.00001858 s | 0.000012589040015882348 s | 1.48 |
| actmtch / PartOpt / cpu / Forward | 0.000018141 s | 0.000011920439947061825 s | 1.52 |
| actmtch / IPartOpt / cpu / Forward | 0.000018977 s | 0.000012381419974190069 s | 1.53 |
| actmtch / DefOpt / cpu / Forward | 0.000018433 s | 0.000012271340056031477 s | 1.50 |
| actmtch / IDefOpt / cpu / Forward | 0.000018822 s | 0.000012316779984757888 s | 1.53 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019474 s | 0.00001274931999432738 s | 1.53 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017204 s | 0.000011261919989919989 s | 1.53 |
| actmtch / JaXPipe / cpu / BothRev | 0.000019094 s | 0.000012745980011459324 s | 1.50 |
| actmtch / Jax / cpu / BothRev | 0.000017536 s | 0.00001086691999262257 s | 1.61 |
| actmtch / HLOOpt / cpu / PreRev | 0.000019077 s | 0.000012499179983933571 s | 1.53 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019135 s | 0.000014753979976376283 s | 1.30 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019217000000000003 s | 0.00001177669998469355 s | 1.63 |
| actmtch / PartOpt / cpu / PreRev | 0.000018395000000000003 s | 0.000012723199997708434 s | 1.45 |
| actmtch / PartOpt / cpu / PostRev | 0.000017412000000000002 s | 0.000011315820029267342 s | 1.54 |
| actmtch / PartOpt / cpu / BothRev | 0.000019476 s | 0.000012766059990099166 s | 1.53 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019292 s | 0.000012880159993073904 s | 1.50 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017224 s | 0.000010472339963598645 s | 1.64 |
| actmtch / IPartOpt / cpu / BothRev | 0.000020125 s | 0.000012367339950287716 s | 1.63 |
| actmtch / DefOpt / cpu / PreRev | 0.000018937 s | 0.000012020940012007486 s | 1.58 |
| actmtch / DefOpt / cpu / PostRev | 0.00001883 s | 0.000012939700009155783 s | 1.46 |
| actmtch / DefOpt / cpu / BothRev | 0.000019222 s | 0.000012443320019883686 s | 1.54 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019002 s | 0.000012142540026616188 s | 1.56 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019448 s | 0.00001286107997657382 s | 1.51 |
| actmtch / IDefOpt / cpu / BothRev | 0.000020138 s | 0.000012399619981806609 s | 1.62 |
| actmtch / JaXPipe / cpu / Primal | 0.00001 s | 0.0000070261399650917154 s | 1.42 |
| actmtch / Jax / cpu / Primal | 0.000008999999999999999 s | 0.000007681720026084804 s | 1.17 |
| actmtch / HLOOpt / cpu / Primal | 0.00001 s | 0.000008456099985778564 s | 1.18 |
| actmtch / PartOpt / cpu / Primal | 0.00001 s | 0.000008354660039913142 s | 1.20 |
| actmtch / IPartOpt / cpu / Primal | 0.000008999999999999999 s | 0.000008498499983033981 s | 1.06 |
| actmtch / DefOpt / cpu / Primal | 0.00001 s | 0.000008734280045246124 s | 1.14 |
| actmtch / IDefOpt / cpu / Primal | 0.00001 s | 0.000008587439951952547 s | 1.16 |
| actmtch / JaXPipe / cpu / Forward | 0.000015 s | 0.000012429179996615858 s | 1.21 |
| actmtch / Jax / cpu / Forward | 0.000013 s | 0.000010656700023901068 s | 1.22 |
| actmtch / HLOOpt / cpu / Forward | 0.000015 s | 0.000012589040015882348 s | 1.19 |
| actmtch / PartOpt / cpu / Forward | 0.000015 s | 0.000011920439947061825 s | 1.26 |
| actmtch / IPartOpt / cpu / Forward | 0.000014 s | 0.000012381419974190069 s | 1.13 |
| actmtch / DefOpt / cpu / Forward | 0.000014 s | 0.000012271340056031477 s | 1.14 |
| actmtch / IDefOpt / cpu / Forward | 0.000014 s | 0.000012316779984757888 s | 1.14 |
| actmtch / JaXPipe / cpu / PreRev | 0.000013 s | 0.00001274931999432738 s | 1.02 |
| actmtch / JaXPipe / cpu / PostRev | 0.000012 s | 0.000011261919989919989 s | 1.07 |
| actmtch / JaXPipe / cpu / BothRev | 0.000014 s | 0.000012745980011459324 s | 1.10 |
| actmtch / Jax / cpu / BothRev | 0.000013 s | 0.00001086691999262257 s | 1.20 |
| actmtch / HLOOpt / cpu / PreRev | 0.000014 s | 0.000012499179983933571 s | 1.12 |
| actmtch / HLOOpt / cpu / PostRev | 0.000014 s | 0.000014753979976376283 s | 0.95 |
| actmtch / HLOOpt / cpu / BothRev | 0.000014 s | 0.00001177669998469355 s | 1.19 |
| actmtch / PartOpt / cpu / PreRev | 0.000015 s | 0.000012723199997708434 s | 1.18 |
| actmtch / PartOpt / cpu / PostRev | 0.000013 s | 0.000011315820029267342 s | 1.15 |
| actmtch / PartOpt / cpu / BothRev | 0.000015 s | 0.000012766059990099166 s | 1.17 |
| actmtch / IPartOpt / cpu / PreRev | 0.000014 s | 0.000012880159993073904 s | 1.09 |
| actmtch / IPartOpt / cpu / PostRev | 0.000013 s | 0.000010472339963598645 s | 1.24 |
| actmtch / IPartOpt / cpu / BothRev | 0.000015 s | 0.000012367339950287716 s | 1.21 |
| actmtch / DefOpt / cpu / PreRev | 0.000014 s | 0.000012020940012007486 s | 1.16 |
| actmtch / DefOpt / cpu / PostRev | 0.000014 s | 0.000012939700009155783 s | 1.08 |
| actmtch / DefOpt / cpu / BothRev | 0.000014 s | 0.000012443320019883686 s | 1.13 |
| actmtch / IDefOpt / cpu / PreRev | 0.000014 s | 0.000012142540026616188 s | 1.15 |
| actmtch / IDefOpt / cpu / PostRev | 0.000015 s | 0.00001286107997657382 s | 1.17 |
| actmtch / IDefOpt / cpu / BothRev | 0.000014 s | 0.000012399619981806609 s | 1.13 |
| add_one / JaXPipe / cpu / Primal | 0.000006527319999349856 s | 0.00000767638000070292 s | 0.85 |
| add_one / Jax / cpu / Primal | 0.000006504780008071975 s | 0.000007881059946157621 s | 0.83 |
| add_one / HLOOpt / cpu / Primal | 0.000006477259994426276 s | 0.00000769876004596881 s | 0.84 |
| add_one / PartOpt / cpu / Primal | 0.000006698579993553722 s | 0.00000769573998695705 s | 0.87 |
| add_one / IPartOpt / cpu / Primal | 0.000006570080004166812 s | 0.000007311640019906917 s | 0.90 |
| add_one / DefOpt / cpu / Primal | 0.000006708960004289111 s | 0.000006927979984538979 s | 0.97 |
| add_one / IDefOpt / cpu / Primal | 0.000006464819994107529 s | 0.000007248140027513728 s | 0.89 |
| add_one / JaXPipe / cpu / Forward | 0.00001014962000454034 s | 0.000011312520000501535 s | 0.90 |
| add_one / Jax / cpu / Forward | 0.000009547139995902398 s | 0.000010935220007013414 s | 0.87 |
| add_one / HLOOpt / cpu / Forward | 0.000010000679994845995 s | 0.00001113865996558161 s | 0.90 |
| add_one / PartOpt / cpu / Forward | 0.00000981631999138699 s | 0.000011150479995194472 s | 0.88 |
| add_one / IPartOpt / cpu / Forward | 0.000010321519998797158 s | 0.00001149292000263813 s | 0.90 |
| add_one / DefOpt / cpu / Forward | 0.000009844679993875616 s | 0.000011125779992653409 s | 0.88 |
| add_one / IDefOpt / cpu / Forward | 0.000009882099989226844 s | 0.000010953519959002734 s | 0.90 |
| add_one / JaXPipe / cpu / PreRev | 0.000011326540000027308 s | 0.00001329155999883369 s | 0.85 |
| add_one / JaXPipe / cpu / PostRev | 0.00001145702001394966 s | 0.00001282565998735663 s | 0.89 |
| add_one / JaXPipe / cpu / BothRev | 0.000011704820005888903 s | 0.000013342939955691691 s | 0.88 |
| add_one / Jax / cpu / BothRev | 0.000011226359993088408 s | 0.000013148820007700124 s | 0.85 |
| add_one / HLOOpt / cpu / PreRev | 0.000011924759999146772 s | 0.000013094879996060629 s | 0.91 |
| add_one / HLOOpt / cpu / PostRev | 0.00001298861999885048 s | 0.000014685159976579598 s | 0.88 |
| add_one / HLOOpt / cpu / BothRev | 0.00001104727999518218 s | 0.000013228520047050553 s | 0.84 |
| add_one / PartOpt / cpu / PreRev | 0.00001146232000337477 s | 0.000012965020023329998 s | 0.88 |
| add_one / PartOpt / cpu / PostRev | 0.000010983319991737516 s | 0.000012663999968935967 s | 0.87 |
| add_one / PartOpt / cpu / BothRev | 0.000011836359999506384 s | 0.000014077939986236744 s | 0.84 |
| add_one / IPartOpt / cpu / PreRev | 0.000011301260001346235 s | 0.00001328809998994984 s | 0.85 |
| add_one / IPartOpt / cpu / PostRev | 0.000010790559999804829 s | 0.00001288855998609506 s | 0.84 |
| add_one / IPartOpt / cpu / BothRev | 0.000011510620004173688 s | 0.000013109620003888268 s | 0.88 |
| add_one / DefOpt / cpu / PreRev | 0.00001107387999581988 s | 0.000013311499997143982 s | 0.83 |
| add_one / DefOpt / cpu / PostRev | 0.000011495440014641643 s | 0.000012807700004486833 s | 0.90 |
| add_one / DefOpt / cpu / BothRev | 0.000011469480002688214 s | 0.00001317183998253313 s | 0.87 |
| add_one / IDefOpt / cpu / PreRev | 0.000011350160002621124 s | 0.00001243973999407899 s | 0.91 |
| add_one / IDefOpt / cpu / PostRev | 0.000011141860009047375 s | 0.00001230946002579003 s | 0.91 |
| add_one / IDefOpt / cpu / BothRev | 0.00001152155999761817 s | 0.00001255659997696057 s | 0.92 |
| add_one / JaXPipe / cuda / Primal | 0.000002303 s | 0.0000019200000000000003 s | 1.20 |
| add_one / Jax / cuda / Primal | 0.000002303 s | 0.0000019200000000000003 s | 1.20 |
| add_one / HLOOpt / cuda / Primal | 0.000002304 s | 0.0000019200000000000003 s | 1.20 |
| add_one / PartOpt / cuda / Primal | 0.000002303 s | 0.0000019200000000000003 s | 1.20 |
| add_one / IPartOpt / cuda / Primal | 0.000002303 s | 0.0000019200000000000003 s | 1.20 |
| add_one / DefOpt / cuda / Primal | 0.000002303 s | 0.0000019200000000000003 s | 1.20 |
| add_one / IDefOpt / cuda / Primal | 0.000002272 s | 0.0000019200000000000003 s | 1.18 |
| add_one / JaXPipe / cuda / Forward | 0.00001024 s | 0.000010016 s | 1.02 |
| add_one / Jax / cuda / Forward | 0.0000104 s | 0.000009792 s | 1.06 |
| add_one / HLOOpt / cuda / Forward | 0.000010464 s | 0.000010145 s | 1.03 |
| add_one / PartOpt / cuda / Forward | 0.00001088 s | 0.00001008 s | 1.08 |
| add_one / IPartOpt / cuda / Forward | 0.000010433 s | 0.00000992 s | 1.05 |
| add_one / DefOpt / cuda / Forward | 0.000009919 s | 0.000010144 s | 0.98 |
| add_one / IDefOpt / cuda / Forward | 0.00001072 s | 0.000009761 s | 1.10 |
| add_one / JaXPipe / cuda / PreRev | 0.000026176 s | 0.000024704 s | 1.06 |
| add_one / JaXPipe / cuda / PostRev | 0.0000256 s | 0.000024256 s | 1.06 |
| add_one / JaXPipe / cuda / BothRev | 0.000024928 s | 0.000024577 s | 1.01 |
| add_one / Jax / cuda / BothRev | 0.00002544 s | 0.000025088 s | 1.01 |
| add_one / HLOOpt / cuda / PreRev | 0.000025632 s | 0.000025088 s | 1.02 |
| add_one / HLOOpt / cuda / PostRev | 0.000025344 s | 0.000024256 s | 1.04 |
| add_one / HLOOpt / cuda / BothRev | 0.0000256 s | 0.0000312 s | 0.82 |
| add_one / PartOpt / cuda / PreRev | 0.00002512 s | 0.000024896 s | 1.01 |
| add_one / PartOpt / cuda / PostRev | 0.000025824 s | 0.000024609 s | 1.05 |
| add_one / PartOpt / cuda / BothRev | 0.000025728 s | 0.000024768 s | 1.04 |
| add_one / IPartOpt / cuda / PreRev | 0.000025856 s | 0.000024768 s | 1.04 |
| add_one / IPartOpt / cuda / PostRev | 0.000025984 s | 0.000024704 s | 1.05 |
| add_one / IPartOpt / cuda / BothRev | 0.000025696 s | 0.000023808 s | 1.08 |
| add_one / DefOpt / cuda / PreRev | 0.000025664 s | 0.000025024 s | 1.03 |
| add_one / DefOpt / cuda / PostRev | 0.000025696 s | 0.000025217 s | 1.02 |
| add_one / DefOpt / cuda / BothRev | 0.00002576 s | 0.00002528 s | 1.02 |
| add_one / IDefOpt / cuda / PreRev | 0.000026336 s | 0.00002496 s | 1.06 |
| add_one / IDefOpt / cuda / PostRev | 0.000024992 s | 0.00002496 s | 1.00 |
| add_one / IDefOpt / cuda / BothRev | 0.000025791 s | 0.00002432 s | 1.06 |
| add_one / JaXPipe / tpu / Primal | 0.000001422725 s | 0.0000014234749999999995 s | 1.00 |
| add_one / Jax / tpu / Primal | 0.000001399125 s | 0.00000140335 s | 1.00 |
| add_one / HLOOpt / tpu / Primal | 0.0000014349999999999998 s | 0.000001423325 s | 1.01 |
| add_one / PartOpt / tpu / Primal | 0.00000141155 s | 0.0000014036 s | 1.01 |
| add_one / IPartOpt / tpu / Primal | 0.0000014268500000000002 s | 0.000001422475 s | 1.00 |
| add_one / DefOpt / tpu / Primal | 0.0000014048749999999998 s | 0.000001397825 s | 1.01 |
| add_one / IDefOpt / tpu / Primal | 0.0000014229 s | 0.000001422125 s | 1.00 |
| add_one / JaXPipe / tpu / Forward | 0.00000185045 s | 0.00000184455 s | 1.00 |
| add_one / Jax / tpu / Forward | 0.000001838 s | 0.000001838475 s | 1.00 |
| add_one / HLOOpt / tpu / Forward | 0.00000184775 s | 0.000001845525 s | 1.00 |
| add_one / PartOpt / tpu / Forward | 0.00000183505 s | 0.00000183625 s | 1.00 |
| add_one / IPartOpt / tpu / Forward | 0.00000185085 s | 0.0000018463 s | 1.00 |
| add_one / DefOpt / tpu / Forward | 0.000001838075 s | 0.00000183405 s | 1.00 |
| add_one / IDefOpt / tpu / Forward | 0.000001843775 s | 0.00000185615 s | 0.99 |
| add_one / JaXPipe / tpu / PreRev | 0.000002246525 s | 0.0000022316 s | 1.01 |
| add_one / JaXPipe / tpu / PostRev | 0.000002234025 s | 0.000002234475 s | 1.00 |
| add_one / JaXPipe / tpu / BothRev | 0.00000223895 s | 0.000002245975 s | 1.00 |
| add_one / Jax / tpu / BothRev | 0.0000022354250000000003 s | 0.00000224375 s | 1.00 |
| add_one / HLOOpt / tpu / PreRev | 0.0000022336750000000003 s | 0.000002241625 s | 1.00 |
| add_one / HLOOpt / tpu / PostRev | 0.00000224425 s | 0.0000022375999999999995 s | 1.00 |
| add_one / HLOOpt / tpu / BothRev | 0.000002241425 s | 0.0000022452 s | 1.00 |
| add_one / PartOpt / tpu / PreRev | 0.00000224165 s | 0.00000223565 s | 1.00 |
| add_one / PartOpt / tpu / PostRev | 0.000002253175 s | 0.00000224485 s | 1.00 |
| add_one / PartOpt / tpu / BothRev | 0.00000224235 s | 0.0000022485 s | 1.00 |
| add_one / IPartOpt / tpu / PreRev | 0.000002242075 s | 0.0000022385 s | 1.00 |
| add_one / IPartOpt / tpu / PostRev | 0.000002237775 s | 0.000002240525 s | 1.00 |
| add_one / IPartOpt / tpu / BothRev | 0.000002242175 s | 0.0000022301 s | 1.01 |
| add_one / DefOpt / tpu / PreRev | 0.00000224545 s | 0.0000022470250000000003 s | 1.00 |
| add_one / DefOpt / tpu / PostRev | 0.000002236075 s | 0.00000223715 s | 1.00 |
| add_one / DefOpt / tpu / BothRev | 0.000002252375 s | 0.000002235075 s | 1.01 |
| add_one / IDefOpt / tpu / PreRev | 0.000002242 s | 0.000002230725 s | 1.01 |
| add_one / IDefOpt / tpu / PostRev | 0.000002245225 s | 0.000002232775 s | 1.01 |
| add_one / IDefOpt / tpu / BothRev | 0.0000022384 s | 0.000002229975 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000013332 s | 0.00000767638000070292 s | 1.74 |
| add_one / Jax / cpu / Primal | 0.00001297 s | 0.000007881059946157621 s | 1.65 |
| add_one / HLOOpt / cpu / Primal | 0.000012952 s | 0.00000769876004596881 s | 1.68 |
| add_one / PartOpt / cpu / Primal | 0.000012952 s | 0.00000769573998695705 s | 1.68 |
| add_one / IPartOpt / cpu / Primal | 0.000012478 s | 0.000007311640019906917 s | 1.71 |
| add_one / DefOpt / cpu / Primal | 0.000013041 s | 0.000006927979984538979 s | 1.88 |
| add_one / IDefOpt / cpu / Primal | 0.000013255 s | 0.000007248140027513728 s | 1.83 |
| add_one / JaXPipe / cpu / Forward | 0.000017367 s | 0.000011312520000501535 s | 1.54 |
| add_one / Jax / cpu / Forward | 0.000017168 s | 0.000010935220007013414 s | 1.57 |
| add_one / HLOOpt / cpu / Forward | 0.000017364000000000002 s | 0.00001113865996558161 s | 1.56 |
| add_one / PartOpt / cpu / Forward | 0.000017371 s | 0.000011150479995194472 s | 1.56 |
| add_one / IPartOpt / cpu / Forward | 0.000016952 s | 0.00001149292000263813 s | 1.47 |
| add_one / DefOpt / cpu / Forward | 0.000017641 s | 0.000011125779992653409 s | 1.59 |
| add_one / IDefOpt / cpu / Forward | 0.00001714 s | 0.000010953519959002734 s | 1.56 |
| add_one / JaXPipe / cpu / PreRev | 0.00001976 s | 0.00001329155999883369 s | 1.49 |
| add_one / JaXPipe / cpu / PostRev | 0.000019379 s | 0.00001282565998735663 s | 1.51 |
| add_one / JaXPipe / cpu / BothRev | 0.000019312 s | 0.000013342939955691691 s | 1.45 |
| add_one / Jax / cpu / BothRev | 0.000019316 s | 0.000013148820007700124 s | 1.47 |
| add_one / HLOOpt / cpu / PreRev | 0.000019938 s | 0.000013094879996060629 s | 1.52 |
| add_one / HLOOpt / cpu / PostRev | 0.000019886 s | 0.000014685159976579598 s | 1.35 |
| add_one / HLOOpt / cpu / BothRev | 0.0000196 s | 0.000013228520047050553 s | 1.48 |
| add_one / PartOpt / cpu / PreRev | 0.000019641 s | 0.000012965020023329998 s | 1.51 |
| add_one / PartOpt / cpu / PostRev | 0.000019701 s | 0.000012663999968935967 s | 1.56 |
| add_one / PartOpt / cpu / BothRev | 0.000019832 s | 0.000014077939986236744 s | 1.41 |
| add_one / IPartOpt / cpu / PreRev | 0.000020519 s | 0.00001328809998994984 s | 1.54 |
| add_one / IPartOpt / cpu / PostRev | 0.000019593 s | 0.00001288855998609506 s | 1.52 |
| add_one / IPartOpt / cpu / BothRev | 0.000020029 s | 0.000013109620003888268 s | 1.53 |
| add_one / DefOpt / cpu / PreRev | 0.0000198 s | 0.000013311499997143982 s | 1.49 |
| add_one / DefOpt / cpu / PostRev | 0.000019896 s | 0.000012807700004486833 s | 1.55 |
| add_one / DefOpt / cpu / BothRev | 0.000020492 s | 0.00001317183998253313 s | 1.56 |
| add_one / IDefOpt / cpu / PreRev | 0.000019734 s | 0.00001243973999407899 s | 1.59 |
| add_one / IDefOpt / cpu / PostRev | 0.000020003 s | 0.00001230946002579003 s | 1.63 |
| add_one / IDefOpt / cpu / BothRev | 0.000020178 s | 0.00001255659997696057 s | 1.61 |
| add_one / JaXPipe / cpu / Primal | 0.000008999999999999999 s | 0.00000767638000070292 s | 1.17 |
| add_one / Jax / cpu / Primal | 0.000008999999999999999 s | 0.000007881059946157621 s | 1.14 |
| add_one / HLOOpt / cpu / Primal | 0.000008999999999999999 s | 0.00000769876004596881 s | 1.17 |
| add_one / PartOpt / cpu / Primal | 0.000008999999999999999 s | 0.00000769573998695705 s | 1.17 |
| add_one / IPartOpt / cpu / Primal | 0.000008999999999999999 s | 0.000007311640019906917 s | 1.23 |
| add_one / DefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006927979984538979 s | 1.30 |
| add_one / IDefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000007248140027513728 s | 1.24 |
| add_one / JaXPipe / cpu / Forward | 0.000012 s | 0.000011312520000501535 s | 1.06 |
| add_one / Jax / cpu / Forward | 0.000012 s | 0.000010935220007013414 s | 1.10 |
| add_one / HLOOpt / cpu / Forward | 0.000012 s | 0.00001113865996558161 s | 1.08 |
| add_one / PartOpt / cpu / Forward | 0.000012 s | 0.000011150479995194472 s | 1.08 |
| add_one / IPartOpt / cpu / Forward | 0.000012 s | 0.00001149292000263813 s | 1.04 |
| add_one / DefOpt / cpu / Forward | 0.000012 s | 0.000011125779992653409 s | 1.08 |
| add_one / IDefOpt / cpu / Forward | 0.000012 s | 0.000010953519959002734 s | 1.10 |
| add_one / JaXPipe / cpu / PreRev | 0.000014 s | 0.00001329155999883369 s | 1.05 |
| add_one / JaXPipe / cpu / PostRev | 0.000014 s | 0.00001282565998735663 s | 1.09 |
| add_one / JaXPipe / cpu / BothRev | 0.000014 s | 0.000013342939955691691 s | 1.05 |
| add_one / Jax / cpu / BothRev | 0.000014 s | 0.000013148820007700124 s | 1.06 |
| add_one / HLOOpt / cpu / PreRev | 0.000014 s | 0.000013094879996060629 s | 1.07 |
| add_one / HLOOpt / cpu / PostRev | 0.000014 s | 0.000014685159976579598 s | 0.95 |
| add_one / HLOOpt / cpu / BothRev | 0.000014 s | 0.000013228520047050553 s | 1.06 |
| add_one / PartOpt / cpu / PreRev | 0.000014 s | 0.000012965020023329998 s | 1.08 |
| add_one / PartOpt / cpu / PostRev | 0.000014 s | 0.000012663999968935967 s | 1.11 |
| add_one / PartOpt / cpu / BothRev | 0.000014 s | 0.000014077939986236744 s | 0.99 |
| add_one / IPartOpt / cpu / PreRev | 0.000014 s | 0.00001328809998994984 s | 1.05 |
| add_one / IPartOpt / cpu / PostRev | 0.000014 s | 0.00001288855998609506 s | 1.09 |
| add_one / IPartOpt / cpu / BothRev | 0.000015 s | 0.000013109620003888268 s | 1.14 |
| add_one / DefOpt / cpu / PreRev | 0.000014 s | 0.000013311499997143982 s | 1.05 |
| add_one / DefOpt / cpu / PostRev | 0.000015 s | 0.000012807700004486833 s | 1.17 |
| add_one / DefOpt / cpu / BothRev | 0.000014 s | 0.00001317183998253313 s | 1.06 |
| add_one / IDefOpt / cpu / PreRev | 0.000014 s | 0.00001243973999407899 s | 1.13 |
| add_one / IDefOpt / cpu / PostRev | 0.000014 s | 0.00001230946002579003 s | 1.14 |
| add_one / IDefOpt / cpu / BothRev | 0.000014 s | 0.00001255659997696057 s | 1.11 |
| add_two / JaXPipe / cpu / Primal | 0.000007181719995514868 s | 0.00000840646003780421 s | 0.85 |
| add_two / Jax / cpu / Primal | 0.000007412120000935829 s | 0.000007408079991364502 s | 1.00 |
| add_two / HLOOpt / cpu / Primal | 0.000006910080007855868 s | 0.0000074600599691621025 s | 0.93 |
| add_two / PartOpt / cpu / Primal | 0.000007012280004801141 s | 0.000007752780065857223 s | 0.90 |
| add_two / IPartOpt / cpu / Primal | 0.000006975419996706477 s | 0.000008030020026126295 s | 0.87 |
| add_two / DefOpt / cpu / Primal | 0.000007349439990775863 s | 0.00000800805999460863 s | 0.92 |
| add_two / IDefOpt / cpu / Primal | 0.000006735419992764946 s | 0.000007063699977152282 s | 0.95 |
| add_two / JaXPipe / cpu / Forward | 0.00001016190000655115 s | 0.000011770780010920136 s | 0.86 |
| add_two / Jax / cpu / Forward | 0.000010153839998565672 s | 0.000011452759990788764 s | 0.89 |
| add_two / HLOOpt / cpu / Forward | 0.000010582480003904493 s | 0.00001166930002909794 s | 0.91 |
| add_two / PartOpt / cpu / Forward | 0.000010770500002763584 s | 0.000011410620018068583 s | 0.94 |
| add_two / IPartOpt / cpu / Forward | 0.000009896839992507012 s | 0.000011299139960101456 s | 0.88 |
| add_two / DefOpt / cpu / Forward | 0.0000102615999912814 s | 0.000011692319976646103 s | 0.88 |
| add_two / IDefOpt / cpu / Forward | 0.000010313920001863152 s | 0.000011732559996744386 s | 0.88 |
| add_two / JaXPipe / cpu / PreRev | 0.000013803239999106156 s | 0.000015161400006036274 s | 0.91 |
| add_two / JaXPipe / cpu / PostRev | 0.00001332511999862618 s | 0.00001595720002114831 s | 0.84 |
| add_two / JaXPipe / cpu / BothRev | 0.00001358063998623038 s | 0.00001547035995827173 s | 0.88 |
| add_two / Jax / cpu / BothRev | 0.000013045000009697106 s | 0.00001497119998020935 s | 0.87 |
| add_two / HLOOpt / cpu / PreRev | 0.000013749100003224157 s | 0.000015616819973729433 s | 0.88 |
| add_two / HLOOpt / cpu / PostRev | 0.000014910239988239482 s | 0.000016793039994809077 s | 0.89 |
| add_two / HLOOpt / cpu / BothRev | 0.000013744620005127218 s | 0.000015394640031445305 s | 0.89 |
| add_two / PartOpt / cpu / PreRev | 0.000013719960006710608 s | 0.00001552674003505672 s | 0.88 |
| add_two / PartOpt / cpu / PostRev | 0.000012994620001336442 s | 0.00001590923999174265 s | 0.82 |
| add_two / PartOpt / cpu / BothRev | 0.000013382699999056058 s | 0.0000157065999792394 s | 0.85 |
| add_two / IPartOpt / cpu / PreRev | 0.000013732899994920444 s | 0.00001480517998970754 s | 0.93 |
| add_two / IPartOpt / cpu / PostRev | 0.000013543900015520194 s | 0.000015164839996941735 s | 0.89 |
| add_two / IPartOpt / cpu / BothRev | 0.000013044039990290912 s | 0.000015474340025320998 s | 0.84 |
| add_two / DefOpt / cpu / PreRev | 0.00001376413999878423 s | 0.00001505929999439104 s | 0.91 |
| add_two / DefOpt / cpu / PostRev | 0.000013500779998594223 s | 0.00001657381996665208 s | 0.81 |
| add_two / DefOpt / cpu / BothRev | 0.000013454059997002331 s | 0.000015267580029103555 s | 0.88 |
| add_two / IDefOpt / cpu / PreRev | 0.00001369823999084474 s | 0.000015865159994064016 s | 0.86 |
| add_two / IDefOpt / cpu / PostRev | 0.00001319982000040909 s | 0.00001576899999236048 s | 0.84 |
| add_two / IDefOpt / cpu / BothRev | 0.000013646219995280264 s | 0.000015866280000409462 s | 0.86 |
add_two / JaXPipe / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / Jax / cuda / Primal |
0.0000024 s |
0.000001888 s |
1.27 |
add_two / HLOOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / PartOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / IPartOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / DefOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / IDefOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
add_two / JaXPipe / cuda / Forward |
0.00001024 s |
0.000009504 s |
1.08 |
add_two / Jax / cuda / Forward |
0.000010144 s |
0.00000992 s |
1.02 |
add_two / HLOOpt / cuda / Forward |
0.000010368 s |
0.000009888 s |
1.05 |
add_two / PartOpt / cuda / Forward |
0.00000992 s |
0.000009153 s |
1.08 |
add_two / IPartOpt / cuda / Forward |
0.000010367 s |
0.00000992 s |
1.05 |
add_two / DefOpt / cuda / Forward |
0.000011328 s |
0.000009536 s |
1.19 |
add_two / IDefOpt / cuda / Forward |
0.000010368 s |
0.000009568 s |
1.08 |
add_two / JaXPipe / cuda / PreRev |
0.000034111 s |
0.000031585 s |
1.08 |
add_two / JaXPipe / cuda / PostRev |
0.000033119999999999995 s |
0.000031264 s |
1.06 |
add_two / JaXPipe / cuda / BothRev |
0.00003808 s |
0.000031744 s |
1.20 |
add_two / Jax / cuda / BothRev |
0.000034528000000000006 s |
0.00003152 s |
1.10 |
add_two / HLOOpt / cuda / PreRev |
0.000033664 s |
0.000032 s |
1.05 |
add_two / HLOOpt / cuda / PostRev |
0.000034144000000000004 s |
0.000031424 s |
1.09 |
add_two / HLOOpt / cuda / BothRev |
0.000032864 s |
0.000041184 s |
0.80 |
add_two / PartOpt / cuda / PreRev |
0.00003296 s |
0.000031584 s |
1.04 |
add_two / PartOpt / cuda / PostRev |
0.00003824 s |
0.000031392 s |
1.22 |
add_two / PartOpt / cuda / BothRev |
0.000038304 s |
0.00003136 s |
1.22 |
add_two / IPartOpt / cuda / PreRev |
0.000033728 s |
0.000032896000000000005 s |
1.03 |
add_two / IPartOpt / cuda / PostRev |
0.000033536000000000006 s |
0.000031616 s |
1.06 |
add_two / IPartOpt / cuda / BothRev |
0.000033567 s |
0.000032192 s |
1.04 |
add_two / DefOpt / cuda / PreRev |
0.0000416 s |
0.000031776 s |
1.31 |
add_two / DefOpt / cuda / PostRev |
0.000032064 s |
0.00003184 s |
1.01 |
add_two / DefOpt / cuda / BothRev |
0.000033311 s |
0.000032608 s |
1.02 |
add_two / IDefOpt / cuda / PreRev |
0.000033312 s |
0.000036225 s |
0.92 |
add_two / IDefOpt / cuda / PostRev |
0.000032416 s |
0.000035904 s |
0.90 |
add_two / IDefOpt / cuda / BothRev |
0.000033855 s |
0.000035327 s |
0.96 |
add_two / JaXPipe / tpu / Primal |
0.000001434825 s |
0.0000014340249999999998 s |
1.00 |
add_two / Jax / tpu / Primal |
0.000001472925 s |
0.000001485725 s |
0.99 |
add_two / HLOOpt / tpu / Primal |
0.000001430725 s |
0.000001440225 s |
0.99 |
add_two / PartOpt / tpu / Primal |
0.000001479875 s |
0.00000147715 s |
1.00 |
add_two / IPartOpt / tpu / Primal |
0.00000143225 s |
0.000001432175 s |
1.00 |
add_two / DefOpt / tpu / Primal |
0.00000147735 s |
0.00000147175 s |
1.00 |
add_two / IDefOpt / tpu / Primal |
0.000001445225 s |
0.00000144305 s |
1.00 |
add_two / JaXPipe / tpu / Forward |
0.000001835 s |
0.000001820975 s |
1.01 |
add_two / Jax / tpu / Forward |
0.000001836925 s |
0.000001826575 s |
1.01 |
add_two / HLOOpt / tpu / Forward |
0.0000018329 s |
0.0000018237 s |
1.01 |
add_two / PartOpt / tpu / Forward |
0.000001821525 s |
0.000001831525 s |
0.99 |
add_two / IPartOpt / tpu / Forward |
0.000001830775 s |
0.0000018267 s |
1.00 |
add_two / DefOpt / tpu / Forward |
0.0000018227 s |
0.000001830275 s |
1.00 |
add_two / IDefOpt / tpu / Forward |
0.0000018356 s |
0.000001832075 s |
1.00 |
add_two / JaXPipe / tpu / PreRev |
0.0000028284 s |
0.000002834025 s |
1.00 |
add_two / JaXPipe / tpu / PostRev |
0.00000274375 s |
0.000002750375 s |
1.00 |
add_two / JaXPipe / tpu / BothRev |
0.000002837675 s |
0.0000028336250000000003 s |
1.00 |
add_two / Jax / tpu / BothRev |
0.0000027488499999999995 s |
0.0000027459250000000003 s |
1.00 |
add_two / HLOOpt / tpu / PreRev |
0.000002827175 s |
0.0000028386 s |
1.00 |
add_two / HLOOpt / tpu / PostRev |
0.0000027509 s |
0.000002751625 s |
1.00 |
add_two / HLOOpt / tpu / BothRev |
0.000002825475 s |
0.0000028408 s |
0.99 |
add_two / PartOpt / tpu / PreRev |
0.000002755475 s |
0.0000027445750000000003 s |
1.00 |
add_two / PartOpt / tpu / PostRev |
0.0000028429500000000005 s |
0.000002841075 s |
1.00 |
add_two / PartOpt / tpu / BothRev |
0.000002756675 s |
0.00000274485 s |
1.00 |
add_two / IPartOpt / tpu / PreRev |
0.000002831275 s |
0.00000282995 s |
1.00 |
add_two / IPartOpt / tpu / PostRev |
0.000002745675 s |
0.0000027511750000000004 s |
1.00 |
add_two / IPartOpt / tpu / BothRev |
0.0000028472 s |
0.0000028333000000000005 s |
1.00 |
add_two / DefOpt / tpu / PreRev |
0.000002760225 s |
0.000002745525 s |
1.01 |
add_two / DefOpt / tpu / PostRev |
0.0000028344000000000003 s |
0.00000283105 s |
1.00 |
add_two / DefOpt / tpu / BothRev |
0.000002744925 s |
0.000002757425 s |
1.00 |
add_two / IDefOpt / tpu / PreRev |
0.0000028378500000000003 s |
0.000002837325 s |
1.00 |
add_two / IDefOpt / tpu / PostRev |
0.000002747075 s |
0.000002747225 s |
1.00 |
add_two / IDefOpt / tpu / BothRev |
0.0000028333000000000005 s |
0.00000283425 s |
1.00 |
add_two / JaXPipe / cpu / Primal |
0.000013482 s |
0.00000840646003780421 s |
1.60 |
add_two / Jax / cpu / Primal |
0.000013202 s |
0.000007408079991364502 s |
1.78 |
add_two / HLOOpt / cpu / Primal |
0.000012844 s |
0.0000074600599691621025 s |
1.72 |
add_two / PartOpt / cpu / Primal |
0.00001331 s |
0.000007752780065857223 s |
1.72 |
add_two / IPartOpt / cpu / Primal |
0.000013283 s |
0.000008030020026126295 s |
1.65 |
add_two / DefOpt / cpu / Primal |
0.000013259 s |
0.00000800805999460863 s |
1.66 |
add_two / IDefOpt / cpu / Primal |
0.000013055 s |
0.000007063699977152282 s |
1.85 |
add_two / JaXPipe / cpu / Forward |
0.00001848 s |
0.000011770780010920136 s |
1.57 |
add_two / Jax / cpu / Forward |
0.000018288 s |
0.000011452759990788764 s |
1.60 |
add_two / HLOOpt / cpu / Forward |
0.000018378 s |
0.00001166930002909794 s |
1.57 |
add_two / PartOpt / cpu / Forward |
0.000018022 s |
0.000011410620018068583 s |
1.58 |
add_two / IPartOpt / cpu / Forward |
0.000017988 s |
0.000011299139960101456 s |
1.59 |
add_two / DefOpt / cpu / Forward |
0.000017684 s |
0.000011692319976646103 s |
1.51 |
add_two / IDefOpt / cpu / Forward |
0.000018017 s |
0.000011732559996744386 s |
1.54 |
add_two / JaXPipe / cpu / PreRev |
0.000022969 s |
0.000015161400006036274 s |
1.51 |
add_two / JaXPipe / cpu / PostRev |
0.000023239 s |
0.00001595720002114831 s |
1.46 |
add_two / JaXPipe / cpu / BothRev |
0.000023364 s |
0.00001547035995827173 s |
1.51 |
add_two / Jax / cpu / BothRev |
0.000022512 s |
0.00001497119998020935 s |
1.50 |
add_two / HLOOpt / cpu / PreRev |
0.000022923 s |
0.000015616819973729433 s |
1.47 |
add_two / HLOOpt / cpu / PostRev |
0.000023641 s |
0.000016793039994809077 s |
1.41 |
add_two / HLOOpt / cpu / BothRev |
0.000023294 s |
0.000015394640031445305 s |
1.51 |
add_two / PartOpt / cpu / PreRev |
0.000023112 s |
0.00001552674003505672 s |
1.49 |
add_two / PartOpt / cpu / PostRev |
0.000023105 s |
0.00001590923999174265 s |
1.45 |
add_two / PartOpt / cpu / BothRev |
0.000023044 s |
0.0000157065999792394 s |
1.47 |
add_two / IPartOpt / cpu / PreRev |
0.000023136 s |
0.00001480517998970754 s |
1.56 |
add_two / IPartOpt / cpu / PostRev |
0.000022811 s |
0.000015164839996941735 s |
1.50 |
add_two / IPartOpt / cpu / BothRev |
0.000023166 s |
0.000015474340025320998 s |
1.50 |
add_two / DefOpt / cpu / PreRev |
0.000023319 s |
0.00001505929999439104 s |
1.55 |
add_two / DefOpt / cpu / PostRev |
0.000022747 s |
0.00001657381996665208 s |
1.37 |
add_two / DefOpt / cpu / BothRev |
0.000023235 s |
0.000015267580029103555 s |
1.52 |
add_two / IDefOpt / cpu / PreRev |
0.000023282 s |
0.000015865159994064016 s |
1.47 |
add_two / IDefOpt / cpu / PostRev |
0.000022865 s |
0.00001576899999236048 s |
1.45 |
add_two / IDefOpt / cpu / BothRev |
0.000023368 s |
0.000015866280000409462 s |
1.47 |
add_two / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.00000840646003780421 s |
1.07 |
add_two / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000007408079991364502 s |
1.21 |
add_two / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.0000074600599691621025 s |
1.21 |
add_two / PartOpt / cpu / Primal |
0.00001 s |
0.000007752780065857223 s |
1.29 |
add_two / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000008030020026126295 s |
1.12 |
add_two / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.00000800805999460863 s |
1.12 |
add_two / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007063699977152282 s |
1.27 |
add_two / JaXPipe / cpu / Forward |
0.000013 s |
0.000011770780010920136 s |
1.10 |
add_two / Jax / cpu / Forward |
0.000013 s |
0.000011452759990788764 s |
1.14 |
add_two / HLOOpt / cpu / Forward |
0.000013 s |
0.00001166930002909794 s |
1.11 |
add_two / PartOpt / cpu / Forward |
0.000013 s |
0.000011410620018068583 s |
1.14 |
add_two / IPartOpt / cpu / Forward |
0.000013 s |
0.000011299139960101456 s |
1.15 |
add_two / DefOpt / cpu / Forward |
0.000013 s |
0.000011692319976646103 s |
1.11 |
add_two / IDefOpt / cpu / Forward |
0.000013 s |
0.000011732559996744386 s |
1.11 |
add_two / JaXPipe / cpu / PreRev |
0.000017 s |
0.000015161400006036274 s |
1.12 |
add_two / JaXPipe / cpu / PostRev |
0.000017 s |
0.00001595720002114831 s |
1.07 |
add_two / JaXPipe / cpu / BothRev |
0.000017 s |
0.00001547035995827173 s |
1.10 |
add_two / Jax / cpu / BothRev |
0.000017 s |
0.00001497119998020935 s |
1.14 |
add_two / HLOOpt / cpu / PreRev |
0.000017 s |
0.000015616819973729433 s |
1.09 |
add_two / HLOOpt / cpu / PostRev |
0.000017999999999999997 s |
0.000016793039994809077 s |
1.07 |
add_two / HLOOpt / cpu / BothRev |
0.000017 s |
0.000015394640031445305 s |
1.10 |
add_two / PartOpt / cpu / PreRev |
0.000017 s |
0.00001552674003505672 s |
1.09 |
add_two / PartOpt / cpu / PostRev |
0.000017 s |
0.00001590923999174265 s |
1.07 |
add_two / PartOpt / cpu / BothRev |
0.000017 s |
0.0000157065999792394 s |
1.08 |
add_two / IPartOpt / cpu / PreRev |
0.000017 s |
0.00001480517998970754 s |
1.15 |
add_two / IPartOpt / cpu / PostRev |
0.000017 s |
0.000015164839996941735 s |
1.12 |
add_two / IPartOpt / cpu / BothRev |
0.000017 s |
0.000015474340025320998 s |
1.10 |
add_two / DefOpt / cpu / PreRev |
0.000016 s |
0.00001505929999439104 s |
1.06 |
add_two / DefOpt / cpu / PostRev |
0.000017 s |
0.00001657381996665208 s |
1.03 |
add_two / DefOpt / cpu / BothRev |
0.000017 s |
0.000015267580029103555 s |
1.11 |
add_two / IDefOpt / cpu / PreRev |
0.000017 s |
0.000015865159994064016 s |
1.07 |
add_two / IDefOpt / cpu / PostRev |
0.000017 s |
0.00001576899999236048 s |
1.08 |
add_two / IDefOpt / cpu / BothRev |
0.000017 s |
0.000015866280000409462 s |
1.07 |
cache / JaXPipe / cpu / Primal |
0.000006057979994693597 s |
0.000006617600029130699 s |
0.92 |
cache / Jax / cpu / Primal |
0.0000067347599929235006 s |
0.0000073614199845906115 s |
0.91 |
cache / HLOOpt / cpu / Primal |
0.0000061146600023676 s |
0.000007327980010813917 s |
0.83 |
cache / PartOpt / cpu / Primal |
0.00000605648000146175 s |
0.000006596160001208773 s |
0.92 |
cache / IPartOpt / cpu / Primal |
0.000006027340002674464 s |
0.000006951079985810793 s |
0.87 |
cache / DefOpt / cpu / Primal |
0.000005973320012344629 s |
0.000006810800023231423 s |
0.88 |
cache / IDefOpt / cpu / Primal |
0.000005934379996688221 s |
0.000006719999992128578 s |
0.88 |
cache / JaXPipe / cpu / Forward |
0.000014143180003429734 s |
0.00001635375998375821 s |
0.86 |
cache / Jax / cpu / Forward |
0.000013643900006172771 s |
0.00001622899997528293 s |
0.84 |
cache / HLOOpt / cpu / Forward |
0.000014762459993562516 s |
0.00001638741993701842 s |
0.90 |
cache / PartOpt / cpu / Forward |
0.000014526640004532965 s |
0.00001637195999137475 s |
0.89 |
cache / IPartOpt / cpu / Forward |
0.000015243199979977364 s |
0.000016631440003038735 s |
0.92 |
cache / DefOpt / cpu / Forward |
0.000014254939999318597 s |
0.00001649958000598417 s |
0.86 |
cache / IDefOpt / cpu / Forward |
0.000014173799988839163 s |
0.00001611459998457576 s |
0.88 |
cache / JaXPipe / cpu / PreRev |
0.00001473585998382987 s |
0.000017117220031650505 s |
0.86 |
cache / JaXPipe / cpu / PostRev |
0.00001951246000089668 s |
0.00002205438004239113 s |
0.88 |
cache / JaXPipe / cpu / BothRev |
0.000016874440000265167 s |
0.00001823537998461688 s |
0.93 |
cache / Jax / cpu / BothRev |
0.000019304400000237363 s |
0.00002179658003115037 s |
0.89 |
cache / HLOOpt / cpu / PreRev |
0.000015891100006228953 s |
0.00001804116001949296 s |
0.88 |
cache / HLOOpt / cpu / PostRev |
0.0000175184200043077 s |
0.000020775400025740963 s |
0.84 |
cache / HLOOpt / cpu / BothRev |
0.000015117539990114892 s |
0.000018207139974038 s |
0.83 |
cache / PartOpt / cpu / PreRev |
0.00001538501999220898 s |
0.000017113039994001157 s |
0.90 |
cache / PartOpt / cpu / PostRev |
0.00001975966000372864 s |
0.000022788399955970818 s |
0.87 |
cache / PartOpt / cpu / BothRev |
0.000015390600003684086 s |
0.000017054319996532286 s |
0.90 |
cache / IPartOpt / cpu / PreRev |
0.000015320739998969656 s |
0.00001702454002952436 s |
0.90 |
cache / IPartOpt / cpu / PostRev |
0.000019720619995950983 s |
0.00002129747997969389 s |
0.93 |
cache / IPartOpt / cpu / BothRev |
0.0000153360399963276 s |
0.00001820888001930143 s |
0.84 |
cache / DefOpt / cpu / PreRev |
0.000015187940009582237 s |
0.000016552119950574708 s |
0.92 |
cache / DefOpt / cpu / PostRev |
0.000015252539997163694 s |
0.00001614170001630555 s |
0.94 |
cache / DefOpt / cpu / BothRev |
0.000016388180006288168 s |
0.000016149760012922344 s |
1.01 |
cache / IDefOpt / cpu / PreRev |
0.000014925919995221193 s |
0.000016202219967453857 s |
0.92 |
cache / IDefOpt / cpu / PostRev |
0.00001514337999651616 s |
0.000015889159994912917 s |
0.95 |
cache / IDefOpt / cpu / BothRev |
0.000015097839993813975 s |
0.000016051540005719288 s |
0.94 |
cache / JaXPipe / cuda / Primal |
0.0000023670000000000004 s |
0.000002303 s |
1.03 |
cache / Jax / cuda / Primal |
0.000002336 s |
0.000002304 s |
1.01 |
cache / HLOOpt / cuda / Primal |
0.0000023670000000000004 s |
0.000002208 s |
1.07 |
cache / PartOpt / cuda / Primal |
0.000002336 s |
0.000002207 s |
1.06 |
cache / IPartOpt / cuda / Primal |
0.0000023670000000000004 s |
0.000002303 s |
1.03 |
cache / DefOpt / cuda / Primal |
0.000002336 s |
0.000002272 s |
1.03 |
cache / IDefOpt / cuda / Primal |
0.000002336 s |
0.00000224 s |
1.04 |
cache / JaXPipe / cuda / Forward |
0.0000023670000000000004 s |
0.000002335 s |
1.01 |
cache / Jax / cuda / Forward |
0.000002368 s |
0.000002335 s |
1.01 |
cache / HLOOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002335 s |
1.01 |
cache / PartOpt / cuda / Forward |
0.000002368 s |
0.000002335 s |
1.01 |
cache / IPartOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002335 s |
1.01 |
cache / DefOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002272 s |
1.04 |
cache / IDefOpt / cuda / Forward |
0.000002368 s |
0.000002335 s |
1.01 |
cache / JaXPipe / cuda / PreRev |
0.000010431 s |
0.00001088 s |
0.96 |
cache / JaXPipe / cuda / PostRev |
0.000010816 s |
0.0000112 s |
0.97 |
cache / JaXPipe / cuda / BothRev |
0.000010848 s |
0.0000112 s |
0.97 |
cache / Jax / cuda / BothRev |
0.000010656 s |
0.000010784 s |
0.99 |
cache / HLOOpt / cuda / PreRev |
0.000013696 s |
0.000013536 s |
1.01 |
cache / HLOOpt / cuda / PostRev |
0.000013664 s |
0.000013505 s |
1.01 |
cache / HLOOpt / cuda / BothRev |
0.000013728 s |
0.000013569 s |
1.01 |
cache / PartOpt / cuda / PreRev |
0.000010656 s |
0.00001104 s |
0.97 |
cache / PartOpt / cuda / PostRev |
0.000011328 s |
0.000010913 s |
1.04 |
cache / PartOpt / cuda / BothRev |
0.000011969 s |
0.000010847 s |
1.10 |
cache / IPartOpt / cuda / PreRev |
0.000010624 s |
0.000010976 s |
0.97 |
cache / IPartOpt / cuda / PostRev |
0.000010592 s |
0.000011167 s |
0.95 |
cache / IPartOpt / cuda / BothRev |
0.000010592 s |
0.000010657 s |
0.99 |
cache / DefOpt / cuda / PreRev |
0.000011136 s |
0.00001104 s |
1.01 |
cache / DefOpt / cuda / PostRev |
0.000010624 s |
0.000010753 s |
0.99 |
cache / DefOpt / cuda / BothRev |
0.000010752 s |
0.000010815 s |
0.99 |
cache / IDefOpt / cuda / PreRev |
0.00001072 s |
0.000010912 s |
0.98 |
cache / IDefOpt / cuda / PostRev |
0.000010848 s |
0.000010656 s |
1.02 |
cache / IDefOpt / cuda / BothRev |
0.000010784 s |
0.000010431 s |
1.03 |
cache / JaXPipe / tpu / Primal |
0.0000024664 s |
0.000002477725 s |
1.00 |
cache / Jax / tpu / Primal |
0.000002463375 s |
0.000002473575 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.000002466025 s |
0.00000247285 s |
1.00 |
cache / PartOpt / tpu / Primal |
0.000002458425 s |
0.0000024729 s |
0.99 |
cache / IPartOpt / tpu / Primal |
0.0000024613 s |
0.0000024635500000000003 s |
1.00 |
cache / DefOpt / tpu / Primal |
0.0000024446 s |
0.0000024609000000000004 s |
0.99 |
cache / IDefOpt / tpu / Primal |
0.000002462875 s |
0.000002476775 s |
0.99 |
cache / JaXPipe / tpu / Forward |
0.0000035428 s |
0.0000035618 s |
0.99 |
cache / Jax / tpu / Forward |
0.00000353855 s |
0.00000355105 s |
1.00 |
cache / HLOOpt / tpu / Forward |
0.0000035641000000000003 s |
0.000003544875 s |
1.01 |
cache / PartOpt / tpu / Forward |
0.000003562 s |
0.000003535125 s |
1.01 |
cache / IPartOpt / tpu / Forward |
0.00000357245 s |
0.000003558 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.0000035285499999999994 s |
0.0000035295500000000004 s |
1.00 |
cache / IDefOpt / tpu / Forward |
0.000003556675000000001 s |
0.0000035562 s |
1.00 |
cache / JaXPipe / tpu / PreRev |
0.00000495595 s |
0.000004968549999999999 s |
1.00 |
cache / JaXPipe / tpu / PostRev |
0.000004983274999999999 s |
0.000004967025 s |
1.00 |
cache / JaXPipe / tpu / BothRev |
0.000004982825 s |
0.00000499125 s |
1.00 |
cache / Jax / tpu / BothRev |
0.00000498185 s |
0.000004999 s |
1.00 |
cache / HLOOpt / tpu / PreRev |
0.0000039337 s |
0.0000039558750000000005 s |
0.99 |
cache / HLOOpt / tpu / PostRev |
0.00000410925 s |
0.000004117675 s |
1.00 |
cache / HLOOpt / tpu / BothRev |
0.0000039428 s |
0.0000039434 s |
1.00 |
cache / PartOpt / tpu / PreRev |
0.00000497965 s |
0.000004983625 s |
1.00 |
cache / PartOpt / tpu / PostRev |
0.000004969125 s |
0.000004951725 s |
1.00 |
cache / PartOpt / tpu / BothRev |
0.000004965975 s |
0.000004973375 s |
1.00 |
cache / IPartOpt / tpu / PreRev |
0.0000049475000000000005 s |
0.000004962074999999999 s |
1.00 |
cache / IPartOpt / tpu / PostRev |
0.0000049675 s |
0.000004954775 s |
1.00 |
cache / IPartOpt / tpu / BothRev |
0.000004991225 s |
0.00000498675 s |
1.00 |
cache / DefOpt / tpu / PreRev |
0.000004967925 s |
0.000004984925 s |
1.00 |
cache / DefOpt / tpu / PostRev |
0.0000049811250000000005 s |
0.000004961 s |
1.00 |
cache / DefOpt / tpu / BothRev |
0.000004954225 s |
0.0000049668000000000005 s |
1.00 |
cache / IDefOpt / tpu / PreRev |
0.000004978125 s |
0.00000496935 s |
1.00 |
cache / IDefOpt / tpu / PostRev |
0.0000049628 s |
0.0000049644 s |
1.00 |
cache / IDefOpt / tpu / BothRev |
0.000004967650000000001 s |
0.000004968675 s |
1.00 |
cache / JaXPipe / cpu / Primal |
0.00001298 s |
0.000006617600029130699 s |
1.96 |
cache / Jax / cpu / Primal |
0.000012407 s |
0.0000073614199845906115 s |
1.69 |
cache / HLOOpt / cpu / Primal |
0.000012689 s |
0.000007327980010813917 s |
1.73 |
cache / PartOpt / cpu / Primal |
0.000012725 s |
0.000006596160001208773 s |
1.93 |
cache / IPartOpt / cpu / Primal |
0.000012788 s |
0.000006951079985810793 s |
1.84 |
cache / DefOpt / cpu / Primal |
0.000012636 s |
0.000006810800023231423 s |
1.86 |
cache / IDefOpt / cpu / Primal |
0.000012753 s |
0.000006719999992128578 s |
1.90 |
cache / JaXPipe / cpu / Forward |
0.000016933 s |
0.00001635375998375821 s |
1.04 |
cache / Jax / cpu / Forward |
0.000017102 s |
0.00001622899997528293 s |
1.05 |
cache / HLOOpt / cpu / Forward |
0.000017076 s |
0.00001638741993701842 s |
1.04 |
cache / PartOpt / cpu / Forward |
0.000016788999999999998 s |
0.00001637195999137475 s |
1.03 |
cache / IPartOpt / cpu / Forward |
0.000017152 s |
0.000016631440003038735 s |
1.03 |
cache / DefOpt / cpu / Forward |
0.000017483 s |
0.00001649958000598417 s |
1.06 |
cache / IDefOpt / cpu / Forward |
0.00001761 s |
0.00001611459998457576 s |
1.09 |
cache / JaXPipe / cpu / PreRev |
0.000018164 s |
0.000017117220031650505 s |
1.06 |
cache / JaXPipe / cpu / PostRev |
0.000019005 s |
0.00002205438004239113 s |
0.86 |
cache / JaXPipe / cpu / BothRev |
0.000017859 s |
0.00001823537998461688 s |
0.98 |
cache / Jax / cpu / BothRev |
0.000019958 s |
0.00002179658003115037 s |
0.92 |
cache / HLOOpt / cpu / PreRev |
0.000018144 s |
0.00001804116001949296 s |
1.01 |
cache / HLOOpt / cpu / PostRev |
0.000017881 s |
0.000020775400025740963 s |
0.86 |
cache / HLOOpt / cpu / BothRev |
0.000017776 s |
0.000018207139974038 s |
0.98 |
cache / PartOpt / cpu / PreRev |
0.000017618000000000003 s |
0.000017113039994001157 s |
1.03 |
cache / PartOpt / cpu / PostRev |
0.000018767 s |
0.000022788399955970818 s |
0.82 |
cache / PartOpt / cpu / BothRev |
0.000018042 s |
0.000017054319996532286 s |
1.06 |
cache / IPartOpt / cpu / PreRev |
0.000017621000000000003 s |
0.00001702454002952436 s |
1.04 |
cache / IPartOpt / cpu / PostRev |
0.000018843 s |
0.00002129747997969389 s |
0.88 |
cache / IPartOpt / cpu / BothRev |
0.000017519 s |
0.00001820888001930143 s |
0.96 |
cache / DefOpt / cpu / PreRev |
0.00002638 s |
0.000016552119950574708 s |
1.59 |
cache / DefOpt / cpu / PostRev |
0.000027705 s |
0.00001614170001630555 s |
1.72 |
cache / DefOpt / cpu / BothRev |
0.000028642 s |
0.000016149760012922344 s |
1.77 |
cache / IDefOpt / cpu / PreRev |
0.000031254 s |
0.000016202219967453857 s |
1.93 |
cache / IDefOpt / cpu / PostRev |
0.000026087 s |
0.000015889159994912917 s |
1.64 |
cache / IDefOpt / cpu / BothRev |
0.000026241 s |
0.000016051540005719288 s |
1.63 |
cache / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.000006617600029130699 s |
1.36 |
cache / Jax / cpu / Primal |
0.000008999999999999999 s |
0.0000073614199845906115 s |
1.22 |
cache / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007327980010813917 s |
1.23 |
cache / PartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006596160001208773 s |
1.36 |
cache / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006951079985810793 s |
1.29 |
cache / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006810800023231423 s |
1.32 |
cache / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006719999992128578 s |
1.34 |
cache / JaXPipe / cpu / Forward |
0.000011 s |
0.00001635375998375821 s |
0.67 |
cache / Jax / cpu / Forward |
0.000011 s |
0.00001622899997528293 s |
0.68 |
cache / HLOOpt / cpu / Forward |
0.000011 s |
0.00001638741993701842 s |
0.67 |
cache / PartOpt / cpu / Forward |
0.000011 s |
0.00001637195999137475 s |
0.67 |
cache / IPartOpt / cpu / Forward |
0.000011 s |
0.000016631440003038735 s |
0.66 |
cache / DefOpt / cpu / Forward |
0.000011 s |
0.00001649958000598417 s |
0.67 |
cache / IDefOpt / cpu / Forward |
0.000011 s |
0.00001611459998457576 s |
0.68 |
cache / JaXPipe / cpu / PreRev |
0.000011 s |
0.000017117220031650505 s |
0.64 |
cache / JaXPipe / cpu / PostRev |
0.000011 s |
0.00002205438004239113 s |
0.50 |
cache / JaXPipe / cpu / BothRev |
0.000014 s |
0.00001823537998461688 s |
0.77 |
cache / Jax / cpu / BothRev |
0.000012 s |
0.00002179658003115037 s |
0.55 |
cache / HLOOpt / cpu / PreRev |
0.000011 s |
0.00001804116001949296 s |
0.61 |
cache / HLOOpt / cpu / PostRev |
0.000035999999999999994 s |
0.000020775400025740963 s |
1.73 |
cache / HLOOpt / cpu / BothRev |
0.000011 s |
0.000018207139974038 s |
0.60 |
cache / PartOpt / cpu / PreRev |
0.000011 s |
0.000017113039994001157 s |
0.64 |
cache / PartOpt / cpu / PostRev |
0.000012 s |
0.000022788399955970818 s |
0.53 |
cache / PartOpt / cpu / BothRev |
0.000011 s |
0.000017054319996532286 s |
0.64 |
cache / IPartOpt / cpu / PreRev |
0.000011 s |
0.00001702454002952436 s |
0.65 |
cache / IPartOpt / cpu / PostRev |
0.000012 s |
0.00002129747997969389 s |
0.56 |
cache / IPartOpt / cpu / BothRev |
0.000012 s |
0.00001820888001930143 s |
0.66 |
cache / DefOpt / cpu / PreRev |
0.000011 s |
0.000016552119950574708 s |
0.66 |
cache / DefOpt / cpu / PostRev |
0.000011 s |
0.00001614170001630555 s |
0.68 |
cache / DefOpt / cpu / BothRev |
0.000011 s |
0.000016149760012922344 s |
0.68 |
cache / IDefOpt / cpu / PreRev |
0.000011 s |
0.000016202219967453857 s |
0.68 |
cache / IDefOpt / cpu / PostRev |
0.000012 s |
0.000015889159994912917 s |
0.76 |
cache / IDefOpt / cpu / BothRev |
0.000011 s |
0.000016051540005719288 s |
0.69 |
Concat / JaXPipe / cpu / Primal |
0.000006987039998875844 s |
0.000007270800006153877 s |
0.96 |
Concat / Jax / cpu / Primal |
0.000007006739995176759 s |
0.000006953919983061496 s |
1.01 |
Concat / HLOOpt / cpu / Primal |
0.000006464879993473005 s |
0.000007605120017615263 s |
0.85 |
Concat / PartOpt / cpu / Primal |
0.000006732580006882927 s |
0.000006844539975645603 s |
0.98 |
Concat / IPartOpt / cpu / Primal |
0.000007292820012025913 s |
0.000007298220016309642 s |
1.00 |
Concat / DefOpt / cpu / Primal |
0.000006477479994373425 s |
0.000007233900023493334 s |
0.90 |
Concat / IDefOpt / cpu / Primal |
0.000006781359993510705 s |
0.000007258480000018608 s |
0.93 |
Concat / JaXPipe / cpu / Forward |
0.000009602799991625943 s |
0.000011428860007072216 s |
0.84 |
Concat / Jax / cpu / Forward |
0.000009775799999260924 s |
0.00001162386000942206 s |
0.84 |
Concat / HLOOpt / cpu / Forward |
0.000009691600009773538 s |
0.00001077782001630112 s |
0.90 |
Concat / PartOpt / cpu / Forward |
0.000009572519995799669 s |
0.000011412619987822835 s |
0.84 |
Concat / IPartOpt / cpu / Forward |
0.00001016551999555304 s |
0.000011144519994559232 s |
0.91 |
Concat / DefOpt / cpu / Forward |
0.000010006459999658546 s |
0.000011024940040442745 s |
0.91 |
Concat / IDefOpt / cpu / Forward |
0.000009944120004092838 s |
0.000011167399989062688 s |
0.89 |
Concat / JaXPipe / cpu / PreRev |
0.000011728599995421971 s |
0.000012688919996435288 s |
0.92 |
Concat / JaXPipe / cpu / PostRev |
0.00001073239999186626 s |
0.00001220757997543842 s |
0.88 |
Concat / JaXPipe / cpu / BothRev |
0.000011374120001619303 s |
0.00001241400000253634 s |
0.92 |
Concat / Jax / cpu / BothRev |
0.000010680420002699977 s |
0.000012415119981596944 s |
0.86 |
Concat / HLOOpt / cpu / PreRev |
0.00001205781999942701 s |
0.00001305383997532772 s |
0.92 |
Concat / HLOOpt / cpu / PostRev |
0.000013610699998025666 s |
0.000014871699977447862 s |
0.92 |
Concat / HLOOpt / cpu / BothRev |
0.00001129023999965284 s |
0.000012056320047122426 s |
0.94 |
Concat / PartOpt / cpu / PreRev |
0.000011274859998593456 s |
0.00001218095996591728 s |
0.93 |
Concat / PartOpt / cpu / PostRev |
0.00001164438000841983 s |
0.00001255646002391586 s |
0.93 |
Concat / PartOpt / cpu / BothRev |
0.000012165940004251752 s |
0.000012368560010145304 s |
0.98 |
Concat / IPartOpt / cpu / PreRev |
0.000011056939995341964 s |
0.000013087139996059704 s |
0.84 |
Concat / IPartOpt / cpu / PostRev |
0.000011073659995872731 s |
0.00001286647993765655 s |
0.86 |
Concat / IPartOpt / cpu / BothRev |
0.000010850519990981412 s |
0.000012555660050566076 s |
0.86 |
Concat / DefOpt / cpu / PreRev |
0.000011033779994704671 s |
0.000012576600011016123 s |
0.88 |
Concat / DefOpt / cpu / PostRev |
0.000011207579996153072 s |
0.000012553639990073862 s |
0.89 |
Concat / DefOpt / cpu / BothRev |
0.000011315260001083517 s |
0.00001252969997949549 s |
0.90 |
Concat / IDefOpt / cpu / PreRev |
0.000011311779983316229 s |
0.000012443119985618976 s |
0.91 |
Concat / IDefOpt / cpu / PostRev |
0.00001131653999664195 s |
0.0000129725800070446 s |
0.87 |
Concat / IDefOpt / cpu / BothRev |
0.000011632940002073157 s |
0.00001261953995708609 s |
0.92 |
Concat / JaXPipe / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / Jax / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / HLOOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / PartOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / IPartOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / DefOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / IDefOpt / cuda / Primal |
0.0000024 s |
0.0000019200000000000003 s |
1.25 |
Concat / JaXPipe / cuda / Forward |
0.000010751 s |
0.00000976 s |
1.10 |
Concat / Jax / cuda / Forward |
0.000010496 s |
0.000010112 s |
1.04 |
Concat / HLOOpt / cuda / Forward |
0.00001056 s |
0.000010368 s |
1.02 |
Concat / PartOpt / cuda / Forward |
0.000010528 s |
0.000010176 s |
1.03 |
Concat / IPartOpt / cuda / Forward |
0.000010656 s |
0.000010304 s |
1.03 |
Concat / DefOpt / cuda / Forward |
0.000013535 s |
0.000009952 s |
1.36 |
Concat / IDefOpt / cuda / Forward |
0.000010496 s |
0.000010432 s |
1.01 |
Concat / JaXPipe / cuda / PreRev |
0.000016832 s |
0.00001584 s |
1.06 |
Concat / JaXPipe / cuda / PostRev |
0.000017247999999999998 s |
0.000016 s |
1.08 |
Concat / JaXPipe / cuda / BothRev |
0.000016768000000000003 s |
0.000016512 s |
1.02 |
Concat / Jax / cuda / BothRev |
0.000016927999999999998 s |
0.00001616 s |
1.05 |
Concat / HLOOpt / cuda / PreRev |
0.000017216 s |
0.000015808 s |
1.09 |
Concat / HLOOpt / cuda / PostRev |
0.000016544 s |
0.000015776 s |
1.05 |
Concat / HLOOpt / cuda / BothRev |
0.00001664 s |
0.000016225 s |
1.03 |
Concat / PartOpt / cuda / PreRev |
0.000016704 s |
0.000016352 s |
1.02 |
Concat / PartOpt / cuda / PostRev |
0.000016385 s |
0.000016 s |
1.02 |
Concat / PartOpt / cuda / BothRev |
0.000016704 s |
0.000016096 s |
1.04 |
Concat / IPartOpt / cuda / PreRev |
0.00001664 s |
0.000015968 s |
1.04 |
Concat / IPartOpt / cuda / PostRev |
0.00001696 s |
0.000015935999999999998 s |
1.06 |
Concat / IPartOpt / cuda / BothRev |
0.000016576000000000002 s |
0.000016128 s |
1.03 |
Concat / DefOpt / cuda / PreRev |
0.000017183 s |
0.000016096 s |
1.07 |
Concat / DefOpt / cuda / PostRev |
0.000016832 s |
0.000016224 s |
1.04 |
Concat / DefOpt / cuda / BothRev |
0.000016929 s |
0.000015809 s |
1.07 |
Concat / IDefOpt / cuda / PreRev |
0.000016958999999999998 s |
0.000016224 s |
1.05 |
Concat / IDefOpt / cuda / PostRev |
0.000016704 s |
0.000016448999999999998 s |
1.02 |
Concat / IDefOpt / cuda / BothRev |
0.000016958999999999998 s |
0.000016383999999999998 s |
1.04 |
Concat / JaXPipe / tpu / Primal |
0.000001526775 s |
0.00000152745 s |
1.00 |
Concat / Jax / tpu / Primal |
0.000001528225 s |
0.0000015306 s |
1.00 |
Concat / HLOOpt / tpu / Primal |
0.00000152115 s |
0.00000153055 s |
0.99 |
Concat / PartOpt / tpu / Primal |
0.00000152535 s |
0.00000153385 s |
0.99 |
Concat / IPartOpt / tpu / Primal |
0.00000152805 s |
0.0000015226500000000002 s |
1.00 |
Concat / DefOpt / tpu / Primal |
0.000001533825 s |
0.00000152385 s |
1.01 |
Concat / IDefOpt / tpu / Primal |
0.0000015229500000000002 s |
0.000001522975 s |
1.00 |
Concat / JaXPipe / tpu / Forward |
0.00000156835 s |
0.000001585125 s |
0.99 |
Concat / Jax / tpu / Forward |
0.00000155665 s |
0.00000155495 s |
1.00 |
Concat / HLOOpt / tpu / Forward |
0.00000156535 s |
0.000001570125 s |
1.00 |
Concat / PartOpt / tpu / Forward |
0.000001562325 s |
0.000001559625 s |
1.00 |
Concat / IPartOpt / tpu / Forward |
0.000001574875 s |
0.0000015803249999999998 s |
1.00 |
Concat / DefOpt / tpu / Forward |
0.00000155745 s |
0.00000156675 s |
0.99 |
Concat / IDefOpt / tpu / Forward |
0.000001570475 s |
0.0000015851499999999998 s |
0.99 |
Concat / JaXPipe / tpu / PreRev |
0.00000201075 s |
0.0000019909750000000004 s |
1.01 |
Concat / JaXPipe / tpu / PostRev |
0.00000208895 s |
0.0000020868 s |
1.00 |
Concat / JaXPipe / tpu / BothRev |
0.000001992025 s |
0.0000019944 s |
1.00 |
Concat / Jax / tpu / BothRev |
0.0000020739 s |
0.000002074825 s |
1.00 |
Concat / HLOOpt / tpu / PreRev |
0.00000199085 s |
0.00000199615 s |
1.00 |
Concat / HLOOpt / tpu / PostRev |
0.000002076275 s |
0.0000020738000000000004 s |
1.00 |
Concat / HLOOpt / tpu / BothRev |
0.0000019955 s |
0.000002012075 s |
0.99 |
Concat / PartOpt / tpu / PreRev |
0.000002068575 s |
0.000002072 s |
1.00 |
Concat / PartOpt / tpu / PostRev |
0.0000019977 s |
0.00000199355 s |
1.00 |
Concat / PartOpt / tpu / BothRev |
0.000002068425 s |
0.0000020755000000000003 s |
1.00 |
Concat / IPartOpt / tpu / PreRev |
0.000001997325 s |
0.000002003575 s |
1.00 |
Concat / IPartOpt / tpu / PostRev |
0.000002078 s |
0.000002073975 s |
1.00 |
Concat / IPartOpt / tpu / BothRev |
0.0000019906 s |
0.0000020022 s |
0.99 |
Concat / DefOpt / tpu / PreRev |
0.000002083675 s |
0.00000207405 s |
1.00 |
Concat / DefOpt / tpu / PostRev |
0.0000019956 s |
0.0000019901250000000003 s |
1.00 |
Concat / DefOpt / tpu / BothRev |
0.000002068 s |
0.00000206865 s |
1.00 |
Concat / IDefOpt / tpu / PreRev |
0.0000019954250000000005 s |
0.00000199775 s |
1.00 |
Concat / IDefOpt / tpu / PostRev |
0.00000207895 s |
0.0000020728 s |
1.00 |
Concat / IDefOpt / tpu / BothRev |
0.000001995875 s |
0.0000019916 s |
1.00 |
Concat / JaXPipe / cpu / Primal |
0.000012735 s |
0.000007270800006153877 s |
1.75 |
Concat / Jax / cpu / Primal |
0.000012929 s |
0.000006953919983061496 s |
1.86 |
Concat / HLOOpt / cpu / Primal |
0.000012748 s |
0.000007605120017615263 s |
1.68 |
Concat / PartOpt / cpu / Primal |
0.000012799 s |
0.000006844539975645603 s |
1.87 |
Concat / IPartOpt / cpu / Primal |
0.00001292 s |
0.000007298220016309642 s |
1.77 |
Concat / DefOpt / cpu / Primal |
0.000012364 s |
0.000007233900023493334 s |
1.71 |
Concat / IDefOpt / cpu / Primal |
0.000012949 s |
0.000007258480000018608 s |
1.78 |
Concat / JaXPipe / cpu / Forward |
0.000017953 s |
0.000011428860007072216 s |
1.57 |
Concat / Jax / cpu / Forward |
0.00001749 s |
0.00001162386000942206 s |
1.50 |
Concat / HLOOpt / cpu / Forward |
0.000017533 s |
0.00001077782001630112 s |
1.63 |
Concat / PartOpt / cpu / Forward |
0.000017044 s |
0.000011412619987822835 s |
1.49 |
Concat / IPartOpt / cpu / Forward |
0.000017614 s |
0.000011144519994559232 s |
1.58 |
Concat / DefOpt / cpu / Forward |
0.000017233999999999998 s |
0.000011024940040442745 s |
1.56 |
Concat / IDefOpt / cpu / Forward |
0.00001734 s |
0.000011167399989062688 s |
1.55 |
Concat / JaXPipe / cpu / PreRev |
0.000020215 s |
0.000012688919996435288 s |
1.59 |
Concat / JaXPipe / cpu / PostRev |
0.000019069 s |
0.00001220757997543842 s |
1.56 |
Concat / JaXPipe / cpu / BothRev |
0.000019311 s |
0.00001241400000253634 s |
1.56 |
Concat / Jax / cpu / BothRev |
0.000019865 s |
0.000012415119981596944 s |
1.60 |
Concat / HLOOpt / cpu / PreRev |
0.000019862 s |
0.00001305383997532772 s |
1.52 |
Concat / HLOOpt / cpu / PostRev |
0.000019124 s |
0.000014871699977447862 s |
1.29 |
Concat / HLOOpt / cpu / BothRev |
0.000019164 s |
0.000012056320047122426 s |
1.59 |
Concat / PartOpt / cpu / PreRev |
0.000020517 s |
0.00001218095996591728 s |
1.68 |
Concat / PartOpt / cpu / PostRev |
0.000019228 s |
0.00001255646002391586 s |
1.53 |
Concat / PartOpt / cpu / BothRev |
0.0000194 s |
0.000012368560010145304 s |
1.57 |
Concat / IPartOpt / cpu / PreRev |
0.00001927 s |
0.000013087139996059704 s |
1.47 |
Concat / IPartOpt / cpu / PostRev |
0.000019469 s |
0.00001286647993765655 s |
1.51 |
Concat / IPartOpt / cpu / BothRev |
0.000019469 s |
0.000012555660050566076 s |
1.55 |
Concat / DefOpt / cpu / PreRev |
0.000020246 s |
0.000012576600011016123 s |
1.61 |
Concat / DefOpt / cpu / PostRev |
0.000019648 s |
0.000012553639990073862 s |
1.57 |
Concat / DefOpt / cpu / BothRev |
0.000019281 s |
0.00001252969997949549 s |
1.54 |
Concat / IDefOpt / cpu / PreRev |
0.000019293 s |
0.000012443119985618976 s |
1.55 |
Concat / IDefOpt / cpu / PostRev |
0.000019346 s |
0.0000129725800070446 s |
1.49 |
Concat / IDefOpt / cpu / BothRev |
0.000019655 s |
0.00001261953995708609 s |
1.56 |
Concat / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.000007270800006153877 s |
1.24 |
Concat / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000006953919983061496 s |
1.29 |
Concat / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007605120017615263 s |
1.18 |
Concat / PartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006844539975645603 s |
1.31 |
Concat / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007298220016309642 s |
1.23 |
Concat / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007233900023493334 s |
1.24 |
Concat / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007258480000018608 s |
1.24 |
Concat / JaXPipe / cpu / Forward |
0.000012 s |
0.000011428860007072216 s |
1.05 |
Concat / Jax / cpu / Forward |
0.000012 s |
0.00001162386000942206 s |
1.03 |
Concat / HLOOpt / cpu / Forward |
0.000012 s |
0.00001077782001630112 s |
1.11 |
Concat / PartOpt / cpu / Forward |
0.000012 s |
0.000011412619987822835 s |
1.05 |
Concat / IPartOpt / cpu / Forward |
0.000012 s |
0.000011144519994559232 s |
1.08 |
Concat / DefOpt / cpu / Forward |
0.000012 s |
0.000011024940040442745 s |
1.09 |
Concat / IDefOpt / cpu / Forward |
0.000013 s |
0.000011167399989062688 s |
1.16 |
Concat / JaXPipe / cpu / PreRev |
0.000014 s |
0.000012688919996435288 s |
1.10 |
Concat / JaXPipe / cpu / PostRev |
0.000015 s |
0.00001220757997543842 s |
1.23 |
Concat / JaXPipe / cpu / BothRev |
0.000014 s |
0.00001241400000253634 s |
1.13 |
Concat / Jax / cpu / BothRev |
0.000015 s |
0.000012415119981596944 s |
1.21 |
Concat / HLOOpt / cpu / PreRev |
0.000015 s |
0.00001305383997532772 s |
1.15 |
Concat / HLOOpt / cpu / PostRev |
0.000015 s |
0.000014871699977447862 s |
1.01 |
Concat / HLOOpt / cpu / BothRev |
0.000015 s |
0.000012056320047122426 s |
1.24 |
Concat / PartOpt / cpu / PreRev |
0.000014 s |
0.00001218095996591728 s |
1.15 |
Concat / PartOpt / cpu / PostRev |
0.000015 s |
0.00001255646002391586 s |
1.19 |
Concat / PartOpt / cpu / BothRev |
0.000014 s |
0.000012368560010145304 s |
1.13 |
Concat / IPartOpt / cpu / PreRev |
0.000014 s |
0.000013087139996059704 s |
1.07 |
Concat / IPartOpt / cpu / PostRev |
0.000015 s |
0.00001286647993765655 s |
1.17 |
Concat / IPartOpt / cpu / BothRev |
0.000014 s |
0.000012555660050566076 s |
1.12 |
Concat / DefOpt / cpu / PreRev |
0.000014 s |
0.000012576600011016123 s |
1.11 |
Concat / DefOpt / cpu / PostRev |
0.000014 s |
0.000012553639990073862 s |
1.12 |
Concat / DefOpt / cpu / BothRev |
0.000015 s |
0.00001252969997949549 s |
1.20 |
Concat / IDefOpt / cpu / PreRev |
0.000015 s |
0.000012443119985618976 s |
1.21 |
Concat / IDefOpt / cpu / PostRev |
0.000014 s |
0.0000129725800070446 s |
1.08 |
Concat / IDefOpt / cpu / BothRev |
0.000014 s |
0.00001261953995708609 s |
1.11 |
const_scatter / JaXPipe / cpu / Primal |
0.000006108559998665441 s |
0.0000066701799732982185 s |
0.92 |
const_scatter / Jax / cpu / Primal |
0.000006209959999523562 s |
0.000006743799976902665 s |
0.92 |
const_scatter / HLOOpt / cpu / Primal |
0.000007044020003377228 s |
0.000007621859967912315 s |
0.92 |
const_scatter / PartOpt / cpu / Primal |
0.000005980339994948736 s |
0.000006892500023241155 s |
0.87 |
const_scatter / IPartOpt / cpu / Primal |
0.0000066518399898996 s |
0.000006738659985785489 s |
0.99 |
const_scatter / DefOpt / cpu / Primal |
0.000006993760002842464 s |
0.0000071026000114216 s |
0.98 |
const_scatter / IDefOpt / cpu / Primal |
0.000007023020002634439 s |
0.000007858320032028132 s |
0.89 |
const_scatter / JaXPipe / cpu / Forward |
0.000010509120004371653 s |
0.000011870759990415536 s |
0.89 |
const_scatter / Jax / cpu / Forward |
0.000009151179992841208 s |
0.000010619959966788884 s |
0.86 |
const_scatter / HLOOpt / cpu / Forward |
0.000010640880004757491 s |
0.000011783159970946145 s |
0.90 |
const_scatter / PartOpt / cpu / Forward |
0.000010439600007430272 s |
0.000011229000037928926 s |
0.93 |
const_scatter / IPartOpt / cpu / Forward |
0.000010492479996173642 s |
0.000012111659980291734 s |
0.87 |
const_scatter / DefOpt / cpu / Forward |
0.00001019578000978072 s |
0.000011104419972980396 s |
0.92 |
const_scatter / IDefOpt / cpu / Forward |
0.000010187319996930455 s |
0.000012228739997226512 s |
0.83 |
const_scatter / JaXPipe / cpu / PreRev |
0.0002858140600096 s |
0.0002900659799706 s |
0.99 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002793739999947 s |
0.0002828032600064 s |
0.99 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002810276400077 s |
0.0002831334199981 s |
0.99 |
const_scatter / Jax / cpu / BothRev |
0.000281282700007 s |
0.0002821311999741 s |
1.00 |
const_scatter / HLOOpt / cpu / PreRev |
0.0002831070800039 s |
0.0002847428000131 s |
0.99 |
const_scatter / HLOOpt / cpu / PostRev |
0.0002855815800012 s |
0.0002864551800121 s |
1.00 |
const_scatter / HLOOpt / cpu / BothRev |
0.000281064319995 s |
0.0002834857999732 s |
0.99 |
const_scatter / PartOpt / cpu / PreRev |
0.0002847579600143 s |
0.000282958500029 s |
1.01 |
const_scatter / PartOpt / cpu / PostRev |
0.0002808074999916 s |
0.0002846486800353 s |
0.99 |
const_scatter / PartOpt / cpu / BothRev |
0.0002822812799877 s |
0.0002827322999928 s |
1.00 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002822949599976 s |
0.0002970088200254 s |
0.95 |
const_scatter / IPartOpt / cpu / PostRev |
0.0002825830600113 s |
0.0002822358000321 s |
1.00 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002826459200036 s |
0.0002828837800097 s |
1.00 |
const_scatter / DefOpt / cpu / PreRev |
0.0002853437799944 s |
0.0002843984800074 s |
1.00 |
const_scatter / DefOpt / cpu / PostRev |
0.0002807084799928 s |
0.0002837708200422 s |
0.99 |
const_scatter / DefOpt / cpu / BothRev |
0.000286460879995 s |
0.0002855213999737 s |
1.00 |
const_scatter / IDefOpt / cpu / PreRev |
0.0002846789999898 s |
0.0003056161200311 s |
0.93 |
const_scatter / IDefOpt / cpu / PostRev |
0.0002851838000015 s |
0.0002853217799838 s |
1.00 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002815989600026 s |
0.0002853424200202 s |
0.99 |
const_scatter / JaXPipe / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / Jax / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / HLOOpt / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / PartOpt / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / IPartOpt / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / DefOpt / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / IDefOpt / cuda / Primal |
0.0000024 s |
0.000001887 s |
1.27 |
const_scatter / JaXPipe / cuda / Forward |
0.000010752 s |
0.00000992 s |
1.08 |
const_scatter / Jax / cuda / Forward |
0.000010688 s |
0.0000096 s |
1.11 |
const_scatter / HLOOpt / cuda / Forward |
0.00001056 s |
0.00001008 s |
1.05 |
const_scatter / PartOpt / cuda / Forward |
0.000010528 s |
0.000009857 s |
1.07 |
const_scatter / IPartOpt / cuda / Forward |
0.000010592 s |
0.000010048 s |
1.05 |
const_scatter / DefOpt / cuda / Forward |
0.000010752 s |
0.00001008 s |
1.07 |
const_scatter / IDefOpt / cuda / Forward |
0.0000104 s |
0.000009856 s |
1.06 |
const_scatter / JaXPipe / cuda / PreRev |
0.000016992 s |
0.000015968 s |
1.06 |
const_scatter / JaXPipe / cuda / PostRev |
0.000017024 s |
0.000016063000000000002 s |
1.06 |
const_scatter / JaXPipe / cuda / BothRev |
0.000016864 s |
0.000017984 s |
0.94 |
const_scatter / Jax / cuda / BothRev |
0.000017216 s |
0.0000184 s |
0.94 |
const_scatter / HLOOpt / cuda / PreRev |
0.000016864 s |
0.000016096 s |
1.05 |
const_scatter / HLOOpt / cuda / PostRev |
0.000016768000000000003 s |
0.000015744 s |
1.07 |
const_scatter / HLOOpt / cuda / BothRev |
0.000016767 s |
0.000016255999999999998 s |
1.03 |
const_scatter / PartOpt / cuda / PreRev |
0.000016767 s |
0.000017857000000000002 s |
0.94 |
const_scatter / PartOpt / cuda / PostRev |
0.000016863 s |
0.000016417000000000002 s |
1.03 |
const_scatter / PartOpt / cuda / BothRev |
0.000016767 s |
0.000016416 s |
1.02 |
const_scatter / IPartOpt / cuda / PreRev |
0.000016607 s |
0.00001616 s |
1.03 |
const_scatter / IPartOpt / cuda / PostRev |
0.000016799000000000003 s |
0.000016544 s |
1.02 |
const_scatter / IPartOpt / cuda / BothRev |
0.000016768999999999998 s |
0.000015904000000000002 s |
1.05 |
const_scatter / DefOpt / cuda / PreRev |
0.000017121 s |
0.000016352 s |
1.05 |
const_scatter / DefOpt / cuda / PostRev |
0.000016896000000000002 s |
0.000015935999999999998 s |
1.06 |
const_scatter / DefOpt / cuda / BothRev |
0.000016832 s |
0.00001584 s |
1.06 |
const_scatter / IDefOpt / cuda / PreRev |
0.00001664 s |
0.000021888 s |
0.76 |
const_scatter / IDefOpt / cuda / PostRev |
0.000017024 s |
0.000015935999999999998 s |
1.07 |
const_scatter / IDefOpt / cuda / BothRev |
0.000016512 s |
0.000016255999999999998 s |
1.02 |
const_scatter / JaXPipe / tpu / Primal |
0.000003805725 s |
0.000003818225 s |
1.00 |
const_scatter / Jax / tpu / Primal |
0.00000384415 s |
0.000003825575 s |
1.00 |
const_scatter / HLOOpt / tpu / Primal |
0.000003824675 s |
0.000003793475 s |
1.01 |
const_scatter / PartOpt / tpu / Primal |
0.00000385145 s |
0.000003842975 s |
1.00 |
const_scatter / IPartOpt / tpu / Primal |
0.00000381395 s |
0.000003810875 s |
1.00 |
const_scatter / DefOpt / tpu / Primal |
0.000003824275 s |
0.000003816975 s |
1.00 |
const_scatter / IDefOpt / tpu / Primal |
0.0000038132 s |
0.000003805875 s |
1.00 |
const_scatter / JaXPipe / tpu / Forward |
0.00000647835 s |
0.00000650045 s |
1.00 |
const_scatter / Jax / tpu / Forward |
0.000006475675 s |
0.000006493925 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.000006498150000000001 s |
0.0000064629250000000005 s |
1.01 |
const_scatter / PartOpt / tpu / Forward |
0.0000064901 s |
0.00000649415 s |
1.00 |
const_scatter / IPartOpt / tpu / Forward |
0.0000064710500000000005 s |
0.000006451675 s |
1.00 |
const_scatter / DefOpt / tpu / Forward |
0.0000064881 s |
0.00000650245 s |
1.00 |
const_scatter / IDefOpt / tpu / Forward |
0.000006466275 s |
0.000006453625 s |
1.00 |
const_scatter / JaXPipe / tpu / PreRev |
0.000006669225 s |
0.00000668875 s |
1.00 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006671275 s |
0.000006669874999999999 s |
1.00 |
const_scatter / JaXPipe / tpu / BothRev |
0.0000066567 s |
0.000006674075000000001 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.000006656525 s |
0.00000666065 s |
1.00 |
const_scatter / HLOOpt / tpu / PreRev |
0.000006643825 s |
0.000006668475 s |
1.00 |
const_scatter / HLOOpt / tpu / PostRev |
0.000006664125 s |
0.000006640350000000001 s |
1.00 |
const_scatter / HLOOpt / tpu / BothRev |
0.0000066657750000000006 s |
0.000006670825 s |
1.00 |
const_scatter / PartOpt / tpu / PreRev |
0.000006685175 s |
0.000006679725 s |
1.00 |
const_scatter / PartOpt / tpu / PostRev |
0.000006661425 s |
0.0000066626 s |
1.00 |
const_scatter / PartOpt / tpu / BothRev |
0.000006656875 s |
0.00000665535 s |
1.00 |
const_scatter / IPartOpt / tpu / PreRev |
0.000006643550000000001 s |
0.000006675325 s |
1.00 |
const_scatter / IPartOpt / tpu / PostRev |
0.00000666745 s |
0.0000066698 s |
1.00 |
const_scatter / IPartOpt / tpu / BothRev |
0.000006655375 s |
0.00000667635 s |
1.00 |
const_scatter / DefOpt / tpu / PreRev |
0.000006649525 s |
0.0000066897 s |
0.99 |
const_scatter / DefOpt / tpu / PostRev |
0.0000066838 s |
0.000006663175 s |
1.00 |
const_scatter / DefOpt / tpu / BothRev |
0.000006678575 s |
0.00000667775 s |
1.00 |
const_scatter / IDefOpt / tpu / PreRev |
0.000006649175000000001 s |
0.000006667025 s |
1.00 |
const_scatter / IDefOpt / tpu / PostRev |
0.000006680325 s |
0.000006663125 s |
1.00 |
const_scatter / IDefOpt / tpu / BothRev |
0.000006692725 s |
0.000006661175 s |
1.00 |
const_scatter / JaXPipe / cpu / Primal |
0.00001297 s |
0.0000066701799732982185 s |
1.94 |
const_scatter / Jax / cpu / Primal |
0.000012837 s |
0.000006743799976902665 s |
1.90 |
const_scatter / HLOOpt / cpu / Primal |
0.00001306 s |
0.000007621859967912315 s |
1.71 |
const_scatter / PartOpt / cpu / Primal |
0.000012284 s |
0.000006892500023241155 s |
1.78 |
const_scatter / IPartOpt / cpu / Primal |
0.000012362 s |
0.000006738659985785489 s |
1.83 |
const_scatter / DefOpt / cpu / Primal |
0.000013347 s |
0.0000071026000114216 s |
1.88 |
const_scatter / IDefOpt / cpu / Primal |
0.000013323 s |
0.000007858320032028132 s |
1.70 |
const_scatter / JaXPipe / cpu / Forward |
0.000017816 s |
0.000011870759990415536 s |
1.50 |
const_scatter / Jax / cpu / Forward |
0.000016363 s |
0.000010619959966788884 s |
1.54 |
const_scatter / HLOOpt / cpu / Forward |
0.000017568000000000002 s |
0.000011783159970946145 s |
1.49 |
const_scatter / PartOpt / cpu / Forward |
0.000017684 s |
0.000011229000037928926 s |
1.57 |
const_scatter / IPartOpt / cpu / Forward |
0.000017641 s |
0.000012111659980291734 s |
1.46 |
const_scatter / DefOpt / cpu / Forward |
0.000017715999999999998 s |
0.000011104419972980396 s |
1.60 |
const_scatter / IDefOpt / cpu / Forward |
0.000018399 s |
0.000012228739997226512 s |
1.50 |
const_scatter / JaXPipe / cpu / PreRev |
0.000520072 s |
0.0002900659799706 s |
1.79 |
const_scatter / JaXPipe / cpu / PostRev |
0.000524602 s |
0.0002828032600064 s |
1.86 |
const_scatter / JaXPipe / cpu / BothRev |
0.0005186979999999 s |
0.0002831334199981 s |
1.83 |
const_scatter / Jax / cpu / BothRev |
0.000509981 s |
0.0002821311999741 s |
1.81 |
const_scatter / HLOOpt / cpu / PreRev |
0.000513085 s |
0.0002847428000131 s |
1.80 |
const_scatter / HLOOpt / cpu / PostRev |
0.000511103 s |
0.0002864551800121 s |
1.78 |
const_scatter / HLOOpt / cpu / BothRev |
0.000524223 s |
0.0002834857999732 s |
1.85 |
const_scatter / PartOpt / cpu / PreRev |
0.0005257619999999 s |
0.000282958500029 s |
1.86 |
const_scatter / PartOpt / cpu / PostRev |
0.000523146 s |
0.0002846486800353 s |
1.84 |
const_scatter / PartOpt / cpu / BothRev |
0.000510351 s |
0.0002827322999928 s |
1.81 |
const_scatter / IPartOpt / cpu / PreRev |
0.000526735 s |
0.0002970088200254 s |
1.77 |
const_scatter / IPartOpt / cpu / PostRev |
0.0005301829999999 s |
0.0002822358000321 s |
1.88 |
const_scatter / IPartOpt / cpu / BothRev |
0.000522049 s |
0.0002828837800097 s |
1.85 |
const_scatter / DefOpt / cpu / PreRev |
0.000521124 s |
0.0002843984800074 s |
1.83 |
const_scatter / DefOpt / cpu / PostRev |
0.000504847 s |
0.0002837708200422 s |
1.78 |
const_scatter / DefOpt / cpu / BothRev |
0.000519664 s |
0.0002855213999737 s |
1.82 |
const_scatter / IDefOpt / cpu / PreRev |
0.000533165 s |
0.0003056161200311 s |
1.74 |
const_scatter / IDefOpt / cpu / PostRev |
0.0005317049999999 s |
0.0002853217799838 s |
1.86 |
const_scatter / IDefOpt / cpu / BothRev |
0.000541714 s |
0.0002853424200202 s |
1.90 |
const_scatter / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.0000066701799732982185 s |
1.35 |
const_scatter / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000006743799976902665 s |
1.33 |
const_scatter / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007621859967912315 s |
1.18 |
const_scatter / PartOpt / cpu / Primal |
0.000008 s |
0.000006892500023241155 s |
1.16 |
const_scatter / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006738659985785489 s |
1.34 |
const_scatter / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.0000071026000114216 s |
1.27 |
const_scatter / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007858320032028132 s |
1.15 |
const_scatter / JaXPipe / cpu / Forward |
0.000013 s |
0.000011870759990415536 s |
1.10 |
const_scatter / Jax / cpu / Forward |
0.000012 s |
0.000010619959966788884 s |
1.13 |
const_scatter / HLOOpt / cpu / Forward |
0.000013 s |
0.000011783159970946145 s |
1.10 |
const_scatter / PartOpt / cpu / Forward |
0.000013 s |
0.000011229000037928926 s |
1.16 |
const_scatter / IPartOpt / cpu / Forward |
0.000013 s |
0.000012111659980291734 s |
1.07 |
const_scatter / DefOpt / cpu / Forward |
0.000013 s |
0.000011104419972980396 s |
1.17 |
const_scatter / IDefOpt / cpu / Forward |
0.000013 s |
0.000012228739997226512 s |
1.06 |
const_scatter / JaXPipe / cpu / PreRev |
0.000336 s |
0.0002900659799706 s |
1.16 |
const_scatter / JaXPipe / cpu / PostRev |
0.000348 s |
0.0002828032600064 s |
1.23 |
const_scatter / JaXPipe / cpu / BothRev |
0.000327 s |
0.0002831334199981 s |
1.15 |
const_scatter / Jax / cpu / BothRev |
0.000368 s |
0.0002821311999741 s |
1.30 |
const_scatter / HLOOpt / cpu / PreRev |
0.000399 s |
0.0002847428000131 s |
1.40 |
const_scatter / HLOOpt / cpu / PostRev |
0.0003549999999999 s |
0.0002864551800121 s |
1.24 |
const_scatter / HLOOpt / cpu / BothRev |
0.000343 s |
0.0002834857999732 s |
1.21 |
const_scatter / PartOpt / cpu / PreRev |
0.000384 s |
0.000282958500029 s |
1.36 |
const_scatter / PartOpt / cpu / PostRev |
0.00035 s |
0.0002846486800353 s |
1.23 |
const_scatter / PartOpt / cpu / BothRev |
0.00057 s |
0.0002827322999928 s |
2.02 |
const_scatter / IPartOpt / cpu / PreRev |
0.000333 s |
0.0002970088200254 s |
1.12 |
const_scatter / IPartOpt / cpu / PostRev |
0.000372 s |
0.0002822358000321 s |
1.32 |
const_scatter / IPartOpt / cpu / BothRev |
0.000326 s |
0.0002828837800097 s |
1.15 |
const_scatter / DefOpt / cpu / PreRev |
0.000388 s |
0.0002843984800074 s |
1.36 |
const_scatter / DefOpt / cpu / PostRev |
0.000363 s |
0.0002837708200422 s |
1.28 |
const_scatter / DefOpt / cpu / BothRev |
0.000336 s |
0.0002855213999737 s |
1.18 |
const_scatter / IDefOpt / cpu / PreRev |
0.0003689999999999 s |
0.0003056161200311 s |
1.21 |
const_scatter / IDefOpt / cpu / PostRev |
0.000363 s |
0.0002853217799838 s |
1.27 |
const_scatter / IDefOpt / cpu / BothRev |
0.000357 s |
0.0002853424200202 s |
1.25 |
GenDot / JaXPipe / cpu / Primal |
0.000007164820005982619 s |
0.000008296160003737896 s |
0.86 |
GenDot / Jax / cpu / Primal |
0.000006894260000080976 s |
0.000008259619971795473 s |
0.83 |
GenDot / HLOOpt / cpu / Primal |
0.000007182400011060963 s |
0.000008371400035684928 s |
0.86 |
GenDot / PartOpt / cpu / Primal |
0.000006891279992942145 s |
0.00000901005999367044 s |
0.76 |
GenDot / IPartOpt / cpu / Primal |
0.000007341340001403296 s |
0.000008573640043323394 s |
0.86 |
GenDot / DefOpt / cpu / Primal |
0.000007125279992123978 s |
0.000007987099979800406 s |
0.89 |
GenDot / IDefOpt / cpu / Primal |
0.000006793240013394097 s |
0.000007670840022910853 s |
0.89 |
GenDot / JaXPipe / cpu / Forward |
0.000010680080006295611 s |
0.000011564219994397715 s |
0.92 |
GenDot / Jax / cpu / Forward |
0.00001023135998593716 s |
0.000011687760006680036 s |
0.88 |
GenDot / HLOOpt / cpu / Forward |
0.000010816820004038164 s |
0.000012549280017992717 s |
0.86 |
GenDot / PartOpt / cpu / Forward |
0.00001070916000571742 s |
0.000011167880002176387 s |
0.96 |
GenDot / IPartOpt / cpu / Forward |
0.000011368420014150616 s |
0.000012273299962544117 s |
0.93 |
GenDot / DefOpt / cpu / Forward |
0.000010471499995219347 s |
0.000011512759956531226 s |
0.91 |
GenDot / IDefOpt / cpu / Forward |
0.00001057081999078946 s |
0.000012003320016447103 s |
0.88 |
GenDot / JaXPipe / cpu / PreRev |
0.00001126492000594226 s |
0.000012277580008230873 s |
0.92 |
GenDot / JaXPipe / cpu / PostRev |
0.000010336860002553294 s |
0.000011590839994823907 s |
0.89 |
GenDot / JaXPipe / cpu / BothRev |
0.00001152759998831243 s |
0.000011976359992331707 s |
0.96 |
GenDot / Jax / cpu / BothRev |
0.00001020006000317153 s |
0.000011309560022709776 s |
0.90 |
GenDot / HLOOpt / cpu / PreRev |
0.00001127240001096652 s |
0.000012416999970810138 s |
0.91 |
GenDot / HLOOpt / cpu / PostRev |
0.00001264602000674131 s |
0.00001339757995083346 s |
0.94 |
GenDot / HLOOpt / cpu / BothRev |
0.000011217700007364327 s |
0.000011228120001760544 s |
1.00 |
GenDot / PartOpt / cpu / PreRev |
0.000011149759993713816 s |
0.000011987920024694176 s |
0.93 |
GenDot / PartOpt / cpu / PostRev |
0.00001017342000068311 s |
0.00001231975996233814 s |
0.83 |
GenDot / PartOpt / cpu / BothRev |
0.0000114205599993511 s |
0.000012321479962338344 s |
0.93 |
GenDot / IPartOpt / cpu / PreRev |
0.000011031660008029576 s |
0.000011272580013610423 s |
0.98 |
GenDot / IPartOpt / cpu / PostRev |
0.000010528539999086206 s |
0.000012042980015394278 s |
0.87 |
GenDot / IPartOpt / cpu / BothRev |
0.000010413259990400548 s |
0.000011842540016004931 s |
0.88 |
GenDot / DefOpt / cpu / PreRev |
0.000011051439998936986 s |
0.000011467599997558865 s |
0.96 |
GenDot / DefOpt / cpu / PostRev |
0.000011337839994212118 s |
0.000011891319991264026 s |
0.95 |
GenDot / DefOpt / cpu / BothRev |
0.000010511379994113667 s |
0.000012002500052403774 s |
0.88 |
GenDot / IDefOpt / cpu / PreRev |
0.000011223220001284062 s |
0.000011844639984701644 s |
0.95 |
GenDot / IDefOpt / cpu / PostRev |
0.000011027500002001031 s |
0.000012110100014979252 s |
0.91 |
GenDot / IDefOpt / cpu / BothRev |
0.000011024639995866891 s |
0.000012075799977537829 s |
0.91 |
GenDot / JaXPipe / cuda / Primal |
0.000002496 s |
0.000002015 s |
1.24 |
GenDot / Jax / cuda / Primal |
0.000002495 s |
0.000002016 s |
1.24 |
GenDot / HLOOpt / cuda / Primal |
0.000002495 s |
0.000001984 s |
1.26 |
GenDot / PartOpt / cuda / Primal |
0.000002527 s |
0.000002015 s |
1.25 |
GenDot / IPartOpt / cuda / Primal |
0.000002527 s |
0.000002015 s |
1.25 |
GenDot / DefOpt / cuda / Primal |
0.000002496 s |
0.000001984 s |
1.26 |
GenDot / IDefOpt / cuda / Primal |
0.000002495 s |
0.000001984 s |
1.26 |
GenDot / JaXPipe / cuda / Forward |
0.000010176 s |
0.000009792 s |
1.04 |
GenDot / Jax / cuda / Forward |
0.000010656 s |
0.000010016 s |
1.06 |
GenDot / HLOOpt / cuda / Forward |
0.000010624 s |
0.000009952 s |
1.07 |
GenDot / PartOpt / cuda / Forward |
0.00001056 s |
0.00000992 s |
1.06 |
GenDot / IPartOpt / cuda / Forward |
0.00001072 s |
0.000009792 s |
1.09 |
GenDot / DefOpt / cuda / Forward |
0.000010656 s |
0.000009632 s |
1.11 |
GenDot / IDefOpt / cuda / Forward |
0.00001056 s |
0.00001008 s |
1.05 |
GenDot / JaXPipe / cuda / PreRev |
0.000010496 s |
0.00000992 s |
1.06 |
GenDot / JaXPipe / cuda / PostRev |
0.000010752 s |
0.000009984 s |
1.08 |
GenDot / JaXPipe / cuda / BothRev |
0.000010752 s |
0.000009952 s |
1.08 |
GenDot / Jax / cuda / BothRev |
0.000010591 s |
0.000009952 s |
1.06 |
GenDot / HLOOpt / cuda / PreRev |
0.000010688 s |
0.000009792 s |
1.09 |
GenDot / HLOOpt / cuda / PostRev |
0.000010591 s |
0.000010464 s |
1.01 |
GenDot / HLOOpt / cuda / BothRev |
0.000011135 s |
0.00000992 s |
1.12 |
GenDot / PartOpt / cuda / PreRev |
0.000010751 s |
0.000009855 s |
1.09 |
GenDot / PartOpt / cuda / PostRev |
0.000010304 s |
0.000009984 s |
1.03 |
GenDot / PartOpt / cuda / BothRev |
0.000010528 s |
0.000010016 s |
1.05 |
GenDot / IPartOpt / cuda / PreRev |
0.000010528 s |
0.000009888 s |
1.06 |
GenDot / IPartOpt / cuda / PostRev |
0.000009344 s |
0.000010145 s |
0.92 |
GenDot / IPartOpt / cuda / BothRev |
0.00001056 s |
0.000009664 s |
1.09 |
GenDot / DefOpt / cuda / PreRev |
0.000010656 s |
0.000010016 s |
1.06 |
GenDot / DefOpt / cuda / PostRev |
0.000010464 s |
0.000009665 s |
1.08 |
GenDot / DefOpt / cuda / BothRev |
0.00001056 s |
0.000009856 s |
1.07 |
GenDot / IDefOpt / cuda / PreRev |
0.000010593 s |
0.000009984 s |
1.06 |
GenDot / IDefOpt / cuda / PostRev |
0.0000104 s |
0.000010016 s |
1.04 |
GenDot / IDefOpt / cuda / BothRev |
0.000010688 s |
0.000009792 s |
1.09 |
GenDot / JaXPipe / tpu / Primal |
9.29575e-7 s |
9.30175e-7 s |
1.00 |
GenDot / Jax / tpu / Primal |
9.2555e-7 s |
9.25725e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.0000015757750000000002 s |
0.00000157525 s |
1.00 |
GenDot / PartOpt / tpu / Primal |
9.25875e-7 s |
9.25625e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.30225e-7 s |
9.297e-7 s |
1.00 |
GenDot / DefOpt / tpu / Primal |
0.00000149465 s |
0.00000148965 s |
1.00 |
GenDot / IDefOpt / tpu / Primal |
0.000001582175 s |
0.0000015781 s |
1.00 |
GenDot / JaXPipe / tpu / Forward |
0.0000031760500000000003 s |
0.000003164375 s |
1.00 |
GenDot / Jax / tpu / Forward |
0.00000232045 s |
0.000002315075 s |
1.00 |
GenDot / HLOOpt / tpu / Forward |
0.0000031277500000000003 s |
0.000003115875 s |
1.00 |
GenDot / PartOpt / tpu / Forward |
0.000003233825 s |
0.00000322785 s |
1.00 |
GenDot / IPartOpt / tpu / Forward |
0.000003117625 s |
0.000003113875 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.000003220375 s |
0.0000032144750000000003 s |
1.00 |
GenDot / IDefOpt / tpu / Forward |
0.000003118625 s |
0.00000311405 s |
1.00 |
GenDot / JaXPipe / tpu / PreRev |
0.000002971175 s |
0.0000029543750000000004 s |
1.01 |
GenDot / JaXPipe / tpu / PostRev |
0.00000240815 s |
0.00000240305 s |
1.00 |
GenDot / JaXPipe / tpu / BothRev |
0.000002958125 s |
0.0000029627 s |
1.00 |
GenDot / Jax / tpu / BothRev |
0.0000024127750000000004 s |
0.000002401275 s |
1.00 |
GenDot / HLOOpt / tpu / PreRev |
0.0000029723500000000004 s |
0.0000029584250000000004 s |
1.00 |
GenDot / HLOOpt / tpu / PostRev |
0.0000029275 s |
0.00000294135 s |
1.00 |
GenDot / HLOOpt / tpu / BothRev |
0.00000296765 s |
0.00000295935 s |
1.00 |
GenDot / PartOpt / tpu / PreRev |
0.0000029399 s |
0.000002942475 s |
1.00 |
GenDot / PartOpt / tpu / PostRev |
0.000002398775 s |
0.0000023939 s |
1.00 |
GenDot / PartOpt / tpu / BothRev |
0.00000294025 s |
0.000002942875 s |
1.00 |
GenDot / IPartOpt / tpu / PreRev |
0.000002960475 s |
0.0000029719 s |
1.00 |
GenDot / IPartOpt / tpu / PostRev |
0.000002410425 s |
0.000002410775 s |
1.00 |
GenDot / IPartOpt / tpu / BothRev |
0.000002980325 s |
0.000002961425 s |
1.01 |
GenDot / DefOpt / tpu / PreRev |
0.000002940425 s |
0.0000029349 s |
1.00 |
GenDot / DefOpt / tpu / PostRev |
0.00000296485 s |
0.000002974775 s |
1.00 |
GenDot / DefOpt / tpu / BothRev |
0.000002938225 s |
0.0000029336750000000003 s |
1.00 |
GenDot / IDefOpt / tpu / PreRev |
0.00000298295 s |
0.000002968575 s |
1.00 |
GenDot / IDefOpt / tpu / PostRev |
0.00000293295 s |
0.0000029402250000000003 s |
1.00 |
GenDot / IDefOpt / tpu / BothRev |
0.000002980125 s |
0.000002968075 s |
1.00 |
GenDot / JaXPipe / cpu / Primal |
0.000015516000000000002 s |
0.000008296160003737896 s |
1.87 |
GenDot / Jax / cpu / Primal |
0.000014901 s |
0.000008259619971795473 s |
1.80 |
GenDot / HLOOpt / cpu / Primal |
0.000014066 s |
0.000008371400035684928 s |
1.68 |
GenDot / PartOpt / cpu / Primal |
0.000015588 s |
0.00000901005999367044 s |
1.73 |
GenDot / IPartOpt / cpu / Primal |
0.00001498 s |
0.000008573640043323394 s |
1.75 |
GenDot / DefOpt / cpu / Primal |
0.000014295 s |
0.000007987099979800406 s |
1.79 |
GenDot / IDefOpt / cpu / Primal |
0.000014119 s |
0.000007670840022910853 s |
1.84 |
GenDot / JaXPipe / cpu / Forward |
0.000019373 s |
0.000011564219994397715 s |
1.68 |
GenDot / Jax / cpu / Forward |
0.000020421000000000003 s |
0.000011687760006680036 s |
1.75 |
GenDot / HLOOpt / cpu / Forward |
0.000019408 s |
0.000012549280017992717 s |
1.55 |
GenDot / PartOpt / cpu / Forward |
0.000019467 s |
0.000011167880002176387 s |
1.74 |
GenDot / IPartOpt / cpu / Forward |
0.000019438 s |
0.000012273299962544117 s |
1.58 |
GenDot / DefOpt / cpu / Forward |
0.000019345 s |
0.000011512759956531226 s |
1.68 |
GenDot / IDefOpt / cpu / Forward |
0.000019934 s |
0.000012003320016447103 s |
1.66 |
GenDot / JaXPipe / cpu / PreRev |
0.000019513 s |
0.000012277580008230873 s |
1.59 |
GenDot / JaXPipe / cpu / PostRev |
0.000021119 s |
0.000011590839994823907 s |
1.82 |
GenDot / JaXPipe / cpu / BothRev |
0.000020144 s |
0.000011976359992331707 s |
1.68 |
GenDot / Jax / cpu / BothRev |
0.000019993 s |
0.000011309560022709776 s |
1.77 |
GenDot / HLOOpt / cpu / PreRev |
0.00002042 s |
0.000012416999970810138 s |
1.64 |
GenDot / HLOOpt / cpu / PostRev |
0.000020393000000000003 s |
0.00001339757995083346 s |
1.52 |
GenDot / HLOOpt / cpu / BothRev |
0.000019157 s |
0.000011228120001760544 s |
1.71 |
GenDot / PartOpt / cpu / PreRev |
0.000019171 s |
0.000011987920024694176 s |
1.60 |
GenDot / PartOpt / cpu / PostRev |
0.000020405 s |
0.00001231975996233814 s |
1.66 |
GenDot / PartOpt / cpu / BothRev |
0.000019198 s |
0.000012321479962338344 s |
1.56 |
GenDot / IPartOpt / cpu / PreRev |
0.000019247 s |
0.000011272580013610423 s |
1.71 |
GenDot / IPartOpt / cpu / PostRev |
0.000020531 s |
0.000012042980015394278 s |
1.70 |
GenDot / IPartOpt / cpu / BothRev |
0.000019199 s |
0.000011842540016004931 s |
1.62 |
GenDot / DefOpt / cpu / PreRev |
0.000019784 s |
0.000011467599997558865 s |
1.73 |
GenDot / DefOpt / cpu / PostRev |
0.000019906 s |
0.000011891319991264026 s |
1.67 |
GenDot / DefOpt / cpu / BothRev |
0.000019323 s |
0.000012002500052403774 s |
1.61 |
GenDot / IDefOpt / cpu / PreRev |
0.000019563 s |
0.000011844639984701644 s |
1.65 |
GenDot / IDefOpt / cpu / PostRev |
0.000018855 s |
0.000012110100014979252 s |
1.56 |
GenDot / IDefOpt / cpu / BothRev |
0.000019256 s |
0.000012075799977537829 s |
1.59 |
GenDot / JaXPipe / cpu / Primal |
0.000011 s |
0.000008296160003737896 s |
1.33 |
GenDot / Jax / cpu / Primal |
0.000011 s |
0.000008259619971795473 s |
1.33 |
GenDot / HLOOpt / cpu / Primal |
0.00001 s |
0.000008371400035684928 s |
1.19 |
GenDot / PartOpt / cpu / Primal |
0.00001 s |
0.00000901005999367044 s |
1.11 |
GenDot / IPartOpt / cpu / Primal |
0.000011 s |
0.000008573640043323394 s |
1.28 |
GenDot / DefOpt / cpu / Primal |
0.00001 s |
0.000007987099979800406 s |
1.25 |
GenDot / IDefOpt / cpu / Primal |
0.00001 s |
0.000007670840022910853 s |
1.30 |
GenDot / JaXPipe / cpu / Forward |
0.000014 s |
0.000011564219994397715 s |
1.21 |
GenDot / Jax / cpu / Forward |
0.000015 s |
0.000011687760006680036 s |
1.28 |
GenDot / HLOOpt / cpu / Forward |
0.000014 s |
0.000012549280017992717 s |
1.12 |
GenDot / PartOpt / cpu / Forward |
0.000015 s |
0.000011167880002176387 s |
1.34 |
GenDot / IPartOpt / cpu / Forward |
0.000014 s |
0.000012273299962544117 s |
1.14 |
GenDot / DefOpt / cpu / Forward |
0.000014 s |
0.000011512759956531226 s |
1.22 |
GenDot / IDefOpt / cpu / Forward |
0.000014 s |
0.000012003320016447103 s |
1.17 |
GenDot / JaXPipe / cpu / PreRev |
0.000014 s |
0.000012277580008230873 s |
1.14 |
GenDot / JaXPipe / cpu / PostRev |
0.000014 s |
0.000011590839994823907 s |
1.21 |
GenDot / JaXPipe / cpu / BothRev |
0.000014 s |
0.000011976359992331707 s |
1.17 |
GenDot / Jax / cpu / BothRev |
0.000015 s |
0.000011309560022709776 s |
1.33 |
GenDot / HLOOpt / cpu / PreRev |
0.000015 s |
0.000012416999970810138 s |
1.21 |
GenDot / HLOOpt / cpu / PostRev |
0.000015 s |
0.00001339757995083346 s |
1.12 |
GenDot / HLOOpt / cpu / BothRev |
0.000014 s |
0.000011228120001760544 s |
1.25 |
GenDot / PartOpt / cpu / PreRev |
0.000014 s |
0.000011987920024694176 s |
1.17 |
GenDot / PartOpt / cpu / PostRev |
0.000015 s |
0.00001231975996233814 s |
1.22 |
GenDot / PartOpt / cpu / BothRev |
0.000014 s |
0.000012321479962338344 s |
1.14 |
GenDot / IPartOpt / cpu / PreRev |
0.000014 s |
0.000011272580013610423 s |
1.24 |
GenDot / IPartOpt / cpu / PostRev |
0.000015 s |
0.000012042980015394278 s |
1.25 |
GenDot / IPartOpt / cpu / BothRev |
0.000014 s |
0.000011842540016004931 s |
1.18 |
GenDot / DefOpt / cpu / PreRev |
0.000014 s |
0.000011467599997558865 s |
1.22 |
GenDot / DefOpt / cpu / PostRev |
0.000014 s |
0.000011891319991264026 s |
1.18 |
GenDot / DefOpt / cpu / BothRev |
0.000015 s |
0.000012002500052403774 s |
1.25 |
GenDot / IDefOpt / cpu / PreRev |
0.000015 s |
0.000011844639984701644 s |
1.27 |
GenDot / IDefOpt / cpu / PostRev |
0.000014 s |
0.000012110100014979252 s |
1.16 |
GenDot / IDefOpt / cpu / BothRev |
0.000015 s |
0.000012075799977537829 s |
1.24 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000011069800011682672 s |
0.000010163059996557424 s |
1.09 |
hlo_ffi / Jax / cpu / Primal |
0.000010401959991668264 s |
0.000009343899982923176 s |
1.11 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000011500960004013906 s |
0.000009920599950419272 s |
1.16 |
hlo_ffi / PartOpt / cpu / Primal |
0.000010608020004383432 s |
0.000009339799971712637 s |
1.14 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000011773480000556448 s |
0.000009727540027597567 s |
1.21 |
hlo_ffi / DefOpt / cpu / Primal |
0.00001112487999989753 s |
0.000009657739974500146 s |
1.15 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000011843779993796489 s |
0.000009740259984027945 s |
1.22 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000016607320005732618 s |
0.00001370602000861254 s |
1.21 |
hlo_ffi / Jax / cpu / Forward |
0.000015605120004238416 s |
0.000013844219993188744 s |
1.13 |
hlo_ffi / HLOOpt / cpu / Forward |
0.00001578803999336742 s |
0.00001383915999213059 s |
1.14 |
hlo_ffi / PartOpt / cpu / Forward |
0.000015487040000152773 s |
0.000013616979977086883 s |
1.14 |
hlo_ffi / IPartOpt / cpu / Forward |
0.00001607402000900038 s |
0.0000135127200428542 s |
1.19 |
hlo_ffi / DefOpt / cpu / Forward |
0.000015092599999206867 s |
0.000013661500024682028 s |
1.10 |
hlo_ffi / IDefOpt / cpu / Forward |
0.00001561337999874013 s |
0.000013947420038675773 s |
1.12 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000016208839995215384 s |
0.000013623760041809874 s |
1.19 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.00001548832000707989 s |
0.00001384286003485613 s |
1.12 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000015097779992174764 s |
0.000014081040008022683 s |
1.07 |
hlo_ffi / Jax / cpu / BothRev |
0.000016585980001764257 s |
0.000013902360005886294 s |
1.19 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000016859779998412704 s |
0.000013973720015201252 s |
1.21 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.00001758572001108405 s |
0.00001598522000676894 s |
1.10 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000015445620001628412 s |
0.000013878419968023082 s |
1.11 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000016116799997689668 s |
0.000013816640012009884 s |
1.17 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000014931759997125482 s |
0.000013948579971838624 s |
1.07 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000016092899991235753 s |
0.000014386660022864815 s |
1.12 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000016242600001987738 s |
0.000013554640017900966 s |
1.20 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000015602879998368734 s |
0.000013573120013461448 s |
1.15 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.00001534966000235727 s |
0.000013850839995939167 s |
1.11 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000015864200001942663 s |
0.000013287640031194317 s |
1.19 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000015150199990330292 s |
0.000013696799987883425 s |
1.11 |
hlo_ffi / DefOpt / cpu / BothRev |
0.00001562863999424735 s |
0.000013824719981130328 s |
1.13 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000016428340009042584 s |
0.000013894240000809077 s |
1.18 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.00001582107999638538 s |
0.000013869060003344204 s |
1.14 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000014694279998366256 s |
0.000013805299995510722 s |
1.06 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000002367 s |
0.000001983 s |
1.19 |
hlo_ffi / Jax / cuda / Primal |
0.000002367 s |
0.000001983 s |
1.19 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000002367 s |
0.000001983 s |
1.19 |
hlo_ffi / PartOpt / cuda / Primal |
0.000002367 s |
0.000001984 s |
1.19 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000002367 s |
0.000001983 s |
1.19 |
hlo_ffi / DefOpt / cuda / Primal |
0.000002367 s |
0.000001984 s |
1.19 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000002367 s |
0.000001984 s |
1.19 |
hlo_ffi / JaXPipe / cuda / Forward |
0.000002432 s |
0.00000208 s |
1.17 |
hlo_ffi / Jax / cuda / Forward |
0.000002432 s |
0.000002049 s |
1.19 |
hlo_ffi / HLOOpt / cuda / Forward |
0.000002432 s |
0.000002048 s |
1.19 |
hlo_ffi / PartOpt / cuda / Forward |
0.000002432 s |
0.000002048 s |
1.19 |
hlo_ffi / IPartOpt / cuda / Forward |
0.000002432 s |
0.00000208 s |
1.17 |
hlo_ffi / DefOpt / cuda / Forward |
0.000002431 s |
0.00000208 s |
1.17 |
hlo_ffi / IDefOpt / cuda / Forward |
0.000002431 s |
0.00000208 s |
1.17 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002432 s |
0.000002047 s |
1.19 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / Jax / cuda / BothRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002432 s |
0.000002047 s |
1.19 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002432 s |
0.000002047 s |
1.19 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002431 s |
0.000002047 s |
1.19 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002431 s |
0.000002048 s |
1.19 |
hlo_ffi / JaXPipe / tpu / Primal |
9.27925e-7 s |
9.3215e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Primal |
9.57425e-7 s |
9.53125e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Primal |
9.07025e-7 s |
9.07375e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Primal |
9.55325e-7 s |
9.50025e-7 s |
1.01 |
hlo_ffi / IPartOpt / tpu / Primal |
9.07475e-7 s |
9.1355e-7 s |
0.99 |
hlo_ffi / DefOpt / tpu / Primal |
9.55075e-7 s |
9.58125e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Primal |
9.074e-7 s |
9.0525e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Forward |
9.49075e-7 s |
9.49075e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.822e-7 s |
9.81575e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.7535e-7 s |
9.73475e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.34025e-7 s |
9.33675e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Forward |
9.74275e-7 s |
9.73675e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.34525e-7 s |
9.33775e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Forward |
9.74825e-7 s |
9.73925e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.38475e-7 s |
9.3785e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.659e-7 s |
9.64675e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.62725e-7 s |
9.62325e-7 s |
1.00 |
hlo_ffi / Jax / tpu / BothRev |
9.65625e-7 s |
9.64675e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.6325e-7 s |
9.62175e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.6535e-7 s |
9.64525e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.62275e-7 s |
9.62325e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PreRev |
9.6555e-7 s |
9.647e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.62075e-7 s |
9.6175e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / BothRev |
9.64925e-7 s |
9.64425e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.625e-7 s |
9.61625e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.6515e-7 s |
9.64925e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.62225e-7 s |
9.6185e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PreRev |
9.6525e-7 s |
9.64575e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.62675e-7 s |
9.6215e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / BothRev |
9.65375e-7 s |
9.6475e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.61725e-7 s |
9.6165e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.6545e-7 s |
9.64825e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.62675e-7 s |
9.61925e-7 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000018333 s |
0.000010163059996557424 s |
1.80 |
hlo_ffi / Jax / cpu / Primal |
0.000018381 s |
0.000009343899982923176 s |
1.97 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000018343 s |
0.000009920599950419272 s |
1.85 |
hlo_ffi / PartOpt / cpu / Primal |
0.000018326 s |
0.000009339799971712637 s |
1.96 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000017794 s |
0.000009727540027597567 s |
1.83 |
hlo_ffi / DefOpt / cpu / Primal |
0.000018222 s |
0.000009657739974500146 s |
1.89 |
hlo_ffi / IDefOpt / cpu / Primal |
0.00001812 s |
0.000009740259984027945 s |
1.86 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000025334 s |
0.00001370602000861254 s |
1.85 |
hlo_ffi / Jax / cpu / Forward |
0.000024662 s |
0.000013844219993188744 s |
1.78 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000024432 s |
0.00001383915999213059 s |
1.77 |
hlo_ffi / PartOpt / cpu / Forward |
0.000024989 s |
0.000013616979977086883 s |
1.84 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000024807 s |
0.0000135127200428542 s |
1.84 |
hlo_ffi / DefOpt / cpu / Forward |
0.000025618 s |
0.000013661500024682028 s |
1.88 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000025099 s |
0.000013947420038675773 s |
1.80 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000025374 s |
0.000013623760041809874 s |
1.86 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000024226 s |
0.00001384286003485613 s |
1.75 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000024604 s |
0.000014081040008022683 s |
1.75 |
hlo_ffi / Jax / cpu / BothRev |
0.000025102 s |
0.000013902360005886294 s |
1.81 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000024745 s |
0.000013973720015201252 s |
1.77 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000024858 s |
0.00001598522000676894 s |
1.56 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000024576 s |
0.000013878419968023082 s |
1.77 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000025012 s |
0.000013816640012009884 s |
1.81 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000024558 s |
0.000013948579971838624 s |
1.76 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000025008 s |
0.000014386660022864815 s |
1.74 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.0000249 s |
0.000013554640017900966 s |
1.84 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000024837 s |
0.000013573120013461448 s |
1.83 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000024693 s |
0.000013850839995939167 s |
1.78 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000025106 s |
0.000013287640031194317 s |
1.89 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000023702 s |
0.000013696799987883425 s |
1.73 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000023961 s |
0.000013824719981130328 s |
1.73 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.00002522 s |
0.000013894240000809077 s |
1.82 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000024647 s |
0.000013869060003344204 s |
1.78 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000024281 s |
0.000013805299995510722 s |
1.76 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000013 s |
0.000010163059996557424 s |
1.28 |
hlo_ffi / Jax / cpu / Primal |
0.000013 s |
0.000009343899982923176 s |
1.39 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000013 s |
0.000009920599950419272 s |
1.31 |
hlo_ffi / PartOpt / cpu / Primal |
0.000013 s |
0.000009339799971712637 s |
1.39 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000012 s |
0.000009727540027597567 s |
1.23 |
hlo_ffi / DefOpt / cpu / Primal |
0.000013 s |
0.000009657739974500146 s |
1.35 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000012 s |
0.000009740259984027945 s |
1.23 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000017 s |
0.00001370602000861254 s |
1.24 |
hlo_ffi / Jax / cpu / Forward |
0.000017 s |
0.000013844219993188744 s |
1.23 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000017 s |
0.00001383915999213059 s |
1.23 |
hlo_ffi / PartOpt / cpu / Forward |
0.000017 s |
0.000013616979977086883 s |
1.25 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000019 s |
0.0000135127200428542 s |
1.41 |
hlo_ffi / DefOpt / cpu / Forward |
0.000017 s |
0.000013661500024682028 s |
1.24 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000017 s |
0.000013947420038675773 s |
1.22 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000017 s |
0.000013623760041809874 s |
1.25 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000017 s |
0.00001384286003485613 s |
1.23 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000017 s |
0.000014081040008022683 s |
1.21 |
hlo_ffi / Jax / cpu / BothRev |
0.000017 s |
0.000013902360005886294 s |
1.22 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000018 s |
0.000013973720015201252 s |
1.29 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000018 s |
0.00001598522000676894 s |
1.13 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000018 s |
0.000013878419968023082 s |
1.30 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000017 s |
0.000013816640012009884 s |
1.23 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000019 s |
0.000013948579971838624 s |
1.36 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000018 s |
0.000014386660022864815 s |
1.25 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000018 s |
0.000013554640017900966 s |
1.33 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000018 s |
0.000013573120013461448 s |
1.33 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000017 s |
0.000013850839995939167 s |
1.23 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000017 s |
0.000013287640031194317 s |
1.28 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000017 s |
0.000013696799987883425 s |
1.24 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000017 s |
0.000013824719981130328 s |
1.23 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000017 s |
0.000013894240000809077 s |
1.22 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000017 s |
0.000013869060003344204 s |
1.23 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000017 s |
0.000013805299995510722 s |
1.23 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0009213158000193 s |
0.0008871002001797 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0009104983999804 s |
0.0008837956000206 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.0009805919999735 s |
0.0009228236001035 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0009094214000015 s |
0.0008805019999272 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0009250901999848 s |
0.0008894220000001 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0009860830000206 s |
0.0009508617999017 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.0009907198000064 s |
0.0009291351999308 s |
1.07 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.002204480000023 s |
0.0021042777999355 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0023456586000293 s |
0.0022587204000046 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0022288163999746 s |
0.0021500653998373 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.002199663000033 s |
0.0022119191999081 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.002225231400007 s |
0.0021429494000585 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0022475720000102 s |
0.0021400593999715 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0021418807999907 s |
0.0021881026000301 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0051802711999926 s |
0.0050751822000165 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0049583580000216 s |
0.0056805405999512 s |
0.87 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0056441988000415 s |
0.0058089991998713 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0055026360000056 s |
0.0052328876001411 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0053289049999875 s |
0.0055911849999574 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.005597296799965 s |
0.0047908924000694 s |
1.17 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0070611067999834 s |
0.0059099502001117 s |
1.19 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0049999660000139 s |
0.0036509223998109 s |
1.37 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0035906825999973 s |
0.0054702081999494 s |
0.66 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0051593262000096 s |
0.0052289548000771 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.0070447117999719 s |
0.0050246252000761 s |
1.40 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0051492684000322 s |
0.0065582606001044 s |
0.79 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0032928224000215 s |
0.0034128457998122 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0056232034000231 s |
0.0054061793999608 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.005711300200005 s |
0.004691940000066 s |
1.22 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0055255829999396 s |
0.0050492209999902 s |
1.09 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.003636172599954 s |
0.0050608083999577 s |
0.72 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0050699929999836 s |
0.0056577018001007 s |
0.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0035170639999932 s |
0.0047789924000426 s |
0.74 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.0002968629999999 s |
0.000284001 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.0002973109999999 s |
0.000284033 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000304415 s |
0.000291169 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000297503 s |
0.000282881 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000297215 s |
0.000283489 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.0003047989999999 s |
0.000290817 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.000303359 s |
0.00029104 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000582654 s |
0.000555745 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000566686 s |
0.000537986 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000582366 s |
0.0005573129999999 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.00058227 s |
0.000556898 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.0005819499999999 s |
0.000556737 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000582109 s |
0.000557762 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.000582526 s |
0.000557154 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001054748 s |
0.001026882 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.001008541 s |
0.000984706 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001050653 s |
0.001019711 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.001004733 s |
0.000986369 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001036893 s |
0.0010103089999999 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.0010606679999999 s |
0.001035842 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001037788 s |
0.001006659 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.001050044 s |
0.001025059 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000998044 s |
0.000975939 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.0010503 s |
0.001024803 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.0010503 s |
0.001022979 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000997788 s |
0.000974241 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.00105126 s |
0.001024098 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001052284 s |
0.00101962 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000985276 s |
0.00095821 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001048508 s |
0.001023619 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001052444 s |
0.001021217 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001053276 s |
0.001017858 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.001053724 s |
0.001020512 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.0001264599999999 s |
0.00012352525 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.000127945 s |
0.00012669775 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.00015585725 s |
0.00015251875 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013539725 s |
0.000134193 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.00013379625 s |
0.00013077025 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.000148733 s |
0.00014781275 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.000153813 s |
0.00015106675 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.00021452475 s |
0.000211938 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.00026106075 s |
0.0002610617499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00021419375 s |
0.0002121707499999 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.0002174805 s |
0.0002180217499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.00021441 s |
0.00021172775 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.00021754825 s |
0.00021815225 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.0002147349999999 s |
0.0002117155 s |
1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev | 0.00035538625 s | 0.0003562175 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev | 0.00025663025 s | 0.0002588909999999 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev | 0.0003546877499999 s | 0.0003564275 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev | 0.00025663675 s | 0.0002595994999999 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev | 0.00035502475 s | 0.00035665025 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev | 0.0002908104999999 s | 0.00029234875 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev | 0.00035497275 s | 0.0003562995 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev | 0.00035519625 s | 0.00035881575 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev | 0.00027154975 s | 0.0002721429999999 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev | 0.000355524 s | 0.000358587 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev | 0.0003545835 s | 0.00035639775 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev | 0.00027216325 s | 0.0002747947499999 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev | 0.0003545447499999 s | 0.00035658475 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev | 0.0003576357499999 s | 0.0003597825 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev | 0.0002831822499999 s | 0.0002841942499999 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev | 0.00035758025 s | 0.0003598795 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev | 0.00035697425 s | 0.000357839 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev | 0.00030080975 s | 0.000302267 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev | 0.00035766375 s | 0.0003577502499999 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.001911167 s | 0.0008871002001797 s | 2.15 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.002112111 s | 0.0008837956000206 s | 2.39 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.002254026 s | 0.0009228236001035 s | 2.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.002523412 s | 0.0008805019999272 s | 2.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.002162878 s | 0.0008894220000001 s | 2.43 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.002169751 s | 0.0009508617999017 s | 2.28 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.002182197 s | 0.0009291351999308 s | 2.35 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.005627352 s | 0.0021042777999355 s | 2.67 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.005940199 s | 0.0022587204000046 s | 2.63 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.0059031529999999 s | 0.0021500653998373 s | 2.75 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.005963574 s | 0.0022119191999081 s | 2.70 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.005342604 s | 0.0021429494000585 s | 2.49 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.005899452 s | 0.0021400593999715 s | 2.76 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.0052812929999999 s | 0.0021881026000301 s | 2.41 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.00943576 s | 0.0050751822000165 s | 1.86 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.008678521 s | 0.0056805405999512 s | 1.53 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.008068538 s | 0.0058089991998713 s | 1.39 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.008703638 s | 0.0052328876001411 s | 1.66 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.010425204 s | 0.0055911849999574 s | 1.86 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.008165762 s | 0.0047908924000694 s | 1.70 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.010082184 s | 0.0059099502001117 s | 1.71 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.008714896 s | 0.0036509223998109 s | 2.39 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.009681898 s | 0.0054702081999494 s | 1.77 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.007816826 s | 0.0052289548000771 s | 1.49 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.008819219 s | 0.0050246252000761 s | 1.76 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.008366284 s | 0.0065582606001044 s | 1.28 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.010173897 s | 0.0034128457998122 s | 2.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.008575517 s | 0.0054061793999608 s | 1.59 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.008188604 s | 0.004691940000066 s | 1.75 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.008236665 s | 0.0050492209999902 s | 1.63 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.00875351 s | 0.0050608083999577 s | 1.73 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.007653495 s | 0.0056577018001007 s | 1.35 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.0107052049999999 s | 0.0047789924000426 s | 2.24 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.002639 s | 0.0008871002001797 s | 2.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.002012 s | 0.0008837956000206 s | 2.28 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.002322 s | 0.0009228236001035 s | 2.52 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.002262 s | 0.0008805019999272 s | 2.57 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.002578 s | 0.0008894220000001 s | 2.90 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.002051 s | 0.0009508617999017 s | 2.16 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.001796 s | 0.0009291351999308 s | 1.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.004546 s | 0.0021042777999355 s | 2.16 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.004186 s | 0.0022587204000046 s | 1.85 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.004416 s | 0.0021500653998373 s | 2.05 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.004726 s | 0.0022119191999081 s | 2.14 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.004468 s | 0.0021429494000585 s | 2.08 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.004749 s | 0.0021400593999715 s | 2.22 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.004868 s | 0.0021881026000301 s | 2.22 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.011094 s | 0.0050751822000165 s | 2.19 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.020307 s | 0.0056805405999512 s | 3.57 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.008717 s | 0.0058089991998713 s | 1.50 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.009814 s | 0.0052328876001411 s | 1.88 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.0088209999999999 s | 0.0055911849999574 s | 1.58 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.008016 s | 0.0047908924000694 s | 1.67 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.0148379999999999 s | 0.0059099502001117 s | 2.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.010439 s | 0.0036509223998109 s | 2.86 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.014638 s | 0.0054702081999494 s | 2.68 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.012754 s | 0.0052289548000771 s | 2.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.0094 s | 0.0050246252000761 s | 1.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.017626 s | 0.0065582606001044 s | 2.69 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.0100679999999999 s | 0.0034128457998122 s | 2.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.013432 s | 0.0054061793999608 s | 2.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.0097899999999999 s | 0.004691940000066 s | 2.09 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.0114749999999999 s | 0.0050492209999902 s | 2.27 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.009778 s | 0.0050608083999577 s | 1.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.011222 s | 0.0056577018001007 s | 1.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.012923 s | 0.0047789924000426 s | 2.70 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000007667100005619432 s | 0.000008399699972869711 s | 0.91 |
| scatter_sum / Jax / cpu / Primal | 0.000007718980009485677 s | 0.00000856758005284064 s | 0.90 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000007659139998850151 s | 0.000009258660020350362 s | 0.83 |
| scatter_sum / PartOpt / cpu / Primal | 0.000007713320005677815 s | 0.000009661359954407087 s | 0.80 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000007863519997499679 s | 0.000009734099976412837 s | 0.81 |
| scatter_sum / DefOpt / cpu / Primal | 0.000007770459999392188 s | 0.000009198239977195045 s | 0.84 |
| scatter_sum / IDefOpt / cpu / Primal | 0.00000765412000419019 s | 0.000008779059990047244 s | 0.87 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000011554239993074589 s | 0.000013061339986961684 s | 0.88 |
| scatter_sum / Jax / cpu / Forward | 0.000011517739997088938 s | 0.000012101459969926508 s | 0.95 |
| scatter_sum / HLOOpt / cpu / Forward | 0.0000116724200029239 s | 0.000012741700002152357 s | 0.92 |
| scatter_sum / PartOpt / cpu / Forward | 0.00001154473999349648 s | 0.000012532019991340348 s | 0.92 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000011817240006166685 s | 0.0000133172999630915 s | 0.89 |
| scatter_sum / DefOpt / cpu / Forward | 0.000011767280007006775 s | 0.00001335968000603316 s | 0.88 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000011618659993928303 s | 0.000013295580029080156 s | 0.87 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.00001236743999470491 s | 0.000012938739982928384 s | 0.96 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000011306499993679609 s | 0.00001292862001719186 s | 0.87 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000011907039995548983 s | 0.000012804760017388616 s | 0.93 |
| scatter_sum / Jax / cpu / BothRev | 0.000011951460005548142 s | 0.000012629419989025336 s | 0.95 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000012150580000707122 s | 0.000013125519999448442 s | 0.93 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000013491180009168602 s | 0.000014698720024171053 s | 0.92 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000011304120002932905 s | 0.000013126020039635478 s | 0.86 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000011587799995140814 s | 0.000013033599962000154 s | 0.89 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000011495760006710042 s | 0.000012915639981656568 s | 0.89 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000011804099999608296 s | 0.000012708979975286638 s | 0.93 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000011991100002433086 s | 0.000012781279965565772 s | 0.94 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000011454740001681784 s | 0.000013472300042849384 s | 0.85 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.0000113740600068013 s | 0.000012834419976570643 s | 0.89 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000011613260005560732 s | 0.000013145060020178787 s | 0.88 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000011694960003296728 s | 0.00001275038002859219 s | 0.92 |
| scatter_sum / DefOpt / cpu / BothRev | 0.00001120000000582877 s | 0.000012998379997952724 s | 0.86 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.00001162016000762378 s | 0.000013091199998598312 s | 0.89 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000011676740000439169 s | 0.00001284074000068358 s | 0.91 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.00001169420000678656 s | 0.000012844779957958964 s | 0.91 |
| scatter_sum / JaXPipe / cuda / Primal | 0.000010784 s | 0.000009665 s | 1.12 |
| scatter_sum / Jax / cuda / Primal | 0.00001072 s | 0.000010208 s | 1.05 |
| scatter_sum / HLOOpt / cuda / Primal | 0.000010367 s | 0.00000976 s | 1.06 |
| scatter_sum / PartOpt / cuda / Primal | 0.000010496 s | 0.000009824 s | 1.07 |
| scatter_sum / IPartOpt / cuda / Primal | 0.000010688 s | 0.00000992 s | 1.08 |
| scatter_sum / DefOpt / cuda / Primal | 0.000010529 s | 0.000009888 s | 1.06 |
| scatter_sum / IDefOpt / cuda / Primal | 0.000010656 s | 0.000009952 s | 1.07 |
| scatter_sum / JaXPipe / cuda / Forward | 0.000017247999999999998 s | 0.000016768000000000003 s | 1.03 |
| scatter_sum / Jax / cuda / Forward | 0.000017216 s | 0.000016255999999999998 s | 1.06 |
| scatter_sum / HLOOpt / cuda / Forward | 0.000017695 s | 0.000016927999999999998 s | 1.05 |
| scatter_sum / PartOpt / cuda / Forward | 0.000019072 s | 0.000016576000000000002 s | 1.15 |
| scatter_sum / IPartOpt / cuda / Forward | 0.00001984 s | 0.00001696 s | 1.17 |
| scatter_sum / DefOpt / cuda / Forward | 0.000016448000000000002 s | 0.000016288 s | 1.01 |
| scatter_sum / IDefOpt / cuda / Forward | 0.000017184 s | 0.000018336 s | 0.94 |
| scatter_sum / JaXPipe / cuda / PreRev | 0.000016704 s | 0.000018624000000000003 s | 0.90 |
| scatter_sum / JaXPipe / cuda / PostRev | 0.000018528 s | 0.000016511 s | 1.12 |
| scatter_sum / JaXPipe / cuda / BothRev | 0.000018688 s | 0.00001616 s | 1.16 |
| scatter_sum / Jax / cuda / BothRev | 0.000017888000000000002 s | 0.00001712 s | 1.04 |
| scatter_sum / HLOOpt / cuda / PreRev | 0.00001952 s | 0.000016704 s | 1.17 |
| scatter_sum / HLOOpt / cuda / PostRev | 0.000017472 s | 0.000015935999999999998 s | 1.10 |
| scatter_sum / HLOOpt / cuda / BothRev | 0.000017151 s | 0.000021728 s | 0.79 |
| scatter_sum / PartOpt / cuda / PreRev | 0.000017024 s | 0.000016832 s | 1.01 |
| scatter_sum / PartOpt / cuda / PostRev | 0.000016832 s | 0.000015904000000000002 s | 1.06 |
| scatter_sum / PartOpt / cuda / BothRev | 0.000017056 s | 0.000017568000000000002 s | 0.97 |
| scatter_sum / IPartOpt / cuda / PreRev | 0.000017375999999999998 s | 0.000016832 s | 1.03 |
| scatter_sum / IPartOpt / cuda / PostRev | 0.000017184 s | 0.000016191 s | 1.06 |
| scatter_sum / IPartOpt / cuda / BothRev | 0.00001712 s | 0.00001696 s | 1.01 |
| scatter_sum / DefOpt / cuda / PreRev | 0.000017375999999999998 s | 0.000017311 s | 1.00 |
| scatter_sum / DefOpt / cuda / PostRev | 0.000017216 s | 0.000016417000000000002 s | 1.05 |
| scatter_sum / DefOpt / cuda / BothRev | 0.00001728 s | 0.000016257 s | 1.06 |
| scatter_sum / IDefOpt / cuda / PreRev | 0.000017183 s | 0.00001696 s | 1.01 |
| scatter_sum / IDefOpt / cuda / PostRev | 0.00001696 s | 0.000021728 s | 0.78 |
| scatter_sum / IDefOpt / cuda / BothRev | 0.000017728 s | 0.00001712 s | 1.04 |
| scatter_sum / JaXPipe / tpu / Primal | 0.000001350125 s | 0.000001342625 s | 1.01 |
| scatter_sum / Jax / tpu / Primal | 0.00000140485 s | 0.000001403475 s | 1.00 |
| scatter_sum / HLOOpt / tpu / Primal | 0.000001350875 s | 0.000001342625 s | 1.01 |
| scatter_sum / PartOpt / tpu / Primal | 0.0000014048 s | 0.0000014036 s | 1.00 |
| scatter_sum / IPartOpt / tpu / Primal | 0.000001351125 s | 0.000001342725 s | 1.01 |
| scatter_sum / DefOpt / tpu / Primal | 0.0000014053499999999998 s | 0.00000140385 s | 1.00 |
| scatter_sum / IDefOpt / tpu / Primal | 0.000001351775 s | 0.0000013422999999999998 s | 1.01 |
| scatter_sum / JaXPipe / tpu / Forward | 0.0000027063 s | 0.0000027100000000000003 s | 1.00 |
| scatter_sum / Jax / tpu / Forward | 0.000002735425 s | 0.000002715175 s | 1.01 |
| scatter_sum / HLOOpt / tpu / Forward | 0.000002704 s | 0.0000027005750000000004 s | 1.00 |
| scatter_sum / PartOpt / tpu / Forward | 0.00000271045 s | 0.0000026893499999999995 s | 1.01 |
| scatter_sum / IPartOpt / tpu / Forward | 0.0000027014750000000003 s | 0.0000027078 s | 1.00 |
| scatter_sum / DefOpt / tpu / Forward | 0.000002693675 s | 0.00000269455 s | 1.00 |
| scatter_sum / IDefOpt / tpu / Forward | 0.00000270625 s | 0.0000026998 s | 1.00 |
| scatter_sum / JaXPipe / tpu / PreRev | 0.000002695175 s | 0.0000026912750000000003 s | 1.00 |
| scatter_sum / JaXPipe / tpu / PostRev | 0.0000026889 s | 0.000002691025 s | 1.00 |
| scatter_sum / JaXPipe / tpu / BothRev | 0.00000270325 s | 0.00000269995 s | 1.00 |
| scatter_sum / Jax / tpu / BothRev | 0.000002742775 s | 0.0000027454000000000004 s | 1.00 |
| scatter_sum / HLOOpt / tpu / PreRev | 0.0000027087250000000005 s | 0.0000026961750000000005 s | 1.00 |
| scatter_sum / HLOOpt / tpu / PostRev | 0.000002744475 s | 0.0000027491 s | 1.00 |
| scatter_sum / HLOOpt / tpu / BothRev | 0.000002708375 s | 0.000002698375 s | 1.00 |
| scatter_sum / PartOpt / tpu / PreRev | 0.00000274675 s | 0.000002740025 s | 1.00 |
| scatter_sum / PartOpt / tpu / PostRev | 0.0000027046500000000004 s | 0.0000026995500000000003 s | 1.00 |
| scatter_sum / PartOpt / tpu / BothRev | 0.0000027474 s | 0.000002742875 s | 1.00 |
| scatter_sum / IPartOpt / tpu / PreRev | 0.00000271605 s | 0.00000270345 s | 1.00 |
| scatter_sum / IPartOpt / tpu / PostRev | 0.000002755275 s | 0.000002738425 s | 1.01 |
| scatter_sum / IPartOpt / tpu / BothRev | 0.00000271015 s | 0.0000026955499999999995 s | 1.01 |
| scatter_sum / DefOpt / tpu / PreRev | 0.0000027438000000000003 s | 0.0000027395750000000004 s | 1.00 |
| scatter_sum / DefOpt / tpu / PostRev | 0.0000027074749999999994 s | 0.000002693525 s | 1.01 |
| scatter_sum / DefOpt / tpu / BothRev | 0.000002744425 s | 0.000002744675 s | 1.00 |
| scatter_sum / IDefOpt / tpu / PreRev | 0.0000027122 s | 0.00000269625 s | 1.01 |
| scatter_sum / IDefOpt / tpu / PostRev | 0.0000027452 s | 0.0000027459 s | 1.00 |
| scatter_sum / IDefOpt / tpu / BothRev | 0.000002708975 s | 0.0000026999 s | 1.00 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000016014 s | 0.000008399699972869711 s | 1.91 |
| scatter_sum / Jax / cpu / Primal | 0.000015633 s | 0.00000856758005284064 s | 1.82 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000015776 s | 0.000009258660020350362 s | 1.70 |
| scatter_sum / PartOpt / cpu / Primal | 0.000015348999999999998 s | 0.000009661359954407087 s | 1.59 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000015882 s | 0.000009734099976412837 s | 1.63 |
| scatter_sum / DefOpt / cpu / Primal | 0.000015362 s | 0.000009198239977195045 s | 1.67 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000015368 s | 0.000008779059990047244 s | 1.75 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000023089 s | 0.000013061339986961684 s | 1.77 |
| scatter_sum / Jax / cpu / Forward | 0.000021899 s | 0.000012101459969926508 s | 1.81 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000022793 s | 0.000012741700002152357 s | 1.79 |
| scatter_sum / PartOpt / cpu / Forward | 0.00002255 s | 0.000012532019991340348 s | 1.80 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000022536 s | 0.0000133172999630915 s | 1.69 |
| scatter_sum / DefOpt / cpu / Forward | 0.000023299 s | 0.00001335968000603316 s | 1.74 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000022733 s | 0.000013295580029080156 s | 1.71 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000023226 s | 0.000012938739982928384 s | 1.80 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000023223 s | 0.00001292862001719186 s | 1.80 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000022982 s | 0.000012804760017388616 s | 1.79 |
| scatter_sum / Jax / cpu / BothRev | 0.000023862000000000003 s | 0.000012629419989025336 s | 1.89 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000022262 s | 0.000013125519999448442 s | 1.70 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000023463 s | 0.000014698720024171053 s | 1.60 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000023006 s | 0.000013126020039635478 s | 1.75 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000023002 s | 0.000013033599962000154 s | 1.76 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000022932 s | 0.000012915639981656568 s | 1.78 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000022635 s | 0.000012708979975286638 s | 1.78 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000023855 s | 0.000012781279965565772 s | 1.87 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000022158 s | 0.000013472300042849384 s | 1.64 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.000023493 s | 0.000012834419976570643 s | 1.83 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000022268 s | 0.000013145060020178787 s | 1.69 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000022904 s | 0.00001275038002859219 s | 1.80 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000023208 s | 0.000012998379997952724 s | 1.79 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000022924 s | 0.000013091199998598312 s | 1.75 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000023148 s | 0.00001284074000068358 s | 1.80 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.000022634 s | 0.000012844779957958964 s | 1.76 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000011 s | 0.000008399699972869711 s | 1.31 |
| scatter_sum / Jax / cpu / Primal | 0.000011 s | 0.00000856758005284064 s | 1.28 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000011 s | 0.000009258660020350362 s | 1.19 |
| scatter_sum / PartOpt / cpu / Primal | 0.000011 s | 0.000009661359954407087 s | 1.14 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000011 s | 0.000009734099976412837 s | 1.13 |
| scatter_sum / DefOpt / cpu / Primal | 0.000011 s | 0.000009198239977195045 s | 1.20 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000011 s | 0.000008779059990047244 s | 1.25 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000017 s | 0.000013061339986961684 s | 1.30 |
| scatter_sum / Jax / cpu / Forward | 0.000016 s | 0.000012101459969926508 s | 1.32 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000017 s | 0.000012741700002152357 s | 1.33 |
| scatter_sum / PartOpt / cpu / Forward | 0.000017 s | 0.000012532019991340348 s | 1.36 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000016 s | 0.0000133172999630915 s | 1.20 |
| scatter_sum / DefOpt / cpu / Forward | 0.000017999999999999997 s | 0.00001335968000603316 s | 1.35 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000017 s | 0.000013295580029080156 s | 1.28 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000017 s | 0.000012938739982928384 s | 1.31 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000017 s | 0.00001292862001719186 s | 1.31 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000017 s | 0.000012804760017388616 s | 1.33 |
| scatter_sum / Jax / cpu / BothRev | 0.000016 s | 0.000012629419989025336 s | 1.27 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000017 s | 0.000013125519999448442 s | 1.30 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000017 s | 0.000014698720024171053 s | 1.16 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000017 s | 0.000013126020039635478 s | 1.30 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000016 s | 0.000013033599962000154 s | 1.23 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000017 s | 0.000012915639981656568 s | 1.32 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000017 s | 0.000012708979975286638 s | 1.34 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000016 s | 0.000012781279965565772 s | 1.25 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000017999999999999997 s | 0.000013472300042849384 s | 1.34 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.000017 s | 0.000012834419976570643 s | 1.32 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000017 s | 0.000013145060020178787 s | 1.29 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000017 s | 0.00001275038002859219 s | 1.33 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000017 s | 0.000012998379997952724 s | 1.31 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000017 s | 0.000013091199998598312 s | 1.30 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000017 s | 0.00001284074000068358 s | 1.32 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.000017 s | 0.000012844779957958964 s | 1.32 |
| slicing / JaXPipe / cpu / Primal | 0.000006097719990521 s | 0.000007288319993676851 s | 0.84 |
| slicing / Jax / cpu / Primal | 0.0000060572400047931295 s | 0.000006467119974331581 s | 0.94 |
| slicing / HLOOpt / cpu / Primal | 0.000006739899999956833 s | 0.000007340799984376645 s | 0.92 |
| slicing / PartOpt / cpu / Primal | 0.0000061114599975553575 s | 0.000006723080005031079 s | 0.91 |
| slicing / IPartOpt / cpu / Primal | 0.000006765560005987936 s | 0.00000703373997566814 s | 0.96 |
| slicing / DefOpt / cpu / Primal | 0.000006124279998402926 s | 0.000006649759980064118 s | 0.92 |
| slicing / IDefOpt / cpu / Primal | 0.000006104960007178306 s | 0.000006718660024489509 s | 0.91 |
| slicing / JaXPipe / cpu / Forward | 0.000009634919995278325 s | 0.000010664820019883336 s | 0.90 |
| slicing / Jax / cpu / Forward | 0.000008973759995569709 s | 0.00001085290004084527 s | 0.83 |
| slicing / HLOOpt / cpu / Forward | 0.000009343960002752282 s | 0.000010787459987113834 s | 0.87 |
| slicing / PartOpt / cpu / Forward | 0.000009704120000151306 s | 0.000009920160018737078 s | 0.98 |
| slicing / IPartOpt / cpu / Forward | 0.000009749359990109953 s | 0.00001104420003684936 s | 0.88 |
| slicing / DefOpt / cpu / Forward | 0.000009281860002374742 s | 0.00001050316004693741 s | 0.88 |
| slicing / IDefOpt / cpu / Forward | 0.00000926456001252518 s | 0.00001045673998305574 s | 0.89 |
| slicing / JaXPipe / cpu / PreRev | 0.00001004898000473986 s | 0.00001086409999516036 s | 0.92 |
| slicing / JaXPipe / cpu / PostRev | 0.000010457239995957937 s | 0.000011132019972137642 s | 0.94 |
| slicing / JaXPipe / cpu / BothRev | 0.00000995396000917026 s | 0.000010977259998981026 s | 0.91 |
| slicing / Jax / cpu / BothRev | 0.000009851759987213882 s | 0.00001058601999829989 s | 0.93 |
| slicing / HLOOpt / cpu / PreRev | 0.000010423400010495242 s | 0.000011480219991426563 s | 0.91 |
| slicing / HLOOpt / cpu / PostRev | 0.000011904919990683993 s | 0.000013056880015938076 s | 0.91 |
| slicing / HLOOpt / cpu / BothRev | 0.00001018004000570727 s | 0.000011306099986541084 s | 0.90 |
| slicing / PartOpt / cpu / PreRev | 0.00000960710000299514 s | 0.000011312520027786376 s | 0.85 |
| slicing / PartOpt / cpu / PostRev | 0.000009860620002655196 s | 0.000010987319992636912 s | 0.90 |
| slicing / PartOpt / cpu / BothRev | 0.00001033418000133679 s | 0.000011153960022056708 s | 0.93 |
| slicing / IPartOpt / cpu / PreRev | 0.00001048506000870475 s | 0.000010705020013119791 s | 0.98 |
| slicing / IPartOpt / cpu / PostRev | 0.00000961794000431837 s | 0.000010705700015023467 s | 0.90 |
| slicing / IPartOpt / cpu / BothRev | 0.000010177359995395818 s | 0.000010749940001915092 s | 0.95 |
| slicing / DefOpt / cpu / PreRev | 0.000010347780003030494 s | 0.000011468539933048303 s | 0.90 |
| slicing / DefOpt / cpu / PostRev | 0.000009722579982280876 s | 0.000010730119965955964 s | 0.91 |
| slicing / DefOpt / cpu / BothRev | 0.000009908540002925292 s | 0.000011325499990562091 s | 0.87 |
| slicing / IDefOpt / cpu / PreRev | 0.000010187980005866848 s | 0.000010818559994731911 s | 0.94 |
| slicing / IDefOpt / cpu / PostRev | 0.000010396719994787418 s | 0.000011040860026696464 s | 0.94 |
| slicing / IDefOpt / cpu / BothRev | 0.000010105839994594134 s | 0.00001058544001352857 s | 0.95 |
| slicing / JaXPipe / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / Jax / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / HLOOpt / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / PartOpt / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / IPartOpt / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / DefOpt / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / IDefOpt / cuda / Primal | 0.000002271 s | 0.000001887 s | 1.20 |
| slicing / JaXPipe / cuda / Forward | 0.0000104 s | 0.000010112 s | 1.03 |
| slicing / Jax / cuda / Forward | 0.000010496 s | 0.000009664 s | 1.09 |
| slicing / HLOOpt / cuda / Forward | 0.00001008 s | 0.000009728 s | 1.04 |
| slicing / PartOpt / cuda / Forward | 0.000010144 s | 0.000009312000000000002 s | 1.09 |
| slicing / IPartOpt / cuda / Forward | 0.000010528 s | 0.000009792 s | 1.08 |
| slicing / DefOpt / cuda / Forward | 0.000010528 s | 0.000010145 s | 1.04 |
| slicing / IDefOpt / cuda / Forward | 0.000010752 s | 0.000010048 s | 1.07 |
| slicing / JaXPipe / cuda / PreRev | 0.0000104 s | 0.000011232 s | 0.93 |
| slicing / JaXPipe / cuda / PostRev | 0.000010239 s | 0.000010976 s | 0.93 |
| slicing / JaXPipe / cuda / BothRev | 0.00001024 s | 0.00001008 s | 1.02 |
| slicing / Jax / cuda / BothRev | 0.000010431 s | 0.000009952 s | 1.05 |
| slicing / HLOOpt / cuda / PreRev | 0.000010624 s | 0.000009696 s | 1.10 |
| slicing / HLOOpt / cuda / PostRev | 0.00001008 s | 0.000009568 s | 1.05 |
| slicing / HLOOpt / cuda / BothRev | 0.000010496 s | 0.000009888 s | 1.06 |
| slicing / PartOpt / cuda / PreRev | 0.000010432 s | 0.0000096 s | 1.09 |
| slicing / PartOpt / cuda / PostRev | 0.000010112 s | 0.000009728 s | 1.04 |
| slicing / PartOpt / cuda / BothRev | 0.000010336 s | 0.000009983 s | 1.04 |
| slicing / IPartOpt / cuda / PreRev | 0.000010465 s | 0.000011137 s | 0.94 |
| slicing / IPartOpt / cuda / PostRev | 0.000010176 s | 0.000009568 s | 1.06 |
| slicing / IPartOpt / cuda / BothRev | 0.000010624 s | 0.000009953 s | 1.07 |
| slicing / DefOpt / cuda / PreRev | 0.000010368 s | 0.00000976 s | 1.06 |
| slicing / DefOpt / cuda / PostRev | 0.000010304 s | 0.000009633 s | 1.07 |
| slicing / DefOpt / cuda / BothRev | 0.000010176 s | 0.000009824 s | 1.04 |
| slicing / IDefOpt / cuda / PreRev | 0.000010496 s | 0.000009569 s | 1.10 |
| slicing / IDefOpt / cuda / PostRev | 0.00001024 s | 0.000009856 s | 1.04 |
| slicing / IDefOpt / cuda / BothRev | 0.000010368 s | 0.000009568 s | 1.08 |
| slicing / JaXPipe / tpu / Primal | 0.0000010238 s | 9.63975e-7 s | 1.06 |
| slicing / Jax / tpu / Primal | 9.70575e-7 s | 9.6105e-7 s | 1.01 |
| slicing / HLOOpt / tpu / Primal | 0.00000102085 s | 9.61225e-7 s | 1.06 |
| slicing / PartOpt / tpu / Primal | 9.708e-7 s | 9.65325e-7 s | 1.01 |
| slicing / IPartOpt / tpu / Primal | 0.0000010202 s | 9.59025e-7 s | 1.06 |
| slicing / DefOpt / tpu / Primal | 9.667e-7 s | 9.6145e-7 s | 1.01 |
| slicing / IDefOpt / tpu / Primal | 0.0000010205 s | 9.6245e-7 s | 1.06 |
| slicing / JaXPipe / tpu / Forward | 0.0000014079 s | 0.000001406125 s | 1.00 |
| slicing / Jax / tpu / Forward | 0.000001476175 s | 0.000001412075 s | 1.05 |
| slicing / HLOOpt / tpu / Forward | 0.000001523725 s | 0.00000151405 s | 1.01 |
| slicing / PartOpt / tpu / Forward | 0.0000014935249999999998 s | 0.0000014341999999999998 s | 1.04 |
| slicing / IPartOpt / tpu / Forward | 0.00000151945 s | 0.000001513225 s | 1.00 |
| slicing / DefOpt / tpu / Forward | 0.000001493175 s | 0.000001437075 s | 1.04 |
| slicing / IDefOpt / tpu / Forward | 0.00000152025 s | 0.00000151655 s | 1.00 |
| slicing / JaXPipe / tpu / PreRev | 0.0000025657249999999995 s | 0.00000238275 s | 1.08 |
| slicing / JaXPipe / tpu / PostRev | 0.000002519 s | 0.0000025262 s | 1.00 |
| slicing / JaXPipe / tpu / BothRev | 0.000002586325 s | 0.0000023967250000000003 s | 1.08 |
| slicing / Jax / tpu / BothRev | 0.000002534525 s | 0.000002552125 s | 0.99 |
| slicing / HLOOpt / tpu / PreRev | 0.000002583475 s | 0.00000239915 s | 1.08 |
| slicing / HLOOpt / tpu / PostRev | 0.0000025418000000000004 s | 0.000002548575 s | 1.00 |
| slicing / HLOOpt / tpu / BothRev | 0.00000259175 s | 0.0000024025000000000003 s | 1.08 |
| slicing / PartOpt / tpu / PreRev | 0.00000253425 s | 0.00000254185 s | 1.00 |
| slicing / PartOpt / tpu / PostRev | 0.000002579575 s | 0.0000024003 s | 1.07 |
| slicing / PartOpt / tpu / BothRev | 0.000002551225 s | 0.0000025502 s | 1.00 |
| slicing / IPartOpt / tpu / PreRev | 0.0000025931749999999995 s | 0.0000023919000000000003 s | 1.08 |
| slicing / IPartOpt / tpu / PostRev | 0.0000025407000000000005 s | 0.0000025419750000000003 s | 1.00 |
| slicing / IPartOpt / tpu / BothRev | 0.0000025882 s | 0.000002392475 s | 1.08 |
| slicing / DefOpt / tpu / PreRev | 0.00000253735 s | 0.0000025409 s | 1.00 |
| slicing / DefOpt / tpu / PostRev | 0.0000025874 s | 0.00000239555 s | 1.08 |
| slicing / DefOpt / tpu / BothRev | 0.0000025396 s | 0.0000025384750000000003 s | 1.00 |
| slicing / IDefOpt / tpu / PreRev | 0.000002587025 s | 0.0000023991 s | 1.08 |
| slicing / IDefOpt / tpu / PostRev | 0.0000025399 s | 0.000002548875 s | 1.00 |
| slicing / IDefOpt / tpu / BothRev | 0.000002591725 s | 0.00000240065 s | 1.08 |
| slicing / JaXPipe / cpu / Primal | 0.000012678 s | 0.000007288319993676851 s | 1.74 |
| slicing / Jax / cpu / Primal | 0.000012606 s | 0.000006467119974331581 s | 1.95 |
| slicing / HLOOpt / cpu / Primal | 0.000012675 s | 0.000007340799984376645 s | 1.73 |
| slicing / PartOpt / cpu / Primal | 0.000012441 s | 0.000006723080005031079 s | 1.85 |
| slicing / IPartOpt / cpu / Primal | 0.000012469 s | 0.00000703373997566814 s | 1.77 |
| slicing / DefOpt / cpu / Primal | 0.000012446 s | 0.000006649759980064118 s | 1.87 |
| slicing / IDefOpt / cpu / Primal | 0.000012437 s | 0.000006718660024489509 s | 1.85 |
| slicing / JaXPipe / cpu / Forward | 0.000017205999999999998 s | 0.000010664820019883336 s | 1.61 |
| slicing / Jax / cpu / Forward | 0.000016621 s | 0.00001085290004084527 s | 1.53 |
| slicing / HLOOpt / cpu / Forward | 0.000016353 s | 0.000010787459987113834 s | 1.52 |
| slicing / PartOpt / cpu / Forward | 0.000016637 s | 0.000009920160018737078 s | 1.68 |
| slicing / IPartOpt / cpu / Forward | 0.000017094999999999998 s | 0.00001104420003684936 s | 1.55 |
| slicing / DefOpt / cpu / Forward | 0.00001681 s | 0.00001050316004693741 s | 1.60 |
| slicing / IDefOpt / cpu / Forward | 0.000016638 s | 0.00001045673998305574 s | 1.59 |
| slicing / JaXPipe / cpu / PreRev | 0.000017786 s | 0.00001086409999516036 s | 1.64 |
| slicing / JaXPipe / cpu / PostRev | 0.000017364000000000002 s | 0.000011132019972137642 s | 1.56 |
| slicing / JaXPipe / cpu / BothRev | 0.00001788 s | 0.000010977259998981026 s | 1.63 |
| slicing / Jax / cpu / BothRev | 0.000017517999999999997 s | 0.00001058601999829989 s | 1.65 |
| slicing / HLOOpt / cpu / PreRev | 0.000017375999999999998 s | 0.000011480219991426563 s | 1.51 |
| slicing / HLOOpt / cpu / PostRev | 0.000017748 s | 0.000013056880015938076 s | 1.36 |
| slicing / HLOOpt / cpu / BothRev | 0.000017328 s | 0.000011306099986541084 s | 1.53 |
| slicing / PartOpt / cpu / PreRev | 0.000017893 s | 0.000011312520027786376 s | 1.58 |
| slicing / PartOpt / cpu / PostRev | 0.000017467 s | 0.000010987319992636912 s | 1.59 |
| slicing / PartOpt / cpu / BothRev | 0.000017134 s | 0.000011153960022056708 s | 1.54 |
| slicing / IPartOpt / cpu / PreRev | 0.000017207 s | 0.000010705020013119791 s | 1.61 |
| slicing / IPartOpt / cpu / PostRev | 0.000017069000000000002 s | 0.000010705700015023467 s | 1.59 |
| slicing / IPartOpt / cpu / BothRev | 0.000017336 s | 0.000010749940001915092 s | 1.61 |
| slicing / DefOpt / cpu / PreRev | 0.000017749999999999998 s | 0.000011468539933048303 s | 1.55 |
| slicing / DefOpt / cpu / PostRev | 0.000016863 s | 0.000010730119965955964 s | 1.57 |
| slicing / DefOpt / cpu / BothRev | 0.000017564 s | 0.000011325499990562091 s | 1.55 |
| slicing / IDefOpt / cpu / PreRev | 0.000016996 s | 0.000010818559994731911 s | 1.57 |
| slicing / IDefOpt / cpu / PostRev | 0.000016916000000000002 s | 0.000011040860026696464 s | 1.53 |
| slicing / IDefOpt / cpu / BothRev | 0.000016876 s | 0.00001058544001352857 s | 1.59 |
| slicing / JaXPipe / cpu / Primal | 0.000008 s | 0.000007288319993676851 s | 1.10 |
| slicing / Jax / cpu / Primal | 0.000008999999999999999 s | 0.000006467119974331581 s | 1.39 |
| slicing / HLOOpt / cpu / Primal | 0.000008999999999999999 s | 0.000007340799984376645 s | 1.23 |
| slicing / PartOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006723080005031079 s | 1.34 |
| slicing / IPartOpt / cpu / Primal | 0.000008999999999999999 s | 0.00000703373997566814 s | 1.28 |
| slicing / DefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006649759980064118 s | 1.35 |
| slicing / IDefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006718660024489509 s | 1.34 |
| slicing / JaXPipe / cpu / Forward | 0.000012 s | 0.000010664820019883336 s | 1.13 |
| slicing / Jax / cpu / Forward | 0.000012 s | 0.00001085290004084527 s | 1.11 |
| slicing / HLOOpt / cpu / Forward | 0.000012 s | 0.000010787459987113834 s | 1.11 |
| slicing / PartOpt / cpu / Forward | 0.000012 s | 0.000009920160018737078 s | 1.21 |
| slicing / IPartOpt / cpu / Forward | 0.000012 s | 0.00001104420003684936 s | 1.09 |
| slicing / DefOpt / cpu / Forward | 0.000013 s | 0.00001050316004693741 s | 1.24 |
| slicing / IDefOpt / cpu / Forward | 0.000012 s | 0.00001045673998305574 s | 1.15 |
| slicing / JaXPipe / cpu / PreRev | 0.000012 s | 0.00001086409999516036 s | 1.10 |
| slicing / JaXPipe / cpu / PostRev | 0.000013 s | 0.000011132019972137642 s | 1.17 |
| slicing / JaXPipe / cpu / BothRev | 0.000012 s | 0.000010977259998981026 s | 1.09 |
| slicing / Jax / cpu / BothRev | 0.000013 s | 0.00001058601999829989 s | 1.23 |
| slicing / HLOOpt / cpu / PreRev | 0.000012 s | 0.000011480219991426563 s | 1.05 |
| slicing / HLOOpt / cpu / PostRev | 0.000013 s | 0.000013056880015938076 s | 1.00 |
| slicing / HLOOpt / cpu / BothRev | 0.000012 s | 0.000011306099986541084 s | 1.06 |
| slicing / PartOpt / cpu / PreRev | 0.000012 s | 0.000011312520027786376 s | 1.06 |
| slicing / PartOpt / cpu / PostRev | 0.000012 s | 0.000010987319992636912 s | 1.09 |
| slicing / PartOpt / cpu / BothRev | 0.000013 s | 0.000011153960022056708 s | 1.17 |
| slicing / IPartOpt / cpu / PreRev | 0.000012 s | 0.000010705020013119791 s | 1.12 |
| slicing / IPartOpt / cpu / PostRev | 0.000012 s | 0.000010705700015023467 s | 1.12 |
| slicing / IPartOpt / cpu / BothRev | 0.000012 s | 0.000010749940001915092 s | 1.12 |
| slicing / DefOpt / cpu / PreRev | 0.000012 s | 0.000011468539933048303 s | 1.05 |
| slicing / DefOpt / cpu / PostRev | 0.000013 s | 0.000010730119965955964 s | 1.21 |
| slicing / DefOpt / cpu / BothRev | 0.000013 s | 0.000011325499990562091 s | 1.15 |
| slicing / IDefOpt / cpu / PreRev | 0.000012 s | 0.000010818559994731911 s | 1.11 |
| slicing / IDefOpt / cpu / PostRev | 0.000013 s | 0.000011040860026696464 s | 1.18 |
| slicing / IDefOpt / cpu / BothRev | 0.000012 s | 0.00001058544001352857 s | 1.13 |
| sum / JaXPipe / cpu / Primal | 0.000007667779993880685 s | 0.000008742739983063074 s | 0.88 |
| sum / Jax / cpu / Primal | 0.000007404020004742052 s | 0.000008432739987256355 s | 0.88 |
| sum / HLOOpt / cpu / Primal | 0.000007994480006345838 s | 0.00000868413997523021 s | 0.92 |
| sum / PartOpt / cpu / Primal | 0.00000745573999665794 s | 0.000008593839993409347 s | 0.87 |
| sum / IPartOpt / cpu / Primal | 0.000008130819999223604 s | 0.000008873579981809598 s | 0.92 |
| sum / DefOpt / cpu / Primal | 0.000007747900001504604 s | 0.000008341580014530337 s | 0.93 |
| sum / IDefOpt / cpu / Primal | 0.000008481119998577924 s | 0.000008160259958458483 s | 1.04 |
| sum / JaXPipe / cpu / Forward | 0.00001121236000699355 s | 0.000012359000020296664 s | 0.91 |
| sum / Jax / cpu / Forward | 0.00001148140000168496 s | 0.000012516539954958715 s | 0.92 |
| sum / HLOOpt / cpu / Forward | 0.00001145546000543618 s | 0.000012630199980776524 s | 0.91 |
| sum / PartOpt / cpu / Forward | 0.000011307780007427936 s | 0.000012017819990433054 s | 0.94 |
| sum / IPartOpt / cpu / Forward | 0.000011516060003486928 s | 0.0000124326400236896 s | 0.93 |
| sum / DefOpt / cpu / Forward | 0.000010961339992263674 s | 0.000012678339962803876 s | 0.86 |
| sum / IDefOpt / cpu / Forward | 0.000011057339995659276 s | 0.00001217218003148446 s | 0.91 |
| sum / JaXPipe / cpu / PreRev | 0.000011736080002719971 s | 0.000011711759980244096 s | 1.00 |
| sum / JaXPipe / cpu / PostRev | 0.000010845479998806697 s | 0.0000120298000274488 s | 0.90 |
| sum / JaXPipe / cpu / BothRev | 0.000010823559985055908 s | 0.000012270700017324998 s | 0.88 |
| sum / Jax / cpu / BothRev | 0.000011136799996620538 s | 0.000012310340025578626 s | 0.90 |
| sum / HLOOpt / cpu / PreRev | 0.000011556560000371974 s | 0.000012216640025144444 s | 0.95 |
| sum / HLOOpt / cpu / PostRev | 0.000012639439980830502 s | 0.000013434619959298288 s | 0.94 |
| sum / HLOOpt / cpu / BothRev | 0.000010607400010940182 s | 0.000012282459956622916 s | 0.86 |
| sum / PartOpt / cpu / PreRev | 0.00001056131999803256 s | 0.000012226860008013318 s | 0.86 |
| sum / PartOpt / cpu / PostRev | 0.000011329480000767945 s | 0.000011789480022343924 s | 0.96 |
| sum / PartOpt / cpu / BothRev | 0.0000106691000041792 s | 0.000012224460024299332 s | 0.87 |
| sum / IPartOpt / cpu / PreRev | 0.00001063957999804188 s | 0.000011990420025540516 s | 0.89 |
| sum / IPartOpt / cpu / PostRev | 0.000011201639999853796 s | 0.000012199719985801494 s | 0.92 |
| sum / IPartOpt / cpu / BothRev | 0.00001108305999878212 s | 0.000011894060035047004 s | 0.93 |
| sum / DefOpt / cpu / PreRev | 0.00001142704000358208 s | 0.000012119699986214985 s | 0.94 |
| sum / DefOpt / cpu / PostRev | 0.000010642900003858812 s | 0.00001206306002131896 s | 0.88 |
| sum / DefOpt / cpu / BothRev | 0.00001090210000256775 s | 0.00001233915999364399 s | 0.88 |
| sum / IDefOpt / cpu / PreRev | 0.000011338460003571529 s | 0.000011822360029327683 s | 0.96 |
| sum / IDefOpt / cpu / PostRev | 0.000010959799999454844 s | 0.00001169123997897259 s | 0.94 |
| sum / IDefOpt / cpu / BothRev | 0.000010769340003662364 s | 0.000011358259980625007 s | 0.95 |
| sum / JaXPipe / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / Jax / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / HLOOpt / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / PartOpt / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / IPartOpt / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / DefOpt / cuda / Primal | 0.000002432 s | 0.000002048 s | 1.19 |
| sum / IDefOpt / cuda / Primal | 0.000002463 s | 0.000002047 s | 1.20 |
| sum / JaXPipe / cuda / Forward | 0.000010528 s | 0.000010303 s | 1.02 |
| sum / Jax / cuda / Forward | 0.000011008 s | 0.000009952 s | 1.11 |
| sum / HLOOpt / cuda / Forward | 0.000010656 s | 0.000009728 s | 1.10 |
| sum / PartOpt / cuda / Forward | 0.000010848 s | 0.000009984 s | 1.09 |
| sum / IPartOpt / cuda / Forward | 0.000010752 s | 0.000009536 s | 1.13 |
| sum / DefOpt / cuda / Forward | 0.000010719 s | 0.000009824 s | 1.09 |
| sum / IDefOpt / cuda / Forward | 0.0000112 s | 0.00001008 s | 1.11 |
| sum / JaXPipe / cuda / PreRev | 0.000010144 s | 0.000009632 s | 1.05 |
| sum / JaXPipe / cuda / PostRev | 0.000010143 s | 0.000009951 s | 1.02 |
| sum / JaXPipe / cuda / BothRev | 0.000010336 s | 0.000009792 s | 1.06 |
| sum / Jax / cuda / BothRev | 0.000010272 s | 0.000010144 s | 1.01 |
| sum / HLOOpt / cuda / PreRev | 0.000010656 s | 0.000009696 s | 1.10 |
| sum / HLOOpt / cuda / PostRev | 0.000009952 s | 0.00000944 s | 1.05 |
| sum / HLOOpt / cuda / BothRev | 0.000010113 s | 0.00000976 s | 1.04 |
| sum / PartOpt / cuda / PreRev | 0.000010496 s | 0.000009824 s | 1.07 |
| sum / PartOpt / cuda / PostRev | 0.000010017 s | 0.00001024 s | 0.98 |
| sum / PartOpt / cuda / BothRev | 0.000010111 s | 0.000009696 s | 1.04 |
| sum / IPartOpt / cuda / PreRev | 0.000012704 s | 0.000009824 s | 1.29 |
| sum / IPartOpt / cuda / PostRev | 0.000010176 s | 0.000009984 s | 1.02 |
| sum / IPartOpt / cuda / BothRev | 0.000010112 s | 0.000010047 s | 1.01 |
| sum / DefOpt / cuda / PreRev | 0.000010304 s | 0.000010144 s | 1.02 |
| sum / DefOpt / cuda / PostRev | 0.000010304 s | 0.000009985 s | 1.03 |
| sum / DefOpt / cuda / BothRev | 0.000010848 s | 0.00000992 s | 1.09 |
| sum / IDefOpt / cuda / PreRev | 0.00001088 s | 0.000009824 s | 1.11 |
| sum / IDefOpt / cuda / PostRev | 0.000010175 s | 0.00000976 s | 1.04 |
| sum / IDefOpt / cuda / BothRev | 0.000010656 s | 0.000009856 s | 1.08 |
| sum / JaXPipe / tpu / Primal | 5.104499999999999e-7 s | 5.1025e-7 s | 1.00 |
| sum / Jax / tpu / Primal | 5.473999999999999e-7 s | 5.4695e-7 s | 1.00 |
| sum / HLOOpt / tpu / Primal | 5.10725e-7 s | 5.10625e-7 s | 1.00 |
| sum / PartOpt / tpu / Primal | 5.46975e-7 s | 5.471999999999999e-7 s | 1.00 |
| sum / IPartOpt / tpu / Primal | 5.10175e-7 s | 5.102750000000001e-7 s | 1.00 |
| sum / DefOpt / tpu / Primal | 5.4705e-7 s | 5.470250000000001e-7 s | 1.00 |
| sum / IDefOpt / tpu / Primal | 5.10375e-7 s | 5.104e-7 s | 1.00 |
| sum / JaXPipe / tpu / Forward | 0.000001548525 s | 0.000001546225 s | 1.00 |
| sum / Jax / tpu / Forward | 0.000001498975 s | 0.000001498375 s | 1.00 |
| sum / HLOOpt / tpu / Forward | 0.0000015294749999999997 s | 0.000001529575 s | 1.00 |
| sum / PartOpt / tpu / Forward | 0.00000149805 s | 0.000001495975 s | 1.00 |
| sum / IPartOpt / tpu / Forward | 0.000001543575 s | 0.0000015346499999999998 s | 1.01 |
| sum / DefOpt / tpu / Forward | 0.00000149725 s | 0.0000014994 s | 1.00 |
| sum / IDefOpt / tpu / Forward | 0.0000015303249999999998 s | 0.000001527925 s | 1.00 |
| sum / JaXPipe / tpu / PreRev | 0.0000010508000000000002 s | 0.0000010069500000000002 s | 1.04 |
| sum / JaXPipe / tpu / PostRev | 0.000001090075 s | 0.00000103485 s | 1.05 |
| sum / JaXPipe / tpu / BothRev | 0.000001049025 s | 0.000001003175 s | 1.05 |
| sum / Jax / tpu / BothRev | 0.000001087375 s | 0.000001038925 s | 1.05 |
| sum / HLOOpt / tpu / PreRev | 0.000001047625 s | 0.0000010023 s | 1.05 |
| sum / HLOOpt / tpu / PostRev | 0.000001087525 s | 0.000001040175 s | 1.05 |
| sum / HLOOpt / tpu / BothRev | 0.000001051725 s | 0.0000010055000000000002 s | 1.05 |
| sum / PartOpt / tpu / PreRev | 0.000001090325 s | 0.0000010352 s | 1.05 |
| sum / PartOpt / tpu / PostRev | 0.0000010508000000000002 s | 9.97725e-7 s | 1.05 |
| sum / PartOpt / tpu / BothRev | 0.0000010896250000000002 s | 0.000001040425 s | 1.05 |
| sum / IPartOpt / tpu / PreRev | 0.000001057475 s | 0.000001005175 s | 1.05 |
| sum / IPartOpt / tpu / PostRev | 0.00000109585 s | 0.00000103545 s | 1.06 |
| sum / IPartOpt / tpu / BothRev | 0.00000105905 s | 0.0000010101 s | 1.05 |
| sum / DefOpt / tpu / PreRev | 0.0000011000749999999998 s | 0.000001034625 s | 1.06 |
| sum / DefOpt / tpu / PostRev | 0.0000010545749999999998 s | 9.9925e-7 s | 1.06 |
| sum / DefOpt / tpu / BothRev | 0.000001083125 s | 0.000001041325 s | 1.04 |
| sum / IDefOpt / tpu / PreRev | 0.00000104795 s | 0.000001001975 s | 1.05 |
| sum / IDefOpt / tpu / PostRev | 0.0000010861 s | 0.0000010411 s | 1.04 |
| sum / IDefOpt / tpu / BothRev | 0.000001046525 s | 0.00000100035 s | 1.05 |
| sum / JaXPipe / cpu / Primal | 0.000014868 s | 0.000008742739983063074 s | 1.70 |
| sum / Jax / cpu / Primal | 0.00001457 s | 0.000008432739987256355 s | 1.73 |
| sum / HLOOpt / cpu / Primal | 0.000014701 s | 0.00000868413997523021 s | 1.69 |
| sum / PartOpt / cpu / Primal | 0.000014327 s | 0.000008593839993409347 s | 1.67 |
| sum / IPartOpt / cpu / Primal | 0.000014848 s | 0.000008873579981809598 s | 1.67 |
| sum / DefOpt / cpu / Primal | 0.000014684 s | 0.000008341580014530337 s | 1.76 |
| sum / IDefOpt / cpu / Primal | 0.000014412 s | 0.000008160259958458483 s | 1.77 |
| sum / JaXPipe / cpu / Forward | 0.00002015 s | 0.000012359000020296664 s | 1.63 |
| sum / Jax / cpu / Forward | 0.000020038 s | 0.000012516539954958715 s | 1.60 |
| sum / HLOOpt / cpu / Forward | 0.000020079 s | 0.000012630199980776524 s | 1.59 |
| sum / PartOpt / cpu / Forward | 0.000020276 s | 0.000012017819990433054 s | 1.69 |
| sum / IPartOpt / cpu / Forward | 0.000019812 s | 0.0000124326400236896 s | 1.59 |
| sum / DefOpt / cpu / Forward | 0.000019123 s | 0.000012678339962803876 s | 1.51 |
| sum / IDefOpt / cpu / Forward | 0.000019551 s | 0.00001217218003148446 s | 1.61 |
| sum / JaXPipe / cpu / PreRev | 0.000018633 s | 0.000011711759980244096 s | 1.59 |
| sum / JaXPipe / cpu / PostRev | 0.000018572 s | 0.0000120298000274488 s | 1.54 |
| sum / JaXPipe / cpu / BothRev | 0.000018265 s | 0.000012270700017324998 s | 1.49 |
| sum / Jax / cpu / BothRev | 0.000018417 s | 0.000012310340025578626 s | 1.50 |
| sum / HLOOpt / cpu / PreRev | 0.00001839 s | 0.000012216640025144444 s | 1.51 |
| sum / HLOOpt / cpu / PostRev | 0.000018787 s | 0.000013434619959298288 s | 1.40 |
| sum / HLOOpt / cpu / BothRev | 0.0000186 s | 0.000012282459956622916 s | 1.51 |
| sum / PartOpt / cpu / PreRev | 0.000018979 s | 0.000012226860008013318 s | 1.55 |
| sum / PartOpt / cpu / PostRev | 0.000018780000000000003 s | 0.000011789480022343924 s | 1.59 |
| sum / PartOpt / cpu / BothRev | 0.000018576 s | 0.000012224460024299332 s | 1.52 |
| sum / IPartOpt / cpu / PreRev | 0.000018944 s | 0.000011990420025540516 s | 1.58 |
| sum / IPartOpt / cpu / PostRev | 0.000018881 s | 0.000012199719985801494 s | 1.55 |
| sum / IPartOpt / cpu / BothRev | 0.000019244 s | 0.000011894060035047004 s | 1.62 |
| sum / DefOpt / cpu / PreRev | 0.000018787 s | 0.000012119699986214985 s | 1.55 |
| sum / DefOpt / cpu / PostRev | 0.000018475 s | 0.00001206306002131896 s | 1.53 |
| sum / DefOpt / cpu / BothRev | 0.000018859 s | 0.00001233915999364399 s | 1.53 |
| sum / IDefOpt / cpu / PreRev | 0.000018386 s | 0.000011822360029327683 s | 1.56 |
| sum / IDefOpt / cpu / PostRev | 0.000019177 s | 0.00001169123997897259 s | 1.64 |
| sum / IDefOpt / cpu / BothRev | 0.000019643 s | 0.000011358259980625007 s | 1.73 |
| sum / JaXPipe / cpu / Primal | 0.00001 s | 0.000008742739983063074 s | 1.14 |
| sum / Jax / cpu / Primal | 0.00001 s | 0.000008432739987256355 s | 1.19 |
| sum / HLOOpt / cpu / Primal | 0.000011 s | 0.00000868413997523021 s | 1.27 |
| sum / PartOpt / cpu / Primal | 0.000011 s | 0.000008593839993409347 s | 1.28 |
| sum / IPartOpt / cpu / Primal | 0.00001 s | 0.000008873579981809598 s | 1.13 |
| sum / DefOpt / cpu / Primal | 0.00001 s | 0.000008341580014530337 s | 1.20 |
| sum / IDefOpt / cpu / Primal | 0.000011 s | 0.000008160259958458483 s | 1.35 |
| sum / JaXPipe / cpu / Forward | 0.000014 s | 0.000012359000020296664 s | 1.13 |
| sum / Jax / cpu / Forward | 0.000015 s | 0.000012516539954958715 s | 1.20 |
| sum / HLOOpt / cpu / Forward | 0.000015 s | 0.000012630199980776524 s | 1.19 |
| sum / PartOpt / cpu / Forward | 0.000015 s | 0.000012017819990433054 s | 1.25 |
| sum / IPartOpt / cpu / Forward | 0.000015 s | 0.0000124326400236896 s | 1.21 |
| sum / DefOpt / cpu / Forward | 0.000014 s | 0.000012678339962803876 s | 1.10 |
| sum / IDefOpt / cpu / Forward | 0.000015 s | 0.00001217218003148446 s | 1.23 |
| sum / JaXPipe / cpu / PreRev | 0.000014 s | 0.000011711759980244096 s | 1.20 |
| sum / JaXPipe / cpu / PostRev | 0.000014 s | 0.0000120298000274488 s | 1.16 |
| sum / JaXPipe / cpu / BothRev | 0.000014 s | 0.000012270700017324998 s | 1.14 |
| sum / Jax / cpu / BothRev | 0.000013 s | 0.000012310340025578626 s | 1.06 |
| sum / HLOOpt / cpu / PreRev | 0.000014 s | 0.000012216640025144444 s | 1.15 |
| sum / HLOOpt / cpu / PostRev | 0.000014 s | 0.000013434619959298288 s | 1.04 |
| sum / HLOOpt / cpu / BothRev | 0.000014 s | 0.000012282459956622916 s | 1.14 |
| sum / PartOpt / cpu / PreRev | 0.000014 s | 0.000012226860008013318 s | 1.15 |
| sum / PartOpt / cpu / PostRev | 0.000014 s | 0.000011789480022343924 s | 1.19 |
| sum / PartOpt / cpu / BothRev | 0.000014 s | 0.000012224460024299332 s | 1.15 |
| sum / IPartOpt / cpu / PreRev | 0.000014 s | 0.000011990420025540516 s | 1.17 |
| sum / IPartOpt / cpu / PostRev | 0.000014 s | 0.000012199719985801494 s | 1.15 |
| sum / IPartOpt / cpu / BothRev | 0.000014 s | 0.000011894060035047004 s | 1.18 |
| sum / DefOpt / cpu / PreRev | 0.000014 s | 0.000012119699986214985 s | 1.16 |
| sum / DefOpt / cpu / PostRev | 0.000013 s | 0.00001206306002131896 s | 1.08 |
| sum / DefOpt / cpu / BothRev | 0.000013 s | 0.00001233915999364399 s | 1.05 |
| sum / IDefOpt / cpu / PreRev | 0.000014 s | 0.000011822360029327683 s | 1.18 |
| sum / IDefOpt / cpu / PostRev | 0.000014 s | 0.00001169123997897259 s | 1.20 |
| sum / IDefOpt / cpu / BothRev | 0.000014 s | 0.000011358259980625007 s | 1.23 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000014051159992050088 s | 0.000015538480001850986 s | 0.90 |
| value_and_grad / Jax / cpu / Primal | 0.0000131080800088057 s | 0.000015070379977260016 s | 0.87 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000013616859996545827 s | 0.000014645099963672691 s | 0.93 |
| value_and_grad / PartOpt / cpu / Primal | 0.000012969880003765866 s | 0.00001437700002497877 s | 0.90 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000013442959998428704 s | 0.000014697280030304682 s | 0.91 |
| value_and_grad / DefOpt / cpu / Primal | 0.00001310192000346433 s | 0.00001450112002203241 s | 0.90 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000013357660013753048 s | 0.000014610480011469918 s | 0.91 |
| value_and_grad / JaXPipe / cuda / Primal | 0.000033792000000000004 s | 0.000032352 s | 1.04 |
| value_and_grad / Jax / cuda / Primal | 0.000032864 s | 0.000032896000000000005 s | 1.00 |
| value_and_grad / HLOOpt / cuda / Primal | 0.000032800000000000004 s | 0.000032225 s | 1.02 |
| value_and_grad / PartOpt / cuda / Primal | 0.000033632 s | 0.000032385 s | 1.04 |
| value_and_grad / IPartOpt / cuda / Primal | 0.000033216 s | 0.000032672 s | 1.02 |
| value_and_grad / DefOpt / cuda / Primal | 0.000032864 s | 0.000032608 s | 1.01 |
| value_and_grad / IDefOpt / cuda / Primal | 0.000033632 s | 0.00003264 s | 1.03 |
| value_and_grad / JaXPipe / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / Jax / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / HLOOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / PartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IPartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / DefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IDefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000023356 s | 0.000015538480001850986 s | 1.50 |
| value_and_grad / Jax / cpu / Primal | 0.000022972 s | 0.000015070379977260016 s | 1.52 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000022655 s | 0.000014645099963672691 s | 1.55 |
| value_and_grad / PartOpt / cpu / Primal | 0.000022765 s | 0.00001437700002497877 s | 1.58 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000022622 s | 0.000014697280030304682 s | 1.54 |
| value_and_grad / DefOpt / cpu / Primal | 0.000022905 s | 0.00001450112002203241 s | 1.58 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000022756 s | 0.000014610480011469918 s | 1.56 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000017 s | 0.000015538480001850986 s | 1.09 |
| value_and_grad / Jax / cpu / Primal | 0.000017 s | 0.000015070379977260016 s | 1.13 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000017 s | 0.000014645099963672691 s | 1.16 |
| value_and_grad / PartOpt / cpu / Primal | 0.000016 s | 0.00001437700002497877 s | 1.11 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000017 s | 0.000014697280030304682 s | 1.16 |
| value_and_grad / DefOpt / cpu / Primal | 0.000016 s | 0.00001450112002203241 s | 1.10 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000017 s | 0.000014610480011469918 s | 1.16 |
| jaxmd20 / JaXPipe / cuda / Primal | 0.001414361 s | 0.001938565 s | 0.73 |
| jaxmd20 / Jax / cuda / Primal | 0.0014725379999999 s | 0.001496676 s | 0.98 |
| jaxmd20 / HLOOpt / cuda / Primal | 0.001638937 s | 0.0013105629999999 s | 1.25 |
| jaxmd20 / PartOpt / cuda / Primal | 0.0013497539999999 s | 0.001318019 s | 1.02 |
| jaxmd20 / IPartOpt / cuda / Primal | 0.001348603 s | 0.001323556 s | 1.02 |
| jaxmd20 / DefOpt / cuda / Primal | 0.000938235 s | 0.000916066 s | 1.02 |
| jaxmd20 / IDefOpt / cuda / Primal | 0.000966332 s | 0.000951714 s | 1.02 |
| jaxmd20 / JaXPipe / cuda / Forward | 0.001620091 s | 0.001571588 s | 1.03 |
| jaxmd20 / Jax / cuda / Forward | 0.001836441 s | 0.0017808059999999 s | 1.03 |
| jaxmd20 / HLOOpt / cuda / Forward | 0.0017070339999999 s | 0.001623371 s | 1.05 |
| jaxmd20 / PartOpt / cuda / Forward | 0.001706677 s | 0.001664068 s | 1.03 |
| jaxmd20 / IPartOpt / cuda / Forward | 0.0017145539999999 s | 0.001614468 s | 1.06 |
| jaxmd20 / DefOpt / cuda / Forward | 0.001716986 s | 0.001624164 s | 1.06 |
| jaxmd20 / IDefOpt / cuda / Forward | 0.0017166969999999 s | 0.0016119399999999 s | 1.06 |
| jaxmd20 / JaXPipe / cuda / PreRev | 0.002760886 s | 0.002690471 s | 1.03 |
| jaxmd20 / JaXPipe / cuda / PostRev | 0.0054248329999999 s | 0.005677523 s | 0.96 |
| jaxmd20 / JaXPipe / cuda / BothRev | 0.002759573 s | 0.002690601 s | 1.03 |
| jaxmd20 / Jax / cuda / BothRev | 0.005481326 s | 0.005316715 s | 1.03 |
| jaxmd20 / HLOOpt / cuda / PreRev | 0.0028702299999999 s | 0.002768517 s | 1.04 |
| jaxmd20 / HLOOpt / cuda / PostRev | 0.005503594 s | 0.005294383 s | 1.04 |
| jaxmd20 / HLOOpt / cuda / BothRev | 0.002807093 s | 0.002714063 s | 1.03 |
| jaxmd20 / PartOpt / cuda / PreRev | 0.002877429 s | 0.002801832 s | 1.03 |
| jaxmd20 / PartOpt / cuda / PostRev | 0.005634188 s | 0.005661502 s | 1.00 |
| jaxmd20 / PartOpt / cuda / BothRev | 0.002826196 s | 0.002745029 s | 1.03 |
| jaxmd20 / IPartOpt / cuda / PreRev | 0.002889332 s | 0.002794438 s | 1.03 |
| jaxmd20 / IPartOpt / cuda / PostRev | 0.0055886509999999 s | 0.0054231819999999 s | 1.03 |
| jaxmd20 / IPartOpt / cuda / BothRev | 0.002815703 s | 0.002751239 s | 1.02 |
| jaxmd20 / DefOpt / cuda / PreRev | 0.002926036 s | 0.002836042 s | 1.03 |
| jaxmd20 / DefOpt / cuda / PostRev | 0.002827063 s | 0.002758054 s | 1.03 |
| jaxmd20 / DefOpt / cuda / BothRev | 0.002809589 s | 0.0027710789999999 s | 1.01 |
| jaxmd20 / IDefOpt / cuda / PreRev | 0.002926324 s | 0.002802917 s | 1.04 |
| jaxmd20 / IDefOpt / cuda / PostRev | 0.00236485 s | 0.002298309 s | 1.03 |
| jaxmd20 / IDefOpt / cuda / BothRev | 0.0028344269999999 s | 0.0027711109999999 s | 1.02 |
| jaxmd20 / JaXPipe / tpu / Primal | 0.00928150875 s | 0.009277693125 s | 1.00 |
| jaxmd20 / Jax / tpu / Primal | 0.009281055625 s | 0.0092693106249999 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / Primal | 0.009154576875 s | 0.009168335 s | 1.00 |
| jaxmd20 / PartOpt / tpu / Primal | 0.0092015406249999 s | 0.0092010406249999 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / Primal | 0.00920164125 s | 0.009196949375 s | 1.00 |
| jaxmd20 / DefOpt / tpu / Primal | 0.0088016868749999 s | 0.008796551875 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / Primal | 0.00870373375 s | 0.008697765625 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / Forward | 0.017415764375 s | 0.017415084375 s | 1.00 |
| jaxmd20 / Jax / tpu / Forward | 0.018736701875 s | 0.018733859375 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / Forward | 0.01739822125 s | 0.017391626875 s | 1.00 |
| jaxmd20 / PartOpt / tpu / Forward | 0.017416905 s | 0.017410939375 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / Forward | 0.01741094 s | 0.017415836875 s | 1.00 |
| jaxmd20 / DefOpt / tpu / Forward | 0.017426603125 s | 0.017405771875 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / Forward | 0.01741189375 s | 0.0174111487499999 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / PreRev | 0.025457630625 s | 0.02547067625 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / PostRev | 0.02187902 s | 0.021862679375 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / BothRev | 0.0254868381249999 s | 0.025455455 s | 1.00 |
| jaxmd20 / Jax / tpu / BothRev | 0.021871143125 s | 0.021865015625 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / PreRev | 0.0255961368749999 s | 0.02556869625 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / PostRev | 0.020469085 s | 0.02080791125 s | 0.98 |
| jaxmd20 / HLOOpt / tpu / BothRev | 0.02570186125 s | 0.0256773987499999 s | 1.00 |
| jaxmd20 / PartOpt / tpu / PreRev | 0.025487459375 s | 0.025461623125 s | 1.00 |
| jaxmd20 / PartOpt / tpu / PostRev | 0.02152813875 s | 0.02152470625 s | 1.00 |
| jaxmd20 / PartOpt / tpu / BothRev | 0.025567345 s | 0.0255368143749999 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / PreRev | 0.025486080625 s | 0.025459676875 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / PostRev | 0.021519329375 s | 0.021515244375 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / BothRev | 0.025578833125 s | 0.02555456625 s | 1.00 |
| jaxmd20 / DefOpt / tpu / PreRev | 0.02548939125 s | 0.02545990625 s | 1.00 |
| jaxmd20 / DefOpt / tpu / PostRev | 0.018822576875 s | 0.018808618125 s | 1.00 |
| jaxmd20 / DefOpt / tpu / BothRev | 0.025574985 s | 0.0255338281249999 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / PreRev | 0.0254887824999999 s | 0.0254602499999999 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / PostRev | 0.01834349 s | 0.018301281875 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / BothRev | 0.0255843843749999 s | 0.02555139 s | 1.00 |
| jaxmd40 / JaXPipe / cpu / Primal | 0.0687999389999999 s | 0.0715142149999999 s | 0.96 |
| jaxmd40 / Jax / cpu / Primal | 0.0710809809999999 s | 0.071793129 s | 0.99 |
| jaxmd40 / HLOOpt / cpu / Primal | 0.091523892 s | 0.096050322 s | 0.95 |
| jaxmd40 / PartOpt / cpu / Primal | 0.069485885 s | 0.075406883 s | 0.92 |
| jaxmd40 / IPartOpt / cpu / Primal | 0.067604632 s | 0.072501867 s | 0.93 |
| jaxmd40 / DefOpt / cpu / Primal | 0.093857628 s | 0.093348485 s | 1.01 |
| jaxmd40 / IDefOpt / cpu / Primal | 0.09189211 s | 0.096174862 s | 0.96 |
| jaxmd40 / JaXPipe / cpu / Forward | 0.1691235059999999 s | 0.167727442 s | 1.01 |
| jaxmd40 / Jax / cpu / Forward | 0.090996497 s | 0.089430882 s | 1.02 |
| jaxmd40 / HLOOpt / cpu / Forward | 0.1627435529999999 s | 0.16773789 s | 0.97 |
| jaxmd40 / PartOpt / cpu / Forward | 0.168396774 s | 0.16891999 s | 1.00 |
| jaxmd40 / IPartOpt / cpu / Forward | 0.161285557 s | 0.170297303 s | 0.95 |
| jaxmd40 / DefOpt / cpu / Forward | 0.166302271 s | 0.168942012 s | 0.98 |
| jaxmd40 / IDefOpt / cpu / Forward | 0.167911657 s | 0.172370059 s | 0.97 |
| jaxmd40 / JaXPipe / cpu / PreRev | 0.225888807 s | 0.246402501 s | 0.92 |
| jaxmd40 / JaXPipe / cpu / PostRev | 0.146111627 s | 0.140108943 s | 1.04 |
| jaxmd40 / JaXPipe / cpu / BothRev | 0.25196933 s | 0.234342806 s | 1.08 |
| jaxmd40 / Jax / cpu / BothRev | 0.136940089 s | 0.137563376 s | 1.00 |
| jaxmd40 / HLOOpt / cpu / PreRev | 0.2229825339999999 s | 0.226273474 s | 0.99 |
| jaxmd40 / HLOOpt / cpu / PostRev | 0.179397601 s | 0.178276109 s | 1.01 |
| jaxmd40 / HLOOpt / cpu / BothRev | 0.25464187 s | 0.248443251 s | 1.02 |
| jaxmd40 / PartOpt / cpu / PreRev | 0.242189814 s | 0.227452227 s | 1.06 |
| jaxmd40 / PartOpt / cpu / PostRev | 0.138690014 s | 0.130284111 s | 1.06 |
| jaxmd40 / PartOpt / cpu / BothRev | 0.253496024 s | 0.2465847699999999 s | 1.03 |
| jaxmd40 / IPartOpt / cpu / PreRev | 0.2417695919999999 s | 0.22577963 s | 1.07 |
| jaxmd40 / IPartOpt / cpu / PostRev | 0.134319861 s | 0.14343332 s | 0.94 |
| jaxmd40 / IPartOpt / cpu / BothRev | 0.254316743 s | 0.2539809779999999 s | 1.00 |
| jaxmd40 / DefOpt / cpu / PreRev | 0.241453073 s | 0.230823172 s | 1.05 |
| jaxmd40 / DefOpt / cpu / PostRev | 0.171336311 s | 0.1840009849999999 s | 0.93 |
| jaxmd40 / DefOpt / cpu / BothRev | 0.245049834 s | 0.251601763 s | 0.97 |
| jaxmd40 / IDefOpt / cpu / PreRev | 0.227253061 s | 0.239430548 s | 0.95 |
| jaxmd40 / IDefOpt / cpu / PostRev | 0.175590594 s | 0.1760787939999999 s | 1.00 |
| jaxmd40 / IDefOpt / cpu / BothRev | 0.257644724 s | 0.260008438 s | 0.99 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.702872536 s |
1.701965816 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.705362856 s |
1.7047771029999998 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.7164505449999998 s |
1.714594805 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.696581195 s |
1.696450168 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.6940947000000002 s |
1.694809473 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.665714621 s |
1.664932468 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.915413227 s |
1.920949623 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.038836420625 s |
3.038568994375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.039341376875 s |
3.039189840625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.121594910625 s |
3.1215918150000004 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.0601986081250003 s |
3.05983537625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.06042502125 s |
3.060004419375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.102418320625 s |
2.10238759125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
2.948271054375 s |
2.944661759375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.166764331 s |
6.304237769 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.188737092 s |
6.35860167 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.15145843 s |
6.390747236999999 s |
0.96 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.255778277 s |
6.4081710780000005 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.173246488 s |
6.465016108 s |
0.95 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.45970196 s |
2.635569419 s |
0.93 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.712345735 s |
7.045801032 s |
0.95 |
This comment was automatically generated by workflow using github-action-benchmark.
b2739f2 to d0c24b3
// Topological sort: we need to process ops in dependency order
// Since we collected via DFS, we need to reverse and ensure dependencies
// come first
auto sortedOps = mlir::topologicalSort(chainOps);
the toposort here still worries me a bit, in contrast to a linear scan through the block [at least on large blocks]
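For illustration, a minimal sketch of the linear-scan alternative, assuming every chain op lives in a single block whose order already places definitions before uses. `Op` and `orderByBlockScan` are hypothetical stand-ins, not the MLIR `Operation` class or code from this PR.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an operation; not mlir::Operation.
struct Op {
  int id;
  std::vector<const Op *> operands; // ops whose results this op consumes
};

// Return the members of `chain` in block order. Assumption: `block` lists
// ops so that every definition precedes its uses, which makes block order
// a valid dependency order for any subset of the block.
std::vector<const Op *>
orderByBlockScan(const std::vector<const Op *> &block,
                 const std::unordered_set<const Op *> &chain) {
  std::vector<const Op *> ordered;
  ordered.reserve(chain.size());
  for (const Op *op : block)
    if (chain.count(op)) // keep only ops that belong to the chain
      ordered.push_back(op);
  // Every chain op is expected to come from this block.
  assert(ordered.size() == chain.size() && "chain op not in this block");
  return ordered;
}
```

The scan costs one pass over the block plus a hash lookup per op, independent of how the chain was collected.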
let me ponder a bit; we might be able to get away without sorting if we ensure a proper insertion order into the chain
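One way such an insertion order could be obtained, sketched under assumptions: a post-order DFS over operands appends each op only after everything it depends on, so the chain comes out in dependency order with no separate sort. `Op` and `collectPostOrder` are hypothetical names, and the PR's actual change may differ.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an operation; not mlir::Operation.
struct Op {
  int id;
  std::vector<const Op *> operands; // ops whose results this op consumes
};

// Append `op` to `chain` only after all of its operands have been appended.
// `visited` prevents an op shared by several users from being added twice.
void collectPostOrder(const Op *op,
                      std::unordered_set<const Op *> &visited,
                      std::vector<const Op *> &chain) {
  if (!visited.insert(op).second)
    return; // already handled through another user
  for (const Op *operand : op->operands)
    collectPostOrder(operand, visited, chain);
  chain.push_back(op); // all dependencies are already in `chain`
}
```

A production version would likely use an explicit stack instead of recursion and stop at ops that are not part of the rewrite; both are left out here for brevity.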
should now avoid the topological sort
@wsmoses is this good to go?
d0c24b3 to 49184ad
adb1748 to d1b5e26
d1b5e26 to 19db57f
Mostly experimenting.
fixes #1862 #1865