AlphaS against latest master, including code generation and all processes generated#434

Merged
valassi merged 310 commits into madgraph5:master from valassi:alphas
Apr 28, 2022

Conversation

@valassi
Member

@valassi valassi commented Apr 21, 2022

Hi @roiser I am creating here a PR superseding your #428 (to address the running of alphas issue #373). The main differences will be

  • porting to the current latest upstream/master
  • backporting your changes (in ggtt only for now) to code generation, and generating all five processes I use

@valassi valassi marked this pull request as draft April 21, 2022 14:19
@valassi valassi changed the title AlphaS against latest master, including code generation and all processes generated WIP AlphaS against latest master, including code generation and all processes generated Apr 21, 2022
@valassi
Member Author

valassi commented Apr 22, 2022

I am making progress on integrating this, but there is work to be done.

I would say the main problems were:

  • while check.exe and gcheck.exe were ok, both runTest and the fcheck test were failing 78c4ed3
  • code generation was not there, and the design of the interface was using gc10/gc11 which are process-dependent parameters that should be removed from the interface (in eemumu they are not needed, for instance)

What I have done so far:

  • I have removed gc10/gc11 from the bridge interfaces, and only kept Gs (NB: Gs is actually also not needed in some processes like eemumu, as it is a QCD coupling, but the whole Madgraph machinery distinguishes between couplings that do or do not depend on alphas, so for the moment I would keep Gs as-is in the interfaces)
  • gc10 and gc11 are now allocated within MEK, no longer within check_sa: they are an internal detail of MEK, they do not need to be exposed outside it
  • there was no way to pass gs through fbridge.inc from fortran code, I have now added that
  • partly thanks to the above, the fcheck tests are now succeeding (amongst other things, the hardcoded value of gs that is set in check_sa.cc also needed to be set in fcheck_sa.f, and is now passed through fbridge.inc)
  • I have removed the computeDependentCouplings method from the MEK and from the Bridge: given that only Gs (not gc10, gc11) are passed through the bridge interface, I am inclined to treat the computation of dependent couplings as one instance of 'splitting kernels'... I will cook up something for now (possibly moving it to calculate_wavefunction); eventually this should be reviewed together with splitting kernels
  • I have started to rename and move the actual computation of dependent couplings: I moved it to HelAmps for now, as it is a process-dependent computation (essentially it should only appear in HelAmps.h, or CPPProcess, or Parameters_sm: these are the only three files with process-dependent stuff; all the rest I am trying to keep process-independent)
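To make the interface idea above concrete, here is a minimal sketch: only Gs crosses the bridge, and the process-dependent couplings are derived from it inside the process code. The function name G2COUP follows the commit messages in this PR, but the GC_10/GC_11 formulas below are purely illustrative (an SM-like convention GC_10 = -G, GC_11 = iG); each process would substitute its own expressions.

```cpp
#include <complex>

// Sketch only: G2COUP is the name used in this PR's commit messages;
// the GC_10/GC_11 expressions are illustrative (SM-like: GC_10 = -G,
// GC_11 = i*G) and any real process would substitute its own formulas.
using cxtype = std::complex<double>;

struct DependentCouplings
{
  cxtype GC_10;
  cxtype GC_11;
};

// Only Gs crosses the bridge interface: the alphas-dependent couplings
// are recomputed from it inside process-dependent code (HelAmps/CPPProcess).
inline DependentCouplings G2COUP( const double gs )
{
  return DependentCouplings{ cxtype( -gs, 0. ), cxtype( 0., gs ) };
}
```

This keeps the bridge interface process-independent: adding or removing couplings changes only the process code, never the bridge signature.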

Amongst the things that are not yet ok:

  • runTest still fails
  • remove gc10 and gc11 from sigmakin and similar interfaces, just keep g
  • (minor reminder: copying Gs inside the bridge must be fixed if fptype is not FORTRANFPTYPE)
  • (minor reminder: check timers in check_sa, find the right spot for copying gs)

valassi added 6 commits April 22, 2022 11:19
Was:  computeDependentCouplings      -> calls dependentCouplings kernel -> calls dependent_coupling
Then: computeDependentCouplings      -> calls computeCouplings kernel -> calls G2COUP
Now:  (in MEK merged with ComputeMe) -> calls computeDependentCouplings kernel -> calls G2COUP
…200 registers instead of 170 upstream/master
@valassi
Member Author

valassi commented Apr 22, 2022

Ok I have now

  • fixed runTest (it was missing the hardcoded gs)
  • fixed fptype!=FORTRANFPTYPE (and the fgcheck test in float mode now succeeds, it was failing)
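The fptype != FORTRANFPTYPE fix boils down to this: Gs values crossing the Fortran/C++ bridge cannot simply be byte-copied when the two sides use different floating-point precisions; they must be converted element by element. A hypothetical sketch (the names copyGs and nevt are illustrative, not the actual bridge code):

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical sketch: FORTRANFPTYPE is double on the Fortran side, while
// fptype may be float in the C++/CUDA build; in the mixed case each Gs
// value must be converted per event, not memcpy'd.
template<typename Tout, typename Tin>
void copyGs( Tout* out, const Tin* in, std::size_t nevt )
{
  if constexpr( std::is_same_v<Tout, Tin> )
    std::memcpy( out, in, nevt * sizeof( Tout ) ); // same precision: plain copy
  else
    for( std::size_t i = 0; i < nevt; ++i )
      out[i] = static_cast<Tout>( in[i] ); // precision conversion per event
}
```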

Still to do, amongst other things:

  • remove gc10 and gc11 from sigmakin and similar interfaces, just keep g (this is necessary for code generation)
  • (minor reminder: check timers in check_sa, find the right spot for copying gs)

In addition

  • investigate performance? I saw the number of registers has increased from 170 to 200 in ggtt double

@valassi
Member Author

valassi commented Apr 22, 2022

I need to understand how to handle gc10/gc11 with respect to code generation. I am dumping my main ideas in #356, as this is mainly about how to split the roles of MEK and CPPProcess (very much related to splitting kernels #310)

valassi added 6 commits April 22, 2022 13:40
(NB: now the naming convention is consistent, BufferCouplings and MemoryAccessCouplings)
…ncapsulate ndcoup in BufferCouplings

(still to do: gc10 and gc11 are now dereferenced in MEK, one can do better)
@valassi
Member Author

valassi commented Apr 22, 2022

Some progress: I transformed BufferCouplings into a buffer that holds an arbitrary number ndcoup of coupling arrays. It works so far. Now I need to change the access functions.
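A minimal sketch of that buffer idea (illustrative only: the real couplings are complex-valued and the actual buffer classes are more elaborate, but the point is that ndcoup coupling arrays of nevt values each live in one contiguous allocation, so MEK can dereference any of them without check_sa allocating process-specific arrays like gc10/gc11):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of a buffer holding an arbitrary number ndcoup of
// coupling arrays, one value per event (real couplings are complex; a
// plain double keeps the sketch short).
class BufferCouplings
{
public:
  BufferCouplings( std::size_t ndcoup, std::size_t nevt )
    : m_ndcoup( ndcoup ), m_nevt( nevt ), m_data( ndcoup * nevt, 0. ) {}
  // Value of dependent coupling idcoup for event ievt
  // (SOA layout: all events of one coupling are contiguous).
  double& at( std::size_t idcoup, std::size_t ievt )
  {
    assert( idcoup < m_ndcoup && ievt < m_nevt );
    return m_data[ idcoup * m_nevt + ievt ];
  }
  std::size_t ndcoup() const { return m_ndcoup; }
private:
  std::size_t m_ndcoup;       // number of dependent couplings
  std::size_t m_nevt;         // number of events
  std::vector<double> m_data; // one contiguous allocation
};
```

The SOA layout is chosen so that the access functions can hand a kernel a contiguous per-coupling slice, which is what memory coalescing on the GPU wants.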

@roiser
Member

roiser commented Apr 22, 2022

My idea here was to let the code generation do the work, but if you want to change this it's also fine with me.

…ray MEs - the previous one was always using one value in an array ("trivial"), strange that it worked?!
valassi added 16 commits April 28, 2022 11:15
./tput/teeThroughputX.sh -flt -hrd -makej -makeclean -eemumu -ggtt -ggttg -ggttgg -ggttggg

Note that each of the four ggttggg tests takes ~30 minutes to build (no inlining).
Without inlining, all other processes build quite fast:

ls -ltr ee_mumu/lib/build.none_*_inl0_hrd* gg_tt/lib/build.none_*_inl0_hrd* gg_tt*g/lib/build.none_*_inl0_hrd* | egrep -v '(total|\./|\.build|_common|^$)'
ee_mumu/lib/build.none_d_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg   95296 Apr 28 08:29 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1074160 Apr 28 08:29 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_d_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg   94488 Apr 28 08:29 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1069192 Apr 28 08:29 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_f_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg   99712 Apr 28 08:29 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1071848 Apr 28 08:29 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_f_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg   94752 Apr 28 08:29 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1062704 Apr 28 08:29 libmg5amc_epem_mupmum_cuda.so*
gg_tt/lib/build.none_d_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  103904 Apr 28 08:29 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1238000 Apr 28 08:29 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_d_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg   98840 Apr 28 08:29 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1212608 Apr 28 08:29 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_f_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  104224 Apr 28 08:29 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1211112 Apr 28 08:29 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_f_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg  103152 Apr 28 08:30 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1197872 Apr 28 08:31 libmg5amc_gg_ttx_cuda.so*
gg_ttg/lib/build.none_d_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  112672 Apr 28 08:31 libmg5amc_gg_ttxg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1868784 Apr 28 08:31 libmg5amc_gg_ttxg_cuda.so*
gg_ttg/lib/build.none_d_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg  111704 Apr 28 08:31 libmg5amc_gg_ttxg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1847488 Apr 28 08:31 libmg5amc_gg_ttxg_cuda.so*
gg_ttg/lib/build.none_f_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  117088 Apr 28 08:32 libmg5amc_gg_ttxg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1813224 Apr 28 08:33 libmg5amc_gg_ttxg_cuda.so*
gg_ttg/lib/build.none_f_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg  111920 Apr 28 08:34 libmg5amc_gg_ttxg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1799984 Apr 28 08:35 libmg5amc_gg_ttxg_cuda.so*
gg_ttgg/lib/build.none_d_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  154432 Apr 28 08:37 libmg5amc_gg_ttxgg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 4281328 Apr 28 08:38 libmg5amc_gg_ttxgg_cuda.so*
gg_ttgg/lib/build.none_d_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg  149368 Apr 28 08:39 libmg5amc_gg_ttxgg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 4247744 Apr 28 08:40 libmg5amc_gg_ttxgg_cuda.so*
gg_ttgg/lib/build.none_f_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg  154752 Apr 28 08:42 libmg5amc_gg_ttxgg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 4119272 Apr 28 08:42 libmg5amc_gg_ttxgg_cuda.so*
gg_ttgg/lib/build.none_f_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg  145488 Apr 28 08:44 libmg5amc_gg_ttxgg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 4106032 Apr 28 08:45 libmg5amc_gg_ttxgg_cuda.so*
gg_ttggg/lib/build.none_d_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg   768848 Apr 28 08:48 libmg5amc_gg_ttxggg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 15758320 Apr 28 09:16 libmg5amc_gg_ttxggg_cuda.so*
gg_ttggg/lib/build.none_d_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg   759696 Apr 28 09:20 libmg5amc_gg_ttxggg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 15769792 Apr 28 09:46 libmg5amc_gg_ttxggg_cuda.so*
gg_ttggg/lib/build.none_f_inl0_hrd0:
-rwxr-xr-x.  1 avalassi zg   707728 Apr 28 09:51 libmg5amc_gg_ttxggg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 14711528 Apr 28 10:18 libmg5amc_gg_ttxggg_cuda.so*
gg_ttggg/lib/build.none_f_inl0_hrd1:
-rwxr-xr-x.  1 avalassi zg   702568 Apr 28 10:23 libmg5amc_gg_ttxggg_cpp.so*
-rwxr-xr-x.  1 avalassi zg 14718768 Apr 28 10:49 libmg5amc_gg_ttxggg_cuda.so*
./tput/teeThroughputX.sh -flt -hrd -makej -makeclean -eemumu -ggtt -ggttgg -inlonly

Note that inlined builds take longer - and the C++ builds are now slower than the CUDA ones!
(For non-inlined builds, the CUDA builds are much slower than the C++ ones.)

ls -ltr ee_mumu/lib/build.none_*_inl1_hrd* gg_tt/lib/build.none_*_inl1_hrd* gg_tt*g/lib/build.none_*_inl1_hrd* | egrep -v '(total|\./|\.build|_common|^$)'
ee_mumu/lib/build.none_d_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg  107104 Apr 28 11:02 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1074064 Apr 28 11:02 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_d_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg   98056 Apr 28 11:02 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1069064 Apr 28 11:02 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_f_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg   99224 Apr 28 11:02 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1071720 Apr 28 11:02 libmg5amc_epem_mupmum_cuda.so*
ee_mumu/lib/build.none_f_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg   94224 Apr 28 11:02 libmg5amc_epem_mupmum_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1062608 Apr 28 11:02 libmg5amc_epem_mupmum_cuda.so*
gg_tt/lib/build.none_d_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg  111152 Apr 28 11:02 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1237904 Apr 28 11:02 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_d_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg   98152 Apr 28 11:02 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1212480 Apr 28 11:02 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_f_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg  111472 Apr 28 11:03 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1210984 Apr 28 11:03 libmg5amc_gg_ttx_cuda.so*
gg_tt/lib/build.none_f_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg  102464 Apr 28 11:04 libmg5amc_gg_ttx_cpp.so*
-rwxr-xr-x.  1 avalassi zg 1197776 Apr 28 11:04 libmg5amc_gg_ttx_cuda.so*
gg_ttgg/lib/build.none_d_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg 4416400 Apr 28 11:06 libmg5amc_gg_ttxgg_cuda.so*
-rwxr-xr-x.  1 avalassi zg  607136 Apr 28 11:09 libmg5amc_gg_ttxgg_cpp.so*
gg_ttgg/lib/build.none_d_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg 4378688 Apr 28 11:11 libmg5amc_gg_ttxgg_cuda.so*
-rwxr-xr-x.  1 avalassi zg  598160 Apr 28 11:14 libmg5amc_gg_ttxgg_cpp.so*
gg_ttgg/lib/build.none_f_inl1_hrd0:
-rwxr-xr-x.  1 avalassi zg 4201064 Apr 28 11:16 libmg5amc_gg_ttxgg_cuda.so*
-rwxr-xr-x.  1 avalassi zg  627936 Apr 28 11:19 libmg5amc_gg_ttxgg_cpp.so*
gg_ttgg/lib/build.none_f_inl1_hrd1:
-rwxr-xr-x.  1 avalassi zg 4191952 Apr 28 11:21 libmg5amc_gg_ttxgg_cuda.so*
-rwxr-xr-x.  1 avalassi zg  622952 Apr 28 11:24 libmg5amc_gg_ttxgg_cpp.so*
STARTED AT Thu Apr 28 08:28:54 CEST 2022
ENDED(1) AT Thu Apr 28 11:02:01 CEST 2022
ENDED(2) AT Thu Apr 28 11:32:47 CEST 2022
ENDED(3) AT Thu Apr 28 11:37:09 CEST 2022
ENDED(4) AT Thu Apr 28 11:40:19 CEST 2022
ENDED(5) AT Thu Apr 28 11:43:25 CEST 2022
…EFORE THE ALPHAS PR

The typical build times were as for alphas: 30 minutes for each of the 4 ggttggg tests (but some builds were cached)
STARTED AT Thu Apr 28 12:13:58 CEST 2022
ENDED(1) AT Thu Apr 28 13:47:08 CEST 2022
ENDED(2) AT Thu Apr 28 14:24:29 CEST 2022
ENDED(3) AT Thu Apr 28 14:28:54 CEST 2022
ENDED(4) AT Thu Apr 28 14:32:07 CEST 2022
ENDED(5) AT Thu Apr 28 14:35:20 CEST 2022
Revert "[alphas] rerun all tests with allTees.sh USING UPSTREAM/MASTER CODE BEFORE THE ALPHAS PR"
This reverts commit 08349369308297560e19a97277049f315cbba078.
…remove inl1 data, confusing and irrelevant)

Introducing the alphas memory access does not seem to have degraded performance significantly, good
@valassi
Member Author

valassi commented Apr 28, 2022

This is FINALLY completed

… does not build yet, see madgraph5#439

I will merge this anyway as standalone cudacpp for SM physics works fine
(there is one exception, uudd fails, see madgraph5#440 - I will fix that a posteriori).

Note also that alphas from madevent are still not integrated, and so ggtt.mad fails to build for instance, madgraph5#441.
This will be the next big thing.
@valassi
Member Author

valassi commented Apr 28, 2022

Note the performance difference between pre and post alphas:
https://github.com/madgraph5/madgraph4gpu/blob/7d1a6d9b604578c0957ebbf566539d4fb6121aac/epochX/cudacpp/tput/summaryTable_alphas.txt

Example

-------------------------------------------------------------------------------

+++ cudacpp REVISION 88fe36d1 +++
On itscrd70.cern.ch [CPU: Intel(R) Xeon(R) Silver 4216 CPU] [GPU: 1x Tesla V100S-PCIE-32GB]:

[nvcc 11.6.124 (gcc 10.2.0)] 
HELINL=0 HRDCOD=0
                eemumu          ggtt         ggttg        ggttgg       ggttggg
         [2048/256/12]  [2048/256/1]  [2048/256/1]  [2048/256/1]    [64/256/1]
CUD/none      1.32e+09      1.42e+08      1.44e+07      5.13e+05      1.21e+04
         [2048/256/12]  [2048/256/1]    [64/256/1]    [64/256/1]     [1/256/1]
CPP/none      1.66e+06      2.00e+05      2.48e+04      1.80e+03      7.15e+01
CPP/sse4      3.09e+06      3.15e+05      4.50e+04      3.34e+03      1.30e+02
CPP/avx2      5.47e+06      5.69e+05      8.82e+04      6.79e+03      2.60e+02
CPP/512y      5.87e+06      6.12e+05      9.81e+04      7.46e+03      2.83e+02
CPP/512z      4.67e+06      3.85e+05      7.22e+04      6.56e+03      2.95e+02

+++ cudacpp REVISION bae5c248 +++
On itscrd70.cern.ch [CPU: Intel(R) Xeon(R) Silver 4216 CPU] [GPU: 1x Tesla V100S-PCIE-32GB]:

[nvcc 11.6.124 (gcc 10.2.0)] 
HELINL=0 HRDCOD=0
                eemumu          ggtt         ggttg        ggttgg       ggttggg
         [2048/256/12]  [2048/256/1]  [2048/256/1]  [2048/256/1]    [64/256/1]
CUD/none      1.10e+09      1.36e+08      1.17e+07      4.91e+05      1.11e+04
         [2048/256/12]  [2048/256/1]    [64/256/1]    [64/256/1]     [1/256/1]
CPP/none      1.65e+06      1.99e+05      2.46e+04      1.79e+03      7.21e+01
CPP/sse4      3.10e+06      3.13e+05      4.43e+04      3.33e+03      1.32e+02
CPP/avx2      5.35e+06      5.61e+05      8.80e+04      6.76e+03      2.61e+02
CPP/512y      5.96e+06      6.11e+05      9.91e+04      7.58e+03      2.91e+02
CPP/512z      4.60e+06      3.80e+05      7.15e+04      6.48e+03      2.93e+02

-------------------------------------------------------------------------------

There is almost no difference in C++ (as expected?). CUDA on the GPU seems consistently around 10% slower. But I guess we need to live with that...

@valassi valassi changed the title WIP AlphaS against latest master, including code generation and all processes generated AlphaS against latest master, including code generation and all processes generated Apr 28, 2022
@valassi valassi marked this pull request as ready for review April 28, 2022 15:22
@valassi
Member Author

valassi commented Apr 28, 2022

(Well, about performance: not sure at all why eemumu would be slower... I opened #442 to investigate further if someone has the courage).

This is now complete. I am self merging.

@valassi valassi merged commit 605291c into madgraph5:master Apr 28, 2022
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
…CCESS for independent couplings and CD_ACCESS for dependent couplings
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
…gpu#434 codegen, fails

"output standalone_cudacpp CODEGEN_cudacpp_ee_mumu" with error:
AttributeError : 'PLUGIN_GPUFOHelasCallWriter' object has no attribute 'model'