Triton Codec uses AVX instructions; is there an SSE4.2 version available? #194
Description
We are using Project Acoustics together with Wwise and Unreal Engine 5.3. The min spec of UE 5.3 and of every other software component we use at runtime is SSE4.2. However, Project Acoustics appears to use AVX instructions without first checking the CPU's capabilities. Is that intentional? What are the chances of getting a version of Project Acoustics that does not require AVX? We are happy to build from source and hammer on it a bit ourselves if necessary.
The reliance on AVX instructions means that AMD CPUs older than the Bulldozer family (AMD's first AVX-capable CPUs, launched in 2011) and any of Intel's Pentium/Celeron branded CPUs up to Tiger Lake (i.e. some low-end Intel CPUs from before 2020) cannot run Project Acoustics. This is a narrower minspec than we would like for our game. It is also harder than necessary to explain the minspec to a customer concisely: they know how many cores their CPU has, and they might know its processor generation, but they don't know which instruction sets it supports.
The first place where the current Project Acoustics implementation hits an AVX instruction is a static initialization routine: either compiler-generated logic or the constructor of a statically constructed object. Either way, the C runtime runs this code before main() is even reached, and the code bytes originate from Triton.Codec.lib:
000001B988882830 48 83 EC 28 sub rsp,28h
000001B988882834 C5 FA 10 05 68 9B 08 00 vmovss xmm0,dword ptr [__real@42422884 (01B98890C3A4h)]
000001B98888283C FF 15 06 C4 07 00 call qword ptr [__imp_lroundf (01B9888FEC48h)]
000001B988882842 89 05 0C 5E 0B 00 mov dword ptr [std::_Facetptr<std::codecvt<char,char,_Mbstatet> >::_Psave+24h (01B988938654h)],eax
000001B988882848 48 83 C4 28 add rsp,28h
000001B98888284C C3 ret
(__real@42422884 is a float containing the value 48.5395660)
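For anyone who wants to reproduce the pattern, here is a minimal sketch of source code that could compile to a routine like the one above. It is a guess at the general shape, not the actual Triton source: `FloatFromBits` and `g_hypotheticalRounded` are made-up names. The helper also confirms that the bit pattern 0x42422884 decodes to roughly 48.5395660.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
inline float FloatFromBits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Hypothetical stand-in for the static object in Triton.Codec.lib.
// The initializer calls a runtime function, so the compiler may emit a
// dynamic initializer that the C runtime executes before main().
// If this translation unit is built with AVX code generation enabled
// (for example MSVC /arch:AVX), the float load feeding lroundf can be
// encoded as vmovss rather than movss.
static const long g_hypotheticalRounded =
    std::lroundf(FloatFromBits(0x42422884u));
```

Because the initializer runs before main(), a CPU capability check placed in main() or later comes too late to protect code like this.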
The AVX code is not wrapped in any "only do this if the CPU supports AVX" check with a fallback code path. We don't know whether the Triton codec has such logic in place for its processing kernels. If it does, that feature-selection logic does not appear to cover the initialization of static objects.
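To make concrete the kind of guard we mean, here is a minimal sketch of a runtime AVX check with a scalar fallback. It is not taken from Project Acoustics. `CpuSupportsAvx`, `SumScalar`, `SumAvxPlaceholder`, `SelectSumKernel` and `Sum` are hypothetical names, and the "AVX" kernel is only a placeholder that forwards to the scalar code. The check follows the usual recipe: CPUID leaf 1 must report AVX and OSXSAVE, and XCR0 must show that the OS saves the XMM and YMM state.

```cpp
#include <cstddef>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <immintrin.h>
#include <intrin.h>
#endif

// Returns true only if both the CPU and the OS support AVX.
bool CpuSupportsAvx() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                                  // leaf 1: feature bits
    const bool osxsave = (regs[2] & (1 << 27)) != 0;   // ECX bit 27
    const bool avx = (regs[2] & (1 << 28)) != 0;       // ECX bit 28
    if (!osxsave || !avx) return false;
    const unsigned long long xcr0 = _xgetbv(0);        // read XCR0
    return (xcr0 & 0x6) == 0x6;                        // XMM and YMM state enabled
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;                                      // not x86: no AVX
#endif
}

using SumFn = float (*)(const float*, std::size_t);

// Baseline kernel: must be compiled without AVX code generation.
float SumScalar(const float* data, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// Placeholder for an AVX kernel. In a real build this would live in a
// separate translation unit compiled with /arch:AVX or -mavx.
float SumAvxPlaceholder(const float* data, std::size_t count) {
    return SumScalar(data, count);
}

// Pick the kernel that matches the running CPU.
SumFn SelectSumKernel() {
    return CpuSupportsAvx() ? &SumAvxPlaceholder : &SumScalar;
}

// Public entry point: the kernel is selected on the first call.
float Sum(const float* data, std::size_t count) {
    static const SumFn kernel = SelectSumKernel();
    return kernel(data, count);
}
```

The important design constraint is that the check, the dispatcher, and every static initializer are compiled for the baseline instruction set (SSE4.2 in our case). Only the kernels selected after the check can be built with AVX enabled.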