This repository was archived by the owner on Jul 1, 2024. It is now read-only.

Triton Codec uses AVX instructions; is there an SSE4.2 version available? #194

@Kalmalyzer


We are using Project Acoustics together with Wwise and Unreal 5.3. The min spec of UE 5.3, and of every other software component we use at runtime, is SSE4.2. However, Project Acoustics appears to use AVX instructions without validating CPU capabilities. Is that intentional? What are the chances of a version of Project Acoustics that does not require AVX? We are happy to build from source and hammer on it a bit ourselves if necessary.


The reliance on AVX instructions means that AMD's pre-Zen architecture CPUs (i.e. any AMD CPU before 2017) and Intel's Pentium- and Celeron-branded CPUs up to Tiger Lake (i.e. some low-end Intel CPUs before 2020) cannot run Project Acoustics. That is a narrower minspec than we would like for our game. It is also harder than necessary to explain the minspec to a customer concisely: they know how many cores their CPU has, and they might know its processor generation, but they don't know which instruction sets it supports.

The first place where the current Project Acoustics implementation hits an AVX instruction is a static initializer: either compiler-generated logic or the constructor of a statically constructed object. Either way, the C runtime runs this code before main() is even reached, and the code bytes originate from Triton.Codec.lib:

000001B988882830 48 83 EC 28          sub         rsp,28h  
000001B988882834 C5 FA 10 05 68 9B 08 00 vmovss      xmm0,dword ptr [__real@42422884 (01B98890C3A4h)]  
000001B98888283C FF 15 06 C4 07 00    call        qword ptr [__imp_lroundf (01B9888FEC48h)]  
000001B988882842 89 05 0C 5E 0B 00    mov         dword ptr [std::_Facetptr<std::codecvt<char,char,_Mbstatet> >::_Psave+24h (01B988938654h)],eax  
000001B988882848 48 83 C4 28          add         rsp,28h  
000001B98888284C C3                   ret  

(__real@42422884 is a float containing the value 48.5395660)
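For reference, the constant can be checked by reinterpreting the bit pattern 0x42422884 as an IEEE-754 single. The sketch below does that and reproduces what the initializer appears to compute, namely lroundf of that constant; the helper names `float_from_bits` and `rounded_constant` are ours, not part of Project Acoustics. Note that the leading C5 byte in the listing is the two-byte VEX prefix, which is what makes the load an AVX-encoded vmovss; the plain SSE movss encoding would start with F3 0F 10.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
// (Hypothetical helper, for illustration only.)
inline float float_from_bits(std::uint32_t bits) {
    float value;
    static_assert(sizeof(value) == sizeof(bits), "float must be 32 bits wide");
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// What the static initializer appears to do, judging from the disassembly:
// load the constant and pass it to lroundf.
inline long rounded_constant() {
    return std::lroundf(float_from_bits(0x42422884u));
}
```

Nothing in this computation needs AVX; the VEX encoding is most likely just a consequence of the compiler flags the library was built with.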

The AVX code is not wrapped in any "only do this if the CPU supports AVX" check with a fallback code path. We don't know whether the Triton codec has such logic in place for its processing kernels. If it does, that feature-selection logic appears to be missing from the initialization of static objects.
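To illustrate the kind of guard we mean, here is a minimal sketch of runtime AVX detection with a scalar fallback. It is not taken from Triton; `cpu_supports_avx`, `sum_scalar`, `sum_avx` and `select_sum` are hypothetical names. It also assumes that the bulk of the library is compiled for the SSE baseline and that only the AVX kernels sit in a translation unit built with AVX enabled; here `sum_avx` is just a placeholder that forwards to the scalar path so that the sketch compiles without any special flags.

```cpp
#include <cstddef>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#include <immintrin.h>
#endif

// Returns true only when both the CPU and the OS support AVX.
inline bool cpu_supports_avx() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];
    __cpuid(regs, 1);
    const bool osxsave = (regs[2] & (1 << 27)) != 0;  // OS uses XSAVE/XRSTOR
    const bool avx     = (regs[2] & (1 << 28)) != 0;  // CPU implements AVX
    if (!osxsave || !avx) return false;
    // XCR0 bits 1 and 2: the OS saves XMM and YMM register state.
    return (_xgetbv(0) & 0x6) == 0x6;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // required if this runs before constructors
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // non-x86 target: no AVX
#endif
}

using SumFn = float (*)(const float*, std::size_t);

// Baseline path, safe on any CPU the rest of the build targets.
inline float sum_scalar(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Placeholder for an AVX kernel. In a real build this would live in a
// separate translation unit compiled with AVX enabled (assumption).
inline float sum_avx(const float* data, std::size_t n) {
    return sum_scalar(data, n);
}

// Pick the kernel once, based on what the CPU actually supports.
inline SumFn select_sum() {
    return cpu_supports_avx() ? &sum_avx : &sum_scalar;
}
```

The point of the split is that code outside the guarded kernels, static initializers included, never executes a VEX-encoded instruction on a CPU that lacks AVX.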
