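All of x86::_mm256_store_ps, x86::_mm256_store_pd, x86_64::_mm256_store_ps, and x86_64::_mm256_store_pd have an incorrect signature: the destination is a *const pointer (*const f32 for the _ps variants, *const f64 for the _pd variants), but it should be a *mut pointer of the same element type. Additionally, the implementation of these functions invokes UB, because it casts the *const pointer to *mut and writes through it: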
pub unsafe fn _mm256_store_ps(mem_addr: *const f32, a: __m256) {
    *(mem_addr as *mut __m256) = a;
}
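
For comparison, here is a minimal sketch of what the corrected signature could look like. It is not the actual stdsimd patch: store_ps_fixed, Aligned, and demo are hypothetical names introduced only for illustration, and the sketch covers x86_64 only (the x86 module would need the same change). It assumes the only fix needed is taking the destination as *mut f32, so that the write no longer goes through a pointer obtained by casting away const.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m256, _mm256_set1_ps};

/// Hypothetical helper: a 32-byte aligned buffer, since the aligned
/// store requires a 32-byte aligned destination.
#[repr(align(32))]
struct Aligned([f32; 8]);

/// Hypothetical corrected variant: the destination is `*mut f32`.
///
/// # Safety
/// `mem_addr` must be valid for a 32-byte write and 32-byte aligned,
/// and the CPU must support AVX.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn store_ps_fixed(mem_addr: *mut f32, a: __m256) {
    // No const-to-mut cast: the pointer was mutable from the start.
    unsafe { *(mem_addr as *mut __m256) = a; }
}

/// Stores eight copies of 1.0 into an aligned buffer.
/// Returns `None` when not on x86_64 or when AVX is unavailable.
fn demo() -> Option<[f32; 8]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            let mut buf = Aligned([0.0; 8]);
            unsafe {
                store_ps_fixed(buf.0.as_mut_ptr(), _mm256_set1_ps(1.0));
            }
            return Some(buf.0);
        }
    }
    None
}
```

With this signature the caller has to pass as_mut_ptr() (or another *mut f32), so a pointer to immutable data can no longer be handed to the store by accident.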