diff --git a/changelog/vectorized_array_ops.dd b/changelog/vectorized_array_ops.dd
deleted file mode 100644
index 0b20c4e4f4..0000000000
--- a/changelog/vectorized_array_ops.dd
+++ /dev/null
@@ -1,10 +0,0 @@
-Vectorized array operations are now templated
-
-Array operations have been converted from dedicated assembly routines for $(B some) array operations to a generic template implementation for $(B all) array operations. This provides huge performance increases (2-4x higher throughput) for array operations that were not previously vectorized.
-Furthermore the implementation makes better use of vectorization even for short arrays to heavily reduce latency for some operations (up to 4x).
-
-For GDC/LDC the implementation relies on auto-vectorization, for DMD the implementation performs the vectorization itself. Support for vector operations with DMD is determined statically (`-march=native`, `-march=avx2`) to avoid binary bloat and the small test overhead. DMD enables SSE2 for 64-bit targets by default.
-
-Also see $(DRUNTIMEPR 1891)
-
-$(RED Note:) The implementation no longer weakens floating point divisions (e.g. `ary[] / scalar`) to multiplication (`ary[] * (1.0 / scalar)`) as that may reduce precision. To preserve the higher performance of float multiplication when loss of precision is acceptable, use either `-ffast-math` with GDC/LDC or manually rewrite your code to multiply by `(1.0 / scalar)` for DMD.