Auto-vectorization is consistently one of the least predictable optimization passes, which is rather awful, since when it doesn't trigger, your functions are suddenly >3x slower. This drives people toward more explicit SIMD coding, from hand-written assembly as in FFmpeg to wrappers that provide some cross-platform support, like Google's Highway.
It's just really hard to detect and exploit profitable and safe vectorization opportunities. The theory behind some of the optimizers is beautiful, though: https://en.wikipedia.org/wiki/Polytope_model
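To make the "safe" part concrete, here's a minimal C sketch of my own (the function names are made up for illustration, not from any library). The first two loops do the same work, but only the second tells the compiler the buffers don't overlap. The third has a loop-carried dependence, so its iterations can't simply be run side by side in vector lanes. Whether any of these actually vectorizes depends on your compiler, flags, and target, so treat it as a sketch rather than a guarantee.

```c
#include <stddef.h>

/* dst and src might overlap, so the compiler has to either emit a
   runtime overlap check or fall back to a scalar loop. */
void scale_may_alias(float *dst, const float *src, float k, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* restrict promises the buffers do not overlap, which removes one
   obstacle to vectorization (it does not force vectorization). */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Loop-carried dependence: iteration i reads the value written by
   iteration i - 1, so the iterations are not independent. */
void prefix_sum(float *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

If you want to see what the compiler decided, GCC's `-fopt-info-vec-missed` and Clang's `-Rpass-missed=loop-vectorize` print remarks about loops that were not vectorized.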
I’m always shocked at what the compiler is able to deduce wrt vectorization. When it works, it’s magical.