About this question

SIMD Vectorization Speedup

Medium · Algorithms & Data Structures · Quant Trader interview question · SIMD, vectorization, AVX-256, performance, optimization

You are optimizing a high-frequency trading algorithm that computes dot products between a vector of portfolio weights and market data. There are 256 portfolio weights. You are considering vectorizing the computation with 256-bit AVX instructions (AVX-256), which process 8 single-precision floating-point numbers (floats) at once. What approximate theoretical speedup can you expect over non-vectorized (scalar) code when computing these dot products?
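
The arithmetic behind the question can be sketched in a few lines. This is a rough model, not real SIMD code: it assumes each vector instruction costs the same as one scalar instruction and ignores memory bandwidth, loop overhead, and the final horizontal sum. The names `theoretical_speedup`, `dot_scalar`, and `dot_lanes` are hypothetical helpers introduced only for illustration.

```python
def theoretical_speedup(n_elements: int, lanes: int) -> float:
    """Ratio of scalar multiply-add steps to vector multiply-add steps."""
    scalar_steps = n_elements
    vector_steps = -(-n_elements // lanes)  # ceiling division
    return scalar_steps / vector_steps


def dot_scalar(w, x):
    """Scalar dot product: one multiply-add per element."""
    total = 0.0
    for wi, xi in zip(w, x):
        total += wi * xi
    return total


def dot_lanes(w, x, lanes=8):
    """Dot product organized the way an 8-lane vector unit would do it.

    Assumes len(w) is a multiple of `lanes`. The inner loop stands in for
    a single vector multiply-add; pure Python still runs it step by step.
    """
    acc = [0.0] * lanes
    for i in range(0, len(w), lanes):
        for j in range(lanes):
            acc[j] += w[i + j] * x[i + j]
    return sum(acc)  # horizontal reduction of the lane accumulators


# 256 weights, 8 floats per 256-bit register: 256 scalar steps vs 32 vector steps.
weights = [float(i % 7) for i in range(256)]
prices = [float(i % 5) for i in range(256)]

print(theoretical_speedup(256, 8))  # → 8.0
print(dot_scalar(weights, prices) == dot_lanes(weights, prices))  # → True
```

Under these assumptions the upper bound is about 8x, since 256 scalar multiply-adds collapse into 256 / 8 = 32 vector multiply-adds. Measured speedups are usually lower once memory access and the final reduction are counted.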