## SIMD adventures

This week I've been trying to build a version of Vortex's `take` operation that uses explicit SIMD vectorization, with the goal of getting all of Vortex onto the Rust stable channel (see this GitHub issue for tracking).

```rust
let expected_array = {
    let bool_mask = lit(true)
        .evaluate(&Scope::new(array_data.clone()))
        .vortex_unwrap();
    // this is just an all-true mask
    let mask = Mask::try_from(&bool_mask.to_bool().vortex_unwrap()).vortex_unwrap();
    let filtered = filter(&array_data.clone(), &mask).vortex_unwrap();

    root()
        .clone()
        .unwrap_or_else(|| root())
        .evaluate(&Scope::new(filtered))
        .vortex_unwrap()
};
```

## A brief history of SIMD

According to Sonnet 4, here is a chart of the adoption of SIMD instructions by the major processor vendors:

| Technology | Vendor | Register Size | Announce Date | Release Date | Core First Used |
| ---------- | ------ | ------------- | ------------- | ------------ | --------------- |
| MMX | Intel | 64-bit | 1996 | January 8, 1997 [How to Know If My Intel® Processor Supports Intel® Advanced Vector Extensions 2 (Intel® AVX2)](https://www.intel.com/content/www/us/en/support/articles/000090473/processors/intel-core-processors.html) | Pentium P5 |
| 3DNow! | AMD | 64-bit | 1998 | 1998 | K6-2 |
| SSE | Intel | 128-bit | 1999 | 1999 | Pentium III |
| SSE2 | Intel | 128-bit | 2000 | 2000 | Pentium 4 |
| SSE3 | Intel | 128-bit | 2004 | 2004 | Pentium 4 Prescott |
| SSSE3 | Intel | 128-bit | 2006 | 2006 | Core 2 |
| SSE4.1 | Intel | 128-bit | 2007 | 2007 | Core 2 Penryn |
| SSE4.2 | Intel | 128-bit | 2008 | 2008 | Nehalem |
| AVX | Intel | 256-bit | 2008 [State of Windows on Arm, Year End 2024 - Signal65](https://signal65.com/research/state-of-windows-on-arm-year-end-2024/) | early 2011 [Advanced Vector Extensions - Wikipedia](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) | Sandy Bridge |
| AVX2 | Intel | 256-bit | 2011 | 2013 | Haswell |
| AVX-512 | Intel | 512-bit | 2013 | 2016 | Xeon Phi Knights Landing |

Impressively, SSE2 has been around for 25 years now! AVX2 is a good decade old, though Intel was still selling chips without it as of 2020; check out [this HN discussion](https://news.ycombinator.com/item?id=24578591). But in general, AVX2 is a pretty good baseline expectation.

## Testing Rust AVX2 kernels in OrbStack on Apple Silicon

I have a circa 2023 MacBook Pro with an M2 chip. OrbStack supports emulation of Intel processors:

```
$ orb create --arch amd64 ubuntu ubuntu-intel
$ orb -m ubuntu-intel uname -a
Linux ubuntu-intel 6.14.10-orbstack-00291-g1b252bd3edea #1 SMP Sat Jun 7 02:45:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
```

The emulation layer seems pretty good. However, it doesn't appear to support any of the AVX extensions, just SSE:

```
$ orb -m ubuntu-intel cat /proc/cpuinfo | grep -E 'sse|avx'
... hits for sse, sse2, ssse3, sse4_1, sse4_2, but nothing for avx
```
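
A machine like this one, x86_64 but without AVX, is a good reminder that an AVX2 kernel usually wants a runtime check and a scalar fallback. Here is a minimal sketch of that pattern on stable Rust. It is not Vortex's actual `take` kernel: the names `take`, `take_scalar`, and `take_avx2` are hypothetical, and it assumes `i32` values with `u32` indices purely for illustration.

```rust
/// Scalar reference implementation: out[i] = values[indices[i]].
/// Panics on an out-of-bounds index.
fn take_scalar(values: &[i32], indices: &[u32]) -> Vec<i32> {
    indices.iter().map(|&i| values[i as usize]).collect()
}

/// AVX2 variant that gathers 8 values per iteration.
/// Safety: the caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn take_avx2(values: &[i32], indices: &[u32]) -> Vec<i32> {
    use std::arch::x86_64::{
        __m256i, _mm256_i32gather_epi32, _mm256_loadu_si256, _mm256_storeu_si256,
    };

    // The gather instruction interprets each index as a signed 32-bit integer,
    // so every valid index must fit in an i32.
    assert!(values.len() <= i32::MAX as usize);

    let mut out = Vec::with_capacity(indices.len());
    let chunks = indices.chunks_exact(8);
    let tail = chunks.remainder();

    for chunk in chunks {
        // Gathers do no bounds checking, so validate the indices up front.
        assert!(
            chunk.iter().all(|&i| (i as usize) < values.len()),
            "index out of bounds"
        );
        let mut lanes = [0i32; 8];
        unsafe {
            let idx = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
            // SCALE = 4: each index is multiplied by size_of::<i32>() bytes.
            let gathered = _mm256_i32gather_epi32::<4>(values.as_ptr(), idx);
            _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, gathered);
        }
        out.extend_from_slice(&lanes);
    }

    // Fewer than 8 indices left over: finish with the scalar path.
    out.extend(tail.iter().map(|&i| values[i as usize]));
    out
}

/// Dispatches to the AVX2 kernel when the running CPU reports support,
/// and falls back to the scalar loop everywhere else.
fn take(values: &[i32], indices: &[u32]) -> Vec<i32> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound because we just verified AVX2 is available.
            return unsafe { take_avx2(values, indices) };
        }
    }
    take_scalar(values, indices)
}
```

Calling `take(&values, &indices)` should take the scalar branch inside the emulated machine (no `avx2` flag in `/proc/cpuinfo`) and the gather branch on a Haswell-or-later host, which makes it easy to compare the two paths against each other in a test.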