## SIMD adventures
This week I've been trying to build a version of Vortex's `take` operation that uses explicit SIMD vectorization, with the goal of getting all of Vortex onto the Rust stable channel (see this GitHub issue for tracking).
```rust
let expected_array = {
    let bool_mask = lit(true)
        .evaluate(&Scope::new(array_data.clone()))
        .vortex_unwrap();
    // this is just all true mask
    let mask = Mask::try_from(&bool_mask.to_bool().vortex_unwrap()).vortex_unwrap();
    let filtered = filter(&array_data.clone(), &mask).vortex_unwrap();
    root()
        .clone()
        .unwrap_or_else(|| root())
        .evaluate(&Scope::new(filtered))
        .vortex_unwrap()
};
```
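To make the idea of an explicitly vectorized `take` concrete, here is a minimal sketch of what such a kernel could look like on stable Rust using `std::arch`. This is not Vortex's actual implementation: the function names (`take`, `take_scalar`, `take_avx2`) and the choice of `i32` values with `u32` indices are assumptions made purely for illustration. It gathers eight values per iteration with the AVX2 gather intrinsic and falls back to a scalar loop when AVX2 is not detected at runtime.

```rust
/// Scalar reference implementation: out[i] = values[indices[i]].
pub fn take_scalar(values: &[i32], indices: &[u32]) -> Vec<i32> {
    indices.iter().map(|&i| values[i as usize]).collect()
}

/// Hypothetical AVX2 kernel: gathers 8 values per iteration.
/// Safety: the caller must ensure the CPU supports AVX2 and that
/// `values.len() <= i32::MAX`, because the gather treats indices as signed.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn take_avx2(values: &[i32], indices: &[u32]) -> Vec<i32> {
    use std::arch::x86_64::*;

    let mut out = vec![0i32; indices.len()];
    let full_chunks = indices.len() / 8;

    for c in 0..full_chunks {
        let base = c * 8;
        let chunk = &indices[base..base + 8];
        // The gather does no bounds checking, so check up front.
        assert!(chunk.iter().all(|&i| (i as usize) < values.len()));
        unsafe {
            // Load 8 indices, gather 8 values (scale = 4 bytes per i32), store them.
            let idx = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
            let gathered = _mm256_i32gather_epi32::<4>(values.as_ptr(), idx);
            _mm256_storeu_si256(out.as_mut_ptr().add(base) as *mut __m256i, gathered);
        }
    }

    // Handle the remaining (< 8) indices with plain scalar code.
    for i in full_chunks * 8..indices.len() {
        out[i] = values[indices[i] as usize];
    }
    out
}

/// Dispatches to the AVX2 kernel when it is available, otherwise to scalar code.
pub fn take(values: &[i32], indices: &[u32]) -> Vec<i32> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && values.len() <= i32::MAX as usize {
            // Sound: AVX2 was just detected and the length precondition holds.
            return unsafe { take_avx2(values, indices) };
        }
    }
    take_scalar(values, indices)
}
```

Calling `take(&values, &indices)` should return the same result as `take_scalar` on any machine. The runtime check is what lets a single binary run on CPUs with and without AVX2.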
## A brief history of SIMD
According to Sonnet 4, here is a chart of the adoption of SIMD instructions by the major processor vendors:
| Technology | Vendor | Register Size | Announce Date | Release Date | Core First Used |
| ---------- | ------ | ------------- | ------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------ |
| MMX | Intel | 64-bit | 1996 | January 8, 1997 [How to Know If My Intel® Processor Supports Intel® Advanced Vector Extensions 2 (Intel® AVX2)](https://www.intel.com/content/www/us/en/support/articles/000090473/processors/intel-core-processors.html) | Pentium P5 |
| 3DNow! | AMD | 64-bit | 1998 | 1998 | K6-2 |
| SSE | Intel | 128-bit | 1999 | 1999 | Pentium III |
| SSE2 | Intel | 128-bit | 2000 | 2000 | Pentium 4 |
| SSE3 | Intel | 128-bit | 2004 | 2004 | Pentium 4 Prescott |
| SSSE3 | Intel | 128-bit | 2006 | 2006 | Core 2 |
| SSE4.1 | Intel | 128-bit | 2007 | 2007 | Core 2 Penryn |
| SSE4.2 | Intel | 128-bit | 2008 | 2008 | Nehalem |
| AVX | Intel | 256-bit | 2008 [State of Windows on Arm, Year End 2024 - Signal65](https://signal65.com/research/state-of-windows-on-arm-year-end-2024/) | early 2011 [Advanced Vector Extensions - Wikipedia](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) | Sandy Bridge |
| AVX2 | Intel | 256-bit | 2011 | 2013 | Haswell |
| AVX-512 | Intel | 512-bit | 2013 | 2016 | Xeon Phi Knights Landing |
Impressively, SSE2 has been around for 25 years now! AVX2 is a good decade old, though Intel was still selling chips without it as recently as 2020 (check out [this HN discussion](https://news.ycombinator.com/item?id=24578591)). In general, though, AVX2 is a pretty good baseline expectation.
## Testing Rust AVX2 kernels in OrbStack on Apple Silicon
I have a circa 2023 MacBook Pro with an M2 chip. OrbStack supports emulation of Intel processors:
```
$ orb create --arch amd64 ubuntu ubuntu-intel
$ orb -m ubuntu-intel uname -a
Linux ubuntu-intel 6.14.10-orbstack-00291-g1b252bd3edea #1 SMP Sat Jun 7 02:45:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
```
The emulation layer seems pretty good. However, it doesn't appear to support any of the AVX extensions, just SSE:
```
$ orb -m ubuntu-intel cat /proc/cpuinfo | grep -E 'sse|avx'
... hits for sse, sse2, ssse3, sse4_1, sse4_2, but nothing for avx
```
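The same check can be made from inside a Rust program, which is handy for letting AVX2-specific tests skip themselves on a machine (or emulator) that lacks the extension. Here is a small sketch using the standard library's `is_x86_feature_detected!` macro. The helper name `detected_simd_features` is made up for this example.

```rust
/// Returns the names of the x86 SIMD extensions detected at runtime.
/// On non-x86_64 targets the list is empty.
pub fn detected_simd_features() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut found: Vec<&'static str> = Vec::new();

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if is_x86_feature_detected!("ssse3") {
            found.push("ssse3");
        }
        if is_x86_feature_detected!("sse4.1") {
            found.push("sse4.1");
        }
        if is_x86_feature_detected!("sse4.2") {
            found.push("sse4.2");
        }
        if is_x86_feature_detected!("avx") {
            found.push("avx");
        }
        if is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
    }

    found
}
```

A test for an AVX2 kernel could start with `if !detected_simd_features().contains(&"avx2") { return; }`, so that it passes trivially where AVX2 is unavailable instead of crashing with an illegal instruction.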