Defined Term mechanism updated Mon Jun 15 2026 00:00:00 GMT+0000 (Coordinated Universal Time)

SWAR (SIMD Within A Register)

SWAR (SIMD Within A Register) treats a single machine word as a vector of smaller lanes (e.g. a 64-bit register as 8 bytes or 32 bit-pairs) and operates on all lanes at once with ordinary integer instructions, using carefully chosen masks to stop carries from crossing lane boundaries. It is the data-parallel half of bit-manipulation, the complement to branchless-programming’s control-flow trick: one cheap word-wide op does the work of a loop over lanes, with no dedicated SIMD hardware required.
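As a minimal sketch of how a mask keeps carries inside their lanes, the C function below adds eight unsigned bytes packed into a 64-bit word, wrapping each lane independently. The function name `swar_add_u8x8` and the constant names are illustrative assumptions, not taken from any library.

```c
#include <stdint.h>

/* Add eight unsigned 8-bit lanes packed into 64-bit words.
 * Each lane wraps modulo 256; no carry leaks into its neighbour.
 * (Hypothetical helper name, for illustration only.) */
uint64_t swar_add_u8x8(uint64_t a, uint64_t b)
{
    const uint64_t HIGH = 0x8080808080808080ULL; /* top bit of every lane    */
    const uint64_t LOW  = ~HIGH;                 /* low 7 bits of every lane */

    /* Add only the low 7 bits of each lane. The largest per-lane sum is
     * 0x7F + 0x7F = 0xFE, so a carry can reach the lane's own top bit
     * but never the next lane. */
    uint64_t low_sum = (a & LOW) + (b & LOW);

    /* Fold in the original top bits with XOR, which is addition
     * with the carry-out discarded. */
    return low_sum ^ ((a ^ b) & HIGH);
}
```

A plain `a + b` on the same words would let a carry out of one byte corrupt the byte above it; the masked form costs a few extra operations to prevent that.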

The canonical example — parallel popcount

The constant-time population-count in bit-twiddling-hacks (and hackers-delight) is SWAR: for a 32-bit word, the 0x55555555 / 0x33333333 / 0x0f0f0f0f mask sequence adds bits in 2-bit, then 4-bit, then 8-bit lanes in parallel (a logarithmic-depth tree of masked adds), and a final multiply-and-shift sums the four byte lanes. No branches, no table, no per-bit loop: the whole word’s bits are counted in about 12 operations.
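The mask sequence described above can be sketched in C as follows. This version writes the first step as a plain masked add to mirror the description; published variants often replace it with a subtract to save an operation. The name `popcount32` is an assumption chosen for the example.

```c
#include <stdint.h>

/* Count set bits in a 32-bit word with a tree of masked adds.
 * (Illustrative sketch; the function name is hypothetical.) */
unsigned popcount32(uint32_t x)
{
    /* 2-bit lanes: each lane becomes the count (0..2) of its two bits. */
    x = (x & 0x55555555u) + ((x >> 1) & 0x55555555u);

    /* 4-bit lanes: each lane becomes the sum (0..4) of two 2-bit counts. */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);

    /* 8-bit lanes: sums are at most 8, so they fit in 4 bits and one
     * mask after the add is enough. */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;

    /* Multiplying by 0x01010101 accumulates all four byte counts
     * into the top byte; the shift extracts it. */
    return (uint32_t)(x * 0x01010101u) >> 24;
}
```

Where a hardware instruction such as POPCNT is available, a compiler intrinsic would normally be preferred; the sketch shows only the portable technique.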

Where else it shows up

Byte-pattern detection — testing “does this word contain a zero byte / a given byte?” in one word-wide expression (the basis of fast strlen/memchr); a staple of bit-twiddling-hacks.
Parallel add/compare across lanes with mask-isolated carries — packed arithmetic before SSE/NEON.
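The byte-pattern tests in the list above can be sketched in C on a 64-bit word. The expression in `has_zero_byte` follows the well-known zero-byte test from the bit-twiddling literature; the function and macro names are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANE_LSB 0x0101010101010101ULL /* lowest bit of every byte lane  */
#define LANE_MSB 0x8080808080808080ULL /* highest bit of every byte lane */

/* True if any of the eight bytes in v is 0x00.
 * Subtracting 1 from a lane borrows into its top bit only when the lane
 * was zero (or a lower lane borrowed into it); "& ~v" discards lanes
 * whose top bit was already set. The overall nonzero test is exact. */
bool has_zero_byte(uint64_t v)
{
    return ((v - LANE_LSB) & ~v & LANE_MSB) != 0;
}

/* True if any byte of v equals b: XOR with b broadcast to every lane
 * turns matching bytes into zero, then reuse the zero-byte test. */
bool has_byte(uint64_t v, uint8_t b)
{
    return has_zero_byte(v ^ (LANE_LSB * b));
}
```

A word-at-a-time strlen or memchr would apply such a test to each aligned word and fall back to a byte scan only on the word that reports a hit.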

The standing caveat applies, with a twist

Like the rest of the spoke (branchless-programming’s lesson), much classic SWAR is now matched by true SIMD (SSE/AVX/NEON) and dedicated instructions (POPCNT): hardware caught up. But SWAR’s value is more durable here: it needs no SIMD ISA at all, so it still wins on minimal/embedded targets, in portable code, and inside the wide-word inner loops the vector units can’t reach. It is the clearest case of bit manipulation as parallelism extracted from plain integer hardware.

bit-manipulation · population-count · branchless-programming · bit-twiddling-hacks · hackers-delight
