Speed is essential in multimedia, graphics and signal processing, and programmers sometimes resort to assembly language to get every last bit of performance out of their machines. GCC offers a middle ground between assembly and standard C that gives you more speed and access to processor features without going all the way to assembly language: compiler intrinsics. This article discusses GCC's compiler intrinsics, emphasizing vector processing on three platforms: x86 (using MMX, SSE and SSE2); Motorola, now Freescale (using AltiVec); and ARM Cortex-A (using NEON). We conclude with some debugging tips and references.
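To give a first taste of what an intrinsic looks like, here is a minimal sketch that adds eight 16-bit integers with a single SSE2 vector addition. The function name `add_eight_shorts` is hypothetical and chosen only for illustration. The scalar fallback is an assumption added so the file still compiles on targets without SSE2; it is not a substitute for the AltiVec or NEON intrinsics.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi16, ... */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for exactly eight int16_t values. */
void add_eight_shorts(const int16_t *a, const int16_t *b, int16_t *dst)
{
#if defined(__SSE2__)
    /* Unaligned loads, so the caller's arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* One intrinsic performs all eight 16-bit additions (wrapping on overflow). */
    __m128i sum = _mm_add_epi16(va, vb);

    _mm_storeu_si128((__m128i *)dst, sum);
#else
    /* Plain C fallback for targets without SSE2 (an assumption of this sketch). */
    for (int i = 0; i < 8; i++)
        dst[i] = (int16_t)(a[i] + b[i]);
#endif
}
```

Each intrinsic call reads like an ordinary C function call, yet the compiler can typically turn it into a single machine instruction while still handling register allocation and scheduling. On x86-64, GCC enables SSE2 by default; on 32-bit x86 you would likely need to pass `-msse2`.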