An Exploration of Using the Intel AVX2 Gather Load Instructions for Vectorised Image Processing
Published 2018 in Image and Vision Computing New Zealand
ABSTRACT
Processing image data with single-instruction, multiple-data (SIMD) CPU instructions provides a means of vectorising standard image processing operators and thus speeding up their execution. SIMD register loads normally read from consecutive locations in memory, that is, consecutive pixels in a row of the image. For some algorithms, however, data dependencies between pixels along rows render SIMD vectorisation useless. If pixels could be loaded efficiently from the columns of an image, this problem would be avoided. The Intel AVX2 CPU extension introduces instructions for gather loading data from multiple memory locations into a single SIMD register. We explore using these instructions for column loads of image data in two common image operations, image transposition and mean filtering, to test 1) whether they provide useful speed-ups when other vectorised approaches exist (we find that they do not), and 2) whether they provide a means of implementing operations that would otherwise be difficult or extremely inefficient without a column load (we find that they can provide speed-ups over scalar code).
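To make the idea of a column load concrete, the following is a minimal sketch and not the paper's implementation. It assumes 32-bit pixels stored row-major, a row stride given in elements, and a GCC or Clang compiler; the function names `load_column8`, `load_column8_gather` and `load_column8_scalar` and the macro `COLUMN_GATHER_X86` are hypothetical. The gather path uses the AVX2 intrinsic `_mm256_i32gather_epi32` with indices 0, stride, 2*stride, ..., 7*stride, so that eight vertically adjacent pixels arrive in one register.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define COLUMN_GATHER_X86 1 /* hypothetical macro: x86 target, AVX2 may be usable */
#endif

/* Scalar reference: copy 8 vertically adjacent pixels of column x,
 * starting at row y. Assumes rows y..y+7 exist in the image. */
static void load_column8_scalar(const int32_t *img, int stride,
                                int x, int y, int32_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = img[(size_t)(y + i) * (size_t)stride + (size_t)x];
}

#ifdef COLUMN_GATHER_X86
/* AVX2 version: one gather instruction fetches all 8 pixels.
 * The target attribute lets this compile without -mavx2 on GCC/Clang.
 * Assumes 7 * stride fits in a signed 32-bit index. */
__attribute__((target("avx2")))
static void load_column8_gather(const int32_t *img, int stride,
                                int x, int y, int32_t out[8])
{
    /* Element offsets of the 8 pixels relative to the first one. */
    __m256i idx = _mm256_mullo_epi32(_mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7),
                                     _mm256_set1_epi32(stride));
    const int32_t *base = img + (size_t)y * (size_t)stride + (size_t)x;
    /* Scale 4: indices count 4-byte elements, not bytes. */
    __m256i col = _mm256_i32gather_epi32((const int *)base, idx, 4);
    _mm256_storeu_si256((__m256i *)out, col);
}
#endif

/* Dispatch: use the gather path when the CPU reports AVX2 support,
 * otherwise fall back to the scalar loop. */
void load_column8(const int32_t *img, int stride, int x, int y, int32_t out[8])
{
#ifdef COLUMN_GATHER_X86
    if (__builtin_cpu_supports("avx2")) {
        load_column8_gather(img, stride, x, y, out);
        return;
    }
#endif
    load_column8_scalar(img, stride, x, y, out);
}
```

Calling `load_column8(img, stride, x, y, out)` for successive values of `y` in steps of 8 walks down one column. Whether the gather path is actually faster than the scalar loop depends on the CPU and on the surrounding algorithm, which is the question the paper investigates.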
PUBLICATION RECORD
- Publication year
2018
- Venue
Image and Vision Computing New Zealand
- Publication date
2018-11-01
- Fields of study
Computer Science
- Source metadata
Semantic Scholar