Autonomous robots need to recognize objects in order to understand their surroundings and localize themselves on maps. Therefore, the core of every vision pipeline contains software to detect and track objects of interest, called “features”. Classical computer vision algorithms define the location of such features based on the color or brightness of pixels. Typically, the location of a feature depends not only on the pixel value at that precise position but also on neighboring pixels or other, more elaborate constraints.
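As a minimal, hypothetical sketch of such a neighbor-based constraint (not the customer’s detector), the check below flags a pixel as a feature candidate only if it is brighter than each of its four direct neighbors by more than a threshold; the function name, threshold parameter, and single-channel image layout are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical illustration, not the customer's detector: a pixel is a
 * feature candidate if it is brighter than each of its four direct
 * neighbors by more than a threshold t. The caller must keep (x, y)
 * at least one pixel away from the image border. */
static int is_feature_candidate(const uint8_t *img, int width,
                                int x, int y, int t)
{
    int c = img[y * width + x];
    return c > img[y * width + (x - 1)] + t &&
           c > img[y * width + (x + 1)] + t &&
           c > img[(y - 1) * width + x] + t &&
           c > img[(y + 1) * width + x] + t;
}
```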
To fully exploit a modern CPU, making proper use of its vector units is of utmost importance. With a hand-crafted vectorization of the customer’s algorithm, we were able to deliver a speedup of up to 7×.
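The customer’s kernel itself is not shown here; as a minimal sketch of the hand-vectorization approach, the loop below applies a brightness threshold to a row of pixels using Arm NEON intrinsics, processing 16 pixels per iteration with a scalar tail for the remainder. The function name and parameters are illustrative assumptions.

```c
#include <arm_neon.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal sketch, not the customer's kernel: write 0xFF to dst wherever
 * the source pixel is brighter than `thresh`, 0x00 otherwise. The NEON
 * path handles 16 pixels per iteration; a scalar loop covers the tail. */
void threshold_row_u8(const uint8_t *src, uint8_t *dst,
                      size_t n, uint8_t thresh)
{
    const uint8x16_t vthresh = vdupq_n_u8(thresh);
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        uint8x16_t pix  = vld1q_u8(src + i);        /* load 16 pixels       */
        uint8x16_t mask = vcgtq_u8(pix, vthresh);   /* 0xFF where > thresh  */
        vst1q_u8(dst + i, mask);                    /* store the mask       */
    }
    for (; i < n; ++i)                              /* scalar remainder     */
        dst[i] = (src[i] > thresh) ? 0xFF : 0x00;
}
```

In a real kernel, this pattern of wide loads, element-wise compares, and wide stores replaces the per-pixel branches that often keep the compiler from vectorizing such loops on its own.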
Quick facts
Hardware:
- SoC: Qualcomm® QCS610
- CPU: Arm® Cortex®-A76 + Arm Cortex-A55
Operating System:
- Linux
Compiler:
- GNU Compiler Collection (GCC)
Summary of our results:
- Speedup of the core feature detection algorithm by 7× on Cortex-A76 and 5× on Cortex-A55.