Push the boundaries of existing software through performance optimization, without needing to redesign the entire system.
Software performance is a critical consideration in embedded systems development, driven by a variety of needs: from technical necessity (slow software = no product) to competitive advantage (efficient software = a better product than your competitors can offer at the time of market entry). Improving performance can directly impact a product’s price point or enable the inclusion of additional features, delivering higher quality and a better experience to the end customer. In this article, we look at the latter: competitive advantage through higher quality and better features.
It is crucial to recognize that every feature consumes resources (RAM, ROM, and runtime) and that higher-quality features require a greater resource allocation than lower-quality ones. The hardware you select, on the other hand, offers only a fixed amount of resources. That is where performance optimization comes into play: reducing the overall footprint of existing features. Often, this results in a simultaneous reduction across all three resource categories.
Ultimately, performance optimization translates into tangible benefits beyond mere resource reduction.[1] Effective optimization empowers your engineers either to expand the system’s capabilities by adding more features or to enhance the quality of existing ones, delivering a superior user experience. It also makes your product’s behavior more stable and thus contributes to product safety. For each of these three gains, we will look at one specific example that the team behind Efficientware successfully tackled in the past.
Add New Features
Performance optimization is a strategic enabler for adding new functionality to existing embedded systems, even when the hardware appears to be fully utilized.[2] By reducing the resource consumption of existing features without compromising them (that is why we call it “squeeze”), we can free up the resources needed to accommodate new ones.
In the past, we helped a project that used a Convolutional Neural Network to process video data in an autonomous vehicle. GPU utilization had reached nearly 100%, leaving no headroom for further feature development. After the project team had already pushed its own optimizations as far as it could, we optimized the single largest contributor to overall runtime, non-maximum suppression, by a factor of 13.
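For readers unfamiliar with it: non-maximum suppression (NMS) filters a network’s raw detections so that each real-world object is reported only once. The classic greedy variant compares every kept detection against all weaker candidates, which makes it a natural hot spot. Below is a minimal CPU-side sketch of that classic algorithm, for illustration only; it is not the project’s actual (GPU-resident) implementation, and the box layout and threshold are assumptions chosen for readability.

```cpp
// Minimal sketch of classic greedy non-maximum suppression (NMS).
// Illustrative only: not the project's actual GPU implementation.
#include <algorithm>
#include <cstddef>
#include <vector>

struct Box {
    float x1, y1, x2, y2;  // corner coordinates
    float score;           // detection confidence
};

// Intersection-over-union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    const float ix = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float iy = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = ix * iy;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Keep the strongest detections; suppress weaker overlapping candidates.
std::vector<Box> nms(std::vector<Box> boxes, float iouThreshold) {
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    std::vector<bool> suppressed(boxes.size(), false);
    std::vector<Box> kept;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (suppressed[i]) continue;
        kept.push_back(boxes[i]);
        // Quadratic inner loop: the hot spot in this variant.
        for (std::size_t j = i + 1; j < boxes.size(); ++j) {
            if (!suppressed[j] && iou(boxes[i], boxes[j]) > iouThreshold) {
                suppressed[j] = true;  // overlaps a stronger detection
            }
        }
    }
    return kept;
}
```

The quadratic inner loop is exactly the kind of structure where data layout, early exits, and parallelization can pay off by large factors.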
This significant reduction in runtime unlocked sufficient GPU resources to introduce additional neural network segments, which enabled the project’s engineers to implement autonomous driving in adverse weather conditions, a real competitive advantage at the time.
The example clearly illustrates the point: optimized performance unlocks the potential to expand a system’s feature set without requiring a hardware upgrade and the system redesign that would likely follow.
Improve Quality of Existing Features
Performance optimization can also improve the quality of your product’s existing features. Often, this manifests as the ability to process more of the same data (e.g., radar or video), leading to outputs of higher fidelity or a shorter latency between a real-world event and the system’s reaction.
In the past, we helped a team that created a robot lawn mower. The mower’s navigation was video-based, and accurate location tracking was paramount because the robot oriented itself without a pre-installed perimeter wire around the lawn. To maintain correct operation, the robot’s turning speed had to be deliberately constrained so that video processing could keep pace with its angular velocity. The engineers identified the extraction of image features as the bottleneck that prevented the required frame rate.
We optimized this critical feature-extraction component by a factor of 7, making the image-processing pipeline fast enough to keep up with higher angular velocities; as a result, the robot operated more smoothly. Here, our performance optimization directly improved the user experience.
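To see why a factor of 7 on one stage matters so much, consider a simple back-of-the-envelope model: the pipeline’s frame rate is limited by the sum of its stage times, and the tolerable angular velocity grows with the frame rate if the tracker can only bridge a bounded rotation between consecutive frames. All numbers below are illustrative assumptions, not measurements from the actual project.

```cpp
// Back-of-the-envelope model: how speeding up the dominant pipeline
// stage raises the frame rate and, with it, the tolerable angular
// velocity. All numbers are illustrative assumptions.
#include <cstdio>

int main() {
    // Assumed per-frame costs of a sequential video pipeline (ms).
    const double captureMs = 5.0;
    double featureExtractionMs = 70.0;  // the dominant stage
    const double poseUpdateMs = 5.0;

    // Assumed limit: tracking stays stable as long as the camera
    // rotates at most 3 degrees between consecutive frames.
    const double maxDegreesPerFrame = 3.0;

    for (int pass = 0; pass < 2; ++pass) {
        const double frameMs = captureMs + featureExtractionMs + poseUpdateMs;
        const double fps = 1000.0 / frameMs;
        const double maxDegPerSec = fps * maxDegreesPerFrame;
        std::printf("feature extraction %5.1f ms -> %5.1f fps -> max %6.1f deg/s\n",
                    featureExtractionMs, fps, maxDegPerSec);
        featureExtractionMs /= 7.0;  // apply the factor-7 optimization
    }
    return 0;
}
```

Note the Amdahl-style effect: in this model, a 7x speedup of the dominant stage yields a 4x higher frame rate (80 ms down to 20 ms per frame), because the remaining stages then dominate.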
Improve Stability
Worst-case scenarios are, by definition, edge cases in which a system is pushed to the limits of its design envelope. Consequently, the likelihood of parts of the system breaking down increases dramatically. Such worst cases exist not only for a product’s physical function (like exposing a screw to its maximum admissible stress), but also in terms of runtime performance.
Let’s take the example of a driver-assistance system that operates on radar sensor data. The more objects it detects, the longer the radar-processing pipeline typically takes. When overwhelmed, the system may be forced to abandon the input data it is currently processing and prioritize the subsequent frame instead; after all, it is a real-time system, and there is no value in results that arrive too late.
While dropping the input currently being processed might be justifiable on safety grounds, a recurring pattern of dropped data represents a significant risk. The issue is often exacerbated by the “bursty” nature of these high loads; after all, the detections represent objects in the real world, and those do not tend to simply vanish. If the system is overloaded while processing a single frame, it is therefore highly likely to experience a similar overload in the following frames, potentially cascading into a series of consecutively dropped frames.
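A sketch of the kind of freshness check behind such a drop policy is shown below. The type names, the queue, and the 50 ms budget are assumptions for illustration; a production system would typically integrate this with its real-time scheduler rather than a plain polling loop.

```cpp
// Sketch of a "drop stale input, take the freshest frame" policy.
// Type names and the 50 ms budget are illustrative assumptions.
#include <chrono>
#include <deque>

using Clock = std::chrono::steady_clock;

struct RadarFrame {
    Clock::time_point capturedAt;
    // ... raw detections would live here ...
};

// Frames older than this no longer produce useful real-time output.
constexpr auto kFrameBudget = std::chrono::milliseconds(50);

// Pop frames until one is still fresh enough to be worth processing.
// Returns false if every pending frame was already stale.
bool nextFreshFrame(std::deque<RadarFrame>& queue, RadarFrame& out) {
    while (!queue.empty()) {
        RadarFrame frame = queue.front();
        queue.pop_front();
        if (Clock::now() - frame.capturedAt <= kFrameBudget) {
            out = frame;
            return true;  // fresh enough: process it
        }
        // Stale frame: skip it and prioritize newer data. If this
        // branch fires repeatedly, the system is in the bursty
        // overload regime described above.
    }
    return false;
}
```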
In the past, we helped multiple driver-assistance projects with exactly this issue: we identified the worst-case scenarios and optimized processing latency and throughput for precisely those scenarios. (Of course, this benefits far more than just the worst cases.) The projects gained a more robust and predictable system response under pressure, ultimately enhancing overall system safety and dependability.
Conclusion
By making your code more efficient, you can add new features, raise the quality of existing ones, and stabilize behavior under worst-case load. In short: you extend the lifespan of your software and provide a more robust platform for future innovation.
[1] Though it is worth noting that some functions, a boot process for example, rise in quality precisely through the reduction of resource consumption.
[2] Note that “fully utilized” does not imply “utilized well.” You might work 10 hours per day, but if 9 of those are spent filling meaningless Excel sheets, the waste is apparent. Likewise, 99% CPU utilization does not imply that 99% of CPU cycles are contributing to customer value.