Introduction
How to best set up development hardware in the real world for CI integration.
In this two-part article I will show how development hardware and eval boards can be integrated into the CI workflow (with a focus on the hardware side) and why you would want that. This is not going to be a tutorial for any given actual CI system. No scripts, no pipeline definitions, no configuration files. Rather, this article will show the experiences gained during the devising, setup, and maintenance of such systems.
Motivation
In one project we were tasked with measuring performance “live” on-target as well as ROM usage in general to make sure that future commits didn’t accidentally make the software too slow or too big. How to best go about that? Well, the best way is to verify the performance on actual, real-world hardware. The closer that hardware is to what we will later have in the final product, the better.
This “real-world” hardware often comes in the form of eval boards: One or more PCBs with components on them, a lot of LEDs and connectors, and various doodads and bits sticking out. Quite possibly there is no case provided, either.
These devices are intended for individual developers to play with them, but we’ll not let ourselves be limited by the manufacturer’s idea of how they are to be used. And, while it may be justifiable to give every developer an eval board that costs € 50, at € 50,000 things look a bit different.
Even if we were to give every developer their own board, how would we verify that performance goals are met? It’s not that we don’t trust them, but when everyone works on their own bit of the code, at merge-time there might be a rude awakening. No, we need to integrate the performance measurements into the CI workflow directly. This way we can warn about unmet performance goals (or, depending on how far along your project is, flat-out refuse to merge PRs that degrade performance past a red line). On top of that, the performance engineers can see early on when and where performance takes a nosedive.
Furthermore, it removes the burden of setting up hardware and software (at least partially) just to measure resource usage. The fewer error-prone steps the developer has to go through, the better, and if we remove the actual steps required to get the hardware up and running, this saves the valuable time of possibly dozens of people.
For the project managers, a system that tells them the current performance metrics is invaluable, too, since they can see at a glance if the project is moving in the right direction. For example, is the project getting faster, slower, or staying the same over some period of time?
Reviewers also benefit, since they can see if a certain commit should focus on performance more, or if that is not a concern for the given code.
Integration
The actual integration was relatively easy, if somewhat time-consuming. Since the CI was a tad overloaded, commits could take a few hours to make it through the build queue. Also, the home-grown logging system they used wasn’t a good fit for what we intended to log. In the end, though, we had an integration that checked both the performance and ROM usage against a specified limit and rejected the PRs if they went above those numbers.
Checking the ROM usage is easy enough: we just had to compare the binary size resulting from the build process to the defined limit [1]. Checking the performance is also “easy”: we instrumented everything appropriately, gave it a fixed input and then just had to compare the runtime we got from the software running on the eval board to the defined limit. But suddenly we were involving an actual, physical device, which exists in, well… reality. And reality is nothing, if not messy.
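Purely as an illustration of the two comparisons described above, and not as a reproduction of the actual integration, here is a minimal Python sketch. The function name `check_limits`, its parameters, and the idea of passing in an already measured runtime are assumptions made for this example; how the runtime gets off the eval board is left out entirely.

```python
import os


def check_limits(binary_path, runtime_s, rom_limit_bytes, runtime_limit_s):
    """Compare binary size and measured runtime against fixed limits.

    Hypothetical helper: returns a list of violation messages.
    An empty list means both checks passed.
    """
    violations = []

    # ROM check: the size of the build artifact stands in for ROM usage.
    rom_bytes = os.path.getsize(binary_path)
    if rom_bytes > rom_limit_bytes:
        violations.append(
            f"ROM usage {rom_bytes} B exceeds limit {rom_limit_bytes} B"
        )

    # Performance check: runtime_s is assumed to have been measured
    # on the eval board with a fixed input.
    if runtime_s > runtime_limit_s:
        violations.append(
            f"runtime {runtime_s:.3f} s exceeds limit {runtime_limit_s:.3f} s"
        )

    return violations
```

A CI job could fail the build (or merely print a warning, depending on how far along the project is) whenever the returned list is not empty.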
Experiences
The case presented above is just one example of multiple similar setups I had to tend to. So, the following experiences are not all from this specific setup, but rather a collection of things encountered over the years.
Not directly computer-related
Some things are not directly related to computers, eval boards or the like, but more about their physical environment.
Space
Some say it’s the final frontier. I say you need it, and quite possibly lots of it, because your devices have to be located somewhere, and that somewhere is going to be messy.
Most eval board setups are a random jumble of cables and one or more PCBs. And these devices and their infrastructure need to go somewhere. I will assume that you have a separate, closed-off room for your devices. This is not a necessity, of course, and may not even be possible in your situation, but it will greatly simplify things. If you cannot get a room, place your devices out of the way of foot traffic to minimize risk.
If you are in a position where you can create a mount for a 19-inch rack to put your devices in, great! If not, something like a heavy-duty shelf works well, too. Sure, the devices lie around on the shelf, not being fixed to anything, but better like this than on a desk or the floor.
Also, get some cable binders, ideally velcro ones, and preferably ones you cut to your requirements, not fixed-length ones. All those cables should be at least somewhat organized: You may later shift or remove devices, and knowing what is what will come in very handy.
Finally, have the location checked by your safety officer, pointing out all the things you have planned for. You may need to name one or more contacts for your room, so be sure you have talked to those people ahead of time. Also, you may have to have regular safety assessments, and may have to give regular safety trainings to colleagues who need access to your room. After all of this, ask the safety officer to give their okay. In the absence of further safety regulations, following this procedure helps ensure you didn’t overlook something obvious.
Network integration
If your devices offer network connectivity, chances are your developers want to use it. Especially in larger companies there might be rules and regulations you need to follow with regard to what you can and cannot plug into your network, though. It would be wise to not antagonize your IT department and to check with them what you are allowed to do before plugging something into the network [2].
There are three possibilities:
- Your location offers network access for however many devices you have, and your network policies allow “random” devices to access the network [3]. In this case you’re set to go.
- Your location does not offer as many network ports as you require, but you can plug in an ethernet switch. In this case get a switch (future-proof it by choosing a model with at least a few more network ports than you require at the moment).
- Your location does not offer as many network ports as you require, and you are not allowed to plug in an ethernet switch. In this case still get the ethernet switch, but create a private network.
Also, don’t forget the network cables. Most eval boards don’t come with network cables. Nothing worse than being stopped dead in your tracks by a missing piece of € 0.50 ethernet cable.
Cooling
If you have multiple “bigger” devices cooling may become an issue, fast. But even smaller devices sitting in a closed-off, possibly windowless room when it’s 40 °C outside may add quite a lot of heat to an already hot room.
For a quick back-of-the-envelope calculation, let’s assume you want to run 12 instances of a mid-powered eval board that requires 60 W during regular operation, plus two PCs at 250 W each. That amounts to 12 · 60 W + 2 · 250 W = 1,220 W. Modern buildings require something like 50 to 60 W of heating per square meter, which means one can heat roughly 20 to 25 m² with the power output of your setup alone. Let’s also assume we only have a 3 m by 4 m room. Suddenly you’re putting about 100 W per square meter into your 12 m² room, almost double what you would normally use to heat it. So you may want to check the temperature regularly, possibly via automated means, and, if at all possible, get an air-conditioned room.
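If you want to redo this estimate for your own setup, the arithmetic fits in a few lines of Python. This is only a sketch using the figures from the calculation above (12 boards at 60 W, two PCs at 250 W, a 12 m² room); the function and variable names are made up for this example.

```python
def heat_load_watts(devices):
    """Sum the power draw of (count, watts_per_device) pairs."""
    return sum(count * watts for count, watts in devices)


# Figures from the example: 12 eval boards at 60 W, 2 PCs at 250 W.
setup = [(12, 60), (2, 250)]
total_w = heat_load_watts(setup)

# The room from the example: 3 m by 4 m.
room_area_m2 = 3 * 4
load_per_m2 = total_w / room_area_m2

# Area one could heat at the assumed 50 to 60 W per square meter.
heatable_min_m2 = total_w / 60
heatable_max_m2 = total_w / 50

print(f"Total heat load: {total_w} W")
print(f"Load per square meter: {load_per_m2:.0f} W/m²")
print(f"Heatable area: {heatable_min_m2:.0f} to {heatable_max_m2:.0f} m²")
```

Swapping in your own device counts, power figures, and room size gives a first idea of whether cooling will be a problem before the room is set up.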
And if your devices use liquid cooling you need to plan ahead. Liquid cooling devices either need to be refilled, or they need to be replaced completely. Yes, one-time use liquid cooling devices are a thing, so be sure to check the manual for its intended replacement period.
But even if the device you use can be refilled, that also needs to be planned. The cooling liquid needs to be at hand when the time comes, and, possibly, needs to go through a release process in your company, because it is, after all, a chemical. And chemicals are usually subject to workplace safety regulations, so you might need training, a safe storage place and rules for getting rid of spillage or unneeded remainders.
Electrical and Fire Safety
Especially smaller eval boards more often than not do not come with a case. That is fine if a board sits on a developer’s desk (it really is not fine, but that’s another story). If such a board now sits in an “empty” room, quite possibly running 24/7, this is not “fine” anymore, because of electrical and fire safety.
Electrical shorts or misbehaving components may, in the worst case, start fires that endanger lives. To mitigate this risk, always run devices in their fully assembled and closed cases. If that is not possible for any reason (e.g. as mentioned already, many devices simply do not have a case available at all), you can always get a generic fireproof case. Then have a qualified person install the device in that case, drilling holes for cables and fastenings as needed.
Summary
You must not neglect the physical environment the devices are going to live in. Because doing so might — if worst comes to worst — actually endanger lives. But even lower-tier risks are worth taking into account. Because, why not make your life as easy as possible with a little foresight?
We’ll continue this article in part 2 with all the more directly computer-related experiences.
[1] Depending on the implementation, it is a bit more involved. But for our purposes I’ll stick to the easy-as-pie size comparison.
[2] Not antagonizing your IT department is always a wise suggestion! From having worked in IT for over a decade I can tell you that IT is totally willing to help. Yet, many people seem to think that the IT department’s raison d’être is to say no to everything. As long as you don’t approach them under the assumption that they will be a roadblock you have to work around somehow, and as long as you are open to changes in your approach to the solution you envision, the IT department will probably be more than happy to help you.
[3] Or allow access to “random” devices after a security evaluation. In that case, schedule the appropriate amount of time for that evaluation to take place!