What do I want to build?
The previous blog post looks at why I decided to build the PiRack and what the finished product is. This one talks about the design decisions and compromises I made to get there. It started with these rough design goals:
- A server with Raspberry Pi Compute Module 4s in blade format
- Takes as many blades as possible
- Looks and functions like a normal server
- Has centralised management and provisioning of the blades
Although I started this project to fulfil a specific need I had, it was designed from the start with half an eye on turning it into a full product if it worked well. This added another two goals:
- Doesn’t cost a lot of money
- Uses easily available off-the-shelf components as far as possible
Starting at the beginning: the case
Everyone has their own process, but when I’m designing something that’s going to end up as a “finished” device (rather than just test/development boards), I tend to start with the case.
For the PiRack, this meant a simple starting point: it should be a 19” rack case, and it should be an existing, cheap, off-the-shelf case. Unfortunately, this doesn’t narrow things down much as there’s a huge variety of different 19” enclosures available.
The height of rack cases is measured in “rack units”, which are 1¾” (44.45mm) tall. Compute Module 4s are 55mm wide by 40mm tall. For vertically mounted blades, then, the CM4’s physical dimensions determine the minimum height. At 40mm, a single rack unit of 44.45mm just isn’t enough - the case itself will have a material thickness, and there needs to be a little margin for the blades to slide into. This means a case that’s at least 2U high is needed.
At 2U (88.9mm), there is plenty of space to fit in everything the blade needs, so there’s no point in going for something taller. The shorter the case, the more cases can be fitted into a rack. So, a 2U case it is.
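As a quick sanity check of that reasoning (the wall thickness and clearance figures below are illustrative assumptions, not measurements of the actual case):

```python
# Rough check: does a vertically mounted CM4 fit in 1U or 2U?
# Wall thickness and clearance are assumed example values.
RACK_UNIT_MM = 44.45    # height of one rack unit
CM4_HEIGHT_MM = 40.0    # CM4 short edge when the blade is vertical
WALL_MM = 2.5           # assumed case material, top and bottom
CLEARANCE_MM = 3.0      # assumed margin so blades can slide in

needed = CM4_HEIGHT_MM + 2 * WALL_MM + CLEARANCE_MM
for units in (1, 2):
    available = units * RACK_UNIT_MM
    verdict = "OK" if available >= needed else "too small"
    print(f"{units}U: {available:.1f}mm available vs {needed:.1f}mm needed -> {verdict}")
```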
The next choice is metal vs plastic. The vast majority of servers are metal, but there were a couple of factors swaying me away from this. Firstly, cost: plastic cases are generally cheaper than metal. Given the cost goal, this was an important factor. Secondly, even though I wanted to use off-the-shelf parts, some bits were always going to have to be custom. At the very least, the bezel on the front of the blade would need to be built specifically for the PCB. Again, in order to keep costs down, this would likely mean 3D printed plastic rather than custom CNC’d metal. If there were going to be visible plastic parts anyway, then I felt that a fully plastic case would be better.
Looking around various distributors, I found a nice 2U black ABS case which does the job nicely. It is just 205mm deep, so it fits into shallow racks easily. There was plenty of stock, and the price was sensible. I bought one and was happy with the quality, so the case was chosen!
Power
Deciding how to power the PiRack was probably the easiest design decision of the whole project. ATX power supplies that provide power to PCs and servers have been around for decades. Millions are built each year, so the prices are incredibly competitive. Given that I was building a server, using a standard server power supply was a no-brainer.
Next up is how much power it needs to deliver. The CM4 datasheet puts the typical maximum load current at 1.4A @ 5V, around 7W, and other benchmarks seem to bear this out. The power supply, then, isn’t going to need to be able to deliver a huge amount of power.
After doing some research, I found that I could get small 1U power supplies from a few places in a fairly standard form factor: around 150mm x 82mm x 41mm. These were available with lots of different power outputs, typically starting at 200W. Even allowing for overhead, 200W is enough to power a lot of CM4s.
(Slight digression: a 200W PSU will deliver 200W of total power, but this is split into different power rails. A 200W PSU may only be capable of delivering 100W at 5V.)
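To put some rough numbers on this (the 5V rail limit and per-blade overhead below are assumptions for illustration; the real figures depend on the PSU and what’s fitted to each blade):

```python
# Back-of-the-envelope power budget for the 5V rail.
CM4_MAX_W = 7.0         # ~1.4A @ 5V per CM4, from the datasheet
BLADE_OVERHEAD_W = 2.0  # assumed allowance for NVMe, RP2040, display, etc.
PSU_5V_LIMIT_W = 100.0  # example: a "200W" PSU that can only do 100W at 5V

per_blade = CM4_MAX_W + BLADE_OVERHEAD_W
print(f"Per blade: ~{per_blade:.0f}W")
print(f"Blades the 5V rail can support: {int(PSU_5V_LIMIT_W // per_blade)}")
```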
How many blades?
Choosing the case and power supply fixed certain minimum dimensions and gave some new design requirements.
At 205mm, the case is quite shallow. This makes for a neat design that fits into small cabinets, but does restrict the space available to fit components into. The power connector needs to be at the back of the case, so the PSU needs to be mounted that way around. As it’s 150mm long, there’s only around 40mm of space in front of it, and there’s no way a blade could be short enough to fit there. This meant that the 82mm width of the PSU was unavailable for blades.
After subtracting the PSU width from the available space inside the case, there is around 330mm left to take blades.
There is a minimum width for each blade, which is roughly the combined thickness of the blade PCB, the CM4 connectors, the CM4 itself, and a heatsink (more on the heatsinks and cooling below). This is around 13mm, so in theory you could fit around 25 of these. However, that doesn’t include any connectors, and I was also keen to include a small status display.
In the end I went for a blade width of 30mm, with slots for 10 blades. This leaves lots of room for airflow, as I wanted to be cautious with cooling. Even at full blast, the whole server is unlikely to draw more than 100W of power.
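The arithmetic behind those numbers looks roughly like this:

```python
# How many blades fit across the space left after the PSU.
USABLE_WIDTH_MM = 330
MIN_PITCH_MM = 13       # PCB + CM4 connectors + CM4 + heatsink, roughly
CHOSEN_PITCH_MM = 30

print("Theoretical maximum:", USABLE_WIDTH_MM // MIN_PITCH_MM)    # ~25 blades
print("At the chosen pitch:", USABLE_WIDTH_MM // CHOSEN_PITCH_MM) # 11, so 10 slots with spare clearance
print("Worst-case CM4 power for 10 blades at ~7W each:", 10 * 7, "W")
```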
The backplane
As one of my design goals was centralised management, it was clear from the start that some kind of backplane would be needed to connect the blades together. I did a quick count and reckoned that I’d need at least 20 pins to carry the various different power and signal lines.
There are lots of backplane connectors on the market, but “cheap” and “easily available” meant that I quickly homed in on one specific connector: PCI Express. These are used in computers to connect expansion cards to motherboards, and so, just like ATX power supplies, they are manufactured in huge quantities. The connectors are cheap, and you only need one physical connector per slot - the card edge of the blade itself forms the other half of the connection.
A standard PCI-e x1 connector has 36 pins, which is more than enough for this. One word of warning: although it uses the PCI-e connector, the blade is NOT PCI-e compatible! Don’t plug a PCI-e card into the PiRack, and don’t plug a PiRack blade into a computer!
RP2040
In order to do the management tasks needed by the server, microcontrollers were going to be needed in various different places. I went almost immediately to Raspberry Pi’s RP2040 chip.
In many ways this is the wrong choice, as it’s much more powerful than is really needed for most of these tasks. There are cheaper, smaller microcontrollers that would have done the job well.
The main reason I chose the RP2040 was simple: familiarity. I’ve used it in a number of my projects and know it fairly well, so it was easy to drop into this one. It’s also extremely well documented and supported. The price is a little higher, but not so much that it was worth doing a lot of extra work to use something cheaper. It’s also nice to keep the whole thing within the Raspberry Pi family!
Blades
With a maximum blade width of 30mm decided, the next step was how long and how tall each blade should be. The height was fairly easy: it’s just the height of the inside of the case minus space for the guide rails, which ended up being 68mm.
The length was mostly determined by the NVMe cards. I wanted to be able to fit the largest of these, the 22110 cards. At 110mm long, this gave the absolute minimum length. Space was needed at one end for the front panel connectors, and at the other for the card edge connector. A blade length of 145mm then left enough space for the backplane and fans.
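As a sketch of the length budget (the connector allowances here are rough assumptions, not measurements from the final PCB):

```python
# Blade length and what's left of the 205mm case depth.
CASE_DEPTH_MM = 205
NVME_22110_MM = 110        # longest M.2 card to be supported
FRONT_CONNECTORS_MM = 20   # assumed space for the front-panel sockets
EDGE_CONNECTOR_MM = 15     # assumed space for the card-edge fingers

blade_length = NVME_22110_MM + FRONT_CONNECTORS_MM + EDGE_CONNECTOR_MM
print("Blade length:", blade_length, "mm")                           # 145mm
print("Left for the backplane and fans:", CASE_DEPTH_MM - blade_length, "mm")
```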
The CM4 has lots of different functionality, so I then had to decide what I wanted to break out from it. As this is a server-focused design, this is what I ended up with:
- On the front of the blade:
  - Gigabit Ethernet socket
  - HDMI socket
  - USB-A socket
  - M.2 M-key NVMe socket
  - SD card slot
The SD card slot allows CM4 Lite modules to be used, though these can also run without an SD card and boot from an NVMe drive instead.
I also decided to add a battery-backed real-time clock to allow the blade to keep its time while powered off. I chose the PCF8563 as these are simple, easily available, and already supported in the Pi kernel.
An FSUSB42 switch allows the CM4’s USB to be routed two ways: either to the type A socket on the front of the blade, or to the backplane connector.
The CM4’s UART0 is also connected to the backplane. This allows the serial console to be connected to the management server.
An RP2040 is used for management of the blade. This is powered even when the CM4 itself is off, allowing it to control the CM4’s power. It can also be used to select the CM4’s boot mode, so that provisioning can be done via the backplane USB. There is also a small EEPROM used to allow the RP2040 to store configuration data.
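As a very rough sketch of what that blade-management firmware might look like (MicroPython, with made-up GPIO numbers; the real firmware and pin mapping will differ, and I’m assuming power is switched through an enable line and boot mode through the CM4’s nRPIBOOT pin):

```python
# MicroPython sketch: blade RP2040 controlling the CM4's power and boot mode.
# GPIO numbers are hypothetical placeholders, not the real pinout.
from machine import Pin

cm4_power_en = Pin(10, Pin.OUT, value=0)  # assumed enable for the CM4's 5V supply
cm4_nrpiboot = Pin(11, Pin.OUT, value=1)  # low = force the CM4 into USB boot mode

def power_on(provision=False):
    # For provisioning, hold nRPIBOOT low so the CM4 enumerates as a USB device
    # and can be flashed over the backplane USB by the management server.
    cm4_nrpiboot.value(0 if provision else 1)
    cm4_power_en.value(1)

def power_off():
    cm4_power_en.value(0)
```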
The RP2040 also drives an OLED display on the front of the blade, independent of the CM4 itself. Why a display rather than status LEDs? It goes back to my days as a network engineer. If you’re working in a busy rack, it can often be tricky to identify which server you want to work on. If it’s a production rack, unplugging something to find out whether it really is the server you thought it was makes for some very nervous moments! Servers often have some sort of ID LED which can be triggered from software, but I thought I’d go a step further. The OLED means you can even display an IP address, so you can quickly see the name and address of a blade.
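Driving that display from the RP2040 is simple enough. A minimal sketch, assuming an SSD1306-style I2C module and the common micropython-ssd1306 driver (the actual display, driver and wiring on the blade may well be different):

```python
# MicroPython sketch: show a blade's name and IP address on a small I2C OLED.
# Assumes a 128x32 SSD1306-compatible module and the micropython-ssd1306 driver.
from machine import I2C, Pin
import ssd1306

i2c = I2C(0, scl=Pin(1), sda=Pin(0))      # hypothetical pin assignment
oled = ssd1306.SSD1306_I2C(128, 32, i2c)

def show_status(name, ip):
    oled.fill(0)
    oled.text(name, 0, 0)
    oled.text(ip, 0, 12)
    oled.show()

show_status("blade-07", "192.168.1.57")   # example values only
```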
Backplane design
The backplane is pretty straightforward in concept. It’s just a long PCB with evenly spaced PCI-e connectors, providing power and data to the blades. It has holes cut into it to allow the fans at the back to draw air across the blades.
The PSU is connected to the backplane through the standard 24-pin ATX power connector. ATX PSUs provide a number of power rails through this connector: +3.3V, +5V, +12V, -12V and a +5V standby rail. The standby rail is always on when the power lead is connected, and the others are switched on by pulling the PS_ON# pin to ground.
The PiRack uses the +5V rail to power the blades, and +3.3V for powering some of the RP2040s (the blades themselves have 3.3V LDOs to power their RP2040s from the +5V rail). The +12V rail is only used to power the three case fans, which are switched via MOSFETs.
The backplane needs to take the USB and UART connections from each blade and route them to the management server. The management server only needs to control one blade at a time, so some kind of multiplexer is needed.
To multiplex USB, a dedicated USB switch is needed in order to preserve the required impedance. Unfortunately, ICs which can switch 10 USB ports are nearly impossible to come by. So, to keep it simple and cheap, each of the 10 USB lines is instead connected to one of three FSUSB74 four-port switches. Each of these switches is in turn connected to a master FSUSB74. Setting the two switches in combination allows any of the 10 USB lines to be selected.
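Conceptually, selecting a blade’s USB is just a two-level lookup. The sketch below shows the idea; the actual select-line wiring on the backplane may differ:

```python
# Two-level USB mux: ten blade USB pairs feed three 4:1 switches,
# whose outputs feed a fourth "master" 4:1 switch.

def usb_mux_select(blade):     # blade is 0..9
    group = blade // 4         # which first-level switch the blade is wired to
    channel = blade % 4        # which input on that switch
    return group, channel      # the master switch then selects `group`

for blade in range(10):
    group, channel = usb_mux_select(blade)
    print(f"blade {blade}: switch {group} channel {channel}, master channel {group}")
```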
The USB needs to be connected from the backplane to the management server, and it needs to meet the USB impedance specification. The simplest way to do this is with a USB connector on the backplane, another on the management board, and a standard USB cable between the two.
The UARTs are simpler to deal with. A pair of 74HC4067 16:1 analog multiplexers are used, one for TX and one for RX.
These switches all need GPIOs to select the outputs, and GPIOs are also needed to detect the presence of each blade. After I added them all up I realised that there were too many for the RP2040. Using I2C-controlled GPIO expanders was the obvious solution, and in the end I used four TCA9535s. I was able to space these out across the backplane, which made routing much easier.
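Talking to the expanders from the backplane RP2040 is straightforward I2C. The register addresses below come from the TCA9535 datasheet, but the I2C address, pins and bit assignments are hypothetical:

```python
# MicroPython sketch: driving a TCA9535 GPIO expander over I2C.
from machine import I2C, Pin

EXPANDER_ADDR = 0x20   # set by the A0-A2 address straps (assumed here)
INPUT_PORT_0 = 0x00
OUTPUT_PORT_0 = 0x02
CONFIG_PORT_0 = 0x06   # 1 = input, 0 = output

i2c = I2C(0, scl=Pin(1), sda=Pin(0))

# Make port 0 all outputs (e.g. mux select lines) and set a pattern on it.
i2c.writeto_mem(EXPANDER_ADDR, CONFIG_PORT_0, bytes([0x00]))
i2c.writeto_mem(EXPANDER_ADDR, OUTPUT_PORT_0, bytes([0b00000101]))

# Reading an input port, e.g. blade-present pins on another expander.
present = i2c.readfrom_mem(EXPANDER_ADDR, INPUT_PORT_0, 1)[0]
print(f"input port 0: {present:08b}")
```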
Front panel and management server
I decided early on that I wanted a front panel and display to allow basic control of the server. As there is a chunk of space in front of the power supply where blades can’t go, this was an ideal place. There was plenty of room for a 2.4” LCD, and these are a sensible price. Five buttons allow control from the front panel, and an Ethernet management socket and a USB-C port complete it.
An RP2040 drives the display and front panel buttons, but isn’t really powerful enough to run a web server for remote management. A CM4 does this job much better. Unfortunately there just wasn’t enough space to mount the CM4 directly on the same board as the panel, so instead it’s mounted on a small daughterboard via another PCI-e connector.
The panel is connected to the backplane via an 8-pin ribbon.
Inter-board communications
All of these various boards and RP2040s need to talk to each other. Nothing needs to be done at a particularly high speed, so I used I2C. This is well supported in both the RP2040 and CM4, and extra I2C buses can be added to the RP2040 using the PIOs.
They aren’t all directly connected, however. The management board connects to the panel board, the panel board connects to the backplane, and the backplane connects to the blades. Each RP2040 can effectively act as a repeater to send an I2C request on to the next device in the chain.
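A minimal sketch of the downstream half of one of those hops, assuming hypothetical pins and addresses (the upstream half, where the RP2040 answers as an I2C target itself, isn’t shown here):

```python
# MicroPython sketch: the panel RP2040 relaying a register read on to a
# device on the backplane bus. The upstream side (answering the management
# board as an I2C target) is omitted here.
from machine import SoftI2C, Pin

downstream = SoftI2C(scl=Pin(3), sda=Pin(2))  # bus towards the backplane

def forward_read(target_addr, reg, length):
    # Issue the request on the downstream bus on behalf of the upstream side.
    return downstream.readfrom_mem(target_addr, reg, length)

print(forward_read(0x20, 0x00, 1))            # e.g. read a backplane expander
```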
What didn’t make the cut - the elephant in the room
There’s one glaring omission. Why have an Ethernet socket on each blade, and not have an integrated switch in the backplane? That was in my thoughts originally, but I decided it was just too complicated for a version 1 project.
The backplane would need a switch with at least ten 1GbE ports for the blades, plus an uplink port to the outside world. The uplink would really need to run at 10GbE or faster, otherwise the bandwidth to each blade would be seriously constrained. This is where things get complicated.
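Some rough numbers make the uplink problem concrete (simple division, ignoring switching overheads):

```python
# Per-blade share of the uplink if all ten blades talk to the outside world at once.
BLADES = 10
BLADE_LINK_GBPS = 1.0

for uplink_gbps in (1.0, 10.0):
    share = min(BLADE_LINK_GBPS, uplink_gbps / BLADES)
    print(f"{uplink_gbps:.0f}GbE uplink: ~{share * 1000:.0f}Mb/s per blade")
```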
To get the full bandwidth, you really need a single switch IC. Only Microchip really has something suitable with publicly available documentation (which would be needed if I were to keep it open source) that is sold through normal distribution channels. Broadcom has some, but there is less public information on them, and the other manufacturers keep their datasheets hidden.
The switch ICs are all BGAs, and the number of signal lines in 11 Ethernet connections makes for a much more complicated PCB. They are also not really “dumb” switches - they essentially run a network operating system. This is great in that it means you can do things like VLANs and QoS, but it adds massively to the firmware development time.
In the end I figured it just wasn’t worth the time and cost of trying to do it all at once. If there is sufficient demand for the PiRack, then I’ll go back and create an integrated switch version in the future.