Servers make for the edge of their world
Latency is everything for 5G, but it's only part of the story for those who want to get processing closer to industrial control.
As with the rollouts of 3G and 4G over the past couple of decades, 5G’s arrival is accompanied by grand claims as to what will be possible with faster wireless transmissions. Twenty years ago, we were going to get ubiquitous videoconferencing. In reality, it was quicker web surfing and email and, for the first few years, physically hot handsets.
Today, we have everyday video calls, though only a tiny minority actually use video calls every day. With 5G, the big claim is about latency and what that means for robotics, industrial control and virtual reality. You can build smaller, lighter robots and VR headsets by moving much of the processing to remote computers, with 5G providing the last-mile connection. Prior wireless standards favoured bandwidth over latency, leading to round-trip delays that could hit 50ms to 100ms based on protocol processing alone. Thanks to fast paths through the 5G stack, that can come down as low as 1ms, assuming your 'remote' computer is next door.
The problem with remote processing based on cloud computing is that you have to send the data a long way. You can do as much as you like to improve the throughput of computers through custom accelerators and dropping entire server blades into chilled mineral oil to keep them from overheating, but you are never going to overcome the delays incurred by the finite speed of photons moving through hundreds of kilometres of fibre-optic cable.
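The arithmetic makes the point starkly. Light in silica fibre travels at roughly two-thirds of its vacuum speed, a common rule of thumb of about 200,000km/s, so distance alone sets a floor under round-trip time. A minimal sketch (the distances are illustrative):

```python
# Back-of-the-envelope propagation delay through optical fibre.
# Assumes light travels at c/n with refractive index n ~ 1.5,
# i.e. roughly 200,000 km/s -- a common rule of thumb.

C_VACUUM_KM_S = 300_000   # speed of light in vacuum, km/s
FIBRE_INDEX = 1.5         # typical refractive index of silica fibre

def fibre_rtt_ms(distance_km):
    """Round-trip propagation delay in milliseconds for a one-way
    fibre run of distance_km (ignores routing and queuing delays)."""
    speed = C_VACUUM_KM_S / FIBRE_INDEX   # ~200,000 km/s in glass
    return 2 * distance_km / speed * 1000

# A data centre 300 km away eats 3 ms before any processing happens;
# an edge server 1 km away costs about 0.01 ms.
print(f"{fibre_rtt_ms(300):.2f} ms")   # 3.00 ms
print(f"{fibre_rtt_ms(1):.3f} ms")     # 0.010 ms
```

On those numbers, a distant cloud data centre has already blown a 1ms latency budget on propagation alone, which is why the servers have to move.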
The answer is to move the cloud closer: a lot closer, to the edge. Servers will sit next to the fibre-to-the-kerb gateways along roadsides, be mounted on basestation masts or lie in cabinets not far away. Factories and industrial sites will have them on-campus in their own closets and, in many cases, scattered around the shopfloor or warehouse. Instead of being dedicated to specific machine tools, they will provide services for lots of different pieces of equipment, not necessarily in the same owner’s facility.
Data-centre servers are moving to increasingly exotic architectures to try to keep pace with workload demands. Those wanting to use edge infrastructure, by contrast, will have far less choice over what hardware is available, and that constraint is influencing the way suppliers such as Arm design their processors.
With the cloud today, if you don’t like one data centre’s offering, you can easily redeploy onto one that fits your application better. The whole software infrastructure has evolved to make it possible, starting with the executable images that customers upload to their favoured site.
Rather than recompiling an application for each different node, the favoured trick is to use a container. This is a software technique that takes advantage of some facilities already in Linux to wall off an application from others that run on the same processor, which improves security. It has a sizeable speed advantage over using virtual machines managed by a hypervisor. However, one of the big reasons for using containers from the likes of Docker is that it makes it easier to move the applications around. You can personalise the container to have your favoured set of Linux services and have them always be available no matter where you run the container itself. Without the container, the programmer has to take account of the specific operating-system environment each blade is running.
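A container image bakes that favoured environment in, so the application sees the same userland on whatever blade it lands on. As a purely illustrative sketch (the base image, package and paths here are hypothetical, not from the article):

```dockerfile
# Illustrative only: a minimal image that carries its own userland,
# so the application behaves identically on any host that runs it.
FROM ubuntu:22.04

# Pin the Linux services and libraries the application expects,
# instead of relying on whatever the host blade happens to run.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 \
    && rm -rf /var/lib/apt/lists/*

COPY app.py /opt/app/app.py
CMD ["python3", "/opt/app/app.py"]
```

Build it once and the resulting image can be shipped to any node with a container runtime, which is precisely the portability the orchestration layer depends on.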
Thanks to a consistent interface between the application and the underlying hardware, developers can take advantage of automated tools such as Kubernetes to work out where best to run each container. For example, a big workload that needs to be distributed over many nodes can be set up to wait for a batch of blades to become available rather than wind up being split across racks or even different data centres. Nodes can spin up and down automatically to process the results of those workloads. The hardware efficiency this brings compared to old-style workload management is dramatic. I’ve spoken to operators who reckon many workloads last just a matter of minutes before completing and being taken off a server to be replaced by something else.
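The placement decision described above can be sketched in a few lines. This is a toy model, not how Kubernetes is implemented (real schedulers use affinity rules and resource quotas); the rack names and counts are hypothetical:

```python
# Toy gang-scheduling decision: place a distributed job only when a
# single rack can host the whole batch; otherwise keep it queued
# rather than splitting it across racks or data centres.

def place_job(racks, nodes_needed):
    """racks maps rack name -> free node count. Return the name of a
    rack with enough free nodes, or None to keep the job waiting."""
    for rack, free_nodes in racks.items():
        if free_nodes >= nodes_needed:
            return rack
    return None  # wait: no single rack can take the whole batch

free = {"rack-a": 4, "rack-b": 10}
print(place_job(free, 8))   # rack-b: the workload stays together
print(place_job(free, 12))  # None: the job waits instead of fragmenting
```

The "wait rather than fragment" branch is the interesting one: a short queueing delay is traded for the much larger communication penalty of a workload smeared across racks.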
Acceleration adds a wrinkle: it demands that specific types of hardware be available in a data centre. For example, Amazon’s EC2 F1 instances provide a way to access field-programmable gate arrays (FPGAs) remotely. The accelerators built on this programmable hardware are used to simulate not just electronic circuits but things such as financial models, and to power genomic analysis. Being as close to the hardware as you can get, these applications do not move around so readily. High-level synthesis and compilation tools will make it easier to use different substrates if a favoured target is not available, in much the same way OpenCL made it possible to program different GPUs from the same high-level code. Such portability is not an issue when you can simply shift to a data centre that’s a bit further away.
If you are developing applications that are more likely to run in a roadside cabinet or a closet somewhere on a factory campus because you are concerned about latency, your choices as to what hardware is available are far more limited. Arm system architect and fellow Rob Dimond says the advantage of a conventional CPU architecture is that you can pretty much guarantee to find one anywhere. “We see AI as an absolutely key workload at the infrastructure edge, but CPUs are really the default machine-learning processors,” he says.
To that end, Arm has started adding AI-oriented instructions, such as multiply operations that work on bfloat16, a floating-point format used almost exclusively by these workloads, while many other companies roll out dedicated machine-learning accelerators, both for embedded systems and for data centres.
CPUs might not run these workloads as quickly as the dedicated accelerators, but the argument will be that at least you know the CPUs will be there: you can safely compile for them and have orchestration software direct workloads to the nearest available node. This is where Arm’s latest research programme comes in. Dimond says all are welcome to work on Project Cassini, including competitors. The aim is to bring cloud-style tooling to servers on the edge, with some additions. “We see the future as a software-defined platform,” he says. “We see people deploying onto Docker containers.”
This shift to the edge introduces a raft of new problems. Data centres are high-security installations; a roadside cabinet is far more vulnerable. Data centres are also staffed around the clock, whereas if something goes wrong at an edge site, maintenance crews have to make their way there before they can even start on a fix. And energy is far more constrained.
Orchestration software will need to be a lot more careful about where it deploys workloads. Some nodes will be, in effect, locked down to avoid them getting bricked by wayward updates or applications. The metrics used to determine whether a workload can run on a particular box will need to be more subtle, taking into account factors such as round-trip latency and more constrained hardware. An edge node could as easily be a Raspberry Pi in a box as a rack full of passively cooled server blades designed specifically for the job.
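Those subtler metrics can be sketched as a filter-and-rank step. Everything here, including the node names and figures, is hypothetical, a minimal model of the kind of decision an edge scheduler would have to make:

```python
# Sketch of richer edge placement: weigh round-trip latency, limited
# memory, and whether a node is locked down against new deployments.
from dataclasses import dataclass

@dataclass
class EdgeNode:
    name: str
    rtt_ms: float      # measured round-trip latency to the workload's users
    free_mem_mb: int   # spare memory on the box
    locked: bool       # locked down against wayward updates

def pick_node(nodes, max_rtt_ms, mem_needed_mb):
    """Choose the lowest-latency unlocked node that fits the workload,
    or return None if no node qualifies."""
    candidates = [n for n in nodes
                  if not n.locked
                  and n.rtt_ms <= max_rtt_ms
                  and n.free_mem_mb >= mem_needed_mb]
    return min(candidates, key=lambda n: n.rtt_ms, default=None)

nodes = [
    EdgeNode("roadside-pi",   rtt_ms=2.0, free_mem_mb=512,  locked=False),
    EdgeNode("factory-blade", rtt_ms=1.0, free_mem_mb=8192, locked=True),
    EdgeNode("mast-server",   rtt_ms=4.0, free_mem_mb=4096, locked=False),
]
best = pick_node(nodes, max_rtt_ms=5.0, mem_needed_mb=1024)
print(best.name)  # mast-server: the factory blade is locked, the Pi too small
```

Note that the fastest node loses out here because it is locked down, and the nearest loses because it is a resource-starved box: exactly the trade-offs a cloud-style scheduler never had to think about.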
There will, naturally, be the question of whether the bright future for edge servers genuinely beckons in the near term: it may take the arrival of 6G for them to become a real factor in design. But that will at least give the companies working in the sector more time to get the necessary infrastructure software up and running.