A new algorithm could vastly simplify methods for finding the most efficient route across complex networks like the Internet.
For decades, mathematicians and computer scientists have wrestled with problems such as finding the most efficient way to transport items across networks like highway systems or the Internet, traditionally using a maximum-flow algorithm, also known as "max flow".
In a max-flow algorithm the network is represented as a graph with a series of nodes, known as vertices, and connecting lines between them, called edges, which each have a maximum capacity — just like roads or fibre-optic cables.
These algorithms attempt to find the most efficient way to send goods from one node in the graph to another, without exceeding the capacity constraints.
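As a rough illustration of that setup, the Python sketch below stores a small made-up network as a table of edges and capacities, then checks whether a proposed flow stays within every capacity. The node names, the capacities and the helper functions `is_feasible` and `flow_value` are hypothetical and are not taken from the researchers' work.

```python
# Hypothetical toy network: each edge (from_node, to_node) has a maximum capacity.
capacity = {
    ("A", "C"): 3,
    ("A", "D"): 2,
    ("C", "B"): 2,
    ("C", "D"): 1,
    ("D", "B"): 3,
}


def is_feasible(flow, capacity, source, sink):
    """Return True if `flow` respects every edge capacity and conserves
    flow at every node other than the source and the sink."""
    balance = {}
    for (u, v), amount in flow.items():
        # An edge may not carry negative flow or more than its capacity.
        if amount < 0 or amount > capacity.get((u, v), 0):
            return False
        balance[u] = balance.get(u, 0) - amount
        balance[v] = balance.get(v, 0) + amount
    # Whatever enters an intermediate node must also leave it.
    return all(net == 0 for node, net in balance.items()
               if node not in (source, sink))


def flow_value(flow, source):
    """Net amount of flow leaving the source."""
    sent = sum(f for (u, _), f in flow.items() if u == source)
    returned = sum(f for (_, v), f in flow.items() if v == source)
    return sent - returned


# A candidate flow that happens to fill every edge exactly to capacity.
flow = {("A", "C"): 3, ("A", "D"): 2, ("C", "B"): 2,
        ("C", "D"): 1, ("D", "B"): 3}
print(is_feasible(flow, capacity, "A", "B"), flow_value(flow, "A"))
```

A max-flow algorithm searches for the feasible flow with the largest possible value; the check above only verifies a candidate.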
But as the size of networks like the Internet has grown exponentially, it has become prohibitively time-consuming to solve these problems using traditional computing techniques, according to Jonathan Kelner, an associate professor of applied mathematics at MIT and a member of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL).
In response, Kelner and his team have created a new theoretical algorithm that can dramatically reduce the number of operations needed to solve the max-flow problem, making it possible to tackle even huge networks like the Internet or the human genome.
"There has recently been an explosion in the sizes of graphs being studied," Kelner said. "For example, if you wanted to route traffic on the Internet, study all the connections on Facebook, or analyse genomic data, you could easily end up with graphs with millions, billions or even trillions of edges."
In a paper to be presented at the ACM-SIAM Symposium on Discrete Algorithms in the USA this week, Kelner and his colleague Lorenzo Orecchia, an applied mathematics instructor, alongside graduate students Yin Tat Lee and Aaron Sidford, describe how they have abandoned the methods of previous max-flow algorithms, which have come at the problem one edge, or path, at a time.
"Many previous algorithms would find a path from point A to point B, send some flow along it, and then say, 'Given what I've already done, can I find another path along which I can send more?' When one needs to send flow simultaneously along many different paths, this leads to an intrinsic limitation on the speed of the algorithm," said Kelner.
But in 2011 Kelner, CSAIL graduate student Aleksander Madry, mathematics undergraduate Paul Christiano, and colleagues at Yale University and the University of Southern California developed a technique to analyse all of the paths simultaneously by viewing the graph as a collection of electrical resistors.
The researchers imagined connecting a battery to node A and a ground to node B, and allowing the current to flow through the network.
"Electrical current doesn't pick just one path, it will send a little bit of current over every resistor on the network," Kelner said. "So it probes the whole graph globally, studying many paths at the same time."
This breakthrough allowed that algorithm to solve the max-flow problem substantially faster than previous attempts, and now the MIT team has developed a technique to reduce the running time even further, making it possible to analyse even gigantic networks, Kelner says.
Unlike previous algorithms, which have viewed all the paths within a graph as equals, the new technique identifies those routes that create a bottleneck within the network. The team's algorithm divides each graph into clusters of well-connected nodes, and the paths between them that create bottlenecks, Kelner says.
"Our algorithm figures out which parts of the graph can easily route what they need to, and which parts are the bottlenecks,” he says. “This allows you to focus on the problem areas and the high-level structure, instead of spending a lot of time making unimportant decisions, which means you can use your time a lot more efficiently.”
The result is an almost linear algorithm, Kelner says, meaning the amount of time it takes to solve a problem is very close to being directly proportional to the number of nodes on the network.
So if the number of nodes on the graph is multiplied by 10, the amount of time would be multiplied by something very close to 10, as opposed to being multiplied by 100 or 1,000, he says.
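The arithmetic behind that comparison can be checked with a short Python sketch. It assumes idealised running times that grow as an exact power of the network size; the function name `time_multiplier` is hypothetical, and a real almost-linear algorithm would land slightly above the linear figure.

```python
def time_multiplier(size_factor, exponent):
    """How much longer an algorithm takes when the input grows by
    `size_factor`, if its running time scales as size ** exponent."""
    return size_factor ** exponent


for label, exponent in (("linear", 1), ("quadratic", 2), ("cubic", 3)):
    print(f"{label}: 10x the nodes takes {time_multiplier(10, exponent)}x the time")
```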
"This means that it scales essentially as well as you could hope for with the size of the input," he added.