At first glance, the big news coming out of this summer’s conference on the theory of computing appeared to be something of a letdown. For more than 40 years, researchers had been trying to find a better way to compare two arbitrary strings of characters, such as the long strings of chemical letters within DNA molecules. The most widely used algorithm is slow and not all that clever: It proceeds step-by-step down the two lists, comparing values at each step. If a better method to calculate this “edit distance” could be found, researchers would be able to quickly compare full genomes or large data sets, and computer scientists would have a powerful new tool with which they could attempt to solve additional problems in the field.
Yet in a paper presented at the ACM Symposium on Theory of Computing, two researchers from the Massachusetts Institute of Technology put forth a mathematical proof that the current best algorithm was “optimal” — in other words, that finding a more efficient way to compute edit distance was mathematically impossible. The Boston Globe celebrated the hometown researchers’ achievement with a headline that read “For 40 Years, Computer Scientists Looked for a Solution That Doesn’t Exist.”
But researchers aren’t quite ready to record the time of death. One significant loophole remains. The impossibility result is only true if another, famously unproven statement called the strong exponential time hypothesis (SETH) is also true. Most computational complexity researchers assume that this is the case — including Piotr Indyk and Artūrs Bačkurs of MIT, who published the edit-distance finding — but SETH’s validity is still an open question. This makes the article about the edit-distance problem seem like a mathematical version of the legendary report of Mark Twain’s death: greatly exaggerated.
The media’s confusion about edit distance reflects a murkiness in the depths ofcomplexity theory itself, where mathematicians and computer scientists attempt to map out what is and is not feasible to compute as though they were deep-sea explorers charting the bottom of an ocean trench. This algorithmic terrain is just as vast — and poorly understood — as the real seafloor, said Russell Impagliazzo, a complexity theorist who first formulated the exponential-time hypothesis with Ramamohan Paturiin 1999. “The analogy is a good one,” he said. “The oceans are where computational hardness is. What we’re attempting to do is use finer tools to measure the depth of the ocean in different places.”
To continue reading, please go to Quanta Magazine. Publication does not imply endorsement of views by the World Economic Forum.
To keep up with the Agenda subscribe to our weekly newsletter.
Author: John Pavlus is a writer and filmmaker whose work has appeared in Scientific American, Bloomberg Businessweek, and The Best American Science and Nature Writing series.
Image: An illustration picture shows a projection of binary code on a man holding a laptop computer, in an office in Warsaw. REUTERS/Kacper Pempel