Artículos de revistas
An Improved Algorithm For The All-pairs Suffix-prefix Problem
Registro en:
Journal Of Discrete Algorithms. Elsevier Science Bv, v. 37, p. 34 - 43, 2016.
1570-8667
1570-8675
WOS:000379021500004
10.1016/j.jda.2016.04.002
Autor
Tustumi
William H. A.; Gog
Simon; Telles
Guilherme P.; Louza
Felipe A.
Institución
Resumen
Finding all longest suffix-prefix matches for a collection of strings is known as the all pairs suffix-prefix match problem and its main application is de novo genome assembly. This problem is well studied in stringology and has been solved optimally in 1992 by Gusfield etal.[8] using suffix trees. In 2010, Ohlebusch and Gog[13] proposed an alternative solution based on enhanced suffix arrays which has also optimal time complexity but is faster in practice. In this article, we present another optimal algorithm based on enhanced suffix arrays which further improves the practical running time. Our new solution solves the problem locally for each string, scanning the enhanced suffix array backwards to avoid the processing of suffixes that are no suffix-prefix matching candidates. In an empirical evaluation we observed that the new algorithm is over two times faster and more space-efficient than the method proposed by Ohlebusch and Gog. When compared to Readjoiner[5], a good practical solution, our algorithm is faster for small overlap lengths, in pace with theoretical optimality. (C) 2016 Elsevier B.V. All rights reserved. 37 34 43