Readjoiner: a fast and memory efficient string graph-based sequence assembler

Link:
Autor/in:
Erscheinungsjahr:
2012
Medientyp:
Text
Schlagworte:
  • Genome
  • Algorithms
  • Novo genome
  • Neoplasms
  • Genes
  • Genome
  • Algorithms
  • Novo genome
  • Neoplasms
  • Genes
Beschreibung:
  • Background: Ongoing improvements in throughput of the next-generation sequencing technologies challenge the current generation of de novo sequence assemblers. Most recent sequence assemblers are based on the construction of a de Bruijn graph. An alternative framework of growing interest is the assembly string graph, not necessitating a division of the reads into k-mers, but requiring fast algorithms for the computation of suffix-prefix matches among all pairs of reads.
    Results: Here we present efficient methods for the construction of a string graph from a set of sequencing reads. Our approach employs suffix sorting and scanning methods to compute suffix-prefix matches. Transitive edges are recognized and eliminated early in the process and the graph is efficiently constructed including irreducible edges only.
    Conclusions: Our suffix-prefix match determination and string graph construction algorithms have been implemented in the software package Readjoiner. Comparison with existing string graph-based assemblers shows that Readjoiner is faster and more space efficient. Readjoiner is available at http://www.zbh.uni-hamburg.de/readjoiner.
Lizenz:
  • info:eu-repo/semantics/openAccess
Quellsystem:
Forschungsinformationssystem der UHH

Interne Metadaten
Quelldatensatz
oai:www.edit.fis.uni-hamburg.de:publications/9ae50c7c-a32b-48d4-96e3-5f7214e6d9a4