Standard

A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms. / Ahmed, Nauman; Bertels, Koen; Al-Ars, Zaid.

2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). ed. / Tianhai Tian; Qinghua Jiang; Yunlong Liu; Kevin Burrage; Jiangning Song; Yadong Wang; Xiaohua Hu; Shinichi Morishita; Qian Zhu; Guohua Wang. Piscataway, NJ : IEEE, 2016. p. 1421-1428.

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review

Harvard

Ahmed, N, Bertels, K & Al-Ars, Z 2016, A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms. in T Tian, Q Jiang, Y Liu, K Burrage, J Song, Y Wang, X Hu, S Morishita, Q Zhu & G Wang (eds), 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, Piscataway, NJ, pp. 1421-1428, IEEE International Conference on Bioinformatics and Biomedicine 2016, Shenzhen, China, 15/12/16. https://doi.org/10.1109/BIBM.2016.7822731

APA

Ahmed, N., Bertels, K., & Al-Ars, Z. (2016). A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms. In T. Tian, Q. Jiang, Y. Liu, K. Burrage, J. Song, Y. Wang, X. Hu, S. Morishita, Q. Zhu, ... G. Wang (Eds.), 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1421-1428). Piscataway, NJ: IEEE. https://doi.org/10.1109/BIBM.2016.7822731

Vancouver

Ahmed N, Bertels K, Al-Ars Z. A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms. In Tian T, Jiang Q, Liu Y, Burrage K, Song J, Wang Y, Hu X, Morishita S, Zhu Q, Wang G, editors, 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Piscataway, NJ: IEEE. 2016. p. 1421-1428. https://doi.org/10.1109/BIBM.2016.7822731

Author

Ahmed, Nauman ; Bertels, Koen ; Al-Ars, Zaid. / A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms. 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). editor / Tianhai Tian ; Qinghua Jiang ; Yunlong Liu ; Kevin Burrage ; Jiangning Song ; Yadong Wang ; Xiaohua Hu ; Shinichi Morishita ; Qian Zhu ; Guohua Wang. Piscataway, NJ : IEEE, 2016. pp. 1421-1428

BibTeX

@inproceedings{ff496b4c8b5a4f8e9c7456c6942635b8,
title = "A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms",
abstract = "DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches need to be developed to effectively use these longer reads in the alignment process. Modern aligners commonly use a two-step approach for read alignment: 1. seeding, 2. extension. In this paper, we have investigated various seeding and extension techniques used in modern DNA read alignment algorithms to find the best seeding and extension combinations. We developed an open source generic DNA read aligner that can be used to compare the alignment accuracy and total execution time of different combinations of seeding and extension algorithms. For extension, our results show that local alignment is the best extension approach, achieving up to 3.6x more accuracy than other extension techniques, for longer reads. For seeding, if BLAST-like seed extension is used, the best seeding approach is identifying all SMEMs in the DNA read (e.g., approach used by BWA-MEM). This combination is up to 6x more accurate than other seeding techniques, for longer reads. With local alignment, we observed that the seeding technique does not impact the alignment accuracy. Furthermore, we showed that an optimized implementation of local alignment using vector instructions, enabling 4.5x speedup, makes it the fastest of all extension techniques. Overall, we show that using local alignment with non-overlapping maximal exact matching seeds is the best seeding-extension combination due to its high accuracy and higher potential for optimization/acceleration for future DNA reads.",
keywords = "DNA, Computers, Indexes, Genomics, Bioinformatics, Irrigation",
author = "Nauman Ahmed and Koen Bertels and Zaid Al-Ars",
year = "2016",
month = "12",
doi = "10.1109/BIBM.2016.7822731",
language = "English",
isbn = "978-1-5090-1612-9",
pages = "1421--1428",
editor = "Tianhai Tian and Qinghua Jiang and Yunlong Liu and Kevin Burrage and Jiangning Song and Yadong Wang and Xiaohua Hu and Shinichi Morishita and Qian Zhu and Guohua Wang",
booktitle = "2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)",
publisher = "IEEE",
address = "Piscataway, NJ, United States",
}

RIS

TY - GEN

T1 - A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms

AU - Ahmed, Nauman

AU - Bertels, Koen

AU - Al-Ars, Zaid

PY - 2016/12

Y1 - 2016/12

N2 - DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches need to be developed to effectively use these longer reads in the alignment process. Modern aligners commonly use a two-step approach for read alignment: 1. seeding, 2. extension. In this paper, we have investigated various seeding and extension techniques used in modern DNA read alignment algorithms to find the best seeding and extension combinations. We developed an open source generic DNA read aligner that can be used to compare the alignment accuracy and total execution time of different combinations of seeding and extension algorithms. For extension, our results show that local alignment is the best extension approach, achieving up to 3.6x more accuracy than other extension techniques, for longer reads. For seeding, if BLAST-like seed extension is used, the best seeding approach is identifying all SMEMs in the DNA read (e.g., approach used by BWA-MEM). This combination is up to 6x more accurate than other seeding techniques, for longer reads. With local alignment, we observed that the seeding technique does not impact the alignment accuracy. Furthermore, we showed that an optimized implementation of local alignment using vector instructions, enabling 4.5x speedup, makes it the fastest of all extension techniques. Overall, we show that using local alignment with non-overlapping maximal exact matching seeds is the best seeding-extension combination due to its high accuracy and higher potential for optimization/acceleration for future DNA reads.

AB - DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches need to be developed to effectively use these longer reads in the alignment process. Modern aligners commonly use a two-step approach for read alignment: 1. seeding, 2. extension. In this paper, we have investigated various seeding and extension techniques used in modern DNA read alignment algorithms to find the best seeding and extension combinations. We developed an open source generic DNA read aligner that can be used to compare the alignment accuracy and total execution time of different combinations of seeding and extension algorithms. For extension, our results show that local alignment is the best extension approach, achieving up to 3.6x more accuracy than other extension techniques, for longer reads. For seeding, if BLAST-like seed extension is used, the best seeding approach is identifying all SMEMs in the DNA read (e.g., approach used by BWA-MEM). This combination is up to 6x more accurate than other seeding techniques, for longer reads. With local alignment, we observed that the seeding technique does not impact the alignment accuracy. Furthermore, we showed that an optimized implementation of local alignment using vector instructions, enabling 4.5x speedup, makes it the fastest of all extension techniques. Overall, we show that using local alignment with non-overlapping maximal exact matching seeds is the best seeding-extension combination due to its high accuracy and higher potential for optimization/acceleration for future DNA reads.

KW - DNA

KW - Computers

KW - Indexes

KW - Genomics

KW - Bioinformatics

KW - Irrigation

UR - http://resolver.tudelft.nl/uuid:96b4c-8b5a-4f8e-9c74-56c6942635b8

U2 - 10.1109/BIBM.2016.7822731

DO - 10.1109/BIBM.2016.7822731

M3 - Conference contribution

SN - 978-1-5090-1612-9

SP - 1421

EP - 1428

BT - 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

A2 - Tian, Tianhai

A2 - Jiang, Qinghua

A2 - Liu, Yunlong

A2 - Burrage, Kevin

A2 - Song, Jiangning

A2 - Wang, Yadong

A2 - Hu, Xiaohua

A2 - Morishita, Shinichi

A2 - Zhu, Qian

A2 - Wang, Guohua

PB - IEEE

CY - Piscataway, NJ

ER -

ID: 13682526