A Novel Gap-Filling Method Based on Hybrid Read Information Analysis

Yejin Kan, Dongyeon Kim, Gangman Yi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

De novo assembly reconstructs an entire nucleotide sequence from the reads produced by next-generation sequencing and is essential for genetic information analysis. Reads are assembled in several steps, but some gaps remain unresolved even after scaffolding. Gap-filling is performed as the last assembly stage to fill these unidentified regions, significantly improving overall assembly performance. We propose a gap-filling method that uses hybrid reads to resolve gaps through sequence similarity estimation and graph search. The method consists of three key steps: extracting candidate sequences, estimating their similarity, and filling the gaps based on the graph. Hybrid reads allow sequences with more accurate information to be extracted, and candidate sequences that correspond to noise are removed based on the similarity estimate. Finally, a graph search using statistical information derives a final sequence that guarantees high coverage, resolves gaps, reduces misassemblies, and improves accuracy.
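The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so every function name, the `difflib` similarity ratio (a stand-in for a real alignment score), the 0.7 threshold, the k-mer size, and the breadth-first search over a De Bruijn graph are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict, deque
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio (difflib heuristic, not a sequence aligner)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def filter_candidates(candidates, reference, min_similarity=0.7):
    """Step 2 (assumed form): drop candidates too dissimilar to a reference read."""
    return [c for c in candidates if similarity(c, reference) >= min_similarity]


def build_de_bruijn(sequences, k):
    """Link each (k-1)-mer to its successors; edge weight = k-mer occurrence count."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def fill_gap(graph, left_flank, right_flank, k, max_gap=50):
    """Step 3 (assumed form): breadth-first search from the end of the left flank
    to the start of the right flank; returns the gap sequence or None.
    Only the shortest connecting path is considered."""
    start, goal = left_flank[-(k - 1):], right_flank[:k - 1]
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        node, ext = queue.popleft()
        if node == goal and len(ext) >= k - 1:
            # The last k-1 bases of the extension belong to the right flank.
            return ext[:len(ext) - (k - 1)]
        if len(ext) >= max_gap + k - 1:
            continue
        # Visit better-supported edges first (descending k-mer count).
        for nxt, _count in sorted(graph.get(node, {}).items(), key=lambda kv: -kv[1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, ext + nxt[-1]))
    return None


# Toy data (hypothetical): one long-read segment as reference, three short candidates.
long_read_segment = "GCGTACCTGAAG"
short_reads = ["TGCGTACCTG", "GTACCTGAAGT", "TTTTTTTTTT"]
kept = filter_candidates(short_reads, long_read_segment)  # the poly-T read is dropped
graph = build_de_bruijn(kept, k=4)
print(fill_gap(graph, "ATGCGT", "GAAGTC", k=4))  # prints ACCT
```

In this toy case the retained reads spell a single path between the two flanks, so the search is unambiguous. On real data, repeats create branching paths, which is where edge statistics such as the k-mer counts used above would matter.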

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Editors: Donald Adjeroh, Qi Long, Xinghua Shi, Fei Guo, Xiaohua Hu, Srinivas Aluru, Giri Narasimhan, Jianxin Wang, Mingon Kang, Ananda M. Mondal, Jin Liu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3827-3829
Number of pages: 3
ISBN (Electronic): 9781665468190
DOIs
State: Published - 2022
Event: 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022 - Las Vegas, United States
Duration: 6 Dec 2022 to 8 Dec 2022

Publication series

Name: Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022

Conference

Conference: 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Country/Territory: United States
City: Las Vegas
Period: 6/12/22 to 8/12/22

Keywords

  • De Bruijn graph
  • de novo assembly
  • gap-filling
  • hybrid reads
  • next-generation sequencing
