Optimizing de novo Assembly Algorithms for Next-Generation Sequencing Data

Presentation Type

Poster/Portfolio

Presenter Major(s)

Computer Science, Mathematics

Mentor Information

Christian Trefftz, Greg Wolffe

Department

School of Computing and Information Systems

Location

Henry Hall Atrium 68

Start Date

April 11, 2012 9:00 AM

Keywords

Information, Innovation, and Technology, Life Science, Mathematical Science, Technology

Abstract

Next-generation sequencing (NGS) platforms have presented unique challenges to the computing community. NGS data consist of a very large number of short reads, which increases the difficulty of de novo sequence assembly, the reconstruction of a genome without the use of a reference sequence. Further complicating the problem is the recent interest in metagenomics, the sequencing of genetic material from many organisms recovered together from environmental samples. Specialized data structures, such as de Bruijn graphs and Bloom filters, have been incorporated as the backbone of modern assembly software. But as the rapid growth in metagenomic data illustrates, the development of new data structures and algorithms must continue in order to keep pace. The goal of this project is to analyze and optimize the performance of these assembly algorithms, focusing specifically on the pre-processing and graph partitioning stages. Both memory usage and run-time optimizations are considered, and a range of computing platforms is targeted.
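
For readers unfamiliar with the two data structures named in the abstract, the following is a minimal illustrative sketch in Python. It is not the project's implementation: the class and function names (BloomFilter, kmers, build_de_bruijn), the choice of hash function, and the default sizes are all assumptions made for illustration. It shows a de Bruijn graph built from the k-mers of a set of reads, and a Bloom filter used as a compact, probabilistic record of which k-mers have been seen.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives.

    Hypothetical helper for illustration; sizes and hashing are assumptions.
    """

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and each k-mer
    contributes one edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    reads = ["ACGTAC", "CGTACG"]  # toy reads, not real sequencing data
    k = 4

    seen = BloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            seen.add(kmer)

    graph = build_de_bruijn(reads, k)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
    print("ACGT seen:", "ACGT" in seen)
```

In this sketch the Bloom filter uses a fixed number of bits regardless of how many k-mers are inserted, which is the memory trade-off that makes it attractive for large read sets; the cost is that a membership query can occasionally return a false positive. A production assembler would also need to handle reverse complements and sequencing errors, which are omitted here.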
