Graduate Student / Postdoc Seminar

The New New (Wrong Wrong) Thing: The Genomics Story

Speaker: Bhubaneswar Mishra, Courant

Location: Warren Weaver Hall 1302

Date: Friday, December 10, 2010, 1 p.m.

Synopsis:

Recent advances in DNA sequencing technology, and their focal role in Genome-Wide Association Studies, have highlighted many intrinsic problems with the way genomics algorithms are implemented and genomics data are collected and used, for instance in the whole-genome sequence assembly (WGSA) problem. Here, a new strategy, based on global constrained optimization and branch-and-bound, is proposed and embodied in an efficient and accurate assembler, SUTTA. SUTTA is shown to deal intelligently with various errors in sequence reads and nonrandom structures in the genomes, without sacrificing space or time efficiency. A new metric (the 'Feature-Response Curve') is presented to compare assemblers' performance and accuracy transparently.

An extensive assessment of SUTTA, based on this and standard metrics, against several well-known assemblers (ARACHNE, EULER, CAP3, Minimus, PHRAP, TIGR) on microbial and partial human genome sequences (using shotgun reads of varying read lengths, coverages, and accuracies, with and without mate-pairs) leads to the conclusion that SUTTA provides a promising and flexible solution for WGSA problems in the context of evolving sequencing technologies. A particularly interesting application of SUTTA involves assembling short reads from human genomes that do not align to the published reference genome sequence ('the dark-genomic-matter'). This is joint work with Giuseppe Narzisi.

The talk will also highlight various mathematical and computational problems in genomics through examples related to this problem.