Reconstructing Sequences from Traces
Oct 18, 2010
from 12:30 PM to 01:30 PM
|Add event to calendar||
University of Illinois, Urbana-Champaign
Monday, October 18, 2010 at 12:30PM
54-134 Engineering IV Building
Abstract: In many problems in bioinformatics and information theory, one is faced with the task of reconstructing a sequence based on partial information about its subsequences. Examples include synchronization theory, deletion-error control coding, genome sequencing, and mass spectrometry analysis.
We start with an overview of well studied reconstruction problems. We then proceed by introducing a new reconstruction problem, motivated by mass-spectrometry protein sequencing. One such instance is the problem of sequence reconstruction based on multiset compositions. This problem can be simply stated as follows: reconstruct a string from the multiset of its substring compositions. We show that all strings of length 7, one less than a prime, or one less than twice a prime, can be reconstructed. For all other lengths we show that reconstruction is not always possible and provide sometimes-tight bounds on the number of the confusable strings. The lower bounds are derived by combinatorial arguments and the upper bounds by algebraic considerations of bivariate polynomials. The problem can be viewed as a combinatorial simplication of the turnpike problem, and its solution may shed light on this long-standing problem as well.
This is a joint work with Jayadev Acharya, Hirakendu Das, Alon Orlitsky, and Shengjun Pan from University of California, San Diego.
Biography: Olgica Milenkovic received her MSc degree in Mathematics and PhD in Electrical Engineering from the University of Michigan, Ann Arbor, in 2001 ad 2002, respectively. In 2002 she joined the faculty of University of Colorado, Boulder. After a visiting professor appointment at UCSD, she moved to University of Illinois, Urbana-Champaign in 2007, where she is now Associate Professor in the Department of Electrical and Computer Engineering. Her research interest include coding theory, analysis of algorithms, bioinformatics, and signal processing. For her work, she was rewarded the NSF Career Award, DARPA Young Faculty Awards, and several best conference paper awards in coding theory and bioinformatics.