Speaker: Dr. Martin Madera, Karplus Bioinformatics Group, Dept. of Biomolecular Engineering, University of California at Santa Cruz
Time: 2:00-3:00pm, Wednesday, April 23rd, 2008
Place: 321 Stanley Hall
Abstract:
We have been examining protein sequences using computers for more than 40
years: comparing them, aligning them, grouping them into families and
superfamilies, and building phylogenetic trees for them. During that time,
many significant advances have been made -- dynamic programming, local scoring,
heuristics for speed-up, progressive multiple sequence alignment, maximum
likelihood trees, statistical assessment of similarities, profile-sequence and
profile-profile comparisons, iterated database searches, and secondary
structure prediction (to name but a few of the most important ones!).
In this talk I will describe the current state of the art in recognizing homologies among protein sequences, namely the comparison of profile hidden Markov models using predicted secondary structure and residue burial. I will focus mostly, however, on three areas that we still get wrong: (1) combining posterior decoding with local scoring; (2) false positives with highly significant E-values caused by errors in iterated database searches; and (3) null models and the handling of correlations in secondary structure sequences. I will propose solutions to these problems and discuss some preliminary results.
To receive announcements of future seminars in this series, send a message to
majordomo@listlink.berkeley.edu with the following in the body of the
message:
subscribe bioinformatics your_email_address
There should be no other text or signature files in the body of the message.