Bioinformatics Seminar

An occasional seminar series on bioinformatics, at Berkeley and beyond!

Title: How to Compare Protein Sequences, or: What We Still Get Wrong

Speaker: Dr. Martin Madera, Karplus Bioinformatics Group, Dept. of Biomolecular Engineering, University of California at Santa Cruz

Time: 2:00-3:00pm, Wednesday, April 23rd, 2008

Place: 321 Stanley Hall

Abstract:
We have been examining protein sequences using computers for more than 40 years: comparing them, aligning them, grouping them into families and superfamilies, and building phylogenetic trees for them. During that time, many significant advances have been made -- dynamic programming, local scoring, heuristics for speed-up, progressive multiple sequence alignment, maximum likelihood trees, statistical assessment of similarities, profile-sequence and profile-profile comparisons, iterated database searches, and secondary structure prediction (to name but a few of the most important ones!).

During the talk I will describe the current state of the art in recognizing homologies among protein sequences, namely comparison of profile hidden Markov models using predicted secondary structure and residue burial. But I will mostly focus on three areas that we still get wrong: (1) combining posterior decoding with local scoring; (2) false positives with highly significant E-values caused by errors in iterated database searches; and (3) null models and handling correlations in secondary structure sequences. I will propose solutions to these problems, and discuss some preliminary results.



To receive announcements of future seminars in this series, send a message to majordomo@listlink.berkeley.edu with the following in the body of the message:
subscribe bioinformatics your_email_address
There should be no other text or signature files in the body of the message.
