Bioinformatics Seminar

An occasional seminar series on bioinformatics, at Berkeley and beyond!

Title: Trichomonas vaginalis genome annotation validation via proteomics and phylogenetics

Speaker: Dr. Richard D. Hayes, Molecular Biology Institute, University of California at Los Angeles

Time: 1:00-2:00pm, Wednesday, June 3rd, 2009

Place: 321 Stanley Hall

Abstract:
Trichomonas vaginalis, a parasitic protist, causes one of the most prevalent non-viral sexually transmitted infections in the world, with over 200 million new cases every year. T. vaginalis colonizes the human urogenital tract, where it remains extracellular and causes lesions in vaginal epithelia, leading to symptoms in infected women ranging from inflammatory disease, infertility, and pregnancy complications to predisposition to HIV infection and cervical cancer. Genome sequencing of a laboratory strain of T. vaginalis was completed in April 2005 by The Institute for Genomic Research. The current version 1.0 of the genome annotation was generated by a completely automated process that yielded a set of nearly 60,000 putative gene models, most consisting of single exons. In a majority of cases, existing molecular biology and biochemistry evidence, sequence homology, or enzymatic domain homology matches were not correctly incorporated into these annotations, resulting in more than 80% of predicted genes receiving the annotation “conserved hypothetical protein”. Based on comparison to the genomes of other divergent, unicellular parasites, a high percentage of hypothetical proteins with insignificant similarity to proteins present in other organisms is expected; however, 80% is unusually high. To begin the process of validating the current annotation, peptide data from several proteomics investigations of T. vaginalis were modeled against the current genome annotation as sources of gene expression evidence. The results presented include strong predictions for instances where sequencing or assembly errors have contributed to annotation errors: incorrectly predicted start codons leading to exon boundary errors, and frameshift errors resulting in the annotation of single genes as two truncated genes in close proximity.
All data will be made public through incorporation into the current genome website, http://trichdb.org, to facilitate continued analysis by the full T. vaginalis research community.



To receive announcements of future seminars in this series, please subscribe to the bioinformatics mailing list at https://calmail.berkeley.edu.
