Speaker: Dr. Jan Baumbach, International Computer Science Institute, University of California at Berkeley
Time: 12:00-1:00pm, Tuesday, June 9th, 2009
Place: 321 Stanley Hall
Abstract:
Partitioning biomedical data objects into groups, such that the objects in each
group share common traits, is a long-standing challenge in computational
biology. Here we present an integrated data clustering framework based on
weighted transitive graph projection: Transitivity Clustering. We illustrate a
typical biomedical clustering task that starts with a list of amino acid
sequences, investigates similarity functions and parameter estimation problems,
and finally deals with an integrated result interpretation; all of which can be
done easily with TransClust, our Transitivity Clustering implementation, but
with no other clustering software. As an example, we reconstruct families of
functionally related proteins. In a large-scale study, we compute the core
genome for all 51 sequenced actinobacteria. We also present a
whole-genome-based phylogenetic tree for all organisms of this phylum.
Project web site: http://transclust.cebitec.unibielefeld.de
In the talk, we briefly motivate the need for protein sequence clustering as an essential part of inter-species gene regulatory network transfer workflows. We will also discuss our previous work on weighted transitive graph projection, concentrating mainly on the FORCE heuristic.
To receive announcements of future seminars in this series, please subscribe to the bioinformatics mailing list at https://calmail.berkeley.edu.