Clustering And Alignment

We work on developing and assessing methods for homolog identification and alignment.

FlowerPower

Digitalis, photographed with our Olympus 3030Z digital camera, in England.

You can use our web server to run FlowerPower to obtain a cluster of homologous proteins and a multiple sequence alignment.

FlowerPower combines several methods to cluster and align proteins: PSI-BLAST (to obtain a set of potential homologs), ClustalW (to obtain an initial alignment), Bayesian Evolutionary Tree Estimation (BETE, to build a tree and identify subfamilies), and subfamily HMM construction. The method is named after the way the cluster appears to bloom outward during the iterative database searches, as shown here.
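The iterative "blooming" of a cluster can be sketched in a few lines. This is a minimal illustration, not the FlowerPower implementation: the function name iterate_cluster and the search and accept callables are hypothetical stand-ins for the database search and the acceptance checks, and the sequences are plain string identifiers.

```python
from typing import Callable, Set


def iterate_cluster(
    seed: Set[str],
    search: Callable[[Set[str]], Set[str]],
    accept: Callable[[str], bool],
    max_iterations: int = 10,
) -> Set[str]:
    """Grow a cluster outward until an iteration accepts nothing new.

    `search` is a hypothetical stand-in for one database search using the
    current cluster; `accept` stands in for the false-positive checks.
    """
    cluster = set(seed)
    for _ in range(max_iterations):
        # Only sequences not already in the cluster are candidates.
        candidates = search(cluster) - cluster
        new_members = {c for c in candidates if accept(c)}
        if not new_members:
            break  # converged: the cluster stopped growing
        cluster |= new_members
    return cluster


# Toy example: "similarity" is a hand-written neighbour map (assumed data).
NEIGHBOURS = {"A": {"B"}, "B": {"C"}, "C": {"D"}, "D": set()}


def toy_search(cluster: Set[str]) -> Set[str]:
    found: Set[str] = set()
    for member in cluster:
        found |= NEIGHBOURS[member]
    return found


result = iterate_cluster({"A"}, toy_search, accept=lambda s: s != "D")
```

In this toy run the cluster grows from A to B to C, and D is rejected by the acceptance check, so the search stops.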

The subfamily HMMs are used in place of a single HMM (as in the UCSC SAM-T99 method) or a single profile (as in PSI-BLAST) to identify and align sequences in the subsequent iteration. As a result, FlowerPower alignments are of higher quality in regions that are divergent across the family as a whole.
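A toy example may help show why scoring against subfamily models can do better in divergent regions. The sketch below uses ungapped, position-specific probability tables rather than real HMMs, and all names and numbers (log_odds, best_subfamily, PSEUDO, the two-column profiles) are assumptions made for illustration only.

```python
import math
from typing import Dict, List, Tuple

# One residue -> probability map per alignment column (a toy profile, not an HMM).
Profile = List[Dict[str, float]]

BACKGROUND = 1.0 / 20.0  # uniform background over 20 amino acids (assumption)
PSEUDO = 0.01            # hypothetical floor for residues unseen in a column


def log_odds(seq: str, profile: Profile) -> float:
    """Sum of per-column log-odds scores for an ungapped sequence."""
    if len(seq) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(
        math.log(column.get(residue, PSEUDO) / BACKGROUND)
        for residue, column in zip(seq, profile)
    )


def best_subfamily(seq: str, subfamilies: Dict[str, Profile]) -> Tuple[str, float]:
    """Return the name and score of the best-scoring subfamily profile."""
    scores = {name: log_odds(seq, prof) for name, prof in subfamilies.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Column 1 is conserved; column 2 differs between the two subfamilies.
subfamilies = {
    "sub1": [{"A": 0.9}, {"K": 0.9}],
    "sub2": [{"A": 0.9}, {"D": 0.9}],
}
# A single family-wide profile averages the divergent column.
family = [{"A": 0.9}, {"K": 0.45, "D": 0.45}]

name, subfamily_score = best_subfamily("AD", subfamilies)
family_score = log_odds("AD", family)
```

Here the sequence "AD" scores higher against its own subfamily than against the family-wide profile, because the averaged profile dilutes the signal in the divergent second column.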

We also use heuristics to prevent (or at least discourage) the intrusion of false positives in any iteration. These include alignment quality analysis and checks that a sequence aligns globally (if desired), that the alignment is not too gappy, and so on.
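Checks of this kind could look roughly like the following. This is a sketch under stated assumptions: the thresholds (min_coverage, max_gap_fraction) and the function names are hypothetical, not FlowerPower's actual parameters, and the inputs are two rows of a pairwise alignment with "-" marking gaps.

```python
def gap_fraction(aligned: str) -> float:
    """Fraction of columns in an aligned row that are gaps."""
    if not aligned:
        raise ValueError("aligned sequence must not be empty")
    return aligned.count("-") / len(aligned)


def aligned_coverage(aligned_query: str, aligned_candidate: str) -> float:
    """Fraction of query residues that are paired with a candidate residue."""
    pairs = [(q, c) for q, c in zip(aligned_query, aligned_candidate) if q != "-"]
    if not pairs:
        return 0.0
    return sum(1 for _, c in pairs if c != "-") / len(pairs)


def passes_filters(
    aligned_query: str,
    aligned_candidate: str,
    min_coverage: float = 0.7,      # hypothetical threshold
    max_gap_fraction: float = 0.3,  # hypothetical threshold
    require_global: bool = True,
) -> bool:
    """Reject candidates that are too gappy or, if requested, not global."""
    if len(aligned_query) != len(aligned_candidate):
        raise ValueError("aligned rows must have the same length")
    if gap_fraction(aligned_candidate) > max_gap_fraction:
        return False
    if require_global and aligned_coverage(aligned_query, aligned_candidate) < min_coverage:
        return False
    return True


query = "ACDEFGHIKL"
good = passes_filters(query, "--DEFGHIKL")   # 20% gaps, 80% coverage
gappy = passes_filters(query, "AC------KL")  # 60% gaps
```

Making the global-alignment requirement optional mirrors the "if desired" in the text: with require_global=False, only the gap check is applied.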


Group Members

Jason Chan
Dan Kirshner
