PhyloFacts: Microbial Phylogenomic Encyclopedia


Microbial Phylogenomic Encyclopedia v. 2.0 (24 July 2008): 25,138 families; 1,169,076 hidden Markov models (family and subfamily).

This resource is funded by a grant from the NSF/USDA CSREES Microbial Genome Sequencing Program.

We combine evolutionary tree construction with structure analysis to reconstruct the phylogeny of prokaryotic proteins and provide subfamily classifications. The molecular evolution of these proteins involves gene duplication and domain shuffling, which together produce large superfamilies that are challenging to classify.

Protein families in this library include those based on the completely sequenced genomes of various prokaryotes (Escherichia coli, Mycobacterium tuberculosis, etc.; see Genome coverage) and all PhyloFacts protein families that include any prokaryotic sequences.

Please cite the following paper in references to this resource: Nandini Krishnamurthy, Duncan Brown, Dan Kirshner and Kimmen Sjölander, "PhyloFacts: An online structural phylogenomic encyclopedia for protein functional and structural classification," Genome Biology.


Protein Search

Submit sequences for classification against the HMM library. This library is designed to help biologists do the following:

  • Predict molecular function by phylogenomic analysis, using the phylogenetic tree for the family.
  • Classify novel sequences to functional subtypes, using the subfamily HMMs for the family.
  • Predict specificity positions, using the alignment analysis plots for each family.
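
As a rough illustration of the second point, the sketch below assigns a query sequence to the best-scoring subfamily. It is a simplified stand-in, not the PhyloFacts implementation: it replaces each subfamily HMM with an ungapped position-specific log-odds profile, and it assumes the query is already aligned to the family's columns. The function names, the uniform background, the pseudocount, and the score threshold are all hypothetical choices made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Build per-column log-odds scores against a uniform background.

    Simplification: a real profile HMM also models insertions and
    deletions; this profile only scores match columns.
    """
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(len(aligned_seqs[0])):
        counts = Counter(
            seq[col] for seq in aligned_seqs if seq[col] in AMINO_ACIDS
        )
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, query):
    """Sum the column scores; gaps and unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, query))


def classify(query, subfamilies, min_score=0.0):
    """Return (best subfamily or None, all scores) for an aligned query.

    min_score is a hypothetical cutoff; None means no subfamily scored
    above it.
    """
    scores = {
        name: score(build_profile(seqs), query)
        for name, seqs in subfamilies.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > min_score else None), scores


if __name__ == "__main__":
    # Toy alignments, invented for illustration only.
    toy = {"subfam_A": ["ACDK", "ACDR"], "subfam_B": ["GHWK", "GHWR"]}
    label, all_scores = classify("ACDK", toy)
    print(label, all_scores)
```

In practice the per-subfamily scores would come from an HMM search tool rather than from this toy profile; the selection step (take the best-scoring subfamily, subject to a cutoff) is the part the sketch is meant to show.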

Browse Books in our library

Each "book" in the HMM library corresponds roughly to a (whole-chain) protein family or domain, and contains the following data (generally downloadable, in different formats):

  • A cluster of homologs, typically from many species.
  • One or more phylogenetic trees.
  • A decomposition of the tree into subtrees, to identify functional subfamilies.
  • A multiple sequence alignment for the family, as well as for individual subfamilies.
  • GO (Gene Ontology) annotations and evidence codes.
  • Other annotations and experimental data.
  • Hyperlinks to papers and online resources.
  • An analysis of the family's multiple sequence alignment using the subfamily decomposition to predict specificity positions defining the individual subtypes.
  • A predicted structure, including construction of comparative models for some families.
  • A predicted cellular localization (e.g., membrane-localized, secreted, cytoplasmic, or nuclear).
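
To make the specificity-position analysis listed above more concrete, here is a minimal sketch of one common heuristic: flag alignment columns that are conserved within every subfamily but differ between at least two subfamilies. This is an assumption about how such positions can be found, not a description of the method PhyloFacts uses; the function names and the 0.9 conservation threshold are hypothetical.

```python
from collections import Counter


def consensus(column, min_fraction):
    """Return the dominant residue of a column, or None if not conserved.

    Gap characters ('-') are ignored when computing the fraction.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return None
    residue, count = Counter(residues).most_common(1)[0]
    return residue if count / len(residues) >= min_fraction else None


def specificity_positions(subfamilies, min_fraction=0.9):
    """List 0-based columns conserved within each subfamily but not across.

    subfamilies maps a subfamily name to its aligned sequences; all
    sequences are assumed to share the same alignment length.
    """
    length = len(next(iter(subfamilies.values()))[0])
    positions = []
    for col in range(length):
        per_subfamily = [
            consensus([seq[col] for seq in seqs], min_fraction)
            for seqs in subfamilies.values()
        ]
        # Every subfamily must be conserved, and at least two must differ.
        if None not in per_subfamily and len(set(per_subfamily)) > 1:
            positions.append(col)
    return positions


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = {"subfam_A": ["ACDK", "ACDR"], "subfam_B": ["AHWK", "AHWR"]}
    print(specificity_positions(toy))  # [1, 2]
```

In the toy alignment, column 0 is identical in both subfamilies and column 3 is variable within each, so only columns 1 and 2 are reported.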

If you have any questions or comments, please email phylo.