PhyloFacts 3.0.2

PhyloFacts release PF3.0.2 contains 7,337,238 protein sequences from 99,254 unique taxa (including strains) across 92,800 families (25,446 grouped by Pfam domain and 67,354 grouped by multi-domain architecture agreement). The PhyloFacts resource integrates a wealth of information on protein families from across the Tree of Life. For each family, PhyloFacts includes a multiple sequence alignment, one or more phylogenetic trees, predicted 3D protein structures, cellular localization, and Gene Ontology (GO) annotations and evidence codes. PhyloFacts includes hidden Markov models for classification of user-submitted protein sequences to protein families across the Tree of Life.

The protein families in PhyloFacts typically contain homologs from many species. The phylogenetic distribution of a protein family can range from highly restricted (e.g., found only in mammals) to spanning the entire Tree of Life. Gathering homologs from many divergent species enables us to take advantage of experimental investigations in different systems, and allows powerful inferences of function and structure that might not otherwise be possible.

  • Query PhyloFacts by UniProt accession or identifier
  • Protein functional site prediction using the INTREPID and Discern algorithms
  • View PhyloFacts family alignments, trees, and annotations
  • Query PhyloFacts by Pfam accession (PhyloFacts-Pfam Project)
  • Query PhyloFacts by BioCyc reactions (PhyloFacts-BioCyc Project)
  • View coverage of key species in PhyloFacts
  • View PhyloFacts coverage statistics
  • Download PhyloFacts data
  • How to cite PhyloFacts


Paste your protein sequence in FASTA format here to find matching PhyloFacts families and predicted orthologs.
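
Before pasting, it can help to check that the input really is well-formed FASTA. The following is a minimal Python sketch of such a check, not part of PhyloFacts itself. The function names (parse_fasta, invalid_residues) and the accepted alphabet (the 20 standard amino acids plus the codes B, Z, X, U and O) are assumptions made for illustration; the characters the PhyloFacts form actually accepts are not stated here.

```python
# Assumed alphabet: 20 standard amino acids plus B, Z, X, U, O.
# This is an illustrative choice, not a documented PhyloFacts rule.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY" + "BZXUO")


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence):
    """Return the sorted characters in sequence outside the assumed alphabet."""
    return sorted(set(sequence) - VALID_RESIDUES)


if __name__ == "__main__":
    # Hypothetical input; the header and residues are made up.
    example = ">example_protein hypothetical sequence\nMKTAYIAKQR\nQISFVKSHFS\n"
    for name, seq in parse_fasta(example):
        print(name, len(seq), invalid_residues(seq))
```

A record with an empty list of invalid residues contains only characters from the assumed alphabet; anything else (digits, punctuation, stray whitespace inside a line) is reported so it can be fixed before submission.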
