Annotation & Prediction

Ensembl produces automated annotation for a selection of chordate genomes and also imports non-vertebrate model organisms for comparative purposes.

In addition, a large number of (mainly vertebrate) genomes are available with reduced annotation on our Rapid Release site, whilst our sister project Ensembl Genomes provides databases of bacteria, fungi, plants, metazoa and other non-vertebrates.

Ensembl annotation

Protein-coding genes are automatically annotated using Ensembl's genebuild pipeline. All transcripts are based on mRNA and protein sequences held in public scientific databases.


Variation

The Ensembl Variation database stores areas of the genome that differ between individual genomes ("variants") and, where available, associated disease and phenotype information.
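As an illustration of how variant records and their phenotype annotations might be retrieved programmatically, the sketch below builds a request against the public Ensembl REST service (`/variation/:species/:id`, with the optional `phenotypes` flag). The helper names `variation_url` and `fetch_variant` are hypothetical, and the example is a minimal sketch rather than an official client.

```python
import json
import urllib.parse
import urllib.request

# Public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def variation_url(species, variant_id, include_phenotypes=True):
    """Build the URL for one variant record (hypothetical helper)."""
    path = "/variation/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(variant_id)
    )
    params = {}
    if include_phenotypes:
        # Ask the service to include phenotype annotations, where available.
        params["phenotypes"] = "1"
    query = urllib.parse.urlencode(params)
    return REST_SERVER + path + ("?" + query if query else "")


def fetch_variant(species, variant_id):
    """Fetch the variant as parsed JSON. Requires network access."""
    request = urllib.request.Request(
        variation_url(species, variant_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_variant("human", "rs56116432")` would return a dictionary describing that variant; the exact fields depend on the REST release in use.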

Comparative genomics

Ensembl focuses on two main areas of comparative genomics: building gene trees from a representative protein for each gene in a species, and aligning DNA sequences to infer features such as synteny and conservation.
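To make the idea of one representative protein per gene concrete, the sketch below picks the longest protein for each gene from a list of records. Choosing the longest sequence is an assumption made for illustration only; it is not necessarily the criterion Ensembl applies, and the function name and identifiers are hypothetical.

```python
def pick_representatives(proteins):
    """Return {gene_id: protein_id}, keeping the longest protein per gene.

    `proteins` is an iterable of (gene_id, protein_id, sequence) tuples.
    "Longest wins" is an illustrative assumption; on a tie the protein
    seen first is kept.
    """
    best = {}
    for gene_id, protein_id, sequence in proteins:
        current = best.get(gene_id)
        if current is None or len(sequence) > len(current[1]):
            best[gene_id] = (protein_id, sequence)
    return {gene_id: protein_id for gene_id, (protein_id, _seq) in best.items()}


# Hypothetical records: two proteins for geneA, one for geneB.
example = [
    ("geneA", "protA1", "MKT"),
    ("geneA", "protA2", "MKTLLV"),
    ("geneB", "protB1", "MA"),
]
representatives = pick_representatives(example)
```

The resulting set of proteins, one per gene, is the kind of input a gene tree pipeline would then cluster and align.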


Regulation

Ensembl Regulation provides resources for studying gene expression and its regulation, with a focus on the computational annotation of non-coding regulatory features in the genome.