Annotation & Prediction
Ensembl produces automated annotation for a selection of chordate genomes,
and also imports non-vertebrate model organisms for comparative purposes.
In addition, a large number of (mainly vertebrate) genomes are available with reduced annotation
on our Rapid Release site, whilst our sister project
Ensembl Genomes provides databases of
bacteria, fungi, plants, metazoa and other non-vertebrates.
Protein-coding genes are automatically annotated using Ensembl's genebuild pipeline. All transcripts are based on mRNA and protein sequences held in public scientific databases.
The Ensembl Variation database stores areas of the genome that differ between individual genomes ("variants") and, where available, associated disease and phenotype information.
Ensembl focuses on two main areas of comparative genomics: creation of
gene trees using a representative protein from each gene in a species, and
alignment of DNA sequences to infer features such as synteny and conservation.
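
As an illustration of the gene-tree input step, the sketch below picks one representative protein per gene. It assumes the longest protein is the representative, which is a simplification and not necessarily the rule Ensembl applies. The function name, the tuple layout and all identifiers are hypothetical.

```python
from collections import defaultdict


def pick_representatives(proteins):
    """Return one (protein_id, sequence) pair per gene.

    Assumption for illustration only: the representative is the longest
    protein; ties go to the alphabetically first protein ID.
    """
    by_gene = defaultdict(list)
    for gene_id, protein_id, sequence in proteins:
        by_gene[gene_id].append((protein_id, sequence))

    return {
        gene_id: min(candidates, key=lambda c: (-len(c[1]), c[0]))
        for gene_id, candidates in by_gene.items()
    }


# Hypothetical records: (gene ID, protein ID, amino-acid sequence)
proteins = [
    ("geneA", "protA1", "MKT"),
    ("geneA", "protA2", "MKTLLV"),
    ("geneB", "protB1", "MSE"),
]
reps = pick_representatives(proteins)
```

Here `reps` maps each gene ID to a single protein, so every gene contributes exactly one sequence to whatever tree-building step follows.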
Ensembl Regulation provides resources for studying gene expression and its regulation, with a focus on the computational annotation of non-coding regulatory features in the genome.