Assembly
Vervet-AGM (Chlorocebus sabaeus), also known as the Vervet Monkey or African Green Monkey, is native to West Africa and was introduced to the Caribbean in the late 1600s. The species is an important model for studying high blood pressure and, because it is a host for simian immunodeficiency virus (SIV), AIDS. This release features the assembly ChlSab1.1 (GCA_000409795.2), which became available in March 2014.
The assembly comprises 2,003 toplevel sequences (including 21 chromosomes) from 162,723 contigs. The N50 of the scaffolds is 81.8 Mb. The N50 is the length such that 50% of the assembled genome lies in blocks of that length or longer.
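The N50 definition above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used to produce the figure quoted here: the function name `n50` and the toy lengths are assumptions made for the example, and the lengths are not taken from ChlSab1.1.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort the lengths from longest to shortest and walk down the list,
    accumulating bases, until at least half of the total is covered.
    The length of the block that crosses the halfway mark is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths, invented for illustration (not ChlSab1.1 data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70: blocks of 80 and 70 cover 150 of 300 bases
```

Applied to the real scaffold lengths of an assembly, the same calculation yields the scaffold N50; applied to contig lengths, it yields the contig N50.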
Gene annotation
The gene set was built using a mixed approach. Because species-specific sequences were lacking but RNASeq data for Vervet-AGM were available from Washington University, the final gene set combines models based on orthologous proteins from the vertebrate division of UniProtKB, the longest translations of some human gene models from Ensembl 73, and models built from the RNASeq data.
11,258 gene models were made exclusively from RNASeq data. The data were also used to add UTRs to gene models. The total gene set contains 19,165 protein-coding genes with a further 8,218 ncRNAs and 505 pseudogenes.
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | ChlSab1.1, INSDC Assembly GCA_000409795.2, Mar 2014 |
Base Pairs | 2,789,656,328 |
Golden Path Length | 2,789,656,328 |
Annotation provider | Ensembl |
Annotation method | Full genebuild |
Genebuild started | Mar 2014 |
Genebuild released | Oct 2014 |
Genebuild last updated/patched | Feb 2015 |
Database version | 113.1 |
Gene counts
Coding genes | 19,165 |
Non coding genes | 8,245 |
Small non coding genes | 6,326 |
Long non coding genes | 3 |
Misc non coding genes | 1,916 |
Pseudogenes | 575 |
Gene transcripts | 28,078 |
Other
Genscan gene predictions | 88,465 |
Short Variants | 0 |