Human self-LastZ results

Human (Homo sapiens, GRCh38) was aligned to itself using the LastZ alignment algorithm in Ensembl release 76. After the LastZ run, trivial alignments of a region to the same location were removed, and the remaining raw LastZ alignment blocks were chained according to their location in both copies of the genome. During the final netting step, the best sub-chain was chosen for each region of the reference species.
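The removal of alignments between the same locations can be sketched as a simple filter. This is a minimal illustration, not the Ensembl pipeline code: the `Block` record, its field names, and the rule that only forward-strand hits with identical coordinates count as trivial are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical record for one raw alignment block (not an Ensembl type)."""
    ref_chrom: str
    ref_start: int
    ref_end: int
    qry_chrom: str
    qry_start: int
    qry_end: int
    qry_strand: int  # +1 or -1


def is_trivial_self_hit(block: Block) -> bool:
    """True when a block aligns a region to exactly the same location.

    Assumption: only forward-strand hits with identical coordinates on
    both sides are treated as trivial.
    """
    return (
        block.qry_strand == 1
        and block.ref_chrom == block.qry_chrom
        and block.ref_start == block.qry_start
        and block.ref_end == block.qry_end
    )


def drop_trivial_self_hits(blocks):
    """Keep only blocks that relate two different locations."""
    return [b for b in blocks if not is_trivial_self_hit(b)]
```

Blocks that survive this filter are the ones that would go on to chaining and netting.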

Self-alignments reveal duplicated regions. Some are recent and almost identical; others are much older and indicate ancient whole-genome duplications.

Configuration parameters

Parameter                                                     Value
Gap open penalty (O)                                          400
Gap extend penalty (E)                                        30
HSP threshold (K)                                             5000
Threshold for gapped extension (L)                            5000
Threshold for alignments between gapped alignment blocks (H)  3000
Masking count (M)                                             10
Seed and Transition value (T)                                 1
Scoring matrix (Q)                                            $ENSEMBL_ROOT_DIR/ensembl-compara/scripts/pipeline/primate.matrix
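The single-letter codes in parentheses correspond to the short `KEY=value` options that LastZ accepts on its command line. As a rough sketch, the parameters above could be assembled into an options string as follows. The dictionary name and helper function are hypothetical, and whether the Ensembl pipeline passed exactly this string, in this order, is an assumption.

```python
# Values copied from the configuration parameters listed above.
# LASTZ_PARAMS and format_lastz_options are illustrative names only.
LASTZ_PARAMS = {
    "T": 1,      # seed and transition value
    "K": 5000,   # HSP threshold
    "L": 5000,   # threshold for gapped extension
    "H": 3000,   # threshold for alignments between gapped blocks
    "M": 10,     # masking count
    "O": 400,    # gap open penalty
    "E": 30,     # gap extend penalty
    "Q": "$ENSEMBL_ROOT_DIR/ensembl-compara/scripts/pipeline/primate.matrix",
}


def format_lastz_options(params):
    """Join parameters into a space-separated KEY=value string."""
    return " ".join(f"{key}={value}" for key, value in params.items())


if __name__ == "__main__":
    print(format_lastz_options(LASTZ_PARAMS))
```

The `$ENSEMBL_ROOT_DIR` variable is left unexpanded here; a shell or the calling pipeline would need to resolve it before LastZ reads the matrix file.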

Chunking parameters

Parameter       Human (reference)   Human (non-reference)
Chunk size      30,000,000          10,100,000
Overlap         0                   100,000

Group set size: 10,100,000
Masking options
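To make the chunk size and overlap values concrete, the sketch below splits a sequence into overlapping windows. It is an illustration under stated assumptions rather than the Ensembl implementation: the function name is hypothetical, coordinates are 0-based and half-open, and each chunk is assumed to start `chunk_size - overlap` bases after the previous one.

```python
def chunk_ranges(seq_length, chunk_size, overlap):
    """Return (start, end) windows covering a sequence of seq_length bases.

    Assumptions: 0-based half-open coordinates; consecutive chunks share
    `overlap` bases; the last chunk is truncated at the sequence end.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    ranges = []
    start = 0
    while start < seq_length:
        end = min(start + chunk_size, seq_length)
        ranges.append((start, end))
        if end == seq_length:
            break
        start += step
    return ranges


if __name__ == "__main__":
    # Non-reference settings from the table, on a made-up 30 Mb sequence.
    for start, end in chunk_ranges(30_000_000, 10_100_000, 100_000):
        print(start, end)
```

With the non-reference settings, each new chunk starts 10,000,000 bases after the previous one, so neighbouring chunks share 100,000 bases. With the reference settings (overlap 0), chunks simply tile the sequence.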

Statistics over 186,172 alignment blocks

Human

Genome coverage (bp)
Uncovered: 2,777,232,633 out of 3,099,750,718
Covered: 322,518,085 out of 3,099,750,718

Coding exon coverage (bp)
Uncovered: 31,042,982 out of 38,571,754
Matches: 6,850,638 out of 38,571,754
Mismatches: 580,885 out of 38,571,754
Insertions: 97,249 out of 38,571,754
Identity over aligned base-pairs: 91.0%
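The figures above can be cross-checked with a few lines of arithmetic. The variable names are illustrative, and the identity formula is an inference rather than a documented definition: the reported 91.0% is reproduced when insertions are counted among the aligned base-pairs.

```python
# Genome coverage figures, copied from the statistics above.
genome_total = 3_099_750_718
genome_covered = 322_518_085
genome_uncovered = 2_777_232_633

# Coding exon figures, copied from the statistics above.
exon_total = 38_571_754
exon_uncovered = 31_042_982
matches = 6_850_638
mismatches = 580_885
insertions = 97_249

# Each category set should add up to its stated total.
assert genome_covered + genome_uncovered == genome_total
assert exon_uncovered + matches + mismatches + insertions == exon_total

genome_coverage_pct = 100 * genome_covered / genome_total

# Assumption: "aligned base-pairs" = matches + mismatches + insertions.
aligned = matches + mismatches + insertions
exon_coverage_pct = 100 * aligned / exon_total
identity_pct = 100 * matches / aligned

if __name__ == "__main__":
    print(f"Genome covered:      {genome_coverage_pct:.1f}%")
    print(f"Coding exons covered: {exon_coverage_pct:.1f}%")
    print(f"Identity:            {identity_pct:.1f}%")
```

Under these assumptions, roughly a tenth of the genome and about a fifth of coding exon bases fall inside a self-alignment.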