Human self-LastZ results

Human (Homo sapiens, GRCh38) was aligned to itself using the LastZ alignment algorithm in Ensembl release 76. After running LastZ, trivial alignments of a location to itself were removed, and the remaining raw LastZ alignment blocks were chained according to their location in both copies of the genome. During the final netting process, the best sub-chain was chosen for each region of the reference genome.
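
The first clean-up step described above, discarding alignments between the same locations, can be illustrated with a minimal sketch. The `Block` record and both function names are hypothetical and are not taken from the Ensembl pipeline. The sketch assumes that a trivial self-hit is a forward-strand block whose reference and query coordinates are identical.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical record for one raw alignment block."""
    ref_chr: str
    ref_start: int
    ref_end: int
    qry_chr: str
    qry_start: int
    qry_end: int
    strand: str  # "+" or "-"


def is_trivial_self_hit(b: Block) -> bool:
    # Assumption: a block is trivial when it maps a forward-strand
    # region onto exactly the same coordinates of the same sequence.
    return (
        b.strand == "+"
        and b.ref_chr == b.qry_chr
        and b.ref_start == b.qry_start
        and b.ref_end == b.qry_end
    )


def drop_trivial_self_hits(blocks: List[Block]) -> List[Block]:
    # Keep only blocks that relate two different locations.
    return [b for b in blocks if not is_trivial_self_hit(b)]


blocks = [
    Block("1", 1000, 2000, "1", 1000, 2000, "+"),      # region aligned to itself
    Block("1", 1000, 2000, "1", 500000, 501000, "+"),  # duplication on the same chromosome
    Block("1", 1000, 2000, "7", 1000, 2000, "+"),      # same coordinates, different chromosome
]
kept = drop_trivial_self_hits(blocks)
```

Only the first block is dropped. The other two relate distinct locations and would go on to chaining and netting.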

Self-alignments reveal duplicated regions: some are recent and almost identical, while others are much older and indicate ancient whole-genome duplications.

Configuration parameters

Parameter                                                      Value
Gap open penalty (O)                                           400
Gap extend penalty (E)                                         30
HSP threshold (K)                                              5000
Threshold for gapped extension (L)                             5000
Threshold for alignments between gapped alignment blocks (H)   3000
Masking count (M)                                              10
Seed and Transition value (T)                                  1
Scoring matrix (Q)
Primate:
    A    C    G    T
   90 -330 -236 -356
 -330  100 -318 -236
 -236 -318  100 -330
 -356 -236 -330   90
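
To show how the substitution matrix and the gap penalties combine, here is a minimal sketch that scores a pre-aligned pair of sequences. It is not the LastZ implementation. The function name is hypothetical, and the sketch assumes that a gap of length n costs O + n × E, so the first gapped base pays both the open and the extend penalty. Rows are taken to be the reference base and columns the query base, which makes no difference here because the matrix is symmetric.

```python
BASES = "ACGT"

# Primate scoring matrix (Q) from the table above, in A, C, G, T order.
PRIMATE_ROWS = [
    (90, -330, -236, -356),
    (-330, 100, -318, -236),
    (-236, -318, 100, -330),
    (-356, -236, -330, 90),
]
SCORE = {
    (a, b): PRIMATE_ROWS[i][j]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}

GAP_OPEN = 400   # O
GAP_EXTEND = 30  # E


def score_alignment(ref: str, qry: str) -> int:
    """Score two gapped strings of equal length ('-' marks a gap).

    Only A, C, G and T are handled; other symbols raise KeyError.
    """
    if len(ref) != len(qry):
        raise ValueError("aligned strings must have equal length")
    total = 0
    gap_side = None  # which sequence currently carries the gap, if any
    for r, q in zip(ref.upper(), qry.upper()):
        if r == "-" and q == "-":
            raise ValueError("column with gaps in both sequences")
        if r == "-" or q == "-":
            side = "ref" if r == "-" else "qry"
            if side != gap_side:
                total -= GAP_OPEN  # charged once per gap
            total -= GAP_EXTEND    # charged for every gapped base
            gap_side = side
        else:
            total += SCORE[(r, q)]
            gap_side = None
    return total


example_score = score_alignment("ACGTACGT", "ACG--CGA")
print(example_score)  # -326
```

In the example, five matching columns contribute 490, the two-base gap costs 400 + 2 × 30 = 460, and the final T/A mismatch costs 356, giving -326.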

Chunking parameters

Parameter        Human (reference)   Human (non-reference)
Chunk size       30,000,000          10,100,000
Overlap          0                   100,000
Group set size   0                   10,100,000

Masking options: {default_soft_masking => 1}
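
The chunk size and overlap values can be read as instructions for cutting each sequence into windows before alignment. The sketch below is one plausible interpretation, not the Ensembl pipeline code. The function name is hypothetical, and the assumption is that consecutive chunks start `chunk_size - overlap` bases apart, so that neighbouring chunks share `overlap` bases.

```python
from typing import List, Tuple


def chunk_intervals(seq_length: int, chunk_size: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) chunks covering a sequence."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap  # assumed distance between chunk starts
    chunks = []
    start = 0
    while start < seq_length:
        end = min(start + chunk_size, seq_length)
        chunks.append((start, end))
        if end == seq_length:
            break  # the last chunk reaches the end of the sequence
        start += step
    return chunks


# Non-reference settings on a hypothetical 25 Mb sequence.
non_ref_chunks = chunk_intervals(25_000_000, chunk_size=10_100_000, overlap=100_000)

# Reference settings (no overlap) on a hypothetical 65 Mb sequence.
ref_chunks = chunk_intervals(65_000_000, chunk_size=30_000_000, overlap=0)
```

Under this reading, the non-reference example yields three chunks starting at 0, 10,000,000 and 20,000,000, each sharing 100,000 bases with its neighbour, while the reference example yields three non-overlapping chunks.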

Statistics over 152,975 alignment blocks

Human genome coverage (bp):
Uncovered: 2,973,649,289 out of 3,096,649,726
Covered: 123,000,437 out of 3,096,649,726

Human coding exon coverage (bp):
Uncovered: 27,423,921 out of 35,239,646
Matches: 5,960,500 out of 35,239,646
Mismatches: 1,707,758 out of 35,239,646
Insertions: 147,467 out of 35,239,646
Identity over aligned base-pairs: 76.3%
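
The figures above can be cross-checked with a few lines of arithmetic. The sketch assumes that identity is computed as matches divided by the sum of matches, mismatches and insertions. That definition is inferred from the reported numbers, which it reproduces, and is not stated in the report itself. The variable names are illustrative.

```python
# Genome-wide counts (bp) from the statistics above.
genome_total = 3_096_649_726
genome_covered = 123_000_437
genome_uncovered = 2_973_649_289

# Coding exon counts (bp) from the statistics above.
exon_total = 35_239_646
exon_uncovered = 27_423_921
matches = 5_960_500
mismatches = 1_707_758
insertions = 147_467

# Aligned coding bases are everything that is not uncovered.
aligned = matches + mismatches + insertions

genome_coverage_pct = 100 * genome_covered / genome_total
exon_coverage_pct = 100 * aligned / exon_total
identity_pct = 100 * matches / aligned  # assumed definition of identity

print(f"Genome coverage:      {genome_coverage_pct:.2f}%")  # 3.97%
print(f"Coding exon coverage: {exon_coverage_pct:.1f}%")    # 22.2%
print(f"Identity:             {identity_pct:.1f}%")         # 76.3%
```

The categories sum to the stated totals in both cases, so about 4% of the genome and about 22% of coding exon bases fall within self-alignment blocks.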