Example 2: Erroneously Overlapping Annotations

perl xGAEVAL.pl --clean_db --GFF examples/example2.gff3


This example illustrates a specific kind of incongruence that GAEVAL is designed to detect. Two Arabidopsis gene loci (At2g01960.1 and At2g01970.1) are annotated as overlapping genes. While genes do sometimes overlap, this instance is the result of ambiguity in the automated annotation pipeline used to annotate these loci. The example includes two 'full-length' cDNAs, which show the actual extent of the processed transcript from each gene, and two EST alignments. One EST is clearly associated with the At2g01970.1 gene because it shares a predicted intron with that annotation. The second EST is ambiguous: reviewed outside the context of the other evidence alignments, it could be associated with either gene annotation.

GAEVAL Results:

Incongruency Analysis (At2g01960.1)

1:  Ambiguously Overlapping Annotations Detected:
2:   At2g01970.1
3:    Ambiguous evidence:
4:     gaeval_evidence (1 alignments)

This section describes incongruencies found between the annotation and the supplied evidence alignments.


Gene structure prediction programs often build a structure by assembling and clustering all available sequence alignments in a given genomic region, and they seldom consider adjacent annotations. As a result, when a sequence is completely contained within an exon and lies close to the adjacent gene, it can be used ambiguously to define the structure of both gene annotations. This creates an artificial overlap.
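The reasoning above can be sketched in a few lines of Python. This is only an illustration of the idea, not GAEVAL's actual implementation: the class and function names (Structure, spans_overlap, classify) and all coordinates are hypothetical, and the rule used here (a shared intron ties an alignment to one gene, while an intronless alignment contained in an exon of more than one gene is ambiguous) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """A gene annotation or evidence alignment (hypothetical model).

    exons: sorted tuple of (start, end) pairs, 1-based inclusive.
    """
    name: str
    exons: tuple

    @property
    def span(self):
        # Genomic extent from the first exon start to the last exon end.
        return (self.exons[0][0], self.exons[-1][1])

    @property
    def introns(self):
        # Introns are the gaps between consecutive exons.
        return {
            (left_end + 1, right_start - 1)
            for (_, left_end), (right_start, _) in zip(self.exons, self.exons[1:])
        }


def spans_overlap(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify(alignment, genes):
    """Assign an alignment to genes under the simplified rule described above."""
    # A shared intron is treated as unambiguous support for a gene.
    shared = [g.name for g in genes if alignment.introns & g.introns]
    if shared:
        return ("supports", shared)
    # An intronless alignment is compatible with every gene that has an
    # exon completely containing it.
    start, end = alignment.span
    compatible = [
        g.name
        for g in genes
        if not alignment.introns
        and any(s <= start and end <= e for s, e in g.exons)
    ]
    if len(compatible) > 1:
        return ("ambiguous", compatible)
    if compatible:
        return ("supports", compatible)
    return ("unassigned", [])


# Made-up coordinates: the last exon of geneA overlaps the first exon of geneB.
gene_a = Structure("geneA", ((100, 300), (400, 900)))
gene_b = Structure("geneB", ((800, 1000), (1100, 1400)))
genes = [gene_a, gene_b]

est_intron = Structure("est1", ((950, 1000), (1100, 1200)))  # shares geneB's intron
est_single = Structure("est2", ((820, 880),))                # inside the overlap

if __name__ == "__main__":
    print(spans_overlap(gene_a.span, gene_b.span))
    print(classify(est_intron, genes))
    print(classify(est_single, genes))
```

In this toy data, est1 plays the role of the EST tied to one gene by a shared intron, and est2 plays the role of the ambiguous EST that sits entirely within the region where the two annotations overlap.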