Vector Sequence Contaminations
The latest assembly of the Arabidopsis thaliana genome (GenBank entries of 2/19/04) contains vector sequence contamination. For example, region 3,617,880 to 3,625,027 of chromosome 2 is a cloning vector. Alignments were generated using VMATCH; the alignment output for each region is linked in the table below.
Complete list of vector contaminations:
| CHROMOSOME | LOCATION | GENOMIC CONTEXT VIEW | VMATCH ALIGNMENT |
|---|---|---|---|
| 2 | 3,617,880 to 3,625,027 | gc-vc2 | vm-vc2 |
| 3 | 5 to 101 | gc-vc3a | vm-vc3 |
| 3 | 13,748,162 to 13,748,842 | gc-vc3b | vm-vc3 |
| 3 | 13,754,115 to 13,760,806 | gc-vc3c | vm-vc3 |
| 5 | 11,246,005 to 11,246,090 | gc-vc5 | vm-vc5 |
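The listed regions can be put to use directly, for instance to mask the contaminated spans before downstream analysis. The following is a minimal sketch, not the method used to produce this page: it assumes the coordinates are 1-based and inclusive (the page does not state the convention), and the names `VECTOR_REGIONS`, `region_length`, and `mask_regions` are hypothetical helpers introduced only for illustration.

```python
# Regions copied from the list above as (chromosome, start, end).
# ASSUMPTION: coordinates are 1-based and inclusive; the page does not say.
VECTOR_REGIONS = [
    (2, 3_617_880, 3_625_027),
    (3, 5, 101),
    (3, 13_748_162, 13_748_842),
    (3, 13_754_115, 13_760_806),
    (5, 11_246_005, 11_246_090),
]


def region_length(start, end):
    """Length in bases of a 1-based, inclusive interval."""
    return end - start + 1


def mask_regions(sequence, regions, mask_char="N"):
    """Return a copy of `sequence` with every (start, end) region replaced by mask_char."""
    buf = bytearray(sequence, "ascii")
    for start, end in regions:
        if start < 1 or end > len(buf) or start > end:
            raise ValueError(f"region {start}-{end} is outside the sequence")
        # Convert the 1-based inclusive interval to a 0-based half-open slice.
        buf[start - 1:end] = mask_char.encode("ascii") * region_length(start, end)
    return buf.decode("ascii")


if __name__ == "__main__":
    # Report the size of each listed region.
    for chrom, start, end in VECTOR_REGIONS:
        print(f"chromosome {chrom}: {start:,} to {end:,} ({region_length(start, end):,} bp)")

    # Toy demonstration on a short made-up sequence, not real genome data.
    print(mask_regions("ACGTACGTAC", [(3, 5)]))
```

To mask a real chromosome, one would load its sequence as a string and pass only the `(start, end)` pairs whose chromosome number matches.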
- NCBI UniVec - database and tools "that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination)"
- VMATCH - powerful string matching software developed by Dr. Stefan Kurtz