State of the Arabidopsis Genome

Dec. 20, 2000 D. Nelson

Only 7 clones remain unsequenced, and there is also one 5kb gap. The partial CYP84A4
gene has not been located in the Arabidopsis sequence, so it may lie on one of these 7
clones or in the 5kb gap. If it does not turn up, it may be contamination from another
species. Three ESTs and three genome survey sequences (GSS) do not match known
Arabidopsis P450 genes with high confidence:
AI993108 91% to 71B18 I-helix to heme
AI993723 91% to 72A13 C-term
AW004264 89% to 72A15 C-term
B67502 BAC clone T24M9 C-helix region 74% to 97C1
B09868 GSS fragment aa 157-207 mid region 55% to 86A4
B12879 GSS fragment mid region aa 142-218 51% to 86A4, only 40% to B09868

The three ESTs may simply be poor-quality sequence from their best-matching genes, but
the GSS fragments (51-74% to their best matches) do not appear similar enough for that
to be the explanation.
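
The contrast between the two groups can be checked with a short Python sketch. The
records are transcribed from the list above; the EST/GSS labels follow the text (three
ESTs, three genome survey sequences), and the tuple layout and function name are
illustrative choices, not part of any database format.

```python
# Records transcribed from the list above:
# (accession, kind, percent identity to best match, best-matching P450).
# The "kind" labels are read from the text: three ESTs, three GSS.
UNMATCHED = [
    ("AI993108", "EST", 91, "71B18"),
    ("AI993723", "EST", 91, "72A13"),
    ("AW004264", "EST", 89, "72A15"),
    ("B67502", "GSS", 74, "97C1"),
    ("B09868", "GSS", 55, "86A4"),
    ("B12879", "GSS", 51, "86A4"),
]


def identity_range(kind):
    """Return (lowest, highest) percent identity for one kind of sequence."""
    values = [pct for _, k, pct, _ in UNMATCHED if k == kind]
    return min(values), max(values)


for kind in ("EST", "GSS"):
    low, high = identity_range(kind)
    print(f"{kind}: {low}-{high}% identity to best match")
```

The ESTs cluster at 89-91%, while the GSS fragments fall between 51% and 74%.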

Chr I

Two clones are not yet in GenBank (F8L2 and F20B16), and there is a centromere gap
between T28N5 and F25O15 that contains 180bp repeats (see Figure 6, page 805, Nature 408,
14 Dec. 2000).

Chr II

There is a centromere sequence gap between T12J2 and T14C8 that contains the 180bp
repeat cluster. Between T5M2 and T5E7, just upstream of T12J2, there is a 270kb insertion
of the Arabidopsis mitochondrial genome (99% identical; see Nature 402, p. 765, Dec. 16,
1999). There are no missing clones on the long or short arms.

Chr III

The centromere gap of 180bp repeats is between T15D2 and T25F15. In Figure 6 (page
805, Nature 408, 14 Dec. 2000), T25F15 is not shown. The next clone downstream is
F23H6. There is a 5kb gap near the end of the chromosome between T12C14 and F26K9
(p. 820, Nature 408, Dec. 14, 2000).

Chr IV

The centromeric gap is between clones F21I2 and F14G16. This gap contains 180bp
repeats and 5S rDNA. There are probably no other genes in this region.

Chr V

Five clones are not yet in GenBank: T19N18, T2K12, T2L20 and an adjacent pair, F13M11
and T6G21. The only other unresolved region is at the centromere, where an island of
four clones (T3P1, F7I20, F17M7 and F19I11) is not connected by sequence to the long or
short arms. The gap between F23C8 and T3P1 contains the long cluster of rDNA, and the gap
between F19I11 and F13C19 holds the 180bp repeat cluster. There are probably no genes in
these regions aside from the rRNA (see Figure 6, page 805, Nature 408, 14 Dec. 2000).
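
As a cross-check on the count of 7 unsequenced clones given at the top, the clone names
from the chromosome sections can be tallied in a short Python sketch. The dictionary
layout and function name are illustrative. The empty lists for Chr III and Chr IV are an
assumption: those sections name no missing clones, which is consistent with the stated
total but is not said outright.

```python
# Clones named in the chromosome sections as not yet in GenBank.
# Chr II is stated to have no missing clones; the empty lists for Chr III
# and Chr IV are assumed because no missing clones are named there.
MISSING_CLONES = {
    "I": ["F8L2", "F20B16"],
    "II": [],
    "III": [],
    "IV": [],
    "V": ["T19N18", "T2K12", "T2L20", "F13M11", "T6G21"],
}


def total_missing(clones_by_chromosome):
    """Count missing clones across all chromosomes."""
    return sum(len(clones) for clones in clones_by_chromosome.values())


for chromosome, clones in MISSING_CLONES.items():
    listed = ", ".join(clones) if clones else "none"
    print(f"Chr {chromosome}: {len(clones)} missing ({listed})")

print("Total missing clones:", total_missing(MISSING_CLONES))
```

The tally comes to 2 clones on Chr I plus 5 on Chr V, matching the 7 quoted above.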