Dec. 20, 2000  D. Nelson

The state of the Arabidopsis genome.

Only 7 clones remain unsequenced.  In addition there is one 5kb gap.  The 
partial CYP84A4 gene has not been located in the Arabidopsis sequence, so it may be on 
one of these 7 clones or in the 5kb gap.  If it does not turn up, it may be contamination 
from another species.  There are three ESTs and three genome survey sequences (GSS) that 
do not match known P450 genes of Arabidopsis with high confidence.  
AI993108 91% to 71B18 I-helix to heme
AI993723 91% to 72A13 C-term
AW004264 89% to 72A15 C-term
B67502 BAC clone T24M9 C-helix region 74% to 97C1
B09868 GSS fragment aa 157-207 mid region 55% to 86A4
B12879 GSS fragment mid region aa 142-218 51% to 86A4, only 40% to B09868

The three ESTs may just be poor quality sequence for the best matched genes, but the GSS 
fragments do not appear to be that similar to their best matches.  
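
The split between the ESTs and the GSS fragments can be sketched in a few lines of Python.  
The accessions, percent identities and best matches are copied from the list above.  The 
85% cutoff is only an illustrative assumption chosen to separate the two groups; it is not 
a threshold stated anywhere in these notes.

```python
# (accession, type, percent identity to best match, best matched P450)
# Values are taken from the list above.
hits = [
    ("AI993108", "EST", 91, "71B18"),
    ("AI993723", "EST", 91, "72A13"),
    ("AW004264", "EST", 89, "72A15"),
    ("B67502", "GSS", 74, "97C1"),
    ("B09868", "GSS", 55, "86A4"),
    ("B12879", "GSS", 51, "86A4"),
]

# Hypothetical cutoff, for illustration only.
CUTOFF = 85


def likely_known_gene(hits, cutoff=CUTOFF):
    """Accessions whose best match is at or above the cutoff."""
    return [acc for acc, kind, pct, best in hits if pct >= cutoff]


def possibly_novel(hits, cutoff=CUTOFF):
    """Accessions whose best match falls below the cutoff."""
    return [acc for acc, kind, pct, best in hits if pct < cutoff]


if __name__ == "__main__":
    print("likely known genes:", likely_known_gene(hits))
    print("possibly novel:", possibly_novel(hits))
```

With this cutoff the three ESTs fall in the first group and the three GSS entries in the 
second, which is the same grouping described in the paragraph above.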

Chr I 

There are two clones not yet in Genbank (F8L2 and F20B16) and a centromere gap between 
T28N5 and F25O15 that contains 180bp repeats (see Figure 6 page 805 Nature 408 14 
Dec. 2000).

Chr II

There is a centromere sequence gap between T12J2 and T14C8 that contains the 180bp 
repeat cluster.  Between T5M2 and T5E7, just upstream of T12J2, there is a 270kb insertion 
of the Arabidopsis mitochondrial genome (99% identical) [see Nature 402 p. 765, Dec. 16, 
1999].  There are no missing clones on the long or short arms.


Chr III

The centromere gap of 180bp repeats is between T15D2 and T25F15.  In Figure 6 (page 
805 Nature 408 14 Dec. 2000), T25F15 is not shown.  The next clone downstream is 
F23H6.  There is a 5kb gap near the end of the chromosome between T12C14 and F26K9 
(p. 820 Nature 408 14 Dec. 2000).  

Chr IV

The centromeric gap is between clones F21I2 and F14G16.  This gap contains 180bp 
repeats and 5S rDNA.  There are probably no other genes in this region.  

Chr V

There are five clones that are not yet in Genbank: T19N18, T2K12, T2L20 and an adjacent 
pair, F13M11 and T6G21.  The only other unresolved region is at the centromere, where there 
is an island of four clones (T3P1, F7I20, F17M7 and F19I11) that are not connected by 
sequence to the long or short arms.  The gap between F23C8 and T3P1 contains the long 
cluster of rDNA and the gap between F19I11 and F13C19 holds the 180bp repeat cluster.  
There are probably no genes in these regions aside from the rRNA (see Figure 6 page 805 
Nature 408 14 Dec. 2000).
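
As a quick cross-check, the clones named above as not yet in Genbank can be tallied against 
the count of 7 given at the top of these notes.  This is only a bookkeeping sketch: the 
dictionary name is made up for illustration, and only chromosomes where missing clones are 
named in the text are listed.

```python
# Clones named in these notes as not yet in Genbank, keyed by chromosome.
# Chromosomes with no named missing clones are simply omitted.
unsequenced = {
    "I": ["F8L2", "F20B16"],
    "V": ["T19N18", "T2K12", "T2L20", "F13M11", "T6G21"],
}

# Total across chromosomes; the notes open with a count of 7.
total = sum(len(clones) for clones in unsequenced.values())

if __name__ == "__main__":
    for chrom, clones in unsequenced.items():
        print(f"Chr {chrom}: {len(clones)} clones ({', '.join(clones)})")
    print("total unsequenced clones:", total)
```

The two clones on Chr I and the five on Chr V account for all 7.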