Rice P450 Links

Sept. 25, 2002: There are 458 different P450 gene sequences in rice. Some of these are short pseudogene fragments of only one exon or part of one exon. I count 309 full length, probably functional, P450 genes and another 14 that are now incomplete but may be completed as the genome is finished. That would give 323 full length P450s and 135 pseudogenes. This compares to 249 full length P450s and 23 pseudogenes in Arabidopsis. Several sequences may join together to form complete sequences: (98A17 and 98A18), (714D1 and 714D2), and various CYP730 fragments. This may reduce the gene count by three or four.

August 9, 2002: There are 875 total rice contigs; 341 are japonica and 534 are indica. 250 japonica sequences are full length. 293 rice sequences are full length (from both subspecies). I will be trying to name all the rice genes before August 20. All gene pieces have been sorted into family bins and many have been named, but some, like the 94 family, still need to be sorted into subfamilies. (8/19/02)

March 1, 2002: I have updated the rice P450 page. There are now 202 full length P450s, 61 pseudogene fragments and 61 incomplete sequences. A blast server for the rice P450 sequences is available. The 324 sequence contigs are non-redundant, meaning I have blasted them against each other and combined overlapping fragments into contigs. I have done systematic searches of the Genbank database for rice P450 ESTs, GSSs, sequences in nr and HTGS hits. These are compiled into the set of sequences on the blast server. The public data has now been completely searched. I will now be looking at the private 5X coverage of rice, but the results from this cannot be presented until they are approved according to legal agreements, so I will not be able to add them daily.

Note from Oct. 16, 2001: Since 1993, there are 197241 rice entries in Genbank. The majority are recent additions.

Year   Entries
2001   32693  (up to Oct. 16, 2001)
2000   43284
1999   67898
1998   34465
1997    8170
1996    1529
1995    3733
1994    3468
1993    4866
No entries before 1993.

Of the total number of entries (197241):

93164  are GSS sequences (Genome Survey Sequence, from BAC ends)
92634  are ESTs
  553  are STS (Sequence Tagged Sites for mapping)
  277  are patents
 1145  are HTGS (high throughput genomic)
 6667  are microsatellite repeats from the Monsanto data (in NR section with accession numbers starting with AY)
 2801  are other NR section sequences: mRNAs, genes and finished BACs

In 1999 the activity was:

Month  Total   GSSs    ESTs
01      6291
02      4294
03       688
04      4191
05      4860    1709   3094
06     10776    6162   4527
07      6370    5050   1284
08      3785    3239    444
09      1577    1188    360
10      1089     839    216
11     18077   17800    261
12     11507   11412     72

In 2000 the activity was 43284 entries (20085 GSSs + 22558 ESTs) for the whole year.

In 2001 the activity was 32639 entries (62 GSSs + 23831 ESTs), up to 10/16/01:

Month  Total   GSSs   ESTs   Notes
01       916      0    793
02      7063      0   1276   5637 = microsatellite sequences
03      1833     51   1614
04      1311      0   1212
05      2766      0   2632
06       702      0    595
07      3647      2   3355
08       576      0     85
09       420      9    420
10     13405      0  12269   12269 ESTs deposited Oct. 5
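
Several of the counts quoted above are stated as sums or differences of other counts, so they can be cross-checked mechanically. The following is a minimal Python sketch of such a check, not part of the original analysis. All variable and function names are illustrative labels chosen here; the numbers are copied from the notes.

```python
# Tallies copied from the notes; every name below is an illustrative label.

# Sept. 25, 2002: rice P450 gene counts
total_sequences = 458
full_length = 309
incomplete = 14
expected_full_length = full_length + incomplete        # 323
pseudogenes = total_sequences - expected_full_length   # 135

# August 9, 2002: rice contigs by subspecies
contigs = {"japonica": 341, "indica": 534}

# March 1, 2002: non-redundant sequence contigs by status
march_contigs = {
    "full length": 202,
    "pseudogene fragments": 61,
    "incomplete": 61,
}

# Oct. 16, 2001: Genbank rice entries by category
genbank_categories = {
    "GSS": 93164,
    "EST": 92634,
    "STS": 553,
    "patent": 277,
    "HTGS": 1145,
    "microsatellite": 6667,
    "other NR": 2801,
}


def check(label, parts, stated_total):
    """Print whether the parts add up to the stated total and return the result."""
    total = sum(parts)
    status = "ok" if total == stated_total else "MISMATCH"
    print(f"{label}: sum={total}, stated={stated_total} [{status}]")
    return total == stated_total


results = [
    check("Sept. 2002 full length", [full_length, incomplete], 323),
    check("Sept. 2002 all sequences", [expected_full_length, pseudogenes], 458),
    check("Aug. 2002 contigs", contigs.values(), 875),
    check("March 2002 contigs", march_contigs.values(), 324),
    check("Oct. 2001 Genbank entries", genbank_categories.values(), 197241),
]
```

The same `check` helper could be pointed at the monthly columns to see whether they add up to the stated yearly totals.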