Drosophila P450 Links

What’s New Jan. 27, 2000

All Drosophila melanogaster P450 protein sequences have been posted as a FASTA file: All Drosophila melanogaster P450s. These sequences are all in GenBank; no confidential sequences remain. There are 86 P450 genes and 4 pseudogenes. CYP51 is absent. Since CYP51 is also absent in C. elegans, this important eukaryotic sterol biosynthetic gene may have been lost in the common ancestor of flies and nematode worms.

What’s New Jan. 19, 2000

I have posted a 4 family tree with 89 sequences, including the new Drosophila sequences in the 4 family and in the 18 clan: New 4 Family tree. A second tree covering the remaining sequences, including the 6, 9, 12 and 28 families, is also posted: New 6 and 9 Family tree.

What’s New Jan. 2, 2000

Two new sequence alignments covering the I-helix through the C-terminal end of the proteins are posted. The first alignment covers 73 sequences (mainly insect sequences) in a 4 family alignment. The second alignment has 80 sequences (mainly 6, 9, 12 and 28 families). Many new Drosophila sequences are included in both alignments. See second alignment. These alignments will form the basis for naming the new Drosophila P450s, which will be done by Jan. 15, 2000. Once the trees for these alignments have been debugged and polished, they will be posted with the new nomenclature for the Drosophila sequences. The trees will contain about 31-32 confidential sequences, but these will not be in the alignments.

Nov. 29, 1999: We are now in the final stages of completion of the fly genome. In the next month, nearly all P450s in Drosophila should be identified and posted to this site. Currently 80 N-terminal sequences have been identified, and 57 of them are from complete sequences. The remaining 23 partial sequences should be filled in soon as the Drosophila data are deposited by Celera. I will hold off naming them until the sequence data are complete. For earlier updates see the What’s New section of the main page.

A press release of July 28 from Celera stated that one million sequences (500 million bp of sequence) have been completed from the Drosophila genome. Below is a quote from the press release.

“Celera expects to complete the random sequencing phase of Drosophila in early September when it will begin sequencing the human genome. This will entail completing another 2 million sequences-or about 1 billion letters of genetic code. Working with the Berkeley Drosophila Genome Project (BDGP), Celera will then fill gaps and resolve ambiguities in the sequence to produce finished sequence.
Celera will begin making sequence data available to the public in October 1999, and anticipates release of the completed sequence by the end of the year and publication in collaboration with the BDGP in early 2000.”

Note July 9: The Rubin sequencing effort continues to deposit more sequence, with over 1700 in June and at least 26 so far in July. These will be searched for new P450s.

Note June 16: The 4 family has been partially displayed in a tree including subfamilies 4A-4P, with many mammal sequences included. The 47 family is probably misnamed. Also, 4E4 should be a separate subfamily. Another tree has been prepared that reduces the number of very similar sequences in order to include more CYP4 subfamilies, up to Cyp4aa1 (formerly Cyp47). See the tree above with 61 CYP4 sequences.

Note: June 14, 1999 The tree with 56 insect P450s includes many new Drosophila sequences. Some are not yet named. This tree is based on an alignment that covers the I-helix to the ends of the sequences, since many are missing the N-terminal. The 4 family sequences are not included here. There are too many to fit; they will be treated in a separate tree.

Note: June 11, 1999 The Drosophila P450s have been found in GenBank by systematic BLAST searches of the nr, month, other ESTs, gss and htgs sections, using different P450 family representatives as queries. The first search, with Cyp4d2, yielded 101 new ESTs, 6 new sequences from month, one from htgs and none from gss or nr. The second search, with Cyp6d2, found only 17 new ESTs and one sequence from month. The third search hit only 5 new ESTs and one sequence from nr. At this point the search was halted, since the returns were not worth the effort of scanning the output for new sequences. Some of the new sequences are very different from other P450s (AC005130) and cannot be easily assembled into a complete sequence by comparison with known P450s. I have identified exon-containing ORFs from this gene, but I cannot detect the exon boundaries.
If you are brave, have a try at it. The new sequences (almost 300 total in the original FASTA file) have been compared with each other by repeated Do-It-Yourself WU-BLASTs and condensed onto 98 contigs. Ten of these are from other Drosophila species (4d10, 4e5, 6a9, 9b3, 9f1, 13b1, 28a1, 28a2, 28a3, 28a4); 88 are from D. melanogaster. Based on the 80 P450 genes of C. elegans, these 88 genes and gene fragments may represent nearly all the P450s from Drosophila, though some are probably N- and C-terminals of the same gene, and the number of contigs will drop as the genome is completed.

Note: On May 28, 1999, 28,049 Drosophila genome survey sequences were deposited from Genoscope in France. These are BAC end sequences. The percentage of the Drosophila genome sequenced, as reported in the MOT tables, jumped from 15% to 24%. I have not had a chance to search these for P450 hits, but there should be a number of new P450s in this large sequence collection covering 9% of the Drosophila genome.

A preliminary BLAST search with 6a2 as the query found 37 bona fide P450 hits in the genome survey sequences in the month section of GenBank. These probably represent 25 different genes. There are probably more than this, but I will have to search with other families like 9 and 28 to find them. These sequences have now been translated and added to the FASTA file above.

June 10, 1999. More extensive searches of the nr, est, htgs, gss and month sections of GenBank have identified 235 ESTs, 44 genome survey sequences, 30 AC00XXXX genomic P1 clones and 41 other sequences, for a total of 350 accession numbers for Drosophila P450s. These have all been translated and are being assembled into contigs. (See the FASTA file)
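
The condensation of almost 300 sequences onto 98 contigs by pairwise BLAST comparison can be pictured as grouping sequences that are linked by matches. The following is only a minimal sketch of that grouping idea, not the procedure actually used for this page: it assumes the BLAST results have already been reduced to pairs of matching identifiers, and the function name `cluster_by_matches` and the identifiers `seq1` to `seq6` are hypothetical. Real contig assembly also has to merge the overlapping sequences themselves, which is not attempted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_by_matches(
    ids: Iterable[str], matches: Iterable[Tuple[str, str]]
) -> List[List[str]]:
    """Group identifiers into clusters linked by pairwise matches (union-find)."""
    # Every identifier starts as its own cluster representative.
    parent: Dict[str, str] = {name: name for name in ids}

    def find(name: str) -> str:
        # Walk up to the representative, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in matches:
        # A match may mention an identifier that was not listed in ids.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for name in parent:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical input: six sequences and three pairwise matches.
ids = ["seq1", "seq2", "seq3", "seq4", "seq5", "seq6"]
matches = [("seq1", "seq2"), ("seq2", "seq3"), ("seq4", "seq5")]
clusters = cluster_by_matches(ids, matches)
print(len(clusters), "clusters")
for members in clusters:
    print(", ".join(members))
```

Under this picture, an N-terminal fragment and a C-terminal fragment of the same gene do not overlap, so no match links them and they stay in separate clusters. That is consistent with the expectation that the contig count will drop as the genome is completed.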

|AC007549  Drosophila melanogaster chromosome 2 clo...  1012  0.0 Cyp6a2
emb|AL054861.1|CNS00A30  Drosophila melanogaster 182  3e-77 cyp6a9
emb|AL053264.1|CNS0098O  Drosophila melanogaster 272  4e-73 cyp6a9
emb|AL072094.1|CNS00GEP  Drosophila melanogaster 178  1e-72 cyp6a9
emb|AL055555.1|CNS004XH  Drosophila melanogaster 171  5e-50 cyp6a9
emb|AL070586.1|CNS00DGA  Drosophila melanogaster 108  1e-38 cyp6a9
emb|AL054261.1|CNS004MS  Drosophila melanogaster 105  1e-22 cyp6a9
emb|AL069964.1|CNS00DFU  Drosophila 136  6e-32 72% identical to 6a9
emb|AL054065.1|CNS004PR  Drosophila melanogaster 222  2e-62 cyp6a8
emb|AL063862.1|CNS00350  Drosophila melanogaster 123  5e-28 cyp9c1
emb|AL076220.1|CNS00JFP  Drosophila melanogaster 77  5e-23 cyp9c1
gb|AC007581.2|AC007581  Drosophila melanogaster 89  7e-18 cyp9c1
gb|AC007291.10|AC007291  Drosophila melanogaster 57  9e-14 cyp4e3
emb|AL076873.1|CNS00JXU  Drosophila 191 1e-59 exact match with AA951440
emb|AL076863.1|CNS00JXK  Drosophila 173  2e-79 exact match with AA951440
emb|AL052842.1|CNS000F5  Drosophila 215  8e-56 exact match with AA699131
emb|AL074108.1|CNS00HVU  Drosophila 171  7e-48 exact match with AA699131
emb|AL078165.1|CNS00KMI  Drosophila 196  4e-50 exact match with Dm3472
emb|AL069773.1|CNS00ERU  Drosophila 87  4e-17 exact match to Dm3472
emb|AL065891.1|CNS006T4  Drosophila 126  5e-29 exact match with AA141600
emb|AL058810.1|CNS0017H  Drosophila 68  3e-11 exact match with Dm0590
emb|AL055637.1|CNS00ALR  Drosophila 62  1e-09 exact match with AL058497
emb|AL058497.1|CNS00BYD  Drosophila 40  0.006 exact match to AL055637
emb|AL070449.1|CNS00FAM 72  1e-12 exact match with composite sequence CK01076
emb|AL059533.1|CNS005I8 62  2e-09 exact match with composite sequence CK01076
emb|AL061295.1|CNS001S5  Drosophila 59  1e-08 exact match to L46858
emb|AL061650.1|CNS00613  Drosophila 74  3e-13 60% identical to L46858
emb|AL065705.1|CNS006L5  Drosophila 58  2e-08 exact match to AA698945
emb|AL059237.1|CNS00CG4 58  2e-08 exact match to AC005811, AL062712, AL068269, AL075733
emb|AL062712.1|CNS002HH 54  3e-07 exact match to AC005811, AL059237, AL068269, AL075733
emb|AL068269.1|CNS00LIR 70  3e-18 exact match to AC005811, AL062712, AL059237, AL075733
emb|AL075733.1|CNS00J4Z 51  2e-06 exact match to AC005811, AL062712, AL068269, AL059237
emb|AL054245.1|CNS009UB  Drosophila 46  2e-12 exact match to AL062352
emb|AL062352.1|CNS002D3  Drosophila 89  4e-29 exact match to AL054245
emb|AL057969.1|CNS00BXP  Drosophila 61  3e-09 exact match to AL067059
emb|AL067059.1|CNS007EC  Drosophila 56  9e-08 exact match to AL057969
emb|AL057750.1|CNS00162 136  5e-58 65% identical to AL062684
emb|AL062684.1|CNS002GP 155  7e-38  65% identical to AL057750
gb|AC007356.6|AC007356  Drosophila 71  2e-12 probable mitochondrial clan sequence
gb|AC005472.9|AC005472 114  2e-25 66% identical to AA567377
gb|AC007571.2|AC007571  Drosophila 82  1e-15 probable new family
emb|AL072844.1|CNS00H2C  Drosophila 80  5e-15 42% identical to 6a5
emb|AL070820.1|CNS00FNQ  Drosophila 40  0.006 40% identical to CYP28A1
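
A hit list like the one above can be read into a program for sorting or filtering. The sketch below is one possible way to do it, assuming each line has the layout seen in the list: an identifier, an optional description, an integer score, an E-value, and a free-text note. The names `Hit` and `parse_hit` and the 1e-10 cutoff are assumptions made for illustration. The sample lines are copied from the list.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    accession: str  # accession with any ".version" suffix removed
    score: int      # BLAST score as listed
    evalue: float   # expectation value as listed
    note: str       # free-text annotation at the end of the line


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one summary line; return None if it does not fit the layout."""
    tokens = line.split()
    if len(tokens) < 3:
        return None
    # Identifiers look like "emb|AL054861.1|CNS00A30": take the second "|" field.
    fields = tokens[0].split("|")
    if len(fields) < 2:
        return None
    accession = fields[1].split(".")[0]
    # The score is the first all-digit token followed by a token that is a number.
    for i in range(1, len(tokens) - 1):
        if not tokens[i].isdigit():
            continue
        try:
            evalue = float(tokens[i + 1])
        except ValueError:
            continue  # e.g. "2" followed by "clo..." in a description
        return Hit(accession, int(tokens[i]), evalue, " ".join(tokens[i + 2:]))
    return None


SAMPLE = """\
|AC007549  Drosophila melanogaster chromosome 2 clo...  1012  0.0 Cyp6a2
emb|AL054861.1|CNS00A30  Drosophila melanogaster 182  3e-77 cyp6a9
emb|AL069964.1|CNS00DFU  Drosophila 136  6e-32 72% identical to 6a9
emb|AL070449.1|CNS00FAM 72  1e-12 exact match with composite sequence CK01076
emb|AL070820.1|CNS00FNQ  Drosophila 40  0.006 40% identical to CYP28A1
"""

hits = [h for h in map(parse_hit, SAMPLE.splitlines()) if h is not None]
# Keep only hits at or below an arbitrary, illustrative E-value cutoff.
strong = [h for h in hits if h.evalue <= 1e-10]
for h in strong:
    print(h.accession, h.score, h.evalue, h.note)
```

The parser looks for the score and E-value as a pair rather than at fixed columns, because some lines in the list have no organism name and one has a truncated description containing a stray digit.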