Drosophila pseudoobscura P450s
The Drosophila pseudoobscura genome has been sequenced and aligned against the Drosophila melanogaster genome at the UC Santa Cruz genome browser. As part of a bioinformatics class, students searched the UCSC browser fruit fly genome with P450s from D. melanogaster. The aligned D. pseudoobscura sequences were obtained by clicking on the D. pseudoobscura histogram plot at the bottom of the browser window. The students then tried to assemble each aligned D. pseudoobscura gene by comparison to the D. melanogaster gene. This was a difficult task that met with moderate success. After they had submitted their efforts, I went back and checked each assembly and searched for more genes that had not been assigned to the class. There are now 80 assembled P450 sequences from D. pseudoobscura. The two CYP4D1 orthologs are alternative splice forms of a single gene, so these 80 sequences represent 79 genes. The CYP307A2P pseudogene has an ortholog in D. pseudoobscura that is nearly complete but still appears to be a pseudogene. The results of this analysis are given in the following links.
- 171 Drosophila genes including all D. melanogaster genes Feb. 13, 2004
- An alignment of 171 Drosophila P450s with intron location color coded Feb. 13, 2004
- An intron list for Drosophila P450s Feb. 16, 2004
- A link to the P450 Blast server with all 171 Drosophila sequences Feb. 13, 2004
- A tree with 169 Drosophila sequences Feb. 16, 2004
Some observations about the two genomes
Most orthologous pairs were 65% to 96% identical. One exception was the CYP313 family. Several of the CYP313s did not have orthologs; their best matches were to other CYP313s, but at only about 45-48% identity. This suggests that the ancestor of the two species had at least one additional subfamily that was lost in D. melanogaster. The CYP313As, in turn, seem to have been lost in D. pseudoobscura. CYP313A4 is a special case and should probably be placed in a separate subfamily.
Some of the more highly conserved P450s:
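For readers who want to reproduce this kind of figure, here is a minimal sketch of a percent identity calculation on an already aligned pair. The page does not say which definition was used, so the choice made here (identical residues divided by columns where neither sequence has a gap) is an assumption, and the two fragments are invented toy strings, not real P450 sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: gapped columns are ignored. Other conventions divide by
    the full alignment length or by the shorter sequence length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration only.
mel_fragment = "MKT-LLV"
pse_fragment = "MRTALLV"
print(round(percent_identity(mel_fragment, pse_fragment), 1))
```

Because the denominator convention changes the number, values from different tools are only comparable when the same convention is used throughout.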
CYP18A1 96% also called Eig17-1
Sequence and developmental expression of Cyp18, a member of a new cytochrome P450 family from Drosophila. Mol Cell Endocrinol. 1997 Jul 4;131(1):39-49.
CYP4C3 94%
A link to CYP4C3 at Antibes Inra France
CYP4G15 94%
A new cytochrome P450 from Drosophila melanogaster, CYP4G15, expressed in the nervous system. Biochem Biophys Res Commun. 2000 Jul 14;273(3):1132-7.
CYP4G1 93%
A link to CYP4G1 at Antibes Inra France
CYP49A1 91%
A link to CYP49A1 at Antibes Inra France
CYP314A1 90%
Shade is the Drosophila P450 enzyme that mediates the hydroxylation of ecdysone to the steroid insect molting hormone 20-hydroxyecdysone. Proc Natl Acad Sci U S A. 2003 Nov 25;100(24):13773-8.
CYP301A1 89%
A link to CYP301A1 at Antibes Inra France
CYP303A1 89%
A link to CYP303A1 at Antibes Inra France
Another function paper for shadow (sad) CYP315A1 and disembodied (dib) CYP302A1:
Molecular and biochemical characterization of two P450 enzymes in the ecdysteroidogenic pathway of Drosophila melanogaster. Proc Natl Acad Sci U S A. 2002 Aug 20;99(17):11043-8.
The difference in percent conservation may reflect differences in the selective pressure that keeps a sequence from deviating from the parent sequence. The obvious interpretation is that the most conserved P450s serve the most critical functions. Even comparable protein sequences would be expected to vary in their rate of change around some normal range, but the nearly 30% range seen here seems too great to be caused merely by statistical variation.
Of all the subfamilies, CYP6A has varied the most. Some of its sequences have no ortholog. This shows up in the browser alignment as two D. melanogaster sequences aligned to the same D. pseudoobscura sequence, and it is easier to see in the tree of the P450s. The pseudogenes Cyp6a15p and Cyp307a2p have been left out of this tree because they are too short to be used for making trees.
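The pattern of two D. melanogaster sequences aligning to one D. pseudoobscura sequence can be flagged mechanically once best matches are tabulated. The sketch below is not how the analysis on this page was done (that was by inspection in the browser); the function name and the gene labels are hypothetical placeholders.

```python
from collections import defaultdict


def shared_best_hits(best_hit: dict) -> dict:
    """Return targets that are the best match of more than one query.

    best_hit maps each query gene (e.g. D. melanogaster) to its best
    matching target gene (e.g. D. pseudoobscura).
    """
    by_target = defaultdict(list)
    for query, target in best_hit.items():
        by_target[target].append(query)
    # Keep only targets claimed by two or more queries.
    return {t: sorted(qs) for t, qs in by_target.items() if len(qs) > 1}


# Hypothetical labels, not real gene assignments.
best_hit = {
    "mel_gene_1": "pse_gene_1",
    "mel_gene_2": "pse_gene_1",  # second query hitting the same target
    "mel_gene_3": "pse_gene_2",
}
print(shared_best_hits(best_hit))
```

A target listed here is only a candidate: at least one of the queries lacks a one-to-one ortholog, and the tree is still needed to decide which.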
Intron gain and loss
The possible origin and loss of introns have been mapped onto the tree. This mapping is based on the sequence alignment with the color coded positions of all introns. Introns are generally not shared between the deepest divisions on the tree. One exception is the 14A intron (the first intron on page 14 of the alignment). This intron occurs in the heme signature as a phase 1 intron in the G of CIG. It is also seen in CYP4V5 of Fugu, so it is older than the divergence of protostomes and deuterostomes. Some introns are present in a majority of CYP4 clan members, especially the 6D intron before the ETAM exon, which is also seen in Fugu CYP4V5. The ETLR intron (11B) is seen in most CYP6 family members and in the related families CYP28 and CYP309. Several introns are common to the mitochondrial clan. The history of the gain and loss of these introns is apparent from the tree and the alignment.
The lack of a universal intron suggests one of two histories: either the introns were inserted after the deepest divisions occurred between the three main branches on the tree (CYP3 + CYP4 clans, CYP2 clan, Mito clan), or each of these three lines underwent a loss of introns and started over with a clean slate. Such a loss could only happen if the number of sequences was very small in each major clan.
The color coded alignment can be compared to similar color coded alignments of the Anopheles P450s and the Dictyostelium P450s on those sections of the P450 Homepage. Many of these introns are shared with Anopheles, including the ETAM (6D) and EXXR (11B) introns and intron 14A at CIG in the heme signature.
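As a reminder of what "phase 1" means for an intron such as 14A, the sketch below computes intron phase from the number of coding nucleotides upstream of the intron: phase 0 falls between codons, phase 1 after the first base of a codon, and phase 2 after the second. The codon number used in the example is a hypothetical placeholder, not the actual position of the CIG glycine in any P450.

```python
from typing import Optional


def intron_phase(coding_nt_before: int) -> int:
    """Phase of an intron given the count of coding bases upstream of it."""
    if coding_nt_before < 0:
        raise ValueError("count of coding nucleotides cannot be negative")
    return coding_nt_before % 3


def interrupted_codon(coding_nt_before: int) -> Optional[int]:
    """1-based index of the codon split by the intron, or None for phase 0."""
    if intron_phase(coding_nt_before) == 0:
        return None  # intron lies between two codons
    return coding_nt_before // 3 + 1


# Hypothetical example: suppose the Gly of CIG were codon 442.
# 441 complete codons plus the first base of the Gly codon lie upstream.
upstream = 441 * 3 + 1
print(intron_phase(upstream), interrupted_codon(upstream))
```

Two introns are usually treated as the same intron only when both the aligned position and the phase agree, which is why the phase is recorded alongside the location.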
Use of conserved introns to decide between differences in tree branching, NJ vs UPGMA
The CYP310 and the CYP6D clusters share introns 6B, 7A and 9B, yet they are widely separated on the UPGMA tree. These sequences are adjacent on the NJ tree, with CYP310 having a long branch. Based on the shared introns, this is probably the correct assignment. The UPGMA tree misplaced the CYP310 family because of its long branch.
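Why a long branch can mislead UPGMA but not NJ can be seen in a toy four-taxon case. The sketch below is an illustration only: the taxon labels and distances are invented, not taken from the Drosophila trees. The assumed true tree is ((short_1, long_1), (short_2, long_2)), with tip branches of length 1 for the short taxa and 10 for the long taxa, and an internal branch of length 1. UPGMA joins the pair with the smallest raw distance, while NJ joins the pair that minimizes the standard Q criterion, which corrects each distance by the total divergence of the two taxa.

```python
from itertools import combinations


def pair_distance(dist, a, b):
    """Look up the distance between two taxa in a symmetric table."""
    return dist[frozenset((a, b))]


def upgma_first_join(taxa, dist):
    """UPGMA joins the pair with the smallest raw distance."""
    return min(combinations(taxa, 2), key=lambda p: pair_distance(dist, *p))


def nj_first_join(taxa, dist):
    """NJ joins the pair minimizing Q(i,j) = (n-2)*d(i,j) - r_i - r_j."""
    n = len(taxa)
    total = {
        t: sum(pair_distance(dist, t, u) for u in taxa if u != t)
        for t in taxa
    }

    def q(pair):
        a, b = pair
        return (n - 2) * pair_distance(dist, a, b) - total[a] - total[b]

    return min(combinations(taxa, 2), key=q)


# Invented path distances for the assumed tree described above.
taxa = ["short_1", "long_1", "short_2", "long_2"]
dist = {
    frozenset(("short_1", "long_1")): 11,   # true neighbors
    frozenset(("short_2", "long_2")): 11,   # true neighbors
    frozenset(("short_1", "short_2")): 3,   # not neighbors, but closest
    frozenset(("short_1", "long_2")): 12,
    frozenset(("long_1", "short_2")): 12,
    frozenset(("long_1", "long_2")): 21,
}

print("UPGMA joins:", upgma_first_join(taxa, dist))
print("NJ joins:   ", nj_first_join(taxa, dist))
```

In this toy case UPGMA pairs the two short-branch taxa even though they are not neighbors on the assumed tree, while NJ recovers a true neighbor pair. That is the same kind of disagreement as the one described for CYP310, where the shared introns serve as independent evidence for the NJ placement.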