March 25, 1999
Note on P450 evolution in yeasts and early eukaryotes.

The yeast Saccharomyces cerevisiae has three cytochrome P450 genes: CYP51 (lanosterol 
14 alpha demethylase), CYP56 (DIT2, a dityrosine-forming enzyme required for spore wall 
biosynthesis), and CYP61 (C22 sterol desaturase).  The sequences of these proteins are 
given below.  The fission yeast Schizosaccharomyces pombe is a second fungal genome 
that is nearing completion (82.7% on Mar 22, 1999) and is expected to be completed this 
year.  So far only two P450s have been found in S. pombe.  These are homologs of the 
CYP51 and CYP61 sequences.  The CYP56 sequence has no homolog in S. pombe yet.  The 
CYP56 sequence may be specific to the spore wall biosynthesis program of yeast, so that 
it is absent in S. pombe.  A bluish dityrosine fluorescence has been seen in Candida 
albicans (Smail et al., 1995).
Candida albicans is closely related to S. cerevisiae and so it could have the DIT2 
gene.  S. pombe is more distantly related, perhaps branching from yeast 500-1000 
million years ago.  A tree of fungal species showing S. cerevisiae, C. albicans, S. 
pombe, Neurospora crassa and others is given here.  
The Candida albicans genome is also being sequenced.  The results from these three 
genomes should help to clarify the history of P450s in fungi.  The two P450s CYP51 and 
CYP61 are found in all three species, but CYP56 is only in S. cerevisiae so far.  If 
CYP56 is in C. albicans, but not in S. pombe, then it seems probable that CYP56 evolved 
from CYP51 or CYP61 after the S. pombe lineage branched off.  The ancestral yeast may 
have had only the two P450s and not three.  

Since CYP61 is a later enzyme in the ergosterol pathway, it is probable that CYP61 
evolved from a duplication and divergence of the CYP51 gene.  This would suggest that 
the ancestral eukaryote had only CYP51, essential for the formation of cholesterol and 
related sterols.  These sterols are characteristic of eukaryotes, and 
cholesterol/ergosterol biosynthesis presumably existed in the very earliest eukaryotes.  

It should be noted that Candida species also have the CYP52 family of P450s that 
metabolize alkanes.  These are not found in S. cerevisiae or S. pombe.  The question 
arises: did a progenitor CYP52 P450 exist in the common ancestor and then get lost in 
S. cerevisiae and S. pombe, or did alkane hydroxylases also evolve from the CYP51 
ancestor?  This question may also be reversed.  Did the common ancestor of all 
eukaryotes bring with it an alkane or fatty acid hydroxylase that gave rise to CYP51?  
The CYP52 sequences are most 
like a cluster of bacterial P450s called the eukaryote-like bacterial P450s.  These include 
CYP102 (a fatty acid hydroxylase) and CYP110.  This cluster next joins with the CYP4 
clan that also includes fatty acid hydroxylases.  This may be the more parsimonious 
history of eukaryotic P450s, that they derived from a eukaryote-like bacterial fatty 
acid hydroxylase and gave rise to CYP51 and cholesterol biosynthesis very early on in 
the eukaryote lineage.  Mutagenesis experiments on CYP2C2 have shown that a single 
amino acid substitution, S473V, allows CYP2C2 to accept progesterone as a substrate, 
when CYP2C2 is normally a lauric acid hydroxylase, so this is not a far-fetched idea 
(Ramarao M, Kemper B, 1995).  
It will be useful to know the distribution of CYP52 sequences among the fungi.  If 
they are only found among the fungi related to Candida, but not in Neurospora, 
Penicillium, or S. pombe, then these might represent a novel development of the 
Candida-related yeasts.  
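
The distribution argument above can be laid out as a small presence/absence table.  
The sketch below is only an illustration in Python: the family lists are taken from the 
statements in this note, the function names are made up for the example, and "absent" 
means "not found so far", since the S. pombe and C. albicans genomes are incomplete.

```python
# P450 families reported so far in each species, as stated in this note.
# Absence means "not found yet", not "proven absent".
FAMILIES = {
    "S. cerevisiae": {"CYP51", "CYP56", "CYP61"},
    "S. pombe": {"CYP51", "CYP61"},
    "C. albicans": {"CYP51", "CYP52", "CYP61"},
}


def shared_families(table):
    """Families present in every species: candidates for the ancestral set."""
    return set.intersection(*table.values())


def lineage_specific(table):
    """For each species, families seen in that species and in no other."""
    result = {}
    for species, fams in table.items():
        others = set().union(*(f for s, f in table.items() if s != species))
        result[species] = fams - others
    return result


if __name__ == "__main__":
    print("shared:", sorted(shared_families(FAMILIES)))
    for species, fams in lineage_specific(FAMILIES).items():
        print(species, "only:", sorted(fams))
```

A family that is lineage-specific in this table is either a later gain in that lineage 
or a loss elsewhere; the table alone cannot tell these apart, which is why the CYP52 
distribution in Neurospora and Penicillium matters.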

The genome of Giardia is also being sequenced (see the Giardia genome project 
page).  The position of Giardia on the tree of life is in flux at the moment.  It 
was thought to be very ancient, one of the earliest eukaryotic branches, but this is 
being revisited in the light of more recent studies.  Several amitochondriate 
eukaryotes are being reexamined, since HSP sequences in these organisms appear to be 
derived from mitochondrial genes.  Therefore, the loss of mitochondria is now thought 
to have happened in these species.  In any event, the Giardia branch is still 
earlier than the plant-animal-fungi divergences.  Any P450s in Giardia will possibly 
be primitive features.  The only problem with this organism is its anaerobic habitat.  
P450s require molecular oxygen, so an anaerobe may have lost any primitive P450s it 
had.  After 12000 sequence reads, no P450 has been found in Giardia, but a solid match 
has been found to the NADPH cytochrome P450 reductase, 26% identical to the N-terminal 
half (e-9).

Query sequence is Giardia; Sbjct is rat reductase.

Query: 414 IPRGTIIYMTCTFFAGEHPPASKEFIAWLQTVNPSLRPFRDIRFAVFGMGSKNYTTFCAA 593
           I +  +++   T+  G+    +++F  WLQ  +  L     ++FAVFG+G+K Y  F A 
Sbjct: 128 IDKSLVVFCMATYGEGDPTDNAQDFYDWLQETDVDLT---GVKFAVFGLGNKTYEHFNAM 184

Query: 594 SKNADKSIEIFGGTRILDALHLDRDEFKSDDSAYIHWKK-------DLFKVLGLSEQPVI 752
            K  D+ +E  G  RI +    D D    +D  +I W++       + F V    E+  I
Sbjct: 185 GKYVDQRLEQLGAQRIFELGLGDDDGNLEED--FITWREQFWPAVCEFFGVEATGEESSI 242

Query: 753 STNKIIVTKNTSLPDKWVCDVS----------PLGYKRGIMSKV---KVLSDGKVDGVVH 893
              +++V ++  +   +  ++           P   K   ++ V   + L+ G    ++H
Sbjct: 243 RQYELVVHEDMDVAKVYTGEMGRLKSYENQKPPFDAKNPFLAAVTANRKLNQGTERHLMH 302

Query: 894 L-YEITCPCMKYEAGGHCAILPRN 962
           L  +I+   ++YE+G H A+ P N
Sbjct: 303 LELDISDSKIRYESGDHVAVYPAN 326

This sequence is from the Giardia genome project at the Marine Biological 
Laboratories, Woods Hole, MA. 
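
For readers who want to check identity figures by hand, the Python sketch below counts 
identical columns in one aligned block.  It uses only the first Query/Sbjct block shown 
above; the function name is made up for the example, and the convention of dividing by 
the aligned length including gap columns is an assumption about how the reported 
percentage was derived.  The 26% figure refers to the whole alignment, so a single block 
need not give the same value.

```python
def percent_identity(query, sbjct):
    """Return (identical columns, aligned columns, percent identity).

    '-' marks a gap. Gap columns count toward the aligned length
    but never count as identical.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have equal length")
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    columns = len(query)
    return identical, columns, 100.0 * identical / columns


# First block of the alignment above (Giardia query vs. rat reductase).
QUERY = "IPRGTIIYMTCTFFAGEHPPASKEFIAWLQTVNPSLRPFRDIRFAVFGMGSKNYTTFCAA"
SBJCT = "IDKSLVVFCMATYGEGDPTDNAQDFYDWLQETDVDLT---GVKFAVFGLGNKTYEHFNAM"

if __name__ == "__main__":
    ident, cols, pct = percent_identity(QUERY, SBJCT)
    print(f"{ident}/{cols} identical columns ({pct:.0f}%)")
```

The count of identical columns should agree with the number of letters on the middle 
line of the BLAST block, since that line prints a letter only where the two residues 
are the same.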

Dictyostelium discoideum is another primitive eukaryote whose genome is being 
sequenced.  In this case there are already 45 ESTs representing 18 different P450 
genes (Dictyostelium sequences).  This shows that eukaryotes more ancient than yeast 
can have numerous P450s.  One of these ESTs is the probable CYP51 of Dictyostelium 
(AU033519).  One of these 18 sequences is complete.  The other full-length sequences 
will be needed to compare them to the fungal sequences, but so far there is no strong 
candidate for a CYP61.  

>S. pombe CYP61 C22 sterol desaturase
MEPSDQIIRFNDKFTTISYLPWILIMQKGHIPGPRFKIPFMGSFLDSMKPTFEKYNAKWQ
TGPLSCVSVFHKFVVIASERDLARKILNSPSYVQPCVVDAGKKILKHTNWVFLDGRDHIE
YRKGLNGLFTTRALASYLPAQEAVYNKYFKEFLAHSKDDYAQYMIPFRDINVATSCRTFC
GYYISDDAIKHIADEYWKITAAMELVNFPIVLPFTKVWYGIQSRKVVMRYFMKAAAESRK
NMEAGNAPACMMEEWIHEMIETRKYKSENKEGAEKPSVLIREFSDEEISLTFLSFLFASQ
DATSSAMTWLFQLLADHPDVLQKVREEQLRIRKGDIDVPLSLDLMEKMTYTRAVVKECLR
LRPPVLMVPYRVKKAFPITPDYTVPKDAMVIPTLYGALHDSKVYPEPETFNPDRWAPNGL
AEQSPKNWMVFGNGPHVCLGQRYAVNHLIACIGKASIMLDWKHKRTPDSDTQMIFATTFP
QDMCYLKFSPFDASTVDWKNSKEAFSNEAVSAATVETESA

>Saccharomyces CYP61 C22 sterol desaturase
MSSVAENIIQHATHNSTLHQLAKDQPSVGVTTAFSILDTLKSMS
YLKIFATLICILLVWDQVAYQIKKGSIAGPKFKFWPIIGPFLESLDPKFEEYKAKWAS
GPLSCVSIFHKFVVIASTRDLARKILQSSKFVKPCVVDVAVKILRPCNWVFLDGKAHT
DYRKSLNGLFTKQALAQYLPSLEQIMDKYMDKFVRLSKENNYEPQVFFHEMREILCAL
SLNSFCGNYITEDQVRKIADDYYLVTAALELVNFPIIIPYTKTWYGKKTADMAMKIFE
NCAQMAKDHIAAGGKPVCVMDAWCKLMHDAKNSNDDDSRIYHREFTNKEISEAVFTFL
FASQDASSSLACWLFQIVADRPDVLAKIREEQLAVRNNDMSTELNLDLIEKMKYTNMV
IKETLRYRPPVLMVPYVVKKNFPVSPNYTAPKGAMLIPTLYPALHDPEVYENPDEFIP
ERWVEGSKASEAKKNWLVFGCGPHVCLGQTYVMITFAALLGKFALYTDFHHTVTPLSE
KIKVFATIFPKDDLLLTFKKRDPITGEVFE

>Candida albicans AL033396 71% to Saccharomyces CYP61 
MNSTEVDNLPFQQQLTSFVELAVAKATGSPITTLFTIIFLILSY
DQLSYQINKGSIAGPRFKFYPIIGPFLESLDPKFEEYKAKWDSGELSCVSIFHKFVVI
ASSRDLARKILSSPKYVKPCVVDVAIKILRPTNWVFLDGKQHTDYRRSLNGLFSSKAL
EIYIPVQEKYMDIYLERFCKYDGPREFFPEFRELLCALSLRTFCGDYITEDQIALVAD
NYYRVTAALELVNFPIIIPYTKTWYGKKIADDTMKIFENCAAMAKKHINENNGTPKCV
MDEWIHLMKEAREKHSEDPDSKLLVREFSNREISEAIFTFLFASQDASSSLACWLFQI
VADRPDIVAKIREEQLRVRNNNPDVRLSLDLINEMTYTNNVVKESLRYRPPVLMVPYV
VKKSFPVTESYTAPKGAMIIPTLYPALHDPEVYDEPDSFIPERWENASGDMYKRNWLV
FGTGPHVCLGKNYVLMLFTGMLGKFVMNSDMIHHKTDLSEEIKVFATIFPKDDLILEW
KKRDPLKSL

>S. pombe CYP51 lanosterol 14 alpha demethylase
MAFSLVSILLSIALAWYVGYIINQLTSRNSKRPPIVFHWIPFVGSAVAYGMDPYVFFREC
RAKYGDVFTFVCMGRKMTAFLGVQGNDFLFNGKLADLNAEEAYSHLTTPVFGKDVVYDIP
NHVFMEHKKFIKSGLGFSQFRSYVPLILNEMDAFLSTSPDFGPGKEGVADLLKTMPVMTI
YTASRTLQGAEVRKGFDAGFADLYHDLDQGFSPVNFVFPWLPLPRNRRRDRAHKIMQKTY
LKIIKDRRSSTENPGTDMIWTLMSCKYRDGRPLKEHEIAGMMIALLMAGQHTSAATIVWV
LALLGSKPEIIEMLWEEQKRVVGENLELKFDQYKDMPLLNYVIQETLRLHPPIHSHMRKV
KRDLPVPGSKIVIPANNYLLAAPGLTATEEEYFTHATDFDPKRWNDRVNEDENAEQIDYG
YGLVTKGAASPYLPFGAGRHRCIGEQFAYMHLSTIISKFVHDYTWTLIGKVPNVDYSSMV
ALPLGPVKIAWKRRN

>Saccharomyces CYP51 lanosterol 14 alpha demethylase
MSATKSIVGEALEYVNIGLSHFLALPLAQRISLIIIIPFIYNIV
WQLLYSLRKDRPPLVFYWIPWVGSAVVYGMKPYEFFEECQKKYGDIFSFVLLGRVMTV
YLGPKGHEFVFNAKLADVSAEAAYAHLTTPVFGKGVIYDCPNSRLMEQKKFVKGALTK
EAFKSYVPLIAEEVYKYFRDSKNFRLNERTTGTIDVMVTQPEMTIFTASRSLLGKEMR
AKLDTDFAYLYSDLDKGFTPINFVFPNLPLEHYRKRDHAQKAISGTYMSLIKERRKNN
DIQDRDLIDSLMKNSTYKDGVKMTDQEIANLLIGVLMGGQHTSAATSAWILLHLAERP
DVQQELYEEQMRVLDGGKKELTYDLLQEMPLLNQTIKETLRMHHPLHSLFRKVMKDMH
VPNTSYVIPAGYHVLVSPGYTHLRDEYFPNAHQFNIHRWNKDSASSYSVGEEVDYGFG
AISKGVSSPYLPFGGGRHRCIGEHFAYCQLGVLMSIFIRTLKWHYPEGKTVPPPDFTS
MVTLPTGPAKIIWEKRNPEQKI

>Candida albicans CYP51 X13296
MAIVETVIDGINYFLSLSVTQQISILLGVPFVYNLVWQYLYSLR
KDRAPLVFYWIPWFGSAASYGQQPYEFFESCRQKYGDVFSFMLLGKIMTVYLGPKGHE
FVFNAKLSDVSAEDAYKHLTTPVFGKGVIYDCPNSRLMEQKKFAKFALTTDSFKRYVP
KIREEILNYFVTDESFKLKEKTHGVANVMKTQPEITIFTASRSLFGDEMRRIFDRSFA
QLYSDLDKGFTPINFVFPNLPLPHYWRRDAAQKKISATYMKEIKSRRERGDIDPNRDL
IDSLLIHSTYKDGVKMTDQEIANLLIGILMGGQHTSASTSAWFLLHLGEKPHLQDVIY
QEVVELLKEKGGDLNDLTYEDLQKLPSVNNTIKETLRMHMPLHSIFRKVTNPLRIPET
NYIVPKGHYVLVSPGYAHTSERYFDNPEDFDPTRWDTAAAKANSVSFNSSDEVDYGFG
KVSKGVSSPYLPFGGGRHRCIGEQFAYVQLGTILTTFVYNLRWTIDGYKVPDPDYSSM
VVLPTEPAEIIWEKRETCMF

>Saccharomyces CYP56
MELLKLLCLILFLTLSYVAFAIIVPPLNFPKNIPTIPFYVVFLP
VIFPIDQTELYDLYIRESMEKYGAVKFFFGSRWNILVSRSEYLAQIFKDEDTFAKSGN
QKKIPYSALAAYTGDNVISAYGAVWRNYRNAVTNGLQHFDDAPIFKNAKILCTLIKNR
LLEGQTSIPMGPLSQRMALDNISQVALGFDFGALTHEKNAFHEHLIRIKKQIFHPFFL
TFPFLDVLPIPSRKKAFKDVVSFRELLVKRVQDELVNNYKFEQTTFAASDLIRAHNNE
IIDYKQLTDNIVIILVAGHENPQLLFNSSLYLLAKYSNEWQEKLRKEVNGITDPKGLA
DLPLLNAFLFEVVRMYPPLSTIINRCTTKTCKLGAEIVIPKGVYVGYNNFGTSHDPKT
WGTTADDFKPERWGSDIETIRKNWRMAKNRCAVTGFHGGRRACLGEKLALTEIRISLA
EMLKQFRWSLDPEWEEKLTPAGPLCPLNLKLKFNENIME

Percent identities from Do-it-yourself WU-BLAST (some partial alignments)

           51 Sc   51 Sp   51 Ca   61 Sc   61 Sp   61 Ca   56 Sc

CYP51 Sc            49%     65%     26%     25%     26%     23%

CYP51 Sp                    47%     24%     23%     24%     22%

CYP51 Ca                            29%     24%     24%     26%
 
CYP61 Sc                                    51%     69%     27%

CYP61 Sp                                            52%     28%

CYP61 Ca                                                    26%

CYP56 Sc


The C. albicans sequences are more closely related to S. cerevisiae than to S. pombe, 
as expected.  If a CYP56 homolog exists in C. albicans, it should be about 65-69% 
identical to the S. cerevisiae CYP56, by analogy with CYP51 (65%) and CYP61 (69%).
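
The comparisons drawn from the table can be checked mechanically.  The Python sketch 
below transcribes the percent identities from the table above (upper triangle only) and 
compares within-family and between-family values.  The labels and helper names are 
chosen for the example only, and some of the underlying values come from partial 
alignments, so this is a rough check rather than a phylogenetic analysis.

```python
# Percent identities transcribed from the WU-BLAST table above.
IDENTITY = {
    ("CYP51 Sc", "CYP51 Sp"): 49, ("CYP51 Sc", "CYP51 Ca"): 65,
    ("CYP51 Sc", "CYP61 Sc"): 26, ("CYP51 Sc", "CYP61 Sp"): 25,
    ("CYP51 Sc", "CYP61 Ca"): 26, ("CYP51 Sc", "CYP56 Sc"): 23,
    ("CYP51 Sp", "CYP51 Ca"): 47, ("CYP51 Sp", "CYP61 Sc"): 24,
    ("CYP51 Sp", "CYP61 Sp"): 23, ("CYP51 Sp", "CYP61 Ca"): 24,
    ("CYP51 Sp", "CYP56 Sc"): 22,
    ("CYP51 Ca", "CYP61 Sc"): 29, ("CYP51 Ca", "CYP61 Sp"): 24,
    ("CYP51 Ca", "CYP61 Ca"): 24, ("CYP51 Ca", "CYP56 Sc"): 26,
    ("CYP61 Sc", "CYP61 Sp"): 51, ("CYP61 Sc", "CYP61 Ca"): 69,
    ("CYP61 Sc", "CYP56 Sc"): 27,
    ("CYP61 Sp", "CYP61 Ca"): 52, ("CYP61 Sp", "CYP56 Sc"): 28,
    ("CYP61 Ca", "CYP56 Sc"): 26,
}


def identity(a, b):
    """Look up a pair in either order (the table is symmetric)."""
    return IDENTITY[(a, b)] if (a, b) in IDENTITY else IDENTITY[(b, a)]


def family(label):
    """'CYP51 Sc' -> 'CYP51'."""
    return label.split()[0]


# Split all pairs into same-family and different-family comparisons.
within = [v for (a, b), v in IDENTITY.items() if family(a) == family(b)]
between = [v for (a, b), v in IDENTITY.items() if family(a) != family(b)]

if __name__ == "__main__":
    for fam in ("CYP51", "CYP61"):
        to_sc = identity(f"{fam} Ca", f"{fam} Sc")
        to_sp = identity(f"{fam} Ca", f"{fam} Sp")
        print(f"{fam}: Ca vs Sc {to_sc}%, Ca vs Sp {to_sp}%")
    print("lowest within-family identity:", min(within))
    print("highest between-family identity:", max(between))
```

In both families the C. albicans sequence scores higher against S. cerevisiae than 
against S. pombe, and every within-family value is well above every between-family 
value.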