Leishmania major and Leishmania infantum P450s


The Leishmania major genome project is not finished. These four genes are a sample found at GenBank and Sanger.

>CYP51E1 LM7_11c.1.Contig2  L. major Friedlin chromosome 7_11c GenEMBL AZ048391
76% to T. brucei CYP51E1
123423 MIGEFFLLLTAGLALYGWYFCKSFNTTRPTDPPVVHGAMPFVGHIIQFGKDPLDFMLNAK 123244
123243 KKYGGVFTMNICGNRVTVVGDVHQHNKFFTPRNEILSPREVYSFMVPVFGEGVAYAAPY 123067
123066 PRMREQLNFLAEELTVAKFQNFAPSIQHEVRKFMKANWNKDEGEINILDDCSAMIINTAC 122887
122886 QCLFGEDLRKRLDARQFAQLLAKMESCLIPAAVFLPWILKLPLPQSYRCRDARAELQDI 122710
122709 LSEIIIAREKEEAQKDSNTSDLLASLLGAVYRDGTRMSQHEVCGMIVAAMFAGQHTSTIT 122530
122529 TTWSLLHLMDPRNKRHLAKLHQEIDEFPAQLNYDNVMEEMPFAEQCARESIRRDPPLIML 122350
122349 MRKVLKPVQVGKCVVPEGDIIACSPLLSHQDEEAFPNPREWNPERNMKLVDGAFCGFGAG 122170
122169 VHKCIGEKFGLLQVKTVLATVLRDYDFELLGPLPEPNYHTMVVGPTASQCRVKYIRKKA 121993
121992 AA 121987

>CYP51E1 Contig3290 Leishmania infantum 
8620 MIGELLLLLAAGLALYGWYFCKSFNTTRPTDPPVVHGTTPFVGHIIQFGKDPLGFMLKAK 8799
8800 KKYGGIFTMNICGNRITVVGDVHQHSKFFTPRNEILSPREVYSFMVPVFGEGVAYAAPYP 8979
8980 RMREQLNFLAEELTVAKFQNFAPSIQHEVRKFMKANWNKDEGEINILDDCSAMIINTACQ 9159
9160 CLFGEDLRKRLDARQFAQLLAKMESCLIPAAVFLPWILKLPLPQSYRCRDARAELQDILS 9339
9340 EIIIAREKEEAQKDTNTSDLLAGLLGAVYRDGTRMSQHEVCGMIVAAMFAGQHTSTITTT 9519
9520 WSLLHLMDPRNKRHLAKLHQEIDEFPAQLNYDNVMEEMPFAEQCARESIRRDPPLVMLMR 9699
9700 KVLKPVQVGKYVVPEGDIIACSPLLSHQDEEAFPNPREWNPERNMKLVDGAFCGFGAGVH 9879
9880 KCIGEKFGLLQVKTVLATVLRDYDFELLGPLPEPNYHTMVVGPTASQCRVKYIKKKKAA 10056
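
The numbers flanking each row appear to be the contig positions of the first and last base of that stretch, so a row of n residues should span 3n bases (descending numbers presumably mean the reverse strand). A minimal Python sketch of that consistency check follows. The helper name `check_row` is hypothetical, and the sketch assumes a row laid out as `start SEQUENCE end`, which does not hold for rows in this file that omit a coordinate or carry a trailing note.

```python
def check_row(row):
    """Return (codons, bases, consistent) for a 'start SEQUENCE end' row."""
    start, seq, end = row.split()[:3]
    # A trailing '*' marks a stop codon; it still occupies three bases.
    codons = len(seq)
    # Coordinates are inclusive and may run in either direction.
    bases = abs(int(end) - int(start)) + 1
    return codons, bases, bases == 3 * codons


# First row of the L. infantum CYP51E1 entry above.
row = "8620 MIGELLLLLAAGLALYGWYFCKSFNTTRPTDPPVVHGTTPFVGHIIQFGKDPLGFMLKAK 8799"
print(check_row(row))
```

A row that fails the check usually points to a frameshift, a gap in the translation, or a mistyped coordinate, and is worth inspecting by hand.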

&&&&&&&&&&&&&&&&&&&

>AC125412 AC125412.2 Leishmania major chromosome 27 clone LB01811 strain
59% to the Trypanosoma brucei seq Tb03.27F10.920
      MAANVLQSYIVAALHSAATQLPSSVQPYAMMLTREDMVSTTLATAIATA
18369 VILYTVITVVLPVLRMDFYLSKLPTIKNSIPFLGHALLLAGPSPWSKMSNWSLYPEKNLP 18190
18189 QKKKSVDGPQTSRLVTYNVAGMRVIYINEPRLLRRVLLTHQRNYRKALAAAYKHFMCLLG 18010
18009 TGLVTSEDEQWKKGRLLLSHAMRIDILDSVPEMAMKAVDRILLKLDAVDAKNPSVDLNEE 17830
17829 YRHMTLQVISESALSLSAEESDRIFPALYLPIVHECNKRVWAPWRAYMPFLHGSRMRNRC 17650
17649 LSELNKVLRDIICRRWEQRNDSNYTAKPDILALCISQIDRIDEKMIVGLIDDVKTILLAG 17470
17469 HETSAALLTFATYEVLRHPEIRQKILEEATRLFDPARCTCTVQTRYGPRGVPALNDVRS 17293
17292 LVWTPAVLRETLRRHSVVPLVMRYAAKDDVWPAADTGLDADVRIPAGCTIAVGIEGVHNN 17113
17112 PDVWNKPEVFDPTRFIDAEIANDTNYLNQSTKDVKFAKKIDPYAFIPFINGPRNCLGQHL 16933
16932 SMIETQVALAYMVLSYDLTIYRDPSYKGDVAAYEDAVGRHHDFIIPQVPHDGLKVWGT 16759
16758 PNKLFM* 16738

>Contig3226 Leishmania infantum ortholog to AC125412 20 aa diffs 
6459 MAANALHSHIVAALHNAATKLPSSVQPYAMLLTREDMVSTTLATAIATAVILYTVITVVL 6280
6279 PVLRMDFYLSKLPTIKNGIPFLGHALLLAGPSPWSKMSNWSLYPEKNLPQKKKSADGPQT 6100
6099 SRLVTYNVAGMRVIYINEPRLLRRVLLTHQRNYRKALAAAYKHFMCLLGTGLVTSEDEQW 5920
5919 KKGRLLLSHAMRIDILDSVPEMAMKAVDRILLKLDAVDAKNPSVDLNEEYRHMTLQVISE 5740
5739 SALSLSAEESDRIFPALYLPIVHECNKRVWAPWRAYMPFLQGSRVRNHCLSELNKVLRNI 5560
5559 ICRRWEQRNDSNCTGKPDILALCISQIDRMDEKMIVGLIDDVKTILLAGHETSAALLTFA 5380
5379 TYEVLRHPEIRQKILEEATRLFDPARCTRTVQTRYGPRGVPAVNDVRDLVWTPAVLRETL 5200
5199 RRHSVVPLVMRYAAKDDVWPAADTGLDADVRIPAGCTIAVGIEGVHNNPDVWNKPEVFDP 5020
5019 TRFIDAEIANDTNYLNQSTKDVKFAKKIDPYAFIPFINGPRNCLGQHLSMIETQVALAYM 4840
4839 VLNYDLTIYRDPSYKGDAAAYEDAVGRHHDFIIPQVPHDGLKVWGTPNKLFM 4684
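
Difference counts such as the "20 aa diffs" in the header above come from a position-by-position comparison of the two orthologs. The sketch below shows one way to do that in Python, using only the first 49 residues of each sequence as listed above. It assumes the two stretches are already aligned without gaps; the function name `count_diffs` is hypothetical.

```python
def count_diffs(a, b):
    """Count mismatched positions between two gap-free, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# First 49 residues of the L. major and L. infantum sequences listed above.
l_major = "MAANVLQSYIVAALHSAATQLPSSVQPYAMMLTREDMVSTTLATAIATA"
l_infantum = "MAANALHSHIVAALHNAATKLPSSVQPYAMLLTREDMVSTTLATAIATA"

diffs = count_diffs(l_major, l_infantum)
identity = 100.0 * (len(l_major) - diffs) / len(l_major)
print(diffs, round(identity, 1))
```

For full-length sequences that differ in length or contain indels, a proper pairwise alignment is needed before counting.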

&&&&&&&&&&&&&&&&&&&&&

>LM34.1.Contig35  L. major Friedlin chromosome 34 unfinished whole chromosome
145491 MTPTVSPVQAAAAIATVGFLAYATTRMLQALYSAPPNIPEPAIP
       PNSEDGLVWSVIKRVLYRHFYVVRKGDPLKTLQRWCVEFDYKPFVMKIFFRPHVVLSSPV
       DIEHVLLRADTKFYKDTGYDIVRIVVGRVGLLAVGNKAQHAVHRRILMPIFRSQNIRGVA
       NEIIRMHALRMMGGLFHLIQCGGEQDAVVNLSDHVFRMALSAIGEAAFRASRSESLRVRG
       HFDVMMKMSRVNYFC
146208 PYLKSSAQRNARNTLKEMSVELLDKNMQINQIGTRRCVMDALIDELYVHFSMDDV 146372
146373 LDHVVTFLFAGHDTVSHTLEFLFALLGTNTEVQERLYEALEDLMPSICTCPTVQELME 146546
146547 CDYLVAIVKEVLRMYPAAPIIYRDAAEDVYLPGSAVVIPKGMTVVITLSALQRNTHVYGD 146726
146727 DVDVFRPERWLGEEGEALRKRCGRCGYIPFSCGKRSCIGQELGYLELLVVTALMGRHLK 146903
146904 MELVGKFPEARYNITIAVSHSVSMRITARDGIPVSEVYERIANVLDLNDEDAGSARNVTRGV* 147092

>Contig2424 Leishmania infantum ortholog to LM34.1.Contig35, 13 aa diffs
1504 MAPTVSPVQAAAAIATVGYLAYATTRMLQALYSAPPNMPEPAIPPNSEDGLVWSVVKRVL 1325
1324 YRHFYVVRKGDPLKTLQRWCAEFDYKPFVMKIFFRPHVVLSSPVDIEHVLLRADTKFYKD 1145
1144 TGYDIVRIVVGRVGLLAVGNKAQHAIHRRILMPIFRSQNIRGVANEIIRMHALRMMGGLF 965
 964 DLIQCGGEQDAVVNLSDHVFRMALSAIGEAAFRASRGESLRVRGHFDVMMRMSRVNYFCP 785
 784 YLKSSAQRNARNTLKEMCVELLDKNMQINQIGTRRCVMDALIDELYVHFSMDDVLDHVVT 605
 604 FLFAGHDTVSHTLEFLFALLGTNTEVQERLYEALEDLMPSICTCPTVQELMECDYLVAIV 425
 424 KEVLRMYPAAPIIYRDAAEDVYLPGPAVVIPKGMTVVITLSALQRNTHVYGDDVDVFRPE 245
 244 RWLGEEGEALRKRCGRCGYIPFSCGKRSCIGQELGYLELLVVTALLGRHLKMELVGEFPE 65
  64 ARYNITIAVSHSVSMRITARD 2 (missing 32 aa)

&&&&&&&&&&&&&&&&&&&&&&

>LM30.1.Contig3  L. major Friedlin chromosome 30 unfinished whole chromosome
Note: this sequence matches the third seq fragment of T. cruzi at 84%.
this seq is 51% identical to Chlamydomonas scaffold 690
52% to Choanoflagellate P450
78627 MAAFSRLLGVQLPYVMEVVIFLALAYATAYVLTNIMFASIHKFKMAGPLTAIPVLGGVVGMIRDPYSFWERQR 78845
78846 QYEPHGYSWMAILTQFVVFVTRADLCHKIFATNGEDTLTLQLHPNGKLILGDNNIAFQSG 79025
79026 PGHKALRSSFMNLFTTKALSLYLPIQERLIHEHLSRWVRDYPWGGKPEEMRTHIRELN 79199
79200 CETSQTVFLGEHLHNHTEFTHNYNIITRGFLSAPLYLPGTPLYKAVQARKLTM 79358
79359 VELQAAVRRSKARMAKPDAEPHCLLDFWTATVLEKIKEAEEEGGDAPAYSSDHAMADTIL 79538
79539 DFLFASQDASTASLTMITATMADHPEILERVRKEQARLRPNNEPLTYDLVQEMTFT 79706
79707 RQCVMEQLRLFPPAPMVPMKVHGDFQLDEKTVVRKGSMIIPSLVACCREGFTNPDTYD 79880
79881 PDRMGPERQEDRKFAKQFIPFGVGPHRCVGYNYAINHLTVYLALIAHHVEWQRTRTPD 80054
80055 SDKILYLPTLYPHDCLQTWRYREGIEPTAEKV* 80153

>Contig2309 Leishmania infantum ortholog to LM30 above (10 aa diffs)
430 MAAFSRLLGVQLPYVVEVVIFLALAYATAYVLTSIMFARIHKFKMAGPLTAIPVLGGVVT 251
250 MIRDPYSFWERQRQYEPHGYSWMAILTQFVVFVTRADLCHKIFTTNGEDTLTLQLHPNGK 71
 70 LILGDNNIAFQSGPGHKALRSSF 2 (amino acid 143)
141 aa gap
>Contig3659 starting at amino acid 285
  3 EEGGDAPAYSSDHAMADTILDFLFASQDASTASLTMITATMADHPEILERVRKEQARLRP 182
183 NNEPLTYELVQEMTFTRQCVMEQLRLFPPAPMVPMKVHSDFQLDEKTVVPKGSMIIPSLV 362
363 ACCREGFTNPDTYDPDRMGPERQEDRKFAKQFIPFGVGPHRCVGYNYAINHLTVYLALIA 542
543 HHVRWQRTRTPDSDKILYLPTLYPHDCLQTWRYREGMEPKAEKV 674

the best matches to this seq in Fungi are CYP61 sequences
CYP61A3   S. pombe Z98974 join(10345..10419,10514..12001)...   598  7.3e-62   1
CYP61A2   C. albicans AL033396 comp(39932..41485) gene="Ca...  509  2.0e-52   1
Yeast     CYP61A1 Z49211 Z71257                                466  7.1e-48   1
CYP51     yeast M15663 U10555                                  169  3.1e-17   2

in Dictyostelium
CYP524A1  Seq 91 complete seq 25% to 10, 42 468 aa             774  1.1e-80   1
CYP508A3  seq 21+67+72+82+87 57% to 508A1 458aa                194  4.9e-16   1

in Chlamydomonas
21. Scaffold 690 10 EXONS 43% to 710A1 exon 1 predicted b...  1178  1.2e-123  1
2.  scaf 846 BI528139 33% to 707A2 possible 85 clan membe...   211  4.5e-18   1

in rice CYP710s
AP002093d       $F CYP710A8 CDS complement(143977..145503...   844  3.4e-87   1
AP002092c       $F CYP710A7 CDS complement(80797..82311) ...   837  1.9e-86   1
AP002092b       $F CYP710A6 CDS complement(60811..62349) ...   827  2.1e-85   1
AP002092a       $F CYP710A5 CDS complement(54602..56137) ...   789  2.3e-81   1
AP004162.1      $F CYP707A6 chr 8 clone OJ1320_D12, = AY0...   293  2.4e-26   2

in Arabidopsis
CYP710A3                                                       847  1.0e-87   1
CYP710A4                                                       847  1.0e-87   1
CYP710A1                                                       830  6.4e-86   1
CYP710A2                                                       801  7.6e-83   1
CYP88A4                                                        280  5.2e-25   1
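
The hit lines above each end in three numeric columns. They look like the score, probability and N columns of a BLAST summary, but that reading is an assumption, since the file does not label them. A minimal Python sketch for pulling those columns out of a line is given below; the names `HIT` and `parse_hit` are hypothetical.

```python
import re

# Description, then integer score, a value in scientific notation, and a count.
HIT = re.compile(
    r"^(?P<desc>.*?)\s+(?P<score>\d+)\s+"
    r"(?P<p>\d+(?:\.\d+)?e[-+]?\d+)\s+(?P<n>\d+)\s*$"
)


def parse_hit(line):
    """Return (description, score, p_value, n), or None if the line is not a hit."""
    m = HIT.match(line)
    if m is None:
        return None
    return m["desc"].strip(), int(m["score"]), float(m["p"]), int(m["n"])


line = "CYP710A3                                                       847  1.0e-87   1"
print(parse_hit(line))
```

Lines such as "in Arabidopsis" carry no numeric columns and return `None`, so they can serve as group labels when reading the list.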