ESTIMATE 90 BACTERIAL P450 GENES

This is a collection of the newest bacterial P450s; many have not been
named yet.

AE000101 Rhizobium sp. NGR234 plasmid pNGR234a same seq as Z68203 CYP127

MSDLRRKRVKTNPIPDHVPPALVRHFSLFTSPGMAPTPNGDPHA
AVACVHDDGPPIFYSPSNTRDGRGTWVITRARDQRRVLEDTETFSSHRSIFASALGEH
WPVIPLELDPPAHGVFRALLNPLFSSRRVLALEPTIHARAGALIDCIAKEKTSCDVMK
DFALPFTFSVFLSFLGLSQRRSEVLVGWVSDLLHGNAEKRRAAARSVVAFIDEMAAMR
RKSPAVDFMTFVVQAKIEGRSLTEEEVRGIGVLFLVAGLDTVAAAIGFDMAYLARNPK
HQELLRNEPARLGLAAEELLRAYSTVQIIRVATKDIEFEGVPIREGDYVSCPAMIANR
DPSEFKCPNTIDLARQDNQHTAFGYGPHLCHGAHLARREIVIGLREWLARIPAFRIKE
GTAPITHGGHVFGISNIILTWA

AF071144  Streptomyces glaucescens cytochrome P450

LLIAGHETTTSMIALSTLLLLDRPELPAELRNDPDLMPAAVDEL
LRVLSVADSIPLRVAAEDIELSGRTVPADDGVIALLAGANHDPEQFDDPERVDFHRTD
NHHVAFGYGMHQCLGQNL

AF071146  Micromonospora inyoensis cytochrome P450

LLIAGHETTSHMISLGVTALLEHPDQLAALQNDLTLLPEAVEEL
LRYLSIADYVPSRVALEDVVIGGTVIRAGEGVVPLLAAADWDPKVFDNPGTLDIHRGN
RRHVAFGYGVHQCLGQNL

AF071147 Streptoalloteichus hindustanus cytochrome

LLIAGHETTANMLALGAFALLEHPEQLAELRANPDLMPGAVEEL
MRYLSIVHIGPVRTAVADVEIEGQLIRAGESVTVSVPAANWDPAKFPEPERLDLTRRT
SGHLAFGHGVHQCLRQNL

AF071148  Amycolata autotrophica cytochrome P450

LLIAGHETTSHMISLGVTALLERPDQLAALQNDLTLLPEAVEEL
LRYLSIADYVPSRVALEDVVIGGTVIRAGEGVVPLLAAADWDPKVFDNPGTLDIHRGN
RRHVAFGYGVHQCLGQNL
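
Percent-identity notes in this list (for example "77% identical") compare aligned residues position by position. As a minimal sketch only, assuming two fragments of equal length that need no gaps, the count could be reproduced as below. The function name count_identities is illustrative, and the two strings are the first lines of the AF071146 and AF071148 fragments above; comparisons between full-length sequences in this collection would need a proper alignment program.

```python
def count_identities(seq_a, seq_b):
    """Return (matches, length) for two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    # Compare the two sequences column by column.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches, len(seq_a)


# First line of each fragment, copied from the entries above.
af071146 = "LLIAGHETTSHMISLGVTALLEHPDQLAALQNDLTLLPEAVEEL"
af071148 = "LLIAGHETTSHMISLGVTALLERPDQLAALQNDLTLLPEAVEEL"

matches, length = count_identities(af071146, af071148)
print(f"{matches}/{length} identical ({100.0 * matches / length:.1f}%)")
```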

U17130 Rhodococcus erythropolis ORF6' gene CYP116

/gene="thcB"
/standard_name="cytochrome P-450"
/note="first member of the new cytochrome P-450 family
CYP116"
/function="degradation of thiocarbamate herbicides"
/product="ThcB"
MTVDHAPEGVKSPTGCPVSGMAADFDPFRGAYQVDPSSSLRQAR
KDEPVFFSPLLDYWVVTRYEDIKQIFKTPSVFSPSITVDQITPISDEALQILGSYQFA
AGRMLVNEDEPIHTERRRLLMQPFEADNVATLEPKIREVVNTYLDRVIKDGRADLIGD
LLYEVPCIVALIFLGVPDEDIETCRQYGMQQTLFTWGHPTGDEQTRVATGMGKFWEFA
GGLVDKLKADPNAKGWIPHAIEMQRQHPDLFDDNYLQNIMFGGVFAAHETTTNATGNA
FRTLLENRSSWDEICADPTLIPKAIEECLRYSGSVVAWRRKAVVDTTVGEVDIPAGGR
LLIVMASANRDDSMFPEPDDFDIHRGNAQRHLTFGIGSHTCLGATLARLEMKVFLEEV
SRRLPHMSLVAGQEFSYLPNTSFRGPEHVLVEWDPQQNPVPADRP

AB000735 Nocardioides sp. 32% with B.s. 109B1

MSSVLAPESDVDLFAGEVLADPYPIYAALRSLGPVVWLPRHGFY
VVVRYAEAREVLNDHERFVSGRGVGFNQQFNDVRSGSIIASDPPRHDILRSVLNERLG
PRALKDTEVMIRSRASDLVAEMAQRRSFDAVKDFAEVFPVQVVGELIGLPEGSRGRLL
QWANGAFNAFGPSGERTAAGLEAIAEQFDYIRTVANREQLMPDSMGAAVYQAADEGII
SEDDCLPLLSAYLTAGMDTTVNALSAMLLLLSTDPDQWQELRSSPSLAPSVVNEVLRI
EAPAQLFSRVTATRVDLAGTRIEAGERVAVIYASANRDEKKYPDPDRFDIRRNPAGHL
AFGSGLHVCAGQFLAKTELRAVLDALIEQIETMTVGEPVRKINNVLRGLSSLPATFVP
VQTSR

AF034769 Agrobacterium tumefaciens plasmid pTiC58

/gene="virH1" 77% identical to pinF1
MITSSISGTDQQFQNATQPKELDPDAVPVSRLDSEGHEIFAEWR
PKRPFLRREDGVFLVLRADDIFLLGTDPRTRQIETELMLNRGVTRGAVFDLIRYSMLF
SNGEVHVKRRSAFAKTFAFRMIDALRPEITKLTEHLWDDVPRVDDFDFAEMYASKLPA
LTIASVLGLPFGDAPFFTRLVYNVSRCLSPSWGEDDFPEIEASAVELQDYVRAVVADR
SRRISDDFLSCYLKAVREEGTLSPIEEIMQLVFLILAGSDATRNAMVMLPTLLLQNPV
VWSSLCHDQSGVAAAVEEGLRFEPSVGSFPRLALEDIDLDGYVLPKGSFLALSIMSGL
RDERHYEHPQLFDIKRKQMRRHLGFGAGVHRCLGEALARIELQEGLRTLLRRAPSLRV
TGDWPRMIGHGGARRATGMTVNLGVDR

U84350 Amycolatopsis orientalis hypothetical hydrox

/product="hypothetical hydroxylase d" 91% IDENTICAL TO AJ223998 GENE 1
ISTYDPPEHTRLRKMLTPEFTVRRIRRMEPAIQSLVDDRLDMLE
AEGSPADLQGLFADPVGAHALCELLGIPRDDQREFVRRIRRNADLSRGLKARAADSAA
FNRYLDNLIARQRADPDDGLLGTIVREHGGNVTDEELRGLCTALILGGVETVAGMIGF
GVLALLQHPGQIPLLFEGPEKADRVVNELVRYLSPVQAPNPRLAIKDVVVDGQLIKAG
DYVLCSVLMANRDEALTPDPNVFDANRTAVSDVGFGHGIHYCVGAALARSMLRTAYQT
LWRRFPGLRLAVPVEEVKYRSAFVDCPDQVPVTW

AJ223998 Amycolatopsis orientalis cosmid PCZA361 2 genes
gene 1 N-TERM IS FROM END OF AJ223999

MAHGIDQHGIDQHGIDQHGIDQHGIDQHGIDQHGIDQHSIDQHG
IDQHGMEQVAPLLQEPANFQMRTGCDPHEENFDLRAHGPLVRLVGDSSTQLGRDYVWQ
AHGYDVVRKILGDHENFTTRPQFTHAKSDAHVEAQFVGQISTYDPPEHTRLRKMLT 91% IDENTICAL TO U84350
PEFTVRRIRRMEPAIQGLIDDRLDMVEAEGPPADLQGLFADPVGALALCELLGIPRDD
QREFVRRIRRNTDLSRGLKARAADSAAFNRYLDNLISRQRKDPDEGFLGMIVREHGDN
VTDEELKGLCTALILGGVETVAGMIGFGVLALLENPDQIQLLFAGPEKADRVVNELVR
YLSPVQAPNPRLAIKDVVIDGQLIKAGDYVLCSVLMANRDEALTPNPNVLDANRAAVS
DVGFGHGIHYCVGAALARSMLRMAYQTLWQRFPGLRLAVPIAEVKYRSAFVDCPDQVP
VTW
gene 2 34% WITH B.S. Z99119 (107H)
MQTTTAVGDLGNPDLYTTLDRHARWRELAARDAMVWSEPGSSPT
GFWSVFSHRACAAVLAPSAPFTSEYGMMIGFDRDHPDKSGGQMMVVSEQEQHRKLRRL
VGPLLSRAAARKLSERVRTEVSGVLDQVLDGGVCDVATAIGPRIPAAVVCEILGVPAE
DEDMLIELTNHAFGGEDELFDGMTPRQAHTEILVYFDELITARRERPADDLVSTLVTD
DELTIDDVLLNCDNVLIGGNETTRHAITGAVHALATVPGLLTGLQDGSADVDTVVEEV
LRWTSPAMHVLRVSTDDVTINGQDLPAGTPVVAWLPAANRDPAEFDDPDTFLPGRKPN
RHITFGHGMHHCLGSALARIELSVVLRVLAERVSRVELVKEPAWLRAIVVQGYAELSA
RFTGR

AJ223999 Amycolatopsis orientalis cosmid PCZA363

31325..32500 GENE 1 42% WITH AJ223999 GENE 2 AND U84350
/note="similar to P450 related oxidase/hydroxylase"
MFAENNAVRGTEIHRREQFDPGPELRSLMAEGPMSILEADDPAE
GRTGWLATGYDEIRQVLGSDKFSAKLLYGGTVAGRIWPGFLNQYDPPEHTRLRRMLTS
AFTVRRMQGFRPRIELIVEATLDDIEATGGPVDFVPRFAWPIATTVTCDFIGIPRDDQ
ADLSRALLASRSERTGKRRVAAGNKFWTYMSQVAAQARRDPGDNMFGAVVREHGDAIT
DAELLGVAAFIIGAAGDQVARFLAAGAWLIAEHPEQFAVLRDKPDTIPDWLNEVARYL
TSDEKTTPRIALEDVYIGDQLVKAGDAVTCSLLAANRRNFPAPEDQFDITRERPAHVT
FGHGIHHCLGRPLAEMVFRTAITALTRRFPTLRLAEPGREIKLGPPPFDVEVLLLDW

32520..33740 GENE 2 42% WITH AJ223999 GENE 1
/note="similar to P450 related oxidase/hydroxylase" 53% WITH U84350
MVAQELSQSGDDDPRPLHIRRQDLDPADELLAAGALTRVTVGSG
ADAETSWMATTHGAVRQVMGDHKSFSTRRRWHQRDEIGGTGIFRPRELVGNLMDYDPP
EHTRLRQKLTPGFTLRKMQRMQPYIEQIVADRLDAMEQAGSPADLIEFVADEVPGAVL
CELIGVPRDDRAMFMQLCHGHLDASRSQKRRAAAGEAFSRYLLAMIARERKDPGEGLI
GAVIAEYGDEATDEELRGFCVQVMLAGDDNISGMIGLGVLALLRHPEQIAAFQGDEQS
AQRAVDELIRYLTVPYAPTPRIAMQDVIVASQMIKKGESVICSLPAANRDPALVPDPN
RLDVTREPVPHVAFGHGVHHCLGAALSRLELRTVYTALWQRFPTLRLADPAKETNFRL
TTPAYGVTSLLVAW

33791..>34244 GENE 3
/note="similar to P450 related oxidase/hydroxylase"
THIS GENE IS IDENTICAL TO AJ223998 GENE 1 N-terminal
MAHGIDQHGIDQHGIDQHGIDQHGIDQHGIDQHGIDQHSIDQHG
IDQHGMEQVAPLLQEPANFQMRTGCDPHEENFDLRAHGPLVRLVGDSSTQLGRDYVWQ
AHGYDVVRKILGDHENFTTRPQFTHAKSDAHVEAQFVGQISTYDPPEHT

AF040571  Amycolatopsis mediterranei 2 genes 
65% WITH AF040570 GENE 3

PVPLSIVSRDRVSIQRLACKVFGETRSGPPGRASFVLIAPRRPQ
TVHKGHNSEHCQVGGCSTGRPLPTVRNRACRTRAGRPRRHQEGMAMTATAKPSAKPVD
LFSPEVVADPFGWYARLREEPLPHTGTLNLGTMMGGPDMWLATRYEDVLTVLTDPRFL
TNPPADSPLADIRDGVFKAMNFPEDLIPWMANKLNTADGEDHTRLRKLVSYALTARRV
NALRPRVEKITEDLFDRMAEQGKDGSPVDIVEEFCFPLPVTVICELVGIDEPDRAAWH
AWGNAMATMDGPKIPPALRECIALSRELMARRRVEPKDDLVTALVQAQADDPGRVSDD
EIVGILFSLVTAGHQTTTYLIGNSVLFLLEQPEQLARLKAEPALWAQAVRELQRLGPL
QFTQARFPSEDVELGGVTIPRGTPVAPLLLAANTDPRKFPEPDKLVVDRLSVANEMHL
SFGKGIHRCLGQHLAYQEAEVALHGLFTRFPDLALAVPRDEIPWILRPAFTRTKTLPL
KLA

48% WITH M31939 105C1 49% WITH 105B1
MTTKVTENAPSTESLRSPLPPEFVRREDPFHVPPALVAVSERGP
VARATLAAGDPFWLVSGYEEARAVLSDPRFSSDRFQYHPRFKELSPEFAFPAPSLMIC
ELLGVRYEDRAEFQQRASALLQMNAPVAEAVKNADALRAFMQALVTDKRANPAGDIIS
GLIHHAGADPALTDDELINIANLLLIAGYDTTASMLGLGIFVLLQRPAQLATLRDDPS
RIADAVEGLLRYLSVVNPGIFRFAKEDLEFAGEHIPAGSTVVVSVVATNRDARHWPDP
DLDLTRPRGPHLAFGHGVHQCLGQQLARMEMQAGYAELLRRLPNVRLAVPPEEVPLRN
DMLTYGVHSLPIAWDAP

AJ223012 Amycolatopsis mediterranei genes encoding rifamycin polyketide
           synthases, ORFs 1 to 5 IDENTICAL TO AF040570 GENE 1

Sbjct:  156 DPPAVFDSLREERPLAKMVYPDGHVGWIVSSYELVREVLSDLR-FSHSCEVGHFPVTHQG 332
Sbjct:  333 QVIPTHPLIPGMFIHMDPPEHTRYRKLLTGEFTVRRASRLIPRAEAVAAEQIEVMRAKGA 512
Sbjct:  513 PADVVMDFAKPLVLRMLGELVGLPYEERDRYVPAVTLLHDAEADPAEAAAAYEVAGKFFD 692
Sbjct:  693 EVIERRRQRPQDDLISSLVT-----EDLTQEELRNIVTLLLFAGYETTEGALATGVFALL 857
Sbjct:  858 HHTDQLAALRAEPEKLDAAIEELLR-YLTVNQYHTYRTALEDVKLEGELIKKGDTVTVSL 1034
Sbjct: 1035 PAANRDPAKFGCPAELDIERDTSGHVAFGFGIHQCLGQNLARIELRAGFTALLRAFPELR 1214
Sbjct: 1215 LAVPADEVPLRLKGSVFSVKKLPVSW 1292

AF040570  Amycolatopsis mediterranei rifamycin biosy 3 genes

/note="ORF0; similar to Streptomyces Sp. ChoP" GENE 1 
IDENTICAL TO AJ223012 41% TO 105C1
MTDAISFEVPWDRTDKFDPPAVFDSLREERPLAKMVYPDGHVGW
IVSSYELVREVLSDLRFSHSCEVGHFPVTHQGQVIPTHPLIPGMFIHMDPPEHTRYRK
LLTGEFTVRRASRLIPRAEAVAAEQIEVMRAKGAPADVVMDFAKPLVLRMLGELVGLP
YEERDRYVPAVTLLHDAEADPAEAAAAYEVAGKFFDEVIERRRQRPQDDLISSLVTED
LTQEELRNIVTLLLFAGYETTEGALATGVFALLHHTDQLAALRAEPEKLDAAIEELLR
YLTVNQYHTYRTALEDVKLEGELIKKGDTVTVSLPAANRDPAKFGCPAELDIERDTSG
HVAFGFGIHQCLGQNLARIELRAGFTALLRAFPELRLAVPADEVPLRLKGSVFSVKKL
PVSW

complement(67462..68673) GENE 2
/note="ORF4"
/product="cytochrome P450 monooxygenase" 50% WITH X63601 105D1
MTETVAFPQDRTCPYQPAPGYEPLADRRPIAEVTLYDGRKAWAV
TGHELARKLLCDPRISSDRTNPAWPMVSAAAAVEFTDVQQKVIKLLTALVGVDGPAHR
ARRKLVQAGFTVKRIDSLRPKIQALVDRQIDEMIAEGAPVDLLKKFASPVPLIALCDL
LGIPHEDQDYFEKKAHQILFGPDAGGAYDELMAYLTKLIDKAEQNPGEEGFLQALLAE
RDPSSDVDHEEILQMFLIVLVTGHDTTSSSIGLGAFTLLQHPERLAELRADPSLMPAA
VDELTRFVVVPDGLQRVAADDIEVDADTTIRKGDGVFFLFSLINRDEEAYEDGDRLDW
HRPTFRDHLTYGFGTHQCAGMNHARAIMEIAFRRLIDRLPGLRLAVPADEVRVKPGDA
FQGLAELPVTW

complement(68704..69969) GENE 3 65% WITH AF040571 GENE 1
/product="cytochrome P450 monooxygenase"
MTTTAETSAEPVDLFSPEVVADPWGAHAAVREAPTLQIGGFMGG
PPMYLAARYEDVRQVLMDQRFQCNPLADSPAEDIRNGVFRHLDMPEELIPWLKNLINA
SDGEEHARLRKLVSYALTAHRVGKFRPRVEELTAELLDKLEAAGGDGTPIDLVESFCY
PLPVTVICELVGVDHEDRPQWRAWSDAMATLDGKRLPEEVRKTLAAARGMIERRRAEP
KDDMVTALVEAQAKNPGIATDDELVGVLFSLVTAGHQTTTYLIGNSVLALHEHPDQLA
RLKADPSLWPQAVRELQRLGPIQFGQPRFATEDIEVGGTVIPRGMPVVPLIMAANSDP
RKFPEPEKLQIDRLAVGSESHLGFGKGIHRCLGQHVAYLEAEVALRGLYTRFPDMTLA
VPREEIPWILRPGFTRTKDLPVRLARPAS

D87924  Actinomadura hibisca polyketide synthase gene
44% WITH 107G1 X86780 (check for two genes) and 107D1
MPSSKDAPTVDPRPDVTPAFPFRPDDPFQPPCEHARLRASDPVA
KVVLPTGDHAWVVTRYADVRFVTSDRRFSKEAVTRPGAPRLIPMQRGSKSLVIMDPPE
HTRMRKIVSRAFTARRVEGMRAHVRDLTSGFVDEMVEHGPPADLIAHLALPLPVTVIC
EMLGVPPEDRPRFQDWTDRMLTIGAPALAQADEIKAAVGRLRGYLAELIDAKTAAPAD
DLLSLLSRAHADDGLSEEELLTFGMTLLAAGYHTTTAAITHSVYHLLREPSRYARLRE
DPSGIPAAVEELLRYGQIGGGAGAIRIAVEDVEVGGTLVRAGEAVIPLFNAANRDPEV
FADPEELDLGRTDNPHIALGHGIHYCLGAPLARLELQVVLETLVERTPALRLAIDDAD
ITWRPGLAFARPDALPIAW

Z97193 Mycobacterium tuberculosis cosmid Y180
44% with 107A1
MKDKLHWLAMHGVIRGIAAIGIRRGDLQARLIADPAVATDPVPF
YDEVRSHGALVRNRANYLTVDHRLAHDLLRSDDFRVVSFGENLPPPLRWLERRTRGDQ
LHPLREPSLLAVEPPDHTRYRKTVSAVFTSRAVSALRDLVEQTAINLLDRFAEQPGIV
DVVGRYCSQLPIVVISEILGVPEHDRPRVLEFGELAAPSLDIGIPWRQYLRVQQGIRG
FDCWLEGHLQQLRHAPGDDLMSQLIQIAESGDNETQLDETELRAIAGLVLVAGFETTV
NLLGNGIRMLLDTPEHLATLRQHPELWPNTVEEILRLDSPVQLTARVACRDVEVAGVR
IKRGEVVVIYLAAANRDPAVFPDPHRFDIERPNAGRHLAFSTGRHFCLGAALARAEGE
VGLRTFFDRFPDVRAAGAGSRRDTRVLRGWSTLPVTLGPARSMVSP

Z97345 Mycobacterium tuberculosis cosmid Y25C11
34% with 107B1
MRRSPKGSPGAVLDLQRRVDQAVSADHAELMTIAKDANTFFGAE
SVQDPYPLYERMRAAGSVHRIANSDFYAVCGWDAVNEAIGRPEDFSSNLTATMTYTAE
GTAKPFEMDPLGGPTHVLATADDPAHAVHRKLVLRHLAAKRIRVMEQFTVQAADRLWV
DGMQDGCIEWMGAMANRLPMMVVAELIGLPDPDIAQLVKWGYAATQLLEGLVENDQLV
AAGVALMELSGYIFEQFDRAAADPRDNLLGELATACASGELDTLTAQVMMVTLFAAGG
ESTAALLGSAVWILATRPDIQQQVRANPELLGAFIEETLRYEPPFRGHYRHVRNATTL
DGTELPADSHLLLLWGAANRDPAQFEAPGEFRLDRAGGKGHISFGKGAHFCVGAALAR
LEARIVLRLLLDRTSVIEAADVGGWLPSILVRRIERLELAVQ

AL022021 Mycobacterium tuberculosis sequence v049         
34% with 127
MTTPGEDHAGSFYLPRLEYSTLPMAVDRGVGWKTLRDAGPVVFM
NGWYYLTRREDVLAALRNPKVFSSRKALQPPGNPLPVVPLAFDPPEHTRYRRILQPYF
SPAALSKALPSLRRHTVAMIDAIAGRGECEAMADLANLFPFQLFLVLYGLPLEDRDRL
IGWKDAVIAMSDRPHPTEADVAAARELLEYLTAMVAERRRNPGPDVLSQVQIGEDPLS
EIEVLGLSHLLILAGLDTVTAAVGFSLLELARRPQLRAMLRDNPKQIRVFIEEIVRLE
PSAPVAPRVTTEPVTVGGMTLPAGSPVRLCMAAVNRDGSDAMSTDELVMDGKVHRHWG
FGGGPHRCLGSHLARLELTLLVGEWLNQIPDFELAPDYAPEIRFPSKSFALKNLPLRWS

AL022022 Mycobacterium tuberculosis sequence v023
38% with 107B1
MTEAPDVDLADGNFYASREARAAYRWMRANQPVFRDRNGLAAAS
TYQAVIDAERQPELFSNAGGIRPDQPALPMMIDMDDPAHLLRRKLVNAGFTRKRVKDK
EASIAALCDTLIDAVCERGECDFVRDLAAPLPMAVIGDMLGVRPEQRDMFLRWSDDLV
TFLSSHVSQEDFQITMDAFAAYNDFTRATIAARRADPTDDLVSVLVSSEVDGERLSDD
ELVMETLLILIGGDETTRHTLSGGTEQLLRNRDQWDLLQRDPSLLPGAIEEMLRWTAP
VKNMCRVLTADTEFHGTALCAGEKMMLLFESANFDEAVFCEPEKFDVQRNPNSHLAFG
FGTHFCLGNQLARLELSLMTERVLRRLPDLRLVADDSVLPLRPANFVSGLESMPVVFT
PSPPLG

AL021942 Mycobacterium tuberculosis H37Rv complete
40% with Z96800 Mt Y63
MSGTSSMGLPPGPRLSGSVQAVLMLRHGLRFLTACQRRYGSVFT
LHVAGFGHMVYLSDPAAIKTVFAGNPSVFHAGEANSMLAGLLGDSSLLLIDDDVHRDR
RRLMSPPFHRDAVARQAGPIAEIAAANIAGWPMAKAFAVAPKMSEITLEVILRTVIGA
SDPVRLAALRKVMPRLLNVGPWATLALANPSLLNNRLWSRLRRRIEEADALLYAEIAD
RRADPDLAARTDTLAMLVRAADEDGRTMTERELRDQLITLLVAGHDTTATGLSWALER
LTRHPVTLAKAVQAADASAAGDPAGDEYLDAVAKETLRIRPVVYDVGRVLTEAVEVAG
YRLPAGVMVVPAIGLVHASAQLYPDPERFDPDRMVGATLSPTTWLPFGGGNRRCLGAT
FAMVEMRVVLREILRRVELSTTTTSGERPKLKHVIMVPHRGARIRVRATRDVSATSQA
TAQGAGCPAARGGGPSRAVGSQ

AL008609 Mycobacterium leprae cosmid B1788 like 107B1 42%

TRLRKLVSKAFAPKVVQALEGDIAALVDSLLDKGAAAGQFDVIA
DLAFPLAVAVICRLLGVPYEDAPEFGRVSALLVQSVDPFITITGEPPEATEERLRAGV
WLRDYLEQLVKCRRGTPGEDLISRLIELDESGDQLTEEEIIATCGLLLVAGHETTVNL
IANAVLAMLRNPSQWKALSSNPQRAPLVVEETLRYDPAIHLIGRVAAKDMTIGQTTLT
EGDTMVLLLAAANRDPAVYSRPDEFDPDRPSSRHLAFAVGSHFCLGAALARLEATVTL
SAISARFPQVQLAGELVYKPNVAMRGMSALPVQV

CYP?       Mycobacterium tuberculosis
           GenEMBL Z83866
           coding region 23158-24636
29% with CYP26
MATIHPPAYLLDQAKRRFTPSFNNFPGMSLVEHMLLNTKFPEKK
LAEPPPGSGLKPVVGDAGLPILGHMIEMLRGGPDYLMFLYKTKGPVVFGDSAVLPGVA
ALGPDAAQVIYSNRNKDYSQQGWVPVIGPFFHRGLMLLDFEEHMFHRRIMQEAFVRSR
LAGYLEQMDRVVSRVVADDWVVNDARFLVYPAMKALTLDIASMVFMGHEPGTDHELVT
KVNKAFTITTRAGNAVIRTSVPPFTWWRGLRARELLENYFTARVKERREASGNDLLTV
LCQTEDDDGNRFSDADIVNHMIFLMMAAHDTSTSTATTMAYQLAAHPEWQQRCRDESD
RHGDGPLDIESLEQLESLDLVMNESIRLVTPVQWAMRQTVRDTELLGYYLPKGTNVIA
YPGMNHRLPEIWTDPLTFDPERFTEPRNEHKRHRYAFTPFGGGVHKCIGMVFDQLEIK
TILHRLLRRYRLELSRPDYQPRWDYSAMPIPMDGMPIVLRPR

CYP?       Mycobacterium tuberculosis
           GenEMBL Z95150 
           cosmid cY164 from Sanger Centre
           coding region 29289-30488
35% with 107D1
MTSTSIPTFPFDRPVPTEPSPMLSELRNSCPVAPIELPSGHTAW
LVTRFDDVKGVLSDKRFSCRAAAHPSSPPFVPFVQLCPSLLSIDGPQHTAARRLLAQG
LNPGFIARMRPVVQQIVDNALDDLAAAEPPVDFQEIVSVPIGEQLMAKLLGVEPKTVH
ELAAHVDAAMSVCEIGDEEVSRRWSALCTMVIDILHRKLAEPGDDLLSTIAQANRQQS
TMTDEQVVGMLLTVVIGGVDTPIAVITNGLASLLHHRDQYERLVEDPGRVARAVEEIV
RFNPATEIEHLRVVTEDVVIAGTALSAGSPAFTSITSANRDSDQFLDPDEFDVERNPN
EHIAFGYGPHACPASAYSRMCLTTFFTSLTQRFPQLQLARPFEDLERRGKGLHSVGIK
ELLVTWPT

ADD 3 PSEUDOMONAS AERUGINOSA GENES

CYP118 fragments 

RLGSGIVEQRISVLSIIALADPXXXXXXXDETVWENI
XXXXXXRLLMGDELFTXXXXESNWGKAHN (THIS FRAGMENT MATCHES 12/19 RESIDUES IN CYP102A1 AND CYP102A2)
SQHDDILDIMLYSADPSTGEQLDTDNV
VNQILTLLVSGSQTLANAIAFALHYLLSIHHDIAAQTRREI
RCLRRVVDATLRLWSXXXXXXRQARRDTTLGNGXXFPKGXXXXXXXXXXXX

THE CYP118 FRAGMENT BELOW IS 50% IDENTICAL TO yetO (CYP102A2) of B. subtilis

RADAWGPDANEFNPDRVLPEICRKLPYTYILFGTGLRTCIGRRFALHEMALELTMIVHQYD

CYP102A2 P450 part (N-terminal half)
MKETSPIPQPKTFGPLGNLPLIDKDKPTLSLIKLAEEQGPIFQI
HTPAGTTIVVSGHELVKEVCDEERFDKSIEGALEKVRAFSGDGLFTSWTHEPNWRKAH
NILMPTFSQRAMKDYHEKMVDIAVQLIQKWARLNPNEAVDVPGDMTRLTLDTIGLCGF
NYRFNSYYRETPHPFINSMVRALDEAMHQMQRLDVQDKLMVRTKRQFRYDIQTMFSLV
DSIIAERRANGDQDEKDLLARMLNVEDPETGEKLDDENIRFQIITFLIAGHETTSGLL
SFATYFLLKHPDKLKKAYEEVDRVLTDAAPTYKQVLELTYIRMILNESLRLWPTAPAF
SLYPKEDTVIGGKFPITTNDRISVLIPQLHRDRDAWGKDAEEFRPERFEHQDQVPHHA
YKPFGNGQRACIGMQFALHEATLVLGMILKYFTLIDHENYELDIKQTLTLKPGDFHIS
VQSRHQEAIHADVQAAEKAAPDEQKEKTE

X82490  Fusarium oxysporum (a fungal P450)

MAPMLRPLVYRFIPERARIKDQWTKGRKRVMASMRERQEKGGNL
EDPPTMLDHLSNGRNEHIADDVELQLLHQMTLIAVGTVTTFSSTTQAIYDLVAHPEYI
TILREEVESVPRDPNGNFTKDSTVAMDKLDSFLKESQRFNSPDLSMSNLKNYKLCESL
TGHSNLPTRTIADMKLPDGTFVPKGTKLEINTCSIHKDHKLYENPEQFDGLRFHKWRK
APGKEKRYMYSSSGTDDLSWGFGRHACPGRYLSAINIKLIMAELLMNYDIKLPDGLSR
PKNIEFEVLASLNACANA

C. crescentus at TIGR

>gcc_1109
         Length = 6908

  Minus Strand HSPs:

 Score = 254 (89.4 bits), Expect = 3.1e-21, P = 3.1e-21
 Identities = 93/298 (31%), Positives = 159/298 (53%), Frame = -2

Query:    24 VRRED-PFHVPPALVAVSER--GPVARATLAAGDPFWLVSGYEEARAVLSDPRFSSDRFQ 80
             +RRE  P+  P  L  ++ER  G V R  L   +P  L +G E    +     FSS    
Sbjct:  1060 LRREHMPYFTPSYLRGLTERVKGEVTRL-LDEMEPL-LANGAE----IDMVEHFSS---- 911

Query:    81 YHPRFKELSPEFAFPAPSLMICELLGVRYEDRAE--FQQRASALLQMNAPVAEAV----K 134
               P F  L      P P      L  + Y +RA+    ++A+A +Q    + + V     
Sbjct:   910 VLPLFT-LCEILGVP-PEDRPKFLTWMHYLERAQDLAVKQANAPMQPTLELMQFVMDFNN 737

Query:   135 NADALRAFMQALVTDKRANPAGDIISGLIHHAGADPALTDDELINIANLLLI-AGYDTTA 193
             N + +  + + ++  +R +P  D+++  I  A  D A+  DE ++ + LL++ AG DTT 
Sbjct:   736 NVEEMFEYGRTMLHKRREDPKEDLMTA-IARAQLDGAVLPDEYLDGSWLLIVFAGNDTTR 560

Query:   194 SMLGLGIFVLLQRPAQLATLRDDPSRIADAVEGLLRYLSVVNPGIF--RFAKEDLEFAGE 251
             + L   + +L + P Q   L  DPS +  AV+  +R   +V+P ++  R A  D+E  G+
Sbjct:   559 NTLSGAMRLLTEFPDQKQKLIADPSLLGGAVDEFIR---MVSPVVYMRRTATRDVEVNGQ 389

Query:   252 HIPAGSTVVVSVVATNRDARHWPDPD-LDLTRPR-GPHLAFGHGVHQCLGQQLARMEMQA 309
              I  G   ++   A NRD   + +PD LD+TR   G H+AFG+G H C+G+++A+++++ 
Sbjct:   388 LIREGEKAIMYYGAANRDPAMFENPDQLDVTRANAGKHIAFGYGPHTCIGKRVAQIQLEE 209

Query:   310 GYAELLRRLPNV 321
              Y ++L R P++
Sbjct:   208 AYRQILARFPDL 173
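
In the translated hit above, the Sbjct coordinates are nucleotide positions, so each aligned row should cover three nucleotides per residue, and on the minus strand the numbers run downward. A small sketch of that arithmetic, using the first Sbjct row of the gcc_1109 hit; the helper names are hypothetical and not part of any BLAST tool.

```python
def residues_in_span(start, end):
    """Number of codons in an inclusive nucleotide span on either strand."""
    span = abs(end - start) + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3


def ungapped_length(aligned_row):
    """Count residues in an aligned row, ignoring '-' gap characters."""
    return len(aligned_row.replace("-", ""))


# First Sbjct row of the gcc_1109 hit: coordinates 1060 down to 911.
sbjct_row = "LRREHMPYFTPSYLRGLTERVKGEVTRL-LDEMEPL-LANGAE----IDMVEHFSS----"

# The two numbers should agree if the row is internally consistent.
print(residues_in_span(1060, 911), ungapped_length(sbjct_row))
```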

>gcc_831
        Length = 4246

  Plus Strand HSPs:

 Score = 253 (89.1 bits), Expect = 4.7e-21, P = 4.7e-21
 Identities = 71/238 (29%), Positives = 127/238 (53%), Frame = +3

Query:   100 MICELLGVRYEDRAEFQQRASALLQMNAPVAEAVKNADALRA-FMQAL-----VTDKRAN 153
             M+  L    +E+R +  + +   +   +P +  +++ +A RA  ++ L     + ++R N
Sbjct:  2946 MLATLFDFPWEERRKLTRWSD--IATASPESGLIESEEARRAELLECLAYFTNLWNERVN 3119

Query:   154 ---PAGDIISGLIHHAGADPALTDDELINIANLLLIAGYDTTASMLGLGIFVLLQRPAQL 210
                P  D+IS ++ H  A   +   E +    LL++ G DTT + L  G++ L + P + 
Sbjct:  3120 LTEPGNDLIS-MLAHGEATRDMPPMEYLGNVILLIVGGNDTTRNSLTGGLYALSKNPQEE 3296

Query:   211 ATLRDDPSRIADAVEGLLRYLSVVNPGIFRFAKEDLEFAGEHIPAGSTVVVSVVATNRDA 270
             A LR DP  I + V  ++R+ + +   + R A ED E AG+ I  G  VV+  V+ NRD 
Sbjct:  3297 AKLRADPGLIPNMVSEIIRWQTPL-AHMRRTALEDYELAGQTIKKGDKVVMWYVSGNRDD 3473

Query:   271 RHWPDPDLDLT-RPRGP-HLAFGHGVHQCLGQQLARMEMQAGYAELLRRLPNVRLAVPPE 328
                 + D  +  RP    HL+FG G+H+C+G +LA M+++  + E+L+R P + +   P+
Sbjct:  3474 TVIENADQFIVDRPNARRHLSFGFGIHRCVGNRLAEMQLKIVWEEILKRFPKIEVLEEPK 3653

Query:   329 EV 330
              V
Sbjct:  3654 RV 3659


>gcc_113
        Length = 3962

  Minus Strand HSPs:

 Score = 226 (79.6 bits), Expect = 1.7e-18, Sum P(2) = 1.7e-18
 Identities = 70/243 (28%), Positives = 114/243 (46%), Frame = -1

Query:    77 DRFQYHPRFKELSPEFAFPAPSLMICELLGVRYEDRAEFQQRASALLQMNAP-------- 128
             DR   H    + + + AF  P  +I E+LGV   D     +    L     P        
Sbjct:   731 DRMAEHGDRCDFARDVAFLYPLHVIMEVLGVPESDEXRMLKLTQELFGNADPDLNRTGKS 552

Query:   129 VAEAVKNADALRA----FMQ---ALVTDKRANPAGDIISGLIH-HAGADPALTDDELINI 180
             V +  +  D++++    FM    A+  D+RANP  D+ + + +     +P +   E ++ 
Sbjct:   551 VTDVGEGVDSIQSVVMDFMMYFNAITEDRRANPRDDLATLIANGKINGEP-MGHLEAMSY 375

Query:   181 ANLLLIAGYDTTASMLGLGIFVLLQRPAQLATLRDDPSRIADAVEGLLRYLSVVNPGIFR 240
               +   AG+DTT+S     ++ L + P Q A ++ DPS I   +E  +R+++ V     R
Sbjct:   374 YIIAATAGHDTTSSTTAGALWALAENPDQFAKVKADPSLIPGLIEESIRWVTPVKH-FMR 198

Query:   241 FAKEDLEFAGEHIPAGSTVVVSVVATNRDARHWPDP-DLDLTRPRGPHLAFGHGVHQCLG 299
              A  D E  G+ I  G  +++S  + NRD   + DP    + R    H+AFG+G H C G
Sbjct:   197 TATADAELGGQKIAKGDWIMLSYPSGNRDEAVFEDPFTFRVDRTPNKHVAFGYGAHICFG 18

Query:   300 QQLAR 304
             Q LAR
Sbjct:    17 QHLAR 3

>gcc_1435
         Length = 834

  Minus Strand HSPs:

 Score = 71 (25.0 bits), Expect = 2.5, P = 0.92
 Identities = 28/102 (27%), Positives = 40/102 (39%), Frame = -3

Query:   250 GEHIPAGSTVVVSVVATNRDARHW-------PDPDLDLTRPRG--PHLAFGHGVHQCLGQ 300
             G  +  G  + +S    +R  + W       PD  +D   P G    L FG G   C+G 
Sbjct:   697 GHAVVPGQIITISPWLIHRHRKLWDAPTAFVPDRFIDQPHPWGIEAFLPFGAGPRVCIGA 518

Query:   301 QLARMEMQAGYAELLRRLPNVRLAVPPEEVPLRNDMLTYGVHSLP 345
               A  E Q   A LL R   + L      +P+ +  +T G    P
Sbjct:   517 SFALAEAQIVLASLLERF-EIGLVSDRPVIPIAS--ITLGPDHAP 392
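
The Identities and Positives percentages in the hits above appear to be truncated rather than rounded (71/238 is about 29.8% but is printed as 29%). The sketch below re-derives each percentage from its ratio under that assumption; the regular expression and the function name check_summary are illustrative only.

```python
import re

# Matches e.g. "Identities = 93/298 (31%)" and captures label, counts, percent.
SUMMARY = re.compile(r"(Identities|Positives) = (\d+)/(\d+) \((\d+)%\)")


def check_summary(line):
    """Return (label, reported_pct, truncated_pct) for each ratio on the line."""
    results = []
    for label, num, den, reported in SUMMARY.findall(line):
        truncated = 100 * int(num) // int(den)  # integer division drops the fraction
        results.append((label, int(reported), truncated))
    return results


# Summary lines copied from the four hits above.
lines = [
    " Identities = 93/298 (31%), Positives = 159/298 (53%), Frame = -2",
    " Identities = 71/238 (29%), Positives = 127/238 (53%), Frame = +3",
    " Identities = 70/243 (28%), Positives = 114/243 (46%), Frame = -1",
    " Identities = 28/102 (27%), Positives = 40/102 (39%), Frame = -3",
]

for line in lines:
    for label, reported, truncated in check_summary(line):
        print(label, reported, truncated, reported == truncated)
```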