Cytochrome P450 sequences from Volvox carteri

A colonial green alga similar to the single-celled alga Chlamydomonas.

 

David Nelson

Sept. 11, 2007

 

 

>CYP51G1 estExt_Genewise1.C_30095|Volca1

84% to Chlamy

MADLTAELSVLLEKFTATQMVLAGSAILFLALIIGRVLFNNLPGKRPPVYEGIPFVGGLLKFSQGPWKLLHDGYAKFGEV

FTVPVAHKRVTFLIGPDVSPHFFKAGDDEMSQSEVYDFNVPTFGPGVVFDVEQKVRTEQFRMFTEALTKNRLKAYVPQFN

REAEEYFAKWGDEGVIDFRDEFSKLITLTAARTLLGREVREQLFEQVADLLHGLDEGMVPISVFFPYLPIPVHQKRDRCR

KELSKIFGKVIRQRRESGHREEDVLQQFVDARYQNVNGGRALTEEEITGLLIAVLFAGQHTSSITTSWTGIFMAANKKAW

LPAVEEQKAIVAKHGTDLSFEALSEMEVLHRNITEALRMHPPLLLVMRYAKKPFSVTTSDGKTFVVPKGDVVAASPNFSH

MLPQIFKNPKAYDPDRFAPPREEQNRPYSFIGFGAGRHACIGQNFAYLQIKSIWSVLLRNFEFELLDPVPDADYESMVIG

PKACRVKYTRRKLL*

 

>CYP97A13 fgenesh4_pg.C_scaffold_106000021|Volca1

80% to 97A5

MQQHQSRTLGGRPQQQPQRLPRCPVLRSAGISRSRPIVHAEPTEGNQPDSGKLFGLIPLRARGENLDARIESGEFTDAGS

TKEKLTRPLRQALAKEPIVGRPVARFLADLGRRWRSEAAKRMPEARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKS

FVIISDPAYAKQILLTNADKYSKGLLSEILDFVMGTGLIPADGEVWKARRRAVVPALHRKYVASMVGMFGDCTVHGTATL

DCAVASGQSIDMENYFSRLALDIIGKAVFNYDFDSLTHDDPVIQAVYTVLREAEHRSTAPLAYWNLPGATIVVPRQRRCQ

EALRIVNDTLDGLIDKCKKLVEEEDMEFNEEFLSDQDPSILHFLLASGDEISSKQLRDDLMTMLIAGHETTAAVLTWTLY

TLASHPEATEAIRREVDEVLGDRAPNVEDFKSLRFTTRVINEAMRLYPQPPVLIRRALQEDKFDQYVVPAGSDLFISVWN

LHRSPELWDEPDKFKPERFGPLDGPIPNEVTENFGYLPFGGGRRKCIGDQFALFEALVALAMLVRRYDFVLDTSKPPVGM

TTGATIHTTGGLYMHVKKRDMSGLAAAVRRQETPAYAFAYGTSTVAAMASPASSSPAAVAGGGCPFHTGAAVPPPPPAAV

AAASVTVGGATATLGGGVSIGPSAPAGPAGL*

 

>CYP97B17 e_gw1.51.4.1|Volca1

60% to 97B8 moss

MQQLNCRSGARGPLRIAAGCSPRPRRAHAFSPNPFNPSLSSLATSPLRPGRLS

RSTGLQSRLIPAGLPDPLAVGLFFAPGFAALVYAYFRGKGNLTDGLSRLLTEI

SQGYFQPDVGGKNIPVAQGELSDLAGDQPLFKALYQW

FIESGGVYKLVFGPKAFIVVSDPVVVRHLLK

DNAFNYDKGVLAEILEPIMGRGLIPADLDTWRVRRRAVVPAFHR

QYYDAMVTMFGRCADRSSDKLQALVEKGQVGLGGR (1)

VVDMESEFLSLGLDIIGLGVFNYDFGSITSESPVIKAVYGVLKEAEHRSTFYLPYWNLPLADVLVPR

QAKFRRDLRVINDCLDDLIRKAQETRVEEDAEALQNRDYSKLRDPSLLRFLVDMRGEDVTNKQLRDDLMTMLIAGHETTAAV

LTWALYCLMQSPAALERVLREVDGVVGDRTP (1)

TPGRIKAMPYLRAT

LGESLRMYPQPPILIRRALGEDVLPGGLRGDP

AGYPIGTGADLFISVWNLHRSPYLWKDPDTFRPDRFFESYSNPDFEGKWAGAY

AVSGGAALYPNEVGSDFAFIPFGGGAR

KCVGDQFAMFEATVALAVLLRRFSFALEGPPEKVGMATGATIHTANGLMVRVSRRTP

PPPPPAPAAGSPREEQLPRQPVAA*

 

>CYP97C14 estExt_Genewise1Plus.C_520019|Volca1

about 88% to 97C3

MLLSGRVAQPKACGHASNNPRRRPVPFQSYHQANRIVKVRAQDDE (0)

PIMGKSIDAAGAGASFTSPGWLTQLNMLWGGKGNV

PVANAQPDDIKELLGGALFKALYKWMQESGPVYLLPTGPVSSFLVISDPAAAKHVLRATDNSQRNIYNKGLVAEVSQFLF

GKGFAVAGGDDWKVRRRAVGPSLHRAYLEAMLVRVFGPSSEFAADKLRVAARSGTPVNMEAMFSQLTLDIIGKAVFNYDF

NSLTSDSPLIQAVYTALKETEQRATDLLPLWKVPALGWLIPRQRKALQ

Seq gap

(1) GSALTWTLYLLVQNPDKMAKAVAEVESVMGS

RTAPTLADYGQLRYVMRCVNESMRLYPHPPVLLRRALVEDELPGGYKVPVGQDVMISVYNIHHSEAVWDNPEAFIPERFG

PLDGPVPSEQNTDFRYIPFSGGPRKCVGDQFALMEAVVALAVLLRQFDFSLVPNQKIGMTTGATIHTTDGLYMYVKER

RTCAGQAAAGAAAVTAG*

 

Probable chlamy N-term

422264 MMLSNRTSGRPTVGSRSSSSARRPALFVPVKHVSRVAPLRAQNEDDE (0) 422404  PSTFGKNID…

 

>CYP710B1 ortholog e_gw1.84.34.1|Volca1

79% to 710B1 revised N-term. and mid

MNTTTVDDGLFGGSAGVLGSLGFSCMPSGSALIAAGGAIALGYTIWEQ

AKFRWYRMPKKGDGLLP (1)

GPGTVTPILGGIVEMVKDPFAFWERQRLYSFPGMSWNSICGIFTVFVTDAALSRYVFSHNSEDSLLLC

LHPNAEWILGKTNIAFMSGPQHKALRKSFLALFTRKALGLYVLKQDAIIRQNFDEWMSMPGPLEVRPLIRDLNAFTSQMV

FVGPYLDDPQ (0)

EREQFSAAYRAMTDGFLTFPLLLPGTGVWKGRQGRKFVVKVLTKAAARSKVRMAAGQEPECLLDFWTKQ

ILSDMKDA

ADAGAEPPFYHEDRKIAETVMDFLFASQDASTASLVWTITLMAEHPDILARVREEQARLRPNLDAPVTGDVLNEMTFTRQ

VVKEILRYRPPAPMVPQRAQAPFKLTETYTAPKGTLIIPSLVSASLQGYANPEKFDPDRFGPERQEDIKYASNFLVFGHG

PHYCVGKEYAINHLTVFLALLSTSLDFNRIRSNISDEIKYLPTLYPGDSVFEMRWRAKK*

 

>CYP739A7 estExt_fgenesh4_pg.C_400138|Volca1 revised N-term

57% to 739A5

        MATVSDRSANGPSQGPLAWLATA

1095883 AAGVLVLSFIYNFIVGKKRLPGPWLTFPILGDAVELFTTDPARLLFSR (2 GC) 1095740

1095643 FKRYGR (2) 1095626

VFRMKLLGMETYVVADPEALRPLLSDDGAHFAIPIASFNWLMESLSVQNSKQTHGPWRKLH

MAALTGGGLKALLPAVRRIMEDHVQQWASSGRVGIFEEARRMGLDLSIFAITGVDLEGKIDMRWFKEQMYLFLGGLYGLP

FALPGTKLAKGLAAKKRLLGALMPLLKERHEAFHAEWEAAGGDPAKVAAKLMDES

EEPITVHKAQMMGFHSIGAGTLRGT

AMSVLHSVMAAADTTRFALFNTWALLAQLPAVQDKIYEEQQK

VVSELGPELSFQALSRMPYLDAVFKEALRVLPPSSGGFRKLTRD

ATICGVTLPAGTLIWYHALLLQILDPVLWDGDTSVDLPPHMDWQNNLEAAFQPERWMGDETQ

RPRSFYVFGWGSHLCAGTNL

VQMEVKLLLALVMRRFRLELEMPDMLTRAELFPYVVPVKGTDGMRLVPREESLPWN*

 

No CYP740 is found in Volvox, but Chlamy CYP740 is most like CYP739A7 in Volvox.

 

>CYP741A2 estExt_fgenesh4_pg.C_670030|Volca1

61% to 741A1

 

Scaffold_67

       MLPLLTSALLGLSLTPFA

351864 YILYAIIWPYLCSLPLRIKLQNMPGPP 351944

351945 GRIFFGNAFDLVYKPAQQQMAKWAEEYGPIYKLSLPGAMVR (0) 352067

353109 VVVLTEPEAITTVIKLERFEKDRMMYKVLEE (0) 353201

353382 ITSETLPNMLTVPTGPYYRAVRKAITPAFSSANLR (2) 353486

354110 QFFPELVSLTERLAAQLMTHGSVVPVDVDRAAQRLTIDVI (1) 354229

354515 GRFAFDRDFGALGFSRSEELEAICALMLALETSQNPLNRWFWWRK (0) 354649

355096 EARELYAARLRYDALIRRVLDGLQQRPPAAHCLLSHLLRCTDPTT (1) 355230

355429 GKPLSAKRLRCETAFLLVAGFETTGHGIAWSLLFLAGDEE (0) 355548

356508 AEARVAEELDVCGLLATSTRPKPRPVTWGDLGQLRYLNAVIQEALRLMP 356654

356655 PVSAGTIR (2) 356678

357147 TAPRDTRLAGKDVPAGATVW (0)  357206

357687 IPHYAVQRSSRTWGADAGRFRPERWLTGFSG 357779

357780 LQQQQQGEKEEEGKLPKGQSEG (0)

358335 ARGWLPFSDGPRNCVGQSLALLELRTTLATLCGRFR (2) 358442

358561 FRLAEEMGGVE (1) 358593

359207 GVVAAARQDVTLKPGDKGLLMHVIPRVPT* 359296
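
The exon-by-exon records in this listing can be collapsed into a single peptide string for downstream use. The following is a minimal Python sketch, not the method used to build this page. It assumes that the leading and trailing integers on a line are scaffold coordinates and that parenthesized markers such as (0), (1), (2), (2 GC) and (?) are exon-boundary annotations (probably intron phase); neither convention is stated in the listing itself. The name join_exons is a hypothetical helper introduced only for illustration.

```python
import re

# Assumption: anything in parentheses is a boundary annotation, e.g. (0), (2 GC), (?)
_MARKER = re.compile(r"\([^)]*\)")
# Assumption: stand-alone integers are scaffold coordinates, not sequence
_COORD = re.compile(r"\b\d+\b")


def join_exons(lines):
    """Concatenate annotated exon lines into one peptide string.

    Keeps letters and the terminal '*' and drops markers, coordinates
    and whitespace. Free-text note lines (for example 'Seq gap') must be
    filtered out by the caller, because their letters would be kept.
    """
    parts = []
    for line in lines:
        line = _MARKER.sub(" ", line)
        line = _COORD.sub(" ", line)
        parts.append("".join(ch for ch in line if ch.isalpha() or ch == "*"))
    return "".join(parts)


if __name__ == "__main__":
    # Three lines copied from the CYP741A2 record above
    exons = [
        "356508 AEARVAEELDVCGLLATSTRPKPRPVTWGDLGQLRYLNAVIQEALRLMP 356654",
        "356655 PVSAGTIR (2) 356678",
        "357147 TAPRDTRLAGKDVPAGATVW (0)  357206",
    ]
    print(join_exons(exons))
```

Records with sequence gaps or a "(?)" boundary are still incomplete after joining, so the result should be treated as a working sequence rather than a finished protein.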

 

>CYP742A2 fgenesh5_synt.43__16|Volca1

59% to CYP742A1 in overlapping regions

 

Scaffold_43

       MRSSFHRTSSVGQPPRLEQ

786109 KIPGPPGLPWLGQLPAFLRSRFFPKQMLKWAEEYNG (1) 785996

785616 VYRMEIVGRRYLVVS (1) 785572

785320 DPALLPPIMGRGGPGLPKSEGYGMWDPVIS (2) 785231

784442 PHPGVQGIVTVSETTDVWRAVRRSYGPAMGLGSL (?) 784341

   (1) ATCTAGMIC

783662 TTATDTDAGSSLPCPPSSSSLEMDRLLKALSLDMLGRCAFGLDFGAARNPEH (0) 783507

       68aa missing in seq gap

781496 (1) GPDALSDEQICAEIATVILAGYETTAN (2) 781416

776461 TLTWMLFALDAHPEAAAKLEEELRSH (1) 776384

776027 GLIPSPGADDADLSSADE DPVLAAFLALHGSHEALSSLPYLDAFVRECLRMFSVAPNGAVKVL 775842

775841 PPESPPTRIGPYEVEPGTTVWVPFWSLHLSERNWERPLEFQPVSWL (0) 775704

       GVSLNNNSSSSCSDNSGSSS

775308 SSKATRYMPFSEGSRNCVGQHLGMLQIKAVLSYLFSRFRFRLDDQRMG 775165

775164 GVEGALARQRVNLTLEVEGGMWM 775096

775095 AAELRSPRRGQQQQVGNAAVAAAGAGAGAS*

 

>CYP743A3 estExt_fgenesh5_synt.C_250087|Volca1

best match to 743A2

 

Scaffold_25 61% to CYP743 in overlapping regions

        MRKRASQLRGVGAATFLSFHAKNTRNVSYETPTPPPSPAFA

1531643 GPPAIPIVGCVPQVMRYGFPHFLRRCYEKYGPVYK (0) 1531747

1532097 VALGRSWVVVVADADFIRQ (0) 1532153

1532447 IGFRFRERAIVDPNLNRGVYREADKAGLVLAK (2) 1532542

1532767 GEYWRMIRTAWQPAFSSASLSGYLPRMVACAAQLAERLEGRARG (0) 1532898

1532920 VRGSRVDLWRELGSMTLQVVGSTAYG (2) 1532997

        QLMHACAAVFRYGSPYYGSR (2)

1533858 YSRAVLLFPELREPISLLAHWAPDMPFRKLLQ (0) 1533953

        ARKLLRDTCLTLIRDWEEKQHKPTKTETK

        TEAEAEVRNGHAATRETGPTAEEPASTSPSYDN

1536455 GGEGGGGGDVVRRAPAVAPGSFLGLMLSARDKLTGDALSDDQVVSQ (0) 1536592

1537082 AQTFILAGYETTANSLTFAVYNIATHPEG (1) 1537165

        VERRLLAEMDEVLGPDR (2)

1538910 APTDADLSRMPYTEAVLHETMRLYPAAHAITREVTNTPTQ (0) 1539029

1539384 VGGYTIPADTHVILGIYTAHHDERFWPRAEEFIPERFMP (0) 1539500

1540194 DSPLYPEVCPRAPHAHAPFGHGSRMCIGWRFAMQ (0) 1540295

1540665 EAKVALAMLYQRLRFELEPGQVPLETVSALTLAPKDGLWVRPVPRKQL* 1540817

 

>CYP744D1 estExt_fgenesh4_pg.C_460099|Volca1

most like CYP744, EST = JGI_CBGZ12453.rev

 

Scaffold_46 49% to CYP744A4 in overlapping regions 251/508 aa

        MVGSSALAGSAKIGALCKTR (2)

1008553 AGTMSYRAGPLGWPFVGNLPQIVAMDVTKYLSYTAKKYGPICK (0) 1008681

        IWFGTRPWLLISDPVLAR (2)

1009644 KLAFQCTARPLELPTFLDTLTGENRQIELVSAFFAQ (2) 1009751

1010701 HGETWRRGRRAFEASIIHPAR (2)  1010763

1011250 LNAHLPAVRRCLARFIPSLERYASSGQPLNALSALGDLMLAITGELAYG (2) 1011396

        GGAGAGAGAGARMAYFCRDVFRTFRLQDASIYLPLQ (0)

1013127 LVFPSMAPAIRLLADALPDANQRRSMGARSAMAVVSRQLIAQWARAR 1013267

        ADGGGSGGAGSGGGGEDDSARAVAVAP

1013391 DCGRGGDFREVDGGISGSSFLAAMLEGRR 1013477

        RRGGGVGVAEQGQGQGQQQQQQQQQQQPNYVELTDDE

1014473 VIAQSLTFILASYETSSTTTALALLLLAAHPGAQRRLAEEVDAVQGGELT 1014622

1014623 AAVLAE 1014640

1015219 LPYTEAVLKETMRLYPALPMMHRHARNDIRLEDGRVAPK (2) 1015335

1015726 GTFLALCSYNMHHDADLWPQPELFLPERFL 1015815

        RPEVEPGGGGGGGNGGGGGDGGGGGGGDGGNPLGRAHPSPP

1015939 WFGFGLGPRMCVGHKLATM (0) 1015995

1016590 VAKATLVSLLRRFSFSLAPHQHLPPAMATGLTYGPRDG 1016703

        AVWLQLHSRNTAPIVAV*

 

>CYP745A1 ortholog fgenesh4_pg.C_scaffold_34000081|Volca1 REVISED (REMOVED AN INTRON)

71% to 745A1

MAATTHWSPLADLLADAGSATTLLRLVPVVLLGGGLLVYIIARTALFIKDYMRISRALAKIPTAPGSLPLLGNVIPMITC

VRRNIGAWDLMEEWLDNTGPIVKFSILGTQGVVFRDPSALKRVFQTGYKMYEKDLDVSYRPFLPILGSGLVTADGALWQK

QRMLMGPALRVDVLDDIVTIAKKAVDRLCEKLAHHAGKGQSVNIEEEFRLLTLQVIGEAVLSMAPEECDRVFPSLYLPVM

NEANHRVLRPYRMYLPTPEWFRFRTRMSQLNEYLIDLFR (2)

RRWESRQRLGRQKPADILDRIMEAIEESGAKWDAALETQLC

YEIKTFLLAGHETSAAMLTWSTFELAMNEKATNK (0)

VVSEAEAAFGLRSEKEANRRNVDEMIFTLAV

LKEALRKYSVVPVVTRKLVRGSGAADDPVGVLGHPLPKGIMVACHLQGTHRLYEAPDEFRPERFMPEGEYDSFDDAIRAY

MFVPFIQGPRNCLGQHLALLEARVVLSLLHKRFKFRVVDGDKVTRMPTVIPVGPTGGLNVVLEQRTDAQAKS*

 

>CYP746A1 ortholog estExt_Genewise1Plus.C_390098|Volca1

about 81% to 746A1

MYSLRAVAQKNVATSSNRTTGGGSSNAVPAASGDLRLDLWFSLLQATAVQLTAATAAAATAANGVALRLQ

RLSEPAPMNFPPGPEGDQTLSLLSDPLGFLTSTTAQ

YGPLVGLLLGGERVVLVTGREAARTVLVEQAGSVYVKEGTAFFPGSSLAGNGLLVSDGPVWQRQRRLSNPAFRRAAVEAY

SGAMVAATCDMLDNQWVAGGTRDVYADFNELTLRVTLEALFGF (0)

(?) SEEAQQIVAAVEKAFTFFTQR (2)

AATGFIIPEWL

PTWDNLEFAAAVQQLDRVVYGMIARRREELDAAFDALPSDLLTSLLLSRDDDGTGMSDQALRDELMTLLVAGQETSAILL

GWAAALLAAHPEVQARAAAEVASVCGRGVAPTAASVRDMPYLESIVLETLRLYSPAYMVGRCAQVDATLGPYSLPTGTTV

LVSPFVMHRDAAVWDQPNVFLPERWQELQTSNLGPNGAYLPFGGGPRNCIGTGFAMMEGMLVLAAVLQRYDLTLPPQTLQ

QAQHAAVDAAAVAGVLPASFPKPKPLLTLRPESVVLRITPR*

 

POSSIBLE N-TERM FOR CHLAMY SCAFFOLD 1

3574307 PSPAPGKSYSYNYPATTTSSSSSTSAAPFAPLAASPPATSDLRLDLWLSL 3574158

3574157 LQASASQISAAAAQLTAAAAGAATRLQ

 

>CYP747A1 ortholog fgenesh4_pg.C_scaffold_76000035|Volca1

about 79% to 747A1 Chlamy

 

SCAFFOLD_76 SHORTENED ON BOTH ENDS ADDED GRAY REGION

       MPLAASALDASTSTAAPDQQTSKCPFSQLASKISAATGAVPATSVRAAHPAELQ

411878 PIPGPAPLSLEALKDVSIIFLEGLHVAMLRFSDKYGPICR (2) 411759

411538 FANPASLNGATSWVFLNSPENIQHVCATNVRNYSRRYLPDIYTYVTHGKGILGSQ (0) 411374

410845 DEYNARHRRLCSGPFRSRTQLQRFSKVVVERW (0) 410750

       VSAVGGHLGGGGPG (1)

409596 GGLLVTDVAIQTQRLTLDVVGRVAFSHDFRQVEQVR (2) 409495

409254 DLAGAAGDSGLLQDQVLWAVNTFGEVLAQIFITPLPLLR (0) 409138

408821 VLDRLGLPHLRRLDEAVSIMRRAMLDVIQ (0) 408729

408110 ATDDAGRGLSDEELWEDVHDIMGAGHETTATTTAALLYCISAHPDVRQRVEQELDDVL (1) 407940

       PSSMAPPFQPCSTAYPAIPSGDQPPSCEALERLPYLQ (0)

       ACVKECMRLYPAIPVFPRETLSDDVLPSGHAVSA(1)

       GDVVFMSSYALGRSAALWPDPLTFNPDR (2)

406620 FTPEGETAQHRFQWLPFGAGPRMCLGASFAL (0) 406528

406141 MSVALMAATLLQRFRFTPLRPNTPILPVAYDITMNFNPSAGLHMRVQPRDRAVRRLGQ* 405965

      

>CYP748A1 ortholog fgenesh4_pg.C_scaffold_40000112|Volca1

trace archive sequences

ABSY209135.g1 exon 1

ABSY189778.b1 exon 2a

ABSY140806.b1 exon 2b

ABSY42643.g1 exon 3

ABSY86219.x1 exon 4

ABSY93957.g1 exons 5, 6

ABSY112787.y1 exons 8, 9 fused

ABSY106164.g1 exons 10, 11

 

Scaffold_40 73% to CYP748A1

MSSSWEELCFYGHLASTLFSPKYDLARVPGPRGSFGLGNITAVMRPDYHVQ (0)

MLEWANQYGGVYKFSLGFQWVVVVSDPRIAVQ (0)

VLGRGPDSIPRKCVGYKFFDL (0)

ATNAAGAHSFFTTSDETQWAAVRKAAAAAFSSANVR (2)

KAFPIALRHSRL (0)

LSMLHVFMEALFGIRPEDFP (1)

GRQVAADMNLVLEEANERLKVPLRKVAMALVRPT (0)

AQARIRAAQLRLAKVYGDMYEIIRSRCSPSTRLPPE

GVTDLWACLGRVRHPVT

GAPLGRDALVPEIGALMMAGFDTSSHSVAWVLFALAAHPGAQLRCRQELAARGLVAEGA (1)

GSQARDPTLDDLTQLPYLNAVIDETMRMFPVAATASVR (2)

EVTQPTRVGDYVIPPGVIVWPMLYALHNAVHNWDRPDEFLPERWLPGSG (0 GC)

RFIPFSDGLKSCLGQ (0)

ALGLMEVRTMLVVLLGRYHFQLDPGLGGPEAVRRNMIMSLTLKIKGGLKLVANPLHGATTNATT*

 

>CYP767A1 ortholog fgenesh4_pg.C_scaffold_56000003|Volca1

 

exon 1 is a best guess

trace archive hits

ABSY171556.g1 exon  2 PKY…

ABSY46806.x3  exon  3 DGL… fused with exon 4

ABSY46806.x3  exon  4 MCT…

ABSY5198.y1   exon  5 DRT…

ABSY140583.g1 exon  6 SAF…

ABSY56673.x2  exon  7 GLT…

ABSY90166.y3, ABSY10903.x1, ABSY90166.y1, ABSY125944.g1 exon 8 PAI…

ABSY174072.y1 exon 10 GTQ…

ABSY225235.b1 exon 12 FMP…

ABSY176428.b2 exon 13 MGG…

 

76% to 767A1 Chlamy

    MYSGRWWELPRDLSDLARRSRRHAAAHLAIGASAAKRNGQ (0)

    PKYDLDLIPGPWTHALPFIGNLLQFLRPDFHRVCLRWADKYGGIVR (2)

(2) IKFLWHDGLLVTDPPALAAICGRGEGAVDKAANIYSPIN

    QMCTPHAYPNLLTSLADDRWRAVRKAIALSFAFGNIRKKFPLIR (2)

(2) DRTGELLEWLRGVGPLESVDVDQAALRVTLDVIGL (0)

(0) SAFGHDYGCTRLQQVPYNHLLRVLPRAFTEVMRRIANPFRSFAPGLVKNGKK (1)

    GLTSFKDFQRHMQELLGEIKARGPPARGDADIGAQLYRVLEAAR (0)

(0) PAITDERILSE (0)

    IGILFVEGFETTGHTISWTLFNIATTP (1)

(1) GTQEAVAEELSSLGLLVRPKSEGGRSAARQLELDDLKRLRYLTACVKESMRMYPVVSIMGR (2)

    TTDKPTRVGPYVVPSGTPVATALFAIHNTIHNWRDPMTFKPERWLGECSLGVLGS (2)

(2) FMPFSEGPRSCVGQSLAKLEVMTVLAMLLANFRIELSDE (0)

(0) MGGREGVRQRESTHLTLQTRGTRGIRMHLHPRDQE*

 

>CYP768A1 ortholog fgenesh5_synt.31__13|Volca1

70% to 768A1

 

Scaffold_31 70% to 768A1 in overlapping regions

CYP768A1 Volvox ortholog

ABSY165990.g1 exons 1,2,3

ABSY147804.y1 exon 4 C-helix partial

ABSY193853.g1 exon 4 C-helix partial

ABSY111272.b1 exon 4 intact

ABSY75276.y2 exons 5,6 fused

ABSY73799.g1 exons 9,10

ABSY165990.b1 exons 14,15 fused, 16, 17

ABSY22806.b1 exons 18, 19

MWDTLRFYYSTHGPLGAWTPAIVLLLNILGIALALAVTKFIGLYFA (0)

PSYDLRKIPTPPVGDAILGHVKFLLRPDYHRVILAWTRKYGKIFRLR (2) 398

ILTQWTVVITDPAAAAQVLAVVPGRTHNYTLVDE (0) 700

GLGGPGKIS (2)

MFGTRDEAHWRNVRKATAPAFSMAN (0) 620

VPDARALPGFDLLVPRILLLMAEANRQIVDPLWALWYRTPLAPLLSK (0)

HVSECRAAVREVRAFHTATAARLLDR (2)

PDPPSDNTLLWACLHRLRHHITGARLTPTQLHPE (1)

VGMYTTAGFDTTASTLGWCL (2) 705

YAAALHPDQQQKVADELQQACVFGNGAVVEDLVKLPYLTAFVNEAMRLYPTTAVAAER (2)

VSPDRPVAVGPFTLPPGVVLWPLVYGIHMSDANWDEPEAFR (2) 835

MERWLEDPRCAFARGE (1) 495

RGPGASGAPRRFLPFADGPKNCVGQ (0) 261

NFGLVVVRAVLALLLSRYRVALHGDMGLER (2 GC)

VAVVTKLSKLRLVMTPRD* 878

 

>CYP769A1 ortholog fgenesh4_pg.C_scaffold_19000020|Volca1

 

Scaffold_19 58% to Chlamydomonas CYP769A1 in the overlapping regions

MSIDARLDRRLNYRCNLRGRVSRRALQDVHLSTRWTKTA (1)

PPPGVPLLGHSLTLRAWPSWTWWWFRSGGPRGDQLLLRALLRWSEQYDGAFQLRNGWL

VLHPNAVPSSATATSSAQWRLLRRSLLHAFSDSELQLDFE (0)

GPGAVVDVNDAALRLSLDVMGLSKLGYDFQVGMAV (0)

AVESQGEVLMLRLLGEVAAEWAVRRRRLLGRWAPWISDGAAEGQTR

CRILHHFIEQ (0)

LLLAHGPTGHSIAWALGCLAARRGVQEKLVAELKKE (1)

GIFNDPLRLTYDMLSKLPYLDCVVREVLRLYPTMPCPATVRTLKK

DVALHGRTLTAASDVWVDVFSMHRSPKWWRDPHHFKPERWTA (0)

SPPPLAPLCSPEAFMPFSFGSRSCLGQKLAVAQIKAALAMLLCFLVFEPS (1)

VAPWGLGLFLRPEGGMQLLVAPRKKNS*

 

No CYP770A1 match is found in Volvox.

The best match to CYP770A1 is CYP746A1.

CYP770A1 may be a pseudogene of the CYP746 family.

 

>CYP771A1 ortholog

fgenesh4_pg.C_scaffold_42000137 [Volca1:95263] contains the N-term

fgenesh5_synt.42__51|Volca1

and gw1.42.40.1|Volca1

about 38% to CYP4F animal in the last half

 

Scaffold_42  56% to Chlamy CYP771A1 ortholog in overlapping regions

1132371 MLGQSAVSGDLLLLLQSRLQPNRHLKRQQLQGCQGRCRCWRIYPRYAGMAQHRTHA (0) 113204

1131341 VRYLGKQVLLVREPDDVAAVLSRRADRFTKHPRQQRVKAWLG (1) 1131216

1130806 AGLATQADPLAHAAQRETLAPAFRADYVRQLDSVMAAAAARLAETLLTGA 1130657

1130656 VATSQPQQPQQPLRLDFQNLFKRHSLDVLGLASLQVDLGLLGRGVEQPEV (1) 1130507

1125031 AVAPVNAASTTATETVPYDLISVLTDIENAALWLLLQLPIPDHLLPGYDKYMANIATLDEL (0) 1124849

1122906 LVTMFFGGTDTSALALTLTAYHLAHCPEAQRAARAE (0) 1122796

1122073 VLEVLGGRSVRELQSDAVQRRLPFLTACLNETLRLYPALPEITRLAQQ (0) 1121930

1120234 DDVLSGYQVTRGSSVVVSLYSMHRHPAIWPRPDEWLPQRWM 1120112

1119721 EAADDDAEDRGPRS 1119680

1119577 PNAFLPFGVGPRGCIGRNFSLLNMQ (0) 1119503

1118541 MTLAALLSSLEL (0) 1118506

1118451 KSAAASSLDLVQSVVIRVKGGMWLGVRPYGN* 1118356

 

>CYP771A1 Chlamydomonas seq.

first exon is a guess, exons 7 and 8 are in a seq gap

 

Scaffold_21 56% to Volvox CYP771A1 ortholog in overlapping regions

MRAGYVRAKAVSCLWPKCRQLPTRVRFIRHVRWKPKRSPPPLLRPAH (0)

297759 VRYLGKRKLLLREPDDVAAVLAR 297827

PGEDAFRKHPRQQRVSAFLG (1)

AGLATQPDRQRHAAQRDA (FROM TRACE FILE GNL|TI|335849579)

299116 LAPAFRPDAVRQLDAVMAAAAERLAEALMAAAEAEAE 299226

EAEAVAAASGSSSGAAGAAAGAGAGAAAGELQVEMQDLLKRHSL

DLLGLAALRSDMGALRRSPVMAAA (1 GC)

AAAAAAAGGGYAAVGADVDVVTLMTE

302461 IEAASLWLLMALPVPNELLPGYGTYEANVRRLDEL (0) 302565

303433 LVTMLLGGTDTSALTVAFAAWHLAAEPQLQAELRRE (0) 303540

303889 VLGVLGGRALGELRAEDVKAMPLLAAVVNETLRLHPPLAEITRVATQ (0) 304029

exons 7 and 8 missing

305941 PNAFLPFGVGSRSCIGRHFGLLSTQ (0) 306015

306339 LTLAALVARFEVLPPAPPAPTALDWSQSIVITSRSGVWLRLRPIRQ* 306479

 

New family in Volvox also found in Chlamydomonas

 

>CYP772A1 Volvox fgenesh5_synt.67__7|Volca1

MFVTDLLAAQLSVWFAVVFGAILVVAAFSSLVWSRQGKNDAALVPGLPILGNALALGRHGVSYINKCRRK (0)

FGDSFTLSLAGVKMTFLFEPSHIHYFFSAPDEKVTFR (2)

RPAIEQFTQRVFGLTSRVFFPLHSK (0)

MLKELRELLVPAMLTDHMQ (0)

SLGTRALQLLPSYVHHDQ (0)

VDLCSLCRSLVFHCA (0)

HGGLPPRPPAGVEWLARTFFTFEDQFE (0)

LATSPLPHAFLPEFVKSRSELLRVFL

AADRRGLFQGTPAGEMLDR (2)

TTGSCAQLRPNMLLALIWASQ (0)

ANTIPAVFWSTAFLLLPENAVHKASVISELE (0)

RELQSKVLSAGGPAAAATTG (1)

ELVAAATRLAANRRSAVSRCVAEALRLRVQSIDVRQAAAPLDLPS

QSGEGARLQLPRGRLLAVCPFESHHDKKLCGAVERSGEAVGSD

PWVYDPCRPEVRLGDGSAVLPSVA (1)

GLAFGGGQYRCPGRFFAEHELGLLVQLLLWSYDMSLSY (1 GC)

DPQLQAVQGGSFLYAALATLLGPSAMAWGFGWFDGLDGPMQ (0)

EWRESGDPAGLLPPCDLRRLVGVKVPRKPCWVQLGRIA*

 

>CYP772A1 Chlamy      Chlre3/scaffold_93:93009-97931

estExt_fgenesh2_pg.C_930012

green matches ESTs AV393031.1, AV629836.1, BG847501.1, BG845414.1, BG845413.1

67% TO VOLVOX

like human CYP7B1

93009 MTMVQDSMIQALDALPVP 93062

93063 AVAASVVAVIITTVLLAVFRSRPGDAPSVPGLPLLGSAMALGRHGVAFINKCRQQ (0) 93227

93330 FGNSFSLSLAGVKMTFLFDPQHIDYFFGAPDSKITFR (2 GC) 93440

93627 PAVEQFTQRVFGLTSRLFFPLHFK (0) 93698

93980 MLTELRHLLVPASIAAHMQ (0) 94036

94271 ALGGRVLALLPLYVHHPQVDLYSLCRGLVFHCAGGEGG (0) 94426

94923 hqRPPEGVHRLARDFFAFEDGFE (0) 94985

LAASPVPHAFQPEFTAARQRLLALLAAADARGLFAGTLAGQLLER (2)

TAGLPPALRPNLLLAVLWASQ (0)

95654 (0) ANTVPATFWATGFLLLPENAHHRAAVLAELQ (0) 95746

      AELKGAVSAAGSPGGSAAYSNE (1)

96300 ELVAAAARVASSRRSAVSRCVAEALRLRVQSIDVRIAADHLELPLA (0) 96437

96510 GVKGGGGDVLRLPRGRLLAICPFVSHHDTQLYGGAAAAAAAAAAGCPAVTGAAAAGDVSS

96690 PWAFNPDRPELKLGDGTAVVSSVA (1) 96761

96995 GLAFGGGPYRCPGRFFAEQELGLLVQLLLWTYDIQLSYT (0 GC) 97111

97343 PQLRQVAGGSWLYGVLSGLVGARALAWGCGWFDGVDGPLE (0) 97492

97827 DFRHSGDPGGLLPPCDLKRLVGVKVPRRPLWVQL 97931

GVPHWQARRLGLVGVAPAATSRRWADIGLG*

 

AV629836 Chlamy EST

RLARDFFAFEDGFE

LAASPVPHAFQPEFTAARQRLLALLAAADARGLFAGTLAGQLLER (2)

TAGLPPALRPNLLLAVLWASQ

ANTVPATFWATGFLLLPENAHHRAAVLAELQAELKGAVSA

AGSPGGSAAYSNEELVAAAARVASSRRSAVSRCVAEALRLRVQSIDVR

 

AV393031 Chlamy EST

HHALSPATAAAAPAPARADLDSVPAQ

PQLRKVAGGSWLYGVLSGLVGARALAWGCGWFDG

VDGPLEDFRHSGDPGGLLPPCDLKRLVGVKVPRRPLWVQL

GVPHWQARRLGLVGVAPAATSRRWADIGLG*

 

 

>Plesiocystis pacifica SIR-1 1103186006250, whole genome shotgun sequence.

ACCESSION   NZ_ABCS01000107

Bacteria; Proteobacteria; Deltaproteobacteria; Myxococcales; Nannocystineae; Nannocystaceae; Plesiocystis.

CDS             11570..13018

MNEHGLPVVIAACLAAPAAYLGALAAINGWRRPGEPPLIRGAVP

YLGAALPFGRDAMRFLDGLRAEHGELFTVFIGGRRMTFALDPMAVPALLKAKQLSFAP

VADEVVDLAFALPKVREYAAIHALERASKDFLKGAALSPLTARMEAQLDALLPEYVDA

LTPAGQGHAEADLYRFIWDLMFAAGTDALYGQGVATPALSKAFDDFDQAFPLMLARVP

DFIVREGIAGREFLVAPMAASTGPEAPAQSEWMAKRAEIMAAEDPEFRGRIQVSVLWA

AHANTIPATFWAVAHLLRHPEALAAVRAELEASEGLRAGDRSTATLDQLRHLDSAIRE

SLRLSSGSLTLRLAAEDCELELPTGRFRLRKDDQVAIAPFLTHRDPEIFPDPEAYQHD

RFYVEKGVKQFFKAGKRVPMPLMPFGAGVSMCPGRFFAVNEIKLCVTLLLSRYDLELV

DDGPLPSYDLSRVGLGIYPPSQDVRVRITARS

 

Chlamy CYP772A1 vs Plesiocystis 31%

Query:    15 LPVPAVAASVVAVIITTVLLAVFRS--RPGDAPSVPG-LPLLGSAMALGRHGVAFINKCR 71

             LPV  +AA + A       LA      RPG+ P + G +P LG+A+  GR  + F++  R

Sbjct:     6 LPV-VIAACLAAPAAYLGALAAINGWRRPGEPPLIRGAVPYLGAALPFGRDAMRFLDGLR 64

 

Query:    72 QQFGNSFSLSLAGVKMTFLFDPQHIDYFFGAPDSKITFRGCPAVEQFTQRVFGLTS-RLF 130

              + G  F++ + G +MTF  DP  +     A   +++F   P  ++     F L   R +

Sbjct:    65 AEHGELFTVFIGGRRMTFALDPMAVPALLKA--KQLSF--APVADEVVDLAFALPKVREY 120

 

Query:   131 FPLHFKMLTELRHLLVPASIAAHMQALGGRVLALLPLYVH--------HPQVDLYSLCRG 182

               +H  +    +  L  A+++     +  ++ ALLP YV         H + DLY    

Sbjct:   121 AAIH-ALERASKDFLKGAALSPLTARMEAQLDALLPEYVDALTPAGQGHAEADLYRFIWD 179

 

Query:   183 LVFHCAGGEGGHQRPPEGVHRLARDFFAFEDGFELAASPVPHAFQPEFTAARQRLLALLA 242

             L+F  A G             L++ F  F+  F L  + VP     E  A R+ L+A +A

Sbjct:   180 LMF--AAGTDALYGQGVATPALSKAFDDFDQAFPLMLARVPDFIVREGIAGREFLVAPMA 237

 

Query:   243 AADARGLFAGT-LAGQLLERTAGLPPALRPNLLLAVLWASQANTVPATFWATGFLLLPEN 301

             A+      A +    +  E  A   P  R  + ++VLWA+ ANT+PATFWA   LL   

Sbjct:   238 ASTGPEAPAQSEWMAKRAEIMAAEDPEFRGRIQVSVLWAAHANTIPATFWAVAHLLRHPE 297

 

Query:   302 AHHRAAVLAELQAE--LKGAVSAAGSPGGSAAYSNEVAEALRLRVQSIDVRIAADHLELP 359

             A   AAV AEL+A   L+    +  +        + + E+LRL   S+ +R+AA+  EL

Sbjct:   298 A--LAAVRAELEASEGLRAGDRSTATLDQLRHLDSAIRESLRLSSGSLTLRLAAEDCELE 355

 

Query:   360 LAGVKGGGGDVLRLPRGRLLAICPFVSHHDTQLYGSSPWAFNPDRPELKLG------DGT 413

             L   +       RL +   +AI PF++H D +++   P A+  DR  ++ G       G

Sbjct:   356 LPTGR------FRLRKDDQVAIAPFLTHRDPEIF-PDPEAYQHDRFYVEKGVKQFFKAGK 408

 

Query:   414 AVVSSVAGLAFGGGPYRCPGRFFAEQELGLLVQLLLWTYDIQL 456

              V   +  + FG G   CPGRFFA  E+ L V LLL  YD++L

Sbjct:   409 RVPMPL--MPFGAGVSMCPGRFFAVNEIKLCVTLLLSRYDLEL 449
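
Percent identity for a pasted alignment block like the ones above can be recounted directly from the Query and Sbjct strings. The following Python sketch assumes both strings are already aligned, have equal length, and use '-' for gaps. The function name percent_identity is hypothetical, and the denominator (all alignment columns) is an assumption: the listing does not say how its 31% figure was calculated.

```python
def percent_identity(query, sbjct):
    """Return (identical columns, total columns, truncated percent).

    Assumes two pre-aligned strings of equal length with '-' for gaps.
    A column counts as identical only if both residues match and are
    not gap characters.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have equal length")
    if not query:
        raise ValueError("alignment is empty")
    columns = len(query)
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    return identical, columns, 100 * identical // columns


if __name__ == "__main__":
    # Last block of the Chlamy CYP772A1 vs Plesiocystis alignment above
    q = "AVVSSVAGLAFGGGPYRCPGRFFAEQELGLLVQLLLWTYDIQL"
    s = "RVPMPL--MPFGAGVSMCPGRFFAVNEIKLCVTLLLSRYDLEL"
    print(percent_identity(q, s))
```

To recount a whole alignment, concatenate the Query segments and the Sbjct segments of every block before calling the function. Per-block values vary widely and should not be compared with the overall figure.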

 

>CYP7B1 NM_004820 human

          Length = 506

 

 Score = 96.7 bits (239), Expect = 2e-22

 Identities = 116/493 (23%), Positives = 199/493 (40%), Gaps = 78/493 (15%)

 

Chlamy CYP772A1 vs human CYP7B1

Query: 11  ALDALPVPAVAASVVAVIITTVLLAVFRSRPGDAPSVPG-LPLLGSAMALGRHGVAFINK 69

           +L+ L +P +A +   +++   LL     RPG+ P + G LP LG  + L +  + F+ 

Sbjct: 13  SLERLGLPGLALAAALLLLALCLLVRRTRRPGEPPLIKGWLPYLGVVLNLRKDPLRFMKT 72

 

Query: 70  CRQQFGNSFSLSLAGVKMTFLFDPQHIDYFFGAPDSKITFRGCPAVEQFTQRVFGLTSRL 129

            ++Q G++F++ L G  +TF+ DP               F+    ++   Q  F + S 

Sbjct: 73  LQKQHGDTFTVLLGGKYITFILDP---------------FQYQLVIKNHKQLSFRVFSNK 117

 

Query: 130 FFPLHFKMLTELRHLLVPASIAAHMQALGGRVLALLP-------LYVHHPQV-------- 174

                F +    ++  +   +    Q L G+ L +L          V  PQ+       

Sbjct: 118 LLEKAFSISQLQKNHDMNDELHLCYQFLQGKSLDILLESMMQNLKQVFEPQLLKTTSWDT 177

 

Query: 175 -DLYSLCRGLVFHCAG----GEGGHQRPPEGVHRLARDFFAFEDGFELAASPVPHAFQPE 229

            +LY  C  ++F        G+       + +  L  DF  F+D F    S +P     

Sbjct: 178 AELYPFCSSIIFEITFTTIYGKVIVCDNNKFISELRDDFLKFDDKFAYLVSNIPIELLGN 237

 

Query: 230 FTAARQRLLALLAAADARGLFAGTLAGQ----LLERTAGLPPALRPNLLLAVLWASQANT 285

             + R++++   ++     +   +   Q    +LE+             L  LWAS ANT

Sbjct: 238 VKSIREKIIKCFSSEKLAKMQGWSEVFQSRQDVLEKYYVHEDLEIGAHHLGFLWASVANT 297

 

Query: 286 VPATFWATGFLLLPENAHHRAAVLAELQAELKGAVSAAGSPGGSA--------------A 331

           +P  FWA  +LL    A      +A ++ E+   + + G   GS               

Sbjct: 298 IPTMFWAMYYLLRHPEA------MAAVRDEIDRLLQSTGQKKGSGFPIHLTREQLDSLIC 351

 

Query: 332 YSNEVAEALRLRVQSIDVRIAADHLELPLAGVKGGGGDVLRLPRGRLLAICPFVSHHDTQ 391

             + + EALRL   S  +R   + L L         GD   + +G L+AI P V H D +

Sbjct: 352 LESSIFEALRLSSYSTTIRFVEEDLTL-----SSETGDYC-VRKGDLVAIFPPVLHGDPE 405

 

Query: 392 LYGSSPWAFNPDRPELKLGDGTAVVSSVAG--------LAFGGGPYRCPGRFFAEQELGL 443

           ++  +P  F  DR    + DG    +            + FG G  +CPGRFFA  E+ 

Sbjct: 406 IF-EAPEEFRYDR---FIEDGKKKTTFFKRGKKLKCYLMPFGTGTSKCPGRFFALMEIKQ 461

 

Query: 444 LVQLLLWTYDIQL 456

           L+ +LL  +D+++

Sbjct: 462 LLVILLTYFDLEI 474
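
The summary line of the CYP7B1 comparison (Identities = 116/493, Positives = 199/493, Gaps = 78/493) can be checked with simple arithmetic. The reported percentages are consistent with truncation rather than rounding: 78/493 is about 15.8% but is shown as 15%. The Python sketch below assumes integer truncation, and blast_percent is a hypothetical helper name.

```python
def blast_percent(count, length):
    """Percentage of `count` over `length`, truncated to an integer."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return 100 * count // length


if __name__ == "__main__":
    length = 493  # alignment length from the summary line
    for label, count in (("Identities", 116), ("Positives", 199), ("Gaps", 78)):
        print(f"{label} = {count}/{length} ({blast_percent(count, length)}%)")
```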