472 cytochrome P450 sequence pieces from Amphioxus. Very fragmentary, with two haplotypes for many genes.

From JGI Branchiostoma assembly Jan 30, 2008

Search for P450 at 1.0e-5 or less (481 results, some false positives)

 

This file has clans 7, mito, 19, 20, 26, 46, 51, 74
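A minimal sketch of how the records in this file could be read programmatically. It assumes the layout seen in the entries here (a ">" header line, optional note lines, then sequence lines, with runs of "$" separating groups). This is an inferred convention, not a formal specification. The function names `split_records` and `identities` are hypothetical.

```python
import re

# Assumed record layout: a '>' header line, optional note lines,
# then sequence lines; runs of '$' separate groups. Not a formal spec.
IDENTITY = re.compile(r"(\d+)% to ([^,]+?)(?=,|\s+\d+% to|$)")

def split_records(text):
    """Yield (header, body_lines) for each '>' record; blank lines are skipped."""
    header, body = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"$"}:
            continue  # skip blank lines and '$$$$' separators
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line[1:], []
        elif header is not None:
            body.append(line)  # note lines and sequence lines, in file order
    if header is not None:
        yield header, body

def identities(line):
    """Return [(percent, target)] for every 'NN% to TARGET' phrase in a line."""
    return [(int(p), t.strip()) for p, t in IDENTITY.findall(line)]
```

Text before the first ">" header (such as the clan headings) is ignored by this sketch, and note lines inside a record are returned together with sequence lines, so a caller would still need to tell them apart.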

 

CYP7 clan (12 sequences); includes CYP39, no CYP8 sequences found

 

$$$$$$$$$$

 

>fgenesh2_pg.scaffold_10000055|Brafl1 41% to CYP7A1

MPCSSCLAVMIKVMIPTTSTDHGWNLYPPVSLCRQGGVTPPTGWVVTPESYAILCPSWQQVFCEQYQCLSAATLLPAGLL

EAPPLFSYYSGARGVVHGGQGPVGHAPPPDSEMFSENVEYSGAITEQELWFEIYCLFQWNKTVFLFRVRGSSSSFEVTQT

NSPDSVRQALITAGLSSTRLRRAGGTCSTRDDYTYIMWKFLITLTSCFCMKRSNKVSPVEDGDVKEETAPGEEEMNRGTT

ECPVVTAQPPMSQPRSSKSSADVLAELRQDGLLPLNTRGESVAFQVPASEPDAPPRRPVKLAKLEETLQERRERVKKEPA

GSRTKLRQQLSDAANRRDEMLQNRSRKLAESSRRAKAKARAAKKERKSTAFVISSVSDTDAIVPRDSEKAQALEKRLSKR

RKRVAKRITAEDMKKQQELAAERRRRSNKVSPVEDGDVEKEPAPGEEEMVRGTTECPVVTAQPPMSQPRSSKSSADVLAE

LRQDGLLPLNTRGESVAFQVPLVKPASEPDAPPRRPVKLAKLEETLQERRERVKKEPAGSRSKLRQQLSDAANRRDEMLQ

NRSRKLAESSRRARAKARAAKKEGKSTAFVISSVSDTDAIVPRDSEKAQALEKRLSKRRKRVAKRITAEDMKKQQELAAE

RRRCHIDRLAYLSTYSK

 

MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYG

DVFTVRLAGHYTTFVLDPHSFTHAIRNSKVLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTE

VMMNNLQSAMLAATDVKDKWNKGELWSFVYRIMFSASYKTLFGRHKEDEEETARLLHAMEEFQKYDKRFPEIISNVPWWL

MGQTKKRYEYLKSMVSPTELSQRGVSDFIRMRQEIYADGNLSPDEMTGFNFATMWASLSNTVPAAFWTLFYLLKDPVAMD

AVREEVNQILKETGQSLETVKEAGEMLHVTREQLNDMKCLGSAINEALRMCSASIIIRVATEDAELALESGSTFRIRKGD

RVALYPGFLHMDPEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGASMCPGRFFALNEIKQFVTIVVCYFN

MELMEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*

>fgenesh2_pg.scaffold_63000051|Brafl1 38% to CYP7A danio

96% to fgenesh2_pg.scaffold_10000055|Brafl1

MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYGDVFTVRLAGHYTTFVLD

PHSFTHAIRNSKVLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTEVMMTNLQSAMLAATDVK

AEWNKGELWSFVYRIMFSASYKTLFGRHKEDEEETARLLHAMEEFQKYDKRFPEIISNVPW

(gap)

CLMGQTKKRYEYLKSNTVP

AAFWTLFYLLKDPVAMDAVRAEVDQILKETGQSLETVKEAGKMIHVTREQLNDMKCLGSAINEALRMCSASIIIRVATED

AELALESGSTFRVRKGDRVALYPGFLHMDPEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFF

ALNEIKQFVTIVVCYFNMELMEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*

>fgenesh2_pg.scaffold_1047000003|Brafl1

only 6 aa diffs to fgenesh2_pg.scaffold_10000055|Brafl1

MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLD

PHSFTHAIRNS STGGTEDQRSTTCVQIR ASYKTLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKR

YEYLK

DTCRHDRQCISRHGVGSCCAPRRPIFSPLPVCKSAGQVGDTCQRSGERLAYPTSVGRRQYIFTCPCAEGLQCELF

SGYADIGTCVPVQY*

 

>estExt_fgenesh2_pg.C_4350040|Brafl1 25% to CYP8b.c, 23% to CYP7B1 human

MGSVLGTLQLLGWNNQMLKPNREEDFVEKNIGFPCRVVTGNKTVQSVFDIDLFKKEEFCFGVVGEVRKDFTEGVCPCILS

NGKIHEKNKGFLMEVIAKAGEDIPPSTALSVLSNISKWGSTPMSDFESKLTDVAADAFLPNIFGESTHFHGEEIRLYRSG

AI AVRLSIVKALTGRNLDEERRAMTSILEKIKTSERYQQLLDLGKSYGLGEKEATAQLLFPVFINGAYGLAAHLVCTFAC

LDTISAEDREELREEALAALKNHRGLTRESLEEMPKIESFVLEVLRFCPNPVFWSTIATCPTTVEYTTDSGEHTLKIEEG

ERVYASSYWALRDPAVFDKPEDFMWRRFLGPEGDALRKHHVTFHGRLTDTPAVNNHMCPGKDVSLSALKGSIAIFNTFFG

WELQEPPFWTGKKLSRGSLPDNEVKIKSFWVQHPE DLKEIFPSHFQDIVNEVDDVGDIDVLVKTKTGKYSGSGTNSNVYI

RLFDDKGHQSRELQLDVWWKDDFEKGQEGQYKLKDIKVAAPIVKIELFRDGCHPDDDWYCESVSVQLNPDNNGPTYDFPV

NRWIRQNDHVWLSPGGGEPPKDDVNPIDD*

 

>CYP7 estExt_fgenesh2_pg.C_10470002|Brafl1 40% to CYP7B1

no allele

MISGILAGCLVVLVVAILVQAVGRKRDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLD

PHSYSDVMRQHKILDFKTVGMDIVERGFGTTHFEKTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADIGEAA*

 

$$$$$$$$

 

>estExt_fgenesh2_pg.C_1950037|Brafl1 27% to CYP7D1, 30% to CYP7B1

MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIHYAINPETYKKEPYSFGPVGVSKDVLRGHCP

SMFSNDEDHRRKKALLVDAYKQGEKSLPSILFNQIKAHFGEWSRLKDVPDFEERVFHIMSETLTEALFGRKIDGQLCFTW

LNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQTAAIVS

SVARLHTLSDAEKN EIIQTTLQVLEKHGGVSEESLGEMKTLESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVRKG

ERMLGCCFFAQRDGSVFPDPDRFRWNRFLDEQGGQKKHLFFPRGSFTEAADLNSHQCPGQDIGFFMMKTTLSVFLCYCSW

ELKDAPVWSDKPIRVGNPDDPVRLVRFNFRSEQ AGRALTQGNRLVLIRAQVCLAVWTLTHLSVSRLVLKLDATTMPRNQR

APGSGGLPVSERRTRGHEKEIEAGWERSKFNEFVSDLVSLERSLPDTRPVRCHKAQVLDNLPTTSVIICFCEEAVSTLLR

SVHSVINRSPPHLLKEIILVDDASTAAYLKEDLDTYMSKFPQVKIVHLPEREGLIRARLRGAEIATGDVLTFLDSHIECN

VGWLEPLLDRIGRNRTTVPCPSIDRINDNTFGYEAANENMRGGFNWGMKFDWVSLPPGEDDRRYQDIWSQNEIIKSPTMA

GGLFSIDRRFFWELGGYDPGFQIWGAENLEISFKDIFYALNPHVENEIANAGDVSDRKRMREQLGCKSFQWYIDHVYPEI

TIPDLRAKARGEVKNRAMSLCLDAVYGEKVGAYFCHGEGGQQSFTLRMDDKIMLRWFFSVCLAAGLPIRNHKGAFLLTKK

PCTAPEVIAWNHTKGGPLVDQKTGKCLGVVNLSPEEHLVALRPCNQQRVQDWTFQNYLVDM*

>estExt_fgenesh2_pg.C_3320046|Brafl1 27% to CYP7D1

MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIQYALNPETYKKEPYSFGPV GVSKDVLRGHCP

SMFSNDEDHRRKKALLVDAYKQGEKSLSSILFNQIKAHFGEWSRLKDVPDFEERVFHIMSETLTEALFGRKIDGQLCFTW

LNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQTAAIVS

SVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGDMKTLESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVCKG

ERMLGCCFFAQRDGSVFPDPDRFRWNRFLDEQGGQKKHLFFPRGSFMEAADLNSHQCPGQDIGFFMMKTTLSVLLCYCSW

ELKDAPVWSDKPIRVGNPDDPVRLVRFNFRSE QAGRALVNTSAKKI*

 

 

>estExt_fgenesh2_pg.C_1940045|Brafl1 27% to CYP7D1

MGGVWSDTFGFIKGLVHGPHMMKPEGEHPSVFRANPGVPAVVLLNRDTIQYAFNPETYEKEPYSFGPVCAAKDVVGGHCP

SMFSNDEDHRRKKALLIDVYKQGQKTLPSVFFSQIKAHFEEWSRLEDVPDFEERVFHITSETLTEALFGKKIDGRLCYTW

GNGIPTDFRTWIPIPPAARKRRQAVEVLPALLKAIKETPKYQELVQLCHTHGVEVEEGILTILYGTLFNGCGAQTATIIS

SVACLHTLSDAEKNEIIQTTLQVLEKRGG ISEESLSEMKTLESFILEVLRLHPPVFNYWALARKDLVISPEKENIKVCKG

ERMVGSCFWAQRDGSVFPDPDRFRWNRFLDEDEQGGQKKHLFFPRGSWTEAADLDSHYCPGQDIGFFILKVLLAVLLGYC

SWELKDAPV WSDNTFRLGNPDDPVRLARFNFRSEQAGRALGIRPDNIAPNAI*

>estExt_fgenesh2_pg.C_510020|Brafl1 30% to CYP4V6

90% to estExt_fgenesh2_pg.C_1940045|Brafl1

87% to estExt_fgenesh2_pg.C_3320046|Brafl1

87% to estExt_fgenesh2_pg.C_1950037|Brafl1

MKPKGEHPSAFRMNNGVPAVVLLTRDTIQYAFNPETYEKDPYSFGPGGVSKDVVRGHCPSMFSNDEDHRRKKALLIDVYK

RGQKTLPSVFFSQIKEHLEEWSRLEDVPDFEERVFHIMSETLTEALFGRKIDGELCFTWLNGLLTDFKTWIPIPSMSRKR

RLAIEALPALLKAIKEAPKYQELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQCAAIVSSVARLHTLSDTEKNDIIQTTL

QVLEKHGG VSEESLGEMKTLESFILEVLRLHPPVFNFWCLARKDLVISPEKENIKVCKGERMVGCCFWAQRDESVFPDPD

RFRWNRFLDEDKQGGQKKHLFFPRGSWTEAPDLDSHQCPGQDIGFFMMKALLAVLLGY CSWELTAAPMWSDKTIRVGNPD

DPVRLARFNFRSEQAGRALGIRPDNIAPNAI*

 

$$$$$$$$$

 

>CYP39 amphioxus 49% to CYP39 zebrafish,  start MET not certain, 2 choices

MATTIGEHSPGDELYNAFKY

MILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRK (0)

LGPVFTIVAAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHT (1)

ASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQLEQLEHHGKDDLNTLVRR (2)

CMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLR (2)

EWAESKKWLLSLFSRSIANMERKETESQ (0)

TLLQSLTKMVDRPHAPNYALLMLWASQANAVP(0)

MSFWVLAMILSNEDVHAAVKKEVQDNLGSP (1)

GDEPITEEDLKKLPLLKRCIMETIRLRSPGVITRAVDKPLRIR (0)

KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP  (0)

DRWLDADLEKNLFLDGFVGFGGGRYQCPGR (2)

WFALMEMQMLLAMMIQMFDFKLLGEVPKEVCQNFNYLISIHII*
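Exon-by-exon records like the one above can be joined into a single protein string with a small helper. The sketch below assumes that a parenthesized 0, 1 or 2 beside a fragment is an intron-phase marker and that anything other than uppercase letters and "*" on a sequence line is formatting. Both are assumptions about this file's notation, and the names `parse_exon_line` and `assemble` are hypothetical.

```python
import re

# Assumption: "(0)", "(1)", "(2)" next to an exon fragment are intron-phase
# markers; digits and spaces elsewhere on the line are treated as formatting.
PHASE = re.compile(r"\(([012])\)")

def parse_exon_line(line):
    """Return (residues, phases) with phase markers and non-residue characters removed."""
    phases = [int(p) for p in PHASE.findall(line)]
    residues = re.sub(r"[^A-Z*]", "", PHASE.sub("", line))
    return residues, phases

def assemble(lines):
    """Concatenate exon fragments in order; a trailing '*' (stop) is dropped."""
    seq = "".join(parse_exon_line(l)[0] for l in lines if l.strip())
    return seq.rstrip("*")
```

Only sequence lines should be passed in. Note lines such as "(gap)" or "96% to ..." are not recognized by this sketch and would need to be filtered out first.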

>fgenesh2_pg.scaffold_124000018|Brafl1 45% to CYP39A1

MATTIGEHSPGDELYNAFKYMILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRKLGPVFTIV

AAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHTASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQL

EQLEHHGKDDLNTLVRRCMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLRSIANMER

(gap)

KETESQT

LLQSLTKMVDRPHAPNYALLMLWASQANAVPMSFWVLAMILSNEDVHAAVKKEVQDNLGSPGDEPITEEDLKKLPLLKRC

IMETIRLRSPGVITRAVDKPLRIRKYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLPDRWLDADLEKNLFLDGFVGFGGGR

YQCPGRWFALMEMQMLLAMMIQMFDFKLLGEVPKESPLHVVGTQQPVGPCPVEWTKI*

>CYP39A1 fgenesh2_pg.scaffold_124000030|Brafl1 45% to N-term

MATTIGEHSPGDELYNAFKYMILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRKLGPVFTIV

AAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHTASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQL

EQLEHHGKDDLNTLVRRCMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLREWAESKKWLLSLFS

RSIANMERKETESQTLLQSLTKMVDRPHAPNYALLMLWASQANAVP

(gap)

KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP

LSDLDGETRSMVEKMMYDQRQKAMGLPTSDEQKKEDVLKKFMEQHPEMDFSKAKFC*

 

Mito clan (28 sequences, some duplicates)

 

$$$$$$$$

 

>CYP11amphi mixed seq 43% to Gene C, 35% to Gene B, 34% to gene D

36% to 27B1 fugu, 38% to 11A1 fugu, 33% to CYP24 fugu, 32% to 27C1 fugu

37% to chicken CYP11A1, 39% to catfish Ictalurus punctatus 11A1

This is a probable CYP11A gene

(2) EAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)

(2) LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)

(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV PNISDELFKWALE (1)

(1) SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVFRV(1)

(1) GEKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)

(0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885

(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)

(1) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)

(1) GRRVAEVELQLLLAK (0)

(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*

 

>fgenesh2_pg.scaffold_28000018|Brafl1 34% to CYP27C1 98% to 11amphi above 6 aa diffs

MMSVPVISGSRQRLSAVVGRAVSPWRPQGHIRVRALVGYRSGLVGPRTVPSPVQTYSTAAVGSTSHHNDDSEAKPFSALP

GPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFRLKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPW

RRYREISGKATGVFLSNGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEVPNISDELFKWAL

ESICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVQAWDTVFRVGEKVMVRKLQE

ALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDTTSNTLLWTLYELSRRPELQDRLHQEVTQVIGQDK

VMTWDHLKDLHLLKAIIKETLRMYPVAPNVSRVLQEDTVLMGYMLPAKTCVVAQVYAMGRDPQLFPDPDEFKPERWLRTG

EAHDEINPYSSLPFGFGPRSCLGRRVAEVELQLLLAKMSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*

 

$$$$$$$$$$$

 

this block related to gene B

 

>Gene B 84% to Gene D, 35% to CYP11 amphi, 33% to Gene C

30% to CYP24 Fugu, 30% to 27A3 fugu, 27B fugu, 27C fugu, 30% to 11A fugu

in nr blast best mammal hit is CYP24 mouse, but Drosophila hits are better.

34% to 49A1 D. melanogaster

    MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPF(1)

(1) GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNE

    GRYPERIELASIKVYREIKKLPTGLINL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)

(2) VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT

(0) SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)

(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)

(0) TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*

estExt_gwp.C_8820003|Brafl1   34% to CYP24                    2134  6.9e-224  1

 

>fgenesh2_pg.scaffold_214000064|Brafl1 3 genes fused,

31% to CYP24

MYQLLSAARHQGQSLFRVCGARSLAALKTPCRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPFGQFKMITN

LRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNEGRYPERIELASIKVYREIKKLPTGLINLNGPEWQRVRSS

VQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLGDLQPGSDA

KLMIDGVNDYFASLVKLEMSATGLYKYISTPTWRKFAKAIDQWHFVAAKLLKEKLAKSATKDGKPAESDTDFLQSLLSRS

DVTFEEAMLMAVDLMAAGIDTSGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRVYP

TVLNNVRRLDQDIVLSGYVVPAKTTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRF

AEQELHLGLIR

30% to CYP24

SNKAEESVTYDTAARPFEE

IPGPKGLPLIGTALEYTPFGQFKMITNLRGSFRERTRTYGSIYRERIGPL

DLVVISDPTEIEKVFRNEGRYPERIELASIKVYREIKKLPAGLINLNGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTR

DLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLGDLQPGSDAKLMIDGVNDYFASLVKLEMSATG

LYKYVSTPTWRKFAKAIDQWHLVAAKLLKEKLAKTATKDGKPAESDTDFLQSLLSRSDVTFEEAMLMAVDLMAAGIDTSG

NTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRVYPTVLNNVRRLDRDIVLSGYVVPAK

TTILMAHDVISSLPEYYPEPEVYRPERWLRDDESSSVQPFTLLPFGYGPRMCIDPNKKVRMY

31% to CYP27A1

RLQRAVRHQGQSLFRVCG

ARSLAALKTTVTQTQSTRAEESGVYDTAARPFEEIPGPKGLPFIGTGWDYSPFGRFPIKTNFRDSFRERTRTYGSIYRER

IGPLDLVVISDPKEIGKVFRNEGKYPERPPMGSIKTYREVRKLPTGIANLNGPEWQRVRSSVQKDLMRPKTVGAYASLQD

DVTRDLVDVIRALIGREESGGQVQNFTNYVYRWALEAISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEM

SATGLYKYISTPTWRKFAKAVDQFHSVAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGI

DTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVRQETFRIYPTALSNMRTLDRDMVLSGYA

VPAKTIVLMAHDVISSLPEYYPEPEVYRPERWLRDDESSGVQPFTLLPFGYGPRMCIGRRFAEQELHLGLIRIVQNFHVG

WAGEDMKQVHRLILSPDRDTFVFSERT*

 

>fgenesh2_pg.scaffold_214000063|Brafl1 34% to CYP24

MSLLQRAVRQQGQSLFRVCGVRSLAALKTTYRLQSTRAEESVADDTAARPFEEIPGPKGLPLIGTALEYSPFGRFPIKTN

LRSSYRERTKIFGSIYREKIGPLDLVVISDPKEIEKVFRNEGRYPERLPLESIKAYRELKKLPAGVVNLNGPEWQRVRSS

VQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLDDLEPGSDA

KLMIDGVNDFFDSFVVLETSATGLYKYISTPTWRRFEKAIDQWHTVAAKLLKEKLAKGATEEGKPAESDTDFLQSLLSRN

DVTFEEAMMTVVELLAGGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIGDKVLNRMHYLRAVVKETFRVYP

TVPNNLRKLDRDIVLSGYRVPAKTTVFMVDDVISSLPEYYPEPEVYRPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRF

AEQELHLGLIRIVQNFHVGWAGEDMKQVNRMVFAPDRDTFVFSERT*

 

>fgenesh2_pg.scaffold_214000062|Brafl1 two genes fused

30% to CYP27A3 31% to CYP24

MQTLFSDWTGFSAFWTGQIFPKTPHTIDDFDSGLGSQSTRAEESVAYDTAARPFEEIPGPKGLPLIGTGLDYAPFGRFPL

KTHLRESFRERTKAYGSIYREKLGPLDLVVISDPKEIEKVFRNEG

(gap)

RNGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTR

DLVDVIRALIGKEGSGGQVQNFTNFVYRWALEAISVVVLDKRLGCLTLDDLEPGSDAKLMIDGVNDFFNAAVKLELSGAG

RLYKYISTPTWRKFANAIDQWHGVAAKLLKEKLTKSAAEDGKPAESDTDFLQSLLSRNDVTFEEAMLMAVDLMAAGIDTT

GNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRLCPTVGNNIRTLDRDMVLSGYVVPA

KTKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQELHLGLIRLAVRHQG

QSLLRVCGARSLAALKPTY

25% to CYP24

RLQSTRAEESVADGTAARPFEEIPGPKGLPLIGTALDYTPFGRFPLKTNFRESFRERTRTYGSIY

REKIGPRELVVISDPKDIQKVYRNEGRYPERPQVDSIKTYREMKKLPAGIVVLNGPEWQRVRSSVQKDLMRPKTVGAYAS

LQDDVTRDLVDVIRALIGKEGSGGQVHNFINYVYRWTLESIGVVVLDKRLGCLTLGDLEPGSDAQLMIGGVNDFFNAFSK

LEMSATGLYKYISTPTWRKFQKAIDQWHTVAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLL

(gap, missing I-helix) 37% to 27C1

VYPTFLNNVRTLDRDIVLSGYVVPGKTIIIIGNDIISSLSEYYPEPEVYKPERWLRDDEFSSVQPFTLLPFGYGPRMCIGR

RFAEQELHLGLIRIVQNFH

VGWAGEDMKQENRMVFAPDRDTFVFSERT*

 

>fgenesh2_pg.scaffold_214000072|Brafl1 33% to CYP27A3

MATGRATSRRNGQWGATLAIREINGPEWQR

VRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKKESGGQVQNFT

NYVYRWALEAISMVVLDKRLGCLTLNDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGLYKYISTPTWRKFAKAFDQWHAV

AEKLLKEKLAKSAAEEGKPAESDTDFLQRLLSSKDITFEEAMMMAVDLMAAGIDTTGNTLMFNLFCLAKNPEAQEKLYRE

IQEVVPAGQPIDDKVLNRMHYLRAVRQETFRFYPTVLSNTRILDRDVVLSGYFVPAKTIVLMAHDVISSLPVYYPEPEVY

KPERWLRGDESSSVQPFALLPFGYGPRMCIGRRLAEQELHLGLIRIVQNFHVGWAGEDMKQNNRIILAPDRDTFVFSART*

 

>e_gw.882.7.1|Brafl1

RERTKIFGSIYREKIGPLDLVVISDPKEIEKVFRNEGRYPERLPLESIKAYRELKKLPAGVVNLNGPEWQRVRSSVQKDL

MRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVHNFINYVYRWALEAISVVVLDKRLGCLTLDDLEPGSDAKLMID

GVNDFFDSFVVLETSATGLYKYISTPTWRRFEKAIDQWHTVAAKLLKEKLAKSAAEDGKPAESDTNFLQSLLSRSDVTFE

EAMMTVVELLAGGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIGDKVLNRMHYLRAVVKETFRVYPTVPNN

LRKLDRDIVLSGYRVPAKTTVFMVDDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQEL

HLGLIRVGSFAV*

 

>estExt_gwp.C_8820003|Brafl1 34% to CYP24

MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAARPFEKIPGPKGLPLIGTGLDYAPFGRFPLKTHLRESF

RERTKAYGSIYREKLGPLDLVVISDPKEIEKVFRNEGRYPERVQLESVRTYREIKKLPIGVVNLNGPEWQRVRSSVQKDL

MRPKTVGAYASLQDDVTRDLVDVIRALIGKEGSGGQVQNFTNFVYRWALEAISVVVLDKRLGCLTLDDLVPGSDAKLMID

GVNDFFNAAVKLEMSGAGRLYKYISTPTWRKFANAIDQWHGVAAKLLKEKLAKSAAEEGKPAESDTDFLQSLLSRSDVTF

EEAMLMAVDLMAAGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFRLCPTVGN

NIRTLDRDMVLSGYVVPAKTKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQE

LHLGLIRVSFVALFRH*

 

$$$$$$$$$

 

>Gene D 84% to gene B, 34% to CYP11 amphi, 30% to gene C

31% to CYP24 fugu

    MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR

    PFEKIPGPKGLPLIGTGLDYAPF (1)

(1) GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE

    GRYPERPQVDSIKTYREMKKLPAGIVVL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)

(2) VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)

(0) TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)

(2) LCPTVGNNIRTLDRDMVLSGYVVPAK (0)

(0) TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*

 

$$$$$$$

 

>GENE F 61% TO GENES D AND B

    MSRILQIVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAARPFDEIPGPRGLPFIGTALDYSPF (1)

(1) GRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRRLKNKPLGVALL (2)

(2) NGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLE (1)

(1) ALSLVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSS (2)

(2) ITEKMIGERLEKLRQMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDT (0)

(0) TAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRVLNKMHYLRAVVKETFR (2)

(2) MYPTLLSTARTLTRDVVLSGYHVPAK (0)

(0) TNVMLAQNVISTLPEYYPEPESYIPERWLRTESSNVQSFSLLPFGYGPRMCI (1)

(1) GRRFAEQELYLGLVR (0)

(0) IIQNFHVGWDGEDMKQVWRIFNAPDRDTFVFSERKS*

 

>fgenesh2_pg.scaffold_119000067|Brafl1 29% to CYP11A1

same as gene F above

MTHTGNADGSVHGIEILANGSLQDKYSLSQGDMDGPIVPVNETITADGVQRNVILVNDQFPGPTLEVMEGAQVVVTVVNE

LLREATSLHFHGMYMRGVPYMDGVPYVTQCPILPMHSFTYRFKAEPAGTHWYHSHLGSQKEDGLYGAFIVHKNSIPTTPS

LPMFLQDWWHDDFNTIDVDSAYMEHRGPGRFFGPWQERGFSFEGTELTALNFKSALINGRGRYNNNSAPLTRFEISSGET

LRFRLINAGAEYTFRVSIDAHSMTVVANDGHDVEPVHVQSILVFPGESYDFEVVGDPSNSGTYWIRAQTLWAGKGPDVEP

EDRLQEVRAILAYDNAPTDEDPNSAMQTCTENSPCRVLNCPFPAFPAGSNTECIYVSDLNSTEEYSMSDESETEEYFFNF

GYQIGSSVNGRKFDTPKKPLIFKAPYDITPCEATCETDGCKCTYMVEIPLGKTIRFVLMDLGVESEGHHLIHLHGYDFRV

LAMGFPVHNETTGRWISQNADIDCGNDNKCNMASWNVTRPNLNYNKPPIRDTVVIPARGYTVIEFRSNNPGFWYFHCHQT

THMNEGMSMIIAEALDKLPALPYGFPTCGDFTGTEKPPGRGRTVAAMEQSVTKVELDHTQLVIIIVISAAMSATIALAAV

GIYNARAKVNAFQRQVVKRSYVVCDQALGPQVLTTDKPLDTRHKPRGIMHLLNAFILPCLCVTMATTQRCTDDVCEFTLV

VRYARTMTHTERDGEVHGIEILTNGSLQDKYSLSQGDMDGPIVPVEETITADGVQRNVIVVNDQFPGPTLEVIEGAQVVV

TVVNNLLREATSLHFHGMYMRGVPYMDGVPYVTQCPILPMHSFTYRFMAEPAGTHWYHSHLGSQKEEGLYGAFIVHKNSM

PTTPSLPMFLQDWWHDDFNNIDVDSAFMEHRGPGRFFVPWQNRGFSFDGNKLSSVRFISALINGRGRYNNNSAPLTRFEI

SPGETLRFRLINAGAEYTFRVSIDAHSMTVVANDGHDVEPVQVQSILVFPGESYDFEVVGDPSNSGTYWIRAQTLWAGKG

PDVEPEDRLQEVRAILAYDNDPTDEDPNSDMQNCTENSPCRVLNCPFPAFPAGSNTECVYVSDLNSTEEYSMPDESETEE

YFFNFGYQIGSSVNGRKFATPKKPLIFKAPYDITPCEATCETDGCTCTYTTEIPLGKTIRFVLMSLGFGSGGHHVIHLHG

YDFRVLAMGFPEYNETTGRWITQNDDINCGDDNKCNMAAWNVARPNLNYNKPPTRDTVVIPARGYTVIEFRSNNPGFWLF

HCHQTTHMKEGMSMIIAEALDKLPALPYGFPTCGDFTGTEKPPGRGRTAAAMEQSVTLVELDNTQLVIIIVVSAAMSATI

ALAAVGIYNARVNKSKEKMIDTP IVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAARPFDEIPGPR

GLPFIGTALDYSPFGRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRR

LKNKPLGVALLNGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLESLS

LVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSSITEKMIGERLEKLR

QMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDTTAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRV

LNKMHYLRAVVKETFR CAINSIMARHRTLHHGHRRKLSFIIPVLLVYVLVSAFLDLTYSGYMAKHVSDGDSHQTITTTEG

TNMTKLLWEGLSRLEQMDQQRANLTEKLKNIAKMANVSEEAIGPWLSQLRPMTIVDAPAGNRTALLTCQDIAEIRISNPM

GKGVTKVVELGNYQGHGVAVKRVLPTVKDVRECKRTIERSGWNKCFVFPNYKLLKEILLLQQLKHPNIVQLLGYCVQNEE

TDENLAEHGVVSVTEMGTKFHVGRARKMDWKMRLKMAIDLASLLDYLEHSPMGSLLMADFKVEQFVWVGGKVKLTDLDDV

SNVERKCAVDSDCWVDKKDVGVPCTNGSCRGLNAKHNMNGAYKTILRHIMVHTGTEETALREDLRSVSISAASLHSRLLQ

LLDKELAIDSPTHR*

 

$$$$$$$

>Gene G 55% to amphi 11

MFLGLMRCQTPSQTYSTGPQAASHPQLDPP

AKPFSALPEPMKGLPGILKTLVVLCTGGMSRKAQLKSHVVIGQLFQMYGPILR (2)

NRFGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLG (2)

NDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPDGTEVLDLENELFKWALE(1)

SISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDT (0)

GEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT (0)

TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLR (2)

LHPVAFAITRVIQQDTVLMGYKIPAK

TVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCI (1)

GRRVAETELQVLLAK (0)

ICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIERQ*

>e_gw.241.76.1|Brafl1 33% to CYP27C1, 99% to gene G above

FGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLGNDKKWHKNRTVVSRPMLRPQSVAAYVLKID

DVATDMLQHIRSVRAGPDGTEVLDLENELFKWALESISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDAR

LHKLLNTKSWQKNKQAWDTVFKIGEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT

TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLRLHPVAFAITRVIQQDTVLMGYKIP

AKTVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCIGRRVAETELQVLLAKICQQFVLKQ

RNPRVIPAMTKGILMPAEKMDICFIERQ*

>fgenesh2_pg.scaffold_140000032|Brafl1 31% to CYP11A2

90% or more to gene G above

MIRLCALTQRRSAATIVGRWLDFHRGARAASQGLLRCQTPNQPYSSGPQAASHPQLDPPVKPFSALPEPMKGMPGILKFL

VVLCTGGMSRKAQLKSHMMIGQLFQMYGPILRNRFGNFDMVNTCDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELA

VLLGNDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPEGTEVLDLENELFKWALESISAVLFNERM

GLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDTVFKIGEKVMDRQLQRAEERQAR

(gap) 37% to CYP27A.c

GEADDGQLDFLSFISSREKLTKEEIYANAIELMGAAIDT

VNSTSMSITLSQLVTDTVHE

TSTTLLWTLYQLCHRPDLQDKLYQEV

TQVIGQDEVITFDHLKNLHLFKAVIKETLRLHPVAFAITRVIQQDTILMGYEIPAKTVVMVSLYDMARDPRLYKHPEEYR

PERWLRGAEDYVDTHPYAYLPFGFGTRSCIGRRVAETELQVLLAKICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIE

RQ*

 

$$$$$

 

>fgenesh2_pg.scaffold_283000056|Brafl1 29% to CYP24

MSHILKIAGRRTAVRHQLRLPGFWRFCGRQGVRGAATTATAAEQVAPEETVRPFQE

IPGPKGLPFIGTALDYSPFGRFPIHTQLGNSAIERY

KTHGKIYREKLGPGREMVFVCDPKDIGTVFRSDGRLPERPPVNSIATYRKMRKKPPGLGNLMGEDWHR

VRSSVNKEMMRPKSVGAYATMQDDVSREMAELIQTVVRKGDSGGQVDNFMNLMHKWGLESLSLVILGKRMGCLTLDQLAE

DSDAQRMISAVLEFFLYFGKLEMSLPFYRYFSTPAWKKFETAMDTMN

(Gap)

SLLSQKDMTLDEAVMMAIELLTGAFESTANTLA

FNLYCLAKNPAAQQKLYEEIMNVVPPGQPIDDRVLNKMSYLRAVFKETSRLYPTIFFNARTLTRDVVLSGYHVP

AKIIQKFHVGWDGEDMKQIYKIFNTPDRDTFIFRERE*

>e_gw.77.176.1|Brafl1 33% to CYP24

93% to fgenesh2_pg.scaffold_283000056|Brafl1 (allele)

KTYGKIYREKLGPGREMVFVCDPRDIGTVFRSDGRLPQRPPVNSLATYRKMRKKPLGLGNLMGEDWHRVRSSVNKEMMRP

KSVGAYATMQDDVSREMAEQIQTVVRKGDSGGQVDNFMNLMHKWGLESLSLVILGKRLDCLTLDQLAEDSDAQRMISAVL

EFFLYFGKLEMSLPLYKYFNTPAWKRFVRALDTMN RYAICPIQERILTELSKLEEPPQETDFLSN LLSQKDMTLDEAVMM

AIELLTGAFESTANTLAFNLYCLAKNPAAQQKLYEEILEVVPPGQPIDDRVLNKMSYLRAVFKETSRLYPTIFFNARTLT

RDVVLSGYHVPAKTQIIMANNVISTLPEYYPDPEAYIPERWLRTESSAANVQAFALLPFGYGARMCVGRFLPVKNRSVS*

 

 

$$$$$$$$$

 

>fgenesh2_pg.scaffold_191000017|Brafl1 27% to CYP27C1

MGITGVLGRRCDAVMRSGRVFNGQWKCGRSSLRNVGLCILRKSSSTVTNVGMETCVDPTANKTDVAVRPFHEIPGPKGLP

IIGSLWEYTFLGKLDPRRFDEVLWNRYQEYGKIYKEDLGPRGTFVRIADPGDIETVYRNEGRYPHRPSFPLVRESMEAAG

QELLKHRARSESSFNGQGLEWYRTRSAVNRTLLRRSGVALFHPTLNEISDDFLTLLKRSLDENNTVPDITWQIRRHNTEV

AGTTIFGRRPGCLEPDFSGSCQTSEMIKSIDDFFASWLKLEIGFPLTKYLLKDTWNGYMNAHRNILRIVKYHMDLDVEYE

DSRPSVLGYLLSESSLSDTDAAMSAVELFVGGMQSSSHADMFQLYELARHPHVQETIRREVTEALPKGEAVTSAHLHKLP

YLKAFVKETFRFHPVGLLHMRILDRDVVLSGYRVPAHTTIEIPMSVLGRLEELYPQADRFLPERWLRRGPNGFRSRMFSH

VTPFGHGPRACIGRRLAEDKFYIQIAKLVQNFDLHCDEEVGTVTGCFQELSPTPNIRFTPR*

 

$$$$$$$$

 

>estExt_GenewiseH_1.C_30140|Brafl1 33% to 11A2, 33% to CYP27A1

MGGWMDKFHLHMQNRWRQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLENDEK

WQQYRTVMNKKLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQAHLFRWAMESGCTAMFNQHLGLLSEDPPQL

AKDFISSTMAVLDTTNTMMTIPPKVHKALNTKAWKEHLEGWQTSFRVTKQLIEEIMERGLEKESEEDEEIPDLVSYLLSV

KLRPEEVLANIVDVLGGAVDTTSNTMAFTMHTLARHPDIQEKLHDEVMRVAPDHQAPVTQEQVHKMPYLRGVIKEVLRLY

PVAYVFSRVLNHDAVVHGYKIPAGTNLVVCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCVGRRI

AETEMHLVLIRICQNFLLEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*

>e_gw.29.150.1|Brafl1 32% to CYP27C1

92% to estExt_GenewiseH_1.C_30140|Brafl1

IAQNRWQQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLENDEKWQQYRTVMNK

KLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQAHLFRWAMESGCTAMFNQHLGLLSEDPPQLAKDFISSTMA

VLDTTNTMMTIPPKPGVKTYCTNVAPGSFLSSLELVFIMERGLKKESEEDEEIPDLVSYLLSVKLRPEEVLANIVDVLGG

AVDTTSNTMAFTMHTLARHPNIQEKLHDEVMRVAPDRQAPVTQEQVHKMPYLRGVIKEALRLYPVAYVFSRVLNHDAVVH

GYKIPAGTNLVVCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCVGRRIAETEMHLVLIRICQNFL

LEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*

 

>e_gw.3.68.1|Brafl1 33% to CYP11A2

89% to estExt_GenewiseH_1.C_30140|Brafl1

GQEGATAKPFEAIPGPKGLPLVGTALHAAMGGWMDKFHLHMQNRWRQYGSIYKEIIGPQEIVCMFDPEDVAAVLRAEGRY

PRRHSVDSFYLAREIMGHKLGVLLENDEKWQQYRTVMNKKLLRPQQAAAFTPMMDEAASNFMSYLRRKRDQGGMVTDLQA

HLFRWAMESGCTAMFNQHLGLLSEDPPQLAKDFISCSMAILDTTNTMMTIPPKVHKALNTNAWKEHLEGWQTSFRVTKQL

IEEIMERELKKENEEDEEISDLVSYLLSVKLRPEEVLANIVDVLGGAVDTTSNTMAFTMHTLARHPDIQEKLHDEVMRVA

PDRQAPVTQEQVQKMPYLRGVIKEILRLYPVAYIFSRVLNHDAVVHGYKIPAGTNLVVCPYVMGRDPKSYDNPEEFRPER

WYRENRESVKAFSWLPFGFGARGCVGRRIAETEMHLVLIRICQNFVLEQKKDEELVGRIRLVLIPDKSVDLKLTDRN*

 

$$$$$$$$$

 

>fgenesh2_pg.scaffold_410000012|Brafl1 27% to CYP24

MQTRVKATVPTLRETGRYGVGKLHERHLDLHRQYGDICREKLLGREIVHVFSREIAQEVFMQEGRYPGRTVIEPDALYRT

TRGIPLGLLSLQDAEWHRLRRLAQDRILRPAVQSAVLPNMDRIAQEFVMRTDMLRSPGSDVMERNYKDELHLWSLEW

(gap)

KLIFSLPLYKVVPTPTWRKLAAAQDTFFRLSENYIKQVLTDSGDGDPETQDSLLLHLLRKSELSKEEVSATMTDLFQGGIDTT

TNGMMYSLFALAKNPEVQELVCQEIRTHLPEGARVTPEVLGKMKYLKAVIKETFRVCLPGCCRLWPVIFGTARQYDYDVV

LGGYDVPAKTEILVHHRVMCRQDKYFRDPLTFDPTRWLRDEKTPRVPTYLFMPFGHGVRMCIGMLNIILTIRRRFAEQQL

QLLVIRMLQRFHVECEEAELRQVFSLVLLPDRNPRFIFRRRQGETA*

 

 

>gw.501.20.1|Brafl1 30% to CYP24

LKKLHESFFERYRQFGKISKETIGNKCFVSVYDPRDIETLFRTEGPNPSWMQLMALGEVRKRLGKPLGMINETGQKWRQL

RYAAQSKLLNPKSVSSFVPVLDEISRDFVEKLRTGRSAATLEPTIDLDAELRKWSLESVVSATLGIRLGCLQKHRQIPDK

DTEDLLQSSDAFLDTWSKLELGPPLYMLYPTKTWRKFLRANELWLSAAGRMIDRSLDRSESERDPLQPEVTLLEHIVTRK

ELTPDDVVMIITELIFAGIESTAVAMTYNLYTMAKNQHVQEKVRREVNAVVGKSGKVTQDALKSLKYVKACIKETSRVLP

AFSMRNRILDKEIVLAGYRVPPNVIIRVLTHVTGQLPEYVVEPDRFAPERWLRDDTTIPKPHPFAVRPFGVGTRSCIGQR

LAEQELGILLAKV

>fgenesh2_pg.scaffold_44000117|Brafl1

87% to gw.501.20.1|Brafl1

MAFAVLMMMAAAVLPNFARSAITLIPMGSTYLPYGFDPAGAPLYGMGDRGAVEQLTYDADNYRIYTVGEARILNVIDISD

PKNAALVYQLQLPGGATDVDSCGRFVAVSIHDDFKVLPGTVLIYSMYDTTRKNMTLLHQIQVGALPDMVKFTKDCMTLVT

CNEGEPGLDESGNFVDPEGSASVIAFQSTNLGQESAPTVRTATFRKFDSLAEEYNSRGVRWTLPMIQVGSEVMEFNLSQT

LEPEYVAYNSDGSKAYIALQENNAIAVLDMATATFDDIYPLGSKYWGTASIDTSNEDGGSLVSRNLKSQRVQKAMNLTSQ

LGCAVFSSIDGLDPENPDKYSSLHLFGGRGFSVWDADDLSLVWDSGDDVERMVAKYYPTIFNSDYDEEFFNSTPAARFDH

RSCKKGPETESLAIGEVDGKTAFFVGNERSSTILVYSLADEDIITPVFQSIHFSGRTDLTWRQAYQDRVVGDIDPEDMRF

VSTRDSPTNSPLLLVAGTVSGTVSVYEVAESDDDGVSTAGKMKRAWLQHLVAKKLGADAVSIGRSGGETSTFSPPVTRY

25% to CYP27B1

RQFGKISKETIGNKTFVSVYDPRDIETLFRTEGPNPSWMQLMALGEVRKRLGKPLGMINETGQKWRQLRYAAQSKLLNPKS

VSSFVPVLDEISRDFVEKLRTGRSAATLEPTIDLDAELRKWSLESVVSATLGIRLGCLQKHRQIPDKDTEDLLQSSDAFL

DTWSKLELGPPLYMLYPTKTWRKFLR

ANELW 38% to CYP27B1

LRVLPAFSMRNRILDKEIVLSGYRVPPNVIIRVLTHVTGQLPEYVVEPD

RFAPERWLRDDTTIPKPHPFAVRPFGVGTRSCIGQRLAEQELGILLAKMIQQFHIE

CDGEMEQIFNIANKPDLSGTFKFTEL*

 

>Gene C 38% to CYP11 amphi, 34% to Gene E, 34% to Gene B

42% to 27B1 Fugu, 38% to 27C1 fugu, 42% to 27A1 fugu (but not first exon)

37% to 11A1 fugu, 36% to CYP24 fugu (Best match to CYP27B)

42% to Xenopus trop. 27B1, 41% to Xenopus laevis 27A1

    MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)

(0) LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

(2) NGPEWRHLRTAVSKRIMRPKEVPR (2)

(2) YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

(1) SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

(0) TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)

(2) VYPVLPANGRVLDKDIVLDGYNIPKG (0)

(0) TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)

(1) GRRLAEMEMYLVLAR (0)

(0) LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

>CYP27 40% to 27B1 Fugu, 37% to 27C1 fugu, 40% to 27A1 fugu (but not first exon)

35% to 11A1 fugu, 34% to CYP24 fugu (Best match to CYP27B)

    MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)

(0) LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

(2) NGPEWRHLRTAVSKRIMRPKEVPR (2)

(2) YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

(1) SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

(0) TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)

(2) VYPVLPANGRVLDKDIVLDGYNIPKG (0)

(0) TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)

(1) GRRLAEMEMYLVLAR (0)

(0) LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

>CYP27 fgenesh2_pg.scaffold_25000096|Brafl1

MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQLEQERKYGRMWQSSFG

FNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNHNGPEWRHLRTAVSKRIMRPKEVPRYGDSMNEV

VTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAMESIATVLFDTRLGCLEREMPEKTQQFIDSIATMFKTAFLVSALKPWM

LTYLGLGVWKRHVEAWDVIFSVAHENIDRKVLDIDARLSRGEDLDGSFLTYMLTGTDVTKKDLYATVTELLLAGVDTTSN

TMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILRVYPVLPANGRVLDKDIVLDGYNIPKGT

QFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAGRRLAEMEMYLVLARLVQTFEVRQLTPGE

VVRPVTRALLVPGDPVHLEFIDRP*

>CYP27 e_gw.25.105.1|Brafl1

QLEQERKYGRMWQSSFGFNPNVNVAHVSLAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNHNGPEWRHLRTAVSKR

IMRPKEVPRYGDSMNEVVTDMITRFKDLRDTTGGGKTVPDLTNELYKWAMESIATVLFDTRLGCLEREMPEKTQQFIDSI

ATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSVGESHENIDRKVLDIDARLSRGEDLDGSFLTYMLTGTDVTKK

DLYATVTELLLAGVDTTSNTMVWTLYELARHPELQDRLHREVTSVVSPGQIPTVDDVKNMALLKNVIKEILRVYPVLPAN

GRVLDKDIVLDGYSIPKGTQFAILHYNMTRDPEAFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAGRRLAEMEMY

LVLARLVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

 

CYP19 clan (2 subfamilies)

 

>CYP19 amphioxus 37% to CYP19 zebrafish ovarian, 38% to brain form

41% to e_gw.484.33.1 so there are two CYP19 subfamilies in Amphioxus

two possible start METs

    MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS

    MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP (1)

(1) GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)

(2) ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)

(1) VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN (1)

(1) DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)

(2) RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)

(0) EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL (1)

(1) GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKG

    TNVIINLVAVHQDPRHFP EPETFDPDHFKEK (0)

(0) VPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*

 

 

>estExt_gwp.C_90165|Brafl1 45% to CYP19

96% to assembled seq above

MFSLQECGQVSASCVRQCVTE

MLVAGPDTMSVNIYFILLHIAEHGLENGILREIREVLGDRDPTRDDLSKMVFLDHVINE

SMRARPVVTFVMRHAEEEDHVDGYVIPKGTNVIINLVAVHQDPRHFPEPETFDPDHFKEKVPSTQFMPFGLGVRSCVGRT

IAPLQMKAVLITLLRMYQLSP

>estExt_fgenesh2_pg.C_90115|Brafl1 39% to CYP19a C-term

98% to estExt_gwp.C_90165|Brafl1

MLVAGPDTMSVNIYFILLHIAEHGLESGILREIREVLDRDPTRDDLSKMVFLDHVINESMRTRPVVTFVMRHAEEEDHVD

GYVIPKGTNVIINLVAVHQDPRHFPEPETFDPDHFKEKVPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSP

SRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*

 

>CYP19 scaffold_484 96% to first 3 exons below on e_gw.484.33.1|Brafl1

290078 MSGVMSVLTEQLQTWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP (1) 289935

288972 GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR (2) 288820

288346 PSAPWPVLKSTNSCRRFGSRTGLRPIGMYQNGIIWNGDDGWRVLRGFFQK (1) 288197

 

>CYP19 e_gw.484.33.1|Brafl1 38% to CYP19 human and danio (-) strand

first exon is a guess, no frameshifts exist in e_gw.1098.5.1 so it may be correct

294278 MSGVMSVLTEQLQTWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP 294135

293172 GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR (2) 293020

292546 PSAAWHVLKSNNYCRRFGSRTGLSTIGMYQNGIIWNGDDGWRVLRGFFQK (1) 292397

287888 ALNADTLNRATSAAVDATYRQMGNIAALQQKAADGKIEALDFLRRITLEVTNNLTLGVHIAD (1) 287703

287339 PDDLVERIVRYFKAWEFFLLRPPIMYLMTPKLYWKHCQAV (2) 287220

286970 NDLNDAIAELLTNKRQELKTAPPSDKPDFATCLLQAE (0) 286860

286169 ERGEVSPAHVQQCVLEMLLAGTDTSSVSMYYLLVSVAENPQVELKVLEEMRDIL (1) 286008

286565 ERGEVSPAHVQQCVLEMVL 286509 (duplicate exon 7 seq)

285823 GERDPTKADLPQLVYLEQVIKEAMRIKPVGPVIMRQAKEDDR (2) 285695

285428 IDGIETPAGTNIILNLADMHRRQDNFPAPDDFNPQHFDNK (0) 285309

284605 DFKGEYVPFGTGPKGCIGQFLAMIEMKAIMCTLLRKHHLRAIPGESLEGIETHWDIAQQPVNASYMYFEERN* 284387
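For coordinate-annotated exon lines like those above, one simple sanity check is to compare the nucleotide span implied by the two coordinates with three times the number of residues shown. The sketch below assumes each line reads START, residues, optional phase marker, END, optional note. That layout is inferred from this record, and the name `span_vs_codons` is hypothetical.

```python
import re

# Assumed line layout: START  RESIDUES  [(phase)]  END  [free-text note].
COORD_LINE = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\(([012])\))?\s*(\d+)")

def span_vs_codons(line):
    """Return (nt_span, 3 * residue_count), or None if the line does not fit."""
    m = COORD_LINE.match(line)
    if m is None:
        return None
    start, residues, _phase, end = m.groups()
    nt_span = abs(int(start) - int(end)) + 1   # inclusive, works on either strand
    return nt_span, 3 * len(residues)          # a '*' is counted as the stop codon
```

When the two numbers differ, that may indicate a transcription slip in a coordinate or a different convention for partial codons at the exon boundary. The check only flags such lines and does not say which explanation applies.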

 

>CYP19 e_gw.1098.5.1|Brafl1

95% to e_gw.484.33.1|Brafl1 yellow exon 9 is wrong, exon 9 is in a seq gap

49213 MSGVMYVLTEQLQAWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP (1) 49070

48196 GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR 48044

47584 PSAAWHVLKSNNYCRRFGSRTGLSTIGMYQNGIIWNGDDGWRVLRGFFQK 47435

47273 ALNADTLNRATSAAVDATYRQMGNIAVLQQKTADGKIEALDFLRRITLEVTNNLTLGVHIAD (1) 47088

46510 PDDLVERIVRYFKAWEFFLLRPPIMYLMTPKLYWKHCQAV 46397

46144 NDLNDAIAELLTNKRQELKTVPPSDKPDFATCLLQAE 46034

45737 ERGEVSPAHVQQCVLEM 45687 (duplicate exon 7 seq)

45639 ERGEVSPAHVQQCVLEMLLAGTDTSSVSMYYLLVSVAENPQVELKVLEEMRDIL 45478

45352 GERDPTKADLPQLVYLEQVIKEAMRIKPVGPVIMRQAKEDDR (2) 45227

      SVFITIPLLYGNVNISITLYYALTKLLTHPPLQ

44076 DFKGEYVPFGTGPKGCIGQFLAMIEMKAIMCTLLRKYHLRAIPGESLEGIETHWDIAQQPVNASYMYFEERN* 43858

 

$$$$$$$

 

CYP20 clan

 

>CYP20 e_gw.479.56.1|Brafl1 39% to CYP20

MLDYAIFAITFVVFLIAAVLYLYPGSNKITTIPGLEPSDPKDGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS

LAAPELWKQHERAFDRPPLLFKGFEPLWGTMSITYANGVDGRTRRKLYDPSFGHEAMKHYFSIFQELGQEMAKNWASMEG

DQHIPLQAHMLALTTKATTRCSFGDAFKDEKECVQFSRNFNICWCDVEERVNGSHPTEGSPREKKFQEARGKLQATIGRV

VKYRRENPPPPQEQLFIDVLIEGDLPEEQVFGDAITYMVGGFHTTANLLTWALYFIATHEEVEEKLYQELSDVLGKKGEV

TPDNIPQLVYLRQVLDETLRCAVVTPWGARYMDLDAEIGGHIVPAKTPVIHAFGVVLQDERFWPEPNKFDPERFDAENSK

GRHKLAFQPFGSAGGRKCPGYRFTYVETTVFLSILCRQFKLHLVDGQVVKPRHGLVTRPVDEIWITVTKRD*

>CYP20 estExt_GenewiseH_1.C_860218|Brafl1

88% to e_gw.479.56.1|Brafl1

MLDYAIFAITFVVFLIAAVLYLYPGSNKITTIPGLEPSDPKDGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS

LAAPELWKQHERAFDRPPLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQELGQEMASKWESTKG

DQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGICWNDMEERIKGSHPTEGSPREKKFKEALGKLHATIARV

AKYRRENPPPPQEQLFIDVLIEGNLPEEQVLCDAMTFTVGGFHTSGNLLTWALYYIATHEEVEEKLHQELSDVLGKKGEV

TPDNISQLVYLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPDRFDAENSK

GRHKLAFQPFGFAGGRKCPGYRFAYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD*

>CYP20 fgenesh2_pg.scaffold_86000110|Brafl1

87% to e_gw.479.56.1|Brafl1

MLDYAIFAITFVVFLIATVLYLYPGANKITTIPGLEPSDPKDGNLGDLGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS

LGAPELWKQHERIFDRPRFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQELGQEMAKKWESMKGDQHIP

LHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDICWNDMEERIKGSYPTEGSPREKKFEEAKGKLHATIARVAKYRR

ENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTYMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNI

SQLVYLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPERFDAENIKGRHKL

AFQPFGFAGGRKCPGYRFTYVETTVFLSILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVTKRD*

>CYP20 e_gw.89.28.1|Brafl1

83% to e_gw.479.56.1|Brafl1

MLDYAIFAITFVVFLIATGLYLYPGPNKITTIPGLEPSDPKDGNLGDIGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS

LGAPELWKQHERIFDRPPLLFKGFEPLIGAKSIQYANGLDGRTRRKLYDPSFGHNAMKYYYSIFQELGQEMAQKWESMEG

DQHIPLRAHTIDLTMKAITRCSFGDTFKDEECLQFSRNYDICWDDINERTKGNYPVEGSPREKKFQEALGRLHTTIGRVA

KYRRENPPPPQEQLFIDLLIEGDLPEEQVRAKSHTYWTISSVMTLYHCLLLTWALYFIATHKEVEEKLYQELIDVLGKKE

DVTPDNISQLVYLRQVLDETLRCAVVGPWGARYMDLDIEIGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPERFDAES

SKGRHKLAFQPFGFAGGRKCPGYKFSYAETSVFLSILCRQFKLHLVDGQVVTWHGIIMITRPVDEIWITVTKRD*

>CYP20 e_gw.86.147.1|Brafl1

83% to e_gw.479.56.1|Brafl1

MLDYAIFAITFVVFLIAAVLYLYPKSNKITTIPGLEPSDPKDGNLGDVGRAGALHEFLLKLHAEYGDIASFWWGQQLVVS

LGAPELWKQHERIFDRPPLLFKGFEPLIGAMSIQYANHVDGMTRRKLYDPSFGHEAMKHYYSIFQELGQEMAKKWETMEG

DQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDICWNDMEERVKGSYPTEGSPREKKFQEALGKLHTTIRRV

VKYRRENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTFMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKGEV

TPDNISQLVYLRQVLDESLRCAVITPWGARYMDLDAEIGGHIVPAKTPVIHAFGVVLQDERIWPEPNNLEFESATGFYSL

INLAHSSPIFPPPPGYRFSYIETSVFLSILCRQFKLHLVDGQVVTPWHGCVTRPLEEIWITVTKRD*

>CYP20 estExt_GenewiseH_1.C_4790081|Brafl1

86% to e_gw.479.56.1|Brafl1

MLDYAIFAITFVVFLIATVLYLYPGANKITTIPGLEPSDPKDGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVS

LGAPELWKQHERIFDRPPLLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQELGQEMAKKWESMKG

DQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDICWNDMEERIKGSYPTEGSPREKKFEEGLTFQQLHATIA

RVAKYRRENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTYMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKG

EVTPDNISQLVYLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWP

 

Amphioxus has 8 CYP20 sequences.  This complicates the CYP20 story, since most other species have 1.  Two of the contigs may represent a recent duplication with differential gene loss.

479.a and 479.b/c are nearly identical to 86.a and 86.b, respectively.  The 479.d and 86.c sequences do not match each other, which suggests a larger cluster, possibly of four genes, with loss of the 479.d seq on scaffold 86 and loss of the 86.c seq on scaffold 479.

 

The scaffold 89 seq is unique.

 

This leaves five distinct CYP20s in Branchiostoma.
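The count of five can be illustrated with a small grouping sketch. It uses only exon 4 of each locus (copied from the listings in this file; 479.c is left out because its exon 4 is in a seq gap) and treats sequences with at most 1 aa diff in that exon as the same gene. Both the single-exon shortcut and the threshold of 1 are assumptions made for this illustration, not the method used to reach the count above. The function names are hypothetical.

```python
from itertools import combinations

# Exon 4 peptides copied from the CYP20 listings in this file
EXON4 = {
    "479.a": "ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE",
    "479.b": "ALLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE",
    "479.d": "ALLFKGFEPLWGTMSITYANGVDGRTRRKLYDPSFGHEAMKHYFSIFQE",
    "89":    "PLLFKGFEPLIGAKSIQYANGLDGRTRRKLYDPSFGHNAMKYYYSIFQE",
    "86.a":  "ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE",
    "86.b":  "PLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE",
    "86.c":  "PLLFKGFEPLIGAMSIQYANHVDGMTRRKLYDPSFGHEAMKHYYSIFQE",
}


def aa_diffs(a, b):
    """Count mismatched positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x != y for x, y in zip(a, b))


def group_alleles(seqs, max_diffs=1):
    """Single-linkage grouping: link any pair within max_diffs (union-find)."""
    parent = {name: name for name in seqs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if aa_diffs(seqs[a], seqs[b]) <= max_diffs:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


groups = group_alleles(EXON4)
print(len(groups), groups)
```

On this exon the sketch pairs 479.a with 86.a and 479.b with 86.b, and leaves 479.d, 86.c and the scaffold 89 seq on their own. A whole-gene comparison would need a proper alignment.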

 

I am still looking for a gliomedin-like neighbor.

 

>CYP20 amphioxus 39% to CYP20 Danio from trace archive (hybrid seq)

    MLDYAIFAITFVVFLIATVLYLYP (0)

(0) GANKITTIPGLEPSDPK (2)

(2) DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP (1)

(1) ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)

(0) LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0)

(0) CWNDMEERIKGSHPTEGSPREKKFKE (1)

(1) ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)

(0) VLCDAMTFTVGGIHTSGN (1)

(1) LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV (2)

(2) YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)

(0) TPVIHAFGVVLQDERIWPEPNK (2)

    FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)

(1) GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*

 

>amphioxus CYP20 like scaffold 479.a seq gap in exon 12/13

99% to 86.a , 37% to CYP20 danio, 38% to CYP20 fugu, 37% to CYP20 human

possible allele to 86.a (only 6 aa diffs)

432510 MLDYAIFAITFVVFLIATVLYLYP (0) 432581

432736 GANKITTIPGLEPSDPK (2) 432786

433046 AGNLGDVGRAGSLHEFLLKLHAEYG 433120 (duplicate seq)

433166 DGNLGDVGRAGSLHEFLLKLHAEYG 433240 (duplicate seq)

433286 DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP (1) 433453

434119 ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0) 434265

434881 LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0) 435048

435360 CWNDMEERIKGSYPTEGSPREKKFEE (1) 435437

435701 AKGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGDLPEEQ (0) 435823

436815 VLCDAMTYMVGGFHTSGN (1) 436868

437343 VLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV (2) 437468

437729 YLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAK (0) 437839

438253 TPVIHAFGVVLQDERIWPEPNK (2) 438318

       VDPERFDAENIXXXXXXXXXXXXXXXXXXXX

439186 XXXXXXXXXLVFLFILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVSKRD* 439317

 

>amphioxus CYP20 like scaffold 479.b

2 aa diffs to 86.b

441870 MLDYAIFAITFVVFLIAAVLYLYP 441941

442088 GSNKITTIPGLEPSDPK 442138

444041 DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP 444208

446694 ALLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE (0) 446840

447423 LGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGI (0) 447590

exons 6,7,8,9,10 in a seq gap

456021 TPVIHAFGVVLQDERIWPEPNK 456086

456446 FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1) 456538

457265 GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD* 457423

 

EST matches trace file

>BW799748 Amphioxus Branchiostoma floridae 1 aa diff to 479.b

NSKGRHKLAFQPFGFAGGRKCP

GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*

 

>amphioxus CYP20 like scaffold 479.c

1 aa diff to 86.b

note: the top three exons are identical to 479.b

the missing exons 11,12,13 are in 479.b

I suspect this is assembled incorrectly and 479.b and 479.c are one gene

The combined fragments would have only 3 aa diffs to 86.b (allele)

460211 MLDYAIFAITFVVFLIAAVLYLYP 460282

460433 GSNKITTIPGLEPSDPK 460483

460715 DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP 460882

exon 4,5,6,7 in a seq gap

461613 VLCDAMTFTVGGFHTSGN 461666

462121 VLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV 462246

462794 YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0) 462904

no exon 11,12,13
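The merge argument for 479.b and 479.c can be sketched as follows. Only four exons are included here to keep it short (exons 4 and 13 from 479.b, exons 8 and 9 from 479.c, with the matching 86.b exons). The peptides are copied from the listings in this file with the terminal `*` dropped. The dictionaries keyed by exon number and the helper names are assumptions made for this illustration.

```python
def aa_diffs(a, b):
    """Count mismatched positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x != y for x, y in zip(a, b))


def merge_fragments(first, second):
    """Combine two partial models; refuse to merge if a shared exon differs."""
    merged = dict(first)
    for exon, pep in second.items():
        if exon in merged and merged[exon] != pep:
            raise ValueError(f"fragments disagree on exon {exon}")
        merged[exon] = pep
    return merged


def diffs_by_exon(model, reference):
    """Per-exon aa diffs for exons present in both model and reference."""
    return {
        exon: aa_diffs(pep, reference[exon])
        for exon, pep in sorted(model.items())
        if exon in reference
    }


FRAG_479B = {
    4: "ALLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE",
    13: "GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD",
}
FRAG_479C = {
    8: "VLCDAMTFTVGGFHTSGN",
    9: "VLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV",
}
REF_86B = {
    4: "PLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE",
    8: "VLCDAMTFTVGGFHTSGN",
    9: "LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV",
    13: "GYRFAYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD",
}

combined = merge_fragments(FRAG_479B, FRAG_479C)
per_exon = diffs_by_exon(combined, REF_86B)
total = sum(per_exon.values())
print(per_exon, total)
```

For these four exons the diffs fall in exon 4 and exon 13 (from 479.b) and exon 9 (from 479.c), which matches the 2 + 1 split noted in the two record headers.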

 

>CYP20 amphioxus scaffold 479.d

87% to 86.a, 86% to 86.b, 85% to scaf89, 88% to 86.c, 86% to 479.a

37% to CYP20 human, 38% to CYP20 danio

464840 MLDYAIFVITFVVFLIATVLYLYP (0) 464911

465054 GLNKITTIPGLEPSDPK (2) 465104

465455 DGNLGDVGRAGSLHEFLLKLHAEYG 465529 (duplicate seq)

465568 GGNLGDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWTQHKRIFDRP (1) 465735

467328 ALLFKGFEPLWGTMSITYANGVDGRTRRKLYDPSFGHEAMKHYFSIFQE (0) 467474

467704 LGQEMAKNWASMEGDQHIPLQAHMLALTTKATTRCSFGDAFKDEKECVQFSRNFNI (0) 467871

468141 CWCDVEERVNGSHPTEGSPREKKFQE (1) 468218

468460 ARGKLQATIGRVVKYRRENPPPPQEQLFIDVLIEGDLPEEQ (0) 468582

469033 VFGDAITYMVGGFHTTAN (1) 469086

469528 LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNIPQLV (2) 469653

469951 YLRQVLDETLRCAVVTPWGARYMDLDAEIGGHIVPAK (0) 470061

470256 TPVIHAFGVVLQDERFWPEPNK (2) 470321

470700 FDPERFDAENSKGRHKLAFQPFGSAGGRKCP (1) 470792

471108 GYRFTYVETTVFLSILCRQFKLHLVDGQVVKPRHGLVTRPVDEIWITVTKRD* 471266

 

CYP20 sequences are also found on scaffold 86 and scaffold 89 in amphioxus

 

>scaffold 89 e_gw.89.28.1 [Brafl1:221840] model has error in exon 8

Brafl1/scaffold_89:1792029-1799319

84% to 479.d, 88% to 86.a, 84% to 86.b, 88% to 86.c, 87% to 479.a

38% to CYP20 danio, 37% to CYP20 human

MLDYAIFAITFVVFLIATGLYLYP

GPNKITTIPGLEPSDPK

DGNLGDIGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP

PLLFKGFEPLIGAKSIQYANGLDGRTRRKLYDPSFGHNAMKYYYSIFQE

LGQEMAQKWESMEGDQHIPLRAHTIDLTMKAITRCSFGDTFKDEECLQFSRNYDI

CWDDINERTKGNYPVEGSPREKKFQE

ALGRLHTTIGRVAKYRRENPPPPQEQLFIDLLIEGDLPEEQ

VLCDAMTYMVGGFHTSGN

LLTWALYFIATHKEVEEKLYQELIDVLGKKEDVTPDNISQLV

YLRQVLDETLRCAVVGPWGARYMDLDIEIGGHIVPAK

TPVIHAFGVVLQDERIWPEPNK

FDPERFDAESSKGRHKLAFQPFGFAGGRKCP

GYKFSYAETSVFLSILCRQFKLHLVDGQVVTWHGIIMITRPVDEIWITVTKRD*

 

>scaffold 86.a

fgenesh2_pg.scaffold_86000110 [Brafl1:79625] corrected exon 4

Brafl1/scaffold_86:2108590-2115417

MLDYAIFAITFVVFLIATVLYLYP

GANKITTIPGLEPSDPK

DGNLGDLGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP

ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE

LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI

CWNDMEERIKGSYPTEGSPREKKFEE

AKGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGDLPEEQ

2112464 VLCDAMTYMVGGFHTSGN 2112517

LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV

YLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAK

TPVIHAFGVVLQDERIWPEPNK

FDPERFDAENIKGRHKLAFQPFGFAGGRKCP

2115259 GYRFTYVETTVFLSILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVTKRD* 2115417

 

>scaffold 86.b

estExt_GenewiseH_1.C_860218 [Brafl1:265292]

Brafl1/scaffold_86:2118885-2129038

MLDYAIFAITFVVFLIAAVLYLYP

GSNKITTIPGLEPSDPK

DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP

PLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE

LGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGI

CWNDMEERIKGSHPTEGSPREKKFKE

ALGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGNLPEEQ

2123453 VLCDAMTFTVGGFHTSGN 2123506

LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV

YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK

TPVIHAFGVVLQDERIWPEPNK

FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP

2128631 GYRFAYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD* 2128789

 

>scaffold 86.c

e_gw.86.147.1 [Brafl1:220957] exon 12 in model is wrong

corrected below

Brafl1/scaffold_86:2141287-2148370

MLDYAIFAITFVVFLIAAVLYLYP

KSNKITTIPGLEPSDPK

DGNLGDVGRAGALHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP

PLLFKGFEPLIGAMSIQYANHVDGMTRRKLYDPSFGHEAMKHYYSIFQE

LGQEMAKKWETMEGDQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDI

CWNDMEERVKGSYPTEGSPREKKFQE

ALGKLHTTIRRVVKYRRENPPPPQEQLFIDVLIEGDLPEEQ

2144990 VLCDAMTFMVGGFHTSGN 2145043

LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV

YLRQVLDESLRCAVITPWGARYMDLDAEIGGHIVPAK

TPVIHAFGVVLQDERIWPEPNK (2)

Exon 12 in a seq gap

2148212 GYRFSYIETSVFLSILCRQFKLHLVDGQVVTPWHGCVTRPLEEIWITVTKRD* 2148370

 

$$$$$$

 

CYP26 clan (5 sequences)

 

>e_gw.480.39.1|Brafl1 22% to CYP26B1 magenta probably right seq

scaffold 480 463546:469390 (5845 bp)

MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRVQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKSQHFEMGYKALWQGL

YGDNVLFSDGWSNFFVFIT VSILHIMSHCNNCNHYPNQTCGDPQPIQDLNLLMR PVKVYELLKQMSTEISMGLFLDIERE

TDNSLAPLVSQLMTQHWHGIISMPANLKLPSWGGNWESGYSKAQEAKDELLKIIGERIGKNKHNNVLGLMKTAGFRSEDE

IYRHLLLFVSALVPKAFSSLFTSFTLQLAGPSKVSMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIP

KEHGLMYVTHTAHRDPQIFPEPNSFKPERWSTCNAGHEGYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW QVTQAE

IPPYKWLPVSRP

>estExt_GenewiseH_1.C_4470006|Brafl1

scaffold 447 85292:91028 (5737 bp)

90% to e_gw.480.39.1|Brafl1

about 10 aa diffs to green region of e_gw.480.39.1|Brafl1

MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRIQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKTQHFEMGYKALWQGL

YGDNVLFSDGWSNFFVFIT NLSSSFSGKKLLTSLQIPGHLPHMTVNLR PVKVYELLKQMSTEISMGLFLDIERETDNSFA

PLVSQLMTQHWHGIISMPANLKLPTWGGNWESGYSKALEAKDELLKIIGDRIGKNKHNNVLGLMKTAGFRSEDEVYRHLL

LFVSALVPKAFSSLFTSFTLQLAGPSKASMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIPKEHGLM

YVTHTAHRDPQIFPEPNSFKPERWSTSNAGHEEYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW EVTQAEIPPYKW

LPVSRPTVEDQVIFTPRDSPDQEVEVGVEVAETSL*

 

Hybrid seq

MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRVQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKSQHFEMGYKALWQGL

YGDNVLFSDGWSNFFVFIT VSILHIMSHCNNCNHYPNQTCGDPQPIQDLNLLMR PVKVYELLKQMSTEISMGLFLDIERE

TDNSLAPLVSQLMTQHWHGIISMPANLKLPSWGGNWESGYSKAQEAKDELLKIIGERIGKNKHNNVLGLMKTAGFRSEDE

IYRHLLLFVSALVPKAFSSLFTSFTLQLAGPSKVSMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIP

KEHGLMYVTHTAHRDPQIFPEPNSFKPERWSTCNAGHEGYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW EVTQAEIPPYKWLPVSRPTVEDQVIFTPRDSPDQEVEVGVEVAETSL*

 

$$$$

 

>CYP26 fgenesh2_pg.scaffold_164000029|Brafl1 44% to CYP26B

Brafl1/scaffold_164:756408-763969

MLEEVVGYLVFPAFLMVLSWKLWGRYATPSDPACALPLPAGTTGFPIIGETLSFILEGADFSRKRHALYGDIFKTHILGR

PTIRVRGADNVRKILRGENDIVGTMWPDNFRMVLGTENLAMCGSGPLHRQRKKIVMRAFRHDALEIYTDSMQAMIADTLR

VWCRGPQPLAVYPAAREMMFRLAIAVLVGFHQDEEEARRVGSLFRTAVKNIFSLPLNVPGSALRKALQCRQEIDEWLKRH

IHEKHAQIWSGEVPDDVLSFIISSAKEEGKAVDQQQLLDTAVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKH

GLLQPDQPLSLEQVGRLTYVGQVVKEVLRISPPIGGGFRKALKTFAIGGFQVPEGWAVMYSIRDTHSASQLFSSPQQFDP

DRWAAADSTAIRYDFLPFGAGPRACAGKEFAKLQLKLLCVELVRSCRWELADGKVPEMKSVPVLHPANGLPVNFVSLDDV

TVKREDADGLAAPAHAPLMNTDLVTRSDPCLTLDKNGNLYPTSEQNSPDTVTVVGPDLSNIV*

>CYP26 fgenesh2_pg.scaffold_164000030|Brafl1

Brafl1/scaffold_164:786544-802905

62% to fgenesh2_pg.scaffold_164000029|Brafl1 43% to CYP26A1

MLVELVTVLVLPCVALLLSWKLWTQYYTWSDPGPDTPLPPGSMGLPFIGETLSLVTQGGKFSSSRHAQYGDVFKTHILGR

PTIRVRGATNVRKILLGENHIVTSLWPQTFRTVLGTGNLAMSNGEEHRLRRKVIMKAFNYEALERYVPIMQEILREAVQR

WCGAPQPVTVWPMAREMAFRVASAVLVGFQHSDEEIQHLTSLFTNMVKNLFSLPVKLPGSGLSNGLFYRQAIDEWMMNHI

QRKKEFVLQGGDSGDVLSHIMNNAKDNGEKLSDQEIQDTVVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHG

LLQPDQPLSLEQVGRLTYVGQVVKEVLRRRPPIGGGYRRALKSFDIGGFHVPKGWAVLYSIRDTHEASQIFSSPELFDPD

RWTPETSQAPLARYDMVTFGGGPRACVGKEFAKLLLKLLCVELTRRCRWKLADDKLPDMKLIPIVYPADGLPVIFTPIGG

KSPGDENKNGVPYEERTRGKDCPILCSVSFEKDINVAT*

>CYP26 estExt_fgenesh2_pg.C_1640031|Brafl1

Brafl1/scaffold_164:838132-841029

45% to CYP26C1, 42% to CYP26B1, 42% to CYP26A1

64% to fgenesh2_pg.scaffold_164000029|Brafl1

MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQGADFSRSRHELYGDVYKTHILGR

PTVRVRGADNVRKILHGENTLVTTIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQ

WCVQPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLSKALRYRQIIDEWLEGHI

KRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTAVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGL

(gap)

LQPDQPLSLEQVGRLTYVGQVVKEVLRISPPIGGGFRKALKTFELDGFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPD

RWAADSDGSRRGRHHYIPFGAGPRACAGKEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTPCEP

ITNNTLSDATEQNTNLSVCYSSNVPGPSHTSPKQQDFDAPCQIVMARKSEACGA*

 

$$$$$$$$

 

CYP46 clan (1 sequence)

 

>CYP46 from ESTs and trace archive

CF917908 BI377382 Amphioxus 5-6 hrs cDNA 52% to CYP46 fugu

ATGI35342.g1  ATUP100909.x1  ATUP374858.g2  ATUP181014.g1

AFSA27081.b2 ATUP181014.b1  AFPZ282931.x1 (possible exon 2)

walked upstream to ATWW110117.g1 ATUP762936.y1 (N-term)

MAVVAVLMVLGVLAVVGLAVAGVVYLGYIYYMHRKYDHLPGPPRKS (2)

MNLLTYPICTTFLTLYSDENEYLTAFLD (0)

LLFCERYGPIVRLNFLHRVIIFVSSPEAVR (0)

ELLVTGKYIKPPDQYERIGSIFGERQ (0)

FLGEGLVTETNQERWHKRRRIMDPAFSRKYLQTL

MDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAK (0)

VAFSMDLNTILDDHTPFPMATYITLSALIQQFRHPFME (0)

YNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKAN (1)

DEDSGLTVEKLVDDFVTFFIA (1)

GSETTANQLSFTLMELGRYPDVLEK (2)

LRAEMREVCGNKEYITYEDIGKLQYMGQ (0)

VLKESLRMYPPATGTSRLVEEEMELCGHRIPGDTVLI (0)

TSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQ (0)

IEAKVLLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*

>CYP46 estExt_fgenesh2_pm.C_1290004|Brafl1 48% to CYP46

MHRKYDHLPGPPRKR

(gap)

CERYGPIVRLNFLHRVIIFVSSPEAIRELLVTGKYIKPPDQYERIGSIFGER add this line to lower seq

FLGEGLVTETNQE

RWHKRRRIMDPAFSRKYLQTLMDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAKVAFSMDLNTILDDHTPFP

MATYITLSALIQQFRHPFMEYNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKANDEDSGLTV

EKLVDDFVTFFIAGSETTANQLSFTLMELGRYPDVLEKLRAEMREVCGNKEYITYEDIGKLQYMGQVLKESLRMYPPATG

TSRLVEEEMELCGHRIPGDTVLITSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQIEAKV

LLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*

>CYP46 estExt_fgenesh2_pg.C_1860050|Brafl1

same as estExt_fgenesh2_pm.C_1290004|Brafl1

MAVVAVLMVLGVLAVVGLAVAGVVYLGYIYYMHRKYDHLPGPPRKS

FISGHIDDMTKCLQDGKFTQDMILEW

(gap)

FLGEGLVT

ETNQERWHKRRRIMDPAFSRKYLQTLMDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAKVAFSMDLNTILDD

HTPFPMATYITLSALIQQFRHPFMEYNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKANDED

SGLTVEKLVDDFVTFFIAGSETTANQLSFTLMELGRYPDVLEKLRAEMREVCGNKEYITYEDIGKLQYMGQVLKESLRMY

PPATGTSRLVEEEMELCGHRIPGDTVLITSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQ

IEAKVLLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*

 

CYP51 clan (1 sequence)

 

>CYP51 estExt_fgenesh2_pm.C_2170007|Brafl1

MLVEMGNLLLENALETVQELGSGTVALTTIVVLLGVTYFGRQFVSSVGKAEKLPPVVPHTIPILGHGYNFYKNPIGFLEE

AYKKYGPVFTITMAGSKFTYLVGSDAAATLFNSKNEDLNAEEVYSRLTTPVFGKGVAYDVPNPVFLEQKKMFKTGLNIAR

FRTHVSLIEEETKEYFKRWGDSGERDLFEALAQLTILTASRCLHGKEVRSMLHEGIAQLYADLDGGFTQMAWLLPGWLPL

PSFRKRDRANREMKKVFKKIIQQRRESGDCDDDMLQTLMESTYKDGRPLTDDEITGMMIGLLMAGQHTSSTTSTWMGFFL

AKHKDIQARAYQEQLDICGEDLPPLNYDDLKEMALLDKCLAETLRLRPPIMTMMRMCKTPQQVKGYTIPVGHQVCVSPTV

NQKLEDTWEEAGTWNPNRFLEGNASTGKFSYVPFGAGRHRCIGENFAYVQIKTIWAVLLREFEFELIDGHFPSINFETMI

HTPSQAIIRYKKR*

>CYP51 estExt_GenewiseH_1.C_4170037|Brafl1

1 aa diff to estExt_fgenesh2_pm.C_2170007|Brafl1 (allele)

MLVEMGNLLLENALETVQELGSGTVALTTIVVLLGVTYFGRQFVSSVGKAEKLPPVVPHTIPILGHGYNFYKNPIGFLEE

AYKKYGPVFTITMAGSKFTYLVGSDAAATLFNSKNEDLNAEEVYSRLTTPVFGKGVAYDVPNPVFLEQKKMFKTGLNIAR

FRTHVSLIEEETKEYFKRWGDSGERDLFEALAQLTILTASRCLHGKEVRSMLHEGIAQLYADLDGGFTQMAWLLPGWLPL

PSFRYRDRANREMKKVFKKIIQQRRESGDCDDDMLQTLMESTYKDGRPLTDDEITGMMIGLLMAGQHTSSTTSTWMGFFL

AKHKDIQARAYQEQLDICGEDLPPLNYDDLKEMALLDKCLAETLRLRPPIMTMMRMCKTPQQVKGYTIPVGHQVCVSPTV

NQKLEDTWEEAGTWNPNRFLEGNASTGKFSYVPFGAGRHRCIGENFAYVQIKTIWAVLLREFEFELIDGHFPSINFETMI

HTPSQAIIRYKKR*

 

$$$$$$$

 

CYP74 clan (10 sequences, 7 distinct sequences, two pairs of alleles, plus one nearly identical duplicate on scaffold 120)

 

>fgenesh2_pg.scaffold_781000005 [Brafl1:110589] 23% to CYP74B2

Scaffold 781

96% to fgenesh2_pg.scaffold_402000022

MGVSMSNTKGLVRHVKAGPRALKPGGEHPAAVRTNVGIPVVALLNQDTIHHVFNTDLVDKEQYCLGYVGV

RSELLRGHCPSMFANGQEHRRKKAFLIDVFRGRQKTLPPVLSRQIMAHFKEWSRLEALADFEDKVFFLMS

DILTETVFGRKLDGRLALHWLQGLPSVRTWIPFPTKAKQDLAASALPVLLKSIEESPNYEELIQLSYLHD

IEEEDAIDNILFVIVFNAVAAVSAVIVTFITRLHTITEADRNVLLKTTLQALLKHESLSEESLGDMKALD

SFLLEVLRLHPPVFNFFGVAKKDFAIPTGVDKNVEVRQGEQLMGSCFWAQRDAKVFLSPNVFRCYRFMDS

KELLVDREQDGGKKRHLIFGHGS (2)

L  T  E  A  A  D  L   (frameshift) 

DS  H  Q  C  P  G  Q  D  I  A  F  Y  L  M  K  A  T  L  A  V  L  L  C YC

SWELEALPVWSDKTARLGRPDDLVSLTWFNFDSDTARHVLESYDLNCEK*

 

>fgenesh2_pg.scaffold_402000022 [Brafl1:102192]

Scaffold 402

57% to fgenesh2_pg.scaffold_107000039

96% to fgenesh2_pg.scaffold_781000005|Brafl1 (probable allele)

MGVSMSNTKGLVRHVKAGPRALKPGGEHPAAVRTNVGIPVVALLNQDTIHHVFNTDLVDKEQYCLGYVGV

RSELLRGHCPSMFANGQEHRRKKAFLIDVFRGRQKTLPPVLSRQIMAHFKEWSRLEALADFEDKVFFLVS

DSLTETVFGRKLDGRLALHWLQGLPSVRTWIPIPTKARQDLAASALPVLLKSIEESPNYEELIQLCYLHD

IEEEDGIVNILFTIVFNAVAAVSAVIVTFITRLHTIIEADRNILLKTTLQALLKHESLSEESLGDMKVLD

SFLFEVLRLHPPVFNFFGVAKKDFAIPTGVDKNVEVRQGEQLMGSCFWAQRDAKVFLSPNVFRCYRFMDS

KELLVDREQDGGKKRHLIFGHG

SLTEAADLDSHQCPGQDIAFYLMKATLAVLLCYC

SWELEALPVWSDKTARLGRPDDLVSLTWFNFDSDTASHVLESYHLNCEK*

 

>fgenesh2_pg.scaffold_107000039 [Brafl1:81984] 27% to CYP7D1

Scaffold 107

61% to fgenesh2_pg.scaffold_402000022|Brafl1

51% to estExt_fgenesh2_pg.C_1950037

816169 MGACMSDTSGLLNTKKSGPHVLNPRGEHPTIVRTNVGIPCVGLLSQETIQYVFDPELVDKEPCCFGYSEVPGDV

RRGHCPSMFANGQEHRRKKAFLVDVFKECRDKIQTVLFKTILEDFEEWSRVKTVPDFEDRVYFLISKAVT

EAVFGTKLDGRLALTWLEGAIQLKTWLPIPNYAKRHRLAVAALGELMKTIEESPKYEELIRMCHLHDLEA

EDGMMTLMHAILFNGCGAVTTTIITSVARYQTIPAGERKDLQTSVLQEVEKFGSITEESLGEMEFLESFL

LEVLRMHPPVADFWGVAKKDFTVSAGEIKEEIRKGERLLGSCFWAQRDVSVFLRPGLFRSRRFLDEKE

KRSNLLFPHGSFLEAASLDSHQCPAMDIAFILMKATLAVLLCYCKWELQDTPEWSDKITRLGKPDGLVSLTS

FGFDLVEARRVLEL*

 

$$$$$$$

 

>estExt_fgenesh2_pg.C_1200087|Brafl1 22% to CYP74B2, 25% to CYP74F1 rice

(29% to Nematostella XM_001636310.1 12 exons)

1244338  MGNCCSNYAGMWRALQQGNYSIKEINYGGADATVLRRNIGVTVVSLLDQHNIRYVFDMDLVEKVPFTLGNTALRPAVLGG

HCPGMLSNGVEHVRRKEFAMAVIQRSLTNSLFSTMVEQLHAHTSMWATVGHNIYDFEDRVNRFCADAVSTVILGTTLPYE

SVRAWQNGLHSHRPRVPTLGRYLAKSHALRALPVLLRNIRNAPAYEDIIHLGKTCGLTEEEATHEILYTIVGHALPQVQN

PLLACLAAYAAMPDLDRRQMWEEMNK (0) 1245135

1247089 VLHNVGTFTETVLGSMTCVESFILEVLRLRPPMEMFFGRARKDFIVKTRDREIFQ (0) 1247256

1247668  VHEGEVVCGSAFWAGRDPTSFRVPIMFRRNRFACPGSEALRGSLIFGRGPLTFLPTNENHQCPGLELAMGVLKPSMAWL

LMFCKWKLTEEPKWSGKKRSRCGKPDNPMGMVTFKYYPTDVANYYPLPGVTPSNEKGKPGKDNSPNVSSFVSSIL* 1248132

 

Two genes nearly identical 45 kb apart (possible assembly error?) 

estExt_fgenesh2_pg.C_1200094

1303480 MGNCCSNYAGMWRALQQGNYSIKEINYGGADATVLRRNIGVTVVSLLDQHNIRYVFDMDLVEKVPFTLGNTALRPAVLGG

HCPGMLSNGVEHVRRKEFAMAVIQRSLTNSLFSTMVEQLHAHTSMWATVGHNIYDFEDRVNRFCADAVSTVILGTTLPYE

SVRAWQNGLHSHRPRVPTLGRYLAKSHALRALPVLLRNIRNAPAYEEIIHLGKTCGLTEEEATHEILYTIVGHALPQVQN

PLLACLAAYAAMPDLDRRQMWEEMNK (0) 1304277

1306472 VLHNVGTFTEAVLSSMTCVESFILEVLRLRPPMEMFFGRARKDFIVKTRDREIFQ (0) 1306636

1307061 VHEG 1307072

ends in a sequence gap

 

>estExt_fgenesh2_pg.C_1950037 [Brafl1:125761] two genes fused

Scaffold 195

Neighbor to amphioxus on right side scaffold 195 also on another scaffold

P450 like Nematostella/CYP74

MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIHYAINPETYKKEPYSFGPVGV

SKDVLRGHCPSMFSNDEDHRRKKALLVDAYKQGEKSLPSILFNQIKAHFGEWSRLKDVPDFEERVFHIMS

ETLTEALFGRKIDGQLCFTWLNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHT

HGVEVEEGIFTILYGTLFNGCAAQTAAIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGEMKT

LESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVRKGERMLGCCFFAQRDGSVFPDPDRFRWNRFLD

EQGGQKKHLFFPRGSFTEAADLNSHQCPGQDIGFFMMKTTLSVFLCYCSWELKDAPVWSDKPIRVGNPDD

PVRLVRFNFRSEQAGRALVNTSAKKI*

 

>estExt_fgenesh2_pg.C_3320046 [Brafl1:128846] 3 exons

Scaffold 332 also 98% identical to upper seq but gene neighbors are different

MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIQYALNPETYKKEPYSFGPVGV

SKDVLRGHCPSMFSNDEDHRRKKALLVDAYKQGEKSLSSILFNQIKAHFGEWSRLKDVPDFEERVFHIMS

ETLTEALFGRKIDGQLCFTWLNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHT

HGVEVEEGIFTILYGTLFNGCAAQTAAIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGDMKT

LESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVCKGERMLGCCFFAQRDGSVFPDPDRFRWNRFLD

EQGGQKKHLFFPRGSFMEAADLNSHQCPGQDIGFFMMKTTLSVLLCYCSWELKDAPVWSDKPIRVGNPDD

PVRLVRFNFRSEQAGRALVNTSAKKI*

 

>estExt_fgenesh2_pg.C_1940045 [Brafl1:125747] 81% to estExt_fgenesh2_pg.C_1950037

scaffold 194

MGGVWSDTFGFIKGLVHGPHMMKPEGEHPSVFRANPGVPAVVLLNRDTIQYAFNPETYEKEPYSFGPVCA

AKDVVGGHCPSMFSNDEDHRRKKALLIDVYKQGQKTLPSVFFSQIKAHFEEWSRLEDVPDFEERVFHITS

ETLTEALFGKKIDGRLCYTWGNGIPTDFRTWIPIPPAARKRRQAVEVLPALLKAIKETPKYQELVQLCHT

HGVEVEEGILTILYGTLFNGCGAQTATIISSVACLHTLSDAEKNEIIQTTLQVLEKRGGISEESLSEMKT

LESFILEVLRLHPPVFNYWALARKDLVISPEKENIKVCKGERMVGSCFWAQRDGSVFPDPDRFRWNRFLD

EDEQGGQKKHLFFPRGSWTEAADLDSHYCPGQDIGFFILKVLLAVLLGYCSWELKDAPVWSDNTFRLGNP

DDPVRLARFNFRSEQAGRALGIRPDNIAPNAI*

 

Seq downstream is similar to a Rickettsia seq

 

>estExt_fgenesh2_pg.C_510020 [Brafl1:120723]

Scaffold 51

87% to estExt_fgenesh2_pg.C_1940045

83% to estExt_fgenesh2_pg.C_1950037

623366 MGGVWSDTFGFVKGLVYGPHM

MKPKGEHPSAFRMNNGVPAVVLLTRDTIQYAFNPETYEKDPYSFGPGGVSKDVVRGHCPSMFSNDEDHRR

KKALLIDVYKRGQKTLPSVFFSQIKEHLEEWSRLEDVPDFEERVFHIMSETLTEALFGRKIDGELCFTWL

NGLLTDFKTWIPIPSMSRKRRLAIEALPALLKAIKEAPKYQELVQLCHTHGVEVEEGIFTILYGTLFNGC

AAQCAAIVSSVARLHTLSDTEKNDIIQTTLQVLEKHGGVSEESLGEMKTLESFILEVLRLHPPVFNFWCL

ARKDLVISPEKENIKVCKGERMVGCCFWAQRDESVFPDPDRFRWNRFLDEDKQGGQKKHLFFPRGSWTEA

PDLDSHQCPGQDIGFFMMKALLAVLLGYCSWELTAAPMWSDKTIRVGNPDDPVRLARFNFRSEQAGRALG

IRPDNIAPNAI*

 

>fgenesh2_pg.scaffold_163000045 [Brafl1:87575]

Scaffold 163

73% to estExt_fgenesh2_pg.C_1940045 pseudogene

MGGVWSDTFGLIKGLVYGPHMLKSEDEYPTAFRTNNGVPAVVLLNRDTIQYVFNPEMYEKEPFYFGYLGT

SKDVMRGHCPSMFLNGEEHRQKKALLIDAYKQGQKALPSVLFKQIKAHFGEWSRLDEVPDFEDRVFHFFS

EALTEALFGRKVDGQLCRTWLNGLLNDFKTWIPMPSMARKRRLAIEAIPVMWKAIEEAP

K  Y  *  E  L  V  Q  L  C  D  T  H  G  V

E  A  E  E  G  I  F  T  I  L  C  G  T  I  F  N  G  I  A  A  E  R  A  A

I  V  S  S  V  A  R  L  H  T  L  S  D  A  E  K  N  E  I  I  Q  T  T

L  Q  V  L  E  K  H  G  G  VS  (frameshift)

G  E  M  K  T  L  E  S  F  L  L  E  V

L  R  L  H  P  P  V  F  N  L  W  G  L  A  R  K  D  F  I  I  S  P  E

K  E  N  I  QV  (1)

IA I  I  R  K  G  E  Q  L  L  G  S  C  F  W  A  Q  R  N  G  S  V  F  P

D  P  D  R  F  R  W  N  R  F  V  G  E  D  E  Q  G  E  Q  K (2?)

(0) K  H  L  F  L  P  R  S (2)

NWTEAYDFDSH

HCAGQDIAFLTMKATLAVLLCYCSWELKDAPVWSDKTLRVGNPDDPVRLTRFSFRSEQAGRALGIRPDNT

YPNSI*
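The spaced-out lines in the record above can be collapsed back into a plain peptide to locate the in-frame stop that supports the pseudogene call. A minimal sketch, assuming the spaced letters are single residues and that anything in parentheses is a curator note rather than sequence. `collapse` and `internal_stops` are names invented for this illustration, and only the first five lines of the spaced region are used.

```python
import re

# Lines copied from the fgenesh2_pg.scaffold_163000045 record
RAW = """\
EALTEALFGRKVDGQLCRTWLNGLLNDFKTWIPMPSMARKRRLAIEAIPVMWKAIEEAP
K  Y  *  E  L  V  Q  L  C  D  T  H  G  V
E  A  E  E  G  I  F  T  I  L  C  G  T  I  F  N  G  I  A  A  E  R  A  A
I  V  S  S  V  A  R  L  H  T  L  S  D  A  E  K  N  E  I  I  Q  T  T
L  Q  V  L  E  K  H  G  G  VS  (frameshift)
"""


def collapse(block):
    """Join residues across lines; return (peptide, list of notes)."""
    pieces, notes = [], []
    for line in block.splitlines():
        notes.extend(re.findall(r"\(([^)]*)\)", line))  # keep curator notes
        line = re.sub(r"\([^)]*\)", "", line)           # drop them from the seq
        pieces.append(re.sub(r"\s+", "", line))         # remove the spacing
    return "".join(pieces), notes


def internal_stops(pep):
    """Positions (0-based) of '*' that are not the final character."""
    return [i for i, aa in enumerate(pep[:-1]) if aa == "*"]


peptide, notes = collapse(RAW)
stops = internal_stops(peptide)
print(peptide)
print("internal stops at:", stops, "notes:", notes)
```

The stop reported here is the `*` in the `K  Y  *  E  L` line. The frameshift note is carried along separately so it is not lost when the spacing is removed.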

 

unassigned (1 sequence)

 

>gw.549.14.1|Brafl1 heme signature like seq possible pseudogene fragment no allele

CVYWDFKLNDDKGGWSSEGCNVYYAADTHTVCHCNHLTNFALLMDVYGSTAKLSEGNQKALSIISLIGCAVSSAGLLFAL

ITFLLFRTLRRDNPTKILINLCVALLLVNLTFVTLSHPEQFHAGFMCKTHAMVMHYALLAAIAWMGIEAVNMYLAFVKVF

DTYYTNFVMKICLAGWGRLLKYVQFSMGMKSCVARACVCVCVCVCVCVCVNLCYLSGIAFYAAFVAPVCVVLIFNTTMYG

LVLRHVVRMRGKVEKSELSEVITKLKRAAGLCVLLGVTWLFAMLAIDKAAVFFSYVFAICNSLQGFFIFVFHCVLRKSAR

KRWMALLPC