Aphid P450s from Acyrthosiphon pisum

 

Oct. 13, 2009, corrections June 6, 2013

 

D. Nelson

 

CYP2 clan

 

>CYP15A1 LOC100162751

49% to CYP15A1 Tribolium, 51% to CYP15A1 Diploptera punctata

52% to CYP15A1 Reticulitermes flavipes

MLFFVTLVISLVLLFLILDTIKPRRYPP (1) GPKWLPIGVN (Tribolium N-term)

 MFFVAVVISVFIIVCILDIITPHKYPI (1) GPTRVPLLGNYLE

IRKLRNKLGFYHLVWDHLAKYYGKVFSVKLGRIEAVVVSGYDAVRQVLCKDDFDGRPDGFFFRFRAFYKRLGIVFVDGPTWTEQRKFCMQHLRKMGFGGDLMERIIIEEVNDLMLDISRKCENGKPIEVYGLFDVSVLNGLWAMLAGHRFALNDSRLARLMELVHVSFRMLDMSGGILNQMPFIRFFAPKCSGYKYLKQIINEFYTFLKESVEEHKCRANDQEDDFISAFLKEIEKNKESPGSFSEEQLLVILLDLFLAGSETTSSMLSFVILLLLKHQDIQAKVHAELDAVVGDREIHLADKNRLNYLEAVLMEVQRHSNVAPLAIAHRTIRKTSLQEYTIPKDTLVLASIWSVHMDEQHWGDPKVFRPERFLDSSGKIINDSWFMPFGVGRRRCLGEILAKTNIFMFIAKLIQHFEIRIPQGAQLPDKPQDGVTISPSPFSAIFIPRRCLSQ.

 

>CYP15A2P LOC100169165  whose N-term 100 or so aa are wrong

49% to CYP15A1 Tribolium

87% to LOC100162751

missing N-term exon, revised middle

Since the rest of the protein is so conserved (87%), the N-term exon seems to be gone.  Therefore this is probably a pseudogene.

GPSRIPFIGNYLEIRKLRNELGFYHLVWHQLAKCYGQVFSVKLGRIEAVVVSGYDAVRQVLCKDDFDGRPDGFFFRFRAFYKRLGIVFVDGPTWNDQKKFCMQHLRKMGFGGDLMEKIIIEEVHDLMVDITIKSENGKPIKVHGLFDISILNGLWAMLAGQRFALNDSRLARLMELVHVSFRMLDMSGGILNQMPFIRFLAP

NSSGYEHIKQILNEFYTFLK (0)

ESVEEHKCG

ENYQEDFISAFLMEIEKNKESPESFSEEQLLVILLDLFLAGSETTSSMLSFAVLLLLKHQDIQDKVHAELNAVVGDREIQLADKKKLNYLEAVLMEVQRHSNVAPLAIAHRTIRKTSLQEYIIPKDTLVLASIWSVHMDEHHWGDPEVFRPERFLDSTGNIIKDSWLMPFGIGRRRCLGEILAKANVFMFIANLIQNFEIRIPNGVQLPDRPQDGVTISPSPFSAIFIPRR.
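
The pseudogene call above rests on a percent-identity figure (87% to LOC100162751). The listing does not say how these percentages were computed, so the following is only a minimal sketch of one simple definition: identity over the ungapped columns of two sequences that have already been aligned. The function name `percent_identity` and the choice to skip gap columns are assumptions made for illustration, not the author's method.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns of two pre-aligned sequences.

    Assumes a and b are already aligned to the same length, with '-' for gaps.
    Columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


# The 13 residues after the (1) marker in CYP15A1 and the first 13 of CYP15A2P.
cyp15a1_fragment = "GPTRVPLLGNYLE"
cyp15a2p_fragment = "GPSRIPFIGNYLE"
print(f"{percent_identity(cyp15a1_fragment, cyp15a2p_fragment):.1f}% identical")
```

A short fragment like this is far noisier than a full-length comparison, so its value should not be expected to match the figures quoted in the notes.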

 

>CYP15A3P LOC100160402p which has either 2 annotation errors, or is a pseudogene.

44% to CYP15A1 Tribolium, 79% to LOC100162751

middle region revised

missing N-term exon

Since the rest of the protein is so conserved (79%), the N-term exon seems to be gone.  Therefore this is probably a pseudogene.

GPTRVPLLGNFLEIQKLKNKLGFYHLVWDKLAKCYGQVYSVKFGPIETVVVSGYDAVREVLSKDDFDGQADGFFFRTRAFYKKLGIVFVDGPMWTEQRKFCMRHLQKLGFCGDVMEKIVIEEVNDLVLDITRKYENGKSIEVRGLFEVSVLNGLWAMLAGGRFSLNDSRLARVVELIHESLRILDMPGGILNQWPFIRYLAP

LSRNKHLKQIINELYILLK (0)

ESVEEHKCSENDQE

DFISAFLMDIEKNKKSLGSFSEEQLVVILLDLFLAGSETTSITLSSVILHLLMNQDIQTKVRAELDAVIGDREILPSDRKRLNYLEAVFMEVQRHSNVVPLAIATNRTIRKTTLQDYIIPKDTLVLASIWSVHMDEQHWGDPEVFRPERFLDSKGKIINDSWLMPFGVGKRRCLGEKLAKTYIFMFIAKLIQHFEIRIPTDIQLPDKPQNGVNISQTPVSVFFIPRRCLKAN.

 

>CYP18A1 LOC100163652 = 26 hydroxylase (not a Halloween gene)

MTAETMSDGDGYSRELWLNAVAAALGLTYSAYRQLRAARTLPPGPWGVPFLGYAPFLSNHCTYLKYNELARRYGPICSFTQRGNTVILLSDHKLIKTAFDMKQITGRPNDGYMDIIGGYGAVNSTGKLWESQRKFLHLVLRHMGMTFTGHNRLNMENRIMIEVSTLTETFHKTCGKPIDLNAGSLCLAITNVISSLTMSVRFEPNDPRFERYMHMVDEGFKLFGMLRPVSLFLPRRHITDERNIQEKIKNNHQEIAKYFQSIIEEHRSTFDPNSIRDLVDAYLLEIKRSQEAGTMDQLFQGLDPNRQVQQILGDLFSAGMETIKNTILWAMVYMLHYPDVMTKVQDEIDSVVGQYKSPVLDDYPNLPYTQATLYEVLRKSSITPLGTTHATTSDVTLNGYHIPTGAQIIPLQHFVHNDPNLWDEPEAFKPERFINAEGKVKKPDCFLPFGVGRRKCLGETLAQMELYLFFSTLLHEFDVCLPDGDELPSMDGQVGITLTPQSFKVVMKARNK.

 

>CYP303A1 LOC100162206 = nompH (not a Halloween gene)

MWILVLVLFSVVVALLSYLDMRKPKNYPPGPKWLPILGSALTVNSLRKQTGYLYRATICLAESYGPIVGLKVGKDRQVVCCGYNAIKEMLTKEEFDGRPQGPFYETRTWGTRRGLLLTDEEFWVEQRRFVLRHLREFGFGKRTMAELVQDEAVQLVEDFKEKIAMSKNGNGEIFEMRDAFSVGVLNTLWSMMASKRYNADDIELKNLQALLTELFANIDMVGALFSQFPVLRFIAPEASGYKSFVNIHQQVWKFLKAELDDHKETFIINQPRDLMDVYLQMLHSEDKKESYSESQLLAICMDMFMAGSETTSKSLGFGFLYLLLNPEVQKKAQEEIDRVVGRDRLPTLNDRPNMPYLEALVLESVRVFMGRTFSIPHRALKDTTLQGYHIPKDTMVIANFAALLNDDDVWDHPDRFWPERFIGCDGKLIVPDEYLPFGYGKHRCMGQTLARSNIFLFSACLLQNFDFSVPDGQAPPSTLGVDGVTPSPGEFNAYVSLRPR

 

>CYP305E1 LOC100168939 SCAFFOLD13 coords:75476-86851

41% to CYP305A1 Drosophila melanogaster

MAWYFVCFVTVILLLIALRTCRKPKNYPPGPKWIPFVGNTYQLSKLAATKNGQYLAFEELRQRYKSDIIGLKLGREYVVIVFGNDLLNETFHRDEFQGRPDNFFMRLRTMGKRRGITMTDGDLWKVHRSFAVRHLKLLGLGQRRVDELIHDEYQLMVDRLFDATKSVTPTLYLQSAVMNVLWELTAGTKFEDPKLLTLMRKRSSAFDMAGGLLNQIPWLRYLAPTRTGFSLITEINQQLYSLISNIIVEHKKTITHTTRDFIDAYLNQMKKEEIYNTMFTEEQLIAVCLDLFIAGSSTTSSTLDFAILAMARWPDVQAKVQSTLDEIQPPGTYITAEQILKNRYVEAVLLETKRLNHVTPIIGPRRVLRNTNLNGYNIPKNTTILMSLYSVHQDQLKWGDPEVFRPERFMDTNGKINTTEDMYFFGFGKRRCPGEALAQRFVNLAFANLIHDFTIEIDQLPDGVNCGILLTPKPYKIKMTKRK

 

>CYP306A1 LOC100165691 = phm 25 hydroxylase

MFWIIGVILFGVLCAGYLWRSNRNLPPGPWGVPIFGYLPWLNPTEPYKTLTALASKYGPIYSIQMGKHFAVVMSDPTLVRMALARNELADRTNFEVVNEIMQEHGLIFTHGPLWKEQRKFVCNWLKVIGVTKFGDKKNNLQLLIADAVSTTISKLRQSNNRPIDTGTFFLVHIGDFINLIVLGKAWPEDDPNWIYLRNLAEDGSKKFAIATPLSVLPILKIIPKYRNTVFEVIEGVKNTHLIYKTLMEKRGNEIHESDDLMAMFMKEMTKRKNDKDSHYFTEKQCCFLLSDLFGAGVETTVNTLRWFLLYMALNQEIQNDLQKLLDSACTDGGLIGLEQIESIPLLKACVSETMRLRPVAPSGIPRSVNTEITISGYRIPKGTMVLPLQWAMHHDEKYWTDPETFRPKRFLDDEGNMINHKAFMPFQAGKRACVGDTLSYWILYLFGANIIHNFNVSAEQGLSEKEINTIMDGEFGITLSPATHKVVFKSRI.

 

There are also three paralogs of the Halloween gene CYP307 (spo), so one might expect them to have different expression patterns. CYP307 has a complex history (see the papers by Rewitz and Sztal).

 

>CYP307A1 LOC100160204

85% to LOC100160738

57% to CYP307A1 Tribolium, 49% to Cyp307a2 Drosophila melanogaster

39% to CYP307B1 Tribolium

MDTAKGVVAAAADNVTVVLLLLLSVVLLILAVKSASGRGPWTSRRRPGKSTAAVALTAVPDGPTAYPVIGALHAMDGHRDKPFHRFTELSHKYGPVFSMTMGSMPCVIVNDFDSIKEVLITNGSKFGGRPDFSRYNVLFAGDRNNSLALCDWSWLQETRRKIARKYCSPKVCSSNYGLLDSISSDELDVFLESLAAVTIRGFECEVQLKKQLLMACANMFIRFMCSTQFEYGDPKFQNMVRTFDEIFWDINQGYAVDFLPWLKPFYAGHMRKLSKWSTQIRRFIMDTVVSKRYAADDVDEQEPIDFTDALLMSLRKEPGLKMNHVLFELEDFIGGHSAVGNMIMLALSMVATRPHVAQAIRDEAEQVTGGQRLVRLYDKPDMPYTEATLFETLRFISSPIVPHVATEDTTIKGFKISKGTCIIINNYEINTSPAYWDNPEVFDPNRFVHRESGTKPCIRKPEYFLPFSTGKRTCIGQQLVSGFGFVLLAGILQRYEVKATAQLAIPEARLALPPDTYPLILKPLDGSR

 

>CYP307A3 cLOC100160738

52% to CYP307A1 Tribolium, 50% to Cyp307a2 Drosophila melanogaster

37% to CYP307B1 Tribolium

MDTTNGIVAGADTVTVALSLLLPVVLLMLAVAWACGPLAAHRRPGTSTAAVLDGPKSFPIIGSLHAMDGHQDSPFRRFTELSHQYGPVFAMTMGSMPCVVVNDYDSIKEVLITNGSKFGGRPDFTRYNALFAGDRNNSLALCDWSSLQETRRKIARTYCSPKVYSSNYCLLDSISSNELDVFLDSLATVSVRGSECEVQLKQLLLMASANMFIRFMCSTQFEYGDPEFQNMVRTYDEIFWDINHGYAVDFLPWLKPFYAGHMRKLSKWSTQIRQFIMDMVVSKRSSYAKAQEPTDFTDALLMSLRKEPGLKMNHVLFELEDFIGGHSAVGNMVMLALSMVATRPHVAQAIRDEAEQVTGGQRLACLYDKPDMPYTEATLLETLRFISSPIVPHVATEDTTIKGFKISKDTCIIINNYEINTSPAYWDNPEVFDPNRFVHRKFGAKPCIRKPEYFLPFSTGKRTCIGQQLVSGFGFVLLAGVLQRYEVKATAELAIPEARMALPPDTYPLILKPLDGSR

 

>CYP307C1 LOC100159333

46% to CYP307B1 Tribolium, 43% to CYP307A1 Tribolium, 40% to CYP307A2 D. melanogaster

42% to LOC100160204, 40% to LOC100160738

MEFVFSSLTYLLLFVLTAVLLFLIRDELKTKQVDHRAGLVDPPAPKAWPIIGHLYLMARYKVPYRVFDEIMADLGSVFRLDLGSVPCVVVNGLNNIREVLMIKGDHFDSRPSFRRFNQLFKGDKNNSLAFCDWSQLQKTRRELLRAHTFPNTTSNMYTRLDTCLKTELADLTDTLDTMANTECVDIKNMLLHTCANVFMSYFCSTRFSRSYDKFREFIRNFDDVFYEVNQGAPCDFLPSLMPLYHWHFKKIRSWSSKIRNFMETEIFNKRKAAWVPGTKPVDFVDNLLDAVTQPDRDDGFDMDIGLFSLEDIIGGHSAITNFIVKTLGFLVDRPDVQRRIQEESDAVVRASGSVGLSDRSQMPYTEAVVYESLRLIASPIVPHLANRDTSVDGVRIRKGTTVFLNNYSLHMSPELWNNPEHYSPERFINAEGRLEKPEYFIPFSGGKRSCMGYKLVQLLSFCTISTLLNKYTLLPVEDVSYAVPKGNLALPFVTFPFRLRPRNFRKQ

 

CYP3 clan

 

>CYP6CY1 LOC100159226.pro          SCAFFOLD10025:26656..31321 (+ strand)

82% to LOC100163313.pro (adjacent), 77% to LOC100168115.pro (2 genes away)

MFTASWWINVITPCTIIVTITYYFCVSTFKKWEKLNVPYIKPIPLFGNFLNVALGKNHPLEFYNKIYHEFAGQKYAGVFQMRTPYLMVRDPEIINDVMIKNFSSFPDRGIYSDFVAEPLTNNLLLMENPQWKIIRNKLTPAFTAGKLKTMYDQIKECGDELMKNIDIDLNRTSNEIEVKDIMGKYSTDVIGTCAFGLKLNAINDDESPFRKYGKLIFKPSLRVLMRELCVMITPALLKVVRLKKFPTAATDFFHAAFNETMTYRLENNIVRNDFVHYLMQARNDLVLNTDLPKHEKFAESQIVANAFVLFAAGFETVSSAISYCLYELALNKSIQDRVRKEIQLQLSKNNGQINHELLIDLNYLDMVIAETLRKYPPLVALFRKASQTYRVPNSSLIIEKGQKIIIPIYAIHYDNKYYSDPEKFIPERFSAEEKAKRPSGVYLPFGDGPRICIDSEWLRNVLMLQ
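
Many headers in this listing carry a genomic location written as `SCAFFOLDnnnn:start..end (+ strand)`. As a minimal sketch, assuming only that form, the location can be pulled out with a regular expression. The names `Location` and `parse_location` are hypothetical, and the other form used in some headers (`SCAFFOLD13 coords:75476-86851`) is deliberately not handled here.

```python
import re
from typing import NamedTuple, Optional


class Location(NamedTuple):
    scaffold: str
    start: int
    end: int
    strand: Optional[str]  # "+" or "-", or None if the header omits it


# Matches e.g. "SCAFFOLD10025:26656..31321 (+ strand)"; the strand part is optional.
_LOCATION = re.compile(r"(SCAFFOLD\d+):(\d+)\.\.(\d+)(?:\s*\(([+-]) strand\))?")


def parse_location(header: str) -> Optional[Location]:
    """Return the first scaffold location found in a header line, or None."""
    match = _LOCATION.search(header)
    if match is None:
        return None
    scaffold, start, end, strand = match.groups()
    return Location(scaffold, int(start), int(end), strand)


header = ">CYP6CY1 LOC100159226.pro          SCAFFOLD10025:26656..31321 (+ strand)"
loc = parse_location(header)
print(loc)
print("span in bp:", loc.end - loc.start + 1)
```

Sorting the parsed locations by scaffold and start coordinate is one way to check notes such as "adjacent" or "2 genes away" against the coordinates.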

 

>CYP6CY2  LOC100163313.pro        SCAFFOLD10025:31376..37086 (+ strand)

MFTANWWINFITPCTIIVTIAYYFCVSTFKKWEKLNVPYIKPIPLFGNLLNVAVGKDHPLDFYNKIYHKFAAHKYAGVFQMRTPYLMVRDPEIINDMLIKDFSSFPDRGIYSDFVAEPFSNHLFFMENPQWKIIRNKLTPAFTSGKLKMMYDQIKECGDELMKTIDIELIKNDDEIEVRDIIGKYSTDVIGTCAFGLKLNAIKDDESPFRKHGKTLFEPSLRALFKELCLMIAPALLKVIKVKDFPTDATDFLHTVFKETITYRQKNKIVRNDIFQCLIQVRNDLVLNADLSKNEKFTETQIVANAFAMFAAGFETVSSAISYCLYELALNKSIQDRVREDIELKLSNNDGQINHELLIDLNYLDMVIAETLRKYPPVVALFRKASQTYRVPNDSLIIEKGQKIIIPIYALHYDSKYYTDPEKFIPERFSAEEKAKRPSGIHLPFGDGPRICIGKRFAEMEMKLAFVEILTKFEVFPCEKTEIPLKYSNKVFTLMPKHGIWLRFKRIN

 

>CYP6CY3 LOC100168115.pro         SCAFFOLD10025:41696..46319 (+ strand)

37% to CYP6A14, 38% to CYP6Y1, 39% to CYP6AQ1 bee

42% to CYP6AX1

MAYDLNEIKIKLNYVNIINVIIVVLYHYLFKIRFREKKYAEKYPLIEFDTKRENSVDSCVFCVGTLFNIQNCVNYRKGHMYPSSSSATDWWIYIVTPCLVAVTITYYFCISTFNKWEKLNVPYIKPIPLFGNFLKVALAKDHPLEFYDKIYYKFSGLKYGGLFQMRTPYLMVRDPEIINNMLIKDFSSFPNRGIYSDLAANPLSDNLFFMENPRWKTIRSKLTPAFTSGKLKIMYDQIKECGDKLMKNIDNDLKGKNDEIEVRDIMGKYSTDVIGTCAFGLKLNSISDDESPFRKYGKSIFIPSLRTLFRELCLMVSPALLKVVRVKDFPTDATAFFNAAFKETITYRLENKIVRNDFVNCLMQARNDLTLNTNLPKHERFSESQIVANAFVMFAAGFETTSTTLSYCLYELALNIHIQDKVRQEIQLKLSKSDGQIDNEFLMGLNYLDMVIAETLRKYPPLIALFRKASQTYRLPDNLILEKGQKIVIPIYSIHFDSKYFEDPLKFNPERFSSEERAKRPNCVYLPFGDGPRTCIGKRFAELEMKLALVEMLTKFEVLPCGKTEVPLKYSNKALTLMPKHGIWLRFKKIV

 

>CYP6CY4  LOC100164042.pro        SCAFFOLD10025:137606..141395 (+ strand)

87% to LOC100167264.pro, 85% to LOC100163313.pro

80% to LOC100168115.pro, 90 kb downstream of LOC100168115.pro

MFTANWWINVITPCTIIVTIAYYFCVSTFKRWEKLNVPYIKPIPLFGNFLNIALGKDHPLEFYNKIYYEFAGRKYGGLFQMRTPYLMVRDPEIINDVMIKDFSSFPDRGIYSDFTANPLSNNLFFMENPQWKTIRNKLSPAFTSGKLKTMYDQIKKCGDELMKNIDIDLNKNGNEIEVRDILGKYSTDVIGTCAFGLKLNAISDDESPFRKYGKSIFTPSLRMLFRELCLMITPALLKVIRVKDFPTAATDFFHAAFKETMTYRIENKIVRNDFVHCLMQARNDLVLNTDLPKHEKFTETQIVANAFVMFAAGFETVSTTVSYSLYELALDKSIQDRAREEIQLKLSKNDGQINHEFLMDLNYLDMVIAETLRKYPPLVALFRKASQTYRIPNDSLIIEKGQKIIIPIYAIHYDTKYYPEPEKFIPERFSVEEKAKRPSGIYLPFGDGPRMCIGKRFAEMEMKLAFVEILTKFEVFPCEKTEVPLKYSNKVLTLMPKHGIWLRFNRIN

 

>CYP6CY5 LOC100167264.pro         SCAFFOLD13514:28216..31057 (+ strand)

MFTANWWINVITPCTIIVTIAYYFCVSTFKKWEKLNVPYIKPIPLFGNFLDIALGKAHPLEFYGKIYNEFAGRKYGGLYQMRTPYLMVRDPEIINDMLIKDFSSFPDRGIYSDFVANPLSNGLFFMENPQWKIIRNKLTPAFTSGKLKTMYDQIKECGDELMKTIDMDLIKNGKEIEVRDIMGKYSTDVIGTCAFGLKLNAINDDESPFRKHGKSIFTPSLRSLFRELCLMVTPALLKVVRVKDFPTDATDFFHAVFKETITYRLENKIVRNDFVQCLIQARNDLVLNADLPNHDFVLEKFTESQIVANAFGMFAAGFETVSSTISYCLYELALNKSIQDRLRKEIQLKLSKNDGQINPEFLMDLNYLDMVIAETLRKYPPLVALFRKASQKYRLPNDSLIIEKGQKIIIPIYALHYDNKYFTDPENFIPERFSAEEKAKRPNGIYLPFGDGPRICIGKRFAEMEMKLAFVEMLTKFEVFPCDKTDIPLKYSNNVITLVPKHGIWLTFKRIN

 

>CYP6CY6 LOC100165240.pro         SCAFFOLD13514:36706..45131 (+ strand)

downstream of LOC100167264.pro

MFTDNWWIYVITPCTIIVTIVYYFCVSTFKKWENLNVPYIKPVPLFGNFLNVALGKEHHIDFYNKFYHKFAGHKYAGVFQMRLPILMIIDPEIINDVLIKDFSSFPNRGFSVDFKANPLSNNLFLMENPQWKIIRNKLTPAFTSGKLKVMYDQIKECGEELMKNIDIDLKKSGDEIEVRDIMGKYSTDVIGTCAFGLKLDAINDDESPFRKHGKSIFAPSLRQLFREMCMLISPVLVKVVRVKDFPKDATDFFHAAFKETMKYRHENKIVRNDLVHCLMQARNDLVLNTDLPKHEIVLEKFTESQIVANAFIMFAAGFETVSSAISYCLYELALNKSIQDRVREEIQLKLSKNDGQINHEFLMELHYLDMVLAETLRKYPPLVFLMRKALQTYRLPNDSLTIEKDQKVIIPVYAIHHDSKYYPEPENFIPERFSTEEKAKRPNGTYMPFGDGPRICIGKRFAEVEMKLAMVEMLTKFEVFPCEKTEVPLKYSHKTITLMPKHGIWLKFKKIN

 

>CYP6CY7 LOC100160895.pro          SCAFFOLD17283:187176..190201 (+ strand)

MIDVISCSIIGLLSSVYILYATVFLSIAYYLCTSTHDKWRKLNVPYTKPLPLFGNSMNLVLAREHPMDFFTGLYNRFPDEKLCGFYQMTTPFLMIRDPKLINNIMVRDFSYFTDHGFDTDPSVNILANSLFMLNGDRWRTMRQKLSPGFTSGKLKDTHDQIKECTDQLINIVDDNLKVSDHFEIRELVGNFSTDVIGMSAFGLKLDTIRNGNLDFRKFGKKIFQSDFKQLFVQAMMLFCPKLVTILKLKQFPDDAADFYGSMFRDVLEYRDRNNVIRNDVTQTLIQAKKDLVTNNDGDDSTSKNKWTEMDIVGNAILMFVAGAETVSITICFCLYQLALNKDIQDKLREEIVTTNAKHGGQLNNDFLTNLHYMNMVLEEVSRMYSITMILFRQATKNYEVPGQSLVIEKGQKIIIPAYCIHNDPKYYPNPGTFDPERFSTEEKAKRLNGTYIPFGDGPRLCIGKRFAELEMKLVLSKILLKYEVLPCEKTEVPINIRGAGSIVNPKNGVWLSFKPIVAN

 

>CYP6CY8 LOC100159248       SCAFFOLD5532:14697..15575 (- strand) and SCAFFOLD5532:7975..9925 (- strand) (seq gap in middle of gene)

MLIFANFWMDFIILVTVLFSIIYYYCTSTFNVWKKLNVPYVRPIPLFGNYLKVALGIENPMETYKNIYYELAGFQYGGMFQMRTPYLMIRDPEIVNNILIKDFSYFTDRGIHVDFKAEPLSEVLFLMENPRWKKLRSKLSPAFTSGKLKQMYSQIEKCGQDMIINIFAELKKNPNEIDIRDILAKYSIDVIGSCAFGLALNVASDDTSLFRSYGKTAFSETLRKYPLLFALFRVATKTYRVPNDSLIIEKGQKIIIPTFSLHFDPRYFSDPEVFNPERFSTKEKAMRPNGVYLPFGDGPRLCIGKRFAEMEMKLALVEILSKFEVEPCEKTEIPIQFSKLSVVVIPKDEKILLKLNPLSE

 

>CYP6CY9 LOC100161627       SCAFFOLD1099:1136..6561 (+ strand)

MSASQLLVDLAAGWWTVAVLALLAATVYHFCTSTFGYWRDRGVPYVRPTVPLFGNIGGLALGVEHQARMFGRIYDGFRGQRYGGFFQMRTPHLMVCDPALVNRVLIGDFAHFTDHGMYTAGPDENPLANGLFNMNGAQWKIMRQKLSPVFTAGKLRHMRGQVTECSEQLMRNVAADVPAGGGQMEIRDVLGKYSTDVIGTCAFGLHLNAINDERSSFRKHGKAVFAPSFRVLLKELAWMVTPALRRALRIGDMPPDAAQFFTAAFTDTMKYREEHGIVRDDFMQSLIQARTDLVVNKTEPSVEFLETDIVANAFILFAAGFETVSTAMSFCLYELALKKPIQDKVREEMNTTKKKHNAEIDNDFLKDLHYLEMVLAETLRKYPPLLTLFREATQDYQVPDDTFVIEKGTKVLIPAYAIHHDYRYYPDPETFDPERFSPEEKAKRPNGTYMPFGDGPRLCIGKRFAEMEMKLALTELLTTYEVEPCEKTDIPMRFSKRSLIITPENGIWLKFKPIHTSK

 

>CYP6CY10P LOC100159387 SCAFFOLD6634 coords:14145-18091 (- strand) pseudogene

upstream part not found in region up to next gene

47% to CYP6A13, 48% to CYP6AQ1 C-term only CYP3 clan

67% to LOC100168007.pro

MESQILSNAFGFFAAGFDTTSTSISYCLYELALKKNIQDRVREEIKLTKSKYNGVIDNEFLNDLHYLDMVIAESLRKYPLMFALFRVATKTYRVPNDSLIIEKGQKIIIPTFSLHYDPKYFSDPEVFNPERFSPKEKAMRPNGVYLPFGDGPRLCIGKRFAEMEMKLALVEILSKFEVEPSEKTMIPVQFSKLSVVVIPRDEKILLKLNPLSE

 

>CYP6CY11P SCAFFOLD12002:AUG4_SCAFFOLD12002.g13.t1 pseudogene

        SCAFFOLD12002:309999..313284 (- strand)

86% to LOC100163195.pro not in the collection

MISCLIDFLLGTPAIAVTVLMAFVYYYTTNTYDKWLKLNLRYDPPWPLVGNTMKMVTLIE HQLATIDGIYKRLAGEKYCGFYQTKTPFLMIRDPELINNILIKDFLNFANRGFHKDPALN IIANGLFFMEGPKWDVMRQKLSSGFTSGKLKLAHNQIAECSDELMRFIAAKMKENDQIEVK

*TMSKYSTDVIGTCAFGLSL

 

>CYP6CY12 LOC100163195.pro 39% to CYP6AX1

SCAFFOLD12002:306085..309474 (- strand)

MISCLIYVLFGTPAIAAVAVLAAILYYYTTNTYDKWLKLKVPHDPPWPLVGNTAKMMTLIEHQLTTIDGIYKRFSGEKYCGFYQMKTPFLMIRDPELINNILIKDFSNFADRGFHKDPALNIIANGLFFMEGPKWKMMRQKLSPGFTSGKLKLAHNQIAECSDELMRFIAAKMKENDQIEVK ETMSKYSTDVIGTCAFGLKLDTVKNEGSDFRLYGRKILKLSFRFLLAEMVSPKILKLLGVAEFPPDASAFYESAFKEVIRYREENGIVRHDVAQSLIEARKELVLDSTDENGFTEQHIIANAILMFLAGFETVSSTLSFCLYHLALNQDVQEKIRDEMNSKLKQHGKINNDFLVNLHYTDMVLAETERMYVVTNALFREAVKTYHVPGDTLVIEKGTKIMIPIYSIHHDPTYYPEPYIFDPQRFSPEEKAKRQSSTYLPFGDGPRFCIGKRFAELEMKMVLSQIITTFRILPCEKTEVPLKLQNGLPMMVAKNGIWLRFQSISE

 

>CYP6CY13 LOC100167704.pro       SCAFFOLD7563:47979..58025 (- strand)

only P450 on this 60 kb contig

MISWMFNCLIDSFTLICTTVIGLLFYYYSTSTYKKWRKANVPHTKPVPFFGNFFRSTLGFETINDTYHNIYKQFPDKKFCGFYQMRTPTLMIRDPELINNVLIKDFSHFTDHGLDMDPSVNFLASSLFFTRGQKWKIMRQKMSAGFTSGKLKLMHSQIKDCSKEMIDYIDRKSKTTDQFDMHDIMNKYATDVIGTCAFGLKLGSMKDEDNEFRKFTKLLFKPSFRLIFTNILSLISPKTSNILKIKTSSPEVMEYFTTSFQNVIEYREKNNMDRNDVAQTLMRARKELKFTEMDIISNAILMYLAGAEPVSDTLGFCLHELAINKHVQDKLRKHINTKRKEHGGEFTNDYLMDLHYADMVLTETLRKCNGTIVLFRKATKAYQVPDSSLVIEKGQQIIIPTYSIHHDPKYYTNPDVFDPERFSPEEKSKRPSSTELLFGDGPRFCIGKRLAELEMKLGLSEIISKFEILPCEKTENPVQLANAGGAIKPKNGIWLI

 

>CYP6CY14 LOC100161480.pro       SCAFFOLD8603:3827..8475 (- strand) model short

MISYLTNLLFDYIFLSLIIVCTFLYYYTTSTYDTWRKLNVPFAKPVPFFGNIFKMFTGLERQVDAFGRIYQQFPDEKFCGFYQMSTPFLMLRDPELINTVIIKDFSYFTDHGIDMNPSVNVMARSLFFATGQKWKTMRQKLSPGFTSGKLKGTHEQIRECSDQLTNCIYEKSQKTDAIEVYELVGNTATDVIGTCAFGMKLDTINNDNSSFRQNVKKVFKPSGKVIFAQILGVLFPKIVKFLKLQTSPVDVDAVNFFHSVFGEVIEYRTKNDVVRNDLTQTLMKARQDLVVSSDYKGEEKYCELDIIANAMLLFTAGSETVTATASFCFYELALNKVIQDRLRDEIISSKIKHGGQLNNEFLEDLHYADMVLDXXIEKGQKILIPIYSIHHDPKYYPNPETFDPERFTAEEKSKRPNGTFLPFGDGPRHCIGKRFAELELKLILSKILTKFEISPCEKTEIPLQMNKERGITSPKNGIWLNFRPIVE

 

>CYP6CY15 LOC100163900.pro SCAFFOLD5222 coords:2410-5346

MYFLTDWLLDNFTYLSLIAVFTGFYYYSTSTYGKWQKLNIPYIPPVPLFGNAFRMVTKLECPMDMYDRLYKQFPDVKLLGFYQMTEPMLLIRDPELINAILIKDFPYFTDHGFVMDPSTTVMAKSLFFSNGQRWRTMRQKLSPGFTSGKLRDTYLAINECSNQMVSSIVEKLGKTDRLAIRSIISGFSNDVIGMCAFGIQLDSMNNEDSDFRRYSERIFEKTTKQIIVQAVTTIFPFVINLFKIQMFSAEATNFFRKVFADVINYREKNNIVRNDLTQTLLQARKELVLKENSTAEGIVFADQFTDDDIIGNAIVLFADGAETISSIVSFCLYELALNKEIQDKMRAEICSMKAKHDGQFNNDFLMDLRYTNMVLEETGRKYSIASILMREATKTYTLPDESFVIEKGQKLIIPMFSIHRDPKYYPDPLIFDPERFSKEQKSQRPNGIYMPFGDGPRMCMGKRFAELEMKLVLSNVLSKFEVLPCEETEIPLEITDETGVIAPKRDLVLKFRPIIED

 

>CYP6CY16 LOC100162372     SCAFFOLD1019:69426..73624 (+ strand)

MISFMTDWLHDNVTCLSLIAVLASFYYYSTSTYGKWRILNIPYVPPVPLFGNTTRMMLRLEHPIDMFERFYNSFPDVKLFGFYQMRDPVLLVRDPELINAILVKDFSYFTDHGIDLDSSTSVLANSLFFANGQKWRTMRQKLSPGFTSGKLKDTHGQINECSDEMVSGIVESIKKKTDQIDVKTITGGFSTDVIGTCAFGMKLDTIKNDDSDFRRYVKIMFQSTPKQMIVQVLLMICPWVIKVLKINMFSVEATNFFHNVFTDVFKYREEHNVIRNDLTQTLMQARKELVLKENSSIEDKFTDADIIGNAILMFTAGSETISSMLSFCLYELALNIEIQDRLRSEICSMKAKHDGHLNNDYLMDLYYTNMVLEETARKYSIAFNLMRVATKTYTLPDESFVIEKGQKLIIPMFSIHRDPKYYPDPLRFDPERFSTEQKSQRPNGIYMPFGDGPRLCIGKRFAESEMKLVLSNVLSKFEVLPCEKTEIPVNIRSMSGFITPKNGIVLKFRPIVEH

 

>CYP6CY17 LOC100164459mod.pro (N-term trimmed)         SCAFFOLD2510:85936..93942 (+ strand)

MNDIKLLNGLVSGAVDSPDLLSRIGFRIPDKYSNRKRPLPLNKPIMISFLIDCLVNNVTCLSLIVIFTGSFYYYSTSTYNKWRKLKIPYVPPVPLFGNTFRMLARLEHPIDTFDKIYNHFPDFKLFGFYQMREPMLLVRDPELINMILVKDFLYFTDHGVDIDPSMSTLAKSLFFANGQKWRTMRQKLSPGFTSGKLKGTYCQINECSDEMVSSIVEAIGKKTDRIELKTITGRFSTDVIATCAFGLKLDSIKNGDSEFRRYVKILFQTTTKQAIILILSLICPRVVKILRLQFFSLEATNFFSKVFADVIKYREDHNVSRNDITQTLIEARKELVLKEISTTEDKFTDDDIIGNAIFLFSAGSETISSLVCFCLYELALNKEIQDKLRAEIYSMKAKHNGKLNNDYLVDLRYTNMVLEETGRKYSIAFNITRVATKTYTLPDESFVIEKGQKLIIPMFNIHRDPKYYPDPLRFDPERFSMEQKSQRPNGTYIPFGDGPRLCIGKRFAEAEMKLVLSKVLSKFEVQPCEQTEIPLDIRSGSGLLSPKNGLVLKFKPIIEH

 

>CYP6CY18 LOC100168007.pro        SCAFFOLD1502:67256..70800 (+ strand)

39% to CYP6AX1

73% to LOC100164042.pro

MHRTTLISLGGLFMHNINLLSNNIIVSVVKMSLYNLYLYCLYGCNVIYKLDKFSGITIYHNENPSSIQYRITNDIFTLYLMKIAEANAVLTLAYVKLVVFCVFLIMFSFIFIWWINIITPCLFIFTITYYFCTLTYSKWEKINVPYIQPIPLFGNFLDVALGMQHPIDFYRKIYYELAGYKYGGLFQMRTPYLMIRDPEIINNVLIKDFSNFPNRGIYSDFSANPLSNQLFFMENPQWKIIRKILSPAFTSGKLKLMYDQIKECGDELMKNIHKNLTKTDNKMEVRDILGKYSTDVIGTCIFGLKLNAVSDDNSTFRKYGKSLFLPSLRTHLRELSLMITPALLNILRFKDFPADATEFFHSAFHETITYREKNNIVRNDFVQTLIQARNDLVLNKNIPQRERFLESQIVANAFVMFAAGFETVSTAISFCLYELSFCLYELSLKKHIQDKVREEINLKLSKNNGLINNDLLIDLNYLDMVLAETLRKYPPTFALFRKASQTYHVPNDSLTIEKDQKVIIPIYSLHYDPKYFADPEVFDPERFSPEEKSKRISGTYLPFGDGPRICIGKRFAELEMKLALVEILTKFETEPCERTEVPIRFSKKALITMPENGIWLTFKKITNQ

 

>CYP6CZ1 LOC100165972.pro   SCAFFOLD1502:79066..88489 (+ strand)

downstream of LOC100168007.pro, 46% to LOC100168007.pro

MFEFVYELFDLKMLLVTAFLGAIYVYSTWTHSHWSKLGISSPSAPVPLFGHAMPSMLGQMHFMDVLHNLYKELGDQRFGGIYTMRTPQLLVKDPELIGHILIKDFNNFTDRGLYAGTHTNPLNNNIFFTRGERWKTMRQKLSPTFTANKLKYMNEQVKECSDGLLSTIGKNLDDDAGRIEIREMMAKYSTDVIGSCAFGLKLDAINDPDSEFRKHGKTVFQPSLRSKIRVAVIFMQPSLLSIFRVHHYSHRTIRFFHDAFQQTIEYREKHNEDRKDFVQHLMKAREDLVLNPNLKPE

EKFTEMDIVANAYILFIAGFETVSTSMSFCMYELALRKDVQDKVRKEILEVKSKYNGQMNSECLNELHYMGMVIKETLRKYPPLVTLNRVVTKPYVIPGTQIKLKIGTKIVVPVHAIHYDPKYYSDPEAFEPDRFSDENIHNIQPNTYMPFGDGPRFCIGKRFAEFEMKMALSEVLTNYEVMACDKTQIPIKYVIGSFVNIPESVWLKFRKVNT

 

>CYP6DA1 LOC100159636.pro SCAFFOLD17790 coords:18123-22112

41% to CYP6AX1

MSGVWSIPFVQLCAAAVLLVTFLGYMYLTYHYGKWTGLGVPHAAPSPPFGSLRDVVMGRVPLVDAIHSLYRRFDGQRYFGIYEGRQPLLVVCDPQLVHTIMVKDFRSFVDRNAGKVSFVHDKLFDHLVNLRGEQWKAIRAKLSPTFSAAKLKSMLGDINVCTARLIDNLNGQITKNSGIVDVSEASAQFTTDTIGSCAFGLDCNALSNPDSEFRRTGRAIFTPSLRSNLLNITRLVGFGRLLDVFRIRGMSGNIYDFFDNLLDTTMEQHKSGENTRNDFIALLVKLKDEEKQKEHGQKLFTDDILAANSFVFFVAGFETTASTISYCLYELAMNPEIQVKLRENIKKTLDANDGKLAYDTLKDMKYLDMVINETFRLHPPVPVLNRVCTQKYTITDSNITLNVGDKLIIPTYSLHHDSKYYSDPEIFDPERFTEENISSRPHGTFLPFGDGPRICIGLRFAMMEAKTGLAEILSKFEVSPCKETQTPIKIKPRSILLTPNESIRLSFKSIDQ

 

>CYP6DA2 LOC100168454.pro   SCAFFOLD17790:22986..29493 (+ strand)

downstream of LOC100159636.pro

39% to CYP6AX1, 39% to CYP6a2

MICFSCWLEIVPIAAIASALLTYVYCTRYYGHWTALGVPHTKPAPLLGHFAGPTMGRESGTITVDTLYRRFVGHRYFGVYQLRHPMLVVRDPVLVHAVLATEFGSFHDRVMSRTSFEHDGLFNSLVNLRGDKWKAVRAKLSPTFTVAKLKAMFASLHVCTGQLTDKLLLLTSGGQGIVNVTDVSSKFTIDTIGRCAFGINCNTLFDSNTEFQRAGQAVFTPTLKSSVLNFMRLIDLGWLVDLFRLRSMPDLVYEFYLNLFQDTLELRKNEKEDRNDFVSILVKLRNDEKINNSRVELFTDDVLASNAFIFFAAGFETTASAMSYCLYELALNQDIQVELRKQIQHTLNENGGILTYDVLKDMKYLDMVLNETLRMHPPGPGLLRVCTKKFKIPDSDITLDTGMKVLIPTYSLHHDPAYYPNPELFDPLRFTEDNKALRPNGTFLPFGDGPRICIGLRFALMEAKTGLAEIISKFEIFPCKYTKIPIKLNPRSILLTPNEPISLLFKQIA

 

>CYP6DB1 AUG5s6612g1t2 XR_045850 40% to CYP6K1

about 60 aa missing at the N-term have been added

MISKEIIFVYTTCAIVVLFTAVYLYYRNIYSYWKKLGVYHLEPLFFFGNAKERVLFKKSFHEFHRD

 

MYFKFKGHRYAGYYLGRRASLVILDPEIIKCIMIKDFNHFTDRQTMRFRTSEYITEMLINLKGSKWKRMRGQLTPAFTSGKLRTMEHLVDVCCNNMSDFLNENIKSEQGYDLEMKDFFGKFTLDVIATCAFGVESNSLKDVNGGFASRVSKFASLSIMKRLTLYIVLLFMPGIARFVPLSFFNMEVIQFLANVIKEAKKCRKSTGQKRNDFLQLLLDSETDLDKNDKTKSKEDVLTEAQVVAQSVLFLIAGFETSSTLLTFTCYELAINQTIQDKLREEICSVLKRFNGKCTYEAMQEMPLLDMVLMETLRMHPPVAQLERVSTQDYTLPDSNLLLKKGMTVQIPVIGLHYDPEYYPDPYKFEPYRFSPEEKAKRSHYVFLPFGTGPRNCIGLRFALMSTKRGMVHLLKDFSIDLSKEMTVPYEYSKHSMLLKAKDGIRLSFNKLSA

 

>CYP6DC1 AUG5s9515g6t1p 41% to CYP6AM1  Hodotermopsis sjostedti

MVFFDSSLLNLVAYGIVISTTFYFYLRYRYTFWQRQGCPVPLKPHIIYGHTKEVTKMKTWVGKHYANIYYNTDGYKFVGFYQFQKPKLMLRDLNIIKDVFTKEFSTFPNRGIVFDDKLEPLTGNLLTLEGHRWKVLRNKLTPAFTIGKIKNMIDLIDGRAQEMVRVLEKSAVIGEQVEFKELLARFSTDVISIVAFGFETNSLTNPDAEFRRVGRMLFSTSLETIIRNALNALAPSLIGLLKVRSIKKEYADFFYNVVNDTVKYREENGIQRNDFLDLLMKIKRGQNLASDEDNFKFTMDVLAAQCFVWFIGGYETSSVTLTFTFFELAQNLDVQMRAQDEIDSVLSKYDGKLTYEILQEMPYLDMIVSEALRKYPPVPNLTRKAVKPYKLPNSDFTLDKGLQVVIPVYGIHNDPEYWPEPEKFIPERFTEEEKRNRPQYAYLPFGAGPRLCIGMRFGMMQVKVALFRILSTYNISLSKSMKLPMKMNPKTIPANPDGGMFLHITKRKN

 

>CYP6DD1 LOC100162594.pro   SCAFFOLD16233:102870..132505 (- strand)

cyan region may be too long, 46% to CYP6AY1, 42% to CYP6M11

MMLYINFKFIQLRAIIPKHEAALYIRFI

MFPAVVIIVACCTTVILFLYKYTTYTYKYWKSKSVTFATPVPLFGNIKDHVTLKMTQGECLKNIYNDFPREKFVGMYQLQTPTLLLRDPETIRLFLVKSFAHFTDRGFSYDGHREPLTKHLVNLEGDTWKILRQKLTPTFSSGKIKSMLGLLQGCGVQLIEYMDATIESGKTEFEIRDLTAKFTTDVIGTCAFGLECNSLKDSQSEFRRMGCAVLNSSASLALAKMVRVFFPKLFKALKLRTFPAEVQQFFMGIVKQTIDFRNTNRVRRNDFIQLLLEIKNQNHNQENAIKSIELTEELIAAQVFVFFLAGFETSSTTLSFCLHEMAVNQDIQNRVYDEINETANMYGLPFSYEAISSMNYLEQCLKETMRKYPPVQALARVCTKQFRVPGTDLDLDVGTAVLIPVYAIHHDPQYYPEPDTFNPDRFAKDGDGGGGDNGRPSGVFLPFGDGPRICIGMRFAMLEMKLALAQFLHRYLVTLSDKSCTRIEFEPASFLSCPKGGIWLNVNKRKA

 

>CYP6-un1 AUG5s17796g1t1p          SCAFFOLD17796:3506..66376 (+ strand) possible pseudogene

36% to CYP6BX1 Dendroctonus ponderosae C-term  no Cys

MFVPRKNSRFSIIALSFLLRNLVADVNILAVRISFLFIPPSADVADKQVRLDMFEFVTGTIEQTAALVTGALYELARDQNIQNHLREHLDSVLDEHQEQVTIDQWXXXXXXXXXXXXXXXXXXNQTLRKYPLAAVVRRVATKPYVVPGTGGRGTIEPDSLIVVPVYELHHDAEHFTEPEKFQPHRFPGQLSSAYMPHGSGPQSYIGKHFVELEAKLVIAMLMSRYEVHVDSATPGTPPDPLDRKSFEGVRMTVANRSSVRESTMGASLRQFHMSNGVNSGDRKFLFF

 

 

CYP4 clan

 

>CYP4G51 LOC100164072 CYP4G like 64% to CYP4G16

MVTNVQGVNPLFALSAFNLFFYLLTPAIVLWYIYFRMSRKQLYDLASKIPGSEGLPLLGNALDFMQDPHTIFEKIYERSFEFEKNSPIKMWIGPRLLVFLTDPRDVEVILSSNVYIDKSPEYRLFEPWLGNGLLISTGDKWRAHRKLIAPTFHLNVLKSFVTLFNVNSRDTVSKLRKMGSSTFDIHDFMSECTVEILLETAMGVSKKTQKKSGFEYAAAVMKMCDILHMRHTNLWLKPDFIFNFTKYAKEQVGLLDLIHGLTNNVLAKKKEEFLKKKSLMKEVSDIPAASEEIVETSSTLEVEEVPYGNSFGQSAGLKDDLDVEDDGIGEKKRVAFLDLLIECSENGVVLSDEEVREQVDTIMFEGHDTTAAGSSFFLCLMGAHQDVQQKVVDELYSIFGDSDRPVTFQDTLQMKYMERCIMETLRMYPPVPIISRQIKEKVKLGEDITLPVGATIVIATFKIHRNEDVFPNPEVFNPDNFLPEKSASRHYYAYVPFSAGPRSCVGRKYAMLKLKIILSTILRNFKINSNLTEKDWKLQADIILKRTDGFKLSLEPRKSLAKTAA

 

>CYP4CH1 LOC100167623p SCAFFOLD15279:22883..50385 (- strand)

70% to LOC100163721

small duplication near C-term

MWYLTVITPIAILVVIIMILRLEGRRKRVLANKIPGPDGSIFIGMLPLFLQGPEQLILKGLKVYQKYEKSLFKVWVLNNLYIVLTRPEDIEMVLTNPKLQKKSKEYLVLQESIMGQGIFSIDDIKKWKSNRKMVSGGFNFTIIKSFIPIFYEESNVLNDILKQKCDLKSNECDISVPVSMATMEMIGKTALGVKFNAQNGGRHRFVENLQTAMHAWEYRISHP

WYLSKTLFQLSSVKKKHDQSQKIINEFTDEIINKKLDELNQNANN

KNKVETDDEDVCRKTKTVIEILLGNYHEMSHEQIRDELVTIMI

GGQETTAMANACAIFMLAHHPDVQNKVFEELQSIFSTGDHNRPPTYEDLQQMEYLERVIKETLRIFPPLPVFGRSLEEEMKIGEHLCPAGSTLMVSPLFVHSSGQYYTDPEKFNPDNFLPDTCRGRHPYSFIPFSAGYRNCIGIKYG

ILQMKTVISTLVRK

NTFSPSERCPTPKHLRVMFLSTLKFVDGCYVKIVPRTS*

 

>CYP4CH2 LOC100167777p   SCAFFOLD9588:3006..11080 (+ strand) added blue

30% to CYP4G7

missing N-term, might be a pseudogene

YEKSMFKAWLFNKLYIVLTRPEDIEFVLASPKFLRKAKEYMVLQQSIMGQGIFTIEDINKWKINR (2?)

XXXXXXXXXXXXXXXXXXXXXXXXVLAEILGD NSDSTSKECDISVPVSMATMEMIGKTALGVTFNAQKGGCNRFVENLLTAMHAWEYRITHPWYLSSTLFQFSSIKQKHDHSQKIINEFTDEIIKSKIVEINNSGSENGVNADDDDIGRNTKTLTKIFLENPHENMTLEQIRDELVTVMIGGQETTAMANACVVFMLAHHQDVQDKVFKEQESIFSIGDRNRPITYNDLLQMEYLERVIKETLRLFPPLPVFGRDLNEDTTIGDHLCPAGSTLIICPLFLHSSPQHYGSTAHGPDAFDPDNFLPEACHERHAYAYIPFSTGPRNCIGIKYAMLQMKTVASTLVRHHRFLPSDRCPTPDQLRLVFLTTLKLADGCYVKVEPRRPQ

 

>CYP4CH3 LOC100163721         SCAFFOLD9588:25696..30352 (+ strand)

revised middle

70% to LOC100167623p, downstream of LOC100167777p

MXXXXXXXXXXXXXXXXXILVITIVILILKGRKNRILANRIPGPNGWFLVGMLPLFLQGPEKLIKNILREYRIYEKHIVKFWLFNNLYIVLTRHQDIELVLGNPKFLRKSKDYMVLQESIMGQGIFSIDDIEKWKNNR

KMVMKGFNFTPTKSFIPIFYQEAN

VLAEILQEKCVLKSNECNISGPVSMATMEMIGKTALGVTFNAQTGGCNQFVEHLQTAMHAWEYRVTHP

PWYLNNTLFRFSSVKREHDRSQKIINKLTDEIIKQKIIELSQNINNS

EKKIESDNEECCQKSKTVLEILLGSSHKMDHEQIRDEIVTVMI (1)

GGQETTAMAITCTIFMLAHHQDVQNKVFEELQSIFVNGDRNVPPTYKDFQQMKYVEMVIKETLRLFPPLPFLGRRLDEDMKIGEYMCPAGAALIICPIFVQSSPLYYTDSEKFNPDNFLPDACGSRHSYAYIPFGAGLRNCIGIKYAMLQIKTVISTLVRK IKFSPSERCPTPEDLRLMFLMTLKLVDGCYIKMEPRT

 

>CYP4CJ1 LOC100166760.pro   SCAFFOLD7010:99236..111631 (+ strand)

MIFSNVIGALTSDSNTQWMALLSLVVLGVYFLFSDRFSENRGRQISLLPSITRSQWTSLILSLKLASFGPRDILPYFDNVIKKYGSLIHLKIIARHYIIINDPDDIKVLLSSVQHITKGPDYEMLEPWLNKGLLTSTDQKWHSRRKLLTNTFHFKILETYVPSLNKHSRSLVKNLINASDNGKSIADIDSHVTLCALDIVCETIMGVNLRTQEGKSMNYVKAIKNVSQILIKRIFTFWYWNEIVFNLSSIGREFRKSLKLLHDFTENVIRERRKILENVEQKKVDENGKKRIYSFLDLLVGVSKENPGAMTDKDIREEVDTFLFEGHDTSSIAITMAIIHLGLDQNIQNLVRDELYEIFGDSDRDATMEDLKAMTNLERVIKETMRLYPSVTGITRTLKQPLHLDKYTIPSKSVMVVVPHLLHRDKNIYPNPEKFDPDRFLPEQCNGRHPYAYIPFSAGPRNCIGQKFAMYQMKTVLSTILRYTNVETLGTQKSIVISTQLILRADYLPSGVRDFPVKSFSLV

 

>CYP4CJ2 LOC100164743.pro         SCAFFOLD7010:120446..129462 (+ strand)

adjacent to LOC100166760.pro

MLETNVYNFVLIPLAILISYAIWSRLRKPLEYRQISSHVPSVTKNLWSELLFSCSIAMKHPRDLLPFFMEIFLNNGPVVHCNITGRSYVLLNDPDDIKILLSSTQYINKGPEYKMLKPWLNDGLLLSSGLKWQNRRKLLTNTFHFKTLDMYNPAVNKHAKVFTKKLLEACEDDKEISVMEYVTLCSLDIICETIMGTEMNAQKGKSIQYVYSIKSACRSVIDRVFKFWLWNDLIYRISESGRSFFKSIRVLHDFTDSVIKRKQSLLKTSGNTIVQPESKPAEKRKTKSFLDLLLDVLKDNPDQMTIKDIREEVDTFLFEGHDTSSISMTMTLLLLGMHQDIQDRAREELHSIFGDSDRDATMEDLNAMRYLDAVIKESLRLYPSVPSFTRELETTLQLENYKIPPMTTMVIFPYILHRNENIFPKPEDFIPERFLDEDNKSKFLFGYIPFSAGARNCIGQKYAMNQMKTVVSTVLRNAKIVSSGCKEDIKISMQLLIRIESLPKVIFRPL

 

>CYP4CJ3 LOC100162661.pro SCAFFOLD7010:131616..139794 (+ strand)

41% to CYP4C3 adjacent to LOC100164743.pro

MIEVNFYSVVLVPLAGLISYAIWSRLRMPVEYRQISSHVPSVTKSFWSEMVLSWKLAMLQPK

DILPFVTDLFKENGPVVHFNLSGRSYVLLNDPDDLKVLLSNTQYIKKGPEYEMLKPWLNEGLLLSSGQKWHNRRKLLTNTFHFKTLDMYNPSINKHSRILVDKLFEASANDDKEISIAEYVTLCSLDIICETIMGTEMNAQKGKSAEYVHSIKSACKSVIERIFKFWLWNDLVFRMSGSGQSFFKSIKILHEYTDNVIKSKRASLNNSGIEKIRSDSKFEKTKKKSFLDLLLNVLNDTPDQMSDRDIREEVDTFLFEGHDTSSIAMTMILVLLGMHPEIQDRARDELRSIFGYSTRDATMEDLNAMKYLEAVIKESLRMYPSVPAFTRELDKPLQLNKYIIPPMTTITVYPFILHRNEDIYPDAEEFIPERFLDEENKAKFIFGYLPFSAGARNCIGQKYAMNQMKIVVSTILRNAKFESLGRKEDIQISTQLIIRIESLPKMKFYKL

 

>CYP4CJ4 LOC100160629        SCAFFOLD7010:141476..155172 (+ strand)

adjacent to LOC100162661.pro

MIEVNFYSVVLMPLAGLISYAIWSRLRMPVEYRQISSHVPSATKTFWSEMVLSWKLAMMQPK

DILPFLTDLIRNNGPVVHFNLSGRSYVLLNDPDDLKILLSNTQNIKKGPEYEMLKPWLNEGLLLSSGQKWHNRRKLLTNTFHFKTLDMYNHSINKHSRILVDKLLDASANSNKEISIADYVTLCSLDIICETIMGTEMNAQEGKSVQYVHSIKCACKSVIERIFKFWLWNDLIYKISGSGQSFFKSIKALHEFTDNVIKSKRALLNNSGIEEMQSDKKTKKKSFLDLLLNVLNDTPDQMNDRDIREEVDTFLFEGHDTSSISMTMTLVLLGMYPDIQDRARDELHSIFGDSDRNATMEDLNAMKYVEAVIKESLRLYPSVPGITRELQTPLQLKNYIIPPMTTIAVYPFILHRSENIYPNAEEFIPERFLDEENKAKFQFGYLPFSAGARNCIGQKYAMNQMKIVVSTILRNAKFESLGSKEDIQISTQLVLRIESLPKMKFFNL.

 

>CYP4CJ5 AUG5s3079g1t1     SCAFFOLD3079:13926..21110 (+ strand)

64% to LOC100166760.pro

43% to CYP4CE1, 43% to 4V16, 42% to 4C1

missing the N-term

DVLPFLSSILKQHGSLVHIHLLGHSYVLLNDPNDIKVLLSSPQHINKGPEYGLLKPWLNKGLLTSGSQKWQMRRKLLTYTFHFKILETYISSFNKHAQCLTKKLENMASNNQRVSIYTHMTLCALDLVCDTIMGTELRSQEGKSLEYVEAINTVTDITIKRIFKFWLWNGSIFNLSQIGRDFNKSLKILHTFTENVIKEKRAKLESVNCLETEELSFGKKRVESFLDLLIGISKQNPEKMTDMDIREEVDTFLFEGHDTSSTAMTMAFIQLGLNQDIQNSVREELYSIFGDSDREATMADLKSMTYLDRVIKETIRLYPSVPSVTRMLRQHLHIKEYDIPPQTVVVVVPYLLHREEKHFPNPLTFDPDRFLPEHSINRHPYAFIPFSAGPRNCIGQKFAMYQMKTIISTVIRKMKIETLGSQDDIKISAQLILRPESLPDIKLTKIK

 

>CYP4CK1 LOC100161405.pro   SCAFFOLD15733:160611..183325 (- strand)

47% to CYP4BT1 Pediculus humanus

MNYHDHLTTRSRHCTTTVEEINDVERDDYNRSPLPLSTSGILIRTVEPILATVSLVFDKSFILFACFSKFSLQYFNELYLFLCQLLVDLHFFRLLLEWHSKFGDTYQLWIGLRPFIAMANADHIQQILKSTVHIDKNLEYNLLLPFIGTGLVTSSGSKWHTRRKLLSPTFHQNILEGFLPLIEKQMKTLVKVLRKEVNNVNGFDIKPYAKLAALDTIGNTAMGCEINSQENSQLDYVKALDELTAIMQKRFITPWLKPNLLFNLTSLSKRQKACIDVIHTFTRKVIKERKDNFKLFNNQTSDANKNEIHYEKKPNRALLDLLIEVSEDGKVLSDEDIQEEVDTFMFAGVDTTSVTLSWVMYVLGKHPHVQDKIVEELNQKIPNFGDGNLTLNILSSLDYLGRTIKEVLRLYPSVPFIGRQIYQPLTIGDHTILPGTSIFINVFALHRNEKHFENPEKFDPDRFLKEKKNDRHRFAFVPFSAGSRNCIGQKFAMIVLKIAVATVIKTYRVKSIDPEEKLGLVGEIVLNALNGIHVTLEERT

 

>CYP380A1 LOC100165004        SCAFFOLD17282:20108..54426 (- strand) top half only

N-term is on SCAFFOLD11661:5770-6260 (-) strand

see EST EE264487.1 Myzus persicae to confirm N-term

MYGKLSLPELIIYASVALILALWFHWRWKHRYFLDLAEKLPGPPCYPLIGTTSMYTSTYD

ETIAKLKENAEKYNYEPVGTWIGPIHYVSVVKPEDIQ

IVLNNSRALEKGQLYSF

LKSLLGEGLLTASVDRWRKHRRIISYAFNVKFLEQLYPVFNEKNKTLIKNLRKNINSTQP

FDLWDYIISTTFDTICQTAMDYRINEKKKKRPPR

 

MFGKLSSPELIIYTFVALILALWFHWRWKHRYFLDLAEKLPGPPSYPLIGTTSMFTHTYD (1)

ETIAKLKENAEQYNYEPVGTWIGPIHYVSVVKPEDIQ (0)

IVLNNSRALEKGQLYSFLKSLLGEGLLTASVDRWRKHRRIISYAFNVKFLEQLYPVFNEKNKILVKNLRKNINSTQPFDLWDYIISTTFDTICQTA MDYRINEKHNKTEFLDLMTTIANQLVKTVNRPYLYPSLFFSIYRSMSGLGEKLELINKLPLQLIDEKKIDFRSKIVESDSYPEEFTNEKKNKFKTFIDTLLEASENDPDFTNADIRDEVITMMFAGSDTNATTECFCLLLLAIHQDIQDEVYDEIYNVVRDSDRELTPEDTANFSYLEQVIKETLRMYPTISVFTRQLVEDVKVTNYVLPRGASVTISPIVTHHCPHLYPNPEAFNPDNFSIENVAKRHKYSYIAFSGGPRGCIGMKYAMISMKLMITEILRNFSVHTDIKLSDVRIKMNDAFTRKVGGYPITIRPRDRRPSYVRRNTRVA

 

>CYP380A2P LOC100167018.pro SCAFFOLD17282:20108..54426 (- strand) bottom half only, pseudogene

55% to LOC100165004        

VIKEKKAEFDQRLKATNDKVDVTNNDDEKYSKLFLDILFELNNNGGNFSDSDIRDEVVTMMT ()

GGSETSAITICFCLLMLAIDQDIQ

DKVYDEVYDIFGESDHIITIEDTTRLVYLEQVLKETLRLYPVGPVLLREIREDLKI ()

FSNDYVLPKGTTCVISPIATHHSPDLYPNPWSINPENFSPENVAKRHKYSFIPFSGGPRGCI ()

GSKYAMMSMKVTVSTFLRHFSVHTDIKLTDIKLKIDLLMRSVHGYPVTIRPRDKRPTYYNMRNQNKQG

 

>CYP380B1 LOC100161319     SCAFFOLD11137:17306..28967 (+ strand)

39% to CYP4G14 C-term half

52% to LOC100165004, 50% to LOC100167889

see model below for N-term

MGYNLNDQRTLSEFVLAMKKVSELSKCIVKPWLYIDQIFAVYTYLTGLNVYMSQLNRVSLQIIRDKKLEFKSIKLQQSTDKSHEVVPEKKRNSTKVFLDKLLKLNDEGADFTDEDLKDEVITMTVAGSDTSAISECFCILLLAMHQDIQDKVYDEIYSVLGDSDREVIPEDIFRFKYLEMVLKESLRLFPPGAIFSRKINENVKLTNFELPKGSNVFVSPYVTHRCPQLYPNPDTFNPENFSAENEANRHKFSFLAFSGGPRGCLGVKYAMISMKLMMVAVLRRYSVHTDCKLSEIEMQIDLLAKKANGYPITIRPRERTQDR

 

>SCAFFOLD11137:AUG4_SCAFFOLD11137.g1.t1 = LOC100161319

MFAQMRMAIHNAAHALPMTKSELYFYASIVIFVVLWCRMRWQYRQFYRLADKLKGPPSYP LKGSIFDLSTTPEKLMYNFKESAEKYNYEPVKLWVGPFFFVGVYKPEDVQIVLNSSKALE KGMIYHIIRHAVGEGVFTAPMGKWKKHRRVIASIFSSKFLDQLYPIFNENNKKLVENISK HVGETQPFDIWDYIISCNLNNVSQAA

MGYNLNDQRTLSEFVLAMKKVSELSKCIVKPWLY IDQIFAVYTYLTGLNVYMSQLNRVSLQIIRDKKLEFKSIKLQQSTDKSHEVVPEKKRNST KVFLDKLLKLNDEGADFTDEDLKDEVITMTVAGSDTSAISECFCILLLAMHQDIQDKVYD EIYSVLGDSDREVIPEDIFRFKYLEMVLKESLRLFPPGAIFSRKINENVKLTNFELPKGS NVFVSPYVTHRCPQLYPNPDTFNPENFSAENEANRHKFSFLAFSGGPRGCLGVKYAMISM KLMMVAVLRRYSVHTDCKLSEIEMQIDLLAKKANGYPITIRPRERTQDR

 

>CYP380C1 LOC100162836.pro        SCAFFOLD10061:3374..10265 (- strand)

small sequence gap

MIEIIVYIIVVIFIV

MWCYFKWHNRPFEKLAARMPGLPAYPFIGSLYTCIGVTSEQLRSRILDLVKDYNLGPIKCWMGPYFGVFIVRPEDIQIVLNSSNALQKGFVYNFFKVILGEGLFTAPIDKWRIHRRMISPFFNGKLLEQFFPVFIEKNRILIRNVAKQLNETQVFDLWDYIAPFAFDTICQNTLGYNIDTQTNKNECEFAKAIVKTLDLEGMRIYKPWLYPEFVFSMYLKLTGQQRVFETVRKFPLQVIKEKKAEFDQRKKLIDAKIDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLAA

XXXXXXXXXXXXXXXXXXXXXXX

DKVYDEIYDILDDSDHMISIEDTTRLVYLEQVLNETLRLFPAGPMQLKEIQEDLKISSSDYVLPKGTMCVISPLVTHISPDLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCIGSKYVMMIMKVTVSTFLRHFSVHTNIKLTDIKLKLDVLMRSVDGYPVTIQPRHKRPTYKRNKKPLR

 

>CYP380C2 LOC100160808    SCAFFOLD10061:16344..23495 (- strand)

49% to LOC100165004 

MIEIIVVIFVM

MWCYIKWHNRPFEKLAARMPGFPAYPFIGTGFQFIGLTPEQIMNRILDYEKDYNLEPFKIWIGPYFGVFIVKPEDLQIVLNSSKALQKGCVYDFFKHVTGEGLFTAPVDKWRIHRRMISPLFNGKLLEQFFPVFIEKNRILIRNVANQLNETQVFDLWDYIAPFALDTICQNTLGYNLDTQTNKNGCEFAEAIVTTTDLEGMRIYKPWLYPEIVFSMYLKLTGQQRVFETVRKFPLQVIKEKKAEFDQRKKLIDAKIDVTNNNEHQSKLFLDTLFELNNDGGNFSDSDIRDEVVTMLTGGSETSAITVCFCLLMLAIHQDIQDKVYDEIYDIFDESDHMISIEDTTRLVYLEQVLKETLRLFSVGPLLLREIQEDLKIFSSDYVLPKGTTCVLAPIGTHLSPNLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCIGSKYAMMSMKVTVSTFLRNFRVYTDIKLTDIKLKLGLLMRSVDGYPVTIRLRDKRPTYKRNKKPPR

 

>CYP380C3 LOC100158738.pro         SCAFFOLD10061:35318..41755 (- strand)

MIEIIVYIIVIIFVVTWCYFKWHNRPFEKLASR

MPGPPAYPFIGTLYQFIGLTSEQIVSRILDYVKDYNLEPFKFWMGPYFGVVIVKPEDLQIVLNSSKALQKGYVYDFFKDIGGEGLFTAPVDKWRIHRRMISPLFNGKLLAQFFPAFIEKNQILIRNVAKQLNETQVFDLWDYIAPFALDTICQNTMGYNLDTQTNKNECEFAEAI

(seq gap)

VIKEKKAEFDQRKKLNDAKMDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLIAGSETSAITVRFCLLMLAIHQDIQDKVYDEIYDIFDESDHMISIEDTTRLVYLEQVLKETLRLFSVGPLLLREIQEDLKIYDDYVLPKGTMCIISSIATHHSPDLYPNPWSFNPENFSPENVVKRHKYSFIPFSSGPRGCIGSKYAMMSMKVTVSTFLRHFSVHTDIKLTDIKLKLGLLMKSVNGYPVTIRPRDKRPTYKRNLKPLR

 

>CYP380C4 LOC100159590   SCAFFOLD17803:1..16255 (- strand)

missing C-term in a seq gap

MIEIIVYIIVVIFVVMWCYFKWHNRPFEKLAARMPGPPAYPFIGTLYGCIGLTSGQIVSRILDYVKDYNLEPFKFWMGPYFGVFIVKPEDLQIVLNSSNAFQKGFVYDFFKVILGEGLFTAPVDKWRIHRRMISPFFNGKLLEQFFPVFIEKNRILIRNVGKQLNETQVFNLWDYVAPFALDVICENTMGYNLDTQTNKNECEFAKAIVIKEKKAEFDQRKKLNDAKMDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLAAGSETNAITVCFCLLLLAIHQDIQDKVYDEIYDIFDESDHMISIEDTSRLVYLEQVLKETLRLLPAAPFLLREIQEDLKIFSSDYVLPKGTMCIISPLATHRSPDLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCI

 

>CYP380C5v1 LOC100162710.pro       SCAFFOLD12542:91226..95809 (+ strand)

runs off the end of the contig

& =  frameshift

74% to CYP380C2 LOC100160808

MIEIIAYIIGIVLVMVWCYFKWQNRRFEKLAAIMPGPPAYPIIGIGYTFFGSSEHVMSKIIDLVKEYNLSPIKLWLGPYFAVSISKPEDLQIILNNSKALQKDRMYDFFKYAVGEGLFTAPVDKWKRHRRMITPAFNAKLFEQFFPVFNEKNKILIKNVTKELNKTQMFDLWHYVAPAALDTICQTTMGYNLDTQSNNKECEFGEAIVM (2)

ASEVAAMRIYKPWIYPEMVFSMYLKLTGHQRVFETVK &

KFPLQVIKEKKDEFDQRKKAINAKVDLANNKDENQSKLFLDILFELNNTGGNFSDSDIRDEVVTMMTGGSETSAITICFCLLMLAIHQDIQDKVYDEIYDIFGGSEETITIEDTTKLVYLEQVLKETLRLYPVRPVLLRELQDDVKIFSNDYVLPKGTTCVLCPITTHHCPVIYPNPWSFNPENFTPENVAKRHRYSFIPFSGGPRGCIGSKYAMLSMKVTVSTFLRHFS

VHTDI

 

>CYP380C5v2 aLOC100167486   SCAFFOLD4690:1..2605 (- strand)

96% to CYP380C5 LOC100162710.pro; runs off the end of the contig

MIEQIAYIIGIVLVWSYFKWQNRRFEKLAAIMPGPTAYPIIGIGYKFFGSSEDVMSKIIDLVKEYNLSPIKLWLGPYFAVSISKPEDLQIILNNSKALQKDQMYDFFKYAVGEGLFTAPVDKWKRHRRMITPAFNAKLFEQFFPVFNEKNKILIKNVTKELNKTQMFDLWHYVAPAALDTICQTTMGYNLDTQSNNKECEFGEAIVM

ASEVAALRIYKPWLYPEMVFSMYLKLTGHQRVFETVKKFPLQ

 

>CYP380C6 LOC100167889      SCAFFOLD12103:1756..16564 (+ strand)

49% to LOC100165004 

MEVSQDFPVSSLKHSAGGPRMTSTELTAYGVISFIVVLWCHYKWNRRHFERLASKMTGPPAYPIIGAGLEFVGTPQQVIERIIKLFDIYGSEPFKVWMGTSLGVTISKPEDVQIVLNSSKALEKDQFYKFFKNTVGEGLFSAPVHKWRRHRRLITPVFNANLLDQFFPVFNEKNRILTRNLKKELGKTQPFDLWDYIADTTLDIICQTAMGYNLDTQLNNESEFAEALTKASELDSMRIYKPWLHPDIIFSIYGKLTGLHNVYKTLHKLPNQVIKEMKETYAQRKIDNKSNTIDVNDDDKKRLKVFLDTLLDLNEAGANFSDEELRDEVVTMMIGGSETSAITLCFCLLLLAIHPEIQDKVYDEIYEVLGDGDQTITIEDTTKLVYLEQCLRETLRLYPIGPLLLRQLQDDVKIFSGDHTLPKGTTCIISPICTHHIPELYPNPWSFNPDNFDAENVSKRHKFSFIAFSGGPRGCIGSKYAMLSMKVLVSTFLRNYSVHTNVKLSDIKLKLDLLMRSANGYPVTIRPRDRRPTYKKNTHCSTVNL

 

>CYP380C7 LOC100168315   SCAFFOLD2534:44976..48195 (- strand)

MENIIQSVRDFRLTTSEVIVYQLIVCFVVIWCQFKWIRRNFESVAAKMKGPKGYPFIGSSFDFIGTPEQVMEKVLKIDDKYSPGPIKIWVGPYFGVIVIKPEDVQAVLNNSKALQKDRVYDFIKNIFGEGLLTAPVHKWRKHRRLITPSFNASLLNQFFPVFNEKNKILIRNLKKELGKTTPFDLWDYIAPTTLNLICQTAMGYNLDTQSEYGTEFENAMIKASELDSLRMKTPWLYLSFMFKLYLKLKGHSDVFNTLYKLPIKMIQEKKEAFAQRKILNKPSAVDVTDNEREKLKVFLDTLFELNEAGANFSDDDIKDEVVTMMIGGSETSAITICFSLLMLAIHPDIQDKVYDEIYEVFHDDNETITIEDTNKLVYLEQVLKETLRLFPVLPLVFRKLEDDIKIASDDLVLPKGTTCIISILGTHHFSESYPNPWTFNPENFNPENITNRHKYSFIAFSGGPRGCIGSKYAMMSMKVAMSTFLRNYSVHTHYTFDDIKLKIDLLLRSANGYPVTIQLRDRRPTYIRNKKL

 

>CYP380C8 LOC100165148p   SCAFFOLD1571:20813..37353 (- strand)

also             SCAFFOLD17147:5399..9863 (- strand) C-term

see EST ES224491.1 Myzus persicae for N-term

VLLLLNARYCRIEATMQSVSGFRLTITEVFA

YTIICTLAILWCRFKWNRRHLDRLAAGLE

GPPAYPIIGSALQFIGTPEEVLNNLVQLIEDYCPGPFKIWMGPYFGVAIVKPEDLQIVLN

SSRTLQKDRFYNFIKNIFGEGLLTAPVDKWRKHRRLITPSFNSILLNEFFPV

 

Y  P  T  I  C  F  V  V  I

LWCRYKWNRRHLDKLAAGLKGPPAYPIIGSALQFIGTPEEPNLFQIVLNSSRALQKDRFYNFVKNIFGEGLLTAPVDKWRKHRRLITPSFNSILLNEFFPVYNEKSKMLIRNLKSELNKTQPFDLWDYIAPITLNLICQNAMGYNLDSQSKSGSEFEKAMIKASELDSIRVSKPWLYPSIMFSLYLKLKGYSNVFNSLYKLPLKMIHKKREEFAQKKIGNESNYLDVTDNERKHSKVFLDTLFELNEAGANFSYDDIRDEVVTMMIGGSETNAITLCFCVLLLAIYPSIQDKVYDEIYDVLGDGDQTITIEDTSKLLYLDQVLKETLRLFPVIPLILRQLQGDVKIISNNIVLPKGSTCYLSPLATHRDSDSYPNPTSFDPENFSPENIAKRHKYSFIGFSGGPRGCIGSKYAMLSMKVLVATFLRNYSVHTDCKFNDIKLRLDLLLRSSNGYPVTIRTRDRRP

VYKFKLEYI

 

>CYP380C9 LOC100162179       SCAFFOLD17731:37803..45605 (- strand)

N-term may be too long

MGLGDYVQLYNALETGAII

MQSVGEFRLAVSEVLLYSAIISVVVFWCSCKWNNRHINKLDSKMKGPPAYPIIGSALELLGTPELDKWRKHRRLITPLFNANLLSQFFPVFNEKNKILIRNLKKELGKTQPFDLWDYIAPTTLNLICQNAMGYNLDSHSQCGSEFEKAMIKASELDSIRIYKPWLFPNIFFSLFLRLQGQSNVFKTLKKLPLKMINEKKEVFAQKKIVKETIVMNNTDGEKKNLKVFLDTLFELNETGANFSDNDILDEVVTMMIGGSETSAITLCFSLLLLAIHPDIQNKVYDEIYDVLGDGDQTITTEDTIKLVYLEQVLKETLRLFPVLPLVIRKLQDDVKIISGNHLLPKGTTCYIAPLFTHRDCDSYPNPLNFNPENFSQENISKRHKYSFIAFSGGPRGCIGSKYAMLSMKVMMSMFLRNYSVHTNCKFNDIKLKLDLLLRSANGYPVFIQSRDRRPSYKLNKT

 

Mito clan

 

>CYP302A1 LOC100165806 dib 22-hydroxylase

MPSAKCFLGCTNVRYGARIVSILDFKSTLFQILRFSSTETTAVKEFNEIPGPTSLPLVGTLYQYLPVFGKYKFDRLHHNGLAKLRQYGPVVREDIVPGVSIVWIFKPEDIETLYRKEGRYPERRSHLALQKYRLSKPDVYNTGGLLPTNGSDWWRLRKAFQKHLSKVQCIKRYVDSTNTVVGEFIDRRIKRAELRDDFGPELSRLFLELTYYVAFDERLQRFKDEEWDSDSECSKLIKAAHDINSAIMKTDNGPQLWRKFDTPMYKSIQKGHEQIEKIALRVVNEKLISIKTTDSKTSLLGEYLSSDDTDFKDVIGMTVDTLLAGIDTATYSCCFGLYHLSSNPDVREKMFDESRALLPDNHTPVTDRVLERAVYAKAVVKEMFRMNPISVGVGRILPEECVFSGYRVPAGTVVVTQNQVSCRLEEYFRRPNEFLPERWIKGSAEYEPVSPYLVLPFGHGPRTCIARRLSEQFLQVVLIKIVRNFEMTWTGPKLDSESLLINKPDGPISIIFKTRD.

 

>CYP315A1 LOC100159616 sad 2-hydroxylase

MANRYCSLVLVNSTKKRFMSTSNLKTVITESKKEIPIVKGLPLVGTMFSILAAGGGRKLHEYIDKRHQKYGSVFREKLGSVDAIWISNPLDMKLLFAQEGKFPKHILPEAWLLYNDTYGQKRGLYFMNGKEWWKYRQIFNKVMLKDLNVNFIKSYKVVINDLLNEWELSNGQVIPNLIADLYKISISFMVAHLVGRVYDDCKNDLSNDINCLAQCIQKVFQCTVKFTVIPAKTSKLLKLNIWNDFVIAVDNSIESANNLVSKLMSLNGDGLLNSVLNVHDIPIDMIKRLMIDFIIAAGDTTAYSTQWSLYTLGLHKSIQNNLRHSLLKTDFLECDYLNNILKEVLRMYPLAPFITRIPPSDIYLTDHKIPANSLVIMSMFTSSRNGKYFNSPNEFIPDRWNRLKNNKYNGVNEPFATLPYGFGARSCIGQKMAHVQMCLTLSE

 

>CYP301A1 LOC100164600

MKNIRQFQIHSIRWRSTATQHAHSPHVSAGSPEALEVTNDLITAKHYSQVPGPTPWPIIGNTWRMLPIIGPYQISDLANVSYILYKQYGKIAKLGNLVGRPDLLFVYDADEIEKVYRQEGDTPFRPSMPCLVKYKSQVRGQFFGRLPGVVGVHGEAWREFRTKVQKPVLQPQTVKKYIQPIEEVSDYFIKRMQEMKNENSEMPADFDNEIHKWALECIGRVALDARLGCLNPDLPKNSEPQKIIDAAKYALRNVALLELKYPFWRYLPSTLWKKYVSNMDYFIEICMKYIDDAMLRLKNKSQSVNESELSLVERILANEPDPKTAYILALDLILVGIDTISMAVCSMLYQIATRPEEQEKIHQEILKILPNKDDKLDASKLEKMVYLKAFIKEVLRMYSTVIGNGRTLQKDMVICGYRIPKGIQLVFPTIVTGNMEEYVTDCKQFKPERWLKQSTDYIHPFASLPYGHGPRMCLGRRFADLEMQVFLAKLIRSHKLEYLHKPLEYKVTFMYAPDGELKFKMTERPTS.

 

>CYP301B1 = old CYP49A1 LOC100161793 (name corrected June 6, 2013)

MSVLARRLRNLRITVDHANKSTEVFTSVSQGDVDFVKDYSELPGPKSLPLLGNNWRFMSYIGDYKVTEIDKLSLRLWKEYGDIVKIEKLLGRPDMVFLYDADEIEKVFRNEELMPHRPSMPSLNYYKHVLRKDFFGDLAGVIAVIKKIKNKDQEVPDDFLNEIHKWSLESIAKVALDQKLGCLEDEHAVDSDTQNLIDAINTFFANVPELELKIPFWKLFSTPTWRKYINALDTITNVTSKHINRSMDRLLSQKSFCPDSQSSLLQRVLSLDPSNPKLAQILSLDMFIVGIDTTSAALASILYQLSRHPDKQKKLREEIRTVLPNADSKLTSSKLEQLQYLKACIKETLRMYPVVIGNGRCMTKETIISGYKIPKGVQVVFQHYAISNSSKYFSQPDQFLPERWLKGSGYKHHPFASLPFGYGKRMCLGRRFADLELQTVVSKIFQNFEVKYEYGDLEYTVHPIYMPDGPLKFKMIED.

 

Three other mitochondrial P450s are paralogs of CYP314A1 shd 20-hydroxylase

>CYP314A1a LOC100167431   SCAFFOLD4030:67656..77815 (- strand)

45% to CYP314A1 Manduca sexta

MVQKNFWTKIGGACCIVVACITALVKLVLKYVVGTYSNVEYPSEAQQKIYKTIADIPGPRSLPVFGTRWIYWKFCLYKLNAVHLAYEDMFNRYGDIIREEALWNIPVISVKNRDFIERVLRQSGKYPIRPPNEVTANYRKSRPDRYTNTGLVNEQGEVWAMLRNKLTPELTSPRTIRRFLPEVNQLADDFNNLISLARDGNNVVKEFEAYCNRMGLESTCTLILGRRFGFLDGEISETATRLADSVTSQFRASQEAFYGLPLWKLIPTKAYKDFVASEDALYNIVSEIVDSALIDEQQSCTDVRSVFVSILQTSELDNRDKKAAIIDYIAAGIKTLGNTLVFLLYLVAKHPEVQEKIYNEISRLAPAGTSVTAEHLHKATYLRACITEAHRLKPTAPCIARVLESEIEYDNYRLPPGSVVLLHTGLACLDENNFKDATSYRPERWLDELTKKSPFLVAPFGCGKRMCPGKRFVDLELQIVLAKMVKQFEIDFEGQLKTEFEFLLTPVDSNFILRDRIC.

 

>CYP314A1b LOC100165833   SCAFFOLD10596:34935..48215 (+ strand)

94% to CYP314A1a, 66% to CYP314A2

44% to CYP314A1 Manduca sexta

a recent gene duplication may include some flanking genes as well

MVQKNFWTKIGSACCIVIACITTLVKLVLKYVVGTYSNHENPSDAQQKIYKTIADIPGPRALPFFGTRWIYWKFCLYKLNAVHLAYEDMFNRYGDIICEEALWNIPVISVKNRDFIERVLRQSGKYPIRPPNEVTANYRKSRPDRYTNTGLVNEQGEVWAMLRNKLTPELTSPRTIRRFLPEVNQLADDFNNLISLARDGNNVVRGFEGYCNRMGLESTCTLILGRRIGFLDGEVSETATRLADSVTSQFRASQEAFYGLPLWKLIPTKAYKDFVASEDALYDIVSEFVESALIDEQQSFTDVRSVFVSILQASELDNRDKKAAIIDYIAAGIKTLGNTLVFILYLVAKHPEVQEKIYNEVSLLAPAGTPITSEHLHKATYLNACIIEAHRLKPTAPCIARVLESEIEYDNYRLPPGTVVLLHTGLACLDENNFKDATSYRPERWLDELAKKSPFLVAPFGCGKRMCPGKRFVDLELQIVLAKMVKQFQIDFEGQLKTEFEFLLTPVDSNFILRDRIY.

 

>CYP314A2 LOC100169172   SCAFFOLD543:7676..15068 (+ strand)

40% to CYP314A1 Manduca sexta

MALQKIIRKIWTSIKVTCFIVLACVTALVKFVSKNTLGIYRKFRKPADAQRRIYKTVADIPGPRSFPIIGTRWIYWKFGSYKLNAVHLGFEAMFLCFGDIIREETLWNSPVISVINRDCIEKVLRQSGKYPIRPPNEVIANYRRSRPDRYTNTGVSNEQGVIWNSLRKRLTSKMTSPDVVQGVFPEIKSMVDDFIHLLCQARNKNNIVKGFEGLSNRMGLESSCMLILGRRNRFLDRVVNETAMRLTDAVTTQFRASQKTFYGHPFWKIIPTKLYKEFIASEETFYEIMSEIIDFALSDETQSGISENSVFGSILRAPNMDMKEKKAAIIEFIGAGIKTFGNTLVFVLYLIAKHPEVQEKLYNEISRLAPADTPITNEHLKQAKYLNACIMEAHRYSPTAPCIARVLESQIIYDGYCLPKGTTVLMQTGLACLDERNFKDATSYIPERWMNKETYDSLFLVAPFGCGKRICPGKKFVELALKIVLAKMVKQFHIGYEGQLETVFEFILTPVNANFILRDRIN.

 

>CYP353B1 LOC100161881

is too short at the C-term; yellow is probable C-term exon

45% to Tribolium castaneum XM_969024.1 CYP353A1 mito clan

MTFRQFKPFSSIPEPKRWPLLGHTHLFIPKIGPYDSQHLTEAMGDIERMLGPVFKLMLGGKTMVVTTRVEEAKTLFAHEGKHPARPIFPALNLLRKKPFGTGGLVSE (2)

NGVEWYRLRKAIAPLMSKNIYESYIPQHKKAAVDFIDYIKLNRNKDKCLKDMFYHLTKFSVE (1)

AISIVSPGLRIKCLNTTMSECFVEAGNKFMDGLYNTLKEPPIWKFYKTNAYRNLESSHSTCKNFIDEYLKQTHEHNALVNAINTNSNLTNTDINLLVLEIFFGGIDA (0)

TATTLAMTLFYISQDESVQKACEEDVLQGTNAYIKACIKETLRLSPTAGANARYLPKTTVIGGYEIPANTLVMAFNSLTSTKEKYFKAPLEYQPSRWLRNSNIQKFDPYASLPFGHGPRMCPGRHVAMQEMTILLSE (0)

LIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIFNNK*
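
The sequence above is written as separate segments with trailing markers. A minimal Python sketch for joining such segments into one contiguous protein string, assuming that the parenthesised digits mark intron phases at segment boundaries, "&" marks a frameshift, and "*" or "." marks the end of the protein (these readings and the function names are assumptions made for illustration):

```python
import re

# Assumed markers: "(0)", "(1)", "(2)" at segment ends, "&" for a frameshift,
# "*" or "." for the stop, plus any stray whitespace.
MARKERS = re.compile(r"\(\d\)|[&*.\s]")
PHASE = re.compile(r"\((\d)\)")

def join_segments(segments):
    """Concatenate segments after stripping boundary and stop markers."""
    return "".join(MARKERS.sub("", seg) for seg in segments)

def boundary_phases(segments):
    """Collect the digits found in parenthesised markers, in order."""
    return [int(p) for seg in segments for p in PHASE.findall(seg)]

# Tail of the fourth CYP353B1 segment (truncated here for brevity) and the
# final segment, both copied from the record above.
segments = [
    "MCPGRHVAMQEMTILLSE (0)",
    "LIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIFNNK*",
]
joined = join_segments(segments)
print(joined)
print(boundary_phases(segments))
```

The joined string begins with the 53-residue query used in the BLAST comparison that follows, which spans the boundary between the last two segments.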

 

Compare to

>gi|225029066|gb|GO270917.1|  N4(1)G11 Suppressive subtractive hybridization screening of prediapausing

Leptinotarsa decemlineata Leptinotarsa decemlineata

cDNA, mRNA sequence.

Length=619

 

 Score = 55.5 bits (132),  Expect = 3e-07, Method: Composition-based stats.

 Identities = 27/53 (50%), Positives = 35/53 (66%), Gaps = 2/53 (3%)

 Frame = -2

 

Query  1    MCPGRHVAMQEMTILLSELIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIF  53

            MCPG+ +A  E+ ILL ++++  K SL       IGM+YRMNRIPD  IDI F

Sbjct  480  MCPGKRLAENEIVILLKQILR--KYSLEVSDTSPIGMVYRMNRIPDRVIDIRF  328
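
The identity, positive and gap counts in the BLAST header can be rechecked directly from the three alignment rows. A minimal Python sketch, assuming positives are read off the midline (letters plus "+") rather than recomputed from a scoring matrix; the function name is illustrative:

```python
def alignment_stats(query, midline, subject):
    """Count identities, positives and gaps for one aligned query/subject pair."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    length = len(query)
    # Identical columns, ignoring any column where both rows hold a gap.
    identities = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    # Midline letters are identities and "+" marks conservative substitutions.
    positives = sum(1 for c in midline if not c.isspace())
    gaps = query.count("-") + subject.count("-")
    return {
        "length": length,
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        # Whole-number percentages, rounded down.
        "pct_identity": 100 * identities // length,
        "pct_positive": 100 * positives // length,
        "pct_gaps": 100 * gaps // length,
    }

# Rows copied from the alignment above.
query   = "MCPGRHVAMQEMTILLSELIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIF"
midline = "MCPG+ +A  E+ ILL ++++  K SL       IGM+YRMNRIPD  IDI F"
subject = "MCPGKRLAENEIVILLKQILR--KYSLEVSDTSPIGMVYRMNRIPDRVIDIRF"

stats = alignment_stats(query, midline, subject)
print(stats)

# Subject coordinates 480 down to 328 in frame -2 cover this many codons.
subject_residues = (480 - 328 + 1) // 3
print(subject_residues)
```

For this alignment the counts come out as 27/53 identities, 35/53 positives and 2/53 gaps, and the subject coordinates cover 51 codons, which equals the 53 alignment columns minus the 2 gap positions.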