P. sojae P450s

 

P450s found with the genome BLAST server at JGI

 

D. Nelson July 27, 2004

 

A tree of 65 Stramenopile or Chromista sequences

 

>CYP5014A1 sca20 47% to scaf_20e 361222 to 363018 minus strand, C-term long

MLAFLPTLAASDSRQNVVTS

ALIALLLGASTYATLSHIERSRAKHQRHKEGLPVPRPSTTLPIMGNTLDFVKNNDVFHDC

VSSLVQEFNGEPFLLSAPGRPDILVVSTPEAFEGVTKRQFDTFVKGD*LHEMFYDLLGNA

LTNSDGDVWQFQRKIFAKLFSARALRESMTSTIQKHGRTMHTLFENGAASGASFDLFRLL

SRFAMESFAEIGFGIQMGSLAIGEDHPFEKAFDIAEEATAKRFSVPAWFWKLQRLLSVGS

EGQLQRAIQVIDSTVLKFIYESIAGRARDEKRTGGAQNIVSLALDSCDLEGEADPQLLRS

IAIAAIIAGRDTTSETLSWFFYTLSQHPEVERNIRTEMLERIPRLVLETGYFPAMDEVQS

LTYLEAAIKETLRLYPPASFNIKHCSADIFLSDGTFIPEGTTIGLPSYAMGRMTSTWGPD

CNEYKPERFLDPDTGKLLSVSPFQFPAFFAGLRICVGMNLAMLEMKIVLTGLLSRFLSRL

QTGFDIVASTFQLVHDLGLRCPPVLQVRRVGGVHVNLHAPQFFHNRSVKHPSANAFVVKE

TLSGFSRSPSLITHVNTFNSAAPSELVFVFLQTLALPV*
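Header lines in this listing, such as the CYP5014A1 line above, carry a gene name, a scaffold label, a percent identity to a reference scaffold, coordinates, and a strand, in varying order. The following is a minimal Python sketch for pulling those fields out. The helper name `parse_header` and its regular expressions are illustrative assumptions, not part of the original annotation. Fields that a header omits come back as `None`, and only the first percent identity on a line is captured.

```python
import re

def parse_header(line):
    """Extract loosely ordered fields from a '>' header line (hypothetical helper)."""
    text = line.lstrip(">").strip()
    # e.g. "47% to scaf_20e"
    ident = re.search(r"(\d+)% to (\w+)", text)
    # e.g. "361222 to 363018"; \b keeps digits inside labels like "sca20" out
    coords = re.search(r"\b(\d+) to (\d+)\b", text)
    # e.g. "minus strand"
    strand = re.search(r"\b(plus|minus) strand\b", text)
    return {
        "name": text.split()[0],
        "identity_pct": int(ident.group(1)) if ident else None,
        "best_match": ident.group(2) if ident else None,
        "start": int(coords.group(1)) if coords else None,
        "end": int(coords.group(2)) if coords else None,
        "strand": strand.group(1) if strand else None,
    }

header = ">CYP5014A1 sca20 47% to scaf_20e 361222 to 363018 minus strand, C-term long"
print(parse_header(header))
```

Free-text remarks such as "C-term long" or "no introns" are deliberately left unparsed, since their wording is not uniform across entries.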

 

>CYP5014B1 sca11 56039 to 56596 plus strand C-term frag., 73% to scaf_20d (ortholog)

56039 SKIRHELASKLPEL

VNGSISSPSMAQVNELVYLEAVVKEAMRLNPAVPSNIREALEDVVLCDGTVVKAGEAVSW

SSYSMGRMPHVWGPDAKQFKPERWIDATTGKLMAVSPLKFPLFNAGPRVCLGTKLAMMEI

KITTASVLSKYNLTAVPGHQVTYRLSLSLAMKDGFKVNVRKATAASFDGVA* 56596
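For an uninterrupted fragment like CYP5014B1, the nucleotide coordinates and the translated sequence can be cross-checked: the inclusive span should be three times the residue count, with the terminal `*` counted as one codon. The Python sketch below shows that arithmetic. The helper names `residues` and `codons_in_span` are hypothetical, and the sketch assumes that uppercase letters and `*` on a sequence line are residues, while digits and lowercase notes (for example "frameshift") are annotation.

```python
import re

def residues(lines):
    """Join sequence lines, keeping only uppercase residue letters and '*'."""
    return "".join("".join(re.findall(r"[A-Z*]+", ln)) for ln in lines)

def codons_in_span(start, end):
    """Number of whole codons in an inclusive nucleotide coordinate range."""
    nt = abs(end - start) + 1
    if nt % 3 != 0:
        raise ValueError(f"span of {nt} nt is not a whole number of codons")
    return nt // 3

cyp5014b1 = [
    "56039 SKIRHELASKLPEL",
    "VNGSISSPSMAQVNELVYLEAVVKEAMRLNPAVPSNIREALEDVVLCDGTVVKAGEAVSW",
    "SSYSMGRMPHVWGPDAKQFKPERWIDATTGKLMAVSPLKFPLFNAGPRVCLGTKLAMMEI",
    "KITTASVLSKYNLTAVPGHQVTYRLSLSLAMKDGFKVNVRKATAASFDGVA* 56596",
]
seq = residues(cyp5014b1)
print(len(seq), codons_in_span(56039, 56596))  # 186 186
```

The check only applies to entries without introns, frameshifts, or deletions; for the interrupted pseudogenes in this listing the two numbers are not expected to agree.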

 

>CYP5014C1 sca45a 79% to scaf_20c 111379 to 113001 minus strand, no introns

MLSVSALKLETPLHHALAVTSFLLLPLVIQLSRRIGSSSAETPEAFKERADSEPERREAGR

PPWTLPVLHNTLGFLLAGNNLHEWITRTCERFEGNPFTVKVLGLPRMLVVSTPEAFEDVL

KYQFMNFPKGPQYSENMKDLLGDGLFAADGVKWAHQRDIAHGLFRTKELRECMVKAITRH

TMALHDVLKQICARNRSVDLYKLLSCFSTEAFADISFGLKMDCLRANKELPFQAAFDRAQ

RLTALRFVRPRWFWKMQRRLGLGAEDQLQLDIKEIDATVLSIVQRVLAQRAMAPEDKDSN

MLSLYLDAIARSSGTDEQLYDPVHLRDVVVNFLVAGRDTTAQALSWFFFCVSQNPRVESK

LRREIYKKLPELMTAESCVPTLEQVNKLVYLEAVIKETLRLYPSMPIAPKYAVRDTVLSD

GTFVAAGSMVCLPLYAMGRMPHAWGPDAAEFKPERWVDPVTKKITSVSAFKFVAFNGGPR

MCLGSSLAGLELKLVAAALLSRFHIYVENPEDVGFGFSLTLPVKGPMNARLARVSASFG*

 

>CYP5014D1 sca45c 80% to scaf_20e, 139685 to 141217 minus strand

MTDKLSSSVAVAALSGLVVLPLA*RLLHVDKDKSQLST

RKVVRPATTLPVLGNTLDVIKNLPIRCDWLTSLCQDAQGEPVLLQSLGTPDTTLLSTPQAF

EDVFKNQFDNFPKGPKKSEYLCELLREGIFAVDNEKWYRQRKTASNLFTMRALRDSMTST

IQRHLVVLDRIFNRAAETDDTLDLFRLLNRFTMEAFTEIGFGVHMNWLDSDKEHPFQTAF

DQSQQLLVLRFVRPSWFWKAQRMMGVGAEGQLQRELHVIHSTIFDIVAQNLQNRAKGEND

KAGMDIVSLFLDDLNRSGDADESCFDPTYLRDIVVNFIIAGRDTTAQALSWFFYCLSHNP

QVETKIRKELRAKLPRLFSGDCSPSMDEVSELTYVEAALRETLRLYPSVPIVNKEAVHDT

VLSDGTFIAAGTVAALPMYALGRMTHFWGPDAAEFKPERWIDAQAGKLISALAFKFVAFN

AGPRLCLGKNLAMLEMKLIVASLLSKYRVELERPEDVTYAISKDLLVGESP*

 

>CYP5014D2 sca45b 78% to scaf_20b 115947 to 117587 minus strand  transc. 136349

117587 MLSVSSLRNKLPFNPVKLGIGTLVFASVVVLLAKPPYEPTKKEDPSKKSH

117437 RKIHRPEATLPVLENTLTVIEAARAGDIHDRTLLSCRESNAEPVLVRSIGVPDQLIVCTPEAFEDVLKLEFSN

FPKGSYQCENLRDLLGDGIFAVDGEQWVHQRKTASNLFTMRALRDSMAFVIQRHAVVLYD

ILRQTSESNETLDLFKLLNRFTIEAFTEIGFGVHMGCLDSEEEHPFQKAFDHAQRALLLR

FVRPGWFWELQKWLGVGAEGQLKNDIEVINKTVLDIVEKALAKRSSIGSGIEIDGSASQG

KDIVSLFLGDADSDTQQLDPMFLRNIVVNFLIAGRDTTAQTLSWFFLNLAKNPDVETAIR

NEIAKKLPNIEGSEVNVSHATMQDVSQLVYLEAALKETLRLHPPVPMIPKYVVEDTTLSD

GTFVKAGSLIVLATYVMARLPQVWGPDAEEFKPERWIDPSTGKLIVVSAYKFASFNAGPR

MCLGMNLAMLEMKLVVAGLLSKFHVEVLNPEDVTYDLSLTLPLKGALNVKVSQAALPSNP

DFA* 115947

 

>CYP5014D3 sca45d 136352 74% to scaf_20a 145254 to 146840 minus strand no introns

MLPILQLVEKSPVAGLALTGLLVLPLVITLHSRHKKSEEGIGKIHRPASTLPFLGNTWDL

VIHGVRGDMHDFMVQIGKQFNAEPVLLQALGIPLNLILYTPEGFEDVLKTQFSNFGKGPF

MRENLRDLMGDGIFAVDGEQWVHQRKTASNLFTMRALRDSMTVVIQRHAVVLYDILRRAS

ESKETLDLFKLLNRFTIEAFTEIGFGVHMGCLDSEEEHPFATAFDRAQRALRFRFTRPGW

FWKTQRWLGLGVEGQLQRDIQVIDKTVLEIVEKALARRSSRVENPEKKAGGDIVSLFLDS

AGSSNEKQFDPKYLRDIVVNFLIAGRDTTAQALSWFFFNISKNPRVEAAIRNELAQRLPK

VKAEAATPSMQDVSQLVYLEAALKETLRLHPSVPVEPKQTLKDTTLSDGTFVPAGSAIAL

ANYAMGRMPQVWGPDAEEFKPERWIDPSTWKLIAVSAYKFASFNAGPRMCLGMNLAMLEM

KLVVAGLLSKFHIEVLNPENVTYDVSLTLPVKGALTVKVSQIAEPAGA*

 

>CYP5014E2 sca118 plus strand 58% to scaf_36p missing N-term to C-helix

174064 SNLITTRALREYMAPVIQEKTLLLQSILADKSETKEPFDMYKLMRQFTLDTF

AEIGFGCHLEILTSGKEHPFEVAFDEANRISSERFTKPTWLWKFQRFLNIGNERRLREAI

SVMNEFSVDLIMEAMEQMKNSKPDEADVESPAHKNIMAILLSKKEAVTPTQVRDIVLTSL

EARRNTTSDTLAWFFHSLSHHPQVERKLRAEIRSKLPKFGEIHIYVPSYEAVQDLPYLEA

TLREALRLHPTGPSIPYHYQRDTVLQDETFISAGTDVFLHLYSAGRLTSAWGSDAASFNS

QRFNDLTTGEVLPSKYSPFSSGPRVCIGRNLALLEMKIAIAAVVGRFRLCEEPSLTQRRP

AFQNFQISELFIPVCIDLTRRRSTS* 175197

 

>CYP5014E3P sca116a minus strand fragment N-term (probably all part of same pseudogene)

whole combined sequence 60% to scaf_36p

114619 MLSPLQLSGGSTVALGLLACGLAVAGVVSYTCTWSSKSESSKAGRVPYL 114473

114472 PSWIPLLGNTVELARNVDRHHEWVAEHSLQRDGKPFALRLPGKNDTLFLS 114323

114322 RPEHFEEVVKTQSSNFSKGDI 114260 frameshift

114261 LREIFDDFLSEDILIIHGERWRFHRKILASLFTPRALREYMTRIVREDVR 114112

114111 RLQSVLQ 114091

sca116a pseudogene plus strand middle fragment

113385 SETQESFDLSKLLLQFTIDTXXXX 113444

113449 GFGHKLETLTSDGVHPFEAAFDDANRISSQRNTVPPCVWKLQRCLNVGSE 113598

113599 RRLREAIDEMNGLLLVLISSAYG 113667 frameshift and deletion

       XXXXXXXXXXXXXXXXXXXXXXX

113667 SRPRITATEVRDISLAGL

EVGRNTTADAMMWFFHALSQNPQVQKKLHAEILAKLPKLGESESYIPSHEDIQKMPYLEA

TILELLRLHPAVPGIPYHC 113957 frameshift

113959 VETVFADNTFIPASTDIILSLYSAGRLTSVWAIVGRFRLIEEPL 114090

note: N-term is inverted right after this point in the sequence

 

>CYP5014F1 sca116d 170709 to 172295 plus strand, no introns 89% to scaf_36a

MLQSMFSKSPVVPGLVTAALLVTLYWTKSAKGSAKLKGDKVKNAVILPGTLPVVGNAVELAANA

ARMHDWLADQFAATNGEAFIVRLPGKDDMMFIAKPEHLEAVLKTQFDVFPKSEYIHDVFY

DMLGDGIVVTNGETWKRQRNVVVGLFSARALREHMTPLVQKYTVQLGDILADAAATNTPV

DVFDLLHRYTFDVFGEIGFGAKMGSMDGAFQPFAEAMDEAQFLAGKRFKQPMWYWKLRRW

LNVGDEKKLKENVRVIDEHLMGIIADAIERRRHRVEEMKAGRPAALADKDIVSIVLDSME

ASGQPVNPVEVRNIAVASIIAGRDTTADCMGWLFHLLSENPRVEAKLRDEVLAKIPQLAT

DKSYVPSVEDINKVPYLEACIRELLRLYPPGPLITTHCIKDTVFPDGTFVPANTDIGIAL

FSAGRLTSVWGEDALEYKPERFIDSESGEIIPMTATKFCAFSAGPRICVGQNLAFVETKI

VIASIVGRFHMIPEPGQNVAYTQGISLGMMDPLMMRLEAVNASA*

 

>CYP5014F2 sca116c 167571 to 169189 plus strand 86% to scaf_36b

MLQSFFEDKLSYLHPVVPGLVAAAVAVAIYCTTDTAEPPTLEGEDGKAVAKRVRYLPSKIPVLGNAID 167774 frameshift

167774 LLSNSERMHDWIADQIVPFDGEPFTLRLPGKSDMMFIAKPEH

IEEVLKTQFENFPKSQHIHDVFFDLLGDGIVTTNGETWKRQRRVLVNLFSARALREHMTP

ISQKYVVQLRKIFEDAVASKEPMDAFGLMHRYTLDVFAVIGFGTEMKLLEGRYQPFAEAI

EESQYIVSARFKQPDAQWKLMRWLNIGSEKKLRHAIQVIDEHVMGIISGAIQRRQERDQA

IKAGEAAKPADRDIVSIILDSMESNNQVVDPVEVRNIATAALIAGRDTTADALGWLFHVL

SQNPSVEAKLRSELLTHMPRLTTDPEYVPTAEELNQVPYLEATIRELLRLLPAGPVIATH

CVRDTVFPDGTFVPKNTDIGLAFYTTGRLTSVWGEDALEFKPERFLDADTGEVVKVSSSK

FCAFSAGPRICVGRNLAFLEMKIVIANILSRFHLVPEPGQQPTYTQGITLGMQTPLMMRV

EAVTAHAAA*

 

>CYP5014G1 sca39 84% to scaf_86 455071 to 456241 plus strand missing N-term

SSRALREHMAPVIQKHVRVLQRVLTDVAAAKMPIDMFNYSGRFTLDAFGEIAFGFNMSTLTLQ

RD*HPFERAFVDAQHIAASRLVVPTWYWKLKRSLNVGSERRLREALTTVDQFVMDVISKT

VDKRNAPISDAEDKVHTRGRDIVSLILANETVDGTPVDPILVRNVVLMALIAGRDTAADA

LAWLFHLLTLNPRVEEKLRAYLLASLPKLGSDFDYVPDMQEVQSLPYLEATINEALRLYS

PVGLAQKLCVRDTVFPDGTFVPKGSNIALVYHAMARMPGVWGPDAAAFNPERFIDPQTGE

LIKVSSGKFSAFNTGPRVCVGRKLAMMEMKMVVACVVSRFRFDEVPGQDVAC 456135 frameshift

456137 GGGLTIGMKNPLMMRVQQLAPKEDGDEVVVGVAA* 456241

 

>CYP5014H1 sca79b  355924 to 357594 plus strand 87% to scaf_6m

MLRKWLTKHRALSPLGPAGLALLAGA

AVVAAYVATRSSGDAVSVLDCKEEKTKEKWENTPQDKPKVVPYLPSKVPWIGNMLQLAGN

AHRFHSWMAEQCIAHNGVFKLHLPGQSDMLVTAVPEHYEHVVKTQFEHFSKGHQQYDMFV

DLMGHSVLIIEGERWKYHRRLLVRLFSARALRDHMTPVIQRHTLLLQNVFLKAAVAKKPV

DVYMFMHRFTFKAFAEMVFNNSLDSIDSEHEHPFEQAFDEAQSIVAGRLQQPVWFWKLMR

WLNVGLERKLREDVALIDEFIMEIISTAIEARRQRQEDLKAGRPVKDADKDIVSIVLECM

EQDGDMVSPTDVRNIAVAALGAGRDTSADAMSWLLHTLTQNPHVEDKLRAELLENLPKLA

TSPSYVPSMDEVHGLVYLEATIRELLRLQTPVPFTLRECIHDTVFSDGTFVPKGTNVGMC

HFGAARRPEVWGPDAAEFNPERFIDQETGKLVQTPMAKFNAFSGGQRMCVGKALAMLEMK

LVIATLVGRFHFREVPGQNVQYAMGITIGMRNSLMMHIEPVRTGASAAAA*

 

>CYP5014J1 sca116b 163736 to 165337 plus strand, one stop codon, 48% to scaf_6p

MPPPSLLKGGGLSSPALLGLLVATAAAMLFAAKPSRGKPLPANVTPVPFLPSTPL

LGNTLELAANAARLQDWVADRSRECDGQPFVVQLLGKRNLVYLSRPEHFEQVLKLQSSNF

NKGLAIHDIYSDFMGESILLVNGDRWKYHRRVLVNLFSARALRDFMTPIIQKNILVLMDI

LARARERNEALDIHKLMNKFTFETFAKIGFGQKLGNLVSPEDHPFERAFDEAHHITGHRM

TTPTWLWKLKRWLNVGSERKLRECVEVMDSLVMGIISDAIAKRQQRGQEEEAGEHDHEKD

IVSIILE*MHADGRPVEPSEVRSIALLSLIAGRDTTANAVSWILHMLHEHPRVEEKLRAE

LYEKLPKLATSRDYMPSLEELQDLPYLEAVINENLRLLPIFPYTSRQCIRDTVFPDGTFI

QAGEVLGLPHYVMARLTSVWGENAAEFVPERFLDAKSGEVLDLPVATSSAFGAGPRICVG

RRLASMEMKLLLACIVGRYHLVELPGQTVRYKLALSLTMKDPLMVNVQHVNQALAKSA*
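Entries such as CYP5014J1 are annotated as carrying an in-frame stop codon, written as `*` inside the sequence rather than at its end. A small Python sketch for locating such internal stops follows. The function name `internal_stops` is a hypothetical helper, and the positions it reports are relative to whatever string is passed in (here a single line of the entry, not the full protein).

```python
def internal_stops(protein):
    """Return 1-based positions of '*' that are not the terminal stop."""
    body = protein[:-1] if protein.endswith("*") else protein
    return [i + 1 for i, aa in enumerate(body) if aa == "*"]

# One line of the CYP5014J1 entry containing the in-frame stop
fragment = "IVSIILE*MHADGRPVEPSEVRSIALLSLIAGRDTTANAVSWILHMLHEHPRVEEKLRAE"
print(internal_stops(fragment))  # [8]
```

A non-empty result flags a candidate pseudogene or sequencing error; it does not distinguish between the two.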

 

>CYP5014K1 sca79a 346171 to 347793 minus strand 86% to scaf_6p no introns

MLSLSELRTHPLVVGFVAVAAAVTLYSVVANAGSDALDDEDDEGKTDRKAK

PIPYLPGGHPVLGHTLLMARNLDRFQDWLVETSVARGGAPFVLRQPGKNDWLFSARPEDF

EQILKVHFDTFIKGPQVRELLDDFMGENIVIINGHRWKFQRKALVNLFTARALKEHMTPV

VQKCALALQRVFAKAAESGDVLDVHHIMGRFTLETFAEIEFGSQLGLLEKGEENAFETAI

DDANHISLERFAVPMWVWKLKRWLNVGSERRLKEDMAVISSFVMSCISDAIERRKQRLEA

AARGEPVGPVAKDIVSILLDSEDATGEPVLPKDVFNISLAGVLAGKDTTGDATSWLMHLL

HENPRVENKLRAELLAKVPKLAEDESYVPPMEELDAITYLEATIRESLRLKPPAPCVTQH

CTQDTVFPDGTFVPKGMDTTLLYHASALLPSVWGPDAAEFNPERFLDDNGKLLVLPPLKF

IAFSAGPRKCVGRKLAMIEMKVVTACLVSRFHLVEVTGQDIRGTMGISLGMKNGMKVSVQ

ATPGVAKRA*

 

>CYP5014_un1 sca38 minus strand pseudogene fragment, I-helix to EXXR region 49% to scaf_36a

IVLAILDCVEVTS*HVNPGEVCRLNANYADRDITVDCMDWLFHLLSKNP

RIDAELFAEALVKLT*LMAYKRYVSSMKRLNKVPYLESCFREPPQLPGPLITTSCVIGTG

IPGGAFVPANKEIGTDLFSLTRGSG

 

>CYP5014_un2 sca205 11789 to 13031 plus strand, 44% to scaf_20e probable pseudogene

11789 MLAFLPTLAASDSRQNVVTSALIA

LLLGASTYATLSHIERSRAKHQRHKEGLPVPRPSTTLPIMGNTLDFVKNNDVFHDCVSSL

VQEFNGEPFLLSAPGRPDILVVSTPEAFEGVAKRQFDTFVKGE*LHEMFYDLLDNALTNN

DGEVWQFQRKIFAKLFSVRALRESITSTIQKHDRTLHTLFENAAVSGESFDLFRLLSRFA

MESFAKIGFGIQMGSLAIGEDHPFEKAFDITEEATAKRFSE 12523

(deletion)

12529 TQHMLERIPRLALETGYFPTMDEVQSLTYIEAAIKETLRLYPPASTAARTPSSPTV 12696 frameshift

12696 SADTFLSDGTFVPEGTTIGLPSYAMGRMASN

CNEYKPERFLDPGTGKLLSVSAFKFPAFFAGPRICVGMNLAMLEMKIVLTGRVTVQPGQE

VTYVRSLALLMNPFMVKIEKVSPSVVPIA* 13031

 

>CYP5014_un3 sca265 plus strand C-term pseudogene fragment 84% to sca205

9688 VAVQPGQEVTYVRSLALPMKNPFMVKIEKVPPSLIPIA* 9804

 

>CYP5015A1 sca42 90% to scaf_63 376713 to 378335 minus strand, no introns

MGGSSASSSSSLALWVALP

TAAAAAVLAYLLIPDERQRAIRRLPAPASTLPVLGNTLDMMSLEQPRLHDWIAEQCKAFG

GRTWRLQVVGAPPLVVVSSVEGFEDVLKTQFEVFDKGDRMNTIFRDIAGGGIVAVDGPQW

VAQRKMLSRLFTMRAFRDTISQCVHDYTLVLGRMLGDAARTGVPIDFADVMHRFSFDVFT

DIAFGLQGNSLEGGEHTQFMEAMGKIVHNIEMRFHSPDWLWKLKRALKLGSEKELAQEVA

ILDKMVFTIINKNMERKFNPDAAAAEWPPRPQRSTKDVVSLFLDAHDEQKAAGEDGGDTP

LDANFLRDIAVVVLLAGKDTTAWSMSWLIIMLNRNPKVETKLRQELREKLPKLFSDPSYV

PTMDDVEGLVYLEAVLRENLRLNPLVPLNAKEANRDTTLVDGTFVKKGTRVYIPSYTLGR

MKSVWGRDASKFKPERWLMQDPWTGEQTIRPVSAFQFVSFHAGPRT

CLGMRFAMLEMKTVLAYMLSKYHFTTRENPKSYTYDVASLLQVKGPLICKVQRAG*

 

>CYP5015B1 sca73a 218927 to 220480 plus strand, no introns, 81% to scaf_27d,

transcript 139397 = 3 P450s fused

MSNMVLPFAIAASLVAAAVAYFTSPTEQDRAVCELPTPRSTLPVLKNTLDLTIRQRARIY

DWILEQCREHGGRPWRVRVLGRPPAVILSSPEAMEGVLKTQFDVFVKGSAVAEISHDLLG

EGIFTVDGSKWRHQRKAASHFFSMNMIKHAMEHVVRDHSALLAVKLRAAADNGETLNIKR

VFDFFTMDIFTKIGFGVELKGLETGGNCDFMEAFERASRRIMARFQQPMCVWKLARWLNV

GAERQMAEDMKLINGVVYDVIHRSLEGNDKRSSCSGRKDLVSLFLEKASVEYAADDHTEM

TPTMLRDMSMVFIFAGRDSTSLTMTWFIIEMNRHPEVLANVRRELADKLPKLGMDDTETP

SVEDIDQLVYLEAAIRECIRLNPVAPAMQRTAAQDTTLYNGTVIKAGTRVILPHYAMGHL

ETVWGPDAEEFKPERWIDADTGKLLHVSPFRFTAFLAGPRMCLGMRFALAEMKITLATIL

SKFDLQTVENPDGFTYIPSVTLQVKGPVDVAITRAHA*

 

note: sca73d 225097 to 225615 is identical to aa 348-519 sca73b

note: sca73e 225763 to 227318 plus strand is identical to sca73c

note: sca73f 235636 to 237189 is identical to sca73a

note: sca73g 237303 to 237752 is identical to aa 1-150 sca73b

 

>CYP5015C1 sca73b 220594 to 222153 plus strand, 81% to scaf_27c

MKLEAITALISPASVAASCVALLLVYVATPSAHDRAVKHLP

TPEGDIPVLRSTLEIVRAQKSGKFHDWALAYCRKFQGRPWCLRILGKTPSVVVCCPEAFE

DIQKTQFDAFDKSPFVSAAMYDVLGHGIFAVSGPLWQHQRKTASHLFTTQMLQYAMEVVV

PEKGEALVKRLDEISKANQVVNMKRLLDLYTMDVFAKVGFDVDLHGVESDQNAELLDAFD

RMSVRMLERIQQPVWYWKLLRWLNVGPEKQLAEDIKMTDDLIYSVMSRSIEEKTKGSRKD

LISLFIEKSAVEYTKGVHTKKDLKLMRDFVISFLAAGRETTATTMSWVILMLNRYPKVLD

QVRQELKAKLPGLASGETRAPTLENIQQLVYLEAVIKETLRLFPVVAITGRSATRDVRLY

EGTVIKADTRVVMPHYAMGRMETVWGPDANEFKPERWIDPATGKVNVVSPFKFSVFLGGP

RVCLGMKFAMAEVKISLAKLLSQFDFKTVKDPFDFTYRSSITLQIKGPLDVVVSRLKA*

 

>CYP5015C2P sca73c pseudogene, 54% to scaf_27c  222301 to 223856 plus strand

MVVSYDELWVFGIVCVALLLGYLVTPSAQTRAVLHLPKPPGYLPVLIRVQH

SGRFHNWALSTCRKYEGKPWCMHVLGKAPTVFVCTPEAFEDVEKIQYEAFGRNP

LFVEATTDVLGQGVFAISGPLWHHQRKTASRLISTQMIQHNMDVVVPDKCKELMKRLDAA

ASEENPIDRVVSLKWRLDLFTMDVFCKVGFGIDMHKMETEKIIAMLEALQRSSARIVGRI

LEPSWFWKLRRDLNIGAERQFTKDMECVNDMICGFIAPSIEEKAQRDQVEAKEDEKDSRM

DLISLYLDQDAADNGKDAPFDPKKQRDFLVSFLAAGQDTTSTSMSWFVVMINRYPKVLDKNS 223341 frameshift

223343 GKMPDLASGKQTVPSLEDTQQLVYLEAAIRETL

RLFPVAPISG*TATRNVTLSNGVFLVKGTSVHIPHYTIGRMKTVWGPDAEEFKPERWIDQ

VTERITPVSPFKFSAFYGGPHACLGMKFA 223708 frameshift

223710 MSEIKITLAALLSRFNLRTSRDPFAYTYRMALSLRIDGGLDVAVSHLE*

 

>CYP5015D1 sca73am 216869 to 218524 minus strand, 139396, 83% to scaf_27a

MWGPLELALNLSLTSWGVLVCSLLLGWHFLSSRKQARALSKFTRPASTLPVLGNTLDLMF

KHRHDIHDWMLDECRRCEGRPWVLAAVGRPTTVVLSDVDAFEDVLHRKFDSFGKCSAWLV

SDVFGDGIFAADGVSWIHQRKTASHLFSLHMMRESMEQVVREQATVLCETLRAHCTDNQT

STSPQRGVPVNLKYTMDWYATNVFTRVGFGVDLDSLSSQEHNEFFCAFTRLPIGIHRRIQ

QPGWLWRLKRALDLGDEKQLKLDMARVDGVIYQVISQSMESKSDTAPVESKRLPDLISLF

LAKETNEYRDREAKQDNGAVATCRVETTPKLIRDMAFNFTAAGRGTTSQSLQWFIIMMNR

FPGVERKIREELQAKLPQLFEEDSTPPSMNDVQQLVYLEAAIKESLRLNPVAPLIGRTAT

QDVVFSDGMFIPSGTRVIIPTFAVARLQSIWGEDAAEFKPERWIDPHTGKLRVISLYKFL

VFLAGPRSCLGAKLAMLELKVALATVLSKFHLRVLRDPFEIGYDASISLPVKGDVLAIVE

AAKVGNSAGAA*

 

note: sca73b 233842 to 235233 is an exact duplicate of aa 1-464 of sca73a

 

>CYP5015E4 sca96a minus strand 88% to scaf_41b 141298 prediction too long

263342 MKSVSELFGDRSDVAVTAAAA

VTVGLGLSLLLHSTKKSKMSDTRKLPPMPKTTLPILKNILDAGGNAERFHDWLNEQSIEF

DNRPWMLSIPGRPATIVLSSPEMFEDVLKTQDDVFLRGPSGQYISFDLFGNGMVITDGDL

WFYHRKTASHLFSMQMMKDVMEATVREKLAVFLDVLGVYHQRGQQFSAKQELSHFTMDVI

AKIAFSIELNTLKDSPDREDDHEFLKAFNKACVAFGVRIQSPMWLWRLKRYLNVGWEKVF

KENNTIIQNFINDVIVQSMNKKAEYSAKGEKMVARDLITLFMESNLRHSEDIHIADDDAT

IMRDMVMSFAFAGKDSTADNMCWFIVNMNRYPEVLKKIREEMKEKLPGLLTGEIRVPTQE

QLRDLVYLEAVMKENMRLHPSTAFIMREAMDNTTLVDGTFVEKGQTLMISSYCNARNKRT

WGDDCLEFKPERMIDPETGKLRVLSPYVFSGFGAGQHVCIGQKFAMMEIKTTLATLYSKF

DIKTVEDPWEITYEFSLTMPVKGGLSVEVTPLTPLKRASSACK* 261708

 

>CYP5015E5P sca96c plus strand N-terminal (inverted from seq sca96b) 88% to scaf_41a

270399 RFAVAVSLGLSLLLHSTKKSKKSDARKLPPMPKTTLPILKNILDDGGNAERF 270554

270555 HDWLNEQSIEFGNRPWMFSIPGRPATIVLSSPEMFEDVLVTQDDVFLRGP 270704

270705 SGQYISFDLFGNGMVITDGDLWFYHRKTASHLFSMQMT 270818

 

>CYP5015E5P sca96b 270398 to 269250 minus strand 141300 prediction adds incorrect N-term

88% to scaf_41a missing N-term inverted on sca96c

270398 RDVMEATVHDKL

GVFLDVLDIYHKRGKPFSIKQELSHFTMDAIAKIGFGLDMDTLKNSPDREEDHEFLEAFN

KGSVPFGVRIQSPLWLWELKKYLNVGWEKVLMDNTKIMHQFINKVILDSMNKKAELAAKG

EKMEARDLVTLLMESKLRQTEDMHIEDDDATIMRDMVMTFVFAGKDSTAHSMGWFIVNMN

RYPDVLKKIREEMKEKLPGLLTGEIRVPMQEQIKDLVYLEAVVKENIRLHPSTGFIVRET

MQDTTLVDGTFVEKGQTLMVSSYCNARNKKTWGDDCLEFKPERMIDPETGKLRVLSPYVF

SGFGSGQHVCIGQKFAQMEIKMAMATLFSKFDIKTVEDPWKLTYEFSLTIPVKGPLDVEV

TPLTPLTPPK* 269250

 

>CYP5015E6 sca80 349103 to 347501 minus strand 140005 64% to scaf_41a

349103 MKIVTQLPTDKRDAAVAAAAVVTLGLLVSYLSRPKDKGNKPKRKMAHVPKSTLPLLGNML

DMSTNMPRFHDWISE*CAEFDNEPWTLQIPGKEPWIVLSSAELFEDV 348783 frameshift

348781 LKTQADNFLRGPVSHHQAYDVFGNGLSISDGDAWFYQRKTASH

LFSMQIMKTVMEDSVREKLDVFLDVLGKYAARGKPFGIKKWLSHFTMDVFSKIGFGVELD

TLKNTFDQEGDHEFLEAFNVASVAFGVRIQTPTWLWELKKFLNVGWEKIIMDNCKKFHDF

IDSFVLKAMVERGQNKVARDLISLFLDSSIDTSELQIEEDEATIMRDMVTTFIFAGKDSS

AHSLGWFIVNMNRYPEILRKIREEIKEKLPGLLTGEIQVPTAAQLQELVYLEAVIRENIR

LHPSTGFIMRQATEATTLVDGTFVDKEVSVLLPSYANARNPRTWGEDASEFKPERFIDAD

TGKIRNFSPFVFSSFGSGPHICLGMKLALMEVKLTLATLLSKFDFKTVEDPWQMTYDFSL

TIPVKRPMEVEVTPLVTPYADSA* 347501

 

>CYP5015E7  sca91 286085 to 287701 plus strand no introns 77% to scaf_41a

MKSVSELFGDRNDVAVTAAAAVAVSLGLSLLL

HSTKKSKKPEGVRLPPMPKTTLPILKSLFDAGGNVARFHDWLNEQSIEFDHRPWMYSIPG

RPVTIVLTSPDTIEDALSTQNDVFLRGPVGQYMSEDIFGNGMIIADGDPWYYHRKTSSHL

FSMQMMKDVMEATVREKLEVFLDVLDIYHKRGQSFSAKQELLHFTMDVIAKIGFGLELDT

LKDGPHRDEDHEFQEAFDQAAVAYAVRVQSPLWLWEIKRYFNIGWEKVFRDNTTILHNFI

DEVITQSMKKKAELAAKGEKMVARDLITLFMESTLRENQDMHIEDDDATIMRDMVMTMMF

AGRDSTAHSMCWFIVHMNRYPEILEKIRDEMKEKLPGLLTGEIKVPTQEQLRELVYLEAV

MKENIRLIPSTGFIAREAMRDTTLVDGTFVGKGQTIMVSSYCNARNADNWGEDASEFKPE

RMIDPKTGKLRVLSPFVFSPFGSGQHACMGQKFAMMQMKLTLATLYSKYDIKTVEDPWKL

TYEFSLTIPVKGPLDIEVTPLSPLMA*

 

>CYP5015F1 sca117 128109 to 129785 minus strand no introns, 89% to scaf_1

MNPAGLPHQQLQLQHSTSSCHSPIFSQQLKPSQPTNQLTGTSMWSSASH

DGAQQSVLLAFGALTALYASWKILSMPVPLPDPGMEDLFRPASTLPILGNTLDVLLFNRY

RMSDWINDQTDASEGKPWILQLLFQPPWVVLSMPSDLDDVFRDQFDVFEKGGTLGDISFD

VLGNGLLNVSGDKWKQQRRAASHLFSTQSIRDVMEPVIREKTLQLRDVLAQCADREQTVS

MKSLLGKFTSDVFTRIGFGVELNQLGGDVLVDDMHPLDIALHAVQNRFQTPMWMWKLTRF

LNVGAERRLRENMKIVNDMVRGIMVRSIGDKTPGDGKKNLLTLLMKDDVDADPRELQDTA

VNFFIAGKDTTSFSLSWLIVMMNRYPRVLQKIREEIASVLPGLLTGEMSAPTLEDTQKLV

YLDAAVKESVRLWSVSTYRCTTRDTTLTSGAFIEKGTVVVVSKYAAARRKNVWGDDAAEY

RPERWFDEKTGEPKSITPPQFITFSTGPRKCIGMRLAMLEMKTVMAVLFSRFDIETVEDS

FKITYDFSFVLPVKGPLAVRIRDRTAPSV*

 

>CYP5015F2 sca19 215894 to 214356 minus strand, no introns 132088 84% to scaf_21m

MIDSQALSPVLATAFALLLVCWKLLSKPRPHSNGQELFRPASTLPFLGNTLDVLWFQRHR

LHDWMTEQSLASGGKPWLLTGIGQRPKVVLTSPAAYEDVFKTQFDVFVRGPGETVLEVLG

QGIFNVDGDKWRHQRRVTSHLFSMHMLKDCMKSVVREKTVQLREVLATCAERGQTVSMKS

LLNKFTADTFTRIGFGVDLNGLADPVDVDTSQPLDTALGVVQTRLQSPVWLWKPRRFFNV

GSERVMRENMQQVQDTVQKIMAKSLADKEHQANGEEATTSSKHKDLMSLMLQSGDFTDPR

EVRDICVNFYAAGKDTTAFSLSWFIVMMNRHPRVLCKVREELRRVAPELFTGELDTPTLG

HLQQLTYLEAALKESLRLNSLAVYRLANRDTTLSDGTFVPKDARAVFSMYASARQPSVWG

SDAADYNPGRWIDEETGKLSSFKFVTFSAGPRQCIGMRLAMMEMMTVLSVVFSRFDLETV

VDPLDITYDFSLVLPVKGSLAVRVHSLSAHMA*

 

>CYP5015G1 sca24a 615142 to 613559 minus strand no introns 133194 57% to scaf_11a

MWTLSQHATFDKAA

ATVALVTAAYVGWNVVSAVVARRAVNRVLADQGVYEPPSLPVLGHTLDLMHNKDRFHDWF

AEQCLAAGGRPWVLRIIGRPPTLVLTSPQEIEDVFKTQVDIFEKGLDIREIGHDFFGDGI

VGVDGEKWQKQRRTASHLFSVGMLRDVMDAVVMEKTLQLRDVLAECARVNRPVSMKSLLA

KLSSDVFTKIGFGVDLNGLGGDVDDDMEHPFIKAVETYGSVFQSRLQSPMWLWRLKKRLG

VGEEGELRKARVIVHDLVMEIMKKSMASKNSATGSKQQKDLITLFMKTMDSSADVMEVRD

AVMNFFLAGRDTTSFSMSWMIVNMNRYPRVLEKIRAEINANLPELLTGEIQAPSMADLQK

LPYLEAAMRESLRLYMATVHRAPNRSTTLSGGLHVPFGTHVIVPTYAMGRMPTVWGEDAA

EYRPERWIGEDGRVLKVSPFKFFSFLAGPHQCLGMRFALLEMQTVMAVLLSRFDIKTVEN

PFEITYDYSLVIPVKGPLMANIHDRSTSVAASS*

 

>CYP5015G2 sca24b 622007 to 620433 minus strand no introns 89% to scaf_11a

622007 MWGISQHHERQAVLAAGTLSGLYLGYKLLVAVYKELKITRALDAQGLHRPKSTLPILGNTLDV

MYFQKDRLQDWMAEQSQVSDGKPWVLSIIGRPQTLILTSPEACEDVFKAQFDNFGRGDEL

VDLQHDIFGEGVAGVDGEKWLKQRRIASHLFSMKMLRDVMDEVICEKSLKLRDVLAQCAK

EGCVVPMKSLLGKFSSDVFTKIGFGVDLHGLDGDINSEMDHPFIEAVDGYAEVFGARLQS

PMWYWKLKRFLNIGDERMLKRCIKVATELLNEVMLKSMASKTAEDWNTKTDLLTLFVDTT

GKTDSSDLRDAMMDFFLAGKETTSFSLAWVIVNLNRHPRVLAKLRAEIREKLPGLMTGEL

EVPTMEDLAKVPYIEAVLKESLRLYMTGVHRTPMRSTTLREGTFVPYGSYVVMSVYAAAR

VKKVWGEDAAEYNPDRWIDEETGKIKFVNPFQFITFGGGPHQCIGMRFALLEMQTVIAVL

FSRFDIKTVEDPFKITYDYSVTLPIKGPLECTVHEATAPAY* 620433

 

>CYP5015G3 sca24c 624127 to 622555 minus strand, 86% to scaf_11b, 2 frameshifts

624127 MWGLAKHQVSEREAALAVSALGALYVSYKLLSAMYKSGSMARAFDAQGLYRPKSTL

PILGNTLDVMFYQKERLWDWMAEQSIL 623879 frameshift

623879 QEGKPWVLSIVGRPDALVVTSPEACEDVFKTQFDNFGRGTELRDVIYDIFGDGIAGVDGE

EWQKQRRVASHLFSMKMLRDVMDEVIIEKVTKLKDVLAECAKQGKVVPMKSLFGKFTSEV

FTKIGFGVDLRSLESDPCSDSNNAFIRAVDVYAEVFGARVQSPAWFWKLKRFLSIDDEGRLKQSAKVA 623322 frameshift

623322 GGTQQVLAKSLEVRRQDSSDAKR

TDLLTLFVEANTSIDPKAVHDTLMSFLLASK

DTSSFSLSWVLINLNRYPAVLAKLRDEIRANLPGLMTGEIKVPTMEDLQKLPYLEAVAKE

SLRLHMTASNRMANTATTLSDGTFVPEGCAVMIPMYASARVKSVWGEDAAEYKPERWIDA

ATGKVTPVSPFKFVTFGAGPRQCLGMRFALLQIQTTMAVLFSHFDLKTTEDPFDLTYDFA

ITLPVKGPLNVTVREITPAAY* 622555

 

>CYP5015G4 sca24d minus strand, 58% to scaf_11b, 133197 (short prediction)

632042 MWTAQNHATTSSAVLLTAATLGSLYAGWKVATALYSQRVLDAALTKQKLHSPDSTLPVLGNTLDLLFFQRERLW

DWVTEQSAISGGKPWVLRIIDRPTSLVVTSPETLEDIFKTQFETFERGADMRELFYAFVG

DGIVGADGEQWVKHRRTASLMFTTRTLREVVDAVAKEKSLQLRDVLSECAKQGRVVSMKS

LLTKFSGDAFTKIGFGVDLNSLGGNVESAMDHPFMEAVEVYAEVLCTRLLSPTWLWKLKRFLNVGDERALKHANKI

VHDLTYEVMRESME KKTHGEGMALQQ 631155 ()

625161 KDLLSLFMQSGDTV

DVQVVRDSVMNFLLAGHDTTSFSLSWVVINLNRYPDVLAKLRTEFRERLPGLMTGEIDVT

TYEDLQNLPYLEAVVKESLRLYVTAVNRVANQSTTLSDGTFVPLGCGIMVALYAAARMKN

VWGEDADEYKPERWIDPKTGKVKNVSSFKFISFIAGPRQCIGMRFALLQMRVAIAVMFSR

FDLKTVEDPFKLTYDIAFTLPVKGPLNVSVHELA* 624475

 

>CYP5015G5 sca24 plus strand missing N-term 193 aa

571848 SVIGTFSGDTFTKIAFSVD 571904

571905 LNGLADAENDHP

571941 FNEAVDVMAEMLGSRLLSPTWVWKLKRFLNIGDEHKLKQACAIVHELTHR 572090

572091 VMSKSM 572108

572182 LLLAGKDTTDFSLAWILVNLNRYPDVLTKLRKEINEKLPGLVTGEIDIPT 572331

572332 MDDLKDMPNLEAVVKESLRLHAIA 572403

572404 TTRVPNKSVTLSDGTFVPAGCTARMTSVWGEDASVYKPERWI 572529

       IDAETGKVKMVSPFKFGTLIS 572589

572589 GSRQCVGMRFALLEMRIATAVLFSRFDLKTVDDPFDVTY 572705

       GSLWRSRARLMSPCTLFRLRSLRRGGED*

 

>CYP5015G6 sca43 plus strand 55% to scaf_11b

184231 MLASLYDVVFSVASLAALYAAWRVGSRVYSQRIIDVALANQKL

HSPPSTIPLLGNTLDALFLQKTRFWDWIAEQSELSGGKPWVLRLVGRPTTLVCTSPEALE

DIFKTHFDTFERGADLRDLLYDFFGDGIVGADGENWQKQRRAAS 184671 frameshift

184673 TTRALRDAMATVVKEK

ALHLRDALAKCAKEGRTVDMKSLLEKFSGDTFTKIAFGVDLNGMESDHPFNKAVDVMSET

LDSRLLSPTWLWKMKRFLNVGDERKLKEACAIVHELTHQVMTESMQQQQKKKNKNKSDVL

TLLLDSSGDLDVAVVRDAVMNFLLAGKDSTIFSLSWILVNLNRHPEVLRNEINEKLPGLV

SGEMDAPTMDDLKDLTYVEAVVKESLRLHGIATTRVPKKSIILSDGTFAPAGCAVMMPAY

ASARLTSVWGEDASDYKPERWIDP 185512 frameshift

185512 RVKPVSPFKFGTFIAGPRQCVGMRFALLEMRLVTAV

LFSRFDLKTVKDPFEISYEYAFTLPIKGPLLDVTVRAVSAA*

 

>CYP5015G7P sca46 minus strand 50460 to 51598 pseudogene with large deletion

48% to scaf_11a

MWSYAQHASSQHSAVLAGTAMVVALLGWKALSHALRTPEQKAADES

FRRIHRPASTLPLVGNTLDAMFFQTERFVDWMADQSALAGGKPWLMSIVG 51311 frameshift and deletion

51311 QREGKENMKIIKAIVNDVMAQSLARKNARPEHDTEDQEAEKLELKEPTTETKYLLSFFLESGLTDAQQL*D

MAVNFFFAGKDTSSFVLSWFIVMMNRHPQVLRTIRDEIRDQLPELLSGEIDAPSMEQLGR

LPYLEAAMRENLRLNTSMTTRSPNQDTTLSCGTFIPKNSVVCVCHYASARLKSTWGDDAA

EYKPERWLDPDTGKLRQFSPYQFVTFLAGPRQCIGMRFAMLELRMVASVLCSRFNIKTVE

DPFSLTYELSTVFPVKGRLMCTVESSGLAVSS*

 

>CYP5016A1 sca131 17930 to 19567 minus strand (N-term seems short) 73% to scaf_92, no introns

MMPSNLLNSRGLVAIPPGPLLGSTLALVVFWVMRHG

RERRLSWTSTPLPPRGYRTLPTASASQSPVRVGMMTDQPTEGSPAFYEWVCAMTSRFRGK

PWLLHLPGRPDVLVVSSPTSFEDIQRTFALQFEKVDNDAEGLAHDAHGGAIAFIYTGQVR

PSVNMQRQLASSVLSSAALRQQASVLVKHHLQSLLRILDDTASSGDPLDVTRLMRSFAME

VFTELDFGLQLGALRSRRGECSDLEQAVDEVQRRVAERLKRPAVAWKLERLLDVGSEAAL

SRSVDVVSRITLGAVDTKRKRRRGGSPCDSPIAGARVDMLDLLLSQKCSSKSSKDPEFLA

EFVLGLVVAARDSMAHALSSCLQCLARHPEEQEKLARELKEAEEEDRDLQSVVYLEAVVK

EALRLYPAKPFIRRRARIDTVLSDGTFVAAGAKVAMDLYSMARRENVWGQNSAQFRPQRW

IDSTNGKLRPTSNYKFNAFLGGPRACLGADMAMTEMKTVLAKVIGRVHLDAVEPRMAEKD

KTKAWDAACDAAVRVLVRRRGPSPPGQYS*

 

>CYP5017A2 sca169a 37279 to 38733 plus strand 78% to scaf_14, yellow = poor match

37229 MKMPWHLFFDGAIY 37320 (0)

37384 ITDPKDVQHILSTNFDNYVKPQGFLDAFQEIFENSFFAVNHHAQAPDGGAGW

RLQRKVAAKVFTTANFRIFTEQVFARHAEETLLAAQAEATESRAREDPDERFCCDMQEIS

AKYTLNSIFDVAFGLPLSEIEGTENFAEHMGFVNTHCAQRLFVKQYYKLLRWVMPSEREL

RRRTREIRAVADTVLLRRLQESQEKINARSDLLSLFIRKARELAFESTKDQKVDAASLLG

PKTLRSILLTFVFAGRDTTAECLTYSFYAIARHPRVQKQIVEELESTKENTGSTHATFTF

DQVKEMKYLEAVVYEAVRLYPALPFNVKNAVKDDYLPDGTFVPAGVDVVYSPWFMGRNGE

LWGNDPLEFRPERWLEMPKRPSAYEFPAFQAGPRVCLGMGMAVLEAKLFLATTLSRFHVA

IAPGEKQERGYVLKSGLFMDGGLPLQMTPRPQSAASA* 38733

 

>CYP5017A3 sca169b 51409 to 52752 minus strand 79% to sca169a 74% to scaf_14

52874 MKMPWHLFFDGAIY 52833 (0)

52752 ITDPKDVQHILSTNFNNYVKPQGFLDAFQEAFDNSLFI

LNHNAEAPDGGAGWRLQRKVTLKVFTTANFRIYTEKIFARHAEETMVNAQAEAVKVRDSQ

SSNESFCCDMQAVSARYTFNSIFDVAFGLPLSEIEGADEFAEQINFVNEHCAQRLFVKQY

YKMLSWVMPSERELRRCTRGIRAVADNILLRRLKEPGEKISARSDLLSLFIRKARELAAE

GTKEQGADAAALLGPNTLRSIILTFVFGGRDTIAECITYSFYAIAKHPQVQQRIVEELES

IKTSGGLKVTAFTFDEVNSMKYLDAVVYEALRLYPAVPYNVKSAVKDDYLPDGTFVPAGV

DVVYCPWYMGRNSALWGDNPLEFRPERWLEMSKRPSAYEFPVFQAGPRICPGMNMAILET

KFFLATTLSRFHVAIAPGEKQERGYVLKMALFMDGGLPLQMTPRAQFTS* 51409