Cytochrome P450s from Schmidtea mediterranea (planarian, flatworm, Platyhelminthes)

 

Last modified Oct. 13, 2008      D. Nelson

 

planarian tree

Powerpoint slide of a Planarian P450 tree

 

CYP2 clan (50 sequences, some are partials, 39 complete sequences)

 

Planarian clade A

 

>v31.024852 35 24% to #5 

31% to v31.005723                                                     

MIYIILILLLVICLLINKLIYSFKDPPGPIGIPFIGIF

PVVLFKKLFRIKSNENKFFMEMHKIYGNIWSFRIINQRYVVLGAPELQTEAFKHSGMTLS

GRPLTELYKYVSKRRGVLSSDGQIWVLYRKLTTKALHEISHKTRGFEDVITENIKLLLNY

IDRNPDSIRTSNLFHSMTLTIIFKIVFNIDTDFDDEKIKKYIKNVQIITSSNLTFSNAVF

PLTNVPLLKYCVPSVLRIKKAMDNNVNILTELWKESKENCETNELESSSIGEFYLKQLQL

TKVYEEEHQHLNDMNFIRNFSDIVGAGSDTVAATLAWCCLLLADRVKIQNKLRSEINEFV

EVKSVIFFSYSDKSQLPYLCSFIEEVHRYFTIIPRTVHRPMKNLIFHGYKLLENDIILGD

SYTSNHNADIWPNADKFIFDRFVENKEEKIKHLQENFYPFGTGKRNCIGESLARKEVLLT

LVALICHYKIELTNESKLNFEEILNGENGLVFAPFCHNLKFSRISN*

 

>v31.005980 39 26% to #5   

94% to v31.000094, 94% to v31.009749. 93% to v31.010860

81% to v31.013885, 81% to v31.012632                                                 

MIIMFYVNYAPKSNEPPGPISVPFLGVIPYI

IYKSILLKKKLNQKLLFQQLKKHYGEISKFYIGDQRIVLLSSFDIISEAYVKNKFIFNGR

PRIFGLDILSDNFKGLMLSEKKIWKENRSTTVKSFHDLGVGRKEFNDLSVANIESMCEWI

RECQGKEFDVNRILFKTFLKIMNSLIFGHEPEETESNIYDIYETICFLAQTDFTETQILW

KYRSFSAVKKLFPMFEKYANKVSMLHKYIELKVQRKLKYLTEDPNEECLIKYYINKQMEN

SEIFICCVDLYLAGVDTVSSFVSACLLLLGKHQHFQEKIKIEIENLENTVNPITYEHKRF

FPFTQAFIEECHRYFAVVPAIERRPMEEVILKGYLIKSTDLIIGEQYSLNFNSTIWENPN

EFNPYRFIKEENGNFQPNSYLIPFGIGGRTCLGEIIGRTETFNSIVAIVSKFNVSLSMET

SGNCDKILEGTSGGIRHSPLDHTLVFTENVEK*

 

>v31.013885 37 23% to #5                                                       

MLFYCLWLIFILTIILLSITYSLKSNEPPGPIS

VPLFGVVPYIIYKSILLKKKLNQKLIFQQLKKDYGEISKFYIGDQRIVLLSSFDIISEAY

VKNKFIFNGRPRMVGLDILSDDFKGLLLNEGTMWKQNRSTTVKSFHNLGVGRKEFNDLSV

ANIESMCEYLREFQGKELDVKTILIKTFLKIMNSVIFGHETEETKSNILEIYEIIVFLTQ

TDFTVFQIFWQYRSFPGVKKLFPILENYANKVSIIHNYIKLKAQWKLRDLTEDPNEECLF

NIYRNEQIEYSEILQCCVDLYLAGVDTVSSIVSACLLLLGKHQHFQEKIKIEIKNLENTV

NPITYEHKRFLPFTQAFIEECHRYFITIPTFQHRVVKEVILKGYLIKPTDVIIGEQYSLN

FNSTIWENPNEFNPYRFIKEENGNFQPNCYLIAFGIGGRTCLGEVIGKTEIFNSIVSIVS

KFNVSLSMETTENYDKILEGTSGGIVHSPLDHTLVFTENLEKY*

 

>v31.009749                                                        

N-term runs off the end

CIHIESMCEWIRECQGKEFDVNRILFKTFLKIMNSLIFGHEPEETESNIYDIYETICFLA

QTDFTETQILWKYRSFSAVKKLFPMFEKYANKVSMLHKYIKLKVQRKLKYLTEDPNEECL

IKYYINKQMENSEIFICCVDLYLAGVDTVSSFVSACLLLLGKHQHFQEKIKIEIENLENT

VNPITYEHKRFLPFTQAFIEECHRYFAVIPAIERRPMEEVILKGYLIKPTDLIIAQQYSI

DYDSTLWENPNEFNPYRFIKEENGNFQPNSYLIPFGIGGRTCLGEVIGRNETFNSIVAIV

SKFNVSLSMETSGNYDKILEGTSGGILHSPLDHTLVFTENVEK*

 

>v31.000094  pseudogene, & = frameshift

94% to v31.005980                                          

MLFYFIWLSVCFLLMIIIFYVNYAPKSNEPPGPISVPFLGVIPYIIYKSILLKKKLNQKL

IFQQLKKDYGEISKFYIGDQRIILLSSFDIISEAYVKNKFIFNGRPRIFGIDILSDDFKG

LMLSEKKIWKENRSTTVKSFHDLGVGRKEFNDLSVSNIESMCEWIRECQGKEFDVNRILF

KTFLKIMNSLIFGHEPEETESNIYDIYETICFLAQTDFTKTQILWKYRSFSAVKKLFPML

EKFE &

NKVSMLHKY

IKLKVQRKLKYLTEDPNEECLIKYYINKQMENSEIFICCVDLYLAGVDTVSSFVSACLLL

LGKHQHFQEKIKIEIENLENTVNPITYEHKRFLPFTQAFIEECHRYFAVIPGIERRPMEE

VILKGYLIKPTDLIIGEQYSLNFNSTI*ENPNEFNP*RFIKEENSNFQPNSYLIPFGIGG

RTCLGEIIGKTETFNSIVAIVSKFNVSLSMETSGNYDKILEGISGGILHSPLDHFSIY*
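The records in this listing mark features inline: `*` for a stop codon, `&` for a frameshift, and runs of `X` or `-` for missing or uncertain sequence. The sketch below shows one way to summarize those marks for a single record. The function names are hypothetical, and the rule used for "looks complete" (starts with M, ends with `*`, no internal stops, frameshifts, or gaps) is an assumption for illustration, not the criterion used to arrive at the counts in the clan headings.

```python
def summarize_record(lines):
    """Summarize inline marks in one record's sequence lines.

    `lines` holds only the sequence lines of a record (no '>' header,
    no free-text notes). Whitespace inside lines is ignored.
    """
    seq = "".join("".join(line.split()) for line in lines)
    # A trailing '*' is the normal stop codon; any other '*' is internal.
    body = seq[:-1] if seq.endswith("*") else seq
    return {
        "starts_with_met": seq.startswith("M"),
        "ends_with_stop": seq.endswith("*"),
        "internal_stops": body.count("*"),
        "frameshift_marks": seq.count("&"),
        "has_gap": "X" in seq.upper() or "-" in seq,
    }


def looks_complete(lines):
    """Assumed heuristic: M start, terminal stop, no other marks."""
    s = summarize_record(lines)
    return (
        s["starts_with_met"]
        and s["ends_with_stop"]
        and s["internal_stops"] == 0
        and s["frameshift_marks"] == 0
        and not s["has_gap"]
    )
```

A record such as v31.000094 above, which carries `&` and in-frame `*` characters, would not pass `looks_complete` under these assumptions.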

 

>v31.010860 partial C-term pseudogene                                                        

TQILWKYRSFSAVKKLFPMLEKFE &

NKVSMLHKYIKLKVQRKL

NYLTEDPNEECLIKYYINKQMENSEIFICCVDLYLAGVDTVSSFVSACLLLLGKHQHFQE

KIKIEIENLENTVNPITYEHKRFLPFTQAFIEECHRYFAVVPAIERRPMEEVILKGYLIK

PTDLIIAQQYSIDYDSTLWENPNEFNPYRFIKEENGNFQPNSYLIPFGIGGRTCLGEVIG

RNETFNSIVAIVSKFNVSLSMETSGNYDKILEGTSGEILHSPLDHTLVFTENVEK

 

> v31.012632 36 24% to #5                                                       

MLFYCLWLIFILTIILLSISYSLKSNEPPGPISVPLFGVIPYIIYK

SILLKRKLNQKLIFQQLKKDYGEISKFYIGDQRIVLLSSFDIISEAYVKNKFIFNGRPRM

VGLDILSDDFKGLILNEGTMWKQNRSTTVKSFHNLGVGRKEFNDLSVANIESMCEYIREF

QGKELDVKTILIKTFLKIMNSVIFGHETEETKSNILEIYEIIVFLSQTDFTVFQVFWQYR

SFPGVKKLFPILEKYANKVSIIHNYIKLKAQWKLRDLNEDPNEECLFNIYRNEQIEYSEI

LQCCVDLYLAGADTVSSIVSACLLLLGKHQHFQEKIKIEIENLENTVNPITYEHKRFLPF

TQAFIEECHRYFITIPTFQHRVVKEVILKGYLIKPTDVIIGEQYSLNFNSTIWENPNEFN

PYRFIKEENGNFQPNCYLIAFGIGGRTCLGEVIGKTEIFNSIVSIVSKFNVSLSTETTEN

YDKILEGTSGGIVHSPLDHTLVFTENVEKY*

 

>v31.005723 40 25% to #5   stop codon in frame                                                   

MNIIFGVLFVLIYVFLKHFYRRKSHPPGPLGFPIVGMFPYSILKI

IFSKFIINESESEFFRQMKQKYGGIWSIESCGQRIITLASFDILHEAFVDNGMLFSGCSQ

NSLVELYGNKRGILFVDGDIWKRNRHTVLKVLRILGMGTSTMEKLISEHVDCLIKDLKKR

SGIPIDPSKEFLKLTSDVINIINFNEPNDENDPNFRTYILNLEKYTNSYLRYIIFVWPWF

FLSKFGRFLFPKIEEYVNAIDINRRFVAEKCMSRLKVFSESPNLKCSFDYIWYHKVLNEA

SKDQNCDMDVLNVIQIMTDLQNAATETTFNLLKFRCLLLSRYPDVQNDLRNELYNLVEDS

AQITLNLKNKCPKLLSFINEVHRFCNLVPKVGHRVIR*CTFRGFELDTRDIVLGDINSIM

FDENNFNNPQLFDPYRFLNEMRTEYVYSKFFIPFGIGKRTCLGENLANMEINLILSRILY

EFQISLSHETDKHFETILEGKFGIIHSCLNHELVFHALWQNH*

 

>v31.000479a 33 24% to #5 16-17k                                                       

MNIIIGVIFVFILLFLRHFYKRKSHPPGPLGFPIFGIFPYSITQIIFSK

FMTNVSECEYFKQMRQKYGSIWSIETVGQRIITLASFDIVHEAFVENGMLFSGCAQNSLV

ELYGGKRGILFVDGAIWKQNRHLAIKALTTMGMGTSALEKLNCEQADCLIKEIKQKSGIQ

IDPRKEFVKAIANIINIVNFNEPNNENDPKFEKYMLCLEKFANSHIKHIIFIWPSFFLSK

LGRFLFPKVEEYVNAIYNNRRLVADKCMSRLKVFNQSSDLKCVFDYIWKNEDFIEDLNDQ

SFNVKIFNVIQIMTDLQNAGTETTYNLLRFCCLLLGRYPEVQDDLRNELQNLVEDSTQIT

LNLKNKCPKLLSFIGEVHRFCNLVPKVGHRVMKKFIFRGFEFDTSDMVFADFNSIMFDEN

HFDNPQKFDPYRFLDETRTKYIPSQHLIGFGIGRRTCIGESLANMEIFFLLSRIIFEFQI

SLSNETVEHFAKIFEGEIGIMRTSLYHELVFHSVSKI*

 

>v31.000479b 31 24% to #5 27-29k                                                       

MNIIIGVIFVFILLFLRHFYKRKSHPPGPLGFPIFGIFPYSITQIIFSKFMTNVSECEY

FKQMRQKYGNIWSIETVGQRTITLASFDIVHEAFVENGMLFSGCAQNSLMELFGNRHGIL

FADGIIWKRNRHLAMKGLRYFGMGTSTMEKLICEQADCLIKEIKQKSGIQIDPSKIFVKN

ISNIINIVNFNEPNNENDPKFEKYVSCLDKISNSHIKHIIFIWPSFFLSKLGRFLFPKVE

EYVNAIYNNRRLVADKCMSRLKVFNQSSDLKCMFDYIWRNEDFIEDLNDQSYDVKLFNVI

QIMTDLQNAGTETTYNLLRFCCLLLGRYPEVQDDLRNELQNLVEDSIQITLSLKNKCPKL

LSFIDEVHRFCNLVPKSDHRVMKKCFFRGFDFYPSDMVFADYNAIMFDENHFDNPQKFDP

YRFLDETRTEYTPSKYLIPFGMGRRTCIGENLSNMEIFLFLSRILYEFHISLLNETVEHF

TEILKGETGIIHSCLNHELIFQSSGKVSSFIG*

 

> v31.021522 30 26% to #5                                                       

MFLDCFIVFISLLSATERSSWPNWMARFWINTGKTILVLTGSKYSQSKCYSEMKDKYGAL

WSRKLFHIREIVFTSYELIYETLVIHGTQFSNRPKDLPILDYVSKKGGIIFIGGKYWKAH

RKDTLKIMHKLGMTNLTIEDKVVQESEKLLQEIFKSANKPIESDKLYSKPVMNIICQLML

NRSFEYDDQKFINVIKNIQVIGQDGNVHFLFYMRSIINFWIFREAFKSVRIFMDNLEELE

EWVESQLDATSEKFNPEFDNPSCILEHYLVLQKQQESNGESSDLYSKHQIIRNCFDIYSA

GIDTTTHMLCWCTLMLGLKPDFQTRIRNEIKANKSNSAMNFNTTKSFHFFKAFLDECFRY

FPVIPRIAHAVEKPIYFKGFHLVPRDLLMMDFKAMSKDPDIWKDPEEFDPFRFMDETRHL

YKPNKHLIAFGIGKRSCMGENLARCELTIILLSIIQTFEIKLTQNTLENINLIISGTEGL

IYAPLEHEIIYIPI*

 

> v31.008063 38 24% to #5                                                       

MFIEVLCFLIVLLFLYHYYQ

RQKDPPGPIGWPVFGSTPERLYWYLTGRKYSQSKCYSEMKEKYGALWSRKLFHIKEIVFA

SYELIYESLVIHGTQFSNRPKDLPILDYVSKKGGIIFIGGKYWKAHRKDTLKIMHKLGMN

NLTIEDKVVQESEKLLQEIFKSANKPVESDKLYSKPVMNIICQLMLNRSFEYDDQKFINV

IKNIQVIGQDGNVHFLFYMRSIINFWIFRETFKSVRIFMDNLEELEEWVESQLDATSEKF

NPEFDNPSCILEHYLVLQKQQESNGESSDLYSKHQIIRNCFDIYSAGIDXXXXNHR-ISH

TKSFHFFKAFLDECFRYFPVIPRIAHAVEKPIDFKGFHLVPRDLLMMDFKAMSKDPDIWK

DPEEFDPFRFMDETRHLHKPNKHLIAFGIGKRSCMGENLARCELTIILLSIIQNFEIKLT

QNTLENINLIISGTEGLIYAPLEHEIIYIPI*

 

> v31.023906 34 24% to #5                                                        

MFIEVLCFLIVLLFLYHYYQRQKDPPGPIGWPVFGSTPERLYWYL

TGRKYSQSKCYSEMKDKYGALWSRKLFHIREIVFASYELIYETLVIHGTQFSNRPKDLPI

MDYVSKKGGIIFIGGKYWKAHRKDTLKIMHKLGMNNLTIEDKXXXXKIIATNLGMNN-P-

RQGIQESEKLLQEIFKSANKPVKSDKLYSKPVMNIICQLMLNRSFEYDDQKFINVIKNIQ

VIGQDGNVNLLFFMRSIINFWIFRETFKSVRIYMDNLEELEKWVESQLDATSEKFNPEFD

NPSCILEHYLVLQKQQESNGESSDLYSKHQIIRNCFDIYGAGIDTTTHMLCWCTLMLGLK

PDFQTRIRNEIKANKSNSAMNFNTTKSFHFFKAFLDECFRYFPVIQRAVHAVEKPIDFKG

FHLVPRDLLMMDFKTMSKDPDIWKDPEEFDPFRFMDETRHLHKPNKHLIAFGIGKRSCMG

ENLARCELTIILLSIIQTFEIKLTQNTLENINLIISGTEGLIYAPLEHEIIYIPI*

 

>v31.028126 29 25% to #5                                                      

MFIEVLCFLIVLLFLYHYYQRQKDPPGPIGWPVFG

STPERLYWYLTGRKYSQSKCYSEMKEKYGALWSRKLFHIREIVFASYELIYESLVIQGTQ

FSNRPKDLPIMDYVSKKGGIIFIGGKYWKAHRKDTLKIMHKLGMNNLTIEDKVVQESEKL

LQEIFKSANKPVKSDKLYSKPVMNIICQLMLNRSFEYDDQKFINVIKNIQVIGQDGNVNL

LFFMRSIINFWIFRETFKSVRIYMDNLEELEEWVESQLDATSEKFNPEFDNPSCILEHYL

VLQKQQESNGESSDLYSKHQIIRNCFDIYGAGIDTTTHMLCWCTLMLGLKPDFQTRIRNE

IKANKSNSPMNFNTTKSFHFFKAFLDECFRYFPVIQRAVHAVEKPIDFKGFHLVPRDLLM

MDFKTLSKDPDIWKDPEEFDPFRFMDETRHLYNPNKHLIAFGIGKRSCMGENLARCELTI

ILLSIIQTFEIKLTQNTLENINLIISGTEGLIYAPLEHEIIYIPI*

 

Planarian clade B

 

>v31.000423 19 43% to #5                                                      

MILVVLFCITVYVFYKFYLRKTSSALPSHLKPIPV

VSGGLPIVGNLLNVSSAPFQYLNDLREKYGPIYSIQIGWKNIIVLNNYDVINKSLKEQGN

TFSGRWANVFTKDLAKDSGILFKDKEFWATQRSFTLRVLRDFGFGKQCAENVVYREIEKM

VKEIKLKNGRNIETTKLVSFATVNMISDFIMKREFENNDENILYVIKSIQDIARVADVNS

LVMNVFPIFDWFPRISLMLIYYFHPMRIIQTIDRTTDFIGSIVEDHRKRFCPENESQDLI

DAFLLEQHRLNSLNSKNHSFTDWQVVRLISELFLAGYETTANTLSWTFLLLAIHPEIQTK

IRDEISKEIGFIRWPTMNDKKSLNYCQAVLDEIFRYSTVLPLSIAHRVLENASINGFFVP

ADSILFPNLYACHRDPSVWKKPYDFYPEHFLNENGNYQPKVELVPFGLGKRQCLGEALAR

MEEFIVVVSFLQKFSISLSDESKKLDTFQLLNGSRGTFRGPAIHTLNFDELVH*

 

>v31.001047 28 32% to #5    AIG aligns with CIG  no CYS   

39% to v31.003777                                           

MAWHIIFAVAYLILGLVLVSVLHGKKRKPRFSDQEKPKLNDV

STIKSYLFYSLPIYRNLSVFRLKYGNFFSLKLGMRKITFINNFDLIKQIMYGKEDTFSGK

WKPKMQSFTNDRSDYLKEQERIVTQRNFNNKLLKDFLTEKSKLCSTISMECQHFLKELER

SNEKSLNISNMVSLFSFNIFFSLLLDKRFEVKDPKFRTLQDTMTIEIKKRNFLYIFHLLF

PFIENSKFTAKFLIWMLSRTKIYKEFNEIFDDELRAHLKPIENNLEHDDLITYIQSMKYY

STSLSSSVRKSFDKKLIVNSYETISSDYEIISSILLWSMFFMSYFPEVQKKVHEEIVYVI

GKERMPEMKDKENLHYTQAVMDEILRLGSPVFLGAFYRAFKDISIENNIIPQNSLMFINL

YGCHTDPKLWEKPFDFYPNHFLNKSDSNQTTYSPREELLIFGIDKKQAIGESLTRAQYFL

LFASIFQKFKVNSSENISTNKFKEIIRNSDGMVRTPSIKNYVFHIL*

 

>v31.005628 22 41% to #5                                                        

MLILIILFVLCFLICFRLLNFKRNLPKGLKECPLVKSW

PIVGILPQIKAPMYDFFMKLAKTYGPIYRIRFGMREMVILNNDDLVRKALVNNSDSFSGR

STTQKYTFLCMNSGIIFLDNHVWSIQRPFLSKVLRDFGMGNCKSVEIAQSECLMLLDDLA

KKGGQSVDVKNLFYDCTANIIAKFVMNKRVTSTDSLFRLMKDMQNRIVKQKDLVQFFLLM

FPIFDSWDWFNRFFVFITFRHRIFERIHVKWQEVIEDHLKTVDFDCEGDDLIERFLIKKN

QLLQQEKSVKTFTDMQLIRNSLELFVAGFDTTATTLVWSMMFMAKHIKIQEQVYDEIIDK

IGKERLPTLSDKKSLSYTQAVIDEILRMSSVAPMAIIHRTTKDAVVDGYFIPENTLIIPN

IYACHYDPKIWETPNEFSPNHFLSKNSTGELKYKAREELIPFSLGKRQCLGESLARTEFF

IFFTTILQRFRVQFSKPLSDSDYRSAIRGNDGVLRVCSVTDYSFEIRD*

 

>v31.001287 27 40% to #5                                                       

MIYFIILALFIIAYFLVRSRKKDSRMIPIEKGLPL

VGVLLKLKYPVLDYYEKLARKHGPLFRMRIGMRDIFVLNNYDTIRKAFVENGNNFAGRWK

LFQALVVTNNSGLVSIDGDLWHMHKAFVLNVMHKCVTSFDNTQEECLCLLEELESLNKKV

INLSEVIRVYTVNIISRFVLNKRFNRDNPIFTFLSKMSLEFAERADIFQFIFFTLPIFDK

LPGICRLLIDFTDRGRKFRKAQNIFQEEIEEHRKRLDLDSEGEDLIDCFLIKQHQLKSTT

GSVGSFTDIQLIQSSLELFIAGFETTSTILEWSFLFMARYNDIQENVYKEIVDKIGKERL

PTMKDRKQLVYTQAVMDEVYRLSSISPLAIIHRVVNDCNIDGYFIPKDSLVQSNIWGCHT

DPNVWERHKEFYPHHFLKTDAAGNIKYDPKQELNAFGIGKRQCIGESLARMEYFIFFTSV

MQRFHVNFAKSVNEIEFDSAIRGCDGIIRSPVSLKDYIFTRRK*

 

>v31.000183a 23 42% to #5 169-170k                                                      

MIIICIITIILGVTIYWKLIKRTSISKGLKIIPTEKGLPFVGVLHQMETPILKYYERLIKKHGSV

FAVRYGMRDVIILNDYQTVKSAFVDDGSNYSGRWRLVSSLLAANNSGIAFIDGQNWQTQR

TFMLKVLHDFGMGKSVSYEIVRQECFNLLEELETLQGKRVDASDLIHLYTVNIISKFTMN

KRFSRGDDKLRNINDLITHVAKQKDLLQFWFFALPIFDKMPLLCDWIISRTHRIKHFDAV

HKYFQEEVDEHRKSLDLNSEGEDLIDRFLIKQHQLKASTGSIEGFTDFQLLRMSLELLIA

GLETTTTTLEWSFLFMSKYTIIQQKVHEEIVNIIGKERFPTTNDRKDLNYTQAVMDEIFR

VSSVAPLALVHRAMNDSSINGYYVPKDALVFSNLYACHFDEKVWEKPKEFYPERFLSTDI

NGDIKYTANENLNAFGMGKRRCAGESLAKMEAFIFFTSIMQRFEVRFSEDVSDALYEEAL

IGNDGLVRCSKINKFIFTKRE*

 

>v31.000183b 24 41% to #5 176-178k

MIWFFLLILISVVALLFNFLRPRPIPKGLQKVPIEKGFPLIGILPLLTFPIL

EYYERLIKKHGPVFGMKMGLRTIFVLNDYRAIRNALIDNKDNFSGRWRLFQSLLGTQNAG

IVSTDGDQTHIHRRFIIKVMHEFGMGTLRSLTNIQTECTALLECLEKCQGERIQVSDIIP

TYMANIITKFLMNKTFSVGDPKLNLIKNILTEIARTKDIFQFLYFTLPILDRFPRVCDFL

MSFTTRKDLFNKVHRILQQEIEEHRKTFDVKNEGEDLIDRFLMKQDQLKSSEESVKSFTD

FQLVRHLLELFLAGYETTSTIFHWSLFFMSKYQEIQSKVHQEIIDSIGTERLPTNNDRKD

LTYTTAVMDEILRLSSVLPLALIHRTTNDTEVNSYFIPKDSLVLINLYACHVDPNVWEKP

REFYPEHFLSPKENGNIKYVPKDQLNPFGMGKRQCIGESLGRMQYFIFFVSILQRFQVSF

ADPVTESQFQEALRANEGIVRYPSINEFVFNRIQ*

 

>v31.000183c 25 39% to #5 181-183k                                                      

MIIIYLCTFLLGAAI

FWKFTRRTPIPKGLRRIPTEKGLPFVGVLHQIETPIFRYYERLIKKHGSVFAVRYGMRDV

IILNDYQTVKSAFVDDGSNYSGRWKVSQSLLMSGYARIAFIDGEMWQIQRPFVLKVFHDF

GIGKSVSYEIARQECLYLIDELETLNGKWVDVSDLIHLYTVNIISKFITNKRFSRKEDKY

KTIGNLGQQALKQKDCLQLLFYTLPLFDKMTWLCDWILRRTNRKVLFDAVDKLVQEEIDE

HRKTLDLNSEGEDLIDRFLIKQHQLKASTGSSKQFTDFQLRRLSLELLLAGLETTTTSLE

WSFLFMSKYTIIQQKVHEEIVNIIGKERFPTLNDKKDLHYTQAVIDEIFRVSSVGPLALV

HRAMNDSSINGYHVPKDSLVYSNLYACHFDEKVWEKPKEFYPERFLSTDINGDIKYKAND

SLNPFGVGKRRCTGESLARMEIFVFFTSIMQRFEVRFAEDVSDALYEEALIGNEGIVRGS

KIKNYIFTKLK*

 

> v31.003865 26 40% to #5                                                       

MIIIYLCTFLLGAAIYWKFIRLTSIPKELKRIPTEKGLPII

GVLHKIESPILRYYERLIKKHGSVFAVRFGMRDVIILNDYQTVKSAFVHDGSNYSGRWKV

SQSLMMSGNAGIAFIDGEMWQTQRTFMLKVFHDFGMGKSVSYEIVRQECHFLIDELETLI

GKCVDVSDLIQLYTVNIISKFIMNKRFSRKEDKYKRLSDLSQHALKQKDFFQFLFYTLPL

FDKMNWLCDWILRRTNRRIPFDAVDKLVQEELDEHRKTLDLNSEGEDFIDRFLIKQHQLK

TSTGSSEQFTDFQLLRLSMELLVAGLETTMTTLGWSFLFMSKYTNIQKKVHNEIIKTVGK

ERFPTLNDRKYLHYTQAVMDEVIRVSSVAPLAFVHRAMEDSSLNGYYVPKDSLVFSNLYA

CNFDEKVWKKPKEFYPEHFLSTDINGVIIYTPNESLNPFGVGKRRCVGESLARMEIFVFF

TSIMQRFEVKFAEDVSEALYEEALIGNEGVVRSSKIKNFIFEKLG*

 

>v31.003664a 18 41% to #5  6-7k                                                    

MILILCSFCIILILYYFGKR

WRSGQRILRNQAPLVKGLPLIGSLLDIKLPLHIYLKELSEIYGPVYRIKLGFREILVFNS

FEILKQTFKLDGNSFSGRWSTCLFKVLTHDSGIFLKDGEVWVKQRAFVQKVLRDFGVGRS

KSSETVQFECESCLDEIKLLVGKKVDFSQLIPIYTLNIISKFIMNKRFCREDPKIKMIEN

VLLEFLKKADFVTMFYLVFPCFENSSLVFKILMKLQAHLKMVIEIHDSFKKEIESHQKVL

DLNSDGEDMIDRFLLEQKRVREKTGSIDTFTEWQLIRNSFEMLVAGYETTSTTLVWSLMF

MSRHQDIQEKVYKEIVSVIGKEKLPSISDKKDMDYTQAIMDEILRMSSVTPLASFHRAMI

DCSAKGFFVPKDTLIFPNLYACHFDPDVWINPSEFYPNHFLSTDDNIPMKYCPREELIPF

GIGKRQCLGESLARMEYFIFFCSILQRFKVRFAEEMTEMRYQEVLRGNDGVIRICLETNF

IFEERQL*

 

> v31.003664b 41 17-21k    gap in middle                                               

MILILFSFCIILILYYFGKRWRSRQRILRNQAPLVKGLPLIGSLLDIK

LPLHIYLKDLSQIYGPVYRIKLGFREILVFNSFEILKQTFKLDGNSFSGRWSTCLFKVLT

HDSGIIFTDGEVWAKQRAFMQKVLRDFGVGRXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

XXXXXX-WQRYETTSTTLVWSLMFMSRHQDIQEKVYKEIVSVIGKEKLPSINDKKDMDYT

QAIMDEILRLSSVTPLASFHRAMIDCSAKGFFVPKDTLIFPNLYACHFDPDVWSNPSEFY

PNHFLSTDDNIPMKYCPREELIPFGIGKRQCLGESLARMEYFIFFCSILQRFKVRFAEEM

TEMRYQEVLRGNDGVIRICLETNFIFEERQL*

 

>v31.003664c 42 possible pseudogene or partial seq that runs off the end

MILILF

SFCIILILYYFGKRWRSCQRILRNQAPLVKGLPLIGSLLDIKLPLHIYLKDLSQIYGPVY

RIKLGFREILVFNSFEILKQTFKRDGNSFSGRWSTCLFKF &

QVLTHDSGIIIYRWAKFGLNSHHLLQKGLKRISGVGKSKSSE

 

>v31.041651 17 42% to #5                                                     

MILILFSFC

IILILYYFGKRWRSRQRILRNQAPLVKGLPLIGSLLDIKLPLHIYLKDLSQIYGPVYRIK

LGFREILVFNSFEILKQTFKLDGNSFSGRWSTCLFKVLTHDSGIIFTDGEVWAKQRAFMQ

KVLRDFGVGRSKSSEIVQFECESCLDEIKQLAGKKVDFSQLIPIYTLNIISKFIMNKRFC

REDPKLKFMQKMMLDSLKKIDFVLIFYLVFPYFENSSLVFKFFMKLTTHLKMVLKIYDSL

NEEIETHQKLLDWNSDGEDMIDRFLLEQKRVMEKTGSVETFTKWQLIRNSFEMLVAGYET

TSTTLVWSLMFMSRHQDIQEKVYKEIVSVIGKEKLPSINDKKDMDYTQAIMDEILRLSSV

TPLASFHRAMIDCSAKGFFVPKDTLIFPNLYACHFDPDVWSNPSEFYPNHFLSTDDNIPM

KYCPREELIPFGIGKRQCLGESLARMEYFIFFCSILQRFKVRFAEEMTEMRYQEVLRGND

GVIRICLETNFIFEERQL*

 

> v31.007336 16 42% to #5                                                     

MILILFSFCIILILYYFGKRWRSRQRILRNQAPLVKGLPLIG

SLLDIKLPLHIYLKDLSQIYGPVYRIKLGFREILVFNSFEILKQTFKLDGNSFSGRWSTC

LFKVLTHDSGIIFTDGEVWAKQRAFMQKVLRDFGVGRSKSSEIVQFECESCLDEIKQLAG

KKVDFSQLIPIYTLNIISKFIMNKRFCREDPKLKFMQKMMLDSLKKIDFVLIFYLVFPYF

ENSSLVFKFFMKLTTHLKMVLKIYDSLNEEIETHQKLLDLNSDGEDMIDRFLLEQKKVME

KTGSVDTFTEWQLIRNSFEMLVAGYETTSTTLVWSLMFMSRHQDIQEKVYKEIVSVIGKE

KLPSINDKKDMDYTQAIMDEILRMSSVTPLASFHRAMIDCSAKGFFVPKDTLIFPNLYAC

HFDPDVWSNPSEFYPNHFLSTDDNIPMKYCPREELIPFGIGKRQCLGESLARMEYFIFFC

SILQRFKVRFAEEMTEMRYQEVLRGNDGVIRICLETNFIFEERQL*

 

>v31.003777a 14 46% to #5 9-10k 3 exons   pseudogene                                             

MIFIIFISFIGICGLYFLINLKN

RNTSISVGLKNLPIVRGIPIFGILFQKKGPFHIYLKNLGHKYGSIFSIKFGMREVVVLNK

(small deletion)

RTFVLKVFRDFGFGKSRSYEIVESECKFMLEELKSLQ

KTILDFYNFIPIYTANIISKFIMNKRFERTDPKIKAIEFAIFQTTKKSNILQILHLMFPK

LDDSFIISTFTLWLTSRRKLFSGIEKVFEEEIQEHLENLDFNSEGNDLIDRFLIEKQRLV

QSTGNHQGFTDKQLIFMTFELFGAGYETTSTTLGWSMVFMARYPHIQNKVFDEIRENIGL

EKIPIMNDKKSLTYTQAVMDEILRLSSVAPLGIYHRTFSDTSVDNFFIPKNSLIFPNYFA

CHTDPETWRKPFEFFPEHFISSTENGELKYTPREDLVPFGIGKRQCI ()

GESLAKMEFFIFFVSIFQSFEVKFAENISDKLYQENLEGNDGIIRYSKIKNFIFHKRDVNN*

 

>v31.003777b 15 46% to #5 17-19k                                                       

MMFLLFISFIGICGLYFLIN

LKNRNTSISVGLKNLPIVRGIPIFGILFQKKEPLHIFLKNLGHKYGSIFSIKFGMREVVV

LNNFRIIKKTLKDEGDNFSGRWQTKLKTVISLNTGILFIDGDHWHTQRTFVLKVLRDFGF

GKSRSYEIVESECKFILEELKSLQETVVDISKLIPIYTANIISKFIMNKRFERTDPKIKA

IEFAIFQTTKKSNILQILHLMFPKLDDSFIISTFTLWLTSRRKLFSGIEKVFEEEIQEHL

ENLDFNSEGNDLIDRFLIEKQRLVQSTGNHQGFTDKQLIFMTFELFGAGYETTSTTLGWS

MVFMARYPHIQNKVFDEIRENIGLEKIPIMNDKKSLTYTQAVMDEILRLSSVAPLGIYHR

TFSDTSVDNFFIPKNSLIFPNYFACHTDPETWRKPFEFFPEHFISSTENGELKYTPREDL

VPFGIGKRQCIGESLAKMEFFIFFVSIFQSFEVKFAENISDKLYQENLEGNDGIIRYSKI

KNFIFHKRDVNN*

 

>v31.001148 21 42% to #5                                                       

MDLIYYVVFLLTFVFVYFWTFKKDRLPNGQREVPTAYGLPL

IGVIHKIKPTLPYYLLRLGKELGPIFSIKLGMRKVFVLNSYDSIKKALVDDGYNFSGRWK

TKLFSDFLHDSGVLFTDGKLWETQRSFILYVLRDFGFGKSRSIEMVQDECSNFIEEIDTL

VGQTLNFSTLIPIYSMNIISRFITNRRYTRDDPMIQLIEESGRKLLTNKDKAQFILLMFP

IFEQSTLFVKLIQHISSVKFIIPV-KYFAEIIEEHREALDFNSEGEDLIDRFLKKQRQLE

QSTGNFWTFHNWQMIQSACELFFAGSETSSTTLTWSCLFMSRYPDIQKKVHNEIVEIIGI

ERLPTMNDKRSLNYTQAVMDEILRLSSVVPLSIFHRTFEDANINGYYIPKNSMIFPNLFA

CHVDPDLWTNPMTFDPEHFLKLETDGTLKYRPREELIPFGIGKRQCIGESLARIEFFIFF

TRMMQRFRISFAEELSEEQYSKVLISDDAPVKSCQITDYIFHRL*

 

>v31.000808 20 42% to #5                                                       

MDLIFYVVSLLTFVFIYFWTFKKDRLPNGQREVPAAYG

LPLIGVIHKIKPTLPYYLLRLGKELGPIFSIKLGMRKVFVLNSYDSIKKALVDDGYNFSG

RWKTKFFSDFIHDSGVLFTDGKLWETQRSFILNVLRDFGFGKSRSVEMVQEECSNFIEEL

DTLVGQTLNFSTLIPIYSMNIISRFITNRRYTRDDPMIQLIEESGRKVLTNKDKAQFILL

MFPIFEQSTLFAKFMQLTSSWKSLIPVQEYFAEIIEEHREALDFNSEGEDLIDRFLINQH

QLQQRNGNIETFNNWQMIRNASELFIAGSETSSTTLTWSCLFMSRYPDIQKKVHNEIVEI

IGRERLPTMNDKRLLNYTQAVMDEILRLSSVVPLSVFHRTFEDANINGYYIPKNSLIFPN

LFACHVDPDLWTNPMTFDPEHFLKLETDGTLKYRSRKELIPFSIGKRQCIGESLARIEFF

IFFTRMMQRFRISFAEELSEEQYSKVLISDDAPVKSCQITDYIFHRL*

 

>v31.001324a 9 46% to #5 44-45k                                                       

MELCYCVTFILIFLYFLYWVCRKPPIPKGLKEIPMESGLPIIGAIHKIEPPMHHYLLKLGR

KLGPIFRIKLGMRKIVVLNNYEMIKKALIDDANNFSGRWSQRFIKEITHDSGIIFKDGKI

WATQRSFILKVLRDFGLGKSKSIEIVQNECFNFLDELDTLSGQIVDFSELIPIYTTNIIS

RFIMNKRFTRDDPNIQLIEETINNSTKNKDIIQLILLMFPCFDNSSLICQLIYLLTSKWK

SFVKIQNYFGEEIEEHRTNLDLTSEGEDLMDRFLIKQHELKQSTGNVESFVDWQLIRTSV

ELFIAGYETTSTTLNWCCLFMSKYPDIQEKVHKEISEIIGKERLPTMNDKKALNYTQAVM

DEVLRLSSVAPMGLIHRTFDDANINGFYIPKDSLIFPNLYACHMDPEIWESPTTFYPEHF

LSLDSNGISKYQPREELIPFSVGKRQCLGESLARSEYFIFFVSMVQRFQISFAEEISEDK

YSQTIIGNEGVIRTSDIKNFLFKRR*

 

>v31.001324b 10 45% to #5 47-48k                                                        

MELCYCVTFILIFLYFLYWV

CRKPPIPKGLKEIPMESGLPIIGAIHKIEPPMHHYLLKLGRKLGPIFRIKLGMRKIVVLN

NYEMIKKALIDDANNFSGRWSPKVDAEVTHDSGIIYKDGKIWATQRSFILKVLRDFGLGK

SKSIEIVQNECFNFLDELDTLSGQIVDFSELIPIYTTNIISRFIMNKRFTRDDPNVHFME

KAITKFTKNKDIFQLILLMFPCFDNSSLICQLIYLLTSEWKSFIRTQDFFADEIKEHRNA

LDFTSEGEDLMDRFLIKQYQLKQSTGNVESFVDWQLIRTSVELFIAGFETTSTTLNWCCL

FMSKYPDIQEKVHKEISEIIGKERLPTMNDKKALNYTQAVMDEVLRLSSVVPMALIHRTF

HDANINGFYIPKDSLIFPNLYACHMDPEIWEFPKTLYPEHFLSLDSNGISKYQPREELIP

FSVGKRQCIGESLARSEYFIFFVSMVQRFQISFGEEISEDRYSQTIIGNEGLIRTSDIKN

FLFKRR*

 

>v31.001324c 11 46% to #5 64-66k                                                       

MELCYCVTFILIFLYFLYW

VCRKPPIPKGLKEIPMESGLPIIGAIHKIEPPMHHYLLKLGRKLGPIFRIKLGMRKIVVL

NNYEMIKKALIDDANNFSGRWSPKIITEVSHDSGIMFKDGQGLATQRSFILKVLRDFGLG

KAKSLEIIQNECLNLLDELDTFSDQIVDFSEFFPIYTTNIISRFIMNKRFTRDDPNIQFI

WKTMDNLAKNKDIFHLILLMFPCLDQSFIVARLLFLFVSDWKRFVKIQDYFAEEIKEHRN

ALDFTSEGEDIMDRFLIKQHELKQSTGNVESFVDWQLIRNSVELFIAGFETTSTTLNWCC

LFMSKYPDIQEKVYKEISEIIGKERLPTMNDKKAMNYTQSVMDEVLRLSSVSPLSLIHRT

FHDANINGFYIPKDSLIFPNLYACHMDPEIWESPTTFYPEHFLSLDSNGISKYQPREELI

PFSVGKRQCLGESLARSEYFIFFVTMVQRFQISFGEEISEEKYLQTIIGNEGLIRTSDIK

NFLFKRR*

 

>v31.001324d 12 44% to #5 81-83k                                                       

MELCYYVTFILISLFFYYWTFKKTQIPKGLKEIPVESGLPIIGAIHKIEMPISRYFQKLSRKLGPIF

RIKLGMRKIVVLNNYEMIKKTLIDDANNFSGRWLPKVFTDAAHDSGIMFKDGHIWAIQRS

FLLKVLRDFGLGKAKSLEIIQNECLNFLDELDTLSGQIVEFSELIPIYTTNIISRFIMNK

RFTRDDPKVQFIGKTITNLTKSKDILQLILIMFPCFDTSSLFSKLVFLLNTKWKRFVTIE

NYFWEEIEEHRTNLDLTSEGEDLMDRFLIKQRQLKQSTGNVESFVDWQLIRNSVEIFIAG

YETTSTTLNWCCLFMSKYPDIQEKVHKEISEIIGKERLPTMNDKKALNYTQAVMDEVLRL

SSVVPMALIHRTFYDANINGFYIPKDSLIFPNLYACHMDPEIWESPTTFYPEHFLSLDSN

EISKYQPREELIPFSVGKRQCLGESLARSEYFIFFVSMVQRFQISFAEEISEEKYLQTII

GNEGHIRTSDITKFKFKKR*

 

>v31.001324 12-de 37% to #5 83k last exon only                                                   

LKNSEYFIFFVSMVQRFQISFAEDISEEKYSQTIIGNEGHIRTPDIKNFQFKMRY*

 

>v31.001324e 13 46% to #5 89-91k                                                       

MELCYCVTFILIFLYFL

YWVCRKPPIPKGLKEIPMESGLPIIGAIHKIEPPMHHYLLKLGRKLGPIFRIKLGMRKIVVLNN

YEMIKKALIDDANNFSGRWSPKIITEVSHDSGIMFKDGQGLATQ RSFILKVLRDFG

LGKSKSLEIIQNECLNLLDELDTLSDQIVDFSEFFPIYTTNIISRFIMNKRFTRDDPNIQ

FIWKTMDNLAKNKDIFHLILLMFPCLDQSFIVARLLFLFVSDWKRFVKIQDYFAEEIKEH

RNALDFTSEGEDLMDRFLIKQHELKQSTGNVESFVDWQLIRNSLELIIAGYETTSTTLNW

CCLFMSKYPDIQEKVHEEISEIIGKERLPTMNDKKALNYTQAVMDEVLRLSSVAPMGLMH

RTFHDANINGFYIPKDSLIFPNLYACHMDPEIWESPTTFYPEHFLSLDSNGISKYQPREE

LIPFSVGKRQCIGESLARSEYFIFFVSMVQRFQISFAEEISEEKYLQTIIGNEGHIRTSD

ITKFKFKNR*

 

>v31.000013:319135..327791 (- strand)

gene model mk4.000013.25 was three exons, changed to one exon

MVIFIYILLLIFFILLLSNIRR

FYLDQTPPIGYRKLPEGKEWSFIGNLIRLFREPLWSHLYNKS

KIHGPIFKLIMGCREVIVLSSNNTICKVFKNKGVNLSERWPDILFETYNCEMNILFKDKA

WWTIRRSLILKVLKNFISKEQKLIEIVNEECLILLKDIKDVCGKEINTKELFRKSMVNIL

SKIIINLRFERESKNYKILNEIIEENSQIKNMLIKTMLLFPWLEKLPKFVRFLFWCSEYK

RNYKIQNSFFKEIIEEHKRKFDSMNKVENLIDAFLKEQNNLGENNRIIDGITDWQLIRNG

IELFIVAFEKIRTLLSWSMVLMARHSEIQERIHEEIIQLIGRDKIPNLVNKRSLPYTQAV

IDEIFRYSSLVPITLPHKITEDIELHEYLIQKNSLVFVNFYAVHRDGDVWEKPDDFYPEH

FLKLLTNGDIKYTPRDELMPFGIGSMQCLGEKFTKFQLFVILTSIVQNFKIKFRREITEE

NSNSILHETNGMIRSPLNSDVVFEMR*

 

>v31.008506 8 55% to #5

MITLLNVLISIFLVIILYFLSRKFSTNPIPIGLRPLQGPS

GWPLVGNLFSIKEPLWSIFYKHAKTYGEIFKFKLGNRELVIINSSKILFKTLKDQGVNFS

GRQQNDLIKYITHDCGIVFKEKEWWSTQRSFALKVLRDFGLGKQRSIEIVNNECSILLQE

LEETSGKNICTKSCFPSVTVNIIAKLLMNMRFEK NSSEFFEFRKIIEQTTHNRSFLVILF

QIYTLPHFTIPFIARFSGLRNHIDAVHNFVINIINERKKTFSHENEGQDFIDMFLIEKHK

LESENVKNHTFTDWQLIRMVVELFGAGFESTSTTLSWAMVLLARHTEIQDKVHQEIINLI

GVDKIPDLNNKKFLHYTQAVMDEILRISTVAPLAVPHRVFNDAEVEGFYIPKDSTIFTNL

YACHRDEKVWEKPFDFYPEHFLTKLDNGEMKYTPREELVPFSTGKRQCPGESLARLEYFV

FLTSILQKFKIKFNKAIDDITYEKITRGTDGTVRAPLDVDLIFEMRF*

 

>v31.023085  same as v31.008506 8 with frameshifts at & 

runs off the end                                           

MITLLNVLISIFLVIILYFLSRKFSTNPIPIGLRPLQGPS

GWPLVGNLFSIKEPLWSIFYKHAKTYGEIFKFKLGNRELVIINSSKILFKTLKDQGVNFS

GRQQNDLIKYITHDCGIVFKEKEWWSTQRSFALKVLRDFGLGKQRSIEIVNNECSILLQE

LEETSGKNICTKSCFPSVTVNIIAKLLMNMRFEK &

NSSEFFEFRKI &

QTTHNRSFLVILFQIYTLP &

HFTIP

 

> v31.000051 7 63% to #5                                                       

MDNFVLIFITLGIALYFIKKFFFNQSASKNLNLLQGPTGWPL

IGNINSFNEPLWMVLYKYSKHYGSLFKYTMGRRDIVILSSSKMLFKTLKEQGNNFSGRFQ

NFLIKARTRNSGILFTEKEWWATQRSFALKVLRDFGFGKQRSQEIVSNECMILLKDIEEE

CGKVVSARSLFPKATVNVIAKLVMNMRFESDSKDFTELTKLIEKAIQNNNFILTLFQMYP

ILDNMPFIAPFLLWWSGRNKTAKITSKFINNIIDDHKKRFNIENEGEDFIDIFLKEQHKL

NEEKVKNHTFTDQQLIRVVLELFFAGYETTSTTLSWSMLLLAKHPNIQEKVYQEIVDLLG

TEKLPNFADKKCLQYTQAVMDEILRVSTVVPLSISHRVFENTEVDGVYIPKDSIIFPNLF

ACHRDENVWEKPFEFYPEHFLNINNSGQLKYTPRDELVPFSIGKRQCLGEGLARIEFFIF

LTGIIQKFKIRFENEISEEEFAKITRGTDGGIRAPLSSDVVFELRI*

 

>v31.003543  6 95% to #5                                                       

MFTIFEVIVVIALALPIYFLSKLMTRPSIPKGLKPLQGPSGW

PLIGNLFTIKEPLWSFLYKQTFKYGPLFKFTMGCRDIVVISSNKILSKALKEKGIPFSGR

WKNIIAITLVHNSGILFTDGEWWATQRSFALKVLKDFGFGKQRSTEIISHECAFLLKEIE

ESCGKVISARSLFPKCTVNVIAKLAMNMRFERDTPNFKFLTANVERVAQNKNFIVTILLM

YPILEEYPRIAKLLLRFSGREQEFEEMHAHFRAIIEEHKKRFNVENEGEDFIDMFLREKY

ELDNEKVLNHTFTDWQLVRVIFELFVAGYETTSTTLTWTMILMAKNLEIQDKVHKEIVNI

IGSERIPNVSDKKSLPYTQAVMDEIFRVTSVVPLSLPHRLLENSEINGTFIPKNSIIFPN

LYACHRDAEVWKRPFEFYPEHFLTEESNGSIKYSPMDELVPFSIGKRQCLGEALARMEFF

IFFTSIMQRFKVKFEKEISVDAYAKMTMGTDGTIRAPLSSDVVFEIR*

 

>v31.010720 5                                                   

MFTIFEIIVVIVLALPIYFLSKLVTRPSIPKGLKPLQGPSGWPLIGNLFTIKEPLW

SFLYKQTFKYGPLFKFTMGCRDIVVISSNKILSKALKEKGIPFSGRWKNIIIKTIALNSG

ILFTDGEWWATQRSFALKVLKDFGFGKQRSTEIISHECAFLLKEIEESCGKVISARSLFP

KCTVNVIAKLAMNMRFERDTPNFKFLTANIERIAQNNNFIVTILLMYPILEEYPRIAKLL

LRFSGRDQEFEVIHAYFRAIVEKHKKRFNVENEGEDFIDMFLREKYELDNEKVLNHTFTD

WQLVRVIFELFIAGYETTSTTLTWAMILMAKNLEIQDKVHKEIVNIIGSERIPNVSDKKS

LPYTQAVMDEIFRVTSVVPLSLPHRLLENSEINGTFIPKNSIIFPNLYACHRDAEIWKRP

FEFYPEHFLTEESNGSIKYSPMDELVPFSIGKRQCLGEALARMEFFIFFTSIMQQFKVKF

EKEISVDAYAKMTMGTDGTIRAPLSSDVVFEIR*

 

Planarian clade C

 

>v31.008081 = AY067980 and DN308641 a longer EST

Upstream seq for the N-term part of the gene; this matches v31.016521

MWQDILTGAGVGILSVFV

ILKFIERLRLPPGPWALPFWGNYRGFNGAMIYKRMVEYYKPKYGNLVTLFW

GNTPIVFINDYKLAIEALGNKNGKHLAGRKSVQT (1)

(1) EDIVSEERHNIFGNNYGPLWKEMIRALKLGVSKCGLTPSLEKNIQKSINFYCDELIRNP

DNSGKGVFPFDESYWLSDNTVLSIVIGRTFTLEDEDYIILQDSRRQVFDIMDK (0)

SEFVNWIPFGRYLPLKCIRRLFEQSTKLNNLLAK FVSDARKMGRYRNNPDNIIEYLLQETENSKLVTEK (2)

KVNLNDVHIRQIFKDILFASYEGIGMTNIWYMALLGLHPQIQERIRLDITKLCSKKK

DSTPKFEWKNELPYIQAVEYEFFRFSSIVGILDFHCVLKSVTLGGYHIPKGCLIAVNQ

YAINHDAERWENPETIKPERFLDSSGAFTDRSDLMPFGMGVRMCPGTSITKGYGF

LLLCNLCLNYKIKWSSENEFTELVDDPENAVLFRFPKDYLLNISKV*

 

>clone= H.49.4f P450 AY067980 Schmidtea mediterranea

same as v31.008081

IGXXGGXXHFPKGXFNXXXXGRRFIXXXDXGEILEPFRLERFLDGSGAFPARFVLMP

FGMGVRMCPGPSITKGYGFLLLCNLCLNYKIKWSSENEFTELVDDPENAVLFRFPKDYLL

NISKV

 

In Figure S3, the bottom category 3 image is H.49.4f.

I think the number in Table 1 is a typo and should be H.49.4f.

 

Your 2002 paper on the ESTs gives the accession numbers.

I searched GenBank for H.49.9f and found no results, but H.49.4f pulled up a P450 sequence, so I think that is the one you labeled category 3 in Table 1.

 

>Related gene to v31.008081

Matches EST DN295793.1

v31.019995 exons 1,2

v31.016694  exon 4

v31.029556 CDF exon  3

MNRTISSYKLPPGPNRNSDEADFIINEKLIYQKIVQKYIPKYGPIC

TLYKDGTPVVFINDIETGYEVFVRKLSKQVSGRSASY (1)

EKILSGNRNNILNSDYGPLWKTLQRN

LHKGLRHCANLDRLNEIMYKCFDKFKETVEYRRGSKNGGIEVDNDLFWMADNVITMLLFN

EIFNDVDYKKYQGAIEVIPQYAEE (0)

CDFPNDKPSKKSDFKSVEKLQKVFNQLIEVLQIRIDEHRKKGDYKIACFDITDFVLLAVEEYHLDPKSKKL (2)

4054 GIKLTDLHVKHSLIDVIFASQEGLGNSFVWVFGLLALYPNIQENIRTEINKFRDDNNVEVTL 4239

4240 HWKSKMPYTNAVLCEIQRYASADTFTDFHSTMDDINIGPYEIPKDSLIVLNLYAIHHDK 4416

4417 KQWKNPDIIDPQNFLSEDGRSFKRRKDFVPYGLGLRTCVGNTFSENVLFVLIMLVVDNYK 4596

4597 MVWGKNKSQEIIDDKNAVFIRDVEPFNIQFIRLK* 4701
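Some exon lines in this listing are flanked by nucleotide coordinates on the contig, sometimes with a parenthesized number before the closing coordinate. A minimal sketch for sanity-checking such a line follows. It assumes the coordinates are inclusive, that a start larger than the end indicates the minus strand, and that the parenthesized 0/1/2 is an intron phase. None of this is stated in the listing itself, and the function name is hypothetical.

```python
import re

# start coordinate, residues (with optional '*'), optional "(phase)", end coordinate
EXON_LINE = re.compile(
    r"^\s*(\d+)\s+([A-Za-z*]+)\s*(?:\(([012])\))?\s*(\d+)\s*$"
)


def check_exon_line(line):
    """Compare the nucleotide span of an exon line with its residue count.

    Returns None for lines that are not flanked by two coordinates.
    """
    m = EXON_LINE.match(line)
    if m is None:
        return None
    start, residues, phase, end = m.groups()
    start, end = int(start), int(end)
    span_nt = abs(end - start) + 1  # assumes inclusive coordinates
    return {
        "strand": "-" if start > end else "+",
        "span_nt": span_nt,
        "residues": len(residues),  # a trailing '*' counts as one codon
        "phase": None if phase is None else int(phase),
        "codons_match": span_nt == 3 * len(residues),
    }
```

For example, `check_exon_line("10 MKV (1) 18")` reports a 9-nucleotide plus-strand span covering 3 residues. A mismatch does not necessarily mean an error, since split codons at exon boundaries or a transcription slip in a coordinate would both produce one.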

 

>v31.006592 2 aa diffs to v31.019995 exons 1,2 above

The rest is not in the first 9500 nucleotides of this contig

MNRTISSYKLPPGPNRNSDEADFIINEKLIYKKIV 11776

11775 QKYIPKYGPICTLYKDGTPVVFINDIETGYEVFVRKLSKQVSGRSASYG (1) 11638

9775 EKILSGNRNNILNSDYGPLWKTLQRNLHKGLRHCANLDRLNEIMYKCFDKFKETVEYRR 9600

9599 GSKNGGIEVGNDLFWMADNVITMLLFNEIFNDVDYKKYQGAIEVIPQYAEE (0) 9447

 

>v31.022901 37% to v31.019995 above in the C-term half

top half supported by EST DN311932

     MLTDLFESSVFYLC

6243 LGIVAYIIYKWHFPKKHNLPPGPKAWPIIGSIPDPYSSPLSYKRLIEDWKEKYGEIILF 6419

6420 KRMGMNFVYVSDYKLGYQAFVKDYSNELAGRPLNSV (1) 6527

6580 DRLSVGNENIFFTRSLDAGWTELRKQIMIMLRKVAEPQKMTEISNNAYELFATHIREKKE 6759

6760 LVSDGGFYVEQLCYWVGHNILVQVFWEDAVEFDNPNYQDTISVIGKLMAKVGN (0) 6909

     LAFFDITPLNCLPHPGKNKILEYSKYLSDKVQ (0)

     xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  (2) YGQLPM

10781 LTDWFISNLSLDIYFAGMDAFGIGFSWTIGLLATHPHVQDNIRKELQKNDIENVESFNY 10957

10958 KMMDKLHYLNAVVNEVMRIHSLSSFTSFHRAMSDVKVGEYDIPKDTIIALNIYGAHYDPK 11137

11138 YWKNPNEFNPDRFINADGKFQSPKEGFMPFGVGRRTCVAERYTVKESHFIFTGRICQDFK 11317

11318 ITWSSKNKTDKVEEDLENTFFRHTAPFYVEFHPIK*

 

>v31.010994     2 aa diffs to v31.022901

MLTDLFESSVFYLC

LGIVAYIIYKWHFPKKHNLPPGPKAWPIIGSIPDPYSSPLSYKRLIEDWKEKYGEIILF

KRMGMNFVYVSDYKLGYQAFVKDYSNELAGR

(missing seq in a gap)                                     

LTDWFISNLSLDIYFAGMDAFGIGFSWTIGLLATHPQVQDNIRKELQKNDIENVESFN

YKMMDKLHYLNAVVNEVMRIHSLSSFTSFHRAMSDVKVGEYDIPKDTIIALNIYGAHYDP

KYWKNPNEFNPDRFINADGKFQSPKEGFMPFGVGRRTCVAERYTVKESHFIFTGRICQDF

KITWSSKNKTDKVEEDLENTFFRHTAPFYVEFHSIK*

 

CYP3 clan (15 sequences, some are partials or duplicates, 6 complete sequences)

 

Planarian clade D

 

>v31.004132b  46 heme 

29599 FDSENKQNRSQFSFLPFGIGPRICIGQRFALLEAKFALVHLLKNYTILPCEKTP 29438

 

>v31.004132a 45 heme

43% to v31.000200 a

20881 MFGDCLFSNCSLCLILTLLFIYVLYR (2)

20739 LTTFSYNLDQKIGHEGPKPLPLIGNILDFRRHGFHVCDVRNQKRFGHLYP (2) 20590

      IYIGQIGMIMVSDEVLLRQILVKEARSFIDRP (0)

19137 KFLLSSSVMKRNLIQLKGKDWDRVRTLVTPTFTSGKLKK (0) 19021

      MSPMIVASIQDMMKKFDDVKDSNINMKN (2)

18033 IFGSLTMDVIAKTMFGIDIDKSSNSSHMFIEHGKKLFNISFSSMIFMLS (1) 17887

14360 TLFPPSMGLIEYLFPRGITSPDSMTFFEKSMAVTINSRRHSNE (0)

      EFHDFLALVLKSSTETGNLQHSGESTSETISYKKLTEEEILAQSIVFLLAGYETTASALN 14181

14180 FCCYYLAKNPDTQEELYNELVENLPDN (0) 14100

12595 KEITFDVLSKLNLLEQSLSETLRICPPISRFARECTKDTILTSSTGQVIQAKKGMKFVIPV 12413

12412 YAIQHDERLWPDPEKFDIHR (2) 12353

11351 FDSENKQNRSQFSFLPFGIGPRICIGQRFALLEAKFALVHLLKNYTILPCENTP (0)

      KKLTFGPSGLTVVKEPVILRIEKR*

 

>v31.003455 b  heme CYP3 like    runs off the end

45% to v31.000200   c

95% to ec1.13784.001 EST (DN302691)

96% to v31.004132

      MFGDCLFSNCSLCLILTLLFIYVLYR (2)

21890 LTTFSYNLDQKIGHEGPKPLPLIGNILDFRRHGFHVCDVRNQKRFGHLYP (2) 22039

      IYIGQIGMIMVSDEVLLRQILVKEARSFIDRP (0)

28071 LKFLLSSSVMKRNLLNKGKDWDRVRTLVTPTFTSGKLKK (0) 28190

      MSPMIVASIQDMMKKFDDVKDSNINMKN (2)

31393 IFGSLTMDVIAKTMFGIDIDKSSNSSHMFIEHGKKLFNISFSSMIFMLS (1) 31539

      ALFPPSMGLIEYLFPRGITSPDSMTFFEKSMAVTINSRRHSNE (0)

32986 EFHDFLDLVLKSSTETGNLQHSGENTNEAISYKSLTEEEILAQSIVFLLAGYETTASALT 33165

33166 FCCYYLAKNPDTQEELYNELVENLPDN (0) 33245

34803 KEITFDVLSKLNLLEQCLNEILRICPPISRFARECTKDTILTSSTGQVIQAKIGMKFVIPV 34985

34986 YAIQHDERLWPDPEKFDIHR (2) 35045

 

>v31.003455 a  heme CYP3 like

note: the exon beginning with KEIT is exactly the same as exon 9 of

v31.003455 b.  This is probably the correct end of that gene

out of sequence

707 KEITFDVLSKLNLLEQCLNEILRICPPISRFARECTKDTILTSSTGQVIQAKIGMKFVIP 886

887 VYAIQHDERLWPDPEKFDIHR 949

8045 FDSENKQNRSQFSFLPFGIGPRICIGQRFALLEAKFALVHLLKNYTILPCEKTP 8206

8253 EKLTFGTSGLTVVKEPVILRIEKR* 8327

 

>v31.016958 1 aa diff to v31.003455 a (duplicate contig)

8324 MFGDCLFSNCSLCLILTLLFIYVLYR (2)

8466 LITFSYNLDQKIGHEGPKPLPLIGNILDFRRHGFHVCDVRNQKRFGHLYP 8615

 

>v31.000200 a CYP3 like

MIEFISDYIYFVYFIIIIILFYK (2)

59731 LGTIKYNLHLKLGHRGPQPWPFIGNFPQILRYGFYEIDIINRQKYGNIYP (2)

FYFGNIATSVIFDEELLKLILIKEAKCFTDRL (0)

DIGFNGKISKNMAIVLKGKNWESVRNLISPTFTSGKLRK (0)

MSPLIIDCISLLEERINSHPLEEPVDVK (1)

NFGAFSIDVIARTMFGIRIDSQTNQNNPFIVHAKNLFKFSIRTPMLLLF (1)

FFPDTMRCIRKFIPNYEIMGRETLEYFVKQLDMAIDERQNSLE (0)

EYNDFLTEIMKVKQEQTDKSYGNSICLTRAEIIAQLLMFL

MAGYETTASSLSYVAYFLAM NTEAQQEIYEEILKADYN (0) 100624

103629 NEITYESITNLPLLDAAINESLRLCPTVPRIDRVCTKDTVLLTSKNQKIYAKEGDVFKIPVYA

IHMDPNIWPNPHAFDIHR (2) 103871

103924 (?) FDNENKKNRSYLSFMGFGMGPRNCVGMRFALYEMKMALVSILKRWELRSCIQTP (0) 104085

104135 QTIKIKKSSLSGLEKPILLKIHERK* 104212

 

>v31.000200 b CYP3 62% to v31.000200   a

       MIEYFYDFKLLLCFVVIFLIYK (2)

126060 LGTLKQNLHKKLGHQGPPPLPFFGNFFDIIRHGLYELDIINRDKYGKIYP (2) 126209

126261 FYFGGIPISIVSDEELLKQILIKDAKCFTDRL (0) 126356

128465 DIGSNGKIGQKMLIILKGKSWERVRNLVSPTFTSGKLRK (0) 128581

129215 MSPLMKDCIDLLDKRLAGIPENDPIDVKD (2) 129298

130409 VFGALTMDMIARTMFGMHIDSQTNPNNPFIINAKKLFKFSLFNPMLILI (1) 130543

130593 LFFPEITPIFRKFFPNWEFLGRQTMNFFVSNLNKAMDERKKSPE (0) 130724

135892 QYNDFLTEMMKAHKEVNDSDPEQLEILDSRSLTNSEIIAQSFLFLLAGYETTATSLTYVA 136071

136072 YFLAKNPEAQQEVYREIIKVFGTD (0) 136143

139694 EISYEKISNLPFLEAAINESLRICPPASRVDRVCTEDVVLVTSDNKQINAKKGDVFTVPI 139873

139874 YAIHMDPEIWPEPYKFDLSR 139933

140600 FDNESKKTRSIYSFMPFGMGPRNCVGMRFAVLEMKMALVNLLKQFELVPCAETL (0)

       DVIKFKKTAFGGPVKPIILKIVKRS*
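
Many exon lines above carry a start coordinate, the translated peptide, an optional intron phase in parentheses, and an end coordinate. The following is a minimal sketch of a consistency check on such lines. It assumes the coordinates are inclusive nucleotide positions, that a stop (*) counts as one codon, and that a descending pair means the reverse strand. The function name check_exon_line is hypothetical and not part of the original annotation. Lines where a codon is split across an intron, or where a frameshift is marked, may legitimately fail the check.

```python
import re

# start coordinate, peptide (with optional phase marker), end coordinate
EXON_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s*$")

def check_exon_line(line):
    """Return (span_nt, expected_nt, ok), or None if the line lacks
    either the start or the end coordinate."""
    m = EXON_LINE.match(line)
    if m is None:
        return None
    start, body, end = int(m.group(1)), m.group(2), int(m.group(3))
    # Drop phase markers such as "(2)", then keep residues and the stop.
    body = re.sub(r"\(\d\)", "", body)
    residues = re.sub(r"[^A-Z*]", "", body)
    span = abs(end - start) + 1      # inclusive, either strand
    expected = 3 * len(residues)     # three nucleotides per codon
    return span, expected, span == expected

# Example lines copied from the records in this section.
forward = "139874 YAIHMDPEIWPEPYKFDLSR 139933"
reverse = "19789 YAVHSDPSIWPDPEKFDYHR (2) 19730"
for exon in (forward, reverse):
    print(exon.split()[0], check_exon_line(exon))
```

A mismatch does not prove an error in the gene model; it only flags a line worth rechecking against the contig.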

 

>v31.000200 c CYP3 like frameshift at & in exon 9

63% to v31.000200   b

       MMEFFLNYFLDIKPLLFVLGLYLFYK (2)

160265 YGQTKQNLHIKVGHKGPPPWPYIGNFIDIFRFGVHEMDIMNHKKYGNIYP (2) 160414

       FYLGNIRSCIVSDEKILKQILIKEAKYFTDRL (0)

163489 DIGFNGKSTEKMLTVLKGKDWERVRNLVSPTFTSGKLKK (0) 163605

165274 MSPLIADGVNLLEKRIIEYPTDTPIDFKE (2) 165360

172501 FFGSLTMDVIARTMFGMHIDTQTNPKNMFMIQAKKLFNVSAINPLILIF (1) 172647

       LFFPELAGLVKKIIPKWEFFGRSTFQFFNNNLYKAIEEREKLKDDGR (0)

172882 EYNDFLALMLKAHKEVDESDPNQLHNMDSKSLTNQEIMAQSLIFLLAGFETTATTLTFL 173058

173059 SYFLAKNPNYQREVYDEIVIGVD (0) 173115

181103 DEITYETLIKMRLLDAAISETLRICPPASRIDRACTQDVVLTASDGSKITAEQG & 181264

181265 DVFTVPIYSIHMDSSIWPNPNKFDIHR (2) 181346

201967 FDDENKKTRGAYSFMPFGMGPRNCVGMRFALLEVKWAIVTILQKFEIVPCPETL 202128

       DELKFKKSGLTAPIKPILLKVVKRSQN*

 

>v31.001060 a CYP3 like 

62% to v31.000200   c

      MIGLLLSYIFDIKFLLAFVVVYLIYK (2)

33972 FGKQNQNLHVKLGHNGPPPLPFLGNLLDVQKMGFHTFDKMCRQKYGKLYP (2) 33823

33776 MYFGGSPMTVVGDEELLKQIMIKEFKCFTDRM (0) 33681

30268 ELAFRSKRAHKMLTVLKGKDWERVRNLITPTFTSGKLRK (0) 30152

29939 MSPLISDGIELLKKRISNNPPDTPIDIKE (2) 29853

27056 FFGALTMDVIARTMFGMQIDSQTNPNNQFVINAKRLFTFSLFNSFVIIF (1) 26910

26858 LMFPKLGILFRKILPKWEFFGKESFQFFIDNMQMAIDERAKSTD (0) 26727

21847 KYNDFLALMKAHKKETENDSESSKLMDSKTLSNDEIMAQSLVFLLAGYETTATTLAFVSY 21668

21667 FLAKNPDAQQDLYNEIMECLGSE (0) 21599

19966 REIANDTLMTMPILDAIVCESLRICPPAVRVDRVCTQDVTLETSDKTKVCFKKGDIIAIPI 19790

19789 YAVHSDPSIWPDPEKFDYHR (2) 19730

17523 FDPDRKKTNGQYSFMPFGMGPRNCVGMRFALLEVKWAIVSILQNFEIVPCQETL (0) 17362

      EKLTFKISGLLSPISPILLKIVKRGQE*

 

>v31.001060 b CYP3 like 

96% to v31.001060 a

      MIGLLLSYIFDIKFLLAFVVVYLIYK (2)

74587 FGKQNQNLHVKLGHNGPPPLPFLGNLLDVQKMGFHTFDKMCRQKYGKLYP (2) 74438

74391 MYFGGSPMTVVGDEELLKQIMIKEFKCFTDRM (0) 74296

69116 ELAFRSKRDHKMLTDLNGKDWERVRNLITPTFTSGKLRK (0) 69000

68787 MSPLISDGIELLKKRISNNPPNTPIDIKE (2) 68704

61335 FFGALTMDVIARTMFGLQIDSQTNPKSQFVINAKKLFTFSFFNSFVIIF (1) 61189

61142 VLFPKLGILFRKILPKWEFFGKESFQFFIDNMQMAIDERAKSTD (0) 61011

55078 KYNDFLALMMKAHKKETENDSESSKLMDSKTLSNDEIMAQSLVFLLAGYETTATTLAFVS 54899

54898 YFLAKNPDAQQDLYNEIMECLGSE (0) 54827

53204 REIPNDTLMTMPILDAIVCESLRICPPAVRVDRVCTQDVTLETSDKTKVCFKKGDIIAIPI 53022

53021 YAVHSDPSIWPDPEKFDYHR (2) 52962

38784 FDPDHKKTNGQYSFMPFGMGPRNCVGMRFALLEVKWAIVSILQNFEIVPCQETL (0) 38626

      EKLTFKISVLLGPISPILLKIVKRGQE*

 

Planarian clade E

 

Two genes plus one pseudogene fragment plus alleles

 

>v31.019801

36% to v31.000200   b

v31.029872 identical to v31.019801

v31.010235 a  DN314674 identical to v31.019801

ec1.03053.004 100% to v31.010235 a, 100% to v31.019801

2723  MLIFVSLLLTFLSLFYFYRRRMFASFAKYNIPGPPPNFLFGNLIEYRNNTFYELLIKWQK 2902

2903  EFGKVFVYFEGPTPNVVIGDPDLLQEVMVKQFSNFHGRKLFPMQKNPDEDEK 3058

4403  VSMFLARGKRWKRLRTTINPAFSDAKMRRMFPMVDESIASFVNKLNERGDKKFDCINIQEY 4585

4586  LQRLTMDIICKCAFGVDTDCQTNVNHPFIVSLKKFLKEM 4702

6292  NLSTLKAAVVLTLAEIKPVLVYLLEAFNQRPGQRVELMKIRSTLVQVIEARRKCSEKRFDLLQ 6480

      TDEEITEQTSSNQKKTSKPLTTEEVIAQANLFLFAGYETSSTALS 7520

7521  FILHSLSVYPDVQDKLFQEISRLHE (0) 7595

9947  KIEDKSIGSRELYDALGRLPYLNALIYETLRFYPFASTVIHRMCLKSEGTHLSNGLYIPQGTFV 10138

10139 IPNVFAIHYDKDIWTSVDPH (0) 10198

12108 TFSPERFLQSNGTELEFIGPPNPFAWLPFGAGPRNCIGQRFAMMVIRATLCQFVYRFQ 12281

      VSSTEKTENPLKLKVGATISPINGVNIVLKKRLSY

      YSFVIITTLKLNISRWMYPIQHTPLKI*

 

>v31.032551 N-term CYP5 like

31% to v31.000200 b, 2 aa diffs to ec1.03053.004, 2 aa diffs to v31.019801

possible allele

2590 MLIFVSLLLTFLSLFYFYRRRRFASFAKYNIPGPPPNFLFGNLIEYRNNTFYELLIKWQKEFGKV 2784

2785 FVYFEGPTPNVVIGDPDLLQEVMVKQFSNFHGRKLFPMQKNPDEDEK (0) 2925

4134 VSMFLARGKRWKRLRTTINPAFSDAKMRRMFPMVDESIASFVNKLNERGDKKFDCINIQE 4313

4314 YLQRLTMDIICKCAFGVDTDCQTNINHPFIVSLKKFLKEM 4433

     NLSTLKAAVVLTLAEIKPVLVYLLEAFNQRPGQRVELMKIRST 5962

5963 LVQVIEARRKCSEKRFDLLQ 6022

 

>v31.010235 b CYP3 like   61% to v31.019801

v31.022052  40% to CYP5 heme, 36% to v31.001060 identical to v31.010235 b

v31.013908 use to complete gap in v31.010235 b

24484 MFIFISLVVFLFTLFYVYRRRKISLFSQYNIRGPPPNFIFGNLIEYRSTVFYELFAKWERKYGKV 24678

24679 FVYFEGPTPNVVIGDPDLLQDVLVKQFSNFHGRKLFPFQKNPDEDEK (0) 24819

25019 ASMFFARGKRWKRLRSTINPAFSDSKMRRMLPLIEQSINVLMTKLHEESQKHIESIEIMDVMQRL

25214 TMDIICKCVFGIETNCQLNVNDPFIVSLKAFLGGT (0) 25318

28640 NFSSLKMALLLTFPEIKSIFIYMMERFNSGISQKINFLKIRSTLLQIINARKKSSEKRFDLLQV (1) 28828

31440 NEKEIKTNIPMPKMNSKQLTTEEVIGTATLFVFAGYDTSSMAMSYILHSLAVNYDAQNKLTSEIDEVFK (0) 31646

      ELEEKNLNSWEFYDSIVKLPYLNAVIYETLRFYPFVSTIMNRMCLKSEGTHLSNGQYIPKD

      TNIIVNVHSVHFDKDLWGPHDPH (0)

34909 IFYPERFLQINGDEYEFVAPTNPLIWLPFGGGPRNCIGKRLALMI 35043

      IKLTLSHLIHQFRITPSQNTENPLQLKLGTTICPANGVNLGIRERLVFVFTISS*

 

>v31.010235 c pseudogene fragment

19617 FSDAKMRRMFPMVDESIASFVNKLNERGDKKFDCINIQEYLQRLTMDIICKCAFGVDTDC 19438

19437 QTNVNHPFIVSLKKFLKEM 19381

 

>v31.019771 a N-term CYP3 like    

2 aa diffs to v31.010235 b possible allele

6603 MFIFISLVVFLFTLFYVYRRRKISLFSQYNIRGPPPNCIFGNLIEYRSTVFYELFAKWERKYGKV 6409

6408 FVYFEGPTPNVVIGDPDLLQDVLVKQFSNFHGRKLFPFQKNPDEDEK 6268

6040 ASMFFARGKRWKRLRSTINPAFSDSKMRRMLPLIEQSINVLMTKLHEESQKHIESIEIMDV 5858

5857 MQRLTMDIICKCVFGIETNCQLNVNDPFIVSLKKFLGGT (0) 5741

2410 NFSSLKMALLLTFPEIKSIFIYMMERFNSGISQKINFLKIRSTLLQIINARKKSSEKRFDLLQV (1) 2219
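
Notes such as "2 aa diffs ... possible allele" rest on a position-by-position comparison of aligned fragments. The following is a minimal sketch of that count, assuming the two fragments are already aligned and of equal length (no gaps). The function name aa_diffs is hypothetical. The example strings are the first exons of v31.010235 b and v31.019771 a as listed in this section.

```python
def aa_diffs(seq_a, seq_b):
    """List (position, residue_a, residue_b) for each mismatch between
    two equal-length, pre-aligned peptide fragments (1-based positions)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("fragments must be aligned to equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]

exon1_010235b = ("MFIFISLVVFLFTLFYVYRRRKISLFSQYNIRGPPPNFIFGNLIEYRSTV"
                 "FYELFAKWERKYGKV")
exon1_019771a = ("MFIFISLVVFLFTLFYVYRRRKISLFSQYNIRGPPPNCIFGNLIEYRSTV"
                 "FYELFAKWERKYGKV")

for pos, a, b in aa_diffs(exon1_010235b, exon1_019771a):
    print(f"position {pos}: {a} -> {b}")
```

Applying the same comparison exon by exon and summing the mismatches gives the per-gene difference count; fragments with insertions or deletions would need a real alignment first.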

 

>v31.019771 b N-term CYP3 like pseudogene fragment

5 aa diffs to v31.019801

5 aa diffs to v31.010235 c, probable allele of the pseudogene

12090 FSDAKMCRMFSMVDESIASFVNKLNERGDKKFDCINIQEYLERL

12222 TMDIIC*CAFGVDTNCQTNVNHPFIVSLKKFLKEM (0) 12326

 

>Dugesia ryukyuensis (planarian)

BW640569.1 EST 62% to v31.019801

FIGPPHPMAWIPFGAGPRNCIGQRLAMMIIKLTLGRFINHFELVCSNKTENPLKLKIGGT

IAPVNGVHLLVKERKL*

 

Planarian clade F

 

CYP20 clan (1 sequence)

 

>CYP20    Schmidtea mediterranea (freshwater planarian)

         NZ_AAWT01057969

         34% to Paracentrotus lividus

         33% to Strongylocentrotus purpuratus

         32% to Saccoglossus kowalevskii

         25% to human CYP20

         from SmedGD database

         Missing first exon

23008(2)aga KPKQIVIIPGLQKSDPK(2)

22918FGNLLEIKNSGSLHQFLSDLHLKFGAVATFYWNDQKIVSVCSNDLFKSVQDLFDRPSLI

FGNFHEFIGENCLQYANGNEGQNRRKQYDKAFQHDMIIR(2)22625

18726 YISTIENVGKSNLEEWHGSKVTVDSIEHWAYIMAVKSILKACIIPEEQSPEYCDTIITLYKS

VFIGLEKKFVECKVLSDEEEIQFQKDKNQFYSIIREIINKRLDFLENSDLENEKLLLIDY

IILYHRELNRNLIENEVLIYLIGGFHTT(1) 18274

13764(1)GNALSWCIYYLAKHPKCQEKLFTEISKRQIQLQNTSECYKLILD

LKYLKACIEETLRLSQLAPYAARVSDEDRYLSNYEVPANTPIILALGVSLKDEKI

FNNPDQFNPERFVDSNFPSFAFVPFGFAGKRKCPGER(2)13357

13304 FSYLELSIWLIQLVGKYRFTLTNPHQTVNKVYGLVTRMSENVSVQFGERSDVL*13143

 

Planarian clade G

 

>v31.00742 similar to CYP2A7, CYP2Y in the CYP2 clan

36893 MIFFTCLFCLVCVFILTKFRRASNLPPGPIGLPFFGYLFFI (1) 36771

35352 ERPSFKSLHRLGKIYGKIFSLTICGRTVIVINELSLIKHILVNRKEFSGRTTLF (0) 35191

      IGENIRENIFSHNLWTNSHRYKRTFTYKTYFGKQKRIFRQNDLICERTFSQKSC (2)

      IIDEECSLLINFIKTITDEPCNSR (2)

34674 LFFNKCVSNIIWKIVFGTRFDYSSTNAILNVENVAINNTENDIITFRQLMPRTWL (0)

      (unidentified seq)

31030 DKETTRLVFELFAAGTDTIILTLQWGMLLLCIHPKEQEKMRKEIEEFIITPVGEDIFISW 30851

30850 NDRQKLRYCQQ (0) 30818

26720 VIAEIHRYASVTPLALPHRALTSSKVEGFDIPYNSIVLIDIYSVHYDQNIWIEPQKFNPE 26541

26540 RFNGRSPTEKLIPFSL (1) 26493

26451 GTRTCPGSSLAQTEIFTILTNIVINYKILPVKKNLSLDDL 26332

      GDGTSGLTRGPSYHELIFKKLNDKNSK* 26248

 

Planarian clade H

 

>v31.000354

31% to v31.000423

MFIQLFGAVLCFIIFYWYLIKPRNLPPGPS (1)

DSKDRSKIFDSLKKYGNIVSIRVGFSYMIILYDLDTIKEAFTVKGDQFSGRGLKLLQKEITHSK (1)

GIISAEGKYQRDNWRFILRSLRNMGFGKMAFEEPITES (0)

FRKMVKAIDDFSAENPDSSMDFRQCLTEAVADNISQIILGVKFDKKQLLFIIERMEKIQTVPP

IFLMVNSFPILNIFL (2)

FFKYFKNVVGIEQFKNIRVPIQDMIKKFITNHKETFDSNNIRDFIDCYINEQPEHQSTEEESYWWD (1)

DEQIKHACTDLIAAGIGTTESTISWVLFSLANDQDWQKKVANEIDSVIGNERFPSMKDLAN

LPICEATILETLR (2)

FSSLAATGIPHTATEDSNLHEYFIPKDAIVLSYLMEVARNKDVWPDPDNFNPAANFLNSA

QNQIINRDKIIPFGVG (2)

RRSCIGESLARMEITLFFIALFQKYTVSLDPKFGMAKVAPLIFRILNHRVLKFQSRI*