9 Monosiga brevicollis P450s

 

D. Nelson Feb. 22, 2008

 

>fgenesh2_pg.scaffold_6000057|Monbr1 missing N-term, C-term poor match, CDS in 11 exons

MGKECNVPGSTGVPLLSDQTLAFWREPLEFCRKRTEKHGSVFQTRLLNHGTIVVCDYDTAREVLALPEEAASASEAYDDL

FHDVFGEESILLTTHNREHIRLKSCLLTWLAPSNIEALDPLLTNVAART (0)

KASSDEPLDLYQNLKVVCLHATLMIMLGLE (0)

QSVIYEQHLERLLRDHW (2)

HGITGLLSTFNLGLGSKSTVATAQEARDILIRLIQ (0)

DQLVRVREEPDNAALTFLRQFDK (0)

ALDEYDNAFKADQLLLLVSALVPKALAASLTLA (0)

MAAGTQHACVAANGELDPRRWHQVLRETQRCYPSMFAVRRHVKK (0)

NEGVDVGRFHIPLGYHIMVIIPLANADPAMYEDPQ (0)

HRLIFHDWAQRGTRAPQSFNPDRWAENVPRPLTFGHGTH (1)

ACPGRTLTDALLLKLCAQLHHSFEVSQLALEY (0)

LELETPLRHASLELKWWPVPRPATPLLGRLIARQERDPSSTPAT

QARTDKGRLPTVGTTSGPGSRAASQAASPRLPRRSQHVTEI*

 

estExt_fgenesh1_pg.C_60048 [Monbr1:36486] alternative model
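
The exon-by-exon layout used in these entries can be collapsed into a single protein string for BLAST or alignment. The sketch below assumes that a trailing "(0)", "(1)" or "(2)" marks an exon boundary (most likely the intron phase); that reading is an assumption, and the names PHASE_RE and join_segments are hypothetical helpers, not part of any existing tool. It does not handle lines that carry genomic coordinates.

```python
import re

# Trailing "(0)", "(1)" or "(2)" marker, assumed here to be an intron phase.
PHASE_RE = re.compile(r"\s*\(([012])\)\s*$")

def join_segments(lines):
    """Return (protein, phases) from phase-annotated sequence lines."""
    parts, phases = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip the blank spacer lines between segments
        m = PHASE_RE.search(line)
        if m:
            phases.append(int(m.group(1)))
            line = line[:m.start()]
        parts.append(line.replace(" ", ""))
    # Drop the terminal "*" stop marker if present.
    return "".join(parts).rstrip("*"), phases

# Two segments copied from the first entry above.
example = [
    "KASSDEPLDLYQNLKVVCLHATLMIMLGLE (0)",
    "",
    "QSVIYEQHLERLLRDHW (2)",
]
protein, phases = join_segments(example)
print(protein)
print(phases)
```

Segments without a marker are joined as-is, so lines that were simply wrapped for width are concatenated as well.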

 

>fgenesh1_pg.scaffold_21000118|Monbr1 no introns, may be too short at N-term

MWLFITVVLAFVGLAVWLRERARMRFPGGSRDILLHMTGWLQGLRTEVIQGDVARAHHVLVSKGSDKGRVIEENMMLPVW

TDGCSIESCNGRVWRTVRHSYDHFVRQLPPIEDLETYCLQHTPSITADAGVLDSPGIARTVLAAFGQHLFADACTPQHLD

VIEACSWDIRAHVAMRTNGHDDLKRACNRTIINLLEQSAFESPPDLPGWQHPLAFSAVMQPFFISPSVNFVDIVIALDNL

CDLRAKSVHWRKQGYGRAKMDIVAHLVESIRIRHPFPMLERLHGDTQYFIALDSFFAPGHPQAAFNPTQWHAEGMYQDPL

HSLIFGAGPRKCPGRTIAWASLPALLYRWLELEAAGHLTWRPADRHRFSGRDLDNAPTPMAESLHAAGRIVSCLANLAFE

TRRPEYTDAPPATPLATLRSRRPNE*

 

 

>estExt_fgenesh1_pg.C_170105|Monbr1

top part is not P450; it has a WD domain, probably a G protein beta homolog (deleted)

CDS in 20 exons for whole seq

last 8 exons are P450 seq. 36% to CYP4T9 Xenopus.  35% to zebrafish CYP4T

34% to CYP4A11 hum CYP4 clan, 34% to 4V5 fugu, 36% to 4T danio CYP4 clan

MLEYATTAIWTALAIILVRWIVKTVISIRAIEKLPGPKFKFPLG (2)

TLYAHPTPVEHVKLIKRLSAENPTSGFRFWLGPVQAVC (0)

VLNHPQAVRAILEDEPPKAPMLYRYIQPWLGEHSLLTLEGERWKNMRRLLTPAFHLHILHHYAP (0)

VIVDASSIIIEKFLNHAKTKPEEDYDVFSDYALLTLDVICRAAFSHEGDPQRNPEDKYAVSIGQ (0)

IAESIIARAINPKMMFDGVYYRSKEGKEFQELLDYVHDHADN (0)

LIDKRAKEIEGLTLDDIGRTRPGGRTLLDFLDILLTTQDENGQRLSKEAIRHQCDTFLF

AGHDTTSSCLSWLSYLLSVNPEAQEKCRKEIFDAFGDEAPTYEDVQKKI

PYLTCCIKEALRMYPPIPGVARKLTKPVNVGSTVLQPGTT (1)

AAVGILALHYNPTLWEEPTKFKPERFETGVKHDSYSFLPFSIGRRNCIGQ

NLALNEIRLAMCQILRKVVILPSAEKDYEPQPMSQIVLRSENGVRMRFK (0)

AYEE*

 

>estExt_fgenesh2_pg.C_280122|Monbr1 27% to CYP4F22 human CYP4 clan

second part from 275-500 is 36% to Helicosporidium sp. CX129156 CX128716.1 CYP711 like fragment

44% to estExt_fgenesh1_pg.C_280109|Monbr1

MLEYILYAVGGFVALCIGLVMYSLVPNPMGYFRMRRVFNTK (2)

LQEYLKVPLWRPMDGSFHEMQT (0)

DLIGFFKRQLDYGNL (2)

FGVIPLWYVFDPNLILTKPEDMKQVLFGDDLQYYRDNTCFYVMHMVLGK (0)

GIINVGGMEWRNQHRILYKAFAPDNLMYFRPAFAARARKMVDTFKAHAQSGEPIDLLKTMNEITLGVMIDTAFGNTLS (2)

HEEQSEMRHHLMYVIKQTTNFVHQVPLLRYIFADHTQLKRRLGEMHSLVETSLIRRREGKSFGEVCEVKR (2)

HMIDLIIEANHSESEDGYRMSDEVMRDNMISLMAAGTETTATAMTWTLYFLDKYPE (0)

VYRKVREENMNIDLEHLAEPGDLTKIVPYLTQVIQESMRMCSPLGN

IPGRRPFKDMQVGDLVVPAHVPMLTFAHQIHHNPQIWDAPE (0)

EFRPERFAKDGEASRDRLRFQPFGTGRRYCLGKYMAMAEMQ (0)

VVLSHMVRDLRFEYAGTAEGITPAFRPPTIQPRDGMPMHIKLA*

 

>estExt_fgenesh1_pg.C_280109|Monbr1 Seq revised, still missing N-term

30% to CYP208A1 Streptomyces globisporus  bacterial like seq, CDS in 7 exons

44% to estExt_fgenesh2_pg.C_280122|Monbr1

 

MWMAVALVVVAGVVLVPLLLLYPFLPDLRQWHRVRQVYNAR (2)

       LRTYLKVQPWSPLAGSFTALMQ (0)

       RYHGTMQHLMDLNKG (2)

494705 FGAFVLWFAYEPNVVLTRPEDIKQLLTDNDLNYTRDNSSFALFNRFIGQ (0)

       SIINANGEEWRRQHRILYKAFSPDKLVGFRSTFANRGERLAHSLLELSQA (1)

       EGSVKLGHWLGKMTLSVIIETAFGNTLR (2)

       PDEQDLMAQEFIYMTNEFTNFAHQ (0)

       IPVLRHVLTDTQRLETGFERLYGLVDQAVARRRSGEQDDGQIKLIDLI

       LEANGEEDDRSRLDDAAMRDNLLLLLAAGTETTATTLGWLLYELAVNPK (0) 493503

493381 ELAKLRAENAALDLEALEQPGDLSKLVPQLTNAIHEALRLHEPLGGFPSRRPLHTTQ (0) 493211  (GC bound?)

493013 IGDLVVSPGTPVLSMMSAVHRNPEYWAEPE (0) 492924

492833 VFRPARFAPGGEVEQNPFQYFPFGKGRRYCLGKYFAIAELQVVVSHLLRRLDMEYLGDRA

       TMRVVYKPPVLHASDDLPM 492597

       RFFARRTSRRGSKLVEAV*

 

See fgenesh2_pg.scaffold_28000122 [Monbr1:28763] for an alternative model
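
The numbers flanking some exons in the entry above look like scaffold coordinates on the reverse strand, since they decrease along the protein; this is an assumption. Under that assumption, a single exon bounded by phase 0 on both sides should span exactly three nucleotides per residue. The sketch below checks this for two exons copied from the entry; span_nt and is_consistent are hypothetical helper names.

```python
def span_nt(start, end):
    """Inclusive length in nucleotides between two coordinates (either strand)."""
    return abs(start - end) + 1

def is_consistent(peptide, start, end):
    """True if the span encodes exactly len(peptide) codons.

    Only meaningful for an exon with phase 0 at both ends and no
    intron inside the span.
    """
    return span_nt(start, end) == 3 * len(peptide)

# (peptide, start, end) copied from the estExt_fgenesh1_pg.C_280109 entry.
exons = [
    ("ELAKLRAENAALDLEALEQPGDLSKLVPQLTNAIHEALRLHEPLGGFPSRRPLHTTQ", 493381, 493211),
    ("IGDLVVSPGTPVLSMMSAVHRNPEYWAEPE", 493013, 492924),
]
for peptide, start, end in exons:
    print(len(peptide), span_nt(start, end), is_consistent(peptide, start, end))
```

Spans that cover several exons, such as 494705 to 493503, include intron sequence and will not pass this check.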

 

CYP51 clan

 

>CYP51A1 estExt_fgenesh1_pg.C_250046|Monbr1 51% to CYP51A1 danio

54% to DC515864 cDNA Library, Monosiga ovata

MDKLPAAVVPYAEAAQEALVSLHETLGRPSTTTYLATSAVALGIWKYIRGNYLRPAKAPPKVPSQVPWLGCIFAFGQSPI

EFMIDCYKKYGPVYSFVMFGTEVTYLLGSEASSRFWSTHNDVLNAEDLYANITVPVFGEGVAYAVEHKIFSEQKQMAKEG

LTIDRFKAYTSMIEKETNGFIERWGQTGTIDFFDNMARMIIYTATRCLHGNETREDFDEDVAKLYHALDGGFTPQAWFFP

PWLPLPSFRRRDRAHRELKERFYKIIDRRRQKAEEGTQTDLMHTFMTTPYKNVEDGRHLTTDEVSGMMIALLMAGQHTSS

TVSSWLTCFITTTPGLEEKLYQEQVELFKRRPGPLSYEHINEMPLLWACIRETLRLRPPIMSIMRRAREDYKVTVNGVEY

VIPKGSQVCVSPTVNGRLEDEWEDPNTFNPYRFLKEEDGKLVVTEGEQITKG GKFKWVPFGAGRHRCIGFGFAQVQIRCI

MSTILRKYKLEMVSGKLPPINYTTMIHTPTEPIVRYTRR*

 

See fgenesh1_pg.scaffold_25000047 [Monbr1:10963] variant model

See fgenesh2_pg.scaffold_25000059 [Monbr1:28307] long model

same as estExt_fgenesh2_pg.C_250059 [Monbr1:33827]

 

Plant-like

 

>CYP704 fgenesh1_pg.scaffold_8000052|Monbr1 27% to 4F2 hum, 34% to 704B1 Arab.,

37% to 94D2, 44% to 704B2 delete Cyan 42% to CYP704F1 Physcomitrella

37% to CYP704E1 Physcomitrella

32% to CYP745 seq

MLIPLLLIAAIAGLLHIWQKRLESPNAHMAAGCVPLLGHSLLVQKHLSKILEWFWANSKAANFKTWQLK

IIGQAPYVCVLDPVVVKHVLQDNFDNYIK (0)

GRLFRDRFTELLGRGIFNADGPEWSYQRKTAAHLFKRRELSGFMTE (2)

VFSDHGRLVCQKLDEASRTGTVVDLQ (0)

ELFYRYTLESIGKIAFGVNLGCFENDRVEFAVNFDTAQRIIMERVLDPAW (2)

EIRRWFNFIHPDEIELRRCVKKLDGIAH (0)

GIIQDRRKIGDLSDREDLLSRFMAVKDEQGKPLDDERLRDVVMSFVIAGRDTTANCLSWVFYELHQHPEVFAK (2)

LKKEVDTVLDGAEPTHDLVHS (1)

GMPYLHAVVKETLRLHPSVPK (0)

DGKVAVKDDVLPDGTVIKAGTIVIYLPWVMGRMES (2)

LWEDATRFNPERWLNQTTEPSHFQYTAFNAGPRLCLGMHMAYIEAKLLVAMLVQRFDFEVK

PNQEFTYTVTLTMPLKNGLLVTPTKRA*

 

See estExt_fgenesh2_pg.C_80053 [Monbr1:32130] same as estExt_fgenesh1_pg.C_80051 [Monbr1:36798]

fgenesh2_pg.scaffold_8000054 [Monbr1:24820] different model

 

>CYP745 estExt_fgenesh2_pg.C_170049|Monbr1 28% to CYP4F8 human,

41% to 745A1 Volvox, 42% to Chlamydomonas CYP745A1,

36% to CYP5160B1 Ectocarpus

41% to CT887000 Phaeodactylum tricornutum EST

47% to CYP745B1   e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus  (marine micro algae)

probable food for Monosiga brevicollis, 45% to O. tauri 745B1

41% to DC507813 cDNA Library, Monosiga ovata

MAAAAANQAMDLAHQGLDWAFHRVVLQAATILPPWLLRHVPANWQALTPAKLAVVTPAAFIVARIVMHQ (0)

LHQLRIKFALRNVQRAPQWLPIVGHTWALLIGTPWDVFHSWFETTGADLLKANVMGENSLLVYKP

RHLRQIMNSKLHNYPKDVDFAFKTFMDILGSGLVSSNGALWKKQRTLLSHALRIDILEETM (0)

PVAKRAIDRLSEKLEAIRGTGEYIEIAEE (0)

FRVLTLQVIGELILSLSPEESSRVFPDLYLPIMEEANRRVWEPYRAYIPTP (1)

GWFHYNRTLHELNNYLCNLIRKRWADRQAAVAAGTNEDDKDILEVIMADIDPA

TWGEGTVLQLRDEIKTFIMAGHETSAAMMTWACYELHRHPEVREKFIQEAQ (2)

AVFGTGIAADAEGADKFTKTPLPANEQLKGLQYTMNVLK (0)

ETLRFYSLVPVVARVTVEDDVLDGHVVPAGTRILISLRSAHDNPETWKDPMTYRPERFDEPF

DLYAFMPFIQGPRNCLGQHLALLEARIVMALLMLRFKLTPRDESCGERHPSIVPVCPKNGMWVRVD*

 

See fgenesh1_pg.scaffold_17000049 [Monbr1:9776] different model

 

Included here for comparison are two ESTs from the same CYP745 gene

in the colonial choanoflagellate Proterospongia sp.

 

Proterospongia ESTs from TBestDB

http://amoebidia.bcm.umontreal.ca/pepdb/searches/welcome.php

 

found 2 ESTs in 1303 ESTs = CYP745

 

>PRL00000480 N-term of CYP745 like gene

47% to DC507813 Full length cDNA Library, Monosiga ovata Dec 18 2007

41% to CYP745   estExt_fgenesh2_pg.C_170049|Monbr1

45% to CYP745B1   e_gw1.08.00.85.1|Ostta4

53% to CYP745B1   e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus

yellow is probable untranslated region

PRVR LRGDSKQEKGRRTMSG

MAASLSGLSLQNAGGKAKEAWTGLVDTLRDGFTHRPVVTA

LKCAGALVAIKVAVDTTRYVAEQWSIGSALRSLPRATGSLPFLGHALRLNVESPWDVMET

WIKSFNYNVMALDFFGKTGVVISDLERVRRVFNSKQRNYDKDLELSYSSFLDLLGNGLVT

SGGALWYKQRTLLGHALRVEILEETAPVAKRAADRLCKRL

 

>PRL00000755 C-terminal of CYP745 like gene

55% to CYP745 of estExt_fgenesh2_pg.C_170049|Monbr1

47% to CYP745B1   e_gw1.08.00.85.1|Ostta4

46% to CYP745B1   e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus

PRVR vector seq (same as seq above)

TSAAMMTWLTYELTQNPDKREKFLKNASAVLGTGKGKGKTPAEQFDNFTLPDRKEI

NKLTYILNSLKETLRYYTLVPVVTREAVEEDDLCGVRVPAGCKVFIHIKAVHNNPEVWEK

PRTFMPERFEKEHDPCAFLPFIVGPRNCLGQHLALLEARIVMALMMLRFDFEPAQSNVGE

KHGRTVPVCPKHGMWLN

 

These are probably from the same gene. Missing 142 aa in middle region.

 

 

Newest M. brevicollis seq found may be in the CYP7 clan

 

>fgenesh1_pg.scaffold_11000095 [Monbr1:8471] MODEL IS FUSED WITH DOWNSTREAM GENE

like CYP39 in CYP7 clan. Note: this translation requires a GT-AT boundary at PPD/RWER

Expect = 8e-22, 21% identity

394781 MWILVALVVFCAVAAQLSVLLQQKSISPDIPMIGGALPWIG (2)

CGLRFIKNPRQFFEDLRVKHGDTFGIYMFGCRMLCLFDQKGVDQLYRMRDVDASFFEATKGLLSLKLPPE (0)

LLQDSSLKKFHQALKPKLMPHYIR (2)

YAHDVVTDHVARTVRAGEQTVSLFPYVKSLVHKI (1)

GLACWLHPAANSPERFKHIVQAYEQLDPEQ (0)

GFANGAEFLKTMLSGKRAERRAIDALQTAVAELCQSLGT (0)

NLVSMYEAQPEDAMPSQRHRAIATNLFHFMLASQANMYAGMAWTLIHLLT

MADQQHLRLVRAEVLHAQQAHGEDFLRTQASLDSLAFLDACIVETLRVVQQSITLRKVMRPCTLQMDSGS

AVLPPPWYLLTLLSVTNMDPATIDPAVAKSSANSPPD (2)

RWERVTLARPSPNPLTSTFGHGYHACPGRTFALNMSKIVLAQHILAFDL

VPQFERATVPVTSVGALARVEHDCPLRLIPQV* 392554

 

Fusion sequence at C-terminal has been deleted here.

 

See fgenesh2_pg.scaffold_11000089 [Monbr1:25689] alternative model