8 Monosiga ovata P450s

 

D. Nelson Feb. 20, 2008

 

Assembled from ESTs

 

>CYP51A1 DC515864 Full length cDNA Library, Monosiga ovata Dec 18 2007

DC505021 DC481719 DC499362 DC515864 DC455628 DC461341

DC454243 DC488261 DC503517 DC463918 DC518896 DC496720

DC481661 DC475708 DC470881 DC457948 DC508866 DC473892

DC461250 DC476196 DC482539 DC489615 DC514865 DC490097

DC454294 DC469866 DC502657 DC513272 DC505282 DC458836

DC477187 DC482903 DC502887 DC495059 (some errors) DC451008.1

EC165400 02-JUN-2006

cDNA Library, Monosiga ovata Dec 18 2007

54% to CYP51A1 Monosiga brevicollis

MOC-079L11 3 prime DC505020 = 5 prime

MQHIGSLTLADIGTRITEYVSTANRTHLLAGGVAALVTLNWIKK

TYFRSSKLPPHAGSNFPFFGSMVSFGQHPVKFLERCYKEAGPVFTFTMLGSEVTYLAGGE

VTDDFWSSKNDDLAAEDLYANLTVPVFGKGVAYDVPHPVFSEQKGITKKGLTQQRFAKYT

AIIEKETLAYIQRWGESGTCDLFKDLSELIIFTATHCLHGEELRSTFDESVAALYCDLDK

GFTTAAWFLPNWLPLPSFRVRDRAHRELIRRFTHAIRDRKAKG

DVPGHDDMLETFMTATYEKVNDGRAFTESE

TAGMLLALLLAGQHTSSTVSSWLGFYMAQNPALQQDLFEEQQRVMGSQSGPLSLEAINSMPHLWAA

IRETLRLRPPLLTLMRNCRRPMEVKVGEKTYVIPKGNQVCVSPALQGVLEDLWDEPEKFD

MNRFLKKDASGTEVVTDGTQVAKGGKLKWVPFGAGRHRCIGFDFAQVQIRAIWSVILRNY

EITMTDVPEIDFTTLLQLPVHTKVHYRRRPTAAKA*

 

>DC486766 Full length cDNA Library, Monosiga ovata Dec 18 2007

MOC-053F16 3 prime, DC486765 = 5 prime

like Helicosporidium sp. CX129156 CX128716.1 CYP711 like fragment

~50% to DC460778

MSLTGIAAPLLPWQRSLVLAIATPILSIAVLLVSILYPSWRSPLRRALPSP

PAPSLLLGHLPAMGKTGHSTLFAWAKKYGGAFFVRLGCWPAVIITDVELIKSICITQFKD

FHDRSQAFRPNPHVRHLLWAQGAYWKACRNAVSPAFSRSNIAGFGAQMNQSARGLVDRVG

ALAGTGQSVDIMRVVGAMTMEVIGGTALGVDLSASNREVSEKIIAAADLLFGSNVAGSTI

TSFIRFLVPALLPLWLRVPALTA

(34 aa gap)

GFLQLMLAAKDPESGLQLTDKEVIEQCRLFMLAGFETTANTLTMAIYLLARNPDAEARMV

AEIDELFTGTEVDYDSVQRFKYVDCVLQETLRMYPPGSNLLREATSDTVLGDLEVPKGTT

IIMPMYTVHRDPALFPEPESFRPDRFLEGSALAPTDKYANLPFGAGPHMCIGNRFALAEA

AVTLIHLYKNFTLRLAAPLPDPLPLRQSITLSPAVPIPVRFDRR*

 

>DC460778, DC458147, DC494197, DC496896, DC457970 cDNA Library, Monosiga ovata Dec 18 2007

MOC-015K20 3 prime DC460777 = 5 prime

~53% to DC486766

MAVFSKLGISLTSIPALPRPSWAALAAAAPVVLTASAVAYMVLP

GLLSPLRGQMPSPPTVSMLLGHLPAFTKRSHLVLLQWTKLLGKVFFVRLGAWPVVVITDI

ELIKAVNISQFKDFPDRAPCFVARQDIRHLLWARGAYWKACRNAISPAFSRNNLTGFAQQ

MNESAHALAARLGRAADAGEVINMMDVVGNMTLQVIAGTAFGVRLGPEAAGQTAAFVAAA

KELFGSSVQGANLHGILRFIFPFLADIYTYIPPT

(small gap at I-helix)

YETTANALTF

STYLLARNPAAEARLLAELADVGVPDYISYDDSQKYKYVECVVQEALRLYPAGNAIVRQA

VCSTTLGPYEIPKDTCIVTPLYTLHRDPDLFPMPEEFCPDRFLDGHPLAPSDKYAHLPFG

LGPHMCIGYRFALAEAVIALVRIYKDFTLRLDAAVPDPLPLQQGITLTADVPELFHVERRAP*

 

>DC501006 DC513559 DC511491 Monosiga ovata Dec 18 2007 MOC-073O13 5'

DC513560 = 3 prime

DC511492.1 DC513560.1 36% to estExt_fgenesh2_pg.C_280122 Monosiga ovata

~45% to DC499869

MDILLTIFLSILGMLLFGLLCILAFCLPKPYEYL

LSRKIINTKTGKPLSNNNIKLPFGDLLSCLKDVVGAERSRLKYIPENAIGWWALNNYDVQ

LLNADWLRELLTMPDVKFNRSLESFKVLHRLLGTSLLGNQGEAWARMHKILYKAFQPSAL

ASYVPFFISQTKEIANNIDASIATNTSYDMNLAFNDLTLAVMVDAGFGKAVSPADRKIIL

HAFRYLFMETQNPLHDIPILSKLPFPSNLECERQFRALHETADRI

(very small gap)

MHASGEYQEAHDGKYVVDMLLDAHEEEGKLSDIELRDNVVMLLVAGSETTGTTLTWVLHY

LTVYPDIKARVLAEIDTLDISWEHFNVTNFDTGMPYLTMVVSETLRDTPSIYGIPERIFT

EDGTVGDLRLPKGSRVGVSQYCMHHNQNYWSEPDKFDPERFAPEASAARHRFAYLPFGLG

RRQCMGKFFALNEIRVVIALLLKHYEFTYDASVGPVEVTWRPPTLMPKRGLPMFAKRR*

 

Note: a short EST matched this sequence in the N-terminal region.

It may be from a related P450 (from M. ovata TbestDB)

 

>MNL00001922_unclassified N-term Genbank EC163715.1

Query: 63  CLKDVVGAERSRLKYIPENAIGWWALNNYDVQLLNADWLRELLTMPDV 110

           C++DV G + SRLK   E+ + W  L  Y  +LLN DWLRELL M DV

Sbjct: 1   CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV 144

 

52% to DC501006; short sequence like the N-term region of DC501006 above (aa 63-110)

CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV
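
The 52% figure for this fragment can be checked directly from the alignment shown above. A minimal sketch in Python, assuming the value is identical residues divided by aligned columns; the function names are illustrative and not part of any BLAST tool.

```python
# Sketch: percent identity from two pre-aligned, equal-length protein strings.
# Assumption: identity = identical columns / total aligned columns.

QUERY = "CLKDVVGAERSRLKYIPENAIGWWALNNYDVQLLNADWLRELLTMPDV"  # DC501006 aa 63-110
SBJCT = "CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV"  # EC163715.1 fragment


def count_identities(query: str, subject: str) -> int:
    """Count columns where both sequences carry the same residue (gaps never match)."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


def percent_identity(query: str, subject: str) -> float:
    """Identities as a percentage of aligned columns."""
    return 100.0 * count_identities(query, subject) / len(query)


if __name__ == "__main__":
    n = count_identities(QUERY, SBJCT)
    print(f"{n}/{len(QUERY)} identical = {percent_identity(QUERY, SBJCT):.0f}%")
```

For this pair the count is 25 identical positions out of 48, which rounds to the 52% quoted.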

 

 

>DC454381 Full length cDNA Library, Monosiga ovata Dec 18 2007

60% to DC499869.1, 46% to DC501006

MAVHDGHGVGSILALLGLGVLGVAAVGAACLVSFLLPSPAELRRRKALHNRITGNAL

PTSGFHLNPLSDLLDELSDPFGTSTVRRDMAKGAQSGAYGLWAFGQYYAMLSNADWLREM

LSMPDKAFRRSFQPFRTFSRLLDSSLIAAEGETWARQHRIMYKAFQPSALAGYVPLYARR

AHMLADRLAAVAAAGGREDMLLAFNDLALGIMVDAGFGAAVSADDFKILFDCFRYIMCEM

QNVVHDLPIIGALPLPTNVRLDHQFKALHAAA

 

>DC499869 Full length cDNA Library, Monosiga ovata Dec 18 2007

61% to DC454381

DC499870 = 3 prime

DC499870.1 35% to estExt_fgenesh2_pg.C_280122 Monosiga ovata

48% to DC511492.1 DC513560.1

~45% to DC501006

MVVWSSPSWPATPASALRLAGLAVLGSVALCTACAISFCLPSLRSYIISRRVRVR

STGALLPENQPGIPMKTILNEINDPIGENKRRLAAASGIGALGLWLLDSYSVLLANTDWI

RELLSHPDKIYGRTFMPFKTMDRIIGGSLIGSEGDDWARQHRIMYKAFQPSALAGYVPLY

ARRAHMLADRLAAVAAAGGREDMLLAFNDLALGIMVDAGFGAAVSADDYAIIFDGFRTAM

QETTNIIHDLPIIGSLPLPTNIRLENQFKRL

(gap)

GISAVAPDVLTRVRAEVDAVVGSVEGMDSSKLDGVLPYLTKVVNESLRLNPPVAGIVGRI

IKQDMTLGDITLPAGAGVGASTIMTHYNPANWDRPDEFDPERFGEEAVSKPPRFGYIPFG

QGRRQSLGRVFALNEIRVVLAALLKQFPFEYDDAGGPVPLQVPPPSLVPPAGMPMTVRARL

 

>DC514820 DC464533 Full length cDNA Library, Monosiga ovata Dec 18 2007

DC464532 = 5 prime end

Best full length matches are to aromatase CYP19 low 20% range

MLALASLLPTGLRSLYRKMLAMTELDPVDVQEDVLAHTVPQ

VAHQLLEKRDVAVVKLGPRRVVFTCNPAIADHVFKEKQDCYIQRLCEDEGVRELGMYQQG

VIWNNSAQWSSISKDVFHRAINPHALKEASRLARARAQLVFQSALQGQAAGGVDVLKCCR

LVTLHVTLQLFFGISTEELSAQVSAESVIADVCNYFKAWEYFLLRPASSTPPDEKTQAHR

DAINRLRTTVRIIIECARAKLAAAHADHS

(gap)

DGPPAPVATAPSASASASASAPPRASTFTVTRLLADLRRRRFEHMQQAVDTMPDDPEDRE

AWCAWLAAAHDAAVEASAKSDLLKALLHESLRFKPVGPVVIRQAVADDILPASASPHTGT

PMRKGDGIVIALDLMHRRADLFERPNVFDPGRFLQSTEDTDMIKFSQPSRFAPFGAGRKS

CVGKDLGMAEILQVCA

AILTAMAFDLGDNEPLADLETRWDIANQPTRPIALRACRPSLIL

CGPSSSGKTTLRKQLQAEGGWSSHASIE

 

>CYP710D1 Monosiga ovata (choanoflagellate, single-celled ancestor to animals)

           GenEMBL CO435081, EC169877.1 EC166517.1 ESTs partial sequence

           53% to XM_813889.1 Trypanosoma cruzi 710C1, 43% to 710A1

           46% to 710A7, 56% to 710B1, 49% to 710A14

           note: seq ortholog not found in Monosiga brevicollis at JGI

DC485544 = 5 prime end

DC485545 DC517956 Full length cDNA Library, Monosiga ovata Dec 18 2007

MAQELLDAIAAWRPVATWQSALYATAALVSGYALYEQIRFRQWKQGMEGPA

LAVPLIGSIVEMVKNPYAFWENQRLRNPCGISWNSICGQFMLFSTQTDVTKKIFMNNGED

SFRLFLHPSGWKI

LGEHNIAFKHGPSHKALRKSFLNLFTRKALGVYLGIQERLVREHLATWKLVDEPTEFRLK

VRDLNLLTSQTVFIGPYLRSDEERVTFCNNYLLMTEGFLSFPIAFPGSGLWKAINARHAI

VDKLVAAARESKARMAAGADPECLLDFWSQRILEEIAEAKPGDEPAEHWSDWEMGNTMMD

FLFASQDASTASLTWTAAFMSERPDVLAKVQAEQKALRPNDEPLTYDMVEQLVYTRAVIK

EILRFRPPAVMVPAIAMVDFPLTD TCVAPKGSL

VVPSIWAACMQGFPHPEVFDPDRMGPERQEDVQYRDNFLTFGVGPHMCVGREYAINH

LVAFLSLLSTTCSWTRIYTPESHTIKYLPTIYPGDCLIHLKPLQASA*