Red alga Galdieria sulphuraria

 

May 12, 2006  D. Nelson

 

http://genomics.msu.edu/galdieria/sequence_data.html

 

data from the Galdieria Genome Project

 

Michigan State University Galdieria Database http://genomics.msu.edu/galdieria

 

Barbier G, Oesterhelt C, Larson MD, Halgren RG, Wilkerson C, Garavito RM, Benning C, Weber APM (2005) Genome Analysis. Comparative Genomics of Two Closely Related Unicellular Thermo-Acidophilic Red Algae, Galdieria sulphuraria and Cyanidioschyzon merolae, Reveals the Molecular Basis of the Metabolic Flexibility of Galdieria and Significant Differences in Carbohydrate Metabolism of Both Algae. Plant Physiol 137: 460-474

 

Weber, A.P.M., Oesterhelt, C., Gross, W., Bräutigam, A., Imboden, L.A., Krassovskaya, I., Linka, N., Truchina, J., Schneidereit, J., Voll, L.M., Zimmermann, M., Riekhof, W.R., Yu, B., Garavito, M.R., Benning, C. (2004). EST-analysis of the thermo-acidophilic red microalga Galdieria sulphuraria reveals potential for lipid A biosynthesis and unveils the pathway of carbon export from rhodoplasts. Plant Mol. Biol. 55: 17-32.

 

>contig_454_Oct13_2005  39% to 3A5, no introns

130976 MASPNEIARWFLQTRTKLHSYLFSYCFMFLAPRIALVSPEAAKH

       VMVKNVRNYVKPPMVRQGLSNLLGNKGILLAEGDDHARQRRIILPA 130707

130706 FHFDALVHLGPIFRAQGQQVVQRWLNRPEEAIDVHLDMTQVTMNVIALAAFG 130551

130550 YDPNTDSGQELYRAYRDIFTQRPPSRMLAMLFSLLPSWLLQSMPLSRLLRRQQSNVR 130380

130379 LVKKKVTEIVQKRREEYEALLVKDSNAMGKSTTNRDLLDMLVAARDPELEKKSSHLP 130209

130208 YLTDEEITSQALTFMAAGQVTTAVLLSWTLFELSIHPSAQEKLRQELQTMETTLSTQDIT 130029

130028 EMVQHLDKLEYLDVVLHESLRLHPPVLFITRQAVQDDEILGFPISQGAIVNIPIVALHRD 129849

129848 PEQWGPDAESFRPERFLSSDKNNVVIQRHAMAWLPFLYGTRACTGQRFAMLEAKTILFEL 129669

129668 LTKVSVRLQPGCEVKGYGMVSVPRDVRLQVVDLHKE* 129558
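
The coordinates flanking each translated segment can be sanity-checked against the residue count. The sketch below is only an illustration: it assumes a line has the form "start RESIDUES end", that the coordinates are inclusive nucleotide positions, and that a trailing "*" (stop) counts as one codon. The helper name check_segment is hypothetical and not part of the Galdieria Genome Project data.

```python
import re

# Assumed line layout: "<start> <RESIDUES> <end>", as in the segment lines above.
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s+(\d+)\s*$")

def check_segment(line):
    """Return (nt_span, codons, consistent) for one coordinate-annotated line."""
    m = SEGMENT.match(line)
    if m is None:
        raise ValueError("line does not look like 'start RESIDUES end'")
    start, residues, end = int(m.group(1)), m.group(2), int(m.group(3))
    nt_span = abs(end - start) + 1   # inclusive span, same formula on either strand
    codons = len(residues)           # a trailing '*' (stop) is counted as one codon
    return nt_span, codons, nt_span == 3 * codons

line = "130706 FHFDALVHLGPIFRAQGQQVVQRWLNRPEEAIDVHLDMTQVTMNVIALAAFG 130551"
print(check_segment(line))  # (156, 52, True)
```

Descending coordinates, as in this contig, indicate a gene on the minus strand; the absolute difference keeps the check strand-independent.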

 

>CYP710B-like seq from genomic DNA, 85% to EST-based seq. EST = A4_15B03

contig_966_Oct13_2005  29466-30995

29466 MDIVSFNSLSSGNLIILLVVITMICYFILEQLHYFWWKRSSKLPGPSFTLPFLGSIIEMV 29645

29646 KNPYQFWEKQRLLDPQGVSANFLVGRITLFVTDSALVRAILNNNSARTFLLALHPSARLI 29825

29826 LGKNNIAFMHGQEHKELRKSFLSLFTRKALGVYLTLQETSIKSHLQKWIQLSKENDEMEM 30005

30006 SFLCRDLNLETSQYVFAGPYIGEQRDQFCHWYITVTKAFISAPVFLPGTNLWKAYFARKK 30185

30186 IVALLENAVIQSKKYIGNGGTPRCLLDFWTQRVLEEMEEATQQDKEMPSYSNNRKMAETL 30365

30366 MDFLFASQDASTASLTWTLALMSDYPDVLKKVQEEQKRLRPNNEPLSFELVESMTYTRQV 30545

30546 VKEILRYRPPAVMVPQNAMGSVPLTENVTVPKGSFVMPSIWSSCMQGFPDAYKFDP (0) 30713

30771 DRMSPERQEDIKYRQNFLTFGIGPHVCVGREYAINNLIAFLALIS 30905

30906 TECKFQRYRTKKSDDIIYLPTIYPGDCLMKFV* 31004
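
Where an exon ends with a parenthesised number and the next segment starts at a non-adjacent coordinate, the gap can be read as an intron. The following sketch computes such gap sizes. It assumes that "(0)", "(1)" and "(2)" mark intron positions (the number presumably being the intron phase) and that the listed coordinates are inclusive exon boundaries; the function name intron_lengths and the exon tuples are illustrative only.

```python
def intron_lengths(exons):
    """Given exons as (start, end) pairs in listing order, return the number of
    nucleotides lying strictly between each pair of neighbouring exons."""
    return [abs(nxt[0] - cur[1]) - 1 for cur, nxt in zip(exons, exons[1:])]

# Exon boundaries taken from the CYP710B-like entry above (plus strand):
# the first exon ends at 30713 and the second starts at 30771.
cyp710b_like = [(29466, 30713), (30771, 31004)]
print(intron_lengths(cyp710b_like))  # [57]
```

Because the absolute difference is used, the same call works for minus-strand entries whose coordinates run downward.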

 

>CYP710B1 related seq, ESTs HET_11H09, HET_31E01

MQLTEFDSFNKFLSGN

LVFLGVSIALVCYLLFEQLRYFWWKRSSKLRGPSFTLPFLGSIIEMVKNPYEFWEKQRLL

DPQGVSANFSLGRITLFVTDSALVRAILNNNSAKTFLLALHPSARLILGKNNIAFMHGPE

HKELRKSFLALFTRKALGVYLTLQETTIKSHLQRWIELCKEKSPLEMSFFCRDLNLE

TSQYVFAGPYIGERRDEFCSWYITVTKAFISAPVFFPGTNLWKAYFARKKIVALLENAVI

QSKRYMADGGSPRCLLDFWTQRVLEEVEEAAQQGRSVSYANNRKMAETMMDFLFASQ

DASTASLTWTLALMADHPDILKRVQEEQKRLRPNNEPLSFELVENMTFTRQVVKEILRYRPPA

 

>contig_981_Oct13_2005 EST = A4_11A04 30% to 4F12

75266 MLQLWIVLVTFSCLFLYVFILPKWRNRHIPGPRPSLLLGNVSELSRQGGTAPLVFERFRKQYGDVFQIWSFYRQ 

75044 IVVISHPDDIKYIIVTKNFPKAEEFNLSLSPLAGRGLLTVGKSQHQERRRAISKHFNE 74871

74870 DFLRQLHRHMRVELMILLSKLQQVTERKESIDFDKEATSYTLDVMCRTGFGCTANTQED 74694

74693 ASHPISRAVNVSLREMYHNLVAYPIRNCFGLYSSPALKNATGVIREFASQVIEARR 74526

74525 TESEEDKTRRPLDLLDIFLKMDNLSDQNIIAEIATFLVAGHDTTSHT 74385

74384 MSWLIYEVCQHPEIEQKIQQEVDTIWGDRQDWMLSFEEIGQLEYLNKVWKETLRKHPVAA 74205

74204 TGTLRRLDTDVTLPSCGMLLRKNTAILVPIYLVHRNPEFWPDPETFEPERFTRENTMKRH 74025

74024 PFAFQAFSNGPRNCIGQFFATHEALTTLSSLYHFFTFRLACRAEDVKPYHAMTMKPSVGKVSEDAKGV

73820 SEYVKLPVWVTPRNTMAHLREE* 73752

 

>contig_989_Oct13_2005  41% to contig 981

47117 MMMSCLAVSLLQLSNLSQDWSRVFKLFILAALFWTVFKFLKYVYPYWRFRNI

      PGPPPKWPVGNIFELLRKPGQEHRILLQYA 46872

46871 KQYGPTFQLWYLNRRTIIVANPEDAKFVLATRNYPKSPIFCRCFSPLGHGLLTLSQEEH 46695

46694 PVQRKAISQRFNEEFLQSLHHHLTAELEVFMAQMDALCDTERVVDLDALISALTLDVIAR 46515

46514 TAFGVSFTAQTSQHHPMPHAVLTLLDELVNNMIFYPYRFWLSHITQKRLNEAINVIRKFC 46335

46334 NMVIDLRLQESREEKSNRVRDLLDIFLESDETRDNVIAHVATFMLAGHD  (1) 46188

46137 STSHTLSFCMYEIAQNRDIERKLQEESDRFIVAQDRIVPFDQVGHLDYTRMVWNEALRTH 45958

45957 PAAANTSVRCADRDDVLPGSGIPITKGTGLMVSSYLIHHLPQYWENPDHFIPERHTKEAV 45778

45777 RQRSPYYFLPFSRGSRNCIGQFVANHEALTILSTIYKRYEIRLAVGAQEVEEYFRVTMKP 45598

      HCRFYVQGKKDPSLDAHLGLPVKIYSRKCYS* 45502
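
To reuse one of these entries in an alignment or BLAST search, the coordinates and intron markers have to be stripped first. The sketch below shows one way to do that. It assumes it is given only sequence lines (no header or comment lines, and no free-text notes such as "(alt N-term)"); the function name assemble_protein is hypothetical.

```python
import re

def assemble_protein(lines):
    """Join coordinate-annotated sequence lines into one residue string."""
    parts = []
    for line in lines:
        line = re.sub(r"\(\d*\)", "", line)      # drop "(0)", "(1)", "(2)" and "()" markers
        line = re.sub(r"\d+", "", line)          # drop nucleotide coordinates
        parts.append(re.sub(r"\s+", "", line))   # drop all remaining whitespace
    return "".join(parts)

# Two consecutive exon lines from the contig_989 entry above.
exon_lines = [
    "46334 NMVIDLRLQESREEKSNRVRDLLDIFLESDETRDNVIAHVATFMLAGHD  (1) 46188",
    "46137 STSHTLSFCMYEIAQNRDIERKLQEESDRFIVAQDRIVPFDQVGHLDYTRMVWNEALRTH 45958",
]
protein = assemble_protein(exon_lines)
print(len(protein))  # 109
```

The trailing "*" of a final exon is kept as-is, so it can be removed or retained depending on what the downstream tool expects.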

 

>CYP51 contig_1016_Oct13_2005

188929 MLSQDSIALSTLTSSLEAYCWALVYILSTILFFGILWRITGSFFLSKLG

IAREVKGQQLPPTYKEGLPLVGNLIAFAKGPLNVVQRGYQSCGDIFTFK (0) 189228

189255 VFHKHITFLVGPKAHEIFFQGTDDE

LDQNEVYAFSVPIFGKGVVYDAPLEKRLQQLRIMSAALRPARMYGYVDQMVLEAVQFFRK

WGDQGQVDILESLSDL

IILTASRCLMGREVREQLFEKVSKLYHDLDQGMQPISVFAPYLPISAHRKRDKAREEMVQ

LFRTVIQNRRRRNVKEDDMLQTFMDASYRDGSRPSEYEVA

GLLIALLFAGQHTSSITGSWTGMLLLRNKDVFERVKKEQDTIIEEHGDELNYDVLSKM

NLLHLCIKETLRMYPPLILLMRKVLKPKFYKEYVIPENDIVMVSPAASGRLENVFKNPNA

WDPDRFGPNREEDKKAPFSFIGFGGGRHGCMGEQFAYLQIKTIWTVLVRSFDLEPIGDLSQPDY (0) 190415

190469 NAMVVGPRPPCLLKYRKKKDSFLDRVSLYA* 190555

 

>contig_1041_Oct13_2005

Gene model c1041_g24.t1 looks like two genes run together: the dihydrolipoamide acetyltransferase (E2) subunit of PDC and a P450.

c1041_g24.t1 class=CDS position=contig_1041:66689..63211 (- strand)

N-terminus not clear; MDHK might be the start, or one more exon could be upstream.

64863 MSLMKRVMFLGGQVARWLVNGGLLSLVFVDLAFSRWSLEDLWCTPPSQSFNCIGQGR (1) 64693 (alt N-term)

64600 DHSMDHKSKQVIFVYIIFDCNLPSLPGPSPWPIVGNCIPLSSN

      LYQTLYQYVEQPISLYFIASTPFVVVTDEAAVR (2) 64373

64319 KVLGSGMYQKPKYFGYRSSTIRYSVEMNQKLILTNEQMRQQQADSSRKA 64173

64172 LKVMIDSKVSDIIDGMIEAAEAVVHAVDGREQVENIRRKVIELNLNVLFGYKNDKDV 64002

64001 GSLSHIIFEAGKEFILRTVNPFRIGWRWMANFRFFQ (1) 63894

63833 YVFSLITIGRRVCQHMDSQPATWVHGWVGKVGKIGKLGKVVGLIMASSQTVPTTCLWLLFLLSK () 63642

63641 YPQVVEKIREETSRVLHSTKKQSMEEFTVDDLNELAYVDCVVKECLRLYPPFPLLQREPE  (0) 63462

63405 MDDILENVKIPARTPVYIVPWLLHHHPKYWKQPEDFIPDRFMYNASHGDAPSDFVYIPFGRGNK (2) 63214

63161 MCAGYHLALLELKILTIYVCQYYDWKCSFPQGKEPVSKKYPIETITHSSCNRFFFIMQLLSIGNVS*

 

>contig_1062_Oct13_2005  30% to 39A1

26036 MWIGLLLFFVLVFTLYLVRQNTCTGNKANLSYSPVCKGLPLLGSALEFGK

      NPLKFLQECRKQYGDVFTVLLPGRRMTFIFAPTQE 25782

25781 LRKIFFNGSPNLISFTAGVEPLTCRIFGISKKGFSMAHRSLLTTLRSELGAKHIPQLAHR 25602

25601 LINRYLFTFRTVWGKEDEKEASNLLTETLSDASLRVIFGDEFANASPSLFKDFVDF 25434

25433 DEWFELAATPLLPHFLLRPFVKSRRKLLDTISQNWKYTKNAPIHKLTE 25290

25289 AYGNDGNVPSLLLSALWATWSNVSPTSFWTLTHILADEKAKVKVLAEVEKSCPLLLSSKT 25110

25109 ELSLEWIFSNLPFTAYCVSETLRLYASVVDIRKVVENLEFREFIIRKGDYLCISPAVSHR 24930

24929 ETTLFPQSEDFIPDRFQKQGTHPNAVFDKDLLTFGGGFYKCPGQSFAMVEIVLLIALVFY 24750

24749 LYDIQLVDRVPKMKESQSVGIKKPSCSCRIHYLWKRRLAGMEEI* 24615