Four P450 fragments from Tetrahymena

 

D. Nelson 4/5/2004

 

These sequences resemble CYP4V, CYP3A and CYP46, so the CYP3 and CYP4 clans probably had a common ancestor, and these sequences probably derive from that same ancestor.

 

 

>Tetrahymena P450 seq 1a N-term (partial 291 aa) BM400694.1

MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIK

KYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNV

LRCAPQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIEQIASKVFNQAMESSEILANY

DPLVYSQKITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQYFLFGADF

FKLRLTQSQRYVDDIIEEFRSFLTDLIEGKHQKLSQKLKEYGKIVSLPFSLESLHLRNNA

 

>Tetrahymena P450 seq 1b BM399152 N-terminal probably same gene as seq 1a

MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLL

GELVEIKEAIKKYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQQY

YEKDTFYIGNVLRCAPLRYS (fs)

HLQRAKQWKKARTMFS (fs)

QAFHFEYLTSLAPLIEQIAFKSL (fs)

NQAMESSEILASYDPLVYSQKITGQGVIATLFGEQVNEKKX (fs)

RGMD (fs)

LVSALTHMLNLL

 

>Tetrahymena P450 seq 2 N-term (partial 214 aa) BM396441.1

MVSYFALAGLAIVLYILYVFIINPYLQYRK

YLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYDDGSDPKIYVENNATGAIIKISD

PEYIKEFVQLENKAYQKTTLLIDNIIRLVGQGIIFSEGPQWKKNRNVLSGVFHFEQLSKR

VPSIEKITKEVYKRYIDSGNVKNVDVIELFQEITAEVVSKPSSVIFQRLILPWHEFAGSS

SYLI

 

>Tetrahymena P450 seq 3a C-term (partial 293 aa) BM399816 with frameshift (fs)

KTVGKRATRGGAYNSVGKQISTPFYFLFRTNFFKWGIRESDRELNKQIKEFRQMIGDIIN

ERIKEEEELEKRGEQTTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM

AIWFLTQHPEIKKKLQEELDANTDYSQNGLLKLPYLNGVIQETQRLYGPAGQLFNRVALR

DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS (fs)

YTFIPFNAGPRN

CIGQHLALVEARIMMYYFMKTFDFESDHNFEMVLNKASQQTSRQLRIILAQEI

 

Tetrahymena P450 seq 3b BM399815 BEGINNING = SEQ 3

THIS IS PROBABLY A POOR QUALITY VERSION OF SEQ 3

THE TWO ARE 85% IDENTICAL AT NUC. LEVEL, 5 PRIME ENDS NEARLY IDENTICAL

TREATED AS SAME GENE IN TIGR GENE INDEX FOR TETRAHYMENA

 

AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER

IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF (FS)

YLAGLVPLGHLGGCGLWVPIYKHPRIQKRNSEKNLD

ANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP

VAPQEPHAQGLPIQKGIIVRPXXX

SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA

PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP
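
An identity figure like the one quoted for seq 3a versus seq 3b can be checked on any pre-aligned, ungapped stretch. The sketch below is only an illustration: the function name `percent_identity` is hypothetical, and the two 30-base windows are copied from the BM399816 and BM399815 nucleotide listings in this file. A window this short is not expected to reproduce the overall 85% value.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, ungapped strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# 30-base windows covering the same region of the two ESTs
bm399816 = "gacagggagttgaacaagtagataaaagaa"  # bases 121-150 of BM399816
bm399815 = "gacagaaagttgaacaagtagataaaagaa"  # bases 116-145 of BM399815

print(f"{percent_identity(bm399816, bm399815):.1f}% identical")
```

The two mismatches in this window are the gag/aag codon difference behind the E/K change (ESDRELNK versus ESDRKLNK) between seq 3a and seq 3b.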

 

>Tetrahymena P450 seq 4 C-term CF653700 (168 AA)

ICLWVLAQHPELQQKIRAEIDSVIQTFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP

RVSARDHMVDDFPIPKGAFVSNLTIQYNEQKFPLLCKDIDTFNPDRFLDKNIIQDHFSFI

PYSAGPRNCIGQHLALIEAKIMIAYILKNYVVLPNEEHQKVRFNHLFL

 

>Tetrahymena P450 seq 5 BM400871 and BM400870 C-helix to I-helix

GELTRWRRSKRNFLS (fs)

LFHFNALKNRVLSSRRLPRSSWATLPSDGKTPITIIEELQNITSE

VVIQTFFGENLKGMTVNGLQPSVEISKIIGDGFSYKANSFAYFLKLMVFGQEKASRVLNT

TFEKNFLKRVENYNQFIEGIVDKRLSELEKLTDTSKVDENFLNLYLLEYIKQQKALKENP

KIYADYEIIPKREIVHQFTTFFFAGMDTTANQTGICL

 

Note: GELTRWRRSKRN at the beginning of this sequence is found at the beginning of many Tetrahymena EST sequences. VecScreen did not identify it as vector, but it is probably not part of the protein sequence.
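
A minimal sketch of how the recurring leader could be screened out of translated ESTs before comparison. The helper name `strip_leader` is hypothetical, and treating the 12 residues as removable is the assumption stated in the note above, not an established fact.

```python
LEADER = "GELTRWRRSKRN"  # recurring N-terminal stretch seen in many Tetrahymena ESTs


def strip_leader(protein: str, leader: str = LEADER) -> str:
    """Return the protein without the leader if it starts with it, else unchanged."""
    if protein.startswith(leader):
        return protein[len(leader):]
    return protein


print(strip_leader("GELTRWRRSKRNFLS"))  # first line of seq 5
print(strip_leader("MFIEILIIILFFAALRLVII"))  # start of seq 1a, no leader present
```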

 

&&&&&&&&&&

 

>Tetrahymena P450 seq 1 (partial 291 aa N-term)

MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIK

KYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNV

LRCAPQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIEQIASKVFNQAMESSEILANY

DPLVYSQKITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQYFLFGADF

FKLRLTQSQRYVDDIIEEFRSFLTDLIEGKHQKLSQKLKEYGKIVSLPFSLESLHLRNNA

 

>5009-0-77-F07.t.1 full open reading frame, no stops

 

 

ATATAGAATATCAAATGCTATAATATAAA

ATGTTTATAGAAATATTAATTATTATTCTTT

TTTTTGCTGCTTTAAGGTTAGTAATTATCCCTTATTTCAAGTTCCTTAAGTACAAAAAAT

ATGGGGATGGCAGATTTGTTCCATTATTAGGAGAACTAGTAGAGATAAAGGAAGCTATTA

AAAAGTACGAGGATGTAGACTATTTCGTAAGTCACTAATGCGATGAAAATCCAGACTTAA

GATTATATGTTGTCAATCTTGGCTCTAAGATAAAGCTTAGATTAGTAGATCCCGACTTAA

TGAGAGATTTCTTTTTAAATCAGTAGTATTATGAAAAAGATACTTTTTACATTGGCAACG

TCCTAAGATGCGCACCTTAAGGTATAGCATTTGTAGAGGGCGAGCAATGGAAAAAAGCCA

GAAAAATGTTTTCTCAAGCTTTCCATTTTGAATACCTCACTTCTCTAGCTCCATTAATAG

AATAAATAGCTTCAAAAGTCTTTAACTAAGCTATGGAAAGTAGCGAAATCCTTGCCAATT

ATGATCCCCTTGTATATTCATAAAAGATAACAGGATAAGTGGTTATTGCTACCTTTTTTG

GAGAACAAGTAAATGAAAAAAAGTTTAGAGGTATGGATTTAGTCTCAGCTTTGACACATA

TGTTAAATCTACTTGGAGAGTAATCAATGAGCCTCTAATACTTTTTGTTTGGAGCAGACT

TTTTTAAGTTAAGACTAACTCAATCCTAAAGATATGTCGATGATATCATTGAAGAATTCC

GTTCTTTCCTGACAGATCTGATTGAGGGAAAGCATTAGAAGCTTTCACAAAAATTGAAAG

AATACGGAAAAATAGTGTCCCTTCCATTTAGCTTAGAAAGTTTGCACCTTAGAAATAACG

CAAA

 

There is only one stop codon in Tetrahymena: TGA

TAA and TAG = Q (glutamine)
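
The codon usage above corresponds to the ciliate nuclear code (NCBI translation table 6). The sketch below shows a translation under that code; the function name `translate_ciliate` is hypothetical, and the two test strings are short in-frame stretches copied from the 5009-0-77-F07.t.1 nucleotide sequence above.

```python
# Ciliate nuclear code (NCBI translation table 6):
# TAA and TAG encode Q, leaving TGA as the only stop codon (*).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_ciliate(nucleotides: str) -> str:
    """Translate frame 1 of a DNA/RNA string; unknown codons become X."""
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


print(translate_ciliate("ATGTTTATAGAAATATTAATT"))  # MFIEILI
print(translate_ciliate("AGTCACTAATGCGATGAA"))     # SHQCDE (TAA read as Q)
```

With the standard code the second stretch would read SH*CDE, which is how it appears in the BLAST alignments in this file.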

 

>gi|18200747|gb|BM400694.1|BM400694 5009-0-77-F07.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 934

 

 Score =  559 bits (1441), Expect = e-161

 Identities = 291/291 (100%), Positives = 291/291 (100%)

 Frame = +3

 

Query: 1   MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 60

           MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV

Sbjct: 60  MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 239

 

Query: 61  SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 120

           SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA

Sbjct: 240 SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 419

 

Query: 121 FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI 180

           FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI

Sbjct: 420 FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI 599

 

Query: 181 TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS* 240

           TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS*

Sbjct: 600 TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS* 779

 

Query: 241 RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA 291

           RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA

Sbjct: 780 RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA 932

 

>gi|18199205|gb|BM399152.1|BM399152 5009-0-54-B02.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 824

 

 Score =  267 bits (683), Expect = 2e-73

 Identities = 139/157 (88%), Positives = 141/157 (89%), Gaps = 1/157 (0%)

 Frame = +2

 

Query: 1   MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 60

           MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV

Sbjct: 62  MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 241

 

Query: 61  SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 120

           SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRD FLNQ*YYEKDTFYIGNVLRCAP   +

Sbjct: 242 SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQ*YYEKDTFYIGNVLRCAPLRYS 421

 

Query: 121 FVEG-EQWKKARKMFSQAFHFEYLTSLAPLIE*IASK 156

              G    KK  + F QAFHFEYLTSLAPLIE*IA K

Sbjct: 422 ICRGRSNGKKPEQCFPQAFHFEYLTSLAPLIE*IAFK 532

 

 

 Score = 86.3 bits (212), Expect = 9e-19

 Identities = 58/89 (65%), Positives = 63/89 (70%), Gaps = 6/89 (6%)

 Frame = +3

 

Query: 116 P*GIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*I------ASKVFN*AMESSEILA 169

           P*GIAFVEGE  +K     SQ   F  L+ L   + *+       SKVFN*AMESSEILA

Sbjct: 408 P*GIAFVEGEAMEK-----SQNNVFLKLSILNTSLL*LH**NK*LSKVFN*AMESSEILA 572

 

Query: 170 NYDPLVYS*KITG*VVIATFFGEQVNEKK 198

           +YDPLVYS*KITG* VIAT FGEQVNEKK

Sbjct: 573 SYDPLVYS*KITG*GVIATLFGEQVNEKK 659

 

 

 Score = 30.0 bits (66), Expect = 0.078

 Identities = 32/96 (33%), Positives = 45/96 (46%), Gaps = 5/96 (5%)

 Frame = +1

 

Query: 125 EQWKKARKMFSQAF-----HFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*K 179

           +QWKKAR MFS +F     HF    +     +   +K++  A     ++  Y  +   *+

Sbjct: 436 KQWKKARTMFSSSFPF*IPHFSSSINRINSFQKSLTKLWKVAKSWPVMIPLY--IHKR*Q 609

 

Query: 180 ITG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLL 215

             G ++      E    KK     LVSALTHMLNLL

Sbjct: 610 DKGLLLPCL---ENK*MKKS*RYGLVSALTHMLNLL 708

 

 

&&&&&&&&

 

>5009-0-20-F06.t.1

AACATACGAGCCACGCGGGGCGGCCGCTCTAAAGAAAAAATTTTGTATTAAAAATTAAAA

AAAACATAAATAGATATTTAGACAAGTTTCTA

ATGGTAAGCTACTTTGCTTTAGCAGGTC

TAGCAATAGTCCTATACATTTTGTATGTATTTATTATCAATCCTTACTTGTAGTACAGAA

AATACTTGAAGTGGGGTAAAGGTTCTTTCTACCCTTTCGTTGGTGTTTTCTATGGTGCTG

GCTTACGTGTTAAGCAATACAAAGATGTTGATCATCACTTGAAGCATATGTATGATGACG

GATCAGACCCTAAAATTTATGTTGAAAACAATGCCACAGGTGCCATCATCAAGATTTCTG

ACCCTGAATATATTAAGGAGTTTGTCTAACTTGAAAACAAGGCTTATCAAAAGACTACTC

TCTTAATTGACAATATCATCAGACTCGTAGGTTAGGGAATCATCTTCTCTGAAGGCCCCC

AATGGAAGAAAAACAGAAATGTACTTTCTGGTGTCTTCCACTTCGAACAACTCAGCAAAC

GTGTCCCATCAATAGAAAAAATTACTAAGGAAGTTTATAAGCGTTATATTGATTCAGGCA

ATGTTAAAAACGTTGATGTCATCGAATTATTTTAAGAAATCACTGCTGAAGTCGTATCTA

AACCTTCTTCAGTAATATTTCAAAGATTAATCCTTCCTTGGCATGAGTTTGCTGGTAGCT

CTTCATACCTCATTA

 

 

>Tetrahymena P450 Seq 2 N-term BM396441.1

MVSYFALAGLAIVLYILYVFIINPYLQYRK

YLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYDDGSDPKIYVENNATGAIIKISD

PEYIKEFVQLENKAYQKTTLLIDNIIRLVGQGIIFSEGPQWKKNRNVLSGVFHFEQLSKR

VPSIEKITKEVYKRYIDSGNVKNVDVIELFQEITAEVVSKPSSVIFQRLILPWHEFAGSS

SYLI

 

 

Seq 2 (Query) and seq 1 (Sbjct) compared to each other

 

Query:    40 LAIVLYILYV-FIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYD 98

             L I+L+   +  +I PY  + KY K+G G F P +G        +K+Y+DVD+ + H  D

Sbjct:     6 LIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFVSH*CD 65

 

Query:    99 DGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG*GIIFSEG 158

             +  D ++YV N  +   +++ DP+ +++F  L    Y+K T  I N++R   *GI F EG

Sbjct:    66 ENPDLRLYVVNLGSKIKLRLVDPDLMRDFF-LNQ*YYEKDTFYIGNVLRCAP*GIAFVEG 124

 

Query:   159 PQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNV-KNVDVIELF*EITAEV 217

              QWKK R + S  FHFE L+   P IE I  +V+   ++S  +  N D +   *+IT  V

Sbjct:   125 EQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KITG*V 184

 

Query:   218 V 218

             V

Sbjct:   185 V 185

 

>gi|18196479|gb|BM396441.1|BM396441 5009-0-20-F06.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 735

 

 Score =  426 bits (1095), Expect = e-121

 Identities = 214/214 (100%), Positives = 214/214 (100%)

 Frame = +3

 

Query: 1   MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD 60

           MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD

Sbjct: 93  MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD 272

 

Query: 61  HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG 120

           HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG

Sbjct: 273 HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG 452

 

Query: 121 *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF 180

           *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF

Sbjct: 453 *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF 632

 

Query: 181 *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI 214

           *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI

Sbjct: 633 *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI 734

 

&&&&&&&&&&&


BM399816

5009-0-62-A08.t.2 Chilcoat/Turkewitz cDNA (large fraction)

            Tetrahymena thermophila cDNA, mRNA sequence.

        1 aagacggtgg gaaagcgagc cacgcggggc ggcgcctaca actctgttgg taagtaaatt

       61 agcactccct tttacttctt attccgtacc aatttcttca aatggggcat cagagaatct

      121 gacagggagt tgaacaagta gataaaagaa ttccgtcaaa tgattggtga catcatcaac

      181 gagcgtatca aagaagaaga agagttagaa aagcgtggtg aataaactac caaggaagat

      241 cttgtttatt atcttaaaaa gaataacctc cgtggagtcc tctccctcga tgaaattatt

      301 agtgaattca tgactttcta cgttgctggt atggatacaa ctggtcatct ttgcggtatg

      361 gccatatggt tccttactta acaccccgaa attaaaaaga aactctaaga agaacttgat

      421 gctaacactg actactctca aaatggtctc cttaagcttc cttaccttaa tggagttatc

      481 taagaaactc aacgtctcta tggacccgct ggttaattat tcaatcgtgt cgctcttaga

      541 gaccacatgc ttaaggacat tcctatcaag aagggaacta ttgttaagcc ctctccctgc

      601 tctgttcaca gacatcctaa atatttcgaa gaccctcatt ccttcaagcc tgaaagatgg

      661 tttaacaaaa aatactgtca ctccttacac ttttatcccc ttcaatgctg gtcccagaaa

      721 ctgcattggc taacatcttg ccttagtaga agctagaatt atgatgtatt atttcatgaa

      781 gacttttgat tttgaaagcg atcataattt tgaaatggtt ctcaataagg cttcttaata

      841 aaccagtaga taactcagaa taatcttagc tcaagaaatc

 

>Tetrahymena P450 seq 3 C-term (293 aa) BM399816

KTVGKRATRGGAYNSVGKQISTPFYFLFRTNFFKWGIRESDRELNKQIKEFRQMIGDIIN

ERIKEEEELEKRGEQTTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM

AIWFLTQHPEIKKKLQEELDANTDYSQNGLLKLPYLNGVIQETQRLYGPAGQLFNRVALR

DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS (fs)

YTFIPFNAGPRN

CIGQHLALVEARIMMYYFMKTFDFESDHNFEMVLNKASQQTSRQLRIILAQEI

 

>EMBOSS_001_3

DGGKASHAGRRLQLCW*VN*HSLLLLIPYQFLQMGHQRI*QGVEQVDKRIPSNDW*HHQR

AYQRRRRVRKAW*INYQGRSCLLS*KE*PPWSPLPR*NY**IHDFLRCWYGYNWSSLRYG

HMVPYLTPRN*KETLRRT*C*H*LLSKWSP*ASLP*WSYLRNSTSLWTRWLIIQSCRS*R

PHA*GHSYQEGNYC*ALSLLCSQTS*IFRRPSFLQA*KMV*QKILSLLTLLSPSMLVPET

ALANILP**KLEL*CIIS*RLLILKAIIILKWFSIRLLNKPVDNSE*S*LKK

 

 

>BM399816 seq with frameshift; resembles the CYP3 family and CYP4V

KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDII

NERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLS

LDEIISEFMTFYV AGMDTTGHLCGM

AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR

DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS

YTFIPFNAGPRNCIG*HLALVEARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI

 

Note that the N-terminus of this fragment overlaps weakly with the C-terminal end of BM400694.1 (seq 1), so a complete sequence

could be assembled, but it would be a hybrid.

 

>gi|18200747|gb|BM400694.1|BM400694 5009-0-77-F07.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 934

 

 Score = 33.5 bits (75), Expect = 0.001

 Identities = 18/46 (39%), Positives = 29/46 (63%)

 Frame = +3

 

Query: 14  NSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDII 59 (ETAM exon?)

           N +G+*  +  YFLF  +FFK  + +S R ++  I+EFR  + D+I

Sbjct: 696 NLLGE*SMSL*YFLFGADFFKLRLTQS*RYVDDIIEEFRSFLTDLI 833

 

 

 

>gi|18199869|gb|BM399816.1|BM399816 5009-0-62-A08.t.2 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 880

 

 Score =  467 bits (1201), Expect = e-133

 Identities = 230/233 (98%), Positives = 230/233 (98%)

 Frame = +1

 

Query: 1   KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN 60

           KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN

Sbjct: 1   KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN 180

 

Query: 61  ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM 120

           ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM

Sbjct: 181 ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM 360

 

Query: 121 AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR 180

           AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR

Sbjct: 361 AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR 540

 

Query: 181 DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIP 233

           DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS  F P

Sbjct: 541 DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSLHFYP 699

 

 

 Score =  129 bits (324), Expect = 1e-31

 Identities = 83/164 (50%), Positives = 91/164 (55%)

 Frame = +2

 

Query: 130 EIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALRDHMLKDIPI 189

           E+ KKL   +D   +YS   LL+   L   +   + L  P   LF  +      L    +

Sbjct: 473 ELSKKLNVSMDPLVNYSIVSLLETTCLRTFLSRRELLLSPLPALFTDILNISKTLIPSSL 652

 

Query: 190 KKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIPFNAGPRNCIG*HLALV 249

           K G                              K     YTFIPFNAGPRNCIG*HLALV

Sbjct: 653 KDG----------------------------LTKNTVTPYTFIPFNAGPRNCIG*HLALV 748

 

Query: 250 EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI 293

           EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI

Sbjct: 749 EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI 880

 

>gi|18199868|gb|BM399815.1|BM399815 5009-0-62-A08.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 817

 

 Score =  213 bits (542), Expect = 5e-57

 Identities = 139/240 (57%), Positives = 165/240 (68%), Gaps = 8/240 (3%)

 Frame = +2

 

Query: 8   TRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIINERIKEEE 67

           T   AYNSVGK*ISTPFYFLFRTNFFKWGIRESDR+LNK*IKEFRQMIGDIINERIKEEE

Sbjct: 17  TGAAAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRKLNK*IKEFRQMIGDIINERIKEEE 196

 

Query: 68  ELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGMAIWFLT* 127

           ELEKRGE*TTKEDLVYYLKKNNL GVLSLDEII+ F+ F + G  +TG      +W L 

Sbjct: 197 ELEKRGE*TTKEDLVYYLKKNNLLGVLSLDEIINGFLNFLLGGFGSTG--SSWRMWPLGP 370

 

Query: 128 H----PEIKKKL*EELDANTDYS-QNGL-LKLPYLNGVI*ETQRL-YGPAG*LFN-RVAL 179

           +      +K+KL*EEL   T  + ++GL L   YL GVI*E   + YGP    ++ R  L

Sbjct: 371 YI*TPANLKEKL*EELGLLTVTTLKSGLPLSFLYLMGVI*ENSNVSYGPRWVNYSIRSLL 550

 

Query: 180 RDHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIPFNAGPR 239

           ++HMLKD   ++  ++  S  SVHR       PHSF  ++ +  K CHS  + PF+   R

Sbjct: 551 KNHMLKDFLFRRELLLGLS-WSVHRLLNI*R-PHSFSLKKVY--KICHSELY-PFSGPER 715

 

>gi|37509509|gb|CF653700.1|CF653700 EST00033 Suppression Subtractive Hybridization Libraries of

           Tetrahymena thermophila Exposed in

           Dichlorodiphenyltrichloroethane (DDT) Tetrahymena

           thermophila cDNA clone DDT-236.

          Length = 505

 

 Score =  106 bits (264), Expect = 9e-25

 Identities = 59/150 (39%), Positives = 88/150 (58%), Gaps = 7/150 (4%)

 Frame = +1

 

Query: 120 MAIWFLT*HPEIKKKL*EELDANT----DYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFN 175

           + +W L *HPE+++K+  E+D+      D     L KL Y N    E+ R+Y  A  + 

Sbjct: 1   ICLWVLA*HPELQQKIRAEIDSVI*TFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP 180

 

Query: 176 RVALRDHMLKDIPIKKGTIVKPSPCSVH--RHPKYFEDPHSFKPERWFNKKYCHS-YTFI 232

           RV+ RDHM+ D PI KG  V       +  + P   +D  +F P+R+ +K      ++FI

Sbjct: 181 RVSARDHMVDDFPIPKGAFVSNLTI*YNE*KFPLLCKDIDTFNPDRFLDKNIIQDHFSFI 360

 

Query: 233 PFNAGPRNCIG*HLALVEARIMMYYFMKTF 262

           P++AGPRNCIG*HLAL+EA+IM+ Y +K +

Sbjct: 361 PYSAGPRNCIG*HLALIEAKIMIAYILKNY 450

 

>Tetrahymena seq 4 C-term CF653700

1   ICLWVLAQHPELQQKIRAEIDSVIQ TFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP 180

181 RVSARDHMVDDFPIPKGAFVSNLTIQYNEQ KFPLLCKDIDTFNPDRFLDKNIIQDHFSFI 360

361 PYSAGPRNCIGQHLALIEAKIMIAYILKNYVVLPNEEHQKVRFNHLFL

 

ATTTGTCTTTGGGTTTTGGCTTAACACCCTGAACTCCAACAAAAGATCAGAGCTGAAATT

GATTCTGTTATTTAGACCTTTGACGATCTTAAGCACGAAGNTTTGAATAAACTTGAATAC

TTCAATGCTTTCTTCAAAGAATCCCTCAGAGTTTATCCCACTGCCCCTCAAGTCATTCCT

AGAGTCTCTGCTCGTGACCACATGGTCGATGACTTCCCTATTCCCAAGGGTGCCTTCGTA

TCAAACTTAACCATTTAATATAATGAATAAAAGTTCCCACTCCTCTGCAAAGATATTGAT

ACCTTCAATCCCGATAGATTCTTAGACAAGAACATCATTCAAGATCATTTCAGTTTTATC

CCTTACTCAGCAGGTCCCCGTAATTGCATCGGCTAGCACTTGGCCTTAATCGAAGCTAAG

ATCATGATTGCTTACATCCTCAAAAATTACGTAGTTTTACCCAACGAAGAACACTAGAAA

GTTAGATTTAACCACTTATTCTTGT

 

>EMBOSS_001_1

                              PYSAGPRNCIG*HLALIEAKIMIAYILKNY VVLPNEEHQKVRFNHLFL

                              >EMBOSS_001_2

                              LTQQVPVIASASTWP*SKLRS*LLTSSKIT*FYPTKNTRKLDLTTYSC

                              >EMBOSS_001_3

                              LLSRSP*LHRLALGLNRS*DHDCLHPQKLRSFTQRRTLES*I*PLI

 

LOCUS       BM399815                 817 bp    mRNA    linear   EST 17-JAN-2002

DEFINITION  5009-0-62-A08.t.1 Chilcoat/Turkewitz cDNA (large fraction)

            Tetrahymena thermophila cDNA, mRNA sequence.

ACCESSION   BM399815

VERSION     BM399815.1  GI:18199868

KEYWORDS    EST.

SOURCE      Tetrahymena thermophila

  ORGANISM  Tetrahymena thermophila

            Eukaryota; Alveolata; Ciliophora; Oligohymenophorea;

            Hymenostomatida; Tetrahymenina; Tetrahymena.

REFERENCE   1  (bases 1 to 817)

  AUTHORS   Turkewitz,A.P., Karrer,K.M., Jahn,C., Orias,E., Kirk,K.E., Frankel

            ,J. and Klobutcher,L.

  TITLE     EST from Tetrahymena thermophila, strain CU428.1, growing cells

  JOURNAL   Unpublished (2002)

COMMENT     Contact: Turkewitz AP

            Molecular Genetics and Cell Biology

            University of Chicago

            920 E. 58th Street, Chicago, IL 60637, USA

            Tel: 773 702 4374

            Fax: 773 702 3172

            Email: apturkew@midway.uchicago.edu

            Seq primer: T3.

FEATURES             Location/Qualifiers

     source          1..817

                     /organism="Tetrahymena thermophila"

                     /mol_type="mRNA"

                     /strain="CU428.1"

                     /db_xref="taxon:5911"

                     /clone_lib="Chilcoat/Turkewitz cDNA (large fraction)"

                     /note="Vector: BlueScript2 SK+; Details on library

                     preparation can be found in Chilcoat and Turkewitz (2001)

                     Proc. Natl. Acad. Sci USA, 98: 8709-8713."

ORIGIN     

        1 aacaaagctg gagctcacgg gggcggccgc ctacaactct gttggtaagt aaattagcac

       61 tcccttttac ttcttattcc gtaccaattt cttcaaatgg ggcatcagag aatctgacag

      121 aaagttgaac aagtagataa aagaattccg tcaaatgatt ggtgacatca tcaacgagcg

      181 tatcaaagaa gaagaagagt tagaaaagcg tggtgaataa actaccaagg aagatcttgt

      241 ttattatctt aaaaagaata acctccttgg agtcctctcc ctcgatgaaa ttattaatgg

      301 attcctgaat ttcctacttg gcgggtttgg ttccactggg tcatcttggc ggatgtggcc

      361 tttgggtccc tatatataaa cacccgcgaa tttaaaagag aaactctgag aagaacttgg

      421 attgctaaca gtgactactc tcaaaagtgg tctcccctta agctttcttt acttaatggg

      481 agttatctaa gaaaactcca acgtctctta tggaccccgc tgggttaatt attcaatccg

      541 gtcgctcctt aagaaccaca tgcttaagga cttcctattc agaagggaat tattgttagg

      601 cctctcctgg tctgttcaca gacttctaaa tatttgaaga cctcattcct tcagcctgaa

      661 gaaggtttac aaaatctgtc actccgaact ttatcccttt agcggtccgg aacggttggc

      721 accacggctt aataagctga tgtatatctt cggagactgg ttgaagaaca atgaaggtca

      781 aagtttaaac cggctcaaaa tttctttaac ttttccg

 

>EMBOSS_001_1

NKAGAHGGGRLQLCW*VN*HSLLLLIPYQFLQMGHQRI*QKVEQVDKRIPSNDW*HHQRA

YQRRRRVRKAW*INYQGRSCLLS*KE*PPWSPLPR*NY*WIPEFPTWRVWFHWVILADVA

FGSLYINTR EFKRETLRRTWIANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP

VAPQEPHAQGLPIQKGIIVRP LLVCSQTSKYLKTSFLQPE EGLQNLSLRTLSL*RSGTVG

TTA**ADVYLRRLVEEQ*RSKFKPAQNFFNFSX

 

>EMBOSS_001_2

TKLELTGAA AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER

IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF LLGGFGSTGSSWRMWP

LGPYIQTPANLKEKL*EELGLLTVTTLKSGLPLSFLYLMGVI*ENSNVSYGPRWVNYSIR

SLLKNHMLKDFLFRRELLLGLSW SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA

PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP

 

>EMBOSS_001_3

QSWSSRGRPPTTLLVSKLALPFTSYSVPISSNGASENLTES*TSR*KNSVK*LVTSSTSV

SKKKKS*KSVVNKLPRKILFIILKRITSLESSPSMKLLMDS *IS YLAGLVPLGHLGGCGL

WVPIYKHPRIQKRNSEKNLD C*Q*LLSKVVSP*AFFT*WELSKKTPTSLMDPAGLIIQSG

RSLRTTCLRTSYSEGNYC*ASPGLFTDF*IFEDLIPSA*RRFTKSVTPNFIPLAVRNGWH

HGLIS*CISSETG*RTMKVKV*TGSKFL*LF

 

 This seq frameshifts repeatedly. The pieces below are taken in turn from EMBOSS frames 2, 3, 1 and 2:

 

TKLELTGAA AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER

IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF (FS)

YLAGLVPLGHLGGCGLWVPIYKHPRIQKRNSEKNLD

ANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP

VAPQEPHAQGLPIQKGIIVRPXXX

SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA

PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP
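
Piecing a frameshifted EST together means finding which forward frame holds each recognizable peptide. The sketch below translates all three forward frames with the ciliate code (TAA/TAG = Q, so Q appears where the EMBOSS output in this file shows *) and reports the frame containing a given peptide. The function names are hypothetical, and the test string is the first 60 bases of BM399815 as listed above.

```python
# Ciliate nuclear code (NCBI translation table 6): TAA/TAG = Q, TGA = stop (*).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def three_frame_translate(nucleotides: str) -> dict:
    """Return {1: frame-1 protein, 2: frame-2 protein, 3: frame-3 protein}."""
    seq = nucleotides.upper()
    frames = {}
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames[offset + 1] = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return frames


def frames_containing(nucleotides: str, peptide: str) -> list:
    """List the forward frames whose translation contains the peptide."""
    translated = three_frame_translate(nucleotides)
    return [frame for frame, protein in translated.items() if peptide in protein]


# First 60 bases of BM399815
est_start = "aacaaagctggagctcacgggggcggccgcctacaactctgttggtaagtaaattagcac"

for frame, protein in three_frame_translate(est_start).items():
    print(frame, protein)
print(frames_containing(est_start, "AYNSVGKQIS"))  # [2]
```

Running the same search with a peptide from each later piece shows where the reading frame changes along the EST.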

 

&&&&&&&&&&&

 

>gi|18199205|gb|BM399152.1|BM399152 5009-0-54-B02.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 824

 

 Score =  262 bits (670), Expect = 6e-69

 Identities = 136/157 (86%), Positives = 138/157 (87%), Gaps = 1/157 (0%)

 Frame = +2

 

Query: 1   MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 60

           MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV

Sbjct: 62  MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 241

 

Query: 61  SHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNVLRCAPQGIA 120

           SH CDENPDLRLYVVNLGSKIKLRLVDPDLMRD FLNQ YYEKDTFYIGNVLRCAP   +

Sbjct: 242 SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQ*YYEKDTFYIGNVLRCAPLRYS 421

 

Query: 121 FVEG-EQWKKARKMFSQAFHFEYLTSLAPLIEQIASK 156

              G    KK  + F QAFHFEYLTSLAPLIE IA K

Sbjct: 422 ICRGRSNGKKPEQCFPQAFHFEYLTSLAPLIE*IAFK 532

 

 

 Score = 77.4 bits (189), Expect = 3e-13

 Identities = 52/85 (61%), Positives = 57/85 (67%), Gaps = 2/85 (2%)

 Frame = +3

 

Query: 116 PQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLI--EQIASKVFNQAMESSEILANYDP 173

           P GIAFVEGE  +K++            TSL  L    +  SKVFN AMESSEILA+YDP

Sbjct: 408 P*GIAFVEGEAMEKSQNNVFLKLSI-LNTSLL*LH**NK*LSKVFN*AMESSEILASYDP 584

 

Query: 174 LVYSQKITGQVVIATFFGEQVNEKK 198

           LVYS KITG  VIAT FGEQVNEKK

Sbjct: 585 LVYS*KITG*GVIATLFGEQVNEKK 659

 

 

 Score = 37.4 bits (85), Expect = 0.39

 Identities = 35/107 (32%), Positives = 52/107 (48%), Gaps = 5/107 (4%)

 Frame = +1

 

Query: 125 EQWKKARKMFSQAF-----HFEYLTSLAPLIEQIASKVFNQAMESSEILANYDPLVYSQK 179

           +QWKKAR MFS +F     HF    +     ++  +K++  A     ++    PL   ++

Sbjct: 436 KQWKKARTMFSSSFPF*IPHFSSSINRINSFQKSLTKLWKVAKSWPVMI----PLYIHKR 603

 

Query: 180 ITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQYFL 226

              + ++      +  +K  R   LVSALTHMLNLL E  MSL   L

Sbjct: 604 *QDKGLLLPCLENK*MKKS*R-YGLVSALTHMLNLL-ESIMSLYILL 738

 

 

>gi|18196479|gb|BM396441.1|BM396441 5009-0-20-F06.t.1 Chilcoat/Turkewitz cDNA (large fraction)

           Tetrahymena thermophila cDNA.

          Length = 735

 

 Score =  115 bits (289), Expect = 9e-25

 Identities = 61/180 (33%), Positives = 100/180 (55%), Gaps = 1/180 (0%)

 Frame = +3

 

Query: 7   IIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFVSHQCDE 66

           + I+ +     +I PY  + KY K+G G F P +G        +K+Y+DVD+ + H  D+

Sbjct: 120 LAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYDD 299

 

Query: 67  NPDLRLYVVNLGSKIKLRLVDPDLMRDFF-LNQQYYEKDTFYIGNVLRCAPQGIAFVEGE 125

             D ++YV N  +   +++ DP+ +++F  L  + Y+K T  I N++R    GI F EG

Sbjct: 300 GSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG*GIIFSEGP 479

 

Query: 126 QWKKARKMFSQAFHFEYLTSLAPLIEQIASKVFNQAMESSEILANYDPLVYSQKITGQVV 185

           QWKK R + S  FHFE L+   P IE+I  +V+ + ++S  +  N D +    +IT +VV

Sbjct: 480 QWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNV-KNVDVIELF*EITAEVV 656

 

 

&&&&&&&&&&

 

 

>gi|37509509|gb|CF653700.1|CF653700 EST00033 Suppression Subtractive Hybridization Libraries of

           Tetrahymena thermophila Exposed in

           Dichlorodiphenyltrichloroethane (DDT) Tetrahymena

           thermophila cDNA clone DDT-236.

          Length = 505

 

 Score =  109 bits (273), Expect = 6e-23

 Identities = 58/150 (38%), Positives = 89/150 (59%), Gaps = 7/150 (4%)

 Frame = +1

 

Query: 319 MAIWFLTQHPEIKKKLQEELDANT----DYSQNGLLKLPYLNGVIQETQRLYGPAGQLFN 374

           + +W L  HPE+++K++ E+D+      D     L KL Y N   +E+ R+Y  A Q+ 

Sbjct: 1   ICLWVLAQHPELQQKIRAEIDSVIQTFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP 180

 

Query: 375 RVALRDHMLKDIPIKKGTIVKPSPCSVH--RHPKYFEDPHSFKPERWFNKKYCHS-YTFI 431

           RV+ RDHM+ D PI KG  V       +  + P   +D  +F P+R+ +K      ++FI

Sbjct: 181 RVSARDHMVDDFPIPKGAFVSNLTIQYNEQKFPLLCKDIDTFNPDRFLDKNIIQDHFSFI 360

 

Query: 432 PFNAGPRNCIGQHLALVEARIMMYYFMKTF 461

           P++AGPRNCIG HLAL+EA+IM+ Y +K +

Sbjct: 361 PYSAGPRNCIGQHLALIEAKIMIAYILKNY 450 VVLPNEEHQKVRFNHLFL

 

>CF653700.1

ICLWVLAQHPELQQKIRAEIDSVIQTFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP

RVSARDHMVDDFPIPKGAFVSNLTIQYNEQKFPLLCKDIDTFNPDRFLDKNIIQDHFSFI

PYSAGPRNCIGQHLALIEAKIMIAYILKNYVVLPNEEHQKVRFNHLFL

 

 

        1 atttgtcttt gggttttggc ttaacaccct gaactccaac aaaagatcag agctgaaatt

       61 gattctgtta tttagacctt tgacgatctt aagcacgaag ntttgaataa acttgaatac

      121 ttcaatgctt tcttcaaaga atccctcaga gtttatccca ctgcccctca agtcattcct

      181 agagtctctg ctcgtgacca catggtcgat gacttcccta ttcccaaggg tgccttcgta

      241 tcaaacttaa ccatttaata taatgaataa aagttcccac tcctctgcaa agatattgat

      301 accttcaatc ccgatagatt cttagacaag aacatcattc aagatcattt cagttttatc

      361 ccttactcag caggtccccg taattgcatc ggctagcact tggccttaat cgaagctaag

      421 atcatgattg cttacatcct caaaaattac gtagttttac ccaacgaaga acactagaaa

      481 gttagattta accacttatt cttgt