Amphioxus (cephalochordate) ESTs, WGS and HTGS sequences

 

Branchiostoma floridae (many seqs)

Branchiostoma belcheri (1 seq)

 

D. Nelson August 13, 2004, added CYP51 Jan. 19, 2005,

Many new WGS Trace file sequences. modified May 10, 2005

 

To retrieve trace archive files such as AFSA125350.y1, go to

http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?size=1&cmd=retrieve&s=&m=&retrieve=Search&val=TRACE_NAME%3D%27ATUP646033.g1%27&file=trace&gz=on&fasta=on&scfrcf=scf&dopt=fasta

 

and add the accession number into the search window as shown here:

 

TRACE_NAME='ATUP646033.g1'

 

The TRACE_NAME= prefix limits the search to the trace name field.  The quotes around the accession number are required.
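
The query format above can be sketched in code. This is a minimal illustration that only builds the strings; it makes no network request, and the trace.cgi address and its parameters are copied from the URL given above, which may no longer be served in this form. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Base address copied from the retrieval URL quoted in these notes.
TRACE_CGI = "http://www.ncbi.nlm.nih.gov/Traces/trace.cgi"


def trace_query(trace_name):
    """Return the search-window text: field name plus the quoted accession."""
    return f"TRACE_NAME='{trace_name}'"


def trace_url(trace_name):
    """Rebuild the full retrieval URL with the query percent-encoded."""
    params = [
        ("size", "1"), ("cmd", "retrieve"), ("s", ""), ("m", ""),
        ("retrieve", "Search"), ("val", trace_query(trace_name)),
        ("file", "trace"), ("gz", "on"), ("fasta", "on"),
        ("scfrcf", "scf"), ("dopt", "fasta"),
    ]
    return TRACE_CGI + "?" + urlencode(params)


if __name__ == "__main__":
    print(trace_query("AFSA125350.y1"))
    print(trace_url("ATUP646033.g1"))
```

In the encoded URL the equals sign becomes %3D and each quote becomes %27, which is why the address above looks different from the plain search text.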

 

CYP2 clan

 

>AFPZ293133.y1 exon 1 90% to AC150407.1 gene at 44817-45095

ATGN222242.g1, ATGI214615.g1, AFSA852354.g2, AFSA840251.g2

AFSA264268.g2, AFSA229077.g2, ATGN243325.b1, ATGI179003.g1

ATUP200634.x2,

Exon 2 ATUP200634.y2 mate pair to ATUP200634.x2

AFSA601420.g2, AFSA489196.b2, AFSA489196.g2

MESAVSFVSGLLANLTLQSILVLVLAFLVTYWLLGTGDRQKNLPPGPRGLPLLGN

LLSFRPSYLLSNLAAWRDKYGDVFCVRIANRLAVVLNG ()

HKAIQDALVKQPEVFSNRPPPFIDSAKDQG

 

AFSA125350.y1: walk with this one; it overlaps AFSA489196.g2 on the way to exon 3

 

>AFSA820329.b2 new exon 3 AFSA896985.b2 ATUP864470.b1 ATUP552587.x1

AFPZ806277.x1 AFPZ776469.x1 AFPZ908625.y1 ATGI225190.b1 AFPZ278931.y1

AFSA277571.g2

Exon 4 ATGN165786.g1 exon 4 ATUP951310.g1 ATUP682533.g1 ATUP194164.y2

ATGI58079.g1 ATWW119241.g1 ATUP547159.g1 AFPZ278931.x4 AFPZ278931.x1

ASWX119777.b2 (joins exon 3 of AFSA820329.b2)

ASWX119777.g2 exon 2 seq with frameshift mate pair of ASWX119777.b2

AFSA254465.b2 exon 2 seq

ATUP194164.x2 mate pair of ATUP194164.y2, AFPZ526846.y1

HKAIQDALVKQPEVFSNRPPPMLDSAKDQG

GVAMSEYGEDWKVKRRIGLT

ALRQFGMGKRSLEGKITEEARILCDVLAEKNGTATDMSLLLSNAVSNVICAMSF

GERFEHNDMEFQRLMRLMSEMVGGSGGNAGSSISRFIPLVRKLPFFKKGLERRV

KMSLEVVDFIKSKIKEHKETFDPADIRDIIDVYLMETQQQTPDDADRTITEMGMINTMRD

LFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGASPPTLSQRGKLPYTEATILE

IQRIRPIAPLAVPHTTSTATVLHGFDIPADTFVIPNLWSAMMDPAVAPDPETFNPDRFLD

EDGTVVRPEWLIPFSL

GRRQCLGEQLAKMELFLFLTTLLQHFTFKLPDGAPALSMEGSLGIVLAPKAYQICAVPRDN*

 

>AFPZ869380.y1 AFSA270651.b2 (4 aa diffs to ATGN165786.g1) exon 4

GRRQCLGEQLAKMELFLFLATLLQHFTFKLPYGAPAPSMEGSMGIVLAPKAYQICAVPRDN*
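
The "4 aa diffs" count can be checked with a position-by-position comparison. The sketch below assumes the exon 4 peptide at the end of the preceding record is the ATGN165786.g1 translation, and it only handles ungapped sequences of equal length. The function and variable names are illustrative.

```python
def aa_differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) for every mismatch, 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (gaps not handled)")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Exon 4 peptide from the preceding record (assumed to be ATGN165786.g1).
exon4_reference = "GRRQCLGEQLAKMELFLFLTTLLQHFTFKLPDGAPALSMEGSLGIVLAPKAYQICAVPRDN*"
# Exon 4 peptide from AFPZ869380.y1 / AFSA270651.b2.
exon4_variant = "GRRQCLGEQLAKMELFLFLATLLQHFTFKLPYGAPAPSMEGSMGIVLAPKAYQICAVPRDN*"

if __name__ == "__main__":
    for pos, ref, var in aa_differences(exon4_reference, exon4_variant):
        print(f"{ref}{pos}{var}")
```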

 

>ASFW203349.g2 WGS exon 1 94% to AC150407.1 gene at 44817-45095

AFSA277180.g2, AFSA337814.g2, AFSA337814.b2

MESAVSFVSGLLANLTLQSTLVLVLAFLVTYWLLGAGDRQKNLPPGPRGLPLLGNLLSF

RPSHLLSNLAAWRQQYGDVFCVRIANRLAVVLNG

 

>AFPZ80615.b2 ASWX143119.b2 ATUP19565.y2 ATUP590430.x1 ATUP871195.g1

AFPZ617150.x1 ATGI45147.b1 AFPZ676420.b2 ATUP848153.y1(poor seq)

New exon 3 seq

Exon 4 ATGI75778.b1 ATWX106000.g1 ATWX78968.b1 ATUP879942.b1

ATUP25175.g1 AFSA516244.g2 AFSA298646.b2 AFPZ676420.b2(joins with exon 3)

ATGI213326.g1 overlaps ATUP871195.g1 exon 3 and goes 800bp upstream

Has exact match to exon 2 from AC150407.1 first seq.

Note: there are two seqs with exact aa seq but one silent nuc diff

These are probably from different genes.

ATGI213326.g1 AFPZ69924.g2 ASWX145321.g2 ATUP276953.x2 are 100% identical

In the exon 2 region

HKAIQDALVKQPEVFSNRPPAVVDSANDQ

GVVMAQYGEGWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEK

DGAATDISLLLSNGVSNVICSMSFGERFEYNDTEFQRLMRLMSELVTGSAISR

FNPYVRKLPFIKKGVESRMKMAKEITEFIKAKIKEHKDTFDPADIRDIIDVYLMETQQQI

PDDGDRTITEMGMINTMRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDQEFGSS

PPTLSQRGKLPYTEATILEIQRIRPIVPLSVPHTTSAATVLHGFDIPANTFVIPNLWSAM

MDPTVAPDPETFNPDRFLDEDGTHVRPEWFIPFSL (1)

GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPDPYQICAIQRD

 

>AC150407.1 two linked genes 5000-6800 region and 44000-48000 region

AFSA41261.x1 WGS exon 1 96% to AC150407.1 gene at 44817-45095

ATUP33911.b1, ASWX26139.b2, ATUP603318.x1, AWXX12138.b1, AFPZ106727.x1

ATWW205625.g1, ATWW98645.b1, AFSA387158.g2, AFSA192393.g2, ATGN203140.g1

ATUP879942.g1, AFSA748020.g2, ATGN123505.b1, ATUP723083.y1, AFPZ103877.x1

APWS99281.g1, ATUP590430.y1 (mate pair to exon 1 sequence ATUP590430.x1)

ATUP871195.b1 (mate pair to exon 1 sequence ATUP871195.g1)

Note ATGN123505.b1 overlaps with AFSA940316.g2 to join exons 1 and 2

AC150407.1 is missing exons 1 and 2 (sequence gap)

exon 2 WGS seqs = APNK3267.b2, AFPZ279007.x4, AFPZ279007.x1, ATUP615432.y1

ATWX107943.g1, ATGN209312.b1, ATUP704177.b1, AFSA940316.g2

exon 2 from ATUP615432.y1 (overlaps exon 3)

next 8 are all exon 2 different from AFPZ80615.b2 exon 2 at one nucl.

AFPZ279007.x4 AFPZ279007.x1 ATWX107943.g1 AFSA940316.g2

APNK3267.b2 ATUP615432.y1 ATGN209312.b1 ATUP704177.b1

Exon 3 WGS sequences ATUP603318.y1 (mate pair to exon 1 sequence ATUP603318.x1)

ATWW25636.b1, AWYB3861.g1, ATUP183373.b1, ATUP688580.x1, ATUP664516.g1

ASWX26139.g2 (mate pair to exon 1 sequence ASWX26139.b2)

ATUP239851.g1, ATWX22588.b1 ATUP615432.y1 ASFW157059.g2 ATWW157810.g1

AFPZ279007.x1 AFPZ279007.x4 ATUP704177.b1

Exon 4 AFPZ919116.b2 ASWX175598.x1 ATGN296415.g1 ASFW157059.g2

MESAVAFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKNLPPGPRGLPLLGNL

LSFRPSRLLSNLAAWRQQYGDVFCVRIANRLTVVLNG

HKAIQDALVKQPEVFSNRPPAVVDSANDQ

5313 GVVMAQYGEGWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGAATD

ISLLLSNGVSNVICSMSFGERFEYNDTEFQRLMRLMSELVGGSAISRFNPYVRKL

PFIRKGVESRVKMSMEIVEFIKLKIKEHKETFDPADIRDIIDVYLMETQQQTPDDGDR

TITEMGMINTMRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLS

QRGKLPYTEATILEIQRIRPIVPLSVPHTTSTATVLHGFDIPANTFVIPNLWSAMMDPTV

APDPETFNPDRFLDEDGTLVRPEWFIPFSL (1) 6266

6618 GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPNPYQICAVPRDN* 6803
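
Mate pairs in the lists above share a clone identifier and differ only in the read suffix (for example ATUP590430.x1 with ATUP590430.y1, or ASWX26139.b2 with ASWX26139.g2). The sketch below groups names on that basis. The x/y and b/g pairing rule is an assumption inferred from the examples in these notes, not a documented Trace Archive convention, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Clone identifier, one-letter read direction, then a numeric read version.
TRACE_RE = re.compile(r"^([A-Z]+\d+)\.([a-z])(\d+)$")

# Assumed suffix pairing, inferred from the mate pairs named in these notes.
PAIRED_SUFFIX = {"x": "y", "b": "g"}


def find_mate_pairs(names):
    """Return (read, mate) tuples for clones that have both read directions."""
    by_clone = defaultdict(dict)
    for name in names:
        match = TRACE_RE.match(name)
        if not match:
            continue  # skip anything that is not a plain trace name
        clone, direction, _version = match.groups()
        by_clone[clone].setdefault(direction, name)

    pairs = []
    for clone in sorted(by_clone):
        reads = by_clone[clone]
        for first, second in PAIRED_SUFFIX.items():
            if first in reads and second in reads:
                pairs.append((reads[first], reads[second]))
    return pairs


if __name__ == "__main__":
    traces = [
        "ATUP590430.x1", "ATUP871195.g1", "ATUP603318.x1", "ASWX26139.b2",
        "ATUP590430.y1", "ATUP871195.b1", "ATUP603318.y1", "ASWX26139.g2",
        "AFSA41261.x1",
    ]
    for read, mate in find_mate_pairs(traces):
        print(read, "<->", mate)
```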

 

>AC150407.1 two linked genes 5000-6800 region and 44800-48000 region

WGS exon 1 = AFPZ47185.g2, APWS114129.b1, AFSA346565.b2, APNK34495.b3

WGS exon 2 = ATUP592730.y1

ATWX46513.g1, ATGN30351.b1, ATGN26143.g1, ATGN270605.g1, ATUP890454.x1

APWS24002.b1, ATGI65252.g1, ATGN86876.g1, ATWW134155.g1, AFPZ623416.g2

Exon3 ATGI94162.b1 ASWX162986.g2 AFPZ47185.b2 AFPZ855725.x1 AFPZ929344.x1

ASWX162986.b2 ATGI65252.g1

Exon 4 ASWX162986.b2 ATUP592730.x3 ATUP592730.x1 AFSA165640.b2 AFSA319547.b2

ATWW189487.g1

44817 MESVVPFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKKPPPGPRGLPLLGNL

      LSFRPSRLLSNLAAWRQQYGDVFCVRIANRLAVVLNG 45095 (2)

46004 HKAIQDALVKQPEVFSDRPSPFRFSDKDQ 46090 (1)

46542 GVVMAQYGESWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGA

      AMDVSLLLSNGVSNVICSMSFGERFEYNDEEFQRLMRLMSELVSAGGISRFIPLVRKLPF

      LNEGSKNRAKMSMEIVEFIKVKIKEHKETFDPADIRDIIDVYLMETQQQTPDDVDR

      TITEMGMIGTVRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLS

      QRGKLPYTEATILEIQRIRPIVPLSVPHTTSAATVLHGFDIPANTFVIPNLWSAMMDSAV

      APDPETFNPDRFLDEDGTLVRPEWFIPFSL (1) 47495

47838 GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPEGAPAPSMDGSLGVVLAPKPYQICGVPR* 48017

 

>AC150409.1   Branchiostoma floridae very similar to above clone

45% to 2U1 fugu

AFPZ598379.y1 AFPZ177295.y01 AFPZ177295.y1 ATUP642958.x1

ATGI278297.g1 ATUP598704.y1 ATUP909233.y1

AFSA820152.g2

exon 3 with some frameshifts

97% to AC150407.1 46000 might be an allele

exon 3,4 ATWW157810.g1

exon 4 ATUP610859.x1 ATWX26371.g1 ATUP919332.x1 ATWW121802.b1 ATGI269677.g1

ASFW189763.g2 ASFW131684.g2 AFSA849352.b2 AFSA173661.g2 AFPZ99561.y1

10750 MESAVAFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKNLPPGPRGLPLLGNLLSFR 10571

10570 PSRLLSNLAAWRQQYGDVFCVRIANRLTVVLNG 10472 (2)

8793 HKAIQDALVKQPEVFSDRPSPFRFSDKDQ 8707 (1)

8256 GVVMAQYGESWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGTATDIS 8083

8082 LLLSNGVSNVICSMSFGERFEYNDAEFQRLMRLMSELVSAGGISRFIPLVRKLPFFNEGS 7903

7902 KNRAKMSMEIV 7870

EFIKAKIKEHKETFDPADIRDIIDVYLMETQQQTPDDVDRTITEMGMIGTVRDLFI

AGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLSQRGKLPYTEATILEIQR

IRPIVPLSVPHTTSAATVLHDFDIPANTFVIPNLWSAMMDPTVAPDPETFNPDRFLDEDGTLV

RPEWFIPFSL (1)

GRRQCLGEQLAKMELFTFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPKPYQICAVPRDN*

 

>50% to CYP2U1 zebrafish ASFW117295.g2 AFSA812739.g2 ASFW150452.g2 New seq

TTWKGGVFFLPRALKPRRGRPKVREEIAREFASPVPPWSERERLPYTEATIMEIQGIRPIVPLNIFHGN

TSATTLYGYDIPAGTYIIPSLWSAMMDPKVTPEPEEFRPERFLDDEGKVVKPEWFLPFSA (1)

GRRRCLGEQLAKMELFLFYTSLLQHFTFKLPDGAPAPPMDGSLGFVLSPPAYDICAVPRHSS*

 

>62% to 2U1 zebrafish

AFSA220461.b2 ATWX77582.b1 APWS173478.g1 new C-term exon

GRRICLGEQLAKMELFLFLTSLLQQFTFKLPEGAPKPDMCGEIGATLLPKPYNIQAISRKK*

 

>DE198043.1 genomic survey sequence. NEW 1/6/06

          Length=653 43% to 2U1 Fugu

VQTTVRAELDRVLMRGESVSAAHRRALPVTEATVMEILRLATPSPLNFRATACDVTLRG  476

YRLPEGTWTLMNCWAVHRDPLQWTEPDTFDFTRFLDREGRVTTPPAWRPFGIGTRS  308

 

>DE197854.1 genomic survey sequence. NEW 1/6/06

DE000432

          Length=702 51% to 2U1

71   SVLHRYIIPKDTIVFAGQWSVHHDPELFPEPDMFDPERFLDDEGNFKNIEYFMPFSM (1) 241

375  GPRSCMGQPLAEVQLFLLFTNLMQNFKLKLPEGAAKPSSEGVMGITLAPKPFDLV  539

 

>DE017611.1 genomic survey sequence. NEW 1/6/06

          Length=625 46% to 2K11

GVLFAAYGPDWKHQRKFALMTLRDFGVGKRSLEGKIR  373

EEADALIQEVESKNGLPFDIKQMLPNAVSNVICSIAFGNRFEYGDPEFLRLIGLLNAAVE  553

AQPSRDILPNIHPVFRRLPFGS  619

 

>DE195161.1 genomic survey sequence. NEW 1/6/06

67% to 2N11

LFLAGTDTTSTTLRWALLYMILHPDIQEKVQQEIDSVLGPNQEPEMAHR  322

 

>DE189345.1 genomic survey sequence. NEW 1/6/06

61% to 2N11

LFLAGTDTTATTLHWAVLFMILHPDIQQKVQQEIDSVLGPNQDPSMEHR

 

>DE013036.1 genomic survey sequence. NEW 1/6/06

          Length=592 44% to 2Z2

VIYDLFFTGAETSSTCLRWAVFLMAVYPDVQARIHREVDTVLGSDGEVTLDKRAALPFL  386

DATISEVYYLNS  350

 

>DE012415.1 genomic survey sequence. NEW 1/6/06

58% to 2N11

AGTDTTATTLHWAVLFMILHPDIQQKVQQEIDSVLGPNQDPSMEHR

 

>CF918864 BI377274 Amphioxus 5-6 hrs cDNA 45% to 2U1 fugu

RVRRDATVSLAHRPEMPYTDAFLHEVLRIRPPGPLSVPHMAGPGATLNGYEIPQNTQVYA

NLWSLHMDPEYWPEPERFDPTRFIGPDGKVLPNPPSYAPFSLGRRACPGKQLAKSEAFLF

LVTMVQRFSFKLPEGAPVPPMDGVMGFSLAAQPHSLCAISRN*

 

>CF918826 BI383662 Amphioxus 5-6 hrs cDNA 51% to 2U1 fugu

AAESGTRPDYIIPQDAMIFVNLWSVHMDPQLFPDPNTFRPERFLDQDGNFVKQAVIPFGI

GPRVCLGEQLAKMEVFMLFVSLMQRFTFHLPEGAPEPSMLGKLASAINVPCPFELCAVAR*

 

>BI387982 Amphioxus 26hr cDNA library 48% to 2N2 zebrafish

NGKPVPKPAALMPFSA

GRRACPGEAVXKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDDKTGGDTCIPYPYKVVMSCRKCML*

 

>BI388387.1 C-helix to mid

EGANYSDGCXGVIFAPYGSFWKEQRKFTLMSLRDFGFGNRSIYGKIVEESQVLQSVIAKF

DGQPFSTHRLLHNAVANVTCNILFGDRWEYDDPLFQRMMDALNYMVSTNVFAVPQNFIPF

TRYIPGWAGRLEPWLKKFLSIMGYLREELDKHKVIFDPTDLRDFINTYLLEIQNQ

 

>BI387848 Amphioxus 26hr cDNA library

52% to 2U1 FUGU, 50% to 2U1 mouse 75% to BI377261

AFSA235046.g2  ATGI91479.g1  ATUP593811.y1 AFSA108094.b2

RHASDLLLDGTETTGNTLLWALLYMTQNPTIQHK (0)

VQQELDAVVGESQPTLSHRSQLPYVNACLLETMRIRTLVPLAVPHATTQDVTIQEFDIPQGTQ (0)

VLPNLYSLHMDPTYWPDPDRFDPERFLDAEGNVINKPQSFMPFGG (1)

GRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPKTDGGLGITWTP

 

>AFPZ7602.y1 ATGI55268.b1 

VILNLYSLHVDPTYWPDPERFDPERFLDAEGNVINKPESFMPFAG (1)

 

>ASWX176511.y1  87% to BI377261

ATUP829661.g2 AFPZ642936.g2  ATUP921353.y1  ATWW61130.b1

VLTNLHSLHMDPAYWPDPDRFDPERFLDAEGKVINKPKSFMPFSG (1)

 

>ATUP598105.y1  ATGN136393.b1  ATUP193767.y2 ATGI126577.b1

ATWW83807.g1 

VHEELDAVVGESLPTLSHRSLLPYVNACLQEVMRIRPVGPLAIPHATTEAVRVRGYDIPKRTQ (0)

VLLNLYSLHMDPAYWPDPDRCDPERFLDAEGNVINKPESFMPFGG (1)

GRVCLGEQLARIELFLFFSTLLQSFTFKTPEGAPPPNADGILGLTLAPHPFQLCAIPR*

 

>ATUP937768.y3  ATUP937768.y1  ATUP905825.y1  ATWW1274.b2

AFSA664077.g2 

VLFNLYSLHMDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)

GRRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPNTDGIFRLTLKP HPFQLCAIRR*

 

>AFSA690405.b2 exon 1 and part of exon 2 89% to AFSA636542.b2

walk to ATWW106344.b1   APWS97989.g1

note: this is probably a poor version of APWS97989.g1

downstream the sequences are the same

MAILFSWIVESVLEILQISGLTLQTILVFCVPFLLACTF*KRPRNLPXYPAGRVPVLGH 849

LLALGRAPHLKLTXWRRQYGDVFTVRMGMEDVVVLNGYTAVRDALVDRSELFASRPPNYL 669

FDLTVGFGE ()

DIVTARWGSQFX QRRRL

 

>ATUP47463.b1 exon 1,2 87% to AFSA636542.b2

MAAVVSWISESVQEIPQISGLTLQTCLVFXAAFLLTCALXRRPRNLPPYPAGHVPVLG 791

HLLALGRAPLLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNY 611

LFDSSVGFGK ()

DIGAARWGTGLKQRRRFATAALKHLGMKVGTGSVEDNIRQEASCLRKR (0)

 

>ATGI68302.b1 exon 1 82% to ASWX66916.b2

MAVIVSWIAELVWEIFQISGLTIQTFLVFCVVFLLAYVLLKRHKNLPPYPAGRVPVLGHL 326

LALGREPPLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNYLL 506

DAIVGCGK ()

 

>AFPZ28428.y1 exon 1 79% to AFSA636542.b2

MATAVFRWIIQSVQDTLQIYGLNLQSLLVFCTAFVLACALLKRSPNLPPYPAGRVPVLG 304

HLLALGRAPHLKLTAWRRQYGDVFTVRMGMEDAVVLNGYTAVKDALVDRSELFASRPPNY 484

LFDLTVDSGK ()

 

>ASWX66916.b2 exon 1 89% to AFSA636542.b2

walked to AWXX13027.b1 ATUP266482.b1

mate pair = ASWX66916.g2 exon 3

AFSA35511.g2 exons 2,3 ATGN165304.g1 ASFW57081.b2

walked upstream to ATUP266482.b1 which = ASWX66916.b2 join seqs.

MAAVVSWIAESVLEILPMSGPTLQTFLVFCVAFLLTWALLRRPRNLPPYPAGRVPVLG 489

HLLALGRAPHLQLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNY 669

LFDSSVGFGK ()

DIGAARWGTELKQRRRFATAALKHLGMKVGTGSVEDNIRQEASCLRNR  (0)

IAEYHGQPFGISNDMKVAVANVICSMAFGRRYGYEDETFRELSEAIRNLLAEIGSGQFISVFPLLRFVPG

 

>ATWX43498.b1 exons 2,3 very similar to AFSA35511.g2

walked to ATUP237571.b1 no obvious exon kept walking to ASWX77262.g2 = exon 4

ASFW107932.b3 exon 4 walked to AFPZ187653.x1 exon 4,5 ASWX45971.b2

DIGAAPLGDRVEAEKRFATAALKHLGMKVGTGSVEDNIRQEASCLRKR (0)

IAEYHGQPFAISNDMKVAVANVICSMAFGRRYGYEDETFRELSEAIRNLLAEIGSGQFISVFPLLRFVPG ()

ACKEVLKHLSKIHEVLWDEIARHRENFDRENPRDFLDFCLLELEQREK

VEGLTEENVLYMAQNLFLAGTDTTANTLLWSLLYMTLNPDIQNK (0)

VHEELDA

 

>AFPZ601018.b2  ATGN133651.b1 ATGN143242.b1 

walked to ATUP71680.y1 (exon 5)

walked to ATUP705359.g2 (exon 4)

walked to ATGI77993.b1 (exon 3)

walked to AFPZ24940.g2 (exon 2)

walked from exon 7 downstream to AFSA524984.b2 to try to find a mate pair

to exon 1; this did not work

tried finding more exon 3,4 hits to look for more mate pairs

ATUP206044.x2 mate pair = ATUP206044.y2 = exon 1

MTGAVQWIADSVQEILQISELTLQTFLVLCSTFLLACVVFNRSRSRNL

PPYPAGRVPVLGHLLALGRAPLLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVQDALVD

RSELFASRSPFYYLFDALFAFGK (1)

DIISARWGSGFRQKKRFATTVLKNLGMRVGRGSIEDSIREEASCLRNR (0)

IAENNGQPFDIAHDVAVAVANIICSMAFGKRYDYEDETFRELTKAIATISIELGAGHIT SVFPLLRFVPV (1)

VLYNHSHLYATVNRPIIKALEASSKVKNVMREEISRHREHLDRENPRDFLDLCLLELEQQE

KVEGLTEENVFHMAQDLFLGGTDTTANTLTWSLLYMTLNPDVQNK (0)

VHEELDAVVGESLPALSHRSQLPYVNACLLETMRIRTIVPLASHATTQEVKVQGYDIPKGTQ (0)

LMLTSPHMDPANWPDPDPFDPERFLDAEGNVIKKPESFMPFSG (1)

GRRVCLGEQLARMELFLFFSTLLQSFTFKTPVGAPPPNTDGIPGLTFMPHPFQLLAIER*

 

>APWS97989.g1 exon 1 93% to AFSA636542.b2

ATUP196459.x2 AFSA321451.g2 ASFW202410.g2

Walked upstream to ATGI153668.g1 AFPZ313895.x1 mate pair AFPZ313895.y1

to try to find a mate pair in the C-term part

AFPZ313895.x1 mate pair AFPZ313895.y1 = AFSA636542.b2 seq exon 3

These two seqs are 95% identical

APWS97989.g1 exon 4 end of exon 4 = BI377261 join seqs.

BI377261 Amphioxus 5-6 hrs cDNA 49% to 2U1 fugu 75% to BI387848

AFPZ459499.y1 ATUP541153.g1 AFSA636542.b2

MAIIVSWIVESVLEILQISGLTLQTILVFCVAFLLACTFWKRPRNLPPYPAGRVPVLGH 585

LLALGRAPHLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVRDALVDRSELFASRPPNYL 405

FDLTVGFGE ()

Missing exon 2

VAEYEGKPIDIAHGINVAVANVICSMTFGKRYDYEDETFRELSEAVVTIMSELGAGQIIS VFPLLRFVPG (1)

ASYSVSAQLAKIQKVLREEMSRHREHLDRENPRDFLDFCLLELEQQEKVAGLTEENVLYMAQ

NLFFAGTDTTTNTLRWSLLYMALNPDIQKK (0)

VQEELDAIVGESLPTLSHRSQLPYVNACLLETMRIRHIGPLAVPHATTDTVKVKEYDIAKGTQ (0)

VLPNLHSLHMDPAYWPDPERFDPERFLDAEGNVINKPESFMPFSG (1)

GRRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPSTDGVF GVTLTPHPFQLCAIPR*

 

>AFSA636542.b2 ATUP541153.g1 ATUP933964.y1  ATUP933964.x1 ATUP738986.y1

ATGI10244.g1 ATGN171873.g1  ATUP926693.b1 (exon 6) AFSA482736.g2 (exon 6)

AFSA726698.g2 (exon 6)

34% to 2N1 35% to 2D4

MAVIVSWIVESVLEILQISGLTLQTILVFCVAFLIACTFLLK

RPRNLPPFPAGRVPVLGHLLALGRAPHLTLTAWRRQYGDVFTVRMGMEDVVVLNGYTAV

KDALVDMSELFASRPPNYLFDLTVGFGE  (1)

DIVTARWGSKFRQRRRFATTALRNLGMKVGTGSIEEKIREEAIRLRNR (0)

VAEYEGKPIDIAHGINVAVANVICSMTFGKRYDYEDETFRELSEAVVTI

MSELGAGQIISVFPLLRFVPG (1)

ASYSVSGQLAKIQKVLREEMSRHREHLDHENPRDFL

DFCLLELELQEKVAGLTEENVLYMTQNLFFGGTDTTTNTLLWSLLYMILNPDIQKK (0)

AQEELDAVVGESLPTLSHRSQLHYVNACLLEVMRIRH

IGPLAVPHATTDTVKVKEYDIAKGTQ (0)

VLPNLHSLHMDPAYWPDPDRFDPVRFLDAE GNVINKPESFMPFSG (1)

GRRVCLGEQLARMELVLFFSTLLQSFTFKTPEGAPPPSTDGIFGITLTPHPFQLCAIPR

 

>exon 1 ATGN171873.g1 APWS102929.b1 ATGI10244.g1

ATGGCTGTAATTGTCAGCTGGATAGTTGAGTC

CGTCCTGGAGATTTTGCAGATCTCCGGGCTGACTCTGCAAACAATTCTCGTCTTCTGTGT

GGCCTTCCTCATTGCGTGCACGTTCTTGTTAAAGCGCCCCAGGAACCTGCCACCTTTCCC

GGCAGGACGCGTGCCTGTTCTCGGGCACCTCCTCGCCTTGGGCCGAGCGCCTCACCTCAC

GCTGACGGCGTGGAGGCGGCAGTACGGGGACGTCTTCACCGTCAGGATGGGGATGGAAGA

TGTGGTGGTTCTGAACGGCTACACTGCCGTCAAGGATGCGCTCGTGGACATGTCCGAGCT

GTTCGCGTCCAGGCCGCCAAACTACCTGTTCGATTTGACAGTTGGATTCGGAGAAGGT (1)

 

>DIVT exon 2 ATGI10244.g1

AGACATTGTTACTGCACGTTGGG

GGAGCAAGTTCAGACAGAGACGGAGGTTTGCTACCACGGCGTTAAGGAACCTCGGCATGA

AGGTCGGCACTGGCAGCATTGAAGAGAAAATCCGAGAGGAAGCTATACGTCTCCGCAACA

GGGT
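
The exon 2 nucleotide sequence above can be checked against the DIVT... peptide of AFSA636542.b2 by translation. The sketch assumes that the leading AG and trailing GT are intron splice-site flanks rather than coding sequence, and that the "(1)" after exon 1 means one base (the final G before the GT of exon 1) carries over into the first codon of exon 2. Both are readings of the notation in these notes; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_exon(exon_with_flanks, carry_in=""):
    """Translate an exon written as AG + coding + GT.

    carry_in holds bases left over from the previous exon (intron phase).
    Returns (protein, leftover_bases); unknown codons translate to X.
    """
    seq = exon_with_flanks.upper()
    if not (seq.startswith("AG") and seq.endswith("GT")):
        raise ValueError("expected AG ... GT intron flanks")
    coding = carry_in + seq[2:-2]
    usable = len(coding) - len(coding) % 3
    protein = "".join(
        CODON_TABLE.get(coding[i:i + 3], "X") for i in range(0, usable, 3)
    )
    return protein, coding[usable:]


# Exon 2 of ATGI10244.g1 as listed above, joined across lines.
EXON2 = (
    "AGACATTGTTACTGCACGTTGGG"
    "GGAGCAAGTTCAGACAGAGACGGAGGTTTGCTACCACGGCGTTAAGGAACCTCGGCATGA"
    "AGGTCGGCACTGGCAGCATTGAAGAGAAAATCCGAGAGGAAGCTATACGTCTCCGCAACA"
    "GGGT"
)

if __name__ == "__main__":
    protein, leftover = translate_exon(EXON2, carry_in="G")
    print(protein)
    print("bases carried to next exon:", len(leftover))
```

An empty leftover corresponds to the "(0)" written after this exon's peptide in the record above.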

 

>VAE exon 3 ATUP738986.y1

AGGTTGCAGAATACGAGGGAAAAC

CTATTGATATCGCCCATGGTATCAACGTGGCGGTCGCGAACGTCATCTGCTCCATGACGT

TCGGAAAGCGCTACGACTACGAGGATGAAACGTTCCGGGAGCTCTCTGAGGCGGTTGTGA

CAATAATGTCTGAGCTTGGAGCGGGGCAGATTATCAGTGTCTTCCCCCTGTTACGGTTTG

TTCCAGGAGGT

 

>ASYS exon 4

AGCCAGCTACAGTGTATCTGGACAACTGGCGAAGATCCAAAAGGT

GTTGAGGGAAGAAATGTCTCGCCATCGAGAACACCTGGATCACGAGAACCCACGAGACTT

CCTCGACTTCTGCCTGCTGGAGCTGGAACTGCAGGAAAAGGTGGCTGGTCTGACGGAAGA

GAACGTCCTGTATATGACACAGAACCTTTTCTTCGGTGGAACAGACACGACCACCAACAC

ATTGCTGTGGAGTCTACTCTACATGATTTTGAACCCAGACATCCAAAAGAAGGT

 

>AQEEL exon 5

AGGCACAAGAGGAGCTTGATGCCGTTGTTGGTGAGAGTCTGCCCACCCTGTCCC

ACCGTTCCCAGCTGCACTACGTGAACGCCTGCCTGTTGGAGGTCATGAGGATCCGCCATA

TCGGGCCTCTTGCCGTTCCCCACGCCACCACAGACACGGTCAAAGTGAAGGAGTACGACA

TCGCTAAGGGAACCCAGGT

 

>VLP exon 6 AFSA726698.g2 ATUP926693.b1 AFSA482736.g2

ATUP933964.y1

AGGTACTACCGAA CTTGCACTCCCTCCACATGGACCCCGNCTACTGGCTTGATCCGGACC

GTTTTGACCCCGTAAGATTCCTGGACGCGGAA

GGGAACGTCATCAACAAGCCTGAGTCCTTCATGCCTTTTTCTGGAGGT

 

>GRR exon 7 ATUP933964.x1

AGGCCGACGTGTGTGTCTTGGTGAGCAGCTGGCCAGGATGGAACTTGTCCTG

TTCTTCTCGACTCTACTGCAGTCCTTCACCTTCAAGACGCCAGAGGGCGCCCCTC

CTCCAAGCACTGACGGCATCTTTGGGATAACATTGACACCGCATCCGTTCCAGCTTTGTG

CAATACCACGTTAG

 

Other closely related exons

 

>ATUP699472.x1 exons 6,7

VLLNVYSLHMDPAYWLDPDRFDPERFLDAEGKVINKPESFLPFGG (1)

GGRVCLGEQLARMELFLFFTTLLQSFTFKPPEGASPPNADGILGLTLAPHPFQLSAIPR*

 

>AFPZ728456.y1 exons 5,6

VHEELDAVVGESLPTLSHRSQLPYVNACLQEVMRIRPVGPLAIPHATTEAVKVRGYDIPKRTQ (0)

VLLNLYSLHMDPAYWPDPDRFDPERFLDAEGKVINKPDSFLPFGG (1)

 

>AFPZ476483.b2 exons 5,6

VHEELDAVVGESLPTLSHRSQLPYVNAC

LQEVMRIRPVGPLAIPHATTEAVKVRGYDIPKRTQ (0)

VLLNLYSLHMDPAYWPDPDGFDPEXFLDAEGKVXHKPES

 

>exons 5,6

AFSA16336.x4  AFSA16336.x1  AFPZ506410.x1 APNK80508.g2  ASWX68286.g3 

ATUP343092.y1 ATGN182700.g1 ATUP756295.y1 AFPZ866552.y1 ATUP443435.g1

ASFW36405.b2  AFSA625448.b2 AFSA427303.b2 AFSA716480.g2 AFPZ471003.x1

VQQELDAVMGASLPSLSHRSKLPYVNACLMETMRIRTLLSVILHATAQEVKVQGYDIPKGTR (0)

VLMNMHSLHMDPAYWPDPDRFDPERFLDAEGNVINKLPSFMPFSG (1)

AGGTACAGCAGGAGCTTGATGCC

GTTATGGGCGCGAGTCTGCCCAGCCTGTCCCACCGCTCCAAGCTGCCCTACGTGAACGCC

TGCCTGATGGAGACCATGCGGATCCGCACTCTTCTGTCTGTCATCCTTCACGCCACCGCG

CAGGAGGTCAAAGTGCAGGGATACGACATTCCTAAGGGAACTCGGGT

AGGTGTTGATGAACATGC

ACTCCCTCCACATGGACCCCGCCTACTGGCCTGACCCGGACCGGTTTGACCCCGAAAGGT

TTCTGGACGCGGAAGGGAACGTCATCAACAAACTTCCATCCTTCATGCCTTTTTCAGGAGGT

 

>ATGI42736.b1   ATGN217089.g1  ATUP49594.g2   AFSA786188.b2 

AFSA126109.g2  AFPZ657783.y1 ATUP738387.x1

AFPZ495923.y1  dup. exon 5 (pseudogene) exon 6 and part of 7

VHEELDAVVGASLPALSDRSQLL

YVNACLLETMRIRTLVPVSLPH

VQQELDAVVGASLPALSHRSQLPYVNACLMETMRIRTLLSVILHATAQEVKVQGYDISKGTR (0)

VLMNMHSLHMDPAYWPDPDRFDPERFLDAEGNVINKLPAFMPFSG (1)

GHRVCLGEQLARMELFLFFSTLLQSFTIKTPEGAPPPNTDGIFGLALKPHPFQLCAIPR*

AGGTGTTGATGAACATGC

ACTCCCTCCACATGGACCCCGCCTACTGGCCTGACCCGGACCGGTTTGACCCCGAAAGGT

TTCTGGACGCGGAAGGGAACGTCATCAACAAACTTCCAGCCTTCATGCCTTTTTCAGGAGGT

 

>AFSA241515.g2  AFPZ140710.y1  APWS45577.b1  ASWX65492.b2  

ATUP320554.x1  ATGN323264.g1  ATGN284296.b1  ATGI170827.b1 

ATUP12995.x2   ATUP716729.x1  AFSA152443.b2

VLMNMYSLHMDPVYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)

GRRVCLGEQLARMELFLFFSTLLQSFHFKTPEGAPAPCADGIFRMTVTPHPFELCAIPV*

 

>AFSA83521.b2

VLMNMYSIHMDPVYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)

GRRVCLGEQLARMELFLFFSDLLQSFTFKTPEGAPAPCADGIFPMTLTPXPFELCAIPR*

 

>APWS92234.g2  ATWX24634.b1  ATGN357284.b1  ATUP895861.x1  ATGI42736.g1

ATGI104268.g2  ATWW86466.b1  ATWW117588.b1  ATUP559927.b1  AFPZ509619.y1

ATGN267193.b1

IHEELDAVVGESLPALSHRPQLPYVNACLLETLRIRTLV XXXXHATTQDVKVQQFDIPKGTQ (0)

VLPNLHSLHTDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFSG (1)

 

>ATUP710771.b1 1 aa diff to APWS92234.g2

AFSA525510.b2

VLPNLHSLHTDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)

 

>AFPZ185379.x1 ATGI56471.b1  ATGN311806.g1  ATUP672818.b1 AFSA909926.b2

AFSA523356.b2  AFSA330395.g2 AFSA330395.b2

ATGI93799.g1 APWS110835.b1 ASWX76430.g2 

GTQCKLHACRSTLEDPLEQQAKLSSLTEENVLHMAGDLFLAGTETTTNTLQWSLLYMTLNPDIQNK (0)

VQEELDAVVAESLPTLSHRSQLPYVNACLLEVMRIRTLIPAVRHVTTQEVKVQEYHISMGTW (0)

VLANLHSLHTDPAYWPDPDRFDPERFLDAEGNVINNPKSFMPFGG (1)

GRRACLGEQLARMELFLFFSTLLQSFTFTTPEGALPPNTDGVFGLTLVPHPFQLCATPR*

 

>ATUP912010.y1 ATGI128166.b1  ASFW164761.b2  AFSA840082.g2

AFSA174046.g2  AFSA315286.g2  AFPZ159110.y1  ATWW201417.g1

AFSA778163.g2 

VLVNLHSLHMDPVYWPDPDRFDPERFLDAEGNVVNKPQSFMPFAG

 

>ATUP680104.g1

VILNLHSVHMDPAFWPDPDRFDPDRFLDAEGNFINKPESFMPFSA (1)

 

>ATWW125683.g1

VHEELDAVVGASLPTLAHRSQLPYVNAFLMEVMRIRYVGPLGVPHATTAAVKVQEYDIPEGTQ (0)

IILNLHSVHMDPAFWPDPDRFDPDRFLDAEGNFINKPESFIPFSA

 

>APWS102434.g1 ATWW177217.g1

KVQGYDIPKGXX

VLMNLYSLHMDPAYWPDPDRFDPERFLDAEGNLINKPESFMPFG (1)

 

>ATWW233361.g1 ATUP551452.y1

VHEELDAVVGESLPALSHRSQLPYVNACLMEIMRIRYVGPLSVPHATTAPVKVQEYDIPKGTQ (0)

VIVNLHSLHVDPAYWPDPDRFDPDRFLDAEGNFINKPESFMPFS

 

>AFPZ859823.x1 AWYB2850.g1 ATWW63772.b1 AFPZ870007.y4 AFPZ870007.y1

mate pair AFPZ859823.y1 = exon 7 ATUP557464.y1

ATUP557464.y1 ASFW50972.b2 AFSA305932.b2  AFPZ122560.y1  ATUP820771.x1

Almost 51% to CYP2U1 human

VWTKIQFSNIPLLITIVSGKLVTRFLFPVLFLPLVNR ???? uncertain

AFMEVLKQNSRVHEVLWDEIARHRETFDSENPRDFIDFCLLELEQQE

KVDGLTEENVMYMAQDLFFAGTETATNTLLWSLLYMTLNPGVQQK (0)

VHEELDTVVGASLPTLSHRSRLPYANACLMETMRIRHIAPLIIPHATTDTVRVQEYDIPEGTQ (0)

VLMNMYSLHMDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)

GRRVCLGEQLARMELFLFFSTLLQSFSFKTPEGAPAPCADGIFRMTLTPHPFELCAIPR*

 

>ATWW225973.b1  CYP2U like ATUP29908.b1 ATUP481728.g1 APNK56784.b2

AFSA784029.g2 ATUP411423.g1

FLDSDGKVVTRPESFMPFST (1)

GRRVCLGEQLAKMELFLLFSSLLKHFTLKLPEGAAAPSTDGIMGFFYVPPKVNMCITKR*

 

CYP17/CYP1 like

>ATGI187647.g1

MWLMTITVGLVTLILVKWLKDYVQRWRMPPGPFFWPVIGNLSCKYRGS (0)

 

>ATGI151113.b1  ATWW15542.g1

SYLTFIDLAKTYGDVFSLKMGMTDVVVLNSLDAVKEAFVKKGEDFAGRPKMT (1)

 

>AFPZ866519.x1   66% to C-helix of CYP17A2

TDISSEGGKDIAFADYSPTWKLHRKLFHSAIR (2)

 

>ATGI157309.b1 ATGN157240.b1 ATUP362994.b1 AFPZ866519.x1

GYASAQNLQSKVHESLEDTIAVFSKMEGQAVDLEDYIYQLVYNVICSAAFGTR (2)

 

>AFPZ866519.y1

YNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSS (1)

 

>AFPZ295620.x1

SVTANRKMTHQLMEIMQRHLEQHRESFDP (1)

IPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDIDSLTDTHLRQLIGDISF (1)

 

>ASWX154218.b2 I-helix to EXXR region ATGN97768.g1 AFSA255326.b2

47% to CYP1A7

AGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPELSDREATPYTEAIFHEVMRMASMDPV

SLPHATTVDTTLS ()

GYQIPKGTWILPNLWALHHDPDTWGDPDVFRP

 

>AFPZ295620.y1 ASWX154218.g2 (very end + downstream seq)

DVFRPERFLDESGKPIPKPAALMPFG (2)

VGRRACPGEALGKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDEIGQ

GSISIPYPYNVVMTCRK*

 

35% to Xenopus CYP17 and 36% to CYP1A6 and CYP1A7

MWLMTITVGLVTLILVKWLKDYVQRWRMPPGPFFWPVIGNLSCKYRGS (0)

SYLTFIDLAKTYGDVFSLKMGMTDVVVLNSLDAVKEAFVKKGEDFAGRPKMT (1)

TDISSEGGKDIAFADYSPTWKLHRKLFHSAIR (2)

GYASAQNLQSKVHESLEDTIAVFSKMEGQAVDLEDYIYQLVYNVICSAAFGTR (2)

YNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSS (1)

SVTANRKMTHQLMEIMQRHLEQHRESFDP (1)

IPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDIDSLTDTHLRQLIGDISF (1)

AGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPELSDREAT

PYTEAIFHEVMRMASMDPVSLPHATTVDTTLS ()

GYQIPKGTWILPNLWALHHDPDTWGDPDVFRPERFLDESGKPIPKPAALMPFG (2)

VGRRACPGEALGKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDEIGQGSISIPYPYNVVMTCRK*

 

>DE040433.1 Amphioxus genomic survey sequence. No introns, NEW 1/6/06

41% to 1B1 Danio, 40% to CYP1C1 fugu, 39% TO 1A1 HUMAN, 39% to Xenopus 1A6, 1A7

trace file 630869645 632546376 539391436

MAAVATAALFGLSYLQVVLIAVLLVLVAAVVASSLRQNTPSLPPGPWGF

PVVGIFPALGSRPHHAFSRMAEKYGDVFRVKFGSRT

VIILNGIDMVKDACVKQSACFAGRPALYSFKQVKNGITFKTYSPSWVARKKVTVGALKGF

VNGRVGALTASAETMITEEAQELARVFLSKSGQPSNPEEYAHTAVANVVCALCFGKRYEH

GDQEFRQLLRNTEKFRQAIGAGNPADFMPWLRFFPNKNMKLFKEAMESSTQLFDKHINAH

LQTYDPSVIRDIADALIYNMRENKEAGLTDEFVLECVIDIFGAGQDTTSQMLHWAFLY

MLVFPDVQARVQREIDGVVGRERAPTLADEASLPYTVAVIQEIVRHTGVVPMSIPHLTTK

DTQLHGYTLPKDTIVFANLFSVGHDRRIWGDPSSFRPERFLDPSGTTLDPAAVEKNLPFS

AGKRRCPGEHLAKQEMFLFFSILLQQCSFERVNGTASPTLEGTFGLVMRPQPYSMIVRPR*

 

>gi|62381799|gb|DN791732.1| 90857715 Sea Urchin primary mesenchyme cell cDNA library

           Strongylocentrotus purpuratus cDNA clone PMCSPR2-126F11

           5', mRNA sequence.

          Length = 983

 

 Score =  259 bits (662), Expect = 5e-68

 Identities = 124/306 (40%), Positives = 189/306 (61%)

 Frame = +3

 

Query: 172 LVYNVICSAAFGTRYNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSSSVTANR 231

           ++YNV+    FG  Y ++D +    M ++ D  +  G GL AD++   +++P+S     +

Sbjct: 84  IMYNVLAHLCFGLSYELEDPNVTQWMDVNNDVNDKLGLGLAADIFSWAKYIPTSGPRMIK 263

 

Query: 232 KMTHQLMEIMQRHLEQHRESFDPIPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDI 291

           ++T  +   ++  +++ RE +DP  +N++   LL               KAQ+DA +EG +++

Sbjct: 264 EITETMFGFLRSQVDEAREHYDPENINDFYSLLL------------KAQEDARKEG-ENV 404

 

Query: 292 DSLTDTHLRQLIGDISFAGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPEL 351

           D LTDTH+ Q + DI  AG  +T+ TL WA+  L  +PEIQ K+ AE+D V+GRDRLP +

Sbjct: 405 DKLTDTHIFQTVADIFGAGIQTTVETLYWAMALLVTYPEIQAKIRAEIDDVIGRDRLPTI 584

 

Query: 352 SDREATPYTEAIFHEVMRMASMDPVSLPHATTVDTTLSGYQIPKGTWILPNLWALHHDPD 411

           +DR   PYTEA  +EV+R +S+ P+++PHAT+ DT   GY IPKGT ++ N  ++H+DP

Sbjct: 585 NDRGNLPYTEASLYEVLRYSSIAPIAVPHATSRDTEFGGYHIPKGTTVMINTHSMHYDPQ 764

 

Query: 412 TWGDPDVFRPERFLDESGKPIPKPAALMPFGVGRRACPGEALGKADTFLLLGGLVQNFRF 471

            W  PD F PE FLD+ G     P + +PFG GRR C GEA+ KAD FL+  G +QN+ F

Sbjct: 765 EWDQPDKFLPEHFLDDGGTIREHPPSFLPFGAGRRGCLGEAVAKADLFLIFXGFLQNYTF 944

 

Query: 472 SIPEGE 477

           S   G+

Sbjct: 945 SKAPGK 962

 

CYP3 clan

 

>BI385897 Amphioxus 26hr cDNA library 57% to 3A65 zebrafish, 62% to 3A49 fugu

RFFSTRVREVNGLHIPAGMIVNIPVYAIHYDADLWPEPEKFKPERFTKEEKESRDPYAYL

PFGSGPRNCVGMRLAQLELKFALAKMLQKFRFVTCDKTDIPVRLQNTLGNQIEGGLFLKV

EART*

 

>AU234604 Amphioxus Notochord cDNA Branchiostoma belcheri (two parts)

42% to 5A1 fugu

HEGKGVGKYIGRTPHLQISDPEMLREIFVKQFHKFANRAPEGMALDVKPQSRMLTQLVDE

DWKNVRSTISPAFSGGKLKQMTEAINSCADLLVGNIGKFGEKGESFDTKELTGAFTTD

(seq gap) 44% to 5A1 47% to 3A49

IPKQMMILIPVLGIHYDPERWPEPYKFIPERFTKEEKEKRDPFDWLPFGAGPRNCIGIRLAM

MELXGGLARVLMK

TGPXTDIPLKXMKNKQXPTPENGIRLXAELXHPGXD*

 

>AC150395.1 137000-149000 region - strand in HTGS first exon is a best guess

no matches to other P450s or ESTs can yet identify this N-terminal exon.

149120 MLPPNLSQEGCINDQTHRVSS (2) 149058 (possible exon 1)

148997 MFSDIPFFYDRAHIISIWYNRKL (2) 148929 (possible exon 1)

148908 MKDGRLAFRTFFCKQSLHTLSKNDK (2) 148834 (possible exon 1)

148551 YATWPYNTFKKLGIPGPPPLPLIGNLIDYKK 148459

147339 GLSNIDLEWMKKYGKYWG 147286

146527 VYEGQLPVLIVADTKLIKQINVKEFPNFANRR 146432

145813 LMPGNGPVMKYSLTVLQDAEWKRVRSYMSPFNSAYSLKQ 145697

       QLCYLIENTSDNLVAAMKRYHDAGQYVDVKE

144646 IFGCYTMDVISSTGFGTDVNSLSDPDSIFIKNVKKFYAIGALSPFTLLT 144500 (1)

143785 FGFPWFAFFLDRNNWFFNIVPPPVFNFFADAIRKVISIRESNPAESD (0) 143639

143212 KRVDVMQLLLKSHNTALDEPGNEGNIKH 143129 (1)

142553 GLSYNEILANGFIFWIGGYDTTATTISFLAYNLALNPDIQERVIAEIDEIMRGRVGHIFEYLR (0) 142368

141604 ECMDYKAASEMKYLKMCVDETLRMYPPSQR 141515

141141 AKEDIDLDGVKIPKGMCVQFSSFAIHYDPDNWPDPEKFDPER 141016

139356 FTPEEKKKRDPYAYVAWGVGPRSCVMKRLGMLEVKFAIAKILMKYRLRPCEKTQ 139195

138066 IPIRVKVSNLTQPDHGMFLKLEARTDI* 137883

 

>ATGN234930.b1 exon 2 seq AFSA808968.g2 55% to CYP3C zebrafish

(2) YFMWPYSAFEDLGIPGPKPLPLFGNYLSYGK (0)

    SAGEFDRECYKKFNKVYG (2)

 

CYP4F like (looking, there are several genes, possibility of

hybrid assembly here.  These are the same as the CYP4T section)

 

>AFPZ699255.y1  AFPZ733698.x1 ATUP16248.y2 AFPZ163606.x1

APWS139179.b1 ATUP98469.x4 AFSA796551.b2  ATUP336859.y1

AFPZ793080.x1

48% to 4T5 43% to 4F28 more like 4T

DQLLSHDHNCRYRTCWRTPVIALTFCSHPETVKPILSNK (1)

AQKTEWMYRFFRPWL (1)

GDGLLLSDGPKWQRNRRLLTPAFHFDILKHYVQLFSESTA VLL(0)

DKWMSRGPGASVELFDHIGLMTLDNILKCSLGYNSRCQTDG (2)

IKWMSRGPDASVELLDDIGLMTLDNILKCSLGYNSRCQTDG (2)

SAPYILAVNDLTRLFAERGDQPLHYFDFIYYLSSDGRR (2)

XXXXXNMVHRHSAEIIRQRKDTLKEQSDGDSA--KKYLDFLDILLRAK (0)

DEDGNGLTDAEIRDEVDTFVFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTELTW  (2)

DNLSSLKYITLCITESLRM YPPVQRLFRQLEKPMTFFDGRTLPQ (1)

GSPTMTDIAGTHRNRDIWPNPT (0)

VYDPYRFSPENSANRHPYAFLPFSAGPR (2)

NCIGKNFAMNEMKVSVALILQHFQLELDETKSPAVPFDSLTVQAKDGIWVKLHPVKNDT*

 

CYP4T like (looking, at least 4 different sequences same

as 4F like)

 

> CA385834.1|   Oncorhynchus mykiss cDNA clone

Amphi. 1   EPKDRVSYAWLKPWIGDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML 58

           EPKD +SY +L PWIGDGLLVSEGQKWFR+RRLLTPGFH+DVLKPYVK+ ++    ML

Sbjct: 390 EPKDDLSYRFLIPWIGDGLLVSEGQKWFRHRRLLTPGFHYDVLKPYVKMMADSAKTML 563

 

MELFETLKKVTLDSYRIHHLVAIFSLVYVILKISKLIVKRNEW

IRALETFPGPPKHWLFGHVREFKQDGNDMYKVVKWGESYPLAFQMWFGPFVSILNIHHPDYVKTILAST

               ATGN270676.g1                 FPLWIGPFRVVLSLVHSDYIKEIVNSP (1)

AFPZ138711.y1 SEVHCLFQVRPDETGFTIVPQWAAKFKFAFPLWIGP

EPKDDLSYRFLIPWIGDGLLVSEGQKWFRHRRLLTPGFHYDVLKPYVK

MMADSAKTMLDKWETHSKSDESFELFEHVSLMTLDSIMKCAFSSNTNCQTVRG

           GESGTNSYIKAVYELSDLVNVRFRTFPYTASGSST

 

>AFSA90852.g2

MNCLNLGISVPLSLSTFAMVSIPAQWLPHWETGYLRTACLTVLVAVAVQLVFRFLRAL

LWKRYIQKVLAPFPGQPAHWLFGHMRE (0)

ATGAATTGCTTAAACTTGGGAATTTCTG

TCCCTTTATCCTTAAGCACCTTTGCTATGGTGTCCATACCGGCGCAGTGGTTGCCCCACT

GGGAGACCGGTTACCTGCGGACCGCCTGCCTGACCGTGCTGGTTGCCGTGGCCGTTCAGC

TGGTGTTCAGGTTCCTCCGTGCGTTGCTATGGAAACGGTACATTCAGAAAGTTCTGGCAC

CATTCCCAGGACAACCTGCACACTGGCTGTTTGGTCATATGAGAGAGGTGAGGGATTGT

ATGN26250.g1 AFSA103155.g2 AFSA90852.g2 AFPZ722043.x1

AFPZ440133.b2

walked up from ASWX93226.g2

AFPZ138711.y1 AFPZ115087.x1 ASWX102329.g2 ASWX93226.g2

ASWX155166.g2 AFPZ440133.b2

ATGN270676.g1 VRPDETGFTIVPQWAAKFKFAFPLWIGPFRVVLSLVHSDYIKEIVNSP (1)

AGGTCCGGCCGGATGAAACCGGTTTCACCATCGTGCCACAG

AGTGGGCAGCGAAGTTCAAGTTTGCCTTCCCGCTCTGGATCGGGCCGTTCCGCGTGGTT

CTCAGCCTCGTACATCCCGACTACATCAAGGAGATCGTCAACTCACCAGGT

 

walked up to ASWX93226.g2

walked up to ATGI120597.g1 ATUP404745.b2 ATGN167137.b1

AFPZ550292.x1 AFPZ345721.x1 ATUP332415.x1 ATUP404745.b2

ATGN270676.g1 ATGN167137.b1

(1) EPKDRVSYAWLKPWI (1)

AGAACCAAAGGACAGGGTGTCATATGCCTGGCTGAAACCATGGATAGGT

 

>ATUP272608.y2 mate pair C-helix

APNK110373.g1 APNK106080.b1 ASWX61863.b2 ATUP449675.g1

AFPZ713557.b2 AFSA198365.g3 

walked down to AFPZ81376.b2 ATUP272608.y2 ATGI238519.b1

AFSA874000.b2 AFSA451698.b2

walked down to APWS127433.g1 ATGN176879.g1 ATGN84797.g1

AFPZ133921.y1 mate pair

GDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML (0)

AGGTGACGGGTTGTTGGTCAGCGAGGGACAGAAATGGT

TCCGTAACCGGCGCCTCCTCACGCCGGGGTTTCACTTCGACGTGCTGAAGCCGTACGTCA

AGGTCTTCTCTGAATGTACCAACATCATGCTAGT

 

>ATGI238519.b1  ATGN124963.b1  ATGN84797.g1   AFPZ612539.b2  ATGI238519.b1 

AFSA451698.b2  AFPZ612539.g2  APWS127433.g1 AFPZ133921.y1 (short with mate pair)

these next seqs are 100% aa matches with three nuc diffs

AFPZ115087.y1 ASWX93226.b2  ASFW53123.g2  AFSA488932.b2 AFSA447520.b2

AFSA13398.y1  AFPZ368667.y1 AFPZ407080.b2

DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDSDGSP (2)

AGGACAGGTGGGCAGATCTGGCACC

TGGCACACCTGTGGAGATGTTCCACTACGCCAGCGCCATGACACTGGACAGCCTGATGCG

ATGTGCGCTCAGCGTGCGCTCGGACTGCCAGCGGGACAGCGACGGGAGTCCGT

 

>AFPZ133921.y1 no intron between exons 5 and 6

this seq also reaches exon 7

APWS127433.g1  ATGN124963.b1  ATGN84797.g1  AFSA451698.b2 

AFPZ612539.b2  ATGN176879.g1  AFPZ612539.g2 

MTLDSLMRCALSVRSDCQRDS

DGSPYIRAVYDLTKCVVERGRYQPFHIPLIFHLSPTGFR (2)

AGGACAGGTGGGCAGATCTGGCACCTG

GCACACCTGTGGAGATGTTCCACTACGCCAGCGCCATGACACTGGACAGCCTGATGCG

ATGTGCGCTCAGCGTGCGCTCGGACTGCCAGCGGGACAGCGACGGGAGTCCGTACATCCG

CGCCGTGTACGACCTGACGAAGTGCGTGGTGGAGCGTGGTCGCTACCAACCGTTTCACAT

TCCCCTCATCTTCCACCTCAGTCCAACTGGCTTTAGGT

 

>ATGN249826.g1 ASWX134355.g2 AFPZ468928.y1 ASWX173456.y1

the following are 100% aa seq matches with three nuc diffs

AFSA56765.y1 AFPZ522964.x1

(0) DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR (2)

AGGACAAATGGACCAAGCTTGGCTCTGGATGCTCT

GTGGAGATGTTTGAACACGTCAGCCTGATGACTCTGGACAGCA

TCCTGAAATGTAGTCTCAGTTACCATAGCAACTGCCAGACTGACAGGT

 

>ATUP915634.x1 mate pair SWX165907.b2 mate pair ATWX7439.g1 ATUP623212.x1

AFSA620746.b2 ATGI69856.b1 ATWW161831.g1 AFPZ932881.y1

the following have 1 aa diff AFPZ218223.x1 APWS83183.b1

ATUP402691.g1 mate pair ATGI268005.g1

AFSA650722.b4

(0) ENWEEFGAGASIDVFQHVSLMTLDSMLKCALSQNTGCQKR (2)

AGGAAAACTGGGAAGAGTTCG

GGGCTGGTGCCTCTATAGATGTGTTCCAACACGTCAGCCTGATGACTCTGGACAGCATGC

TGAAATGTGCTCTCAGTCAGAACACTGGCTGTCAGAAAAGGT

 

>ATUP430247.b1 mate pair AFPZ224663.x3 mate pair

AFPZ575633.x1  ATUP555575.x1  ATUP546626.b1 AFSA312247.g2 

the following have 1 aa diff APWS45028.g1 AFSA160628.b2

the following have 2 aa diffs AFPZ818365.b2

(0) ENWEESGAGTSIDVFQHVSLMTLDSMLKCALSQDTGCQKR (2)

AGGAAAACTGGGAAGAGTCTGGGGCTGGCACCTCCATAGATGTGTTTCAACACGTCAGCCTGA

TGACTTTGGACAGTATGCTGAAGTGTGCTCTCAGTCAGGACACCGGCTGTCAGAAAAGGT

 

>AFPZ354459.y1 and 100% matches

AFSA73126.x1 ATGN346495.g1  ATGN268793.g1  ATGN182214.b1 

ATUP895246.y1  ATWW135985.b1  ATWW155643.g1  ATUP777488.y1 

ATWW119927.b1  ATUP289252.x2  ATUP174497.b1  AFPZ892508.y1 

AFSA908409.b2  AFSA735851.b3 

100% aa with 1 nuc diff ATGI117873.b1 AFPZ107894.y1 AFSA27567.b2

ATGN330749.b1 ATUP754095.x1 AFSA311147.b2

AFSA307647.b2 AFPZ683302.b2

1 aa diff APWS50764.b1 ATUP606978.x1 ATGN213407.g1

ATUP850436.x1 ATWW227658.g1 AFPZ789326.y1

ASFW186505.g2

100% aa seq with 2 nuc diffs APWS109107.b1 ASWX40589.g2

ATUP750108.x1 ATUP733499.x1 ASFW37910.g2 AFSA448858.b2

AFSA346719.g2

100% aa seq with 3 nuc diffs AFPZ492058.x1 AFPZ290838.x1

APWS27430.b1 ATWW70674.b1 ASFW45413.b2 AFSA582563.b2

AFSA108008.b2

(0) DKWSRVAAGSSVELFDHVSLLTLDSMLKCSLGYRSDCQTDG (2)

AGGACAAGTGGAGCAGAGTTGCTGCGGGCTCCTCCGTGGAACTGTTT

GATCACGTGAGCCTGCTGACGTTGGACAGCATGCTGAAGTGCAGCCTTGGTTACCGTAGT

GACTGTCAAACTGACGGGT

 

>ATGN258597.b1 AFPZ55223.b2 AWYB1196.b1 ATUP857105.g1

ATGI235564.b1 AFSA312605.b2

1 aa diff ASWX177131.g2 ATUP648660.b1 ASFW159249.b2

(0) EKWLSRGPGASVELFDQVGLMTLDNILKCSLGYHSNCQTDG (2)

AGGAAAAGTGGCTGTCACGTGGTCCAGGCGCGTCTGTGGAGCTGTTTGACCAGGTCGGCCTGAT

GACGTTGGACAACATCCTGAAATGCAGCCTCGGTTACCATAGCAACTGCCAGACTGACGGGT

 

>ATUP337506.y1 mate pair ATGI217395.b1 ATGN85884.g1

ASFW173326.b2 AFSA644705.g2 AFSA137556.g2

APNK84474.g2

1 aa diff three nuc diffs ATGI10129.b1 APNK71044.b2

ATUP412849.x1 ATGN367966.b1 ATGI161407.b1 ATUP795448.x1

AFPZ717049.x1

(0) AKWRQLGAGASIDMFEHVSLMTLDSMLKCALTVESNCQVDR  (2)

AGGCCAAGTGGAGGCAGCTTGGTGCGGGTGCATCCATCGACATGTTTGAGCAC

GTGAGTCTGATGACGCTGGACAGTATGCTGAAGTGTGCGCTCACAGTGGAGAGTAACTGT

CAGGTGGACAGGT

 

>ATUP337506.y1 mate pair ATGN85884.g1 AFPZ139710.y1 AFPZ433829.y1

APNK24749.b2 APNK41276.b2

same aa seq 2 nuc diffs ATUP587507.x1 AFSA873673.b2

AFSA471589.g2 

KQNSYIAAVFSLTKLALQRFHLFPLHSDLIYYLTPMGYRLVQSKGSLSFSTTQ (2)

AGAAAACAGAACTCGTATATTGCTGCTGTATTCTCCCTGACCAAG

TTGGCTCTACAGCGTTTCCACCTCTTCCCTCTGCACAGTGATCTGATCTACTACCTCACC

CCTATGGGATACAGGTTGGTACAGTCCAAGGGCTCTCTAAGCTTCTCTACCACACAGT

 

There are at least 7 different exon 5 sequences

 

(0) ENWEEFGAGASIDVFQHVSLMTLDSMLKCALSQNTGCQKR (2)

(0) ENWEESGAGTSIDVFQHVSLMTLDSMLKCALSQDTGCQKR (2)

(0) DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR (2)

(0) DKWSRVAAGSSVELFDHVSLLTLDSMLKCSLGYRSDCQTDG (2)

(0) EKWLSRGPGASVELFDQVGLMTLDNILKCSLGYHSNCQTDG (2)

(0) AKWRQLGAGASIDMFEHVSLMTLDSMLKCALTVESNCQVDR  (2)

(0) DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS   (no intron)
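
A short Python sketch for listing where two of the exon 5 variants above differ, assuming the two peptides are the same length so no alignment is needed.  The helper name differences is hypothetical.

```python
def differences(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch (1-based)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# The first two exon 5 variants from the list above.
variant_1 = "ENWEEFGAGASIDVFQHVSLMTLDSMLKCALSQNTGCQKR"
variant_2 = "ENWEESGAGTSIDVFQHVSLMTLDSMLKCALSQDTGCQKR"

diffs = differences(variant_1, variant_2)
identity = 100.0 * (len(variant_1) - len(diffs)) / len(variant_1)
print(diffs)     # [(6, 'F', 'S'), (10, 'A', 'T'), (34, 'N', 'D')]
print(identity)  # 92.5
```

Variants of different length (for example the 41 aa DKW... exons) would need a gapped alignment first.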

 

>ATWW176870.g1 ATGN284985.b1

FRKACKTAHDFSDEVIRKRRTELQQQGCHQNDTANSSEDGGKKRYLDFLDILLQAR

AGGTTTCGTAAAGCATGTAAAACTGCTCATGACTTCTCTGATGAAGTCATCAGAAAGAGGCGGACAGAGC

TCCAACAGCAAGGCTGTCATCAGAACGACACAGCGAACAGCTCGGAAGATGGGGGCAAGA

AACGATACCTGGACTTCTTGGACATCCTGCTACAAGCAAGGGT

>ATGN26250.b1 I-helix ATUP147621.b1 AFPZ133921.x1 AWXX12803.b1

ATWW81852.b1 ATWW176870.g1 ATUP449675.b1 AFSA492010.b2

60% to CYP4F42 Xenopus 59% to 4T5

DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRCRAEVDAVLQGRAEVKW

>ATWW81852.b1 ATUP449675.b1 ATGI42359.g1 ATUP272608.x2 ATGN148197.b1 EXXR exon

AFSA105706.g2 AFPZ660013.g2 ATGN130493.b1 ATUP427110.b1 walk to ATUP164815.x1

EDLSKLPYTTMCIKESLRMHSPVPGVTRLTTQPHTFPDGRSIPA

AGGATTTTTTGCTTACATGTAGGGAGGACCTGTCC

AAGCTGCCCTACACCACCATGTGTATCAAGGAGAGTCTGCGGATGCACTCCCCTGTCGGGGGTGACACGGCTCA

CCACACAGCCGCACACCTTTCCTGATGGGAGAAGCATCCCCGCAGGT

>ATUP164815.x1 AFPZ309668.x1 ATGI130869.b1 ATWW159270.b1 ATUP541926.g1

APNK46679.g2

GCTAPILGAPGCTCTQYFEPCY (0)

EFDPERFSPENSKGRSSHAFIPFSAGSR

AGGAATTTGACCCTGAGCGTTTCTCGCCTGAGAAC

TCCAAGGGCCGCTCTTCCCATGCCTTCATTCCTTTTTCAGCTGGATCTCGGT

>AFPZ309668.x1 AWYB4583.g1 AFPZ660013.b2 ATUP427110.g1

NCIGQHFAMNELKVTVALTLQRYRLELDETRPPYRVARLITRTRDGLWLKVYPRGADN*

 

 

50-52% to CYP4F sequences, this seq intact

MNCLNLGISVPLSLSTFAMVSIPAQWLPHWETGYLRTACLTVLVAVAVQLVFRFLRAL

LWKRYIQKVLAPFPGQPAHWLFGHMRE (0)

VRPDETGFTIVPQWAAKFKFAFPLWIGPFRVVLSLVHSDYIKEIVNSP (1)

EPKDRVSYAWLKPWI (1)

GDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML (0)

DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS   ATGN124963.b1

DGSPYIRAVYDLTKCVVERGRFPPFHIPLIFHLSPTGFR (2) ATGN124963.b1

FRKACKTAHDFSDEVIRKRRTELQQQGCHQNDTANSSEDGGKKRYLDFLDILLQAR (0)

DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRCRAEVDAVLQGRAEVKW (2)

EDLSKLPYTTMCIKESLRMHSPVPGVTRLTTQPHTFPDGRSIPA (1)

GVSVSIGVHSLHHNIHVWGDNVM (0)

EFDPERFSPENSKGRSSHAFIPFSAGSR (2)

NCIGQHFAMNELKVTVALTLQRYRLELDETRPPYRVARLITRTRDGLWLKVYPRGADN*

 

DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR  (2)  ASWX173456.y1

QSSAYIRAVYDITRLFVE RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL (1) ASWX173456.y1

 

DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS    ATGI238519.b1

DGSPYIRAVYDLTKCVVERGRYQPFHIPLIFHLSPTGFR (2) AFPZ133921.y1

 

Missing piece

ETDLGIAIYGCHHNSALWENPE capitella

GCTIGVSIYGIHMNSTVWENPY danio

GTRIGTSVFGIHRNATVWENPT tetraodon

ESRIGTSVFGIHRNASLWENPN fugu

GVSVSIGVHSLHHNIHVWGDNVM   ATUP164815.x1

GTLVGLSIYAIHKNPAVWEDPE xenopus

 

>ATUP164815.x1 ATGI130869.b1 ATWW159270.b1 ATUP541926.g1

APNK46679.g2

GVSVSIGVHSLHHNIHVWGDNVM  (0)   

AGGTGTTTCTGTGAGCATTGGAGTGCACAGCTTACATC

ATAACATCCATGTGTGGGGAGACAACGTCATGGT

 

CYP4T5      Fugu rubripes (pufferfish)

            No accession number

            Scaffold_8637

            78% to 4T2

508  MEITRALVVLGWSHFYQLLALFCLAIVLYKLTVLLMLKRALIRNFESFPGPPGHWLFGNILE 693 (0)

902  FKQDGNDLDKLVKFGQKYPYCFPLWFGPFVCFLNIHHPEYVKTILAST 1045 (1)

1142 EPKDDLAYSFIQNWI 1186 (1)

1291 GNGLLVSQGQKWFRHRRLLTPGFHYDVLKPYVKLMAHSTKTML 1419 (0)

1673 DKWESYAKTNKPLEVFEYVSLMTLDTILNCAFSYDSNCQTER 1798 (2)

2267 KNTYIKAVYELSNLINLRFRIFPYHNDLIFYLSPHGFRYRKACMVAHSHT 2416 (1)

2521 EEVIKKRREALKKEKELERIQAKRNLDFLDILLFAK 2638 (0)

3171 DENQQGLLDEDIRAEVDTFMFEGHDTTASGISFLLYNLACHPKHQKLCRKEIMQVLHGKDTMDW 3362 (2)

3457 EDLNKIPYTTMCIKESLRMHPPVPGISRKTTKPITFFDGRTLPA 3588 (1)

392  ESRIGTSVFGIHRNASLWENPNV 457 (1 0) this exon from Fugu LPC.11421.x1

     fdhwrflpenvskrsphafvpfsagpr this exon from 4T2 Dicentrarchus labrax

     NCIGQNFAMNEMKVVIAMTLLKYELLEEPTLKPKIIPRLVLRSLNGIHIKIKNANQN*

 

search with this Dicentrarchus CYP4T2 seq

          gacaaatggg gaagttatgc aaacagcaac

      541 gagtcctttg aattgtttca acatgtgagc cttatgactc tggacagcat cttgaagtgt

      601 gctttcagct acaacagcaa ctgtcagact gagagtggaa caaatgtgta catcaaagca

      661 gtgtatgaac tcagtgatct gataaacctg cggttgagga catttccata ccacagtgac

      721 ctaattttct acctcagccc acatgggtac agatacagaa aggcaatcaa agtggctcag

      781 agtcataca
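
Before pasting a numbered sequence like the one above into a search window, the position numbers and spaces can be stripped.  A minimal Python sketch, assuming GenBank-style layout (a leading position number followed by space-separated blocks of bases).  The function name clean_numbered_sequence is illustrative only.

```python
import re


def clean_numbered_sequence(text):
    """Remove position numbers and all whitespace, return upper-case bases."""
    return re.sub(r"[\d\s]+", "", text).upper()


# First two lines of the Dicentrarchus CYP4T2 fragment shown above.
numbered = """
          gacaaatggg gaagttatgc aaacagcaac
      541 gagtcctttg aattgtttca acatgtgagc cttatgactc tggacagcat cttgaagtgt
"""

query = clean_numbered_sequence(numbered)
print(len(query))   # 90
print(query[:30])   # GACAAATGGGGAAGTTATGCAAACAGCAAC
```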

 

fugu   1   DKWESYAKTNKPLEVFEYVSLMTLDTILNCAFSYDSNCQTER-KNTYIKAVYELSNLINL 59

           DKW SYA +N+  E+F++VSLMTLD+IL CAFSY+SNCQTE   N YIKAVYELS+LINL

dicent 511 DKWGSYANSNESFELFQHVSLMTLDSILKCAFSYNSNCQTESGTNVYIKAVYELSDLINL 690

                                                        QSSAYIRAVYDITRLFVE

Query: 60  RFRIFPYHNDLIFYLSPHGFRYRKACMVAHSHT 92

           R R FPYH+DLIFYLSPHG+RYRKA  VA SHT

Sbjct: 691 RLRTFPYHSDLIFYLSPHGYRYRKAIKVAQSHT 789

           RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL

 

ASWX173456.y1

QSSAYIRAVYDITRLFVE RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL (1)

AGCCAGTCCAGCGCGTACATCCGTGCTGTGTATGACATCACGAGGCTGTTTGTCGAGCGTATTCGCTT

TCCGCCGTACTACAGTGACTTCATCTACTCGCTCAGCGGTACCGGCTCATTCGATCGACG

GCGGTCGGGGTGTGGTGTGGTTTTGTGGTTGGGT

 

>AFSA29926.g2 ATGI49052.b1 65% to CYP4F42 Xenopus

DEDGTGLTDAEIRDEVDTFLFEGHDTTASGISWALYHLAKHPEYQDRCRREAEGLLQGRTEMTW

 

>APWS109579.g1 74% to CYP4F42 Xenopus

DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLARHPEHQERCRQEARSVLQGRSVVTG

 

>ATUP337506.x1

DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLAKHPEHQERCRQEARSVLQGRSVVNR

 

 

>AFSA246853.b2 AFPZ352137.x1 AFPZ224663.y3 APWS16853.b1

HKAACNIVHKYSEEIILQRKEVLKQQSAGDSTHGKKYLDFLDILLRAK (0)

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRKEAQEVLQGRTVDTW

 

>ASWX6756.g2 ATGI162241.b1 AFPZ352137.x1 ATUP390177.g1 ATUP407011.b2

AFSA735184.b2 AFSA426465.b2 AFPZ654427.g2 ATWW138964

5 nuc diffs to ATUP664988.g1, 1 aa diff

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRKEAQEVLQGRTEVTW

 

 

>ATWW138964.g1 

           FRDEVDTFMFEGHDTTASGLALTLYCLARHPGHQDKCRKEAQEVLQGRTEVTW

 

>ATUP664988.g1 ATWW180530.g1 AFSA128002.b2 AFSA100306.g2 AFPZ619699.x1

ATGN318029.g1 AFPZ458632.x1 

YRKACNLVHEYAKRIIAERREALKQRLTEDDEETNKKKYLDFLDILLKAR (0)

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRREAQEVLQGRTEVTW

 

>ATUP172867.g1 ATWX43092.b1 ASWX40865.b2

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARYPGHQDKCRKEAQEVLQGRT EVTW

 

>ATWX40562.b1 ATUP926382.b1 ATUP177523.b1 ATUP539865.g1 

 note same aa seq, 3 nuc diffs to ATUP664988.g1

ATUP16248.y2 ATGN214275.b1 ATUP837406.y1 ATUP298210.b1

AFPZ694240.b2 AFSA5009.x1

note same aa seq, 4 nuc diffs to ATUP664988.g1

72% to 4F42 Xenopus 60% to 4T5 52% to 4V5

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTEVTW

>AFPZ112195.x1 ATGN200959.g1 ATUP907530.x1

4 nuc diffs to ATUP664988.g1 2 aa diffs to ATUP664988.g1

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKCRLEAQEVLQGRTEV

>ATGI22425.b1 ATUP539865.g1

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRREAQEVLQGGTEVTW

 

>AFPZ112195.x1 probably same as ATWX40562.b1 with two errors

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKCRLEAQEVLQGRTEVTW

 

>ATUP540419.g1 ASFW147712.b2 ATWW5209.g1

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEMLQGRTEVTW

 

>ASWX54130.g2 AFPZ224663.y3 ATGN280478.g1  ATGI221699.b1  ATUP430247.g1

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRPDVTW

 

>ATUP296454.b1 ASWX46239.b2 ATGN159465.g1 ATUP915634.y1

note ATUP915634.x1 is a middle exon DKW...

DEESNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC

DEDGNGLTNTEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTDVTW

 

>ASWX163983.b2 

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRREAQEVLQGGTEVTW

 

>ATGN72941.g1 ASWX123977.g2  AFSA780539.b2 ASFW81426.b2

1 aa diff to ATWX40562.b1

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGGTEVTW

 

>AFSA853381.g2 APNK78662.g3 ATUP588704.x1 1 aa diff to AFSA246853.b2

YKKACNEVHQFSEKIIQQRKQDLDNLSTTETTRRQKYLDFLDILLMAK (0)

DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWILYC

 

>ASWX165907.g2  1 ATUP402691.b1 ATUP921527.y1 ASFW127211.b2 ASFW54195.b2

1 aa diff to ASWX46239.b2

note ASWX165907.b2 = mate pair = DKW exon

DEDSNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC

 

>ATUP646033.g1 

DEMYGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC

 

Alignment of 4F-like I-helix exon sequences.  The green aa are probably seq errors,

since they occur only once, or at the end of a seq.  The sequences

from AFSA853381 to APNK78662 are short pseudogene pieces.  The last three

sequences seem to be different genes and additional searches may be required.

Numbers after accessions are the number of occurrences.

Those with one occurrence should be combined with others (seq error).

Those with 9 are probably two genes with identical exons.

 

ASWX6756      9   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC

ATUP172867    3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARYPGHQDKC

ATUP296454b   4   DEDGNGLTNTEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

ASWX54130     5   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

AFSA246853    4   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC

ATUP540419    3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

ATUP664988    6/4 DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

ATWX40562     4/6 DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

ATGI22425     2   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

ASWX163983    1   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC

ATGN72941     4   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC

AFPZ112195    1   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKC

AFSA853381    3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------

ATUP296454a   4   DEESNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------

ASWX165907    5   DEDSNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------

APWS109579    1   DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLARHPEHQERC

AFSA29926     2   DEDGTGLTDAEIRDEVDTFLFEGHDTTASGISWALYHLAKHPEYQDRC

ATGN26250     7   DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRC

                             :*******:**********::  **           

 

ASWX6756          RKEAQEVLQGRTEVTW-

ATUP172867        RKEAQEVLQGRTEVTW-

ATUP296454b       RKEAQEVLQGRTDVTW-

ASWX54130         RKEAQEVLQGRPDVTW-

AFSA246853        RKEAQEVLQGRTVDTW-

ATUP540419        RKEAQEMLQGRTEVTW-

ATUP664988        RREAQEVLQGRTEVTW-

ATWX40562         RKEAQEVLQGRTEVTW-

ATGI22425         RREAQEVLQGGTEVTW-

ASWX163983        RREAQEVLQGGTEVTW-

ATGN72941         RKEAQEVLQGGTEVTW-

AFPZ112195        RLEAQEVLQGRTEVTW-

AFSA853381        -----------------

ATUP296454a       -----------------

ASWX165907        -----------------

APWS109579        RQEARSVLQGRSVVTGW

AFSA29926         RREAEGLLQGRTEMTW-

ATGN26250         RAEVDAVLQGRAEVKW-
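
The occurrence counts in the alignment can be reproduced by grouping reads that translate to the same peptide.  A minimal Python sketch, using only a few of the reads listed above as input.  The names group_identical and PREFIX are hypothetical.

```python
from collections import defaultdict


def group_identical(reads):
    """Map each distinct peptide to the list of read names that carry it."""
    groups = defaultdict(list)
    for name, seq in reads.items():
        groups[seq].append(name)
    return dict(groups)


# Shared N-terminal part of the I-helix exon, written once to keep lines short.
PREFIX = "DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQ"

reads = {
    "ATGI22425.b1": PREFIX + "EKCRREAQEVLQGGTEVTW",
    "ATUP539865.g1": PREFIX + "EKCRREAQEVLQGGTEVTW",
    "ASWX163983.b2": PREFIX + "DKCRREAQEVLQGGTEVTW",
    "ATGN72941.g1": PREFIX + "EKCRKEAQEVLQGGTEVTW",
    "ASWX123977.g2": PREFIX + "EKCRKEAQEVLQGGTEVTW",
}

groups = group_identical(reads)
# Peptides seen in a single read are candidates for sequencing errors.
singletons = sorted(
    names[0] for names in groups.values() if len(names) == 1
)
print(len(groups))  # 3
print(singletons)   # ['ASWX163983.b2']
```

A singleton is only a candidate for merging.  Whether it is a real gene still has to be judged from further searches.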

 

AFSA246853.b2  +

AFPZ352137.x1  +

AFPZ224663.y3  +

APWS16853.b1   +

ASWX54130.g2   +

ASWX46239.b2   +

ATGN280478.g1  +

ATUP296454.b1  +

ATGN159465.g1  +

ATUP915634.y1  +

ATGI221699.b1  +

ATGI162241.b1  +

ATUP390177.g1  +

ATWW5209.g1    +

ATUP540419.g1  +

ATUP407011.b2  +

ATUP430247.g1  +

ASWX6756.g2    +

ASFW147712.b2  +

AFSA735184.b2  +

AFSA426465.b2  +

AFPZ654427.g2  +

ATWW138964.g1  +

AFSA5009.x1    +

AFPZ133921.x1  +

AFPZ112195.x1  +

ATUP16248.y2   +

ATGI22425.b1   +

APNK78662.g3   +

ASWX163983.b2  +

ASWX165907.g2  +

ASWX40865.b2   +

ATUP588704.x1  +

ATWX40562.b1   +

ATWX43092.b1   +

ATGN214275.b1  +

ATGN200959.g1  +

ATUP402691.b1  +

ATGN318029.g1  +

ATUP837406.y1  +

ATUP921527.y1  +

ATUP298210.b1  +

ATUP926382.b1  +

ATUP907530.x1  +

ATGN72941.g1   +

ATWW180530.g1  +

ATWW176870.g1  +

ATWW81852.b1   +

ATUP664988.g1  +

ATUP646033.g1  +

ATUP539865.g1  +

ATUP449675.b1  +

ATUP177523.b1  +

ATUP172867.g1  +

ASWX123977.g2  +

ASFW127211.b2  +

ASFW81426.b2   +

ASFW54195.b2   +

AFSA853381.g2  +

AFSA849885.b2  +

AFSA780539.b2  +

AFSA128002.b2  +

AFSA100306.g2  +

AFPZ458632.x1  +

AFPZ619699.x1  +

AFPZ694240.b2  +

 

100% matches to AFPZ760949.g2 exon

AFPZ50624.g2 ATWX87052.g1 ATUP695364.x1 ATWW34381.b1

ATUP656534.b1 ATUP453935.b1 AFPZ760949.g2

(0) DQLLSHDHNYHYFTCWQTPIIALTFCSHPETVKLILSNKS (1)

 

99% match to AFPZ760949.g2 exon

ASWX83469.g2 one base missing, seq error?

 

97% to AFPZ760949.g2 exon

ASFW95633.g2 stop codon and two other changes probable seq error

*LSHDHTFHYFTCWQTPIIALTFCSHPETVKLILSNKS

 

89% to AFPZ760949.g2 exon

AFSA23133.b2 ATGN291068.b1 ATGI170747.g1 ATWW165626.b1

ASFW10811.g2 AFSA451581.g2 AFSA330051.g2 AFPZ476812.b2

AFPZ466033.x1 AFPZ632859.y1 AFPZ605110.g2

(0) DQLLSHDHNCRYRTCWRTPVIALTFCSHPETVKPILSNKS (1)

 

AFPZ434251.y1 is another seq by itself.  It matches at 96% to the 7

other seqs, which are probably the correct seq, this one being a little off

DLLLSHDHNYHYFTCWQTPIIASTFCSHPESLV  PSSLCR

AGGACCTGTTACTGTCTCATGATCACAACTATCA

CTACTTTACCTGCTGGCAGACTCCGATAATCGCCTCAACATTCTGCAGCCATCCCGAGAG

CCTGGT

 

The seven seqs are the same as listed above

AFPZ50624.g2  ATWX87052.g1  ATUP695364.x1 ATWW34381.b1 

ATUP656534.b1 ATUP453935.b1 AFPZ760949.g2

 

 

>AFPZ50624.g2 walked upstream to AFPZ760949.g2 (no hits)

(0) LLSHDHNYHYFTCWQTPIIALTFCSHPETVKLILSNK  (1)

>ATUP412432.y1

ARKTEWVYRFFRPWL (1)

GDGLLLSDGPKWQRNRRLLTPAFHFDILKHYVKLFSESTAVLL (0)

 

>APWS102130.b1

 (1) GNSLFLSDGDQWKVHRRLLTPAFHFDILKQYVSVYNREATEMI (0)

 

>ATUP98469.y2 ASFW123814.b2

(0)   SAPYVLAVHDLTKLIEDRPDYLSNHIDF IYYLSADGRR (2)

     APYVLAVHDLTKLIEDRPDKLSNHIDF IYYLSKDGR

FRRACKIVHSFSAQVIKERKEELKKKDSSFKSGKCLDLLDILLKAK

FRRACKIVHSFSAQVIKERNEETEKK

 

>AFPZ133921.y1 exon 5 mate pair to I-helix exon 8

MTLDSLMRCALSVRSDCQRDSDGSP (2)

 

ATUP207008.y2

MILTANILVFLTCFTVNSTQFSLDDCHVDSIKFD

ATGN176295.g1

MILTANILVFLTCFTVNSTQFSLDDCHVDSIKFD

 

>danio 4F seq BC056734

MILTANILVFLTCFTVNSTQFSLDDCH

MLLYGISPFVLSVNHVFALIFLACLLTVVKLLIVRRKGVKTMER

FPGPPAHWLFGHVKEFRQDGHDLEKIVKWMELYQFAFPLWFGPSLAVLNIHHPSYAKT

ILTTTEPKDDYAYKFFIPWLGDGLLVSTGQKWFRHRRLLTPGFHYDVLKPYVKLISDS

TKVMLDKWEVHSRSEESFELFKHVSLMTLDSIMKCAFSCNSNCQTDSGTNPYIQAVFD

LCHLVNLRFRVFPYHSKAIFHLSPHGYRFRKAASIAHNHTAEVIRKRKEVLKMEEEQG

IVKNRRYLDFLDILLSARDEHQQGLSDEDIRAEVDTFMFEGHDTTASGISWIFYNLAC

NPEHQEKCRQEIQQALDGKDTLEWEDLNKIPYTTMCIKESLRLHPPVPGISRKLTKPL

TFFDGRTVPEGCTIGVSIYGIHMNSTVWENPYKFDPLRFLPENAANRSPHAFVPFSAG

PRFVTRSV

 

>CYP4F39 UPSTREAM OF 4F5 chr7 (+) 94% to mouse 4f39 = ortholog

13051717 MLPITDYLLYLLGLEKTAFRVYVLSALLLFLLFLLFRLLLQAFKLFS

         DFRITCRRLSCFPEPPGRHWLLGHMSM 13051938

13054230 YLPNEKGLQNEKKVLDTMHHIILAWVGPFLPLLVLVHPDYIKPVLGAS 13054373

13064279 AAIAPKDEFFYSFLKPWL 13064332

 

>CYP4F17 = CYP4F19temp AI030199 EST CHR7 13095557  13103056 chr7 (+)

90% to 4f17 next closest 82%, probable ortholog of 4f17

13095557 MLQLSLSWLGRGPVTVSPWQLLLVVGTSLLLARILAWISAFYDN

         YCRLRCFPQPPSRHWFWGHLNL 13095754

13102916 VKNNEEGLQLLAEMSHQFQDIHLCWIGIFYPILRLIHPKFIGPILQA 13103056

13103866 AAAVAPKEMIFYGFLKPWL 13103922

 

>CYP4F5/4f16 13119940  13133265 chr7 (+) 3 aa diffs to mRNA U39207 90% to 4f16 89% to 4f37

13119940 MPWLTVSGLDLGSVVTSTWHLLLLGAASWILARILAWTYSFCENCSRLRCFPQSPKRNWFLGHLGT 13120137

13122954 IQSNEEGMRLVTEMGQTFRDIHLCWLGPVIPVLRLVDPAFVAPLLQAP 13123097

13125947 ALVAPKDTTFLRFLKPWL 13126000

 

>Xenopus 4F42 mRNA AB114053

MLPFLDHFLDSLNLSHSTFRVYIFYAVILFFSLIMFRTILKMVT

YIYAYIINARRLRCFPEPPRRSWLLGHLGLFMPTEEGLT   EVSDAISSFRKTFLTWMGP

ISLVSVFHPDTVKPIVAASAAIAPKDDLFYGFLRPWLGDGLLLSHGEKWGRHRRLLTP

AFHFDILKSYVKIFNQSTDIMLAKWRRMTVEGPVSLDMFEHISLMTLDTLLKCTFSYD

SDCQEKPSDYIAAIYELSSLMVKREHYLPHHLDFIYNLSSNGRNFRQACKKVHEFTAG

VVQQRQKALKEKGMEEWIKSKQGKTKDFIDILLLSKVEDGNQLSDEDMRAEVDTFMFE

GHDTTASGLSWILYNLARHPEYQEKCRKEIIELLEGKILKHLEWDELSQLPFTTMCIK

ESLRLHPPVTAVSRRCTEDIKLPDGKVIPKGNTCLISIYGTHHNPEIWPNPQVYDPYR

FDPENVQERSSHAFVPFSAGPRNCIGQNFAMAEMKIVLALILYKFHVRLDETKAVRRK

PELILRAENGLWLQVEELKR

Xenopus mRNA 4F42

atgttgcc gtttttggac cattttctgg

      121 actccttaaa cctgagtcac tcaactttcc gagtttatat tttctatgct gttattctct

      181 ttttctctct tataatgttt cgaaccatat taaagatggt gacatatatt tatgcttata

      241 tcatcaatgc cagacgtctg cgttgttttc cagagcctcc aagacgtagc tggcttttag

      301 gacatttggg actgtttatg ccaacagagg agggccttac agaagtgagt gacgccattt

      361 cttcttttcg taaaacattt ctgacatgga tgggacccat ctctttagta tcagtgtttc

      421 atccggacac agttaaacca atagttgcag cctcagctgc cattgctcct aaagatgatc

      481 tgttctatgg tttcctcaga ccctggttag gggatggact gttgcttagc catggggaga

      541 aatgggggag gcaccggcgc cttctgacac ctgcctttca ctttgacatc cttaagagct

      601 atgtgaagat ttttaatcag agcacagata tcatgcttgc aaagtggcgg agaatgacag

      661 tagagggccc tgtgtctctg gatatgtttg aacatatcag tctgatgacc ttggatacac

      721 ttcttaaatg tactttcagc tatgacagtg actgccaaga gaagccaagt gattatattg

      781 ctgctattta tgaactgagc tcactaatgg tgaaacgtga gcactacttg ccccatcatt

      841 tagattttat ctacaacctt tcctccaatg gaaggaattt ccggcaggct tgcaaaaaag

      901 tgcatgaatt cactgccgga gtggtacagc aaagacagaa ggcattgaag gagaagggga

      961 tggaggaatg gattaagtct aaacaaggca aaaccaagga tttcattgat attctattgc

     1021 tgtcaaaggt tgaagatgga aaccagctat ccgatgaaga tatgagggcc gaagttgaca

     1081 catttatgtt tgaaggtcat gataccacag caagtggctt atcatggatt ctatacaatt

     1141 tggctcgcca ccctgaatat caggagaaat gcagaaagga gattatagag ttgctggaag

     1201 ggaaaatcct gaagcatttg gagtgggatg aattgtctca gttgccattc actacaatgt

     1261 gcatcaagga gagtctgcgg cttcaccctc ctgtaactgc agtatccaga cgctgtacag

     1321 aggatatcaa attacctgat ggcaaagtca tccccaaagg aaacacctgc ttgatcagta

     1381 tttatggaac ccaccacaac cctgagatct ggcctaatcc acaggtttat gacccatatc

     1441 gatttgatcc agagaacgtc caagaaaggt cttcccatgc atttgtacca ttctcagctg

     1501 gacccaggaa ttgtattgga cagaatttcg ctatggccga gatgaagatt gttttagctc

     1561 taatccttta caaatttcat gtgagattgg atgagaccaa ggcagtgcgc agaaaacctg

     1621 agttgatcct acgtgcagaa aatgggctct ggctgcaggt ggaagaactg aaacgt

 

CYP4V like (looking for hits with megablast using 4V4 Xenopus.

All look like the 4Fs and 4Ts; CYP4s not differentiated yet?)

 

>BI385317.1 Branchiostoma floridae cDNA clone 54% to Xenopus laevis CYP4V4

3  ANIPFPAGSQLCIGHRVALISDKDILSSILHLF 101

 

Xenopus 4V4 mRNA AB114054

atggagctaa agggagatgt

      121 taatgtgctg ttgtggacgg ctgttatcgt ggtgctgttg accctgctgg tcttttccgc

      181 tttgcccgtc ctgctggact acgtgcgtaa atgcaaagtt atgagactga ttccgggtcc

      241 cggacccaac tacccgctcg tgggggacgc gctgctccta aagagcgatg caagagaatt

      301 ctttctccaa atgtgtgaat tcgcagagga ctttagatca gaaccacttc taaaactttg

      361 gattggacca attccttttt taatagtcta ccatgcagac actctagagc catttctgag

      421 cacatccaaa catgtggaca aggcctacct ctataaattt cttcaccctt ggcttggtaa

      481 aggactgcta acaagtacag gggaaaagtg gcgtataaga agaaagatga taacgccaac

      541 ctttcatttt gcaattctct ctgagttttt ggaagtcatg aatgagcaat ccaatgtatt

      601 agttgaaaag ctccagaagc atgctgatgg ggagtctttt gattgcttta tagatgtgac

      661 actttgtgta ttggacatca tatcagaaac agccatgggg aggaaaatag aagcacagag

      721 caataaagat tctgaatatg ttcaagcaat atacaagatg gctgatttca ttcagaacag

      781 acagacaaag ccatggttgt ggagtgactc tttatatgca tacttgaaag aaggaaaaga

      841 gcacaataaa accctaaaca ttctccacac cttcactgat aaggcgattc tagaaagagc

      901 tgaagagctt aagaaaatgg aagtaaaaaa aggtgatagt gatcctgagt cagaaaagcc

      961 caagaaaaga agtgcatttc tagatatgct tctgatggca acagatgatg ccggcaataa

     1021 aatgagctac aaggatatcc gtgaggaagt tgataccttc atgtttgagg gtcatgatac

     1081 aacagcatca gccctaaatt ggacattgtt tttactgggc tcacacccag aggcgcagag

     1141 acaagttcat aaagagctgg atgaagtttt tggtaaatct gaccgtcctg tcacaatgga

     1201 tgatctaaag aagttgcgtt atcttgaagc cgtaattaaa gaatcacttc gaatattccc

     1261 ccctgtcccg atgtttggtc gaaccgttac agaggactgc actgtccgag gatttaaagt

     1321 gccgaaagga gtaaatatca ttgttattac ttactcattg catcgtgatc cagaatattt

     1381 ccctgaacca gaagaattca gaccagagag gttctttcct gaaaatgcta gtgggcgtaa

     1441 tccttatgcc tatattcctt tttctgctgg actcagaaac tgcattggtc agcgttttgc

     1501 tctgatggaa gaaaaggttg tcttatcctc catacttagg aaatactggg tagaggcaac

     1561 tcagaaacgt gacgaatgtc tccttgtagg agagctcatt ctccgccccc aggatggcat

     1621 gtggattaag ctgaagaaca gagaaactgc ctccagtgcc

 

CYP7 like (looking, no match with danio mRNA via megablast)

 

>gnl|ti|681941888 ATWX61331.g1

AFPZ187714.y1 APNK84514.b2 ATGI163705.b1 ATUP144046.y1

ASFW156594.b2 AFSA564815.b2 AFSA527605.g2 AFSA514068.b2

AFSA775248.g2

walked upstream from AFSA527605.g2 to:

ASFW100120.b3

more matches = AFPZ788847.y1 ASFW125784.g2

walked upstream to ATGI210792.g1

 

walked upstream from AFPZ931995.g2 (exon 2) to

ASWX150423.b2  ATGN25235.b1   ATUP765390.x1 

ATGN88506.g1   AFSA923158.g2  AFPZ422094.x1 

Possible N-term exon

MILSIALIWAVVVGFCCLLWLAVGIRHR (2) Fugu N-term

MI-SGILAGCLVVLVVAILVQAVG-RKR  (2)

ATGATTTCTGGGATTCTGGCCGGCTGTTTGGTGGTGTTGGTGGTGGCCATTTT

GGTACAGGCTGTCGGCAGGAAACGGT

 

64% to CYP7A

AFPZ788847.x1 mate pair

AFPZ395440.x1 ATUP709168.g1 AFSA796848.g2 AFPZ931995.g2

AFSA923158.b2 AFSA474510.b2 ATUP692470.g1

(2) RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK (2)

 

AGAGACCCGAATGAACCGCCCCTTGAG

TCTGGTCCCGTGCCCTACCTCGGGGTGGCGCTACAATTTGCCATGGACAGTCTCAAATTC

ATCCGCTCGCGACAGAAGAAGTACGGAGACGTCTTTACGGTGAAGCTTGCCGGAAAGTAC

ACGACATTTGTCCTTGACCCGCACTCCTACAGCGACGTCATGCGGCAGCACAAGT

 

AFPZ187714.x1 mate pair

AFSA19340.x1  ATWW110249.g1 ASFW88412.g2 

These 5 have 5 nuc diffs and one aa diff ATUP312192.b1

ATGI154523.g1 ATWW237445.b1 ATGI112989.b1 AFSA474510.b2

The first AG in the cluster of three at the beginning of this

exon seq is missing here, so one of the other two is correct

ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI (1)

 

AGTAGAGCGTGGATTTGGGACGACGCACTTCGAAAGGACGGGCCGTGCGCACGTGCTGCACACCGCTGACGC

CTATTTCCCCGTCCATCTGCAGGGGAACGCCCTGGACCCGCTCACGAACACCATGATGGG

GCATCTACAGACAGCCATGTTGGCTGATATAGGT

 

AFSA19340.x1  AFPZ744954.y4 AFPZ744954.y1 ATGI112989.b1

ATUP312192.b1 ATWW110249.g1 ATGI230779.g1 ASFW88412.g2 

ATWW237445.b1 ATUP785498.x1

GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK (2)

AGGGTCGGAAACTGGCTGGAAGAAGGACGGGCTGTGGTCCTTTGTCCGCCGCATCG

TCTCTGAAGCGTCCTTCCTCACCATCTTCGGCAAGCACAAGT

Walked down from AFPZ744954.y4 to:

ATUP899939.x1  ATGI210792.g1  ATUP38762.b1   ATGN118038.g1 

ATWW157309.g1  ATUP571743.g1  AFPZ712795.b2 

 

QLIEQRDEVFCGGGLSGKELAGAHFSTVWASLVRSI

 

CYP7 amphioxus Complete seq. 41% to CYP7A1 from zebrafish and Fugu, 35% to Ciona CYP7

MISGILAGCLVVLVVAILVQAVGRKR  (2)

RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK (2)

ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI (1)

GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK

SQTVEQERARLMVVMETFWDYDRKFPQVVAGIPFWMLGKAKEQRDFLL (0)

AFLSKDNLNQRDVLQLIEQRDEVFCGGGLSGKELAGAHFSTVWASL (0)

SNTLPTAFWTLFHLLQDPVAMAAVRREVET (0)

ETGQTVTGFRDGGEKIDFTRQQLADMTCL (1)

GSVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPE (0)

TFKYDRYLENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD

MELIDKETPPLDQSRTGLGVLPPKTDPMFRYKIK*

 

AGAGCAACACTCTACCCACGGCCTTCTGGACACTCTTCCACCTC

CTACAGGACCCTGTTGCCATGGCTGCAGTCAGGAGGGAGGTGGAGACGGT

 

AGGAACCACAATAGCCCTTAAC

TCCGGCAGTACCTTCAAGATCCGCAAGGGAGACAGGGTGGCGCTGTTCCCACAAATCGTT

CACATGGACCCAGAAGT

 

Query:     1 SNTLPTAFWTLFHLLQDPVAMAAVRREVETETGQTVTGFRDGGEKIDFTRQQLADMTCLG 60

             +NTLP  FWTLFH+++ P AM A   EV      +         ++  TR+QL +M  L

Sbjct:   295 ANTLPATFWTLFHMIRCPAAMKAASEEVRQTFESSNQKVDPTNSRLVLTREQLDNMPVLD 354

 

Query:    61 SVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPET 120

             S++ EA+R+SS S+ +R A  +  + L++  ++ IRK D +A++P ++H DPE+Y+DP 

Sbjct:   355 SIIKEAMRLSSASLNVRMAKSDFLLQLDNKESYHIRKDDVIAMYPPMIHFDPEIYDDPLE 414

 

Query:   121 FKYDRYL-ENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD 179

             FKYDRY+ E G+EKT FY+NG+KLR+Y +PFG G ++CPGRFFAV+EIKQF+SL++ YF+

Sbjct:   415 FKYDRYIDEKGQEKTAFYRNGRKLRYYYMPFGSGVTKCPGRFFAVHEIKQFLSLLLSYFE 474

 

Query:   180 MELIDKET--PPLDQSRTGLGVLPPKTDPMFRYKIK 213

             MEL+D +   PPLDQSR GLGVL P  D  FRY++K

Sbjct:   475 MELLDSDVKEPPLDQSRAGLGVLQPTYDVDFRYRLK 510
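
Percent identity for a pasted BLAST block like the one above can be recounted from the Query and Sbjct lines.  A minimal Python sketch, assuming the four-field "Query: start seq end" layout shown here.  The function name alignment_stats is illustrative, and only the last block of the alignment is used as input.

```python
def alignment_stats(block):
    """Return (identical columns, gap columns, total columns)."""
    query, sbjct = [], []
    for line in block.splitlines():
        parts = line.split()
        # Midline rows do not start with Query: or Sbjct: and are skipped.
        if len(parts) == 4 and parts[0] == "Query:":
            query.append(parts[2])
        elif len(parts) == 4 and parts[0] == "Sbjct:":
            sbjct.append(parts[2])
    q, s = "".join(query), "".join(sbjct)
    if len(q) != len(s):
        raise ValueError("Query and Sbjct rows have different lengths")
    identical = sum(a == b and a != "-" for a, b in zip(q, s))
    gaps = sum(a == "-" or b == "-" for a, b in zip(q, s))
    return identical, gaps, len(q)


BLOCK = """
Query:   180 MELIDKET--PPLDQSRTGLGVLPPKTDPMFRYKIK 213
             MEL+D +   PPLDQSR GLGVL P  D  FRY++K
Sbjct:   475 MELLDSDVKEPPLDQSRAGLGVLQPTYDVDFRYRLK 510
"""

identical, gaps, columns = alignment_stats(BLOCK)
print(identical, gaps, columns)  # 22 2 36
```

Passing all four blocks together gives the figure for the whole alignment, since the rows are concatenated before counting.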

 

>CYP7A1 Fugu Scaffold_5172  Length = 18849 59% to 7A1

= LGW1565.x1 Length = 555 27-153 CYP7A1

= LGW57257.y1 50% to 7a1 238-350

= LOL6406.x1 61% to 7A1 390-436 also LOL6406.y1

= LGW154142.y1

= LGU7599.x1

insertion of 6 aa in exon 4 vs mammalian seqs, but probably real; see zfish seq below

14694 MILSIALIWAVVVGFCCLLWLAVGIRHR 14777 (2)

15070 HSSEPPVENGLIPYLGCALQFGANPLQFLRSRQKKYGHIFTCKIA 15204 frameshift

15205 GQYIHFLCDPFSYHSVIRQGRHLDWRKFHFATSVK 15310 (0 expected) bad boundary

15425 AFGHDSFDPRHGHTTENLHQ 15484

15485 TFLKTLQGEALPSLIKTMMGHLQDVMLKSDTLRRSKDHWEVDGIFAFCYK 15634 (0)

15757 VMFESGYLTLFGKELGEDTCQARQAAQKALVLNALENFKEFDKIFPALVAGLPIHVFKSAYSARE 15951 (0)

16053 NLAKTMHAEKLSKRENVSDLISMRMILNDSLSTFNDVSKARTHVALLWASQANTLPATFWSLFYMIR 16253 (2)

16383 SPDAIKAAREEAQKVFETFGVKIDPHNPTLNLTRDVLDNMPVL 16511 (1)

16744 DSIIKEAMRLSSASLNIRVAKEDFLLHLDNQEAYRIRKDDVIALYPPMLHYDPEIFEDPY 16923 (0)

17029 EYKFDRFLDENNQEKTTFTRNGRKL 17103

17104 RYFYMPFGSGVTKCPGRFFAVYEIKQFLTLVLTYFDMELLDPAIQVPPLDQSRAGLGILQ 17283

17284 PTYDVDFRYKLKLAY* 17331

 

 

danio mRNA for 7A

at gatcctaacc atttccttca tttgggccat

       61 agtggttggt ctttgctgtt gtctttggct tattacagga atacgcagaa gacatcctgc

      121 agagcctcca ttagagaatg gctggattcc cttccttggc tgtgctcttc agtttggggc

      181 aaatccttta gagtttcttc gcagcagaca gaagaagcat ggccatattt ttacatgcaa

      241 gattgctggg cagtatgttc atttcctttg tgatccattc tcctaccatg ctgtcatccg

      301 tcaaggaagg caccttgact ggaagaaatt tcactttgat gcctctgcga aggcatttgg

      361 tcatgagagc atggatccca gtcaaggtta caccactgag aatttgcatc agactttcct

      421 gaagaccctg caaggggatg ccttgtcttc tctaattgag accatgatgg aaaacctcca

      481 gggcaccatg ctgcaatccg gaatgctgaa ggccacaacc tctgaatggc aaagtgatgg

      541 tatttacgcc ttctgctaca aggtcatgtt tgaagcaggc tacctgaccc tcttcggaaa

      601 ggaactggat ggggaccaga gcattgcacg tcagcaggcc caaaaggctc tggtgctcaa

      661 tgctttggac aactttaaag agttcgataa gatcttccca gctctgatcg ctgggctccc

      721 cattcatgtt tttaagagtg cctacagcgc tcgtgagaaa cttgccaaga ctatgctcca

      781 tgagaacctc agcaggcgtg ccaatgtgtc tgatctcatc tccttgcgca tgcttttgaa

      841 cgacacacta tctaccttca acgagctgag caaagcccgg acccacgtcg ctatactttg

      901 ggcttcacaa gccaacactc tgcctgcaac cttctggact ctgttccaca tgatcaggtg

      961 ccctgcggca atgaaggctg ctagtgagga ggtgaggcaa acctttgaaa gttctaatca

     1021 gaaagttgat cctacaaatt ctcggcttgt actgacaagg gagcagttgg acaacatgcc

     1081 agttttagac agcatcatta aagaggcgat gagactgtcc agtgcatccc ttaatgtgag

     1141 aatggccaag agcgatttcc ttcttcaact agacaataag gagtcttacc acattcggaa

     1201 agatgatgta attgctatgt acccaccgat gattcacttt gatcctgaaa tttatgatga

     1261 tcctttggaa ttcaagtatg atagatacat tgatgagaaa gggcaggaaa agaccgcctt

     1321 ttaccgtaat gggcgcaagc ttcgttacta ctacatgccc tttggctctg gggtgaccaa

     1381 atgcccagga cgcttctttg ccgtgcatga aatcaagcag ttcttgtctt tgttgctatc

     1441 gtactttgag atggaacttt tggactctga tgtgaaagaa ccgccgttgg accagtctcg

     1501 ggctggactg ggtgtactgc agcccaccta tgatgttgac tttcgttaca gactcaaatc

     1561 tctc

 

MILTISFIWAIVVGLCCCLWLITGIRRRHPAEPPLENGWIPFLGCALQFGANPLEFLRSR

QKKHGHIFTCKIAGQYVHFLCDPFSYHAVIRQGRHLDWKKFHFDASAKAFGHESMDPSQG

YTTENLHQTFLKTLQGDALSSLIETMMENLQGTMLQSGMLKATTSEWQSDGIYAFCYKVM

FEAGYLTLFGKELDGDQSIARQQAQKALVLNALDNFKEFDKIFPALIAGLPIHVFKSAYS

AREKLAKTMLHENLSRRANVSDLISLRMLLNDTLSTFNELSKARTHVAILWASQANTLPA

TFWTLFHMIRCPAAMKAASEEVRQTFESSNQKVDPTNSRLVLTREQLDNMPVLDSIIKEA

MRLSSASLNVRMAKSDFLLQLDNKESYHIRKDDVIAMYPPMIHFDPEIYDDPLEFKYDRY

IDEKGQEKTAFYRNGRKLRYYYMPFGSGVTKCPGRFFAVHEIKQFLSLLLSYFEMELLDS

DVKEPPLDQSRAGLGVLQPTYDVDFRYRLKSL

 

CYP8 like (looking, no obvious CYP8s, but at least three CYP7s)

 

>GENE A 85% to Gene B 53% to CYP7 amphioxus 43% to 8B1 fugu, 37% to 7A1 Fugu

 

AFSA913951.b2 possible N-term missing start MET but similar to Gene B

Only matching seq in database probably has errors

TELLSVCLGGVLAFVLLQVITRRM

ACAGAGCTGCTGAGCGTCTGTCTGGGCGGAGTCTTAGCCTTCGTGCTTCT

ACAGGTTATAACAAGGCGTATGGT

 

ATUP32525.b1  mate pair to exon 2

MVTELLGVCLAVVLVFVLLQVTTRRR (2)

 

ATWX106442.b1 exon 2

AFPZ751569.x1 ATUP32525.g1 (note mate pair has exon 1)  ATWW184707.g1 AFSA329657.g2

(2) RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSK (2)

VLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTE (0)

AWYB9202.g1 AFPZ202055.x1 ATUP209080.y2  AFSA329657.g2 

VMMTNLQSAMLAATDVKAEWNKGELWSFVYRIMFSGK (0)

 

ATUP926791.g1

CTCAGACNGGGGTTAACGGATGTTTATGATTTNTGGTATGTAAATAGCTTAAGTGATGCTTACATAAATGACATATTAATTATGCAAATTGGTATCTAATTTTCCTAATTAGTTAAGAAATTTTGTAAACACGCTCAGTTCCATTATAGGACCATTGCAACATGTAACATTTGTAACTGAGAAGGAGAAGAATATTGATTGATAGATGTTATGCAAAATAAAGATCTAATTTGCATAATTAATGAGAAAATGCTATAATTTTATTGTGGTAAATAACAGGAAGTTCTTACTTGTTGCATTTGAAAGTTATGTATAGGTGAACATTAATTATGCAAATGAGATCCTCATATGCATAAATTACCTGAAAATGCGATAAAAGCCTTCTTTCTTAACATAGATTGTGTACGTCTGTTGTCTGATAGAGGAGATGATCAACTGATATAATTTATGCAAATGAGAACCTTATTTGCATAATTAATGACAAACA CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGTGGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAGCTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCCAGGAGACCTGGTGAGCCGCCATTGGAGCCAGGCCCTCTTCCGTACCTGGGGGTCGCCCTGGAGTTTTCCAGAAACCCCCTGGGTTTCATTACTTCCCGCTGGAAGAAATATGGAGACGTGTTCACCGTGCGACTGGCCGGCCACTACACCACCTTCGTCCTGGACCCGCACTCATTCACTCACGCCATCCGGAACAGCAAGTCAGTATGAACAAACATGTTGTAAAGAACCATTAAAGGTGTGTTCACACATGCGTATAGATTGAAATCCGTAAAAATAGTAGAAATGGACAAATAGGTCCAGCGACCCNNNN

 

>gnl|ti|669707816 name:ATWX106442.b1 mate:669707912 mate_name:ATWX106442.g1 template:ATWX106442 end:R  + strand

NGTGCTTTATGCAAC CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGT

GGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAG

CTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCC AGGAGACCTGGTGAGCCGCCATTG

GAGCCAGGCCCTCTTCCGTACCTGGGGGTCGCCCTGGAGTTTTCCAGAAACCCCCTGGGTTTCATTACTT

CCCGCTGGAAGAAATATGGAGACGTGTTCACCGTGCGACTGGCCGGCCACTACACCACCTTCGTCCTGGA

CCCGCACTCATTCACTCACGCCATCCGGAACAGCAAGTCAGTATGAACAAACATGTTGTAAAGAACCATT

AAAGGTGTGTTCACACATGCGTATAGATTGAAATCCGTAAAAATAGTAGAAATTGGACAAATATTGTCAG

ACGAGTTAGCAGGGTCACAAACTTCCTGGCTAGCTTACTCCTCTAGTATGTTCTGCGTCTACAGTTGTCG

AATTTCTACTATTTTCCAGCAACCTGTGGAGCCCGGTAGAGACTAAATAGGGTATTTGGGGATCGCTAAG

TCGGTGGGAGGTAGTTTCAGAACTCCATAAGGACCTCATACGGACTTGAATATATACGCACGTGTGGATC

TGATAGGTATACAGTTCAGTTTTTTGCTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCA

TTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTTGTTTGTTTGTC

TGACATAGCCGAAAAACCACTTTCAGGCGGAACACACCTGAACACCATCCCAAATGACGTCCTTGACCGA

AACTAGGTTCTCATTTTCAGCTACTATGATGATGTGTATTTAATAGTCAATCAT
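
The trace header above carries the read name, its mate and the template as key:value tokens. A small sketch of pulling those fields out follows; it assumes only the layout visible in this one header line, and parse_trace_header is a hypothetical helper, not an NCBI tool.

```python
def parse_trace_header(header):
    """Split a trace FASTA header into its identifier and key:value fields."""
    tokens = header.lstrip(">").split()
    fields = {"id": tokens[0]}
    for token in tokens[1:]:
        key, sep, value = token.partition(":")
        if sep:  # keep key:value tokens only; skips '+' and 'strand'
            fields[key] = value
    return fields


header = (">gnl|ti|669707816 name:ATWX106442.b1 mate:669707912 "
          "mate_name:ATWX106442.g1 template:ATWX106442 end:R  + strand")
info = parse_trace_header(header)
print(info["name"], "pairs with", info["mate_name"])
```

The mate_name field is what links a read containing one exon to the read at the other end of the same clone.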

 

 

>joined ATWX106442.b1 and ATUP926791.g1 and AFSA75691.x1

NNNNNNNNGGGGGGGAAGGAGAANCGGACGCTCGCACTCTCACCACTTCTCCGTGTCTAACCATCACACG

TGTGTGTATAGCGAATAACAATAACCTGTGCTGTGGGGTACAGAAGAATGTTACCTCGCCAAGACAGTTA

TATTTTGGGTAGCGTTTGTTTGTATATATGTAACGTCATGTATGTACATGTATGTATGTATGTATGTAGA

GACCAGCATAACTAAAAAAAGCCTTAATGGATTGTATTGCTATTTAGTATGTGGGTAGGTCTTGATGAGA

CCTGGAAACGATTAGATTTTGGGCCCCCTAGCAGCTTGTTACGGTACTGCAGCAGAGCTTCCTGGTTTAA

TATCTCGAGTTCTGAACATGCTGCGGCTATGATTTTTGAGTGGTAGACAGGTATTGGTGCCGAGAGTAAG

TGGTGTAGGTTTGGGCCCCCTAGCGGCTTGTTTTGAAACTGCAGGGCAGTGTCAGACTTTAAAAGGGAAT

AACTCAAGAACGGGTTAACGGATTGTTATGATTTTTGGTATGTAAATAGCTTAAGTGATGCTTTACATAA

TGACATATTAATTATGCAAATTGGTATCTAATTTTCCTAATTAGTTAAGAAATTTTGTAAACACGCTCAGTTCCATTATAGGACCATTGCAACATGTAACATTTGTAACTGAGAAGGAGAAGAATATTGATTGATAGATGTTATGCAAAATAAAGATCTAATTTGCATAATTAATGAGAAAATGCTATAATTTTATTGTGGTAAATAACAGGAAGTTCTTACTTGTTGCATTTGAAAGTTATGTATAGGTGAACATTAATTATGCAAATGAGATCCTCATATGCATAAATTACCTGAAAATGCGATAAAAGCCTTCTTTCTTAACATAGATTGTGTACGTCTGTTGTCTGATAGAGGAGATGATCAACTGATATAATTTATGCAAATGAGAACCTTATTTGCATAATTAATGACAAACA

CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGT

GGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAG

CTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCC
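
Joining reads as above amounts to finding where the end of one read overlaps the start of the next. The sketch below uses an exact suffix/prefix match, which is a simplification: real trace reads need end trimming and some tolerance for base-call errors. The two strings are short excerpts from around the junction of ATUP926791.g1 and ATWX106442.b1, and merge_by_overlap is an illustrative name.

```python
def merge_by_overlap(left, right, min_overlap=12):
    """Join two reads on the longest exact suffix/prefix overlap.

    Returns None when no overlap of at least min_overlap bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Excerpt near the 3' end of the ATUP926791.g1 region used in the join.
left_end = "GCATAATTAATGACAAACACAATTTACACAGCAGTCG"
# Excerpt from ATWX106442.b1 after its low-quality leading bases.
right_start = "CAATTTACACAGCAGTCGTAAAGGTACGGATC"

merged = merge_by_overlap(left_end, right_start)
print(merged)
```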

 

>GENE B

AFPZ923737.x1 AFPZ877092.x1 AFSA910992.g2 AFSA351865.g2

AFPZ570243.x1 AFPZ9740.b2  ATUP4956.b1  AFPZ718164.y1 AFPZ823314.b2

MVTELLGVCLAVVLVFVLLQVTTRRR (2) probable N-term

 

ATGI76532.g2 2 aa diffs to gene A

RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSR (2)
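
The "2 aa diffs" note can be reproduced by comparing the exon 2 peptides of Gene A and Gene B position by position. This is a minimal sketch for equal-length, ungapped peptides only; aa_differences is a hypothetical helper name.

```python
def aa_differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) for each mismatch, 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no gaps handled)")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


gene_a = "RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSK"
gene_b = "RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSR"

diffs = aa_differences(gene_a, gene_b)
identity = 100 * (len(gene_a) - len(diffs)) / len(gene_a)
print(diffs, round(identity, 1))
```

Both differences are conservative K to R changes, at positions 34 and 66 of this exon.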

 

Joined AFSA351865.g2 and ATGI76532.g2

AAAAATTTTTTTTCAGGCTAGGAGGCTAACTTGAGGAATTTAAACATTTTTTTCGGGGCTTCTGGAGAAAGTACCGTCCCTGGAACAGGGAGGGGAAAGGTCGCAGTGCAAAGGGCATGTACACTCCACATGTACAGAGCTGCGGGGGATGAAGAAATAGCCGGCAGGCAGGTTAATGCACCCTGGATCCTAAATTTCGCCTTTGAGACGACATTTGAGGCGATCATTGGAGTTAGACTTTCATCGCACATGCGCCTTAACTGTGTTAGAGTCACACAGACTGACCTGCCTGCCGACCTACAGCAGGATGGTGACAGAATTGCTAGGCGTTTGTTTGGCCGTAGTCTTAGTCTTCGTGCTTCTACAGGTCACAACAAGGCGTAGGTAGGAGAAACTATTTCACGTCTTAACAACTTGTCGGAGGCTGTTGCCCGGCTTGACTACGGCGCACGCAGGGGGTATGCGATTATACTGAGTATTAAGGGGGTGAAGTTTATCTACTCACATCGTGAGTCCTATCAAAGGCATTTCATATTCATAGAT

ACACGGGTTTCTTATCCAGTCCATTCTACATCAAAGGGGTCGTGTCGAACATTAACTTGGAGTGTGAGTC

GCTCCTGTGCTATTCAACGAGTTGATGAGCACCCTATTCCACTAGACGGCGATCACGCTGAGACCTTGCT

GCGACGGTGCGACCTAAACTGGATTTAATTCTGGAACTCCTGAATTCATAATTGCAATATCATGCAAACG

TTACGTAAAAAGTATTACTGGAAAAAAGTCATATTTTTTGTTGAAGTCGTTGAGCGCTCTGTCAAGCGCG

GTCGTATAGCGAGATCACCGCCTTGTGGAATGGAGGTGTAAGTTGTTGTCACTACCTTGATACTATTCCC

CTTCCCACAGGAGACCTGGTGAGCCTCCCCTGGAGCCGGGGCCTCTCCCGTACCTAGGCGTCGCCTTGGA

GTTTTCCCGAAACCCCCTGGGCTTCATCACCTCCCGCTGGAGGAAATATGGCGACGTGTTCACCGTGCGG

CTGGCCGGCCACTACACCACCTTCGTCCTGGACCCGCACTCATTCACTCACGCCATCCGGAACAGCAGGT

TAGTATGAAAGGAAATGTCTTATAACAACTACAGTGAGCTTTTCCAAACGTGCCTTGTATATAGATCACT

GTTGTTTTAGTTTGTTTGTTCGTTTTTTGTTTGGTTGACATAGCCGATAAACCGCCTTCAGGCATAAAAC

ACCAGTTCTGTAACAAGCTGTGAAGTTAAAAAGTTAAACCCTTCCCACACCAAGATGGCGCATAGGGCGG

TGCCCATCTCTGTTTCATTAGCCCTGGGCCACACACAACGCAATCACTACAGCAGGGGGCTAGTCCACTG

GTAGTGGTGTGTGTTCAACTTCCATACTCTTTCCCGAATGCTGAGTGCTAAGCAGAGAAAGCAGCATGCA

CCATTTTTAAAGTCTTTGGTATGACTCGG

 

 

AFPZ88959.b2 mate pair of AFPZ88959.g2

AFPZ88959.b2  APNK98223.b2   ATGN213244.b1  ATWX3781.b3   

ATUP385028.g1  ATGI219815.g1  ATWW61924.b1   ASWX32142.g2   APNK48064.b2  

ASFW94882.g2   AFSA507746.b2  AFSA700071.b2  AFPZ958552.x1  ATGN85334.b1  

GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE

AGGGGTCTGGACTTCCGACTATTTTCGTCCAAGATAGCTCATCGCGCCTTCGGGGTGCCTATA

TTCTATGGTACACCCCGCGACTGGGTCCGAGCCGACAGTGATGCGCTATGGCCAAAGGAA

TTACAGGGGCAGGGAGTGGATAACATCACAGAGGT

 

ATUP4956.g1

VMMSNLQSAMLAATDVTVEWNKGELWSFVYRIIFSGK (0)

 

AFSA160234.g2 walking up from ATGN200156.g1

(0) TLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK (0)

 

 

ATGN200156.g1 ATGI16796.g1 ATGN186157.b1 ATUP270375.g1 AFSA312546.b2

AFPZ88959.g2 AFPZ541093.x1

Walking up from ATGN200156.g1

AFPZ244071.x1  AFSA312546.b2 (SDFI + VRAE exons) AFSA19010.x1  

SMVSPAGLGQRGVSDFIRMRQEIYADANLTPDEITACNFATMWASL (0)

SNTVPAAFWTLFYLLKDPVAMAAVRAE (0)

AGAGTAACACCGTCCCTGCCGCCTTCTGGACCTTGTTCTACCTCCTGAAG

GATCCTGTCGCCATGGCTGCCGTTAGGGCGGAAGT

 

Matches to VRAE exon

AFPZ541093.x1  ATGI16796.g1   ATGN341992.g1  ATGN200156.g1 

ATGN281106.g1  ATGN186157.b1  ATUP270375.g1  ATGI164468.g1 

AFSA312546.b2  AFPZ416567.g2 

1 aa diff IRAE instead of VRAE

AFPZ88959.g2   ATUP769642.y1  ATUP460828.g1 

 

Walked down from ATGN281106.g1 to ASFW19202.g2 (82% match)

Note ASFW19202.g2 was the limit of extension upstream from the EXXR exon

 

Walked down from ASFW19202.g2  to ATUP281063.x2 and found the missing exon

SLETMKEAGKMIHVTREQLNDMKCL (1)

 

Related C-terminal sequence

>ATUP314736.g1

TFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEI

KQFVTIVICYFNMELLEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*

 

Walked up from ATUP314736.g1 to AFPZ78488.b2 ATGI251736.b1

Found EXXR exon

(1) GSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGFVHMDTEVFDDPE (0)

 

 

Walked up from AFPZ78488.b2 to AFPZ1765.y1 ATGI167277.g1 ASFW94882.b2

No exon found

 

Walked up from AFPZ1765.y1 to ATUP281063.x2 ATWW105983.b1 ATUP770651.x1

ATGI221136.g1 AFSA540330.b2 ATGI167277.g1 ASFW19202.g2

 

ASFW19202.g2 could not be extended further upstream

 

ASFW94882.g2 mate pair to ASFW94882.b2 links this C-term seq to this N-term exon

Only 3 aa diffs to AFPZ88959.b2

GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE

 

ATUP770651.y1 mate pair to ATUP770651.x1

Same as AFSA160234.g2

(1) SYKTLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK (0)

 

The evidence suggests that Gene B is composed of two parts joined by

multiple mate pairs
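
The mate-pair argument can be sketched as a lookup on read names. This assumes, as the notes above suggest, that the part before the dot is the clone template and that .b/.g and .x/.y are the two ends of one clone; the trailing digit is ignored. The function names and the short read lists are illustrative only.

```python
# Assumed pairing of read-end letters, inferred from the names in these notes.
MATE_END = {"b": "g", "g": "b", "x": "y", "y": "x"}


def split_read(name):
    """Return (template, end_letter) for a read name such as 'AFPZ88959.b2'."""
    template, _, suffix = name.partition(".")
    return template, suffix[:1]


def linking_mate_pairs(reads_a, reads_b):
    """Find reads in reads_a whose opposite-end mate appears in reads_b."""
    ends_b = {split_read(read): read for read in reads_b}
    links = []
    for read in reads_a:
        template, end = split_read(read)
        mate = ends_b.get((template, MATE_END.get(end)))
        if mate:
            links.append((read, mate))
    return links


# A few reads from the N-terminal part and the C-terminal part of Gene B.
n_term_reads = ["AFPZ88959.b2", "ASFW94882.g2", "ATUP385028.g1"]
c_term_reads = ["AFPZ88959.g2", "ASFW94882.b2", "ATGN200156.g1"]

links = linking_mate_pairs(n_term_reads, c_term_reads)
print(links)
```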

 

>Gene B assembled

42% to CYP7A fugu and 40% to CYP8B2 fugu, only 31% to 8A1

MVTELLGVCLAVVLVFVLLQVTTRRR (2)

RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSR (2)

GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE

VMMSNLQSAMLAATDVTVEWNKGELWSFVYRIIFSGK (0)

TLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK (0)

SMVSPAGLGQRGVSDFIRMRQEIYADANLTPDEITACNFATMWASL (0)

SNTVPAAFWTLFYLLKDPVAMAAVRAEVDQ (0)

ETGQSLETMKEAGKMIHVTREQLNDMKCL (1)

GSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGFVHMDTEVFDDPE (0)

TFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEI

KQFVTIVICYFNMELLEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*

 

>BJ652936.1| BJ652936 Eptatretus burgeri hagfish cDNA clone

           hg128o16 5', mRNA sequence.

          Length = 591

 

Query: 257 VPAAFWTLFYLLKDPVAMAAVRAESLETMKEAGK------MIHVTREQLNDMKCLGSAIN 310

           +PAAFW L++LL  P A+  +R E  + +K  G+      ++ ++   L ++ CLGSAI+

Sbjct: 14  LPAAFWALYHLLCHPDALTVIRKEVDDVLKSTGQYPKPSSLLKLSPTTLPNLVCLGSAIS 193

 

Query: 311 EALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGF-VHMDTEVFDDPETFKY 369

           E+LR+CSASI IRVA DD +L LE G T  +RK D VA+YP   +H+D E++ +PE +KY

Sbjct: 194 ESLRLCSASINIRVAQDDLDLELEPGRTVPLRKNDWVAMYPQTALHLDPEIYPEPEIYKY 373

 

Query: 370 DRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEIKQFVTIVICYFNMELL 429

           DRFLENG EKT FYK G+KL HYL+PFG GVSMCPGRF ALNEIKQF+ ++I   ++E+L

Sbjct: 374 DRFLENGQEKTNFYKGGQKLHHYLMPFGSGVSMCPGRFLALNEIKQFLFLLIAVLDLEIL 553

 

Query: 430 EKQTPPK 436

             Q   K

Sbjct: 554 PDQPQVK 574

 

>gi|58647881|gb|CX908537.1| JGI_CAAN1354.fwd NIH_XGC_tropTe4 Xenopus tropicalis cDNA clone

           IMAGE:7686846 5', mRNA sequence.

          Length = 784

 

 Score =  249 bits (636), Expect = 5e-65

 Identities = 121/260 (46%), Positives = 175/260 (67%), Gaps = 7/260 (2%)

 Frame = +3

Note the strong match in yellow upstream of EXXR.  The region before the yellow

is a poor match and there might be another small exon in this region that fits better.

Query: 177 FQKYDKRFPEIISNVPWWLMGHTKKRYEYLKSMVSPAGLGQRG-VSDFIRMRQEIYADAN 235

           F K+D +FP ++ N+P  L+G TKK  E L     P  + +R  +S+ ++ R+ +    

Sbjct: 3   FTKFDAKFPYLVINIPIALLGATKKIREELIHFFFPNKMEKRSEISEVVQERKNVLEQYE 182

 

Query: 236 LTPDEITACNFATMWASLSNTVPAAFWTLFYLLKDPVAMAAVRAEVDQILKETGQSLETMKEAG 291

           L   +  A +FA +WAS+ NT+PA FW ++YL++ P A+AAVR E    ++  G+    

Sbjct: 183 LQDYDRAAHHFAFLWASVGNTIPATFWAMYYLVRHPEALAAVRDEIDHLLQSTGQKKGPE 362

 

Query: 292 KMIHVTREQLNDMKCLGSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVAL 349

             IH+TREQL+ M  LGSAI E+ R+C+AS+ IR+  +D +L LE   T R+RK D +AL

Sbjct: 363 YDIHITREQLDSMVLLGSAIKESFRLCAASMNIRLVQEDFDLELEGNQTIRLRKDDFIAL 542

 

Query: 350 YPGFVHMDTEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFA 409

           YP  +HMD E+++DPE +KYDRF+ENG EK  FYK G+KL+ YL+PFG G S CPGRFFA

Sbjct: 543 YPPALHMDPEIYEDPERYKYDRFVENGKEKILFYKKGKKLKEYLMPFGSGTSKCPGRFFA 722

 

Query: 410 LNEIKQFVTIVICYFNMELL 429

           +NEIKQF+ +++ Y +MEL+

Sbjct: 723 MNEIKQFLAVLLIYVDMELV 782
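
The subject coordinates in this alignment are nucleotide positions on the cDNA, so each aligned row of 60 residues spans 180 bases and the first subject position fixes the reading frame. The sketch below checks that arithmetic and the reported percentages, which are consistent with truncation rather than rounding. The helper names are illustrative, and only forward-strand frames are handled.

```python
def tblastn_frame(sbjct_start):
    """Forward reading frame (1 to 3) implied by a 1-based subject start."""
    return (sbjct_start - 1) % 3 + 1


def codons_spanned(sbjct_start, sbjct_end):
    """Number of codons between two inclusive nucleotide coordinates."""
    length = sbjct_end - sbjct_start + 1
    if length % 3:
        raise ValueError("coordinate span is not a whole number of codons")
    return length // 3


def truncated_percent(count, total):
    """Integer percentage with the fraction dropped."""
    return 100 * count // total


# First row of the Xenopus alignment: Sbjct 3 to 182, reported Frame = +3.
print(tblastn_frame(3), codons_spanned(3, 182))
# Identities 121/260, Positives 175/260, Gaps 7/260.
print(truncated_percent(121, 260), truncated_percent(175, 260),
      truncated_percent(7, 260))
```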

 

>CYP7 amphioxus for comparison 54% to Gene B same family

MISGILAGCLVVLVVAILVQAVGRKR  (2)

RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK (2)

ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI (1)

GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK

SQTVEQERARLMVVMETFWDYDRKFPQVVAGIPFWMLGKAKEQRDFLL (0)

AFLSKDNLNQRDVLQLIEQRDEVFCGGGLSGKELAGAHFSTVWASL (0)

SNTLPTAFWTLFHLLQDPVAMAAVRRE (0)

TVTGFRDGGEKIDFTRQQLADMTCL (1)

GSVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPE (0)

TFKYDRYLENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD

MELIDKETPPLDQSRTGLGVLPPKTDPMFRYKIK*

 

 

CYP11 like (looking)

 

15th International Conference on Comparative ENDOCRINOLOGY

MAY 23-28 2005 BOSTON

P15.3  Wed, 16:30-18:30  Spawning behavior and sex steroids in amphioxus MIZUTA, T*, KUBOKAWA, K; Ocean Research Institute, University of Tokyo

 

Abstract: Amphioxus is the evolutionary closest animal to vertebrate. We have studied reproductive behavior of captive amphioxus in tanks. Spontaneous spawnings were recorded and analyzed during reproductive periods. Characteristics of the behavior are as follows: 1) The first spawning animal was a male every spawning day with no exception. 2) Spawnings of male and female were non-synchronous. Spawnings lasted approximately 2 hours after the first male spawning of the day. 3) We failed in the prediction of the spawning day in tank and the habitat, although spawning occurs after the sunset in dark. 4) Level of gonadal maturation differed widely among individuals even in the breeding season and from this fact we supposed that spawning occurs irregularly when animals attain gonadal maturation. In amphioxus as in all vertebrates and some invertebrates so far as studied, endogenous endocrine factors would play an important role in inducing spawning. To confirm previous reports histochemically showing existence of sex steroid hormones in amphioxus gonads, we attempted to measure testosterone, progesterone and estradiol-17beta in extracts of fully matured ovaries of amphioxus by radioimmunoassay (RIA). Progesterone and a steroid-like substance that shows a similar replacement curve with estradiol-17beta were detected. Concentrations of these steroids in mature amphioxus ovaries were significantly higher than those in ovaries of amphioxus collected in the non-breeding season. In addition, we demonstrated immunopositive reactions to antibodies against 3beta-HSD of fish and P450scc of a commercial source in peripheral epidermal cells and inner parts of an oocyte, respectively. These facts suggest that the pathway from cholesterol to progesterone known in vertebrates exists at least in mature amphioxus.

 

CYP19 like (looking, cannot find it with megablast of danio or Xenopus CYP19 mRNA)

 

>AFPZ686853.b2 possible N-term, not great. Two possible start METs

AFPZ686853.b2  AFPZ745305.x1  ATUP248884.b1  AFSA807178.g2 

ATWW145756.g1  ATUP442682.g1  APNK55600.b2  

MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS

MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP (1)

ATGTTGCAGTTTCTAGTGATAGAAAGCAGA

GGTAGCTTCCCATTGAACAGAAGTCGTACCAGACACGGCATAACCAGTCAGATCGAGGCA

GACGGCTGCAGTATGGACACAGGCGAGGGATGGGATGTTCTGTTAGTCGTGCTGCTTGTT

GTGCTTGTCTGGTACTACATCCGGGAAACCTGGACCAGCGGGATCGACGGGATATTTCCC

CCAGGT

 

>AFSA119741.g2

ASFW13005.g2  AFSA119741.g2  AFSA64159.g2   ATGI113449.b1  ATUP184988.g1 

AFSA770314.b2  AFSA140463.g2  AFPZ766687.y1  APWS36590.b1  AFPZ686853.b2 

(1) GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)

AGGTCCTCCGTACA

TCCCGCTATTGACGCCGCTATGGACCCTATGGGTGTTCCTTCACGATGGCATCTGGGCAG

CCACGGCCGGGTACGCCGCCAAGTACGGGGACTTCGTGCGGGTCTGGCTCGGCACCGAGC

AGACTTTCATCATCAGCAGGT
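
The phase (1) junction between the first two exons can be checked by removing the flanking intron bases and translating across the splice. This sketch assumes the exon DNA is listed with the intron's AG in front and GT behind, as it appears in these notes, and that the standard genetic code applies. Only the last in-frame 66 bases of exon 1 are used; exon_body and translate_spliced are hypothetical helper names.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def exon_body(seq, acceptor=True, donor=True):
    """Strip the intron 'AG' before an exon and the intron 'GT' after it."""
    seq = "".join(seq.split()).upper()
    if acceptor:
        if not seq.startswith("AG"):
            raise ValueError("expected splice acceptor AG")
        seq = seq[2:]
    if donor:
        if not seq.endswith("GT"):
            raise ValueError("expected splice donor GT")
        seq = seq[:-2]
    return seq


def translate_spliced(bodies):
    """Translate joined exon bodies; return (protein, leftover bases)."""
    mrna = "".join(bodies)
    whole = len(mrna) - len(mrna) % 3
    protein = "".join(
        CODON_TABLE.get(mrna[i:i + 3], "X") for i in range(0, whole, 3)
    )
    return protein, mrna[whole:]


exon1_tail = """
GTGCTTGTCTGGTACTACATCCGGGAAACCTGGACCAGCGGGATCGACGGGATATTTCCC
CCAGGT
"""
exon2 = """
AGGTCCTCCGTACA
TCCCGCTATTGACGCCGCTATGGACCCTATGGGTGTTCCTTCACGATGGCATCTGGGCAG
CCACGGCCGGGTACGCCGCCAAGTACGGGGACTTCGTGCGGGTCTGGCTCGGCACCGAGC
AGACTTTCATCATCAGCAGGT
"""
protein, leftover = translate_spliced(
    [exon_body(exon1_tail, acceptor=False), exon_body(exon2)]
)
print(protein)
print(leftover)
```

The single G carried over from exon 1 completes the GGT codon for the first glycine of exon 2, and the two leftover bases correspond to the phase (2) end of exon 2.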

 

>AFSA903966.g2  AFSA119741.g2  AFPZ531554.y1  ATGI113449.b1 

AFPZ766687.y1  ATUP184988.g1  ATWX16177.g1  

(2) ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)

AGAGCATCAGCAGCTGCGCATGTGCTTAAGTCCAGTAAGTACC

GGGCGCGGTTCGGCGACCCTTCTGGGCTTGCGCAAATCGGCATGAACGGCTCGGGCGTCA

TCTTCAACAACGACGTGCAGAGCTGGAAGTTCCTCCGCTTCTTCTTCGTCAAAGGT

 

AFPZ213319.y1  AFPZ116760.y1  ASWX37478.g2   ATGN339521.b1  ATGN140388.g1 

ATUP248884.g1  ATWW121655.g1  AFSA777349.g2  ATGN262628.g1  ATUP212374.g1 

ATUP184988.b1  ASFW193353.b3  AFPZ750454.y1  AFPZ323837.y1 

(1) VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN (1)

AGTTCTTGACAGAGCAGCCGGCGTATCCGCCATTGCTACCAGACGACAACTGGCTAACATCCGGG

ACATTGCGTCGAGTAACCCGGATGGAGCAGTGGATGTCGTCACACTAATGCGCAGAATCA

CGCTGGAAATCGGAAACCGGCTATTTCTGGGTGTCAACATAGAAAATGGT

 

ASFW152099.b2

ATGN140388.g1  ATGN163395.g1  ATGN123302.b1  ATUP248884.g1  AFPZ895309.y1 

AFSA777349.g2  ASFW152099.b2  AFPZ940540.x1  AFPZ323837.y1  ATUP212374.g1 

ASFW193353.b3  AFSA522025.g2  AFPZ686853.g2 

(1) DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)

AGATCTGGAGGTGGTGAACACAATCAATGGATATTTCGCTGCCTGGGAGTTCTTCATGATAC

GACCCAAGGTGCTGCAGTTGATTTATCCTACCCTGTACAGAAAACACCAGACAGCAGTGT

 

APNK10005.g2   APNK8873.g2    ASFW152099.b2  AFSA830540.b2  AFSA765117.b2 

ATGN123302.b1  ATGI177758.b1  ATUP570910.g1  AFPZ686853.g2  ATWW67966.g1  

ATGN163395.g1 

(2) RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)

AGGAGGGCTTTGCAAGACGTGGTGGGGAAGCTGGTGGACAAGAAGAGGGCCGTCATGAATGGAGAC

GAGGCCGAGGAAGAATTCAGCATCCCAAAAGGCGAACACGATTTCGCAGCTGCACTCATC

CAGGCGCAGGT

 

>ATUP846622.y1  aa 305-358

ATUP846622.y1  ATGI19659.g1   ATWW169596.g1  AFPZ766687.x1  AFPZ938114.x1 

AFSA830540.b2  ATUP215781.y2  ATWW67966.g1   ATUP163149.x1 

AFSA496531.g2  APWS76397.b1   ASWX46173.b2  

(0)  EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL (1)

AGGAATTTGGCCAGGTGTCAGCCTCCTGTGTTCGGCAGTGCGTGACAGAAATGCTGCTTG

CCGGTCCGGACACCATGTCCGTCCACATCTACTTCATCCTCCTGCACATAGCCGAGCATG

GTCTAGAGAACGGGATACTTAGGGAAATCAGGGAAGTCTTGGGT

>aa 359-440

AFPZ750454.x3  ASWX46173.b2   ATUP637968.y1  ATUP846622.y1  ATGI38272.g1  

ATGI156524.b1  ATWW50911.b1   ATUP151917.g1 ASWX78442.g2

(1) GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKGTNVIINLVAVHQDPRHFP

    EPETFDPDHFKEK (0)

AGGGGACCGAGATCCCACGAGAGATGATCTTAGCAAGATGG

TGTTCCTCGATCACGTGATCAACGAGAGTATGCGCGCAAGGCCAGTGGTCACTTTCGTCA

TGCGCCATGCTGAAGAGGAAGACCACGTGGACGGTTACGTCATACCAAAGGGGACCAACG

TGATCATCAACTTGGTTGCCGTGCACCAAGACCCTCGTCACTTTCCCGAGCCTGAAACGT

TCGATCCAGATCACTTCAAAGAAAAGGT

>ASWX78442.g2

AFPZ597386.y1  ASWX78442.g2   ATGN272728.g1  ATGN236719.g1  AFPZ323837.x1 

AFPZ213319.x1  ATUP570910.b1  AFSA770314.g2  ATUP554961.x1  AFSA555906.g2 

AFPZ556237.x1  ATWW50911.b1   AFPZ791616.y1 

(0) VPSTQFMPFGLGVRSCVGRTIAPL

QMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*

AGGTACCCTCTACCCAGTTCATGCCGTT

TGGCCTCGGCGTTCGCTCCTGTGTGGGACGAACCATCGCACCTCTTCAGATGAAGGCTGT

CCTCATCACGCTACTGCGCATGTACCAACTGAGCCCGTCACGTGATCATCAGAGCCTCGA

GGTGAN

 

>CYP19 amphioxus 37% to CYP19 zebrafish ovarian, 38% to brain form

two possible start METs

    MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS

    MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP (1)

(1) GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)

(2) ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)

(1) VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN (1)

(1) DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)

(2) RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)

(0) EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL (1)

(1) GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKG

    TNVIINLVAVHQDPRHFP EPETFDPDHFKEK (0)

(0) VPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*

 

 

>CYP19A1 Fugu ov Scaffold_7098 64% to LDZ38561.x1 CYP19 Length = 14029 53% to CYP19

= LGS44549.x1 like ovary CYP19 P450s

9466 MAAVGLDAEVLVSVSPNATEAESPGSSAGTRALIILTCLLLLVWSHTEKKSVP 9308 (1)

9242 SLLGPSFCLGFGPLLTYVRFIWTGIGTASNYYNKKYGDIVRVWVNGEETLVISR 9081 (2)

8985 ASAVHHVLKSRQYTSRFGSKQGLSCIGMNERGIIFNNNVTEWRKIRGYFTK 8830 (1)

8759 ALTGPAVQNTVEVCNSSTQAHLDRLEDLAQVDVLSLLRCTVVDISNRLFLDIPIN 8595 (1)

8499 EKELLLKIHKYFDTWQTVLIK----PDIYFKFGWIHQKHKTAA 8392 (2)

8296 RELQEAIEGLVEQKRRDLEQADKLENINFTAELLFAQ 8186 (0)

8084 NHGELSAENVMQCVLEMVIAAPDTLSVSLFFMLLLLKQNPDVELQLLQEIDAVVGK (0 expected, bad boundary)

     RQLQNGDLQKLRVLETFINECLRFHPV 7719

7718 VDFTMRRSLSDDVIEGYRVPKGTNIILNTGHMHRTEFFLRPTEFCLQNFEKN 7563 (0)

     APRRYFQPFGSGPRACVGKHIAMVMMKSILVTLLSQYSVCPHEGLT 7327

7326 LDCLPQTNNLSQQPVEHQEEAQQLSMRFLPRQRGSWQTV* 7207

 

>gi|47847288|dbj|AB178482.1| Rana rugosa mRNA for P450 aromatase, complete cds

          Length = 1726

 

Frog

Query: 9   VLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIWAATAGYAA 68

           V+  V L++++W Y     TS I     PGP Y   L PL T   FL  GI +A+  Y +

Sbjct: 105 VVAFVFLLIIIWSYEE---TSSI-----PGPSYCLGLGPLITYGRFLWTGIGSASNYYNS 260

 

Query: 69  KYGDFVRVWLGTEQTFIIS 87

            YG+FVRVW+  E+T IIS

Sbjct: 261 MYGEFVRVWINGEETLIIS 317

 

MILEALNTMQYNITEAMPSLAPATAASVVAFVFLLIIIWSYEETSSIP

GPSYCLGLGPLITYGRFLWTGIGSASNYYNSMYGEFVRVWINGEETLIISSSSA

TCHVMKHGHYVSRFGSKLGLQCIGMNENGIIFNSNPSLWKVIRPFFNRALSGPGLIQT

TEHSMKSTKRFLAKLSDVTDQVGNVNVLKLMRLIMVDTSNNLFLRIPTDENEIVLQIQ

KYFDAWETLILKPDIFFKFSWLYKKYEKSVNDLKKAVEILIEQKRQELSASDKLDEHL

DFASELIFAQNHGVLTAENVNQSIVEMLIAAPDTMSVSLYFILTLIAQHPKAEKMILD

EIHAVVGDREVQSSDMPNLKVLENFIYESMRYQPVVDVVMRKALEDDVIDGYYVKKGT

NIILNIGRMHKVEFFPKPNEFSLENFEKTVPQRYFQPFGFGPRACAGKYIAMVMMKAI

LVTLLKRYKVQTLQGRCLENIHNNNNLSTYPDESQSSLEMAFISLHTAPLAH

 

gi|58384757|gb|AY859423.1| Mugil cephalus aromatase cytochrome P450 brain isoform (Cyp19b)

           mRNA, complete cds

          Length = 2313

 

 Score = 53.1 bits (126), Expect = 3e-06

 Identities = 30/88 (34%), Positives = 48/88 (54%)

 Frame = +2

 

Query: 1   MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIW 60

           M  G   +V  ++LL++L++  +  TW+     + P GP ++  L P+ +   F+  GI

Sbjct: 140 MTAGTASEVASLLLLLLLLFLLLVTTWSRTHRSLIP-GPYFLAGLGPILSYIRFMWSGIG 316

 

Query: 61  AATAGYAAKYGDFVRVWLGTEQTFIISR 88

            A   Y  KYG  VRVW+  E+T I+SR

Sbjct: 317 TACNYYNNKYGSIVRVWINGEETLILSR 400

 

MMLLLLEKLTMGPMTAGTASEVASLLLLLLLLFLLLVTTWSRTH

RSLIPGPYFLAGLGPILSYIRFMWSGIGTACNYYNNKYGSIVRVWINGEETLILSRSS

AVYHVLRSAHYTSRFGSKMGLECVGMEGKGIIFNNDVPLWKKVRAYFAKALTGPGLQR

TVGICVSSTAKRLDRLQDVTDSSGHVDVLNLLRAIVVDISNRMFLRVPLNEKDLLMKI

QNYFETWQTVLIKPDIFFKMGWLYNKHKRAGKELQDAMDALLDIKRKIINETEKLDED

FDFATELIFAQNHGELSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQNPDVEMQIV

EEMNTILSERDVQNLDYQGLKVLESFINESLRFHPVVDFTMRKALEDDNIDGTAIRKG

TNIILNIGLMHKTEFFPKPKEFSLMNFDRTVPSRFFQPFGCGPRSCVGKHIAMVMMKA

ILVTLLSRYTVCPRQGCTLNSIKQTNNLSQQPVEDEHSLAMRFIPRAAQPQLKPL

 

>gi|44886089|dbj|AB164064.1| Cynops pyrrhogaster P450 arom mRNA for cytochrome P450 aromatase,

           complete cds

          Length = 3176

 

 Score = 54.7 bits (130), Expect = 1e-06

 Identities = 35/80 (43%), Positives = 44/80 (55%)

 Frame = +2

 

Query: 9   VLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIWAATAGYAA 68

           +LLV   ++LVW Y     TS I     PGP Y   L P+ +   FL  GI +A   Y 

Sbjct: 320 LLLVSCFLLLVWRYEE---TSSI-----PGPGYCMGLGPVLSYCRFLRTGIGSAANYYNN 475

 

Query: 69  KYGDFVRVWLGTEQTFIISR 88

            YGDFVRVW+  E+T IIS+

Sbjct: 476 LYGDFVRVWINGEETLIISK 535

 

MLLETLNPMYYNISHVVPEVSPTATVSLLLVSCFLLLVWRYEET

SSIPGPGYCMGLGPVLSYCRFLRTGIGSAANYYNNLYGDFVRVWINGEETLIISKSSA

TFHVMKHEHYTSRFGSRLGLQCVGMNENGIIFNSNPSLWKEIRPYFSKALSGPGLVQT

TDMCIKSTLTYLSRLKEVTTENGNVNVLTLMRLIMLDTSNNLFLRIPLDESEIVLKIQ

KYFDAWQALLLKPDIFFKISWMYYKYEKSAKDLKEAIEKLIEKKRKKLSTVERLEENM

DFASELIFAQNRGDLSADNVNQCILEMLIAAPDTLSVTLYFMLMLIAQHPRVEAKIME

EIKAVIGDREVRSTDMQNLKVVESFICESMRYQPVVGLVMRKALADDVIDGYYVKKGT

NIILNLGRMHRVEYFQKPNEFTLENFQKNVPYRYFQPFGFGPRACAGKYIAMVMMKAI

LVTLLKRYSVQPIMGRCLENIQNNNDLSVHPDETQSSLEMVFLPRNGTI

 

>gi|40021578|gb|AY489060.1| Halichoeres tenuispinis

protogynous Wrasse

MLMDVSSEVTVFLLLMVLLLLFTSWSRTQKQIPGPPFLAGLGPL

LTYSRFIWTGIGTACNYYNNKYGSIVRVWINSEETLILSRSSAVYHVLRSAHYTARFG

STTGLECIGMEGKGIIFNSDVQLWRKVRTYFSKALTGPGLQRTVGICVSSTAKHLERL

KEMTDPSGHVDALNLLRAIVVDISNKLFLRVPINEKDLLMKIQSYFETWQTVLIKPDI

FFKIGWLYNKHKKAAQELQDVMESLLVTKRKMIKESEKLDDDLDFATELIFAESHGEL

SADNVRQCVLEVWRSQLQYTLSISLFFMLMLLKQNPDVELRIVEEMNTVLREKGDGNL

DYQSLNVLESFINESLRFHPVVDFTMRKALEDDNIEGIKIAKGTNIILNIGLMHKTEF

FPHPTEFSLTNFDKTVPSRFFQPFGCGPRSCVGKHIAMVMMKAILVALLSRYTVCPRQ

GCTINSIRQTNDLSQQPVEDEHSLAMRFIPRATQPPLSHIFSQEM

 

Gen Comp Endocrinol. 1984 Oct;56(1):53-58.

 

    In vitro conversion of androgen to estrogen in amphioxus gonadal tissues.

 

    Callard GV, Pudney JA, Kendall SL, Reinboth R.

 

    The ability to convert androgen to estrogen (aromatization) is a constant feature of gonadal and neural tissues in all major vertebrate groups. In experiments reported here, the existence of this pathway was investigated in the protochordate amphioxus (Branchiostoma lanceolatum). Following incubation with [3H]19-hydroxyandrostenedione, gonadal homogenates contained authentic estrone and estradiol-17 beta, as determined by derivative formation and recrystallization to constant specific activity. Cephalic ("brain") and other segments were aromatase negative. The results indicate that a potential for estrogen biosynthesis in the gonads predates that in other tissues and arises prior to the evolution of true vertebrates.

 

danio aromatase mRNA

atg gcaggtgatc tgctccagcc ctgtggaatg aagccggtgc gtctcggcga

       61 ggctgtggtg gatcttctta tccaaagggc tcataacggc actgaaaggg ctcaggacaa

      121 tgcgtgtgga gctacagcca caatactgct gctgctactc tgcctgctgc tagccatcag

      181 acaccatcga ccacacaaat cacacattcc aggtccttct ttcttttttg gtctgggtcc

      241 tattgtctcc tactgtcggt tcatctggtc tgggatcggg actgccagca actactacaa

      301 cagcaagtat ggagacattg tgcgtgtctg gatcaatggt gaggaaactc tcatcttgaa

      361 caggtcgtca gctgtatatc acgtgttaag gaagtctttg tacacttcac gctttggaag

      421 taaactgggt ctgcagtgca tcgggatgca tgagcagggc atcatattca actcaaatgt

      481 ggctctctgg aagaaagtcc gtgcatttta tgctaaagct ctcacaggtc cagggcttca

      541 gaggactatg gagatctgca ccacctccac aaactctcac ctggacgatt tgtctcagct

      601 gacggatgct caaggacagc tggacattct taacttactg cggtgcatcg tggtggacgt

      661 ttccaacaga ctgtttctag gagtcccgct caatgagcac gatctgcttc agaagattca

      721 taaatacttt gacacctggc agactgtatt aatcaagcct gatgtctact tcagactgga

      781 ctggctgcac aagaagcaca agagagatgc tcaggagttg caggatgcca tcacagctct

      841 gatcgagcag aagaaagttc aactggcaca cgcagagaaa cttgaccacc tcgactttac

      901 agcagagctg atatttgctc agagccatgg agagctgagc gcagagaacg tcaggcagtg

      961 tgtgttggag atggtgatcg cggctccaga cactctctcc atcagtctgt tcttcatgct

     1021 gctgttatta aaacaaaatc cagatgtcga gttaaagatc ctgcaggaaa tggacagtgt

     1081 tttagctggc cagagcctcc agcactcgca tctgtccaag ctgcagatcc tggagagttt

     1141 tatcaacgag tctctacgtt ttcacccggt cgtggacttc accatgcggc gggcgctgga

     1201 tgatgatgtc atcgagggat acaacgtgaa gaaaggaaca aacatcatac tgaatgtggg

     1261 tcggatgcac agatccgaat tcttctccaa acccaatcag ttcagtcttg acaacttcca

     1321 gaaaaatgtt ccgagtcgtt tcttccagcc gttcggatcg ggtcctcggt cgtgtgtggg

     1381 gaagcacatt gccatggtga tgatgaagtc tattctggtg gctctgctgt ctcgtttctc

     1441 tgtgtgtcct atgaaggcct gtacagtaga aaacatcccg caaaccaaca acctgtcaca

     1501 gcagccggtg gaggagccgt ccagcctcag cgtgcagctt atcctcagaa acactctc

 

Xenopus CYP19 mRNA

atggaagcc ttgaatccag tgcagtataa

       61 catcacagaa gctgttccca ctctggcacc tgccactact ctttctctgc tgctcttcat

      121 ttttgtgctc atcattctat ggaatcaaga ggagacatct ctgataccag gcccagctta

      181 ttgcatggga ctcgggcccc tcatttctta tggccgtttt ctactgacag gaattggcaa

      241 agcagcaaat tactacaaca acatgtatgg agaatttgtg agagtctgga ttaatggcga

      301 ggaaacactg attatcagca aatcttcagc aacatttcac atcatgaaac acagccatta

      361 tgtctcacgc tttggaagca agctagggct acagtgcatt ggcatgaatg aaaatggcat

      421 catattcaat agcaacccat ctctatggaa ggtcattcgg ccatacttca tcagagcttt

      481 gtctggtcca ggacttatgc aaacaacaga aaactgtata agatctacaa atcactacct

      541 ggataacctg agtaatgtta caaatgaact gggaaatgta gatgtcctta agctaatgag

      601 gcttattatg ttagatacat caaacaatct cttcctaagg atacccttag atgaaagtga

      661 aattgttctt aagatccaga aatactttga cgcctggcag gctctgcttc tgaaaccaga

      721 catcttcttt aaaatttcct ggctgtacaa gaaatatgaa aaatcagcaa atgatctgaa

      781 ggaagctatt gaacttctca ttgaacagaa aagacagaaa ctctcaagtt ctgagaagct

      841 ggatgaggat atggattttt catcagaact catatttgcc cagaatcatg gagatctaac

      901 agctgagaat gtcaatcagt gtattctgga aatgctaata gctgctcccg ataccatgtc

      961 tgtatctctc ttcttcatgt tagttctgat tgctcaacac ccaaagatag aagaaggaat

     1021 aatgaatgaa atggataaag ttattggtaa ccgggatgta gagagcaatg acattccaaa

     1081 tcttaaaatt ctggagagct ttatttatga aagcatgagg taccaaccag tggtagacct

     1141 ggttatgcgc aaggctctgg aggatgatat cattgatggt tactatgtga agaaaggcac

     1201 taacatcatt ttaaacttgg ggcgcatgca caaaattgta tactttccaa aaccgaacga

     1261 gttcaccttg gaaaattttg aaaagacggt tccatatcgt tacttccagc cctttggctc

     1321 tggtccacgt gcatgtgccg ggaagtacat agccatggtg atgatgaaag tcattctggt

     1381 tactcttctc aagaggtaca aagtgcagac attgagagga agatgcctgg agaatatcca

     1441 aaataacaat gatttgtcca tgcaccctga tgaaagtcaa ccttccttag agatgatctt

     1501 cattcctaaa aacacagcag agttcaaact g

 

CYP20 (no hits with megablast using Danio CYP20 mRNA)

 

>ATGN150171.g1 aa 1-24

also ATGN245625.b1 ATGN288660.b4

640 MLDYAIFAITFVVFLIATVLYLYP  (0) 711

ASFW100504.g2 mate pair = C-term

165 MLDYAIFVITFVVFLIATVLYLYP  (0) 94

 

>ATGN288660.b4 exon 2

GANKITTIPGLEPSDPK (2)

 

>ATUP402034.g1 aa 41-90

DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP (1)

 

 

walked upstream of ATGN370239.b1 

ASWX106517.g2  ATUP286141.x2  ATWW97828.g1   AFPZ484891.b2 

AFPZ43753.b2   APWS171586.b1  ATUP727787.y1  ATUP750333.x1 

AFSA945609.b2  AFSA313913.g2  ATUP328643.y1  AFPZ512582.y1 

AFPZ529103.y1 

 

poor match on ATUP750333.x1

 (1)    ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)

 

>ATGN370239.b1 aa 143-202

518 LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDIV 688

ATGI38129.b1 AFSA140408.g2 mate pair = C-term

    ILQLGQEMAKKWETMEGDQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDIV

ATWW28266.b2 mate pair = C-term

221 VLQLGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNY 391

 

>ATGI38129.b1

(0) CWNDMEERIKGSHPTEGSPREKKFKE (1)

AGTGTTGGAATGATATGGAAGAGAGGATCAAGGGAAGTCACCCC

ACGGAAGGAAGTCCCAGGGAGAAAAAGTTTAAAGAAGGT

 

>APWS103772.b1

(1) ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)

AGCACTGGGAAAGTTACACGCTACTATTGCACGGGTGGCAAAGTA

CCGTCGAGAGAACCCTTCCCCACCCCAGGAGCAACTCTTCATCGATGTGCTCATTGAGGG

GAATCTGCCTGAGGAGCAGGT

 

>ATGN158528.b1

(0) VLCDAMTFTVGGIHTSGN (1)

AGGTCTTGTGTGATGCTATGACATTTACGGTTGGGGGAATCCATACTTCTGGAAACTGT

 

>ATGN168444.b1 I-helix region, mate pair = C-term

AFPZ734943.y1   ATUP79075.y1    ATGN168444.b1   ATWW144961.g1  

AFSA813644.b2   AFPZ813089.b2   AFPZ604816.b2  

(1) VLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV (2)

AGTGCTGACATGGGCCCTGTACTACATCGCCACTCATGAGGAGGTAGAAGAG

AAACTGCACCAGGAACTGAGCGATGTCTTGNGGAAGAAAGGAGAAGTCACCCCTGACAAC

ATCTCACAACTAGTGT

 

>AFPZ604816.b2

AFPZ910670.x1  AFPZ33570.g2   APNK104245.g1  ATGN336865.g1  ATGN336865.b1 

ATGN326825.b1  ATGN165528.g1  ATGI17171.b1   ATUP478376.x1  ATUP166894.x1 

AFPZ798704.x1  AFPZ868290.x1  ASFW78821.b2   AFSA582650.b2  AFPZ450600.b2 

(2) YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)

AGGTACCTACG

ACAGGTTCTTGACGAGTCGTTGCGCTGTGCCGTGATCGCTCCATGGGGCGCACGTTACAT

GGACCTGGACGCTGAAGTAGGAGGCCACATTGTGCCAGCCAAGGT

 

>AFSA912322.b2

772 (0) QTPVIHAFGVVLQDERIWPEPNK (2) 834

AGACCCCAGTTATTCATGCTTTTGGAGTTGTCCTCCAAGATGAGAGGATTTGGCCAGAGCCAAACAAGT

GAATTTTAATTTTCAACATTTTGGGGCTCTTAGTGTTAAAATATAGAAACTCCAGAAAA

 

>ATUP839681.y1

(2) FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)

AGGTTTGATCCAGATAGGTTTGATGCAGAG

AACAGTAAGGGTCGTCACAAGTTGGCATTCCAGCCATTTGGGTTTGCGGGGGGTCGCAAA

TGCCCAGGT

 

>ATGN302804.b1 aa 410-462

AFPZ226042.y3 AFPZ39024.g2  AFPZ33570.b2  ATGN302804.b1 

ATWW28672.g1  AFPZ888298.y1  AFPZ888298.x1  AFSA693004.b2 

2 nucl diffs

AFPZ336078.y1  AFPZ535923.x1  AFPZ30312.b2   APWS33488.g1   ATGN214653.g1 

ATGI38129.g1   ATWW28266.g2   ASFW145026.g2  ASFW100504.b2 

AFSA939980.g2  AFSA140408.b2  AFPZ444424.x1  AFPZ593083.g2 

3 nucl diffs

ATGI33407.b1  APWS15046.g1  ATGN168444.g1 ATUP527043.g1 AFSA547804.g2 

52 (1) GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD* 204

AGGTTCACCTATACGTGGACATCAGTGTTCCTGTCCATCCTGTGCCGACAGTTCAAGCTCCATCTGGTG

GACGGACAGGTGGTCAAGCCGTGCCACGGGCTCGTCACGCGCCCGGTCGACGAGATCTGG

ATTACGGTCACCAAGCGTGACTAA

 

>CYP20 amphioxus 39% to CYP20 Danio

    MLDYAIFAITFVVFLIATVLYLYP (0)

(0) GANKITTIPGLEPSDPK (2)

(2) DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQH ERIFDRP (1)

(1) ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)

(0) LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0)

(0) CWNDMEERIKGSHPTEGSPREKKFKE (1)

(1) ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)

(0) VLCDAMTFTVGGIHTSGN (1)

(1) LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV (2)

(2) YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)

(0) QTPVIHAFGVVLQDERIWPEPNK (2)

(2) FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)

(1) GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*
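
In an assembled gene like the one above, the number at the end of one exon line should equal the number at the start of the next. The sketch below checks that, assuming the parenthesized digits are intron phases; lines with a missing number are skipped rather than flagged. The regular expression and function names are illustrative.

```python
import re

# Optional leading "(n)", peptide text (may contain spaces or '*'),
# optional trailing "(n)".
EXON_LINE = re.compile(r"^\s*(?:\((\d)\)\s*)?([A-Z* ]+?)\s*(?:\((\d)\))?\s*$")


def parse_exon_line(line):
    """Return (start_phase, peptide, end_phase); a missing phase is None."""
    match = EXON_LINE.match(line)
    if not match:
        raise ValueError("unrecognized exon line: " + line)
    start, peptide, end = match.groups()
    return (
        int(start) if start else None,
        peptide.replace(" ", ""),
        int(end) if end else None,
    )


def phase_mismatches(lines):
    """List (exon_index, end_phase, next_start_phase) where phases disagree."""
    exons = [parse_exon_line(line) for line in lines]
    problems = []
    for i in range(len(exons) - 1):
        end, start = exons[i][2], exons[i + 1][0]
        if end is not None and start is not None and end != start:
            problems.append((i, end, start))
    return problems


# First four exons of the assembled amphioxus CYP20.
cyp20 = [
    "    MLDYAIFAITFVVFLIATVLYLYP (0)",
    "(0) GANKITTIPGLEPSDPK (2)",
    "(2) DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQH ERIFDRP (1)",
    "(1) ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)",
]
print(phase_mismatches(cyp20))

# Hypothetical broken input: the second exon's start phase is altered.
broken = [cyp20[0], "(1) GANKITTIPGLEPSDPK (2)"] + cyp20[2:]
print(phase_mismatches(broken))
```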

 

>CYP20 Danio rerio (zebrafish) ctg10765 74% to fugu

 9501 MLDFAIFAVTFVIILIGAVLYLYP (0) 9572

      SSRRASGVPGLNPTEEK (2?)

      DGNLQDIVNKGSLPEFLVGLH

      DEFGSVASFWFGARPVVSLGAVNQLRQHINPNWT (1) 10024

12291  TDSFETMLKSLLGYQSGSGVGLTESMMRKKVYE-GAINKTLENNFPLLLQ (0) 12439

12929 QVEELVDKWASYPKSQHTPLCAHFL 13003

13003 GLAMKAVTQLAMGSRFRDDAEVIRFRKNHEA (0) 13095

15738 IWSEIGKGYLDGSLEKSSSRKAHYES (1?) 15815

15897 ALAEMESVLKSVAKQRPGQGSSQSFVNYLLQANLTERQ (0) 16010

16583 VMEDGMVFTLAGCVITAN (1) 16636

17689 LCIWAVHFLSVSEAVQDRLYHELVEVLGDELVSLEKIPQLR (2) 17811

19293 YCQQVLNETVRTAKLTPVAARLQEVEGKVDQHVIPKE (0) 19403

21269 TLVIYALGVVLQDADTWSLPYR (2) 21334

21425 FNPDRFAEESVMKSFSLLGFSGSQACPELR (2) 21514

      FAYTVATVLLSTLVRRLRMHRVDGQVVEARYELVTTPKDDTWITVSKRN*

 

>CD784670.1 CYP20 Rhipicephalus appendiculatus cDNA a Tick

MLDFAIFAVCFVVFLLALVLYLYPSSAKQTTIPGLEPSDKKEG

NVGDIVQAGGLQNFLISLHKEHGPIASFWIGTKLVVSIGKADLFKTQSHVFDKPAELFVL

YRDVMGAGSIFFANGAEARKRRRLIDEVLTGKSLDKFLGPIEKLCSEVVMHLKDTPDDEH

VPVYQYMYALCMKISTRLLFGEYFFDDMEVLKFSRNFELCIKELEE

 

>CD295714.1 CYP20 Sea urchin larva cDNA Strongylocentrotus purpuratus

106 MLDFAIFAVTFIILLVGLLIYIYPTTPQKTTTVPGLEPSDPVKGNLDEIGDAGSLHQFLT 285

286 KLHAEHGDIASFYFTDQLCVSITSPELFKEHQAVFNRPALLFKLFEPLITPDSIQYANGG 465

466 DGRKRRDLTDRCFGFQALQNFIGVFNKNHRALVKK 570

 

>DN668857.1  CYP20 Gasterosteus aculeatus cDNA 77% to Danio

Conner Creek sticklebacks

MLDFAIFAVTFVVILVGAVLYLYPSSRRASGIPGLNPTDEKDGNLQDIVDRGSLHE

FLVSLHREFGSVASFWFGGRPVVSLGSVHLLRQHINPNHTTDSFETMLKSLLGYQSAMGG

GAAETVIRKKVYENAINNTLKSNFPLVLKLVEELVGKWKSFPASQHTPLCAHLLGLAMKT

VTQLALGESFGDDAKVLSFRKNHDA

IWSEIGKGYLDGSLEKSSSRKGDYEK

ALSEMESMLLSVXEGKKAQKKQT

FVDALLQFSLTERQ

VMEDCMVFTLAGCVITAN

WGIWAVHFL  

 

CYP21 like (looking, searched with heme signature exon and I-helix exon and EXXR exon from zebrafish, No hits found)

 

>CYP21 AL953915.4 Zebrafish, exon 7 boundaries not certain, no good EST or mRNA in vertebrates. N-term not clear

     MCFSVVSVVLLLFILWMLVVKFWRQSHRRTDG (0)

4754 IVILICVSFYCPIAVFPKLLHSLYKLFFSTVSPTISGPRSL PLLGNMLDLAQDHLPIHLTALA 4942

4943 KCYGNIYRLNCGSTS

5738 AMVVLNNSEIIREALVKKWSDFAGRPYSYTG

5918 XDIVSGGGRTISLGDFSEEWKAHRRVTHSALQRCTTDSLHSVIEKQAQHLCQ 6070

7187 VLRDYSGKAVDLSEDFTVASSNVITTLTFSKA 7282

7369 YDKSSAELQKLQECLNEIVSLWGSPWISALDSFPLLR

     KFPNPPFSRLMKEVARRDELIGKHIEEFK 7665 (0)

     KSEHKEGGTLTSSLLKCLEPQQGAANHT (0)

 8868 TLTDTHVHMTTVDLLIGGTETIAALLNWTVAFLLHRPE 8981

14582 VQDKVYEELCCVLDVRYPQYSDRHKLPYLCALISEMLRLRPVAPLAVPHRAIRNS 14746

16875 SIAGHFIPKNTIIIPNLYGAHHDPEVWDDPYSFKP 16979

17077 ERFLEGGGGSLRSLIPFGGGARLCLGEAVAKMEMFLFTAYLLREFKFLPASKEEPLP 17247

17248 ELRGVASVVLKVKPYTVIAHPREQ* 17322

 

 

CYP24-like (looking; cannot find it with megablast using the N-terminal part of zebrafish CYP24 EST seq CN507760, or the whole Tetraodon mRNA)

 

>CYP24 zebrafish ctg12249 CN507760.1 69% to human CYP24 except N-term 76% to fugu CYP24

      MRAHLQRAPQILELLKKKTAGLQHCKPTSSVCVLDSKDAAGSAPCAHS

90158 LDSIPGPTNWPLFGSLIEVIRNGGLKRQHETL 90253

91424 IHFHKKFGKIFRMKLGSFESVHIGSPCLLEALYRKEGSYPERLEIKPWKAYRDMRDEAYGLLIL 91615

91705 EGRDWQRVRSAFQQKLMKPTEVMKLDGKI 91791

      SEVAADLIKRIGKVNGKMDDLYFELNKWSFET

92515 ICYVIYDKRFGLLQDSVSKEGMDFITAVKT

      MMSTFGTMMVTPVELHKTLNTKTWKDHTEAWDRIFST 92850

95260 AKHYIDKNLQKQSNGEADDFLSDIFHNGNLTKKELYAATTELQVGGVET 95406

      TANSMLWVIFNLSRNPCAQGKLLKEIQDVVPAGQTPRAEHIKNMPYLKACLKESMR

      VSPSVPFTSRTLDKDTVLGDYTLPKG 99051

      TVLMLNSQAIGVSEEYFDNGRQFRPERWLEEKSSINPFAH

      VPFGVGKRMCIGRRLAELQIQLGLCWILRDYK

      IVATDLEPVDSLHSGTLVPSRELPVAFVPR 101033

 

CYP26-like: there appear to be only two, not three. Both are most like 26C.

 

>AFPZ916045.b2 exon 1 APWS115441.b1 APWS115441.b1 ATGN366174.g1

AFSA580900.g2 AFPZ435288.x1 ASWX124624.b2 AFPZ464480.y1 AFPZ610800.g2

MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQ (0)

 

>APWS143762.g1 exon 2 or 3

GADFSRSRHELYGDVYKTHILGRPTVRVRGADNVRKILHGENTLVT

TIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQWCV

QPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLLSK

 

>AFPZ682082.g2

ALRYRQIIDEWLEGHIKRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTA

VELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRL

TYVGQVVKEVLRISPPIGGGFRKALKTFELD

 

>BI377228 Amphioxus 5-6 hrs cDNA 53% to 26C1 Fugu 44% to 26B1 fugu

AFPZ57964.g2

GRLTYVGQVVKEVLRISPPVGGGFRKALKTFELD ()

GFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPDRWAADSDGSRRGRHHYIPFGAGPRACAG

KEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTP

CEPITNNTLSDATEQNTNLSVAQQN*

 

>46% to Xenopus CYP26, 47% to 26C1 hum, 43% to 26A1 hum 42% to 26B1 hum

65% to the second Amphioxus seq

MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQ (0)

GADFSRSRHELYGDVYKTHILGRPTVRVRGADNVRKILHGENTLVT

TIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQWCV

QPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLLSK (0)

ALRYRQIIDEWLEGHIKRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTA

VELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRL

TYVGQVVKEVLRISPPIGGGFRKALKTFELD (0)

GFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPDRWAADSDGSRRGRHHYIPFGAGPRACAG

KEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTP

CEPITNNTLSDATEQNTNLSVAQQN*

 

>CF919306 Amphioxus 26 hrs cDNA 61% to BI377228 52% to 26C1 fugu

ASWX177691.b2 (exon 4)  APWS171289.g1 (exon 2)

46% to 26C1 43% to 26B1 44% to 26A1

GGKFSSSRHAHYGDVFKTHILGRPTIRVRGATNVRKILLGENHIVTSLWPQTFRTVLGT

GNLAMSNGEEHRLRRKVIMKAFNYEALERYVPIMQEILREAVQRWCGAPQPVTVWPMARE

MAFRVASAVLVGFQHSDEEIQHLTSLFTNMVKNLFSLPVKLPGSGLSN (0)

GLFYRQAIDEWMMNHIQRKKEFVLQGGDSGDVLSHIMNNAKDNGEKLSDQEIQDTVVELLFAGHET

TSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRLTYVGQVVKEVLRRR

PPIGGGYRRALKSFDIG (0)

GFHVPKGWAVLYSIRDTHEASQIFSSPELFDPDRWTPETSQAPLARYDMVTFGGGPRA

CVGKEFAKLLLKLLCVELTRRCRWKLADDKLPDMKLIPIVYPADGLPVIFTP

IGGKSPGDENKNGVPYEERTRGKDCPILCSVSFEKDINVAT*

 

>exon 2

APWS171289.g1  AFSA937430.b2  AFSA602616.g2  APNK6178.b2   

ATUP304592.g1  ATGN98133.b1  AFSA785561.b2

AGGGAGGCAAATTCAGTTCCAGCAGACATGCGCACTACGGGGATGTCTTCAAGACCCACA

TCCTGGGCCGCCCGACCATCCGCGTTAGAGGGGCGACCAACGTGCGCAAGATCCTGCTGG

GAGAGAACCACATCGTCACCAGTCTGTGGCCGCAGACGTTCCGCACGGTTCTGGGGACCG

GGAACCTCGCCATGAGTAACGGCGAGGAGCACAGGCTGCGCAGAAAGGTCATCATGAAGG

CCTTCAACTACGAGGCGCTGGAGAGGTACGTCCCCATCATGCAGGAGATCCTGCGCGAGG

CTGTCCAGCGGTGGTGCGGGGCTCCCCAGCCGGTGACTGTCTGGCCCATGGCACGGGAGA

TGGCCTTCCGTGTGGCGTCAGCCGTCCTGGTGGGCTTCCAGCACAGCGATGAGGAGATCC

AACACCTCACCTCTCTCTTCACCAACATGGTCAAGAACCTCTTCTCTCTCCCAGTCAAGC

TACCCGGGAGTGGGCTCAGTAACGT

 

>exon 3 ATWX28853.g1 ATUP829694.b2 ATGI148270.g1 AFSA233238.b2

AFPZ187567.x1 ATUP946171.b1 ATUP911017.x1 ATUP863742.g1

ATGI148270.g1 ATWX28853.g1  APWS145712.b1 AFPZ130036.y1

AGGGGCTGTTTTATCGACAAGCCATCGATG

AGTGGATGATGAACCACATTCAGAGGAAGAAAGAGTTTGTGCTGCAGGGTGGCGACAGTG

GAGACGTCTTGTCGCACATCATGAACAATGCGAAGGACAACGGAGAGAAGTTGTCTGACC

AGGAGATCCAGGACACGGTGGTGGAGCTGCTGTTTGCCGGGCACGAGACCACGTCCAGCG

CCGCCACCTCCCTCATCATGCACCTGGCGCTGCAGCCACAGGTGGTTCAGAAGGTGCAGG

AGGACCTGGAGAAGCACGGGCTGCTGCAGCCGGACCAGCCTCTGAGTCTGGAGCAGGTCG

GCAGGCTGACGTACGTGGGGCAGGTCGTCAAGGAGGTGCTCAGGCGGCGCCCGCCCATTG

GAGGAGGCTACCGCAGAGCGCTCAAGTCTTTTGACATCGGCGT

 

>exon 4

ASWX177691.b2 ATUP915224.y1 ATUP215152.y2 AFSA763907.g2

AFSA344826.g2 AFPZ764896.b2 AFPZ69374.g2  AFPZ407263.b2 AFSA77547.g2

AGGGTTTCCATGTGCCCAAGGGATG

GGCGGTACTGTACAGCATCAGAGACACACACGAAGCCTCCCAAATCTTCTCCTCGCCGGA

GCTGTTCGACCCTGACCGATGGACCCCCGAGACATCCCAGGCGCCCCTGGCCCGGTACGA

TATGGTGACGTTCGGCGGGGGACCACGAGCCTGTGTCGGGAAGGAGTTTGCCAAGCTCCT

ACTGAAGCTTCTGTGTGTGGAGCTGACGAGAAGGTGCCGCTGGAAGCTGGCAGACGACAA

GCTACCGGACATGAAGCTCATTCCCATCGTGTATCCAGCCGACGGCTTACCTGTTATCTT

CACTCCCATTGGCGGAAAGTCACCTGGTGACGAAAACAAAAATGGCGTGCCGTATGAGGA

GAGGACAAGGGGCAAGGACTGTCCTATTCTCTGCTCGGTGTCGTTTGAAAAAGACATAAA

CGTCGCGACT

 

 

CYP27 (looking, no hits with megablast using human 27A1 or Xenopus 27A)

 

Endocrinology. 2003 Jun;144(6):2704-16.

                                                         

Cloning of a functional vitamin D receptor from the lamprey (Petromyzon marinus), an ancient vertebrate lacking a calcified skeleton and teeth.

 

Whitfield GK, Dang HT, Schluter SF, Bernstein RM, Bunag T, Manzon LA, Hsieh G, Dominguez CE, Youson JH, Haussler MR, Marchalonis JJ.

Department of Biochemistry and Molecular Biophysics, College of Medicine, University of Arizona, Tucson, Arizona 85724, USA. kerr@medbioc.arizona.edu

 

The nuclear vitamin D receptor (VDR) mediates the actions of its 1,25-dihydroxyvitamin D(3) ligand to control gene expression in terrestrial vertebrates. Prominent functions of VDR-regulated genes are to promote intestinal absorption of calcium and phosphate for bone mineralization and to potentiate the hair cycle in mammals. We report the cloning of VDR from Petromyzon marinus, an unexpected finding because lampreys lack mineralized tissues and hair. Lamprey VDR (lampVDR) clones were obtained via RT-PCR from larval protospleen tissue and skin and mouth of juveniles. LampVDR expressed in transfected mammalian COS-7 cells bound 1,25-dihydroxyvitamin D(3) with high affinity, and transactivated a reporter gene linked to a vitamin D-responsive element from the human CYP3A4 gene, which encodes a P450 enzyme involved in xenobiotic detoxification. In tests with other vitamin D responsive elements, such as that from the rat osteocalcin gene,  lampVDR showed little or no activity. Phylogenetic comparisons with nuclear receptors from other vertebrates revealed that lampVDR is a basal member of the VDR grouping, also closely related to the pregnane X receptors and constitutive androstane receptors. We propose that, in this evolutionarily ancient vertebrate, VDR may function in part, like pregnane X receptors and constitutive androstane receptors, to induce P450 enzymes for xenobiotic detoxification.

 

CYP11 amphioxus, most similar to CYP11A1 of vertebrates

 

Looking for N-term

 

These sequences are repeat sequences. There are too many of them to be a true gene.

 

>APWS65319.g1 (query)

Query: 698 ETISTSDTKWRPRIAKILKPAD*LEIPGEYVGQGAFFKFINV*TMSPERNTRYLXHIXCI 519

           +T ST D  WRPRIA ILKPAD*LEI GEY G   FFKFINV*TMSPER+TR L HI CI

Sbjct: 191 QTPSTGDPTWRPRIASILKPAD*LEITGEYAGLATFFKFINV*TMSPERHTRSLTHIHCI 370

 

                                     

Query: 518 LESFVVVCGVSGSFVVICGV*ADRLLT 438

           LESFVVVCG+S SFVV+CG+*ADRL T

Sbjct: 371 LESFVVVCGISESFVVVCGI*ADRLQT 451

 

>APWS65319.g1 (query)

ATGTCTCCAGAGCGCAACACAAGGTATTTGANCCATATATANTG

CATATTAGAGTCGTTTGTGGTCGTTTGTGGTGTTTCTGGGTCGTTTGTGGTCATTTGTGGTGT

 

APWS32978.b1 Sbjct

ATGTCTCCAGAGCGCCACACAAGGTCTTTGACCCATATACACTG

CATTTTAGAGTCGTTTGTGGTCGTTTGTGGTATTTCCGAGTCGTTTGTGGTCGTTTGTGGTAT
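
The query nucleotide stretch above is visibly repetitive within itself (runs of GTCGTTTGTGGT-like units). A minimal sketch of how that can be checked by counting overlapping k-mers; the choice of k = 10 and the helper name are arbitrary assumptions for illustration.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

# Second line of the APWS65319.g1 query shown above
query = "CATATTAGAGTCGTTTGTGGTCGTTTGTGGTGTTTCTGGGTCGTTTGTGGTCATTTGTGGTGT"

counts = kmer_counts(query, 10)
repeated = {kmer: n for kmer, n in counts.items() if n > 1}
for kmer, n in sorted(repeated.items(), key=lambda item: -item[1]):
    print(kmer, n)
```

This only shows repetition inside one read. Whether the element is repeated across the genome still has to be judged from the number of trace hits.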

 

 

Walked up to ATGN68471.b1

ATGN295224.g1  ATGN68471.b1   ATUP813076.x1  (mate pair to LKIV exon)

ATUP179744.g1  ATGI135158.g1  (mate pair to heme sign. exon)

ATUP548358.y1  ATWW121053.b1  APNK7243.b2    AFSA920522.b2  AFSA567674.b2 

AFSA527467.g2  ATUP163891.y1

MSQVPII (0)

ATGAGTCAGGTGCCTATCATAGT

(0) TYSTAAVGSTSHHDDDREAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)

AGACATACAGCACAGCTGCAGTCGGGTCTACCAGCCACCACG

ATGACGACAGGGAGGCCAAGCCTTTCTCTGCCCTGCCTGGTCCC

CCCTCTGTACCAGTCCTCGGCAACTTCCTGCACATGTGGTGGGAGGGACTCCTCGAGAAA

GAAAAGCTCAACAAAAATCATATCATGTTCACAGATTTCTTTCGTCAGTATGGTCCAATATTCAGGT

 

>ASFW5255.g2

ATUP770573.x1  ATUP502350.b1  ATUP22951.b1   ATUP261870.b1  ATUP813076.y1 

ATUP163891.y1  ATGN295224.g1 

2 nuc diffs but same aa seq

AFPZ508950.x1  ATGI69596.g1   APWS104982.b1  AFSA460369.b2  ATUP958061.g1 

ATGN113210.g1  ATGI237486.g1  ASWX31370.b2   AFPZ863602.y1  AFSA738793.g2 

(2) LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)

AGTGTTCCGGAGGAGGGAAGTACCGGCACGCATCGACATCAAACCCTG

GAGGAGGTACAGGGAAATCTCAGGCAAGGCCACTGGAGTGTTTCTCAGGT

 

>ATWX54621.b1 C-helix (mate pair to GRR exon)

(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV PNISDELFKWALE (1)

AGTAATGGCAAAGACTGGCAGAAAAACAGGTCCATAATGGCGCGACCCATGC

TACGCCCCAAACATGTGTCTACGTACGTCAGCAACTTGGACACGGTGTCAGCCGACATGA

TCAAGCGACTGCGAGTACTCCAGGCAAGGGCCGATGGGATAGAAGTTCCAAACATATCAG

ATGAGCTGTTCAAATGGGCTCTAGAATGT

 

>ATWX25911.b1

SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVF (1)

AGCCATCTGCACGGTCCTGTTCAATGAGCGGATGGGGTATCTACAG

GACAACATCTCCCAGGATGCTCAGGACTTCATCCAGGGTATCCACACCATCTTCCTCACA

ACCAACACCGTCATCTTCCCTGACGCGGATGTGCATCGTTTCCTGAGAACCAAACCGTGG

AGACAGTCTGTGGAGGCGTGGGACACGGTTTTCCGT

 

>AFPZ267404.y1 (mate pair to GRR exon)

(1) REKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)

AGGTGAGAAGGTGATG

GTCCGTAAGTTACAAGAAGCTCTGGAGCGGGAGGAGAGGGGGGAGGGGGAGGACGATCAA

CCCAACTTCCTGGCATTCGTCAACAGCACAGGGAGGCTGACCAAGGATGAGATTTACTCC

AACACCATTGAGTTGATGGGTGCTGCTATTGACACGGT

 

>ATGN150620.g1 55% to zebrafish 27A aa 281-336

AFSA195579.g2  AFPZ323214.y1  ATGN360541.b1  ATGN150620.g1  ATWW217237.g1 

ATWW208446.g1  ATUP560326.g1  APNK7248.b2  AFSA699487.b2  AFPZ633876.y1 

APWS109299.g1  ATWX69434.g1   ATUP305845.g1  ATGI148000.g1  AFSA903424.g2 

AFPZ643989.g2 

 

 (0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885

AGACCTCCAACACCCTCCTGTGGACCCTGTACGAGCTGTCACGCAGACCTGAACTCCAGGACAGACTG

TATCAGGAGGTCACACAGGTCATAGGTCAGGACAAGGTCATGACCTGNGATCACCTGAAG

GACCTGCACCTCCTGAAGGCCATCATTAAGGAGACTCTGAGGT

 

>ATWX69434.g1

APNK115343.b1  ATWX69434.g1   ATUP302108.b1  ATGN360541.b1  ATGN112603.g1 

AFSA814391.b2  AFSA350501.g3  ATUP613311.y1  ATWW217237.g1 

(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)

AGGATGTATCCAGTTGTCCATAATGTCAGCCGTTTGCTGCAGGAGGACACAGTGCTCATGGGATATCGG

TTACCCGCAAAGGT

 

 

walked down to

AFPZ139562.y1  ATUP806443.x1  ATGN112603.g1  ASFW123561.b2  AFPZ913658.y1 

51% to 27A

AFPZ913658.y1  ATUP205039.x2  ATWX47281.g1   APWS151338.g1  ATUP560326.b1 

ATWW147637.g1  ASFW123561.g2  AFSA517290.b2  AFPZ633876.x1  ATGI135158.b1  APWS157725.b1  AFSA460369.g2  AFSA100624.g2  AFSA100624.b2  ATUP806443.x1 

ASFW123561.b2 

 (0) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)

AGACCTGCGTGGTTGCCCAAGTGTACGCCATGGGGCGGGACCCC

CAGCTGTTTCCTGATCCAGACGAGTTTAAACCCGAGCGCTGGTTGAGAACAGGAGAGGCT

CACGATGAAATCAACCCGTACAGCTCTCTGCCATTTGGCTTCGGGCCACGCAGTTGTCTTGGT

 

>AFPZ633876.x1

ATWX54621.g1   ATGN45774.g1   ATGN92985.g1   ATWW147637.g1  ATUP440242.g1 

ASFW123561.g2  ASFW5255.b2    AFPZ633876.x1  AFPZ267404.x1  ATUP305845.b3 

ATUP305845.b1  ASFW110757.g2  AFPZ455460.x1 

(1) GRRVAEVELQLLLAK (0)

AGGTCGTCGTGTGGCAGAGGTCGAGCTGCAACTCCTTCTTGCAAAGGT

 

>ATWX25911.g1  last exon of a CYP27 like gene

Note: there is an odd repeat of the first 10 aa of this seq, up to 10X, in ATWX25911.g1

AFPZ79643.b2  ATGI71035.g1  ATWX25911.g1  ATGN112603.b1 (mate pair)

ATWW217237.b1 (mate pair) ATWW212578.g1 ATWW208446.b1 (mate pair) AFPZ944117.x1

AFSA285680.b2 ATGN209347.b1 AFPZ755865.b2

(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*

AGATGTCCCAGCAGTTTGTGCTGAGTC

AGGTGGAACCAGAAGAGATTTCCTCAGTAGCGCAGCCGTTACTGATGCCGGAGACACCCC

TGCACCTCCGGTTTGTGGACAGGAAGTAA
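
Each exon record above pairs a peptide line with the nucleotide it was read from, usually including the AG/GT splice-site bases. A small sketch of how a peptide can be re-derived from the nucleotide, assuming the standard genetic code; the offset argument (number of leading bases to skip, e.g. the AG acceptor) is an illustrative convention, not part of the original notes.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(nuc, offset=0):
    """Translate nuc starting at offset; incomplete trailing codons are ignored,
    codons with ambiguous bases (e.g. N) become X."""
    nuc = nuc.upper()
    codons = (nuc[i:i + 3] for i in range(offset, len(nuc) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# First exon: trailing GT is the splice donor and is left untranslated
print(translate("ATGAGTCAGGTGCCTATCATAGT"))                 # MSQVPII

# Last exon, first line: skip the leading AG acceptor
print(translate("AGATGTCCCAGCAGTTTGTGCTGAGTC", offset=2))   # MSQQFVLS
```

For exons that start in phase 1 or 2, the bases of the split codon have to be carried over from the previous exon before translating.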

 

 

>Assembled CYP11 seq missing exon 1, cannot identify yet.  This is a guess.

Green parts match other genes.

    MSQVPII (0)

(0) TYSTAAVGSTSHHDDDREAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)

(2) LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)

(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV PNISDELFKWALE (1)

(1) SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVF (1)

(1) REKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)

(0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885

(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)

(0) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)

(1) GRRVAEVELQLLLAK (0)

(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*

 

 

 

more CYP27 related sequences from amphioxus

 

Gene B, probable CYP24 sequence

 

TBLASTN search of trace files with CYP11 amphioxus (above) as query

 

>ATGI133701.b1  ATGI133701.g1  ATGI263107.g1  (mate pair to last exon)

ATUP25462.b1 mate pair

ATUP253168.g1 

AFPZ178922.x01 AFPZ178922.x1  APWS49400.g1   ASFW134828.g2 mate pair

16 aa diffs to seq AFPZ69957.g2

MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAAR

PFEEIPGPKGLPLIGTALEYTPF(1)

 

>ATGN202956.g1

AFSA200913.g2  ATGN202956.g1  ATGN232406.b1  ATUP741276.x1 

APWS88827.g1  ASWX68424.g2  

(1)  GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFR

NEGRYPERIELASIKVYREIKKLPTGLINL (2)

3aa diffs to above seqs

APNK105102.b1  ATUP936479.x1  ATGN122794.b1  ATWW20426.b1  

ASFW96552.b2  AFSA664961.g2 

GQFKMITNLRGSFRERTRTYGSIYRERIGPLDLVVISDPTEIEKVFRNEGRYPERIELASIKVYREIKKLPAGLINL

AGGTCAGTTTAAAATGATA

ACAAACCTGCGGGAATCCTTCAGGGAGAGAACGAGGACTTACGGCAGTATCTACCGGGAG

AGGATCGGTCCACTTGACCTGGTGGTCATCAGCGATCCGAAAGAGATAGAGAAGGTGTTC

CGCAACGAGGGGAGATACCCGGAGCGCATTGAGCTGGCGAGCATCAAAGTCTACCGGGAG

ATAAAGAAGTTGCCAACTGGATTGATCAACCTGT
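
Counts such as "3aa diffs" can be reproduced with an ungapped position-by-position comparison when the two exon peptides have the same length. A minimal sketch; the helper name is hypothetical and no alignment is attempted.

```python
def aa_diffs(seq_a, seq_b):
    """List (position, residue_a, residue_b) for every mismatch, 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# The two versions of the GQFK exon given above
first = ("GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFR"
         "NEGRYPERIELASIKVYREIKKLPTGLINL")
second = "GQFKMITNLRGSFRERTRTYGSIYRERIGPLDLVVISDPTEIEKVFRNEGRYPERIELASIKVYREIKKLPAGLINL"

diffs = aa_diffs(first, second)
identity = 100.0 * (len(first) - len(diffs)) / len(first)
print(diffs)                  # [(11, 'E', 'G'), (40, 'K', 'T'), (72, 'T', 'A')]
print(f"{identity:.1f}% identical over {len(first)} aa")
```

Sequences of different length need a real alignment (e.g. BLAST) before percent identity means anything.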

 

>AFPZ73802.b2  ATWW116775.b1  ATUP525866.b1  AFSA402775.b2 ATUP741276.x1

The next set are the same aa seq but have some nucleotide changes

ATUP25462.g1  AFSA200913.b2  AFPZ79091.g2  ATUP96117.x2  ATGI14630.g1  

ATUP80026.x2   APWS81623.b1   ASWX9205.b2    ASWX9204.b2    ATUP613797.x1 

ATGN332649.g1  ATGN213212.b1  ATGN297330.b1  ATGI197442.b1  ATGI153639.g1 

ATWW52587.b1   ATWW71881.b1   ATUP471525.g1  ASWX134134.b2  ASFW134828.b2 

AFSA527936.g2  AFSA482665.b2  AFSA737597.b2  AFSA440956.g2  AFSA93983.g2  

(2)  NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)

AGAAACGGCCCGGAGTGGCAGCGCGTGCGCAGCTCGGTTCAGAAGGACCTCATGCGGCCTAAGACTGTC

GGTGCGTACGCCTCCCTGCAGGATGACGTCACAAGGGACTTGGTTGACGTCATCAGGGCT

CTGATAGGGAAGGAAGAGAGCGGAGGTCAAGTTCAAAACTTCATCAACTATGTGTACAGA

TGGGCGCTAGAGGGT

 

>ATGN332649.g1 ASFW117082.b2  AFPZ821362.g2  ATUP96117.x2  AFSA900319.g2 

(1) AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)

AGCGATCAGCGTGGTTGTGCTGGACAAGCGGCTGGGGTGCCTGACCT

TGGGTGACCTTGAACCTGGTTCTGACGCAAAACTGATGATTGACGGGGTCAATGACTTCT

TCGATGCGTTCGTGAAACTGGAGATGTCAGCAACTGGCCTCTACAAGTACATCAGCACAC

CGACGTGGAGGAAGTTCGCAAAGGCAGTCGACCAGTTTCATAGGT

 

>ASFW117082.b2

ATUP830924.b1  ATGN22852.g1   ATUP560796.g1  ATUP234960.g1 

AFSA900319.g2  AFPZ821362.g2  ATUP198761.x1 

(2) VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT (0)

AGCGTTGCTGAAAAGTTGCTGAAGGAAAAGCTG

GCTAAAACTACAACCGAAGATGGGAAACCCGCCGAGTCCGACACGGACTTCCTCCAGAGC

CTCCTGTCCAGGAATGACGTCACCTTCGAGGAGGCCATGGAGATGGCGGTGGATCTGTTG

TCTGCAGGGATTGACACGGT

 

Walked downstream to

AFSA197261.b2  AFPZ521495.y1  AFPZ279370.x1  ATWW68556.g1  

ATWW76986.b1   AFPZ635668.x1  AFPZ710720.y1  AFPZ603520.g2 

No hits found
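
For a walk like this one, every trace name has to be wrapped as TRACE_NAME='...' before it goes into the Trace Archive search window (see the retrieval note at the top of this file). A short sketch that builds those search terms and the URL-encoded form seen in the val= part of the retrieval link; the function names are illustrative only, and nothing here contacts NCBI.

```python
from urllib.parse import quote

def trace_query(trace_name):
    """Search term in the form used by the retrieval note: TRACE_NAME='name'."""
    return f"TRACE_NAME='{trace_name}'"

def encoded_query(trace_name):
    """Same term, percent-encoded as it appears inside a URL."""
    return quote(trace_query(trace_name), safe="")

# A few of the reads from the downstream walk above
for name in ["AFSA197261.b2", "AFPZ521495.y1", "AFPZ279370.x1"]:
    print(trace_query(name), encoded_query(name))
```

The quotes around the name are kept because the retrieval note says they are needed.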

 

>ASWX115484.g2

(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)

AGGGTTTATCCCACTGTCCCCTAA

CAACGTAGACGGTTGGACCAAGACATCGTGGTGTCTGGATATGTCGTTCCTGCCAAGGT

 

>ATUP741276.y1 mate pair to exon 2 ASWX115484.g2 (more accurate at N-term)

 (0) SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)

 (2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)

AGTCAGGGAACACTCTGATGTTCAATCTCTTCTGCCTGGCGAAAAACCCGGAA

GCCCAGGAGAAACTTTACCGAGAGATCCAGGAGGTGGTCCCAGCCGGGCAGCCCATAGAT

GATAAGGTGTTGAACAGGATGCACTACCTGCGGGCCGTGGTGAAGGAAACTTTCAGGT

>ATUP152918.x2 more accurate seq of VYP exon

AGGGTTTATCCAACTGTCCTAAACAACGTAAGACGGTTGGACCAGGACATCGTGTTGT CTGGATATGTCGTTCCTGCGAAGGT

 

>ATUP280746.y2

(0) TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)

 

>ATGN202956.b1 mate pair to exon 2

(1) GRRFAEQELHLGLIR (0)

 

>ATGI263107.b1 mate pair to first exon

(0) IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*

 

37% to 27B1 fugu

    MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPF(1)

(1) GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRN

    EGRYPERIELASIKVYREIKKLPTGLINL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)

(2) VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT (0)

(0) SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)

(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)

(0) TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*

 

Gene D probable CYP24 sequence

 

 

>AFSA245302.g2  ATUP830413.b1  ATUP956020.b1  ATUP792235.y1  AFPZ580605.x1 

AFPZ494158.y1  AFPZ133883.y1  ATUP874170.b1 

AFPZ330780.x1 ATGN48350.g1 mate pair to LCPT exon (conflict). This is probably the correct N-term

MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR

PFEKIPGPKGLPLIGTGLDYAPF (1)

 

 

>ATGN253365.g1

41% to CYP11A 38% to CYP27A, 73% to ATGN202956.g1

ATGN253365.g1  ATUP236935.g1  ASFW152033.b2  AFSA537353.b2  AFSA355465.g2 

AFPZ611442.g2  ATUP848010.x1 

(1) GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE

GRYPERPQVDSIKTYREMKKLPAGIVVL (2)

AGGTCGATTTCCAATAAAAACAAACCTGCGAGATTCATACAGAGAGAGAAC

AAAGACCTACGGGAGTATCTACCGTGAAAAGATCGGACCAAGAGAACTAGTGGTCATCAG

CGATCCGAAGGACATCCAGAAGGTGTACCGCAACGAGGGGAGATATCCGGAGCGCCCACA

GGTGGACAGCATCAAAACCTACCGGGAGATGAAGAAGCTGCCAGCTGGAATAGTGGTTCTGT

 

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)

AGAAACGGTCCTGAGTGGCAGCGCGTGCGCAGTTCGGTTCAGAAGGACCTCATGCGGCCCA

AGACTGTCGGTGCGTACGCCTCTCTGCAGGATGACGTCACCAGGGACTTGGTTGACGTCA

TCAAGGCTCTGATAGGGAAGGAGGAGAGCGGAGGTCAAGTTCACAACTTCATCAACTATG

TGTACAGATGGACGCTAGAGTGT

 

>ATUP236935.g1 mate pair ATUP618228.y1 ATGN318410.b1  AFSA186811.g2 

(1) AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)

AGCGATCAGCGTGGTT

GTGCTGGACAAGCGGCTAGGTTGCCTGACCTTGGGTGACCTTGAACCGGGTTCTGACGCA

CAAATGATGATTGGCGGGGTCAACGACTTCTTCAACGCATTTGCCAAACTGGAAATGTCA

GCAACTGGTCTCTACAAGTACATCAGCACACCGACCTGGAGGAAGTTC

CAAAAGGCGATCGACCAGTGGCACACGT

 

>AFPZ611442.b2 mate pair

AFPZ179516.y1  ASWX174932.x1  ATWW156666.g1  ATUP173093.g1  ATUP427411.g2 

AFSA551799.b2  AFSA145337.b2  AFPZ611442.b2  AFPZ183898.y1  ATWX88590.g1  

AWYB4327.b1  ATGN37072.g1  ATUP18347.x2  ATGI56395.b1  ATWW168445.g1 

ATWW74735.b1  

(2) VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)

AGAGTCGCTGCGAAGTTGCTGAAGGAAAAGCTGACGCAGAGTACAATT

GAAGATGGGAAACCCGCCGAGTCCGACACGGACTTTCTCCAGAGCCTCCTGTCCAGGAAT

GACGTCACCTTCGAGGAGGCGATGGAGATGGCATTGGATCTGTTGTCTGCCGGGATCGACACGGT

 

>ATUP173093.g1 ATGN155684.b1

(0) TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)

AGACGGGGAACACCCTGATGTTCAACCTCTT

CTGCTTGGCGAAAAACCCGGAAGCCCAGGAGAAACTTTACCGAGAGATCCAGGAGGTGGT

CCCAGCCGAGCAGCCCATAGATGATAAGGTGTTGAACAGGATGCACTACCTGCGGGCCGT

GGTGAAGGAAACTTTCAGGT

 

>ATGN48350.b1 mate pair to exon 1

ATUP204612.y2  ATGN213212.g1  ATUP924192.x1  ATUP305484.g1  ATGI56395.g1  

ATGN48350.b1   ATUP796046.y1  ATWW71881.g1   ATUP574827.g1  ATGI251785.g1 

AFPZ551860.x1  ASFW167838.b2  ASFW29294.g2   AFSA914086.g2  AFSA905811.b2 

AFSA186811.b2 

(2) LCPTVGNNIRTLDRDMVLSGYVVPAK (0)

AGGCTTTGTCCAACTGTTGGCAACAACATAAGAACGCTGGACCGAGACATGGTGTTGT

 

>ATGN253365.b1 short, mate pair

ATGN48350.b1  ATUP759533.x1  ATUP574827.g1  AFSA914086.g2 

AFPZ115025.y1  ATUP204612.y2  APWS64696.b1   ATGN213212.g1  ATWW71881.g1  

AFPZ845352.y1  AFSA905811.b2  AFSA186811.b2 

 

(0) TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)

AGACGAAGATCTTCATGGCTCACGACGTCATCAGCTCGCTT

CCGGAACCGGAAGTCTACAAACCGGAAAGATGGCTCCGTGATGACGAGTCGAGCAGCGTCCAACCGTTCACCCT

GCTGCCGTTCGGCTACGGACCGAGGATGTGCATTGGT

all with same aa seq but some nuc changes, probably more than one gene

ATGN253365.b1  ASFW10928.g2   AFSA627713.g2  AFPZ69957.b2  AFPZ115025.y1 

APWS64696.b1   ATUP837108.y1  AFPZ845352.y1  AFSA186811.b2  AFPZ377974.x1 

AFPZ577425.x3  ATGI105841.b1  ATUP127524.x2  ATUP52374.y2   ASWX174932.y1  ASWX70755.g2   ATUP404385.b1  ATGN368254.b1  ATGI182479.b1  ATUP207917.x2 

ATGI127474.b1  ATUP163752.y1  ATUP554136.y1  ASWX158863.b2  AFSA947971.g2 

AFSA594073.g2  AFSA576548.b2  AFSA537353.g2  AFSA763143.b1  AFSA409073.b2 

AFPZ409600.b2 

(1) GRRFAEQELHLGLIR (0)

AGGTCGGCGTTTCGCAGAACAAGAGCTTCATCTCGGATTGATCAGGGT

 

Possible C-term seq for Gene D

>AFPZ69957.b2  APWS64696.b1  ATUP837108.y1 ATUP782584.y1  AFPZ838779.y1 

(0) IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*

 

    MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR

    PFEKIPGPKGLPLIGTGLDYAPF (1)

(1) GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE

    GRYPERPQVDSIKTYREMKKLPAGIVVL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)

(2) VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)

(0) TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)

(2) LCPTVGNNIRTLDRDMVLSGYVVPAK (0)

(0) TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*

 

data for do-it-yourself

>CYP11amphi mixed seq 43% to Gene C, 35% to Gene B, 34% to gene D

36% to 27B1 fugu, 38% to 11A1 fugu, 33% to CYP24 fugu, 32% to 27C1 fugu

37% to chicken CYP11A1, 39% to catfish Ictalurus punctatus 11A1

This is a probable CYP11A gene

(2) EAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)

(2) LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)

(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV PNISDELFKWALE (1)

(1) SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVF XX(1)

(1) GEKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)

(0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885

(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)

(0) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)

(1) GRRVAEVELQLLLAK (0)

(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*

 

>Gene B 84% to Gene D, 35% to CYP11 amphi, 33% to Gene C

30% to CYP24 Fugu, 30% to 27A3 fugu, 27B fugu, 27C fugu, 30% to 11A fugu

in nr blast the best mammal hit is CYP24 mouse, but Drosophila hits are better.

34% to 49A1 D. melanogaster

    MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPF(1)

(1) GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNE

    GRYPERIELASIKVYREIKKLPTGLINL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)

(2) VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT (0)

(0) SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)

(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)

(0) TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*

 

>Gene C 38% to CYP11 amphi, 34% to Gene E, 34% to Gene B

42% to 27B1 Fugu, 38% to 27C1 fugu, 42% to 27A1 fugu (but not first exon)

37% to 11A1 fugu, 36% to CYP24 fugu (Best match to CYP27B)

42% to Xenopus trop. 27B1, 41% to Xenopus laevis 27A1

    MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)

(0) LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

(2) NGPEWRHLRTAVSKRIMRPKEVPR (2)

(2) YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

(1) SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

(0) TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)

(2) VYPVLPANGRVLDKDIVLDGYNIPKG (0)

(0) TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)

(1) GRRLAEMEMYLVLAR (0)

(0) LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

 

>Gene D 84% to gene B, 34% to CYP11 amphi, 30% to gene C

31% to CYP24 fugu

    MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR

    PFEKIPGPKGLPLIGTGLDYAPF (1)

(1) GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE

    GRYPERPQVDSIKTYREMKKLPAGIVVL (2)

(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)

(1) AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)

(2) VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)

(0) TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)

(2) LCPTVGNNIRTLDRDMVLSGYVVPAK (0)

(0) TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)

(1) GRRFAEQELHLGLIR (0)

(0) IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*

 

>AWYB2467.g1 another CYP27-like last exon, short, bad exon boundary

FIQTFKTRQLCKKEFVSPHEEAIIVVV*

 

 

New gene C-like seq, partial. This is almost Gene C except for exons 3 and 4

 

 

AFSA609591.g2

(1) SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFKTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFS G(2)

AGCCATCGCCACCGTTCTGTTCGACACACGGCTAGGCTGCTTGGAGAGGGAAATGCCGGAGAAGAC

GCAGCAGTTTATCGACTCCATCGCCACCATGTTCAAAACCGCGTTCCTTGTGTCAGCCCT

CAAGCCGTGGATGCTGACATACCTCGGTCTCGGTGTCTGGAAGCGCCACGTGGAAGCTTG

GGACGTCATCTTCAGTGTGGGT

 

>ATUP315848.g1 mate pair

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

AGCTCACGAGAACATAGACAGAAAAGTGCTGGACATTGACGCCAGACTGAGTCGTGGAGAGG

ATCTGGTCGGGTCGTTCCTGACCTACATGCTGACCGGAACAGACGTGACCAAGAAGGACC

TGTACGCCACTGTTACGGAGCTCCTGCTGGCGGGAGTGGATACGGT

 

>ATUP354679.y1  only seq of its kind, errors?

(1) ADENINSQTLDNDARVNRGEDMDGSFLTYMQTGTDATKKD

AGCTGACGAGAACATAAACAGCCAAACGCTGGACAATGACGCCAGAGTGAATCGTGGAGAGGATATGGACG

GATCGTTCCTGACCTACATGCAGACCGGCACAGACGCGACCAAGAAGGAC

 

>ATGN173204.g1  mate pair

RVYPVLPANGRVLDKDIVLDGYNIPKG

TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAG

 

81% to Gene C

(0) RKQERKYGRMWQSSLGFXXNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

NGPEWRHLRTAVSKRIMRPKEVPR (2)

YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFKTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSG (2)

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

RVYPVLPANGRVLDKDIVLDGYNIPKG

TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAG

 

 

 

New gene B-like seq

>ATGN177260.g1  ATWW236255.g1  ATUP526573.b1  AFPZ358980.y1  AFSA636106.g2 

MSRILQIVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAAR

PFDEIPGPRGLPFIGTALDYSPF (1)

GGCAGAGGGCCAGCCTTCACACAGGCAGGGCTCCAAAATGTGCCAGTCTGGAGGCCGTT

AGGGGGACGAAATGGCCGGGGGGCCGCCAGCTCAGCGGCTGCCACTGAACAAACCACCGT

CCAGGACGGGGCTGCCCGACCGTTTGACGAGATTCCCGGACCGAGAGGGCTTCCCTTCAT

CGGCACCGCGTTGGACTACAGTCCGTTTGGT

>ATGN177260.g1 ATUP895607.y1  ATUP526573.b1  ASFW163334.g2  ASFW53365.g2  

ATUP313014.b1  ATWW230398.b1  ATUP374619.b1  ASFW128042.g2  AFPZ580557.x1 

(1) GRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRRLKNKPLGVALL (2)

AGGGCGTTTTCCTATACACACAAAAATGGCCAACTCAACAATTGAAAGGTACCAGACGTACGGAAAGATCTACCGG

GAAAAGATCGGCCTAAGAGACATGGTGTTCGTGTGTGACCCGAAGGACATCGAGACGGTT

TTCAGGAGTGACGGTCGGCTTCCAGAGAGACCTATCCCAGAGTCCATCGCAACATACCGC

CGACTCAAAAATAAACCGCTAGGCGTCGCGCTGCTGT

 

>AFSA668396.g2 walked down to this seq.

AWYB1977.b1  ATUP449091.g1  AFPZ437391.b2  ATUP251235.g1 

ATUP194809.y2  ATWW220754.g1  ATUP395766.g1  AFSA823226.b2 

(2) NGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLE (1)

AGAAATGGAGAAGAGTGGTTCCG

CCTCCGAAGGTCTGTTAATAAGGACATGATGCGCCCAAAGGCGGTAGGTGCGTACGCCAC

AATGCAGGATGAGGTGTCCCGGGAGCTGGTCGGGCTGATACAAGGGGTGGTACGGAAGGG

CAAGACCGCTGGACAGGTCCCCGACTTTACCAAGCTCCTGTATAAATGGGGCCTAGAATGT

 

>ATUP194809.y2 AFPZ580549.x1  AFPZ186588.y1  AFPZ285430.y1  ATWX54571.b1  

ATGN37777.b1   ATGN82712.b1   ATUP548476.x1  ATWW20581.g1   ATWW18932.g1  

ATGI262715.b1  ATWW217460.g2  ATWX54571.g1  ASWX130145.g2 

ALSLVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSS (2)

AGCCTTGAGTCTCGTTGTTCTG

GGAAAACGCCTTGGCTGTTTGACCCTGGATCAACTCCCTGAAGATTCTGACGCGCAGCGT

ATGATCGGGGCGGTCAACGACTTCTTTTACAGTTTCGCCAAGCTTCAAATGTCGTTTCCC

TTGTTCAGATACATCAGAACTCCTGGATGGACGACTTTTGAGAGAGCCATGGACACAGTC

AGTAGGT

>ATGN82712.b1 AFPZ344411.y1 ATGN297809.g1  ATGN37777.b1  ATWW217460.g2 

ATUP548476.x1  ATWW20581.g1  ATWW18932.g1 AFSA241302.g2

(2) ITEKMIGERLEKLRQMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDT (0)

AGCATTACGGAGAAGATGATCGGCGAAAGGCTAGAGAAACT

CCGCCAAATGGAAGAGCCTCCAGATGAGGCGGATTTTCTGACGAGCCTGCTGTCTCGGGA

GGACATGAACTTGGACGAGGCCATCCAAATGTCGGTGGATTTATTGCAAGGTGCAATTGACACGGT

 

>AFSA241302.g2 ATUP374619.g1  ATUP449091.b1  ASWX125766.g2  AFSA636106.b2 

AFPZ617570.y1 

(0) TAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRVLNKMHYLRAVVKETFR (2)

AGACGGCACACACTCTGGTCTTCAATCTGTACTGCCTGGCGAAAAATCCC

GATGCTCAGCAAAAACTGTACGAGGAGATCTTGGAGGTTGTTCCACCAGAGCAGCCCATA

GACGACAGGGTGTTGAACAAGATGCACTATCTTCGTGCTGTGGTGAAGGAGACATTCAGGT

 

>ATUP374619.g1 APWS22248.b1

(2)  MYPTLLSTARTLTRDVVLSGYHVPAK (0)

AGGATGTATC

CGACCCTCTTGAGCACCGCACGGACCCTAACCCGTGACGTAGTGCTGTCGGGATATCACGTGCCTGCTAAGGT

 

Walked to ATUP325667.y1 from APWS22248.b1 and kept walking to APNK44066.g2

 

>AWYB1977.g1 mate pair to exon 3

(0) TNVMLAQNVISTLPEYYPEPESYIPERWLRTESSNVQSFSLLPFGYGPRMCI (1)

AGACCAATGTTATGCTGGCTC

AGAATGTCATCAGCACACTGCCGGAGTATTATCCGGAACCGGAATCATACATACCGGAGA

GATGGCTCCGGACTGAGTCATCAAACGTCCAGTCATTCTCCCTTCTGCCTTTTGGATATG

GACCAAGGATGTGCATTGGT

 

>AWYB1977.g1 ATUP615174.y1

(1) GRRFAEQELYLGLVR (0)

AGGTCGCCGTTTTGCGGAACAGGAGTTGTACCTTGGTTTGGTCAGGGT

 

>ATUP503290.b1

(0) IIQNFHVGWDGEDMKQVWRIFNAPDRDTFVFSERKS*

AGATCATCCAAAACTTCCATGTTGGTTGG

GACGGAGAAGACATGAAGCAAGTGTGGAGGATTTTCAATGCACCGGACAGGGACACGTTC

GTCTTCAGCGAGAGGAAAAGTTAG

 

>GENE F 61% TO GENES D AND B

    MSRILQIVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAARPFDEIPGPRGLPFIGTALDYSPF (1)

(1) GRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRRLKNKPLGVALL (2)

(2) NGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLE (1)

(1) ALSLVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSS (2)

(2) ITEKMIGERLEKLRQMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDT (0)

(0) TAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRVLNKMHYLRAVVKETFR (2)

(2) MYPTLLSTARTLTRDVVLSGYHVPAK (0)

(0) TNVMLAQNVISTLPEYYPEPESYIPERWLRTESSNVQSFSLLPFGYGPRMCI (1)

(1) GRRFAEQELYLGLVR (0)

(0) IIQNFHVGWDGEDMKQVWRIFNAPDRDTFVFSERKS*
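
An assembled gene like the one above can be checked mechanically. Assuming the bracketed digits are the intron phases at each exon boundary, the end mark of one exon should equal the start mark of the next. The sketch below parses lines in this format, joins the peptides and reports any junction where the marks disagree; the function names and the regular expression are illustrative, not taken from any existing tool.

```python
import re

# optional "(n)" start mark, peptide text, optional "(n)" end mark
LINE = re.compile(r"^\s*(?:\((\d)\)\s*)?(.*?)\s*(?:\((\d)\))?\s*$")

def parse_exon(line):
    """Split '(2) PEPTIDE (0)' into (start_phase, peptide, end_phase)."""
    start, body, end = LINE.match(line).groups()
    peptide = re.sub(r"\s+", "", body)              # drop spaces inside the peptide
    to_int = lambda p: int(p) if p is not None else None
    return to_int(start), peptide, to_int(end)

def assemble(lines):
    """Join exon peptides and list junctions whose phase marks disagree."""
    exons = [parse_exon(line) for line in lines if line.strip()]
    mismatches = [
        (n, n + 1, prev[2], nxt[0])
        for n, (prev, nxt) in enumerate(zip(exons, exons[1:]), start=1)
        if None not in (prev[2], nxt[0]) and prev[2] != nxt[0]
    ]
    return "".join(pep for _, pep, _ in exons), mismatches

# Last four exons of Gene F as listed above
gene_f_tail = [
    "(2) MYPTLLSTARTLTRDVVLSGYHVPAK (0)",
    "(0) TNVMLAQNVISTLPEYYPEPESYIPERWLRTESSNVQSFSLLPFGYGPRMCI (1)",
    "(1) GRRFAEQELYLGLVR (0)",
    "(0) IIQNFHVGWDGEDMKQVWRIFNAPDRDTFVFSERKS*",
]
protein, mismatches = assemble(gene_f_tail)
print(protein)
print(mismatches)   # an empty list means every junction agrees
```

A reported mismatch does not say which side is wrong. It only flags the junction for a second look at the nucleotide.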

 

OTHER GENE PARTS FOR MITO CLAN P450S

 

>AFSA551799.g2 mate pair to VAAK exon conflict

MSLLQRVVRHQGQSLFRVCGARSLAALKTTCRPQSNKAEDSVTYDTAAL

PFEEIPGPKGLPLIGTALEYAPF

 

>AFPZ69957.g2 ASWX57129.b2  AFSA905811.g2 AFSA409132.g2 ATGI61372.g1 

ATGI89560.g1  AFSA423514.b2 APWS48518.g1 

Mate pairs to short GRR exon conflict with another mate pair.

This seq is probably linked to another gene

MSWLYVQLAVRHQGQSLLRVCGARSLAALKPTYRLQSTRAEESVADGTAAR

PFEEIPGPKGLPLIGTALDYTPF (1)

ATGTCCTGGCTCTATGTACAGCTAGCAGTCCG

ACACCAGGGGCAGAGTTTGCTCCGGGTTTGTGGCGCACGAAGTTTGGCGGCCCTAAAACC

GACCTACCGTCTGCAGAGCACCCGGGCGGAAGAGAGCGTTGCGGACGGCACAGCCGCGCG

GCCGTTTGAGGAGATCCCCGGGCCGAAGGGTCTTCCACTTATCGGGACGGCGTTGGATTA

TACTCCTTTTGGT

 

>ATUP315634.b1   ATWW149030.b1   AFSA389754.b02  AFSA389754.b2   AFPZ813868.y1  

3 aa diffs to seq AFPZ69957.g2  10 nuc diffs

MSWLYVQRAVRHQGQSLLRVCGARSLATLKTTYRLQSTRAEDSVADGTAAR

PFEEIPGPKGLPLIGTALDYTPF (1)

 

>AWXX5654.b1  ATUP433905.g1  AFPZ654299.b2  AFPZ135481.y1  ATGI264780.b1 

AFSA249358.b2  ATGN18206.g1   ATGN129557.g1 

21 aa diffs to seq AFPZ69957.g2

MYQIQRAVRHQAQSLFRPRVCGARSLAALKTTVTRAESTRAEESGVYDTAAR

PFEEIPGPKGLPFIGTGWDYSPF (1)

ATGTACCAGATACAGAGAGCAGTCCGACACCAGGCGCAAAGTTTGTT

CCGGCCGCGGGTTTGTGGCGCACGAAGTTTGGCAGCCCTCAAAACAACCGTCACCCGAGC

AGAGAGCACCCGGGCGGAAGAGAGCGGTGTATACGACACGGCCGCTCGGCCGTTCGAGGA

GATCCCCGGGCCGAAGGGTCTTCCGTTCATTGGAACGGGGTGGGACTATTCTCCTTTTGGT

 

>ATGN260669.g1

(2) VYPTVPNNLRKLDRDIVLSGYRVPAK (0)

AGGGTCTACCCTACCGTCCCCAACAACCTGAGAAAGCTGGACCGAGACATCGTGTTGTCGGGCTATCGCGTCCCTGCGAAGGT

 

>AFSA220523.g2  AFPZ54828.b2   ATUP129877.y2  ATUP810869.y1  ATUP470328.g1 

AFSA753924.b2  AFSA312923.b2  AFPZ676451.b2  ATGN67239.g1

(0)  IVQNFHVGWAGEDMKQVHRLILSPDRDTFVFSERT*

AGATTGTCCAAAACTTCCACGTCGGATGGGCAGGAGAAGA

CATGAAACAAGTGCACAGGCTGATACTATCTCCGGACAGGGACACCTTCGTCTTCTCTGA

AAGAACTTAG

 

>AFPZ40864.g2  ATGI105841.b1  ASWX70755.g2   ATUP404385.b1  ATUP207917.x2 

ATGI153639.b1  ATGN62974.g1   ATUP28511.g1   AFPZ892318.y1  ASFW104198.g2 

AFSA947971.g2  AFSA763143.b1  AFSA409073.b2 

(0) IVQNFHVGWAGEDMKQVNRMVFAPDRDTFVFSERT*

AGATTGTGCAAAACTTCCACGTCGGATGGGCAGGAGAAGACATGAAGCAGGTGAACAGGA

TGGTGTTTGCTCCGGACAGGGACACCTTCGTCTTCTCCGAAAGAACTTAG

 

>AFSA259507.g2  AFPZ304571.y1  ATUP58438.y1   APWS8104.b1   ATGN297330.g1 

ATUP18347.y2   ATWW68556.b1   ATWW188350.g1  ATGI263107.b1  ATWW93446.b1  

ATWW20426.g1   ASFW96552.g2   AFSA917305.g2  AFSA558643.g2  AFSA423514.g2 

AFSA301564.g02 AFSA301564.g2  AFPZ826132.y1 

(0) IVQNFHVGWAGEDMKQNNRIILAPDRDTFVFSART*

 

>AWYB2467.g1 another CYP27-like last exon, short, bad exon boundary

possible pseudogene

FIQTFKTRQLCKKEFVSPHEEAIIVVV*

ACTTTATACAGAGACTTTTAAAACAAGACAACTTTGTAAAAAAGAGTTCGT

AAGTCCGCATGAGGAGGCCATAATTGTAGTAGTGTGA

 

 

 MFLGLLRCQTPNQPYSSGPQAASHPQLDPP

 

>AFPZ89755.b2   ATGN64641.g1   ATWW116837.b1  ASFW21216.g2   AFPZ735799.y1 

6 aa diffs to ASWX94650.g2

MFLGLLRCQTPNQPYSSGPQAASHPQLDPPVKPFSALPEPMKGMPGILKFLVVLCTGGMS

RKAQLKSHMMIGQLFQMYGPILR (2)

 

>ATGN169213.g1  ATUP794883.x1 mate pair to C-helix ATUP203418.x2 (errors)

ASWX94650.g2  ATUP371507.b1 

MFLGLMRCQTPSQTYSTGPQAASHPQLDPP

AKPFSALPEPMKGLPGILKTLVVLCTGGMSRKAQLKSHVVIGQLFQMYGPILR (2)

ATGTTCCTAGGCCTGATGAGATGCCAGACTCCCAGCCAAACGTACTCCACCGGCCCACAAGCAGCCT

CCCACCCACAGCTGGATCCACCT GCCAAGCCGTTCTCAGCGCTCCCTGAGCCCATGAAAG

GCTTGCCTGGAATCCTGAAGACCCTGGTGGTCTTGTGTACCGGTGGCATGTCTCGAAAAG

CACAGTTAAAAAGCCACGTGGTGATCGGCCAGTTGTTTCAGATGTATGGTCCCATTTTAAGGT

 

>ATUP12956.x2 ATWW168034.b1 upstream of AFPZ532732.x1 AFPZ589745.b2 repeat seq upstream AFPZ591979.b2

(2) NRFGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLG (2)

AGGAATAGATTTGGAAACTTTGATATGGTGAACATCTGCGACCCTGACGCGGCTCGAGAAGTCTTTAAGG

TAGAGGGGAAATATCCTGAACGGCTGGACATCGCTCCATGGAGGCTGCACAGGGAGGATG

CTGGCAAGGAACTGGCTGTCCTGCTTGGGT

 

>C-helix 

AFPZ532732.x1  ATGN14732.b1   ATGN243066.g1  ATUP675551.g1  ATUP12956.x2 mate pair

ATGN74825.g1   ATUP826551.x1  ATUP797911.y1  ATUP370238.b2  AFSA483928.b2

AFPZ756711.g2 mate pair to EXXR  AFPZ589745.g2  ATWW185948.b1  ATUP794883.y1

AFPZ247615.y1 Mate pair

 (2) NDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPDGTEVLDLENELFKWALE (1)

AGCAACGACAAGAAATGGCACAAGAACCGTACCG

TGGTTAGCCGCCCGATGCTCCGTCCACAGAGCGTAGCGGCGTACGTGCTGAAGATCGATG

ACGTAGCGACCGACATGCTCCAGCATATCCGATCCGTCAGGGCTGGGCCGGATGGGACAG

AAGTGCTTGACCTGGAAAATGAGCTCTTCAAATGGGCCCTGGAGT

 

>APWS86618.g1 errors

ATUP703599.g1  AFSA520872.b2  ATWW44409.g1   AFPZ591979.g2  AFPZ316443.x1 

 (1) SISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDT

AGCAATCTCAGCAGTTTTGTTCAACGAGCGTATGGGGCTCCTTCAGGAT

AACATCCCCCAGGATGCTCAGGACTTCATCAATGGTATGCATGATGCTTTTGACTCCCTT

ACACGAGCAATGACGCCAGATGCACGACTTCACAAGCTTCTAAACACCAAGAGTTGGCAA

AAGAACAAGCAAGCATGGGACACTGT

 

>ATWX9490.b1  78% to CYP11 amphi above

AFPZ247615.x1  ATUP477116.y1  AFPZ127527.x1  APWS133955.g1  ATGI147213.b1 

AFSA734994.b2  ATUP12956.y2   AFSA430892.b2  ATUP556366.y1  ATUP609432.y1 

AFPZ756711.b2 

(0) TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITFDHLKNLHLFKAVIKETLR (2)

AGACCTCCACCACCCTCCTCTGGACCCTGTACCAGCTGTG

TCACCGACCCGACCTGCAGGACAAGCTGTACCAGGAGGTCACGCAGGTCATAGGTCAGGA

TGAGGTCATCACCTTTGATCACCTGAAGAACCTGCACCTCTTCAAGGCTGTCATCAAGGA

GACACTGAGGT

 

>ATUP12956.y2 does not match Gene C; similar to amphi 11

     REKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT amphi 11

 (1) GEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT (0)

(0) TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLR (2)

 

>ATUP609432.y1 mate pair to YKIPAK exon same as ATUP12956.y2

(1) GEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT (0)

AGGAGAAAAAGTGATGGACCGTCAGCTTCAGCGAGCAGAAGAGCGCCAGGCCCGAGGTGAGGCAGATGACGGA

CAGCTGGACTTCCTGTGGTTCATCAGCAGTAGGGAGAAACTCACCAAGGAAGAAATCTAC

GCCAACGCCATCGAACTGATGGGGGCAGCCATTGACACAGT

ATUP609432.y1  AFPZ127527.x1  APWS133955.g1  ATUP12956.y2  

ATGI147213.b1  AFSA734994.b2  AFSA430892.b2  AFPZ756711.b2 

(0)   TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVINETLR (2)1023

(0)  TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLR ATUP12956.y2

AGACCTCCACCACCCTCCTCTGGACCCTGTACCAGCTGTGTCACCGA

CCTGACCTGCAGGACAAGCTGTACCAGGAGGTCACCCAGGTCATAGGTCAGGATGAGGTC

ATCACCTATGATCACCTGAAGAACCTGCACCTCTTCAAGGCTGTCATCAACGAGACACTGAGGT

 

>ATGN13711.g1    ATWW113491.b1   ATGI245214.b01  ATGI245214.b1   ATUP609432.x1  

(2) LHPVAFAITRVIQQDTVLMGYKIPAK (0)

AGGTTGCATCCTGTAGCCTTCGCCATTACTCGTGTGATTCAACAGGACACTGTTCTGATGGGGTACAAGATTCCTGCAAAGGT

 

>ATUP557471.x1  ATUP741717.x1

(0) TVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCI (1)

AGACTGTTGTGATGGTGAGCCTTTATGACATGGCCCGAGATCCCCGGCTCTACAAAAACCCAGAGGAGT

ACCGGCCTGAGCGCTGGCTCCGGGGTGCAGAGGACTACGTGGACACCCATCCCTACGCTT

ACCTGCCGTTTGGGTTTGGGACTCGCAGTTGCATTGGT

 

>ATGN243066.b1 mate pair to exon 3 (C-helix exon)

(1)GRRVAETELQVLLAK (0)

AGGACGACGAGTGGCAGAAACCGAGTTGCAAGTACTTCTGGCAAAGGT

(0) ICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIERQ*

AGATTTGTCAGCAGTTTGTTCTGAAGCAGAGGAACCCCAGAGTCATCCCTGCCATGACAAAAGGAATTTTGATGCCAGCTGA

GAAAATGGACATTTGCTTCATTGAGAGGCAGTGA

 

>Gene G 55% to amphi 11

MFLGLMRCQTPSQTYSTGPQAASHPQLDPP

AKPFSALPEPMKGLPGILKTLVVLCTGGMSRKAQLKSHVVIGQLFQMYGPILR (2)

NRFGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLG (2)

NDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPDGTEVLDLENELFKWALE(1)

SISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDT (0)

GEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT (0)

TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLR (2)

LHPVAFAITRVIQQDTVLMGYKIPAK (0)

TVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCI (1)

GRRVAETELQVLLAK (0)

ICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIERQ*

 

Gene C best match to 27B1

 

>AFSA878991.g2 mate pair to GRR exon  Possible N-term exon

more accurate seqs.

AFPZ747779.y1  ASWX43340.g2   ATGI250108.g1  APWS156165.g1  ATWW111331.g1 

AFPZ480651.g2  AFSA927711.b2  AFSA184323.b2  AFSA106917.g2 

MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)

ATGGCACAGCAGATCC

TCCGCAACTCCTCCGTCTGCTCGCTGGTCCGGCCCAACTCGCGAGCCCTGGTGTCAGTCG

CGCCTGCCGCCACAGTGCAGCAGAACAGGCCGCTTAAAGAGATGCCCGGACCGACCAACA

AGCTGGGGCAGCTGTGGTGGGGGTTCAAGAACCGTTCTCGTATGCACGAGGCTCAGGT

 

Walked up to exon 1

 

>ATUP828790.y1 Exon 2

AFSA214120.b2  ATUP315848.b1  ATGN173204.b1 ATGI270690.b1 

(0) LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

AGCTGGAGCAAGAGAGGAAGTACGG

CAGGATGTGGCAGTCGTCCTTTGGGTTCAACCCTAACGTGAACGTGGCGCACGTGGCTCT

AGCCGAGCAGCTCATGCGTCAGGAGGGGAAGTACCCCAAACGGATCGAGGTGAACTTCAT

GCAGCAGTACCGAGACTTGAGGGGGTACTCCTATGGGCTGCTCAATCAGT

 

>AFSA655556.b2 with frameshifts

(0) RKQERKYGRMWQSSLGFXXNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

AGCGGAAGCAAGAGAGAAAGTA

CGGCAGGATGTGGCAGTCGTCTTTGGGTTTCANCCCTAACGTGAACGTGGCGCACGTGG

CTCTAGCGGAGCAGCTCATGCGTCAGGAGGGGAAGTACCCCAAACGGATCGAGGTGAACTT

CATGCAGCAGTACCGAGACTTGAGGGGGTACTCCTATGGGCTGCTCAATCAGT

 

>ATUP354679.x1 mate pair

ATGN268013.b1  AFSA655556.b2  AFPZ211549.x1

ATUP315848.b1  ATGN93122.g1   ATGN173204.b1 

(2) NGPEWRHLRTAVSKRIMRPKEVPR (2)

AGTAACGGACCGGAGTGGCGCCACCTCAGGACAGCCGTCAGCAAGCGGATCATGCGGCCTAAGGAAGTCCCG

CGGT

 

ATUP354679.x1  ATUP856458.b1  ATUP259747.b1  ASFW77687.g2   AFSA928752.g3 

    YGDSMNEVVTDMITRFKDLRDTTGGGKTVPDLTNELYKWAME

(2) YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

AGATATGGAGACAGCATGAACGAAGTTGTGACGGACATGATCGACAGGTTTAA

GGACCTGAGGGACACTACGGGCGGCGGGAAGACTGTGCCGGACCTCACCAATGAGCTGTA

CAAATGGGCCATGGAGTGT
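
The two peptide lines above are allelic or read variants of the same exon. A small sketch for listing the differences between two such ungapped, equal-length peptides (the "aa diffs" counts quoted in these notes) follows. The function names are illustrative only, and no alignment is done, so sequences with insertions or deletions need aligning first.

```python
def diff_positions(a, b):
    """List (position, residue_a, residue_b) for mismatches, 1-based."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (gaps not handled)")
    return [
        (i, x, y)
        for i, (x, y) in enumerate(zip(a, b), start=1)
        if x != y
    ]


def percent_identity(a, b):
    """Percent of identical positions between two equal-length sequences."""
    return 100.0 * (len(a) - len(diff_positions(a, b))) / len(a)


variant_1 = "YGDSMNEVVTDMITRFKDLRDTTGGGKTVPDLTNELYKWAME"
variant_2 = "YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME"
print(diff_positions(variant_1, variant_2))
print(round(percent_identity(variant_1, variant_2), 1))
```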

 

walked up to ATUP765980.x1 from ATGN129003.b1

(2) YGDSMNEVVTDMITRFKDLRDTTGGGKTVPDLTNELYKWAME  (1)

 

>ATWX78362.g1

AFPZ253671.y1  AFPZ147917.x1   ATGN83464.g1    ATWW125259.g1  

ATUP663607.b1   ASFW68308.g2    AFSA918151.b3   AFPZ475595.g2  

 

(1)   SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)

AGCCATCGCCACAGTTCTGTTCGACACACGGCTAGGCTGCCTGGAGAGGGAAATGCCGGAGAA

GACCCAGCAGTTTATCGACTCCATCGCCACCATGTTCAGAACCGCGTTCCTTGTGTCAGC

CCTCAAGCCGTGGATGCTGACATACCTCGGTCTGGGCGTCTGGAAGCGCCACGTGGAAGC

CTGGGACGTCATCTTCAGTGTGGGT

 

>ATUP315848.g1 87% to CYP11 amphi above  mid region to I-helix

ATUP210798.g1 AFSA609591.g2 (closest matches same as CYP11 amphi above)

These have same aa seq but differ in several nucleotides

ATUP663607.b1  ASFW68308.g2  AFPZ253671.y1  AFPZ147917.x1  ATWX78362.g1  

ATGN83464.g1   ATWW125259.g1  AFSA918151.b3  AFPZ475595.g2 

Seq ATUP315848.g1 has no other matches, probably has seq errors.

ATUP315848.g1   KEMPEKTQQFIDSIATMIRTACHVSALMPWMLTYLGLGVWKLHVEACDVIFSV  (1) errors

AIATVLFDTRLGCLEREMPEKTQQFIDSIATMFKTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSVG AFSA609591.g2

AGCCATCGCCACCGTTCTGTTCGACACACGGCTAGGCTGCTTGGAGAGGGAAATGCCGGAGAAGAC

GCAGCAGTTTATCGACTCCATCGCCACCATGTTCAAAACCGCGTTCCTTGTGTCAGCCCT

CAAGCCGTGGATGCTGACATACCTCGGTCTCGGTGTCTGGAAGCGCCACGTGGAAGCTTG

GGACGTCATCTTCAGTGTGGGT

ATUP315848.g1  AFSA197634.g2  ATUP856458.g1  ATGI172026.g1  ATGI56118.g1  

AFPZ906315.x1  ATUP612783.x1  (mate pair)

ATWX17304.b1   ATUP256612.b1  ATWW188828.b1 

ATUP506396.g1  AFPZ902923.y1  AFSA557210.g2 

(1)  AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0) ATUP315848.g1

AGCTCACGAGAACATAGACAGAAAAGTGCTGGACATTGACGCCAGACTGAGTCGTGGAGAGG

ATCTGGTCGGGTCGTTCCTGACCTACATGCTGACCGGAACAGACGTGTGACCAAGAAGGACC

TGTACGCCACTGTTACGGAGCTCCTGCTGGCGGGAGTGGATACGGT

 

>ATUP523464.b1 walked up using this seq to ATGI61992.b1 and continued to walk up

and found AFSA197634.g2 which contains the AHEN exon

 

>ATUP210798.b1 ATUP816022.y1 walked up from PKG exon to this EXXR exon

APWS10286.b1 ATGI33690.g1  ATUP523464.b1 

(0) TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)

AGACATCCAACACAATGGTGTGGACCCTG

TACGAGCTAGCCCGCCACCCGGAGCTACAGGAGAGACTACACCAGGAGGTGACCAGTGTG

GTGTCCCCGGGACAGATACCTACGGTTGATGACGTCAAGAACATGGCGCTACTCAAAAAC

GTCATCAAGGAGATCCTTAGGT

 

>ASWX38725.b2

AFPZ393891.x1  ASWX38725.b2   ATGN173204.g1  ATGN130997.b1 

ATUP816022.y1  AFSA574211.b2  AFSA184323.g2 

 (2) VYPVLPANGRVLDKDIVLDGYNIPKG (0)

AGAGTGTACCCAGTCTTGCCTGCG

AATGGGCGTGTTCTGGACAAAGACATCGTACTTGACGGCTACAATATTCCTAAGGGGGT

 

ACAGAACGGCAGGGCAGGCTGAATGCCTGAAGGTCGACACAATAGGATCCCGCTCATCTAAAGTACATATTAAGTTACATATAGAATACAAGGTCGCCATAAAATGTACACAGAGTATCCTAATAAGGATCATTTATGTACAAAATACGTATAATGAAGATAGAAAATGAGATAGCGCATAATATGCAATATAAAGATAATATAATCTGTAATGGGGAATAAAAAATGTGACGATGAGTATCACATTGTCCCAAGATCCGTACCTACAGTGGTACAACGTCGTGACTGGGCAGACATGATTGTCCCGATTTGACGACGTTGAACCCTAGCTCTGCGTAGGAGAATGTGACATTCCAGTGTACTTGTACAATCATTTTTTTTATGGCAAAGTAAATTTGTCAAGGGTAAAAATATGGCGATAATGATCGGGACCTTGATATTGCTTTTCTTTTCTTTTCCTTTGATGTAGAGTGTACCCAGTCTTGCCTGCGAATGGGCGTGTTCTGGACAAAGACATCGTACTTGACGGCTACAATATTCCTAAGGGGGT
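
The genomic read above ends with the same exon DNA that is listed just before it. A simple way to confirm where an exon, written with its flanking AG and GT, sits inside a longer read is an exact substring search, sketched here. The function name locate_exon is hypothetical, and an exact match is assumed, so reads with sequencing errors or the opposite strand will not be found.

```python
def locate_exon(genomic, exon):
    """Return 0-based (start, end) of exon within genomic, or None."""
    g = "".join(genomic.split()).upper()   # ignore line breaks and spaces
    e = "".join(exon.split()).upper()
    start = g.find(e)
    if start < 0:
        return None
    return start, start + len(e)


# Toy example: intron ...CAG | exon ATGAAA | GTAAG... intron
read = "ttttc agatg aaagt aag"
print(locate_exon(read, "AGATGAAAGT"))
```

The returned end coordinate is exclusive, so `g[start:end]` gives back the exon with its AG and GT.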

 

>ATGN316355.b1 45-47% to CYP27A,B and C 55% to 11A

ASWX38725.b2  ATGN316355.b1 ATGN173204.g1 ATGN130997.b1 AFSA574211.b2

AFSA184323.g2 AFSA878991.b2 ATUP816022.y1

(0)   TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)

AGACGCAGTTTGCAATCCTC

CACTACAACATGACGCGTGATCCCGAGGTGTTCGAGGAGCCCGACAGGTTCAACCCTGAC

CGCTGGACCCGCATGGGCACCGAGAAGGTCAACACTTTCTCCTCGGTACCGTTCGGCTTC

GGACCCAGACAATGCGCAGGT

ATGN316355.b1  AFSA927711.g2  AFSA878991.b2  mate pair to exon 1 candidate

APWS128374.b1 

AFSA770791.g2  ATUP612783.y1  ATUP663607.g1  (mate pair)

(1) GRRLAEMEMYLVLAR (0)

AGGCCGACGACTCGCTGAGATGGAGATGTATTTGGTTCTTGCAAGGGT

 

>AFSA927711.g2 last exon

ATUP523464.g1  AFSA927711.g2  APWS128374.b1  APWS106525.g1  ATWW160672.b1  

ATUP526035.g1  AFSA770791.g2  ASWX100449.g2  APNK87583.b2   ATUP612783.y1 

(0) LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

AGCTGACGCCAGGAGAAGTGGTCCGGCCCGTGACTC

GTGCCCTGCTGGTGCCCGGAGATCCGGTCCACCTGGAGTTTATAGACAGGCCGTGA

 

40% to 27B1 Fugu, 37% to 27C1 fugu, 40% to 27A1 fugu (but not first exon)

35% to 11A1 fugu, 34% to CYP24 fugu (Best match to CYP27B)

    MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)

(0) LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)

(2) NGPEWRHLRTAVSKRIMRPKEVPR (2)

(2) YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)

(1) SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)

(1) AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)

(0) TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)

(2) VYPVLPANGRVLDKDIVLDGYNIPKG (0)

(0) TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)

(1) GRRLAEMEMYLVLAR (0)

(0) LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*

 

 

 

Gene E

 

>ATUP671281.g1  Exon 1 mate pair to WALE exon

ATGN333457.b1 AFPZ666321.y1  AFPZ292784.x1 

MSLRLAKCVSSPVSRQPSFGLLSSRWKSTVSGQVAHDEGQEGATAKPFEAIPGPKGLPLVGTALHAAL

GGWMDKFHLHMQ (0)

ATGTCACTCCGGCTGG

CTAAGTGTGTGTCGTCCCCAGTATCTAGACAACCTAGCTTCGGGCTGCTCTCCTCTCGAT

GGAAGTCTACTGTGTCCGGACAGGTGGCACACGATGAGGGACAGGAAGGGGCCACAGCCA

AGCCTTTCGAGGCTATCCCCGGCCCCAAAGGGCTACCCTTGGTGGGGACGGCTCTACATG

CTGCGCTGGGTGGCTGGATGGACAAGTTTCATCTTCATATGCAGGT

 

>AFPZ666321.y1 AFSA226961.g2  ATUP21126.b1   AFSA635190.g3 

(0) NRWQQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLE (2)

AGAATCGTTGGCAGCAGTATGGTTCCATC

TACAAAGAGAATATCGGGCCACAGGAGATAGTTTGTATGTTCGATCCTGAGGATGTTGCA

CCGGTGCTTCGAGCGGAGGGCCGCTACCCGCGAAGATACGCCTTCGACAGCTTTTATCTG

GCCAGGGAAATCATGGGCCACAAGTTGGGTGTCTTCCTCGAGT

 

New WALE exon

ATGN293166.g1  ATUP671281.b1  ATWW95000.g1   AFPZ170408.y1 

(2) NDEKWQQYRTVMNKKLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQA HLFRWAME (1)

AGGAATGATGAAAAATGGCAGCAGTACAGGACTGTGATGAACAAGAAGCTGCTC

CGACCCCAGCAGGCGGCAGCCTTCACCCCCCTGATGGACGAAGCGGCCTCTAACTTCATG

TCTTACCTACGGAGGAAGAGAGACCAGGGGGGGATGGTGACGGACTTACAGGCACATCTG

TTCCGCTGGGCCATGGAATGT

 

>AFPZ666321.x1 mate pair to exon 2

AFPZ335039.x1  ATUP622514.x1  ATUP357508.x1  ATUP707696.b1 

AFPZ666321.x1  ATUP138049.x2  ATGI251985.g1  ATUP168924.y1 

(1) SGCTAMFNQHLGLLSEDPPQLAKDFISSTMAVLDTTNTMMTIPPKVHKALNTKAWKEHLEGWQTSFRV (1)

AGCGGGCTGCACGGCCATGTTTAATCAGCAT

CTCGGACTCCTGAGTGAAGATCCTCCACAGCTAGCGAAGGACTTCATCTCTTCCACGATG

GCCGTTCTTGACACAACCAACACCATGATGACCATTCCT

CCAAAAGTTCACAAAGCCCTAAACACCAAGGCTTGGAAAGAGCATCTCGAAGGGTTGGCA

AACCAGCTTCAGAGTCAGT
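
Mate pairs are used throughout these notes to link exons. A sketch for testing whether two trace names belong to the same clone with opposite read directions is given below. It assumes, from the names in this file only (for example AFPZ666321.x1 with AFPZ666321.y1, and ASWX119777.b2 with ASWX119777.g2), that b/g and x/y are the paired direction codes; the regular expression and function names are illustrative and not taken from any Trace Archive specification.

```python
import re

# Clone prefix, a dot, one direction letter, then a read attempt number.
TRACE_NAME = re.compile(r"([A-Z]+\d+)\.([bgxy])(\d+)")
# Assumed pairing of direction codes, inferred from names in these notes.
OPPOSITE = {"b": "g", "g": "b", "x": "y", "y": "x"}


def parse_trace(name):
    """Split a trace name into (clone, direction, attempt)."""
    m = TRACE_NAME.fullmatch(name.strip())
    if m is None:
        raise ValueError(f"unrecognized trace name: {name!r}")
    return m.groups()


def are_mates(a, b):
    """True if a and b come from one clone with opposite directions."""
    clone_a, dir_a, _ = parse_trace(a)
    clone_b, dir_b, _ = parse_trace(b)
    return clone_a == clone_b and OPPOSITE[dir_a] == dir_b


print(are_mates("AFPZ666321.x1", "AFPZ666321.y1"))
print(are_mates("AFPZ278931.x4", "AFPZ278931.x1"))
```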

 

>AFPZ148655.y1

SKQLIEEIMERGLEKESEEDEEIPDLVSYLLSVKLRPEEVLANIVDVLGGAVDT

AGCCAAGCAACTGATTGAGGAGATAATGGAAAGAGGCCTGGAGAAGGAGAGTGAAGAAGATGAAGAAATCCC

AGACCTGGTCTCCTACCTGCTGTCAGTGAAGCTCCGCCCTGAAGAGGTGCTGGCAAACAT

TGTTGATGTGTTAGGGGGCGCTGTTGATACAGT

 

>AFPZ335039.y1 ATUP622514.y1  ATUP357508.y1  ATGN214200.b1  ATUP168924.x1 

ATGI112649.b1  ATUP825055.y1  ATUP508644.b1  AFPZ790052.x1 

(0) TSNTMAFTMHTLARHPDIQEKLHDEVMRVAPDHQAPVTQEQVHKMPYLRGVIKEVLR (2)

AGACATCCAACACCATGGCTTTCAC

CATGCACACCTTGGCAAGGCACCCTGACATCCAGGAGAAACTGCATGATGAGGTGATGAG

GGTTGCGCCTGATCATCAGGCACCCGTCACACAGGAGCAGGTGCACAAGATGCCTTACCT

CAGGGGTGTCATCAAGGAGGTTCTACGGT

 

>ATGN214200.b1

AFSA202738.b2  ATWX51051.g1   AFSA5097.x1    AFPZ493490.y1  AFPZ223933.y1 

ASWX68418.b2   AWXX14152.g1   AFSA632513.g2  AFSA297851.g2 

(2) LYPVAYVFSRVLNHDAVVHGYKIPAGTNLV (0)

AGATTGTACCCAGTGGCCTATGTCTTCAGTAGAGTCCTGAACCATGACG

CTGTGGTGCATGGGTATAAAATTCCCGCTGGAACAAATCTAGTGGT

 

>ATWX51051.g1 AFSA202738.b2  ATUP76115.y2 ATUP812816.x1 

APWS144205.b1  AFSA632513.g2 

(0) VCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCV (1)

AGGTTTGTCCATATGTGATGGGAAGGGATCCAAACAGCTATGACGACCCAGAAGAG

TTTCGCCCTGAGAGATGGTACCGAGAGAACAGCAAGAGTGTCAAAGCCTTCTCCTGGCTT

CCCTTCGGCTTTGGGGCACGAGGCTGTGTTGGT

 

>ATUP76115.y2  

ATGN118007.b1  AFPZ870175.x1  AFSA635190.b3  AFSA535457.g2  ATWW222576.b1 

ATUP813922.x1  AFSA519825.b2  AFPZ202046.x1  AFPZ887647.y1  ATGN171031.b1 

GRRIAETEMHLVLIR (0)

AGGCCGCCGTATTGCAGAGACAGAGATGCACCTGGTTCTCATAAGGGT

 

Walked downstream from AFPZ870175.x1

 

(0) ICQNFLLEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*

AGATTTGCCAGAACTTTCTGCTGGAGCAGGAGAAGGATGAAGAACTTGTCGGTA

GGATCAGACTGGTCCTGATCCCTGACAAATCAGTGGACCTCAAACTCATCGATCGCAACTAA

 

Assembled gene E 36% to CYP11 amphioxus, 35% to CYP11A1 fugu

35% to CYP11A1 from catfish Ictalurus punctatus

MSLRLAKCVSSPVSRQPSFGLLSSRWKSTVSGQVAHDEGQEGATAKPFEAIPGPKGLPLVGTALHAAL

GGWMDKFHLHMQ (0)

NRWQQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLE (2)

NDEKWQQYRTVMNKKLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQA HLFRWAME (1)

SGCTAMFNQHLGLLSEDPPQLAKDFISSTMAVLDTTNTMMTIPPK VHKALNTKAWKEHLEGWQTSFRV (1)

SKQLIEEIMERGLEKESEEDEEIPDLVSYLLSVKLRPEEVLANIVDVLGGAVDT (0)

TSNTMAFTMHTLARHPDIQEKLHDEVMRVAPDHQAPVTQEQVHKMPYLRGVIKEVLR (2)

LYPVAYVFSRVLNHDAVVHGYKIPAGTNLV(0)

VCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCV (1)

GRRIAETEMHLVLIR (0)

ICQNFLLEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*

 

 

CYP27 sequences from other species for comparison

 

 

>DN712365.1| CNB01-H07.y1d-s SHGC-CNB Gasterosteus aculeatus stickleback

MEQLGGPGLLTTLNWLFVKGYFQTTQQMQIEHSRIYGPLWKSKYGPLVVVNVASAQLIEQ

 VLRQEGRLPVRTDMPHWRSYRELRNQAHGPLTEMGVKWQRIRSILNPRMLKPKHVSSYAC

 TINEVVGDFVRRAAWLRETGGGGLMVNDLTAELYKFAFEGICSVLFETRMGCMNEVVPEE

 TQKFIFSVGEMFRLSPILVLFPKSFWPYTPFWKQFVAAWDHLFKVAEELVQQKMDEIQEK

 VHLDQSVXGAYLTHLLLSEQMTVTEILXSITELLLGRVDTTSNTIAWALYQLAKQPEIXE

 QLYQEVIG

 

>CYP27A2P Fugu

29906 MASFTALRCAAIGARNSALRPATLPSRNLNLQATSEAANLKG

IADLPGPNTYKILYWLFVKGYGERSHLLQ 30118 (0)

30735 GKLKNIYGPMWRWKLGPYDFVSVASPELIARVIQQEGRYPVRVQLPHWKEYRDLRGQAYGLHVE 30926 (2)

31027 TGPEWSRLRSALKPRMLKLREV

 

>CYP27A1 Fugu Scaffold_3437 Length = 26117 46% to 27A1

Scaffold_7201 62% to 27A1 506-532

MSACLCVNSCGRKEAGCCGLWPGLAPAPGAQGGSPQSQRLPPISASNHGRWRTFHASASGSC

LQINVSGFHSHMHELQ (0)

136  (0) ILEKGRYGPIYRNGMNAVSVSTAKLLGEVLRNDDKFPNRGDMSIWKEYRDLRGYGYGPFTE 321 (2)

536  KDERWYNLRAVLNKRMLRPKDALQYGDTIGEVVTDFIRRIYFLRQRSPTGDVVTDLNNELYHFSLE 733 (1)

816  AIASILFETRLGCLEEEIPTGTQDFINAISQMFSNNFQVFLMPKWSRGVLPYWRRYVAGWDGIFSF 1013 (1)

1206 ATRLIDRKMEFIQQHLDNNQNVEGEYLTYLLSNTQMSIKDVYGSVSELLLAGVDT 1370 (0)

1465 TSNTLTWTLHLLSKYPQCQEILFKEVSTSVPADRAPSAEEVTRMPYLRAVVKESLR 1632 (2)

1756 MFPVIPMNGRILADKDVMIGGYQFSKN 1836 (0)

1949 TAFNFSHYAIGRDEDTFPEPATFMPERWLQDSHNRPNAFGAIAFGFGVRGCVGRRIAELEMYSFLCH 2149 (0)

2308 LMRHFEIKPDPKMGELKSVCRTVLIPDKPVSLRFLDRGSGHAA* 2439

 

CYP39 (assembled complete gene)

 

>ATUP838185.x1 aa 29-59

ATGI10911.b1  APWS107422.b1  ATGN275780.b1  ATUP838185.x1 

APWS157683.g1  ATGN4070.g2  ATUP459383.b1  AFSA654722.b4 

MATTIGEHSPGDELYNAFKY

MILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRK (0)

ATGATCTTGTTCTCTTTGTGTTTTGCCTTC

TTCTCATGGAGGAACATTGTGAAGAAGGGCCGTCCCCCGTGTATGGACGGCTGGATCCCG

TGGTTCGGCTGCGCCATCGACTTCGGAAAAGCCCCTCTTGACTTCATCGAAGAGACAAAG

CGGAAGGT

>ATGN303672.g1 aa 60-117

629 LGPVFTIVAAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHT (1)

 

walked down from ATUP838185.x1 to AFPZ337166.y1 AFSA400861.b2

ATUP929649.y1  ATWW116863.g1  AFSA358917.g03 AFSA358917.g3 

ASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQLEQLEHHGKDDLNTLVRR (2)

AGCCTCAGTGAGTACAGAGTCGTTCTTTCAGCACCACACAAAGATC

CACGACACGATAAAAGGGCGCCTGGCGCCCGCAAACCTGCACAGCTTCTGTTCCAACCTG

TGGGGGGAGTTCAAGCAACAGTTGGAACAGCTGGAGCATCATG

GGAAGGATGACCTCAACACACTAGTGAGGAGGT

 

>ATUP158518.b1 aa 166-213

753 CMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLR (2) 607

 

>AFSA40665.y1   AFSA10785.x1   AFPZ913982.b2  AFPZ737815.x1  ATGN30214.b1  

ATUP158518.b1  ATGN191626.b1  ATWW47273.b1   AFSA788203.g2  AFSA136204.b2 

(2) EWAESKKWLLSLFSRSIANMERKETESQ (0)

AGGGAATGGGCAGAGTCCAAAAAGTGGCTGTTGTCACTT

TTTTCAAGATCAATAGCTAATATGGAGAGAAAGGAAACAGAATCTCAAGT

 

>ATWW47273.b1

(0) TLLQSLTKMVDRPHAPNYALLMLWASQANAVPV (1)

 

>ASWX108980.b2

MSFWVLAMILSNEDVHAAVKKEVQDNLGSP (1)

 

>ATUP929649.x1 aa 310-350

810 (1) GDEPITEEDLKKLPLLKRCIMETIRLRSPGVITRAVDKPLRIR (0)  697

AGGTGATGAACCAATCACTGAGGAGGATCTGAAGA

AGCTGCCCCTGCTGAAGCGCTGCATCATGGAGACCATCCGCCTCAGGTCACCTGGGGTCA

TCACCAGGGCTGTGGACAAACCACTGAGGATCAGGGT

 

aa 353-383

279 (0) KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP  (0) 184

 

>ATUP879374.g1 heme, plus strand  aa 380-

ATUP879374.g1  ATUP879374.b1  ATGN226830.g1  ATUP107319.x4 

ATUP730302.x1  ATUP374814.g2  AFSA368526.g2  AFSA276902.b2 

AFPZ475698.b2  AFPZ457102.x1  AFPZ419288.y1 

167 (0) DRWLDADLEKNLFLDGFVGFGGGRYQCPGR (2)

AGGACCGCTGGCTTGA

CGCTGACCTGGAGAAGAATCTGTTTCTGGATGGGTTTGTTGGCTTTGGTGGTGGAAGATA

CCAATGTCCTGGGAGGT

 

>AFSA276902.b2

WFALMEMQMLLAMMIQMFDFKLLGEVPKEVCQNFNYLISIHII*

 

>CYP39 amphioxus 49% to CYP39 zebrafish,  start MET not certain, 2 choices

MATTIGEHSPGDELYNAFKY

MILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRK (0)

LGPVFTIVAAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHT (1)

ASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQLEQLEHHGKDDLNTLVRR (2)

CMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLR (2)

EWAESKKWLLSLFSRSIANMERKETESQ (0)

TLLQSLTKMVDRPHAPNYALLMLWASQANAVP(0)

MSFWVLAMILSNEDVHAAVKKEVQDNLGSP (1)

GDEPITEEDLKKLPLLKRCIMETIRLRSPGVITRAVDKPLRIR (0) 

KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP  (0)

DRWLDADLEKNLFLDGFVGFGGGRYQCPGR (2)

WFALMEMQMLLAMMIQMFDFKLLGEVPKEVCQNFNYLISIHII*

 

>Xenopus tropicalis CYP39 from ESTs CX931744.1 CX851900.1 CX876827.1

MDPIASVSSALLSPTAALGLLVALLTAVLVRYLLPNGSQKPPY

PPCIRGWIPWFGAAFDMGKAPLEFIARAREKHGPIFTVLAAGNRLTFLSGKEGISAFFSSKE

ADFQQAVQKPVQHTASINKEDFLKSHSSIHETIKLRLSQNRLHLYFDRIRNEFSTRIELP

NPEGTEDLFALVKKVMYPAVADTLFGKGLCPTGKGKLEEFAEHFWKFDEGFEYGSQLP

EFLLRDWSQSKQWLLRLFKKIVIEAEMNNPLEETSKTLHQHLLDTLKGNSTYNNSLLLLW

ASQANANPVTFWTLGFIISDPLVYKAAMDEIHSVFGKAGNKELNMNEAELKRLPFIKTC

VLEAIRLRSPGAITRKAVQPLKINNYLVPAGDLLMLSPYWLHRDPTLFPEPEMFRPERWS

KANLEKNVFLEGFVAFGGGKYQCPGRWFALMEMHMLVVMMLYKYEFSLLDPLPKQSNLHL

VGTQQPDGPCRVRYKLRK*

 

>CN061761.1  Ambystoma tigrinum tigrinum cDNA, mRNA salamander

MDAAAAVLLTLALVLIFRMLVLRRGTPGA

PPCVRGWIPWLGAAFELGKAPLQFIEQARAKHGPIFTVLAARNTMTFVFDEEGMSAFFTSKQVDFAQAVQ

KPVQYTASITKENFYKGHNDIHVLMNSRLSQSNLHLYMKNLCEKLHRPMESLGTEGPLGS

L*

 

>CYP39 zebrafish  ctg9833 48% to 39A1 AL929058.4 96727-71438

130332 MEWITLILTVSVVIISFHLLFGKSHPNAPPCIRGWIPWFGAAFEFGKAPLHFIQQARAK 130156 (0?)

127387 YGPVFTVVAAGKRMTFVTLNEDFRVFFTSKDVDFEQAVQEPVHNT 127253

127049 ASISKDNFFESHPTCSAIIKGRLTPGNTAMLSPHLCEEFNDHLESLGSEGSGQLNELIK 126873

125860 SVMYPSVMSNLLGRCNSPSSALSRQEFLEKFTTYDEGFEYGSQLPEMFLK 125711

124805 EWSNSKHWLLSL 124770 LRKMVIKSEETLYSESDRK

123739 TLLQHLAASISEQYLPNYGLLLLWASLANAIP 123674 (0)

       VTFWAVAFILSNPTAYKIVMDQINSVLGRQ

       MSFWVLAMILSNEDVHAAVKKEVQDNLGSPG

120968 DKQKTKVTLDDLQQMPYVKWCIMEAIRLRAPGAITRKVVRPLKLQ 120840

116801 NYVIPPGDMLMLSPYWAHRNPKYFPDPEDFKP 116706

112276 ERWETADLEKNVLLEGFVAFGGGKNQCPGR

105126 WYAIMELHMFVALILYKFEFAQLDPMPK 105043

 

ATGGAATGGATTACTTTAATTCTGACGGTGTCTGTGGTGATTATTTCGTTTCATCTTCTCTTTGGTA

AAAGTCACCCCAATGCTCCACCATGTATAAGAGGTTGGATACCCTGGTTTGGGGCTGCTT

TTGAATTTGGCAAAGCACCTTTGCATTTCATCCAGCAAGCCAGAGCCAAA

 

TATGGTCCAGTTTTCACAGTGGTGGCTGCTGGGAAGAGAATGACATTTGTGACCCTAAATGA

GGACTTCCGAGTTTTCTTCACCTCCAAAGATGTTGATTTTGAACAAGCAGTGCAAGAACC

CGTTCACAACACAG

 

CTTCTATCAGCAAAGACAACTTC

TTTGAGTCCCATCCGACCTGTTCGGCTATAATCAAAGGCAGACTGACTCCGGGCAACACG

GCGATGCTCTCTCCTCATCTCTGTGAAGAATTTAATGACCATTTGGAGAGCCTTGGAAGC

GAAGGATCTGGACAGCTAAATGAGCTCATCAA

 

GAGTGTGATGTATCCGTCAGTGATGAGCAATTTACT

AGGCCGCTGTAATTCGCCCAGCAGCGCTCTCAGCAGGCAGGAGTTCCTGGAGAAGTTTAC

AACCTACGATGAGGGATTTGAGTACGGCTCACAACTCCCAGAGATGTTCCTCAA

 

GGAATGGTCAAACTCAAAGCATTGGCTGCTTTCTCTGCTGAGAAAAATGGTTATCAAATCA

GAGGAGACCCTCTATAGTGAGAGTGATAGAAAG

 

ACTTTACTACAGCACCTGGCTGCTTCAATAAGTGAACAATATCT

ACCCAACTATGGTTTACTGTTGCTTTGGGCTTCCTTGGCAAATGCCATACCA

 

GTAACATTTTGGGCAGTTGCTTTCATTTTGTCCAATCCCACAGCATATAAGATTGTA

ATGGATCAGATCAACTCTGTCCTTGGTCGCCAAG

 

ACAAGCAGAAGACAAAGGTGACCCTAGATGACTTGCAGCAGATGCCGTATGTGAAGTGGTGTATC

ATGGAAGCCATTCGGCTGAGGGCCCCCGGAGCCATCACTCGCAAAGTAGTGCGACCTCTC

AAATTACAG

 

AATTACGTTATCCCACCTGGAGACATGCTGATGTTG

TCTCCTTACTGGGCTCACCGAAACCCCAAGTATTTCCCAGACCCGGAAGATTTCAAACCTGgc

 

GAACGATGGGA

AACGGCAGATTTAGAAAAGAATGTCCTCTTGGAAGGGTTTGTGGCATTCGGAGGGGGGAA

AAATCAGTGCCCAGGAAG

 

GTGGTACGCCATTATGGAGTTGCATATGTTCGTGGCCCTCATCCTCTACAAGTTTGAGTTTG

CCCAATTGGATCCAATGCCTAAG

 

Xenopus ESTs for CYP39 CX931744.1 CX851900.1 CX876827.1

 

MDPIASVSSALLSPTAALGL

LVALLTAVLVRYLLPNGSQKPPYPPCIRGWIPWFGAAFDMGKAPLEFIARAREKHGPIFT

VLAAGNRLTFLSGKEGISAFFSSKEADFQQAVQKPVQHTASINKEDFLKSHSSIHETIKL

RLSQNRLHLYFDRIRNEFSTRIELP  

NPEGTEDLFALVKKVMYPAVADTLFGKGLCPTGKGKLEEFAEHFWKFDEGFEYGSQLPE 1

FLLRDWSQSKQWLLRLFKKIVIEAEMNNPLEETSKTLHQHLLDTLKGNSTYNNSLLLLWA

SQANANPVTFWTLGFIISDPLVYKAAMDEIHSVFGKAGNKELNMNEAELKRLPFIKTCVL

EAIRLRSPGAITRKAVQPLKINNYLVPAGDLLMLSPYWLHRDPTLFPEPEMFRPERWSKA

NLEKNVFLEGFVAFGGGKYQCPGRWFALMEMHMLVVMMLY

KYEFSLLDPLPKQSNLHLVGTQQPDGPCRVRYKLRK*

 

ATGGA CCCTATCGCC

AGTGTCTCCT CCGCTCTGCT CTCCCCTACC GCCGCGCTAG GGCTGCTGGT GGCGCTACTC

ACCGCGGTGC TGGTTCGGTA CCTGCTGCCC AATGGGTCCC AGAAGCCGCC CTACCCTCCC

TGCATCCGAG GCTGGATCCC CTGGTTCGGG GCTGCGTTTG ACATGGGGAA AGCGCCCCTG

GAGTTCATCG CCAGAGCCAG GGAGAAGCAC GGCCCCATAT TCACAGTGCT GGCAGCGGGG

AACAGGCTTA CGTTTCTCAG CGGAAAGGAG GGCATATCTG CATTTTTCTC TTCCAAAGAA

GCCGATTTCC AGCAAGCTGT ACAGAAACCT GTTCAACATA CAGCTTCAAT AAACAAGGAG

GACTTCTTGA AGAGCCACTC CTCTATACAC GAGACCATAA AGCTGCGCTT GTCACAGAAC

AGGCTCCATC TTTACTTTGA CCGGATACGG AATGAGTTCA GCACGCGGAT AGAACTTCCG

AATCCAGAAG GGACCGAAGA TCTCTTTGCC TTGGTAAAGA AAGTCATGTA CCCAGCAGTG

GCAGATACCC TGTTTGGCAA GGGCTTGTGT CCAACTGGGA AGGGCAAGCT GGAAGAGTTT

GCCGAGCATT TCTGGAAGTT CGATGAAGGC TTTGAGTATG GCTCTCAGCT GCCGGAGTTC

CTCTTGAGAG ACTGGTCTCA GTCCAAACAA TGGCTGCTAA GGCTCTTNCA GAAAATCGTA

ATAGAAGCGG AGATGAACAA TCCTCTCGAG GAGACATCTA AGACATTGCA TCAGCACTTA

CTGGACACGC TCAAAGGGGA CTCCACCTAC AACAACAGCC TGCTGCTTCT CTGGGCATCT

CAAGCCAACG CCCAA

CCAGTTACCTTTTGGACGCTTGGCTTCATTATTTCAGATCCGCTGGTGTACAAGGCTGCCATGGATGAGATACATTCTGTATTTGGTAAAGCAGGTAACAAAGAATTAAATATGAATGAAGCTGAGCTGAAAAGGCTTCCATTTATCAAAACCTGCGTGCTGGAGGCGATACGGCTAAGATCTCCCGGGGCCATCACCAGGAAAGCTGTTCAGCCCCTGAAGATTAATAATTACCTTGTTCCGGCTGGGGACCTCTTGATGTTGTCGCCTTATTGGCTCCACCGAGACCCGACGCTGTTCCCTGAACCGGAAATGTTTCGGCCTGAACGCTGGAGCAAAGCAAATTTAGAGAAGAATGTTTTCTTGGAGGGTTTCGTGGCCTTTGGCGGAGGGAAGTATCAGTGTCCGGGAAGATGGTTTGCACTGATGGAGATGCACATGCTTGTCGTGATGATGCTCTACAAGTATGAATTTAGCCTTCTGGACCCACTTCCAAAACAGAGCAATCTTCATTTAGTGGGAACTCAGCAACCTGATGGGCCCTGCCGTGTTCGATACAAGCTGAGAAAATAA

 

CYP46

 

>CF917908 BI377382 Amphioxus 5-6 hrs cDNA 52% to CYP46 fugu

ATGI35342.g1  ATUP100909.x1  ATUP374858.g2  ATUP181014.g1

AFSA27081.b2 ATUP181014.b1  AFPZ282931.x1 (possible exon 2)

walked upstream to ATWW110117.g1 ATUP762936.y1 (N-term)

MAVVAVLMVLGVLAVVGLAVAGVVYLGYIYYMHRKYDHLPGPPRKS (2)

MNLLTYPICTTFLTLYSDENEYLTAFLD (0)

LLFCERYGPIVRLNFLHRVIIFVSSPEAVR (0)

ELLVTGKYIKPPDQYERIGSIFGERQ (0)

FLGEGLVTETNQERWHKRRRIMDPAFSRKYLQTL

MDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAK (0)

VAFSMDLNTILDDHTPFPMATYITLSALIQQFRHPFME (0)

YNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKAN (1)

DEDSGLTVEKLVDDFVTFFIA (1)

GSETTANQLSFTLMELGRYPDVLEK (2)

LRAEMREVCGNKEYITYEDIGKLQYMGQ (0)

VLKESLRMYPPATGTSRLVEEEMELCGHRIPGDTVLI (0)

TSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQ (0)

IEAKVLLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*

 

 

>CYP46a zebrafish ctg30700 62% to Fugu CYP46 90% to 276353

chick           LLLLLALLLLLVAGLYCCYVRRVHAAYDHIPGAPRE

rat          MSPGLLLLGSAVLLAFGLCCTFVHRARSRYEHIPGPPRP

Xeno  MELWVFIGWAALLLLALAFICFLLYCGYIQYIHMKYDHIPGPPRD

danio2 MISEWIFYIILYLLAAVFTAFFAYCLYVHHIHQKYDHIPGPPRD

       MISEWIFYIILYLLAAVFTAFFAYCLYVHHIHQKYDHIPGPPRD 276222

276114 NFLLGHSPSLTKALYSDDSLIYDLFLQW 276031

274832 AEKYGPVYRINTLHYVTIVVYCPEATK 274752

274005 TIMMSPKYTKDPFVYRRVFNLFGKR 273931

271082 FLGYGLITAVDHDIWYRQRRIMDPAFSSS 270996

       YLRSLISTFDEMSERLMDKLEEIANNKTPAVMHDLVNCVTLDVICK

270501 VAFGVDLNFLTQKDSPFQNAVELCLKGMTFDVRDPLFR

       IIPKNWKLIQQVREATELLRKTGEKWIQNRKTAVKNGEDVPKDILTQILKSAGM

269262 QNVTSTEDF EQMLDNFVTFFMA 269197

268999 GQETTANQLSFAIMELGRNPEIYKR 268925

268842 AKAEVDEVLGTKREISNEDLGKLTYLSQ 268759

268675 VLKETLRLYPTAPGTNRWLHEDMVINGIKIPGGCSVM 268565

268365 FNSFVSQRLEKFFKDPLKFDPKRFDENAPK 268276

296567 PYYTYYPLALGPRTCIGQVFSQ 296502

264748 MEAKVVLAKLLQRFEFSLVPGQSFDIKDTGTLRPKSGVICYIKQCS* 264608

CYP51

Assembled from parts by searching the trace archive with discontinuous

Megablast, selecting Branchiostoma_WGS as the database

AFPZ2891.b1 potential N-term exons 1,2

APWS98017.g1 1025 bp NEDLN exon 3

ATWW63735.b1 NEDLN exon goes 780 bp upstream

AWXX2239.g1  1014 bp QKK exon 4

ASWX166314.b2 QLYAD exon plus QKK exons 4,5,6

ATUP479444.b1 exons 7,8

ATUP655342.b2 I-helix and EXXR helix exons 9,10

AFPZ426833.g2 PFGAG exon 11

ATGN38239.b1 heme signature exon 12

67% to Xenopus CYP51

predicted N-term exon

     MLVEMGNLLLENALETVQELGSGTVALTTIVVLLGVTYFGRQFVSSVGKAE (0)

     KLPPVVPHTIPILGHGYNFYKNPIGFLEEAYKK (0)

     YGPVFTITMAGSKFTYLVGSDAAATLFNSK 394 NEDLNAEEVYSRLTTPVFGKGVAYDVPNP (0) 314

461  VFLEQKKMFKTGLNIARFRTHVSLIEEETKEYFKRWGDSGER (1) 332

     DLFEALAQLTILTASRCLH (1)

777  GKEVRSMLHEGIAQLYADLDGGFTQMAWLLPGWLPLPSFR (2) 896  

     KRDRANREMKKVFKK (0)

     IIQQRRESGDCDDDMLQTLMESTYK (2)

884  RDGRPLTDDEITGMMIGLLMAGQHTSSTTSTWMGFFLAKHKDIQARAYQEQLDICGEDLPPLNYDD (0) 687

252  LKEMALLDKCLAETLRLRPPIMTMMRMCKTPQQVKGYTIPVGHQVCVSPTVNQKLEDTWEEAGTWNPNR (2) 46

     FLEGNASTGKFSYVPFGA (1)

403  GRHRCIGENFAYVQIKTIWAVLLREFEFELIDGHF 507

     PSINFETMIHTPSQAIIRYKKR*

 

 

ATWW63735.b1 rev comp

TTCAGCAGTGTCGGCAAGGCAGAGGTAAGGGACAGTTCATTTGATCCATCATGTAGGGTTTGGTGTGAGAAAATCATGAGAAGAAATGAAGATATGTCGCCTTAAACTTTAATGTAAAATGTAACAAAAGAGATCGTTAACTAGTTATAGAATTATAGTAATAATTTCATAGGGACAGTCCATTACTGTAATACTGGGATAGGTTTGGGTTGCTGCTATTTGGCATCTGAAAACTTGTACAATGTATATTGAATTTGAACGCTTGTTCTTCGTGTTTGTATGTTTCAG AAGTTGCCACCTGTAGTGCCTCACACTATCCCAATACTTGGTCATGGTTACAACTTCTACAAGAACCCCATCGGATTTCTTGAGGAGGCATACAAGAAG GTAACGTTAGCTTGTTATACCTCAATCACTTATACCAAGGAGGATAACCTCCTTGCTTATACAATTACTGTTGTATGTACATTAAAAGAATGACATGTTCCTTTAAAAATTGAGACTAATGTTATGAGTTAAGTTATGACCTACTTAGATATTTGCAACTTTTTATTTGACTGTTGTTGGAAAGTCTTTGTCAACAAGTACTGGGTATTTGTCTTAAACTTTTATCTTACTGTTACTTTGTGCTCTAAATTACTGTTAGTATGTCAGCAAACTGTTTTTTTTTCTCCTACACCTTTCCAG TATGGCCCAGTCTTCACCATCACCATGGCAGGGTCAAAGTTCACATACTTGGTAGGGTCTGACGCAGCAGCGACTCTGTTCAACAGTAAGAACGAGGACCTGAATGCTGAGGAGGTGTACTCACGCCTTACCACACCTGTCTTCGGCAAGGGTGTGGCCTATGATGTTCCCAACCCT GTAAGTTGAAGAGACTTTATTGATGATTTCACTCTCACTCAGTCACTAACCGGGATAGCGTCCGTGTTCCTCTACAGATACAANNN
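
The read above is shown as its reverse complement. A short sketch of that operation, for reads that match on the minus strand, is given here. The name reverse_complement is just a label for this example, N is kept as N, and whitespace is removed first, so the spaces used above to set off exons are not preserved.

```python
# Complement map for unambiguous bases plus N, in both cases.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    return "".join(dna.split()).translate(COMPLEMENT)[::-1]


print(reverse_complement("AAGTNNN"))
```

Applying the function twice returns the original sequence (minus whitespace), which is a quick sanity check.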

 

ATGN38239.b1 heme signature last exon

AGGTCGGGATTCTCCTACGTGCATTTGGTGCAGGTAAGTTGATGAACACTCATTACGACA

TTGACTTATGTCCTATCTGTGGCTGACGTCGGAGGTTTAAAAAAAAAGACACTTAAAGCA

GCTTTCAGACTTAAAATATTGTTTTAATTCTTTGTACTATACATGAGGTAAATGAATCAT

AATCCAGAGATATCTAGAATATTTTGCAGTACTATAGATACAACTGTTGGACATTTGAAG

GGGGGATGGATACAAAAAGCTGTGACTCACTAGTGTACAAAATTGTTACAACTTTGTTGA

TATACTGTAACCATAAAGTGGTTTCTGCTATCTCAGTCATAGAATATAACTGATTCTCCT

TGTAGTATGGTTATATTTGAATTGTTATTTACTGTTTACCCAG GCCGCCACCGCTGCATT

GGTGAGAACTTTGCGTACGTCCAGATCAAGACCATTTGGGCAGTGCTGCTGAGAGAGTTC

GAGTTTGAACTGATCGACGGACACTTCCCTTCCATCAACTTTGAAACCATGATCCATACC

CCATCACAAGCCATCATCCGTTACAAGAAACGA TAAAAGTTCATACCACAGCTTAGCCTG

AGTACCATCCTTGTTGGTTTAAACTCACTCTCATATTAGTATGTTGTCAAACACTAGTGA

TATAGGAGCAAGCCTAAACTAACAGGTATGGTACCCAGGCTAAGACAAGTTGTTTTTATG

ATGTATTATGATTATGAATATACTTCATACACTTCATCTGTTAAAGTCTTATGTCTGCCA

TAGACAAAACTTCAATACAGCCTAGCAAAGATGACAATATTGTACTTTTAAGTTGACTGA

TACATTTTCATACAGAAAGCAATTGAGATCACTAGTATGTTACCATTCATCTTCCAGGGA

AGTTCTAGATAGCAATGTTGAAATGTTTTGAGATTGAT

 

CYP unassigned, low similarity to CYP7/8 in animals, CYP74 in plants

>BI386673.1 Branchiostoma floridae cDNA clone

note: heme signature not intact

 2  LFFRHAHTRPVLEKHEASRRKVSG

74  KMNTLESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVRKGERMLGCCFFAQRDGS 250

251 VFPDPDRFRWNRFLDEQGGQKKHLFFPRGSFTEAADLDSHQCPGQDIGFFMMKTTLAVLL 430

431 CYCTWELNDAPVWSDKPIRGGNTGDHV 511

>gnl|ti|662623028 ATUP882989.g1 from trace files

                                 SFMEAADLNSHQCPGQDIGFFMMKTTLSVLL

    CYCSWELKDAPVWSDKPIRVGNPDDPVRLVRFNFRSEQAGRALVNTSAKKI*

 

>gnl|ti|681897069 name:ATUP358726.x1 genomic

SALRRLSPNQDVPDFEERFSHIRSETLTEALFGRKIDGQLCFTWLNGLITEAKTWIPMPS

LAWKRRQAIKAIPELLKAIETAPKYQELVQLCHTH

GVEVEEGIFTILYGTLFNGCAAQTA

AIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGEMKTLESFILEVLR

 

>gnl|ti|646690304 ATWW129463.b1 100% match

NNNNNCCAGTGAAATCTCCGTCATGGAAGTCAGTCAGTCAGAAACGCCATGATCTTCCACCCTTACATCT

GCTTGTAGATTAGTACCTTCCACGAGGGAAGAAAAGATGCTTCTTCTGTCCTCCTTGTTCATCAAGAAAG

CGGTTCCACCGGAAACGGTCCGGATCCGGAAACACAGATCCGTCACGCTGAGCAAAGAAGCAACATCCAA

GCATTCGCTCCCCCTTGCGGACCTACGGAGCATAGCAACACATACATAAATGTATTTCAAACTTTATGGT

CAACTTTATCATCAATAACTGAACTGTTAATCAACACAACATTGAAAATAATTCACCGAATCCAAAAGGA

GCCACTTTGGTTGGGGGCACACAAATGGCTAATTTCATTCTATGCCCGATTGCGTGTTTGCGGGATTGTT

CGTTCGACCGGTATAACCGGGCAATTTTTCAGTTAATACCTCCGACTGATTACCATGGCAGTACATGCAT

CTGCCGTGATTGATAATTGACATAAATTACACGGCAATCCGCAATCTACGGCAACACAACCCAGATGTTG

TAAAAGTTTAAGTTTGAAGCAAAGGAGATACGTATGAAGTCATCAGAGGTCGTTAAAAAGGCATGGGTAT

GTTCAGTTGGGAAATCAAAGTAAAAGGTTGGGTCTTTGGGGCGTTGTGGAAGTAGGGGCCGAGAGTTCTT

ATCGAATTCTGGCCACTACTATCCCCCGACCCAGGCAATGTTGATTAATACTCTCCAGGCAGAGTTTTGG

CTCTGGCTGTTTTAGACATTTTCTCTGGCTTTCTATTTTCTACCATACTCACTTTTCACTATTCATGACA

CGCCCCCAAGACGGAGATTGTGTTCGGCCAGTTTTTAAGGAAATTTCGTGTTTCGACTTCTCCAGCCCGG

CATATAGATGACCCACNACCGTGGAATTATATACAGAAATTAGTCCATTCTACAG

 

>BI381296.1 Branchiostoma floridae cDNA clone close to BI386673.1

12  RKHLFFPRGSFTEAADLNSHQCPGQDIGFFMMKTTLAVLLCYCSWELKDAPVWSDKPIRV 191

192 GNPDDPVRLVRLNFRSEQAGRALVNTSAKKI 284

 

>cluster02001.2 frame1 from http://goblet.molgen.mpg.de/cgi-bin/Blast-amphioxus.cgi

 LFFRHAHTRPVLEKHEASRRKVSGKMNTLESFILEVLRLHPPVFNYWVLARKDLVISPEK

 ENIKVRKGERMLGCCFFAQRDGSVFPDPDRFRWNRFLDEQPRHKKHLFFPRGSFTEAADL

 NSHQCPGQDIGFFMMKTTLAVLLCYCTWELNDAPVWSDKPIRGGNTDDHVRLVRLNFRSE

 QAGRALVNTSAKKI*