Vitis vinifera cytochrome P450s

 

 

This file includes 222 sequences found in GenPept by searching for

Vitis[orgn] AND P450 on Sept. 20, 2007.  These start with CAN.

Note: on Oct. 4, the same search found 642 accessions.  These include

416 sequences from the other grape genome project starting with CAO.


 

These automated assemblies have not been checked against known

P450s for errors in assembly, gene fusions, etc.

 

262 accessions from the grape genome project in the WGS section have been

mined for P450s and they have been assembled and sorted into family groups. 

(see bottom of file for a complete list of the CAAP accessions)

 

591 sequences are present below, but some are duplicates. Gene sequences are being clustered into bins of identical or presumed identical genes, each indicated with a #

followed by a number.  Pseudogenes are being labeled in a similar way with an

@ sign followed by a number.  (in progress)

Oct. 4, 2007, revised Nov. 14, 2007
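
The # and @ bin labels described above lend themselves to a simple grouping pass. The following is a minimal Python sketch, not part of the original annotation workflow: the name group_by_bin is hypothetical, and it assumes that each label line applies only to the first > header that follows it (the file does not state how far a label extends).

```python
import re
from collections import defaultdict

# A bin label is a line holding only '#' or '@' followed by a number,
# e.g. '#1' (gene bin) or '@7' (pseudogene bin).
BIN_LABEL = re.compile(r"^([#@])(\d+)$")


def group_by_bin(lines):
    """Map each bin label to the header lines filed under it.

    Assumption (not stated in the file): a label applies only to the
    first '>' header after it; later unlabeled headers are skipped.
    """
    bins = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # the file is double-spaced, so blank lines carry no meaning
        match = BIN_LABEL.match(line)
        if match:
            current = match.group(1) + match.group(2)
        elif line.startswith(">") and current is not None:
            bins[current].append(line[1:])
            current = None
    return dict(bins)
```

Labels starting with # would then be gene bins and labels starting with @ pseudogene bins, following the convention stated above.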

 

For an older file of P450s from grape see /vitis.old.htm

 

P450 sequences in CYP family order

 

 

Table of 49 P450 families present

CYP83-like sequences here merged with CYP71AT

(missing 91 (merged with CYP81), 95 (part of the CYP72 family), 99 (grass specific),

702 (Brassicales only), 705 (part of CYP712), 708 (Brassicales only), 713 (merged with CYP71A),

717 (merged with CYP81), 719 (Ranunculales), 723 (grass specific), 725 (Taxus only overlaps 716),

726 (part of CYP71, Euphorbia), 729, 730 (protist contaminant),

731 (protist contaminant), 732 (protist contaminant))

 

The only missing families that appear lost in Vitis are CYP729 and CYP749

 

CYP51   2 genes,  1 pseudogene

CYP71  24 genes, 28 pseudogenes   52 sequences

CYP72  22 genes, 21 pseudogenes   43 sequences

CYP73   3 genes,  0 pseudogenes

CYP74   7 genes,  0 pseudogenes

CYP75  11 genes,  8 pseudogenes

CYP76  24 genes, 23 pseudogenes   47 sequences

CYP77   2 genes,  0 pseudogenes

CYP78   7 genes,  0 pseudogenes

CYP79   9 genes, 13 pseudogenes    4 alleles/duplicates 26 sequences

CYP80   6 genes,  0 pseudogenes

CYP81  21 genes, 14 pseudogenes   35 sequences

CYP82  34 genes, 37 pseudogenes   18 alleles/duplicates 89 sequences

CYP84   3 genes,  0 pseudogenes

CYP85   2 genes,  1 pseudogene

CYP86   6 genes,  1 pseudogene

CYP87   7 genes,  0 pseudogenes

CYP88   2 genes,  0 pseudogenes

CYP89  14 genes, 11 pseudogenes   25 sequences

CYP90   4 genes,  1 pseudogene

CYP92   6 genes,  1 pseudogene

CYP93   4 genes,  0 pseudogenes

CYP94   9 genes,  0 pseudogenes

CYP96   5 genes,  2 pseudogenes

CYP97   3 genes,  0 pseudogenes

CYP98   1 gene,   0 pseudogenes

CYP701  1 gene,   0 pseudogenes

CYP703  1 gene,   0 pseudogenes

CYP704  6 genes,  0 pseudogenes

CYP706  9 genes,  7 pseudogenes

CYP707  5 genes,  0 pseudogenes

CYP709  1 gene,   0 pseudogenes

CYP710  1 gene,   1 pseudogene

CYP711  1 gene,   1 pseudogene

CYP712  2 genes,  2 pseudogenes

CYP714  6 genes, 11 pseudogenes    1 allele 18 sequences

CYP715  1 gene,   0 pseudogenes

CYP716 15 genes,  7 pseudogenes   22 sequences

CYP718  0 genes,  1 pseudogene

CYP720  1 gene,   0 pseudogenes

CYP721  5 genes,  3 pseudogenes

CYP722  1 gene,   0 pseudogenes

CYP724  2 genes,  0 pseudogenes

CYP727  1 gene,   0 pseudogenes

CYP728  6 genes,  2 pseudogenes

CYP733  1 gene,   0 pseudogenes

CYP734  2 genes,  0 pseudogenes

CYP735  1 gene,   0 pseudogenes

CYP736  8 genes,  4 pseudogenes

 

Totals 315   +  201        +           23  =   539

315 named genes, 201 named pseudogenes, and 23 alleles/duplicates = 539 named sequences.
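
The per-family counts and the totals line can be cross-checked mechanically. The sketch below is an illustration only: parse_row and check_rows are hypothetical helper names, and the regular expression assumes that rows are laid out as in the table above (family, genes, pseudogenes, then optional allele and sequence counts).

```python
import re

# One table row, e.g.
# 'CYP79   9 genes, 13 pseudogenes    4 alleles/duplicates 26 sequences'
ROW = re.compile(
    r"^(CYP\d+)\s+(\d+)\s+genes?,\s+(\d+)\s+pseudogenes?"
    r"(?:\s+(\d+)\s+alleles?(?:/duplicates)?)?"
    r"(?:\s+(\d+)\s+sequences)?\s*$"
)


def parse_row(line):
    """Return (family, genes, pseudogenes, alleles, stated_total) or None."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    family, genes, pseudo, alleles, stated = match.groups()
    return (
        family,
        int(genes),
        int(pseudo),
        int(alleles or 0),          # rows without an allele count get 0
        int(stated) if stated else None,
    )


def check_rows(lines):
    """Sum the three columns and list families whose stated row total
    differs from genes + pseudogenes + alleles."""
    genes_sum = pseudo_sum = allele_sum = 0
    mismatches = []
    for line in lines:
        row = parse_row(line)
        if row is None:
            continue  # skip blank lines and non-table text
        family, genes, pseudo, alleles, stated = row
        genes_sum += genes
        pseudo_sum += pseudo
        allele_sum += alleles
        if stated is not None and stated != genes + pseudo + alleles:
            mismatches.append(family)
    return (genes_sum, pseudo_sum, allele_sum), mismatches
```

Feeding the table lines to check_rows returns the three column sums plus any family whose row total does not add up, which is a quick way to catch a count that has drifted onto the wrong row.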

 

#1

>CYP51G6 CAAP02000072.1 81% to 51G1 Arab.

190429  MDVDNKFFNVALLIVATVVVAKLISALLIPKSRKRLPPTVKAFPVIGGLLRFLKGPVV  190256

190255  MLREEYPKLGSVFTLNLLNKNITFFIGPEVSAHFFKAPEADLSQQEVYQFNVPTFGPGVV  190076

190075  FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE (0) 189968

188248  DYFSKWGDSGEVDLKYELEHLIILTASRCLLGQEVRDKLFADVSALFHDLDNGMLPISV  188072

188071  IFPYLPIPAHRRRDQARTKLAHIFANIIASRRETGKSENDMLQCFMDSKYKDGRQTTEAE  187892

187891  VTGLLIAALFAGQHTSSITSTWTGAYLFRHKEFLSAVLDEQKNLMKKHGNKVDHDILSEM  187712

187711  DVLYRCIKEALRLHPPLIMLLRSSHSDFSVTTKDGKEYDIPKGHIVATSPAFANRLPHIY  187532

187531  KDPERYDPDRFAVGREEDKVAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFEFE  187352

187351  LISPFPEIDWNAMVVGVKGKVMVRYKRRVLPVD*  187250
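
The numbers flanking each sequence line in records like the one above appear to be contig coordinates of the first and last nucleotide of the translated segment, with (0) marking an exon boundary. Under that reading (an assumption, since the file does not define the layout), each line should span exactly three nucleotides per residue, counting a * stop as one codon. A minimal Python sketch of that check, with hypothetical helper names parse_segment, span_matches and strand:

```python
import re

# '<start>  <residues> [(phase)] <end>', e.g.
# '190075  FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE (0) 189968'
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\((\d)\))?\s+(\d+)\s*$")


def parse_segment(line):
    """Return (start, residues, phase_or_None, end), or None if the
    line is not a fully coordinate-flanked sequence line."""
    match = SEGMENT.match(line)
    if match is None:
        return None
    start, residues, phase, end = match.groups()
    return int(start), residues, (int(phase) if phase is not None else None), int(end)


def span_matches(start, residues, end):
    """True when the coordinate span equals 3 nt per residue ('*' included)."""
    return abs(end - start) + 1 == 3 * len(residues)


def strand(start, end):
    """Descending coordinates are read here as the (-) strand (assumption)."""
    return "-" if start > end else "+"
```

A line that fails span_matches is a candidate for a mistyped coordinate, a frameshift, or a deliberately partial annotation, so a failure is a prompt to look at the record rather than proof of an error.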

 

#2

>CYP51G CAAP02000381.1 = AM475390.2, 81% to 51G1 Arab. 90% to CAAP02000072.1

97293   MDVDNKFFNAAFLLVATLVVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGP  97460

97461   VVMLREEYPKLGSVFTLKLLNKNISFFIGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPG  97640

97641   VVFDVDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE (0) 97754

104754  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  104930

104931  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIASKYKDGRPTTESE  105110

105111  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  105290

105291  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  105470

105471  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  105650

105651  LISPFPEVDWNAMVVGVKGKVMVRYKRRELPVN*  105752

 

>CYP51G1 AM475390.2 Vitis vinifera (Pinot noir grape) = CAAP02000381.1

9521  MDVXXKFFNAXFLLVATLLVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGPVVML  9342

9341  REEYPKLGSVFTLKLLNKNISFFVGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPGVVFD  9162

9161  VDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE   (0) 9060

3862  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  3686

3685  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIDSKYKDGRPTTESE  3506

3505  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  3326

3325  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  3146

3145  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  2966

2965  LISPFPEVDWNAMVVGVKGKVMVRYKRREL  2876

 

@1

>CYP51G7P pseudogene CAAP02006913.1

792 SHIFIGGGRNRCLGQHFAYLQVKAMWSHLL*NFEL*PISPFSKINWNAMVVGV 950

 

>CYP71AH1 old 71A11 tobacco

MKFLLVVASLFLFVFLILSATKRKSKAKKLPPGPRKLPVIGNLLQIGKLPHRSLQKLSNEYGDFIFLQLGSVPTVV

VFSAGIAREIFRTQDLVFSGRPALYAGKRFSYNCCNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSS

LVQIICSSLSSPVNISTLALSLANNVVCRVAFGKGSDEGGNDYGERKFHEILFETQELLGEFNVADYFPGMAWINK

INGLDERLEKNFRELDKFYDKIIEDHLNSSSWMKQRDDEDVIDVLLRIQKDPNQEIPLKDDHIKGLLADIFIAGTD

TSSTTIEWAMSELIKNPRVLRKAQEEVREVAKGKQKVQESDLCKLEYLKLVIKETLRLHPPAPLLVPRVTTASCKI

MEYEIPADTRVLINSTAIGTDPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALAN

LLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL

 

>CYP71AH2 tobacco

MNFLVVLASLFLFVFLMRISKAKKLPPGPRKLPIIGNLHQIGKL

PHRSLQKLSNEYGDFIFLQLGSVPTVVVSSADIAREIFRTHDLVFSGRPALYAARKLS

YNCYNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSSLVQIICSSLSSPV

NISTLALSLANNVVCRVAFGKGSAEGGNDYEDRKFNEILYETQELLGEFNVADYFPRM

AWINKINGFDERLENNFRELDKFYDKVIEDHLNSCSWMKQRDDEDVIDVLLRIQKDPS

QEIPLKDDHIKGLLADIFIAGTDTSSTTIEWAMSELIKNPRVLRKAQEEVREVSKGKQ

KVQESDLCKLDYLKLVIKETFRLHPPVPLLVPRVTTASCKIMEYEIPVNTRVFINATA

NGTNPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALA

NLLFHYNWSLPEGMLAKDVDMEEALGITMHKKSPLCLVASHYTC

 

>71A9/CYP71AH3 Glycine max

MISFTVFVFLTLLFTLSLVKQLRKPTAEKRRLLPPGPRKLPFIG

NLHQLGTLPHQSLQYLSNKHGPLMFLQLGSIPTLVVSSAEMAREIFKNHDSVFSGRPS

LYAANRLGYGSTVSFAPYGEYWREMRKIMILELLSPKRVQSFEAVRFEEVKLLLQTIA

LSHGPVNLSELTLSLTNNIVCRIALGKRNRSGADDANKVSEMLKETQAMLGGFFPVDF

FPRLGWLNKFSGLENRLEKIFREMDNFYDQVIKEHIADNSSERSGAEHEDVVDVLLRV

QKDPNQAIAITDDQIKGVLVDIFVAGTDTASATIIWIMSELIRNPKAMKRAQEEVRDL

VTGKEMVEEIDLSKLLYIKSVVKEVLRLHPPAPLLVPREITENCTIKGFEIPAKTRVL

VNAKSIAMDPCCWENPNEFLPERFLVSPIDFKGQHFEMLPFGVGRRGCPGVNFAMPVV

ELALANLLFRFDWELPLGLGIQDLDMEEAIGITIHKKAHLWLKATPFCE

 

#9

>CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, 62% to 71AH2 Nicotiana tabacum DQ350356.1

Note: 71A9 is 58% to 71AH2, so it is probably misnamed and should be CYP71AH3.

17504  MGISSFQASHSMVSQSLLLLLLVIFSALLLFLLSTKQKRKSVASRRLPPGPKKLPLIGNLHQLGSLPH  17301

17300  VGLQRLSNEYGPLMYLKLGSVPTLVVSSADMAREIFREHDLVFSSRPAPYAGKKLSYGCN  17121

17120  DVVFAPYGEYWREVRKIVILELLSEKRVQSFQELREEEVTLMLDVITHSSGPVYLSELT  16944

16943  FFLSNNVICRVAFGKKFDGGGDDGTGRFPDILQETQNLLGGFCIADFFPWMGWFNKLNG  16767

16766  LDARLEKNFLELDKIYDKVIEEHLDPERPEPEHEDLVDVLIRVQKDPKRAVDLSIEKIKGVLLT (0) 16575

16475  DMFIAGTDTSSASLVWTMAELIRNPSVMRKAQEEVRSAVRGKYQVEESDLSQLIYLKLVVKE  16310

16309  SLRLHPPAPLLVPRKTNEDCTIRGYEVPANTQVFVNGKSIATDPNYWENPNEFQPE  16142

16141  RFLDSAIDFRGQNFELLPFGAGRRGCPAVNFAVLLIELALANLLHRFDWELADGMRREDL  15962

15961  DMEEAIGITVHKKNPLYLLATPAN*  15887

 

@7

>CYP71AH5P CAAP02005003.1b, pseudogene 70% to CAAP02005003.1a

26375  NVAFTSFGEY*KEVRNIVILEVLSAKRVHSFQ

25611  HGWMQAIKLMFDVIAHSSGPVNSIELRVFLSNNVIC*VAFGTKFDGGGDNGTRRFPEIL  25435

25434  QETQNLLGGFCIADFFPWMGWFDKLNAWLGCQVDKNFMELNRIYDKGIEMHLDPERPEPE  25255

25254  HEDLVDVLI*VQKDLRQVVSLSNEKIKGVLT (0)

25074  VHCSD*YPFSLAGMDNAEMIRNRSVMRKAQEKVRSTVRGKYQVEESDLSQLIYLKLVVKE  24895

24894  SLRLHLPAPSLVPRKTTKNCTI  24829

24815  FPQIHVFVNGNLISIDSNYWENPNEFQPERFVDSSIDFRGQSFEFLPFGASMRGCPGANF  24636

24635  AVLLIEVALTNILHRLTGNFLMG  24567

 

>CYP71AH6 Gossypium raimondii 58% to CAAP02005003.1a, 53% to 71A9/71AH3

CO072855.1 CO095493.1, CO072856.1

MDFQFILTLSFIAFTLMVFKYKARTRRLPPGPWKLPIIGNLHQLGDSSHKSIQRLSQ

QYGPMMFLQLGAVPTLVISSADAAMAIFKGPGGGYDLAFSGRPTNLYVAKKLSYEYNGIT

FAPYGELWREMRKIAVAELLSSKRVQSFRTIREEEVAAMLNHIDIASSSSAPVNLKKLSL

LLANHVVCRVTFGKKYGGGGDGGTNRFDRVLHEVQHLVGEFVVSDYFPWMWWVNKLNGMETRVEKNFEELDKLY

DEVIADHVAPTRTKANHEDIVDVLLRLQKDARQLITLNNQQIKGVLTDMFIAGTDTTAS

SLVWTFTELIRNPPSMEKVKYEVRKVGNGRDKIEESDIPKLHCLHSVIKETLRLHPPAPL

LVPRETTEDCVVGDYEIPAKTRVIINAKSIGTDPKYWENPHDFQPDRFMKSSVDFKGQHL

EFLPFGVGRRGCPGMSFAIMLLQLMVANFLYRFDWELPEGMSVEDVDMEEELGITVFKKT

PLCLVPIRVV*

 

#10

>CYP71AP5 CAAP02001743.1a, 43% to 71B2, 51% to 71B.c 53% to 71A1, 78% to 71AP4

15554  MALLQWLKEGFLPSFLFAGIILVAVLKFLQKGMLRKRKFNLPPSPRKLPIIGNLHQLGNMPHIS  15363

15362  LHRLAQKFGPIIFLQLGEVPTVVVSSARVAKEVMKTHDLALSSRPQIFSAKHLFYDCTDI  15183

15182  VFSPYSAYWRHLRKICILELLSAKRVQSFSFVREEEVARMVHRIAESYPCPTNLTKILGL  15003

15002  YANDVLCRVAFGRDFSAGGEYDRHGFQTMLEEYQVLLGGFSVGDFFPSMEFIHSLTGMK  14826

14825  SRLQNTFRRFDHFFDEVVKEHLDPERKKEEHKDLVDVLLHVKEEGATEMPLTMDNVKAIIL (0)  14646

14513  DMFAAGTDTTFITLDW  14466

14465  GMTELIMNPKVMERAQAEVRSIVGERRVVTESDLPQLHYMKAVIKEIFRLHPPAPVLVPR  14286

14285  ESMEDVTIDGYNIPAKTRFFVNAWAIGRDPESWRNPESFEPQRFMGSTIDFKGQDF  14118

14117  ELIPFGAGRRSCPAITFGAATVELALAQLLHSFDWELPPGIQAQDLDMTEVFGITMHR  13944

13943  IANLIVLAKPRFP* 13902

 

#7

>CYP71AS3.a CAAP02000057.1 Vitis vinifera 6 genes in a cluster 62% to CYP71AS1

177875  MELYSPSMWLHLLLLLLPLMFLIKRKIELTGQKKPLPPGPTKLPIIG 177735

177734  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  177558

177557  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  177378

177377  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTAADFFP  177198

177197  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  177018

177017  SSALQFTKDNAKAIVM (0) 176970

176205  DLFLAGVDTGAITVSWAMTELARNPRIMKKAQAEVRNSIGNKGKVTEGDVDQ  176050

176049  LHYLKMVVKETLRLHPPAPLLLPRETMSHFEINGYHFYPKTQVHVNVWAIGRDPNLWKNP  175870

175869  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  175690

175689  MKETDISMEEAAGLTVRKKFALNLVPILHHC*  175594

 

@5

>CYP71AS3-de1b CAAP02000057.1 54% to CYP71B.a

178567  RLTRLYGWLERRTSYELDGFY*QVIGLHDLKDVKEDFIDVLLQTERD  178427

 

#6

>CYP71AS4.b CAAP02000057.1 Vitis vinifera 6 genes in a cluster

170360  MALYSPSMWLHLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIG 170220

170219  NLHQLGTLPHYSWWQLSKKYGPIILLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  170043

170042  GLGKFSYNHQDIGFAPYGDYWREVRKICVHEVFSTKRLQSFQFIREEEVALLIDSIAESS  169863

169862  SSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVREAMALLGGFTAADFFP  169683

169682  YVGRIVDRLTGLHGRLERSFLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIQRERSE  169503

169502  SGAVQFTKDSAKAILM  (0) 169455

169009  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQ  168854

168853  LHYLKMVVKETLRLHPPVPLLLPRETMSHFEINGYHIYPKTQVQVNVWAIGRDPNLWKNP  168674

168673  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMVIATVELALANLLYRFNWNLPNG  168494

168493  MREADINMEEAAGLTVRKKFALNLVPILHHC*  168398

 

#8

>CYP71AS4v2 CAN60733.1| 73% to CAN83446.1 62% to 71AS1 55% to 71B34

96% to 71B.d and 71B.e, possible allele of 71B.b, since CAN83446.1 = 71B.e

MELYSPSIWLCLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSWWQLSKKYGPI

MLLQLGVPTVVVSSVEAAREFLKTHDIDCCSRPPLVGLGKFSYNHRDIGFAPYGDYWREVRKICVLEVFS

TKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVH

EAMALLGGFTAADFFPYVGRIVDRLTGHHGRLERSFLEMDGFYERVIEDHLNPGRVKEEHEDIIDVLLKI

ERERSESGAVQFTKDSAKAILMDLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEG

DVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFEINGYHIYPKTQVXVNVWAIGRDPNLWKNPEEFLPE

RFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADINMEEAAGLTV

RKKFALNLVPILHHC

 

#5

>CYP71AS5.c CAAP02000057.1 Vitis vinifera 6 genes in a cluster

157794  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIG 157654

157653  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  157477

157476  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  157297

157296  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTASDFFP  157117

157116  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  156937

156936  SSALQFTKDNAKAILM  (0) 156889

156035  DLFLAGVDTGAITVAWAMTELARNPGIMKKAQAEVRSSIGNKGKVTESDVDQ  155880

155879  LHYLKVVVKETLRLHPPAPLLLPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNP  155700

155699  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  155520

155519  IREADISMEEAAGLTVRKKFALNLVPILHHC*  155424

 

@4

>CYP71AS5-de1b CAAP02000057.1 65% to CYP71B.c

159495  VKEEHENFIDVLLQTERDRT  159436

 

#4

>CYP71AS6v1 .d CAAP02000057.1 Vitis vinifera 6 genes in a cluster

152305  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSL  152126

152125  WQLSKKYGSIMLLQLGVPT 152069

152068  VVVSSAEAAREFLKTHDIDCCSRPPLVGPGKFSYNHRDIGFAPYGDYWREVRKICVLEVF  151889

151888  STKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSE  151709

151708  FGDGRFQEVVHEAVALLGGFTAADFFPYVGRIVDRLTGLHGRLERSFLEMDGFYERVIED  151529

151528  HLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAIIM (0) 151400

150931  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRSSIGKKGKVTKGDVDQLHYLKMVV  150752

150751  KETLRLHPPVPLLVPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERF  150572

150571  MDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADISM  150392

150391  EEAAGLAVRKKFALNLVPILHHC* 150320

 

>CYP71AS6v2 gi|147855782|emb|CAN83446.1a 2 genes 55% to CYP71B

97% to 71B.d missing some seq after LPII

This may be an allele of 71B.d since it is upstream of 71B.e

MALYSPSXWLHLLLLLLPLMYLIKRXIELKGQKKPLPPGPTKLPII

 

VSSAEAAREFLKTHDIDCCSRPPL

VGXGKFSYNHRDIGFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLT

ERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGFTAADFFPYVGRIVDRLTGLHGRLERS

FLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAILMDLFLAGVDTGAIT

LTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFE

INGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIAT

VELALANLLYRFNWNLPNGMREADINMEEAAG

 

#3

>CYP71AS7v1.e CAAP02000057.1 Vitis vinifera 6 genes in a cluster, 58% to CYP71AS1

135067  MAPYSPDLWLPLVLLFLSLLFLLKKILELKEQKGPPGPPKLPIIG 134933

134932  NLHQLGALIHQSLWQLSKKYGPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLI  134753

134752  SIGRLSYNYLDISFAPYGPYWREIRKICVLQLFSTNRVQSFQVIREAEVALLIDSLAQSS  134573

134572  SSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQEVVHEATAMMSSFFAADFFP  134393

134392  YVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDVLLNIEKEQDE  134213

134212  SSAFKLTKDHVKAILM (0) 134165

134087  DLFLAGVDTGAITVVWAMTELARKPGVRKKVQDEVRSHIRERGKVRESDIEQ  133932

133931  FHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNP  133752

133751  EEFFPERFIDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHG  133572

133571  MKEGDINMEEAPGLSVHKKIALSLVPIKYP*  133479

 

>CYP71AS7v2

CAN83446.1b 2 genes

Contains some intron seq. This seq is the ortholog of CYP71AS7v1

KKILELKEQKGPPGPPKLPIIGNLHQLGALIHQSLWQLSKKH

GPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLISIGRLSYNYLDISFAPYGPYWREIRKICVL

QLFSTNRVQSFQVIREAEVALLIDSLAQSSSSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQ

EVVHEATAMMSSFFAADFFPYVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDV

LLNIEKEQDESSAFKLTKDHVKAILMAYFFEQDLFLAGVDTGAITVVWAMTELARKPGVRKK

Missing some seq here

EKFRESDI

EQFHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNPEEFFPERF

IDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHGMKEGDINMEEAPGLSVHK

KIALSLVPIKYP

 

@3

>CYP71AS8P.f CAAP02000057.1 Vitis vinifera 6 genes in a cluster

pseudogene 74% to .c, missing first exon

126787  NLLLAGVNTSASTVVWAMAELARNPIVMKKAQAEVRSVIGN  126660

126659  KGKVTESDLDQLLYFKLVVKETFRLHPPSPLLLPRETMSHFQMNGYHIHPKTRVHVNV*A  126480

126479  IGRDPNVWKNPKEFFPESFIDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMLELTFANL  126300

126299  LYHFNWKLPHGMKEEDINMEEGAGITSPKKFALILRPTQYP*  126174

 

@2

>CYP71AS9P.fg CAAP02000057.1 pseudogene 60% to CYP71B.d

122651  FLAGAKQECPTMV*EMAELARNPRTMKKTQAEVRSCAGKQGKVLGT

122506  DLDQLNYLKMMTMKEMLRLYPSV

        TILPTETMQHFNIN

        VYPKTQFLQLDVLAIGKDP  122327

122326  NIWEN

122309  PEEFSLERF  122283

 

@6

>CYP71AS10P CAAP02000950.1  pseudogene (+) strand, 49% to CYP71AS5.c   CAAP02000057.1

8049 VIKKATVVLASFSREDFFQFGGWIIDKFIGVHA*REKSFHIFDQFYQKVIDDHLDLNRP 8225

8226 KPEHEDIVDVLLGL*KDQTNV 8288

8601 NLFLGII*ATTITIVWALTELAKNPRVMKVAQAEIKSCLGYKLMVEESDLDRFQYLKIVF 8780

8781 K 8783

8780 QTLRLHPPLVMLTPWETVAHCKIGGYDVYPKTRIHINVWVIGKDPRVWDNLEEFNPERF 8956

8957 MNSDIDFRGQHFALVPLGAGRRLCLGMNIATTIMELTLANLLYSFD*RLPSGMKMEEIST 9136

9137 EEGFGSPGHKNEPLYLIP 9190

 

>CYP71AS11P AM481172 missing part of exon 1

CAN66328.1

this part 66% to CYP71AS7v1

11384 MATYSPFLWLPLLLLLPSLFFLIKRTVDQ*RVQREQLPPGLPIIGNLHQLGQLPHQS 11214

11213 LWQLFHKYG 11187

11185 TVIVLHLGFVPTLVVSSAEAARVVLKTRD 11099

(gap)  this part 73% to 71AS7v1

10117 NLLLAGVNTSASTVV*AMAELARNPRVMKKAQAEVRSVMGNKGKVTESDLDQLLYLKLVV 9938

 9937 KEIFRLHPPGPLLLPRETMSHFQMNGYHIHPKTRVHVNV*AIGREPNVWKNPEEFFPLRF 9758

 9757 IDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMVELTFANLLYHFNWKLPHGMKEEDINM 9578

 9577 EEGAGITSPKKFAFILRP 9524

 

>CYP71AS12P second pseudogene on AM481172 56% to CYP71AS6v1

8682 VMLLQLGSVPTVVVSSA*ATKEVKT 8608

7225 FLAGAKQECPTMV*EMAELARNPRIMKKTQAEVRSCAGKQGKVLGT 7088

7080 DLDQLNYLKMMTMKEMLRLYP

7018 FSHTILPTETMQHFNIN 6968

6966 SSSVYPKTQFLQLDVLAIGKDP 6901

6900 NIWENTQKNF 6871

6883 PEEFSLERF 6857

 

>CYP71AT3 CAAP02000328.1a, 92% to CAN64422.1

(CAO61025.1)

46439  MTLLLFVILAFPLFLLFLYRKHRKNGGLLPPGPPGLPFIGNLHQMDNSAPHRYLWQLS  46612

46613  KQYGPLMSLRLGFVPTIVVSSAKIAKEVMKTQDLEFASRPSLIGQQRLSYNGLDLAFSPY  46792

46793  NDYWREMRKICVLHLFTLKRVKSYTSIREYEVSQMIEKISKLASASKLINLSEALMFLTS  46972

46973  TIICRVAFGKRYEGEGCERSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  47152

47153  TFREMDLFYQEIIEEHLKPDRKKQELEDITDVLIGLRKDNDFAIDITWDHIKGVLM (0)  47320

47389  NIFLGGTDTGAATVTWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLPYLKA  47562

47563  VVKETMRLLPSVPLLVPRETLQKCSLDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFMPE  47742

47743  RFLGSSVDFRGQHYKLIPFGAGRRVCPGLHIGVVTVELTLANLLHSFDWEMPAGMNEEDI  47922

47923  DLDTIPGIAMHKKNALCLVAKKYN*  47997

 

>CYP71AT4 CAAP02000328.1b, 96% to CAN64422.1 but 5.8 kb upstream, different gene

76820  MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPVIGNLHQMDNSAPHRYLWQLS  76999

77000  KQYGPLMSLRLGFIPTIVVSSARIAKEVMKTHDLKFASRPSLIGPRRLSYNCLDLAFSPY  77179

77180  NDYWREMRKICVLHLFTLKRVQSYTPIREYEVSQMIEKISKLASASKLINLSETVMFLTI  77359

77360  TIICRVSFGKRYEDEGCETSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  77539

77540  TLRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIELQKDNSFAIDITWDHIKGVLM (0)  77707

77780  NIFVGGTDAGTATVIWAMTALMKNPRVMKKAQEEVRNTFG  77899

77900  KKGFIGEDDVEKLPYLKAVVKETMRLLPAAPLLLPRETLQKCSIDGYEIPPKTLVFVNAW  78079

78080  AIGRDPEAWENPEEFIPERFLGSSVDFRGQNYKLIPFGAGRRVCPAIHIGAVTVELTLAN  78259

78260  LLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNALCLMAKKYN*  78388

 

>CYP71AT5P gi|147832399|emb|CAN64422.1| 48% to CYP83A2/83B1

CAAP02000328.1c 84167-85735 100% match, 96% to CYP71AT4

MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPFIGNLHQMDNSARHRYLWQLSKQYGSLMSLR

LGFIPTIVVSSARIAKEVMKTHDLEFASRPSLIGPQRLSYNCLDLAFSPYNDYWREMRKICVLHLFTLKR

VQSYTPIREYEVSQMIEKISKLASASKLINLSETLMFLTSTIICRVAFGKRYEDEGFERSRFHGLLNDAQ

AMLGSFFFSDHFPLIGWLDKLTGLTARLEKTFRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIGLQKDN

SFAIDITWDHIKGVLM (0)

NIFVGGTDTGAATVIWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLP

YLKAVVKETMRLLPAVPLLIPRETLQKCSIDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFIPERFLGSS

VDFRGQNYKLIPFGAGRRVCPGIHIGAVTVELTLANLLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNAL

CLMAKKYN*

 

>CYP71AT6P CAAP02000328.1d, pseudogene 100% to CAN64424.1 in overlaps, 69% to CAN64422.1

94192  ILLALPLILLEIRETMEECFFRPPGPPGLPFIGNLLHLDKSAPHRYLWQLSEKYGAL  94362

94363  MFLRLGFVPTLVVSSARMAEEVMKTHDLEFSSRPSLLGQQKLS*NGLDLAFAPYTNYWRE  94542

94543  MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRI  94722

94723  AFSKRYEDEGWERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELD  94902

94903  LFYQEIIDHLNPERTKYEQEDIADILIG

       RINDSSFAIDITQDHIKAVVM

95017  NIFVGGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIG  95259

95260  GKKGFRDEDDIEKLPYLKALTKETMKLHPPIPLIPRATPENCSVNGCEVPPKTLVFVNA  95436

95437  WAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFRAGRRGCPGIYLRTVIIQLALG  95616

95617  NLLYSFDWEMPNGMTKEDIDTDVKHGVTM  95703

 

>CYP71AT6P CAN64424.1 44% to 83A2/B1, pseudogene

MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRIAFSKRYEDEG

WERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELDLFYQEIIDHLNPERTKYEQE

DIADILI

 

GGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIGGKKGFRDEDDIEKLPYLKALTKETMKLH

PPIPLIPRATPENCSVNGCEVPPKTLVFVNAWAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFR

AGRRGCPGIYLRTVIIQLALGNLLYSFDWEMPNGMTKEDIDTD

GHFTGQLGQLAGNILGGFRQLRFSGVSITMWKLKRWKLRVHETQKNI

 

>CYP71AT7 CAAP02000328.1e, 84% to 104360

99326   MMILLLILLALPLFLLFLLRNRRRTPLPPGPPGLPLIGNLLQLDKSAPHIYLWRLS  99493

99494   KQYGPLMILRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGLRKLSYNGLDVAFSPY  99673

99674   NDYWREMRKICVLHLFNSKRAQSFRPIREDEVLEMIKKISQFASASKLTNLSEILISLTS  99853

99854   TIICRVAFSKRYDDEGYERSRFQKLVGEGQAVVGGFYFSDYFPLMGWVDKLTGMIALADK  100033

100034  NFKEFDLFYQEIIDEHLDPNRPEPEKEDITDVLLKLQKNRLFTIDLTFDHIKAVLM (0)  100201

100333  NIFLAGTDTSAATLVWAMTMLMKNPRTMTKAQEELRNLIGKKGFVDEDDLQKLPYLKAIV  100512

100513  KETMRLHPASPLLVPRETLEKCVIDGYEIPPKTLVYVNAWAIGRDPESWENPEEFMPERF  100692

100693  LGTSIDFKGQDYQLIPFGGGRRICPGLNLGAAMVELTLANLLYSFDWEMPAGMNKEDIDI  100872

100873  DVKPGITMHKKNALCLLARIPMH*  100944

 

>CYP71AT8 CAAP02000328.1f, 71% to CAN64422.1

104360  MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWRLS  104527

104528  KQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFTPY  104707

104708  NDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPLTS  104887

104888  TIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISRLEK  105067

105068  VSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 105235

105353  DIFIAGTDTSAATLVWAMTELMKNP  105427

105428  IVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKALVKETMRLHPAAPLLVPRETREKCVID  105607

105608  GYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQFIPFGGGRRACP  105787

105788  GSLLGVVMVELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITVHKKNALCLLARSHT*  105961

 

>CYP71AT8 AM489206.2a 58% to 71AT1 tomato

1212 MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWR 1373

1374 LSKQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFT 1553

1554 PYNDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPL 1733

1734 TSTIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISR 1910

1911 LEKVSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 2087

2206 DIFIAGTDTSAATLVWAMTELMKNPIVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKA 2378

2379 LVKETMRLHPAAPLLVPRETREKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 2558

2559 RFLGSSIDFKGQDYQFIPFGGGRRACPGSLLGVVMVELTLANLLYSFDWEMPAGMNKEDI 2738

2739 DTDVKPGITVHKKNALCLLARSH 2807

 

>CYP71AT9 CAAP02000328.1g, 73% to CAN64422.1

(CAO61031.1) on contig CU459218.1 chr18 scaffold_1

111012  MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQ  111191

111192  YGSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSND  111371

111372  YWREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTI  111551

111552  ICRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF  111731

111732  EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV (0) 111893

111993  DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQE  112091

112092  ELRNLIGKKGFVDEDDLQKLSYLKALVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKT  112271

112272  LVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVM  112451

112452  VELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITMHKKNALCLLARSHI*  112601

 

>CYP71AT9 AM489206.2b 57% to 71AT1 tomato, 88% to AM489206.2a

same as partial seq CAN71113.1

7716 MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQY 7898

7899 GSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSNDY 8078

8079 WREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTII 8258

8259 CRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF 8435

8436 EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV (0) 8597

8697 DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQEELRNLIGKKGFVDEDDLQXLSYLKA 8870

8871 LVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 9050

9051 RFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVMVELTLANLLYSFDWEMPAGMNKEDI 9230

9231 DTDVKPGITMHKKNALCLLARSHI* 9305

 

>CYP71AT10Pv1 CAAP02000328.1h, pseudogene, 70% to CAN64422.1, 96% to CAN71114.1

94% to CYP71AT5P

116430  LHLPPGPGLPFIGNLYQMDNSTPHVYLWQLSKQYGPILSLGLGLVPTLVDSLAKMAKEL  116606

116607  LKAHDLEFSSRSSSLGQQSVT  116669

        YNGLDLD

117280  FAPYDGYWREMRKICVLHPFSSKRVQSFRSIREDEVSRIIEKISKSASAAKLTDLSETVM  117459

117460  LLTSNIICRTAFGKRYEDKGYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTDLIA  117639

117641  RPEKNFKELDLFYQEVIDEHLDPKRPKQEQEDIAVVLLRLQRERLFSVDLTWDHIKAVLM  117820

117972  DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA  118145

118146  LVKETLRVHPPAPLLLTKETLENCTIDAYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE  118325

118326  RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI  118505

118506  DMDMKPGLTLDKRNALCLQARQYNLAS*  118589

 

>CYP71AT10Pv2 AM489206.2c pseudogene 70% to AM489206.2a 56% to 71AT1

13061 PPGPGLPFIGNLYQMDNSAPHVYLWQLSKQYGPILSLGLGLV 13186

14708 GVTXTLVVSSARMAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRKICVLHPF 14902

14903 SSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDK 15079

15080 GYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTXL

      LLRPEKNFKELDLFY*EIIDEHLDPKRPKQEQEDIXVV 15311

15312 LLRLQRERLFLVDLTWDHIKAVPM (0)

15535 DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA 15708

15709 LVKETLRVHPPAPLLLXKETLENCTIDGYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE 15888

15889 RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI 16068

16069 DMDMKPGLTLDKRNALCLXARQY 16137

 

>CYP71AT10Pv2 CAN71114.1 50% to 83A2/B1

same as AM489206.2c only 9 aa diffs with CYP71AT10Pv1 (97% identical)

MAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRK

ICVLHPFSSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDKGYDR

SRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLT

 

DVFVAGTDPGAATLVWAMAEVT

KNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKALVKETLRVHPPAPLLLXKETLENCTIDGYDIPPK

TLVFVNAWAIGRDPEAWENPEEILPERFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLL

YSFD
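
Figures such as "9 aa diffs" and "97% identical" in the header above follow from a position-by-position comparison of two aligned sequences. A minimal sketch, assuming the two strings have already been aligned to equal length; count_differences and percent_identity are hypothetical helpers, and the file does not say which alignment program produced its percentages or how X residues and gaps were treated.

```python
def count_differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a, b):
    """Identical positions as a percentage of the aligned length.

    Every character is compared literally, so 'X' (unknown residue)
    only matches another 'X'; that is a simplification.
    """
    if not a:
        raise ValueError("sequences must not be empty")
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)
```

For example, 9 differences over a 300-residue overlap give 97% identity; the overlap length matters, which is why partial sequences can report a high percentage against a full-length gene.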

 

>CYP71AT11P CAAP02000504.1  pseudogene 76% to CAAP02000328.1e 50% to 71B37

104371 MLLLLVFLMVLPLFLLWKHRVNGGKLLPPGPPGLPLIGSLHQL 104499

116804 SL*SLTDTYGISLNNMDPLMFLHLGFEPILVVSSPRTAEVMKTHDPEFSSRPSLLVIT 116977

125485 ALQKLSYNGLDLAFASYGAYWREIRKICV 125571

125582 DIVDILLKLHKDRLFTVDLSWNHIKAVLM (0) 125668

126490 AGTDTVAATMVWTMTALMKNPRVMKKAQKEVRTLVGEKCFVDEDDIQKLTYMKALVKESMR 126672

126673 LYPAAPLLIPRETLQKCNIDGY*IPTKTLVFVNAWAIGRDPESWENPEEFMPERFLGTCI 126852

126853 DFKGQDYKLIPFGAGRRIWPGMNLGAVTVELALANLLYSFDWEMPAGMKMEDIDTDAKPG 127032

127033 LTMTKKNDLYLVARNYI* 127086

 

>CYP71AU3 CAAP02005726.1 85% to CAAP02001743.1b, 54% to 71A26

11548 MGSFLDLLYKENASFFLLFLPFF

11479 VFIYFLIKWLYPTTPAVTTKRLPPSPPKLPIIGNLHQLGLLPHRSLWALAQRHGPIMLLH 11300

11299 FGKVPVVIVSAADAAREIMKTNDVIFLNRPKSSIFAKLLYDYKDVSMAPYGEYWRQMRSI 11120

11119 CVLHLLSNRRVQSFRGVREEETALLMEKISSSSSSSTPIDLSKMFLSLTNDLICRVALGR 10940

10939 KYSGDETGRKYRELLKEFVGLLGGFDVADYIPWLSWVNFINGLDAKVEKVAKEFDRFLDE 10760

10759 VVKEHVERRKRGVDEEVKDFVDVLLGIQEDNVTGVAITGVCIKALTL (0) 10619

10189 DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 10010

10009 KETLRLHPPVPLLLPRESTRGAKIMGYDIEVGTQVITNAWAIGRDPLLWDEAEEFRPERF 9830

 9829 LNSSIDFTGKDFELIPFGAGRRGCPGTLFAAMAIEVALANLVHQFDWEVGGGGRREDLDM 9650

 9649 TECTGLTIHRKVPLLAVATPWPR* 9578

 

>CYP71AU4 gi|147767047|emb|CAN67678.1| 46% to 71T4

CAAP02004888.1 13222-15147 1 aa diff

MLLLDPLSFSLFPFFFFIVLLVRWLFSTPPTTHKTLPPSPPRLPVLGNMHQLGIYPYRSLLCLARCYGPL

MLLQLGRVRTLVVSSPDAAQEIMKTHDLIFANRPKMSLGKRLLYDYKDVSVAPYGEYWRQMRSICVLHLL

SNKRVQSFNTVRREEISLLIQKIEEFSSLSTSMDLSGMFMRLTNDVICRVAFGRKYSGDERGKKFRRLLG

EFVELLGGFNVGDYIPWLAWVEYVNGWSAKVERVAKEFDEFLDGVVEEHLDGGTGSIAKGDNEKDFVDVL

LEIQRDGTLGFSMDRDSIKALILDIFAGGTDTTYTVLEWAMTELLRHPKAMKELQNEVRGITRGKEHITE

DDLEKMHYLKAVIKETLRLHPPIPLLVPRESSQDVNIMGYHIPAGTMVIINAWAMGRDPMSWDEPEEFRP

ERFLNTNIDFKGHDFELIPFGAGRRGCPGISFAMATNELVLANLVNKFDWALPDGARAEDLDMTECTGLT

IHRKFPLLAVSTPCF*

 

>CYP71AU5 CAAP02003357.1  92% to CAAP02005726.1 53% to 71A26

38067 MGSFLGLLYKENDS

38025 FFLLLLPFFIFTHFLIKWLYPTTPAVTTKKLLPSPPKLPIIGNLHQLGSLPHRSLWALAQ 37846

37845 RHGPLMLLHFGRVPVVIVSAVDAAREIMKTNDAIFSNRPKSNISAKLLYDYKDVSTAPYG 37666

37665 EYWRQMRSICVLHLLSTRRVQSFRGVREEETALLMEKISSSSSSSIPIDLSQMFLSLTND 37486

37485 LICRVALGRKYSGDENGRKYRELLKEFGALLGCFNVGDYIPWL 37357

37357 SWVNFINGLDAKVEKVAKEFDRFLDEVVKEHVERRKRGVDEEVKDFVDVLLGIQEDN 37187

37186 VTGVAITGVCIKALTL 37139

36747 DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 36568

36567 KETLRLYPPIPLLVPRESTRDAKIMGYDIAARTQVITNVWAIGRDPLLWDEAEEFRPER 36391

36390 FLNSSIDFRGQDFELIPFGSGRRGCPGTLFAAMAIEVVLANLVHRFDWEVGGGGRREDLD 36211

36210 MTECTGLTIHRKVPLLAVATPWPR* 36136

 

@9

>CYP71AU6P CAAP02001743.1b, pseudogene, 58% to CAN67678.1

35127  IFIYFLIKWLYPTTSTVTTKRLPHFPLKLPIIGNLFQLGSLSHRSL*VLAQRHGSLMLLH  34948

34947  FGRVPVVIVSIANTAREIMKTNDVIFSNRSKSNISAKLLYDYKDVSTTPYKEYWRQMRSI  34768

34767  CVLHFLSTRRVLSFRGVQEEETTLMMEKISSSASSTPIDLSQMFQSLTNDLICRVSL*RK  34588

34587  YSGDETGRKYRELLKKFVGLLGGFNVGDYIPWLSWVNFINGLETKVEKVSKVFDRFLD  34414

34413  EVVK*HVERRKRCGVDEEMKDFVDVLL  34333 XXXXXXXXXXXXXXXXXXXXXX

33813  DMFAARSDSTYTVLEWAMTKLLRHPQVMRQL*NEARGIAQGKLLITEDDLGKMQYLMAVIK  33631

33631  ETLRLHPLIPLLILRESTRGAKIMGYDIEAGTRVITNAWPIGGDPLLWDEAEEFWPERF  33455

33454  LNSSIDFTGKDFELISFGAGQRGCPGTLFAKMAIELVLANLVHHFDWEVAGGGRREDLDM  33275

33274  TECIGLTIHIKVLLLAVATP  33215

 

#11

>CYP71BC1 gi|147861230|emb|CAN80448.1| = AM435124.2

CAAP02002092.1 15806-13388 (-) strand 1 aa diff.

MTMKISENMLLLFSQSSANQWLLALGILSFPILYLFLLQRWKKKGIEGAARLPPSPPKLPIIGNLHQLGK

LPHRSLSKLSQEFGPVLLLQLGRIPTLLISSADMAKEVLKTHDIDCCSRAPSQGPKRLSYNFLDMCFSPY

SDYWRAMRKVFVLELLSAKRAHSLWHAWEVEVSHLISSLSEASPNPVDLHEKIFSLMDGILNMFAFGKNY

GGKQFKNEKFQDVLVEAMKMLDSFSAEDFFPSVGWIIDALTGLRARHNKCFRNLDNYFQMVVDEHLDPTR

PKPEHEDLVDVLLGLSKDENFAFHLTNDHIKAILL (0)

NTFIGGTDTGAVTMVWAMSELMANPRVMKKVQAEV

RSCVGSKPKVDRDDLAKLKYLKMVVKETFRMHPAAPLLIPHRTRQHCQINANGCTYDIFPQTTILVNAFA

IGRDPNSWKNPDEFYPERFEDSDIDFKGQHFELLPFGAGRRICPAIAMAVSTVEFTLANLLYCFDWEMPM

GMKTQDMDMEEMGGITTHRKTPLCLVPIKYGCVE*

 

@10

>CYP71BC1-de2b CAAP02002092.1 C-term pseudogene, 67% to 71BC1

16741  KAQHTDMEEVGGITISR  16691

16647  PLCFVPIKYGWV  16612

 

@11

>CYP71BC3-de1b CAAP02002092.1 N-term pseudogene 80% to 71BC2

19486  VVLYSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSQKYGSI  19307

19306  MFLQLGSV  19283

 

#12

>CYP71BC3 gi|147781883|emb|CAN72169.1| 62% to CYP71BC1

CAAP02002092.1 28083-26374 (-) strand 100% match

MAMEIAEAVMEVFSPSSVTDWLFTLSVVLLSVLCFFLVQKWGNRAVLERATTPPSPPKLPIIGNLHQLSKLH

HRSLWTLAQKHGSIMFLQLGSIPTIVISSADMAEQVLRTRDNCCCSRPSSPGSKLLSYNFLDLAFAPYSD

HWKEMRKLFNANLLSPKRAESLWHAREVEVGRLISSISQDSPVPVDVTQKVFHLADGILGAFAFGKSYEG

KQFRNQKFYDVLVEAMRVLEAFSAEDFFPTGGWIIDAMSGLRAKRKNCFQNLDGYFQMVIDDHLDPTRPK

PEQEDLVDVFIRLLEDPKGPFQFTNDHIKAMLM (0)

NTFLGGTDTTAITLDWTMSELMANPRVMNKLQAEVRS

CIGSKPRVERDDLNNLKYLKMVIKEALRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYPGTRILVNAWG

IGRDPKIWKDPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVYCFDWELPC

GMKIEDLGLEEELGGITAGRKKPLCLVARRCGCSCTEPM*

 

@12

>CYP71BC3-de1c CAAP02002092.1 N-term pseudogene, 78% to CYP71BC2

29938  KLATIGNLHQLSKWSYRSLWTLSQKYGSIMFLQLGSV  29828

 

@13

>CYP71BC3-de1d CAAP02002092.1 N-term pseudogene 79% to CYP71BC2

31559  VVLFSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSHKYGSI  31380

31379  MFLQLGSV  31356

 

@14

>CYP71BC3-de2b CAAP02002092.1 C-term pseudogene 95% to 71BC2

39030  KMVIKEAMRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYRETRILVNAWGIGRDPKSWK  38851

38850  DPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVCCFDWELP  38671

38670  CGMKIEDLGLEEELGGITASRKTPLCLVARRCGC  38569

 

>CYP71BE1 CAAP02002803.1, 46% to CAAP02001743.1a, 42% to 71B37,

6 aa diffs to CYP71BE1 AM445470.2

28146  MEFPSSFLFPFLLFLFILFKVSKKSKPQISIPKRPPGPWKLPLIGNLHQLVGSLPHHSLRDL  28331

28332  AKKYGPLMHLQLGQVSMLVVSSPEIAKEVMKTHDINFAQRPHLLATRIVSYDSTDVAFSP  28511

28512  YGDYWRQLRKICVVELLSAKRVKSFQVIRKEEVSKLIRIINSSSRFPINLRDRISAFTYS  28691

28692  AISRAALGKECKDHDPLTAAFGESTKLASGFCLADLYPSVKWIPLVSGVRHKLEK  28856

28857  VQQRIDGILQIVVDEHRERMKTTTGKLEEEKDLVDVLLKLQQDGDLELPLTDDNIKAVIL (0) 29036

       DIFGGGGDTVSTAVEWTMAEMMKNPEVMKKAQAE  29216

29217  VRRVFDGKGNVDEAGIDELKFLKAVISETLRLHPPFPLLLPRECREKCKINGYEVPVKTR  29396

29397  VVINAWAIGRYPDCWSEAERFYPERFLDSSIDYKGADFGFIPFGSGRRICPGILFGIPVI  29576

29577  ELPLAQLLFHFDWKLPNGMRPEDLDMTEVHGLAVRKKHNLHLIPIPYSPLTVG*  29738

 

>CYP71BE4P CAAP02000100.1e pseudogene

97018 LIGNMHQLISYLPHHALRDLAKKHGPLMDLQLGEVSTIIVSSPETAKGVIKTQII 96854

96853 ISQRPHV*KFWI*ELFTAKPVQFFQSIREEEVSGLVRSIS 96734

96733 LNIRSPINLAKE 96698

96499 SGTMVHRVMSEMLKNPQIMKKAQAEVRQTFETKGEVDDIGIHELKILKLVVKETPRLHPP 96320

96319 APLLLPRECGERFEISGCDDIPLNPMSLLLHGQLEEMEALNST*QLQPREIFLKSLVDYK 96140

96139 GTNFDFIPFG 96110

 

>CYP71BE5 CAAP02000100.1d 61% to CAAP02002803.1

81714 MELQFSFFPILCT

81675 FLLFIYLLKRLGKPSRTNHPAPKLPPGPWKLPIIGNMHQLVGSLPHRSLRSLAKKHGPLM 81496

81495 HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGGYWRQI 81316

81315 RKISVLELLSAKRVQSFRSVREEEVLNLVRSVSLQEGVLINLTKSIFSLTFSIISRTA 81142

81141 FGKKCKDQEAFSVTLDKFADSAGGFTIADVFPSIKLLHVVSGMRRKLEKVHKKLD 80977

80976 RILGNIINEHKARSAAKETCEAEVDDDLVDVLLKVQKQGDLEFPLTMDNIKAVLL 80812

80544 DLFVAGTETSSTAVEWAMAEMLKNPRVMAKAQAEVRDIFSRKGNADET 80401

80400 VVRELKFLKLVIKETLRLHPPVPLLIPRESRERCAINGYEIPVKTRVIINAWAIARDPKY 80221

80220 WTDAESFNPERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVELALAQLLYHFDWK 80041

80040 LPNGARHEELDMTEGFRTSTKRKQDLYLIPITYRPLPVE* 79921

 

>CYP71BE6 CAAP02000100.1c 61% to CAAP02002803.1

60542 MELQFSFFPILCT

60513 FLLFIYLLKRLGKPSRTTHPAPNLPPGPWKLPIIGNMHQLVGSLPHHSLRNLAKKHGPLM 60334

60333 HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGDYWRQI 60154

60153 RKISILELLSAKRVQSFRSVREEEVLNLVRSISSQEGVSINLTESIFSLTFSIISRAA 59980

59979 FGKKCKDQEAFSVTLEKFAGSGGGFTIADVFPSIKLLHVVSGIRHKLEKIHKKLD 59815

59814 TILENIINEHKARSEASEISEAEVDEDLVDVLLKVQKQGDLEFPLTTDNIKAILL 59650

59202 DLFIAGSETSSTAVEWAMAEMLKNPGVMAKAQAEVRDIFSRKGNADETMIHELKFLK 59032

59031 LVIKETLRLHPPVPLLIPRESRESCEINGYEIPVKTRVIINAWAVARDPEHWNDAESFNP 58852

58851 ERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVEIALAQLLYYFDWKLPNGTQHEE 58672

58671 LDMTEDFRTSLRRKLNLHLIPITYRPLPVE* 58579

 

>CYP71BE6-de1b CAAP02000100.1c-de1b pseudogene N-term

63932 ISILCTFLLFIYLLKRLGKPYRTNGPARKLPAGPWKLPIIGNMHQLFGSLPHHSLRNLAK 63753

63752 QHGTLMHLQPGEASTIVVS*REMEK 63678

 

>CYP71BE7 CAAP02000100.1b 61% to CAAP02002803.1

47703 MELHFPSFH

47676 ILSAFILFLVVVLRTQKRSKTGSLTPNLPPGPWKLPLVGNIHQLVGSLPHHALRDLAKKY 47497

47496 GPLMHLQLGEVSTIVVSSSEIAKEVMKSHDIIFAQRPHILATRIMSYNSTNIAFAPYGDY 47317

47316 WRHLRKICMSELLSANRVQSFQSIRNEEESNLVRSISLNTGSPINLTEKTFASICAIT 47143

47142 TRAAFGKKCKYQETFISVLLETIKLAGGFNVGDIFPSFKSLHLISGMRPKLEKLH 46978

46977 QEADKILENIIHEHKARGGTTKIDKDGPDEDLVDVLLKFHEDHGDHAFSLTTDNIKA 46807

46806 VLL (0) 46798

46624 DIFGAGSEPSSTTIDFAMSEMMRNPRIMRKAQEEVRRIFDRKEEIDEMGIQELKFLKLVI 46445

46444 KETLRLHPPLPLLLPRECREKCEIDGHEIPVKSKIIVNAWAIGRDPKHWTEPESFNPERF 46265

46264 LDSSIDYKGTNFEYIPFGAGRRICPGILFGLASVELLLAKLLYHFDWKLPNGMKQQDLDM 46085

46084 TEVFGLAVRRKEDLYLIPTAYYPLSHE* 46001

 

>CYP71BE8P CAAP02000100.1a frameshift and stop possible pseudogene 

44% to CYP71B33, 64% to CAAP02002803.1

      MEIHLPSSYAFFAFLLSMFIVFKIGKVQIQNL

31068 PAKLPPGPWKLPLIGNMHQLVGSLPHHTLKRLASKYGPFMHLELGEVSALVVSSPEIARE 30889

30888 VMKTHDTIFAQRPPLLSSTIINYNATSISFSPYGDYWRQLRKICTIELLSAKRVKSFQSI 30709

30708 RE*EVSKLIWSISLNAGSPINLSEKIFSLTYGITSRSAFGKKFRGQDAFVSAIL 30547

30546 EAVELSAGFCVADMYPSLKWLHYISGMKPKLEKVHQKIDRILNNIIDDHRKRKTTTKAG 30370

30369 QPETQEDLVDVLLNLQEHGDLGIPLTDGNVKAVLL (0) 30265

29794 DIFSGGGETSSTAVVWAMAEMLKSPIVMEKAQAEVRRVFDGKR 29666

29665 DINETGIHELKYLNSVVKETLRLHPSVPLLLPRECRERCVINGYEIPENTKVIINAWAIA 29486

29485 QDPDHWFEPNKFFPERFLDSSIDFKGTDFKYIPFGAGRRMCPGILFAIPNVELPLANLLY 29306

29305 HFDWKLPDGMKHEDLDMTEEFGLTIRRKEDLNLIPIPYDPFLVL* 29171

 

#13

>CYP71BE9Pv1 CAAP02000216.1a  pseudogene CYP71BE like, 78% to CAAP02001833.1a

7514 MDFLFSSILFAFLLFLYMLYKMGERSKASISTKKLPPGPWKLPLL 7648

7648 GNMHQLVGSLPHQSLSRLSKQYGPLMSLQLCEVYALTISSPEMAKQV 7788

7789 MKTHDINFAHRPPLLASNVLSYDSTDILYPPYGDYWRQLRNICVVELLTSKRVKSFQLVR 7968

7969 EAELSNLITAVVSCSRLPFNRNENLSSYTFSIISRAAFGEKFEDQDAFISVTKEMAELYS 8148

8149 GFCVADMYPSVKWLDLISGMRYKLDKVFQR 8238

8241 DRILQNIVDEHRDKL*PQAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKGVIL 8405

13594 NIFSGGGKTTFTSVD*

13642 AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDG 13749

13749 IKIF*AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA 13928

13929 ERFYPERFLDSSIDYKGTDFGYIPFGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLP 14102

14103 KGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 14216

 

>CYP71BE9Pv2 Pinot noir (a highly heterozygous grape genome)

CAN66039.1

top part is a retrotransposon seq like AAP46207  putative retrotransposon protein Oryza sativa

97% (6 aa diffs) to CAAP02000216.1a probable ortholog to the pseudogene

from AM472203.2 exon 2 only

MNEEMKALQIDLPIGKIPVGCRWVFTIKYKVDGTVEWLRKSLYGLKQSPRAWFGRFTSFMKSIGYKQSNS

YHTLFLKHNKEQIIALIVCVDDMIVIGNDYEEMKTLQEHLAHDFEMKDLDKLKYFLGIEVSRSKKAYALS

VVCQFMHSPSKEHMNVVIHILRYLKSSPGKGILFTKGDNLDINGYTDADWAGSIQDRCSTSWYFTFKVVA

RSNAEAEYKGMAKAICELLWIRNLVKDLHIKQVSPMKLYCDNKAACDIAHNPVQHDRTKYVEVGRHFIKE

KLESKLIEVPHVRSQDQLADVLTKAMSNQ

2182 NIFSGGGKTTSTSVD*AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDGLKFFK 2352

2352 AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA 2516

ERFYPERFLDSSIDYKCTDFGYVP

FGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLPKGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 2804

 

>CYP71BE10v1 CAAP02000216.1b 79% to CAAP02001833.1a

51405 MEFSSSSLLFAFLLFLYMLYKIGKRSKANISTQKLPPGPWKLPLIGNVHQLVGSLPHRS 51581

51582 LTLLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGV 51761

51762 AFAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISS 51941

51942 LTFSIIARAAFGKKSEDQDAFLSVMKELVETASGFCVADMYPSVKWLDLISGMRYKIDKV 52121

52122 FRMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNDDLQFPLTDNNIKAVIL (0) 52298

52520 DIFGGAGESTSTSVEWAMSEMLKAPIVIEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 52699

52700 NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF 52879

52880 LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 53059

53060 TEAFGLAVRRKQDLHLIPIPYNPSHAD* 53143

 

>CYP71BE10v2 Pinot noir (a highly heterozygous grape genome)

CAN81963.1 (partial translation of intact gene)

Overall 98% to 71BE10 probable ortholog

from AM487125.2 first exon 97% (7 aa diffs) to 71BE10,

second exon 1 aa diff to 71BE10

12930 MEFFSSSLLFAFLLFLYMLYKIAKRSKDNISTQKLPPGPWKLPLIGNVHQLVGSLPHRSL 12751

12750 TXLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGVA 12571

12570 FAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISSL 12391

12390 TFSIIARAAFGKKSEDQDAFLSVMKELVEXASGFCVADMYPSVKWLDLISGMRYKIDKVF 12211

12210 RMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNGDLQFALTDNNIKAVIL (0) 12037

11816 DIFGGAGESTSTSVEWAMSEMLKAPIVMEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 11637

11636 NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF 11457

11456 LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 11277

11276 TEAFGLAVRRKQDLHLIPIPYNPSHAD* 11193

 

>CYP71BE11-de1b CAAP02000216.1c pseudogene N-term

66338 WKLPLIVNMHGLV 66376

 

>CYP71BE11 CAAP02000216.1c pseudogene 85% to CAAP02001833.1a

66818 LLFLYMLYKIGKRSKGNISAQKLPLEPWKLPLIGNMHQLIDGSLPHRSLSRLTKQYESLM 66997

66998 SLQLGEVSTLIISSPEMAKQVMKTHDINFAQR 67093

72159 STLLATNILSYHSIDIDFPPYGDYGRHLQKICVVELLTS*RFKSFQLVGEDELSNL 72326

72327 IT 72332

72334 TLTSCSRLPINLTDKLSSCTFAIIAGAAFGEKCKDQDAFILVLKETLELLFGLCVTNM 72507

72508 YPSVKWLDLISGMRYKIEKVFQRTDRILQNIVDEHRDKMQTEAGKLQGEENIVDVLLKIQ 72687

72688 QHGDHEFPLTDNNIKSXXX 72735

74253 DIFAGGGETTSISVKWAISEMLKNX 74324

74320 RMMEKAQAEVRRVFDGQGNADEELKFLKGVVKETLRLHPPLPLLIPRECREMCEINRYEI 74499

74500 PKKTLIIINAWAIGRDSNYWVEAERFYPDRFLDSSIDYKGTDFGYIPFGAGRRMYHGILF 74679

74680 SLPIIELSLAHLLYHFDWKLPNGMKA*DLDMTEALGLVVRRKQDLHLIPILDNPLHAQ* 74856

 

>CYP71BE12 CAAP02000216.1d one frameshift 83% to CAAP02001833.1a

86311 MDFQFSSILFAFLLFLYMLYKMGERSKASISTQKLPPGPWKLPLIGNMHQLVGSLPHQS 86487

86488 LSRLAKQYGPLMSLQLGEVSTLIISSPDMAKQVMKTHDINFAQRPPLLASKILSYDSMDI 86667

86668 VFSPYGDYWRQLRKICVVELLTAKRVKSFQLVREEELSNLITAIVSCSRPINLTENIFS 86844

86845 STFSIIARAAIGEKFEGQDAFLSVMKEIVELFSGFCVADMYPSVKWLDLISGMRYKLDKV 87024

87025 FQRTDRMLQNIVDQHREKLKTQAGKLQGEGDLVDVLLELQQHGDLEFPLTDNNIKAVIL (0) 87201

87442 DIFSGGGETTSTSLDWAMSEMLENPRVMEKAQAEVRRVFDGKGNVDE 87582

87583 TGLDELKFLKAVVKETLRLHPPLPLLVPRECREMCEINGYEIPKKTSIIVNAWAIGRDSD 87762

87763 YWVEAERFYPERFLDSSIDYKGTDFGYIPFGAGRRMCPGILFSMPSIELSLAHLX 87924

87927 HFDWKLPNEMKAEDLDMTEAFGLAVRRKQDLLLIPIPHNQSHAQ* 88061

 

>CYP71BE13-de2b CAAP02001833.1a pseudogene 94% to CAN81963.1

10995 GHVDENAIDELKFLKAVVKETLRLHPPFPILLPRECREMRKINGYRIPEKTRIIVNAWA 11171

11172 IG*DSDYWVEAERFYPERFLDSSIDYKGADFGYIPFGAGRRICPGILFAMPNIELPLAYL 11351

11352 LYHFDWKLPNGMKAEDLDMTEAFGLAVRRKQDLHLIPIPYKP 11477

 

>CYP71BE13 CAAP02001833.1b 69% to 71BE1

15131 MDVLFSSILFASLLFLYMLYKIGKRWRGNISSQKLPPGPWKLPLIGNMHQLIDGSLPHHSLSRLA 15325

15326 KQYGPLMSLQLGEISTLIISSPEMAKQILKTHDINFAQRASFLATNTVSYHSTDIVFSPY 15505

15506 GDYWRQLRKICVVELLTSKRVKSFQLIREEELSNLITTLASCSRLPINLTDKLSSCTFAI 15685

15686 IARAAFGEKCKEQDAFISVLKETLELVSGPCVADMYPSVKWLDLISGMRHKIEKVFKRTD 15865

15866 RILQNIVDEHREKMKTEAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKAVIL (0) 16027

16320 DIFAGGGETTSISVEWAMSEMLKNPRVMDKAQAEVRRVFDGKGNADEELKFLKVVV 16487

16488 KETLRLHPPFPLLIPRECREMCEINGYEIPKKTLIIVNAWAIGRDSDHWVEAERFYPERF 16667

16668 LDSSIDYKGTDFGYIPFGAGRRMCPGILFSLPIIELSLAHLLYNFDWKLPNGMKADDLDM 16847

16848 TEALGIAVRRKQDLHLIPIPYNPSHVQ* 16931

 

>CYP71BE14P CAAP02008751.1 CYP71BE pseudogene 64% to 71BE1

6182 IQLTVSTLVVSSPEIAKEFMKTDDVSFAQRPNILVTSIVSYGSTNIGFAPYSDYWRQVR 6006

6005 KLCATELLSAKRVKSFQLIREEEVSNVIKRIASHSGSTINLSEEISSVTLPL 5850

5850 IARAAFGKICKDQDSFIGAVTEMAELATGFCAADVFPSVK*VDQVTGIRSKLEKLHERVD 5671

5670 RILQNIVKEHKESMTTKRGKLEAEDLVDTFLKIQEDGDLKFPLTENNVKAVIL (0) 5512

     DMFSG

5255 AGETSSTVGEWAMTELIRHPRVMEKAQ 5175

5178 TRVRREFAGKGTVEESGIHELKFIKAVVKETLRLHPPAPLLLPRECRERC 5029

5028 EINGYEIPVKTRVIDNA*AIGRDPDSWTEPERFNPERFLDSWLDYKGTDFEFIPFGAGRR 4849

4848 MCPDMSFAIPSVELSLANFIYHFDWKLPTGIKPEDLDMTEIISLSVRRKQNLHLIPIPYN 4669

4668 PFPAE* 4651

 

>CYP71BE15P CAAP02007291.1 pseudogene 78% to 71BE1

6388 MEFSSSSVLFPFLLFLFMLFRIGKRSKPNISTPKLPPGPWKLPLIGNLHQLVGSLPHHSL 6567

6568 KDLAEKYGPLMHLQLGQVS 6624

6627 ASPQIAKEVMKTHDLNFAQRPHLLVTRIVTYDSTDIAFAPYGDYWRQLRKICVIELLSAK 6806

6807 RVRSFQLIRKEEVSNLIRFIDSCSRFPIDLREKISSFTFAVISKAALGKEFKEQDSLESV 6986

6987 LEEGTKLASGFCLADVYPSVKWIHLISGMRHKLEKLHGRIDG 7112

7111 EHRERMEKRTGELEAEEDFIDVLLKLQQDGDLELPLTDDNIKAVIL 7248

7688 GHATASTAVEWAMSEMMKNPRVMEQAQAEVRRVFDGKGDVDETGIDELKFLKAVVSETLR 7867

7868 LHPPFPLLLPRECREKCKINGYEVPVKTRMTINAWAIGRDPDYWTEAERFYPERFLDSSV 8047

8048 DYKGADFGFIPFDAGRRMCPGILFAIPSIELPLAHLLFHFDWELPNGMRHEDLDMTEVHG 8227

8228 LSAKRKHSLHLIPIPYNS*PVG* 8296

 

>CYP71BE16P CAAP02000648.1 pseudogene 66% to CYP71BE13

1591 QVHQRLDRILQNIIDEHKESKTTTETGKQEANEDLVDILLKLQKHGNFGFPLIDNNIKAIIL (0) 1776

2908 NIFGGGGETSSIAIEWAM*KMM 2973

2975 KNPRVMEKA*AKVRQIFGGKKTLR*WMKQV*DTL

3077 KTKVIINAWAIGRDPYYQTKAKRFHPE*FLDSPIDYKGNNFEYIPFGAGKRICPGILFAIPNIELPL 3277

3278 ANMLDHFDWELLYGMKKDDIDMTESFGLKVRRKQDLCLILIPHNPLHVE* 3427

 

>CYP71BG1 Solanum tuberosum

DR034423.1, BQ514535.1, BM114062.1, BQ119583.2, BQ506191.2, CK717210.1

67% to CYP71BG2P

MEASILQLLLLLSLTSCTILFYKIRRWRRPPSPPSLPIIGHLHLLTDMPHHTFFHLSQKLG

PIIHLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYGPYWRQARKICVTE

LLSSKRVNSFQFIRNEEINRMIQLISSHFDSELSSELDLSQVFFALANDILCRVAFGKRF

IDDRLKDKDLVSVLTETQALLAGFCLGDFFPDWEWVNWLSGMKKRLMNNLKDLGEVCDEI

IDEHLMKKRDDDQNGDGSEDFVDVLLRVQKRDDLQVPITDDNLKALIL (0)

DMFVAGTDTSAATLEWTMTELARHPSVMKKAQDEVREIAAN

KGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCTLDDYEIPAKTRVLINTY

AIGRDPEYWNNPLDYNPERFMEKDIDFRGQDFRFLPFGGGRRGCPGYALGLATIELSLAR

LLYHFDWKLPTGVEAQDVNLSEIFGLATRKRVALKLVPTINKLYLLSD*

 

>CYP71BG2 tomato breaker fruit Solanum lycopersicum

BM411522.1, BM412569.1, BP881630.1, ES895470.1, DB685010

DU947425.1 (GSS)

MEASILQLLLLLSLTSCTILFYKIRGRWRRRPPSPPSLPIIGHLHLLNQMPHHTFFNLSQ

KLGKIIYLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYG

PYWRQARKICVTELLSSKRVHSFEFIRDEEINRMIELISSRSQSEVDLSQVFFGLA

NDILCRVAFGKRFIDDKLKDKDLVSVLTETQALLAGFCFGDFFPDFEWVNWLSGMKKRLM

NNLKDLREVCDEIIKEHLMKNRDDDGSEDFVDVLLKVQKRDDLQVPITDDNLKALI

LDMFVAGTDTSAATLEWTMTELAR

HPSVMKKAQNEVRKIVANRGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCS

IDGYEVPAKTRVLINTYAIGRDPEYWNNPLDYNPERFMEKDIDLRG

QDFRFLPFGGGRRGCPGYALGLATIELSLARLL

YRFDWKLPSGVEAQDMDLSEIFGLATRKKVALKLVPTITKLYPTF*

 

>CYP71BG3 CA993587.1 Gossypium hirsutum CO128388.1 Gossypium raimondii

DRKWLNSRSQSLTPPSPPS

LPIIGHLHLLTDMPHHTFTILAQKLGPIIYLQLGQVPTVIVSSPRLARLILKTHDHVFSN

RPQLVSAQYLSFNCSDVTFSPYGPYWRQARKICVTELLSSKRVNSFQLIRDEEVSRLL

TTLSAHPGSEVNVSELFLSLANDILCRVAFGRRFTERVGSSNHLAAVLRETQELFAGM

SVGDFFPEWEWVHSVSGYKRRLMKNLNELRRVCDEVIQEHLQRGETGIKEDFVDVLLR

VQKQDNLEVPITDDNLKALVLDMFVAG TDTSAATLEWTMTELVKHPEIMK

QAQEEVRAVARRTGKAIDETHLQHLHFTKSIIKEAMRLHPTVPLLVPRESMDECIIDGYK

IPPKTRLLINTYAIGRDPNSWDNPLQFNPNRFQDSNIDLKDQDFRFLPFGGGRRGCPGYG

FGLATVEIALARLLFHFDWELPYGIHTDDVDVDEIFGLASRKRTPLILVPTVNEGL*

 

>CYP71BG4 DY280303.1, DY276238.1, Citrus clementina,

Citrus reticulata x Citrus temple EX448715.1 (C-term)

Hybrid: two different species of citrus are combined here

to try to achieve a full seq. Still missing the C-term

63% to 71BG1, 64% to 71BG3, 63% to 71BG2

MMDSFTPQVLLPLFVVSIITLLYWKLLS

RSRSQPATAANTPPSPPNKYPIIGHLHLLTDMPHHTFAALADKLGPIFHLQLGQVPTVVI

SSSELAKLVLKTHDHVFASRPQLIADQYISFGCSDVTFASYGPYWRQVRKICVTELLSSK

RVGSFQAVRDEEVKRLLTSVKSQCGSVTDMSKLFFTLANDILCLAAFGMRYVNEEGKKSN

NLASVFTESQELLSGFCIGDFFPEW

GWLSSLSGFTRRLRKNTQDLTVAIDEIISEHLFRKQATDDSGSSLMDGDGD

FIDVLLRVQQRDDLEVPITDDNLKALVLDMFMA

GTDTTAATMEWTMTELARHPRVMKKAQEEVRRVASGGGEVNESHIQQLRYMKAVIKETMR

LHPTVPLLVPRESMEKCVLEGYEIPAKTRILINSYAIGRDPKSWENPLEYIPERFDENNI

DFKDQDFRLLPFGGGRRGCPGYSFGLATVETALARLLYHFDWALPPGV

 

@8

>CYP71BG5P CAAP02000323.1  pseudogene 48% to CAAP02001743.1a, 52% to 71P1 rice

like potato and cotton ESTs, 67% to 71BG1, 65% to 71BG2, 65% to 71BG3

 7816 LPPSPPPLPIIGHLHLLTDMPHHSLSDLALKLGPIIHPRLGQVATVVVSSARLAALVLKT 7995

 7996 HDHVFASRPPLTAAQYLSFGCSDVTFSPHGTYWRQARKICVTELLSPKRVTYFQFIRNEE 8175

 8176 THPPPHHLPSSLSALSGSETDMSQLFFTLANNLCRVAFGKRFMDDSEGEKKHMVDVLTE 8352

 8353 TQALFAGFCIGDFFPDWKWLNSITGLNRRLRKNLEELIAVCNEIIEEHVNEKKERED 8523

 8524 FVGVLLRVQKRKDLEVAITDDNLKALVL (0) 8607

 9110 DMFVAGTDT 9136 XXXXXXXXXX

 9142 ELARHPHVMKKAQQEVRNIASGEGKVEETHLHQLHYKKAVIK*TMRLHPPVPLLVPRQSM 9321

 9322 ENCILDGYEIPAKIQLLINTYAIGCVPQSWE 9414

10537 NPLDYNPKRFVDGDVDFKGQDSGFLPFGGGRRGCPSYSFGLATVEIALARLLYHFDWELP 10716

10717 HGVEADDMDLNEIFGLATRKNSGLILVPRY 10806

 

CYP72 family (22 genes) [21 pseudogenes]

 

CYP72A subfamily (18 genes) [20 pseudogenes]

 

>CYP72A85 CAAP02000598.1a 65% to CAAP02002795.1

68% to CAAP02000473.1

GSVIVT00009392001 on Genoscope browser

chrUn_random from 57641820 to 57646375 (4556bp) on strand +

no P450 neighbors

35540 MEIAYDSVLIFCAFALLSLAWRAFYLVWLRPRRLERCLRRQGLMGNSYRPLHGDAKKVSIMLKEA 35734

35735 NSRPINLSDDIVPRVIPFLYKTIQQY (1) 35812

37936 GKNSFTWVGPIPRVNIMKPELIREVFLEAGRFQKQKPNPLANFLLTGL 38079

38080 VSYEGEKWAKHRKLLNPAFHVEKLK (0) 38154

38642 LMSPAFHLSCRQMISKMEEMVSPEGSCELDVWPFLKNLTADALSRTAFGSSYEEGRRLFQL 38824

38825 LQEQTYLTMEVFQSVYIPGW (2) 38884

39056 YLPTKRNKRMKKIDKEMNTLLNDIITKRDKAMKDGKTANEDLLGILMESNSKEIQEGGN 39232

39233 SKNAGISMQEVIEECKLFYLAGQETTSNLLLWTMVLLSKHPNWQTLAREEVFQVFGKNKP 39412

39413 EFAGLSRLKV (0) 39442

39670 VTMIFYEVLRLYPPGATLNRAVYEDINLGELYLPSGVEIVLPTILVHHDPEIWGDDVKEF 39849

39850 KPERFSEGVMKATKGQVSYFPFGWGPRICIGQNFAMAEAKMALAMILQCFTFELSPSYTH 40029

40030 APTSVLTLQPQYGAHLILHKI* 40095

 

$$$$

 

>CYP72A86P CAAP02000598.1b pseudogene exons 4 and 5, 84% to CAAP02002686.1

117092 FFPTKTNKRMKQISKEVH 117039

117038 ALLGGIINKREKAMEAGETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFY 116859

116858 LAGQETTSVLLLWTMVLLSQHPDWQARAREEVLQVFGNNKPENDGLNHLKI (0) 116706

116303 VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 116124

116123 NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSPSYAH 115944

115943 APYSLITIQPQYGAHLILRGL* 115878

 

>CYP72A86P CAAP02000983.1 pseudogene 84% to CAAP02002686.1

3 aa diffs to CAAP02000598.1b

chrUn_random from 3711617 to 3699111 on strand -

GSVIVT00000151001 first two lines

GSVIVT00000150001 C-term part

16119 TGDVISRTAFGSSYEEGRRIFQLQKEQTYLAIKVAMSVYIPGWR 15988

 4832 FFPTKTNKRMKQISKEVHALLGGIINKREK 4743

      AMEAGETANSDLLGILMESNFREIQEHQN 4655

 4654 NTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHTDWQARAREEVLQVFGNNKP 4475

 4474 ENDGLNHLKI 4445

 4035 VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 3856

 3855 NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKTALAMILQRFSFELSPSYAH 3676

 3675 APFSLITIQPQYGAHLILRGL 3613

 

$$$$

 

>CYP72A86P-ie5b CAAP02000598.1b-ie5b pseudogene internal exon 5 fragment

chrUn_random from 3699666 to 3699734 on strand -

116504 IMMIFHEVLKLLYPLYT*HHAMH 116436

 

$$$$

 

>CYP72A87 CAAP02000355.1a   one stop codon, possible pseudogene  93% to CAN67740.1

2 aa diffs to CAN71061.1 exon4 and 5

GSVIVP00005878001 Genoscope browser version stops at PERFS*

chrUn_random from 35354787 to 35357214 (2422bp) on strand +

35357215 to 35357439 continues to the true end

52926 MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 53105

53106 FMMIKEASSRPISISDDIVQRITPFHYHSIKKY (1) 53204

53464 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKSRVHAFVKLLVSGLPFLDGEKWA 53635

53636 KHRKIINPAFRLEKLK (0) 53683

53868 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTSDAISRTAFGSNYEEGRMIFE 54047

54048 LQREQAQLLVQFSDSAYIPGWW (2) 54113

54473 FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 54649

54650 DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNLQARAREEVLHVFGNNKP 54829

54830 EGDGLNHLKI (0) 54859

55153 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 55311

55312 GEDAREFNPERFS*GVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 55491

55492 LSPSYSHAPCSLVTLKPQHGAHLILHGI* 55578

 

>CYP72A87 gi|147816916|emb|CAN71061.1| 60% to 72A15

2 aa diffs to CAAP02000355.1a

MXXEALNRGVM

FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN

DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNWQARAREEVLHVFGNNKPEGDGLNHLKI

VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREFNPERFSQGVL

KATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFTLSPSYSHAPCSLVTLKPQHGAHLILHG

I

 

$$$$

 

>CYP72A87-de1b CAAP02000355.1b pseudogene 97% to CAN67740.1

3 aa diffs to CAN67740.1

chrUn_random from 35362457 to 35362735 on strand +

60596 MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEM 60775

60776 SMMIKEATSRPISISDDIVQRVAPFHYHSIKKY (1) 60874

 

>CYP72A88 CAAP02000355.1c 94% to CAN67740.1, 97% to CAAP02000355.1d

GSVIVP00005881001 in Genoscope browser

chrUn_random from 35380636 to 35383685 (3050bp) on strand +

78889 MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDLKEM 79068

79069 FMMIKEASSRPISISDDIVQRIAPFQYHSIKKY (1) 79167

79982 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 80152

80153 KHRKIINPAFRLEKLK (0) 80200

80450 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 80629

80630 LQREQAQLLVQFSESAFIPGWR (2) 80695

80839 FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 81015

81016 DKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHPNWQARAREEVLHVFGNNKP 81195

81196 EGDGLNHLKI (0) 81225

81519 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREF 81698

81699 NPERFSQGVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFSLSPSYSH 81878

81879 APCSLVTLKPQYGAHLILHGI 81941

 

>CYP72A89 CAAP02000355.1d 95% to CAN67740.1

GSVIVT00005885001 in Genoscope browser

chrUn_random from 35415527 to 35417550 (2024bp) on strand +

112703 MEMKQLNLVALSFTFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 112882

112883 FMMIKEATSRPISISDDIVQRIAPFHYHSIKKY (1) 112981

113632 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 113802

113804 KHRKIINPAFRLEKVK 113851

114101 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 114280

114281 LQREQAQLLVQFSESAFIPGWR 114346

114705 FLPTKSNKRMKQNRKEVNELLWGIIDKREKAMKAGETLNDDLLGILLESNFKEIQEHGN 114881

114882 DKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMILLSKHPNWQARAREEVLHVFGNNKP 115061

115062 EGDGLNHLKI 115091

115384 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 115542

115543 GEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 115722

115723 LSPSYSHAPCSLVTLKPQYGAHLILHGI 115806

 

>CYP72A90 gi|147795635|emb|CAN67740.1| 55% to 72A15

95% to CAAP02000355.1d

no exact match in Genoscope

MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEMSMMIKEATSR

PISFSDDILQRVAPFHYHSIKKYGKSSFIWMGLKPRVNIMEPELIRDVLSMHTVFRKPRVHALGKQPASG

LFFLEGEKWAKHRKIINPAFRLEKLKNMLPAFHLSCSDMISKWEXKLSTXGSCEXDVWPYLQNLTGDAIS

RTAFGSNYEEGRMIFELQREQAQLLVQFSQSACIPGWRFLPTKSNKRMKQNRKEVNELLWGIIDKREKAM

KAGETLNDDLLGILLESNFKEIQEHGNDKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMVLLSKHPNW

QARAREEVLHVFGNNKPEGDGLNHLKIVMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPT

ILVHHDHEIWGEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS

LSPSYSHAPCSLVTLKPQYGAHLILHGI

 

>CYP72A91P gi|147791559|emb|CAN72865.1| AM476150.2

52% to 72A15 95% to CAAP02000355.1c CYP72A88

has no exact match in Genoscope

MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEMFMMIKEATSR

PISISDDIVQRIAPFHYHSIKKYGKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSG

LLFLDGEKWAKHRKIINPAFRLEKVK

NMLPAFHLSCSDMISKWD

(deletion of 7 aa)

SCELDVWPYLQNLTGDAISRTAFGSNYEKGRMIFE

LQREQAQLLVQFSESAFIPGWRFXPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLE

SNFKEIQEHENDKNVGMSIKDVIEECKLFYFAGXETTSALLLWTMVLLSKHPNWQARAREEILHVFGNNK

PEGDGLNHLKIVMMILHEVLRLYPPVPFLARSVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDARE

FNPERFSQGVLKAMKSPVSFFPFGWGSQSCIGQNFAILEAKMVLAMILQRFSFSLSPSYSHAPSSLVTLI

PQYGAHLXLHGI

 

>CYP72A92 CAAP02000149.1a 90% to CAAP02002795.1

GSVIVP00005888001 in Genoscope browser

chrUn_random from 35487285 to 35497057 (9773bp) on strand -

33193 MKLSSVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMLRM 33014

33013 ISEANSRSISLSDDIVQRVLPFHCHSIKKY (1) 32921

31022 GKNYFIWMGPKPVVNIMDPELIRDVFLKYNAFRKPPPHPLGKLLATGLVTLEGEQ 30858

30857 WTKRRKIINPAFHLEKLK (0) 30804

30164 HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLAGDAISRTAFGSSYEEG 30000

29999 RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 29919

29675 FVPTKTNKRMRQISNEVHALLKGIIERREKAMKVGETANDDLLSLLMESNFREMQEHDE 29499

29498 RKNVGMSIKDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 29319

29318 DGDGLNHLKI (0) 29289

24060 VTIIFHEVLRLYPPVSMLIRTVVADSQVGGWYFPDGALITLPILLIHHDHEIWGEDAKEF 23881

23880 NPERFSEGVSKATKGQFAFYPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 23701

23700 APSNIITIQPQYGAYLILHGL* 23635

 

$$$$

 

>CYP72A93 CAAP02000149.1b 86% to CAAP02002795.1

GSVIVP00005893001 in Genoscope browser

chrUn_random from 35547921 to 35552540 (4620bp) on strand -

88676 MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 88497

88496 ISEANSRPISLSDEIVQRVLPFHYHSLKKY (1) 88407

86832 GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 86668

86667 WTKHRKIINPAFHLEKLK (0) 86614

85979 HMVPAFQLSCGDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 85815

85814 RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 85734

85493 FVPTKTNKRIRQIRNELHALLKGIIEKREKAMLVGETANDDLLSLLMESNFREMQEHDE 85317

85316 RKNVGMSIDDVIEECKLFYFAGQETTSDLLLWTMILLSKHSNWQARAREEILQVFGNKKP 85137

85136 DGNGLNHLKI (0) 85107

84572 VTMIFHEVLRLYPPVSMLIRTVFVDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 84393

84392 NPERFSEGVSKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 84213

84212 HAPFNVITVQPQYGAHLILHGL* 84144

 

>CYP72A93 gi|147833897|emb|CAN66491.1| AM486124.1

62% to 72A15

4 aa diffs to CAAP02000149.1b

11981 MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 11802

11801 ISEANSRPISLSDEIVQRVLPFHYHSLKKY () 11712

10111 GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 9947

9946 WTKHRKIINPAFHLEKLK 9893

9253 HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 9089

9088 RRIFQLQKEQAHLAVKVFRSVYIPGWR ()

8767 FVPTKTNKRIRQIRNELHALLKGIIEKREKAMXVGETANDXLLSLLMESNFREMQEHDE 8591

8590 RKNVGMSVXDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEILQVFGNKKP 8411

8410 DGNGLNHLKI (0) 8381

7846 VTMIFHEVLRLYPPVSMLIRTVFPDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 7667

7666 NPERFSEGVTKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 7487

7486 HAPFNVITVQPQYGAHLILHGL* 7418

 

$$$$

 

>CYP72A94P CAAP02000149.1c pseudogene exon 4 only, 79% to CAAP02001786.1

chrUn_random from 35615448 to 35615834 on strand -

152090 FFPTKTNKRMKQISKEVHALLRGIINKREKAMEAGETANSGLLGILMESNFKEIHEHQN 151914

151913 NMKIGMSAKDVIDECKLFYLAGQETISVLLLWTMVLPSQHSDWQARAREEV*QVFGNNK 151737

151736 RQNDGLNHLKI (0)

 

$$$$

 

>CYP72A95 CAAP02000149.1d frameshift in exon 5, possible pseudogene

same as CAN72247.1, 94% to CAAP02002686.1 another pseudogene

GSVIVT00005897001 in Genoscope browser

chrUn_random from 35636714 to 35642885 (6172bp) on strand -

not correctly assembled, only contains C-term from MMEAK to end

184279 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 184100

184099 MLKEAYSRPISLSDDIAPRVLPFHCHFIKKY (1) 184007

183058 GKNFFAWFGPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 182888

182887 KRRKNINPAFHLEKLK (0) 182840

182021 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 181857

181856 RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 181779

180119 FLPTKTNRRMKQISKEVYALLRGIVNKREKAMKAGETANSDLLGILMESNFREIQEHQN 179943

179942 NKKIGMSVRDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 179763

179762 EADGLNHLKI (0) 179733

179322 VTMIFHEVLRLYPPIAMLARAVYKDTQVGDMCFPAGVQVRP 179200

179203 PTILVHHDHEIWGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKI 179024

179023 ALAMILQHFSFELSPSYAHAPFNILTMQPQYGAHLILRGLQC* 178895

 

>CYP72A95 gi|147815271|emb|CAN72247.1| 50% to 72A10, = CAAP02000149.1d, cyan part too long

MKYQKVQIXWSSRAGSTLRHLPRCEGCELSLEALKKSLKLE

MKHSSVAISFGFLTVLISCLWRLLNWVWL

RPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPISLSDDIAPRVLPFHCHFIKKYGKNFFAWF

GPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWAKRRKNINPAFHLEKLKNMLPA

FHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGRRIFQLQKEQTHLAIQVTMSV

YIPGWR

 

$$$$

 

>CYP72A96 CAAP02000473.1 97% to CAAP02000149.1d, bad boundary at RKNFF

GSVIVP00000189001 on Genoscope browser (missing N-term)

chrUn_random from 4489657 to 4495024 on strand -

52293 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMS 52117

52116 MMLKEAYSRPISLSDEIAPRVLPFHCHFIKKY (1) 52021

51446 RKNFFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 51276

51275 KRRKIINPAFHLEKLK (0)

50085 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 49921

49920 RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 49843

48155 FLPTKTNRRMKQISKEVYALLRGIINKREKAMKAGETANSDLLGILMESNFREIQEHQN 47979

47978 NKKIRMSVKDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 47799

47798 EAAGLNHLKI (0) 47769

47357 VTMIFHEVLRLYPPVAMLARAVYKDTQVGDMCFPAGVQVVLPTILVHHDHEIWGDDAKEF 47178

47177 NPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAH 46998

46997 APFSILTMQPQYGAHLILRGLQC* 46926

 

>CYP72A97P CAAP02002686.1 pseudogene, 76% to CAAP02004668.1 CYP72A

94% to CAAP02000149.1d

GSVIVP00000152001 in Genoscope browser not assembled correctly

Only exon6 and 7 correct in this model (VFGN-HRAV)

chrUn_random from 3749666 to 3760267 on strand -

13105 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 12926

12925 MLKEAYSRPISLSDDTTPRVLPFHFHFIKKY 12833

11910 GKNSFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 11740

11739 KRRKIINPAFHLEKLK 11692

10658 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYE 10503

10502 EGRRIF*LQKEQTHFASQ 10449

5472 VTMSVYIPGWR 5440

3730 FYPQRRNRRMKQISKEVYALLRGIVSNREKAMKAGETASSDLLGILMESNFREIQEHQNN 3551

3550 KKIGMSVKDVIEECKLFSLDGQETTSVLLVWTMVLLSEHPNWQACAREEVLQ 3395

3395 VFGNKKPEADGLNHLKI 3345

2933 VTMIFHEVLRLYPLVAMLHRAV 2868

2866 YKDTQVGDMCFPVGVQVVLPTILVHHDHEIWGDDAKEFNPKRFAEAVLKATKNQVSFFPF 2687

2686 GWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFSILTMQPQYGAHLILRGLQC* 2501

 

>CYP72A97P-ie5b CAAP02002686.1a-ie5b see CAAP02000598.1b-ie5b

3152 IMMIFHEVLKL 3120

 

>CYP72A98 gi|147777099|emb|CAN63404.1| AM456876.2

49% to 72A14

86% to CAAP02002686.1, 88% to CAAP02000149.1d

no exact match in Genoscope

MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPI

SLSDDIAPRVLPFHCHFIKKY ()

GKNSFAWFGPNPMVNIMEPGLIRDVLLKSNVFQKPPPHPLGKLLVSGLV

TLEGERWAKRRKIINPAFHLEKLK ()

NMLPAFQLSCSDMVTKWKKLSVGGSCELDVWPXXXXXXXX

VISRTAFGSSYEEGRRIFQLQKELTHLASQ

VTMSVYIPGXR ()

FLSTKMNRRMKXISKEVYALLRGIINKREKAMKAGKXANSEXLLGILMESNFREI

QEHQNNKKIGMSAKDXIEECKLFYLAGQETTSVLLLWTMFLLSEHPNWQACAREEVLQVFGKK

KPEADGLNHLKI

VTMIFHEVLRLY

PLVAMLNRAVYKDTQVGDMYFPARVQVALPTILVHHDHEIWGDNAKGFDPERFAEGILKATKTSSA

(Deletion)

CIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFNILTMQPQYGVHLILRGLQC

 

$$$$

 

>CYP72A99 CAAP02004338.1a runs off the end 84% to CAN72247.1

100% to CAO16049.1 end is 98% to CAAP02000983.1

1697 MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSR 1518

1517 MMKEAYSRPISLSDDIVQRVLPFHCHFIKKY (1) 1425

172  GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWA 2

 

>CYP72A99 gi|157327641|emb|CAO16049.1| unnamed protein product [Vitis vinifera]

4 aa diffs to CAAP02000983.1 from TDGE to end

GSVIVP00009398001 in Genoscope

chrUn_random from 57722038 to 57738734 on strand -

MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSRMMKEAYSRPI

SLSDDIVQRVLPFHCHFIKKYGKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLV

ALEGEQWAKRRKIINPAFHPEKLKNMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENL

TGDVISRTA

FGSSYEEGIRIFQLQKEQTYLAIKVAMSVYIPGWRFFPTKTNKRMKQISKEVHALLGGIINKREKAMEAG

ETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHPDWQAR

AREEVLQVFGNNKPENDGLNHLKIVTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILV

HHDHEIWGDDAKEFNPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSP

SYAHAPYSLITIQPQYGAHLILRGL

 

$$$$

 

>CYP72A100P CAAP02004338.1b 90% to CAAP02004439.1

100% to CAO16050.1 = CU459449.1

32686 SLPQAT MIFHKVLRLYPLVAMLPRVVYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDD 32507

32506 AKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 32327

32326 SYTHASFSILTMQPQYGAHLILRGLQC* 32243

 

>CYP72A100P gi|157327642|emb|CAO16050.1| unnamed protein product [Vitis vinifera]

beginning = CAAP02001786.1

92% to CAAP02000473.1

GSVIVP00009400001 in Genoscope browser not correctly assembled

57770331 to 57770582 on strand -, NY* to CKLF

57769280 to 57770326 on strand -, YLAG to end

NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFREIQEHQNNKKIGMS

VKDVIEECKLF

YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLKIAT

MIFHKVLRLYPLVAMLPRV

VYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIG

QNFAMMEAKIALAMIL*HFSFELSPSYTHASFSILTMQPQYGAHLILRGLQC*

 

>CYP72A101PX = CYP72A100P

CAAP02001786.1  pseudogene 89% to CAAP02004439.1 CYP72A

57770174 to 57770582 on strand -

note CYP72A100P may be identical to 72A101P (merge)

426 NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFRE 247

246 IQEHQNNKKIGMSVKDVIEECKLFY 172

170 YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLK 18

 

$$$$

 

>CYP72A102P CAAP02004439.1 pseudogene 83% to CAAP02000101.1

91% to CAAP02000473.1

GSVIVP00000178001 in Genoscope browser not correctly assembled

4296186 to 4297437 on strand -

2539 NY*HAHRFLPTKTNRKMKQISKEVYALLRGIVNKREKAMKVGETTNSDLLGMLMESNFRE 2360

2359 IQEHQNNKKIRISVKDVIEECKLFYLAGQKTTSVLLVWTMVLLSEHPN*QARAREEVLQV 2180

2179 FGNKKWEADGLNHLKI (0) 2135

1719 VTMIFHEVLRLYPPIAMLPRVVYKDTQVGDMCFPTGLQVVLPTILVHHDHEIWGDD 1552

1551 AKEFNPKRFVEGVLKVTKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 1372

1371 SYTHASFNILTM*PQYGAHLILHGLQC* 1288

 

$$$$

 

>CYP72A103 CAAP02002795.1 87% to CAAP02004668.1

90% to CAAP02000149.1a

25135 MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQ

25018 QGLIGNSYRLLHGDFREMSRMIDEANSRPISLSDDIVQRVLPFHYHSIKKY (1) 24866

24228 GKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIALEGEQ 24064

24063 WTKRRKIINPAFHLEKLK (0) 24010

23631 HMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTAFGSSYEE 23470

23469 GRRIFQLQKEQAHLAVQVSQSIYIPGWR (2) 23386

23080 FVPTKTNKRMRQISNEVNALLKGIIERREKAMKVGETANDDLLGLLMESNYKEMQEHGE 22904

22903 RKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 22724

22723 DGDGLNHLKI (0) 22694

22147 VTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLLHHDHEIWGDDAKDF 21968

21967 NPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 21788

21787 APISVITIQPQYGAHLILHGL* 21722

 

>CYP72A103 gi|157356442|emb|CAO62605.1| unnamed protein product [Vitis vinifera]

identical to CAAP02002795.1

GSVIVP00000202001 in Genoscope browser

4708885 to 4712295 on strand –

MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQQGLIGNSYRLLHGDFREMSRMIDEANSRPIS

LSDDIVQRVLPFHYHSIKKYGKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIA

LEGEQWTKRRKIINPAFHLEKLKHMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTA

FGSSYEEGRRIFQLQKEQAHLAVQVSQSIYIPGWRFVPTKTNKRMRQISNEVNALLKGIIERREKAMKVG

ETANDDLLGLLMESNYKEMQEHGERKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQAR

AREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLL

HHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSP

SYAHAPISVITIQPQYGAHLILHGL
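A minimal sketch of how the exon-by-exon translation of CAAP02002795.1 can be joined and compared with the CAO62605.1 protein, which the note above calls identical. The line layout (optional start coordinate, residues, optional parenthesized digit, optional end coordinate) is inferred from the entries in this file and is an assumption, as are the names `SEGMENT` and `join_segments`. The parenthesized digit is assumed to be an intron-phase marker and is ignored here.

```python
import re

# One exon-segment line as it appears in this file (assumed layout):
# optional start coordinate, residues, optional "(n)" marker, optional end.
SEGMENT = re.compile(r"^\s*(\d+)?\s*([A-Za-z*]+)\s*(?:\((\d)\))?\s*(\d+)?\s*$")


def join_segments(lines):
    """Concatenate the residue strings of consecutive segment lines."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # segment lines are separated by blank lines
        match = SEGMENT.match(line)
        if match is None:
            raise ValueError(f"not a segment line: {line!r}")
        parts.append(match.group(2))
    # Drop a trailing stop codon so the result compares with a GenPept entry.
    return "".join(parts).rstrip("*")


# First two segment lines of the CYP72A103 entry from CAAP02002795.1.
exon1 = [
    "25135 MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQ",
    "",
    "25018 QGLIGNSYRLLHGDFREMSRMIDEANSRPISLSDDIVQRVLPFHYHSIKKY (1) 24866",
]
# Start of the CAO62605.1 protein, copied from the entry above.
cao62605_start = (
    "MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQQGLIGNSYRLLHGDFREMSRMIDEANSRPIS"
    "LSDDIVQRVLPFHYHSIKKY"
)
print(join_segments(exon1) == cao62605_start)
```

Passing all segment lines of an entry to `join_segments` gives a string that can be compared directly with the corresponding GenPept sequence.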

 

$$$$

 

>CYP72A104P gi|147798934|emb|CAN63796.1| AM469525.2 56% to 72A7

pseudogene

5 aa diffs plus some errors to CAAP02002795.1 (same with CAO62605.1)

no exact match in Genoscope, may be the same seq as CYP72A103

RFVPTXTNKRMRQISNEVNALLKGIIERREKxxEVGExxTSTANXXLLGLLMESNYKEMQEHDERKNVGMS

NKDVIXECKLFYFAGQETTSVLLLWTMVLLSKHSXWQARAREEVLQVFGNKKPDGDGLXHLKI (0)

14303 VTMIFHEVLRLYPPASMIXX 14250

14251 SVYXDTEVGG

MYLPDGVXVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAK

MALAMIVQRFSFELSPSYAHAPFSVITIQPQYGAHLILHGL

 

$$$$

 

>CYP72A105 CAAP02002402.1a 91% to CAAP02004668.1

no exact match in Genoscope

 9868 MKLSSVAVSFAFITLLIFAWRLLNWVWLRPKKLERCLRKQGLTGNSYRLLHGDFREMSRM 10047

10048 NNEANSGPISFSDDIVKRVLPFFNHSIQKY (1) 10137

11178 GKNSFTWLGPKPVVNIMEPELIRDVLLKHNVFQKPPPHPLGKLLATGVVALEGEQW 11345

11346 TKRRKIINPAFHLEKLK (0) 11396

11706 HMVSAFQLSCSDMVNKWEKKLSMDDSCELDIWPYLQILTGDVISRTAFGSSYEEGRRIFQ 11885

11886 LQKEQAHLVAQVTQSVYVPGWR (2) 11951

12540 FFPTKINRRMRQIRNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYREMQENDE 12716

12717 RKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 12896

12897 DGDGLNHLKI (0) 12926

13502 VTMIFHEVLRLYPPASMLIRTVFADSQVGGLYLSDGVLIALPILLIHHNHEIWGEDAKEF 13681

13682 NPGRFSEGVSKAAKTQVSF 13738

13737 FFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFDLSPSYAHAPSS 13874

13875 LLMQPQHGAHLILHGL* 13925

 

>CYP72A105 gi|147810740|emb|CAN67452.1| 64% to 72A15

3 aa diffs to CAAP02002402.1a

MVLLSKHSNWQARAREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRTVFADSQVGGLYL

PDGVLIXLPILLIHHNHEIWGEDAKEFNPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFDLSPSYAHAPXSLLTMQPQHGAHLILHGL

 

$$$$

 

>CYP72A106P CAAP02002402.1b pseudogene

GSVIVP00011018001 on Genoscope Browser not assembled correctly

69155641 to 69158451 on strand +

48350 ISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRMISE 48529

48530 ANSRPISLSDEIVQRVLPFHYHSLKKYGIAGFL 48628

49855 SRFVPTKTNKRMRQISNEVNALLKGSIERREKAMKVGEMREHDERKNVG 50001

50002 MSNKDVIKECKLFYFAGQETTSVLLLWTMVPLSKHSNWQGRAREEVLQVFGNKKPDGDG 50178

50179 LNHLK 50193

50799 VYADTEVGGMYLPDGVQVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFP 50978

50979 FGYGPRVCIGQNFAMMEAK 51035

50717 MIKLFSILQVTMIFHEVLRLYPPASMICLC

      MALAMIVQRFS 51067

51068 FELSPSYAHAPFSVITIQPQYGAHLILHGL 51157

 

$$$$

 

>CYP72A107 CAAP02002484.1 96% to CAAP02004668.1 CYP72A

no exact match in Genoscope

10902 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRM 11081

11082 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11171

12213 GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW 12380

12381 TKRRKIINPAFHLEKLK (0) 12431

12655 HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG 12801

12802 SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 12900

13371 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 13547

13548 RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTRAREEVLRVFGNKKP 13727

13728 DGDGLNHLKI (0) 13757

14373 VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKEF 14552

14553 NPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 14732

14733 APISLITMQPQYGAHLILHGL* 14798

 

>CYP72A107 gi|147791938|emb|CAN72443.1|

 gi|147791939|emb|CAN72444.1| AM462621.1 65% to 72A15

100% to CAAP02002484.1

Note CAN72443.1 is the N-terminal of the same gene

same as CAN68126.1 and 1 aa diff to CAAP02004668.1

adjacent to CAN72443.1

MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRMINE

ANSRPISFSDDIVQRVLPFHDHSIQKY

GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW

TKRRKIINPAFHLEKLK ()

HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG

SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR ()

FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE

RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT

MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI (0)

VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL

PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKXQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFELSPSYAHAPISLJTXXPQYGAHLILHGL

 

>CYP72A107 gi|147781059|emb|CAN68126.1| AM465661.2 partial seq

66% to 72A15

1 aa diff to CAAP02004668.1

3 aa diffs to CAAP02002484.1

508 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 684

RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT

MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI (0)

VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL

PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFELSPSYAHAPISLLTTHPQYGAHLILHGL
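Notes such as "1 aa diff to CAAP02004668.1" can be checked with a simple position-by-position comparison. The sketch below assumes the two sequences are already aligned to the same length (no gaps) and treats `X` as an unknown residue that is skipped; the function name `aa_differences` is hypothetical. The two fragments are copied from the CAN68126.1 entry above and from the CAAP02004668.1 (CYP72A108) entry.

```python
def aa_differences(seq_a, seq_b, unknown="Xx"):
    """Return (position, residue_a, residue_b) for each mismatch.

    Positions are 1-based. Residues listed in `unknown` are skipped,
    which is an assumption about how ambiguous calls should be handled.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    diffs = []
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a in unknown or b in unknown:
            continue
        if a != b:
            diffs.append((pos, a, b))
    return diffs


# Same region in CAN68126.1 and in CAAP02004668.1.
can68126 = "MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI"
caap02004668 = "MVLLSKHSNWQTHAREEVLRVFGNKKPDGDGLNHLKI"
print(aa_differences(can68126, caap02004668))  # [(13, 'R', 'H')]
```

Sequences of different length, such as partial GenPept entries, need a real alignment first; this helper only counts substitutions.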

 

$$$$

 

>CYP72A108 CAAP02004668.1 72% to CAN67740.1 CYP72A

96% to CAAP02002484.1

GSVIVP00011014001 on Genoscope Browser

69064443 to 69068470 on strand +

7761 MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 7934

7935 EMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 8030

9054 GKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVAL 9206

9207 EGEQWTKRRKIINPAFHLEKLK (0) 9272

9534 HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAF 9677

9678 GSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 9779

10366 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDER 10545

10546 KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTHAREEVLRVFGNKKPD 10725

10726 GDGLNHLKI (0) 10752

11363 VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKE 11539

11540 FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 11719

11720 HAPISLLTTHPQYGAHLILHGL* 11788

 

>CYP72A108 gi|147858656|emb|CAN80407.1| 40% to 72A10

2 aa diffs to CAAP02004668.1 and CAO21263.1

MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSXMINEANSRPIS

FSDDIVQRVLPFHDHSIQKYGEQWTKRRKIINPAFHXEKLKHMVSAFQLSCSDMVNKWEKXLSLDGSCEL

DVWPYLENLAGDVISRTAFGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR

 

>CYP72A108 gi|157328551|emb|CAO21263.1| unnamed protein product [Vitis vinifera]

100% to CAAP02004668.1

MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSEMINEANSRPIS

FSDDIVQRVLPFHDHSIQKYGKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVA

LEGEQWTKRRKIINPAFHLEKLKHMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTA

FGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWRFFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVG

ETANHDLLGLLMESNYRDMQENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTH

AREEVLRVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLL

HHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSP

SYAHAPISLLTTHPQYGAHLILHGL

 

$$$$

 

>CYP72A109 CAAP02001850.1 6 aa diffs to CAAP02003454.1

exact match to 53909800 to 53913784 + strand

GSVIVP00009051001 in Genoscope browser

 163 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 342

 343 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 432

1262 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSLDGEQW 1428

1429 TKRRKIINPAFHLEKLK (0) 1479

1885 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 2049

2050 RRIFQLQKEQALLTVQVTRSVYVPGWR (2) 2130

2730 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 2906

2907 RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 3086

3087 DGDGLNHLKI (0) 3116

3722 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAK 3895

3896 EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4075

4076 AHAPISLLTIQPQHGAHLILHGL* 4147
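The numbers flanking each translated line appear to be nucleotide coordinates on the contig, so the span between them should equal three times the number of residues on the line. The sketch below checks that, assuming a stop codon (`*`) counts as one codon and that a parenthesized digit is not part of the span; `COORD_LINE` and `check_span` are hypothetical names. Descending coordinates (minus strand) are handled by taking the absolute difference.

```python
import re

# Coordinate-flanked line (assumed layout): start, residues, optional "(n)", end.
COORD_LINE = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\((\d)\))?\s+(\d+)\s*$")


def check_span(line):
    """Return (nucleotide_span, 3 * residue_count) for one line."""
    match = COORD_LINE.match(line)
    if match is None:
        raise ValueError(f"no flanking coordinates found: {line!r}")
    start, residues, _phase, end = match.groups()
    span = abs(int(end) - int(start)) + 1  # inclusive, either strand
    return span, 3 * len(residues)


# Lines copied from the CYP72A109 entry above.
for line in (
    " 163 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 342",
    " 343 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 432",
    "4076 AHAPISLLTIQPQHGAHLILHGL* 4147",
):
    span, expected = check_span(line)
    print(span, expected, "ok" if span == expected else "check this line")
```

A mismatch does not necessarily mean a typo; in pseudogene entries it may reflect a frameshift or a hand-edited segment.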

 

>CYP72A110 CAAP02001422.1 6 aa diffs to CAAP02001850.1

no exact match in Genoscope

11379 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 11200

11199 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11110

10292 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPRHPLGKLLASGVASLEGEQW 10125

10124 TKRRKIINPAFHLEKLK 10074

 9745 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 9582

 9581 RRIFQLQKEQALLTVQVTRSVYVPGWR 9501

 8901 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 8725

 8724 RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 8545

 8544 DGDGLNHLKI 8495

 7909 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAKEF 7730

 7729 NPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 7550

 7549 APISLLTMQPQHGAHLILHGL* 7484

 

>CYP72A111P CAAP02001422.1 pseudogene 96% to CAAP02003454.1

GSVIVP00009781001 on Genoscope Browser not assembled correctly

Chr19_random 699832 to 701684 on strand -

32669 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKAPRHPLRKLLASGIASLEGEQW 32502

32501 TKRRKIINPAFHLEKLK 32451

32049 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 31885

31884 RRIFQLQKEQALLAVQVTRSVYVPGWR 31804

31203 FFPTKTNRRMRQISSEVNALLKGIIEKREKAMQAGETANDDLLGLLMESNYREM 31042

31041 QENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNRQACAREEVLRLF 30862

30861 GNKKPDGDGLNHLKI 30717

 

>CYP72A112P CAAP02001422.1 pseudogene fragment 52% to CAAP02002795.1

Chr19_random 732617 to 733044 on strand +

63362 VIS*TTFGSSYEEGRRILLLQEELA*LTIRIF 63457

63664 KGNKRIKKADKEIQELLRGIIDQREKAMKVCETVNDDLLSIL 63789

 

>CYP72A113 CAAP02003454.1 91% to CAN67740.1

GSVIVP00000208001 in Genoscope browser

4821416 to 4825397 on strand +

21735 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 21908

21909 RMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 22004

22833 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSL 22985

22986 DGEQWTKRRKIINPAFHLEKLK (0) 23051

23456 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEGRRIF 23632

23633 QLQKEQALLAVQVTRSVYVPGWR (2) 23701

24301 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDER 24480

24481 KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKPD 24660

24661 GDDLNHLKI (0) 24687

25294 VTMIFHEVLRLYPPVPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAKE 25470

25471 FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 25650

25651 HAPISLLTMQPQHGAHLILHGL* 25719

 

>CYP72A114P CAAP02000680.1  pseudogene missing exons 2 and 3

7 aa diffs to CAAP02003454.1

GSVIVP00000210001 in Genoscope browser not assembled correctly

chrUn_random 4873006 to 4875944 on strand +

1243 MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 1422

1423 INEANSRPMSFSDDIVQRVLPFHDHSIQKY (1) 1512

2769 FFPTKTNRRMRQISSEVNALLKGIIEKREKAMKAGETANDDLLGLLMESNYREMQENDE 2945

2946 RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKP 3125

3126 DGDDLNHLKI (0) 3155

3759 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAK 3932

3933 EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4112

4113 AHAPISLTTMQPQHGAHLILHGL* 4184

 

>CYP72A115P CAAP02000101.1 N-term exon may be a pseudogene or the rest of the gene

may run off the end of the contig, 1 aa diff to CAN63404.1

chrUn_random 3956335 to 3956628 on strand -

6161 MKHSSIAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 5982

5981 MLKEAYSRPISLSDDIAPRYELLLFIIVLKFADLFKLW 5868

 

>CYP72A116P CAAP02000101.1 N-term exon pseudogene, 93% to CAN63404.1

chrUn_random 3957647 to 3957925 on strand -

7458 MKHSSVAISFGFLTVLISYLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLLHGDFKEMS 7279

7278 MMLKEAYSRPINLSDDIALCVLPFHCRFIKKYG 7180

 

>CYP72A117P CAAP02000101.1, pseudogene, missing exons 2,3, 83% to CAN63404.1

about 90% to CAAP02000149.1d

GSVIVP00000171001 in Genoscope browser not assembled correctly

chrUn_random 4136790 to 4143738 on strand -

193271 MKHNSVAISFGFLTVFISCLWMLLNWVWLRPKRLERCLREQGLAENSYSLLHGDFKEMSM 193092

193091 ILKEAYSRPISLSDDIAPRVLPFRCHFIKKY 192999

 

187549 FLPTKTNRKMKQISKKAYALLRGIINKREKTMKADKTGNSDLLVILMESNFR* 187391

187390 IQEHKNNKKIGMSVKEVIEECKIFYLAGQETTSVFLVWTMVLLSENPNWQARAREEVLQV 187211

187210 FGNKKLEANGLNHLKI (0) 187163

186751 VTMIFHEVLRLYPPVAMLTRAVYKDTQVGDMYFPAGVQVALPTILVHHDHEIWGDD 186584

186583 VKEFNPERLAEGISKAKKNQVSFFPFGWGPQACIGQNFAMMEAKIALAMILQHFLFELSP 186404

186403 SYAHAPFNILTMQLQYGGHLILHGLQC 186323

 

>CYP72A118P gi|147818466|emb|CAN71976.1| 51% to 72A10 probable pseudogene

80% to CAAP02000149.1d

chrUn_random 57735623 to 57737209 on strand –

first line does not match

     MEAXELGVXETEEXREMPLESKAWLGIRIGYYKGDSKEMSRMMKEAYSRPISLSDDIVQRVLPFHCHFIKKY

1865 GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWAKRRKIINPAFHPEKLK 1649

1598 RKIINPVFHPE 1566 (small duplication)

1271 NMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENLTGDVISRTAFGSSYEEGIRIFQLQKEQT

     YLAIKVAMSVYIPGWR 1029

 

$$$$

 

>CYP72A119P gi|147779725|emb|CAN67214.1| AM437669.2

45% to 72A8 C-helix

CAAP02016581.1, 74% to CAN71976.1 pseudogene

876  KLGKNSFTWVSPNPRVNIM

     KPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK

1151 RKIINPAFHPE 1183 (small duplication)

1443 NMLPTIHLSCS 1475

1475 LSVEGSRESDVWPYLENLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTYA 1666

 

AM437669.2 4 aa diffs to CAAP02003489.1, 68% to CAAP02000473.1

chrUn_random 69166373 to 69166585 on strand –

from GKNS to FHPEK 1 aa diff

15215 GKNSFTWVGPN 15247

15249 PRVNIMKPELMRDVLLK

15298 RPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK 15435

15486 RKIINPAFHPE 15518

      MLPTIHLSCS

      LSVEGSRESDVWPYLE

15860 NLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTY 16000

 

>CYP72A119P CAAP02003489.1 pseudogene CYP72A 4 aa diffs to CAN67214.1

39277 GKNSFTWVGPNPRVNIMKPELMRDVLLKPNIFQKTPSHPLVKLLVSGLVAQEGEQWAKRR 39098

39097 KIINPVFHPEKLK 39059

38658 NLTWDMIARTAFGSSYEEGRKIFQLQKE*TYLAINVATSVNIPGWTY 38518

 

$$$$

 

CYP72D SUBFAMILY (4 genes) [1 pseudogene]

 

>CYP72D3 gi|147795107|emb|CAN60851.1| 43% to 72A15 yellow region too long

CAAP02006515.1a 1-3060 runs off end, missing exon 1

87% to CAAP02007230.1

GSVIVP00009515001 on Genoscope browser

chrUn_random 60546398 to 60549431 on strand + missing exon 1

MAYSFAILTMYTLSRVVYSIWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIV

PRVVPFYHEIAQKY

GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLSRGLSYLQGEK

WAKRRKLLTPAFHFEKLK

SYHRTVRGHHSRNGPDSAESYGAICLFGSWGVTSLGFELKTKVFLVALL

GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMHEFQNLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAF

RTFYIPGFRFVPIGKNKKRYYIDSEIKAILKKIILKRKQTMKPGDLGNDDLLGLLLQCQEQTDSEMTIED

VIEECKLFYFAGQETTANWLTWTILLLSMHPNWQEKAREEVLQLCGKKMPDIEAINRLKIVSMILHEVLR

LYPPVTQQFRHTCERINIAGMCIPAGVNLVLPTLLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQIAF

YPFGWGHRICLGQGFAMIEAKMALAMILQHFWFELSPTYTHAPHTVITLQPQHGAPIILHEI

 

>CYP72D3 CAAP02015403.1 = CAN60851.1 4 aa diffs, runs off end

1507  MAYSFAILTMYTLSRVV

1456  YSVWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIVPRVL  1286

1285  PFYHEIAQKY  1253

1178  GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLTRGLSYLQGEKWAK  1002

1001  RRKLLTPAFHFEKLK  957

232   GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMPEFQ  125

124   NLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAFRTFY  2

 

>CYP72D4 gi|147795108|emb|CAN60852.1| AM443849.2 43% to 72A14

CAAP02006515.1b 5359-7817 adjacent to CAN60851, 87% to CAN60851

97% to CAAP02007230.1, 100% to CAO16149.1

GSVIVP00009516001 in Genoscope Browser

chrUn_random 60551733 to 60554191 on strand +

MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFVEARSKPMALNHSIV

PRVTPFYHEMAQKYGKVSVSWHFTTPRVLIVEPELMRMILXYKNGHLXRLPGNPLGYHLSRGLLSLEGEK

WAKRRKLLSPAFHLEKLK

GMMPAFSTSCHXLIERWKNLVGPQGTYELDVMPEFQ

NLTGDVISRTAFGSSYEEGRRVFELQKEQIVLVMEDFRNFYIPGFRFVPTRK

NKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTIEDVVEECKLFYFVGQET

TANWLTWTILLLSMHPNWQEKARAEVLQICGKKMPDIEAISNLKIVSMILHEVLRLYPPVIMQFRHTRER

INIAGMYIPAGVDLVLPTVLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGL

AMIEAKMALAMILQHFWFELSPAYTHAPYRIITLQPQYGAPIILHQI

 

>CYP72D5 CAAP02007230.1 87% to CAN60851.1, 100% to CAO41622.1

GSVIVP00013480001 in Genoscope Browser

chrUn_random 89958971 to 89961429 on strand +

1439 MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFMEARS 1618

1619 KPMALNHSIVPRVIPFYHEMAQKY 1690

1768 GKVSVSWHFTTPRVLIVEPELMRMILKYKNGHLHRLPGNPLGYHLSRGLLSLE 1926

1927 GEKWAKRRKLLSPAFHLEKLK 2019

2699 GMMPAFSTSCHDLIERWKNLVGPQGTYELDVMPEFQNLTGDVISRTAFGSSYEEGRRVFE 2878

2879 LQKEQIVLVMEDFRNFYIPGFR 2944

3035 FVPTRKNKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTI 3211

3212 DDVVEECKLFYFVGQETTANWLTWTTLLLSMHPNWQEKARAEVLQICGKKMPDIEAISNL 3391

3392 KI 3397

3469 VSMILHEVLRLYPPVIMQFRHTGERINIAGMCIPAGVDLVLPTALLHHSPEYWGDDVEEF 3648

3649 KPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGLAMIEAKMALAMILQHFWFELSPTYT 3828

3829 HAPHRIITLQPQYGAPIILHQI* 3897

 

>CYP72D6 CAAP02003169.1 48% to 72A15, 75% to CAAP02007230.1, 76% to CAN60851.1

GSVIVP00032271001 in Genoscope Browser

Chr4 1480668 to 1483369 on strand +

19453 MAFSFAILVVYGLLRAVYTIWWRPKSLEKQLRQQGIRGTRYKPMYGDMKALKLSFQEAQS 19632

19633 KPMTLNHSIVPRVIPFFHQMFQNY 19704

20367 GKISMSWIFTRPRVMIVDPELIRMILADKNGQFQKPPLNPLVDLLTLGLSTLE 20525

20526 GEQWAKRRKLITPAFHVEKL 20585

20884 GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFE 21063

21064 LQKEQAVLVIEASRAIYLPGFR 21129

21219 FVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTEEID 21383

21384 NEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLE 21563

21564 AIKHLKI 21584

21726 VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEF 21905

21906 KPERFSEGVSKASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYT 22085

22086 HAPYTVITLQPQYGAPIILHQI* 22155

 

>CYP72D6 gi|147773778|emb|CAN65255.1| 53% to 72A8

1 aa diff to CAAP02003169.1 missing first two exons

GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFELQKE

QAVLVIEASRAIYLPGFRFVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTE

EIDNEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLEAIKHLKI

VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEFKPERFSEGVS

KASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYTHAPYTVITLQPQYGAPIILH

QI

 

>CYP72D7P CAAP02000210.1 pseudogene, one stop codon, insert in C-helix 59% to CAAP02007230.1

GSVIVP00016465001 in Genoscope Browser (missing C-helix region)

chr11 1013195 to 1015601 on strand -

131775 MAVSMFSCLLISSLVLLLYGVLRVSYSIWWKPKWLEKRLRQQGIRGTPYKLVMGDMKEYI 131596

131595 RLITEAWSKPMNLNHHIVSRVDPFTQNNMQQY

131412 GKVSLFWAGTTPRLIVMDPGMIKEVLSNKQGHFQKPYISPLILTLARGLTALEGEVWAK 131236

131317 RRIINPAFHLEKLK 131192

131015 VMIPAFTTSCSMLIERWKELASLQETCEVDIWPELQNLTRDVISRAALGSSFEEGRQ 130845

130844 IFELQKEHITLTLEAMQTLYIPGFR 130770

130633 FIPTKKNQRRKYLQKRTTSMFRDLIQRKKDAIRTGQAEGDNLLGLLLLSSSQNNLPEN 130460

130459 VMSTKDNAITLEEVIEECKQFYLAGHETTSSWLTWTVTVLAMHPNWQEKAREEVMQICGK 130280

130279 KEPDSEALSHLKI 130241

129794 VSMILYEVLRLYPPVIAVYQHAYKETKIGTISLPAGVDLTLPTLLIHHDPELWGDDAEEF 129615

129614 KPERFAEGVSKASKDQLAFFPFGWGPRTCIGQNFAMIEAKVALAMILQHFSFELSPSYTH 129435

129434 APHTVMTLQPQHGAQLKFYQL* 129369

 

CYP73 family

 

>CYP73A78 gi|147821469|emb|CAN70035.1| = AM455281.2

CAAP02004907.1 7360-5632 (-) strand, 1 aa diff

MAHLLNKPVFFSTLLTIILLSSTRLLASYLSISPPLIASFLPLAPLILYLFYSIAKRSASLPPGPLSIPL

FGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLAVVSDPELASQVLHTQGVEFGSRPRNVVFDIFTG

NGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHHYSEMWEEEMELVVDDLRNKESVKSEGLVIRKRLQLML

YNIMYRMMFDSKFESQEDPLFIQATRFNSERSRLAQSFDYNYGDFIPFLRPFLRGYLNKCRELQSRRLAF

FNNYFVEKRREIMAANGEKHKIRCAIDHIIDAQLKGEISEANVLYIVENINVAAIETTLWSMEWAIAELV

NHPHVQCKIRDEITTILQGDAVTESNLHQLPYLQATVKETLRLHAPIPLLVPHMNLEEAKLGGYTIPKES

KVVVNAWWLANNPSWWKNPEEFRPERFLEEESGTDAVAGGKVDFRFLPFGVGRRSCPGIILALPILALVI

AKLVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVALTPIAA

 

>CYP73A81 CAAP02000489.1 94% to 73A78

132479  MAHLLNKPLFFTLVTIILLSSTRLLASYLPISPNIARFLPLAPLILYLFYSISKRSAS  132652

132653  LPPGPLSIPIFGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLTVVSDPELASQVLH  132832

132833  TQGVEFGSRPRNVVFDIFTGNGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHQYSEMWEE  133012

133013  EMDLVVDDLRNKESVKTEGLVIRKRLQLMLYNIMYRMMFDAKFESQEDPLFIQATRFNSE  133192

133193  RSRLAQSFDYNYGDFIPLLRPFLRGYLNKCRELQSSRLAFFNNYYVEKRR  133342

133464  EIMAANGEKHKIRCAIDHIIDAQHKGEISEENVLYIVENINVAAIETTLWSMEWAIAEL  133640

133641  VNHPHVQSKIRDEITTVLQGGAVTESNLHQLPYLQATVKETLRLHSPIPLLVPHMNLEEA  133820

133821  KLGGYTIPKESKVVVNAWWLANNPEWWKNPEEFRPERFLQEESATDAVAGGKADFRFLPF  134000

134001  GVGRRSCPGIILALPILALVIGKMVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVAF  134180

134181  TPITA*  134198

 

>CYP73A82 gi|147775009|emb|CAN77208.1| 63% to CYP73A78

MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDLNHLNLSDLAKKFGDIFM

LRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGKGQDMVFTVYGEHWRKMRRIMTVPFFTN

KVVQQYRVGWEDEAARVVEDVKKNPEASTNGIVLRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALN

GERSRLAQSFEYNYGDFIPILRPFLRGYLKICKEVKERRLQLFKDHFLEERKKLASTKSTDHNSLKCAVD

HILDAQQKGEINEDNVLYIVENINVAAIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPD

IQKLPYLQAVIKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER

FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQAKLDTTGKGGQFSL

HILKHSTIVARPIEA

 

>CYP73A82 CAAP02000415.1 = CAN77208.1

86000  MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDL  86158

86159  NHLNLSDLAKKFGDIFMLRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGK  86338

86339  GQDMVFTVYGEHWRKMRRIMTVPFFTNKVVQQYRVGWEDEAARVVEDVKKNPEASTNGIV  86518

86519  LRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALNGERSRLAQSFEYNYGDFIPILRP  86698

86699  FLRGYLKICKEVKERRLQLFKDHFLEERK  86785

86992  KLASTKSTDHNSLKCAVDHILDAQQKGEINEDNVLYIVENINVA  87120

88683  AIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPDIQKLPYLQAV  88844

88845  IKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER  89024

89025  FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQA  89195

89196  KLDTTGKGGQFSLHILKHSTIVARPIEA  89279

 

>CYP74A13 CAAP02000041.1a CYP74A 54% to 74A4 (CAO47688.1)

in contig CU459225.1 chr3 scaffold_8

234521 MSSLSSSSSSSRSELPLLKIPGDYGLPFFGPIRDRFDYFYNQGQDEFFKTRMQKYHSTVFRAN 234709

234710 MPPGPFISSDSKVVVLLDAVSFPVLFDSSKVEKRNVLDGTFMPSTDLTGGYRVLAFLDPS 234889

234890 EPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFTTIEDDVSSKGKANFNNIADGMYFNF 235069

235070 VFRLICGKDPSDAKIRSEGPNIFSKWLFLQLSPLMTLGLSMLPNFIEDLLLHTFPLPPFL 235249

235250 VKSDYNKLYKAFYESASSVLDEGERMGINRDEACHNLVFLAGFSTFGGMKVLFPPLIKWV 235429

235430 GLAGEKLHRELADEIRTVVKAEGGVTFAALDKMALTKSVVYEALRIGPPVPFQYGKARE 235606

235607 DMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEDFVAHRFMGEGEKLLKYVYWSN 235783

235784 GRETDNPTAENKQCSGKDLVVLISKLMLVEIFLRYDTFEVESGTMVLGSAVLFKSLTKSS 235963

235964 YT* 235972

 

>CYP74A14 CAAP02000041.1b CYP74A 54% to 74A4 (CAO47689.1)

in contig CU459225.1 chr3 scaffold_8

244102 MSSSSSSSSSSRPELPLRKIPGDYGLPFFGPIRNRFDYFYNQG 244230

244231 QDEFFKTRMQKYHSTVFRANMPPGPFISSDSKVVVLLDTVSFPVLFDSSKVEKRNVFVGT 244410

244411 FMPSTDLTGGYRVLPYLDPSEPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFSTIEDD 244590

244591 VSRKGKANFNDIADDMYFNFVFRLICGKDPSDAKIRSEGPNIFLKWLFLQLSPLLTLGLS 244770

244771 ILPNFIDDLLLHTFPFPPFLVKSDYNKLYKAFYESASSVLDEGERMGIKRDEACHNLVFL 244950

244951 AGFNSFGGMKVFFPALIKWVGLAGEKLHRELADEIRTVIKAEGGVTFAALDKMALTKSM 245127

245128 VYEALRIEPPVPFQYGKAREDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEEFV 245307

245308 AHRFMGEGEKLLKYVYWSNGRETDNPTAENKQCSGKDLVVLISRLMLVEIFLRYDTFEV 245484

245485 ESGTMLLGSSLLFKSLTKTSYT* 245553

 

>CYP74A15 CAAP02000041.1c CYP74A 56% to 74A5 (CAO47690.1 fused)

in contig CU459225.1 chr3 scaffold_8 upstream of CAAP02006275.1a

252843 MSSSSSSLPLNFDNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQKY 253025

253026 QSTVFRANMPPGPSMASNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGYR 253205

253206 VCAYLDPSEPNHALLKRLFMSSLAARHHNFISVFRSCLTELFITLEDDASRKGKADFNGI 253385

253386 SDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLLH 253565

253566 TFPLPSLFVKSDYKNLYHAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKTL 253745

253746 FPALIKWVGLAGEKLHGQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 253922

253923 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKM 254099

254100 LEYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMMVEFFLRYDTFNIECGTLLLGSSVT 254279

254280 FKSLTKQPTFDHKSITHVS* 254339

 

>CYP74A16 CAAP02006275.1a CYP74A, 96% to CAAP02000041.1c (CAO47690.1 fused)

in contig CU459225.1 chr3 scaffold_8

5348 MSSSSSSLPLNFVNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQK 5527

5528 YQSTVFRANMPPGPFMAFNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGY 5707

5708 RVCAYLDPSEPNHALLKRFFTSSLAARHHNFIPVFRSCLTELFTTLEDDVSRKGKADFNG 5887

5888 ISDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLL 6067

6068 HTFPLPSLFVKSDYKKLYHAFYASASSLLDEAESMGIKRDEACHNLVFLAGFNAYGGMKT 6247

6248 LFPALIKWVGLAGGKLHRQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 6427

6428 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKLL 6607

6608 EYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMLVEFFLHYDTFDIECGTLLLGSSVTF 6787

6788 KSLTKQPTFDHKSIKHVS* 6844

 

>CYP74A17 CAAP02006275.1b CYP74A, 84% to CAAP02000041.1c (CAO47691.1)

in contig CU459225.1 chr3 scaffold_8

11732 MSSSSDKNDLNSSSSLSKLPLRKIPGDYGLPFFGAIKDRLDYFYKQGREEFFNARMHK 11905

11906 YQSTVFRANMPPGPFMASNPNVIVLLDSISFPILFDTSKVEKRNVLDGTYMPSTAFTGGY 12085

12086 RVCAYLDPSETNHALLKRLFMSALAARHHNFIPLFRSSLSELFTSLEDDISSKGEADFND 12265

12266 ISDNMSFNFVFRLFCDKYPSETALGSQGPSIVTKWLFFQLAPLITLGLSLLPNFVEDLLL 12445

12446 HTFPLPSIFVKSDYKKLYRAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKA 12625

12626 LFPSLIKWVGSAGEKLHRELADEIRTVVKAEGGVSFAALEKMSLTKSVVYEALRIDPPVP 12805

12806 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFMGNRFMGEGERLL 12985

12986 KYVYWSNGRESGNPTVENKQCAGKDLVLLLSRVMLVEFFLRYDTFDIESGTLLLGSSVTF 13165

13166 KSITKATDS* 13195

 

>CYP74A1 CAAP02000063.1 (CAO61246.1) in contig CU459218.1

chr18 scaffold_1

61% to 74A1 Arabidopsis, 70% to 74A1 tomato

The next closest match to the tomato 74A1 is CAAP02000041.1b at 60%,

so this is considered the ortholog of CYP74A1. Note that it is distant from the

other CYP74 gene cluster on chr 3.

149291 MASPSLTFPSLQLQFPTHTKSSKPSNHKLIVRPIFASVSEKPSVPVSQSQVTPPGPIRKI 149470

149471 PGDYGLPFIGPIKDRLDYFYNQGREEFFRSRAQKHQSTVFRSNMPPGPFISSNSKVIVLL 149650

149651 DGKSFPVLFDVSKVEKKDVFTGTFMPSTEFTGGFRVLSYLDPSEPDHTKLKRLLFFLLQS 149830

149831 SRDRVIPEFHSCFSELSETLESELAAKGKASFADPNDQASFNFLARALYGTKPADTKLGT 150010

150011 DGPGLITTWVVFQLSPILTLGLPKFIEEPLIHTFPLPAFLAKSSYQKLYDFFYDASTHVL 150190

150191 DEGEKMGISREEACHNLLFATCFNSFGGMKIIFPTILKWVGRGGVKLHTQLAQEIRSVVK 150370

150371 SNGGKVTMASMEQMPLMKSTVYEAFRIEPPVALQYGKAKQDLVIESHDSVFEVKEGEMLF 150550

150551 GYQPFATKDPKIFERSEEFVPDRFVGEGEKLLKHVLWSNGPETENPTLGNKQCAGKDFV 150727

150728 VLAARLFVVELFLRYDSFDIEVGTSLLGSAINLTSLKRASF* 150853

 

>CYP74B13 AM441513 PLN 18-MAY-2007 Vitis vinifera (Pinot noir grape)

11751 MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFW 11593

11592 FQGPETFFRKRIDKYKSTVFRTNVPPSFPFFVDVNPNVIAVLDCKSFSFLFDMDVVEKKN 11413

11412 VLVGDFMPSVKYTGDIRVCAYLDTAETQHAR 11320

10198 VKGFAMDILKRSSSIWASEVVASL 10127

10125 DTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCLVGADPAVSPEIAESGYVMLDKWVFL 9946

 9945 QLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYRKLYEFVEQHGQAVLQRGETEFNLSKE 9766

 9765 ETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPELQAKLREEVRSKIKPGTNLTFESVK 9589

 9588 DLELVHSVVYETLRLNPPVPLQYARARKDFQLSSHDSVFEIKKGDLLCGFQKVAMTDPKI 9409

 9408 FDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGSPSDRNKQCAAKDYVTMTAVLFVTHMF 9229

 9228 QRYDSVTASGSSITAVEKAN* 9166

 

>CYP74B13 CAAP02000110.1 (CAO24035.1) in contig CU459253.1

chr12 scaffold_36

no heme Cys; 56% to 74B2, which also lacks the Cys; 3 aa diffs to AM441513

146202 MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFWFQGPETFFRKR 146011

146010 IDKYKSTVFRTNVPPSFPFFVGVNPNVIAVLDCKSFSFLFDMDVVEKKNVLVGDFMPSVK 145831

145830 YTGDIRVCAYLDTAETQHAR (0) 145771

144656 VKSFAMDILKRSSSIWASEVVASLDTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCL 144480

144479 VGADPAVSPEIAESGYVMLDKWVFLQLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYR 144303

144302 KLYDFVEQHGQAVLQRGETEFNLSKEETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPE 144123

144122 LQAKLREEVRSKIKPGTNLTFESVKDLELVHSVVYETLRLNPPVPLQYARARKDFQLSS 143946

143945 HDSVFEIKKGDLLCGFQKVAMTDPKIFDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGS 143766

143765 PSDRNKQCAAKDYVTMTAVLFVTHMFQRYDSVTASGSSITAVEKAN* 143625

 

AM and CAN accessions are from Velasco et al. (heterozygous Pinot Noir grapevine variety).

CAO and CAAP accessions are from Jaillon et al. (PN40024, highly homozygous; French-Italian Public Consortium).
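The prefix rule above can be applied mechanically when sorting accessions by genome project. A minimal sketch, with `SOURCES` and `source_of` as hypothetical names; accessions with other prefixes (for example the Shiraz and Cabernet Sauvignon entries in the CYP75 section) are returned as `None` rather than guessed.

```python
# Accession prefix -> source project, as stated in the note above.
SOURCES = {
    "AM": "Velasco et al., heterozygous Pinot Noir",
    "CAN": "Velasco et al., heterozygous Pinot Noir",
    "CAO": "Jaillon et al., PN40024",
    "CAAP": "Jaillon et al., PN40024",
}


def source_of(accession):
    """Return the genome project for an accession, or None if unknown."""
    # Try longer prefixes first so a short prefix cannot shadow a long one.
    for prefix in sorted(SOURCES, key=len, reverse=True):
        if accession.startswith(prefix):
            return SOURCES[prefix]
    return None


for acc in ("CAN63796.1", "CAAP02002795.1", "CAO62605.1", "AM469525.2", "CAI54277.1"):
    print(acc, "->", source_of(acc))
```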

Note: CYP75s and CYP79s are interleaved

 

CYP75 family

 

CYP75A subfamily (9 genes) [5 pseudogenes] 2 alleles, 15 orthologs from other strains

31 sequences

 

>CYP75A28 gi|83715792|emb|CAI54277.1| AJ880356

Shiraz mRNA

flavonoid-3,5'-hydroxylase 78% to CYP75A8 Catharanthus roseus

MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV*

 

>CYP75A28 gi|85679310|gb|ABC72066.1| flavonoid 3',5'-hydroxylase 99%

only 3 aa diffs, DQ351701 cv. Sangiovese Berries genomic

RFFIRSLLLKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA

FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR

AMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIA

WLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSTQACEVNGYYIPKNTGLSVNIWAIGRDPDVWESPEEFRPERFLSGRNTKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A28 gi|147862221|emb|CAN82592.1| AM436340.2c Pinot Noir genomic

99% only 4 aa diffs

MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLL

NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDXVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRILRWH

 

>CYP75A28-de2b gi|157028306|emb|CAAP02012536.1| PN40024,

contig_12536

Length=4320

 

Middle exon on (-) strand near end of contig = CAN82592.1| AM436340.2c

This is a pseudogene fragment that is missing the end of the exon

4021  NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC  3842

3841  KESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIW  3722

 

There is another exon 1 on (-) strand at 1-382

= CAN82592.1| AM436340.2c CYP75A28

 

>CYP75A28 gi|83944624|gb|ABC48916.1| DQ298201.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75A28

4 aa diffs

this seq is called VvF3’5’H-1a in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRWLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A28 gi|83944626|gb|ABC48917.1| DQ298202.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75A28

4 aa diffs

this seq is called VvF3’5’H-1b in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQEIQRGMEHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A28 gi|83944628|gb|ABC48918.1| DQ298203.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase (1 aa diff)

this seq is called VvF3’5’H-1c in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETRGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNT

 

>CYP75A32P CAAP02004900.1b pseudogene

96% to 75A28 CAI54277.1 AJ880356.1 Shiraz mRNA

2 frameshifts

CAO16882.1 + CAO16883.1 CU459242.1

19834 MAIDTSLLLEFAAATLLFFITRFFIRSILPKPSRKLPPGPKGWPLLGALPLVGNMPHVALAKMAK 19640

19639 RYGPVMFLKMGTNSMVVASTPEAARAFLKTLDINFSSRPPNAGATLLAYHAQD 19481

19480 MVFADYGARWKLLRKLSNLHMLGG 19409

19409 KALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGS 19230

19229 ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHT 19053

19052 ASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 18939

18547 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 18368

18367 KESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRL 18263

18261 SVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVL 18079

18078 VEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV* 17917

 

>CYP75A33 CAAP02004900.1a 5 aa diffs to CAN60359.1, Pinot Noir genomic

N-term corrected

100% to CAO16880.1 CU459242.1

10277 MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVAL 10098

10097 AKMAKRYGPVMFLKMGTNSMVVASTP

10019 EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGARWKLLRKLSNLHMLGGKALE 9840

 9839 DWSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNE 9660

 9659 FKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHE 9480

 9479 RKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL 9381

 8991 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLQKLPYLQAIC 8812

 8811 KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 8632

 8631 LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINM 8452

 8451 DEAFGLALQKAVSLSAMVTPRLHQSAYAV 8365

 

>CYP75A34 gi|147861244|emb|CAN81079.1| AM457118.1 Pinot Noir genomic

91% to CYP75A28

4 aa diffs to CAO16875.1 CU459242.1

MAIDTSLLPELAAATLLFFITRFFICSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTSSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANMIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QGNSTGEKLTLTNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP

KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A35 gi|147777347|emb|CAN62887.1| AM437324.2 Pinot Noir genomic

72% to CYP75A28

CAAP02001548.1 61270-63341 (+) strand, N-term corrected

100% to CAO16871.1 CU459242.1

MAIDTSFFIVSAAATLLFLIVHSFIHFLVS

RRSRKLPPGPKGWPLLGVLPLLKEMPHVALAKMAKKYGPVMLLKMGTSNMVVASNPEAAQAFLKTHE

ANFLNREPGAATSHLVYGCQDMVFTEYGQRWKLLRRLSTLHLLGGKAVEGSSEVRAAELGRVLQTMLEFS

QRGQPVVVPELLTIVMVNIISQTVLSRRLFQSKESKTNSFKEMIVESMVWAGQFNIGDFIPFIAWMDIQG

ILRQMKRVHKKFDKFLTELIEEHQASADERKGKPDFLDIIMANQEDGPPEDRITLTNIKAVLVNLFVAGT

DTSSSTIEWALAEMLKKPSIFQRAHEEMDQVIGRSRRLEESDLPKLPYLRAICKESFRLHPSTPLNLPRV

ASEACEVNGYYIPKNTRVQVNIWAIGRDPDVWENPEDFAPERFLSEKHANIDPRGNDFELIPFGSGRRIC

SGNKMAVIAIEYILATLVHSFDWKLPDGVELNMDEGFGLTLQKAVPLLAMVTPRLELSAYAA

 

>CYP75A36 CAAP02004490.1 21113-19182 PN40024

MAIDTSLLLKLAAAILLFFITRFFIRSLLPKPSRKLPP

GPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAAQAFLKTLDI

NVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGH

MLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTA

GYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVM

GHQGNSTGEKLTLTNIKALLL (0)

NLFTAGTDTSSSVIEWSLAEMLKN

PSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEV

NGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR

RICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQ

SAYAV*

 

>CYP75A36 gi|86156244|gb|ABC86840.1| DQ356236.1 Sangiovese genomic

flavonoid 3',5'-hydroxylase 94% to 75A28

CAAP02004490.1 21113-19182, 3 aa diffs, N-term corrected

CAO23870.1 translated from CAAP02004490.1

21113 MAIDTSLLLKLAAAILLFFIT

RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAARA

FLKTLDINVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGHMLR

AMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA

WLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGYQGNSTGEKLTLTNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSAQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV*

 

>CYP75A36 gi|83944630|gb|ABC48919.1| DQ298204.1 Cabernet Sauvignon genomic

flavonoid 3',5'-hydroxylase 94% to CYP75A28

2 aa diffs to CYP75A36

this seq is called VvF3’5’H-2a in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM

IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A36 gi|83944632|gb|ABC48920.1| DQ298205.1 Cabernet Sauvignon genomic

flavonoid 3',5'-hydroxylase 94% to CYP75A28

this seq is called VvF3’5’H-2b in Castellarin et al. BMC Genomics 2006

2 aa diffs to CYP75A36

IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM

TEEHAASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A37P gi|147794774|emb|CAN60359.1| AM429113.2 Pinot Noir genomic

93% to CYP75A28, wrong N-term

CAO23867 pseudogene

MVQFKSCGTLGQRMRSIHLHPTILHGWGTPNLSVLEVWMVLELFFPNFQLLLVSRSPAMQGLPGEAPGRP

LRWLNGRKKYVYNNNNWRVDVVC

EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRK

LSNLHMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSE

SNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDF

LDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRR

LVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEE

FSPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAF

GLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A37P CAAP02002140.1c  translation = CAO23867 PN40024

CAN60359.1 pseudogene, Pinot Noir genomic

missing the N-term, which is not found in the next 15 kb

These genes are usually intact in the first exon, not split.

48449 AARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNL 48300

48299 HMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVF 48120

48119 ETKGSESNEFKDMVVELMTSAGYLNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM 47943

47942 IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 47814

47492 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 47313

47312 KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 47133

47132 LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGV 46965

46964 EINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV 46866

 

>CYP75A38v1 gi|157331175|emb|CAO63558.1| same seq as CAAP02005443.1 PN40024

next gene is CYP79A29P CAO63559

3 aa diffs to CAN82588.1| AM436340.2a Pinot Noir genomic

7 aa diffs to BAE47007.1,

5 aa diffs to DQ786631.1, 12 aa diffs to CAI54277.1,

9 aa diffs to ABC86841, 11 aa diffs to ABC72066.1,

2 aa diffs to ABC48916 and ABC48917 (partials)

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v2 gi|157332081|emb|CAO68617.1| CU460585.1 PN40024

runs off the end

2 aa diffs to DQ356237.1 Sangiovese genomic

3 aa diffs to DQ786631.1 Cabernet Sauvignon mRNA

STPGAARAFLKTLDINFSNRPPNAGASLLAYHAQD

MVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIG

QVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMME

EHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILK

RAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNI

WAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFD

WKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A38v3 gi|147862217|emb|CAN82588.1| AM436340.2a Pinot Noir genomic

97% to 75A28, 11 aa diffs

3 aa diffs to CAAP02005443.1a 8380-10309

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v4 gi|111144659|gb|ABH06585.1| translated from DQ786631.1 Cabernet Sauvignon mRNA

flavonoid 3'5' hydroxylase

2 aa diffs to CAN82588.1 Pinot Noir genomic

4 aa diffs to CAAP02007407.1 PN40024

11 aa diffs, 97% to 75A28 CAI54277 Shiraz mRNA

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFRDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v5 gi|78183426|dbj|BAE47007.1| AB213606.1 Cabernet Sauvignon genomic

flavonoid 3',5'-hydroxylase 98%

4 aa diffs to CYP75A38v4 and CYP75A38v3

7 aa diffs to 75A28, EST = EE066764

MAIDTSLLLEFAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v6 gi|86156246|gb|ABC86841.1| DQ356237.1 Sangiovese genomic

flavonoid 3',5'-hydroxylase

2 aa diffs to CAAP02008469.1 translation = CAO68617 PN40024

runs off end

8 aa diffs to 75A28

note: 79A29P is close

CAO68617 may be allelic with CAO63558 that also has CYP79A29P close

If not it is a nearly identical duplication

RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA

FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR

AMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA

WLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A39Pv1 gi|157332520|emb|CAO70765.1| CU460864.1 same seq as CAAP02008469.1

100% to CYP75A38v4

P450 gene does not continue upstream

ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPD

FLDVIMANQENSTGEKLTITNIKALLL

NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDL

PKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPNVWESPEEFRPERF

LSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQK

AVSLSAMVTPRLHQSAYAV

 

>CYP75A39Pv2 gi|157023020|emb|CAAP02017822.1| PN40024, contig_17822 Length=1454

Identical to CAN82588.1 and AB213606.1 and DQ356237.1

Does not extend, pseudogene fragment

609  ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTA  430

429  SAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLL  319

 

>CYP75A40P gi|147819898|emb|CAN60738.1| AM440112.2 Pinot Noir genomic

9 aa diffs to CYP75A38v3, end of the gene is missing

probably a pseudogene

94% to CYP75A28

MAIDTSLLLELAAATLLFFITRFFIRSLLPKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNGMVVASTPGAARAFLKTLDINFSNRPLNAGATLLAYRSQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRTEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELITTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QEKSTGEKLTITNIKALLLVGTIWHRNLWYNIHVIQHAILYDHCSEYGILQIGIRAFVG

 

>CYP75A41 gi|157333816|emb|CAO18026.1| PN40024

7 aa diffs to CYP75A34

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP

KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKIPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A41 gi|147862169|emb|CAN82604.1| AM436584.2 Pinot Noir genomic

86% to CYP75A28

CAAP02012125.1 2184-4508, 1 aa diff

CAAP02007036.1 12858-10533 (-) strand, 1 aa diff, no seq gap

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDV

(small seq gap)

NLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLPKLPYLQAICKESLRKHPSTPLNL

PRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR

RICAGARMGIVLVEYXLGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A42 gi|147802021|emb|CAN61852.1| Pinot Noir genomic

91% to CYP75A28

7 aa diffs to 75A41

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTXSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAXEEMDXVIGRXRRLVESDLP

KLPYLQAICKESXRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A43 gi|147852187|emb|CAN80142.1| Pinot Noir genomic

70% to CYP75A28, 73% to CYP75A41

MSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGAIMYLKLGTCDVVVASKPDSARAFLKTLDL

NFSNRPPNAGATHIAYEAQDFVFADIGPRWNLLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSR

RGEPVVVPEMVSCALANIIGQKSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGX

EGKMKLLHNKFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSCGVKLSMVNIKALLLNLFIAGTDT

SSGTIEWALAEILKNPTMLKRAHAEMDRVIGKNRLLQESDVPKLPXLEAICKETFRKHPSVPLNIPRVSA

NACEVDGYYIPEDTRLFVNVWAIGRDPAVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAG

IRMGIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV

 

>CYP75A43 CAAP02001252.1 PN40024 genomic

2 aa diffs to CAN80142.1

35518  MVQIDEL

35539  LFTALVFLVTNFFVKRITSMSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGA  35718

35719  IMYLKLGTCDVVVASKPDSARAFLKTLDLNFSNRPPNAGATHIAYEAQDFVFADIGPRWN  35898

35899  LLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSRRGEPVVVPEMVSCALANIIGQ  36078

36079  KSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGTEGKMKLLHN  36252

36253  KFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSGGVKLSMVNIKALLL (0)  36408

       NLFIAGTDTSSGTIEWALAEILKNPTMLKRAH  36603

36604  AEMDRVIGKNRLLQESDVPKLPYLEAICKETFRKHPSVPLNIPRVSANACEVDGYYIPED  36783

36784  TRLFVNVWAIGRDPEVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAGIRM  36963

36964  GIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV* 37143

 

CYP75B subfamily (2 genes) [3 pseudogenes] 4 orthologs 2 alleles

21 sequences

 

>CYP75B32v1 gi|83715794|emb|CAI54278.1| AJ880357.1 Shiraz mRNA

flavonoid-3'-hydroxylase

same as AJ880357

CAAP02002732.1 7596-5384 (-) strand, 1 aa diff

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGLKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT

YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v1 gi|157342333|emb|CAO64446.1| CU459229.1 PN40024

complement(join(4340871..4341512,4341761..4342213,

                     4342649..4343083))

2 aa diffs to 75B32v1 (CAI54278.1), 6 aa diffs to BAE47006

6 aa diffs to 75B32v3 BAE47005.1

6 aa diffs to DQ786632.2, 5 aa diffs to 75B32v2 CAN68303.1

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT

YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

>CYP75B32v2 gi|147833535|emb|CAN68303.1| Pinot Noir genomic

99%, 5 aa diffs to CYP75B32v1, 2 aa diffs to CAO64446

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRXRLVTDLDLPQLT

YXQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v3 gi|78183422|dbj|BAE47005.1| AB213604.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98%

6 aa diffs to CYP75B32v1

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v4 gi|111144661|gb|ABH06586.1| translated from DQ786632.2 Cabernet Sauvignon mRNA

flavonoid 3' hydroxylase 99%

8 aa diffs

MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

APPLMVHPRPRLSPQVFGK

 

>CYP75B32v4 gi|78183424|dbj|BAE47006.1| AB213605.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75B32

100% to DQ786632.2, 100% to AB213605.1, 4 aa diffs to AB213604

3 aa diffs to CAN68303, 8 aa diffs to 75B32v1

MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

APPLMVHPRPRLSPQVFGK

 

$$$$

 

>CYP75B38 gi|83944614|gb|ABC48911.1| DQ298196.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1a in Castellarin et al. BMC Genomics 2006

100% to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|83944616|gb|ABC48912.1| DQ298197.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1b in Castellarin et al. BMC Genomics 2006

1 aa diff to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQEPDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|83944618|gb|ABC48913.1| DQ298198.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1c in Castellarin et al. BMC Genomics 2006

1 aa diff to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKRNMDEA

 

>CYP75B38 gi|83944620|gb|ABC48914.1| DQ298199.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1d in Castellarin et al. BMC Genomics 2006

100% to CAO64444.1

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|157342331|emb|CAO64444.1| CU459229.1 PN40024 100% to CAN75347.1

1 aa diff to AB213603.1

complement(join(4317456..4318097,4318193..4318645,

                     4319126..4319560))

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

>CYP75B38-de3b CU459229.1 1206 bp upstream of CAO64444

Same as CAAP02002916.1-de3b C-term fragment

4320766 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFG 4320656

 

>CYP75B38-de3c CU459229.1 same as CAAP02002916.1-de3c  C-term fragment

4309781 GLTLQRAAPLMVHPQPRLSPQGFG 4309707

 

>CYP75B38 gi|147801850|emb|CAN75347.1| Pinot Noir genomic

97% to CYP75B32

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

>CYP75B38 CAAP02002916.1  100% to CAN75347.1, Pinot Noir genomic

97% to CYP75B32, 1 aa diff to AB213603

45444 MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYG 45265

45264 PLMHLRMGFVDVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRW 45085

45084 RMLRKICSVHLFSGQALDDFRHIRQ 45010

44529 EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEM 44353

44352 VVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSER 44173

44172 HVDLLSTLISLKDNADGEGGKLTDVEIKALLL 44077

43981 NLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIV 43802

43801 KETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRF 43622

43621 LPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAE 43442

43441 KLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK* 43340

 

>CYP75B38-de3b CAAP02002916.1-de3b C-term fragment

46650 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFGK* 46534

 

>CYP75B38-de3c CAAP02002916.1-de3c  C-term fragment

35664 GLTLQRAAPLMVHPQPRLSPQGFG 35593

 

>CYP75B38 gi|78183420|dbj|BAE47004.1| AB213603.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 96% to CYP75B32

1 aa diff to CAO64444.1

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALAREGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

$$$$

 

>CYP75B39P gi|83944622|gb|ABC48915.1| DQ298200.1 Cabernet Sauvignon genomic

truncated flavonoid 3'-hydroxylase, pseudogene

only 2 aa diffs to BAE47003.1

this seq is called VvF3’H-2 in Castellarin et al. BMC Genomics 2006

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGIASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIEALLL (0)

NLFTAGTDTSSSTV

EWAIAELIRHPEMMAQA & (23 bp deletion)

GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIP

KNATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAG

MSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDE

 

>CYP75B39P gi|78183418|dbj|BAE47003.1| AB213602.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase pseudogene

96% to CYP75B32

same deletion as DQ298200.1, 2 aa diffs

MNPLALSFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQ ()

EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLL (0) 1396

NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA 1584 & (23 bp deletion)

1586 GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVN 1762

1763 VWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMV 1942

1943 HLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK 2107

 

>CYP75B39P gi|147825152|emb|CAN62275.1| AM488740.1 Pinot Noir genomic

96% to CYP75B32, 1 aa diff to CAN75347, plus the same deletion as in AB213602 and DQ298200

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLL (0)

NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA & (23 bp deletion)

PGRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKN

ATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLT

ATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK

 

 

CYP76 family (24 genes) [23 pseudogenes]

 

CYP76A subfamily (5 genes) [3 pseudogenes]

 

>CYP76A10 CAAP02005158.1b 66% to CAN77399.1

23434 MEWTTNFLVWLIIPFLSALLLLLHRLKSGFNKHLPPGPPGWPIFGNIFDLGTLPHQKLAG 23255

23254 LRDTYGDVVWLNLGYIGTMVVQSSKAAAELFKNHDLSFSDRSIHETMRVHQYNESSLSLA 23075

23074 PYGPYWRSLRRLVTVDMLTMKRINETVPIRRKCVDDLLLWIEEEARGMDGTATGLELGRF 22895

22894 FFLATFNMIGNLMLSRDLLDPQSRKGSEFFTAMRISMESSGHTNFADFFPWLKWLDPQGL 22715

22714 KKRMEVDLGKSIEIASGFVKERMRQGRAEESKRKDFLDVLLEFQGDGKDEATKISEKGIN 22535

22534 IFIT (0) 22523

22430 EMFMAASETTSSTMEWAMTELLRSPESMTKVKAELGRVIGEKRKLEESDLDDLPYLH 22260

22259 AVVKETLRLHPAAPFLVPRRAVEDTKFMGYHIPKGTQVFVNVWAIGREAETWDDALCFKP 22080

22079 ERFVDSNMDYKGQNFEFIPFGAGRRICVGIPLAYRVLHFVLGSLLHHFDWQLERNVTPE 21903

21902 TMDMKERRGIVICKFHPLKAVPKIKPIST* 21813

 

>CYP76A12 gi|147791648|emb|CAN77399.1| AM476034.2 (871-3002)

44% to 76G1

CAAP02005158.1a 16011-13907 (-) strand 100% match

MVDWASNILLWCIILVIPVLFLLLHRRRSGSVRLPPGPPGWPVFGNMFDLGAMPHETLAGLRHKYGDVVW

LNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPYGPHWRSLRRLMTMEMLVT

KRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNLMLSCDLLHPGSKEGSEFF

EVMVRVMEWPGHPNSADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQERIKRGPAAEDHKKDFLDVL

LDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECMAKVKAELGRVVGASGKLE

ERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVNVWAIGREAELWEEPSSFK

PERFLDLNHIDYKGQHFZLIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDWQLDSSITLETMDMRENLA

MVMRKLEPLKALPKKVSL

 

>CYP76A13 gi|147791649|emb|CAN77400.1| AM476034.2 (10417-12389)

45% to 76G1

CAAP02005373.1b 19219-17303 (-) strand, 2 aa diffs

Missing N-term seq

Nearly identical to adjacent gene CAN77399.1, 3 aa diffs

MXDWASNILLWCIILVIPVLFLLLXXRRSGSVRLPPGPPG

WPVFGNMFDLGA

MPHETLAGLRHKYGDVVWLNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPY

GPHWRSLRRLMTMEMLVTKRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNL

MLSCDLLHPGSKEGSEFFEVMVRVMEWSGHPNFADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQER

IKRGPAAEDHKKDFLDVLLDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECM

AKVKAELGRVVGANGKLEERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVN

VWAIGREAELWEEPSSFKPERFLDLNHIDYKGQHFELIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDW

QLDSSITLETMDMRENLAMVMRKLEPLKALPKKVSL*

 

>CYP76A14P CAAP02005373.1b pseudogene 64% to CAN77399.1

5191 MERASNFLLYLIVISSSAMSFMLCRRKSGFNRLPPRPIGWPILSNMLDLGTMLHQTLAGLRHK 5003

5002 YGDVVWLRLGAIKTMVILSSKAAGELFKNHDLSFADRSIGETMRVHEYNEGSLALVPYGP 4823

3499 LTTDMFTVRRINETANVRRKCVDDMLLWIEKEALGVNGEASSVHVAEAVFLSNMLGNL 3326

3325 MLSRDVLDLRSEEGSEFFTIMSNLTEWSGHPNLADFFPWLGWLNL*GLRK 3176

2669 KSQQRDLGKAMEMASGFVNERMKKQRTEGTKRKDFLDVLLEFEGNGRDEPAKTSDRDVNI 2490

2489 FIL (0) 2481

2336 EIFMAGSETSSSIVEWVMTELLRNPKSMSKVKDELARVVGADRNVEESDIDELQYLQ 2166

2165 AVVKETLRLHPPIPFLIPRSAIQDTSFMGYHIPKDTQVLVNAWAIGRDPGS*EDPSSFKP 1986

1985 ERFLDSKKIDYKGQNFE 1935

 752 LIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLEGNVTPETMDMKEKWGLVMLESQP 573

 572 LKAVPKKLT* 543

 

>CYP76A15 gi|147774514|emb|CAN76783.1| 42% to 76G1

CAAP02013124.1 = CAN76783.1

CAAP02000672.1b 113501-112101 added missing parts

113663 MELSTASIVFWSCFFSAALLLFLRLIKFTKGSTKSTPPGPQGWPIFGNIFDLGT

1403 LPHQTLYRLRPQHGPVLWLQLGAINTMVVQSAKAAAELFKNHDLSFSDRNVPFTLTAHNY 1224

1223 DQGSMALGKYGPYWRMIRKVCASELLVNKRINEMGSLRRKCVDDMIRWIEEDAAKSGAEG 1044

1043 RAGEVELPHFLFCMAFNLIGNITLSRDVVDIKSKDGHEFFQAMNGVVEWAGRPNIADFFP 864

 863 LLKRLDPLGMMRNMVRDMGQALNLIARFVKERDEERQSGMVREKRDFLDVLLECRDDEKE 684

 683 GPHEMSDNKVKIIVL (0) 639

 413 EMFFAGSDTTSSTLEWAMTELLRRPESMRKAQEELDRVVGPHGKVEESDIDQLLYLQ 243

 242 AVVKETLRLHPPIPLLLPRNALQDTNFMGYFVPKNTQVFVNAWAIGRDPDAWKEPLSFKP 63

  62 DRFLGSNLDYKGQNFEFIPF 3

     GSGRRICIGISLANKLLPLALASLLHCFDWELGGGVTPET 111981

111980 MDMNERVGITVRKLIPLKPIPKRRTV* 111900

 

>CYP76A15-de2b CAAP02000672.1b-de2b pseudogene, C-term

111539 KEERFIDSDKQ*GDGFVLMASLAGIPSTLAHKVMHLVLLGLLLHRFDWDLEWDIFPK 111369

 

>CYP76A16 gi|147774515|emb|CAN76784.1| 49% to 76G6, 44% to 76G1

CAAP02000672.1a 105702-104085 (-) strand 100% match

MSSLLWWSAFFSAALLVLLRRIKPRKGSTKLRPPGPQGWPILGNIFDLGTMPHQTLYRLRSQYGPVLWLQ

LGAINTVVIQSAKVAAELFKNHDLPFSDRKVPCALTALNYNQGSMAMSNYGTYWRTLRKVCSSELLVIKR

INEMAPLRHKCVDRMIQWIEDDATMARVQGGSGEVEVSHLVFCVAFNLIANLMLSRDFFDMKPKEGNEFY

BAMNKIMELAGKPNTADFFPFLKWLDPQGIKRNMVRELGRAMDIIAGFVKERVEERQTGIEKEKRDFLDV

LLEYRRDGKEGSEKLSERNMNIIILEMFFGGTETTSSTIEWAMTELLRKPKSMRKVKEELDRVVGPDRKV

EESDIDELLYLQAVVKETLRLHPALPLLIPRNALQDTNFMGYFIPQNTQVFVNAWSIGRDPEAWHKPLSF

KPRRFLGSDIDYKGQNFELIPFGSGRRMCIGMPFAHKVVPFVLASLLHCFDWELGSNLTPETIDMNERVG

LTLRKLVPLKAIPRKRIVRDR

 

>CYP76A17P CAAP02000672.1c pseudogene 93% to CAAP02005373.1b

117421 DLRSKEGSEFFTIMSNLTEWSGHPNLSDFFPWLGWLDLQGLRKNMERDLGKAMEMASGF 117245

117244 VNERMKKQRTEGTKRKDFLDVLLEFEGNGKDEPAKISDRDVIIFIL (0) 117107

117000 EIFLAGSETSSSIVEWAMTELLRNPKSMSEVKDELARVVGADRNVEESDIDELQYLQAVV 116821

116820 KETLRLHPPIPFLILRSAIQDTSFMGYHIPKDTQVLVNARAIGRDPGSWEDPSSFKPERF 116641

116640 LDSKKIEYKGQNFELIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLKGNVTPETMD 116461

116460 MKEKWGLVMRKSQPLKAVPKKLT* 116389

 

CYP76F subfamily (5 genes) [2 pseudogenes]

 

>CYP76F2 gi|7406712|emb|CAB85635.1| putative ripening-related P-450 enzyme

same as AJ237995

MELLSCLLCFLAAWTSIYIMFSARRGRKHAAHKLPPGPVPLPIIGSLLNLGNRPHESLANLAKTYGPIMT

LKLGYVTTIVISSAPMAKEVLQKQDLSFCNRSIPDAIRAAKHNQLSMAWIPVSTTWRALRRTCNSHLFTS

QKLDSNTHLRHQKVQELLANVEQSCQAGGPVDIGQEAFRTSLNLLSNTIFSVDLVDPISETAQEFKELVR

GVMEEAGKPNLVYYFPVLRQIDPQGIRRRLTIYFGRMIEIFDRMIKQRLQLRKIQGSIASSDVLDVLLNI

SEDNSNEIERSHMEHLLLDLFAAGTDTTSSTLEWAMAELLHNPETLLKARMELLQTIGQDKQVKESDISR

LPYLQAVVKETFRLHPAVPFLLPRRVEGDADIDGFAVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLG

LDMDVKGQNFELIPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEERYGISLQKAQ

PLQALPVRV

 

>CYP76F10P CAAP02001054.1  75% to 76F2

34678 MDLMSYLLCLLVAWTSIYIVVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPH 34517

34516 ESLANLAKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQI 34337

34336 SMVWLPVSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQ 34157

34156 EAFRTTLNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMEEAAKPNLADYFPVVRKIDPQG 33977

33976 IRRRMAIHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDH 33797

33796 LLL (0)

32324 DLFVAGTDTTANTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 32151

32150 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 31971

31970 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSNDWKLEDGLTPENM 31791

31790 NMEEKFGFTLQKAQPLRVLPIHV 31722

 

>CYP76F10P gi|147816105|emb|CAN66326.1| AM481161.2

60% to 76C1

2 aa diffs to 76F10P

4841 MDLMSYLLCLLVAWTSIYIXVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 4662

4661 AKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQISMVWLP 4482

4481 VSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQEAFRTT 4302

4301 LNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMXEAAKPNLADYFPVVRKIDPQGIRRRMA 4122

4121 IHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDHLLL 3951

2471 DLFVAGTDTTSNTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 2298

2297 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 2118

2117 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSYDWKLEDGLTPENM 1938

1937 NMEEKFGFTLQKAQPLRVLPIHV 1869

 

>CYP76F11 CAAP02001054.1  pseudogene 71% to 76F2

10316 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPH 10155

10154 ESLANLAKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQL 9975

 9974 SVVWLPASTKWRTLRK 9927

      NSHIFTSQKLDSNAHL (this seq is inverted)

 9885 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFKELVRNMMEEAAKPN 9715

      (gap)

 9713 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 9537

 9389 DLFAAGTDTTTNTLEWAMA 9333

 9333 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAD 9154

 9153 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPFGA 8974

 8973 GRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQNAQPLRALPT 8806

 8805 LV* 8797

 

>CYP76F11 gi|147802689|emb|CAN72997.1| AM480526.2 60% to 76C1

3 aa diffs to 76F11

11345 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 11166

11165 AKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQLSVVWLP 10986

10985 ASTKWRTLRK 10956

10911 NSHIXTSQKLDSNAHL 10958 (this seq is inverted)

10914 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFXELVRNMMEEAAKPN 10744

      (gap)

10742 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 10566

10415 DLFAAGTDTTTNTLEWAMA 10359

10359 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAE 10180

10179 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPF 10006

      XAGRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQKAQPLRALPTLV 9826

 

>CYP76F12 CAAP02002347.1d = CAN79423.1 86% to CYP76F2

CAN79423.1 has 2 aa diffs and a small deletion

24772  MEMLSCLLCFLVAWTSIYIMFSVRRGSQHTAYKLPPGPVPLPIIGNLLNLGNRPHESLAE  24951

24952  LAKTYGPIMTLKLGYVTTIVISSAPMAKEVLQKQDLSFCNRFVPDAIRATNHNQLSMAWM  25131

25132  PVSTTWRVLRKICNSHLFTTQKLDSNTHLRHHKVQELLAKVEESRQAGDAVYIGREAFRT  25311

25312  SLNLLSNTIFSVDLVDPISETVLEFQELVRCIIEEIERPNLVDYFPVLRKIDPQGIRRRL  25491

25492  TIYFGKMIGIFDRMIKQRLQLRKMQGSIATSDVLDTLLNISEDNSNEIERNHMEHLLL (0) 25677

       DLFVAGTDTTSSTLEWAMAE  25851

25852  LLHNPEKLLKARVELLQTIGKDKQVKESDITRLPFLQAVVKETFRLHPVVPFLIPHRVEE  26031

26032  DTDIDGLTVPKNAQVLVNAWAIGRDPNIWENPNSFVPERFLELDMDVKGQNFELIPFGAG  26211

26212  RRICPGLPLATRMVHLMLASLIHSCDWKLEDGMTPENMNMEDRFGITLQKAQPLKAIPIRV*  26397

 

>CYP76F13P CAAP02002347.1c pseudogene missing N-term and insertion in exon 2

93% to 76F2 CAB85635.1

11894  KVQELLANVEQRCQAGGPVDIGREAFRTSLNLLSNAIFSVDLVDPISETAQEFKELVRGV  11715

11714  MEEAGKPNLVDYFPVLRQIDPQGIRRGLTIYFGRMIEIFDRMIKRRLRLRKMQGSIASSD  11535

11534  VLDILLNISEDNSNEIERSHMEHLLL (0)

       DLFVAGTDTTSSTLEWAM  11250

10487  AELLYNPEKLLKARMELLQTIGQDKQVKESDITRLPYVQAVVKETFRLHPAVPFLL  10320

10319  PRRVEEDTDIQGFTVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLGLDMDVKGQNFEL  10140

10139  IPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEESFGLSLQKAQPL  9960

9959   QALPVRV  9939

 

>CYP76F14 CAAP02002347.1b 98% to 76F2 CAB85635.1 7 aa diffs

6638  MELLSCLLCFLAAWTSIYIMFSARRGRKHAAHKLPPGPVPLPIIGSLLNLGNRPHESLAN  6459

6458  LAKTYGPIMTLKLGYVTTIVISSAPMAKEVLQKQDLSFCNRSIPDAIRAAKHNQLSMAWL  6279

6278  PVSTTWRALRRTCNSHLFTPQKLDSNTHLRHQKVQELLANVEQSCQAGGPVDIGQEAFRT  6099

6098  SLNLLSNTIFSVDLVDPISETAQEFKELVRGVMEEAGKPNLVDYFPVLRRIDPQSIRRRL  5919

5918  TIYFGRMIEIFDRMIKQRLQLRKNQGSIASSDVLDVLLNISEDNSSEIERSHMEHLLL (0) 5745

5590  DLFAAGTDTTSSTLEWAMAELLHNPETLLKARMELLQTIGQDKQVKESDISRL  5432

5431  PYLQAVVKETFRLHPAVPFLLPRRVEGDADIDGFAVPKNAQVLVNAWAIGRDPNTWENPN  5252

5251  SFVPERFLGLDMDVKGQNFELIPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGV  5072

5071  TPENMNMEERYGISLQKAQPLQALPVRV  4988

 

>CYP76F15 CAAP02002347.1a 95% to 76F2 CAB85635.1

2737  MELLSCLLCFLAAWTSIYIMFSARRGSKHTAYKLPPGPVPLPIIGNLLNLGNRPHESLAN  2558

2557  LAKTYGPIMTLKLGYVTTIVISSAPMAKEVLQKQDLSFCNRSIPDAIRAAKHNQLSMAWL  2378

2377  PVSTTWRALRRTCNSHLFTSQKLDSNTHLRHQKVQELLANVEQSCQAGGPVDIGREAFRT  2198

2197  SLNLLSNAIFSVDLVDPISETAQEFKELVRGVMEEAGKPNLVDYFPVLRRIDPQGIRRRL  2018

2017  TVYFGRMIEIFDRMIKQRLQLRKIQGSIASSDVLDVLLNISEDNSNEIERSHMEHLLL (0) 1844

1689  DLFVAGTDTTSRTLEWAIAELLHNPEKLLKSRMELLQTIGQDKQVKESDITRL  1531

1530  PYVQAVVKETFRLHPAVPFLLPRRVEEDTDIEGFTVPKNAQVLVNAWAIGRDPNTWENPN  1351

1350  SFVPERFLGLDI  1315

1315  MDVKGQNFELIPFGAGRRIRPGLPLAIRMVHLMLASLIHSYDWKLQDGVTPENMNMEERY  1136

1135  GISLQKAQPLQALPVRV  1085

 

CYP76G subfamily (1 gene) [0 pseudogenes]

 

>CYP76G6 CAAP02000370.1 67% to 76G1, 100% to CAN76624.1 = AM452176.2

9798  MEYEMVGIVIALVLWAVWAMVTERRHRRLEELGQL

9693  PPGPRSWPVVGNIFQLGWAPHVSFAKLAGKHGPIMTLWLGSMSTVVISSNEVAREMFKNH  9514

9513  DVVLAGRKIYEAMKGDRGNEGSIITAQYGPQWRMLRRLCTSEFFVTSRLDAMRGVRGGCI  9334

9333  DRMVQFVTEAGTSGTHAIDVGRFIFLMAFNLIGNLMFSKDLLDPKSERGAEFFYHAGKVM  9154

9153  ELAGRPNVADFLPILRWFDPQGIRRKTQFHVERAFAIAGGFIKERMETMAKGSGEAK  8983

8982  SKDFLDVLLEFRGDGVEEPSRFSSRTINVIVF (0)

8787  EMFTAGTDTTTSTLEWAMAELLHTPRILNKVQAELRSVVKPGSKLEEKDMENLPY  8623

8622  LIAVIKETLRLHPPLPFLVPHMAMNSCKMLGYCIPKETQVLVNVWAIGRDPKTWKDPLVF  8443

8442  MPERFLEPNMVDYKGHHFEFIPFGSGRRMCPAVPLASRVLPLALGSLLHSFNWVLPDGLN  8263

8262  PKEMDMTERMGITLRKSVPLRAMPVPYKGIQTQVFA* 8152

 

CYP76T subfamily (10 genes) [17 pseudogenes]

 

>CYP76T7P CAAP02001503.1a pseudogene  

 6227 GDILMLGDKPHQSLTNLSKTYGPVMSLKLGSISTIVISSPETAKEVLHRNDQAFSGR 6397

10877 DLFLAGTDTTSDTIEWAMAELLHNPEKMVKAQRELQEVLGKDGIVQESDISKLPYLQAIV 11056

11057 KETFRLHPLAPLLVPYKAETDVKICGFTVPKNSQVLINAWDIGCDPSVWSNPNAFMPERF 11236

11237 LGCDIDVKGRDFELIPFGAGRRICLALPLAHRMVHLILVSLLHSYAWKLDDGMKPEDMDM 11416

11417 NEKLGFTLQKAQPLRAIPIEV* 11482

 

>CYP76T7P-de2b CAAP02001503.1a-de2b pseudogene

11920 IHRLPSTWPNPNAFMPERFLECDINVKGRDFELIPFGARRRICPGMPLAHRMVHLMLTYL 12099

12100 LYSHAWKLEDGMKPENMDMSEKFGLTLQKAQPLRAIPINV* 12222

 

>CYP76T8P-de2b CAAP02001503.1b-de2b pseudogene

24925 INLMLASLLRSFY*ELEDAMTPEDIDMSEKFGLTLQK 25035

 

>CYP76T8P CAAP02001503.1b pseudogene

25676 DLFIAGTDTISSTLEWAMAELLCNPEKMAKAQKEIRGVLGNEGIVQESDISKLHYLQAIV 25855

25856 KETFRLHPPGPLLLPHKAETDVEICGFTVPKNSQVLVNVWAIGRDPSTWLNPNAFVPERF 26035

26036 LGFDIDVKRRDFELIPFGAGRRICLGLPLAHRMVHLILASLLHSYAWKLDDGMKPADMDI 26215

24922 IINLMLASLLRSFY*ELEDAMTPEDIDMSEKFGLTLQKA 25038

 

>CYP76T8P-de2c CAAP02001503.1b-de2c pseudogene

27409 MDMNDKLGLTLHKVQPLRTIPIK 27477

 

>CYP76T9P CAAP02001503.1c pseudogene

36663 DLFVAGTDTTSDIIEWALAELLHNPETMVKAQRELQEVLGKDGIVQELDI 36812

38848 KLPYLQGIVKETFRLHPPAPLLVLVKAETDVEICGFTVPKNSQVLINAWAIGRDPSIWLN 39027

39028 PNAFVLERFLGCDIDVKGRDFELIPFGAGRRICLGLPLAHRMVHLILVSLLHSYTWKLDD 39207

39208 GMKPENMDMNEKLGFTLQ 39261

 

>CYP76T10 CAAP02001503.1d 70% to CAAP02001585.1e

43606 MDYITVLLLLSFVWTCIHLLKLSPIGRKPSTASLPPGPRPFPIIGNILKLGDKPHQS 43776

43777 LTNLSKTYGPVMSLKLGSISTIVVSSPETAKEVLHRNDQAFSGREVLGAVKAHNHHESSV 43956

43957 IWSPTSAYWRKIRKICTREMFSVQRLNASQGLRKKIVQELLDHVEECCGRGCAVDIGAAT 44136

44137 FTASLNLLSNTIFSTNLAHHDSTFSQEFKDIVWGVMEEAGKPNFADYFPAFRLIDPQCIQ 44316

44317 RNMKVHFGKLIDIFDGLITQRVQSKASSASNDVLDAFLNLTKENNQEWSCNDIIHLLM (0) 44490

45403 DLFVAGTDTTSDTIEWAMAELLHNPETMVKAQRELQEVLGKDGIVQESDISKLPYLQGIV 45582

45583 KETFRLHPPAPLLVPHKAETDVEICGFTVPKNSQVLINAWAIGRDPSIWSNPNAFVPERF 45762

45763 LGCDIDVKGRDFELIPFGAGRRICLGLPLAHRMVHLILASLLHSYAWKLDDGMKPEDMDM 45942

45943 NEKLGFTLQKAQPLRAIPIK 46002

 

>CYP76T11 CAAP02001503.1e 70% to CAAP02001585.1e

47426 MDYITVLLLLSFVWTCIHLLKLSPTGRKPSTASLPPGPRPFPIIGNILKLGDKPHQSL 47599

47600 TNLSKTYGPVMSLKLGSVSTIVISSSETAKEVLHRNNQAFSGRVVLDAVKAHNHHESSVV 47779

47780 WSPASAYWRKIRKICTREMFSVQRLEASQGLRRKIVQELLDHAEECCGRGCAVDIGAATF 47959

47960 TASLNLLSNTIFSINLVHHASTFSQEFKDIVWRVMEDAGRPNFADYFPAFKLIDPQGIQQ 48139

48140 NMKIHLDKLIHIFEGIINQRLQSKASSASNDVLDAFLNLTEENNQEWSCRDIIHLLM (0) 48310

49483 DLFLAGTDTTSGTIEWAMAELLHNPEKMAKAQRELQEVLGKDGIVQESDISKLPYFQAIV 49662

49663 KETFRLHPPGPLLAPHKAESDVEIRGFTVPKNSQVLVNVWAIGRDPSTWSNPNAFVPERF 49842

49843 LGCDIDVKGRDFELIPFGAGRRICLGLPLAHRMVHLILASLLHSYAWKLDDGMKPADMDM 50022

50023 NEKLGLTLHKVQPLRAIPIK 50082

 

>CYP76T12 CAAP02001503.1f 71% to CAAP02001585.1e, 53% to 76C2

53710 MDYITFLLLLSFVWTCIHLLKLSPIGRKPGTASLPPGPRPLPIIGNILKLGDKPHRS 53880

53881 LANLSKTYGPVMSLKLGSIATIVISSSETAKEVLHRNDQAFSSRTVPDAVRAHNHHESSV 54060

54061 VWVPASVHWRKIRKICTREIFSVQQLDASQGLRRKIVQELLDHVEECCSRGCAVDINGAV 54240

54241 FTASLNLLSNTIFSINLAHHGSNFSQEFKNIARGVMEGVGRPNFVDYFPAFRLIDPQGIR 54420

54421 WNMKIYFNKLFHIFDGIINQRLQSKTSSASKDVLDALLNLTKENDNEWSCSDIKHLLL (0) 54594

54928 DLFVAGTDTTSSTVEWAMAELLCNPEKIAKAQKEIRGVLGNEGIVQESDISKFPYLQSIV 55107

55108 KETFRLHPPAPLLVPHKAETDVEICGFTIPKNSQVLVNAWAIGRDPSTWPNPNAFMPERF 55287

55288 LECDIDVKGRDFELIPFGAGRRICPGMPLAHRMVHLMLASLLYSHAWKLEDGMKPENMDM 55467

55468 SEKFGLTLQKAQPLRAIPIK 55527

 

>CYP76T12-de2b CAAP02001503.1f-de2b

57157 DIFVAETNTTWSTVKWAMAELIRNPETMAK

      PKNLQILVNG*AIGRD 57478

57479 PSLWSHPNVFVPERILERDIDARGQDFELIPFSS*RSIGSGMSLAHRMVHLV*ASLIHSF 57658

57659 GWE 57667

 

>CYP76T12-de2c CAAP02001503.1f-de2c

63550 DGMKPEDMDMSEKFGLTL*KA 63612

 

>CYP76T12-de2d CAAP02001503.1f-de2d

78297 TIHLMLASLLCSFC*ELGDAMIPKDIDMSEKFGFTLQK 78410

 

>CYP76T13P CAAP02001585.1a pseudogene fragment C-term, 3 aa diffs to CAN81641.1

2 at a boundary

4360 DLFSAGTDTTSSTVEWAMAELLNNPNLMAKARSELGKVVGKEKMVEESDISKLPYLQAVV 4539

4540 KETFRLHPPVPFLVPRKTEMK 4602

4669 RDFTIWSNPNSFVPERFLECEIDVKGRDFRLIPFGAGRRICPGLLLGHRMVHLMLASLLH 4848

4849 SFDWKLEDGLKPEDMDMTEKFGFTLRKAQPLQAVPIKP 4962

 

>CYP76T14P CAAP02001585.1b pseudogene fragment C-term, 2 aa diffs to CAN81641.1

5154 WSNPNSFVPERFLECEIDVKGRDFQLIPFGAGRRICPGLLLGHRMVHLMLASLLHSFDWK 5333

5334 LQDGLKPEDMDMTEKFGLTLRKAQPLQAVPIKP 5432

 

>CYP76T15 gi|147846593|emb|CAN81641.1| 50% to 76C1, 75% to CAN75686.1

CAAP02001585.1c 10864-13399 (+) strand 2 aa diffs, one frameshift

MDYTTLLLLFSFVWSCVKVLTIGFTNRKSGVARLPPGPRPFPIIGNLLKLGEKPHQSLTILSKTYGPLMS

LKLGSTTTIVVSSSEAAQEVLNKNDQAFSSRTVLNAIQVADHHHFSIVFLPASAHWRNLRKICSKQMFSS

HRVEAGQAMRENIVQQLLGHAQESCSSGRAVDIGRATFTTTLNLLSNTIFSVNLAHYNSNFSQEFKDLIW

SIMEEAGKPNLADFFPVLRLVDPQGILKGMTVCFNKLVEVFDGFIEQRLPLKASSANNDVLDGLLNLDKQ

HDHELSSNDVRHLLVDLFSAGTDTTSSTVEWAMAELLNNPNLMAKARSELGKVVGKEKMVEESDISKLPY

LQAVVKETFRLHPPVPFLVPRKTEMKSEILGYAVPKNAHVLVNVWA &

IGRDSTIWSNPNSFVPERFLECEIDVKGRDFQLIPFGAGRRICPGLL

LGHRMVHLMLASLLHSFDWKLEDGLKPEDMDMTEKFGFTLRKAQPLQAVPIKP

 

>CYP76T15-de3b CAAP02001585.1c-de3b pseudogene fragment C-term, 3 aa diffs to CAN81641.1

14468 IVHLMLASLLHSSDWKLEDGLKPEDMDMTEKFGFTLGKAQPLQAVPIKP 14614

 

>CYP76T16 gi|147846594|emb|CAN81642.1| AM476785.2

45% to CYP76C1, 75% to CAN75686.1

2 aa diffs to adjacent gene CAN81641.1, one frameshift

MDYTTLLLLFSFVWSCVKVLTIGFTNRKSGVARLPPGPRPFPIIGNLLKLGEKPHQSLTILSKTYGPLMS

LKLGSTTTIVVSSSEAAQEVLNKNDQAFSSRTVLNAIQVADHHHFSIVFLPASAHWRNLRKICSKQMFSS

HRVEAGQAMRENIVQQLLGHAQESCSSGRAVDIGRATFTTTLNLLSNTIFSVNLAHYNSNFSQEFKDLIW

SIMEEAGKPNLADFFPVLRLVDPQGILXXMTVCFNKLVEVFDGFIEQRLPLKASSANNDVLDGLLNLDKQ

HDHELSSNDVRHLLVDLFSAGTDTTSSTVEWAMAELLNNPNLMAKARSELGKVVGKEKMVEESDISKLPY

LQAVVKETFRLHPPIPFLVPRKTEMKSEILGYAVPKNAHVLVNVX &

XIGRDSTIWSNPNSFVPER

FLECEIBVKGRDFQLIPFGAGRRICPGLLLGHRMVHLMLASLLHSFDWKLEDGLKPEDMD

MTEKFGFTLRKAQPLQAVPIKP

 

>CYP76T17 CAAP02001585.1d 82% to CAN81641.1, 76C-like, 80% to CAAP02002145.1a

45370 MNYSILLLLLSFLWSCINAPISALGSSKRKFGMARLQPGPRPFPIIGNLLELGDKPHQSL 45549

45550 TTLSKTYGPLMSLKLGSTTTIVISSS 45627

45626 VLNKNDQAFSSRAVLNAVQAVNHHKFSVVFLPASAHWRNLRKICSTQMLSLPRIDACRA 45802

45803 LRRRIVQQLLDHAHESCTSSRAVDIGRAASTTALNLLSNTIFSVDLAHYDSNFSQEFKDL 45982

45983 VWSIMEEAGKPNLADFFPGLSFIDPQGIQKKMTANFYKLVKVFDGIIDQRLQLKASSANN 46162

46163 DVLDSLLNLNKQHDHELSSNDIKHLLV 46243

47105 DLFSAGTDTTSSTVEWAMAELLNNPKAMAKARSELDEVLGKGMIVEESDISKLPYLQAVV 47284

47285 KETFRLHPPVPFLVPRKTEMESEILGYAVPKNAQVLVNVWAIG

47414 RDPMLWTNPNSFVPERFLECEIDVKGRDFQLIPFGAGRRICPGLLLGHRMVHLMLASLLH 47593

47594 SFDWKLEDGMKPEDMDMTEKFGFTLRKAQPLQAVPIKP 47707

 

>CYP76T17-de3b CAAP02001585.1d-de3b pseudogene fragment C-term, 45% to 76C7

42874 NFQFISFGAGRRICLGLLLAHGMV 42945

42945 HFMSASLFHSFDR*LEHWMKPE 43010

 

>CYP76T18 CAAP02001585.1e 64% to CAAP02002145.1a, 54% to 76C4

61448 MDYITFLLLLPLLWAFAHVLNFSPFPQKYSAKARLPPGPRPLPIVGNLFKLRDQPHKSLA 61627

61628 DLSKIYGPIMFLKLGSIPTIIISSSKTAQQVLQKNDQPLSNRVVPDAVRALDHHQNSMVW 61807

61808 LPASARWRNIRKTMIMHFFSLQRLDATQALRRTKVQELLDHAHQSCSRGEAVNIGRAAFT 61987

61988 TSLNLLS 62008

65717 ANTVFSTDLVHHDSKFSQEIKDIVWGVMEEIGRLNVADYFPVFRLLDPQGIRRAMKIYFS 65896

65897 KLSDIFYGIIDQRLKSEASSVASNDVLDALLNLTKEDNHEWSFSDTIHLLL 66049

66356 DTFLAGTDTTSSTVEWAMAELISNPKTMAKAQRELQEVLGKDGIVQESDISKLPYLQSVV 66535

66536 KETLRLHPPGPLLLPHKAQADVEICGFTVPKNSQ

66638 VLVNAWAIGRDPNTWTNPNAFVPERFQG 66721

66722 SEIDVKGRDFEVIPFGSGRRMCPGMPLAHRMVHLMLASLLHSFDWKLEDGLKPEDMDMSE 66901

66902 KFGITLQKAKPLRAIPIRI* 66961

 

>CYP76T19 CAAP02002145.1c (-) strand pseudogene 81% to CAN75686.1

51318 MDITALVLLLLLPCFVWLCFHFLILGSTHRKSFQARLPPGPRPLPIIGNLLEFGDKPHQS 51139

51138 LTTLSKTYGPLMSLKLGR 51085

51083 SPETAQQVLT*KDQAFSGRTVPNVFQVANHHHFSMGFLPASAHWDNLRKICRMQIFSPQR 50904

50903 VDAFHGLRRKVVQQLLDHAHESCSSGQAVDLGRAAFTTALNMLSNTFFSVDLAHYDSNLS 50724

50723 QGFKDLIQSLIVESEKPNLADFFPDLRLVDPQGIQRRLTVSFNKLVDIFDGFFNQRLMLK 50544

50543 ASSTDNDVLDGLLNLNKQYDHELSCNGIKHLLL (0) 50445

50254 DLFPAGTDTTASTIEWAMAELLKNPEAMAKAREELSEVVGKDKI 50123

50121 IEESDISKLPYLQAVVKETFRLHPTIPLLVPRKVETDLEI 50002

50001 LGYAVPKNAQVLVNAWAIGKDSRTWSNPNSFEPERFLESEIDVKGRDFQLLPFSGGRRIC 49822

49821 PGLLFGHRMVHLMLASLLHSFDWKLEDGMKPEDMDMDEKFGFALRKVQPLRVVPTKP 49651

 

>CYP76T20 gi|147772136|emb|CAN75686.1| 55% to 76F2

CAAP02002145.1b 27232-24294 (-) strand 5 aa diffs

MDYTPLVLLLLLPCFVWLCFHFLILGSTHRKSFQARLPPGPRPLPIIGNLLELGDKPHHSFTTLSKKYGP

LMSLKLGSITTIVISSPETAQQVLNKKDQTFSGRTVPNAIQVANHQHFSIGFLPASAHWRNLRKICSMQI

FSLQRVDAFHGLRRKVVQQLLDHAHESCSSGRAVDIGRTAFTIALNLLSNTVFSVDLAHYDSNLSQEFKE

LIWSILVEVGKPNLADFFPGLRLVDPQGIHKRMSVYFNKLFDVFDSFINQRLQLRASSTDNDVLDALLNL

NKQHDHELSCNDIRHLLVDLFSAGTDTTSSTIEWAMAELLNNPKAMAKARDELSQVVGKDRIVEESDISK

LPYLHAVVKETFRLHPPAPFLLPRKAEMDSEILGYAVPKNAQVIINVWAIGRDSKTWSDPHSFGPERFLE

CDIDVKGRDFQLIPFGAGRRICPGLLLGRRMVHLVLASLLHSFDWKLEGGMKPEDMDMSETFGFSVRKAQ

PLRVVPIKP

 

>CYP76T20-de1b CAAP02002145.1b-de1b (-) strand pseudogene N-term

31066 MDCTPLVFLLLLPCFVWLCFHFLILGSTHRKFFQARLPPGPRPLPIIGNLLELGDKPHQS 30887

30886 FTTLSKTVYVLLS 30848

 

>CYP76T21 CAAP02002145.1a (-) strand

12947 MDYTPLVLLLLLPCFVWLCFHFLILGSTHRKSFQARLPPGPRPLPIIGNLLELGDKPHQS 12768

12767 FTTLSKTYGPLMSLKLGSTTTIVISSPKTAQEVLNKKDQAFASRTVLNAIQIQDHHKFSM 12588

12587 VFLPASAHWRNLRKICSMQIFSPQRVEASQDLRRKVVQQLLEHARESCNSGRAVDVGRAA 12408

12407 FTTTLNLLSNTFFSVDLAHYDSNLSQEFKDLIWSIMVEAGKPNLADFFPGLRLVDPQGIQ 12228

12227 KRMTVYFNKLLDVFDGFINQRLPLKASSPDNDVLDALLNLNKQHDHELSSNDIRHLLT (0) 12054

11373 DLFSAGTDTISSTIEWAMAELLNNPKAMAKAQDELSQVVGKDRIVEESDVTKLPYL 11206

11205 QAVVKETFRLHPPAPFLVPRKAEMDSEILGYAVPKNAQVLVNVWAIGRDSRTWSNPNSFV 11026

11025 PERFLECQIDVKGRDFQLIPFGAGRRICPGLLLGHRMVHLMLASLLHSFDWKLEDSMRPE 10846

10845 DMDMSEKFGFTLRKAQPLRAVPTKP 10771

 

>CYP76T22P gi|147772930|emb|CAN69411.1| AM436706.1

64% to 76C1, 66% to 76F2

CAAP02011278.1 exon 2 with frameshift

same as CAO24420.1

90% to 76T12

3951 DLFVAGTDTISSTVEWA

4002 MAELLSNPEKMAKAQKEIRGVLGNEGIVQESDISKFPYLQSIVKETFRLHPPAPLLVPHKAATDVEICGF

     ILPENSQALVNAWAIGRDPSTWSNPNAFMPERFLECDIDVKGRDFELIPFGVGRRICPGMPLAHRMVHLM

     LASLLH 4439

4438 INSLDWKLEDGMKPENMHMSEKFGITLQKAQPLRAIPMKV* 4560

 

>CYP76T22P gi|157349274|emb|CAO24420.1|

NQRLQLPDALLQITKENGNEWSCSDVIHLVL

DLFVAGTDTISSTVEWAMAELLSNPEK

MAKAQKEIRGVLGNEGIVQESDISKFPYLQSIVKETFRLHPPAPLLVPHKAATDVEICGFILPENSQALV

NAWAIGRDPSTWSNPNAFMPERFLECDIDVKGRDFELIPFGVGRRICPGMPLAHRMVHLMLASLLH

 

>CYP76T22P AM436706.1

7243 DLFXAGTDTISSTXEWAMAELLXNPEKMAKAQKEIRGVLGNEGIVQESDISKFPYLQSIV 7422

7423 KETFRLHPPAPLLVPHKAATDVEICGFILPENSQALVNAWAIGRDPSTWSNPNAFMPERF 7602

7603 LECDIDVKGRDFELIPFGVGRRICPGMPLAHRMVHLMLASLLH 7731

7730 INSLDWKLEDGMKPENMHMSEKFGITLQKAQPLRAIPMKV* 7852

 

>CYP76T22P-de2b C-term fragment AM436706.1

6492 INLMLASLLRSFY*ELEDAMTPEDIDMSEKFGLTLQKA 6605

 

>CYP76T23P AM436706.1

8041 DYFPAFRL*IYPHGTRRIMKSLFLVIFFM 8127

8127 GIINQPLQ 8150

8151 LLDGLLKITKENENG*SWSDLIHLLL 8228 END OF EXON1

8405 INLMLASLPRSFY*ELEDAMTPEDIDMSVKFRLT*QKA 8518 C-TERM FRAGMENT

 

CYP76Y subfamily (3 genes) [1 pseudogene]

 

>CYP76Y1 CAAP02000212.1  93% to CAN66456.1

140310 MELNSFLLLCMPLVLCLFFLQFL

140379 RPSSHATKLPPGPTGLPILGSLLQIGKLPHHSLARLAKIHGPLITLRLGSITTVVASSPQ 140558

140559 TAKLILQTHGQNFLDRPVPEAIDSPQGTIAWTPVDHVWRSRRRVCNNHLFTSQSLD 140726

140727 SLQHLRYKKVEQLLQHIRKHCVSGTPVDIGLLASATNLNVLSNAIFSVDLVDPGFESAQD 140906

140907 FRDLVWGIMEGAGKFNISDYFPMFRRFDLLGVKRDTFSSYRRFYEIVGDIIKSRIKCRAS 141086

141087 NPVTRNDDFLDVILDQCQEDGSLFDSENIQVLIV (0) 141188

141578 ELFYAGSDTSTITTEWAMTEFLRNPGVMQKVRQELSEVIGAGQMVRESDMDRLPYFQAVV 141757

141758 KETLRLHPAGPLLLPFKAKNDVELSGFTIPSNSHVLVNMWAIARDPSYWEDPLSFLPERF 141937

141938 LGSKIDYRGQDFEYIPFGAGRRICPGMPLAVRMVQLVLASIIHSFNWKLPEGTTPLTIDM 142117

142118 QEHCGATLKKAIPLSAIPFIEEN* 142189

 

>CYP76Y2 gi|147827288|emb|CAN66456.1| 42% to 76C4

CAAP02002687.1 32455-34284 (+) strand 100% match

CAAP02006631.1 2526-4355 6 aa diffs possible allele

MELNTFLLLCMPLILCFFLLQFLRPSSHATKLPPGPTGLPILGSLLEIGKLPHRSLARLAKIHGPLITLR

LGSITTVVASSPQTAKLILQTHGQNFLDRPAPEALDSPQGTIGWIPADHVWRSRRRVCINHLFTSQSLDS

LQHLRYKKVEQLLQHIRKHCVSGTPVDIGLLTSAINLNVLSNAIFSVDLVDPGFESAQDFRDQVWGIMEG

AGKFNISDYFPMFRRFDLLGVKRDTFSCYKRLYEIVGGIIKSRIKCRASNPMSRNDDFLDVILDQCQEDG

SVFNSDNIQVLIVELFYAGSDTSTITTEWAMTELLRNPRLMQKVRQELSEVIGAGQMVRESDMDRLPYFQ

AVVKETLRLHPAGPLLLPFKAKNDVELCGFTIPSNSHVLVNMWAIARDPGYWEDPSSFLPERFLGSKIDY

RGQDYEYIPFGAGRRICPGIPLAIRMVQLVLASIIHSFNWKLPEGTTPLTIDMQEQCGATLKKAIPLSAI

PFIEEN

 

>CYP76Y3 gi|147857238|emb|CAN83490.1| 41% to 76C4

CAAP02002003.1 50591-48278 (-) strand 100% match

MELQIALLLLCITLFCFCLRHFLLPSYTAKLPPGPTGLPILGSLLQLGEKPHHTLAKFAESHGPLISLRL

GSITTVVASSPQTAKLILQNHADNFLDRPVPDAIMAMPNPECTLAWIPGDHVWRNRRRVCASHMFTTQRL

DSLQHLRQKKVDQLLQHITKHCVLGTPVYITDLASATILNLMSNTMFSVDLVDPRFESAQEFRELMWRIM

EGVGKPNISDYFPIFRSLDLQGVKRGTVPSYKRLHEILDGIIQERMKLRASSSTTSMNDFLDVLLDQCQV

DGSDFSSDTIKTLLVELVFGGSDTSSVTIEWAMAELLRNPHVMQKVRIELSEVISPGQSIKESDIDRLPY

FQAVVKETMRLHPPAPLLLPYKAKNDLEICGFTIPKDSHVLVNIWAIARDPGYWEDPLSFLPERFLSSNI

DFRGQDFEYLPFGAGKRICPGISLGLRMVHLVLASIIHSFSWKLPQGITPESLDMKEQFGVTLKKVVPLC

AIPFIEEKCSP

 

>CYP76Y4P CAAP02002003.1 pseudogene, one stop codon, no start codon, 88% to CAN83490.1

43710 IELQIVVLLLCFTLFCFCLHHFLLPSYTAKLPPGPTGLPIVGSLLQLDEKPHHSLAKFTE 43531

43530 SHDPLISLRLGSITTMVASFPQTTKPILQNHVDNFLDHPVPDAIMAMPNLEYTLAWIPGD 43351

43350 HVWHNRRRVCASHLFTTQRLDSLQHLRQKKVDQLLQHITKHCVLGTPVYITDLASATILN 43171

43170 LMSNTMFSIDLVDLRFESAQEFRELMWRIMEGVGKPNISDYFPIFRSLDLQGVKRGTVPS 42991

42990 YKRLHEILDGIIHERMKLKASNSTTSMNDFLDVLLDQCQMDGSDFSSKTIKTLL 42829

42017 ELVFGGSDTSSITIEWAMVELLRNPHVMQKVRIELSEIISPTRRIKESDID*LPYFQAVV 41838

41837 KETMRFHPLAPHLLPYKAKYYLEILGFTIPKDSNVLVNIWAIARDPRYREDPLSFLPERF 41658

41657 LSFNIDFRGRDFEYLPFGAGKRICPGIPPGLRMVHFVLASIIHSFSWKFPQGITLESLNM 41478

41477 KEQFGVTLKKVIPLCAIPFIEEKCSQ* 41397

 

CYP77 family

 

>CYP77A14 gi|147812439|emb|CAN65790.1| = AM463740.2

MAPAFAAASSTYSPYYHIFFTALAFLISGLIVLLSRKTKSKKLNLPPGPPGWPIVGNLFQFARSGKQFFQ

YVRELRPKYGPIFTLKMGNRTMIIISSAELAHEALIEKGQSFASRPRENPTRTVFSCNKFTVNAAVYGPV

WRSLRRNMVQNMLSASKIREFSNLRDVSMDKLIDRLRSEAEANDGAVWVLKNARFAVFCILLSMCFGVEM

DEETIEVMDDLMKTVLITLDPRLDDYLPLLSPFFSKQRKAATEVRKRQIKTVVPFIERRRAALENPGSDK

TAASFSYLDTIFDLKIEGRKSSPTNPELVTLCSEFLNGGTDTTGTAVEWAIARMIENPEIQSKLYEEIKT

TVGDRKVQEKDMEKMPYLNAVSKELLRKHPPTYFSLTHAVTEPAKLAGYDIPTDANVEFFLPPISEDPKL

WKNPEKFDPDRFLLGGEDADITGVTGVKMMPFGVGRRICPGLSMATVHVNLMLARMVQDFEWSAYPENSK

IDFSEKLEFTVVMKNPLRAKIKPRV

 

>CYP77A14 CAAP02004173.1 1 aa diff

26070  MAPAFAAASSTYSPYYHIFFTALAFLISGLIVLLSRKTKSKKLNL  26204

26205  PPGPPGWPIVGNLFQFARSGKQFFQYVRELRPKYGPIFTLKMGNRTMIIISSAELAHEAL  26384

26385  IEKGQSFASRPRENPTRTVFSCNKFTVNAAVYGPVWRSLRRNMVQNMLSASKIREFSNLR  26564

26565  DVSMDKLIDRLRSEAEANDGAVWVLKNARFAVFCILLSMCFGVEMDEETIEVMDDLMK  26738

26739  TVLITLDPRLDDYLPLLSPFFSKQRKAATEVRKRQIKTVVPFIERRRAALENPGS  26903

26904  DKTAASFSYLDTIFDLKIEGRKSSPTNPELVTLCSEFLNGGTDTTGTAVEWAIARMIEN  27080

27081  PEIQSKLYEEIKTTVGDRKVQEKDMEKMPYLNAVSKELLRKHPPTYFSLTHAVTEPAKLA  27260

27261  GYDIPTDANVEFFLPPISEDPKLWKNPEKFDPDRFFLGGEDADITGVTGVKMMPFGVG  27434

27435  RRICPGLSMATVHVNLMLARMVQDFEWSAYPENSKIDFSEK  27557

       LEFTVVMKNPLRAKIKPRV* 27617

 

>CYP77B6 gi|147833192|emb|CAN68641.1| = AM482217.2

CAAP02006523.1 11458-10193

MELTDLLLLSLALIFLRFWWRYWPTRVAAGWEPGAGDPPAPSLHSTWYVIYVPSMDLSSRYRWLIHEALV

QRGPIFASRPEDSPTRLVFSVGKCAINSAQYGPLWRTLRRNFVAELITPTRIRQCSWIRKWALENHMRRL

QMEVSEKGFVEVMSNCRLTICSILICICFGAKISERRIKEIESVLKDVMLMTTPKLPDFLPVLTPLLRRQ

LREAKELRKKQMECMVPLVRSRRAFVESKGAPGRSSSEMVSPIGAAYIDSLFGLEPAERGRLGEEELVTL

CSEIINAGTDTSATTVEWALLHLVMNQDIQQKLYKEIIDCVGKNGVVTEGDVEKMPYLGAIVKETFRRHP

PSHFVLSHAATKDTELGGYTIPADVNVEFYTAWVTEDPDLWQDPAEFRPERFLQGDGVNVDVTGTRGVKM

VPFGAGRRICPAMNLGTLHVNLLIARMIHAFKWIPAPGSPPDPTETFAFTVVMKNPLKAIILPR

 

>CYP78A36 CAAP02000038.1 73% to CAN80496.1 (CAO41852.1) on contig CU459222.1

chr1 scaffold_5

17207 MSSEYQLLFVPDIGRWSVVSVEVVVGVVLLCVVFGFWLAPGGLAWALAKCRGRFAIPGPP 17028

17027 GLPVTGLLHVFSGSEAHRVLAKLARRWDAVGLMAFSVGLTRFVVSSDPETAKGILSSSAF 16848

16847 ADRPVKESAYELLFHRAMGFAPYGEYWRNLRRISATHLFSPKRIAAFEGFRRDICLKMVD 16668

16667 EIRGLMVENGEVEVKKVLHFGSLNNVMMTVFGRSYDFDEGGVGFELEKLVREGYELLG 16494

16493 TFNWSDHFPLLGLLDLQGVRKRCRRLASKVNVFVGKIIEEHRAKRVGGLSVNGVE 16329

16328 DFVDVLLDLEKEDKLTDSDMIAVLW (0) 16254

16155 EMIFRGTDTVAILLEWILARMVLHPEIQSKAQSEIDAMVGNSRPVSDSD 16009

16008 IPNLPYVQSIVKESLRVHPPGPLLSWARLATDDVHIGDTLVPAGTTAMVNMWAITHDEKV 15829

15828 WPEPLEFKPERFMDTDISIMGSDLRLAPFGSGRRVCPGKSMGLATVHLWLAQLLQSFKWV 15649

15648 PSHHSVDLSETLNLSLEMKNPLICKAVARVA* 15553

 

>CYP78A37 gi|147777018|emb|CAN70074.1| 57% to CAN80496.1

CAAP02000784.1 69669-66363 (-) strand 1 aa diff

(CAO66175.1) on contig CU459353.1 chr1 scaffold_136

MKSFISAILSLSPLCFAATHQASWPLLLLLFSVSSLIFTLFLNFWLVPGGFAWRNHHGKYSSKLSGPIGW

PLLGSLPLMGSLAHRKLATMAASCGATRLMALSLGVTPVIISSHPDTAKEILCGSSFSNRPVKASARLLM

FERAIGFAPSGDYWRHLRRIAANYMFSPKRISGSEAVRLRVADEMVVGVRKEMEERGVVKLRGILQKGSL

SNIMESVFGRGLGSVEGEGLGFMVIEGYELIAKFNWEDYFPLGFIDFYGVKRRCSKLAAKVNGVVGKMIE

ERKRVGELSGGGNDFLSVLLSLPKEDQLSDSDMVAVLW (0)

EMIFRGTDTVAILLEWIMARMVIHQDIQAKAQ

EELDTCLGNQSHVQDSHIQSLPYLQAIVKEALRMHPPGPLLSWARLAIHDVHVGKFFVPAGTTAMVNMWA

IAHDPTIWKDPWAFKPDRFINEDVSIMGSDLRLAPFGSGRRACPGKALGLATVHLWLARLLHQFKWLPTH

PVDLGECLRLSLEMKKPLICCAIPRV*

 

>CYP78A38 gi|147857131|emb|CAN83498.1| 66% to 78A9

CAAP02000944.1 90793-92451 (+) strand 2 aa diffs (CAO62916.1)

on contig CU459322.1 chr2 scaffold_105

MGTQVESLWVLALAAKCRAFSSETVLVFGFALCLVWFAVVLSHWAYPGGQAWGRYWSKRGSKAEAIIPGP

RGLPVVGSMSLMVNLAHHNLAAAAERLGAMRLMAFSLGETRAIVTSNPDVAKEILNSSAFADRPVKECAY

SLMFNRAIGFAPYGVYWRTLRRIAATHLFSPKQITASETQRSEIAAQMVSLVGSCIGDIRVRDILKRASL

HHMMSSVFGRKYQLGSSNSETDELSRLVEEGYDLLGKLNWSDHLPWLAGLDPQKIRFRCSKLVPRVNRFV

NRIITEHRAQPGPTTRDFVDVLLSLQQPDKLLDSDIIAVLWEMIFRGTDTVAVLIEWILARMVLHPECQS

RVHDELDRVVGKSRPVKESDIPAMVYLAAVVKEVLRLHPPGPLLSWARLSITDTTVDGYHVPAGSTAMVN

MWAITRDPRVWSDPLDFTPDRFVTTPADVEFSLFGSDLRLAPFGSGRRTCPGKTLGLTTATFWVASLLHE

FEWVPSDPNPVDLSEVLRLSCEMAHPLTVRVRPRRI*

 

>CYP78A39 CAAP02001294.1  79% to CAN73323.1 (CAO71766.1) on contig CU459237.1

chr7 scaffold_20 upstream of the CYP736A25, 736A26, 736A27 gene cluster

53548 MELGLVSKDTNWWVFTLPAFLGSGNLLDGYVLGSLLIAFACASLFAWGFAVGGIAWKNRRN 53730

53731 ERGRVSIPGARGLPIFGSLLTLTRGLAHRSLASMAVSRGATQLMAFSLGSTPVVVASDPH 53910

53911 TAREILASPHFADRPIKQSAKSLMFSRAIGFAPNGAHWRLLRRIASSHLFAPKRIAAHEA 54090

54091 GRQLDCAAMLCSIADEQALHGAVCLRKHLQAAALNNIMGSVFGKRYDPTHDSNELNELRA 54270

54271 MVKEGFELLGAFNWSDYLPWLSYFYDPFRINERCSKLVPRVRKLVRGIIQEHRLGESSRL 54450

54451 SDNSDFVDVLLSLDGEEKLHEDDMVAVLW (0)

54622 EMIFRGTDTTALLTEWVMAELVLHPEVQTKLQTELDMMVMNKSVTDADVAKLPYLQAVVK 54801

54802 EALRVHPPGPLLSWARLSTSDVQLSNGMLVPTNTTAMVNMWAITHDPKIWPNPSEFNPER 54981

54982 FLESAGGADVDVRGSDLRLAPFGAGRRVCPGKNLGLVTVSLWVAKLVHHFRWIQDVAHPV 55161

55162 DLTEVLKLSCEMKSPLHAVAVRRNGSALE* 55251

 

>CYP78A40 CAAP02000145.1 = CAN78960.1 61% to 78A6

(CAO24293.1) on contig CU459254.1 chr15 scaffold_37

note 704A19 is also on this scaffold

50225  MIFWAYPGGPAWGKYRWRKASPSASMGKPI

50135  PGPRGFPVVGSMKLMASLAHHRIAAAAEACGAKRLMAFSLGETRVIVTCNPDVAKDILN  49959

49958  SSVFADRPVKESAYSLMFNRAIGFAAYGVYWRTLRRIAATHLFCPKQIKASEAQRAEI  49785

49784  AAEMAAMFGENTEPFRVRDVLKRASLSSMMCSVFGRKYQLDSSNNEAHELRTLVEEGYDL  49605

49604  LGTLNWSDHLPFLGDFDPQKIRMRCSNLVPKVNRFVTRIIAEHRARTTEKIRDFVDVLLS  49425

49424  LQGPDKLSDSDMIAVLW (0)

       EMIFRGTDTVA  49245

49244  VLIEWILARLVLHPDVQSRVHDELDRVVGESRAVAESDITAMEYLPAVVKEVIRLHPPGP  49065

49064  LLSWARLATTDTTVDGHHVPAGTTAMVNMWAITRDPNVWSDPLEFKPDRFSGMGADTDIS  48885

48884  VFGSDLRLAPFGSGRRVCPGKTLGLTTVTFWVASLLHEFEWKPLDGNNNVDLSEVLKL  48711

48710  SCEMEKPLTVKVHRRRSTSTSSP* 48639

 

>CYP78A40 gi|147864009|emb|CAN78960.1| 61% to 78A6

MGTHVESFWIFALVSKCEALSLASIAWVVFFAWFLLS

MIFWAYPGGPAWGKYRWRKASPSASMGKPIPGP

RGFPVVGSMKLMASLAHHRIAAAAEACGAKRLMAFSLGETRVIVTCNPDVAKDILNSSVFADRSGEGVGL

QSDVQQSNWIKASEAQRAEIAAEMAAMFGENTEPFRVRDVLKRASLSSMMCSVFGRKYQLDSSNNEAHEL

RTLVEEGYDLLGTLNWSDHLPFLGDFDPQKIRMRCSNLVPKVNRFVTRIIAEHRARTTEKIRDFVDVLLS

LQGPDKLSDSDMIAVLWEMIFRGTDTVAVLIEWILARLVLHPDVQSRVHDELDRVVGESRAVAESDITAM

EYLPAVVKEVIRLHPPGPLLSWARLATTDTTVDGHHVPAGTTAMVNMWAITRDPNVWSDPLEFKPDRFSG

MGADTDISVFGSDLRLAPFGSGRRVCPGKTLGLTTVTFWVASLLHEFEWKPLDGNNNVDLSEVLKLSCEM

EKPLTVKVHXRRSTSTSSP

 

>CYP78A41 gi|147864456|emb|CAN80496.1| 67% to 78A10

CAAP02008192.1 (CAO62448.1) on contig CU459318.1 chr17 scaffold_101

MSSENYFSFVPGTGSSPVLSLELFLCVVLFVGVFGFWLVPGGLAWAMSKAKARSAIPGPSGLPLIGLVFA

FTGSLTHRVLAKLARVSEAAPLMAFSVGFTRFIISSQPETAKEILNSSAFADRPVKESAYELLFHRAMGF

APFGVYWRNLRRISATHLFSPRRIAGFGEFRRTIGLKMVDEIRVLMEKKGEVKVKKVLHFGSLNNVMMSV

FGRSYDFGKGVSGDAVELESLVSEGYELLGIFNWSDHLPLLGWLDLQGVRKRCKKLVSKVNVFVSRIIDE

HRLRRVGDGEVGRDGDDSSGDFVDVLLDLEKESRLSDSDMIAVLWEMIFRGTDTVAILLEWILARMVLHP

DIQSKAQSEIDAVVGATRLVSDSDIHKLPYLHAIVKETLRMHPPGPLLSWARLSIHDTHIGSHFIPAGTT

AMVNMWAITHDDAVWDEPKEFKPSRFMEEDVSILGSDLRLAPFGSGRRVCPGKAMGLATVQLWLAQLLQN

FKWVACDSGVDLSECLKLSMEMKQSLVCKAVPRFS

 

>CYP78A42 gi|147794147|emb|CAN73323.1| 71% to 78A7 N-term added

CAAP02000829.1  50046-51958 (+) strand 2 aa diffs

(CAO70823.1) on contig CU459415.1 scaffold_198 chr?

MEFSFKSEYTNWWVFTLPALVETQNLSKGLILLFILIVFLSIGLLSWAFSAGGAAWKNGRNQ

MGRVSIPGPRGIPIFGSLFSLSHGLAHRTLASMALSSAATQLMAFSLGSTPTVVSSEPCTAREILTSPQF

ADRPIKQSVKSLMFSRAIGFAPNGAYWRLLRRISSSHLFAPKRIAAHEGGRQLDCTAMLQSIAKEQSANG

AVVLRKHLQAAALNNIMGTVFGKRLNPVEDSMEARELHEIVKEGFELLGAFNWSDHLPWLNYFYDPFGIN

QRCSALVPRVRKLVKGIIKEHQLSDSNKLSDKSDFVDVLLSLDGEEKLEEEDMVAVLWEMIFRGTDTTAL

LTEWVMAELILNPKVQAKLHEELHLTTLGNKAITDANVAKLPYLQAVIKETLRVHPPGPLLSWARLSTSD

VHLSNGMVIPSNTTAMVNMWAITHDPNLWKDPLAFKPERFLPSAGGADVDVRGCDLRLAPFGAGRRVCPG

KNLGLVTVSLWVAKLVHHFDWVQDMAQPVDLSEVLKLSCEMKNPLSAVPVPRSGAIHI*

 

CYP79 family

 

(9 genes with one allele) [13 pseudogenes and 3 duplicate sequences]

 

>CYP79A15 CAAP02002475.1f 93% to CAN74072.1

same as CAO41497.1 starting except for blue regions with one frameshift

53450 MSSSFPNPFLFLLSHNSETLELAPLHLHL

53363 PFILLLLFLFVFAFLILYRPIPKTLITKQMPLLPPGPTPWPLVGNLPELLTKKPVF 53196

53195 RWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASRPITMASHHLSRGF 53016

53015 LTTALSPWGEQWKKMRRIIISEVLKR 52938

52936 ERHIWLLQKRTEEADNLVRFIYNH (2)

(Q)CKFSS

ITSHNITDSSVVNVRNAVRQYTGNVVRKMMFSRRYF 52742

52741 GEGRKDGGPGLEEEEHVNSLFTLLAYLYSFSPSDYLPCLRVFDLDGHETMVKDALSIIN 52565

52564 KHHDPIVDERIIQWRNGEKKEVEDILDVFLTISDTKGKPLLSVEEIKAQL 52415

52311 ELMIEIVDNPAHAAEWAMAEMINQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACA 52132

52131 REALRLHPMAPFNVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERH 51952

51951 MNDEVVDLAEHELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSVPPNQDQIDL 51775

51774 TESMNELFLAKPLHAHAKPRLPASMYGN 51691

 

>CYP79A15 gi|157351310|emb|CAO41497.1| unnamed protein product [Vitis vinifera]

MGSSCNSTMSSSFPNPFLFLLSHNSETLELAPLHLHLPFILLLLFLFVFAFLILYRPIPKTLITKQMPLL

PPGPTPWPLVGNLPELLTKKPVFRWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASR

PITMASHHLSRGFLTTALSPWGEQWKKMRRIIISE

KRTEEADNLVRFIYNH (2)

(Q)CNHN

ITDSSVVNVRNAVRQ

YTGNVVRKMMFSRRYFGEGRKDGGPGLEEEEHVNSLFTLLAYLYSFSPSDYLPCLRVFDLDGHETMVKDA

LSIINKHHDPIVDERIIQWRNGEKKEVEDILDVFLTISDTKGKPLLSVEEIKAQLIELMIEIVDNPAHAA

EWAMAEMINQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACAREALRLHPMAPFNVPHVSMADAVV

AGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERHMNDEVVDLAEHELRFISFSTGRRGCPGTALGTAL

TVTLLARLLQCFSWSVPPNQDQIDLTESMNELFLAKPLHAHAKPRLPASMYGN

 

>CYP79A16P CAAP02002475.1e pseudogene 76% to CAN74072.1

43522 KMIN*PEIMQKVMEEIDRVVGKERLVKEFDIMQLKYVKACGREAMRLHPMSSFNDPHLSM 43343

43342 VDAIVAGYFIPKGSHVLLNQVGLGRNPKVWEEPLRFKPERHMNDDVLDLAEPEVRFISF 43166

43165 SAKRQWCFGTALGTTLTVTLLARLLQGFSWSAPHNHEQIDLKESTQPL 43022

 

>CYP79A17 CAAP02002475.1d 92% to CAN74072.1

identical to CAO41496.1 except blue insert

38536 MSSSFPNSFLFLLSHNSETLKLVSLHLHL

38449 PFILLLLFLFFFAFLILYKLKPKTLITKPMPLLPPGPTPWPLVGNLPELFTKKPVFR 38279

38278 WILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASRPITMTSDHLSRGFL 38099

38098 TTVLSPWGEQWKKMRRIITSEVLKPARHMWLLQKRTEEADNLVRFIYNH (2)

(Q)CKFSS

      ITSHNF 37919

37918 TESSVVNVRNTVRQYTGNVVRKMMFSRRYFGEGRKDGGPGLEEEEHVNSLFTTLAYLYV 37742

37741 FSPSDYLPCLRVFDLDGHEKMVKEALRIINKHHDPIVDERIIQWRNGEKKEVEDILDVFI 37562

37561 TISDTKGKPLLSVEEIKAQL (0) 37502

37398 ELMIEIVDNPAHAAEWAMAEMINQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACA 37219

37218 REALRLHPIAPFNVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERH 37039

37038 MNDEVVDLAEPELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSVPPNQDQIDL 36862

36861 TESMNDLFLAKPLHAHAKPRLHASMYGN 36778

 

>CYP79A18P CAAP02002475.1c pseudogene 89% to CAN74072.1

33051 MASSFPNPLLFLLSHNSETLDSLHPHL

32970 PFIFLLLFLFLFAFLILYKLKPKTLITKQMPLLPRGPAPWP 32848

32849 LVGNLPELFTKK

32821 QKKPAFRWILGL*EELNNEIACIKLGNVHVIPVISPEIGREFLKKHDAVFAS*PIPMESH 32642

32641 HLSRFLTTVLSPWGEQWKKMRSILTSEVHKQERHM*LLQKRTEEADNLVRFI 32486

 

>CYP79A18P-de1b

32354 SSSYPNPLLFLLSHNSETLELALLRIHLH*LFIFLLIFLFLFAFI 32220

 

>CYP79A19 CAAP02002475.1b 94% to CAN74072.1

identical to CAO41495.1 except blue insert

31474 MSSSFPNSFLFLLSHNSETLKLVSLHLHL

31387 PFILLLLFLFFFAFLILYKLKPKTLITKQMPLLPPGPTPWPLVGNLPELFTKKPVF 31220

31219 RWILGLLEELNTEIACIKLGNVHVIPVISPEIAREILKEHDAVFASRPITMASHHLSRGF 31040

31039 LTTALSPWGEQWKKMRRTIISEVLKPERHIWLLQKRTEEADNLVRFIYNH (2)

(Q)CKFSS

      ITSHN 30860

30859 FTDSSVVNVRNAVRQYTGNVVRKMMFSRRYFGEGRKDGGPGLEEEEHVNSLFTTLVYLY 30683

30682 VFSPSDYLPCLRVFDLDGHEKMVKEALSIINKHHDPIVDERIIQWRNGEKKEVEDILDVF 30503

30502 ITISDTKGKPLLSVEEIKAQL 30440

30336 ELMIEIVDNPAHAAEWAMAEMINKPEIMQKAVEEIDRVVAKDRLVQESDIAQLKYVKACA 30157

30156 REALRLHPMAPFNVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERH 29977

29976 MNDEVVDLAEPELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSVPPNQDQIDL 29800

29799 TESMNELFLAKPLHAHAKPRLPASMYGN 29716

 

>CYP79A20P CAAP02002475.1a pseudogene 76% to CAN74072.1

19837 AEMIN*PEIMQKVMEEIDRVVGKERLVQEFDIMQLKYVKACGREAMRLHPMASFNDPHLS 19658

19657 MVDAIIAGYFIPKGSHVLLNQVGLGRNPKV*EEPLRFKPEWHMNDDVLDLAEPKLRFIS 19481

19480 FSTKRQWCFGTALGTTLTVTLLARLLRGFSWSAPHNHEQIDLRESTQPL 19334

 

>CYP79A-se1[3] I-helix region

26788 IVDNPA*AAEWAIA 26747

 

$$$$

 

>CYP79A21v1 CAAP02000934.1c 96% to CAAP02002475.1f

92328 MSSSFPNPFLFLHSDHSETLELASLHLH

92244 LPFILLLLVLFVVAFLILYKLKPKTLITKQMPLLPPGPTPWPLVGNLPELFTKKP 92080

92079 VFRWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASRPITMASHHLSR 91900

91899 GFLTTALSPWGEQWKKMRRIIISEVLKPERHISLLQKRTEEADNLVRFIYNH (2)

QCKFSS

      ITS 91720

91719 HNFTDSSVVNVRNAVRQYTGNVVRKMMFSRRYFGEVRKDGGPGLEEEEHVNSLFTSLAY 91543

91542 LYSFSPSDYLPCLRVFDLDGHETMVKDALSIINKHHDPILDERIMKWRNGEKKEVEDTLD 91363

91362 VFLTIRDTKGKPLLSVEEIKAQLI (0) 91291

91190 ELMIEIVDNPAHAAEWAMAEMIDQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACA 91011

91010 REALRLHPIAPFNVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERH 90831

90830 MNDEVVDLAEPELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSVPPNQDQIDL 90654

90653 TESMNELFLAKPLHAHAKPRLPASMYGN* 90567

 

>CYP79A21v1 gi|157351304|emb|CAO41491.1| unnamed protein product [Vitis vinifera]

MSSSFPNPFLFLHSDHSETLELASLHLHLPFILLLLVLFVVAFLILYKLKPKTLITKQMPLL

PPGPTPWPLVGNLPELFTKKPVFRWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASR

PITMASHHLSRGFLTTALSPWGEQWKKMRRIIISEVLKPERHISLLQKRTEEADNLVRFIYNQCNHNFTD

SSVVNVRNAVRQYTGNVVRKMMFSRRYFGEVRKDGGPGLEEEEH

VNSLFTSLAYLYSFSPSDYLPCLRVF

DLDGHETMVKDALSIINKHHDPILDERIMKWRNGEKKEVEDTLDVFLTIRDTKGKPLLSVEEIKAQLI (0)

ELMIEIVDNPAHAAEWAMAEMIDQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACAREALRLHPIAPF

NVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERHMNDEVVDLAEPELRFISFSTGR

RGCPGTALGTALTVTLLARLLQCFSWSVPPNQDQIDLTESMNELFLAKPLHAHAKPRLPASMYGN

 

>CYP79A21v2 gi|147797931|emb|CAN74072.1 = AM444092.2

 62% to CYP79A2

compare to CAO41491.1

nearly same as CAAP02000934.1c from 90570-92328 except N-term 41 aa with 13 diffs

MSSSFPNPFLFLHSDHSETLELASLHLHLPFILLLLVLFVVAFLILYKLKPKTLITKQMPLLPPGPTPWPLV = CAAP02000934.1c

MASSFPNPLLFLLSHNSETLD--SLHPHLPFIFLLLFLFLFAFLILYKLKPKTLITKQMPLLPPGPYPWPLV

GNLPELFTKKPVFRWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASRPITMASHHLS

RGFLTTALSPWGEQWKKMRRIIISEVLKPERHISLLQKRTEEADNLVRFIYNH (2)

(Q)CKFSS

ITSHNFTDSSVV

NVRNAVRQYTGNVVRKMMFSRRYFGEVRKDGGPGLEEEEH

VNSLFTSLAYLYSFSPSDYLPCLRVF

DLDGHETMVKDALSIINKHHDPILDERIMKWRNGEKKEVEDTLDVFLTIRDTKGKPLLSVEEIKAQLI

ELMIEIVDNPAHAAEWAMAEMINQPEIMQK

AVEEIDRVVGKDRLVQESDIAQLKYVKACAREALRLHPIAPFNVPHVSMADAVVAGYFIPKGSHVLLSRV

GLGRNPRVWEEPLKFKPERHMNDEVVDLAEPELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSV

PPNQDQIDLTESMNELFLAKPLHAHAKPRLPASMYGN

 

$$$$

 

>CYP79A22P CAAP02000934.1b pseudogene = CAAP02002475.1c

86630 KRPAFRWILGL*EELNNEIACIKLGNVHVIPVI 86532

86531 SPEIGREFLKKHDAVFAS*PIPMESHHLSRFLTTVLSPWGEQWKKMRSILTSEVHKQER 86355

86354 HM*LLQKRTEEADNLVRFIYNQCKSSTSTDNFMDSSVVNVRNAVRQYTGNIVRKMMFSRR 86175

86174 YFGEGRLGGGPGLEEEEHVNFLFTSLAYLHAFSPSDYLSCLRVFDLDGHQKMVKEALSI 85998

85997 INKHHDPIVDERII*WRNGEKKEVEDIVDVCITSRDSKGEPLLSVEEIKAQII (?) 85842

85738 LKYVKAFGREALRLHPMAPFNVPHLSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEP 85559

85558 LRFKPERHMNDDVVDLAEPELRFISFSTGRR 85466

 

>CYP79A22P gi|147797930|emb|CAN74071.1 AM444092.2

translation revised

 75% to CAN74072.1

very similar to CAAP02000934.1b

4618 MASSFPNPLLFLLSHNSETLDSLHPHLPFIFLLLFLFLFAFLILYKLKPKTLITKQMP 4445

4444 LLPPGPAPWP 4415

4416 LVGNLPELFTKK

4385 KRPAFRWILGL*EELNNEIACIKLGNVHVIPVISPEIGREFLKKHDAVFAS*PIPMESHH 4206

4205 LSRFLTTVLXPWGEQWKKMRSILTSEVHKQERHM*LLQKRTEEADNLVRFIYNQCKSST 4029

4028 STDNFMDSSVVNVRNAVRQYTGNIVRKMMFSRRYFGEGRLGGGPGLEEEEHVNFLFTSLA 3849

3848 YLHAFSPSDYLSCLRVFDLDGHQKMVKEALSIINKHHDPIVDERII*WRNGEKKEVEDIV 3669

3668 DVCITSRDSKGEPLLSVEEIKAQII 3594

3496 QLKYVKAFGREALRLHPMAPFNVPHLSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEE 3317

3316 PLRFKPERHMNDXVVDLAEPELRFISFSTGRR 3221

 

>CYP79A23 CAAP02000934.1a 94% to CAAP02002475.1d, 60% to 79A2 Arab.

identical to CAO41489.1 except blue insert

81277 MSSSFPNPFLFLHSDNSQTLELASLHLH

81193 LPFILLLLFLFVFAFLILYKLKPKTLITKQMPLLPPGPTPWPLVGNLPELFTKKP 81029

81028 VFRWILGLLEELNTEIACIKLGNVHVIPVISPEIAREFLKKHDAVFASRPITMASHHLSR 80849

80848 GFLTTALSPWGEQWKKMRRIIISEVLKPARHMWLLQKRTEEADNLVRFIYNH (2)

QCKSSTS

      TD 80669

80668 NFMDSSVVNVRNAVRQYTGNIVRKMMFSRRYFGEGRKDGGPGLEEEEHVNSLFTSLAYL 80492

80491 YAFSPSDYLPCLRVFDLDGHEKMVKEALSIISKHHDPIVDDRIIQWRNGEKKEVEDILDV 80312

80311 FITISDTKGKPLLSVEEIKAQLI (0) 80246

80142 ELMIEIVDNPAHAAEWAMAEMINQPEIMQKAVEEIDRVVGKDRLVQESDIAQLKYVKACA 79963

79962 REALRLHPIAPFNVPHVSMADAVVAGYFIPKGSHVLLSRVGLGRNPRVWEEPLKFKPERH 79783

79782 MNDEVVDLAESELRFISFSTGRRGCPGTALGTALTVTLLARLLQCFSWSVPPNQNQIDL 79606

79605 KESMNELFLAKPLHAHAKPRFHASMYGN* 79519

 

>CYP79A23-de2b CAAP02000934.1a-de2b pseudogene

78436 LLGRLFQGF*WSVPLNQEQTDLKES 78362

78359 DLFLAKPFHAHAKPCL 78312

 

>CYP79A24 CAAP02002580.1 93% to CAN75997.1

MGSSSNST

LFSSILFLSPANDAAIDDVLSHLTFLLMLFIISVILNFTKFKSKTSTNSKSM

46176 MMLPPGPAPWPLVRNLPHLLNRKPTFRWIHGFMKEMNTEIECIQLGDVHVIPVTSPEISR 46355

46356 EFLKKHDTVLASRPITMVTEYSSGGFLTTAVVPWGEQWKKMRRVLASKVINPSTFRWLHD 46535

46536 KRVEEADNLVRYVYNH (2)

(Q)CKIST

      NNCLGSVINVRNTVRQYSGNAIRKMILNTRYFGEGKQD 46715

46716 GGPGVEEEQHVESLFTVLAHLYAFSLSDYFPWFRVLDLDGHEKTVREAMNTINKYHDPIV 46895

46896 DQRVEHWRNGEKEAEDLLDVFISIKDSNGEPLLSVAEIKAQCT (0) 47024

47137 ELILAAVDNPSNAIEWAMAEMINQPRVLGKAVEEIDRVVGKERLVQESNFQQLNYVKACI 47316

47317 KEAFRLHPIAPFNLPHVSNADAIVASYFIPKGSHVLLSRLGLGRNPRIWEEPLIFNPERH 47496

47497 LNASTAQGVDLNEQDLRLISFSTGRRGCTGIAFGSAMTVMLLVRLLQGFTWSPPPGQKEV 47676

47677 DLSESRNDLFLANPLHALAKPRLHSSLL* 47763

 

 

>CYP79A25 CAAP02002385.1a 94% to CAN75997.1

MSSSSNSTLFSSILFLSPANDPAIDDVLSHLTFLL

23533 MLFIISVILIFTKFKSKTSTNSKSMMLPPGPAPWPLVRNLPHLLNKKPTFRW 23378

23377 IHGLMKEMNTEIACIQLGDVHVIPVTSPEISREVLKKHDTVFASRPITMATEYSSGGFLT 23198

23197 TAVVPWGDQWKKMRRVLASKVINPSTFRWLHDKRVEEADNLVRYVYNH (2)

(Q)CKIST

      NNCLGS 23018

23017 VINVRNTVRQYSGNAIRKMILNTRYFGEGKKDGGPGVEEEQHVESLFTVLAHIYAFSLS 22841

22840 DYFPWLRVLDLDGHEKTVREAMNTINKYHDPIVDQRVEHWRNGEKNEAEDLLDVFISVKD 22661

22660 SNGEPLLSVAEIKAQCT (0) 22610

22497 ELIFAAVDNPSNAIEWAMAEMINQPRVLGKAVEEIDRVVGKERLVQESDFQQLNYVKACI 22318

22317 KEAFRLHPIAPFNLPHVSNADAIVAGYFIPKGSHVLLSRLGLGRNPRIWEEPLIFNPEGH 22138

22137 LNASTAQGVDLNKQDLRLISFSTGRRGCTPG 22046

22045 GIAFGSAMTVMLLVRLLQGFTWSPPPGQEEIDLSESRNNLFLAKPLHALAKPRLHSSLYPLH* 21857

 

>CYP79A-se2[3] CAAP02002385.1b C-term pseudogene, 85% to CAAP02005443.1b

55797 LSTVQVVELNEPNLRFISFKTGRRGCSGIS*G 55702

 

>CYP79A26 gi|147794417|emb|CAN75997.1| 71% to CAN74072.1

CAAP02001548.1 67317-65541 (-) strand 1 aa diff

MGDNSNSTLFSSVLFLSPANAATIDDLLSHLNFLL

LTFITSIFFILTKFKFKTSTNSKA

MLLPPGPAPWPLVRNLPHLLNNKPTFRWIHGFMKEMNTEIACIQLGNVHVIPVTSPEISKEFLKKHDAVF

ASRPITMASEYSSGGFLTTAVVPWGDQWKKMRRVLASNVINPSTFRWLHDKRVEETDNLVRYVYN (2)

(Q)CKISTS

NNCLGSVINLRNTARQYSGNAIRKMILNTRYFGEGKKDGGPGVEEEQHVESLFTVLAHLYAFSLSDYF

PWLRVLDIDGHEKTVREAMNTINKYHDPIVDQRVEQWRNGEKKEAEDLLDVFISVKDSNGEPLLSVAEIK

AQCTELMLAAVDNPSNAIEWALAEMINQPRVLGKAVEEIDRVVGKERLVQESDFQQLNYVKACIKEAFRL

HPIAPFNLPHVSNADAIVAGYFIPKGSHVLLSRLGLGRNPRIWEEPLIFNPERHLSASRAQGVDLNEQDL

RLISFSTGRRGCTGIAFGSAMTVMLLVRLLQGFTWSPPPGQEEIDLSESRNDLFLAKPLNALAKPRLHSS

LYPLH

 

>CYP79A27 gi|147815868|emb|CAN61661.1| 73% to CAN74072.1

CAAP02000528.1 50292-52017 N-term revised

MGLYWVNMLPYASTHWSFIEKFNPKPDVLYMELAHCTG

MLSSFLFLSLANTATKDHVLAHLNFVLLLFCVSILVIFTKFK

SNTSTSSKGMHLPPGPAPWPLLRNLPDLLKNKPVFRWIHGFMKEMNTEIACIQLGNVHVIPVISPEISRE

FLKKHDAIFASRPVTMASEYSSGGFLTIAVVPKGTQWKKMRRVVASDVINETTFKWLHDKRVEEADNLVR

FIYNH (2)

(Q)CKTFTS

PSIINVRNTVRQYSGNVIRKMILNTRYFGEGKKDGGPGIEEEQHVESLFTVLAHLYAFS

LSDYFPWMRVLDLDGHEKTVRQAMNTIDKYHDPIVENRAKQWRNGGKKEAEDLLDIFLSIKDAHGEPLLS

VAEIKAQCT (0)

ELMLAAVDNPSNAIEWAMAEMINQPEVLRKAVEEINRVVGKERLVQESDFEQLNYVKACAR

EAFRLHPIAPFHLPHVSTCDAVVAGYFIPKGSHVLLSRLGLGRNGRIWEEPLRFKPERHLSEGTGKMVEL

TEPDLRFISFGTGRRGCPGKATGSAMTVMLLARLLQGFTWSAPPEQKEIDLSESRNDLSLEKPLHAVAKP

RLHASLYSAAYQ*

 

>CYP79A27-de1b CAAP02000528.1-de1b pseudogene N-term

52311 PQMLKSKPTIFRWLHGLMEEMNMEIA 52234

 

>CYP79A28P gi|147778176|emb|CAN67567.1| AM447848.2

pseudogene missing the C-term with frameshift and stop codons

69% to CAN74072.1

MDSSSNSTLFSSILFLSPANAATIVNMLSHLNFLLFLFITSVFPIFTKFKSKTITKSKPM

LVPPGPAPWPLLRNLPHL &

NKKPTFRWIHGFMKEMNTEIACIQSGNIHVIPLTSPEISREFLKKHDAVFASRPMTMATE

YSSGGFLTTAVVPWGDQWKKMRRVLASDVINPSTFRWLHDKRVEEADNLVRCIYNH (2)

NNCLGSVINLRNTVRQYSGN

AIRKMILNTRYFGQ*

KKDGGPGVEEEQHVES

LFTVLAHLYVFSLSDYFPWLRVLDLDGHEKTVREAMNTIKKYHDPIVDQRVEQWRNGEKKEAEDLLDFLI

SVKDSNGEPLLSVAEIKAQCTELMLAGVYSPSNAIEWAMVEMINQPEVLSKLVEEIDRVVGKERLVQESD

FQQLNYVKACIREALWLHPIVLFNLPHVSNSDATVAGYFIPKGSHVLLSRLGLGQNPRIWEEPLNFNPER

HLSASTVQAVELNEPNLLFISFKTGRRGCAGIS

 

$$$$

 

>CYP79A29P CAAP02005443.1b = CAN82591.1 AM436340.2b

and CAAP02007407.1 pseudogene

11786 ELMLAGVDSPSNAIE*AMIKQPGILSKAVKKLMEWL 11679

11678 ERIDEFRNPTSVAGYSIPKGSHVLLSRLGLGRNPRIWEEPLNFN 11547

11546 PERHLNASTVQAVELNEPNLRFISFSTGRRGCAGMLARLLQGFTWSPPPG 11397

11396 QKEIDFSES 11370

 

>CYP79A29P gi|147862220|emb|CAN82591.1 AM436340.2 56% to 79A2

next to a CYP75 seq CAAP02005443.1a

ELMLAGVDSPSNATE*A

MIKQPGILSKAV

EEIDGVVGKDRRVQESD

MEWLERIDEFRNPTSVAGYSIPKGSHVLLSRLGLGRNPRIWEEPLNFNPERHLNA

STVQAVELNEPNLRFISFSTGRRGCAGMLARLLQGFTWSPPPGQKEIDFSESRNDLSLLQNLCMP

 

>CYP79A29P gi|147810272|emb|CAN75825.1 AM429843.2 42% to 79A2

MIKQPGILSKAVKKLMEWLERIDEFRNPTSVAGYSIPKGSHVLLSRLGLGRNPRIWEEPLNFNPERHLNA

RMLARLLQGFTWSPPPGQKEIDFSESRNDLSLLQNLCMP

 

>CYP79A29P CAAP02007407.1 pseudogene = CAN82591.1

4350 EQSSEEIDGVVGKDRRVQESD 4288

4283 VAGYSIPKGSHVLLSRLGLGRNPRIWEEPLNFNPERHLNASTVQAVELNEPNLRFISFST 4104

4103 GRRGCAGMLARLLQGFTWSPPPGQKEIDFSES 4008

 

>CYP79A29P CAAP02008469.1 CYP79A pseudogene may be same as CAN82591.1

5938 EEIDGVVGKDRRVQESD 5988

6002 YFIPKGSHVLLSRLGLGRNPRIWEEPLNFNPER 6100

7982 NGINSKSFPRGSIFSFLP 8035

6159 ISFSTGRRGCAGMLARLLQGFTWSPPPGQKEIDFSES 6269

 

$$$$

 

>CYP79A30P CAAP02002140.1a  70% to CAN61661.1

18686 FLLLLFITSIFLIFTKLKSKTSTKSKPIQLPPGLAPWPLVRNLPHL 18823

18915 LVLVTSLEISRELLKKHDVLFASTP 18989

 

>CYP79A31P CAAP02002140.1b  72% to CAN75997.1

39740 VAGYFIPKGSHVLLSRLGLGRNLRIWEEPLNFNP 39841

39874 VELNEPNLRFISFSAGRCGCTGIAFGSAIAVMLLAMLFQGLTWSLPPG 40017

 

>CYP80E3 CAAP02000058.1 55% to 80K1,

only 8 aa diffs to CYP80E3 (EE091942.1, EE096408.1, CF205652.1)

CAAP02013654.1 CYP80E3 842-3125 100% match to CAAP02000058.1

CAAP02016884.1   4 aa diffs to CAAP02000058.1

CAAP02000275.1 2791-5054 6 aa diffs

232005 MVAMLAEGT

232032 SFFDVFLPFLLLLPLLVFLILKLLKDSSSLKSPPLPPGPSPWPILGNLLHLGNMPHISLA 232211

232212 RFSQSYGPLISLRLGSQILVVASTSSAAMEILKTHDRVLSGRYVPHAVPAKNSEINPMSL 232391

232392 GWAVECNGAWKNLRTVCRAELFSTKVMESQAWVGEKKVMEMVRFVSTKEGEVMKVGEVVF 232571

232572 ATVLNTLTTVLMSRDFISFEDDNKDGGMKGLVRKMVMAMAAPNLDDFYLIFSGLDLQGL 232748

232749 NKKTKELIARICSMWESVIRERREGASDDPSKQDFLNILIRSGYSDDQINQLFM (0) 232910

233683 ELLTAGADTSSSTLEWAMAELIKSPESMKKVHEELAREISDNLPKASDLPHLPYLQACVK 233862

233863 ETLRLHPSAPLLLPRRASVSCEVMNYTIPKDSQIWVNAWAIGRDPMNWEDPLVFKPERFL 234042

234043 NSAVDFKGNNLEFIPFGAGRRICPGLPMAARLLPLILASLTHFFDWSLPNGTTPDELDMN 234222

234223 DKFGVTLQKEQPLLIIPKVRK* 234285

 

>CYP80E4 CAAP02001975.1a  56% to CYP80E3 FRAMESHIFT AFTER VAALI

42136 MIKSMAQTALTEGVSVLPSILPLPPLIFLILKHLKAKSPSL

42259 PPGPYPWPLIGNVHQIGKQRHIAMIDFARSYVPLFSLRLGTQTLVVGSSAAAAREILNSY 42438

42439 DHILCARCVPRVIPCRITGLNGFAVGWSPECDDRWKYLRTMCRTQLFSGKAIESQACLR 42615

      EKKLMEVVMFLSSMEGKVVKLKNVGFVAALI  

42696 IISNALLSKDLVTFEDEKALAMMGEIFKTILEVTSTPNLSDYYPILRGLDLQRLQKRS 42881

42882 IISFVKFCSILKPIIKERRERKGGHATSQQDFLDTLISDGFTDDQINI 43025

42616 EKKLMEV (0) 42636

43355 ELLVAGTDSSSVTVEWAMAELIRSPESLKKIREELTTEINQNMLKDSDLRKLPYLQACL 43531

43532 KETLRLHPPGPFLLPHRAVESCKVMNYTIPKDAQVLVNAWAIGRDPMSWEDPLVFKPERF 43711

43712 LNSTVDFQGNNFEFIPFSSRRRICPGLPMAVKLIPLVLASWIHFFDWSLPNGGDPKDIDM 43891

43892 SEKYSANIRKEQPLLLIPKGRK* 43960

 

>CYP80E5 CAAP02001975.1b  57% to CYP80E3

46562 MDEAALTEGTNLFPLILLLLPLIFLILKHLKSKSPISLPPGPYPWPIIGNVHQIGKQRHIAM 46747

46748 ADFARSYGPLFSLRLGTQTLIVGSSAAAAKEILSSYDRIFCARYVPGVMPEKSSEFYNNS 46927

46928 IVWSLECDDRWKYLRTMCRTQLFSGKAIESQACLREKKLMEVVGFLSSMEGRVVKLK 47098

47099 ELAFVTALNMISNALLSKDLVSLEDETAVARMLGCVKKTVDVMSTPNLADYYPILRGLDL 47278

47279 QRLQKKSRDSFVELFSLWQPIVKERRERKGSHATRQHDFLDALINGFTDDRINFLLG (0) 47452

47746 ELLIAGTESTSVTTEWAMAELIRSPDSMKKIREELTTEINKSTLKDSDLRKLPYLQACL 47922

47923 KETLRLHPPGPFLLPHRALESCKVMNYTIPKDAQVLVNAWAIGRDPMSWEDPLVFKPERF 48102

48103 LNSIVDFQGTNFEFIPFGAGRRICPGLPMAVKLIPPVLVSWIHFFDWSLPNWGDPKEIDM 48282

48283 REKFGANIQKEHPLLLIPKVRKWPQVCA* 48369

 

>CYP80K2 gi|147842081|emb|CAN62646.1| 77% to CYP80K1

MDSDTVTADISLSSFLYALLLLPFLLILKHIFLKPPPLPPGPYPWPIIGNLLQLGKNPHVKLASLAKLHG

PLMSLRLGTQLMVVASSPAAAJEVLKTHDRTLSGRYVSSSLSVKDPKLNHLSLAFAKECTNNWKNLRTIC

RTEMFSGKAMESHVELKGEEGDGVGGVFGNKGSNTFFSMDLCDFEREGLKDFIYRAAELGATPNLSDFYP

ILDGLNLHGSKKKSKEALGRILATWEGTLKERRKQKNPGSSHRDLLEAFLEIRFEDDQINQVILELFSAG

ADTSTLTIEWAITQLIRNPDVMYKLRDELTKIIGESPVRESHLPHLPYLQACVKETLRLHPPAPLLLPHR

AMETCQVMGYTIPKDSQVFVNIWAMGRDPKVWDXPLSFTPERFLDSKLEFKGNDFEYIPFGXGRRICPGM

ALGARQVPLVLATLVHLFDWSLPDNMDSAQIDMEEWLVITLRKENPLRLVPKVRK

 

>CYP80K1 gi|147842082|emb|CAN62647.1| = AM430618.2

CAAP02001529.1b 100% match 19493-17900 (-) strand

MDPDTVTADISIFSFLYALLLLPFLVILKHIFLKPPPLPPGPYPWPIIGNLLQMGKNPHAKLANLAKLHG

PLMSLRLGTQLMVVASSPAAAMEVLKTHDRALSGRYLSXSVPVKNPKLNHLSIVFAKDCNTNWKNLRAIC

RMELFSGKAMESQVELRERKVTELVEFLATKEGEVVKVMDLVFTTICNILSNKFFSMDLCDFEDEGRVGG

ALKDLIHKNAEFGATPNLSDYYPILGGLDIQGINRKAKEMFERIPTTWEDILKERRTQRSNRSSHRDFLE

ALLEIGFEDDQINQVIL (0)

ELFSAGAETSSLTVEWAMAELIRNQDAMDKLRGELRQIVGESPVRESHLPRLP

YLQACVKEALRLHPPAPLLLPHLAAETCQVMGYTIPKDSQIFVNIWAMARDPKIWDDPLSFKPERFLDSK

LDFKGNDFEYIPFGAGRRICPGLALGGRQVPLILATFVHLFGWSLPGNMDSAQLDMEEWLVITLRKEQPL

RLVPRVRK

 

>CYP80K2 CAAP02001529.1a CAN62646.1 CYP80K

13918  MDSDTVTADISLSSFLYA

13864  LLLLPFLLILKHIFLKPPPLPPGPYPWPIIGNLLQLGKNPHVKLASLAKLHGPLMSLR  13691

13690  LGTQLMVVASSPAAALEVLKTHDRTLSGRYVSSSLSVKDPKLNHLSLAFAKECTNNWKNL  13511

13510  RTICRTEMFSGKAMESHVELRERKVMELVEFLATKEGEVVKVMDLVFTTICNILSNTFFS  13331

13330  MDLCDFEREGLKDFIYRAAELGATPNLSDFYPILDGLNLHGSKKKSKEAL  13181

13180  GRILATWEGTLKERRKQKNPGSSHRDLLEAFLEIRFEDDQINQVIL (0) 13043

12958  ELFSAGADTSTLTIEWAITQLIRNPDVMYKLRDELTKIIGESP  12821

12820  VRESHLPHLPYLQACVKETLRLHPPAPLLLPHRAMETCQVMGYTIPKDSQVFVNIWAMGR  12641

12640  DPKVWDDPLSFTPERFLDSKLEFKGNDFEYIPFGAGRRICPGMALGARQVPLVLA  12476

12475  TLVHLFDWSLPDNMDSAQIDMEEWLVITLRKENPLRLVPKVRK*  12344

 

>CYP80K3 gi|147815205|emb|CAN70170.1| 76% to CYP80K1 (end does not match)

MATYTIIADISLFSFLYPLLLLPFLLIFKHIFLKSPPLPPGPYPWPIIGNLLQMGGNLHVKLANLAKRHG

PLMSLRLGTQIMVVASSSAAAMEVLKTHDRTLSGRYVSTTIPVNSPKLNHLAMAFAKVCNSDWRNLKAIC

RMELFSGKAMESRVELRERKVMELVEFLEKKEGEVVKVMDLVYTTVCNILSNKFFSIDFSDFEGRDVRGV

LLKDLLNENAELGATNILDFYPILGGLDIQGIRKKLKEIFRRIPTTWEDILKERRKQRIHGSSHGDFLDA

LLETGFEDDQINHVIMELFFAGPETSSLTVEWAMAELIKNQDAMHKLCNELTQIIGESPVRESHLPHLPY

LQACVKETLRLHPTGPLLLPHRATETCQIMGYTIPKDSIIFVNMWAMGRDPGTWEDPLSFKPERFLDSKL

EFKGNDFEYIPFGAGRRMCPGMPLAARLVPMILATFVRLFDWSTPGDMDFAEIDMEERAPSWGLVTIRSS

SLRVVIEPTVSKFDGHYDHWAMLMKNFLCSKEYWGLVENGILAAAEGVVLIDAQRKNIDDQKLKDLKTN

 

>CYP80K3 CAAP02001529.1c CAN70170.1 CYP80K

29696  MATYTIIADISLFSFLYPLLLLPFLLIFKHIFLKSPPL  29583

29582  PPGPYPWPIIGNLLQMGGNLHVKLANLAKRHGPLMSLRLGTQIMVVASSSAAAMEVLKT  29406

29405  HDRTLSGRYVSTTIPVNSPKLNHLAMAFAKVCNSDWRNLKAICRMELFSGKAMESRVELR  29226

29225  ERKVMELVEFLEKKEGEVVKVMDLVYTTVCNILSNKFFSIDFSDFEGRDVRGVLLKD  29055

29054  LLNENAELGATNILDFYPILGGLDIQGIRKKLKEIFRRIPTTWEDILKERRKQRI  28890

28889  HGSSHGDFLDALLETGFEDDQINHVIM (0)

28718  ELFFAGPETSSLTVEWAMAELIKNQDAMHKLCNELTQIIGESPVRESHLPHLPYLQA  28548

28547  CVKETLRLHPTGPLLLPHRATETCQIMGYTIPKDSIIFVNMWAMGRDPGTWEDPLSFKPE  28368

28367  RFLDSKLEFKGNDFEYIPFGAGRRMCPGMPLAARLVPMILATFVRLFDWSTPGDM  28203

28202  DFAEIDMEERFVITLRKEQPLRLVPRIRKY*  28110

 

CYP81 family (21 genes) [14 pseudogenes]

 

CYP81B subfamily (7 genes) [3 pseudogenes]

 

$$$$

 

>CYP81B26 gi|147782531|emb|CAN68429.1| 51% to 81D2

MVIKTRTGKKARLKLIICWRVKRWSGRLNLSGSRQWYFEFVRAGMTRMIPVSSQRPGISNVSSLSLFSSN

HSXSALPSHALTCKAYLSLPPPTLLSLVFGKKALPKIRICTPQSFFKYRYTVLIMANKIDGWLSFAFTVS

IRYGYYGNCRMVLGPSSSRLMKASSLFVKQIEVKDDDKKGVLLYGFAEKPELSVETNWSVSNYLIVGSYG

RKGFSLWLNKGSSIRVRWEAQPSSLSDLQVFLIKGERKYETLLPNPTNSPAAFPFHESTNGREAEYTILE

DNRYYVGIINANRKSVIMTLNVNVTSKMYDITKAKSMCSTIKGSCRLNLLFPNTQFVILTTPNNGDLAGW

YXELSFVARVVTYVSILGFVVIIIFLVLKYLGACEGDNEVHVEEITPREVTETHPLMPEKLFRLTYGTGE

EDAESGXSSSSSEDLYDGKICTICYDEPRNCFFVPCGHCATCYDCAKRLSTMVNRGYYPSRVADAYSKYI

NFQAISTS

 

MDTSYCYILLFLFIYFLTKHFFQSNKKLPPSPPLSLPIIGHLHLFKKPLHRTFAKISNQYGP

ILFIRFGSRPVIIVSSPSAAEECFTKNDIVFANRPRLLAGKHLGYNYTTLTWAPYGQHWRNLRRIASLEI

LSSNRLQMFYDIRIDEVRALLCQLFRASSEGQFSAVDMKSMFFELTLNNMMRMISGKRYYGDNVTELEET

RKFREIVAETFELSGATNIVDFVPFSKWIGLNGIEKKLVILQGKRDGFMQNLIEEHRRMRSPCEXRSKTM

LDVLLSLQETEPECYTDDIIRGMMQVMLSAGTDTSAGTMEWAMSLLLNNPEALEKAQAEIDSHLGKSRLI

DELDIAXLPYLRGIIMETLRMYPAAPLLVPHESSEECTVGGFRVPSGTMLLVNMWAIQNDPMLWAEPSKF

KPERFQGPEGQRNGFMFSPFGAGRRGCPGEGLAMRVVGLALGSLIQFFEWERVDEEMVDMSEGTGLTMPK

AQSLVAKCRPRPSMVSLLSQL

 

>CYP81B26 CAAP02007087.1a = CAN68429.1

GSVIVP00014919001 model is short

chr18 7849329 to 7851295 on strand -

5730 MDTSYCYILLFLFIYFLTKHFF 5795

5796 QSNKKLPPSPPLSLPIIGHLHLFKKPLHRTFAKISNQYGPILFIRFGSRPVIIVSSPSAA 5975

5976 EECFTKNDIVFANRPRLLAGKHLGYNYTTLTWAPYGQHWRNLRRIASLEILSSNRLQMFY 6155

6156 DIRIDEVRALLCQLFRASSEGQFSAVDMKSMFFELTLNNMMRMISGKRYYGDNVTELEET 6335

6336 RKFREIVAETFELSGATNIVDFVPFSKWIGLNGIEKKLVILQGKRDGFMQNLIEEHRRMR 6515

6516 SPCEGRSKTMLDVLLSLQETEPECYTDDIIRGMMQ (0) 6620

7079 VMLSAGTDTSAGTMEWAMSLLLNNPEALEKAQAEIDSHLGKSRLIDELDIAELPYLRGII 7258

7259 KETLRMYPAAPLLVPHESSEECTVGGFRVPSGTMLLVNMWAIQNDPMLWAEPSKFKPERF 7438

7439 QGPEGQRNGFMFSPFGAGRRGCPGEGLAMRVVGLALGSLIQFFEWERVDEEMVDMSEGTG 7618

7619 LTMPKAQSLVAKCRPRPSMVSLLSQL* 7699

 

$$$$

 

>CYP81B27 CAAP02007087.1b no intron, 68% to CAN68429.1

GSVIVP00014920001

chr18 7846017 to 7847942 on strand -

 9083 MEMLYFYIPLFFVLYVFTSHFLHKFRNLPPSPFPTLPLIGHLYLLKKPLHRTLSKISDR 9259

 9260 HGPILFLRFGSRPVLLVSSPSASEECFTKNDVVFANRPRLIAGKHLGYNYTSMSWAPHGD 9439

 9440 HWRNLRRISSFEILSSNRLQTLSGIRSDEVRSLVRWLFKNQSQMVEMKSAFFEMTL 9607

 9608 NVMMKMIGGKRYYGENIGEVEEARMFREMVSETFQLAGATNMVDFLPILGWLGLKGTERR 9787

 9788 LIKLQKRRESFIQNLIEEHRRKGSNCEGRQKTMIEVLLSLQETEPEYYTDEIIRGLML (0) 9961

10388 SMLTGGTDTSAGTMEWALSLLLNNPKVLKKAHQEIDDRLGHDRLIEELDLAQLPYLRSI 10564

10565 IKETLRMYPAGPLLVPHESSKECSVGGFRIPQGTMLLVNLWAIQSDHKIWGDPTEFRPER 10744

10745 FEGVEGDRDGFKFVPFGSGRRGCPGEALAIRIVGLALGSLIQCFDWERVDEQMVDMTEGG 10924

10925 GLTLPKAQPLLAKCRPRPTMVNLFSQL 11005

 

>CYP81B28P CAAP02007087.1c pseudogene, 68% to CAN68429.1

chr18 7843465 to 7844688 on strand -

12337 KKVRKIVAKTFQLSGATNVGDFVPLDEIDWKKQVGDNAGKRDGFMQNLIEEHSRMR 12504

12505 SSSCK 12519

12522 GCKAMLDVLVSMQETEPECYTGHIITGM

13041 MKWAMSLLLNNPETFVKAQAEIDSHLGRSRLIDELDIAELPYLHGIIKATLRMYPAASLL 13220

13221 VPHESSDECTVGGFRVPSGTMLLVNM*AIQTDPMLCMGREPTNFKPERFQGPEGHRNGFM 13400

      GAGMRDCPREGLAMRV 13461

13459 VDMSEGTELTIPQAQSLVAKCRPRPSMVSLLSQL 13560

 

$$$$

 

>CYP81B29 CAAP02007087.1d exon 1 only, runs off end, 65% to CAN68429.1

100% to CAO61439.1

14246 MEMLYLCIPIFLALYVFTWHFLHKLHNLPPSPFPILPLIGHLYLLKKPLHRNLSKISDR 14422

14423 HGPILFLRFGYRPVLIVSSHSAAEECFTKNDIIFANRPRLIAGKHLGYNYTAIGTAPYGD 14602

14603 HWRNLRRISSFEILSSNRLQMLSGIRSDEVRTLICRLVKNQNQVVEMKSAFFELTL 14770

14771 NVIMRMIGGKRYYGENVGEVEEARKFREMVSKAFRLAGTNMVDFLPILGWLGLKGTEKR 14947

14948 LMELQKMRDSFIQNLIEEHRIKGSKCERRPKTMIEVLLSLQETEPENYTDEIIGGLML (0) 15121

 

>CYP81B29 CAAP02000911.1 exon 2 only, N-term upstream of contig

84% to CAAP02007087.1b

100% to CAO61439.1

1138 SMLTAGTDTSAGTMEWALSLLLNSPEVLKKAQQEIDVHLGHDRLIEEVDLAQLPYLRSII 1317

1318 KETLRMYPAGPLLIPHESSKECFVGGFRIPPGTMLLVNVWAIHNDPKIWAEPTKFKPERF 1497

1498 EGEEGERDGLRFLPFGSGRRGCPGEGLAIRMVGLAMGSLIQCFDWERVDQQMVDMTEGHG 1677

1678 LSIPKAQPLLAKCRPRPTMVNLLSQL* 1758

 

>CYP81B29 gi|157335609|emb|CAO61439.1| unnamed protein product [Vitis vinifera]

GSVIVP00014921001

chr18 7839000 to 7842779 on strand -

MEMLYLCIPIFLALYVFTWHFLHKLHNLPPSPFPILPLIGHLYLLKKPLHRNLSKISDRHGPILFLRFGY

RPVLIVSSHSAAEECFTKNDIIFANRPRLIAGKHLGYNYTAIGTAPYGDHWRNLRRISSFEILSSNRLQM

LSGIRSDEVRTLICRLVKNQNQVVEMKSAFFELTLNVIMRMIGGKRYYGENVGEVEEARKFREMVSKAFR

LAGTNMVDFLPILGWLGLKGTEKRLMELQKMRDSFIQNLIEEHRIKGSKCERRPKTMIEVLLSLQETEPE

NYTDEIIGGLMLSMLTAGTDTSAGTMEWALSLLLNSPEVLKKAQQEIDVHLGHDRLIEEVDLAQLPYLRS

IIKETLRMYPAGPLLIPHESSKECFVGGFRIPPGTMLLVNVWAIHNDPKIWAEPTKFKPERFEGEEGERD

GLRFLPFGSGRRGCPGEGLAIRMVGLAMGSLIQCFDWERVDQQMVDMTEGHGLSIPKAQPLLAKCRPRPT

MVNLLSQL

 

$$$$

 

>CYP81B30 gi|147781643|emb|CAN78219.1| 46% to CYP81D8

CAAP02000568.1a 87549-85869 (-) strand 7 aa diffs

GSVIVP00000371001

chr9 6419583 to 6420527 on strand +

6420649 to 6421260 (6 aa diffs)

METLYMCIPLCLALYLFTKHLLHKLHNLPPTPFLSLPILGHLYLLKKPLHRTLAGISSRYGPIVFLRLGS

RPSLIVSSPSVAEECLTKNDIVFANRPQLIAGKYIGYNYTSLIWANYGDHWRNLRRISTLEILSSSCIQM

LSGIRADEVRLLVLWLLEHENQTVNMKAMLFEITTNVMMRMIAGRRYYGGSMAEAEAEETVKFREIMADT

IRLGDMSNIGDYLPMLRWLGVKGKEEGLRELQRKRDRFMQSLIEEHRTRIAKDKESSSSCCNGDDGEKKK

KKKTMIEVMLSLQEKEPDYYTDLIIRGLMLX LLGAGTDTTATTIEWTLSLLLNNPHALKKAQMEIDNHLG

DNHLIQESDLNQLPYLHCIIKESQRMYPVGPIIPHESSGECTVGGYRIPHGTMLLVNVWAIQNDPRVWEE

PRKFMPERFEGMELEKHGFRLMPFGSGRRGCPGEGLAVRIVGLVLGSLIQCFDWESVGEGMVDMSEGTGL

TLPKAQPLLVRCRPRPAFVDLLSKA

 

>CYP81B31 CAAP02000568.1b

GSVIVP00000372001

chr9 6414208 to 6415863 on strand +

92924 METMYMRIPLCLALYLFTRHLLHKLHNLPPTPFLSFPIIGHLYLLKKPLHRTLAGISSRY 92745

92744 GPIVFLRLGSRPSLLVSSPSVAEVCLNKNDIVFANRPQLIAGKYIGYNYTSLAWANYGDH 92565

92564 WRNLRRISSLEILSSSRIQMLSGIRADEVRLLVRWLLENENQTVNVKAMLFEITTNVMMR 92385

92384 MIAGKRYYGGSMAEAEETVKFREIIADTLRLGDTTNVGDYLPMLRWLGVKGMEKGLRE 92211

92210 LQRKRDRFMQSLIEEHRTRMAEEKESYSSCSNGDDGEKKKKKTMIEVMLSLQEKEPDYY 92034

92033 TDQIIRGLML (0) 92004

91880 VLLGAGTDTTATTIEWTLSLLLNNPHALKKAQMEIDNHLGNNHLIQESDLNQLPYLHCIIK 91704

91703 ESQRMYPAGPIIPHESSGECTVGGYRIPHGTMLLVNLWAIQNDPRVWEEPRKFMPERFEG 91524

91523 IELEKHGFRLMPFGSGRRGCPGEGLALRMVGLVLGSLIQCFDWESVGEGMVDMSEGTGLT 91344

91343 LPKAQPLLVRCRPRPAFVDLLSKA* 91269

 

>CYP81B32 CAAP02000568.1c 96% to CAAP02005641.1b

GSVIVP00000374001 and GSVIVP00000375001 models are wrong

chr9 6384012 to 6385739 on strand -

121153 MEATYLCLPFFLALYLSTRHWLQKLKNLPPSPFLTFPIIGHLYLLKKPLHRTLADLSARYGPIVF 121347

121348 LRLGPRQTLLVSSPSAAEECLSKNDVVFANRPQLLSGKYIGYNYTSMAWANYGDHWRNLR 121527

121528 RISTLEILSTSRIQMLSGIRSDEVRSLLLRLLENGAETVDMKTAFFEMTMNVMMRMI 121698

121699 AGKRYYGGNVVEVEETAKFQEIIEDTFRLGDTTNIGDYLPVLRWLGVKGKEKGLRELQRK 121878

121879 RDRFMQGLIEEHRTRMAKESYSSSSCRVGEKKTMIEVLLSLQEKEAEYYTDEIIRGLML (0) 122055

122260 ALLGAGTDTTSATLEWAMSLLLNNPEVLKKAQMEMDNQLGPNHLIEESDLSQLPYLHCIIRETQR 122454

122455 MYPAGPIVPHESSKECMVGGYHIPRGTMLLVNIWGIQNDPKVWKEPRKFLPERFEVGLE 122631

122632 GEGHGLRLMPFGSGRRGCPGEGLAIRMVGLVLGSLIQCFDWERVGEGKVDMSEGIGLTL 122808

122809 PKAQPLLAKCRPRPALINVLSQI* 122880

 

>CYP81B33P CAAP02005641.1a pseudogene 67% to CAN78219.1

chr9 6369451 to 6374966 on strand -

1733 MQAL*SAGTDTTSTTLEWAMSLSLRNPHILKKAQMEMDNQLG 1858

1864 LEESDLNQLPYLQSIIKETERMYPAGPIIPHESSDDCVAGGFRIPCGTMLLLNICAVQN 2040

2041 DPKVWEEPRKFNPERFEGLEWEKHGLKLMPFGSGRKGVTRGGFS 2172

7162 QICIQSGTYQMHGLGLGSLIQCFDWEMVG 7248

 

>CYP81B34 CAAP02005641.1b 76% to CAN78219.1

chr9 6366394 to 6368121 on strand -

8578 MEAIYLCLPFFLALYLFTRHWLQKLKNLPPSPFLTFPIIGHLYLLKKPLHRTLAD 8742

8743 LSARYGPIVFLRLGSRQTLLVSSPSAAEECLSKNDVVFANRPQLLAGKYIGYNYTSMA 8916

8917 WANYGDHWRNLRRISALEILSTSRIQMLSGIRSDEVRSLLLRLLENGTETVDMKTT 9084

9085 FFEVTMNVMMRMIAGKRYYGGNVVEVEETAKFQEIIEDTFRLGDTTNIGDYLPVLRWLGV 9264

9265 KGKEKGLRELQRKRDRFMQGLIEEHRTRMAKESYSSSSCRAGEKKKTMIEVLLSLQEKEA 9444

9445 EYYTDEIIRGLML (0)

9688  ALLGAGIDTTSATLEWAMSLLLNNPEVLKKAQMEMDNQLGPNHLIEESDLSQLPYLHCII 9867

9868  RETQRMYPAGPIVPHESSKECMVGGYHIPRGTMLLVNIWGIQNDPEVWKEPRKFLPERF 10044

10045 EVGLEGEGHGLRLMPFGSGRRGCPGEGLAIRMVGLVLGSLIQCFDWKRVGEGKVDMSEGI 10224

10225 GLTLPRAQPLLAKCRPRPALINLLSQI* 10308

 

$$$$

 

>CYP81B35P CAAP02005641.1c pseudogene 2 aa diffs with CAN69662.1

chr9 6360545 to 6360830 on strand -

15869 MQAL*SAGTDTTSTTLEWAMSLSLRNPFVL 15958

15963 KVWEEPRKFNPERFEGLEWEKHGLKLMPFGSGRRGCPGEVLVIRMLGLGLGSLIQCFDWE 16142

16143 MVGGG 16157

16157 MVDMSEGTGLTLPRAHPLLVKCSTR 16231

 

>CYP81B35P gi|147838229|emb|CAN69662.1| 55% to 81A6, 75% to 69662

MYPAGPIIPHESSDDCVAGGFRIPRGTMLLLXJWAVQNDPKVWEEPRKFNPERFEGLEWXKXGLKLMPFG

SGRRGXPGEALAIRMLGLGLGSLIXCFDWEMVGGGMVDMSEGTGLTLPRAXPLLVKCSTRPALVNLLSQI

 

$$$$

 

CYP81V subfamily (13 genes) [11 pseudogenes]

 

>CYP81V1 CAAP02000157.1a 95% to CAAP02002521.1i, 55% to 81D8

7313 MEITWLSTSLCLLFLSFAFNIFLQRRRIHPHLPPSPPAIPILGHLHLLLKPPIHRQLQS 7137

7136 LSKKYGPIFSLRFGSSPVVIISSPSTVEECFTKNDIIFANRPRWLIGKYIGYNYTTIASA 6957

6956 SYGEHWRNLRRLSALEIFSSNRLNMFLGTRRDEIKILLHRLSQNSRDNFARVELRPMFTE 6777

6776 LTCNIIMRMVTGKRYYGEDVDSEEAKRFQKIMRGIFELAGASNPGDFLPLLRWVDFGGY 6600

6599 EKKLVKLNREKDVIFQGLIDEHRSPDQGLVNKNSMIDHLLSLQKSEPEYYTDEIIKGLAL (0) 6420

6296 ILTFAGTDTTATTIEWAMSLLLNHPDVLKKARAELDTHVGKDRLMEESDFPKLQYLRSIIS 6114

6113 ETLRLFPATPLLIPHISSDNCQIGGYDIPRGTILLVNAWAIHRDPKSWKDATSFKPERFE 5934

5933 NGESEAYKLLPFGFGRRACPGAGLANRVIGLTLGLLIQCYEWERVSEKEVDMAEGKGVT 5757

5756 MPKLEPLEAMCKARAIIRKVL* 5691

 

>CYP81V2 CAAP02000157.1b = CAN72520.1 55% to 81D2, 67% to CAN60309.1

16492 MEVSSLHTPLSLLFVFLLLAFGIFFPRRRRYGNLPPSPPALPIIGHLHLLKQPVHRSLQR 16313

16312 LSQKYGPIFSLRFGSQLAVIVSSPSAVEECFTKNDVVLADRPRLASGKYVGFNYTTITAA 16133

16132 SYGEHWRNLRRVSALEIFSSNRLNMFLGIRRDEVKRLLLRLARDSREGFAKVEMRPMLTE 15953

15952 LNFNIITRMVAGKRYYGEDLEYAEAKRFRDIISEIFELLGALSNPADFLPILRWIGFGN 15776

15775 HEKKLKKITRETKAILQGLIDEHRSGNDKGSVDNNSMIDHLLSLQKTEPEYYTDDIIKGLVL (0) 15590

15404 VLILGGTDTSAATMEWAMTLLLNHPDVLEKAKVELDMHVGKDRLIEES 15261

15260 DLPKLRYLQSIISETMRAFPVGPLLVPHMSSDDCQIGGFDIPRGTLLLVNAWALHRDPQV 15081

15080 WEDPTSFKPERFENGEREDYKLVPFGIGRRACPGAGLAQRVVGLALGSLIQCYDWKKISN 14901

14900 TAIDTTEGKGLSMPKLEPLEAMCKAREIINQVH* 14799

 

>CYP81V3 CAAP02000157.1c 59% to CAN60309.1

20585 MLYTALFIFCLILLKVFVRGRRRHRNLPPSPPPLPIIGHLHLVRGGGLHRTFRSLSEKY 20409

20408 GPILFLQLGWQPTVIVSSPSVVEECFTKNDIVLANRPLFLLGKYLGYNFTALAWAGYGDH 20229

20228 WRNLRRLSTLEVFSPSRLDLIVGIQKDEVKCLLQRLSPDSRDGFGKVELKSKFSELTFNI 20049

20048 ITRAVAGKRFYGEDVDAEVSLNFRNLINEIFQSAEATNPADLLPIFRWIDYQGFERKMI 19872

19871 EVSGKSDVFMQGLIDEHRSDRSSLESRNTMIDHLLSLQKSQPEYYTDDIIKGLIM (0) 19707

19615 VLILAGTETSATTTEWAMALLLNHPNSLKKAIAEIDDRVGQERIMDETDLPNLPYLQNIVR 19433

19432 ETLRLYPPGPLLVPHVSSEECEIGGYHIPKHTMVMVNAWAIQRDPKLWPDATSFRPERFE 19253

19252 TGKAETYKFLPYGVGRRACPGASMANRLIGLTLGTLIQCYSWERVSDKEVDMSGAEGLT 19076

19075 MPKKTPLEAMCKPRDVLKKVFEGYKTVL* 18989

 

>CYP81V4P CAAP02002521.1a pseudogene 95% to CAN60309.1

7644 MEDAWLYTSLSVVFLLIAFKLWLQSKRTHGSLPPGPPAVPILGHLHLLKGPFHRALHHLS 7465

7464 ETYG 7453

7451 PIFSLRFGSQLVVVISSSSAVEECFTKNDVIFANRPSLMVSEYLEYKCTSLVSSPYGEHW 7272

7271 RNLRRLCALEIFSSNRLNMFLGIRKDELKHLLRRLGRDSRDNFAKVELKSLFSELTFNII 7092

7091 TRMVAGKRYYGEGADFEEAKHFREIIRKSFLLSAASNPGDFLPILRWMDYGGYD 6930

6931 MANISRELDAILQGLIDEHRSNSKKGLMGNNTMIDHLLSLQKSEPEYYTDQIIKGVTM 6758

6675 NLVFAGTDTAAATMEWAMSLLLNHPDVLKKAKVELDTCVGQERLLEEADLPKLHYLQNII 6496

6495 SETFRLCPPAPLWLPHMSSENCQLGGFDIPRDTMLLVNSWTLHRDPKLWDDPTSFKPERF 6316

6315 EGGERGETYKLLPFGTGRRACPGSGLANKVVGLTLGSLIQCYEWERISEKKVDMMEGKGL 6136

6135 TMPKMEPLEAMCRAYEIVKKVLQDEETMVN* 6043

 

>CYP81V5P CAAP02002521.1b pseudogene 90% to CAN60308.1

13625 MEVTWLNTYFSLLFLWVAFKLLLRRRRLIHPHLPPSLPAI 13506

12381 LPPSPPAFPVLGHDLHLVKLPFHRALQTLSEKYGPIFALRLGSRPVVVVSSPSAVEECFT 12202

12201 KNDIVLANRPHFLSGKHLGYNHTTVDALPYGEDWRNLRRLCSIEILSSNRLNMFLGIRSD 12022

12021 EAKLLLRRLSQDSRDKFAKVELKSLFSKLTFNTITRTIAGKRYHGEEVGMEEVKQFREII 11842

11841 GEIFELGGTSNPMD*LPILEWVDYGGYKKKLMKLSRQTEAMLQCLIDERRNSKKRGLEDK 11662

11661 STTIDHLLSLQKSEPEYYTDEIIKGLIL (0) 11578

10970 VLILGGSERTAVTIEWAMALLLNHPDALNKAREEIDIHL 10854

10854 GQGRLMEESDLSKLGYL 10804

 9986 QNVISETHRLYPAVLLLLPHMTSSHCQVGGFDIPKGTM 9873

 9872 LLINAWAIHRDPKAWDNPTSFKPKRFNSEENNNYKLFPFGLGRRACPGSGLANKVMGLTL 9693

 9692 GLLIQCYE 9669

 9664 WKRVSEKEVDMAEGLGLTMPKVVPLETMCKARDIIKMVV 9548

 

>CYP81V6 gi|147790129|emb|CAN76836.1| 70% to CAN60309.1

CAAP02002521.1c 20516-18540 100% match

MEATWLGTSLSLLFLLFSFSVFLQRRTHPRLPPSPPAIPILGHLHLLLKQPIHWHLQTLSQKYGPIFSLR

FGSRLLVIISSPSTVEECFTKNDIIFANRPCFLFGKHIDYNYSTIASAPYGEHWRNLRRLSTLEIFSSNR

LNMFLGIRRDEIKLLLSQLSRNSRDHFARVELRPMFIELTCNIIMRMVAGKRYYGEAVDFEEAKHFREVM

RGIFELAGARNPGNFLLSLRWVYFGGYEKELVKINRMKEVIFQGLIDEHRSPTGLVNKNTMIDHLLSMQK

SEPEYYTDEIIKGLALDLILAGTDTTATTIEWAMSLLLNHPDVLKKARVELDALVGKDRLMEESDFPKLQ

YLQNIISETLRLFPAAPLLVPHMSSENSQIGGFDIPRDTILLANVWAIHRDPKLWEDATSVKPERFENIG

GTETYKLLPFGLGRRACPGVGLANRVVGLALGSLIQCYEWERVSEKEVDMAEGKGLTMPKMEPLEAMCKA

RAIIRKVF

 

>CYP81V7P CAAP02002521.1d pseudogene 90% to CAAP02006584.1c

22568 LSAVEIFSPNRLNIFIGTRRDEIKILLHRLSQNSRDNFARVELRLMFTELTCNIIMRMVA 22389

22388 GKRYYGEDVDFEEAKHFHEVMRGFFEL

22299 SNPGEFFPLLRWVDFGG*EKKLVKIKTKKEVIFRGLIDEHRSPSRALVNKNSMINHLLS 22123

22122 MQKSEPEYYTDEIIKGHSL

21617 XXXXXGTETTATTIE*AMSLLLNHPDVMKKARVVELDTHDRKDCLMEESDFPKLQYLQSIISETL 21438

21437 RLFPAAPVLVPHMSSDNCQIGGF 21369

 

>CYP81V8P CAAP02002521.1e pseudogene 80% to CAN76836.1

24456 MEITWLSTSL*ILFLLFAFNIFLQRRRIHPHLPPSPPAIPIPGHLHLLLKPPF 24298

24298 HWQLQNLSKKYGPSFPFASDPCSPVVIISSPSIVEECFTK 24179

24178 NDIIFANRPRWLIGKYIGYNYTTIASASYGEHWRNLRRLSALEIFSSNRLNMFLGIRRDE 23999

23998 IKILLHRLSQNSRDNFARVELRPMFTELTCNILMRMVTGKRYYGEDVDSEEAKHFQKIMR 23819

23818 GIFELARASNPGDFLPLLRWVDFGGYEKKLVKLNREKDVIFQGLINEHRSPDQGLVNKN 23642

23641 MIDHLISLQKSEPEYYTDEIIKGLAL 23564

23112 ILIFAGTDTTATTIEWAMSLLLNHPDVLKKARAELDTHAGKDRLMEESDFP 22960

22959 KLQYLRSIISETLRLFPATPLLIPHISSDNCQIGGYDIPRGTILLVNAWAIHRDPKSWKD 22780

22779 ATSFKPERFENEESEAYKLLPFGLGRRACPGAGLANRVIGLTLGLLIQCYEWERVSEKEV 22600

22599 DMAEGKGITMPK 22564

 

>CYP81V9 CAAP02002521.1f 95% to CAAP02006584.1a

31760 MDVSSLHTSLSLLFVFLLLAFGIFFPRRRRYGNLPPSPPAVPIIGHLHLLKQPVHRSLQL 31581

31580 LSQKYGPIFSLRLGSQLAVIVSSPSAVEECFTKNDVVLANRPRFASGKYVGYNYTTIGAA 31401

31400 SYGDHWRNLRRLSALEIFSSNRLNMFLGIRRDEVKRLLLRLARDSREGFAKVEMRPMLTE 31221

31220 LTFNIITRMVAGKRYYGEDVEYTEAKRFREIISQLFILGGASSNPADFLPILRWIGLGYH 31041

31040 EKKLKKIVRETRAILQGLIDEHRSGNDKGSVDNNSMIDHLLSLQKTEPEYYTDDIIKGLVQ (0) 30858

30673 VLILAGTDTSAATMEWAMTLLLNHPDVLEKAKAELDMHVGKDRLIEES 30530

30529 DLPKLRYLRSIISETLRVFPVAPLLLPHMSSDDCQIGGFDIPRGTLLLVNVWALHRDPQV 30350

30349 WEDPTSFKPERFENGERENYKLVPFGIGRRACPGAGLAQRVVGLALGSLIQCYDWKKISN 30170

30169 TAIDTIEGKGLTMPKLQPLEAMCKAREIINEVHLN* 30062

 

>CYP81V10P CAAP02002521.1g pseudogene 88% to CAN76836.1

36653 MEATWLSTSLSLLFLLFAF 36581

36582 QRRTHPRLPPSPPAIPILSHLHLLLKQPIHRHLQTLSQKYGP 36457

36456 IFSLRFGSRLLVIISSPSTVEECFTKSDIIFANRP

36367 FCQPPLLLVRQDYNYTTIASAPYGEHWSNLRRRLSTLEIFSTNRLNMFLGIRWDEIKL 36194

36193 LLRQLSLNSRDHFARVELRPMFIELTCNIIMRMVAGNRYYGEAVDFEEAKLFREVMRGIF 36014

36013 ELA 36005

36003 EHWRTLLRLVSTLEVFSSNHLRMF*GNRNDEISHL 35899

35899 IHRLSWDSRGNFVKVELLLIFNIIMRMIAGKRYYGEDLHVEEARHFQDIIGKIFE 35735

35734 LGATSNPIDF 35705

35706 FLPLLRWIDYGGYERKM 35656

35648 NSIVLDAILQGFD**AYNKNGLVGNNTMIDHLLSLQKSELEYYTDDIIKACGM 35490

35499 LWNDLESFKL*RFENRERDTSKLLLFGVGRRACPGID 35389

35388 LPNHSVDLAVGSLIPEF 35338

 

>CYP81V11P CAAP02002521.1h pseudogene 90% to CAAP02006584.1c

40113 IFSLRFGSRLLVIISSPCTVEECFTKNDIIFANRPIFLFGKYIGYNNTIVTSAPYGEHWR 39934

39933 NLRRLSALEIFSPNRLNMFIGTRRDEIKILLHRLSQNSRDNFARVELRLMFTELTCNIIM 39754

39753 RMVAGKRYYGEDVDFEEAKHFHEVMRGVFELA 39658

39652 SNPGEFFPLLRWVDFGG*EKKLVKIKTKKEVIFRGLIDEHRSPSRALVNKN 39500

39499 SMINHLLSMQKSEPEYYTDEIIKG 39428

37763 GTEITATTIEWAMSLLLNHPDVMKKTRVVELDTHDRKDCLMEESDFPKLQYLQSIISETL 37584

37583 RLFPAAPVLVPHMSSDNCQIGGF 37515

 

>CYP81V12P CAAP02002521.1i 82% to CAN76836.1

46598 MEITWLSTSLCLLFLLFAFNIFLQRRRIHPHLPPSPPAIPILGHLHLLLKPPIHRQLQNL 46419

46418 SKKYGPIFSLRFGSSPVVIISSPSTVEECFTKNDIIFANRPRLLIGKYIGYNYTTIASAS 46239

46238 YDEHWRNLHRLSALEIFSSNRLNMFLGIRRDEIKILLHRLSQKSRDNFARVELRPMFTEL 46059

46058 TCYILMRMVAGKRYYGEAVDSEEGKHFQKIMRGIFELAGASNPGDFLPLLRWVDFGGYEK 45879

45879 KLVKLNREKDVIFQGLIKEHRSPDQGLVNKNSMIDHLLSLQKSEPEYYTDEII 45721

45720 KGLAL (0) 45706

45403 ILIFAGTDTTATTIEWAMSLLLNHPDVLKKARAELDTHVGKDRLTEESDFP 45251

45250 KLQYLRSIISETLRLFPATPLLMPHISSDNCQIGGFDIPRGTILLVNAWAIHRDPKSWKD 45071

45070 PTSFKPERFENEEGEAYKLLPFGLGRRACPGAGLANRVIGLTLGLLIQCYELERASEKE 44894

44893 VDMAEGKGVTMPKLEPLEAMCKARAIIRKVL 44801

 

>CYP81V13 CAAP02005419.1a = second gene 2308-530, 1 aa diff, revised N-term

MEARWLYSSLSFLFFALAVKFLLQ

RNKGKRLNLPPSPPGFPIFGHLHLLKGPLHRTLHRLSERHGPIVSLRFGSRPVI

VVSSPSAVEECFTKNDVIFANRPKFVMGKYIGYDYTVVSLAPYGDHWRNLRRLSAVEIFASNRLNLFLGI

RRDEIKQLLLRLSRNSVENFAKVELKSMFSELLLNITMRMVAGKRFYGDNMKDVEEAREFREISKEILEF

SGTSNPGDFLPILQWIDYQGYNKRALRLGKKMDVFLQGLLDECRSNKRSDLENRNTMIDHLLSLQESEPE

YYTDEIIKGLIV (0)

AMQVGGADTTAVTIEWAMSLLLNHPEVLKKARDELDTHIGHDCLIDETDLPKLQYLQS

IISESLRLFPSTPLLVPHFSTEDCKLGGFDVPGGTMLLVNAWALHRDPKLWNDPTSFKPERFETGESETY

KLLPFGVGRRACPGIGLANRVMGLTLGSLIQCFDWKRVDEKEIDMAEGQGLTMPKVEPLEAMCKTRQVMN

NVSSKILNSV* 530

 

>CYP81V14 gi|147778583|emb|CAN60309.1| 2 genes 55% to CYP81D subfam.

CAAP02005419.1b = first gene 4655-3047, 1 aa diff

95% to CAAP02002521.1a

MEDAWLYTSLSVVFLLFAFKLLLQSKRGHGNLPPSPPAVPILGHLHLLKGPFHRALHHLSETYGPIFSLR

FGSQLVVVISSSSAVEECFTKNDVIFANRPRLMVSEYLGYKYTSIVSSPYGEHWRNLRRLCALEIFSSNR

LNMFLGIRKDEIKHLLRRLGGDSRDNFAKVELKSLFSELTFNIITRMVAGKRYYGEGSDFEEAKHFREII

RKSFLLSAASNPGDFLPILRWMDYGGYEKKMAKNSRELDVILQGLIDEHRSNSKKGLMGNNTMIDHLLSL

QKSEPEYYTDQIIKGVTMNLVFAGTDTAAVTMEWAMSLLLNHPDVLKKAKVELDTCVGQERLLEEADLPK

LHYLQNIISETFRLCPPAPLWLPHMSSANCQLGGFDIPRDAMLLVNSWTLHRDPKLWDDPTSFKPERFEG

GERGETYKLLPFGTGRRACPGSGLANKVVGLTLGSLIQCYEWERISEKKVDMMEGKGLTMPKMEPLEAMC

SAYEILKNVLQDE*

 

>CYP81V15P CAAP02006584.1c pseudogene 79% to CAN76836.1

13600 IFSLRFGSHLVVIISFPCTVEECFTKIGCYHIIFANRPIFLFGKYI 13463

13128 GYNNTIITSAPYGEHWRNLRRLSALEIFSPNRLNMFIGTRRDEIKILLHRLSQNSRDN 12955

12954 FARVELRLMFTELTCNIIMRMVAGKRYYGEDVDFEEAKHFHEVMRGVFELA 12802

12808 IGWVSNPGEFFPLLRWVDFGG*EKKLVKIKTKKELI 12701

12695 GLVEKHRSPNRGLVNKNSMINLLLSMQKSEPEYYTDEIIKGHSL 12564

12165 ILILVGTETTATTIEWAMSLLLNHPDVMKKARVVELDTHVRKDRLMEESDFPKLHYLQSII 11983

11982 SETLRLFPAAPVLVPRMSSDNCQIGGFNIQRDTILLVNVWAIHRDPKLWKDATSFKPERF 11803

11802 ENGESETYKLLPFGLRKRACPGVGLANRV 11716

11711 LGLTLGSLIQCYEWERANEKEVDMAEGRGITMPKLEPLE

11110 SMPKLEPLEAMYKAGCIINK 11051

 

>CYP81V16P CAAP02006584.1b pseudogene similar to CAN76836.1

10171 SLSLLFLLFAFSVFLQRRRTHPRLPPTPPAIPILGHLHLLLKQPIHRHLQTLSQ 10010

 9992 RLSRDSRGNFVKVELLLIFNIIMRMIAGKRYYGEDV 9885

 9880 LGATSNPTDF 9851

      NKRGLVDNNTMIDHLLSLQKSEPEYYTDDIIKACGM 9633

      LWNDLESFKL*RFENRERDISKLLLFGVGRRACPGIG 9532

 

>CYP81V17 CAAP02006584.1a 91% to CAN72520.1

6509 MDVSSLHTSLSLLFVFLLLAFGIFFPRRRRYGNLPPSPPAVPIIGHLHLLKQPVHRSLQL 6330

6329 LSQKYGPIFSLRFGSQLAVIVSSPSAVEECFTKNDVVLANRPRFASGKYMGYNYTTVAAA 6150

6149 SYGEHWRNLRRLSALEIFSSNRLNMFLGIRRDEVKRLLLRLARDSREGFAKVELRPMLTE 5970

5969 LTFNIITRMVAGKRYYGEDVEYTEAKRFREIISQLFVLVGASSNPADFLPILRWTGLGYH 5790

5789 EKKLKNIMRETKAIMQGLIDEHRSGNDKGSVDNNSMIDHLLSLQKTEPEYYTDDIIKGLVQ (0) 5607

5422 VLILAGTDTSASTMEWAMTLLLNHPDVLEKAKAELDMHVGKDRLIEES 5279

5278 DLPKLRYLQSIISETLRVFPVTPLLLPHMSSDDCQIGGFDIPRGTLLLVNAWALHRDPQV 5099

5098 WEDPTSFKPERFENGERENYKLVPFGIGRRACPGAGLAQRVVGLALGSLIQCYDWKKISN 4919

4918 TAIDTIEGKGLTMPKLQPLEAMCKAREIINEVHLN* 4811

 

>CYP81V18 gi|147778582|emb|CAN60308.1| 67% to CAN60309.1

CAAP02009879.1 2615-467 (-) strand

MEDIWLYTSLTVVFLLFAFKVLLHRRRNHGNLPPSPPAFPVLGHLHLVKLPFHRALRTLSEKYGPIFSLR

LGSRPVVVVSSPCAVEECFTKNDIVLANRPHFLSGKHLGYNHTTVDALPYGEDWRNLRRLCSIQILSSNR

LNMFLGIRSDEVKLLLRRLSQDSRDKFAKVELKSLFSKLTFNTITRTIAGKRYHGEEVGMEEVKQFXEII

GEIFELGGTSNPMDYLPILEWVDYGGYKKKLMKLGRQTEAMLQCLIDERRNSKNRGLEDKSTTIDHLLSL

QKSEPEYYTDEIIKGLILVLILAGSESTAVTIEWAMALLLNHPDALNKVREEIDIHVGQGRLMEESDLSK

LGYLQNVISXTHRLXPAAPLLLPHMTSSHCQXGGFDIPKGTMLLINAWAIHRDPKAWDNPTSFKPERFNS

EENNNYKLFPFGLGXRACPGSGLANKVMGLTLGLLIQCYEWKRVSXKEVDMAEGLGLTMPKAVPLEAMCK

ARDIIKMVV

 

>CYP81V19 gi|147782357|emb|CAN70574.1| 67% to CAN60309.1

MEDIWLYTSLTLVFLLFAFKVLLHRRRNHGNLPPSPPAFPVLGHLHLMKLPFHRALQTLSEKYGPIFALR

LGSRPVVVVSSPSAVEECFTKNDIVLANRPHFLTGKHLCYNHTTVEALPYGEDWRNLRRLCSIEIFSSNR

LNMFLGIRSDEVKLLLRRLSQDSRDKFAKVELKPLFSNLTFNTMTRTIAGKRYHGEEVGTEEIKQFREMI

GEIFELAGNSNPMDYLPILEWVDYGGYKKKLMKINRRAEAMLQYLIDEHRNSKKRGLEDHTTIDHLLSLQ

KSEPEYYNDEIIKGLVLILILGGSESTAVTIEWAMALLLNHPDALNKVREEIDIHVGQGRLMEESDLSKL

GYLQNVISETLRLCPAAPLLLPHMTSSHCQVGGFDIPKGTMLITNAWAIHRDPKAWDNPTSFKPERFNSG

ENNNYKLFPFGLGRRACPGSGLANKVIGLTLGLLIQCYEWKRVSEKEVDMAKGLGLTMPKAIPLEAMCKA

RDIIKMVV

 

>CYP81V20 gi|147821814|emb|CAN60018.1| 67% to CAN60309.1

97% to CYP81V19

CAAP02001991.1 38500-36494 (-) strand 100% match 

MEDIWLYTSLTVVFLLFAFKVLLHRRRNHGNLPPSPPAFPVLGHLHLMKLPFHRALQTLSEKYGPIFALR

LGSRPVVVVSSPSAVEECFTKNDIVLANRPYFLTGKHLCYNHTTVEALPYGEDWRNLRRLCSIEILSSNR

LNMFLGIRSDEVKLLLRRLSQDSRDKFAKVELKPLFSNLTFNTMTRTIAGKRYHGEEVGTEEVKQFREII

GEIFELAGNSNPMDYLPILEWVDYGGYKKKLMKISRRTEAMLQYLIDEHRNSKKRGLEDSTTIDHLLSLQ

KSEPEYYTDEIIKGLVLILILGGSESTAVTIEWAMALLLNHPDALNKVREEINIHVGQGRLMEESDLSKL

GYLQNVISETLRLCPAAPLLLPHMTSSHCQVGGFDIPKGTMLITNAWAIHRDPKAWDNPTSFKPERFNSG

ENNNYKLFPFGLGRRACPGSGLANRVIGLTLGLLIQCYEWKRVSEKEVDLAERLGLTMPKAIPLEAMCKA

RDIIKMVV

 

$$$$

 

>CYP81V21 gi|147766556|emb|CAN69522.1| 68% to CAN60309.1, 54% to 81D8

CAAP02015449.1 1-1261 C-term part

CAAP02009664.1 1516-3237

Nearly identical to CAO63093.1 = CU460086.1

87% to CAAP02000157.1b

MEVSSLHSYLSLLFVFLLLVIGILFPRRRYGNLPRSPPAVPIIGHLHLLKQPVHRYLQRLSLKYGPIFSL

RFGSQLVVIVSSPSAVEECFTKNDVVLANRPRLASGKYLGFNYTSMASASYGEHWRNLRRLSALEIFSSN

RLNMFLGIRRDEVKLLLLRLARDSRQGFAKVELRPMLTELTFNIITRMVAGKRYYGEGVEFEEAKRFREI

ISEVFKLNGASSNPTDFLPILRWIGFGDHEKK

LKKTRRETQVILQGLIDEHRSGNDR

GSVDNNSMIDHLLSLQKTEPEYYTDDIIKGLVL

VLILAGTDTSAATVEWAMTLLLNHPDVLKKAKAELDIHVGKDRLIEESDLPKLRYLQSIISETLRLFPVAPLLVP

HMSSDDCQIGGFDIPGGTFLLINAWAIHRDPQVWEDPTSFIPERFENGERENYKLLPFGIGRRACPGAGL

ANRVVGLALGSLIQCYDWKRISKTTIDTTEGXGLTMPKLEPLEAMCKACEIIKTGSLELENNI

 

>CYP81V21 gi|157331084|emb|CAO63093.1|

PIIGHLHLL

KQPVHRYLQRLSLKYGPIFSLRFGSQLVVIVSSPSAVEECFTKNDVVLANRPRLASGKYLGFNYTSMASA

SYGEHWRNLRRLSALEIFSSNRLNMFLGIRRDEVKLLLLRLARDSRQGFAKVELRPMLTELTFNIITRMV

AGKRYYGEGAKRFREIISEVFKLNGASSNPTDFLPILRWI

GFGDHEKKDTGDLAGIEGFKPEYYTDDIIK

GLVLVLILAGTDTSAATVEWAMTLLLNHPDVLKKAKAELDIHVGKDRLIEESDLPKLRYLQSIISETLRL

FPVAPLLVPHMSSDDCQIGGFDIPGGTFLLINAWAIHRDPQVWEDPTSFIPERFENGQRENYKLLPFGIG

RRACPGAGLAHRVVGLALGSLIQCYDWKRISETTIDTTEGKGLTMPKLEPLEAMCKACEIIKTGSLELEN

NI

 

$$$$

 

>CYP81V21-de1b CAAP02015449.1 pseudogene 67% to CAN69522.1

1415 NLPPSSPAIPIIG

1456 LHLLKHPRRRYLQALPLKYDPVFSVRLGSRLVVIVSSSSW

1575 VEECFIEDDIIFANRP 1622

WRTLVRTLEVFSSN 1692

 

>CYP81V21-de1b CAAP02009664.1 same as CAAP02015449.1 pseudogene

     NLPPSSPAIPIIG 3429

3432 LHLLKHPRHRYLQALPLKYDPVFSLRLGSRLVVIVSSSS

3551 VEECFIEDDIIFANRPRLA 3607

3604 SSLGTAPYGEH

     WRTLVRTLEVFSSN 3668

     LNMFLGNPRDKI 3705

 

>CYP81V22P CAAP02001704.1b pseudogene deletion between I-helix and EXXR, 41% to 81D11

45456 WRNLRRFAALKIFSSRRLQMSSNIRMEEIRFLTKQLFKSSIKGIEKVEIKSAFNILAFNI 45277

45276 IMKMLTGNRYFEDDDLNSDETRHRLDDIKQTFSPSRQVGLGGFPFLRWCTFREARQM 45106

45105 KKLYRKRDLFLKAMIDARRNIRSSSSIIHGERRSIIIDTMLSLQELEPEFYTDDIIKGMI 44926

44925 MVII 44914

44816 MLIAGTDASASTLESAISLLVSHPDPLC 44733

44734 VMNETLRLGPVGQFIPPRDASDDC 44663

44654 GFDIPRGTMLLVDSWALHKDPQLWEDPTMFEPERFE 44547

44546 GPQSEKGGLKFTPFGLGRRQCPGAGLAMRLIVLSLGTLIQCFDWEAVEKAGSEASVS 44376

 

>CYP81V23 CAAP02001704.1a 54% to 81D8, 59% to CAN72520.1

43414 MIMKETWFRFLSVSS

43369 SLFLIVLSKLFYHKQSRGRKLPPSPPSFPIIGHLHLLKEPVHRSLQHLSDQYGPVL 43202

43201 TLQFGFRTVLLLSSPSAVEECFTKHDQVFANRPRLLAGKHLHYDYTTLGVAPYGQHWRNL 43022

43021 RRLTTLEIFSTNRLNMFLGIRQDEVKFLLRNLFQRSGQGFARVEMKSRLSELSFNIITRI 42842

42841 VAGKRYFGTDVEDFEEASRFRDIIKEILETSGASNAGDFLPFLQWFDFQGLKKRVLALQR 42662

42661 RTDAFLQGLIDEGRNHNNSYRQERGKTKTFVDSMLALQNSEPEYYADHIIKGMIL (0) 42497

42098 TLLTAGTDTSAVTMEWAMSLLLNHPTVLDKVKTELDCKIGHQRLVEEPDLSDLPYLRAIVN 41916

41915 ETLRLFPAAPLLVAHESSDDCSIGGYDVRGGTMLLVNAWAIHRDAKVWEDPTSFRPERFE 41736

41735 GGEGEACRFIPFGLGRRGCPGAGLANRVMGLALAALVQCFEWQRVGEVEVDMSEGKGLTM 41556

41555 PKAQPLEAMCRARNSMIKVLSEL* 41484

 

CYP81W subfamily (1 gene)

 

>CYP81W1 CAAP02001704.1c 53% to CAAP02007087.1a CYP81

50757 MGNLYHYAVI

50727 LLPIILIIKFLFHGRQRQRYRLPPSPFALPVIGHLHLLKPPLYQGLQALSSQYGPIL 50557

50556 FLRFGCRPFVVVSSPSAVQECFTKNDVVLANRPRSMIGDHVTYNYTAFAWASYGHLWRVL 50377

50376 RRLTVVEILSSNKLLLLSTVREEEVRYLLRQLFKVSNDGAQKVDMRLYLSLFSFNFIMKT 50197

50196 ITGKRCIEEEAEGIETNRQFLERLKRIFVPTTTTNLCDFFPILRWVGYKGLEKSVIQFGK 50017

50016 ERDGYLQGMLDEFRRNNSAVEWQKKRTLIETLLFLQQSEPDFYTDDVIKGLML (0) 49858

48125 VISAGTDTSSVTLEWAMSLLLNHPEALEKARAEIDSHVKPGHLLDDSDLAKLPYLRSVVN 47946

47945 ETLRLYPTAPLLLPHLSSEDCSVGGFDIPRGTTVMVNVWALHRDPRVWEEATKFKPERFE 47766

47765 GMENEEKEAFKFAPFGIGRRACPGAALAMKIVSLALGGLIQCFEWERVEAEKVDMSPGSG 47586

47585 ITMPKAKPLEIIFRPRPTMTSLLSQL* 47505
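
Note: many records in this file give each translated segment as "start SEQ end" with nucleotide coordinates on the contig, sometimes with an intron phase marker such as (0) before the end coordinate. A minimal sketch for sanity-checking such lines is shown here. It assumes that the coordinate span should equal three times the number of residues, counting a stop (*) as one codon. The helper name check_segment and the regular expression are illustrative only and are not part of the original annotation.

```python
import re

# Hypothetical pattern for a "start SEQ end" line; an optional phase
# marker such as (0) or (0?) may sit between the sequence and the end.
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\(\d\??\)\s*)?(\d+)\s*$")


def check_segment(line):
    """Return (strand, consistent) for a coordinate line, else None.

    strand is "-" when start > end, otherwise "+".
    consistent is True when the span covers exactly 3 nt per residue.
    """
    m = SEGMENT.match(line)
    if m is None:
        return None  # header, note, or a line lacking one coordinate
    start, seq, end = int(m.group(1)), m.group(2), int(m.group(3))
    span = abs(start - end) + 1
    strand = "-" if start > end else "+"
    return strand, span == 3 * len(seq)


if __name__ == "__main__":
    for example in ("36013 ELA 36005", "45720 KGLAL (0) 45706"):
        print(example, check_segment(example))
```

Lines that fail the check are not necessarily wrong; frameshifted pseudogenes and overlapping segments can legitimately break the 3 nt per residue rule.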

 

CYP82 family

 

CYP82D subfamily 24 sequences

(16 genes, one allele) [7 pseudogenes]

 

>CYP82D1     Medicago sativa (alfalfa)

1  MDVTIEYLYT IVAGVICIIL ISYSKFFRGD ARAQPKLPPL ASGGWPLIGH

 51  LHLLGSSNQP PYITLGNLAD KYGPIFTLRV GVHNAVVVST WELAKEIFTT

101  HDVIISSRPK FTAAKILGHD YANFGFSPYG DYWQMMRKVT ASELLSTRRF

151  ETLRDIRDSE VKKSLMELCN SGFDGDLKVE MKRFLGDMNL NVIMRMIAGK

201  RYSNNESGDE REVRKVRWVF REFFRLTGLF VVGDAIPFLG WLDLGGHVKE

251  MKKAAREMDS VVCGWLEDHR HKNVIGETKM EQDFIDVLLS VLHGVHLDGY

301  DVDTVIKATC LTLIAGATDT TTVTITWALS LLLNNRHTLK KVQDELDEKV

351  GKDRLVNESD INNLVYLQAV VKETLRLYPA GPLSGARQFT KDCTVGGYNI

401  RAGTRLILNL WKMHRDPRVW SEPLEFQPER FLNTHKDVDV KGQHYELLPF

451  GGGRRSCPGI TFGLQMTNLA LASFLQAFEV TTPSNAQVDM SATFGLTNIK

501  TTPLEVIAKP RLPYHLLFVK EH

 

>CYP82D2 Populus

1542721 MDILLPYLSTIIPTAIVLFSCYLLRRSKSSKTKLAPEASGAWPIIGHLPL 1542572

1542571 LAGAELPHLRLGALADKYGPIFTIRIGMYPALVVSSWELAKELFTTNDAI 1542422

1542421 VSSRPKLTASKILGYNFASFGFSPYGEFFRGIRKIVASELLSNRRLELLK 1542272

1542271 HVRASEVEVSVKELYKLWYSKDKNEESQILVNIKQWTADMNLNLMLRMIA 1542122

1542121 GKRYDDAGIVTEENEARRCQRAMREFFHLTGLFVLRDAVPFLGWLDWGGY 1541972

1541971 EKAMKRNAEELDNIFDEWLAEHRRKRDSGESANKEQDFMDVMLYALDGIN 1541822

1541821 LAGYDADTVRKATSL 1541777 (0?)

1541621 SLIIGGTDTVTVTITWALSLLLNNTVALKSAQEELDVHVGKERL 1541490

1541489 VNESDIEKLTYLQACVKEALRLYPAGPLGGFREFTADCTIGGYYVPAGTR 1541340

1541339 LLLNIHKIQRDPRVWPNPTEFKPERLLGSHKAVDVMGQHFELIPFGAGRR 1541190

1541189 ACPGATLGLRMSHLVLASILQAFEISPPSNAPIDMTGTAGLTCSQATPLQ 1541040

1541039 VLVKPRLPASVYEYRF* 1540989

 

>CYP82D3 Coptis japonica AB374406

VFISLYFLILWRTRSSSKTNTCKEAPEAPGAWPIIGHLHLLGGS

ELRHKTLGAMADKYGPIFKIRIGVNHALVVSNSDIAKECFTTNDKAFASRPTSTASKI

LGYDYVMFGMAPYGQYWVELRKITMSELLSNRRLELLKHVRDSEIDASIQDLYKVWKN

HDKAKGPVLVDMKQWFGDLTLNVILRMIAGKRYSGSMSSCDETEARTCQKGMRDFFRL

LGLFIIEDALPYLSWFDLQGYKKEMKNTAKELDSVFQRWLEEHNRMRQTGELNREQDF

MDVLMSILEDTRISEYDNDTIIKSTCLSIVTGGGDTTMVTLTWILSLLLNNKHALKKA

QDELDSHVGKDRQVEESDIKNLVYLQAITKEALRLYPAGPLSGPRVADADCTVAGYHV

PAGTRLIVNTYKIQRDPLVWSEPSEFRPERFLTSHVNMDVKGLHYELIPFGAGRRSCP

GMSFTLQVVPLVLARFLHEFDSKTEMDMPVDMTETAGLTNAKATPLEVVITPRLHPEI

YGL

 

>CYP82D4 CAAP02000063.1d 93% to 82D22P, 87% to 82D10, 58% to CYP82D1, 57% to 82D3

62% to CYP82D2

GSVIVP00014720001 model is correct

chr18 from 9705915:9708180 on strand +

234268 MAVPHSSSLLQYLNVTTIGVLG

234202 ILFLSYYLLVRRSRAGKRRIGPEAAGAWPIIGHLHLLGGSQLPHVTLGTMADTYGPVFT 234026

234025 IRLGVHRALVVSSWEMAKECLTTNDQAASSRPELLASKHLGYNYSMFGFSPYGSYWREVR 233846

233845 KIISLELLSNRRLELLKDVRASEVVTSIKELYELWEEKKNESGLVSVEMKQWFGDLTLNV 233666

233665 ILRMVAGKRYFSASDASENKQAQRCRRVFREFCHLSGLFAVADAIPFLGWLDLGRHE 233495

233494 KTLKKTAKEMDSIAQEWLEEHRRRKDSGEVNSTQDFMDVMLSVLDSKNLGDYDADTV 233324

233323 NKATCL (0) 233306

232617 ALIVGGSDTTVVTLTWALSLLLNNRDTLKKAQEELDIQVGKERLVNEQDISKLVYLQAIV 232438

232437 KETLRLYPPAALGGPRQFTEDCTLGGYHVSKGTRLILNLSKIQKDPRIWMSPTEFQPERF 232258

232257 LTTHKDLDPREKHFEFIPFGAGRRACPGIAFALQMLHLTLANFLQAFDFSTPSNAQVDM 232081

232080 CESLGLTNMKSTPLEVLISPRMSLL* 232003

 

>CYP82D5 CAAP02000063.1c 77% to 82D, 59% to 82D14

GSVIVP00014719001 model is C-term half only

chr18 9714442:9716538 on strand +

225743 MAVPHSSSLLQYLNITTIGVLG

225675 ILFLSYYLLVRRSRAGKRRIAPEAAGAWPIIGHLHLLGGSQLPHVTLGTMADKYGPIFT 225499

225498 IRLGVHRALVVSSREVAKECFTTNDSAVSGRPKLVAPEHLGYNYAMFAFSPYDAYWREVR 225319

225318 KIVNTELLSNRRLELLKDVRASEVETSIKELYKLWAEKKNELGHVLVEMKQWFGDLSMNV 225139

225138 ILRMVVGKRYFGVGAGGEEEEARRCQKAIREFFRLLGLFVVKDGIPSLGWLDLGGHE 224968

224967 KAMKKTAKEIDSIAQEWLEEHRRRKDWGEDNGMHDFMDVLLSVLDGKALPEYDADTI 224797

224796 NKATSMVL 224773

224274 ALISGGTDTMTVTLTWALSLILNNRETLKKAKEELDTHVGKERL 224137

224136 VNASDISKLVYLQAIVKETLRLRPPGPLSGPRQFTEDCIIGGYHVPKGTRLVLNLSKLHR 223957

223956 DPSVWLDPEEFQPERFLTTHRDVDARGQHFQLLPFGAGRRSCPGITFALQMLHLALASF 223780

223779 LHGFEVSTPSNAPVDMSEIPGLTNIKSTPLEILIAPRLPYNSYK* 223645

 

>CYP82D6 CAAP02000063.1b 58% to CAN66846.1, 74% to 82D7, 54% to CYP82D2

GSVIVP00014717001

chr18 9724558:9726343 on strand +

215625 MEFHLPFSITTA

215589 SMFIFLLSFYYLLKMLRRSERKRTAPEAAGAWPVIGHLHLLGGSELPHKTLGAMADK 215419

215418 YGPIFFIKLGARPVLVVSNWEIAKECFTTNDKAFANRPKLIAVEVMGYNNAMFGFSPYGS 215239

215238 YWRQMRKIVTTHLLSNRSLEMLKLVRISEVKATIKELHELWVSKKSDSNMVSVEMRRWFG 215059

215058 DLALNLAVRMTAGKRFSSDKEGVEYHKAIRCFFELTGKFMVSDALPFLRWF 214906

214905 DLGGYEKAMKKTAKSLDHLLEDWLQEHKRKRVSGQPTGDQDFMDVMLSILDDETRAQDI 214729

214728 KSSDADIINKATCL 214687

214460 XVLIATTDTVTVSLTWALSLLLNNRHVLNKAKEELDLHVGRERRVEERDMSNLVYLDAIIK 214281

214280 ETLRLYSAVQVLAAHESTEECVVGGCYIPAGTRLIINLWKIHHDPSVWSDPDQFMPERFL 214101

214100 TTHKDVDVRGMHFELIPFGSGRRICPGVSLALQFLQFTLASLIQGFEFATASDGPVDMT 213924

213923 ESIGLTNLKATPLDVLLTPRLSSNLYE* 213840

 

>CYP82D7 CAAP02000063.1a 74% to 82D6

GSVIVP00014716001

chr18 9743578:9745309 on strand +

196605 MEFQLPFSTI

196575 AMAAMFTFLFLYYFSKIAKSSERKRTAPEAAGAWPVIGHLHLLGGPELPHKTLGAMA 196405

196404 DKYGPVFLIKLGVQRVLMLSNWEMAKECFTTNDKVFANRPKSIAVEVLGYNYAMFGFGPY 196225

196224 GSYWRQVRKIVTTGLLSNRRLEIVKHVWISEVKASIRELYELWINKRSDSNMVLVEMKDW 196045

196044 FGDLSLNMVLRMLSGGRDSSSKEERMRCHKLVRDFFQSMGTFLVSDALPLLR 195889

195888 WFDFGGYEKAMRKTAKDLDHLLESWLQQHKSKRSSEQADGNQDFMDVMLSMLDDMATDE 195712

195711 DLKGFDADTINKATCL 195664

195497 TILAGGTDTVTVSLIWALSLLLNKPQVLKTAREELDSHVGRERQVEERDMKNLAYL 195330

195329 NAIVKETLRLYPAGPLTAPHESTEDCLLGGYHIPAGTRLLANLWKIHRDPSIWSDPDEFR 195150

195149 PERFLTTHKDVDVKGQHFELIPFGSGRRICPGISFGLQFMQFTLASLIQGFEFATMSDE 194973

194972 PVDMTESIGLTNLKATPLEVLVAPRLSSDLYE* 194874

 

>CYP82D8P CAAP02005229.1f pseudogene exon 2, 83% to CYP82D10

26092 VKETLRLYPPRLLGGLCQFTKDCTLGGYHVSKGTRLIMNLSKI*KDPRIWSDPIEFQPER 25913

25912 FLITHKNVDDWGKHFEFISFGAGRRACLGIAFGLQILYLTLASFLHAFDFSTPSNEHVD 25736

25735 MQEALDLQI* 25706

 

>CYP82D9 CAAP02005229.1e 89% to CAN74205.1,

92% to 82D20v1

GSVIVP00014725001 model is incorrect

Chr18 9667057:9669300 on strand +

22874 MDFLLQCLNPVMVGAFA

22823 ILVLSYHLLLWRTGAGKSRMAPEASGAWPIIGHLHLLGGSKNLPHLLFGTMADKYGPVFS 22644

22643 IRLGLKRAVVVNSWEMAKECFTTHDLALASRPEVEAAKYLGYNYAMFAFSPHGAYWREVR 22464

22463 KIATLELLSNRRLELLKNVRISEVETCMKELYKLWAEKKNEAGVVLVDMKEWFGHLTLNV 22284

22283 ILMMVAGKRYFGYTGESKEKEAQQCRKAIREFFRLWGLFVVSDAIPFFGWLDVGGHLK 22110

22109 AMKKTAKELDGIAQEWLEEHRQRKDSGEADGNQDFMDVTLSILGGRDITDYDADTIN 21939

21938 KATAL (0) 21924

21257 ILIGGGTDTTSATLTWVISLLLNNPDVLRKAQEELDAHVGKERLVNEMDISKLVYL 21090

21089 QAIVKETLRINPTAPLSGPRQFIQDSILGGYHISKGTRLILNLTKIQRDPRVWLNPMEFQ 20910

20909 PDRFLTTHKDVDVRGKQFELTPFGGGRRICPGAVFALQVLHLTLANFLHRFQLSTPSDA 20733

20732 PVDMSESFGLTNIKSTPLEVLISPRLASYDLYE* 20631

 

>CYP82D9-de1b N-term pseudogene CAAP02005229.1e, 67% to 82D4

24708 VPLASSLFQYLNVTQWDCLIYFLSPSYHLLLWSSRAGKIRIHPKAAG

AWPIIGHLLLGSQLPGI 24517

 

>CYP82D10 CAAP02005229.1d 16585-14851 (-) strand, 89% to 82D22P

95% to 82D24 PARTIAL SEQ

GSVIVP00014724001

chr18 9672788:9675080 on strand +

17143 MYFLLQYLNITTVGVFATLFLSYCLLLWRSRAGNKKIAPEAAAAWPIIGHLHLLAGGSHQLPH 16955

16954 ITLGNMADKYGPVFTIRIGLHRAVVVSSWEMAKECSTANDQVSSSRPELLASKL 16793

16792 LGYNYAMFGFSPYGSYWREMRKIISLELLSNSRLELLKDVRASEVVTSIKELYKLWAEKK 16613

16612 NESGLVSVEMKQWFGDLTLNVILRMVAGKRYFSASDASENKQAQRCRRVFREFFHLSGLF 16433

16432 VVADAIPFLGWLDWGRHEKTLKKTAIEMDSIAQEWLEEHRRRKDSGDDNSTQDFMDVMQS 16253

16252 VLDGKNLGGYDADTINKATCL (0) 16190

15477 TLISGGSDTTVVSLTWALSLVLNNRDTLKKAQEELDIQVGKERLVNEQDISKLVYLQAIVK 15295

15294 ETLRLYPPGPLGGLRQFTEDCTLGGYHVSKGTRLIMNLSKIQKDPRIWSDPTEFQPERFL 15115

15114 TTHKDVDPRGKHFEFIPFGAGRRACPGITFGLQVLHLTLASFLHAFEFSTPSNEQVNMRE 14935

14934 SLGLTNMKSTPLEVLISPRLSSCSLYN* 14851

 

>CYP82D11P CAAP02005229.1c pseudogene exon 1, 68% TO 82D5

13855 LFQ*LNVTTVGMLGI 13811

13811 TLSPHYLLLWRSRAGKRRIEPEAAGAWSIIGHL 13713

12315 IPHIYLGTMADIYRPIFTILV 12253

12251 GMHRALVVSNKEVAKECLAINDSAVSGHPKLVAPKHLGYNYAMLALYIY 12105

 

>CYP82D12P CAAP02005229.1b pseudogene exon 2, 80% to 82D10

9318 ALIAGGSDTIVVTLTW 9271

9268 ALSLLLNNHDTLKKAQEELDTQVGKE*LVNEQDISKLFYLQAIVKEIL*LYPPRPLAGPR 9089

9088 QFTEDCTLDGYHVSKGTRLILNISKIQKDPRIWSNPIEFQ*ERFLTTHTDLDPRGKYFE 8912

8911 FIPPFGANRKACPRMTFGLQMLHLTLANFLQVFDFSTASNAHIDMHESLGLTNMKSTPLE 8732

8731 VLISLHLSSCNLYN* 8687

 

>CYP82D13 gi|147777974|emb|CAN74205.1| 93% to CAN76287.1, 91% to 82D20v1

89% to 82D9, 87% to 82D15, 86% to 82D18

CAAP02005229.1a 7042-4484 (-) strand

GSVIVP00014723001

Chr18 9682889:9685447 on strand +

MDFLLQCLNPVMVGAFAILVLSYHLLLWRSGAGKSRMAPEASGAWPIIGHLHLLGGSKNLPHLLFGTMAD

KYGPVFSIRLGXKRAVVVSSWEMAKECFTTHDLALASRPEVVAAKYLGYNYAMFAFSPHGAYWREVRKIA

TLELLSNRRLELLKNVRISEVETCMKELYKLWAEKKSEAGVVLVDMKQWFGHLSLNVILKMVVGKRYFGY

AAESKEKEAQQCQKAIREFFRLLGLFVVSDALPFLGWLDVGGHVKAMKKTAKELDGIAQEWLEEHRRRKD

SGEADGDQDFMDVMLSILGATDPNGYDADTINKATSL (0)

ILIAGGTDTTSVTLTWAISLLLNNPHVLRKAQE

ELDTHVGKERLVNEMDISKLVYLQAIVKETLRLYPAAPLSGQRQFIQDSVLGGYHIPKGTRLLLNLTKIQ

RDPRVWLNPTKFQPSRFLTTYKDVDVKGKHFVLTPFGGGRRICPGAAFALQVLPLTLANFLHKFQLSTPS

NSPIDMSESFGITNIKSTPLEVLISPRLASYNLYK*

 

>CYP82D14 CAAP02001783.1c 61% to CAN74205.1 61% to 82D3

GSVIVP00014732001

chr18 9581271:9583039

21867 MDPALHLPAFF

21834 VFLSLIYVFYAMLGRKKIIKSSKSRDAPEPGGAWPIIGHLHLLGGGDQLLYRTLGAMADK 21655

21654 YGPAFNIRLGSRRAFVVSSWEVAKECFTINDKALATRPTTVAAKHMGYNYAVFGFAPYSP 21475

21474 FWREMRKIATLELLSNRRLEILKHVRTSEVDMGIRELYGLWVKNSSRPLLVELNRWLEDM 21295

21294 TLNVIVRMVAGKRYFGAAAASDSSEARRCQKAISQFFRLIGIFVVSDALPFLWWLDLQG 21118

21117 HERAMKTTAKELDSILEGWLEEHRQRRVSSLIKAEGEQDFIDVMLSLQEEGRLSGFQYDS 20938

20937 ETSIKSTCL (0) 20911

20722 LILGGSDTTAGTLTWAISLLLNNRHALKKAQEELDLCVGMERQVEESDVKNLVYLQAIIK 20543

20542 ETLRLYPAGPLLGPREALDDCTVAGYNVPAGTRLIVNIWKLQRDPSVWTNPCAFQPERFL 20363

20362 NAHADVDVKGQQFELMPFGSGRRSCPGVSFALQVLHLTLARLLHAFELSTPVDQPVDMTE 20183

20182 SSGLTIPKATPLEVLLTPRLNSKLYAF* 20099

 

>CYP82D15 CAAP02001783.1b  11 aa diffs to 82D18

98% to 82D18, 87% to 82D13

GSVIVP00014731001

chr18 9587539:9590007 on strand +

15599 MDFLLQCLNPAMVGAFAILVLSYYLLLWRSGAGKGRMAPEAAGAWPIIGHLHLLGGSKNL 15420

15419 PHLLLGTMADKYGAVFSVRLGLKRAVVVSSWQMAKECFTTHDLALASRPQLVISKQLGYN 15240

15239 DAMFAFSPHGAYWREVRKIATLELLSNRRLELLKNVRISEVETCMKELYKLWAEKKNEAG 15060

15059 VVLVDMKQWFGDLTLNVILMMVVGKRYFGYTAESQEKETQRCQKSIREFFRLLGLFVVSD 14880

14879 ALPFLGWLDVGGHLKATKKTAKEMDGIAQEWLEEHRRRKDSGEASGNQDLMDVMLSILAG 14700

14699 TDPTGYDADTINKATSL (0) 14649

13757 ILIAGGSDTTSVTLTWVISLLLNNPCMLRKAQEELDTHVGKGRLVNEVDLSKLVYLQAIV 13578

13577 KETLRLYPALPLSGPRQFNQDSILGGYRIPNGTRLVLNLTKIQRDPSVWLNPTEFQPERF 13398

13397 LTTHKDVDMRGKNFEFTPFGGGRRICPGATFALQVLHLTLANFLHKFQLSTPSNATVDMS 13218

13217 ESLGITNIKSTPLEVLISPRLSSCDLYE* 13131

 

>CYP82D16P CAAP02001783.1a pseudogene 79% to 82D4

7530 GKYFTIRIGLHRALVVTTWQMAKECLTVNDQVSSSRPELLAAKHLGYNYAMFGFSPYGSY 7351

7350 WREVRKIISLELLSNRRLELLKDVCASEVVTSIKELYKLW 7231

7204 VSVEMKQWFGDLTLNVILRMVAGKRYFSASDASENKQAQRCRRVFREFFHLVGLFAV 7034

7033 ADAILFLGWLDWGRHENTLKKTAIEMDSIAQEWLEEHRRKDSGHDNSTQDFMDVMLSV 6860

6859 LD 6854

880 TLIAGGSDTTVVSLTWVFSLLLNNRDTLKKAKK 782

780 NIQVGKERLVNEQDISKLVYLQAILKE 700

 

>CYP82D17 CAAP02008549.1 exon 2 runs off the end 95% to 82D20v1

GSVIVP00014730001

chr18 9610466:9611086 on strand +

10105 ILIGGGTDTTSGTLTWAISLLLNNPHILRKAQEELDAHVGKERIVNEMDISKLVYLQA 9932

 9931 IVKETLRLNPTAPLSGPRQFIQDSILGGYYISKGTRLILNLTKIQRDPRVWLNPMEFQPD 9752

 9751 RFLTTHKDVDVRGKHFELTPFGGGRRICPGAIFALQVLHLTLANFLHRFQLSTPSDAPVD 9572

 9571 MSESFGLTNIKSTPLEVLISPRLASYDLY 9485

 

>CYP82D18 CAAP02008549.1 86% to CAN74205.1

98% to 82D15

GSVIVP00014729001 model is incorrect

Chr18 9617892:9620347 on strand +

2679 MDFLLQCLNPAMVGAFAILVLSYYLLLWRSGAGKGRMAPEAAGAWPIIGHLHLLGGSKNL 2500

2499 PHLLLGTMADKYGAVFSVRLGLERAVVVSSWQMAKECFTTHDLALASRPQLVISKQLGYN 2320

2319 DAMFAFSPHGAYWREVRKIATLELLSNRRLELLKNVRISEVETCMKELYKLWAEKKNEAG 2140

2139 VVLVDMKQWFGDLTLNVILMMVAGKRYFGYTAESQEKETQRCQKSIREFFRLLGLFVVSD 1960

1959 ALPFLGWLDVGGHLKATKKTAKEMDGIAQEWLEEHRQRKDSGEANGNQDLMDVMLSILAG 1780

1779 TDPTGYDADTINKATSL (0) 1729

 847 ILIAGGSDTTSVTLTWAISLLLNNPCMLRKAQEELDTHVGKGRLVNEVDLSKLVYLQAIV 668

 667 KETLRLYPAFPLSGPRQFNQDSILGGYRIPKGTRLVLNLTKIQRDPSIWLNPTEFQPERF 488

 487 LTTHKDIDMRVKNFEFTPFGGGRRICPGATFALQVLHLTLANFLHKFQLSTPSDATVDMS 308

 307 ESLGITNIKSTPLEVLISPRLSSCDLY 227

 

>CYP82D18-de1b CAAP02008549.1-de1b pseudogene fragment N-term

2 aa diffs to 82D10 and 82D21

5965 MYFLFQYLNITTVGVFATLFLSYCLLLWRSRAGNKKIAPEATAA 5834

 

>CYP82D19 gi|147794787|emb|CAN66846.1| 59% to 82D3

CAAP02003355.1 35337-33216 (-) strand

GSVIVP00013155001 exon 1

chrUn_random 82487508:82486552 exon 1 on strand -

chrUn_random 82486016:82485387 exon 2 on strand -

METLLRFLLPLFSSLLVVVISICFYRLKIKAASNAKRCTAPRAGGAWPIIGHLHLFGAQQLTHKTLGAMA

DKYGPVFTIRLGLNEILVLSSSEMARECFTTHDRVFSTRPSVTASKILGYDFAMFGFAPYGSYWREMRKI

VTIELLSNHRLDMLKHIRASEVGTSIRELYEMWVSERGTDGRVFVDMKRWFGDLTLNLAVRLVGGKRYFG

AGADTKEGEGRTCQKVIRDFAHLFGVFVLSDAIPFLSWLDLKGYKKAMKRTAKELDSLFGGWLQEHKEKR

LLGGEGKDDQDFMDVMLTVLEDVNFSGFDADTVNKATCLNLILAGSDTTKVTLTWALSLLLNHPHVLKKA

QAELDIQVGKDRQVDESDVKNLVYLQAIIKETLRLYPASPIITLHAAMEDCTLAAGYNISAGTQIMVNAW

KIHRDERVWCNPKEFQPERFMTSHKDTDVRGQHFELIPFGSGRRSCPGISLALQVVHFALASLLHSYEVT

KPSDGDVDMTESLGLTNLKATPLEVLLSPRLKAELYRQ

 

>CYP82D20v1 gi|147781110|emb|CAN76287.1 = AM442953.1 52% to 82C4

92% to CYP82D9, only 6 aa diffs to CAAP02010919.1 CYP82D20v2

2 aa diffs to CAO61260.1 in the other genome project

GSVIVP00014727001 exon 2

chrUn_random 151382041:151381091 exon 1 on strand - (6 aa diffs)

chr18 9636009:9636575 exon 2 on strand + (1 aa diff)

MDFLLXCLNPV

MVGAFAILVLSYHLLLWRSGAGKSRMAPEASGAWPIIGHLHLLGGSKNLPHLLFGTMADKYGPVFSIRLG

LKRAVXVSSWEMAKECFTTHDLALASRPZLVAAKYLGYNYAMFGLSPHGAYWREVRKIATLELLSNRRLE

LLKNVRISEVETCMKELYKLWAEKKSEAGVVLVDMKQWFGHLSLNVILKMVVGKRYFGYAAESKEKGAQQ

CQKAIREFFRLMGLFVVSDAIPFLGWLDVGGHVKAMKKTAKELDGITQEWLEEHRRRKDSGEADGDQDFM

DVMLSILGGRDTTDYDADTINKATSL (0)

VMIGGGADTTSGTLTWAISL

LLNNPHILRKAQEELDAHVGKERLVNEMDISKLVYLQAIVKETLRLNPIAPLSGPRQFIQDSILGGYHIS

KGTRLILNLTKIQRDPRVWLNPMEFQPDRFLTTHKDVDVRGKHFELTPFGGGRRICPGIVFALQVLHLTL

ANFLHRFQLSTPSDAPVDMSEDFGLTNIKSTPLEVLISPRLASYDLYE

 

>CYP82D20v2 CAAP02010919.1 runs off the end, 6 aa diffs with CAN76287.1

5972 LGYNYAMFGLSPHGAYWREVRKIATLELLSNRRLELLKNVRISEVETCMKELYELWAKKK 5793

5792 SEAGVVLVDMKQWFGHLSLNVILKMVVGKRYFGYAAESKEKEAQQFQKAIREFFRLMGLF 5613

5612 VVSDAIPFLGWLDVGGHVKAMKKTAKELDGITQEWLEEHRRRKDSGEADGDQDFMDVMLS 5433

5432 ILGGRDTTDYDADTINKATSL 5370

4415 VMIGGGADTTSGTLTWAVSLLLNNPHILRKAQEELDAHVGKERLVNEMDISKLVYLQ 4245

4244 AIVKETLRLNPIAPLSGPRQFIQDSILGGYHISKGTRLILNLTKIQRDPRVWLNPMEFQP 4065

4064 DRFLTTHKDVDVRGKHFELTPFGGGRRICPGIVFALQVLHLTLANFLHRFQLSTPSDAPV 3885

3884 DMSEGFGLTNIKSTPLEVLISPRLASYDLYE* 3789

 

>CYP82D21 partial CAAP02010919.1 78% to CAAP02000063.1c runs off the end

identical to CYP82D10 N-term

chr18 9640100:9640363 on strand +

264 MYFLLQYLNITTVGVFATLFLSYCLLLWRSRAGNKKIAPEAAAAWPIIGHLHLLAGGSHQLPH 76

 75 ITLGNMADKYGPVFTIRIGLHRALV 1

 

>CYP82D22P CAAP02008020.1 93% to CAAP02000063.1d

93% to 82D4, 89% to 82D10

GSVIVP00014722001 MODEL IS INCORRECT

chr18 9694430:9696685 on strand +

7540 MYFLLQYLNFTTVGVFATLLLSYCLLLWRSREGNSKIAPEATAAWPIIGHLHLLGGSDQLPHI 7352

7351 TLGNMADKYGPVFTIRLGVHRALVVSS*EMAKECLTTNDQAASSRPELLASKHLGYNHAM 7172

7171 FGFSPYGSYWREVRKIINLELLSNRRLELLKDVRASEVVTSIKELYELWEEKKNESGLVS 6992

6991 VEMKQWFGDLTLNVILRMVAGKRYFSASDASENKQAQRCRRVFREFFHL*GLFAVADAIP 6812

6811 FLGWLDLGRHEKTLKKTAKEMDSIAQEWLEEHRRRKDSGEVNSTQDFMDVMLSVLDGKIL 6632

6631 GDYDADTVNKATCL (0) 6590

5899 ALIVGGSDTTVVTLTWALSLLLNNRDTLKKAQEELDIQVGKERLVNEQDISKLVYLQAIV 5720

5719 KETLRLYPPAALGGPRQFTEDCTLGGYHVSKGT*LILNLSKIQKDPRIWMSLTEFQPERF 5540

5539 LTTHKDLDPQGKHFEFIPFGAGRRACPGIAFALQMLHLTLANFLQAFDFSTPSNARVDMC 5360

5359 ESLGLTNMKSTPLEVLISPRMSLL* 5285

 

>CYP82D23 gi|147777975|emb|CAN74206.1 = AM447864.1

96% to 82D22 but without stop codons

sequence is poor quality near the end

no exact match in Genoscope

MYFLLQYLNFTTVGVFATLLLSYCLLLWRSRADNSKIAPEATAAWPIIGHLHLLGGSDQLPHITLGN

MADKYGPVFTIRLGVHRALVVSSWEMAKECLTTNDQAASSRPELLASKHLGYNYAMFGFSPYGSYWREVR

KIISLELLSNRRLELLKDVRASEVVTSIKELYELWEEKKNESGLVSVEMKQWFGDLTLNVILRMVAGKRY

FSASDASENKQAQRCRRVFREFCHLSGLFAVADAIPFLGWLDLGRHEKTLKKTAKEMDSIAQEWLEEHRR

RKDSGEVNSTQDFMDVMLSVLDGKNLGDYDADTVNKATCL

ALIVGGSDTIVVTLTWALSLLLNNRDN &

LKKAQEELDIQVGKE &

ISNEQDISXLVYLQAIVKETLRLYPPAALGRPRQFTEDCNLGGYHV

 

>CYP82D24 CAAP02005686.1 missing N-term, runs off the end, 88% to CAAP02000063.1d

95% to CAO61257.1, 95% to 82D10

chr18 9640536:9662802 on strand +

note: distance between exons is too long; model is probably wrong

23119 MKQWFGDLTLNVILRMVAGKRYFSASDASENKQAQRCRRVFREFFHLSGLFAVADAIPFL 22940

22939 GWLDWGRHEKTLKKTAIEMDSIAQEWLEEHRRRKDSGDDNSTQDFMDVMQSVLDGKNLGG 22760

22759 YDADTINKATCM (0) 22724

 1476 LISGGNDTTVVSLTWALSLVLNNHDTLKKAQQELDIQVGKERLVNEQDIGKLVYLQAIVK 1297

 1296 ETLRLYPSGPLGGLRQFTEDCTLGGYHVSKGTRLIMNLSKIQKDPRIWSNPTEFQPERFL 1117

 1116 TTHKDVDPWGKHFEFIPFGASRRVCPGITFGLQILHLTLASFLHAFEFSTPSNE*VDMRE 937

  936 SLGLTNMKSTPLEVLISPRLSSCSLYN* 853

 

 

&&&&&&

 

CYP82H subfamily  42 sequences (Vitis has 41 of these)

The founding sequence for this subfamily is CYP82H1 from Ammi majus (60% to 82H5)

 

In Vitis there are

(6 genes with 2 that have 3 close duplicates shown as .a, .b, .c)

[22 pseudogenes, 8 have close duplicates shown as .a, .b]
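
The naming pattern used here (family number, subfamily letter, gene number, a trailing P for pseudogenes, and .a/.b/.c for close duplicates) can be split mechanically. A minimal sketch follows, assuming names of the form CYP82H9P.a. The names CypName and parse_name are hypothetical, and variant or fragment suffixes such as v1 or -de1b are deliberately not handled and return None.

```python
import re
from typing import NamedTuple, Optional


class CypName(NamedTuple):
    family: int           # e.g. 82
    subfamily: str        # e.g. "H"
    number: int           # e.g. 9
    pseudogene: bool      # True when the name ends in P
    copy: Optional[str]   # "a", "b", "c" for close duplicates, else None


# Assumed pattern: CYP + family + subfamily + number + optional P + optional .copy
NAME = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)(P?)(?:\.([a-z]))?$")


def parse_name(name: str) -> Optional[CypName]:
    """Split a CYP name into its parts, or return None if it does not fit."""
    m = NAME.match(name)
    if m is None:
        return None
    family, subfamily, number, pseudo, copy = m.groups()
    return CypName(int(family), subfamily, int(number), pseudo == "P", copy)


if __name__ == "__main__":
    for n in ("CYP82H9P.a", "CYP82H8.b", "CYP81W1"):
        print(n, parse_name(n))
```

Grouping parsed names on (family, subfamily, number) then collects the .a/.b/.c copies of one gene or pseudogene into a single bin.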

 

>CYP82H1 Ammi majus AY532373.1

MITCEMGIYLQMQDIILFSLVFFSTLILWRIFSTYVIRKKTCSG

PPEPAGRWPLIGHLHLLGGSKILHHILGDMADEYGPIFSLNLGINKTVVITSWEVAKE

CFTTQDRVFATRPKSVVGQVVGYNSRVMIFQQYGAYWREMRKLAIIELLSNRRLDMLK

HVRESEVNLFIKELYEQWSANGNGSKVVVEMMKRFGDLTTNIVVRTVAGKKYSGTGVH

GNEESRQFQKAMAEFMHLGGLLMVSDALPLLGWIDTVKGCKGKMKKTAEEIDHILGSC

LKEHQQKRTNISNNHSEDDFIYVMLSAMDGNQFPGIDTDTAIKGTCLSLILGGYDTTS

ATLMWALSLLLNNRHVLKKAQDEMDQYVGRDRQVKESDVKNLTYLQAIVKETLRLYPA

APLSVQHKAMADCTVAGFNIPAGTRLVVNLWKMHRDPKVWSDPLEFQPERFLQKHINV

DIWGQNFELLPFGSGRRSCPGITFAMQVLHLTLAQLLHGFELGTVLDSSIDMTESSGI

TDPRATPLEVTLTPRLPPAVYQ

 

>CYP82H2P CYP82A CAAP02000286.1a 2318-2037 (-) strand  = CAAP02005785.1b revised first 94 aa

1 aa diff

2318  MGFYLQLQGIIVFSLLFAIICL*LIKGKGNRNRGKRAPEPSRA*PLIGHLHLLRAGKPQH  2139

2138  QAFGAMADKYGPIFCFHIGLRKTSVVSSWEVAKE  2037

 

>CYP82H3P CYP82A CAAP02000286.1b pseudogene, no start codon, 3 frameshifts, one stop codon

92% to CAAP02005785.1c

10177 IGFSLQPQDIIVFGLLLATICLLLATVFNAKWNKKKGKRPQEPSERWPIIGHLHLLRADK 9998

9997 LLYRTLGDMADKYGPIFCIHLGLRKALVVSNWEVAKECYTTNDKVFATRPRSLAVKLMGY 9818

9817 DHAMF 9803

9801 GFAPYGPYWRDVQKLAMVELLSNCQLEMLKHVQDSKVEFLIKEIYG 9664

9662 WARNKESPTLVEMKERFGNLVMNVMVSATAGKRYFGTHACGDEPKRGKKALDDFMLLVGL 9483

9482 FMVLDAIPFLGWLDTVKGYTTDMKKIAKKVDYLLRRWVEEHRQQRLSAKNNRVEVDFLHV 9303

9302 MLSVIDDGQFSGRDPDTVIKATCL (0) 9231

8914 NLILASYDTTAITLTWALSLLLNNRHALKKSQAGMEIHVGKHRQVDGSDIKNLVYLQAIV 8735

8734 KETLRLYPPGPLSVPHEAIEDCTVAGFHIQAGTRLLVNLWKLHRDPRV*LDPLEFQPERF 8555

8554 LTKHAGLDVRGKNYEL 8507

8505 PFGSGRRVCLRISFALEMTHLTLARLLHGFELGVVADSPVDMTEGPGLTAPKATPLEVTI 8326

8325 VSRLPFELYSYEAS* 8281

 

>CYP82H4P CYP82A CAAP02000286.1c 16587-16400 (-) strand = CAAP02005785.1c-de2b

ETTVITPTWALSLLLNKS

ETTVITPTWALSLMLNKSCVLKK*TQDELDVQLGKHRLVEES

NLIYLQAIVKWKHYEY

 

>CYP82H5 gi|37906506|gb|AAP49697.1| 59% to 82C4 partial mRNA

2 aa diffs to CAO47708.1 from the other genome project

WVEEHRQNRLSANDNGAEQDFIHAMLSVIDDGQFSGHDPDTIIKGTCSNLILAGNDTTSITLTWALSLLL

NNRHALKKAQAELEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYPPGPLSLPHEAMEDCTVAGFHIQAG

TRLLVNLWKLHRDPRVWLDPLEFQPERFLTKHAGLDVRGKNYELLPFGSGRRVCPGISFALELTHLTLAR

LLHGFELGAVADSPVDMTESPGLTAPKATPLEVTIVPRLPFELYSYEAA*

 

>CYP82H5 gi|157340903|emb|CAO47708.1| unnamed protein product [Vitis vinifera]

from the other genome project (1 aa diff)

MGFSLRPQDITVFGLLLATICLLLATVLNAKGNKKRGKRPPEPSGRWPLIGHLHLLGADKLLHRTLGDMA

DKYGPIFCVRLGLKKTLVVSSWEVAKECYTTSDKVFATRPRSLAIKLIGYDHGSFVFAPYGPYWRDVRKL

AMVELLSNRQLEMHKHVQDSEVKILIKELYGQWASNKDGPALVEMKERFGNLALNVVVRAIAGKRYFGTH

ACGDEPKRAKKAFEDFIILLGLFMVSDVIPFLGWLDTMKGFTAEMKRVAKEVDYVLGSWVEEHRQNRLSA

NDNGAEQDFIHAMLSVIDDGQFSGRDPDTIIKGTCSNLILAGYDTTSITLTWALSLLLNNRHALKKAQAE

LEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYPPGPLSLPHEAMEDCTVAGFHIQAGTRLLVNLWKLHR

DPRVWLDPLEFQPERFLTKHAGLDVRGKNYELLPFGSGRRVCPGISFALELTHLTLARLLHGFELGAVAD

SPVDMTESPGLTAPKATPLEVTIVPRLPFELYSYEAA

 

>CYP82H5 CYP82A15 CAAP02000286.1d 23084-21191 (-) strand = CYP82A15 1 aa diff

CYP82A15 gi|147792578|emb|CAN64371.1| = AM487302.1, same as AY226829

60% to CYP82H Ammi majus, 57% to 82D3 Coptis japonica

MGFSLRPQDITVFGLLLATICLLLATVLNAKGNKKRGKRPPEPSGRWPLIGHLHLLGADKLLHRTLGDMA

DKYGPIFCVRLGLKKTLVVSSWEVAKECYTTSDKVFATRPRSLAIKLIGYDHGSFVFAPYGPYWRDVRKL

AMVELLSNRQLEMHKHVQDSEVKILIKELYGQWASNKDGPALVEMKERFGNLALNVVVRAIAGKRYFGTH

ACGDEPKRAKKAFEDFIILLGLFMVSDVIPFLGWLDTMKGFTAEMKRVAKEVDYVLGSWVEEHRQNRLSA

NDNGAEQDFIHAMLSVIDDGQFSGRDPDTIIKGTCSNLILAGNDTTSITLTWALSLLLNNRHALKKAQAE

LEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYPPGPLSLPHEAMEDCTVAGFHIQAGTRLLVNLWKLHR

DPRVWLDPLEFQPERFLTKHAGLDVRGKNYELLPFGSGRRVCPGISFALELTHLTLARLLHGFELGAVAD

SPVDMTESPGLTAPKATPLEVTIVPRLPFELYSYEAA

 

>CYP82H5-de2b  CAAP02000286.1d-de2b pseudogene fragment C-term

93% to CAAP02005785.1c-de2d

24609 LPEGSGVTLPRATPLDVTVFPRLPSELYGY*AS 24511

 

>CYP82H6P.a CAAP02000286.1e pseudogene 87% to CYP82A.a   CAAP02006846.1 pseudogene

this looks like CYP82A16P

29227 ATNVMVSEVGGKRYFGTVTNDYESRRCRKALEDLLYLSRIFMVSDAIPSLGWLETVRGYT 29048

29047 AKMKRIAREVDQVLGSWVEEHRRKRFSGSVNEAVQDFIHVMLSVIEDGQFSDHDHDTVIK 28868

28867 ATCL (0) 28856

28570 DSYIGGFDSTVITLTCALCPLMNNPSTLKRAQDELDIKVGKHRQVDESDIKNLVYLQAIIKET 28382

28381 L 28379

28377 LRLYPAAPLSVPREAMEDCTVAGFHIQAGTRLLVNLWKLHRDPRIWSDPLEFQPERFLTK 28198

28197 HVDLDVRGRNFEFLPFGSGRRVCPGISFALEVVHLTLAR 28081

LLHGFELGVVADLPVDRTEG

 

>CYP82H6P.b CYP82A16P   Vitis vinifera (Pinot noir grape)

            AM487302.1

            Pseudogene adjacent to CYP82a

            67% to CYP82A15, 4 aa diffs to CYP82H6P.a

     MASVSRDHRFWPSVCG*

1676 IKAEGNKNKGKRAPEPSRAWPLIEHLHLL*AGKPQHQSFGAMADKY 1813

1814 GPIFCFHIGLRKTFLVSSWDAAKECFTTMDKAFDTRPRSLAGKLMGYDHAMFGFSPYGPY 1993

1994 WREVRKLVSVXXXXXXXXXXXXHVRDSEVKLFIKELYGQWIQNGDRPALVVMKEKCWHLA 2173

2174 ANVMVSEVAGKRYFGTVTNDYESRRCRKALEDLLYLSXIFMVSDAIPXLXWLXTVRGYTA 2353

2354 KMKRXXREVDQVLGSWVXEHRXKRFSGSXNEAXQDFXHVMLSVIEDGQFSDHDHDTVIKA 2533

2534 TCL (0) 2542

2840 DSYIGGFDSTVITLTCALCPLMNNPSTLKRAQDELDIKVGKHRQVDESDIKNLVYLQAIIKET 3019

3024 LRLYPAAPLSVPREAMEDCTVAGFHIQAGTRLLVNLWKLHRDPRIWSDPLEFQPERFLTK 3203

3204 HVDLDVRGRKFEFLPFGSGRRVCPGISFPLEVVHLTLARLLHGFELGVVANLPVDRTEGS 3383

3384 GVTLPRATPLDVTVVPRLPSELYGY* 3461

 

>CYP82H7P CYP82A CAAP02000286.1f 93% to CAAP02007374.1 pseudogene

34822 GSSFLAFCLWLIKAEGNKNKGKRAPEPSKAWPLIGHLHLL*AGKP*HQSFGAMADKYGPI 34643

34642 FCFHIGLRKTFLVSS*DPAKECFTTMDKAFDTRPRSFAGKLMGYDHAMFGFSHYGPYWHE 34463

34462 VRKLVSVELLLNRQLELLNHVRDSEVKLFIKELYGQWIQNGDRPALVVMKEKCWHLA 34292

 

>CYP82H8.a CYP82A CAAP02000286.1g 97% to CAAP02007374.1 12 aa diffs

39199 MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKQIH 39020

39019 RTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKECYTTKDKALATRPRSLAVKLMGYDHA 38840

38839 MFAFERHGPYWRDVRRLAMVNLLSNRQHEMLKHVRDSEVKFFIQELYGQWVENGGSPVLV 38660

38659 DMKKKFEHLVANLTMRTVAGKRCENGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTVR 38480

38479 GYTAKMKGTARQLDQVLGRWVEEHRQKRLSGSINEAEQDFIHAMLSVIDDAQLSAHDHDH 38300

38299 DTVIKATCL (0) 38273

37909 TVMLAGNDTIAITLTWALSLLMNNPRALKKAQEELDFHVGRNQQVYESDIKKLVYLQAII 37730

37729 KETLRLYPAGPLALPHEAMEDCTIAGFHIQAGTRLLVNLWKLHRDPTIWSDPLEFQPERF 37550

37549 LTKHVGLDVRGQHFELLPFGSGRRMCPGISLALEILQLTLARLLHGFELGVVADSPLDMT 37370

37369 EGVGLAMPKATPLEVTLVPRLPSELYH* 37286

 

>gi|147799471|emb|CAN72749.1 = AM472978.1 hybrid of 2 genes

94% to CAAP02000286.1g CYP82H8

last exon has only one diff to CYP82H8

top part is exon 1 end from CYP82H9P.b, bottom part = exon 2 from CYP82H8.b

MADNWVDEHRRKRFSGSMNEAEQDFNHVMLSVIEDGQFSDHDHDTVIKATXL (0)

TVMLAGNDTIAITLTWAL

SLLMNNPRALKKAQEELDFHVGRNQQVYESDIKKLVYLQAIIKETLRLYPAGPLALPHEAMEDCTIAGFH

IQAGTRLLVNLWKLHRDPTIWLDPLEFQPERFLTKHVGLDVRGQHFELLPFGSGRRMCPGISLALEILQL

TLARLLHGFELGVVADSPLDMTEGVGLAMPKATPLEVTLVPRLPSELYH

 

>CYP82H8.b AM472978.1 only 3 aa diffs to CYP82H8.a

4095 MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKQIH 3916

3915 RTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKECYTTKDKALATRPRSLAVKLMGYDHA 3736

3735 MFAFERHGPYWRDVRRLAMVNLLSNRQHEMLKHVRDSEVKFFIQELYGQWVENGGSPVLV 3556

3555 DMKKKFEHLVANLTMRTVAGKRYENGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTVR 3376

3375 GYTAKMKGTARQLDQVIGRWVEEHRQKRLSGSINEAEQDFIHAMLSVIDDAQLSAHDHDH 3196

3195 DTVIKATCL (0) 3169

2805 TVMLAGNDTIAITLTWALSLLMNNPRALKKAQEELDFHVGRNQQVYESDIKKLVYLQAII 2626

2625 KETLRLYPAGPLALPHEAMEDCTIAGFHIQAGTRLLVNLWKLHRDPTIWLDPLEFQPERF 2446

2445 LTKHVGLDVRGQHFELLPFGSGRRMCPGISLALEILQLTLARLLHGFELGVVADSPLDMT 2266

2265 EGVGLAMPKATPLEVTLVPRLPSELYH* 2182
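
Note on "aa diffs": the counts quoted in the headers are position-by-position mismatches between two translations of equal length. The sketch below shows one way to recount them, assuming the two sequences are already aligned without gaps. The function name is a hypothetical helper, and the two strings are the fourth coordinate-labelled segments of CYP82H8.a and CYP82H8.b copied from this file.

```python
def aa_diffs(a, b):
    """List (1-based position, residue in a, residue in b) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length;
    no gap handling is attempted.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Fourth segment of each record (60 aa each).
h8a = "DMKKKFEHLVANLTMRTVAGKRCENGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTVR"
h8b = "DMKKKFEHLVANLTMRTVAGKRYENGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTVR"
diffs = aa_diffs(h8a, h8b)

if __name__ == "__main__":
    print(diffs)
```

This segment contributes one of the three differences (C vs Y). The other two are in the segments that follow: L vs I in QLDQV(L/I)GRW and S vs L in TIW(S/L)DPLE.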

 

>CYP82H9P.a CYP82A CAAP02000286.1h pseudogene 90% to CYP82A.a   CAAP02006846.1 pseudogene

43408 MGFYLQLQGIIVF*PSVAIICLWLIKGKGNRNRGKRAPEPSPAW 43277

      ALIGHLHLVRAGKPQHQAFGAMADKYGPI 43191

43190 FCFHIGLRKTLVVSSYEVATECLTTMDKAFSTRP 43089

43086 SLAGKLMGYGHAMFGFSPCGPYWRDVRKLASVELLSNRQLELLNHVRDSEVKLFIKELYG 42907

42906 QWIQNGDRPVLVEMKEKCWHLAANVMVSAVAGKX 42808

42803 GTVTNDYESRRCRKALGDLLYLSGIFMVSDAIPFLGWLGTVRGYTAKMKRTVREVDQVLG 42624

42623 SWVEEHCRKRFSGSMNEAEQDFNHVMLSVIEDGQFSDHDHDTVIKATC 42480

      TLIIGGSNSTVITLTW

42147 ALSLLMNNPSTLKRAQDELDIKVGKHRQGDGSDIKNLVYFQAIVKETLRLYPPGPLSLPR 41968

41967 EAMEDCTVAGFHIQAGTRLLGNLWKLHKDPRIWSDPLEFQPERFLTKHVYLDVRGQNLEF 41788

41787 LPFGSGRRVCPGISFALEVVHPTYLRSYIS* 41695

 

>CYP82H9P.b AM472978.1  96% to CAAP02000286.1h

8679 MGFYLQLQGIIVF*PSVAIICLWLIKGKGNRNRGKRAPEPSPAW 8548

     ALIGHLHLVRAGKPQHQAFGAMADKYGPI 8462

8461 FCFHIGLRKTLVVSSYEVATECLTTMDKAFSTRPRSLAGKLMGYDHAMFGFSPCGPYWRD 8282

8281 VRKLASVELLSNRQLELLNHVXDSEVKLFIKXLYGQWIQNGDRPVLVEMKEKCWHLAANV 8102

8101 MVSXVAGK 8078

8076 FGTVTNDYESRRCRKALGDLLYLXGIFMVSDAIPFLGWLGTVRGYTAKMKRTVREVDQ 7903

7902 VLGSW 7888

7888 GSWVDEHRRKRFSGSMNEAEQDFNHVMLSVIEDGQFSDHDHDTVIKATXL (0) 7739

7449 TLIIGGSNSTVITLTW 7402

7400 ALSLLMNNPSTLKRAQDELDIKVGKHRQVDGSDIKNLVYFQAIVKETLRLYPPGPLSLPR 7221

7220 EAMEGCTVAGFHIQAGTRLLGNLWKLHKDPRIWSDPLEFQPERFLTKHVDLDVRGQNLEF 7041

7040 LPFGSGRRVCPGISFALEVVHPTYLRSYIS* 6948

 

>CYP82H10P CYP82A CAAP02000286.1h-de2b 94% to CYP82A.a-de1b   CAAP02006846.1

N-term

41169 MGFSLQPQEISVFGLLATICLLLETALNAKRNKIKEGKRPPEPSGQ*PMADRFFFL 41002

 

>CYP82H11P.a CYP82A CAAP02000286.1i one stop codon, possible pseudogene, 92% to 82A15

56585 MGFSLRPQDIIVFGLLLSTICLLLATVLNAKGNKKKGERPPEPSGRWPLISHLHLLEADK 56406

56405 LLHRTLGDMADKYGPIFCVHLGLKKALVVSGWEVAKEGYTINDKVFATRPRPLAIKLMGY 56226

56225 DHGSFVFASCGPYWRDVRKLAMAELLSNRQLEMHKHVQDSEVKILIKELYGQWASNKDGP 56046

56045 ALVEMKERFGNLALNVVVRAIAGKRYFGTHACGDEPKRGKKAFEDFIILVGLFMVSDAIH 55866

55865 FLGWLDTVKGFTAEMKRVAKEVDYVLGSWVEDHRQNRLSANDNGAEQDFIHAMLSVIDDG 55686

55685 QFSRRDPDTIIKGTCW (0) 55638

55320 NLILAGYGSTFITLTWALSLLLNNHHALKKA*AELEIHVGKHRQVDGSDIKNLVYLQAIV 55141

55140 KETLRLYRPRPLSLPREAMEDCIVAGFHIQAGTRLLVNLWKLHRDPRVWLNPLEFQPERF 54961

54960 LTKHAGLDVRGRNYELLPFGSGRRVCPGISFALELTHLTLARLLHGFELGAVVDSRVDMT 54781

54780 ESPGLTALKATPLEVTIVPRLPFELYSYEVA* 54685
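
Note on "one stop codon, possible pseudogene": in these translations * marks a stop codon and X an uncalled residue, so a * anywhere other than the last position is an in-frame stop. A minimal sketch of tallying both, assuming the segments are supplied in reading order; the function name is a hypothetical helper. The two strings are the first and last segments of the second exon of CYP82H11P.a as listed above.

```python
def summarize_translation(segments):
    """Count in-frame stops and uncalled residues in a segmented translation.

    A '*' in the final position is treated as the normal terminal stop
    and is not counted as an in-frame stop.
    """
    seq = "".join(segments)
    body = seq[:-1] if seq.endswith("*") else seq
    return {
        "internal_stops": body.count("*"),
        "uncalled": body.count("X"),
    }


exon2_parts = [
    "NLILAGYGSTFITLTWALSLLLNNHHALKKA*AELEIHVGKHRQVDGSDIKNLVYLQAIV",
    "ESPGLTALKATPLEVTIVPRLPFELYSYEVA*",
]
summary = summarize_translation(exon2_parts)

if __name__ == "__main__":
    print(summary)
```

The same tally could be applied to the records labelled pseudogene to list how many in-frame stops and uncalled positions each one carries.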

 

>CYP82H11P.b AM485903.2 (duplicate segment)

94% to CYP82H11P.a

5075 MGFSLHPQDITVFGLLLSTICLLLATVLNAKGNKKKGERPPESSGRWPLIGHLHLLEADK 4896

4895 LLHRTLGDMADKYGPIFCVHLGLKKALVVSGWEVAKEGYTTNDKVFATRPRSLAIKLMGY 4716

4715 DHGSFVFAPYGPY*RDVRKLAMAELLSNRQLEMHKHVQDSEVKILIKELYGQWASNKD 4542

4541 GRPALVEMKERFGNLALNVVVRAIAGKRYFGTHACGDEPKRGKKAFEDFIILV 4387

4386 GLFMVSDAIPFLGWLDTVKGFTAEMKRVAKEVDYMLGSWVEDHRQNRLSANDNGVEQDFI 4207

4206 HAMLSVID

     NLILVGYGSTFITLTWALSLLLNNCHALKKAQAE 4039

4038 LEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYRPGPLSLPREAMEDCTVAGFHIQAGTR 3859

3858 LLL 3850

3853 VNLWKLHRDPRVWLDPLEF 3797

1801 QPERFLTKHEGLDVRGKNYELLPFGSGRRVCPGISFALELTHLTLARLLHGFELGAVV 1628

1627 DSWVDMIESPGLTALKATSLEVTIVPRLPFELYSYEAA* 1511

 

>CYP82H12P.a CYP82A CAAP02000286.1j pseudogene 58% to 82A15

58901 MAFFFQLQDMRVFGPLFAMNFLSLVV

58822 NSKANKNKGKRPPDLSRASHLIFIGHLHLLGANKSLHQTFGAINDKYEPFVFV 58664

 

>CYP82H12P.b AM485903.2 (duplicate segment)

7394  MAFFFQLQDMRVFXPLFAMNFLSLVV 7317

7315  NSKANKNKGKRPPDLSRASHLIFIGHLHLLGANKSLHQTFGAINDKYEPFVF 7160

 

>CYP82H13P.a CYP82A CAAP02000286.1k pseudogene, 91% to CAAP02007374.1 pseudogene

64275 FSFLLSHGFSFEGSSFLAFCLWLIKAEGNKNKGKRAPEPSRAWP*IGHLHLL*AGKPQHQ 64096

64095 AFGA 64084

64084 MADKYGPIFCFHIGPRKTFLVSSWDAAKECFTTMDRAFATRPRSLA*KLMGYDHAMFGF 63908

63907 SPYGPYWRDVRKLASVELLSNLQLELLNHVRDSEVKLFIKELYGQWIQNGDRPLLVEMKE 63728

63727 KCWHMATNVMVSVVAGKQHFGTITNDYESRQCRKALGDLLYLSGIFMVFDANPLPWL 63557

63573 IPFLGWSDTVRGYTTKMKRTTREMDQVLG 63487

 

>CYP82H13P.b AM485903.2 (duplicate segment)

11925  MASVSRDHRFWPSVCG*LK

11944  FSFLLSHGFSFEGSSFLAFCLWLIKAERNKNKGKRAPDPSRAWPLIGHLHLL*AGKPQHQ  11765

11764  AFGAMADKYGPIFCFHIGLRKNFLVSSWDAAKECFTTMDKAFATRPRSLA*KLMGYDHAM  11585

11584  FGFSPYGPYWHDVRKLASVELLSNLQLELLNHVRDSKVKLFIKELYGQWIQNGDRPLLVE  11405

11404  MKEKCWHMATNVMVSVVAGKRHFGTITNDYESR*CRKALGDLLYLSGIFMVFDA  11243

11242  IPFLGWSDTVRGYTAKMKRTTREMDQVLG  11156

 

>CYP82H14.a CYP82A CAAP02000286.1L 96% to 82H14.b on AM485903.2

68598 MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKPLH 68419

68418 RTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKECYTTKDKALANRPRSLAVKLMGYEHA 68239

68238 MFAFERHGPYWRDVRRLAMVELLSNRQHEMLKHVRDSEVKFFIQELYGQWVDNGGSPVLV 68059

68058 DMKKKFEHLVANLVMRTVAGKRCGNGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTVR 67879

67878 GYTAKMKGTARQLDQVIGRWVEEHRQKRLSGSINEAEQDFIHAMLSVIDDAQFSGHDHDH 67699

67698 DTVIKATCL (0) 67672

67306 TVMLASNDTIAITLTWALSLLMNNPHALKKAQEELDFHVGRNQQVYESDIKKLVYLQAII 67127

67126 KETLRLYPAGPLALPHEAMEDCTIAGFHIQAGTRLLVNLWKLHRDPTIWSDPLEFQPERF 66947

66946 LTKHVGLDVRGQHFELLPFGSGRRMCPGISFALEILQLTLARLLHGFELGVVADSPLDMT 66767

66766 EGVGLALPKATPLEVTLVPRLPSELYH* 66683

 

>CYP82H14.b CYP82A19   Vitis vinifera (Pinot noir grape) AM485903.2 71% to 82A15

same as 82A17 on pages, 96% to 82H14.a (duplicate segment)

17161  MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSVAWPLMGHLHLLGADK  16991

16990  PLHRTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKECYTTKDKALATRPRSLAVKLMGY  16811

16810  DHAMFAFERHGPYWRDVRRLAMVELLSNRQHEMLKHVRDSEVKFFIQELYGQWVDNGGSP  16631

16630  VLVDMKKKFEHLVANLVMRTVAGKRCGNGESRWCQALGDFMNLMGQFMVSDAVP  16469

16468  FLGWLDTVRGYTAKMKGTARQLDQVIGRWVEEHRQKRLSGSINEAEQDFIHAMLSVIDDA  16289

16288  QFSAHDHDHDTIIKATCL (0)  16235

14983  TVMLAGNDTIAITLTWALSLLMNNPRALKKAQEELDFHVGRNQQVYESDIKKLVYLQAIIK  14801

14800  ETLRLYPVGPLALPHEAMEDCTIAGFHIQAGTRLLVNLWKLHRDPTIWSDPLEFQPERFL  14621

14620  TKHVGLDVRGQHFELLPFGSGRRMCPGISLALEVLQLTLARLLHGFELGVAADSXLDMTE  14441

14440  GAGVTIPKETPLEVTLVPRLPSELYH*  14360

 

>CYP82H14.c AM467494.2

Gene 3 exon 1    2 aa diffs to CYP82H14 (duplicate segment)

3265 MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKPLH 3086

3085 RTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKECYTTKDKALATRPRSLAVKLMGYDHA 2906

2905 MFAFERHGPYWRDVRRLAMVELLSNRQHEMLKHVRDSEVKFFIQELYGQWVDNGGSPVLV 2726

2725 DMKKKFEHLVANLVMRTVAGKRCGNGESRWCQALGDFMNLMGQFMVSDAVPFLGWLDTV 2549

2548 RGYTAKMKGTARQLD 2504

2502 QVIGRWVEEHRQKRLSGSINEAEQDFIHAMLSVIDDAQFSAHDHDHDTIIKATCL (0) 2338

 

>CYP82H15P.a CYP82A.a CAAP02006846.1 pseudogene 71% to 82A19

100% match to end of AM485903.2, this contig continues the AM485903.2 seq.

3523 MGFYLQLQGIIVFNLLFAIICLWLIK 3446

3446 GKGNRNRGKRAPEPSRAWPLIGHLHLLRTGKPQHQSFGAMADKYGPTFCF 3297

3296 HIGLRKTLVVSSCEVATECLTTMDKAFSIRP 3204

3204 RSLAGKLMGYDHAMFGFSPYGPYWRDVRKLASVELLSNRQLELLNHVRDSEVKLFIIELY 3025

3024 GQWIQNGDRPLLVEMKEKCWHLAANVMVSAVAGK 2923

     XXXXX

2897 ESRRCRKALGDLLYLSGVFMVSDAIPFLGWLDTVRGYTAKMKRTAREVDQVQGS

2735 WVEEHRRKRFSGSMNEAEQDFNHVMLSVIEDGQFSDHDHDTVINATCL (0) 2592

2311 TLIIGGSDSTVITLTWALCPLMNNPSTLKRAQDELDIKVGKHRQVDESDIKNLVYLQAIIK 2129

2128 ETLRLYPAAPLSVPREAMEDCTMAGFHIQAGTRLLVNLWKLYKNPRIWSDPLEFQPERFL 1949

1948 TKHVDLDVRGQNFEFLPFGSGRRVCPGISFALEVVHPTYLRSYIS* 1811

 

>CYP82H15P.a AM485903.2

CYP82A.a  same as CAAP02006846.1

20671 TLIIGGSDSTVITLTWALCPLMNNPSTLKRAQDELDIKVGKHRQVDESDIKNLVYLQAII 20492

20491 KETLRLYPAAPLSVPREAMEDCTMAGFHIQAGTRLLVNLWKLYKNPRIWSDPLEFQPERF 20312

20311 LTKHVDLDVRGQNFEFLPFGSGRRVCPGISFALEVVHPTYLRSYIS* 20171

 

>CYP82H15P.b AM467494.2

gene 1 87% to CYP82H15P, 86% to CYP82H9P

10322 MGFYLQLQGIIVFSLLFAIICLWLIKAKWNKNTGKMAPEPSXAWPLIRHLHLLRAXKPQY 10143

10142 QAXGAXADKYGPIFCFHIXLRKTLVVSXYEVATECLTTMDKAFSTRP 10002

10002 RSLXGKLMGYDQAMFGFSPYGPYWRDVRKLTSVEPLSNXQLELLNHVRDSEVXLFIK*LY 9823

 9822 GQWIQNGDRPLLVEMKEKYWHLXANVMVSAVAGK 9721

 9710 VTNDYESRQCRKALGDLIYLSXIFMVSDAIPFLGWLDTVRGYTAKMKRTVXEVXQVLG 9537

 9536 XWVEEHRWKRFSGSMNEAEQDFNHVMLXXIEDGQFFDHDHDTVIKATC (0) 9393

 6807 TFIIGGSDSTVITLTWALSLLMNNPSTLKTAQDELDIKVGKHRQVDESDIKNLVYLQAII 6628

 6627 KETLQLYPAAPLSVPCEAMEDCTMAGFHIQAGTRLLVNLWKLHKDPRIWLDPLEFQPEKF 6448

 6447 LTKHVDLDVRGQNFEFLPFGSGRRVCPGISFALQVVHPTYLRSYIS* 6307

 

>CYP82H15P.a-de1b CAAP02006846.1 pseudogene 84% to CAN81092.1

also seen on AM485903.2 from 19465-19318

1289 MGFSLQPQEITVFGLLATICLLLETALNAKRNKIKEGKRPPEPS*QWPLIGHLHLLGAD 1113

1112 KLLHRKLGDMADKYGPIFCIHLGFRKALVASRWEVAKECYTTNDKVFATQP 960

 

>CYP82H15P.a-de1b AM485903.2

19649 MGFSLQPQEITVFGLLATICLLLETXLNAKRNKIKEGKRPXEXSXQWP  19506

19503 LIGHLHLLGAXKL 19465

19464 LHXKLGDMADKYGPIFCIHLGFRKALVXSRWEVAKECYTTNDKVFATXP 19318

 

>CYP82H15P.b-de1b AM467494.2

Gene 2 N-term fragment identical to CYP82H15P-de1b on AM485903.2

5780 MGFSLQPQEITVFGXLATICLLLETTLNAKRNKIKEGKRPLELSGQWP 5637

5634 LIGHLHLLGANKLLHQKLGDMADKYGPIFCIHLGFRKALVVSRWEVAKECYTTNDKVFAT 5455

5454 RP 5449

 

>CYP82H16 CYP82A.b CAAP02006846.1 pseudogene 94% to AAP49697.1 missing N-term

6631 VSLTGYDDGSFVFAPCGPYWRDVQKLAMAELLSNRQLEMHKHVQDSEVKILIKELYGQWA 6452

6456 GQWAMGKQQGLPALVEMKERFGNLALNVVVRAIAGKRYFGTHACGDEPKRCKKAFEDFII 6277

6276 LVGLFMVSDAIPFLGWLDTVRGFTPEMKRVAKEVDYVLGS

6156 WVEEHRQNRLSANDNRAEQDFIHAMLSVIDDGQFSGRDPDTIIKGTCS 6013

5703 NLILAGYESTFITLTWALSLLLNNRHALKNAQEELEIHVGKHRQVDGSDIKNLVYLQAIV 5524

5523 KETLRLYPPGPLSLPHEAIENCTVAGFHIQAGTRLLVNLWKLHRDPRVWLDPLEFQPERF 5344

5343 LTKHAGLDVRGKNYELLPFGSGRKVCPGISFALELTHLTLARLLHGFELGAVADSRVDMT 5164

5163 ESPGLTALKATPLEVTIVPRLPFELYSYEAA* 5068

 

>CYP82H17P CYP82A.c CAAP02006846.1 pseudogene N-term fragment

10664 TTMDKAFATQPRSLAGKLMGYNHAMFGFSPCRAY*HDVRKLASIVL 10527

 

>CYP82H18 CYP82A CAAP02007374.1 94% to 82A19

7879 MGFSLQLQDVTVFGILFAIICLWLVNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKPLH 7700

7699 RTFGAMADEYGPIFSIRVGLRTALVVSSSEVAKECYTTKDKALATRPRSLAVKLMGYDHA 7520

7519 MFAFERHGPYWRDVRRLAMVNLLSNRQHEMLKHVRDSEVKFFIQELYGQWVENGGSPVLV 7340

7339 DMKKKFEHLVANLTMRTVAGKRCENGESRWCQSLGDFMNLMGQFMVSDALPFLGWLDTVR 7160

7159 GYTAKMKGTARQLDRVIGRWVEEHRQKRLSGSINDAEQDFIHAMLSVIDDAQLSGHDHDH 6980

6979 DTVIKATCL 6953

6265 TVMLAGNDTIAVTLTWALSLLMNNPRALKKAQEELDFHVGRNQQVYESDIKKLVYLQAII 6086

6085 KETLRLYPAGPLALPHEAMEDCTIAGFHIQAGTRLLVNLWKLHRDPTIWSDPLEFQPERF 5906

5905 LTKHVGLDVGGQHFELLPFGSGRRMCPGISLALEILQLTLARLLHGFELGVVSDSPLDMT 5726

5725 EGVGLAMPKATPLEVTLVPRLPSELYH* 5642

 

>CYP82H19P CYP82A CAAP02007374.1 pseudogene 66% to 82A19

3512 GSSFLAFCLWLIKAEGNKNKGKRAPEPSRA*PLIGHLHLL*AGKPQHQSFGAMADKY 3342

3341 GPIFCFHIGLRKTFLVSSWDAAKECFTTMDKAFDTRPRSLAGKLMGYDHAMFGFSPYGPY 3162

3161 WREVRKLASVELLLNRQLELLNHVRDSEVKLFIKELYGQWIQNGDRPVLVEMKEKCWFLA 2982

2981 ANVMVSEVAGKRYFGT 2934

 

>CYP82A19X gi|147815732|emb|CAN65890.1 = AM467494.2

similar to AM485903.2

hybrid of three different P450 sequences CYP82H14.c, CYP82H15P.b,

CYP82H15P.b-de1b

 

Same as CYP82H15P.b

MADKYGPIFCFHIXLRKTLV  

 

LGNFGT

 

CYP82H15P.b

VTNDYESRQCRKALGDLIYLSXIFMVSDAIPFLGWLDTVRGYTA

KMKRTVXEVBQVLGXWVEEHRWKRFSGSMNEAEQDFNHVMLXXIEDGQXFDHDHDTVIKATCL (0)

VRSKFWGQTFQDLECLFE (intron seq)

CYP82H15P.b

TFIIGGSDSTVITLTWALSLLMNNPSTLKTAQDELDIKVGKHRQVDESDIKNLVYLQAI

IKETLQLYPAAPLSVPCEAMEDCTMAGFHIQAGTRLLVNLWKLHKDPRIWLDPLEFQPEKFLTKHVDLDV

RGQNFEFLPFGSGRR

 

EQDQGRQKATRAVRTMAS

 

CYP82H15P.b-de1b

LIGHLHLLGANKLLHQKLGDMADKYGPIFCIHLGFRK

ALVVSRWEVAKECYTTNDK

 

REAMRSDTESSLHCWFVVWLCTRLLVIVLHADE

 

CYP82H14.c

LQDVTVFGILFAIICLWL

VNAKGNKNKGRSPPEPSGAWPVMGHLHLLGADKPLHRTFGAMADEYGPIFSIRLGLRTALVVSSSEVAKE

CYTTKDKALATRPRSLAVKLMGYDHAMFAFERHGPYWRDVRRLAMVELLSNRQHEMLKHVRDSEVKFFIQ

ELYGQWVDNGGSPVLVDMKKKFEHLVANLVMRTVAGKRCGNGESRWCQALGDFMNLMGQFMVSDAVPFLG

WLDTVRGYTAKMKGTARQLD

 

PGYWKMGGGASSEEAFRKHQ

 

 

>CYP82J1 Populus

23341018 MDFSFHLLAVSTVLALVLWYTLRRVRETRRKTEKGLQPPEPSGALPLIGH 23340869

23340868 LHLLGAQKTLARTLAAMADKYGPIFTIRLGKHPTVVVSNLEAIKECFTTH 23340719

23340718 DRILSSRPRSSHGEHLSYNYAAFGFNNSGPFWREMRKIVTIQLLSSHRLK 23340569

23340568 SLRHVQVSEVNTLINDLYLLSKSNKQGSTKIDISECFERMTINMITRMIA 23340419

23340418 GKRYFSSTEAEKEDEGKRIGKLMKEFMYISGVFVPSDVIPFLGWMNNFLG 23340269

23340268 SVKTMKRLSRELDSLMESWIQEHKLKRLESTENTNKMEDDDFIDVMLSLL 23340119

23340118 DDSMFGYSRETIIKATAM 23340065 (0)

23339700 TLIIAGADTTSITLTWILSNLLNNRRSLQLAQEELDLKVGRERWAEDSDI 23339551

23339550 GNLVYIQAIIKETLRLYPPGPLSVPHEATKDFCVAGYHIPKGTRLFANLW 23339401

23339400 KLHRDPNLWSNPDEYMPERFLTDHANVDVLGHHFELIPFGSGRRSCPGIT 23339251

23339250 FALQVLHLTFARLLQGFDMKTPTGESVDMTEGVAITLPKATPLEIQITPR 23339101

23339100 LSPELYYEC* 23339071

 

>CYP82K1 Populus

168971 MIIWRILSTSHKRNKTLPPPEPSGAWPLIGHLRILNSQIPFFRILGDLAV 168822

168821 KHGPVFSIRLGMRRTLVISSWESVKECFKTNDRKFLNRPSFAASKYMGYD 168672

168671 DAFFGFHPYGEYWLEMRKIATQELLSNRRLELLKHVRVSEIETCIKELHT 168522

168521 TCSNGSVLVDMSQWFSCVVANVMFRLIAGKRYCSGIGKDSGAFGRLVREF 168372

168371 FYLGGVLVISDLIPFTEWMDLQGHVKSMKRVAKELDHVVSGWLVEHLQRR 168222

168221 EEGRVRKEEKDFMDVMLESLAVGDDPIFGYKRETIVKATAL 168099 (0)

168015 NLILAGTDTTSVTLTWALSLLLNHTEVLKRAQKEIDVHVGTTRWVEESDI 167866

167865 KNLVYLQAIVKETLRLYPPGPLLVPRESLEDCYVDGYLVPRGTQLLVNAW 167716

167715 KLHRDARIWENPYEFHPERFLTSHGSTDVRGQQFEYVPFGSGRRLCPGIS 167566

167565 SSLQMLHLTLSRLLQGFNFSTPMNAQVDMSEGLGLTLPKATPLEVVLTPR 167416

167415 LENEIYQH* 167389

 

>CYP82L1 Populus

8826205 MILEALILVFLYGFWKILARNSEGKKSTRAPEPSGAWPLFGHLPSLVGKD 8826354

8826355 PACKTLGAIADKYGPIYSLKFGIHRTLVVSSWETVKDCLNTNDRVLATRA 8826504

8826505 GIAAGKHMFYNNAAFALAPYGQYWRDVRKLATLQLLSNQRLEMLKHVRVS 8826654

8826655 EVDTFIKGLHSFYAGNVDSPAKVNISKLLESLTFNINLRTIVGKRYCSST 8826804

8826805 YDKENSEPWRYKKAIKKALYLSGIFVMSDAIPFLEWLDYQGHVSAMKKTA 8826954

8826955 KELDAVIRNWLEEHLKKKIDGELGSDRESDFMDVMISNLAEGPDRISGYS 8827104

8827105 RDVVIKATAL 8827134 (0)

8827302 ILTLTGAGSTATTLVWTLSLLLNNPTVLKAAQEELDKQVGRERWVEESDI 8827451

8827452 QNLKYLQAIVKETLRLYPPGPLTGIREAMEDCSIGGYDVPKGTRLVVNIW 8827601

8827602 KLHRDPRVWKNPNEFKPDRFLTTHADLDFRGQNMEFIPFSSGRRSCPAIN 8827751

8827752 LGLIVVHLTLARILQGFDLTTVAGLPVDMIEGPGIALPKETPLEVVIKPR 8827901

8827902 LGLELY* 8827922

 

>CYP82L2 Populus

531878 MILGALVLLILYGFWKTLARERESKKLARAPEPSGAWPVIGHLPRLRGQD 531729

531728 PACKTLAAIADKYGPIYSLRLGSHRIVVVSSWETVKDCLTTNDRILATRA 531579

531578 NIAAGKHMGYNNAAFALSPYGKYWRDVRKLVTLQLLSNHRLEMLKHVRVL 531429

531428 EVDAFIKGLHNSYAETAEYPAKVTMSKLFESLTFNISLRTIVGKRYCSSL 531279

531278 YDKENSEPWRYKKAIEKALYLSGIFVMSDAIPWLEWIDFQGHISAMKRTA 531129

531128 KELDAVIGSWLEEHLKKEIQGESDFMDVIISNLADGAAEMSGYSRDVVIK 530979

530978 ATTL 530967 (0)

530264 ILTLTGAGSTAVTLTWALSLLLNHPSVLKAAQEELDKQVGREKWVEESDI 530115

530114 QNLMYLQAIVKETLRLYPPGPLTGIREAMEDCHICGYYVPKGTRLVVNIW 529965

529964 KLHRDPRVWKNPDDFQPERFLTTHADLDFRGQDFEFIPFSSGRRSCPAIN 529815

529814 LGMAVVHLTLARLLQGFDLTTVAGLPVDMNEGPGIALPKLIPLEAVIKPR 529665

529664 LGLPLYN* 529641

 

 

&&&&&

 

>gi|147782327|emb|CAN63048.1| 45% to 82C4 really two genes, not assembled correctly

CAAP02005785.1b 5320-2729 (-) strand runs off the end

MADKYGPIFCFHIGLRKTSVVSSWEVAKECFTTMDKAFATQPRSLAGKLMGYDHAMFGFSPCRAYWRD

DVGSRAARSTHEGQMVAKFVDI ()

VIAKVQYILLRIQHLTRHITILYHFIRPLLKSGVLTLVKIHDNKNPTYMLTKALIAENLELCLASVGQG ()

HQLELLNHVRDSEVKLFIKELYRQWIQNGDRPLLVEMKEKCWHLAANVMVS

TVAGERYFGTVTNDYESRQCRKALGDFLYLSGIFMVSDAIPFLGWLDTVRGYTAKMKRTAREVDQVLGSW

VEEHGWKRVFGSMNEADQDFECFW ()

SHITGSKDQHGCNVDVDSYIGGFDSTVITLTSALSLLMNNPSTQKR

AQDELDIKVGKHRKVDESDIKNLVYLQAIIKETLRLYPAAPLSVPREAMEDCNVAGFHIQAGTRLLVNLW

KLPRDSEIWPDLWSSNLG ()

 

>CYP82H20P CAAP02005785.1a  pseudogene 48% to 82C4 82% TO CAAP02005785.1c

91% to CYP82H5 runs off the end

675 MGFSLQPQDITVLGF 631

631 LLATICLLLATALNAKGNKKKGKRPPEPSGQWPLIGHLHLLGADKLLHRTLGDMADKYG 455

454 PIFCVRLGLK 425

425 KALVVSSWEVAKKCYTTSDKVFATRPKSLAIKLMGYDHGSFVFAPYGPYWCDVRKQAIVE 246

245 LLSNRQLEMHKHVQDTEVKILIKEFYGQWASNK 147

145 ALVEMKERFGNLSLNVVVRAIAGKRYFG 62

 61 THACGDEPKRGKKAFEDFII  2

 

>CYP82H21P CAAP02005785.1b Revised seq pseudogene 46% to 82C2 63% TO CAAP02005785.1c

81% to CYP82H15P.a

5320 MGFYLQLQGIIVFSLLFAIICL*LIKGKGNRNRGKRAPEPSRA*PLIGHLHLLRVGKPQHQAFGA 5126

5125 MADKYGPIFCFHIGLRKTSVVSSWEVAKECFTTMDKAFATQPRSLAGKLMGYDHAMFGFSPC 4940

4939 RAYCVM*GSWPQLCWIVKLEL 4877

3995 VELFSSHQLELLNHVRDSEVKLFIKELYRQWIQNGDRPLLVEMKEKCWHLAANVMVSTV 3819

3818 AGERYFGTVTNDYESRQCRKALGDFLYLSGIFMVSDAIPFLGWLDTVRGYTAKMKR 3651

3650 TAREVDQVLGSWVEEHGWKRVFGSMNEADQDF

     DVVSVIKDGQFSHHDHDTVVKATCL

3046 IGGFDSTVITLTSALSLLMNNPSTLKRAQDELDIKVGKHRKVDESDIKNLVYLQAIIKET 2867

2866 LRLYPAAPLSVPREAMEDCNVAGFHIQAGTRLLVNLWKLPRDSEIWPD 2723

2723  PLEFQPGGFLTKHVDFEV  2670

 

>CYP82H21P-de1b CAAP02005785.1bb new

2243  LPWLVTHGEGGIRQKMKSTAREVYHVLER*LEEHPFKRLAGSINEAKQDFIHVMPSVIGD  2064

2063  GQFSGHNTDTIIKGACL (0)  2013

 

>CYP82H22-de2d CAAP02005785.1c-de2d last exon, runs off the end, 76% to 82A15

457-527 89% to CYP82H6Pv2

20898 STRRVCPGISFALEVMHLTLARLLHSFQLGVVADLPVDRTEGSGVTLPRATPLEVTVVPR 20719

20718 LPSELYGY*AS 20686

 

>CYP82H22-de2c CAAP02005785.1c-de2c new

19453 PFLGWLDTVKGYKAKMKRTAGKVYHVLGR*LEEHPFKRLAESIEAEQDFTHVRPSVIGD 19277

19276 GRFPGHNTDTIIKGTCL (0) 19226 end of first exon 240-316

18705 LKDCTIAGLHIQ 18670 aa 395-406

17176  KDCTVVGFHI 17147 aa 396-405

 

>CYP82H22-de2b CAAP02005785.1c-de2b same as CYP82H4P

16325 ETTVITPTWALSLLLN 16278 aa 324-339

16261 ETTVITPTWALSLLLNKSCVLKK*TQDELDVQLGKHRLVEESAMKT 16124 aa 324-368

16124 NLIYLQAIVKWKHYEY 16077 aa 369-377

 

>CYP82H22 CAAP02005785.1c = CAN81092.1 (first exon)

92% to CYP82H3P

14782 MGFSLQPQDITVFGLLLATICLLLATVFNAKGNKKKGKRPQEPSGRWPIIGHLHLLGADKLPHRTLG 14582

14581 DMADKYGPIFCIHLGLRKALVVSSWEVAKECYTTNDKVFATRPRSLAVKLMGYDHAMLGF 14402

14401 APYGPYWRDVRKLAMVELLSNRQLEMLKHVQDSEVEFLIKELYGQWARNKDSPALVEMK 14225

14224 ERFGNLVMNVMVRAIAGKRYFGTHACDDEPKRGKKALDDFMLLVGLFMVSDAIPFL 14057

14056 GWLDTVKGYTTDMKKIAKELDYLLGRWVEEHRQQRLSANNNRAEVDFLHVMLSVIDDGQF 13877

13876 SGRDPDTVIKATCL (0) 13835

13520 NLILAGYDTTSITLTWALSLLLNNRHALKKAQAELEIHVGKHRQ 13389

13388 VDGSDIKNLVYLQAIVKETLRLYPPGPLSVPHEAMEDCTVAGFHIQAGTRLLVNLWKLHR 13209

13208 DPRVWLDPLEFQPERFLTNHAGLDVRGKNYELLPFGSGRRVCPGISFALELTHLALARL 13032

13031 LHGFELGVVADSPVDMTEGPGLSAPKATPLEVTIVPRLPFELYSYEAA* 12885

 

>CYP82H22 partial gi|147798650|emb|CAN63328.1| 64% to 82C4

same as exon 2 of CAO47699.1 other genome project

NLILAGYDTTSITLTWALSLLLNNRHALKKAQAELEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYPP

GPLSVPHEAMEDCTVAGFHIQAGTRLLVNLWKLHRDPRVWLDPLEFQPERFLTNHAGLDVRGKNYELLPF

GSGRRVCPGISFALELTHLALARLLHGFELGVVADSPVDMTEGPGLSAPKATPLEVTIVPRLPFELYSYE

AA

 

>CYP82H22 gi|157340894|emb|CAO47699.1

100% to CAAP02005785.1c

MGFSLQPQDITVFGLLLATICLLLATVFNAKGNKKKGKRPQEPSGRWPIIGHLHLLGADKLPHRTLGDMA

DKYGPIFCIHLGLRKALVVSSWEVAKECYTTNDKVFATRPRSLAVKLMGYDHAMLGFAPYGPYWRDVRKL

AMVELLSNRQLEMLKHVQDSEVEFLIKELYGQWARNKDSPALVEMKERFGNLVMNVMVRAIAGKRYFGTH

ACDDEPKRGKKALDDFMLLVGLFMVSDAIPFLGWLDTVKGYTTDMKKIAKELDYLLGRWVEEHRQQRLSA

NNNRAEVDFLHVMLSVIDDGQFSGRDPDTVIKATCLNLILAGYDTTSITLTWALSLLLNNRHALKKAQAE

LEIHVGKHRQVDGSDIKNLVYLQAIVKETLRLYPPGPLSVPHEAMEDCTVAGFHIQAGTRLLVNLWKLHR

DPRVWLDPLEFQPERFLTNHAGLDVRGKNYELLPFGSGRRVCPGISFALELTHLALARLLHGFELGVVAD

SPVDMTEGPGLSAPKATPLEVTIVPRLPFELYSYEAA

 

CYP82S subfamily (24 sequences) [12 genes plus 1 allele]

[8 pseudogenes plus 2 pseudogene alleles and one duplicate pseudogene]

 

>CYP82S1 CYP82M.3 gi|147815208|emb|CAN65652.1| 89% to CAN82345.1

CAAP02004370.1a 17035-19016 green part added back, 3 aa diffs

MDLPSHFLAIAGLILGLVLWYNHWRGKTLTHKSKGMSPPEPSGAWPFVGHLHLLHGKVPVFRTLGAMADK

VGPVFVIRLGMYRTLVVSNREAAKECFTTNDKIFASRPNSSAAKILGYNYAAFAFAPHGPYWREMRKLSM

LEILSTRRL

GDLMHVQVSELHAGIKDLYILGKDNNWVNPKKVVISEWFEHLTFNV

VLRMVAGKRYFNNVVHGGEEARSAIAAIKKLLLLVGASVASDVIPFLEWVDLQGHLSSMKL

VAKEMDSLIESWVKEHTGRLNSEASSSQDFIDIMLTKLKDDSLFGYSRETIIKATVL (0)

TMIVAGSDTTSLT

STWLLSALLNNKHVMKHAQEELDLKVGRDRWVEQSDIQNLVYIKAIVKETLRLYTTFPLLVPHEAMEDCH

VGGYHISKGTRLLVNAWKLHRDPAVWSNPEEFQPERFLTSHANVDVFGQHFELIPFGSGRRSCPGLNMGL

QMLHLTIARLLQGFDMTKPSNSPVDMTEGISVALSKLTPLEVMLTPRLPAELY

 

>CYP82S2Pv1 CYP82M.4 CAAP02004370.1b pseudogene, 89% to CAN64501.1

N-TERMINAL MISSING IN THE REGION BETWEEN 19016 AND 20247

90% to CYP82S3

20247 AMADKVGPVFVIRLGMYRALVVSN 20318

20321 EAAKESFTTNDKVFASGP 20374

SSRADKILGYNNAAFGLAPYGPLWREMRKLSMLEILSTGRLSDLMHVHVSELHAGIK

20546 DLYILGKDYNWVNPKKVVMSVWFEHLTFNVVLRMVAGKR*FNNVVHGGEEAGSAIAAIKK 20725

20726 LVPLAGAFVASDLIPFLEWVDLQGHLSSMKQVAKEMDSVLESWVEEHTGRLNTEASSRQD 20905

20906 FIDIMLTKLKDASLFGYSRETIIKATVL (0) 20989

21564 ILIVVGSDTTSITSTWLLSALLNNRHVMKHAQEELDLKVGRDRWVEQSDIQNLVY 21728

21729 LKAIVKETLRLCPAIPLLVPLEAMEDYHVGYHSNSPGYHIPKGTRLLVNAWKLYRGPAVW 21908

21909 SNPEEFQPESF 21941

21940 LTSHATLDVFCQHFELIPYGSGRRSCPGINMALQMLHLTTARLLEGFDMATPSNSLV 22110

22111 DMTEGISITMPKFTPLEVMLTRLPAELY 22194

 

>CYP82S2Pv2 CYP82M gi|147815209|emb|CAN65653.1 = AM459398.2

81% to CAN82345.1, 91% to CYP82S3

97% to CYP82S2Pv1

pseudogene

KNLFAMADKVGPVFVIRLGMYRALVVSSNHEAAKECFTSNDKVFASGPSSRAAKILGYNNAAFGLAPYGPLWREM

RKLSMLEILSTGRLSDLMHVHVSELHAGIEDLYILGKDYNWVNPKKVVM

SVWFEHLTFNVVLRMVAGKR*FNNVVHGGEEAGSAIAAIKKLVPLAGAFVASDLIP

FLEWVDLQGHLSSMKQVAKQMDSVLESWVEEH

TGRLNTEASSRQDFIDIMLTKLKDASLFGYSRETIIKATVL (0)

 

NPAVWSNPEEFQPERFLTSHATLDVFCQ

HFELIPYGSGRRSCPGINMALQMLHLTTARLLEGFDMATPSNSLVDMTEGI

SITMPKFTPLEVMLTRLPAELY

 

>CYP82S3 CYP82M.5 gi|147819436|emb|CAN64501.1| 89% to CAN82345.1

CAAP02003017.1c 2760-4887 (+) strand, restored green seq

MDLPSHFLAIAGLILGLVLWYNHWRGKTLTHKSKGMSPPEPSGAWPFVGHLHLLHGKVPVFRTLGAMADK

VGPVFVIRLGMYRALVVSNHEAAKECFTTNDKVFASRPSSSASKI 3104

3105 LGYNYVAFGLAPYGPLWREIRKLCMLEILSTRRLSDLMHVHVSELHAGIKD

LYILGKDYNWVNPKKVVISEWFEHLNFNVVL

RMVAGKRYFNNVVHGGEEAGSATAVIKKLVPLAGAFVASDVIPFLEWVDLQGHLSSMKQVAKEMDSVLES

WVEEHTGRLNSEASSRQDFIDIMLTKLKDASLFGYSRETIIKATVM (0)

MLIVAGSDTTSITSTWLLSALLNN

RHVMKHAQEELDLKVGRDRWVEQSDIQNLVYLKAIVKETLRLYPAVPLLVPHEAMEDCHVGGYHIPKGTR

LLVNAWKLHRDPAVWSNPEEFQPERFLTSHATVDVLGQNFELIPFGSGRRSCPGINMALQMLHLTIAQLL

QGFDMATPSNSPVDMAEAISITMPKLTPLEVMLTPRLPAELY

 

>CYP82S3-de2b CAAP02003017.1b C-term pseudogene mid region pseudogene

 951 NTMKNSSFKLLHCECFQFKLLTFNIVLRMI 1040

1041 AIKKFLSLTGAFVVSNVIPFLE*VD 1115

1115 LRGHLSSMKLVAKELDSLIESRVEEH 1192

1568 IKESLRLYTLATLSAPHEAMEDCHVGGYRIPKGTCL 1675

 

>CYP82S3-de2c CAAP02003017.1a C-term pseudogene = CAN65652.1

610 TGRRSCPGLNMGLQMLHLTIARLLQ 684

 

>CYP82S4P CYP82M.7 gi|147853671|emb|CAN82329.1 = AM429883.2 79% to CAN82345.1

MDLPSHFLAIAGLILGLVLWYNHWR

GKTLTHKSKGMVPPGALRCLAICRSPAXTTRQ &

VPVFRTLGAMADKVGPVFVIRLGMYR

ALVVSSHEAAKECFTTNDKVFASRPSSSASKILGYNYVAFGLAPY

GPLWREIRKLCVLEILSTRRLSDLMHVHVSELHAGIKDLYILGEDYNWVNPKKVVISEWFEHLNFNVVLR

MVAGKRYFNNVVHGGEEAGSATAVIKKLVPLAGAFVASDVIPFLEWVDLQGHLSSMKQVAKEMDSVLESW

VEEHTGRLNTEVSSRQDFVDIMLTKLKDASLFGYSRETIIKATVM (0)

ILIVAGSDTTSIT &

STWLLSALLNNRHVMK

NAQEELDLKVGRDRWVEQSDIQNLV

YLKAIVKETLRLYPAIPLLVPHEAMEDCHVGGYHIPKGTRLLVNAWKLHRDPAVWSNPEEFQPERFLTSH

ATVDVLGQNFELIPFGSGRRSCPGINMALQMLHLTIAQLLQGFDMATPSNSPVDMAEAISITMPKLTPLE

VMLTPRLPAELY

 

>CYP82S5Pv1 82M.6 CAAP02003017.1d cyan is a frameshifted region 83% to CAN65652.1

PSEUDOGENE

19840 MDLPSHFLAIAGLILGLVFWYNHWRGKTLTHKSKGMAPPGALRCLAIYRSPAPTTRK 20010

20011 VPVFRTLGAMGDKLGPVFVIGLGVYRALVVSNHEAVKECFTTNDKVFASRPSPSAAKILG 20190

20191 YNYAAFGFAPYGPFWREMQKLSLLEILSTRRLSDLMHVQVSELQAVIKDLYILGKDNKWV 20370

20371 NSKRVVMSEWFEHLTFNVVLTMIAGKRYFNDVVHGGEEARSAIAAIKKFMSLSGAFVASD 20550

20551 VIPFLEWVDLQGHLSSMKLVAKELDSLIESWVEEHRGRLNREASSRLDLIDVMLTMLKGA 20730

20731 SLFHYSRETIIKATVV (0) 20778

22115 NIIVGGTDTTSITSTWLLSALLNNRHVMKHAQEELDLKVGRDRWVEQSDIQNLVYLKAIVK 22297

22298 ETLRLYTTAPLSVPHEAMEDFHVGGYHIPKGTRLLVNAWKLHRDPAVWSNPEEFQPERFL 22477

22478 TSHATIDVVGQHFELIPFGSGRRSCPGINLALQMLHLTIARLLQ*FDMATPSNSPVDMTE 22657

22658 GISITMPKVTPLEVMLTPAFVLNFTSAT* 22744

 

>CYP82S5Pv1 DUPLICATE CYP82M.2 gi|147782909|emb|CAN65592.1 = AM457484.2 RUNS OFF THE END

PSEUDOGENE

REMOVE CYAN = INTRON

100% TO CAAP02003017.1d

RSAIAAIKKF

MSLSGAFVASDVIPFLEWVDLQGHLSSMKLVAKELDSLIESWVEEHRGRLNREASSRLDLIDVMLTMLKG

ASLFHYSRETIIKATVV (0)

VSERTRNEVTASHFPRSLPTITYHDSDSSQHLTTISDSRSTYHDLCQPPVRVND

GPATTWCHYDNTKYLPTVKEGNKASECYIYMEPSHKEEATWITRDAGPQQWGMNCRPWVEQESMNSLVEC

RNEEDQKRL

NIIVGGTDTTSITSTWLLSALLNNRHVMKHAQEELDLKVGRDRWVEQSDIQNLVYLKAIVK

ETLRLYTTAPLSVPHEAMEDXHVGGYHIPKGTRLLVNAWKLHRDPAVWSNPEEFQPERFLTSHATIDVVG

QHFELIPFGSGRRSCPGINLALQMLHLTIARLLQ*FDMATPSNSPVDMT 2195

2136 EGISITMPKVTPLEVMLTPAFVLNFTSAT* 2285

 

>CYP82S5Pv2 CYP82M.1 gi|147853615|emb|CAN82345.1 = AM430157.2

PSEUDOGENE

C-TERM = 66% to CYP82M1 tobacco, 49% TO 82M1 WHOLE SEQ

97% TO CAAP02003017.1d

5420 MDLPSHFLAIAGLILGLVFWYNHWRGKTLTHKSKGMAPPGALRCLAIYRSPAPTTREVPVFRTLGAMADK

LGPVFVIGLGVYRALVVSNHEAVKECFTSNDKVFVSRPSPSAAKILGYNYAAFGFAPYGPFWREMRKLSL

LQILSTRRLSDLMHVQVSELQAVIKDLYILGKDNKWVNSKKVVMSEWFEHLTFNVVLTMIAGKRYFNDVV

HGGEEARSAIAAIKKFMSLSGAFVASDVIPFLEWVDLQGHLSSMKLVAKELDSLIESW 6223 &

6223 VEEHRGRLNGEASSRLDLIDVMLTMLKGASLFRYSRETIIKATVV 6357

NMIVA &

6970 DTTSITSTWLLSALLNNRHVMKHAQEELDL

KVGRDRWVEQSDIQNLVYLKAIVKETLRLYTTAPLSVPHEAMEDCHVGGYHIPKGTRLLVNAWKLHRDPA

VWSNPEEFQPERFLTSHATIDVVGQHFELIPFGSGRRSCPGINLALQMLHLTIARLLQ

*FDMATPSNSPVDMTEGISITM 7509

7510 PKVTPLEVMLTPAFVLNFTSAT* 7578

 

>CYP82S6 CYP82M.16 gi|147858518|emb|CAN81014.1 = AM454711.2

65% to CAN82345.1

CAAP02002284.1 45127-42934 (-) strand

1829 MDLLTHLLAFAGLFLGLLYWYNRWRVRTLTHNSKGISAPKPPGAWPIIGHLHLLSGQVPIFRTLGAMADK

HGPVFMIQLGMHPAVVVSSHEAVKECFTTNDKVFASRPRSSVSKLLGYNYAGFGFAPYGPFWREMRKLSV

VEILSARRLNELKDVRISELDACIQDLYSLGKDNNWISPIEVVMSEWFEHLTFNFVLRMIAGKRYFDNAV

HGNEEARGAIIAIKKFLSLSGAFVPSDVFPFLERLDLKGYLGSMKHVAEELDCLVGSWVEEHVMRLKSEP

GSRHDFIDVLLSAVQDTSMFGHSRETVIKATIG (0)

NLIVGGSDSTSITSTWILSALLNNREAMKRAQEELDL

KVGRSRWVEESDIQKLDYLRAIIKESLRLYSAAPLLVPHEATQDCHVCGYHIPKGTRLFVNAWKLHRDPR

VWSNPEEFEPERFLGSHANLDVFGHQFELIPFGSGRRACPGINMALQMLHLTFARLLQGFDMATPSNAPV

DMTEGISFTMPKLTPLCVMLTPRLPSHLY* 4046

 

>CYP82S7 CYP82M.17 CAAP02002284.1 95% to CAN81014.1

36166 MDLLTQLLAFAGLFLGLLFWYNRWIVRTLTHNSKGISAPKPPGAWPIIGHLHLLSGQVPI 35987

35986 FRTLGAMADKHGPVFMIQLGMHPAVVVSSHEAVKECFTTNDKVFASRPRSSVSKLLGYNY 35807

35806 AMFGSAPYGLFWREMRKLSVVEILSARRLNELKDVRISELDACIKDLYSLGKDNNWISPI 35627

35626 KVVMSEWFEHLTFNFALRMIAGKRYFDNAVHGNEEARGAIITIKKYLSLSGAFVPSDVFP 35447

35446 FLERLDLQGYLGSMKHVTEELDCLVGSWVEEHVMRLKSEPGCRHDFIDVLLSTVQDTSMF 35267

35266 GHTRETVIKATIV (0) 35228

34575 NLIVGGSDSTSITSTWILSALLNNREAMKHAQEELDLKVGRSRWVEESDIQKLDYLRA 34402

34401 IIKESLRLYPAAPLLVPHEATQDCHVCGYHIPKGTRLFVNAWKLHRDPRVWSNPEEFEPE 34222

34221 RFLGSHANLDVFGHQFELIPFGSGRRACPGINMALQMLHLTFARLLQGFDMATPSNAPVD 34042

34041 MTEGISFTMPKLTPLRVMLTPRLPSHLY* 33955

 

>CYP82S8 CYP82M.12 gi|147821972|emb|CAN77159.1| 63% to CAN82345.1 too long remove cyan

CAAP02012210.1 1 aa diff

CAAP02009734.1  4380-3760 exon 2 only (2 aa diffs)

MGGGLAWFPPASQAKVPRGPCFGAREPSQDTRRVPCGLAWCPRAEPRHATRAYTKSAKVCFEVENXLQSLXEAL

MWKLSQIEAQXI

MNLLSHLLAVAGFMWLVLLCNVWRVKSFAHRGKGRSAPEPSGAWPFIGHLHLLN

SPMPIFRTLTAMADXHGPVFMIRLGMXRALVVSSHKAVKECLTTNDKAFASRPISSAGKLLGYNYAGFGF

APYGPLWREMRKLSVTELLSNRRLDELKHVLVSELDVCIRDLYSLGKETNWVNPIKVAMSEWLEQXTFNV

VLRMVAGKRYFGNGVHGNEEARHAIAVIKKFLSLTGAFVASDVIPFVEWMDLQGHLGSMKRVAGQLDPFV

EGWVEEHVTKLNSDPSSRQDFIDVMLSVLKDNSIFGHTRETVIKATVM (0)

TLIVGGSETTSIVSTWILSALL

NNRHALKRAQEEIDLKVGRGRWVEESDIZNLIYLQAVVKETLRLYPPAPLSIPHEAVEDCNVCEYHIPKG

TRLFVNVWKLHRDPGVWSDPEEFQPERFLTTNANLNVFGQHFELIPFSSGRRSCPGIALALQILHLTVAR

LLQGYDMTTPLNAPVDMTEGIGITMPRATPLEVMLTPRLPSLLY*

 

>CYP82S9 CYP82M.13 CAAP02008286.1 90% to CAN77159.1, one frameshift

9344 MWKLSQTEAQFIMNLLSHLLAIAGFMGLLLLYNVWRVRSFAHGGKGRSAPEPSGAWPFVG 9165

9164 HLHLLSGPTPIFRTLAAMADKHGPVFMIRLGVHRALVVSSHEAVKECFTTNDKAFASRPS 8985

8984 SSAGKLLGYNYAGFGFAPYGPLWREMRKLSVTELLSNRPLNELKHVLVSELDVCIRDLYS 8805

8804 LGKETNWVNPIKVAMSEWLEQLTFNVVLRMVAGKRYFGNGVHGNEEARHAIAVIKKFIFL 8625

8624 TGVFVASDAIPFVEWMDLQGHLGSMKRVAEQLDPFVEGWVEEHVTKLKSDPSSRQDFIDV 8445

8444 MLSVLKDNSMFGHRRETVIKATVM 8373

8114 TLIVGGSETTSIASIWILSALLNNRHALKRAQEELDLKVGRGRWVEESDIQNLIYLQAAV 7935

7934 KETLRLYPPGPLLVPHEAIQDCNVCGYHIPKGTRLLVNVWKLHRDPDAWSDPEEFQPERF 7755

7754 LTTHANLNVFGQHSELIPFSSGRRSCPG 7671

7669 IALALQILHLTVARLLQGYDMTTPLNAPVDMTEGIGLAMPKETPLEVMLTPRLPSLLY* 7493

 

>CYP82S10 CYP82M.14 CAAP02009192.1 72% to CAN77159.1

CAAP02004896.1 100% match

2122 MMTLLSHLLAVGGLVGLVFFCKAWRVR   KGKGKSAPEPPGAWPIVGHLRLLCGK 2280

2281 KPFCRVLGAMADKHGPVFMIRLGVHPTLVVSSHEAVKECFTTNDKAFASRPRYSSGKLLG 2460

2461 YNNAGFGFAPYGPLWREMRKLSVIGLLSYRRLDELKHVLLSELDFCTGDLYSLGRKRNWV 2640

2641 EPIEVEMNKWLEELPFNVVLRMVAGKRYFVNGVHANEEARHVIAGFRQFVHLLEARVPSD 2820

2821 VLPFLEWIDLDGYLKSMKHVAELLDTSVQGWVEEHVMKLKSDPSSRQDFIDVMLSILKDD 3000

3001 SVFGHTRETVIKATVV (0) 3048

3161 SLIVGGSATTFAASACLLFLLLNHRHALERVQEELDVKVGRERWVEESDIENLV 3322

3323 YLQAVVKETLRLSPPAPLLIAHEAIEDCNVCGYNIPKGTRLFVNVWKLHRDPRTWSDPEK 3502

3503 FQPERFLTSNADISVFGRHFELIPFGSGRRCCPGIALALQMLHLTVARLLQGFDIT 3670

3671 PVGMTPRMGFFSPKETPQKVMLKPRLPSQLY* 3766

 

>CYP82S11 CYP82M.10 gi|147807677|emb|CAN75482.1| 75% to CAN82345.1

MDTPSQLVAISGILGLVLLYGVWRVMTLAGKSKGKSAPEPSGAWPFLGHLPLLRGQTPIFRTLGAMADKH

GPVFMIRLGVHRALVVSSREAVKECFTTNDKAFASRPSSSAGKILGYNHAGFGFAPYGALWREMRKLSMM

EILSASSPQCVKSCSDWVHPVKVVMSEWFQHLSFNIVLKMIAGKRSFNTSDHGNEEARRAIATIHKLLFL

TGAFVLSDAIPGVEWMDLQGYLGSMKRVAKEVDSLVGGWVEEHEMRLNREGNKRQDFIDVMLSVLEDTSM

FGYSRETVIKATIM (0)

ILIVGGTDTLSTTSTWLLSALLNNKHALKCAQEELDLKVGRGRWVEESDIPNLLYL

QAVIKETLRLYTATPLSAPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVWSDPEDFQPERFLTTHAD

LDVLGQHFELIPFGSGRRSCPGITMALKLLPLVIGRLLQGFDLSTPLNAPVDMREGLSITLAKLTPLEVM

LTPRLPSQFY

 

>CYP82S12 AM440067.2, 2 aa diffs to CAO45058.1

MDTLSQLVAISGILGLVLLYGVWRVMILAGKSKGRSAPEPFGAWPFLGHLPILRGQTP

IFRTLGAMADKHGPVFMIRLGVHRALVVSSREAVKECFTTNDKVFASRPSSSAGKILGYN

HAGFGFAPYGALWREMRKLSMMEILSARRLDALKHVQISELDLSIKDLYSLGKGSDWVHP

VKVVMSEWFQHLSFNIVLKMIAGKRYFNTSGHGNEEARRAIATIQKLLFLTGAFVLSDAI

PGVEWMDSQGYLGSMKRVAKEVDSLVGGWVEEHEMRLNSEGNKRQDFIDVMLSVLEDTSM

FGHSRETVIKATIM (0)

 

>gi|157353148|emb|CAO45058.1|

exon 2 = CYP82S12

MDTLSQLVAISGILGLVLLYGVWRVMILAGKSKGRSAPEPFGAWPFLGHLPILRGQTPIFRTLGAMADKH

GPVFMIRLGVHRALVVSSREAVKECFTTNDKVFASRPSSSAGKILGYNHAGFGFAPYGALWREMRKLSMM

EILSARRLDALKHVQISELDLSIKDLYSLGKGSDWFHPVKVVMSEWFQHLSFNIVLKMIAGKRYFNTSGH

GNEEARRAIATIQKLLFLTGAFVLSDAIPGVEWMDSQGYLGSMKQVAKEVDSLVGGWVEEHEMRLNSEGN

KRQDFIDVMLSVLEDTSMFGHSRETVIKATIM (0)

ILIVGGTDTVSTTSTWLLSALLNNKHALKCAQEELDLK

VGRGRWVEESDIPNLLYLQAVIKETLRLYTATPLSAPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSV

WSDPEDFQPERFLTTHADLDVLGQHFELIPFGSGRRSCPGITMALKLLPLVIGRLLQGFDLSTPLNAPVD

MREGLSITLAKLTPLEVMLTPRLPSQFYYVNYGCTKIY

 

>CYP82S12 82M.11 CAAP02004690.1a 95% to CAN75482.1, 50% to 82C4

12565 MDTLSQLVAISGILGLVLLYGVWRVMILAGKSKGRSAPEPFGAWPFLGHLPILRGQTPIF 12386

12385 RTLGAMADKHGPVFMIRLGVHRALVVSSREAVKECFTTNDKVFASRPSSSAGKILGYNHA 12206

12205 GFGFAPYGALWREMRKLSMMEILSARRLDALKHVQISELDLSIKDLYSLGKG

12049 SDWFHPVKVVMSEWFQHLSFNIVLKMIAGKRYFNTSGHGNEEARRAIATIQKLLFLTGAF 11870

11869 VLSDAIPGVEWMDSQGYLGSMKQVAKEVDSLVGGWVEEHEMRLNSEGNKRQDFIDVMLSV 11690

11689 LEDTSMFGHSRETVIKATIM (0) 11630

 8827 ILIVGGTDTVSTTSTWLLSALLNNKHALKCAQEELDLKVGRGRWVEESDIPNLLYLQA 8654

 8653 VIKETLRLYTATPLSAPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVWSDPEDFQPE 8474

 8473 RFLTTHADLDVLGQHFELIPFGSGRRSCPGITMALKLLPLVIGRLLQGFDLSTPLNAPVD 8294

 8293 MREGLSITLAKLTPLEVMLTPRLPSQFY 8210

 

>CYP82S12-de2b CAAP02004690.1b  pseudogene 94% to CAN75482.1

18122 LQAVIKETLRLYTAVLLSVPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVWSDPEDF 17943

17942 QLERFLTSHANLDVLGQHFELIPFGSGRRSCPGS 17841

      TMALKLLPLVIGRLLQGSDLS 17779

17778 TPLNAPVDMREGLSITLAKLTPLEVMLTPRLPSQFY 17671

 

>CYP82S13 CYP82M.9 CAAP02002685.1b  95% TO CAN61061.1, one stop codon possible pseudogene

16307 MDTPSQLVAISGILGLVLLYGVWRVMTLAGKSKGRSAPEPSGAWPFLGHLPLLRGQTPIF 16128

16127 RTLGAMADKHGPVFMIRLGVHRALVVSSREAVKECFTTNDKVFASRPSSSAGKLLGYNYA 15948

15947 GFGFAPYGALWREMRKLSMMEILSARPLDALKHVQISELDLSIKDLYSLGKGSD*VHPVK 15768

15767 VVMSEWFQHLSFNIVLKMIAGKKYFNTSGHGNEEARRAIATIQEFLSLAGAFVLSDAIPG 15588

15587 VEWMDSQGYLGSMKRVAKEVDSLVGGWVEEHEMRLNSEGSKRQDLIDVMLSVLEDTSMFG 15408

15407 HSRETVIKATVM (0) 15372

13333 TLMVGGTDTVATTSTWLLSALLNNKHALKRAQEELDLKVGRGRWVEESDIPNLHYLQA 13160

13159 VIKETLRLYTAAPLSVPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVWSDPEDFQPE 12980

12979 RFLTSHADLDVLGQHFELIPFGSGRRSCPGITMALKLLHLVIGRLLQGFDLSTPLNAPVD 12800

12799 MREGLSIILAKVTPLEVMLTPRLPSQFY* 12713

 

>CYP82S14P CAAP02002685.1a  pseudogene, 76% to CAAP02004690.1a , 82C like

918 YRTFGAIADKHNP 880

883 PIFIIHLGVHRALVVRSREAVKECFTTNDKAFASHLSSSA*KFLGYNYVGFGYATY*ALW 704

703 HEMRKLSMMEILSAHRLDTLKHVEFSELDLSIKYLYSLGKTNNWFHPVKIMMCEWFQHLS 524

523 FNRVLKMIVGKIYFD 479

 

>CYP82S15v1 CYP82M.8 CAAP02002221.1b = CAN61061.1, 65% to CAN82345.1

51754 MDTPSQLVAISGILGLVLLYGVWRVMTLAG

51664 KSKGRSAPEPSGAWPFLGHLPLLRGQTPIFRTLGAMADKHGPVFMIRLGVHRALVVSSR 51488

51487 EAVKECFTTNDKVFASRPSSSAGKLLGYNYAGFGFAPYGALWREMRKLSMMEILSARRLD 51308

51307 ALKHVQISELDLSIKDLYSLGKGSDWVHPVKVVMSEWFQHLSFNIVLKMIAGKRYF 51140

51139 NTSGHGIEEARRAIATIQEFLSLSGAFVLSDAIPGVEWMDLQGYLGSMKRVAKEVDSLVG 50960

50959 GWVEEHEIRLNSEGSRMQDFIDVMLSVLEDTSMFGHSRETVIKATIV (0) 50819

47089 ILIVGGTETVATTSTWLLSALLNNKHALKRAQEELDLKVGRGRWVEESDIPNLLYLQAVIK 46907

46906 ETLRLYTAAPLSVPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVWSDPEDFQPERFL 46727

46726 TSHADLDVLGQHFELIPFGSGRRSCPGITMALKLLPLVIGRLLQGFDLSTPLNAPVDMR 46550

46549 EGLSITLAKLTPLEVILTPRLPSQFY* 46469

 

>CYP82S15v2 CYP82M.15 gi|147833566|emb|CAN66023.1 = AM471158.2

63% to CAN82345.1

NEARLY IDENTICAL TO CAAP02002221.1 51754-46565 (-) STRAND

5 AA DIFFS AND SOME UNCALLED AA.

MDTPSQLVAISGILGLVLLYGVWRVMTLAGKSKGRSAPEPSGAWPFLGHLPLLRGQTPIFRTLGAMADKH

GPVFMIRLGVQRALVVSSREAVKECFTTNDKVFASRPSXSAGKJLGYNXAGFXFAPYGALWREMRKLSMM

EILSARRLDAJKHVQISELDLSIKDLYSLGKGSDWVHPVKVVMSEWFQHLSFNIXLKMIXGKRYFNTSGH

GIEEARRAIATIQEFLSLXGAFVLSDAIPGVEWMDLQGYLGSMKRVAKEXDSLVGGWVEEHEXRLNSEGN

KXQDFIDVMLSVLEDTSMFGHSRETVIKATIM (0)

ILIVGGTETVATTSTWLLSALLNXKHALKRAQEELDLK

VGRGRWVEESDIPNLLYLQAVIKETLRLYTAAPLSVPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSV

WSDPEDFQPERFLTSHADXDVLGQHFELIPFGSGRRSCPGITMALKLLPLVIGRLLQGFDLSTPLNA &

PVDMREGLSITLAKLTPLEVILTPRLPSQFY*

 

>CYP82S16P CAAP02002221.1a pseudogene 62% to CAN61061.1

26959 FQTLGAMANKHNPSS*STLECTVHLW*VVVRLLKNVFTTNDKAFASCQSSSA*KLLG*NY 26780

26779 AGFSYAPY*ALWHEMRKPSMMEILSTHRLDTLKHVKVSELDLSIKDLYSLGKANNWVHPV 26600

26599 KVMMCEWFQHLNFNRVLKMIARKIYFDHSGHGNEETRQVIAKIQKLLSLSRVFVLS 26432

26431 NVIPCVEWMDLQGHLGLIKRVTKELDSLVG

 

>CYP82S17 AM455390.1 pseudogene 46% to 82D14

41% to CYP82C4 with a seq gap

Second exon 65% to CYP82C4

This seq is the corrected seq of CAN67787.1 and CAN67788.1

88% to CYP82S12 (AM440067.2)

14443 VAEKCKGRLAPGPSDAWPFLGHLPLLRGQTPIFRTLGAMVDKQDPVFMIRLGVHRALVV 14619

14620 SSREAVKECFTTNDKAFASRPSSSAGK 14700

14703 GYNYAGFGFTPYRALWREMRKLSMMEILSARRLDALTHVQISELDLFIKDLDSLGKGSD 14879

14880 WVHPVKVEMSKWFQHLSFNIALKMIAGKKYFNTSGHGNEEARLAIATI*KLLFLTGTFV 15056

15057 LSEAIPGVEWMDSQGYLGSMKRVAKEVDSLVGG*VEEHEMRLNSEGRKREDFIYVIL 15227

15228 SVLEDTSMFGHSKETVIKATIM (0) 15293

18129 VYNLQAVIKETL 18164

18166 RLYTAVPLSVPHETMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVRSDPEDFQPERFLTTH 18345

18346 ADLDVLGQHFELIPFGSGRRS 18408

 

18433 PLSVPHEAMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVRSDPEDFQAERFLTTHADLDVLG 18615

 

18662 GTRLFVNAWKLHR 18700

 

>gi|147834859|emb|CAN67787.1 = AM455390.1 see 82S17

38% to 82C4

pseudogene incorrectly assembled, green parts are P450

MTHLRSIITCPSNNDSYNKYQ ()

VSSNLGNSGVGFIVWCTVAEKCKGRLAPGPSDAWPFLGHLPLLRGQTPI

FRTLGAMVDKQDPVFMIRLGVHRALV

 

FQCRKAGYNYAGFGFTPYRALWREMRKLSMMEILSARRLDALTH

VQISELDLFIKDLDSLGKGSDWVHPVK (0)

 

LQYSLKDDRRKEVFQYFWPWKRGSKAGHSNNLETPVSNGDICF

VGGHSRCRMDGFTRVFGVDEAGGEGSGLSCGRLSGRT

 

>gi|147834860|emb|CAN67788.1| see 82S17

62% to 82C4

MQTLMVGSTDTVSTTSKLLSRKPWRLYTAVPLSVPHETMEDCHVAGYHIPKGTRLFVNAWKLHRDPSVRS

DPEDFQPERFLTTHADLDVLGQHFELIPFGSGRRSSHTHTHRVPLSVPHEAMEDCHVAGYHIPKGTRLFV

NAWKLHRDPSVRSDPEDFQAERAQPQFSTTRWTKEEKEEEEEEEEEGEGEEGRAGGRTWLEASQATASCL

AHLSGSFMKGRRLPLGEALKGGVALP

 

 

CYP84 sequences

 

>CYP84A30 gi|147799011|emb|CAN74838.1| 60% to 84A5

MASPLQSLLTFPSLFFLLFSLFLFIIFLRXXSRKLPYPPGPKGLPIIGNMLMMNQLTHRGLANLSKVYGG

LLHMKMGVLHLVVVSTPEMAREVLQVQDSVFANRPARVAIKYLTYDRADMAFAQYGPSWRQMRKICVMKL

FSRKRAESWASVREEVDSTLQSIAKRGGSAVNIGELALDLTKNITYRAAFGSSSREKQKEFVKILQEFSR

LFGAFNFADFIPWLGWIQGKEFTKRLVKARGSLDEFIDKIIDDHIEKRKKQNNSGDESESEAELDIVDEL

MEFYSKDVAAEDLNSSIKFTRDDIKAIIMDVMFGGTETVASAIEWAMAELMKSPDDLKKLQQELIDVVGL

NRRLHESDLEKLTYLKCCIKETLRLHPPIPVLLHETAEDSVVAGYSVPARSDVMINAWAINRDKTAWEDP

ETFKPERFLKDAPDFKGSHFEFIPFGSGRRSCPGMQLGLYGLDLAVGHLVHCFSWELPDGMKASDLDMSD

VFGLTAPRAIQLIAVPTYRLQCLLLE

 

>CYP84A30 CAAP02007467.1  2 aa diffs with CAN74838.1

1535  MASPLQSLLTFPSLFFLLFSLFLFIIFLRNFSRKLPYPPGPKGLPIIGNMLMMNQLTHRG  1714

1715  LANLSKVYGGLLHMKMGVLHLVVVSTPEMAREVLQVQDSVFANRPARVAIKYLTYDRADM  1894

1895  AFAQYGPSWRQMRKICVMKLFSRKRAESWASVREEVDSTLQSIAKRGGSAVNIGELALDL  2074

2075  TKNITYRAAFGSSSREKQKEFVKILQEFSRLFGAFNFADFIPWLGWIQGKEFTKRLVKAR  2254

2255  GSLDEFIDKIIDGHIEKRKKQNNSGDESESEAELDIVDELMEFYSKDVAAEDLNSSIKFT  2434

2435  RDNIKAIIM (0)

2633  DVMFGGTETVASAIEWAMAELMKSPDDLKKLQQELIDVVGLNRRLHESDLEKLT  2794

2795  YLKCCIKETLRLHPPIPVLLHETAEDSVVAGYSVPARSDVMINAWAINRDKTAWEDPETF  2974

2975  KPERFLKDAPDFKGSHFEFIPFGSGRRSCPGMQLGLYGLDLAVGHLVHCFSWELPDGMKA  3154

3155  SDLDMSDVFGLTAPRAIQLIAVPTYRLQCLLLE  3253
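
Notes such as "2 aa diffs with CAN74838.1" can be re-checked with a small script. The following is only a sketch, not the procedure used to build this file: it assumes the two peptides have already been aligned to equal length, and it treats X, B, J and Z as uncalled residues that are not counted as differences. The function name count_aa_diffs is hypothetical.

```python
# Sketch: count amino acid differences between two pre-aligned peptides,
# skipping positions where either residue is an uncalled/ambiguity code.
UNCALLED = set("XBJZ")  # assumption: these letters mark uncalled residues


def count_aa_diffs(a, b):
    """Return (differences, uncalled_positions) for two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    diffs = 0
    uncalled = 0
    for x, y in zip(a, b):
        if x in UNCALLED or y in UNCALLED:
            uncalled += 1  # cannot be scored either way
        elif x != y:
            diffs += 1
    return diffs, uncalled


if __name__ == "__main__":
    # Short fragments copied from the two CYP84A30 entries above
    # (CAN74838.1 first, CAAP02007467.1 translation second).
    print(count_aa_diffs("IIFLRXXSRKL", "IIFLRNFSRKL"))  # uncalled XX region
    print(count_aa_diffs("KIIDDHIEK", "KIIDGHIEK"))      # D/G difference
    print(count_aa_diffs("FTRDDIKAIIM", "FTRDNIKAIIM"))  # D/N difference
```

Joined exons would have to be concatenated, and any gaps handled by a real alignment tool, before a whole-protein comparison is meaningful.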

 

>CYP84A31 CAAP02000429.1 CAN73481.1 72% to CAN74838.1, 1 aa diff

130776 MDSIMEGLQPLQMTLLFIIPLILLLGLVSRLRRRRPYPPGPKGLPLIGSMNMMDQLTHRGLAKIAKQYGGIFH 130558

130557 LRMGYLHMVGVSSPDIARQVLQVQDNIFSNRPATIAISYLTYDRADMAFAHYGPFWRQMR 130378

130377 KLCVMKLFSRKRAESWESVREEVESTVRTVASSIGSPVNIGELVFTLTKNIIYRAAFGTS 130198

130197 SKEGQDEFISILQEFSKLFGAFNIADFIPWLSWVDPQGLNARLAKARKSLDGFIDDIIDD 130018

130017 HMQKKKLNNDSDEVDTDMVDDLLAFYSEEAKVNESEDLQNAIELTRDNIKAIIM (0) 129856

129689 DVMFGGTETVASAIEWAMAEMMKSPEDLKKVQQELADVVGLNRRVEESDLEKLTYLK 129519

129518 CVLKETLRLHPPIPLLLHETAEDAEVAGYHIPARSRVMINAWAIGRDKNSWDEPETFKPS 129339

129338 RFLKAGVPDFKGSNFEFIPFGSGRRSCPGMQLGLYALELAVVHLLHCFTWELPDGMKPSE 129159

129158 LDMGDVFGLTAPRATRLVAVPSPRLECPLS* 129066
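
The coordinate-flanked lines in entries like the one above can be sanity-checked: when both nucleotide coordinates are given, the span between them should equal three times the number of residues (a trailing * counted as one codon), and a start larger than the end indicates the (-) strand. The following is a minimal sketch under those assumptions. The regular expression and the name check_block are illustrative only, and lines with split codons at exon boundaries or a missing coordinate will not pass, so a mismatch is a flag for inspection rather than proof of an error.

```python
import re

# start coordinate, peptide, optional intron-phase tag such as "(0)", end coordinate
BLOCK = re.compile(r"^\s*(\d+)\s+([A-Z*]+)(?:\s+\(\d\))?\s+(\d+)\s*$")


def check_block(line):
    """Return (strand, nt_span, expected_nt, consistent), or None if unparsable."""
    m = BLOCK.match(line)
    if m is None:
        return None  # header, blank line, or a line lacking one coordinate
    start, seq, end = int(m.group(1)), m.group(2), int(m.group(3))
    span = abs(start - end) + 1            # inclusive nucleotide span
    strand = "-" if start > end else "+"   # descending numbers = minus strand
    expected = 3 * len(seq)                # three nucleotides per residue or stop
    return strand, span, expected, span == expected


if __name__ == "__main__":
    line = "129158 LDMGDVFGLTAPRATRLVAVPSPRLECPLS* 129066"
    print(check_block(line))
```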

 

>CYP84A32v1 CAAP02003166.1 97% to CAN74838.1, 96% to CAN60361.1

There seem to be three genes in each genome project. This one is matched with

CAN60361.1 by default, since the other two pairs are obvious best matches.

36116  MASPLQSLLTF

36149  PSLFFLLFSLFLFIIFLRKLSRKLPYPPGPKGLPIIGNMLMMNQLTHRGLANLSK  36313

36314  VYGGLLHMKMGVLHLVVVSTPEMAREVLQVQDSVFANRPARVAIKYLTYDRADMAFAQYG  36493

36494  PSWRQMRKICVMKLFSRKRAESWASVREEVDSTLQSIAKRGGSAVNIGELALDLT  36658

36659  KNITYRAAFGSSSREKQEEFVKILQEFSRLFGAFNFADFIPWLGWIQGKEFNKRL  36823

36824  VKARGSLDEFIDKIIDDHIEKRKKQNNSGDESESEAELDMVDELMEFYSEDVAAEDLNSS  37003

37004  IKFTRDNIKAIIM (0)

37214  DVMFGGTETVASAIEWAMAELMKSPDDLKKLQQELTDVVGLNRRLHESDL  37363

37364  EKLTYLKCCIKETLRLHPPIPVLLHETSEASVVAGYSVPARSDVMINAWAINRDKTAWE  37540

37541  DPETFKPERFLKKDAPDFKGSHFEFIPFGSGRRSCPGMQLGLYGLDLAVAHLVHCFSWEL  37720

37721  PDGMKASDLDMSDLFGLTAPRAIQLVAVPTYRLQCPLLE* 37840

 

>CYP84A32v2 gi|147794276|emb|CAN60361.1| 66% to 84A1

MASPLQSLLTFPSLFFLLFSLFLFIIFLRNFSRKLPYPPGPKGLPIIGNMLMMNQLTHRGLANLSKVYGG

LLHMKMGVLHLVVVSTPEMAREVLQVQDSVFANRPARVAIKYLTYDRADMAFAQYGPSWRQMRKICVMKL

FSRKRAESWASVREEVDSTLQSIAKRGGSAVNIGELALDLTKNITYRAAFGSSSREKQKEFVKILQEFSR

LFGAFNFADFIPWLGWIQGKEFTKRLVKARGSLDEFIDKIIDXHIEKRKKQNNSGDESESEAELDIVDEL

MEFYSKDVAAEDLNSSIKFTRDBIKAIIMDVMFGGTETVASAIEWAMAELMKSPDDLKKLQQELIDVVGL

NRRLHESDLEKLTYLKCCIKETLRLHPPIPVLLHETAEDSVVAGYSVPARSDVMINAWAINRDKTAWEDP

ETFKPERFLKDAPDFKGSHFEFIPFGSGRRSCPGMQLGLYGLDLAVGHLVHCFSWELPDGMKASDLDMSD

VFGLTAPRAIQLIAVPTYRLQCLLLE

 

>CYP85A1 gi|81239117|gb|ABB60086.1| brassinosteroid-6-oxidase = DQ235273.1

53071-55904 CAAP02001155.1

MAVFGVVLIGLCICTALLRWNEVRYRKKGLPPGTMGWPVFGETTEFLKQGPSFMKNQRARYGKFFKSHLL

GCPTTVSMDPELNRYILMNEAKGLVPGYPQSMLDILGKCNIAAVHGSTHKYMRGALLALISPTMIRGQLL

PKIDEFMRSHLNKWDTKIINIQEKTKEMALLSSLKQIAGIESGTISKEFMPEFFKLVLGTISLPIDLPGT

NYRRGFQARKNIVGMLRQLIEERKASQETHNDMLGCLMRTNENRYKLSDEEIIDLIITILYSGYETVSTT

SMMAVKYLHDHPRVLDELRKEHLAIRERKRPEDPIDWNDYKLMRFTRAVIFETSRLATIVNGVLRKTTKD

MELNGFVIPKGWRIYVYTREINYDPLLYPDPLAFNPWRWLDKSLESQNYFLLFGGGTRQCPGKELGIAEI

STFLHYFVTRYRWEEVGGDKLMKFPRVEAPNGLHIRVSAY*

 

>CYP85A1 gi|147792763|emb|CAN66537.1| 97% to CYP85A1, differs at one intron

MAVFGVVLIGLCICTALLRWNEVRYRKKGLPPGTMGWPVFGETTEFLKQGPSFMKNQRARYGKFFKSHLL

GCPTTVSMDPELNRYILMNEAKGLVPGYPQSMLDILGKCNIAAVHGSTHKYMRGALLALISPTMIRGQLL

PKIDEFMRSHLNKWDTKIINIQEKTKEVYGTSLFSQADAVAGIESGTISKEFMPEFFKLVLGTISLPIDL

PGTNYRRGFQARKNIVGMLRQLIEERKASQETHNDMLGCLMRTNENRYKLSDEEIIDLIITILYSGYETV

STTSMMAVKYLHDHPRVLDELRKEHLAIRERKRPEDPIDWNDYKLMRFTRAVIFETSRLATIVNGVLRKT

TKDMELNGFVIPKGWRIYVYTREINYDPLLYPDPLAFNPWRWLDKSLESQNYFLLFGGGTRQCPGKELGI

AEISTFLHYFVTRYRWEEVGGDKLMKFPRVEAPNGLHIRVSAY

 

>CYP85A10 gi|147770370|emb|CAN73647.1| = AM431608

MAVLVVIFVLVVALSVCFALLKWNEIRYSRRGLPPGTMGWPLFGXTTDFIKQGPDFMKKQRARYGSFFKT

HILGCPTIISMDPELNRYVLLNEGKGLVPGYPQSMLDILGEHNIAAVQGSTHKYIRGSMLSLIAPPMIKD

QLLRKIDQGMRFHLSNWDGRTIDIQEKTNEMALFIPFKLIMETESASIYETFKREFDKLVEGTLSLPINI

PGTSYHHGFQGRKNVIRMLKGVMEKRRASSMTQDDMLGYLLRNEGSKYNLSDEEILDQVITILYSGYETV

STTSMMAVKYLLDNPRALQQLREEHLAIRQRKNPEDPIDWNDYKSMNFTRAVIFETSRLATVVNGVLRKT

TKEMELNGFVIPRGWRIYVYTREINYDPFLYPEPYTFNPWRWLDNSLESHNYCFTFGGGTRLCPGKELGI

VQISTFLHYFLTSYRWEEVGSNKIVKFPRVEAPNGLHIRVSKY

 

>CYP85A10 CAAP02000704.1

70874  MAVLVVIFVLVVALSVCFALLKWNEIRYSRRGLPPGTMGWPLFGETTDFIKQGPDFMKKQRAR  70686

69733  YGSFFKTHILGCPTIISMDPELNRYVLLNEGKGLVPGYPQSMLDILGEHNIAAVQGSTH  69557

69556  KYIRGSMLSLIAPPMIKDQLLRKIDQGMRFHLSNWDGRTIDIQEKTNE  69413

69276  MALFIPFKLIMETESASIYETFKREFDKLVEGTLSLPINIPGTSYHHGFQ  69127

       GRKNVIRMLKGVMEKRRASSMTQDDMLGYLLRNE  68942

68941  GSKYNLSDEEILDQVITILYSGYETVSTTSMMAVKYLLDNPRALQQLR

       EEHLAIRQRKNPEDPIDWNDYKSMNFTRA

       VIFETSRLATVVNGVLRKTTKEMELN  68477

68396  GFVIPRGWRIYVYTREINYDPFLYPEPYTFNPWRWL

       DNSLESHNYCFTFGGGTRLCPGKELGIVQISTFLHYFLTSYR  68061

67871  WEEVGSNKIVKFPRVEAPNGLHIRVSKY  67788

 

>CYP85A11P CAAP02000704.1 67% to 85A10 probable pseudogene

64585  MEVVGMVGLVLGVGICVGLALLMKWNEMRYRRKGLPPGSMGLPFFGETAKFXXXXXXXXXXXXXX 64433

63500  YGNFFKTHIFGCPTVICMDPGVNRYILLNEGKGLVPGYPPSMRNIIGNKNIAAVHGATH  63324

63323  KYIRGSLLSLIGPPVIKDHLLQQVDGLMRSFLHNW  63219

52684  DGKTIDIQDKTNEV  52643

52566  MALLVSYKQMLEIEPALLYEAFKPEFDKLVIGTLAMPINLPGTNYYFGFQGRKNVLKML  52390

52389  RKVIAERRASSATHNDMLGDLLSKEDPKHSLLSDEEILDQIITILYSGYETVSKTAMMSI  52210

52209  KYLHDNPKALQQLR  52168

52067  EEHLAIRKGKSPEDPIGWTEYKSMTFTRA  51981

51882  VILETSRLDTIVNGVLRETTNDIEVNG  51802

46993  XXXXXXWRIYVYTRETNYDPLQYPEPFTFNPWRWL  46907

46809  DKSLESQNYCFLFGAGNRVCPGKELGIVKISMFLHHLVTRYR  46684

46588  WEEVGDAEIAKFPRVEAPKGLHIKITKY*  46502

 

>CYP86A28 gi|147852119|emb|CAN80156.1| 79% to 86A8

CAAP02001291.1 81120-79537 (-) strand 100% match

MAITAYLLWFTFISRSLRGPRVWPLLGSLPGLIENSERMHEWIAENLRACGGTYQTCICAVPFLARKQGL

VTVTCDPKNLEHILKTRFDNYPKGPTWQGVFHDLLGEGIFNSDGDTWLFQRKTAALEFTTRTLRQAMARW

VSRAIKHRFCPILRAAQLEAKPVDLQDLLLRITFDNICGLTFGMDPQTLAPGLPENSFASAFDRATEASL

QRFILPEVMWKLKKWLGLGMEVSLSRSIVHVENYLSKVINTRKVELLSQQKDGNPHDDLLSRFMKKKESY

SDSFLQHVALNFILAGRDTSSVALSWFFWLVTQNPTVEKKILHEICTVLMGTRGSDTSKWVDEPLEFEEL

DRLIYLKAALSETLRLYPSVPEDSKHVVADDILPDGTFVPAGSSITYSVYSSGRMKSTWGEDWPEFRPER

WLSADCQRFILHDQYKFVAFNAGPRICLGKDLAYLQMKSIAASVLLRHRLTVVAGHRVEQKMSLTLFMKY

GLKVNVHERDLTAVVDGVRNEKASEVCGEEAVRVKCNVAAQQFLKAQLRLILIEHEG

 

>CYP86A29 gi|147844260|emb|CAN80040.1| 83% to CAN80156.1 end is wrong, revised

CAAP02000797.1 107620-105983 (-) strand

MDASTVLLLLAIITAYLIWFRSITRSLKGPRVWPVVGSLPLLIQNANRMHEWIAENLRSCGGTYQTCICP

IPFLARKQGLVTVTCDPKNLEHILKIRFDNYPKGPTWQAVFHDLLGEGIFNSDGETWRFQRKTAALEFTT

RTLRQAMARWVTRAIKLRFCPILKKAQLEGKPVDLQDLLLRLTFDNICGLAFGKDPQTLAPGLPENSFAT

AFDRATEATLQRFILPEFIWKLKKWLRLGMEASLTHSLGHVDKYLSDVISTRKLELVSQQQGGSPHDDLL

SRFMKKKEAYSDNFLQHVALNFILAGRDTSSVALSWFFWLVIQNPRVEEKILTEICTVLMETRGSDTSKW

VEDPLVFEEVDRLIYLKAALSETLRLYPSVPEDSKHVVVDDVLPDGTFVPAGSAITYSIYSTGRMKFIWG

EDCLEFRPERWLSADGKKIELQDSFKFVAFNAGPRICLGKDLAYLQMKSIAAAVLLRHRLTVAPGHRVEQ

KMSLTLFMKYGLKVNVQPRDLTPILANIPKAKPNTKVNGNVHNVYEVENLTGVAY*

 

SHENLIFLEEVLSCGNILYSTLALSSRNQETIDF

AWWIGMSDLSIYLVNYLGLM

 

>CYP86A30 CAAP02004642.1 67% to CAN80040.1, 79% to 86A1 Arab.

17889 METPSIVFFVVAAASAYVLWFYLLARKLTGPKMWPVVGSLPALFMNRRRIHDWISGNLRETGGA 17698

17697 ATYQTCTLALPFLAYKQGFYTVTCHPKNIEHILRTRFDNYPKGPTWQTAFHDLLGQGIFN 17518

17517 SDGDTWLIQRKTAALEFTTRTLRQAMSRWVNRTIKMRLWRILEKAASEKSSVDMQDLLLR 17338

17337 LTFDNICGLTFGKDPETLSPDFPENPFSMAFDSATESTLQRLLYPGFLWRLKKFLRIGAE 17158

17157 RRLKQSLRVVENYMDDAVAARKERPSDDLLSRFMKKRDVDGNVFPTSVLQR 17005

17004 IALNFVLAGRDTSSVALSWFFWLIMNNPRVEEKITTELSTVLRETRGDDQTKWLEEPLVF 16825

16824 DEADRLIYLKAALAETLRLYPSVPEDFKYVVSDDILPDGTYVPAGTTVTYSIYSVGRMKT 16645

16644 IWGEDCLEFKPERWLSTGGDRFEPPKDGYKFVAFNAGPRTCLGKDLAYLQMKSVASAVLL 16465

16464 RYRLSPVPGHRVEQKMSLTLFMKNGLRVYLHPRGLEPPGAATSA* 16330

 

>CYP86A31P CAAP02001116.1 pseudogene 2 aa diffs to CAN80040.1

16533 VGEGIFNSDGETWRFQRKTAALEFTTRTLRQAMARRVTRAI 16655

16656 GKDPQTLAPGLPENSFATAFDRATEATLQHFILP 16757

 

>CYP86B7 gi|147836212|emb|CAN75428.1| = AM486428.2

CAAP02000522.1 N-term revised cyan not correct 33550-35827

MTLDEVIRRSKISAAEEEEFIEPGKEVEMEMDEE

MINPSSNFTPLPSHGIARNFVSRRLLFLPEIQVMEVLVALIVFVAIHSLRQRKRHGLPVWPFLGM

LPSLVGGLRTDMYEWISGVLCRRNGTFVFKGPWFSSLNCVVTSDPRNIEHLLKGKFSNFPKGAYFRNTVR

DLLGDGIFSADDETWQRQRKTASIEFHSAKFRKLTIESLLVLVHSRLLPVLEDSVKNSASIDLQDVLLRL

TFDNVCMIAFGVDPGCLRLGLPEIPFARAFEDATEATVLRFVTPTCIWKAMRHLNIGTEKNLKISIMGVD

KFANEVITTRKKELSLQCDDKNQRSDLLSIFMGLKDENGQPFSDKFLRDICVNFILAGRDTSSVALSWFF

WLLDRNPAVEERIMAEICRMVGERKGEEEGDGLIFKAEEVKKMEYLQAALSEALRLYPSVPVDHKE (0)

VIEDDVFPDGTVLKKGTKVVYAIYAMGRMEGIWGKDCREFKPERWLRDGRFMSESAYKFTAFNGGPRLCLGKDF

AYYQMKFAAASIIYRYHVKVVENHPVEPKLALTMYMKHGLKVNLIRRHESELQKYLKIRN*

 

>CYP86C8 gi|147791153|emb|CAN63571.1| = AM462286.2

CAAP02000152.1   120976-122535 (+) strand 4 aa diffs

MAGKLVMSVVEWLCHHIWFSDIAVALSIVFILSCILNRLTNKGPMLWPVLGILPSLFFHMNDIYDWGTRA

LIKAGGTFHYRGMLMGGNYGIMTVDPSNIEFMLKTRFKNFPKGNYYRERFHDLLGGGIFNVDHESWKEQR

RIASSEMHSTQFVAYSFQTIQDLVNQKLLELTDKLAKSGDCIDLQEVLLRFTFDNICTAAFGIDPGCLAL

ELPEVSFAKAFEEATELTLFRFMVPPFIWKSMKFFGVGTEKRLQEAVRVVHDFAEKTVADRRIELSKTGN

LNKQTDLLSRIMAIGEHEEGKDNHFSDKFLKDFCISFILAGRDTSSVALAWFFWLINKNPEVENKILGEI

NEVLGHRESNTALTMKDLKKMVYLQAALSETLRLYPSVPVDFKEVVEDDVLPDGTRVKKGSRVLYSIFSM

ARMESIWGKDCMEFKPERWIKDGQFVSENQFKYPVFNAGPRLCIGKKFAFTQMKMVAASILMRYTVKVVE

GHSVVPKMTTTLYMRNGLLVTLEPRLSLVIN

 

>CYP86C10 gi|147793015|emb|CAN77648.1| 79% to 86C8

CAAP02000057.1 64019-65599 no introns, identical to CAN77648.1

MVGRIVISAVDWVVHHIWFSDIAVALSSIFIFSSILHRLTNKGPMLWPVMGIIPTVFFH

MNDIYNWGTRVLIRAGGTFYYRGMWMGGSYGIMTIDPANIEYMLKTRFKNFPKGNYYRER

FNDLLGGGIFNADDESWKEQRRLATFEMHSGPFVAHSFQTIQGLVHQKLLKLIEKLAKSG

DCIDLQEVLLRFTFDNICTAAFGVDPGCLALDLPEVPFAKAFEEATESTLFRFIVPPFVW

KPMRFFRVGTEKRLKEAVRIVHDFAEKTVTERRIELSKAGSSTNRCDLLSRIVAIGYSEQ

GKNNNFSDKFLKDFCISFILAGRDTSSVALAWFFWLIHKNPDVESRILSEIKEVLGPYDS

NKEDLSQRAFTVEELKKMVYLQAALTESLRLYPSVPIDFKEVMEDDVFPDGTPIKRGARV

LYSIFSMARIESIWGKDCMEFKPERWIKDGELVSENQFKYPVFNAGPRLCIGKKFAYMQM

KMVAASILMRYSVKVVEGHNVIPKMTTTLYMKNGLLVTFKPRLSLVS*

 

>CYP87A12 CAAP02003123.1b 57% to 87A2

17000  MGYLWYAFAFTATLLVCITHWVYSFRKPKFNGKLPPGSMGLPILGETLQFFAPNTALDV  16824

16823  APFIRERMNR

16652  YGPLFRTSLVGWPLVISTDPDLSRFILQQEGKLVHSWYTESFDNVVGKQNVLSAKGAMHKCLR  16464

16463  NLILNQFGSESLKTRFLTQVEELVLKHLQLWSNCTSVELKEAIAS

       MIFGFTAKKLFDYDESRTPEKLRENYAAFLDGLISFPLKIPGTSY  16104

16103  WKCLQ

       GRKRAMKTIRNMLDERRASPEREDKDYIDFVL  15924

15923  EEMQKDQTILTEEIVLDLLFALPFATYETTSSALVLAIQYLGSHPSALAEIT  15768

       EHESILRSRERVDSGITWNEYKSMNFTMM  15582

15489  VINETVRLGNIVPGIFRKVAKDIEIK  15412

15323  GYTIPAGWMVMISPPAVHFNPTLYKDPLVFNPWRWQ (0) 15216

14981  SRNFMGFGGGIRQCVGAEFVKLQMAIFLHHLLTKYR (2) 14874

14710  WTVIKGGDTVWKPGLVFPKGFHVQISERTKLYEDSPMET* 14591

 

>CYP87A13 CAAP02003123.1a 51% to 87A2, 71% to 87A12

9386  MEYLGYVFVGIVTLLLVCIPHWLRSWRKPQCNGKLPPGSMGWPILGETLQFSTPYTNRGV  9207

9206  SPFIRKRMDR  9177

9087  RYGPLFRTKLLGWPFVISADPDVSRFVLQQEGKLFHCWYMESFDNLFGPQNVLSSQGALH  8908

8907  KCLRSLILSQFGSESLRTRVLSQVEELVLKKLQLWSNHTSVDLKEGITS  8761

8571  MMFDFTAKMICNYDESKTPEKLRENYSAFLSGLISFPLNIPGTSYWKCLK  8422

8323  GRERARKTLRNRLLERLASPEREHKDIMDFIIQEMKKDDTILTEEIAVDLLFGLPFGANE  8144

8143  TTSSTLILAVQYLGSHPSALAEIT  8072

7980  DLQREHESILRNRKQKDSGITWEEYKSMSFTMM

7782  VVNETVRMGSILPSIFRKVDKDIEIK  7669

7613  GYTIPAGWMVLVSPPAAHFNPNVHKDPHVFNPWRWQ  7507

6281  GQEPTSGSNALMGFGGGIKLCAGVDFAKLEIAIFLHHLVTKYR (2)

      WEVIKGGEVVWRQSTGPIFPNGFHVRISEKTK*  5910

 

>gi|147785388|emb|CAN64251.1| 46% to 87A6

MEYLGYVFVGIVTLLLVCIPHWLRSWRKPQCNGKLPPGSMGWPILGETLQFSTPYTNRGVSPFIRKRMDR

YGPLFRTKLLGWPFVXSADPDVSRFVLQQEGKLFHCWYMESFDNLFGPQNVLSSQGALHKCLRSLILSQF

GSESLRTRVLSQVEXLVLKKLQLWSNHTSVDLKEGITSMMFDFTAKMICNYDESKTPEKLRENYSAFLSG

LISFPLNIPGTSYWKCLKGRERARKTLRNRLLERLASPEREHKDIMDFIIQEMKKDDTILTEEIAVDLLF

GLPFGANETTSSTLILAVQYLGSHPSALAEITREHESILRNRKQKDSGITWEEYKSMSFTMMVVNETVRM

GSILPSIFRKVDKDIEIKGTVWID

 

>CYP87A14 CAAP02000058.1 60% to CYP87A12

167215  MWPIALCIGTF

167182  VIIRIIHWGYSWRNPKCNGKLPPGSMGLPLLGETLQFFAPNTSSDIPPFIKERMER  167015

166931  RYGPIFRTNLVGRPVVVSTDPDLNYFIFQQEGQLFQSWYPDTFTEIFGRQNVGSLHGFMY  166752

166751  KYLKNMVLNLFGPESLKKMLPEVEHATCRNLDRWSCQDTVELKEATAR  166608

166527  MIFDLTAKKLISYEQDKSSENLRENFVAFIQGLISFPLDIPGTAYHKCLQ  166378

166018  GRKKAMSMLKNMLKERRAMPRKKQSDFFDYVIEELKKEGTILTEAIALDLMFV  165860

165859  LLFASFETTSLAITLATKFLSDHPLVLKKLT  165767

165662  EHEAILEKREIVDSELTWKEYKSMTFTFQ  165483

165482  FINETVRLANIVPGIFRKALREIQFK  165405

165294  GYTIPAGWAIMVCPPAVHLNPAKYEDPLAFNPWRWE  165187

165011  SKHFMAFGGGMRFCVGTDFTKMQMAVFLHCLVTKYR  164904

164824  WQTVRGGDIVRTPGLQFPNGFHVQILGKN*  164735

 

>CYP87B6 CAAP02018885.1

45% to 87A12, 47% to 87A2; same as AM471735.2, seq mixed below

same as CAAP02008583.1 4186-6219 4 aa diffs

3990 MLPIGLCVVSLVIIWITYWIRRWKNPRCNVTLPPGSLGFPLIGESIQFLISCSNSLDLHP 3811

3810 FFRKRIQK 3787

1009 RYGPLFKTSMLGRQVVVTADPEANHFILEQEGKSVEMCYLDSVAQLCGHDESSAGATGHI 830

 829 HKYLRTLILNHFGYERLRYKLLKKVEAMAHKSLGAWSSQPSVELNRATSQIMLDFISKEL 650

 649 FSYDPKGCTESMGDAFIDFLDSLASVPLNIPGTTFHKCLK 530

 422 NQKKTMKILREIVDERCASPEIRRGDFLDYFLEGMKKEAFITKDFIAFVMFGLLFAS 252

 251 FESIPIMLSLALKLIMEHPLVLQELE (1) 174

  79 EHEAILRNKDTSNFTLTWEDYKSMTFT 2

     VIDETLRMANVGLGNFRKALEDIKIK 2538

     GHTIPAGWTILVVSSVLHMDPNIYPDPLVF 2368

2367 NPWRWK 2350

2258 GSXKITTKNFTPFGGGIRFCPGAELSKLTMAIFLHVAVTKYR 2133

2040 FTKIKGGNLVRNPVLKFKDGFHIKVSKK 1957

 

>CYP87B7 CAAP02008092.1 73% to CAAP02018885.1

 9379 MWPIGLCIVTLVIIWVTNWIHRWRNPRCNGTLPPGTLGFPLIGETIQFFIPGHSLDLLP 9555

 9556 FFKKRVQR

 9746 YGRLFRTSLVGRPVAVAADPEVNHFILQEEGKSVEMFYLDSIVKLFGKDGASTHATGHV 9922

 9923 HKYLRTLVMNYFGFESLRDKLLPKVEAVARKSLDTWSSQPSVELNYAISQVMFEFISMEL 10102

10103 FSYDPSASTESMSDAFINFLKGLVSIPLNIPGTTFHKCLK 10222

10326 NQKKVMKMLREIVEERCASPERRHGDVLDYFLEEMKSKTFITKDFIVYIMFGLLFATFES 10505

10506 IPTMFTLVFKLIMEHPLVWQELK 10574

      DEHEAILRNSQTSNSTITW 10727

10728 EDYKSMTFT 10754

10855 VINEALRMGNISLGSFRRAVEDVRIN 10932

11022 GYTIPAGWIILVVPSALHMDPETYPDPLVFNPWRWK 11135

11240 GGSKIRVKNFTPFGRGIRSCPGAELSKLVAATFIHAAVTKYR 11365

11450 FTKIKGGRVVRNPMLKFKDGFYVKVSE 11530

 

>CYP87B8 CAAP02011822.1 CYP87 like 73% to CAAP02018885.1, 44% to 87A2

2309 MWPIGLCIVTLVIIWVTNWIHRWRNPRCNGTLPPGTLGFPFIGETIQFFIPGHSLDLLP 2485

2486 FFKKRVQR 2509

2676 YGRLFRTSLVGRPVAVAADPEVNHFILQEEGKSVEMFYLDSIVKLFGKDGASTHATGHV 2852

2853 HKYLRTLVMNYFGFESLRDKLLPKVEAVARKSLDTWSSQPSVELNYAISQVMFEFISMEL 3032

3033 FSYDPSASTESMSDAFINFLKGLVSIPLNIPGTTFHKCLK 3152

3256 NQKKVMKMLREIVEERCASPERRHGDVLDYFLEEMKSKTFITKDFIVYIMFGLLFATFES 3435

3436 IPTMFTLVFKLIMEHPLVWQELK 3504

     DEHEAILRNSQTSNSTITW 3657

3658 EDYKSMTFT 3684

     VINEALRMGNISLGSFRRAVEDVRIN 3862

     GYTIPAGWIILVVPSALHMDPETYPDPLVFNPWRWK 4064

4170 GGSKIRVKNFTPFGRGIRSCPGAELSKLVAATFIHAAVTKYR 4295

4377 RFTKIKGGRVVRNPMLKFKDGFYVKVSEMVTEGGGSAAEDA* 4502

 

>CYP87B9 CAAP02000523.1 53% to CAAP02008092.1, 49% to 87A2, 55% to 87B1

119321 MYWSVWLCVVSLFIASIIHWVYKWRNPKCNGKLPPGSMGFPLIGETIQFFIPSKSLDVSSFIRKRMKK (2) 119524

119639 YGPLFCTNLVGRPVVVSSDPDFNYYIFQQEGRLVEIWYLDSFARLVGQDASQSTAASGY 119815

119816 VHKYLRNLVLAHFGTEVLKDKLLSKAEDMIRTRLHDWSKLPALEFKTCVSS 119968

120387 MIFDFTANELFSYDIKKMGENFSERFTNIIQAVASFPLNIPGTTFHKCLK 120536

120639 NQKEVIKLIRDILKERKVSPESRKGDFLDQIVDDIKKEKFLSDDFIVLVMFGILLASFE 120815

120816 TISATLTLAVKLLIENPSVMQEL 120884

120990 EEHEAILKNRENSNSGISWKEYKSMTFT 121073

121144 VINEALRLASVAPGILRRAIKDIQVN 121221

121326 GYTIPAGWTIMVVPAALQLSPDAFVDPLAFNPSRWK 121433

121900 MGVGVVAKNFIPFGGGSRLCAGAEFTKVLMTTFFHVLVTNYR 122025

122114 LTKIKGGQIARSPALTFGNGLHINISKKHG* 122208

 

>CYP88A23 CAAP02003758.1b 66% to CYP88A3

33418  MGLASSWVLYTAIFAGALALRWVLLRVNKWVYEGRLKGKSYHLPPGDLGWPLIGNMWT  33245

33244  FLRAFKTKNPDSFISNIVER  33185

32867  YGKGGIYKTFMFGNPSILVTSPEGCRKVLTDDDNFKPGWPTSTEELIGKKSFVSISYEE  32691

32690  HKRLRRLTSAPVNGHEALSLYIPYIEKNVISDLEKWSKMGNIEFLTGVRKLTFKIIMYIF  32511

32510  LSAESGDVMEALEKEYTILNYGVRALAINIPGFAFHKAFK  32391

32298  ARKNLVATLQATVDERRQRERENSSAREKDMLDALLHVEDENGRKLTDEEIIDLLIMYL  32122

32121  NAGHESSGHVTMWATLLLQGHPEIFQRAK (0)

31924  AEQEEIVKNRPPTQKGLTLREVRKMEYLSQ (0)

31731  VIDETLRWLTFSLMVFREAKADVNIG

31550  GYLFPKGWKVLVWFRAVHYDPEIYPNPEVFNPSRWD (0)

31376  NVVYKLHVQNFTPKAGTFLPFGAGSRLCPGNDLAKLEISIFLHYFLLNYR  31227

       LERVNPGCELMYLPHPRPVDNCLARVRKVA*  31023

 

>CYP88A24 CAAP02003758.1a 69% to 88A4, 70% to 88A23

5562  MELGMIW

5541  VAFGAILGGVLGVKWVLRRANSWVYEVKLGEKRYSLPPGDLGWPLIGNMWSFLRAFKSTD  5362

5361  PDSFISSFITR  5329

3671  FGQTGMYKVLMFGNPSIIVTIPEACKRVLTDDQNFKPGWPTSTMELIGRKSFIGITNEE  3495

3494  HKRLRRLTATPVNGHEALSIYMQYIEDNVISALNKWAAMGEFEFLTALRKLTFKIIMYIF  3315

3314  LSSESEHVMEALEREYTSLNYGVRSMAINLPGFAYHKALK  3195

2510  ARKNLVNIFQSIVNERRDRKKGNSQTMKKDMMDALLDIEDENGRKLSDEEIIDILVMYL  2334

2333  NAGHESSAHVTMWATVKLQENPEFFQRAK  2247

1842  AEQEEIIRKRPPNQKRLTLKEIREMEYLPK  1753

1642  VIDETLRWITFSFVVFREAKADINIC  1565

1413  GYTIPKGWKVLVWFRSLHFDPETYPDPKEFNPCRWD  1306

1211  DYTAKPGTFLPFGLGSRLCPGNDLAKLEISVFLHHFLLNYQ  1089

446   LERLNPGCPRMYLPHSRPRDNCLAIVRKVAAESE* 342

 

CYP89 family (14 genes) [11 pseudogenes]

 

>CYP89A38 AM423953.2  Vitis vinifera (Pinot noir grape)

64% to papaya

CAAP02002218.1j 56953-58518 (+) strand 2 aa diffs

      MDFDLQKMEIWVFLFVVSLCIASLLKSLHDFFFPKLNLPPGPAAFPLI

6418  GNLHWLGPSFADLEPILRNLHAKYGPILTLRIGSRPAIFISENSLAHQALVQNGAVFADR  6239

6238  PAALPASRVMSSNQRNINSSPYGPTWRLLRRNLTAEILHSSRVRSYSHARKWVLEILVSR  6059

6058  LRGHSDGFVPVRIMDHFQYAMFCLLVLMCFGDKLEEKQIQEIEMIQRKLLLAFRGLNRL  5882

5881  NLWPRMGKILFRKRWEEWLNLRKDQEAILLPHIRARQRLKQETQNKQEDDSSSSSKDYVL  5702

5701  SYVDTLLDLQLPEEKRKLNEGEMVTMCSEFLSAGTDTTSTALQWIMANLVKAPHIQARL  5525

5524  FEEISGVVGEGEEEVKEEDLQKMPYLKAVVLEGLRRHPPGHFVLPHSVTQDVSFEGYDIP  5345

5344  KNATVNFSVSDMNWNPRIWEDPMEFKPERFLNSNGDGDHADAGKEFDITGSKEIKMMPFG  5165

5164  AGRRICPGYGLAMLHLEYFVGNLVWNFEWKAVEGDEVDLSEKLEFTVVMKNPLQAHLSPR  4985