71 CONTIGS AFTER SEARCHES WITH 18 mammalian P450s, one from each family
D. Nelson March 3, 2005
Revised/completed 2K10, 2P4, 2X2, 3A49, 4T5, 7C1, 27A3 (2X4 now discontinued)
There are 51 full length sequences and one full length pseudogene CYP3A50P.
CYP27A2P now appears to be a pseudogene and the Tetraodon ortholog seq is also.
CYP1A and CYP2X3 will probably be full length functional P450s.
That would bring the total to 53 CYPs in Fugu. In addition to 3A50P, there
are 14 more pseudogene pieces in Fugu.
CYP39 is still missing.  I searched with every exon in both Fugu and Tetraodon.
I suspect this group of fish has deleted the CYP39 sequence, even though it is 
found in zebrafish.  Attempts to reconstruct the synteny around the CYP39 sequence 
in zebrafish and Fugu failed.  In humans the gene order at CYP39 is: 
DSCR1L1 (Down Syndrome Critical Region 1-Like 1, also called ZAKI-4 or calcipressin) 
facing left, CYP39A1 facing left and UCP4 (uncoupling protein 4, also called 
SLC25A27) facing right.  The CYP39A1 and UCP4 have only 610 bp between them and so 
they share a common promoter region.  This gene order is also seen in Xenopus with 
5000 bp between UCP4 and CYP39A1.  Chicken is currently missing UCP4, but DSCR1L1 
is present in the same arrangement.  In zebrafish UCP4 is 10Mb away from CYP39A1.
No clear ortholog of UCP4 was seen in Fugu. The best hits to DSCR1L1 in zebrafish were on different chromosomes from UCP4 and CYP39A1.  In Fugu the DSCR1L1 hit was unmapped.
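
The human gene order described above can be written down as a small data structure, which makes the shared-promoter argument explicit: two neighbours can share a promoter region only if they are adjacent and transcribed away from each other. The sketch below is illustrative only. The Gene class and the divergent_pairs function are hypothetical names, and the strand encoding ('-' for "facing left", '+' for "facing right") is an assumption taken from the wording of the note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    strand: str  # assumed encoding: '-' = facing left, '+' = facing right


# Human gene order at the CYP39 locus, as described in the note above.
HUMAN_CYP39_LOCUS = [
    Gene("DSCR1L1", "-"),
    Gene("CYP39A1", "-"),
    Gene("UCP4", "+"),
]


def divergent_pairs(locus):
    """Return adjacent gene pairs that are transcribed away from each other."""
    pairs = []
    for left, right in zip(locus, locus[1:]):
        if left.strand == "-" and right.strand == "+":
            pairs.append((left.name, right.name))
    return pairs


print(divergent_pairs(HUMAN_CYP39_LOCUS))
```

Only the CYP39A1/UCP4 pair is head to head in this arrangement, which fits the 610 bp spacing reported for human. The same check could be run on another species once its gene order and orientations are known.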

D. Nelson May 28, 2002
17 families are represented; only CYP39 is still missing.
There are at least 54 P450s in Fugu.
Many short fragments have now been replaced by longer contigs or full length sequences.
Last modified July 17, 2002. Includes 45 full length assembled genes and one full length pseudogene 
(CYP3A50P). 8 more nearly complete P450s are missing only small parts (1-2 exons or less). 
There are 17 partials for a total of 71 contigs.  12 of these 17 are probable pseudogene fragments, 
5 more are short fragments of 1-2 exons that may be from intact genes (CYP17 frag a, CYP17 frag b, 4V5-like fragment,
exon 4 of a CYP27A, and CYP2X4, a unique EST), or they could be poor quality sequences from known genes. 
CYP1A is a probable full length gene that runs off a scaffold end into a sequence gap. 
Numbers in () refer to intron phase.
Below you will find the Tetraodon and zebrafish sequences, plus 1 Tilapia, 
1 Oreochromis, 1 Xenopus, some human, 1 trout, and 1 Paralichthys sequence, used in building the Fugu genes.
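
The contig counts quoted in the 2002 note can be checked with a few lines of arithmetic. This is only a bookkeeping sketch; the dictionary keys are made-up labels for the categories named in the text.

```python
# Counts quoted in the July 17, 2002 note above (labels are illustrative).
counts = {
    "full_length_genes": 45,
    "full_length_pseudogene": 1,  # CYP3A50P
    "nearly_complete": 8,
    "partials": 17,
}
partials_breakdown = {
    "probable_pseudogene_fragments": 12,
    "short_fragments_possibly_intact": 5,
}

total_contigs = sum(counts.values())
partials_total = sum(partials_breakdown.values())

print(total_contigs)   # total number of contigs
print(partials_total)  # should equal counts["partials"]
```

The categories add up to the 71 contigs in the title, and the two kinds of partials add up to 17. The first three categories also sum to 54, which matches the "at least 54 P450s" figure, although the note does not say that this is how the figure was derived.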

Note: completed 2K11 and 2N12 and almost completed 2K10 (5/17/02)
5/17/02 new assembly of Fugu available for BLAST searches
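
Many exon lines in the records that follow have the layout "start PEPTIDE end (phase)", where the number in parentheses is the intron phase: 0 means the intron falls between codons, 1 after the first base of a codon, 2 after the second base. The sketch below parses only that simple layout. The regular expression and the parse_exon_line name are assumptions made for illustration; lines with missing coordinates, embedded comments, or the phase placed before the end coordinate are deliberately not handled and return None.

```python
import re

# start coordinate, peptide, end coordinate, intron phase such as "(1)" or "(0?)"
EXON_LINE = re.compile(r"^\s*(\d+)\s+([A-Za-z*]+)\s+(\d+)\s+\((\d)\??\)\s*$")


def parse_exon_line(line):
    """Parse 'start PEPTIDE end (phase)'; return None for any other layout."""
    m = EXON_LINE.match(line)
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(3))
    return {
        "start": start,
        "end": end,
        # descending coordinates indicate the minus strand
        "strand": "+" if end >= start else "-",
        "peptide": m.group(2),
        "phase": int(m.group(4)),
        "nt_span": abs(end - start) + 1,
    }


# Two exon lines copied from the CYP2K9 and CYP2K10 records in this file.
plus = parse_exon_line(
    "3299 LSQKYGPIFRVYLASKKVVVLAGYTAVKQALVNQAEDFGEREIFPIFHDFNKGN 3460 (1)"
)
minus = parse_exon_line(
    "587 MSLQDFLLSLGPSTLMGSVALLLLLCLVSRSFGRATRREPPGPRALPLLGNLLQLDLSRPHQTLYQ 390 (0)"
)
for rec in (plus, minus):
    print(rec["strand"], rec["phase"], rec["nt_span"], 3 * len(rec["peptide"]))
```

For these two lines the nucleotide span equals three times the peptide length, which is a quick sanity check on hand-entered coordinates. That equality should not be expected on every line, since coordinates may or may not include the bases of a codon split by a phase 1 or phase 2 intron.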

>CYP1A Scaffold_19246 still incomplete
63% to 1A1 human 83% to Limanda 1A
Length = 2111 rest of sequence off end of scaffold
LKG94794.x1
LGS78315.x1
LKG94794.y1
LKH15188.y1
LKH15188.x1
LKH7579.x1
FS:S006359 Scaffold_6359
LGS229007.x1 N-terminal 76% to limanda 1A1
FM:M008150 scaffold_8150 corrected a frameshift
Fe:eCA590906 EST for missing region in the N-terminal half
Lower case 66 amino acids are from Tetraodon
MVLMVLPLIGSVSVSEVLVALTTACLVYLMVRYFYTEIPAGLRRLPGPTPLPIIGNVLEI
GRRPYLSLTAMSKRYGDVFQIQIGTRPVIVLSGIETVRQALVKQGEEFSSRPDLYSFRFI
NEGKSLSFSTDQAGVWRARRKLAYNALRSFSTLQGTTPEYSCMLEEHICKEGEYLVNRLS
SVLQADGRFEPFrhivvsvanvicgmcfgrrynhndqelvglvtlshefgevasngnpad
fipalrflpskamkrfvd
12370 LNTRFTTFVQKIVNEHYATFDK 12305
12218 ENMRDITDSLIDHCEDRKLDENSNIQVSDEKIVGIVNDLSGA
      GFDTVSTALSWSIMYLVTYPDVQERLYQEL
11786 ESNVDQNRKPRLSDKPNLPLVEAFILELFRHSSFLPFTIPHC (2?) 11661 extra g base in Cys codon TGGT
      TSKDTSLNGYFIPKDTCVFINQWQINHDP 10722 
306 EQWEDPSSFNPDRFLSADGTEVNKAEGEKVTTFGMGKRRCIGEIIARNEVYLFLAILIQRLQ 488
489 FLPIPGETVDMTPEYGLTMKHKDCRLKARMRTRDEQ* 599

>Tetraodon seq of Fugu missing region
4854 GGRPYLSLTAMRKRYGDVFQIQLGMRPVVVLSGLETVRQALVRQGEEFSSRPDLYSFRFI 4675
4674 NEGKSLTFSTDGAGVWRARRKLAYNALRSFSTLKGTTPEYSCMLEEHICKEAADLIQQLH 4495
4494 GVMEADGNFDPYRHIVVSVANVICGMCFGRRYNHNDQELVGLVTLSHEFGEVASNGNPAD 4315
4314 FIPALRFLPSKAMKRFVD 4261

>CYP1B1 Scaffold_1553 complete gene Scaffold_11030 Scaffold_10662   
54% TO 1B1 human 51% to 1B1 mouse
AL024920.1 AL015454.1 cosmid 077P23 80% to CYP1B from pleuronectes platessa
FC:C013F14aE4 LGU7740.y1 FC:C077P23aC12 AL015446.1 077P23 FC:C077P23aD8
2460 MKVIQEEVSPEAGALLLACATLLVSLQLWRWRRRRPGGCPPGPRAWPIIGNAAQLGHAPHL 2278
2277 YFTRMAQRFGNVFQIKLGSRTVVVLNGDAIKQALVRKGLEFAGRPDFTSFKYISNGHSL 2101
2100 AFGTVTDWWKSHRRVAQSTVRMFSTGNLQTKKTFERHLTCEVRELLHLFLGKTKELQYFQ 1921
1920 PMNYLVVSTANVISAVCFGKRYSYEDEEFQQVVGRNDQFTRTVGAGSIVDVMPWL 1756
1755 QYFPNPVKSIFDNFKRLNKEFSDFIRDKVTEHRKSIRPSSVRDMTDAFIVSLDKLSE 1585
1584 KTGVPLWKDYVIPTVGDVFGASQDTLSTALQWIFLVLVR 1468 (2)
294 YPDMQQRLQEEVDLVVGRQRLPCIEDQQQLPWVMAFIYEVMRFTSFVPLTIPHSTTTDTT 115
114 IMGYTIPKNTIIFINQWSINHDPTIWSHPET 13
FDPNRFLNPSGSLNKDLTSRMLIFSMGKRRCIGEELSKLHLFLFTALIGHQCHITDDPA
KPTTMDYNYGLTLKPRGFYVALTLRGDMRLLDEAASRPPAEEPGRGPLADP*

>CYP1C1 Scaffold_3008b no introns complete gene
LGS269180.x1 Length = 510 38-163 = Scaffold_11497 48% to 1B1 no introns
10253 MALDTEFGVKSSSITREWSGQVQPALVASFLFLFCLEACLWVRNLRHKRRL
10100 PGPFAWPVVGNAMQLGQMPHITFAKLAKKYGNVYQIRLGCSNI 9972
9971  VVLNGDQAIHQALIEHSTEFAGRPNFVSFQMISGGRSLTFTNYSKQWKVHRKLAQSSLRA 9792
9791  FSSANKQTKIAFEQHVTAEANELVQAFLRYSTDGRYFDPAHEFTVAAANVMCALCFGKRY 9612
9611  GHDDHEFRCLLKKLNKFGETVGAGSLVDVMPWLQSFPNPVRSLYENFKSLNEEFFNFV 9438
9437  KNKVQEHRESFDPNVTRDMSDAMINVIEERKDGTLSKEFAEATITDLIGAGQDTVS 9270
9269  TVLQWIVLLLVKHPDKQAKLHELMDKVVGQDRLPTTEDRSSLAYLDAFIYETMRFTSFVP 9090
9089  VTIPHSTTSDVTIEGLRIPKDTVVFINQWSVNHDPLKWKDPHVFDPSRFLNENGDLNKDL 8910
8909  TSGVMIFSSGKRRCIGSQIAKVEVFLFAAILLHQCSFESDPSDPLTLDCSYGLTLKP 8739
      LRCFVSAKPRGKLLGLVSPA* 8676

>AL252616.1 C0BG040DF06SP1 G Tetraodon nigroviridis CYP1C1 genomic clone 040L12
Length = 938

3   VLNGDQAIHQALIQHSTEFAGRPNFVSFQMISGGRSLXFTSYSKXWKAHRKVAQSSLRA 179
180 FSSANNQTKKAFEQHVTAEAXKLVQTFLHXSTDGKYFDPAHDFTIAAANVMCALCFG 350
351 KRYGHDDQEFRCLLMKLDKFGQTVGAGSLVDVMPWLQSFPNPVRSVYENFKSLNEEFFSF 530
531 VKNKVSEHRESFDPNVTRDMSDAMINVIEGRKDSTLTKEFVEATVTDLIGAGQDT 695
696 ISTVMQWIILLLVKYPDMQAKLHELVDKVVGQDRLPTVEDRSSLAYLXAFIXETMRFTSF 875
876 VPVTIPHSTTSDVT 917

>gnl|ti|25684989 zfishC-a2047d05.q1c Length = 1060 similar to CYP1B2 
with 2 aa insert (TF) compared to CYP1B3, so the insert is real
150 VQPALIASFIILFFLE 197 frameshift 199 ACLWVRNL  (TF) 
KKRLPGPFAWPLVGNAMQLGQMPHITFSKLAKKYGNVYQIRLGCSDI 369

>CYP1C2 Scaffold_3008a  no introns complete gene
LPC23063.x2 53% to 1B1 different from Scaffold_1553 52% to 1A2
LGW71232.x1 LGS304686.x1 Length = 29042 = LPC23063
6770 MGXDFGLKXSSSITREWSGHVQPALVAFFVFLFCVEACLWAKNLKRRL
6626 PGPFAWPVVGNAMQLGQMPHITFSKLAKKYGNVYQIRLGCSDI 6498
6497 VVLNGARVIRQALIEHSTEFAGRPNFVSFQNVSGGKSMAFTSYSKQWRMHRKIAQSTIRA 6318
6317 FSSANSQTKKVFEQQIVAEATELVEVFLKLGARGQHFNPAHELTVAAANVICALCFGRRY 6138
6137 GHDDQEFRDVLRRIDKFGQTVGAGSLVDVMPWLQSFPNPVRSMFRSFEALNREFFGF 5967
5966 VQLKVEQHRETFDPEVTRDMSDAIISVLEKSDGETALTKDYTEVTMADLIGAGLDTV 5796
5795 STALHWMLLLLVKHPELQSKLHQLIDRVVGRNRLPSIEDRSSLAYLDAFIYETMRFTSFV 5616
5615 PVTIPHSTTSDVTIEGLRIPKDTVVFINQWSVNQDPLMWKDPHVFDPSRFMDEEGSLDRD 5436
5435 LACNVMIFSAGKRRCIGDQIAKVEVFLFFAVLLHQCSFESSADEDLTLNCSYGLTLKPL 5259
5258 DFSITAKLRGKLLKSP* 5208 

>CYP2K9 Scaffold_12487    = LGW56404.x1 50% to 2A7 63% TO 2K7 COMPLETE
Scaffold_13436b seems to be a pseudogene of this gene
= LKB101560.y1 56% to 2G2P
= LJQ41788.x1 52% to 2F1
= LGS108592.y1 88% to 2D6
= LOL6792.x1 40% to 2W1
3037 MIEDLFESSTSGFLMVAIVSLLLLQ
     LCFSFISREKRKDLPGPEALPLLGNLHQLDLKRLDCHLVQ 3231 (0)
3299 LSQKYGPIFRVYLASKKVVVLAGYTAVKQALVNQAEDFGEREIFPIFHDFNKGN 3460 (1)
3527 GILFTNGDQWKEMRRFALMTLKDFGMGKRTIEEKIIKECQ YLIEAFEQHQ 3676 (1)
     GEAFSNAQVIS YATSNIISAIMYGRRFDYKDPTFQAMIERDHEVIHLTGSPSIQ (0)
     IYNIFPWLGPFLKTWRYIMKKVEINIESTRRIIGEMKETRNP
     GTCRCFVDAFLIHKENQE (0)
4483 ESDVNAHYYHEDNLLHCAMNLFGAGTDTTATTLQWGLLYITKYPHIQ 4623 (1)
4692 DGVQEELRRVVGNRQVRVEDRKNLPYMEAVIHETQRMANIVPMSLPHRTS*DTFQGYVIKK (0?)
     GTMVIPLLTSVLYDESQWEKPHTFNPAHFLDDEGRFVRRDAFMPFSA 5095 (1)
5164 GRRMCLGEGLARMELFLFFASLLQHFRFKPAPGVSEDSLDLTPVVGITLNPLTHKLRAISRF* 5352

>CYP2K10 complete finished with FM:M000353 from Mayfolds 
Scaffold_19693    = LGW19459.x1 53% to 2G2P Length = 2743
LGW19459.x1 Length = 532 343-416 54% to LGS141970.y1 53% to 2G2P
LKU76565.y1 Length = 289 355-416 = LDZ45836.x1 53% to LGS141970.y1 52% to 2G2P
LKU76565.x1 = LPC21805.x1 LDZ45836.y1 Length = 608 N-term region of LKU76565.y1
LKB50710.x1 53% to 2W1 C-term 84% to CYP2K 73% to scaf 10791
66% to CYP2K
Scaffold_13436a N-terminal of second gene runs off beginning of scaffold
Cannot complete. Note: LKB50710.x1 is identical to scaf 19693 up to the gap at 977
It continues into the gap up to KLCA
587 MSLQDFLLSLGPSTLMGSVALLLLLCLVSRSFGRATRREPPGPRALPLLGNLLQLDLSRPHQTLYQ 390 (0)
313 LSKKYGPVFKVHFGPRKVVVLAGHKTVKEALVGNAEQFGDRDISPIFYDMNQGHG 149 same as next line
    SKKYGPVFKVHFGPRKVVVLAGHKTVKEALVGNAEQFGDRDISPIFYDMNQGHG 457 LKU76565.x1
24553 GILFSNGETWKEMRRFALSTLRDFGMGKRMIEDKIAEECNYLIQKFEDHE (1) 24404 FM:M000353
24327 GRAFDTSRLANYATSNIISSIVYGSRFDYDDPRFINMVNRVNEVIRLTGSAPIQ (0) 24166 FM:M000353
     LYNIFPGLANWIKNRQLLLKQVAMNLRDMTDLIQQLKDTLNPGVCRGFVDCFLLRKQKAV (0)
2184 DSGVIDSLYNEKNLLYSLSNLFGAGTDTTATTLRWGLLLMAKYPRIQG
     QVQQELSMVVGNRRVCVEDRKNLPYVDAV 1813
1812 IHEIQRLGNIAPMAVPHKTARDVEFRGYFIEK 1717
1286 GTTVFPLLTSVLYDENEWETPHTFNPSHFLDKDGKFIKRDAFMPFSA 1146
     GRRMCLGEGLAKMEIFLFFTSLLQQFRFTPPPGVGEDELDLTPVVGFTLSPSPHKLCAIPRQ* 
Exon 9 from FS:S007893 Scaffold_7894

>CYP2K11 Scaffold_10791    = LKB50669.y1 LKB50669.x1 2D6 like Length = 7345
LKB50669.y1 LKB50669.x1 Length = 581 318-450 57% to LGS141970.y1 53% to 2D6
65% to LKU76565.y1 65% to CYP2K cannot complete missing exons 1-4 about 176 aa
FS:S006775 Scaffold_6776
     MGIVDLFLQASSSVSLLLLGALALLLFVYFISSVSFSSKKDRKCPPGPKPLPILGNLLQFDLKRPYNTLMK (0)
     LSKTYGSVFTVYLGPKKVVVLAGYKTVKEALIDHAEEFGERDPIMLVQNANHEH (1?)
     GVLWSNGESWKEMRRFALTNLRDFGMGKKACENKIIEECSYLMEELKKWK (1?)
     GEPFDTTHPINYAVSNIICSMVYGNRFEYDDPEFTSLVDRTNTLIQISGSPSVL (0)
5891 VYDLFPWIGPLVNNKKLFQSLFAANKKQNLQLFAAAKEMLNPQMCRSFVDSFLARQQILE 5721 (0)
4989 KSGTNVHFHDENLMSTVMNLFNAGTDTTATTLRWGLLLMAKYPLIQ (1?)
4750 DQVQEELRRVIGSRQVQVEDRKSLPFTDAVIHETQRLANIVPMALPHKTSQDVTLQGFFIEK 4571 (0)
     GTTVYPLLTSVLYDETEWEKPLNFYPAHFLDKDGKFVKREAFLPFSA 4355 (1)
4287 GRRICLGEGLAKMELFIFFSTLLQHFRFRPPPGVSEDHLDLTPRVGLTLNPSAHKLCAVSCL* 3999

>AL263467.1 C0BG063DC11LP1 G Tetraodon nigroviridis genomic clone 063F22 T7.
Length = 953 86% to 2K11
314 GRRVCLGESLAKMELFIFFSTLLQHFRFCPPAGVSEDDLDLTPRVGLTLSPSAHKLCAVSLR* 126

>tetraodon AL222696 C0AG203BG06LP1 G Tetraodon nigroviridis
VYIFPSIMELLPGPHHTMFRNTDFLRNFVMTKIQEHKDSLDPSSPRDFIYCFLIRMEQVG
KNLPTTEFQYENLVSTVLNLFLAGTETTSTTIRYALQVLIKHPNIQ (1?)
KMQQEIDTVVKQEHCPKMEDRKSLPFVDAAIHEVQRF (fs)
DIVPFSLPHFALKDISFRGYTIPK

>CYP2K12P Scaffold_3103 Length = 27036 59% to scaf 10791 
Heme junction missing the conserved Gly, no upstream seq found 
With these defects and a frameshift this is probably a pseudogene 
LKB99171.x1 50% TO 2C37
17897 DQVQEELSRVIG 17862 frameshift
17860 SRQVQEGDRKNLSFTNAVIHETQSGHVALTSLPHVTNQDIIFRGHFLKKG 17711 (1)
17388 NYMEDTASVASVLLEETEWEHPHTFYPSHFLEKDRKFVKRDAFLPFSA 17242 (1)
17176 ISRACPGETLARVELFIFLVTLLQHFCFTLAPGVSPDELHVTPSIGSNHSPVAYRLCTVSCM* 16988
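
The missing conserved Gly noted for CYP2K12P can be checked against the heme-binding signature, often written F-x-x-G-x-R-x-C-x-G. The sketch below uses that simplified form as a regular expression. This is an assumption made for illustration: real P450s show more variation than this pattern allows, so a miss is a hint rather than proof of a pseudogene. The two test strings are the ends of the last two exons of CYP2K9 and CYP2K12P, copied from this file and joined across the phase 1 intron.

```python
import re

# Simplified heme-binding signature FxxGxRxCxG (assumed form, not exhaustive).
HEME_SIGNATURE = re.compile(r"F..G.R.C.G")


def has_heme_signature(peptide):
    """True if the simplified heme-binding signature occurs in the peptide."""
    return HEME_SIGNATURE.search(peptide) is not None


# End of the second-to-last exon + start of the last exon, from the records above.
cyp2k9_heme = "DAFMPFSA" + "GRRMCLGEGLARMELF"
cyp2k12p_heme = "DAFLPFSA" + "ISRACPGETLARVELF"

print(has_heme_signature(cyp2k9_heme))
print(has_heme_signature(cyp2k12p_heme))
```

In CYP2K9 the junction reads FSA|GRRMCLG, so the Gly sits right after the intron. In CYP2K12P the corresponding residue is Ile (FSA|ISRACPG), which is the defect described in the note.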

>CYP2K13 Scaffold_12487 /CYP2K14P pseudogene of 2K9 Scaffold_13436b Length = 4942
= LGW56404.x1 50% to 2A7 
two partial genes in this contig both on minus strand
Scaffold_13436b pseudogene of Scaffold_12487 (fs) = frameshift
3958 VRVEDRKNLPYMEAVIHETQRMANIVPMSLPHRTSRD 
     TSFSGDTSSKRFTALFELAHVYV
     GTMVIPL (fs) LTSVLYDESQWEKPHTFNPAHFLDDEGRFVRRDAFMPFSA (1)
     GRRMCLGE (deletion 3 nuc) RMELF (insertion 12 nuc) LFF (deletion 33 nuc)
     VSVDSLDLTPVVGITLNPLTHNLRAISRF* 3368
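
Whether an insertion or deletion disturbs the reading frame depends only on its length modulo 3. The sketch below applies that rule to the three indels listed for the last exon of CYP2K14P. The function names are hypothetical, and the signed-length convention (negative for deletions) is an assumption made for this example.

```python
def shifts_frame(indel_length):
    """True if a single indel of this many nucleotides shifts the reading frame."""
    return indel_length % 3 != 0


def net_codon_change(indel_lengths):
    """Net codons gained (+) or lost (-) by in-frame indels; None if the frame shifts."""
    total = sum(indel_lengths)
    if total % 3 != 0:
        return None
    return total // 3


# Signed lengths from the CYP2K14P note: deletion 3, insertion 12, deletion 33.
cyp2k14p_last_exon = [-3, +12, -33]

print([shifts_frame(n) for n in cyp2k14p_last_exon])
print(net_codon_change(cyp2k14p_last_exon))
print(shifts_frame(-7))  # a 7-nucleotide deletion, as in the old CYP3A50P entry
```

All three indels in this exon are multiples of three, so by themselves they keep the frame while removing eight codons net. The frameshift marked (fs) earlier in the same record is a separate event and is not covered by these numbers.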

>CYP2K15P pseudogene Scaffold_13758    
41% to LKB99171.x1 50% TO 2C37 Length = 5303
FC:C094J16aF1, FC:C007E01aF1 pseudogene
740 KGRITQRHFHDEKLMMTVSSHLAAGTHLDTYTALRQEPLVMAK*PEVQ 883 exon 6 (1) 52% to 2K11
    Exons 7 and 8 deleted
1284 (1) GLRSCPGEG*SRMKLFIFIVILLQHLCFSSSPVLMEEDLELKTVLGSILNPINCVLFVGRER* 1472 exon 9 48% to 2K9

>CYP2N9 Scaffold_3261a    PARTS OF 5 GENES Length = 26692 58% TO 2N2 47% to 2J2
Exons 2 and 3 are in a sequence gap. Added in LKG95403.y1 to fill the hole
= Fc:c144E09x1 LSH.54947.x1 Fc:c161K19y1 LPC.61518.y1
LGS73235.x1 Length = 666 70-114 opposite end of LGS73235.y1 48% to LGL41180.y2
LKG95403.y1
exon 2, 3 may belong to seq 3261a
9342 MWLWDLVLWLRLTGFLLPVLIVLLIIMYSLRQKDPPNFPPGPPALPLLGNIFNIEAKQPHLYLTK 9148 (0)
     LADVYGSVFCIRLGRHKTVFVSGWKMVKEAIVTQADSFVDRPYSPMATRIYSGNS LKG95403.y1
     AGLFFSNGHVWRKQRRFAMATLRSFGLANGSMELSICEESRHLQEAMESQK LKG95403.y1
8235 GEPFDPVPLLNNAVANIICQIVFGRRFDYTDHMFQRMLHHLTEMAYLEGSIWAL 8074 (0)
7991 LYDSFPALMKHLPGPHNGIFSSSSSLQGFIWREIQRHKSDLDPSNPRDYIDAFLIEEG 7818 (0)
7743 NGNNQLGFEERNLVLCCLDLFLAGSETTSKTLQWGLIYLIRKPHIQ 7606 (1)
     EKVQVEIDRPIGRTRQPTMADRPNLPYTDAVIHEIQRMGNIVPLNGPSNGCQGTRPWRGYFIPK (0)
     GTSVMPNLTSVLFDKNEWETPDTFNPEHFLDAEGKFVRREAFLPFSA 7246 (1)
7162 GRRACLGEGLARMELLLFFVSLCQRFHFSTLDRVELSTEGITGATRTPYPFKIYAQVR* 6986

>CYP2N10 Scaffold_3261b complete gene 9 exons PARTS OF 5 GENES Length = 26692
= LGS73235.y1 LGK95403.x1 67% TO 2N2 50% to 2J2
= LKH16351.x1
13883 MWLYSVLSWDFTSLLLFFFVLILFANYLKNRDPPNFPPGPFAFPIVGNFFTMDSKNLHLYFNK 13695 (0)
12557 LADVHGNVFSFRLGGDKMVCVSGHKMVKEAIVTQADNFVDRPYDPISARVYGGQT 12393 (1)
      DGLFQSNGEVWKRQRRFALSTLRNFGLGKNILEQSICEEAQHLLEEMRSHG 12153 (1)
      GKPFNPARLFNNTVSNIICQLVMGKRFEYSDHKFQMLLKYLSEVLVLEGSFWGQ 11913 (0)
11814 LYEAFPSVMKHLPGPHNKVFSHFNHLKDFMNEEIQNHKKDLDHNNPRDYIDAFIIEMEK 11638 (0)
      NKDTNLGFTETNLAMCSLDLFIAGTETTATTLLWDLVYLINNPDIQ 11413 (1)
11290 GKVQAEIDQVIGQNRQPTMADRPNLPYTDAVIHEIQRMGNIVPLNGPRMAAKDTTLGGYFIPK 11102 (0)
11018 GTSLMPILTSVLFDKNEWETPDKFNPGHFLDAEGKFKKREALLPFSA (1)
      GKRVCLGEGLAKMELFLFFVSLFQNFTFFVPGGAELNTEGITGTTRVPHPFEILARPR* 10619

>CYP2N11 Scaffold_3261c  complete gene 9 exons  PARTS OF 5 GENES Length = 26692
= LGS36609.y1 67% to 2N2 49% to 2J2
      MWPLQLLLDFDIRALLLFISVLLLIGDYFRYKNPPNFPPGPMSLPFVGSFFSVDSKHPHNYFIQ (0)
18495 MAELYGKLFSIRLGSGKIVFACGYKMVKEAIVTQADNFVDRPFNAFGDRIYMGQR 18331 (1)
18251 DGLFQNNGEVWKRQQHFALSTLRNFGLGKNILEQSICEEAQHLLEEMRSHG (1)
      GKPFDPASLFTRAVSNIICQLVMGKRFEYSDHKFQMLLKYLSELLVLEGSFWGQ 17859 (0)
      LYQAFPSVMKHLPGPHNKVFSHYNHLKDFMNEEIQNHKKNLNHNNPRDYIDAFIIEMEK (0)
17498 NKDTNLGFTETNLVLCSLDLFLAGTQTTATTLLWALVYLINNPDIQ 17364 (1)
16988 EKVQAEIDQVIGQTRQPTMADRPNLPYTDAVIHEIQRMGNIVPLNASRMAAKDTTLGGYFIPK 16800 (0)
      GTSLLPILTSVLFDKNEWETPDKFNPGHFLDAEGKFKKREAFLPFSA (1)
16492 GKRVCLGEGLVKMELFLFFVSLFQKFSYSVSGGAELSTEGITGITRVPHPFEIHTRPRSF* 16310

>CYP2N12 one of 5 genes in a cluster
old Scaffold_3261d 61% to AF248042 55% to 2P3
Scaffold_805 CYP2N9 (-), CYP2N10 (-), CYP2N11 (-), CYP2N12 (-), CYP2P4 (+)

92399 MILQKIFAYMDFSSWVLLIFLVLLITDVIRNWTPHNFPPGPWAMPFVGNIFTGVDFRTIEK (0) 92217
92102 LSQKYGPVFSLRRGNTRTVFINGYKMVKEALVSQLDSFEDRPVVPLFHVVFKGI (1) 91941
91785 GIALSNGYMWKKQRKFAHTHLRYFGEGQKLLENHIQMESKFMCEAFKDEQGaagt 91633 (2? Bad boundary)
91552 GKPFDPQYTITNAVGNIISALVFGHRFEYSDASFRRILELDNEAVVLAGSARTQ 91391 (0)
TGTACTGGGGGTCAAACGGCTTTCCTTTTAAGAGAGACAACCATCTATTTAATTATGAAATATTTTGCATGTG 91600
             F  P  K  G (1)
91307 LYDSFPSLMKHLPGPHQTVHANYGKITDFLKKEVDKHMEEWNPEDPRDYVDTYLSEMEK 91131 (0)
90784 MNQDPQGGFNVETLLICILDLIEAGTESAATTLRWGLVFILNYPDVQ 90644 (1)
90564 EKVQEEIDRVIGQSRQPAMADRPNMPYTDAVIHEIQRFANVVPAGFPKMATKDTTVGGYFIPK 90376 (0)
90287 GLAITTMLSSVLFDKNEWETPDVFNPNHFLDSEGRFRKRDAFIPFSA 90147 (1)
90043 GKRVCIGENLAKMELFLFFTSILQHFNLSPVPGQMPSLEGILGFTYSPQPFRMIVAPR* 89867

>CYP2P4 Scaffold_3261e completed exon 4 with FM:M000194 from mayfolds
PARTS OF 5 GENES Length = 26692 missing exons 4,7,8,9
= LDZ64156.y1 
= LDZ64156.x1 65% to 2P3 46% TO 2J6 
seq runs off end of contig missing exons 7,8 and 9
FS:S001425 Scaffold_1425 Length = 63654 probably exons 7, 8 and 9 of 2P4
Fc:c161F04y1 LPC.61739.y1 Fc:c161E03y1 LPC.61451.y1 60% to 2J9 
Scaffold_2841 probably exons 8 and 9 of 2P4
80% to LPC61680  66% to 2D9 Length = 29344
LPC61680.x1 LPC22842.y1 LPC61776.x1 LPC61672.x1 Fc:c161P11x1 Fc:c161P09x1 
66% to 2D9 LPC61488.x1 64% to 2d9 Fc:c161O11x1 93% to LPC61680 
probably same gene 67% to CYP2K, upstream sequence runs off scaffold
62% to 2Z2 over 106 aa 59% to 2K10 over 108 aa 60% to 2N12 over 100 aa 
24164 MEAILSTLGLEWMDGRTILIFLLVFVLLADYIKNRVPSNFPPGPWPLPLIGDLHRINPSRLHLQFAE 24364 (0)
24760 FAGKYGNIFSLRLFGGRVVVLNGYKTVREALVEKGENFVDRPLIPLFEAFAGNR 24924 (1)
24994 GLVISNGNPWKHQRRFALHTLRNFGIGKKSLEPSIQQECHYLAEAFAQHKG 25156
65508 GEPFYAKALIHNAVSNIICCLVFGERFEYSDKQYHAILKSFDRIIQLQGHFIVQ 65347 (0)
26236 VYNTFPWLLKWLPGTHQTIFSEIKTVINFVDLKIQEHKRNFDPSSLRDYIDCFLAEMGE 26412 (0)
26493 KEDVESGFDMKNLSICTMDLFGAGTETTTTTLQWGLLYMIYYPHIQ 26630 (1)
85    EKVYAEISAVIGSSREPSITDRDNMPYTNAVIHEMQRMANIIPLNVVHMASSDTTIGNYTIPK (0) 273
695   GTIIMPTLNSVLHDESMWETPHSFNPQHFLDQDGKFRKREAFLPFSA (1) 836
958   GKRVCLGEQLARMELFLFFTSLLQRFSFSMADGEQPSLDFQLGGARFPKPYRLRAILR* 1134

>AL185445.1 C0AG244AD07SP1 G Tetraodon nigroviridis genomic clone 244G13
Length = 890 80% identical to 2P4 exons 2 and 3
167 FAEKYGNIFSLRXFGGRIVVLNNYKTVRXALVEKRQNVTDRPIIPLFEPVVGNKG 331 (?)
    LLISNGNPWKQQIRFALHTLRNFGIGKKSLE 517
518 PSIQQESHYLAXAFXHHKG 574

>AL345210.1 C0AB011CB01C1 B Tetraodon nigroviridis genomic clone 011D01 T7.
Length = 922 89-92% identical to CYP2P fragment exons 8 and 9
708 GTIIMATLDSVLNDEXMWETPHTFNPQHFLDQDGXFXXREAFLPFSAG 565
385 KRVCLGEQLARMELFLFFTSLLQRFSFSMADGEQPSLDFQLGGARCPKPYRLRAMVR 215

>CYP2P5P pseudogene fragment Fc:c060E24y1 LPC.22843.y1 
56% TO 2W1 PKG TO HEME 70% to scaf 2841 exon 8
GTIVVPTLNSVLPDESVWETPHSLDPPLFLDL*RXFRVREAFLPFFA

>AL185445.1 Tetraodon nigroviridis genomic clone 244G13 PUC-Ori.Length = 890
RIHSQLAE (0)
FAEKYGNIFSLRXFGGRIVVLNNYKTVRXALVEKRQNVTDRPIIPLFEPVVGNK (1?)
GLLISNGNPWKQQIRFALHTLRNFGIGKKSLEPSIQQESHYLAXAFXHHKG

>CYP2R1 ortholog LPC25839.y1 LPC25855.y1 LGL29067.x1 LGS101007.x1 Fc:c068M08y1 
Fc:c068M04y1 LKU41493.y12R1 LGW54033.y1 69% to 2R1 
Scaffold_7138 FS:S000037 Scaffold_37 
      MVPAQSPPLVPPSRDQALLGLACLTVAFLAVLLVRQLVK
      QRRPPGFPPGPSPIPIIGNIMSLATEPHVFLKKQSEVHGQ (0)
      IFSIDLGGILTVVLNGYDCIRECLYNQSEVFADRPSLPLFKKMTKMG 12808 (1)
12701 GLLNCKYSKGWIEHRKLACNSFRYFGSGQRLFERKISEECMFLVDAIDQHKGKAFNPKHL 12522
12521 VTNAVSNITNLIIFGQRFTYDDHNFQHMIELFSENVELAVSGWALLYNAFPWIEYLPFGK 12342
12341 HQKLFFNAAEVYDFLLRVTKEFSQGRVPHMPRHYVDAYLDELERNAGDPNSSFSYENLIY 12162
12161 SVGELIIAGTETTTNTLRWAMLYMALYPNIQ (1)
      ERVHREIDSVLANGRMPTLEDKQKMPYVEAVLHEVLRFCNIVPLGIFRATSQDA 11802
11801 NVNGYTIPKGTMVITNLYSVHFDEKYWSDPGVFSPQRFLDANGNFVRREAFLPFSL 11631 (1)
11535 GRRQCLGEQLARMEMFLFFTTLLQRFHLQFPVGTIPTIAPKLGMTLQPKPYSICAVRR 11362
      HQKSLISVTTPCHK* 11317

>CYP2R2P 
Fc:c104I03x1 LPC.39565.x1 77% to fugu 2R1 MAY BE PSEUDOGENE OF scaf 7138 exon 8
201 DSVLANGRMPTLEDKQKMPYVEAVLHEVLRFCNIVPLGIFRATS*DANVNGYTIPKGTM 220
221 VITNLYSWHFYEKNWSKTGAFSHPKCLWDAHGHFCEWLMASMPGSFG 518

>CYP2R3P
Fc:c068L08y2 LPC.26046.y2 67% to fugu 2R1 exon 8 possible pseudogene fragment
LYYTKIXTVLARVEIPTLEDKQKMPYLEAVLPEVLRFCDIVPLGLFRATSAGADVNGFTIPGGAVLIAILCSGRF

>CYP2U1 Scaffold_8899 56% to 2U1 human 13-246 LED73857.y1 LGS275010.y1 Scaffold_10678
LGS291273.y1 Fc:c161P11x1 Scaffold_895 (complete seq)
LGS191056.x1 Length = 139 LED33740.x1 56% to 2U1 C-term
Insertion relative to other P450s between PVVGNF and LAKVYG
This is seen in human and mouse cDNA also so it seems real
MMSLSWLQSLSSSILTLVIMIILHHLFKCYQKRHGFANIPPGPKPWPVVGNFGGFL
VPSAIRKRFGSKAEGPAK
NAAAVLTELAKVYGNVYSIYVGSQLVVVLNGYKVVRDALSNHPDVFSDRPDIPAISIMTKRK (1)
GIVFAPYGPLWQKHRRFCLSTLRNFGLGRLGLEPCIVEGLTNIKTELLRLE 
EESGGAGVDPAPVISNAVSNVICSLVLGHRFNHDDQEFRSMLRLMDRGLEICVNSPAVLI 
NVFPLLYHLPFGVFRELRQVERDITAFL
KRFIANHQETLDPNNPRDLTDMYLKEISARREAGDVDSGFTED
YLFYIIGDLFIAGTDTTANSVLWVILYMASYPDIQ (1)
DKVQAEIDGVVGPLRTPSLSDKGKLPFTEAAIMEVQRLTTVVPLAIPHMTSETI (1)
EFMGYTIPKGTVVLPNLWSVHRDPTEWDDPDSFDPTRFLDEDGTLLRKECFIPFGI (1)
GRRVCMGAQLAKMELFLTVTNLLQTFHFRLPEGAPRPPLQGRFGLTLAPCPYTVCINPR

>CYP2X2 Scaffold_4007 corrected frameshift with FM:M000743 from mayfolds
Length = 24042 75% to LED83776.x1 60% to 2X1
= LGQ3874.x1 54% to 2R1 = 4007
= LKU17547.x1 58% TO 2U1 C-TERM 82% to FE:EFRy002apsE4 (EST)
FS:S003334 Scaffold_3334 Length = 26208
      MVTSVILLCLGVVVLVLLLRSQRPKNFPPGPPVLPLLGSILELALDNPLQDFER (0)
28128 LRKKYGNVYSLFLGTRPAVVISGLKNLKEALVTKGSDFSGRPQDMFVNDAIKTN (1) 28289
13208 VIMQDYNLVWKEHRRFALTTMRNFGMGKTSMEDRIHGEIEYIVNTLEKNN (1)
      GKTLSPHLMFHNAASNIICQVLFGTRYEYDDHFIREIVRCFTENAKISNGPWAM (0)
      LYDSIPLVRYLPLPFKNAFKNVE (0)
      TAENLVKDLFVEHKKTRMSGDPRDFVDCYFDELDK (0)
      RGKDRSSFSENMLTMYALDLHFAGTDTTSNTLLTGFLYLMNYPHIQ (1)
      ERCHQEIDKVLQDNETVTYDARNQMPYMQ (0)
15630 AVIHEVQRVANTVPLSVFHCTTKDTEFMGYSIPK 15731 (0)
15853 GTLIIPHLASVLKEEGQWKFPNEFNPDNFLNDDGEFVKPEAFMPFST (1)
16100 GPRVCLGEGLARMELFLIIVTLLHKFQFIWPEDAGEPDYTPIFGATQTPKPYRMKIQLRK* 16282

>FS_CONTIG_119_5 tetraodon exon 2 of a CYP2X sequence
LQKTYGNIYSLYLGRRPAVVISGLKTIKEALVTKGSDFSGRPQDMFIKDAIKTSGKTLQSCTAKNI

>CYP2X3 Scaffold_10845    
= LKU29272.y1 54% to 2d10 Length = 7563 71% to scaf. 4007
= LGW126079.y1 Length = 560 51% to 2E1 I-helix region 61% to LGW56404.x1
missing exons 1-4 off the end of the scaffold
exons 1, 3 and 4 are on: FS:S005546 Scaffold_5546
old Scaffold_9193  Length = 9721 51% to scaf 4007
LGL47087.y1 Length = 725 2 family N-term exon 1
Assume these are from the same gene, since they are both 2X seqs and complement each other
Missing 30 aa from exon 2
Part of exon 2 from gnl|ti|118242285 NFP96127.y1.gz trace archive
MLVSLALLLAAAFGLWVFFQIQRPKNFPPGPPPIPLFGNLLEIQLDNPIADLER (0)
LGQRYGNVYGLFLGSRPEVVIKGV 
(30 aa sequence gap)
LLLSPYNSGWREHRRFTLMTLRNFGLGKQSMEDRILGEMRRVMEFLEQSD (1)
GEPINPETLFHKAASNIIFQVLFAKRFDNEDDSMKFFTNFFRETSQIINGPWSL (0)
7527 LYDSFPAVRYLPLPFKRGFEMFK 7450 (0)
7381 MSHERYLEMFVETKKTRVPGKPRHFVDAYMDELEK 7277 (0)
7193 RGDEAFFSEDQLCAIILDLHFAGTDTTANTLLSGLLYLMKYPHIQ 7057 (1)
6289 EYCQQEIDKVMQGKNEVSFEDRVQMPYVQ 6203 (0)
6105 AVIHEIQRTANTVPLSVFHCTTRDTELMGYSIPK 6004 (0) exon 9
5617 GTLIIPNLSSVLNEKGQWKSSHEFNPENFLNENGEFVQPEAFMPFST 5477 (1)
5244 GPRVCLGEGLARMELFIILVSLLRKFRFIWPEDAEEPDLTPVFGVTQTPKPYSLKVQVRSRC* 5056

>CYP2X4X discontinued FE:EFRy002apsE4 EST exons 10 and 11
Length = 458 395-496 51% to 2D6 87% to Scaffold_10845 (CYP2X3)
Note: this EST is not in the current database and appears to have been removed.
SSPKGTIIIPNLSSVLNEKGQWKCPHEFHPGNFLNENGEFVKPEAFVPFST
GPRVCLGEGLARMELFIILVTLLRRFKFIWPEDAEEPDLTPIFGLTQTPKPYRLKVQIRSSFK*

>CYP2X5P Scaffold_3538 57% to FE:EFRy002apsE4 51% to 2D6 Length = 26272 
61% to 2X2 59% to scaf 10845 (CYP2X3)
first 8 exons missing off end of scaffold
E in EXXR motif missing, one bad boundary, no exon 11 found
Possible pseudogene
25728 (0) PGIHKVQRIANTVPLNVQYCTMKETQLMAHLLPR 25627 exon 9 bad boundary
25349 (0) ETLIIQNLNSRQNEEGQWKFPHKSRPENFLNDQGEFVKTEDFMLFSA 25209 (1) exon 10
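
The missing Glu of the EXXR motif can be seen by comparing exon 9 of CYP2X5P with exon 9 of CYP2X3, which have the same length and align without gaps. The sketch below scans both peptides for E-x-x-R. The function name is hypothetical, and a bare E..R scan is only a rough check, since the pattern can also occur by chance outside the K-helix.

```python
import re


def exxr_positions(peptide):
    """0-based start positions of every E-x-x-R match, overlaps included."""
    return [m.start() for m in re.finditer(r"(?=E..R)", peptide)]


# Exon 9 peptides copied from the CYP2X3 and CYP2X5P records in this file.
cyp2x3_exon9 = "AVIHEIQRTANTVPLSVFHCTTRDTELMGYSIPK"
cyp2x5p_exon9 = "PGIHKVQRIANTVPLNVQYCTMKETQLMAHLLPR"

print(exxr_positions(cyp2x3_exon9))
print(exxr_positions(cyp2x5p_exon9))
print(cyp2x3_exon9[4:8], cyp2x5p_exon9[4:8])
```

CYP2X3 has EIQR at the expected position, while CYP2X5P has KVQR there, so the Arg is kept but the Glu is replaced by Lys.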

>CYP2Y1 Scaffold_39a complete gene 9 exons Length = 125930 2 genes 48% to 2G1 
LGS30595.x1 Length = 630 (OLD FRAMESHIFTS CORRECTED)
12087 MDLTVMLLTATLLLVVLWILNAHTRKHTRLPPGPRGIPVLGNLLQLDKKAPFKSLLK 11917 (0)
11768 LSENYGPVLTVALGPQRTVVLVGYEAVKDALVDHADDFTGRGPVPFLMKVTRGY 11607 (1)
11166 GLAISNGERWRQLRRFTLTTLRDFGMGRKGMEEWIQEESKHLVTRIKSTE 11011 (1)
10937 GAPFDPTFFLSCTVSNVICCLVFGQRFSYDDEHFLSLLHIISETIQFGSSASGL 10781 (0) 
10700 MYNLFPRLMEWLPGRHREMFGKIEKVRAFTMEKIEEHQDTLDPSSPRDYIDCFLMRLQQ 10524 (0)
10452 EKPQPNTEFNYDNLVSTVLNLYLAGTETTSSTIRYALNVLIRHPKIQ 10312 (1)
10187 EKMQEDIDSVIGQGRCPYVEDRKSLPFTDAVLHEIQRYLDMIPFSIPHYALQDISFRGYTIPK 9999 (0)
9924  DTLIIPLLHSVLKDDKMWETPGSFNPQHFLDGNGSFKKNPAFLPFSAG 9779 (1)
9687  GKRACVGESLARMEIFLFVVSLVQHFTLSCPGGPDSVDLTPEYSSFANVPRKYKIIATPRWQ* 

>CYP2Y2 Scaffold_39b    Length = 125930 2 genes  49% to 2A13 over 486 aa
LDZ56427.y1 Length = 696 356-420 55% to LGS141970.y1 55% to 2F1
LGS30595.y1 LGL15334.x1 LGS236534.x1 Length = 605 opposite end of LGS30595.x1 53% to 2F1 
69% to LPC61680.x1 LGS111634.x1 65% to 2W1
15595 MEFSVTLILAGLVLAFFWFILQKRKYNLPPGPTTLPLVGNLPQLDKKQPFKSFTE 15431 (0)
15356 LSKSYGPVMTLYLGWQRTVVLTGYEVVKEALVDQAEDFTGRGPLPFLLKATNGY 15195 (1)
15078 GLGISNGERWRQLRRFTLSTLRDFGMGRKGMEEWIQEESKHLTARIKTLK 14944 (1)
14815 VKPFDPTFLLGCTVSNVICCMVFGERFSYDDKQFLELLRVIAEVLRFNSSFLGQ 14654 (0)
14549 MYNVFPWILEHLPGPQHTMFSHVNFLREFIKKKIQEHKESLDPSSPRDYIDTFLIRMEQ 14373 (0)
14282 EKNLPNTEFHYENLVSTVLNLFLAGTETTSSTLRYALGVLIKHPNVQ 14142 (1)
14046 EEMQREIDNVVRQDQCPKMEDRKSLPFTDAVIHEVQRFLDIVPFGLPHYALKDITFRGYSIPK 13858 (0)
13775 GTVIIPLLHSVLKGDQWETPWAFNPKHFLDQNGSFKKNSAFLPFSA 13638 (1)
13390 GKRSCVGESLARMELFIFLVTLLKDFTFSCIEGPDSISLNPQYSGFANLPRNYEIVATPR* 13208

>CYP2Z1 Scaffold_2993a 2 COMPLETE GENES Length = 28621 
= LGW128983.x1 58% to 2D6
= LGL41180.y2 57% to 2d22
exon 6 very poor seq quality with many frameshifts (fs) just after a 51 nuc. gap in the seq.
      MGLIVSVFGSHADWSISTLLLFTAVFILMVNWIRNRRPPSFPPGPWTLPVVGNMHNLAHHRMHLNLME (0)
16293 LAETYGNVFSIQLGQEWMVVLNGPTILKEALVNQGDSVADRPNLQLIIDSCHGL (1)
16785 GLGFSSGHLWKQQRQFAISTLRYFGSGSKSLEPVVLEEFAHCAKQFSEFK 16937 (1)
17023 GKPFAPQLMFYNIVTNIICSLVFGHRFEYGDKNFEKLMNSFGRCLQIEASVCAQ 17184 (0)
17262 LYNSFPRLMGCLPGPHQTVKRIYQNIRDFIREEMKEHKKGLDPSTPRDYIDCYLNKIKK 17435 (0)
      SGAPHTFHEENLVICVWDLFLAGTDTTTSTLHWLFLFMAKYPEMQ (1)
17899 EKVQAEIDEVIGQSRRATMDDCVNMPYTNAVIHESLRMGNVVPLSLLHATGRDIQLEGYTIPK 18087 (0)
18158 GTTVIANLTSALFDKNEWETPFAFNPGHFLDEEGRFRKRTAFLPFSA (1)
18388 GRRLCLGENLARMMLFLFFTSFMQDFTISFPAGVSPAMEYHHFGVTLAPHPFDICAVSR* 18567

>CYP2Z2 Scaffold_2993b    2 COMPLETE GENES Length = 28621
= LGS141970.y1 52% TO 2P3 over 486 aa
      MHWIFDLIGSFLAGDFKSLLFFLLIFILTADYLRNRRSGSFPPGPMAIPIIGNMLSLDRSRTHESLTQ (0)
21437 LAETYGNVYSLRTGQTWMVVVNSFKVVREALVTHGESVSDRPDLPLQDEIAHGK 21273 (1)
20946 GVISSNGHLWKQQRRFALSTLRLFGFGKKSLEPFITDEFTHCANIFRSYK 20815 (1)
20726 GKPLPPHLILNNVVSNIICSLVFGHRFEYGDKNFKNLIKLFDQSLQIEASVWAE 20565 (0)
20473 LYNSFPLLMKHVPGPHQTVKKIWNEVKDFVRNELKEHRKNWDPSDPRDYIDCYLREIQA 20300 (0) 
19990 SGQSDSTFDEENLVICVMDLFVPGSETTSTTLRWAFLYMAKYPEIQ (1)
19748 EKVQAEIDRVVGQSRPLTMDDRVNLPYTDAVLHEIQRFGNIVPLSLPHVTNKAIQLEGYNIPK 19560 (0)
19470 GIMIIPNLTSALFDKNEWETPCTFNPGHFLDNEGKFRKRAAFIPFSA 19330 (1)
19220 GKRLCLGENLARMELFLFFTSFMQHFTFSMPAGVKPDMSFRFGVTLAPKPYEICAIPR* 19044

>3a27 trout
MMSFLPYFSAETWTLLALLITLIVV
YGYWPYGVFTKMGIPGPKPLPYFGTMLEYKK
GFTNFDTECFQKYGRIWG
IYDGRQPVLCIMDKSMIKTVLIKECYNIFTNRRNF
HLNGELFDALSVAEDDTWRRIRSVLSPSFTSGRLKE
MFGIMKQHSSTLLSGMKKQADKDQTIEVKE
FFGPYSMDVVTSTAFSVDIDSLNNPSDPFVSNVKKMLKFDLFNPLFL
LVALFPFTGPILEKMKFSFFPTAVTDFFYASLAKIKSGRDTGNST
NRVDFLQLMIDSQKGSDTKTGEEQTK
GLTDHEILSQAMIFIFAGYETSSSTMSFLAYNLATNHHVMTKLQEEIDTVFPN
KAPIQYEALMQMDYLDCVLNESLRLYPIAPRLERVAKKTVEINGIVIPKDCIVLVPTWTLHRDPEIWSDPEEFKPER
FSKENKESIDPYTYMPFGAGPRNCIGMRFALIMIKLAMVEILQSFTFSVCDETE
IPLEMDNQGLLMPKRPIKLRLEARRNTPSNTTATTLKSPTT

>CYP3A47 Scaffold_600    56% to 3A Length = 62144 exon 11 split into two parts
Fc:c181J12x1 LPC.69453.x1 N-TERM
Fc:c121C06x1 LSH.46194.x1 C-TERM
Scaffold_875 50% to 3A13 = FS:s875
      MNYLPLFALETWILLITFTCLFVM (2)
      YGKRTFGIFEKLGISGPKPTMYFGSICKYNN (0)
14338 VYYLDQECAQKYGKIWG (2)
14472 TYELRKPMLVVMDPDLLKTILVKECFTHFTNRR (0)
      NFCLNGDLYDAVNIAEDDDWRRIRNVLSPLFTTGRIKQ 14879 (0)
      IFSLMKRQSSKLTSSLEPKAENEEIISIKD (2)
      FFGAYNMNVATGVLFGMEMEPSLPIKHASKLFKFPIPLFMIQGASP (1)
      RCFPILLPLLQLMGFSLFPRDSFAFVKKIVEKIRAERDGGSHQ (0)
      EFDFLQHMISTQKNDGIYAVTS
  (1) GLTDHEIVSQLTVLLTGGYETSTLALTLSIYSLATNPGSMNRLQEEIDATFPDA (0)
16040 APVTYEALMQMEYVDCVINECLR (2)
      LYPPAARLERTAKETVEISGITIPKNMTVTVPIFALHRDPEHWPEPEEFKPDR 16334 (2)
      FSKQNKGRINPYTYLPFGIGPRKCLGMRLALVIVKLALVETLQKYSFSVCKETE (0)
      IPFKMDPNTFVGPINPIKLRVVRR*

>CYP3A48 at 198-201K on scaffold 35 3A48 with a few changes 94%
FS:S000035 Scaffold_35
201938 MDLLPNFSIETWTLLTLIFTLIIV (2) 201867
YGYAPYGFFKKVGIPGPKPWPFIGTFLNYRR (0)
GIHHFDEECYKKYGKVWG (2)
LYDGRQPLMCIMDTGMIKTVLVKECYSNFTNRR (0)
DLGVNGPLSDAVSVAEDEQWKRIRGILSPSFTSGRLKE (0)
MYTIMLQHSKNLLNFLNKKVEADEVIDVKD (2)
VFGPYSMDVVTSTAFSVDIDSINNPSDPFVANIKEMTQFSFLNPLV (0)
VLFPFLVPIFKKMNVSTFPAHVIDFFFNFLRQIKSDRNKDKKK (0)
SRVDFMQLMVNAQMQEGNEEGSSQK (1?)
GLTDNEILAQAMIFIFGGYETTSSSLGFLAYNLATNPKIQKKLQEEIDKTFPGK (0)
VRPNYDDLMQLEYLDMVVNESMRVFPILSRLERMTKTSVEINGFTIPKGTVVAIPVYVLQHDKAYWPEPEAFKPER (2?)
FSKENKDNVDPYAYLPFGAGPRNCIGNRFALVLMKLAIAEILQHYSFVPCKETD (0)
IPMVLNTEGLVAPKNPIKLKLKPRAV

>old CYP3A48 Scaffold_1124a Length = 48103 2 genes 40k-44k and 47k-48k
Second gene 55% to 3A4 complete 
LJQ41167.x1 68% to 3A13 alternative 3A sequence C-helix
44660 MDLLLNFSIETWTLLTLVFTLIIV 44589 (2)
44089 YGYAPYGFFKKVGIPGPKPWPFIGTFLNYRR 43997 (0)
43888 GVHHFDEECYKKYGKVWG 43835 (2)
43728 LYDGRQPLMCIMDTGIIKTILVKECYSNFTNRR 43630 (0)
43549 DLGLNGPLRDAVSVAEDEQWKRIRGILSPSFTSGRLKE 43436 (0)
43350 MYTIMLQHSKNLLNFLHKKVEADEVIDVKD 43261 (2)
42808 VFGPYSMDVVTSTAFSADIDSINHPSDPFVANIKKMVKFNFLNPLLIFV 42662 (1)
42612 VLFPFTQPIFDKVDFSFFPAHVIDFFYNFLRQIKSDRNKDKKK 42452 (0)
42378 SRVDFMQLMVNAQMQEGNEEGSSQK 42310 (1)
42109 GLTDNEILAQAMIFIFAGYETTSSSLGFLAYNLATNPKIQKKLQEEIDKTFPGK 41948 (0)
41869 VRPNYDDLMQLEYLDMVVNESMRVFPILSRLERMTKTSVEI 41747
41746 NGFTIPKGTVVAIPVYVLQHDKAYWPEPEAFKPER 41648 (2)
41162 FSKENKDNVDPYAYLPFGAGPRNCIGNRFALVLMKLAIAEILQHYSFVPCKETD 41001 (0)
40889 IPMVLNTEGLVAPKNPIKLKLKPRAVSS* 40803

>old CYP3A49 Scaffold_1124b Length = 48103 2 genes 40k-44k and 47k-48k 86% to CYP3A48
First gene runs off contig end
48101 XXXXYEILAQ 48804 frameshift
48084 AIFFIFAGYELPAALLDFANNLATNPKIQKKLQEEIDKTFPGK 47956 (0)
47876 VRPNYDDLMQLEYLDMVVNESMRLYPIANRLERMTKTSVEINGLTIPKGTVVAIPVYALQ 47697
47696 RDPVLWPEPEAFKPER 47648 (2)
47492 FSKENKDNVDPYAYLPFGAGPRNCIGNRFALVLMKLAIAEILQHYSFVTCQETD (0) 47331
47246 IPMVLGNEGFVTPKNPIKLKLKLRDVFN* 47160

>CYP3A49 at 204-207K on scaffold 35 95% to 3A48
FS:S000035 Scaffold_35 completed with FM:M000128
125035 MDLLPNFSIETWTLIALVITLITV
YGYAPYGFFKKVGISGPKPWPFIGTFLNYKK (0)
GVHHFDEECYKKYGKVWG (2?)
LYDGRQPLMCIMDTGIIKTILVKECYSNFTNRR (0)
DLGLNGPLRDAVSVAEDEQWKRIRGILSPSFTSGRLKE (0)
MYTIMLQHSKNLLNFLHKKVEADEVIDVKD (2?)
VFGPYSMDVVTSTAFSADIDSINHPSDPFVANIKKMVKFNFLNPLLIFV (1?)
VLFPFTQPIFDKVDFSFFPAHVIDFFYNFLRQIKSDRNKDKKK (0)
SRVDFMQLMVNAQMQEGNEEGSSQK (1)
GLTDNEILAQAMIFIFAGYETTSSSLGFLAYNLATNPKIQKKLQEEIDKTFPGK (0)
VRPNYDDLMQLEYLDMVVNESMRLYPIANRLERMTKTSVEINGLTIPKGTVVAIPVYALQRDPVLWPEPEAFKPER (2)
FSKENKDNVDPYAYLPFGAGPRNCIGNRFALVLMKLAIAEILQHYSFVTCPETD (0)
IPMVLGNEGFVTPKNPIKLKLKLRDVSN

>CYP3A50P at 212-211K on scaffold 35
FS:S000035 Scaffold_35
On old scaffold 10760 also = LGW123041.y1 LKU32251.y1 
LGL33554.y1 FS:s1316 LGW92440.y1 LKG69668.x1
MNFLPDFFLNTWTFLILVIVLSTT (2)
IKRVGIRGPTPWAPLIGDIFYSTRRG (1)
KCCKRYGKVWG (2)
LYEGRIPVMFIVDTAMIKTVFVKEGYSVFLNRK (0)
SIGPNGILSTGLPFLRDDNWKRVHKIVSPAFSSGRMKD (0)
MFSIMLQHSNILM (insert in this exon)
FSIDIESLNNPSSPFLHYLQEVSKYNYMNLWRLLG (1)
AIFPFLTPLMDKMNITVNSSEALQFFISVIKKIKEERKNNPK (0)
DRVDFMQLMLNAQAPDGSNKDCDSA (1)
GLSDEEIMVQALIFALVGNGNMAYLVAFTAYNLAVHPQTQTRLQAEIDRTFPGKVSPN (1)
(aa not ag) SYEELLQLKYLDMVVTETSRLYPLGNRIERVAKSTVEVSG
VIIPEGVAVAVPIYTLHRDPTVWPDPDSFKPER (2)
FSKDNRDNIDPYGLL (frameshift and 3 aa deletion)
GPRSCTGTRMSMLVVKLTLVEILQHFSFVACKETM (0)
IPMVLDDNGFVHPKTPIMLIPARTPGSCFSACTYVINVLLTVRRLICSLIN*

>old CYP3A50P Scaffold_10760 
= LGW123041.y1 3A Length = 7609 probable pseudogene
1847 MNFLPDFFLNTWTFLILVIVLSTT 1918 (2) exon 1
2258 bad boundary MDVLHMSSIKRVGIRRPKTWPIIGTYNRTDG 2350 bad boundary exon 2
     cannot identify exons 3-6
3091 XXXXXXXXXXXXXXFSIDIESLNNPSSPFLHYLQEVSKYNYMNLWRLLG 3195 (1) part of exon 7
3268 AIFPFLTPLMD 3300 start of exon 8 
     Seq gap in scaffold
3919 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXYNLAVHPQTQTRLQAEIDRTFPGK 3990 exon 10
4078 CLITYEELLQLKYLDMVVTETSRLYPLGNRIERVAKSTVEVSGVIIPEGVAVAVPIYTLH 4257
4258 RDPTVWPDPDSFKPERFSKDNRDNIDPYGLLTR 4424 frameshift and 7 nuc. deletion 
4423 GPRSCTGTRMSMLVVKLTLVEILQHFSFVACKETM 4527
     IPMVLDDNGFVHPKTPIMLIPARTPGSCFSACTYVINVLLTVRRLICSLIN* 4787

>CYP3B1 Scaffold_2213a 3A-like Length = 32737 one of two genes 13 exons
6382   MFEFVLFSGTTWALLALFFALLLL 6311 (2)
6232   YGVWPYHHFKKLGIRGPRPLPFMGSTFYYRK 6134 (0)
6052   GIIPFESWCQAEYGDVWG 5999 (2)
       MFEGRTPVLMVSDPEILKTVLVKECYSVFTNRR (0)
5469   DSFAGPLEDSVSAVKDERWKRIRSTISPCFTSGRLKN 5368 (0)
5259   AFPIVARYADRITKKLEQSNLDEPINVKE 5173 (2)
5069   FLAPYSLDAVTSVSFSVEADSINNPNDPLIVNLKKVFKFNFVVFFLV 4929 (1)
4846   AFFPFCARLFQFLGIDPIPRSSVNYFYNVIKNFKDQHHAD 4727 (bad boundary, 0 expected)
4124   TRGDFLQVLIQSEIPQSEIKSEQDQPPK 4041 (1)
4011   GLTEHEILSQAFIFIFGGYETTTTTLTNVLYGLAINPDVLQVLHKEIDTNIPSD 3850 (0)
3780   APISYEDLMGLQYLDQVLNESQRLYPTAPRLERACKKTVQIHG 3652
3651   LTILEGTIVGIPVHLLHKDPRFWSSPEEFRPER 3553 (2)
3416   FSKDSTEEVNPYAFMPFGLGPRNCVGMRYAILVMKMLIVRLLQSYTVETCKDTM 3252 (0)
3171   IPLEFDWKSQPLKPIKLSFIPRQK* 3097

>CYP3B2 Scaffold_2213b 3A-like Length = 32737 one of two genes 13 exons
Scaffold_2894 exon 4 (this scaffold name is now retired)
LGS118983.x1 LGX10826.y1 exon 1
12175 MFVFMCFSATTWTILVLFSTLLLL (2)
11998 YGHWPYRLFRKNGIPGPMPLPFIGTMWNLLK 11906 (0)
11778 GNMVFDRECQSKYGDVWG 11725 (2)
11412 VFEGRTPVLMVSDPGMIKTILVKECYSVFTNHRVR (0)
11044 IFSGPLETSVFSAKDETWKRMRTSISPCFTSGRLRQ (0)
10847 AFPIIARYADRFIAKLEQTKLEDSTDIKK 10731 (2)
10640 LFGPYSLDVIASSSFSVDADSINNPDDPFIINIKKVLNLNFWLLLIK 10500 (1)
10422 NVLPFSSYLFEFLHIDIIPRSIVDYFFNLIKQLKAQHDQS 10309 (0)
 9911 IKGDFLHVMIQNEIPQSEIKSDQDQPPK (1)
 8638 GLTEQEILSQSVLFIFGGYDTTTITITYLLYNLAINPDVLQILHSEIDSNFPKD 8483 (0)
 8391 TPFSYEDLVGFQYLDQVLNESQRLIPTAPALERFCKKTVQIHGLTIPEGTVVAVPVHLL 8215
 8214 HKDPRFWSSPEEFRPER 8164 (2)
      FSKDNIEEVNPYAFMPFGLGPRNCIGMRYAILVMKMILVRVLQSYTVETCKDTM 7888 (0)
7808 IPLEFDWKLQPTKQIKLRLVPRQK* 7734

>CYP4F28 Scaffold_2425  complete gene 13 exons Length = 32794 55% to 4F3
LGS32765.x1 Length = 197 exon 5 76% to 4F7 
8381 MSVLQLALSWTSVCPLLAVLAAAPVALLTFWTAKLLVRHAWYTHRMACFSKPHARSWLLGHLGQ 8190 (0)
7561 MQSTEEGLLQVDELVQMFTYCCSWFIGPFYHLVRVFHPDYVKPLLMAP 7418 (1)
7238 ASITVKDELIYDHLRPWL 7185 (1)
7049 GNSLLLSNGEAWSRRRRLLTPAFHFDILKNYVTKFNTSTNTLH 6921 (0)
6727 DKWHRLLVEGTTNIEVFEHFTLMTLDSLLKCAFSYDSDCQQ 6605 (2)
6527 SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHK 6381 (2)
6307 FTRDVVQKRLALISQKGVAELAAHRKRDFVDIILLTK 6197 (0)
6115 DEDGKGLTDEELQAEANTFMFA 6050 (1)
5813 GHDTTASAICWSLYNLAHHNHYQEQCRQEVMDLMEGRDVDEIKW 5682 (2)
5607 EDLSSLPFTTMCIRESLRLHSPVQAVTRRYIQDVKLPGERTVPK 5476 (1)
5248 GAICLVSIYGTHHNPAVWTNPH 5183 (0) 
5101 DFDPLRFDPKNQEGLSSHAFIPFSSGPR 5018 (2)
4948 NCIGQKFALAELRVVVALTLLRFRLLPGNDPKRETPFEKVRRLPQLVLRAEGGLWLQIKPVTQTQ* 4751

>FM:M001448
          Length = 56159

 Score =  106 bits (262), Expect(2) = 1e-38
 Identities = 49/50 (98%), Positives = 50/50 (100%)
 Frame = +2

Query: 1     SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHKF 50
             SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHK+
Sbjct: 49592 SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHKY 49741

 Score = 74.1 bits (179), Expect(2) = 1e-38
 Identities = 37/38 (97%), Positives = 38/38 (99%)
 Frame = +3

Query: 49    KFTRDVVQKRLALISQKGVAELAAHRKRDFVDIILLTK 86
             +FTRDVVQKRLALISQKGVAELAAHRKRDFVDIILLTK
Sbjct: 49809 RFTRDVVQKRLALISQKGVAELAAHRKRDFVDIILLTK 49922
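
Note on the Identities lines: the count is simply the number of alignment columns where query and subject carry the same residue, over the alignment length. A minimal sketch, assuming the two rows have already been extracted as equal-length strings (the function name identity is hypothetical); Positives are not computed here because they would need a substitution matrix.

```python
def identity(query, subject):
    """Count identical, non-gap columns in a pairwise alignment.

    Returns (matches, alignment_length).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    matches = sum(
        1
        for q, s in zip(query.upper(), subject.upper())
        if q == s and q != "-"
    )
    return matches, len(query)

# First alignment block above (scaf 2425 exon vs FM:M001448).
q = "SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHKF"
s = "SSSEYASAIIELSDLIIERRQRILHHWDWIYWRTQQGKRFKKALSIVHKY"
matches, length = identity(q, s)
print(f"{matches}/{length}")
```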

Query: 1    LRVVVALTLLRFRLTLGVNPEVGPSSGEVRRLPQLVLRAEG 41 CYP4F7 mRNA
            LRVVVALTLLRFRL  G +P+      +VRRLPQLVLRAEG
Sbjct: 4915 LRVVVALTLLRFRLLPGNDPKRETPFEKVRRLPQLVLRAEG 4793 no frame matches 4F7 

>BG799227.1 fp31h09.y1 zebrafish gridded kidney Danio rerio cDNA clone 4744960 Length = 606 
similar to scaf 2425
MMLFGLSVTGILALSGAALLCALVTRIIINRRALRCFNQPPLRNWIMGHMGLMGHNEEGLQRVDDLVCKYGHSCSWFLGPFYNMVRLFHPDYI
RSLLTASASITLKDRIFYGFMKPWLGNCLLLQSGQEWSRHRRLLTPAFHFDILK

>gi|7889176|emb|AL230181.1|AL230181 C0AG216AG12LP1 G Tetraodon nigroviridis genomic clone 
216M23 T7. Length = 575 similar to scaf 2425
254 XQSTEEGLLQVEELVQMFTYCCSWFIGPFYHLLRVFHPDYVKPLLTAX 391

>CYP4T5 complete Scaffold_15094 Length = 4295
50% to 4B1 and 4A14 lower case = human
Scaffold_9071 64% to 4a14 exon 12
Scaffold_8637 LKG81295.x1 exon 9 first 6 aa missing off the end
LKG81295.y1 LKB129582.y1 exon 5
LKU50005.y1 exon 5 runs off end
LKB5549.y1 exons 2 and 3
LKU98995.x1 part of exon 2
78% to CYP4T2 Dicentrarchus labrax
508  MEITRALVVLGWSHFYQLLALFCLAIVLYKLTVLLMLKRALIRNFESFPGPPGHWLFGNILE 693 (0)
902  FKQDGNDLDKLVKFGQKYPYCFPLWFGPFVCFLNIHHPEYVKTILAST 1045 (1)
1142 EPKDDLAYSFIQNWI 1186 (1)
1291 GNGLLVSQGQKWFRHRRLLTPGFHYDVLKPYVKLMAHSTKTML 1419 (0)
1673 DKWESYAKTNKPLEVFEYVSLMTLDTILNCAFSYDSNCQTER 1798 (2)
2267 KNTYIKAVYELSNLINLRFRIFPYHNDLIFYLSPHGFRYRKACMVAHSHT 2416 (1)
2521 EEVIKKRREALKKEKELERIQAKRNLDFLDILLFAK 2638 (0)
3171 DENQQGLLDEDIRAEVDTFMFEGHDTTASGISFLLYNLACHPKHQKLCRKEIMQVLHGKDTMDW 3362 (2)
3457 EDLNKIPYTTMCIKESLRMHPPVPGISRKTTKPITFFDGRTLPA 3588 (1)
219  ESRIGTSVFGIHRNASIWENPN 284 FM:M006027 scaffold_6027
     VFDPLRFLPENISKRPPHAFVPFSAGPR gnl|ti|112745823 MBF260486.y1.gz trace archive
     NCIGQNFAMNEMKVVIAMTLLKYELLEEPTLKPKIIPRLVLRSLNGIHIKIKNANQN* gnl|ti|438249107    
     ACAX159262.x1 trace archive

392  ESRIGTSVFGIHRNASLWENPN 454 (0) this exon from Fugu LPC.11421.x1
1518 vfdplrflpenaskrsphafvpfaagpr 1601 FS_CONTIG_43569_1 Tetraodon exon
     fdhwrflpenvskrsphafvpfsagpr this exon from 4T2 Dicentrarchus labrax
     NCIGQNFAMNEMKVVIALTLKKYHLIEDPNWKPKIIPRLVLRSLNGIHIKIK 4T2 Dicentrarchus AF045468
425  NCIGQNFAMNEMKVAVALTLKRYYLIKDPDHTPKMIPQVVLRSLNGIHIKIK 270 zfishK-a877b03.q1c
  6         AMNXIKVVIAMTLKKYELMEEPTMKPKIIPRVVLRSLNGIHIRIKD 143 AL175451 Tetraodon nigroviridis

>CYP4V5 Scaffold_485 complete gene 11 exons, 60% to 4V3 over 513 aa
= LGS237519.x1 Length = 65714
exon 5 starts with ETAM seen in many Drosophila 4 fam. and C. elegans sequences
36037 MADLLGGFTLPLLGASVFVAALTCFTYRMLSDYLHKWFQMKPIPELEGTYPLIGNALQFKPNAG 36222 (1)
36309 DFFNQIVEYTRENYHRPLFKIWVGPVPFVVLFHPETVE (0)
36504 PVLTNAVHMEKSYSYSFLHPWLGTGLLT 36587 (2)
37012 STGPKWRRRRKMLTPTFHFSILADFLEVMNEQAEILVEKLDQQAGKGPFNCFSYVTLCALDIIC 37203 (1)
      ETAMGKKIYAQSNSESEYVKCVSK 37380 (2)
37458 MSDIISRRQRTPWFWPNFAYYSIGDGREHDSTLKVLHSFTYK (0)
37661 VITERAENVSSVESDSDSDHGRKKRQAFLDMLLKTTDEDGNKMSHRDIQEEVDTFMFR (0)
37906 GHDTTAASMNWVLHLMGSHPEAQSKVHQELQEVF (1)
38199 GESNRPITTEDLKKLKYLESVIKEALRLFPSVPFFARSLGEDCHI 38321 (1)
38530 NGFKVPKGANAVIITYALHRDPRYFPEPEEFRPERFLPENSVGRPPYAYLPFSAGLRNCI 38724 (1)
38802 GQRFALIEEKVVLASILRKFNVEACQKREELRPVGELILRPEKGIWIKLEKRKPLIPPS* 38981

>CYP4V5-like fragment FS:S000209 Scaffold_209 52% to 4V5
just a pseudogene fragment
102980 DSTIAGFKVPKGHGAIYVSHAVHRDPDVFQQPDSFLPERW 103099

>CYP5A1 Scaffold_1244 COMPLETE GENE 12 EXONS Length = 46168
= LED26857.x2 Length = 665 49% to 5A
= LGW67540.x1 Length = 648 
part of exon 8 from TAH to MKM not in seq alignment (CYP5A insertion)
7486 MSQLIFNIKFLEVTIRLSVNTVLSFRYSVRPFSVLSRNGIKHPKPLPFFGNLFMFRQ 7316 (0)
7158 GFFNPLNDLIKTHGRVCG 7105 (2)
7021 YYLGRKPVVVVADPEMLRQVMVKDFSSFPNRM 6926 (0)
6848 SLRFITKPMSDCLLMLRNERWKRVRSILTPAFSASKMKE 6732 (0)
6519 MVPLINTATDALIKNLDAFAESAEAFNIHK 6430 (2)
6355 CFGCFTMDVVASVAFGTQVDSQNNCDDPFVRHAQLFFSFNFFRPIMLFF 6209 (1)
6046 AAFPSLAAPLVGIIPNKKRDDMNHFFIRTIQKIIKQREEQPPEQ 5915 (0)
5794 RRRDFLQLMLDARTSDESVSIEHFDTAHPVGEAHNRSEQNQDNQVGPQERPQMKM 5630
5629 ITEDEIVGQAFVFLVAGYETSSNTLAFACYLLAINPECQRELQKEVDHFFTRH 5471 (0)
5395 ESPDYTNVQELKYLDMVISETLRLYPPGFR 5306 (2)
5229 FAREIERDCVVNGQSFPKGATFEIPAGFLHRDPEHWPDPDKFIPER (2)
4998 FTPEAKASRHPFVYIPFGAGPRNCVGMRLAQLEMKMALVRLFRRFNLLACTETK 4837 (0)
4699 VPLELKSSSTLGPKNGVFVKIERRHWDESQVNSPSED* 4586

>CYP7A1 Scaffold_5172  Length = 18849 59% to 7A1
= LGW1565.x1 Length = 555 27-153 CYP7A1
= LGW57257.y1 50% to 7a1 238-350
= LOL6406.x1 61% to 7A1 390-436 also LOL6406.y1
= LGW154142.y1
= LGU7599.x1
insertion of 6 aa in exon 4 vs mammalian seqs, but probably real; see zfish seq below
14694 MILSIALIWAVVVGFCCLLWLAVGIRHR 14777 (2)
15070 HSSEPPVENGLIPYLGCALQFGANPLQFLRSRQKKYGHIFTCKIA 15204 frameshift
15205 GQYIHFLCDPFSYHSVIRQGRHLDWRKFHFATSVK 15310 (0 expected) bad boundary
15425 AFGHDSFDPRHGHTTENLHQ 15484
15485 TFLKTLQGEALPSLIKTMMGHLQDVMLKSDTLRRSKDHWEVDGIFAFCYK 15634 (0)
15757 VMFESGYLTLFGKELGEDTCQARQAAQKALVLNALENFKEFDKIFPALVAGLPIHVFKSAYSARE 15951 (0)
16053 NLAKTMHAEKLSKRENVSDLISMRMILNDSLSTFNDVSKARTHVALLWASQANTLPATFWSLFYMIR 16253 (2) 
16383 SPDAIKAAREEAQKVFETFGVKIDPHNPTLNLTRDVLDNMPVL 16511 (1)
16744 DSIIKEAMRLSSASLNIRVAKEDFLLHLDNQEAYRIRKDDVIALYPPMLHYDPEIFEDPY 16923 (0)
17029 EYKFDRFLDENNQEKTTFTRNGRKL 17103
17104 RYFYMPFGSGVTKCPGRFFAVYEIKQFLTLVLTYFDMELLDPAIQVPPLDQSRAGLGILQ 17283
17284 PTYDVDFRYKLKLAY* 17331

>gi|12016604|gb|BF717505.1|BF717505 fd45f11.y1 Zebrafish WashU MPIMG EST Danio rerio cDNA clone
           IMAGE:3732717 5' similar to SW:CP70_RAT P18125
           CYTOCHROME P450 7 ;.
          Length = 304

Query: 1   VMFESGYLTLFGKELGEDTCQARQAAQKALVLNALEN 37
           VMFE+GYLTLF KEL  D   ARQ AQKALVL  L+N
Sbjct: 194 VMFEAGYLTLFXKELDGDQSIARQQAQKALVLXCLDN 304

>CYP7C1 Scaffold_16085 Length = 3913 43% to 7A1 completed with FM:M001203
= LGL2670.y1 Length = 704 282-411 
EST Fe:eCA588108 has N-term sequence
      MSGSLLSLVGVLSLLVCVLRRRI EST seq 
72239 MSGLLLLLVGVLSLLVCVLRRRI 72171
70779 RRDNEPPLVMGWIPFVGKAVEFGRDAQGFLLQQKKKFGDVFTVLIA 70642 (1)
1114 GKYMTFIMDPLMYPNIIKHGRQLDFHSFSDSMAPVVFGYPPVRSWKTPSLHEDIQ 1293
1294 RAFKLLQGDHLCALTEGMMGNLMMLLRQDHLGRRPGAGPGWKSGDMYEFCSRVMFEATFL 1473
1474 TLYGIPQEGGRHDGMDELRKDLFQFDGWFPWLVAGVPIGLLRQAKTSRNKLTRS 1635
1636 LLPMRISSWSNRSQFIRRRQELVEKVDALKDVDRA 1740 (1)
2208 AHHFAMLWASVANTIPACFWSMYNLVSHPDALQVVRQQILDELKLSGVQFSTDTDVTL 2387
2388 SRDLLDKLLYL (1) 
     ESSVNESLRLSSASMNIRVAQEDFSLHLKNERSANVRKGDIIVLYPQSLHMDPEVYEDPQ 2666 (0)
2786 TFQFDRYVQDSKGGFFKGGQRLKYYLMPFGSGSSMCPGRHFAVNEIKQFLCLMLLYF 2956
2957 NLELEPGQTRATVDSSRAGLGILFPSAKVHFRYRLRCV* 3073

>FS:S005652 Scaffold_5652
          Length = 12353

 Score =  121 bits (301), Expect = 6e-28
 Identities = 57/60 (95%), Positives = 57/60 (95%)
 Frame = -1

Query: 1    SCAPAGKYMTFIMDPLMYPNIIKHGRQLDFHSFSDSMAPVVFGYPPVRSWKTPSLHEDIQ 60
            SCAPAGK   FIMDPLMYPNIIKHGRQLDFHSFSDSMAPVVFGYPPVRSWKTPSLHEDIQ
Sbjct: 2801 SCAPAGK---FIMDPLMYPNIIKHGRQLDFHSFSDSMAPVVFGYPPVRSWKTPSLHEDIQ 2631
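
Note on the Frame lines: the reported frame can be recomputed from the subject start coordinate, and for minus-strand hits from the subject length as well. The sketch below assumes the usual translated-BLAST numbering (frame +1 starts at base 1; frame -1 starts at the last base of the subject); the function name blast_frame is hypothetical. It reproduces the frames printed in the hits in this file that were checked.

```python
def blast_frame(sbjct_start, sbjct_end, subject_length):
    """Reading frame of a translated-subject hit from its coordinates.

    Plus-strand hits have start <= end; minus-strand hits list the
    higher coordinate first.
    """
    if sbjct_start <= sbjct_end:
        return (sbjct_start - 1) % 3 + 1
    return -((subject_length - sbjct_start) % 3 + 1)

# Hit above on Scaffold_5652 (Length = 12353, Sbjct 2801 -> 2631).
print(blast_frame(2801, 2631, 12353))
```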

>_4
                              VLLYDDCSIYMSICFSSHFFK*LFCSDTNKYPK*LYCIVNLTGKN*AKTI*GRTLD*TSV
                              FG*CSCSVTKVHDRRRAQRSISNLPDNGAWVHAPSCPQSLHKTGTVGLLQACSGQGPWP*
                              TGRCRFTWLPRTC*RSGPSSCHDEANRTSSAKSRDEIL*FPTGPPLAPGRTERFCP*KF*
                              IEPVTKGSPARVQQALEAGLTYCRNETVQPSAGGLRPPHRSTGHQEGHDGLVG*TPINPP
                              GEGIGLVQRSSTSVPPASEVP*SSPEPWCRPSRGG*GV*STCIWNTTSAPPLRAAAQAAR
                              PDTVPTTRRCRGQNSPKTWS*DHLRNNSYF*KQK*RSILSRLCSGAALPMMDDVVVCWCC
                              RRXXXX
                              >_5
                              RVVV**LQYLHEHLFFISFF*IIVLFRYK*ISQMTLLYCESNWQKLS*DHLRKNFGLNVS
                              LWLM*LLCNEGS*PKASAAVDQ*PPR*RCVGPRTILPSVTAQDRDSWTPPGLFWSRTMAL
                              NWEVPIHLAAQNLLKVRTFFMS**SQQDIVCKKQR*NPVVPNWTPSGPWTHREILSLKIL
                              NRTGDKGQPSQSPTSTGSRSDLLPQ*DRSALSRGPQTPPQKHRTPGGT*WTGRVNSHQPS
                              WRGYRAGPAFLDQRPPSVRGSLILSRTLVQAFPGRLRRVIHLYLEHNLCSPFEGSSPGCQ
                              TRHCPDHATLQRPKLTKDMELRPFKEQLVLLKTEIKIYIEPVVQRSGTANDG*RCCLLVL
                              QTXXXXX
                              >_6
                              PCCCMMIAVFT*ASVFHLIFLNNCFVQIQINIPNDFIVL*I*LAKTELRPFKEELWIKRQ
                              SLVDVVAL*RRFMTEGERSGRSVTSPITVRGSTHHPALSHCTRPGQLDSSRPVLVKDHGL
                              KLGGADSLGCPEPAEGQDLLHVMMKPTGHRLQKAEMKSCSSQLDPLWPLDAPRDFVLKNS
                              E*NR*QRAAQPESNKHWKQV*LTAAMRPFSPQQGASDPPTEAQDTRRDMMDWSGKLPSTL
                              LARV*GWSSVPRPASPQRPRFLDPLQNPGAGLPGEAEACDPPVFGTQPLLPL*GQQPRLP
                              DQTLSRPRDAAEAKTHQRHGAKTI*GTTRTSKNRNKDLY*AGCAAERHCQ*WMTLLFVGV
                              ADVXXXX

>AL346920 C0AB009BB08B1 B Tetraodon nigroviridis
MIWAALLGITGILTLVLLFFSR
RQENEPPLDKGALPWLGHALEFGRDAAKFLARMKEKHGDVFT
VRVAGQYVTVVLDANSFDSVVNDTVSLDFISSKNQL

>CYP8A1 Scaffold_4451  Length = 20860 = LDZ70400.x1 Length = 607 
55% to 8A1 human over 423 aa
LKU27079.x1 Length = 150 67-111 cyp7a
LGS296803.x1 Length = 653 238-284 8a
FS:S002619 Scaffold_2619
      MELRPFKEQLVLLKTEIKIYIEPVVQR
      MALNWEVPIHLAAQNLLKVRTFFMS
      MMDWSGKLPSTLLAR
      MIWTVLLLVHALLLYFILRHRSR 15761 (2)
      NEPPLDKGLIPWLGHALEFGKDASKFLERMKRKHGDIFT 15636 (0) 
      VRAAGRYVTVLLDPHSYDQVIHDQDCLDFHSYAKVLMERIFQLRLPNHEPAKEKAVMTQ (2)
15682 HFLGMNLCGLNSSMSRHMLEVAKAEMPQNQKDWKEDGLFNLSYSLLFK 15539
      VGYLTLFGGEQNNNCSDLASIYEEYKKFDGLLTKMARGTLKS (1)
      YEKKTAQSARQRLWELLAPARLAKGSGSTPWLHAYRRLLREEGVDNEMQTRALLLQLWATQ (0)
      GNVGPAAFWMLGYLLTHPEALTAVKTEMEALQLSPLDSSVVTPVF (1)
      DSALDEALRLTAAPFITREVLQDKVLHMADGQQYLLRKGDRVCLFPFISPQMDPEIHQEPQ (0)
12256 KYKFNRFLNEDGSVKKDFYKGGRRLKYYSMPWGAGANGCVGKQFAISTMKQ 12107 (2)
      YIYVLLTNYDLELCDPCALMPGVNASRYGFGMLQPEGDLLVRYRPRQKL*

>CYP8A2 Scaffold_5061 Length = 18605 complete gene 48% to 8A1 human over 491 aa
= Fc:c114O18y1 LPC.43560.y1 Length = 982 Fc:c125E08x1 LPC.47739.x1 57% to 8a1
= LPC43560.y1 LPC47739.x1 LPC46976.x1 LGW163383.x1 Fc:c123O08x1 LPC.46976.x1 
= Fc:c161B20y1 LPC61801.y1 Length = 891 69% to 8A1 28-70
= Fc:c094B16x1 LPC36057.x1 LPC35796.x1 LGS155375.y1 Length = 938 62% to 8A1 336-416
      MIWAALLGGLLTLVLLYFSRRR (2)
15028 QDNEPPLDKGALPWLGHALEFGRDAAKFLARMKEKHGDVFT 14906 (0)
14562 VRVAGQYITVLLDANSFDSVLNDTVSLDFVKSKNQLLERIFFLKLPGLQPAAEREWMEQ 14389 (2)
14306 HFHGFRLSNLSGTMKANIESLLLSDVKGGSASGWRQDGLFNFCYSLLFR 14163 (2) 
      AGYLTLFDSADNVTAIYKEFRKFDKLLSKLVRGSLKK (1)
      GETHTVNSSRKRLWELLSADWLSRASGSISWQQSYNRFLEKEGVDMEMQRRASLLQLWTTQ 13707 (0)
      CNNGPAAFWLLGFLLTHPEAMEALKSEIRQFNLQDPAVRHRTSGSPDSRRTPVF 12379 (1)
12150 DSILSETLRLTAAVLISRVVVGDKVLRMASGQQYKLRRGDKAILFPFLSPQMDP 11989 (frameshift)
      EIHLEPQ (0)
11610 SFKYDRFLNIDMSMKDTFYKNEGRLKYYTMPWGAGRHACVGKEFAVATIKQ 11458 (2)
10998 FVFFILTHFDLETCDPQAKLPPVNPSRYGLGMLQPEGELQVRYRLKRSQPET* 10840 

>CYP8A3P Scaffold_11168   Length = 7738 60% to CYP8A2
= FC:C061O19bD10 FC:C061O19bD6 Length = 600 28-74 cyp8a 62% to Fc:c161B20y1
exon 1 not detected, rest of sequence runs off beginning of scaffold
347 (2) RDNEPPLVMGWIPFVGKAVEFGRDAQGFLLQQKKKFGDVFT 225 (0) exon 2

>CYP8B1 Scaffold_7782 Length = 12212 50% to 8B1 95% to LKH34703.x1 LDZ59358.y1 
Scaffold_16670 Length = 3325 = scaf 7782
Scaffold_14036   Length = 6012 2 diffs with scaffold 7782 
Scaffold_19012 Length = 2771 2 diffs with scaf. 7782 and LKH34703 contig
4794 MASVLLILLTFLVALLGGLYLLGVFRQRRPGE
4890 PPLDKGLIPWLGHVLEFRRNTWRFLERMEKKHGDVFTVQLAGFYITFIQDPMSFGAFVKE 5069
5070 SREKLDFRKFASHLVRRVFGYSSIQNDHDILNASSNRHLKGDGLEVMTQAMM 5225
5226 VNLQNLMLHNVDSSSDQVTWREEKLFAYCYNIVFRAGYLSLYGNEPFDSKISQEKAKEKD 5405
5406 RAESEALYHEFRKYDQLFPRLAYGVLPPKKQREATKLLEYFWNVLSVQKLKGRD 5567
5568 NISRWIWDVQEGKDNAGVKEDMITRYMFVLLWASQGNTGPSSFWLLLFL
     MKHPEAMRSVKEEIDKVVRESGQEVKPGGPLVNLSREMLMKTPIL
     DSAVEETLRLTAAPLLTRAVLQDMTLKMADGQQYFIRQGDRVALFPYSAIQMDPEIHPDPR
     SFKYDRFLNPDGSKKTDFYKAGNKVKYYTMPWGSGISMCPGRFFATNELKQ
     FAFLMLLYFEFELINPDEEIPEIDYSRYGFGTMQPDRDLQFRYRLRY* 6329

>CYP8B2 Scaffold_21917 Length = 2428
LKH34703.x1 LDZ59358.y1 Length = 613 7-494 CYP8B1 like 41%
LGX19039.x1 LGW33165.y1 LPC13393.x1 Fc:c026O05x1 LPC9624.x1 59% to 8B1 248-494 
Fc:c102B11x1 LPC39017.x1 LGP735.y1 LKG104252.x1 Fc:c035B14x1 LPC13393.x1
MASLLLILLTFLVALLGALYLLGVFRQRRPGEPPLDKGLIPWLGHVLEFCRNTWRFLERMEKKHGD 
VFTVQLAGFYITFIQDPMSFGAFVKESREKLDFRKFASHLVRRVFGYIATKADHEMLNASSNRH 
LKGDGLEVMTQAMMINLQNLMLHNVDSSSDRVTWREEKLFAYCYNIVFRAGYLSLYGNEA 
FDSKTSQEKAKEKDRAESEALYHEFRKYDQLFPRLAYGVLPPKKRREATKLLEYFWNVLS 
VQKLKGRDNISRWIWDVQEGKDNAGVKEDMITR
YMFVLLWASQGNTGPSSFWLLLFLMKHPEAMRSVKEEIDKAVRESGQEVKPGGPLVKMSR 
EMLMKTPILDSAVEETLRLTAAPLLTRAVLQDMTLKMADGRQYFIRQGDRVALFPYSAVQ 
MDPEIHPDPRSFKYDRFLNPDGSKKTDFYKAGNKVKYYTMPWGSGASMCPGRFFATNELKQFAFLM 
LLYFEFELINPDEEIPEIDYSRYGFGTMQPDRDLQFRYRRRY*

>CYP8B3P Scaffold_20802   Length = 2432 9 diffs with Scaffold_7782
pseudogene of scaf. 7782
2204 DIAGAKEDMINRFISRYMFVLFWASWGNTGPSSFWLLLFL 2085
     exon 7,8 and part of 9 in a deletion
2061 MPWGSGISMCPGRFFATNELKQFAFLMLLYFEFELINPDLEIPEIDYSRYGFGTTQPD*D 1882
1881 LQFRYRSRY 1855

>CYP11A1 Scaffold_1630 complete gene 9 exons 82% to 11A1 trout Length = 42168
= LGW98501.x1 
= LKB125345.x1 
37515 MARWSVWRSPVVLPLSRMEVPMTGARHSSTMPVARQTYSDSSSFV 37381
37380 RSFNDIPGLWKNGVANLYNFWKLDGFRNLHHIMVQNFNTFGPIYR 37246 (2)
37065 EKIGYYESVNIINPEDAAILFKAEGHYPKRLKVEAWTSYRDYRNRKYGVLLK 36904 (2)
36822 NGEEWRCNRVLLNKEVISPKVLENFVPLLDEVGNDFVVRVHKKIARSGQNKWTTDLSQELFKYALES 36628 (1)
36506 VSSVLYGERLGLFLDYIDPEAQHFIDCISLMFKTTSPML
      YIPPALLRKVGAKVWRDHVEAWDGIFNQ 36300 (1 expected) bad boundary
36208 ADRCIQNIYRRLRQETGPSKKYPGVLASLLLRDKLSIEDIKASITELMAGGVDT 36047 (0)
35975 TSITLLWTLYELARHPNLQEELRAEVAAARTESQGDMLEMLKRIPLVKGALKETLR 35808 (2)
35731 LHPVAVSLQRYIAEDIIIQNYHIPAG 35654 (0)
35568 TLVQLGLYAMGRDPKVFFRPEQYQPSRWLRSETHYFKSLGFGFGPRQCLGRRIAEAEMQLFLIH 35377 (0)
35297 MLENFRVEKQRHMEVQSTFELILLPDKPIILTLKPLSS* 35181

>CYP11B1 Scaffold_9267 Length = 9352 cyp11 like N-TERMINAL exons 1 and 2 
rest of gene off scaffold end
Fc:c066L04x1 LPC.25262.x1 50% to 11B1 139-189 part of exon 3
Fc:c066L03x1 LPC.25166.x1 Lowercase = zebrafish
Scaffold_8316   Length = 11859 CYP11 like exons 4-8 may be C-term of scaf. 9267
C-term exon 9 not found
This sequence is composed of three separate sequences that all match CYP11B seqs.
Part confirmed on FM:M001017 and FM:M027074 scaffold_27074
8573  MWLPAGAGVRARSARGFRTAAGAVVDGKVGACKGAEVPESKKGVDGQVRSFEEIPHTG 8746
8747  RNSWVNLLRFWREDRFRHLHKHMERNFNSLGPIYR (2)
      EPVGAPNSVNIMLPSDISELSRSEGLHPRRMTLQPWATHRETRKHGKGGFHK (2?)
254   NGEEWRADRLLLNKEVMMSEAVRRFIPLLDEVAQDFCHMMQTKVEREGRGERGKRSLTINPSPDLFRFALE
11719 ASCHVLYGERIGLFSSSPSLESQKFIWAVERMLATTPPLLYLPHRLLLHLGAPLWTQHASAWDHIFTH 11519 (1)
11438 AEERIQRGYQRLSHSQSRGPEGGGRYKGVLGQLMAKGQLSLELIKANITELMAGAVDT 11265 (0)
11178 TAVPLQFALFELGRNPEVQQRVRQQVQESWAQAGGDPQKALQEAPLLKGTIRETLR 11011 (2)
10465 LYPVGTTVQRYPVKDIVLQNYHIPAG (0) 
      TMVQACLYPLGRSAEVFEDPLRFDPGRWGKSREEGQRGGGTGFRSLAFGFGARQCVGRRIAENEMQLLLMH 10097 (0)
 9896 ILLGFDLSVPSSEDIKTMCTLILMPETPPKITFTKL* 9786

>BI880530 Zfish CYP11B EST 
IKETLR
LYPVGITVQRYPVRDIVLQNYHVPAG
TLVQVCLYPLGRSAEVFSRPECFDPSRWSADADAG
SAGGFRSLAFGFGSRQCVGRRIAENEMQLLLMH
ILRTFKLTVSSTEELSTKYTLILQPECPPRITFSTLTHQH*

>AL279350 C0BG090CD02SP1  Tetraodon nigroviridis
AL241672.1 C0BG022BA01LP1 G Tetraodon nigroviridis genomic clone 022A02 T7.
AL202930
AL298541 C0BG122BA09LP1 G Tetraodon nigroviridis
AL298542
AL241673
Lowercase = zebrafish
MXIPAGAGVRAGGAGGLRLAAGAAVDGKGAEGSGSRKGGVDGQVRSFEEIPHTGR
NSWVNLLRFWREDRFRQLHKHMERTFNALGPIYR
EHVGTQSSVNIMLPSDISELFRSEGLHPRRMTLQXWATHRETRHHSKGVFLK
NGEEWRADRLLLXKEVMMSEAVRRFLPLLDEXAKEfcrslrrrvqadgfekagqhtltldpspdlfrfale
ASCHVLYGERIGLFSSSPSLESQKFIWAVERMLATTPPLLYLPHRLLLHLGAPLWTQHASAWDHIFSH (1)
AEERIQRGYQRLSPSQSRDREGGGRYTGVLGQLMEKGQLSLELIKANITELMAGAVDT (0)
TACPLQFALFELXRNPEVQQRVRQQVQASWARAGGDPQKALQEAPLLKGTIKETLR
LYPVGTTVQRYPVKDIVLQNYHIPAG
TMVQACLYPLGRSAEVFEDPRRFDPGRWGKSREEGQRGAGTGFRSLAFGFGARQCVGRRIAENEMQLLLMH

>BI880530.1 fm76b10.x1 Zebrafish adult retina cDNA cDNA clone Length = 573
BG738320 fp57b11.y1 Zebrafish adult retina cDNA
1   MTLQPWATHRETRRHSKGVFLKNGTEWRADRLLLNREVMVSSSVHRFLPLLDE 159
160 VAQDFCRSLRRRVQADGFEKAGQHTLTLDPSPDLFRFALEASCHVLYGERIGLFSSCPSD 339
340 ESERFISAVERMLATTPPLLYLPPRLLLRLRASLWTTHATAWDDIFSHAEQRIQRSYQRL 519
    QARASAAPDCSFPGVLGKLMEAGQLSLELIQIHITEL

573 IKETLRLYPVGITVQRYPVRDIVLQNYHVPAGTLVQVCLYPLGRSAEVFSRPECFDPSRW 394
393 SADADAGSAGGFRSLAFGFGSRQCVGRRIAENEMQLLLMH 274

>BG738320.1 fp57b11.y1 Zebrafish adult retina cDNA Danio rerio cDNA clone
Length = 519

64  KNGTEWRADRLLLNREVMVSSSVHRFLPLLDEVAQDFCRSLRRRVQADGFEKAGQHTLTLDPSPDLFRFALE
ASCHVLYGERIGLFSSCPSDESERFISAVERMLATTPPLLYL 405

>CYP17A1 Scaffold_4175  complete CYP17 8 exons  = LPC12298.x1 Scaffold_2373 
Length = 23018
LPC12306.x1 
Fc:c033C05x1 Fc:c033C03x1 58% to LGS96377.x1
LPC10742.x1 LPC10550.x2 LPC.10742.x1 LPC10742.x2 Fc:c028L22x1 
80% to CYP17 zebrafish
      MDWVLFVYAFSAVNLALLALHLKFRTPASGPRGPPRLPALPLIGSLLSLRSPHPPHVLFKE
      LQGKYGQTYSLMMGSHRVIIVNHHAHAKEVLLKKGKIFAGRPRS 11667 (0)
11572 VTTDVLSRDGKDIAFGDYSATWRFHRKIVHGALCMFGEGSASIEKI 11435 (1)
10871 ICAEAASLCSILSEAWTAGLALDLSPELTRAVTNVICSLCFSSSYRRGDAEFEAMLHYSQ 10692
10691 GIVDTVAKDSLVDIFPCLQ 10635 (0)
      IFPNADLRLLKRCVSVRDKLLQKEYDKHK (0)
      AAYSDHVQRDLLDALLRAKCSAENNNTTGINAESVGLTDDHLLMT
      VGDIFGAGVETTTTVMKWAITYLIHHPQ 9765 (0)
 9637 IQSRIQEELDSRVGMDRSPQLSDRGSLPYLEATIREVLRIRPVAPLFIPHVALSDT 9473 (2)
 9369 SIGDFAVKKGTRVVINLWSLHHDEKEWENPERFDP 9265 (1)
      GRFLNSEGTGLVIPSSSYLPFGAGVRVCLGEALAK 8558
 8557 MELFLFLSWILQRFTLTVPSGHSLPSLEGKFGVVLQPTKYKVNATPRPGWEGKCKACWN* 8378

>CYP17 fragment Fc:c028I22x2 LPC.10549.x2 Length = 1007 61% to 17A1 Fugu
GAPRATTDNHHAHAKEARPKKGKKAAGRPRK
ATTDGASRDGTDTAEGDHSATRRDQR

>CYP17 fragment Fc:c028I22x1 LPC.10549.x1 Length = 894 
78% to Fc:c028I22x2 65% to 17A1 Fugu
KPRTQPEGPGGPPSRPAQRRIGSGQRKRSPHPPQGQT
KEQQAEHGQTTSRMMGSHRGNTDNQHAHAKEARQKKGKKGAGRPRK
ATTDGRSRDGTDIAEGDHSATWRHHRKIVQRARRRNGEGSAPSEKI

>CYP17A2 Scaffold_8086    = LGS96377.x1 57% to 2D9 Length = 12148
AL001511.1 cosmid 038F21 Takifugu rubripes FC:C038F21bD12
44% to LDZ12559.x1 57% to chicken CYP17 48% to scaf 4175
LED27587.x1 LED27587.y1 POSSIBLE CYP17 MID REGION
1198 MVTVGSFLIFRRPVRGSEPGSEAGPPRVKVPCISWVPVLGSLPWLRGGRPLHLIFTQLSYR 1380 (2)
1600 YGPLFALYLGPHLTVVVNNHQHAREVLLLRGKDFAGRPRM 1719 (0)
1797 VTTDLLTRGGKDIAFSDYCPLWKSHRRLVQNSFTLFGEGTSRLQDM 1934 (1)
2061 VLAAVDSLCEELLSMEGRGFDPAPAVTRAVTNVVCMLVFSATYRHGDSELQEVLRYNDGI 2240
2241 VQTIAGGGLVDIYPWMK 2291 (0)
2372 VFPNKTLSKLKACIAVRDRLLTHKLEEHK 2458 (0)
2622 ATLTDNQPRDLLDALLMGQVGRGRRKGSGRVEEDIITEDHVLMTAAEAFGAGVETTSTTLLWILAYLLHHPQ 2837 (0)
2925 VQERVQKELDDHVGSERPVRVSDRARLTYLDCVINEGMRIRPVSPVLIPHTAMTDSR 3074 (2)
3870 IGGHHISRGTRVLVNMWSIHHDSAHWDKPDLFNP 3971 (1)
4519 DRFRDHQGQRVTPSCFLPFGAGPRVCVGESLARLELFLFLSSLLQRMSFRLPNGA 4692
4693 SPPDLQGRMGVVLQPVPYKVVVTPRVG* 4776

>CYP19A1 ov Scaffold_7098 64% to LDZ38561.x1 CYP19 Length = 14029 53% to CYP19
= LGS44549.x1 like ovary CYP19 P450s
9466 MAAVGLDAEVLVSVSPNATEAESPGSSAGTRALIILTCLLLLVWSHTEKKSVP 9308 (1)
9242 SLLGPSFCLGFGPLLTYVRFIWTGIGTASNYYNKKYGDIVRVWVNGEETLVISR 9081 (2)
8985 ASAVHHVLKSRQYTSRFGSKQGLSCIGMNERGIIFNNNVTEWRKIRGYFTK 8830 (1)
8759 ALTGPAVQNTVEVCNSSTQAHLDRLEDLAQVDVLSLLRCTVVDISNRLFLDIPIN 8595 (1)
8499 EKELLLKIHKYFDTWQTVLIKPDIYFKFGWIHQKHKTAA 8392 (2)
8296 RELQEAIEGLVEQKRRDLEQADKLENINFTAELLFAQ 8186 (0)
8084 NHGELSAENVMQCVLEMVIAAPDTLSVSLFFMLLLLKQNPDVELQLLQEIDAVVGK (0 expected, bad boundary)
     RQLQNGDLQKLRVLETFINECLRFHPV 7719
7718 VDFTMRRSLSDDVIEGYRVPKGTNIILNTGHMHRTEFFLRPTEFCLQNFEKN 7563 (0)
     APRRYFQPFGSGPRACVGKHIAMVMMKSILVTLLSQYSVCPHEGLT 7327
7326 LDCLPQTNNLSQQPVEHQEEAQQLSMRFLPRQRGSWQTV* 7207

>CYP19A1 a  AF183906 zebrafish ovary type
MAGDLLQPCGMKPVRLGEAVVDLLIQRAHNGTERAQDNACGATA
TILLLLLCLLLAIRHHRPHKSHIPGPSFFFGLGPIVSYCRFIWSGIGTASNYYNSKYG
DIVRVWINGEETLILNRSSAVYHVLRKSLYTSRFGSKLGLQCIGMHEQGIIFNSNVAL
WKKVRAFYAKALTGPGLQRTMEICTTSTNSHLDDLSQLTDAQGQLDILNLLRCIVVDV
SNRLFLGVPLNEHDLLQKIHKYFDTWQTVLIKPDVYFRLDWLHKKHKRDAQELQDAIT
ALIEQKKVQLAHAEKLDHLDFTAELIFAQSHGELSAENVRQCVLEMVIAAPDTLSISL
FFMLLLLKQNPDVELKILQEMDSVLAGQSLQHSHLSKLQILESFINESLRFHPVVDFT
MRRALDDDVIEGYNVKKGTNIILNVGRMHRSEFFSKPNQFSLDNFQKNVPSRFFQPFG
SGPRSCVGKHIAMVMMKSILVALLSRFSVCPMKACTVENIPQTNNLSQQPVEEPSSLS
VQLILRNTL

>AF135851.1 Tilapia mossambica ovary cytochrome P450 aromatase mRNA, complete cds
          Length = 1783

Query: 3    GEFSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQHPDVELRIVEEL-STEGE---ENI 58
            GE SA+NV QCVLEMVIAAPDTLS+SLFFML+LLKQ+P VE ++++E+ +  GE   +N 
Sbjct: 924  GELSAENVTQCVLEMVIAAPDTLSLSLFFMLLLLKQNPHVEPQLLQEIDAVVGERQLQNQ 1103

>CYP19A2 br Scaffold_4200 Length = 23225 most of CYP19 58% to CYP19
= LDZ38561.x1 CYP19 Length = 23225 region in lower case may have seq errors
2918 MKPKETLNITASGPFTPLPLPLMMMMMMLLLMMMMLFLTWNRPQRQHVP (1)
3451 GPLFLAGLGPLLSYCRFMWTGIGTACNFYNNKYGSLVRVWINGEETLILSR (2)
3674 SSAVYHVLRSAHYTARFGSRAGLECIGMEGQGVIFNSDVQLWRRARVYFSK (1)
3960 ALTGPGLQRTVGVCVTSTAKHLDCLVDMTDASGHVDALNLLRAIVVDISNRLFLRVPLN 4136 (1)
     EKDLLTKIHNYFETWQAVLIKPDIFFKIGWLFDKHRRAA 4319 (2)
     QELQDTMAALLKVKRKLVHEAEKLDDVLDFATELILAQ (0)
     EAGEFSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQHPDVELRIVEELSTvsrt (0)
     egeENIDYQRLKVMESFINESMRFHPVVDFTMRKALEDDTIEGIRIRKGTNIILNIGLMHKTE 5039
5040 FFPKPREFSLTNFEQT (0)
     VPSRFFQPFGCGPRSCVG 5219
5220 KHIAMVMMKAILATLLSRYTVCPRHGCTLTSIRQTNNLSQQPVEDEHSLAMRFIPRTIQSPS* 5408

>CYP19A2 b  AF183908 zebrafish brain type
MMEHVVKDAVNIGAVVQGTLLLLTGTLMLILLHRIFGVKNWRNQ
SALPGPGWWLGLGPVLSYSRFLWMGIGTACNYYNEKYGSIARVWINGEETVILSKSSA
VYHVLKSNNYTGRFASAKGLQCIGMFKQGIIFNSNIAKWKKVRTYFTRALTGPGLQKS
VEVCVSATNRQLDVLQEFTDASGHVDVLNLLRCIVVDVSNRLFLRIPLNEKELLIKIH
RYFSTWQTVLIQPDIFFKLDFVYRKYHLAAKELQDEMGKLVEQKRQAINNTEKLDEMD
FATELIFAQNHDELSVDDVRQCVLEMVIAAPDTLSISLFFMLLLLKQNSAVEEQIVQE
IQSQIGSRDVESADLQKLNVLERFIKESLRYHPVVDFIMRQSLEDDYIDGYRVAKGTN
LILNIGRMHKTEFFKKPNEFSLENFENTVPSRYFQPFGCGPRACVGKHIAMVMTKAIL
VTMLSRFTVCPRHGCTISTIRQTNNLSMQPVEEDPDCLAMRFIPRAQNSNGETADNRT
SKE

>gi|13310929|gb|AF295761.2|AF295761 Oreochromis niloticus cytochrome P450 aromatase type II mRNA, complete cds
          Length = 1938

Query: 3    GEFSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQHPDVELRIVEELST----EGEENI 58
            GE SADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQ+P +EL++VEE++T    +  ENI
Sbjct: 1042 GELSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQNPAIELQLVEEMNTILNEKDVENI 1221

Query: 59   DYQRLKVMESFINESMRFHPVVDFTMRKALEDDTIEGIRIRKGTNIILNIGLMHKTE 115
            DYQ LKVMESFINES+RFHPVVDFTMRKALED+   G +I+KGTNIILN GLMHKTE
Sbjct: 1222 DYQSLKVMESFINESLRFHPVVDFTMRKALEDNDFAGTKIKKGTNIILNTGLMHKTE 1392

>CYP20 Scaffold_486 Length = 66580 59% TO CYP20 human
AL019151.1 cosmid 169N23 genomic clone 169N23aH7  Length = 601
AL019159.1 cosmid 169N23 genomic clone 169N23aB6  Length = 463
AL019161.1 cosmid 169N23 genomic clone 169N23aD3  Length = 406
MLDFAIFAVTFVIVLVGAVLYLYPSSRRASGIPGLNPTDEK
11654 DGNLQDIVGRGSLHEFLVSLHQEFGPVASFWFGSRPVVSLGSLQQLRQHINPNHST 11487 (1)
DSFETMLKSLLGYHSGGGGASTDSIIRKKVYQGAIDTTLKNNFPL 11271
LVDELVGKWKSFPEDQHTPLCAHQLVLAMKTITQLALGESFSEDARVIAFRKNHDVV
IWSEIGKGYMDGSLEKSTSRKGHYEKG 
ALSEMESTLLSVVKERKSQRNKSVFVDSLIQSTLTERQ
IMEDCMVFMLAGCAITAN
VCIWALHFLSTSEEVQDRLYKEFEEVLGSSPVSLEKIPQLR
YCQQVLNETLRTAKLTPIAARLQEVEGKVDQHLIPKE
SLVIYALGVILQDSDTWNAPYR
FDPDRFEEESVKKSFHLLGFSGSQTCPELR
FAYTVATVLLSVLVRQLKLHRLKDTLMEVRSELVSTPRDETWITFNLRN*

>CYP21 Scaffold_15 complete gene 11 exons = LDZ12559.x1 LPC12755.x1 
Length = 151128 
      MCSFLFCSSFSAPPAVKRNLPLSLWQLPLRPSSPPIPGPPCRFLIGNMTE
28230 LMHDHLPIHLTNLAKRYGNIYRLKCGNTT 28144 (1) 
28058 AMIVLNSSDIIREALVKKWSDFAGRAVSYT 27969 (1)
27888 ADIVSGGGRNISLGDYTEEWKALRRLVHGALQRCCKHSLHNVIERQALQLRK 27754 (0)
      VLVDYRGGAVDLSEDFTVAASNVIITLVFGKE (0)
      YDKSSSELQQLHRCLNEIVALWGSTWISALDTFPLLR (0)
      KFPNPVFSRLLREVSRRDEIIRKHLNQFK (0)
      CVLCCAQSEGHRRTDVITGSLLEG (0)
26896 VLTDMHVHMATVDLLIGGSETTAAWLNWTVAFLLHRPE (0) 
      FQTKVYEELCTVLEGRYPKYSDRQRLPILCSLIHEVLRLRPVAPLAVPHKAIRDS (2) 
      SIAGYFIPRNTIIIPNLFGAHHDPEVWSDPYSFKP (1)
26238 ERFLEGGGGSTRALIPFGGGARLCLGETVAKMELFLFTAYLLRDFCFVLPDSEAPLPDLR 26059
26058 GVASVVLKIKSFTVIARPRTGP* 25990

>AL281449.1 C0BG094DE12LP1 G Tetraodon nigroviridis genomic clone 094J24 T7.Length = 895
AL233853.1 C0BG007BE04XD1 G Tetraodon nigroviridis genomic clone 007I08 T7.Length = 1079
86% to Fugu CYP21
MGCIFFFFYLPFSAPPAVKRSLLQSLCGLLHRPSSPSIPGPPCRFLIGNMTE
LMQDHLPIHLTDLAKRYGNIYRLKCGNTS
AMVVLSSGDVIREALVKKWSDFAGRSVSYT
ADIVSGGGRTISLGDYTEEWKAHRRLVHSAL (frameshift) ERCXKQSLHDVIERQALQLRK
Missing exon 4 and part of exon 5
                 GSAWISALDTFPLLR
KFPNPVFSRLLREVTRRDEIIRKHLNQYK
CVLCCVQSQDNKSTDVITGSLLEG
VLTDVHVHMATVDLLIGGTETTAAWLNWTVAFLLHRPE
IQTKVYEELCTVLEGRYPKYSDRHRLPVLCSLVHEVLRLRPVAPLAVPHKAVRDS
SIAGYFIPKNTIIIPNLFGAHHDPXVWPDPYSFXX
Missing exon 11

>CYP24 Scaffold_1804 CYP24 LDZ23330.x1 LGL43929.x1 Fc:c114K05x1 LKU47931.x1 
Scaffold_4128 FS:S002393 Scaffold_2393 (N-term revised 6/2/04)
MRAQMKKAPQIVELLRKKSVGLQHFKPTSSVCVLEEKDALEAARCPHAASRAHS
LDAIPGPTNWPLVGSLFELLRKGGLTRQHEAL
VDYHKKFGKIFRLKLGSFESVHIGAPCLLESLYRTEGSYPQRLEIKPWTAYRDMRDEAYGLLIL
EGKDWQRVRRAFQQKLMKPTEVVKLDRKINE
VLEDFVSRIGKTNIGGKIEDLYFELNKWSFES
ICLVLYDKRFGLLQDKVNEEAMNFITAVKT
MMSTFGLMMVTPVELHKSLNTKTWQDHTAAWDRIFST
AKVYIDKKLKRNSVIAPDDLIGDILHQSRLSKKELYAAITELQIGGVET
8608 TANSMLWAIFNLSRNPGAQRRLLEEIRTVVPPEQDPCGEHIKSMPYLKACLKESMR 8441 (2)
7830 ISPSVPFTSRTLDKDTVLGDYAIPKG 7753
TVLMINSHALGSSEDYFDDGKKFKPERWLREHGTINPFAH
VPFGIGKRMCIGRRLAELQMSLFLQLVRDFE
IVATDNEPLDVIHSGLLVPNRELPVAFIKR

>CYP26A1 Scaffold_12575 Length = 5934 
FS:S009376 Scaffold_9377
2144 MAVSALLATFLCTIVLPLLLFLVTVKLWEVYVIRERDSACPSPLPPGTMGLPFIGETLQLILQ (0)
1689 RRKFLRMKRQKYGYIYRTHLFGNPTVRVTGANNVRHILLGEHRLVAV 1552
1551 QWPASVRTILGSDTLSNVHGAQHKTKKK 1465 (0)
1230 AIMQAFSREALEFYIPAMQHEVQAAVQEWLAKDSCVLVY
     PEMKRLMFRIAMQILLGFQLEQIKTDEQKLVEAFEEMIKNLFSLPIDMPFSGLYR 948 (0)
784  GLKARNFIHAKIEENIKRKLRESNSDSKCRDALQQLIDSSKKSGQVLSMQ 635 (0)
548  VLKESATELLFGGHETTASTATSLIMFLGLNPEVLDKLRHELSDKVMHKGF 396 (1)
329  LDLRSLNLETLEQLKYTSCVIKETLRMNPPVPGGFRVALKTFELG 195 (0)
100  GYQIPKGWNVIYSICDTHDVAEIFPNKEDFQPERFMMKNCGDSSRFQYIPFGGG 
     SRMCVGKEFAKVLLKIFLVEVVTKCHWSLLNGPPTMKTGPTVYPVDNLPTKFNTYVQN* 3998

>CYP26B1 Scaffold_4267 Scaffold_49 partial sequence 74% to 26B1 
missing exons 1 and 2 26B1 exons 1 and 2 Length = 21195 
= LGW28482.x1 Length = 622 this seq 75% to 26B1 fugu but not 26A,B or C
predict these scaffolds are from the same gene
8448  MLFDSFDLVSALATLAACLVSMALLLAVSQQLWQLRWTATRDRNCKLPMPKGSMGFPFIGETCHWLLQ 8651
17930 GSGFHASRRQKYGNVFKTHLLGRPLIRVTGAENIRKVLMGEHTLVTVDWPQSTSTLLGPNSLA 18118
18119 NSIGDIHRKKRK 18154
1487  VFAKVFSHEALESYLPKIQQVIQESLRVWSSNPEPINVYR 1606 (2)
1689  ESQRLSFTMAVRVLLGFRVSEEEMKHLFSTFQDFVDNLFSLPIDLPFSGYRK 1844 (0)
1921  GIRARDTLQKSIEKAIREKPLCSQGKDYSDALDVLMESAKENGSELTMQELK 2076 (0) exon 4
2851  ESTIELIFAAFATTASASTSLIMQLLRHPPVLERLREELRARGL exon 5,6 fused
      LHNGCLCPEGELRLDTIVSLKYLDCVIKEVLRLFTPVSGAYRTAMQTFELD 3135 (0) exon 5,6 fused
3271  GVQIPKGWSVMYSIRDTHDTSTVFKDVDVFDPDRFSQERGEDKEGRFHYLPFGGGVRSCLGKQLA exon 7
      TLFLRILAIELASTSRFELATRQFPRVITVPVVHPVDGLKVKFYGLDSNQNEIMAKSEELLGAAV* 3663 exon 7

>CYP26C1 Scaffold_11741 complete gene 7 exons Length = 7795 probable ortholog of CYP26C1
LGS143091.x1 whole equally similar to 26B1 or 26C1 but C-term 68% to 26C1 while 58% to 26B1
Lower case region is a very poor match; the exon structure may not be correct here.
Verified on FM:M003163 scaffold_3163 
6362 MLGLVSALATALTTLLLLLLLLALTRQLWSFRWSLTRDRRCELPLPKGSMGWPLVGETFQWLFQ 6171 (0)
5571 GSNFHISRRKRHGNVFKTHLLGKPLVRVTGAENIRKILLGEHSLVCTQWPQSTRIILGPN 5392
5391 ALVNSIGELHKRKRK 5347 (0)
4963 ILAKVFSRKALESYLPRLQEVIKCEIAKWCAEPGSVDVYAATRSLTFRIAIGVLLGLHL 4781
4780 EEERIDYLAQIFGQLMSNLFSLPIDAPFSGLRK (0)
3972 GIKARKILHANMEKIIEKKMERQQEEEEYRDAFDYMLSTSKEQGQQISIQELK 3814 (0)
3581 ETAVELIFAAHSTTASAATSLVLQLLHHPEVVERVRVELEAQKLcynslnlpsqa 3417 (1)
3377 ctfpqsqchasnLSLDKLNQLHYIDCVIKEVLRFLPPVSGGYRTALQTFELD 3222 (0)
2770 GYQIPKGWTVMYSIRDTHETAEIFQNPELFDPDRFVTAQVESRSSRFSYVPFGGGVR 2600
2599 SCVGKELAQIILKTLTIELIRTCKWTLATEKFPKMQTVPIVHPVNGLHVNFMYKNLHEIDH* 2414 

>Oryzias latipes EST BJ005391
LTLTPTGXXIPLGVMAETAQDAFETTCLLDQGFPKPRSHIPYLTLEKLSQLRYIDCIIKE
VLRFLPPVSGGYRTALKTFELDGYQIPKGWSVMYSIRDTHETAAVFQSPEMFDPDRFGPE
REESRASRFSYVPFGGGVRRCIGKELAQIILKTLAVELIGTCKWTLATQNFPKMQTVPIV
HPVNGLHVRFSYKNPL*

>CYP27A1 Scaffold_3437 Length = 26117 46% to 27A1 
Scaffold_7201 62% to 27A1 506-532
MSACLCVNSCGRKEAGCCGLWPGLAPAPGAQGGSPQSQRLPPISASNHGRWRTFHASASGSC
LQINVSGFHSHMHELQ (0)
136  (0) ILEKGRYGPIYRNGMNAVSVSTAKLLGEVLRNDDKFPNRGDMSIWKEYRDLRGYGYGPFTE 321 (2)
536  KDERWYNLRAVLNKRMLRPKDALQYGDTIGEVVTDFIRRIYFLRQRSPTGDVVTDLNNELYHFSLE 733 (1)
816  AIASILFETRLGCLEEEIPTGTQDFINAISQMFSNNFQVFLMPKWSRGVLPYWRRYVAGWDGIFSF 1013 (1)
1206 ATRLIDRKMEFIQQHLDNNQNVEGEYLTYLLSNTQMSIKDVYGSVSELLLAGVDT 1370 (0)
1465 TSNTLTWTLHLLSKYPQCQEILFKEVSTSVPADRAPSAEEVTRMPYLRAVVKESLR 1632 (2)
1756 MFPVIPMNGRILADKDVMIGGYQFSKN 1836 (0)
1949 TAFNFSHYAIGRDEDTFPEPATFMPERWLQDSHNRPNAFGAIAFGFGVRGCVGRRIAELEMYSFLCH 2149 (0)
2308 LMRHFEIKPDPKMGELKSVCRTVLIPDKPVSLRFLDRGSGHAA* 2439

>FS_CONTIG_4992_1 Tetraodon seq that matches the last exon of CYP27A1
          Length = 11585

 Score = 76.1 bits (184), Expect = 3e-14
 Identities = 34/43 (79%), Positives = 39/43 (90%)
 Frame = -3

Query: 1    LMRHFEIKPDPKMGELKSVCRTLLIPSKPINLRFLRRGSGHAA 43
            LMRHFEIKPDP+MGELKS+CRT+LIP KP++L FL R SGHAA
Sbjct: 4179 LMRHFEIKPDPQMGELKSICRTVLIPDKPLSLYFLDRRSGHAA 4051

>CYP27A2P Scaffold_697 both Fugu and Tetraodon have defects in exon 3
Length = 61067 = Scaffold_6851 53% to 27A1 mouse
THIS SEQ NOT PRESENT IN THE MAYFOLDS
29906 MASFTALRCAAIGARNSALRPATLPSRNLNLQATSEAANLKGIADLPGPNTYKILYWLFVKGYGERSHLLQ 30118 (0)
30735 GKLKNIYGPMWRWKLGPYDFVSVASPELIARVIQQEGRYPVRVQLPHWKEYRDLRGQAYGLHVE 30926 (2)
31027 TGPEWSRLRSALKPRMLKLREV 
      42 aa of exon 3 missing in region of poor seq and a small sequence gap (1)
31334 GISAILFETRLGCLGEKVDPNVQRFISGVNDMLSLSDITYLFPRWTRSFVPVWKRFAQAWDDISDV 31531 (1)
31614 ASSLIDRRIAEIDARVANGQSVEGLYLTYLLSSDKMSRAEISTCITDLLLGGVDT 31778 (0)
32763 TSNTLSWALYHLAKDPVAQDRLYDEVNSVCPNHHQPTTDDLANMPFLKAVIKEVLR 32930 (2)
33004 LYPVVHQNARFISENDVILNDYWFPKK (0)
      TQFHLCHYSVCHDETQFKHAERFLPERWLRHSAPLSGYYQHHPYSFIPFGVGVRACVGKRVAELEMYFALTR (0)  
gnl|ti|438108935 ACAX94560.b1 trace archive for exon 8
    54 RTLLIPSKPINLRFLRRPGEQRC* 125 
gnl|ti|438306768 name:ACAX184598.y1 trace archive for exon 9
The last exon is missing the first 20 aa

Tetraodon hit also seems to be lacking the rest of this exon
>FS_CONTIG_506_1
          Length = 39073

Query: 1    TGPEWSRLRSALKPRMLKLREVVSCGR------WWP----------YPPMNHEVDGDL-- 42
            TGPEWSRLRSAL P+MLKL+EV +         W P           P   H +   L  
Sbjct: 9246 TGPEWSRLRSALNPKMLKLQEVATFAPVVHSVVWGPAAAP*VPAKRQPGRRHRLGCGL*A 9425

Query: 43   ----LDD*VTRNGTMDGYR-YKYGSNGEL--GISAILFETRLGCLGEKVDPNVQRFISGV 95
                L     R    D ++  ++  NG+   GISAILFETRLGCLG+KVDPNVQRFI+ +
Sbjct: 9426 VQVWL*RWKLRRRFQDVFKNIEFLGNGDSFPGISAILFETRLGCLGQKVDPNVQRFITAI 9605

Query: 96   NDMLSLSDITYLFPRWTRSFVPVWKRFAQAWDDISDV 132
             DMLS SD  YL PRW RS VPVWKRF QAWDDISDV
Sbjct: 9606 GDMLSTSDFAYLVPRWARSLVPVWKRFVQAWDDISDV 9716

>Untitled frame1
 TGPEWSRLRSALKPRMLKLREVVALSPDESRG*WGPSGRLSYSKRHHGRVPIQIWLEWRV
 GLTVEAGVPKTXXXXXXXXXXXXXXXXXSLLT*RFECFFSSCRNLSNPV*NQAGLPGREG
 RSKRSALHQWRQRHAFSLRHHLPLPSVDPQLRPCVETLRPGLGRHLRR
>Untitled frame2
 RDLSGPDSAAPSNPEC*SCGRWWPYPPMNHEVDGDLLDD*VTRNGTMDGYRYKYGSNGEL
 D*QWKPAFRRXXXXXXXXXXXXXXXXXEVS*LNVLNVSFLLAGISAILFETRLGCLGEKV
 DPNVQRFISGVNDMLSLSDITYLFPRWTRSFVPVWKRFAQAWDDISDV
>Untitled frame3
 GT*VVQTPQRPQTQNAEVAGGGGLIPR*ITRLMGTFWTIELLETAPWTGTDTNMARMESW
 IDSGSRRSEDXXXXXXXXXXXXXXXXXKSPNLTF*MFLFFLQESQQSCLKPGWAAWERRS
 IQTFSASSVASTTCFLSPTSPTSSLGGPAASSLCGNASPRPGTTSQT

>FM:S000138_Ranges_30104_30608 rev complement
ACGGGACCTGAGTGGTCCAGACTCCGCAGCGCCCTCAAACCCAGAATGCTGAAG TTGCGGGAGGTGGTGGCCTTATCCCCCGATGAATCACGAGGTTGATGGGGACCTTCTGGACGATTGAGTTACTCGAAACGGCACCATGGACGGGTACCGATACAAATATGGCTCGAATGGAGAGTTGGATTGACAGTGGAAGCCGGCGTTCCGAAGACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGAAGTCTCCTAACTTAACGTTTTGAATGTTTCTTTTCTTCTTGCA
GGAATCTCAGCAATCCTGTTTGAAACCAGGCTGGGCTGCCTGGGAGAGAAGGTCGATCCAAACGTTCAGCGCTTCATCAGTGGCGTCAACGACATGCTTTCTCTCTCCGACATCACCTACCTCTTCCCTCGGTGGACCCGCAGCTTCGTCCCTGTGTGGAAACGCTTCGCCCAGGCCTGGGACGACATCTCAGACGTT
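
The three "Untitled frame" translations above can be reproduced from the
nucleotide sequence with a short script. This is a minimal sketch, assuming the
standard genetic code; translate and reverse_complement are hypothetical helper
names, and any codon containing N is reported as X, as in the listing above.

```python
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (N stays N)."""
    return dna.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(dna, frame=1):
    """Translate DNA in forward frame 1, 2 or 3; unknown codons become X."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    residues = []
    for i in range(frame - 1, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


# First 54 nt of the reverse-complemented range shown above.
start = "ACGGGACCTGAGTGGTCCAGACTCCGCAGCGCCCTCAAACCCAGAATGCTGAAG"
for frame in (1, 2, 3):
    print(frame, translate(start, frame))
```

Frame 1 of this fragment begins TGPEWSRLRSALKPRMLK, which is the start of the
exon 3 sequence listed for CYP27A2P.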


>gnl|ti|119676116 name:MBF444348.y1.gz mate:119676115 mate_name:MBF444348.x1.gz template:MBF444348 end:F
AATAGACTTCGAGCTCGGTACCTGAAGTAGGAGGCTGTCTTTGGTCTGGAAAGTATTCATATTCTGTTCT
TTATAGAACCGGTGTTGAAAAACAAGGAAGAAAAGGTTTAAACCAGTTCTATTAAATAAGATAACGGATG
CAGAATCCCCCTCCCCTCCCCTCCCCTCCCCTCCATGGTGCTGCTGCAAGTTCTCTGGAACTTGAGCAAC
AGTTTTCCTGGTTAACTGCCCTCATCTTCTTTTGAACGTCCCTTAGGTTGCATCATTTGCGTTGTGTTGT
CATTGGTGGAGGTTAAAGAGAAAAAAAAAAGAAACTTAAGTAAAGTGGCTGCTACGTTGCTACGTAAACT
CACATGTTTTGCAGGGTAAACTGAAGAATATCTACGGCCCCATGTGGCGCTGGAAGTTGGGCCCTTATGA
CTTTGTCAGTGTTGCATCTCCAGAGCTCATTGCTCGGGTGATCCAGCAGGAGGGACGCTACCCAGTTCGA
GTCCAGCTGCCTCACTGGAAGGAGTACCGGGACCTGAGGGGGCAGGCTTATGGTCTGCATGTGGAGTGAG
TGGATGCTGCTGTATACAGGAGTTCCTCAGGGCTCCATCGTGGCTCCCTTTCAGGTAAGTTACCTAAAGT
GTCCTGATCTCCATTGACCGTAGGACGGGACCTGAGTGGTCCAGACTCCGCAGCGCCCTCAAACCCAGAA
TGCTGAAGTTGCGGGAGGTGGTGGCCTTATCCCCCGATGAATCACGAGGTTGATGGGGACCTTCTGGACG
ATTGAGTTACTCGAAACGGCACCATGGACGGGTACCGATACAAATATGGCTCGAATGGAGAGTTGGATTG
ACAGTGGAAGCCGGCGTTCCGAAGAC

>CYP27A3 Scaffold_6002 complete Length = 16767 missing 7 aa
= LGS139924.x1 57% to 27A1 I-helix LGS125183.x1 Cyp27a1
FS:S002565 Scaffold_2565
Fe:eCA589520 EST fills in 7aa seq gap at beginning of exon 3
First 16 aa and 49-81 supported by EST from AU050037 Paralichthys olivaceus
49 on also supported by AW343479 zebrafish EST
LPC42076.x1 39% to 27A1 202-276 79% to LPC42075.x1 Exon 4
     MFRNRLLTVGLRASVPHREGLHRTAVNYAGARRRHASSAATEITEHNVR
     QKTMEDLGGPSFLTTLNWLFLKGYLPKTQQMQ (0)
4619 VEHSKIYGPLWKSKYGPMVVVNVASADLIEQVLRQEGRHPVRTDMPHWRRYRALRNQAHGPLTE (2) 6106 
5422 GMGAKWQRIRSILNPRMLKPQHVSSYGITINDVVTDFLEKLVWLRAKDGGGVMVNDVAGELYKFAFE (1) 5249
4825 GISSVLFESRMGCLNDEVPEETQKFIYSVGEMFRLSAVVVLFPQSVWPYLPLWKKFVAAWDYLFKV (1) 4604
1891 AEQMVQKKMEEIQNKVDLHQDVEGAYLTHLLLSEKMTVTEILGSITELLLAGVDT 1715
1661 TSNTISWALYQLAQNPSIQDQLYHEVRSVCPGNKMPDSDDIAQMPYLKAVIRETLR
     LYPVVPGNARVTVDKEIVVGGYLFPKQ (0)
754  TLFHLCHYCVSHDENIFPNSRVFQPERWLRGREEKSKQHPFGSVPFGFGVRACLGRRV 581
580  AELEMYLLLSR (0)
     LIRRFEVRPDPNGAEVKPITRTLLCPATPINLQFLDRGAQRAPGPAAGASL*

>FS_CONTIG_7001_1 Tetraodon seq that matches the last exon of CYP27A3
          Length = 7773

6207 LIRRFEVRPDPAGTEVKPITRTLLCPAKPINLQFLDRRS 6323

>CYP27A fragment b exon 4 
LPC42075.x1 35% to 27A1 79% to LPC42076.x1 Exon 4
GISSVLFESRLGCLNDEVPEETRRVIYSVGEMCRLSAVVVLFPQSGWPYLPVWARLGAAGDYLLQFGEQL

>AL239517.1 C0BG017DC09LP1 G Tetraodon nigroviridis genomic clone 017F18 T7.
Length = 888 81% to 27A fragment a above also same seq = AL187075.1, AL219383.1
Only 64% to fragment b above, so frag a is the best candidate for the 27A3 gene
GICSVLFENRMGCLNEEVPEETQKFIFSVGEMFRLSPLVVLFPKYMWPYLPFWKQFVQAWDYLFKV 279

>AL187075.1 C0AG247CB12SP1 G Tetraodon nigroviridis genomic clone 247D23
Length = 905
    VMVHDVAGELYKFAFE 816 part of exon 3
471 ICSVLFENRMGCLNEEVPEETQKFIFSVGEMFRLSPLVVLFPKYMWPYLPFWKQFVQAWDYLFKV 277

>gnl|ti|15671213 zfishG-a1723a03.q1c Length = 613 like 27A1
559 MGPEWQRIRSILNPRMLTPKHVSNYTNAINGVVSDFIEKMAKLKTTKGNDVMVYDVAGELYKFAFE exon 3
    GIT start of exon 4 SVLFRVPHRLL continuation of Xenopus BJ043405

>Xenopus BG731192
    TRGPKESMVYAESMNQVVSDLLVKIKEITAQSSSGTTVNGVADLMYRFAFE end of exon 3
    SICTVLFETRIGCLNKEILPETQKFIDSIGNMLKYLTVVMRLPQWTKGILPYWGRYIEAWDTIFEY exon 4
    GRKLIDNKMKEIDDRLKRGEEVEGEYLTYLLSSGKLSMKEIYGSV exon 5

>gi|6839845|gb|AW343479.1|AW343479 fi80d01.y1 Sugano Kawakami zebrafish DRA Danio rerio cDNA 
TTSTLIGGDKQKTMDDLDGPSFLTSIYWLFGKGYFQTTHQMQ part of exon 1
IEHSKIYGPLWKSKYGPLVIVNVASAHLIKQVLRQEGRHPIHTDMPHWRGYLILRNHAYGPLTE exon 2
MGPEWQRIRSILNPRMLKP*HVSDYNNTINGVLSDCIENM exon 3

>AU050037.1 Paralichthys olivaceus (Japanese flounder) 
cDNA clone WC3-8.Length = 802
MLRRNLTSVGLRLNRP     32 amino acid gap      KLKTMDELHGPSLWSNLYWLFVKGYFDTTQQLQ exon 1
IEHRKIYGPLWKSVYGPLIVVNVAQVELIEQVLRQEGKHPIRTDMPHWRLYREMKNQAHGPLTG exon 2
MGANWQRIRSFLNPRMLTPKHVSNYTNAINGVVSDFIEKMAKLKTTKNDVMVYD

>CYP27B1 Scaffold_470  complete gene 9 exons  52% to 27B1 Length = 67430
      MLQQALRVSCRSASPLVKWMERWAECASARPQAVKPLGDMPGPSVASFAWDLFAKRGLSRLHELQ (0)
      LEGVRRYGPMWKASFGPILTVHVADPALIEQVLRKEGQHPMRSDLSSWKDYRRLRGHHYGLLTS (2)
51430 EGEEWQSIRSLLGKHMLRPKAVEAYDQTLNSVVDDLITKLRLRRSSQGLVTDIASEFYRFGLE 51630 (1)
51726 GVSSVLFESRIGCLDKIVPEETERFIQCINTMFVMTLLTMAMPSWMHQLFPKPWNVFCQCWDYMFDF (1)
      AKGHIDQRMAAEAEKIARGEEVEGRYLTYFLSRTSLPMKTVYSNVTELLLAGVDT (0)
52280 ISSTLSWSLYELSRHQAVQASLREEVLSVLGGRRVPTAADVAQMPLLKATIKEVLR 52444 (2)
52527 LYPVIPANARVITERDIQVGGYLIPKN 52610 (0)
52697 TLITLCHYATSRDPAVFPRPDEFLPQRWLNKEQSHHPYASVPFGVGKRSCIGRRIAELELYLAVAR 52894 (0) 
53237 ILLEFDIKPDPEGISVKPMTRTLLVPENVINLQFTER* 53347

>CYP27C1 Scaffold_1410  Length = 43243 71% to 27C1 Scaffold_2221 75% to 27C1
= LGW72125.x1 (this covers part of the I helix not seen in scaf. 1410)
lower case vdt is best guess based on related sequences
Phase 0 is based on AG at 23002-23003
      MSVMNKLTTTCWTNFYGDRRNKQMLFVLRCLHKSATSGTFGVAREEPLPERLITSTDATKKRLPKT
19233 LAEMPGPGTISNLFEFFWRDGFSRIHEIQ 19319 (0)
20076 IEHSKMYGKIFKSRFGPQLVVSVADRDLVAEVLRAE
      GVAPQRANMESWHEYRDMRRRSTGLISA 20261 (2)
      EGEDWLRMRSVLRQLIMRPRDVAVFSDDVSEVVDEVVDDLIKR
      IVCLRSQSSDGTTICNINDLFFKYAME (1)
20741 GIAAILYECRLGCLSQKIPQETEDYIDALHLMFSSFKTTMYAGAIPKWLRPV 20911
20912 FPKPWEEFCDSWDGLFRF 20968 (1)
      SVHVDKRLKQIESQLQRGEKVTGGLLTYMLVAKEMSVEEIYANVTEMLLAGvdt (0)
23004 TSFTLSWASYLLARHPDVQQQIHAEVMRVLGSEKVATAEDVQHLPFIRGLVKETLR 23183 (2)
      LFPVLPGNGRITQDDMVLGGYFIPKG
23623 TQLALCHYSTSLDDENFPSSLEFRPDRWIRKHSSDRLDNFGSIPFGYGIRSCIGKRI 23793
23794 AELEMHLALIR (0)
23998 IIQKFHVCVSPLTTDVKAKTHGLLCPGAPINLQFIDREI* 24117

Note: this scaffold has been replaced by Scaffold 106
27C1 is on the minus strand from 31916-36680
Neighboring gene 2624 bp away is ERCC3 at 39305-43553 minus strand (also found in human 39 kb away)
The mouse and probably rat have lost 27C1 due to a chromosome rearrangement between BIN1 and ERCC3.
It seems that 27C1 was broken and lost in this event.
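
The 2624 bp spacing quoted above follows from the two coordinate ranges. This is
a minimal sketch, assuming both ranges are inclusive and on the same scaffold;
gap_between is a hypothetical helper name.

```python
def gap_between(first, second):
    """Bases strictly between two non-overlapping inclusive (start, end) ranges."""
    (start1, end1), (start2, end2) = sorted([first, second])
    if start2 <= end1:
        raise ValueError("ranges overlap")
    return start2 - end1 - 1


cyp27c1 = (31916, 36680)  # 27C1 on Scaffold 106, minus strand
ercc3 = (39305, 43553)    # neighboring gene, minus strand

print(gap_between(cyp27c1, ercc3))
```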

>54. CYP39A1 human AC008104 AL035670
MELISPTVIIILGCLALFLLLQRKNLRRPPCIKGWIPWIGVGFEFGKAPLEFIEKARIK
YGPIFTVFAMGNRMTFVTEEEGINVFLKSKKVDFELAVQNIVYRT
ASIPKNVFLALHEKLYIMLKGKMGTVNLHQFTGQLTEELHEQLENLGTHGTMDLNNLVR
HLLYPVTVNMLFNKSLFSTNKKKIKEFHQYFQVYDEDFEYGSQLPECLLR 
NWSKSKKWFLELFEKNIPDIKACKSAKDNSM 
TLLQATLDIVETETSKENSPNYGLLLLWASLSNAVP
VAFWTLAYVLSHPDIHKAIMEGISSVFGKAG
KDKIKVSEDDLENLLLIKWCVLETIRLKAPGVITRKVVKPVEIL
NYIIPSGDLLMLSPFWLHRNPKYFPEPELFKPERW
EKGKFRRKHSFLGTASWAFGAGSSQCPGKV
FALLEVQMCIILILYKYDCSLLDPLPKQ
SYLHLVGVPQPEGQCRIEYKQRI

>CYP46 Scaffold_4537 Scaffold_7057 60% to CYP46 LKB67200.x1 LGP3798.x1
LGW51796.y1 LGW51796.x1 LKG46617.x1 LKG481383.x1 LPC42087.y1 human seq is lower case.
It seems that there may be two different CYP46 genes.
Scaffold_14704 first two exons appear after the rest of seq on this contig.
Either it is assembled incorrectly or there are two genes adjacent on
this contig and both are partial but cover a whole gene.
Scaffold_13905
Scaffold_1583
MGVFNLIFGWISQASIFLLLLLFIALLGYCMYIKYTHMKYDHIPGPPRDS (2) 
FFSGHSSKLLDIMKDDGVVHDMFLKW (2) 
AETYGPVYKIYFLHHVMVFVSCPETTK (0)
EMLMSPKYTKDKFLHNRIGSLFGQR (2)
FLGNGLVTVRDHEKWYKQRRIMDPAFSSL (2)
YLRSLMGNFNETADKLMDKLSEIADNKTTANMLHLVNCVTMEVLAK (0)
VAFGVDLDLLRKSSPFPRAVELCLKGMVFSIRDTFFM (0)
LNPKNWSFIREVRGACRLLRQTGAQWIQQRKTAMRNGEVPKDILTQIIKSAGK (1) 1712
2446 EEIMTQEDEEFMLDNFLTFFIA (1)
GQETTANQLGFCIMELGRHPDILER (2)
VKKEVDEAIGMKQDISYDDLGHLGYLSQ (0)
VLKETLRLYPTAPGTSRDLKEDMVIGGVHVPGGVVCV (0)
FSSYGMGRMETFFKDPLKFDPDRFDPDAPK (2)
PYYCYFPFSLGPRSCLGQNFAQ (0)
MEAKVVMAKLIQRFDFTLLPGQSFDILDNGTLRPKSGVLCSLRHRDHKK*

>CYP46 FS:S000256 Scaffold_256
149796 MGVFNLIFGWISQASIFLLLLLFIALLGYCMYIKYTHMKYDHIPGPPRD 149942
150055 SFFFGHTSKILEIMKDDGVVHDLFLKW 150135 this exon 2 differs from above version of cyp46
150400 AETYGPVYKIYFLHHVMVFVSCPETTK 150480 exon 3
150645 EMLMSPKYTKDKFLHNRIGSLFGQR 150719 exon 4
150815 FLGNGLVTVRDHEKWYKQR 150871 ALTERNATE EXON 5 partial
151332 AETYGPVYKINFMHHVMVFVSCPETTK 151412 ALTERNATE EXON 3
151577 EMLMSPKYTKDKFLHNRIGSLFGQR 151651 ALTERNATE EXON 4
151747 FLGNGLVTVRDHEKWYKQRRIMDPAFSSL 151833 exon 5
151976 YLRSLMGNFNETADKLMDKLSEIADNKTTANMLHLVNCVTMEVLAK 152113 exon 6
152200 VAFGVDLDLLRKSSPFPRAVELCLKGMVFSIRDTFFM 152310 exon 7
152391 LNPKNWSFIREVRGACRLLRQTGAQWIQQRKTAMRNGEVPKDILTQIIKSAGK 152549 exon 8
153283 EEIMTQEDEELMLDNFLTFFI 153345 exon 9 1 diff
153423 GQETTANQLGFCIMELGRHPDILER 153497 exon 10
153572 VKKEVDEAIGMKQDISYDDLGHLGYLSQ 153655 exon 11
153752 VLKETLRLYPTAPGTSRDLKEDMVIGGVHVPGGVVCV 153862 exon 12
154027 FSSYGMGRMETFFKDPLKFDPDRFDPDAPK 154116 exon 13
154195 PYYCFSSYGMGRMETFFKDPLKFDPDRFDPDAPK 154260 exon 14
154357 MEAKVVMAKLIQRFDFTLLPGQSFDILDNGTLRPKSGVLCSLRHRDHKK 154503 exon 15
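
Exon lines of the form "start SEQUENCE end" can be sanity-checked by comparing
the nucleotide span with three times the residue count. This is a minimal
sketch, assuming the coordinates bracket whole codons only (as they appear to in
this listing); check_exon_line is a hypothetical helper name, and a mismatch
only marks a line to recheck by hand.

```python
import re

# start coordinate, one-letter protein sequence, end coordinate
EXON_LINE = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s+(\d+)\b")


def check_exon_line(line):
    """Return (residues, span_nt, consistent), or None if the line has no coordinates."""
    match = EXON_LINE.match(line)
    if match is None:
        return None
    start, seq, end = int(match.group(1)), match.group(2), int(match.group(3))
    span_nt = abs(end - start) + 1  # abs() also covers minus-strand listings
    return len(seq), span_nt, span_nt == 3 * len(seq)


lines = [
    "149796 MGVFNLIFGWISQASIFLLLLLFIALLGYCMYIKYTHMKYDHIPGPPRD 149942",
    "154027 FSSYGMGRMETFFKDPLKFDPDRFDPDAPK 154116 exon 13",
    "154195 PYYCFSSYGMGRMETFFKDPLKFDPDRFDPDAPK 154260 exon 14",
]
for line in lines:
    print(check_exon_line(line), line.split()[0])
```

Under this check the exon 14 line reports 34 residues against a 66-nucleotide
span, so that entry may merit a second look.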

>CYP46A2P FS:S000256 Scaffold_256
155374 MGVFNLIFGWISQASIFLLLLLFIALLGYCMYIKYTHMKYDHIPGPPRD 155520
155635 SFFSGHSSKLLDIMKDDGVVHDMFLKW 155715
155962 AETYGPVYKINFMHHVMVFVSCPETTK 156042

MISSING MIDDLE OF GENE

156393 FSSMEWQMETFFK 156431 frameshift
156431 DPLKFDPDRFDPDAPK 156478 
156557 PYYCYFPFSLGPRSCLGQNFAQ 156632
156719 MEAKVVMAKLIQRFDFTLLPGQSFDILDNGALRPKSGVLCSLRHRDHRK 156865

>CYP46A3P FS:S000256 Scaffold_256 aa 458-506
148912 MEAKVVMAKLIQRFDFTLLPGQSFDILDTGALRPKSGVLCSLRHRDHKK 149058

>AL237943 C0BG014DG03LP1 G Tetraodon nigroviridis genomic clone 014N06 T7. Length = 936
N-terminal of CYP46 71% to Fugu CYP46
MGICNSIFSWMSQFFIVLLFLLFIALLGYCLYVKYVHLKYDHIPGPPRDR (2) aggt 319
agc (2) FLFGHSSTLVEIMKRNGVVHDKFLEW (2) tggt

>AL309772 C0AA037CG04A1 A Tetraodon nigroviridis genomic clone 037N07.
Length = 898 C-terminal of CYP46 357-489 84% to Fugu CYP46
LKXXLRIYPTAPGTXRDLVXDMVIGXVHVPKGVICI (0)
FSSYTMGRMEKXFKDPLKFDPDRFXPDAPK (2)
PYYCYFPFALGSRSCLGQNFAQ (0)
MEAKVVMAKLIQRFDFTLLPGQSFDILDNGTLRPKSGVVCSIRHR (2)

>CYP51 Scaffold_437 Length = 70160 CYP51 complete boundaries need checking
LGS77512.y1 Length = 611 98-208 100% match to scaffold 437
LGW124224.y1 LKH45638.y1 CYP51 85% to mouse 51 309-397
67511 MSAHLYEMSSKLIGDTVGRVHD
67445 NLTTVVLAASFITLSLGYVSKLLLRQSFVTDA (0) 
      KHPPYI 67266
67265 PSCIPFLGHAISFGKSPIEFLENAYEK 67185
66989 YGPVFSFTMVGSTFTYLLGSDAAALLFNSKNEDLNAEDVYSRLTTPVFGKGVAYDVPNP 66813
      IFLEQKKMLKTGLNIAHFKEHVKIIEAETRE 66633
66632 YFQRWGDSGER 66600
66525 DLFEALSELIILTASSCLHGKEIRSMLDERVAQLYADLDGGFTHAA 66388
      WLLPGWLPLPSF 
66277 RKRDRAHREIKKIFFKVIEKRRRSGENTDDILQTLVDATYK
      DGRPLSDDEIGGMLIGLLLAGQHTSSTTSAWMGFFMARDRRLQ 65918
65917 ERCYAEQKAACGEDLPPLSFDQ 65852
      LKDLSLLEGCLKETLRLRPPIMTMMRMARSPQ
      TAAGYTIPVGHQVCVSPTVNHRLGDAWEQRLEFKPDRYLDDNPAAGEKFAYI 64964
64963 PFGAG
      RHRCIGENFAYVQIKTIWSTLLRMYEF 64784
64783 DLVDGYFPTINYTTMIHTPHNPVIRYKRRHE* 64688

N-term exon supported by Zebrafish EST

>gi|12355064|gb|BF937808.1|BF937808 fm68c05.y1 Zebrafish adult retina cDNA Danio rerio cDNA clone
           4200561 5' similar to SW:CP51_PIG O46420 CYTOCHROME P450
           51 ;.
          Length = 656

Query: 5   LYEMSSKLIGDTVGRVHDNLTTVVLAASFITLSLGYVSKLLLRQSFVTDAKHPPYIPSCI 64
           + E+ S+LI   V ++  +LT+V+L AS  TL+LGY+SKLL  Q      K+PP+IPS +
Sbjct: 78  ILEVGSQLIESAVLQM--SLTSVLLTASVFTLTLGYISKLLFTQHSSEHTKYPPHIPSSL 251

Query: 65  PFLGHAISFGKSPIEFLENAYEK 87
           PFLG A++FG+SPIEFLE AYE+
Sbjct: 252 PFLGQAVAFGRSPIEFLEKAYEQ 320

MTILEVGSQLIESAVLQMSLTSVLLTASVFTLTLGYISKLLFTQHSSEHTKYPPHIPSSLPFLGQAVAFGRSP
IEFLEKAYEQYGPVVSFTMVGKTFTYLLGSDAAALMFNSKNEDLNAEDVYARLTTPVFGKGVAYDVPHPLFLE
QKKMLETGLNIAQFKQHVEIIEEETKEYFRRLGESGERNLVDALSELII