May 23, 2001


CYP26C1 is complete. CYP27C1 is complete. There is a new CYP4A22 sequence
that is 95% identical to 4A11. Three genomic sequences agree, but all ESTs are
like the 4A11 sequence and not like this new genomic sequence. Therefore, there
seem to be two different genes, 4A11 and the new closely related 4A22 sequence.

The human CYP26C1 gene is now complete. There are no mouse or human ESTs for it
in GenBank, even though GenBank holds 3.5 million human ESTs and 2 million mouse
ESTs, so this is either a rare transcript or one with a long 3' untranslated
region that places the ESTs in the non-coding part of the gene. The 2888 bp
downstream of the stop codon do not match any human ESTs. A bovine EST covers
the first 133 amino acids and is 95% identical, suggesting the protein is highly
conserved in mammals that diverged 80-100 million years ago. This may be a
developmental gene expressed only briefly or in a limited range of tissues.

CYP26C1    human
           GenEMBL AL358613.11 May 2, 2001
           522 amino acids, 6 exons, (0) = phase 0 intron
           52% to 26B1 human, also 15 amino acid insertion in exon 5 vs. 26B1
MFPWGLSCLSVLGAAGTALLCAGLLLSLAQHLWTLRWMLSRDRASTLPLPKGSMGWPFFGETLHWLVQ (0)
GSRFHSSRRERYGTVFKTHLLGRPVIRVSGAENVRTILLGEHRLVRSQWPQSAHILLGSHTLLGAVGEPHRRRRK (0)
VLARVFSRAALERYVPRLQGALRHEVRSWCAAGGPVSVYDASKALTFRMAARILLGLRL
DEAQCATLARTFEQLVENLFSLPLDVPFSGLRK (0)
GIRARDQLHRHLEGAISEKLHEDKAAEPGDALDLIIHSARELGHEPSMQELK (0)
ESAVELLFAAFFTTASASTSLVLLLLQHPAAIAKIREELVAQGLGRACGCAPGAAGGSEGPPPD
CGCEPDLSLAALGRLRYVDCVVKEVLRLLPPVSGGYRTALRTFELD (0)
GYQIPKGWSVMYSIRDTHETAAVYRSPPEGFDPERFGAAREDSRGASSRLHYIPFGGGARSCLG
QELAQAVLQLLAVELVRTARWELATPAFPAMQTVPIVHPVDGLRLFFHPLTPSVAGNGLCL*
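
The annotated layout used above (exon text, optionally wrapped over several
lines, with an intron phase marker such as "(0)" closing each exon except the
last) can be read back mechanically. The following is a minimal sketch, not the
procedure used to build the entry: the function name and the toy input are
hypothetical, and it assumes every phase marker sits at the end of a line.

```python
import re

# A phase marker such as "(0)", "(1)" or "(2)" at the end of a line.
PHASE_MARK = re.compile(r"\((\d)\)\s*$")


def parse_annotated_peptide(lines):
    """Return (protein, phases) from exon lines with optional phase markers.

    A line without a marker is treated as a continuation of the current
    exon, so wrapped exons are joined rather than counted twice.
    """
    residues = []
    phases = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = PHASE_MARK.search(line)
        if match:
            phases.append(int(match.group(1)))
            line = line[:match.start()].strip()
        # Drop the trailing "*" that marks the stop codon.
        residues.append(line.rstrip("*"))
    return "".join(residues), phases


# Hypothetical toy input, not a real sequence.
toy = ["MAAA (0)", "CCCC", "DDDD (2)", "EEEE*"]
protein, phases = parse_annotated_peptide(toy)
exon_count = len(phases) + 1  # one more exon than there are introns
print(protein, len(protein), phases, exon_count)
```

Applied to a full entry, the residue count and exon count returned this way can
be compared with the figures stated in the entry header.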

CYP27C1 AC027142 43% identical to 27A1 assembled gene
The exon starting with QIH and ending in VDT is from Celera's public data
CRA_Gene|hCG42613 /len=10487. This Celera sequence is still missing the C-terminus.
The probable last exon is now found in AC027142. The AG intron boundary is in the
same location as in CYP26B1. The stop codon is one codon away from 26B1's stop codon.
Length is preserved from the Cys to the intron. (n) = intron phase, 9 exons

  1  85452 MQTSAMALLARILRAGLRPAPERGGLLGGGAPRRPQPAGARLPAGARAEDKGAGRPGSPPG 85634 61
 62  85635 GGRAEGPRSLAAMPGPRTLANLAEFFCRDGFSRIHEIQ (0) 85748 99
100  39574 QKHTREYGKIFKSHFGPQFVVSIADRDMVAQVLRAEGAAPQRANMESWREYRDLRGRATGLISA (2) 39371 163
164  43984 EGEQWLKMRSVLRQRILKPKDVAIYSGEVNQVIADLIKRIYLLRSQAEDGETVTNVNDLFFKYSME (1) 43787 229
230  41743 GVATILYESRLGCLENSIPQLTVEYIEALELMFSMFKTSMYAGAIPRWLRPFIPKPWREFC 41564 290
291  41563 RSWDGLFKFS (1) 41534 300
301        QIHVDNKLRDIQYQMDRGRRVSGGLLTYLFLSQALTLQEIYANVTEMLLAGVDT (0) 354 (Celera sequence)
355 110201 TSFTLSWTVYLLARHPEVQQTVYREIVKNLGERHVPTAADVPKVPLVRALLKETLR (2) 110034 410
411 108566 LFPVLPGNGRVTQEDLVIGGYLIPKG (0) 108489 436
437 108006 TQLALCHYATSYQDENFPRAKEFRPERWLRKGDLDRVDNFGSIPFGHGVRSCIGRRIAELEIHLVVIQ (0) 107794 504
505 102503 LLQHFEIKTSSQTNAVHAKTHGLLTPGGPIHVRFVNRK* 102619 542
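
A quick arithmetic check on rows of this table is to compare the number of
residues listed with the number of codons in the genomic span. The sketch below
assumes the columns are first residue number, genomic start, sequence, genomic
end and last residue number, and that the genomic coordinates are inclusive;
both are readings of the layout, not stated in the table. The function names
are hypothetical. A row where the two counts differ is only flagged for a
closer look, not treated as an error.

```python
def codons_in_span(genomic_start, genomic_end):
    """Whole codons and leftover bases in an inclusive span, either strand."""
    span = abs(genomic_end - genomic_start) + 1
    return span // 3, span % 3


def row_matches(aa_start, genomic_start, genomic_end, aa_end):
    """True when the listed residue count equals the codon count of the span."""
    listed = aa_end - aa_start + 1
    codons, _leftover = codons_in_span(genomic_start, genomic_end)
    return listed == codons


# Values copied from two rows of the table above
# (residues 1-61 on the forward strand, residues 411-436 on the reverse strand).
rows = [
    (1, 85452, 85634, 61),
    (411, 108566, 108489, 436),
]
for row in rows:
    print(row, row_matches(*row))
```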

new CYP4A22 sequence
>new 4A11-like sequence AL390073.5, 95% identical to 4A11 (see alignment below)
MSVSVLSPSRRLGGVSGILQVTSLLILLLLLIKAAQLYLHRQWLLKALQQFPCPPSHWLFGHIQE
FQHDQELQRIQERVKTFPSACPYWIWGGKVRVQLYDPDYMKVILGRS
DPKSHGSYKFLAPRI
GYGLLLLNGQTWFQHRRMLTPAFHNDILKPYVGLMADSVRVML
DKWEELLGQDSPLEVFQHVSLMTLDTIMKSAFSHQGSIQVDR
NSQSYIQAISDLNSLVFCCMRNAFHENDTIYSLTSAGRWTHRACQLAHQHT
DQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAK
MENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHGLLGDGASITW
NHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKG
IMVLLSIYGLHHNPKVWPNLE
VFDPSRFAPGSAQHSHAFLPFSGGSR
NCIGKQFAMNQLKVARALTLLRFELLPDPTRIPIPMARLVLKSKNGIHLRLRRLPNPCEDKDQL*

>CYP4A11 NM_000778 12 exons (n) = phase of introns
MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLFGHIQE(0)
LQQDQELQRIQKWVETFPSACPHWLWGGKVRVQLYDPDYMKVILGRS (1)
DPKSHGSYRFLAPWI (1)
GYGLLLLNGQTWFQHRRMLTPAFHYDILKPYVGLMADSVRVML (0)
DKWEELLGQDSPLEVFQHVSLMTLDTIMKCAFSHQGSIQVDR (2)
NSQSYIQAISDLNNLVFSRVRNAFHQNDTIYSLTSAGRWTHRACQLAHQHT (1)
DQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAK (0)
MENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHSLLGDGASITW (2)
NHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKG (1)
IMVLLSIYGLHHNPKVWPNPEV (0)
FDPSRFAPGSAQHSHAFLPFSGGSR (2)
NCIGKQFAMNELKVATALTLLRFELLPDPTRIPIPIARLVLKSKNGIHLRLRRLPNPCEDKDQL*


CYP4A22 new seq (top) vs CYP4A11 NM_000778 (bottom) 12 exons
       Length = 520

 Score = 2607 (917.7 bits), Expect = 1.1e-276, P = 1.1e-276
 Identities = 494/520 (95%), Positives = 504/520 (96%)

Query:     1 MSVSVLSPSRRLGGVSGILQVTSLLILLLLLIKAAQLYLHRQWLLKALQQFPCPPSHWLF 60
             MSVSVLSPSR LG VSGILQ  SLLILLLLLIKA QLYLHRQWLLKALQQFPCPPSHWLF
Sbjct:     1 MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLF 60

Query:    61 GHIQEFQHDQELQRIQERVKTFPSACPYWIWGGKVRVQLYDPDYMKVILGRSDPKSHGSY 120
             GHIQE Q DQELQRIQ+ V+TFPSACP+W+WGGKVRVQLYDPDYMKVILGRSDPKSHGSY
Sbjct:    61 GHIQELQQDQELQRIQKWVETFPSACPHWLWGGKVRVQLYDPDYMKVILGRSDPKSHGSY 120

Query:   121 KFLAPRIGYGLLLLNGQTWFQHRRMLTPAFHNDILKPYVGLMADSVRVMLDKWEELLGQD 180
             +FLAP IGYGLLLLNGQTWFQHRRMLTPAFH DILKPYVGLMADSVRVMLDKWEELLGQD
Sbjct:   121 RFLAPWIGYGLLLLNGQTWFQHRRMLTPAFHYDILKPYVGLMADSVRVMLDKWEELLGQD 180

Query:   181 SPLEVFQHVSLMTLDTIMKSAFSHQGSIQVDRNSQSYIQAISDLNSLVFCCMRNAFHEND 240
             SPLEVFQHVSLMTLDTIMK AFSHQGSIQVDRNSQSYIQAISDLN+LVF  +RNAFH+ND
Sbjct:   181 SPLEVFQHVSLMTLDTIMKCAFSHQGSIQVDRNSQSYIQAISDLNNLVFSRVRNAFHQND 240

Query:   241 TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM 300
             TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM
Sbjct:   241 TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM 300

Query:   301 ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHGLLGDGAS 360
             ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIH LLGDGAS
Sbjct:   301 ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHSLLGDGAS 360

Query:   361 ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH 420
             ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH
Sbjct:   361 ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH 420

Query:   421 NPKVWPNLEVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMNQLKVARALTLLRFEL 480
             NPKVWPN EVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMN+LKVA ALTLLRFEL
Sbjct:   421 NPKVWPNPEVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMNELKVATALTLLRFEL 480

Query:   481 LPDPTRIPIPMARLVLKSKNGIHLRLRRLPNPCEDKDQL* 520
             LPDPTRIPIP+ARLVLKSKNGIHLRLRRLPNPCEDKDQL*
Sbjct:   481 LPDPTRIPIPIARLVLKSKNGIHLRLRRLPNPCEDKDQL* 520
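
The "Identities" figure in the alignment header is a count of columns where the
two aligned sequences carry the same residue. A minimal sketch of that count is
given here, using only the first alignment block (residues 1-60) copied from
above; the function name is hypothetical, and this is not the code of the
search program that produced the alignment.

```python
def count_identities(query, subject):
    """Return (identical_columns, aligned_length) for two aligned strings.

    Both strings must already be aligned to the same length; a gap
    character "-" never counts as an identity.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return matches, len(query)


# First alignment block (residues 1-60), copied from the alignment above.
query_1_60 = "MSVSVLSPSRRLGGVSGILQVTSLLILLLLLIKAAQLYLHRQWLLKALQQFPCPPSHWLF"
sbjct_1_60 = "MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLF"

matches, length = count_identities(query_1_60, sbjct_1_60)
percent = 100.0 * matches / length
print(f"{matches}/{length} identical ({percent:.1f}%)")
```

The N-terminal block is more divergent than the protein as a whole; run over
all 520 aligned columns, the same count should reproduce the 494/520 reported
in the header.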