CYP26C1 is complete. CYP27C1 is complete. There is a new CYP4A22 sequence
that is 95% identical to 4A11. Three genomic sequences agree, but all ESTs are
like the 4A11 sequence and not like this new genomic sequence. Therefore, there
seem to be two different genes, 4A11 and the new closely related 4A22 sequence.
The human CYP26C1 gene is now complete. There are no mouse or human ESTs in
GenBank, even though there are 3.5 million human ESTs and 2 million mouse ESTs,
so this is a rare transcript, or it may have a long 3' untranslated region
that places the ESTs in the non-coding part of the gene. The 2888 bp
downstream of the stop codon do not match any human ESTs. There is a bovine
EST that covers the first 133 amino acids and is 95% identical, suggesting the
protein is highly conserved in mammals that diverged 80-100 million years ago.
This may be a developmental gene that is expressed only for a brief time or in
a limited tissue distribution.
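
The gene records in this file mark intron positions inline as (n), where n is the intron phase: 0 means the intron falls between two codons, 1 means it falls after the first nucleotide of a codon, and 2 after the second. A minimal sketch of how such an annotated sequence could be split into exon segments is given here. The function name split_exons is hypothetical, and the sketch assumes that whitespace without a (n) marker is only a line wrap inside one exon.

```python
import re


def split_exons(annotated):
    """Split 'PEPTIDE (n) PEPTIDE (n) ...' into (peptide, phase) pairs.

    phase is None for a segment that has no '(n)' marker after it,
    for example the last exon, which ends in '*'.
    """
    # Drop all whitespace so wrapped lines inside one exon are rejoined.
    text = re.sub(r"\s+", "", annotated)
    # With a capturing group, re.split keeps the phase digits:
    # [peptide, phase, peptide, phase, ..., peptide]
    parts = re.split(r"\((\d)\)", text)
    exons = []
    for i in range(0, len(parts), 2):
        peptide = parts[i]
        if not peptide:
            continue  # empty tail after a trailing marker
        phase = int(parts[i + 1]) if i + 1 < len(parts) else None
        exons.append((peptide, phase))
    return exons


# First three segments of the CYP4A11 record, copied from this file.
record = (
    "MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLFGHIQE(0) "
    "LQQDQELQRIQKWVETFPSACPHWLWGGKVRVQLYDPDYMKVILGRS (1) "
    "DPKSHGSYRFLAPWI (1)"
)
exons = split_exons(record)
for number, (peptide, phase) in enumerate(exons, start=1):
    print(number, len(peptide), phase)
```

Counting the segments returned for a full record gives a quick check against the exon count stated in its header line.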

CYP26C1 human
GenEMBL AL358613.11 May 2, 2001
522 amino acids, 6 exons, (0) = phase 0 intron
52% to 26B1 human, also 15 amino acid insertion in exon 5 vs. 26B1
MFPWGLSCLSVLGAAGTALLCAGLLLSLAQHLWTLRWMLSRDRASTLPLPKGSMGWPFFGETLHWLVQ (0)
GSRFHSSRRERYGTVFKTHLLGRPVIRVSGAENVRTILLGEHRLVRSQWPQSAHILLGSHTLLGAVGEPHRRRRK (0)
VLARVFSRAALERYVPRLQGALRHEVRSWCAAGGPVSVYDASKALTFRMAARILLGLRL
DEAQCATLARTFEQLVENLFSLPLDVPFSGLRK (0)
GIRARDQLHRHLEGAISEKLHEDKAAEPGDALDLIIHSARELGHEPSMQELK (0)
ESAVELLFAAFFTTASASTSLVLLLLQHPAAIAKIREELVAQGLGRACGCAPGAAGGSEGPPPD
CGCEPDLSLAALGRLRYVDCVVKEVLRLLPPVSGGYRTALRTFELD (0)
GYQIPKGWSVMYSIRDTHETAAVYRSPPEGFDPERFGAAREDSRGASSRLHYIPFGGGARSCLG
QELAQAVLQLLAVELVRTARWELATPAFPAMQTVPIVHPVDGLRLFFHPLTPSVAGNGLCL*

CYP27C1
AC027142
43% identical to 27A1
Assembled gene. The segment starting with QIH and ending in VDT is from
Celera's public data CRA_Gene|hCG42613 /len=10487. This Celera sequence is
still missing the C-terminus. Probable last exon is now found in AC027142.
AG intron boundary is in the same location as CYP26B1. Stop codon is one
codon away from 26B1's stop codon. Length is preserved from cys to intron.
(n) = intron phase, 9 exons
1 85452 MQTSAMALLARILRAGLRPAPERGGLLGGGAPRRPQPAGARLPAGARAEDKGAGRPGSPPG 85634 61
62 85635 GGRAEGPRSLAAMPGPRTLANLAEFFCRDGFSRIHEIQ (0) 85748 99
100 39574 QKHTREYGKIFKSHFGPQFVVSIADRDMVAQVLRAEGAAPQRANMESWREYRDLRGRATGLISA (2) 39371 163
164 43984 EGEQWLKMRSVLRQRILKPKDVAIYSGEVNQVIADLIKRIYLLRSQAEDGETVTNVNDLFFKYSME (1) 43787 229
230 41743 GVATILYESRLGCLENSIPQLTVEYIEALELMFSMFKTSMYAGAIPRWLRPFIPKPWREFC 41564 290
291 41563 RSWDGLFKFS 41534 300 (1)
301 QIHVDNKLRDIQYQMDRGRRVSGGLLTYLFLSQALTLQEIYANVTEMLLAGVDT (0) 354 (Celera sequence)
355 110201 TSFTLSWTVYLLARHPEVQQTVYREIVKNLGERHVPTAADVPKVPLVRALLKETLR (2) 110034 410
411 108566 LFPVLPGNGRVTQEDLVIGGYLIPKG (0) 108489 436
437 108006 TQLALCHYATSYQDENFPRAKEFRPERWLRKGDLDRVDNFGSIPFGHGVRSCIGRRIAELEIHLVVIQ (0) 107794 504
505 102503 LLQHFEIKTSSQTNAVHAKTHGLLTPGGPIHVRFVNRK* 102619 542

new CYP4A22 sequence
>new 4A11 like sequence AL390073.5 95% identical to 4A11 see alignment below
MSVSVLSPSRRLGGVSGILQVTSLLILLLLLIKAAQLYLHRQWLLKALQQFPCPPSHWLFGHIQE
FQHDQELQRIQERVKTFPSACPYWIWGGKVRVQLYDPDYMKVILGRS
DPKSHGSYKFLAPRI
GYGLLLLNGQTWFQHRRMLTPAFHNDILKPYVGLMADSVRVML
DKWEELLGQDSPLEVFQHVSLMTLDTIMKSAFSHQGSIQVDR
NSQSYIQAISDLNSLVFCCMRNAFHENDTIYSLTSAGRWTHRACQLAHQHT
DQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAK
MENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHGLLGDGASITW
NHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKG
IMVLLSIYGLHHNPKVWPNLE
VFDPSRFAPGSAQHSHAFLPFSGGSR
NCIGKQFAMNQLKVARALTLLRFELLPDPTRIPIPMARLVLKSKNGIHLRLRRLPNPCEDKDQL*

>CYP4A11 NM_000778 12 exons (n) = phase of introns
MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLFGHIQE (0)
LQQDQELQRIQKWVETFPSACPHWLWGGKVRVQLYDPDYMKVILGRS (1)
DPKSHGSYRFLAPWI (1)
GYGLLLLNGQTWFQHRRMLTPAFHYDILKPYVGLMADSVRVML (0)
DKWEELLGQDSPLEVFQHVSLMTLDTIMKCAFSHQGSIQVDR (2)
NSQSYIQAISDLNNLVFSRVRNAFHQNDTIYSLTSAGRWTHRACQLAHQHT (1)
DQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAK (0)
MENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHSLLGDGASITW (2)
NHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKG (1)
IMVLLSIYGLHHNPKVWPNPEV (0)
FDPSRFAPGSAQHSHAFLPFSGGSR (2)
NCIGKQFAMNELKVATALTLLRFELLPDPTRIPIPIARLVLKSKNGIHLRLRRLPNPCEDKDQL*

CYP4A22 new seq (top) vs CYP4A11 NM_000778 (bottom) 12 exons
Length = 520
Score = 2607 (917.7 bits), Expect = 1.1e-276, P = 1.1e-276
Identities = 494/520 (95%), Positives = 504/520 (96%)

Query:   1 MSVSVLSPSRRLGGVSGILQVTSLLILLLLLIKAAQLYLHRQWLLKALQQFPCPPSHWLF 60
           MSVSVLSPSR LG VSGILQ  SLLILLLLLIKA QLYLHRQWLLKALQQFPCPPSHWLF
Sbjct:   1 MSVSVLSPSRLLGDVSGILQAASLLILLLLLIKAVQLYLHRQWLLKALQQFPCPPSHWLF 60

Query:  61 GHIQEFQHDQELQRIQERVKTFPSACPYWIWGGKVRVQLYDPDYMKVILGRSDPKSHGSY 120
           GHIQE Q DQELQRIQ+ V+TFPSACP+W+WGGKVRVQLYDPDYMKVILGRSDPKSHGSY
Sbjct:  61 GHIQELQQDQELQRIQKWVETFPSACPHWLWGGKVRVQLYDPDYMKVILGRSDPKSHGSY 120

Query: 121 KFLAPRIGYGLLLLNGQTWFQHRRMLTPAFHNDILKPYVGLMADSVRVMLDKWEELLGQD 180
           +FLAP IGYGLLLLNGQTWFQHRRMLTPAFH DILKPYVGLMADSVRVMLDKWEELLGQD
Sbjct: 121 RFLAPWIGYGLLLLNGQTWFQHRRMLTPAFHYDILKPYVGLMADSVRVMLDKWEELLGQD 180

Query: 181 SPLEVFQHVSLMTLDTIMKSAFSHQGSIQVDRNSQSYIQAISDLNSLVFCCMRNAFHEND 240
           SPLEVFQHVSLMTLDTIMK AFSHQGSIQVDRNSQSYIQAISDLN+LVF  +RNAFH+ND
Sbjct: 181 SPLEVFQHVSLMTLDTIMKCAFSHQGSIQVDRNSQSYIQAISDLNNLVFSRVRNAFHQND 240

Query: 241 TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM 300
           TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM
Sbjct: 241 TIYSLTSAGRWTHRACQLAHQHTDQVIQLRKAQLQKEGELEKIKRKRHLDFLDILLLAKM 300

Query: 301 ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHGLLGDGAS 360
           ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIH LLGDGAS
Sbjct: 301 ENGSILSDKDLRAEVDTFMFEGHDTTASGISWILYALATHPKHQERCREEIHSLLGDGAS 360

Query: 361 ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH 420
           ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH
Sbjct: 361 ITWNHLDQMPYTTMCIKEALRLYPPVPGIGRELSTPVTFPDGRSLPKGIMVLLSIYGLHH 420

Query: 421 NPKVWPNLEVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMNQLKVARALTLLRFEL 480
           NPKVWPN EVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMN+LKVA ALTLLRFEL
Sbjct: 421 NPKVWPNPEVFDPSRFAPGSAQHSHAFLPFSGGSRNCIGKQFAMNELKVATALTLLRFEL 480

Query: 481 LPDPTRIPIPMARLVLKSKNGIHLRLRRLPNPCEDKDQL* 520
           LPDPTRIPIP+ARLVLKSKNGIHLRLRRLPNPCEDKDQL*
Sbjct: 481 LPDPTRIPIPIARLVLKSKNGIHLRLRRLPNPCEDKDQL* 520
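
The Identities figure in the alignment header is the number of columns where the two sequences carry the same residue, divided by the alignment length. A minimal sketch of that count for an ungapped alignment is given here, applied to the last alignment block (positions 481-520). The function name count_identities is hypothetical. The Positives figure additionally depends on a substitution matrix and is not reproduced in this sketch.

```python
def count_identities(query, subject):
    """Return (identical_columns, alignment_length) for an ungapped alignment."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment requires equal-length sequences")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query)


# Last block of the CYP4A22 vs CYP4A11 alignment, copied from this file.
# The stop symbol '*' is counted as a column, as in the BLAST output.
query = "LPDPTRIPIPMARLVLKSKNGIHLRLRRLPNPCEDKDQL*"
subject = "LPDPTRIPIPIARLVLKSKNGIHLRLRRLPNPCEDKDQL*"

matches, length = count_identities(query, subject)
print(f"{matches}/{length} identical ({100 * matches / length:.1f}%)")
```

Run over all nine blocks, the same count should reproduce the 494/520 reported in the header.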