Feb. 8, 1999; revised Oct. 6, 1999; revised Dec. 22, 1999

The number of human P450s is currently 50 genes plus 15 pseudogenes


There are 50 known P450 genes in humans and 15 pseudogenes (see below).  Of the 
50 genes that are known, 47 have ESTs in the EST section of GenBank.  This is 
94% of the known P450s.  CYP2F1, CYP2G1, CYP4F8, CYP7A1, CYP8B1 and CYP11B1 
from humans have no exact EST matches to their coding region (out of 1,250,000 
human ESTs).  Of these six, CYP4F8, CYP8B1 and CYP11B1 do have ESTs in the 
3 prime non-coding region, which leaves CYP2F1, CYP2G1 and CYP7A1 with no ESTs 
at all.

The number of human disease genes cloned by positional cloning that have an EST 
in dbEST is 83/91, or 91.2%.  This is from Bassett's estimate, which was made 
in 1997.  If new data are added from the list of disease genes at NCBI, the 
numbers increase to 118/122, so 96.7% of human disease genes have ESTs.  This 
number is similar to the percentage of known human P450s represented in the EST 
database (94%).  From these numbers we could predict 47/0.967 = 48.6, or about 
49, human P450s.  Put another way, there ought to be 2 P450 genes in humans that 
do not have ESTs in the database.  CYP2F1, CYP2G1 and CYP7A1 are known from 
humans but are not found in the EST database.  This is actually one more than 
predicted: the estimate of 49 human P450s is one short of the actual P450 count 
in humans.  This suggests that there are very few human P450s left to be 
discovered.  This prediction is probably not far wrong, because even CYP27B1, 
which was notoriously hard to clone, has ESTs in the database.
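
The arithmetic behind this estimate can be reproduced with a short sketch. It 
assumes, as the text does, that P450 genes are sampled by the EST database at 
the same rate as the positionally cloned disease genes.  The variable names are 
illustrative only.

```python
# Counts quoted in the text above.
known_p450s = 50
p450s_with_ests = 47
disease_genes_with_ests = 118
disease_genes_total = 122

# Fraction of each gene set that has at least one EST.
p450_coverage = p450s_with_ests / known_p450s
disease_coverage = disease_genes_with_ests / disease_genes_total

# Assumption: P450s are represented in dbEST at the disease-gene rate,
# so the 47 P450s with ESTs are that fraction of the true total.
predicted_total = p450s_with_ests / disease_coverage
predicted_without_ests = round(predicted_total) - p450s_with_ests

print(f"P450 EST coverage:         {p450_coverage:.1%}")
print(f"Disease gene EST coverage: {disease_coverage:.1%}")
print(f"Predicted P450 total:      {predicted_total:.1f}")
print(f"Predicted P450s lacking ESTs: {predicted_without_ests}")
```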


Humans have 50 sequenced CYP genes and 15 pseudogenes.
1A1, 1A2, 1B1, 2A6, 2A7, 2A7PT (telomeric), 2A7PC(centromeric), 2A13, 2B6, 2B7P, 
2C8, 2C9, 2C18, 2C19, 2D6, 2D7P, 2D7AP, 2D8P, 2D8BP, 2E1, 2F1, 2F1P, 2G1,
2J2, 2R1, 2S1, 3A4, 3A5, 3A5P1, 3A5P2, 3A7, 3A43, 4A11, 4B1, 4F2, 4F3, 4F8, 4F9P, 
4F10P, 4F11, 4F12, 4X1, 4Z1, 5A1, 7A1, 7B1, 8A1, 8B1, 11A1, 11B1, 11B2, 17, 19, 
21A1P, 21A2, 24, 26A1, 26B1, 27A1, 27B1, 39A1, 46, 51, 51P1, 51P2
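
The gene and pseudogene totals can be checked against the list above with a 
small sketch.  It assumes that any name containing the letter P marks a 
pseudogene; that rule happens to hold for this list but is not offered as a 
general parser for CYP nomenclature.  The (telomeric) and (centromeric) notes 
are dropped from the names.

```python
# The list above, with the positional notes on 2A7PT and 2A7PC removed.
names = """
1A1, 1A2, 1B1, 2A6, 2A7, 2A7PT, 2A7PC, 2A13, 2B6, 2B7P,
2C8, 2C9, 2C18, 2C19, 2D6, 2D7P, 2D7AP, 2D8P, 2D8BP, 2E1, 2F1, 2F1P, 2G1,
2J2, 2R1, 2S1, 3A4, 3A5, 3A5P1, 3A5P2, 3A7, 3A43, 4A11, 4B1, 4F2, 4F3, 4F8, 4F9P,
4F10P, 4F11, 4F12, 4X1, 4Z1, 5A1, 7A1, 7B1, 8A1, 8B1, 11A1, 11B1, 11B2, 17, 19,
21A1P, 21A2, 24, 26A1, 26B1, 27A1, 27B1, 39A1, 46, 51, 51P1, 51P2
"""

entries = [n.strip() for n in names.replace("\n", " ").split(",") if n.strip()]

# Assumption: a "P" anywhere in the name marks a pseudogene in this list.
pseudogenes = [n for n in entries if "P" in n]
genes = [n for n in entries if "P" not in n]

print(len(entries), "entries:", len(genes), "genes,", len(pseudogenes), "pseudogenes")
```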

2C10, 3A3 and 4A9 have been removed because they are probably sequencing 
artifacts.  2G1 has just been sequenced in humans.  The Chromosome 19 region is 
represented by 93 contigs in AC008537 (Sept. 2, 1999).  The 2G1 sequence has 
been assembled from these, and one pseudogene fragment from AC008962 currently 
missing from AC008537.  The sequence may have to be adjusted later.

2R1, 4X1 and 4F12 are missing their N-termini.
2S1, 4F11, 4Z1 and 39A1 are partial sequences.

For more details see New Human P450s
For a list of UNIGENE entries of human P450s see human UNIGENE P450s

>Human 2G1 assembly hypothetical 78% identical to rat 2G1 from AC008537 and AC008962

MELGGAVTIFLALRLSCLLILIAWKRMDKAGKLPPGPTPILFLGHLLQVRTDATFQSFMK*

LREKYSPVFTVYMGPRPVVVLCGHEAVKEALIDQADEFSGRGELASIKQNFQGHG*

VALANGERWRILRRFSLTILRDFGMGKQSIKERIQEEASYLLEEFQKTK*

GAPIDPIFLLSRTVSNVISSVVFRSRFDYEDKQFLNLLRLINESFIEMSTPWAQ*

LYDMYSGIMQYLPGRHNLIYYLVEELKDFIASRVKINEASFDPQNPRDFIDCFLIKMH*

QEEKNPNTEFYLKNLVLTTLNLFVGGTETVSTTLHYGFLLLMKHPEVE*

AKIHEEINQVIGPHRLPRVDDRVKMPYTDVVIHEIQRLVDIVPMGVPHNIIQDTQFRGYLLPK*

GTDVFPLLGSVLKDPKYFRYPDAFYPQHFLDEQGRFKKNEAFVPFSSGRGK*

RICLGEAMARMELFLYFTSTLQNFSLCSLVPLVDIDITPKLSGFGNITPTYELCLVAR

AC008537 has many P450s on it, including CYP2B7, CYP2A6, CYP2A7 and CYP2G1, 
a possible new CYP2F sequence, and probably some pseudogene fragments.

Note: this GSS might be the N-terminal of a human 2G1 pseudogene (see AC008962).
AQ620239 HS_5182_B2_D05_T7A RPCI-11 Human Male BAC Library Homo 
           Length = 499
           
Identities = 49/61 (80%), Positives = 52/61 (84%)
 Frame = +1

Query: 1   MELGGAVTIFLALRLSCLLILIAWKRMDKAGKLPPGPTPILFLGHLLQVRTDATFQSFMK 60
           ME+GGAVTIFLAL LSCLLILIAWK M+KAGKLPPGPTPI FL      RTDATFQSFMK
Sbjct: 205 MEMGGAVTIFLALCLSCLLILIAWK*MNKAGKLPPGPTPIPFLXEPAASRTDATFQSFMK 384

This is a GSS fragment for a 2G1-related gene.

AQ791192 HS_4507_B1_H06_T7A CIT Approved Human Genomic Sperm Library D Homo
Length = 515
           
Identities = 53/61 (86%), Positives = 54/61 (87%)
 Frame = +1

Query: 1   MELGGAVTIFLALRLSCLLILIAWKRMDKAGKLPPGPTPILFLGHLLQVRTDATFQSFMK 60
           MELGGAV IFLAL  SCLLILIAWK MDKA KLPPGPTPILFLGHLL VRTDATFQSFM 
Sbjct: 199 MELGGAVNIFLALSSSCLLILIAWKPMDKARKLPPGPTPILFLGHLLHVRTDATFQSFMN 378
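
The identity counts in the two alignments above can be recounted from the 
displayed Query and Sbjct rows.  This is a minimal sketch, not a replacement 
for BLAST: it compares the rows column by column and assumes they are gap-free 
and of equal length.  The displayed rows hold 60 residues while the headers 
report 61 alignment columns, so the denominator here differs by one from the 
reported figures.  The function name is hypothetical.

```python
def count_identities(query, subject):
    """Return (identical columns, total columns) for two aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query)

# Rows copied from the alignments above.
query = "MELGGAVTIFLALRLSCLLILIAWKRMDKAGKLPPGPTPILFLGHLLQVRTDATFQSFMK"
aq620239 = "MEMGGAVTIFLALCLSCLLILIAWK*MNKAGKLPPGPTPIPFLXEPAASRTDATFQSFMK"
aq791192 = "MELGGAVNIFLALSSSCLLILIAWKPMDKARKLPPGPTPILFLGHLLHVRTDATFQSFMN"

for name, subject in [("AQ620239", aq620239), ("AQ791192", aq791192)]:
    matches, columns = count_identities(query, subject)
    print(f"{name}: {matches}/{columns} identical ({matches / columns:.0%})")
```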

CYP3A43     human
            GenEMBL AC011904 8902-46787 13 exons
            Gene assembled from genomic sequence by Henry Strobel 
            and David Nelson on Dec 11, 1999
            intron exon boundaries defined by comparison to rat 3A9
            and human 3A sequences.  GT AG pairs found for all introns
            ESTs AA417369 zu08d03.s1 AA416822 zu08d03.r1 Soares testis
            Opposite ends of same clone
            67% identical to rat 3A9

Assembled gene (* = intron exon boundary, ** = EST support for this boundary)

MDLIPNFAMETWVLVATSLVLLYI*
YGTHSHKLFKKLGIPGPTPLPFLGTILFYLR*
GLWNFDRECNEKYGEMWG*
LYEGQQPMLVIMDPDMIKTVLVKECYSVFTNQM*
PLGPMGFLKSALSFAEDEEWKRIRTLLSPAFTSVKFKE*
MVPIISQCGDMLVRSLRQEAENSKSINLKE*
DFFGAYTMDVITGTLFGVNLDSLNNPQDPFLKNMKKLLKLDFLDPFLLLI* 
SLFPFLTPVFEALNIGLFPKDVTHFLKNSIERMKESRLKDKQK*
HRVDFFQQMIDSQNSKETKSHK*
ALSDLELVAQSIIIIFAAYDTTSTTLPFIMYELATHPDVQQKLQEEIDAVLPNK**
APVTYDALVQMEYLDMVVNETLRLFPVVSRVTRVCKKDIEINGVFIPKGLAVMVPIYALHHDPKYWTEPEKFCPE**
RFSKKNKDSIDLYRYIPFGAGPRNCIGMRFALTNIKLAVIRALQNFSFKPCKETQ**
IPLKLDNLPILQPEKPIVLKVHLRDGITSGP*

coding region 504 amino acids

exon 1 8902-8972
exon 2 17239-17332
exon 3 19906-19958
exon 4 24929-25028
exon 5 28274-28387
exon 6 28952-29040
exon 7 30332-30480
exon 8 36377-36504
exon 9 37619-37685
exon 10 40616-40776
exon 11 42399-42625
exon 12 44323-44485
exon 13 46692-46787
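
As a consistency check, the exon coordinates above can be summed and compared 
with the stated coding length.  The sketch assumes the coordinates are 
inclusive at both ends and cover only coding sequence, without the stop codon; 
neither point is stated in the entry.

```python
# (start, end) positions in AC011904 for exons 1 to 13, as listed above.
exons = [
    (8902, 8972), (17239, 17332), (19906, 19958), (24929, 25028),
    (28274, 28387), (28952, 29040), (30332, 30480), (36377, 36504),
    (37619, 37685), (40616, 40776), (42399, 42625), (44323, 44485),
    (46692, 46787),
]

# Assumption: both ends of each range are included in the exon.
exon_lengths = [end - start + 1 for start, end in exons]
total_nt = sum(exon_lengths)
codons, remainder = divmod(total_nt, 3)

print("exon lengths (nt):", exon_lengths)
print("total coding length (nt):", total_nt)
print("codons:", codons, "leftover nt:", remainder)
```

Under these assumptions the total divides evenly by three and the codon count 
agrees with the 504 amino acids given for the coding region.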