The cytochrome P450 superfamily is a highly diversified set of heme-containing proteins.
These proteins were discovered in 1958 through their unusual reduced carbon monoxide
difference spectrum, which has an absorbance at 450 nm, hence Pigment at 450 nm, or P450.
This unusual spectrum is caused by a thiolate anion acting as the 5th ligand to the heme.
The most common reaction catalyzed is hydroxylation, often of a lipophilic substrate.
Consequently, the proteins are frequently called hydroxylases, but P450 proteins can
perform a wide spectrum of reactions including N-oxidation, sulfoxidation, epoxidation,
N-, S-, and O-dealkylation, peroxidation, deamination, desulfuration and dehalogenation
(1). In bacteria these proteins are soluble and approximately 400 amino acids long. The
eukaryotic P450s are larger, being about 500 amino acids. In eukaryotes the proteins are
usually membrane bound through an N-terminal hydrophobic peptide and other less well
understood contacts. The two locations of these proteins in eukaryotes are the endoplasmic
reticulum membrane and the mitochondrial inner membrane. A few soluble eukaryotic P450s
are known, but these seem to be bacterial P450s that have been
acquired by some fungi through lateral transfer across kingdoms.
Cytochrome P450s are sometimes called mixed function oxidases or monooxygenases.
This refers to the way molecular oxygen is incorporated into product. In the usual
hydroxylation, one atom of oxygen is added to the substrate and the other contributes to
forming a water molecule. This process is complex and requires the donation of two
electrons sequentially from an electron donor. The donor is different depending on the
location of the P450 in the cell or whether it is a bacterial protein or a eukaryotic protein.
At the ER membrane, NADPH cytochrome P450 reductase is the usual electron donor,
though cytochrome b5 can also participate. In the mitochondria, ferredoxin (adrenodoxin)
and ferredoxin reductase (adrenodoxin reductase) form a short electron transfer chain to
supply the electrons. The bacterial donors are of both types. The Bacillus megaterium
P450 CYP102 actually has the NADPH cytochrome P450 reductase fused to the P450 in a
single gene.
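
The oxygen bookkeeping described above can be summarized as an overall reaction. This is a sketch of the generic stoichiometry for a hydroxylation, where RH is a placeholder for an arbitrary substrate (not a specific compound from the text) and the two electrons come from whichever donor system serves the P450 in question:

```latex
\mathrm{RH} + \mathrm{O_2} + 2e^- + 2\mathrm{H^+} \longrightarrow \mathrm{ROH} + \mathrm{H_2O}
```

One oxygen atom ends up in the hydroxylated product ROH and the other in water, which is why the enzymes are called monooxygenases.
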
There are more than 1500 known P450 sequences. To aid in communication, a
standardized curated nomenclature has been established (2). This nomenclature is based on
evolution of the protein sequences, with similar sequences being clustered into families and
subfamilies. The root for cytochrome P450 names is CYP. By convention this is Cyp in
the mouse and Drosophila. Families are designated by a number and subfamilies by a
letter. Individual members in a subfamily are numbered consecutively as they are reported
to the nomenclature committee. The first P450 named was CYP1A1. When this system
was established, blocks of family names were reserved for different taxonomic groups.
Families 1-49 were for animals, 51-69 were for lower eukaryotes, 71-99 were for plants
and 101 and higher were for bacteria. This original allocation was too small, and family
numbers have had to move into three-digit ranges to continue naming new families.
CYP301-499 are for animals, CYP501-699 are for lower eukaryotes, CYP701-999
are for plants, and bacteria remain in the 101-299 range. The exact count of P450s is a
moving target and precise numbers have not been tallied except in plants where this was
done recently. As of April 2000, there were 513 plant P450 sequences known. There is
probably a larger number of animal P450 sequences. Bacteria and lower eukaryotes each have
close to 100 named P450s, but this will surely grow as the genome projects continue
to sequence whole genomes. Individual species are of some interest since the complete
genomes of yeast, C. elegans and Drosophila are known. Yeast have only three P450s, and
the nearly complete Schizosaccharomyces pombe genome seems to have only two. C. elegans has
80 P450 genes, with about 6 of these being pseudogenes. Drosophila has 90 P450s with 4
pseudogenes. Humans have 56, not counting pseudogenes, but the human genome is only
94% sequenced, mostly in draft form, and this number may rise by one or two. Arabidopsis
is the undisputed record holder with 274 named P450 genes with 100% of the genome
completed. For detailed information on P450 nomenclature and other information see the
cytochrome P450 homepage at http://drnelson.uthsc.edu/CytochromeP450.html.
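
The naming rules above can be sketched in a few lines of Python. This is only an illustration, not an official tool of the nomenclature committee: the function names, the regular expression and the case-insensitive matching (to accept the Cyp root) are assumptions made here. The family blocks are copied from the ranges given in the text, and numbers that fall between blocks (50, 70, 100, 300, 500, 700) are treated as unassigned because the text does not place them.

```python
import re
from typing import NamedTuple, Optional

# Family number blocks as stated in the text: (low, high, reserved group).
FAMILY_BLOCKS = [
    (1, 49, "animals"),
    (51, 69, "lower eukaryotes"),
    (71, 99, "plants"),
    (101, 299, "bacteria"),
    (301, 499, "animals"),
    (501, 699, "lower eukaryotes"),
    (701, 999, "plants"),
]


class CypName(NamedTuple):
    family: int               # e.g. 1 in CYP1A1
    subfamily: Optional[str]  # e.g. "A" in CYP1A1, None for a bare family
    member: Optional[int]     # e.g. 1 in CYP1A1, None for a bare family


# Assumption: a name is the root CYP (or Cyp), a family number, and an
# optional subfamily letter followed by a member number.
_NAME_RE = re.compile(r"^CYP(\d+)(?:([A-Z]+)(\d+))?$", re.IGNORECASE)


def parse_cyp_name(name: str) -> CypName:
    """Split a name such as 'CYP105A3' into family, subfamily and member."""
    match = _NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognizable CYP name: {name!r}")
    family, subfamily, member = match.groups()
    return CypName(
        int(family),
        subfamily.upper() if subfamily else None,
        int(member) if member else None,
    )


def reserved_group(family: int) -> Optional[str]:
    """Return the taxonomic group a family number was reserved for, if any."""
    for low, high, group in FAMILY_BLOCKS:
        if low <= family <= high:
            return group
    return None  # number falls outside every block listed in the text


if __name__ == "__main__":
    for name in ("CYP1A1", "CYP102", "CYP55A1", "CYP51"):
        parsed = parse_cyp_name(name)
        print(name, parsed, reserved_group(parsed.family))
```

The lookup reports only the block a family number was reserved for, not the organism a given sequence actually comes from. CYP55A1, for example, falls in the lower eukaryote block because it was found in a fungus, even though it appears to be of bacterial origin.
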
The functions of P450s are very broad. In mammals they are critical for drug metabolism,
blood hemostasis, cholesterol biosynthesis and steroidogenesis. Defects in P450 genes are
responsible for a number of human diseases. In plants they are involved in plant hormone synthesis,
phytoalexin synthesis, flower petal pigment biosynthesis and perhaps hundreds of
unknown functions. In fungi they make ergosterol and are involved in pathogenesis
by detoxifying host plant defenses. CYP51, the lanosterol 14-alpha demethylase, is the
primary target of antifungal triazole drugs. Bacterial P450s are key players in antibiotic
synthesis.
Because the eukaryotic P450s are membrane bound, it has not been possible until very
recently to obtain a crystal structure of a representative enzyme. This has finally been
done, but the result is still unpublished. Six soluble bacterial P450s have
crystal structures: CYP101, CYP102, CYP105A3, CYP55A1, CYP107A1 and CYP108.
CYP55A1 is a pirated bacterial P450 found in a fungus. These have been reviewed by
Graham and Peterson (3). The general structure is globular, almost triangular, with the
C-terminal half being helix rich and the N-terminal half being more beta sheet rich. The
C-terminal half is more highly conserved. The P450 signature motif includes the heme ligand
cys and is usually represented as FXXGXXXCXG, though there are exceptions at all
three non-cys positions. This heme binding region is about 50 amino acids from the
C-terminus of the protein. The helix rich half of the protein starts with the I-helix. This long
helix contributes a conserved motif A(A,G)X(E,D)T where the thr residue is part of the
oxygen binding site. The K helix has an invariant EXXR sequence which tolerates no
substitutions. The E, the R and the C at the heme binding site are the only completely
conserved amino acids in P450s. For more