Last modified March 25, 2002
New: see the poster Comparative genomics of Fugu and Human P450s
On Oct 26, 2001 the JGI released a draft assembly of Fugu.
Fugu P450s have been assembled from the genomic data at the Fugu blast servers.
There are 75 different contigs. 35 are complete assembled genes. 12 more are
nearly complete, missing only small parts of the sequence, from a few amino acids
up to one or two exons. These 47 Fugu P450s have been aligned to 60 human
sequences and 8 other fish sequences and used to make a tree.
In this tree human branches are red, Fugu branches are blue and other
fish are gray. The remaining 28 Fugu sequence contigs are made up of 11
pseudogene pieces (2K12P, 2K13P, 2K14P, 2K15P, 2P5P, 2R2P, 2R3P, 2X5P, 3A50P,
8A3P, 8B3P) and 17 partials. The 17 partials will probably collapse to not more
than 8-12 genes, since some appear to be from different parts of the same gene.
For example, CYP2P4 is missing exons 4, 7, 8 and 9, and there are two fragments
covering exon 7 and exons 8 and 9. CYP2X3 and CYP2X4 are both missing exons 1
and 2. There are two CYP2X fragments that cover exon 1 and exon 2, but it is not
possible to tell whether they belong to 2X3 or 2X4. CYP27A3 is missing exons 3
and 4; however, there are two exon 4 CYP27A fragments to choose from. One
probably belongs to 27A3 and the other may be a pseudogene fragment or an
alternatively spliced exon 4.
The sequences CYP1A1, 2K11, 3A49 and 11B1 are probably real genes that are
missing more than half of their sequence, or they are hybrids made from several
fragments that may or may not belong together (like 11B1). There are two CYP17A
fragments and one 3A fragment that are short. They might be from real genes or
from pseudogenes. More sequence data is needed to resolve them.
Based on these arguments there is good evidence for 55 Fugu P450s plus 11-15
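The contig counts given above can be checked with a few lines of arithmetic. This is only a sketch in Python using the numbers quoted in the text. The variable names are made up for illustration, and reading the figure of 55 as 47 aligned sequences plus the lower estimate of 8 genes from the partials is an assumption, not something the text states outright.

```python
# Contig counts as quoted in the text above.
complete = 35
nearly_complete = 12
pseudogene_pieces = 11
partials = 17

aligned = complete + nearly_complete        # sequences aligned and used in the tree
remaining = pseudogene_pieces + partials    # contigs left out of the tree
total_contigs = aligned + remaining

# The text estimates that the 17 partials collapse to 8-12 genes.
partial_genes_low, partial_genes_high = 8, 12

# Assumption: the gene estimate is the aligned set plus the collapsed partials.
genes_low = aligned + partial_genes_low
genes_high = aligned + partial_genes_high

print("aligned:", aligned)
print("remaining:", remaining)
print("total contigs:", total_contigs)
print("estimated genes:", genes_low, "to", genes_high)
```

Under that assumption the totals agree with the text: 47 aligned plus 28 remaining gives 75 contigs, and the lower gene estimate comes out at 55.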
There is a blast server set up to search these Fugu P450s. Go to
P450 blast server
The sequences used in this BLAST search database are shown in more detail in the
FASTA file below. This FASTA file has some sequences assembled with zebrafish
sequence, tetraodon sequence or human sequence to fill gaps in the assembly.
These are not found in the blast file to avoid confusion between species.
To see the alphabetical list of accession numbers go to
To see the FASTA format of the sequence contigs sorted by family go to
FASTA Sequence List
To see the alignment of human and Fugu sequences go to
Human Fugu alignment
To see the comparative genomics of human and Fugu P450s go to
Human Fugu CYP2s
Human Fugu CYPs families 46, 26, 11, 27, 24, 19, 20, 8, 7, and 51
Human Fugu CYPs families 1, 3, 4, 5, 17, and 21
The Tetraodon sequences have not been searched yet.
There are 189,036 GSSs for Tetraodon nigroviridis (freshwater pufferfish) and
45,707 GSSs for Fugu rubripes (saltwater Japanese pufferfish) in GenBank, so
there should be many P450 sequence fragments in these collections.
May 24, 2001
June 22, 2001
There are 412,000 Fugu sequences at
These represent about 200 million bases of sequence, about half the genome size
of Fugu. Allowing for some redundancy, less than half of the genome is
represented in this set. I have searched these with 18 mammalian P450s and
found 107 accession numbers that have P450 sequence. These have been sorted
into 80 contigs and 17 different P450 families. Only CYP39 is still missing.
August 27, 2001
The Fugu blast server at http://fugu.hgmp.mrc.ac.uk/blast/ has
979,612 sequences; 421,163,906 total letters or just over 1X coverage of the
genome. These sequences have been searched and the results posted in two files
below. There are now 272 accession numbers in 108 contigs.
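The "just over 1X coverage" figure can be reproduced roughly from the numbers on this page. The sketch below is in Python. The genome size is not stated directly, so it is inferred here from the June entry ("about 200 million bases ... about half the genome size"), which is an assumption. The function and variable names are made up for illustration.

```python
# Figures quoted in the June and August entries.
june_reads = 412_000
june_bases = 200_000_000       # "about 200 million bases"
august_reads = 979_612
august_bases = 421_163_906     # "total letters" on the blast server

# Assumption: genome size inferred from "200 million bases ... about half
# the genome size of Fugu"; the text does not give an exact figure.
assumed_genome_size = 2 * june_bases


def coverage(total_bases, genome_size):
    """Raw fold coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


def mean_read_length(total_bases, n_reads):
    """Average number of bases per sequence in the collection."""
    return total_bases / n_reads


june_coverage = coverage(june_bases, assumed_genome_size)
august_coverage = coverage(august_bases, assumed_genome_size)
august_read_len = mean_read_length(august_bases, august_reads)

print(f"June coverage:   {june_coverage:.2f}X")
print(f"August coverage: {august_coverage:.2f}X")
print(f"August mean read length: {august_read_len:.0f} bases")
```

This is raw coverage only. As noted for the June set, redundancy between reads means the fraction of the genome actually represented is lower than the raw figure suggests.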
The sequences are still very fragmentary, consisting mostly of single
exons or a few exons on a single contig. There is some uncertainty about how to
join these into a complete sequence. For example, a complete sequence of CYP3A
can be assembled from individual reads, but there are at least two different
CYP3As in Fugu, so this is a chimeric sequence. The other complication with
ray-finned fish is a predicted whole genome duplication that occurred in this
lineage after divergence from tetrapods (us). That is why fish have seven Hox
gene clusters instead of the four seen in mammals. After this duplication many
parts of the duplicated genome were lost, but some parts were retained. It
is known that fish have two CYP19 sequences, probably as a result of the genome
duplication. It looks like there might be two CYP17 sequences, two CYP46
sequences and two CYP26B1 sequences. This will add to the difficulty of
assembling these genes, and other P450s can be expected to have similar
duplications. In my assembly of contigs, I have tried to get full length P450s
at the risk of making chimeric sequences. I expect these will be self correcting
as the genome is more deeply sequenced and the contig size increases, so beware
of this when looking at the assembled genes. CYP8B1 is shown as a complete
sequence, but there are minor conserved sequence variations between reads, so
there are probably two CYP8B1s in Fugu.