Last modified July 17, 2003 David Nelson
An alignment has been made of all CYP4 clan members from Ciona savignyi and
Ciona intestinalis. The alignment was made to help refine the sequences by
comparing them to others and to help in identifying intron-exon boundaries.
There are 35 sequences in this set. One set of 10 savignyi sequences appears to
have expanded recently, since it has only a single intestinalis ortholog.
An alignment of CYP4 clan Ciona sequences
Some intron-exon boundaries are marked, but this is not finished.
Blue = phase 0, magenta = phase 1, green = phase 2.
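
The phase of an intron is the number of bases of an unfinished codon left at the end of the preceding exon. The sketch below shows one way to compute it from the coding lengths of the exons and map it to the colors used above. It is only an illustration: the function name `intron_phases` and the exon lengths are hypothetical and are not taken from any Ciona gene.

```python
def intron_phases(exon_lengths):
    """Return the phase of the intron that follows each exon except the last.

    exon_lengths: coding lengths in nucleotides, listed 5' to 3'.
    Phase 0 = intron falls between two codons,
    phase 1 = after the first base of a codon,
    phase 2 = after the second base of a codon.
    """
    phases = []
    coding_bases = 0
    for length in exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Color key from the alignment legend
PHASE_COLORS = {0: "blue", 1: "magenta", 2: "green"}

# Hypothetical exon lengths, for illustration only
example_exons = [120, 85, 143, 201]
for number, phase in enumerate(intron_phases(example_exons), start=1):
    print(f"intron {number}: phase {phase} ({PHASE_COLORS[phase]})")
```

Comparing the phases of introns at matching alignment positions is one way to check whether a proposed boundary agrees with the boundaries in related sequences.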
A tree of CYP4 clan Ciona sequences
Most Ciona sequences have now been identified from both species. However, the
other sequences have not yet been refined as to their full length and intron-exon
boundaries. This is a work in progress. The partially assembled sequences are
here. There are more savignyi CYPs (about 97).
Sequences of all CYP Ciona sequences
Table of all Ciona savignyi CYP sequences by scaffold number
section below modified Jan. 7, 2002 David Nelson
New info: An assembly of the C. intestinalis genome has become available.
I have searched this assembly for P450s and have used it to complete the
partial sequences I had already assembled from individual reads. I am in
the process of improving my sequences by comparison with the assembled
genome. I have made a new file of 113 sequences: revised Ciona P450 contigs.
This includes 83 C. intestinalis sequences and 30 C. savignyi
sequences. I have made a tree of the complete sequences
(80 C. intestinalis and 18 C. savignyi sequences)
leaving out those partials and pseudogenes that would affect the tree
building algorithm. The new tree is
new Ciona tree.
This tree shows the typical expansion of a single clan to make P450s
for the organism. The clan used is the CYP2 clan and there are 51
C. intestinalis P450s in this clan.
A complete cross reference table has been constructed to link my assemblies
with the JGI assemblies. The JGI sequences are linked from this table.
Only two P450 pseudogenes have been found in Ciona intestinalis.
These are sequence 91 (a possible pseudogene of sequence 36),
and sequence 232 (a possible pseudogene of sequence 112).
Sequence 231 is incomplete, but it is 80% identical
to sequence 64 and may represent part of a complete gene. Sequence
231 is not found in the JGI assembly v1.0. It may be upstream of scaffold 638.
The 83 C. intestinalis sequences include the two pseudogenes (seq 91 and seq 232).
This leaves 81 predicted functional genes (assuming seq 231 will be an intact gene).
The Ciona genes will be named after the Anopheles genes are named.
The genome of Ciona savignyi has been sequenced to 14X
coverage at the Whitehead Institute. (genome size 180Mb)
The genome of Ciona intestinalis is being sequenced at the
Joint Genome Institute. (2.5X coverage blast searchable)
I am interested in finding all the P450s from this model
urochordate genus that is simpler than Fugu yet more closely
related to mammals/vertebrates than to echinoderms. This fits
in with comparative P450 studies already done on mammals and
Fugu. To this end, Rob Edwards and I have downloaded
the 44 sequence files from the Whitehead Institute and set
them up in a local Blast server so they can be searched.
We attempted to assemble the genome with Phrap from the 4.3 million
reads and their associated quality files, but our Linux Dell PC
could not do this.
The MSCI814 Bioinformatics class (25 students and some auditors)
that Rob and I taught last semester scoured the data for every
P450 hit and tried to assemble the genes. This part of the
course was very difficult for the students. In fact, no one got
a complete Ciona P450 gene assembled. There were too many exons
per gene and the reads were too short to easily link the exons.
Extensive chromosome walking was required and the students did not fare
too well at this task. I have been working on it myself in October.
So far, 77 different P450 sequences have been found in the Ciona intestinalis
data. I have assembled 75 sequences from the I-helix to the end of the gene.
31 are completely assembled. 13 more savignyi P450s are completely assembled.
The sequences found in Ciona savignyi will be
blasted against the intestinalis Blast server at JGI to find the
orthologs of that species and to help in assembly if there
are any gaps in the savignyi sequences. The sequence
coverage at JGI is less than the Whitehead data, so some
missing sequences are expected.
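
As a rough sketch of the ortholog search described above, the code below picks the best-scoring hit for each query from tabular BLAST output. It assumes the standard 12-column tab-separated layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The function name `best_hits`, the E-value cutoff, and all sequence IDs are hypothetical. A best hit is only a candidate ortholog and still needs to be checked by hand.

```python
def best_hits(tabular_text, max_evalue=1e-10):
    """Map each query ID to its top hit as (subject, evalue, bitscore).

    Hits with an E-value above max_evalue are ignored; among the rest,
    the hit with the highest bit score is kept for each query.
    """
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: two hits for one savignyi query, one weak hit for another
rows = [
    ("sav_seq1", "int_readA", "78.5", "200", "43", "0", "1", "200", "5", "204", "1e-90", "320"),
    ("sav_seq1", "int_readB", "55.0", "180", "81", "2", "10", "189", "1", "180", "1e-40", "150"),
    ("sav_seq2", "int_readC", "30.0", "60", "42", "1", "1", "60", "3", "62", "0.5", "28"),
]
example = "\n".join("\t".join(row) for row in rows)

for query, (subject, evalue, bitscore) in best_hits(example).items():
    print(query, "->", subject, evalue, bitscore)
```

A query that has no hit under the cutoff, like the second one in this example, is simply absent from the result. With low sequence coverage that is the expected outcome for some genes.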
To see the detailed progress in analyzing these genomes for P450s,
see the bioinformatics course pages on this process.
An alignment of 77 C-termini is shown here.
A phylogenetic tree of 75 of these sequences is shown here.
Because the font size is too small to read in this picture, see
Bare Tree for a tree with readable labels, but no clan annotations.
Bare Tree 2 is a tree with some extra Fugu reference sequences.
Summary of older information:
Blast searches have been done with P450s against the JGI Ciona
intestinalis sequence data. The CYP1A1 blast gave 250 valid P450 hits.
The protein sequences from these hits have been extracted and blast searched
against each other to find overlaps. Since then, 25 more P450s have been used to
find accession numbers and sequences. The resulting assemblies are 210
Ciona P450 contigs.
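
The step of searching the hits against each other and collecting the ones that overlap can be pictured as grouping reads into clusters, where any two reads linked by an overlap end up in the same group. The sketch below does this with a simple union-find structure. It is an illustration only, not the procedure actually used here; the function name `cluster_reads` and the read IDs are hypothetical, and the list of overlapping pairs is assumed to come from an all-against-all Blast run.

```python
def cluster_reads(read_ids, overlapping_pairs):
    """Group reads so that reads connected by any chain of overlaps share a cluster."""
    parent = {read: read for read in read_ids}

    def find(read):
        # Follow parent links to the cluster root, shortening the path as we go
        while parent[read] != read:
            parent[read] = parent[parent[read]]
            read = parent[read]
        return read

    for a, b in overlapping_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for read in read_ids:
        clusters.setdefault(find(read), []).append(read)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical reads and overlaps
reads = ["r1", "r2", "r3", "r4", "r5", "r6"]
pairs = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
for members in cluster_reads(reads, pairs):
    print(members)
```

Each cluster is a candidate contig. A read with no overlaps, such as `r6` in this example, stays in a cluster of its own.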
All 18 mammalian P450 families and 8 additional subfamilies (1B1, 2A6, 2D6,
2F1, 2W1, 26B1, 27B1, 27C1) have now been searched against Ciona at JGI.
There are 780 accessions so far, with a few more expected. All of these have been
translated and assembled. Including some Ciona savignyi sequences, the
Blast file now has 210 contigs. 44 genes are complete. For a FASTA file,
see the FASTA list.
For a more detailed list with accession numbers, see the master sequence file.
In addition, Rob Edwards has blast searched 41 human P450s (one from each
subfamily) against all 4.3 million reads of Ciona savignyi. These reads were in
44 separate files, since we have not been able to assemble them. 1804
blast files covering all of mammalian P450 space have been collected, but these have not
been analyzed yet. Each sequence read contains only one or two exons, so there
are many fragments that are probably from the same gene, but they have not been
joined due to lack of overlap within exons. This may pose a problem that will
require comparison of the intron sequences and walking to join the fragments.
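
To make the joining problem concrete, the sketch below merges two translated fragments only when the end of one exactly matches the start of the other. The function name `merge_by_overlap`, the minimum overlap, and the example strings are all hypothetical. Real reads would need a tolerant alignment rather than an exact match.

```python
def merge_by_overlap(left, right, min_overlap=8):
    """Join two fragments if a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged string, or None when no overlap of at least
    min_overlap residues exists. The longest overlap is preferred.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical peptide fragments, for illustration only
frag_a = "MKTLLAVFGPERR"
frag_b = "FGPERRNCIGQ"
print(merge_by_overlap(frag_a, frag_b, min_overlap=5))

# Fragments from different exons share no residues, so nothing is joined
print(merge_by_overlap("AAAAWWWW", "CCCCDDDD", min_overlap=3))
```

The second call returns `None`. That is the case described above: fragments from the same gene that end at different exons have no shared residues, so they can only be linked through the intron sequence or by walking.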
The accession numbers sorted by sequence are listed here.