83 Ciona intestinalis Cytochrome P450s

This page was revised Jan. 7, 2003. It provides a comprehensive cross-reference table between the Ciona intestinalis P450s assembled from individual sequence reads prior to the first genome assembly (sequences assembled up to Oct. 24, 2002) and the P450s assembled in the JGI Ciona assembly version 1.0. The sequence order is the same as on the Ciona phylogenetic tree (excluding the Ciona savignyi sequences and seqs 91, 231 and 232, which are not in the tree).

The scaffold locations of the minus strand sequences are not completely correct, because the numbering on the JGI blast output is backwards for minus strand hits. In addition, the 3 prime nucleotide number minus the 5 prime nucleotide number equals the number of amino acids, not 3X the number of amino acids. So there are a few problems with the blast numbering.

The process of reconciling the JGI assembled sequences with my assembled sequences is not finished yet. There are improvements to be made in both directions: sometimes my assemblies are more accurate, sometimes JGI's are. The current best estimate is that there are probably 81 functional P450s in Ciona intestinalis. There are two pseudogenes, seqs 91 and 232. Seq 231 is not found in the JGI assembly and it is partial, though it is predicted to be from a full gene based on its strong 80% identity to seq 64.

The ci numbers are linked to the JGI entry for that ci#. The sequence can be obtained from that page by clicking on “get sequence”. My sequence assemblies can be found in the FASTA file. At a later time I may add a blast comparison between the two assemblies to show where they do not agree.

Some sequences occur at the ends of scaffolds and may provide the evidence needed to link scaffolds. For example, sequence 16 has two parts on two different scaffolds. They overlap exactly in exon 6, so these scaffolds should probably be joined. A similar situation is seen in seq 109. Sequence 2 and sequence 4 both run off the ends of scaffolds. These two sequences are adjacent on the dendrogram, so the two are probably in a gene cluster and these two scaffolds (27 and 1257) should be adjacent. Sequence 151 has two parts that do not join. However, sequence 151 is homologous to a complete sequence from C. savignyi, which suggests the two parts (scaffolds 357 and 996) belong to the same gene.
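The numbering problem can be examined on the intronless genes, where the scaffold span would be expected to cover three nucleotides per amino acid. The following is only a rough sketch using the three "no introns" rows from the table below; the function name `span_per_residue` is illustrative and not part of any JGI tool, and the coordinates are assumed to be inclusive.

```python
# Intronless rows copied from the table: (seq#, first aa, last aa, start, end)
INTRONLESS = [
    ("36", 1, 496, 82343, 83830),
    ("26", 1, 492, 49424, 49923),
    ("118", 1, 512, 77325, 77874),
]

def span_per_residue(aa_first, aa_last, start, end):
    """Nucleotides of scaffold covered per amino acid (inclusive coordinates assumed)."""
    n_aa = aa_last - aa_first + 1
    n_nt = abs(end - start) + 1
    return n_nt / n_aa

for seq, a0, a1, s, e in INTRONLESS:
    ratio = span_per_residue(a0, a1, s, e)
    # A ratio near 3 means the span is counted in nucleotides;
    # a ratio near 1 points to the blast numbering problem.
    print(f"seq {seq}: {ratio:.2f} nt per aa")
```

With these rows, seq 36 comes out at 3.00 nt per aa, while seqs 26 and 118 come out close to 1, so the numbering problem does not seem to affect every row in the same way.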

seq# Scaffold ci # strand aa exons Scaffold location Scaffold size comments
167 1239 ci0100138987 - 1-490 1-10 3347-7519 11509 bp complete
166 187 ci0100138282 - 1-497 1-10 52448-56649 158258 bp complete
169 544 ci0100150360 - 1-512 1-8 30328-34827 45966 bp complete
164 136 ci0100153637 - 36-509 2-10 177570-181904 232117 bp exon 1 not detected
180 111 ci0100141900 + 1-432 1-9 40151-43640 260550 bp complete, very odd seq
94 13 ci0100143924 + 1-607 1-9 297502-305021 614915 bp complete, N-term extension
190 13 ci0100143944 - 1-565 1-7 305953-312466 614915 bp complete
192 5 ci0100143856 - 1-497 1-6 161647-164577 753052 bp complete
204 83 ci0100151525 - 1-493 1-11 10090-15485 294982 bp complete
182b 83 ci0100131379 - 1-508 1-11 23537-29791 294982 bp complete
182a 83 ci0100147340 - 1-508 1-11 17751-22173 294982 bp complete
136 88 ci0100142705 - 1-497 1-10 99533-105660 292531 bp complete
186 116 ci0100131439 - 1-490 1-11 37035-41638 253995 bp complete
46 35 ci0100131189 - 1-526 1-7 240254-243665 458350 bp complete
15 2 ci0100132188 + 1-531 1-6 115243-117466 919088 bp complete
25 27 ci0100134415 + 1-607 1-6 357708-364389 480478 bp complete
49 8 ci0100136792 - 1-531 1-6 484460-488264 681948 bp complete
40 744 ci0100138492 - 1-471 2-9 2570-7152 26277 bp exon 1 not detected
184 36 ci0100153789 - 1-515 1-10 186368-193205 429292 bp complete
222 38 ci0100144128 + 1-498 1-10 186017-190262 397834 bp complete
17 592 ci0100131843 - 1-519 1-10 25042-34375 40561 bp complete, 4000 bp 1st intron
84 147 ci0100144919 + 1-494 1-10 43217-47457 222906 bp complete
51 1133 ci0100150675 + 1-488 1-2 9989-11055 14196 bp complete
198 846 ci0100130969 + 1-466 1-8 15489-19130 21299 bp complete
41 344 ci0100149995 - 1-494 1-5 15302-17488 80700 bp complete
6 43 ci0100133889 + 1-504 1-9 120794-123876 362962 bp complete
39 133 ci0100150103 - 1-505 1-9 26184-29899 227810 bp complete
21 24 ci0100150103 - 1-494 1-9 319733-322310 493144 bp complete
7b 37 ci0100139379 - 1-482 1-9 260933-264135 424764 bp complete
7a 37 ci0100139346 - 1-482 1-9 257757-259491 424764 bp complete
16 311 ci0100140915 + 1-293 1-6 93294-96689 98311 bp N-term
16 1266 ci0100137403 + 252-501 6-9 1786-3932 11203 bp C-term
202 638 ci0100132628 + 1-512 1-9 5561-8704 36007 bp complete and partial dup. of exon 1
43 638 ci0100132600 + 1-511 1-9 1050-3985 36007 bp complete
196 39 ci0100152256 - 1-514 1-9 241602-246268 402610 bp complete
66 174 ci0100133948 + 1-513 1-9 82507-85647 186786 bp complete
64 36 ci0100152549 + 1-507 1-9 153884-156108 429292 bp complete
57 36 ci0100152549 + 1-512 1-9 149304-152108 429292 bp complete
33 230 ci0100137396 - 1-510 1-9 31420-34839 141404 bp complete
102 1 ci0100151989 - 1-496 1-2, 4-9 884424-887570 972361 bp missing part of exon 2 and all of exon 3
103 91 ci0100135419 - 1-505 1-9 245271-248731 282773 bp complete
157 92 ci0100148852 - 1-495 1-9 270432-274571 292709 bp complete
36 304 ci0100152549 + 1-496 no introns 82343-83830 98326 bp complete
91 304 no annotation - 406-493 8-9? 84999-85107 98326 bp pseudogene
5 350 ci0100141208 - 1-476 1-9 37969-42793 83283 bp complete
58 144 ci0100144015 + 1-492 1-9 12831-16538 209353 bp complete
14 144 ci0100132260 - 1-252 1-5 496-2939 209353 bp runs off end
31 11 ci0100147352 - 1-214, 260-504 1-4, 6-9 652191-655868 658104 bp missing exon 5
55 106 ci0100138298 - 1-494 1-9 201061-206406 267246 bp complete
48 106 ci0100130435 + 1-497 1-9 196663-199901 267246 bp complete
4 1257 ci0100135981 - 1-172 1-4 164-851 13567 bp runs off end
27 1257 ci0100136000 + 1-497 1-9 2678-6033 13567 bp complete
1 1257 ci0100131008 + 1-188, 250-393 1-3.5, 6-8 7055-9749 13567 bp assembly problems, probably one gene
1 1257 ci0100131008 + 112-338, 394-497 1.5-7, 9 9992-12062 13567 bp assembly problems, probably one gene
209 324 ci0100147330 + 1-496 1-9 1628-4892 90072 bp complete
72 817 ci0100136116 + 1-488 1-9 3188-9799 24028 bp complete, 2 copies of exon 2
106 149 ci0100153482 - 1-504 1-10 36996-39661 222761 bp complete
18 149 ci0100137673 + 1-504 1-10 41196-43841 222761 bp complete
20 54 ci0100147280 + 1-494 1-10 16357-20237 349210 bp complete
2 27 ci0100144992 + 429-497 9 1-102 480478 bp runs off end
208 19 ci0100139081 - 1-452 2-13 350362-356610 548072 bp cannot ID exon 1
62 918 ci0100151040 + 1-481 1-5 8336-10725 18933 bp complete, part of exon 5 dup. after gene end
26 48 ci0100154611 + 1-492 no introns 49424-49923 393431 bp complete
42 3 ci0100139255 + 27-537 1-5 1-2600 866863 bp missing 1-26 of exon 1, runs off end
29 12 ci0100145514 + 1-502 1-4 327602-329890 625398 bp complete
65 12 ci0100145536 + 1-501 1-5 331861-333829 625398 bp complete, is last exon split?
34 63 ci0100152358 + 1-503 1-4 209228-211408 354958 bp complete
232 41 no annotation - 1-61, 356-509 1-2, 11-14 2948-6288 406245 bp pseudogene
119 41 no annotation - 1-61 1-2 7-1002 406245 bp N-term
119 75 ci0100140050 + 62-509 2-14 2106-8270 316770 bp join scaffold 41, runs off end
112 75 ci0100140585 + 1-509 1-14 8925-16463 316770 bp complete
125 75 ci0100151443 + 1-508 1-14 17342-24230 316770 bp complete
118 262 ci0100133019 + 1-512 no introns 77325-77874 115427 bp complete
129 8 ci0100135592 - 1-546 1-11 3701-8421 661948 bp complete
109 8 ci0100134523 - 1-324 1-6 122-2467 661948 bp N-term half
109 359 ci0100151551 - 460-546 10-11 78497-78920 79400 bp C-term
159 359 ci0100150610 - 1-536 1-11 68718-75613 79400 bp complete
158 176 ci0100138987 + 1-547 1-11 192129-196437 197985 bp complete
151 357 ci0100151682 - 1-211 1-4 224-2360 81286 bp N-term runs off end
151 996 ci0100147473 + 241-430 8-11 402-2256 16667 bp C-term runs off end
148 100 ci0100144682 - 1-486, gap 1-6, 8-11 151778-156514 276848 bp missing exon 7
147 78 ci0100152205 + 1-536 1-11 136637-140140 307118 bp complete
145 81 ci0100143467 + 1-501 1-11 198977-206748 303162 bp complete
155 21 ci0100151100 + 1-513 1-12 370383-375609 531011 bp complete
110 21 ci0100151041 + 1-512 1-12 364082-369116 531011 bp complete
115 15 ci0100146084 + 1-503 1-3 381100-382367 591421 bp complete
134 46 ci0100142368 - 1-503 1-3 116988-118566 393349 bp complete
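
Rows that share a seq# but sit on different scaffolds are the candidates for the scaffold joins discussed in the introduction. A minimal sketch of how such rows could be picked out, assuming the whitespace-separated layout used above (first field seq#, second field scaffold); the helper name `split_sequences` is hypothetical and only a few rows are included here.

```python
from collections import defaultdict

# A few rows copied from the table above (seq#, scaffold, then the remaining fields).
ROWS = """\
16 311 ci0100140915 + 1-293 1-6 93294-96689 98311 bp N-term
16 1266 ci0100137403 + 252-501 6-9 1786-3932 11203 bp C-term
43 638 ci0100132600 + 1-511 1-9 1050-3985 36007 bp complete
109 8 ci0100134523 - 1-324 1-6 122-2467 661948 bp N-term half
109 359 ci0100151551 - 460-546 10-11 78497-78920 79400 bp C-term
151 357 ci0100151682 - 1-211 1-4 224-2360 81286 bp N-term runs off end
151 996 ci0100147473 + 241-430 8-11 402-2256 16667 bp C-term runs off end
1 1257 ci0100131008 + 1-188, 250-393 1-3.5, 6-8 7055-9749 13567 bp assembly problems
1 1257 ci0100131008 + 112-338, 394-497 1.5-7, 9 9992-12062 13567 bp assembly problems
"""

def split_sequences(text):
    """Map seq# -> scaffolds, keeping only seqs found on more than one scaffold."""
    scaffolds = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        seq, scaffold = fields[0], fields[1]
        if scaffold not in scaffolds[seq]:
            scaffolds[seq].append(scaffold)
    return {seq: s for seq, s in scaffolds.items() if len(s) > 1}

print(split_sequences(ROWS))
```

On this subset the sketch reports seqs 16, 109 and 151; seq 1 is not reported because both of its rows are on scaffold 1257. Run over the full table, it should also pick up seq 119 (scaffolds 41 and 75).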

Sequence 231 is not found in the JGI assembly v1.0. It may be located in the gap upstream of scaffold 638, since there is a related gene (seq 43) at the end of this scaffold and it is already part of a P450 gene cluster with sequence 202.
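
The clustering argument can be illustrated by grouping table rows by scaffold and ordering them by start coordinate. This is a sketch under stated assumptions: the tuples are copied from a few table rows, the helper name `clusters` is hypothetical, and the minus-strand coordinates (seq 184 here) carry the numbering caveat noted in the introduction.

```python
from collections import defaultdict

# (seq#, scaffold, start, end) for a few rows copied from the table.
GENES = [
    ("202", "638", 5561, 8704),
    ("43", "638", 1050, 3985),
    ("64", "36", 153884, 156108),
    ("57", "36", 149304, 152108),
    ("184", "36", 186368, 193205),
    ("66", "174", 82507, 85647),
]

def clusters(genes):
    """Group genes by scaffold, ordered by start; keep scaffolds with more than one gene."""
    by_scaffold = defaultdict(list)
    for seq, scaffold, start, end in genes:
        by_scaffold[scaffold].append((start, end, seq))
    return {
        scaffold: [seq for _, _, seq in sorted(hits)]
        for scaffold, hits in by_scaffold.items()
        if len(hits) > 1
    }

print(clusters(GENES))
```

For scaffold 638 this lists seq 43 before seq 202, which matches seq 43 lying nearest the scaffold end (positions 1050-3985 of 36007 bp).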