This page was revised Jan. 7, 2003.

This table cross-references the Ciona intestinalis P450s assembled from individual sequence reads prior to the first genome assembly (sequences assembled up to Oct. 24, 2002) with the P450s assembled in the JGI Ciona assembly version 1.0. The sequence order is the same as on the Ciona phylogenetic tree (excluding the Ciona savignyi sequences and seqs 91, 231 and 232, which are not in the tree).

The scaffold locations of the minus-strand sequences are not completely correct, because the numbering in the JGI BLAST output is backwards for minus-strand hits. In addition, the 3' nucleotide number minus the 5' nucleotide number equals the number of amino acids, not three times the number of amino acids. So there are a few problems with the BLAST numbering.

The process of reconciling the JGI-assembled sequences with my assembled sequences is not finished yet. There are improvements to be made in both directions: sometimes my assemblies are more accurate, sometimes JGI's are. The current best estimate is that there are probably 81 functional P450s in Ciona intestinalis. There are two pseudogenes, seqs 91 and 232. Seq 231 is not found in the JGI assembly and is partial, though it is predicted to come from a full gene based on its strong (80%) identity to seq 64.

The ci numbers are linked to the JGI entry for that ci#. The sequence can be obtained from that page by clicking on “get sequence”. My sequence assemblies can be found in the FASTA file. At a later time I may add a BLAST comparison between the two assemblies to show where they do not agree.

Some sequences occur at the ends of scaffolds and may provide the evidence needed to link scaffolds. For example, sequence 16 has two parts on two different scaffolds. They overlap exactly in exon 6, so these scaffolds should probably be joined. A similar situation is seen in seq 109. Sequence 2 and sequence 4 both run off the ends of scaffolds. These two sequences are adjacent on the dendrogram, so they are probably in a gene cluster, and the two scaffolds (27 and 1257) should be adjacent. Sequence 151 has two parts that do not join. However, sequence 151 is homologous to a complete sequence from C. savignyi, which suggests that the two parts (scaffolds 357 and 996) belong to the same gene.
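The scaffold-linking argument above can be sketched in code. The snippet below is only an illustration, not the procedure used to build the table: it takes the amino-acid ranges listed in the table for sequences that appear on two scaffolds (seqs 16, 109, 119 and 151) and reports whether the two parts overlap, abut, or leave a gap. The function names `classify_pair` and `join_candidates` are hypothetical, and the ranges are assumed to be inclusive.

```python
from itertools import combinations

# (seq, scaffold, aa_start, aa_end), copied from the table rows for
# sequences whose parts lie on two different scaffolds.
FRAGMENTS = [
    ("16", 311, 1, 293),
    ("16", 1266, 252, 501),
    ("109", 8, 1, 324),
    ("109", 359, 460, 546),
    ("119", 41, 1, 61),
    ("119", 75, 62, 509),
    ("151", 357, 1, 211),
    ("151", 996, 241, 430),
]


def classify_pair(a, b):
    """Describe how two inclusive amino-acid ranges of one sequence relate."""
    first, second = sorted([a, b], key=lambda frag: frag[2])
    # Number of residues missing between the two parts (negative = overlap).
    gap = second[2] - first[3] - 1
    if gap < 0:
        return "overlap", -gap
    if gap == 0:
        return "contiguous", 0
    return "gap", gap


def join_candidates(fragments):
    """Return (seq, scaffold_a, scaffold_b, relation, size) per split sequence."""
    by_seq = {}
    for frag in fragments:
        by_seq.setdefault(frag[0], []).append(frag)
    results = []
    for seq, frags in by_seq.items():
        for a, b in combinations(frags, 2):
            if a[1] != b[1]:  # only pairs on different scaffolds
                relation, size = classify_pair(a, b)
                results.append((seq, a[1], b[1], relation, size))
    return results


for seq, scaf_a, scaf_b, relation, size in join_candidates(FRAGMENTS):
    print(f"seq {seq}: scaffolds {scaf_a} and {scaf_b}: {relation} ({size} aa)")
```

An overlap or an exact abutment is the stronger kind of evidence for joining two scaffolds; a gap only shows that both parts map to the same gene, which is the situation described for seq 151, where the C. savignyi homolog supplies the link.
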
seq# | Scaffold | ci # | strand | aa | exons | Scaffold location | Scaffold size | comments |
---|---|---|---|---|---|---|---|---|
167 | 1239 | ci0100138987 | - | 1-490 | 1-10 | 3347-7519 | 11509 bp | complete |
166 | 187 | ci0100138282 | - | 1-497 | 1-10 | 52448-56649 | 158258 bp | complete |
169 | 544 | ci0100150360 | - | 1-512 | 1-8 | 30328-34827 | 45966 bp | complete |
164 | 136 | ci0100153637 | - | 36-509 | 2-10 | 177570-181904 | 232117 bp | exon 1 not detected |
180 | 111 | ci0100141900 | + | 1-432 | 1-9 | 40151-43640 | 260550 bp | complete; very odd seq |
94 | 13 | ci0100143924 | + | 1-607 | 1-9 | 297502-305021 | 614915 bp | complete; N-term extension |
190 | 13 | ci0100143944 | - | 1-565 | 1-7 | 305953-312466 | 614915 bp | complete |
192 | 5 | ci0100143856 | - | 1-497 | 1-6 | 161647-164577 | 753052 bp | complete |
204 | 83 | ci0100151525 | - | 1-493 | 1-11 | 10090-15485 | 294982 bp | complete |
182b | 83 | ci0100131379 | - | 1-508 | 1-11 | 23537-29791 | 294982 bp | complete |
182a | 83 | ci0100147340 | - | 1-508 | 1-11 | 17751-22173 | 294982 bp | complete |
136 | 88 | ci0100142705 | - | 1-497 | 1-10 | 99533-105660 | 292531 bp | complete |
186 | 116 | ci0100131439 | - | 1-490 | 1-11 | 37035-41638 | 253995 bp | complete |
46 | 35 | ci0100131189 | - | 1-526 | 1-7 | 240254-243665 | 458350 bp | complete |
15 | 2 | ci0100132188 | + | 1-531 | 1-6 | 115243-117466 | 919088 bp | complete |
25 | 27 | ci0100134415 | + | 1-607 | 1-6 | 357708-364389 | 480478 bp | complete |
49 | 8 | ci0100136792 | - | 1-531 | 1-6 | 484460-488264 | 681948 bp | complete |
40 | 744 | ci0100138492 | - | 1-471 | 2-9 | 2570-7152 | 26277 bp | exon 1 not detected |
184 | 36 | ci0100153789 | - | 1-515 | 1-10 | 186368-193205 | 429292 bp | complete |
222 | 38 | ci0100144128 | + | 1-498 | 1-10 | 186017-190262 | 397834 bp | complete |
17 | 592 | ci0100131843 | - | 1-519 | 1-10 | 25042-34375 | 40561 bp | complete; 4000 bp 1st intron |
84 | 147 | ci0100144919 | + | 1-494 | 1-10 | 43217-47457 | 222906 bp | complete |
51 | 1133 | ci0100150675 | + | 1-488 | 1-2 | 9989-11055 | 14196 bp | complete |
198 | 846 | ci0100130969 | + | 1-466 | 1-8 | 15489-19130 | 21299 bp | complete |
41 | 344 | ci0100149995 | - | 1-494 | 1-5 | 15302-17488 | 80700 bp | complete |
6 | 43 | ci0100133889 | + | 1-504 | 1-9 | 120794-123876 | 362962 bp | complete |
39 | 133 | ci0100150103 | - | 1-505 | 1-9 | 26184-29899 | 227810 bp | complete |
21 | 24 | ci0100150103 | - | 1-494 | 1-9 | 319733-322310 | 493144 bp | complete |
7b | 37 | ci0100139379 | - | 1-482 | 1-9 | 260933-264135 | 424764 bp | complete |
7a | 37 | ci0100139346 | - | 1-482 | 1-9 | 257757-259491 | 424764 bp | complete |
16 | 311 | ci0100140915 | + | 1-293 | 1-6 | 93294-96689 | 98311 bp | N-term |
16 | 1266 | ci0100137403 | + | 252-501 | 6-9 | 1786-3932 | 11203 bp | C-term |
202 | 638 | ci0100132628 | + | 1-512 | 1-9 | 5561-8704 | 36007 bp | complete and partial dup. of exon 1 |
43 | 638 | ci0100132600 | + | 1-511 | 1-9 | 1050-3985 | 36007 bp | complete |
196 | 39 | ci0100152256 | - | 1-514 | 1-9 | 241602-246268 | 402610 bp | complete |
66 | 174 | ci0100133948 | + | 1-513 | 1-9 | 82507-85647 | 186786 bp | complete |
64 | 36 | ci0100152549 | + | 1-507 | 1-9 | 153884-156108 | 429292 bp | complete |
57 | 36 | ci0100152549 | + | 1-512 | 1-9 | 149304-152108 | 429292 bp | complete |
33 | 230 | ci0100137396 | - | 1-510 | 1-9 | 31420-334839 | 141404 bp | complete |
102 | 1 | ci0100151989 | - | 1-496 | 1-2, 4-9 | 884424-887570 | 972361 bp | missing part of exon 2 and all of exon 3 |
103 | 91 | ci0100135419 | - | 1-505 | 1-9 | 245271-248731 | 282773 bp | complete |
157 | 92 | ci0100148852 | - | 1-495 | 1-9 | 270432-274571 | 292709 bp | complete |
36 | 304 | ci0100152549 | + | 1-496 | no introns | 82343-83830 | 98326 bp | complete |
91 | 304 | no annotation | - | 406-493 | 8-9? | 84999-85107 | 98326 bp | pseudogene |
5 | 350 | ci0100141208 | - | 1-476 | 1-9 | 37969-42793 | 83283 bp | complete |
58 | 144 | ci0100144015 | + | 1-492 | 1-9 | 12831-16538 | 209353 bp | complete |
14 | 144 | ci0100132260 | - | 1-252 | 1-5 | 496-2939 | 209353 bp | runs off end |
31 | 11 | ci0100147352 | - | 1-214, 260-504 | 1-4, 6-9 | 652191-655868 | 658104 bp | missing exon 5 |
55 | 106 | ci0100138298 | - | 1-494 | 1-9 | 201061-206406 | 267246 bp | complete |
48 | 106 | ci0100130435 | + | 1-497 | 1-9 | 196663-199901 | 267246 bp | complete |
4 | 1257 | ci0100135981 | - | 1-172 | 1-4 | 164-851 | 13567 bp | runs off end |
27 | 1257 | ci0100136000 | + | 1-497 | 1-9 | 2678-6033 | 13567 bp | complete |
1 | 1257 | ci0100131008 | + | 1-188, 250-393 | 1-3.5, 6-8 | 7055-9749 | 13567 bp | assembly problems; probably one gene |
1 | 1257 | ci0100131008 | + | 112-338, 394-497 | 1.5-7, 9 | 9992-12062 | 13567 bp | assembly problems; probably one gene |
209 | 324 | ci0100147330 | + | 1-496 | 1-9 | 1628-4892 | 90072 bp | complete |
72 | 817 | ci0100136116 | + | 1-488 | 1-9 | 3188-9799 | 24028 bp | complete; 2 copies of exon 2 |
106 | 149 | ci0100153482 | - | 1-504 | 1-10 | 36996-39661 | 222761 bp | complete |
18 | 149 | ci0100137673 | + | 1-504 | 1-10 | 41196-43841 | 222761 bp | complete |
20 | 54 | ci0100147280 | + | 1-494 | 1-10 | 16357-20237 | 349210 bp | complete |
2 | 27 | ci0100144992 | + | 429-497 | 9 | 1-102 | 480478 bp | runs off end |
208 | 19 | ci0100139081 | - | 1-452 | 2-13 | 350362-356610 | 548072 bp | cannot ID exon 1 |
62 | 918 | ci0100151040 | + | 1-481 | 1-5 | 8336-10725 | 18933 bp | complete, part of exon 5 dup. after gene end |
26 | 48 | ci0100154611 | + | 1-492 | no introns | 49424-49923 | 393431 bp | complete |
42 | 3 | ci0100139255 | + | 27-537 | 1-5 | 1-2600 | 866863 bp | missing 1-26 of exon 1; runs off end |
29 | 12 | ci0100145514 | + | 1-502 | 1-4 | 327602-329890 | 625398 bp | complete |
65 | 12 | ci0100145536 | + | 1-501 | 1-5 | 331861-333829 | 625398 bp | complete; is last exon split? |
34 | 63 | ci0100152358 | + | 1-503 | 1-4 | 209228-211408 | 354958 bp | complete |
232 | 41 | no annotation | - | 1-61, 356-509 | 1-2, 11-14 | 2948-6288 | 406245 bp | pseudogene |
119 | 41 | no annotation | - | 1-61 | 1-2 | 7-1002 | 406245 bp | N-term |
119 | 75 | ci0100140050 | + | 62-509 | 2-14 | 2106-8270 | 316770 bp | join scaffold 41; runs off end |
112 | 75 | ci0100140585 | + | 1-509 | 1-14 | 8925-16463 | 316770 bp | complete |
125 | 75 | ci0100151443 | + | 1-508 | 1-14 | 17342-24230 | 316770 bp | complete |
118 | 262 | ci0100133019 | + | 1-512 | no introns | 77325-77874 | 115427 bp | complete |
129 | 8 | ci0100135592 | - | 1-546 | 1-11 | 3701-8421 | 661948 bp | complete |
109 | 8 | ci0100134523 | - | 1-324 | 1-6 | 122-2467 | 661948 bp | N-term half |
109 | 359 | ci0100151551 | - | 460-546 | 10-11 | 78497-78920 | 79400 bp | C-term |
159 | 359 | ci0100150610 | - | 1-536 | 1-11 | 68718-75613 | 79400 bp | complete |
158 | 176 | ci0100138987 | + | 1-547 | 1-11 | 192129-196437 | 197985 bp | complete |
151 | 357 | ci0100151682 | - | 1-211 | 1-4 | 224-2360 | 81286 bp | N-term; runs off end |
151 | 996 | ci0100147473 | + | 241-430 | 8-11 | 402-2256 | 16667 bp | C-term; runs off end |
148 | 100 | ci0100144682 | - | 1-486, gap | 1-6, 8-11 | 151778-156514 | 276848 bp | missing exon 7 |
147 | 78 | ci0100152205 | + | 1-536 | 1-11 | 136637-140140 | 307118 bp | complete |
145 | 81 | ci0100143467 | + | 1-501 | 1-11 | 198977-206748 | 303162 bp | complete |
155 | 21 | ci0100151100 | + | 1-513 | 1-12 | 370383-375609 | 531011 bp | complete |
110 | 21 | ci0100151041 | + | 1-512 | 1-12 | 364082-369116 | 531011 bp | complete |
115 | 15 | ci0100146084 | + | 1-503 | 1-3 | 381100-382367 | 591421 bp | complete |
134 | 46 | ci0100142368 | - | 1-503 | 1-3 | 116988-118566 | 393349 bp | complete |
Sequence 231 is not found in the JGI assembly v1.0. It may be located in the gap upstream of scaffold 638, since there is a related gene (seq 43) at the end of this scaffold and it is already part of a P450 gene cluster with sequence 202.
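
Given the known problems with the BLAST numbering, a quick consistency check on the table rows may be useful. The sketch below is only an illustration under stated assumptions: it parses a few rows copied from the table, checks that each scaffold location lies within the stated scaffold size, and, for genes with no introns, compares the nucleotide span with three times the amino-acid count. The helper names `parse_row` and `check`, the inclusive-coordinate convention, and the 3 nt tolerance (room for a stop codon) are all assumptions.

```python
import re

# Rows copied verbatim from the table (pipe-delimited text).
ROWS = """\
36 | 304 | ci0100152549 | + | 1-496 | no introns | 82343-83830 | 98326 bp | complete |
26 | 48 | ci0100154611 | + | 1-492 | no introns | 49424-49923 | 393431 bp | complete |
118 | 262 | ci0100133019 | + | 1-512 | no introns | 77325-77874 | 115427 bp | complete |
33 | 230 | ci0100137396 | - | 1-510 | 1-9 | 31420-334839 | 141404 bp | complete |
"""


def parse_row(line):
    """Split one table row into typed fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    seq, scaffold, ci, strand, aa, exons, location, size, comment = cells
    start, end = (int(x) for x in location.split("-"))
    # Sum the lengths of all inclusive aa ranges, e.g. "1-214, 260-504".
    aa_len = sum(int(b) - int(a) + 1 for a, b in re.findall(r"(\d+)-(\d+)", aa))
    return {
        "seq": seq,
        "strand": strand,
        "exons": exons,
        "aa_len": aa_len,
        "start": start,
        "end": end,
        "scaffold_size": int(size.split()[0]),
    }


def check(row, tolerance=3):
    """Return a list of human-readable problems found in one row."""
    issues = []
    if row["start"] < 1 or row["end"] > row["scaffold_size"]:
        issues.append("location falls outside scaffold")
    if row["exons"] == "no introns":
        span = row["end"] - row["start"] + 1
        expected = 3 * row["aa_len"]
        if abs(span - expected) > tolerance:
            issues.append(f"span {span} nt vs {expected} nt expected")
    return issues


for line in ROWS.splitlines():
    row = parse_row(line)
    print(row["seq"], check(row) or "ok")
```

On these four rows, seq 36 passes both checks, seqs 26 and 118 have spans much shorter than three times their length, and the end coordinate listed for seq 33 is larger than the size of scaffold 230. The span comparison is restricted to intronless genes because introns inflate the span for all other rows.
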