83 Ciona intestinalis Cytochrome P450s

This page was revised Jan. 7, 2003. The table below cross-references the Ciona intestinalis P450s that I assembled from individual sequence reads before the first genome assembly (sequences assembled up to Oct. 24, 2002) with the P450s assembled in the JGI Ciona assembly version 1.0. The sequences are listed in the same order as on the Ciona phylogenetic tree (excluding the Ciona savignyi sequences and seqs 91, 231 and 232, which are not in the tree).

The scaffold locations of the minus strand sequences are not completely correct, because the numbering on the JGI blast output is backwards for minus strand hits. In addition, the 3 prime nucleotide number minus the 5 prime nucleotide number equals the number of amino acids, not 3X the number of amino acids. So there are a few problems with the blast numbering.

The process of reconciling the JGI assembled sequences with my assembled sequences is not finished yet. There are improvements to be made in both directions: sometimes my assemblies are more accurate, sometimes JGI's are. The current best estimate is that there are probably 81 functional P450s in Ciona intestinalis. There are two pseudogenes, seqs 91 and 232. Seq 231 is not found in the JGI assembly and it is partial, though it is predicted to be from a full gene based on its strong 80% identity to seq 64.

The ci numbers are linked to the JGI entry for that ci#. The sequence can be obtained from that page by clicking on "get sequence". My sequence assemblies can be found in the FASTA file. At a later time I may add a blast comparison between the two assemblies to show where they do not agree.

Some sequences occur at the ends of scaffolds and may provide the evidence needed to link scaffolds. For example, sequence 16 has two parts on two different scaffolds (311 and 1266). They overlap exactly in exon 6, so these scaffolds should probably be joined. A similar situation is seen in seq 109 (scaffolds 8 and 359). Sequence 2 and sequence 4 both run off the end of scaffolds. These two sequences are adjacent on the dendrogram, so the two are probably in a gene cluster and their scaffolds (27 and 1257) should be adjacent. Sequence 151 has two parts that do not join. However, sequence 151 is homologous to a complete sequence from C. savignyi, which suggests that the two parts (scaffolds 357 and 996) belong to the same gene.
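The scaffold-linking evidence described above (one seq# whose parts lie on different scaffolds) can be pulled out of the table mechanically. The following is only a minimal sketch: the function name `split_sequences` and the tuple layout are hypothetical, and the rows are a hand-copied subset of the table, not the full data set.

```python
from collections import defaultdict

# (seq#, scaffold, comment) for a few rows copied by hand from the table.
# The tuple layout is an assumption made for this sketch.
ROWS = [
    ("16", 311, "N-term"),
    ("16", 1266, "C-term"),
    ("119", 41, "N-term"),
    ("119", 75, "join scaffold 41, runs off end"),
    ("109", 8, "N-term half"),
    ("109", 359, "C-term"),
    ("151", 357, "N-term, runs off end"),
    ("151", 996, "C-term, runs off end"),
    ("1", 1257, "assembly problems, probably one gene"),
    ("1", 1257, "assembly problems, probably one gene"),
    ("27", 1257, "complete"),
]


def split_sequences(rows):
    """Return {seq#: sorted scaffolds} for sequences found on more than one scaffold."""
    scaffolds_by_seq = defaultdict(set)
    for seq, scaffold, _comment in rows:
        scaffolds_by_seq[seq].add(scaffold)
    return {
        seq: sorted(scaffolds)
        for seq, scaffolds in scaffolds_by_seq.items()
        if len(scaffolds) > 1
    }


if __name__ == "__main__":
    for seq, scaffolds in split_sequences(ROWS).items():
        print(f"seq {seq}: candidate link between scaffolds {scaffolds}")
```

Seq 1 appears twice but on the same scaffold (1257), so it is not reported. The sequence 2 / sequence 4 case would not be found this way either, since that link rests on dendrogram adjacency rather than on a shared seq#.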

| seq# | Scaffold | ci # | strand | aa | exons | Scaffold location | Scaffold size | comments |
|---|---|---|---|---|---|---|---|---|
| 167 | 1239 | ci0100138987 | - | 1-490 | 1-10 | 3347-7519 | 11509 bp | complete |
| 166 | 187 | ci0100138282 | - | 1-497 | 1-10 | 52448-56649 | 158258 bp | complete |
| 169 | 544 | ci0100150360 | - | 1-512 | 1-8 | 30328-34827 | 45966 bp | complete |
| 164 | 136 | ci0100153637 | - | 36-509 | 2-10 | 177570-181904 | 232117 bp | exon 1 not detected |
| 180 | 111 | ci0100141900 | + | 1-432 | 1-9 | 40151-43640 | 260550 bp | complete; very odd seq |
| 94 | 13 | ci0100143924 | + | 1-607 | 1-9 | 297502-305021 | 614915 bp | complete; N-term extension |
| 190 | 13 | ci0100143944 | - | 1-565 | 1-7 | 305953-312466 | 614915 bp | complete |
| 192 | 5 | ci0100143856 | - | 1-497 | 1-6 | 161647-164577 | 753052 bp | complete |
| 204 | 83 | ci0100151525 | - | 1-493 | 1-11 | 10090-15485 | 294982 bp | complete |
| 182b | 83 | ci0100131379 | - | 1-508 | 1-11 | 23537-29791 | 294982 bp | complete |
| 182a | 83 | ci0100147340 | - | 1-508 | 1-11 | 17751-22173 | 294982 bp | complete |
| 136 | 88 | ci0100142705 | - | 1-497 | 1-10 | 99533-105660 | 292531 bp | complete |
| 186 | 116 | ci0100131439 | - | 1-490 | 1-11 | 37035-41638 | 253995 bp | complete |
| 46 | 35 | ci0100131189 | - | 1-526 | 1-7 | 240254-243665 | 458350 bp | complete |
| 15 | 2 | ci0100132188 | + | 1-531 | 1-6 | 115243-117466 | 919088 bp | complete |
| 25 | 27 | ci0100134415 | + | 1-607 | 1-6 | 357708-364389 | 480478 bp | complete |
| 49 | 8 | ci0100136792 | - | 1-531 | 1-6 | 484460-488264 | 681948 bp | complete |
| 40 | 744 | ci0100138492 | - | 1-471 | 2-9 | 2570-7152 | 26277 bp | exon 1 not detected |
| 184 | 36 | ci0100153789 | - | 1-515 | 1-10 | 186368-193205 | 429292 bp | complete |
| 222 | 38 | ci0100144128 | + | 1-498 | 1-10 | 186017-190262 | 397834 bp | complete |
| 17 | 592 | ci0100131843 | - | 1-519 | 1-10 | 25042-34375 | 40561 bp | complete; 4000 bp 1st intron |
| 84 | 147 | ci0100144919 | + | 1-494 | 1-10 | 43217-47457 | 222906 bp | complete |
| 51 | 1133 | ci0100150675 | + | 1-488 | 1-2 | 9989-11055 | 14196 bp | complete |
| 198 | 846 | ci0100130969 | + | 1-466 | 1-8 | 15489-19130 | 21299 bp | complete |
| 41 | 344 | ci0100149995 | - | 1-494 | 1-5 | 15302-17488 | 80700 bp | complete |
| 6 | 43 | ci0100133889 | + | 1-504 | 1-9 | 120794-123876 | 362962 bp | complete |
| 39 | 133 | ci0100150103 | - | 1-505 | 1-9 | 26184-29899 | 227810 bp | complete |
| 21 | 24 | ci0100150103 | - | 1-494 | 1-9 | 319733-322310 | 493144 bp | complete |
| 7b | 37 | ci0100139379 | - | 1-482 | 1-9 | 260933-264135 | 424764 bp | complete |
| 7a | 37 | ci0100139346 | - | 1-482 | 1-9 | 257757-259491 | 424764 bp | complete |
| 16 | 311 | ci0100140915 | + | 1-293 | 1-6 | 93294-96689 | 98311 bp | N-term |
| 16 | 1266 | ci0100137403 | + | 252-501 | 6-9 | 1786-3932 | 11203 bp | C-term |
| 202 | 638 | ci0100132628 | + | 1-512 | 1-9 | 5561-8704 | 36007 bp | complete and partial dup. of exon 1 |
| 43 | 638 | ci0100132600 | + | 1-511 | 1-9 | 1050-3985 | 36007 bp | complete |
| 196 | 39 | ci0100152256 | - | 1-514 | 1-9 | 241602-246268 | 402610 bp | complete |
| 66 | 174 | ci0100133948 | + | 1-513 | 1-9 | 82507-85647 | 186786 bp | complete |
| 64 | 36 | ci0100152549 | + | 1-507 | 1-9 | 153884-156108 | 429292 bp | complete |
| 57 | 36 | ci0100152549 | + | 1-512 | 1-9 | 149304-152108 | 429292 bp | complete |
| 33 | 230 | ci0100137396 | - | 1-510 | 1-9 | 31420-334839 | 141404 bp | complete |
| 102 | 1 | ci0100151989 | - | 1-496 | 1-2, 4-9 | 884424-887570 | 972361 bp | missing part of exon 2 and all of exon 3 |
| 103 | 91 | ci0100135419 | - | 1-505 | 1-9 | 245271-248731 | 282773 bp | complete |
| 157 | 92 | ci0100148852 | - | 1-495 | 1-9 | 270432-274571 | 292709 bp | complete |
| 36 | 304 | ci0100152549 | + | 1-496 | no introns | 82343-83830 | 98326 bp | complete |
| 91 | 304 | no annotation | - | 406-493 | 8-9? | 84999-85107 | 98326 bp | pseudogene |
| 5 | 350 | ci0100141208 | - | 1-476 | 1-9 | 37969-42793 | 83283 bp | complete |
| 58 | 144 | ci0100144015 | + | 1-492 | 1-9 | 12831-16538 | 209353 bp | complete |
| 14 | 144 | ci0100132260 | - | 1-252 | 1-5 | 496-2939 | 209353 bp | runs off end |
| 31 | 11 | ci0100147352 | - | 1-214, 260-504 | 1-4, 6-9 | 652191-655868 | 658104 bp | missing exon 5 |
| 55 | 106 | ci0100138298 | - | 1-494 | 1-9 | 201061-206406 | 267246 bp | complete |
| 48 | 106 | ci0100130435 | + | 1-497 | 1-9 | 196663-199901 | 267246 bp | complete |
| 4 | 1257 | ci0100135981 | - | 1-172 | 1-4 | 164-851 | 13567 bp | runs off end |
| 27 | 1257 | ci0100136000 | + | 1-497 | 1-9 | 2678-6033 | 13567 bp | complete |
| 1 | 1257 | ci0100131008 | + | 1-188, 250-393 | 1-3.5, 6-8 | 7055-9749 | 13567 bp | assembly problems; probably one gene |
| 1 | 1257 | ci0100131008 | + | 112-338, 394-497 | 1.5-7, 9 | 9992-12062 | 13567 bp | assembly problems; probably one gene |
| 209 | 324 | ci0100147330 | + | 1-496 | 1-9 | 1628-4892 | 90072 bp | complete |
| 72 | 817 | ci0100136116 | + | 1-488 | 1-9 | 3188-9799 | 24028 bp | complete; 2 copies of exon 2 |
| 106 | 149 | ci0100153482 | - | 1-504 | 1-10 | 36996-39661 | 222761 bp | complete |
| 18 | 149 | ci0100137673 | + | 1-504 | 1-10 | 41196-43841 | 222761 bp | complete |
| 20 | 54 | ci0100147280 | + | 1-494 | 1-10 | 16357-20237 | 349210 bp | complete |
| 2 | 27 | ci0100144992 | + | 429-497 | 9 | 1-102 | 480478 bp | runs off end |
| 208 | 19 | ci0100139081 | - | 1-452 | 2-13 | 350362-356610 | 548072 bp | cannot ID exon 1 |
| 62 | 918 | ci0100151040 | + | 1-481 | 1-5 | 8336-10725 | 18933 bp | complete, part of exon 5 dup. after gene end |
| 26 | 48 | ci0100154611 | + | 1-492 | no introns | 49424-49923 | 393431 bp | complete |
| 42 | 3 | ci0100139255 | + | 27-537 | 1-5 | 1-2600 | 866863 bp | missing 1-26 of exon 1; runs off end |
| 29 | 12 | ci0100145514 | + | 1-502 | 1-4 | 327602-329890 | 625398 bp | complete |
| 65 | 12 | ci0100145536 | + | 1-501 | 1-5 | 331861-333829 | 625398 bp | complete; is last exon split? |
| 34 | 63 | ci0100152358 | + | 1-503 | 1-4 | 209228-211408 | 354958 bp | complete |
| 232 | 41 | no annotation | - | 1-61, 356-509 | 1-2, 11-14 | 2948-6288 | 406245 bp | pseudogene |
| 119 | 41 | no annotation | - | 1-61 | 1-2 | 7-1002 | 406245 bp | N-term |
| 119 | 75 | ci0100140050 | + | 62-509 | 2-14 | 2106-8270 | 316770 bp | join scaffold 41; runs off end |
| 112 | 75 | ci0100140585 | + | 1-509 | 1-14 | 8925-16463 | 316770 bp | complete |
| 125 | 75 | ci0100151443 | + | 1-508 | 1-14 | 17342-24230 | 316770 bp | complete |
| 118 | 262 | ci0100133019 | + | 1-512 | no introns | 77325-77874 | 115427 bp | complete |
| 129 | 8 | ci0100135592 | - | 1-546 | 1-11 | 3701-8421 | 661948 bp | complete |
| 109 | 8 | ci0100134523 | - | 1-324 | 1-6 | 122-2467 | 661948 bp | N-term half |
| 109 | 359 | ci0100151551 | - | 460-546 | 10-11 | 78497-78920 | 79400 bp | C-term |
| 159 | 359 | ci0100150610 | - | 1-536 | 1-11 | 68718-75613 | 79400 bp | complete |
| 158 | 176 | ci0100138987 | + | 1-547 | 1-11 | 192129-196437 | 197985 bp | complete |
| 151 | 357 | ci0100151682 | - | 1-211 | 1-4 | 224-2360 | 81286 bp | N-term; runs off end |
| 151 | 996 | ci0100147473 | + | 241-430 | 8-11 | 402-2256 | 16667 bp | C-term; runs off end |
| 148 | 100 | ci0100144682 | - | 1-486, gap | 1-6, 8-11 | 151778-156514 | 276848 bp | missing exon 7 |
| 147 | 78 | ci0100152205 | + | 1-536 | 1-11 | 136637-140140 | 307118 bp | complete |
| 145 | 81 | ci0100143467 | + | 1-501 | 1-11 | 198977-206748 | 303162 bp | complete |
| 155 | 21 | ci0100151100 | + | 1-513 | 1-12 | 370383-375609 | 531011 bp | complete |
| 110 | 21 | ci0100151041 | + | 1-512 | 1-12 | 364082-369116 | 531011 bp | complete |
| 115 | 15 | ci0100146084 | + | 1-503 | 1-3 | 381100-382367 | 591421 bp | complete |
| 134 | 46 | ci0100142368 | - | 1-503 | 1-3 | 116988-118566 | 393349 bp | complete |

Sequence 231 is not found in the JGI assembly v1.0. It may be located in the gap upstream of scaffold 638, since a related gene (seq 43) lies at the end of this scaffold, where it is already part of a P450 gene cluster with sequence 202.
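The blast numbering caveat from the introduction (the 3 prime minus 5 prime coordinate difference matching the number of amino acids rather than 3X that number) can be checked on the three rows marked "no introns", where no intron length inflates the span. This is a rough sketch only: the helper name `span_per_residue` and the 2.0 cutoff are assumptions for illustration, and the coordinates are copied from the table.

```python
# (seq#, aa range, scaffold location) for the three "no introns" rows of the table.
INTRONLESS_ROWS = [
    ("36", "1-496", "82343-83830"),
    ("26", "1-492", "49424-49923"),
    ("118", "1-512", "77325-77874"),
]


def span_per_residue(aa_range: str, location: str) -> float:
    """Coordinate span divided by the number of residues in the aa range."""
    first_aa, last_aa = (int(x) for x in aa_range.split("-"))
    start, end = (int(x) for x in location.split("-"))
    return (end - start) / (last_aa - first_aa + 1)


def numbering_scale(ratio: float, cutoff: float = 2.0) -> str:
    """Label a row by whether its span looks like nucleotides (about 3 per residue)
    or like amino acid units (about 1 per residue). The cutoff is an arbitrary
    assumption for this sketch."""
    return "nucleotide-like" if ratio >= cutoff else "amino-acid-like"


if __name__ == "__main__":
    for seq, aa_range, location in INTRONLESS_ROWS:
        ratio = span_per_residue(aa_range, location)
        print(f"seq {seq}: {ratio:.2f} per residue ({numbering_scale(ratio)})")
```

On these rows, seq 36 comes out near 3 per residue while seqs 26 and 118 come out near 1, so the problem does not affect every row equally. Rows with introns cannot be checked this way, because intron length is included in the span.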