Mouse P450s and their UNIGENE entries

April 29, 1999. Adjusted to changes in UNIGENE build #50 of April 1999.

A text search for P450 in the mouse section of UNIGENE finds 38 entries. This count is a little misleading. Some P450 gene subfamilies have multiple genes lumped together in one entry because they map to the same locus, as with Cyp2d9 and 2d10. Cyp2d9 also has two different entries, so there is some confusion. Entries that were separate in February have now been joined, which is a mistake: Cyp2a4 and Cyp2a5 have been joined, and Cyp2d22 and Cyp2d26 have been joined. The old entries for 2a5 and 2d22 have been retired.

The Cyp21 gene resides next to some MHC genes. A large contiguous sequence including Cyp21 has been deposited in GenBank, so all of these genes come up in the search. They include complement components C2, C4 and B-factor. Conversely, some P450s are not called P450s at all, such as thromboxane A2 synthase (Cyp5) or sterol 26-hydroxylase. A text search for P450 misses these.

Below are all the mouse P450s I could find in UNIGENE, sorted by their correct Cyp name. There are 64 P450 genes and 46 UNIGENE entries included here.

WARNING! This table was compiled on Feb. 12. On Feb. 15 a new build of the UNIGENE mouse database was made, and about 14 of the Mm. identifiers listed here were retired with no link to the newer identifier. I had to spend a couple of hours tracking down these now-defunct numbers and replacing them with the new ones. This will probably happen every month, and I may not be able to revise the table that often. To help you find the new identifiers, I recommend searching UNIGENE for one of the GenBank or EST accession numbers from that set. I am including one accession number for each identifier to make this possible. Personally, I think retiring numbers without links is a very poor approach that needs to be fixed.

Note: by 3/29/99 a 3a11, 3a16 UNIGENE cluster was split up into two separate entries, as it should be.
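The accession-number lookup recommended above can be sketched in a few lines of Python. This is only an illustration, not a tool that exists for this page: the `Entry` record, the `refresh_identifiers` function, and the accession-to-cluster map are all hypothetical, and the map is assumed to have been filled in by hand after searching UNIGENE for each accession. The identifier "Mm.99001" is a made-up placeholder, not a real cluster.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    gene: str        # Cyp name, e.g. "Cyp1a1"
    unigene_id: str  # Mm. identifier; may be retired in a later build
    accession: str   # GenBank or EST accession, used as the stable key


# Three rows taken from the table on this page.
TABLE = [
    Entry("Cyp1a1", "Mm.14089", "X01681"),
    Entry("Cyp1a2", "Mm.15537", "X00479"),
    Entry("Cyp1b1", "Mm.4443", "U03283"),
]


def refresh_identifiers(table, accession_to_cluster):
    """Update Mm. identifiers from an accession -> cluster map.

    Entries whose accession is missing from the map keep their old
    identifier and are reported so they can be checked by hand.
    """
    updated, missing = [], []
    for entry in table:
        new_id = accession_to_cluster.get(entry.accession)
        if new_id is None:
            missing.append(entry.gene)
            updated.append(entry)
        else:
            updated.append(Entry(entry.gene, new_id, entry.accession))
    return updated, missing


# Hypothetical lookup results for a new build ("Mm.99001" is invented).
new_build = {"X01681": "Mm.99001", "X00479": "Mm.15537"}
updated, missing = refresh_identifiers(TABLE, new_build)
for entry in updated:
    print(entry.gene, entry.unigene_id, entry.accession)
print("check by hand:", missing)
```

Keying on the accession rather than the Mm. number is the design choice here: the accession does not change between builds, so a retired identifier can be replaced without losing track of the gene.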
A second 2d9 entry was added (Mm.34822), and the earlier 2d9, 2d11 UNIGENE cluster was modified so it now contains 2d9 and 2d10. In the future I hope this cluster will be broken into 2d9 and 2d10 entries.

Gene            UNIGENE             GenBank / EST accessions and notes

Cyp1a1          Mm.14089            genbank X01681
Cyp1a2          Mm.15537            genbank X00479
Cyp1b1          Mm.4443             genbank U03283
Cyp2a4          Mm.14781            genbank J03549
                                    note 2a4 and 2a5 have been joined in one entry since the Feb build.
Cyp2a5          Mm.4112 (retired)   genbank X89864
Cyp2a12         Mm.20770            genbank L06463
Cyp2b9          Mm.876              genbank M60273
Cyp2b10         Mm.14177            genbank M21856
Cyp2b13         Mm.14413            genbank M60358
Cyp2b19         Mm.14098            genbank AF047529
Cyp2b20         no UNIGENE entry    genbank X99715 no ESTs
                                    fusion of 2b10 and P450 reductase
Cyp2c29         Mm.20764            genbank D17674
Cyp2c37         Mm.28533            genbank AF047542
                                    ESTs AI255551 AI255553, AI315987, AI119584
Cyp2c38         no UNIGENE entry    genbank AF047725 no ESTs
Cyp2c39         no UNIGENE entry    genbank AF047726 EST AA882179
Cyp2c40         Mm.29973            genbank AF047727 86 ESTs follow AI255979
Cyp2c44         Mm.26457            23 ESTs follow AI317668
Cyp2d9, 10      Mm.3164             genbank M23998
Cyp2d10,11,22   Mm.27803            genbank M24263
Cyp2d12         no UNIGENE entry    EST AA114047
Cyp2d13         no UNIGENE entry    no ESTs no genbank number, may be an erroneous sequence
Cyp2d26         Mm.29064            68 ESTs follow AA982292
Cyp2e1          Mm.13020            genbank L11650
Cyp2f2          Mm.4515             genbank M77497
Cyp2g1          no UNIGENE entry    not cloned in mouse yet
Cyp2j5          Mm.12838            genbank U62294
Cyp2j6          Mm.6477             genbank U62295
Cyp2j7          seq. confidential
Cyp2j8          seq. confidential
Cyp2j9          seq. confidential
Cyp2r           no UNIGENE entry    only known in humans so far
Cyp2s1          CODING REGION       ESTs AA562979, AA543966, AA967201, AA472776
Cyp2s1 3'UTR    Mm.23710            15 ESTs from the 3 prime untranslated region
Cyp3a11         Mm.21193            genbank X60452
Cyp3a13         Mm.4094             genbank X63023
Cyp3a16         Mm.30303            genbank D26137 no ESTs
Cyp4a10         Mm.10742            genbank AB018421
Cyp4a12         no UNIGENE entry    genbank X71479, Y10222
                                    EST AA882070 vx36a01.r1
Cyp4a14         Mm.7459             genbank Y11638
Cyp4b1          Mm.1840             genbank D50834
Cyp4f13         Mm.22045            15 ESTs follow AI323999
Cyp4f14         Mm.10976            C-term only. 3 ESTs follow AI046446
                                    The N-terminal is not in UNIGENE but consists of
                                    AI098768, AI047403, AI099192, AA237519, AA793220,
                                    AA869706, AI118365 (chimeric clone), AA880205
Cyp4f15         Mm.26539            1 EST AI121221
                                    no UNIGENE entry for N-terminal ESTs
                                    AI315840 uj47c01.y1 AI046615 ud70e01.y1, AI048594 ud61g02.y1
                                    Mm.26539 AI121221 matches C-term of Cyp4f13, Cyp4f14
                                    and Cyp4f15 exactly, but its opposite end is really Cyp4f15.
                                    This may indicate gene conversion occurred among these
                                    three genes.
Cyp4f16         no UNIGENE entry    ESTs AA059667 mj77a08.r1 AA863889 vx15d11.r1,
                                    AA673923 vo83a09.r1 AA798644 vw34b12.r1, AA896484 vx63d12.r1
Cyp4f16 frag.a  no UNIGENE entry    ESTs AA671839 vl02c09.r1, AA755081 vu02b08.r1
Cyp4f16 frag.c  Mm.30504            ESTs AA688499 vq53e04.r1 AA791143 vv91d10.r1,
                                    AI036012 vz68a09.r1 AA794692 vu63c06.r1
Cyp4f17         no UNIGENE entry    ESTs AA387122 vc22e06.r1 AA386914 vc23b12.r1,
                                    AA387149 vc22f06.r1
Cyp4f18 frag.b  no UNIGENE entry    EST AA716998 vu61b04.r1
Cyp4f18 frag.d  no UNIGENE entry    EST AI006742 ua82h10.r1
Cyp4x           no UNIGENE entry    only known in humans so far
Cyp4z           no UNIGENE entry    only known in humans so far
Cyp5a1          Mm.4054             genbank L18868
Cyp7a1          no UNIGENE entry    genbank L23754 EST AA254999
Cyp7b1          Mm.4781             genbank U36993
Cyp8a1          Mm.2339             genbank AB001607
Cyp8b1          Mm.20889            genbank AF090317
Cyp11a1         Mm.28748            genbank J05511
                                    ESTs AA288657 AA107913, AA288671, AA684394, AA920968,
                                    AI324751, AA920915, C85519, C87334, D19277, AU021754,
                                    C87804, C85447, C87406, AA389569, C88026, AU108054,
                                    C86435, C86758, W18146
Cyp11b1         no UNIGENE entry    genbank J04451 no ESTs
Cyp11b2         no UNIGENE entry    genbank S85260 no ESTs
Cyp17           Mm.1262             genbank M64863
Cyp19           Mm.5199             genbank D00659
Cyp21           Mm.18845            genbank AF049850
Cyp24           Mm.6575             genbank D89669
Cyp26           no UNIGENE entry    genbank Y12657 EST AA239785
Cyp27a1         Mm.26793            8 ESTs follow AI286988
Cyp27b1         Mm.6216             genbank AB006034
Cyp39           no UNIGENE entry    ESTs AA272844, AA606237
Cyp46           no UNIGENE entry    ESTs AA096922, R75217
Cyp51           Mm.24155            ESTs AA123049, AI007126, W58959, AA144311, AA288897,
                                    AA118292, AA537618, AA980997, AA107943, AI314330, AA107043,
                                    AI226612, AA105775, AA105777, AA288896, AA162461, AA755063

Cyp4f mouse sequences

Cyp4f13 1-341
89% identical to rat CYP4F6
complete sequence known but confidential (Henry Strobel)
AI304244 ui64h08.y1
AA277579 va77c04.r1
AA238458 my34d11.r1
AA277631 va81d08.r1
AI265156 uj03h07.y1
AI120320 ub69c04.r1
AI327511 mb31b06.y1
AA245188 mx09e08.r1
AA799238 vx69h12.r1
AA797204 vw22d08.r1
CONSENSUS
MLQLCLSWLGMGSLTASPWHLLLLGGASWILARILAWIYAFYDNCSRLRCFPQPPKPSWF
WGHLALMKNNEESMQFITHLGHDFHDVHLSWVGPVYPILRLVHPNFIAPLLQASAAVAPKEMTLYGFLKP
WLGDGLLMSAGDKWSHHRRLLTPAFHFDILKSYVKIFNKSVNIMHAKWQCLASKGTSRLD
MFEHISLMTLDSLQKCIFSVDSNCQESDSKYIAAILELSSLVVKRHRQP
FLYLDLLYYLTADGRRFRKACDLVHNFTDAVIKERRSTLNTQGVEFLKAKAKTKTLD
FIDVLLMAEDEHGKGLSNEDIRAEADTFMFGGHDTTTSALSWILYNXXXXXXXXXXXXX

Cyp4f13 356-524
AI196488 ui64h08.x1 355-524
AI196852 ui67e09.x1 363-524
AA511038 vh55g04.r1
AI323999 mb31b06.x1
W64560 md70f05.r1
CONSENSUS
EVQELLRDRDSEEIEWDDLAQLPFLTMCIKESLRLHPPVLLISRCCTQDVLLPDGRAI
PKGNICVISIFGVHHNPSVWPDPEVYNPFRFDPENPQKRSPLAFIPFSAGTRNCIGQTFA
MSEIKVALALTLLRFRILPDDKEPRRKPELILRAEGGLWLRVEPLSAGAQ*

Cyp4f14 1-370
92% identical to rat CYP4F1 probable ortholog
complete sequence known but confidential (Henry Strobel)
AI098768 ue38c01.y1
AI047403 ud65h03.y1 1-207
AI099192 ue40b11.y1 1-183
AA237519 mx10h12.r1
AA793220 vl85g08.r1
AA869706 vq44d03.r1
AI118365 ue40b11.x1 238-369 chimeric clone contains a-L-Fucosidase
AA880205 vx39b01.r1
consensus
MSQLSLSWLGLGPEVAFPWKTLLLLGASWILARILIQIYAAYRNYRHLHGFPQPPKRNWL
MGHVGMVTPTEQGLKELTRLVGTYPQGFLMWIGPMVPVITLCHSDIVRSILNASAAVALK
DVIFYSILKPWLGDGLLVSAGDKWSRHRRMLTPAFHFNILKPYVKIFNDSTNIMHAKWQR
VISDGSAPLDMFEHVSLMTLDSLQKCVVFSFDSNCQEKSSEYIAAILELSALVAK
KDQQPLMFMDLLYNLTPDGMRFRKACNVVHEFTDAVIRERHRTLPDQGLDDFLKSKAKSKTLD
FIDVLLLSKDEDGKELSDEDIRAEADTFMFEGHDTTASGLSWILYNLARHPEYQERCRQE
VQELLRGREPEE
gap of 69 amino acids

Cyp4f14 444-524
AI046446 ud65h03.x1 C-term
AA274495 vb07d01.r1 C-term
VYDXFRFDPENIKDSSPLAFIPFSAGPRNCIGQTFAMSEMKVALALTLLRFRLLPDDKEPRRQPELILRAEGGLWLRVE
PLSAGAH*

Cyp4f15
92% identical to rat CYP4F4 probable ortholog
complete sequence known but confidential (Henry Strobel)
AI315840 uj47c01.y1 1-201
AI046615 ud70e01.y1
AI048594 ud61g02.y1
MPQLDLSWLGLRLEASSPWLLLLLIGASWLLARVLTQTYIFYRTYHHLCDFPQPPKWNWF
LGHLGMITPTEHGLKEVTNLVATYPQGFMTWLGPIIPIITLCHPDIIRSVLNASASVALK
EVVFYSFLKPWLGDGLLLSDGDKWSSHRRMLTPAFHFNILKPYVKIFNDSTNIMHAKWQH
LALGGSARLDVFENISLMTLD

AI048594 ud61g02.y1 407-480
AI121221 ud70e01.x1 C-term
This fragment also is an exact match to 4f13 and 4f14.
There may have been gene conversion at the C-terminal of these three genes.
LLPDGRVIPKGVICIINIFATHHNPTVWPDPEVYDPFRFDPENIKDRSPLAFIPFSAGPR
NCIGQTFAMNEMXVXXXXXXXXXXXXPDDKEPRRKPELILRAEGGLWLRVEPLSTQ

Cyp4f16
92% identical to rat CYP4F5
Complete sequence is known but confidential (Henry Strobel)
AA059667 mj77a08.r1
AA863889 vx15d11.r1
AA673923 vo83a09.r1
AA798644 vw34b12.r1
AA896484 vx63d12.r1
MLRLSVSGLDLGSVVTSSWHLLLLGVASWILARILAWTYSFYENCSRLSCFPQPPKKNWF
SGHLGMIQSNEEGMQLVTEMGQTFQDVHLFWLGPVIPVLRIVDPAFVAPLLQAPALVAPKDMTFLRFLKPWL
GDGLFLSSGDKWSRHRRLLTPAFHFDILKPYVKIFNQSVNIMHAKWKHLSSEGSARLEMF
EHISLMTLDSLQKCLFGFDSNCQESPSEYISAILELSSXXXXXXXXXXXXXXX

Cyp4f fragment a
93% identical to 4F5, 88% to 4F6, 86% to 4F1, 83% to 4F4
This fragment belongs to Cyp4f16
AA671839 vl02c09.r1
AA755081 vu02b08.r1
LYYHTADGRRFRKACDLVHNFTDAVIRERRHTLSSQNHDEFLKSKTKSKTLDFIDVL
LLAKDEHGKELSDEDIRAEADTFMFGGHDTTASALSWILYNLARHPEYQERCRQRVQE
LLRDREPEEIEWDDLAQLPFLTMCIKESLRLHSPVIDXXXXXXX

Cyp4f fragment c
91% identical to 4F6, 86% to 4F1 and 4F5, 85% to 4F4
This fragment is part of Cyp4f16
AA688499 vq53e04.r1
AA791143 vv91d10.r1
AI036012 vz68a09.r1
AA794692 vu63c06.r1
DIVLXXXXXXPKGNICVISIFGIHHNPSVWPDPEXXX
PFRFDPENPQKRSPLAFIPFSAGPRNCIGQTFAMSEM
KVALALTLLRFRILPDDKEPRRKPEIILRAEGGLWLRVEPLSKGAQ*

Cyp4f17
66% to CYP4F6 but 81% identical to a rat EST AI030199
The rat homolog to this gene has not been cloned yet.
AA387122 vc22e06.r1
AA386914 vc23b12.r1
AA387149 vc22f06.r1
MLQLGLSWLGLGPGAAFPWQLLQLVGASLFLARILTWICAFYDNYCRLRCFPEPPSRHWF
WGHMSMVKNNEEGLQLLTERSHQFHDVHLCWIGPFYPILRLIHPKFIGSILQASAAVAPK
EMIFYGFLKPWLGDGLLVSAGEKWSRH

AI030199 probable rat ortholog to 4f17
4f17  53 EPPSRHWFWGHMSMVKNNEEGLQLLTERSHQFHDVHLCWIGPFYPILRLIHPKFIGSILQA 112
         +P SRHWFWGH+++VKNNEEGLQLL E SHQF D+HLCWIG FYPILRLIHPKFIG ILQA
rat  431 QPLSRHWFWGHLNLVKNNEEGLQLLAEMSHQFQDIHLCWIGIFYPILRLIHPKFIGPILQA 252

Cyp4f18 fragments b and d
90% identical to 4F6, 89% to 4F4, 88% to 4F5, 87% to 4F1
complete sequence known but confidential (Henry Strobel)
AI006742 ua82h10.r1
PKRNWILGHLGLIQSSEEGLLYIQSLVRTFRDACCWWVGPLHPVIRIFHPAFIKXVVLA
AA716998 vu61b04.r1
LLSKDEHGKALSDEDIRAEADTFMFGGHDTTASGLSWILYNLARHPEYQERCRQEVRD
LLRDREPEEIEWDDLAQLPFLTMCIKESLRLHPPVTAISRCCTQDIVLPDGRVIPKGVIS
RISIFGTHHNPAVWPDPEV

Mm.29064 Cyp2d26 consensus
MGLLVGDDLWAVVIFTAIFLLLVDLVHRRQRWTACYPPGPVPFPGLGNLLQVDFENIPYS
FYKLQNRYGNVFSLQMAWKPVVVVNGLKAVRELLVTYGEDTSDRPLMPIYNHIGYGHKSK
GVILAPYGPEWREQRRFSVSTLRDFGLSKKSLEQWVTEEAGHLCDAFTKEAEH
PFNPSPLLSKAVSNVIASLIYARRFEYEDPFFNR
MLKTLKESLGEDTGFVGEVLNAIPMLLHIPGLPDKAFPKLNSFIALVNKMLIEHDL
TWDPAQPPRDLTDAFLAEVEKAKGNPESSFNDKNLRIVVIDLFMAGMVTTSTTLSWALLL
MILHPDVQRRVHQEIDEVIGHVRHPEMADQARMPYTNAVIHEVQRFADIVPTNLPHMTSR
DIKFQDFFIPKGTTLIPNLSSVLKDETVWEKPLRFYPEHFLDAQGHFVKHEAFMPFSAGR
RSCLGEPLARMELFLFFTCLLQRFSFSVPDGQPRPSDYGIYTMPVTPEPYQLCAVAR
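
The percent identity figures quoted for the Cyp4f fragments can be checked against an alignment with a short Python sketch. This is only an illustration: the function name `identity` is hypothetical, the two sequences are the aligned 4f17 and rat AI030199 segments given above, and skipping positions marked X (unknown residue) is an assumption of this sketch, not necessarily how the figures on this page were computed.

```python
def identity(a: str, b: str, unknown: str = "X") -> tuple[int, int]:
    """Count identical residues between two pre-aligned sequences.

    Both sequences must already be aligned to the same length.
    Positions where either sequence carries the unknown-residue
    symbol are skipped (an assumption of this sketch).
    Returns (identical, compared).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    identical = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if unknown in (x, y):
            continue
        compared += 1
        identical += x == y
    return identical, compared


# Aligned segments copied from the 4f17 / rat EST AI030199 alignment.
MOUSE_4F17 = "EPPSRHWFWGHMSMVKNNEEGLQLLTERSHQFHDVHLCWIGPFYPILRLIHPKFIGSILQA"
RAT_EST = "QPLSRHWFWGHLNLVKNNEEGLQLLAEMSHQFQDIHLCWIGIFYPILRLIHPKFIGPILQA"

same, total = identity(MOUSE_4F17, RAT_EST)
print(f"{same}/{total} identical = {100 * same / total:.1f}%")
```

The result can be compared with the 81% identity to the rat EST quoted for Cyp4f17. The same function can be applied to any other pair of fragments once they have been aligned to equal length.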