August 19, 1998
A note on P450 Clans: Higher order structure in P450 trees.

	The P450 nomenclature has been flooded with new families over the last few years.  
Currently, there are about 150 families and there is a backlog of new sequences from 
Mycobacterium and some other bacteria that still need to be named.  The large number of 
families is beginning to make the nomenclature cumbersome.  One way to alleviate some of 
this is to recognize that the families naturally group into higher order clusters.  Naming 
these clusters would automatically reduce the need to keep abreast of all the P450 families.  
I have looked at the P450s with this in mind, and have begun the process of naming 
consistent clusters of P450s.  

	The term chosen for these clusters is clan.  A P450 clan consists of families of 
P450s that clearly belong together based on many trees that have been constructed over 
time, by me and others.  As a quantitative measure, these clusters should have high 
bootstrap values.  Perhaps a fixed bootstrap value could serve as a cutoff, analogous to 
the (approximately) 40% sequence identity cutoff for membership in a given family.  I do not know yet what 
this value might be and I am open to suggestions.  For the moment, I would like to list 
some very obvious clans and go on from there.  More families can be added to the clans 
later if an agreement can be reached about the cutoff value.
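
	As a minimal sketch of how a fixed bootstrap cutoff might be applied once one is agreed 
on, the Python below keeps only those candidate clusters whose bootstrap support meets the 
cutoff.  The function name, the input layout, the placeholder family labels and the 70% 
figure are all assumptions made for illustration; no cutoff value has been chosen.

```python
def clusters_passing_cutoff(candidates, cutoff):
    """Return the candidate clusters whose bootstrap support meets the cutoff.

    candidates: list of (set_of_family_labels, bootstrap_percent) tuples.
    cutoff: minimum bootstrap percentage (hypothetical; not yet agreed on).
    """
    return [sorted(families) for families, support in candidates
            if support >= cutoff]


# Placeholder labels and support values, not real results.
example = [({"A", "B"}, 95.0), ({"C", "D"}, 60.0)]
print(clusters_passing_cutoff(example, cutoff=70.0))
```

Under this scheme a cluster falling below the cutoff is simply left unassigned until more 
trees are available, which matches the conservative approach taken in this note.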

	The CYP2 clan contains the CYP2 family as well as the CYP1, 17, 18 and 21 
families.  Generally the clan will be named for the CYP family with the lowest CYP 
number in the cluster, or for the most prominent family in the cluster.  In this case, the 
CYP2 family is dominant.  CYP22 from C. elegans is the next closest P450 to this cluster and it 
may belong in the clan, but for now I will use the more conservative set given above.

	The CYP3 clan contains CYP3, 5, 6, 9 and 30.  Later it might be extended to 
include CYP25 and CYP13, both C. elegans families.  A bootstrap analysis will be 
necessary.  

	The CYP4 clan includes a large number of sequences from mammals, insects and C. 
elegans.  Several people have written to me regarding the similarity of the C. elegans 
sequences to the CYP4 family and expressed the opinion that they should be included in 
that family.  The CYP4 clan provides an alternative that does not break the family 
definition apart.  This clan includes the CYP4, 29, 31, 32 and 37 families.  

	The CYP7 clan includes CYP7 and 8.  

	The mitochondrial clan includes all P450s known to be found in mitochondria and 
the sequences that cluster with them, even if experimental evidence is currently lacking for 
their localization.  This includes the CYP10, 11, 12, 24, 27 and 44 families.  

	Another large clan that is exclusively C. elegans so far is called the C. elegans clan.  
It includes families CYP14, 23, 33, 34, 35 and 36.  There are about 45 sequences in this 
clan.
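
	The clans named so far can be collected into a simple lookup table.  The Python below 
is only a sketch: the family numbers are those listed in this note, while the variable and 
function names are made up for illustration.  Candidate families that still need a 
bootstrap analysis (CYP22, CYP25 and CYP13) are left unassigned.

```python
# Clan membership as listed in this note (August 1998).
CLANS = {
    "CYP2": [1, 2, 17, 18, 21],
    "CYP3": [3, 5, 6, 9, 30],
    "CYP4": [4, 29, 31, 32, 37],
    "CYP7": [7, 8],
    "mitochondrial": [10, 11, 12, 24, 27, 44],
    "C. elegans": [14, 23, 33, 34, 35, 36],
}

# Reverse index: CYP family number -> clan name.
FAMILY_TO_CLAN = {family: clan
                  for clan, families in CLANS.items()
                  for family in families}


def clan_of(family):
    """Return the clan name for a CYP family number, or None if unassigned."""
    return FAMILY_TO_CLAN.get(family)


print(clan_of(17))  # CYP2
print(clan_of(22))  # None (CYP22 is not yet placed in a clan)
```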

	P450 clans in plants and bacteria have not been given names in this system yet.  In 
plants, it has long been recognized that there is a large cluster called the group A P450s.  
This includes the CYP71 family, the first plant P450 to be sequenced.  Following the 
general rule, this cluster might be named the CYP71 clan, but this is an open issue now.

Please email me with your comments and suggestions concerning this new higher order 
nomenclature for P450s.  A publication will appear in Comparative Biochemistry and 
Physiology Part C (August 1998), but I do not have page numbers for this yet.  It is titled 
"Metazoan cytochrome P450 evolution" and is part of a special P450 issue.