Bao XH et al / Acta Pharmacol Sin 2003 May; 24 (5): 472-476
BAO Xin-Hua2, LU Wen-Cong, LIU Liang, CHEN Nian-Yi
Department of Chemistry, School of Sciences, Shanghai University, Shanghai 200436, China
1 Project supported by Ford-China Foundation, No 9716214.
2 Correspondence to Prof BAO Xin-Hua. Phn 86-21-6613-3513. Fax 86-21-6613-2797. E-mail ly046474@online.sh.cn
Received 2002-03-25 Accepted 2002-09-27
KEY WORDS guanidines; structure-activity relationship; sodium-hydrogen antiporter; pattern recognition; hyper-polyhedron models
ABSTRACT
AIM: To investigate the structure-activity relationships of N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines as Na/H exchange inhibitors and to explore a new method of computer-aided molecular screening. METHODS: The hyper-polyhedron model (HPM), proposed in our laboratory, was applied. RESULTS: Samples likely to have high activity were identified as those whose representing points fell within the hyper-polyhedron region in which all known samples with high activities were distributed. The predictive ability of the available methods was tested by cross-validation. CONCLUSION: For the data set used here, the accuracy of molecular screening of N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines by HPM was much higher than that obtained by principal component analysis (PCA) and Fisher methods. Therefore, HPM could be used as a powerful tool for screening new compounds that are likely to have high activity.
INTRODUCTION
The Na/H exchanger is a major Na+ entry pathway in many types of cells and plays an important role in the regulation of cell volume and ion concentration. It is rapidly activated at post-ischemic reperfusion and causes a Ca2+ overload, which is known to be associated with cellular dysfunction, damage, and necrosis. Therefore, Na/H exchange inhibitors are potentially useful candidates for ameliorating ischemia-reperfusion-induced injury. N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines are known to be Na/H exchange inhibitors[1].
With the increasing demand for research of structure-activity relationship (SAR), the number of multivariate methods introduced into the literature for SAR has increased[2-7].
In recent years, we have applied several chemometric methods to drug design, materials design, and industrial optimization[8-10]. We have also found that a real-world data set sometimes exhibits so much noise that quantitative models, such as linear or nonlinear regression and back-propagation artificial neural networks (BP ANN), cannot represent it appropriately. When realistic quantitative models are not available, semi-quantitative and qualitative pattern recognition methods are useful for classifying the different kinds of samples in a data set. The pattern recognition methods can be divided into 2 classes[6-7,11-12]: 1) linear methods, eg, principal component analysis (PCA) and the Fisher vector; 2) non-linear methods, eg, non-linear mapping (NLM) and Kohonen's self-organizing map (KSOM).
However, a real-world data set is sometimes so complicated that even the available qualitative methods cannot give satisfactory solutions. The data set used in this paper is one such example. It is therefore a meaningful task to obtain a realistic mathematical model for computer prediction of new compounds with high biological activities on the basis of such a data set.
In this paper, the hyper-polyhedron model (HPM) is proposed for computer-aided molecular screening of N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines as Na/H exchange inhibitors. For this data set it appeared to be a more powerful tool for classifying different kinds of samples than traditional pattern recognition methods. A software package, "Hyper Miner", has been built on a series of new computation methods including HPM.
MATERIALS AND METHODS
HPM was a new pattern recognition method for the classification of samples. In this method, no projection map was made. Instead, a series of hyper-planes was computed to create a hyper-polyhedron that enclosed all samples with known high biological activities. The characteristic parameters relating to the biological activities of the compounds were used to span the multi-dimensional space, and the representative points of the compounds were plotted in this space. If the representative points of compounds with high biological activities (class "1" samples in Fig 1) were located in a definite zone (the optimal zone) of the space, HPM could be used to describe that zone. HPM could be expressed by a series of inequalities describing the boundaries between the two kinds of compounds. Therefore, the inequalities could be used as the criterion for compounds with high activities. Since data separation took place directly in the multi-dimensional space, the results of separation were usually very good, provided the different classes were well separated in the original multi-dimensional space. A conceptual hyper-polyhedron in three-dimensional space is shown in Fig 1.
Fig 1. Conceptual hyper-polyhedron in three-dimensional space. 1: samples with high biological activities; 2: samples with low biological activities.
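The idea of enclosing the class "1" samples by a set of inequalities can be illustrated with a minimal sketch. The text does not state how the hyper-planes of HPM are computed, so the sketch below makes a simplifying assumption: a set of direction vectors is supplied by the user, and for each direction the lower and upper bounds are taken as the smallest and largest projections of the class "1" samples. The function names, the directions, and the three-dimensional toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# One pair of parallel hyper-planes: lower <= w . x <= upper
Slab = Tuple[Vector, float, float]


def dot(w: Vector, x: Vector) -> float:
    """Inner product of two equally long vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_slabs(class1_points: List[Vector], directions: List[Vector]) -> List[Slab]:
    """For each given direction, bound the projections of all class-1 points.

    Assumption: the directions are supplied by the caller; this is not the
    hyper-plane search used by HPM, which the text does not describe.
    """
    slabs = []
    for w in directions:
        projections = [dot(w, p) for p in class1_points]
        slabs.append((w, min(projections), max(projections)))
    return slabs


def inside(point: Vector, slabs: List[Slab]) -> bool:
    """A point lies in the polyhedron if it satisfies every inequality."""
    return all(lower <= dot(w, point) <= upper for w, lower, upper in slabs)


# Hypothetical toy data in three dimensions (not taken from the paper).
class1 = [(1.0, 1.0, 1.0), (1.2, 0.9, 1.1), (0.9, 1.1, 1.0)]
class2 = [(2.0, 2.0, 2.0), (0.0, 0.1, 0.2)]
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]

slabs = fit_slabs(class1, directions)
for p in class1 + class2:
    print(p, "inside" if inside(p, slabs) else "outside")
```

By construction every class "1" training point satisfies all inequalities; whether class "2" points are excluded depends on the data and on the chosen directions.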
RESULTS
Computation of characteristic parameters of structures Theoretical parameters computed by molecular mechanics and quantum chemistry were used to describe the molecular structures of the N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines. The molecular mechanics program (MM+) with an all-atom force field was used to optimize the configurations of the compounds, and the quantum chemical program (PM3) was used to calculate theoretical parameters of the compounds.
Feature selection The reduced subset of parameters was also obtained by HPM. Data separability criteria were established to select the key parameters that determined the biological activities of the compounds. The rate of separability R was defined as R = 1 - N2/N1, where N1 was the number of samples with high activities and N2 was the number of samples with low activities remaining within the hyper-polyhedron. If R > 90%, the separability was rated "excellent." It was practical and reliable to build an HPM on the basis of excellent separability for all known samples with a reduced subset of parameters. After the separability analysis of the sample set, 4 computed parameters were taken as the features for pattern recognition: HOMO (energy of the highest occupied molecular orbital), LUMO (energy of the lowest unoccupied molecular orbital), D (the longest diameter of the substituent R3), and Q (net charge of the oxygen atom of the oxazine). The factors determining the activities of the compounds might not be limited to these 4 parameters, but they were sufficient to classify the training samples: with these 4 parameters, HPM gave "excellent" separability for the data set. The activity (IC50) of each compound was evaluated by its ability to inhibit platelet swelling induced by sodium propionate. The training data set consisted of the 18 compounds listed in Tab 1[1].
Tab 1. Features and Na/H exchange inhibitory activities of 18 compounds.
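The separability rate R = 1 - N2/N1 defined under Feature selection can be computed as in the short sketch below. The function names and the sample counts are hypothetical and serve only to illustrate the formula and the "excellent" threshold of 90%.

```python
def separability(n_high: int, n_low_inside: int) -> float:
    """Rate of separability R = 1 - N2/N1.

    n_high       -- N1, number of samples with high activity
    n_low_inside -- N2, number of low-activity samples that remain
                    inside the hyper-polyhedron
    """
    if n_high <= 0:
        raise ValueError("N1 (n_high) must be positive")
    if n_low_inside < 0:
        raise ValueError("N2 (n_low_inside) must not be negative")
    return 1.0 - n_low_inside / n_high


def is_excellent(r: float, threshold: float = 0.90) -> bool:
    """Separability is rated 'excellent' when R exceeds 90%."""
    return r > threshold


# Hypothetical counts, not taken from Tab 1.
r_clean = separability(n_high=20, n_low_inside=0)  # no low-activity sample inside
r_mixed = separability(n_high=20, n_low_inside=4)  # four low-activity samples inside
print(r_clean, is_excellent(r_clean))
print(r_mixed, is_excellent(r_mixed))
```

A candidate subset of parameters would be kept only if the polyhedron built from it reaches the "excellent" rating.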
Computation of hyper-polyhedron model In the four-dimensional pattern space spanned by HOMO, LUMO, D, and Q, the samples were divided into two classes according to their activities. Samples with IC50<0.44 mmol/L were grouped as "class 1" and the others as "class 2." The hyper-polyhedron model gave the following series of inequalities as the criterion for "class 1" samples.
-34.7 < 5.16[HOMO] - 5.94[LUMO] + 1.00[D] - 25.6[Q] < -33.6
-3.31 < 0.222[HOMO] + 3.17[LUMO] - 0.052[D] - 6.16[Q] < -2.41
-141.2 < 17.9[HOMO] - 13.4[LUMO] - 0.080[D] - 53.9[Q] < -140.3
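The three inequalities above can be applied to a descriptor vector (HOMO, LUMO, D, Q) as in the following minimal sketch. The coefficients and bounds are copied from the inequalities, with subtracted terms entered as negative weights; the function names and the example vector are hypothetical, and the descriptors must be in the same units as those used to derive the model (the units are not stated in this text).

```python
from typing import List, Sequence, Tuple

# Each row: (lower bound, weights for (HOMO, LUMO, D, Q), upper bound).
# Assumption: every term not joined by "+" in the text is subtracted.
HPM_INEQUALITIES: List[Tuple[float, Tuple[float, float, float, float], float]] = [
    (-34.7, (5.16, -5.94, 1.00, -25.6), -33.6),
    (-3.31, (0.222, 3.17, -0.052, -6.16), -2.41),
    (-141.2, (17.9, -13.4, -0.080, -53.9), -140.3),
]


def scores(descriptors: Sequence[float]) -> List[float]:
    """Evaluate the linear combination in the middle of each inequality."""
    return [
        sum(w * x for w, x in zip(weights, descriptors))
        for _, weights, _ in HPM_INEQUALITIES
    ]


def is_class1(descriptors: Sequence[float]) -> bool:
    """True if the point satisfies all three inequalities (inside the polyhedron)."""
    return all(
        lower < s < upper
        for (lower, _, upper), s in zip(HPM_INEQUALITIES, scores(descriptors))
    )


# Hypothetical descriptor vector (HOMO, LUMO, D, Q); not a compound from Tab 1.
candidate = (-9.0, -1.0, 5.0, -0.3)
print(scores(candidate), is_class1(candidate))
```

A new compound is predicted to be "class 1" only when all three scores fall strictly between their bounds.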
Results of cross-validation To test the predictive ability of the methods available here, the leave-one-out method was used to compare the rates of correct prediction of the HPM, PCA, and Fisher methods.
The projection maps of the PCA and Fisher methods, obtained with the same features as the HPM, are given in Fig 2, 3. The results of the cross-validation (leave-one-out) experiment are summarized in Tab 2. HPM outperformed PCA and Fisher in predictive ability on this data set.
Fig 2. Pattern recognition by PCA method. IC50<0.44 mol/L; ×IC50>0.44 mol/L; A, B: Predicted samples.
Fig 3. Pattern recognition by Fisher method. IC50<0.44 mol/L; ×IC50>0.44 mol/L; A, B: Predicted samples.
Tab 2. Results of cross-validation experiment by the leave-one-out method.
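The leave-one-out procedure itself can be sketched as follows: each sample in turn is withheld, the model is rebuilt from the remaining samples, and the withheld sample is then predicted. Because the construction of the HPM hyper-planes is not described in the text, the sketch substitutes a much simpler stand-in classifier, an axis-aligned bounding box around the class 1 training samples. The function names and the two-dimensional toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Box = List[Tuple[float, float]]  # one (min, max) pair per coordinate


def fit_box(class1_points: List[Point]) -> Box:
    """Stand-in model (not HPM): smallest axis-aligned box around class-1 points."""
    return [(min(column), max(column)) for column in zip(*class1_points)]


def in_box(point: Point, box: Box) -> bool:
    """True if every coordinate lies within the corresponding (min, max) pair."""
    return all(lower <= value <= upper for value, (lower, upper) in zip(point, box))


def leave_one_out_accuracy(points: List[Point], labels: List[int]) -> float:
    """Withhold each sample once, refit on the rest, and predict the withheld one."""
    correct = 0
    for i, (point, label) in enumerate(zip(points, labels)):
        train_class1 = [
            p for j, (p, y) in enumerate(zip(points, labels)) if j != i and y == 1
        ]
        if train_class1:
            predicted = 1 if in_box(point, fit_box(train_class1)) else 2
        else:
            predicted = 2  # no class-1 sample left to build a box from
        correct += int(predicted == label)
    return correct / len(points)


# Hypothetical toy data: six class-1 points, three class-2 points.
points = [
    (1.0, 1.0), (1.2, 1.2), (1.0, 1.2), (1.2, 1.0), (1.1, 1.1), (1.05, 1.15),
    (3.0, 3.0), (0.0, 0.0), (1.1, 1.05),
]
labels = [1, 1, 1, 1, 1, 1, 2, 2, 2]

print(leave_one_out_accuracy(points, labels))
```

In this toy set the last class 2 point lies inside the box and is therefore misclassified, which shows how a withheld sample can expose a weakness that would be invisible if the model were tested only on its own training data.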
Application of HPM Using the series of inequalities obtained by HPM, the activity of any new N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidine compound could be predicted. Two examples demonstrate the application of HPM. The new sample labeled A (Fig 2, 3) fell inside the hyper-polyhedron defined by the above inequalities, which describe the boundaries between the two kinds of samples, while the sample labeled B (Fig 2, 3) fell outside the hyper-polyhedron. Sample A was therefore predicted to be a compound with high activity and sample B a compound with low activity. These predictions agreed with the experimental results listed in Tab 3.
Tab 3. Predicted results by hyper-polyhedron model.

DISCUSSION
Generally speaking, molecular structures can be described by various parameters, including electronic, geometric, and hydrophobic parameters. Some parameters are obtained theoretically from molecular mechanics and quantum chemistry computations; others are obtained from experimental results. Because the number of parameters is large, it is very difficult to select and use the available parameters to describe the molecular structure. The main difficulty is how to choose the right set, or a reduced subset, of parameters that correlates the biological activity well with the structure of the drug molecule. The benefit of feature selection by HPM is that the reduced subset of parameters directly represents the activity response of the compounds, so satisfactory modeling results can be obtained on the basis of the feature reduction.
To investigate the SAR of the compounds, a multi-dimensional space spanned by 4 effective descriptors (HOMO, LUMO, D, and Q) was used to represent the different kinds of samples. Because HOMO and LUMO were important factors affecting the activity of a compound, they appeared in the series of inequalities of HPM. As for the geometric factors, the effective diameters of the 3 substituents R1, R2, and R3 were each calculated before HPM was applied to the sample set. However, the HPM results indicated that R3, represented by descriptor D, was related to inhibitory activity, whereas the substituents R1 and R2 at the 2-position of the 2H-benzo[1,4]oxazine ring were not significant. Finally, the inhibitory activity was affected by descriptor Q, which reflected the combined inductive, resonance, and hydrogen-bond effects.
The cross-validation experiment compared the performance of HPM with that of 2 other pattern recognition methods, PCA and Fisher. PCA and Fisher are 2 traditional pattern recognition methods widely used in studies of the structure-activity relationship (SAR) of drugs. Although they are very useful tools and have been successful in many SAR studies, the PCA and Fisher methods could not build a satisfactory model for the data set used here (Fig 2, 3). As Tab 2 shows, the predictive ability of HPM exceeded that of PCA and Fisher for this data set. Of course, no single method or paradigm is uniformly superior; nevertheless, HPM is a viable option for SAR research.
In conclusion, HPM could be used to screen for new compounds that are likely to have high activity. Such compounds were identified as those whose representing points lay in the hyper-polyhedron region formed by HPM, where all known samples with high activities were distributed. HPM is also a useful tool for feature selection. Computer prediction of highly active N-(3-Oxo-3,4-dihydro-2H-benzo[1,4]oxazine-6-carbonyl) guanidines is feasible with HPM, whereas the PCA and Fisher models could not provide satisfactory results for the data set used here.
REFERENCES
-opioid receptor. Acta Pharmacol Sin 2000; 21: 46-54.
mRNA and QSAR analysis. Acta Pharmacol Sin 2000; 21: 80-6.