Comparison of Crisp and Fuzzy Classification Trees Using Chi-Squared Impurity Measure on Simulated Data
Abstract
Classification trees are among the most popular choices in classification and discriminant analysis. One chief reason is that they are distribution-free methods. Recently, with the introduction of fuzzy theory, fuzzy classification trees have been gaining popularity. In this paper we use Pearson’s chi-squared impurity measure to compare the performance of crisp and fuzzy classification trees on simulated data. The data consisted of two sets of observations from multivariate normal distributions. The first set came from two 3-variate normal populations with different mean vectors and a common dispersion matrix. From each of the two populations, 5000 samples were generated; 1000 of the 5000 were used to build the trees, and the remaining 4000 from each population were used to test them. The second set came from three 4-variate normal populations with different mean vectors and a common dispersion matrix, and a sampling and testing procedure similar to that for the first set was employed. Computations were implemented in the R statistical package. Pearson’s chi-squared statistic for testing homogeneity in contingency tables showed that the fuzzy classification tree algorithm makes the two subnodes more heterogeneous than the crisp classification tree algorithm does. Consequently, fuzzy classification trees allocated observations to the correct population with fewer errors than crisp classification trees did.
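As a rough illustration of how Pearson’s chi-squared statistic can score a candidate split, the following Python sketch cross-tabulates subnode against class label and computes the statistic for a crisp split. It is not the authors' R code: the helper names `pearson_chi2` and `split_table`, the single simulated variable, the unit variance, the class means of 0 and 2, and the thresholds are all assumptions made for illustration, and the fuzzy variant is not shown.

```python
import random


def pearson_chi2(table):
    """Pearson chi-squared statistic for an r x c table of non-negative counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows/columns to avoid division by zero
                stat += (observed - expected) ** 2 / expected
    return stat


def split_table(xs, labels, threshold, n_classes):
    """Cross-tabulate subnode (x <= threshold vs. x > threshold) against class."""
    table = [[0] * n_classes for _ in range(2)]
    for x, y in zip(xs, labels):
        table[0 if x <= threshold else 1][y] += 1
    return table


# Hypothetical training data: one normal variable per class, 1000 draws each,
# with assumed means 0 and 2 and a common unit variance.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
xs += [rng.gauss(2.0, 1.0) for _ in range(1000)]
labels = [0] * 1000 + [1] * 1000

# A threshold midway between the means should separate the classes better
# (larger statistic) than a threshold far out in one tail.
chi2_mid = pearson_chi2(split_table(xs, labels, 1.0, 2))
chi2_tail = pearson_chi2(split_table(xs, labels, -1.0, 2))
print(chi2_mid, chi2_tail)
```

Under this reading, a larger statistic means the two subnodes differ more in class composition, so a tree builder would prefer the split with the larger value.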
How to Cite
Muchai, E., Odongo, L., & Kahiri, J. (2018). Comparison of Crisp and Fuzzy Classification Trees Using Chi-Squared Impurity Measure on Simulated Data. European Scientific Journal, ESJ, 14(35), 351. https://doi.org/10.19044/esj.2018.v14n35p351