Comparison of Crisp and Fuzzy Classification Trees Using Chi-Squared Impurity Measure on Simulated Data

Eunice Muchai; Leo Odongo; James Kahiri

doi:10.19044/esj.2018.v14n35p351

Eunice Muchai, Department of Statistics and Actuarial Science, Kenyatta University, Nairobi, Kenya
Leo Odongo, Department of Statistics and Actuarial Science, Kenyatta University, Nairobi, Kenya
James Kahiri, Department of Statistics and Actuarial Science, Kenyatta University, Nairobi, Kenya

Abstract

Classification trees are among the most popular methods in classification and discriminant analysis, chiefly because they are distribution-free. Recently, with the introduction of fuzzy theory, fuzzy classification trees have been gaining popularity. In this paper we use Pearson’s chi-squared impurity measure to compare the performance of crisp and fuzzy classification trees on simulated data. The data consisted of two sets of observations from multivariate normal distributions. The first set came from two 3-variate normal populations with different mean vectors and a common dispersion matrix. From each of the two populations, 5000 observations were generated; 1000 of these were used to build the trees, and the remaining 4000 from each population were used to test them. The second set came from three 4-variate normal populations with different mean vectors and a common dispersion matrix, and the same sampling and testing procedure was applied. Computations were carried out in the R statistical package. Pearson’s chi-squared statistic for testing homogeneity in contingency tables showed that the fuzzy classification tree algorithm makes the two subnodes more heterogeneous than the crisp classification tree algorithm does. Consequently, fuzzy classification trees allocated observations to the correct population with fewer errors than crisp classification trees did.
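The first simulation design and the chi-squared split criterion described in the abstract can be illustrated with a minimal sketch. The paper's computations were done in R; the Python code below is not the authors' implementation. The mean vectors, the identity dispersion matrix, the random seed, and all function names are assumptions made for illustration, since the abstract does not state them. Only a single crisp split (a one-level tree) is shown; the fuzzy variant is not reproduced here.

```python
import numpy as np

def pearson_chi2(table):
    """Pearson chi-squared statistic for an r x c contingency table."""
    table = np.asarray(table, dtype=float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row @ col / table.sum()
    mask = expected > 0  # skip cells with zero expected count
    return float((((table - expected) ** 2)[mask] / expected[mask]).sum())

def split_table(x, y, threshold, n_classes):
    """Class counts in the left (x <= threshold) and right subnodes."""
    left = x <= threshold
    return np.array([
        np.bincount(y[left], minlength=n_classes),
        np.bincount(y[~left], minlength=n_classes),
    ])

def best_crisp_split(X, y, n_classes):
    """Search all variables and midpoints for the largest chi-squared value."""
    best = (-1.0, None, None)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2:
            stat = pearson_chi2(split_table(X[:, j], y, t, n_classes))
            if stat > best[0]:
                best = (stat, j, t)
    return best

# Assumed parameters (not given in the abstract): means, covariance, seed.
rng = np.random.default_rng(0)
means = [np.zeros(3), np.ones(3)]
cov = np.eye(3)  # common dispersion matrix
n_total, n_train = 5000, 1000

X_train, y_train, X_test, y_test = [], [], [], []
for label, mu in enumerate(means):
    sample = rng.multivariate_normal(mu, cov, size=n_total)
    X_train.append(sample[:n_train])
    X_test.append(sample[n_train:])
    y_train.append(np.full(n_train, label))
    y_test.append(np.full(n_total - n_train, label))
X_train, X_test = np.vstack(X_train), np.vstack(X_test)
y_train, y_test = np.concatenate(y_train), np.concatenate(y_test)

stat, var, thr = best_crisp_split(X_train, y_train, n_classes=2)

# Label each subnode with its majority class and score the held-out data.
majority = split_table(X_train[:, var], y_train, thr, 2).argmax(axis=1)
pred = np.where(X_test[:, var] <= thr, majority[0], majority[1])
test_error = float(np.mean(pred != y_test))
print(f"variable={var}, threshold={thr:.3f}, chi2={stat:.1f}, error={test_error:.3f}")
```

A larger chi-squared value means the class distributions in the two subnodes differ more, which is the sense in which the abstract compares the crisp and fuzzy algorithms.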
