Decision forests have attracted the academic community's interest mainly because of their simplicity and transparency. This paper proposes two novel decision forest building techniques, called Maximal Information Coefficient Forest (MICF) and Pearson's Correlation Coefficient Forest (PCCF). The proposed algorithms use Pearson's Correlation Coefficient (PCC) and the Maximal Information Coefficient (MIC) as extra measures of the classification capacity score of each feature. Using these measures, we improve the selection of the most suitable feature at each splitting node, that is, the feature with the greatest Gain Ratio. We conduct experiments on 12 datasets from the publicly accessible UCI Machine Learning Repository. Our experimental results indicate that the proposed methods achieve the best average ensemble accuracy ranks, 1.3 for MICF and 3.0 for PCCF, compared with their closest competitor, Random Forest (RF), which has an average rank of 4.3. Additionally, the results of the Friedman and Bonferroni-Dunn tests indicate that the improvement is statistically significant.
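To make the two kinds of feature score mentioned in the abstract concrete, the following is a minimal sketch in Python of the standard Gain Ratio for a categorical feature and of Pearson's Correlation Coefficient between a numeric feature and a numerically encoded class label. The function names (`entropy`, `gain_ratio`, `pearson`) and the toy data are illustrative assumptions, not taken from the paper. The abstract does not state how MICF and PCCF combine these scores, so the sketch only computes them side by side and does not attempt to reproduce the proposed algorithms.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Standard Gain Ratio: information gain divided by split information."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def pearson(xs, ys):
    """Pearson's Correlation Coefficient between two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # Return 0.0 when either sequence is constant (correlation undefined).
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Toy data (hypothetical): one categorical and one numeric feature.
labels = [0, 0, 1, 1]
colour = ["a", "a", "b", "b"]   # separates the classes perfectly
size = [1.0, 2.0, 3.0, 4.0]     # increases with the class label

gr_colour = gain_ratio(colour, labels)
pcc_size = abs(pearson(size, labels))
```

In this sketch a higher value of either score suggests a feature that separates the classes better. Any rule for combining such a correlation score with the Gain Ratio at a splitting node would be an additional assumption beyond what the abstract states.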