adaptive_boosting_classifier
Adaptive Boosting (aka AdaBoost) classifier using C4.5 decision trees as base learners. Implements the SAMME (Stagewise Additive Modeling using a Multi-class Exponential loss function) variant, which supports multi-class classification by adjusting the weight update formula to account for the number of classes. Builds an ensemble of weighted decision trees where each subsequent tree focuses on the examples misclassified by previous trees: after each iteration, the weights of misclassified examples are increased so that subsequent learners focus more on difficult cases.
The library implements the classifier_protocol defined in the
classification_protocols library. It provides predicates for
learning an ensemble classifier from a dataset, using it to make
predictions (with class probabilities), and exporting it as a list of
predicate clauses or to a file.
Datasets are represented as objects implementing the
dataset_protocol protocol from the classification_protocols
library. See test_files directory for examples.
API documentation
Open the ../../docs/library_index.html#adaptive_boosting_classifier link in a web browser.
Loading
To load all entities in this library, load the loader.lgt file:
| ?- logtalk_load(adaptive_boosting_classifier(loader)).
Testing
To test this library predicates, load the tester.lgt file:
| ?- logtalk_load(adaptive_boosting_classifier(tester)).
Features
Adaptive Boosting: Iteratively trains weighted decision trees, focusing on misclassified examples
SAMME Algorithm: Supports multi-class classification via the SAMME variant of AdaBoost
C4.5 Base Learners: Uses C4.5 decision trees as weak learners for each boosting round
Weighted Voting: Final predictions determined by weighted voting where more accurate learners have higher weight
Probability Estimation: Provides confidence scores based on weighted vote proportions
Early Stopping: Training stops early if a perfect classifier is found or if a base learner is worse than random
Configurable Options: Number of estimators (boosting rounds) configurable via predicate options
Classifier Export: Learned classifiers can be exported as predicate clauses
Options
The following options can be passed to the learn/3 predicate:
number_of_estimators(N): Number of boosting rounds / weak learners (default: 10)
Classifier representation
The learned classifier is represented as a compound term:
ab_classifier(WeightedTrees, ClassValues, Options)
Where:
WeightedTrees: List ofweighted_tree(Alpha, C45Tree, AttributeNames)elementsClassValues: List of possible class valuesOptions: List of options used during learning
When exported using export_to_clauses/4 or export_to_file/4,
this classifier term is serialized directly as the single argument of
the generated predicate clause so that the exported model can be loaded
and reused as-is.
References
Freund, Y. and Schapire, R.E. (1997). “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”. Journal of Computer and System Sciences, 55(1), 119-139.
Zhu, J., Zou, H., Rosset, S., and Hastie, T. (2009). “Multi-class AdaBoost”. Statistics and Its Interface, 2(3), 349-360.
Quinlan, J.R. (1993). “C4.5: Programs for Machine Learning”. Morgan Kaufmann.
Usage
Learning a Classifier
% Learn an AdaBoost classifier with default options (10 estimators)
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier).
...
% Learn with custom options
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier, [number_of_estimators(20)]).
...
Making Predictions
% Predict class for a new instance
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier),
adaptive_boosting_classifier::predict(Classifier, [outlook-sunny, temperature-hot, humidity-high, wind-weak], Class).
Class = no
...
% Get probability distribution from weighted voting
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier),
adaptive_boosting_classifier::predict_probabilities(Classifier, [outlook-overcast, temperature-mild, humidity-normal, wind-weak], Probabilities).
Probabilities = [yes-0.9, no-0.1]
...
Exporting the Classifier
% Export as predicate clauses
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier),
adaptive_boosting_classifier::export_to_clauses(play_tennis, Classifier, my_boost, Clauses).
...
% Export to a file
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier),
adaptive_boosting_classifier::export_to_file(play_tennis, Classifier, my_boost, 'boost.pl').
...
Using a Saved Classifier
% Load and use a previously saved classifier
| ?- logtalk_load('boost.pl'),
my_boost(Classifier),
adaptive_boosting_classifier::predict(Classifier, [outlook-sunny, temperature-cool, humidity-normal, wind-weak], Class).
Class = yes
...
Printing the Classifier
% Print a summary of the AdaBoost classifier
| ?- adaptive_boosting_classifier::learn(play_tennis, Classifier),
adaptive_boosting_classifier::print_classifier(Classifier).
AdaBoost Classifier
===================
Learning options: [number_of_estimators(10)]
Class values: [yes,no]
Number of estimators: 10
Weighted trees:
Estimator 1 (alpha=1.2345, features: [outlook,temperature,humidity,wind]):
-> tree rooted at outlook
...
...