Decision Trees: Entropy and Gini Impurity
A decision tree is a classifier that asks a sequence of yes/no questions about input features, following a path from root to leaf, and returns the prediction at…
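The root-to-leaf walk described above can be sketched in a few lines. This is a minimal illustration, not the series' implementation: the `Node` and `Leaf` classes, the feature names, and the tree itself are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # label returned when the walk ends here


@dataclass
class Node:
    feature: str               # feature the yes/no question is asked about
    yes: Union["Node", Leaf]   # subtree followed when the answer is yes
    no: Union["Node", Leaf]    # subtree followed when the answer is no


def predict(tree, sample):
    """Follow one path from the root to a leaf and return its prediction."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if sample[node.feature] else node.no
    return node.prediction


# Hypothetical tree, made up for illustration only.
tree = Node(
    "employed",
    yes=Node("high_income", yes=Leaf("approve"), no=Leaf("review")),
    no=Leaf("deny"),
)

print(predict(tree, {"employed": True, "high_income": False}))
```

Each prediction costs one comparison per level, so the work is proportional to the depth of the tree rather than to the number of training samples.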
Post 01 showed that splitting on Employed reduces entropy more than splitting on Income. This post builds the full decision tree level by level — computing Info…
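For readers who want the arithmetic behind "reduces entropy more", here is a small sketch of entropy, Gini impurity, and the gain from splitting on a categorical feature. The four rows are invented for illustration and are not the dataset from Post 01; the function names are assumptions too.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(rows, labels, feature, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - weighted


# Invented toy data: "employed" separates the classes, "income" does not.
rows = [
    {"employed": "yes", "income": "high"},
    {"employed": "yes", "income": "low"},
    {"employed": "no", "income": "high"},
    {"employed": "no", "income": "low"},
]
labels = ["repaid", "repaid", "defaulted", "defaulted"]

for feature in ("employed", "income"):
    print(feature, information_gain(rows, labels, feature))
```

Passing `impurity=gini` to `information_gain` scores the same split with Gini impurity instead of entropy.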
The previous posts used categorical features with a finite set of values. For numerical features like sq_ft or age, the split can occur at infinitely many point…
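One common way to make the search finite is to test only the midpoints between consecutive distinct values in the training data, since any threshold between two neighbours produces the same partition. The sketch below assumes that approach and scores each candidate by weighted Gini impurity; the `sq_ft` values and labels are invented, and the post itself may use a different rule.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the purest x <= t split."""
    n = len(labels)
    best = None
    for t in candidate_thresholds(values):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Invented example data.
sq_ft = [800, 950, 1200, 1500, 2000]
price_band = ["low", "low", "low", "high", "high"]

print(candidate_thresholds(sq_ft))
print(best_threshold(sq_ft, price_band))
```

With n distinct values there are at most n - 1 candidates, so the search is linear in the number of distinct values per feature.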
An unconstrained decision tree keeps splitting until every leaf is pure, down to one sample per leaf if necessary. On training data this achieves 100% accuracy, provided no two identical samples carry different labels. On test data…
A classification tree uses entropy or Gini to measure node impurity. A regression tree uses variance — the same algorithmic skeleton, a different objective. At…
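To make "same skeleton, different objective" concrete, here is a sketch of a regression split scored by variance reduction: the impurity function changes, while the parent-minus-weighted-children structure stays the same as in the classification case. The function names and the four data points are assumptions made for illustration.

```python
def variance(ys):
    """Population variance of a list of numeric targets (0.0 if empty)."""
    n = len(ys)
    if n == 0:
        return 0.0
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys) / n


def variance_reduction(xs, ys, threshold):
    """Parent variance minus the size-weighted variance of the two children."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    weighted = len(left) / n * variance(left) + len(right) / n * variance(right)
    return variance(ys) - weighted


# Invented example: the targets jump between x = 2 and x = 3.
xs = [1, 2, 3, 4]
ys = [10.0, 10.0, 20.0, 20.0]

for t in (1.5, 2.5, 3.5):
    print(t, variance_reduction(xs, ys, t))
```

In a regression tree the leaf prediction is typically the mean of the targets that reach it, which is the value that minimises this variance.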