The generalized boosting package gbm in
R offers the plot.gbm
function which > “selects a grid
of points and uses the weighted tree traversal method described in
Friedman (2001) to do the integration”
The partial dependence package pdp in
R offers the partial
function with the Boolean
recursive
argument which > “indicating whether or not to
use the weighted tree traversal method described in Friedman (2001).
This only applies to objects that inherit from class gbm
.
Default is TRUE which is much faster than the exact brute force approach
used for all other models.”
I believe that the original description of the algorithm in the Friedman paper is incorrect. By following only one child of splits on X0 one computes conditional PDPs and not interventional PDPs. I do not believe it possible to compute the latter with just one pass through the tree, it requires a full pass of the entire training data. The only speedup that my proposed algorithm offers is that one full pass is sufficient for all values of X0! And to achieve that, one has to follow both children of X0.
Why do we need to pass all data? Because the splits on XC can be very different depending on the X0 condition due to correlations among the features. The following shows a simple extreme example to illustrate this point.
Both in his paper and in the ESLII book, Friedman elaborates on how the conditional E[f(xS,XC)|XS=xs] is different from the interventional E[f(xS,XC)|do(XS=xs)].