partial dependency plot using the pdp package

bettey12 · Nov 22, 2022

I use the pdp package to compute partial dependence for a linear regression model, and it works without any warnings. However, when I switch to a classification (logistic) model fitted with xgboost, I get a warning saying that the model type could not be determined and that regression is being assumed, as shown below. Does the code need to be changed so that the xgboost classification model is passed to pdp correctly and the partial dependence is computed for classification? Or can I ignore the warning because the result is already right? I know that with randomForest this is simple and produces no warnings.

Code:

# Load required packages
library(pdp)
library(xgboost)

# Simulate ten million records, then keep a random sample of 500 for training
set.seed(101)
trn <- as.data.frame(mlbench::mlbench.friedman1(n = 1e+07, sd = 1))
trn <- trn[sample(nrow(trn), 500), ]
# Convert the continuous response into a binary (0/1) label
trn$y <- ifelse(trn$y > 16, 1, 0)

# Fit an XGBoost classification(logistic) model
set.seed(102)
bst <- xgboost(data = data.matrix(subset(trn, select = -y)),
           label = trn$y,
           objective = "reg:logistic",
           nrounds = 100,
           max_depth = 2,
           eta = 0.1)

# Partial dependence plot
pd <- partial(bst$handle,
              pred.var = c("x.1"),
              grid.resolution = 10,
              train = data.matrix(subset(trn, select = -y)),
              prob = TRUE,
              plot = FALSE,
              .progress = "text")

 Warning message:
 In superType.default(object) :
 `type` could not be determined; assuming `type = "regression"`
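
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the warning comes from `superType.default()`, which suggests pdp did not recognize the object it was given. The call above passes `bst$handle` rather than the fitted model `bst`, so pdp may be falling back to its default method. The sketch below passes `bst` itself, sets `type = "classification"` explicitly, and swaps the objective to `"binary:logistic"`, which is my assumption about the intended objective for a 0/1 label. It assumes pdp together with an xgboost 1.x-style `xgboost(data = , label = )` interface, as used in the post, and it simulates 500 rows directly to keep the example small.

```r
library(pdp)
library(xgboost)

# Simulate a small training set (500 rows directly, instead of sampling
# 500 rows from ten million) and binarize the response as in the post
set.seed(101)
trn <- as.data.frame(mlbench::mlbench.friedman1(n = 500, sd = 1))
trn$y <- ifelse(trn$y > 16, 1, 0)
X <- data.matrix(subset(trn, select = -y))

# Assumption: "binary:logistic" is the intended objective for a 0/1 label
set.seed(102)
bst <- xgboost(data = X,
               label = trn$y,
               objective = "binary:logistic",
               nrounds = 100,
               max_depth = 2,
               eta = 0.1,
               verbose = 0)

# Pass the fitted model itself (not bst$handle) and state the type
# explicitly so that pdp does not have to guess it
pd <- partial(bst,
              pred.var = "x.1",
              grid.resolution = 10,
              train = X,
              type = "classification",
              prob = TRUE,
              plot = FALSE)
head(pd)
```

With `prob = TRUE`, the `yhat` column should be on the probability scale, so values outside [0, 1] would indicate that the model is still being treated as something other than a probability classifier.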

Isaac · Nov 28, 2022

This is a SQL Server forum.
