[native] [jvm-packages] allow rebuild prediction cache when it is not initialized #5272

Closed
wants to merge 19 commits
4 changes: 4 additions & 0 deletions include/xgboost/generic_parameters.h
@@ -30,6 +30,7 @@ struct GenericParameter : public XGBoostParameter<GenericParameter> {
bool enable_experimental_json_serialization {false};
bool validate_parameters {false};
bool validate_features {true};
bool adding_all_to_cache {false};

void CheckDeprecated() {
if (this->n_gpus != 0) {
@@ -85,6 +86,9 @@ struct GenericParameter : public XGBoostParameter<GenericParameter> {
"\n\tDeprecated. Single process multi-GPU training is no longer supported."
"\n\tPlease switch to distributed training with one process per GPU."
"\n\tThis can be done using Dask or Spark. See documentation for details.");
DMLC_DECLARE_FIELD(adding_all_to_cache)
.set_default(false)
.describe("adding prediction results for all dmatrix to prediction cache");
}

private:
@@ -154,6 +154,14 @@ private[this] class XGBoostExecutionParamsFactory(rawParams: Map[String, Any], s
logger.info("parameter \"maximize_evaluation_metrics\" is set to " + maximize)
overridedParams += ("maximize_evaluation_metrics" -> maximize)
}

if (params.contains("checkpoint_path") && params.contains("checkpoint_interval") &&
params("checkpoint_path") != null &&
params("checkpoint_path").asInstanceOf[String].length > 0 &&
params("checkpoint_interval").asInstanceOf[Int] > 0) {
overridedParams += "adding_all_to_cache" -> true
}

overridedParams
}

@@ -678,7 +678,7 @@ void saveRabitCheckpoint() throws XGBoostError {
*/
private void init(DMatrix[] cacheMats) throws XGBoostError {
long[] handles = null;
if (cacheMats != null) {
if (cacheMats != null && cacheMats.length > 0) {
handles = dmatrixsToHandles(cacheMats);
}
long[] out = new long[1];
@@ -1 +1 @@
version=${project.version}
version=${project.version}
3 changes: 3 additions & 0 deletions src/learner.cc
@@ -202,6 +202,7 @@ class LearnerImpl : public Learner {
tparam_.UpdateAllowUnknown(args);
mparam_.UpdateAllowUnknown(args);
generic_parameters_.UpdateAllowUnknown(args);
std::cout << "all_to_prediction_cache:" << generic_parameters_.adding_all_to_cache << "\n";
Contributor:
Just do not forget to remove it before merging.

generic_parameters_.CheckDeprecated();

ConsoleLogger::Configure(args);
@@ -704,6 +705,7 @@ class LearnerImpl : public Learner {
this->ValidateDMatrix(train);

monitor_.Start("PredictRaw");
std::cout << "calling UpdateOneIter\n";
this->PredictRaw(train, &preds_[train], true);
monitor_.Stop("PredictRaw");
TrainingObserver::Instance().Observe(preds_[train], "Predictions");
@@ -745,6 +747,7 @@ class LearnerImpl : public Learner {
for (size_t i = 0; i < data_sets.size(); ++i) {
DMatrix * dmat = data_sets[i];
this->ValidateDMatrix(dmat);
std::cout << "calling EvalOneIter\n";
this->PredictRaw(data_sets[i], &preds_[dmat], false);
obj_->EvalTransform(&preds_[dmat]);
for (auto& ev : metrics_) {
15 changes: 12 additions & 3 deletions src/predictor/cpu_predictor.cc
@@ -171,18 +171,25 @@ class CPUPredictor : public Predictor {
if (ntree_limit == 0 || ntree_limit > model.trees.size()) {
ntree_limit = static_cast<unsigned>(model.trees.size());
}

std::cout << "run PredLoopInternal in PredictBatch\n";
this->PredLoopInternal(dmat, &out_preds->HostVector(), model,
tree_begin, ntree_limit);

auto cache_entry = this->FindCache(dmat);
if (cache_entry == cache_->cend()) {
return;
std::cout << "cannot find cache\n";
if (!generic_param_->adding_all_to_cache || !(*cache_).empty()) {
return;
} else {
std::cout << "adding dmatrix to cache\n";
(*cache_)[dmat].data = static_cast<std::shared_ptr<DMatrix>>(dmat);
Contributor:
I am a bit confused by this line of code. It has been quite a while since I last dived into this part of the code.
(Full disclosure: I recently switched companies and no longer work at Criteo on xgboost :( )

My C++ is a little rusty with regard to static_cast in this context, so please feel free to correct me.
I am reading https://en.cppreference.com/w/cpp/language/static_cast and trying to figure out which case applies here.
The shared_ptr constructor is marked explicit, so this can't be implicit object creation, but then I do not understand what it can be; shared_ptr is not a child of DMatrix.

Are you creating a new shared_ptr from the raw pointer dmat? If so, I strongly advise against it. It usually leads to errors.

I thought about a similar fix but rejected it, because I could not figure out how to retrieve the already existing shared_ptr from just a DMatrix* (and I want a copy of that reference-counting pointer).

What I had in mind is changing the interface of Predict to pass a shared_ptr in the first place instead of a raw pointer. That would let us recreate the cache.
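
A rough sketch of that suggestion, using stand-in types rather than the real xgboost interfaces (Predictor, DMatrix, CacheEntry and the PredictBatch signature below are simplified placeholders, not the actual API): if PredictBatch received the owning shared_ptr, rebuilding a missing cache entry could copy the caller's reference-counted pointer instead of manufacturing a new one.

// Sketch only: simplified stand-ins, not the real xgboost interfaces.
#include <map>
#include <memory>
#include <vector>

struct DMatrix {};  // placeholder for xgboost's DMatrix
struct CacheEntry {
  std::shared_ptr<DMatrix> data;
  std::vector<float> predictions;
};

class Predictor {
 public:
  // Hypothetical signature: take shared_ptr<DMatrix> instead of DMatrix*.
  void PredictBatch(const std::shared_ptr<DMatrix>& dmat,
                    std::vector<float>* out_preds) {
    auto it = cache_.find(dmat.get());
    if (it == cache_.end()) {
      // Safe: this copy shares the caller's control block,
      // so no second, independent owner is created.
      cache_[dmat.get()].data = dmat;
    }
    // ... fill *out_preds with predictions ...
    (void)out_preds;
  }

 private:
  std::map<DMatrix*, CacheEntry> cache_;
};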

Contributor:
I consulted with my C++ friends and they pointed me to https://stackoverflow.com/questions/32713083/explicit-constructor-and-static-cast
So the static_cast here will actually call a constructor, creating a second, separate shared_ptr that does not know about any other shared_ptrs managing this pointer.

Sooner or later this will result in a double free, because each of these smart pointers will try to free the resource.

In the end, I suggest waiting for @trivialfis's refactoring. It should enable you (or us) to solve this problem.
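
For reference, a minimal standalone snippet (not xgboost code; DMatrixLike is a made-up stand-in) reproducing the failure mode described above: constructing a second shared_ptr from a raw pointer that already has an owner creates two independent control blocks, so the same object is deleted twice.

#include <memory>

struct DMatrixLike {};  // stand-in for DMatrix

int main() {
  auto owner = std::make_shared<DMatrixLike>();  // first owner, use_count == 1
  DMatrixLike* raw = owner.get();                // raw pointer, as passed to PredictBatch

  // Equivalent to static_cast<std::shared_ptr<DMatrixLike>>(raw):
  // the explicit shared_ptr(T*) constructor runs and creates a second,
  // independent control block for the same object.
  std::shared_ptr<DMatrixLike> second(raw);

  // When both pointers go out of scope, delete runs twice on the same
  // object: undefined behaviour, typically a double free / crash.
  return 0;
}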

Member (Author):
Yeah, that's why the current unit test is broken.

return;
}
}
if (cache_entry->second.predictions.Size() == 0) {
// See comment in GPUPredictor::PredictBatch.
InitOutPredictions(cache_entry->second.data->Info(),
&(cache_entry->second.predictions), model);
&(cache_entry->second.predictions), model);
cache_entry->second.predictions.Copy(*out_preds);
}
}
@@ -198,6 +205,7 @@ class CPUPredictor : public Predictor {

if (e.predictions.Size() == 0) {
InitOutPredictions(e.data->Info(), &(e.predictions), model);
std::cout << "calling PredLoopInternal in UpdatePredictionCache_1\n";
PredLoopInternal(e.data.get(), &(e.predictions.HostVector()), model, 0,
model.trees.size());
} else if (model.learner_model_param_->num_output_group == 1 && updaters->size() > 0 &&
@@ -206,6 +214,7 @@ class CPUPredictor : public Predictor {
&(e.predictions))) {
{} // do nothing
} else {
std::cout << "calling PredLoopInternal in UpdatePredictionCache_2\n";
PredLoopInternal(e.data.get(), &(e.predictions.HostVector()), model, old_ntree,
model.trees.size());
}