XGBoost too large for pickle/joblib
I'm having difficulty loading an XGBoost regression model with both pickle and joblib.
One complication may be that I am writing the pickle/joblib file on a Windows desktop but trying to load it on a MacBook Pro.
I attempted the solution previously posted here: Python 3 - Can pickle handle byte objects larger than 4GB?
However, it still does not work. I get a variety of errors, but usually something like:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument
I have also tried using protocol=4 in both the pickle and joblib dumps, and in each case the file still failed to load.
The files I am trying to load range from 2 GB to 11 GB, depending on whether I used joblib/pickle or the bytes_in/os.path solution previously posted.
Does anyone know an optimal way to write large XGBoost regression models, and/or how to load them afterwards?
Here is the code used to write the XGBoost model:
import xgboost as xgb
import joblib

dmatrix_train = xgb.DMatrix(
    X_train.values, y_train, feature_names=X_train.columns.values
)
dmatrix_validate = xgb.DMatrix(
    X_test.values, y_test, feature_names=X_train.columns.values
)
eval_set = [(dmatrix_train, "Train")]
eval_set.append((dmatrix_validate, "Validate"))
print("XGBoost #1")
params = {
    'silent': 1,
    'tree_method': 'auto',
    'max_depth': 10,
    'learning_rate': 0.001,
    'subsample': 0.1,
    'colsample_bytree': 0.3,
    # 'min_split_loss': 10,
    'min_child_weight': 10,
    # 'lambda': 10,
    # 'max_delta_step': 3
}
num_round = 500000
xgb_model = xgb.train(params=params, dtrain=dmatrix_train, evals=eval_set,
                      num_boost_round=num_round, verbose_eval=100)
joblib.dump(xgb_model, 'file.sav', protocol=4)
I have also tried replacing the final line with a standard pickle dump, both with 'wb' mode and without.
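For context, the chunked-write workaround alluded to above (the bytes_in/os.path approach) generally looks something like the sketch below. This is the general technique, not the exact code from the linked post; the helper names and file path are placeholders:

```python
import os
import pickle

# macOS has historically failed on single reads/writes larger than ~2 GB
# (OSError: [Errno 22] Invalid argument), so one workaround is to pickle
# to a bytes object and move it through the file in smaller chunks.
MAX_BYTES = 2**31 - 1

def dump_big(obj, path):
    data = pickle.dumps(obj, protocol=4)
    with open(path, 'wb') as f:
        for i in range(0, len(data), MAX_BYTES):
            f.write(data[i:i + MAX_BYTES])

def load_big(path):
    size = os.path.getsize(path)
    data = bytearray()
    with open(path, 'rb') as f:
        for _ in range(0, size, MAX_BYTES):
            data += f.read(MAX_BYTES)
    return pickle.loads(bytes(data))
```

Even with this workaround, a pickle written on one platform can still fail to load on another if library versions differ, which is part of why the native XGBoost format is preferable.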
python pickle xgboost joblib
Please add your code. This will help in solving the issue.
– Rubens_Zimbres
Mar 7 at 17:52
asked Mar 7 at 17:14 by Mike Keenan, edited Mar 7 at 18:17
1 Answer
You appear to be using the low-level XGBoost API (as opposed to the high-level Scikit-Learn wrapper API). At this level, you can save and load XGBoost models natively using the Booster.save_model(fname) and Booster.load_model(fname) methods.
For example, see this SO thread: How to save & load xgboost model?
Pickling makes sense if there's a significant "Python wrapper object" involved. There isn't one here.
answered Mar 7 at 19:34 by user1808924