XGBoost too large for pickle/joblib



I'm having difficulty loading an XGBoost regression model with both pickle and joblib.



One complicating factor may be that I am writing the pickle/joblib file on a Windows desktop but trying to load it on a MacBook Pro.



I attempted to use this solution previously posted: Python 3 - Can pickle handle byte objects larger than 4GB?



However, it still does not work. I get a variety of errors, usually something like:



Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument


I have also tried using protocol=4 with both pickle and joblib dumps, and in each instance the file still failed to load.



The files I am trying to load range from 2 GB to 11 GB, depending on whether I use joblib, pickle, or the bytes_in/os.path solution from the post above.
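For reference, the chunked-read workaround I adapted from that post looks roughly like this (a sketch rather than my exact code; the file name matches the placeholder used in the training code below):

import os
import pickle

# Sketch of the chunked-read workaround from the linked thread.
# macOS has trouble with single reads of 2 GB or more, so the file
# is read back in smaller chunks before unpickling.
model_path = 'file.sav'     # placeholder file name
max_bytes = 2**31 - 1       # stay under the ~2 GB single-read limit

bytes_in = bytearray(0)
input_size = os.path.getsize(model_path)
with open(model_path, 'rb') as f_in:
    for _ in range(0, input_size, max_bytes):
        bytes_in += f_in.read(max_bytes)
xgb_model = pickle.loads(bytes(bytes_in))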



Does anyone know the best way to write large XGBoost regression models to disk, and/or how to load them afterwards?



Here is the code used to train and save the XGBoost model:



import xgboost as xgb
import joblib

dmatrix_train = xgb.DMatrix(
    X_train.values, y_train, feature_names=X_train.columns.values
)
dmatrix_validate = xgb.DMatrix(
    X_test.values, y_test, feature_names=X_train.columns.values
)
eval_set = [(dmatrix_train, "Train")]
eval_set.append((dmatrix_validate, "Validate"))

print("XGBoost #1")

params = {
    'silent': 1,
    'tree_method': 'auto',
    'max_depth': 10,
    'learning_rate': 0.001,
    'subsample': 0.1,
    'colsample_bytree': 0.3,
    # 'min_split_loss': 10,
    'min_child_weight': 10,
    # 'lambda': 10,
    # 'max_delta_step': 3
}

num_round = 500000

xgb_model = xgb.train(params=params, dtrain=dmatrix_train, evals=eval_set,
                      num_boost_round=num_round, verbose_eval=100)

joblib.dump(xgb_model, 'file.sav', protocol=4)


I have also tried replacing the final line with a standard pickle dump, both with and without opening the file in 'wb' mode.
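For clarity, the 'wb' variant of that pickle dump looks roughly like this (a sketch, using the same placeholder file name):

import pickle

# Sketch of the plain-pickle alternative to joblib.dump
# ('file.sav' is the same placeholder name as above).
with open('file.sav', 'wb') as f_out:
    pickle.dump(xgb_model, f_out, protocol=4)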










Tags: python, pickle, xgboost, joblib

asked Mar 7 at 17:14 by Mike Keenan (edited Mar 7 at 18:17)
  • Please add your code. This will help solving the issue.

    – Rubens_Zimbres
    Mar 7 at 17:52















1 Answer
You appear to be using the low-level XGBoost API (as opposed to the high-level Scikit-Learn wrapper API). At this level, you can save and load XGBoost models natively using the Booster.save_model(fname) and Booster.load_model(fname) functions.



For example, see this SO thread: How to save & load xgboost model?



Pickling makes sense if there's a significant "Python wrapper object" involved. There isn't one here.
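A minimal sketch, assuming xgb_model is the Booster returned by xgb.train and using a placeholder file name:

import xgboost as xgb

# Save the trained Booster in XGBoost's native format
# (no pickle involved, so no 4 GB / cross-platform pickle issues).
xgb_model.save_model('xgb_model.bin')   # placeholder file name

# Later, possibly on a different machine or OS:
loaded_model = xgb.Booster()
loaded_model.load_model('xgb_model.bin')

# Predict as usual with the reloaded Booster:
# preds = loaded_model.predict(xgb.DMatrix(X_test.values))

Note that, depending on the XGBoost version, Python-side attributes such as feature names may not survive the save/load round trip, so they may need to be stored separately.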






answered Mar 7 at 19:34 by user1808924



