Data Limits and Maximum Distances for boxplot in pandas (Python)



I am using Python to plot data from many experiments, and I would like to use the boxplot method of the pandas library.



Executing df = pd.DataFrame(value,columns=['Col1']) and calling its boxplot method gives the following result:

[image: resulting boxplot]

The problem comes from the extreme values. In Matlab the solution is to use the 'DataLim' option:



boxplot(bp1,'DataLim',[4.2,4.3])


From Matlab documentation:




Data Limits and Maximum Distances



'DataLim' — Extreme data limits
[-Inf,Inf] (default) | two-element numeric vector



Extreme data limits, specified as the comma-separated pair consisting of 'DataLim' and a two-element numeric vector containing the lower and upper limits, respectively. The values specified for 'DataLim' are used by 'ExtremeMode' to determine which data points are extreme.




Is there something similar for Python?



Workaround:
I have a workaround that I really don't like, because it changes the statistical distribution of the measurements. I just exclude the "problematic values" manually:



df = pd.DataFrame(value[100:],columns=['Col1'])
df.boxplot(column=['Col1'])


and the result is:



[image: boxplot after excluding the first 100 values]

This works only because I know where the problematic values are.










python python-3.x pandas matlab boxplot














asked 2 days ago by Leos313, edited 2 days ago












  • Couldn't you just filter your df with loc before plotting?
    – Josh Friedlander, 2 days ago

  • I don't think there is an option in matplotlib to do exactly what you want. I would just plot the filtered df: df[(df["Col1"] > 4.2) & (df["Col1"] < 4.3)].boxplot()
    – Runkles, 2 days ago

  • @Josh, what do you mean? Can you give an example?
    – Leos313, 2 days ago

  • @Runkles yes, that can work. But I think (not sure!) that in Matlab the points are still used for the statistics of the boxplot and are just not drawn.
    – Leos313, 2 days ago

  • @Runkles if you plot only those data, you change the statistical distribution. Not sure if that's OK for the OP.
    – micric, 2 days ago
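
The concern raised in these comments can be checked numerically. The sketch below is an illustration only and uses made-up data: the arrays bad and good and the helper quartiles are hypothetical, not taken from the question. It compares the quartiles a boxplot would draw for the full data set with those of the data filtered to the interval 4.2 to 4.3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 corrupted readings followed by 900 good ones near 4.25
bad = rng.uniform(0.0, 1.0, size=100)
good = rng.normal(loc=4.25, scale=0.01, size=900)
values = np.concatenate((bad, good))


def quartiles(x):
    """Return (Q1, median, Q3), the three values a boxplot draws as the box."""
    return np.percentile(x, [25, 50, 75])


# Quartiles of everything versus quartiles after dropping values outside (4.2, 4.3)
full = quartiles(values)
filtered = quartiles(values[(values > 4.2) & (values < 4.3)])

print("all values:     ", full)
print("filtered values:", filtered)
```

With this synthetic data the two sets of quartiles differ, which is the effect micric points out: filtering before plotting changes the statistics, not just the picture.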





























1 Answer














You can use ylim to constrain the axis without omitting the outliers from the calculation:



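import numpy as np
import matplotlib.pyplot as plt

# Synthetic sample: a central spread, ordinary fliers, and two extreme outliers
data = np.concatenate((np.random.rand(50) * 100,        # spread
                       np.ones(25) * 50,                 # center
                       np.random.rand(10) * 100 + 100,   # flier high
                       np.random.rand(10) * -100,        # flier low
                       np.random.rand(2) * 10_000))      # unwanted outlier
fig1, ax1 = plt.subplots()
ax1.boxplot(data)       # box statistics are computed from all values
plt.ylim([-100, 200])   # only the visible y-range is restricted
plt.show()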





answered 2 days ago by rgk
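
As an addendum that is not part of the answer above, here is a minimal sketch of the same idea applied to the pandas call from the question. It assumes that DataFrame.boxplot, called without grouping, returns a matplotlib Axes, so the y-limits can be set on it directly. The array value is a synthetic stand-in for the question's data, and the limits 4.2 and 4.3 are borrowed from the Matlab call. This only zooms the view; it is not claimed to reproduce every detail of Matlab's 'DataLim' behaviour.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
from matplotlib.cbook import boxplot_stats

rng = np.random.default_rng(0)

# Hypothetical stand-in for the question's `value`: 100 extreme readings, 900 good ones
value = np.concatenate((rng.uniform(0.0, 1.0, size=100),
                        rng.normal(4.25, 0.01, size=900)))
df = pd.DataFrame(value, columns=["Col1"])

# Draw the boxplot from ALL values, then restrict only the visible range
ax = df.boxplot(column=["Col1"])
ax.set_ylim(4.2, 4.3)

# The statistics matplotlib computes for the box still come from the full data
stats = boxplot_stats(df["Col1"].to_numpy())[0]
print("q1:", stats["q1"], "median:", stats["med"], "q3:", stats["q3"])
```

To view the figure interactively, drop the matplotlib.use("Agg") line and call plt.show() from matplotlib.pyplot, or save it with ax.figure.savefig(...). Passing showfliers=False to the boxplot call is another option if the goal is only to hide the flier markers.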




























