Advanced Pivot Table in Pandas


I am trying to optimize some table transformation scripts in Python pandas that need to handle large data sets (more than 50k rows). I wrote a script that iterates over every index label and writes values into a new DataFrame (see the example below), but it performs poorly. Is there a pandas function that could produce the same result without iterating?



Example code:



from datetime import datetime
import pandas as pd

date1 = datetime(2019, 1, 1)
date2 = datetime(2019, 1, 2)

df = pd.DataFrame({"ID": [1, 1, 2, 2, 3, 3],
                   "date": [date1, date2, date1, date2, date1, date2],
                   "x": [1, 2, 3, 4, 5, 6],
                   "y": ["a", "a", "b", "b", "c", "c"]})

# Build the result one cell at a time (this is the slow part)
new_df = pd.DataFrame()
for i in df.index:
    new_df.at[df.at[i, "ID"], "y"] = df.at[i, "y"]

    if df.at[i, "date"] == datetime(2019, 1, 1):
        new_df.at[df.at[i, "ID"], "x1"] = df.at[i, "x"]
    elif df.at[i, "date"] == datetime(2019, 1, 2):
        new_df.at[df.at[i, "ID"], "x2"] = df.at[i, "x"]


Output:

   ID       date  x  y
0   1 2019-01-01  1  a
1   1 2019-01-02  2  a
2   2 2019-01-01  3  b
3   2 2019-01-02  4  b
4   3 2019-01-01  5  c
5   3 2019-01-02  6  c

   y   x1   x2
1  a  1.0  2.0
2  b  3.0  4.0
3  c  5.0  6.0


The transformation groups the rows by the "ID" column, taking the "x1" values from the rows dated 2019-01-01 and the "x2" values from the rows dated 2019-01-02. The "y" value is the same for all rows with the same "ID". The "ID" values become the new index.



I'd appreciate any advice on this matter.










python pandas pivot-table

asked Mar 7 at 20:11 by canbe90
edited Mar 7 at 20:16 by Brian Tompsett - 汤莱恩

1 Answer

Using pivot_table will get what you are looking for:

result = df.pivot_table(index=['ID', 'y'], columns='date', values='x')
result = result.rename(columns={date1: 'x1', date2: 'x2'}).reset_index('y')
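
For anyone who wants to try this end to end, here is a minimal self-contained sketch built on the question's sample data. Two things in it are my own assumptions rather than part of the answer above: the `labels` dict is a hypothetical helper for naming the output columns, and `aggfunc="first"` is chosen because `pivot_table` averages by default (`aggfunc="mean"`), which would silently combine values if an ID ever had two rows for the same date.

```python
from datetime import datetime

import pandas as pd

date1 = datetime(2019, 1, 1)
date2 = datetime(2019, 1, 2)

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "date": [date1, date2, date1, date2, date1, date2],
    "x": [1, 2, 3, 4, 5, 6],
    "y": ["a", "a", "b", "b", "c", "c"],
})

# Hypothetical mapping from each date to its output column name.
labels = {date1: "x1", date2: "x2"}

# One row per (ID, y) pair, one column per date. aggfunc="first" is an
# assumption: it keeps the first value seen instead of averaging duplicates.
wide = df.pivot_table(index=["ID", "y"], columns="date", values="x", aggfunc="first")

# Replace the Timestamp column labels with plain strings.
wide.columns = [labels[ts.to_pydatetime()] for ts in wide.columns]

# Move "y" out of the index so that only "ID" remains as the index.
result = wide.reset_index("y")
print(result)
```

Because "y" is part of the pivot index, this relies on "y" being constant within each "ID", as stated in the question. If that does not hold, an ID would come out as several rows.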





answered Mar 7 at 20:20 by busybear




























