groupby in userdefined python function, doesn't work
I have written my own user-defined function in Python. The inputs are some parameters and a DataFrame. First, some new columns are added to the input DataFrame. Then I try to do a groupby on the DataFrame and left-join the result back onto it.

But the DataFrame doesn't get the groupby columns added.

    def test(df, params):
        df['b']=df['a']*params['some_parameter']
        df['c']=df['b']*df['total']
        aaa=df.groupby(['aa', 'bb']).agg({'c':'sum'})
        df=pd.merge(df,aaa,how='left',on=['aa', 'bb'])
        return

Next try:

    def test(df, params):
        df['b']=df['a']*params['some_parameter']
        df['d']=df['c']*df['b']
        aaa=df.groupby(['y','x']).agg({'d':'sum','g':'sum'}).add_suffix('_sum')
        df=df.join(aaa, on=['y','x'])
        return

I then call the function with:

    test(df2,params)

I would expect df2 to have 4 new columns: b, d, d_sum and g_sum. But it only has 2 new columns, b and d.
python pandas pandas-groupby
asked Mar 7 at 12:19 by thomlund83 (edited Mar 8 at 13:54)
1 Answer
You can use GroupBy.transform instead of a groupby followed by a left join with merge. Change:

    aaa=df.groupby(['aa', 'bb']).agg({'c':'sum'})
    df=pd.merge(df,aaa,how='left',on=['aa', 'bb'])

to:

    df['c1'] = df.groupby(['aa', 'bb'])['c'].transform('sum')

All together:

    def test(df, params):
        df['b']=df['a']*params['some_parameter']
        df['c']=df['b']*df['total']
        # transform returns one value per original row, so no join is needed
        df['new'] = df.groupby(['aa', 'bb'])['c'].transform('sum')
        return df
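To illustrate why the two routes are interchangeable here, the following is a minimal sketch on made-up data. The values and the `c_sum` column name are assumptions for illustration only; the `aa`, `bb` and `c` column names follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'aa': ['x', 'x', 'y', 'y'],
    'bb': ['p', 'p', 'q', 'q'],
    'c': [1, 2, 3, 4],
})

# Route 1: aggregate per group, then left-join the totals back onto the rows.
totals = (
    df.groupby(['aa', 'bb'], as_index=False)
      .agg({'c': 'sum'})
      .rename(columns={'c': 'c_sum'})
)
merged = pd.merge(df, totals, how='left', on=['aa', 'bb'])

# Route 2: transform broadcasts the group total to every row of the group,
# keeping the original index, so it can be assigned directly as a column.
transformed = df.groupby(['aa', 'bb'])['c'].transform('sum')

print(merged['c_sum'].tolist())
print(transformed.tolist())
```

Both prints should show the same per-row group totals; the transform route simply skips the intermediate table.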
If you need to aggregate multiple columns, you can use DataFrame.join, which does a left join by default:

    import pandas as pd

    df = pd.DataFrame({
        'x':list('dddddd'),
        'y':list('aaabbb'),
        'a':[4,5,4,5,5,4],
        'b':[7,8,9,4,2,3],
        'c':[1,3,5,7,1,0],
        'd':[5,3,6,9,2,4],
        'g':[1,3,6,4,4,3],
    })

    print (df)
       x  y  a  b  c  d  g
    0  d  a  4  7  1  5  1
    1  d  a  5  8  3  3  3
    2  d  a  4  9  5  6  6
    3  d  b  5  4  7  9  4
    4  d  b  5  2  1  2  4
    5  d  b  4  3  0  4  3

    params = {'some_parameter':100}

    def test(df, params):
        df['b']=df['a']*params['some_parameter']
        df['d']=df['c']*df['b']
        aaa=df.groupby(['y','x']).agg({'d':'sum','g':'sum'}).add_suffix('_sum')
        # join returns a new DataFrame, so it has to be returned to the caller
        df=df.join(aaa, on=['y','x'])
        return df

    df1 = test(df, params)
    print (df1)
       x  y  a    b  c     d  g  d_sum  g_sum
    0  d  a  4  400  1   400  1   3900     10
    1  d  a  5  500  3  1500  3   3900     10
    2  d  a  4  400  5  2000  6   3900     10
    3  d  b  5  500  7  3500  4   4000     11
    4  d  b  5  500  1   500  4   4000     11
    5  d  b  4  400  0     0  3   4000     11
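As a possible alternative for the multi-column case, transform can also be applied to a list of columns. This is only a sketch, not part of the original answer: the data below reuses the `x`, `y`, `d` and `g` values from the output above, and the `sums` and `out` names are made up for illustration.

```python
import pandas as pd

# Same x, y, d and g values as in the printed result above.
df = pd.DataFrame({
    'x': list('dddddd'),
    'y': list('aaabbb'),
    'd': [400, 1500, 2000, 3500, 500, 0],
    'g': [1, 3, 6, 4, 4, 3],
})

# transform on several columns returns a DataFrame with the original row
# index, so it can be joined index-to-index without naming key columns.
sums = df.groupby(['y', 'x'])[['d', 'g']].transform('sum').add_suffix('_sum')
out = df.join(sums)

print(out)
```

Like the join in the answer, this builds a new DataFrame (`out`) rather than changing `df`, so inside a function the result would still need to be returned.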
answered Mar 7 at 12:21 by jezrael (edited Mar 8 at 14:08)
It doesn't work. When I do it outside of a function, it works. But when I move it inside the function, the 2 new columns are not in the dataframe. I have pasted the new function into the original post.
– thomlund83
Mar 8 at 13:48
@thomlund83 - Is there an error? If it works outside the function, the problem should be inside it. Could you share the function that is not working by editing the question?
– jezrael
Mar 8 at 13:49
@thomlund83 - did you forget return df?
– jezrael
Mar 8 at 13:55
"return df" does not help. I guess it's not needed, since the b and c variables ARE created in the dataframe df2
– thomlund83
Mar 8 at 13:59
@thomlund83 - You are wrong, return df is necessary. I added a function with sample data to my answer; it works fine with return df
– jezrael
Mar 8 at 14:09
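The disagreement in these comments seems to come down to the difference between changing a DataFrame in place and rebinding a local name. A minimal sketch follows; the function names and the `k`, `a` and `a_sum` columns are hypothetical and chosen only for illustration. Column assignment such as `df['b'] = ...` modifies the object the caller passed in, whereas `df = df.join(...)` creates a new DataFrame and only points the local name `df` at it.

```python
import pandas as pd

def mutate_only(df):
    # Item assignment changes the very object the caller passed in.
    df['b'] = df['a'] * 2

def rebind_without_return(df):
    totals = df.groupby('k')['a'].sum().rename('a_sum')
    # join builds a NEW DataFrame; this line only rebinds the local name df.
    df = df.join(totals, on='k')
    return  # returns None, so the new frame is discarded

def rebind_and_return(df):
    totals = df.groupby('k')['a'].sum().rename('a_sum')
    return df.join(totals, on='k')

df2 = pd.DataFrame({'k': ['u', 'u', 'v'], 'a': [1, 2, 3]})

mutate_only(df2)
print(list(df2.columns))   # 'b' is visible to the caller

rebind_without_return(df2)
print(list(df2.columns))   # unchanged: the joined frame never left the function

df2 = rebind_and_return(df2)
print(list(df2.columns))   # 'a_sum' appears once the result is assigned back
```

If this is what happened in the question, adding `return df` alone would not be enough: the call would presumably also have to capture the result, for example `df2 = test(df2, params)` instead of `test(df2, params)`.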