How can I strip the whitespace from Pandas DataFrame headers?
I am parsing data from an Excel file that has extra white space in some of the column headings.
When I check the columns of the resulting dataframe, like so:
df.columns
The result looks like this:
Index(['Year', 'Month ', 'Value'])
Consequently, I can't run
df["Month"]
because pandas will tell me the column is not found: I asked for "Month", not "Month " (with a trailing space).
How can I strip the unwanted whitespace from the column headings?
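For anyone who wants to reproduce this without the Excel file, here is a minimal sketch. The DataFrame and its values are made up for illustration; only the column labels match the ones above.

```python
import pandas as pd

# Hypothetical stand-in for the parsed Excel sheet (values are invented).
# Note the trailing space in the "Month " label.
df = pd.DataFrame({"Year": [2014], "Month ": [2], "Value": [3]})

print(df.columns.tolist())  # ['Year', 'Month ', 'Value']

try:
    df["Month"]
    found = True
except KeyError:
    # Column lookup is an exact match, so "Month" does not find "Month ".
    found = False

print(found)  # False
```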
python pandas
asked Feb 6 '14 at 15:26 by Spike Williams
edited Mar 7 at 6:23 by JJJ
2 Answers
You can pass a function to the rename method. The str.strip() method should do what you want.

In [5]: df
Out[5]:
   Year  Month   Value
0     1       2      3

[1 rows x 3 columns]

In [6]: df.rename(columns=lambda x: x.strip())
Out[6]:
   Year  Month  Value
0     1      2      3

[1 rows x 3 columns]
answered Feb 6 '14 at 15:49 by TomAugspurger
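A small sketch of the rename approach, not part of the original answer. It assumes all column labels are strings and uses a made-up one-row DataFrame. Passing str.strip directly is equivalent to the lambda here.

```python
import pandas as pd

# Made-up data; "Month " carries a trailing space as in the question.
df = pd.DataFrame({"Year": [1], "Month ": [2], "Value": [3]})

# rename returns a new DataFrame; the original keeps its padded labels.
cleaned = df.rename(columns=str.strip)

print(df.columns.tolist())       # ['Year', 'Month ', 'Value']
print(cleaned.columns.tolist())  # ['Year', 'Month', 'Value']
print(cleaned["Month"].tolist()) # [2]
```

Because the result is a new object, it has to be assigned to a name (as with cleaned above) before df["Month"]-style lookups will work.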
This will return a new DataFrame object. To be able to reference df["Month"], you will need to add inplace=True as an argument to the rename method. – Maximus, Jul 14 '15 at 0:48
This is a great answer because it can also be used when chaining operations: pd.read_csv(fname).rename(columns=lambda x: x.strip()) – drbv, Jan 31 '18 at 16:22
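The two comments above can be sketched side by side. This is only an illustration: the CSV text is invented, and io.StringIO stands in for the fname in the comment so the snippet runs without a file on disk.

```python
import io
import pandas as pd

# Invented CSV content; the "Month " header has a trailing space.
csv_text = "Year,Month ,Value\n2014,2,3\n"

# Chained form: strip the headers right after reading.
chained = pd.read_csv(io.StringIO(csv_text)).rename(columns=lambda x: x.strip())

# In-place form: the existing DataFrame is modified and rename returns None.
df = pd.read_csv(io.StringIO(csv_text))
result = df.rename(columns=lambda x: x.strip(), inplace=True)

print(chained.columns.tolist())
print(df.columns.tolist())
print(result)
```

Assigning the result back (df = df.rename(...)) has the same effect as inplace=True for this purpose.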
You can now just call .str.strip on the columns if you're using a recent version:
In [5]:
df = pd.DataFrame(columns=['Year', 'Month ', 'Value'])
print(df.columns.tolist())
df.columns = df.columns.str.strip()
df.columns.tolist()
['Year', 'Month ', 'Value']
Out[5]:
['Year', 'Month', 'Value']
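The same idea on a DataFrame that actually holds data, as a rough sketch. The values and the extra leading spaces are made up, and it assumes every column label is a string.

```python
import pandas as pd

# Made-up data with leading and trailing spaces in the labels.
df = pd.DataFrame({" Year": [2014], "Month ": [2], " Value ": [3]})

# .str.strip() builds a new Index; df is unchanged until it is assigned back.
stripped = df.columns.str.strip()
before = df.columns.tolist()

df.columns = stripped
after = df.columns.tolist()

print(before)  # [' Year', 'Month ', ' Value ']
print(after)   # ['Year', 'Month', 'Value']
```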
Timings
In[26]:
df = pd.DataFrame(columns=[' year', ' month ', ' day', ' asdas ', ' asdas', 'as ', ' sa', ' asdas '])
df
Out[26]:
Empty DataFrame
Columns: [ year, month , day, asdas , asdas, as , sa, asdas ]
%timeit df.rename(columns=lambda x: x.strip())
%timeit df.columns.str.strip()
1000 loops, best of 3: 293 µs per loop
10000 loops, best of 3: 143 µs per loop
So str.strip is ~2x faster here. I expect this to scale better for larger DataFrames.
answered Mar 18 '16 at 10:56 by EdChum (edited Apr 24 '18 at 10:59)
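Outside IPython, where %timeit is not available, the comparison above could be approximated with the standard-library timeit module. This is a sketch only: the repetition count n is arbitrary, and absolute numbers will differ by machine and pandas version. Note that the two calls do not do the same amount of work, since rename builds a whole new DataFrame while .str.strip() only builds a new Index.

```python
import timeit
import pandas as pd

# Same column labels as in the timing example above.
cols = [" year", " month ", " day", " asdas ", " asdas", "as ", " sa", " asdas "]
df = pd.DataFrame(columns=cols)

n = 200  # arbitrary number of repetitions

# Total seconds for n calls of each approach.
t_rename = timeit.timeit(lambda: df.rename(columns=lambda x: x.strip()), number=n)
t_str = timeit.timeit(lambda: df.columns.str.strip(), number=n)

print(f"rename:    {t_rename / n * 1e6:.1f} µs per call")
print(f"str.strip: {t_str / n * 1e6:.1f} µs per call")
```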
You can do df.index = df.index.str.strip() if your index has only strings and you want to strip the index. – Elmex80s, Oct 19 '17 at 16:51
And it's faster too! – sfjac, Jan 17 '18 at 23:01
A LOT faster. I did a %timeit on df.rename() and df.columns.str.strip() on a dataframe with 29 columns. The timings were 45.7 ms and 0.141 ms, respectively. – S3DEV, Apr 24 '18 at 10:54
@S3DEV I've decided to add timings for a trivial example to aid future users. – EdChum, Apr 24 '18 at 10:59
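The index variant mentioned in the first comment looks like this in practice. It is a minimal sketch with made-up row labels, and it assumes the index contains only strings.

```python
import pandas as pd

# Made-up frame whose row labels carry stray spaces.
df = pd.DataFrame({"Value": [1, 2]}, index=[" a", "b "])

# Same pattern as for the columns: build the stripped Index and assign it back.
df.index = df.index.str.strip()

print(df.index.tolist())     # ['a', 'b']
print(df.loc["a", "Value"])  # 1
```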