Python - Web scrape: hidden chars show in len(), how do I remove these?
I have used:
driver.find_elements_by_xpath('(.//span[@class = "x"])')[0].text
The information it pulls is correct, but it includes what look like spaces, which show up in the HTML of the website I'm scraping as numeric character entities.
How do I remove these so I can convert the str to an int? They are what is stopping me at the moment.
I have tried .strip() and .replace() with no luck.
Here's the raw HTML:
<span class="coordinateX">(−52</span>
When I print this string I get (-52, but len() returns 8 instead of 4 because of these hidden characters.
Thanks,
Mark.
python string selenium web-scraping int
Try RegEx: 'import re', then create a pattern to detect and remove the items you don't want.
– Ari Victor
Mar 6 at 23:50
Can you give an example? I don't see any reason why replace won't work here!
– RobinFrcd
Mar 6 at 23:56
I can't see what the text is, so I can't say why it won't work. What does that command return? Show us the string you're trying to work with. Check out my answer below and see if that helps.
– Ari Victor
Mar 7 at 0:01
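On the request to show the actual string: one way to see what is hiding in it is to print its escaped form and the Unicode name of each character. This is only a sketch. The sample value is an assumption (a stand-in built to have 4 visible characters and a length of 8, like the string in the question), and describe_chars is a hypothetical helper name.

```python
import unicodedata


def describe_chars(text):
    """Return one (code point, Unicode name) pair per character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Hypothetical sample: "(-52" with direction marks mixed in, written with
# escapes so the invisible characters are visible in the source code.
sample = "(\u202d\u2212\u202d52\u202c\u202c"

print(len(sample))    # 8, although only 4 characters are visible
print(ascii(sample))  # shows the hidden characters as \uXXXX escapes
for code_point, name in describe_chars(sample):
    print(code_point, name)
```

Running the same helper on the real scraped value should show which code points .strip() and .replace() need to target.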
asked Mar 6 at 22:48 by Mark (new contributor), edited yesterday
2 Answers
Maybe try regex?
import re
# Sample text containing a literal four-digit numeric character entity
string = 'Here is some string to&#8237;test'
# Replace each entity of the form &#NNNN; with a single space
string = re.sub(r'&#\d{4};', ' ', string)
print(string)
# prints: Here is some string to test
re.sub finds every match of the regex pattern r'&#\d{4};' in the 'string' variable and replaces it with ' '.
Resources
https://pythex.org/ - for creating and testing patterns
Learning material
https://developers.google.com/edu/python/regular-expressions
https://www.tutorialspoint.com/python/python_reg_expressions.htm
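One caveat, offered as a hedged sketch rather than a definitive fix: Selenium's .text normally returns already-decoded text, so the scraped string probably holds the invisible characters themselves rather than a literal "&#NNNN;" entity, and an entity pattern would then find nothing to replace. Assuming the culprits are bidirectional or zero-width format characters (an assumption; check the real code points first), a character class over those code points can be used instead. HIDDEN and strip_hidden are hypothetical names, and the sample value is a stand-in.

```python
import re

# Zero-width and bidirectional format characters. Extend the class if the
# real string turns out to contain other invisible code points.
HIDDEN = re.compile("[\u200b-\u200f\u202a-\u202e\u2060-\u2069\ufeff]")


def strip_hidden(text):
    """Remove the invisible characters matched by HIDDEN from text."""
    return HIDDEN.sub("", text)


# Hypothetical stand-in for the scraped value: 4 visible characters, length 8.
sample = "(\u202d\u2212\u202d52\u202c\u202c"
cleaned = strip_hidden(sample)

print(len(sample), len(cleaned))  # 8 4
print(ascii(cleaned))             # '(\u221252'
```

The minus sign that remains is U+2212, not an ASCII hyphen, so it still has to be replaced before calling int().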
Hello, this doesn't seem to work. Here's the raw HTML: ``` <span class="coordinateX">(−52</span> ```
– Mark
yesterday
The relevant HTML would have helped us debug the issue. However, you can use the get_attribute() method instead of the text property, as follows:
myText = driver.find_elements_by_xpath('(.//span[@class = "x"])')[0].get_attribute("innerHTML")
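Note that innerHTML is likely to contain the same invisible characters, since the browser serialises them as they are, so the value still needs cleaning before int() will accept it. A minimal sketch, assuming the hidden characters belong to Unicode category Cf (format) and that the minus sign is U+2212 rather than an ASCII hyphen; to_int and the sample value are hypothetical:

```python
import unicodedata


def to_int(raw):
    """Parse text such as '(−52' into an int, ignoring invisible characters."""
    # Drop every character in Unicode category Cf (format), which covers
    # U+202D, U+202C and the zero-width characters.
    visible = "".join(ch for ch in raw if unicodedata.category(ch) != "Cf")
    # int() expects an ASCII '-', not U+2212 MINUS SIGN.
    visible = visible.replace("\u2212", "-")
    # Strip the parentheses and whitespace around the number.
    return int(visible.strip("() \t\n"))


raw = "(\u202d\u2212\u202d52\u202c\u202c"  # stand-in for the scraped value
print(to_int(raw))  # -52
```

The same function can be applied to the result of either .text or .get_attribute("innerHTML").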
Thanks for the reply. I tried .get_attribute() and it still has the same issue: len() returns 8, but it should actually be 4.
– Mark
yesterday
@Mark Can you update the question with the relevant HTML for further analysis?
– DebanjanB
yesterday
I've added this in now, thanks for the help.
– Mark
yesterday
Regex answer: answered Mar 6 at 23:57 by Ari Victor
get_attribute() answer: answered 2 days ago by DebanjanB