Python - Web scrape: hidden chars show up in len(), how do I remove these?

I have used:



driver.find_elements_by_xpath('(.//span[@class = "x"])')[0].text


The information it pulls is correct, but it includes invisible characters that show in the HTML as "&#8237" on the website I'm scraping.



How do I remove these so I can convert the str to an int? They are stopping me at the moment.



I have tried .strip and .replace with no luck.



Here's the raw HTML:



<span class="coordinateX">(‭‭−‭‭52‬‭‬‬</span>


When I print this string I get (-52, but len() returns 8 instead of 4 because of these hidden characters.



Thanks
Mark.
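
Not part of the original post, but one way to see what is inflating len() is to list each character's code point and Unicode name. This is only a sketch: `describe_chars` is a hypothetical helper name, and the sample string is a made-up stand-in for the scraped value (the real sequence of hidden characters may differ).

```python
import unicodedata


def describe_chars(text):
    """Return a (code point, Unicode name) pair for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Hypothetical sample standing in for the scraped value; the real one may differ.
sample = "(\u202d\u2212\u202d52\u202c"

for code_point, name in describe_chars(sample):
    print(code_point, name)
```

repr(text) or ascii(text) gives a quicker, less detailed view of the same thing.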







python string selenium web-scraping int






asked Mar 6 at 22:48









Mark
















  • Try regex: 'import re', then create a pattern to detect and remove the items you don't want.

    – Ari Victor
    Mar 6 at 23:50






  • Can you give an example? I don't see any reason why replace won't work here!

    – RobinFrcd
    Mar 6 at 23:56






  • I can't see what the text is, so I can't say why it won't work. What does that command return? Show us the string you're trying to work with. Check out my answer below and see if that helps.

    – Ari Victor
    Mar 7 at 0:01


















2 Answers
Maybe try regex?



import re

string = 'Here is some string to&#8237test'

# Replace each "&#" followed by four digits with a single space
string = re.sub(r'(&#\d\d\d\d)', ' ', string)

print(string)  # prints: Here is some string to test


re.sub says: if you find the regex pattern r'(&#\d\d\d\d)', replace it with ' ', searching in the 'string' variable.



Resources



https://pythex.org/ - for creating and testing patterns



Learning material



https://developers.google.com/edu/python/regular-expressions
https://www.tutorialspoint.com/python/python_reg_expressions.htm

















answered Mar 6 at 23:57









Ari Victor












  • Hello,This doesnt seem to work heres the raw HTML ``` <span class="coordinateX">(‭‭−‭‭52‬‭‬‬</span> ```

    – Mark
    yesterday
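
A possible reason the pattern finds nothing: the browser decodes the "&#8237" reference, so the string from .text presumably holds the actual U+202D character rather than the literal text "&#8237". A minimal sketch of cleaning on the character level, assuming the hidden characters are Unicode format characters (category "Cf") and the minus is U+2212 as in the raw HTML. `to_int` is a hypothetical helper name and the sample input is made up.

```python
import unicodedata


def to_int(text):
    """Drop invisible format characters and other decoration, then parse an int."""
    # Category "Cf" covers format characters such as U+202D and U+202C.
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # int() does not accept U+2212 MINUS SIGN, so map it to ASCII "-".
    visible = visible.replace("\u2212", "-")
    # Keep only the sign and digits (this drops the leading parenthesis).
    digits = "".join(ch for ch in visible if ch in "-0123456789")
    return int(digits)


# Hypothetical sample standing in for the scraped value.
print(to_int("(\u202d\u2212\u202d52\u202c"))
```

Filtering by category avoids listing every invisible code point by hand. If only U+202D and U+202C ever occur, two .replace() calls with "\u202d" and "\u202c" should do the same job.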


















The relevant HTML would have helped us debug the issue. However, you can use the get_attribute() method instead of the text property, as follows:



myText = driver.find_elements_by_xpath('(.//span[@class = "x"])')[0].get_attribute("innerHTML")
















answered 2 days ago









DebanjanB












  • Thanks for the reply. I tried .get_attribute() and it still has the same issue: len() returns 8 but it should be 4.

    – Mark
    yesterday












  • @Mark Can you update the question with the relevant HTML for further analysis?

    – DebanjanB
    yesterday











  • I've added this in now, thanks for the help.

    – Mark
    yesterday
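
Since innerHTML returns markup rather than rendered text, the value may contain entity references written out as text. A rough sketch, assuming that is the case: decode them first with html.unescape from the standard library, then drop the format characters. `clean_markup_text` is a hypothetical helper name and the input below is invented for illustration.

```python
import html
import unicodedata


def clean_markup_text(raw):
    """Decode HTML character references, then drop invisible format characters."""
    decoded = html.unescape(raw)  # "&#8237;" becomes the character U+202D
    return "".join(ch for ch in decoded if unicodedata.category(ch) != "Cf")


# Invented input: entity references written out as text.
cleaned = clean_markup_text("(&#8237;&#8722;52&#8236;")
print(len(cleaned))
```

The result still contains the U+2212 minus sign, so it needs mapping to "-" before calling int().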
















