How to make BeautifulSoup 'replace_with' attribute work with a 'unicode' object?

Here is my html:



<html>
<body>
<h2>Pizza</h2>
<p>This is some random paragraph without child tags.</p>
<p>Delicious homebaked pizza.<br><em>$8.99 pp</em></p>
<h2>Eggplant Parmesan</h2>
<p>Try the authentic <i>Italian flavor</i> of baked aubergine.<br><em>$6.99 pp</em></p>
<h2>Italian Ice Cream</h2>
<p>Our dessert specialty.<br><em>$3.99 pp</em></p>
</body>
</html>


Using BeautifulSoup, I want to grab the text that is displayed for the h2 and p tags, replace them with a prefixed version in the tree, and also print them out on screen. For the h2 tags, this works fine:



from bs4 import BeautifulSoup

with open("/var/www/html/Test/index.html", "r") as f:
    soup = BeautifulSoup(f, "lxml")

f = open("/var/www/html/Test/I18N_index.html", "w+")

for h2 in soup.find_all('h2'):
    i18n_string = "I18N_"+h2.string
    h2.string.replace_with(i18n_string)
    print(h2.string)

f.write(str(soup))


###Output:##############################################
# $ python ./test.py
# I18N_Pizza
# I18N_Eggplant Parmesan
# I18N_Italian Ice Cream
########################################################


In my I18N_index.html, all 3 strings appear correctly prefixed with 'I18N_'.



However, some of my p tags contain child tags, and for these p.string returns None. As a result, the concatenation no longer works:



for p in soup.find_all('p'):
    i18n_string = "I18N_"+p.string
    p.string.replace_with(i18n_string)
    print(p.string)

f.write(str(soup))

###Output:##################################################
# $ python ./test.py
# I18N_Pizza
# I18N_Eggplant Parmesan
# I18N_Italian Ice Cream
# I18N_This is some random paragraph without child tags.
# Traceback (most recent call last):
# File "./test.py", line 15, in <module>
# i18n_string = "I18N_"+p.string
# TypeError: cannot concatenate 'str' and 'NoneType' objects
############################################################
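
The behaviour behind this traceback can be reproduced in isolation. The snippet below is a minimal sketch, assuming Python 3 and the built-in "html.parser" backend (swapped in for lxml only to avoid the extra dependency); the variable names are made up for illustration. It shows that .string yields the text when a tag has exactly one child, and None as soon as the tag holds several children, while .strings still yields every text node.

```python
from bs4 import BeautifulSoup

html = (
    "<p>This is some random paragraph without child tags.</p>"
    "<p>Our dessert specialty.<br><em>$3.99 pp</em></p>"
)
# Assumption: "html.parser" stands in for "lxml" to keep the sketch dependency-free.
soup = BeautifulSoup(html, "html.parser")
plain, mixed = soup.find_all("p")

plain_string = plain.string        # the single text child of the first <p>
mixed_string = mixed.string        # None, because this <p> has several children
mixed_parts = list(mixed.strings)  # every text node below the second <p>

print(repr(plain_string))
print(repr(mixed_string))
print(mixed_parts)
```

Concatenating "I18N_" with mixed_string is what raises the TypeError above.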


From this thread I learned about the join function. It lets me do the concatenation and print the resulting strings on screen, but not the replacement in the soup tree:



for p in soup.find_all('p'):
    joined = ''.join(p.strings)
    i18n_string = "I18N_"+joined
    #joined.replace_with(i18n_string)
    print(i18n_string)

###Output with 'joined.replace_with(i18n_string)' DISABLED:###
# I18N_Pizza
# I18N_Eggplant Parmesan
# I18N_Italian Ice Cream
# I18N_This is some random paragraph without child tags.
# I18N_Delicious homebaked pizza.$8.99 pp
# I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
# I18N_Our dessert specialty$3.99 pp
############################################################

###Output with 'joined.replace_with(i18n_string)' ENABLED:#####
# I18N_Pizza
# I18N_Eggplant Parmesan
# I18N_Italian Ice Cream
# Traceback (most recent call last):
# File "./test.py", line 41, in <module>
# joined.replace_with(i18n_string)
# AttributeError: 'unicode' object has no attribute 'replace_with'
############################################################


In that thread, another solution based on isinstance is mentioned, but I could not make that work.



If I understand correctly, the join function joins the strings but returns a 'unicode' object, not a string object, and this is why the 'replace_with' attribute doesn't work. How can I work around this? Any help is much appreciated.










Tags: python, beautifulsoup






asked yesterday by cbp




2 Answers














replace_with() fails here not because joined is a unicode object, but because it is a method specific to bs4 objects (tags and NavigableStrings); a plain string does not have it. See this: BeautifulSoup-replace_with



By the way, the join() method returns a str in Python 3 (under Python 2, as in your traceback, joining unicode strings returns unicode). See this: python3-join
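
The difference between the two kinds of string can be checked directly. This is a small sketch, assuming Python 3 and the "html.parser" backend (an assumption made to avoid the lxml dependency); the variable names are illustrative only. The text nodes inside the tree are NavigableString objects, which carry replace_with(), while the result of join() is an ordinary str that has lost its connection to the tree.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Assumption: "html.parser" stands in for "lxml" in this sketch.
soup = BeautifulSoup("<p>Try the <i>Italian flavor</i> today.</p>", "html.parser")
p = soup.p

joined = "".join(p.strings)   # plain str built from three text nodes
first_node = next(p.strings)  # first text node, still attached to the tree

joined_is_navigable = isinstance(joined, NavigableString)
node_is_navigable = isinstance(first_node, NavigableString)
joined_has_method = hasattr(joined, "replace_with")
node_has_method = hasattr(first_node, "replace_with")

print(type(joined).__name__, joined_has_method)
print(type(first_node).__name__, node_has_method)
```

Because joined is detached from the tree, there is nothing for it to replace; the replacement has to be requested from a node that lives in the soup.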



As a solution, I would simply call replace_with() on the p tag itself instead of on the joined string, which swaps the whole tag for the prefixed text:



from bs4 import BeautifulSoup

with open("index.html", "r") as f:
    soup = BeautifulSoup(f, "lxml")

f = open("I18N_index.html", "w+")

for h2 in soup.find_all('h2'):
    i18n_string = "I18N_"+h2.string
    h2.string.replace_with(i18n_string)
    print(h2.string)

for p in soup.find_all('p'):
    # Join all text nodes below the tag, then replace the tag itself
    joined = ''.join(p.strings)
    i18n_string = "I18N_"+joined
    p.replace_with(i18n_string)
    print(i18n_string)

f.write(str(soup))
f.close()  # flush the result to disk


          OUTPUT:



          I18N_Pizza
          I18N_Eggplant Parmesan
          I18N_Italian Ice Cream
          I18N_This is some random paragraph without child tags.
          I18N_Delicious homebaked pizza.$8.99 pp
          I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          I18N_Our dessert specialty.$3.99 pp
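
One side effect is worth checking in the written file: calling replace_with() on the tag removes the <p> element itself, not only its children. If the <p> wrapper should survive, assigning to the tag's .string attribute is one possible alternative. The comparison below is a sketch under the assumption of Python 3 and the "html.parser" backend (used instead of lxml to keep it dependency-free); soup_a, soup_b, result_a and result_b are illustrative names.

```python
from bs4 import BeautifulSoup

html = "<body><p>Our dessert specialty.<br><em>$3.99 pp</em></p></body>"

# Variant A: replace the whole tag, as in the loop above.
soup_a = BeautifulSoup(html, "html.parser")
for p in soup_a.find_all("p"):
    p.replace_with("I18N_" + "".join(p.strings))
result_a = str(soup_a)

# Variant B: keep the <p> tag and replace only its contents.
soup_b = BeautifulSoup(html, "html.parser")
for p in soup_b.find_all("p"):
    p.string = "I18N_" + "".join(p.strings)
result_b = str(soup_b)

print(result_a)
print(result_b)
```

Both variants drop the inner <br> and <em> tags; they differ only in whether the surrounding <p> remains in the tree.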






answered yesterday by Maaz

• This solution works. Thanks a lot, also for the additional information. – cbp, yesterday

• You are welcome :-) – Maaz, yesterday
































          With a simplified version of your code (that is, just taking care of the p tags issue), it looks like you have to replace p.string with p.text:



soup = BeautifulSoup([your html], "lxml")

for p in soup.find_all('p'):
    print('before: ',p.text)
    i18n_string = "I18N_"+p.text
    print('after ',i18n_string)


          Output:



          before: This is some random paragraph without child tags.
          after I18N_This is some random paragraph without child tags.
          before: Delicious homebaked pizza.$8.99 pp
          after I18N_Delicious homebaked pizza.$8.99 pp
          before: Try the authentic Italian flavor of baked aubergine.$6.99 pp
          after I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          before: Our dessert specialty.$3.99 pp
          after I18N_Our dessert specialty.$3.99 pp
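
Reading p.text (a property that returns the same value as get_text()) produces the prefixed string but leaves the tree untouched. If the goal is to change the tree while keeping inline tags such as <i>, <br> and <em>, one option is to call replace_with() on the first text node inside each paragraph. The sketch below assumes Python 3, the "html.parser" backend, and a bs4 version that accepts the string= argument of find() (older releases used text=); it is an illustration, not the approach of either answer.

```python
from bs4 import BeautifulSoup

html = (
    "<p>Try the authentic <i>Italian flavor</i> of baked aubergine."
    "<br><em>$6.99 pp</em></p>"
)
# Assumption: "html.parser" stands in for "lxml" in this sketch.
soup = BeautifulSoup(html, "html.parser")

for p in soup.find_all("p"):
    first_text = p.find(string=True)  # first text node below the tag, or None
    if first_text is not None:
        # The node is a NavigableString, so replace_with() is available.
        first_text.replace_with("I18N_" + first_text)

result = str(soup)
prefixed_text = soup.p.get_text()

print(result)
print(prefixed_text)
```

The prefix ends up inside the first text node, so if a paragraph starts with a child tag the prefix lands inside that tag.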





• Thanks for your reply. I had tried 'text' before, but it did not resolve my inability to use 'replace_with'. – cbp, yesterday










          Your Answer






          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "1"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );






          cbp is a new contributor. Be nice, and check out our Code of Conduct.









          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55023173%2fhow-to-make-beautifulsoup-replace-with-attribute-work-with-a-unicode-object%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2














          replace_with() method does not work not because joined is a unicode object, but because it is a method specific to bs4 object. See this: BeautifulSoup-replace_with



          By the way the join() method return a str See this: python3-join



          Now to give you a solution, I would simply remove the string after the p tag:



          from bs4 import BeautifulSoup

          with open("index.html", "r") as f:
          soup = BeautifulSoup(f, "lxml")

          f = open("I18N_index.html", "w+")

          for h2 in soup.find_all('h2'):
          i18n_string = "I18N_"+h2.string
          h2.string.replace_with(i18n_string)
          print(h2.string)

          for p in soup.find_all('p'):
          joined = ''.join(p.strings)
          i18n_string = "I18N_"+joined
          p.replace_with(i18n_string)
          print (i18n_string)


          f.write(str(soup))


          OUTPUT:



          I18N_Pizza
          I18N_Eggplant Parmesan
          I18N_Italian Ice Cream
          I18N_This is some random paragraph without child tags.
          I18N_Delicious homebaked pizza.$8.99 pp
          I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          I18N_Our dessert specialty.$3.99 pp






          share|improve this answer























          • This solution works. Thanks a lot, also for the additional information.

            – cbp
            yesterday











          • You are welcome :-)

            – Maaz
            yesterday















          2














          replace_with() method does not work not because joined is a unicode object, but because it is a method specific to bs4 object. See this: BeautifulSoup-replace_with



          By the way the join() method return a str See this: python3-join



          Now to give you a solution, I would simply remove the string after the p tag:



          from bs4 import BeautifulSoup

          with open("index.html", "r") as f:
          soup = BeautifulSoup(f, "lxml")

          f = open("I18N_index.html", "w+")

          for h2 in soup.find_all('h2'):
          i18n_string = "I18N_"+h2.string
          h2.string.replace_with(i18n_string)
          print(h2.string)

          for p in soup.find_all('p'):
          joined = ''.join(p.strings)
          i18n_string = "I18N_"+joined
          p.replace_with(i18n_string)
          print (i18n_string)


          f.write(str(soup))


          OUTPUT:



          I18N_Pizza
          I18N_Eggplant Parmesan
          I18N_Italian Ice Cream
          I18N_This is some random paragraph without child tags.
          I18N_Delicious homebaked pizza.$8.99 pp
          I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          I18N_Our dessert specialty.$3.99 pp






          share|improve this answer























          • This solution works. Thanks a lot, also for the additional information.

            – cbp
            yesterday











          • You are welcome :-)

            – Maaz
            yesterday













          2












          2








          2







          replace_with() method does not work not because joined is a unicode object, but because it is a method specific to bs4 object. See this: BeautifulSoup-replace_with



          By the way the join() method return a str See this: python3-join



          Now to give you a solution, I would simply remove the string after the p tag:



          from bs4 import BeautifulSoup

          with open("index.html", "r") as f:
          soup = BeautifulSoup(f, "lxml")

          f = open("I18N_index.html", "w+")

          for h2 in soup.find_all('h2'):
          i18n_string = "I18N_"+h2.string
          h2.string.replace_with(i18n_string)
          print(h2.string)

          for p in soup.find_all('p'):
          joined = ''.join(p.strings)
          i18n_string = "I18N_"+joined
          p.replace_with(i18n_string)
          print (i18n_string)


          f.write(str(soup))


          OUTPUT:



          I18N_Pizza
          I18N_Eggplant Parmesan
          I18N_Italian Ice Cream
          I18N_This is some random paragraph without child tags.
          I18N_Delicious homebaked pizza.$8.99 pp
          I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          I18N_Our dessert specialty.$3.99 pp






          share|improve this answer













          replace_with() method does not work not because joined is a unicode object, but because it is a method specific to bs4 object. See this: BeautifulSoup-replace_with



          By the way the join() method return a str See this: python3-join



          Now to give you a solution, I would simply remove the string after the p tag:



          from bs4 import BeautifulSoup

          with open("index.html", "r") as f:
          soup = BeautifulSoup(f, "lxml")

          f = open("I18N_index.html", "w+")

          for h2 in soup.find_all('h2'):
          i18n_string = "I18N_"+h2.string
          h2.string.replace_with(i18n_string)
          print(h2.string)

          for p in soup.find_all('p'):
          joined = ''.join(p.strings)
          i18n_string = "I18N_"+joined
          p.replace_with(i18n_string)
          print (i18n_string)


          f.write(str(soup))


          OUTPUT:



          I18N_Pizza
          I18N_Eggplant Parmesan
          I18N_Italian Ice Cream
          I18N_This is some random paragraph without child tags.
          I18N_Delicious homebaked pizza.$8.99 pp
          I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          I18N_Our dessert specialty.$3.99 pp







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered yesterday









          MaazMaaz

          359211




          359211












          • This solution works. Thanks a lot, also for the additional information.

            – cbp
            yesterday











          • You are welcome :-)

            – Maaz
            yesterday

















          • This solution works. Thanks a lot, also for the additional information.

            – cbp
            yesterday











          • You are welcome :-)

            – Maaz
            yesterday
















          This solution works. Thanks a lot, also for the additional information.

          – cbp
          yesterday





          This solution works. Thanks a lot, also for the additional information.

          – cbp
          yesterday













          You are welcome :-)

          – Maaz
          yesterday





          You are welcome :-)

          – Maaz
          yesterday













          1














          With a simplified version of your code (that is, just taking care of the p tags issue), it looks like you have to replace p.string with p.text:



          soup = BeautifulSoup([your html], "lxml")



           for p in soup.find_all('p'):
          print('before: ',p.text)
          i18n_string = "I18N_"+p.text
          print('after ',i18n_string)


          Output:



          before: This is some random paragraph without child tags.
          after I18N_This is some random paragraph without child tags.
          before: Delicious homebaked pizza.$8.99 pp
          after I18N_Delicious homebaked pizza.$8.99 pp
          before: Try the authentic Italian flavor of baked aubergine.$6.99 pp
          after I18N_Try the authentic Italian flavor of baked aubergine.$6.99 pp
          before: Our dessert specialty.$3.99 pp
          after I18N_Our dessert specialty.$3.99 pp
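
The loop above only builds the prefixed string; it does not write anything back into the document. As a minimal sketch of one way to do that: replace_with is a method of the nodes in the tree (Tag and NavigableString), whereas p.text returns a plain string, so the call has to be made on the node being replaced, with the new string passed as the argument. The sample markup and the helper name prefix_text_nodes below are hypothetical, and "html.parser" is used only to avoid depending on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the HTML from the question is not reproduced here.
SAMPLE_HTML = """
<div>
  <h2>Pizza</h2>
  <p>Delicious homebaked pizza.<span>$8.99 pp</span></p>
  <p>This is some random paragraph without child tags.</p>
</div>
"""


def prefix_text_nodes(html, prefix="I18N_"):
    """Return html with prefix prepended to every non-blank text node."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all has already collected the text nodes into a list-like
    # ResultSet, so replacing them inside the loop is safe.
    for node in soup.find_all(string=True):
        if node.strip():  # skip whitespace-only nodes between tags
            # Call replace_with on the node itself and pass a plain string.
            node.replace_with(prefix + node)
    return str(soup)


result = prefix_text_nodes(SAMPLE_HTML)
print(result)
```

Note that this prefixes each text node separately, so a paragraph with a nested span gets one prefix for its own text and another inside the span. If a single prefix per paragraph is wanted instead, the child tags would have to be flattened first, which this sketch does not attempt.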






























          • Thanks for your reply. I had tried 'text' before, but it did not resolve my inability to use 'replace_with'.

            – cbp
            yesterday























          edited yesterday

























          answered yesterday









          Jack Fleeting











































