BeautifulSoup output format error: too much whitespace











The following code prints an abnormal amount of whitespace in much of its output.



import bs4
import requests


res = requests.get('https://www.sportsbookreview.com/forum/search.php?do=finduser&userid=126807&contenttype=vBForum_Post&showposts=1')
soup = bs4.BeautifulSoup(res.text, 'lxml')
print(soup)


Here is the part of the output where the formatting becomes a problem:



Sportsbooks & The Industry    Service Plays    /   "   >   N   e   w   b   i   e       F   o   r   u   m   /   a   >   /   l   i   >   


Calling prettify() does not change anything. Any idea why this occurs?
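
One way to narrow this down, offered as a sketch rather than a diagnosis, is to check whether the gaps are real characters in the string or only an artifact of how the console renders it. The helper below counts the whitespace and non-printable characters in a string and reports their code points. `describe_gaps` is a hypothetical name, and the sample string is made up for illustration, not taken from the page.

```python
import unicodedata
from collections import Counter


def describe_gaps(text, limit=5):
    """Return the most common whitespace or non-printable characters in text.

    Each entry is a ("U+XXXX NAME", count) tuple, most frequent first.
    """
    counts = Counter(
        ch for ch in text
        if ch.isspace() or not ch.isprintable()
    )
    return [
        # Control characters such as NUL have no Unicode name, hence the default.
        (f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}", n)
        for ch, n in counts.most_common(limit)
    ]


# Assumption for illustration only: UTF-16 bytes decoded with a single-byte
# codec leave a NUL character after every ASCII letter. Whether this is what
# happens in the question is not known.
sample = "Newbie".encode("utf-16-le").decode("latin-1")

print(repr(sample))           # repr() makes invisible characters visible
print(describe_gaps(sample))
```

Running `describe_gaps(str(soup))` on the real output would show whether the separators are ordinary spaces (U+0020), NUL characters (U+0000), or something else, which hints at whether an encoding mismatch is involved.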










      beautifulsoup python-requests






      edited Nov 8 at 18:09

























      asked Nov 8 at 3:35 by WooHoo

          2 Answers

          If you check the page's source code (right-click the page and choose 'Show Page Source'), you will see that the markup itself contains whitespace.

          I ran your code, and it printed without the extra newlines and whitespace.

          You can try something like this:



          import bs4
          import requests


          res = requests.get('https://www.sportsbookreview.com/forum/search.php?do=finduser&userid=126807&contenttype=vBForum_Post&showposts=1')
          soup = bs4.BeautifulSoup(res.text, 'lxml')
          print(soup.prettify())





          • prettify changes nothing on my end. Both IDLE and PyCharm show something like this: i m g s r c = " h t t p s : / / f o r u m . s t a t i c - f i l e s . c o m / v b 4 / i m a g e s / i c o n s / i c o n 1 . p n g " / > a h r e f = " h t t p s : / / w w w . s p o r t s b o o k r e v i e w . c o
            – WooHoo
            Nov 8 at 17:38












          • Wow, that's odd. If you check the source code of the website you will see it has a normal output, no whitespace around characters. May I ask, did you change default fonts on your console or letter spacing between characters?
            – Dinko Pehar
            Nov 8 at 18:56








          • Interestingly, the output was normal when I ran it on Windows just now. The extra whitespace occurs only on my MacBook. I have not changed any default fonts or anything else.
            – WooHoo
            Nov 8 at 19:43










          • I'm glad you found some breadcrumbs for your problem. Try to solve it. And welcome to Stack Overflow :)
            – Dinko Pehar
            Nov 8 at 20:14
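
Since the same script behaves differently on the two machines, it may help to compare the environments side by side. The sketch below is only a starting point: `environment_report` is a hypothetical helper, and the package list is an assumption about what is installed. It gathers the platform, Python version, console encoding, and the installed versions of the libraries involved, so that the Windows and macOS results can be diffed.

```python
import locale
import platform
import sys
from importlib import metadata


def environment_report(packages=("beautifulsoup4", "lxml", "requests")):
    """Collect details that commonly differ between two machines."""
    report = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        # Encoding used when print() writes to the console (None if unknown).
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Default encoding the OS locale suggests for text data.
        "preferred_encoding": locale.getpreferredencoding(False),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If the two reports differ only in one library version or in the console encoding, that difference is a reasonable first suspect, though it does not prove the cause.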


















          Try 'html.parser' instead of 'lxml' as the parser, i.e. soup = bs4.BeautifulSoup(res.text, 'html.parser'):



          import bs4
          import requests


          res = requests.get('https://www.sportsbookreview.com/forum/search.php?do=finduser&userid=126807&contenttype=vBForum_Post&showposts=1')
          soup = bs4.BeautifulSoup(res.text, 'html.parser')
          print(soup)
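
To check whether the parser is really what makes the difference, the same markup can be run through every parser that happens to be installed. This is a rough sketch: `available_parsers` and `preview_with_each_parser` are hypothetical helper names, and `lxml` and `html5lib` are optional third-party packages that may not be present on a given machine.

```python
from importlib.util import find_spec


def available_parsers():
    """List the Beautiful Soup parser names usable in this environment."""
    # 'html.parser' ships with Python, so it is always available.
    parsers = ["html.parser"]
    for module_name in ("lxml", "html5lib"):
        if find_spec(module_name) is not None:
            parsers.append(module_name)
    return parsers


def preview_with_each_parser(markup, width=60):
    """Parse markup with each available parser and return a short preview.

    repr() is used so that any hidden characters show up as escapes.
    """
    from bs4 import BeautifulSoup  # third-party; imported only when called

    return {
        name: repr(str(BeautifulSoup(markup, name))[:width])
        for name in available_parsers()
    }


print(available_parsers())
```

Calling `preview_with_each_parser(res.text)` with the response from the question would show side by side whether only the `lxml` result contains the extra characters.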





            First answer (print(soup.prettify())): answered Nov 8 at 8:20 by Dinko Pehar








            Second answer ('html.parser' instead of 'lxml'): answered Nov 10 at 8:05 by NgoCuong





























