Python Selenium Web Scrape embedded csv/excel file in XPATH to pandas dataframe























I need to download data from a secured website using Python and Selenium WebDriver. At the moment I locate the download link by XPath and call click() on it, which saves the file to the local download folder or to any other folder I configure.

Instead of saving the file to a folder, I want to load its contents into a database table.

How can I direct the downloaded file into a table?



data = driver.find_element_by_xpath('//*[@id="modules--reports-gridTar-instanceGrid"]/div/div[1]/div/div[2]/div[2]/a').click()
print(data)  # output: None, because click() returns nothing


The downloaded file is visible in the web browser.

How can I pick up the file shown in the browser and load it into a table?










      python csv selenium download webdriver














      edited Nov 8 at 3:22 by Tân Nguyễn
      asked Nov 8 at 0:36 by Deepa
























          1 Answer
          Don't call click(). Instead, read the link's href attribute and download the file directly with Python.



          import urllib.request  # Python 3 replacement for urllib2

          data = driver.find_element_by_xpath('//*[@id="modules--reports-gridTar-instanceGrid"]/div/div[1]/div/div[2]/div[2]/a')
          url = data.get_attribute('href')
          # Call read() only once; it returns the response body as bytes.
          response = urllib.request.urlopen(url).read()
          print(response)
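Once the response really contains CSV bytes, getting them into a table needs no intermediate file. The following is only a minimal sketch under stated assumptions: it uses the standard library's csv and sqlite3 modules, the function name load_csv_bytes, the table name "report" and the all-TEXT columns are hypothetical, and the sample bytes are made up. With pandas, pandas.read_csv(io.BytesIO(raw)) followed by DataFrame.to_sql would be a more direct route to the same result.

```python
import csv
import io
import sqlite3


def load_csv_bytes(raw, conn, table="report"):
    """Parse CSV bytes and insert the rows into a SQLite table.

    The table name and the all-TEXT column types are assumptions made
    for illustration; the header row supplies the column names.
    """
    text = raw.decode("utf-8-sig")  # tolerate a leading byte-order mark
    if text.lstrip().lower().startswith("<!doctype html"):
        # The server sent a web page (for example a login page), not the file.
        raise ValueError("got an HTML page instead of CSV")
    # Skip completely empty lines so every row matches the header width.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    cols = ", ".join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, cols))
    marks = ", ".join("?" for _ in header)
    conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), body)
    conn.commit()
    return len(body)


# Made-up sample bytes standing in for the downloaded file.
conn = sqlite3.connect(":memory:")
sample = b"id,name\r\n1,alpha\r\n2,beta\r\n"
inserted = load_csv_bytes(sample, conn)
print(inserted, conn.execute("SELECT * FROM report").fetchall())
```

The HTML check is there because an unauthenticated request tends to return a page rather than the file, and failing early is easier to debug than a table full of markup.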
















          answered Nov 8 at 8:20 by ewwink












          • I tried get_attribute('href') and got the URL that holds the CSV file. But opening and reading that URL returns an HTML page instead of the file, like this: .....b'<!DOCTYPE html>\r\n<html xmlns="w3.org/1999/xhtml" style="position:inherit;">\r\n\t<head id="ctl00_ctl00_Head1"><meta http-equiv="X-UA-Compatible" content="IE=edge" /><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><script type="text/javascript">window.NREUM|....... So I am not able to read the actual CSV file.
            – Deepa
            Nov 8 at 11:43












          • Try adding a Referer header to your request, or also add the cookies.
            – ewwink
            Nov 8 at 11:48
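The suggestion above can be sketched as follows. A plain urlopen(url) call does not share the browser's login session, which would explain getting an HTML page back. This sketch copies the cookies that Selenium reports through driver.get_cookies() into a Cookie header and sets a Referer header. The function name build_download_request, the cookie names and values, and the example.com URLs are all hypothetical; whether these two headers are enough for this particular site is an assumption, and it may not work if the link triggers a JavaScript postback rather than a plain GET.

```python
import urllib.request


def build_download_request(url, selenium_cookies, referer):
    """Build a urllib request that reuses the browser's session.

    selenium_cookies is the list of dicts returned by driver.get_cookies();
    only the "name" and "value" keys are used here.
    """
    cookie_header = "; ".join(
        "{}={}".format(c["name"], c["value"]) for c in selenium_cookies
    )
    return urllib.request.Request(
        url,
        headers={"Cookie": cookie_header, "Referer": referer},
    )


# Hypothetical values; in a live session they would come from
# driver.get_cookies(), driver.current_url and data.get_attribute('href').
cookies = [
    {"name": "ASP.NET_SessionId", "value": "abc123"},
    {"name": "auth", "value": "xyz"},
]
req = build_download_request(
    "https://example.com/reports/export.csv",
    cookies,
    "https://example.com/reports",
)
print(req.get_header("Cookie"))
# With a real session: raw = urllib.request.urlopen(req).read()
```

If the server still answers with HTML, comparing these headers with the request the browser itself sends (visible in its network tools) is a reasonable next step.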

















