Python Selenium Web Scrape embedded csv/excel file in XPATH to pandas dataframe
I need to download data from a secured website using Python and Selenium WebDriver. Currently I locate the link by XPath and call click(), which downloads the file to the local download folder or to any folder I specify.
Instead of downloading to a folder, I want to load the data into a database table.
How can I direct the downloaded file into a table?
data = driver.find_element_by_xpath('//*[@id="modules--reports-gridTar-instanceGrid"]/div/div[1]/div/div[2]/div[2]/a').click()
print(data)  # output: None, because click() returns nothing
The downloaded file then appears in the web browser (screenshot not preserved).
How can I load the file shown in the browser into a table?
python csv selenium download webdriver
edited Nov 8 at 3:22 by Tân Nguyễn
asked Nov 8 at 0:36 by Deepa
1 Answer
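Don't call click(). Instead, read the link's href attribute and download the file with Python.
import urllib.request

# Locate the link, but read its href instead of clicking it
data = driver.find_element_by_xpath('//*[@id="modules--reports-gridTar-instanceGrid"]/div/div[1]/div/div[2]/div[2]/a')
url = data.get_attribute('href')
# urlopen() returns a response object; read() returns the body as bytes
response = urllib.request.urlopen(url).read()
print(response)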
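To get from the downloaded bytes to a database table without going through the download folder, one option is to parse the bytes in memory. The sketch below is only an illustration, not the answerer's code: it assumes `response` holds UTF-8 CSV bytes with a header row, uses the standard-library `csv` and `sqlite3` modules, and the function name `load_csv_bytes`, the table name `report`, and the sample bytes are all hypothetical. With pandas installed, `pandas.read_csv(io.BytesIO(response))` followed by `DataFrame.to_sql` would be the DataFrame-based equivalent.

```python
import csv
import io
import sqlite3


def load_csv_bytes(raw, conn, table="report"):
    """Parse CSV bytes in memory and insert the rows into an SQLite table.

    Assumes UTF-8 text with a header row; every column is stored as TEXT.
    Returns the number of data rows inserted.
    """
    # utf-8-sig also strips a byte-order mark if the export includes one.
    reader = csv.reader(io.StringIO(raw.decode("utf-8-sig"), newline=""))
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines

    # Quote identifiers so header names containing spaces still work.
    columns = ", ".join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"{}"'.format(table.replace('"', '""'))

    conn.execute("CREATE TABLE IF NOT EXISTS {} ({})".format(quoted_table, columns))
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample standing in for the bytes read from the report URL.
sample = b"id,name\r\n1,alpha\r\n2,beta\r\n"
conn = sqlite3.connect(":memory:")
inserted = load_csv_bytes(sample, conn)
stored = conn.execute('SELECT id, name FROM "report" ORDER BY id').fetchall()
```

In real use, `sample` would be replaced by the bytes returned from the report URL, and the in-memory connection by a connection to the target database.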
I tried get_attribute('href') and got the URL that holds the CSV file. But opening and reading that URL returns an HTML page with scripts instead of the CSV data, like this: b'<!DOCTYPE html>\r\n<html xmlns="w3.org/1999/xhtml" style="position:inherit;">\r\n\t<head id="ctl00_ctl00_Head1"><meta http-equiv="X-UA-Compatible" content="IE=edge" /><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><script type="text/javascript">window.NREUM|....... I am not able to read the actual CSV file.
– Deepa
Nov 8 at 11:43
Try adding a Referer header to your request, and also send the session cookies.
– ewwink
Nov 8 at 11:48
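The HTML returned in the comment above is consistent with the server not recognising the Python request as part of the logged-in browser session. A minimal sketch of the suggestion, assuming the site authenticates through cookies: copy the cookies from the Selenium session (`driver.get_cookies()` returns a list of dicts with `name` and `value` keys) into a `Cookie` header and set `Referer` to the page that hosts the link. The helper names `build_request` and `looks_like_html` and the example URLs are hypothetical, and whether the site also requires other headers or tokens is unknown.

```python
import urllib.request


def build_request(url, selenium_cookies, referer):
    """Build a urllib Request that reuses the browser session's cookies."""
    cookie_header = "; ".join(
        "{}={}".format(c["name"], c["value"]) for c in selenium_cookies
    )
    return urllib.request.Request(
        url,
        headers={"Cookie": cookie_header, "Referer": referer},
    )


def looks_like_html(body):
    """Heuristic check that a login or error page came back instead of CSV."""
    start = body.lstrip()[:15].lower()
    return start.startswith(b"<!doctype html") or start.startswith(b"<html")


# With a live session this would be (not run here):
#   request = build_request(url, driver.get_cookies(), driver.current_url)
#   body = urllib.request.urlopen(request).read()
#   if looks_like_html(body): ...  # still not authenticated

# Hypothetical values in place of a real driver, to show the headers produced.
fake_cookies = [{"name": "session", "value": "abc"}, {"name": "tok", "value": "xyz"}]
request = build_request(
    "https://example.com/report.csv", fake_cookies, "https://example.com/reports"
)
cookie_value = request.get_header("Cookie")
referer_value = request.get_header("Referer")
```

If `looks_like_html` still returns True for the downloaded body, the request is probably being redirected to a login or error page, and the session details need another look before the bytes are parsed as CSV.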
answered Nov 8 at 8:20 by ewwink