Requesting using urllib.request and requests in Python 3
I'm building a web scraping application and I have run into a problem. I have built a similar one for espn.com, but I'm trying to move on to something more comprehensive and useful for research...
Problem: the urllib.request and requests libraries are not returning the entire page source for scraping; I only get enough HTML to scrape one of the tables. Here is the page I am currently using for testing:
https://www.sports-reference.com/cfb/players/ryan-aplin-1.html
import pdb
import urllib.request as ureq
from bs4 import BeautifulSoup as BS

def getPlayerStats(playerName, times):
    # The URL is hard-coded for testing; playerName and times are unused here.
    url = "https://www.sports-reference.com/cfb/players/ryan-aplin-1.html"
    # read() returns the raw response body as bytes.
    html = ureq.urlopen(url).read()
    print(html)

times = 1
getPlayerStats("Ryan Aplin", times)
The "times" variable is for a separate function that creates URLs based on the website's formatting, so it is not applicable here.
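For context, here is a minimal sketch of what such a URL-building helper might look like. The function name `build_player_url` and the slug rule (lowercased name joined with hyphens, followed by the index) are hypothetical, inferred only from the single test URL above; the real site may format other names differently.

```python
BASE_URL = "https://www.sports-reference.com/cfb/players/"

def build_player_url(player_name, times):
    """Build a player page URL from a name and a numeric index.

    Hypothetical helper: the slug format is assumed from one example URL.
    """
    # "Ryan Aplin" -> "ryan-aplin"
    slug = "-".join(player_name.lower().split())
    return f"{BASE_URL}{slug}-{times}.html"

print(build_player_url("Ryan Aplin", 1))
# → https://www.sports-reference.com/cfb/players/ryan-aplin-1.html
```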
So my questions are: am I requesting the page source incorrectly? Will I need to switch to different tools?
The same approach has worked on other websites, so I don't understand why it is not working here.
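One way to narrow this down is to count the `<table>` tags in the HTML that was actually returned, which shows whether the tables are missing from the response or only from the parsed result. The sketch below uses only the standard library's `html.parser` rather than BeautifulSoup. It also counts tables that sit inside HTML comments, which is one possible explanation I am guessing at, not something confirmed for this site. The names `TableCounter` and `count_tables` and the `sample` markup are made up for illustration.

```python
from html.parser import HTMLParser

class TableCounter(HTMLParser):
    """Count <table> start tags in normal markup and inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.visible = 0    # tables in the regular markup
        self.commented = 0  # tables found inside <!-- ... --> blocks

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.visible += 1

    def handle_comment(self, data):
        # Parse the comment text as HTML and count any tables in it.
        inner = TableCounter()
        inner.feed(data)
        inner.close()
        self.commented += inner.visible + inner.commented

def count_tables(html):
    """Return (visible, commented) table counts for a str or bytes document."""
    if isinstance(html, bytes):
        html = html.decode("utf-8", errors="replace")
    counter = TableCounter()
    counter.feed(html)
    counter.close()
    return counter.visible, counter.commented

# Made-up sample: one normal table and one table hidden in a comment.
sample = """
<div><table id="passing"><tr><td>1</td></tr></table></div>
<div><!-- <table id="rushing"><tr><td>2</td></tr></table> --></div>
"""
print(count_tables(sample))  # → (1, 1)
```

Passing the bytes returned by `ureq.urlopen(url).read()` to `count_tables` would give the same two counts for the real page. If both are lower than the number of tables visible in a browser, the remaining tables are probably added by JavaScript after the page loads, and a plain HTTP request would not include them.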
Thanks.
python html python-3.x web-scraping
edited Nov 8 at 4:47
asked Nov 8 at 4:41
Bryan D