python3 - how to scrape the data from a span

I am trying to use Python 3 and BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = "https://www.binance.com/pl"

# get the data
data = requests.get(url)

soup = BeautifulSoup(data.text, 'lxml')

print(soup)

If I open the HTML code in the browser, I can see the BTC price:
(screenshot: HTML code in browser)

But in my data (printed in the console) I can't see the BTC price:
(screenshot: the data I can't see in the console)

Could you give me some advice on how to scrape this data?

asked Nov 7 at 21:38

user10620635

113

python-3.x web-scraping beautifulsoup


1 Answer
accepted

Use .find_all() (spelled .findAll() in older BeautifulSoup code) to find all the rows, then call it again on each row to find that row's cells. You have to look at how the page is structured: it is not a standard HTML table, but a set of divs made to look like one. So you have to look at the role attribute of each div to get to the data you want.

I'm assuming that you want to look at specific rows, so my example uses the Para (pair) column to find them. Since the star is in its own little cell, the Para column is the second cell, at index 1. With that, it's just a question of which cells you want to export.
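To make the role-based lookup concrete, here is a minimal sketch that runs against an inline HTML string instead of the live page. The markup in SAMPLE_HTML and the helper name extract_rows are assumptions made up for illustration; the real page has more cells per row and may change its structure at any time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a simplified stand-in for the page's div-based grid.
SAMPLE_HTML = """
<div role="grid">
  <div role="row">
    <div role="gridcell">*</div>
    <div role="gridcell">ADA/BTC</div>
    <div role="gridcell">0.00001170</div>
  </div>
  <div role="row">
    <div role="gridcell">*</div>
    <div role="gridcell">ETH/BTC</div>
    <div role="gridcell">0.03300000</div>
  </div>
  <div role="row"></div>
</div>
"""


def extract_rows(html):
    """Return one list of cell texts per div with role="row" that has cells."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for row in soup.find_all("div", {"role": "row"}):
        cells = row.find_all("div", {"role": "gridcell"})
        if cells:  # skip rows that contain no cells
            result.append([cell.get_text(strip=True) for cell in cells])
    return result


rows = extract_rows(SAMPLE_HTML)
# The star sits at index 0, so the pair name is at index 1.
pairs = [values[1] for values in rows]
print(pairs)  # ['ADA/BTC', 'ETH/BTC']
```

Skipping empty rows up front is one alternative to catching IndexError later; either approach should work for rows that have no cells.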

You could take out the filter if you want to get everything. You can also modify it to see if the value of a cell is above a certain price point.
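As a rough sketch of the price-point idea: the cell text comes back as a string and may contain a thousands separator (for example 1,646.39204255), so it has to be converted to a number before comparing. The helper names parse_price and rows_above, the sample rows, and the threshold are all hypothetical, and the sketch assumes the value you care about is in the last cell.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a cell string such as '1,646.39204255' to a Decimal."""
    return Decimal(text.replace(",", "").strip())


def rows_above(rows, threshold, value_index=-1):
    """Keep only the rows whose value cell is strictly above the threshold."""
    return [row for row in rows if parse_price(row[value_index]) > threshold]


# Hypothetical cell values, shaped like the lists built from each row.
sample_rows = [
    ["*", "ADA/BTC", "1,646.39204255"],
    ["*", "ADX/BTC", "35.29384873"],
]

for row in rows_above(sample_rows, Decimal("100")):
    print(row[1], row[-1])  # ADA/BTC 1,646.39204255
```

Decimal is used here so the digits are compared exactly; float would probably be fine too if small rounding differences do not matter to you. A cell that is not a number will raise decimal.InvalidOperation, which you may want to catch.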

# Import necessary libraries
import requests
from bs4 import BeautifulSoup

# Ignore the insecure warning
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

# Set options and which rows you want to look at
url = "https://www.binance.com/pl"
desired_rows = ['ADA/BTC', 'ADX/BTC']

# Get the page and convert it into beautiful soup
# (verify=False skips TLS certificate verification, hence the silenced warning above)
response = requests.get(url, verify=False)
soup = BeautifulSoup(response.text, 'html.parser')

# Find all table rows (find_all is the current name for findAll)
rows = soup.find_all('div', {'role': 'row'})

# Process all the rows in the table
for row in rows:
    try:
        # Get the cells for the given row
        cells = row.find_all('div', {'role': 'gridcell'})
        # Convert them to just the text of each cell, ignoring attributes
        cell_values = [c.text for c in cells]

        # See if the row is one you want
        if cell_values[1] in desired_rows:
            # Output the data however you'd like
            print(cell_values[1], cell_values[-1])

    except IndexError:  # there was a row without enough cells
        pass

This resulted in the following output:

ADA/BTC 1,646.39204255

ADX/BTC 35.29384873
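If printing is not the output you want, the print call in the loop could be swapped for something like CSV. This is only a sketch: rows_to_csv is a hypothetical helper, the column names pair and value are my own choice, and the sample rows simply reuse the two values shown above.

```python
import csv
import io


def rows_to_csv(rows, pair_index=1, value_index=-1):
    """Build CSV text with one line per row: the pair name and its value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["pair", "value"])
    for cell_values in rows:
        writer.writerow([cell_values[pair_index], cell_values[value_index]])
    return buffer.getvalue()


# Hypothetical cell values, shaped like the cell_values lists in the loop.
sample_rows = [
    ["*", "ADA/BTC", "1,646.39204255"],
    ["*", "ADX/BTC", "35.29384873"],
]

print(rows_to_csv(sample_rows), end="")
```

Note that the csv module puts quotes around 1,646.39204255 because it contains a comma, so the value survives a round trip through a CSV reader.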

edited Nov 7 at 22:11

answered Nov 7 at 22:05

Brian Cohan

1,8381821

Oh my God! Thank you so much! You are awesome!
– user10620635
Nov 7 at 22:20
