Web scraping with Puppeteer: if a selector can't be read after multiple tries, go to the next element
I'm saving URLs from a video player's playlist.
The sequence of actions:
- Click on a video
- Find the video element and get its src
- Add the src to an array
Everything works fine, but sometimes there is an element whose src I can't get. What I want is to try to get the element's src up to 5 times and, if the 5th attempt fails, push the counter's value (i) into the array instead of the URL, then click on the next video and get the src of the next element. Please let me know how I can do this.
Here is my current code, which clicks the items one by one and gets the element's src, but just freezes if it can't get the src:
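const links = await page.evaluate((SELECTORS) => {
  return new Promise((resolve, reject) => {
    // Playlist entries; note that childNodes may also include text nodes.
    const items = document.querySelector(SELECTORS.list).childNodes;
    let i = 0;
    const urls = [];
    const video = document.querySelector(SELECTORS.video);
    let counter = 0; // not used yet
    const limit = 10; // not used yet
    // Poll the player every 10 ms until its src changes to a URL not seen before.
    const timer = setInterval(() => {
      const url = video.getAttribute('src');
      if (!urls.includes(url)) {
        urls.push(url);
        i++;
        if (items[i]) {
          // Pass a function so the click happens after the 2 s delay,
          // rather than calling click() immediately.
          const next = items[i];
          setTimeout(() => next.click(), 2000);
        } else {
          clearInterval(timer);
          resolve(urls);
        }
      }
    }, 10);
    items[0].click();
  });
}, SELECTORS);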
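One possible way to get the behaviour described above (retry up to 5 times, then store the index and move on) is to bound the polling per item instead of polling forever. The following is only a sketch: collectSources, clickItem, readSrc, maxAttempts and delayMs are hypothetical names introduced here, not part of the original code or of Puppeteer. The click and the src lookup are passed in as callbacks so the retry logic itself does not depend on the page.

```javascript
// Hypothetical helper; all names and default values are illustrative assumptions.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// count:     number of playlist items
// clickItem: async (index) => void, clicks the playlist item at that index
// readSrc:   async () => string | null, reads the player's current src
async function collectSources(count, clickItem, readSrc,
                              { maxAttempts = 5, delayMs = 500 } = {}) {
  const urls = [];
  for (let i = 0; i < count; i++) {
    await clickItem(i);
    let found = null;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      await sleep(delayMs);
      const src = await readSrc();
      // Accept only a non-empty src that was not collected before,
      // i.e. the player appears to have switched to the new video.
      if (src && !urls.includes(src)) {
        found = src;
        break;
      }
    }
    // After maxAttempts failed reads, store the index as a placeholder.
    urls.push(found !== null ? found : i);
  }
  return urls;
}
```

With Puppeteer, the callbacks could be wired up on the Node side, for example by fetching the item handles with page.$$(SELECTORS.list + ' > *'), using (i) => handles[i].click() as clickItem, and () => page.$eval(SELECTORS.video, (v) => v.getAttribute('src')) as readSrc. Like the original code, this treats a repeated src as "not loaded yet", so two playlist entries that genuinely share a URL would be recorded as a failure for the second one.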
javascript web-scraping puppeteer
asked Nov 8 at 13:53
Bong2000
385