How to ignore error message in R and keep loop function going
I'm using a for loop to read the URLs stored in a data frame and run a validation on each page, like this:
for (i in 1:nrow(df)) {
  webpage <- read_html(as.character(df[i, 1]))
  Sys.sleep(0.025)
  validation <- webpage %>% html_nodes("a") %>% html_attr("href")
  # Count the links that contain "bitstream"
  if (length(grep("bitstream", validation)) > 0) {
    df$text[[i]] <- "Valid"
  } else {
    df$text[[i]] <- "Invalid"
  }
}
The problem is that if a URL is broken, I get an error message like this:
Error in open.connection(x, "rb") : HTTP error 500
and the loop stops.
Is there a way to set another if condition so it doesn't stop?
r
There are some answers on SO that identify how to deal with this, but I'm a bit short on time to pull up search results. I usually do srh <- purrr::safely(read_html) and use srh() instead of read_html() and test the $result component of the return value. Alternately, wrap the read_html() call with try() or tryCatch() and test the result.
– hrbrmstr
Nov 7 at 17:44
tryCatch worked for me, even though I had to use a different condition for it to work! But thanks anyway =)
– Lucca Ramalho
Nov 8 at 11:39
you could post what worked as an answer which may, in turn, help future askers (and you'd get at least one upvote :-)
– hrbrmstr
Nov 8 at 11:46
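The two options from the first comment can be sketched roughly as follows. This is only an illustration: fetch_page and the example.org URLs are hypothetical stand-ins for read_html() and the real data, used so the snippet runs without a network connection. purrr::safely() wraps a function so that it returns a list with result and error components instead of raising the error.

```r
library(purrr)

# Hypothetical stand-in for read_html(): fails for URLs containing "broken"
fetch_page <- function(url) {
  if (grepl("broken", url)) stop("HTTP error 500")
  paste("<html>", url, "</html>")
}

# safely() returns a wrapped function that never throws;
# it returns list(result = ..., error = ...) instead
safe_fetch <- safely(fetch_page)

ok  <- safe_fetch("http://example.org/ok")
bad <- safe_fetch("http://example.org/broken")

# On success the error component is NULL; on failure the result is NULL
if (is.null(bad$result)) {
  message("Skipping broken URL: ", conditionMessage(bad$error))
}

# Base R alternative: try() returns an object of class "try-error" on failure
res <- try(fetch_page("http://example.org/broken"), silent = TRUE)
failed <- inherits(res, "try-error")
```

Inside the loop, either test (is.null(x$result) or inherits(res, "try-error")) could be followed by next to move on to the following URL.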
asked Nov 7 at 17:36
Lucca Ramalho
1 Answer
I used tryCatch as @hrbrmstr suggested in the comments, with one addition to make it work: a valid_url flag combined with a next statement, so that the loop skips to the next iteration whenever a URL fails.
library(rvest)   # read_html(), html_nodes(), html_attr(), %>%
library(scales)  # percent()

for (i in 1:nrow(df)) {
  valid_url <- TRUE
  # On error, flag the URL as invalid instead of letting the loop stop
  tryCatch({
    webpage <- read_html(as.character(df[i, 1]))
  }, error = function(e) valid_url <<- FALSE)
  if (!valid_url) {
    cat("\014")  # form feed: clears the RStudio console
    cat(paste(i, " - Invalid URL", "\nStatus: ",
              percent(i / nrow(df)), sep = ""))
    df$text[[i]] <- "Invalid URL"
    next
  }
  Sys.sleep(0.025)
  teste <- webpage %>% html_nodes("a") %>% html_attr("href")
  if (length(grep("bitstream", teste)) > 0) {
    df$text[[i]] <- "Completo"
  } else {
    df$text[[i]] <- "Incompleto"
  }
  cat("\014")
  cat(paste(i, " - ", df$text[[i]], "\nStatus: ",
            percent(i / nrow(df)), sep = ""))
}
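A possible variation on the loop above, offered as a sketch rather than as the answerer's method: tryCatch() evaluates to whatever the error handler returns, so the handler can return NULL and the loop can test is.null() instead of setting a flag with <<-. Here fetch_links, the url column, and the example.org URLs are hypothetical stand-ins for the read_html() pipeline and the real data, so the snippet runs offline.

```r
# Hypothetical stand-in for the read_html() + html_attr("href") pipeline:
# fails for URLs containing "broken", otherwise returns a vector of hrefs
fetch_links <- function(url) {
  if (grepl("broken", url)) stop("HTTP error 500")
  c("/home", "/bitstream/123/file.pdf")
}

df <- data.frame(
  url = c("http://example.org/ok", "http://example.org/broken"),
  stringsAsFactors = FALSE
)
df$text <- NA_character_

for (i in seq_len(nrow(df))) {
  # When an error occurs, tryCatch() returns the handler's value (NULL here)
  links <- tryCatch(fetch_links(df$url[i]), error = function(e) NULL)
  if (is.null(links)) {
    df$text[i] <- "Invalid URL"
    next  # skip the rest of this iteration and continue with the next URL
  }
  df$text[i] <- if (any(grepl("bitstream", links))) "Valid" else "Invalid"
}
```

Using seq_len(nrow(df)) instead of 1:nrow(df) also avoids running the loop body when the data frame has zero rows.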
answered Nov 8 at 11:56
Lucca Ramalho