How to ignore error message in R and keep loop function going
I'm using a for loop to read some URLs stored in a data frame (df) and do some validation, like this:
    for (i in 1:nrow(df)) {
      webpage <- read_html(as.character(df[i,1]))
      Sys.sleep(0.025)
      validation <- webpage %>% html_nodes("a") %>% html_attr('href')
      if (length(grep("bitstream", validation)) > 0) {
        df$text[[i]] <- "Valid"
      } else {
        df$text[[i]] <- "Invalid"
      }
    }
The problem is that if a URL is broken, I get an error message like this:
    Error in open.connection(x, "rb") : HTTP error 500
and the loop stops.
Is there a way to add another condition so it doesn't stop?
r
asked Nov 7 at 17:36
Lucca Ramalho
There are some answers on SO that show how to deal with this, but I'm a bit short on time to pull up search results. I usually do srh <- purrr::safely(read_html), use srh() instead of read_html(), and test the $result component of the return value. Alternatively, wrap the read_html() call with try() or tryCatch() and test the result.
– hrbrmstr
Nov 7 at 17:44
tryCatch worked for me, even though I had to use a different condition for it to work! But thanks anyway =)
– Lucca Ramalho
Nov 8 at 11:39
You could post what worked as an answer, which may in turn help future askers (and you'd get at least one upvote :-)
– hrbrmstr
Nov 8 at 11:46
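For readers who want to see the purrr::safely() suggestion from the first comment in action, here is a minimal sketch. It assumes the purrr package is installed. So that it runs without network access, it uses a hypothetical fetch_page() function as a stand-in for read_html(); the URLs and the "broken" rule are made up for illustration.

```r
library(purrr)

# Hypothetical stand-in for read_html(): fails for URLs containing "broken".
fetch_page <- function(url) {
  if (grepl("broken", url, fixed = TRUE)) stop("HTTP error 500")
  paste0("<a href='", url, "/bitstream/1'>file</a>")
}

# safely() wraps a function so that it never throws. The wrapped function
# returns list(result = <value or NULL>, error = <condition or NULL>).
safe_fetch <- safely(fetch_page)

urls <- c("http://example.org/ok", "http://example.org/broken")
status <- character(length(urls))

for (i in seq_along(urls)) {
  res <- safe_fetch(urls[i])
  if (is.null(res$result)) {
    # The call failed; res$error holds the condition object.
    status[i] <- "Invalid URL"
    next
  }
  status[i] <- if (grepl("bitstream", res$result, fixed = TRUE)) "Valid" else "Invalid"
}

print(status)
```

In real code, safely(read_html) would take the place of safely(fetch_page), and the check on res$result would be followed by the html_nodes() / html_attr() steps from the question.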
1 Answer
I used tryCatch as @hrbrmstr suggested in the comments, with a small addition to make it work: a valid_url flag that is checked after the call, together with a next statement so the loop skips to the next iteration.
    library(rvest)   # read_html(), html_nodes(), html_attr(), %>%
    library(scales)  # percent()

    for (i in 1:nrow(df)) {
      valid_url <- TRUE
      # On error, flip the flag instead of letting the loop stop
      tryCatch({webpage <- read_html(as.character(df[i,1]))},
               error = function(e) valid_url <<- FALSE)
      if (!valid_url) {
        cat("\014")  # form feed: clears the console in RStudio
        cat(paste(i, " - Invalid URL", "\nStatus: ",
                  percent(i/nrow(df)), sep = ""))
        df$text[[i]] <- "Invalid URL"
        next
      }
      Sys.sleep(0.025)
      teste <- webpage %>% html_nodes("a") %>% html_attr('href')
      if (length(grep("bitstream", teste)) > 0) {
        df$text[[i]] <- "Valid"
      } else {
        df$text[[i]] <- "Invalid"
      }
      cat("\014")
      cat(paste(i, " - ", df$text[[i]], "\nStatus: ",
                percent(i/nrow(df)), sep = ""))
    }
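A possible variation, sketched here under stated assumptions: tryCatch() returns the value of its error handler, so the handler can return NULL and the loop can test for that instead of setting a flag with <<-. To keep the sketch runnable offline, fetch_page() is a hypothetical stand-in for read_html(), and the data frame contents are made up.

```r
# Hypothetical stand-in for read_html(): fails for URLs containing "broken".
fetch_page <- function(url) {
  if (grepl("broken", url, fixed = TRUE)) stop("HTTP error 500")
  paste0("<a href='", url, "/bitstream/1'>file</a>")
}

df <- data.frame(
  url = c("http://example.org/ok", "http://example.org/broken"),
  stringsAsFactors = FALSE
)
df$text <- NA_character_

for (i in seq_len(nrow(df))) {
  # On error, tryCatch() evaluates to the handler's return value (NULL here).
  webpage <- tryCatch(fetch_page(df$url[i]), error = function(e) NULL)
  if (is.null(webpage)) {
    df$text[i] <- "Invalid URL"
    next
  }
  df$text[i] <- if (grepl("bitstream", webpage, fixed = TRUE)) "Valid" else "Invalid"
}

print(df)
```

Returning a value from the handler avoids the global assignment, which also makes the loop safe to move inside a function. seq_len(nrow(df)) is used instead of 1:nrow(df) so that an empty data frame produces zero iterations.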
answered Nov 8 at 11:56
Lucca Ramalho