Extracting HTML text using R - can't access some nodes
I have a large number of water take permits that are available online, and I want to extract some data from them. For example:
url <- "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1"
I don't know HTML at all, but I have been plugging away with help from Google and a friend. I can get to some of the nodes without any issues using an XPath or CSS selector; for instance, to get the title:
library(rvest)
url %>%
read_html() %>%
html_nodes(xpath = '//*[@id="main"]/div/h1') %>%
html_text()
[1] "Details for CRC000002.1"
Or using CSS selectors:
url %>%
read_html() %>%
html_nodes(css = "#main") %>%
html_nodes(css = "div") %>%
html_nodes(css = "h1") %>%
html_text()
[1] "Details for CRC000002.1"
So far, so good, but the information I actually want is buried a bit deeper and I can't seem to get to it. For instance, the client name field ("Killermont Station Limited", in this case) has this XPath:
clientxpath <- '//*[@id="main"]/div/div[1]/div/table/tbody/tr[1]/td[2]'
url %>%
read_html() %>%
html_nodes(xpath = clientxpath) %>%
html_text()
character(0)
The CSS selector gets quite convoluted, but I get the same result. The help file for html_nodes() says:
# XPath selectors ---------------------------------------------
# chaining with XPath is a little trickier - you may need to vary
# the prefix you're using - // always selects from the root node
# regardless of where you currently are in the doc
But it doesn't give me any clues on what to use as an alternative prefix in the XPath (it might be obvious if I knew HTML).
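From what I can gather, the alternative prefix the help file is hinting at is the relative './/' form; here's a quick sketch (my assumption, based on that hint) that returns the same title as before:
url %>%
  read_html() %>%
  html_nodes(xpath = '//*[@id="main"]') %>%
  html_nodes(xpath = './/h1') %>%   # './/' searches relative to #main; '//' would search the whole document again
  html_text()
But a relative prefix doesn't get me to the client name either, so the problem seems to lie elsewhere.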
My friend pointed out that some of the document is generated by JavaScript (AJAX), which may be part of the problem too. That said, the bit I'm trying to get to does show up in the HTML, but it sits within a node called div.ajax-block:
CSS selector: #main > div > div.ajax-block > div > table > tbody > tr:nth-child(1) > td:nth-child(4)
Can anyone help? Thanks!
Tags: html, r, web-scraping, rvest
First of all, is it legal for you to get data from that page?
– NelsonGon
Nov 24 '18 at 10:14
Yes, it's all public information.
– TimM
Nov 24 '18 at 10:17
It's a dynamic page; use Selenium.
– ewwink
Nov 24 '18 at 10:22
How would you go about extracting the data in RSelenium? I had a quick look and it seems like it's pretty involved!
– TimM
Nov 24 '18 at 11:16
Please see my answer. This "use selenium" craze is just crazy.
– hrbrmstr
Nov 24 '18 at 17:38
1 Answer
It's super disconcerting that most, if not all, SO R contributors default to "use a heavyweight third-party dependency" in curt "answers" when it comes to scraping. 99% of the time you don't need Selenium. You just need to exercise the little gray cells.
First, a big clue that the page loads content asynchronously is the wait-spinner that appears. The second is in your snippet, where the div has ajax as part of its selector name. Those are tell-tale signs that XHR requests are in play.
If you open Developer Tools in your browser, reload the page, and go to the Network tab's XHR section, you'll see that most of the "real" data on the page is loaded dynamically. We can write httr calls that mimic the browser calls.
However…
We first need to make one GET call to the main page to prime some cookies (which will be carried over for us) and then find a pre-generated session token that's used to prevent abuse of the site. It's defined using JavaScript, so we'll use the V8 package to evaluate it. We could have just used regular expressions to find the string; do whatever you like.
library(httr)
library(rvest)
library(dplyr)
library(V8)
ctx <- v8() # we need this to eval some javascript
# Prime Cookies -----------------------------------------------------------
res <- httr::GET("https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1")
httr::cookies(res)
## domain flag path secure expiration name
## 1 .ecan.govt.nz TRUE / FALSE 2019-11-24 11:46:13 visid_incap_927063
## 2 .ecan.govt.nz TRUE / FALSE <NA> incap_ses_148_927063
## value
## 1 +p8XAM6uReGmEnVIdnaxoxWL+VsAAAAAQUIPAAAAAABjdOjQDbXt7PG3tpBpELha
## 2 nXJSYz8zbCRj8tGhzNANAhaL+VsAAAAA7JyOH7Gu4qeIb6KKk/iSYQ==
pg <- httr::content(res)
html_node(pg, xpath=".//script[contains(., '_monsido')]") %>%
html_text() %>%
ctx$eval()
## [1] "2"
monsido_token <- ctx$get("_monsido")[1,2]
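(If you'd rather skip V8, a regex might look something like this. It's only a sketch and the pattern is a guess; the [1,2] indexing above suggests the token value sits next to a "token" label, but the script's actual contents dictate the real pattern.)
# Hypothetical regex alternative to the V8 evaluation above
script_txt <- html_node(pg, xpath = ".//script[contains(., '_monsido')]") %>%
  html_text()
m <- regmatches(script_txt, regexpr('"token"\\s*,\\s*"[^"]+"', script_txt))
monsido_token_rx <- gsub('.*"([^"]+)"$', "\\1", m)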
Here's the searchlist call (which is, indeed, empty):
httr::VERB(
verb = "POST", url = "https://www.ecan.govt.nz/data/document-library/searchlist",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
`X-Requested-With` = "XMLHttpRequest",
TE = "Trailers"
), httr::set_cookies(
monsido = monsido_token
),
body = list(
name = "CRC000002.1",
pageSize = "999999"
),
encode = "form"
) -> res
httr::content(res)
## NULL ## <<=== this is OK as there is no response
Here's the "Consent Overview" section:
httr::GET(
url = "https://www.ecan.govt.nz/data/consent-search/consentoverview/CRC000002.1",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
Authority = "www.ecan.govt.nz",
`X-Requested-With` = "XMLHttpRequest"
),
httr::set_cookies(
monsido = monsido_token
)
) -> res
httr::content(res) %>%
html_table() %>%
glimpse()
## List of 1
## $ :'data.frame': 5 obs. of 4 variables:
## ..$ X1: chr [1:5] "RMA Authorisation Number" "Consent Location" "To" "Commencement Date" ...
## ..$ X2: chr [1:5] "CRC000002.1" "Manuka Creek, KILLERMONT STATION" "To take water from Manuka Creek at or about map reference NZMS 260 H39:5588-2366 for irrigation of up to 40.8 hectares." "29 Apr 2010" ...
## ..$ X3: chr [1:5] "Client Name" "State" "To take water from Manuka Creek at or about map reference NZMS 260 H39:5588-2366 for irrigation of up to 40.8 hectares." "29 Apr 2010" ...
## ..$ X4: chr [1:5] "Killermont Station Limited" "Issued - Active" "To take water from Manuka Creek at or about map reference NZMS 260 H39:5588-2366 for irrigation of up to 40.8 hectares." "29 Apr 2010" ...
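(A quick follow-on sketch: the glimpse() output shows the overview table interleaves labels and values in column pairs X1/X2 and X3/X4, so stacking those pairs gives a tidy two-column lookup for fields like the client name.)
ov <- httr::content(res) %>% html_table() %>% .[[1]]
ov_tidy <- rbind(
  setNames(ov[, 1:2], c("field", "value")),   # first label/value pair
  setNames(ov[, 3:4], c("field", "value"))    # second label/value pair
)
ov_tidy$value[ov_tidy$field == "Client Name"]
## [1] "Killermont Station Limited"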
Here are the "Consent Conditions":
httr::GET(
url = "https://www.ecan.govt.nz/data/consent-search/consentconditions/CRC000002.1",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
Authority = "www.ecan.govt.nz",
`X-Requested-With` = "XMLHttpRequest"
),
httr::set_cookies(
monsido = monsido_token
)
) -> res
httr::content(res) %>%
as.character() %>%
substring(1, 300) %>%
cat()
## <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
## <html><body><div class="consentDetails">
## <ul class="unstyled-list">
## <li>
##
##
## <strong class="pull-left">1</strong> <div class="pad-left1">The rate at which wa
Here's the "Consent Related":
httr::GET(
url = "https://www.ecan.govt.nz/data/consent-search/consentrelated/CRC000002.1",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
Authority = "www.ecan.govt.nz",
`X-Requested-With` = "XMLHttpRequest"
),
httr::set_cookies(
monsido = monsido_token
)
) -> res
httr::content(res) %>%
as.character() %>%
substring(1, 300) %>%
cat()
## <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
## <html><body>
## <p>There are no related documents.</p>
##
##
##
##
##
## <div class="summary-table-wrapper">
## <table class="summary-table left">
## <thead><tr>
## <th>Relationship</th>
## <th>Recor
Here's the "Workflow:
httr::GET(
url = "https://www.ecan.govt.nz/data/consent-search/consentworkflow/CRC000002.1",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
Authority = "www.ecan.govt.nz",
`X-Requested-With` = "XMLHttpRequest"
),
httr::set_cookies(
monsido = monsido_token
)
) -> res
httr::content(res)
## {xml_document}
## <html>
## [1] <body><p>No workflow</p></body>
Here are the "Consent Flow Restrictions":
httr::GET(
url = "https://www.ecan.govt.nz/data/consent-search/consentflowrestrictions/CRC000002.1",
httr::add_headers(
Referer = "https://www.ecan.govt.nz/data/consent-search/consentdetails/CRC000002.1",
Authority = "www.ecan.govt.nz",
`X-Requested-With` = "XMLHttpRequest"
),
httr::set_cookies(
monsido = monsido_token
)
) -> res
httr::content(res) %>%
as.character() %>%
substring(1, 300) %>%
cat()
## <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
## <html><body><div class="summary-table-wrapper">
## <table class="summary-table left">
## <thead>
## <th colspan="2">Low Flow Site</th>
## <th>Todays Flow <span class="lower">(m3/s)</span>
## </th>
You still need to parse HTML, but now you can do it all with plain R packages.
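Since the five section calls differ only in their endpoint, they could also be collapsed into a small helper. A minimal sketch (the function and parameter names are just illustrative):
# Wrap the repeated GET pattern from the calls above
get_consent_section <- function(section, consent, token) {
  httr::GET(
    url = sprintf("https://www.ecan.govt.nz/data/consent-search/%s/%s",
                  section, consent),
    httr::add_headers(
      Referer = sprintf("https://www.ecan.govt.nz/data/consent-search/consentdetails/%s",
                        consent),
      Authority = "www.ecan.govt.nz",
      `X-Requested-With` = "XMLHttpRequest"
    ),
    httr::set_cookies(monsido = token)
  )
}
# e.g. the "Consent Overview" call becomes
res <- get_consent_section("consentoverview", "CRC000002.1", monsido_token)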
Thanks, that's brilliant! Works perfectly, and now I'm on to banging my head against a wall with the text pattern matching. Can you explain to me briefly how to select the arguments for GET? They work perfectly in this case, but I don't think I could replicate it, and the help file in R is a little opaque.
– TimM
Nov 24 '18 at 21:54
If this isn't time-sensitive, lemme put this into GitHub by tomorrow; I'll drop a link here and we can iron it out in GitHub issues.
– hrbrmstr
Nov 24 '18 at 22:11
That would be amazing, thanks!
– TimM
Nov 24 '18 at 22:13
Cool. If you could put your initial comment to this answer as an issue in github.com/hrbrmstr/nz-ecan, I'll drop a note on the morrow. Tonight is the last night with my college freshman son home for the Thanksgiving break, so I'll be able to crank on this tomorrow with zeal.
– hrbrmstr
Nov 24 '18 at 22:17