R dplyr: join within pipe



























I am totally new to dplyr and am trying to use dplyr to do the following:



I have the dataframe 'tdata' and want to fill omitted periods (prd) with NA within each group, so that I get the dataframe 'result'. Speed matters to me, so I hope there is a way to do it in dplyr that is faster than a for loop.



> tdata <- data.frame(group = c(10, 10, 10, 11, 11), prd = c(1, 2, 5, 3, 5), value = c(2, 7, 3, 6, 2))
> tdata
  group prd value
1    10   1     2
2    10   2     7
3    10   5     3
4    11   3     6
5    11   5     2
> result <- data.frame(group = c(10, 10, 10, 10, 10, 11, 11, 11), prd = c(1, 2, 3, 4, 5, 3, 4, 5), value = c(2, 7, NA, NA, 3, 6, NA, 2))
> result
  group prd value
1    10   1     2
2    10   2     7
3    10   3    NA
4    10   4    NA
5    10   5     3
6    11   3     6
7    11   4    NA
8    11   5     2


I tried to use pipes and got this error:



> fdata <- tdata %>%
+ group_by(group) %>%
+ arrange(prd) %>%
+ left_join(data.frame(prd_v=min(prd):max(prd)), ., by=c("prd_v" = "prd"))
Error in data.frame(prd_v = min(prd):max(prd)) : object 'prd' not found
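The error most likely arises because data.frame(prd_v = min(prd):max(prd)) is an ordinary argument to left_join(): it is evaluated in the calling environment, where no object named prd exists, rather than inside the piped data. Joins also do not run once per group. A minimal sketch of a join-based version, assuming the per-group period grid is built first (period_grid is a hypothetical helper name, not from the original post):

```r
library(dplyr)
library(tidyr)

tdata <- data.frame(group = c(10, 10, 10, 11, 11),
                    prd   = c(1, 2, 5, 3, 5),
                    value = c(2, 7, 3, 6, 2))

# Build one row per (group, prd) for every period between the
# group's smallest and largest observed period.
period_grid <- tdata %>%
  group_by(group) %>%
  summarise(prd = list(seq(min(prd), max(prd), by = 1))) %>%
  unnest(prd)

# Join the original values onto the full grid; periods that were
# missing in tdata get NA in the value column.
fdata <- period_grid %>%
  left_join(tdata, by = c("group", "prd"))
```

Both data frames are joined on group and prd, so no renaming to prd_v is needed in this sketch.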


UPDATE:
Additionally, I want to use this pipe inside a larger function, so I would like to have



period_variable <- "prd"


and then



tdata2 <- ndata %>%
group_by(group) %>%
complete(period_variable = full_seq(period_variable), period = 1) %>%
ungroup()
tdata2


But it does not work. I tried to play with get(), parse(), eval(), as.name(), as.symbol(), UQ(), !!, sym() but it still does not work.

















      r dplyr














      edited Nov 18 '18 at 2:14
      Moysey Abramowitz

      asked Nov 18 '18 at 0:58
      Moysey Abramowitz
























          2 Answers































          We can use the complete function from the tidyr package.



          library(dplyr)
          library(tidyr)

          tdata2 <- tdata %>%
            group_by(group) %>%
            complete(prd = full_seq(prd, period = 1)) %>%
            ungroup()
          tdata2
          # # A tibble: 8 x 3
          #   group   prd value
          #   <dbl> <dbl> <dbl>
          # 1    10     1     2
          # 2    10     2     7
          # 3    10     3    NA
          # 4    10     4    NA
          # 5    10     5     3
          # 6    11     3     6
          # 7    11     4    NA
          # 8    11     5     2
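To make the two pieces of this approach more concrete, here is a small sketch of what full_seq() returns on its own, together with the fill argument of complete(), which can replace the NA in the newly created rows. The choice of 0 as a fill value and the names periods_10 and tdata_zero are only illustrative assumptions, not part of the original answer:

```r
library(dplyr)
library(tidyr)

tdata <- data.frame(group = c(10, 10, 10, 11, 11),
                    prd   = c(1, 2, 5, 3, 5),
                    value = c(2, 7, 3, 6, 2))

# full_seq() returns every step of size `period` between the
# smallest and largest value of its input.
periods_10 <- full_seq(c(1, 2, 5), period = 1)

# Because the data are grouped, full_seq(prd) is computed per group.
# fill = list(value = 0) puts 0 instead of NA into the added rows.
tdata_zero <- tdata %>%
  group_by(group) %>%
  complete(prd = full_seq(prd, period = 1), fill = list(value = 0)) %>%
  ungroup()
```

Dropping the fill argument gives the NA-filled result shown in the answer.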
















          answered Nov 18 '18 at 1:26









          www













          • Thanks! Do you know how to do it when period_variable is defined outside the piped code? I updated the question.

            – Moysey Abramowitz
            Nov 18 '18 at 1:52











          • @MoyseyAbramowitz I don't know. It is related to programming with dplyr (tidy evaluation). Please see dplyr.tidyverse.org/articles/programming.html

            – www
            Nov 18 '18 at 2:26


































          As for the second question, I don't know if this is what you want, but I would do something like this:



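          library(dplyr)  # provides %>%

          # Keep the column name as a symbol and unquote it with !! so that
          # full_seq() sees the prd column of each group, not a global object.
          period_variable <- quote(prd)

          tdata2 <- tdata %>%
            dplyr::group_by(group) %>%
            tidyr::complete(prd = tidyr::full_seq(!!period_variable, period = 1)) %>%
            dplyr::ungroup()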
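If the column name arrives as a string, as in the question's period_variable <- "prd", one possible approach is to convert it to a symbol with rlang::sym() and unquote it with !!, using := so that the output column keeps that name. The following is a sketch under those assumptions; fill_periods and its group_variable argument are hypothetical names introduced only for illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: fills missing periods within each group when
# the column names are supplied as strings.
fill_periods <- function(data, period_variable, group_variable = "group") {
  period_sym <- rlang::sym(period_variable)
  group_sym  <- rlang::sym(group_variable)

  data %>%
    group_by(!!group_sym) %>%
    # `!!name := value` names the completed column after the string,
    # while !!period_sym refers to that column inside each group.
    complete(!!period_variable := full_seq(!!period_sym, period = 1)) %>%
    ungroup()
}

tdata <- data.frame(group = c(10, 10, 10, 11, 11),
                    prd   = c(1, 2, 5, 3, 5),
                    value = c(2, 7, 3, 6, 2))

period_variable <- "prd"
tdata2 <- fill_periods(tdata, period_variable)
```

Note also that in the question's attempt period = 1 sits outside full_seq(), so it would be treated as an extra column to complete rather than as the step size.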














          edited Nov 19 '18 at 12:42

























          answered Nov 18 '18 at 4:54









          José













          • This code produces an error: Error in eval(period_variable) : object 'prd' not found.

            – Moysey Abramowitz
            Nov 18 '18 at 14:41











          • You mentioned that period_variable was defined outside the piped code (I edited my response), so I understood that prd was created outside the data frame. If that is not what you wanted, please be more explicit.

            – José
            Nov 19 '18 at 12:47




















