Filter data with max date per ID in dataframe [duplicate]
























This question already has answers here:

  • How to select the rows with maximum values in each group with dplyr? (4 answers)
  • Add missing rows within combinations of factors (2 answers)




I have the following data frame:



sourceid dataelementid value timestamp
11726 10922 34 2016-04-03 02:05:02
11726 10923 9 2016-05-29 10:47:59
11726 10923 9 2016-05-29 03:47:59
11726 10924 19 2016-03-20 02:05:02
11726 10922 18 2016-05-29 10:47:59
12389 10922 23 2016-07-17 02:05:02
12389 10923 12 2016-04-09 02:05:02
12389 10923 3 2016-09-04 02:05:02
12389 10923 30 2016-04-03 02:05:02
12389 10924 23 2016-04-03 02:05:02
12389 10924 17 2016-05-30 02:05:02
12389 10922 15 2016-04-03 02:05:02
45012 10922 33 2016-03-03 02:05:02
45012 10924 11 2016-05-29 10:47:59


As you can see, there are two ID columns (sourceid and dataelementid). For each combination of the two, I would like to keep only the most recent row and end up with a new data frame that looks like this:



sourceid dataelementid value timestamp
11726 10922 18 2016-05-29 10:47:59
11726 10923 9 2016-05-29 10:47:59
11726 10924 19 2016-03-20 02:05:02
12389 10922 23 2016-07-17 02:05:02
12389 10923 3 2016-09-04 02:05:02
12389 10924 17 2016-05-30 02:05:02
45012 10922 33 2016-03-03 02:05:02
45012 NA
45012 10924 11 2016-05-29 10:47:59


I have searched for a solution, but the ones I found only handle data frames with a single ID column. If a solution to a similar problem already exists, it would be great if someone could point me to it.
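
Grouping on two ID columns works much like grouping on one: both columns are passed as grouping variables. Here is a minimal base R sketch of that idea, assuming a small hypothetical subset of the data above with timestamps already parsed as POSIXct. The names `df`, `group_max` and `latest` are illustrative only.

```r
# Hypothetical subset of the question's data; timestamps parsed as POSIXct.
df <- data.frame(
  sourceid      = c(11726, 11726, 12389, 12389),
  dataelementid = c(10922, 10922, 10923, 10923),
  value         = c(34, 18, 12, 3),
  timestamp     = as.POSIXct(c("2016-04-03 02:05:02", "2016-05-29 10:47:59",
                               "2016-04-09 02:05:02", "2016-09-04 02:05:02"),
                             tz = "UTC")
)

# ave() computes the maximum timestamp within each (sourceid, dataelementid)
# group and returns it aligned with the original rows.
group_max <- ave(as.numeric(df$timestamp), df$sourceid, df$dataelementid,
                 FUN = max)

# Keep only the rows whose timestamp equals their group's maximum.
latest <- df[as.numeric(df$timestamp) == group_max, ]
```

Note that this keeps every row tied for the maximum within a group, so exact duplicate timestamps would yield more than one row per ID pair.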



Thanks in advance.

























marked as duplicate by hrbrmstr, Nov 8 at 11:58


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.















  • Please provide a reproducible example: dput(head(mydata, 20))
    – zx8754
    Nov 8 at 11:48












  • It is a two-step process. First, group by the two IDs and keep the row with the max timestamp per group. Second, add the missing combinations of the two IDs, "sourceid" and "dataelementid". See linked posts.
    – zx8754
    Nov 8 at 12:06
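
The two steps described in the comment above could be sketched in base R roughly as follows. This is one possible approach rather than the method from the linked posts (which use dplyr). It assumes the sample rows from the question, and the names `raw`, `latest`, `combos` and `result` are made up for illustration.

```r
# Sample rows copied from the question. The timestamp contains a space,
# so it is read as two columns (date, time) and recombined below.
raw <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
sourceid dataelementid value date time
11726 10922 34 2016-04-03 02:05:02
11726 10923 9 2016-05-29 10:47:59
11726 10923 9 2016-05-29 03:47:59
11726 10924 19 2016-03-20 02:05:02
11726 10922 18 2016-05-29 10:47:59
12389 10922 23 2016-07-17 02:05:02
12389 10923 12 2016-04-09 02:05:02
12389 10923 3 2016-09-04 02:05:02
12389 10923 30 2016-04-03 02:05:02
12389 10924 23 2016-04-03 02:05:02
12389 10924 17 2016-05-30 02:05:02
12389 10922 15 2016-04-03 02:05:02
45012 10922 33 2016-03-03 02:05:02
45012 10924 11 2016-05-29 10:47:59
")
raw$timestamp <- as.POSIXct(paste(raw$date, raw$time), tz = "UTC")
df <- raw[c("sourceid", "dataelementid", "value", "timestamp")]

# Step 1: sort newest first, then keep the first row per ID pair.
df <- df[order(df$timestamp, decreasing = TRUE), ]
latest <- df[!duplicated(df[c("sourceid", "dataelementid")]), ]

# Step 2: build every sourceid/dataelementid combination and left-join,
# so pairs with no data appear with NA in value and timestamp.
combos <- expand.grid(sourceid = unique(df$sourceid),
                      dataelementid = unique(df$dataelementid))
result <- merge(combos, latest, all.x = TRUE)
print(result)
```

With the sample data, `result` has nine rows, one per ID pair, and the pair (45012, 10923), which has no rows in the input, is filled with NA.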















r dataframe filter max






edited Nov 8 at 12:06 by zx8754

asked Nov 8 at 11:38 by Ali Nguz
