Filter data with max date per ID in dataframe [duplicate]

This question already has an answer here:

How to select the rows with maximum values in each group with dplyr?

4 answers

Add missing rows within combinations of factors

2 answers

I have the following data frame:

sourceid dataelementid value timestamp
11726    10922         34    2016-04-03 02:05:02
11726    10923         9     2016-05-29 10:47:59
11726    10923         9     2016-05-29 03:47:59
11726    10924         19    2016-03-20 02:05:02
11726    10922         18    2016-05-29 10:47:59
12389    10922         23    2016-07-17 02:05:02
12389    10923         12    2016-04-09 02:05:02
12389    10923         3     2016-09-04 02:05:02
12389    10923         30    2016-04-03 02:05:02
12389    10924         23    2016-04-03 02:05:02
12389    10924         17    2016-05-30 02:05:02
12389    10922         15    2016-04-03 02:05:02
45012    10922         33    2016-03-03 02:05:02
45012    10924         11    2016-05-29 10:47:59

As you can see, there are two ID columns (sourceid and dataelementid). I would like to keep only the most recent row for each combination of the two, and end up with a new data frame that looks like this:

sourceid dataelementid value timestamp
11726    10922         18    2016-05-29 10:47:59
11726    10923         9     2016-05-29 10:47:59
11726    10924         19    2016-03-20 02:05:02
12389    10922         23    2016-07-17 02:05:02
12389    10923         3     2016-09-04 02:05:02
12389    10924         17    2016-05-30 02:05:02
45012    10922         33    2016-03-03 02:05:02
45012    10923         NA    NA
45012    10924         11    2016-05-29 10:47:59

I have searched for a solution, but the ones I found only handle data frames with a single ID column. If a solution to a similar problem already exists, I would appreciate being pointed to it.

Thanks in advance.

edited Nov 8 at 12:06 by zx8754

asked Nov 8 at 11:38 by Ali Nguz

marked as duplicate by hrbrmstr r
Nov 8 at 11:58

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

Please provide a reproducible example: dput(head(mydata, 20))
– zx8754
Nov 8 at 11:48

It is a two-step process. First, group by both IDs and keep the row with the max timestamp in each group. Second, add the missing combinations of the two IDs, "sourceid" and "dataelementid". See linked posts.
– zx8754
Nov 8 at 12:06
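
For reference, the two steps described in the comment above could look roughly like the sketch below. It uses base R only (sort, `duplicated()`, `expand.grid()`, `merge()`) rather than the dplyr approach from the linked posts, and the names `df`, `latest`, `all_pairs` and `result` are just placeholders chosen for this example. It also assumes the timestamps can be parsed as UTC, which the question does not state.

```r
# Rebuild the example data from the question (timestamps quoted so the
# space between date and time stays inside one field).
df <- read.table(text = "
sourceid dataelementid value timestamp
11726 10922 34 '2016-04-03 02:05:02'
11726 10923  9 '2016-05-29 10:47:59'
11726 10923  9 '2016-05-29 03:47:59'
11726 10924 19 '2016-03-20 02:05:02'
11726 10922 18 '2016-05-29 10:47:59'
12389 10922 23 '2016-07-17 02:05:02'
12389 10923 12 '2016-04-09 02:05:02'
12389 10923  3 '2016-09-04 02:05:02'
12389 10923 30 '2016-04-03 02:05:02'
12389 10924 23 '2016-04-03 02:05:02'
12389 10924 17 '2016-05-30 02:05:02'
12389 10922 15 '2016-04-03 02:05:02'
45012 10922 33 '2016-03-03 02:05:02'
45012 10924 11 '2016-05-29 10:47:59'
", header = TRUE, stringsAsFactors = FALSE)

# Assumption: the time zone is not given in the question, so UTC is used.
df$timestamp <- as.POSIXct(df$timestamp, tz = "UTC")

# Step 1: sort newest first, then keep the first row seen for each
# (sourceid, dataelementid) pair, i.e. the row with the max timestamp.
df <- df[order(df$timestamp, decreasing = TRUE), ]
latest <- df[!duplicated(df[c("sourceid", "dataelementid")]), ]

# Step 2: build every combination of the two IDs and left-join the
# latest rows onto it, so missing pairs show up with NA.
all_pairs <- expand.grid(
  sourceid = unique(df$sourceid),
  dataelementid = unique(df$dataelementid)
)
result <- merge(all_pairs, latest, all.x = TRUE)
result <- result[order(result$sourceid, result$dataelementid), ]
rownames(result) <- NULL

print(result)
```

With the sample data this should give nine rows, one per ID pair, with `NA` in `value` and `timestamp` for the pair 45012 / 10923 that never occurs in the input. If two rows of the same pair shared the exact same latest timestamp, this sketch would keep whichever one sorts first, so a tie-breaking rule would need to be added for that case.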

r dataframe filter max
