SQL Server Ignore row if duplicate exists unless id exists
up vote
1
down vote
favorite
I ran the follow SQL Query and got the following rows as an output:
SELECT EventTime
,UserID
,SubAddr
FROM tablename
WHERE EventType = 20 AND
(SubAddr = 1 OR SubAddr = 2)
EventTime UserID SubAddr
2018-05-04 16:47:56.000 113 1
2018-05-04 16:48:45.000 113 2
2018-05-04 16:49:17.000 113 1
2018-05-04 16:49:27.000 113 2
2018-05-04 16:49:48.000 113 1
2018-05-04 16:49:57.000 113 2
2018-05-04 16:50:15.000 113 1
2018-05-04 16:51:01.000 113 2
2018-05-04 16:51:23.000 113 1
2018-05-04 16:51:33.000 113 2
2018-05-07 15:42:13.000 114 1
2018-05-07 15:42:16.000 114 1
2018-05-07 15:42:26.000 114 1
2018-05-07 15:42:35.000 114 2
2018-05-07 15:42:43.000 114 2
2018-05-07 15:42:54.000 114 1
2018-05-07 15:43:02.000 114 1
2018-05-07 15:43:11.000 114 2
2018-05-07 15:43:20.000 114 2
2018-05-07 15:43:35.000 114 1
2018-05-07 15:43:42.000 114 1
2018-05-07 15:43:51.000 114 2
2018-05-07 15:43:58.000 114 2
I want to change the query in such a way that it only gets the rows where it get the first eventtime for every user for both subaddr 1 and 2. If there is another row where subaddr is 1 after a row where is the same userID and subaddr = 2 I want it to retrieve that row too.
See the following example
EventTime UserID SubAddr
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:16.000 114 1 <- Not this row
2018-05-07 15:42:26.000 114 1 <- Not this row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:43.000 114 2 <- Not this row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:20.000 114 2 <- Not this row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:42.000 114 1 <- Not this row
2018-05-07 15:43:51.000 114 2 <- This row
2018-05-07 15:43:58.000 114 2 <- Not this row
I hope this example make it clearer. I have been struggling with this all day and haven't been able to find an answer
sql sql-server
New contributor
add a comment |
up vote
1
down vote
favorite
I ran the follow SQL Query and got the following rows as an output:
SELECT EventTime
,UserID
,SubAddr
FROM tablename
WHERE EventType = 20 AND
(SubAddr = 1 OR SubAddr = 2)
EventTime UserID SubAddr
2018-05-04 16:47:56.000 113 1
2018-05-04 16:48:45.000 113 2
2018-05-04 16:49:17.000 113 1
2018-05-04 16:49:27.000 113 2
2018-05-04 16:49:48.000 113 1
2018-05-04 16:49:57.000 113 2
2018-05-04 16:50:15.000 113 1
2018-05-04 16:51:01.000 113 2
2018-05-04 16:51:23.000 113 1
2018-05-04 16:51:33.000 113 2
2018-05-07 15:42:13.000 114 1
2018-05-07 15:42:16.000 114 1
2018-05-07 15:42:26.000 114 1
2018-05-07 15:42:35.000 114 2
2018-05-07 15:42:43.000 114 2
2018-05-07 15:42:54.000 114 1
2018-05-07 15:43:02.000 114 1
2018-05-07 15:43:11.000 114 2
2018-05-07 15:43:20.000 114 2
2018-05-07 15:43:35.000 114 1
2018-05-07 15:43:42.000 114 1
2018-05-07 15:43:51.000 114 2
2018-05-07 15:43:58.000 114 2
I want to change the query in such a way that it only gets the rows where it get the first eventtime for every user for both subaddr 1 and 2. If there is another row where subaddr is 1 after a row where is the same userID and subaddr = 2 I want it to retrieve that row too.
See the following example
EventTime UserID SubAddr
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:16.000 114 1 <- Not this row
2018-05-07 15:42:26.000 114 1 <- Not this row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:43.000 114 2 <- Not this row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:20.000 114 2 <- Not this row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:42.000 114 1 <- Not this row
2018-05-07 15:43:51.000 114 2 <- This row
2018-05-07 15:43:58.000 114 2 <- Not this row
I hope this example make it clearer. I have been struggling with this all day and haven't been able to find an answer
sql sql-server
New contributor
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I ran the follow SQL Query and got the following rows as an output:
SELECT EventTime
,UserID
,SubAddr
FROM tablename
WHERE EventType = 20 AND
(SubAddr = 1 OR SubAddr = 2)
EventTime UserID SubAddr
2018-05-04 16:47:56.000 113 1
2018-05-04 16:48:45.000 113 2
2018-05-04 16:49:17.000 113 1
2018-05-04 16:49:27.000 113 2
2018-05-04 16:49:48.000 113 1
2018-05-04 16:49:57.000 113 2
2018-05-04 16:50:15.000 113 1
2018-05-04 16:51:01.000 113 2
2018-05-04 16:51:23.000 113 1
2018-05-04 16:51:33.000 113 2
2018-05-07 15:42:13.000 114 1
2018-05-07 15:42:16.000 114 1
2018-05-07 15:42:26.000 114 1
2018-05-07 15:42:35.000 114 2
2018-05-07 15:42:43.000 114 2
2018-05-07 15:42:54.000 114 1
2018-05-07 15:43:02.000 114 1
2018-05-07 15:43:11.000 114 2
2018-05-07 15:43:20.000 114 2
2018-05-07 15:43:35.000 114 1
2018-05-07 15:43:42.000 114 1
2018-05-07 15:43:51.000 114 2
2018-05-07 15:43:58.000 114 2
I want to change the query in such a way that it only gets the rows where it get the first eventtime for every user for both subaddr 1 and 2. If there is another row where subaddr is 1 after a row where is the same userID and subaddr = 2 I want it to retrieve that row too.
See the following example
EventTime UserID SubAddr
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:16.000 114 1 <- Not this row
2018-05-07 15:42:26.000 114 1 <- Not this row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:43.000 114 2 <- Not this row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:20.000 114 2 <- Not this row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:42.000 114 1 <- Not this row
2018-05-07 15:43:51.000 114 2 <- This row
2018-05-07 15:43:58.000 114 2 <- Not this row
I hope this example make it clearer. I have been struggling with this all day and haven't been able to find an answer
sql sql-server
New contributor
I ran the follow SQL Query and got the following rows as an output:
SELECT EventTime
,UserID
,SubAddr
FROM tablename
WHERE EventType = 20 AND
(SubAddr = 1 OR SubAddr = 2)
EventTime UserID SubAddr
2018-05-04 16:47:56.000 113 1
2018-05-04 16:48:45.000 113 2
2018-05-04 16:49:17.000 113 1
2018-05-04 16:49:27.000 113 2
2018-05-04 16:49:48.000 113 1
2018-05-04 16:49:57.000 113 2
2018-05-04 16:50:15.000 113 1
2018-05-04 16:51:01.000 113 2
2018-05-04 16:51:23.000 113 1
2018-05-04 16:51:33.000 113 2
2018-05-07 15:42:13.000 114 1
2018-05-07 15:42:16.000 114 1
2018-05-07 15:42:26.000 114 1
2018-05-07 15:42:35.000 114 2
2018-05-07 15:42:43.000 114 2
2018-05-07 15:42:54.000 114 1
2018-05-07 15:43:02.000 114 1
2018-05-07 15:43:11.000 114 2
2018-05-07 15:43:20.000 114 2
2018-05-07 15:43:35.000 114 1
2018-05-07 15:43:42.000 114 1
2018-05-07 15:43:51.000 114 2
2018-05-07 15:43:58.000 114 2
I want to change the query in such a way that it only gets the rows where it get the first eventtime for every user for both subaddr 1 and 2. If there is another row where subaddr is 1 after a row where is the same userID and subaddr = 2 I want it to retrieve that row too.
See the following example
EventTime UserID SubAddr
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:16.000 114 1 <- Not this row
2018-05-07 15:42:26.000 114 1 <- Not this row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:43.000 114 2 <- Not this row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:20.000 114 2 <- Not this row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:42.000 114 1 <- Not this row
2018-05-07 15:43:51.000 114 2 <- This row
2018-05-07 15:43:58.000 114 2 <- Not this row
I hope this example make it clearer. I have been struggling with this all day and haven't been able to find an answer
sql sql-server
sql sql-server
New contributor
New contributor
edited Nov 5 at 18:10
PM 77-1
8,732134584
8,732134584
New contributor
asked Nov 5 at 18:05
Laurens
45
45
New contributor
New contributor
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51
add a comment |
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51
add a comment |
2 Answers
2
active
oldest
votes
up vote
0
down vote
Lee Mac solution don't working for me, because you will take in this table just
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
Because you selecting a min from grouping operator of user and subAddr.
I have simple solution (but you must check for you data integrity if working)
cut the data in this way:
2018-05-07 15:42
and Use this query
SELECT distinct(EventTime, UserID, SubAddr)
FROM tablename
WHERE EventType = 20 AND (SubAddr = 1 OR SubAddr = 2)
in this way you will take this data:
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row (this will capture)
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:51.000 114 2 <- This row
This is fast way, otherwise you need make a nested query, that will be more complex.
But i repeat, this is not solution to your problem, just a short way.
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
add a comment |
up vote
0
down vote
Use lag()
with t as (
select t.*, lag(subaddr) over (partition by userid order by eventtime) as prev_subaddr
from tablename t
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
EDIT:
If you don't have lag()
, you can use apply
:
with t as (
select t.*, tprev.subaddr as prev_subaddr
from tablename t outer apply
(select top (1) t2.*
from tablename t2
where t2.userid = t.userid and
t2.EventType = 20 and
t2.SubAddr in (1, 2) and
t2.eventtime < t.eventtime
order by eventtime desc
) tprev
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….
– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
|
show 2 more comments
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
Lee Mac solution don't working for me, because you will take in this table just
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
Because you selecting a min from grouping operator of user and subAddr.
I have simple solution (but you must check for you data integrity if working)
cut the data in this way:
2018-05-07 15:42
and Use this query
SELECT distinct(EventTime, UserID, SubAddr)
FROM tablename
WHERE EventType = 20 AND (SubAddr = 1 OR SubAddr = 2)
in this way you will take this data:
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row (this will capture)
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:51.000 114 2 <- This row
This is fast way, otherwise you need make a nested query, that will be more complex.
But i repeat, this is not solution to your problem, just a short way.
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
add a comment |
up vote
0
down vote
Lee Mac solution don't working for me, because you will take in this table just
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
Because you selecting a min from grouping operator of user and subAddr.
I have simple solution (but you must check for you data integrity if working)
cut the data in this way:
2018-05-07 15:42
and Use this query
SELECT distinct(EventTime, UserID, SubAddr)
FROM tablename
WHERE EventType = 20 AND (SubAddr = 1 OR SubAddr = 2)
in this way you will take this data:
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row (this will capture)
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:51.000 114 2 <- This row
This is fast way, otherwise you need make a nested query, that will be more complex.
But i repeat, this is not solution to your problem, just a short way.
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
add a comment |
up vote
0
down vote
up vote
0
down vote
Lee Mac solution don't working for me, because you will take in this table just
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
Because you selecting a min from grouping operator of user and subAddr.
I have simple solution (but you must check for you data integrity if working)
cut the data in this way:
2018-05-07 15:42
and Use this query
SELECT distinct(EventTime, UserID, SubAddr)
FROM tablename
WHERE EventType = 20 AND (SubAddr = 1 OR SubAddr = 2)
in this way you will take this data:
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row (this will capture)
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:51.000 114 2 <- This row
This is fast way, otherwise you need make a nested query, that will be more complex.
But i repeat, this is not solution to your problem, just a short way.
Lee Mac solution don't working for me, because you will take in this table just
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
Because you selecting a min from grouping operator of user and subAddr.
I have simple solution (but you must check for you data integrity if working)
cut the data in this way:
2018-05-07 15:42
and Use this query
SELECT distinct(EventTime, UserID, SubAddr)
FROM tablename
WHERE EventType = 20 AND (SubAddr = 1 OR SubAddr = 2)
in this way you will take this data:
2018-05-07 15:42:13.000 114 1 <- This row
2018-05-07 15:42:35.000 114 2 <- This row
2018-05-07 15:42:54.000 114 1 <- This row
2018-05-07 15:43:02.000 114 1 <- Not this row (this will capture)
2018-05-07 15:43:11.000 114 2 <- This row
2018-05-07 15:43:35.000 114 1 <- This row
2018-05-07 15:43:51.000 114 2 <- This row
This is fast way, otherwise you need make a nested query, that will be more complex.
But i repeat, this is not solution to your problem, just a short way.
edited Nov 5 at 18:56
answered Nov 5 at 18:41
Antonio Luciano
286
286
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
add a comment |
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
I don't think this will work; it doesn't take the logical ordering into consideration being requested. I don't see any duplicated EventTime values in the sample output.
– Eric J. Price
Nov 5 at 18:45
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
You are right @EricJ.Price i edit the answer. I thinking to distinct attribute. But otherwhise have the same issues from previous post.
– Antonio Luciano
Nov 5 at 18:57
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
The SQL Server doesn' t accept multiple columns in DISTINCT
– Laurens
Nov 6 at 15:05
add a comment |
up vote
0
down vote
Use lag()
with t as (
select t.*, lag(subaddr) over (partition by userid order by eventtime) as prev_subaddr
from tablename t
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
EDIT:
If you don't have lag()
, you can use apply
:
with t as (
select t.*, tprev.subaddr as prev_subaddr
from tablename t outer apply
(select top (1) t2.*
from tablename t2
where t2.userid = t.userid and
t2.EventType = 20 and
t2.SubAddr in (1, 2) and
t2.eventtime < t.eventtime
order by eventtime desc
) tprev
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….
– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
|
show 2 more comments
up vote
0
down vote
Use lag()
with t as (
select t.*, lag(subaddr) over (partition by userid order by eventtime) as prev_subaddr
from tablename t
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
EDIT:
If you don't have lag()
, you can use apply
:
with t as (
select t.*, tprev.subaddr as prev_subaddr
from tablename t outer apply
(select top (1) t2.*
from tablename t2
where t2.userid = t.userid and
t2.EventType = 20 and
t2.SubAddr in (1, 2) and
t2.eventtime < t.eventtime
order by eventtime desc
) tprev
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….
– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
|
show 2 more comments
up vote
0
down vote
up vote
0
down vote
Use lag()
with t as (
select t.*, lag(subaddr) over (partition by userid order by eventtime) as prev_subaddr
from tablename t
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
EDIT:
If you don't have lag()
, you can use apply
:
with t as (
select t.*, tprev.subaddr as prev_subaddr
from tablename t outer apply
(select top (1) t2.*
from tablename t2
where t2.userid = t.userid and
t2.EventType = 20 and
t2.SubAddr in (1, 2) and
t2.eventtime < t.eventtime
order by eventtime desc
) tprev
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
Use lag()
with t as (
select t.*, lag(subaddr) over (partition by userid order by eventtime) as prev_subaddr
from tablename t
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
EDIT:
If you don't have lag()
, you can use apply
:
with t as (
select t.*, tprev.subaddr as prev_subaddr
from tablename t outer apply
(select top (1) t2.*
from tablename t2
where t2.userid = t.userid and
t2.EventType = 20 and
t2.SubAddr in (1, 2) and
t2.eventtime < t.eventtime
order by eventtime desc
) tprev
where EventType = 20 and SubAddr in (1, 2)
)
select EventTime, UserID, SubAddr
from t
where prev_subaddr is null or prev_subaddr <> subaddr;
edited Nov 7 at 14:18
answered Nov 5 at 20:14
Gordon Linoff
741k32285389
741k32285389
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….
– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
|
show 2 more comments
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….
– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
lag isn't recognized by SQL server
– Laurens
Nov 6 at 15:04
@Laurens . . .
lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….– Gordon Linoff
Nov 6 at 15:18
@Laurens . . .
lag()
has been in SQL Server for six years. docs.microsoft.com/en-us/sql/t-sql/functions/….– Gordon Linoff
Nov 6 at 15:18
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
i'm working on SQL Server 2008. At this moment there is no way for me to upgrade.
– Laurens
Nov 6 at 15:22
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
I get the following error after your edit: Msg 1033, Level 15, State 1, Line 11 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
– Laurens
Nov 7 at 7:08
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
After removing the line that is causing the error I get an output. It gets the first and second row right. but after that it gets every row the same and has dulplicates of every row
– Laurens
Nov 7 at 7:57
|
show 2 more comments
Laurens is a new contributor. Be nice, and check out our Code of Conduct.
Laurens is a new contributor. Be nice, and check out our Code of Conduct.
Laurens is a new contributor. Be nice, and check out our Code of Conduct.
Laurens is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53159839%2fsql-server-ignore-row-if-duplicate-exists-unless-id-exists%23new-answer', 'question_page');
}
);
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
What version of SQL Server? Do you have window functions available?
– Eric J. Price
Nov 5 at 18:43
If you can use window functions you could create a CTE adding a column with ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventTime) and then self join to compare successive record relationships. You could also do something using LAG and LEAD as well, but I don't have test data and someone else will probably provide this answer in a minute or two. :p
– Eric J. Price
Nov 5 at 18:51