Pandas - for loop and find index that has the nearest value
I am trying to loop through a grouped dataframe and, within each group, find the index whose value is nearest to an input argument.
For example, given the dataframe below, for every group defined by global_id I want to keep frames that are spaced at least 10 frames apart. For instance, if I have a list of frames [1,2,3,4,14,20,30,31], the output would be [1,14,30] because:
- I would initialize by taking frame 1 as the first frame
- The next frame that is at least 10 frames apart would be frame number 14
- The following frame that is at least 10 frames apart from 14 is 30
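The three steps above amount to a greedy scan, which can be sketched in plain Python (a minimal sketch, independent of pandas; filter_spaced is a hypothetical helper name):

```python
def filter_spaced(frames, min_gap=10):
    """Greedily keep frames at least `min_gap` apart, starting from the first."""
    if not frames:
        return []
    kept = [frames[0]]
    for f in frames[1:]:
        if f - kept[-1] >= min_gap:  # reading "at least 10 apart" as >= 10
            kept.append(f)
    return kept

print(filter_spaced([1, 2, 3, 4, 14, 20, 30, 31]))  # [1, 14, 30]
```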
As such, the dataframe before and after filtering should look like the following.
Before
seq_name label pedestrian_id frame_no global_id
0 0001 crossing 0001 0001 1
1 0001 crossing 0001 0002 1
2 0001 crossing 0001 0003 1
3 0001 crossing 0001 0004 1
4 0001 crossing 0001 0005 1
5 0001 crossing 0001 0006 1
6 0001 crossing 0001 0007 1
7 0001 crossing 0001 0008 1
8 0001 crossing 0001 0009 1
9 0001 crossing 0001 0010 1
10 0001 crossing 0002 0001 2
11 0001 crossing 0002 0012 2
12 0001 crossing 0002 0013 2
13 0001 crossing 0002 0014 2
14 0001 crossing 0002 0015 2
15 0001 crossing 0002 0029 2
16 0001 crossing 0002 0030 2
17 0001 crossing 0002 0031 2
18 0001 crossing 0002 0032 2
19 0001 crossing 0002 0033 2
20 0002 crossing 0001 0034 3
21 0002 crossing 0001 0035 3
22 0002 crossing 0001 0036 3
23 0002 crossing 0001 0037 3
24 0002 crossing 0001 0038 3
25 0002 crossing 0001 0039 3
26 0002 crossing 0001 0049 3
27 0002 crossing 0001 0050 3
28 0002 crossing 0001 0051 3
29 0002 crossing 0001 0052 3
After filtering
seq_name label pedestrian_id frame_no global_id
0 0001 crossing 0001 0001 1
10 0001 crossing 0002 0001 2
11 0001 crossing 0002 0012 2
15 0001 crossing 0002 0029 2
25 0002 crossing 0001 0039 3
26 0002 crossing 0001 0049 3
Below is what I have. Once I have the indices, I can create a new dataframe by indexing into the old one. I am still new to pandas and this looks extremely cumbersome, so I am hoping there is a more elegant solution. I have read through the docs on groupby and some other SO posts but still can't figure it out. This isn't homework; I'm just trying to clean up my data processing pipeline by replacing everything with pandas.
ind = []
for j in df["global_id"].unique():
    df_temp = df[df["global_id"] == j][["frame_no"]]
    df_temp["frame_no"] = pd.to_numeric(df_temp["frame_no"])
    start_frame = df_temp["frame_no"].min()
    end_frame = df_temp["frame_no"].max()
    i = start_frame - 1
    while i < end_frame:
        # np.min raises if no frame falls in the window (i, i+10)
        ind.append(np.min(df_temp[(df_temp["frame_no"] > i) & (df_temp["frame_no"] < i + 10)].index.tolist()))
        i += 10
asked Nov 13 '18 at 0:37 by Kong
1 Answer
Here is one way using groupby, but first you need to define a function that does, within each group, what you are looking for. To explain the idea, consider a simple dataframe: dfs = pd.DataFrame({'a': [1,2,3,4,14,20,30,31]}).
I have been trying to solve this kind of problem without looping for a while, and it seems complex; here is the idea I ended up with. In numpy, you can use subtract combined with outer to get all the pairwise differences between elements:
print(np.subtract.outer(dfs.a, dfs.a))
array([[ 0, -1, -2, -3, -13, -19, -29, -30],
[ 1, 0, -1, -2, -12, -18, -28, -29],
[ 2, 1, 0, -1, -11, -17, -27, -28],
[ 3, 2, 1, 0, -10, -16, -26, -27],
[ 13, 12, 11, 10, 0, -6, -16, -17],
[ 19, 18, 17, 16, 6, 0, -10, -11],
[ 29, 28, 27, 26, 16, 10, 0, -1],
[ 30, 29, 28, 27, 17, 11, 1, 0]], dtype=int64)
Now, for example, in column 0 you can see that differences > 10 start at row 4; going to column 4, differences > 10 start at row 6; and in column 6 no difference is big enough. So the filter keeps rows 0, 4 and 6, which are the values [1, 14, 30] we are looking for. To get these numbers, compare the np.subtract.outer result to 10 and sum over axis=0:
arr = (np.subtract.outer(dfs.a, dfs.a) <= 10).sum(0)
print(arr)
Now you see arr[0] = 4, then arr[4] = 6, then arr[6] = 8, which is out of bounds in this example, so we stop. One way to chase these numbers is a while loop (if someone has a pure numpy solution for this, I'm interested):
list_ind = [0]  # initialize the list of indices to keep with 0
arr = (np.subtract.outer(dfs.a, dfs.a) <= 10).sum(0)
i = arr[0]
while i < len(arr):
    list_ind.append(i)
    i = arr[i]
print(list_ind)
[0, 4, 6]
print(dfs.iloc[list_ind])
a
0 1
4 14
6 30
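As a cross-check, the same positions fall out of a plain greedy scan over the values (a minimal sketch; like the <= 10 mask above, it treats a gap of exactly 10 as too small):

```python
vals = [1, 2, 3, 4, 14, 20, 30, 31]
keep = [0]  # always keep the first frame
for i, v in enumerate(vals[1:], start=1):
    if v - vals[keep[-1]] > 10:  # same "> 10" rule as the <= 10 mask
        keep.append(i)
print(keep)  # [0, 4, 6]
```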
Now, for the whole problem with groupby, you can do:
# the column frame_no needs to be converted to integer
df['frame_int'] = pd.to_numeric(df['frame_no'])
# ensure the data is sorted by frame_int, whatever the global_id
df = df.sort_values('frame_int')

# define the function looking for the indices to keep
def find_ind(df_g):
    list_ind = [0]
    arr = (np.subtract.outer(df_g.frame_int, df_g.frame_int) <= 10).sum(0)
    i = arr[0]
    while i < len(arr):
        list_ind.append(i)
        i = arr[i]
    return df_g.iloc[list_ind]

# create the filtered dataframe
df_filtered = (df.groupby('global_id').apply(find_ind)
                 .drop('frame_int', axis=1).reset_index(drop=True))
print(df_filtered)
seq_name label pedestrian_id frame_no global_id
0 1 crossing 1 1 1
1 1 crossing 2 1 2
2 1 crossing 2 12 2
3 1 crossing 2 29 2
4 2 crossing 1 34 3
5 2 crossing 1 49 3
If you want to keep the original rows' index, add level=0 in reset_index, i.e. reset_index(level=0, drop=True), instead.
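To see this pipeline run end to end, here is a self-contained sketch on a small made-up dataframe (the column values are assumptions mirroring the question's layout; np.subtract.outer is applied to plain numpy arrays via .to_numpy(), since calling a ufunc's .outer directly on a Series is unsupported in newer pandas versions):

```python
import numpy as np
import pandas as pd

# Made-up data shaped like the question: two tracks with zero-padded frame numbers
df = pd.DataFrame({
    "global_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "frame_no": ["0001", "0002", "0014", "0020", "0005", "0016", "0017", "0030"],
})

df["frame_int"] = pd.to_numeric(df["frame_no"])  # frame_no is a string column
df = df.sort_values("frame_int")

def find_ind(df_g):
    # per column: count how many frames are at most 10 frames ahead of it
    vals = df_g["frame_int"].to_numpy()
    arr = (np.subtract.outer(vals, vals) <= 10).sum(0)
    list_ind = [0]
    i = arr[0]
    while i < len(arr):
        list_ind.append(i)
        i = arr[i]
    return df_g.iloc[list_ind]

out = df.groupby("global_id").apply(find_ind).reset_index(drop=True)
print(out[["global_id", "frame_no"]].to_string(index=False))
```

For track 1 (frames 1, 2, 14, 20) this keeps 1 and 14; for track 2 (frames 5, 16, 17, 30) it keeps 5, 16, and 30, since each kept frame is more than 10 frames after the previous one.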
answered Nov 13 '18 at 16:27 by Ben.T