Python Count Unique value in Row csv
I have CSV data that has been loaded into a list of rows.
Example:
    [['R2C1', 'R01', 'API_1', 801, 'API_TEST01'],
     ['R2C1', 'R01', 'API_1', 802, 'API_TEST02'],
     ['R2C1', 'R01', 'API_1', 801, 'API_TEST03']]
I would like to find all the unique values in column index 3 (row[3]) and count how often each one occurs.
Expected result:
    [{'num': 801, 'count': 2}, {'num': 802, 'count': 1}]
so that I can look up a dict key for another test.
Code:
    for row in data[1:]:
        vnum = row[3]
        ipcount.append({"num":vnum,"count": count})
        if row[3] not in ipcount:
            ipcount.append({"num":vlan})
python csv
edited Nov 9 at 7:36
Vineeth Sai
asked Nov 9 at 7:34
miu
3 Answers
accepted
You can do this with a dictionary that tallies the rows by their num element (index 3). The last step is a list comprehension that builds the result in the shape you asked for.
    counts = {}
    for elem in data:
        # start each new num at zero before incrementing
        if elem[3] not in counts:
            counts[elem[3]] = 0
        counts[elem[3]] += 1
    final_list = [{'num': num, 'count': counts[num]} for num in counts]
Output
    [{'num': 801, 'count': 2}, {'num': 802, 'count': 1}]
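The same tally can also be sketched with collections.Counter from the standard library. This is an alternative, not what the answer above uses; the data list below is assumed from the sample rows in the question, and the header row is taken to be already removed.

```python
from collections import Counter

# Sample rows assumed from the question; index 3 holds the number to count.
data = [
    ['R2C1', 'R01', 'API_1', 801, 'API_TEST01'],
    ['R2C1', 'R01', 'API_1', 802, 'API_TEST02'],
    ['R2C1', 'R01', 'API_1', 801, 'API_TEST03'],
]

# Counter tallies each distinct value of row[3] in first-seen order.
counts = Counter(row[3] for row in data)

# Reshape the tally into the list-of-dicts form requested in the question.
final_list = [{'num': num, 'count': n} for num, n in counts.items()]
print(final_list)
```

Since counts behaves like a dict, counts[801] can also be used directly when only a lookup by number is needed.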
answered Nov 9 at 7:46
Mihai Alexandru-Ionut
How interesting that you are using a comprehension. Thanks for sharing.
– miu
Nov 9 at 16:30
If you use the pandas library:
    import pandas as pd
    # Open your file using pd.read_csv(), or build the frame from your list of lists
    df = pd.DataFrame([['R2C1','R01','API_1',801,'API_TEST01'],
                       ['R2C1','R01','API_1',802,'API_TEST02'],
                       ['R2C1','R01','API_1',801,'API_TEST03']])
    print(df)
          0    1      2    3           4
    0  R2C1  R01  API_1  801  API_TEST01
    1  R2C1  R01  API_1  802  API_TEST02
    2  R2C1  R01  API_1  801  API_TEST03
Here you can use .value_counts() to get the number of occurrences of each value in column 3, then use a list comprehension to transform this into the form you need:
    [{'num': k, 'count': v} for k, v in df[3].value_counts().items()]
    [{'num': 801, 'count': 2}, {'num': 802, 'count': 1}]
edited Nov 9 at 7:52
answered Nov 9 at 7:41
Alex
Thanks Alex, I was looking at pandas and tried SeriesGroupBy.nunique, but was not able to get the results. Thanks.
– miu
Nov 9 at 16:29
If you've got a large amount of data to sort through, pandas is very quick. If you have any other questions, check out the pandas tag and ask there.
– Alex
Nov 10 at 17:04
Here is a pure pandas approach without any explicit loops. Note that it produces a plain dict mapping each num to its count.
    import pandas as pd
    # define path to data
    PATH = 'pathtodata.csv'
    # create pandas DataFrame
    df = pd.read_csv(PATH, usecols=[0, 1, 2, 3], header=0, names=['a', 'b', 'c', 'num'])
    # add the per-value count next to the column of interest
    df['count'] = df.groupby('num')['num'].transform('count')
    # only keep unique values in column of interest
    df.drop_duplicates(subset=['num'], inplace=True)
    # create dict from both columns; use df['count'], because df.count is the DataFrame method
    your_output = dict(zip(df['num'], df['count']))
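For comparison, a minimal standard-library sketch of the same file-based flow, assuming the file has a header row and the number sits in the fourth column. The io.StringIO text and its header names are made up here as a stand-in for a real file opened with open().

```python
import csv
import io
from collections import Counter

# Stand-in for an open CSV file; the header names are hypothetical.
csv_text = """site,rack,api,num,name
R2C1,R01,API_1,801,API_TEST01
R2C1,R01,API_1,802,API_TEST02
R2C1,R01,API_1,801,API_TEST03
"""

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header row

# csv.reader yields every field as a string, so convert before counting.
counts = dict(Counter(int(row[3]) for row in reader))
print(counts)
```

The int() conversion matters: without it the keys would be the strings '801' and '802', and a lookup with the integer 801 would miss.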
answered Nov 9 at 8:16
sudonym
Thanks sudonym, what does the zip do? Does it combine and remove duplicates?
– miu
Nov 9 at 16:31
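On the zip question, a small sketch with made-up lists: zip only pairs up items position by position and does not remove anything. In the answer above the duplicates are dropped earlier by drop_duplicates; dict() then simply keeps the last value seen for any repeated key.

```python
nums = [801, 802]
counts = [2, 1]

# zip pairs the first items together, then the second items, and so on.
pairs = list(zip(nums, counts))
lookup = dict(pairs)

# zip itself keeps duplicates; both pairs for 801 are still present here.
dup_pairs = list(zip([801, 801], [2, 5]))

# dict() keeps only one entry per key, and the later value wins.
dup_lookup = dict(dup_pairs)

print(pairs, lookup, dup_pairs, dup_lookup)
```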