Python - Merge list of tuples from nested list

I have a list of lists of tuples that I want to merge. The code below combines the words that share a tag when a single flat list is passed in as 'classified_text'. How do I extend this to a nested list of tuples? I tried adding another for loop with append, but I get a different error. Is there a simple way to do this? Thanks!

Input Text 1 - Working:

classified_text = [('John', 'PERSON'), ('Smith', 'PERSON'),('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')] # Single list

Output Text 1 - Working:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC')]

Input Text 2 - Not Working: Nested list with tuples

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')], [('some', 'O'), ('text', 'O'), ('here', 'O')], [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

Code:

from itertools import groupby

entity_extracted_words = []
for tag, chunk in groupby(classified_text, lambda x:x[1]):
    if tag != "O":
        info_ner = "%-12s"%tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)

Output Text 2 - Trying to get this result:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC'),('ORGANIZATION', 'University of CA')]

Error:
TypeError: not all arguments converted during string formatting
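
The error can be reproduced in isolation. The sketch below is illustrative only (the names `nested`, `key` and `format_tag` are not from the original post). It assumes the nested input is passed straight to `groupby`, in which case `x[1]` is the second tuple of each sublist rather than a tag string, and `"%-12s" % tag` receives a two-element tuple for a single format specifier.

```python
nested = [[('John', 'PERSON'), ('Smith', 'PERSON')],
          [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')]]

# Each element of the outer list is itself a list, so indexing it with [1]
# yields that sublist's second (word, tag) tuple, not a tag string.
key = nested[0][1]
print(key)  # ('Smith', 'PERSON')


def format_tag(tag):
    """Left-justify tag in a 12-character field, as the original code does."""
    return "%-12s" % tag


try:
    # A 2-tuple on the right of % supplies two arguments to one %s.
    format_tag(key)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

With a plain string such as `'PERSON'` the same formatting call succeeds, which is why the single-list input works.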

edited Nov 7 at 20:37

asked Nov 7 at 20:23

sharp

python python-3.x

2 Answers

Accepted answer (2 votes)

Try something like this. Loop over the sublists, collect the words whose tag is not 'O', join them into a single string, and append the result to newlist.

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

newlist = []
for sublist in classified_text:
    combined = []
    for chunk, tag in sublist:
        # Skip words tagged 'O'
        if tag == 'O':
            continue
        combined_tag = tag
        combined.append(chunk)

    # Append tag and string to list
    if combined:
        # If you want the tag space-filled as in your example, you can use
        # the string's ljust method
        newlist.append((combined_tag.ljust(12), ' '.join(combined)))

print(newlist)

#[('PERSON      ', 'John Smith'),
# ('ORGANIZATION', 'University of ABC'),
# ('ORGANIZATION', 'University of CA')]
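
Note that the loop above keeps only one tag per sublist, so a sublist containing two different non-'O' tags would have all of its words joined under the last tag seen. A possible variant, sketched here under that observation, applies the `groupby` from the question to each sublist separately. The function name `merge_entities` and its `skip_tag` and `width` parameters are hypothetical names chosen for illustration.

```python
from itertools import groupby


def merge_entities(nested, skip_tag="O", width=12):
    """Merge consecutive same-tag words inside each sublist.

    nested is a list of lists of (word, tag) tuples. Words tagged
    skip_tag are dropped; tags are left-justified to width characters.
    """
    merged = []
    for sublist in nested:
        # groupby only groups *consecutive* items with the same key,
        # which is what we want for runs of words sharing a tag.
        for tag, chunk in groupby(sublist, key=lambda pair: pair[1]):
            if tag == skip_tag:
                continue
            words = " ".join(word for word, _ in chunk)
            merged.append((tag.ljust(width), words))
    return merged


classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

print(merge_entities(classified_text))
```

Because grouping happens per sublist, entities never merge across sublist boundaries, and a sublist with several distinct tags produces one entry per run.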

edited Nov 7 at 21:27

answered Nov 7 at 20:50

Stephen Cowley


@Stephen, thanks for replying. It is close, but the last element of your input classified_text is slightly different from the one in the question, and I am also trying to filter out the 'O' tags. When I run your code on Input Text 2, I get: [('PERSON', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('O', 'some text here'), ('O', 'Mark from University of CA')]. I am looking for this: [('PERSON ', 'John Smith'), ('ORGANIZATION', 'University of ABC'),('ORGANIZATION', 'University of CA')]
– sharp
Nov 7 at 21:02

Deleted the list comprehension comment to avoid confusing future readers now that the answer has been edited.
– benvc
Nov 7 at 21:08

@sharp, I think this is closer to what you were looking for now.
– Stephen Cowley
Nov 7 at 21:13

Answer (0 votes)

You could first flatten your list of lists into just a list:

flat_list = [item for sublist in classified_text for item in sublist]

And that flat list should work with your original code.
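
As a minimal sketch of that suggestion, the flattening step can be combined with the loop from the question as shown below. The wrapper function `merge_flat` is a hypothetical name added here so the behaviour can be checked on more than one input. One caveat worth keeping in mind: once the list is flattened, `groupby` no longer sees sublist boundaries, so two adjacent sublists that end and start with the same tag are merged into a single entry.

```python
from itertools import groupby


def merge_flat(nested):
    """Flatten a list of lists of (word, tag) tuples, then group by tag."""
    flat_list = [item for sublist in nested for item in sublist]
    entity_extracted_words = []
    for tag, chunk in groupby(flat_list, lambda x: x[1]):
        if tag != "O":
            info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
            entity_extracted_words.append(info_ner)
    return entity_extracted_words


classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

print(merge_flat(classified_text))

# Caveat: same-tag runs in neighbouring sublists are joined together.
print(merge_flat([[('John', 'PERSON')], [('Mary', 'PERSON')]]))
```

For the input in the question this yields the requested three entries, because every pair of neighbouring sublists is separated by a change of tag.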

answered Nov 7 at 21:02

kabdulla
