Python - Merge list of tuples from nested list











I have a list of lists of tuples that I want to merge. The code below combines the words by tag when a single flat list is passed in as 'classified_text'. How do I extend this to a nested list of tuples? I tried adding another for loop and the append method, but I get a different error. Is there a simple way to do this? Thanks!



Input Text 1 - Working:



classified_text = [('John', 'PERSON'), ('Smith', 'PERSON'),('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')] # Single list


Output Text 1 - Working:



[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC')]


Input Text 2 - Not Working: Nested list with tuples



classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')], [('some', 'O'), ('text', 'O'), ('here', 'O')], [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]


Code:



from itertools import groupby

entity_extracted_words = []
# Group consecutive (word, tag) pairs that share the same tag
for tag, chunk in groupby(classified_text, lambda x: x[1]):
    if tag != "O":
        info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)


Out Text 2 - Trying to get this result:



[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC'),('ORGANIZATION', 'University of CA')] 


Error:
TypeError: not all arguments converted during string formatting
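
The TypeError can be reproduced in isolation. With the nested input, each element that groupby sees is a whole sublist, so the key x[1] is a (word, tag) tuple rather than a tag string, and "%-12s" % tag then receives two values for one placeholder. A minimal sketch, using a shortened copy of Input Text 2 (the names nested, first_key, and error_message are illustrative only):

```python
from itertools import groupby

# Shortened copy of the nested input from the question
nested = [[('John', 'PERSON'), ('Smith', 'PERSON')],
          [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')]]

# Each item x is a sublist here, so x[1] is the second tuple of that sublist
first_key = next(groupby(nested, lambda x: x[1]))[0]
print(first_key)

error_message = None
try:
    # A 2-tuple on the right of % supplies two values for a single %s
    "%-12s" % first_key
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Under these assumptions the key is a tuple, which is why the formatting step fails before any merging happens.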










      python python-3.x














      asked Nov 7 at 20:23 by sharp, edited Nov 7 at 20:37
























          2 Answers




















          accepted










          Try something like this. Loop over the sublists, join the non-'O' words of each sublist into a string, and append the result to newlist.



          classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                             [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                             [('some', 'O'), ('text', 'O'), ('here', 'O')],
                             [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

          newlist = []
          for sublist in classified_text:
              combined = []
              for chunk, tag in sublist:
                  # Skip words tagged 'O' (not part of any entity)
                  if tag == 'O':
                      continue
                  combined_tag = tag
                  combined.append(chunk)

              # Append tag and string to list
              if combined:
                  # If you want the tag space-padded as in your example, you can use
                  # the string's ljust method
                  newlist.append((combined_tag.ljust(12), ' '.join(combined)))

          print(newlist)

          # [('PERSON      ', 'John Smith'),
          #  ('ORGANIZATION', 'University of ABC'),
          #  ('ORGANIZATION', 'University of CA')]
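
The loop in this answer keeps one tag per sublist, so a sublist containing two different entity types would be joined under the last tag seen. One possible variant, offered as a sketch rather than as the answerer's method, applies the question's groupby to each sublist separately. The helper name merge_entities and its skip_tag and width parameters are hypothetical:

```python
from itertools import groupby

def merge_entities(nested, skip_tag='O', width=12):
    """Merge consecutive same-tag words within each sublist (illustrative helper)."""
    merged = []
    for sublist in nested:
        # groupby only groups adjacent items, so each run of one tag becomes one entry
        for tag, chunk in groupby(sublist, key=lambda pair: pair[1]):
            if tag != skip_tag:
                merged.append((tag.ljust(width), ' '.join(word for word, _ in chunk)))
    return merged

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

print(merge_entities(classified_text))
```

Because grouping happens inside each sublist, entities never merge across sublist boundaries, and a sublist with several entity runs yields one tuple per run.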




























          • @Stephen, thanks for replying. It is close; however, the last element of your input classified_text differs slightly from the one in the question, and I am also trying to filter out the 'O' tags. When I run your code on Input Text 2, I get: [('PERSON', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('O', 'some text here'), ('O', 'Mark from University of CA')]. I am looking for this: [('PERSON ', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('ORGANIZATION', 'University of CA')]
            – sharp
            Nov 7 at 21:02












          • Deleted the list comprehension comment to avoid confusing future readers now that the answer has been edited.
            – benvc
            Nov 7 at 21:08










          • @sharp, I think this is closer to what you were looking for now.
            – Stephen Cowley
            Nov 7 at 21:13































          You could first flatten your list of lists into just a list:



          flat_list = [item for sublist in classified_text for item in sublist]


          And that flat list should work with your original code.
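
Put together with the code from the question, the flattening approach might look like the sketch below. It assumes that entity runs never need to stay separate across sublist boundaries: if one sublist ends and the next begins with the same tag, groupby would join them into a single entry.

```python
from itertools import groupby

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

# Flatten the list of lists into a single list of (word, tag) tuples
flat_list = [item for sublist in classified_text for item in sublist]

entity_extracted_words = []
# Same grouping as in the question, now applied to the flat list
for tag, chunk in groupby(flat_list, lambda x: x[1]):
    if tag != "O":
        info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print(entity_extracted_words)
```

For the sample input, the run of 'O' words between the two ORGANIZATION runs keeps 'University of ABC' and 'University of CA' as separate entries.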






            Accepted answer: answered Nov 7 at 20:50 by Stephen Cowley, edited Nov 7 at 21:27

            Second answer: answered Nov 7 at 21:02 by kabdulla
