How to get an XML element as a string with its namespace using ElementTree in Python?

I need to get elements from an XML document as strings. I am working with the XML below:



<xml>
    <prot:data xmlns:prot="prot">
        <product-id-template>
            <prot:ProductId>PRODUCT_ID</prot:ProductId>
        </product-id-template>

        <product-name-template>
            <prot:ProductName>PRODUCT_NAME</prot:ProductName>
        </product-name-template>

        <dealer-template>
            <xsi:Dealer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">DEALER</xsi:Dealer>
        </dealer-template>
    </prot:data>
</xml>


I tried the following code:



from xml.etree import ElementTree as ET

def get_template(xpath, namespaces):
    tree = ET.parse('cdata.xml')
    elements = tree.getroot()
    # Return the first element matching the XPath expression (None if nothing matches).
    for element in elements.findall(xpath, namespaces=namespaces):
        return element

namespace = {"prot": "prot"}
aa = get_template(".//prot:ProductId", namespace)
print(ET.tostring(aa).decode())


Actual output:



<ns0:ProductId xmlns:ns0="prot">PRODUCT_ID</ns0:ProductId>


Expected output:



<prot:ProductId>PRODUCT_ID</prot:ProductId>


An xmlns declaration should appear in the output only where it is present on that element in the source document. For example, the element inside product-id-template carries no xmlns declaration, so it should be returned without one. The element inside dealer-template does carry an xmlns declaration, so it should be returned with it.



How can I achieve this?










      python xml elementtree














      asked Nov 21 '18 at 5:25 by Sathish, edited Nov 21 '18 at 11:56
























          1 Answer














          You can remove the xmlns declarations with a regular expression.



          import re
          # ...
          with_ns = ET.tostring(aa).decode()
          # \w+ matches the optional prefix name after "xmlns:"
          no_ns = re.sub(r' xmlns(:\w+)?="[^"]+"', '', with_ns)
          print(no_ns)
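
As a self-contained illustration of the regex approach, here is a minimal sketch. It inlines a shortened copy of the question's XML as a string instead of reading cdata.xml, which is an assumption made only to keep the example runnable. It also calls ET.register_namespace, which the answer does not mention: without it, ElementTree would still write its generated ns0 prefix rather than prot. Note that the pattern strips every xmlns declaration it finds, so it would also drop the one on xsi:Dealer.

```python
import re
from xml.etree import ElementTree as ET

# Shortened copy of the question's document, inlined instead of read from cdata.xml.
XML = (
    '<xml><prot:data xmlns:prot="prot">'
    '<product-id-template>'
    '<prot:ProductId>PRODUCT_ID</prot:ProductId>'
    '</product-id-template>'
    '</prot:data></xml>'
)

# Ask ElementTree to serialize the "prot" URI with the "prot" prefix instead of ns0.
ET.register_namespace("prot", "prot")

root = ET.fromstring(XML)
elem = root.find(".//prot:ProductId", {"prot": "prot"})

# encoding="unicode" makes tostring return a str, so no .decode() is needed.
with_ns = ET.tostring(elem, encoding="unicode")

# Remove every xmlns / xmlns:prefix declaration from the serialized string.
no_ns = re.sub(r' xmlns(:\w+)?="[^"]+"', '', with_ns)

print(with_ns)
print(no_ns)
```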




          UPDATE: There is also a rather wild hack, although I can't recommend it, because I'm not a Python expert.



          I just checked the source code and found that I can do this hack:



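          def my_serialize_xml(write, elem, qnames, namespaces,
                               short_empty_elements, **kwargs):
              # Delegate to the stock serializer, but pass namespaces=None
              # so that no xmlns attributes are written.
              ET._serialize_xml(write, elem, qnames,
                                None, short_empty_elements, **kwargs)

          ET._serialize["xml"] = my_serialize_xml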


          I defined my_serialize_xml, which calls ElementTree._serialize_xml with namespaces=None. Then, in the dictionary ElementTree._serialize, I changed the value for the key "xml" to my_serialize_xml. As a result, when you call ElementTree.tostring, it uses my_serialize_xml.



          If you want to try it, just place the code above after from xml.etree import ElementTree as ET (but before using ET).
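
Because the hack replaces a module-level entry, it affects every later call to tostring in the process. One way to limit that, sketched here under the assumption that the private names ET._serialize and ET._serialize_xml keep their current shape (they are undocumented and may change between Python versions), is to apply the patch only inside a context manager. The helper name without_xmlns and the inlined XML are hypothetical, introduced just for this example.

```python
from contextlib import contextmanager
from xml.etree import ElementTree as ET


@contextmanager
def without_xmlns():
    """Temporarily serialize XML without writing any xmlns attributes.

    Relies on the private ET._serialize table, so treat it as a hack.
    """
    original = ET._serialize["xml"]

    def serialize_without_xmlns(write, elem, qnames, namespaces,
                                short_empty_elements, **kwargs):
        # Same call as the stock serializer, but with namespaces=None.
        original(write, elem, qnames, None, short_empty_elements, **kwargs)

    ET._serialize["xml"] = serialize_without_xmlns
    try:
        yield
    finally:
        # Always restore the stock serializer, even if serialization fails.
        ET._serialize["xml"] = original


# Shortened, inlined copy of the question's document (assumption: no file I/O).
XML = (
    '<xml><prot:data xmlns:prot="prot">'
    '<product-id-template>'
    '<prot:ProductId>PRODUCT_ID</prot:ProductId>'
    '</product-id-template>'
    '</prot:data></xml>'
)

# Without this, the prefix would be the generated ns0 rather than prot.
ET.register_namespace("prot", "prot")

elem = ET.fromstring(XML).find(".//prot:ProductId", {"prot": "prot"})

with without_xmlns():
    patched = ET.tostring(elem, encoding="unicode")

# Outside the block the normal behaviour is back.
restored = ET.tostring(elem, encoding="unicode")

print(patched)
print(restored)
```

Like the regex, this drops the declarations everywhere, so it does not keep the xmlns:xsi declaration on the Dealer element.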






          • Can we achieve this without using regex? In some cases an element may carry its own, different xmlns, and the regex would remove that as well, right? I need to retrieve the element exactly as it appears in the XML document. @Mike Kaskun

            – Sathish
            Nov 21 '18 at 8:57













          • I tried to do it with ElementTree.tostring(), but it seems that is not possible. Maybe lxml can do it. I'll let you know if I find a better solution.

            – Mike Kaskun
            Nov 21 '18 at 9:19











          • I've updated the answer; I can't find any other way with xml.etree. If you can use lxml, you can find similar Q&As that show how to do it.

            – Mike Kaskun
            Nov 21 '18 at 10:39











          • Thanks @Mike Kaskun. But it removes the xmlns everywhere. The xmlns should not be removed where the source document has it. I added the dealer-template block to the question above: there I need the xmlns, and it should not be removed. It should be absent only where the document does not have it.

            – Sathish
            Nov 21 '18 at 11:02








          • Ok, I understand. I'll try something else later. Also, add your explanation to the question; maybe someone else will help.

            – Mike Kaskun
            Nov 21 '18 at 11:44
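
For the requirement discussed in these comments (keep an xmlns declaration only on the element that carries it in the source), one possible standard-library route is to record the declarations while parsing. The sketch below is an illustration under stated assumptions, not a confirmed solution from this thread: it uses ET.iterparse with "start-ns" events, which are reported just before the "start" event of the declaring element, and then writes the element out by hand. The helper names parse_with_prefixes, qname, and serialize are hypothetical, the XML is inlined instead of read from cdata.xml, and the code assumes each namespace URI is bound to a single prefix.

```python
import io
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

XML = """<xml>
<prot:data xmlns:prot="prot">
<product-id-template>
<prot:ProductId>PRODUCT_ID</prot:ProductId>
</product-id-template>
<dealer-template>
<xsi:Dealer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">DEALER</xsi:Dealer>
</dealer-template>
</prot:data>
</xml>"""


def parse_with_prefixes(source):
    """Parse XML and remember, per element, the prefixes in scope
    and the declarations written on that element itself."""
    in_scope = {}      # element -> {uri: prefix} visible at that element
    declared = {}      # element -> [(prefix, uri)] declared on the element
    scope_stack = [{}]
    pending = []       # declarations seen since the last element start
    root = None
    events = ("start-ns", "start", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            pending.append(payload)              # payload is (prefix, uri)
        elif event == "start":
            scope = dict(scope_stack[-1])
            scope.update({uri: prefix for prefix, uri in pending})
            scope_stack.append(scope)
            in_scope[payload] = scope
            declared[payload] = pending
            pending = []
            if root is None:
                root = payload
        else:                                    # "end": leave the scope
            scope_stack.pop()
    return root, in_scope, declared


def qname(tag, scope):
    """Turn '{uri}local' back into 'prefix:local' using the recorded scope."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        prefix = scope[uri]
        return f"{prefix}:{local}" if prefix else local
    return tag


def serialize(elem, in_scope, declared):
    """Write elem with its source prefixes and only its own xmlns attributes."""
    scope = in_scope[elem]
    name = qname(elem.tag, scope)
    parts = [f"<{name}"]
    for prefix, uri in declared[elem]:
        attr = f"xmlns:{prefix}" if prefix else "xmlns"
        parts.append(f" {attr}={quoteattr(uri)}")
    for key, value in elem.attrib.items():
        parts.append(f" {qname(key, scope)}={quoteattr(value)}")
    parts.append(">")
    parts.append(escape(elem.text or ""))
    for child in elem:
        parts.append(serialize(child, in_scope, declared))
        parts.append(escape(child.tail or ""))
    parts.append(f"</{name}>")
    return "".join(parts)


root, in_scope, declared = parse_with_prefixes(io.BytesIO(XML.encode()))

ns = {"prot": "prot", "xsi": "http://www.w3.org/2001/XMLSchema-instance"}
product_id_str = serialize(root.find(".//prot:ProductId", ns), in_scope, declared)
dealer_str = serialize(root.find(".//xsi:Dealer", ns), in_scope, declared)

print(product_id_str)
print(dealer_str)
```

This sketch always writes an explicit closing tag, so empty elements come out as a start/end pair rather than in the short form, and it does not reproduce comments or processing instructions.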











          answered Nov 21 '18 at 8:34 by Mike Kaskun, edited Nov 21 '18 at 10:29












