How to get the xml element as a string with namespace using ElementTree in python?
I need to get elements from an XML document as strings. I am working with the XML below.
<xml>
  <prot:data xmlns:prot="prot">
    <product-id-template>
      <prot:ProductId>PRODUCT_ID</prot:ProductId>
    </product-id-template>
    <product-name-template>
      <prot:ProductName>PRODUCT_NAME</prot:ProductName>
    </product-name-template>
    <dealer-template>
      <xsi:Dealer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">DEALER</xsi:Dealer>
    </dealer-template>
  </prot:data>
</xml>
I tried the following code:
from xml.etree import ElementTree as ET

def get_template(xpath, namespaces):
    # Parse the file and return the first element matching the XPath (None if nothing matches).
    tree = ET.parse('cdata.xml')
    root = tree.getroot()
    for element in root.findall(xpath, namespaces=namespaces):
        return element

namespace = {"prot": "prot"}
aa = get_template(".//prot:ProductId", namespace)
print(ET.tostring(aa).decode())
Actual output:
<ns0:ProductId xmlns:ns0="prot">PRODUCT_ID</ns0:ProductId>
Expected output:
<prot:ProductId>PRODUCT_ID</prot:ProductId>
The xmlns declaration must be kept wherever it is present in the document and left out wherever it is not. For example, product-id-template does not contain an xmlns declaration, so its content should be retrieved without one, while dealer-template does contain one, so its content should be retrieved with it.
How can I achieve this?
python xml elementtree
asked Nov 21 '18 at 5:25 by Sathish, edited Nov 21 '18 at 11:56
1 Answer
You can remove the xmlns declarations with a regex.
import re
# ...
with_ns = ET.tostring(aa).decode()
# Strip every xmlns or xmlns:prefix declaration from the serialized string.
no_ns = re.sub(r' xmlns(:\w+)?="[^"]+"', '', with_ns)
print(no_ns)
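As a quick check of what this substitution does, here is a minimal sketch that applies it to two literal strings modeled on the question's data. The helper name strip_xmlns is only an illustration and is not part of ElementTree. It shows that the pattern removes every declaration, including one that the source document put on the element itself, and that it leaves the generated ns0 prefix in place.

```python
import re

# Hypothetical helper; the pattern is the one used in the answer above.
XMLNS_DECL = re.compile(r' xmlns(:\w+)?="[^"]+"')


def strip_xmlns(serialized):
    """Remove every xmlns / xmlns:prefix declaration from an XML string."""
    return XMLNS_DECL.sub('', serialized)


# Literal strings modeled on the question's data (no file needed).
product = '<ns0:ProductId xmlns:ns0="prot">PRODUCT_ID</ns0:ProductId>'
dealer = ('<xsi:Dealer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
          'DEALER</xsi:Dealer>')

print(strip_xmlns(product))  # <ns0:ProductId>PRODUCT_ID</ns0:ProductId>
print(strip_xmlns(dealer))   # <xsi:Dealer>DEALER</xsi:Dealer>
```

The second result is the case raised in the comments: the declaration on Dealer is lost even though the source document carries it there.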
UPDATE: There is a rather wild alternative, although I can't recommend it because I'm not a Python expert.
I checked the source code and found that this hack works:
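# Note: _serialize_xml and _serialize are private ElementTree internals.
def my_serialize_xml(write, elem, qnames, namespaces,
                     short_empty_elements, **kwargs):
    # Pass None instead of the namespace map so no xmlns attributes are written.
    ET._serialize_xml(write, elem, qnames,
                      None, short_empty_elements, **kwargs)

ET._serialize["xml"] = my_serialize_xml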
I defined my_serialize_xml, which calls ElementTree._serialize_xml with namespaces=None. Then, in the dictionary ElementTree._serialize, I changed the value for the key "xml" to my_serialize_xml. So when you call ElementTree.tostring, it will use my_serialize_xml.
If you want to try it, place the code above after from xml.etree import ElementTree as ET (but before using ET).
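The hack above drops every declaration. If a declaration has to survive where the source element carries it, one possible sketch with the standard library alone is shown below. It is not the answer's method. It assumes that recording the "start-ns" events from ET.iterparse is enough to tell which element declared which prefix, and it uses ET.register_namespace (a global side effect) so that the document's own prefixes are reused instead of ns0. The function names, the SAMPLE string, and the regex are illustrative assumptions. Default namespaces and declarations on nested descendants are not handled.

```python
import io
import re
from xml.etree import ElementTree as ET

# Shortened copy of the question's document, kept inline so the sketch is self-contained.
SAMPLE = """<xml>
<prot:data xmlns:prot="prot">
<product-id-template>
<prot:ProductId>PRODUCT_ID</prot:ProductId>
</product-id-template>
<dealer-template>
<xsi:Dealer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">DEALER</xsi:Dealer>
</dealer-template>
</prot:data>
</xml>"""


def parse_with_declarations(source):
    """Parse XML; return (root, {element: prefixes declared on its own start tag})."""
    declared = {}
    pending = set()
    root = None
    # "start-ns" events arrive just before the "start" event of the
    # element whose start tag carries the declaration.
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            pending.add(prefix)
            if prefix:
                # Global side effect: tostring() will reuse this prefix.
                ET.register_namespace(prefix, uri)
        else:
            if root is None:
                root = payload
            declared[payload] = pending
            pending = set()
    return root, declared


def to_source_like_string(elem, declared):
    """Serialize elem, keeping only the declarations its own start tag had."""
    text = ET.tostring(elem, encoding="unicode").strip()
    # ElementTree escapes ">" in attribute values, so the first ">" ends the start tag.
    start_tag, rest = text.split(">", 1)
    keep = declared.get(elem, set())

    def drop_unless_declared(match):
        prefix = match.group(1) or ""
        return match.group(0) if prefix in keep else ""

    start_tag = re.sub(r' xmlns(?::([\w.-]+))?="[^"]*"',
                       drop_unless_declared, start_tag)
    return start_tag + ">" + rest


root, declared = parse_with_declarations(io.BytesIO(SAMPLE.encode("utf-8")))
ns = {"prot": "prot", "xsi": "http://www.w3.org/2001/XMLSchema-instance"}
print(to_source_like_string(root.find(".//prot:ProductId", ns), declared))
print(to_source_like_string(root.find(".//xsi:Dealer", ns), declared))
```

With the sample data this should print the ProductId element without a declaration and the Dealer element with its xmlns:xsi declaration. Because ElementTree hoists all declarations onto the serialized fragment's root, the sketch only inspects that one start tag.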
Can we achieve this without a regex? In some cases an element may have its own, different xmlns, and the regex would remove that too, right? I need to retrieve the data exactly as it appears in the XML document. @Mike Kaskun
– Sathish
Nov 21 '18 at 8:57
I tried to do it with ElementTree.tostring(), but it seems that is not possible. Maybe lxml can do it. I'll let you know if I find a better solution.
– Mike Kaskun
Nov 21 '18 at 9:19
I've updated the answer; I can't find any other way with xml.etree. If you can use lxml, you can find similar Q&As on how to do it.
– Mike Kaskun
Nov 21 '18 at 10:39
Thanks @Mike Kaskun. But it removes the xmlns everywhere. The xmlns should not be removed where it is present in the document. I added the dealer-template block to the question above; there I need the xmlns, so it should not be removed. It should be removed only where it is not present.
– Sathish
Nov 21 '18 at 11:02
OK, I understand. I'll try something else later. Also, add your explanation to the question; maybe someone else will help.
– Mike Kaskun
Nov 21 '18 at 11:44
answered Nov 21 '18 at 8:34 by Mike Kaskun, edited Nov 21 '18 at 10:29