First, let's look at the XML format
- <tag attrib1="1">text</tag>tail
- 1 = tag, 2 = attrib (attrib1="1"), 3 = text, 4 = tail
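To make the four parts concrete, here is a minimal sketch that reads them back with ElementTree. The `<root>` wrapper is an assumption added only for this illustration: a tail is the text that follows an element's closing tag inside its parent, so the element needs a parent to have one.

```python
import xml.etree.ElementTree as ET

# <root> is a hypothetical wrapper; it gives "tail" a parent to live in.
root = ET.fromstring('<root><tag attrib1="1">text</tag>tail</root>')
elem = root[0]

# (1) tag name, (2) attribute dict, (3) text inside the element,
# (4) text after the closing tag, up to the next sibling or the parent's end.
parts = (elem.tag, elem.attrib, elem.text, elem.tail)
print(parts)  # ('tag', {'attrib1': '1'}, 'text', 'tail')
```

Note that attribute values are always strings, so `attrib1` comes back as `'1'`, not the integer `1`.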
Python offers three ways to process XML files: SAX, DOM, and ElementTree.
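As a rough comparison, the sketch below reads the same attribute with each of the three approaches, assuming they are the standard library's DOM (`xml.dom.minidom`), SAX (`xml.sax`), and ElementTree modules. The inline XML string and the `CountryHandler` class are made up for this illustration.

```python
import xml.dom.minidom
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this comparison.
XML = '<data><country name="Liechtenstein"/><country name="Singapore"/></data>'

# DOM: load the whole document into a tree of node objects.
dom = xml.dom.minidom.parseString(XML)
dom_names = [n.getAttribute("name") for n in dom.getElementsByTagName("country")]

# SAX: stream through the document and react to events.
class CountryHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once for every opening tag.
        if name == "country":
            self.names.append(attrs.getValue("name"))

handler = CountryHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)
sax_names = handler.names

# ElementTree: a lightweight tree; iterating an element yields its children.
et_names = [c.get("name") for c in ET.fromstring(XML)]

print(dom_names, sax_names, et_names)
```

All three produce the same list of names. ElementTree usually needs the least code for this kind of task, which is why the rest of this post uses it.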
The ElementTree module in Python
- <?xml version="1.0"?>
- <data>
- <country1 name="Liechtenstein">
- <rank1 updated="yes">2</rank1>
- <year1>2008</year1>
- <gdppc1>141100</gdppc1>test
- <neighbor1 name="Austria" direction="E"/>
- <neighbor1 name="Switzerland" direction="W"/>
- </country1>
- <country2 name="Singapore">
- <rank2 updated="no">5</rank2>
- <year2>2011</year2>
- <gdppc2>59900</gdppc2>
- <neighbor2 name="Malaysia" direction="N"/>
- </country2>
- </data>
- import xml.etree.ElementTree as ET
-
- et = ET.parse("xmlfile")
- root = et.getroot()
- print(root.tag) # data
- print(root[0].tag) # country1
- print(root[1].tag) # country2
- print(root[0].attrib) # {'name': 'Liechtenstein'}
- print(root[0][1].text) # 2008
- print(root[0][2].tail) # test
-
- for r in root:
-     print(r.tag)
-
- # Output:
- # country1
- # country2
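Index-based access such as `root[0][1]` depends on the order of the children. As an alternative, here is a minimal sketch of the same lookups by tag name, using `find()` and `findall()` with simple paths. The sample document is inlined through `ET.fromstring()` so the snippet runs without a file; the name `SAMPLE` is only for this illustration.

```python
import xml.etree.ElementTree as ET

# Same document as the sample above, inlined so no file is needed.
SAMPLE = """<?xml version="1.0"?>
<data>
    <country1 name="Liechtenstein">
        <rank1 updated="yes">2</rank1>
        <year1>2008</year1>
        <gdppc1>141100</gdppc1>test
        <neighbor1 name="Austria" direction="E"/>
        <neighbor1 name="Switzerland" direction="W"/>
    </country1>
    <country2 name="Singapore">
        <rank2 updated="no">5</rank2>
        <year2>2011</year2>
        <gdppc2>59900</gdppc2>
        <neighbor2 name="Malaysia" direction="N"/>
    </country2>
</data>"""

root = ET.fromstring(SAMPLE)

# find() returns the first matching element (or None if nothing matches).
year = root.find("country1/year1").text

# The tail keeps the surrounding whitespace, so strip it before comparing.
tail = root.find("country1/gdppc1").tail.strip()

# findall() returns every matching element, in document order.
neighbors = [n.get("name") for n in root.findall("country1/neighbor1")]

print(year, tail, neighbors)  # 2008 test ['Austria', 'Switzerland']
```

Because `find()` returns `None` when nothing matches, real code should check the result before reading `.text` from it.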
- # Uses the XML file shown above
-
- import xml.etree.ElementTree as ET
-
- et = ET.parse("xmlfile")
- root = et.getroot()
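- # attr = root[0].attrib  # attribute dict of the country1 node
- # attr.clear()  # removes all attributes of country1, but only in memory; the XML file itself is unchanged
- # et.write("xmlfile")  # writes the in-memory tree back to the file; only now does the XML file change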
- rank1 = root[0][0]
- print(rank1.get("updated")) # yes -- get() returns the value of the given attribute
- print(rank1.get("aaa")) # None -- returned when the attribute does not exist
- rank1.set("name", "Evan") # add an attribute to the element
- et.write("xmlfile")
- neighbor1 = root[0][3]
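- print(neighbor1.keys()) # ['name', 'direction'] on current Python 3; older versions may give a different order
- print(neighbor1.items()) # [('name', 'Austria'), ('direction', 'E')] on current Python 3; older versions may give a different order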
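The point about changes living only in memory until `write()` is called can be checked with a small round trip. This is a sketch under a few assumptions: it works on a temporary file rather than the real `xmlfile`, and it uses a shortened version of the sample document.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of the sample document.
XML = ('<data><country1 name="Liechtenstein">'
       '<rank1 updated="yes">2</rank1></country1></data>')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(XML)

    tree = ET.parse(path)
    root = tree.getroot()
    root[0].attrib.clear()           # drop all attributes of country1 (memory only)
    root[0][0].set("name", "Evan")   # add an attribute to rank1 (memory only)

    # Re-reading the file shows that it has not changed yet.
    before = dict(ET.parse(path).getroot()[0].attrib)

    tree.write(path)                 # now the file is overwritten

    saved_root = ET.parse(path).getroot()
    after = dict(saved_root[0].attrib)
    rank_attrs = dict(saved_root[0][0].attrib)

print(before)      # {'name': 'Liechtenstein'}
print(after)       # {}
print(rank_attrs)  # {'updated': 'yes', 'name': 'Evan'}
```

Keep in mind that `write()` overwrites the target file, so writing to a copy first is a safer habit when experimenting.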