Working with XML Using the Python requests Library

wufei123 2025-01-05

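This article shows how to fetch XML data with Python's requests library and parse it with the xml.etree.ElementTree module. XML (Extensible Markup Language) is a text format for storing structured data. Common uses include sitemaps and RSS feeds.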

Here is an example XML file:

<breakfast_menu>
  <food>
    <name>belgian waffles</name>
    <price>$5.95</price>
    <description>two of our famous belgian waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>strawberry belgian waffles</name>
    <price>$7.95</price>
    <description>light belgian waffles covered with strawberries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>berry-berry belgian waffles</name>
    <price>$8.95</price>
    <description>light belgian waffles covered with an assortment of fresh berries and whipped cream</description>
    <calories>900</calories>
  </food>
  <food>
    <name>french toast</name>
    <price>$4.50</price>
    <description>thick slices made from our homemade sourdough bread</description>
    <calories>600</calories>
  </food>
  <food>
    <name>homestyle breakfast</name>
    <price>$6.95</price>
    <description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
    <calories>950</calories>
  </food>
</breakfast_menu>

This example has a breakfast_menu root element that contains several food elements. Each food element has name, price, description, and calories child elements.
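
To make that structure concrete, the sketch below parses a shortened copy of the menu from a string and inspects the root and its children. The `SAMPLE_XML` constant is an illustrative stand-in (not part of the original article) so the example runs without network access.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical copy of the sample menu, embedded for offline use.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>belgian waffles</name>
    <price>$5.95</price>
    <description>two of our famous belgian waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>french toast</name>
    <price>$4.50</price>
    <description>thick slices made from our homemade sourdough bread</description>
    <calories>600</calories>
  </food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE_XML)

# The root element's tag
print(root.tag)

# Iterating over an element yields its direct children
foods = list(root)
print(len(foods))

# Tags of the children of the first food element, in document order
first_food_tags = [child.tag for child in foods[0]]
print(first_food_tags)
```

An Element behaves like a sequence of its direct children, which is why `list(root)` returns only the food elements and not their nested fields.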

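Next, we will parse this kind of XML data with Python. First, set up the development environment:

Install the required libraries (xml.etree.ElementTree ships with Python's standard library, so only requests needs to be installed):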

sudo apt install python3 python3-venv -y  # Debian/Ubuntu; python3-venv provides the venv module
python3 -m venv env  # create a virtual environment
source env/bin/activate  # activate the virtual environment
pip3 install requests

Create a file named main.py and enter the following code:

Step 1: Get all tag names

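import requests
import xml.etree.ElementTree as ET

# Download the sample XML document and stop early on HTTP errors
response = requests.get('https://www.w3schools.com/xml/simple.xml', timeout=10)
response.raise_for_status()
# Parse the raw bytes so the parser can honor the encoding declared in the XML
root = ET.fromstring(response.content)
# '*' matches every element: the root first, then all of its descendants
for item in root.iter('*'):
    print(item.tag)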

This prints the tag name of every element in the XML document, starting with the root.
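
Because every element is visited, the same tag name is printed many times. As a small variation, the sketch below counts how often each tag occurs. It works on an embedded, shortened sample (`SAMPLE_XML` is a hypothetical stand-in for the downloaded document) so it can run offline.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical shortened sample; the real document has more foods and fields.
SAMPLE_XML = """<breakfast_menu>
  <food><name>belgian waffles</name><calories>650</calories></food>
  <food><name>french toast</name><calories>600</calories></food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE_XML)

# iter() walks the root and all descendants in document order
tag_counts = Counter(element.tag for element in root.iter())

# Counter keeps first-seen order, so this lists each distinct tag once
unique_tags = list(tag_counts)

for tag, count in tag_counts.items():
    print(f'{tag}: {count}')
```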

Step 2: Extract the values of specific elements

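import requests
import xml.etree.ElementTree as ET

response = requests.get('https://www.w3schools.com/xml/simple.xml', timeout=10)
response.raise_for_status()  # stop early on HTTP errors
root = ET.fromstring(response.content)
# iterfind('food') yields the food elements that are direct children of the root
for item in root.iterfind('food'):
    # findtext() returns the text of the first matching child, or None if absent
    print(item.findtext('name'))
    print(item.findtext('price'))
    print(item.findtext('description'))
    print(item.findtext('calories'))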

This prints the name, price, description, and calories of each food item, one value per line.
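
The extracted values are all strings. If they are needed as data rather than printed text, one possible approach is to convert each food element into a dictionary with typed fields. The sketch below is only an illustration: `food_to_dict` and `SAMPLE_XML` are hypothetical names, and it assumes prices are written with a leading `$` as in the sample.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical sample; the second food deliberately lacks a calories element.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>belgian waffles</name>
    <price>$5.95</price>
    <calories>650</calories>
  </food>
  <food>
    <name>french toast</name>
    <price>$4.50</price>
  </food>
</breakfast_menu>"""


def food_to_dict(food):
    """Convert one food element into a dict, using None for missing fields."""
    # Assumption: prices carry a leading dollar sign, as in the sample data
    price_text = food.findtext('price', default='').lstrip('$')
    calories_text = food.findtext('calories')  # None when the element is absent
    return {
        'name': food.findtext('name', default=''),
        'price': Decimal(price_text) if price_text else None,
        'calories': int(calories_text) if calories_text else None,
    }


root = ET.fromstring(SAMPLE_XML)
menu = [food_to_dict(food) for food in root.iterfind('food')]
for entry in menu:
    print(entry)
```

Decimal is used here instead of float so that prices keep their exact two-digit form; a float would also work for rough calculations.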

Step 3: Format the output

To make the results easier to read, we can print each food item on a single labeled line:

import requests
import xml.etree.ElementTree as ET

response = requests.get('https://www.w3schools.com/xml/simple.xml', timeout=10)
response.raise_for_status()  # stop early on HTTP errors
root = ET.fromstring(response.content)
for item in root.iterfind('food'):
    # Combine the four fields of each food into one labeled line
    print('Name: {}, Price: {}, Description: {}, Calories: {}'.format(
        item.findtext('name'), item.findtext('price'),
        item.findtext('description'), item.findtext('calories')))
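
As an alternative to one long labeled line, the fields can be aligned in columns using f-string width specifiers. This is a minimal sketch rather than part of the original tutorial: it uses an embedded sample (`SAMPLE_XML`, a hypothetical stand-in for the downloaded document) and omits the description column to keep the rows short.

```python
import xml.etree.ElementTree as ET

# Hypothetical shortened sample so the example runs without network access.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>belgian waffles</name>
    <price>$5.95</price>
    <calories>650</calories>
  </food>
  <food>
    <name>french toast</name>
    <price>$4.50</price>
    <calories>600</calories>
  </food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE_XML)

# Collect (name, price, calories) as strings; missing fields become ''
rows = [
    (
        food.findtext('name', default=''),
        food.findtext('price', default=''),
        food.findtext('calories', default=''),
    )
    for food in root.iterfind('food')
]

# Make the name column as wide as the longest name (or the header)
name_width = max([len('Name')] + [len(name) for name, _, _ in rows])

# Left-align names, right-align the numeric-looking columns
lines = [f"{'Name':<{name_width}}  {'Price':>6}  {'Calories':>8}"]
for name, price, calories in rows:
    lines.append(f"{name:<{name_width}}  {price:>6}  {calories:>8}")

print('\n'.join(lines))
```

The fixed widths of 6 and 8 for the price and calories columns are assumptions that fit the sample values; longer values would still print but would push the columns out of alignment.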