Reading and Writing XML Files in Python - GeeksforGeeks (2024)

Last Updated : 10 Aug, 2024

Comments

Improve

Extensible Markup Language, commonly known as XML is a language designed specifically to be easy to interpret by both humans and computers altogether. The language defines a set of rules used to encode a document in a specific format. In this article, methods have been described to read and write XML files in python.

Note: In general, the process of reading the data from an XML file and analyzing its logical components is known as Parsing. Therefore, when we refer to reading a xml file we are referring to parsing the XML document.

In this article, we would take a look at two libraries that could be used for the purpose of xml parsing. They are:

  • BeautifulSoup used alongside the lxml parser
  • Elementtree library.

Using BeautifulSoup alongside with lxml parser

For the purpose of reading and writing the xml file we would be using a Python library named BeautifulSoup. In order to install the library, type the following command into the terminal.

pip install beautifulsoup4

Beautiful Soup supports the HTML parser included in Python’s standard library, but it also supports a number of third-party Python parsers. One is the lxml parser (used for parsing XML/HTML documents). lxml could be installed by running the following command in the command processor of your Operating system:

pip install lxml

Firstly we will learn how to read from an XML file. We would also parse data stored in it. Later we would learn how to create an XML file and write data to it.

Reading Data From an XML File

There are two steps required to parse a xml file:-

  • Finding Tags
  • Extracting from tags

Example:

XML File used:

Reading and Writing XML Files in Python - GeeksforGeeks (1)

Python3
from bs4 import BeautifulSoup# Reading the data inside the xml# file to a variable under the name # datawith open('dict.xml', 'r') as f: data = f.read()# Passing the stored data inside# the beautifulsoup parser, storing# the returned object Bs_data = BeautifulSoup(data, "xml")# Finding all instances of tag # `unique`b_unique = Bs_data.find_all('unique')print(b_unique)# Using find() to extract attributes # of the first instance of the tagb_name = Bs_data.find('child', {'name':'Frank'})print(b_name)# Extracting the data stored in a# specific attribute of the # `child` tagvalue = b_name.get('test')print(value)

OUTPUT:

Reading and Writing XML Files in Python - GeeksforGeeks (2)

Writing an XML File

Writing a xml file is a primitive process, reason for that being the fact that xml files aren’t encoded in a special way. Modifying sections of a xml document requires one to parse through it at first. In the below code we would modify some sections of the aforementioned xml document.

Example:

Python3
from bs4 import BeautifulSoup# Reading data from the xml filewith open('dict.xml', 'r') as f: data = f.read()# Passing the data of the xml# file to the xml parser of# beautifulsoupbs_data = BeautifulSoup(data, 'xml')# A loop for replacing the value# of attribute `test` to WHAT !!# The tag is found by the clause# `bs_data.find_all('child', {'name':'Frank'})`for tag in bs_data.find_all('child', {'name':'Frank'}): tag['test'] = "WHAT !!"# Output the contents of the # modified xml fileprint(bs_data.prettify())

Output:

Reading and Writing XML Files in Python - GeeksforGeeks (3)

Using Elementtree

Elementtree module provides us with a plethora of tools for manipulating XML files. The best part about it being its inclusion in the standard Python’s built-in library. Therefore, one does not have to install any external modules for the purpose. Due to the xmlformat being an inherently hierarchical data format, it is a lot easier to represent it by a tree. The module provides ElementTree provides methods to represent whole XML document as a single tree.

In the later examples, we would take a look at discrete methods to read and write data to and from XML files.

Reading XML Files

To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree.parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot(). Then displayed (printed) the root tag of our xml file (non-explicit way). Then displayed the attributes of the sub-tag of our parent tag using root[0].attrib. root[0] for the first tag of parent root and attrib for getting it’s attributes. Then we displayed the text enclosed within the 1st sub-tag of the 5th sub-tag of the tag root.

Example:

Python3
# importing element tree# under the alias of ETimport xml.etree.ElementTree as ET# Passing the path of the# xml document to enable the# parsing processtree = ET.parse('dict.xml')# getting the parent tag of# the xml documentroot = tree.getroot()# printing the root (parent) tag# of the xml document, along with# its memory locationprint(root)# printing the attributes of the# first tag from the parent print(root[0].attrib)# printing the text contained within# first subtag of the 5th tag from# the parentprint(root[5][0].text)

Output:

Reading and Writing XML Files in Python - GeeksforGeeks (4)

Writing XML Files

Now, we would take a look at some methods which could be used to write data on an xml document. In this example we would create a xml file from scratch.

To do the same, firstly, we create a root (parent) tag under the name of chess using the command ET.Element(‘chess’). All the tags would fall underneath this tag, i.e. once a root tag has been defined, other sub-elements could be created underneath it. Then we created a subtag/subelement named Opening inside the chess tag using the command ET.SubElement(). Then we created two more subtags which are underneath the tag Opening named E4 and D4. Then we added attributes to the E4 and D4 tags using set() which is a method found inside SubElement(), which is used to define attributes to a tag. Then we added text between the E4 and D4 tags using the attribute text found inside the SubElement function. In the end we converted the datatype of the contents we were creating from ‘xml.etree.ElementTree.Element’ to bytes object, using the command ET.tostring() (even though the function name is tostring() in certain implementations it converts the datatype to `bytes` rather than `str`). Finally, we flushed the data to a file named gameofsquares.xml which is a opened in `wb` mode to allow writing binary data to it. In the end, we saved the data to our file.

Example:

Python3
import xml.etree.ElementTree as ET# This is the parent (root) tag # onto which other tags would be# createddata = ET.Element('chess')# Adding a subtag named `Opening`# inside our root tagelement1 = ET.SubElement(data, 'Opening')# Adding subtags under the `Opening`# subtag s_elem1 = ET.SubElement(element1, 'E4')s_elem2 = ET.SubElement(element1, 'D4')# Adding attributes to the tags under# `items`s_elem1.set('type', 'Accepted')s_elem2.set('type', 'Declined')# Adding text between the `E4` and `D5` # subtags_elem1.text = "King's Gambit Accepted"s_elem2.text = "Queen's Gambit Declined"# Converting the xml data to byte object,# for allowing flushing data to file # streamb_xml = ET.tostring(data)# Opening a file under the name `items2.xml`,# with operation mode `wb` (write + binary)with open("GFG.xml", "wb") as f: f.write(b_xml)

Output:

Reading and Writing XML Files in Python - GeeksforGeeks (5)


Reading and Writing XML Files in Python – FAQs

How to Read XML File Using Python?

To read XML files in Python, you can use the xml.etree.ElementTree module, which provides a simple and efficient API for parsing and creating XML data.

Example of Reading an XML File:

import xml.etree.ElementTree as ET

# Load and parse the XML file
tree = ET.parse('example.xml')
root = tree.getroot()

# Print out the tag of the root and all child tags
print(root.tag)
for child in root:
print(child.tag, child.attrib)

This script will parse an XML file named example.xml and print the root element’s tag and the tags of its direct children along with their attributes.

How to Write an XML File in Python?

Writing an XML file can also be done using the xml.etree.ElementTree module. Here’s how you can create and write to an XML file:

import xml.etree.ElementTree as ET

# Create the file structure
root = ET.Element('data')
items = ET.SubElement(root, 'items')
item1 = ET.SubElement(items, 'item')
item1.set('name', 'item1')
item1.text = 'item1description'

# Create a new XML file with the results
tree = ET.ElementTree(root)
tree.write('new_items.xml')

This will create an XML file called new_items.xml with the specified structure.

How to Pass XML File as Parameter in Python?

Passing an XML file as a parameter in Python typically involves passing the file path as a string to a function that handles file reading or processing.

def process_xml(file_path):
tree = ET.parse(file_path)
root = tree.getroot()
# Further processing here

# Call the function with the path to your XML file
process_xml('path_to_file.xml')

What Language is XML?

XML (Extensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is primarily used to facilitate the sharing of data across different information systems, particularly via the Internet, and is used both for encoding documents and serializing data.

How to Read XML File to CSV in Python?

To convert XML to CSV in Python, you can parse the XML using ElementTree and then use the csv module to write the parsed data to a CSV file.

Example of Converting XML to CSV:

import xml.etree.ElementTree as ET
import csv

# Parse the XML file
tree = ET.parse('example.xml')
root = tree.getroot()

# Open a file for writing
with open('output.csv', 'w', newline='') as file:
writer = csv.writer(file)
writer.writerow(['Header1', 'Header2']) # Optionally write headers

# Iterate through each child of the root
for child in root:
row = [child.find('Header1').text, child.find('Header2').text]
writer.writerow(row)

This script will read an XML file and write specific parts of its data to a CSV file named output.csv. Adjust the field names and structure according to your XML file’s layout



V

VasuDev4

Reading and Writing XML Files in Python - GeeksforGeeks (6)

Improve

Next Article

Reading and Writing CSV Files in Python

Please Login to comment...

Reading and Writing XML Files in Python - GeeksforGeeks (2024)
Top Articles
Q' Cards - MoneySavers.co.nz
Spain beat Germany live updates
# كشف تسربات المياه بجدة: أهمية وفوائد
Zuercher Portal Inmates Clinton Iowa
The Civil Rights Movement: A Very Short Introduction
Tripadvisor London Forum
Csuf Mail
Busted Mugshots Rappahannock Regional Jail
Solo Player Level 2K23
Temu Beanies
United Center Section 305
Craigslist Carroll Iowa
Un-Pc Purchase Crossword Clue
'A Cure for Wellness', Explained
Entegra Forum
New & Used Motorcycles for Sale | NL Classifieds
Walmart Tires Hours
What Does Fox Stand For In Fox News
Huniepop Jessie Questions And Answers
Chester Farmers Market vendor Daddy's a Hooker: Ed Lowery happy fiber artist for 65 years
Does Publix Pharmacy Accept Sunshine Health
Five Guys Calorie Calculator
German American Bank Owenton Ky
Gay Cest Com
Coleman Funeral Home Olive Branch Ms Obituaries
Holly Ranch Aussie Farm
Rockcastle County Schools Calendar
Central Nj Craiglist
Seattle Clipper Vacations Ferry Terminal Amtrak
Genova Nail Spa Pearland Photos
One Piece Chapter 1077 Tcb
Gw2 Titles
Uscis Fort Myers 3850 Colonial Blvd
Venus Nail Lounge Lake Elsinore
Savannah Riverboat Cruise Anniversary Package
Eros Cherry Hill
Restored Republic December 1 2022
Chicken Coop Brookhaven Ms
Are Huntington Home Candles Toxic
Slim Thug’s Wealth and Wellness: A Journey Beyond Music
Lesley Ann Downey Transcript
Riverry Studio
CareCredit Lawsuit - Illegal Credit Card Charges And Fees
Business Banking Online | Huntington
Sherwin Williams Buttercream
Chets Rental Chesterfield
Craigslist Boats For Sale By Owner Sacramento
Alles, was ihr über Saison 03 von Call of Duty: Warzone 2.0 und Call of Duty: Modern Warfare II wissen müsst
South Carolina Craigslist Motorcycles
Wgu Admissions Login
Hr Central Luxottica Benefits
Cambridge Assessor Database
Latest Posts
Article information

Author: Eusebia Nader

Last Updated:

Views: 6087

Rating: 5 / 5 (60 voted)

Reviews: 91% of readers found this page helpful

Author information

Name: Eusebia Nader

Birthday: 1994-11-11

Address: Apt. 721 977 Ebert Meadows, Jereville, GA 73618-6603

Phone: +2316203969400

Job: International Farming Consultant

Hobby: Reading, Photography, Shooting, Singing, Magic, Kayaking, Mushroom hunting

Introduction: My name is Eusebia Nader, I am a encouraging, brainy, lively, nice, famous, healthy, clever person who loves writing and wants to share my knowledge and understanding with you.