Guide to Reading Excel Files Using Python
A comprehensive guide on how to read content from Excel files (.xlsx, .xls) using Python, utilizing the openpyxl
and pandas
libraries with illustrative examples.
In this article, we will use the Pandas and Openpyxl libraries to read content from an Excel file. Pandas is a powerful library for data manipulation, while Openpyxl supports reading and writing Excel files.
# Method 1: Using the openpyxl library
# Step 1: Install the openpyxl library
# Run the following command in your terminal: pip install openpyxl
from openpyxl import load_workbook
# Path to the Excel file
file_path = "example.xlsx"
# Load the Excel file
workbook = load_workbook(filename=file_path)
sheet = workbook.active # Get the first sheet
# Iterate through each row and column to read data
for row in sheet.iter_rows(values_only=True):
print(row)
# Method 2: Using the pandas library
# Step 1: Install the pandas library
# Run the following command in your terminal: pip install pandas openpyxl
import pandas as pd
# Read the Excel file
df = pd.read_excel(file_path)
# Display the content
print(df)
Detailed Explanation:
-
Method 1: Using the
openpyxl
library:- Install the library: Run
pip install openpyxl
. - Use
load_workbook()
to load the Excel file and get the first active sheet. - Use
iter_rows(values_only=True)
to iterate through rows and retrieve values.
- Install the library: Run
-
Method 2: Using the
pandas
library:- Install the library: Run
pip install pandas openpyxl
. pd.read_excel(file_path)
loads the entire content of the Excel file into a DataFrame.- You can display the data in table format using
print(df)
.
- Install the library: Run
Python Version:
The code is compatible with Python 3.6 and above along with the latest versions of the openpyxl
and pandas
libraries.
Tips:
- Ensure you have installed the correct versions of the libraries.
- Check the path to the Excel file to avoid file not found errors.
- Use a virtual environment to manage Python libraries more easily.