How to remove MultiIndex columns in Pandas
This article explains how to remove MultiIndex columns from a Pandas DataFrame by flattening them into a single level, which helps when the extra column levels are no longer needed.
Pandas provides a MultiIndex feature that gives rows or columns multiple index levels. That structure suits hierarchical data, but single-level column names are often easier to select, export, and merge. The example below joins the level labels of each column into one string.
Python Code
import pandas as pd
# Create a DataFrame with MultiIndex columns
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=index)
# Display the original DataFrame
print("DataFrame with MultiIndex columns:")
print(df)
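# Flatten the MultiIndex: join each column's level labels with an underscore
# (str.join works here because every level label is a string)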
df.columns = ['_'.join(col) for col in df.columns]
# Display the DataFrame after removing MultiIndex columns
print("\nDataFrame after removing MultiIndex columns:")
print(df)
Detailed explanation:
- import pandas as pd: Imports the Pandas library to work with DataFrames.
- arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]: Defines two lists of labels, one list per column level.
- tuples = list(zip(*arrays)): Pairs the labels with zip to produce one tuple per column, such as ('A', 'one').
- index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second']): Creates the MultiIndex from the list of tuples and names its two levels.
- df = pd.DataFrame(...): Creates a DataFrame with MultiIndex columns.
- df.columns = ['_'.join(col) for col in df.columns]: Joins the level labels of each column into a single string, such as A_one, which replaces the MultiIndex with a flat index.
- print(df): Prints the DataFrame after the MultiIndex columns have been flattened.
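The code above flattens the levels into one string. If the goal is to discard a level completely instead, the following is a minimal sketch using DataFrame.droplevel and DataFrame.xs on the same sample data. The variable names dropped and only_a are illustrative, not part of the original example.

```python
import pandas as pd

# Same sample data as in the article
columns = pd.MultiIndex.from_tuples(
    [("A", "one"), ("A", "two"), ("B", "one"), ("B", "two")],
    names=["first", "second"],
)
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

# Drop the outer level entirely; only the labels of 'second' remain.
# Note that this leaves duplicate column names in this data set.
dropped = df.droplevel("first", axis=1)
print(dropped.columns.tolist())  # ['one', 'two', 'one', 'two']

# To avoid duplicates, select one group of the outer level first.
# xs removes the selected level from the result by default.
only_a = df.xs("A", axis=1, level="first")
print(only_a.columns.tolist())  # ['one', 'two']
```

Dropping a level is only safe when the remaining labels are still unique, or when duplicates are acceptable for the task at hand. Otherwise, joining the levels as shown earlier keeps every column distinguishable.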
System requirements:
- Python 3.6 or above
- Pandas version 1.0.0 or newer
How to install the libraries needed to run the Python code above:
Use pip to install Pandas:
pip install pandas
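After installing, a quick way to confirm that the environment meets the requirements listed above is a short check like the one below. This is only a sketch: it compares the major and minor version numbers and assumes the Pandas version string starts with two numeric parts separated by dots.

```python
import sys

import pandas as pd

# Python 3.6 or above
python_ok = sys.version_info >= (3, 6)

# Pandas 1.0.0 or newer: compare only the major and minor parts
pandas_version = tuple(int(part) for part in pd.__version__.split(".")[:2])
pandas_ok = pandas_version >= (1, 0)

print("Python version OK:", python_ok)
print("Pandas version OK:", pandas_ok, "(found " + pd.__version__ + ")")
```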
Tips:
- MultiIndex is useful for working with complex data, but if not needed, consider flattening it for easier manipulation.
- You can customize how the levels of the MultiIndex are joined by using a different character instead of _.
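To illustrate the last tip, here is a small sketch of a helper that takes the separator as a parameter. The function name flatten_columns and its sep argument are hypothetical, introduced only for this example. It also converts each label with str, since a plain join fails when a level contains non-string labels such as integers.

```python
import pandas as pd


def flatten_columns(df, sep="_"):
    """Return a copy of df with MultiIndex columns joined into single strings."""
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # str() guards against non-string labels such as integers
        out.columns = [sep.join(str(label) for label in col) for col in out.columns]
    return out


# The second level uses integers to show why str() is needed
columns = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)],
    names=["first", "second"],
)
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

flat = flatten_columns(df, sep=".")
print(flat.columns.tolist())  # ['A.1', 'A.2', 'B.1', 'B.2']
```

Returning a copy leaves the original DataFrame and its MultiIndex untouched, so both layouts stay available if later steps still need the hierarchy.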