Commonly used functions in the Pandas library and how to use them
Pandas is a powerful Python library for data manipulation and analysis. It provides functions for creating, querying, processing, and transforming data. This article lists 20 commonly used Pandas functions and attributes and shows how to use each one.
1. pd.read_csv()
Read data from a CSV file into a DataFrame.
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
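pd.read_csv() also accepts optional parameters that control what is loaded. The following is a minimal sketch, assuming a small hypothetical CSV held in memory (io.StringIO stands in for a real file such as data.csv); the column names are illustrative only.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "Name,Age,City\nAlice,25,Paris\nBob,30,London\nCharlie,35,Berlin\n"

# usecols limits which columns are loaded; dtype sets a column type explicitly.
people = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Name", "Age"],
    dtype={"Age": "int64"},
)
print(people.head())
```

Loading only the columns you need can noticeably reduce memory use on wide files.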
2. pd.DataFrame()
Create a DataFrame from data.
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)
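A DataFrame can also be built from a list of records. This sketch assumes hypothetical records in which one entry lacks an Age value, to show that Pandas fills the gap with NaN.

```python
import pandas as pd

# Hypothetical records; the last one has no "Age" key.
records = [
    {"Name": "Alice", "Age": 25},
    {"Name": "Bob", "Age": 30},
    {"Name": "Charlie"},
]

# Each dict becomes a row; missing keys become NaN in that row.
people = pd.DataFrame(records)
print(people)
```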
3. df.head()
View the first 5 rows of the DataFrame. Five is the default; pass a number, such as df.head(10), to change it.
print(df.head())
4. df.tail()
View the last 5 rows of the DataFrame. Five is the default; pass a number, such as df.tail(10), to change it.
print(df.tail())
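Both head() and tail() take an optional row count. A short sketch, assuming a hypothetical ten-row DataFrame:

```python
import pandas as pd

# Hypothetical data: a single column holding the values 0..9.
numbers = pd.DataFrame({"value": range(10)})

first_three = numbers.head(3)  # rows at positions 0, 1, 2
last_two = numbers.tail(2)     # rows at positions 8, 9

print(first_three)
print(last_two)
```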
5. df.info()
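View general information about the DataFrame: the index range, column names, non-null counts, data types, and memory usage. df.info() prints its report directly and returns None, so it does not need to be wrapped in print().
df.info()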
6. df.describe()
Compute summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for the numeric columns of the DataFrame.
print(df.describe())
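By default describe() covers only numeric columns. The sketch below, using the same illustrative Name and Age data as earlier, shows how include="all" adds text columns to the summary.

```python
import pandas as pd

people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]}
)

# Numeric columns only (the default behaviour).
numeric_stats = people.describe()

# include="all" also summarises non-numeric columns (unique, top, freq).
all_stats = people.describe(include="all")

print(numeric_stats)
print(all_stats)
```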
7. df.shape
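Return the number of rows and columns of the DataFrame as a tuple (rows, columns). shape is an attribute, not a method, so it is written without parentheses.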
print(df.shape)
8. df.columns
Get the column names as an Index object. Use list(df.columns) if you need a plain Python list.
print(df.columns)
9. df.dtypes
Return the data type of each column.
print(df.dtypes)
10. df['column_name']
Access a column of the DataFrame.
print(df['Name'])
11. df.loc[]
Access rows by index label (label-based). With the default index, the labels are the integers 0, 1, 2, and so on.
print(df.loc[0])
12. df.iloc[]
Access rows by integer position (position-based), counting from 0 regardless of the index labels.
print(df.iloc[0])
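The difference between loc and iloc is easier to see when the index is not the default 0, 1, 2. The sketch below assumes hypothetical string labels "a", "b", "c" for the index.

```python
import pandas as pd

people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],  # hypothetical string labels
)

by_label = people.loc["b"]     # row whose label is "b"
by_position = people.iloc[1]   # second row, whatever its label

# Label slices include the end label; position slices exclude the end.
label_slice = people.loc["a":"b"]     # rows "a" and "b"
position_slice = people.iloc[0:1]     # row at position 0 only

print(by_label)
print(label_slice)
print(position_slice)
```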
13. df.drop()
Drop rows or columns from the DataFrame.
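# drop() returns a new DataFrame; df itself keeps 'Age' for the later examples
df_without_age = df.drop('Age', axis=1)
print(df_without_age)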
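drop() can remove rows as well as columns. This sketch uses the columns= and index= keywords, which are an alternative to passing axis; the data is illustrative.

```python
import pandas as pd

people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]}
)

no_age = people.drop(columns=["Age"])     # remove a column by name
no_first_row = people.drop(index=[0])     # remove a row by index label

# drop() returns new objects; the original DataFrame is unchanged.
print(no_age)
print(no_first_row)
print(people.shape)
```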
14. df.isnull()
Check for missing values in the DataFrame.
print(df.isnull())
15. df.fillna()
Fill missing values in the DataFrame.
df = df.fillna(0)
print(df)
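In practice isnull() is often combined with sum() to count missing values per column, and fillna() can take a different fill value for each column. A minimal sketch, assuming a hypothetical scores table with one incomplete row:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing name and a missing score.
scores = pd.DataFrame(
    {"Name": ["Alice", None, "Charlie"], "Score": [90.0, np.nan, 70.0]}
)

# Count missing values in each column.
missing_per_column = scores.isnull().sum()

# Fill each column with its own replacement value.
filled = scores.fillna(
    {"Name": "Unknown", "Score": scores["Score"].mean()}
)

print(missing_per_column)
print(filled)
```

Filling with the column mean is only one option; whether it is appropriate depends on the data.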
16. df.sort_values()
Sort data by one or more columns.
df = df.sort_values('Age')
print(df)
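Sorting by more than one column takes a list of column names, with an optional list of directions. The sketch below assumes illustrative data containing repeated ages.

```python
import pandas as pd

people = pd.DataFrame(
    {"Name": ["Bob", "Alice", "Charlie", "Dana"], "Age": [30, 25, 30, 25]}
)

# Oldest first; ties on Age are broken alphabetically by Name.
ordered = people.sort_values(["Age", "Name"], ascending=[False, True])

# reset_index(drop=True) renumbers the rows after sorting.
ordered = ordered.reset_index(drop=True)
print(ordered)
```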
17. df.groupby()
Group data by one or more columns.
grouped = df.groupby('Age')
# numeric_only=True skips text columns such as 'Name', which cannot be averaged
print(grouped.mean(numeric_only=True))
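Grouping is most useful when each group holds several rows. This sketch assumes a hypothetical sales table and computes two aggregates per city with agg().

```python
import pandas as pd

# Hypothetical sales data: two rows for Paris, one for London.
sales = pd.DataFrame(
    {"City": ["Paris", "Paris", "London"], "Amount": [10, 20, 5]}
)

# Select the column to aggregate, then compute several statistics at once.
summary = sales.groupby("City")["Amount"].agg(["sum", "mean"])
print(summary)
```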
18. df.merge()
Merge two DataFrames based on one or more keys.
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 2], 'Age': [25, 30]})
merged = df1.merge(df2, on='ID')
print(merged)
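By default merge() performs an inner join, keeping only keys present in both DataFrames. The sketch below assumes a third ID with no matching age and uses how="left" with indicator=True to show where each row came from.

```python
import pandas as pd

names = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
ages = pd.DataFrame({"ID": [1, 2], "Age": [25, 30]})

# how="left" keeps every row of `names`; unmatched rows get NaN for Age.
# indicator=True adds a "_merge" column describing the match.
left_joined = names.merge(ages, on="ID", how="left", indicator=True)
print(left_joined)
```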
19. df.apply()
Apply a function to each row or column of a DataFrame, or to each element of a single column (a Series), as in this example.
df['Age_plus_one'] = df['Age'].apply(lambda x: x + 1)
print(df)
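To work with whole rows, call apply() on the DataFrame with axis=1. The sketch below builds a hypothetical Label column from two existing columns, and also shows the vectorized alternative for simple arithmetic, which is usually faster than apply().

```python
import pandas as pd

people = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# axis=1 passes each row to the function as a Series.
people["Label"] = people.apply(
    lambda row: f"{row['Name']} ({row['Age']})", axis=1
)

# For simple arithmetic, operate on the whole column instead of using apply().
people["Age_plus_one"] = people["Age"] + 1
print(people)
```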
20. df.to_csv()
Save DataFrame to a CSV file.
df.to_csv('output.csv', index=False)
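When no path is given, to_csv() returns the CSV text as a string, which makes it easy to inspect the output. The sketch below writes a small illustrative DataFrame to a string and reads it back to confirm the round trip.

```python
import io

import pandas as pd

people = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# With no path argument, to_csv() returns the CSV content as a string.
csv_text = people.to_csv(index=False)
print(csv_text)

# Read the text back to check that nothing was lost.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored.equals(people))
```

index=False leaves the row index out of the file; without it, reading the file back produces an extra unnamed column.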
System requirements:
- Python 3. Pandas 2.x releases require Python 3.8 or newer; Python 3.6 is supported only by older Pandas releases.
- Pandas library (install via pip).
How to install the library:
To install the Pandas library, simply use pip:
pip install pandas
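After installation, a quick way to confirm that the library can be imported is to print its version. This is a minimal check, not a full test of the installation.

```python
import pandas as pd

# pd.__version__ holds the installed Pandas version as a string.
installed_version = pd.__version__
print(installed_version)
```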
Tips:
- When working with large datasets, you can use df.memory_usage() to see how much memory the DataFrame is consuming.
- It’s a good practice to check your data carefully before applying operations to avoid errors from invalid or missing data.
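As a sketch of the memory tip, the example below compares the default report with deep=True, which also measures the contents of text columns. The data is illustrative, and the exact byte counts depend on the platform and Pandas version.

```python
import pandas as pd

people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]}
)

# Default: per-column bytes, without inspecting the stored Python objects.
shallow = people.memory_usage()

# deep=True also counts the memory held by the values in object columns.
deep = people.memory_usage(deep=True)

print(shallow)
print(deep)
print(deep.sum())  # total bytes, including the index
```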