Convert Pandas DataFrame to Dictionary
Pandas is a library that allows us to work with DataFrame objects. Occasionally, we may need to convert these objects into dictionaries. This can be necessary when we want to use a function that only accepts dictionary input.
Knowing how to convert a DataFrame to a dictionary is therefore useful. In this tutorial, we will see several ways to turn a DataFrame into a dictionary in Pandas.
Table of Contents
- Using to_dict() method
- Dictionary with Column Values
- Using set_index() for Nested Dictionary
- Using iterrows() Method
- Conclusion
1. Using to_dict() method
The most straightforward way to convert a Pandas DataFrame to a dictionary is using the to_dict() method.
The method takes an optional orient parameter, which specifies the format of the dictionary we get as output. If orient is omitted, the default orient='dict' is used.
Here we use orient='records', which returns a list of dictionaries, where each dictionary represents one row of the DataFrame.
import pandas as pd
# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
}
df = pd.DataFrame(data)
# 👇 Convert DataFrame to dictionary
df_dict = df.to_dict(orient='records')
print("DataFrame:")
print(df)
print("\nDictionary:")
print(df_dict)
Output:
DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

Dictionary:
[{'Name': 'Alice', 'Age': 25, 'City': 'NY'}, {'Name': 'Bob', 'Age': 30, 'City': 'LA'}, {'Name': 'Charlie', 'Age': 35, 'City': 'SF'}]
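Because orient='records' yields ordinary Python dictionaries, each row can be read by column name, and the list can be passed back to pd.DataFrame() to rebuild the table. The following is a minimal sketch using the same sample data; the variable names records and restored are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
})

records = df.to_dict(orient='records')

# Each element is a plain dict, so values are looked up by column name
for row in records:
    print(f"{row['Name']} is {row['Age']} and lives in {row['City']}")

# Round trip: a list of row dictionaries can be turned back into a DataFrame
restored = pd.DataFrame(records)
print(restored.equals(df))
```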
The orient parameter accepts the following values (dict is the default):
- dict: Return type =>
{column_name_1: {index_1: value_1, index_2: value_2, ...}, column_name_2: {index_1: value_1, index_2: value_2, ...}, ...}
- list: Return type =>
{column_name_1: [value_1, value_2, ...], column_name_2: [value_1, value_2, ...], ...}
- series: Return type =>
{column_name_1: Series(values of column 1), column_name_2: Series(values of column 2), ...}
- split: Return type =>
{'index': [index_1, index_2, ...], 'columns': [column_name_1, column_name_2, ...], 'data': [[value_1, value_2, ...], [value_1, value_2, ...], ...]}
- index: Return type =>
{index_1: {column_name_1: value_1, column_name_2: value_2, ...}, index_2: {column_name_1: value_1, column_name_2: value_2, ...}, ...}
- records: Return type =>
[{column_name_1: value_1, column_name_2: value_2, ...}, {column_name_1: value_1, column_name_2: value_2, ...}, ...]
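To make the differences between the orient values concrete, the sketch below applies several of them to one small DataFrame. The two-row sample and the variable names (as_dict, as_list, and so on) are assumptions chosen for brevity, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

as_dict = df.to_dict()                    # default, same as orient='dict'
as_list = df.to_dict(orient='list')
as_split = df.to_dict(orient='split')
as_index = df.to_dict(orient='index')
as_series = df.to_dict(orient='series')   # values are pandas Series objects

print(as_dict)    # column -> {row label -> value}
print(as_list)    # column -> list of values
print(as_split)   # separate 'index', 'columns' and 'data' entries
print(as_index)   # row label -> {column -> value}
print(type(as_series['Age']))
```

Note that only dict, list, split, index, and records produce structures made entirely of plain Python containers; series keeps each column as a pandas Series.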
2. Dictionary with Column Values
We can use a dictionary comprehension to create a dictionary with column names as keys and the list of values in each column as values.
import pandas as pd
# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
}
df = pd.DataFrame(data)
# 👇 Convert DataFrame to dictionary
df_dict = {col: df[col].tolist() for col in df.columns}
print("DataFrame:")
print(df)
print("\nDictionary:")
print(df_dict)
Output:
DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

Dictionary:
{'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['NY', 'LA', 'SF']}
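For this sample data, the comprehension gives the same result as to_dict(orient='list'), which the sketch below checks. It also shows one possible variation, pairing the values of one column with those of another via dict(zip(...)); the name name_to_age is hypothetical, and this variation assumes the key column has no duplicates.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
})

# Column name -> list of values, built two ways
comprehension = {col: df[col].tolist() for col in df.columns}
built_in = df.to_dict(orient='list')
print(comprehension == built_in)

# Variation: values of one column as keys, values of another as values
name_to_age = dict(zip(df['Name'], df['Age']))
print(name_to_age)
```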
3. Using set_index() for Nested Dictionary
If your DataFrame has a column whose values are unique, you can use the set_index() method to make that column the index and then create a nested dictionary. For example, a username, ID, or email column is usually unique.
The idea is to use the values of the unique column as the keys of the outer dictionary and the remaining columns as the values of each inner dictionary.
import pandas as pd
# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
}
df = pd.DataFrame(data)
# Set 'Name' column as the index
df_indexed = df.set_index('Name')
# 👇 Convert DataFrame to nested dictionary
nested_dict = df_indexed.to_dict(orient='index')
print("DataFrame:")
print(df)
print("\nNested Dictionary:")
print(nested_dict)
Output:
DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

Nested Dictionary:
{'Alice': {'Age': 25, 'City': 'NY'}, 'Bob': {'Age': 30, 'City': 'LA'}, 'Charlie': {'Age': 35, 'City': 'SF'}}
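The uniqueness requirement matters in practice: to_dict(orient='index') raises a ValueError when the index contains duplicates. The sketch below uses a modified sample with a repeated name to show the check and one possible workaround. Keeping the first row per name with drop_duplicates() is only an assumption for illustration; the right policy depends on your data.

```python
import pandas as pd

# Sample data with a repeated name (hypothetical, for illustration)
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Alice'],
    'Age': [25, 30, 41],
    'City': ['NY', 'LA', 'SF']
})

indexed = df.set_index('Name')
print(indexed.index.is_unique)

# A non-unique index cannot be converted with orient='index'
error_message = None
try:
    indexed.to_dict(orient='index')
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# One option: keep the first row for each name before setting the index
deduped = df.drop_duplicates(subset='Name', keep='first').set_index('Name')
nested = deduped.to_dict(orient='index')
print(nested)
```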
4. Using iterrows() Method
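We can also use the iterrows() method to iterate over the rows of the DataFrame and convert each row to a dictionary. This approach is less efficient than to_dict(), because it builds a Series object for every row, so it is best kept for small DataFrames.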
import pandas as pd
# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
}
df = pd.DataFrame(data)
# 👇 Convert DataFrame to dictionary using iterrows()
iterrows_dict = {index: row.to_dict() for index, row in df.iterrows()}
print("DataFrame:")
print(df)
print("\nDictionary using iterrows():")
print(iterrows_dict)
Output:
DataFrame:
      Name  Age City
0    Alice   25   NY
1      Bob   30   LA
2  Charlie   35   SF

Dictionary using iterrows():
{0: {'Name': 'Alice', 'Age': 25, 'City': 'NY'}, 1: {'Name': 'Bob', 'Age': 30, 'City': 'LA'}, 2: {'Name': 'Charlie', 'Age': 35, 'City': 'SF'}}
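For this sample data, the dictionary built with iterrows() has the same shape and content as the one returned by to_dict(orient='index'). The sketch below compares the two; the variable names via_iterrows and via_to_dict are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'SF']
})

# Row-by-row construction
via_iterrows = {index: row.to_dict() for index, row in df.iterrows()}

# Single built-in call producing the same row label -> row dict layout
via_to_dict = df.to_dict(orient='index')

print(via_iterrows == via_to_dict)
```

The explicit loop is mainly useful when each row needs extra processing, such as filtering or renaming keys, before it goes into the dictionary.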
Conclusion
Converting a Pandas DataFrame to a dictionary makes it easier to manipulate the data with plain Python and to pass it to other Python libraries. Depending on your specific needs, you can use the to_dict() method for a simple conversion, build a dictionary of column values with a comprehension, create a nested dictionary with set_index(), or construct the dictionary row by row with iterrows().