Pandas Count Unique Values in Column
Counting unique values in a column is a fundamental operation in data analysis, and Pandas offers various methods to accomplish this task efficiently.
One common reason to count unique values is to find the cardinality of a column, that is, how many distinct values it contains.
In this article, you will learn multiple ways to count unique values in a column.
Table of Contents
- Using nunique() method
- Using value_counts() method
- Using set() and len() methods
- Using groupby() and size() methods
- Conclusion
1. Using nunique() method
The nunique() method returns the number of unique values in a column. By default it does not count missing values (NaN).
Select the column whose unique values you want to count and call nunique() on it.
import pandas as pd
# Creating a sample DataFrame
data = {'Category': ['A', 'B', 'C', 'A', 'B', 'D']}
df = pd.DataFrame(data)
# 👇 Count unique values in 'Category' column
unique_count = df['Category'].nunique()
print(f"Number of unique values in 'Category': {unique_count}")
Output:
Number of unique values in 'Category': 4
The output shows that the 'Category' column has 4 unique values: A, B, C and D.
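Missing values affect the result, so the following is a minimal sketch of how the dropna parameter changes the count, and of calling nunique() on a whole DataFrame to get one count per column. The 'Score' column and the missing entries are hypothetical sample data, not part of the example above.

```python
import pandas as pd

# Hypothetical sample data with one missing value in each column
df = pd.DataFrame({
    'Category': ['A', 'B', None, 'A'],
    'Score': [1.0, 2.0, 2.0, None],
})

# Default behaviour: missing values are not counted
without_missing = df['Category'].nunique()

# dropna=False counts the missing value as one more distinct value
with_missing = df['Category'].nunique(dropna=False)

# Calling nunique() on the DataFrame returns one count per column
per_column = df.nunique()

print(f"Without missing: {without_missing}, with missing: {with_missing}")
print(per_column)
```

Use dropna=False when a missing value should be treated as a category of its own.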
2. Using value_counts() method
The value_counts() method is another way to look at the unique values in a column. Instead of returning a single number for the whole column, it returns a Series that holds the number of occurrences of each unique value.
Let's see how to use it.
import pandas as pd
# Creating a sample DataFrame
data = {'Category': ['A', 'B', 'C', 'A', 'B', 'D']}
df = pd.DataFrame(data)
# 👇 Count occurrences of each unique value in 'Category' column
unique_count = df['Category'].value_counts()
print(f"Unique values in 'Category':\n{unique_count}")
Output:
Unique values in 'Category':
A    2
B    2
C    1
D    1
Name: Category, dtype: int64
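As you can see, the value_counts() method returns a Series with the count of each unique value in the column, sorted from most to least frequent. In pandas 2.0 and later the result is named count rather than after the column, so the last line reads Name: count and the column name appears above the values as the index name.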
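Because the result has one row per unique value, its length gives the same number that nunique() returns. The sketch below shows this, along with the normalize=True option that returns proportions instead of raw counts. It reuses the sample data from this section; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'A', 'B', 'D']})

# Occurrences of each unique value
counts = df['Category'].value_counts()

# One row per distinct value, so the length is the number of unique values
unique_count = len(counts)

# normalize=True returns the share of rows instead of the raw count
shares = df['Category'].value_counts(normalize=True)

print(f"Number of unique values in 'Category': {unique_count}")
print(shares)
```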
3. Using set() and len() methods
In this example we use Python's built-in set() and len() functions to count unique values in a column. Converting the column to a set keeps only the distinct entries, and len() returns how many entries the set contains.
import pandas as pd
# Creating a sample DataFrame
data = {'Category': ['A', 'B', 'C', 'A', 'B', 'D']}
df = pd.DataFrame(data)
# 👇 Count unique values in 'Category' column
unique_count = len(set(df['Category']))
print(f"Number of unique values in 'Category': {unique_count}")
Output:
Number of unique values in 'Category': 4
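A closely related option, not covered above, is the Series.unique() method, which returns the distinct values themselves in order of first appearance. The following is a small sketch comparing it with the set() approach on the same sample data; for a column without missing values both give the same count.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'A', 'B', 'D']})

# unique() returns the distinct values in order of first appearance
distinct_values = df['Category'].unique()
count_from_unique = len(distinct_values)

# set() also keeps only distinct entries, but does not preserve order
count_from_set = len(set(df['Category']))

print(f"Distinct values: {list(distinct_values)}")
print(f"Counts: {count_from_unique} (unique), {count_from_set} (set)")
```

Prefer unique() when you also need to inspect the values and their order matters.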
4. Using groupby() and size() methods
The groupby() method groups the DataFrame by a column, and the size() method returns the number of rows in each group.
Together they give the same counts as value_counts(), but ordered by the group key rather than by frequency.
import pandas as pd
# Creating a sample DataFrame
data = {'Category': ['A', 'B', 'C', 'A', 'B', 'D']}
df = pd.DataFrame(data)
# 👇 Count rows in each 'Category' group
unique_count = df.groupby('Category').size()
print(f"Unique values in 'Category':\n{unique_count}")
Output:
Unique values in 'Category':
Category
A    2
B    2
C    1
D    1
dtype: int64
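To get a single number from the grouped result, you can read the number of groups directly. The sketch below uses the ngroups attribute of the grouped object and checks that the per-group sizes match value_counts() on the same sample data; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'A', 'B', 'D']})

grouped = df.groupby('Category')

# Number of rows in each group
group_sizes = grouped.size()

# Number of groups, which equals the number of unique values
unique_count = grouped.ngroups

# The per-value counts agree with value_counts(); only the ordering differs
same_counts = group_sizes.to_dict() == df['Category'].value_counts().to_dict()

print(f"Number of unique values in 'Category': {unique_count}")
print(f"Matches value_counts(): {same_counts}")
```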
Conclusion
Counting unique values in a Pandas DataFrame column is a routine part of data analysis. Use nunique() when you only need the number of distinct values, value_counts() or groupby() with size() when you need the count of each value, and set() with len() when you prefer plain Python. Each of them tells you how the values in a column are distributed.