Count Substring In String Python
We often need to count how many times a substring occurs in a string in Python. In this article, you will see several methods to count both overlapping and non-overlapping occurrences of a substring in a string.
Table of Contents
- Count Substring Using count() Method
- Count All Occurrences Of Substring In String
- Using re.findall()
- Conclusion
A substring is a contiguous sequence of characters inside another string. For example, "loop" is a substring of "python loop is magic".
A string in Python is a sequence of characters enclosed in quotes, for example "Hello World" or 'Hello World'.
Count Substring Using count() Method
Python has a built-in string method called count() that counts the number of occurrences of a substring in a string.
The count() method returns an integer, which is the number of times the substring occurs in the string.
Note: The count() method is case sensitive and counts only non-overlapping occurrences of the substring.
Syntax:
string.count(substring, start, end)
Here:
- substring is the substring you are looking for in the string.
- start (optional) is the index at which the search begins. The default value is 0.
- end (optional) is the index at which the search stops. The character at this index is not included, just as in slicing. The default value is the length of the string.
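The effect of the optional start and end arguments is easiest to see on a concrete string. The following is a small sketch; the variable names are only illustrative.

```python
text = "Gift box inside a box in a box."

# Search the whole string: "box" starts at indexes 5, 18 and 27.
total = text.count("box")

# Start searching at index 9, which is past the first "box".
from_9 = text.count("box", 9)

# Search only text[0:20]. The second "box" occupies indexes 18-20,
# so it does not fit because index 20 is excluded.
up_to_20 = text.count("box", 0, 20)

# With end=21 the second "box" fits completely.
up_to_21 = text.count("box", 0, 21)

print(total, from_9, up_to_20, up_to_21)  # 3 2 1 2
```

The difference between the last two results shows that a match is counted only when it lies entirely inside the searched range.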
Example:
# count substring in string python
# (the variable is named text so that it does not shadow the built-in str)
text = "Gift box inside a box in a box."
substr = "box"
# count substring
print(text.count(substr))
Output:
3
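Because count() is case sensitive, differently capitalised occurrences are not counted. One common workaround, shown here as a sketch with illustrative variable names, is to convert the string to lower case before counting.

```python
text = "Box in a box in a BOX."

# Only the all-lowercase "box" matches exactly.
exact = text.count("box")

# Lower-casing the text first makes the count ignore case.
ignoring_case = text.lower().count("box")

print(exact, ignoring_case)  # 1 3
```

If the substring itself may contain upper-case letters, lower-case it in the same way before passing it to count().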
Let's look at another example to see that the count() method counts only non-overlapping occurrences of the substring.
In this example, we search for the substring "aba" in the string "abababa".
# count substring in string python
text = "abababa"
substr = "aba"
# using count()
print(text.count(substr))
Output:
2
Here the output is 2, although "aba" actually appears 3 times in "abababa", starting at indexes 0, 2, and 4. The occurrence at index 2 overlaps the other two. After finding a match, count() continues searching after the end of that match, so it skips the overlapping occurrence and reports only the non-overlapping ones.
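To make the skipping visible, the scan can be imitated with str.find(). This is only a sketch of the behaviour, not how count() is actually implemented, and non_overlapping_starts is a hypothetical helper name.

```python
def non_overlapping_starts(text, substr):
    """Return the start indexes a non-overlapping scan would find."""
    if not substr:
        # An empty substring would never advance the search position.
        raise ValueError("substr must not be empty")
    starts = []
    pos = text.find(substr)
    while pos != -1:
        starts.append(pos)
        # Resume after the END of the match, so overlaps are skipped.
        pos = text.find(substr, pos + len(substr))
    return starts


print(non_overlapping_starts("abababa", "aba"))  # [0, 4]
```

The match at index 2 is never examined because the search resumes at index 3, and the number of indexes found equals the result of count().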
Count All Occurrences Of Substring In String
To count all occurrences, including overlapping ones, you can check every index of the string and test whether the substring starts at that index.
# Method 1
text = "abababa"
substr = "aba"
# counter to store the count of the substring
count = 0
# loop over every index where the substring could start
# range(len(text)) would also work, but stopping at
# len(text) - len(substr) + 1 skips indexes where the substring cannot fit
for i in range(len(text) - len(substr) + 1):
    # check if the substring is present at index i
    if text[i:i + len(substr)] == substr:
        count += 1
print(count)
Output:
3
- Store the string and substring in variables.
- Create a counter variable to store the count of the substring.
- Loop through the string using a for loop. At each index, check whether the substring is present at that index by comparing text[i:i+len(substr)] == substr.
- Increment the counter variable by 1 if the substring is present at that index.
- After the loop, the counter variable holds the number of all occurrences of the substring, including overlapping ones. Print the counter variable.
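The steps above can also be wrapped in a reusable function. The following is a minimal sketch; count_overlapping is a hypothetical name chosen for this illustration.

```python
def count_overlapping(text, substr):
    """Count every occurrence of substr in text, overlaps included."""
    width = len(substr)
    # Compare the slice starting at each candidate index with substr.
    return sum(
        1
        for i in range(len(text) - width + 1)
        if text[i:i + width] == substr
    )


print(count_overlapping("abababa", "aba"))  # 3
print(count_overlapping("aaaa", "aa"))      # 3
```

When the substring is longer than the string, the range is empty and the function returns 0.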
# Method 2
In this method, we use the startswith() method, which accepts an optional start index, to check whether the substring begins at each index of the string.
text = "abababa"
substr = "aba"
# count variable to store count of substring
count = 0
# using the startswith() method
# check if the substring begins at each index
for i in range(len(text) - len(substr) + 1):
    if text.startswith(substr, i):
        count += 1
print(count)
Output:
3
- Just like the previous method, store the string, substring, and count in variables.
- Loop through the string and at each index check if the string starts with the substring by using startswith() method.
- If the string starts with the substring, increment the counter variable by 1.
- Print the counter variable.
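The same startswith() check can collect the matching indexes instead of only counting them, which makes the overlap easy to see. This is a sketch; overlapping_indexes is a hypothetical helper name.

```python
def overlapping_indexes(text, substr):
    """Return every index at which substr starts in text."""
    return [
        i
        for i in range(len(text) - len(substr) + 1)
        if text.startswith(substr, i)
    ]


indexes = overlapping_indexes("abababa", "aba")
print(indexes)       # [0, 2, 4]
print(len(indexes))  # 3
```

The length of the returned list is the overlapping count, and the indexes show that the match at 2 shares characters with the matches at 0 and 4.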
Using re.findall()
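The findall() function is part of the re module. It finds all non-overlapping matches of a pattern in a string.
The function takes two main arguments:
- pattern is the regular expression to search for. You can pass a plain substring, but characters that have a special meaning in regular expressions (such as . or *) must be escaped, for example with re.escape(), to be matched literally.
- string is the string to search in.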
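Since the first argument is treated as a regular expression, a substring that contains special characters may match more than intended. The sketch below, with illustrative variable names, compares the raw substring with an escaped one.

```python
import re

text = "a.b axb a.b"
substr = "a.b"

# As a pattern, "." matches any character, so "axb" is counted too.
as_pattern = len(re.findall(substr, text))

# re.escape() makes every character of the substring literal.
as_literal = len(re.findall(re.escape(substr), text))

print(as_pattern, as_literal)  # 3 2
```

Only the escaped version agrees with text.count(substr), so escaping is a safe default when the substring is not meant to be a pattern.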
Example:
import re
text = "abababa"
substr = "aba"
# counting occurrences of substring in string
# re.escape() makes sure the substring is matched literally
count = len(re.findall(re.escape(substr), text))
# Printing result
print("Number of substrings", count)
Output:
Number of substrings 2
The re.findall() function returns a list of all non-overlapping matches of the pattern in the string, so the result here is 2, the same as with count().
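re.findall() can also count overlapping occurrences if the substring is placed inside a lookahead assertion, (?=...). A lookahead matches at a position without consuming any characters, so the search moves forward one position at a time. The following is a sketch of that technique; count_overlapping_regex is a hypothetical name.

```python
import re


def count_overlapping_regex(text, substr):
    """Count occurrences of substr in text, overlaps included."""
    # (?=...) succeeds wherever substr starts but consumes nothing,
    # so overlapping occurrences are all found.
    pattern = "(?=" + re.escape(substr) + ")"
    return len(re.findall(pattern, text))


print(count_overlapping_regex("abababa", "aba"))  # 3
```

Each match is an empty string, so only the length of the returned list is meaningful here.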
Conclusion
To summarise, the count() method and the re.findall() function count only non-overlapping occurrences of a substring.
If you want to count all occurrences of a substring in a Python string, including overlapping ones, use one of the two loop-based methods discussed above: slice comparison or startswith().
See the Python documentation for more information.