Python – Extract elements with Range consecutive occurrences

Last Updated : 16 May, 2023

Sometimes, while working with data, we need to select the elements whose consecutive run length falls within a given range, that is, elements that occur at least strt and at most end times in a row. This problem can occur in many domains. Let's discuss certain ways in which it can be solved.

Method #1 : Using groupby() + list comprehension

groupby() from itertools groups equal adjacent elements into runs. A list comprehension then iterates over the runs and keeps the element of each run whose length lies between strt and end, inclusive.

Python3

# Python3 code to demonstrate working of
# Extract elements with Range consecutive occurrences
# using groupby() + list comprehension
from itertools import groupby
 
# initialize list 
test_list = [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8]
 
# printing original list 
print("The original list : " + str(test_list))
 
# initialize strt, end 
strt, end = 2, 3
 
# Extract elements with Range consecutive occurrences
# using groupby() + list comprehension
# keep the key of each run whose length lies in [strt, end];
# testing each run on its own preserves order and handles repeated values
res = [i for i, j in groupby(test_list) if strt <= len(list(j)) <= end]
 
# printing result
print("The range consecutive elements are : " + str(res))

Output :

The original list : [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8]
The range consecutive elements are : [1, 5, 7]

Time Complexity: O(n), where n is the number of elements in the list “test_list”. groupby() makes a single pass over the list, and each element is consumed once when the length of its run is counted.
Auxiliary Space: O(n) in the worst case, where n is the number of elements in the list.
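The grouping idea above can be wrapped in a small helper. This is a minimal sketch, assuming the hypothetical name extract_range_runs (it does not appear in the original code). It tests each run on its own, so a value that appears in several runs is reported only for the runs that qualify.

```python
from itertools import groupby


def extract_range_runs(items, strt, end):
    """Return the value of every run whose length lies in [strt, end]."""
    result = []
    for value, run in groupby(items):
        run_length = sum(1 for _ in run)  # count the run without building a list
        if strt <= run_length <= end:
            result.append(value)
    return result


print(extract_range_runs([1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8], 2, 3))
# [1, 5, 7]

# 9 occurs once, then five times, then twice; only the last run qualifies
print(extract_range_runs([9, 2, 9, 9, 9, 9, 9, 3, 9, 9], 2, 3))
# [9]
```

Because the result is built run by run, the order of the output follows the order of the input.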

Method #2 : Using list comprehension + islice() + groupby()

This task can also be performed with islice() from itertools. Grouping works the same way as above, but islice() caps how many elements of each run are read, so a long run never has to be materialized in full.

Python3

# Python3 code to demonstrate working of
# Extract elements with Range consecutive occurrences
# using groupby() + list comprehension + islice()
from itertools import groupby, islice
 
# initialize list 
test_list = [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8]
 
# printing original list 
print("The original list : " + str(test_list))
 
# initialize strt, end 
strt, end = 2, 3
 
# Extract elements with Range consecutive occurrences
# using groupby() + list comprehension + islice()
# read at most end + 1 items per run: a count of end + 1 means the run is too long
res = [i for i, j in groupby(test_list) if strt <= len(list(islice(j, end + 1))) <= end]
 
# printing result
print("The range consecutive elements are : " + str(res))

Output :

The original list : [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8]
The range consecutive elements are : [1, 5, 7, 8]

Time Complexity: O(n), where n is the length of the input list. groupby() still walks every element once, and islice() only limits how many elements of each run are copied.
Auxiliary Space: O(n) in the worst case, for the result list res.
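The capping behaviour of islice() can be isolated in a small predicate. This is a minimal sketch, assuming the hypothetical helper name run_in_range. Note that groupby() still advances past the rest of a long run when it moves to the next group, so the saving is in memory, not in the number of elements visited.

```python
from itertools import groupby, islice


def run_in_range(run, strt, end):
    """Check a run's length while holding at most end + 1 of its items."""
    seen = len(list(islice(run, end + 1)))  # seen == end + 1 means "too long"
    return strt <= seen <= end


data = [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8]
res = [key for key, run in groupby(data) if run_in_range(run, 2, 3)]
print(res)  # [1, 5, 7]
```

Reading end + 1 items rather than end is what lets the predicate tell a run of exactly end elements apart from a longer one.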

Method #3 : Using numpy

This method first converts the input list to a NumPy array. np.diff() and np.where() then locate the indices where the value changes, and np.split() cuts the array into subarrays at those indices; the subarrays are stored in the split_arr variable. Finally, a list comprehension iterates over the subarrays and adds the first element of each one to the res list if the length of the subarray lies between strt and end, inclusive.

Python3

import numpy as np
 
# initialize list
test_list = [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8]
 
# printing original list
print("The original list : " + str(test_list))
 
# initialize strt, end
strt, end = 2, 3
 
# Extract elements with Range consecutive occurrences
# using numpy.split() method
arr = np.array(test_list)
split_arr = np.split(arr, np.where(np.diff(arr) != 0)[0] + 1)
# int() converts NumPy scalars to plain ints so the list prints as [1, 5, 7]
res = [int(x[0]) for x in split_arr if strt <= len(x) <= end]
 
# printing result
print("The range consecutive elements are : " + str(res))

Output:

The original list : [1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8]
The range consecutive elements are : [1, 5, 7]

Time Complexity: O(n), where n is the length of the input list
Auxiliary Space: O(n), because we are creating a NumPy array of the same length as the input list.
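The same result can be computed without splitting the array into subarrays, by working with run start positions and run lengths directly. This is a minimal sketch, assuming numeric input and the hypothetical function name extract_range_runs_np; it is an alternative formulation, not the code shown above.

```python
import numpy as np


def extract_range_runs_np(items, strt, end):
    """Return the value of every run whose length lies in [strt, end]."""
    arr = np.asarray(items)
    if arr.size == 0:
        return []
    # index 0 starts the first run; every value change starts another
    starts = np.concatenate(([0], np.flatnonzero(np.diff(arr) != 0) + 1))
    # a run's length is the distance to the next start (or to the array end)
    lengths = np.diff(np.concatenate((starts, [arr.size])))
    mask = (lengths >= strt) & (lengths <= end)
    # tolist() converts NumPy scalars to plain Python numbers
    return arr[starts[mask]].tolist()


print(extract_range_runs_np([1, 1, 4, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8], 2, 3))
# [1, 5, 7]
```

For this input the run starts are [0, 2, 3, 5, 6, 9] and the run lengths are [2, 1, 2, 1, 3, 4], so the runs starting at indices 0, 3 and 6 are kept.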