High-Performance Array Operations with Cython | Set 2

Last Updated : 03 Jan, 2024

The code developed in the first part runs fast. In this article, we compare its performance with the clip() function in the NumPy library.
Surprisingly, our program runs faster than the NumPy version, which is written in C.

Prerequisite: High-Performance Array Operations with Cython | Set 1

Perform Complex Operations in Cython with NumPy

There are several ways to perform complex operations in Cython with NumPy. Here we discuss some commonly used ones:

  • Comparing The Performances
  • Variant of the Clip() Function
  • Element-wise Squaring

Comparing The Performances

This example uses `timeit` to measure the execution time of two operations: NumPy’s `clip` function and the custom `clip` function from the `sample` module. Both take `arr2` as the input array and `arr3` as the output array. Each operation is run 1000 times, and the total time taken (in seconds) is printed.

Python3




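from timeit import timeit

# numpy, sample (the compiled Cython module), arr2 (input) and
# arr3 (output) are assumed to be already defined in __main__.
numpy_time = timeit('numpy.clip(arr2, -5, 5, arr3)',
                    'from __main__ import arr2, arr3, numpy', number=1000)

print("\nTime for NumPy clip program : ", numpy_time)

custom_time = timeit('sample.clip(arr2, -5, 5, arr3)',
                     'from __main__ import arr2, arr3, sample', number=1000)

print("\nTime for our program : ", custom_time)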


Output:

Time for NumPy clip program : 8.093049556000551
Time for our program : 3.760528204000366
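
The same timing pattern can be tried without the compiled module. The following is a minimal sketch, assuming a pure-Python stand-in: the function `clip_branches` and the lists `data` and `result` are hypothetical names introduced here, not part of the `sample` module. It shows only how `timeit` repeats a call and returns the total elapsed time, so the number it prints is not comparable to the figures above.

```python
from timeit import timeit

# Hypothetical pure-Python stand-in for a clip function (not sample.clip).
def clip_branches(a, lo, hi, out):
    for i in range(len(a)):
        if a[i] < lo:
            out[i] = lo
        elif a[i] > hi:
            out[i] = hi
        else:
            out[i] = a[i]

# Input values cycle through -10.0 .. 10.0; result receives the clipped values.
data = [float(x % 21 - 10) for x in range(1000)]
result = [0.0] * len(data)

# Passing a callable avoids the 'from __main__ import ...' setup string.
elapsed = timeit(lambda: clip_branches(data, -5.0, 5.0, result), number=100)

print("Time for pure-Python clip : ", elapsed)
```

Passing a callable instead of a statement string is a design choice for this sketch: it keeps the example independent of which names happen to be defined in `__main__`.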

The code in this article relies on Cython typed memoryviews, which simplify code that operates on arrays. The declaration cpdef clip() declares clip() as both a C-level and Python-level function. This means that other Cython functions can call it more efficiently (e.g., if you want to invoke clip() from a different Cython function).

Two decorators are used in the code: @cython.boundscheck(False) and @cython.wraparound(False). Both are optional performance optimizations.

  • @cython.boundscheck(False) : Eliminates all array bounds checking and can be used if the indexing won’t go out of range. 
  • @cython.wraparound(False) : Eliminates the handling of negative array indices as wrapping around to the end of the array (like with Python lists).

The inclusion of these decorators can make the code run substantially faster (almost 2.5 times faster on this example when tested).
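
To make the two checks concrete, here is a small plain-Python sketch (not Cython; the list `values` is a hypothetical example) of the behaviors that the decorators switch off: negative indices wrapping around to the end, and out-of-range indices being caught.

```python
values = [1.0, 2.0, 3.0]

# Wraparound: a negative index counts from the end of the sequence.
last = values[-1]

# Bounds checking: an out-of-range index raises IndexError instead of
# reading memory outside the list.
try:
    values[3]
    out_of_range_detected = False
except IndexError:
    out_of_range_detected = True

print(last, out_of_range_detected)
```

With both checks disabled in Cython, neither safeguard applies, so the loop itself must guarantee that every index stays between 0 and the array length minus one.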

Variant of the Clip() Function that Uses Conditional Expressions 

This example defines a Cython function named `clip` that takes a NumPy array `a`, minimum and maximum values, and an output array `out`. It clips the values in `a` to the specified range and stores them in `out`. The function uses Cython decorators to disable array bounds checking and wraparound handling for improved performance. If `min` is greater than `max`, or if the two arrays differ in size, it raises a ValueError.

Python3




cimport cython

# decorators
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef clip(double[:] a, double min, double max, double[:] out):

    if min > max:
        raise ValueError("min must be <= max")

    if a.shape[0] != out.shape[0]:
        raise ValueError("input and output arrays must be the same size")

    # Nested conditional expression: apply the lower bound first, then the upper.
    for i in range(a.shape[0]):
        out[i] = (a[i] if a[i] < max else max) if a[i] > min else min


When tested, this version of the code runs over 50% faster. But how would this code stack up against a handwritten C version? In experiments, a handcrafted C extension ran more than 10% slower than the version created by Cython.
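
The logic of the nested conditional expression can be checked in plain Python before compiling anything. The following is a minimal sketch, assuming a pure-Python reference function; the name `clip_reference` and the arrays `src` and `dst` are hypothetical and introduced only for illustration. It uses `array.array('d', ...)` from the standard library as a stand-in for a buffer of doubles.

```python
from array import array

# Hypothetical pure-Python reference for the conditional-expression clip.
def clip_reference(a, lo, hi, out):
    if lo > hi:
        raise ValueError("min must be <= max")
    if len(a) != len(out):
        raise ValueError("input and output arrays must be the same size")
    for i in range(len(a)):
        # Values at or below lo become lo; the rest are capped at hi.
        out[i] = (a[i] if a[i] < hi else hi) if a[i] > lo else lo

src = array('d', [-10.0, -5.0, 0.0, 2.5, 7.0])
dst = array('d', [0.0] * len(src))
clip_reference(src, -5.0, 5.0, dst)

print(list(dst))  # [-5.0, -5.0, 0.0, 2.5, 5.0]
```

The outer condition handles the lower bound and the inner one handles the upper bound, so each element is resolved with at most two comparisons.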

Element-wise Squaring

This example defines a Cython function named square_elements that squares each element of a NumPy array a and stores the result in another array out. The Cython decorators @cython.boundscheck(False) and @cython.wraparound(False) disable bounds checking and wraparound handling to improve performance.

Python3




cimport cython

# decorators
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef square_elements(double[:] a, double[:] out):
    if a.shape[0] != out.shape[0]:
        raise ValueError("Input and output arrays must be the same size")

    # Square each input element into the matching output slot.
    for i in range(a.shape[0]):
        out[i] = a[i] * a[i]


The function checks if the input array a and the output array out have the same size. If not, it raises a ValueError. Then, it iterates through each element of the input array, squares it, and assigns the result to the corresponding element in the output array. The result is an array where each element is the square of the corresponding element in the input array.
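
The steps described above can be mirrored in plain Python to check the expected result. This is a minimal sketch under stated assumptions: `square_reference`, `numbers` and `squares` are hypothetical names, and `array.array('d', ...)` stands in for a buffer of doubles. It is a reference for the logic only and says nothing about the speed of the compiled function.

```python
from array import array

# Hypothetical pure-Python reference for element-wise squaring.
def square_reference(a, out):
    if len(a) != len(out):
        raise ValueError("Input and output arrays must be the same size")
    for i in range(len(a)):
        out[i] = a[i] * a[i]

numbers = array('d', [-3.0, 0.0, 1.5, 4.0])
squares = array('d', [0.0] * len(numbers))
square_reference(numbers, squares)

print(list(squares))  # [9.0, 0.0, 2.25, 16.0]
```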


