High-Performance Array Operations with Cython | Set 2

Prerequisite: High-Performance Array Operations with Cython | Set 1

The code developed in Set 1 runs fast. In this article, we compare its performance with the clip() function in the NumPy library.

Surprisingly, our program runs faster than the NumPy version, which is itself written in C.

Code #1 : Comparing the performances.

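from timeit import timeit

# arr2 (input) and arr3 (output) are assumed to be the arrays created in Set 1
numpy_time = timeit('numpy.clip(arr2, -5, 5, arr3)',
                    'from __main__ import arr2, arr3, numpy', number = 1000)
  
print("\nTime for NumPy clip program : ", numpy_time)
  
cython_time = timeit('sample.clip(arr2, -5, 5, arr3)',
                     'from __main__ import arr2, arr3, sample', number = 1000)
  
print("\nTime for our program : ", cython_time)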



Output :



Time for NumPy clip program : 8.093049556000551

Time for our program : 3.760528204000366
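
For readers who want to try the NumPy half of this comparison without the compiled sample module, here is a minimal sketch. The array size, the value range and the number of repetitions are assumptions chosen for illustration; they are not the setup that produced the timings above, so the numbers will differ.

```python
from timeit import timeit

import numpy

# Assumed setup: a float64 input array and a preallocated output array.
arr2 = numpy.random.uniform(-10, 10, size=100_000)
arr3 = numpy.zeros_like(arr2)

# Passing arr3 as the fourth argument makes numpy.clip write into it
# instead of allocating a new result array on every call.
elapsed = timeit(lambda: numpy.clip(arr2, -5, 5, arr3), number=100)

print("Time for NumPy clip program :", elapsed)
```

Using a preallocated output array keeps the comparison fair, since the Cython clip() also writes into an array supplied by the caller.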

The code in this series relies on Cython typed memoryviews, which simplify code that operates on arrays. The declaration cpdef clip() declares clip() as both a C-level and a Python-level function. This means that calls to the function from other Cython code are more efficient (e.g., if you want to invoke clip() from a different Cython function).
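
Typed memoryviews work with objects that expose Python's buffer protocol, which is why a single double[:] parameter can accept more than one kind of array. The sketch below is only an analogy: it uses Python's built-in memoryview, not Cython's typed memoryview, to show that a view shares memory with the underlying object rather than copying it. The variable names are made up for this example.

```python
import array

import numpy

# Two different containers that both expose a buffer of C doubles.
py_arr = array.array('d', [1.0, -3.0, 4.0])
np_arr = numpy.array([1.0, -3.0, 4.0])

for obj in (py_arr, np_arr):
    view = memoryview(obj)   # no copy: the view shares the object's memory
    print(type(obj).__name__, view.format, view.shape, view.itemsize)
    view[0] = 99.0           # writing through the view changes the original

print(py_arr[0], np_arr[0])
```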

Two decorators are used in the code: @cython.boundscheck(False) and @cython.wraparound(False). Both are optional performance optimizations.

@cython.boundscheck(False) : Eliminates all array bounds checking and can be used if the indexing won't go out of range.
@cython.wraparound(False) : Eliminates the handling of negative array indices as wrapping around to the end of the array (like with Python lists).

The inclusion of these decorators can make the code run substantially faster (almost 2.5 times faster on this example when tested).
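
To make concrete what the two decorators switch off, this plain-Python sketch shows the default behaviours on an ordinary list. It is not Cython code, and the variable names are illustrative only.

```python
values = [10.0, 20.0, 30.0]

# Wraparound: a negative index counts back from the end of the sequence.
last = values[-1]

# Bounds checking: an out-of-range index raises an error
# instead of reading whatever lies past the end.
try:
    values[3]
    out_of_range_caught = False
except IndexError:
    out_of_range_caught = True

print(last, out_of_range_caught)
```

With both checks disabled, the loop in clip() is only safe because it indexes strictly from 0 up to a.shape[0] - 1, after the function has confirmed that both arrays have the same length.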

Code #2 : Variant of the clip() function that uses conditional expressions

cimport cython

# Disable bounds checking and negative-index wraparound for speed
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef clip(double[:] a, double min, double max, double[:] out):

    if min > max:
        raise ValueError("min must be <= max")

    if a.shape[0] != out.shape[0]:
        raise ValueError("input and output arrays must be the same size")

    for i in range(a.shape[0]):
        # Cap the value at max first, then raise it to min if needed
        out[i] = (a[i] if a[i] < max else max) if a[i] > min else min
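
The nested conditional expression can be checked in plain Python before compiling anything. The following is a sketch for verifying the logic only, not a fast implementation; the function name clip_reference and the parameter names lo and hi are hypothetical and stand in for min and max.

```python
import numpy


def clip_reference(a, lo, hi, out):
    """Pure-Python version of the same loop, for checking the logic only."""
    if lo > hi:
        raise ValueError("min must be <= max")
    if len(a) != len(out):
        raise ValueError("input and output arrays must be the same size")
    for i in range(len(a)):
        # Inner expression caps the value at hi; outer one raises it to lo.
        out[i] = (a[i] if a[i] < hi else hi) if a[i] > lo else lo


data = numpy.array([1.0, -3.0, 4.0, 7.0, 2.0, 0.0, -9.0])
result = numpy.zeros_like(data)
clip_reference(data, -5.0, 5.0, result)

# The loop should agree with numpy.clip element for element.
matches = numpy.array_equal(result, numpy.clip(data, -5.0, 5.0))
print(result, matches)
```

The parentheses matter: without them, Python would group the expression as a[i] if a[i] < hi else (hi if a[i] > lo else lo), which leaves values below the lower bound unclipped.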



When tested, this version of the code runs over 50% faster. But how would this code stack up against a handwritten C version? In experiments, a handcrafted C extension ran more than 10% slower than the version created by Cython.



