Sorting contents of a Data Frame in Julia

Sorting arranges data in a defined order. To sort a DataFrame in Julia, you can either sort the whole table by one or more columns, or extract a single column as a vector and apply one of Julia's built-in sorting functions to it. Julia provides several functions for both approaches.

Before starting, add the DataFrames package and load it with the code below:

using Pkg
Pkg.add("DataFrames")
using DataFrames

Methods for Sorting

Julia offers several ways to sort the data of a DataFrame:

  • sort() function: returns a sorted copy of a DataFrame or of a vector.
  • sort() function passing an algorithm: returns a sorted copy of a vector, using the algorithm given in the alg keyword.
  • sortperm() function: returns the permutation of indices that puts a collection into sorted order.
  • partialsortperm() function: returns only the part of that permutation that covers a given index or range, so just those positions are sorted.
  • sort() function with rev=true: sorts in descending order.
  • sort!() function with dims: sorts a multidimensional array in place along the given dimension.

Method 1:  Use of sort() Function

The sort() function is the most basic way to sort the data of a DataFrame. It returns a new, sorted DataFrame and leaves the original unchanged.

Approach:

  • First, create the DataFrame.
  • Pass the DataFrame to sort() together with the column, or vector of columns, to sort by. When several columns are given, rows are ordered by the first column and ties are broken by the next.

Julia




# Creating a dataframe
df1 = DataFrame(b = [:Hi, :Med, :Hi, :Low, :Hi],
                x = ["A", "E", "I", "O", "A"],
                y = [6, 3, 7, 2, 1],
                z = [2, 1, 1, 2, 2])
 
# Method 1
sort(df1, [:z, :y]) # sort rows by z, breaking ties by y
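
As a small sketch of the difference between sort() and its in-place counterpart sort!(), the example below uses a hypothetical two-column DataFrame named scores. It assumes the DataFrames package is installed; the data and variable names are made up for illustration.

```julia
using DataFrames

# Hypothetical data, used only for illustration
scores = DataFrame(name = ["a", "b", "c"], y = [6, 3, 7])

sorted_copy = sort(scores, :y)   # new DataFrame; scores is left unchanged
y_after_copy = copy(scores.y)    # snapshot showing scores still has its original order

sort!(scores, :y)                # reorders the rows of scores in place
```

Use sort() when the original order is still needed later, and sort!() when it is not and an extra copy would be wasteful.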


 
 

Method 2: Sort using Quicksort algorithm

 

Julia allows passing an algorithm to the sort() function through the alg keyword. sort(dataframe.columnheader; alg=QuickSort) takes a column, extracted as a vector, and the algorithm to use, and returns a sorted copy of that vector.

 

Approach:

 

  • Here, the sort() function is applied to a single column.
  • The column is passed as the first argument of sort().
  • The algorithm used to sort the column is passed through the alg keyword.
  • Store the returned vector in a separate variable.
  • Then assign it back to the column. Only this column is reordered; the other columns keep their original order, so values that shared a row before are no longer aligned.

 

Julia




# Method 2: QuickSort
# Sort one column and store the result in s
s = sort(df1.y; alg = QuickSort)
 
# Assign s back to the dataframe's y column (other columns are unchanged)
df1.y = s
df1 # display the dataframe with y sorted
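
Because writing a sorted column back breaks the pairing between columns, it can help to compare that with sorting whole rows. The following sketch assumes the DataFrames package is installed and uses a hypothetical DataFrame df in which each x label belongs to the y value in the same row.

```julia
using DataFrames

# Hypothetical data: each x label belongs to the y value in the same row
df = DataFrame(x = ["A", "E", "I"], y = [6, 3, 7])

# Sorting the column alone returns a vector; the x labels are not involved
y_only = sort(df.y; alg = QuickSort)

# Sorting the DataFrame by :y moves whole rows, so x and y stay paired
by_rows = sort(df, :y)
```

Here by_rows keeps "E" next to 3, whereas assigning y_only back to df.y would place 3 next to "A".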


 
 

Method 3: Sort using Partial QuickSort algorithm

 

sort(dataframe.columnheader; alg=PartialQuickSort(k)) sorts the column only up to a limit. With PartialQuickSort(k), the first k elements of the result are the k smallest values in sorted order; the order of the remaining elements is unspecified.

 

Approach:

 

  • Here, the sort() function is applied to a single column.
  • The column is passed as the first argument of sort().
  • The algorithm, PartialQuickSort, is passed through the alg keyword together with the number of leading positions to sort.
  • Store the returned vector in a separate variable.
  • Then assign it back to the column.

 

Julia




# Method 3: PartialQuickSort
# Sort only the first B positions of a column
B = 3
t = sort(df1.z; alg = PartialQuickSort(B))
 
# Assign t back to the dataframe's z column
df1.z = t
df1
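
To make the partial guarantee concrete, the sketch below compares a full sort with a partial one on a hypothetical vector v. It uses only Base Julia; the names v, k, full, and partial are illustrative.

```julia
v = [9, 4, 7, 1, 8, 2, 6]
k = 3

full    = sort(v)                              # completely sorted copy
partial = sort(v; alg = PartialQuickSort(k))   # only the first k positions are guaranteed

# The first k entries agree with the full sort; the rest may be in any order
first_k_match = partial[1:k] == full[1:k]
```

Only partial[1:k] should be relied on. The elements after position k are the same values as in the full sort, but not necessarily in order.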


 
 

Method 4: Use of sortperm() function

 

sortperm() takes a column and returns the indices that would put it in sorted order, without modifying the column itself.

 

Approach:

 

  • First, store the column you want to sort in a separate variable.
  • Pass that variable to sortperm(). It returns the indices of the column's elements in sorted order; store them in another variable.
  • Loop over the stored indices with a for loop.
  • Inside the loop, use each index to look up and print the corresponding value of the column.

 

Julia




# Method4
r = df1.y
 
# returned indexes of the elements
k = sortperm(r)
 
# traversing through indices
for i in k
    println(r[i]) 
end
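
The same permutation can also reorder whole rows rather than a single column. The sketch below assumes the DataFrames package is installed and uses a hypothetical DataFrame df; indexing with df[p, :] returns a new DataFrame and leaves df untouched.

```julia
using DataFrames

# Hypothetical data, used only for illustration
df = DataFrame(x = ["A", "E", "I", "O"], y = [6, 3, 7, 2])

p = sortperm(df.y)                    # indices that put y in ascending order
by_y = df[p, :]                       # new DataFrame with whole rows reordered

p_desc = sortperm(df.y; rev = true)   # same idea, descending order
```

Keeping the permutation around is useful when several parallel collections have to be reordered consistently.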


Method 5: Sort using Insertion sort Algorithm

sort(dataframe.column; alg=InsertionSort) sorts the whole column using insertion sort. Insertion sort takes quadratic time, so it is best suited to small collections, and it is stable: equal elements keep their original relative order.

Approach:

  • Create a new DataFrame and apply the sort() function to one of its columns.
  • This time the algorithm, InsertionSort, is passed through the alg keyword.
  • Store the returned vector in a separate variable.
  • Then assign it back to the column.

Julia




# Create a new dataframe, df2
df2 = DataFrame(x = [11, 12, 13, 10, 23],
                y = [6, 3, 7, 2, 1],
                z = [2, 1, 1, 2, 2])

# Method 5
s2 = sort(df2.x; alg = InsertionSort)
 
# Now update df2.x
df2.x = s2
df2
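
One property of insertion sort worth seeing in action is stability. The sketch below uses only Base Julia and a hypothetical vector of (priority, label) pairs named tasks; it sorts on the priority alone through the by keyword.

```julia
# Hypothetical (priority, label) pairs; two entries share each priority
tasks = [(1, "write"), (0, "plan"), (1, "review"), (0, "setup")]

# Sort by priority only. InsertionSort is stable, so entries with equal
# priority keep the order they had in the input.
ordered = sort(tasks; by = first, alg = InsertionSort)
```

Within each priority, "plan" still precedes "setup" and "write" still precedes "review", exactly as in the input.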


 
 

 

Method 6: Use of partialsortperm() function

 

partialsortperm(dataframe.column, range) is a partial form of sortperm(). It returns the indices of the elements that would occupy the given range of positions if the column were sorted, so only that part of the column has to be put in order.

 

Approach:

 

  • Store the column to be sorted in a variable.
  • Call partialsortperm() with the vector and the range of sorted positions required; it returns the indices of the elements that belong in those positions.
  • Index the vector with the returned indices to get the values in sorted order.
  • Displaying the variable then shows the sorted values.

 

Julia




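# Method 6
a = df2.y
# 1:5 spans all five rows, so this sorts the whole column;
# a shorter range such as 1:3 would return only the three smallest values
a = a[partialsortperm(a, 1:5)]
a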
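
A range shorter than the column shows the partial behaviour more clearly. The sketch below uses only Base Julia and a hypothetical vector v; the rev keyword is used to ask for the largest values instead of the smallest.

```julia
v = [6, 3, 7, 2, 1]

smallest3 = partialsortperm(v, 1:3)               # indices of the 3 smallest values
largest2  = partialsortperm(v, 1:2; rev = true)   # indices of the 2 largest values

low_values  = v[smallest3]   # values at those indices, in ascending order
high_values = v[largest2]    # values at those indices, in descending order
```

This is a common way to pick a "top k" without sorting the entire column.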


Method 7: Sorting in Descending order

sort(dataframe, rev=true) returns the DataFrame sorted in descending order. Note that Julia's Boolean literal is the lowercase true; writing rev=True raises an error because True is not defined.

Approach:

  • Pass the DataFrame to sort() with rev = true.
  • With no columns specified, rows are compared on all columns from left to right, in descending order.
  • sort() returns a new DataFrame, which is stored in a variable.
  • Finally, assign the result back to replace the original DataFrame.

Julia




# Method 7
s2 = sort(df2, rev = true)
df2 = s2 # replace the whole dataframe with the sorted copy
df2
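
Descending order can also be applied to just one column, or mixed with ascending order on another. The sketch below assumes the DataFrames package is installed and uses its order() helper; the DataFrame df and the result names are hypothetical.

```julia
using DataFrames

# Hypothetical data, used only for illustration
df = DataFrame(y = [6, 3, 7, 2, 1], z = [2, 1, 1, 2, 2])

# Descending by a single column
by_y_desc = sort(df, :y, rev = true)

# z descending, then y ascending within equal z values
mixed = sort(df, [order(:z, rev = true), :y])
```

Wrapping a column in order() lets each sort key carry its own direction instead of applying rev to all of them.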


 
 

 

      

 

Method 8: Use of sort!() function

 

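sort!(A, dims=d) sorts a multidimensional array in place along dimension d: dims=1 sorts the values within each column (moving down the rows), and dims=2 sorts the values within each row (moving across the columns). It operates on arrays such as matrices rather than on a DataFrame directly; a DataFrame can be converted first with Matrix(df).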

 

Approach:

 

  • Choose whether to sort down the rows or across the columns.
  • Pass the matrix to sort!() with dims = 1 to sort down the rows, so that each column ends up in ascending order.
  • Displaying the matrix shows the result, since sort!() modifies it in place.
  • Pass dims = 2 to sort across the columns, so that each row ends up in ascending order.
  • Because the second call runs on the already modified matrix, its output reflects both sorts.

 

Julia




# Method 8
B = [4 3; 1 2]
sort!(B, dims = 1); B # sort each column (down the rows)
sort!(B, dims = 2); B # sort each row (across the columns)
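
To see the effect of each dimension separately, the non-mutating sort() can be used with the same dims keyword. The sketch below uses only Base Julia; the matrix A and the result names are illustrative.

```julia
A = [4 3; 1 2]

down_columns = sort(A, dims = 1)   # each column sorted top to bottom
across_rows  = sort(A, dims = 2)   # each row sorted left to right
```

Since sort() returns a copy, both results start from the original A, unlike the two chained sort!() calls.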


 
 

 



Last Updated : 16 Feb, 2023