
Reading Tabular Data from Files in Julia

Last Updated : 10 Jun, 2021

Julia is a high-level, high-performance, dynamic programming language that lets users load, save, and manipulate data in many file formats for data science, analysis, and machine learning. Tabular data is data organized in rows and columns, and it can be read from several kinds of files, such as plain text, CSV, and Excel.

To make these operations easier, we add the Queryverse.jl package, a meta-package that bundles other useful packages such as Query.jl, FileIO.jl, and CSVFiles.jl.

Julia




# Adding the Queryverse package
using Pkg
Pkg.add("Queryverse")


 
 

Reading Tabular Data from Text Files

 

To read data from a text file, we first open it with the open() function. We then read the tabular data line by line with the readline() function, as shown below:

 

Julia




# read file contents, line by line
open("geek.txt") do f

  # number of lines read so far
  line = 0

  # read till end of file
  while !eof(f)

    # read the next line on every iteration
    s = readline(f)
    line += 1

    # numbering starts at 0, so the first line is printed as line 0
    println("$(line-1). $s")
  end
end
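
Printing each line shows the file contents, but the values are still plain text. The following is a minimal sketch of how the lines could be split into columns using only Base Julia. It assumes a comma-separated layout with a header row; the sample data and the variable names (`header`, `rows`, `marks`) are hypothetical and are not taken from geek.txt.

```julia
# Hypothetical sample data written to a temporary file so the sketch is self-contained.
path, io = mktemp()
write(io, "name,marks\nAsha,91\nRavi,78\n")
close(io)

# Read all lines; the first line is assumed to hold the column names.
lines = collect(eachline(path))
header = split(lines[1], ',')

# Split every remaining line on the comma delimiter.
rows = [split(l, ',') for l in lines[2:end]]

# Convert the second column from text to integers.
marks = [parse(Int, r[2]) for r in rows]

println(header)
println(marks)
```

For anything beyond a quick inspection, a dedicated reader such as the load() function described in the next section is usually the more robust choice, since it also handles quoting and type detection.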


 
 

Reading Tabular Data from CSV Files

 

DataFrames store data in tabular form. A DataFrame can be read from a CSV or Excel file with the load() function, which is available through the Queryverse.jl package. Behind the scenes, Queryverse.jl lets the FileIO.jl package use the CSVFiles.jl package to read the CSV file.

 

Julia




# using necessary packages
using DataFrames, Queryverse
 
# reading dataframe
df = load("marks.csv") |> DataFrame


 
 

 

Sometimes the values in a CSV file are separated by a different character, such as a semicolon. The delimiter can be passed to the load() function so that the data is split into columns correctly, i.e. without the semicolons.

 

Julia




# reading semicolon-separated data
df = load("marks_sc.csv", ';') |> DataFrame
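
To see what a custom delimiter does without installing any package, here is a small sketch that uses `readdlm` from Julia's DelimitedFiles standard library instead of Queryverse. This is a different reader from the one used above, shown only to illustrate the idea; the sample content is hypothetical.

```julia
using DelimitedFiles  # standard library, not part of Queryverse

# Hypothetical semicolon-separated content held in memory.
raw = "name;marks\nAsha;91\nRavi;78\n"

# Passing ';' as the delimiter splits each line at the semicolons.
# header = true returns the data cells and the header cells separately.
data, header = readdlm(IOBuffer(raw), ';', header = true)

println(header)
println(data)
```

Here `data` is a matrix with one row per record, and `header` holds the column names read from the first line.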


 
 

 

By default, the first row of the file is used as the column names of the DataFrame. To change this, we can set the header_exists keyword argument to false, so that the first row is treated as data instead of column names.

 

Julia




# reading data without headers
df = load("marks.csv",
           header_exists = false) |> DataFrame


 
 

 

While loading the file, we can also set our own column names with the colnames keyword, as shown below:

 

Julia




# reading data by changing column names
df = load("marks.csv",
           colnames = ["class",
                          "score"]) |> DataFrame


 
 

 

A given number of lines at the beginning of a CSV file can be skipped while loading by using the skiplines_begin keyword.

 

Julia




# reading data while skipping the first line of the file
df = load("marks.csv",
           skiplines_begin = 1) |> DataFrame
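
The ideas of skipping leading lines and supplying column names by hand can also be sketched with the DelimitedFiles standard library. This is not the Queryverse API: `skipstart` is the `readdlm` keyword for skipping initial lines, and the sample content and the names `column_names` and `table` are hypothetical.

```julia
using DelimitedFiles  # standard library; used here instead of Queryverse

# Hypothetical file content: one title line, then rows with no header row.
raw = "Marks report\n10,91\n11,78\n"

# skipstart = 1 drops the title line before parsing begins.
data = readdlm(IOBuffer(raw), ',', Int; skipstart = 1)

# Column names supplied by hand, mirroring the colnames example above.
column_names = ["class", "score"]
table = Dict(name => data[:, i] for (i, name) in enumerate(column_names))

println(table["class"])
println(table["score"])
```

Keeping the columns in a dictionary keyed by name is only one possible layout; a DataFrame offers the same lookup by column name with many more features.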


 
 

Reading Tabular Data from Excel Files

 

Reading data from Excel sheets works the same way as reading CSV files, as discussed above. The differences are that we pass a file with the ‘.xlsx’ extension instead of ‘.csv’ to the load() function, and that we also name the sheet we want to read.

 

Julia




# reading sheet 1 of an excel file
df = load("marks.xlsx", "Sheet1") |> DataFrame


 
 

 

We can also read only part of a sheet by using the skipstartrows and skipstartcols keywords, which skip the given number of rows and columns at the start of the sheet, as shown below:

 

Julia




# reading while skipping the first row and the first column
df = load("marks.xlsx", "Sheet1",
               skipstartrows = 1,
               skipstartcols = 1) |> DataFrame



