Reading Tabular Data from Files in Julia

Last Updated : 10 Jun, 2021

Julia is a high-level, high-performance, dynamic programming language. Its package ecosystem lets users load, save, and manipulate data stored in many file types for data science, analysis, and machine learning. Tabular data is data arranged in rows and columns, and it can be read from text, CSV, and Excel files, among others.

To make these operations easier, we install the Queryverse.jl package, a meta-package that bundles several data-related packages such as Query.jl, FileIO.jl, and CSVFiles.jl.

Julia

# Adding the Queryverse package 
using Pkg 
Pkg.add("Queryverse")

Reading Tabular Data from Text Files

To read data from a text file, we first open it with the open() function. We can then read the tabular data line by line with the readline() function, as shown below:

Julia

# read file contents, line by line
open("geek.txt") do f

  # running line number
  line = 0

  # read until the end of the file
  while !eof(f)

    # read the next line on every iteration
    s = readline(f)
    line += 1

    # print the line prefixed with its 1-based line number
    println("$line. $s")
  end
end
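Reading lines only gives us strings. To treat them as a table, each line still has to be split into fields. The following is a minimal sketch of that step, assuming a whitespace-separated file whose first line holds the column names. The sample contents (including the names in it) are hypothetical and are read from an in-memory buffer instead of geek.txt so that the snippet runs on its own.

```julia
# Hypothetical contents standing in for a file such as geek.txt
sample = """
name marks
Asha 91
Ravi 78
"""

# Split every non-empty line on whitespace to get the fields of one row
rows = [split(line) for line in eachline(IOBuffer(sample)) if !isempty(strip(line))]

header = rows[1]        # first line holds the column names
records = rows[2:end]   # remaining lines hold the data

# Convert the second field to an integer while printing each record
for r in records
    println(r[1], " scored ", parse(Int, r[2]))
end
```

For a real file, IOBuffer(sample) can be replaced with the file name, since eachline() also accepts a path.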

Reading Tabular Data from CSV Files

A DataFrame stores data in tabular form. With the Queryverse.jl package, a CSV or Excel file can be read into a DataFrame using the load() function. Behind the scenes, FileIO.jl provides load() and hands CSV files over to CSVFiles.jl.

Julia

# using necessary packages
using DataFrames, Queryverse
 
# reading dataframe
df = load("marks.csv") |> DataFrame
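If no marks.csv file is at hand, the same call can be tried on a small file created on the fly. The sketch below assumes Queryverse and DataFrames are installed. The file contents are hypothetical sample data written to a temporary folder.

```julia
using DataFrames, Queryverse

# Hypothetical sample data, written to a temporary folder
path = joinpath(mktempdir(), "marks.csv")
write(path, "class,score\nA,90\nB,85\nC,78\n")

# load() chooses the CSV reader based on the .csv extension
df = load(path) |> DataFrame

println(names(df))     # column names taken from the first line of the file
println(size(df))      # (number of rows, number of columns)
println(first(df, 2))  # first two rows of the table
```

Checking names() and size() right after loading is a quick way to confirm that the file was split into the expected columns.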

Sometimes the values in a CSV file are separated by a character other than a comma, such as a semicolon.

The delimiter can be passed as the second argument of load() so that the values are split into separate columns correctly.

Julia

# reading semicolon-delimited data
df = load("marks_sc.csv", ';') |> DataFrame
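As a quick check of the delimiter argument, the sketch below writes a small semicolon-separated file and loads it. It assumes Queryverse and DataFrames are installed, and the file contents are hypothetical.

```julia
using DataFrames, Queryverse

# Hypothetical semicolon-separated sample data in a temporary folder
path = joinpath(mktempdir(), "marks_sc.csv")
write(path, "class;score\nA;90\nB;85\n")

# Pass the delimiter as the second positional argument
df = load(path, ';') |> DataFrame

println(names(df))
println(df)
```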

By default, the first row of the file supplies the column names of the DataFrame. If the file has no header row, we can set the header_exists keyword argument to false so that the first row is read as data instead of being used for the column names.

Julia

# reading data without headers
df = load("marks.csv", 
           header_exists = false) |> DataFrame

While loading the data of the file, we can also change the column names using the colnames keyword as shown below:

Julia

# reading data by changing column names
df = load("marks.csv", 
           colnames = ["class", 
                          "score"]) |> DataFrame

The skiplines_begin keyword skips the given number of lines at the beginning of a CSV file, which is useful when the file starts with lines that are not part of the table.

Julia

# reading data after skipping the first line of the file
df = load("marks.csv", 
           skiplines_begin = 1) |> DataFrame
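The three keywords covered above (header_exists, colnames, and skiplines_begin) can be compared side by side on small files. This is a sketch that assumes Queryverse and DataFrames are installed. Both files and the replacement column names are hypothetical. It also assumes that skipped lines are dropped before the header line is read, which is how the underlying parser appears to behave, so it is worth confirming on your own data.

```julia
using DataFrames, Queryverse

# Hypothetical sample files in a temporary folder
dir = mktempdir()
plain = joinpath(dir, "marks.csv")
noted = joinpath(dir, "marks_with_note.csv")
write(plain, "class,score\nA,90\nB,85\n")
write(noted, "exported marks\nclass,score\nA,90\nB,85\n")

# header_exists = false: the first line is kept as a data row
df_nohdr = load(plain, header_exists = false) |> DataFrame

# colnames: replace the names read from the header line
df_named = load(plain, colnames = ["section", "marks"]) |> DataFrame

# skiplines_begin = 1: drop the leading note line before reading the table
df_skip = load(noted, skiplines_begin = 1) |> DataFrame

println(size(df_nohdr))
println(names(df_named))
println(names(df_skip))
```

If the file has no extra leading lines, skipping one line removes the header line itself, so the next line would then be treated as the header.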

Reading Tabular Data from Excel Files

Reading data from Excel sheets works the same way as reading CSV files, as discussed above. The differences are that we pass a file with the '.xlsx' extension instead of '.csv' to the load() function, and that we also name the sheet we want to read.

Julia

# reading sheet 1 of an excel file
df = load("marks.xlsx", "Sheet1") |> DataFrame

We can also read only part of a sheet by using the skipstartrows and skipstartcols keywords, which skip the given number of rows and columns at the start of the sheet, as shown below:

Julia

# reading by skipping specific rows and columns
df = load("marks.xlsx", "Sheet1",
               skipstartrows = 1, 
               skipstartcols = 1) |> DataFrame
