
How to Create ARFF File in Weka Tool

Last Updated : 30 Dec, 2022

In this article, we will learn about ARFF (Attribute-Relation File Format) files and how to create one.

As the name suggests, an ARFF file describes a list of instances that share a set of attributes. These files are supported by the WEKA machine learning tool and are used for operations such as data preprocessing and data cleaning.

Structure of the file

An ARFF file contains 2 sections:

  1. Header Section
  2. Data Section

All keywords in an ARFF file start with the @ symbol.

1. Header Section

This section contains information about the dataset, such as the name of the relation, the columns, and the type of each column. The header section has 2 parts: the table/relation part and the attribute part.

@relation: used to give the table name

@attribute: used to give a column name and its data type

Data types:

nominal: a fixed set of allowed values, listed inside curly brackets (like constants)

string: accepts arbitrary text values

numeric: used to store numbers

date: used to store dates, optionally followed by a date format pattern

Syntax:

@relation tablename
@attribute column_name type

Example:

@relation "employee"
@attribute id numeric
@attribute f_name string
@attribute l_name string
@attribute contact_num numeric
@attribute dept {HR,IT,MANAGEMENT,MAINTAINANCE}
@attribute DOB date "dd-MM-yyyy"
@attribute city string

Here the dept column has a nominal data type, so it can only accept the values listed inside the curly brackets.
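The header rules above can also be expressed as a small script. The following is a minimal Python sketch, not part of Weka: the helper names `format_attribute` and `build_header` are hypothetical, and the attribute list mirrors the employee example, including the id column used by the data rows and the month written as MM (in Java-style date patterns, lowercase mm means minutes).

```python
def format_attribute(name, attr_type):
    """Return one @attribute line.

    attr_type is either a type string such as "numeric" or "string",
    or a list of allowed values for a nominal attribute.
    """
    if isinstance(attr_type, (list, tuple)):
        # Nominal: the allowed values go inside curly brackets.
        attr_type = "{" + ",".join(attr_type) + "}"
    return f"@attribute {name} {attr_type}"


def build_header(relation, attributes):
    """Build the header section from a relation name and (name, type) pairs."""
    lines = [f'@relation "{relation}"']
    lines += [format_attribute(name, attr_type) for name, attr_type in attributes]
    return "\n".join(lines)


# Attribute list taken from the employee example (hypothetical helper input).
attributes = [
    ("id", "numeric"),
    ("f_name", "string"),
    ("l_name", "string"),
    ("contact_num", "numeric"),
    ("dept", ["HR", "IT", "MANAGEMENT", "MAINTAINANCE"]),
    ("DOB", 'date "dd-MM-yyyy"'),
    ("city", "string"),
]

header = build_header("employee", attributes)
print(header)
```

Keeping the attributes in a list makes the column order explicit, which matters because the data rows must follow the same order.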

2. Data Section

The data section holds the records (entries) for the declared columns. The values in each record must appear in the same order as the attributes in the header section.

The data section starts with @data and must come after the header section. Each record is written on its own line.

@data: used to start the data section

%: used to start a comment line in the file

Syntax:

@data

<record1>

<record2>

.

.

<record N>

All records must follow the format defined by the attributes in the header section, for example:

1,naman,N,1234556678,IT,02-08-2000,rjt
2,yash,M,1234556679,HR,04-05-2001,amd
3,kishan,G,1214556678,MANAGEMENT,02-11-2001,pbr
4,?,?,5234556678,IT,03-05-2000,amd
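To illustrate how such rows can be produced, here is a minimal Python sketch that turns records into data lines. The function name `format_record` is hypothetical, and `None` is assumed to stand for a missing value, which ARFF writes as `?`. The quoting of values that contain spaces or commas is not covered in this article; it is added here as an assumption based on the ARFF format, and the sketch does not handle values that themselves contain quotes.

```python
def format_record(values):
    """Join one record into a comma-separated ARFF data line.

    None is written as "?", the ARFF marker for a missing value.
    Values containing a space or a comma are wrapped in single quotes.
    """
    fields = []
    for value in values:
        if value is None:
            fields.append("?")
            continue
        text = str(value)
        if " " in text or "," in text:
            text = "'" + text + "'"
        fields.append(text)
    return ",".join(fields)


# Two of the employee records; None marks the unknown names in record 4.
records = [
    (1, "naman", "N", 1234556678, "IT", "02-08-2000", "rjt"),
    (4, None, None, 5234556678, "IT", "03-05-2000", "amd"),
]

data_lines = [format_record(record) for record in records]
for line in data_lines:
    print(line)
```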

The entire file would look like this:

emp.arff file:

@relation "employee"
@attribute id numeric
@attribute f_name string
@attribute l_name string 
@attribute contact_num numeric
@attribute dept {HR,IT,MANAGEMENT,MAINTAINANCE}
@attribute DOB date "dd-MM-yyyy"
@attribute city string

@data
1,naman,N,1234556678,IT,02-08-2000,rjt
2,yash,M,1234556679,HR,04-05-2001,amd
3,kishan,G,1214556678,MANAGEMENT,02-11-2001,pbr
4,?,?,5234556678,IT,03-05-2000,amd

Values are separated by commas (,), and a question mark (?) represents an empty or missing value for a particular column.
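Before loading a file into Weka, it can help to check that every record has one value per attribute and that nominal values are among the allowed ones. The sketch below is a simplified Python checker, not Weka's own parser: `check_arff` is a hypothetical helper, it splits rows on plain commas (so quoted values containing commas are not supported), and it only validates nominal columns.

```python
ARFF_TEXT = """\
% employee records
@relation "employee"
@attribute id numeric
@attribute f_name string
@attribute l_name string
@attribute contact_num numeric
@attribute dept {HR,IT,MANAGEMENT,MAINTAINANCE}
@attribute DOB date "dd-MM-yyyy"
@attribute city string

@data
1,naman,N,1234556678,IT,02-08-2000,rjt
2,yash,M,1234556679,HR,04-05-2001,amd
3,kishan,G,1214556678,MANAGEMENT,02-11-2001,pbr
4,?,?,5234556678,IT,03-05-2000,amd
"""


def check_arff(text):
    """Return a list of problems found in a simple ARFF document."""
    attributes = []  # (name, type_text) pairs in header order
    problems = []
    in_data = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            lowered = line.lower()
            if lowered.startswith("@attribute"):
                _, name, type_text = line.split(None, 2)
                attributes.append((name, type_text))
            elif lowered.startswith("@data"):
                in_data = True
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != len(attributes):
            problems.append(
                f"line {number}: expected {len(attributes)} values, got {len(fields)}"
            )
            continue
        for (name, type_text), value in zip(attributes, fields):
            # Only nominal attributes are checked; "?" is always accepted.
            if type_text.startswith("{") and value != "?":
                allowed = [v.strip() for v in type_text.strip("{}").split(",")]
                if value not in allowed:
                    problems.append(
                        f"line {number}: {value!r} is not allowed for {name}"
                    )
    return problems


print(check_arff(ARFF_TEXT))
```

An empty list means the checker found nothing to report; a row with a department outside the declared set, or with too few values, would be listed with its line number.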

How to Create and Open an ARFF File

You need to have the Weka tool installed on your machine. If it is not installed yet, see How to install Weka.

Step 1: Open any text editor and paste the above code.

Step 2: Save the file with the .arff extension, for example emp.arff.

Step 3: Open the Weka tool.

Step 4: Click on Explorer

Step 5: Click on Open file, select the ARFF file from disk, then click on Open.

Step 6: The file is now loaded. Click on Edit in the Preprocess tab.

Step 7: The dataset is shown in tabular form.

This is how you can work with an ARFF file. With the Weka tool, various operations can be performed on the loaded dataset. Missing values are shown as empty cells.
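Steps 1 and 2 (typing the content and saving it with the .arff extension) can also be done from a script instead of a text editor. The following Python sketch is one possible way to do that; `save_arff` is a hypothetical helper, and the example writes to a temporary folder only so that it leaves no files behind.

```python
import tempfile
from pathlib import Path

HEADER = [
    '@relation "employee"',
    "@attribute id numeric",
    "@attribute f_name string",
    "@attribute l_name string",
    "@attribute contact_num numeric",
    "@attribute dept {HR,IT,MANAGEMENT,MAINTAINANCE}",
    '@attribute DOB date "dd-MM-yyyy"',
    "@attribute city string",
]

DATA = [
    "1,naman,N,1234556678,IT,02-08-2000,rjt",
    "4,?,?,5234556678,IT,03-05-2000,amd",
]


def save_arff(path, header_lines, data_lines):
    """Write the header, the @data marker and the records, one per line."""
    content = "\n".join([*header_lines, "", "@data", *data_lines]) + "\n"
    target = Path(path)
    target.write_text(content, encoding="utf-8")
    return target


# Write into a temporary folder and read the file back to show the result.
with tempfile.TemporaryDirectory() as folder:
    saved = save_arff(Path(folder) / "emp.arff", HEADER, DATA)
    written = saved.read_text(encoding="utf-8")

print(written)
```

In practice you would pass a path in your working folder, such as emp.arff, and then open that file from the Weka Explorer as described in the steps above.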

