Java Program to Extract Content from a Excel sheet
Spreadsheets are the easier way to represent table like data and can give a visual representation of data in tabular format. In this article, let us see how to extract the contents of the Excel sheet via java. Here there arise two cases where Maven is considered in the program or not. Discussing both of them as a prerequisite for a better understanding of the program.
Apache POI API basics are crucial before moving ahead, so the two main prefixes are required when working with Apache POI are as follows:
- HSSF: denotes the API is for working with Excel 2003 and earlier.
- XSSF: denotes the API is for working with Excel 2007 and later.
Following 4 interfaces are important and essential to go through
- Workbook: High-level representation of an Excel workbook. HSSFWorkbook and XSSFWorkbook.
- Sheet: High-level representation of an Excel worksheet. Typical implementing classes are HSSFSheetand XSSFSheet.
- Row: High–level representation of a row in a spreadsheet. HSSFRow and XSSFRow are two concrete classes.
- Cell: High-level representation of a cell in a row. HSSFCell and XSSFCell are the typical implementing classes.
Case 1: Maven java project where dependencies are as follows
- All Maven projects will have pom.xml as the main file.
- There we need to add the dependencies.
- pom.xml file contents Excel formats do differ a bit as shown below:
- It is recommended to specify the latest version. (Maven project example used here is 3.11)
For Excel 2003 format
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>VERSION</version> </dependency>
For Excel 2007 format
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>VERSION</version> </dependency>
Case 2: Non-Maven Java project.
In order to meet the requirements there arises an urgent need to add the jar files in the build path in order to extract the contents. For this, download the latest release of the Apache POI library.
To extract the contents from
Excel 2003 format
poi-VERSION.jar is enough.
Excel 2007 format :
poi-ooxml-VERSION.jar poi-ooxml-schemas-VERSION.jar xmlbeans-VERSION.jar
Procedure: Reading from Excel file using Apache POI with examples. The goal is to read the contents from the given Excel file and display the contents of the Excel file in the “output” window.
Step 1: Here we are using the POJO class which has an equal number of fields as given in the attached Excel file. Excel file got 3 columns and hence 3 fields are there in the POJO class. Sample Excel file contents are as shown below. It is always nicer to have a POJO(Plain old java object) class for these kinds of operations. As there are 3 column values and the details are related to an employee, let us have an employee class.
Sample input image:
Step 2: Depends on different data types like String, Number(it fits for integer, double, float, etc), Boolean, we need to have a method to get the cell values of Excel
Step 3:Method to extract Excel file contents. We need to specify the location of the file correctly. otherwise, it will end up with IOException
Step 4: Integrating the concepts from Step 1 to Step 3 in the Main program
Output: For our example, we have 3 rows of data only
Conclusion : Apache POI provides a nicer implementation to extract Excel file contents. In coding, according to the availability of data in the number of cells, we need to have POJO class attributes and also we need to specify coll data in the “readDataFromExcelFile” method. We can format Double data as per our requirement also.