Apache POI | Getting Started

Last Updated : 13 Oct, 2022

POI stands for “Poor Obfuscation Implementation”. Apache POI is an API provided by the Apache Software Foundation and is a collection of Java libraries. These libraries let you read, write, and manipulate Microsoft Office files such as Excel spreadsheets, PowerPoint presentations, and Word documents. Its first version was released on 30 December 2001.

Apache POI Architecture

Apache POI has different classes and methods for working with different MS Office documents.

  • POIFS stands for “Poor Obfuscation Implementation File System”. This component is the foundation of all other POI elements. It is used to read different files explicitly.
  • HSSF stands for “Horrible Spreadsheet Format”. It is used to read and write the xls format of MS-Excel files.
  • XSSF stands for “XML Spreadsheet Format”. It is used for the xlsx file format of MS-Excel.
  • HPSF stands for “Horrible Property Set Format”. It is used to extract property sets from MS-Office files.
  • HWPF stands for “Horrible Word Processor Format”. It is used to read and write doc files of MS-Word.
  • XWPF stands for “XML Word Processor Format”. It is used to read and write docx files of MS-Word.
  • HSLF stands for “Horrible Slide Layout Format”. It is used to read, create, and edit PowerPoint presentations.
  • HDGF stands for “Horrible Diagram Format”. It contains classes and methods for MS-Visio binary files.
  • HPBF stands for “Horrible Publisher Format”. It is used to read MS-Publisher files.
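
The "Horrible" components above target the older binary formats, which are stored in an OLE2 container (the job of POIFS), while the "XML" components target the newer formats, which are stored in a ZIP container. The sketch below illustrates that split by checking the first bytes of a file. It does not use POI itself: OfficeFormatSniffer and detect are hypothetical names introduced here, and a ZIP signature only suggests an OOXML file, since every ZIP archive starts the same way.

```java
class OfficeFormatSniffer {
    // First 8 bytes of an OLE2 compound document (.xls, .doc, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // First 4 bytes of a ZIP archive, the container used by .xlsx, .docx, .pptx.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "OLE2", "OOXML" or "UNKNOWN" based on the leading bytes only.
    // Assumption: any ZIP signature is reported as "OOXML" without further checks.
    static String detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";   // candidates: HSSF, HWPF, HSLF, HDGF, HPBF
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "OOXML";  // candidates: XSSF, XWPF
        }
        return "UNKNOWN";
    }

    // True when data is at least as long as prefix and begins with it.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] zipHeader = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        System.out.println(detect(zipHeader));
    }
}
```

In a real project the header bytes would come from the first few bytes of the file on disk; the result then indicates which family of POI components is worth trying.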

Installation

There are two ways to add the Apache POI jar files, depending on the type of project:

  1. Maven Project: If the project uses Maven, add the following dependencies to the project's pom.xml file:

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi</artifactId>
  <version>3.12</version>
</dependency>
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>3.12</version>
</dependency>


  Steps to create a Maven project in Eclipse and add the dependency:
    • Click File -> New -> Maven Project.
    • In the window that appears, click Next.
    • Select maven-archetype-webapp.
    • Enter a name for the project.
    • The project is created in the workspace, and a pom.xml file is generated automatically.
    • Open the pom.xml file.
    • Copy the Apache POI dependencies into the existing structure of the pom.xml file.
    • Save the pom.xml file. Maven adds the dependencies once the file is saved.
  2. Simple Java Project: If the project does not use Maven, download the jar files from the POI download page. At minimum, include the following jar files to run the sample code:
    • poi-3.10-FINAL.jar
    • poi-ooxml-3.10-FINAL.jar
    • commons-codec-1.5.jar
    • poi-ooxml-schemas-3.10-FINAL.jar
    • xml-apis-1.0.b2.jar
    • stax-api-1.0.1.jar
    • xmlbeans-2.3.0.jar
    • dom4j-1.6.1.jar

  Follow this link to see how to add external jars in Eclipse.

Classes and Methods

Workbook is the super-interface of all classes that create or maintain Excel workbooks. The following two classes implement this interface:

  1. HSSFWorkbook implements the Workbook interface and is used for Excel files in .xls format. Some of its constructors are listed below.
    • Constructors
      HSSFWorkbook()
      HSSFWorkbook(DirectoryNode directory, boolean preserveNodes)
      HSSFWorkbook(DirectoryNode directory, POIFSFileSystem fs, boolean preserveNodes)
      HSSFWorkbook(java.io.InputStream s)
      HSSFWorkbook(java.io.InputStream s, boolean preserveNodes)
      HSSFWorkbook(POIFSFileSystem fs)
      HSSFWorkbook(POIFSFileSystem fs, boolean preserveNodes)
    • where:
      directory is the POI filesystem directory to process from.
      fs is the POI filesystem that contains the workbook stream.
      preserveNodes is an optional parameter that decides whether to preserve other nodes, such as macros. If set, it consumes a lot of memory because the whole POIFSFileSystem is kept in memory.
  2. XSSFWorkbook is the high-level representation of an Excel workbook in .xlsx format. It belongs to the org.apache.poi.xssf.usermodel package and implements the Workbook interface. Some of its constructors and methods are listed below.
    • Constructors
      XSSFWorkbook()
      XSSFWorkbook(java.io.File file)
      XSSFWorkbook(java.io.InputStream is)
      XSSFWorkbook(java.lang.String path)
    • Methods
      createSheet()
      createSheet(java.lang.String sheetname)
      createFont()
      createCellStyle()
      setPrintArea(int sheetIndex, int startColumn, int endColumn, int startRow, int endRow)
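
Because both classes implement Workbook, the same code can write and read either format. The following is a minimal sketch, assuming the poi and poi-ooxml jars are on the classpath (POI 3.11 or later, since it relies on Workbook.close()). WorkbookRoundTrip, writeGreeting, and readFirstCell are hypothetical names used only for this illustration, and the workbooks are kept in memory rather than saved to disk.

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class WorkbookRoundTrip {

    // Writes one string cell into the given workbook and returns the file bytes.
    // Works for HSSFWorkbook (.xls) and XSSFWorkbook (.xlsx) alike.
    static byte[] writeGreeting(Workbook workbook) throws IOException {
        try (Workbook wb = workbook) {
            Sheet sheet = wb.createSheet("Demo");
            Row row = sheet.createRow(0);      // first row (0-based)
            Cell cell = row.createCell(0);     // first column (0-based)
            cell.setCellValue("Hello, POI");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            wb.write(out);
            return out.toByteArray();
        }
    }

    // Reads the string in the first cell of the first sheet, then closes the workbook.
    static String readFirstCell(Workbook workbook) throws IOException {
        try (Workbook wb = workbook) {
            return wb.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xls = writeGreeting(new HSSFWorkbook());
        byte[] xlsx = writeGreeting(new XSSFWorkbook());

        System.out.println(readFirstCell(new HSSFWorkbook(new ByteArrayInputStream(xls))));
        System.out.println(readFirstCell(new XSSFWorkbook(new ByteArrayInputStream(xlsx))));
    }
}
```

To save a real file instead, the ByteArrayOutputStream could be swapped for a java.io.FileOutputStream pointing at a .xls or .xlsx path that matches the workbook class.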

Advantages

  1. It is suitable for large files and uses less memory.
  2. The main advantage of Apache POI is that it supports both HSSFWorkbook and XSSFWorkbook.
  3. It contains the HSSF implementation of the Excel file format.

