Apache POI | Getting Started

POI stands for “Poor Obfuscation Implementation”. Apache POI is an API from the Apache Software Foundation that consists of a collection of Java libraries. These libraries let you read, write, and manipulate Microsoft Office files such as Excel spreadsheets, PowerPoint presentations, and Word documents. Its first version was released on 30 December 2001.

Apache POI Architecture

Apache POI provides different classes and methods for working with the different MS Office document types.

  • POIFS
    Stands for “Poor Obfuscation Implementation File System”. This component is the basis of all other POI elements. It is used to read different files explicitly.
  • HSSF
    Stands for “Horrible Spreadsheet Format”. It is used to read and write the xls format of MS-Excel files.
  • XSSF
    Stands for “XML Spreadsheet Format”. It is used for the xlsx file format of MS-Excel.
  • HPSF
    Stands for “Horrible Property Set Format”. It is used to extract the property sets of MS-Office files.
  • HWPF
    Stands for “Horrible Word Processor Format”. It is used to read and write doc extension files of MS-Word.
  • XWPF
    Stands for “XML Word Processor Format”. It is used to read and write docx extension files of MS-Word.
  • HSLF
    Stands for “Horrible Slide Layout Format”. It is used to read, create, and edit PowerPoint presentations.
  • HDGF
    Stands for “Horrible Diagram Format”. It contains classes and methods for MS-Visio binary files.
  • HPBF
    Stands for “Horrible PuBlisher Format”. It is used to read and write MS-Publisher files.
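
To keep the components apart, the list above can be read as a lookup from file type to component. The following is only a sketch: the class PoiComponentLookup and its method forFileName are hypothetical helpers, not part of the POI API. The ppt, vsd, and pub extensions are the usual extensions of the binary PowerPoint, Visio, and Publisher formats and are assumed here, since the list names the applications rather than the extensions.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of Apache POI): maps a file name to the
// POI component that the component list associates with that format.
final class PoiComponentLookup {

    // Extension -> component name. ppt/vsd/pub are assumed extensions
    // for the binary PowerPoint, Visio, and Publisher formats.
    private static final Map<String, String> COMPONENT_BY_EXTENSION = Map.of(
            "xls", "HSSF",
            "xlsx", "XSSF",
            "doc", "HWPF",
            "docx", "XWPF",
            "ppt", "HSLF",
            "vsd", "HDGF",
            "pub", "HPBF");

    // Returns the component for the file's extension, or empty if the
    // file has no extension or the extension is not in the table.
    static Optional<String> forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(COMPONENT_BY_EXTENSION.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(forFileName("report.xlsx").orElse("unknown")); // XSSF
        System.out.println(forFileName("letter.DOC").orElse("unknown"));  // HWPF
        System.out.println(forFileName("notes.txt").orElse("unknown"));   // unknown
    }
}
```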

Installation

There are two ways to install the Apache POI jar files, depending on the type of project:



  1. Maven Project

    If the project uses Maven, add the dependencies to the pom.xml file of the project.
    The dependencies to add are given below:

    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <version>3.12</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml</artifactId>
      <version>3.12</version>
    </dependency>

    Steps to create a Maven project in Eclipse and add the dependency

    • Click File -> New -> Maven Project
    • A new window appears; click Next
    • Select maven-archetype-webapp
    • Enter a name for the project
    • The project is created in the workspace, and a pom.xml file is generated automatically
    • Open the pom.xml file to see its existing structure
    • Copy the Apache POI dependencies into the pom.xml file
    • The Maven dependencies are added once the pom.xml file is saved.
  2. Simple Java Project

    If you are not using Maven, download the POI jar files from the POI download page. Include at least the following jar files to run the sample code:

    poi-3.10-FINAL.jar
    poi-ooxml-3.10-FINAL.jar
    commons-codec-1.5.jar
    poi-ooxml-schemas-3.10-FINAL.jar
    xml-apis-1.0.b2.jar
    stax-api-1.0.1.jar
    xmlbeans-2.3.0.jar
    dom4j-1.6.1.jar



    Add these files to the build path of the project in Eclipse as external jars.
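
Whichever installation route is used, a quick way to confirm that the jars are visible at runtime is to try loading one class from each. This is a minimal sketch under stated assumptions: PoiClasspathCheck and isOnClasspath are hypothetical names introduced for illustration. The two class names checked are real POI classes: the Workbook interface ships in the poi jar and XSSFWorkbook in the poi-ooxml jar.

```java
// Hypothetical helper (not part of Apache POI) for checking that the
// POI jars are visible on the runtime classpath.
final class PoiClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] classNames = {
            "org.apache.poi.ss.usermodel.Workbook",      // from the poi jar
            "org.apache.poi.xssf.usermodel.XSSFWorkbook" // from the poi-ooxml jar
        };
        for (String name : classNames) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "missing"));
        }
    }
}
```

If either line reports "missing", the corresponding jar or Maven dependency is probably not on the build path.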

Classes and Methods

Workbook
It is the super-interface of all classes that create or maintain Excel workbooks. The following two classes implement this interface:

  1. HSSFWorkbook
    It implements the Workbook interface and is used for Excel files in .xls format. Listed below are some of the constructors of this class.

    • Constructors

      HSSFWorkbook()
      HSSFWorkbook(DirectoryNode directory, boolean preserveNodes)
      HSSFWorkbook(DirectoryNode directory, POIFSFileSystem fs, boolean preserveNodes)
      HSSFWorkbook(java.io.InputStream s)
      HSSFWorkbook(java.io.InputStream s, boolean preserveNodes)
      HSSFWorkbook(POIFSFileSystem fs)
      HSSFWorkbook(POIFSFileSystem fs, boolean preserveNodes)

      where:
      directory – the POI filesystem directory to process from.
      fs – the POI filesystem that contains the workbook stream.
      preserveNodes – an optional parameter that decides whether to preserve other nodes such as macros. If set, it consumes a lot of memory because the whole POIFSFileSystem is kept in memory.

  2. XSSFWorkbook
    It implements the Workbook interface and is used for Excel files in .xlsx format, representing both their high-level and low-level structures. It belongs to the org.apache.poi.xssf.usermodel package. Listed below are some of the constructors and methods of this class.

    • Constructors

      XSSFWorkbook()
      XSSFWorkbook(java.io.File file)
      XSSFWorkbook(java.io.InputStream is)
      XSSFWorkbook(java.lang.String path)

    • Methods

      createSheet()
      createSheet(java.lang.String sheetname)
      createFont()
      createCellStyle()
      setPrintArea(int sheetIndex, int startColumn, int endColumn, int startRow, int endRow)
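
Because both classes implement Workbook, the same code can fill either kind of file. The sketch below assumes POI 3.12 or later on the classpath (the version used in the Maven snippet, where Workbook can be closed with try-with-resources). The class WorkbookRoundTrip and its methods are hypothetical names for illustration. It writes one styled cell through the shared interface, sets a print area, and reads the value back with the InputStream constructors listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.Font;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical demo class; assumes the poi and poi-ooxml jars (3.12+).
final class WorkbookRoundTrip {

    // Fills any Workbook implementation through the shared interface,
    // closes it, and returns the serialized file bytes.
    static byte[] write(Workbook workbook, String text) {
        try (Workbook wb = workbook; ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = wb.createSheet("Demo");
            Font font = wb.createFont();
            font.setFontName("Arial");
            CellStyle style = wb.createCellStyle();
            style.setFont(font);
            Cell cell = sheet.createRow(0).createCell(0);
            cell.setCellValue(text);
            cell.setCellStyle(style);
            // Sheet 0, columns 0 to 1, rows 0 to 2.
            wb.setPrintArea(0, 0, 1, 0, 2);
            wb.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the string in the first cell of the first sheet, then closes the workbook.
    static String readFirstCell(Workbook workbook) {
        try (Workbook wb = workbook) {
            return wb.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes an .xls workbook in memory and reads it back with HSSFWorkbook(InputStream).
    static String roundTripXls(String text) {
        byte[] bytes = write(new HSSFWorkbook(), text);
        try {
            return readFirstCell(new HSSFWorkbook(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes an .xlsx workbook in memory and reads it back with XSSFWorkbook(InputStream).
    static String roundTripXlsx(String text) {
        byte[] bytes = write(new XSSFWorkbook(), text);
        try {
            return readFirstCell(new XSSFWorkbook(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripXls("hello xls"));
        System.out.println(roundTripXlsx("hello xlsx"));
    }
}
```

Only the constructor call differs between the two formats; everything in write and readFirstCell goes through the Workbook interface. Writing to a file instead of memory would only need a FileOutputStream in place of the ByteArrayOutputStream.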

Advantages

  1. It is suitable for large files and uses less memory.
  2. The main advantage of Apache POI is that it supports both HSSFWorkbook and XSSFWorkbook.
  3. It contains the HSSF implementation of the Excel file format.

