Apache POI | Introduction

Apache POI is an open source Java library for creating and manipulating file formats based on Microsoft Office. With POI, you can create, modify, and read documents such as Excel spreadsheets, Word documents, and PowerPoint presentations. For example, Java has no built-in support for working with Excel files, so an open source API such as POI is needed for the job.
Apache POI provides a Java API for manipulating file formats based on the Office Open XML (OOXML) standard and Microsoft's OLE2 standard. Apache POI releases are available under the Apache License (V2.0).
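
The two container standards can be told apart from the first bytes of a file. The sketch below is only an illustration using the plain JDK, not POI's own detection code. It relies on two facts: OLE2 files begin with the fixed signature D0 CF 11 E0 A1 B1 1A E1, and OOXML files are ZIP packages, which begin with "PK" followed by the bytes 03 04. The class name FormatSniffer and the method detect are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; it is not part of Apache POI.
class FormatSniffer {

    // OLE2 compound document signature (container of .xls, .doc, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature; .xlsx, .docx and .pptx are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies a file by its leading bytes.
    // "ZIP" only means "possibly OOXML": any ZIP archive starts the same way.
    static String detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java FormatSniffer <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] buffer = new byte[8];
            int read = in.read(buffer); // -1 for an empty file
            byte[] header = Arrays.copyOf(buffer, Math.max(read, 0));
            System.out.println(args[0] + " looks like: " + detect(header));
        }
    }
}
```

Run against an .xls file this should report OLE2, and against an .xlsx file it should report ZIP. In practice POI performs this kind of detection for you when a workbook is opened.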

Important features:

  • Apache POI provides stream-based processing, which suits large files because it requires less memory.
  • Apache POI can handle both the XLS and XLSX spreadsheet formats.
  • The Apache POI HSSF implementation handles the Excel ’97(-2007) file format, i.e. XLS.
  • The Apache POI XSSF implementation should be used for the Excel 2007 OOXML (.xlsx) file format.
  • The HSSF and XSSF APIs provide mechanisms to read, write, and modify Excel spreadsheets.
  • Apache POI also provides the SXSSF API, an extension of XSSF for writing very large Excel sheets.
  • The SXSSF API requires less memory, which makes it suitable for very large spreadsheets when heap memory is limited.
  • There are two models to choose from: the event model and the user model. The event model requires less memory because the Excel file is read as a sequence of tokens that your code must process itself. The user model is more object-oriented and easier to use.
  • Apache POI provides excellent support for additional Excel features such as formulas, cell styles (fill colors, borders, and fonts), headers and footers, data validations, images, and hyperlinks.
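
The memory-saving behaviour of SXSSF can be sketched as follows. This is a minimal example under stated assumptions: a recent POI release (4.x or 5.x, not the 3.9 shown later in this article) with the poi-ooxml jar on the classpath. The class name StreamingWriteSketch, the sheet name "data", and the window size of 100 rows are arbitrary choices for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

// Hypothetical class name; assumes poi and poi-ooxml (4.x or later) on the classpath.
class StreamingWriteSketch {

    // Writes rowCount rows and returns the resulting .xlsx file as bytes.
    static byte[] writeRows(int rowCount) throws IOException {
        // Keep at most 100 rows in memory; older rows are flushed to a temporary file.
        SXSSFWorkbook workbook = new SXSSFWorkbook(100);
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = workbook.createSheet("data");
            for (int r = 0; r < rowCount; r++) {
                Row row = sheet.createRow(r);
                row.createCell(0).setCellValue(r);          // numeric cell
                row.createCell(1).setCellValue("row " + r); // text cell
            }
            workbook.write(out);
            return out.toByteArray();
        } finally {
            // Remove the temporary files that backed the flushed rows.
            // Newer releases may already do this in close(); calling both is harmless.
            workbook.dispose();
            workbook.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xlsx = writeRows(10_000);
        System.out.println("Wrote " + xlsx.length + " bytes");
    }
}
```

Because rows outside the window are flushed to disk, they can no longer be read back or modified through the workbook object. That is the trade-off for the lower heap usage. For small sheets the plain XSSF user model is usually simpler.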

Commonly used components of Apache POI:

  • HSSF (Horrible Spreadsheet Format): reads and writes the XLS format of MS Excel files.
  • XSSF (XML Spreadsheet Format): reads and writes the XLSX format of MS Excel files.
  • POIFS (Poor Obfuscation Implementation File System): the foundation of the OLE2-based POI components. It reads and writes the underlying OLE2 container files directly.
  • HWPF (Horrible Word Processor Format): reads and writes the DOC format of MS Word files.
  • HSLF (Horrible Slide Layout Format): reads, creates, and edits PowerPoint presentations.
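
HSSF and XSSF share the common interfaces in org.apache.poi.ss.usermodel, so the same code can work with either format. The sketch below writes one cell and reads it back, once as XLS and once as XLSX. It assumes a recent POI release (4.x or 5.x) with both poi and poi-ooxml on the classpath. The class name RoundTripSketch and the sheet name "demo" are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical class name; assumes poi and poi-ooxml (4.x or later) on the classpath.
class RoundTripSketch {

    // Writes text into cell A1 of the given empty workbook, serializes it,
    // then reopens the bytes and returns the value read back from A1.
    static String roundTrip(Workbook emptyWorkbook, String text) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Workbook workbook = emptyWorkbook) {
            Sheet sheet = workbook.createSheet("demo");
            sheet.createRow(0).createCell(0).setCellValue(text);
            workbook.write(out);
        }
        // WorkbookFactory inspects the data and picks HSSF or XSSF accordingly.
        try (Workbook readBack =
                 WorkbookFactory.create(new ByteArrayInputStream(out.toByteArray()))) {
            return readBack.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new HSSFWorkbook(), "hello xls"));  // XLS via HSSF
        System.out.println(roundTrip(new XSSFWorkbook(), "hello xlsx")); // XLSX via XSSF
    }
}
```

Only the constructor call differs between the two formats. Everything else goes through the shared Workbook, Sheet, Row, and Cell interfaces, which is why code written against them rarely needs to know which implementation is in use.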

Environment

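Apache POI runtime dependencies: if you are working on a Maven project, you can include the POI dependency in the pom.xml file as shown here. Note that the poi artifact covers the OLE2-based formats such as XLS; working with XLSX files additionally requires the poi-ooxml artifact from the same group.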

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.9</version>
</dependency>

To add this in Eclipse, go to:

Window -> Show View -> Other -> Maven -> Maven Repositories

If you are not using Maven, you can download the jar files from the POI download page. Include at least the following jar files to run the sample code:

  • poi-3.10-FINAL.jar
  • poi-ooxml-3.10-FINAL.jar
  • commons-codec-1.5.jar
  • poi-ooxml-schemas-3.10-FINAL.jar
  • xml-apis-1.0.b2.jar
  • stax-api-1.0.1.jar
  • xmlbeans-2.3.0.jar
  • dom4j-1.6.1.jar
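
A quick way to confirm that the jars are visible to your project is to try loading one class from each of them. The sketch below uses only the JDK, so it compiles even when POI is missing. The class name PoiClasspathCheck is hypothetical, and the three class names it probes are assumptions about which classes live in the poi, poi-ooxml, and xmlbeans jars.

```java
// Hypothetical helper; uses only the JDK so it runs even if POI is absent.
class PoiClasspathCheck {

    // Returns true if the named class can be located on the classpath.
    static boolean isPresent(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.poi.hssf.usermodel.HSSFWorkbook", // expected in the poi jar
            "org.apache.poi.xssf.usermodel.XSSFWorkbook", // expected in the poi-ooxml jar
            "org.apache.xmlbeans.XmlObject"               // expected in the xmlbeans jar
        };
        for (String name : required) {
            System.out.println(name + " : " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

If any line reports MISSING, the corresponding jar is probably not on the build path yet.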

In Eclipse, add these files to the project's build path as external jars.

References:
https://poi.apache.org/apidocs/
https://poi.apache.org/overview.html#components

This article is contributed by Pankaj Kumar.