Introduction to Apache POI

  • Difficulty Level : Easy
  • Last Updated : 11 Jul, 2022

Apache POI is an open-source Java library for creating and manipulating file formats based on Microsoft Office. With POI, you can create, modify, and read files in these formats. For example, Java does not provide built-in support for working with Excel files, so an open-source API such as POI is needed for the job.

Apache POI provides a Java API for manipulating file formats based on Microsoft's Office Open XML (OOXML) standard and OLE2 standard. Apache POI releases are available under the Apache License (V2.0).

Some important features of Apache POI are as follows: 

  • Apache POI provides stream-based processing, which is suitable for large files and requires less memory.
  • Apache POI can handle both the XLS and XLSX spreadsheet formats.
  • The HSSF implementation handles the Excel ’97 to 2007 binary file format, i.e. XLS.
  • The XSSF implementation should be used for the Excel 2007 OOXML (.xlsx) file format.
  • The HSSF and XSSF APIs provide mechanisms to read, write, and modify Excel spreadsheets.
  • Apache POI also provides the SXSSF API, an extension of XSSF for working with very large Excel sheets.
  • The SXSSF API requires less memory and is suitable when the spreadsheet is very large and heap memory is limited.
  • There are two models to choose from: the event model and the user model. The event model requires less memory because the Excel file is read as a sequence of tokens that your code has to process itself, instead of being loaded into memory as a whole. The user model is more object-oriented and easier to use.
  • Apache POI supports additional Excel features such as formulas, cell styles (fill colors, borders, fonts), headers and footers, data validations, images, and hyperlinks.
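
The user model described above can be illustrated with a minimal sketch. It assumes a recent POI release (4.x or later) with both the poi and poi-ooxml jars on the classpath, which is newer than the 3.9 version used elsewhere in this article. The class name PoiUserModelSketch, its two methods, and the sheet name "Scores" are hypothetical names chosen for illustration. The sketch writes a styled header, two numbers, and a SUM formula to an in-memory .xlsx workbook, then reads the workbook back and evaluates the formula.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.Font;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical class for illustration; assumes POI 4.x+ with poi-ooxml.
public class PoiUserModelSketch {

    // Builds a small .xlsx workbook in memory and returns its bytes.
    public static byte[] writeWorkbook() throws Exception {
        try (Workbook workbook = new XSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = workbook.createSheet("Scores");

            // Header cell with a bold font applied through a cell style.
            Font bold = workbook.createFont();
            bold.setBold(true);
            CellStyle headerStyle = workbook.createCellStyle();
            headerStyle.setFont(bold);
            Row header = sheet.createRow(0);
            Cell title = header.createCell(0);
            title.setCellValue("Score");
            title.setCellStyle(headerStyle);

            // Two numeric cells (A2, A3) and a formula cell (A4).
            sheet.createRow(1).createCell(0).setCellValue(40.0);
            sheet.createRow(2).createCell(0).setCellValue(2.0);
            sheet.createRow(3).createCell(0).setCellFormula("SUM(A2:A3)");

            workbook.write(out);
            return out.toByteArray();
        }
    }

    // Reads the workbook back and evaluates the formula in cell A4.
    public static double readTotal(byte[] xlsx) throws Exception {
        try (Workbook workbook = new XSSFWorkbook(new ByteArrayInputStream(xlsx))) {
            Cell total = workbook.getSheet("Scores").getRow(3).getCell(0);
            return workbook.getCreationHelper()
                    .createFormulaEvaluator()
                    .evaluate(total)
                    .getNumberValue();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readTotal(writeWorkbook()));
    }
}
```

In-memory streams are used here only to keep the sketch self-contained; a FileOutputStream and FileInputStream would work the same way for files on disk.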

Commonly used components of Apache POI

  1. HSSF (Horrible Spreadsheet Format): It is used to read and write the xls format of MS-Excel files.
  2. XSSF (XML Spreadsheet Format): It is used to read and write the xlsx file format of MS-Excel.
  3. POIFS (Poor Obfuscation Implementation File System): This component implements the OLE2 compound document format and is the foundation of the other OLE2-based POI components. It can also be used to read such files directly.
  4. HWPF (Horrible Word Processor Format): It is used to read and write doc extension files of MS-Word.
  5. HSLF (Horrible Slide Layout Format): It is used to read, create, and edit PowerPoint presentations.
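
As a rough sketch of how the HSSF and XSSF components relate, both can be used through the common Workbook interface in org.apache.poi.ss.usermodel, and WorkbookFactory can detect the format when reading. The example assumes a recent POI release (4.x or later) with both the poi and poi-ooxml jars on the classpath. The class PoiFormatSketch and its method names are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical class for illustration; assumes POI 4.x+ with poi-ooxml.
public class PoiFormatSketch {

    // Picks the component by file extension: HSSF for .xls, XSSF otherwise.
    public static Workbook newWorkbookFor(String fileName) {
        return fileName.toLowerCase().endsWith(".xls")
                ? new HSSFWorkbook()
                : new XSSFWorkbook();
    }

    // Writes one sample cell and serializes the workbook to bytes.
    // The code is identical for both formats because it only uses
    // the shared Workbook interface.
    public static byte[] writeSample(Workbook workbook) throws Exception {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            workbook.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            workbook.write(out);
            return out.toByteArray();
        } finally {
            workbook.close();
        }
    }

    // Lets WorkbookFactory detect the format, then reports which
    // implementation class was chosen along with the first cell's text.
    public static String describe(byte[] bytes) throws Exception {
        try (Workbook workbook = WorkbookFactory.create(new ByteArrayInputStream(bytes))) {
            String text = workbook.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
            return workbook.getClass().getSimpleName() + ":" + text;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe(writeSample(newWorkbookFor("report.xls"))));
        System.out.println(describe(writeSample(newWorkbookFor("report.xlsx"))));
    }
}
```

Coding against the Workbook interface rather than HSSFWorkbook or XSSFWorkbook directly keeps the rest of the program independent of the file format.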

Environment

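Apache POI runtime dependencies: If you are working on a Maven project, you can include the POI dependency by adding the following lines to the pom.xml file. Note that the poi artifact covers the OLE2-based formats such as XLS; working with the OOXML formats (for example XLSX through XSSF) also requires the poi-ooxml artifact.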

<dependency> 
    <groupId>org.apache.poi</groupId> 
    <artifactId>poi</artifactId> 
    <version>3.9</version> 
</dependency> 

Then, to check the Maven repositories in Eclipse, go to

Window -> Show View -> Other -> Maven -> Maven Repositories

If you are not using Maven, you can download the POI jar files from the POI download page and add at least the jars your code needs to the project as external jars in Eclipse.

This article is contributed by Pankaj Kumar.
