Apache POI is an open source Java library for creating and manipulating file formats based on Microsoft Office. With POI you can create, modify, and read files in these formats. Java does not provide built-in support for working with Excel files, for example, so an open source API is needed for the job.
Apache POI provides a Java API for manipulating file formats based on the Office Open XML (OOXML) standard and the OLE2 standard from Microsoft. Apache POI releases are available under the Apache License, Version 2.0.
- Apache POI provides stream-based processing, which is suitable for large files and requires less memory.
- Apache POI is able to handle both XLS and XLSX formats of spreadsheets.
- Apache POI contains the HSSF implementation for the Excel '97(-2007) file format, i.e. XLS.
- The Apache POI XSSF implementation should be used for the Excel 2007 OOXML (.xlsx) file format.
- The HSSF and XSSF APIs provide mechanisms to read, write, or modify Excel spreadsheets.
- Apache POI also provides the SXSSF API, an extension of XSSF for working with very large Excel sheets.
- The SXSSF API requires less memory because it keeps only a sliding window of rows in memory while writing, so it is suitable for very large spreadsheets when heap memory is limited.
- There are two models to choose from: the event model and the user model. The event model requires less memory because the Excel file is read as a stream of tokens that your code has to process itself. The user model is more object oriented and easier to use.
- Apache POI provides excellent support for additional Excel features such as formulas, cell styles (fill colors, borders, and fonts), headers and footers, data validations, images, and hyperlinks.
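As a rough illustration of the SXSSF points above, the following sketch writes rows through a small in-memory window and then reads the result back with XSSF to count the rows. The class name SxssfSketch, its method names, the sheet name, and the window size of 100 rows are assumptions made for this example, and it assumes the poi and poi-ooxml jars are on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical helper class, not part of Apache POI.
public class SxssfSketch {

    // Writes rowCount rows with SXSSF and returns the .xlsx bytes.
    static byte[] writeLargeSheet(int rowCount) {
        // Keep at most 100 rows in memory; older rows are flushed to a temporary file.
        try (SXSSFWorkbook workbook = new SXSSFWorkbook(100);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            try {
                Sheet sheet = workbook.createSheet("Data");
                for (int r = 0; r < rowCount; r++) {
                    Row row = sheet.createRow(r);
                    row.createCell(0).setCellValue(r);          // numeric cell
                    row.createCell(1).setCellValue("row-" + r); // text cell
                }
                workbook.write(out);
                return out.toByteArray();
            } finally {
                // Remove the temporary files that back the flushed rows.
                workbook.dispose();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the bytes back with the regular XSSF user model and counts the rows.
    static int countRows(byte[] xlsx) {
        try (Workbook workbook = new XSSFWorkbook(new ByteArrayInputStream(xlsx))) {
            return workbook.getSheetAt(0).getPhysicalNumberOfRows();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling SxssfSketch.countRows(SxssfSketch.writeLargeSheet(250)) should report 250 rows. Rows that have already been flushed out of the window can no longer be accessed while writing, which is the trade-off SXSSF makes for its lower memory use.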
Commonly used components of Apache POI:
- HSSF (Horrible Spreadsheet Format): It is used to read and write the xls format of MS-Excel files.
- XSSF (XML Spreadsheet Format): It is used for the xlsx file format of MS-Excel.
- POIFS (Poor Obfuscation Implementation File System): This component is the foundation of all other POI elements. It is used to read and write OLE2 files directly.
- HWPF (Horrible Word Processor Format): It is used to read and write doc extension files of MS-Word.
- HSLF (Horrible Slide Layout Format): It is used to read, create, and edit PowerPoint presentations.
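To make the relationship between HSSF and XSSF concrete, here is a minimal sketch that writes and reads one cell in both formats through the shared Workbook, Sheet, and Row interfaces. The class name HssfXssfSketch, its method names, and the sheet name are assumptions made for this example, and it assumes the poi and poi-ooxml jars are on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical helper class, not part of Apache POI.
public class HssfXssfSketch {

    // Puts one text cell into any Workbook implementation and serializes it.
    static byte[] write(Workbook workbook, String text) throws IOException {
        try (Workbook wb = workbook;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = wb.createSheet("Demo");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue(text);
            wb.write(out);
            return out.toByteArray();
        }
    }

    // Returns the text stored in the first cell of the first sheet.
    static String readFirstCell(Workbook workbook) throws IOException {
        try (Workbook wb = workbook) {
            return wb.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        }
    }

    // Round trip through the binary .xls format (HSSF).
    static String roundTripXls(String text) {
        try {
            byte[] bytes = write(new HSSFWorkbook(), text);
            return readFirstCell(new HSSFWorkbook(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Round trip through the OOXML .xlsx format (XSSF).
    static String roundTripXlsx(String text) {
        try {
            byte[] bytes = write(new XSSFWorkbook(), text);
            return readFirstCell(new XSSFWorkbook(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the constructor calls differ between the two formats. The code that fills and reads the sheet is written once against the common interfaces, so switching from xls to xlsx does not require changing it.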
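Apache POI runtime dependencies: If you are working on a Maven project, you can include the POI dependencies in the pom.xml file. The artifacts are published under the group ID org.apache.poi: poi covers the OLE2 formats such as XLS, and poi-ooxml covers the OOXML formats such as XLSX.
To add this in Eclipse, go to: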
Window -> Show View -> Other -> Maven -> Maven Repositories
If you are not using Maven, you can download the POI jar files from the POI download page. Include at least the following jar files to run the sample code:
This article is contributed by Pankaj Kumar.