Extract Data From PDF File in Android using Kotlin

Last Updated : 19 Jun, 2022

PDF (Portable Document Format) is a file format used to present documents that contain text, images, tables, and more. PDF is now widely used across many fields, and many apps rely on PDF files to represent data. Some of these apps therefore need to extract the data from a PDF file and display it inside the app. In this article, we will build an Android application in Kotlin that extracts the text from a PDF file and displays it on screen.

Note: If you want to extract data from PDF files in Android using Java instead, check out the following article: How to extract data from PDF in Android using Java

Step by Step Implementation

Step 1: Create a New Project in Android Studio

To create a new project in Android Studio, please refer to How to Create/Start a New Project in Android Studio. Make sure to select Kotlin as the programming language.

Step 2: Add dependency to the build.gradle(Module:app)

Navigate to Gradle Scripts > build.gradle(Module:app) and add the dependency below to the dependencies section.

implementation 'com.itextpdf:itextg:5.5.10'

Now sync your project so that Gradle downloads the library.
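If the module uses the Gradle Kotlin DSL (a build.gradle.kts file rather than build.gradle), the same dependency would be written with function-call syntax. A minimal sketch, assuming the same artifact coordinates as above:

```kotlin
dependencies {
    // Same iText-for-Android artifact as above, written in Kotlin DSL syntax.
    implementation("com.itextpdf:itextg:5.5.10")
}
```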

Step 3: Adding a PDF file to your app

Since we are extracting data from a PDF file, we first need to add one to the app. To do so, create a raw folder; please refer to Resource Raw Folder in Android Studio for the steps. After creating the raw directory, copy your PDF file into it. The code in this article expects the file to be named android.pdf. With the PDF in place, we can move on to the XML layout.
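A file placed in the raw folder becomes a resource, and the Android build tools reject file-based resource names that contain anything other than lowercase letters, digits, and underscores. The sketch below is a conservative check of that rule (it also requires the name to start with a letter); `isValidRawResourceName` is a hypothetical helper written for illustration, not part of any Android API.

```kotlin
// Hypothetical helper: conservatively checks a file name against the
// naming rule for file-based resources (lowercase a-z, 0-9, underscore).
fun isValidRawResourceName(fileName: String): Boolean {
    // Drop the extension; only the base name becomes the resource name.
    val baseName = fileName.substringBeforeLast('.')
    return Regex("[a-z][a-z0-9_]*").matches(baseName)
}

fun main() {
    println(isValidRawResourceName("android.pdf"))   // prints "true"
    println(isValidRawResourceName("My-Report.pdf")) // prints "false"
}
```

A file that passes, such as android.pdf, is then exposed to code as R.raw.android.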

Step 4: Working with the activity_main.xml file

Navigate to app > res > layout > activity_main.xml and add the code below to that file. Comments are included in the code to explain it in more detail.

XML




<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout 
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:id="@+id/container"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">
  
    <!--heading text for our app-->
    <TextView
        android:id="@+id/idTVExtracter"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_margin="8dp"
        android:gravity="center"
        android:padding="4dp"
        android:text="Text Extractor from PDF"
        android:textAlignment="center"
        android:textColor="@color/purple_200"
        android:textSize="18sp"
        android:textStyle="bold" />
  
    <!--scroll view that holds the extracted text-->
    <ScrollView
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:layout_above="@id/idBtnExtract"
        android:layout_below="@id/idTVExtracter">
  
        <!--text view for displaying our extracted text;
            wrap_content lets it grow and scroll with the text-->
        <TextView
            android:id="@+id/idTVPDF"
            android:layout_width="match_parent"
            android:layout_height="wrap_content" />
  
    </ScrollView>
  
    <!--button for starting extraction process-->
    <Button
        android:id="@+id/idBtnExtract"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_alignParentBottom="true"
        android:layout_centerHorizontal="true"
        android:layout_marginStart="20dp"
        android:layout_marginEnd="20dp"
        android:layout_marginBottom="20dp"
        android:text="Extract Text from PDF"
        android:textAllCaps="false" />
  
</RelativeLayout>


Step 5: Working with the MainActivity.kt file

Go to the MainActivity.kt file and add the code below. Comments are included in the code to explain it in more detail.

Kotlin




package com.gtappdevelopers.kotlingfgproject
  
import android.os.Bundle
import android.widget.Button
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity
import com.itextpdf.text.pdf.PdfReader
import com.itextpdf.text.pdf.parser.PdfTextExtractor
  
class MainActivity : AppCompatActivity() {
  
    // Views for displaying the extracted text
    // and for triggering the extraction.
    lateinit var extractedTV: TextView
    lateinit var extractBtn: Button
  
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
          
        // Look up the text view and button by their ids.
        extractedTV = findViewById(R.id.idTVPDF)
        extractBtn = findViewById(R.id.idBtnExtract)
  
        // Extract the PDF text and display it
        // whenever the button is clicked.
        extractBtn.setOnClickListener {
            extractData()
        }
    }
  
    // Extracts the text of every page in the PDF file
    // and displays it in the text view.
    private fun extractData() {
        // Opening or parsing the file can throw,
        // so the whole operation runs inside a try block.
        try {
            // Open the PDF file that was added to the raw folder.
            val pdfReader = PdfReader("res/raw/android.pdf")
            try {
                // Accumulates the text of all pages.
                val extractedText = StringBuilder()

                // PDF pages are numbered from 1, not 0.
                for (page in 1..pdfReader.numberOfPages) {
                    extractedText
                        .append(PdfTextExtractor.getTextFromPage(pdfReader, page).trim())
                        .append("\n")
                }

                // Show the extracted text in the text view.
                extractedTV.text = extractedText.toString()
            } finally {
                // Close the reader even if extraction fails.
                pdfReader.close()
            }
        } catch (e: Exception) {
            // Log the failure; the text view is left unchanged.
            e.printStackTrace()
        }
    }
}
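The page loop can also be separated from iText so that it is easy to test on its own. The sketch below is one way to do that, not the method used in the listing above: `collectPageText` is a hypothetical helper that takes the page count and a function returning the text of a given page, and the map of fake pages is a stand-in for a real PDF.

```kotlin
// Hypothetical helper: gathers the text of pages 1..pageCount (PDF pages
// are numbered from 1) using whatever per-page extractor is passed in.
fun collectPageText(pageCount: Int, textOfPage: (Int) -> String): String {
    val builder = StringBuilder()
    for (page in 1..pageCount) {
        // Trim each page and end it with a newline.
        builder.append(textOfPage(page).trim()).append('\n')
    }
    return builder.toString()
}

fun main() {
    // Stand-in for a real PDF: a map from page number to page text.
    val fakePages = mapOf(1 to "  First page ", 2 to "Second page")
    print(collectPageText(fakePages.size) { page -> fakePages.getValue(page) })
}
```

With iText, the same helper could be called as `collectPageText(pdfReader.numberOfPages) { PdfTextExtractor.getTextFromPage(pdfReader, it) }`.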


Now run your application. Tapping the Extract Text from PDF button displays the text extracted from the PDF file in the scrollable text view.


