How to Read a PDF and Write Its Text to a Text File in Java

Copy PDF text and paste it into a text file in Java

Extract text from an existing PDF document to a text file in Java

In this article, we will see how to create a new text file and extract the text of a PDF document into it.

We will use Apache PDFBox to extract the text from the PDF. To use Apache PDFBox, we can either create a Maven project and include the dependency, or create a Dynamic Web Project and add the PDFBox JAR file. In this article we will use a Dynamic Web Project.

Step 1 : Create a new Dynamic Web Project in Eclipse

Go to File -> New -> Dynamic Web Project

Then create a Java class in the project.

Step 2 : Add the PDFBox JAR file to the project

Click the link below and download the JAR file.

To include the JAR in the project, follow these steps :

  1. Right-click the project -> Build Path -> Configure Build Path
  2. Go to the Libraries tab -> click the Add External JARs button and select the Apache PDFBox JAR.
  3. Click the Apply and Close button.

Now everything is set for extracting the PDF text and storing it in a text file.
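
Before moving on, it can help to confirm that the JAR really ended up on the classpath. The following is a minimal sketch, not part of the original setup: the class name ClasspathCheck and the method isOnClasspath are hypothetical helpers introduced only for illustration. It tries to load the PDFBox class that the extraction code imports. Note that PDFBox 2.x also relies on the FontBox and Commons Logging JARs at run time, so those may need to be added to the Build Path as well.

```java
// Hypothetical helper, not part of the original article.
class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // PDDocument is the PDFBox class imported by the extraction code.
        String pdfBoxClass = "org.apache.pdfbox.pdmodel.PDDocument";
        if (isOnClasspath(pdfBoxClass)) {
            System.out.println("PDFBox found on the classpath");
        } else {
            System.out.println("PDFBox not found: check the Build Path settings");
        }
    }
}
```

If the second message appears, repeat the Build Path steps above and make sure the selected JAR is listed under the Libraries tab.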

Step 3 : Java code to extract PDF text to a text file

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;


public class FileReadWrite {

    public static void main(String[] args) {
        // Folder that holds the input PDF and receives the output text file
        String filePath = "D:\\JavaFileDemo\\";

        // The PDF file that you want to extract text from
        File input = new File(filePath + "input.pdf");

        // The text file where the extracted text is stored
        File output = new File(filePath + "output.txt");

        // try-with-resources closes the document and the writer even if extraction fails.
        // PDDocument.load(File) is the PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF(File) instead.
        // The writer uses UTF-8 explicitly so the result does not depend on the platform default.
        try (PDDocument pd = PDDocument.load(input);
             BufferedWriter wr = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(output), StandardCharsets.UTF_8))) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Writes the text of every page of the document to the writer
            stripper.writeText(pd, wr);
            System.out.println("Successfully extracted : PDF to text file");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

If the above code compiles and runs successfully, the message "Successfully extracted : PDF to text file" is printed.

Go to the folder you set in filePath and check that output.txt has been created and contains the text of the PDF.
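
The same check can be done from Java instead of opening the folder by hand. This is a minimal sketch using only the standard library: the class OutputPreview and the method firstLines are hypothetical names introduced for illustration, and the code assumes the output file is UTF-8 encoded (pass a different charset to Files.lines if yours is not).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original article.
class OutputPreview {

    // Returns up to maxLines lines from the start of a text file.
    // Assumption: the file is encoded as UTF-8.
    static List<String> firstLines(Path file, int maxLines) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.limit(maxLines).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Same location as in the example above; adjust it for your machine.
        Path output = Paths.get("D:\\JavaFileDemo", "output.txt");
        if (!Files.exists(output)) {
            System.out.println("Not found: " + output);
            return;
        }
        System.out.println("Size in bytes: " + Files.size(output));
        // Print the first few extracted lines as a quick sanity check.
        firstLines(output, 5).forEach(System.out::println);
    }
}
```

An empty or very small file usually means the PDF holds scanned images rather than text, which a text stripper cannot read.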

