How to Read a PDF and Write Its Text to a Text File in Java

Copy PDF text and paste it into a text file in Java

Extract text from an existing PDF document to a text file in Java

In this article, we will see how to create a new text file and extract the text of a PDF document into it.

We will use Apache PDFBox to extract the text. There are two ways to add PDFBox to a project: create a Maven project and include the dependency, or create a Dynamic Web Project and add the PDFBox JAR file to its build path. This article uses a Dynamic Web Project.

Step 1 : Create a new Dynamic Web Project in Eclipse

Go to File -> New -> Dynamic Web Project

Then create a Java class in the project. The example below names it FileReadWrite.

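Step 2 : Add the PDFBox JAR file to the project

Download the Apache PDFBox JAR file from the Apache PDFBox project's download page. Note that the plain pdfbox JAR also needs its own dependencies (fontbox and commons-logging) on the build path, while the standalone pdfbox-app JAR bundles them.

To include the JAR in the project, follow these steps :

  1. Right-click the project -> Build Path -> Configure Build Path
  2. Go to the Libraries tab -> Click the Add External JARs button. Select the Apache PDFBox JAR.
  3. Click the Apply and Close button.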

Everything is now set up to extract text from a PDF and store it in a text file.
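Before writing the extraction code, it can help to confirm that Eclipse really picked up the JAR. The following is a minimal sketch, not part of the original example: the class name ClasspathCheck and the method isOnClasspath are hypothetical helpers, and the check only assumes that the PDFBox class org.apache.pdfbox.pdmodel.PDDocument (the one imported in Step 3) should be loadable once the build path is configured.

```java
// Hypothetical helper, not part of the original example.
class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or something it depends on) is missing from the classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // The PDFBox class that the extraction code in Step 3 imports
        String pdfBoxClass = "org.apache.pdfbox.pdmodel.PDDocument";
        if (isOnClasspath(pdfBoxClass)) {
            System.out.println("PDFBox found on the classpath");
        } else {
            System.out.println("PDFBox NOT found, check Configure Build Path");
        }
    }
}
```

If this reports that PDFBox is missing, repeat the Configure Build Path steps before running the extraction code.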

Step 3 : Java code to extract PDF text to a text file

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class FileReadWrite {

    public static void main(String[] args) {
        // Folder that holds the input PDF and receives the output text file
        String filePath = "D:\\JavaFileDemo\\";

        // The PDF file that you want to extract text from
        File input = new File(filePath + "input.pdf");

        // The text file where the extracted text is stored
        File output = new File(filePath + "output.txt");

        // try-with-resources closes the document and the writer even if extraction fails.
        // PDDocument.load(File) is the PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF(input) instead.
        // The output is written as UTF-8 rather than the platform default encoding.
        try (PDDocument pd = PDDocument.load(input);
             BufferedWriter wr = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(output), StandardCharsets.UTF_8))) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Extract the text of every page and write it to the output file
            stripper.writeText(pd, wr);
        } catch (IOException e) {
            // Loading, extracting, writing or closing failed
            e.printStackTrace();
            return;
        }
        System.out.println("Successfully extracted : PDF to text file");
    }
}

If the code above compiles and runs successfully, it prints the message "Successfully extracted : PDF to text file".

Go to the folder set in filePath and check that output.txt was created and contains the text extracted from the PDF.
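The same check can also be done from code instead of opening the folder by hand. This is a minimal sketch under stated assumptions: the class OutputCheck and the method isNonEmptyFile are hypothetical names introduced here, and the default path simply mirrors the folder and file name used in the example above. It only confirms that the file exists and is not empty; it does not verify that the text matches the PDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original example.
class OutputCheck {

    // Returns true if the path points to an existing regular file with at least one byte.
    static boolean isNonEmptyFile(Path file) throws IOException {
        return Files.isRegularFile(file) && Files.size(file) > 0;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: the same folder and file name as in the extraction example
        Path output = args.length > 0
                ? Paths.get(args[0])
                : Paths.get("D:\\JavaFileDemo", "output.txt");

        if (isNonEmptyFile(output)) {
            System.out.println("Found " + output + " (" + Files.size(output) + " bytes)");
        } else {
            System.out.println("Missing or empty: " + output);
        }
    }
}
```

An empty output file usually means the PDF contains no extractable text layer, for example a scanned document that holds only images.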
