File APIs for Word/Excel/PowerPoint/PDF

Showing posts from October, 2019

Split PDF File in Java

In this article, I will introduce two methods to split a PDF file in a Java application:

1. Split a PDF into single-page PDF files
2. Split a PDF into multiple PDF files by page range

The examples below use the Free Spire.PDF for Java library.

Imported namespaces:

    import com.spire.pdf.PdfDocument;
    import com.spire.pdf.PdfPageBase;
    import com.spire.pdf.graphics.PdfMargins;
    import java.awt.geom.Point2D;

Split a PDF into single-page PDF files

    //load the PDF file
    PdfDocument doc = new PdfDocument();
    doc.loadFromFile("sample.pdf");
    //split every page of the PDF into a separate file
    doc.split("Split/splitDocument-{0}.pdf", 0);
    doc.close();

Split a PDF into multiple PDF files by page range

    //load the PDF file
    PdfDocument doc = new PdfDocument();
    doc.loadFromFile("sample.pdf");
    //create a new PDF file
    PdfDocument newDoc1 = new PdfDocument();
    PdfPageBase page;
    //add 2 pages to the new PDF, and draw the content of page 1...
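Before running a split, it can help to see which output files and page ranges it would produce. The sketch below uses only the standard library and does not call Spire.PDF. The class SplitPlan and its two methods are hypothetical helpers written for this illustration, and it assumes that the "{0}" placeholder in the pattern is replaced by a running number that starts at the second argument of doc.split.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plans output names and page ranges without touching any PDF library.
class SplitPlan {

    // Expands the "{0}" placeholder into one file name per page, numbering from startNumber.
    // Assumption: this mirrors how the pattern passed to doc.split(...) is expanded.
    static List<String> fileNames(String pattern, int startNumber, int pageCount) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            names.add(pattern.replace("{0}", Integer.toString(startNumber + i)));
        }
        return names;
    }

    // Groups zero-based page indices into consecutive ranges of at most pagesPerFile pages.
    // Each int[] holds {firstPage, lastPage}, both inclusive.
    static List<int[]> pageRanges(int pageCount, int pagesPerFile) {
        if (pagesPerFile <= 0) {
            throw new IllegalArgumentException("pagesPerFile must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 0; first < pageCount; first += pagesPerFile) {
            int last = Math.min(first + pagesPerFile, pageCount) - 1;
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Names for a three-page document, numbered from 0.
        System.out.println(fileNames("Split/splitDocument-{0}.pdf", 0, 3));
        // Ranges for a five-page document split into files of two pages each.
        for (int[] range : pageRanges(5, 2)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

Each range can then drive a loop that creates one new document and copies the pages in that range into it, which is the approach the page-range example starts to show.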

Extract Text From PDF in Java

In this article, we explain how to extract text from a PDF file in Java.

An overview of the content:

1. Extract all text from a PDF
2. Read/extract text from a specific rectangle area in a PDF page
3. Read/extract text using SimpleTextExtractionStrategy

The PDF library we need: Spire.PDF for Java

The example PDF file: Additional.pdf (the file name loaded in the code below)

Sample Code

Imported namespaces:

    import com.spire.pdf.*;
    import com.spire.pdf.exporting.text.SimpleTextExtractionStrategy;
    import java.awt.geom.Rectangle2D;
    import java.io.*;

Read/Extract All Text from a PDF

    //Instantiate a PdfDocument object
    PdfDocument pdf = new PdfDocument();
    //Load the PDF file
    pdf.loadFromFile("Additional.pdf");
    StringBuilder sb = new StringBuilder();
    //Extract text from every page of the PDF
    for (PdfPageBase page : (Iterable<PdfPageBase>) pdf.getPages()) {
        sb.append(page.extractText(true));
    }
    try {
        //Write the text into a .txt file
        FileWrite...
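The final step above, writing the collected text to a .txt file, does not depend on the PDF library. The sketch below shows one way to do it with the standard library only. The class ExtractedTextWriter, its methods, and the sample page strings are hypothetical stand-ins for this illustration; in the real program the strings would come from page.extractText(true). It is not a reconstruction of the truncated code in the post.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Illustration only: joins per-page text and writes it to a text file.
class ExtractedTextWriter {

    // Appends each page's text to a StringBuilder, ending every page with a line separator.
    static String joinPages(List<String> pageTexts) {
        StringBuilder sb = new StringBuilder();
        for (String pageText : pageTexts) {
            sb.append(pageText).append(System.lineSeparator());
        }
        return sb.toString();
    }

    // Writes the text as UTF-8; try-with-resources closes the writer even if the write fails.
    static void writeText(Path target, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical page contents standing in for extracted PDF text.
        List<String> pages = Arrays.asList("Text of page one", "Text of page two");
        Path target = Files.createTempFile("ExtractedText", ".txt");
        writeText(target, joinPages(pages));
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Passing an explicit charset avoids depending on the platform default encoding, which matters when the extracted text contains non-ASCII characters.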