
Convert PDF to PDF/A or PDF/A to PDF in Java: Complete Guide

When working with enterprise documents, government archives, or files that need long-term preservation, you've probably heard of "PDF/A." Many people know about PDF, but aren't familiar with PDF/A—what exactly is it? Why convert to it? And how do you implement it in Java?

Let's work through these questions and share some practical development experience, with code examples along the way.


Understanding PDF/A

PDF/A (Portable Document Format/Archive) is an archival version of PDF specifically designed for long-term preservation of electronic documents. Compared to regular PDF, it has these characteristics:

PDF/A Restrictions:

  • ❌ No external dependencies: fonts and other resources must be embedded in the file
  • ❌ No JavaScript, audio, video, or other dynamic content
  • ❌ No encryption (not permitted at any conformance level)
  • ❌ No undefined color: colors must be device-independent, or device colors must be backed by an embedded output intent (ICC profile)

PDF/A Advantages:

  • ✅ Ensures documents display correctly even decades later
  • ✅ Self-contained, no external resource dependencies
  • ✅ Complies with ISO standards (ISO 19005)
  • ✅ Widely adopted by governments, courts, and archives

Common PDF/A Standards:

  • PDF/A-1 (2005): The earliest standard, based on PDF 1.4
  • PDF/A-2 (2011): Supports transparency effects and JPEG 2000 compression
  • PDF/A-3 (2012): Allows embedding arbitrary file formats (XML, CSV, etc.)

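Each standard defines conformance levels. Levels A and B exist in all three parts:

  • Level A (Accessible): Preserves structural information (tagging) for accessibility
  • Level B (Basic): Guarantees consistent visual rendering only

PDF/A-2 and PDF/A-3 add Level U (Unicode), which is Level B plus the requirement that all text maps to Unicode. The examples in this article use Levels A and B only.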
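To make the part/level combinations concrete, here is a small sketch in plain Java that parses a flavour string and rejects combinations the standard does not define (Level U exists only for PDF/A-2 and PDF/A-3). PdfAFlavour is a hypothetical helper written for this article, not a class from Spire.PDF or any other library.

```java
import java.util.Locale;

/** Hypothetical helper (not a library class): a PDF/A part plus conformance level. */
final class PdfAFlavour {
    final int part;    // 1, 2 or 3
    final char level;  // 'A', 'B' or 'U'

    private PdfAFlavour(int part, char level) {
        this.part = part;
        this.level = level;
    }

    /** Parses strings such as "1B", "2a" or "PDF/A-3U". */
    static PdfAFlavour parse(String text) {
        String s = text.trim().toUpperCase(Locale.ROOT);
        if (s.startsWith("PDF/A-")) {
            s = s.substring("PDF/A-".length());
        }
        if (s.length() != 2) {
            throw new IllegalArgumentException("Expected something like 2B, got: " + text);
        }
        int part = s.charAt(0) - '0';
        char level = s.charAt(1);
        boolean partOk = part >= 1 && part <= 3;
        // Level U was introduced with PDF/A-2, so "1U" is not a valid flavour.
        boolean levelOk = level == 'A' || level == 'B' || (level == 'U' && part >= 2);
        if (!partOk || !levelOk) {
            throw new IllegalArgumentException("Not a PDF/A-1/2/3 flavour: " + text);
        }
        return new PdfAFlavour(part, level);
    }

    /** True when the flavour requires tagged structure (Level A). */
    boolean requiresTagging() {
        return level == 'A';
    }

    @Override
    public String toString() {
        return "PDF/A-" + part + level;
    }
}
```

Validating the flavour string once, before any file is touched, keeps a typo such as "1U" from surfacing halfway through a long batch run.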

Environment Setup

Maven Dependency

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>12.6.1</version>
    </dependency>
</dependencies>

Gradle Configuration

repositories {
    maven {
        url 'https://repo.e-iceblue.com/nexus/content/groups/public/'
    }
}

dependencies {
    implementation 'e-iceblue:spire.pdf:12.6.1'
}

1. Basic Conversion: PDF to Various PDF/A Formats

The most straightforward approach—using the PdfStandardsConverter class:

import com.spire.pdf.conversion.PdfStandardsConverter;

public class BasicPdfToPdfA {
    public static void main(String[] args) {
        // Create converter instance
        PdfStandardsConverter converter = new PdfStandardsConverter("sample.pdf");

        // Convert to different PDF/A levels
        converter.toPdfA1A("output/PdfA1A.pdf");   // PDF/A-1A
        converter.toPdfA1B("output/PdfA1B.pdf");   // PDF/A-1B
        converter.toPdfA2A("output/PdfA2A.pdf");   // PDF/A-2A
        converter.toPdfA2B("output/PdfA2B.pdf");   // PDF/A-2B
        converter.toPdfA3A("output/PdfA3A.pdf");   // PDF/A-3A
        converter.toPdfA3B("output/PdfA3B.pdf");   // PDF/A-3B

        System.out.println("Conversion complete!");
    }
}

That's it! One line of code completes each format conversion.

How to Choose PDF/A Level?

Format    | Use Case            | Characteristics
PDF/A-1B  | General archiving   | Best compatibility, most conservative
PDF/A-2B  | Modern documents    | Supports transparency and layers
PDF/A-3B  | Data embedding      | Can embed XML, Excel, and other attachments
Level A   | Accessibility needs | Preserves tag structure for disabled users
Level B   | General purpose     | Only guarantees visual consistency

Practical Recommendations:

  • Government/legal documents → PDF/A-1B (most conservative, widest tool support)
  • Enterprise internal archiving → PDF/A-2B (balance of compatibility and features)
  • Need to embed data → PDF/A-3B (highest flexibility)
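The recommendations above can be turned into a tiny decision helper. This is only a sketch of one possible rule set: the class name PdfALevelChooser, the method recommend, and its three flags are all made up for illustration, and your own archive policy may weigh things differently.

```java
/** Hypothetical decision helper that mirrors the recommendations above. */
final class PdfALevelChooser {

    /**
     * @param needsAttachments  source files (XML, spreadsheets) must travel inside the PDF
     * @param needsTransparency the documents rely on transparency or layers
     * @param needsTagging      tagged structure must be kept for accessibility
     * @return a level string such as "2B"
     */
    static String recommend(boolean needsAttachments, boolean needsTransparency, boolean needsTagging) {
        int part;
        if (needsAttachments) {
            part = 3;          // only PDF/A-3 allows arbitrary embedded files
        } else if (needsTransparency) {
            part = 2;          // transparency and layers arrived with PDF/A-2
        } else {
            part = 1;          // most conservative choice
        }
        return part + (needsTagging ? "A" : "B");
    }
}
```

The returned string has the same shape ("1B", "2A", and so on) as the level argument used by the batch tool later in this article.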

2. Handling Encrypted PDFs

If the source PDF is password-protected, pass the password when constructing the converter so the file can be opened for conversion:

import com.spire.pdf.conversion.PdfStandardsConverter;

public class EncryptedPdfToPdfA {
    public static void main(String[] args) {
        String inputFile = "data/encrypted.pdf";
        String password = "your_password";

        // Pass password when creating converter
        PdfStandardsConverter converter = new PdfStandardsConverter(inputFile, password);

        // Convert to PDF/A-2A
        converter.toPdfA2A("output/decrypted_pdfa.pdf");

        System.out.println("Encrypted PDF conversion complete!");
    }
}

Important Notes:

  • The converted PDF/A file is no longer encrypted (PDF/A standard restriction)
  • If you need to re-encrypt after archiving, handle it separately
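In a batch job you may not know in advance which inputs need a password. A rough pre-check, sketched here with only the JDK, looks for the /Encrypt key that an encrypted PDF references from its trailer. EncryptionSniffer and looksEncrypted are hypothetical names, and the check is a heuristic: it reads the whole file into memory and could report a false positive if the literal text "/Encrypt" appears elsewhere in an unencrypted file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical pre-check; a heuristic, not a PDF parser. */
final class EncryptionSniffer {

    /** Returns true when the raw bytes contain an /Encrypt key. */
    static boolean looksEncrypted(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data survives decoding.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/Encrypt");
    }

    /** Convenience overload that loads the file first. */
    static boolean looksEncrypted(Path pdf) throws IOException {
        return looksEncrypted(Files.readAllBytes(pdf));
    }
}
```

Files flagged by the check can be routed to the password-aware path, while everything else goes through the plain constructor.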

3. Preserving Metadata

By default, conversion might lose some metadata. To preserve it, configure as follows:

import com.spire.pdf.conversion.PdfStandardsConverter;

public class PdfToPdfAWithMetadata {
    public static void main(String[] args) {
        String input = "data/document_with_metadata.pdf";
        String output = "output/pdfa_with_metadata.pdf";

        PdfStandardsConverter converter = new PdfStandardsConverter(input);

        // Key setting: preserve allowed metadata
        converter.getOptions().setPreserveAllowedMetadata(true);

        // Execute conversion
        converter.toPdfA1A(output);

        System.out.println("Conversion complete, metadata preserved!");
    }
}

Which Metadata Is Preserved?

  • ✅ Title, author, subject, keywords
  • ✅ Creation date, modification date
  • ✅ PDF/A compliance information
  • ❌ Some custom properties (if they don't comply with PDF/A standards)

4. Converting PDF/A Back to Regular PDF

Sometimes you need the reverse operation—converting PDF/A back to regular PDF (e.g., to add interactive features):

import com.spire.pdf.*;
import com.spire.pdf.graphics.PdfMargins;
import java.awt.geom.Dimension2D;

public class PdfAToPdf {
    public static void main(String[] args) {
        String input = "data/sample_pdfa.pdf";
        String output = "output/regular_pdf.pdf";

        // Load PDF/A file
        PdfDocument doc = new PdfDocument();
        doc.loadFromFile(input);

        // Create new document (non-PDF/A)
        PdfNewDocument newDoc = new PdfNewDocument();
        newDoc.setCompressionLevel(PdfCompressionLevel.None);

        // Copy content page by page
        for (PdfPageBase page : (Iterable<PdfPageBase>) doc.getPages()) {
            Dimension2D size = page.getSize();
            PdfPageBase p = newDoc.getPages().add(size, new PdfMargins(0));

            // Draw page content using template
            page.createTemplate().draw(p, 0, 0);
        }

        // Save as regular PDF
        newDoc.save(output);

        // Release resources
        newDoc.close();
        newDoc.dispose();
        doc.close();
        doc.dispose();

        System.out.println("PDF/A to PDF conversion complete!");
    }
}

Core Approach:

  1. Load the PDF/A document
  2. Create a new regular PDF document
  3. Copy content page by page (via templates)
  4. Save as new file

Use Cases:

  • Need to add JavaScript interactivity
  • Want to embed multimedia content
  • Remove PDF/A restrictions for editing

5. Creating PDF/A with Attachments

PDF/A-3 allows embedding arbitrary files as attachments, which is very useful in archiving scenarios:

import com.spire.pdf.*;
import com.spire.pdf.attachments.PdfAttachment;
import com.spire.pdf.graphics.PdfMargins;
import java.awt.geom.Dimension2D;
import java.io.*;

public class PdfAWithAttachments {
    public static void main(String[] args) throws IOException {
        String input = "data/report.pdf";
        String output = "output/report_with_attachments.pdf"; // PDF/A files keep the .pdf extension

        // Load source PDF
        PdfDocument doc = new PdfDocument();
        doc.loadFromFile(input);

        // Create PDF/A-3B document
        PdfNewDocument newDoc = new PdfNewDocument();
        newDoc.setConformance(PdfConformanceLevel.Pdf_A_3_B);

        // Copy page content
        for (PdfPageBase page : (Iterable<PdfPageBase>) doc.getPages()) {
            Dimension2D size = page.getSize();
            PdfPageBase p = newDoc.getPages().add(size, new PdfMargins(0));
            page.createTemplate().draw(p, 0, 0);
        }

        // Read attachment data
        byte[] excelData = readBytesFromFile("data/raw_data.xlsx");
        byte[] xmlData = readBytesFromFile("data/metadata.xml");

        // Create attachment objects
        PdfAttachment attach1 = new PdfAttachment("raw_data.xlsx", excelData);
        PdfAttachment attach2 = new PdfAttachment("metadata.xml", xmlData);

        // Add attachments
        newDoc.getAttachments().add(attach1);
        newDoc.getAttachments().add(attach2);

        // Save
        newDoc.save(output, FileFormat.PDF);

        // Release resources
        doc.close();
        doc.dispose();
        newDoc.close();
        newDoc.dispose();

        System.out.println("PDF/A-3B created with 2 attachments!");
    }

    private static byte[] readBytesFromFile(String filePath) throws IOException {
        // Files.readAllBytes reads the whole file and closes it, unlike a single
        // read() sized by available(), which may return fewer bytes than the file holds
        return java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(filePath));
    }
}

Typical Use Cases:

  • Financial reports + raw Excel data
  • Academic papers + research datasets
  • Contract documents + signing record XML
  • Technical documentation + source code packages

6. Practical Example: Batch Conversion Tool

In real projects, you often need to process files in batch. Here's a complete utility class:

import com.spire.pdf.conversion.PdfStandardsConverter;
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class BatchPdfToPdfAConverter {

    /**
     * Batch convert all PDFs in a folder to PDF/A
     * 
     * @param inputDir Input folder path
     * @param outputDir Output folder path
     * @param pdfALevel PDF/A level (e.g., "1B", "2B", "3B")
     */
    public static void batchConvert(String inputDir, String outputDir, String pdfALevel) {
        File dir = new File(inputDir);

        if (!dir.exists() || !dir.isDirectory()) {
            System.err.println("Error: Input directory does not exist - " + inputDir);
            return;
        }

        // Create output directory
        new File(outputDir).mkdirs();

        // Get all PDF files
        File[] pdfFiles = dir.listFiles((d, name) -> 
            name.toLowerCase().endsWith(".pdf") && !name.toLowerCase().contains("pdfa")
        );

        if (pdfFiles == null || pdfFiles.length == 0) {
            System.out.println("No PDF files found");
            return;
        }

        int successCount = 0;
        int failCount = 0;
        List<String> errors = new ArrayList<>();

        System.out.println("Starting batch conversion, total " + pdfFiles.length + " files...\n");

        for (File pdfFile : pdfFiles) {
            try {
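                // The filter above guarantees a 4-character ".pdf" suffix in any letter case;
                // cutting it by length also handles "X.PDF", which replace(".pdf", ...) would miss
                String name = pdfFile.getName();
                String outputFileName = name.substring(0, name.length() - 4) + "_PDFA-" + pdfALevel + ".pdf";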
                String outputPath = outputDir + File.separator + outputFileName;

                PdfStandardsConverter converter = new PdfStandardsConverter(pdfFile.getAbsolutePath());

                // Convert according to specified level
                switch (pdfALevel.toUpperCase()) {
                    case "1A":
                        converter.toPdfA1A(outputPath);
                        break;
                    case "1B":
                        converter.toPdfA1B(outputPath);
                        break;
                    case "2A":
                        converter.toPdfA2A(outputPath);
                        break;
                    case "2B":
                        converter.toPdfA2B(outputPath);
                        break;
                    case "3A":
                        converter.toPdfA3A(outputPath);
                        break;
                    case "3B":
                        converter.toPdfA3B(outputPath);
                        break;
                    default:
                        throw new IllegalArgumentException("Unsupported PDF/A level: " + pdfALevel);
                }

                successCount++;
                System.out.println("✓ " + pdfFile.getName() + " -> " + outputFileName);

            } catch (Exception e) {
                failCount++;
                String errorMsg = pdfFile.getName() + ": " + e.getMessage();
                errors.add(errorMsg);
                System.err.println("✗ " + errorMsg);
            }
        }

        // Output statistics
        System.out.println("\n========== Conversion Complete ==========");
        System.out.println("Successful: " + successCount);
        System.out.println("Failed: " + failCount);

        if (!errors.isEmpty()) {
            System.out.println("\nError Details:");
            for (String error : errors) {
                System.out.println("  - " + error);
            }
        }
    }

    public static void main(String[] args) {
        // Batch convert to PDF/A-2B
        batchConvert("input/pdfs", "output/pdfa", "2B");
    }
}

Features:

  • ✅ Automatically scans all PDFs in folder
  • ✅ Supports all PDF/A levels
  • ✅ Detailed progress feedback and error reporting
  • ✅ Skips already converted files (those whose filename contains "pdfa")
  • ✅ Automatically creates output directory
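The naming and skip rules are easy to get subtly wrong, so here is a sketch of them as two small, testable functions. PdfANaming, outputNameFor, and isAlreadyConverted are hypothetical names. This version matches only the "_PDFA-<level>.pdf" suffix rather than any occurrence of "pdfa", which is a stricter rule than the one in the tool above.

```java
import java.util.Locale;

/** Hypothetical naming helper for the batch tool; JDK only. */
final class PdfANaming {

    /** "Report.PDF" + "2b" becomes "Report_PDFA-2B.pdf"; names without a .pdf suffix are kept whole. */
    static String outputNameFor(String inputName, String level) {
        // regionMatches with ignoreCase=true checks the suffix without changing the name's case.
        boolean hasPdfSuffix = inputName.regionMatches(true, inputName.length() - 4, ".pdf", 0, 4);
        String base = hasPdfSuffix ? inputName.substring(0, inputName.length() - 4) : inputName;
        return base + "_PDFA-" + level.toUpperCase(Locale.ROOT) + ".pdf";
    }

    /** Matches only the suffix this tool writes, so "pdfa_spec_notes.pdf" is still converted. */
    static boolean isAlreadyConverted(String fileName) {
        return fileName.toUpperCase(Locale.ROOT).matches(".*_PDFA-[123][ABU]\\.PDF");
    }
}
```

Keeping these rules in one place means the file filter and the output naming cannot drift apart.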

7. Common Issues and Solutions

Issue 1: Conversion Fails Due to Font Problems

Cause: PDF uses fonts that are not embedded.

Solution:

// Spire.PDF automatically handles font embedding
// If still failing, check if source PDF is corrupted
PdfStandardsConverter converter = new PdfStandardsConverter(inputFile);
converter.getOptions().setDisableFontSubstitution(false); // Allow font substitution
converter.toPdfA1B(outputFile);

Issue 2: File Size Increases Dramatically After Conversion

Cause: PDF/A requires embedding all fonts and resources.

Optimization Suggestions:

// 1. Compress source PDF before conversion
// 2. Use more efficient compression algorithms
// 3. Remove unnecessary metadata

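// Large files also raise memory pressure during conversion. When converting many
// of them, check the remaining heap headroom between files. freeMemory() alone
// only covers the currently allocated heap, so compare against maxMemory().
Runtime runtime = Runtime.getRuntime();
long used = runtime.totalMemory() - runtime.freeMemory();
long headroom = runtime.maxMemory() - used;
if (headroom < 100L * 1024 * 1024) { // Less than 100 MB left
    System.gc(); // Only a hint; the JVM may ignore it
}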
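Before tuning anything, it helps to measure how much each file actually grew. The sketch below uses only the JDK; SizeReport and its methods are hypothetical names, and the paths are whatever source and output files your conversion produced.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that reports how much a converted file grew. */
final class SizeReport {

    /** Growth of the converted file relative to the source, in percent (negative means it shrank). */
    static double growthPercent(long sourceBytes, long convertedBytes) {
        if (sourceBytes <= 0) {
            throw new IllegalArgumentException("Source size must be positive");
        }
        return (convertedBytes - sourceBytes) * 100.0 / sourceBytes;
    }

    /** Formats one report line, for example "report_PDFA-2B.pdf: +37.5%". */
    static String describe(Path source, Path converted) throws IOException {
        double growth = growthPercent(Files.size(source), Files.size(converted));
        return String.format("%s: %+.1f%%", converted.getFileName(), growth);
    }
}
```

Logging this per file in a batch run makes outliers easy to spot, such as a scan-heavy document whose images were re-encoded.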

Issue 3: How to Verify Generated PDF/A Compliance?

Method 1: Use Validation Tools

  • veraPDF - Open-source PDF/A validator
  • Adobe Acrobat Pro - Built-in validation feature

Method 2: Programmatic Validation (Requires Additional Library)

// Can use the Apache PDFBox preflight module (validates PDF/A-1b only)
// Or run veraPDF from the command line or through its Java library
// Or call third-party API for validation
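Short of full validation, a cheap sanity check is to read which PDF/A flavour the file claims in its XMP metadata (the pdfaid:part and pdfaid:conformance properties). The sketch below does this with a regular expression over the raw bytes. PdfAClaimReader is a hypothetical name, and the approach assumes the XMP packet is stored uncompressed and in an ASCII-compatible encoding, which is the usual case for PDF/A output. It reports only the claim, not whether the file really complies, so it complements a validator such as veraPDF rather than replacing it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads the PDF/A flavour a file declares in its XMP metadata. */
final class PdfAClaimReader {

    // XMP may serialise the property as an attribute (pdfaid:part="2")
    // or as an element (<pdfaid:part>2</pdfaid:part>); accept both forms.
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:=\\s*[\"']|>\\s*)(\\d)");
    private static final Pattern CONFORMANCE =
            Pattern.compile("pdfaid:conformance\\s*(?:=\\s*[\"']|>\\s*)([ABUabu])");

    /** Returns e.g. "PDF/A-2B", or null when no PDF/A identification is found. */
    static String declaredFlavour(byte[] pdfBytes) {
        // ISO-8859-1 keeps a one-to-one byte-to-char mapping for the binary parts of the file.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher part = PART.matcher(raw);
        Matcher conformance = CONFORMANCE.matcher(raw);
        if (!part.find() || !conformance.find()) {
            return null;
        }
        return "PDF/A-" + part.group(1) + conformance.group(1).toUpperCase(Locale.ROOT);
    }
}
```

A batch job can use this to confirm that every output at least declares the level that was requested, then hand a sample of files to a real validator.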

Issue 4: Slow Conversion Speed

Optimization Strategies:

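// 1. Process multiple files in parallel
//    (imports: java.util.concurrent.ExecutorService, Executors, TimeUnit)
//    files and convertSingleFile(File) are placeholders for your own list and method
ExecutorService executor = Executors.newFixedThreadPool(4);
for (File file : files) {
    executor.submit(() -> convertSingleFile(file));
}
executor.shutdown();
// Wait for the queued conversions to finish before reporting results
executor.awaitTermination(1, TimeUnit.HOURS); // throws InterruptedException

// 2. Use SSD storage to improve I/O speed
// 3. Increase JVM heap memory: -Xmx4g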
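A slightly fuller sketch of the parallel strategy, using only the JDK, also collects the files that failed. ParallelBatch and convertAll are hypothetical names, and convertOne stands in for whatever performs a single conversion (for example, creating one converter per file). Whether a given PDF library is safe to use from several threads is something to confirm in its documentation; this sketch assumes each task works on its own objects.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical batch runner; convertOne is a placeholder for a single-file conversion. */
final class ParallelBatch {

    /** Runs convertOne for every file on a fixed pool and returns the files that failed. */
    static List<Path> convertAll(List<Path> files, int threads, Consumer<Path> convertOne)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Path> failed = new ArrayList<>();
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (Path file : files) {
                tasks.add(() -> {
                    convertOne.accept(file);
                    return null;
                });
            }
            // invokeAll blocks until every task has finished and returns futures in input order.
            List<Future<Void>> results = pool.invokeAll(tasks);
            for (int i = 0; i < results.size(); i++) {
                try {
                    results.get(i).get();
                } catch (ExecutionException e) {
                    failed.add(files.get(i)); // the task threw; remember which file it was
                }
            }
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

Returning the failed paths instead of printing inside the tasks keeps the worker code free of shared mutable state and lets the caller decide whether to retry.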

8. Best Practices Summary

1. Choose the Right PDF/A Level

  • Legal/Government Documents → PDF/A-1B (most conservative, best compatibility)
  • Enterprise Internal Archiving → PDF/A-2B (balance of features and compatibility)
  • Research Data Archiving → PDF/A-3B (can embed datasets)
  • Accessibility Requirements → Level A series (preserves structural information)

2. Resource Management

// Always release resources in finally block
PdfStandardsConverter converter = null;
try {
    converter = new PdfStandardsConverter(inputFile);
    converter.toPdfA1B(outputFile);
} finally {
    if (converter != null) {
        converter.dispose();
    }
}

3. Error Handling

try {
    converter.toPdfA1B(outputFile);
} catch (Exception e) {
    // Log detailed error information (logger is your application's logger, e.g. SLF4J)
    logger.error("PDF conversion failed: " + inputFile, e);

    // Provide user-friendly messages; getMessage() can be null, so normalize it first
    String message = String.valueOf(e.getMessage()).toLowerCase();
    if (message.contains("font")) {
        System.err.println("Font issue, please check if source PDF uses special fonts");
    } else if (message.contains("corrupt")) {
        System.err.println("File corrupted, please regenerate source PDF");
    }
}

4. Performance Monitoring

long startTime = System.currentTimeMillis();

// Execute conversion
converter.toPdfA1B(outputFile);

long endTime = System.currentTimeMillis();
System.out.println("Conversion time: " + (endTime - startTime) + " ms");

// Monitor memory usage
Runtime runtime = Runtime.getRuntime();
long usedMemory = (runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024);
System.out.println("Memory usage: " + usedMemory + " MB");
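If you time many conversions, wrapping the measurement in a helper avoids repeating the boilerplate. The sketch below uses System.nanoTime(), which is monotonic and therefore a safer basis for elapsed time than the wall-clock currentTimeMillis(). Stopwatch and timeMillis are hypothetical names, and the Runnable is a placeholder for the conversion call.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical timing helper; the task is a placeholder for one conversion. */
final class Stopwatch {

    /** Runs the task once and returns the elapsed time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime(); // monotonic, unaffected by system clock changes
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

A call might look like `long ms = Stopwatch.timeMillis(() -> converter.toPdfA1B(outputFile));`, with converter and outputFile defined as in the earlier examples.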

9. Comparison with Alternative Solutions

Spire.PDF vs Apache PDFBox

Feature                  | Spire.PDF                        | Apache PDFBox
API Simplicity           | ✅ One-line conversion           | ⚠️ Requires multi-step operations
PDF/A Support            | ✅ Full support for all levels   | ⚠️ Partial support
Learning Curve           | ✅ Low                           | ⚠️ Moderate
License                  | Commercial (free tier available) | Apache 2.0 (free)
Chinese Language Support | ✅ Excellent                     | ⚠️ Requires extra configuration
Technical Support        | ✅ Official support              | Community support

Selection Advice:

  • Sufficient budget, need rapid development → Spire.PDF
  • Open-source project, limited budget → Apache PDFBox
  • Need enterprise-level support → Spire.PDF

Conclusion

Converting PDF to PDF/A is common in real-world projects, especially in scenarios requiring long-term archiving. With Spire.PDF for Java, the entire process becomes quite simple:

Key Takeaways:

  • ✅ Use PdfStandardsConverter for conversion
  • ✅ Choose appropriate PDF/A level based on requirements
  • ✅ Pay attention to resource cleanup (call dispose())
  • ✅ Handle encrypted files and metadata preservation
  • ✅ Implement proper error handling and logging for batch processing

Practical Application Recommendations:

  1. Test with small samples first to confirm conversion quality
  2. Establish automated processes for regular document archiving
  3. Keep both original PDF and converted PDF/A
  4. Periodically verify PDF/A file compliance

Hope this article helps you better understand and implement PDF to PDF/A conversion. If you have specific questions, feel free to discuss in the comments!

Happy coding! 🚀
