How to check in Java whether a (readable) file is corrupted?

I have a web application in which users can upload PDF files via FTP. After downloading a PDF file, I perform certain operations on it.

The problem is that the FTP connection sometimes breaks during the download, so the downloaded PDF is incomplete and behaves as a damaged file. When I try to open such a document in Acrobat Reader, it shows the message "Error opening the document. The file is damaged and cannot be restored."

Now, before I start processing a PDF, I want to check that the downloaded file is readable, meaning that it is not damaged.

Does Java provide any API for this, or is there another way to check whether a file is corrupted?

1 answer

The iText API can be used in Java for working with PDF files.

To check whether a downloaded PDF file is valid and readable, open it with com.itextpdf.text.pdf.PdfReader.
If the file is damaged, an exception of type com.itextpdf.text.exceptions.InvalidPdfException is thrown.

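Example code snippet:

...
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
...
try {
    // The constructor parses the file structure and throws if the PDF is damaged
    PdfReader pdfReader = new PdfReader( pathToUploadedPdfFile );

    // Reading page one confirms that the page content is accessible too
    String textFromPdfFilePageOne = PdfTextExtractor.getTextFromPage( pdfReader, 1 );
    System.out.println( textFromPdfFilePageOne );
    pdfReader.close();
}
catch ( Exception e ) {
    // InvalidPdfException ends up here: treat the file as damaged
}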

For a file that was downloaded but is damaged, an error like the following may occur:

com.itextpdf.text.exceptions.InvalidPdfException: Rebuild failed:   
  trailer not found.; Original message: PDF startxref not found.  

Note: to reproduce this exception, start saving a PDF file from the network and interrupt the transfer partway through.
Then load the partial file with the code above and check whether it is reported as damaged.
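
An interrupted transfer like this can often be caught even before the file is handed to iText. The following is only a rough sketch using the Java standard library: it assumes that a complete PDF starts with the "%PDF-" header and has the "%%EOF" marker somewhere in its last 1024 bytes. The class PdfQuickCheck and the method looksComplete are hypothetical names made up for this example, not part of iText. It is a heuristic: a file that passes can still be damaged internally, so the PdfReader check is still needed afterwards.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper (not part of iText): a cheap pre-check for truncated PDFs.
final class PdfQuickCheck {

    // Assumption: the end-of-file marker lies within the last 1024 bytes.
    private static final int TAIL_SIZE = 1024;

    /** Returns true if the file starts with "%PDF-" and contains "%%EOF" near its end. */
    static boolean looksComplete(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < 5) {
                return false; // too short to even hold the header
            }

            // Check the header at the very start of the file.
            byte[] head = new byte[5];
            raf.readFully(head);
            if (!"%PDF-".equals(new String(head, StandardCharsets.ISO_8859_1))) {
                return false;
            }

            // Look for the end-of-file marker in the tail of the file.
            int tailLength = (int) Math.min(TAIL_SIZE, length);
            byte[] tail = new byte[tailLength];
            raf.seek(length - tailLength);
            raf.readFully(tail);
            return new String(tail, StandardCharsets.ISO_8859_1).contains("%%EOF");
        }
    }
}
```

Calling PdfQuickCheck.looksComplete(path) right after the FTP download gives a quick way to reject obviously cut-off files; anything that passes should still go through PdfReader as shown above.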

You can find detailed examples in the iText API documentation.
