Reading a PDF in Java as a file and making the “PDF” editable

I want to accomplish this by embedding my custom file type, which my program can read, into the output PDF.

This program will be used by many users who will create new databases or extend existing ones. That is why producing the output as multiple files is an inefficient and extremely poor way of achieving what I want (it would complicate things for the user).

I have a program which will be used for building a question database. I'm building it for a site that wants users to know that the content was downloaded from that site. That is why I want the output to be PDF: almost everyone can view it, and almost no one can edit it (and remove e.g. a footer or watermark, unlike in some simpler file types). That explains why it HAS TO be PDF.

What I want to do is produce PDF files which are still editable with my program once created.

I came up with 3 ways of doing that:

Hide the file inside an image which would be added to the PDF somewhere on the last or first page, somehow (I still need to work that out) hidden from the public eye. Knowing its location, it should be relatively easy to retrieve it using a PDF library.

In Java, there is no real difference between text and binary files; you can read both as an InputStream. The difference is that for binary files you can't really create a Reader, because that assumes there is a way to convert the byte stream to Unicode characters, which won't work for PDF files.
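To illustrate the point about streams versus Readers, here is a minimal sketch. The class and method names are made up for this example. It reads a file as raw bytes and shows that a decode/encode round trip through UTF-8 damages typical PDF binary bytes, while ISO-8859-1 (which maps every byte to exactly one character) does not.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper class, named for this example only.
final class RawPdfBytes {

    /** Reads the whole file as raw bytes; no character decoding is involved. */
    static byte[] readAll(String path) throws IOException {
        return Files.readAllBytes(Paths.get(path));
    }

    /** True if the data starts with the "%PDF-" marker that opens a PDF header. */
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if decoding to a String and encoding back yields the same bytes. */
    static boolean survivesRoundTrip(byte[] data, Charset charset) {
        byte[] back = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java RawPdfBytes <file.pdf>");
            return;
        }
        byte[] data = readAll(args[0]);
        System.out.println("looks like PDF: " + looksLikePdf(data));
        System.out.println("UTF-8 round trip safe: "
                + survivesRoundTrip(data, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 round trip safe: "
                + survivesRoundTrip(data, StandardCharsets.ISO_8859_1));
    }
}
```

If you do need to look at parts of the file as text, decoding with ISO-8859-1 is one way to keep the bytes intact, but working on the byte array directly is usually simpler.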

One file only. And that file is a PDF. The user should not be aware of the addition.

The PDF specification defines what the valid newline character(s) are (there are several). Grab a hex editor and open up a PDF, and you can at least start getting a feel for things. Be careful where you insert your lines though: you'll need to add them towards the end of the file, where they won't mess up the xref table offsets to the obj entries.

Attach the file to the PDF and then corrupt the part of the PDF which contains it, in a way that just makes the PDF unaware that it contains the file, thus making it impossible for the user to see it (easily). Upon reading the document I'd revert the corruption and extract the file using one of the many PDF libraries.

There is an open source library in Java that allows you to manipulate that: http://pdfbox.apache.org/userguide/metadata.html. See also a related question from someone who succeeded in it: custom schema to XMP metadata, or http://plindenbaum.blogspot.co.uk/2010/07/pdfbox-insertextract-metadata-frominto.html

Does anyone have a better idea?
If not, how do I read a PDF as a FILE, so that the result is an array of characters (with newline detection), and then rewrite the whole file with my content added?

A better approach is to use another existing way of encoding data in a PDF: XMP tags. This allows any kind of complex key-value pairs to be encoded in XML and embedded in PDFs, JPEGs, etc. See http://partners.adobe.com/public/developer/en/xmp/sdk/XMPspecification.pdf.
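As a rough idea of what such a packet could look like, here is a minimal sketch that wraps a binary payload as Base64 inside an XMP packet using a custom namespace. The namespace URI, the `myapp` prefix, the `payload` property and the class name are all assumptions invented for this example, and actually attaching the packet to the PDF would still be done with a PDF library such as PDFBox (not shown here).

```java
import java.util.Base64;

// Hypothetical sketch: builds an XMP packet string; it does not touch any PDF.
final class XmpPacketSketch {

    // Made-up namespace URI for the custom schema (an assumption, not a standard).
    static final String NS_URI = "http://example.com/myapp/1.0/";

    /** Wraps the payload, Base64-encoded, in a single custom XMP property. */
    static String buildPacket(byte[] payload) {
        // Base64 output only uses characters that need no XML escaping.
        String encoded = Base64.getEncoder().encodeToString(payload);
        return "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>\n"
                + "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n"
                + " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n"
                + "  <rdf:Description rdf:about=\"\" xmlns:myapp=\"" + NS_URI + "\">\n"
                + "   <myapp:payload>" + encoded + "</myapp:payload>\n"
                + "  </rdf:Description>\n"
                + " </rdf:RDF>\n"
                + "</x:xmpmeta>\n"
                + "<?xpacket end=\"w\"?>";
    }

    /** Naive extraction by string search; a real reader should parse the XML. */
    static byte[] extractPayload(String packet) {
        String open = "<myapp:payload>";
        String close = "</myapp:payload>";
        int start = packet.indexOf(open);
        int end = packet.indexOf(close);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("no myapp:payload property found");
        }
        return Base64.getDecoder().decode(packet.substring(start + open.length(), end));
    }
}
```

Base64 is used here only so that arbitrary bytes fit safely into XML text; if the custom file is already XML or plain key-value data, it could be expressed as XMP properties directly.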

In your case, you'd need to read the file into byte buffers and perhaps loop over them to search for the bytes representing the '%' and end-of-line characters in PDF.
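A minimal sketch of such a scan, assuming the special comments are tagged with a marker of your own choosing (the `%MYAPP:` marker and the class name below are hypothetical). It treats CR, LF and CR LF as line endings. Note that this naive scan can also match bytes that merely look like a comment inside binary stream data, so a distinctive marker helps.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner, named for this example only.
final class PdfCommentScanner {

    /**
     * Returns the text following the marker for every line that starts with it.
     * The marker must be non-empty, e.g. "%MYAPP:" (a made-up tag).
     */
    static List<String> findMarkedComments(byte[] data, String marker) {
        byte[] prefix = marker.getBytes(StandardCharsets.ISO_8859_1);
        List<String> found = new ArrayList<>();
        int lineStart = 0;
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            // A line ends at CR, at LF, or at the end of the data.
            if (atEnd || data[i] == '\r' || data[i] == '\n') {
                if (startsWith(data, lineStart, i, prefix)) {
                    int textStart = lineStart + prefix.length;
                    found.add(new String(data, textStart, i - textStart,
                            StandardCharsets.ISO_8859_1));
                }
                // For CR LF this yields one empty line, which never matches.
                lineStart = i + 1;
            }
        }
        return found;
    }

    /** True if data[from, to) begins with the given prefix. */
    private static boolean startsWith(byte[] data, int from, int to, byte[] prefix) {
        if (to - from < prefix.length) {
            return false;
        }
        for (int k = 0; k < prefix.length; k++) {
            if (data[from + k] != prefix[k]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `PdfCommentScanner.findMarkedComments(bytes, "%MYAPP:")` on the file's byte array would return the contents of every tagged comment line, in file order.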

The problem is that I don't know how to read the PDF as a file (I'm not trying to read it as a PDF, which I would do using a PDF library).

I have learned that if you add a “%” sign as the first character of a line inside a PDF, the whole line will be ignored (similar to “//” in Java) by the PDF reader (at least Adobe Reader), which lets me add as many lines as I want to the PDF (if I know where, and I do) without the end user being aware of it. I could embed my whole custom file into the PDF this way. The problem here is that I actually need to read the PDF using one of Java's input readers, but I am not sure which one. I understand that a PDF can't be read like a text document because it is a binary file (right?).

Here's a related question that may be of interest: PDF parsing file trailer

I would suggest placing your comment immediately before the startxref line. If you put it anywhere else, you could end up shifting things around and breaking the xref table pointers.

A simple algorithm for placing your special comment would be:

1. Go to the end of the file.
2. Search backwards for startxref.
3. Insert your special comment immediately before startxref, making sure to put a newline character at the end of your special comment.
4. Save the PDF.

You can (and should) try this manually in a hex editor.
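The steps above can be sketched in Java roughly as follows. This is a minimal sketch under a few assumptions: the whole file fits in memory, the comment text contains no line breaks and only Latin-1 characters (Base64 text would qualify), and the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named for this example only.
final class StartxrefCommentInserter {

    /** Searches backwards and returns the last index of pattern in data, or -1. */
    static int lastIndexOf(byte[] data, byte[] pattern) {
        for (int i = data.length - pattern.length; i >= 0; i--) {
            int k = 0;
            while (k < pattern.length && data[i + k] == pattern[k]) {
                k++;
            }
            if (k == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    /** Returns a copy of the PDF bytes with "%comment\n" placed before the last startxref. */
    static byte[] insertComment(byte[] pdf, String comment) {
        if (comment.indexOf('\r') >= 0 || comment.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("comment must be a single line");
        }
        int pos = lastIndexOf(pdf, "startxref".getBytes(StandardCharsets.US_ASCII));
        if (pos < 0) {
            throw new IllegalArgumentException("no startxref keyword found");
        }
        // The trailing newline keeps startxref at the start of its own line.
        byte[] line = ("%" + comment + "\n").getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream(pdf.length + line.length);
        out.write(pdf, 0, pos);                 // everything before startxref, unchanged
        out.write(line, 0, line.length);        // the special comment line
        out.write(pdf, pos, pdf.length - pos);  // startxref, its offset and %%EOF
        return out.toByteArray();
    }
}
```

Because everything before the insertion point keeps its position, the offset stored after startxref and the offsets inside the xref table still point at the right places. The result could then be written back with `Files.write`.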

Really important: are your users going to be saving changes to these files? I.e. if they fill in the form fields, are they going to hit save? If they are, your comment lines may be removed during the save (and different versions of different PDF viewers could behave differently in this regard).

XMP tags are the right way to do what you are trying to do: you can embed entire XML sections, and I think you would be hard pressed to come up with a data structure that could not be expressed as XML.
