IN A NUTSHELL:
Although Adobe Acrobat Reader v9 for Linux can fill in forms, it still has limitations. I describe here how to use OpenOffice.org Writer instead to overlay text onto PDF pages. I also list alternatives I tried which were not as satisfactory.
Advantages of this method:
- can freely fill in PDF forms even if they were not created as an electronic "form" and would otherwise have to be printed out and filled in with pen and paper.
- overcome limitations in Acrobat Reader for Linux such as "You cannot save what you filled out; you can only print a copy"
- can easily do global search-and-replace, e.g. "I changed my mind about the requested date, which I filled in 10 times all over this form!"
- can cut&paste text from elsewhere, e.g. "It says to fill in the reason I'm right for this job, but I already answered this in another letter I wrote. I'll just paste it in!"
Disadvantages:
- takes more work
- relies on ability of ImageMagick (or other software commonly included with Linux) to read PDF files and convert to images
What you will need:
- OpenOffice.org software (commonly included with Linux)
- ImageMagick (also commonly included with Linux)
HOW TO FILL PDF FORMS IN LINUX
Currently (March 2010) the easiest way to fill in PDF forms is to use Adobe Acrobat Reader v9 for Linux. If this fulfills your needs, go use it; you need not read any further. The rest of this text is for people for whom Acrobat Reader does not do what they want.
When I tried to use Adobe Acrobat Reader v9 for Linux for some PDF forms I needed to fill out, I ran into some problems:
- one form did not have the option to be filled electronically. I was to print it out and fill it by hand.
- another form could be filled out, but I could not save what I had filled out. That meant I could not stop partway through filling this 20-page form and continue later; nor could I go back the next day to fix a mistake after I had finished filling in the form.
I am not sure whether all forms are like this under Acrobat Reader, or just under Acrobat for Linux, or whether the creator of the form deliberately set it so that we could not save what we had filled in.
Suboptimal Alternatives
I explored various options other than OpenOffice.org Writer, which were not as good for the following reasons:
GIMP v2.4.6:
- this is probably the next best alternative, especially for single-page forms
- you need to know how to use GIMP, which in my own opinion is not very user-friendly; e.g. you might be looking at one layer but are actually modifying another layer.
- You MUST know how layers work in GIMP (even if you are using a single-page form)
- handles multiple pages as layers, making it somewhat clumsy to use. For example, you need to print each page separately, by turning the other pages into invisible layers, and then repeating for each page.
- Cannot "Save As" PDF, so to generate a modified PDF you need to "Print to PDF File"; this is not a feature of GIMP, so your Linux setup had better already have a "convert to PDF" virtual printer installed.
Inkscape v0.46:
- can only handle one page at a time (a 20-page form would become 20 separate files)
- exporting/printing turned small text (from the PDF form) into black blocks. Might be solved by increasing the "gradient mesh precision" when importing, but I didn't explore too much
- If you can solve the problem of fine print on PDF turning into black blocks, Inkscape might be good for one-page forms.
OpenOffice.org Draw v2.4.1
- when I tried to import a PDF file, it opened OOo Writer instead, and showed gibberish characters
- may be different in OOo Draw v3, but I haven't tested that.
PDFedit v0.4.1:
- did not work on one of my PDF files: it cannot read PDF Portfolios/PDF Packages.
- on the PDF file that it *could* read, after I filled in the form it produced a PDF file that was not recognized by Acrobat v9 for Linux, or KPDF, although the eVince Document Reader for Linux was able to read it.
- notes: At first it refused to edit that PDF file. I had to go to Tools > Delinearize, open the PDF file (apparently this tool doesn't process the currently loaded file) and save it in a "delinearized" version
- it keeps producing this invisible error message window which only shows up on the Taskbar. If I click on the main window a few times then it goes away.
PDFmodify:
- not available for my Linux distribution; not tried
- a command-line utility that does not have a user-friendly GUI.
- $900 to register (but you can use unregistered version). Not open source: binary only (but available for Linux)
PDFtk:
- powerful and flexible command-line utility
- to fill in a form, you have to generate a file with the data, which PDFtk then merges into the PDF file
- there is a GUI, but not available for my distribution; did not test GUI. I suspect the GUI would *not* have a screen displaying that PDF where I can easily see the form and fill it in by editing.
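For comparison, the PDFtk workflow looks roughly like the sketch below: write the field values into an FDF data file, then let pdftk merge that file into the form. The helper name make_fdf, the field names and the file names are all made up for illustration; the real field names come from your own PDF (pdftk's dump_data_fields operation lists them).

```shell
# make_fdf NAME=VALUE ...
# Writes a minimal FDF data file to standard output, one field per argument.
# Hypothetical helper, not part of pdftk.
make_fdf() {
    printf '%%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n'
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first "="
        value=${pair#*=}     # text after the first "="
        # Backslashes and parentheses must be escaped inside FDF strings
        name=$(printf '%s' "$name" | sed 's/[\\()]/\\&/g')
        value=$(printf '%s' "$value" | sed 's/[\\()]/\\&/g')
        printf '<< /T (%s) /V (%s) >>\n' "$name" "$value"
    done
    printf '] >> >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%%%EOF\n'
}

# Example use (placeholder field and file names):
#   make_fdf "applicant_name=Jane Doe" "start_date=2010-04-01" > data.fdf
#   pdftk form.pdf fill_form data.fdf output filled.pdf
```

This only works on PDFs that really contain electronic form fields, and you never see the form while you are filling it in, which is why I did not pursue it further.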
PDFtoHTML v0.36:
- just converts to HTML enough that you can read the text; does very poorly on formatting, so output is not presentable
- in particular, fails to retain text positioning and tables (text within a row of table cells is simply listed in successive lines)
- not useful for reproducing a PDF page; equivalent to someone simply reading the PDF text out loud
Scribus v1.3.3.11:
- not easy to use.
- In particular, I still cannot figure out how to easily import images into a page. I can insert an image frame, and put the converted image inside, but it's too big, and I can't figure out how to rescale it. The only way I can find is to use the Edit Image feature, in which Scribus runs GIMP so that I can rescale the image, save it and then import that new modified image into Scribus. No thanks.
- I'm not sure whether Scribus can edit text as easily. Can it do global search-and-replace on text scattered in different locations?
- has some potential. Some of you might want to explore this a bit more.
eVince, Okular, KPDF, XPDF, GSview:
- these are viewers, not editors
When I searched for other solutions on the Web, there was a lot of irrelevant stuff to sift through. People would suggest using PDF viewers which could not modify the PDF form, or suggest modifying the original non-PDF file and then generating a new modified PDF (not applicable unless you were the one who created the PDF in the first place), or even just say, "You're not SUPPOSED to edit that PDF/fill in form!" etc.
How To Use OpenOffice.org Writer on PDF Forms
1. Convert PDF file to images
2. Set up OOo Writer
3. Import images into document
4. Fill in the form
5. Export to PDF
It sounds more complicated than it really is; it's easy once you get started. Here are the steps in detail:
1. Convert PDF file to images
You need to convert the PDF file to images. Here is the command-line command to do it. Maybe someone can write a quick GUI script for people who don't like the command line. Anyway, you need to have ImageMagick installed; this is a set of handy tools (free/open source software) that come with most major Linux distributions. We use the "convert" command from this toolkit.
At the command-line (terminal/console/shell/whatever you want to call it), type:
convert -density 150 -resample 150x150 Original_PDF.pdf Output_image.png
Of course, you would replace "Original_PDF.pdf" with your own PDF file name. Same with "Output_image.png". This produces a PNG image file. For multi-page PDFs, it will produce multiple numbered PNG files called "Output_image-0.png", "Output_image-1.png", "Output_image-2.png" etc. Note that numbering starts from 0, so the second page is numbered "1" etc. Also note that the eleventh page would be called "Output_image-10.png", which when sorted in alphabetical order comes after number 1 but before number 2.
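One way around the sorting problem is to rename the files so that every page number has the same width. Here is a minimal sketch in plain shell, assuming the "Output_image-N.png" naming described above; the function name pad_page_numbers is made up for this example.

```shell
# pad_page_numbers PREFIX [WIDTH]
# Renames PREFIX-0.png, PREFIX-1.png, ... PREFIX-10.png so that the page
# number is zero-padded (PREFIX-000.png, PREFIX-001.png, ...) and the files
# sort in page order. Hypothetical helper; WIDTH defaults to 3 digits.
pad_page_numbers() {
    prefix=$1
    width=${2:-3}
    for f in "$prefix"-*.png; do
        [ -e "$f" ] || continue          # the pattern matched nothing
        n=${f#"$prefix"-}                # strip "PREFIX-"
        n=${n%.png}                      # strip ".png"
        case $n in
            ''|*[!0-9]*) continue ;;     # skip names that are not plain numbers
        esac
        # Drop leading zeros so printf does not read the number as octal
        while [ "${n#0}" != "$n" ] && [ "${#n}" -gt 1 ]; do
            n=${n#0}
        done
        new=$(printf "%s-%0${width}d.png" "$prefix" "$n")
        [ "$f" = "$new" ] || mv -- "$f" "$new"
    done
}
```

Run it in the directory that holds the images, e.g. "pad_page_numbers Output_image". Page numbering still starts from 0 afterwards; only the width of the number changes.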
I found that 150 was a good number for density and resample resolution: it's high enough that the image looks nice, but low enough so that the image does not take up too much memory in the resulting OpenDocument text file that we will produce with OOo Writer (which is already going to be much bigger than the original PDF file, in any case). If you want higher resolution, you can try 200 or even 300 as follows:
convert -density 300 -resample 300x300 Original_PDF.pdf Output_image.png
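To get a feel for what the density number means: the pixel size of each page image is roughly the page size in inches multiplied by the density. Below is a small sketch of that arithmetic, assuming a US Letter page (8.5 x 11 inches); the function name page_pixels is made up, and awk is used only because plain shell arithmetic cannot handle decimals.

```shell
# page_pixels WIDTH_IN HEIGHT_IN DENSITY
# Prints the approximate pixel dimensions of one page image.
page_pixels() {
    awk -v w="$1" -v h="$2" -v d="$3" \
        'BEGIN { printf "%dx%d\n", w * d + 0.5, h * d + 0.5 }'
}

page_pixels 8.5 11 150    # prints 1275x1650
page_pixels 8.5 11 300    # prints 2550x3300
```

Doubling the density quadruples the number of pixels per page, which is why the higher settings make the resulting document so much bigger.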
I recommend using ".png" as the output file extension since the PNG file format does better with diagrams and text. If your PDF contains big photographs, you might want to try using a ".jpg" extension on the output file, in which case the "convert" program will know you want a JPEG file. This gives a poor quality output unless you specify the quality as follows:
convert -quality 90 -density 150 -resample 150x150 Original_PDF.pdf Output_image.jpg
Use quality of 90 or above (max is 100 but it makes a HUGE output file). (I actually don't know if you need "-density 150 -resample 150x150 " when you're producing a JPEG file but I left it in just in case.)
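If you convert forms often, the choice between the PNG and JPEG variants can be wrapped in a small function that looks at the output extension. This is only a sketch: the function name pdf_to_images, its argument order and the CONVERT variable are made up for this example. Setting CONVERT=echo prints the command instead of running it, so you can check it first.

```shell
# pdf_to_images INPUT.pdf OUTPUT.(png|jpg) [DENSITY]
# Runs the "convert" command from this section, adding "-quality 90"
# only when the output file is a JPEG. DENSITY defaults to 150.
# Hypothetical wrapper; CONVERT is an assumption of this sketch.
pdf_to_images() {
    src=$1
    dst=$2
    density=${3:-150}
    case $dst in
        *.jpg|*.jpeg)
            # JPEG output looks poor unless the quality is given explicitly
            "${CONVERT:-convert}" -quality 90 -density "$density" \
                -resample "${density}x${density}" "$src" "$dst" ;;
        *)
            "${CONVERT:-convert}" -density "$density" \
                -resample "${density}x${density}" "$src" "$dst" ;;
    esac
}
```

For example, "pdf_to_images Original_PDF.pdf Output_image.png" runs the first command in this section, and "pdf_to_images Original_PDF.pdf Output_image.jpg 300" produces JPEG files at the higher resolution.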
2. Set up OOo Writer
Create a new blank document with OpenOffice.org Writer. We will need to use a small font with tight spacing so we can position the text correctly. Set the font size to 8pt, set the paragraph spacing to zero, and set the line spacing to "fixed at 0.1 inches". I think the Arial font (without serifs) works better than the Times New Roman font (with serifs). I created a blank document with these font settings and saved it, so that in the future I can just open this document when I want to fill in other PDF forms.
If you know how to use Styles in OOo Writer, it's easiest to create a new style (I call mine "small text for filling forms"). Type in your text using this style. That way if there are any font changes that you want to apply universally, you can just modify this style instead of having to select the text in the document. Selecting text is not going to be easy: if you use the mouse, you keep selecting the images by accident, and the Ctrl-A key to select all text doesn't work on my computer for some reason. To use Styles, click on Format > Styles and Formatting; then in the pop-up window, right-click and select New (or Modify, if you already created your style). Then select your new style (instead of "Default") in the main Writer window, to make sure that any text you type in uses this new style.
Do all this before you put the images in your document. Once you put in images, it's going to be clumsy to modify the text formatting.
3. Import images into document
Actually, I suggest that you import one image (one PDF page) at a time, fill it in, then import the next image. You could import all the images before typing text, but I find this easier.
From main menu: Insert > Picture > From File (or Alt-I, Alt-U, Alt-F). In the file dialogue that opens, select your PNG file (remember that page 1 is numbered 0). If you can't remember which image to import, you can see what page you're on in the OOo Writer main window lower left corner.
Once the picture appears in your document, right-click on it and select Anchor > to Page (or press C, A) so you won't move the image around accidentally. Then right-click again and select Wrap > in Background (or press W, B). This pushes the image into the background so all the text is overlaid on top.
4. Fill in the form
Now click your mouse outside the image (I click on the left edge of the page) so that the image is no longer selected. Your cursor changes to an I-beam cursor instead of a hand, which means you can type text now. Start typing stuff and you can see that it overlaps.
You have to position your text correctly. Just use tabs and spaces and press Return for empty lines. Yes, this means that if you end up changing your font size then the whole text shifts and you have to correct it. If you insert lines on top then the text at the bottom gets pushed down. The reason I don't use columns/tables/text frames is twofold:
- I have to navigate the text with keyboard only. I can't click with the mouse to say "put the cursor here" because then it selects the background image. So, use Ctrl-Left/Ctrl-Right to move around
- I may need to select text (by pressing Shift while using the cursor keys, not with the mouse) and this is going to be tricky with tables/columns/frames, which might just select text in a cell/frame when I want to cross the boundaries (or select entire cells when I only wanted certain parts of it)
When you are ready for the next page, press Ctrl-Enter to insert a page break, creating a new page, and you can insert the image for the next page. Page breaks are your friend: they allow you to edit previous text without having the effects spill onto the next page.
5. Export to PDF
You can use the built-in PDF Export feature of OOo Writer (File > Export as PDF), or if your system has it, you can choose the virtual printer feature which acts just like a printer except it really creates PDF files. Check the quality of the imported image; if it's not high enough, you might want to replace the 150 number in the first step with a higher number.
Closing comments
I am using Kubuntu Linux v8.04 with KDE 3. This is the most current Ubuntu Long-Term Support version, until v10.04 comes out next month. (Well, the KDE 3 part isn't part of the Long-Term Support.) I think my strategy is a stable solution that uses time-tested software packages and not the newest features of software packages still under development.
This does rely on ImageMagick's ability to convert PDF files. If some new format comes out which is not recognized by ImageMagick, we would have to look for other ways to generate the images easily.
I spent a lot of time trying to figure out which was the best way to fill in PDF forms, a very common occurrence whether or not you're using Linux. Hopefully this will save some of you some time so you can get on with doing your work rather than fiddling around with various tools that should be helping you.