Using PDF in Linux


I. Introduction to Linux and PDF

What is Linux?

Linux is a UNIX-like operating system created by programmer Linus Torvalds, from whom it takes its name. It is distributed under the GNU General Public License, which allows anyone to use, modify and redistribute it, provided that distributed improvements are released under the same license. Much of the operating system is made up of tools developed over many years by Richard Stallman and the GNU Project. Torvalds released the first version of the kernel in 1991 (version 1.0 followed in 1994), and went on to coordinate contributions from programmers across the Internet and to manage releases.

Linux is often chosen by people who want an alternative to Microsoft and its popular Windows operating system.

Linux is free, but packaged versions are also available, such as the one from Red Hat, which is well organized and easy to install. A packaged version is probably the best choice for people who are new to the operating system, because it makes installation much simpler.

What is PDF?

PDF stands for Portable Document Format, and it was designed in the early 1990s by Adobe Systems. It is becoming more and more popular and replacing PostScript as the preferred format for saving and viewing generic documents. Early on, only Adobe supplied programs that enabled users to view PDF files, but since the format's specification is open, Adobe's Reader is now one of many PDF viewers that are available.

PostScript is a page description language that Adobe also invented, in the early 1980s. It is an interpreted language with postfix (RPN) notation, which makes it quite flexible. PDF, by contrast, is a file format that describes the position and nature of text and pictorial content in raster or vector form, which makes it easier to process.

II. Viewing PDF in Linux

Here are some of the best options for viewing PDF files while working in Linux.

Adobe Reader

Adobe Reader is the industry standard for viewing PDF files, whether you are using Windows or Linux. It was also the first program written for viewing PDF files. It was designed alongside Adobe's Acrobat program, which is the most widely used program for creating and modifying PDF files.

Reader has been available for Linux since version 3, and it has recently become better for Linux users, especially with version 7. Version 6, which supported JavaScript, did not come with a Linux port.

If you want to view a PDF inside a browser, be it Firefox, Opera, or Netscape, Reader is the only viewer that will work. It is, however, the bulkiest of the available programs, and its plug-in is quite slow to start, which can make browsing the web while viewing a PDF file a nuisance. Once Adobe Reader is up and running, pages are rendered quickly, although thumbnails can take longer to appear because the program does not cache them.

KPDF

Many experienced users believe that KPDF is the program giving Adobe Reader a run for its money. Compared to Reader, KPDF starts up faster, though it takes about the same amount of time to render images. It has almost all of the features available in Adobe Reader. Unlike Reader, KPDF caches thumbnails, which means that they only have to be rendered once.

KPDF can be used on its own, but it can also run as a plugin for the Konqueror browser, which offers some advantages over Adobe Reader. With this plugin, the KPDF interface elements are merged into Konqueror's own interface, which is not the case with Reader.

Xpdf

Xpdf was the first alternative to Adobe Reader to become available for Linux users, appearing about three years after Reader. Its interface is built with the Motif toolkit, and it renders pages very quickly. You can rotate pages, search the document and zoom in and out.

The program was designed to be lightweight, which makes it probably the best choice if you only want to skim through a PDF or read a short one of only a few pages. One drawback is that it does not show thumbnails, but if the document provides a textual outline, Xpdf will show that instead.

According to most Linux users who have tried Xpdf, the biggest problem is getting accustomed to the interface. The program opens a retro-looking window that seems to offer no controls other than the navigation toolbar at the bottom. To reach the options for manipulating the document, you sometimes have to right-click on an empty spot in the main window.

Evince

Most people who have used Evince say that it is quite similar to Xpdf, but significantly slower. It is a GNOME program and it reads multiple formats, including PDF, PostScript, TIFF and DVI. For documents with formulas and simple vector graphics, Evince works well because the pages are displayed very quickly, which is one reason it is the default document viewer for GNOME.

It does not have nearly as many features as Adobe Reader, but it is clean, efficient, very stable and integrates exceptionally well with the desktop. It also makes it very easy to print the document you are viewing.

Therefore, if you are only looking to view a PDF document, and maybe print it, Evince is probably one of the better and simpler options.

ePDFViewer

ePDFViewer is essentially an Evince clone, but it does not depend on the GNOME libraries that Evince requires (GNOME is a free software desktop project). Because it is similar to Evince, it is also a fairly simple reader to use, and because it does not need the GNOME libraries, it is much more portable than Evince. However, people often complain that ePDFViewer is too lightweight and offers little beyond basic viewing capabilities.

Evince offers a sidebar that will show thumbnails of all of the pages that you are viewing in the PDF document, which ePDFViewer does not do. Because of this, you will need to scroll through the entire document when using ePDFViewer in order to locate the content that interests you.

Okular

In a nutshell, Okular could be seen as the KDE version of Evince (KDE is an alternative graphical desktop project). It performs much the same functions that Evince does in GNOME, but its list of features is a bit more robust. Besides viewing PDF files, it lets you zoom in on specific parts of the document, annotate the document, add bookmarks, and copy parts of the document to the clipboard so that they can be pasted into another document.

Even though Evince is simpler and easier to use, Okular is probably the better PDF viewer for a Linux desktop because it offers most of the features that Adobe Reader does, while being much more stable and faster than Reader when performing these tasks.

III. Creating PDF in Linux

ps2pdf

Most people create PDF files by first making a PostScript file and then using the Adobe Acrobat Distiller to generate the PDF, but the Linux version of Adobe Acrobat does not have the Distiller.

However, there are ways to create a PDF in Linux, and one of the most popular and easiest ways to do this is by using the ps2pdf utility.

Ps2pdf is a good alternative to Acrobat Distiller. It's easy to use, very fast and allows you to make a nice PDF file without spending money on proprietary software.

It is true that Linux applications such as TeX and OpenOffice.org can create PDF files without creating a PostScript file first, but there are many times when you need a PDF from a program that cannot export one directly, and that is when you turn to ps2pdf.

The utility can process complex PostScript files that Acrobat Distiller cannot. It comes with GhostScript, a free PostScript interpreter, and works by running the PostScript through GhostScript and writing out the PDF file.

Both GhostScript and ps2pdf should be standard parts of most Linux systems. To find out whether they are installed, type “which ps2pdf” at the command line. If a path is displayed (for example, /usr/bin/ps2pdf), you have it; if not, just download and install GhostScript.
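
The check can also be scripted. The following is a minimal sketch, not part of GhostScript itself: have_tool is a hypothetical helper name, and it uses the shell built-in "command -v" in place of "which" because that works in any POSIX shell. It assumes the GhostScript executable is called gs, as it is on most distributions.

```shell
#!/bin/sh
# have_tool is a hypothetical helper: it reports whether a command is on
# the PATH and prints its location if it is found.
have_tool() {
    if location=$(command -v "$1" 2>/dev/null); then
        printf '%s: %s\n' "$1" "$location"
        return 0
    fi
    printf '%s: not found\n' "$1"
    return 1
}

# ps2pdf is a wrapper script around the gs (GhostScript) executable,
# so check for both. "|| true" keeps the loop going if one is missing.
for tool in gs ps2pdf; do
    have_tool "$tool" || true
done
```

If either line reports "not found", install GhostScript through your distribution's package manager.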

To convert a PostScript file to PDF, open a terminal window (for example, from within KDE or GNOME) and change to the directory that contains the file. Then type “ps2pdf” followed by the name of the PostScript file.

This creates a PDF file that can be viewed with Adobe Reader or any other PDF viewer.

Some users report that ps2pdf does not work on Knoppix. In that case, use the ps2pdfwr script instead, which works in almost the same way.
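
The conversion and the fallback described above can be combined in a small script. This is only a sketch under a few assumptions: pdf_name and ps_to_pdf are hypothetical helper names, report.ps is a placeholder file name, and the output file name is passed explicitly rather than left for the tool to derive.

```shell
#!/bin/sh
# Derive the output name from the input name: report.ps -> report.pdf
pdf_name() {
    printf '%s.pdf\n' "${1%.ps}"
}

# Convert one PostScript file, preferring ps2pdf and falling back to
# ps2pdfwr when ps2pdf is not on the PATH.
ps_to_pdf() {
    if command -v ps2pdf >/dev/null 2>&1; then
        ps2pdf "$1" "$(pdf_name "$1")"
    elif command -v ps2pdfwr >/dev/null 2>&1; then
        ps2pdfwr "$1" "$(pdf_name "$1")"
    else
        echo "neither ps2pdf nor ps2pdfwr found; install GhostScript" >&2
        return 1
    fi
}

# Usage (report.ps is a placeholder):
#   ps_to_pdf report.ps
```

To convert every PostScript file in the current directory, the function can be called in a loop such as: for f in *.ps; do ps_to_pdf "$f"; done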

CUPS

The Common UNIX Printing System (CUPS) offers a very easy way to create PDF files. First, install cups-pdf on your computer: go to the Add/Remove Software utility, search for cups-pdf, select the cups-pdf entry and apply the changes. When you next print a document, you will usually see a CUPS/Cups-pdf option under "Printer Name". Choosing it exports and saves the document as a PDF file, which is the simplest method of creating a PDF document.
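
The same virtual printer can be used from the command line with the standard CUPS tools lpstat and lp. The sketch below makes some assumptions: the queue name differs between distributions (for example PDF or Cups-PDF), so the script looks for any queue with "pdf" in its name; find_pdf_queue and print_to_pdf are hypothetical helper names; and notes.txt is a placeholder.

```shell
#!/bin/sh
# Pick the first print queue whose name contains "pdf" (any case).
# Reads "lpstat -a" style lines on stdin, which look like:
#   <queue> accepting requests since ...
find_pdf_queue() {
    awk 'tolower($1) ~ /pdf/ { print $1; exit }'
}

# Send a file to the cups-pdf queue, if one exists.
print_to_pdf() {
    queue=$(lpstat -a 2>/dev/null | find_pdf_queue)
    if [ -z "$queue" ]; then
        echo "no PDF print queue found; is cups-pdf installed?" >&2
        return 1
    fi
    lp -d "$queue" "$1"
}

# Usage (notes.txt is a placeholder):
#   print_to_pdf notes.txt
```

Where the resulting PDF is saved depends on the cups-pdf configuration; it is often a PDF directory inside your home directory.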

You can also create PDF files by choosing the export-to-PDF option from the File menu of an OpenOffice.org document. The export window has five tabs: General, Initial View, User Interface, Links and Security.

Once you click the Export button you will have to define a location and a file name for the document, and then you are done.

IV. Editing PDF in Linux

PDFEdit

This is one of the best programs to use when you need to edit PDF documents fully, not just export them. It is a free, open source editor, and it offers both a graphical and a command-line interface.

The software will enable you to write, create and edit PDF files, and also print, save and export them.

To install the software, if you are using Debian or Ubuntu Linux, enter: $ sudo apt-get install pdfedit

Then, to begin editing, start the program either with a file to open or on its own:

$ pdfedit /path/to/pdf.file &

$ pdfedit &

PDFEdit is probably the best free, open source PDF editor for Linux and Unix-like operating systems. Its one drawback is that it does not support editing protected or encrypted PDF files.

Scribus

This is an open source desktop page layout application that runs on Linux, Mac and Windows computers. It is also fairly easy to install and use.

To install the software on Linux, enter: $ sudo apt-get install scribus

To use Scribus to edit PDF files: Start Scribus > New File > Insert > Image > Double click > Select PDF file

Flpsed

This is a WYSIWYG pseudo-PostScript editor, and it is very fast and lightweight.

To install flpsed, enter: $ sudo apt-get install flpsed

To edit a file, enter: $ flpsed /path/to/pdf-file.pdf &

GIMP

GIMP is already installed on many Linux systems, and it can be used to edit PDF files. However, you need some knowledge of how GIMP works in order to manipulate PDF files with it, so it is not nearly as convenient as the other options given here. If you are interested in using it, here is what to do.

If GIMP is not already installed, first download and install the GIMP image editor from www.gimp.org.

Open GIMP, and then open the PDF document you wish to edit. For multiple page documents, it is easier to edit them one page at a time. If you choose more than one page, they will open in individual windows.

Make the changes you wish to make to the document.

Save the document as a GIMP XCF file.

Close the document, and then open the resulting image in Krita.

Printing to PDF with Krita

Go to File > Print, and then choose Print to PDF. In the same dialog box, choose the destination folder and the name of the output document.

When all this is done, you should open the resulting document to make sure everything worked properly.

Pdfescape.com

This is an excellent online tool that you can use to modify PDF files using a web browser.

V. Converting PDF in Linux

PDF to text conversion

A PDF file can be converted into text with “pdftotext,” a tool that belongs to the Xpdf package.

The conversion is done with this command: pdftotext irgendein.pdf

This command produces a text file named irgendein.txt that contains the text of the PDF. Once this is done, the file can be worked on like any other text file.

The -layout parameter ensures that the appearance of the PDF file is carried over to the text file as closely as possible: pdftotext -layout irgendein.pdf

If you want to convert only certain pages of the PDF to text, use the parameters -f (for the first page) and -l (for the last page):

pdftotext -f 3 -l 7 irgendein.pdf

In this example only pages three to seven would be converted.
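
The options above can be combined so that each page range ends up in its own text file. This is a sketch only: range_name and extract_pages are hypothetical helper names, and the output naming scheme is an assumption made for illustration. The optional second file argument to pdftotext sets the name of the text file.

```shell
#!/bin/sh
# Build the output name for a page range:
#   irgendein.pdf 3 7 -> irgendein-p3-7.txt
range_name() {
    printf '%s-p%s-%s.txt\n' "${1%.pdf}" "$2" "$3"
}

# Extract pages $2 to $3 of the PDF $1, keeping the layout,
# into a separate text file named by range_name.
extract_pages() {
    pdftotext -layout -f "$2" -l "$3" "$1" "$(range_name "$1" "$2" "$3")"
}

# Usage:
#   extract_pages irgendein.pdf 3 7
```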

If you run your own dedicated server on Red Hat or something similar, you probably already have “pdftotext.”

It is an excellent command to use in order to create previews of documents.
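
A preview of this kind might be produced as follows. The sketch assumes that a few non-blank lines from the first page are enough for a preview; first_lines and pdf_preview are hypothetical helper names. Passing "-" as the output file makes pdftotext write to standard output instead of a file.

```shell
#!/bin/sh
# Keep the first $1 non-blank lines of standard input.
first_lines() {
    grep -v '^[[:space:]]*$' | head -n "$1"
}

# Print a short text preview of the first page of a PDF.
# $1 is the PDF file; $2 is the number of lines (default 5).
pdf_preview() {
    pdftotext -l 1 "$1" - | first_lines "${2:-5}"
}

# Usage:
#   pdf_preview irgendein.pdf 10
```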

PDF to image conversion

The pdfimages utility can be used to extract the images embedded in a PDF file. To convert the pages of a PDF file themselves into image files, you can use the convert command on the command line.

The convert program is a member of the ImageMagick suite of image manipulation tools. It can convert between image formats and resize an image, and it can also blur, crop, despeckle, dither, draw on, flip, join and re-sample an image, along with many other operations.

This is also useful if you do not have a PDF reader installed or you are working on a web-based project.
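
Both approaches might look like the sketch below. Some assumptions apply: the helper names are hypothetical, the 150 dpi default is an arbitrary choice, and the output file names are illustrative. ImageMagick relies on GhostScript to read PDF files, some distributions disable PDF handling in ImageMagick's security policy, and newer ImageMagick releases name the command magick rather than convert.

```shell
#!/bin/sh
# Build the output pattern: doc.pdf -> doc-%03d.png
# ImageMagick replaces %03d with the page index, one file per page.
page_pattern() {
    printf '%s-%%03d.png\n' "${1%.pdf}"
}

# Render every page of a PDF to PNG.
# $1 is the PDF file; $2 is the resolution in dpi (default 150).
pdf_to_png() {
    convert -density "${2:-150}" "$1" "$(page_pattern "$1")"
}

# Extract the images embedded in a PDF instead of rendering whole pages.
# -j saves JPEG-encoded images as .jpg files; others are written as PPM/PBM.
pdf_extract_images() {
    pdfimages -j "$1" "${1%.pdf}-img"
}

# Usage:
#   pdf_to_png irgendein.pdf 200
#   pdf_extract_images irgendein.pdf
```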

VI. Links to PDF/Linux tools

Viewers

Creators

Editors

Development libraries

Annotation tools

Desktop environments
