Editing PDFs with OpenOffice.org
Repackaged
OpenOffice has always been able to export PDF documents. Version 3.0 is the first to introduce an extension
that lets you import and edit PDFs.
By Florian Effenberger
mipan, Fotolia
Adobe originally designed the Portable Document Format (PDF) as a display-only format for
platform-independent document viewing, but several tools for editing PDF documents have appeared over the
years. OpenOffice.org version 3.0 (OOo) brings the power of PDF editing to the free OpenOffice suite.
Before you raise your hopes too high, let me start by saying that, despite the best efforts of editing tools, PDF
remains a format primarily dedicated to rendering documents and has limited editing capability. If you are
thinking of using an OpenOffice extension to edit the document in its original format, including styles,
paragraphs, images, and tables, you will be disappointed. This limitation is not a failing of OpenOffice; it is
inherent in the PDF format itself. If you need to send documents in an editable format, the Open Document
Format (ODF, i.e., OpenOffice's native file format) is a better choice.
Despite this, the PDF Import function is a useful feature that lets users open PDF document content for
editing, with some limitations. With the extension installed, the free office suite loads PDF files as easily
as any other format.
Installation
Although PDF Import [1] was released at the same time as OpenOffice 3.0, it is not included in the default
installation. To load the extension, you need to download it from the Extension Repository [2] (see the
"Extension Repository" box). The version number carries a "Beta" tag to show that the plugin is still under
development and might be buggy (Figure 1).
Figure 1: The PDF Import extension is downloaded easily with a couple of mouse clicks.
Clicking the Get it! button for your choice of platform and then selecting Open in your browser will launch
OpenOffice and offer to install the extension (Figure 2). Alternatively, you can download the file separately
and install it by double-clicking. After you accept the license agreement, the Extension Manager lists the
new function.
Figure 2: The PDF Import extension is ready for installation.
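If you prefer a terminal to mouse clicks, OpenOffice also ships the unopkg command-line tool, whose add subcommand installs an extension package. The following minimal sketch only assembles and, if possible, runs that command. It assumes that unopkg is on your PATH, and the file name pdfimport.oxt is a placeholder for whatever file you actually downloaded.

```python
import shutil
import subprocess
from typing import List, Optional


def build_install_command(oxt_path: str, shared: bool = False) -> List[str]:
    """Return the unopkg command line that adds an extension package."""
    command = ["unopkg", "add"]
    if shared:
        # --shared installs for all users and usually needs admin rights.
        command.append("--shared")
    command.append(oxt_path)
    return command


def install_extension(oxt_path: str) -> Optional[int]:
    """Run unopkg if it is on PATH; return its exit code, or None if missing."""
    if shutil.which("unopkg") is None:
        return None
    return subprocess.call(build_install_command(oxt_path))
```

Calling install_extension("pdfimport.oxt") returns None on systems without unopkg, so the sketch fails quietly rather than raising an error. Either way, the Extension Manager remains the place to check that the extension is listed.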
Extension Repository
The Extension Repository [2] has been an important site for OpenOffice users since version 3.0, if not
earlier. Hundreds of downloadable extensions are now available, and new extensions are added every
month. Most are free of charge; only a few are commercial.
Popular add-ons include the Presenter Console, the Wiki Publisher, WollMux by the City of Munich, and
Writer2LaTeX. Many features are available only as extensions because they are easier to maintain that way
than as part of the main OpenOffice code base. At the same time, extensions keep the program code free of
features that many users do not actually need.
How It Works
After installing the extension, OpenOffice will not look different from before, although the import feature is
listed by the Extension Manager. No new menu entries and no additional icon bars are visible. However, you
do not need them because you can load PDF files normally, with File | Open.... After a short delay, the
document opens - surprisingly in Draw, the OpenOffice drawing module. But why? At first sight, it might
seem strange, but if you think about a PDF file's characteristics, the reason quickly becomes apparent.
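Because the individual file elements are defined in a page description language that records only where
text and graphics appear on the page, it is impossible to tell whether the PDF started out as a text document,
a presentation, or a spreadsheet. Draw, which places objects freely on a page, is the closest match.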
If you open a simple document containing body text in a standard font, like that shown in Figures 3 and 4, you
should have no trouble importing the file. But when you edit the document, you will notice that the layout is
not maintained. Each line of text is in a frame that is not linked to the other frames (lines of text). This means
that you can only edit line by line, making it difficult to format paragraphs or change the line spacing. Again,
this limitation is inherent in the PDF page description language.
Figure 3: The PDF document in a PDF reader ...
Figure 4: ... and after importing into OpenOffice.org Draw.
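To see why every line ends up in its own frame, it helps to look at how a PDF page stores text. The following sketch parses a hand-written, heavily simplified content stream with the standard PDF text operators Td (move to the start of the next line) and Tj (show a string). Real files usually compress their streams and use further operators, string escapes, and font encodings, so this is an illustration under those simplifying assumptions, not an importer.

```python
import re
from typing import List, Tuple

# Hand-written sample: one paragraph, stored as two positioned lines.
SAMPLE_STREAM = """
BT
/F1 12 Tf
72 720 Td
(OpenOffice has always been able) Tj
0 -14 Td
(to export PDF documents.) Tj
ET
"""

# Matches either "(text) Tj" or "tx ty Td"; escaped parentheses are ignored.
TOKEN = re.compile(
    r"\((?P<text>[^)]*)\)\s*Tj"
    r"|(?P<tx>-?\d+(?:\.\d+)?)\s+(?P<ty>-?\d+(?:\.\d+)?)\s+Td"
)


def extract_lines(stream: str) -> List[Tuple[float, float, str]]:
    """Return (x, y, text) for every string shown in the stream."""
    x = y = 0.0
    lines = []
    for match in TOKEN.finditer(stream):
        if match.group("text") is not None:
            lines.append((x, y, match.group("text")))
        else:
            # Td offsets the line start relative to the previous line start.
            x += float(match.group("tx"))
            y += float(match.group("ty"))
    return lines


for x, y, text in extract_lines(SAMPLE_STREAM):
    print(f"x={x} y={y} text={text!r}")
```

The result is a list of positioned strings and nothing more: the stream contains no marker that says the two lines belong to one paragraph. Any importer has to guess that relationship from the coordinates, which is why Draw presents each line as an independent object.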
A similar problem occurs when you import tables. Again, you can edit a table line by line, but a workaround
is needed for any major changes because the individual rows and columns are isolated rather than forming a
contiguous table.
Because of the restrictions inherent in a PDF, much of the additional information is lost, such as the document
structure, the outline, and the styles. Although OpenOffice converts formatting correctly, it can only
support direct formatting because the PDF does not provide the styles or define their dependencies. This
does not change if a PDF is exported as a tagged PDF or as PDF/A-1. In contrast, images and drawings are
inserted correctly. Although OpenOffice will not automatically group drawing elements, you can easily
correct this with the standard tools in Draw.
The PDF format supports different page orientations, such as portrait and landscape, within a single
document. OpenOffice Draw does not support this feature at present; instead, it uses the orientation of the
first page for all following pages. Also, it is impossible to reimport the macros in OOo documents that have
been converted to PDFs: Although recent versions of the PDF format support script execution in PDF readers,
this does not apply to OOo Basic and other Office scripting languages.
Fonts are another obstacle. Although OpenOffice - as well as many other programs - will, by default, export a
subset of the fonts in a document to a PDF, the results are only suitable for viewing on third-party systems,
not for editing. Exporting fonts is also a difficult topic for licensing reasons. OpenOffice's approach to this is
to select a substitute font when importing PDF documents with fonts that are not installed on the local system
- in some cases, the substitute font is not a good choice.
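Embedded font subsets are recognizable by name: The PDF specification prefixes the font name of a subset with a tag of six uppercase letters and a plus sign, as in ABCDEF+DejaVuSans. The following sketch separates that tag from the base name, which is the name you would look for on the local system. The function name and the sample font names are made up for illustration.

```python
import re
from typing import Tuple

# Subset tag as defined for PDF font names: six uppercase letters, then "+".
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+(?P<base>.+)$")


def split_subset_font_name(name: str) -> Tuple[bool, str]:
    """Return (is_subset, base_name) for a PDF font name."""
    match = SUBSET_TAG.match(name)
    if match:
        return True, match.group("base")
    return False, name


for font in ("ABCDEF+DejaVuSans", "Helvetica"):
    print(font, "->", split_subset_font_name(font))
```

A subset contains only the glyphs the document actually uses, so even if the tag is stripped, the embedded data is not a complete font that could be used for typing new text.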
The import function cannot handle protected and encrypted PDF documents. Whether the document is edit
protected or just view protected, the PDF Importer will fail to import (Figure 5). Of course, cracking tools will
let you remove the password protection, but you have to consider the legality of this action in the case of
third-party PDFs.
Figure 5: OpenOffice can't import encrypted PDF documents.
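You can check for protection before attempting an import. PDFs that use the standard security mechanisms carry an /Encrypt entry in their trailer, so a byte search for that key works as a rough first test. The sketch below is only a heuristic: It can report a false positive if the same bytes happen to occur elsewhere in the file, and the function names are hypothetical.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw file data mentions an /Encrypt entry."""
    return b"/Encrypt" in pdf_bytes


def file_looks_encrypted(path: str) -> bool:
    """Read a file in binary mode and apply the same heuristic."""
    with open(path, "rb") as handle:
        return looks_encrypted(handle.read())


# Two hand-written trailer fragments, one with and one without the key.
plain = b"trailer\n<< /Size 12 /Root 1 0 R >>\n%%EOF"
locked = b"trailer\n<< /Size 12 /Root 1 0 R /Encrypt 11 0 R >>\n%%EOF"
print(looks_encrypted(plain), looks_encrypted(locked))
```

A proper check would parse the trailer dictionary with a PDF library; the byte search merely saves you from waiting for an import that is bound to fail.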
PDF Import Benefits
Although OpenOffice will fail to import a number of objects - typically because of weaknesses inherent in the
PDF format itself - the PDF Import extension is more than just a toy. Even expensive commercial tools are
limited in their abilities to edit PDF documents successfully because the format is just not designed for this
purpose. In addition, some of the problems you run into can be resolved with the use of standard OpenOffice
tools.
The font import issue is not as big a problem as it might seem in production use because most documents use
standard fonts that are available either on any system or in the form of a matching substitute in OpenOffice.
Often PDFs are scans, and thus image files that you can import.
Finally, don't forget that the OpenOffice PDF Import feature is still a beta version, which means that it could
still contain some bugs. Some functions might not yet be implemented. The developers are already working
on plans to improve import performance in future versions.
PDF Import makes sense in many scenarios. If you have lost an original document, for example, you will be
glad of any chance to reconstruct the content. If you need to reference external sources, you can use the PDF
Import feature to do so. Also, if you need to fill out or save a PDF form, even though the document does not
support this, you will again be glad of the OpenOffice extension. In production use, the extension is also
perfect for minor changes to documents - say, correcting typos or adding a watermark.
Copying via the Clipboard
After importing a PDF into OpenOffice, editing body text is a very painstaking task. If you want to copy text
from an unprotected PDF into a document of your own, the alternative is to use the clipboard.
First, select the text in your PDF reader with Ctrl+A, press Ctrl+C to copy it, then press Ctrl+V to insert
it into your document. In the best possible case, much of the formatting, such as bold and italic type, will
be kept. In most cases, the results will be contiguous paragraphs that you can format and align.
Practical Hybrid Format
Although the name doesn't suggest it, the PDF Import extension also includes an export function that creates a
really practical format. This unspectacular feature, which is tacked onto the normal PDF export dialog in
OpenOffice (Figure 6), lets you create PDFs in a hybrid format. The document has a .pdf suffix and can be
read in any normal PDF reader. In addition, it contains the original file in its native Open Document Format.
Figure 6: Create a hybrid file when you export PDFs.
This allows OpenOffice or StarOffice users with the PDF Import extension to open the original document
for editing. Instead of Draw, the file opens in the module used to create it (e.g., Writer, Calc, or Impress). The
hybrid document thus combines the benefits of both formats: The recipient can edit the file normally and, just
to be on the safe side, is given a "proof copy" in PDF format, with fonts and graphics that show what the
original author meant the document to look like.
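If you receive a PDF and want to know whether it is a hybrid file, a quick look at the raw bytes can help. The check below rests on an assumption that this article does not confirm: that the exporter marks the embedded ODF copy with an /AdditionalStreams entry in the PDF trailer. Treat both the key name and the hypothetical function as an unverified sketch rather than a documented interface.

```python
def looks_like_hybrid_pdf(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the file mentions an /AdditionalStreams entry.

    Assumption: hybrid PDFs written by OpenOffice carry this trailer key.
    """
    return b"/AdditionalStreams" in pdf_bytes


# Hand-written trailer fragments; the object numbers are arbitrary.
ordinary = b"trailer\n<< /Size 20 /Root 1 0 R >>\n%%EOF"
hybrid = b"trailer\n<< /Size 20 /Root 1 0 R /AdditionalStreams [ 19 0 R ] >>\n%%EOF"
print(looks_like_hybrid_pdf(ordinary), looks_like_hybrid_pdf(hybrid))
```

The reliable test remains the practical one: Open the file in OpenOffice with the extension installed and see whether it lands in Draw or in the module that created it.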
Conclusions
PDF Import for OpenOffice.org demonstrates its potential despite its fairly early development stage. Already
it is useful for minor corrections to PDFs, and the developers are working on improving the extension. It will
be interesting to see the changes in the next release.
Despite all this, you should always remember that PDF is a display format that does not lend itself to editing.
If you need to exchange editable documents, it makes far more sense to use a format like ODF or to create
hybrid PDFs that give you the best of both worlds.
INFO
[1] PDF Import extension: http://extensions.services.openoffice.org/project/pdfimport
[2] OOo Extension Repository: http://extensions.services.openoffice.org
THE AUTHOR
Florian Effenberger has been a free software evangelist for many years. He is the Co-Lead of
OpenOffice.org's international marketing project and a member of the board of OpenOffice.org Deutschland
e.V., a German NGO. His work mainly focuses on designing enterprise and school networks and software
distribution solutions based on free software. Florian is a regular contributor to various German and English
language publications, in which he investigates legal issues, among other topics.