This article briefly introduces the PDF migration project and describes the Lotus Forms 3.0 conversion tool in detail. It also lists the manual effort and requirements involved, which matter for project sizing and estimation. You should be familiar with the IBM Lotus Forms product. All users of IBM Lotus Forms are welcome to learn from and share the experience gained in this forms service project.
Introducing PDF forms
PDF is the acronym for Portable Document Format, a file format developed by Adobe® Systems Inc. PDF captures formatting information from a variety of desktop publishing applications, making it possible to send formatted documents and have them display on the recipient's monitor or printer as they were intended. To view a PDF file, Adobe Reader is required; to create or modify a PDF file, Adobe Acrobat Professional or Adobe Acrobat Designer is required.
PDF forms fall into two types: static PDF forms and interactive PDF forms. A static PDF form is a traditional PDF document with no interactive elements. It is composed of static elements such as text, dots, lines, and graphics, and the file is stored in a binary format.
The other type of PDF form is the interactive PDF form. In the PDF specification, there are two types of interactive forms:
- AcroForm. This form, introduced in PDF Specification 1.2, is a collection of fields for gathering information interactively from the user. The contents and properties of an interactive form are defined by an interactive form dictionary that is referenced from the AcroForm entry in the document catalog in the PDF file.
- XML Forms Architecture (XFA). XFA provides a template-based grammar and a set of processing rules that allow users to build interactive forms. The template-based grammar defines fields in which a user provides data. The open nature of XFA provides a common XML grammar for describing interactive forms, which is a common basis for form-related interactions between form-processing applications. This open nature means that XFA is applied in a large variety of businesses.
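The three form types described above can be told apart, roughly, by looking for the relevant dictionary keys in the file. The following is a minimal sketch, not a PDF parser: the function name `classify_pdf_form` and the sample byte strings are hypothetical, and the byte search is only a heuristic. It assumes the document catalog is stored uncompressed; newer PDF files can place the catalog inside compressed object streams, in which case the keys are not visible to a plain byte search and a real PDF library is needed.

```python
def classify_pdf_form(data: bytes) -> str:
    """Roughly classify raw PDF bytes as 'static', 'acroform', or 'xfa'.

    Heuristic only: searches for dictionary keys as plain bytes and
    assumes the document catalog is not stored in a compressed stream.
    """
    if b"%PDF-" not in data[:1024]:
        raise ValueError("not a PDF file: missing %PDF- header")
    # Interactive forms are referenced from the AcroForm entry
    # in the document catalog.
    if b"/AcroForm" not in data:
        return "static"
    # XFA resources are referenced by an /XFA entry inside the
    # interactive form dictionary.
    return "xfa" if b"/XFA" in data else "acroform"


# Hypothetical fragments for illustration; these are not complete PDF files.
SAMPLE_STATIC = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
SAMPLE_ACROFORM = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /AcroForm << /Fields [] >> >>\nendobj\n"
)
SAMPLE_XFA = (
    b"%PDF-1.5\n1 0 obj\n"
    b"<< /Type /Catalog /AcroForm << /Fields [] /XFA 3 0 R >> >>\nendobj\n"
)

for name, sample in [
    ("static", SAMPLE_STATIC),
    ("acroform", SAMPLE_ACROFORM),
    ("xfa", SAMPLE_XFA),
]:
    print(name, "->", classify_pdf_form(sample))
```

A check like this could help sort a batch of source PDFs before estimation, because static and interactive forms are likely to need different amounts of manual work after conversion.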
XFA forms provide a wide range of features:
- Workflow. This type of form includes data presentation, data capture, and data editing. XFA works as a front-end application to submit data to a back-end server; it also can be used for printing purposes.
- Dynamic interactions. The dynamic features range from interactive, human-edited forms with dynamic calculations, validations, and other events to server-generated, machine-filled forms.
- Dynamic layout. Forms can automatically rearrange themselves to accommodate the data supplied by a user or by an external data source, such as a database server. For example, if the data retrieved from the server has 100 rows, the form displays 100 rows.
- Complexity. XFA scales from single-page static forms to dynamic document assemblies based on data content and to huge production runs containing hundreds of thousands of transactions.
- XFA can be used in XML-based workflows.
- XFA separates data from the XFA template, which allows greater flexibility in the structure of the data supported and which allows data to be packaged separately from the form.
- XFA can specify dynamically growing forms.
- XFA can specify Web interactions, such as HTTP and Web Services Description Language (WSDL). Such interactions can be used to submit data to a server or to request that a server perform a calculation and return the result.
- XFA works with other XML grammars.
Table 1. Comparing Lotus Forms and XFA
|Feature list||Lotus Forms||XFA|
|Elements||Extensible Forms Description Language (XFDL) items, options, and XForms items, options||XFA items and options|
|Dynamic||XForms repeat||Subform and subform set|
|Validation test||Null, data type, data pattern, and regular expression test||Null, data type, format, and script test|
|Data instance||XML data instance or XForms data instance||XFA data template|
|Digital signature||XFDL digital signature||XML digital signature and PDF digital signature|
|ActiveX Data Object (ADO) API||Not supported||Supported|
|Embed||Can be embedded in HTML||XML Data Package (XDP), embedded in PDF|
|Layout||Item location represented in pixels or relative location||Handled by layout processor in object’s container|
|HTML||Not supported||Supports embedded HTML fragments|
PDF migration projects involve converting existing PDF forms to Lotus Forms. The PDF forms can be static or dynamic. It is important to document the project requirements clearly before the migration starts. The requirements should provide element-level detail on each form so that the form developer can process each item according to clear instructions.
The requirements document usually contains the following:
- General instructions. This content is a general guideline for the migration. It includes the scope of the migration, features list, and a clear summary of what should be converted from PDF and what should not be converted.
- Template XFDL form. This template form includes the common style and reusable components, such as toolbar, background color, print setting, label font size and color, country and state list, and so on.
- Detailed spreadsheet for each form. For each form to be converted, there should be a spreadsheet that describes the content and the mapping between the PDF and XFDL, because the elements in Lotus Forms and PDF can have different types and properties. Without the mapping information, the form developer can struggle to find a suitable item type and option value. Table 2 shows the information that should be included in the spreadsheet.
Table 2. Sample requirement information
|Column name||Description||Sample value|
|Item name||Element label||Agency name|
|Type||Converted element type, which can be different from PDF||Field check group|
|Item rule||Type of the element data||Integer, one selection choice|
|Format||Value format, such as a date or zip code||Date: MM/DD/YYYY|
|Default value||Should the element have a default value?||Default 100|
|Required||Should the element be required to be filled?||Yes or no|
|Range||Value range, usually used for number values||1 to 100|
|Item length||Size of the element||20|
|Disable/enable||Logic for the enable/disable feature of the element||Enabled if the answer is yes to the first question|
|Visible/invisible||Logic for the visible/invisible feature of the element||Visible if the answer is yes to the first question|
|Help message||Help text of the element||Fill in the name of the agency.|
|Data instance name||The data instance name bound to the element||Agency_Name|
|Calculation||Formula if the element value is calculated by other elements||=Month1+Month2|
|Pattern||Reusable pattern that can be applied to this element||Signature button type 1|
|Others||Other information or logic about the element||Signature applies only to section 1 and 2|
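If the requirement spreadsheet is exported as CSV, its rows can be loaded into a small structure and used to check sample values during the manual update or QA. The sketch below is an illustration under assumptions, not part of Lotus Forms: the class `ItemRequirement`, the helper functions, the CSV export itself, and the "Month count" row are hypothetical, and only a few of the Table 2 columns are modeled.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ItemRequirement:
    """A subset of the Table 2 columns for one form element."""
    item_name: str
    item_type: str
    required: bool
    value_range: Optional[Tuple[int, int]]
    data_instance_name: str


def parse_range(text: str) -> Optional[Tuple[int, int]]:
    """Parse a range such as '1 to 100'; return None for an empty cell."""
    text = text.strip()
    if not text:
        return None
    low, high = text.split(" to ")
    return int(low), int(high)


def load_requirements(csv_text: str) -> List[ItemRequirement]:
    """Read requirement rows from CSV text with Table 2 column names."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        ItemRequirement(
            item_name=row["Item name"],
            item_type=row["Type"],
            required=row["Required"].strip().lower() == "yes",
            value_range=parse_range(row["Range"]),
            data_instance_name=row["Data instance name"],
        )
        for row in rows
    ]


def check_value(req: ItemRequirement, value: str) -> List[str]:
    """Return a list of problems for a value entered in the element."""
    if not value:
        return [f"{req.item_name}: value is required"] if req.required else []
    if req.value_range is None:
        return []
    low, high = req.value_range
    try:
        number = int(value)
    except ValueError:
        return [f"{req.item_name}: '{value}' is not an integer"]
    if not low <= number <= high:
        return [f"{req.item_name}: {number} is outside {low} to {high}"]
    return []


# Hypothetical export of two spreadsheet rows.
SAMPLE_CSV = (
    "Item name,Type,Required,Range,Data instance name\n"
    "Agency name,Field,Yes,,Agency_Name\n"
    "Month count,Field,No,1 to 100,Month_Count\n"
)

requirements = load_requirements(SAMPLE_CSV)
print(check_value(requirements[0], ""))
print(check_value(requirements[1], "150"))
```

Keeping the spreadsheet machine-readable in this way means the same document can drive both the manual update and later checks, rather than being read by eye for every element.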
After the requirements are defined and documented, you can perform the actual migration from PDF to Lotus Forms. In general, you need to complete two major migration steps:
- Using the conversion tool, do a raw conversion from PDF to Lotus Forms automatically. The Lotus Forms conversion tool is a Lotus Forms Designer plug-in that can be used to convert from PDF forms to Lotus Forms. Using this tool can save you manual effort in the migration.
- Manual updating of the raw converted form. The raw converted form includes only the layout and logic. Based on the requirement spreadsheet, form developers can check each item on the form and update the items in Forms Designer one by one.
Lotus Forms conversion tool
The conversion tool for IBM Lotus Forms 3.0.1 (the Forms Conversion Tool plug-in) is an easy-to-use tool that lets business owners and forms developers convert Portable Document Format (PDF) files and FileNet e-forms into Lotus Forms. It can also be used to transform existing Lotus Forms. In Lotus Forms Designer 3.5, the conversion tool is embedded as part of the Designer. Follow the link in the Resources section to download and install either the Forms Conversion Tool plug-in or the Lotus Forms Designer trial.
Automatic PDF conversion
After the conversion tool is installed, click File - New - Convert to Lotus Forms. The window shown in figure 1 displays.
Figure 1. The initial Convert Forms window
Figure 1 shows the conversion selection window of the conversion tool. Click the Add File button to add more files for conversion. In the format list, two parsers can be selected: PDF and FileNet. Select the PDF parser for PDF conversion. Select the pdf_default profile in the profile list, and then click Next.
Figure 2. Selecting the file location
Figure 2 shows the window in which you can select the location to store the converted files. The default location is the same folder as the PDF file. You can also select a folder in the Designer workspace. After you click Finish, the PDF is converted. Figures 3 and 4 show the original PDF form and the converted XFDL form.
Figure 3. Original PDF form
Figure 4. Converted Lotus Forms form
From these two illustrations, you can see that most of the labels and layouts are converted by the tool.
Customized optimizers and rules
Sometimes the default conversion profile does not produce the result you expect. By creating a customized conversion profile, you can tune the conversion process in finer detail. Select the Forms Designer Windows - Preferences menu, then select Forms Conversion and Transformation Profiles in the left pane of the Preferences window and click the New button. The Profile Edit window displays as shown in figure 5.
Figure 5. The Edit Form Conversion Profile window
You can use this window to customize the conversion profile for a specific type of file. The conversion profile includes the following contents:
- Conversion rules. The rules customize the content of the generated form. They include adding a toolbar from a template form; updating the element font, color, and border; changing the SID generation rule; and changing the element order in the XFDL source.
- XForms generation. This option is used to select the list of controls that are bound to the XForms data instance.
- Layout optimizers. The optimizers include parameters that you can adjust during the conversion of PDF elements to XFDL elements, such as parameters to create a check box from four lines and to align labels and fields. The optimizers aim to create new interactive items (check boxes, fields) on the form from static items (labels, lines) in the PDF and to align existing items (labels or fields) on the form.
Consider the following tips when you tune a conversion profile:
- Create a profile for each batch of forms. Each batch of forms can have a different style. Create a profile for every batch, and tune the parameters to achieve the best result. The profile can be exported and imported, so every form developer can use the same conversion settings.
- If the form requirement includes detailed XForms rules, consider disabling the XForms instance generation, because the generated instance name is based on the SID and usually does not fit the back-end processing.
- Enable the Use default width and height option if some labels in the converted form are truncated because the length set for the label is insufficient. This rule removes the width and height from the label so that the label uses the default size calculated by the Lotus Forms Viewer.
- Disable the “Combine Adjacent Lines/Labels” option if there are too few fields generated on the form. The combination of the lines and labels affects the field generation by the optimizer “Transform a line or a box into a field” because it removes some lines and labels on the form.
In most situations, the automatically converted forms cannot be used directly. The data instances on the form are not well organized, and the migration project usually requires that the data instance conform to a certain schema that can be submitted to the back-end processing flow.
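One simple way to see how far a converted form is from the required data structure is to compare the element names in its data instance with the names listed in the requirement spreadsheet. The following sketch only compares leaf element names; it is not schema validation. The function names, the sample instance, and the SID-style name `FIELD1` are hypothetical, and the instance is assumed to have been extracted from the form as a standalone XML string.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Set, Tuple


def leaf_names(instance_xml: str) -> Set[str]:
    """Return the local names of all leaf elements in a data instance."""
    root = ET.fromstring(instance_xml)
    # Strip any '{namespace}' prefix so that only the local name is compared.
    return {el.tag.split("}")[-1] for el in root.iter() if len(el) == 0}


def compare_instance(
    instance_xml: str, expected_names: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) element names, each sorted."""
    actual = leaf_names(instance_xml)
    expected = set(expected_names)
    return sorted(expected - actual), sorted(actual - expected)


# Hypothetical instance from a raw conversion: one element already renamed,
# one still carrying a generated SID-style name.
SAMPLE_INSTANCE = "<data><Agency_Name/><FIELD1/></data>"

missing, unexpected = compare_instance(
    SAMPLE_INSTANCE, ["Agency_Name", "Month_Count"]
)
print("missing:", missing)
print("unexpected:", unexpected)
```

A report like this gives only a first indication. Element order, nesting, and data types still need to be checked against the actual schema that the back-end flow expects.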
Manual update items
The manual update is a required step in the migration project because it addresses the features that automatic conversion cannot handle. The following steps are required:
- Adjust the detailed layout, format, logic, and SID based on the documented form spreadsheet.
- Generate the XForms data instance and XForms binding based on the business logic.
- After automatic conversion, each unconverted PDF element, such as an unrecognized color space or image, leaves an XML comment in the XFDL. A manual update is required to clear these comments or to update the unconverted items based on the comments.
- Update the digital signature based on the requirement. The automatic conversion can generate only a Clickwrap signature.
- Update the data submission based on the requirement. The auto conversion doesn’t handle the data submission in the original PDF form.
- Update graphic elements. In the PDF form, there could be graphic elements such as a vector image, an oblique line, and a curved line. These elements are not supported by Lotus Forms, but they can be replaced by using JPG or GIF images captured from the PDF form.
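Because unconverted elements are reported as XML comments, a short script can list those comments with their line numbers so that nothing is overlooked during the manual update. This is a minimal sketch under assumptions: the function name `find_comments` and the sample form are hypothetical, the comment wording is invented for illustration, and the form is assumed to be saved as uncompressed XML text. The script lists every XML comment, not only the ones written by the conversion tool.

```python
import re
from typing import List, Tuple

# Matches any XML comment, including comments that span several lines.
COMMENT_PATTERN = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_comments(xfdl_text: str) -> List[Tuple[int, str]]:
    """Return (line number, comment text) for each XML comment."""
    results = []
    for match in COMMENT_PATTERN.finditer(xfdl_text):
        # Line number of the comment start, counting from 1.
        line_number = xfdl_text.count("\n", 0, match.start()) + 1
        results.append((line_number, match.group(1).strip()))
    return results


# Hypothetical, simplified form source; real XFDL has namespaces and options.
SAMPLE_XFDL = (
    "<XFDL>\n"
    '  <page sid="PAGE1">\n'
    "    <!-- unconverted: unrecognized color space -->\n"
    '    <label sid="LABEL1"/>\n'
    "  </page>\n"
    "</XFDL>\n"
)

for line_number, text in find_comments(SAMPLE_XFDL):
    print(f"{line_number}: {text}")
```

The resulting list can be attached to the form's requirement spreadsheet as a checklist of items that still need a manual decision.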
After the manual update, the form is generated and enters the quality assurance (QA) process. Table 3 lists the items that we summarized from our migration project that need to be highlighted in the QA process.
Table 3. QA checklist
|QA item||Description|
|Font and color||Some fonts or colors in PDF do not show correctly in Lotus Forms. You need to check for any differences during the QA process.|
|Layout check||Compare the form layout in Lotus Forms Viewer or Webform Server with the original PDF. Focus on the labels and the lines. For example, in PDF the character width can be adjusted, but it is fixed in Lotus Forms. This discrepancy can cause the labels to look different.|
|Printing check||Print out both the converted form and the PDF. Compare the printed forms on paper to check any discrepancies.|
|Data validation rule||Compare the data validation on PDF and the converted form by entering data on the forms.|
|Dynamic part||PDF and XFDL both support generating dynamic contents. You can test it by generating a full set of data in the data instance.|
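For the dynamic part check, a full set of test data can be generated with a script instead of being typed by hand. The sketch below builds an XML data instance with a chosen number of rows. The element names (`data`, `rows`, `row`, `Agency_Name`, `Amount`) and the function name are hypothetical; in a real project they would follow the data instance defined in the requirement.

```python
import xml.etree.ElementTree as ET


def build_test_instance(row_count: int) -> str:
    """Build an XML data instance containing row_count repeated rows."""
    root = ET.Element("data")
    rows = ET.SubElement(root, "rows")
    for index in range(1, row_count + 1):
        row = ET.SubElement(rows, "row")
        # Distinct values per row make missing or duplicated rows easy to spot.
        ET.SubElement(row, "Agency_Name").text = f"Agency {index}"
        ET.SubElement(row, "Amount").text = str(index)
    return ET.tostring(root, encoding="unicode")


instance_xml = build_test_instance(100)
print(len(ET.fromstring(instance_xml).findall("./rows/row")), "rows generated")
```

Loading such an instance into the converted form and comparing the result with the original PDF shows whether the repeated section grows as expected at a realistic data volume.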
In general, the PDF migration project includes a well-defined requirement for each form, an automated PDF conversion tool, and manual update and QA processes. Because part of the process is automated and requires knowledge of PDF, it is different from other Lotus Forms projects, such as creating forms from scratch. The Lotus Forms conversion tool is well suited to a PDF migration project because it reduces the manual effort.
Gu Yi, Lead Software Engineer, IBM