Introduction
PDF417 is a multi-row, variable-length symbology with high data capacity and error-correction capability. Several distinctive features have made it one of the most widely used 2D symbologies. A PDF417 symbol can be read by linear scanners, laser scanners or two-dimensional scanners. A single symbol can encode more than 1100 bytes, 1800 text characters or 2710 digits. Larger data files can be encoded into a series of linked PDF417 symbols using a standard methodology referred to as Macro PDF417.
Major features of PDF417 symbology:
Character Set – All 128 ASCII characters, all 128 extended ASCII characters, and 8-bit binary data
Symbol Size – 3 to 90 rows, 90X to 583X in width
Bidirectional Decoding – Yes
Error Correction Level – 0 (no error correction) to 8 (the maximum error correction level)
Additional Options – Macro PDF417, Truncated PDF417, Global Label Identifier (GLI)
Symbol Structure
A PDF417 symbol contains 3 to 90 rows. Each row consists of the following, from left to right:
- Leading quiet zone
- Start Pattern
- Left row indicator symbol character
- 1 to 30 data symbol characters
- Right row indicator symbol character
- Stop pattern
- Trailing quiet zone
Each symbol character is 17 modules wide and always consists of 4 bars and 4 spaces. Each symbol character represents a value from 0 to 928; the specification calls these values “codewords”.
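The row layout above determines the overall width of a symbol. The following is a minimal sketch of that arithmetic, not a definitive implementation. The 17-module character width comes from the text; the start pattern width (17 modules), the stop pattern width (18 modules) and the 2-module quiet zone on each side are assumptions that the article does not state, and the function name is hypothetical.

```python
MODULES_PER_CHARACTER = 17  # stated in the text above
START_PATTERN_MODULES = 17  # assumption: width of the start pattern
STOP_PATTERN_MODULES = 18   # assumption: width of the stop pattern
QUIET_ZONE_MODULES = 2      # assumption: minimum quiet zone on each side


def row_width_in_modules(data_columns, quiet_zone=QUIET_ZONE_MODULES):
    """Return the width of one row, in modules (multiples of X)."""
    if not 1 <= data_columns <= 30:
        raise ValueError("a row holds 1 to 30 data symbol characters")
    # Data characters plus the left and right row indicators.
    characters = data_columns + 2
    return (2 * quiet_zone
            + START_PATTERN_MODULES
            + characters * MODULES_PER_CHARACTER
            + STOP_PATTERN_MODULES)


print(row_width_in_modules(1))   # 90
print(row_width_in_modules(30))  # 583
```

Under these assumptions, one data column gives 90 modules and 30 data columns give 583 modules, which agrees with the 90 to 583X width range listed among the major features.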
You can adjust the following parameters of a PDF417 symbol:
- Number of Rows
- Width of the unit (X dimension)
- Height of the unit (Y dimension)
- Number of Columns (or Aspect Ratio of the symbol)
Although you can adjust the number of rows and columns, the number of symbol characters remains constant across all rows of a given symbol; that is, a PDF417 symbol is always rectangular.
Symbol Character Encodation
Each PDF417 symbol character consists of 4 bars and 4 spaces, totaling 17 modules in width. Each bar and space can be from 1 to 6 modules wide. The possible patterns are divided into 9 clusters (character sets), numbered 0 to 8. PDF417 uses only clusters 0, 3 and 6, each of which supplies 929 patterns, one for each codeword value.
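The cluster that a given bar-space pattern belongs to can be computed from its bar widths. The sketch below uses the cluster formula as it is commonly given for PDF417, K = (b1 - b2 + b3 - b4 + 9) mod 9, where b1 to b4 are the four bar widths. This article does not state the formula, so treat it as an assumption; the function names are hypothetical.

```python
def cluster_of(widths):
    """Return the cluster number (0 to 8) of one symbol character.

    widths: the 8 element widths in modules, alternating
    bar, space, bar, space, ... starting with a bar.
    """
    if len(widths) != 8 or sum(widths) != 17:
        raise ValueError("expected 4 bars and 4 spaces totaling 17 modules")
    if any(w < 1 or w > 6 for w in widths):
        raise ValueError("each bar or space must be 1 to 6 modules wide")
    b1, b2, b3, b4 = widths[0], widths[2], widths[4], widths[6]
    # Assumed cluster formula; it is not taken from this article.
    return (b1 - b2 + b3 - b4 + 9) % 9


def is_used_by_pdf417(widths):
    """True if the pattern falls in one of the clusters PDF417 uses."""
    return cluster_of(widths) in (0, 3, 6)
```

For example, the widths (3, 1, 1, 1, 1, 1, 3, 6) fall in cluster 0, while (2, 2, 1, 2, 1, 2, 1, 6) fall in cluster 1 and would therefore not appear in a PDF417 symbol.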
Row Encoding
Each row uses character patterns from a single cluster. Adjacent rows use different clusters, in the repeating sequence 0, 3, 6, 0, 3, 6. With rows numbered from 1:
Cluster number = ((row number - 1) mod 3) * 3
Each row starts with a left row indicator and ends with a right row indicator. These row indicators are symbol characters whose values are based on the row number, the total number of rows, the number of columns and the error correction level.
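The cluster rule for rows is short enough to sketch directly. This minimal example assumes rows are numbered from 1, as the formula implies; the function name is hypothetical.

```python
def cluster_for_row(row_number):
    """Return the cluster (0, 3 or 6) used by a row, counting rows from 1."""
    if row_number < 1:
        raise ValueError("row numbers start at 1")
    return ((row_number - 1) % 3) * 3


print([cluster_for_row(row) for row in range(1, 7)])  # [0, 3, 6, 0, 3, 6]
```

Because neighboring rows never share a cluster, a scan line that drifts across a row boundary can be detected from the patterns alone.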
Compaction Mode
The data is encoded using one of three compaction modes: Text Compaction mode, which encodes alphanumeric characters and punctuation; Binary Compaction mode, which encodes all 8-bit characters; and Numeric Compaction mode, which achieves the highest density by allowing only digits. The default mode is Text Compaction. Special codewords switch from one compaction mode to another.
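To give a feel for why Numeric Compaction is the densest mode, here is a sketch of its packing step as it is usually described for PDF417: a group of up to 44 digits is prefixed with the digit 1, and the resulting number is written in base 900, one codeword per base-900 digit. The article does not spell this procedure out, so the group size, the leading 1 and the function name are assumptions, and the mode switch codewords are left out.

```python
def numeric_compaction_group(digits):
    """Pack one group of up to 44 decimal digits into codewords (0 to 899)."""
    if not 1 <= len(digits) <= 44:
        raise ValueError("a group holds 1 to 44 digits")
    if any(ch not in "0123456789" for ch in digits):
        raise ValueError("Numeric Compaction accepts digits only")
    # The leading 1 preserves any leading zeros in the group.
    value = int("1" + digits)
    codewords = []
    while value > 0:
        value, remainder = divmod(value, 900)
        codewords.append(remainder)
    codewords.reverse()  # most significant codeword first
    return codewords


print(numeric_compaction_group("000213298174000"))
# [1, 624, 434, 632, 282, 200]
```

Here 15 digits fit into 6 codewords, and a full group of 44 digits fits into 15 codewords, or close to 3 digits per codeword.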
Symbol characters with values from 900 to 928 are reserved for control purposes. These control characters include mode latch and mode shift codewords. A mode latch switches to a new mode, which stays in effect until another mode switch is performed; a mode shift switches temporarily from Text Compaction to Binary Compaction. The codeword values are used as follows:
| Value | Usage |
|---|---|
| 0-899 | Symbol characters used to encode actual data; the meaning depends on the compaction mode |
| 900 | Mode Latch to Text Compaction |
| 901 | Mode Latch to Binary Compaction |
| 902 | Mode Latch to Numeric Compaction |
| 913 | Mode Shift to Binary Compaction |
| 924 | Mode Latch to Binary Compaction (used when the number of bytes is a multiple of 6) |
| 925, 926, 927 | Used for GLI interpretation |
| 922, 923, 928 | Used for Macro PDF417 control blocks |
| 921 | Reader Initialization (Macro PDF417) |
| 903-912, 914-920 | Reserved for future use |
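A decoder has to sort each codeword into one of the groups in the table above before it can interpret it. The following sketch shows one way to do that; the function name and the category labels are hypothetical, and the value ranges are taken directly from the table.

```python
MODE_LATCH_CODEWORDS = frozenset({900, 901, 902, 924})
MODE_SHIFT_CODEWORDS = frozenset({913})
GLI_CODEWORDS = frozenset({925, 926, 927})
MACRO_CODEWORDS = frozenset({922, 923, 928})
READER_INIT_CODEWORDS = frozenset({921})


def classify_codeword(value):
    """Return the usage category of a codeword value (0 to 928)."""
    if not 0 <= value <= 928:
        raise ValueError("codeword values range from 0 to 928")
    if value <= 899:
        return "data"
    if value in MODE_LATCH_CODEWORDS:
        return "mode latch"
    if value in MODE_SHIFT_CODEWORDS:
        return "mode shift"
    if value in GLI_CODEWORDS:
        return "GLI"
    if value in MACRO_CODEWORDS:
        return "Macro PDF417"
    if value in READER_INIT_CODEWORDS:
        return "reader initialization"
    # 903-912 and 914-920
    return "reserved"
```

For instance, `classify_codeword(913)` returns `"mode shift"` and `classify_codeword(905)` returns `"reserved"`.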
Global Label Identifier (GLI)
Currently, most PDF417 symbols are based on the default GLI 0, which corresponds to the ISO 8859-1 character set. It is possible to encode data in other languages, such as Japanese and Chinese. This usage, however, is not widely recognized. For more information about GLI allocation, refer to the AIM-USA document Global Label Identifier (GLI) Assignments. In this article we assume that the GLI value is 0 unless otherwise noted.
Error Correction Capacity
Each PDF417 symbol contains 2 to 512 error correction codewords, corresponding to error correction levels 0 (the lowest) to 8 (the highest). The number of error correction codewords doubles with each level, as follows:
| Error Correction Level | Number of Error Correction Codewords |
|---|---|
| 0 | 2 |
| 1 | 4 |
| 2 | 8 |
| 3 | 16 |
| 4 | 32 |
| 5 | 64 |
| 6 | 128 |
| 7 | 256 |
| 8 | 512 |
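The values in the table follow a simple pattern: level n uses 2 to the power (n + 1) error correction codewords. A minimal sketch, with a hypothetical function name:

```python
def error_correction_codewords(level):
    """Return the number of error correction codewords for levels 0 to 8."""
    if not 0 <= level <= 8:
        raise ValueError("error correction level must be 0 to 8")
    return 2 ** (level + 1)


print([error_correction_codewords(level) for level in range(9)])
# [2, 4, 8, 16, 32, 64, 128, 256, 512]
```

Since every symbol row has a fixed number of columns, raising the level by one doubles the space taken by error correction and leaves correspondingly less room for data.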