PDF417 Barcode Specification

Introduction

PDF417 is a multi-row, variable-length symbology with high data capacity and error-correction capability. Several features have made it one of the most widely used 2D symbologies. A PDF417 symbol can be read by linear scanners, laser scanners or two-dimensional scanners. A single symbol can encode more than 1100 bytes, 1800 text characters or 2710 digits. Larger data files can be split across a series of linked PDF417 symbols using a standard method called Macro PDF417.

Major features of PDF417 symbology:

Character Set – All 128 ASCII characters, all 128 extended ASCII characters, and 8-bit binary data

Symbol Size – 3 to 90 rows, 90 to 583X in width

Bidirectional Decoding – Yes

Error Correction Level – 0 (the lowest level, 2 error correction codewords) to 8 (the highest level, 512 error correction codewords)

Additional Options – Macro PDF417, Truncated PDF417, Global Label Identifier (GLI)

Symbol Structure

A PDF417 symbol contains 3 to 90 rows. Each row consists of the following, from left to right:

  • Leading quiet zone
  • Start Pattern
  • Left row indicator symbol character
  • 1 to 30 data symbol characters
  • Right row indicator symbol character
  • Stop pattern
  • Trailing quiet zone

Each symbol character is 17 modules wide and always consists of 4 bars and 4 spaces. Each symbol character represents a value from 0 to 928; these values are called “codewords” in the specification.
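The row layout determines the symbol width once the number of data columns is chosen. The following is a rough sketch of that arithmetic. It assumes that the stop pattern is 18 modules wide (one more than the other patterns) and that each quiet zone is the 2X minimum; both figures come from the PDF417 standard rather than from the text above. The function name `symbol_width_modules` is illustrative.

```python
MODULES_PER_CHARACTER = 17   # start pattern, row indicators and data characters
STOP_PATTERN_MODULES = 18    # assumption: the stop pattern is one module wider
MIN_QUIET_ZONE_MODULES = 2   # assumption: minimum quiet zone of 2X on each side


def symbol_width_modules(data_columns: int,
                         quiet_zone: int = MIN_QUIET_ZONE_MODULES) -> int:
    """Return the width of one PDF417 row in modules (multiples of X)."""
    if not 1 <= data_columns <= 30:
        raise ValueError("data_columns must be between 1 and 30")
    # start pattern + left row indicator + data characters + right row indicator
    characters = 1 + 1 + data_columns + 1
    printed = characters * MODULES_PER_CHARACTER + STOP_PATTERN_MODULES
    return quiet_zone + printed + quiet_zone


print(symbol_width_modules(1))   # 90
print(symbol_width_modules(30))  # 583
```

Under these assumptions, the results for 1 and 30 data columns match the 90X to 583X width range listed among the major features.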

You can adjust the following parameters of a PDF417 symbol:

  • Number of Rows
  • Width of the unit (X dimension)
  • Height of the unit (Y dimension)
  • Number of Columns (or Aspect Ratio of the symbol)

Although you can adjust the number of rows and columns, the number of symbol characters remains constant across all rows of a given symbol; that is, a PDF417 symbol is always rectangular.
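Because every row holds the same number of symbol characters, the codewords must fill a rows × columns grid exactly, and any shortfall has to be made up with padding. The sketch below chooses a row count for a given column count. The ceiling of 928 codewords per symbol is taken from the PDF417 standard, not from the text above, and `choose_rows` is a hypothetical helper name.

```python
import math

MIN_ROWS, MAX_ROWS = 3, 90
MAX_COLUMNS = 30
MAX_CODEWORDS = 928  # assumption: total codewords (data + error correction) per symbol


def choose_rows(total_codewords: int, columns: int) -> tuple[int, int]:
    """Return (rows, padding) so that rows * columns holds every codeword."""
    if not 1 <= columns <= MAX_COLUMNS:
        raise ValueError("columns must be between 1 and 30")
    if not 1 <= total_codewords <= MAX_CODEWORDS:
        raise ValueError("codeword count out of range")
    # At least 3 rows are required even when fewer would be enough.
    rows = max(MIN_ROWS, math.ceil(total_codewords / columns))
    if rows > MAX_ROWS or rows * columns > MAX_CODEWORDS:
        raise ValueError("too many codewords for this column count")
    padding = rows * columns - total_codewords
    return rows, padding


print(choose_rows(20, 6))  # (4, 4)
print(choose_rows(5, 4))   # (3, 7)
```

A real encoder would typically also weigh the requested aspect ratio when picking the column count; this sketch takes the column count as given.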

Symbol Character Encodation

Each PDF417 symbol character consists of 4 bars and 4 spaces that total 17 modules in width. Each bar or space is 1 to 6 modules wide. These constraints allow 10,480 distinct patterns, which the specification partitions into nine clusters numbered 0 to 8. PDF417 uses only clusters 0, 3 and 6, and each of these clusters (character sets) supplies 929 patterns, one for each codeword value.
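The pattern constraints can be checked by brute force. The sketch below enumerates every combination of 8 element widths (each 1 to 6 modules) that sums to 17 modules and groups the results by cluster. The cluster formula K = (b1 - b2 + b3 - b4 + 9) mod 9, where b1 to b4 are the bar widths, is assumed from the PDF417 standard and is not stated in the text above. The function names are illustrative.

```python
from itertools import product


def cluster_number(widths: tuple[int, ...]) -> int:
    """Cluster of a pattern given as (bar, space, bar, space, ...) widths."""
    b1, b2, b3, b4 = widths[0], widths[2], widths[4], widths[6]
    # Assumption: cluster formula as defined in the PDF417 standard.
    return (b1 - b2 + b3 - b4 + 9) % 9


def count_patterns() -> dict[int, int]:
    """Count all 8-element, 17-module patterns, grouped by cluster number."""
    counts = {k: 0 for k in range(9)}
    for first_seven in product(range(1, 7), repeat=7):
        # The eighth width is forced by the 17-module total.
        last = 17 - sum(first_seven)
        if 1 <= last <= 6:
            counts[cluster_number(first_seven + (last,))] += 1
    return counts


counts = count_patterns()
print(sum(counts.values()))  # 10480
```

The symbology then selects 929 patterns from each of clusters 0, 3 and 6; the actual pattern tables are defined by the specification and are not reproduced here.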

Row Encoding

Each row uses character patterns from a single cluster. Adjacent rows use different clusters, cycling through the sequence 0, 3, 6, 0, 3, 6, and so on:

Cluster number = ((row number - 1) mod 3) * 3

where rows are numbered starting from 1. Each row starts with a left row indicator and ends with a right row indicator. These row indicators are symbol characters whose values are based on the row number, the total number of rows, the number of columns and the error correction level.
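The cluster rule and the row indicators can be sketched as follows. The text above does not give the indicator formulas; the ones used here follow the arrangement in the PDF417 standard as commonly implemented in open-source encoders, and should be verified against the specification before use. Rows are numbered from 1, and the function names are illustrative.

```python
def cluster_for_row(row_number: int) -> int:
    """Cluster (0, 3 or 6) used by a row; rows are numbered from 1."""
    return ((row_number - 1) % 3) * 3


def row_indicators(row_number: int, rows: int, columns: int,
                   ec_level: int) -> tuple[int, int]:
    """Return (left, right) row indicator codeword values for one row.

    Assumption: each indicator carries one of three pieces of information
    (row count, error correction level, column count), rotated by cluster.
    """
    base = 30 * ((row_number - 1) // 3)
    row_info = (rows - 1) // 3
    level_info = ec_level * 3 + (rows - 1) % 3
    column_info = columns - 1
    cluster = cluster_for_row(row_number)
    if cluster == 0:
        return base + row_info, base + column_info
    if cluster == 3:
        return base + level_info, base + row_info
    return base + column_info, base + level_info


print([cluster_for_row(r) for r in range(1, 7)])  # [0, 3, 6, 0, 3, 6]
print(row_indicators(1, rows=10, columns=5, ec_level=2))  # (3, 4)
```

Spreading the three values over consecutive rows means a reader can recover the symbol dimensions and error correction level from any three adjacent rows.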

Compaction Mode

The data is encoded using one of three compaction modes: Text Compaction mode, which encodes alphanumeric characters and punctuation; Binary Compaction mode, which encodes all 8-bit characters; and Numeric Compaction mode, which achieves the highest density by allowing only digits. The default mode is Text Compaction mode. Special codewords switch from one compaction mode to another.
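To illustrate why Numeric Compaction is the densest mode, here is a minimal sketch of its arithmetic. It assumes the scheme described in the PDF417 standard (not detailed above): digits are taken in groups of up to 44, a leading 1 is prepended so that leading zeros survive, and the resulting number is written in base 900. The function name is illustrative.

```python
def numeric_compaction(digits: str) -> list[int]:
    """Encode a digit string into base-900 codewords, 44 digits at a time."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric compaction accepts digits only")
    codewords = []
    for start in range(0, len(digits), 44):
        group = digits[start:start + 44]
        value = int("1" + group)  # the leading 1 preserves leading zeros
        group_codewords = []
        while value > 0:
            value, remainder = divmod(value, 900)
            group_codewords.append(remainder)
        # Remainders come out least significant first, so reverse them.
        codewords.extend(reversed(group_codewords))
    return codewords


print(numeric_compaction("000213298174000"))  # [1, 624, 434, 632, 282, 200]
```

In this example 15 digits fit into 6 codewords, and a full group of 44 digits fits into 15 codewords.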

Symbol characters with values from 900 to 928 are reserved for control purposes. These control characters include mode latch and mode shift codewords. A mode latch character switches to a new mode that stays in effect until another mode switch is performed; the mode shift character switches from Text Compaction mode to Binary Compaction mode temporarily, for the codeword that follows.

Value               Usage
0-899               Symbol characters used to encode actual data; the interpretation depends on the compaction mode
900                 Mode Latch to Text Compaction
901                 Mode Latch to Binary Compaction
902                 Mode Latch to Numeric Compaction
913                 Mode Shift to Binary Compaction
924                 Mode Latch to Binary Compaction (used when the number of bytes is a multiple of 6)
925, 926, 927       Used for GLI interpretation
922, 923, 928       Used for Macro PDF417 control blocks
921                 Reader Initialization
903-912, 914-920    Reserved for future use
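The control codewords can be seen at work in a small codeword stream. The sketch below packs uppercase text in Text Compaction mode and uses the mode shift codeword 913 to insert a single byte before returning to text. The sub-mode details are assumptions drawn from the PDF417 standard rather than from the text above: letters A to Z map to 0 to 25, a space maps to 26, two values are packed per codeword as 30 * first + second, and an odd-length run is padded with the value 29. All names are illustrative.

```python
SHIFT_TO_BINARY = 913  # Mode Shift to Binary Compaction
TEXT_PAD = 29          # assumption: value used to pad an odd-length text run


def text_compaction_upper(text: str) -> list[int]:
    """Pack uppercase letters and spaces two per codeword."""
    values = []
    for ch in text:
        if "A" <= ch <= "Z":
            values.append(ord(ch) - ord("A"))
        elif ch == " ":
            values.append(26)
        else:
            raise ValueError(f"unsupported character in this sketch: {ch!r}")
    if len(values) % 2:
        values.append(TEXT_PAD)
    return [30 * values[i] + values[i + 1] for i in range(0, len(values), 2)]


def encode_with_byte_shift(before: str, byte_value: int, after: str) -> list[int]:
    """Text, then one shifted byte, then text again (no latch needed)."""
    if not 0 <= byte_value <= 255:
        raise ValueError("byte_value must be between 0 and 255")
    # Text Compaction is the default mode, so the stream starts without a latch.
    return (text_compaction_upper(before)
            + [SHIFT_TO_BINARY, byte_value]
            + text_compaction_upper(after))


print(encode_with_byte_shift("HELLO", 0xE9, "WORLD"))
# [214, 341, 449, 913, 233, 674, 521, 119]
```

A longer run of bytes would instead use a latch (901 or 924) and stay in Binary Compaction mode until another latch codeword appears.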

Global Label Identifier (GLI)

Most PDF417 symbols currently use the default GLI 0, which corresponds to the ISO 8859-1 character set. Data in other languages, such as Japanese and Chinese, can also be encoded, but this usage is not widely supported. For more information about GLI allocation, refer to the AIM-USA document Global Label Identifier (GLI) Assignments. This article assumes a GLI value of 0 unless otherwise noted.

Error Correction Capacity

Each PDF417 symbol contains 2 to 512 error correction codewords, corresponding to error correction levels 0 (the lowest) to 8 (the highest). The number of error correction codewords doubles with each level, that is, it equals 2^(level + 1):

Error Correction Level    Number of Error Correction Codewords
0                         2
1                         4
2                         8
3                         16
4                         32
5                         64
6                         128
7                         256
8                         512
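The table follows a simple doubling rule, which can be expressed in code. The sketch below reproduces the table and also estimates how many codewords remain for data at each level. The 928-codeword limit per symbol is an assumption taken from the PDF417 standard, and the function names are illustrative.

```python
MAX_CODEWORDS = 928  # assumption: total codewords per symbol, from the standard


def error_correction_codewords(level: int) -> int:
    """Number of error correction codewords for levels 0 to 8."""
    if not 0 <= level <= 8:
        raise ValueError("error correction level must be between 0 and 8")
    return 2 ** (level + 1)


def max_data_codewords(level: int) -> int:
    """Codewords left for data once error correction is accounted for."""
    return MAX_CODEWORDS - error_correction_codewords(level)


for level in range(9):
    print(level, error_correction_codewords(level), max_data_codewords(level))
```

At level 8 more than half of the symbol is devoted to error correction, so higher levels trade capacity for robustness.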