AU2004242417A1

AU2004242417A1 - Tamper detection and correction of documents using error correcting codes

Info

Publication number: AU2004242417A1
Application number: AU2004242417A
Authority: AU
Inventors: Eric Lap Min Cheung; Stephen Edward Ecob; Stephen Farrar
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-12-21
Filing date: 2004-12-21
Publication date: 2006-07-06

Description


S&F Ref: 694656

AUSTRALIA

PATENTS ACT 1990

COMPLETE SPECIFICATION FOR A STANDARD PATENT

Name and Address of Applicant: Canon Kabushiki Kaisha, of 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, 146, Japan

Actual Inventor(s): Eric Lap Min Cheung, Stephen Farrar, Stephen Edward Ecob

Address for Service: Spruson & Ferguson, St Martins Tower Level 31 Market Street, Sydney NSW 2000 (CCN 3710000177)

Invention Title: Tamper detection and correction of documents using error correcting codes

The following statement is a full description of this invention, including the best method of performing it known to me/us:

TAMPER DETECTION AND CORRECTION OF DOCUMENTS USING ERROR CORRECTING CODES

Field of the Invention

The present invention relates generally to documents and, in particular, to document protection. The present invention also relates to a method and apparatus for generating, printing and reading protected documents, and to a computer program product including a computer readable medium having recorded thereon a computer program for generating, printing and reading protected documents.

Background

It is often desirable to ensure that a printed document has not been altered or tampered with in some unauthorised manner from the time the document was first printed. For example, a contract that has been agreed upon and signed on some date may subsequently be fraudulently altered. It is desirable to be able to detect such alterations in detail. Similarly, security documents of various sorts, including cheques and monetary instruments, record values which are vulnerable to fraudulent alteration. Detection of any fraudulent alteration in such documents is also desirable. Further, it is desirable that such detection be performed automatically, and that the detection reveal the exact nature of any alteration.

In addition to detection of fraudulent alteration or tampering with a document, it is desirable that printed documents offer a visible deterrent to fraudulent alteration. In the event of fraudulent alteration, it is desirable that an original of the altered printed document can be reliably reconstructed from the altered printed document.

Various methods of deterring and detecting fraudulent alteration to documents have been proposed and used.

One class of methods in use before high quality color scanners and printers became commonly available was to print important information such as monetary amounts in special fonts or with special shadows that were, at the time, difficult to reproduce. However, with modern printers and scanners, such techniques have become vulnerable to attack.

One known method of detecting alteration of a document uses a two dimensional (2D) barcode printed on one part of a document page to encode (possibly cryptographically) a representation of some other portion of the document, such as a signature area. This 2D barcode can be decoded and a resulting image compared by an operator to the area the barcode is intending to represent to check for similarity. Existing variants of such barcode protection may be divided into two categories.

The first category of 2D-barcode protection involves embedding a portion of a document's semantic information into a 2D barcode. Often, such semantic information may be hashed and encrypted. However, this first category of barcode protection does not allow non-textual documents to be protected. The second category of 2D-barcode protection treats a document as an image and embeds a portion of the image in a barcode.

However, embedding a portion of the image in the barcode may cause the barcode to become very large. In this instance, automatic verification at a fine granularity is not possible, as the image embedded in the barcode cannot be automatically lined up with the received document.

A related body of work is detection of tampering in digital images that are not subject to print/scan cycles. A number of "fragile watermark" methods are known.

However, these methods are generally not applicable to tamper detection in printed documents since they cannot withstand the introduction of noise, Rotation, Scaling and Translation (RST), re-sampling, and local distortion that occurs in a print/scan cycle.

Some of these fragile watermark methods operate by replacing all or some of the least significant bits of pixels of an image with some form of checksum of remaining bits in each pixel.
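The least-significant-bit approach described above can be illustrated with a minimal sketch. It assumes 8-bit greyscale pixels and uses a single parity bit over the upper seven bits as the "checksum"; both choices, and the function names, are illustrative assumptions rather than details of any particular known method.

```python
def embed_fragile_watermark(pixels):
    """Replace each pixel's LSB with the parity of its upper seven bits."""
    out = []
    for p in pixels:
        upper = p >> 1                      # the seven most significant bits
        parity = bin(upper).count("1") & 1  # 1-bit checksum of those bits
        out.append((upper << 1) | parity)
    return out


def find_tampered(pixels):
    """Return indices whose LSB no longer matches the parity of the upper bits."""
    return [i for i, p in enumerate(pixels)
            if (p & 1) != (bin(p >> 1).count("1") & 1)]


marked = embed_fragile_watermark([200, 17, 64, 255])
tampered = list(marked)
tampered[2] ^= 0b00010000           # flip one upper bit of pixel 2

print(find_tampered(marked))        # untouched image: no mismatches
print(find_tampered(tampered))      # altered image: pixel 2 is flagged
```

Because the check depends on exact bit values, any re-sampling or noise of the kind introduced by printing and scanning would flag almost every pixel, which is why such schemes do not carry over to printed documents.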

A number of "semi-fragile watermark" methods are also known. These include methods that use cross-correlation to detect the presence of a lightly embedded shifted copy of a portion of an image. Another known semi-fragile watermark method embeds watermarks into image blocks, and then compares the detection strength of these watermarks to discern if any blocks have been altered. These semi-fragile watermark methods tend to have less localisation ability as their detection ability improves, and as their localisation ability improves, these methods become more sensitive to noise and other distortions and so cannot be used to detect local changes in printed documents.

Other known methods of detecting alterations in digital images use special materials to make alteration difficult. Such methods include laminates covering the printed surface of a document where damage to the laminate is obvious. However using special materials introduces production complexity, and is not applicable to plain paper applications. These known methods are also not amenable to automatic detection.

An additional failing in many existing methods is weak cryptographic security.

In many cases, once a cryptographic algorithm being employed is identified, identification leads directly to a subversion method to attack the identified method.

Another common failing of present methods of detecting alterations to digital images is the distribution of alteration detection information over wide areas of a page, or even areas completely separate to the image area to be authenticated (as in the barcode method above). This introduces problems if there is incidental soiling of the document in areas apart from the image area being authenticated. Many of these methods cannot be used to authenticate the entire area of a document, so documents must be specifically designed to accommodate the methods.

A still further class of methods of detecting alterations to documents uses independent transfer of information about the original unaltered form of a document to verify the document. This could be as simple as a telephone call to a person with independent knowledge, and may extend to keeping a complete copy of the document in a secure location. Such methods have many practical disadvantages since they require handling and storage of such independent information.

Thus, a need clearly exists for more efficient methods of generating, printing and reading protected documents.

Summary

It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.

According to one aspect of the present invention there is provided a method of generating a protected document, said method comprising the steps of: generating a block-based correlatable pattern of data; encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

According to another aspect of the present invention there is provided a method of generating a protected document, said method comprising the steps of: generating one or more data patterns based on a mathematical function having a predetermined property; arranging the generated data patterns in a border region of said protected document; generating a block-based correlatable pattern of data; arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

According to still another aspect of the present invention there is provided a method of generating a protected document, said method comprising the steps of: generating one or more spiral data patterns; arranging the spiral data patterns in a border region of said protected document; generating a noise pattern using random data; arranging the random data in an interior region of said protected document according to a predetermined arrangement; and encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

According to still another aspect of the present invention there is provided an apparatus for generating a protected document, said apparatus comprising: generating means for generating a block-based correlatable pattern of data; data encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging means for arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

According to still another aspect of the present invention there is provided an apparatus for generating a protected document, said apparatus comprising: first generating means for generating one or more data patterns based on a mathematical function having a predetermined property; first arranging means for arranging the generated data patterns in a border region of said protected document; second generating means for generating a block-based correlatable pattern of data; second arranging means for arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and third arranging means for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

According to still another aspect of the present invention there is provided an apparatus for generating a protected document, said apparatus comprising: first generating means for generating one or more spiral data patterns; first arranging means for arranging the spiral data patterns in a border region of said protected document; second generating means for generating a noise pattern using random data; second arranging means for arranging the random data in an interior region of said protected document according to a predetermined arrangement; encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and third arranging means for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

According to still another aspect of the present invention there is provided a computer program for generating a protected document, said program comprising: code for generating a block-based correlatable pattern of data; code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

According to still another aspect of the present invention there is provided a computer program for generating a protected document, said program comprising: code for generating one or more data patterns based on a mathematical function having a predetermined property; code for arranging the generated data patterns in a border region of said protected document; code for generating a block-based correlatable pattern of data; code for arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

According to still another aspect of the present invention there is provided a computer program for generating a protected document, said program comprising: code for generating one or more spiral data patterns; code for arranging the spiral data patterns in a border region of said protected document; code for generating a noise pattern using random data; code for arranging the random data in an interior region of said protected document according to a predetermined arrangement; and code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

Other aspects of the invention are also disclosed.

Brief Description of the Drawings

One or more embodiments of the present invention will now be described with reference to the drawings and appendices, in which:

Fig. 1 is a schematic block diagram of a general-purpose computer upon which arrangements described may be practiced;
Fig. 2A shows a protected document;
Fig. 2B shows alignment pixels in the interior of the protected document of Fig. 2A;
Fig. 3 shows the mapping of coordinates from protected document interior pixels to a coarsely aligned image and a scanned image of the document of Fig. 2A;
Fig. 4 is a flow diagram showing a method of generating the protected document of Fig. 2A;
Fig. 5 is a flow diagram showing a method of reading the protected document of Fig. 2A;
Fig. 6 shows a plot of the real part of a Logarithmic Radial Harmonic Function (LRHF);
Fig. 7 shows a spiral bitmap;
Fig. 8 shows the location of spirals embedded in the protected document of Fig. 2A;
Fig. 9 is a flow diagram showing a method of determining a coarse alignment affine transform, as executed in the method of Fig.;
Fig. 10 shows the protected document of Fig. 2A with its border divided into squares;
Fig. 11 is a flow diagram showing a method of storing data in the border of the protected document of Fig. 2A, as executed in the method of Fig. 4;
Fig. 12 is a flow diagram showing a method of extracting salt data from the border of the protected document of Fig. 2A, as executed in the method of Fig.;
Fig. 13 is a flow diagram showing a method of determining a fine alignment warp map for a scanned image of the protected document of Fig. 2A, as executed in the method of Fig.;
Fig. 14A is a flow diagram showing a method of generating alignment pixels in the interior of the protected document of Fig. 2A for documents that do not have a dominant amount of one color, as executed in the method of Fig. 4;
Fig. 14B is a flow diagram showing a method of generating alignment pixels in the interior of the protected document of Fig. 2A for documents that have a dominant amount of one color, as executed in the method of Fig. 4;
Fig. 15 is a flow diagram showing a method of generating a reference image for the protected document of Fig. 2A, as executed in the method of Fig. 13;
Fig. 16A shows a correlation tile of the reference image, which may be used in the method of Fig. 13;
Fig. 16B shows a correlation tile of the coarsely-aligned image, corresponding to the correlation tile of the reference image of Fig. 16A;
Fig. 17 is a flow diagram showing a method of generating a displacement map for the scanned image of the protected document of Fig. 2A, as executed in the method of Fig. 13;
Fig. 18 shows an example of two overlapping correlation tiles;
Fig. 19 is a flow diagram showing an alternative method for determining the Fast Fourier Transform (FFT) of correlation tiles, as executed in the method of Fig. 17;
Fig. 20 is a flow diagram showing a method of interpolating a mapping, as executed in the method of Fig. 13;
Fig. 21 is a flow diagram showing a method of determining the location of a highest peak in a correlation image to sub-pixel accuracy, as executed in the method of Fig. 17;
Fig. 22 is a flow diagram showing a method of encoding a document to be protected into a one dimensional (1D) document array and a one dimensional (1D) protection array, as executed in the method of Fig. 4;
Fig. 23 is a flow diagram showing a method of arranging the two 1D arrays of Fig. 22 to form the protected document of Fig. 2A, as executed in the method of Fig. 4;
Fig. 24 is a flow diagram showing a method of extracting the two 1D arrays of Fig. 22 from the scanned image of the protected document, as executed in the method of Fig.;
Fig. 25 is a flow diagram showing a method of indicating the location of alterations to the protected document and generating an image correcting the alterations;
Fig. 26 is a flow diagram showing a method of locating six peaks corresponding to each of the spirals embedded in the protected document of Fig. 2A;
Fig. 27 is a flow diagram showing a method of determining the dimensions of the protected document of Fig. 2A, as executed in the method of Fig. 9;
Fig. 28A shows a document;
Fig. 28B shows the document of Fig. 28A with an alignment pattern generated in accordance with the method of Fig. 14A;
Fig. 28C shows the document of Fig. 28A with an alignment pattern generated in accordance with the method of Fig. 14B;
Fig. 29 is a flow diagram showing a method of generating a coarsely-aligned image for the scanned image of the protected document of Fig. 2A, as executed in the method of Fig. 13;
Fig. 30 is a flow diagram showing a method of generating a correlation image as executed in the method of Fig. 17;
Fig. 31 is a flow diagram showing a method of determining constant vectors as executed in the method of Fig. 21;
Fig. 32 is a flow diagram showing a method of determining a width for the protection barcode of the protected document of Fig. 2A;
Fig. 33 is a flow diagram showing a method of determining the width of the protection barcode for the protected document of Fig. 2A when verifying the protected document;
Fig. 34 is a flow diagram showing a method of generating a pseudo-random permutation, as executed in the method of Fig. 22; and
Fig. 35 is a flow diagram showing a method of generating an inverse pseudo-random permutation, as executed in the method of Fig. 22.

Detailed Description including Best Mode

Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.

It is to be noted that the discussions contained in the "Background" section and that above relating to prior art arrangements relate to discussions of devices which form public knowledge through their respective publication and/or use. Such should not be interpreted as a representation by the present inventor(s) or patent applicant that such devices in any way form part of the common general knowledge in the art.

For ease of explanation the following description has been divided into Sections, up to Section 7.2, each section having associated subsections.

Introduction

1.1 System for generating and reading barcodes

The methods described herein may be practiced using a general-purpose computer system 100, such as that shown in Fig. 1, wherein the processes of Figs. 2 to 35 may be implemented as software, such as an application program executing within the computer system 100. In particular, the steps of the described methods may be effected by instructions in the software that are carried out by the computer. The instructions may be formed as one or more code modules, each for performing one or more particular tasks.

The software may also be divided into two separate parts, in which a first part performs the described methods and a second part manages a user interface between the first part and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software may be loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer effects an advantageous apparatus for implementing the described methods.

The computer system 100 is formed by a computer module 101, input devices such as a keyboard 102, mouse 103 and a scanner 119, and output devices including a printer 115, a display device 114 and loudspeakers 117. The printer 115 may be in the form of an electro-photographic printer, an ink jet printer or the like. The printer may be used to print barcodes as described below. The scanner 119 may be in the form of a flatbed scanner, for example, which may be used to scan a barcode in order to generate a scanned image of the barcode. The scanner 119 may be configured within the chassis of a multifunction printer.

A Modulator-Demodulator (Modem) transceiver device 116 may be used by the computer module 101 for communicating to and from a communications network 120, for example, connectable via a telephone line 121 or other functional medium. The modem 116 may be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN), and may be incorporated into the computer module 101 in some implementations. In one implementation, the printer 115 and/or scanner 119 may be connected to the computer module 101 via such communication networks.

The computer module 101 typically includes at least one processor unit 105, and a memory unit 106, for example formed from semiconductor random access memory (RAM) and read only memory (ROM). The module 101 also includes a number of input/output (I/O) interfaces. These I/O interfaces include an audio-video interface 107 that couples to the video display 114 and loudspeakers 117, an I/O interface 113 for the keyboard 102 and mouse 103 and optionally a joystick (not illustrated), and an interface 108 for the modem 116, printer 115 and scanner 119. In some implementations, the modem 116 may be incorporated within the computer module 101, for example within the interface 108. A storage device 109 may be provided and typically includes a hard disk drive 110 and a floppy disk drive 111. A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 112 may be provided as a non-volatile source of data.

The components 105 to 113 of the computer module 101 typically communicate via an interconnected bus 104 and in a manner that results in a conventional mode of operation of the computer system 100 known to those in the relevant art. Examples of computers on which the described arrangements can be practised include IBM-PC's and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.

Typically, the application program is resident on the hard disk drive 110 and read and controlled in its execution by the processor 105. Intermediate storage of the program and any data fetched from the network 120 may be accomplished using the semiconductor memory 106, possibly in concert with the hard disk drive 110. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 112 or 111, or alternatively may be read by the user from the network 120 via the modem device 116. Still further, the software may be loaded into the computer system 100 from other computer readable media. The term "computer readable medium" as used herein refers to any storage or transmission medium that participates in providing instructions and/or data to the computer system 100 for execution and/or processing. Examples of storage media include floppy disks, magnetic tape, CD-ROM, a hard disk drive, a ROM or integrated circuit, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 101. Examples of transmission media include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.

The methods described below may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the described methods. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.

A document to be protected, as described below, may be stored in an electronic file of a file-system configured within the memory 106 or hard disk drive 110 of the computer module 101, for example. Similarly, the data read from a protected document may also be stored in the hard disk drive 110 or memory 106 upon the protected document being read. Alternatively, the document to be protected may be generated on-the-fly by a software application program resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The data read from a protected document may also be processed by such an application program.

1.2 Elements making up a protected document

The term 'document' as referred to below refers to a bi-level image. Text documents and the like may be converted into bi-level images before being tamper-protected in accordance with the methods described below. The term 'protected document' refers to a document (i.e., a bi-level image) with additional features appended to the document that allow for automatic per-pixel tamper detection and correction of the document.

When a protected document is printed, pixels of the protected document are represented as squares of ink on paper, for example. Each pixel, or square of ink, represents one bit of information. The presence or absence of ink at the position of a particular square on the paper indicates whether the bit represented by the particular square is "on" or "off" respectively. Ink of one color may be used in the printing of protected documents. This color may be black.

The dimensions of a protected document may be specified by the width (Wp) and height (Hp) in pixels of an interior region of the protected document, as will be described in detail below. The physical size of a printed protected document may be determined by the size of each pixel in the printed protected document on a page. The physical size of the printed protected document is determined by the resolution of printing. For example, the protected document may be printed at a resolution of 150 dots-per-inch. This means that each pixel is a square with side-length of one 150th of an inch. However, a person skilled in the relevant art would appreciate that any suitable printing resolution may be used to generate the protected documents described here.
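The relationship between pixel dimensions, print resolution and physical size can be sketched as follows. The function names are hypothetical, and the sketch converts only a pixel count to inches; it does not account for the border described later.

```python
from fractions import Fraction


def pixel_side_inches(dpi=150):
    """Side length of one printed pixel, in inches, at the given resolution."""
    return Fraction(1, dpi)


def printed_size_inches(width_px, height_px, dpi=150):
    """Physical width and height, in inches, of a region of the given pixel size."""
    side = pixel_side_inches(dpi)
    return width_px * side, height_px * side


print(pixel_side_inches(150))              # one 150th of an inch
print(printed_size_inches(600, 900, 150))  # a 600 x 900 pixel region
```

At 150 dots-per-inch, a region of 600 by 900 pixels therefore occupies 4 by 6 inches on the page.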

Fig. 2A shows a protected document 200. The protected document 200 will be used below as an example protected document to describe the methods of Figs. 3 to 35. The protected document 200 comprises a coarse alignment border 201 and an interior 202.

The border 201 of the protected document 200 comprises pixels (e.g., 203). The border 201 has a width, which may be denoted as B. For example, B may be equal to thirty-two (32), meaning that the protected document 200 has a border 201 thirty-two (32) pixels wide on all four sides of the protected document 200. The pixels that lie in the border 201 may be referred to as "border pixels". The interior 202 of the protected document 200 comprises all pixels of the protected document 200 that are not in the border 201. In the interior 202, some of the pixels may be referred to as "alignment pixels" 205, as seen in Fig. 2B. Alignment pixels 205 and border pixels 203 may be used to perform fine alignment on the protected document 200, which will be described in detail below.
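The split between border and interior can be sketched as a simple membership test. The sketch assumes whole-page pixel coordinates with the origin at the top-left corner of the protected document, so that the page measures (Wp + 2B) by (Hp + 2B) pixels; that coordinate convention and the function name are assumptions made for illustration only.

```python
def is_border_pixel(col, row, Wp, Hp, B=32):
    """True if whole-page pixel (col, row) lies in the border of width B."""
    total_w, total_h = Wp + 2 * B, Hp + 2 * B
    if not (0 <= col < total_w and 0 <= row < total_h):
        raise ValueError("pixel lies outside the protected document")
    # Interior pixels satisfy B <= col < B + Wp and B <= row < B + Hp.
    return col < B or col >= B + Wp or row < B or row >= B + Hp


# Example: a 64 x 96 pixel interior surrounded by a 32 pixel border.
print(is_border_pixel(0, 0, 64, 96))      # corner of the page
print(is_border_pixel(32, 32, 64, 96))    # first interior pixel
print(is_border_pixel(96, 50, 64, 96))    # first column of the right border
```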

The alignment pixels 205 are pixels whose row and column coordinates are both divisible by three. However, the alignment pixels 205 may be arranged in any other suitable arrangement. For example, one eighth of the pixels in the interior 202 of the protected document 200 may be selected pseudo-randomly to be alignment pixels.
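The regular arrangement of alignment pixels can be sketched as below, using interior coordinates that start at zero. The function names and the `step` parameter are illustrative assumptions; only the divisible-by-three grid is shown, not the pseudo-random alternative.

```python
def is_alignment_pixel(x, y, step=3):
    """True if interior pixel (x, y) has both coordinates divisible by step."""
    return x % step == 0 and y % step == 0


def alignment_pixels(W, H, step=3):
    """List the (x, y) coordinates of all alignment pixels in a W x H interior."""
    return [(x, y) for y in range(0, H, step) for x in range(0, W, step)]


print(is_alignment_pixel(3, 6))   # both coordinates divisible by three
print(is_alignment_pixel(3, 4))   # row coordinate is not
print(alignment_pixels(9, 6))     # grid positions in a 9 x 6 interior
```

With this arrangement roughly one pixel in nine is reserved for alignment, leaving the rest for the document and the protection barcode.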

The remaining pixels in the interior 202 may be divided into a protection barcode 203 and a document 204. The document 204 is a bi-level image as described above. For example, the document 204 may be a bi-level image of a text document. The protection barcode 203 comprises error-correction code parity bits that protect the document 204 from alterations. The protection barcode 203 may be appended to the top and bottom of the document 204. The width of the protection barcode 203 is the same on both sides of the protection barcode 203. The protection barcode 203 of Fig. 2A comprises two distinct regions 203A and 203B. However, the protection barcode 203 of Fig. 2A is processed as a single contiguous barcode 203, which may be read from top to bottom. The protection barcode 203 may also be arranged in many other shapes, such as a four-sided border, for example.
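How parity bits stored alongside a document can both reveal and repair an alteration may be illustrated with a toy example. The passage above does not name the error-correction code, so the sketch substitutes a systematic Hamming(7,4) code purely as a stand-in: four document bits are kept unchanged and three parity bits play the role of the protection barcode. All names here are hypothetical.

```python
def parity_bits(data):
    """Compute three Hamming(7,4) parity bits for four data bits."""
    d1, d2, d3, d4 = data
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


# Each single altered data bit produces a distinct parity mismatch pattern.
SYNDROME_TO_DATA_INDEX = {(1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3}


def detect_and_correct(data, stored_parity):
    """Return (corrected data, index of the altered bit or None)."""
    syndrome = tuple(p ^ q for p, q in zip(parity_bits(data), stored_parity))
    corrected = list(data)
    index = SYNDROME_TO_DATA_INDEX.get(syndrome)
    if index is not None:
        corrected[index] ^= 1   # flip the altered bit back
    return corrected, index


document = [1, 0, 1, 1]
barcode = parity_bits(document)       # stored with the document
altered = [1, 0, 0, 1]                # third bit tampered with

print(detect_and_correct(document, barcode))
print(detect_and_correct(altered, barcode))
```

A code this small corrects only one altered bit per block; a practical scheme would need a much stronger code to cope with both print/scan noise and deliberate alterations.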

The protection barcode 203 and the document 204 contain alignment pixels (e.g., 205), as both the protection barcode 203 and the document 204 may be fine aligned.

As described above, the dimensions of the protected document 200 may be specified by the width (Wp) and height (Hp) in pixels of the interior region 202 of the protected document 200. In order to make it easier to determine the dimensions of the protected document 200 from a scanned image of the protected document 200, the possible values of height (Hp) and width (Wp) for the protected document 200 may be limited.

In one example, Hp and Wp may be limited to multiples of the width B of the border 201. If the interior 202 is not a multiple of the border width B, the dimensions of the interior 202 may be rounded up to a nearest multiple of B, as will be described in detail below.
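Rounding an interior dimension up to the nearest multiple of the border width B is a one-line integer calculation. The function name below is an assumption made for illustration.

```python
def round_up_to_multiple(n, B=32):
    """Round n up to the nearest multiple of B (n itself if already a multiple)."""
    return -(-n // B) * B   # ceiling division, then scale back up


print(round_up_to_multiple(100))   # not a multiple of 32
print(round_up_to_multiple(64))    # already a multiple of 32
```

With B equal to 32, an interior width of 100 pixels is rounded up to 128, while a width of 64 is left unchanged.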

For ease of explanation and in order to allow specific pixels in the interior 202 of the protected document 200 to be identified, a pixel coordinate system will be described.

In this pixel coordinate system, each pixel in the interior 202 may be uniquely specified by a 2-tuple of coordinates (x, y). In this 2-tuple of coordinates, x specifies a column for the pixel, where column numbers range from 0 to W-1; y specifies a row for the pixel, where row numbers range from 0 to H-1. The state of the pixel with coordinates (x, y) may be denoted by α(x, y). If α(x, y) = 0, the pixel at (x, y) is in the "off" state. If α(x, y) = 1, the pixel at (x, y) is in the "on" state.

Two-stage alignment

Determining the location of pixels in a scanned image of the protected document 200, produced using the scanner 119 when reading the protected document 200, can be problematic. A major problem with conventional methods of determining the location of pixels in a scanned image is their inability to accurately determine the location of pixels at anything except trivially low resolutions. This problem prevents conventional methods from automatically verifying documents at a per-pixel level. However, using the methods described herein, pixel locations in a scanned image of the protected document 200 generated using the scanner 119 (e.g., a standard commercial scanner) and printer 115 may be accurately determined at resolutions up to 200 dpi. This upper resolution is due to the quality of the printing and scanning process, and is not an intrinsic limitation of the methods described herein. As printers and scanners improve in quality, higher resolutions will be possible using the described methods without modification.

Determination of the location of pixels in a scanned image of the protected document 200 can be problematic since the protected document 200 may be printed at one resolution (e.g., 150 pixels-per-inch) and scanned at a higher resolution (e.g., 600 dpi). In this example, each pixel of the protected document 200 occupies a region of 4-by-4 scanned pixels in the scanned image. The location of the centre of each such pixel in the scanned image is required to be determined accurately. However, due to distortions and warping, the locations of pixels in the scanned image of the protected document may deviate from their expected locations.

The location of pixels (e.g., 203) in the scanned image of the protected document 200 may be determined using "coarse alignment" and "fine alignment". Coarse alignment represents an approximate mapping between pixels and the coordinates of their centres in the scanned image of the protected document 200. Coarse alignment may use an affine transformation. Since the mapping between pixels and their location in the scanned image is usually more complicated than an affine transform, coarse alignment may not accurately represent the pixel locations. Once the coarse alignment affine transform has been found, the scanned image may be transformed, undoing the effects of the affine transform, and thus producing an image that is approximately the same as the original printed protected document 200. This image that is approximately the same as the original protected document 200 may be referred to as the coarsely-aligned image.

Fig. 3 shows a coarsely-aligned image 302 and a scanned image 303. Each of the images 302 and 303 represents the protected document 200, which includes the protection barcode 203 and the document 204 (i.e., the bi-level image of a document to be protected). A representation of a coarse alignment affine transform 311 is also shown.

The coarse alignment affine transform 311 takes coordinates in the coarsely-aligned image and maps the coordinates in the coarsely-aligned image to coordinates in the scanned image.

Fine alignment may be used to determine the mapping between interior pixels 301 (i.e., the pixels in the protection barcode 203 and the document 204 of the interior 202 of the protected document 200), as shown in Fig. 3, and the coarsely-aligned image 302, using an array of displacement vectors 310. Such an array of displacement vectors may be referred to as a "displacement map".

The displacement map 310 and the coarse alignment affine transform 311 together provide a mapping from the interior pixels 301 to coordinates in the scanned image 303 of the protected document 200. Given the coordinates of a pixel 315 in the interior pixels 301, the displacement map 310 may be used to find the coordinates of the centre of that pixel 317 in the coarsely-aligned image 302 of the protected document 200. Those coordinates may then be transformed by the coarse alignment affine transform 311, resulting in the coordinates of the centre of the pixel 319 in the scanned image 303 of the protected document 200. Thus the composition of the displacement map 310 and the affine transform 311 results in a mapping from the pixel coordinates (i.e., the coordinates at point 315) to the scanned image coordinates (i.e., the coordinates of the point 319).

The composed mapping is called a warp map. A representation of a warp map 312 is also shown in Fig. 3.
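The composition of the displacement map and the affine transform may be illustrated with the following Python sketch. It rests on assumptions that are not stated in the specification: the displacement map is held as a list of rows of (dx, dy) vectors, each vector is added to the nominal pixel coordinates to give the position in the coarsely-aligned image, and the affine transform is written as p -> A p + a. The names `apply_affine` and `warp` are hypothetical.

```python
def apply_affine(A, a, p):
    """Map the point p = (x, y) through the affine transform p -> A p + a."""
    x, y = p
    return (A[0][0] * x + A[0][1] * y + a[0],
            A[1][0] * x + A[1][1] * y + a[1])


def warp(pixel, displacement_map, A, a):
    """Compose the displacement map with the coarse affine transform.

    Assumption: displacement_map[y][x] holds the (dx, dy) offset of pixel
    (x, y) from its nominal position in the coarsely-aligned image.
    """
    x, y = pixel
    dx, dy = displacement_map[y][x]
    coarse = (x + dx, y + dy)          # position in the coarsely-aligned image
    return apply_affine(A, a, coarse)  # position in the scanned image


# Illustrative values only: a pure 4x scale (e.g., 150 dpi printed, 600 dpi
# scanned) plus a translation, and a 2-by-2 displacement map.
A = [[4.0, 0.0], [0.0, 4.0]]
a = (10.0, 20.0)
displacement_map = [[(0.0, 0.0), (0.0, 0.0)],
                    [(0.25, -0.5), (0.0, 0.0)]]
print(warp((0, 1), displacement_map, A, a))
print(warp((1, 0), displacement_map, A, a))
```

Keeping the two stages separate means the affine part absorbs scale, rotation and translation, so the displacement vectors only need to describe small residual distortions.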

Generating and reading protected documents

Fig. 4 is a flow diagram showing a method 400 of generating a protected document, such as the protected document 200, for example. The method 400 may be implemented as software resident on the hard disk drive 110 and be controlled in its execution by the processor 105.

The method 400 accesses data in the form of a bi-level image representing a document to be protected, and produces a 2D bi-level image representing the protected document 200. The 2D bi-level image represents the protection barcode 203, the document 204 and the border 201. This 2D bi-level image forming the protected document 200 may then be printed using the printer 115.

The method 400 begins at the first step 402, where the processor 105 generates spirals for corners (e.g., 209) of the protected document 200. The processor 105 encodes (or embeds) the spirals into border pixels of the protected document 200. At the next step 403, the processor 105 generates a border pattern for the border 201 of the protected document 200, storing data in the border pixels of the protected document 200. The processor 105 fills any of the border pixels that do not contain spirals with a small amount of data as will be described in detail below. The processor 105 may also store random data (i.e., noise) into pixels of the barcode border 201 where spirals have been embedded.

A method 1100 of storing data in border pixels of the protected document 200, as executed at step 403, will be described below with reference to Fig. 11.

The method 400 continues at the next step 404, where the processor 105 generates an alignment pattern in the alignment pixels (e.g., 205) of the protected document 200, in order to allow fine alignment to be performed when reading the protected document 200.

Step 404 may degrade the visual quality of the document 204 by corrupting every ninth pixel. Therefore, two methods of generating an alignment pattern in the alignment pixels (e.g., 205) of the protected document 200 will be described below. Firstly, a method 1400A of generating an alignment pattern in the alignment pixels (e.g., 205) of the protected document 200 will be described in more detail below with reference to Fig. 14A, for execution at step 404.

The method 1400A may be used for documents that do not have a dominant amount of one color (e.g., a monochrome image). For documents that do have a dominant amount of one color (e.g., text documents with a white background), a different method 1400B of generating an alignment pattern in the alignment pixels (e.g., 205) of the protected document 200 may be executed at step 404. The method 1400B will be described in detail below with reference to Fig. 14B.

The method 400 continues at the next step 405, where the processor 105 accesses data in the form of a bi-level image representing the document 204 to be protected and encodes the data to form a one dimensional (1D) document array and a 1D protection array. The bi-level image representing the document 204 (e.g., a text document) to be protected may be accessed from memory 106, for example. A method 2200 of encoding a document 204 to be protected into a 1D document array and a 1D protection array, as executed at step 405, will be described in detail below with reference to Fig. 22. As will be described in detail below, the 1D document array stores a serialised version of the document 204 to be protected. The 1D protection array comprises protection or parity bits.

The method 400 concludes at the next step 406, where the processor 105 arranges the 1D document array and the 1D protection array as the document 204 and the protection barcode 203, respectively, to form the protected document 200. A method 2300 of arranging the two 1D arrays to form the protected document 200, as executed at step 406, will be described below with reference to Fig. 23.

Fig. 5 is a flow diagram showing a method 500 of reading a protected document, such as the protected document 200, for example. The method 500 may be implemented as software resident on the hard disk drive 110 and be controlled in its execution by the processor 105.

The method 500 accesses an image generated by scanning a printed version of the protected document 200. This image may be referred to as the 'scanned image' of the protected document 200. The scanned image may be accessed from memory 106, for example. The method 500 then produces data encoded in the printed version of the protected document 200.

The method 500 begins at step 502, where the processor 105 determines a coarse alignment affine transform based on the dimensions (i.e., width Wp and height Hp) of the protected document 200. At step 502, the processor 105 determines the locations of spirals in the scanned image of the protected document 200 and uses the detected spirals to locate the protected document 200 on a page. The processor 105 then determines the dimensions of the protected document 200, the resolution of the pixels of the protected document 200 and the coarse alignment affine transform based on the determined dimensions. A method 900 of determining a coarse alignment affine transform, using the locations of the spirals, as executed at step 502, will be described below with reference to Fig. 9.

At the next step 504, the processor 105 reads the border 201 of the protected document 200 and extracts salt data. Salt data is a small amount of data from the border 201 of the protected document 200, as will be described in more detail below. A method 1200 of extracting salt data from the border 201 of the protected document 200 will be described below with reference to Fig. 12. Then at the next step 505, the processor 105 analyses the scanned image of the protected document 200 to determine a fine alignment warp map. The fine alignment warp map describes where pixels in the protected document 200 as printed appear in the scanned image of the protected document 200 and may be used to align the scanned image of the protected document 200 to the printed version of the protected document 200. A method 1300 of determining a fine alignment warp map for aligning the scanned image of the protected document 200, as executed at step 505, will be described below with reference to Fig. 13.

The method 500 continues at the next step 506, where the processor 105 extracts a 1D document array and a 1D protection array from the aligned scanned image of the protected document 200. A method 2400 of extracting the 1D document array and 1D protection array from the scanned image of the protected document 200, as executed at step 506, will be described in detail below with reference to Fig. 24. As will be described in detail below, in the method 2400, the protection barcode 203 and the document 204 of the protected document 200 are serialised into the 1D protection array and the 1D document array, respectively.

Then at the next step 507 of the method 500, the processor 105 uses the 1D document array and 1D protection array to detect alterations in the printed document 200.

At step 507, the processor 105 produces two images, a first image showing the location of the alterations to the protected document 200 and a second image correcting the alterations. A method 2500 of indicating the location of the alterations to the protected document 200 and generating an image correcting the alterations will be described in detail below with reference to Fig.

Spirals and coarse alignment

Step 402 of the method 400 and step 502 of the method 500 will now be described in more detail.

As described above, at step 402, the processor 105 generates spirals in the corners (e.g., 209) of the protected document 200 located inside the border 201 of the protected document 200. These spirals are generated in the protected document 200 since the spirals have distinctive properties that allow the spirals to be easily detected when the protected document 200 is read.

As described above, at step 502, the processor 105 determines a coarse alignment affine transform. The coarse alignment affine transform is determined based on the dimensions of the protected document 200. At step 502, the processor 105 determines the locations of spirals in the scanned image of the protected document 200 and uses the detected spirals to locate the protected document 200 on a page. The processor 105 then determines the dimensions of the protected document 200, the resolution of the pixels in the protected document 200 and the coarse alignment affine transform.

The spirals used in the protected document 200 are bitmapped versions of logarithmic radial harmonic functions (LRHF). Mathematically, LRHF are complex valued functions defined on a plane. LRHF have the properties of scale and rotation invariance, which means that if an LRHF is transformed by scaling or rotation the transformed LRHF is still an LRHF. As an example, Fig. 6 shows a plot of the real part 600 of an LRHF.

An LRHF has three parameters that may be adjusted. The first parameter is referred to as the Nyquist radius R, which is the radius at which the frequency of the LRHF becomes greater than π radians per pixel. The second parameter is referred to as the spiral angle σ, which is the angle that the spiral arms (e.g., 601) make with circles centred at the origin (e.g., 602). The third parameter is referred to as the phase offset φ. An LRHF may be expressed in polar coordinates (r, θ) in accordance with Formula (1) as follows:

$$l(r, \theta) = e^{j(m\theta + n \ln r + \phi)} \qquad (1)$$

where the values of m and n may be determined in accordance with the following Formulae (2):

$$n = R\pi \cos\sigma, \qquad m = \lfloor R\pi \sin\sigma \rfloor \qquad (2)$$

In one example, σ = π/6 radians and R = B/4, where B represents the width of the border 201 of the protected document 200, as shown in Fig. 2. The selection of phase φ varies for different spirals in the same protected document 200, as will be described in more detail below.
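The relationship between the three parameters and the LRHF can be sketched in Python as follows. The function names `lrhf_parameters` and `lrhf` are hypothetical, and the numeric values used (a Nyquist radius of 8 pixels and a spiral angle of π/6 radians) are illustrative assumptions rather than values taken from the specification.

```python
import cmath
import math


def lrhf_parameters(R, sigma):
    """Return (m, n) for Nyquist radius R (pixels) and spiral angle sigma (radians).

    m is rounded down to an integer so that the function takes the same
    value at theta and theta + 2*pi.
    """
    n = R * math.pi * math.cos(sigma)
    m = math.floor(R * math.pi * math.sin(sigma))
    return m, n


def lrhf(r, theta, m, n, phi=0.0):
    """Complex LRHF value at polar coordinates (r, theta), for r > 0."""
    return cmath.exp(1j * (m * theta + n * math.log(r) + phi))


m, n = lrhf_parameters(8.0, math.pi / 6)
print(m, n)

# Scale invariance: doubling the radius only multiplies the LRHF by a
# constant of modulus one, so the scaled function is still an LRHF.
value = lrhf(10.0, 0.3, m, n)
scaled = lrhf(20.0, 0.3, m, n)
print(abs(scaled / value))
```

The second print illustrates why a spiral remains detectable when the scan resolution differs from the print resolution: rescaling changes only the phase of the function, not its form.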

4.1 Embedding spirals

At step 402 of the method 400, the processor 105 generates six spirals in the protected document 200. The spirals are embedded in the coarse alignment border pixels (e.g., 204) of the protected document 200. Each spiral is generated by generating a spiral bitmap, which samples the LRHF with the Nyquist radius R, the spiral angle σ and the phase offset φ. The spiral bitmap has height and width equal to B pixels.

Fig. 7 shows a spiral bitmap 700. The polar coordinates in the spiral bitmap 700 will now be described. The term "origin" of the coordinate system of the spiral bitmap 700 refers to the centre 703 of the spiral bitmap 700. The radius r 701 of a point in the spiral bitmap 700 is the distance from that point to the origin 703, measured in pixels.

The angle θ 702 of a point in the spiral bitmap 700 is the angle of a ray from the origin 703 through the point. In accordance with this definition of radius r and angle θ, the value of a pixel in the spiral bitmap with coordinates (r, θ) may be determined in accordance with Formula (3) as follows:

$$b(r, \theta) = \begin{cases} 1 & \text{if } r > R \text{ and } \operatorname{Re}\left(e^{j(m\theta + n \ln r + \phi)}\right) > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

Squares (e.g., 705) of the spiral bitmap 700 shown in Fig. 7 are shaded where pixel values of the bitmap 700 are equal to one (i.e., on). Squares (e.g., 707) of the bitmap 700 are unshaded where the pixel values of the bitmap 700 are equal to zero (i.e., off).

Once the spiral bitmap 700 has been generated, the spiral represented by the spiral bitmap 700 may be embedded into the pixels of the protected document 200. Pixels of the spiral bitmap 700 equal to zero are encoded into the protected document 200 by setting the state of a corresponding protected document pixel to "off". Pixels of the spiral bitmap 700 equal to one are encoded into the protected document 200 by setting the state of a corresponding protected document pixel to "on".
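A spiral bitmap of this kind could be produced along the following lines. This is a sketch under several assumptions that the specification does not confirm: a pixel is switched on where the real part of the LRHF is positive, the origin sits midway between the four central pixels, and the parameter values are illustrative. The function name `spiral_bitmap` is hypothetical.

```python
import math


def spiral_bitmap(B, R, sigma, phi=0.0):
    """Return a B-by-B list of rows of 0/1 values sampling an LRHF.

    Assumptions: the origin is at ((B - 1) / 2, (B - 1) / 2) and a pixel is
    "on" when r > R and the real part of the LRHF is positive.
    """
    n = R * math.pi * math.cos(sigma)
    m = math.floor(R * math.pi * math.sin(sigma))
    centre = (B - 1) / 2.0
    rows = []
    for y in range(B):
        row = []
        for x in range(B):
            dx, dy = x - centre, y - centre
            r = math.hypot(dx, dy)
            theta = math.atan2(dy, dx)
            # r > R is tested first, so math.log is never called with r == 0.
            on = r > R and math.cos(m * theta + n * math.log(r) + phi) > 0
            row.append(1 if on else 0)
        rows.append(row)
    return rows


bitmap = spiral_bitmap(32, 8.0, math.pi / 6)
opposite = spiral_bitmap(32, 8.0, math.pi / 6, phi=math.pi)
for row in bitmap:
    print("".join("#" if v else "." for v in row))
```

Outside the Nyquist radius, the bitmap generated with a phase offset of π is the complement of the bitmap generated with a phase offset of zero, which is what makes the two kinds of spiral distinguishable after correlation.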

As seen in Fig. 8, six spirals 801, 802, 803, 804, 805 and 806 may be embedded in the border 201 of the protected document 200. Each of these spirals 801, 802, 803, 804, 805 and 806 is B pixels wide, and B pixels high. As described above, B may be equal to thirty-two (32), meaning that each of the spirals is thirty-two pixels wide and thirty-two pixels high. Five of the spirals (i.e., spirals 801, 803, 804, 805 and 806, as seen in Fig. 8) embedded in the protected document 200 have the same value for phase (i.e., φ = 0), while the remaining spiral (i.e., spiral 802) has an opposite phase (i.e., φ = π). The locations of the six spirals 801, 802, 803, 804, 805 and 806 embedded in the border 201 of the protected document 200 will now be described with reference to Fig. 8.

As seen in Fig. 8, four spirals 801, 803, 804 and 806 of the five spirals (i.e., spirals 801, 803, 804, 805 and 806, as seen in Fig. 8) with phase φ = 0 are positioned in the four corners (e.g., 205) of the protected document 200. The other spiral 805 with φ = 0 is positioned immediately to the left of the spiral 804 in the bottom-right corner 205 of the protected document 200. The spiral 802 with opposite phase φ = π is positioned immediately to the right of the spiral 801 in the top-left corner of the protected document 200.

The six spirals 801, 802, 803, 804, 805 and 806 embedded in the border 201 of the protected document 200 are encoded into pixels of the border 201 of the protected document 200.

4.2 Higher resolution spirals

Spirals may be printed by the printer 115, for example, at a higher resolution than the resolution of the protected document 200 being printed. This may allow more accurate sampling of an underlying LRHF, and better spiral detectability when the protected document 200 is scanned by the scanner 119, for example. In this case, the spirals of the protected document 200 may be printed at a 'spiral resolution'. The spiral resolution is equal to the resolution of printing of the protected document 200 (i.e., the protected document resolution) multiplied by an integer referred to as a 'spiral factor', F.

The spiral resolution is preferably a highest resolution at which the printer 115 is able to print.

Each pixel at the protected document resolution in the coarse alignment border 201, where a spiral is to be added to the protected document 200, is divided into an F x F array of pixels at the spiral resolution. Thus, each spiral is composed of an array of pixels with a height of BF pixels and a width of BF pixels. In one example, the spiral bitmaps (e.g., 700) formed at step 402 have a height H and width W equal to BF rather than B. In this instance, the spiral bitmaps (e.g., 700) are embedded into the pixel arrays.

4.3 Detecting spirals

As described above, at step 502 of the method 500, the processor 105 detects the locations of spirals in the scanned image of the protected document 200 and then determines a coarse alignment affine transform, using the locations of the spirals. The detection of spiral locations may be achieved by performing a correlation between a spiral template image and the scanned image of the protected document 200.

The method 900 of determining a coarse alignment affine transform, as executed at step 502, will now be described with reference to Fig. 9. The method 900 may be implemented as software resident on the hard disk drive 110 and be controlled in its execution by the processor 105.

The method 900 begins at an initial step 901, where the processor 105 generates a spiral template image, within memory 106, for example. The generation of the spiral template image at step 901 is similar to the generation of the spiral bitmap in step 402 of the method 400. However, the spiral template image is complex valued and is larger in size than the spiral bitmap. Each pixel value in the spiral template image is stored in memory 106 as a pair of double-precision floating point numbers representing the real and imaginary parts of the pixel value. The spiral template image has height and width equal to Ts, the template size. The template size Ts may vary. In one example, Ts = 256.

Polar coordinates (r, θ) in the spiral template are defined, with the origin in the centre of the template. The pixel value at polar coordinates (r, θ) in the spiral template image may be determined in accordance with Formula (4) as follows:

$$t(r, \theta) = \begin{cases} e^{j(m\theta + n \ln r)} & \text{if } r > R \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

where m and n are defined by Formulae (2) above; the Nyquist radius R represents the radius at which the frequency of the LRHF becomes greater than π radians per pixel; and the spiral angle σ represents the angle that the spiral arms of the LRHF make with circles centred at the origin of the LRHF.

At the next step 903, the processor 105 performs a correlation between the scanned image and the complex spiral template image to generate a correlation image.

The correlation of two images I1 and I2 is a correlation image Ix. The correlation image Ix may be determined in accordance with Formula (5) below:

$$I_x(x, y) = \sum_{x', y'} I_1(x', y')\, I_2(x + x', y + y') \qquad (5)$$

The sum of Formula (5) ranges over all x' and y' where I1 is defined, and in the image I2, the values of pixels outside the image are considered to be zero. If either of the images I1 or I2 is complex-valued, the correlation image Ix may be complex-valued too. The resulting correlation image Ix contains peaks (i.e., pixels with large modulus relative to neighbouring pixels) at the locations of spirals in the scanned image of the protected document 200. The phase of the pixel value of a peak is related to the phase φ of a corresponding spiral (e.g., 801) that was embedded in the protected document 200. The five spirals 801, 803, 804, 805 and 806 that were generated with φ = 0 at step 402 have peaks with similar phase, while the one spiral 802 that was generated with φ = π at step 402 typically has a peak with opposite phase to the peaks of the other five spirals. Even if the scanned image of the protected document 200 is at a different resolution to the resolution that the protected document 200 was printed at, the spirals 801, 802, 803, 804, 805 and 806 will still be detected by the processor 105 since the underlying LRHF of the spirals is scale-invariant.
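The correlation sum and the search for a peak can be sketched directly in Python. The sketch below is a plain double loop intended only to show the arithmetic; a practical implementation would more likely use a fast Fourier transform. The names `correlate` and `strongest_peak` are hypothetical, images are assumed to be lists of rows, and the output is restricted to offsets that lie inside the second image. The tiny real-valued images stand in for the complex spiral template and the scanned image.

```python
import cmath


def correlate(i1, i2):
    """Sum of i1(x', y') * i2(x + x', y + y') over all (x', y') in i1.

    Pixels of i2 that fall outside the image are treated as zero.
    """
    h1, w1 = len(i1), len(i1[0])
    h2, w2 = len(i2), len(i2[0])
    out = [[0j] * w2 for _ in range(h2)]
    for y in range(h2):
        for x in range(w2):
            total = 0j
            for yp in range(min(h1, h2 - y)):
                for xp in range(min(w1, w2 - x)):
                    total += i1[yp][xp] * i2[y + yp][x + xp]
            out[y][x] = total
    return out


def strongest_peak(corr):
    """Return ((x, y), modulus, phase) of the pixel with the largest modulus."""
    coords = ((x, y) for y in range(len(corr)) for x in range(len(corr[0])))
    x, y = max(coords, key=lambda p: abs(corr[p[1]][p[0]]))
    return (x, y), abs(corr[y][x]), cmath.phase(corr[y][x])


pattern = [[1, 2], [3, 4]]
image = [[0] * 6 for _ in range(6)]
for yp in range(2):
    for xp in range(2):
        image[2 + yp][3 + xp] = pattern[yp][xp]

peak, modulus, phase = strongest_peak(correlate(pattern, image))
print(peak, modulus, phase)
```

The location of the peak gives the offset at which the pattern occurs, and the phase of the peak changes by π if the embedded pattern is negated, which mirrors the way the opposite-phase spiral is told apart from the other five.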

At the next step 904 of the method 900, the processor 105 examines the correlation image resulting from step 903, and locates the six peaks corresponding to each of the spirals 801 to 806 in accordance with the arrangement of the spirals 801 to 806 seen in Fig. 8. The six peaks corresponding to each of the spirals 801 to 806 may be located in accordance with the resolution at which the protected document 200 was printed (represented as Rp) and the resolution at which the protected document 200 was scanned (represented as Rs). If either of the resolutions Rp and Rs is not known, but there are only a few possibilities for the values of the resolutions Rp and Rs, then the six peaks of the spirals 801 to 806 may be located by trying each of the possible resolutions, and locating six peaks with a layout consistent with the corresponding possible resolution.

A method 2600 of locating the six peaks corresponding to each of the spirals 801 to 806, as executed at step 904, will now be described with reference to Fig. 26. The method 2600 may be implemented as software resident on the hard disk drive 110 and be controlled in its execution by the processor 105.

The method 2600 begins at step 2601, where the correlation image determined at step 903 is searched to locate the spirals 804 and 805 in the bottom-right corner 205 of the protected document 200. The spirals 804 and 805 correspond to a pair of peaks with approximately the same phase and lying approximately B × Rs / Rp pixels apart in the scanned image of the protected document 200. The coordinates of each of the peaks of the spirals 804 and 805 may be denoted by q4 and q5 respectively.

At the next step 2603, the correlation image determined at step 903 is searched to locate the spirals 801 and 802 in the top-left corner of the protected document 200. The spirals 801 and 802 correspond to a pair of peaks lying approximately (B × Rs / Rp) pixels apart in the scanned image of the protected document 200. The peak of the spiral 801 will have approximately the same phase as the peaks at q4 and q5 determined previously. The peak of the spiral 802 will have approximately the opposite phase. The coordinates of the peak corresponding to the spiral 801, in the scanned image, having approximately the same phase as the peaks at q4 and q5 may be denoted q1. The coordinates of the peak corresponding to the spiral 802, in the scanned image, having approximately the opposite phase to the peaks at q4 and q5 may be denoted q2. If the peak at q4 is closer in distance to the peak at q1 than the peak at q5 is, then the peaks at q4 and q5 may be swapped.

The method 2600 concludes at the next step 2605, where the locations of the top-right and bottom-left spirals 803 and 806 may be estimated. At step 2605 the correlation image of step 903 is searched to see if peaks with the correct phase are at the locations determined for the spirals 803 and 806. If peaks with the correct phase are found at the locations of the top-right and bottom-left spirals 803 and 806, then a protected document with consistent layout to the printed version of the protected document 200 has been found. The expected coordinates of the top-right spiral 803 may be denoted by q'3. The value of the expected coordinates q'3 may be determined by projecting a point at coordinates q4 onto a line joining the coordinates q1 to the coordinates q2. Similarly, the expected coordinates of the bottom-left spiral 806 may be denoted q'6. The value of the coordinates q'6 may be determined by projecting a point at coordinates q1 onto the line joining the coordinates q4 and q5. The correlation image may be searched for peaks at coordinates q3 and q6 that are close to expected coordinates q'3 and q'6 respectively.
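The projections used to estimate the expected coordinates can be sketched as follows. The sketch assumes that "projecting" means orthogonal projection onto the infinite line through the two given points; the function name `project_onto_line` and the peak coordinates are illustrative and not taken from the specification.

```python
def project_onto_line(p, a, b):
    """Orthogonally project the point p onto the line through a and b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    # Parameter t locates the foot of the perpendicular along a + t * (b - a).
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)


# Illustrative peak coordinates in scanned pixels (assumed values).
q1, q2 = (100.0, 100.0), (164.0, 100.0)    # top-left pair
q4, q5 = (900.0, 1300.0), (836.0, 1300.0)  # bottom-right pair

q3_expected = project_onto_line(q4, q1, q2)  # top-right spiral
q6_expected = project_onto_line(q1, q4, q5)  # bottom-left spiral
print(q3_expected, q6_expected)
```

Because each pair of adjacent spirals lies along one edge of the border, the line through the pair gives the direction of that edge even when the page has been rotated on the scanner.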

Some predetermined tolerance parameters may be used in the method 2600, in order to determine whether peaks are approximately the right distance apart, whether two peaks have approximately the same (or opposite) phase, or whether two peaks are close. For example, the following predetermined tolerances may be used.

Two peaks may be considered to be approximately a correct distance apart if the actual distance between the two peaks is within 5% of the correct distance. The peaks at coordinates q4 and q5 may be considered to be of the same phase if their phases are within π/3 of each other. The peaks at coordinates q1 and q2 may be considered to be of opposite phase to each other if one phase is within π/3 of the other phase plus π. The peaks at coordinates q3 and q6 may be considered to be close to peaks at the expected coordinates q'3 and q'6 if the angles ∠q'3 q1 q3 and ∠q'6 q4 q6 are less than 5° respectively, and the angles ∠q1 q3 q4 and ∠q4 q6 q1 are within 5° of 90° respectively.
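The example tolerances can be expressed as small predicate functions. The following Python sketch uses hypothetical function names and makes two assumptions: phases are compared modulo 2π, and an angle written with three points is taken to be the angle at the middle point.

```python
import math


def distance_ok(p, q, expected, tolerance=0.05):
    """True if the distance between p and q is within 5% of `expected`."""
    actual = math.hypot(p[0] - q[0], p[1] - q[1])
    return abs(actual - expected) <= tolerance * expected


def phase_difference(a, b):
    """Smallest absolute difference between two phases, in the range [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def same_phase(a, b, tolerance=math.pi / 3):
    return phase_difference(a, b) <= tolerance


def opposite_phase(a, b, tolerance=math.pi / 3):
    return phase_difference(a, b + math.pi) <= tolerance


def angle_at(vertex, p, q):
    """Unsigned angle p-vertex-q in degrees."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.degrees(math.atan2(cross, dot)))


print(distance_ok((0, 0), (63, 0), 64))
print(same_phase(3.1, -3.1), opposite_phase(0.0, 3.0))
print(angle_at((0, 0), (1, 0), (0, 1)))
```

Comparing phases modulo 2π matters because two peaks with phases just either side of ±π are close in phase even though their numeric values differ by almost 2π.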

More than one pair of peaks may be found at the top-left hand corner of the protected document 200 when searching for either of the peaks with the same or opposite phase. In this instance, different combinations of the peaks may be tried in order to find a correct combination.

Returning to the method 900 of Fig. 9, at the next step 905, the processor 105 determines the dimensions of the protected document 200 and generates a coarse-alignment affine transform based on the determined dimensions. The dimensions of the protected document 200 may be determined by examining the positions of the peaks corresponding to the spirals 801, 803 and 806 in the scanned image of the protected document 200.

A method 2700 of determining the dimensions of the protected document 200, as executed at step 905, will now be described with reference to Fig. 27. The method 2700 may be implemented as software resident on the hard disk drive 110 and be controlled in its execution by the processor 105.

The method 2700 begins at step 2701, where the processor 105 determines the distance between the peaks corresponding to the top-left spiral 801 and top-right spiral 803. This distance may be denoted by ||q1 − q3||. At the next step 2703, the distance determined at step 2701 is converted from scanned pixels to pixels in the printed version of the protected document 200 by multiplying the distance ||q1 − q3|| by Rp/Rs in accordance with Formula (6) below, where Wc represents the distance measured in protected document pixels:

$$W_c = \lVert q_1 - q_3 \rVert \times \frac{R_p}{R_s} \qquad (6)$$

The value of Wc is an approximation of the distance between the centres of the two spirals 801 and 803 in the printed version of the protected document 200. Wc is equal to the width of the protected document 200 (i.e., Wp), plus half the width of the top-left spiral 801, plus half the width of the top-right spiral 803. Since the width of the spirals 801 and 803 is the border width B, the width Wp of the protected document 200 is approximately Wc − B. At the next step 2705 of the method 2700, the width Wp is determined by rounding the value of Wc − B to the nearest multiple of the border width B, on the basis that the width Wp and height Hp of the protected document 200 are both multiples of the border width B.

At the next step 2707, the processor 105 determines the protected document height Hp by rounding the value of Hc − B, determined in accordance with Formula (7) as follows:

$$H_c - B = \lVert q_1 - q_6 \rVert \times \frac{R_p}{R_s} - B \qquad (7)$$

to the nearest multiple of the border width B. The method 2700 concludes following step 2707.
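The arithmetic of steps 2701 to 2707 can be sketched in Python as follows. The function names and the sample peak coordinates are hypothetical, and the handling of exact ties when rounding (rounded upwards here) is an assumption, since the specification does not address it.

```python
import math


def round_to_multiple(value, B):
    """Round value to the nearest multiple of B (ties rounded upwards)."""
    return B * math.floor(value / B + 0.5)


def document_dimensions(q1, q3, q6, Rp, Rs, B):
    """Estimate (Wp, Hp) in protected document pixels from three peak positions.

    q1, q3 and q6 are the peaks of the top-left, top-right and bottom-left
    spirals in scanned pixels; Rp and Rs are the print and scan resolutions.
    """
    Wc = math.hypot(q1[0] - q3[0], q1[1] - q3[1]) * Rp / Rs
    Hc = math.hypot(q1[0] - q6[0], q1[1] - q6[1]) * Rp / Rs
    return round_to_multiple(Wc - B, B), round_to_multiple(Hc - B, B)


# Illustrative values: printed at 150 dpi, scanned at 300 dpi, B = 32,
# with slightly noisy peak positions.
Wp, Hp = document_dimensions((100, 100), (1695, 104), (97, 1380),
                             Rp=150, Rs=300, B=32)
print(Wp, Hp)
```

Rounding to a multiple of B absorbs small errors in the measured peak positions, which is the practical benefit of restricting the permitted document dimensions.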

The coarse-alignment affine transform is specified by a matrix A and a vector a.

The coarse-alignment affine transform is determined at step 905 using the width Wp and height Hp of the protected document 200 by determining the affine transform that takes the centres of the three spirals 801, 803 and 806 to the positions of the three peaks q1, q3 and q6 in the scanned image of the protected document 200. If the elements of the matrix A are denoted as follows:

$$A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix} \qquad (8)$$

then the matrix A may be determined using Formulae (9) and (10) as follows:

$$\begin{pmatrix} a_{00} \\ a_{10} \end{pmatrix} = \frac{1}{W_p + B}\,(q_3 - q_1) \qquad (9)$$

$$\begin{pmatrix} a_{01} \\ a_{11} \end{pmatrix} = \frac{1}{H_p + B}\,(q_6 - q_1) \qquad (10)$$

Then the vector a may be determined in accordance with Formula (11) as follows:

$$a = q_1 - \frac{B}{2} \begin{pmatrix} a_{00} + a_{01} \\ a_{10} + a_{11} \end{pmatrix} \qquad (11)$$

Salt and border patterns

Step 403 of the method 400 of Fig. 4 and step 504 of the method 500 of Fig. 5 will now be described in more detail. As described above, at step 403, the processor 105 generates a border pattern for the protected document 200, storing data in the border pixels (e.g., 203) of the protected document 200. Further, at the step 504, the processor 105 reads the border 201 of the protected document 200 and extracts data from the border 201 of the protected document 200. Each of steps 403 and 504 stores or reads a small amount of data (i.e., salt data) out of the border 201 of the protected document 200. The salt data may store metadata such as a version value representing the version of the protected document 200.

For the purposes of storing and reading the salt data, the border 201 of the protected document is divided into squares (e.g., 1001, 1002), as shown in Fig. 10. The border 201 has width equal to B, and the protected document 200 has both height and width that are multiples of the border width B. Thus, the border 201 of the protected document 200 may be divided evenly into squares (e.g., 1001) with width and height equal to B/2. The square 1001 may be referred to as a 'salt square'.

The corners 1006) of the protected document 200 contain spirals. As such, N salt squares 1002) that lie where a spiral has been placed may be removed from further consideration and are not considered as being salt squares. Each of the remaining salt squares, such as the square 1001, which have not been removed, may be used to store N one bit of salt data.

For the purposes of storing and reading the salt data, two pseudo-random arrays, $\alpha_0$ and $\alpha_1$, may be used. These pseudo-random arrays $\alpha_0$ and $\alpha_1$ represent noise patterns.

At each 2-tuple of pixel coordinates (x, y), each of the arrays $\alpha_0$ and $\alpha_1$ contains a value $\alpha_i(x, y)$ that is either zero (0) or one (1). Since the arrays $\alpha_i$ are pseudo-random, the values $\alpha_i(x, y)$ will appear random, even though the values are predetermined given x and y. Any suitable pseudo-random number generation algorithm may be used to generate the arrays $\alpha_0$ and $\alpha_1$. For example, the arrays $\alpha_0$ and $\alpha_1$ may be generated using the RC4 algorithm, initialized with known seeds. The arrays $\alpha_0$ and $\alpha_1$ represent salt patterns, which may occur in the salt squares of the border 201, as will be described in detail below.
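By way of illustration only, the generation of two such noise arrays from an RC4 keystream may be sketched in Python as follows. The seeds `b"salt-0"` and `b"salt-1"`, the names `rc4_keystream`, `noise_array`, `alpha0` and `alpha1`, and the choice of taking the low bit of each keystream byte are assumptions made for this sketch; the description above states only that RC4 initialized with known seeds may be used.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return the first n bytes of the RC4 keystream for the given key."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):  # pseudo-random generation algorithm
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def noise_array(seed: bytes, width: int, height: int):
    """Build a height x width array of 0/1 values.

    Entry [y][x] is fully determined by the seed, x and y, so a reader that
    knows the seed can regenerate the same array.
    """
    stream = rc4_keystream(seed, width * height)
    # Assumption: one bit per pixel, taken from the low bit of each byte.
    return [[stream[y * width + x] & 1 for x in range(width)]
            for y in range(height)]


# Hypothetical seeds; any fixed values known to writer and reader would do.
alpha0 = noise_array(b"salt-0", 16, 16)
alpha1 = noise_array(b"salt-1", 16, 16)
```

Because the arrays depend only on the seed and the pixel coordinates, the reading side can regenerate them without any side channel.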

At step 403 the processor 105 assigns values to the pixels in the coarse alignment border 201 of the protected document 200, in accordance with the salt data to be encoded.

The number of bits of salt data that may be encoded is equal to the number of salt squares (e.g., 1001) that fit in the border 201 of the protected document 200, given the protected document dimensions. Thus, protected documents with different dimensions may be able to store different amounts of salt data.

The method 1100 of storing data in border pixels of the protected document 200, as executed at step 403, will now be described in detail with reference to Fig. 11. The method 1100 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 1100 begins at step 1102 where the processor 105 iterates through the salt squares (e.g., 1001) of the protected document 200, in a predetermined order. For example, the processor 105 may iterate through the salt squares 1001 in scanline order. In this instance, on the first execution of step 1102, a leftmost salt square 1007 in the top row of salt squares is selected. This leftmost salt square 1007 becomes the currently selected salt square. On subsequent executions of step 1102, subsequent salt squares (e.g., 1009) in the topmost row will be selected, and then salt squares in subsequent rows will be selected, row by row. In some rows (e.g., row 1011) the salt squares may not all be adjacent.

At a following step 1103, the processor 105 sets the values of pixels in the currently selected salt square (e.g., 1007). At step 1103 the processor 105 assigns to each pixel at coordinates $(x_c, y_c)$ in the currently selected salt square the corresponding value $\alpha_i(x_c, y_c)$ of the pseudo-random array $\alpha_i$, for all $(x_c, y_c)$ in the selected salt square, where n is defined such that the currently selected salt square is the n-th salt square to be processed at step 1103, and i is the value of the n-th bit of the salt data.

At the next step 1104, if the processor 105 determines that there are more salt squares in the protected document 200 to be processed then the method 1100 returns to step 1102. Otherwise, the method 1100 concludes.
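The loop of steps 1102 to 1104 may be sketched as follows, under stated assumptions. The helper names, the use of Python's `random` module as a stand-in for the pseudo-random arrays, and the parameter `spiral` (the side length of the corner region assumed to be occupied by a spiral, taken here to apply to all four corners) are hypothetical and are not taken from the specification.

```python
import random


def noise_array(seed, width, height):
    """Stand-in for a pseudo-random salt pattern: deterministic 0/1 values."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]


def salt_squares(width, height, border, spiral):
    """Top-left corners of the salt squares, in scanline order.

    Squares have side border // 2. Squares lying in the assumed corner
    regions of side `spiral` are skipped.
    """
    half = border // 2
    squares = []
    for ys in range(0, height, half):
        for xs in range(0, width, half):
            in_border = (xs < border or xs >= width - border or
                         ys < border or ys >= height - border)
            in_corner = ((xs < spiral or xs >= width - spiral) and
                         (ys < spiral or ys >= height - spiral))
            if in_border and not in_corner:
                squares.append((xs, ys))
    return squares


def store_salt(doc, bits, border, spiral, patterns):
    """Write one salt bit per salt square by copying patterns[bit] into it."""
    half = border // 2
    height, width = len(doc), len(doc[0])
    for n, (xs, ys) in enumerate(salt_squares(width, height, border, spiral)):
        if n >= len(bits):
            break  # no more salt data to store
        pattern = patterns[bits[n]]  # i is the value of the n-th salt bit
        for y in range(ys, ys + half):
            for x in range(xs, xs + half):
                doc[y][x] = pattern[y][x]


# Example: a 32 x 24 pixel document, border width 8, two salt bits.
patterns = [noise_array(0, 32, 24), noise_array(1, 32, 24)]
doc = [[None] * 32 for _ in range(24)]
store_salt(doc, [1, 0], 8, 8, patterns)
```

In this example the 32 by 24 document yields 24 salt squares, so up to 24 bits of salt data could be stored.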

5.2 Reading the salt data

The method 1200 of extracting salt data from the border 201 of the protected document 200, as executed at step 504, will now be described with reference to Fig. 12. The method 1200 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105.

In the method 1200, the processor 105 uses the coarse-alignment affine transform determined at step 905 and the scanned image of the protected document 200 to extract the salt data from the border 201 of the protected document 200.

The method 1200 begins at step 1202, where the processor 105 iterates through the salt squares (e.g., 1007) of the protected document 200. For example, the processor 105 may iterate through the salt squares in the same predetermined order used in step 1102 described above. The following steps 1203 to 1206 of the method 1200 determine which of the two salt patterns, represented by the pseudo-random arrays $\alpha_0$ and $\alpha_1$, occurs in a selected salt square 1001. This may be achieved by correlating both salt patterns with the selected salt square, and determining which of the salt patterns provides a larger result. Knowing which of the salt patterns correlates with the selected salt square enables the value of the data bit encoded in the selected salt square to be determined.

At step 1203, a coarsely-aligned image of the currently selected salt square is generated by the processor 105. The coarsely-aligned image may be generated by interpolating the scanned image, in order to determine values for the coarsely-aligned image at non-integer coordinates. The scanned image may be interpolated using bicubic interpolation. A greyscale value interpolated from the scanned image of the protected document 200 at the coordinates (x, y) in the scanned image coordinate system may be denoted as s(x, y).

The coarsely-aligned image of the currently selected salt square may be denoted by $U_s$. The image $U_s$ has both height and width equal to half the border width (i.e., B/2).

As an example, if the currently selected salt square has a top-left pixel at coordinates $(x_s, y_s)$, then pixels in $U_s$ correspond to the pixels with x-coordinates between $x_s$ and $x_s + B/2 - 1$, and y-coordinates between $y_s$ and $y_s + B/2 - 1$. If the x- and y-coordinates of $U_s$ range from 0 to $B/2 - 1$, then the image $U_s$ may be generated in accordance with Formula (12) as follows:

$$U_s(x, y) = s\left(A \begin{pmatrix} x + x_s \\ y + y_s \end{pmatrix} + a\right) \qquad (12)$$

That is, the pixel coordinates are transformed using the coarse alignment affine transform, resulting in coordinates in the scanned image of the protected document 200. The scanned image may then be interpolated at these coordinates, and the greyscale value may be encoded into the coarsely-aligned image $U_s$.

Two images, $U_0$ and $U_1$, may also be generated at step 1203. The images $U_0$ and $U_1$ contain the expected salt patterns, as represented by the arrays $\alpha_0$ and $\alpha_1$. The images $U_0$ and $U_1$ may be generated as follows:

$$U_0(x, y) = \alpha_0(x + x_s, y + y_s), \qquad U_1(x, y) = \alpha_1(x + x_s, y + y_s) \qquad (13)$$

The method 1200 continues at the next step 1204, where the processor 105 performs two circular correlations. The circular correlation of two images $I_1$ and $I_2$ with the same dimensions generates a third image $I_x$ with the same dimensions, according to Formula (14) below:

$$I_x(x, y) = \sum_{x', y'} I_1(x', y')\, I_2(x + x', y + y') \qquad (14)$$

The sum of Formula (14) ranges over all x' and y' where $I_1$ is defined, and in the image $I_2$, the values of pixels outside the image may be obtained by considering $I_2$ to be periodic.

Two circular correlations are performed at step 1204 in accordance with Formula (14). The first of these circular correlations is the correlation of $U_s$ and $U_0$, resulting in a correlation image $U_{x0}$. The second of these correlations is the correlation of $U_s$ and $U_1$, resulting in a correlation image $U_{x1}$.

At the next step 1205, the processor 105 determines maximum values in the correlation images $U_{x0}$ and $U_{x1}$. Then at the next step 1206, the processor 105 stores a salt bit in a buffer containing salt data, using the maximum values determined at step 1205. If the maximum value in image $U_{x0}$ is greater than the maximum value in image $U_{x1}$, then the salt bit stored in the buffer is a zero (0). Otherwise, the largest value in $U_{x1}$ is greater than the largest value in $U_{x0}$ and the salt bit stored in the buffer is a one (1). The buffer containing the salt data may be configured within memory 106. At the next step 1207, if the processor 105 determines that there are more salt squares to be processed, then the method 1200 returns to step 1202. Otherwise, the method 1200 concludes.
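Steps 1204 to 1206 may be sketched as follows for a single salt square. This is a minimal illustration only: the correlation is evaluated directly in the spatial domain, the two expected patterns are produced with Python's `random` module as a stand-in for the pseudo-random arrays, and the observed square is simulated by circularly shifting one of the patterns. The function names and the 16 by 16 square size are assumptions of the sketch.

```python
import random


def circular_correlation(i1, i2):
    """I_x(x, y) = sum over (x', y') of I1(x', y') * I2(x + x', y + y').

    I2 is treated as periodic, so indices wrap around the image edges.
    """
    h, w = len(i1), len(i1[0])
    return [[sum(i1[yp][xp] * i2[(y + yp) % h][(x + xp) % w]
                 for yp in range(h) for xp in range(w))
             for x in range(w)]
            for y in range(h)]


def read_salt_bit(u_s, u_0, u_1):
    """Return 0 if pattern 0 gives the larger correlation peak, otherwise 1."""
    peak0 = max(max(row) for row in circular_correlation(u_s, u_0))
    peak1 = max(max(row) for row in circular_correlation(u_s, u_1))
    return 0 if peak0 > peak1 else 1


# Expected patterns for one salt square (stand-ins for the real arrays).
rng0, rng1 = random.Random(10), random.Random(11)
u_0 = [[rng0.randint(0, 1) for _ in range(16)] for _ in range(16)]
u_1 = [[rng1.randint(0, 1) for _ in range(16)] for _ in range(16)]

# Simulated observation: pattern 1, circularly shifted to mimic the small
# residual misalignment left after coarse alignment.
u_s = [[u_1[(y + 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]
bit = read_salt_bit(u_s, u_0, u_1)
```

Taking the maximum over the whole correlation image, rather than its value at zero shift, is what makes the decision tolerant of a small residual misalignment.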

Fine Alignment

The method 1400A of generating an alignment pattern in the alignment pixels (e.g., 205) in the interior 202 of the protected document 200, as executed at step 404, for documents that do not have a dominant amount of one color, will now be described in more detail with reference to Fig. 14A. The method 1400B of generating an alignment pattern in the alignment pixels (e.g., 205) of the protected document 200, as executed at step 404, for documents that do have a dominant amount of one color, will also be described in more detail with reference to Fig. 14B. The method 1300 of determining a fine alignment warp map for the scanned image of the protected document 200, as executed at step 505, will also be described.

The fine alignment warp map is determined in the method 1300 using the alignment pattern generated in accordance with either of the methods 1400A or 1400B, depending on whether or not the document being processed has a dominant amount of one color.

The method 1400A may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The method 1400A comprises one step 1401, where the processor 105 encodes an alignment pattern into the pixels of the protected document 200. The alignment pattern used may be represented as a pseudo-random (i.e., noise) array of bits. For example, one of the pseudo-random arrays of bits described above may be used at step 1401. In this instance, at step 1401, the processor 105 may set the value of each alignment pixel (x, y) (e.g., 205) of the protected document 200 to the value of that array at (x, y). The alignment pattern may be distributed uniformly across the pixels in the interior 202 of the protected document 200. Alternatively, the alignment pattern may be distributed in one or more particular areas of the interior 202 of the protected document 200.

As an example, Fig. 28A shows a protected document 2800 before alignment pixels have been inserted. Fig. 28B shows the document 2800 with alignment pixels (e.g., 2805) inserted into the protected document 2800 using the method 1400A. As seen in Fig. 28B, the alignment pixels (e.g., 2805) have resulted in significant corruption of the protected document 2800.

As described above, the method 1400B of generating an alignment pattern in the alignment pixels of a protected document may be used for protected documents that do have a dominant amount of one color. A text document is an example of such a document. Text documents typically comprise 10% black pixels and 90% white pixels.

To describe the method 1400B, a pixel in the protected document 2800 of Fig. 28A may be denoted as d(x, y), a less frequent color (e.g., 2803) in the protected document 2800 may be denoted as $C_0$ and a more frequent color (e.g., 2804) in the protected document 2800 may be denoted as $C_1$. The method 1400B may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105. The method 1400B begins at the first step 1402, where the processor 105 selects an alignment pixel (x, y) (e.g., 2805) in the protected document 2800. At the next step 1403, if the corresponding protected document 2800 pixel d(x, y) is set to $C_0$, then the method 1400B proceeds to step 1404. Otherwise, the method 1400B proceeds to step 1405. At step 1404, the processor 105 sets the selected alignment pixel to $C_0$ and the method 1400B proceeds to step 1408.

At step 1405, if the protected document pixel d(x, y) is set to $C_1$ and one of the pixels adjacent to d(x, y) is set to $C_0$, then the method 1400B proceeds to step 1406. Otherwise, the method 1400B proceeds to step 1407. At step 1406, the processor 105 sets the alignment pixel (x, y) to $C_1$.

At step 1407, the alignment pixel (x, y) is set to the corresponding value of the alignment pattern at (x, y). At the next step 1408, if there are more pixels in the document 2800 to process, then the method 1400B returns to step 1402. Otherwise the method 1400B concludes.

As an example, Fig. 28C shows the document 2800 with alignment pixels (e.g., 2805) inserted using the method 1400B. As seen in Fig. 28C, image quality of the document 2800 is improved, as $C_0$-colored pixels (e.g., 2807) remain $C_0$-colored without corruption by $C_1$-colored alignment pixels. Also, the shape of $C_0$-colored regions is preserved, as a $C_1$-colored border is present around such regions.
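The per-pixel decision of steps 1403 to 1407 may be sketched as follows. The sketch assumes that "adjacent" means the eight surrounding pixels, that every document pixel is either $C_0$ or $C_1$, and that the alignment pattern is already expressed in terms of the two colors; the function name and these choices are illustrative only.

```python
def alignment_value_1400b(doc, x, y, c0, c1, pattern):
    """Value for the alignment pixel at (x, y), following steps 1403 to 1407."""
    if doc[y][x] == c0:
        # Steps 1403 and 1404: keep the less frequent color untouched.
        return c0
    h, w = len(doc), len(doc[0])
    neighbours = [(x + dx, y + dy)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dx, dy) != (0, 0)
                  and 0 <= x + dx < w and 0 <= y + dy < h]
    if any(doc[ny][nx] == c0 for nx, ny in neighbours):
        # Steps 1405 and 1406: preserve a border of the frequent color.
        return c1
    # Step 1407: elsewhere, use the alignment pattern.
    return pattern[y][x]


# Example: white page (1) with one black pixel (0); pattern of all zeros.
BLACK, WHITE = 0, 1
doc = [[WHITE] * 5 for _ in range(5)]
doc[2][2] = BLACK
pattern = [[BLACK] * 5 for _ in range(5)]
result = [[alignment_value_1400b(doc, x, y, BLACK, WHITE, pattern)
           for x in range(5)] for y in range(5)]
```

In the example, the black pixel and its white surround are left as they were, while the outer ring of pixels takes the pattern value.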

The method 1300 of determining a fine alignment warp map for the scanned image of the protected document 200, as executed at step 505, will now be described with reference to Fig. 13. The method 1300 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The fine alignment warp map is generated in preparation for verification of the protected document 200. The method 1400A of generating an alignment pattern in the alignment pixels of a document is used in the method 1300. The method 1400B is not used since the version of the protected document before alignment pixels have been inserted is not accessible.

The method 1300 uses the scanned image of the protected document 200, and the coarse alignment affine transform specified by the matrix A and the vector a according to Formula (11), and determines the warp map for the scanned image of the protected document 200. If the scanned image of the protected document 200 has alignment pixels generated in accordance with the method 1400B, the scanned image will be aligned against an alignment pattern which is slightly different from a reference image used to create the protected document 200. Fine alignment may still be able to be used to align the scanned image of the protected document 200 despite the minor differences in the alignment patterns.

The method 1300 begins at step 1302 where the processor 105 generates a coarsely-aligned image for the scanned image of the protected document. The coarsely-aligned image is generated from the scanned image using the coarse alignment affine transform specified by the matrix A and the vector a. The dimensions of the coarsely-aligned image are the same as the dimensions of the protected document 200.

A method 2900 of generating a coarsely-aligned image for the scanned image of the protected document 200, as executed at step 1302, will now be described with reference to Fig. 29. The method 2900 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 2900 begins at step 2901, where the processor 105 selects the coordinates for a first pixel position (i.e., a current pixel position) in the coarsely-aligned image. The coarsely-aligned image may be generated in memory 106, for example. At the next step 2903, the processor 105 transforms the selected coordinates (x, y) in the coarsely-aligned image using the coarse alignment affine transform, resulting in coordinates $A(x, y)^T + a$ for the selected pixel in the scanned image of the protected document 200.

Then at the next step 2905, the processor 105 interpolates the scanned image at the coordinates $A(x, y)^T + a$, using bicubic interpolation, resulting in a greyscale value. The resulting pixel value is stored in the coarsely-aligned image configured within memory 106 at the current pixel position. Then at the next step 2909, if the coarsely-aligned image is complete (i.e., all pixel values have been generated for the coarsely-aligned image), the method 2900 concludes. Otherwise, the method 2900 returns to step 2901 to select a next pixel position in the coarsely-aligned image.
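The loop of the method 2900 may be sketched as follows. For brevity the sketch substitutes bilinear interpolation for the bicubic interpolation named above, and clamps coordinates that fall outside the scanned image; both are simplifications made for illustration, and the function names are hypothetical.

```python
import math


def bilinear(img, x, y):
    """Interpolate img at non-integer (x, y), clamping to the image bounds."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def coarsely_align(scan, A, a, width, height):
    """Build the coarsely-aligned image by sampling scan at A (x, y)^T + a."""
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx = A[0][0] * x + A[0][1] * y + a[0]
            sy = A[1][0] * x + A[1][1] * y + a[1]
            out[y][x] = bilinear(scan, sx, sy)
    return out


# Example: a scan at twice the document resolution (A scales by 2).
scan = [[0, 10, 20], [30, 40, 50], [60, 70, 80]]
aligned = coarsely_align(scan, [[2, 0], [0, 2]], [0, 0], 2, 2)
```

Iterating over output pixels and sampling the scan, rather than pushing scan pixels forward, guarantees that every pixel of the coarsely-aligned image receives a value.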

Alternatively, the scanned image may first be blurred with a low-pass filter prior to execution of the method 2900. Blurring the scanned image of the protected document 200 using the low-pass filter may reduce the effects of aliasing introduced when a high-resolution scanned image is transformed to produce a lower-resolution coarsely-aligned image. Any suitable low-pass filter may be used to blur the scanned image. The selection of the low-pass filter may be based on the ratio between the resolution of the scanned image and the resolution of the protected document 200.

Following step 1302 of the method 1300, at the next step 1303, the processor 105 generates a reference image. A method 1500 for generating a reference image, as executed at step 1303, will now be described with reference to Fig. 15. The method 1500 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 1500 generates a temporary protected document with the same parameters (i.e., dimensions and salt value) as the protected document 200. The temporary protected document may be configured within memory 106. The temporary protected document may be used to generate the reference image. The protected document dimensions and salt value used in the method 1500 have been determined previously in steps 502 and 504 of the method 500.

The method 1500 begins at step 1501, where the processor 105 generates spirals for the corners of the temporary protected document, in a similar manner to the generation of the spirals for the protected document 200 at step 402 of the method 400. At the next step 1503, the processor 105 generates a border pattern for the temporary protected document, storing data in the border pixels of the temporary protected document, in a similar manner to the generation of the border pattern for the protected document 200 at step 403 of the method 400. Then at the next step 1504, the processor 105 generates an alignment pattern in the alignment pixels in an interior region of the temporary protected document, in a similar manner to the generation of the alignment pattern at step 404 of the method 400 for the protected document 200. Accordingly, at step 1504, all of the pixels in the temporary protected document have been assigned values, except for the document pixels and the protection pixels.

The method 1500 continues at the next step 1505 where the processor 105 generates the reference image, within memory 106, using the temporary protected document.

Initially the reference image is empty. When the pixels in the temporary document are "on", the corresponding pixel in the reference image is set to a value of +1, and when the pixels are "off", the corresponding pixel in the reference image is set to a value of -1.

For the document pixels and the protection pixels which have not been assigned values previously, the corresponding pixel in the reference image is given a value of 0. The method 1500 concludes following step 1505.

At step 1303, where spirals are printed at a higher resolution than the protected document resolution where the spirals are to be embedded, rather than dividing these pixels into F x F pixels, the pixels may be left undefined. In this instance, the pixels in the reference image corresponding to undefined pixels in the temporary protected document may be assigned the value 0.

At the next step 1304 of the method 1300, the processor 105 uses the coarsely-aligned image and the reference image to generate a displacement map $d_c$. The displacement map $d_c$ stores displacement vectors. Each displacement vector stored is associated with a location in the reference image, and measures the amount of shift between the reference image and the coarsely-aligned image at that location.

The displacement map $d_c$ may be generated at step 1304 using a tiled correlation method. The generation of the displacement map $d_c$ involves selection of a tile size 2Q and a step size P. The tile size and step size may be varied. Larger values of Q give more measurement precision, at the expense of averaging the increased precision over a larger spatial area, and possibly more processing time. Smaller values of step size P give more spatial detail. However, again, using smaller values of step size P may increase processing time. As an example, in one implementation Q = 96 and P = 16. This represents a tile of 192 pixels high by 192 pixels wide, stepped along the reference image and the coarsely-aligned image, in both horizontal and vertical directions, in 16 pixel increments.

Fig. 16A shows a correlation tile 1603 of the reference image 1610, which may be used in step 1304. The correlation tile 1603 has a corresponding correlation tile 1604 in the coarsely-aligned image 1620, as seen in Fig. 16B. Both of the correlation tiles 1603 and 1604 have vertical and horizontal dimensions equal to 2Q, shown as 1601. The correlation tiles 1603 and 1604 are stepped in horizontal and vertical increments according to the step size P, shown as 1602.
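The stepping of the correlation tiles may be illustrated with a short sketch that enumerates tile centres. It assumes that the first tile is placed at the top-left corner of the image and that only tiles lying entirely inside the image are used; the function name is hypothetical.

```python
def tile_centres(width, height, q, p):
    """Centres of 2Q x 2Q tiles stepped by P, left to right, row by row."""
    xs = range(0, width - 2 * q + 1, p)   # left edges of the tiles
    ys = range(0, height - 2 * q + 1, p)  # top edges of the tiles
    return [(x0 + q, y0 + q) for y0 in ys for x0 in xs]


# Example: a 256 x 256 image with Q = 96 and P = 16.
centres = tile_centres(256, 256, 96, 16)
```

With these example values there are five tile positions along each axis, giving 25 tiles whose centres lie on a regular grid with spacing P.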

A method 1700 of generating a displacement map $d_c$, as executed at step 1304, will now be described with reference to Fig. 17. The method 1700 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 1700 begins at step 1702, where the processor 105 divides the reference image 1610 and the coarsely-aligned image 1620 into overlapping tiles as described with reference to Fig. 16 and iterates through the tiles in both images 1610 and 1620. On a first execution of step 1702, top-left corner tiles 1603 and 1604 from both the reference image 1610 and the coarsely-aligned image 1620, respectively, are selected. On subsequent executions of step 1702, subsequent pairs of corresponding tiles are selected, from left to right in each row of tiles, starting with a first row of tiles (e.g., 1615), and finishing at a bottom row of tiles. The tile 1603 selected at step 1702 from the reference image may be denoted as $T_1$ and the selected tile 1604 from the coarsely-aligned image may be denoted as $T_2$. Furthermore, the coordinates of the centre of the tiles 1603 and 1604 may be denoted as (x, y).

Once the pair of corresponding tiles $T_1$ and $T_2$ has been selected at step 1702, at a next step 1703, the selected tiles $T_1$ and $T_2$ are windowed. The tiles $T_1$ and $T_2$ may be windowed at step 1703 by a Hanning window in a vertical direction, and a Hanning window in a horizontal direction. At the next step 1704, the selected tiles $T_1$ and $T_2$ are then circular phase correlated to generate a correlation image for the selected tiles. The correlation image for the selected tiles may be configured within memory 106. The circular phase correlation is performed at step 1704 via the frequency domain. A method 3000 of generating a correlation image for the selected tiles as executed at step 1704 will now be described with reference to Fig.

The method 3000 begins at the first step 3001, where the processor 105 transforms the selected tiles $T_1$ and $T_2$ using a Fast Fourier Transform (FFT), to generate tiles $\hat{T}_1$ and $\hat{T}_2$. At the next step 3003, the processor 105 multiplies the tile $\hat{T}_1$ by the complex conjugate of the tile $\hat{T}_2$ to generate a tile $\hat{T}_x$. Then at the next step 3005, the processor 105 normalises the coefficients of the tile $\hat{T}_x$ so that each coefficient has unit magnitude. The method 3000 concludes at the next step 3007, where the inverse FFT of the tile $\hat{T}_x$ is determined, to generate the correlation image $T_x$ for the tiles $T_1$ and $T_2$ selected at step 1702. The correlation image $T_x$ is an array of dimensions 2Q by 2Q of real values and may be configured within memory 106.
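The four steps of the method 3000 may be sketched as follows on a small tile. The sketch uses a naive discrete Fourier transform in place of an FFT so that it needs only the standard library, omits the windowing of step 1703, and locates the peak to whole-pixel accuracy only. The function names and the 8 by 8 tile size are assumptions of the sketch.

```python
import cmath
import random


def dft2(tile, inverse=False):
    """Naive 2-D discrete Fourier transform of a square tile."""
    n = len(tile)
    sign = 1 if inverse else -1
    scale = n * n if inverse else 1
    out = []
    for v in range(n):
        row = []
        for u in range(n):
            acc = 0j
            for y in range(n):
                for x in range(n):
                    angle = 2 * cmath.pi * (u * x + v * y) / n
                    acc += tile[y][x] * cmath.exp(sign * 1j * angle)
            row.append(acc / scale)
        out.append(row)
    return out


def phase_correlate(t1, t2):
    """Circular phase correlation of two equally sized square tiles."""
    n = len(t1)
    f1, f2 = dft2(t1), dft2(t2)                       # step 3001
    prod = [[f1[v][u] * f2[v][u].conjugate()          # step 3003
             for u in range(n)] for v in range(n)]
    unit = [[c / abs(c) if abs(c) > 1e-12 else 0j     # step 3005
             for c in row] for row in prod]
    return [[c.real for c in row]                     # step 3007
            for row in dft2(unit, inverse=True)]


def peak_location(corr):
    """(x, y) of the largest value in the correlation image."""
    n = len(corr)
    return max(((x, y) for y in range(n) for x in range(n)),
               key=lambda p: corr[p[1]][p[0]])


# Example: t2 is t1 circularly shifted by 3 pixels in x and 2 pixels in y.
rng = random.Random(0)
n = 8
t1 = [[rng.random() for _ in range(n)] for _ in range(n)]
t2 = [[t1[(y - 2) % n][(x - 3) % n] for x in range(n)] for y in range(n)]
peak = peak_location(phase_correlate(t1, t2))
```

With the conjugate taken on the second tile, as in step 3003, the peak of this example appears at the negated shift modulo the tile size, that is at ((-3) mod 8, (-2) mod 8). Because every coefficient is normalised to unit magnitude, the peak is sharp regardless of the tile's contrast.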

Returning to the method 1700, at the next step 1705, the processor 105 processes the correlation image $T_x$ to determine a displacement vector representing the location, denoted $(\Delta x, \Delta y)^T$, of a highest peak in the correlation image $T_x$ to sub-pixel accuracy. A method 2100 of determining the location of the highest peak in the correlation image $T_x$ to sub-pixel accuracy, as executed at step 1705, will be described below with reference to Fig. 21. The location of the peak represented by the displacement vector $(\Delta x, \Delta y)^T$ in the correlation image $T_x$ measures the amount of shift between the tiles $T_1$ and $T_2$, and hence the displacement, or warping, between the reference image and the coarsely-aligned image in the vicinity of $T_1$ and $T_2$.

The method 1700 continues at the next step 1706, where the processor 105 stores the location of the highest peak $(\Delta x, \Delta y)^T$ in the displacement map $d_c$ at the location of the centre of the selected tiles. At step 1706, the processor 105 assigns $d_c(x, y) = (\Delta x, \Delta y)^T$, where (x, y) represents the coordinates of the centre of the tiles $T_1$ and $T_2$. However, if a peak in the correlation image $T_x$ could not be determined at step 1705, no peak location is stored in the displacement map $d_c(x, y)$.

At the next step 1707, if the processor 105 determines that there are more tiles in the reference image and the coarsely-aligned image to be processed, then the method 1700 returns to step 1702. Otherwise, the method 1700 concludes.

The displacement map $d_c$ generated in accordance with the method 1700 is defined at some locations, where the possible locations (x, y) are the centres of correlation tiles. Since the tiles were stepped with a horizontal and vertical increment of step size P, the displacement map $d_c$ may be defined at a set of points lying in a regular grid with spacing P.

Since the tiles (e.g., 1603, 1604) used for correlations in the method 1700 are overlapping, some of the calculations performed in determining the FFT of previous tiles may be reused when calculating the FFT of subsequent tiles. This may increase the speed of the fine alignment. An alternative method 1900 for determining the Fast Fourier Transform (FFT) of correlation tiles, as executed at steps 1703 and 1704, will now be described with reference to Figs. 18 and 19.

Fig. 18 shows two overlapping tiles 1801 and 1802. The tile 1801 is shaded with north-easterly lines and the tile 1802 is shaded with south-easterly lines. A region 1803 as shown in Fig. 18 represents the overlap of the tiles 1801 and 1802. The amount of overlap of the tiles 1801 and 1802 represented by the region 1803 is equal to 2Q - P columns, where 2Q represents the tile size and P represents the step size as described above.

The method 1900 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The method 1900 begins at step 1902, where if the processor 105 determines that the tiles $T_1$ and $T_2$ overlap with the tiles $T_1$ and $T_2$ from a previous execution of the loop (i.e., defined by steps 1702 to 1707) of the method 1700, the method 1900 proceeds to step 1904. Otherwise, the method 1900 proceeds to step 1903. At step 1903, each column of the tiles $T_1$ and $T_2$ is windowed vertically, and then a vertical FFT is applied to the tiles $T_1$ and $T_2$, resulting in processed data for $T_1$ and $T_2$. At the next step 1906, the method 1900 stores the right-most 2Q - P columns of processed data from both of the tiles $T_1$ and $T_2$ in a cache of processed columns configured within memory 106. Any data in the cache may be overwritten at step 1906. The method 1900 concludes at the next step 1907, where the processor 105 windows and applies a horizontal FFT to each row of the processed data for the tiles $T_1$ and $T_2$. Data resulting from step 1907 represents a two-dimensional windowed FFT of the tiles $T_1$ and $T_2$.

At step 1904, there is no need to determine the leftmost 2Q - P columns of processed data. Rather, these columns of data may be copied out of the cache of processed columns configured within memory 106. Then at the next step 1905, the processor 105 applies the window and vertical FFT to each of the remaining P columns of the tiles $T_1$ and $T_2$. Following step 1905, the method 1900 proceeds to the step 1906 and the method 1900 concludes.
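The column reuse of the method 1900 may be sketched as follows for a single image and a single row of tiles. The sketch uses a naive one-dimensional transform in place of an FFT and keeps all columns of the most recent tile in the cache, which is a superset of the right-most 2Q - P columns described above. The class name, the periodic form of the Hanning window and the small sizes are assumptions made for illustration.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (one common definition)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def dft(values):
    """Naive 1-D discrete Fourier transform, adequate for a small example."""
    n = len(values)
    return [sum(values[k] * cmath.exp(-2j * cmath.pi * u * k / n)
                for k in range(n)) for u in range(n)]


class TileTransformer:
    """Windowed 2-D transforms of tiles in one row of tiles, reusing columns."""

    def __init__(self, image, y0, size):
        self.image, self.y0, self.size = image, y0, size
        self.window = hann(size)
        self.cache = {}  # absolute column index -> processed column
        self.columns_computed = 0

    def _column(self, x):
        if x not in self.cache:
            # Window the column vertically, then apply the vertical transform.
            col = [self.image[self.y0 + k][x] * self.window[k]
                   for k in range(self.size)]
            self.cache[x] = dft(col)
            self.columns_computed += 1
        return self.cache[x]

    def transform(self, x0):
        cols = [self._column(x0 + i) for i in range(self.size)]
        # Overwrite the cache with the columns of the current tile only.
        self.cache = {x0 + i: cols[i] for i in range(self.size)}
        # Window horizontally and transform each row of the processed data.
        return [dft([cols[i][v] * self.window[i] for i in range(self.size)])
                for v in range(self.size)]


# Example: tiles of size 8 stepped by P = 2 across a 12-column image.
image = [[(3 * x + 5 * y + x * y) % 7 for x in range(12)] for y in range(8)]
transformer = TileTransformer(image, 0, 8)
first = transformer.transform(0)
computed_after_first = transformer.columns_computed
second = transformer.transform(2)
computed_after_second = transformer.columns_computed
```

The vertical window and transform of a column do not depend on where the column sits within a tile, which is why those results can be shared; the horizontal window does depend on position, so it is applied afresh for each tile.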

Returning to the method 1300 of Fig. 13, following the generation of the displacement map $d_c$ at step 1304, the following steps of the method 1300 may use the displacement map $d_c$ to generate a warp map $w_c$. The warp map $w_c$ maps each pixel in the printed version of the protected document 200 to a location in the coordinate space of the scanned image of the protected document 200. Some parts of the warp map $w_c$ may map pixels in the protected document 200 to coordinates outside the scanned image, since the scanner 119 may not have scanned the entire printed version of the protected document 200.

If (x, y) are the coordinates of a pixel in the reference image, then the displacement map $d_c(x, y)$ represents the shift to a corresponding location in the coarsely-aligned image. Therefore, the corresponding coordinates in the coarsely-aligned image may be determined as $(x, y)^T + d_c(x, y)$. Applying the coarse alignment affine transform to these coordinates provides the coordinates in the scanned image. The warp map $w_c$ maps each pixel (x, y) in the protected document 200 to a location in the coordinate space of the scanned image of the protected document 200 in accordance with Formula (15) as follows:

$$w_c(x, y) = A\left((x, y)^T + d_c(x, y)\right) + a \qquad (15)$$

However, the displacement map $d_c(x, y)$ is only defined at a few places, namely the locations of the centres of some correlation tiles (e.g., 1603 and 1604). In order to determine a value for Formula (15) at the locations of all pixels of the protected document 200, the displacement map $d_c$ is interpolated.

The method 1300 continues at the next step 1305, where the processor 105 determines an affine transform defined by a matrix G and vector g. The affine transform determined at step 1305 may be referred to as a gross approximation affine transform. The gross approximation affine transform approximates the warp map $w_c$ with an affine transform. The error function to be minimized in determining the affine transform is the Euclidean norm measure E that may be defined according to Formula (16) as follows:

$$E = \sum_{(x, y)} \left\lVert G (x, y)^T + g - w_c(x, y) \right\rVert^2 \qquad (16)$$

Formula (16) may be solved using least squares minimisation methods to determine the affine transform in accordance with Formula (17) as follows:

$$\begin{pmatrix} G & g \end{pmatrix} = \left( \sum_{(x, y)} w_c(x, y) \begin{pmatrix} x & y & 1 \end{pmatrix} \right) \left( \sum_{(x, y)} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \begin{pmatrix} x & y & 1 \end{pmatrix} \right)^{-1} \qquad (17)$$

For both Formulae (16) and (17), the sums are taken over all coordinate pairs (x, y) where the displacement map $d_c(x, y)$ is defined, and hence the warp map $w_c(x, y)$ is defined, via Formula (15).

At the next step 1306 of the method 1300, the processor 105 removes the gross approximation affine transform from the warp map $w_c$ to generate a modified warp map $w_c'$ in accordance with Formula (18) as follows:

$$w_c'(x, y) = w_c(x, y) - G (x, y)^T - g \qquad (18)$$

where the modified warp map $w_c'$ is defined at coordinates (x, y) at which $d_c(x, y)$ is defined. Thus, the modified warp map is defined at some points (x, y) that lie on the grid formed by the centres of the correlation tiles (e.g., 1603, 1604).
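Steps 1305 and 1306 may be sketched as follows. The sketch solves the least squares problem of Formula (16) through the normal equations, one row of the affine transform at a time, and then subtracts the fitted transform as in Formula (18). The function names, the sample points and the "true" transform used to generate test data are hypothetical.

```python
def solve3(m, b):
    """Solve the 3x3 linear system m * x = b by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for c in range(3):
        pivot = max(range(c, 3), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(3):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(samples):
    """Least squares fit of w ~ G (x, y)^T + g.

    samples is a list of ((x, y), (wx, wy)) pairs, one per location at which
    the warp map is defined. Returns (G, g).
    """
    s = [[0.0] * 3 for _ in range(3)]   # sum of p p^T with p = (x, y, 1)
    bx, by = [0.0] * 3, [0.0] * 3       # sums of wx * p and wy * p
    for (x, y), (wx, wy) in samples:
        p = (x, y, 1.0)
        for i in range(3):
            bx[i] += wx * p[i]
            by[i] += wy * p[i]
            for j in range(3):
                s[i][j] += p[i] * p[j]
    r0, r1 = solve3(s, bx), solve3(s, by)
    return [[r0[0], r0[1]], [r1[0], r1[1]]], [r0[2], r1[2]]


def remove_affine(samples, G, g):
    """Subtract the fitted affine transform, leaving the residual warp."""
    return {(x, y): (wx - (G[0][0] * x + G[0][1] * y + g[0]),
                     wy - (G[1][0] * x + G[1][1] * y + g[1]))
            for (x, y), (wx, wy) in samples}


# Example: samples generated from a known affine transform.
true_G, true_g = [[1.01, 0.02], [-0.03, 0.99]], [5.0, -7.0]
points = [(0, 0), (16, 0), (0, 16), (16, 16), (32, 16)]
samples = [((x, y),
            (true_G[0][0] * x + true_G[0][1] * y + true_g[0],
             true_G[1][0] * x + true_G[1][1] * y + true_g[1]))
           for x, y in points]
G, g = fit_affine(samples)
residual = remove_affine(samples, G, g)
```

Removing the affine part first means that the subsequent interpolation only has to spread small residual values, so crude extrapolation near the document edges introduces little error.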

The method 1300 continues at the next step 1307, where the processor 105 interpolates the modified warp map $w_c'$, so that the modified warp map $w_c'$ is defined at all pixel coordinates (x, y) in the protected document 200. A method 2000 of interpolating a mapping, as executed at step 1307, will be described in detail below with reference to Fig. 20. At the next step 1308, the processor 105 then reapplies the previously removed gross approximation affine transform to the modified warp map to generate the warp map $w_c$ in accordance with Formula (19) as follows:

$$w_c(x, y) = w_c'(x, y) + G (x, y)^T + g \qquad (19)$$

The warp map is now defined at all pixels in the protected document 200 and may be denoted w. The method 1300 concludes following step 1308.

6.1 Map Interpolation

The method 2000 of interpolating a mapping, as executed at step 1307 in relation to the modified warp map $w_c'$, and as executed at step 1313 in relation to the displacement map $d_c$, will be described in detail below with reference to Fig. 20. The method 2000 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 2000 uses a mapping m defined at the centre of one or more correlation tiles (e.g., 1603 and 1604). The mapping m is the modified warp map $w_c'$ as determined at step 1306. The mapping m is interpolated in accordance with the method 2000 to be defined at coordinates (x, y) for all pixels in the protected document 200.

The method 2000 begins at step 2002 where the processor 105 generates a low-resolution mapping $m_L$ within memory 106 and initializes the values of the mapping $m_L$. At step 2002, the mapping $m_L$ is defined at coordinates (x, y) where m is defined, and is assigned the same values as m at those points. Thus, the mapping $m_L$ is defined at some of the points at the centres of correlation tiles. The centres of the correlation tiles form a grid with a spacing equal to the tile step size, P.

A set of points referred to as "gridpoints" may be defined. The gridpoints comprise the points that are the centres of correlation tiles, and additionally include other points which are not at the centre of a correlation tile. These other points may be obtained by extending the regular grid formed by the tile centres. Gridpoints may be defined as those points (x, y) in the extended grid whose coordinates lie in the range as follows:

$$-2P \le x \le W_p + 2P, \qquad -2P \le y \le H_p + 2P \qquad (21)$$

With gridpoints defined as above, the coordinates of the gridpoints may be determined in accordance with Formula (22) as follows:

$$(x, y) = (Q + XP,\; Q + YP) \qquad (22)$$

where X and Y are integers lying in the ranges, given by Formulas (23) and (24), for which the point (x, y) satisfies Formula (21).

The value of points in the mapping m_L at each of the gridpoints (x, y) may be determined in accordance with steps 2003 to 2007 described below. The mapping m_L was defined where m is defined in step 2002. At step 2003, the method 2000 begins a loop (defined by steps 2003 to 2006) that determines the remaining values of the mapping m_L. At step 2003, if the processor 105 determines that the mapping m_L has been defined at all gridpoints (x, y), then the method 2000 continues at the next step 2007. Otherwise, the method 2000 proceeds to step 2004.

At step 2004, the processor 105 determines the coordinates of all undefined gridpoints that are adjacent to (i.e., neighbour) defined gridpoints. Then at step 2005, the processor 105 determines values for each of the gridpoints found in step 2004. The value for each such gridpoint is set to the average of the values of the low resolution mapping m_L at the adjacent defined gridpoints. Then at the next step 2006, the values determined at step 2005 are stored in the low resolution mapping m_L configured within memory 106. The method 2000 then returns to step 2003.

As described above, at step 2003, if the processor 105 determines that the low resolution mapping m_L has been defined at all gridpoints (x, y), then the method 2000 continues at the next step 2007. At step 2007, the low resolution mapping m_L has been determined at all gridpoints, and may be used to interpolate the mapping m. At step 2007, the mapping m is interpolated at all protected document pixel coordinates (x, y) using bi-cubic interpolation on the mapping m_L.
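The loop of steps 2003 to 2006 may be sketched as follows. This is a minimal illustration only: it assumes that "adjacent" means the eight surrounding gridpoints, that the mapping is stored as a NumPy array with a boolean mask of defined gridpoints, and it omits the final bi-cubic interpolation of step 2007. The name `fill_gridpoints` is hypothetical.

```python
import numpy as np

def fill_gridpoints(values, defined):
    """Fill undefined gridpoints by averaging adjacent defined gridpoints.

    values:  (rows, cols, d) array holding the low resolution mapping.
    defined: (rows, cols) boolean mask, True where the mapping is defined.
    Returns a filled copy of values.
    """
    values = np.array(values, dtype=float)
    defined = np.array(defined, dtype=bool)
    if not defined.any():
        raise ValueError("at least one gridpoint must be defined")
    rows, cols = defined.shape
    # Assumption: adjacency is the 8-neighbourhood.
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    while not defined.all():                      # step 2003
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if defined[r, c]:
                    continue
                neighbours = [values[r + dr, c + dc] for dr, dc in offsets
                              if 0 <= r + dr < rows and 0 <= c + dc < cols
                              and defined[r + dr, c + dc]]
                if neighbours:                    # step 2004
                    updates[(r, c)] = np.mean(neighbours, axis=0)  # step 2005
        for (r, c), value in updates.items():     # step 2006
            values[r, c] = value
            defined[r, c] = True
    return values
```

Values found in one pass are stored only after the pass completes, so each pass uses only gridpoints that were defined before it started.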

6.2 Peak detection

The method 2100 of determining the location (Δx, Δy) of a highest peak in the correlation image T_x to sub-pixel accuracy, as executed at step 1705, will now be described with reference to Fig. 21. The location (Δx, Δy) of the highest peak in the correlation image T_x represents the shift between the two tiles T_1 and T_2 being correlated.

The method 2100 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 2100 analyses the correlation image T_x and determines the location (Δx, Δy) of the highest peak in the correlation image T_x to sub-pixel accuracy. The method 2100 selects an initial peak height threshold H_i and a peak height ratio H_r. The initial peak height threshold H_i and the peak height ratio H_r parameters may be varied. Increasing the initial peak height threshold H_i decreases the number of peaks considered acceptable. Decreasing the peak height ratio H_r increases the speed of execution of the method 2100 and also increases the chance that a wrong peak will be selected as the highest peak. The initial peak height threshold H_i and the peak height ratio H_r parameters may be set to H_i = 0.1 and H_r = 4.

The method 2100 begins at step 2102, where the processor 105 determines all "peaks" in the correlation image T_x. A "peak" is a pixel in the correlation image T_x with coordinates (x_0, y_0), whose pixel value T_x(x_0, y_0) is larger than the values of the eight neighbouring pixels of the pixel. Pixels on the edges of the correlation image T_x may also be regarded as having eight neighbours, since the correlation image T_x uses periodic boundary conditions. Pixels on the left edge may be regarded as adjacent to the corresponding pixels on the right edge, and similarly the pixels on the top edge may be regarded as adjacent to the corresponding pixels on the bottom edge. The peaks in the correlation image T_x may be stored in a list configured within memory 106. The peaks may be stored in the list in decreasing order of peak pixel value.

Each peak in the peak list has integer coordinates (x_0, y_0). These coordinates (x_0, y_0) provide a good first approximation to the shift between the reference and coarsely-aligned images. However, to obtain sub-pixel accurate coordinates (Δx, Δy) for the location of the highest peak, the correlation image T_x is interpolated in the vicinity of each peak. The method 2100 processes each peak in the peak list, and interpolates the correlation image T_x to determine the sub-pixel accurate peak location.
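The peak search of step 2102 may be sketched as follows, assuming the correlation image is a 2D NumPy array indexed as `corr[y, x]`. The function name `find_peaks` and the tuple layout of the returned list are assumptions made for illustration.

```python
import numpy as np

def find_peaks(corr):
    """Return peaks as (x0, y0, value), sorted by decreasing value.

    A pixel is a peak when it is strictly larger than all eight
    neighbours, with neighbours taken under periodic boundary conditions.
    """
    corr = np.asarray(corr, dtype=float)
    is_peak = np.ones(corr.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) != (0, 0):
                # np.roll wraps around, which gives the periodic neighbours.
                is_peak &= corr > np.roll(corr, (dy, dx), axis=(0, 1))
    ys, xs = np.nonzero(is_peak)
    order = np.argsort(-corr[ys, xs], kind="stable")
    return [(int(xs[i]), int(ys[i]), float(corr[ys[i], xs[i]])) for i in order]
```

Because of the wrap-around, a maximum in a corner of the correlation image is detected in the same way as one in the interior.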

Also at step 2102, a variable H_t is initialized to the value of the initial peak height threshold H_i. At the next step 2103, the processor 105 iterates over all of the peaks in the peak list. On the first execution of step 2103, a first peak in the peak list is selected. On subsequent executions of step 2103, subsequent peaks in the peak list are selected. At step 2104, the value of the peak pixel T_x(x_0, y_0) selected at step 2103 is analysed by the processor 105 to determine whether the peak pixel value T_x(x_0, y_0) multiplied by the peak height ratio H_r is larger than the current peak height threshold H_t. That is, the processor 105 determines whether:

$$T_x(x_0, y_0) \times H_r > H_t$$

If the peak pixel value T_x(x_0, y_0) multiplied by the peak height ratio H_r is larger than the current peak height threshold H_t, then the method 2100 proceeds to step 2105. Otherwise, the method 2100 concludes.

At step 2105, the processor 105 selects a sub-region, h, of the correlation image T_x. The sub-region, h, has a width and height of 2Z pixels, where Z = 8. The sub-region h is centred at the coordinates (x_0, y_0) of the peak selected at step 2103. The value of the sub-region, h, may be determined in accordance with Formula (26) as follows:

$$h(x, y) = T_x(x_0 + x - Z,\; y_0 + y - Z) \qquad (26)$$

for x and y in the range 0 to 2Z − 1, where the values of the correlation image T_x outside the image are obtained by again applying periodic boundary conditions to the correlation image T_x. That is, the values of the correlation image T_x outside the image are obtained by making the correlation image periodic. At step 2105, the selected sub-region, h, is then transformed with the Fast Fourier Transform (FFT) to determine a transformed image ĥ.

The transformed image, ĥ, is then used at the next step 2106, where the processor 105 interpolates the correlation image T_x in the vicinity of the peak (x_0, y_0) to determine an approximation of the location of the peak. The correlation image T_x may be interpolated at twenty-five (25) points, where the x and y coordinates may be determined as follows:

$$x \in \{x_0 - 0.5,\; x_0 - 0.25,\; x_0,\; x_0 + 0.25,\; x_0 + 0.5\}$$

$$y \in \{y_0 - 0.5,\; y_0 - 0.25,\; y_0,\; y_0 + 0.25,\; y_0 + 0.5\}$$

The interpolation performed at step 2106 is Fourier interpolation and is executed using Formula (27) as follows:

$$T_x(x_0 + \delta_x,\; y_0 + \delta_y) = h(Z + \delta_x,\; Z + \delta_y) = \frac{1}{(2Z)^2} \sum_{k=-Z}^{Z} \sum_{n=-Z}^{Z} \hat{h}(k, n)\, \beta_k(Z + \delta_x)\, \beta_n(Z + \delta_y) \qquad (27)$$

where β is defined as follows:

$$\beta_k(x) = \begin{cases} e^{j\pi k x / Z} & \text{if } |k| < Z \\ \tfrac{1}{2}\, e^{j\pi k x / Z} & \text{if } |k| = Z \end{cases} \qquad (28)$$

A better approximation to the peak location may be found using the value of (x_1, y_1) at which the interpolated value T_x(x_1, y_1) is largest.
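Fourier interpolation of the sub-region h at a fractional position may be sketched as follows. This is an illustrative sketch under stated assumptions: h is a square NumPy array of side 2Z indexed as `h[x, y]`, the transform is NumPy's unnormalised `fft2` (hence the division by (2Z)²), and the two frequencies k = −Z and k = Z share a single FFT coefficient with weight one half each. The function name `fourier_interpolate` is hypothetical.

```python
import numpy as np

def fourier_interpolate(h, px, py):
    """Evaluate the trigonometric interpolant of h at (px, py).

    h: (2Z, 2Z) real array of samples, indexed h[x, y].
    px, py: possibly fractional coordinates in sample units.
    """
    h = np.asarray(h, dtype=float)
    n = h.shape[0]
    if h.shape != (n, n) or n % 2 != 0:
        raise ValueError("h must be square with an even side length")
    Z = n // 2
    H = np.fft.fft2(h)                      # transformed image
    k = np.arange(-Z, Z + 1)                # frequencies -Z .. Z
    weight = np.ones(2 * Z + 1)
    weight[0] = weight[-1] = 0.5            # half weight at |k| = Z
    bx = weight * np.exp(1j * np.pi * k * px / Z)
    by = weight * np.exp(1j * np.pi * k * py / Z)
    Hs = H[np.ix_(k % n, k % n)]            # negative k wraps periodically
    return float((bx @ Hs @ by).real) / n ** 2
```

Evaluating the function at the twenty-five offsets around a peak and keeping the largest value gives the improved estimate (x_1, y_1) described above.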

At the next step 2107, the processor 105 determines a sub-pixel accurate estimate of the location (x_2, y_2) of the selected peak. The interpolated correlation image T_x may be approximated by a bi-parabolic function, f, in a region close to (x_1, y_1). A bi-parabolic function f has a form in accordance with Formula (29) as follows:

$$f(x, y) = a_0 x^2 + a_1 x y + a_2 y^2 + a_3 x + a_4 y + a_5 \qquad (29)$$

The coefficients (a_0, a_1, ..., a_5) that make f(x − x_1, y − y_1) approximately equal to the interpolated image T_x(x, y) when x and y are close to x_1 and y_1 respectively, may be determined in order to determine the sub-pixel accurate estimate of the location of the selected peak. Equivalently, the function f(x, y) may be approximated to T_x(x_1 + x, y_1 + y) when x and y are small. The coefficients (a_0, a_1, ..., a_5) may be determined in accordance with Formula (31) below in order to minimize E in accordance with Formula (30) as follows:

$$E = \int_{-0.125}^{0.125} \int_{-0.125}^{0.125} \bigl( T_x(x_1 + x,\; y_1 + y) - f(x, y) \bigr)^2 \, dx \, dy \qquad (30)$$

$$\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{pmatrix} = \sum_{k=-Z}^{Z} \sum_{n=-Z}^{Z} \hat{h}(k, n)\, \exp\bigl( j\pi (k x_h + n y_h) / Z \bigr)\, v_{k,n} \qquad (31)$$

where x_h = x_1 − x_0 + Z and y_h = y_1 − y_0 + Z, and where the v_{k,n} are constant vectors. The constant vectors v_{k,n} may be determined in accordance with a method 3100, which will now be described with reference to Fig. 31.

The method 3100 of determining the constant vectors v_{k,n}, as executed at step 2107, may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 3100 begins at step 3101, where the processor 105 determines the matrix V defined in accordance with Formula (32) as follows:

$$V = \int_{-0.125}^{0.125} \int_{-0.125}^{0.125} p\, p^{T} \, dx \, dy, \qquad p = \begin{pmatrix} x^2 & xy & y^2 & x & y & 1 \end{pmatrix}^{T} \qquad (32)$$

Each element in the matrix V is the integral of a polynomial in x and y, and may be determined analytically. Then at the next step 3103, the processor 105 determines the values of the constant vectors v_{k,n} in accordance with Formula (33) as follows:

$$v_{k,n} = \frac{1}{(2Z)^2}\, V^{-1} \int_{-0.125}^{0.125} \int_{-0.125}^{0.125} \beta_k(x)\, \beta_n(y)\, p \, dx \, dy \qquad (33)$$

Each element in the constant vectors v_{k,n} is the integral of an exponential in x and y multiplied by a polynomial in x and y, and may be evaluated analytically. The method 3100 concludes after step 3103.

The sub-pixel accurate peak location (x_2, y_2) may be set to the position of the maximum value of the bi-parabolic function f. The sub-pixel accurate peak location (x_2, y_2) may be determined in accordance with Formula (34) as follows:

$$\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + \frac{1}{a_1^2 - 4 a_0 a_2} \begin{pmatrix} 2 a_2 a_3 - a_1 a_4 \\ 2 a_0 a_4 - a_1 a_3 \end{pmatrix} \qquad (34)$$

The height of the selected peak, H, in the interpolated correlation image T_x is also determined at step 2107 in accordance with Formula (35) as follows:

$$H = f(x_2 - x_1,\; y_2 - y_1) \qquad (35)$$

The method 2100 continues at the next step 2108, where the processor 105 determines whether the height of the selected peak, H, at the location (x_2, y_2) determined at step 2107 is the largest peak determined in a current execution of the method 2100. If the height of the selected peak, H, at the location (x_2, y_2) is larger than the current peak height threshold H_t, then the location (x_2, y_2) represents the location of the highest peak found in the current execution of the method 2100. In this instance, the current peak height threshold H_t is assigned a new value of the selected peak height H, and the sub-pixel accurate coordinates (Δx, Δy) representing the location of the highest peak in the correlation image T_x are assigned the value of the location (x_2, y_2) determined at step 2107. Otherwise, if the height of the selected peak H is not larger than the current peak height threshold H_t, no highest peak location was found in the current iteration of the loop defined by steps 2103 to 2108.

The method 2100 continues at the next step 2109, where if the processor 105 determines that there are more peaks in the peak list, then the method 2100 returns to step 2103. Otherwise, the method 2100 concludes.
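The location and height of the maximum of a bi-parabolic function follow from setting both partial derivatives of f to zero. The sketch below illustrates that calculation, assuming the six coefficients have already been obtained; the function name `biparabolic_peak` and the check that a maximum exists are additions made for illustration.

```python
def biparabolic_peak(coeffs, x1, y1):
    """Return (x2, y2, H) for f(x, y) = a0*x^2 + a1*x*y + a2*y^2 + a3*x + a4*y + a5.

    (x1, y1) is the point about which f was fitted; the returned location is
    the maximum of f shifted back into image coordinates, and H is its height.
    """
    a0, a1, a2, a3, a4, a5 = coeffs
    det = a1 * a1 - 4.0 * a0 * a2
    # A maximum requires a negative definite quadratic part.
    if det >= 0 or a0 >= 0:
        raise ValueError("the fitted surface has no maximum")
    dx = (2.0 * a2 * a3 - a1 * a4) / det
    dy = (2.0 * a0 * a4 - a1 * a3) / det
    height = a0 * dx * dx + a1 * dx * dy + a2 * dy * dy + a3 * dx + a4 * dy + a5
    return x1 + dx, y1 + dy, height
```

For example, the surface −(x − 0.1)² − (y + 0.05)² + 2 fitted about (3, 7) has its maximum at (3.1, 6.95) with height 2.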

During the execution of the method 2100, no highest peak may be found. For example, if at every execution of step 2108 the height of the selected peak, H, at the location (x_2, y_2) is not larger than the current peak height threshold H_t, then the sub-pixel accurate coordinates (Δx, Δy) will not be set to any given values. However, if step 2108 did find a highest peak, then the values of the sub-pixel accurate coordinates (Δx, Δy) represent the location of the highest peak.
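The control flow of steps 2103 to 2109 may be summarised by the following sketch. It assumes the peak list holds (x_0, y_0, value) tuples in decreasing order of value, and that a caller-supplied function `refine` (a hypothetical stand-in for steps 2105 to 2107) returns the sub-pixel location and interpolated height of a peak.

```python
def select_highest_peak(peaks, refine, h_i=0.1, h_r=4.0):
    """Return the sub-pixel location of the highest peak, or None.

    peaks:  list of (x0, y0, value), sorted by decreasing value.
    refine: callable (x0, y0) -> (x2, y2, height), standing in for the
            interpolation and bi-parabolic fit.
    h_i, h_r: initial peak height threshold and peak height ratio.
    """
    h_t = h_i                 # current peak height threshold
    best = None               # stays None if no peak exceeds the threshold
    for x0, y0, value in peaks:             # step 2103
        if value * h_r <= h_t:              # step 2104: stop early
            break
        x2, y2, height = refine(x0, y0)     # steps 2105 to 2107
        if height > h_t:                    # step 2108
            h_t = height
            best = (x2, y2)
    return best
```

Because the list is sorted, the first peak that fails the ratio test ends the search; a smaller H_r therefore ends the search sooner, at the risk of skipping a peak whose interpolated height would have been larger.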

Document Protection and Verification

Tamper protection may be applied to the protected document 200. The tamper-protected document 200 may be verified for authenticity.

Error-correction coding may be applied to the document 204 of the protected document 200 using an error correction code (ECC) so that tamper detection and correction of each pixel of the document 204 is possible. In this instance, low density parity check (LDPC) coding may be used to apply error-correction coding to the document 204. The publication "Low-density parity-check codes", IRE Transactions on Information Theory, Vol. 8, January 1962, describes one error-correction coding method which may be applied to the document 204. Alternatively, other error-correction coding methods, for example Reed-Solomon (RS) coding or Turbo codes, may also be applied to the pre-processed data.

Low density parity check (LDPC) coding is a block coding scheme, in which data representing the document 204 is first divided into blocks of length ECCK bits, and each block is encoded to produce encoded blocks of length ECCN bits, where ECCN and ECCK are parameters of the particular LDPC code in use. The encoded blocks have (ECCN − ECCK) parity bits. If the length of any pre-processed data representing the document 204 is not a multiple of ECCK bits, the pre-processed data may be padded with zeros to make the length a multiple of ECCK bits.

The width and area of the protection barcode 203 may be determined based on the shape and the proportion of parity bits in the LDPC code, respectively, as will be described in detail below. The protection barcode 203 may be appended to the top and bottom of the document 204, and the width of the barcode 203 is the same on both sides of the barcode 203. The width of the barcode 203 may be referred to as BarcodeWidth.

The width and height of the interior 202 of the protected document 200 is a multiple of the width of the coarse alignment border 201.

A method 3200 of determining the width of the protection barcode 203, BarcodeWidth, for the protected document 200 when protecting a document 204, will now be described in detail below with reference to Fig. 32. As described above, the width of the coarse alignment border 201 may be denoted as B. The method 3200 ensures that the interior 202 of the protected document 200 has the correct dimensions in order to fit the protection barcode 203. The method 3200 determines the height of the interior 202 to accommodate the protection barcode 203 and the document 204 and then rounds both the width and increased height of the interior 202 up to a nearest multiple of the width of the border 201, B.

The method 3200 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105. The method 3200 begins at step 3201, where the processor 105 determines a width, W, of the document 204, and a current height, H, of the document 204. At the next step 3203, the processor 105 determines a final width W_f for the document 204 by rounding the document width W up to the nearest multiple of B. Then at the next step 3205, a minimum height, MinHeight, for the interior 202 of the protected document 200 is determined in accordance with Formula (36) as follows:

$$\mathrm{MinHeight} = \left\lceil \frac{H \times ECCN}{ECCK} \right\rceil + \left\lceil \frac{ECCN - ECCK}{W_f} \right\rceil \qquad (36)$$

The method 3200 continues at the next step 3207, where the processor 105 rounds up the minimum height, MinHeight, to the nearest multiple of B. A final height of the document may be determined by setting H_f to the rounded value of MinHeight. At the next step 3209, since the total height of the interior 202 has changed, the new height of the original document, H_new, is determined in accordance with Formula (37) as follows:

$$H_{new} = \left\lfloor \frac{ECCK}{ECCN} \left( H_f - \left\lceil \frac{ECCN - ECCK}{W_f} \right\rceil \right) \right\rfloor \qquad (37)$$

The new dimensions W_f and H_new may be used as the dimensions of the document 204 when the document 204 is encoded. In this instance, the document 204 may be padded with a border of white pixels so that the document 204 fits the new dimensions W_f and H_new. The method 3200 concludes at the next step 3211, where the processor 105 determines the width, BarcodeWidth, of both the top 203A and bottom 203B regions of the protection barcode 203, and the total area, BarcodeArea, of the protection barcode 203, in accordance with Formulas (38) and (39), respectively.

$$\mathrm{BarcodeWidth} = \frac{H_f - H_{new}}{2} \qquad (38)$$

$$\mathrm{BarcodeArea} = \mathrm{BarcodeWidth} \times W_f \times 2 \qquad (39)$$

The BarcodeArea may be greater than a minimum area needed for the barcode 203.

If the shape of the protection barcode 203 is changed, then the method for determining the size of the protection barcode 203 will also change.

If the value determined for BarcodeWidth is not an integer, then the top half of the protection barcode 203 will have ⌈BarcodeWidth⌉ lines of pixels, and the bottom half of the protection barcode 203 will have ⌊BarcodeWidth⌋ lines of pixels.
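The sizing procedure of the method 3200 may be illustrated with the sketch below. The formulas for the minimum height and the new document height are garbled in the available text, so the forms used here are assumptions: the minimum height is taken as ⌈H × ECCN / ECCK⌉ plus ⌈(ECCN − ECCK) / W_f⌉ rows, and the new document height inverts that relation. The function and key names are hypothetical.

```python
import math

def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)

def barcode_dimensions(W, H, B, ECCN, ECCK):
    """Size the interior and the protection barcode (all values in pixels).

    W, H: document width and height; B: coarse alignment border width;
    ECCN, ECCK: encoded and unencoded block lengths of the code.
    """
    Wf = ceil_div(W, B) * B                        # step 3203
    extra = ceil_div(ECCN - ECCK, Wf)              # assumed slack for padding
    min_height = ceil_div(H * ECCN, ECCK) + extra  # step 3205 (assumed form)
    Hf = ceil_div(min_height, B) * B               # step 3207
    H_new = ((Hf - extra) * ECCK) // ECCN          # step 3209 (assumed form)
    width = (Hf - H_new) / 2                       # step 3211
    return {
        "Wf": Wf, "Hf": Hf, "H_new": H_new,
        "BarcodeWidth": width,
        "BarcodeArea": width * Wf * 2,
        "top_rows": math.ceil(width),              # top half
        "bottom_rows": math.floor(width),          # bottom half
    }
```

For a 10 × 10 document with B = 4 and a rate one-half code (ECCN = 8, ECCK = 4), the sketch gives a 12 × 24 interior, a document height of 11 rows, and barcode regions of 7 and 6 rows.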

A method 3300 of determining the width of the protection barcode 203, BarcodeWidth, for the protected document 200 when verifying the protected document 200, will now be described in detail below with reference to Fig. 33. The width and height of the interior 202 of the protected document 200 being verified may be denoted as W_r and H_r, respectively.

The method 3300 begins at the first step 3301, where the processor 105 determines the height of the document 204, H_o, using the height H_r of the interior 202 of the protected document 200 being verified, as follows:

$$H_o = \left\lfloor \frac{ECCK}{ECCN} \left( H_r - \left\lceil \frac{ECCN - ECCK}{W_r} \right\rceil \right) \right\rfloor$$

At the next step 3303, the processor 105 determines the width, BarcodeWidth, and the area, BarcodeArea, of the protection barcode 203 in accordance with Formulas (40) and (41) as follows:

$$\mathrm{BarcodeWidth} = \frac{H_r - H_o}{2} \qquad (40)$$

$$\mathrm{BarcodeArea} = \mathrm{BarcodeWidth} \times W_r \times 2 \qquad (41)$$

Again, if the value determined for BarcodeWidth is not an integer, then the top half of the protection barcode 203 will have ⌈BarcodeWidth⌉ lines of pixels, and the bottom half of the protection barcode 203 will have ⌊BarcodeWidth⌋ lines of pixels.

Different LDPC codes may be chosen and used. As such, a direct trade-off may be made between the size of the protection barcode 203 and the ability to recover the document 204. For example, an LDPC code with a large amount of redundancy may be used to protect a very valuable document 204, at the cost of a larger protection barcode 203.

7.1 Tamper Protecting a Document

Tamper protection is applied in two steps. As described above, step 405 of the method 400 accesses data in the form of a bi-level image representing the document 204 to be protected (from memory 106, for example) and encodes the data to form a 1D document array and a 1D protection array. The 1D document array represents a serialised version of the document 204. The 1D protection array comprises protection bits.

As described above, at step 406 the processor 105 arranges the 1D document array and the 1D protection array on a page, as the document 204 and the protection barcode section 203, respectively.

The method 2200 of encoding a document 204 to be protected into a 1D document array and a 1D protection array, as executed at step 405, will now be described with reference to Fig. 22. The method 2200 begins at step 2202, where the processor 105 serialises the document 204 to form a 1D document array. The processor 105 accesses the pixels of the bi-level image representing the document 204 in raster order (i.e., from left to right, then from top to bottom) and adds the pixels of the document 204 one by one to the 1D document array configured within memory 106. The 1D document array may also be padded with zeros so that the size of the array is a multiple of ECCK.

At the next step 2203, the processor 105 pseudo-randomly permutes the order of elements of the 1D document array. Step 2203 is executed since document alterations or tampering are generally localised. Once serialised, such alterations manifest as a burst error in the 1D document array. Most error correcting codes, including LDPC codes, have difficulty correcting burst errors. However, such error correcting codes correct dispersed errors more easily. Allowing localised alterations to remain localised reduces the effectiveness of the described methods. Permuting the elements in the 1D document array at step 2203 converts localised tampers into dispersed tampers.

Many methods may be used to generate a pseudo-random permutation at step 2203 and use the permutation to scramble the ordering of elements in a 1D array. For example, a pseudo-random array of positive integers α_0, α_1, α_2, ... may be generated using the RC4 random number generation algorithm.

A method 3400 of generating a pseudo-random permutation, as executed at step 2203, will now be described with reference to Fig. 34. The method 3400 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105. The length of the 1D document array may be denoted as N.

The method 3400 begins at step 3401, where the processor 105 sets a variable x configured in memory to zero (x = 0), where x ranges from 0 to N − 1. Then at the next step 3402, the processor 105 accesses the 1D document array, denoted as A, and determines (α_0 mod N) + 0. The result of step 3402 may be denoted as a_0 (in general, a_x = (α_x mod (N − x)) + x). At the next step 3403, the processor 105 exchanges elements at A[0] and A[a_0] of the 1D document array, A. Then at the next step 3405, the processor 105 sets x equal to x plus one (x = x + 1). If x is greater than or equal to N at the next step 3407, then the method 3400 concludes. Otherwise, the method 3400 returns to step 3402. At the second execution of step 3402, the processor 105 determines (α_1 mod (N − 1)) + 1, since x = 1. The result of the second execution of step 3402 may be denoted a_1. Then at the next execution of step 3403, the processor 105 exchanges the elements at A[1] and A[a_1]. The method 3400 proceeds in this manner until x is greater than or equal to N at step 3407 and the 1D document array is randomly permuted.
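The swap loop of the method 3400 may be sketched as follows. Python's seeded `random.Random` is used here purely as a stand-in source of the integers α_x; it is not RC4, and the helper names `make_alphas` and `permute` are hypothetical.

```python
import random

def make_alphas(n, seed):
    """Stand-in for the pseudo-random integer array (NOT RC4)."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(n)]

def permute(array, alphas):
    """Permute a copy of array using the swap sequence of the method 3400.

    At index x the element is exchanged with the element at
    a_x = (alphas[x] mod (N - x)) + x, which always lies in x .. N - 1.
    """
    a = list(array)
    n = len(a)
    for x in range(n):                       # steps 3401, 3405, 3407
        ax = (alphas[x] % (n - x)) + x       # step 3402
        a[x], a[ax] = a[ax], a[x]            # step 3403
    return a
```

Since the same seed reproduces the same α values, the encoder and the verifier can derive the same permutation without storing it on the page.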

The method 2200 continues at the next step 2204, where the processor 105 divides the 1D document array into blocks of size ECCK. These blocks may be processed one at a time from left to right in the following steps 2205 to 2207, where a current block BK is accessed at each iteration of the steps 2205 to 2207.

At step 2205, the processor 105 accesses block BK and uses the LDPC encoder to generate an encoded block of size ECCN comprising the concatenation of the original block BK and generated parity bits. This encoded block may be denoted BKE. Then at the next step 2206, the processor 105 extracts the (ECCN − ECCK) generated parity bits from block BKE and adds the extracted parity bits to the end of the 1D protection array.

The method 2200 continues at the next step 2207, where if there are more blocks of the 1D document array to process, then the method 2200 returns to step 2204. Otherwise, the method 2200 proceeds to step 2208. When all of the blocks of the 1D document array have been processed, the 1D protection array is padded with zeros so that the 1D protection array is the same size as the BarcodeArea described above.

At step 2208, the processor 105 accesses the 1D protection array and pseudo-randomly permutes the order of the elements of the 1D protection array in accordance with the method 3400, where the array accessed at step 3402 of the method 3400 is the 1D protection array rather than the 1D document array.

The method 2200 continues at the next step 2209, where the processor 105 accesses the permuted 1D document array from memory 110, for example, and applies the inverse of the pseudo-random permutation applied in step 2203. The method used to inverse pseudo-random permute the 1D document array at step 2209 depends on the method of permutation used in step 2203. The pseudo-random array of integers α_0, α_1, α_2, ... described above may be used at step 2209.

A method 3500 of generating an inverse pseudo-random permutation, as executed at step 2209, will now be described with reference to Fig. 35. The method 3500 may be implemented as software resident on the hard disk drive 110 and being controlled in its execution by the processor 105.

The method 3500 begins at step 3501, where the processor 105 sets a variable x configured in memory 106 to N − 2 (x = N − 2). Then at the next step 3502, the processor 105 accesses the permuted 1D document array, denoted as A, and determines (α_{N−2} mod 2) + N − 2. The result of step 3502 may be denoted as a_{N−2}. That is, the processor 105 determines a_x = (α_x mod (N − x)) + x at step 3502. At the next step 3503, the processor 105 exchanges elements at A[N−2] and A[a_{N−2}] of the permuted 1D document array, A. Then at the next step 3505, the processor 105 sets x equal to x minus one (x = x − 1). If x is less than zero at step 3507, then the method 3500 concludes. Otherwise, the method 3500 returns to step 3502. At the second execution of step 3502, the processor 105 determines (α_{N−3} mod 3) + N − 3. The result of the second execution of step 3502 may be denoted a_{N−3}. Then at the next execution of step 3503, the processor 105 exchanges the elements at A[N−3] and A[a_{N−3}]. The method 3500 proceeds in this manner until x is less than zero at step 3507, resulting in the permutation of the 1D document array being undone.
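The inverse permutation may be sketched as replaying the same exchanges in reverse order, starting from x = N − 2 because the exchange at x = N − 1 always swaps an element with itself. The forward routine is repeated below so that the round trip can be run on its own; both function names are hypothetical, and the α values are assumed to be the same integers used for the forward permutation.

```python
def permute(array, alphas):
    """Forward permutation: exchange A[x] with A[a_x] for x = 0 .. N - 1."""
    a = list(array)
    n = len(a)
    for x in range(n):
        ax = (alphas[x] % (n - x)) + x
        a[x], a[ax] = a[ax], a[x]
    return a

def unpermute(array, alphas):
    """Inverse permutation: replay the exchanges for x = N - 2 down to 0."""
    a = list(array)
    n = len(a)
    for x in range(n - 2, -1, -1):           # steps 3501, 3505, 3507
        ax = (alphas[x] % (n - x)) + x       # step 3502
        a[x], a[ax] = a[ax], a[x]            # step 3503
    return a
```

Each exchange is its own inverse, so undoing them last-to-first restores the original ordering.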

The method 2200 concludes following step 2209.

The method 2300 of arranging the 1D document array and the 1D protection array to form the protected document 200, as executed at step 406, will now be described with reference to Fig. 23. The method 2300 may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105. As described above, the method 2300 arranges the 1D document array and the 1D protection array as the document 204 and the protection barcode 203, respectively, to form the protected document 200.

Steps 2302 to 2307 of the method 2300 iterate over the pixels in the protection barcode 203 and the document section 204 in raster order (i.e., from left to right, then from top to bottom). A current pixel may be denoted P(x, y) for steps 2303 to 2307, where (x, y) are the row and column coordinates of the current pixel.

At step 2303, if the processor 105 determines that the current pixel P(x, y) is an alignment pixel, then the method 2300 proceeds to step 2307. Otherwise, the method 2300 proceeds to step 2304. The determination may be made at step 2303 by determining if x and y for the current pixel P(x, y) are both divisible by three (3). If x and y for the current pixel P(x, y) are both divisible by three (3), then the pixel P(x, y) is an alignment pixel.

At step 2304, if the current pixel P(x, y) is in the protection barcode 203, the method 2300 proceeds to step 2306. Otherwise, the method 2300 proceeds to step 2305. The determination at step 2304 depends on the shape and location of the barcode 203. For the barcode 203 of Fig. 2A, the protection barcode 203 is appended to the top (i.e., barcode region 203A) and bottom (i.e., barcode region 203B) of the document 204 as seen in Fig. 2A. In this instance, the current pixel P(x, y) is in the protection barcode 203 if y is less than the value of BarcodeWidth, or y is greater than or equal to (H_f − BarcodeWidth).

At step 2305, the processor 105 sets the current pixel P(x, y) to the next element in the 1D document array configured within memory 106, for example. The next element is the element after a last used element in the 1D document array, in left to right order.

At step 2306, the processor 105 sets the current pixel P(x, y) to the next element in the 1D protection array. The next element in the 1D protection array is the element after a last used element, in left to right order.

At the next step 2307, if the processor 105 determines that there are more pixels in the protection barcode 203 and the document 204 to be processed, then the method 2300 proceeds to step 2302. Otherwise, the method 2300 concludes. Following the conclusion of the method 2300, every pixel in the protection barcode 203 and the document 204 will be allocated a value.
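The raster-order layout of the method 2300 may be sketched as follows. The sketch assumes an integer BarcodeWidth, a barcode at the top and bottom as in Fig. 2A, and that alignment pixels are simply left as `None` placeholders (generating the alignment pattern is outside this sketch). The function name `arrange_page` is hypothetical.

```python
def arrange_page(doc_bits, prot_bits, width, height, barcode_width):
    """Lay out document and protection bits on a page in raster order.

    Returns a list of rows; alignment pixels (x and y both divisible by
    three) are left as None for the alignment pattern.
    Raises StopIteration if either array runs out of elements.
    """
    doc = iter(doc_bits)
    prot = iter(prot_bits)
    page = [[None] * width for _ in range(height)]
    for y in range(height):                    # top to bottom
        for x in range(width):                 # left to right
            if x % 3 == 0 and y % 3 == 0:      # step 2303: alignment pixel
                continue
            in_barcode = y < barcode_width or y >= height - barcode_width
            if in_barcode:                     # step 2304
                page[y][x] = next(prot)        # step 2306
            else:
                page[y][x] = next(doc)         # step 2305
    return page
```

On a 4 × 4 page with a one-row barcode at the top and bottom, the four corner pixels are alignment pixels, leaving four protection bits and eight document bits.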

7.2 Verifying a document

Verification of a protected document 200 is performed in steps 506 and 507 of the method 500, as described above. At step 506, the processor 105 extracts a 1D document array and a 1D protection array from the aligned scanned image of the protected document 200. The protection barcode section 203 and the document section 204 are serialised into a 1D protection array and a 1D document array respectively, at step 506, in accordance with the method 2400, which will be described in detail below with reference to Fig. 24.

At step 507, the 1D document array and the 1D protection array may be used to detect alterations in the scanned image of the printed document 200. The processor 105 produces two images at step 507: a first image showing the location of the alterations or tampered pixels in the scanned image of the document 204, and a second image where the alterations have been corrected or repaired. If the document 204 has been greatly damaged or altered, repairing the document 204 may fail, and the document 204 may be marked invalid. If the document 204 was successfully repaired, the image showing the alterations is created by comparing the scanned image of the document 204 with the repaired document.

The method 2400 of extracting the two one dimensional arrays from the scanned image of the protected document 200, as executed at step 506, may be implemented as software resident in the hard disk drive 110 and being controlled in its execution by the processor 105. Steps 2402 to 2407 iterate over all the pixels in the interior 202 of the protected document 200, that is, the pixels in the protection barcode 203 and the document section 204, in raster order (i.e., from left to right, then from top to bottom). A current pixel may be denoted as P(x, y) for steps 2403 to 2407, where (x, y) are the row and column coordinates of the current pixel P(x, y).

At step 2403, if the processor 105 determines that the current pixel P(x, y) is an alignment pixel, then the method 2400 proceeds to step 2407. Otherwise, the method 2400 proceeds to step 2404. The determination may be made at step 2403 by determining if x and y for the current pixel P(x, y) are both divisible by three (3). If x and y for the current pixel P(x, y) are both divisible by three (3), then the pixel P(x, y) is an alignment pixel.

At step 2404, if the current pixel P(x, y) is in the protection barcode 203, the method 2400 proceeds to step 2406. Otherwise, the method 2400 proceeds to step 2405. The determination at step 2404 depends on the shape and location of the barcode 203. For the protection barcode 203 of Fig. 2A, the protection barcode 203 is appended to the top (i.e., barcode region 203A) and bottom (i.e., barcode region 203B) of the document 204 as seen in Fig. 2A. In this instance, the current pixel P(x, y) is in the protection barcode 203 if y is less than the value of BarcodeWidth, or y is greater than or equal to (H_f − BarcodeWidth).

At step 2405, the processor 105 adds the value of the current pixel P(x, y) to the end of the 1D document array configured within memory 106, for example.

At the next step 2406, the processor 105 adds the value of the current pixel P(x, y) to the end of the 1D protection array.

At the next step 2407, if the processor 105 determines that there are more pixels in the protection barcode 203 and the document section 204 to be processed, then the method 2400 proceeds to step 2402. Otherwise, the method 2400 concludes. Following the conclusion of the method 2400, every pixel in the protection barcode 203 and the document 204 has been copied to either the 1D document array or the 1D protection array. The 1D document array may also be padded with zeros to increase the size of the 1D document array to the nearest multiple of ECCK.
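The extraction of the method 2400 mirrors the layout step and may be sketched as follows, under the same assumptions (integer BarcodeWidth, barcode at the top and bottom, alignment pixels where x and y are both divisible by three). The names `extract_arrays` and `pad_to_multiple` are hypothetical.

```python
def extract_arrays(page, barcode_width):
    """Serialise a page (list of rows) into document and protection arrays."""
    height = len(page)
    doc_bits, prot_bits = [], []
    for y, row in enumerate(page):             # top to bottom
        for x, pixel in enumerate(row):        # left to right
            if x % 3 == 0 and y % 3 == 0:      # step 2403: skip alignment pixel
                continue
            if y < barcode_width or y >= height - barcode_width:  # step 2404
                prot_bits.append(pixel)        # step 2406
            else:
                doc_bits.append(pixel)         # step 2405
    return doc_bits, prot_bits

def pad_to_multiple(bits, block_size):
    """Pad with zeros up to the nearest multiple of block_size (e.g. ECCK)."""
    remainder = len(bits) % block_size
    if remainder == 0:
        return list(bits)
    return list(bits) + [0] * (block_size - remainder)
```

Because the same raster order and the same pixel tests are used when writing and reading the page, the extracted arrays line up element for element with the arrays that were arranged on the page.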

As described above, step 507 repairs the document 204 and, if the repair was successful, the processor 105 creates an image showing the pixels that have been altered or tampered with. Step 507 generates two new 1D arrays. The first array is the 1D repaired document array, and stores a serialised 2D-image representing the repaired document 204.

The second array generated at step 507 may be referred to as a 1D tamper array, and stores a serialised 2D-image representing the detected tampered areas of the document 204.

The method 2500 of indicating the location of the alterations to the scanned image of the protected document and generating an image correcting the alterations, as executed at step 507, will now be described in detail with reference to the accompanying drawings.

The method 2500 begins at step 2502, where the processor 105 accesses the 1D document array and pseudo-randomly permutes the order of the elements of the 1D document array, in accordance with the method 3400 described above.

At the next step 2503, the processor 105 applies the inverse pseudo-random permutation to the 1D protection array, in accordance with the method 3500 described above.

At the next step 2504, the processor 105 divides the 1D document array into blocks of size ECCK, and the 1D protection array into blocks of size (ECCN - ECCK). Blocks in the 1D document array are processed one at a time from left to right in the following steps 2505 to 2508 of the method 2500. For each block from the 1D document array, the processor 105 pairs the block with a corresponding block in the 1D protection array to form a new block of size ECCN. This block may be referred to as BK. The block BK is the reconstructed LDPC code encoded block, with the parity bits reassembled next to the document bits.
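The block pairing of step 2504 may be sketched as follows. This is a minimal sketch only: the function name pair_blocks and its argument names are assumptions, the 1D document array is assumed to have already been padded to a multiple of ECCK, and the i-th document block is assumed to correspond to the i-th protection block.

```python
def pair_blocks(document, protection, ecc_n, ecc_k):
    """Form blocks BK of size ecc_n: ecc_k document bits followed by
    (ecc_n - ecc_k) parity bits from the protection array."""
    parity_len = ecc_n - ecc_k
    n_blocks = len(document) // ecc_k
    if len(protection) < n_blocks * parity_len:
        raise ValueError("protection array is too short for the document")
    blocks = []
    for i in range(n_blocks):
        doc_bits = document[i * ecc_k:(i + 1) * ecc_k]
        parity_bits = protection[i * parity_len:(i + 1) * parity_len]
        blocks.append(doc_bits + parity_bits)
    return blocks
```

For example, with ECCN = 7 and ECCK = 4, each block BK holds four document bits followed by three parity bits.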

At step 2505, the processor 105 processes the block BK and, using LDPC decoding, attempts to repair any alterations made to the block BK. The output of step 2505 is a block with the parity bits removed, leaving the repaired document bits. The output of step 2505 may be denoted as BKD.

At the next step 2506, if the processor 105 determines that severe damage was present in the block BK and the block BK cannot be repaired, the method 2500 proceeds to step 2511. Otherwise, any damage or fraudulent alteration to block BK has been successfully repaired into block BKD and the method 2500 proceeds to step 2507. At step 2511, the processor 105 reports that the block BK and therefore the document 204 cannot be repaired and the method 2500 concludes.

At step 2507, the processor 105 accesses the block BKD and adds the block BKD to the end of the 1D repaired document array. In order to detect which pixels have been altered in the block BK, the processor 105 compares block BKD to the document bits of block BK. The bits of the block BKD and the document bits of block BK may be XORed together. The result of such an XOR is added to the end of the 1D tamper array.

At the next step 2508, if there are any more blocks of the 1D document array to process, the method 2500 returns to step 2504. Otherwise, the method 2500 proceeds to step 2509. At step 2509, the 1D repaired document array contains a permuted version of the original document 204, and the 1D tamper array contains a permuted serialised 2D-image where altered pixels appear in one colour, and correct pixels appear in the other colour.
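The repair-and-compare loop of steps 2505 to 2508 may be illustrated with the following sketch. To keep the example short and self-contained, a systematic Hamming(7,4) code (ECCN = 7, ECCK = 4) is substituted for the LDPC code of the described method. This substitution, and all function names, are assumptions made purely for illustration. A Hamming code corrects at most one bit per block and cannot report an unrepairable block, so the failure path of steps 2506 and 2511 is not modelled here.

```python
# Syndrome -> index of the flipped bit in [d1, d2, d3, d4, p1, p2, p3].
SYNDROME_TO_INDEX = {
    (1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3,
    (1, 0, 0): 4, (0, 1, 0): 5, (0, 0, 1): 6,
}


def hamming_parity(data):
    """Parity bits of a systematic Hamming(7,4) code (stand-in for LDPC)."""
    d1, d2, d3, d4 = data
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


def repair_block(bk):
    """Step 2505: return the repaired document bits BKD of a block BK."""
    data, parity = list(bk[:4]), bk[4:]
    syndrome = tuple(a ^ b for a, b in zip(hamming_parity(data), parity))
    index = SYNDROME_TO_INDEX.get(syndrome)   # None when syndrome is zero
    if index is not None and index < 4:
        data[index] ^= 1                      # flip the damaged document bit
    return data


def repair_and_mark(blocks):
    """Steps 2505 to 2508: build the repaired and tamper arrays."""
    repaired, tamper = [], []
    for bk in blocks:
        bkd = repair_block(bk)
        repaired.extend(bkd)
        # Step 2507: XOR repaired bits with the received document bits;
        # a 1 marks a pixel that was altered.
        tamper.extend(a ^ b for a, b in zip(bkd, bk[:4]))
    return repaired, tamper
```

In the sketch, a 1 in the tamper array marks a document bit whose received value differs from its repaired value, which corresponds to the two-colour tamper image described above.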

At step 2509, the processor 105 applies the inverse pseudo-random permutation to the 1D repaired document array and the 1D tamper array, in accordance with the method 3500 described above.
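The relationship between a pseudo-random permutation and its inverse may be sketched as follows. The details of the methods 3400 and 3500 are not reproduced in this passage, so the seeded shuffle below, the seed parameter and the function names are assumptions made only to show that the inverse restores the original order. Python's random module is not cryptographically secure and is used here purely for illustration.

```python
import random


def make_permutation(length, seed):
    """Derive a repeatable ordering of indices from a seed (assumed secret)."""
    order = list(range(length))
    random.Random(seed).shuffle(order)
    return order


def permute(array, order):
    # Element i of the output is taken from position order[i] of the input.
    return [array[j] for j in order]


def inverse_permute(array, order):
    # Put element i of the permuted array back at position order[i].
    out = [None] * len(array)
    for i, j in enumerate(order):
        out[j] = array[i]
    return out
```

Provided the same seed is used when generating and when reading the document, applying inverse_permute to a permuted array returns the elements to their original positions.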

The method 2500 concludes at the next step 2510, where the processor 105 converts each of the 1D repaired document array and the 1D tamper array to 2D images. Each 1D array is iterated through from left to right, and the pixels are written into a 2D image in raster order. If a pixel P(x, y) about to be written in the 2D image is an alignment pixel, then the pixel is skipped over and the next pixel in raster order is written to instead.
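The conversion of step 2510 may be sketched as follows. The function name to_image, the fill value used for alignment pixels, and the assumption that the coordinates (x, y) are measured from the top-left corner of the image being written are all illustrative choices. A fuller implementation would need to offset y by the barcode height so that the alignment test matches the one used when the arrays were built.

```python
def to_image(array, width, height, fill=0):
    """Write a 1D array into a width x height image in raster order,
    skipping alignment pixels (both coordinates divisible by three)."""
    image = [[fill] * width for _ in range(height)]
    values = iter(array)
    for y in range(height):
        for x in range(width):
            if x % 3 == 0 and y % 3 == 0:
                continue                      # alignment pixel: skip
            # Use the fill value if the array runs out early.
            image[y][x] = next(values, fill)
    return image
```

Any zero padding remaining at the end of the 1D array after the last image pixel has been written is simply left unread.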

Following step 2510, an image of the repaired document and an image that indicates which pixels of the printed version of the protected document 200 have been altered are configured within memory 106.

The aforementioned preferred method(s) comprise a particular control flow. There are many other variants of the preferred method(s) which use different control flows without departing from the spirit or scope of the invention. Furthermore, one or more of the steps of the preferred method(s) may be performed in parallel rather than sequentially.

The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive. For example, the interpolation of the scanned image using bi-cubic interpolation at steps 1203 and 1302 may be alternatively performed using any suitable interpolation. For example, bi-linear interpolation may be used at steps 1203 and 1302.

Further, the interpolation of the mapping in accordance with the method 2000 using bi-cubic interpolation may alternatively be executed using any suitable interpolation method. For example, bi-linear interpolation may be used to interpolate the mapping in the method 2000.

Further, the resistance of a protected document 200 to deliberate alteration or tampering is dependent on keeping both the permutation in step 2203 and the LDPC code secret. If an attacker knows both the permutation in step 2203 and the LDPC code, the attacker may modify the protected document 200 at will and alter the protection barcode 203 to create a new, valid protected document 200. Public/private key encryption may be used to overcome such a problem. The protection bits in the protection barcode 203 may be encrypted with the private key of a sender, and may be decrypted during verification by the public key of the sender. This makes modifying the protection bits very difficult without the private key of the sender. Furthermore, public/private key encryption allows a receiver to verify that the protected document originated from a claimed sender.
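The direction of the two key operations described above may be illustrated with the following toy sketch. The specification does not name a particular public/private key scheme, so textbook RSA is assumed here, with a deliberately tiny key (n = 61 x 53) and hypothetical function names. The sketch offers no real security: a practical system would use a vetted cryptographic library with large keys and proper padding, and would typically sign a digest of the protection bits rather than transform small groups of bits independently.

```python
# Toy RSA key pair (assumption; far too small for real use).
N = 61 * 53   # public modulus, 3233
E = 17        # public exponent
D = 2753      # private exponent, since 17 * 2753 = 1 (mod 60 * 52)


def pack_bits(bits):
    """Group bits into 8-bit integers, zero-padding the final group."""
    bits = list(bits) + [0] * (-len(bits) % 8)
    return [int("".join(map(str, bits[i:i + 8])), 2)
            for i in range(0, len(bits), 8)]


def protect(values):
    """Sender side: transform each value with the private exponent."""
    return [pow(v, D, N) for v in values]


def recover(values):
    """Receiver side: undo the transform with the public exponent."""
    return [pow(c, E, N) for c in values]
```

Only the holder of the private exponent can produce values that recover correctly under the public exponent, which is the property relied upon in the paragraph above.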

As described above, the barcode 203 is laid out above and below the document 204. The shape and location of the barcode 203 are not fixed, and may be altered to any other suitable shape and location.

Further, the peak detection step 1705 uses Fourier interpolation and a bi-parabolic fit to estimate the location of a peak to sub-pixel accuracy. However, any suitable peak determination method may be used in the described methods. For example, a chirp-z transform may be used to interpolate the correlation image at a large number of points, and the point with the largest value may be taken as the peak location.
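One simple form of sub-pixel peak estimation may be sketched as follows. The sketch fits a parabola separately along x and along y through the largest sample and its immediate neighbours. It is offered as an assumed, simplified stand-in for the bi-parabolic fit of step 1705 and omits the Fourier interpolation altogether. The function names and the list-of-rows image representation are illustrative assumptions.

```python
def parabolic_offset(left, centre, right):
    """Offset of the vertex of the parabola through three equally spaced
    samples, measured from the centre sample."""
    denom = left - 2 * centre + right
    if denom == 0:
        return 0.0                    # flat neighbourhood: no refinement
    return 0.5 * (left - right) / denom


def subpixel_peak(image):
    """Return (x, y) of the largest value, refined to sub-pixel accuracy."""
    height, width = len(image), len(image[0])
    py, px = max(((y, x) for y in range(height) for x in range(width)),
                 key=lambda p: image[p[0]][p[1]])
    dx = dy = 0.0
    if 0 < px < width - 1:            # refine along x if not on an edge
        dx = parabolic_offset(image[py][px - 1], image[py][px],
                              image[py][px + 1])
    if 0 < py < height - 1:           # refine along y if not on an edge
        dy = parabolic_offset(image[py - 1][px], image[py][px],
                              image[py + 1][px])
    return px + dx, py + dy
```

For a correlation surface that is locally quadratic around its maximum, this estimate recovers the true peak position exactly; for other surfaces it is only an approximation.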

In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises" have correspondingly varied meanings.


Claims

1. A method of generating a protected document, said method comprising the steps of: generating a block-based correlatable pattern of data; encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

2. A method according to claim 1, wherein the correlatable pattern is a noise pattern.

3. A method according to claim 1, wherein the correlatable pattern comprises one or more portions of pseudo-random data.

4. A method according to claim 3, further comprising the step of distributing the random data according to the correlatable pattern substantially uniformly throughout the protected document.

5. A method according to claim 4, wherein the random data is distributed within an interior region of said protected document.

6. A method according to claim 4, wherein the random data is distributed in a border region of said protected document.


7. A method according to claim 6, wherein the data representing the document to be protected is interdispersed with the random data within an interior region of said protected document.

8. A method according to claim 1, further comprising the step of generating one or more further data patterns based on a mathematical function having a predetermined property.

9. A method according to claim 8, further comprising the step of arranging the further generated data patterns in a border region of said protected document.

10. A method according to claim 9, further comprising the step of interdispersing one or more portions of data with the further generated data patterns within the border region of the protected document.

11. A method according to claim 8, wherein the further generated data patterns are spirals.

12. A method according to claim 11, wherein the spirals are arranged in corners of the protected document.

13. A method according to claim 11, wherein six of the spirals are generated.

14. A method according to claim 11, wherein at least one of the spirals has a different phase to others of the spirals.

15. A method according to claim 11, wherein the spirals are printed at a higher resolution than the encoded document.

16. A method according to claim 1, further comprising the step of error correcting the data representing the document to be protected.

17. A method according to claim 3, further comprising the step of XORing the data representing the document to be protected with a pseudo-random sequence of binary data.

18. A method according to claim 1, wherein the data representing the document to be protected is encoded into an array.

19. A method according to claim 1, wherein the parity bits are encoded into an array.

20. A method according to claim 1, wherein the parity bits are encrypted using a public/private key encryption method.

21. A method of generating a protected document, said method comprising the steps of: generating one or more data patterns based on a mathematical function having a predetermined property; arranging the generated data patterns in a border region of said protected document; generating a block-based correlatable pattern of data; arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

22. A method according to claim 21, wherein the correlatable pattern is a noise pattern.

23. A method according to claim 21, wherein the correlatable pattern comprises one or more portions of pseudo-random data.

24. A method according to claim 23, further comprising the step of distributing the random data according to the correlatable pattern substantially uniformly throughout the protected document.

25. A method according to claim 24, wherein the random data is distributed within an interior region of said protected document.

26. A method according to claim 24, wherein the random data is distributed in a border region of said protected document.

27. A method according to claim 26, wherein the data representing the document to be protected is interdispersed with the random data within said interior region.

28. A method according to claim 21, further comprising the step of arranging further generated data patterns in a border region of said protected document.

29. A method according to claim 28, further comprising the step of interdispersing one or more portions of data with the further generated data patterns within the border region of the protected document.

30. A method according to claim 28, wherein the further generated data patterns are spirals.

31. A method according to claim 30, wherein the spirals are arranged in corners of the protected document.

32. A method according to claim 30, wherein six of the spirals are generated.

33. A method according to claim 30, wherein at least one of the spirals has a different phase to others of the spirals.

34. A method according to claim 30, wherein the spirals are printed at a higher resolution than the encoded document.

35. A method according to claim 21, further comprising the step of XORing data representing the document to be protected with a pseudo-random sequence of binary data.

36. A method according to claim 21, wherein data representing the document to be protected is encoded into an array.

37. A method according to claim 21, wherein the parity bits are encoded into an array.

38. A method according to claim 21, wherein the parity bits are encrypted using a public/private key encryption method.

39. A method of generating a protected document, said method comprising the steps of: generating one or more spiral data patterns; arranging the spiral data patterns in a border region of said protected document; generating a noise pattern using random data; arranging the random data in an interior region of said protected document according to a predetermined arrangement; and encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

40. An apparatus for generating a protected document, said apparatus comprising: generating means for generating a block-based correlatable pattern of data; data encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and arranging means for arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

41. An apparatus for generating a protected document, said apparatus comprising: first generating means for generating one or more data patterns based on a mathematical function having a predetermined property; first arranging means for arranging the generated data patterns in a border region of said protected document; second generating means for generating a block-based correlatable pattern of data; second arranging means for arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and third arranging means for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

42. An apparatus for generating a protected document, said apparatus comprising: first generating means for generating one or more spiral data patterns; first arranging means for arranging the spiral data patterns in a border region of said protected document; second generating means for generating a noise pattern using random data; second arranging means for arranging the random data in an interior region of said protected document according to a predetermined arrangement; encoding means for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and third arranging means for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

43. A computer program for generating a protected document, said program comprising: code for generating a block-based correlatable pattern of data; code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the generated data patterns, the encoded document and the generated parity bits according to a predetermined arrangement to generate the protected document.

44. A computer program for generating a protected document, said program comprising: code for generating one or more data patterns based on a mathematical function having a predetermined property; code for arranging the generated data patterns in a border region of said protected document; code for generating a block-based correlatable pattern of data; code for arranging the correlatable pattern of data in an interior region of said protected document according to a predetermined arrangement; and code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

45. A computer program for generating a protected document, said program comprising: code for generating one or more spiral data patterns; code for arranging the spiral data patterns in a border region of said protected document; code for generating a noise pattern using random data; code for arranging the random data in an interior region of said protected document according to a predetermined arrangement; and code for encoding data representing a document to be protected using an error correction code to generate parity bits for the document; and code for arranging the encoded document and the generated parity bits in said interior region according to said predetermined arrangement to generate said protected document.

46. A method of generating a protected document, said method being substantially as herein before described with reference to any one of the embodiments as that embodiment is shown in the accompanying drawings.

47. An apparatus for generating a protected document, said apparatus being substantially as herein before described with reference to any one of the embodiments as that embodiment is shown in the accompanying drawings.

48. A computer program for generating a protected document, said program being substantially as herein before described with reference to any one of the embodiments as that embodiment is shown in the accompanying drawings.

DATED this the Twentieth Day of December 2004
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant
SPRUSON & FERGUSON