EP2119217A1 - Document with an encoded part

Document with an encoded part

Info

Publication number
EP2119217A1
EP2119217A1 EP07785987A EP07785987A EP2119217A1 EP 2119217 A1 EP2119217 A1 EP 2119217A1 EP 07785987 A EP07785987 A EP 07785987A EP 07785987 A EP07785987 A EP 07785987A EP 2119217 A1 EP2119217 A1 EP 2119217A1
Authority
EP
European Patent Office
Prior art keywords
document
data
dots
symbols
symbol
Prior art date
Legal status
Withdrawn
Application number
EP07785987A
Other languages
English (en)
French (fr)
Inventor
Taswar Iqbal
Walter Geisselhardt
Current Assignee
Universitaet Duisburg Essen
Original Assignee
Universitaet Duisburg Essen
Priority date
Filing date
Publication date
Application filed by Universitaet Duisburg Essen
Priority to EP07785987A
Publication of EP2119217A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32352 Controlling detectability or arrangements to facilitate detection or retrieval of the embedded information, e.g. using markers
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • H04N1/32309 Methods relating to embedding, encoding, decoding, detection or retrieval operations in colour image data
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3225 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document
    • H04N2201/3233 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document of authentication information, e.g. digital signature, watermark
    • H04N2201/3236 Details of authentication information generation
    • H04N2201/3238 Details of authentication information generation using a coded or compressed version of the image data itself
    • H04N2201/3269 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs
    • H04N2201/327 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs which are undetectable to the naked eye, e.g. embedded codes
    • H04N2201/3271 Printing or stamping

Definitions

  • the present invention relates to a document comprising an encoded portion with symbols consisting of preferably printed dots, and to a method for generating such a document.
  • the present invention relates to a high capacity invisible data stripe for digital authentication of hardcopy documents.
  • This invention relates to the digital authentication of printed documents regardless of the underlying textual-content nature (i.e. alphanumeric characters type and size) by using high-quality superposed background image for machine-readable data encoding.
  • the invention supports optical character recognition (OCR) technology and eliminates the challenges posed by languages with complex writing structures (non-Roman alphanumeric characters, tables, figures, equation symbols etc.), because its higher capacity allows the foreground contents to be encoded, in particular as a doc file.
  • Another aspect is original quality fax document transmission.
  • the invention has applications for military communications, where conventional ways of communication are not applicable (e.g. data encryption makes communication suspicious, and digital communication might not always be applicable).
  • the literature dealing with digital authentication of hardcopy text documents can be divided into two categories.
  • digital watermarking techniques are used to embed authenticity verification related data in the textual contents by slightly modifying some selected features of text such as words, paragraphs, lines etc.
  • selected modified features are checked in the scanned/digitized image.
  • the limitation of the watermarking approach is its low capacity, which is attributed to the content dependence of the watermarking process.
  • Such watermarking is described e.g. in A. M. Alattar and O. M. Alattar, "Watermarking electronic text documents containing justified paragraphs and irregular line spacing," SPIE Vol. 5306, 2004, Security and Watermarking of Multimedia Contents VI, San Jose, CA USA, J. T. Brassil, S.
  • the given message digest computation algorithm has limitations because OCR technology achieves 99% rather than 100% recognition performance even in ideal cases. This technique can be applied only with drawbacks, in particular to documents for which OCR is very difficult due to the complex writing structure of the language's alphanumeric characters.
  • One example of such languages is Arabic, a widely used language that is very attractive from a commercial point of view.
  • US 2006/0147082 A1 discloses marking of a document with invisible marks. Each mark is preceded by a marker to indicate to the scanner the beginning of a replication of a unique identifying code pattern.
  • the code pattern is formed by dots forming a series of binary coded decimal numbers, so that the number of dots depends on the coded number.
  • the distributed marks do not allow an optimized capacity. Further, the system is sensitive to scanning errors, dirt and other disturbances and influences.
  • The object of the present invention is to provide a document with a preferably invisible encoded portion and a method for generating such a document, wherein the encoded portion allows an optimized high capacity of data and/or is easy to scan and/or read and/or can be read with high security or only few errors.
  • the encoded portion comprises multiple symbols which encode information as data bits.
  • each symbol comprises at most two spatially spaced dots, and/or comprises at least one data bit and at least one synchronization bit or dot. This allows a very dense arrangement of the symbols with high data capacity and fewer reading errors due to the synchronization.
  • An additional or alternative aspect is that the symbols with different data bit values consist of the same number of dots. This facilitates reading and/or error correction.
  • the synchronization dots are regularly arranged and interleaved with data dots. This facilitates reading and/or error correction.
  • symbol shall be understood in particular as a pattern of the encoded portion which pattern is repeated and/or forms a unit containing one data bit or multiple data bits.
  • synchronization means a pattern or dot arrangement in the space domain that is used for calibrating or reading or sensing the spatial location of encoded information or data dots.
  • Fig. 1 shows a document with an encoded portion as constant greyscale background
  • Fig. 2 shows another document according to the present invention
  • Figs. 3a to 3c show schematic diagrams of preferred constructions of the encoded portion according to the present invention
  • Fig. 4 shows a schematic flow chart representing a preferred method for generating a document according to the present invention.
  • Fig. 1 shows an example of a document 1 according to the present invention.
  • the shown document 1 comprises text 2 and an encoded portion 3.
  • the encoded portion 3 preferably forms a background, in particular an at least substantially constant greyscale background for the human eye.
  • the document 1 is preferably a paper printout.
  • the text 2 and the encoded portion 3 are printed, in particular with a laser printer (not shown).
  • the text 2 is superposed onto the encoded background, i.e. encoded portion 3.
  • the encoded portion 3 is interleaved between the text 2. It is possible to provide, for example, rectangular areas around the letters and/or words, sentences and/or lines of the text 2 without encoded portion within these areas. Alternatively, the text can be superposed onto the encoded portion 3 without any consideration of which parts of the encoded portion 3 it covers.
  • Fig. 2 schematically shows another document 1 according to the present invention.
  • this document 1 is an identification (ID) card, driver license or the like.
  • the document 1 may contain an area 4 for a picture or drawing, an area 5 e.g. for a signature, and/or an area 6 e.g. for biographical data, personal data, text or the like.
  • the area(s) 4, 5, 6 may be surrounded or embedded or superposed by or on the encoded portion 3.
  • the encoded portion 3 can be located on the front side and/or on the backside of the document 1 and/or can be combined with printed text, handwriting, images, holograms and/or other optionally encoded patterns.
  • Fig. 3a shows a schematic representation of a preferred construction of the encoded portion 3.
  • the encoded portion 3 comprises or consists of symbols 7.
  • the symbols 7 and, thus, the encoded portion 3, are preferably unreadable or invisible for human eyes.
  • the symbols 7 are machine-readable and encode information as data bits.
  • Fig. 3a shows four symbols 7.
  • Each symbol 7 forms at least one data bit or consists of preferably only spatially spaced dots 8, 9.
  • each symbol 7 comprises at most three or two spatially spaced dots 8, 9.
  • each symbol 7 preferably comprises at least one data bit and at least one synchronization bit or dot 9.
  • At most three spaced dots 8 of one symbol 7 form multiple data bits, preferably two data bits.
  • only one or two spaced dots 8 of one symbol 7 form the at least one data bit, preferably two data bits.
  • the symbol 7 may comprise only one data bit.
  • each symbol 7 comprises multiple data bits, preferably two.
  • each symbol 7 comprises only one data dot 8, and/or preferably only one synchronization dot 9.
  • Each symbol 7 comprises a synchronization region 10 and a data region 11.
  • Preferably only one dot 8, 9 is located in the synchronization region 10 and/or in the data region 11.
  • the synchronization region 10 may be restricted to a space for the only one dot 9 or may cover a wider space, e.g. a stripe with multiple potential dot positions or any other area with multiple potential dot positions.
  • the preferably only one synchronization dot 9 per symbol 7 is located at the same position within each symbol 7.
  • the synchronization dots 9 of multiple or all symbols 7 are preferably regularly arranged. This facilitates scanning and/or reading and decreases errors. Further, this supports the preferably desired constant grey appearance of the encoded portion 3, in particular if it is used as background.
  • the number of synchronization dots 9 per symbol 7 can be changed in order to vary the grey level of the encoded portion 3.
  • the data region 11 comprises four potential positions 12 for the data dot 8.
  • Each of the positions 12 is preferably spaced from the other positions and/or from the synchronization dot 9 or potential positions of synchronization dot(s) 9 and/or from the synchronization region 10. This facilitates correct reading with reduced error rate.
  • each symbol 7 has preferably the form of a square, in particular of 6x6 units corresponding to the minimum dot size, as shown in Fig. 3a.
  • the unit size is preferably about 1/300 inch at 300 dpi and about 1/600 inch at 600 dpi.
  • each symbol 7 can also have any other suitable form, e.g. a rectangular form or the like.
  • a pair of two positions may form one bit. Then, the bit value depends on which of the two positions the data dot 8 occupies. With two such pairs and two data dots 8, two data bits can be formed.
  • preferably only one data dot 8 is provided, which is located in one of the four positions 12, so that a value range of two data bits is also achieved. Further, this reduces the grey level.
  • the data dots 8 of the symbols 7 are interleaved between the synchronization dots 9. This allows a very dense arrangement with high data capacity.
  • the preferably regular arrangement of the synchronization dots 9 facilitates error correction and/or reading and/or decoding of the encoded portion 3.
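The symbol construction of Fig. 3a can be illustrated with a small sketch. The text fixes the symbol size (6x6 units), one synchronization dot at a constant position and one data dot in one of four positions, but it does not give coordinates, so `SYNC_POS` and `DATA_POS` below are assumed values chosen only so that all dots stay spaced from each other, also across adjacent symbols.

```python
# Sketch of one 6x6 data encoding symbol as described for Fig. 3a.
# SYNC_POS and DATA_POS are assumed coordinates, not taken from the patent.
SIZE = 6
SYNC_POS = (0, 0)                              # same position in every symbol
DATA_POS = [(2, 2), (2, 4), (4, 2), (4, 4)]    # list index = 2-bit value


def make_symbol(value):
    """Return a SIZE x SIZE grid (1 = dot) encoding a value in 0..3."""
    if not 0 <= value < len(DATA_POS):
        raise ValueError("value must fit in two bits")
    grid = [[0] * SIZE for _ in range(SIZE)]
    grid[SYNC_POS[0]][SYNC_POS[1]] = 1         # synchronization dot 9
    row, col = DATA_POS[value]
    grid[row][col] = 1                         # single data dot 8
    return grid


def read_symbol(grid):
    """Recover the 2-bit value; reject symbols that violate the layout."""
    if grid[SYNC_POS[0]][SYNC_POS[1]] != 1:
        raise ValueError("synchronization dot missing")
    hits = [i for i, (r, c) in enumerate(DATA_POS) if grid[r][c] == 1]
    if len(hits) != 1:
        raise ValueError("expected exactly one data dot")
    return hits[0]


if __name__ == "__main__":
    for value in range(4):
        symbol = make_symbol(value)
        dots = sum(map(sum, symbol))
        print(value, read_symbol(symbol), dots)
```

Because every symbol carries exactly two dots regardless of its value, the dot density, and with it the perceived grey level, stays constant across the encoded portion.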
  • Fig. 3b shows in a schematic representation another embodiment with different symbol size.
  • the symbol 7 is a square of 8x8 units.
  • the symbol 7 may have multiple synchronization dots 9 in the synchronization region 10. In the example, three synchronization dots 9 are shown.
  • Fig. 3c shows another similar embodiment of the symbol 7.
  • the synchronization regions 10 and/or the data regions 11 of multiple or all symbols 7 are regularly arranged, in particular grid-like.
  • each symbol 7 comprises only one synchronization dot 9 and the synchronization dots 9 of the symbols 7 are regularly arranged, in particular grid-like.
  • the data dots 8 of the symbols 7 are interleaved between the synchronization dots 9. Interleaving means here that the data dots 8 are spatially arranged between the synchronization dots 9. Preferably, the symbols 7 are arranged one adjacent the other.
  • the encoded portion 3 forms a uniform region of greyscale or halftone.
  • the encoded portion 3 forms a background of at least part of the document 1, preferably a constant greyscale background image of at least part of the document 1.
  • the optional text 2 is superposed on the encoded portion 3.
  • the encoded portion 3 can be interleaved between text lines.
  • the background portion visible within letters or words of the text 2 is also grey, but may or may not contain symbols 7, i.e. encoded data.
  • the encoded portion 3 encodes the complete text 2, i.e. contents of the text 2, or contents of the document 1.
  • the encoded portion 3 preferably encodes the complete text 2 on this page.
  • the encoded portion 3 of one or each page preferably encodes the complete text 2 of some, e.g. adjacent, or all pages.
  • the maximum size of the dots 8, 9 forming the symbol 7 is preferably at most 1/300 inch, more preferably 1/600 inch or less, in particular depending on printer resolution.
  • all dots 8, 9 of one symbol 7 and of all adjacent symbols 7 are spatially spaced from each other.
  • the encoded portion 3 comprises a fingerprint (e.g. for watermarking and/or biometric aspects) and/or a digital signature and/or the content of the text 2 of the document 1.
  • the document contains more than 200,000 symbols 7 per page when printed at 300 dpi, or more than 800,000 symbols 7 per page when printed at 600 dpi.
  • each symbol 7 is at most 10/300 inch, preferably 6/300 inch, in particular about 6/600 inch to 10/600 inch or less.
  • the symbol size can be further decreased or increased while relaxing per unit data encoding capacity.
  • the data capacity of the encoded portion 3 is at least 0.6 kbyte per square inch at 300 dpi or at least 2 kbyte per square inch at 600 dpi, preferably about 2.5 kbyte per square inch or more.
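The capacity figures above follow from the symbol geometry. The sketch below reproduces the arithmetic for square symbols of 6x6 units carrying two data bits each; the function names are illustrative, and raw capacity is meant, i.e. before any error correction overhead.

```python
# Raw capacity arithmetic for square symbols of 6x6 units with 2 data bits.
# Function names are illustrative; no error correction overhead is deducted.

def symbols_per_square_inch(dpi, symbol_units=6):
    """Number of symbols that fit into one square inch at the given dpi."""
    per_side = dpi / symbol_units
    return per_side * per_side


def capacity_bytes_per_square_inch(dpi, symbol_units=6, bits_per_symbol=2):
    """Raw data capacity in bytes per square inch."""
    return symbols_per_square_inch(dpi, symbol_units) * bits_per_symbol / 8


if __name__ == "__main__":
    for dpi in (300, 600):
        print(dpi,
              symbols_per_square_inch(dpi),
              capacity_bytes_per_square_inch(dpi))
    # 300 dpi: 2500 symbols and 625 bytes per square inch (about 0.6 kbyte)
    # 600 dpi: 10000 symbols and 2500 bytes per square inch (about 2.5 kbyte)
```

Multiplying the per-square-inch symbol count by the printable area of a page gives the per-page symbol counts; the page size is not stated in the text, so that step is left to the reader.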
  • the data capacity can be calculated as follows:
  • X is the number of unique arrangements that can be achieved for the given number of dots to be used in a single row of size (A). For instance, for a row of size 3 and using 1, 2 and 3 dots, the possible unique arrangements are shown below (Black dot B, White dot W):
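The listing of arrangements and the capacity formula itself are not reproduced in this copy of the text. As a hedged illustration only, the sketch below assumes that an arrangement is a choice of which positions in the row carry a black dot, so that the count for a row of size A and k dots is the binomial coefficient C(A, k); this interpretation is an assumption, not a statement of the patent's formula.

```python
# Enumerate rows of a given size containing a given number of black dots.
# Assumption: an "arrangement" is a choice of black-dot positions in the row.
from itertools import combinations
from math import comb


def arrangements(row_size, dots):
    """Return all rows as strings of 'B' (black dot) and 'W' (white)."""
    rows = []
    for black in combinations(range(row_size), dots):
        rows.append("".join("B" if i in black else "W"
                            for i in range(row_size)))
    return rows


if __name__ == "__main__":
    for k in (1, 2, 3):
        rows = arrangements(3, k)
        assert len(rows) == comb(3, k)
        print(k, rows)
```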
  • the data capacity can be increased by using dots of different colours.
  • Fig. 4 shows a preferred process for generating a document 1 according to the present invention.
  • the original electronic document, e.g. a file readable by a word processing program or the like, is provided in step S1.
  • This electronic document is converted into a graphic (printable) image, e.g. in the format tiff, in step S2 and provided as the foreground contents, in particular the text 2 or any image or the like, in step S3.
  • the electronic document provided in step S1 is compressed in step S4 in order to reduce the data.
  • the data of the document for encoding may be or are interpreted as or converted into a bit stream.
  • the data are encrypted in step S5.
  • An optional error correction coding step S6 and a data scrambling step S7 may follow.
  • a data encoding process step S8 follows to provide a background image, in particular the encoded portion 3, in step S9.
  • in step S10 the foreground content and the background image are superposed.
  • the superposed image or data are then printed in step S11 so that the document 1 according to the present invention is produced, in particular printed.
  • a dye sublimation printer or ink jet printer or laser printer (not shown) is used for the printing process or step S11.
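Steps S4 to S8 of Fig. 4 can be sketched end to end. The text names the steps but not the algorithms, so every concrete choice below is an assumption made for illustration: zlib for compression, a SHA-256 based XOR keystream as a stand-in for encryption (not a vetted cipher), a threefold repetition code for error correction, a seeded permutation for scrambling, and two bits per symbol for the final mapping.

```python
# Sketch of steps S4 to S8 and their inverse. All algorithms are assumed
# stand-ins; the patent text only names the steps.
import hashlib
import random
import zlib


def keystream(key, n):
    """Derive n pseudo-random bytes from the key (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_cipher(data, key):
    """S5 stand-in: XOR with a keystream; applying it twice decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def repeat3(data):
    """S6 stand-in: repeat every byte three times."""
    return bytes(b for b in data for _ in range(3))


def unrepeat3(data):
    """Undo repeat3 with a bitwise majority vote over each triple."""
    out = bytearray()
    for i in range(0, len(data), 3):
        a, b, c = data[i:i + 3]
        out.append((a & b) | (a & c) | (b & c))
    return bytes(out)


def permutation(n, seed):
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


def scramble(data, seed):
    """S7 stand-in: reorder bytes with a seeded permutation."""
    return bytes(data[i] for i in permutation(len(data), seed))


def unscramble(data, seed):
    out = bytearray(len(data))
    for dst, src in enumerate(permutation(len(data), seed)):
        out[src] = data[dst]
    return bytes(out)


def to_symbol_values(data):
    """S8 input: split each byte into four 2-bit symbol values."""
    return [(b >> shift) & 3 for b in data for shift in (6, 4, 2, 0)]


def from_symbol_values(values):
    out = bytearray()
    for i in range(0, len(values), 4):
        a, b, c, d = values[i:i + 4]
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return bytes(out)


def encode(text, key, seed):
    data = zlib.compress(text.encode("utf-8"))   # S4
    data = xor_cipher(data, key)                 # S5
    data = repeat3(data)                         # S6
    data = scramble(data, seed)                  # S7
    return to_symbol_values(data)                # values for S8


def decode(values, key, seed):
    data = from_symbol_values(values)
    data = unscramble(data, seed)
    data = unrepeat3(data)
    data = xor_cipher(data, key)
    return zlib.decompress(data).decode("utf-8")


if __name__ == "__main__":
    values = encode("Official letter, page 1", b"demo key", seed=7)
    print(len(values), decode(values, b"demo key", seed=7))
```

The decoder applies the inverse operations in reverse order. Scrambling after error correction spreads the three copies of each byte across the stream, so a localized print or scan defect is less likely to hit all copies of the same byte.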
  • the document 1 or at least the encoded portion 3 is scanned, preferably at twice the printing resolution. Thus, printing, scanning or reading errors can be avoided or at least minimized.
  • the initial encoded content can be decoded.
  • This information can be used e.g. for OCR or authentication or other purposes as explained in more detail in the following.
  • This invention enables one-to-one basis digital content integrity authentication of valuable hardcopy documents (e.g. contracts, official letters etc.) with large contents. It is independent of content size and can be extended to other applications (e.g. as discussed in the following) as well.
  • Digital authentication process allows a secure document production process (compare Fig. 4), which allows full-contents of the foreground text to be encoded into the superposed background image in machine-readable format. Before encoding the contents each of the following operations: - data compression,
  • ECC error correction coding
  • the document 1 to be authenticated is scanned at a sufficiently high over-sampling rate, and then the data-reading technique is applied to decode the contents encoded in the background image.
  • on the recovered data, all the operations performed in the data encoding process are performed in reverse order, and the resulting contents are output e.g. as a doc file, which can be printed or shown on the computer screen for contents integrity verification.
  • human interaction based authentication can be ensured.
  • the present invention is applicable for an automatic digital authentication or verification process in which contents decoded from the superposed background image are compared with the digitized image (consisting of the superposed background image and the foreground text), i.e. the digitized image used as input.
  • the background image is superposed on the graphic image of the decoded contents and then the comparison is made.
  • the superposed image with foreground contents is divided into two types of regions: 1) consisting of lines of text (with bounding rectangle) and 2) lines without text. Any modification encountered between two text lines is considered noise, whereas the region of text line is defined by the rectangle (with smallest area) bounding the text line.
  • Image Quality can be improved.
  • the quality of the superposed background image (encoded portion 3) is higher than in the prior art.
  • the individual data encoding symbols are completely imperceptible and do not affect the aesthetic appearance of the document 1 and even make the document visually more attractive than those without superposed background image. Any visual inspection of the superposed background image (encoded portion 3) does not give any indication about the existence of encoded data in background image.
  • the present invention allows many pages of foreground text to be encoded, consequently enabling one-to-one basis contents integrity verification (as mentioned above).
  • the higher data encoding capacity and visual quality are attributed to: smaller data encoding symbol size, data encoding symbol pattern mechanism, synchronization recovery mechanism and/or the data-reading technique.
  • the data-reading technique takes the scanned image, which is sufficiently over-sampled in the scanning process, preferably at least twice the printing resolution, as input and recovers the encoded contents from the digitized document image. It handles intentional and unintentional skewing distortion and noise encountered from the print-and-scan process. Almost all existing scanning devices on the market satisfy the over-sampling constraint.
  • a background image of size 5 x 8 cm or 2 x 3.1 inch offers 15 KB raw data encoding capacity, which is sufficient for the requirements of ID cards and much higher than that of the existing ones.
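To show why over-sampling helps, the sketch below reduces a scan taken at twice the printing resolution back to the printed unit grid. It is a simplified stand-in, not the data-reading technique of the invention: it assumes a binarized, perfectly aligned scan and ignores the skew handling and synchronization recovery mentioned above. Each printed unit covers a 2x2 pixel block, and the unit is read as a dot when at least half of its pixels are dark.

```python
# Block voting on a 2x over-sampled, binarized, aligned scan (1 = dark).
# Simplified stand-in: no skew correction or synchronization recovery.

def upsample(units, factor=2):
    """Simulate an ideal scan: every printed unit becomes factor x factor pixels."""
    return [[v for v in row for _ in range(factor)]
            for row in units for _ in range(factor)]


def downsample(scan, factor=2):
    """Recover the unit grid; a unit is a dot if at least half its pixels are dark."""
    rows = len(scan) // factor
    cols = len(scan[0]) // factor
    units = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dark = sum(scan[r * factor + dr][c * factor + dc]
                       for dr in range(factor) for dc in range(factor))
            row.append(1 if dark * 2 >= factor * factor else 0)
        units.append(row)
    return units


if __name__ == "__main__":
    printed = [[1, 0, 0],
               [0, 0, 1]]
    scan = upsample(printed)
    scan[0][0] = 0      # one pixel of a dot lost in scanning
    scan[0][3] = 1      # one speck of noise in an empty unit
    print(downsample(scan) == printed)
```

With four pixels per unit, a single wrong pixel in a block does not change the vote, which is the kind of tolerance a scan at the printing resolution itself cannot offer.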
  • the higher capacity is achieved by the higher data encoding rate (as discussed before) as well as by utilizing the flexibility that is attributed to the background image size.
  • the conventional data strips, two-dimensional barcodes or the like have constraints imposed by the aesthetic appearance that limit them to a fixed size and consequently to a fixed (and lower) capacity that is conventionally not sufficient.
  • underlying nature of the proposed Superposed Background Image along with higher data encoding rate results in (-12.5 KB without
  • Existing portable card readers, e.g. from DATASTRIP Inc. offer the necessary over-sampling rate along with biometrics matching capability, so the product can be launched immediately.
  • the higher capacity for biometrics data storage would result in stronger identity verification techniques by using multiple biometrics characteristics for identity verification.
  • the present invention can be used for bank checks. It allows all the foreground textual contents to be encoded (after encryption) into the background image that is already there for aesthetic appearance of the document (as in [Bre04]).
  • the novel technology offers more resistance than the prior art against counterfeiting attacks due to the nature of its data encoding symbols.
  • the resulting bank checks are inexpensive compared with existing bank checks, which are expensive because they rely on sophisticated security printing technologies for document protection.
  • Smart Document Processing a new dimension for printed document processing
  • Another application of the proposed technology is in faxed documents.
  • the existing fax document quality is poor and is mainly constrained by the transmitted data size (assuming that high quality equipment/sensors are used for scanning).
  • Improving the visual quality of the faxed document requires more data to be transmitted.
  • a document produced with superposed image (encoded portion) results in: - Original quality rather than high quality document to be faxed.
  • Telefax machines characterized by the above scenario may be called Smart Telefax Machines in the future.
  • the document 1 is scanned, and the encoded portion 3 is decoded to obtain the digital data.
  • the digital data are then transmitted to a receiver.
  • the original document is printed, in particular together with the encoded portion.
  • this arrangement and method allows an optimal quality of the document 1 produced at the receiver side.
  • the encoded portion 3 can also be used to improve telefax transmission or legibility of a document 1 sent by telefax or the like.
  • the document 1 is transmitted by telefax and contains the encoded portion 3 with the full foreground contents encoded.
  • the telefax data or the telefax is received.
  • the telefax data can be used directly by a suitable hardware / software arrangement running a data decoding routine to recover data from the encoded portion 3 to recover or correct foreground data, text images or the like.
  • the received telefax will be digitized or scanned (after it has been printed) in order to obtain the respective digital data for running the data decoding routine to recover data from the encoded portion 3 and allow the respective correction if necessary.
  • the corrected data are used to print a corrected document 1 which resembles the original document 1 much better than usual telefaxes.
  • the present invention is applicable for secret or military communication.
  • the underlying characteristics of the proposed technique, namely invisibility, higher data encoding capacity and blind data decoding from the superposed background image, make it very attractive for secret or military communications and also for other government departments, because digital communication is not applicable in all scenarios (e.g. conventional cryptographic techniques make the digital communication suspicious, and the desired digital link might not be available or possible at all).
  • the superposed background image (encoded portion 3) can be used as channel to encode digital data of any type (e.g. text, graphic, audio/video clip etc.) without affecting the aesthetic appearance of the underlying document.
  • the superposed background image (encoded portion 3) is a halftone image that is obtained by the repeated application of specially designed symbols, hereafter called data encoding symbols. These symbols are used to encode digital data into the superposed background image in the data encoding process.
  • the imperceptibility constraint is achieved (without compromising capacity) in particular by partitioning the symbol 7 into parts and decreasing the size (1/600 of an inch) of the basic elements of the data encoding symbol 7; unlike Suzaki et al. in who have done so at the cost of capacity decrease.
  • the present invention use only one third of the size for the symbol, which immediately results in 9 times higher capacity per square unit.
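The factor of nine can be checked with a short area calculation. This is a sketch under the assumption that "one third of the size" refers to the side length of a square symbol; the names A (printable area), s (original side length) and N (number of symbols) are introduced here for illustration only.

```latex
\frac{N_{\text{new}}}{N_{\text{old}}}
  = \frac{A / (s/3)^2}{A / s^2}
  = \frac{9\,A / s^2}{A / s^2}
  = 9
```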
  • the data encoding symbol 7 is optimized with respect to dot gain effects (encountered in the printing process), imperceptibility and capacity. This means that for the given experimental setup (laser printing technology), a further increase in capacity results in a higher error rate, whereas at a lower data capacity the channel would be underutilized.
  • the data encoding symbol 7 allows multiple grey levels to be achieved.
  • the data encoding symbol is partitioned into two parts in which one part deals with data encoding and the other one with synchronization recovery.
  • Multiple grey levels may be obtained by varying the number of black dots in the synchronization recovery region under two constraints: the synchronization recovery process is not affected, and the data decoding process is not affected by the dot gain effects caused by the additional dots that are added for the different grey levels.
  • different colors, such as black, cyan, magenta and yellow, can be used for the dots 8, 9. This means that two bits can be encoded at a single location by using a colored dot (four different colors). By additionally changing the position, two more bits could be encoded, resulting, for example, in a total of four bits per symbol.
  • fewer or more colors can be used, depending on the quality of the available printing and/or scanning equipment.
  • larger dot sizes would facilitate the use of ink jet and dye sublimation printing technologies.
  • the latter offers more colors for printed dots.
  • UV (ultraviolet), IR (infrared) or magnetic inks can be used, with suitable measures (e.g. appropriate dot size), for printing data encoding symbols 7.
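The four-bits-per-symbol example (two bits from the dot color, two bits from the dot position) can be illustrated with a minimal sketch. The bit assignment, the order of the colors and the four candidate positions below are assumptions made for illustration; the source does not specify them.

```python
# Hypothetical mapping: the two high bits select the color, the two low bits select the position.
COLORS = ("black", "cyan", "magenta", "yellow")
POSITIONS = ((0, 0), (0, 1), (1, 0), (1, 1))  # assumed candidate dot positions inside a symbol


def encode_symbol(nibble):
    """Map a 4-bit value to a (color, position) pair."""
    if not 0 <= nibble < 16:
        raise ValueError("a symbol carries exactly four bits")
    return COLORS[nibble >> 2], POSITIONS[nibble & 0b11]


def decode_symbol(color, position):
    """Inverse of encode_symbol."""
    return (COLORS.index(color) << 2) | POSITIONS.index(position)


def encode_bytes(data):
    """Split every byte into two 4-bit symbols (high nibble first)."""
    symbols = []
    for byte in data:
        symbols.append(encode_symbol(byte >> 4))
        symbols.append(encode_symbol(byte & 0x0F))
    return symbols


def decode_symbols(symbols):
    """Rebuild the bytes from consecutive pairs of symbols."""
    nibbles = [decode_symbol(color, position) for color, position in symbols]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

With fewer or more colors, only the number of bits taken from the color index changes; the position bits are unaffected.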
  • the data encoding process takes the foreground contents (doc file) and converts them into a graphic image.
  • the doc file is converted into a binary data stream, on which the following operations are performed in this order: lossless data compression, data encryption, error correction coding and data scrambling.
  • the resulting binary data stream is encoded into a superposed background image using the data encoding symbol 7.
  • the background image is superposed on the foreground contents (graphic image).
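The order of operations on the binary data stream (compression, encryption, error correction coding, scrambling) can be sketched as follows. Every concrete choice here is an assumption for illustration and not the method of the source: zlib stands in for the lossless compressor, a SHA-256 based XOR keystream stands in for the cipher (it is not a vetted encryption scheme), a 3x repetition code stands in for the ECC, and a seeded permutation stands in for the scrambler. The inverse function is included only so the sketch can be checked by a round trip.

```python
import hashlib
import random
import zlib


def keystream(key, n):
    """Stand-in keystream (assumption): SHA-256 over key plus a counter."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_cipher(data, key):
    """XOR with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def to_bits(data):
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]


def from_bits(bits):
    return bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )


def ecc_encode(bits):
    """Stand-in ECC (assumption): repeat every bit three times."""
    return [b for b in bits for _ in range(3)]


def ecc_decode(bits):
    """Majority vote over each group of three bits."""
    return [1 if sum(bits[i:i + 3]) >= 2 else 0 for i in range(0, len(bits), 3)]


def permutation(n, seed):
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


def scramble(bits, seed):
    return [bits[src] for src in permutation(len(bits), seed)]


def unscramble(bits, seed):
    out = [0] * len(bits)
    for dst, src in enumerate(permutation(len(bits), seed)):
        out[src] = bits[dst]
    return out


def encode(doc, key, seed):
    """Compression -> encryption -> ECC -> scrambling, in that order."""
    encrypted = xor_cipher(zlib.compress(doc), key)
    return scramble(ecc_encode(to_bits(encrypted)), seed)


def decode(bits, key, seed):
    """The same operations undone in reverse order."""
    encrypted = from_bits(ecc_decode(unscramble(bits, seed)))
    return zlib.decompress(xor_cipher(encrypted, key))
```

The returned bit list is what would then be mapped onto data encoding symbols. Scrambling comes last so that a local defect on paper is spread over many ECC groups after unscrambling.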
  • the superposition process consists of two stages: 1) selection of a suitable data encoding region, and 2) elimination of artifacts caused by data encoding symbols on the foreground contents in overlapping regions. The latter stage is common to both region selection methods and is handled in the same way in both cases.
  • in the first method for data encoding region selection, data is encoded uniformly over the entire background image, and the errors caused by the overlapping foreground contents are compensated by the data scrambling and error correction coding (ECC).
  • in the second method, the graphic image file of the foreground contents is processed to find the regions of the background image that the foreground does not overlap, and these regions of the background image are used for data encoding.
  • a region in a text document is defined by the rectangle of minimum area surrounding a text line, and the x, y coordinates of the four corner points of all such rectangular regions encountered in the entire image are encoded separately into the background image with a higher ECC overhead to make them more robust against errors.
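A minimal sketch of finding the minimum-area rectangle around each text line is given below. It assumes a non-empty binary image stored as a list of rows with 0 for black and 1 for white, and it assumes that text lines are separated by at least one completely white row; the function name and the (top, left, bottom, right) output format are hypothetical.

```python
def text_line_rectangles(image):
    """Return one (top, left, bottom, right) rectangle per run of rows containing ink."""
    rectangles = []
    top = None
    white_row = [1] * len(image[0])
    # A white sentinel row closes a text line that touches the bottom edge.
    for i, row in enumerate(image + [white_row]):
        has_ink = 0 in row
        if has_ink and top is None:
            top = i  # first row of a new text line
        elif not has_ink and top is not None:
            columns = [j for r in image[top:i] for j, v in enumerate(r) if v == 0]
            rectangles.append((top, min(columns), i - 1, max(columns)))
            top = None
    return rectangles
```

The four corner points of each rectangle follow directly from the returned tuple, and the background area outside all rectangles is what remains available for data encoding.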
  • the compensation process that accounts for artifacts caused by data encoding symbols on the foreground contents works as follows.
  • denote the pixel value at position (i,j) of the foreground text image, the superposed background image and the resulting image after superposition by X(i,j), Y(i,j) and Z(i,j), respectively.
  • the given pixel (i,j) of the resulting image Z(i,j) takes the pixel value of the foreground text image X(i,j) when X(i,j) is a black pixel; otherwise the pixel value of the resulting image is always the value of the superposed background image Y(i,j), regardless of whether it is black or white.
  • Z(i, j) = X(i, j) AND Y(i, j), where AND is the pixel-wise logical AND.
  • X and Y are two binary matrices of the same size, with black and white pixels represented by the binary values "0" and "1", respectively.
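The superposition rule can be written as a pixel-wise AND on binary matrices with black = 0 and white = 1, as stated in the text. The sketch below uses plain nested lists; the function name is hypothetical.

```python
BLACK, WHITE = 0, 1


def superpose(X, Y):
    """Z(i, j) = X(i, j) AND Y(i, j): a black foreground pixel always wins,
    otherwise the background pixel is copied unchanged."""
    if len(X) != len(Y) or any(len(rx) != len(ry) for rx, ry in zip(X, Y)):
        raise ValueError("X and Y must have the same size")
    return [[x & y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
```

Because black is 0, the AND yields black wherever the foreground is black and reproduces the background everywhere else, which matches the rule described for Z(i,j).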
  • the printed document 1 is scanned at a sufficiently high over-sampling rate (at least twice) and then the data reading technique is applied.
  • the data reading technique applies different filters that deal with synchronization recovery in a noisy environment, noise elimination from the data encoding region and identification of the information encoding dots.
  • the objective of the synchronization recovery filter is to identify and accurately locate the dot 9 used for synchronization recovery.
  • a synchronization recovery dot suffers from dot gain effects caused by the up to four neighbouring data encoding dots 8 separated by 1/600 of an inch. The accuracy of the located position is 1/600 of an inch. If a synchronization error is encountered (for instance due to overlapped symbols), the average value of the immediately neighbouring synchronization dots 9 is taken.
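The fallback for a synchronization error can be sketched as follows, assuming the detected synchronization dots are held in a regular grid of (x, y) positions in which an undetected dot is marked as None. Averaging over the up to four direct grid neighbours is an interpretation of "immediately neighbouring"; the grid layout and the function name are hypothetical.

```python
def repair_sync_grid(grid):
    """Replace each missing position by the mean of its detected direct neighbours."""
    rows, cols = len(grid), len(grid[0])
    repaired = [list(row) for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not None:
                continue
            neighbours = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] is not None
            ]
            if neighbours:  # leave the entry as None if no neighbour was detected
                repaired[r][c] = (
                    sum(x for x, _ in neighbours) / len(neighbours),
                    sum(y for _, y in neighbours) / len(neighbours),
                )
    return repaired
```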
  • the filter dealing with the information decoding process identifies and locates the position of the data encoding dot in the region defined by the synchronization dots of three neighbouring data encoding symbols (and a fourth one for the symbol under consideration), while taking into account the noise from the print and scan (PS) process. It has to locate one position among the different positions that are used by the data encoding symbol for encoding multiple data bits per symbol. It also takes measures (e.g. using the slope of the greyscale region) to determine whether a given position lies in the critical region resulting from dot gain effects of the PS process. The accuracy needed for successful data decoding for a given symbol is again 1/600 of an inch.
  • the resulting data is converted to the bit stream corresponding to the doc file (encoded into the background image), while performing the operations (data compression, data encryption, error correction coding and data scrambling) in reverse order.
  • the data encoding and reading techniques serve the following goals:
  • the higher capacity can be utilized to lock the document (contracts, official letters, educational certificates, bank checks) with digital signatures and/or some desirable biometric characteristics (fingerprints, voice print, iris scan, signatures, etc.).
  • digitized signatures and fingerprints can also be encoded into the superposed background image and the resulting printed document would offer the same characteristic as the conventional contract writing process where involved parties sign the document (as a proof of willingness).
  • Another application of the present invention is an extension of the technique to documents with text consisting of non-Roman letters.
  • the same type of documents e.g. contracts, official letters etc.
  • OCR technology either does not exist yet or the underlying nature (complex language structure) of the contents makes it highly difficult to develop such technology (with a very small error rate).
  • Examples are the Arabic, Persian, Chinese and Urdu languages with complex language structure, which (except for Chinese) use more or less the same alphabet.
  • authentication is again done on one-to-one basis using the contents recovered from the superposed background image.
  • the technique using a superposed background image can result in a more secure identity verification process by offering more capacity than conventional 2-D barcodes etc. for biometric data storage, allowing multiple biometric characteristics to be used. It also eliminates the need for a central database system for record keeping, as the full contents can be encoded into the document.
  • an ID card could be a better choice for the superposed background image, as it usually has little foreground content.
  • An extension could be to use a non-constant greyscale image as the superposed background image.
  • the desired graphic image is to be transformed into a very light image (e.g. varying within grey levels 210-220), and this image would be halftoned using some of the grey levels available from the data encoding technique. In halftoning, the data encoding symbol would be used for the screening process.
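The transformation into a very light image can be sketched as a linear remapping of grey values. The sketch assumes an 8-bit scale with 0 as black and 255 as white and takes the 210-220 range from the example in the text; the function name and the linear mapping are assumptions.

```python
def lighten(image, low=210, high=220):
    """Map 8-bit grey values 0..255 linearly into the light range low..high."""
    span = high - low
    return [[low + round(value * span / 255) for value in row] for row in image]
```

Each resulting level would then be matched to one of the grey levels that the data encoding symbol can render.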
  • Fax document quality can be improved up to the quality of the originally printed document.
  • the quality of the transmitted document can be improved together with a decrease in the size of the data that is conventionally transmitted.
  • the higher capacity can also be utilized for computational gain: multiple copies of the foreground contents can be encoded and arranged in such a way that the first copy can be recovered very quickly without reading the data from the whole background image. If the first copy is not recovered successfully, the next copy can be tried.
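The copy-by-copy recovery can be sketched as follows. The source does not say how a successful recovery is recognized, so a CRC-32 checksum appended to each copy is assumed here; the function names are hypothetical.

```python
import zlib


def pack_copy(payload):
    """Append a CRC-32 so that a decoded copy can be validated (assumption)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unpack_copy(blob):
    """Return the payload if its checksum matches, otherwise None."""
    if len(blob) < 4:
        return None
    payload, checksum = blob[:-4], blob[-4:]
    return payload if zlib.crc32(payload).to_bytes(4, "big") == checksum else None


def recover_first_valid(copies):
    """Try the copies in order and stop at the first one that validates."""
    for index, blob in enumerate(copies):
        payload = unpack_copy(blob)
        if payload is not None:
            return index, payload
    return None
```

If `copies` is a generator that decodes one region of the background image at a time, the later regions are never read once an earlier copy validates.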
  • the printed document with the superposed background image can be used for counterfeiting and copy detection purposes.
  • the very large size of the superposed background image requires about 800,000 data encoding symbols to be identified correctly from the scanned document in order to launch a successful counterfeiting attack.
  • predetermined pseudorandom locations can be selected and used for counterfeiting detection, which would still demand the same effort from the counterfeiter as before (meaning that almost all symbols must be decoded correctly).
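Checking only predetermined pseudorandom locations can be sketched with a seeded generator, so that the verifier can regenerate the same locations from a shared seed. The seed handling, the number of checked locations and the strict all-must-match rule are assumptions for illustration.

```python
import random


def check_locations(total_symbols, count, seed):
    """Reproducibly pick `count` distinct symbol indices from a shared seed."""
    return sorted(random.Random(seed).sample(range(total_symbols), count))


def looks_genuine(expected, decoded, locations):
    """Accept only if every checked symbol decodes to the expected value."""
    return all(expected[i] == decoded[i] for i in locations)
```

Since the counterfeiter does not know which locations will be checked, nearly all symbols still have to be reproduced correctly.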
  • a copy attack can easily be detected because, regardless of the quality of the copying device and the application of over-sampling in the scanning process, a document printed from a scanned copy cannot reproduce the visual quality of the originally printed document. Furthermore, the encoded information cannot be recovered from a copied document.
  • the synchronization dots 9 may be used.
  • the present invention may also be used in the opposite case where the dots 8, 9 are white and/or colored and the other area or background is black or colored.
  • This scenario is similar to the negative of an image and can be used for the same dot size as described or for larger dot sizes (1/300, 1/200, 1/150 or 1/75 of an inch).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Editing Of Facsimile Originals (AREA)
EP07785987A 2007-03-02 2007-07-11 Dokument mit einem codierten teil Withdrawn EP2119217A1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07785987A EP2119217A1 (de) 2007-03-02 2007-07-11 Dokument mit einem codierten teil

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
EP07004358 2007-03-02
EP07785987A EP2119217A1 (de) 2007-03-02 2007-07-11 Dokument mit einem codierten teil
PCT/EP2007/006126 WO2008107001A1 (en) 2007-03-02 2007-07-11 Document with encoded portion

Publications (1)

Publication Number Publication Date
EP2119217A1 (de) 2009-11-18

Family

ID=38657176

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07785987A Withdrawn EP2119217A1 (de) 2007-03-02 2007-07-11 Dokument mit einem codierten teil

Country Status (2)

Country Link
EP (1) EP2119217A1 (de)
WO (1) WO2008107001A1 (de)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2010238503B2 (en) * 2010-10-29 2013-08-29 Canon Kabushiki Kaisha Two dimensional information symbol

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0634262B2 (ja) * 1988-05-10 1994-05-02 株式会社ジェーエスケー 光学読取コード及び情報伝達方法
US5296693A (en) * 1991-12-16 1994-03-22 Canon Kabushiki Kaisha Ink intrusion resistant digital code
US5416311A (en) * 1993-01-05 1995-05-16 Canon Kabushiki Kaisha Data storage device with multiple levels of spacial density
US5818032A (en) * 1997-01-03 1998-10-06 Sun; Tsu-Hung Tom Encoded color halftone micro-dots for high density digital information storage
US6487301B1 (en) * 1998-04-30 2002-11-26 Mediasec Technologies Llc Digital authentication with digital and analog documents
EP1484710B1 (de) * 1998-11-19 2008-01-09 Digimarc Corporation Ausweisdokument mit Photo
EP1087330A3 (de) * 1999-09-21 2002-04-10 Omron Corporation Zweidimensionaler Punktcode und Lesegerät dafür
AU5886801A (en) * 2000-05-09 2001-11-20 Colorzip Media Inc Machine readable code and method and device of encoding and decoding the same
US6959866B2 (en) * 2002-05-30 2005-11-01 Ricoh Company, Ltd. 2-Dimensional code pattern, 2-dimensional code pattern supporting medium, 2-dimensional code pattern generating method, and 2-dimensional code reading apparatus and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2008107001A1 *

Also Published As

Publication number Publication date
WO2008107001A1 (en) 2008-09-12

Similar Documents

Publication Publication Date Title
US7644281B2 (en) Character and vector graphics watermark for structured electronic documents security
Chiang et al. Printer and scanner forensics
JP4137084B2 (ja) 不正顕示機能付文書を処理するための方法、及び、不正顕示機能付文書の妥当性検証を行うための方法
US6978035B2 (en) Information hiding system, method, and printed matter into which information is hidden
US8430301B2 (en) Document authentication using hierarchical barcode stamps to detect alterations of barcode
US8064102B1 (en) Embedding frequency modulation infrared watermark in digital document
US8335342B2 (en) Protecting printed items intended for public exchange with information embedded in blank document borders
US8243982B2 (en) Embedding information in document border space
Picard Digital authentication with copy-detection patterns
US7787154B2 (en) Font printing system having embedded security information comprising variable data periodic line patterns
WO2002065381A1 (en) Document printed with graphical symbols which encode information
JP2004201321A (ja) ハードコピー保護文書を作成、検証するシステムおよび方法
US8139270B2 (en) Variable data periodic line patterns for composing a font system
CN101119429A (zh) 一种数字水印嵌入与提取的方法及装置
SK10072003A3 (sk) Dátový kanál pozadia na papierovom alebo inom nosiči
EP2222072A2 (de) Erkennungsmaschine basierend auf der Zeichensatzeingabe für Musterzeichensätze
Mayer et al. Fundamentals and applications of hardcopy communication
WO2015140562A1 (en) Steganographic document alteration
JP4461487B2 (ja) 画像処理方法および画像処理装置並びに真偽判定方法
US9361516B2 (en) Forensic verification utilizing halftone boundaries
EP2119217A1 (de) Dokument mit einem codierten teil
Briffa et al. Imperceptible printer dot watermarking for binary documents
Tkachenko Generation and analysis of graphical codes using textured patterns for printed document authentication
Borges et al. Document watermarking via character luminance modulation
Geisselhardt et al. High-capacity invisible background encoding for digital authentication of hardcopy documents

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20090807

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

RIN1 Information on inventor provided before grant (corrected)

Inventor name: IQBAL, TASWAR

Inventor name: GEISSELHARDT, WALTER

DAX Request for extension of the european patent (deleted)
17Q First examination report despatched

Effective date: 20131021

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20140201