WO2005114548A1 - Method for the assessment of the quality and usability of digital cheque images - Google Patents

Method for the assessment of the quality and usability of digital cheque images

Info

Publication number
WO2005114548A1
Authority
WO
WIPO (PCT)
Prior art keywords
cheque
image
descriptors
run length
descriptor
Prior art date
Application number
PCT/GB2005/002019
Other languages
English (en)
Inventor
David Hilton
Peter Wells
Weichao Tan
Original Assignee
Enseal Systems Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enseal Systems Limited
Publication of WO2005114548A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/04 Payment circuits
    • G06Q20/042 Payment circuits characterized in that the payment protocol involves at least one cheque
    • G06Q20/0425 Payment circuits characterized in that the payment protocol involves at least one cheque, the cheque being electronic only

Definitions

  • This invention relates to a method for the assessment of the quality and usability of digital cheque images.
  • a cheque is any valuable document that is scanned, where the scanned image of the document needs to be assessed for quality purposes.
  • An example of a cheque is that defined as a check in the US legislation The Check Clearing for the 21st Century Act.
  • a complication is the fact that cheques or images of cheques are typically processed at high speeds for the purposes of clearing. This implies the use of high speed scanners, converting paper documents into electronic equivalents. Whether this process occurs at a local branch or at a central clearing house, the essential feature is that no human intervention should be required because of the adverse impact on the rate of processing.
  • This invention is concerned with a process whereby the images produced by scanners are assessed for quality using algorithms that function at very high speeds on standard desktop computers.
  • the invention is a method for assessing the quality of a digital image of a cheque; comprising the steps of: (a) scanning a particular cheque to generate a scanned, digital image; (b) generating a representation of the scanned image using a descriptor that identifies a visible kind of feature that is potentially present in a localised region of the scanned image, the representation occupying less data space than a rasterised version of the scanned image; (c) automatically assessing the quality of that particular, scanned image by using the descriptor for the localised region in the scanned image.
  • a virtual image is an abstract representation that describes the appearance of a cheque in mathematical terms; (it does not necessarily correspond to an actual, real image of a cheque).
  • a descriptor is fully defined by one or more parameters.
  • Parameters for a specific descriptor can identify different kinds of visible features present in the image and distinguish between these kinds of features. Further, the parameters, or the range of values they may take, for a specific descriptor can describe the range in appearance that a feature can potentially take in an image due to variations or defects in the printing or scanning process.
  • a descriptor is the result obtained using a run length analysis of the cheque image.
  • a black and white image is uniquely defined by listing the black run lengths and white run lengths traversing the image in a rasterised fashion (row by row).
  • a letter such as "E" will have several long black run lengths at the top, bottom and middle but rather more short run lengths corresponding to the thickness of the vertical line. Run lengths can also be calculated vertically.
  • Run length calculations from rasterised images are computationally very simple and hence can be calculated at great speed, an essential requirement for this invention.
  • the scanning can be high speed, automated scanning of bulk quantities of cheques.
  • One or more parameters of the descriptor constitute a run length profile; the run length profile can then be the histogram of run length frequencies in a given localised region. Translation of peaks in the histogram corresponds simply to varying the print heaviness or darkness. Different run length profiles, each associated with the descriptor obtained using run length analysis, may be sufficient to distinguish between areas of upper and lower case text. Different run length profiles can also be sufficient to distinguish between streaks and properly formed text.
  • the quality of the scanned cheque image can be assessed by comparison of descriptors measured from multiple localised regions in the scanned cheque image with a set of standardised descriptors; the standardised descriptors can be derived from a template or calculated by analysis of a representative set of cheques.
  • the locations of the localised descriptors can be used to further verify the authenticity of the cheques by assessing their conformity to a known template or design of cheque.
  • run length profiles of different localised regions are used to assign each region to one of a range of categories corresponding to the likely nature of text or image of which the region is a part.
  • the invention will be described with reference to the accompanying Figure 1, in which the first 3 bar charts show the run length profile of the text of the payee on a cheque as the heaviness of the printing decreases and the two modal values move leftwards.
  • the 4th bar chart shows the more diffuse run length profile of a hand written signature.
  • the present invention describes methods of providing compact and efficient localised descriptors of cheque data to build virtual representations or images of cheques.
  • the descriptors are able, by means of a small number of parameters, to describe both the range of the kinds of features that may occur on the documents in question and also the range in appearance that those features can take, brought about by variation in the printing and scanning processes.
  • the virtual images have smaller amounts of data than the raw rasterised formats — i.e. the mathematical abstract representation is more compact than the rasterised format, leading to computational efficiencies.
  • the invention utilises the descriptors to assess the quality of images by recognising, for instance, where a descriptor corresponds to unacceptably heavy text, or to features, such as streaks, which may arise from badly maintained scanners.
  • descriptors are also used to compare images with known templates in order to detect fraud.
  • the descriptors are supplemented by information about the location of features.
  • descriptors may also be derived from a set of images, so that any incoming image may be assessed to determine whether it lies within the tolerances of the set in question.
  • a paying bank receives digital images of many thousands of cheques from banks of first deposit, reconverting banks and cheque cashing outlets.
  • cheques are processed within a narrow time frame and failure to identify irregularities of any sort means that the liability for the transaction's correctness passes to the receiving bank.
  • a first requirement therefore, is that the images shall contain the required information in a form that can be handled by automatic means, the cost of a human operator being prohibitive.
  • This invention comprises a solution to the problem of assessing image quality in such circumstances. Essentially what is presented is an algorithm for processing digital images at high speed to assess whether or not there is clear textual or symbolic information present.
  • the invention also comprises a solution to the problem of assessing the conformity of cheque images to a known design.
  • digital images of cheques are presented in a variety of file formats, TIFF, BMP, JPEG etc., the common feature being that all can be converted into rasterised files representing the cheque.
  • the images may be greyscale, having values for each pixel ranging from 0 to 255, or black and white, having just two possible states for each pixel.
  • For greyscale images, a conversion to black and white takes place before the algorithm is applied, this being achieved by use of a simple threshold decision for each pixel.
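The thresholding step can be illustrated with a minimal sketch. The source only says that a simple per-pixel threshold is used; the function name `binarise`, the default threshold of 128 and the convention that 1 means black are assumptions made for illustration.

```python
def binarise(grey_rows, threshold=128):
    """Convert a greyscale raster (rows of 0-255 ints) to black and white.

    Returns rows of 1 (black) and 0 (white). The threshold of 128 is an
    illustrative assumption, not a value given in the source.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in grey_rows]


# Two rows of a hypothetical greyscale scan.
grey = [
    [250, 240, 30, 20, 245],
    [255, 10, 15, 200, 128],
]
bw = binarise(grey)
print(bw)  # [[0, 0, 1, 1, 0], [0, 1, 1, 0, 0]]
```

Lowering the threshold makes fewer pixels black, which is one way the same text can end up with a lighter or heavier appearance after scanning.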
  • the descriptors are fully defined by a small number of parameters and have two particularly important properties.
  • the variability of the parameters must be such that any features which might appear on a cheque can be distinguished one from another by the parameters to which they correspond.
  • a cheque might have text in upper case or lower case and the parameters for these cases should be sufficiently distinct to allow discrimination between the two cases.
  • the parameters for a logo or similar feature should differ appreciably from those for text or those for unintended features such as blobs and streaks.
  • the descriptors should be able to describe the range in appearance of different features that may occur as a result of the printing and scanning processes. If, for instance, an original electronic image is printed with more than the average amount of toner and then subsequently scanned on a scanner whose threshold is set to a low value, then text will have a dark appearance. Nonetheless, it should still be recognised as corresponding to the same text when imaged in different conditions.
  • a feature of the descriptors must be that the sets of parameters describing these two cases have properties that distinguish them from sets of parameters corresponding to other features. Likewise, if a cheque is printed or scanned at a different resolution the parameters of the descriptors must still be recognisable as belonging to the same set.
  • the process of producing a virtual image takes place by inspecting the image to find localised regions or zones where there is significant data, grouping the zones in some way and then deriving descriptors for each zone. From these it should be possible to assess what text is present, whether or not it lies within the acceptable range of darkness or lightness or whether there are features that are purely contingent on some failure of the mechanical processing.
  • the descriptors can be attached to prescribed locations (e.g. from a known template or design of cheque) and the failure of a cheque to contain descriptors with the correct parameters at the correct locations is taken as prima facie evidence for the existence of a counterfeit.
  • Run lengths: one possible set of such descriptors is produced by the well-known process of describing the image in "run lengths". That is to say, a black and white image is uniquely defined by listing the black run lengths and white run lengths traversing the image in a rasterised fashion (row by row). A letter such as "E" will have several long black run lengths at the top, bottom and middle but rather more short run lengths corresponding to the thickness of the vertical line. Run lengths can also be calculated vertically.
  • the descriptor then is essentially the histogram of the run length frequencies taken in a particular locality.
  • the whole image is subdivided into small, localised regions or segments whose width would be about 3 letters and whose height 2 letters, depending on font size.
  • the black and white run lengths within each segment are calculated.
  • the frequencies of the run lengths are used to provide a profile for each segment.
  • the result will be a representation or virtual image consisting of sets of parameters derived from run length analysis.
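The segmentation and run length steps can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the use of `collections.Counter` for the histogram and the segment sizes are assumptions, and only horizontal runs are counted.

```python
from collections import Counter
from itertools import groupby


def row_run_lengths(row):
    """Return (colour, length) pairs for one row of 0/1 pixels."""
    return [(colour, len(list(group))) for colour, group in groupby(row)]


def segment_profile(bw, top, left, height, width):
    """Histogram of horizontal run lengths inside one rectangular segment.

    Returns two Counters mapping run length -> frequency: one for black
    runs (pixel value 1) and one for white runs (pixel value 0).
    """
    black, white = Counter(), Counter()
    for row in bw[top:top + height]:
        for colour, length in row_run_lengths(row[left:left + width]):
            (black if colour == 1 else white)[length] += 1
    return black, white


def virtual_image(bw, seg_height, seg_width):
    """Split the raster into a grid of segments and profile each one."""
    rows, cols = len(bw), len(bw[0])
    return [
        [segment_profile(bw, top, left, seg_height, seg_width)
         for left in range(0, cols, seg_width)]
        for top in range(0, rows, seg_height)
    ]


# Tiny raster containing a crude letter "E" (1 = black).
letter_e = [
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]
black, white = segment_profile(letter_e, 0, 0, 5, 5)
print(dict(black))
print(dict(white))
```

In the toy "E", the long black runs come from the horizontal bars and the runs of length 1 from the vertical stroke, which matches the description of the letter given earlier. In practice the segment size would be chosen from the font size, roughly 3 letters wide and 2 letters high as stated above.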
  • Figure 1 shows three profiles of the same piece of text from a cheque where the threshold set for the scanner is varied, and as a comparison a profile for a hand written signature is included. It is readily apparent that by simple inspection of the profile the text can be discriminated from the signature and in addition the heaviness of the text can be assessed.
  • the run length profile will have recognisably the same shape but will have less contrasting peaks and hollows.
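One simple way to quantify print heaviness from a profile is to track how far its dominant peak has moved relative to a reference profile. The sketch below is a simplification: Figure 1 is described as having two modal values, whereas this code follows only the single most frequent black run length. The function names and the sample histograms are hypothetical.

```python
def modal_run_length(profile):
    """Most frequent run length in a {run_length: count} profile.

    Ties are broken in favour of the shorter run.
    """
    return max(profile, key=lambda length: (profile[length], -length))


def heaviness_shift(reference_black, measured_black):
    """Positive result: strokes are thicker than the reference (heavier print)."""
    return modal_run_length(measured_black) - modal_run_length(reference_black)


# Hypothetical black run length histograms for the same payee text.
reference = {3: 40, 4: 25, 12: 10}
heavy = {5: 38, 6: 20, 14: 9}
light = {2: 35, 3: 20, 10: 8}

print(heaviness_shift(reference, heavy))  # 2
print(heaviness_shift(reference, light))  # -1
```

A negative shift corresponds to the modal value moving leftwards, as described for lighter printing.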
  • the process of quality assessment or verification of a cheque is implemented by comparing the profiles measured from the cheque being tested, with the previously categorised run length profiles, and assessing the level of correlation.
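The comparison against previously categorised profiles might look like the sketch below. The source speaks only of "assessing the level of correlation"; the choice of Pearson correlation, the normalisation to relative frequencies, the cut-off `max_length` and the category names are all assumptions.

```python
from math import sqrt


def to_vector(profile, max_length):
    """Turn a {run_length: count} profile into a fixed-length frequency vector.

    Runs longer than max_length are ignored in this sketch.
    """
    total = sum(profile.values()) or 1
    return [profile.get(length, 0) / total for length in range(1, max_length + 1)]


def correlation(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def best_category(measured, categorised, max_length=16):
    """Return (category name, correlation) of the best-matching reference profile."""
    vec = to_vector(measured, max_length)
    scores = {name: correlation(vec, to_vector(ref, max_length))
              for name, ref in categorised.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference profiles for three categories.
categorised = {
    "text": {2: 30, 3: 45, 4: 20, 10: 5},
    "dots": {1: 80, 2: 20},
    "dark": {14: 30, 15: 40, 16: 30},
}
measured = {2: 25, 3: 50, 4: 18, 9: 7}
name, score = best_category(measured, categorised)
print(name, round(score, 3))
```

A low best score for every category could itself be treated as a warning that the segment contains something unexpected.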
  • those parts of the cheque which contain usable text can be identified, initially as a set of small segments, where the borders of the segments are not precisely at the borders of the text.
  • adjacent segments can be linked together to form larger areas of text.
  • Run length profiles will also reveal the presence of blobs or streaks which may arise from the scanning process.
  • the run length profiles of the localised regions or segments are used to assign each segment to one of a range of categories corresponding to the likely nature of text or image of which the segment is a part.
  • For example, there may be a category for good quality text, a category for areas with many small run lengths arising from the presence of small dots, and a category for many long run lengths corresponding to dark areas.
  • a virtual image or representation of the cheque can be formed whose values are the category values. Any one of these segments may be inaccurately assigned on account of the limited sample size, but the accumulation of values will give a good indication of the nature of the underlying image or text.
  • a set of segments corresponding to a dark category might indicate the presence of heavy inking or a scanning artefact.
  • a grouped set of segments classified as text might indicate the position of the textual data and provide a starting point for character recognition procedures.
  • a search may similarly be made for other features, using only the virtual image of category values and thus drastically reducing computational requirements.
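Working only on the grid of category values, adjacent segments of the same category can be linked into larger areas with a standard connected-component search. This is a sketch under assumptions: the category labels, the function name and the use of 4-connectivity (up, down, left, right) are not specified in the source.

```python
from collections import deque


def group_segments(categories, wanted):
    """Link 4-connected segments of category `wanted` into larger areas.

    `categories` is the grid of category values (the virtual image).
    Returns a list of areas, each a sorted list of (row, col) segment indices.
    """
    rows, cols = len(categories), len(categories[0])
    seen = set()
    areas = []
    for r in range(rows):
        for c in range(cols):
            if categories[r][c] != wanted or (r, c) in seen:
                continue
            # Breadth-first search from this unvisited segment.
            queue, area = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                area.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and categories[ny][nx] == wanted
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            areas.append(sorted(area))
    return areas


# Hypothetical virtual image: one category label per segment.
grid = [
    ["blank", "text", "text", "blank", "dark"],
    ["blank", "text", "blank", "blank", "dark"],
    ["blank", "blank", "blank", "text", "text"],
]
text_areas = group_segments(grid, "text")
dark_areas = group_segments(grid, "dark")
print(text_areas)
print(dark_areas)
```

Because the grid has one value per segment rather than one per pixel, the search touches far fewer cells than it would on the raster image, which is the computational saving the text refers to.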
  • OCR (optical character recognition) methods
  • Typical information to be conveyed by a cheque includes a Bank routing number, the amount, the account number and these all appear on the "MICR" line in a completely defined font at the bottom of the cheque.
  • the name of the Bank on which the cheque is drawn and the number of the cheque often appear on preprinted stock.
  • the amount (in words and in numbers), and the payee name are printed at time of issue or handwritten in the case of personal cheques.
  • a measure of the overall quality of the imaging can be obtained by assessing the presence of good text in areas where it should be present and assessing whether any additional located data is indicative of serious imaging defects or is merely superficial noise.
  • the expected locations of data can be specified with complete precision rather than merely being confined within broad bounds. This implementation is required where there is an attempt to detect counterfeit cheques because fraudsters frequently fail to achieve complete accuracy when copying cheque designs.
  • This detection can be carried out by providing a template derived from the original cheque design.
  • the template will comprise two types of item. First, there are items which are identical in size and position on every cheque produced for a given account. These items include the Bank's name and possibly logo, certain words such as 'pay,' 'date' etc and other decorative elements and are usually printed on the original cheque stock. Descriptors for these items will be very precise. Secondly there are items which are printed at the time of cheque issue such as the payee and value of the cheque. Now these latter items will have variable size and content but nonetheless will be fairly closely described by descriptors because they will be printed with a known font.
  • the template described above is derived by analysis of a set of cheques prior to being used for cheque verification.
  • This analysis produces a set of localised descriptors, identifying those zones which are identical on every cheque and those which are variable.
  • the analysis goes further by identifying the variability in the parameters describing the descriptors. By assessing the deviations it is possible to discover whether or not any selected cheque lies within the acceptable bounds of accuracy. This method is particularly powerful as the template can be updated as more cheques are assessed.
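A template that records the variability of a descriptor parameter and can be updated as more cheques are assessed could be sketched as below. The source does not say how deviations are measured; the running mean and standard deviation (Welford's method), the tolerance of `k` standard deviations, the `floor` on the tolerance and the zone parameter used in the example are all assumptions.

```python
from math import sqrt


class ParameterStats:
    """Running mean and variance of one descriptor parameter (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, value):
        """Fold one more observed value into the template."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def std(self):
        """Sample standard deviation (0.0 until two values have been seen)."""
        return sqrt(self._m2 / (self.count - 1)) if self.count > 1 else 0.0

    def within_bounds(self, value, k=3.0, floor=0.5):
        """True if value lies within k standard deviations of the mean.

        `floor` is a minimum tolerance so that a zone seen with zero
        variance does not reject every tiny deviation (an assumption).
        """
        return abs(value - self.mean) <= max(k * self.std(), floor)


# Hypothetical parameter: modal black run length in the bank-name zone,
# measured on a set of known-good cheques.
stats = ParameterStats()
for observed in [3, 3, 4, 3, 3, 4, 3]:
    stats.update(observed)

print(stats.within_bounds(4))  # True
print(stats.within_bounds(7))  # False

# An accepted cheque can then be folded back into the template.
stats.update(4)
```

A full template would hold one such record per parameter per zone, keyed by the zone's expected location on the cheque.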

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present invention describes methods of providing compact and efficient localised descriptors of cheque data to build virtual images of cheques. A virtual image is an abstract representation that describes the appearance of a cheque in mathematical terms; it does not necessarily correspond to an actual, real image of a cheque. The descriptors are able, by means of a small number of parameters, to describe both the range of the kinds of features that may occur on the documents in question and the range in appearance that those features can take, brought about by variation in the printing and scanning processes. The virtual images have smaller amounts of data than the raw rasterised formats; that is, the mathematical abstract representation is more compact than the rasterised format, leading to computational efficiencies.
PCT/GB2005/002019 2004-05-20 2005-05-20 Method for the assessment of the quality and usability of digital cheque images WO2005114548A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB0411245.4A GB0411245D0 (en) 2004-05-20 2004-05-20 A method for the assessment of quality and usability of digital cheque images with minimal computational requirements
GB0411245.4 2004-05-20

Publications (1)

Publication Number Publication Date
WO2005114548A1 (fr) 2005-12-01

Family

ID=32607646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2005/002019 WO2005114548A1 (fr) Method for the assessment of the quality and usability of digital cheque images 2004-05-20 2005-05-20

Country Status (2)

Country Link
GB (2) GB0411245D0 (fr)
WO (1) WO2005114548A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365451A (zh) * 2020-10-23 2021-02-12 微民保险代理有限公司 Method, apparatus, device and computer-readable medium for determining image quality level

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780967B (zh) * 2017-01-09 2019-06-11 深圳怡化电脑股份有限公司 Banknote version identification method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3345908A (en) * 1963-08-16 1967-10-10 Ibm Print characteristics displayer
US3347131A (en) * 1963-08-16 1967-10-17 Ibm Quantitative image measurement process for printed material
US4426731A (en) * 1980-10-21 1984-01-17 International Business Machines Corporation Character recognition employing compressed image data
US4504972A (en) * 1981-02-27 1985-03-12 Siemens Aktiengesellschaft Method and apparatus for automatic recognition of image and text or graphics areas on printed masters
GB2297159A (en) * 1992-08-03 1996-07-24 Ricoh Kk Special document discrimination system
US5818965A (en) * 1995-12-20 1998-10-06 Xerox Corporation Consolidation of equivalence classes of scanned symbols

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4590606A (en) * 1982-12-13 1986-05-20 International Business Machines Corporation Multi-function image processing system
US5537486A (en) * 1990-11-13 1996-07-16 Empire Blue Cross/Blue Shield High-speed document verification system
US5600732A (en) * 1994-12-08 1997-02-04 Banctec, Inc. Document image analysis method
JP4254204B2 (ja) * 2001-12-19 2009-04-15 富士ゼロックス株式会社 Image collation apparatus, image forming apparatus, and image collation program
GB0313002D0 (en) * 2003-06-06 2003-07-09 Ncr Int Inc Currency validation
US20050096992A1 (en) * 2003-10-31 2005-05-05 Geisel Brian R. Image-enabled item processing for point of presentment application

Also Published As

Publication number Publication date
GB2414297A (en) 2005-11-23
GB0510334D0 (en) 2005-06-29
GB0411245D0 (en) 2004-06-23

Similar Documents

Publication Publication Date Title
Elkasrawi et al. Printer identification using supervised learning for document forgery detection
US7587066B2 (en) Method for detecting fraud in a value document such as a check
US9542752B2 (en) Document image compression method and its application in document authentication
US8144368B2 (en) Automated methods for distinguishing copies from original printed objects
Gebhardt et al. Document authentication using printing technique features and unsupervised anomaly detection
KR101515256B1 (ko) Document verification using a dynamic document identification framework
US20060210138A1 (en) Verification of authenticity of check data
US20080310721A1 (en) Method And Apparatus For Recognizing Characters In A Document Image
US8743425B2 (en) Method for using void pantographs
van Beusekom et al. Automatic authentication of color laser print-outs using machine identification codes
Jain et al. Passive classification of source printer using text-line-level geometric distortion signatures from scanned images of printed documents
US8903155B2 (en) Optical waveform generation and use based on print characteristics for MICR data of paper documents
Piekarczyk Hierarchical random graph model for off-line handwritten signatures recognition
CN101118592A (zh) Printer forensics method based on character printing features
Garain et al. On automatic authenticity verification of printed security documents
Chhabra et al. Detecting fraudulent bank checks
WO2005114548A1 (fr) Method for the assessment of the quality and usability of digital cheque images
Van Beusekom et al. Automatic counterfeit protection system code classification
US20050147296A1 (en) Method of detecting counterfeit documents by profiling the printing process
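KR101232684B1 (ko) Method for discriminating banknote authenticity using a Bayesian approach
CN115205882A (zh) Intelligent identification and processing method for expense vouchers in the medical industry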
Tkachenko et al. Robustness of character recognition techniques to double print-and-scan process
CN111445433B (zh) Method and device for detecting blank pages and blurred pages of electronic case files
Patgar et al. An unsupervised intelligent system to detect fabrication in photocopy document using Variations in Bounding Box Features
Naseem et al. Counterfeit Recognition of Pakistani Currency

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase