AU2005203141B2 - A method of detecting streams of postal items - Google Patents

A method of detecting streams of postal items

Info

Publication number
AU2005203141B2
AU2005203141B2 AU2005203141A AU2005203141A AU2005203141B2 AU 2005203141 B2 AU2005203141 B2 AU 2005203141B2 AU 2005203141 A AU2005203141 A AU 2005203141A AU 2005203141 A AU2005203141 A AU 2005203141A AU 2005203141 B2 AU2005203141 B2 AU 2005203141B2
Authority
AU
Australia
Prior art keywords
run
closely related
image
item
ocr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2005203141A
Other versions
AU2005203141A1 (en
Inventor
Belkacem Benyoub
Matthieu Letombe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Solystic SAS
Original Assignee
Solystic SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Solystic SAS
Publication of AU2005203141A1
Application granted
Publication of AU2005203141B2
Legal status: Ceased
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40: Document-oriented image-based pattern recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/24: Character recognition characterised by the processing or recognition method
    • G06V30/242: Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244: Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font

Abstract

The method involves forming a digital image of the surface of each postal item that bears address information, and binarizing the digital image. Closely related components (CCs) are extracted from the binarized image of each item, both to perform automatic recognition of the address by optical character recognition (OCR) and to distinguish a group of homogeneous postal items. An independent claim is also included for a sorting machine for implementing the postal item processing method.

Description

AUSTRALIA
Patents Act
COMPLETE SPECIFICATION (ORIGINAL)

Name of Applicant: Solystic
Actual Inventor(s): Matthieu Letombe, Belkacem Benyoub
Address for Service and Correspondence: PHILLIPS ORMONDE & FITZPATRICK, Patent and Trade Mark Attorneys, 367 Collins Street, Melbourne 3000, AUSTRALIA
Invention Title: A METHOD OF DETECTING STREAMS OF POSTAL ITEMS
Our Ref: 747915
POF Code: 469254/458825

The following statement is a full description of this invention, including the best method of performing it known to applicant(s):

A METHOD OF DETECTING RUNS OF UNIFORM MAIL ITEMS

The invention relates to a method of handling mail items, which method comprises forming a digital image of each mail item, which image includes address information, and performing an automatic address recognition operation by optical character recognition (OCR).

The invention relates more particularly to a mail handling method that makes it possible to detect "runs". A "run" is typically a succession or a group of uniform mail items, often a very large number of uniform mail items, having visible physical characteristics that are identical. The mail items come from the same mail sender and are addressed to different destinations. Such mail items can be referred to as "major user" mail items or as "referenced customer" mail items.

Figure 1 very diagrammatically shows the mail items E of a run F. More precisely, Figure 1 shows two mail items E, and it can be seen that the front faces of said two mail items E bear address information.
In its top left corner, each of the two mail items also bears the name and address of the sender, indicated by reference x; in its bottom left corner, it also bears the name of the group (indicated by reference y) to which the sender belongs; and, in its center, slightly offset rightwards, it bears an address block indicated by the reference z, the address block being disposed in a framed region under a plastics film. All of the mail items E of the run F are constituted in the same manner, and each of them bears the elements x, y, z that are characteristic of the run F and that are placed in the same positions on the front faces of the mail items. Only the destination address in the block z changes from one mail item E to another in the run F, as can be seen in Figure 1.

Currently, runs are identified visually by the operators while they are feeding sorting machines with mail items.
More particularly, when an operator identifies a run, the operator can adjust the parameters of the automatic OCR recognition process for recognizing the destination address on the sorting machine. But often that parameterization is not optimum and is not used because it requires expertise in the use of OCR that the operator generally does not possess. When the operator does possess the expertise necessary for such manual parameterization, said parameterization is generally performed only on very long runs, i.e. runs made up of thousands of mail items, because implementing it requires actions of measurement, of optimization and of data input that slow down overall operation of the sorting machine. The OCR parameters can, for example, be adjusted in the manner presented in European Patent Document EP 1 159 705.

Generally, mail items constituting runs are sent to post offices by referenced customers who enjoy "discount" contracts. Discount contracts make provision for a reduction in the postage charge, which reduction is a function of the outward sorting rate of reading and of the inward sorting rate of reading of the addresses by OCR. Generally the rates of reading are determined by means of test batches of mail items at the time of signing of the contract, and they are not checked throughout the entire term of the contract, because determining rates of reading is lengthy to perform and thus costly. Thus, post offices do not have any means of checking that the performance levels specified in the contract are indeed complied with by the referenced customers.

A reference herein to a patent document or other matter which is given as prior art is not to be taken as an admission or a suggestion that that document or matter was known or that the information it contains was part of the common general knowledge as at the priority date of any of the claims.
It would be desirable to provide a method of automatically detecting runs for the purposes of automatically parameterizing the OCR address reader and of automatically monitoring the discount contracts.

According to one aspect, the present invention provides a method of handling mail items comprising the steps of:
- forming a digital image of a current mail item, which image includes address information;
- performing an automatic address recognition operation by OCR from the image of the mail item;
- automatically identifying a group or run of uniform mail items, comprising a step of extracting run image attributes of the current mail item which are suitable for distinguishing characteristics of a run of uniform mail items, and comparing these run attributes corresponding to the current item with run model attributes that are prerecorded in memory in a catalogue, for detecting a match between said run attributes corresponding to the current item and run model attributes corresponding to a particular run model;
wherein, in response to a match being detected, these steps follow:
- enriching said run model attributes corresponding to said particular run model with said run attributes corresponding to the current item;
- recording OCR results for the current item in correspondence with said particular run model for generating optimized adjustment parameters for OCR;
- monitoring, for this particular run model, whether a counting value, indicating the number of consecutive items preceding the current item that match the run model attributes of this particular run model, exceeds a certain threshold value for setting OCR operation with optimized adjustment parameters corresponding to the particular run model;
and, in the case where no match is detected, recording in memory in the catalogue said run attributes corresponding to the current item as run model attributes corresponding to a new run model.
An arrangement provides a method of handling mail items, which method comprises forming a digital image of each mail item, which image includes address information, and performing an automatic address recognition operation by OCR, said method being characterized in that it further comprises processing for automatically identifying groups of uniform mail items or "runs", which processing comprises a step consisting in extracting image attributes from the image of the mail item, which attributes are suitable for distinguishing a group or "run" of uniform mail items.

In a particular implementation of the method of the invention, the processing for identifying runs comprises a step consisting in extracting closely related components from the image of the mail item.

These closely related components are already extracted for performing the OCR address recognition. With the method of the invention, the same information is thus reused to detect runs, so that a new feature is added in simple manner to an automatic mail sorting process.

With the method of the invention, the OCR automatic address recognition system can be adjusted automatically each time a new run is identified. In the same way, with the automatic run identification of the invention, it is possible to organize systematic and automatic determination of the rate of reading of the identified runs for the purpose of managing discount contracts in post offices.

In another implementation of the method of the invention, runs are listed in a catalogue in the form of a set of closely related components that are predetermined for each run. Thus, during processing of a current mail item, the closely related components extracted from the digital image of the current mail item are compared with the predetermined closely related components of the runs in the catalogue for detecting whether or not the current mail item belongs to one of the runs.
In another implementation of the method of the invention, the processing for identifying runs comprises a step consisting in extracting gray scale image attributes from the image of the mail item, and the runs are listed in a catalogue in the form of gray scale image attributes that are predetermined for each run.

In another implementation of the method of the invention, the catalogue contains OCR adjustment parameters associated with the runs, which adjustment parameters improve automatically.

In another implementation of the method of the invention, a current mail item is identified as belonging to a run, and the digital image of the current mail item is recorded in a database with the results of the OCR address recognition operation for the purpose of checking the rate of reading for the run.

The present invention also provides a mail sorting machine specially arranged to implement the above-defined method.

The invention will be better understood on reading the following description and on examining the accompanying figures. The description is given merely by way of indicative example and is in no way limiting to the invention. In the figures:
- Figure 1 shows two mail items belonging to a run;
- Figure 2 is a very general flow chart showing run detection in the OCR address recognition process;
- Figure 3 is a flow chart of the method of the invention for detecting a run;
- Figure 4 shows in more detailed manner the process of Figure 3 for comparing closely related components;
- Figure 5 shows the catalogue of Figure 3 in more detail;
- Figure 6 is a detailed flow chart of the various steps making it possible to decide that a run has been detected and to determine the actions to be taken;
- Figure 7 is a very general flow chart showing another implementation of run detection; and
- Figure 8 is a diagram showing extraction of attributes from a gray scale image.
Figure 2 very diagrammatically shows an implementation of the method of the invention for automatically detecting or identifying runs in an automatic OCR address recognition process.

The process of automatically handling a mail item in a sorting machine includes automatic address recognition by OCR, whereby a digital image 1 is formed of the surface of the current mail item bearing address information. The digital image of the mail item is binarized at 2, and, at 3, closely related components (CCs) are derived from the binarized digital image. The composition of said closely related components is described in more detail below. The closely related components (CCs) of the mail item are then used to locate the address block, and the rows and characters of the postal address, and serve for automatic address recognition 4 by OCR.

The method of the invention also uses the closely related components extracted at 3 as data suitable for distinguishing, at step 5, a group or run of uniform mail items. As shown in Figure 2, run detection 5 can be performed in parallel with OCR automatic address recognition so that the process loops back at L for each subsequent mail item with the result of the run detection. The information obtained on the basis of the run detection and on the basis of the OCR address recognition performed on the current mail item is used during the OCR processing performed on the subsequent mail item and during the associated run detection.

Figure 3 shows the run detection 5 of Figure 2 in more detailed manner and starting from step 6. Figure 3 also shows the forming 1 of the digital image of the surface of the current mail item, the binarization 2, and the extraction of the closely related components 3 from said digital image.
In step 6, the closely related components of the current mail item are compared with the closely related components of models that are prerecorded and listed in a dynamic catalogue 7 stored in a memory. The term "model" is used to mean a set of predetermined closely related components that define (characterize) a run. The process 6 of comparing the closely related components is described in more detail below with reference to Figure 4.

When, at 6, it is found that the closely related components of the current mail item match 8 the closely related components of a model from the catalogue, the model in question is enriched 9 with the closely related components of the current mail item so as to consolidate the robustness of the model. When it is found 10 that the closely related components of the current mail item do not match the closely related components of any of the models of the catalogue 7, a new model is created 11 in the catalogue 7, e.g. by overwriting an old model, the closely related components of the new model then being the closely related components of the current mail item. Each model in the catalogue can readily be associated with a model identity number, which is kept in a memory 12 of the OCR. The OCR outcomes determined in step 4 can also be recorded in the OCR memory 12 in connection with the identity number of a run model.

In the method of the invention, a decision module of the OCR repeatedly analyzes the OCR memory 12 so as to detect a run, and decides at 13 whether to execute functions resulting from runs being detected.

More particularly, when the decision module has decided at 13 that the current mail item belongs to a current run, it can undertake actions 14 to optimize automatic address reading by OCR.
When it is detected that a current mail item is part of a run, the decision module can also, at 15, cause information on the current mail item, and optionally on the preceding mail items also belonging to said current run, to be stored in a database 16 for the purposes of monitoring major user discounts.

An implementation of the method of the invention is thus based on using the closely related components as image attributes suitable for distinguishing a group of uniform mail items so as to characterize runs. In a manner known per se, a closely related component in a binary image is constituted by all of the black spots that can be reached from one another by passing only through immediately adjacent black spots, without going via a white spot. In Figure 1, two closely related components 17, "S" and "Raspail" (in joined-up writing), each of which appears identically on every one of the mail items E of the run F, are highlighted by being boxed-in by respective dashed-line frames 18. The extraction 3 of the closely related components from a binarized digital image thus consists in identifying each of the closely related components in a table of closely related components by a plurality of criteria, e.g. by the position of the rectangular frame 18 surrounding the closely related component 17 in the image, by the length and the width of the frame 18, by the density of black inside the frame 18, and by the number of closely related pixels.

Figure 4 is a more detailed flow chart of the process of comparing 6 the closely related components of Figure 3 so as to determine whether a current mail item can be associated with a model from the catalogue 7. At step 3, all of the closely related components in the image of the current mail item are extracted. Of all of said closely related components, only the most significant are kept, by eliminating the others in step 20, e.g. only the one hundred having the largest number of pixels are kept.
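The extraction and pruning steps described above can be sketched as follows. This is a minimal illustration, not the implementation used in the sorting machine: the `Component` record, the function name `extract_components`, the choice of 8-neighbour adjacency, and the representation of the binarized image as a list of rows of 0/1 values are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Component:
    x: int          # left edge of the surrounding frame
    y: int          # top edge of the surrounding frame
    width: int      # frame width
    height: int     # frame height
    pixels: int     # number of black pixels in the component
    density: float  # black pixels divided by frame area


def extract_components(image, keep=100):
    """Label components of a binary image (1 = black) and keep the largest ones.

    Assumption: pixels touching by an edge or a corner belong together.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            min_r = max_r = r
            min_c = max_c = c
            count = 0
            while queue:
                cr, cc = queue.popleft()
                count += 1
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        inside = 0 <= nr < rows and 0 <= nc < cols
                        if inside and image[nr][nc] == 1 and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            w = max_c - min_c + 1
            h = max_r - min_r + 1
            found.append(Component(min_c, min_r, w, h, count, count / (w * h)))
    # Keep only the most significant components (largest pixel counts).
    found.sort(key=lambda comp: comp.pixels, reverse=True)
    return found[:keep]
```

Sorting by pixel count and truncating mirrors the example in the text of keeping the one hundred largest components; the cut-off is exposed as the `keep` parameter so that other values can be tried.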
The elimination at 20 of the smallest closely related components makes it possible to accelerate the process of comparison at 6 and also sometimes to eliminate the destination address, which address is generally different on the different mail items belonging to a run. At 21, each discriminating closely related component of the current mail item is compared with each discriminating closely related component of each of the models. For this purpose, a cascade of filters is used with the various criteria of the table of closely related components, a compatibility threshold being given for each of the criteria. For example, firstly the positions of the frames surrounding the closely related components are compared, and then, in the event of success relative to the chosen compatibility threshold, the lengths and widths of the frames are compared, etc. For each of the models in the catalogue, it is then known which closely related components, and thus how many (step 22) closely related components, of the current mail item match the closely related components of the model. At 23, the model having the largest number of closely related components that match is kept, and said number is compared at 24 with a threshold which is, for example, 30 matching closely related components. If this comparison step is conclusive, the current mail item is detected as being part of a run identified by a model from the catalogue 7.

Figure 5 shows an example of the structure of the catalogue 7 shown in Figure 3. The catalogue 7 is a structured file and mainly includes four elements of information for each model 30, namely, an identity number 31, a set of closely related components 32, optimization parameters for adjusting the OCR 33, and a utility indicator for indicating the utility of the model 34.

In addition, the catalogue 7 is made up of two portions. A first portion 35 of the catalogue is dynamic and its contents are continuously changing.
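The cascade comparison of steps 21 to 24 described above can be sketched as follows. The tolerance values in `TOLERANCES`, the one-to-one pairing of components, and the use of "greater than or equal to" for the final threshold test are assumptions made for illustration; the text only specifies that each criterion has its own compatibility threshold and gives 30 matching components as an example.

```python
from typing import NamedTuple


class Component(NamedTuple):
    x: int
    y: int
    width: int
    height: int
    pixels: int
    density: float


# Hypothetical compatibility thresholds, one per criterion.
TOLERANCES = {"position": 5, "size": 4, "density": 0.15, "pixels": 0.25}


def components_match(a, b, tol=TOLERANCES):
    """Cascade of filters: the cheapest criterion first, stop at the first failure."""
    if abs(a.x - b.x) > tol["position"] or abs(a.y - b.y) > tol["position"]:
        return False
    if abs(a.width - b.width) > tol["size"] or abs(a.height - b.height) > tol["size"]:
        return False
    if abs(a.density - b.density) > tol["density"]:
        return False
    return abs(a.pixels - b.pixels) <= tol["pixels"] * max(a.pixels, b.pixels)


def count_matches(item, model):
    """Step 22: count item components that pair with a distinct model component."""
    used = set()
    matches = 0
    for comp in item:
        for index, ref in enumerate(model):
            if index not in used and components_match(comp, ref):
                used.add(index)
                matches += 1
                break
    return matches


def best_model(item, catalogue, threshold=30):
    """Steps 23 and 24: keep the best-scoring model, then apply the threshold.

    catalogue maps a model identity number to its list of components.
    Returns the identity number of the matching model, or None.
    """
    best_id, best_count = None, 0
    for model_id, model in catalogue.items():
        n = count_matches(item, model)
        if n > best_count:
            best_id, best_count = model_id, n
    return best_id if best_count >= threshold else None
```

Testing the frame position first means that most non-matching pairs are rejected after two integer comparisons, which is the point of ordering the filters as a cascade.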
It is in said first portion 35 of the catalogue 7 that a new model 30 associated with a current mail item is recorded (created) at 11, when the closely related components of the current mail item do not match 10 the closely related components of the models in the catalogue 7. When said first portion 35 of the catalogue 7 is full, it is possible to overwrite a model 30 in said first portion 35 so as to create 11 a new model 30. The model that is overwritten is, for example, the model whose utility indicator 34 is the lowest. The utility indicator 34 of a model 30 depends on the utility of the model 30 in the method of the invention, i.e. on the time elapsed since it was last recognized, on the number of times it has been recognized, on whether it led to a decision to detect a run, on the utility of the OCR parameters of the model, etc. Recording a new model 30 in the first portion 35 consists in giving an identity number 31 to the model 30. For example, in correspondence with said identity number 31, the following are recorded: the one hundred most significant closely related components 32 of the current mail item; the OCR parameters 33 used for performing automatic address recognition on the current mail item; and the utility indicator 34.

In addition, when a current mail item has been recognized as being associated 8 with a model 30 from the first portion 35 of the catalogue 7, the information elements of the model are enriched by the current mail item. The identity number 31 of the model 30 does not change, so as to enable the decision module to recognize the association of the current mail item with the model 30. The set of closely related components 32 of the model is enriched 9 with the closely related components of the current mail item.
The values of each of the criteria of the table of closely related components are averaged for those closely related components of the set 32 of the model 30 that match closely related components of the current mail item. The closely related components of the current mail item that do not match the closely related components of the model are recorded in the set of closely related components 32 of the model, until the set contains the maximum number of closely related components for a model 30. This enrichment 9 of the closely related components of the model 30 makes it possible to consolidate the model by making it more reliable and, for example, more tolerant to noise on the closely related components during binarization of the digital image. Likewise, in the model 30, the OCR parameters 33 of the current mail item are recorded over the old ones, and the utility indicator 34 is updated.

Initially, models 30 of runs for referenced customers who have entered into commercial agreements, or models of runs that occur very frequently and that comprise very large numbers of mail items, are recorded in a non-erasable second portion 36 of the catalogue. Once recorded in the second portion 36 of the catalogue 7, the information elements of the models 30 never change again. The set 32 of closely related components of said models is complete and cannot be enriched any further with closely related components. The OCR parameters 33 for the mail items associated with the models 30 are fully optimized, typically by sampling, and they are not changed. The models 30 of the second portion 36 of the catalogue 7 have consecutive identity numbers 31. The utility indicator 34 is unnecessary.
In addition to the initially-recorded models 30, models 30 coming from the first portion 35 of the catalogue 7 are recorded at 37 in the second portion 36 of the catalogue 7 when said models have OCR parameters that are fully optimized and recur frequently, i.e. when their utility indicators 34 have exceeded a certain threshold. These models 30 are recorded in spaces left vacant in the second portion 36 of catalogue 7 and they do not overwrite the previously-recorded models 30.
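The life cycle of a model in the two-portion catalogue (creation with overwriting of the least useful model, enrichment, and promotion to the non-erasable portion) can be sketched as follows. The class and method names, the cap `MAX_COMPONENTS`, the plain two-value average, and the "+1" utility update are all hypothetical; the text lists the factors the utility indicator depends on but gives no formula, and the consecutive numbering of the second portion is not modelled here.

```python
from dataclasses import dataclass, field

MAX_COMPONENTS = 100  # hypothetical maximum number of components per model


@dataclass
class Model:
    identity: int
    components: list                 # each component is a tuple of numeric criteria
    ocr_params: dict = field(default_factory=dict)
    utility: float = 0.0
    optimized: bool = False          # True once the OCR parameters are fully optimized


def enrich(model, matched, unmatched, ocr_params):
    """matched: list of (index in model.components, matching item component)."""
    for index, comp in matched:
        old = model.components[index]
        # Assumption: a plain average of the stored and the new criteria values.
        model.components[index] = tuple((a + b) / 2 for a, b in zip(old, comp))
    room = MAX_COMPONENTS - len(model.components)
    model.components.extend(unmatched[:max(room, 0)])
    model.ocr_params = dict(ocr_params)  # newest parameters replace the old ones
    model.utility += 1.0                 # placeholder for the real utility update


class Catalogue:
    def __init__(self, dynamic_size, permanent_size):
        self.dynamic = {}     # first portion: contents change continuously
        self.permanent = {}   # second portion: never overwritten
        self.dynamic_size = dynamic_size
        self.permanent_size = permanent_size
        self._next_id = 1

    def create(self, components, ocr_params):
        """Record a new model, overwriting the least useful one when full."""
        if len(self.dynamic) >= self.dynamic_size:
            victim = min(self.dynamic.values(), key=lambda m: m.utility)
            del self.dynamic[victim.identity]
        model = Model(self._next_id, list(components[:MAX_COMPONENTS]), dict(ocr_params))
        self._next_id += 1
        self.dynamic[model.identity] = model
        return model

    def promote(self, identity, utility_threshold):
        """Move a model to the second portion if it qualifies and space is left."""
        model = self.dynamic.get(identity)
        if (model is not None and model.optimized
                and model.utility > utility_threshold
                and len(self.permanent) < self.permanent_size):
            self.permanent[identity] = self.dynamic.pop(identity)
            return True
        return False
```

Keeping the two portions in separate dictionaries makes the "never overwrite the second portion" rule structural: `create` only ever evicts from `dynamic`.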
Figure 6 shows in more detail the steps whereby the information contained in the OCR memory 12 and in the catalogue 7 is analyzed by a decision module 40 of the mail handling machine in order to decide at 13 whether a run is detected and whether to perform the functions resulting from a run passing, and to determine at 14 the suitable actions to be triggered. In step 41, and on the basis of the OCR memory 12, the decision module 40 analyzes the identity numbers 31 of the models 30 associated with the latest mail items. If the identity number 31 of the model associated with the mail item preceding the current mail item is different, at 42, from the identity number of the model associated with the current mail item, no action relating to a run is triggered in step 43. Otherwise, at 44, a search in the catalogue 7, in step 45, makes it possible to determine whether the model 30 associated with the current mail item is a model 30 in the second portion 36 of the catalogue 7 or whether said model 30 has already been the subject of a run. If it has, at 46, the decision module decides (detects) in step 13 that the current mail item belongs to a run and triggers performance of functions resulting from a run being detected. Otherwise, at 47, the decision module 40 counts, in step 48, the number of consecutive mail items, including the current mail item, that are associated with the same model 30. If, as at output 49, this number is less than or equal to four, for example, no action relating to a run is triggered. If, as at the output 50, said number is greater than four, the decision module decides, at step 13, that the current mail item belongs to a run, and triggers performance of functions resulting from a run being detected.
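The decision logic of Figure 6 can be sketched as a single function. The function name, the representation of the OCR memory as a list of model identity numbers, and the set `models_with_run` used to remember that a model "has already been the subject of a run" are assumptions made for the example; the threshold of four is the example value given in the text.

```python
def detect_run(history, permanent_ids, models_with_run, max_without_run=4):
    """Decide whether the current mail item belongs to a run.

    history: model identity numbers of the latest items, current item last.
    permanent_ids: identities of models in the second portion of the catalogue.
    models_with_run: identities of models that already gave rise to a run
                     (updated in place; an assumption of this sketch).
    """
    if len(history) < 2:
        return False
    current = history[-1]
    # Steps 42/43: a different model on the preceding item means no run action.
    if history[-2] != current:
        return False
    # Steps 45/46: known run models trigger detection immediately.
    if current in permanent_ids or current in models_with_run:
        return True
    # Step 48: count consecutive items, current one included, with this model.
    consecutive = 0
    for model_id in reversed(history):
        if model_id != current:
            break
        consecutive += 1
    # Outputs 49/50: a run is detected only above the threshold.
    if consecutive > max_without_run:
        models_with_run.add(current)
        return True
    return False
```

For example, with an empty `permanent_ids`, four consecutive items of model 3 do not trigger detection, while the fifth does; once model 3 has been recorded in `models_with_run`, two consecutive items are enough on its next appearance.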
A function resulting from a run being detected is the function of monitoring discounts that involves, at 15, transmitting instructions to the database 16, so that the identity number 31 of the model 30 associated with the mail items of the run, the OCR outcomes of all of the mail items of the run, and the images of a few mail items of the run, e.g. one in fifty, among other things, are retrieved and recorded in files identified, for example, by the date, hour, minute, and second of the recording.

Another function resulting from a run being detected is the function of automatically optimizing the OCR parameters for reading the addresses of the mail items subsequent to the current mail item. The OCR parameters 33 recorded in correspondence with the identity number 31 of the model associated with the mail items of the run are retrieved, and their level of optimization is checked at 52. If said OCR parameters are fully optimized 53 (this generally applies for runs of mail items from referenced customers), they are transmitted during the looping back L to the OCR for reading the addresses on the subsequent mail items. If the OCR parameters 33 are not fully optimized 54 (this generally applies to runs whose model has just been created in the catalogue), in addition to the OCR parameters 33 of the catalogue 7, the decision module 40 analyzes the OCR outcomes and the OCR parameters of the mail items associated with said model and contained in the OCR memory, and determines, at 14, new OCR parameters for automatically improving the automatic address recognition by OCR. The new OCR parameters are then transmitted during the looping back L to the OCR for reading the subsequent mail item. The OCR parameters will probably be improved over numerous mail items before they enable optimum address recognition to be achieved.
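The discount-monitoring function described above (recording the model identity number, every OCR outcome, and the image of roughly one item in fifty, under a timestamped identifier) can be sketched as follows. The class `RunRecord`, the convention that an unread address is stored as `None`, and the definition of the rate of reading as the fraction of items whose address was read are assumptions made for this example.

```python
from datetime import datetime


class RunRecord:
    """Hypothetical record kept in the database for one detected run."""

    def __init__(self, model_id, image_every=50, now=None):
        self.model_id = model_id
        self.image_every = image_every
        # File identifier built from the date, hour, minute and second.
        self.started = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        self.outcomes = []   # OCR outcome of every mail item of the run
        self.images = []     # images of a sample of the mail items

    def add(self, ocr_outcome, image):
        # Keep the image of the first item and then of every Nth item.
        if len(self.outcomes) % self.image_every == 0:
            self.images.append(image)
        self.outcomes.append(ocr_outcome)

    def reading_rate(self):
        """Assumed definition: share of items whose address was read."""
        if not self.outcomes:
            return 0.0
        read = sum(1 for outcome in self.outcomes if outcome is not None)
        return read / len(self.outcomes)
```

Storing every outcome but only a sample of images keeps the record small while still leaving an operator enough images to identify the sender of a run.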
More particularly, an OCR parameter making it possible to accelerate OCR reading is the parameter concerning the distinction between "handwritten" and "typed" text. After having located the address block and the rows, the OCR address recognition unit seeks to determine whether the writing in the address block is handwritten or typed. Sometimes, the decision is unambiguous as of this step, and further, unnecessary processing is stopped, but other times, the decision between handwritten and typed writing is taken only at the end of the OCR processing, when the OCR outcome is produced. In the invention, analysis of the OCR outcomes of the current mail item and of the preceding mail items of the current run can lead to detecting that the writing is typed for the current run, and it is then no longer necessary to perform handwritten writing recognition, thereby avoiding read errors and rejects.

Likewise, when the processing detects that the destinations are the same for all of the mail items of the current run, which is typically a case of error in locating the address block by the OCR (the OCR is probably reading a block containing the address of the sender of the run), then with the invention, it is possible to stop that error by preventing the OCR from reading the erroneous address block and by making it locate another address block in the image. The OCR outcomes of the previously processed mail items belonging to the current run can be re-considered, and the mail items can be redirected, when the sorting machine is equipped with a delay line that makes it possible not to sort a mail item immediately once its OCR outcome is known.

Optimizing the OCR parameters can also relate to:
- the position of the address block when no address has been read;
- the binarization parameters;
- the character font used;
- the order of the elements of the address (post code or "ZIP" code, city, street); and
- the processing mode to be used, e.g. domestic or international.
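The two run-level deductions described above (writing always typed, and identical destinations revealing a wrongly located address block) can be sketched as follows. The function name, the dictionary keys `writing` and `destination`, the hint names, and the minimum number of items are hypothetical; the text does not say how many items must agree before the OCR parameters are changed.

```python
def run_hints(outcomes, min_items=5):
    """Derive OCR adjustments from the outcomes of the items of the current run.

    outcomes: list of dicts with the hypothetical keys
              'writing' ('typed' or 'handwritten') and 'destination'.
    """
    hints = {"skip_handwriting": False, "relocate_address_block": False}
    if len(outcomes) < min_items:
        return hints  # too few items to conclude anything about the run
    # Every item typed: handwriting recognition can be switched off.
    if all(outcome["writing"] == "typed" for outcome in outcomes):
        hints["skip_handwriting"] = True
    # Same destination on every item: the OCR is probably reading the
    # sender's block, so another address block should be located.
    if len({outcome["destination"] for outcome in outcomes}) == 1:
        hints["relocate_address_block"] = True
    return hints
```

A call such as `run_hints(memory_for_current_run)` would be made by the decision module after each item, with the resulting hints passed to the OCR on the loop back for the next item.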
The method of the invention that is described above for optimizing OCR parameters is based on the assumption that the subsequent mail item belongs to the current run. It is also possible to optimize the OCR parameters for reading the address of the current mail item by performing OCR processing in two OCR address recognition passes.

The OCR parameters can also be optimized in the manner described in European Patent Document EP 1 159 705 by using video encoding. The optimized OCR parameters are then recorded in the corresponding model in the catalogue for possible subsequent use.

The information contained in the database 12 serves to determine the rates of reading of the mail items of referenced customers in order to check that the clauses established in the discount contracts are complied with. Normally, this operation is performed independently of the mail handling, e.g. on a computer on the basis of data 12 stored in the memories of a plurality of sorting machines, which databases 12 can contain data identifying the sorting machine and the unstacker through which the runs have passed. The referenced customers can be put into correspondence with the detected runs directly if said runs have a model initially recorded in the second portion of the catalogue. Otherwise, the sender can be determined by an operator on the basis of the digital images of the mail items of the runs, as stored in the database 12.

In the above-described method of the invention, the concept of inter-mail-item context is introduced, and the basis for the detection of runs is a module that can compare two images and can determine whether or not they match. The comparison is based above on closely related components as image attributes suitable for distinguishing a group of uniform mail items for detecting runs. But the invention covers other types of image attribute such as the positioning and the size of the various blocks in the image, the density of black in the image, and the table of symbols resulting from OCR recognition.
It is also possible to extract the attributes of a gray scale image, or to use the densities of colors on a color digital image to constitute said image attributes suitable for distinguishing runs.
Figure 7 shows another implementation of the method of the invention for automatically detecting runs, in which implementation the image attributes suitable for distinguishing runs are attributes extracted from a gray scale image. The automatic processing process consisting in forming a digital image 1 of the current mail item, in binarizing said image, at 2, and in deriving closely related components, at 3, from the binarized image, which components are used for automatic address recognition 4, remains unchanged. In step 60, the digital image formed at 1 is processed so as to extract the gray scale image attributes and said gray scale image attributes are used for run detection at 61. Run detection 61 on the basis of the gray scale image attributes is performed like run detection 5 on the basis of the closely related components in parallel with OCR address recognition so that the OCR processing process loops back at L for each subsequent mail item with the result of the run detection 61.
Figure 8 shows in detail the extraction of gray scale image attributes of the current mail item that is performed in step 60. The digital image 65 formed at 1 is a gray scale image of the current mail item. The image 65 of the current mail item is firstly sub-sampled in step 66 into a low-resolution gray scale image 67 of the current mail item. Global gray scale attributes 68 are extracted from the low-resolution image 67. The gray scale attributes 68 are more particularly the height and the width of the mail item, the minimum, the maximum, and the mean shade of gray of the pixels, the variance, and the entropy of the image. Various grids are then applied to the low-resolution image 67 in step 69 so as to obtain gridded images, e.g. a first image 70 that has a 4 x 4 grid and a second image 71 that has a 5 x 4 grid in Figure 8.
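The sub-sampling, global attribute extraction, and gridding steps of Figure 8 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the image is assumed to be a list of rows of integer gray levels, sub-sampling is assumed to keep every `factor`-th pixel, height and width are taken in pixels of the image passed in, and the function names `subsample`, `gray_stats`, `global_attributes`, and `grid_blocks` are hypothetical.

```python
import math


def subsample(image, factor):
    """Keep every `factor`-th pixel in both directions (assumed scheme)."""
    return [row[::factor] for row in image[::factor]]


def gray_stats(pixels):
    """Min, max, mean, variance and Shannon entropy (in bits) of gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    return {
        "min": min(pixels),
        "max": max(pixels),
        "mean": mean,
        "variance": variance,
        "entropy": entropy,
    }


def global_attributes(image):
    """Global attributes: image height and width plus whole-image statistics."""
    attrs = gray_stats([p for row in image for p in row])
    attrs["height"] = len(image)
    attrs["width"] = len(image[0])
    return attrs


def grid_blocks(image, rows, cols):
    """Split the image into rows x cols blocks, returned row by row.

    Each block is a flat list of its pixels, ready for gray_stats().
    """
    h, w = len(image), len(image[0])
    blocks = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            blocks.append([image[y][x] for y in range(y0, y1) for x in range(x0, x1)])
    return blocks
```

Applying `gray_stats` to each block returned by `grid_blocks` would give per-block statistics for one grid; several grids (e.g. 4 x 4 and 5 x 4) are obtained by calling `grid_blocks` with different arguments.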
Local gray scale attributes 72 are extracted from the various gridded images 70, 71. The local gray scale attributes are, in particular, the minimum, the maximum, and the mean shade of gray of the pixels, and the variance and the entropy computed for each block 73 of the grid of each gridded image 70, 71. The global attributes 68 with the local gray scale attributes 72 of the current mail item form the gray scale image attributes 74 that serve for run detection 61.
The use of gray scale attributes 74 is substantially identical to the use of closely related components. The set of closely related components 32 of a model 34 of the catalogue 7 is then replaced with gray scale attributes 74. The comparison between the gray scale attributes of a model and the gray scale attributes of the current mail item takes place as follows. The global attributes of the current mail item and of the model are compared, and thresholding is used to determine whether the mail item matches the model. In the event that it matches the model, the local attributes are compared, correlation coefficients between the local attributes of all of the blocks corresponding in pairs are computed, and are averaged to give the mean and median correlation coefficients of each grid, and, by averaging the correlation coefficients of the various grids, a single correlation coefficient is obtained for correlation between current mail item and model, which single coefficient, by thresholding, indicates whether the model and the current mail item are associated. Enriching a model to make it more robust takes place by averaging the gray scale attributes of the model with the gray scale attributes of the current mail item.
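The two-stage comparison and the model enrichment described above can be sketched as follows. Several points are assumptions, since the description does not fix them: each block is represented by a list of its five local attributes, the correlation is taken as Pearson's coefficient, the global thresholding uses a per-attribute absolute tolerance, only the per-grid mean (not the median) is carried into the final average, and enrichment is an equal-weight average. All names (`pearson`, `globals_match`, `grid_correlation`, `matches_model`, `enrich`) and the default threshold are hypothetical.

```python
from statistics import mean, median


def pearson(a, b):
    """Pearson correlation of two equal-length attribute vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        # Correlation is undefined for a constant vector; as an assumption,
        # identical vectors count as a perfect match and others as no match.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / (va * vb) ** 0.5


def globals_match(item, model, tolerances):
    """First stage: every global attribute must lie within its tolerance."""
    return all(abs(item[k] - model[k]) <= tol for k, tol in tolerances.items())


def grid_correlation(item_blocks, model_blocks):
    """Mean and median of the block-by-block correlations for one grid."""
    coeffs = [pearson(a, b) for a, b in zip(item_blocks, model_blocks)]
    return mean(coeffs), median(coeffs)


def matches_model(item, model, tolerances, min_correlation=0.9):
    """Second stage runs only if the global attributes already match."""
    if not globals_match(item["global"], model["global"], tolerances):
        return False
    grid_means = [
        grid_correlation(item["grids"][g], model["grids"][g])[0]
        for g in model["grids"]
    ]
    return mean(grid_means) >= min_correlation


def enrich(model, item):
    """Return a new model whose attributes average the model and the item."""
    return {
        "global": {k: (v + item["global"][k]) / 2 for k, v in model["global"].items()},
        "grids": {
            g: [
                [(x + y) / 2 for x, y in zip(mb, ib)]
                for mb, ib in zip(blocks, item["grids"][g])
            ]
            for g, blocks in model["grids"].items()
        },
    }
```

Checking the inexpensive global attributes first means the block-by-block correlations are only computed for models that are already plausible candidates.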

Claims (7)

  2. A method according to claim 1, in which said run attributes generated from the current item image are closely related components extracted from this image and in which closely related components corresponding to the current item are grouped by number of pixels to keep a limited predefined number of discriminating closely related components having highest number of pixels before comparing discriminating closely related components with closely related components corresponding to particular run models.
  3. A method according to claim 1, in which for generating said run attributes from the current item image, the method comprises the step of transforming said image to a low-resolution gray scale image for extracting global attributes and local attributes.
  4. The method of claim 3, wherein the global attributes are the minimum, the maximum, and the mean shade of gray of the pixels in the global image.
  5. The method of claim 3 or 4, wherein the local attributes are the minimum, the maximum, and the mean shade of gray of the pixels from the various gridded images.
  6. A method according to any one of claims 1 to 5, in which the adjustment parameters comprise data indicating OCR operation for recognition of "handwritten" or "typed" text.
  7. A method according to any one of claims 1 to 5, in which the adjustment parameters comprise data indicating at least one of the position of the address block on the item, an orientation of the address block on the item and a character font used for the address on the item.
  8. A method according to any one of the preceding claims, in which if a current mail item is identified as belonging to a run, the digital image of the current mail item is recorded in a database with the results of the OCR address recognition operation for the purpose of checking the rate of reading for the run.
  9. A method of handling mail items substantially as hereinbefore described with reference to any one of the embodiments shown in Figures 2 to 8.
AU2005203141A 2004-07-20 2005-07-19 A method of detecting streams of postal items Ceased AU2005203141B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0451591A FR2873469B1 (en) 2004-07-20 2004-07-20 METHOD FOR DETECTION OF SPINNING.
FR0451591 2004-07-20

Publications (2)

Publication Number Publication Date
AU2005203141A1 (en) 2006-02-09
AU2005203141B2 (en) 2010-08-12

Family

ID=34948634

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2005203141A Ceased AU2005203141B2 (en) 2004-07-20 2005-07-19 A method of detecting streams of postal items

Country Status (8)

Country Link
EP (1) EP1622065B1 (en)
AT (1) ATE432508T1 (en)
AU (1) AU2005203141B2 (en)
DE (1) DE602005014588D1 (en)
ES (1) ES2327853T3 (en)
FR (1) FR2873469B1 (en)
NO (1) NO335868B1 (en)
PT (1) PT1622065E (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE504363T1 (en) 2007-08-13 2011-04-15 Siemens Ag METHOD AND DEVICE FOR TRANSPORTING BULK SHIPMENTS
DE102008026088A1 (en) 2008-05-30 2009-12-03 Siemens Aktiengesellschaft Method for transporting mass of objects, particularly mails, involves transmitting computer operated description of quantity from data processing system of dispatcher to data processing system connected with sorting system
DE102007038186B4 (en) 2007-08-13 2009-05-14 Siemens Ag Method and device for transporting bulk mail
DE102009036626A1 (en) 2009-08-07 2011-02-10 Siemens Aktiengesellschaft Method and device for transporting objects to image-pattern-dependent destination points
FR2967277B1 (en) * 2010-11-05 2013-05-31 Solystic METHOD FOR PROCESSING WIRELESS COURIER COMPRISING AN AUTOMATIC LOCATION OF AN ADDRESS BLOCK USING MATRIX STORES
FR2984566B1 (en) * 2011-12-15 2015-10-16 Solystic METHOD FOR PROCESSING POSTAL SHIPMENTS WITH GENERATION OF DIGITAL MODELS OF SPEECHES ON INTERACTIVE TERMINAL

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19836767C1 (en) * 1998-08-13 1999-11-18 Siemens Ag Processing of items to be returned to sender
WO2000048119A1 (en) * 1999-02-12 2000-08-17 Siemens Dematic Ag Method for reading document entries and addresses

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19644163A1 (en) * 1996-10-24 1998-05-07 Siemens Ag Method and device for online processing of mail items to be forwarded
DE10150464A1 (en) * 2001-10-16 2003-04-30 Deutsche Post Ag Method and device for processing mail items

Also Published As

Publication number Publication date
ATE432508T1 (en) 2009-06-15
FR2873469B1 (en) 2007-08-31
FR2873469A1 (en) 2006-01-27
EP1622065B1 (en) 2009-05-27
NO20053471D0 (en) 2005-07-15
DE602005014588D1 (en) 2009-07-09
PT1622065E (en) 2009-08-10
NO20053471L (en) 2006-01-23
ES2327853T3 (en) 2009-11-04
EP1622065A1 (en) 2006-02-01
AU2005203141A1 (en) 2006-02-09
NO335868B1 (en) 2015-03-09

Similar Documents

Publication Publication Date Title
AU2005203141B2 (en) A method of detecting streams of postal items
JP2728235B2 (en) Image quality analysis method
US9350552B2 (en) Document fingerprinting
KR100324847B1 (en) Address reader and mails separater, and character string recognition method
US4481665A (en) Character segmentation method
JP3888812B2 (en) Fact data integration method and apparatus
US20060253406A1 (en) Method for sorting postal items in a plurality of sorting passes
US20070065011A1 (en) Method and system for collecting data from a plurality of machine readable documents
EP0113410A2 (en) Image processors
JP3485020B2 (en) Character recognition method and apparatus, and storage medium
JP2006509271A (en) Mail identification tag with image signature and associated mail handler
JP4661921B2 (en) Document processing apparatus and program
US5038381A (en) Image/text filtering system and method
US10118202B2 (en) Method of sorting postal articles into a sorting frame with the sorted articles being counted automatically
KR19990072627A (en) Address recognizing method and mail processing apparatus
CN101593278B (en) Method and system for distinguishing language of document image
CN116541576A (en) File data management labeling method and system based on big data application
KR100655916B1 (en) Document image processing and verification system for digitalizing a large volume of data and method thereof
US8046308B2 (en) Method of processing postal items with account being taken of extra expense due to wrong delivery
CN113269101A (en) Bill identification method, device and equipment
US6373982B1 (en) Process and equipment for recognition of a pattern on an item presented
JP2001009381A (en) Information processing postal sorting system
CA2620180A1 (en) Method for retrieving text blocks in documents
JP3660405B2 (en) Sorting machine, address recognition device and address recognition method
CN115731555A (en) Document classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)
MK14 Patent ceased section 143(a) (annual fees not paid) or expired