CA2967584C - Object identification and authentication - Google Patents
- Publication number
- CA2967584C CA2967584A
- Authority
- CA
- Canada
- Prior art keywords
- image
- item
- digital image
- pattern
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07D—HANDLING OF COINS OR VALUABLE PAPERS, e.g. TESTING, SORTING BY DENOMINATIONS, COUNTING, DISPENSING, CHANGING OR DEPOSITING
- G07D7/00—Testing specially adapted to determine the identity or genuineness of valuable papers or for segregating those which are unacceptable, e.g. banknotes that are alien to a currency
- G07D7/06—Testing specially adapted to determine the identity or genuineness of valuable papers or for segregating those which are unacceptable, e.g. banknotes that are alien to a currency using wave or particle radiation
- G07D7/12—Visible light, infrared or ultraviolet radiation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/80—Recognising image objects characterised by unique random patterns
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/42—Document-oriented image-based pattern recognition based on the type of document
- G06V30/424—Postal images, e.g. labels or addresses on parcels or postal envelopes
Abstract
A method, system, medium, or apparatus for recognizing a structural feature of an object, selecting a region of the object based on the structural feature, and capturing an image of the selected region, wherein the image has sufficient resolution to show at least one fingerprint feature. The image data associated with the fingerprint feature is processed to generate a first feature vector, and a difference value between the first feature vector and a second feature vector associated with an identifier identifying the object is determined in order to calculate a match correlation between the first feature vector and the second feature vector.
Description
OBJECT IDENTIFICATION AND AUTHENTICATION
Technical Field [0001] This invention pertains to methods, systems, medium, and apparatus for identifying, tracking, tracing, inventorying, authenticating, verifying, sorting, delivering, or classifying objects and/or articles associated with objects, such as weapons, pharmaceuticals, drugs, animals, gems, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, mail, art work, photographs, manufactured parts, labels, etc.
Summary
[0002] The following summary is intended to provide a basic understanding of some aspects of the various examples described in further detail herein. This summary is not intended to identify key/critical elements of the examples or to delineate the scope of the invention. Its sole purpose is to present some concepts of the examples in a simplified form as a prelude to the more detailed description that is presented later.
[0003] A method is disclosed herein, including recognizing a structural feature of an item of currency, selecting a region of the item based on the structural feature, and capturing an image of the selected region, wherein the image has sufficient resolution to show at least one fingerprint feature. The image data associated with the fingerprint feature is processed to generate a first feature vector, and a difference value between the first feature vector and a second feature vector associated with an identifier identifying the item is determined in order to calculate a match correlation between the first feature vector and the second feature vector.
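By way of illustration only, a minimal sketch of the comparison step follows; the disclosure does not fix the feature-vector encoding, the difference metric, or the correlation measure, so the Euclidean difference, cosine correlation, and MATCH_THRESHOLD below are assumptions, not the claimed implementation.

```python
import numpy as np

MATCH_THRESHOLD = 0.95  # hypothetical acceptance level, not from the disclosure

def difference_value(first, second):
    """Difference value between two feature vectors (Euclidean distance)."""
    return float(np.linalg.norm(first - second))

def match_correlation(first, second):
    """Normalized (cosine) correlation in [-1, 1]; 1.0 means identical direction."""
    return float(np.dot(first, second) /
                 (np.linalg.norm(first) * np.linalg.norm(second)))

# First feature vector from the captured image; second from the stored record.
captured = np.array([24.0, 443.0, 24.0, 285.0, 69286.0])
stored = np.array([24.0, 443.0, 24.0, 286.0, 69292.0])

diff = difference_value(captured, stored)
corr = match_correlation(captured, stored)
print(f"difference={diff:.2f} correlation={corr:.6f} match={corr >= MATCH_THRESHOLD}")
```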
[0004] A further method is disclosed herein, including capturing a first digital image of a first selected region of a first item of currency, wherein the digital image has sufficient resolution to show an area in the first selected region comprising a first pattern of features, and processing the first digital image to generate a first feature vector comprising data corresponding to the first pattern of features of the first item.
The first feature vector and the first digital image are stored in a database in association with a first item identifier.
[0005] Additionally, a method is disclosed herein, including capturing a digital image of a selected region on an item of currency, wherein the digital image has sufficient resolution to allow recognition of characters of a serial number of the item if the serial number is within the selected region, and also of sufficient resolution to show elements of a grain surface of inter-character and intra-character regions of the serial number. A serial number of the item is recognized from the digital image, and the digital image is processed to locate at least one fingerprint feature. The data identifying the fingerprint feature is stored in a first feature vector, and the first feature vector and the digital image are stored in a database in association with the serial number.
[0006] A further method is disclosed herein, including capturing a digital image of a region including an identifiable structure of an item of currency, and extracting data representing at least one fingerprint feature from the digital image. The fingerprint feature data is stored in a feature vector in memory, and the digital image and the feature vector are stored in association with an identifier of the item.
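A minimal sketch of the storage step in [0005] and [0006], assuming a relational store: the disclosure only requires that the feature vector and digital image be kept in association with an item identifier, so the table schema, column names, and sample serial number below are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE fingerprints (
           item_id        TEXT PRIMARY KEY,  -- e.g., a recognized serial number
           feature_vector TEXT NOT NULL,     -- JSON-encoded numeric vector
           image          BLOB NOT NULL      -- raw bytes of the captured image
       )"""
)

def enroll(item_id, vector, image_bytes):
    """Store the feature vector and image in association with the identifier."""
    conn.execute("INSERT INTO fingerprints VALUES (?, ?, ?)",
                 (item_id, json.dumps(vector), image_bytes))

def lookup(item_id):
    """Retrieve the stored feature vector and image for the identifier."""
    row = conn.execute(
        "SELECT feature_vector, image FROM fingerprints WHERE item_id = ?",
        (item_id,)).fetchone()
    return (json.loads(row[0]), row[1]) if row else None

enroll("A12345678B", [24.0, 443.0, 285.0], b"<image bytes>")  # sample values
print(lookup("A12345678B")[0])  # [24.0, 443.0, 285.0]
```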
[0007] A system is disclosed herein, including a camera configured to capture an image of an object, and fingerprinting software component is configured to process the captured image of the object to create a digital fingerprint of the object based on indicia appearing in the captured image. A software interface is configured to store the digital fingerprint in a database comprising a plurality of fingerprints associated with a plurality of objects, wherein the digital fingerprint identifies the object as being unique among the plurality of objects, and wherein the database relates the digital fingerprint to a processing step of the object.
[0008] A method is disclosed herein, including capturing a first image of an object, and generating a digital fingerprint responsive to capturing the first image.
The digital fingerprint is stored in a database, wherein the database comprises a plurality of fingerprints associated with a plurality of objects. The method further includes determining a processing step associated with the object, relating the processing step to the digital fingerprint, and subsequently obtaining a second image of the object so that the second image may be compared with the plurality of fingerprints to identify the digital fingerprint. In response to identifying the digital fingerprint, the object may be processed according to the stored processing step.
[0009] An apparatus is disclosed herein, including means for accessing a database comprising a plurality of fingerprints associated with a plurality of objects, wherein the plurality of fingerprints comprises a digital fingerprint generated from a first image of the object, and a means for obtaining a second image of the object.
The apparatus may further comprise a means for comparing the second image with the plurality of fingerprints to identify the digital fingerprint, wherein a processing step is associated with the digital fingerprint. In response to identifying the digital fingerprint, the object may be processed according to the processing step.
[0010] A method is disclosed herein, including obtaining, by a processing device, image data of an object, comparing, by the processing device, the image data with a pattern stored in memory, wherein the pattern identifies spatial information of corresponding pattern elements, and determining, by the processing device, a confidence level of the comparison of the image data according to a success in matching the image data with the pattern. Additionally, the method may include comparing, by the processing device, the confidence level with a confidence threshold associated with the pattern, wherein a plurality of patterns stored in memory are associated with corresponding confidence thresholds, and identifying, by the processing device, the pattern from the plurality of patterns based, at least in part, on comparing the confidence level with the confidence threshold. The object may be processed according to the identified pattern.
[0011] A system is disclosed herein, including an imaging device configured to generate image data responsive to a portion of an object located within view of the imaging device, a memory device configured to store a plurality of patterns, and a processing device configured to perform operations. The operations include comparing the image data with a pattern stored in the memory device, wherein the pattern identifies spatial, feature, or other distinguishing information of corresponding pattern elements, determining a confidence level of the comparison of the image data according to a success in matching the image data with the pattern elements, and comparing the confidence level with a confidence threshold associated with the pattern, wherein the plurality of patterns stored in the memory device are associated with corresponding confidence thresholds. The pattern may be identified from the plurality of patterns stored in the memory device based, at least in part, on comparing the confidence level with the confidence threshold. The object may be processed according to the identified pattern.
[0012] A non-transitory memory device is disclosed herein, having stored thereon computer-executable instructions that, in response to execution by a processing device, cause the processing device to perform operations. The operations include obtaining image data of an object, comparing the image data with a stored pattern, wherein the stored pattern identifies spatial, feature, or other distinguishing information of corresponding pattern elements, and determining a confidence level of the comparison of the image data according to a success in matching the image data with the pattern elements. The operations further include comparing the confidence level with a confidence threshold associated with the pattern, wherein a plurality of stored patterns are associated with corresponding confidence thresholds, and identifying the stored pattern from the plurality of stored patterns based, at least in part, on comparing the confidence level with the confidence threshold. The object may be processed according to the identified stored pattern.
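The confidence-threshold selection of [0010] through [0012] might be sketched as follows. The scoring rule (the fraction of pattern elements found near their expected positions) and the numeric thresholds are assumptions; the disclosure leaves the matching metric open.

```python
from dataclasses import dataclass

@dataclass
class PatternElement:
    label: str
    x: int          # expected horizontal position, in pixels
    y: int          # expected vertical position, in pixels
    tolerance: int  # allowed positional slack, in pixels

@dataclass
class Pattern:
    name: str
    elements: list
    confidence_threshold: float  # per-pattern threshold, as in [0010]

def confidence(observed, pattern):
    """Fraction of pattern elements found within tolerance of their positions."""
    hits = 0
    for el in pattern.elements:
        pos = observed.get(el.label)
        if pos and abs(pos[0] - el.x) <= el.tolerance \
               and abs(pos[1] - el.y) <= el.tolerance:
            hits += 1
    return hits / len(pattern.elements)

def identify(observed, patterns):
    """Return the best-scoring pattern that clears its own confidence threshold."""
    scored = [(confidence(observed, p), p) for p in patterns]
    passing = [(c, p) for c, p in scored if c >= p.confidence_threshold]
    return max(passing, key=lambda cp: cp[0])[1] if passing else None

patterns = [Pattern("US domestic address",
                    [PatternElement("CITY", 1500, 900, 60),
                     PatternElement("STATE", 1850, 900, 60),
                     PatternElement("ZIP-5", 1955, 939, 60)],
                    confidence_threshold=0.66)]
observed = {"ZIP-5": (1955, 939), "STATE": (1840, 910)}  # from image data
print(identify(observed, patterns).name)  # matches at confidence 2/3
```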
Brief Description of the Drawings
[0013] FIG. 1 illustrates an example system configured to process objects.
[0014] FIG. 2 illustrates an example process of mail processing.
[0015] FIG. 3 illustrates one or more object images and associated image dimensions.
[0016] FIG. 4 illustrates a total number of pixels associated with one or more object images.
[0017] FIG. 5 illustrates a number of paragraphs associated with one or more object images.
[0018] FIG. 6 illustrates a number of lines associated with one or more object images.
[0019] FIG. 7 illustrates line dimensions associated with one or more object images.
[0020] FIG. 8 illustrates a comparison of text associated with one or more object images.
[0021] FIG. 9 illustrates an example image of an object and a coordinate system for providing, determining, identifying, and/or generating a characterization of one or more objects.
[0022] FIG. 10 illustrates an example process for comparing and/or distinguishing a first image and a second image associated with one or more objects.
[0023] FIG. 11 illustrates the sequence used when the system parses an object.
[0024] FIG. 12 illustrates a system including both a front-end directory data compiler and back-end runtime pattern identification and categorization data files.
[0025] FIG. 13 graphically represents an example pattern comprising an identification block and field descriptors.
[0026] FIG. 14 illustrates a process of identifying, imaging, matching, verifying, classifying, and delivering the results of an object match.
[0027] FIG. 15 illustrates an example process of automatically generating patterns from a number of sample objects.
[0028] FIG. 16 depicts an example of a system for object identification and inventory management.
[0029] FIG. 17 depicts an example of an object for identification and inventory management.
[0030] FIG. 18 depicts an example of an object for identification and inventory management.
[0031] FIG. 19a depicts an example of a high resolution image captured for object identification and inventory management.
[0032] FIG. 19b depicts an example of a high resolution image captured for object identification and inventory management.
[0033] FIG. 19c depicts an example of a high resolution image captured for object identification and inventory management.
[0034] FIG. 19d depicts an example of a high resolution image captured for object identification and inventory management.
[0035] FIG. 19e depicts an example of a feature vector including numerical values representing fingerprint features associated with a high resolution image for object identification and inventory management.
[0036] FIG. 20a depicts an example of a high resolution image captured for object identification and inventory management.
[0037] FIG. 20b depicts an example of a high resolution image captured for object identification and inventory management.
[0038] FIG. 20c depicts an example of a high resolution image captured for object identification and inventory management.
[0039] FIG. 20d depicts an example of a high resolution image captured for object identification and inventory management.
[0040] FIG. 20e depicts an example of a feature vector including numerical values representing fingerprint features associated with a high resolution image for object identification and inventory management.
[0041] FIG. 21 depicts a table showing differences between two feature vectors.
[0042] FIG. 22 depicts an example of a process for object identification and inventory management.
[0043] FIG. 23 depicts an example of a process for object identification and inventory management.
Background
[0044] An object may be tracked and/or inventoried by using a unique marking system. Objects may be physically marked with a unique serial number. The serial number may be engraved on the object and/or may be printed or engraved on a tag and affixed to the object by any of a variety of means. The serial number may be obscured purposely or inadvertently by physical damage and/or by loss of the tag.
For the purposes of authenticating, tracking, and inventorying an object, an obscured or lost serial number may be ineffective.
[0045] Marking certain objects would damage or destroy the value of the object.
Art work, gemstones, and collector-grade coins are examples. Identifying or certifying information may be obtained concerning such objects, but if that information is attached or otherwise physically associated with the object, it is subject to being lost or altered. If identifying or certifying information is stored separately from the object, the entire identification/certification process must be performed again if the object is lost and later recovered or its chain of control is otherwise compromised.
Detailed Description
[0046] FIG. 1 illustrates an example system 100 configured to process objects. In some examples, the objects may comprise envelopes or mail pieces. Objects to be analyzed, identified, sorted, delivered, or classified may be fed into the system 100 at the object infeed 140 before being processed and ultimately removed at the exit 150 or as sortation completes. The object may be processed and/or operated on by any or all of a control 136, a reader 152, a camera 158, a printer/sprayer 154, and/or a labeler 156.
[0047] A system 125 is illustrated as including a parser 121, patterns 122, address records 123, data files and tables 124, and one or more logs 126. An image processing system 135 is illustrated as including a database 131, image capture block 132, and an Optical Character Recognition (OCR) system 133 that may include a Block-Field-Line Locator 134. An interface system 145 is illustrated as including a visual display 142 and an operator console 144. A network 120 may operatively connect the system 125, image processing system 135, interface system 145, or any combination thereof. A sortation device may be used to physically move, deliver, or sort the objects through the system 100.
[0048] In a mail system, the physical object is a mail piece. It may, however, be such things as a part name and part description as found in the manufacture or maintenance of an airplane, a business card, or almost any object that contains information on it that, when properly interpreted, tells the system what to do with the object based on the definition and use of the patterns discussed below.
[0049] By way of example, a mail piece received at a postal distribution center may be scanned for identification to finalize a destination of the mail piece.
When a mail piece cannot be finalized (e.g., contains insufficient readable information to allow its full ZIP code to be sprayed on the front), a fluorescent bar code may be sprayed on the back. The bar code may be referred to as an ID tag. The ID tag may identify that particular mail piece so that later, after the delivery address has been successfully coded, the coding results may be reassociated with that mail piece and the delivery sequence ID tag may be sprayed on it.
[0050] The mail piece may have a fluorescent bar code or ID tag sprayed on the back. While the ID tag does not need to be sprayed, this is typical in the industry and can result in a considerable expense associated with the ink and/or labels. The ID tag may be sprayed on the mail piece not just when the mail piece cannot be finalized, but also for general tracking purposes. The ID tag may be used to associate later processing results with that particular mail piece.
[0051] The contents of the ID tag may be associated with an image of the front of the mail piece in a database. A mail piece that was not successfully finalized may be sorted to a reject bin. The image (and associated ID tag) may be transmitted for non-real time processing of some sort, either computer or manual. Assuming the image can be finalized after the additional processing, the ID tag may be associated in the database with the finalized ZIP code that may then be sprayed on the mail piece.
[0052] Sometime later, the mail piece may be rescanned, with the ID tag read.
The destination ZIP code may be retrieved from the database and sprayed on the mail piece, which may then enter the automatic processing procedures. The mail piece may be routed to its destination by the automatic processing procedures using the bar code.
[0053] The system 100 may be configured to process a set of object images.
Each image may be parsed into regions of interest and/or components, and a particular component may be associated with, and/or matched to, one or more lines of text and/or input data fields. A customer identification may be associated with an address block description or pattern 122, address records 123, and/or other data files and tables 124. The OCR system 133 may use the Block-Field-Line Locator to identify a region of interest or address block and subsequently the individual lines within that address block data. This line data may be passed on to the system 125, which may then use the pattern 122, data files and tables 124, address records 123, and/or parser 121 to identify individual address components and addresses in each image.
[0054] The system 100 may take the parsed image data and deduce the allowed patterns in the addresses for that area and/or category. For example, it can be determined that the bottom-most line (e.g., as detected by a parser) has the rightward-most entity labeled "ZIP-5", the one to the left of that labeled "STATE" and the remaining, leftward-most entity labeled "CITY". It can therefore be deduced that CITY -> STATE -> ZIP on the bottom-most line is an allowed pattern that may be matched, as sketched below. The system 100 may extract the patterns automatically from a labeled and/or described set of images, whether the patterns are simple or complex.
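A sketch of that deduction, assuming the parser emits labeled entities with horizontal positions; the entity representation is illustrative only.

```python
# Each entity from the parsed bottom-most line: (label, x-position of left edge).
bottom_line = [("ZIP-5", 720), ("STATE", 610), ("CITY", 120)]

def deduce_pattern(entities):
    """Order entities left to right and emit the allowed pattern."""
    ordered = sorted(entities, key=lambda e: e[1])  # leftmost first
    return " -> ".join(label for label, _ in ordered)

print(deduce_pattern(bottom_line))  # CITY -> STATE -> ZIP-5
```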
[0055] A physical object may be provided with enough information on it to allow the system 100 to determine and perform a desired function. For a mail system this may be an envelope with some attempt at providing, or approximation to, an address on the envelope. For a manufacturing plant or parts depot, this may be a label or serial number which identifies a part or otherwise associates information with the part. For a jeweler, art dealer, appraiser, or other type of evaluator, the object information may comprise a unique diffraction pattern of a gemstone or a surface crystal fracture caused when a coin is struck.
[0056] Scratches and other indications of usage of an object that may occur during manufacture, assembly, handling, environmental degradation, etc. may be used to uniquely identify the object. Other applications may include forensic and/or biological analysis of tissue samples, blood samples, or other samples that may be used to uniquely identify, distinguish, and/or provide a match with a particular person of interest. For example, a blood stain associated with one person may comprise a different level and/or pattern of blood proteins as compared to a blood stain associated with another person.
[0057] The system 100 may be configured to extract the information from the object (object information) and then categorize the extracted information (categorizing information), for example, as belonging to a predetermined area and/or category. For a mail piece, the object information and/or categorizing information may be determined by an address block locator and/or an OCR system.
[0058] A defined pattern or set of patterns associated with the object information and/or the categorizing information may exist a priori (e.g. a Universal Postal Union-defined address format for each country), or it may be defined for a specific application by a vendor or by a customer. Part of the defined pattern may include information on how to apply the pattern either alone or in a defined and prioritized order with other defined patterns, and what generic and specific information to return.
[0059] The database 131 may contain one or more lists of classification elements, individual applicable element values, and/or a system output when a desired pattern has been matched. For a mail application this database 131 may contain, for example, a list of states, cities within each state, neighborhoods within each city, and/or carrier routes within each neighborhood. The output may be the routing ZIP
code. The database hierarchy may correspond to the classifying elements to be found on the object and to the patterns created for classifying the object. In some examples, one or more digital fingerprints may be stored in the database 131, together with a plurality of document identifiers, and the digital fingerprints may be associated with unique alphanumeric identifiers.
[0060] The parser 121 may determine which lines and/or input data fields on the object correspond to which elements in the defined patterns, and to which elements and element values in the database. The parser 121 may perform fuzzy matching on the input data fields and interpolate missing elements where possible.
[0061] The relationship between the defined pattern and the elements in the database may be viewed as similar to that between a defined class in, for example, C++ and the many possible instantiations of that class. The pattern or patterns may show the overall structure and interrelationships of object elements, while the database may provide specific examples, element values of those patterns. For example, the pattern may include "city name" and the database may include "New Orleans", "Jackson", or "Sioux Falls" which are examples of city names that might be found on an envelope. The element values in the database are usually meant to encompass all or nearly all the allowable element values. The patterns may serve, among other things, as part of a feature vector of the mail piece (or object, generally) for its later identification. Feature vectors are described in greater detail further in this specification.
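The class/instantiation analogy might be rendered as follows; the slot names and element values echo the examples above, while the format rule for ZIP-5 is an illustrative assumption.

```python
pattern = ["city name", "state", "ZIP-5"]  # structure, like a class definition

database = {  # element values, like instantiations of that class
    "city name": {"New Orleans", "Jackson", "Sioux Falls"},
    "state": {"LA", "MS", "SD"},
    "ZIP-5": None,  # validated by format rather than by enumeration
}

def conforms(fields):
    """Check one parsed line against the pattern and the element values."""
    if len(fields) != len(pattern):
        return False
    for value, slot in zip(fields, pattern):
        allowed = database[slot]
        if allowed is None:
            if not (len(value) == 5 and value.isdigit()):
                return False
        elif value not in allowed:
            return False
    return True

print(conforms(["Sioux Falls", "SD", "57104"]))  # True
print(conforms(["Sioux Falls", "SD", "5710"]))   # False
```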
[0062] FIG. 2 illustrates an example process 200 of mail processing system. At operation 202, a mail piece may be run on a mail sortation system, which in some examples may comprise a mail service Input Sub-System (ISS). At operation 204, an image of the mail piece may be captured, for example, by one or more cameras and/or optical devices. At operation 206, an identification (ID) tag may be printed on the mail piece.
[0063] At operation 208, the document processing system may determine if the mail piece image comprises an address. If the mail piece image does not comprise an address, or if the address cannot be identified from the image, the mail piece may be rejected for manual sorting at operation 236 prior to delivery at operation 228. In response to analyzing the image for a legible address at operation 208, the document processing system may extract the address from the image at operation 210. At operation 212, the document processing system additionally may determine if the address is resolvable on-line, for example, during processing of a batch of mail pieces.
[0064] In applications where read rate is low and/or where near perfection is required, a process known as local or remote video encoding may be utilized.
The video encoding process may be described as follows. A unique ID may be created for the mail piece at the initial failed recognition attempt. An image of the mail piece may be captured, and the ID may be associated with the image. The ID may be sprayed, for example, with fluorescent ink on the back of the mail piece (the ID tag).
[0065] If the mail piece does not comprise a resolvable address, the mail piece may be run on a recognition system at operation 214, which in some examples may be run offline and/or comprise a backend Remote Character Recognition (RCR) processing system. In a Multi-Line Optical Character Recognition (MLOCR) processing system, an image of the mail piece may be captured and sent to the Optical Character Recognition (OCR) engine where the destination address may be converted to text. At operation 216, the recognition system may determine if the mail piece image comprises an address that is resolvable. If the mail piece image does not comprise a resolvable address, the mail piece may be collected at operation 230 for additional processing. For example, the image of the mail piece may be sent to a Remote Encoding Center (REC) at operation 232 in a further attempt to resolve the address at operation 234. If the address still cannot be resolved, the mail piece may be rejected for manual sorting at operation 236.
[0066] In one example, the document processing system may determine if the address is resolvable on-line, for example, during processing of a batch of mail pieces. The text data then may be sent to a directory to determine if the mail piece can be assigned to an 11, 9, 5, or 0 digit zip code.
[0067] In some examples, the physical mail piece may be removed from the transport system. The image of the mail piece may be placed in a queue where the address elements are entered into the system. The address elements may be compared against the directory to identify the associated ID. The physical mail piece may then be rerun on the transport in a mode where the ID is read. If the ID
is reconciled, the destination may be sprayed on the front of the mail piece.
However, the cost of maintaining two sets of capture and printer technologies may be expensive and time consuming. For example, the camera may need to be adjusted for focus. Similarly, a device for spraying the back side of the mail piece may also require maintenance, such as cleaning the ink nozzles.
[0068] In response to analyzing the image for a resolvable address at operation 216 and/or at operation 234, the document processing system may store the resolved address at operation 218. The resolved address may be associated with the ID tag of the mail piece. At operation 220, the mail piece may be run on a mail sortation system, which in some examples may comprise a mail system Output Sub-System (OSS), where the ID tag may be read. In one example, the MLOCR
processing system may then sort the mail piece based on the lookup. In response to reading the ID tag, the resolved address may be loaded from a database and/or lookup table, at operation 222. After determining that the address is resolvable at operation 212 and/or after loading the resolved address at operation 222, the barcode may be printed in the Postnet Clear Zone at operation 224. At operation 226, the mail piece may be sorted using the printed barcode for subsequent delivery at operation 228.
[0069] Each mail piece may include a variety of elements which, individually or in combination, may uniquely identify the mail piece. Among the unique elements on the mail piece are the contents of the delivery and destination addresses, the shape of the address blocks, the location on the envelope of the address blocks, the position of indicia, the characteristics of any handwriting, the type fonts used, other identification elements, or any combination thereof. Any unique characteristic or combination of characteristics of the mail piece may be used to identify the mail piece. The identification obtained from the unique elements may be used to identify or track a mail piece, or to re-identify a particular mail piece. The unique elements may be scanned or identified without the need for a second camera in the mail processing system.
[0070] FIGS. 3 to 9 illustrate example features associated with document fingerprinting. The features may be used in one or more processes for comparing an initial scanned image with a rescanned image, for example. Whereas some of the examples may assume that the first and second images are used to identify the same mail piece, in other examples, the first and second images may be used to distinguish two different mail pieces. Additionally, whereas any one example may be understood to identify unique characteristics of the mail piece, some examples may be understood to use two or more different sets of unique characteristics using any combination of FIGS. 3 to 9 to identify and/or distinguish the mail piece. The comparison of characteristics may be definitive (e.g. there is a ZIP Code reading 91445 at position x=1955, y=939) or probabilistic (e.g. a statistical comparison of a compendium of handwritten stroke shapes across the two images).
[0071] FIG. 3 illustrates one or more object images and associated image dimensions. A first image 310, which may comprise an initial scan of a mail piece, may be associated with, and/or identified by, first image dimensions 315. The first image dimensions 315 may identify the dimensions of the first image 310 and, indirectly, the dimensions of the mail piece itself. In the illustrated example, the first image of the mail piece may identify a width (W) of 2626 pixels and a height (H) of 1284 pixels.
[0072] A second image 320, which may comprise a rescanned image of the mail piece, may similarly be identified by image dimensions, such as second image dimensions 325. The second image dimensions 325 may identify a width of 2680 pixels and a height of 1420 pixels. Although the image dimensions 315, 325 associated with the first and second images, respectively, may not be identical, the system may nevertheless use this information to determine that a mail piece associated with the first and second images 310, 320 is in fact the same mail piece.
[0073] Additionally, some differences between first and second images 310, 320 may be intentional and/or expected. For example, a change to second image 320 from first image 310 may include the addition of a cancellation mark and/or the addition of an ID tag to the same mail piece. Similarly, second image 320 may include evidence of normal usage and/or processing, such as a bent corner or wrinkle that was not present when first image 310 was obtained.
[0074] The system may be configured to identify the order in which certain changes may be made to an object, such as the mail piece. For example, a cancellation mark may normally be applied to the mail piece in between obtaining first and second images 310, 320. In a first example, the presence of the cancellation mark in second image 320 and absence of the cancellation mark in first image 310 may not disqualify first and second images 310, 320 from being a match, e.g., the presence of the cancellation mark may be ignored. However, in a second example, the presence of the cancellation mark in first image 310 and absence of the cancellation mark in second image 320 may indicate first and second images 310, 320 do not match. In the second example, the system may be configured to identify a match only when both first and second images 310, 320 include the cancellation mark.
[0075] The system may provide for a tolerance or range of variation in image dimensions for the associated images 310, 320 of the mail piece, for example, to account for differences in scanning devices, rates of transport (scanning speeds), alignment and/or skewing of the mail piece, damage to the mail piece, additional markings made on the mail piece, or any combination thereof. The tolerance or allowable range of variation may be predetermined. The tolerance may vary based on the type of document being analyzed, or on the feature or features used to identify the document fingerprint. The tolerance may be applied as an allowed range, as a probability that decreases with increasing mismatch, or in other ways.
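Both tolerance styles, a definitive allowed range and a probability that decreases with increasing mismatch, can be sketched briefly; the tolerance and decay constant below are arbitrary assumptions.

```python
import math

def within_range(a, b, tolerance):
    """Definitive check: values match if they differ by no more than tolerance."""
    return abs(a - b) <= tolerance

def match_probability(a, b, scale=50.0):
    """Probabilistic check: confidence decreases with increasing mismatch."""
    return math.exp(-abs(a - b) / scale)

# Image widths from the first and second scans in FIG. 3 (2626 vs. 2680 pixels).
print(within_range(2626, 2680, tolerance=60))  # True
print(f"{match_probability(2626, 2680):.3f}")  # ~0.340
```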
[0076] FIG. 4 illustrates a total number of pixels, or a pixel count, associated with one or more object images, such as a first image 410 and a second image 420.
In some examples, the first image 410 may comprise an initial scan of a mail piece, and the second image 420 may comprise a rescanned image of the mail piece. The total number of pixels 415 associated with the first image 410 is shown as including 69,286 pixels, whereas the total number of pixels 425 associated with the second image 420 is shown as including 69,292 pixels.
[0077] The system may provide for a tolerance or range of variation in total pixel count for the associated images 410, 420 of the mail piece, while still determining that the elements associated with the first and second images 410, 420 may uniquely identify the same mail piece. In some examples, the total number of pixels 415 and/or 425 may be determined from an analysis of the destination address 430, return address 440, postage indicia 450, cancellation markings, other markings associated with the mail piece(s) and/or image(s), or any combination thereof.
The degree of match may be definitive within a range or probabilistic.
[0078] FIG. 5 illustrates a number of paragraphs associated with one or more object images, such as a first image 510 and a second image 520. In some examples, the first image 510 may comprise an initial scan of a mail piece, and the second image 520 may comprise a rescanned image of the mail piece. In the illustrated example, the first image 510 may be associated with two paragraphs, including a first paragraph 512 and a second paragraph 514. Similarly the second image 520 may be associated with two paragraphs 522, 524. In some examples, the first paragraph 512 may be associated with a return address and/or the second paragraph 514 may be associated with a destination address.
[0079] The paragraphs are not necessarily defined as lines of characters, and are not necessarily rectangular, but may be identified more generically as grouped together or concentrated pixels located or arranged in a region of the mail piece. In one example, both the image of the paragraph and the associated dimension of the paragraph (e.g., width and height) may be determined for the first and second images 510, 520.
[0080] FIG. 6 illustrates a number of lines associated with one or more object images, such as a first image 610 and a second image 620. In some examples, the first image 610 may comprise an initial scan of a mail piece, and the second image 620 may comprise a rescanned image of the mail piece. The number of lines may correspond with one or more paragraphs, such as a first paragraph 602 and a second paragraph 604. For example, the first paragraph 602 may be associated with, and/or identified as including, two lines, such as a first line 612 and a second line 614. The second paragraph 604 may be associated with, and/or identified as including, three lines, such as a first line 611, a second line 613, and a third line 615.
The second image 620 similarly may be associated with a number of paragraphs and/or lines 625.
[0081] In addition to determining the number of lines in each paragraph, the dimensions (e.g., width and height) of each line may also be determined. FIG. 7 illustrates line dimensions 730, 740 associated with one or more object images, such as a first image 710 and a second image 720, respectively. In some examples, the first image 710 may comprise an initial scan of a mail piece, and the second image 720 may comprise a rescanned image of the mail piece.
[0082] The first image 710 may be associated with a plurality of paragraphs, including a first paragraph 702 and a second paragraph 704. The first paragraph 702 may be associated with, and/or identified as including, two lines, such as a first line 712 and a second line 714. The second paragraph 704 may be associated with, and/or identified as including, three lines, such as a first line 711, a second line 713, and a third line 715.
[0083] The second image 720 may similarly be associated with a plurality of paragraphs, including a first paragraph 722 and a second paragraph 724. The first paragraph 722 of the second image 720 may comprise a first line 732 and/or a second line 734. The second paragraph 724 of the second image 720 may comprise a first line 731, a second line 733, and/or a third line 735.
[0084] The first line 712 associated with the first paragraph 702 of the first image 710 may be associated with a height of 24 pixels and a width of 443 pixels, and the second line 714 associated with the first paragraph 702 of the first image 710 may be associated with a height of 24 pixels and a width of 285 pixels. On the other hand, the first line 732 associated with the first paragraph 722 of the second image 720 may be associated with a height of 24 pixels and a width of 443 pixels, and the second line 734 associated with the first paragraph 722 of the second image 720 may be associated with a height of 24 pixels and a width of 286 pixels.
[0085] In the illustrated example, the width and height of the first lines 712, 732 may be identical, whereas the width of the second line 734 associated with the second image 720 may be one pixel (or more) greater (or less) than the width of the second line 714 associated with the first image 710.
[0086] As previously described, the system may provide for a tolerance and/or range of variation in total pixel count for the associated images of the mail piece while still determining that the elements identified for both the first and second images 710, 720 may uniquely identify the same mail piece. Similarly, the system may provide for a tolerance and/or range of variation in total pixel count for the associated line and/or lines of one or more paragraphs in an initial scanned image and a rescanned image. All such comparisons may be definitive within a range or probabilistic.
[0087] FIG. 8 illustrates a comparison of text associated with one or more object images, such as a first image 810 and a second image 820. Text 815 associated with the first image 810 may be compared with the corresponding text 825 associated with the second image 820. For example, the first image 810 may be associated with an initial scanned paragraph 830, and the second image 820 may comprise a rescanned paragraph 840.
[0088] Differences between the text 815 associated with the first image 810 and the text 825 associated with the second image 820 may result from limitations in a character recognition system and/or to differences in processing the mail piece during the initial scan and rescan operations, by way of example. A first line of text 812 associated with the first image 810 may be compared with a first line of text 822 associated with the second image 820.
[0089] In the illustrated example, the first line of text 812 reads "15400NE9OthStreetSuite300" whereas the first line of text 822 reads "15400NE9Orh5treetSuite300." The first line of text 812 matches the text found in the initial scanned paragraph 830 of the first image 810; however, the first line of text 822 incorrectly reads "90rh5treet" instead of "90thStreet" as shown in the rescanned paragraph 840.
[0090] A second line of text 814 associated with the first image 810 may be compared with a second line of text 824 associated with the second image 820.
In some examples, the corresponding text 815, 825 associated with both the first and second images 810, 820 may have resulted from an erroneous reading of paragraphs 830, 840, respectively, such as when the zip code "98052" is read as "9B052", as illustrated in the second line 814 of text 815, and as "9BD52", as illustrated in the second line 824 of text 825. In addition to comparing the text in each paragraph, the location (e.g., relative position on the mail piece) of each character, or set of characters, of the text may also be compared in determining if the second image 820 identifies the same particular mail piece associated with the first image 810.
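A probabilistic text comparison of the kind described in [0088] through [0090] could absorb such OCR misreads by using edit distance; the acceptance tolerance below is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

first = "15400NE9OthStreetSuite300"   # text read from the initial scan
second = "15400NE9Orh5treetSuite300"  # text read from the rescan
dist = edit_distance(first, second)
print(dist, dist <= 3)  # 2 True: close enough to treat as the same line
```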
[0091] FIG. 9 illustrates an example image of an object 910 and a coordinate system 900 for providing, determining, identifying, and/or generating a characterization of one or more objects. For example, an image of a mail piece may comprise a substantially white area 960. The substantially white area 960 may be distinguished from one or more substantially dark areas comprising a destination address 930, a return address 940, postage indicia 950, cancellation markings, spray, stamps, writing, stains, smudges, pictures, written and typed words, or any combination thereof. In some examples, the postage indicia 950 may comprise one or more of an image, an amount, a date, and/or a position of a stamp placed on the mail piece.
[0092] The white area 960 and/or the dark areas may be associated with, and/or identified with reference to, a coordinate system 900. The coordinate system may comprise one or more coordinates, such as a horizontal coordinate 922 and/or a vertical coordinate 924. The coordinate system 900 may be configured to identify a position, dimension, concentration, percentage, number, other aspect, or any combination thereof, of the white area 960 and/or the dark areas. For example, the destination address 930 may be associated with a first set of coordinates, the return address 940 may be associated with a second set of coordinates, and/or the postage indicia 950 may be associated with a third set of coordinates.
[0093] The characterization of the image of the mail piece 910 may provide a means for mapping out every pixel associated with the white and dark areas.
For example, the coordinate system 900 may be used to determine and/or compare the elements illustrated in FIGS. 3 to 8.
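As a sketch of how such a region-by-region characterization might be represented, the following assumes a NumPy-style grayscale image; the region names, bounding boxes, and darkness threshold are illustrative assumptions:

    import numpy as np

    # Bounding boxes (left, top, width, height) in the coordinate system 900.
    regions = {
        "destination_address": (420, 310, 560, 120),  # first set of coordinates
        "return_address": (40, 30, 300, 90),          # second set of coordinates
        "postage_indicia": (980, 20, 140, 150),       # third set of coordinates
    }

    def dark_fraction(image, box, threshold=128):
        # Fraction of pixels in the box darker than the threshold, i.e. a
        # simple measure of dark-area concentration within the region.
        left, top, width, height = box
        crop = image[top:top + height, left:left + width]
        return float((crop < threshold).mean())

    image = np.full((800, 1200), 255, dtype=np.uint8)  # a substantially white piece
    print({name: dark_fraction(image, box) for name, box in regions.items()})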
[0094] A "Document Fingerprint" may be determined for each document, such as a mail piece, based on one or more elements, such as those described with reference to FIGS. 3 to 9. The elements may be processed separately or in combination, using multiple techniques for determining that a document is unique and/or differentiated from another similar document. Comparison of elements may be definitive within a range, probabilistic, or both. The document fingerprint, or digital fingerprint, may comprise a digital or electronic record of the document based on an image of the document, based on image data associated with the document, and/or based on a virtual representation of the document. By way of further example, the document fingerprint may include a spatial relationship between one or more features, artifacts, and/or indicia appearing or existing on the document. In some examples, the one or more features, artifacts, and/or indicia may be visual, textual, and/or audible in nature.
[0095] Each technique may allow for a particular variance that occurs as a result of taking different images of the same mail piece. Some of these techniques may be sufficient to establish uniqueness on their own. However, a combination of techniques may result in a more accurate determination and evaluation of the unique elements in order to virtually eliminate false positives (different documents initially determined to be the same) and false negatives (the same document initially determined to be different).
[0096] Whereas a first technique or set of techniques may provide a result with a certain level of confidence, confidence in that result may be increased by combining further techniques. Alternatively, applying the further techniques to the analysis of the image may show that the initial result was in error.
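A minimal sketch of fusing several techniques' results into one confidence follows, assuming, purely for illustration, that the techniques are independent and that each reports a probability that the two images show the same document:

    def combined_confidence(scores):
        # Likelihood-style fusion of independent per-technique probabilities
        # that the two images show the same document. Combining techniques
        # can raise, or overturn, the confidence of any single technique.
        p_same, p_diff = 1.0, 1.0
        for s in scores:
            p_same *= s
            p_diff *= 1.0 - s
        return p_same / (p_same + p_diff)

    # e.g. pixel count 0.90, address block position 0.80, text match 0.95
    print(combined_confidence([0.90, 0.80, 0.95]))  # ~0.999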
[0097] By identifying the document according to the unique elements or characteristics of the document itself, it is possible to keep the physical document free of unnecessary ink, such as a sprayed-on ID tag, keeping the document clean and using less ink in the process. In some examples, the document fingerprint may be determined, at least in part, according to the arrangement, texture, size, reflectance, and/or other characteristics of the paper fibers of the document, rather than by what is, or is not, printed on the document.
[0098] The paper or other fibrous material, such as cotton, that makes up the physical structure of the document may include fibers that are visible to the right camera with proper lighting and magnification. The pattern of these fibers may be a random result of the paper-making process. In addition to the fibers themselves, the way the fibers affect the application of ink to the paper (whether in the delivery address, return address, postage indicia, cancellation markings, and/or other marks) may also affect the document fingerprint.
[0099] If the mail piece is handwritten, the variation in handwriting may be used to identify or, conversely, distinguish a document. Even in the case where the same person prepares two documents which include the same written content, it is possible to distinguish the two documents based on the variation in handwriting, however subtle. By analyzing the writer's handwriting, the two different documents written by the same person may be distinguished. The location of handwritten pixels and/or transitions of the document, for example, may be used as uniquely identifying marks.
[0100] In addition to the unique elements or features described for identifying mail pieces, such as address block dimensions, location, pixel count, etc., handwriting provides additional information which may be used to identify or distinguish the document. For example, the handwriting may be analyzed for line quality, line thickness, line density, transition points, average line slope, other writing characteristics, or any combination thereof.
[0101] Whether the document includes machine print or handwriting, a number of characteristics, elements, and/or features may provide a unique identification of the document. Certain features may include sufficient randomization to identify, sort, and/or otherwise process the documents. For example, the identifying features and/or indicia may include a position of a stamp on the mail piece, an image of the stamp (e.g., different issued stamps may be used on different mail pieces), an amount of metered postage, the date or other metering data in the metered postage area, the kind of indicia (e.g., stamps, metered, bulk mail, etc.), or any combination thereof.
[0102] Cancellation marks also may be used to analyze a document, such as a mail piece. Cancellation marks may be referenced to the envelope and to the stamps they cancel. The cancellation mark may differ from one mail piece to another mail piece depending, for example, on the machine doing the cancelling.
Even if there is no discernible difference in image content, there will be observable variations in inking, skew, and/or other such characteristics.
[0103] FIG. 10 illustrates an example process 1000 for comparing and/or distinguishing a first image and a second image associated with one or more objects.
At operation 1002, the document may be received or entered into the transport of the processing system. At operation 1004, the first image of the document may be captured. At operation 1006, the first image may be stored in a database 1005.
At operation 1008, the first image may be processed to generate a document fingerprint. At operation 1010, the document fingerprint may be stored. The document fingerprint may be stored in database 1005.
[0104] The stored image data may include a destination address. For some documents, the destination address associated with the document may be resolved at operation 1012. At operation 1014, a destination code may be applied to the document. At operation 1016, the document may be routed or sorted according to the destination code. In some cases, the processing system may be unable to resolve the destination address based on the stored image data associated with the first scanned image. For example, the document may need to be taken off-line for further processing to identify the destination address.
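A sketch of this first pass (operations 1002 through 1016) follows; capture_image, make_fingerprint, resolve_destination, apply_destination_code, and route are hypothetical helpers standing in for the scanner, fingerprint extraction, address directory, and sorter, and db stands for the database 1005:

    def first_pass(document, db):
        image = capture_image(document)              # operation 1004 (hypothetical)
        image_id = db.store_image(image)             # operation 1006
        fingerprint = make_fingerprint(image)        # operation 1008 (hypothetical)
        db.store_fingerprint(image_id, fingerprint)  # operation 1010

        destination = resolve_destination(image)     # operation 1012 (hypothetical)
        if destination is not None:
            apply_destination_code(document, destination)  # operation 1014
            route(document, destination)                   # operation 1016
        else:
            # Unresolved: the image (not the piece) goes off-line for further
            # processing; the fingerprint later re-identifies the piece on rescan.
            db.mark_pending(image_id)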
[0105] At operation 1020, the document may be received and/or introduced for rescanning. In one example, the rescanning operation may be performed at a second location of the processing system, or by a separate processing system.
The rescanning operation may be performed after the destination address for the document was identified. The destination address may be associated with the fingerprint of the first image in the database 1005.
[0106] At operation 1022, the second image of the document may be captured.
An image of the front of the mail piece may be captured as a first image or initial image. A second image of the mail piece together with a unique ID from the initial image of the mail piece may be sent to a video encoding station. The second image may be obtained as a result of reintroducing the mail piece on the same transport of the processing system used to obtain the first image.
[0107] In addition to identifying an object, such as a mail piece, some action may then be taken with respect to the identified object. For example, a mail piece may be processed when a ZIP code is applied to the mail piece. If the determination of the ZIP code initially fails, the mail piece patterns may be converted into a database entry that includes some encoding of the patterns. Furthermore, the image of the mail piece may be tagged with the index of the database entry, and the image of the mail piece may be sent off for further processing. The further processing may be performed on the image of the mail piece, instead of the mail piece itself.
When the further processing is successfully completed, the result of that further processing may be added to the database entry for that mail piece.
[0108] The mail piece may then be physically scanned a second time. The second scan may be performed by a second device or second system that has access to the database which includes the mail piece images, the mail piece patterns, and the results of the further processing associated with the first scan. The image or patterns of the mail pieces may be compared with those in the database to identify the results of the further processing. The results of the further processing (e.g., a destination ZIP code, address, or identification) may be applied to the mail piece to assist in further routing or delivery of the mail piece to the destination.
[0109] At operation 1024, the second image may be stored in database 1005.
At operation 1026, the second image may be processed to generate a document fingerprint. The document fingerprint also may be stored in database 1005.
When a mail piece exits the transport of a mail processing system without being resolved by the address destination directory, a barcode or ID tag may not have been sprayed on the piece. The second image may provide a means or method to identify the mail piece and match it with the archived results from the video encoding. In this way, for example, the image matching technique may take the place of the missing barcode or ID tag.
[0110] At operation 1028, database 1005 is searched for a matching fingerprint.
The document fingerprint associated with the second image may be compared with document fingerprints stored in the database 1005. Each mail piece is unique in some way. The unique elements of the mail piece may be used in lieu of the ID tag.
The unique elements may be obtained and/or identified from the front of the mail piece, for example, when the mail piece is first run through the transport to identify the mail piece. The unique elements may further be used to re-identify the mail piece when the mail piece is rescanned, as part of a process to apply the now-finalized ZIP Code.
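A hedged sketch of the fingerprint search at operation 1028 follows, assuming each document fingerprint has been reduced to a numeric feature vector and that a small distance indicates the same piece; the threshold and the sample records are invented for illustration:

    import math

    def distance(fp_a, fp_b):
        # Euclidean distance between two equal-length feature vectors.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)))

    def find_match(query_fp, stored, threshold=3.0):
        # Return the record whose fingerprint is nearest to query_fp,
        # or None if even the nearest record exceeds the allowed threshold.
        best = min(stored, key=lambda rec: distance(query_fp, rec["fingerprint"]))
        if distance(query_fp, best["fingerprint"]) <= threshold:
            return best
        return None

    records = [
        {"id": 1, "fingerprint": [12.0, 4.1, 220.0], "zip": "98052"},
        {"id": 2, "fingerprint": [30.5, 9.8, 115.0], "zip": "98101"},
    ]
    print(find_match([12.3, 4.0, 219.2], records)["zip"])  # -> 98052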
[0111] When the mail piece is rescanned, it is not necessary to re-extract exactly the same characteristics or unique elements obtained and/or identified from the mail piece in the initial scan. For example, the width of the destination address block may be measured slightly differently when the mail piece is rescanned (due, say, to slightly different transport speeds), or the skew on the mail piece may be slightly different (due to placement of the mail piece on the transport), and so on.
[0112] Despite these differences, the unique elements obtained during the initial scan and/or when the mail piece is rescanned may be used to similarly identify the same mail piece. For example, the unique characteristics may be compared using statistical analysis to identify an allowable range of variation, such that slight mismatches between the unique characteristics are not sufficient to confuse one mail piece with another, or to cause the misidentification of the same mail piece during multiple scanning operations.
[0113] Even two mail pieces that originate from the same sender and that are addressed to the same destination will have sufficient differences so that they may be distinguished from each other and uniquely identified based on the unique elements. For example, the two mail pieces may vary according to the placement of the address blocks, by difference in handwriting, by the number of pixels and number of transitions from black to white at each location across the mail piece, by irregularities in placement of cancellation marks, by ink wicking caused by paper fibers, by irregular inking caused by irregularities in the paper surface, by other unique elements, or any combination thereof. Even where the two mail pieces may otherwise look identical to the casual observer, the two mail pieces may still be distinguished based on the unique characteristics. In some examples, an interaction of ink with the paper fibers, or the physical characteristics of the paper fibers themselves, may be used for purposes of identification.
[0114] By using unique elements such as the dimensional qualities and printed characteristics of the mail piece, a unique ID may be assigned to a mail piece without using a second camera and/or printer. Whereas a camera may be used to scan or rescan the document, other types of devices or sensors used to scan or otherwise capture an image may be used in place of, or in addition to, one or more cameras.
[0115] Image characterization which treats the image as one or more "characters" may readily compare, distinguish, ignore, or exclude variances that occur for the first image (initial scan) and the second image (rescan) of the document. This process can be used for the entire mail piece or portions of it, in the same way that an OCR engine may distinguish different fonts of the character "A". All of the printed information of the mail piece may be used to determine a unique document, or to distinguish a number of documents. Thresholds, whether determinative or probabilistic, may be set or established for each technique to allow for the variances of different images of the same document.
[0116] At operation 1030, a database record associated with the first image may be retrieved. At operation 1032, the destination address or destination code associated with the database record may be applied to the document. At operation 1034, the document may be routed or sorted according to the destination code.
[0117] The analysis and processing applied to mail pieces may extend to virtually any type of object which may benefit from the identification (and subsequent re-identification) through the use of any natural or artificial system in which randomness or widespread variation plays a substantial role. Examples include documents, weapons, pharmaceuticals, drugs, animals, gems, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, art work, photographs, manufactured parts, and/or labels.
[0118] Patterns may be randomly or accidentally generated on the document.
The patterns may be generated without any intent to duplicate a particular effect or characteristic. The patterns may be intentionally generated or generated as a subsidiary effect of an intentional process such as manufacturing. In some embodiments, the elements of the pattern must be discernible each time the object is presented for identification and they must be of sufficient variability that two otherwise similar objects are very unlikely to have substantially identical patterns.
The effectively unique patterns may therefore be used for identification of the document or other type of object in which they occur.
[0119] That every snowflake is unique is a truism we all grow up with, but it conceals a substantial truth. If there is sufficient variation in scope available to a pattern, it is extremely unlikely that any one pattern will be accidentally duplicated.
The unique or almost-unique identification of an object based on the appearance of random or at least widely varying patterns on the object may be used to identify or recognize the object, whether it is an envelope, or a stolen jewel.
[0120] Although the characteristics and features of the document may be described as being generally random, some information included on the document, such as the delivery address, may not, strictly speaking, be random. It is not randomness per se that is important in the patterns used for identification; rather, it is the high variability and essential uniqueness of the patterns that are significant. Practically any information that is highly variable from one object to the next can provide unique identification or differentiation of the objects, provided the object characteristics can be quantified and encoded. In addition to the identification of objects based on fibers and/or scratches, for example, any intentional, random, natural, or accidental characteristic of an item/object may be used for identification/authentication of the item/object. The characteristics may include those due to a manufacturing process, the length of lines in an address block, handwriting characteristics (e.g., in a mail piece), or explicit/intentional marking for re-identification purposes. Accordingly, the systems and methods herein may apply to a wide variety of different classes of objects, rather than being tied to the identification and/or authentication of a specific class of objects.
[0121] In some examples, the document may be viewed and/or rescanned by a second camera, different from the first camera that initially viewed or scanned the document. The second camera may be placed at a different distance, include a different focus, or may be skewed slightly with respect to the placement of the first camera. In addition, the document may have picked up additional "non-random"
features as part of the wear and tear of daily life, or as part of a processing or sorting operation, that may introduce physical differences in the document itself, or that account for differences in the first and second scanned image apart from the random variation. The non-random variations between the first scanned image and the second scanned image may increase the likelihood that the same document is erroneously identified as being two different documents.
[0122] There are a number of ways to create a sufficiently robust system or feature set of the images so that the variations due to the camera or wear and tear of the document do not cause an erroneous identification. The number of characteristics or features of the document may be increased such that a subset of the features is sufficient to uniquely re-identify the object. For example, if one feature of the document changes or is altered after the first image is captured and before the second image is captured, the system may ignore or exclude the feature which changed, and instead compare a number of other features associated with the first and second images. When the first and second images vary as a result of changes to the document itself, the system may nevertheless be able to identify that the first and second images identify the same document.
[0123] Although there are many acceptable methods of encoding the extracted features, one method that handles small variations naturally is to encode the features in such a way that nearly identical features give nearly identical encodings, to compensate or allow for variation in the scanning processes. For example, any two cameras or sensors may have different output, however minor the variation, when scanning the same object. The system may therefore accept a match even when the features in the database differ from the features determined from the rescanned image by some small but finite amount.
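One way to realize such an encoding, sketched under the assumption that the features are simple scalar measurements, is quantization with a small allowed slack on comparison; the step and slack values are illustrative:

    def encode(features, step=4):
        # Quantize each measured feature so that small measurement noise
        # moves the encoding by at most one step (a locality-preserving code).
        return tuple(round(f / step) for f in features)

    def codes_match(code_a, code_b, slack=1):
        # Accept a match when every encoded component differs by at most
        # `slack` quantization steps, allowing for camera/sensor variation.
        return all(abs(a - b) <= slack for a, b in zip(code_a, code_b))

    first = encode([141.7, 52.2, 88.9])    # features from the initial scan
    second = encode([143.1, 51.6, 90.4])   # same document, rescanned
    print(codes_match(first, second))      # True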
[0124] In one example, the second camera or its associated programming may have the ability to remove, ignore, exclude or accommodate the non-random characteristics of the document sufficiently to allow re-identification based on the random characteristics. For example, when a document is viewed by otherwise identical cameras varying only in distance to the document, the first and second images may vary in size by some uniform scale factor. The differences in image size due to the distance of the cameras may be accounted for in analyzing the random variations in the first and second images.
[0125] The features encoded in the database may be adjusted or modified to account for slight variations in the identified characteristics of the scanned image.
For example, fourteen characters may be identified in the first line of a paragraph during a first scan, whereas fifteen characters may be identified in a second scan of the document. Recording the count of characters and allowing a mismatch of ±1 character, for example, may be sufficient to accommodate the slight variation in the characteristics of the document when comparing the recorded features.
[0126] Once the features are all quantified, an entry may be made in the database that uniquely identifies the associated document. For example, when a first image of the mail piece does not provide sufficient identification to allow for finalized routing, the characteristics of the mail piece may be encoded into a database entry in such a way that a later encoding associated with a second image of the mail piece can be matched against the previous encoding associated with the first image.
[0127] Once the mail piece is identified, the mail piece may then be further processed to determine the destination address. After the delivery code for the mail piece is determined, it may be applied to the mail piece. The delivery code may be associated with the database entry that holds the features of the mail piece in response to tagging the image of the mail piece with the database index. The database index may be used to re-identify the mail piece and to attach, spray, or otherwise include the now-completed routing code (e.g., delivery ZIP code) to the mail piece.
[0128] A mail piece has on it one or more addresses. For discussion purposes, assume the intent of the system is to determine where that mail piece should be routed, using the destination address printed on the address block of the mail piece.
A typical mail piece might contain an addressee, a street number and name, a city, a neighborhood, a sub-neighborhood (barrio in Mexico, for example), and other information. The goal of the mail piece processing is to route the mail piece along a particular set of intermediate destinations (e.g. sorting tables) until it reaches the intended addressee.
[0129] As previously discussed, there are at least two kinds of information that may be extracted from the object, including the object information and the categorizing information. This information may be other than positional. A simple example would be finding a ZIP code by the fact that it matches the NNNNN or the NNNNN-NNNN patterns regardless of where it appears on the mail piece. For non-mail piece objects this categorization information may be arbitrarily complex and, indeed, may require several rounds of parsing, database consulting, and rejection to determine even what this categorization information actually is.
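For the ZIP code case, that position-independent shape check can be expressed as a regular expression; a minimal sketch:

    import re

    ZIP_PATTERN = re.compile(r"\b\d{5}(?:-\d{4})?\b")  # NNNNN or NNNNN-NNNN

    for line in ["15400 NE 90th Street, Suite 300", "Redmond WA 98052-6399"]:
        match = ZIP_PATTERN.search(line)  # found wherever it appears on the line
        if match:
            # Note that "15400" also matches the NNNNN shape; positional
            # categorizing information is what disambiguates it from a ZIP code.
            print("candidate:", match.group(0))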
[0130] In one example, the object information obtained from a mail piece may contain one or more strings of characters (possibly with alternatives and confidences). The categorizing information may include or identify where those characters appear on the mail piece (e.g. spatial, location, or positional information).
[0131] Categorizing information may identify what line the object information is on, and how far from a reference point the object information is located (e.g., the left-hand edge of the address block), etc. This information may be used to enable the parser to determine what to do with the object data (i.e., to determine to what category it belongs). The actual category of this data may not appear explicitly on the object, but may be deduced from, for example, positional information on object data obtained by the parser.
The Defined Pattern Set
[0132] How and where various address components appear on the mail piece, and what those components represent, may be combined with other elements of the system to produce a novel object handling system. The same information may also be combined with other elements of the pattern generation system to produce a further embodiment of a novel object handling system.
[0133] The patterns may be considered part of one or more feature vectors for the object. The feature vectors are described further herein, and they may be understood to be fairly complex representations of the object, such as a full structure of an address or a ratio of different proteins in a blood sample.
[0134] A defined pattern may exist in a previously existing form or it may be created by a user for a particular application. As an example of the former case, the Universal Postal Union (UPU) often provides guidance as to where on a mail piece the addressee name, the city name, etc. may appear for various countries. Such definition is almost always descriptive rather than prescriptive, and often very approximate. In the latter case, a customer may establish a set of patterns for use in his application. Incoming mail and a country just establishing its addressing standards would be examples of the latter, as would use of the system for handling parts on a Boeing aircraft.
[0135] A typical address defined pattern might identify:
- The top line contains the addressee name and may be located just below a 4-state bar code near the middle of the envelope.
- The bottom line contains the city name and/or country and, optionally, the government code for the address. The code may be located to the left of the city name and follows the pattern ANANNA, where A is an alphabetic and N a numeric character.
- The (third) line above the bottom line may contain the province name and the district name.
- There may be additional lines below the addressee line, for example, which identify a street name or house number.
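Such a defined pattern might be captured as structured data; in the following sketch, the field names and the regular expression for the ANANNA code are assumptions layered on the list above:

    # A defined address pattern captured as data: one entry per expected line,
    # with its designator (role), categorizing hints, and whether it must appear.
    ADDRESS_PATTERN = {
        "name": "example_defined_pattern",
        "lines": [
            {"role": "addressee", "position": "top",
             "hint": "just below the 4-state bar code", "required": True},
            {"role": "street", "position": "below addressee", "required": False},
            {"role": "province_district", "position": "third line from bottom",
             "required": False},
            {"role": "city_country_code", "position": "bottom", "required": True,
             # ANANNA: A = alphabetic, N = numeric
             "code_regex": r"[A-Z]\d[A-Z]\d{2}[A-Z]"},
        ],
    }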
[0136] The pattern may associate two principal kinds of information: a designator for information type (city name, addressee name, etc.) and categorizing information that allows the parser to determine what kind of object information it has found on the object. An example of categorizing information might be where a ZIP code can be located on a mail piece. Location can be either absolute (say, from the leading edge of an envelope) or relative (with reference, say, to the left-hand edge of the address block or to another element on the envelope). The four-digit extension of the ZIP Code may be located to the right of the five-digit ZIP Code in the US.
In addition, the pattern may contain information on whether each of the pattern elements must appear or are optional.
[0137] Because the system can apply fuzzy matching throughout the parsing process (see below), the matches to names may be inexact or exact, their placement may be inexact or exact, and even a "required" item may be missing and the object handled provided the remaining information allows unique classification via one or more of the patterns.
[0138] Often a single defined pattern is not sufficient for a broad set of objects. There are at least three reasons for this. Consider the case of an address. First, there might be multiple allowed or commonly used address forms, each of which has its own defined pattern. Second, the address on the envelope may be sufficiently incomplete, inaccurate, or ambiguous that it must be approached from several different perspectives before proper confidence in how to handle it can be achieved. Third, user-specific business rules may impose additional constraints or an order of precedence that is reflected in the patterns. For example, a user may require that the result of a pattern for which city and province elements match exactly be preferred over the result of a pattern with only a postcode match.
[0139] A pattern can provide a description of the expected address and a description of the output that is to be returned with a match on a particular pattern.
The output data may be described using meta-tags, constant data, spatial relationships, and constant strings like "Recipient:". Meta-tags in the patterns may be ordered so as to instruct which tag to attempt to match on first. If no match value is found for the higher priority tag values, the system moves on to the next pattern.
Meta-tag values also may have qualifiers, like, 'require exact match' or 'allow abbreviation match'. The system can use the ordering of each element in the pattern to predetermine and pre-limit possible candidate values for subsequent data elements.
[0140] The output description is associated with a match on the particular pattern, including a result weighting. There may be from one to N patterns used to match on a given set of inputs. These are checked against the input data in an ordered manner until a match is made. The result is weighted (i.e., for the confidence in the result) and qualified as a finalization or non-finalization. The output is pulled from the database row that the pattern matches, and so may include any data in that database row.
[0141] There may be a defined order to the patterns to be applied to a particular object, with the intention that some be executed first and others later, perhaps conditionally. A pattern may be set early in the order of processing because it allows a more complete classification of the data from the object, because it is more likely to provide unambiguous results, is more likely to occur, or for many other reasons.
[0142] Based on a given object such as a mail piece, the system may have several predefined patterns that include different sets of categories or different spatial associations with one or more of the categories. A first pattern may be selected that is a preferred format or template, or one that simply includes a more complete address or larger number of categories. The system may analyze the image of the object by sequentially comparing the object information with the category information of the selected pattern. If the first object information (or first field) of the object matches the first category information of the selected pattern, then the system may proceed to a next object information (or second field) for comparison with the category information.
[0143] A successful matching of one or more of the object information fields may assist the system in interpreting or identifying object information in a subsequent field. For example, if a first object information field identifies a state of an address, the system may utilize the state information to determine which city is included in the address based on a narrowed database search associated with the city(ies) which reside in that state.
[0144] Accordingly, analyzing the object information can be instructive in identifying a process associated with a corresponding pattern. Furthermore, the selected pattern can be instructive in identifying or interpreting specific fields of object information that otherwise may not be clear absent correlation with the category information of the pattern both for the field in question and based on the other fields of the object. The order that the fields are examined or processed may be defined in the pattern.
[0145] If all, or a sufficient number, of the object information correlates with the category information, then the system may determine that the selected pattern provides a good match with the mail piece. The system may process the object according to instructions associated with the selected pattern. On the other hand, if the object information does not sufficiently correlate with the category information, the system selects a next pattern for comparison.
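A minimal sketch of this sequential comparison and fall-through to the next pattern follows; match_field is a hypothetical helper that consults the database, using earlier matches (the context) to narrow later lookups:

    def try_pattern(fields, pattern, db):
        # Compare the object's fields with a pattern's categories in order.
        # Returns the accumulated matches when all required categories
        # correlate, or None so the caller can try the next pattern.
        context = {}
        for category in pattern["categories"]:
            value = match_field(fields, category, db, context)  # hypothetical
            if value is None and category.get("required", True):
                return None
            context[category["role"]] = value
        return context

    def parse(fields, patterns, db):
        for pattern in patterns:            # patterns tried in their defined order
            result = try_pattern(fields, pattern, db)
            if result is not None:
                return pattern, result
        return None, None                   # no defined pattern matched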
[0146] Confidence in a match made with a specific pattern is also a consideration.
There is not only a defined order in which patterns are applied to an object; patterns are also assigned a confidence level that is reported upon a match. For instance, a match for a mail piece may be attempted with a pattern that requires an exact match on the City, State and Zip code, and if this fails, a subsequent pattern may only require a match on the Zip code. A match on the latter pattern, while instructive, may also include a lesser confidence code.
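Continuing the sketch above, an ordered pattern set for this example might be represented as follows; the structure and confidence values are illustrative assumptions, not taken from the disclosure:

    PATTERNS = [  # applied in this order; first successful match wins
        {"name": "exact_city_state_zip",
         "categories": [{"role": "city", "match": "exact"},
                        {"role": "state", "match": "exact"},
                        {"role": "zip", "match": "exact"}],
         "confidence": 0.99},
        {"name": "zip_only",  # fallback: instructive, but a lesser confidence code
         "categories": [{"role": "zip", "match": "exact"}],
         "confidence": 0.80},
    ]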
[0147] There are at least two kinds of information that may be stored in the database: specific instances of the components of defined patterns and instructions on what to do with the object when an instance of a defined pattern is found.
The components of the defined patterns are listed in the database in such a way that the parser can tell which defined pattern(s) they apply to. This linkage may be as direct as database elements that assign the components to a particular pattern or patterns (pattern numbers or names, for example), or it may be implied by the database structure. A column may be headed "Province Name", for example, with specific province names repeated for all the cities in the province, neighborhoods in the cities, etc. (which appear in subsequent columns), and any pattern requiring that information about an address would use the data in those columns.
[0148] Entries (typically rows) in the database may relate to one or more defined patterns. That is, the parser may use the same set of database entries to attempt to match more than one pattern. This is particularly true when two patterns differ only by an optional item. For example, if the "Neighborhood" is an optional address element, one pattern may direct the parser to look for it in a particular place on the mail piece while another ignores the neighborhood and looks for a city name in the same location on the object. The parser would in this circumstance use the same database row to try to match the data from the object to the two different patterns, and would follow the instructions associated with that row once a match was found.
[0149] A typical simple example of a mail piece database would have one row for each delivery point in the country. The row might contain all the elements that would be present were the address complete, and in this case the entire database would be applicable to all patterns. A more complex database might have regions of the database applicable to specific patterns and not applicable to others. The database would then have an indicator, useable by the parser, of which part of the database to use for a particular pattern. Different patterns may be associated with different databases, and different pattern sets may apply to a single database.
[0150] The database may contain additional information to improve the parser's ability to match text strings contained in the object information. Non-standard abbreviations, transliterations, aliases, and numeric transformations (e.g., "1" to "ONE" or "SIETE" to "VII") specific to the pattern domain may be included. Word translation lists may be defined for multilingual applications. Mappings may be specified between characters with accent marks or non-Latin characters and their typographic or OCR equivalent. For example, the user may want to treat 'ñ' and 'n' as equivalent, or 'ü' and 'ue' as equivalent.
"1" to "ONE" or "SIETE" to "VII") specific to the pattern domain may be included.
Word translation lists may be defined for multilingual applications. Mappings may be specified between characters with accent marks or non-Latin characters and their typographic or OCR equivalent. For example, the user may want to treat TV and 'n' as equivalent, or and `ue' as equivalent.
[0151] The database allows the parser to determine whether the address block object data found on the object matches one of the defined patterns. The parser/database combination may allow matching of more than one pattern, together with a module for resolving the ambiguity. For example, a neighborhood and a city might have the same name (in Mexico, for example, it is quite possible to have a delivery point for which the district, city, and neighborhood all have the same name, only some of which appear on the mail piece).
[0152] If there are different routing instructions if the pattern matches that duplicated name to the city or if it matches it to the neighborhood, the parser may attempt to match both patterns and, getting a match to both, pass the results to a module that resolves the ambiguity by only routing the item to the deepest place in the sort that matches both patterns (in this case, the city level). The database may also be configured to automatically provide such decisions by putting the routing code for the city level of sort as the routing instructions for every city/neighborhood pairing that is duplicated.
[0153] Once a match to a defined pattern has been made, the database provides instructions on what to do with the object. For a mail piece, the database might include and return a bar code to be sprayed or printed on the mail piece.
Typically, each row in the database is unique and each one has one set of instructions to be implemented. In many cases the instructions may be the same for many database rows. For example, a country that automatically sorts only to neighborhood level and leaves it up to the courier to determine the final delivery may provide a database that contains street names and numbers but provides the same routing instructions for all streets and number ranges within a particular neighborhood. Multiple matches may be returned.
[0154] The instructions provided by the database may be code numbers that the rest of the system (e.g. the mail sorting system) knows how to interpret to properly handle the object. The instructions in the system may include a tracking code for a matched object, Latitude and Longitude coordinates that may be used to further qualify a destination address, or instructions in plain text to a user on what to do with the object.
[0155] FIG. 11 illustrates the sequence used when a system 1140 parses an object, in this case a "ParseAddress" operation. For the sake of clarity, the various functions of the system 1140 are described linearly; however, it should be understood that one or more of the functions may be performed in a different sequence or omitted altogether.
Furthermore, it should be understood that this is a highly reentrant system, capable of using information deduced in one part of the process to clarify information obtained in another.
[0156] The system 1140 is illustrated as comprising several components, including a controller 1141, configuration manager 1143, pattern matcher 1145, component matcher 1147, and arbitrator 1149. The system 1140 combines data from the object, the defined patterns, and the database elements, and passes on to the rest of the system the instructions for handling the object. A primary purpose of the system 1140 is to handle all the inaccuracies and errors that occur in the real world. Thus, in every step of what is described below as the functioning of the system 1140, it should be understood that the system 1140 may be correcting errors in the object data or in the database and providing leeway in determining how well the data from the object matches a given defined pattern.
[0157] The system 1140 may supply to the rest of the system not only handling instructions, but also how confident it is that those instructions are correct. The uncertainty measurement by the system 1140 may be used by the system (whether for a single information element or for the entire data extracted from the object) to modify the handling of the object, to request a new image from the object, to try a different pattern, to call for manual intervention, or to otherwise modify the handling of the object.
[0158] The system 1140 receives object data 1142 from the object, which data is intended by the system to provide information sufficient for determining how to handle the object, and which in turn identifies categorizing information. In a mail piece, the object data comprises a string of character data that might comprise an address element, whereas the category data comprises the X-Y coordinates of each of those characters. Stated another way, the object data provides the system with raw data while the category data provides information that allows the system 1140 to determine what kind of information the raw data contains.
[0159] The system 1140 has access to the defined patterns and instructions on how to apply them. These instructions 1144 may be to apply them sequentially until a match above a certain confidence is found, to apply them until all possible matches are found, or almost anything else. In particular, the instructions 1144 may tell the order in which to apply the patterns (and under what circumstances to cease applying them) for determining handling instructions for a given object.
[0160] The system 1140 also has access to the database and uses the information in the database to attempt to fill in the defined patterns with information extracted from the object. It can fulfill this function in many ways, but the following describes one possible application to a mail piece. The system 1140 has received categorizing information as well as object data from the object. It uses this categorizing information 1146 to determine what pattern data element a piece of object data might represent. Thus a five-digit number on a mail piece may be a ZIP code if it appears in the last line, but a street number if it appears in the second.
[0161] One goal of the system 1140 is to fill in all the required elements of a defined pattern with data extracted from the object whose characterizing information is within the tolerances specified by the defined pattern. It fills in the elements by determining from the characterizing information which data elements the object data might be and determining from the database what specific data element in an actual address (or object data row) is matched, and how well.
[0162] If the system 1140 is able to satisfactorily fill in a defined pattern with database objects of the proper type using object and characterizing information from the object, it reads the instructions 1148 in the database on what to do with the object and makes that information available to the system. If it is unable to do so (or if its instructions tell it to keep working until all possible patterns are exhausted), it goes on to the next pattern and continues until no patterns remain. If it finds no satisfactory match, it has default instructions that it outputs telling the system that no defined pattern was matched satisfactorily.
Smart Matching
[0163] FIG. 12 illustrates the overall data and data flow for a system 1250, including both a front-end directory data compiler 1252, with its associated input files and tables 1251, 1265, 1259, 1258, and the back-end runtime pattern identification and categorization processing inputs 1254.
On the left side, the inputs to the directory data compiler 1252 are what the user puts together in order to create the setup for the system to run, and the right side illustrates the input files 1254 that are used when the system is being run.
[0164] The directory data compiler 1252 is provided to ensure that the configuration files 1251 are well-formed. The directory data compiler system validates two aspects of the configuration files 1251: structure and data content.
The configuration file 1251 is said to be structurally invalid if, for example, it contains improper elements or is missing a closing tag. The configuration file 1251 is said to have invalid data content if element or attribute data does not match the type specified in the schema. For example, a positive integer value is expected for searchOrder; a content error would result if the attribute had the string value "bunny."
[0165] Note that some kinds of logic errors may not be readily detected. For example, an address pattern 1265 may have used a component name that does not match a corresponding address data field name. Additional checking may be performed by the system at run-time to notify the user of such errors.
[0166] Runtime character matching allows for the specification of the name of the file that contains mappings to allow the system to match common character alternates. For example, 'Ö' could exist in the directory data 1253, but it will likely be recognized using OCR without the umlaut, so the mapping 'Ö' -> 'O' can be made to improve fuzzy matching performance. In one embodiment, only one CharacterMatchTable 1255 element is allowed. The character match table file may include a UTF-8 encoded text file, with one mapping per line. A mapping is declared with the character found in the directory data 1253 on the left, followed by a dash-greater than style arrow sign ("->"), followed by the mapped character.
The dash-greater than style arrow sign is also illustrated herein as a simple right-directional arrow for convenience.
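Purely as an illustrative sketch (the function names are hypothetical, though the file format follows the description above), a character match table might be loaded and consulted like this:

    # Illustrative only: load "X -> Y" mappings, one per line, from a UTF-8
    # file and test whether a directory character may match an input character.
    from collections import defaultdict

    def load_character_match_table(path):
        table = defaultdict(set)
        with open(path, encoding="utf-8") as f:
            for line in f:
                left, sep, right = line.strip().partition("->")
                if sep and left.strip() and right.strip():
                    table[left.strip()].add(right.strip())
        return table

    def chars_match(directory_char, input_char, table):
        # A directory character matches itself or any declared alternate.
        return (input_char == directory_char
                or input_char in table.get(directory_char, set()))

With the mapping 'Ö' -> 'O' loaded, chars_match('Ö', 'O', table) returns True, so fuzzy matching still succeeds when OCR drops the umlaut.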
[0167] More than one mapping may be declared for the same directory data character, with each mapping appearing on a separate line. For example:
Ö -> O
Ö -> 0
[0168] Note that the character match table 1255 may be case-sensitive.
Therefore character mappings that are meant to apply to non case-sensitive fields have upper case values on both sides of the "->".
[0169] Word matching allows for the specification of the name of the file that contains mappings to allow the system to match common word alternates. For example, "MOUNT" could exist in the directory data 1253 as part of a field value, but it is commonly abbreviated as "MT", so the mapping "MOUNT" -> "MT"
should be made to improve fuzzy matching performance. In one embodiment, only one <WordMatchTable> element is allowed.
[0170] The word match table file 1256 may be a UTF-8 encoded text file, with one mapping per line. A mapping is declared with the word found in the directory data 1253 on the left, followed by a dash-greater than style arrow sign ("->"), followed by the mapped word. More than one mapping may be declared for the same directory data word, with each mapping appearing on a separate line. For example:
MOUNT -> MT
MOUNT -> MNT
[0171] Word match table entries can also be used for numeric input and directory data words. For example:
20TH -> TWENTIETH
TWENTY -> 20
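As a hypothetical sketch, the compiled word match table might be held in memory as a mapping from each directory word to its declared alternates (the structure and names are illustrative):

    # Illustrative only: a directory word matches itself or a declared alternate.
    WORD_ALTERNATES = {
        "MOUNT": {"MT", "MNT"},
        "20TH": {"TWENTIETH"},
        "TWENTY": {"20"},
    }

    def words_match(directory_word, input_word):
        return (input_word == directory_word
                or input_word in WORD_ALTERNATES.get(directory_word, set()))

    assert words_match("MOUNT", "MT")
    assert words_match("TWENTY", "20")           # numeric alternates work too
    assert not words_match("MOUNT", "MOUNTAIN")  # not a declared alternate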
[0172] An ignorable word option specifies the name of the file that contains words such as articles and prepositions which the system can ignore in the input string or directory string while fuzzy matching. For example, if the ignorable words table contains "OF" and "THE", then the input string "AVENUE AMERICAS" will fuzzy match the directory string "AVENUE OF THE AMERICAS". Similarly, the input string "AVENUE OF THE AMERICAS" will fuzzy match the directory string "AVENUE
AMERICAS". A small penalty may be applied to the match score for each ignored word so that a better score is achieved if the ignorable words are present and matched. In one embodiment, ignorable words must match exactly; "THF" would not be ignorable if the table only contained "THE".
AMERICAS". A small penalty may be applied to the match score for each ignored word so that a better score is achieved if the ignorable words are present and matched. In one embodiment, ignorable words must match exactly; "THF" would not be ignorable if the table only contained "THE".
[0173] The ignorable words table 1257 file may be a UTF-8 encoded text file with one ignorable word per line. Leading and trailing whitespace may be trimmed.
In one embodiment whitespace is not allowed within the word.
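A minimal sketch of the ignorable-word handling, under the simplifying assumption that only directory-side words are skipped (the penalty value and names are invented):

    # Illustrative only: align input words to directory words, skipping
    # ignorable directory words at a small per-word penalty.
    IGNORABLE = {"OF", "THE"}
    IGNORE_PENALTY = 0.02

    def match_score(input_words, directory_words):
        """Return a score in (0, 1] for an in-order match, or None."""
        score, i = 1.0, 0
        for word in directory_words:
            if i < len(input_words) and input_words[i] == word:
                i += 1                       # word present and matched
            elif word in IGNORABLE:
                score -= IGNORE_PENALTY      # word ignored: small penalty
            else:
                return None                  # required word missing: no match
        return score if i == len(input_words) else None

    # "AVENUE AMERICAS" matches "AVENUE OF THE AMERICAS" at a reduced score:
    print(match_score("AVENUE AMERICAS".split(),
                      "AVENUE OF THE AMERICAS".split()))   # e.g. 0.96
    print(match_score("AVENUE OF THE AMERICAS".split(),
                      "AVENUE OF THE AMERICAS".split()))   # 1.0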
[0174] Customer address data 1251 consists of address records 1258 and alias tables 1259. Customer address data 1251 can be imported into the system from a text file that contains either delimited or fixed-width fields. An XML
configuration file is used to define the fields to be loaded along with properties of those fields, and to specify the locations of the fields in the data file. Whether fixed-width or delimited, a customer address data file 1251 is expected to have one record per line, with a line feed character ('\n') at the end of each line.
[0175] A customer address data configuration file 1251, which contains at least one address file definition and may contain one or more optional alias file definitions, is used by the directory data compiler 1252 to create the directory data file 1253.
Note that the example uses only delimited address and alias files, but both delimited and fixed width files can be mixed in the same configuration.
[0176] For fields that are not case-sensitive, values (including aliases) are converted to all upper case. If an upper case equivalent does not exist for a character then it is not modified. For example:
"Redmond Woodinville" ¨> "REDMOND WOODINVILLE"
"90th" "90TH"
"Redmond Woodinville" ¨> "REDMOND WOODINVILLE"
"90th" "90TH"
[0177] Character match, word match, and ignorable words may be provided in one or more tables. In one embodiment, character match, word match, and ignorable words are not converted and are always case-sensitive. Therefore values that are meant to apply to non case-sensitive fields are given in upper case.
[0178] Field aliases can be defined to improve address pattern matching. An alias is an alternate but equivalent representation of data for a specific field value.
For example, "Calif' and "California" are sometimes used as aliases for the preferred, canonical two-letter state code "CA". If either "Calif" or "California" is found in an address block it may be considered a match to a record that contains the canonical field value "CA". Each alias table is tied to a specific field. So while "Montana" may be an alias for the state field value "MT", it is not an alias for the word "MT" in the street field value "MT SHASTA".
For example, "Calif' and "California" are sometimes used as aliases for the preferred, canonical two-letter state code "CA". If either "Calif" or "California" is found in an address block it may be considered a match to a record that contains the canonical field value "CA". Each alias table is tied to a specific field. So while "Montana" may be an alias for the state field value "MT", it is not an alias for the word "MT" in the street field value "MT SHASTA".
[0179] Consider the following sample of a delimited table of two-letter state code aliases. An alias may consist of a single value or a list of values. For example, there might be multiple aliases for a city name. If a list of values is used then a delimiter may be supplied to correctly parse the list.
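Since the sample table itself is not reproduced here, the following hypothetical sketch illustrates the principle that each alias table is bound to a single field (all values are invented):

    # Illustrative only: aliases are looked up per field, so a state alias
    # never applies to street matching.
    FIELD_ALIASES = {
        "state": {"CA": {"Calif", "California"},
                  "MT": {"Mont", "Montana"}},
        # no alias table defined for the "street" field
    }

    def matches_field_value(field, canonical, candidate):
        if candidate == canonical:
            return True
        return candidate in FIELD_ALIASES.get(field, {}).get(canonical, set())

    assert matches_field_value("state", "CA", "California")
    assert not matches_field_value("street", "MT SHASTA", "Montana")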
[0180] The system reads addresses by comparing a block of character strings or recognition results to a set of one or more customer-defined address patterns 1265.
The address pattern 1265 describes where different components of an address, such as street name and postcode, can be found relative to one another, and relative to the address block containing them. In addition, the address pattern 1265 defines areas where ignorable text ("noise" which is not an important part of the address) may be found.
[0181] FIG. 13 graphically represents an example pattern 1364 comprising an address block 1362. The example pattern 1364 may be used to describe the address block 1362 using field descriptors 1364. In this pattern 1364, city, state, and suite number are treated as noise 1366. In another application it might be preferable to identify these components in the pattern rather than ignore them. The firm line is ignored in this case with or without a noise declaration because this pattern 1364 is searched from bottom to top and the firm line is above the topmost line that contains required components.
[0182] An input address block may contain additional information, such as telephone number or addressee name, mixed with required component data. To improve pattern matching performance with this extra data, noise character placeholders can be declared in the pattern. Up to maxQuantity characters in the input address block can be ignored between two matching components if a <NoiseChars> element exists in the pattern between the two <Component>
elements. To declare that one or more entire lines of the input address block could be ignored as noise a <NoiseLines> element should be used.
[0183] <NoiseChars> and <NoiseLines> may contain element text making it "named noise". This text indicates that the content of this specified noise area should be written as output using the given element text as the name.
Specification of named noise adds the following restrictions to the noise elements:
- No consecutive <NoiseChars> or <NoiseLines> where one of the elements is named
- No <NoiseChars> or <NoiseLines> with only optional components between them (including all optional lines)
- No completely optional lines with a named <NoiseChars> element
[0184] Patterns may be configured using an XML file. The Pattern file 1265 (FIG.
12) contains one or more patterns, each identified by a customer-defined name.
The system attempts to match each pattern in the order in which it appears in the file. In one embodiment, a pattern 1364 defines a single configuration of an address block 1362 and the fields 1364 that will be returned to the caller for each address record that successfully matches the pattern 1364.
[0185] A line represents a single line of an address as it appears on a piece of mail. In one embodiment, a line must contain at least one component, and can also contain noise. The position of each <Line> element in the address block 1362 is important since the pattern matcher utilizes the same relative positions in the input data.
[0186] A component is a piece of an address, such as postcode or street name, which is represented by a field in the directory data. <Component> elements are defined for a line in the same order in which they are expected to appear in the input data.
[0187] The pattern matcher can determine in what order it should search the input data for matching component values. Some components are considered optional and an address record will not be rejected if it has a component value that was not found in the input. For example, if street suffix is optional, then the input "MAIN" will match the record "MAIN ST". Given two records that are identical except one has a matching optional component and the other does not, the one with the matching optional component is preferred. For example, given the input "MAIN ST", the record "MAIN ST" is preferred over "MAIN RD" and simply "MAIN" if street suffix is optional.
[0188] The component matcher can scan each line from right to left looking for the component instead of left to right. This can improve matching performance for components that are typically found on the right side of the line, such as postcode.
[0189] Each character in the directory data string must be represented by a character in each possibility set of the text or OCR string. If all values for the field have one word and the same number of characters, the matcher is able to handle a limited number of split/merge cases (i.e. directory data string is split into two or is merged with another string in the input). A single missing, leading zero digit is allowed for fields that are all numeric. For example, the directory data string "08010"
matches the input string "8010".
[0190] A fuzzy logic matching confidence threshold can be set for the component.
String fuzzy matching is used to compare the input to field values from the directory.
Allowed settings are: "VeryLow", "Low", "Default", "High", and "VeryHigh". A
better match may be required for "High" than for "Low". For example, the directory string "WOODINVILLE" would not match the input string "WODNVALLE" with a setting of "VeryHigh", but it would match with a setting of "VeryLow" or "Low".
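A sketch of how these named settings might map onto a string-similarity cutoff; the numeric thresholds and the use of Python's difflib are illustrative assumptions, not the specification's algorithm:

    # Illustrative only: named fuzzy-match settings backed by a similarity ratio.
    from difflib import SequenceMatcher

    THRESHOLDS = {"VeryLow": 0.55, "Low": 0.65, "Default": 0.75,
                  "High": 0.85, "VeryHigh": 0.95}

    def fuzzy_match(directory_value, input_value, setting="Default"):
        ratio = SequenceMatcher(None, directory_value, input_value).ratio()
        return ratio >= THRESHOLDS[setting]

    # "WODNVALLE" is similar enough to "WOODINVILLE" only at looser settings:
    print(fuzzy_match("WOODINVILLE", "WODNVALLE", "VeryLow"))   # True
    print(fuzzy_match("WOODINVILLE", "WODNVALLE", "VeryHigh"))  # False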
[0191] The pattern matcher may match directory words to input abbreviations.
For example, with this feature enabled the directory string "Johann Sebastian Bach"
would match the input "J Sebastian Bach" or "J S Bach". At least one word must be unabbreviated, so "J S B" would not be an acceptable match. [Optional, default =
true].
[0192] The pattern matcher may match directory words to an input acronym, or vice versa. For example, with this feature enabled the directory string "Salt Lake City" would match the input "SLC". Similarly, the directory string "MLK" would match the input "Martin Luther King". Ignorable words may be dropped from the directory string, so the directory string "United States of America" would match the input "USA"
if "of" is in the ignorable words table.
if "of" is in the ignorable words table.
[0193] The pattern matcher may match directory words that include contractions of articles and prepositions to an expanded equivalent form in the input. For example, with this feature enabled the directory string "Comte d'Urgell" would match the input "Comte de Urgell". In addition, the input would be matched if the article or preposition is missing, as in "Comte Urgell".
[0194] The pattern matcher may trim leading noise glyphs from an input word in order to match a directory word. For example, with this feature enabled the directory string "Dallas" would match the input "IIIDallas". A substring match of a numeric part of a string is not allowed. So while the directory string "82nd Avenue" would match the input string "A82nd Avenue", it would not match the input string "182nd Avenue".
[0195] The pattern matcher may match the last directory word to the first input word, and then match the remaining words in the proper order. For example, with this feature enabled the directory string "San Gerolamo Emiliani" matches the input "Emiliani San Gerolamo".
[0196] The pattern matcher may allow a match with the first directory word missing from the input. For example, with this feature enabled the directory string "Giuseppe Verdi" matches the input "Verdi". This option can be combined with allowMissingMiddleWord, allowMissingNonNumericLastWord, and allowMissingNumericLastWord, however no more than one word can be missing from a string.
[0197] The pattern matcher may allow a match with any single word from the directory string to be missing from the input except the first or last word.
For example, with this feature enabled the directory string "Don Luigi Milani"
matches the input "Don Milani", however "Don Luigi Alberto Milani" does not. This option can be combined with allowMissingFirstWord, allowMissingNonNumericLastWord, and allowMissingNumericLastWord, however no more than one word can be missing from a string. Note that a word that is in the ignorable words table may not count as missing in this context.
[0198] The pattern matcher may allow a match with the last directory word missing from the input provided the word does not contain any digits. For example, with this feature enabled the directory string "Cernusco sul Naviglio"
matches the input "Cernusco" (assuming "sul" is an ignorable word); however "State Route 20"
does not match "State Route" unless allowMissingNumericLastWord is also enabled.
This option can be combined with allowMissingFirstWord, allowMissingMiddleWord, and allowMissingNumericLastWord, however no more than one word can be missing from a string.
matches the input ''Cernusco" (assuming "sul" is an ignorable word); however "State Route 20"
does not match "State Route" unless allowMissingNumericLastWord is also enabled.
This option can be combined with allowMissingFirstWord, allowMissingMiddleWord, and allowMissingNumericLastWord, however no more than one word can be missing from a string.
[0199] The pattern matcher may allow a match with the last directory word missing from the input provided the word contains digits or is a Roman number.
For example, with this feature enabled the directory string "Vittorio Emmanuele II"
matches the input "Vittorio Emmanuele"; however "Martin Luther King" does not match "Martin Luther" unless allowMissingLastWord is also enabled. This option can be combined with allowMissingFirstWord, allowMissingMiddleWord, and allowMissingNonNumericLastWord, however no more than one word can be missing from a string.
[0200] The pattern matcher may trim the numeric ordinal, i.e. "st", "nd", "rd", or "th", from a directory word prior to matching the input word. For example, with this feature enabled the directory string "29th Ave" would match the input "29 Ave". To avoid false-positives, ordinal trimming is not attempted with directory words that have a single-digit numeric portion, so the directory string "5th Pl" would not match "5 Pl".
[0201] The pattern matcher may match a Roman number to a numeric string. For example, with this feature enabled the directory word "XXIII" matches the input "23"
and the directory word "23" matches the input "XXIII". An exact match may be required.
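One possible sketch of this Roman-number matching follows; the conversion helper is illustrative, not the specification's algorithm:

    # Illustrative only: exact matching between a Roman-numeral word and a
    # numeric string, in either direction.
    ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

    def roman_to_int(word):
        total = 0
        for ch, nxt in zip(word, word[1:] + " "):
            value = ROMAN.get(ch)
            if value is None:
                return None                               # not a Roman number
            total += -value if ROMAN.get(nxt, 0) > value else value
        return total

    def roman_numeric_match(directory_word, input_word):
        return (str(roman_to_int(directory_word)) == input_word
                or str(roman_to_int(input_word)) == directory_word)

    assert roman_numeric_match("XXIII", "23")
    assert roman_numeric_match("23", "XXIII")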
[0202] The pattern matcher may trim trailing noise glyphs from an input word in order to match a directory word. For example, with this feature enabled the directory string "Elm Street" would match the input "Elm StreetIII". A substring match of a numeric part of a string is not allowed. So while the directory string "Highway 52"
would match the input string "Highway 52A", it would not match the input string "Highway 521".
[0203] The pattern matcher may match strings with pairs of words transposed.
For example, with this feature enabled the directory string "Redmond Woodinville Road" would match the input "Woodinville Redmond Road". Each word may only be affected by a single transposition. So the directory string "Redmond Woodinville Road" would not match "Woodinville Road Redmond" because two transpositions would be necessary: "Redmond"/"Woodinville" followed by "Redmond"/"Road".
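A simplified sketch that permits a single transposition of adjacent words (the actual matcher may be more general; the names are illustrative):

    # Illustrative only: a match is allowed when at most one adjacent pair
    # of directory words is swapped, so no word is transposed twice.
    def matches_with_transposition(directory_words, input_words):
        if len(directory_words) != len(input_words):
            return False
        if directory_words == input_words:
            return True
        for i in range(len(directory_words) - 1):
            swapped = directory_words.copy()
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            if swapped == input_words:
                return True
        return False

    d = "Redmond Woodinville Road".split()
    assert matches_with_transposition(d, "Woodinville Redmond Road".split())
    assert not matches_with_transposition(d, "Woodinville Road Redmond".split())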
[0204] The pattern matcher may match an input word with a truncated directory word. For example, with this feature enabled the directory string "Philadelphia"
would match the input string "Phila". Where truncation is only allowed on the right-hand side, "Philadelphia" would not match "Delphia".
[0205] FIG. 14 illustrates a process 1400 of identifying, imaging, matching, verifying, classifying, and delivering the results of an object match. For example, an object such as the image of an address label may be processed. At operation 1405, an object or mail piece is loaded into the system. In one embodiment, a large number of mail pieces are loaded into the system at the same time for sequential or parallel evaluation of the mail pieces.
[0206] At operation 1410, an image of the object is obtained. The image may be obtained with a camera, scanning device (e.g., a charge-coupled device (CCD) or a contact image sensor (CIS)), optical sensor, thermal imaging device, magnetic imaging device, etc. In one embodiment, images of the objects are uploaded to the system in bulk. For example, the objects being identified, sorted, delivered, or classified may have been previously scanned or photographed ahead of time.
[0207] At operation 1415, the image of the object is processed with a recognition system, such as a system which utilizes OCR. The recognition system may parse the image into separate lines of characters or words that may be analyzed for context and/or meaning. The recognition system may identify an address block of the image, which specifies an intended destination of the object or mail piece.
[0208] At operation 1420, the parsed image data is compared with a first pattern.
For example, a first line of the address block is compared with a first field or component of the first pattern. Similarly, a second line of the address block may be compared with a second field of the first pattern. In one embodiment, a single line of the address block may be associated with, or compared to, a plurality of fields in the pattern. Operationally this matching may be performed top down or bottom up.
[0209] The patterns may be weighted. The weightings may determine an order for comparison of the patterns with the image data. For example, the first pattern to be compared to the image data may have a higher weighting than a second pattern to be compared to the image data.
[0210] At operation 1425, a confidence level of the comparison of the image data with the first pattern is determined. A confidence threshold may be associated with the first pattern. In one embodiment, the first pattern is validated, or considered a match, when the confidence level equals or exceeds the confidence threshold for the first pattern.
[0211] If the confidence level is less than the confidence threshold associated with the first pattern, the system may then compare the image data with the second pattern according to the assigned weighting of the patterns. A confidence level of the comparison of the image data with the second pattern may then be determined.
The remaining patterns may be cycled through until a confidence threshold of a corresponding pattern is met or exceeded.
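The cycling behavior of operations 1420 through 1430 might be sketched as follows; the Pattern fields and the scoring callback are illustrative stand-ins for whatever an implementation actually uses:

    # Illustrative only: try patterns in descending weight order and select
    # the first whose confidence meets that pattern's own threshold.
    from dataclasses import dataclass

    @dataclass
    class Pattern:
        name: str
        weight: float      # determines comparison order
        threshold: float   # minimum confidence for a validated match

    def select_pattern(image_data, patterns, score):
        """score(image_data, pattern) returns a confidence in [0, 1]."""
        for pattern in sorted(patterns, key=lambda p: p.weight, reverse=True):
            confidence = score(image_data, pattern)
            if confidence >= pattern.threshold:
                return pattern, confidence   # stop cycling on first validation
        return None, 0.0                     # no pattern matched satisfactorily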
[0212] At operation 1430, a pattern is selected or validated for the image data. In one embodiment, a single pattern is selected for the image data. For example, as soon as the confidence threshold for the corresponding pattern has been met or exceeded, the system stops cycling through the plurality of patterns and selects the corresponding pattern.
[0213] At operation 1435, a pattern output is identified for the selected pattern.
The pattern output may identify a standard or canonical format corresponding to the address block. In one embodiment, the canonical format provides additional information that was not included in the address block. The canonical format may also replace one or more words in the address block with a more standardized version, or corrected spelling. The canonical format may also remove redundant or unnecessary information that was identified in the address block. In one embodiment, the output does not include any of the same information that is included in the address block, but rather points to related information in a database such as geo coordinates, a telephone number, or a bar code.
[0214] At operation 1440, the canonical format is sprayed on, printed on, or otherwise applied to the object or mail piece. In one embodiment, the canonical format information is transferred to the object via a short-range signal such as RFID.
In that case, the object may include a memory chip which is configured to store the canonical format information. The canonical format information may then be used for further sorting, delivery, classification, or inventory of the object.
[0215] The system may be used in applications such as high-speed mail sorting where response time is critical. The system is also designed to accommodate user data consisting of an arbitrary collection of fields, so database optimization must be able to automatically adapt to the data and patterns related to a specific application.
[0216] The system optimizes database access at compile time, when user data is normalized, analyzed, and loaded into a binary file format, and at initialization time, when software using the system is started. At compile time the system performs adaptive indexing to improve database query speed. By analyzing the data, patterns, and user configuration, the system determines which fields should be indexed to balance performance with database size. Indexes can significantly increase the database size, so it is not practical to create indexes for all combinations of fields. For example, consider a data set consisting of US
addresses. The following pattern could be defined to match a complete address:
State -> City -> Street -> House Number -> Suffix ->
Pre-directional -> Post-directional (fields are shown in search order)
[0217] An index is created for state since the pattern matcher will need the list of city values associated with the parsed state. Similarly, a two-field index for state-city is created because the pattern matcher will retrieve a list of streets associated with the previously parsed state-city combination. A three-field index, state-city-street, is also created, but since the number of records associated with a specific state-city-street combination is relatively small this will be the last index created for this pattern. At this point entire records would be fetched instead of values for a single field.
[0218] At initialization the system analyzes the patterns and preemptively caches data that will be queried frequently or may be too expensive to access at parse-time.
Given the US address pattern described above, for example, the system knows to query the database to generate a static list of all state values to be used by the pattern matcher. The system then analyzes the size of the list of state values. If the list is not too long, the system queries the database to create a static, associative table of city values for each state value. These static data structures can be orders of magnitude faster to access and manipulate compared to the same operations performed with a SQL query.
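The compile-time indexing and initialization-time caching described here might look like the following sketch, which uses an in-memory SQLite database as a stand-in; the schema, index names, and size cutoff are assumptions:

    # Illustrative only: adaptive indexing for the state -> city -> street
    # search order, plus preemptive caching of frequently queried values.
    import sqlite3
    from collections import defaultdict

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE address (state TEXT, city TEXT, street TEXT, house TEXT)")

    # Compile time: index only the field prefixes the pattern will query.
    db.execute("CREATE INDEX idx_state ON address (state)")
    db.execute("CREATE INDEX idx_state_city ON address (state, city)")
    db.execute("CREATE INDEX idx_state_city_street ON address (state, city, street)")

    # Initialization time: a static list of states and, if that list is
    # small enough, an associative table of the cities in each state.
    states = [row[0] for row in db.execute("SELECT DISTINCT state FROM address")]
    cities_by_state = defaultdict(list)
    if len(states) < 100:  # illustrative size cutoff
        for state, city in db.execute("SELECT DISTINCT state, city FROM address"):
            cities_by_state[state].append(city)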
[0219] This same concept may be utilized, for example, for sorting and delivery of international mail pieces. In addition to historical reasons why the names of cities may differ from one another, names of cities and countries may be spelled quite differently simply as a result of the various languages involved. An alias table which identifies all of the different variations for the various languages can thereby associate the variations with a canonical version of the name. For example, the canonical version of the name of a country may be the version that is native to the country in question, whereby all other variations of that name that are typically used in other countries are associated as aliases. In another embodiment, the canonical versions of all the names are those associated with a particular accepted language, such as French.
[0220] In the European Union, a name of a city or country may be spelled quite differently depending on the language of choice. By way of example, The Netherlands is commonly referred to by a number of different names, including Holland and Les Pays-Bas, to name a few. By associating these names with a canonical version of the address, whether a letter addressed to Amsterdam originated in France or in the U.K., a standardized address or label may be sprayed, printed, or otherwise applied to the letter which would identify the same canonical name for the destination country. By extension, a single canonical address can be applied for all of the different variations that exist either by convention or due to differences in languages.
[0221] Consider, for example, a mail piece that comes into China, written in English. It is written to correspond to one of several possible Chinese address templates where the components are all in English.
The canonical output format, however, is not in English at all, but in Chinese.
The system templates and translation to canonical address formats can be used to automatically translate the address from one language (and its associated templates) to another language, including the canonical template for that country. In operation, a sequence of English-language templates is applied to the mail piece, and once the template components are filled in, the address code for delivery is extracted from the database. The delivery code may then be sprayed, printed, or otherwise applied to the mail piece for routing using bar code readers.
[0222] A format translation database derives a delivery code by applying templates which point to the canonical form of the address. The process may work for part of an address or the entire address. For example, the mail piece may identify a city but not a state. The template can identify a unique city without any reference to the state, and output both the city name and the state as the official format for application on the mail piece. Similarly, the system can identify the canonical address written in a different language from that identified on the mail piece as originally received.
Handwriting Recognition
[0223] A handwriting engine relies heavily on contextual information in order to properly read text. Contextual information allows the engine to resolve ambiguous words by applying a dictionary of possible words to a written word. The system may be used as the contextual system for a handwriting engine.
The system allows the user to create a parsable structure for text and an associated dictionary for each of the fields in that block of text. For example, a US
address block contains different fields that need to be read (city name, ZIP code, street name, etc.) and those text fields are in a small set of possible locations. As another example, a personal check has a set of fields that are to be read (date, amount, recipient, etc.) with those fields in a specific set of locations.
[0224] Handwriting applications would normally be written for a specific usage scenario without the ability to easily reuse the handwriting engine in a new situation.
With the system, a handwriting engine can easily be used in new situations by changing the configuration and/or changing the data dictionary. Effectively, this may make the handwriting engine operate similarly to the system 1140 (FIG. 11) described above, using the patterns and the database to read the handwritten text.
[0225] A configuration is created that specifies the set of possible layouts of text elements that are to be read by the handwriting engine. Each text element in the configuration is associated with a data dictionary. The dictionary for each element may be dependent on previous elements (for example, list of streets in a particular city). The dictionary may also be a regular expression (for example, a date field on a check). The handwriting engine reads the configuration.
[0226] Given an input image containing handwritten text, the engine iterates through the words in the text block and while reading, determines which of the set of configured layouts this input text best matches. As it reads each input text element, it uses the dictionary for that element in a given text layout to determine how well that layout matches the input text block.
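This layout-scoring loop might be sketched as follows, assuming each element's dictionary is either a set of allowed words or a regular expression as described above (all names are illustrative):

    # Illustrative only: pick the configured layout that best explains the input.
    import re

    def element_score(text, dictionary):
        """dictionary is a set of allowed words or a regular-expression string."""
        if isinstance(dictionary, set):
            return 1.0 if text in dictionary else 0.0
        return 1.0 if re.fullmatch(dictionary, text) else 0.0

    def best_layout(input_elements, layouts):
        """layouts maps a layout name to one dictionary per element position."""
        scores = {}
        for name, dictionaries in layouts.items():
            if len(dictionaries) != len(input_elements):
                continue   # this layout cannot describe the input
            matched = sum(element_score(text, dictionary)
                          for text, dictionary in zip(input_elements, dictionaries))
            scores[name] = matched / len(dictionaries)
        return max(scores, key=scores.get) if scores else None

    layouts = {"check": [r"\d{1,2}/\d{1,2}/\d{2,4}", {"DOLLARS"}]}
    print(best_layout(["12/31/2024", "DOLLARS"], layouts))  # check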
Automatic Pattern Generation
[0227] FIG. 15 illustrates an example process of automatically generating patterns from a number of sample objects. At operation 1505, a set of sample input data images are loaded into the system. The images can be loaded as digital images, or the images can be obtained from a sample set of physical objects, which may be scanned using a camera device. The automatic pattern generation may be applicable to weapons, pharmaceuticals, drugs, animals, gems, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, mail, art work, photographs, manufactured parts, labels, etc., for which, in some examples, the algorithms described herein may be used to determine one or more features which may be used to distinguish/identify/authenticate the object.
[0228] Each input data image will contain input data corresponding to the Input Image Description in a specific defined area of interest. The area of interest is meant to represent a standard area of interest that may be encountered in the processing of like input data images.
[0229] At operation 1510, descriptions of the images are entered or loaded into the system. The descriptions may, but are not required to, identify a location of specific fields or components to provide a set of spatial identifiers. For example, one location of an address block may be described as "country", whereas another location of the address block may be described as "street address". Different patterns are associated with different sets of descriptions or spatial identifiers.
[0230] The sample input data identifies the pattern name that is associated with a specific input image, the priority and/or confidence level of this pattern, the pattern field element names, the corresponding data found in each pattern field element on the image area and in the area of interest, and the specific outputs that should be associated with a match on a given image pattern.
[0231] At operation 1515, a number of patterns are generated based on the images and associated descriptions. A different pattern may be generated for each image and image description combination. The patterns correlate the image data with the associated description and optional spatial identifiers. The patterns may be associated with one or more aliases for certain of the address fields or components.
[0232] For example, the customer creates N images, where each image shows a single representative example of a typical image to be processed including an example of a typical data format with data in the typical locations. When the set of N
images are taken together they represent the entire collection of data formats that the customer system processes using the patterns to be created. These N images have specific data on them. This data would be identified in an accompanying text image data description file. The data description file includes the image name, the data that should be found on that image and the 'meta-tag' that this specific data item is associated with.
[0233] The Data Description file may also specify the pattern outputs, pattern weighting, and other characteristics of each data item specified. At operation 1520, pattern outputs are specified for each of the patterns. The pattern output may identify what information will be sprayed, printed, or otherwise applied to the mail piece if the associated pattern provides a match. In one embodiment, the pattern output identifies a canonical address.
[0234] At operation 1525, the patterns are weighted. For example, a first pattern may be weighted higher than a second pattern. The pattern weighting may indicate a preferred or standard format for the addresses. In one embodiment, the pattern weighting relates to a confidence level in how complete the information associated with the pattern is. For example, a pattern which identifies both city and state may have a higher weighting, or confidence, than a pattern which only identifies the city.
[0235] At operation 1530, confidence thresholds are specified for the patterns.
The confidence threshold may identify to what degree the image data of the scanned mail piece must match a particular pattern before a match is determined or verified.
If the confidence threshold for a pattern is not met, then the system moves on to the next lower weighted pattern to determine if a match with the next pattern can be met.
In one embodiment, the system stops comparing the image data with the patterns once the confidence threshold for a corresponding pattern is met. In this way, the highest weighted pattern which is validated for the mail piece is selected as a match, or as a best match with the image data. Different patterns may have different confidence thresholds.
[0236] This system takes the loaded information and generates a plurality or deck of patterns based on the set of sample input data images and the sample input data image descriptions described above. This includes generating a single pattern based on each image and its corresponding description and specifying, for each pattern: a confidence matching threshold, which identifies to what extent image data must match a particular pattern before a match is verified and declared; the spatial, feature, or other distinguishing relations of the pattern field elements; and the generic canonical and/or custom output to be associated with a match on this pattern.
[0237] When a specific set of images and a specific set of image data descriptions are run through the automatic pattern generation system, the resulting pattern file may be used to aid in optimizing a data file specific to reading these types of images and also in recognizing other images containing similar data patterns. In one embodiment, the user can specify the patterns to be evaluated, the confidence associated with each pattern, and the corresponding outputs for the selected patterns.
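The generation step might be sketched as follows, assuming the data description file has already been parsed into one record per sample image (field names and default values are invented for illustration):

    # Illustrative only: one generated pattern per (image, description) pair,
    # ordered so that higher-weighted patterns are evaluated first.
    from dataclasses import dataclass

    @dataclass
    class GeneratedPattern:
        name: str
        fields: list       # ordered pattern field element names
        weight: float      # evaluation priority
        threshold: float   # confidence required to declare a match
        output: str        # e.g. a canonical address format

    def generate_patterns(descriptions):
        deck = [GeneratedPattern(d["pattern_name"], d["field_names"],
                                 d.get("weight", 1.0), d.get("threshold", 0.8),
                                 d["output"])
                for d in descriptions]
        deck.sort(key=lambda p: p.weight, reverse=True)
        return deck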
[0238] A physical object is provided with enough information on it to allow the system to determine and perform a desired function. For a mail system this may be an envelope with some attempt at or approximation to an address on the envelope. For a manufacturing plant or parts depot, this may be a label or serial number which identifies a part or otherwise associates information with the part. The system is configured to extract the information from the object (object information) and then determine information about that extracted information (categorizing information). For a mail piece, this additional component may comprise an address block locator and an OCR system.
[0239] A defined pattern or set of patterns may exist a priori (e.g. a Universal Postal Union-defined address format for each country), or it may be defined for a specific application by a vendor or by a customer. This will be described in detail below. Part of the defined pattern may include information on how to apply the pattern either alone or in a defined and prioritized order with other defined patterns, and what generic and specific information to return.
[0240] The database contains the lists of classification elements, individual applicable element values and the desired system output when a desired pattern has been matched. For a mail application this database may contain, for example, a list (the list being a classification element) of states (the states being individual applicable element values), the cities within each state, the neighborhoods within each city, and the carrier routes within each neighborhood. The desired output may be the routing ZIP code. The database hierarchy corresponds to the classifying elements to be found on the object and to the patterns created for classifying the object.
[0241] The system 1140 (FIG. 11) can determine which input data fields on the object correspond to which elements in the defined patterns and to which elements and element values in the database, and can perform fuzzy matching on the input data fields and interpolate missing elements where possible.
[0242] The relationship between the defined pattern and the elements in the database may be viewed as similar to that between a defined class in, say, C++
and the many possible instantiations of that class. The pattern or patterns show the overall structure and interrelationships of object elements, while the database provides specific examples, element values (usually meant to be fairly all-encompassing) of those patterns.
[0243] The term "indicia" as used in this specification may apply to various features of a mail piece, document, or other object as described above. For example, indicia may include cancellation marks, address, name, stamps, forwarding information, etc. The systems, apparatus, methods, processes, and operations may apply equally well to indicia and anything else visible or discernable on the object to be identified, including random dirt marks and other physical characteristics such as the object's dimensions, weight, color, etc.
[0244] Whereas the specification repeatedly provides examples identifying a mail piece or mail pieces, the systems, methods, processes, and operations described herein may also be used to analyze or compare other types of documents, files, forms, contracts, letters, wills, bonds, or records associated with insurance, medical, dental, legal proceedings, passports, tax, accounting, etc. Similarly, objects other than documents, such as weapons, pharmaceuticals, drugs, animals, gems, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, mail, art work, photographs, manufactured parts, and labels, may be analyzed according to the systems, apparatus, methods, processes, and operations described herein. Image data corresponding to the object being analyzed may be captured by a variety of devices, such as a cell phone camera, which may further perform any and/or all of the various steps, methods, processes, and operations described herein.
[0245] In some examples, the object in question may comprise a specific, previously-seen object that includes a serial number, such as bills and weapons.
The object may be compared to a database to determine whether there is a match to a specific entry in the database, or at least where there is a sufficient confidence that the object matches the entry.
[0246] In some examples, the object in question may comprise a specific, previously-seen object that does not include a serial number or any other explicit identifier, such as collector coins or most artwork. In the example of coins, there may be many different coins of the same kind, and the systems/methods disclosed herein may be used to determine whether a coin is the same one that was stolen, for example, from a particular collector. In the example of art work, where each object may be unique, the art work may be cataloged and therefore a stolen art work may be identified from a theft report, for example, but in some cases it may not be cataloged. Where the art work has been cataloged, the art work may be tested and/or compared against information that has been stored for the stolen item.
In the event that we initially do not know which art work was stolen (e.g., only that some art work was stolen) the retrieved art work can be compared against some or all of the entries in the database until we find, or in some cases do not find, a close enough match in order to identify and/or authenticate the stolen art work.
[0247] In some examples it may not be important to identify a unique item; rather, we just want to verify that the item is what it purports to be. For example, we may not care to identify a specific coin (e.g., by sequence or when it was minted), but rather we may be more interested in verifying that the coin was in fact made by the mint, and therefore is an authentic minted coin (i.e., one of a number of authentic minted coins). In some examples, there may not be an intentional identifier associated with the object, either on the object or in the database (such as a serial number). Rather, the database may comprise a hash table of features corresponding to legitimate objects. Such a system may be used, for example, to identify counterfeit coins, artwork, or objects generally.
[0248] Some of the systems/methods described herein are configured to answer the basic question of "which item is this?" in which case an object identifier may be associated with one or more reference features. In some examples, the item may comprise an identification number or some other manufacturer-imposed identifier, such as a serial number on a bill or weapon. The identifier may be on the item itself, or may be on something associated with the item, such as a container, box, or bottle (e.g., for drugs, coins, or wine). When the identifier is available, the feature vectors of the item being evaluated can be compared with the feature vectors associated with the identifier stored in the database to determine if there is a match.
[0249] In some examples, although there may be some identifier associated with the feature vectors stored in the database, there may not be any identifier which is physically located on the item (or on a tag or a container of the item).
Again, by way of example, a collector coin may not have a serial number or other unique identifier.
Nevertheless, we still may want to be able to know which specific item this is.
Accordingly, the features of the item may be compared against the features associated with some or all of the entries stored in the database in order to not only find a match, which indicates that the item is known to us, but also to identify the item, for example, as a unique item from among all the entries in the database.
[0250] In some examples, it may be sufficient to merely determine that the item is known to us, for example, to authenticate that a coin was minted by an authorized/licensed mint, regardless of which particular coin it may be.
According to the methods/systems described herein, it is therefore possible to hash the feature vectors of all coins that have been produced by a mint, and to match the feature vectors of a particular coin against a hash table, in order to determine that the particular coin is one which has previously been cataloged (again, irrespective of the identification of any particular coin among the group of authentic coins).
[0251] In some examples, a region in and around a serial number may be used for purposes of assisting in the identification/authentication of the object.
For example, an image of the serial number and the surrounding region may be captured. The serial number may be used to help identify what object the object purports to be, and then the feature vectors extracted from the area in or around the serial number may be compared with entries in a database to confirm and/or authenticate the purported self-identification based on the serial number.
[0252] Whether the object is a mail piece, a coin, or a weapon, many of the systems/methods described herein may operate similarly. For example, a mail piece may be identified using "random" characteristics that result from addressing, stamping, or otherwise processing the mail. In coins, we may be primarily interested in comparing surface characteristics or variances. Both the mail directory and the coin database may be used to create and/or store feature vectors for later identifying and/or authenticating the object of interest.
[0253] In some regards, the address of a mail piece may provide similar features as a serial number. For example, the address may serve to narrow the scope or number of entries that need to be compared to the mail by virtue of identifying a limited number of entries having the same or similar features associated with the address. This may be the case even if we were initially unable to determine the complete address of the mail (e.g., otherwise the mail would have been correctly routed in the first place). A portion of the address and/or identifier may provide important clues to help identify and/or authenticate the object in combination with a comparison of the feature vectors.
[0254] FIG. 16 depicts an example of a system 1600 configured to identify, track, trace, inventory, authenticate, verify, sort, deliver, or classify objects and/or articles associated with objects, such as a weapon. The object may be identified by generating a feature vector associated with a specific physical region of the object.
An image of the physical region may be captured using a high-resolution imaging device. The physical region may be identified according to a proximity to or offset from one or more physical features of the object.
[0255] In a first instance (such as when the object is issued), image data associated with the captured image may be processed to identify one or more fingerprint features and to extract a first feature vector based on the fingerprint features. A "fingerprint feature" is a feature of the object that is innate to the object itself (the way human fingerprints are), a result of the manufacturing process, a result of external processes, or of any other random or pseudo-random process. The first feature vector may be associated with an object identifier. The first feature vector and identifier may be recorded in a secure file or location.
[0256] In a second instance, the physical region of a purportedly same object may again be captured using a high-resolution imaging device and a second feature vector extracted. The object identifier may be used to retrieve the record of the associated first feature vector. The first feature vector and the second feature vector may be compared to determine whether the object associated with the first feature vector is the same object corresponding to the second feature vector. To determine if the second feature vector and the first feature vector are sufficiently similar to establish within a particular confidence level that they both came from the same object, difference values between the second feature vector and the first feature vector may be processed to determine the degree of match or mismatch of the feature vectors. The processing of the difference values may comprise a method to modify the difference values to dampen differences that do not contribute to object identification and to enhance differences that do contribute to object identification.
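The asymmetric comparison just described might be sketched as follows; the dampening and enhancement weights and the acceptance threshold are illustrative assumptions, not values taken from this disclosure:

```python
def compare_feature_vectors(first, second, threshold=500.0):
    """Return (is_match, score) given the enrollment-time vector `first`
    and the check-time vector `second` (sequences of 0-255 grayscale
    values). Differences explainable by wear are dampened; differences
    with no natural explanation are enhanced."""
    score = 0.0
    for a, b in zip(first, second):
        d = a - b  # positive: region darkened after enrollment (e.g., a new scratch)
        if d >= 0:
            score += d * 0.1    # dampen: plausibly caused by ordinary wear
        else:
            score += -d * 10.0  # enhance: a surface cannot naturally "un-wear"
    return score <= threshold, score
```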
[0257] This application may be exemplified in many different forms and should not be construed as being limited to the examples set forth herein. In the figures, the size of the boxes is not intended to represent the size of the various physical components. Only those parts of the various units are shown and described which are necessary to convey an understanding of the examples to those skilled in the art.
[0258] Some of the following examples are described with reference to embodiments involving identification and inventory management of weaponry.
However, the principles disclosed herein are equally applicable to identification and inventory management of a variety of objects characterized by distinguishable physical features, for example, pharmaceuticals, drugs, animals, gems, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, mail, art work, photographs, manufactured parts, labels, and the like, or combinations thereof. The physical features may be observable with the naked eye and/or be microscopic in scale. Thus, various other examples of the disclosed technology are also possible and practical.
[0259] Additional aspects and advantages will be apparent from the following detailed description of example embodiments. The illustrated example embodiments and features are offered by way of example and not limitation. Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more examples.
[0260] In general, the methodologies of the presently disclosed technology may be carried out using one or more digital processors, for example the types of microprocessors that are commonly found in mobile telephones, PCs, servers, laptops, Personal Data Assistants (PDAs), and all manner of desktop or portable electronic appliances.
[0261] In the following description, certain specific details of programming, software modules, user selections, network transactions, database queries, database structures, etc., are provided for a thorough understanding of the example embodiments of the disclosed technology. However, those skilled in the art will recognize that the disclosed technology can be practiced without one or more of the specific details, or with other methods, components, materials, etc.
[0262] The term "recognize" is a term of art used throughout the following description that refers to systems, software and processes that glean or "figure out"
information, for example alphanumeric information, from digital image data.
"Recognition" may include not only character recognition, but also relative location of characters, fields, features, or other distinguishing elements. Details are known in other contexts such as mail handling, document capture, and object recognition.
[0263] For simplicity, the following discussion describes examples wherein the object is a weapon 1614 and system 1600 is configured for weapon inventory management beginning when a weapon 1614 is issued through ultimate disposal or surrender of weapon 1614.
[0264] System 1600 may comprise a high-resolution imaging (HRI) system 1602 and/or a hand-held imaging (HHI) system 1604. The imaging systems may be configured to capture any of a variety of images including digital images, ultrasound images, x-ray images, thermal images, microscopic digital images, images providing depth and/or topological information, and the like, or any combinations thereof.
[0265] HRI 1602 may be located at a Central Inventory Control Point (CICP) or any location conducive to manufacture, collection, storage, distribution and/or reclamation of weaponry. Weapon 1614 may be issued at CICP 1650. HRI 1602 may be configured to initially identify weapon 1614 and associate the identification information with personnel to whom the weapon 1614 is issued.
[0266] HHI 1604 may be located at a Forward Operating Base (FOB) 1660 that is remote from CICP 1650. Weapon 1614 may be checked at FOB 1660 for surrender, disposal, and/or tracking. HHI 1604 may be configured to identify weapon 1614 for authentication and/or to verify that an authorized person is in possession of and/or surrendering weapon 1614. In an example, weapon 1614 may comprise several parts including a barrel, stock, sights, etc. Each part may be identified and/or cataloged separately and/or a single part may represent the entire weapon 1614.
[0267] In an example, when weapon 1614 is issued at CICP 1650, HRI 1602 may capture a first image 1902 (see, for example, FIG. 19A) of a specific region 1629 of weapon 1614 including a structure 1624. Region 1629 may be identified as an offset from structure 1624. The location of region 1629 may be known only to the identification system and not to any personnel involved in the identification process.
[0268] In an example, weapon 1614 may have a unique surface crystal and abrasion structure from its manufacture and previous history. The surface crystal and abrasion structure may form random patterns. In addition, anything stamped into weapon 1614 (e.g., the serial number) may have random imperfections that are unique to weapon 1614, even if the exact same die is used to stamp the next weapon on the assembly line. Further, after weapon 1614 has spent time in the field, it acquires scratches and other imperfections that are also random.
Thus, region 1629 may include a unique pattern of crystals and/or abrasions comprising at least one fingerprint feature. System 1600 may extract a first feature vector 1634 to store data corresponding to the at least one fingerprint feature from image data 1630 associated with the first image 1902.
[0269] When weapon 1614 is checked at FOB 1660, HHI 1604 may capture a second image 2002 (see FIG. 20A) of region 1629 and extract a second feature vector 1644 from image data 1640 associated with the second image 2002. First feature vector 1634 and second feature vector 1644 may be compared to authenticate weapon 1614. In other examples, either or both systems HRI 1602 or HHI 1604 may be used to extract the first and/or second feature vectors at issuance or when weapon 1614 is checked for surrender, disposal and/or tracking anywhere and at any time, and claimed subject matter is not limited in this regard.
[0270] In an example, HRI 1602 may comprise an imaging device 1608, a non-specular illumination system 1610, and/or a mount 1612 to hold weapon 1614 in place. HRI 1602 may be configured with specialized optical recognition software to identify structure 1624 to locate region 1629 of weapon 1614. In another example, structure 1624 and/or region 1629 may be located manually by a user. Weapon 1614 may be positioned on HRI 1602 in such a way as to facilitate imaging of region 1629. Structure 1624 may be a serial number and/or any other distinguishable physical feature of weapon 1614 (e.g., front or rear sight, barrel, muzzle, trigger, safety latch, model stamp, or the like, or any combinations thereof). HRI 1602 may capture first image 1902 of region 1629. Image 1902 may show elements of a grain surface within region 1629 proximate structure 1624 and/or imperfections in the surface and/or imperfections in the structure 1624 itself.
[0271] Structure 1624 is the stamped serial number. In an example embodiment, system 1600 may be configured to recognize the serial number from first image 1902 and may use an ASCII string for that serial number as a database index referring to weapon 1614 in inventory control database 1626. Through this recognition (e.g., of the weapon's serial number) the claimed identity of an object such as weapon 1614 may be established. In alternative embodiments, the claimed identity may be entered by a user or captured from a tag or other item not part of the object.
[0272] In an example, HRI 1602 may be configured to generate image data associated with image 1902. HRI 1602 may include a local processor to process image data 1630. Image data 1630 may be processed by HRI 1602 to generate first feature vector 1634. Processing image data 1630 may comprise identifying fingerprint features 1627 on a surface of weapon 1614 within region 1629 and expressing the fingerprint features as one or more values to generate first feature vector 1634. HRI 1602 may be configured to store image data 1630 and/or first feature vector 1634 in inventory control database 1626 in communication with HRI
1602. Image data 1630 and/or first feature vector 1634 may be encrypted. In another example, HRI 1602 may include a remote computer 1618 configured to process image data 1630 to extract first feature vector 1634. Computer 1618 may store image data 1630 and/or first feature vector 1634 in inventory control database 1626. In another example, inventory control database 1626 may be stored in a memory component of HRI 1602.
[0273] In an example, HRI 1602 may be configured to receive and/or generate additional data 1632 to be entered into inventory control database 1626 in association with image data 1630 and/or first feature vector 1634. The additional data may include data identifying a person to whom weapon 1614 is being issued, a serial number, a time and/or date stamp, geographical location information, weapon 1614 status (e.g., condition, age, wear, parts missing, and the like), or the like and any combinations thereof. In an example, data to be entered into inventory control database 1626 may be secured by any of a variety of data security techniques, such as by encrypting.
[0274] The above operations are performed when the weapon is first cataloged. The same imaging and recognition system may be used later when the weapon is received back from an FOB 1660 for ultimate disposal. At that point another high-resolution image of the identifying region may be captured, the serial number may be recognized or otherwise identified, a new feature vector may be extracted, and a comparison may be made between the cataloged feature vector 1634 and the newly-captured one to determine the degree of certainty that this is the original weapon. In addition, this system may also allow for manual comparison of the identifying region images created at issue and at disposal.
[0275] Referring still to FIG. 16, in an example, weapon 1614 may be surrendered or otherwise returned to a site that is remote from the location of HRI
1602 such as an FOB 1660. FOB 1660 may not have access to technological capabilities available at CICP 1650. An HHI 1604 may be a portable handheld imaging device comprising at least one lens 1620, a handle 1622, actuator button(s) 1628 and/or illumination source(s) 1625. HHI 1604 may be available at FOB 1660.
If weapon 1614 is returned to or checked at FOB 1660, weapon 1614 may be authenticated with HHI 1604. HHI 1604 may be configured with specialized software to locate region 1629 of weapon 1614 including structure 1624 (e.g., the serial number). HHI 1604 may be configured to capture a second image 2002 of region 1629 and to extract a second feature vector 1644 from second image data 1640.
Second feature vector 1644 may comprise at least one value representing fingerprint feature 1627. HHI 1604 may comprise a memory for storing image data 1640 associated with image 2002 and/or a processing device for processing the image data 1640. In another example, HHI 1604 may communicate the image data 1640 associated with image 2002 to computer 1618 for processing. Computer 1618 may generate second feature vector 1644 and/or may store feature vector 1644 and image data 1630 in inventory control database 1626 for processing at a later time. In another example, inventory control database 1626 may be stored in a memory component of HHI 1604.
[0276] In an example, weapon 1614 may be identified in inventory control database 1626 according to the serial number marking on weapon 1614. HHI 1604 may be configured to recognize the serial number or the serial number may be entered by other means. HHI 1604 may access inventory control database 1626 using the serial number to look up a stored first feature vector 1634. HHI
1604 may access first feature vector 1634 from database 1626 according to any of a variety of other associations, such as by serial number, assignment to a particular person, description, or color code, and the like, or any combinations thereof.
[0277] HHI 1604 may be configured to compare first feature vector 1634 and second feature vector 1644 to authenticate weapon 1614. HHI 1604 may authenticate weapon 1614 by determining whether first feature vector 1634 and second feature vector 1644 match to a predetermined identification certainty level.
The match may be determined by the degree of correspondence of the patterns (or features extracted from those patterns) in the first feature vector 1634 and the second feature vector 1644. If a match is sufficiently close, weapon 1614 may be verified as authentic.
[0278] The comparison of first feature vector 1634 and second feature vector 1644 may dampen or enhance differences in the first and second feature vectors due to natural causes such as wear and tear and/or corrosion. For example, region 1629, of which both images 1902 and 2002 are taken, is not likely to suffer less damage once weapon 1614 is in the field; it may, however, suffer more damage. As a result, when comparing first feature vector 1634 and second feature vector 1644, the program that determines the certainty of a match may treat differences between the first and second feature vectors asymmetrically. That is, a scratch (for example) that exists in the later image but not in the earlier image may add only a small amount of distance between the two feature vectors (the differences it creates being dampened), while a scratch in the earlier image that is not in the later one contributes a large amount of distance (its effects being enhanced, since there is no reasonable way for a scratch to be removed in the field). Thus, the comparison may minimize or enhance degradation of a match confidence level based on such differences.
Thus, when surrendered, weapon 1614 may still be authenticated despite changes in fingerprint features 1627 attributable to natural wear and tear.
[0279] In an example, initially HRI 1602 may extract several first feature vectors from corresponding images of a plurality of regions of weapon 1614. Thus, when the weapon 1614 is checked in or surrendered, the same plurality of regions may be imaged by HHI 1604 and second feature vectors may be extracted from those corresponding images. Comparing the plurality of first feature vectors with the corresponding plurality of second feature vectors may improve match certainty.
[0280] Processing of image data 1640 may be executed in HHI 1604 and/or in computer 1618 and claimed subject matter is not limited in this regard. For example, the extraction and/or comparison of first feature vector 1634 and second feature vector 1644 may be executed by computer 1618. Alternatively, first image 1902 and second image 2002 may be manually compared. HHI 1604 may store and/or associate image data 1640, second feature vector 1644 and/or an identification certainty level in database 1626.
[0281] HHI 1604 may encrypt image data 1640, second feature vector 1644 and/or an identification certainty level prior to storing in database 1626. In another embodiment, HRI 1602 may be configured to authenticate weapon 1614.
[0282] The identification certainty level associated with a match between feature vectors may vary. For example, a certainty level associated with a match between feature vectors extracted from image data generated by different devices may be lower than a certainty level associated with a match between feature vectors extracted from image data generated by the same device.
[0283] In an example, HHI 1604 may be configured to receive and/or generate additional data 1642 to be entered into inventory control database 1626 in association with image data 1640 and/or second feature vector 1644. The additional data may include data identifying a person surrendering or checking in weapon 1614, the weapon 1614 serial number, a time and/or date stamp, geographical location information, weapon 1614 status (e.g., condition, age, wear, parts missing, and the like), or the like and any combinations thereof.
[0284] Weapon 1614 may be inducted into a control system at the FOB 1660 and sent back to the central inventory control point 1650 (an armory, a refurbishment point, disposal point, or other centralized location). An additional (third) high-resolution image may be taken with the HRI 1602 described above and comparisons made with the first image. This can be done if the FOB hand-held system HHI 1604 does not have sufficiently high confidence in a match. In addition, the proposed field system may allow manual comparison of the old and the new serial number region images for identification where the automatic system is insufficiently certain. HRI 1602 may be configured to provide an image with a higher resolution than HHI 1604.
Thus, if a confidence level in a match or non-match is not sufficiently certain, an additional image of region 1629 may be captured by HRI 1602 when weapon 1614 is surrendered or checked in, in order to improve the confidence level associated with the match.
[0285] In the above examples, system 1600 is configured to identify, track, trace, inventory, authenticate, verify, sort, deliver, and/or classify weapon 1614.
However, system 1600 may be configured to identify, track, trace, inventory, authenticate, verify, sort, deliver, or classify any type of objects and/or articles associated with objects, such as pharmaceuticals, drugs, coins, bullion, currency, integrated circuits, clothing, apparel, legal documents, financial documents, mail, art work, photographs, manufactured parts, labels, etc.
[0286] In an example, feature vectors may be used to track pilot whales.
Referring now to FIG. 17, pilot whales have a dorsal fin 1770 that extends out of the water and is easy to photograph. The posterior part of these fins is very thin (the fins are shaped like airfoils) and is very often damaged by shark bites 1772 in the normal course of life of the pilot whales. Using a digital camera, an image of the dorsal fin may be captured. Features along the posterior edge of the fin may be identified as fingerprint features. A feature vector comprising representations of the fingerprint features may be generated from the digital image data. The feature vector may be associated with an identifier for the pilot whale (e.g., the whale's "name") in a database. When an unidentified whale is photographed, the features of its dorsal fin may be extracted and the resulting feature vector may be compared with those in the database. If a sufficiently good match is obtained, the whale may be identified.
[0287] The whale may sustain additional damage to the dorsal fin after the initial image was collected due to new shark bites and other causes of wear and tear.
In an example, the comparison procedure may use subtractive features of the fin, thus differences between the feature vectors may be dampened where such differences may be associated with new shark bites and other natural causes of wear and tear.
In other words, new bite marks may not strongly degrade the match if older bite marks are removed due to a deeper new bite. However, if the later fin image has dorsal fin material where the original does not (i.e., it lacks a bite mark where the original had one), the difference between the two feature vector values is not dampened. Such a difference that is not attributable to a natural cause may even be amplified, thus degrading the match considerably more than a dampened difference.
Therefore, a match may not be identified where differences between the feature vectors are not attributable to natural causes and are thus amplified.
[0288] In another embodiment, feature vectors may be used to identify coins.
Clearly it is not desirable to inscribe a high-value coin, gem, or artwork with an identifying serial number because it may devalue the object. Such a serial number may, however, be inscribed on a coin of lesser value, for example.
[0289] Referring now to FIG. 18, a coin 1874 may comprise two random feature types, wear marks 1876 and/or a crystal pattern 1878 in the surface of the coin. In the former case, prior to cataloging, coin 1874 may have been pressed against other coins in a bag or suffered other marks. In the latter case, coin 1874 may have distinctive microscopic (or near-microscopic) features in the way the surface crystals of the metal fractured when the coin was struck. Since these depend on the alignment of crystal boundaries in the coin blank, and since that alignment is random, these provide a good feature set even if the coin is cataloged directly after stamping. In addition, no two stampings leave exactly the same image, even two stampings in a row from the same die.
[0290] Because the pattern of wear marks 1876 and/or the crystal pattern 1878 are random, either or both may serve as fingerprint features from which a feature vector may be calculated and used for identification. Authentication of coin 1874 may be executed by comparing a feature vector known to be an authentic representation of coin 1874 to a feature vector to be authenticated. Differences between feature vectors may be minimized where the differences may be attributable to the effects of natural wear and tear. Differences between feature vectors may be magnified where the differences are not attributable to the effects of natural wear and tear.
[0291] Similarly, gem stones have an optical or X-ray crystal pattern which provides an identifying feature set. Every gem stone is unique and every gem stone has a series of random flaws in its crystal structure. Indeed, even the most perfect and high-value natural stones have an internal crystal structure that has a great many flaws. The crystal pattern may be used to generate feature vectors for identification and authentication.
[0292] In an example embodiment, determining whether two objects are really the same object may depend on a degree of match of their features, not substantially on the degree of mismatch. This is true because the random or pseudo-random fingerprint features on an object that may be used for identification are subject to modification after the first feature vector is extracted and stored. A weapon, for example, may receive a scratch through part of the critical area, a document may get a coffee stain, a pilot whale may receive a new shark bite in its dorsal fin.
All of these are normal changes that may occur in the course of the life of the object.
All of these are normal changes that may occur in the course of the life of the object.
[0293] When the second feature vector, extracted after the changes occur, is compared with the first, the elements of the second vector that correspond to the areas associated with the fingerprint features that underwent change may be substantially different from those in the first feature vector. These changes, however, do not indicate that the two feature vectors were extracted from different objects. The vector comparison process, therefore, seeks to dampen the effects of such changes.
[0294] On the other hand, coffee stains do not disappear, scratches do not remove themselves, and shark-bitten dorsal fins do not heal themselves. When there is no natural process that can explain the differences (the apparent disappearance of a shark bite, for example), those differences may be enhanced in comparing the two feature vectors, because they focus on differences far more likely to be caused by the feature vectors coming from two different objects than from any natural changes in the same object.
[0295] The purpose of both enhancement and dampening is to stress that, when it comes to identifying an object, it is the areas in the second feature vector that match the areas in the first that are of substantial significance. If enough match, the fact that others are substantially different in an explainable way is not important.
Also of substantial significance are those areas that are different in ways that are very unlikely to occur naturally. These differences may be enhanced in importance because they are strongly indicative that the two vectors came from different objects.
Finally, differences that occur or could occur naturally say almost nothing about whether the two feature vectors describe the same object. Those features are dampened in the comparison of the two feature vectors.
[0296] There are many ways to accomplish such dampening, enhancing, and determining degree of match (rather than degree of mismatch). Below are three examples of methods to dampen and/or enhance differences between vectors. In each case, two feature vectors are represented: a first feature vector extracted from an object first in time or at time of indexing, and a second feature vector extracted later in time or at the time of identification/verification of the object. In some embodiments, a set comprising several such feature vectors may be processed for each object. A plurality of feature vectors may each be extracted from images of different structures or regions on the object. In the following example methods, a single feature vector is described. However, the methods may be applied to all feature vectors with, for example, their effects simply added together or combined in other ways to get a figure of merit for the entire object.
[0297] Assume also for discussion that the "feature vector" is a 5 x 5 grayscale image (levels 0-255) of a region of the object. The vector is a 25-long array of numbers, formed by taking the first row of the 5 x 5 array and making that the first five entries in the feature vector, the second row becoming the next five, and so on.
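The row-major flattening just described might look like the following sketch (NumPy is used purely for convenience; the pixel values are made up):

```python
import numpy as np

# A 5 x 5 grayscale patch (levels 0-255) standing in for the imaged region.
patch = np.array([[ 12, 200,  45,  90, 130],
                  [ 77,  31, 220,  15, 160],
                  [ 99, 140,  60, 250,  10],
                  [180,  25, 115,  70, 205],
                  [ 50, 190,  35, 145,  85]])

# Row one supplies entries 0-4 of the feature vector, row two supplies
# entries 5-9, and so on, yielding the 25-long array described above.
feature_vector = patch.reshape(25)
```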
[0298] Method 1: Calculate a new vector whose entries are the squares of the differences of the two vectors. Take a pre-determined threshold and count the number of entries in this resulting vector below that threshold. The pre-determined threshold is chosen such that two images of the same region on the same object are very likely to match within the threshold and two random images are not.
[0299] If the number of within-threshold features is above some predetermined number (say 10, so that 10 of the 25 regions on the two objects match very well), call that a match between the object and the original.
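A minimal sketch of Method 1 follows; the squared-difference threshold of 100 is an assumed value, while the requirement of 10 matching entries comes from the example above:

```python
import numpy as np

def method1_match(v1, v2, sq_threshold=100, min_matching=10):
    """Method 1: square the per-entry differences, count entries below a
    pre-determined threshold, and call it a match when at least
    `min_matching` of the 25 entries agree closely."""
    sq_diff = (np.asarray(v1, dtype=int) - np.asarray(v2, dtype=int)) ** 2
    return int(np.count_nonzero(sq_diff < sq_threshold)) >= min_matching
```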
[0300] Method 2: Take the two feature vectors as above. Assume the numbers in each run from 0-255. Assume a match distance (chosen based on experience in testing this kind of object, for example). As an example, let that distance be 4. If the two values match within +/- 4, the probability of that happening randomly is 8/256, or about 3%. Calculate for each slot the probability that the two vector values are an accidental match and then calculate the overall probability that this might be a false positive by multiplying those results together. If, for example, 10 of the vector entries match within the range +/- 4, there is only a 0.031^10 chance the result is random.
If it is not random, it must be because the second vector matches the first with high probability.
If the probability of an accidental match is sufficiently low, call it a match.
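Method 2 might be sketched as follows; the per-slot chance of 8/256 comes from the example above, while the overall false-positive limit is an assumption:

```python
def method2_match(v1, v2, window=4, false_positive_limit=1e-12):
    """Method 2: each within-window agreement has roughly an 8/256
    (~3%) chance of occurring at random. Multiply the per-slot chances
    together; if the combined probability that the agreement is
    accidental is low enough, call it a match."""
    p_accidental = 1.0
    matched_any = False
    for a, b in zip(v1, v2):
        if abs(int(a) - int(b)) <= window:
            p_accidental *= 8 / 256
            matched_any = True
    return matched_any and p_accidental <= false_positive_limit
```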
[0301] Method 3: Calculate the difference vector as in Method 1. Sum the entries. Subtract that sum from 25 x 255 x 255 (the largest possible distance vector magnitude). Threshold the result as in a normal distance calculation, so that it is a match if the summed mismatch is low enough (equivalently, if the subtracted result is high enough). All of these methods have the same intent: measure degree of match, not degree of mismatch. Prior to performing such operations it may be preferable to perform enhancement or dampening of features as discussed above.
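Method 3 might be sketched as follows; the 90% acceptance threshold is an assumed value:

```python
def method3_match(v1, v2, match_threshold=0.9 * 25 * 255 * 255):
    """Method 3: sum the squared per-entry differences and subtract the
    sum from 25 x 255 x 255 (the largest possible magnitude), so a high
    result now indicates a high degree of match."""
    total = sum((int(a) - int(b)) ** 2 for a, b in zip(v1, v2))
    return (25 * 255 * 255 - total) >= match_threshold
```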
There are many other ways to accomplish this besides those mentioned and claimed subject matter is not limited in this regard.
[0302] FIGS. 19A, 19B, 19C, 19D, and 19E depict examples of a first micrograph image 1902 and a first feature vector 1634 associated with weapon 1614. The first micrograph image 1902 may be taken when weapon 1614 is issued, for example, at central inventory control point 1650.
[0303] FIG. 19A depicts an example of first micrograph image 1902 focused on a selected region 1629 of a metal surface of weapon 1614. Selected region 1629 may be chosen based on a proximity to a particular structure 1624 of weapon 1614 such as a serial number. Regions located proximate to other structures of weapon 1614 may be selected, such as a forward sight or rear sight, and claimed subject matter is not limited in this regard. A smaller area 1904 is highlighted within region 1629. Area 1904 may include one or more fingerprint features 1627 and may be identified based on an offset from structure 1624 and selected for feature vector extraction.
[0304] FIG. 19B is a detailed view of the highlighted area 1904 in FIG.
19A.
[0305] FIG. 19C is an example of area 1904 of image 1902 prepared for feature vector extraction. In an example, area 1904 may be blurred and/or contrasted until the average grayscale is 127 out of 255. In an example, a histogram of area 1904 may be modified so that the darkest regions are just barely not saturated black (e.g., having a pixel value of 0), the lightest regions are just barely not saturated white (e.g., having a pixel value of 255), and the average threshold value is 127 (i.e., halfway).
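One way to realize this preparation step is sketched below; the exact blurring and contrast pipeline is not specified above, so the linear stretch and re-centering here are assumptions:

```python
import numpy as np

def prepare_area(area):
    """Stretch the histogram so the darkest pixels land just above
    saturated black (value 1) and the lightest just below saturated
    white (value 254), then shift so the mean sits near 127."""
    a = np.asarray(area, dtype=float)
    lo, hi = a.min(), a.max()
    a = 1.0 + (a - lo) * (253.0 / max(hi - lo, 1.0))  # map into [1, 254]
    a = a + (127.0 - a.mean())                        # re-center the mean at 127
    return np.clip(a, 1.0, 254.0)                     # keep the ends unsaturated
```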
[0306] FIG. 19D depicts area 1904 divided into a grid 1914 having 56 equal regions. The average grayscale level in each of the 56 regions may be selected as representative features for a fingerprint feature value set of feature vector 1634.
However, this is merely one example of a method of preparing an image for feature vector extraction. In another example, a feature vector may be generated from area 1904 without modifying the grayscale, by generating a feature vector representing each of the pixels in area 1904.
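The gridding step might be sketched as follows; a 7 x 8 grid is one way to obtain 56 equal regions, since the actual shape of grid 1914 is not stated here:

```python
import numpy as np

def grid_feature_vector(area, rows=7, cols=8):
    """Split the prepared area into rows x cols equal cells and use each
    cell's average grayscale as one feature value (56 values for 7 x 8)."""
    a = np.asarray(area, dtype=float)
    h, w = a.shape
    a = a[: h - h % rows, : w - w % cols]       # trim so cells divide evenly
    cells = a.reshape(rows, a.shape[0] // rows, cols, a.shape[1] // cols)
    return cells.mean(axis=(1, 3)).reshape(-1)  # 56-long feature vector
```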
[0307] FIG. 19E depicts an example of first feature vector 1634 comprising a table of numerical values of the average grayscale in each of the 56 regions of grid 1914. First feature vector 1634 is merely an example of a method of extracting a feature vector from an image. There are a variety of methods of extracting feature vectors known to those of skill in the art and claimed subject matter is not limited in this regard.
[0308] FIGS. 20A, 20B, 20C, 20D, and 20E depict examples of a second micrograph image 2002 and second feature vector 1644 associated with a weapon purported to be weapon 1614. The second micrograph image 2002 may be taken at FOB 1660.
[0309] FIG. 20A depicts an example of a second micrograph image 2002 which may be focused on selected region 1629 when imaging weapon 1614. Image 2002 includes a deformation 2020 within highlighted area 2004. Deformation 2020 is not visible in image 1902. Deformation 2020 shows up in image 2002 as a dark line.
Deformation 2020 may be an abrasion that weapon 1614 received in the field. Image 2002 also includes a light portion 2022 within highlighted area 2004 that is not visible in image 1902. Light area 2022 may represent a different elevated portion or different pattern of crystal and/or abrasion features from that visible in image 1902.
Light area 2022 may not be attributable to natural wear and tear that a weapon may receive in the field and may call into question the authenticity of the weapon purporting to be weapon 1614.
[0310] FIG. 20B is a detailed view of the highlighted area 2004 in FIG. 20A showing deformation 2020 and light portion 2022.
[0311] FIG. 20C is an example of area 2004 prepared for feature extraction.
Area 2004 is prepared in the same way area 1904 was prepared for feature vector extraction and is blurred and/or contrasted until the average grayscale is 127 out of 255. A histogram of area 2004 may be modified so that the darkest regions are just barely not saturated black (e.g., having a pixel value of 0), the lightest regions are just barely not saturated white (e.g., having a pixel value of 255), and the average threshold value is 127 (i.e., halfway).
[0312] FIG. 20D depicts area 2004 divided into a grid 2014 having 56 equal regions, as was area 1904 in FIG. 19D. The 56 regions may comprise fingerprint features based on grayscale. The average grayscale level in most of the 56 regions matches the average grayscale in corresponding regions of area 1904 in FIG. 19D, with the exception of the regions changed by the deformation 2020 and light portion 2022. Regions 2040 will have a lower grayscale value than the corresponding regions in FIG. 19D due to an overall darkening effect caused by the deformation 2020. Regions 2050 will have a higher grayscale value due to an overall lightening effect caused by light portion 2022.
[0313] FIG. 20E depicts an example of second feature vector 1644 comprising a table of numerical values representing fingerprint features. The values are of the average grayscale in each of the 56 regions of grid 2014. Regions 2040 have a lower numerical value than corresponding regions 1940 in first feature vector 1634 because deformation 2020 darkened these regions in area 2004, lowering the grayscale values. Similarly, regions 2050 have higher numerical values than corresponding regions 1950 in first feature vector 1634 because the light portion 2022 lightened these regions in area 2004, increasing the grayscale values.
[0314] FIG. 21 is a table 2110 comprising a difference vector showing the difference between first feature vector 1634 and second feature vector 1644 wherein differences attributable to normal wear and tear are dampened and differences that are not attributable to normal wear and tear are enhanced.
[0315] In an example, feature vector 1634 was extracted first in time, so feature vector 1644 may be subtracted from feature vector 1634 to render the difference shown in table 2110. In an example, based on observation or other data, it may be determined that a positive difference such as in fields 2140 may correlate to effects of normal wear and tear on weapon 1614. Thus, positive differences between feature vector 1634 and feature vector 1644 may be dampened to reduce the effect a positive difference has on a match correlation value in determining a match between feature vector 1634 and feature vector 1644. For example, dampening may comprise dividing each positive value by 10, giving field 2140 values of 1, 14.8, and 7.1, respectively.
[0316] Similarly, it may be determined that negative differences (due to average lightening of an area in image 2002) are not likely to have arisen naturally due to normal wear and tear of weapon 1614 and may be present because the weapon returned is not the same weapon issued. Lighter area 2022 may result in higher grayscale values in image 2002 than corresponding regions in image 1902, giving a negative difference between first feature vector 1634 and second feature vector 1644. Negative differences may not be dampened and may even be enhanced to accentuate the difference between feature vectors not related to natural causes. In an example, enhancing the differences may comprise simply multiplying the absolute value of any negative values in the difference vector 2100 by 10, giving field values of 580, 1120, 1320, and 1220, respectively.
[0317] Difference vector 2100 may be used to derive a match correlation between feature vector 1634 and feature vector 1644 to authenticate purported weapon 1614.
In an example, a match correlation value may be a sum of the enhanced and dampened difference values. Thus, negative differences may shift a correlation value more than positive differences of the same magnitude.
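The dampening, enhancement, and summation just described might be sketched as follows; the division and multiplication by 10 come from the example above, while the acceptance threshold is an assumption:

```python
def modified_difference(first, second):
    """Subtract the later vector from the earlier one, divide positive
    differences (plausible wear) by 10, and multiply the magnitude of
    negative differences (implausible lightening) by 10."""
    out = []
    for a, b in zip(first, second):
        d = a - b
        out.append(d / 10.0 if d >= 0 else abs(d) * 10.0)
    return out

def match_correlation(first, second, threshold=1000.0):
    """Sum the modified differences into a match correlation value and
    compare it against a pre-determined threshold (assumed here)."""
    value = sum(modified_difference(first, second))
    return value <= threshold, value
```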
[0318] In an example, a match between first feature vector 1634 and second feature vector 1644 may be determined based on a magnitude of a correlation value compared to a pre-determined threshold correlation value. In other embodiments, a match may be determined by a variety of comparison processes and claimed subject matter is not limited in this regard. In an example, if a match correlation value is within a predetermined confidence threshold value range, then a match between first and second feature vectors may be declared and the purported weapon 1614 may be authenticated as the genuine weapon 1614.
[0319] FIG. 22 depicts an example of a process 2200 for generating a feature vector for authenticating an object. In an example, process 2200 may comprise leveraging randomly-occurring physical features of an object. The features may be microscopic in scale or may be visible to the naked eye. Such features may comprise, for example, cracks in surface material, a crystalline pattern, a pattern of fibers, a bleed pattern, a pattern of fabrication irregularities and the like, or any combinations thereof. The randomly-occurring features may include fingerprint features from which values may be derived and stored in a feature vector. The feature vector may then be associated with the object.
[0320] In an example, process 2200 may begin at operation 2202, where a structure (see structure 1624 in FIG. 16) of the object may be identified.
Such a structure may comprise a variety of physical formations on a surface of the object, such as, a stamped marking, a crystalline structure, a recess, an outcropping, a component, a part, an ink deposit, a tear, an edge, and the like or combinations thereof.
[0321] At operation 2204, a region of the object may be identified. In an example, the location of the region may be based on an offset from the structure. The region (see region 1629 in FIG. 16) may comprise fingerprint features (e.g., see fingerprint features 1627 in FIG. 16). The fingerprint features may be proximate to the structure.
[0322] At operation 2206, an image of the region identified on the object may be captured using an imaging system (e.g., HRI 1602 or HHI 1604). Such an imaging system may be, for example, a digital camera, a microscope, an electron microscope, a scanner, a thermal camera, a telescope, an ultrasound imaging device, or the like, or any combinations thereof. Such an imaging system may be fixed and/or portable and claimed subject matter is not limited in this regard.
[0323] At operation 2208, image data associated with the captured image may be generated.
[0324] At operation 2210, image data may be processed to identify fingerprint features in the captured image. The fingerprint features may be identified based on proximity to the structure or other distinguishing features.
[0325] At operation 2212, fingerprint features may be processed to generate a feature vector.
[0326] At operation 2214, an object identifier may be mapped to the image data and/or the one or more feature vectors in a database, as sketched below. In another embodiment, a structure and region may also be mapped to the object in the database.
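Operations 2202 through 2214 might be strung together as in the sketch below, which reuses the hypothetical prepare_area and grid_feature_vector helpers from the earlier sketches; the fixed-offset crop is only a stand-in for locating the region from a recognized structure:

```python
def locate_region(image, offset=(10, 10), size=(35, 40)):
    """Hypothetical stand-in for operations 2202-2204: a real system
    would find the region by its offset from a recognized structure
    (e.g., a stamped serial number); here we crop at a fixed offset."""
    r, c = offset
    h, w = size
    return image[r : r + h, c : c + w]

def enroll_object(image, database, object_id):
    """Operations 2206-2214: capture/crop the region, extract the
    feature vector, and map the object identifier to it."""
    region = locate_region(image)
    vector = grid_feature_vector(prepare_area(region))
    database[object_id] = vector
```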
[0327] FIG. 23 depicts an example of an authentication process 2300 for identifying and verifying an identity of an object. In an example, authentication process 2300 may comprise comparing a first feature vector and a second feature vector generated according to process 2200 described above. The first feature vector may be generated at a first time and the second feature vector may be generated at a later second time. An asymmetrical comparison model may be executed to compare the two feature vectors to compensate for changes to the object that are likely to have been sustained due to normal wear and tear. Thus, such changes may not substantially degrade a match if it can be determined that the change is due to normal wear. Such a feature recognition system may reduce a likelihood of a false negative match. Such a feature recognition system may be configured to determine which feature vector was derived from a later-in-time image in order to properly dampen the effects of normal wear and tear on the object.
[0328] In an example, authentication process 2300 may begin at operation 2302, where a first feature vector may be generated from an image of a selected region of an object including random features comprising at least one fingerprint feature suitable for extracting a feature vector. The random features may be proximate a distinguishable structure of the object. The first feature vector may be stored in a database associated with an object identifier.
[0329] At operation 2304, a second feature vector may be generated from a second image of the selected region. Importantly, exactly which feature set the feature vectors are derived from is not relevant so long as there is sufficient information in the resulting feature vector to tell how similar or dissimilar the original images were. Any feature set will do provided it meets that criterion.
[0330] At operation 2306, the first feature vector may be accessed from the database and identified as first in time. The second feature vector may be identified as second in time. In one embodiment, a date and/or time stamp may be accessed to determine which feature vector was taken first in time.
[0331] At operation 2308, the feature vectors may be compared to determine differences between the first feature vector and the second feature vector.
[0332] At operation 2310, differences between the feature vectors may be augmented (or modified). In one embodiment, only differences that exceed a threshold value may be augmented. In an example, augmenting (or modifying) comprises dampening or reducing differences between the first feature vector and second feature vector that are determined and/or likely to be caused by normal wear and tear or as a result of decay and/or other natural and unintentional causes. In an example, augmenting comprises enhancing or increasing differences between the first feature vector and second feature vector that are determined and/or likely not to be caused by normal wear and tear or as a result of decay and/or other natural and unintentional causes.
Operation 2310 may include dampening differences between vectors, enhancing differences between vectors, or both. A sketch of this asymmetric augmentation follows.
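In the sketch below, a per-element difference is classified as likely normal wear purely by its magnitude; that simplification, and the specific threshold and gain values, are assumptions for illustration rather than the comparison model of this disclosure.

```python
# Sketch of operations 2308-2310: compute differences, then dampen those
# likely due to normal wear and enhance those that are not. Threshold and
# gain values are illustrative assumptions.
import numpy as np

def augment_differences(v_earlier: np.ndarray, v_later: np.ndarray,
                        wear_threshold: float = 0.1,
                        dampen: float = 0.25,
                        enhance: float = 2.0) -> np.ndarray:
    diff = np.abs(v_later - v_earlier)              # operation 2308
    return np.where(diff <= wear_threshold,
                    diff * dampen,                  # likely normal wear
                    diff * enhance)                 # likely substantive change
```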
[0333] At operation 2312, a match correlation value may be calculated. In one embodiment, a match correlation value may be a sum of all of the difference values in the difference vector. In another embodiment, a match correlation value may be a sum of only those values exceeding a threshold value. There may be a variety of other ways to calculate a match correlation value, and claimed subject matter is not limited in this regard.
[0334] At operation 2314, a determination may be made whether the first feature vector and the second feature vector match. In an example, a match may be identified based on a predetermined threshold difference tolerance.
[0335] An indication that the objects match or do not match may be displayed at operation 2316.
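Operations 2312 through 2316 might then be sketched as reducing the augmented difference vector to a scalar and testing it against a tolerance; the tolerance and threshold values below are illustrative assumptions.

```python
# Sketch of operations 2312-2316: match correlation, match decision, display.
from typing import Optional

def is_match(augmented_diff, tolerance: float,
             value_threshold: Optional[float] = None) -> bool:
    # Operation 2312: sum all difference values, or only those over a threshold.
    if value_threshold is None:
        correlation = float(sum(augmented_diff))
    else:
        correlation = float(sum(d for d in augmented_diff if d > value_threshold))
    # Operation 2314: a match falls within the predetermined tolerance.
    return correlation <= tolerance

# Operation 2316: display the result (toy difference vector for illustration).
diffs = [0.02, 0.01, 0.90, 0.05]
print("Objects match" if is_match(diffs, tolerance=1.5) else "Objects do not match")
```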
[0336] The system, apparatus, methods, processes, and operations described above may use dedicated processor systems, microcontrollers, programmable logic devices, or microprocessors that may perform some or all of the operations described herein. Some of the operations described above may be implemented in software and other operations may be implemented in hardware. One or more of the operations, processes, or methods described herein may be performed by an apparatus, device, or system similar to those described herein and with reference to the illustrated figures.
[0337] The processing device may execute instructions or "code" stored in memory. The memory may store data as well. The processing device may include, but may not be limited to, an analog processor, a digital processor, a microprocessor, multi-core processor, processor array, network processor, etc.
The processing device may be part of an integrated control system or system manager, or may be provided as a portable electronic device configured to interface with a networked system either locally or remotely via wireless transmission.
[0338] The processor memory may be integrated together with the processing device, for example RAM or FLASH memory disposed within an integrated circuit microprocessor or the like. In other examples, the memory may comprise an independent device, such as an external disk drive, storage array, or portable FLASH key fob. The memory and processing device may be operatively coupled together, or in communication with each other, for example by an I/O port, network connection, etc., such that the processing device may read a file stored on the memory. Associated memory may be "read only" by design (ROM) or by virtue of permission settings, or not. Other examples of memory may include, but may not be limited to, WORM, EPROM, EEPROM, FLASH, etc., which may be implemented in solid state semiconductor devices. Other memories may comprise moving parts, such as a conventional rotating disk drive. All such memories may be "machine-readable" in that they may be readable by a processing device.
[0339] Operating instructions or commands may be implemented or embodied in tangible forms of stored computer software (also known as a "computer program" or "code"). Programs, or code, may be stored in a digital memory that may be read by the processing device. "Computer-readable storage medium" (or alternatively, "machine-readable storage medium") may include all of the foregoing types of memory, as well as new technologies that may arise in the future, as long as they may be capable of storing digital information in the nature of a computer program or other data, at least temporarily, in such a manner that the stored information may be "read" by an appropriate processing device. The term "computer-readable" may not be limited to the historical usage of "computer" to imply a complete mainframe, mini-computer, desktop, or even laptop computer. Rather, "computer-readable" may comprise a storage medium that may be readable by a processor, processing device, or any computing system. Such media may be any available media that may be locally and/or remotely accessible by a computer or processor, and may include volatile and non-volatile media, and removable and non-removable media.
[0340] A program stored in a computer-readable storage medium may comprise a computer program product. For example, a storage medium may be used as a convenient means to store or transport a computer program. For the sake of convenience, the operations may be described as various interconnected or coupled functional blocks or diagrams. However, there may be cases where these functional blocks or diagrams may be equivalently aggregated into a single logic device, program or operation with unclear boundaries.
[0341] Having described and illustrated the principles of a preferred embodiment, it should be apparent that the examples may be modified in arrangement and detail without departing from such principles. We claim all modifications and variations coming within the spirit and scope of the following claims.
Claims (30)
1. A machine-implemented method comprising:
capturing a digital image of a selected region on an item of currency, wherein the digital image has sufficient resolution to allow recognition of characters of a serial number of the currency if the serial number is within the selected region, and also of sufficient resolution to show elements of a grain surface of inter-character and intra-character regions of the serial number;
recognizing a serial number of the item from the digital image;
processing the digital image to locate at least one fingerprint feature;
storing data identifying the fingerprint feature in a first feature vector;
storing the first feature vector and the digital image in a database in association with the serial number.
2. The method of claim 1 and further comprising storing a second digital image in the database in association with a matched serial number of the item.
3. The method of claim 1 wherein at least one of the first digital image, the first feature vector, and the object serial number is stored in encrypted form in the database.
4. The method of claim 1 wherein the item of currency comprises a coin.
5. A machine-implemented method comprising:
capturing a digital image of a region including an identifiable structure of an item of currency;
extracting data representing at least one fingerprint feature from the digital image;
storing fingerprint feature data in a feature vector in memory; and storing the digital image and the feature vector in association with an identifier of the item.
6. The method of claim 5 wherein the digital image is a high-resolution color image, detailed enough to show elements of a grain surface of inter-character and intra-character regions proximate to the identifiable structure.
7. The method of claim 6 wherein the item comprises a paper currency, and wherein the identifiable structure is an alphanumeric identifier.
8. The method of claim 6 wherein the item comprises a coin, and wherein the identifiable structure is an alphanumeric identifier.
9. The method of claim 8 and further comprising transmitting the encrypted digital image and the encrypted fingerprint data and the alphanumeric identifier of the item to an inventory control system in association with one another.
10. The method of claim 5 wherein the digital image is detailed enough to reveal variations, imperfections and/or flaws in the individual characters within an alphanumeric identifier.
11. The method of claim 5 wherein the fingerprint feature is a natural product of manufacture of the item or manufacture of a material or piece incorporated into the item.
12. The method of claim 11 wherein the fingerprint feature comprises metal crystal structure defects or variations, the fingerprint feature data indicating a location of at least one defect or variation in a crystalline structure of a metal region.
13. The method of claim 5 wherein the fingerprint feature comprises metal machining or milling marks presumably resulting from manufacture of a material or piece incorporated into the item.
14. The method of claim 5 wherein the region is selected based on a predetermined offset from the identifiable structure on the item and further comprising identifying the fingerprint feature based on an offset from the identifiable feature, and storing data identifying the region in a fingerprint database in association with an alphanumeric identifier of the item.
15. The method of claim 5 and further comprising encrypting the data identifying said region where the fingerprint feature is located.
16. A system comprising:
at least one processor; and at least one computer-readable storage medium communicatively coupled to the at least one processor and which stores processor-executable instructions which, when executed by the at least one processor, cause the at least one processor to:
receive a digital image of a selected region on an item, wherein the digital image has sufficient resolution to allow recognition of characters of a serial number of the item if the serial number is within the selected region, and also of sufficient resolution to show elements of a grain surface of inter-character and intra-character regions of the serial number;
recognize a serial number of the item from the digital image;
process the digital image to locate at least one fingerprint feature;
store data identifying the fingerprint feature in a first feature vector; and store the first feature vector and the digital image in a database in association with the serial number.
17. The system of claim 16 wherein, when executed by the at least one processor, the processor-executable instructions cause the at least one processor further to:
store a second digital image in the database in association with a matched serial number of the item.
18. The system of claim 16 wherein, when executed by the at least one processor, the processor-executable instructions cause the at least one processor to store the at least one of the first digital image, the first feature vector, and the object serial number in encrypted form in the database.
19. The system of claim 16 wherein the item comprises one of a coin, a banknote, or a weapon.
20. A system comprising:
at least one processor; and at least one computer-readable storage medium communicatively coupled to the at least one processor and which stores processor-executable instructions which, when executed by the at least one processor, cause the at least one processor to:
receive a digital image of a region including an identifiable structure of an item;
extract data representing at least one fingerprint feature from the digital image;
store fingerprint feature data in a feature vector in memory; and store the digital image and the feature vector in association with an identifier of the item.
21. The system of claim 20 wherein the digital image is a high-resolution color image, detailed enough to show elements of a grain surface of inter-character and intra-character regions proximate to the identifiable structure.
22. The system of claim 21 wherein the item comprises a paper currency or a coin, and wherein the identifiable structure is an alphanumeric identifier.
23. The system of claim 21 wherein the item comprises a weapon, and wherein the identifiable structure is an alphanumeric identifier.
24. The system of claim 22 or 23 wherein, when executed by the at least one processor, the processor-executable instructions cause the at least one processor further to:
transmit the encrypted digital image and the encrypted fingerprint data and the alphanumeric identifier of the item to an inventory control system in association with one another.
25. The system of claim 20 wherein the digital image is detailed enough to reveal variations, imperfections and/or flaws in the individual characters within an alphanumeric identifier.
26. The system of claim 20 wherein the fingerprint feature is a natural product of manufacture of the item or manufacture of a material or piece incorporated into the item.
27. The system of claim 26 wherein the fingerprint feature comprises metal crystal structure defects or variations, the fingerprint feature data indicating a location of at least one defect or variation in a crystalline structure of a metal region.
28. The system of claim 20 wherein the fingerprint feature comprises metal machining or milling marks presumably resulting from manufacture of a material or piece incorporated into the item.
29. The system of claim 20 wherein the region is selected based on a predetermined offset from the identifiable structure on the item and wherein, when executed by the at least one processor, the processor-executable instructions cause the at least one processor to:
identify the fingerprint feature based on an offset from the identifiable feature, and store data that identifies the region in a fingerprint database in association with an alphanumeric identifier of the item.
30. The system of claim 20 wherein, when executed by the at least one processor, the processor-executable instructions cause the at least one processor further to:
encrypt the data identifying said region where the fingerprint feature is located.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2967584A CA2967584C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2825681A CA2825681C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
CA2967584A CA2967584C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2825681A Division CA2825681C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2967584A1 (en) | 2015-02-28 |
CA2967584C (en) | 2022-12-13 |
Family
ID=52580565
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2825681A Active CA2825681C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
CA2967584A Active CA2967584C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2825681A Active CA2825681C (en) | 2013-08-30 | 2013-08-30 | Object identification and authentication |
Country Status (1)
Country | Link |
---|---|
CA (2) | CA2825681C (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109598185B (en) * | 2018-09-04 | 2022-09-20 | 创新先进技术有限公司 | Image recognition translation method, device and equipment and readable storage medium |
EP3859597A1 (en) | 2020-01-31 | 2021-08-04 | U-NICA Systems AG | A computer implemented method and system of surface identification comprising scales |
CN114979794B (en) * | 2022-05-13 | 2023-11-14 | 深圳智慧林网络科技有限公司 | Data transmission method and device |
Also Published As
Publication number | Publication date |
---|---|
CA2825681A1 (en) | 2015-02-28 |
CA2825681C (en) | 2017-07-18 |
CA2967584A1 (en) | 2015-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8774455B2 (en) | Document fingerprinting | |
US11423641B2 (en) | Database for detecting counterfeit items using digital fingerprint records | |
US10614302B2 (en) | Controlled authentication of physical objects | |
US9646206B2 (en) | Object identification and inventory management | |
US7356162B2 (en) | Method for sorting postal items in a plurality of sorting passes | |
US9058543B2 (en) | Defined data patterns for object handling | |
US6886136B1 (en) | Automatic template and field definition in form processing | |
EP2869240A2 (en) | Digital fingerprinting object authentication and anti-counterfeiting system | |
EP3282391A1 (en) | Event-driven authentication of physical objects | |
US20040076320A1 (en) | Character recognition, including method and system for processing checks with invalidated MICR lines | |
CN103914680A (en) | Character image jet-printing, recognition and calibration system and method | |
CA2967584C (en) | Object identification and authentication | |
JPH11238097A (en) | Mail address prereader and address prereading method | |
CN107240185A (en) | A kind of crown word number identification method, device, equipment and storage medium | |
EP3809324A1 (en) | Securing composite objects using digital fingerprints | |
Gaikwad et al. | Automatic Indian New Fake Currency Detection | |
CN116756358A (en) | Electronic management method for flight manifest | |
US20040024716A1 (en) | Mail sorting processes and systems | |
CN116469090A (en) | Method and device for detecting code spraying pattern, electronic equipment and storage medium | |
US11640702B2 (en) | Structurally matching images by hashing gradient singularity descriptors | |
Zhao et al. | Ballot Tabulation Using Deep Learning | |
Honggang et al. | Bank check image binarization based on signal matching | |
KR101826640B1 (en) | ID card type determine device in identification reader and method thereof | |
Rao et al. | Extraction of Serial Number on Currency Notes Using LabVIEW. | |
JPS6379193A (en) | Character reader |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request |
Effective date: 20170517 |