US20210217129A1 - Detection of encoded signals and icons - Google Patents
- Publication number
- US20210217129A1 (U.S. application Ser. No. 17/107,346)
- Authority
- US
- United States
- Prior art keywords
- signal
- icon
- image
- encoded
- image data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
- G06T1/005—Robust watermarking, e.g. average attack or collusion attack resistant
- G06T1/0071—Robust watermarking, e.g. average attack or collusion attack resistant using multiple or alternating watermarks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06K—GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K7/00—Methods or arrangements for sensing record carriers, e.g. for reading patterns
- G06K7/10—Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
- G06K7/10544—Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation by scanning of the records by radiation in the optical part of the electromagnetic spectrum
- G06K7/10712—Fixed beam scanning
- G06K7/10722—Photodetector array or CCD scanning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
- G06T1/0028—Adaptive watermarking, e.g. Human Visual System [HVS]-based watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
- G06T1/005—Robust watermarking, e.g. average attack or collusion attack resistant
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
- G06T1/005—Robust watermarking, e.g. average attack or collusion attack resistant
- G06T1/0064—Geometric transform invariant watermarking, e.g. affine transform invariant
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/32—Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
- H04N1/32101—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
- H04N1/32144—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0061—Embedding of the watermark in each block of the image, e.g. segmented watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0065—Extraction of an embedded watermark; Reliable detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0202—Image watermarking whereby the quality of watermarked images is measured; Measuring quality or performance of watermarking methods; Balancing between quality and robustness
Definitions
- This disclosure relates to automatic identification of objects and icons, and related image signal processing.
- a pack is constructed with an over-wrap that obscures barcodes on individual items.
- the overwrap carries a separate barcode for the family pack.
- A conflict occurs when a scanner reads barcodes for both individual items and the family pack, or misses the barcode of the family pack.
- conflict also occurs when the scanner reads the barcode of the family pack and then individual items, without treating the individual items as part of the pack.
- the individual items are held in a carrying case that bears the barcode of the family pack. The individual items may be oriented to obscure their barcodes, yet those barcodes may still be visible to the scanner.
- the items within a pack may be different items that the retailer wishes to sell together or multiple instances of the same item in a group.
- each of the items contains a different barcode, which is also different than the group barcode.
- errors occur when the scanner provides decoded product codes for the individual items in the family pack.
- a package design file may encompass design elements, each bearing a different product code, which may conflict in some cases.
- the package design file may include references to artwork in other files, which is composited to produce the package design image prior to printing. In this image assembly process, conflicting codes may be incorporated from the artwork in the reference files. In the latter case, conflicting codes may be printed due to printing plates that apply imagery with conflicting codes. Also, printing may occur with plural print stages, in which a first print technology like flexography or offset applies a first design to a package substrate, and a second print technology like a digital offset or inkjet applies a second design to a package substrate.
- the scanner includes a processor that controls illumination and image capture by an imager of an object within its view volume.
- a processor executes a controller process to receive a detection result from a recognition unit for image frames captured of an object or objects in the view volume.
- the detection results acquired from sensing the object within a scan operation include an outer or inner code, or both.
- An example of an outer code is an identifier of a family pack or price change label; an example of an inner code is an identifier of a family pack member, or the product identifier of a product with a price change label attached.
- the controller analyzes the detection result by comparing it with state stored for a prior detection result during the scan operation, and determines whether to initiate one of plural types of waiting periods based on the type of detection result and on the comparison with the prior result in a state data structure.
- the controller sets the waiting period to control reporting of an outer code relative to an inner code on the package. It enforces a first type of waiting period and control logic to control reporting of an inner code after detection of an outer code and a second type of waiting period and control logic to delay reporting of an inner code until the second type of waiting period ends. Variations of the waiting period and control logic are described further below.
- One aspect of the disclosure is a smartphone comprising: an imager for capturing plural image frames of a package; a processor coupled to the imager; the processor configured to execute a controller process, the controller process comprising instructions executed by the processor to: analyze image data associated with an image frame captured by said imager, in which analyzing the image data detects the presence or absence of an icon and decodes a signal encoded within the image data; provide a first response when the signal is decoded but the icon is not detected; and provide a second, different response when the signal is decoded and the icon is detected.
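- The response logic of this aspect boils down to a two-branch decision per frame. A minimal sketch follows; the detector interfaces (decodeSignal, detectIcon) and surrounding types are illustrative placeholders, not names from this disclosure:

```cpp
#include <iostream>
#include <optional>
#include <string>
#include <vector>
#include <cstdint>

struct ImageData { int width = 0, height = 0; std::vector<uint8_t> gray; };

// Placeholder stubs; a real device would call its signal decoder and icon detector here.
std::optional<std::string> decodeSignal(const ImageData&) { return std::string("payload"); }
bool detectIcon(const ImageData&) { return false; }

enum class Response { None, First, SecondDifferent };

// Branch on the two detection results for one captured frame.
Response selectResponse(const ImageData& frame) {
    auto payload = decodeSignal(frame);
    if (!payload) return Response::None;                   // no signal decoded: no response
    return detectIcon(frame) ? Response::SecondDifferent   // signal decoded and icon detected
                             : Response::First;            // signal decoded, icon not detected
}

int main() {
    ImageData frame;
    std::cout << static_cast<int>(selectResponse(frame)) << "\n";  // prints 1 with the stubs above
}
```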
- Yet another aspect is a method of detecting the presence of an icon in imagery, the imagery captured by a camera integrated within a portable electronic device.
- the method comprises: using one or more cores of a multi-core processor, filtering the imagery to remove noise, said filtering yielding filtered imagery; detecting a plurality of contours within the filtered imagery, and for each of the plurality of contours, executing the following criteria checks: i) determining whether the contour is closed; ii) determining whether the contour comprises an area associated within a predetermined area range; and iii) determining whether the contour comprises a convex contour; outputting an indication that the contour comprises a candidate contour only when each of criteria i, ii and iii are satisfied.
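- As a rough illustration of these criteria checks, the sketch below uses OpenCV (an assumption; the disclosure does not name a library), with placeholder filter settings and an illustrative area range. Criterion i is approximated by requiring a traced, multi-vertex polygon, since contours traced from a binary edge map are returned as closed point sequences:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

std::vector<std::vector<cv::Point>> findCandidateContours(const cv::Mat& gray) {
    cv::Mat denoised, edges;
    cv::bilateralFilter(gray, denoised, 9, 75.0, 75.0);   // noise-removal filtering step
    cv::Canny(denoised, edges, 50.0, 150.0);              // edge map for contour tracing

    std::vector<std::vector<cv::Point>> contours, candidates;
    cv::findContours(edges, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

    const double kMinArea = 400.0, kMaxArea = 40000.0;    // illustrative predetermined area range
    for (const auto& c : contours) {
        std::vector<cv::Point> approx;
        cv::approxPolyDP(c, approx, 0.01 * cv::arcLength(c, true), true);
        bool closed  = approx.size() >= 3;                          // criterion i (approximated)
        double area  = cv::contourArea(approx);
        bool inRange = (area >= kMinArea && area <= kMaxArea);      // criterion ii
        bool convex  = closed && cv::isContourConvex(approx);      // criterion iii
        if (closed && inRange && convex)
            candidates.push_back(approx);                           // candidate contour
    }
    return candidates;
}
```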
- Additional aspects of the disclosure include control logic and associated methods for integration within automatic identification devices, and various configurations and types of recognition units and controller logic for determining when and how to handle responses when an icon is detected in the presence or absence of encoded signals.
- FIG. 1 is a system diagram illustrating components of a point of sale system in a retail store.
- FIG. 2 is a diagram illustrating a sequence of decode operations by a scanner.
- FIG. 3 is a diagram illustrating another sequence of decode operations by the scanner.
- FIG. 4 is a diagram of components in an imager based scanner.
- FIG. 5 is a diagram illustrating a processing architecture for controlling recognition units within a scanner.
- FIG. 6 is a diagram illustrating software modules that operate on a sequence of image frames to detect and extract digital payloads from images of objects within the frames.
- FIGS. 7A and 7B illustrate image portions of an object in different frames captured from a field of view of a scanner's imager.
- FIGS. 8A and 8B illustrate another example of image portions of an object in different frames captured from a field of view of a scanner's imager.
- FIG. 9 is a flow diagram of a controller process that resolves product identification conflicts.
- FIG. 10 is a block diagram of a signal encoder for encoding a digital payload signal into an image signal.
- FIG. 11 is a block diagram of a compatible signal decoder for extracting the digital payload signal from an image signal.
- FIG. 12 is a flow diagram illustrating operations of a signal generator.
- FIG. 13 is a diagram illustrating embedding of an auxiliary signal into host image signal.
- FIG. 14 is a flow diagram illustrating a method for decoding a payload signal from a host image signal.
- FIG. 15 is a rendition of a physical object including an icon and various encoded symbologies.
- FIG. 16A is a flow diagram showing cooperation of a signal decoder and an icon detector.
- FIG. 16B is a flow diagram showing cooperation of an icon detector and signal decoder.
- FIG. 17A is a flow diagram showing two stages associated with the icon detector of FIG. 16A .
- FIG. 17B is a flow diagram showing stage 1 of the icon detector shown in FIG. 17A .
- FIG. 17C is a flow diagram showing stage 2 of the icon detector shown in FIG. 17A .
- FIGS. 18A and 18B show an example MatLab script.
- FIG. 19 is a block diagram of an electronic device (e.g., a smartphone) that can be used to carry out the processes and features shown in FIGS. 16-17C and 20A-20E .
- FIG. 20A is a flow diagram for a process to detect candidate contours within image data.
- FIG. 20B is a flow diagram showing one embodiment of contour refinement.
- FIG. 20C is a flow diagram for icon matching of candidate contours.
- FIG. 20D is another flow diagram for icon matching of candidate contours.
- FIG. 20E is yet another flow diagram for icon matching of candidate contours.
- FIG. 21 shows a rotation angle for a minimum bounding box.
- FIGS. 22A and 22B show object evaluation within a block.
- FIG. 22C shows lines through a block which includes objects.
- FIGS. 23A and 23B show object evaluation within a block; and FIG. 23C shows remaining objects.
- FIG. 24A shows objects including tangent lines
- FIG. 24B shows other objects including tangent lines.
- FIGS. 25A-25D show signal encoding in, on and around various icons.
- FIG. 1 is a system diagram illustrating components of a point of sale system in a retail store.
- Each check-out station is equipped with a POS terminal 14 and scanner 12 .
- the scanner has a processor and memory and executes scanner firmware, as detailed further below.
- the POS terminal is a general purpose computer connected to the scanner via a standard cable or wireless interconnect, e.g., to connect the scanner directly to a serial port, keyboard port, USB port or like port of the POS terminal or through an interface device (e.g., a wedge).
- Each of the POS terminals are connected via a network to the store's back office system 16 .
- The Global Trade Identification Number (GTIN) plays a vital role within store operations, as it identifies products and acts as a database key to associate the product with product attributes including its name and price.
- the GTIN is assigned by the manufacturer of the item and encoded in the packaging, via a UPC Symbol and, preferably, a digital encoding that replicates the GTIN in two-dimensional tiles across the package design, as detailed further below.
- An example of tiled data encoding is the Digimarc Barcode data carrier from Digimarc Corporation of Beaverton, Oreg.
- the retailer's system has a database of item files for each of the products it sells.
- This item file includes various attributes of the item that the store uses to manage its operation, such as price, scanning description, department ID, food stamp information, tax information, etc.
- the POS terminal retrieves this information as needed from the back office by querying the database with the item identifier (e.g., a GTIN of the product provided by the scanner).
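- The GTIN-keyed lookup described above can be pictured as a simple keyed query against the item file database. The sketch below uses assumed field names; the actual item file schema is retailer-specific:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

struct ItemFile {                      // attributes kept in the back office item database
    std::string scanningDescription;
    uint32_t    priceCents = 0;
    uint16_t    departmentId = 0;
    bool        foodStampEligible = false;
    uint8_t     taxCode = 0;
};

class ItemDatabase {
public:
    void upsert(uint64_t gtin, ItemFile item) { items_[gtin] = std::move(item); }

    // Keyed lookup performed for each code the scanner reports to the POS terminal.
    std::optional<ItemFile> lookup(uint64_t gtin) const {
        auto it = items_.find(gtin);
        if (it == items_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<uint64_t, ItemFile> items_;
};
```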
- A barcode, preferably the Digimarc Barcode data carrier, is used to convey family pack identifiers and price change codes on packaging.
- the retailer or manufacturer assigns a GTIN as the product identifier of the pack, and creates an associated item file for that pack.
- the GTIN is encoded in a conventional barcode and/or the Digimarc Barcode data carrier applied to the over-wrap or carrier of the pack.
- the Digimarc Barcode data carrier is advantageous because it replicates the GTIN across the package to provide more efficient and reliable decoding of a GTIN, and has additional data capacity to carry one or more flags indicating to the scanner that family pack or price change processing logic applies.
- Barcodes and in particular, Digimarc Barcode data carriers, are preferably used to convey price change information in labels applied to product packaging.
- Price changes are usually of one of the following two types: a discount code, or a new fixed price.
- the discount code references a monetary amount to be reduced from the price assigned to the item's GTIN.
- the code references a new fixed price that replaces the price assigned to the item's GTIN.
- the Digimarc Barcode data carrier also includes a flag indicating that price change processing logic applies in the scanner.
- the price change label may have other detectable properties, such as a color or spectral composition, shape, RFID tag, image template, or marking that the scanner's recognition unit(s) can detect.
- Price changes are typically managed by department within a retailer. This enables the managers of the departments, such as the bakery, meat, produce and deli departments, to determine when and how much to discount items that they wish to move from their inventory.
- the price change information includes a department identifier, enabling the retailer's system to track the price change to the department.
- the new fixed price or price change may be encoded directly in the digital payload of the data carrier printed on the price change label. Alternatively, the fixed price or discount may be stored in an item record and looked up by the POS using the code decoded from the payload.
- a GTIN identifying a product or class of products to which the price change applies may be included in the payload of the data carrier on the product as well.
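- Taken together, the preceding passages suggest a data carrier payload carrying processing flags, a department identifier, either a fixed price or a discount amount, and optionally a target GTIN. The struct below is only an illustrative layout under those assumptions; the disclosure does not fix field sizes or ordering:

```cpp
#include <cstdint>
#include <optional>

struct CarrierPayloadFields {
    bool     familyPackFlag = false;     // family pack processing logic applies
    bool     priceChangeFlag = false;    // price change processing logic applies
    uint8_t  departmentId = 0;           // e.g., bakery, meat, produce, deli
    bool     isFixedPrice = false;       // fixed price vs. discount code
    uint32_t priceOrDiscountCents = 0;   // new fixed price, or amount to subtract
    std::optional<uint64_t> targetGtin;  // product (or class) the change applies to, if encoded
};
```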
- the product information is printed by a label printer within the store, e.g., a label printer within a scale, which is used to weigh and print a label for a variable weight item.
- the GTIN format includes fields used to encode the variable nature of such items by encoding a variable amount (e.g., variable weight) or a variable price.
- this GTIN is encoded on the label with a Digimarc Barcode data carrier, though conventional barcodes may also be used.
- Variable items are a prime example of items that often are subject to price changes.
- a label with the price change is applied to the item as described above. This label may be applied over the prior label to obscure it, or may be applied next to it.
- the label printer in the store may be configured to print a price change label, which fits over the original label, or complements it. In either case, the scanner decodes the code or codes it detects on the package, and its processing logic issues the correct product and pricing information to the POS system.
- the back office system maintains a database of item file information in its memory (persistent and volatile memory (e.g., RAM), as needed). It uses the GTIN to associate a product with the product attributes and retrieves these attributes and delivers them to the scanning application software of the POS terminal in response to database queries keyed by the GTIN or like item code. Item files are also created for family pack items and price change labels.
- the item database is mirrored within the POS terminals of the retail store, and each POS terminal executes item look up operations within its local copy of the item database.
- the POS scanning application software obtains the output of the scanner, which is comprised of the recognized codes, e.g., GTIN, price change code, or like code. It then does a look up, either locally or via the back office to get related attributes for each code. With these attributes, the POS software executes typical POS functions, such as displaying product name and price during check-out, tabulating total price, with taxes and discounts, coupons, etc.; managing payment, and generating a receipt. Importantly, the POS software need not be modified to handle family pack configurations and price changes. Instead, the scanner logic resolves potential code scanning conflicts and reports the resolved code or codes in a fashion that the POS terminal is accustomed to seeing.
- a scanning application executes within each of the store's POS terminals. This application is responsible for obtaining the codes reported by the scanner hardware and performing the attribute look up operation. It receives each code from the scanner, in response to the scanner decoding a UPC or Digimarc Barcode data carrier during check-out. A processor in the scanner executes firmware instructions loaded from memory to perform these decoding operations.
- FIGS. 2 and 3 are diagrams illustrating sequencing of decode operations to set the stage for the processing logic that interprets the sequence.
- the scanner executes recognition operations on image frames captured while a product package or packages move through its field of view. From mere decoding of conventional barcodes, it is not determinable whether the barcodes originate from the same or different objects. To address this, we have incorporated new features in encoding on the package and logic within the scanner.
- the inner barcode corresponds to a barcode of an individual item in a family pack or the original barcode on a package, before a price change label is added.
- the “outer barcode” corresponds to a barcode of the family pack or a price change label.
- the family pack code may indeed be outside the member item code (e.g., in the case of an over-wrap), it need not be. The same is true for the price change label relative to the original barcode on a product.
- Inner and outer barcodes are examples of a broader category of inner and outer codes detected by the scanner. These codes may be detected by image recognition methods, of which optical code reading is a subset. Other forms of image recognition are feature extraction and matching and template matching (e.g., a price change label template), to name two examples. They may also be detected by other sensor types, such as RFID, and a combination of sensor input, e.g., weight from a scale (e.g., to distinguish a family pack from a family pack member), geometric features from image feature extraction (including depth from a depth sensor), and spectral information (color such as a color histogram of a detected object, or pixel samples from spectral bands obtained by multi-spectral illumination and/or multi-spectral filters).
- FIG. 2 illustrates a sequence in which decoding of an inner barcode precedes an outer barcode.
- When the scanner decodes an inner barcode, it does not immediately report it. Instead, it pauses for a predetermined delay, e.g., in the range of around 500 ms.
- the amount of this delay may be specified in relative or absolute time by a flag in the data carrier (namely, in the digital data encoded in the family pack or family pack member). If the next barcode is an outer barcode of a family pack, the scanner logic reports only the GTIN for the family pack.
- the scanner logic reports it.
- The scanner logic that controls which code or codes to report depends on whether the price change is a fixed price or a discount code. For a fixed price code, the fixed price code replaces the code from the inner barcode, as it provides the code that the POS terminal uses to query the back office database for the new price. For a discount code, the logic causes the scanner to report the discount code as well as the code from the first detected barcode that triggered the waiting period.
- data flags are encoded in the inner and/or outer barcode data carriers to signal to the scanner that an outer barcode may accompany the inner barcode.
- the inner barcode of FIG. 2 signals that it is part of a family pack, which in turn triggers a waiting period for the scanner to detect an outer barcode. If no outer barcode is decoded in the waiting period, then the scanner reports the inner barcode to the POS terminal.
- FIG. 3 illustrates a sequence in which decoding of an outer barcode precedes an inner barcode.
- This sequence may occur, for example, following the decoding of the outer barcode of FIG. 2 .
- the scanner logic similarly waits for a predetermined period of time (e.g., 500 ms).
- a barcode decoded in the waiting period is ignored if a family pack flag is set because a barcode detected in this waiting period is deemed to be from the same family pack.
- the time range for the waiting period may vary with the device, as each device has different image capture systems, with different field of view parameters, which govern the number and type of views captured of an object or group of objects as they are scanned in the scanner view volume.
- Checker usage patterns also govern the waiting period, as they also impact movement of objects through the view volume, and/or how the checker employs the scanner to image objects.
- the waiting period can range from around 300 ms to 1.5 seconds.
- the logic depends on the type of price change. For a fixed price code detected as the outer barcode of FIG. 3 , an inner barcode detected in the waiting period is ignored. For a discount code, the inner barcode detected in the waiting period is reported.
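- The waiting-period behavior of FIGS. 2 and 3 can be summarized as a small state machine. The sketch below is a simplification under assumed names (the code-type taxonomy and a single 500 ms window are illustrative; a discount code that is never paired with a product code is not handled):

```cpp
#include <chrono>
#include <optional>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

enum class CodeType { Inner, OuterFamilyPack, OuterFixedPrice, OuterDiscount };
struct Detection { CodeType type; std::string code; };

class ConflictResolver {
public:
    explicit ConflictResolver(std::chrono::milliseconds wait = std::chrono::milliseconds(500))
        : wait_(wait) {}

    // Returns the code(s) to report to the POS terminal for this detection, if any.
    std::vector<std::string> onDetection(const Detection& d, Clock::time_point now) {
        expire(now);
        std::vector<std::string> report;
        switch (d.type) {
        case CodeType::Inner:
            if (held_ && (held_->type == CodeType::OuterFamilyPack ||
                          held_->type == CodeType::OuterFixedPrice)) {
                // FIG. 3: an inner code inside the window after a family pack or
                // fixed price code is deemed part of the same package and is ignored.
            } else if (held_ && held_->type == CodeType::OuterDiscount) {
                report = { held_->code, d.code };    // FIG. 3: report discount plus product code
                held_.reset();
            } else {
                held_ = d;                           // FIG. 2: hold the inner code and
                deadline_ = now + wait_;             // wait for a possible outer code
            }
            break;
        case CodeType::OuterFamilyPack:
        case CodeType::OuterFixedPrice:
            report = { d.code };                     // outer code replaces any held inner code
            held_ = d; deadline_ = now + wait_;      // suppress inner codes during the window
            break;
        case CodeType::OuterDiscount:
            if (held_ && held_->type == CodeType::Inner) {
                report = { d.code, held_->code };    // FIG. 2: discount plus the triggering code
                held_.reset();
            } else {
                held_ = d; deadline_ = now + wait_;
            }
            break;
        }
        return report;
    }

    // Polled periodically: if the window lapses with an inner code still held, report it.
    std::optional<std::string> onTick(Clock::time_point now) {
        if (held_ && held_->type == CodeType::Inner && now >= deadline_) {
            std::string code = held_->code;
            held_.reset();
            return code;
        }
        expire(now);
        return std::nullopt;
    }

private:
    void expire(Clock::time_point now) {
        if (held_ && held_->type != CodeType::Inner && now >= deadline_) held_.reset();
    }
    std::chrono::milliseconds wait_;
    std::optional<Detection> held_;
    Clock::time_point deadline_{};
};
```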
- Image based scanners typically fall into two classes: fixed and hand-held.
- Fixed scanners are designed to be integrated within a check-out station, at which the operator or a conveyor moves items in the field of the scanner's image capture system.
- the image capture system is comprised of optical elements, such as a lens, mirror(s), beam splitter(s), 2D imager (e.g., CMOS camera), which together enable capture of plural views of an object that are combined into a single frame.
- an illumination source is also included to illuminate the object for each capture. See, e.g., US Publication Nos. 20090206161 and 20130206839, which are incorporated by reference.
- Hand-held scanners are, as the name implies, designed to be held in the hand and pointed at objects. They have different optical systems adapted for this type of capture, including lens, sensor array adapted for capturing at varying distances, as well as illumination source for illuminating the object at these distances.
- image based systems capture frames at rates in the range of around 10 to 90 frames per second.
- processing of a frame must be complete prior to the arrival of the next frame.
- the scanner processing unit or units have from 10 to 100 ms to decode at least one code and perform other recognition operations, if included.
- image processing of image frames is governed by time constraints, not strictly frames.
- the processing unit or units within the device process frames concurrently, but when processing capacity is reached, some frames are dropped, and processing resumes on subsequent frames when processing capacity is available.
- This type of resource management is sometimes employed opportunistically in response to detecting an object in the view volume of the scanner's imaging system. For example, as a new object enters the view volume, an image process executing within the scanner detects it and launches decoding processes on subsequent frames.
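- One simple realization of this budget-driven behavior, sketched below with assumed structures, is a bounded frame queue: the imager interface pushes frames, the recognition units pull them, and frames arriving while the queue is full are dropped:

```cpp
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

struct Frame { std::vector<uint8_t> pixels; int64_t captureTimeMs = 0; };

class FrameQueue {
public:
    explicit FrameQueue(std::size_t capacity) : capacity_(capacity) {}

    // Called per captured frame; returns false when the frame is dropped for lack of capacity.
    bool push(Frame f) {
        if (frames_.size() >= capacity_) { ++dropped_; return false; }
        frames_.push_back(std::move(f));
        return true;
    }

    // Called by a recognition unit when it is ready for another frame.
    std::optional<Frame> pop() {
        if (frames_.empty()) return std::nullopt;
        Frame f = std::move(frames_.front());
        frames_.pop_front();
        return f;
    }

    std::size_t droppedCount() const { return dropped_; }

private:
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    std::deque<Frame> frames_;
};
```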
- FIG. 4 is a diagram of components in an imager based scanner. Our description is primarily focused on fixed, multi-plane imager based scanner. However, it is not intended to be limiting, as the embodiments may be implemented in other imaging devices, such as hand-held scanners, smartphones, tablets, machine vision systems, etc.
- the scanner has a bus 100 , to which many devices, modules, etc. (each of which may be generically referred to as a “component”) are communicatively coupled.
- the bus 100 may combine the functionality of a direct memory access (DMA) bus and a programmed input/output (PIO) bus.
- the bus 100 facilitates both DMA transfers and direct processor read and write instructions.
- the bus 100 is one of the Advanced Microcontroller Bus Architecture (AMBA) compliant data buses.
- Although FIG. 4 illustrates an embodiment in which all components are communicatively coupled to the bus 100 , one or more components may be communicatively coupled to a separate bus, and may be communicatively coupled to two or more buses.
- the scanner can optionally include one or more bus controllers (e.g., a DMA controller, an I2C bus controller, or the like or combination thereof), through which data can be routed between certain of the components.
- the scanner also includes at least one processor 102 .
- the processor 102 may be a microprocessor, mobile application processor, etc., known in the art (e.g., a Reduced Instruction Set Computer (RISC) from ARM Limited, the Krait CPU product-family, X86-based microprocessor available from the Intel Corporation including those in the Pentium, Xeon, Itanium, Celeron, Atom, Core i-series product families, etc.).
- the processor may also be a Digital Signal Processor (DSP) such as the C6000 DSP category from Texas Instruments.
- FIG. 4 shows a second processor behind processor 102 to illustrate that the scanner may have plural processors, as well as plural core processors.
- Other components on the bus 100 may also include processors, such as DSP or microcontroller.
- processor architectures used in current scanner technology include, for example, ARM (which includes several architecture versions), Intel, and TI C6000 DSP. Processor speeds typically range from 400 MHz to 2+ GHz.
- Some scanner devices employ ARM NEON technology, which provides a Single Instruction, Multiple Data (SIMD) extension for a class of ARM processors.
- the processor 102 runs an operating system of the scanner, runs application programs, and manages the various functions of the device.
- the processor 102 may include or be coupled to a read-only memory (ROM) (not shown), which stores an operating system (e.g., a “high-level” operating system, a “real-time” operating system, a mobile operating system, or the like or combination thereof) and other device firmware that runs on the scanner.
- the scanner also includes a volatile memory 104 electrically coupled to bus 100 (also referred to as dynamic memory).
- the volatile memory 104 may include, for example, a type of random access memory (RAM).
- the scanner includes a memory controller that controls the flow of data to and from the volatile memory 104 .
- Current scanner devices typically have around 500 MB of dynamic memory, and provide a minimum of 8 KiB of stack memory for certain recognition units.
- For the watermark processor, which is implemented as an embedded system SDK, for example, it is recommended that the scanner have a minimum of 8 KiB of stack memory for running the embedded system SDK.
- the scanner also includes a storage memory 106 connected to the bus.
- the storage memory 106 typically includes one or more non-volatile semiconductor memory devices such as ROM, EPROM and EEPROM, NOR or NAND flash memory, or the like or combinations thereof, and may also include alternative storage devices, such as, for example, magnetic or optical disks.
- the storage memory 106 is used to store one or more items of software.
- Software can include system software, application software, middleware, one or more computer files (e.g., one or more data files, configuration files, library files, archive files, etc.), one or more software components, or the like or stack or other combination thereof.
- system software examples include operating systems (e.g., including one or more high-level operating systems, real-time operating systems, mobile operating systems, or the like or combination thereof), one or more kernels, one or more device drivers, firmware, one or more utility programs (e.g., that help to analyze, configure, optimize, maintain, etc., one or more components of the scanner), and the like.
- operating systems for scanners include but are not limited to Windows (multiple versions), Linux, iOS, Quadros, and Android.
- Compilers used to convert higher level software instructions into executable code for these devices include: Microsoft C/C++, GNU, ARM, and Clang/LLVM.
- Examples of compilers used for ARM architectures are RVDS 4.1+, DS-5, CodeSourcery, and Greenhills Software.
- the imager interface 108 connects one or more imagers 110 to bus 100 .
- the imager interface supplies control signals to the imagers to capture frames and communicate them to other components on the bus.
- the imager interface also includes an image processing DSP that provides image processing functions, such as sampling and preparation of groups of pixel regions from the 2D sensor array (blocks, scanlines, etc.) for further image processing.
- the DSP in the imager interface may also execute other image pre-processing, recognition or optical code reading instructions on these pixels.
- the imager interface 108 also includes memory buffers for transferring image and image processing results to other components on the bus 100 .
- each imager 110 is comprised of a digital image sensor (e.g., CMOS or CCD) or like camera having a two-dimensional array of pixels.
- the sensor may be a monochrome or color sensor (e.g., one that employs a Bayer arrangement), and operate in a rolling and/or global shutter mode.
- Examples of these imagers include model EV76C560 CMOS sensor offered by e2v Technologies PLC, Essex, England, and model MT9V022 sensor offered by On Semiconductor of Phoenix, Ariz.
- Each imager 110 captures an image of its view or views of a view volume of the scanner, as illuminated by an illumination source.
- the imager captures at least one view.
- Plural views (e.g., view 1 112 and view 2 114 ) may be captured per frame; optical elements such as mirrors and beam splitters are used to direct light reflected from different sides of an object in the view volume to the imager.
- Also coupled to the bus 100 is an illumination driver 116 that controls illumination sources 118 .
- Typical scanners employ Light Emitting Diodes (LEDs) as illumination sources.
- red LEDs are paired with a monochrome camera.
- the illumination driver applies signals to the LEDs to turn them on in a controlled sequence (strobe them) in synchronization with capture by an imager or imagers.
- plural different color LEDs may also be used and strobed in a manner such that the imager(s) selectively capture images under illumination from different color LED or sets of LEDs. See, e.g., Patent Application Publication Nos.
- 20130329006 entitled COORDINATED ILLUMINATION AND IMAGE SIGNAL CAPTURE FOR ENHANCED SIGNAL DETECTION
- 20160187199 entitled SENSOR-SYNCHRONIZED SPECTRALLY-STRUCTURED-LIGHT IMAGING, which are hereby incorporated by reference.
- the latter captures images in plural different spectral bands beyond standard RGB color planes, enabling extraction of encoded information as well as object recognition based on pixel samples in narrower spectral bands at, above and below the visible spectrum.
- a broadband illumination source is flashed and image pixels in different bands, e.g., RGB, are captured with a color image sensor (e.g., such as one with a Bayer arrangement).
- the illumination driver may also strobe different sets of LED that are arranged to illuminate particular views within the view volume (e.g., so as to capture images of different sides of an object in the view volume).
- a further extension of scanner capability is to include an RGB+D imager, which provides a depth measurement in addition to Red, Green and Blue samples per pixel.
- the depth sample enables use of object geometry to assist in product identification.
- the scanner also includes at least one communications module 118 , each comprised of circuitry to transmit and receive data through a wired or wireless link to another device or network.
- a communication module is a connector that operates in conjunction with software or firmware on the scanner to function as a serial port (e.g., RS232), a Universal Serial Bus (USB) port, or an IR interface.
- Another example of a communication module in a scanner is a universal interface driver application specific integrated circuit (UIDA) that supports plural different host interface protocols, such as RS-232C, IBM46XX, or Keyboard Wedge interface.
- the scanner may also have communication modules to support other communication modes, such as USB, Ethernet, Bluetooth, WiFi, infrared (e.g., IrDa) or RFID communication.
- The scanner may also include a sensor interface module 122 communicatively coupled to one or more sensors 124 .
- Some scanner configurations have a scale for weighing items, and other data capture sensors such as RFID or NFC readers or the like for reading codes from products, consumer devices, payment cards, etc.
- the sensor interface module 130 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers to store and communicate control and data signals to and from the sensor.
- the scanner may be equipped with a variety of user input/output devices, connected to the bus 100 via a corresponding user I/O interface 126 .
- Scanners, for example, provide user output in the form of a read indicator light or sound, and thus have an indicator light or display 128 and/or speaker 130 .
- the scanner may also have a display and display controller connecting the display device to the bus 100 .
- the scanner has a touch screen for both display and user input.
- FIG. 5 is a diagram illustrating a processing architecture for controlling recognition units within a scanner.
- the processing architecture comprises a controller and recognition units.
- Each of these elements is a logical processing module implemented as a set of instructions executing on a processor in the scanner, or implemented in an array of digital logic gates, such as a Field Programmable Gate Array (FPGA) or Application Specific Integrated Circuit (ASIC).
- Each of the modules may operate within a single component (such as a processor, FPGA or ASIC), within cores of a plural core processor, or within two or more components that are interconnected via the bus 100 or other interconnect between components in the scanner hardware of FIG. 4 .
- the implementer may create the instructions of each module in a higher level programming language, such as C/C++ and then port them to the particular hardware components in the scanner architecture of choice.
- the controller 140 is responsible for sending recognition tasks to recognition units ( 142 , 144 and 146 ), getting the results of those tasks, and then executing logic to determine the item code to be sent to the host POS system of the scanner.
- the controller module 140 communicates with the recognition units ( 142 - 146 ) via communication links 148 , 150 , 152 .
- the manner in which the controller communicates with the recognition units depends on the implementation of each.
- the controller communicates through a memory buffer, e.g., via the bus 100 .
- The particular form of inter-process communication (IPC) depends in part on the operating system executing in the scanner.
- IPC may be implemented with sockets.
- Windows based Operating Systems from Microsoft Corp. also provide an implementation of sockets for IPC.
- controller and recognition units may be implemented within a single software process in which communication among software routines within the process is implemented with shared memory.
- the software program of each recognition unit may be executed serially and report its results back to the controller.
- Recognition units may also be executed as separate threads of execution.
- the operating system running in the scanner manages pre-emptive multi-tasking and multi-threading (if employed) for software processes and threads.
- The operating system also manages concurrent execution of processes on processors in scanners where more than one processor is available for the controller, recognition units, and other image processing.
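- Where recognition units run as threads within one process, the controller can dispatch an image block to each unit and gather results as they complete. The sketch below is an assumed single-process arrangement using standard C++ futures, not the scanner's actual IPC:

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// A recognition unit maps an image block to zero or more decoded/recognized codes.
using RecognitionUnit =
    std::function<std::vector<std::string>(const std::vector<unsigned char>&)>;

std::vector<std::vector<std::string>> runRecognitionUnits(
        const std::vector<RecognitionUnit>& units,
        const std::vector<unsigned char>& imageBlock) {
    std::vector<std::future<std::vector<std::string>>> pending;
    pending.reserve(units.size());
    for (const auto& unit : units)
        pending.push_back(std::async(std::launch::async, unit, std::cref(imageBlock)));

    std::vector<std::vector<std::string>> results;
    results.reserve(pending.size());
    for (auto& f : pending)
        results.push_back(f.get());   // controller gathers each unit's result for its logic
    return results;
}
```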
- a recognition unit executes instructions on an image block provided to it to recognize an object or objects in the image block and return a corresponding recognition result.
- the recognition result comprises the digital payload extracted from the carrier, which may be formatted as a string of binary or M-ary symbols or converted to a higher level code such as a GTIN data structure in accordance with the GS1 specification for GTINs.
- Recognition units that perform optical code reading include, for example, optical code readers for 1-dimensional optical codes like UPC, EAN, Code 39, Code 128 (including GS1-128), stacked codes like DataBar stacked and PDF417, or 2-dimensional optical codes like a DataMatrix, QR code or MaxiCode.
- Some scanners also have varying levels of object recognition capability, in which the recognition process entails feature extraction and classification or identification based on the extracted features.
- Some of these types of recognition processes provide attributes of an item or label, or a class of the product or label. Attributes of the item include color (e.g., a color histogram) or geometry (such as position, shape, bounding region or other geometric attributes).
- the attributes may be further submitted to a classifier to classify an item type. The controller combines this information with other recognition results or sensor input to disambiguate plural codes detected from an object in the view volume.
- the scanner may have more sophisticated object recognition capability that is able to match extracted features with a feature database in memory and identify a product based on satisfying match criteria. This technology is described further below.
- the recognition units may also operate on other sensed data. Examples include decoding of an RFID tag based on sensed RF signal input, and weight attributes from a scale.
- FIG. 6 is a diagram illustrating software modules 160 , 162 that operate on a sequence of image frames 164 to detect and extract digital payloads from images of objects within the frames.
- Controller 160 is an example of a controller 140 in the architecture of FIG. 5 .
- This diagram illustrates the interaction of a controller with one particular implementation of a recognition unit 162 .
- the controller 160 and recognition unit are software processes. In one embodiment, they execute on distinct processors within the scanner. For example, they execute either in the separate processors 102 , 102 a , or the controller executes in processor 102 and recognition unit executes in a processor within the imager interface 108 (e.g., DSP). In another embodiment, they execute within the same processor, e.g., processor 102 , or within a DSP in the imager interface 108 .
- the controller executes in processor 102 , and the instructions of the recognition unit are implemented within an FPGA or ASIC, which is part of another component, such as the imager interface, or a separate component on bus 100 .
- the software process of the recognition unit 162 performs a form of recognition that employs digital watermark decoding to detect and extract watermark payloads from encoded data tiles in the image frames 164 .
- the term, “frame,” refers to a group of pixels read from a 2D sensor array for a time period in which a 2D image is captured on the sensor array. Recall that the sensor may operate in rolling shutter or global shutter mode.
- selected rows of the sensor array are sampled during a capture period and stored in a memory buffer (e.g., in the imager interface), which is accessed by the recognition unit(s).
- an entire frame of all pixels in the sensor array is sampled and stored in a frame buffer, which is then accessed by the recognition unit(s).
- the group of pixels sampled from a frame may include plural views of the viewing volume, or a part of the viewing volume.
- the recognition unit 162 has the following sub-modules of instructions: interface 166 and watermark processors 168 , 170 , 172 .
- the interface comprises software code for receiving calls from the controller and returning recognition results from shared memory of the software process of the recognition unit 162 .
- Watermark processors are instances of watermark decoders.
- The controller 160 invokes the recognition unit 162 on image frames containing the object. Via interface 166 , the controller 160 calls the recognition unit 162 , providing the frames 164 by supplying an address of or pointer to them in the memory of the scanner (an image buffer, e.g., in either volatile memory 104 or memory buffers in the imager interface 108 ). It also provides other attributes, such as attributes of the view from which the frame originated.
- the recognition unit proceeds to invoke a watermark processor 168 - 172 on frames in serial fashion. Watermark processors 1 - 3 operate on frames 1 - 3 , and then process flow returns to watermark processor 1 for frame 4 , and so on. This is just one example of a serial process flow implementation. Alternatively, watermark processors may be executed concurrently within a process as threads, or executed as separate software processes, each with an interface and watermark processor instance.
- the recognition unit 162 provides the extracted payload results, if any, for each frame via communication link as described above.
- the controller analyzes the results from the recognition unit and other recognition units and determines when and what to report to the POS terminal.
- Each watermark processor records in shared memory of the recognition unit 162 its result for analyzing the image block assigned to it. This result is either a no-detect, or a successful read along with the decoded payload or payloads (in the event that distinct payloads are detected within a frame).
- the watermark processor provides orientation parameters of the decoded payload, which provide geometric orientation and/or position of the tile or tiles from which the payload is decoded.
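- The serial, round-robin dispatch described above (processor 1 on frame 1, processor 2 on frame 2, processor 3 on frame 3, then back to processor 1 for frame 4) can be sketched as follows; the decoder internals are omitted and the interfaces are assumed:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

struct WatermarkResult {
    bool detected = false;
    std::vector<std::string> payloads;   // one or more payloads decoded from the frame, if any
};

// Stand-in for a watermark processor instance; a real instance holds its own working buffers.
class WatermarkProcessor {
public:
    WatermarkResult process(const std::vector<unsigned char>& frame) {
        (void)frame;                     // synchronization and payload extraction omitted
        return WatermarkResult{};
    }
};

std::vector<WatermarkResult> decodeFrames(
        const std::vector<std::vector<unsigned char>>& frames) {
    std::array<WatermarkProcessor, 3> processors;      // watermark processors 1-3
    std::vector<WatermarkResult> results;
    results.reserve(frames.size());
    for (std::size_t i = 0; i < frames.size(); ++i)
        results.push_back(processors[i % processors.size()].process(frames[i]));
    return results;
}
```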
- FIGS. 7A and 7B illustrate image portions 180 , 182 in different frames captured from a field of view of a scanner's imager.
- An object 184 is moving through this field of view in these frames.
- The term “image portion” is used to reflect that the image portion of a frame is not necessarily co-extensive with the entire pixel array of an imager.
- an imager may capture plural views of the object 184 per frame, and the image portion may correspond to one particular view of plural different views captured by the image sensor array for a frame. Alternatively, it may encompass plural views imaged within a frame.
- frames from different imagers may be composited, in which case, the image portion may include a portion of frames composited from different imagers.
- FIG. 7A depicts an image block from a frame at a first capture time
- FIG. 7B represents an image block from a second, later capture time.
- the imager has a frame capture rate of 100 frames per second.
- a new frame is available for sampling as fast as every 10 ms.
- the rate at which the controller provides frames or portions of frames to each recognition unit may not be as high as the frame rate.
- the frames illustrated here need not be strictly adjacent in a video sequence from the sensor, but are within a time period in which an object 184 moves through the field of view of the scanner.
- the object movement may be from a checker swiping the object 184 through a field of view of the scanner or positioning a hand held scanner to image the object, or from a mechanical mechanism, such as a conveyor moving an object through a view volume of a scanner.
- Image portion 180 at frame time, T 1 includes an image captured of at least a first part of object 184 .
- This object has encoded data tiles having a first payload 186 a , 186 b , and encoded data tile 188 a having a second payload.
- Image block 182 at a later frame time, T 2 , depicts that the object 184 has moved further within the field of view of the scanner.
- more tiles are captured, such as 186 c having the same payload as 186 a and 186 b , and 188 b having the same payload as 188 a.
- FIGS. 7A and 7B illustrate the problem outlined above for conflicting codes on objects.
- the recognition unit may detect a first code in 188 a and another code in 186 a or none of the codes in 186 from frame at T 1 .
- the reverse may happen for the frame at T 2 , as more of the tiles of 186 are visible to the scanner than 188.
- the recognition unit is more likely to detect 186 at T 2 .
- the code in 188 is an example of an inner barcode. It is only partially obscured by the label or overwrap on which the code in 186 resides.
- Tiles 188 a - b carry an “inner barcode,” whereas tiles 186 a - c contain an “outer barcode,” using the terminology introduced earlier.
- the encoded tiles 188 a - b correspond to packaging of an individual item in a family pack or the label bearing the GTIN of a product, before a price change.
- the encoded tiles 186 a - c correspond to packaging of the family pack, such as a partial over-wrap or carrier.
- Encoded tiles 186 a - c alternatively correspond to a price change label.
- the sequence of detection is likely to be as shown in FIG. 2 , where the inner barcode of 188 is detected at T 1 and then the outer barcode is detected at T 2 . This sequence of detection may not always happen, but in cases where different codes are detected from a package either within a frame, or over different frames, there is a need for code conflict resolution.
- FIGS. 8A and 8B illustrate another example of image portions 190 , 192 in different frames captured from a field of view of a scanner's imager.
- an outer barcode is likely to be detected first, but later, the inner barcode is likely to be detected.
- an outer barcode is encoded in tiles 196 a - d
- an inner barcode in tiles 198 a - b .
- the outer barcode is encoded in tiles 196 a - d on the package of the overwrap, but the overwrap does not completely obscure the inner barcode, which is a barcode encoded in tiles 198 a - b on an individual item or items within the family pack.
- the price change is encoded in 196 a - d , e.g., on a label affixed to the package 194 over the original packaging.
- the original packaging retains encoding of the original item's GTIN in tiles 198 a - b .
- the sequence of detection of outer then inner barcode of FIG. 3 is likely to happen in this case.
- a recognition unit is likely to detect the payload of tiles 196 a - d , and likely not 198 a .
- the recognition unit is likely to detect the payload of tiles 198 a - b .
- This scenario poses a conflict if the scanner were to report the GTIN of the inner barcode separately from the family pack. Further, in some price change label scenarios, the scanner needs to detect that it should not report the original GTIN, as this would not reflect the price change correctly.
- FIG. 9 is a flow diagram of a controller process that resolves these potential code conflicts.
- this control logic is implemented within the controller 140 of FIG. 5 .
- it may also be distributed between the controller 140 and one or more recognition units (e.g., 142 , 144 , 146 ).
- a recognition unit may implement control logic for resolving conflicts among codes that it detects during scanning operation, and report a subset of codes to a controller 140 for which conflicts have been resolved.
- the controller receives recognition results from plural different recognition units and executes control logic to resolve conflicts among the recognition results from these recognition units.
- control logic is implemented as software instructions within a controller software process 160 executing on a processor ( 102 , 102 a or 108 ) of the scanner.
- the recognition unit 162 is a software process executed on that processor or different processor within the scanner.
- the controller begins by initiating the recognition units (e.g., 142 - 146 ).
- the controller issues instructions to the imager 110 via the imager interface and the illumination driver 116 to coordinate image capture and illumination as objects are scanned.
- the imaging interface 108 captures image data from the imager 110 for a frame, buffers it in a RAM memory and signals the controller that a new image block is available.
- This RAM memory may be within the interface 108 or in RAM memory 104 .
- the controller gets an address of an image block in this RAM memory and passes the address to a recognition unit, along with additional attributes of that image block useful in assisting recognition operations (such as the view or camera that the image block came from, its geometric state (e.g., orientation of the view), frame identifier, and the like).
- the recognition unit proceeds to obtain and perform recognition operations on the image block.
- a watermark processor executes decoder operations on the image block to search for an encoded data carrier and extract its payload from one or more of these encoded tiles, if detected.
- Plural instances of watermark processors may be assigned to process image blocks of different frames, as shown in FIG. 6 .
- the controller gets recognition results from the recognition units as shown in step 203 .
- the controller queries a recognition unit to get its recognition result. It then evaluates the result to determine whether it has successfully recognized an object and has provided its item identifier (e.g. a GTIN, price code identifier or like item identifier), as shown in decision block 204 . If not, it passes the next image block to the recognition unit (back to 201 - 202 ).
- If the controller has obtained an item identifier, it evaluates the identifier against other identifiers obtained from the frame and prior frames during a pending time out period in step 205 . This evaluation includes a comparison of the detected identifier with other identifiers from the same frame or prior frame stored in a state data structure.
- the controller retains state information for identifiers. Upon detection of a new identifier, the controller checks whether it is flagged, or has otherwise been detected as a family pack, family pack member or price change label. A family pack or family pack member is signaled via a flag decoded from the data carrier encoded on the object. Likewise, a price change label is similarly indicated by a flag. Alternative means of detecting family packs, family pack member items, and price change labels may be used in place of the flag or in addition to a flag, as described in this document (e.g., by label geometry, color, recognized image feature set or label template, etc.).
- the detection of a family pack causes the controller to update the state by storing the family pack identifier in a state data structure and initiating a waiting period.
- the family pack identifier is queued for reporting at this point, as there is no need to wait to report it. Instead, this waiting period is used to prevent reporting an identifier of a member of the family pack for detections during the waiting period initiated upon detection of the family pack.
- the waiting period is implemented using a timer as explained below.
- a duplicate time out period has a different objective from that of a waiting period to resolve a conflict. As such, it may be preferred to instantiate separate timers for duplicate and conflict rejection.
- the detection of a new family pack member causes the controller to check whether a family pack identifier with a pending waiting period is in a state data structure.
- the pending waiting period is indicated by the timer for the waiting period not being in a time out state when queried for an update. If a family pack is in a waiting period, the family pack member is not reported. If a family pack is not in a waiting period, the controller updates the state data structure by storing the family pack member's identifier and initiating a waiting period for it. This family pack member waiting period is used to instruct the controller to wait to determine whether a family pack identifier is detected in the waiting period. It may also be used for duplicate rejection.
- the family pack identifier is stored in a state data structure and is queued for reporting (there is no need to wait on reporting). Additionally, the family pack member is stored in a state data structure for duplicate rejection, and a family pack waiting period is initiated for the family pack identifier by setting a timer for a family pack waiting period.
- If the controller finds a detection result with a new fixed price flag set, it stores the new fixed price code and queues it for reporting. From a reporting perspective, the controller reports the new fixed price instead of the original product identifier (GTIN) decoded from the same object.
- the scanner determines whether an identifier is from the same object by proximity in detection time or detection location of the price change label relative to the original product identifier (GTIN). Proximity in detection time is implemented based on a waiting period.
- a waiting period is imposed for new identifiers detected because of the possibility that detection of a new fixed price label may replace the GTIN that the controller reports to the POS terminal.
- the new identifier is retained and a waiting period is initiated to determine whether a fixed price label is detected in that ensuing waiting period. If a new fixed price code is detected first before the original product identifier on the object, meaning that no product identifier is in a waiting period state in the state data structure, the new fixed price code is queued for reporting. Subsequent product identifiers in the waiting period are not reported, but may be stored for duplicate rejection.
- For a detected discount code, the controller stores the discount code in a state data structure and queues it for reporting.
- the scanner logic determines whether a product identifier is detected from the same object as noted in the previous case, e.g., by proximity in detection time and/or position in frame(s) relative to the discount label. If a product identifier from the same object is in the state data structure under its waiting period, the detected discount code is reported along with it. The discount code is stored for duplicate rejection, but is reported only once. If a discount is detected first, with no product identifier in a pending waiting period, the controller stores it in the state data structure and initiates a waiting period. It is reported if a new product identifier is detected in its waiting period. Since the discount should be associated with a product identifier, the controller may flag the POS terminal to have the checker scan or otherwise enter the product identifier of the product to which the discount code applies.
- the controller updates the state data structure with the identifier and status of an identifier (including product or price change codes), including state associated with family pack or price change detection results. It also calls a timer instance, if one has been initiated, to get its count and update the status of the timer as timed out, or still pending. It may also retain other information helpful in resolving conflict among detected items.
- This information may include a frame identifier or time code to indicate where an identifier originated from within a frame or a time of the frame in which it was detected. This information may also include position information, such as orientation parameters and/or spatial location within a frame from which the identifier was extracted.
- the positional information may be used to determine that identifiers are from items that are to be priced separately, and as such, both reported to the POS. For example, if the identifiers originate from different frame locations and have tile orientation that is inconsistent, then they are candidates of being from separate objects, and handled as such by the controller.
- the controller determines whether to report the identifier or identifiers in the state data structure. The decision is based on state of the identifiers in the data structure and the state of the timer used to track a waiting period that has been initiated.
- the controller reports an identifier, including price change codes, for which a waiting period has not been imposed, or the waiting period to report has timed out. Time out periods used only for duplicate rejection do not require a waiting period for reporting. However, potential conflicts arising from family pack or price changes may require a waiting period as described above.
- the controller determines whether an identifier is in a waiting period by consulting the state data structure to check whether the timer instance for a waiting period has timed out.
- the controller has updated the state data structure to signal that an identifier is in a state to be reported, or ignored. If it determines to report, the controller transmits the identifier(s) to the POS terminal via the scanner's communication interface as shown in block 208 .
- the controller sets up a timer for a waiting period, if necessary, for this pass through the controller process.
- the timer may be implemented with a soft timer, a software process such as a C++ timer object, which in turn, interfaces with a timer interrupt service available in the scanner's operating system.
- the timer creates a timer instance for a waiting period.
- the timer instance invokes the timer interrupt service to update its count.
- the timer interrupt service exposes a counter in the scanner hardware, e.g., as part of the ARM or other processor sub-system in the scanner. For flags that signal the start of a waiting period, such as a family pack or member of family pack, a new timer is initiated for that family pack related waiting period. The same is true for price change related waiting periods.
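- The following Python sketch illustrates, under an assumed waiting period length and assumed identifier kinds, how the state data structure, duplicate rejection and waiting-period timers described above could be organized. It is a simplified stand-in for the controller logic of FIG. 9, not the actual implementation; time.monotonic() stands in for the scanner's timer interrupt service.

```python
import time

WAIT_SECONDS = 0.5        # assumed waiting period; the real value is scanner-specific

class ControllerState:
    """Sketch of the state data structure and waiting-period timers (steps 205-208)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.seen = set()     # identifiers already handled (duplicate rejection)
        self.queued = []      # identifiers queued for reporting to the POS terminal
        self.waiting = {}     # identifier -> (kind, deadline) while its waiting period is pending

    def _pending(self, kind):
        now = self.clock()
        return any(k == kind and deadline > now for k, deadline in self.waiting.values())

    def on_detect(self, ident, kind):
        """kind: 'gtin', 'family_pack', 'family_pack_member', or 'new_fixed_price'."""
        if ident in self.seen:                            # duplicate rejection
            return
        self.seen.add(ident)
        deadline = self.clock() + WAIT_SECONDS
        if kind == "family_pack":
            self.queued.append(ident)                     # no need to wait to report it
            self.waiting[ident] = (kind, deadline)        # suppresses member reports
            # members already waiting are subsumed by the family pack
            self.waiting = {i: v for i, v in self.waiting.items() if v[0] != "family_pack_member"}
        elif kind == "family_pack_member":
            if not self._pending("family_pack"):          # report only if no pack shows up
                self.waiting[ident] = (kind, deadline)
        elif kind == "new_fixed_price":
            self.queued.append(ident)                     # reported instead of the original GTIN
            self.waiting[ident] = (kind, deadline)
            # original GTINs still waiting are replaced by the price change code
            self.waiting = {i: v for i, v in self.waiting.items() if v[0] != "gtin"}
        else:                                             # ordinary product GTIN
            if not self._pending("new_fixed_price"):
                self.waiting[ident] = (kind, deadline)

    def flush(self, send):
        """Report identifiers whose waiting period elapsed without a conflicting detection."""
        now = self.clock()
        for ident, (kind, deadline) in list(self.waiting.items()):
            if deadline <= now:
                if kind in ("gtin", "family_pack_member"):
                    self.queued.append(ident)
                del self.waiting[ident]
        while self.queued:
            send(self.queued.pop(0))
```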
- FIG. 9 depicts an example of a sequence of operations of a controller implementation.
- the sequence of operations may vary from the one depicted here.
- the timer may be set within the set of instructions that execute the update to the state of 206 .
- code conflict logic may be implemented within each recognition unit, and at the level of the controller.
- Conflict logic within a recognition unit is employed to resolve conflict among codes of the same type detected by the recognition unit. For example, in the case where plural conflicting codes of the same type are present on a package, the recognition unit employs code conflict logic to prevent reporting an erroneous code to the controller, and ultimately, to prevent the scanner from reporting an improper code to the POS system.
- the recognition unit writes its detection results to a data structure and returns the data structure (or pointer to it) when the controller queries it for detection results.
- the recognition unit records the state of detection results in the data structure, including whether a detected identifier is in a waiting period and whether a detected identifier is in a potentially conflicted status with another identifier.
- Where plural different codes of the same symbology and type are detected within a frame, they are recorded as potentially conflicting. This may occur where there are two different GTINs without a family pack or price code relationship to justify the existence of the different GTINs.
- a waiting period is initiated for each code. For subsequent codes detected within the waiting period, the recognition unit updates the data structure.
- the recognition unit may be able to resolve the conflict based on detection results within the waiting period that confirm that one identifier should be given priority over another. For example, subsequent detection of one of the identifiers in subsequent image frames of a package within the waiting period may be sufficient to confirm that one identifier has priority and should be reported as such through the state data structure. Alternatively, the conflict may not be resolved, and instead, the recognition unit reports potentially conflicting identifiers on a package to the controller via a pointer to the data structure.
- the controller either resolves the conflict based on detection results from another recognition unit and reports the highest priority identifier or reports an error to the POS system. For example, a GTIN in a barcode of one type reported from one recognition unit may agree with a GTIN in a different symbology reported from another recognition unit. For results within a waiting period, the controller compares the detection results from different recognition units and determines, based on matching the GTINs from different symbologies, that a conflicting GTIN can be excluded and the matching GTIN given priority. The controller then reports the higher priority GTIN. Alternatively, if a conflict persists or is not resolved, the controller signals an error to the POS system and prompts a re-scan, or manual entry. The re-scan may be switched to a presentment mode rather than a scan and pass mode so that the user can present the correct code for scanning.
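- A minimal sketch of this cross-unit check, assuming a simple tuple format for detection results: a GTIN confirmed by at least two recognition units or symbologies within the waiting period is given priority; otherwise the conflict is treated as unresolved and an error can be signaled to the POS.

```python
from collections import Counter

def resolve_gtin_conflict(results):
    """results: list of (recognition_unit_id, symbology, gtin) gathered within the waiting period."""
    counts = Counter(gtin for _, _, gtin in results)
    if not counts:
        return None
    gtin, votes = counts.most_common(1)[0]
    if votes >= 2:              # same GTIN read via two symbologies/units wins
        return gtin
    return None                 # unresolved: signal an error and prompt a re-scan or manual entry
```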
- recognition units can become more sophisticated in detection performance, detection result and state reporting, and conflict logic. These updates are reflected in updates to the contents of the data structure, which provide more detail of the context of the detection of each identifier (e.g., location, time of detect, number of detects, waiting period state) as well as recommended reporting logic (e.g., reporting an instruction to the controller to hold for waiting period, resolve conflict between codes A, B, etc., or seek to confirm detection result with result of another recognition unit).
- the scanner may be updated on a different schedule without concern of becoming incompatible with the recognition unit, as the data structure is configured to include a detection result that is backward compatible.
- An older version of a controller continues to interpret simpler results as before, e.g., report GTIN, wait, or error.
- a new version of the controller is preferably updated to interpret error or wait states in the extended data structure, as an instruction to read and resolve potential code conflicts identified in the extended data structure.
- the recognition unit updates are provided with helper source code that provides scanner manufacturers guidance on how to exploit the additional detection result data and code conflict logic implemented by the recognition unit and reported in the extended data structure it returns.
- FIG. 10 is a block diagram of a signal encoder for encoding a digital payload signal into an image signal.
- FIG. 11 is a block diagram of a compatible signal decoder for extracting the digital payload signal from an image signal.
- While the signal encoder and decoder may be used for communicating a data channel for many applications, the objective for use in physical objects is robust signal communication through images formed on and captured from these objects.
- Signal encoders and decoders like those in the Digimarc Barcode Platform from Digimarc Corporation, communicate auxiliary data in a data carrier within image content.
- Encoding and decoding is applied digitally, yet the signal survives digital to analog transformation and analog to digital transformation.
- the encoder generates a modulated image that is converted to a rendered form, such as a printed image.
- Prior to decoding, a receiving device has an imager to capture the modulated signal and convert it to an electric signal, which is digitized and then processed by the decoder.
- Inputs to the signal encoder include a host image 220 and auxiliary data payload 222 .
- the objectives of the encoder include encoding a robust signal with desired payload capacity per unit of host signal (e.g., the spatial area of a two-dimensional tile), while maintaining perceptual quality.
- In some cases, there is little host interference on the one hand, yet little host content in which to mask the presence of the data channel within an image.
- Some examples include a package design that is devoid of much image variability (e.g., a single, uniform color). See, e.g., US Published Application No. 20160275639, entitled SPARSE MODULATION FOR ROBUST SIGNALING AND SYNCHRONIZATION, incorporated herein by reference.
- the auxiliary data payload 222 includes the variable data information to be conveyed in the data channel, possibly along with other protocol data used to facilitate the communication.
- the protocol of the auxiliary data encoding scheme comprises the format of the auxiliary data payload, error correction coding schemes, payload modulation methods (such as the carrier signal, spreading sequence, encoded payload scrambling or encryption key), signal structure (including mapping of modulated signal to embedding locations within a tile), error detection in payload (CRC, checksum, etc.), perceptual masking method, host signal insertion function (e.g., how auxiliary data signal is embedded in or otherwise combined with host image signal in a package or label design), and synchronization method and signals.
- the protocol defines the manner in which the signal is structured and encoded for robustness, perceptual quality or data capacity. For a particular application, there may be a single protocol, or more than one protocol, depending on application requirements. Examples of multiple protocols include cases where there are different versions of the channel, different channel types (e.g., several digital watermark layers within a host). Different versions may employ different robustness encoding techniques or different data capacity.
- Protocol selector module 224 determines the protocol to be used by the encoder for generating a data signal. It may be programmed to employ a particular protocol depending on the input variables, such as user control, application specific parameters, or derivation based on analysis of the host signal.
- Perceptual analyzer module 226 analyzes the input host signal to determine parameters for controlling signal generation and embedding, as appropriate. It is not necessary in certain applications, while in others it may be used to select a protocol and/or modify signal generation and embedding operations. For example, when encoding in host color images that will be printed or displayed, the perceptual analyzer 226 is used to ascertain color content and masking capability of the host image.
- the output of this analysis along with the rendering method (display or printing device) and rendered output form (e.g., ink and substrate) is used to control auxiliary signal encoding in particular color channels (e.g., one or more channels of process inks, Cyan, Magenta, Yellow, or Black (CMYK) or spot colors), perceptual models, and signal protocols to be used with those channels.
- the perceptual analyzer module 226 also computes a perceptual model, as appropriate, to be used in controlling the modulation of a data signal onto a data channel within image content as described below.
- the signal generator module 228 operates on the auxiliary data and generates a data signal according to the protocol. It may also employ information derived from the host signal, such as that provided by perceptual analyzer module 226 , to generate the signal. For example, the selection of data code signal and pattern, the modulation function, and the amount of signal to apply at a given embedding location may be adapted depending on the perceptual analysis, and in particular on the perceptual model and perceptual mask that it generates. Please see below and the incorporated patent documents for additional aspects of this process.
- Embedder module 230 takes the data signal and modulates it into an image by combining it with the host image.
- the operation of combining may be an entirely digital signal processing operation, such as where the data signal modulates the host signal digitally, may be a mixed digital and analog process, or may be purely an analog process (e.g., where rendered output images are combined, with some conveying modulated data and others host image content, such as the various layers of a package design file).
- One approach is to adjust the host signal value as a function of the corresponding data signal value at an embedding location, which is limited or controlled according to the perceptual model and a robustness model for that embedding location.
- the adjustment may be altering the host image by adding a scaled data signal or multiplying by a scale factor dictated by the data signal value corresponding to the embedding location, with weights or thresholds set on the amount of the adjustment according to the perceptual model, robustness model, and available dynamic range.
- the adjustment may also be altering by setting the modulated host signal to a particular level (e.g., quantization level) or moving it within a range or bin of allowable values that satisfy a perceptual quality or robustness constraint for the encoded data.
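- As a minimal sketch of the scaled-addition approach above, assuming 8-bit image samples, a +1/-1 data signal mapped to embedding locations, and a per-location visibility limit supplied by the perceptual model, the adjustment can be computed and clipped as follows; the gain value and array layout are illustrative assumptions.

```python
import numpy as np

def embed_tile(host, data_signal, visibility_limit, gain=8.0):
    """host, data_signal, visibility_limit: 2-D arrays of equal shape.
    data_signal holds +1/-1 elements at embedding locations (0 elsewhere);
    visibility_limit is the per-location maximum change allowed by the perceptual model."""
    adjustment = np.clip(gain * data_signal, -visibility_limit, visibility_limit)
    return np.clip(host.astype(float) + adjustment, 0, 255)   # respect available dynamic range
```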
- the signal generator produces a data signal with data elements that are mapped to embedding locations in a tile. These data elements are modulated onto the host image at the embedding locations.
- a tile is a pattern of embedding locations. The tile derives its name from the way in which it is repeated in contiguous blocks of a host signal, but it need not be arranged this way.
- For image-based encoders, we use tiles in the form of a two-dimensional array (e.g., 128 by 128, 256 by 256, 512 by 512) of embedding locations.
- the embedding locations correspond to host signal samples at which an encoded signal element is embedded in an embedding domain, such as a spatial domain (e.g., pixels at a spatial resolution), frequency domain (frequency components at a frequency resolution), or some other feature space.
- We refer to an embedding location as a bit cell, referring to a unit of data (e.g., an encoded bit or chip element) encoded within a host signal at the location of the cell.
- the operation of combining may include one or more iterations of adjustments to optimize the modulated host for perceptual quality or robustness constraints.
- One approach, for example, is to modulate the host image so that it satisfies a perceptual quality metric as determined by a perceptual model (e.g., visibility model) for embedding locations across the signal.
- Another approach is to modulate the host image so that it satisfies a robustness metric across the signal.
- Yet another is to modulate the host image according to both the robustness metric and perceptual quality metric derived for each embedding location.
- the incorporated documents provide examples of these techniques. Below, we highlight a few examples. See, e.g., U.S. Pat. Nos. 9,449,357 and 9,401,001, and US Published Patent Application No. US 2016-0316098 A1, which are hereby incorporated herein by reference.
- For color images, the perceptual analyzer generates a perceptual model that evaluates visibility of an adjustment to the host by the embedder and sets levels of controls to govern the adjustment (e.g., levels of adjustment per color direction, and per masking region). This may include evaluating the visibility of adjustments of the color at an embedding location (e.g., units of noticeable perceptual difference in color direction in terms of CIE Lab values), Contrast Sensitivity Function (CSF), spatial masking model (e.g., using techniques described by Watson in US Published Patent Application No. US 2006-0165311 A1, which is incorporated by reference herein), etc.
- One way to approach the constraints per embedding location is to combine the data with the host at embedding locations and then analyze the difference between the encoded host with the original.
- the perceptual model specifies whether an adjustment is noticeable based on the difference between a visibility threshold function computed for an embedding location and the change due to embedding at that location.
- the embedder then can change or limit the amount of adjustment per embedding location to satisfy the visibility threshold function.
- there are various ways to compute adjustments that satisfy a visibility threshold with different sequence of operations. See, e.g., Digimarc's U.S. Pat. Nos. 9,449,357, 9,401,001, 9,380,186, 9,117,268 and 7,352,878, which are each hereby incorporated herein by reference in its entirety.
- the embedder also computes a robustness model.
- the computing of a robustness model may include computing a detection metric for an embedding location or region of locations.
- the approach is to model how well the decoder will be able to recover the data signal at the location or region. This may include applying one or more decode operations and measurements of the decoded signal to determine how strong or reliable the extracted signal is. Reliability and strength may be measured by comparing the extracted signal with the known data signal.
- There are a variety of decode operations that are candidates for detection metrics within the embedder.
- One example is an extraction filter which exploits a differential relationship to recover the data signal in the presence of noise and host signal interference.
- the host interference is derivable by applying an extraction filter to the modulated host. The extraction filter models data signal extraction from the modulated host and assesses whether the differential relationship needed to extract the data signal reliably is maintained. If not, the modulation of the host is adjusted so that it is.
- Detection metrics may be evaluated such as by measuring signal strength as a measure of correlation between the modulated host and variable or fixed data components in regions of the host, or measuring strength as a measure of correlation between output of an extraction filter and variable or fixed data components.
- the embedder changes the amount and location of host signal alteration to improve the correlation measure. These changes may be particularly tailored so as to establish relationships of the data signal within a particular tile, region in a tile or bit cell pattern of the modulated host. To do so, the embedder adjusts bit cells that violate the relationship so that the relationship needed to encode a bit (or M-ary symbol) value is satisfied and the thresholds for perceptibility are satisfied. Where robustness constraints are dominant, the embedder will exceed the perceptibility threshold where necessary to satisfy a desired robustness threshold.
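- The following sketch illustrates one such detection metric under simplifying assumptions: a cross-shaped neighbor comparison stands in for the extraction filter, and the metric is the correlation of its output with the known +1/-1 data signal. An embedder could increase the alteration at locations or tiles where this metric falls below a robustness threshold.

```python
import numpy as np

def cross_filter(img):
    """Compare each sample with its 4 neighbors (a simple stand-in for an oct-axis filter)."""
    out = np.zeros(img.shape)
    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        out += np.sign(img - np.roll(np.roll(img, dy, axis=0), dx, axis=1))
    return out

def detection_metric(modulated_host, data_signal):
    """Correlation between the filtered, modulated host and the known +1/-1 data signal."""
    filtered = cross_filter(modulated_host.astype(float))
    return float(np.sum(filtered * data_signal) / data_signal.size)
```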
- the robustness model may also model distortion expected to be incurred by the modulated host, apply the distortion to the modulated host, and repeat the above process of measuring detection metrics and adjusting the amount of alterations so that the data signal will withstand the distortion. See, e.g., U.S. Pat. Nos. 9,380,186, 9,401,001 and 9,449,357, which are each hereby incorporated herein by reference, for image related processing.
- This modulated host is then output as an output image signal 232 , with a data channel encoded in it.
- the operation of combining also may occur in the analog realm where the data signal is transformed to a rendered form, such as a layer of ink or coating applied by a commercial press to substrate.
- One example is a data signal that is overprinted as a layer of material, engraved in, or etched onto a substrate, where it may be mixed with other signals applied to the substrate by similar or other marking methods.
- the embedder employs a predictive model of distortion and host signal interference, and adjusts the data signal strength so that it will be recovered more reliably.
- the predictive modeling can be executed by a classifier that classifies types of noise sources or classes of host image and adapts signal strength and configuration of the data pattern to be more reliable to the classes of noise sources and host image signals that the encoded data signal is likely to encounter or be combined with.
- the output image signal 232 from the embedder typically incurs various forms of distortion through its distribution or use. For printed objects, this distortion occurs through rendering an image with the encoded signal in the printing process, and subsequent scanning back to a digital image via a camera or like image sensor.
- the signal decoder receives an encoded host signal 240 and operates on it with one or more processing stages to detect a data signal, synchronize it, and extract data.
- This signal decoder corresponds to a type of recognition unit in FIG. 5 and watermark processor in FIG. 6 .
- the decoder is paired with an input device in which a sensor captures an analog form of the signal and an analog to digital converter converts it to a digital form for digital signal processing.
- While aspects of the decoder may be implemented as analog components, e.g., preprocessing filters that seek to isolate or amplify the data channel relative to noise, much of the decoder is implemented as digital signal processing modules that implement the signal processing operations within a scanner. As noted, these modules are implemented as software instructions executed within the scanner, an FPGA, or ASIC.
- the detector 242 is a signal processing module that detects presence of the data channel.
- the incoming signal is referred to as a suspect host because it may not have a data channel or may be so distorted as to render the data channel undetectable.
- the detector is in communication with a protocol selector 244 to get the protocols it uses to detect the data channel. It may be configured to detect multiple protocols, either by detecting a protocol in the suspect signal and/or inferring the protocol based on attributes of the host signal or other sensed context information. A portion of the data signal may have the purpose of indicating the protocol of another portion of the data signal. As such, the detector is shown as providing a protocol indicator signal back to the protocol selector 244 .
- the synchronizer module 246 synchronizes the incoming signal to enable data extraction. Synchronizing includes, for example, determining the distortion to the host signal and compensating for it. This process provides the location and arrangement of encoded data elements within the host signal.
- the data extractor module 248 gets this location and arrangement and the corresponding protocol and demodulates a data signal from the host.
- the location and arrangement provide the locations of encoded data elements.
- the extractor obtains estimates of the encoded data elements and performs a series of signal decoding operations.
- the detector, synchronizer and data extractor may share common operations, and in some cases may be combined.
- the detector and synchronizer may be combined, as initial detection of a portion of the data signal used for synchronization indicates presence of a candidate data signal, and determination of the synchronization of that candidate data signal provides synchronization parameters that enable the data extractor to apply extraction filters at the correct orientation, scale and start location of a tile.
- data extraction filters used within data extractor may also be used to detect portions of the data signal within the detector or synchronizer modules.
- the decoder architecture may be designed with a data flow in which common operations are re-used iteratively, or may be organized in separate stages in pipelined digital logic circuits so that the host data flows efficiently through the pipeline of digital signal operations with minimal need to move partially processed versions of the host data to and from a shared memory unit, such as a RAM memory.
- FIG. 12 is a flow diagram illustrating operations of a signal generator.
- Each of the blocks in the diagram depicts a processing module that transforms the input auxiliary data (e.g., GTIN or other item identifier plus flags) into a digital payload data signal structure.
- each block provides one or more processing stage options selected according to the protocol.
- the auxiliary data payload is processed to compute error detection bits, e.g., such as a Cyclic Redundancy Check, Parity, check sum or like error detection message symbols. Additional fixed and variable messages used in identifying the protocol and facilitating detection, such as synchronization signals may be added at this stage or subsequent stages.
- Error correction encoding module 302 transforms the message symbols of the digital payload signal into an array of encoded message elements (e.g., binary or M-ary elements) using an error correction method. Examples include block codes, BCH, Reed Solomon, convolutional codes, turbo codes, etc.
- Repetition encoding module 304 repeats and concatenates the string of symbols from the prior stage to improve robustness. For example, certain message symbols may be repeated at the same or different rates by mapping them to multiple locations within a unit area of the data channel (e.g., one unit area being a tile of bit cells, as described further below).
- Repetition encoding may be removed and replaced entirely with error correction coding. For example, rather than applying convolutional encoding (1/3 rate) followed by repetition (repeat three times), these two can be replaced by convolutional encoding to produce a coded payload with approximately the same length.
- carrier modulation module 306 takes message elements of the previous stage and modulates them onto corresponding carrier signals.
- a carrier might be an array of pseudorandom signal elements, with an equal number of positive and negative elements (e.g., 16, 32, 64 elements), or other waveform.
- Mapping module 308 maps signal elements of each modulated carrier signal to locations within the channel.
- the locations correspond to embedding locations within the host signal.
- the embedding locations may be in one or more coordinate system domains in which the host signal is represented within a memory of the signal encoder.
- the locations may correspond to regions in a spatial domain, temporal domain, frequency domain, or some other transform domain. Stated another way, the locations may correspond to a vector of host signal features, which are modulated to encode a data signal within the features.
- Mapping module 308 also maps a synchronization signal to embedding locations within the host signal, for embodiments employing an explicit synchronization signal.
- An explicit synchronization signal is described further below.
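- A simplified sketch of these stages is shown below. It is illustrative of the structure of FIG. 12 rather than the actual protocol: a checksum stands in for the error detection bits, repetition stands in for a full error correction code, and the tile, chip and repeat counts are assumptions.

```python
import numpy as np

def generate_tile_signal(payload_bits, tile=128, chips=16, repeat=3, seed=7):
    rng = np.random.default_rng(seed)
    bits = np.asarray(payload_bits, dtype=np.int8)

    # 300: error detection (illustrative 8-bit checksum over the payload bits)
    checksum = np.array([(int(bits.sum()) >> i) & 1 for i in range(8)], dtype=np.int8)
    msg = np.concatenate([bits, checksum])

    # 302/304: error correction / repetition (repetition stands in for a real code)
    coded = np.repeat(msg, repeat)

    # 306: modulate each coded bit onto a pseudorandom antipodal carrier
    carriers = rng.choice([-1, 1], size=(coded.size, chips)).astype(np.int8)
    modulated = ((2 * coded[:, None] - 1) * carriers).reshape(-1)

    # 308: map modulated elements to pseudorandom embedding locations in the tile
    signal = np.zeros(tile * tile, dtype=np.int8)
    locs = rng.permutation(tile * tile)[: modulated.size]
    signal[locs] = modulated
    return signal.reshape(tile, tile)
```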
- To accurately recover the payload, the decoder must be able to extract estimates of the coded bits at the embedding locations within each tile. This requires the decoder to synchronize the image under analysis to determine the embedding locations. For images, where the embedding locations are arranged in two dimensional blocks within a tile, the synchronizer determines rotation, scale and translation (origin) of each tile. This may also involve approximating the geometric distortion of the tile by an affine transformation that maps the embedded signal back to its original embedding locations.
- the auxiliary signal may include an explicit or implicit synchronization signal.
- An explicit synchronization signal is an auxiliary signal separate from the encoded payload that is embedded with the encoded payload (e.g., within the same tile).
- An implicit synchronization signal is a signal formed with the encoded payload, giving it structure that facilitates geometric/temporal synchronization. Examples of explicit and implicit synchronization signals are provided in our previously cited patents U.S. Pat. Nos. 6,614,914, and 5,862,260.
- an explicit synchronization signal is a signal comprised of a set of sine waves, with pseudo-random phase, which appear as peaks in the Fourier domain of the suspect signal. See, e.g., U.S. Pat. Nos. 6,614,914, and 5,862,260, describing use of a synchronization signal in conjunction with a robust data signal. Also see U.S. Pat. No. 7,986,807, which is hereby incorporated by reference.
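- As an illustration (the peak count, frequency range and normalization are assumptions), such a synchronization signal can be formed in the spatial domain as a sum of two-dimensional sinusoids with pseudorandom phases, which appear as peaks in the Fourier magnitude domain:

```python
import numpy as np

def sync_signal(tile=128, n_peaks=16, seed=3):
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:tile, 0:tile]
    signal = np.zeros((tile, tile))
    for _ in range(n_peaks):
        fx, fy = rng.integers(2, tile // 4, size=2)      # mid-band frequencies
        phase = rng.uniform(0, 2 * np.pi)
        signal += np.cos(2 * np.pi * (fx * x + fy * y) / tile + phase)
    return signal / n_peaks
```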
- FIG. 13 is a diagram illustrating embedding of an auxiliary signal into host signal.
- the inputs are a host signal block (e.g., blocks of a host digital image) ( 320 ) and an encoded auxiliary signal ( 322 ), which is to be inserted into the signal block.
- the encoded auxiliary signal may include an explicit synchronization component, or the encoded payload may be formulated to provide an implicit synchronization signal.
- Processing block 324 is a routine of software instructions or equivalent digital logic configured to insert the mapped signal(s) into the host by adjusting the corresponding host signal sample(s) at an embedding location according to the value of the mapped signal element.
- the mapped signal is added to or subtracted from the corresponding sample value, with a scale factor and threshold from the perceptual model or like mask controlling the adjustment amplitude.
- the encoded payload and synchronization signals may be combined and then added, or added separately with separate mask coefficients to control the signal amplitude independently.
- the product or label identifier (e.g., in GTIN format) and additional flag or flags used by control logic are formatted into a binary sequence, which is encoded and mapped to the embedding locations of a tile.
- the embedding locations correspond to spatial domain embedding locations within an image.
- the spatial locations correspond to pixel samples at a configurable spatial resolution, such as 100 or 300 DPI.
- the spatial resolution of the embedded signal is 300 DPI, for an embodiment where the resulting image with encode data is printed on a package or label material, such as a paper, plastic or like substrate.
- the payload is repeated in contiguous tiles each comprised of 256 by 256 embedding locations. With these embedding parameters, an instance of the payload is encoded in each tile, occupying a block of host image of about 1.28 by 1.28 inches. These parameters are selected to provide a printed version of the image on paper or other substrate. At this size, the payload can be redundantly encoded in several contiguous tiles, providing added robustness.
- An alternative way to achieve the desired payload capacity is to encode a portion of the payload in smaller tiles, e.g., 128 by 128, and use a protocol indicator to specify the portion of the payload conveyed in each 128 by 128 tile. Erasure codes may be used to convey different payload components per tile and then assemble the components in the decoder, as elaborated upon below.
- error correction coding is applied to the binary sequence.
- This implementation applies a convolutional coder at rate 1/4, which produces an encoded payload signal of 4096 bits.
- Each of these bits is modulated onto a binary antipodal, pseudorandom carrier sequence (−1, 1) of length 16, e.g., multiply or XOR the payload bit with the binary equivalent of chip elements in its carrier to yield 4096 modulated carriers, for a signal comprising 65,536 elements. These elements map to the 65,536 embedding locations in each of the 256 by 256 tiles.
- An alternative embodiment, for robust encoding on packaging, employs tiles of 128 by 128 embedding locations. Through convolutional coding of an input payload at rate 1/3 and subsequent repetition coding, an encoded payload of 1024 bits is generated. Each of these bits is modulated onto a similar carrier sequence of length 16, and the resulting 16,384 signal elements are mapped to the 16,384 embedding locations within the 128 by 128 tile.
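- As a quick check of the arithmetic in these two configurations:

```python
# 4096 coded bits x 16-chip carriers exactly fill one 256 x 256 tile
assert 4096 * 16 == 256 * 256
# 1024 coded bits x 16-chip carriers exactly fill one 128 x 128 tile
assert 1024 * 16 == 128 * 128
```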
- There are several alternatives for mapping functions to map the encoded payload to embedding locations.
- these elements have a pseudorandom mapping to the embedding locations.
- they are mapped to bit cell patterns of differentially encoded bit cells as described in US Published Patent Application no. 20160217547, incorporated above.
- the tile size may be increased to accommodate the differential encoding of each encoded bit in a pattern of differential encoded bit cells, where the bit cells correspond to embedding locations at a target resolution (e.g., 300 DPI).
- the mapping function maps a discrete digital image of the synchronization signal to the host image block.
- the synchronization signal comprises a set of Fourier magnitude peaks or sinusoids with pseudorandom phase
- the synchronization signal is generated in the spatial domain in a block size coextensive with the 256 by 256 tile (or other tile size, e.g., 128 by 128) at target embedding resolution.
- One signaling approach which is detailed in U.S. Pat. Nos. 6,614,914, and 5,862,260, is to map elements to pseudo-random locations within a channel defined by a domain of a host signal. See, e.g., FIG. 9 of U.S. Pat. No. 6,614,914.
- elements of a watermark signal are assigned to pseudo-random embedding locations within an arrangement of sub-blocks within a block (referred to as a “tile”).
- the elements of this watermark signal correspond to error correction coded bits output from an implementation of stage 304 of FIG. 12 . These bits are modulated onto a pseudo-random carrier to produce watermark signal elements (block 306 of FIG. 12 ).
- An embedder module modulates this signal onto a host signal by increasing or decreasing host signal values at these locations for each error correction coded bit according to the values of the corresponding elements of the modulated carrier signal for that bit.
- FIG. 14 is a flow diagram illustrating a method for decoding a payload signal from a host image signal. This method is a particular embodiment of a recognition unit of FIG. 5 , and a watermark processor of FIG. 6 . Implementations of recognition unit and watermark processors available from Digimarc Corporation include:
- the Embedded Systems SDK is the one typically integrated into scanner hardware.
- the frames are captured at a resolution preferably near the resolution at which the auxiliary signal has been encoded within the original image (e.g., 300 DPI, 100 DPI, etc.).
- An image up-sampling or down-sampling operation may be performed to convert the image frames supplied by the imager to a target resolution for further decoding.
- the resulting image blocks supplied to the decoder from these frames may potentially include an image with the payload. At least some number of tiles of encoded signal may be captured within the field of view, if an object with encoded data is being scanned. Otherwise, no encoded tiles will be present. The objective, therefore, is to determine as efficiently as possible whether encoded tiles are present.
- the decoder selects image blocks for further analysis.
- the block size of these blocks is set large enough to span substantially all of a complete tile of encoded payload signal, and preferably a cluster of neighboring tiles.
- the spatial scale of the encoded signal is likely to vary from its scale at the time of encoding. This spatial scale distortion is further addressed in the synchronization process.
- the first stage of the decoding process filters the image to prepare it for detection and synchronization of the encoded signal ( 402 ).
- the decoding process sub-divides the image into blocks and selects blocks for further decoding operations.
- a first filtering stage converts the input color image signal (e.g., RGB values) to a color channel or channels where the auxiliary signal has been encoded. See, e.g., U.S. Pat. No. 9,117,268 for more on color channel encoding and decoding.
- In a scanner with red LED illumination, the decoding process operates on the "red" channel sensed by the scanner.
- Some scanners may pulse LEDs of different color to obtain plural color or spectral samples per pixel as described in our Patent Application Publication 2013-0329006, entitled COORDINATED ILLUMINATION AND IMAGE SIGNAL CAPTURE FOR ENHANCED SIGNAL DETECTION, which is hereby incorporated by reference.
- a second filtering operation isolates the auxiliary signal from the host image.
- Pre-filtering is adapted for the auxiliary signal encoding format, including the type of synchronization employed. For example, where an explicit synchronization signal is used, pre-filtering is adapted to isolate the explicit synchronization signal for the synchronization process.
- the synchronization signal is a collection of peaks in the Fourier domain.
- Prior to conversion to the Fourier domain, the image blocks are pre-filtered. See, e.g., the LaPlacian pre-filter in U.S. Pat. No. 6,614,914.
- A window function is applied to the blocks, and then they are transformed to the Fourier domain by applying an FFT.
- Another filtering operation is performed in the Fourier domain. See, e.g., pre-filtering options in U.S. Pat. Nos. 6,988,202, 6,614,914, 20120078989, which are hereby incorporated by reference.
- The synchronization process ( 404 ) is executed on a filtered block to recover the rotation, spatial scale, and translation of the encoded signal tiles.
- This process may employ a log polar method as detailed in U.S. Pat. No. 6,614,914 or the least squares approach of US Publication 20120078989 to recover rotation and scale of a synchronization signal comprised of peaks in the Fourier domain.
- the phase correlation method of U.S. Pat. No. 6,614,914 is used, or phase estimation and phase deviation methods of U.S. Pat. No. 9,182,778, which is hereby incorporated herein by reference, are used.
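- The following numpy sketch conveys the general idea of recovering rotation and scale by resampling the Fourier magnitude into log-polar coordinates and correlating against the known synchronization signal. It is a simplified stand-in for the cited methods, with grid sizes, windowing and nearest-neighbor sampling chosen for brevity.

```python
import numpy as np

def log_polar_magnitude(block, n_theta=180, n_rho=128):
    """Fourier magnitude of a (windowed) block, resampled on a log-polar grid."""
    win = np.hanning(block.shape[0])[:, None] * np.hanning(block.shape[1])[None, :]
    mag = np.abs(np.fft.fftshift(np.fft.fft2(block * win)))
    cy, cx = block.shape[0] // 2, block.shape[1] // 2
    rho_max = np.log(min(cy, cx))
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    rhos = np.exp(np.linspace(0, rho_max, n_rho))
    ys = (cy + rhos[None, :] * np.sin(thetas)[:, None]).astype(int).clip(0, block.shape[0] - 1)
    xs = (cx + rhos[None, :] * np.cos(thetas)[:, None]).astype(int).clip(0, block.shape[1] - 1)
    return mag[ys, xs]                                    # rows: angle, columns: log radius

def estimate_rotation_scale(block, sync_ref):
    """block and sync_ref (the known sync signal rendered at the same size) -> (rotation_deg, scale)."""
    a, b = log_polar_magnitude(block), log_polar_magnitude(sync_ref)
    # rotation becomes a row shift and scale a column shift; find them by FFT-based correlation
    corr = np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))
    ti, ri = np.unravel_index(np.argmax(corr), corr.shape)
    rotation_deg = ti * (180.0 / a.shape[0])
    rho_max = np.log(min(block.shape) // 2)
    scale = np.exp(ri * rho_max / a.shape[1])             # note: wraps for large shifts
    return rotation_deg, scale
```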
- the decoder steps through the embedding locations in a tile, extracting bit estimates from each location ( 406 ).
- This process applies, for each location, the rotation, scale and translation parameters, to extract a bit estimate from each embedding location ( 406 ).
- As it visits each embedding location in a tile, it transforms the location to a location in the received image based on the affine transform parameters derived in the synchronization, and then samples around each location. It does this process for the embedding location and its neighbors to feed inputs to an extraction filter (e.g., oct axis or cross shaped).
- a bit estimate is extracted at each embedding location using filtering operations, e.g., oct axis or cross shaped filter (see above), to compare a sample at embedding locations with neighbors.
- the output (e.g., +1, −1) of each compare operation is summed to provide an estimate for an embedding location.
- Each bit estimate at an embedding location corresponds to an element of a modulated carrier signal.
- the signal decoder estimates a value of each error correction encoded bit by accumulating the bit estimates from the embedding locations of the carrier signal for that bit ( 408 ). For instance, in the encoder embodiment above, error correction encoded bits are modulated over a corresponding carrier signal with 16 elements (e.g., multiplied by or XOR with a binary anti-podal signal). A bit value is demodulated from the estimates extracted from the corresponding embedding locations of these elements. This demodulation operation multiplies the estimate by the carrier signal sign and adds the result. This demodulation provides a soft estimate for each error correction encoded bit.
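- A compact sketch of this extraction and demodulation, assuming an oct-axis style neighbor comparison and assuming the decoder knows the embedding locations and carrier chips for each coded bit:

```python
import numpy as np

def oct_axis(img):
    """Compare each sample with its 8 neighbors; positive means brighter than neighbors."""
    out = np.zeros(img.shape)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                out += np.sign(img - np.roll(np.roll(img, dy, axis=0), dx, axis=1))
    return out

def demodulate(tile_img, locations, carriers):
    """locations: (n_bits, chips) indices into the flattened tile;
    carriers: (n_bits, chips) array of +/-1 chips used at encoding time."""
    estimates = oct_axis(tile_img.astype(float)).reshape(-1)[locations]
    soft_bits = np.sum(estimates * carriers, axis=1)   # one soft estimate per coded bit
    return soft_bits                                   # feed these to the error correction decoder
```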
- a Viterbi decoder is used to produce the payload signal, including the checksum or CRC.
- a compatible decoder is applied to reconstruct the payload. Examples include block codes, BCH, Reed Solomon, Turbo codes.
- the payload is validated by computing the check sum and comparing with the decoded checksum bits ( 412 ).
- The check sum function matches the one used in the encoder, of course.
- the decoder computes a CRC for a portion of the payload and compares it with the CRC portion in the payload.
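- For example, using a generic CRC-32 as a stand-in for whatever error detection function the protocol actually specifies:

```python
import zlib

def validate(message_bytes: bytes, received_crc: int) -> bool:
    """Recompute the CRC over the decoded message and compare with the CRC carried in the payload."""
    return (zlib.crc32(message_bytes) & 0xFFFFFFFF) == received_crc
```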
- the payload is stored in shared memory of the decoder process.
- the recognition unit in which the decoder process resides returns it to the controller via its interface. This may be accomplished by various communication schemes, such as IPC, shared memory within a process, DMA, etc.
- the scanner may also include a recognition unit that implements an image recognition method for identifying a product in a store's inventory as well as product labels, such as price change labels.
- reference image feature sets of each product are stored in a database of the scanner's memory and linked to an item identifier for a product and/or particular label (e.g., price change label).
- the recognition unit extracts corresponding features from an image frame and matches them against the reference feature sets to detect a likely match. If the match criteria are satisfied, the recognition unit returns an item identifier to the controller.
- the recognition unit may also return spatial information, such as position, bounding box, shape or other geometric parameters for a recognized item to enable the controller to detect whether a code from another recognition unit is from the same object.
- Image fingerprint-based systems such as SIFT, SURF, ORB and CONGAS are some of the most popular algorithms.
- SIFT, SURF and ORB are each implemented in the popular OpenCV software library, e.g., version 2.3.1.
- CONGAS is used by Google Goggles for that product's image recognition service, and is detailed, e.g., in Neven et al, “Image Recognition with an Adiabatic Quantum Computer I. Mapping to Quadratic Unconstrained Binary Optimization,” Arxiv preprint arXiv:0804.4457, 2008.
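- For illustration, a SIFT-based match of a captured frame against stored reference descriptors might look like the following sketch using the OpenCV API (cv2.SIFT_create in current releases; the 2.x library mentioned above exposed a different constructor). The ratio-test threshold and minimum match count are assumptions, and the reference descriptors are assumed to be a precomputed float32 array linked to an item identifier.

```python
import cv2

def match_item(frame_gray, reference_descriptors, min_matches=10):
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(frame_gray, None)
    if descriptors is None or reference_descriptors is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(descriptors, reference_descriptors, k=2)
    # Lowe's ratio test keeps only distinctive matches
    good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    return len(good) >= min_matches
```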
- SIFT is an acronym for Scale-Invariant Feature Transform, a computer vision technology pioneered by David Lowe and described in various of his papers including "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110; and "Object Recognition from Local Scale-Invariant Features," International Conference on Computer Vision, Corfu, Greece (September 1999), pp. 1150-1157, as well as in U.S. Pat. No. 6,711,293, which is hereby incorporated herein by reference.
- SIFT works by identification and description—and subsequent detection—of local image features.
- the SIFT features are local and based on the appearance of the object at particular interest points, and are invariant to image scale, rotation and affine transformation. They are also robust to changes in illumination, noise, and some changes in viewpoint. In addition to these properties, they are distinctive, relatively easy to extract, allow for correct object identification with low probability of mismatch and are straightforward to match against a (large) database of local features.
- Object description by set of SIFT features is also robust to partial occlusion; as few as 3 SIFT features from an object can be enough to compute location and pose.
- The technique starts by identifying local image features—termed keypoints—in a reference image. This is done by convolving the image with Gaussian blur filters at different scales (resolutions), and determining differences between successive Gaussian-blurred images. Keypoints are those image features having maxima or minima of the difference of Gaussians occurring at multiple scales. (Each pixel in a difference-of-Gaussians frame is compared to its eight neighbors at the same scale, and to the corresponding pixels in each of the neighboring scales (e.g., nine other pixels per scale). If the pixel value is a maximum or minimum among all these pixels, it is selected as a candidate keypoint.)
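- A compact sketch of this difference-of-Gaussians candidate search, using OpenCV for the blurring and SciPy for the neighborhood comparison; the scale values and contrast threshold are illustrative assumptions:

```python
import cv2
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def dog_candidates(gray, sigmas=(1.0, 1.4, 2.0, 2.8, 4.0)):
    """Return (row, col, scale_index) candidates that are extrema of the
    difference of Gaussians across space and scale."""
    gray = gray.astype(np.float32) / 255.0
    blurred = [cv2.GaussianBlur(gray, (0, 0), s) for s in sigmas]
    dog = np.stack([blurred[i + 1] - blurred[i] for i in range(len(sigmas) - 1)])
    # A pixel is a candidate if it is the max or min of its 3x3x3 neighborhood
    # (8 spatial neighbors at its own scale plus 9 at each adjacent scale).
    is_max = dog == maximum_filter(dog, size=3)
    is_min = dog == minimum_filter(dog, size=3)
    strong = np.abs(dog) > 0.02  # illustrative contrast threshold
    s, r, c = np.nonzero((is_max | is_min) & strong)
    return list(zip(r, c, s))
```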
- the above procedure typically identifies many keypoints that are unsuitable, e.g., due to having low contrast (thus being susceptible to noise), or due to having poorly determined locations along an edge (the Difference of Gaussians function has a strong response along edges, yielding many candidate keypoints, but many of these are not robust to noise).
- These unreliable keypoints are screened out by performing a detailed fit on the candidate keypoints to nearby data for accurate location, scale, and ratio of principal curvatures. This rejects keypoints that have low contrast, or are poorly located along an edge.
- this process starts by—for each candidate keypoint—interpolating nearby data to more accurately determine keypoint location. This is often done by a Taylor expansion with the keypoint as the origin, to determine a refined estimate of maxima/minima location.
- the value of the second-order Taylor expansion can also be used to identify low contrast keypoints. If the contrast is less than a threshold (e.g., 0.03), the keypoint is discarded.
- a variant of a corner detection procedure is applied. Briefly, this involves computing the principal curvature across the edge, and comparing to the principal curvature along the edge. This is done by solving for eigenvalues of a second order Hessian matrix.
- the keypoint descriptor is computed as a set of orientation histograms on (4×4) pixel neighborhoods.
- the foregoing procedure is applied to training images to compile a reference database.
- An unknown image is then processed as above to generate keypoint data, and the closest-matching image in the database is identified by a Euclidian distance-like measure.
- a “best-bin-first” algorithm is typically used instead of a pure Euclidean distance calculation, to achieve several orders of magnitude speed improvement.
- A "no match" output is produced if the distance score for the best match is too close (e.g., within 25%) to the distance score for the next-best match.
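- This next-best-match check is commonly expressed as a distance-ratio test; a minimal sketch follows, with the 25% margin mapped (as an assumption) to requiring the best distance to be at most 75% of the next-best distance:

```python
import cv2

def ratio_test_matches(des_query, des_ref, max_ratio=0.75):
    """Keep only matches whose best distance is clearly better than the
    second-best distance (a Lowe-style ratio test)."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)  # L2 suits float descriptors (e.g., SIFT)
    knn = matcher.knnMatch(des_query, des_ref, k=2)
    good = []
    for pair in knn:
        if len(pair) == 2 and pair[0].distance < max_ratio * pair[1].distance:
            good.append(pair[0])  # unambiguous match
        # otherwise: "no match" for this query descriptor
    return good
```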
- an image may be matched by clustering. This identifies features that belong to the same reference image—allowing unclustered results to be discarded as spurious.
- a Hough transform can be used—identifying clusters of features that vote for the same object pose.
- While SIFT is a well-known technique for generating robust local descriptors, there are others, including GLOH (c.f., Mikolajczyk et al, "Performance Evaluation of Local Descriptors," IEEE Trans. Pattern Anal. Mach. Intell., Vol. 27, No. 10, pp. 1615-1630, 2005) and SURF (c.f., Bay et al, "SURF: Speeded Up Robust Features," Eur. Conf. on Computer Vision (1), pp. 404-417, 2006; Chen et al, "Efficient Extraction of Robust Image Features on Mobile Devices," Proc. of the 6th IEEE and ACM Int. Symp. on Mixed and Augmented Reality, 2007; and Takacs et al, "Outdoors Augmented Reality on Mobile Phone Using Loxel-Based Visual Feature Organization," ACM Int. Conf. on Multimedia Information Retrieval, October 2008).
- ORB refers to Oriented FAST and Rotated BRIEF, a fast local robust feature detector.
- ORB is detailed in Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary Bradski, "ORB: an efficient alternative to SIFT or SURF," Int. Conf. on Computer Vision (ICCV), 2011.
- Bag of Features, or Bag of Words, methods can also be used.
- Such methods extract local features from patches of an image (e.g., SIFT points), and automatically cluster the features into N groups (e.g., 168 groups)—each corresponding to a prototypical local feature.
- A vector of occurrence counts of each of the groups (i.e., a histogram) is then determined, and serves as a reference signature for the image.
- To determine if a query image matches the reference image, local features are again extracted from patches of the image, and assigned to one of the earlier-defined N groups (e.g., based on a distance measure from the corresponding prototypical local features).
- A vector of occurrence counts is again made, and checked for correlation with the reference signature.
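- A minimal bag-of-words sketch under the above description, using k-means from OpenCV to form the N prototype groups and a normalized correlation between occurrence histograms; the group count, threshold and array shapes are illustrative:

```python
import numpy as np
import cv2

def build_vocabulary(descriptors, n_groups=168):
    """Cluster training descriptors into N prototype groups (the vocabulary)."""
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1e-3)
    _, _, centers = cv2.kmeans(descriptors.astype(np.float32), n_groups, None,
                               criteria, 3, cv2.KMEANS_PP_CENTERS)
    return centers

def occurrence_histogram(descriptors, centers):
    """Assign each descriptor to its nearest prototype and count occurrences."""
    d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    labels = np.argmin(d, axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-9)

def signatures_match(hist_query, hist_reference, threshold=0.8):
    """Correlate the query histogram with the stored reference signature."""
    return float(np.dot(hist_query, hist_reference)) >= threshold
```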
- object recognition techniques in the following can be adapted for identifying products in a store's inventory:
- FIG. 15 shows artwork for an encoded object (e.g., a retail package, label or product hang tag), here representing one face of a retail package.
- the artwork includes castles, sundial, shields, knight/horse, scenery, etc.
- the text includes “VALIANT”, “For the courage to get deep down clean”, “ICON Label”, etc.
- The artwork also includes a 1D barcode and a 2D barcode, and may include additional or different printed features and graphics.
- The artwork depicted in FIG. 15 is for illustrative purposes and should not limit the following discussion.
- the illustrated grid-like pattern (creating grid cells) virtually represents different encoding areas. That is, a grid would not typically be printed on a retail package, but is shown in FIG. 15 to help the reader visualize examples of multiple encoding areas.
- encoding regions need not be rectangular in shape.
- Machine-readable data may be redundantly encoded within two-dimensional spatial areas (e.g., within some or all of the grid cells) across an image to create an enhanced or transformed image with an auxiliary data signal.
- the encoding can be applied to an object during printing or labeling with commercial presses, or directly by applying encoding after artwork, text and barcodes have been laid down, with ink jet, laser marking, embossing, photographic, or other marking technology. Redundant marking is particularly useful for automatic identification of objects, as it is able to be merged with other imagery (instead of occupying dedicated spatial area like conventional codes) and enables reliable and efficient optical reading of the machine readable data from various different views of the object.
- the encoding comprises digital watermarking (or a “digital watermark”).
- Digital watermarking refers to an encoded signal that carries a machine-readable (or decodable) code.
- digital watermarking is designed to be less visually perceptible to a human viewer relative to an overt symbology such as a visible 1D or 2D barcode or QR code.
- the following patent documents describe many suitable examples of digital watermarking, e.g., U.S. Pat. Nos. 6,102,403, 6,614,914, 9,117,268, 9,245,308 and 9,380,186, and US Publication Nos. 20160217547 and 20160275639, which are each hereby incorporated by reference in its entirety. The artisan will be familiar with others.
- the retail package includes an icon 550 .
- An icon may include, e.g., a logo, shape, graphic design, symbol, etc.
- Icon 550 typically does not include a machine-readable signal encoded therein.
- the icon 550 may include associated text and/or be differently shaped than illustrated. That is, it need not be a hexagon, nor need it be internally grey-stippled.
- Icon 550 may be used as an indicator of information associated with the retail package, its contents or both.
- icon 550 may be shaped and colored like a peanut to indicate a potential allergy or associated allergy information. In other cases icon 550 may be used as an age restriction indicator.
- the icon may be a particularly stylized “R”, perhaps placed within a colored shape (e.g., box), which can be used to indicate a suitability (or not) for children.
- icon 550 includes a so-called SmartLabel label.
- SmartLabel was a collaborative effort to standardize a digital label format which consumers can use to access product information using their smartphones.
- the SmartLabel is typically associated with a visible QR code.
- the QR code is read (but not the icon) by a smartphone to access product information, e.g., nutrition, ingredients, allergens, in a consistent format.
- the SmartLabel label itself is used more as a visual cue to a shopper or consumer that related product information exists online. But, real estate on a product package is often limited.
- Branding information, graphics, nutrition information, 1D barcode, QR codes, etc. can take up a lot of space.
- a QR code or other visible symbology can be an eyesore.
- Image data 500 is captured by a camera or other image sensor.
- a smartphone camera captures image data representing some or all of a product package (e.g., the package face shown in FIG. 15 ).
- a suitable smartphone is discussed below relative to FIG. 19 .
- a smartphone may represent captured image data in various ways.
- a smartphone camera may output captured image data in RGB, RGBA or Yuv format.
- image data 500 can be variously represented.
- image data 500 represents a cropped version of an image frame. For example, if image data includes 911×512 pixels, the center 400×400 pixels can be used. One purpose of cropping is to focus in on a center of the frame, which is likely the target of a captured image. In some other embodiments, image data 500 represents a filtered or processed version of captured image data.
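- A trivial center-crop sketch along these lines (treating the frame as a height-by-width pixel array; the 512×911 ordering is an assumption):

```python
import numpy as np

def center_crop(frame, out_h=400, out_w=400):
    """Return the central out_h x out_w region of a frame (H x W or H x W x C)."""
    h, w = frame.shape[:2]
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return frame[top:top + out_h, left:left + out_w]

frame = np.zeros((512, 911, 3), dtype=np.uint8)  # illustrative frame size
print(center_crop(frame).shape)                  # (400, 400, 3)
```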
- Image data 500 is processed by a Signal Decoder 502 , which may include, e.g., a barcode decoder, and/or an encoded signal decoder.
- One example of an encoded signal decoder is a digital watermark decoder.
- Image data 500 may represent a frame of imagery, portions of a frame, or streaming imagery, e.g., multiple frames.
- Signal Decoder 502 analyzes the image data 500 in search of an encoded signal, e.g., which carries a code, message or payload. For example, if the image data 500 includes digital watermarking encoded therein, the Signal Decoder 502 attempts to decode 504 the digital watermarking to obtain the code, message or payload.
- the code, message or payload includes a GTIN number, or other product identifier such as a UPC number. If no signal is successfully decoded, Signal Decoder 502 preferably moves on to analyze other image data, e.g., another image frame(s) or another image portion. In some cases, the Signal Decoder 502 may output (or set a flag representing) a message, e.g., “No Detect” or “no signal found”, or the like.
- Icon Detector 506 operates to detect 508 an icon, e.g., icon 550 ( FIG. 15 ). We sometimes use the phrase “target icon” to mean a particular icon that is to be detected or a reference icon from which templates are determined. If an icon is not detected (but the encoded signal was), a first response is presented (e.g., “Response 1” in FIG. 16A ). If an icon is detected (along with the encoded signal), a second response is presented (e.g., “Response 2”). Icon Detector 506 may be configured to search the same image data 500 for the icon.
- Icon Detector 506 is configured to detect icon 550 within a predetermined number of image frames (e.g., 2-5 frames) relative to the encoded signal decode, or within a certain time frame (e.g., within 1 second or less). In still further cases, if an encoded signal is detected then only the icon detector runs for the next, e.g., n number of frames (e.g., 2-6 frames). In still other implementations, Signal Decoder 502 and Icon Detector 506 switch order of operations.
- a target icon is searched for first and, only upon a successful icon detection, then is an encoded signal searched for.
- This alternative process is shown with respect to FIG. 16B . If an icon is detected (but the encoded signal was not), a first response is presented (e.g., "Response 1"). If an icon is detected (along with the encoded signal), a second response is presented (e.g., "Response 2"). Besides the order of operation, the technology shown in FIGS. 16A and 16B is the same.
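- A schematic sketch of the FIG. 16A ordering (signal decode first, then icon detection); decode_signal() and detect_icon() are placeholders standing in for Signal Decoder 502 and Icon Detector 506, which are not implemented here:

```python
def process_frame(image_data, decode_signal, detect_icon):
    """Return ("Response 1" | "Response 2" | None, payload) for one frame.

    decode_signal(image) -> payload or None   (stands in for Signal Decoder 502)
    detect_icon(image)   -> bool              (stands in for Icon Detector 506)
    """
    payload = decode_signal(image_data)
    if payload is None:
        return None, None              # "No Detect": move on to the next frame
    if detect_icon(image_data):
        return "Response 2", payload   # encoded signal + icon detected
    return "Response 1", payload       # encoded signal only
```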
- a localized encoded signal search is carried out.
- an encoded signal is placed in or around a localized spatial area relative to an icon.
- the encoded signal surrounds an icon, e.g., icon 550 .
- the encoded signal can be provided in an N×M rectangular area, where N and M are expressed in measurement units such as inches, dots per inch, centimeters, etc.
- the encoded signal can be redundantly provided in this N×M area, e.g., in a tiled-like manner.
- the icon will not include any encoding within its area, whereas in other cases the encoded signal will be provided within or on the icon.
- N corresponds to 1/300 inch to 4 inches
- M corresponds to 1/300 to 4 inches.
- a signal decoder can initiate decoding of an area engulfing, surrounding or neighboring the detected icon.
- the signal decoder can analyze image data within the N×M area.
- the encoded area is not limited to a rectangle.
- a signal can be encoded within any number of areas including, e.g., the cloud shown in FIG. 25B .
- An image mask or layer can be used to confine the encoding to an area engulfing, surrounding or neighboring an icon.
- the icon is surrounded or neighbored by an area having 1/300 inch to 4 inches on all sides.
- some icons may be designed so that they, themselves, can host encoded signals.
- the dashed lines in FIG. 25C represent a signal encoded within an icon, e.g., icon 580 .
- the encoding may be a relatively sparse signaling technology such as discussed in our US Published Patent Application Nos. US 2016-0275639 A1 and US 2017-0024840 A1, which are each hereby incorporated herein by reference in its entirety.
- a line contour change for example, a line contour change, line width modulation (LWM), Line Continuity Modulation (LCM), Line Angle Modulation (LAM), Line Frequency Modulation (LFM), Line Thickness Modulation (LTM), or a combination of these technologies can be used, e.g., as described in assignee's US Patent Application No. US 2016-0189326 A1, which is hereby incorporated herein by reference in its entirety.
- An LWM or LTM technique is shown by reference no. 602 , with a line contour change shown by reference no. 604 .
- image data surrounding, corresponding to, neighboring or engulfing the icon can be analyzed to decode an encoded signal.
- A window (or other defined area of imagery) around (and/or including) the detected icon is searched.
- the window can be expanded if an initial analysis does not decode an encoded signal.
- the window may include 1/300 to 2 inches around the icon. If a signal is not decoded, the area can be expanded from 2-4 inches.
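- A sketch of this expanding-window search around a detected icon, with inches converted to pixels via an assumed capture/print resolution; try_decode() is a placeholder for the encoded signal decoder:

```python
def decode_near_icon(image, icon_box, try_decode, dpi=300,
                     margins_inches=(2.0, 4.0)):
    """Search a window around the detected icon, widening it if the first
    attempt fails.

    image:     numpy array (H x W [x C]) of pixel data.
    icon_box:  (x, y, w, h) of the detected icon in pixels.
    try_decode(image_region) -> payload or None (placeholder decoder).
    """
    x, y, w, h = icon_box
    H, W = image.shape[:2]
    for inches in margins_inches:      # e.g., roughly 2 inches, then 4 inches
        m = int(inches * dpi)
        region = image[max(0, y - m):min(H, y + h + m),
                       max(0, x - m):min(W, x + w + m)]
        payload = try_decode(region)
        if payload is not None:
            return payload
    return None
```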
- The FIG. 16A and FIG. 16B processes may operate on a smartphone, e.g., as depicted in FIG. 19 .
- a smartphone may, at times, be concurrently (or serially) executing multiple different image and/or audio signal processing operations.
- Data from an image pipeline (e.g., providing image data collected by a camera) may be analyzed for encoded signals, barcodes and icons; the pipeline data may also be analyzed for optical character recognition and/or image recognition. Prioritizing these different operations and their corresponding output (e.g., decoded identifiers, detection indications and/or corresponding responses) can be tricky.
- One approach sets a predetermined time or frame count before providing a response (e.g., a UI indication of a successful read). For example, if a 1D barcode is detected at time 0 seconds, then a response will not be provided until x seconds (or milliseconds) from time 0 seconds. Image signal processing analysis continues during this time frame to determine whether any other codes, icons, characters or image features can be decoded, detected or recognized. If more than one (1) code is detected or decoded then a prioritization can be consulted. For example, it might be determined that an icon takes precedence over all other codes or symbols, so only information associated with a successful icon detection is presented.
- a QR 2-D barcode is ranked highest, so only a response associated with the QR code is provided.
- a prioritization may indicate which response to display first, second, third and so on.
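- One way to sketch the time-window and priority approach above; the priority ordering, window length and detection-stream interface are illustrative assumptions rather than a prescribed design:

```python
import time

# Illustrative priority ranking (lower number = higher priority).
PRIORITY = {"icon": 0, "qr": 1, "watermark": 2, "barcode_1d": 3}

def collect_and_prioritize(detection_stream, window_s=0.5):
    """detection_stream yields (kind, payload) tuples as recognizers finish;
    it is assumed to terminate or keep yielding so the loop can exit.
    Waits window_s seconds after the first detection, then reports only the
    highest-priority detection observed in that window."""
    first_time, seen = None, []
    for kind, payload in detection_stream:
        now = time.monotonic()
        if first_time is None:
            first_time = now
        seen.append((kind, payload))
        if now - first_time >= window_s:
            break
    if not seen:
        return None
    return min(seen, key=lambda kp: PRIORITY.get(kp[0], 99))
```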
- a retail package includes an encoded signal redundantly provided over its surface.
- the package may include redundant instances of digital watermarking carrying a GTIN number in each of the grid cells (or a subset of the grid cells).
- the encoding e.g., digital watermarking
- the package also includes an icon 550 , which indicates the presence of additional information associated with the package or package contents, e.g., online information. Icon 550 may even be located near a nutrition text box printed on the package (text box not shown in FIG. 15 ).
- a smartphone camera captures image data representing a portion of the package which includes both i) the encoded signal, and ii) icon 550 .
- the image data is provided to the process detailed in FIG. 16A .
- the encoded signal is decoded along with icon 550 being detected, triggering a certain response (e.g., “Response 2” in FIG. 16A ).
- the certain responses can cause the smartphone to provide, e.g., access to the additional information.
- the networks, data stores and cloud-based routing described in assignee's U.S. Pat. No. 8,990,638, which is hereby incorporated herein by reference in its entirety, can be used to provide access to the additional information.
- a remote database includes a response table or database.
- (The table or database may include multiple responses per encoded signal identifier. If the identifier is received without an icon detection indication, then a Response 1 is provided. But, if the identifier is received with an icon detection indication, then a Response 2 is provided.)
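- The two-response routing can be sketched as a simple lookup keyed on the decoded identifier plus an icon-detection flag; the table contents are placeholders:

```python
# Placeholder response table: identifier -> (response without icon, response with icon)
RESPONSE_TABLE = {
    "00614141123452": ("show product page for GTIN",
                       "open additional online product information"),
}

def route_response(identifier, icon_detected):
    """Return Response 1 or Response 2 for a decoded identifier."""
    entry = RESPONSE_TABLE.get(identifier)
    if entry is None:
        return None
    return entry[1] if icon_detected else entry[0]

print(route_response("00614141123452", icon_detected=True))
```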
- In some cases, the certain response is limited to access to the additional information.
- Although the encoded signal may carry a certain payload like a GTIN, such information preferably is not provided for user or application access.
- That is, the response (e.g., "Response 2") is limited to providing access to the additional information, and not, e.g., the GTIN itself.
- In a second embodiment, again relative to the package example in FIG. 15 , a retail package includes an encoded signal redundantly provided over its surface.
- the package may include redundant instances of digital watermarking carrying a GTIN number in each of the grid cells (or a subset of the grid cells).
- the encoding e.g., digital watermarking
- the package also includes an icon 550 , which indicates the presence of additional information associated with the package or package contents, e.g., online information. Icon 550 may even be located near a nutrition text box printed on the package (not shown in FIG. 15 ).
- a smartphone camera captures image data representing a portion of the package which includes i) the encoded signal, but not ii) icon 550 .
- the image data is provided to the process detailed in FIG. 16A .
- the encoded signal is decoded but icon 550 is not detected, triggering a certain response (e.g., “Response 1” in FIG. 16A ). Since icon 550 is not detected it can be assumed that there is not a current interest in the additional information. Therefore, the response may include providing access to the GTIN information, or product information associated with the GTIN.
- Signal Decoder 502 can include, e.g., a digital watermark decoder such as disclosed in U.S. Pat. Nos. 6,102,403, 6,614,914, 9,117,268, 9,245,308 and/or 9,380,186, and US Publication Nos. 20160217547 and/or 20160275639, which are each hereby incorporated by reference in its entirety.
- Signal Decoder 502 may include, e.g., a 1D or 2D barcode decoder.
- A 1D and 2D barcode detector is ZXing ("Zebra Crossing"), which is an open-source, multi-format 1D/2D barcode image processing library implemented in Java, with ports to other languages, currently found at https://github.com/zxing/zxing.
- Various implementations of Icon Detector 506 are discussed further with reference to FIGS. 17A-17C .
- Image Data 500 is provided so that potential icon candidates can be identified 520 .
- 520 may identify many different image areas with characteristics that may be associated with icon 550 .
- Identified candidates are passed on for processing 530 to determine whether they represent an icon, e.g., icon 550 in FIG. 15 .
- Image data 500 can be filtered 520 for smoothing or to remove noise.
- a bilateral filter can be employed to remove noise from the image data 500 .
- a bilateral filter may be viewed, e.g., as a weighted average of pixels, which takes into account the variation of pixel intensities to preserve edges. See, e.g., Paris, et al., “A gentle introduction to bilateral filtering and its applications,” Proceedings of SIGGRAPH '08 ACM SIGGRAPH, article no. 1, 2008-08-11, which is hereby incorporated herein by reference.
- Edge detection 521 can be performed on the filtered image data.
- the Canny edge detector can be used. See, e.g., J. Canny (1986) "A computational approach to edge detection", IEEE Trans. Pattern Analysis and Machine Intelligence, 8(6), pp. 679-698, which is hereby incorporated herein by reference.
- the Canny-Deriche detector is another filter that could be used. See, e.g., R. Deriche (1987) Using Canny's criteria to derive an optimal edge detector recursively implemented, Int. J. Computer Vision, vol. 1, pages 167-187, which is hereby incorporated herein by reference. Or the Log Gabor filter could be used instead of or in combination with the above mentioned filters.
- For contours 522 identified by the edge detector 521 , it can be determined whether various criteria are met. The criteria can be determined based on the physical properties of icon 550 . For example, consider an icon that is somewhat hexagonal in shape. The criteria for such an icon may include whether a contour is, e.g., a "closed contour" 523 , has a pixel size or area within predetermined limits 524 (e.g., to weed out too large and too small of areas), is convex 525 , and has the correct number of sides (e.g., at least 6 if looking for a hexagonal shaped icon, or at least n sides if looking for an n-sided polygon) 526 .
- Contours (or a subset of those meeting predetermined criteria, e.g., exactly 6 sides, within a certain size, etc.) meeting these criteria ( 523 , 524 , 525 and/or 526 ) can be passed to a second stage for further analysis or identified as candidate contours 528 . Otherwise, contours not meeting these criteria can be discarded 527 . Of course, not all of the criteria need to be met. For example, candidate contours can be identified based on successfully meeting 3 out of the 4 criteria.
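- A sketch of this candidate screening using OpenCV contour utilities; the area limits, polygon-approximation tolerance and side count are illustrative values:

```python
import cv2
import numpy as np

def candidate_contours(edge_image, min_area=200, max_area=50000, n_sides=6,
                       eps_frac=0.02):
    """Screen contours from a binary (uint8) edge image for n-sided, convex,
    closed shapes of plausible size; all thresholds are illustrative."""
    contours, _ = cv2.findContours(edge_image, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x signature
    candidates = []
    for c in contours:
        area = cv2.contourArea(c)
        if not (min_area <= area <= max_area):    # size criterion (524)
            continue
        peri = cv2.arcLength(c, True)             # treat contour as closed (523)
        approx = cv2.approxPolyDP(c, eps_frac * peri, True)
        if len(approx) < n_sides:                 # side-count criterion (526)
            continue
        if not cv2.isContourConvex(approx):       # convexity criterion (525)
            continue
        candidates.append(approx)
    return candidates
```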
- Determined candidate contour(s) can be analyzed in a second stage ( FIG. 17C ) to determine whether they correspond to icon 550 .
- One implementation uses a template-based approach to determine whether a candidate contour (e.g., including image data enclosed within the candidate contour) matches a template based on icon 550 .
- An area associated with the candidate contour can be assessed.
- a minimum bounding box can be drawn around the candidate contour.
- a minimum bounding box can be generated in software, e.g., using various scripts in MatLab from MathWorks (e.g., minBoundingBox(X), which computes the minimum bounding box of a set of 2D points, and where the input includes [x,y] coordinates corresponding to points on a candidate contour).
- An example open source MatLab bounding box script is shown for minBoundingBox(X) in FIGS. 18A and 18B .
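- For readers working in Python, an analogous minimum bounding box can be obtained with OpenCV's rotated-rectangle routine; this is offered only as a rough equivalent of the referenced MatLab script:

```python
import cv2
import numpy as np

def min_bounding_box(points):
    """points: (N, 2) array-like of [x, y] coordinates on a candidate contour.
    Returns the minimum-area rotated rectangle (center, (w, h), angle in
    degrees) and its 4 corner points."""
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 1, 2)
    rect = cv2.minAreaRect(pts)
    corners = cv2.boxPoints(rect)
    return rect, corners

# Illustrative use on a rough hexagon.
hexagon = [[50, 10], [90, 30], [90, 70], [50, 90], [10, 70], [10, 30]]
rect, corners = min_bounding_box(hexagon)
print(rect)
```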
- the minimum bounding box helps facilitate re-orientation 532 of the candidate contour to resolve image rotation and scale.
- the bounding box (and its image contents) can be rotated such that one of its edges is horizontal to an image plane.
- the image data within the candidate contour can be resized, e.g., according to the sizing of previously stored templates.
- the candidate contour (e.g., including image content represented within the contour) may be binarized 533 , e.g., if later stage matching templates are provided in binary form.
- In template correlation 534 , a correlation is determined between the processed candidate contour and the matching template(s). Since we propose using a minimum bounding box, and since at least one edge of that box is preferably reoriented to a horizontal line, we suggest using four (4) templates per candidate contour (one representing 0° rotation, one representing 90° rotation, one representing 180° rotation, and one representing 270° rotation). Using four (4) templates is useful since the potential icon could be variously oriented within the minimum bounding box.
- One of the four different rotation angles should be a good approximation, e.g., due to bounding box re-orientation 532 .
- Additional templates at additional angles can be used, but at an efficiency cost.
- the templates are based on a target icon (e.g., icon 550 ) and can be binarized to cut back on processing time.
- the template and the candidate contour are compared on a pixel-by-pixel basis.
- a multiplication (or AND) operation can be carried out for each template pixel and its corresponding candidate pixel. For example, if the template pixel value is a binary 1 but the candidate contour pixel value is a 0, then the resulting operation yields a 0.
- If the template pixel value is a binary 1 and the candidate contour pixel value is a 1, then the resulting operation yields a 1.
- the value of pixel operations can be summed, yielding a result. A higher value can be used to indicate a close match.
- the results can be normalized 535 to aid in determining a match 538 .
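- A sketch of the four-orientation, pixel-wise comparison described in 534 - 535 : the candidate block (already re-oriented, resized and binarized) is scored against the target-icon template at 0°, 90°, 180° and 270°, and the best normalized score is kept. The normalization used here (fraction of "on" template pixels matched) is one simple choice; Pearson correlation, discussed next, is another:

```python
import numpy as np

def normalized_match(candidate, template):
    """Pixel-wise AND of two same-sized 0/1 blocks, normalized by the number
    of 'on' template pixels (1.0 = every template pixel matched)."""
    on = template.sum()
    return float((candidate & template).sum()) / max(on, 1)

def best_rotation_score(candidate, template):
    """Try the template at 0, 90, 180 and 270 degrees and keep the best score."""
    return max(normalized_match(candidate, np.rot90(template, k))
               for k in range(4))

# Illustrative use with square binary blocks of matching size.
rng = np.random.default_rng(1)
template = (rng.random((64, 64)) > 0.5).astype(np.uint8)
candidate = np.rot90(template, 3)                  # same content, rotated 270 degrees
print(best_rotation_score(candidate, template))    # approximately 1.0
```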
- In another implementation, a correlation coefficient is used, e.g., Pearson's correlation coefficient (r). For monochrome images, image 1 and image 2, the Pearson correlation coefficient is defined as r = Σ_i (x_i − x_m)(y_i − y_m) / sqrt( Σ_i (x_i − x_m)^2 · Σ_i (y_i − y_m)^2 ), where x_i and y_i are the intensities of the i-th pixel in image 1 and image 2, respectively, and x_m and y_m are the corresponding mean intensities.
- correlation results can be optionally normalized 535 to determine whether the candidate contour matches 538 the icon.
- Image data 500 is obtained from a portable device, e.g., a smartphone, such as discussed below in FIG. 19 .
- Other representations of the image data could alternatively be used.
- the image data 500 is filtered 520 , e.g., using a bilateral filter. Such a filter preferably preserves edges while smoothing (or removing noise from the image data 500 ).
- Edge Detection is carried out at 521 , e.g., using a Canny edge detector or other edge detector as discussed above.
- the output of the edge detector is preferably a binary image 540 representing the edges in image data 500 .
- Contours within the binary edge image are identified in 542 .
- so-called blob detection can be used.
- A "connected component labeling" process, e.g., may initially label pixels (e.g., assign a value to each pixel). For example, all pixels that are connected to each other can be given the same value or linked together (e.g., a linked list of pixels).
- Pixels can be clustered based on their connectivity to other pixels (or based on assigned values). Such clusters can be used as (or as a proxy for) contours. Once contours are identified, they can be refined 544 to determine whether they are suitable candidates for further analysis.
- FIG. 20B explores an embodiment of the contour refinement 544 .
- one or more of the contours are approximated with certain precision 545 .
- This 545 process can be substituted for process 542 in FIG. 20A .
- Given a contour, the number of points representing the contour is reduced. In one example, the number of points is reduced such that straight lines between the points yield a suitable approximation of the contour. Suitable in this example means that the fit error (or distance error) between a contour segment and its representative straight line falls within a predetermined threshold. In another example, a predetermined number of points are used to represent the contour. It is then determined whether the contour is a closed contour 546 . If not, the process stops for that particular contour, and a next contour, if available, is analyzed. Of course, this feature 546 can be integrated into the feature 545 .
- If the contour is closed, it is further evaluated in 547 . There, it is determined whether the closed contour: i) has at least n-number of sides, where n is an integer, ii) has an area above a minimum threshold area, and iii) is convex. (Instead of having each of these three criteria resulting in a single decision, they can be broken into 2 or 3 individual decisions.) If all of these criteria are met, flow continues to 548 . If not, that particular closed contour is discarded.
- a minimum bounding box is calculated around the closed contour, e.g., as discussed above with reference to FIG. 17B , item 531 .
- the minimum bounding box can then be evaluated 549 , e.g., to determine whether its aspect ratio is within a certain range. For example, since a square has equal sides, its aspect ratio is 1.
- a 4:3 rectangle on the other hand, has an aspect ratio of 1.33 (4/3).
- a suitable aspect ratio range can be established, e.g., based on a particular icon for evaluation. By way of example, for a SmartLabel icon, we prefer an aspect ratio of 0.4-2.5. If the bounding box aspect ratio is not within a predetermined range, the closed contour is not a candidate. If it is within the predetermined range, the contour is identified as a potential candidate contour.
- a set of candidate contours is determined or obtained, e.g., by one or more of the processes discussed with reference to FIG. 17A, 17B, 20A or 20B .
- the order of which to evaluate candidates within the set of candidates can be determined, e.g., based on a first in—first out process or first in—last out process.
- the aspect ratio determined in FIG. 20B , item 549 can be used to rank candidate contours. For example, if a target icon has an aspect ratio near 1, candidate contours can be ranked according to their determined aspect ratios, with the closest aspect ratio to 1 being evaluated first, and the second closest being evaluated next, and then so on.
- the candidate contours are ranked according to their minimum bounding box area (or an area calculated for the closed contour), with the largest area first, and the smallest area last.
- an angle of rotation is found 560 for the minimum bounding box found in 548 .
- a portion of image data 500 is extracted or obtained 561 that corresponds to the area bounded by the minimum bounding box.
- the corresponding pixels that are within the area (e.g., the corresponding spatial locations) identified by the minimum bounding box are obtained for further evaluation.
- image data 500 after filtering by 520 is obtained or extracted which corresponds to the area (e.g., the corresponding spatial locations) of the minimum bounding box.
- the extracted or obtained image data (or filtered image data) is then oriented 562 (e.g., rotated) according to the rotation angle identified in 560 .
- This orientation process helps the icon matching be more rotation invariant relative to an un-rotated block.
- the block can then be resized 563 to match or approximate the size of the template(s).
- the image content within the block is then binarized 564 , e.g., using Otsu's thresholding. See Nobuyuki Otsu (1979), “A threshold selection method from gray-level histograms,” IEEE Trans. Sys., Man., Cyber. 9 (1): 62-66, which is hereby incorporated herein by reference.
- Otsu's thresholding assumes that an image contains two classes of pixels following a bi-modal histogram (e.g., foreground pixels and background pixels), it then calculates an optimum threshold separating the two classes so that their combined spread (e.g., intra-class variance) is minimal, or equivalently (e.g., because the sum of pairwise squared distances is constant), so that their inter-class variance is maximal.
- feature 564 could be combined with the resizing process in 563 .
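- A brief sketch of this combined resize-and-binarize step using OpenCV's built-in Otsu thresholding; the template size is a placeholder that should match whatever templates are in use:

```python
import cv2

def binarize_block(block, template_size=(64, 64)):
    """Resize a grayscale (uint8) block to the template size and binarize it
    with Otsu's method, which picks the threshold from the histogram."""
    resized = cv2.resize(block, template_size, interpolation=cv2.INTER_AREA)
    _, binary = cv2.threshold(resized, 0, 1,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary  # values 0/1, ready for pixel-wise template comparison
```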
- Objects within the binarized block can be evaluated in 565 . For example, an area associated with each object can be determined. With reference to FIG. 22A, 4 objects 570 , 571 , 572 , 573 are associated with a particular binarized block. A threshold area can be set, e.g., either to discard objects with an associated area that is either too large or too small. For example, in the case of a SmartLabel icon, objects with an area larger than, e.g., a value between 12-25% of the binarized block, can be discarded. So, in this particular example, object 570 is the only object in FIG. 22A that needs to be discarded, with the remaining objects shown in FIG. 22B .
- An alternative evaluation technique looks for an expected pattern. For example, virtual lines can be drawn (or pixels along a virtual line can be evaluated) through a block. A pattern or ratio of on and off pixels along the line(s) can be evaluated to determine whether it meets a threshold level, pattern or ratio. For example, the left and right dashed lines in FIG. 22C only cross through objects 571 and 572 , but not object 573 . The middle dashed line crosses through all three objects 571 , 572 and 573 . The middle line is likely to meet the predetermined on/off pixel threshold, pattern or ratio for this particular example, while the left and right lines would not. (This same pattern or ratio process could be used as an initial filter, e.g., after filtering 520 or edge detection 521 , to do a rough check whether an expected pattern or ratio associated with an icon is present in the image data 500 .)
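- The area screen and the virtual-line check can be sketched with connected components and a column sample; the 12-25% bound is reflected as an illustrative 20% cutoff, and the line position and on/off limits are assumptions:

```python
import cv2
import numpy as np

def drop_large_objects(binary_block, max_area_frac=0.20):
    """Remove connected components whose area exceeds a fraction of the block."""
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary_block.astype(np.uint8))
    keep = np.zeros_like(binary_block)
    limit = max_area_frac * binary_block.size
    for i in range(1, n):                       # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] <= limit:
            keep[labels == i] = 1
    return keep

def line_pattern_ok(binary_block, col_frac=0.5, min_on_frac=0.15, max_on_frac=0.85):
    """Sample pixels along a vertical virtual line and check that the on/off
    ratio falls within an expected range."""
    col = int(col_frac * binary_block.shape[1])
    on_frac = binary_block[:, col].mean()
    return min_on_frac <= on_frac <= max_on_frac
```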
- Template matching 566 is carried out for the processed block.
- the template correlation and normalizing processes discussed above with respect to 534 and 535 can be carried out. If a normalized correlation value is higher than a predetermined threshold 567 , the candidate contour is accepted as a match to the target icon. If not, the candidate contour is not a match. Additional candidate contours can be evaluated according to the FIG. 20C processes if no match is found. And, unless multiple icons are being searched for, the processes need not evaluate additional candidates once an icon match is found.
- image data 500 can be resized at different scales and then evaluated according to 564 - 567 .
- FIG. 20D Another embodiment of how to determine whether a candidate contour is a match with a particular icon is discussed with reference to FIG. 20D .
- Image processing flow proceeds through operations 560 - 565 as discussed above with respect to FIG. 20C .
- a subset of remaining objects to retain is determined at 568 .
- a resized block (after 563 ) is shown in FIG. 23A .
- the block includes objects 580 , 581 , 582 and 583 .
- Binarization 564 and Evaluation 565 may yield the remaining objects shown in FIG. 23B , including objects 585 .
- These objects 585 may be, e.g., binarization artifacts associated with corners or other object structures. It would be good to remove these objects prior to template correlation.
- a subset of remaining objects to retain is determined, e.g., by only keeping the largest sized n number of objects, where n is an integer. For example, and again with reference to FIG. 23B , if we are looking for a target icon including objects 581 , 582 and 583 (but not objects 585 ) then we can prune the number of objects to the 3 largest remaining objects ( 581 , 582 , 583 ).
- the term “sized” (or size) in this context can be determined by, e.g., an object's spatial area or by an object's length of perimeter.
- the remaining objects are shown in FIG. 23C . (Items 581 and 582 are drawn with cross-hatching. This is intended to represent that these objects could either be dark or light objects, or a combination of such.)
- the integer n can be increased or decreased depending on the number of objects expected in a target icon.
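- A sketch of this pruning step 568, keeping the n largest objects in the binarized block, with "size" taken here as connected-component pixel area (perimeter length would work similarly):

```python
import cv2
import numpy as np

def keep_n_largest_objects(binary_block, n=3):
    """Retain only the n largest connected components of a 0/1 block."""
    count, labels, stats, _ = cv2.connectedComponentsWithStats(binary_block.astype(np.uint8))
    if count <= 1:
        return np.zeros_like(binary_block)        # nothing but background
    areas = stats[1:, cv2.CC_STAT_AREA]           # skip background label 0
    keep_labels = 1 + np.argsort(areas)[::-1][:n] # labels of the n largest
    return np.isin(labels, keep_labels).astype(np.uint8)
```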
- It is then determined 569 whether m of the n number of remaining objects are convex, where m and n are each integers.
- convex implies that any tangent to a shape will result in the object's interior only being on one side of the tangent, e.g., as shown in FIG. 24A .
- a concave shape in contrast, would have a potential tangent resulting in portions of the shape falling on both sides of the tangent line, e.g., as shown in FIG. 24B .
- It should be noted, however, that if an icon included a concave shape, we could alternatively determine whether m of the n number of remaining objects were concave.
- Another embodiment of how to determine whether a candidate contour is a match with a particular icon is discussed with reference to FIG. 20E , where shape matching utilizing so-called "image moments" is employed.
- Image processing flow proceeds through operations 560 - 561 as discussed above with respect to FIG. 20C . Omitted, however, are operations 562 and 563 relative to FIG. 20C . This is because an image moment shape matching operation typically extracts rotationally and scale invariant candidate features from an image portion. Flow moves on to operations 590 and 591 , which are essentially the same as operations 564 and 565 in FIG. 20C , respectively. Different reference numbers are used in FIG. 20E vs. FIG. 20C since the term "image portion" is used in FIG. 20E instead of "block" as in FIG. 20C . The two terms can be used interchangeably, however, since they both represent image data from a certain spatial image area.
- image moments of shapes from the binarized, evaluated image portion are compared to image moments of one or more shapes in a target icon.
- For example, and with reference to FIG. 23C , three (3) shapes 581 , 582 and 583 are intended to be matched in a target icon. Image moments for each of these shapes can be determined and stored as references. Then, moments from an image portion can be determined and compared against the references. The comparisons can be normalized and then compared against a predetermined threshold. If the normalized comparison exceeds the threshold (or falls below it, if a perfect match corresponds to zero (0)), then the icon matches the target icon. If not, no icon is detected.
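- One way to sketch such an image-moment comparison is OpenCV's Hu-moment shape distance, which is rotation- and scale-invariant and returns 0 for identical shapes; the pairing of candidate objects to reference shapes and the distance threshold are illustrative assumptions:

```python
import cv2
import numpy as np

def icon_shapes_match(candidate_objects, reference_objects, max_distance=0.2):
    """Compare each candidate object's outline against the corresponding
    reference shape using Hu-moment distances (0 means identical shape).
    Each object is a 0/1 mask; objects are assumed to be in matching order."""
    if len(candidate_objects) != len(reference_objects):
        return False
    for cand, ref in zip(candidate_objects, reference_objects):
        c_cnt, _ = cv2.findContours(cand.astype(np.uint8), cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)
        r_cnt, _ = cv2.findContours(ref.astype(np.uint8), cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)
        if not c_cnt or not r_cnt:
            return False
        d = cv2.matchShapes(c_cnt[0], r_cnt[0], cv2.CONTOURS_MATCH_I1, 0)
        if d > max_distance:
            return False
    return True
```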
- Another check can be added to the processes discussed above with respect to FIGS. 17A-17C and FIGS. 20A-20D .
- For example, one of the expected objects may be a circularly shaped object, e.g., item 573 or 583 .
- For an ideal circle, this ratio should be equal to 1.
- an icon may include a machine-readable code encoded therein or there around. Detection of the machine-readable code triggers a response associated with the icon. In this example, instead of detection of the icon+encoded signal, the detection of the machine-readable code, alone, triggers the response associated with the icon. As a further alternative, detection of the machine-readable code+an encoded signal triggers the response associate with the icon.
- the encoded signal includes a plural-bit payload.
- the plural-bit payload has at least one bit (e.g., a “trigger bit”) that can be set to indicate the presence of information associated with an icon or with a package.
- the remaining portion of the payload may include, e.g., a GTIN or UPC number.
- Upon a successful decode of a payload including a trigger bit, a signal decoder provides access to (or indicates to a software app to provide access to) information associated with the icon.
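- A sketch of such trigger-bit handling; the payload layout (which bit is the trigger, where the identifier digits sit) is purely illustrative, since no particular format is fixed by the text:

```python
def parse_payload(bits):
    """bits: sequence of 0/1 values from a successful decode.

    Illustrative layout: bit 0 is the 'trigger bit'; the remaining bits carry
    a numeric identifier (e.g., a GTIN) in binary.
    """
    trigger = bool(bits[0])
    identifier = int("".join(str(b) for b in bits[1:]), 2)
    return trigger, identifier

def handle_decode(bits, open_icon_info, show_product_info):
    """Route a decoded payload: trigger bit set -> icon-associated info."""
    trigger, identifier = parse_payload(bits)
    if trigger:
        open_icon_info(identifier)      # access information associated with the icon
    else:
        show_product_info(identifier)   # ordinary GTIN/product response
```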
- The term "module" may refer to software, firmware and/or circuitry configured to perform any of the methods, processes, algorithms, functions or operations described herein.
- Software may be embodied as a software package, code, instructions, instruction sets or data recorded on non-transitory computer readable storage mediums.
- Software instructions for implementing the detailed functionality can be authored by artisans without undue experimentation from the descriptions provided herein, e.g., written in C, C++, MatLab, Visual Basic, Java, Python, Tcl, Perl, Scheme, Ruby, and assembled in executable binary files, etc., in conjunction with associated data.
- Firmware may be embodied as code, instructions or instruction sets or data that are hard-coded (e.g., nonvolatile) in memory devices.
- circuitry may include, for example, singly or in any combination, hardwired circuitry, programmable circuitry such as one or more computer processors comprising one or more individual instruction processing cores, parallel processors, state machine circuitry, or firmware that stores instructions executed by programmable circuitry.
- Applicant's work also includes taking the scientific principles and natural laws on which the present technology rests, and tying them down in particularly defined implementations. For example, the implementations discussed with reference to FIGS. 15-17C and FIGS. 20A-20E .
- One such realization of such implementations is electronic circuitry that has been custom-designed and manufactured to perform some or all of the component acts, as an application specific integrated circuit (ASIC).
- The relevant functionality can be described in VHDL (an IEEE standard, and doubtless the most common hardware design language).
- the VHDL output is then applied to a hardware synthesis program, such as Design Compiler by Synopsys, HDL Designer by Mentor Graphics, or Encounter RTL Compiler by Cadence Design Systems.
- the hardware synthesis program provides output data specifying a particular array of electronic logic gates that will realize the technology in hardware form, as a special-purpose machine dedicated to such purpose. This output data is then provided to a semiconductor fabrication contractor, which uses it to produce the customized silicon part.
- Suitable contractors include TSMC, Global Foundries, and ON Semiconductor.
- Another specific implementation of the present disclosure includes barcode and/or encoded signal detection operating on a specifically configured smartphone (e.g., iPhone 7 or Android device) or other mobile device.
- the smartphone or mobile device may be configured and controlled by software (e.g., an App or operating system) resident on the smartphone device.
- the resident software may include, e.g., a barcode decoder, digital watermark detector and detectability measure generator module.
- FIG. 19 is a diagram of a portable electronic device (e.g., a smartphone, mobile device, tablet, laptop, wearable or other electronic device) in which the components of the above processes (e.g., those in FIGS. 16-17C and 20A-20E ) may be implemented.
- the following reference numbers refer to FIG. 19 , and not any of the other drawings, unless expressly noted.
- a system for an electronic device includes bus 100 , to which many devices, modules, etc., (each of which may be generically referred as a “component”) are communicatively coupled.
- the bus 100 may combine the functionality of a direct memory access (DMA) bus and a programmed input/output (PIO) bus.
- the bus 100 may facilitate both DMA transfers and direct CPU read and write instructions.
- the bus 100 is one of the Advanced Microcontroller Bus Architecture (AMBA) compliant data buses.
- the electronic device can optionally include one or more bus controllers (e.g., a DMA controller, an I2C bus controller, or the like or any combination thereof), through which data can be routed between certain of the components.
- the electronic device also includes a CPU 102 .
- the CPU 102 may be any microprocessor, multi-core microprocessor, parallel processors, mobile application processor, etc., known in the art (e.g., a Reduced Instruction Set Computer (RISC) from ARM Limited, the Krait CPU product-family, any X86-based microprocessor available from the Intel Corporation including those in the Pentium, Xeon, Itanium, Celeron, Atom, Core i-series product families, etc.).
- Another CPU example is an Apple A10, A8 or A7.
- the A8 is built on a 64-bit architecture, includes a motion co-processor and is manufactured on a 20 nm process.
- the CPU 102 runs an operating system of the electronic device, runs application programs (e.g., mobile apps such as those available through application distribution platforms such as the Apple App Store, Google Play, etc., or custom designed to include signal decoding and icon detection) and, optionally, manages the various functions of the electronic device.
- the CPU 102 may include or be coupled to a read-only memory (ROM) (not shown), which may hold an operating system (e.g., a “high-level” operating system, a “real-time” operating system, a mobile operating system, or the like or any combination thereof) or other device firmware that runs on the electronic device.
- Encoded signal decoding and icon detection capabilities can be integrated into the operating system itself.
- the electronic device may also include a volatile memory 104 electrically coupled to bus 100 .
- the volatile memory 104 may include, for example, any type of random access memory (RAM).
- the electronic device may further include a memory controller that controls the flow of data to and from the volatile memory 104 .
- the electronic device may also include a storage memory 106 connected to the bus.
- the storage memory 106 typically includes one or more non-volatile semiconductor memory devices such as ROM, EPROM and EEPROM, NOR or NAND flash memory, or the like or any combination thereof, and may also include any kind of electronic storage device, such as, for example, magnetic or optical disks.
- the storage memory 106 is used to store one or more items of software.
- Software can include system software, application software, middleware (e.g., Data Distribution Service (DDS) for Real Time Systems, MER, etc.), one or more computer files (e.g., one or more data files, configuration files, library files, archive files, etc.), one or more software components, or the like or any stack or other combination thereof.
- system software examples include operating systems (e.g., including one or more high-level operating systems, real-time operating systems, mobile operating systems, or the like or any combination thereof), one or more kernels, one or more device drivers, firmware, one or more utility programs (e.g., that help to analyze, configure, optimize, maintain, etc., one or more components of the electronic device), and the like.
- Application software typically includes any application program that helps users solve problems, perform tasks, render media content, retrieve (or access, present, traverse, query, create, organize, etc.) information or information resources on a network (e.g., the World Wide Web), a web server, a file system, a database, etc.
- software components include device drivers, software CODECs, message queues or mailboxes, databases, etc.
- a software component can also include any other data or parameter to be provided to application software, a web application, or the like or any combination thereof.
- Examples of data files include image files, text files, audio files, video files, haptic signature files, and the like.
- a user input device 110 can, for example, include a button, knob, touch screen, trackball, mouse, microphone (e.g., an electret microphone, a MEMS microphone, or the like or any combination thereof), an IR or ultrasound-emitting stylus, an ultrasound emitter (e.g., to detect user gestures, etc.), one or more structured light emitters (e.g., to project structured IR light to detect user gestures, etc.), one or more ultrasonic transducers, or the like or any combination thereof.
- the user interface module 108 may also be configured to indicate, to the user, the effect of the user's control of the electronic device, or any other information related to an operation being performed by the electronic device or function otherwise supported by the electronic device. Thus the user interface module 108 may also be communicatively coupled to one or more user output devices 112 .
- a user output device 112 can, for example, include a display (e.g., a liquid crystal display (LCD), a light emitting diode (LED) display, an active-matrix organic light-emitting diode (AMOLED) display, an e-ink display, etc.), a light, an illumination source such as a flash or torch, a buzzer, a haptic actuator, a loud speaker, or the like or any combination thereof.
- the flash includes a True Tone flash including a dual-color or dual-temperature flash that has each color firing at varying intensities based on a scene to make sure colors and skin tone stay true.
- the user input devices 110 and user output devices 112 are an integral part of the electronic device; however, in alternate embodiments, any user input device 110 (e.g., a microphone, etc.) or user output device 112 (e.g., a loud speaker, haptic actuator, light, display, or printer) may be a physically separate device that is communicatively coupled to the electronic device (e.g., via a communications module 114 ).
- a printer encompasses many different devices for applying our encoded signals to objects, such as 2D and 3D printers, etching, engraving, flexo-printing, offset printing, embossing, laser marking, etc.
- the printer may also include a digital press such as HP's Indigo press.
- An encoded object may include, e.g., a consumer packaged product, a label, a sticker, a logo, a driver's license, a passport or other identification document, etc.
- the user interface module 108 is illustrated as an individual component, it will be appreciated that the user interface module 108 (or portions thereof) may be functionally integrated into one or more other components of the electronic device (e.g., the CPU 102 , the sensor interface module 130 , etc.).
- the image signal processor 116 is configured to process imagery (including still-frame imagery, video imagery, or the like or any combination thereof) captured by one or more cameras 120 , or by any other image sensors, thereby generating image data. Such imagery may correspond with image data 500 as shown in FIGS. 16, 17A, 17B and/or 20A .
- General functions typically performed by the ISP 116 can include Bayer transformation, demosaicing, noise reduction, image sharpening, filtering, or the like or any combination thereof.
- the GPU 118 can be configured to process the image data generated by the ISP 116 , thereby generating processed image data.
- General functions typically performed by the GPU 118 include compressing image data (e.g., into a JPEG format, an MPEG format, or the like or any combination thereof), creating lighting effects, rendering 3D graphics, texture mapping, calculating geometric transformations (e.g., rotation, translation, etc.) into different coordinate systems, etc., and sending the compressed video data to other components of the electronic device (e.g., the volatile memory 104 ) via bus 100 .
- the GPU 118 may also be configured to perform one or more video decompression or decoding processes.
- Image data generated by the ISP 116 or processed image data generated by the GPU 118 may be accessed by the user interface module 108 , where it is converted into one or more suitable signals that may be sent to a user output device 112 such as a display, printer or speaker.
- GPU 118 may also be configured to serve one or more functions of a signal decoder. In some cases GPU 118 is involved in encoded signal decoding (e.g., FIGS. 16A and 16B, 502 ), while icon detection ( FIGS. 16A and 16B, 506 ) is performed by the CPU 102 . In other implementations, GPU 118 performs both signal detection 502 ( FIGS. 16A and 16B ) and icon detection 506 ( FIGS. 16A and 16B ). In some cases, Icon Detector 506 ( FIGS. 16A and 16B ) is incorporated into Signal Decoder 502 ( FIGS. 16A and 16B ), which may execute on CPU 102 , GPU 118 or on a processing core.
- an audio I/O module 122 is configured to encode, decode and route data to and from one or more microphone(s) 124 (any of which may be considered a user input device 110 ) and loud speaker(s) 126 (any of which may be considered a user output device 110 ).
- sound can be present within an ambient, aural environment (e.g., as one or more propagating sound waves) surrounding the electronic device.
- a sample of such ambient sound can be obtained by sensing the propagating sound wave(s) using one or more microphones 124 , and the microphone(s) 124 then convert the sensed sound into one or more corresponding analog audio signals (typically, electrical signals), thereby capturing the sensed sound.
- the signal(s) generated by the microphone(s) 124 can then be processed by the audio I/O module 122 (e.g., to convert the analog audio signals into digital audio signals) and thereafter output the resultant digital audio signals (e.g., to an audio digital signal processor (DSP) such as audio DSP 128 , to another module such as a song recognition module, a speech recognition module, a voice recognition module, etc., to the volatile memory 104 , the storage memory 106 , or the like or any combination thereof).
- the audio I/O module 122 can also receive digital audio signals from the audio DSP 128 , convert each received digital audio signal into one or more corresponding analog audio signals and send the analog audio signals to one or more loudspeakers 126 .
- the audio I/O module 122 includes two communication channels (e.g., so that the audio I/O module 122 can transmit generated audio data and receive audio data simultaneously).
- the audio DSP 128 performs various processing of digital audio signals generated by the audio I/O module 122, such as compression, decompression, equalization, mixing of audio from different sources, etc., and thereafter outputs the processed digital audio signals (e.g., to the audio I/O module 122, to another module such as a song recognition module, a speech recognition module, a voice recognition module, etc., to the volatile memory 104, the storage memory 106, or the like or any combination thereof).
- the audio DSP 128 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof.
- the audio DSP 128 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, and any other component facilitating the functions it supports (e.g., as described below).
- the audio DSP 128 includes a core processor (e.g., an ARM® AudioDETM processor, a Hexagon processor (e.g., QDSP6V5A)), as well as a data memory, program memory, DMA channels, one or more input buffers, one or more output buffers, etc.
- Although the audio I/O module 122 and the audio DSP 128 are illustrated as separate components, it will be appreciated that the audio I/O module 122 and the audio DSP 128 can be functionally integrated together. Further, it will be appreciated that the audio DSP 128 and other components such as the user interface module 108 may be (at least partially) functionally integrated together.
- the aforementioned communications module 114 includes circuitry, antennas, sensors, and any other suitable or desired technology that facilitates transmitting or receiving data (e.g., within a network) through one or more wired links (e.g., via Ethernet, USB, FireWire, etc.), or one or more wireless links (e.g., configured according to any standard or otherwise desired or suitable wireless protocols or techniques such as Bluetooth, Bluetooth Low Energy, WiFi, WiMAX, GSM, CDMA, EDGE, cellular 3G or LTE, Li-Fi (e.g., for IR- or visible-light communication), sonic or ultrasonic communication, etc.), or the like or any combination thereof.
- the communications module 114 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof.
- the communications module 114 includes cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, or the like or any combination thereof.
- the communications module 114 includes a baseband processor (e.g., that performs signal processing and implements real-time radio transmission operations for the electronic device).
- Sensor 132 can, for example, include an accelerometer (e.g., for sensing acceleration, orientation, vibration, etc.), a magnetometer (e.g., for sensing the direction of a magnetic field), a gyroscope (e.g., for tracking rotation, orientation, or twist), a barometer (e.g., for sensing air pressure, from which relative elevation can be determined), a wind meter, a moisture sensor, an ambient light sensor, an IR or UV sensor or other photodetector, a pressure sensor, a temperature sensor, an acoustic vector sensor (e.g., for sensing particle velocity), a galvanic skin response (GSR) sensor, an ultrasonic sensor, a location sensor (e.g., a GPS receiver module, etc.), a gas or other chemical sensor, or the like or any combination thereof.
- any camera 120 or microphone 124 can also be considered a sensor 132 .
- a sensor 132 generates one or more signals (typically, electrical signals) in the presence of some sort of stimulus (e.g., light, sound, moisture, gravitational field, magnetic field, electric field, etc.), in response to a change in applied stimulus, or the like or any combination thereof.
- all sensors 132 coupled to the sensor interface module 130 are an integral part of the electronic device; however, in alternate embodiments, one or more of the sensors may be physically separate devices communicatively coupled to the electronic device (e.g., via the communications module 114 ).
- the sensor interface module 130 is configured to activate, deactivate or otherwise control an operation (e.g., sampling rate, sampling range, etc.) of one or more sensors 132 (e.g., in accordance with instructions stored internally, or externally in volatile memory 104 or storage memory 106 , ROM, etc., in accordance with commands issued by one or more components such as the CPU 102 , the user interface module 108 , the audio DSP 128 , the cue detection module 134 , or the like or any combination thereof).
- sensor interface module 130 can encode, decode, sample, filter or otherwise process signals generated by one or more of the sensors 132 .
- the sensor interface module 130 can integrate signals generated by multiple sensors 132 and optionally process the integrated signal(s). Signals can be routed from the sensor interface module 130 to one or more of the aforementioned components of the electronic device (e.g., via the bus 100). In another embodiment, however, any signal generated by a sensor 132 can be routed (e.g., to the CPU 102) before being processed.
- the sensor interface module 130 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof.
- the sensor interface module 130 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, and any other component facilitating the functions it supports (e.g., as described above).
- the sensor interface module 130 may be provided as the “Sensor Core” (Sensors Processor Subsystem (SPS)) from Qualcomm, the “frizz” from Megachips, or the like or any combination thereof.
- Although the sensor interface module 130 is illustrated as an individual component, it will be appreciated that the sensor interface module 130 (or portions thereof) may be functionally integrated into one or more other components (e.g., the CPU 102, the communications module 114, the audio I/O module 122, the audio DSP 128, the cue detection module 134, or the like or any combination thereof).
Concluding Remarks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Electromagnetism (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Toxicology (AREA)
- Health & Medical Sciences (AREA)
- Image Processing (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 15/960,408, filed Apr. 23, 2018 (U.S. Pat. No. 10,853,903), which claims the benefit of U.S. Patent Application No. 62/488,661, filed Apr. 21, 2017. The Ser. No. 15/960,408 application is also a continuation in part of U.S. patent application Ser. No. 15/448,403, which claims the benefit of US Patent Application Nos. 62/429,539, filed Dec. 2, 2016, 62/405,709, filed Oct. 7, 2016, and 62/400,083, filed Sep. 26, 2016. This application is also related to U.S. Pat. Nos. 9,117,268, 9,224,184, 9,380,186, 9,401,001, 9,805,435, 10,262,176, and US Published Patent Application Nos. 20160217547 and 20160275639. Each of the above patent documents is hereby incorporated herein by reference in its entirety.
- This disclosure relates to automatic identification of objects and icons, and related image signal processing.
- Barcodes have dramatically transformed the efficiency of retail store operation. Nevertheless, correct identification and handling of products is challenging when there are potentially conflicting labels applied to items or groups of items. Such conflict often arises in the following scenarios:
- 1. groups of separately marked items sold as a unit (e.g., a family pack);
- 2. items marked with price change labels (e.g., a discount or fixed price label).
- In the first case, error occurs when items are recognized and priced individually rather than as a group. In some configurations, a pack is constructed with an over-wrap that obscures barcodes on individual items. The overwrap carries a separate barcode for the family pack. Conflict occurs when a scanner reads barcodes for individual items and the family pack or misses the barcode of the family pack. Conflict also occurs when the scanner reads the barcode of the family pack and then individual items, without treating the individual items as part of the pack. In another family pack configuration, the individual items are held in a carrying case that bears the barcode of the family pack. The individual items may be oriented to obscure their barcodes, yet they may still be visible. The items within a pack may be different items that the retailer wishes to sell together or multiple instances of the same item in a group. In the former situation, each of the items contains a different barcode, which is also different than the group barcode. In all these cases, errors occur when the scanner provides decoded product codes for the individual items in the family pack.
- In the case of price change labels, error occurs when the scanner or checker misses the price change item, and instead, only provides the product code for the product without the price change. Additional slowing occurs in the check-out process when the checker is required to manually enter the change in price.
- Other errors may occur due to conflicting codes inserted in product packaging artwork or printing errors. In the former case, a package design file may encompass design elements, each bearing a different product code, which may conflict in some cases. Also, the package design file may include references to artwork in other files, which is composited to produce the package design image prior to printing. In this image assembly process, conflicting codes may be incorporated from the artwork in the reference files. In the latter case, conflicting codes may be printed due to printing plates that apply imagery with conflicting codes. Also, printing may occur with plural print stages, in which a first print technology like flexography or offset applies a first design to a package substrate, and a second print technology like a digital offset or inkjet applies a second design to a package substrate.
- The problem with these scenarios is that they cause pricing errors and slow down the check-out process. Below, we describe approaches for scanner devices to identify items accurately and at higher speed, while minimizing use of processing resources within the POS system and the need for manual intervention by the checker.
- One aspect of the disclosure is a scanner with control logic that resolves code conflicts based on detection results from one or more recognition units in the scanner. The scanner includes a processor that controls illumination and image capture by an imager of an object within its view volume. A processor executes a controller process to receive a detection result from a recognition unit for image frames captured of an object or objects in the view volume. For some objects, the detection results acquired from sensing the object within a scan operation (typically under 1 second) include an outer or inner code, or both. An example of an outer code is an identifier of a family pack or price change label, while an example of an inner code is an identifier of a family pack member or the product identifier of a product with a price change label attached.
- The controller analyzes the detection result by comparing it with state stored for prior detection results during the scan operation, in a state data structure, to determine whether to initiate one of plural types of waiting periods based on the type of detection result. The controller sets the waiting period to control reporting of an outer code relative to an inner code on the package. It enforces a first type of waiting period and control logic to control reporting of an inner code after detection of an outer code, and a second type of waiting period and control logic to delay reporting of an inner code until the second type of waiting period ends. Variations of the waiting period and control logic are described further below.
- Another aspect of the disclosure is a smartphone comprising: an imager for capturing plural image frames of a package; a processor coupled to the imager; the processor configured to execute a controller process, the controller process comprising instructions executed by the processor to: analyze image data associated with an image frame, the image frame captured by said imager, in which the analyze image data executes to detect the presence or absence of an icon and to decode a signal encoded within the image data; and provide a first response when the signal is decoded but the icon is not detected; provide a second, different response when the signal is decoded and the icon is detected.
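- To make the two-response behavior concrete, here is a minimal C++ sketch of that controller logic. The decodeSignal() and detectIcon() functions are hypothetical stand-ins for the signal decoder 502 and icon detector 506 of FIGS. 16A and 16B; the sketch is illustrative only, not the claimed implementation.

#include <cstdint>
#include <cstdio>
#include <vector>

struct DecodeResult {
    bool signalDecoded = false;   // true when an encoded signal payload was extracted
    uint64_t payload = 0;         // decoded payload (e.g., a GTIN), valid when signalDecoded
};

// Placeholder stand-ins for the signal decoder and icon detector described elsewhere.
DecodeResult decodeSignal(const std::vector<uint8_t>& imageData) { (void)imageData; return {}; }
bool detectIcon(const std::vector<uint8_t>& imageData) { (void)imageData; return false; }

// Selects between the first and second response for one captured image frame.
void handleFrame(const std::vector<uint8_t>& imageData) {
    DecodeResult r = decodeSignal(imageData);
    bool iconPresent = detectIcon(imageData);
    if (r.signalDecoded && !iconPresent) {
        std::printf("response 1: payload %llu\n", (unsigned long long)r.payload);
    } else if (r.signalDecoded && iconPresent) {
        std::printf("response 2: payload %llu, icon detected\n", (unsigned long long)r.payload);
    }
    // Frames with no decoded signal fall outside the two responses recited above.
}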
- Yet another aspect is a method of detecting the presence of an icon in imagery, the imagery captured by a camera integrated within a portable electronic device. The method comprises: using one or more cores of a multi-core processor, filtering the imagery to remove noise, said filtering yielding filtered imagery; detecting a plurality of contours within the filtered imagery, and for each of the plurality of contours, executing the following criteria checks: i) determining whether the contour is closed; ii) determining whether the contour comprises an area associated within a predetermined area range; and iii) determining whether the contour comprises a convex contour; outputting an indication that the contour comprises a candidate contour only when each of criteria i, ii and iii are satisfied.
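- As one possible rendering of the three criteria checks above, the sketch below uses OpenCV; that library choice, the bilateral noise filter, the closure tolerance and the area range are assumptions of convenience, not values or dependencies specified by this disclosure.

#include <opencv2/imgproc.hpp>
#include <cstdlib>
#include <vector>

// Returns the subset of contours that pass criteria i (closed), ii (area in range), iii (convex).
std::vector<std::vector<cv::Point>> findCandidateContours(const cv::Mat& gray,
                                                          double minArea, double maxArea) {
    // Filter the imagery to remove noise (bilateral filtering is one option).
    cv::Mat filtered;
    cv::bilateralFilter(gray, filtered, 5, 50.0, 50.0);

    // Detect contours within the filtered imagery.
    cv::Mat edges;
    cv::Canny(filtered, edges, 50, 150);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(edges, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

    std::vector<std::vector<cv::Point>> candidates;
    for (const auto& c : contours) {
        if (c.size() < 3) continue;
        // i) closed: the traced endpoints coincide within a small tolerance.
        cv::Point gap = c.front() - c.back();
        bool closed = (std::abs(gap.x) <= 1 && std::abs(gap.y) <= 1);
        // ii) area within a predetermined area range.
        double area = cv::contourArea(c);
        bool areaOk = (area >= minArea && area <= maxArea);
        // iii) convex contour.
        bool convex = cv::isContourConvex(c);
        if (closed && areaOk && convex) candidates.push_back(c);
    }
    return candidates;
}

In practice the area range and closure tolerance would be tuned to the expected icon size and imaging distance, and the per-contour checks can be distributed across cores of a multi-core processor as recited above.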
- Additional aspects of the disclosure include control logic and associated methods for integration within automatic identification devices, and various configurations and types of recognition units and controller logic for determining when and how to handle responses when an icon is detected in the presence or absence of encoded signals.
- Further aspects, advantages and features are described and illustrated in the detailed description and drawings below.
-
FIG. 1 is a system diagram illustrating components of a point of sale system in a retail store. -
FIG. 2 is a diagram illustrating a sequence of decode operations by a scanner. -
FIG. 3 is a diagram illustrating another sequence of decode operations by the scanner. -
FIG. 4 is a diagram of components in an imager based scanner. -
FIG. 5 is a diagram illustrating a processing architecture for controlling recognition units within a scanner. -
FIG. 6 is a diagram illustrating software modules that operate on a sequence of image frames to detect and extract digital payloads from images of objects within the frames. -
FIGS. 7A and 7B illustrate image portions of an object in different frames captured from a field of view of a scanner's imager. -
FIGS. 8A and 8B illustrate another example of image portions of an object in different frames captured from a field of view of a scanner's imager. -
FIG. 9 is a flow diagram of a controller process that resolves product identification conflicts. -
FIG. 10 is a block diagram of a signal encoder for encoding a digital payload signal into an image signal. -
FIG. 11 is a block diagram of a compatible signal decoder for extracting the digital payload signal from an image signal. -
FIG. 12 is a flow diagram illustrating operations of a signal generator. -
FIG. 13 is a diagram illustrating embedding of an auxiliary signal into host image signal. -
FIG. 14 is a flow diagram illustrating a method for decoding a payload signal from a host image signal. -
FIG. 15 is a rendition of a physical object including an icon and various encoded symbologies. -
FIG. 16A is a flow diagram showing cooperation of a signal decoder and an icon detector. -
FIG. 16B is a flow diagram showing cooperation of an icon detector and signal decoder. -
FIG. 17A is a flow diagram showing two stages associated with the icon detector of FIG. 16A. -
FIG. 17B is a flow diagram showing stage 1 of the icon detector shown in FIG. 17A. -
FIG. 17C is a flow diagram showing stage 2 of the icon detector shown in FIG. 17A. -
FIGS. 18A and 18B show an example MatLab script. -
FIG. 19 is a block diagram of an electronic device (e.g., a smartphone) that can be used to carry out the processes and features shown in FIGS. 16-17C and 20A-20E. -
FIG. 20A is a flow diagram for a process to detect candidate contours within image data. -
FIG. 20B is a flow diagram showing one embodiment of contour refinement. -
FIG. 20C is a flow diagram for icon matching of candidate contours. -
FIG. 20D is another flow diagram for icon matching of candidate contours. -
FIG. 20E is yet another flow diagram for icon matching of candidate contours. -
FIG. 21 shows a rotation angle for a minimum bounding box. -
FIGS. 22A and 22B show object evaluation within a block. -
FIG. 22C shows lines through a block which includes objects. -
FIGS. 23A and 23B show object evaluation within a block; and FIG. 23C shows remaining objects. -
FIG. 24A shows objects including tangent lines; and FIG. 24B shows other objects including tangent lines. -
FIGS. 25A-25D show signal encoding in, on and around various icons. -
FIG. 1 is a system diagram illustrating components of a point of sale system in a retail store. Each check-out station is equipped with a POS terminal 14 and scanner 12. The scanner has a processor and memory and executes scanner firmware, as detailed further below. The POS terminal is a general purpose computer connected to the scanner via a standard cable or wireless interconnect, e.g., to connect the scanner directly to a serial port, keyboard port, USB port or like port of the POS terminal or through an interface device (e.g., a wedge). Each of the POS terminals is connected via a network to the store's back office system 16. - Items in the store are assigned an identification number in a numbering scheme managed by GS1 called a Global Trade Identification Number (GTIN). The GTIN plays a vital role within store operations as it identifies products and acts as a database key to associate the product with product attributes including its name and price. For many products, the GTIN is assigned by the manufacturer of the item and encoded in the packaging, via a UPC Symbol and, preferably, a digital encoding that replicates the GTIN in two-dimensional tiles across the package design, as detailed further below. One example of such tiled data encoding is a Digimarc Barcode data carrier from Digimarc Corporation of Beaverton, Oreg. The retailer's system has a database of item files for each of the products it sells. This item file includes various attributes of the item that the store uses to manage its operation, such as price, scanning description, department ID, food stamp information, tax information, etc. The POS terminal retrieves this information as needed from the back office by querying the database with the item identifier (e.g., a GTIN of the product provided by the scanner).
- A barcode, preferably the Digimarc Barcode data carrier, is used to convey family pack identifiers and price change codes on packaging. For family packs, the retailer or manufacturer assigns a GTIN as the product identifier of the pack, and creates an associated item file for that pack. The GTIN is encoded in a conventional barcode and/or the Digimarc Barcode data carrier applied to the over-wrap or carrier of the pack. The Digimarc Barcode data carrier is advantageous because it replicates the GTIN across the package to provide more efficient and reliable decoding of a GTIN, and has additional data capacity to carry one or more flags indicating to the scanner that family pack or price change processing logic applies.
- Barcodes, and in particular, Digimarc Barcode data carriers, are preferably used to convey price change information in labels applied to product packaging. Price changes are usually of one of the following two types: a discount code, or a new fixed price. In the former, the discount code references a monetary amount to be reduced from the price assigned to the item's GTIN. In the latter, the code references a new fixed price that replaces the price assigned to the item's GTIN. The Digimarc Barcode data carrier also includes a flag indicating that price change processing logic applies in the scanner. As an additional or alternative means to trigger processing logic, the price change label may have other detectable properties, such as a color or spectral composition, shape, RFID tag, image template, or marking that the scanner's recognition unit(s) can detect.
- Price changes are typically managed by department within a retailer. This enables the managers of the departments, such as the bakery, meat, product and deli departments, to determine when and how much to discount items that they wish to move from their inventory. The price change information includes a department identifier, enabling the retailer's system to track the price change to the department. The new fixed price or price change may be encoded directly in the digital payload of the data carrier printed on the price change label. Alternatively, the fixed price or discount may be stored in an item record and looked up by the POS using the code decoded from the payload. In some systems, a GTIN identifying a product or class of products to which the price change applies may be included in the payload of the data carrier on the product as well.
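- For illustration only, a price change payload of the kind described above might be modeled in scanner firmware as follows; the field names, widths and layout are assumptions, not a payload format defined by this disclosure.

#include <cstdint>
#include <optional>

struct PriceChangePayload {
    bool isFixedPrice;                  // true: new fixed price; false: discount off the item's price
    uint16_t departmentId;              // department managing the price change
    uint32_t amountCents;               // new price or discount amount, in cents
    std::optional<uint64_t> targetGtin; // optional GTIN (or product class) to which the change applies
};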
- For some products, the product information, such as the GTIN, is printed by a label printer within the store. One example is a label printer within a scale, which is used to weigh and print a label for a variable weight item. The GTIN format includes fields used to encode the variable nature of such items by encoding a variable amount (e.g., variable weight) or a variable price. Preferably this GTIN is encoded on the label with a Digimarc Barcode data carrier, though conventional barcodes may also be used.
- Variable items are a prime example of items that often are subject to price changes. To facilitate price changes, a label with the price change is applied to the item as described above. This label may be applied over the prior label to obscure it, or may be applied next to it. The label printer in the store may be configured to print a price change label, which fits over the original label, or complements it. In either case, the scanner decodes the code or codes it detects on the package, and its processing logic issues the correct product and pricing information to the POS system.
- The back office system maintains a database of item file information in its memory (persistent and volatile memory (e.g., RAM), as needed). It uses the GTIN to associate a product with the product attributes and retrieves these attributes and delivers them to the scanning application software of the POS terminal in response to database queries keyed by the GTIN or like item code. Item files are also created for family pack items and price change labels. In some configurations, the item database is mirrored within the POS terminals of the retail store, and each POS terminal executes item look up operations within its local copy of the item database.
- During the scanning operation, the POS scanning application software obtains the output of the scanner, which is comprised of the recognized codes, e.g., GTIN, price change code, or like code. It then does a look up, either locally or via the back office to get related attributes for each code. With these attributes, the POS software executes typical POS functions, such as displaying product name and price during check-out, tabulating total price, with taxes and discounts, coupons, etc.; managing payment, and generating a receipt. Importantly, the POS software need not be modified to handle family pack configurations and price changes. Instead, the scanner logic resolves potential code scanning conflicts and reports the resolved code or codes in a fashion that the POS terminal is accustomed to seeing.
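- As a simplified model of the look up just described, the item file query can be sketched as a GTIN-keyed map; the structure below is illustrative only and is not the schema of any particular POS or back office system.

#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

struct ItemFile {
    std::string description;  // scanning description shown during check-out
    uint32_t priceCents;      // current price
    uint16_t departmentId;    // owning department
};

// Local mirror of the item database, keyed by GTIN or like item code.
using ItemDatabase = std::unordered_map<std::string, ItemFile>;

void lookUpAndDisplay(const ItemDatabase& db, const std::string& gtin) {
    auto it = db.find(gtin);
    if (it == db.end()) {
        std::printf("item not on file: %s\n", gtin.c_str());
        return;
    }
    std::printf("%s  $%u.%02u\n", it->second.description.c_str(),
                (unsigned)(it->second.priceCents / 100), (unsigned)(it->second.priceCents % 100));
}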
- A scanning application executes within each of the store's POS terminals. This application is responsible for obtaining the codes reported by the scanner hardware and performing the attribute look up operation. It receives each code from the scanner, in response to the scanner decoding UPC and Digimarc Barcode data carrier during check-out. A processor in the scanner executes firmware instructions loaded from memory to perform these decoding operations.
- Processing logic within the scanning operation handles the above-described cases of family pack and price changes.
FIGS. 2 and 3 are diagrams illustrating sequencing of decode operations to set the stage for the processing logic that interprets the sequence. During check-out at the POS terminal, the scanner executes recognition operations on image frames captured while a product package or packages move through its field of view. From mere decoding of conventional barcodes, it is not determinable whether the barcodes originate from the same or different objects. To address this, we have incorporated new features in encoding on the package and logic within the scanner. - For purposes of illustration, we introduce the concept of an "inner barcode" (IB) and "outer barcode" (OB). The inner barcode corresponds to a barcode of an individual item in a family pack or the original barcode on a package, before a price change label is added. The "outer barcode" corresponds to a barcode of the family pack or a price change label. Though the family pack code may indeed be outside the member item code (e.g., in the case of an over-wrap), it need not be. The same is true for the price change label relative to the original barcode on a product.
- Inner and outer barcodes are examples of a broader category of inner and outer codes detected by the scanner. These codes may be detected by image recognition methods, of which optical code reading is a subset. Other forms of image recognition are feature extraction and matching and template matching (e.g., a price change label template), to name two examples. They may also be detected by other sensor types, such as RFID, and a combination of sensor input, e.g., weight from a scale (e.g., to distinguish a family pack from a family pack member), geometric features from image feature extraction (including depth from a depth sensor), and spectral information (color such as a color histogram of a detected object, or pixel samples from spectral bands obtained by multi-spectral illumination and/or multi-spectral filters).
-
FIG. 2 illustrates a sequence in which decoding of an inner barcode precedes an outer barcode. Whenever the scanner decodes an inner barcode, it does not immediately report it. Instead, it pauses for a predetermined delay, e.g., in the range of around 500 ms. The amount of this delay may be specified in relative or absolute time by a flag in the data carrier (namely, in the digital data encoded in the family pack or family pack member). If the next barcode is an outer barcode of a family pack, the scanner logic reports only the GTIN for the family pack. - If the outer barcode is a price change, the scanner logic reports it. The scanner logic that controls which code or codes to report depend on whether the price change is a fixed price or a discount code. For a fixed price code, that fixed price code replaces the code from the inner barcode as it provides the code that the POS terminal uses to query the back office database for the new price. For a discount code, the logic causes the scanner to report the discount code as well as the code from first detected barcode that triggered the waiting period.
- In these scenarios, data flags are encoded in the inner and/or outer barcode data carriers to signal to the scanner that an outer barcode may accompany the inner barcode. For family packs, for example, the inner barcode of
FIG. 2 signals that it is part of a family pack, which in turn, triggers a waiting period for the scanner to detect an outer barcode. If no outer barcode is decoded in the waiting period, then scanner reports the inner barcode to the POS terminal. -
FIG. 3 illustrates a sequence in which decoding of an outer barcode precedes an inner barcode. This sequence may occur, for example, following the decoding of the outer barcode ofFIG. 2 . In this case, the scanner logic similarly waits for a predetermined period of time (e.g., 500 ms). A barcode decoded in the waiting period is ignored if a family pack flag is set because a barcode detected in this waiting period is deemed to be from the same family pack. The time range for the waiting period may vary with the device, as each device has different image capture systems, with different field of view parameters, which govern the number and type of views captured of an object or group of objects as they are scanned in the scanner view volume. Checker usage patterns also govern the waiting period, as they also impact movement of objects through the view volume, and/or how the checker employs the scanner to image objects. The waiting period can range from around 300 ms to 1.5 seconds. - For a price change label, the logic depends on the type of price change. For a fixed price code detected as the OB of
FIG. 3 , an inner barcode detected in the waiting period is ignored. For a discount code, the inner barcode in the waiting period is reported. - Having illustrated high level operation of the scanner logic, we now provide additional implementation details. The details of the implementation vary with the hardware and software configuration of the scanner, as well as the type of codes and recognition processes employed within the scanner.
- Image based scanners typically fall into two classes: fixed and hand-held. Fixed scanners are designed to be integrated within a check-out station, at which the operator or a conveyor moves items in the field of the scanner's image capture system. The image capture system is comprised of optical elements, such as a lens, mirror(s), beam splitter(s), 2D imager (e.g., CMOS camera), which together enable capture of plural views of an object that are combined into a single frame. Additionally, an illumination source is also included to illuminate the object for each capture. See, e.g., US Publication Nos. 20090206161 and 20130206839, which are incorporated by reference.
- Hand-held scanners are, as the name implies, designed to be held in the hand and pointed at objects. They have different optical systems adapted for this type of capture, including lens, sensor array adapted for capturing at varying distances, as well as illumination source for illuminating the object at these distances.
- These image based systems capture frames in range of around 10 to 90 frames per second. In some imager based scanners, processing of a frame must be complete prior to the arrival of the next frame. In this case, the scanner processing unit or units have from 10 to 100 ms to decode at least one code and perform other recognition operations, if included.
- In other imager based scanners, image processing of image frames is governed by time constraints, not strictly frames. In this form of real time image processing, the processing unit or units within the device process frames concurrently but when processing capacity reached, some frames get dropped, and processing resumes on subsequent frames when processing capacity is available. This type of resource management is sometimes employed opportunistically in response to detecting an object in the view volume of the scanner's imaging system. For example, as a new object enters the view volume, an image process executing within the scanner detects it and launches decoding processes on subsequent frames.
- For the sake of illustration,
FIG. 4 is a diagram of components in an imager based scanner. Our description is primarily focused on fixed, multi-plane imager based scanner. However, it is not intended to be limiting, as the embodiments may be implemented in other imaging devices, such as hand-held scanners, smartphones, tablets, machine vision systems, etc. - Please also see the specification of assignee's co-pending application Ser. No. 14/842,575, published as US 2017-0004597 A1, which is hereby incorporated herein by reference. This specification describes hardware configurations for reading machine readable data encoded on objects, including configurations usable with imager based scanners used in automatic identification applications.
- Referring to
FIG. 4 , the scanner has abus 100, to which many devices, modules, etc., (each of which may be generically referred as a “component”) are communicatively coupled. Thebus 100 may combine the functionality of a direct memory access (DMA) bus and a programmed input/output (PIO) bus. In other words, thebus 100 facilitates both DMA transfers and direct processor read and write instructions. In one embodiment, thebus 100 is one of the Advanced Microcontroller Bus Architecture (AMBA) compliant data buses. AlthoughFIG. 4 illustrates an embodiment in which all components are communicatively coupled to thebus 100, one or more components may be communicatively coupled to a separate bus, and may be communicatively coupled to two or more buses. Although not illustrated, the scanner can optionally include one or more bus controllers (e.g., a DMA controller, an I2C bus controller, or the like or combination thereof), through which data can be routed between certain of the components. - The scanner also includes at least one
processor 102. Theprocessor 102 may be a microprocessor, mobile application processor, etc., known in the art (e.g., a Reduced Instruction Set Computer (RISC) from ARM Limited, the Krait CPU product-family, X86-based microprocessor available from the Intel Corporation including those in the Pentium, Xeon, Itanium, Celeron, Atom, Core i-series product families, etc.). The processor may also be a Digital Signal Processor (DSP) such the C6000 DSP category from Texas Instruments.FIG. 4 shows a second processor behindprocessor 102 to illustrate that the scanner may have plural processors, as well as plural core processors. Other components on thebus 100 may also include processors, such as DSP or microcontroller. - Processor architectures used in current scanner technology include, for example, ARM (which includes several architecture versions), Intel, and TI C6000 DSP. Processor speeds typically range from 400 MHz to 2+ Ghz. Some scanner devices employ ARM NEON technology, which provides a Single Instruction, Multiple Data (SIMD) extension for a class of ARM processors.
- The
processor 102 runs an operating system of the scanner, and runs application programs and, manages the various functions of the device. Theprocessor 102 may include or be coupled to a read-only memory (ROM) (not shown), which stores an operating system (e.g., a “high-level” operating system, a “real-time” operating system, a mobile operating system, or the like or combination thereof) and other device firmware that runs on the scanner. - The scanner also includes a
volatile memory 104 electrically coupled to bus 100 (also referred to as dynamic memory). Thevolatile memory 104 may include, for example, a type of random access memory (RAM). Although not shown, the scanner includes a memory controller that controls the flow of data to and from thevolatile memory 104. Current scanner devices typically have around 500 MB of dynamic memory, and provide a minimum of 8 KiB of stack memory for certain recognition units. For some embodiments of the watermark processor, which is implemented as an embedded system SDK, for example, it is recommended that the scanner have a minimum of 8 KiB stack memory for running the embedded system SDK. - The scanner also includes a
storage memory 106 connected to the bus. Thestorage memory 106 typically includes one or more non-volatile semiconductor memory devices such as ROM, EPROM and EEPROM, NOR or NAND flash memory, or the like or combinations thereof, and may also include alternative storage devices, such as, for example, magnetic or optical disks. Thestorage memory 106 is used to store one or more items of software. Software can include system software, application software, middleware, one or more computer files (e.g., one or more data files, configuration files, library files, archive files, etc.), one or more software components, or the like or stack or other combination thereof. - Examples of system software include operating systems (e.g., including one or more high-level operating systems, real-time operating systems, mobile operating systems, or the like or combination thereof), one or more kernels, one or more device drivers, firmware, one or more utility programs (e.g., that help to analyze, configure, optimize, maintain, etc., one or more components of the scanner), and the like. Suitable operating systems for scanners include but are not limited to Windows (multiple versions), Linux, iOS, Quadros, and Android.
- Compilers used to convert higher level software instructions into executable code for these devices include: Microsoft C/C++, GNU, ARM, and Clang/LLVM. Examples of compilers used for ARM architectures are RVDS 4.1+, DS-5, CodeSourcery, and Greenhills Software.
- Also connected to the
bus 100 is animager interface 108. Theimager interface 108 connects one or more one ormore imagers 110 tobus 100. The imager interface supplies control signals to the imagers to capture frames and communicate them to other components on the bus. In some implementations, the imager interface also includes an image processing DSP that provides image processing functions, such as sampling and preparation of groups of pixel regions from the 2D sensor array (blocks, scanlines, etc.) for further image processing. The DSP in the imager interface may also execute other image pre-processing, recognition or optical code reading instructions on these pixels. Theimager interface 108 also includes memory buffers for transferring image and image processing results to other components on thebus 100. - Though one
imager 110 is shown inFIG. 4 , the scanner may have additional imagers. Each imager is comprised of a digital image sensor (e.g., CMOS or CCD) or like camera having a two-dimensional array of pixels. The sensor may be a monochrome or color sensor (e.g., one that employs a Bayer arrangement), and operate in a rolling and/or global shutter mode. Examples of these imagers include model EV76C560 CMOS sensor offered by e2v Technologies PLC, Essex, England, and model MT9V022 sensor offered by On Semiconductor of Phoenix, Ariz. Eachimager 110 captures an image of its view or views of a view volume of the scanner, as illuminated by an illumination source. The imager captures at least one view. Plural views (e.g.,view1 112 and view2 114) are captured by a single imager in scanners where optical elements, such as mirrors and beam splitters are used to direct light reflected from different sides of an object in the view volume to the imager. - Also coupled to the
bus 100 is anillumination driver 116 that controls andillumination sources 118. Typical scanners employ Light Emitting Diodes (LEDs) as illumination sources. In one typical configuration, red LEDs are paired with a monochrome camera. The illumination driver applies signals to the LEDs to turn them on in a controlled sequence (strobe them) in synchronization with capture by an imager or imagers. In another configuration, plural different color LEDs may also be used and strobed in a manner such that the imager(s) selectively capture images under illumination from different color LED or sets of LEDs. See, e.g., Patent Application Publication Nos. 20130329006, entitled COORDINATED ILLUMINATION AND IMAGE SIGNAL CAPTURE FOR ENHANCED SIGNAL DETECTION, and 20160187199 entitled SENSOR-SYNCHRONIZED SPECTRALLY-STRUCTURED-LIGHT IMAGING, which are hereby incorporated by reference. The latter captures images in plural different spectral bands beyond standard RGB color planes, enabling extraction of encoded information as well as object recognition based on pixel samples in more narrow spectral bands at, above and below the visible spectrum. - In another configuration, a broadband illumination source is flashed and image pixels in different bands, e.g., RGB, are captured with a color image sensor (e.g., such as one with a Bayer arrangement). The illumination driver may also strobe different sets of LED that are arranged to illuminate particular views within the view volume (e.g., so as to capture images of different sides of an object in the view volume).
- A further extension of scanner capability is to include a RGB+D imager, which provides a depth measurement in addition to Red, Green and Blue samples per pixel. The depth sample enables use of object geometry to assist in product identification.
- The scanner also includes at least one
communications module 118, each comprised of circuitry to transmit and receive data through a wired or wireless link to another device or network. One example of a communication module is a connector that operates in conjunction with software or firmware on the scanner to function as a serial port (e.g., RS232), a Universal Serial Bus (USB) port, and an IR interface. Another example of a communication module in a scanner is a universal interface driver application specific integrated circuit (UIDA) that supports plural different host interface protocols, such as RS-232C, IBM46XX, or Keyboard Wedge interface. The scanner may also have communication modules to support other communication modes, such as USB, Ethernet, Bluetooth, WiFi, infrared (e.g., IrDa) or RFID communication. - Also connected to the
bus 100 is asensor interface module 122 communicatively coupled to one ormore sensors 124. Some scanner configurations have a scale for weighing items, and other data capture sensors such as RFID or NFC readers or the like for reading codes from products, consumer devices, payment cards, etc. - The
sensor interface module 130 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers to store and communicate control and data signals to and from the sensor. - Finally, the scanner may be equipped with a variety of user input/output devices, connected to the
bus 100 via a corresponding user I/O interface 126. Scanners, for example, provide user output in the form of a read indicator light or sound, and thus have an indicator light ordisplay 128 and/orspeaker 130. The scanner may also have a display and display controller connecting the display device to thebus 100. For I/O capability, the scanner has a touch screen for both display and user input. -
FIG. 5 is a diagram illustrating a processing architecture for controlling recognition units within a scanner. The processing architecture comprises a controller and recognition units. Each of these elements is a logical processing module implemented as a set of instructions executing on a processor in the scanner, or implemented in an array of digital logic gates, such as a Field Programmable Gate Array (FPGA) or Application Specific Integrated Circuit (ASIC). Each of the modules may operate within a single component (such as a processor, FPGA or ASIC), within cores of a plural core processor, or within two or more components that are interconnected via thebus 100 or other interconnect between components in the scanner hardware ofFIG. 4 . The implementer may create the instructions of each module in a higher level programming language, such as C/C++ and then port them to the particular hardware components in the scanner architecture of choice. - In this example, we show a controller and three recognition units. There may be more or less of each in a given implementation. The
controller 140 is responsible for sending recognition tasks to recognition units (142, 144 and 146), getting the results of those tasks, and then executing logic to determine the item code to be sent to the host POS system of the scanner. Thecontroller module 140 communicates with the recognition units (142-146) viacommunication links bus 100. - To communicate among software processes, the controller process employs inter-process communication (IPC). The particular form of IPC depends in part on the operating system executing in the scanner. For a Unix OS or Unix derivatives, IPC may be implemented with sockets. Windows based Operating Systems from Microsoft Corp. also provide an implementation of sockets for IPC.
- Finally, controller and recognition units may be implemented within a single software process in which communication among software routines within the process is implemented with shared memory. Within a process, the software program of each recognition units may be executed serially and report its results back to the controller. Recognition units may also be executed as separate threads of execution. The operating system running in the scanner manages pre-emptive multi-tasking and multi-threading (if employed) for software processes and threads. The operating system also manages concurrent execution on processes on processors, in some scanners where more than one processor is available for the controller, recognition units, and other image processing.
- A recognition unit executes instructions on an image block provided to it to recognize an object or objects in the image block and return a corresponding recognition result. For optical codes like barcodes and Digimarc Barcode data carriers, the recognition result comprises the digital payload extracted from the carrier, which may be formatted as a string of binary or M-ary symbols or converted to a higher level code such as a GTIN data structure in accordance with the GS1 specification for GTINs. Recognition units that perform optical code reading include, for example, optical code readers for 1-dimensional optical codes like UPC, EAN, Code 39, Code 128 (including GS1-128), stacked codes like DataBar stacked and PDF417, or 2-dimensional optical codes like a DataMatrix, QR code or MaxiCode.
- Some scanners also have varying levels of object recognition capability, in which the recognition process entails feature extraction and classification or identification based on the extracted features. Some of these type of recognition processes provide attributes of an item or label, or a class of the product or label. Attributes of the item include color (e.g., color histogram) or geometry, such as position, shape, bounding region or other geometric attributes). The attributes may be further submitted to a classifier to classify an item type. The controller combines this information with other recognition results or sensor input to disambiguate plural codes detected from an object in the view volume.
- Depending on processing power, memory and memory bandwidth constraints, the scanner may have more sophisticated object recognition capability that is able to match extracted features with a feature database in memory and identify a product based on satisfying match criteria. This technology is described further below.
- Though we are primarily focused on image processing recognition, the recognition units may also operate on other sensed data. Examples include decoding of an RFID tag based on sensed RF signal input, and weight attributes from a scale.
-
FIG. 6 is diagram illustrating asoftware modules Controller 160 is an example of acontroller 140 in the architecture ofFIG. 5 . This diagram illustrates the interaction of a controller with one particular implementation of arecognition unit 162. In this instance, thecontroller 160 and recognition unit are software processes. In one embodiment, they execute on distinct processors within the scanner. For example, they execute either in theseparate processors processor 102 and recognition unit executes in a processor within the imager interface 108 (e.g., DSP). In another embodiment, they execute within the same processor, e.g.,processor 102, or within a DSP in theimager interface 108. - In still another embodiment, the controller executes in
processor 102, and the instructions of the recognition unit are implemented within an FPGA or ASIC, which is part of another component, such as the imager interface, or a separate component onbus 100. - The software process of the
recognition unit 162 performs a form of recognition that employs digital watermark decoding to detect and extract watermark payloads from encoded data tiles in the image frames 164. The term, “frame,” refers to a group of pixels read from a 2D sensor array for a time period in which a 2D image is captured on the sensor array. Recall that the sensor may operate in rolling shutter or global shutter mode. In some implementations, selected rows of the sensor array are sampled during a capture period and stored in a memory buffer (e.g., in the imager interface), which is accessed by the recognition unit(s). In others, an entire frame of all pixels in the sensor array are sampled and stored in a frame buffer, which is then accessed by the recognition unit(s). The group of pixels sampled from a frame may include plural views of the viewing volume, or a part of the viewing volume. - The
recognition unit 162 has the following sub-modules of instructions:interface 166 andwatermark processors recognition unit 162. Watermark processors are instances of watermark decoders. - When an object moves into the view volume of the scanner,
controller 160 invokes therecognition unit 162 on image frames containing the object. Viainterface 166, thecontroller 160 calls therecognition unit 162, providing theframes 164 by supplying an address of or pointer to them in the memory of the scanner (image buffer in e.g., eithervolatile memory 104 or memory buffers in imager interface 108). It also provides other attributes, such as attributes of the view from which the frame originated. The recognition unit proceeds to invoke a watermark processor 168-172 on frames in serial fashion. Watermark processors 1-3 operate on frames 1-3, and then process flow returns back towatermark processor 1 for frame 4, and so on. This is just one example of process flow in a serial process flow implementation. Alternatively, watermark processors may be executed concurrently within a process as threads, or executed as separate software processes, each with an interface and watermark processor instance. - The
recognition unit 162 provides the extracted payload results, if any, for each frame via communication link as described above. The controller analyzes the results from the recognition unit and other recognition units and determines when and what to report to the POS terminal. Each watermark processor records in shared memory of therecognition unit 162 its result for analyzing the image block assigned to it. This result is a no detect, a successful read result along with decoded payload, or payloads (in the event that distinct payloads are detected within a frame). Optionally the watermark processor provides orientation parameters of the decoded payload, which provide geometric orientation and/or position of the tile or tiles from which the payload is decoded. -
FIGS. 7A and 7B illustrateimage portions object 184 is moving through this field of view in these frames. Here, we use the phrase, “image portion,” to reflect that the image portion of a frame is not necessarily co-extensive with the entire pixel array of an imager. As noted, an imager may capture plural views of theobject 184 per frame, and the image portion may correspond to one particular view of plural different views captured by the image sensor array for a frame. Alternatively, it may encompass plural views imaged within a frame. Also, frames from different imagers may be composited, in which case, the image portion may include a portion of frames composited from different imagers. Nevertheless,FIG. 7A depicts an image block from a frame at a first capture time, andFIG. 7B represents an image block from a second, later capture time. - For sake of illustration, we use an example where the imager has a frame capture rate of 100 frames per second. Thus, a new frame is available for sampling as fast as every 10 ms. The rate at which the controller provides frames or portions of frames to each recognition unit may not be as high as the frame rate. Thus, the frames illustrated here need not be strictly adjacent in a video sequence from the sensor, but are within a time period in which an
object 184 moves through the field of view of the scanner. The object movement may be from a checker swiping theobject 184 through a field of view of the scanner or positioning a hand held scanner to image the object, or from a mechanical mechanism, such as a conveyor moving an object through a view volume of a scanner.Image portion 180 at frame time, T1, includes an image captured of at least a first part ofobject 184. This object has encoded data tiles having afirst payload Image block 182, at a later frame time, T2, depicts that theobject 184 has moved further within the field of view of the scanner. At T2, more tiles are captured, such as 186 c having the same payload as 186 a and 186 b, and 188 b having the same payload as 188 a. -
FIGS. 7A and 7B illustrate the problem outlined above for conflicting codes on objects. In this scenario, the recognition unit may detect a first code in 188 a and another code in 186 a or none of the codes in 186 from frame at T1. However, the reverse may happen for the frame at T2, as more of the tiles of 186 are visible to the scanner than 188. The recognition unit is more likely to detect 186 at T2. The code in 188 is an example of an inner barcode. It is only partially obscured by the label or overwrap on which the code in 186 resides. Tiles 188 a-b carry an “inner barcode,” whereas tiles 186 a-c contain an “outer barcode,” using the terminology introduced earlier. - This sequence illustrates one scenario where the different codes created for family packs and price change labels create scanner conflict. The encoded tiles 188 a-b correspond to packaging of an individual item in a family pack or the label bearing the GTIN of a product, before a price change. The encoded tiles 186 a-c correspond to packaging of the family pack, such as a partial over-wrap or carrier. Encoded tiles 186 a-c alternatively correspond to a price change label. The sequence of detection is likely to be as shown in
FIG. 2 , where the inner barcode of 188 is detected at T1 and then the outer barcode is detected at T2. This sequence of detection may not always happen, but in cases where different codes are detected from a package either within a frame, or over different frames, there is a need for code conflict resolution. -
FIGS. 8A and 8B illustrate another example ofimage portions object 194 moves through the field of view, an outer barcode is likely to be detected first, but later, the inner barcode is likely to be detected. In this scenario, an outer barcode is encoded in tiles 196 a-d, and an inner barcode in tiles 198 a-b. For family packs, the outer barcode is encoded in tiles 196 a-d on the package of the overwrap, but the overwrap does not completely obscure the inner barcode, which is a barcode encoded in tiles 198 a-b on an individual item or items within the family pack. For price change labels, the price change is encoded in 196 a-d, e.g., on a label affixed to thepackage 194 over the original packaging. The original packaging, however, retains encoding of the original item's GTIN in tiles 198 a-b. The sequence of detection of outer than inner barcode ofFIG. 3 is likely to happen in this case. At time T1, a recognition unit is likely to detect the payload of tiles 196 a-d, and likely not 198 a. At time T2, the recognition unit is likely to detect the payload of tiles 198 a-b. This scenario poses a conflict if the scanner were to report the GTIN of the inner barcode separately from the family pack. Further, in some price change label scenarios, the scanner needs to detect that it should not report the original GTIN, as this would not reflect the price change correctly. -
FIG. 9 is a flow diagram of a controller process that resolves these potential code conflicts. Preferably, this control logic is implemented within the controller 140 of FIG. 5. However, it may also be distributed between the controller 140 and one or more recognition units (e.g., 142, 144, 146). In particular, a recognition unit may implement control logic for resolving conflicts among codes that it detects during scanning operation, and report to a controller 140 a subset of codes for which conflicts have been resolved. The controller, in turn, receives recognition results from plural different recognition units and executes control logic to resolve conflicts among the recognition results from these recognition units. - One particular software architecture in which this control logic is implemented is the architecture illustrated in
FIG. 6. In this implementation, the control logic is implemented as software instructions within a controller software process 160 executing on a processor (102, 102 a or 108) of the scanner. The recognition unit 162 is a software process executed on that processor or a different processor within the scanner. - As shown in
step 200, the controller begins by initiating the recognition units. The recognition units (e.g., 142-146) are launched as instances of software processes executing on a processor within the scanner. The controller issues instructions to the imager 110 via the imager interface and the illumination driver 116 to coordinate image capture and illumination as objects are scanned. The imaging interface 108 captures image data from the imager 110 for a frame, buffers it in a RAM memory and signals the controller that a new image block is available. - This RAM memory may be within the
interface 108 or in RAM memory 104. In steps 201-202, the controller gets an address of an image block in this RAM memory and passes the address to a recognition unit, along with additional attributes of that image block useful in assisting recognition operations (such as the view or camera that the image block came from, its geometric state (e.g., orientation of the view), frame identifier, and the like). In response, the recognition unit proceeds to obtain and perform recognition operations on the image block. For decoding of Digimarc Barcode data carriers repeated in contiguous tiles, a watermark processor executes decoder operations on the image block to search for an encoded data carrier and extract its payload from one or more of these encoded tiles, if detected. Plural instances of watermark processors may be assigned to process image blocks of different frames, as shown in FIG. 6. - The controller gets recognition results from the recognition units as shown in
step 203. The controller queries a recognition unit to get its recognition result. It then evaluates the result to determine whether it has successfully recognized an object and has provided its item identifier (e.g., a GTIN, price code identifier or like item identifier), as shown in decision block 204. If not, it passes the next image block to the recognition unit (back to 201-202). - If the controller has obtained an item identifier, it evaluates the identifier against other identifiers obtained from the frame and prior frames during a pending time out period in
step 205. This evaluation includes a comparison of the detected identifier with other identifiers from the same frame or prior frame stored in a state data structure. - If it is a new identifier, it is stored in a state data structure in shared memory of the controller process and analyzed further to determine whether to report it or initiate a waiting period to report it. If it has identified the identifier as a duplicate identifier with another identifier in a pending duplicate time out period, it is rejected as a duplicate.
- For the evaluation executed in
step 205, the controller retains state information for identifiers. Upon detection of a new identifier, the controller checks whether it is flagged, or has otherwise been detected as a family pack, family pack member or price change label. A family pack or family pack member is signaled via a flag decoded from the data carrier encoded on the object. Likewise, a price change label is similarly indicated by a flag. Alternative means of detecting family packs, family pack member items, and price change labels may be used in place of the flag or in addition to a flag, as described in this document (e.g., by label geometry, color, recognized image feature set or label template, etc.). - The detection of a family pack causes the controller to update the state by storing the family pack identifier in a state data structure and initiating a waiting period. The family pack identifier is queued for reporting at this point, as there is no need to wait to report it. Instead, this waiting period is used to prevent reporting an identifier of a member of the family pack for detections during waiting period initiated upon detection of the family pack. The waiting period is implemented using a timer as explained below. A duplicate time out period has a different objective from that of a waiting period to resolve a conflict. As such, it may be preferred to instantiate separate timers for duplicate and conflict rejection.
- The detection of a new family pack member causes the controller to check whether a family pack identifier with a pending waiting period is in a state data structure. The pending waiting period is indicated by the timer for the waiting period not being in time out state when queried for an update. If family pack is in a waiting period, the family pack member is not reported. If a family pack is not in a waiting period, the controller updates the state data structure by storing the family pack member's identifier and initiating a waiting period for it. This family pack member waiting period is used to instruct the controller to wait to determine whether a family pack identifier is detected in the waiting period. It may also be used for duplicate rejection. If a family pack identifier is detected in this family pack member waiting period, the family pack identifier is stored in a state data structure and is queued for reporting (there is no need to wait on reporting). Additionally, the family pack member is stored in a state data structure for duplicate rejection, and a family pack waiting period is initiated for the family pack identifier by setting a timer for a family pack waiting period.
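The family pack handling described above can be summarized in code. The following is a minimal sketch, assuming a simple in-memory state structure and an illustrative 1.5 second waiting period; the class and field names are hypothetical and are not part of this specification.

```python
import time

FAMILY_PACK_WAIT = 1.5   # seconds; illustrative value, not from the specification

class FamilyPackConflictState:
    """Sketch of the family-pack branch of the controller's state data structure."""

    def __init__(self):
        self.waiting = {}           # identifier -> waiting-period expiry (monotonic time)
        self.report_queue = []      # identifiers cleared for reporting to the POS
        self.pending_members = {}   # member id -> expiry, held until pack seen or timeout

    def _in_waiting_period(self, identifier):
        expiry = self.waiting.get(identifier)
        return expiry is not None and time.monotonic() < expiry

    def on_family_pack(self, pack_id):
        # A family pack identifier is queued for reporting right away; its waiting
        # period exists only to suppress member identifiers seen shortly afterwards.
        self.report_queue.append(pack_id)
        self.waiting[pack_id] = time.monotonic() + FAMILY_PACK_WAIT
        # Members seen earlier are not reported; a fuller implementation would
        # retain them for duplicate rejection rather than discarding them.
        self.pending_members.clear()

    def on_family_pack_member(self, member_id, pack_id):
        if self._in_waiting_period(pack_id):
            return  # the pack was already reported; do not report the member
        # Otherwise hold the member and wait to see whether the pack shows up.
        self.pending_members.setdefault(member_id, time.monotonic() + FAMILY_PACK_WAIT)

    def flush_timeouts(self):
        # Assumption: a member whose waiting period lapses with no pack detected
        # is treated as an individually sold item and reported.
        now = time.monotonic()
        for member_id, expiry in list(self.pending_members.items()):
            if now >= expiry:
                self.report_queue.append(member_id)
                del self.pending_members[member_id]
```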
- There are at least two types of price change labels: new fixed price and discount labels. When the controller finds a detection result with a new fixed price flag set, it stores the new fixed price code and queues it for reporting. From a reporting perspective, the controller reports the new fixed price instead of the original product identifier (GTIN) decoded from the same object. The scanner determines whether an identifier is from the same object by proximity in detection time or detection location of the price change label relative to the original product identifier (GTIN). Proximity in detection time is implemented based on a waiting period.
- In an implementation where new fixed price labels are employed, a waiting period is imposed for new identifiers detected because of the possibility that detection of a new fixed price label may replace the GTIN that the controller reports to the POS terminal. When a new product identifier is detected, and there is no waiting period for a new fixed price code in a state data structure, the new identifier is retained and a waiting period is initiated to determine whether a fixed price label is detected in that ensuing waiting period. If a new fixed price code is detected first before the original product identifier on the object, meaning that no product identifier is in a waiting period state in the state data structure, the new fixed price code is queued for reporting. Subsequent product identifiers in the waiting period are not reported, but may be stored for duplicate rejection.
- For a detected discount code, the controller stores the discount code in a state data structure and queues it for reporting. The scanner logic determines whether a product identifier is detected from the same object as noted in the previous case, e.g., by proximity in detection time and/or position in frame(s) relative to the discount label. If a product identifier from the same object is in the state data structure under its waiting period, the detected discount code is reported along with it. The discount code is stored for duplicate rejection, but is reported only once. If a discount is detected first, with no product identifier in a pending waiting period, the controller stores it in the state data structure and initiates a waiting period. It is reported if a new product identifier is detected in its waiting period. Since the discount should be associated with a product identifier, the controller may flag the POS terminal to have the checker scan or otherwise enter the product identifier of the product to which the discount code applies.
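The price change handling follows the same waiting-period pattern. Below is a hedged sketch of that logic, using hypothetical dictionary keys and an illustrative waiting period; it is a summary of the behavior described above, not the controller's actual implementation.

```python
import time

def resolve_price_change(detection, state, wait_s=1.5):
    """Sketch of price-change handling; `detection` and `state` are illustrative dicts."""
    now = time.monotonic()

    if detection["kind"] == "new_fixed_price":
        # A new fixed price replaces the original GTIN decoded from the same object.
        state["report"].append(detection["code"])
        state["suppress_gtin_until"] = now + wait_s
    elif detection["kind"] == "discount":
        gtin = state.get("pending_gtin")
        if gtin and now < state.get("gtin_wait_until", 0):
            state["report"].extend([gtin, detection["code"]])    # report together
        else:
            # Discount seen first: hold it and wait for a product identifier.
            state["pending_discount"] = detection["code"]
            state["discount_wait_until"] = now + wait_s
    elif detection["kind"] == "gtin":
        if now < state.get("suppress_gtin_until", 0):
            return state             # superseded by a new fixed price label
        if state.get("pending_discount") and now < state.get("discount_wait_until", 0):
            state["report"].extend([detection["code"], state.pop("pending_discount")])
        else:
            state["pending_gtin"] = detection["code"]
            state["gtin_wait_until"] = now + wait_s
    return state

def flush_price_change(state):
    # Assumption: a pending GTIN whose waiting period lapses with no price-change
    # label is reported as-is, mirroring the family pack timeout handling.
    now = time.monotonic()
    if state.get("pending_gtin") and now >= state.get("gtin_wait_until", 0):
        state["report"].append(state.pop("pending_gtin"))
```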
- In
step 206, the controller updates the state data structure with the detected identifier (including product or price change codes) and its status, including state associated with family pack or price change detection results. It also calls a timer instance, if one has been initiated, to get its count and update the status of the timer as timed out, or still pending. It may also retain other information helpful in resolving conflict among detected items. This information may include a frame identifier or time code to indicate where an identifier originated from within a frame or a time of the frame in which it was detected. This information may also include position information, such as orientation parameters and/or spatial location within a frame from which the identifier was extracted. In cases where different identifiers are detected within a frame, or within frames within a waiting period, the positional information may be used to determine that identifiers are from items that are to be priced separately, and as such, both reported to the POS. For example, if the identifiers originate from different frame locations and have tile orientations that are inconsistent, then they are candidates for being from separate objects, and are handled as such by the controller. - In
decision step 207, the controller determines whether to report the identifier or identifiers in the state data structure. The decision is based on the state of the identifiers in the data structure and the state of the timer used to track a waiting period that has been initiated. The controller reports an identifier, including price change codes, for which a waiting period has not been imposed, or for which the waiting period has timed out. Time out periods used only for duplicate rejection do not require a waiting period for reporting. However, potential conflicts arising from family pack or price changes may require a waiting period as described above. The controller determines whether an identifier is in a waiting period by checking the state data structure to determine whether the timer instance for a waiting period has timed out. In some cases, another detection will trigger a report prior to a timer reaching a time out state. In this case, the controller has updated the state data structure to signal that an identifier is in a state to be reported, or ignored. If it determines to report, the controller transmits the identifier(s) to the POS terminal via the scanner's communication interface as shown in block 208. - In the
next step 209, the controller sets up a timer for a waiting period, if necessary, for this pass through the controller process. The timer may be implemented with a soft timer, a software process such as a C++ timer object, which, in turn, interfaces with a timer interrupt service available in the scanner's operating system. In this approach, the timer creates a timer instance for a waiting period. The timer instance invokes the timer interrupt service to update its count. The timer interrupt service exposes a counter in the scanner hardware, e.g., as part of the ARM or other processor sub-system in the scanner. For flags that signal the start of a waiting period, such as a family pack or a member of a family pack, a new timer is initiated for that family pack-related waiting period. The same is true for price change-related waiting periods. -
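As a concrete illustration of such a soft timer, the following sketch polls a monotonic clock rather than a hardware timer interrupt; the class name and the 1.5 second duration are assumptions for illustration only.

```python
import time

class WaitingPeriodTimer:
    """Soft-timer sketch standing in for the C++ timer object described above."""

    def __init__(self, duration_s):
        self.expiry = time.monotonic() + duration_s

    def timed_out(self):
        return time.monotonic() >= self.expiry

    def remaining(self):
        return max(0.0, self.expiry - time.monotonic())

# Usage: one timer instance per waiting period, e.g., per family pack detection.
pack_timer = WaitingPeriodTimer(duration_s=1.5)   # 1.5 s is an illustrative value
if not pack_timer.timed_out():
    pass  # family pack still in its waiting period; suppress member identifiers
```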
FIG. 9 depicts an example of a sequence of operations of a controller implementation. The sequence of operations may vary from the one depicted here. For example the timer may be set within the set of instructions that execute the update to the state of 206. - As noted, code conflict logic may be implemented within each recognition unit, and at the level of the controller. Conflict logic within a recognition unit is employed to resolve conflict among codes of the same type detected by the recognition unit. For example, in the case where plural conflicting codes of the same type are present on a package, the recognition unit employs code conflict logic to prevent reporting an erroneous code to the controller, and ultimately, to prevent the scanner from reporting an improper code to the POS system.
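To tie the steps together, the following sketch outlines one possible shape of the controller loop of FIG. 9; every interface shown (imager, recognition unit, POS, state object) is a hypothetical stand-in rather than an API defined by this document.

```python
def controller_loop(imager, recognition_units, pos, state):
    """Illustrative controller loop roughly following steps 200-209 of FIG. 9."""
    for unit in recognition_units:                        # step 200: initiate recognition units
        unit.start()

    while imager.scanning():
        block_addr, attrs = imager.next_image_block()     # steps 201-202: pass block address
        unit = recognition_units[attrs["view"] % len(recognition_units)]
        unit.submit(block_addr, attrs)

        result = unit.get_result()                        # step 203: query recognition result
        if result is None or result.identifier is None:   # decision 204: identifier obtained?
            continue

        if state.is_duplicate(result.identifier):         # step 205: compare with prior identifiers
            continue
        state.update(result)                              # step 206: flags, position, timer status

        for identifier in state.reportable_identifiers(): # decision 207: report or keep waiting
            pos.report(identifier)                        # step 208: transmit to the POS terminal
        state.start_timers_if_needed(result)              # step 209: set up waiting-period timers
```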
- In one embodiment, the recognition unit writes its detection results to a data structure and returns the data structure (or pointer to it) when the controller queries it for detection results. The recognition unit records the state of detection results in the data structure, including whether a detected identifier is in a waiting period and whether a detected identifier is in a potentially conflicted status with another identifier. When plural different codes of the same symbology and type are detected within a frame, they are recorded as potentially conflicting. This may occur where there are two different GTINs without a family pack or price code relationship to justify the existence of the different GTINs. A waiting period is initiated for each code. For subsequent codes detected within the waiting period, the recognition unit updates the data structure. The recognition unit may be able to resolve the conflict based on detection results within the waiting period that confirm that one identifier should be given priority over another. For example, subsequent detection of one of the identifiers in subsequent image frames of a package within the waiting period may be sufficient to confirm that one identifier has priority and should be reported as such through the state data structure. Alternatively, the conflict may not be resolved, and instead, the recognition unit reports potentially conflicting identifiers on a package to the controller via a pointer to the data structure.
- In response, the controller either resolves the conflict based on detection results from another recognition unit and reports the highest priority identifier or reports an error to the POS system. For example, a GTIN in a barcode of one type reported from one recognition unit may agree with a GTIN in a different symbology reported from another recognition unit. For results within a waiting period, the controller compares the detection results from different recognition units and determines, based on matching the GTINs from different symbologies, that a conflicting GTIN can be excluded and the matching GTIN given priority. The controller then reports the higher priority GTIN. Alternatively, if a conflict persists or is not resolved, the controller signals an error to the POS system and prompts a re-scan, or manual entry. The re-scan may be switched to a presentment mode rather than a scan and pass mode so that the user can present the correct code for scanning.
- This approach for integrating recognition units in scanners enables the recognition units to be updated over time while maintaining the same interface with the scanner and the interface to its controller. Specifically, recognition units can become more sophisticated in detection performance, detection result and state reporting, and conflict logic. These updates are reflected in updates to the contents of the data structure, which provide more detail of the context of the detection of each identifier (e.g., location, time of detect, number of detects, waiting period state) as well as recommended reporting logic (e.g., reporting an instruction to the controller to hold for waiting period, resolve conflict between codes A, B, etc., or seek to confirm detection result with result of another recognition unit). The scanner may be updated on a different schedule without concern of becoming incompatible with the recognition unit, as the data structure is configured to include a detection result that is backward compatible. An older version of a controller continues to interpret simpler results as before, e.g., report GTIN, wait, or error. In contrast, a new version of the controller is preferably updated to interpret error or wait states in the extended data structure, as an instruction to read and resolve potential code conflicts identified in the extended data structure.
- Preferably, the recognition unit updates are provided with helper source code that gives scanner manufacturers guidance on how to exploit the additional detection result data and code conflict logic implemented by the recognition unit and reported in the extended data structure it returns.
-
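One way to picture the extended detection-result record discussed above is as a small structure that carries both the identifier and its reporting context. The sketch below is illustrative only; the field names and the recommendation values are assumptions, not a defined interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class DetectionResult:
    """Sketch of an extended detection-result record a recognition unit could return."""
    identifier: Optional[str] = None          # GTIN, price code, or similar item identifier
    symbology: str = "digimarc"
    frame_id: int = 0
    location: Optional[Tuple[int, int]] = None  # position within the frame, if known
    detect_count: int = 1
    waiting: bool = False                       # still inside a waiting period
    conflicts_with: List[str] = field(default_factory=list)
    recommendation: str = "report"              # e.g., "report", "hold", "confirm", "error"

# An older controller can ignore the extra fields and read only `identifier` and
# `recommendation`, which keeps the interface backward compatible.
```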
FIG. 10 is a block diagram of a signal encoder for encoding a digital payload signal into an image signal.FIG. 11 is a block diagram of a compatible signal decoder for extracting the digital payload signal from an image signal. - While the signal encoder and decoder may be used for communicating a data channel for many applications, the objective for use in physical objects is robust signal communication through images formed on and captured from these objects. Signal encoders and decoders, like those in the Digimarc Barcode Platform from Digimarc Corporation, communicate auxiliary data in a data carrier within image content.
- Encoding and decoding is applied digitally, yet the signal survives digital to analog transformation and analog to digital transformation. For example, the encoder generates a modulated image that is converted to a rendered form, such as a printed image. Prior to decoding, a receiving device has an imager to capture the modulated signal, convert it to an electric signal, which is digitized and then processed by the decoder.
- Inputs to the signal encoder include a
host image 220 andauxiliary data payload 222. The objectives of the encoder include encoding a robust signal with desired payload capacity per unit of host signal (e.g., the spatial area of a two-dimensional tile), while maintaining perceptual quality. In some cases, there may be very little variability or presence of a host signal. In this case, there is little host interference on the one hand, yet little host content in which to mask the presence of the data channel within an image. Some examples include a package design that is devoid of much image variability (e.g., a single, uniform color). See, e.g., US Published Application No. 20160275639, entitled SPARSE MODULATION FOR ROBUST SIGNALING AND SYNCHRONIZATION, incorporated herein by reference. - The
auxiliary data payload 222 includes the variable data information to be conveyed in the data channel, possibly along with other protocol data used to facilitate the communication. The protocol of the auxiliary data encoding scheme comprises the format of the auxiliary data payload, error correction coding schemes, payload modulation methods (such as the carrier signal, spreading sequence, encoded payload scrambling or encryption key), signal structure (including mapping of modulated signal to embedding locations within a tile), error detection in payload (CRC, checksum, etc.), perceptual masking method, host signal insertion function (e.g., how auxiliary data signal is embedded in or otherwise combined with host image signal in a package or label design), and synchronization method and signals. - The protocol defines the manner in which the signal is structured and encoded for robustness, perceptual quality or data capacity. For a particular application, there may be a single protocol, or more than one protocol, depending on application requirements. Examples of multiple protocols include cases where there are different versions of the channel, different channel types (e.g., several digital watermark layers within a host). Different versions may employ different robustness encoding techniques or different data capacity.
Protocol selector module 224 determines the protocol to be used by the encoder for generating a data signal. It may be programmed to employ a particular protocol depending on the input variables, such as user control, application specific parameters, or derivation based on analysis of the host signal. -
Perceptual analyzer module 226 analyzes the input host signal to determine parameters for controlling signal generation and embedding, as appropriate. It is not necessary in certain applications, while in others it may be used to select a protocol and/or modify signal generation and embedding operations. For example, when encoding in host color images that will be printed or displayed, the perceptual analyzer 156 is used to ascertain color content and masking capability of the host image. The output of this analysis, along with the rendering method (display or printing device) and rendered output form (e.g., ink and substrate) is used to control auxiliary signal encoding in particular color channels (e.g., one or more channels of process inks, Cyan, Magenta, Yellow, or Black (CMYK) or spot colors), perceptual models, and signal protocols to be used with those channels. Please see, e.g., our work on visibility and color models used in perceptual analysis in our U.S. application Ser. No. 14/616,686 (issued as U.S. Pat. No. 9,380,186) and Ser. No. 14/588,636 (issued as U.S. Pat. No. 9,401,001) and U.S. Pat. Nos. 9,449,357, 9,117,268 and 7,352,878, which are hereby incorporated by reference. - The
perceptual analyzer module 226 also computes a perceptual model, as appropriate, to be used in controlling the modulation of a data signal onto a data channel within image content as described below. - The
signal generator module 228 operates on the auxiliary data and generates a data signal according to the protocol. It may also employ information derived from the host signal, such as that provided byperceptual analyzer module 226, to generate the signal. For example, the selection of data code signal and pattern, the modulation function, and the amount of signal to apply at a given embedding location may be adapted depending on the perceptual analysis, and in particular on the perceptual model and perceptual mask that it generates. Please see below and the incorporated patent documents for additional aspects of this process. -
Embedder module 230 takes the data signal and modulates it into an image by combining it with the host image. The operation of combining may be an entirely digital signal processing operation, such as where the data signal modulates the host signal digitally, may be a mixed digital and analog process or may be purely an analog process (e.g., where rendered output images, with some signals being modulated data and others being host image content, such as the various layers of a package design file). - There are a variety of different functions for combining the data and host in digital operations. One approach is to adjust the host signal value as a function of the corresponding data signal value at an embedding location, which is limited or controlled according to the perceptual model and a robustness model for that embedding location. The adjustment may be altering the host image by adding a scaled data signal or multiplying by a scale factor dictated by the data signal value corresponding to the embedding location, with weights or thresholds set on the amount of the adjustment according to the perceptual model, robustness model, and available dynamic range. The adjustment may also be altering by setting the modulated host signal to a particular level (e.g., quantization level) or moving it within a range or bin of allowable values that satisfy a perceptual quality or robustness constraint for the encoded data.
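A minimal sketch of the additive combine step, assuming the host image, data signal and per-pixel visibility limit are already expressed as equally sized arrays, is shown below; the function and parameter names are illustrative.

```python
import numpy as np

def embed_tile(host, data_signal, gain, visibility_limit):
    """Add a scaled data signal to the host, limited by a perceptual-model bound.

    host, data_signal and visibility_limit are equally sized 2-D arrays; gain is a
    scalar strength control. All values here are illustrative assumptions.
    """
    adjustment = gain * data_signal
    # Clip each adjustment to what the perceptual model deems unnoticeable.
    adjustment = np.clip(adjustment, -visibility_limit, visibility_limit)
    marked = host.astype(np.float64) + adjustment
    return np.clip(marked, 0, 255).astype(np.uint8)   # respect available dynamic range
```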
- As detailed further below, the signal generator produces a data signal with data elements that are mapped to embedding locations in a tile. These data elements are modulated onto the host image at the embedding locations. A tile is a pattern of embedding locations. The tile derives its name from the way in which it is repeated in contiguous blocks of a host signal, but it need not be arranged this way. In image-based encoders, we use tiles in the form of a two dimensional array (e.g., 128 by 128, 256 by 256, 512 by 512) of embedding locations. The embedding locations correspond to host signal samples at which an encoded signal element is embedded in an embedding domain, such as a spatial domain (e.g., pixels at a spatial resolution), frequency domain (frequency components at a frequency resolution), or some other feature space. We sometimes refer to an embedding location as a bit cell, referring to a unit of data (e.g., an encoded bit or chip element) encoded within a host signal at the location of the cell. Again please see the documents incorporated herein for more information on variations for particular type of media.
- The operation of combining may include one or more iterations of adjustments to optimize the modulated host for perceptual quality or robustness constraints. One approach, for example, is to modulate the host image so that it satisfies a perceptual quality metric as determined by perceptual model (e.g., visibility model) for embedding locations across the signal. Another approach is to modulate the host image so that it satisfies a robustness metric across the signal. Yet another is to modulate the host image according to both the robustness metric and perceptual quality metric derived for each embedding location. The incorporated documents provide examples of these techniques. Below, we highlight a few examples. See, e.g., U.S. Pat. Nos. 9,449,357 and 9,401,001, and US Published Patent Application No. US 2016-0316098 A1, which are hereby incorporated herein by reference.
- For color images, the perceptual analyzer generates a perceptual model that evaluates visibility of an adjustment to the host by the embedder and sets levels of controls to govern the adjustment (e.g., levels of adjustment per color direction, and per masking region). This may include evaluating the visibility of adjustments of the color at an embedding location (e.g., units of noticeable perceptual difference in color direction in terms of CIE Lab values), Contrast Sensitivity Function (CSF), spatial masking model (e.g., using techniques described by Watson in US Published Patent Application No. US 2006-0165311 A1, which is incorporated by reference herein), etc. One way to approach the constraints per embedding location is to combine the data with the host at embedding locations and then analyze the difference between the encoded host with the original. The perceptual model then specifies whether an adjustment is noticeable based on the difference between a visibility threshold function computed for an embedding location and the change due to embedding at that location. The embedder then can change or limit the amount of adjustment per embedding location to satisfy the visibility threshold function. Of course, there are various ways to compute adjustments that satisfy a visibility threshold, with different sequence of operations. See, e.g., Digimarc's U.S. Pat. Nos. 9,449,357, 9,401,001, 9,380,186, 9,117,268 and 7,352,878, which are each hereby incorporated herein by reference in its entirety.
- The embedder also computes a robustness model. The computing of a robustness model may include computing a detection metric for an embedding location or region of locations. The approach is to model how well the decoder will be able to recover the data signal at the location or region. This may include applying one or more decode operations and measurements of the decoded signal to determine how strong or reliable the extracted signal. Reliability and strength may be measured by comparing the extracted signal with the known data signal. Below, we detail several decode operations that are candidates for detection metrics within the embedder. One example is an extraction filter which exploits a differential relationship to recover the data signal in the presence of noise and host signal interference. At this stage of encoding, the host interference is derivable by applying an extraction filter to the modulated host. The extraction filter models data signal extraction from the modulated host and assesses whether the differential relationship needed to extract the data signal reliably is maintained. If not, the modulation of the host is adjusted so that it is.
- Detection metrics may be evaluated such as by measuring signal strength as a measure of correlation between the modulated host and variable or fixed data components in regions of the host, or measuring strength as a measure of correlation between output of an extraction filter and variable or fixed data components. Depending on the strength measure at a location or region, the embedder changes the amount and location of host signal alteration to improve the correlation measure. These changes may be particularly tailored so as to establish relationships of the data signal within a particular tile, region in a tile or bit cell pattern of the modulated host. To do so, the embedder adjusts bit cells that violate the relationship so that the relationship needed to encode a bit (or M-ary symbol) value is satisfied and the thresholds for perceptibility are satisfied. Where robustness constraints are dominant, the embedder will exceed the perceptibility threshold where necessary to satisfy a desired robustness threshold.
- The robustness model may also model distortion expected to be incurred by the modulated host, apply the distortion to the modulated host, and repeat the above process of measuring detection metrics and adjusting the amount of alterations so that the data signal will withstand the distortion. See, e.g., U.S. Pat. Nos. 9,380,186, 9,401,001 and 9,449,357, which are each hereby incorporated herein by reference, for image related processing.
- This modulated host is then output as an
output image signal 232, with a data channel encoded in it. The operation of combining also may occur in the analog realm where the data signal is transformed to a rendered form, such as a layer of ink or coating applied by a commercial press to a substrate. Another example is a data signal that is overprinted as a layer of material, engraved in, or etched onto a substrate, where it may be mixed with other signals applied to the substrate by similar or other marking methods. In these cases, the embedder employs a predictive model of distortion and host signal interference, and adjusts the data signal strength so that it will be recovered more reliably. The predictive modeling can be executed by a classifier that classifies types of noise sources or classes of host image and adapts signal strength and configuration of the data pattern to be more reliable for the classes of noise sources and host image signals that the encoded data signal is likely to encounter or be combined with. - The
output 232 from the embedder typically incurs various forms of distortion through its distribution or use. For printed objects, this distortion occurs through rendering an image with the encoded signal in the printing process, and subsequent scanning back to a digital image via a camera or like image sensor. - Turning to
FIG. 11 , the signal decoder receives an encodedhost signal 240 and operates on it with one or more processing stages to detect a data signal, synchronize it, and extract data. This signal decoder corresponds to a type of recognition unit inFIG. 5 and watermark processor inFIG. 6 . - The decoder is paired with an input device in which a sensor captures an analog form of the signal and an analog to digital converter converts it to a digital form for digital signal processing. Though aspects of the decoder may be implemented as analog components, e.g., such as preprocessing filters that seek to isolate or amplify the data channel relative to noise, much of the decoder is implemented as digital signal processing modules that implement the signal processing operations within a scanner. As noted, these modules are implemented as software instructions executed within the scanner, an FPGA, or ASIC.
- The
detector 242 is a signal processing module that detects presence of the data channel. The incoming signal is referred to as a suspect host because it may not have a data channel or may be so distorted as to render the data channel undetectable. The detector is in communication with aprotocol selector 244 to get the protocols it uses to detect the data channel. It may be configured to detect multiple protocols, either by detecting a protocol in the suspect signal and/or inferring the protocol based on attributes of the host signal or other sensed context information. A portion of the data signal may have the purpose of indicating the protocol of another portion of the data signal. As such, the detector is shown as providing a protocol indicator signal back to theprotocol selector 244. - The
synchronizer module 246 synchronizes the incoming signal to enable data extraction. Synchronizing includes, for example, determining the distortion to the host signal and compensating for it. This process provides the location and arrangement of encoded data elements within the host signal. - The
data extractor module 248 gets this location and arrangement and the corresponding protocol and demodulates a data signal from the host. The location and arrangement provide the locations of encoded data elements. The extractor obtains estimates of the encoded data elements and performs a series of signal decoding operations. - As detailed in examples below and in the incorporated documents, the detector, synchronizer and data extractor may share common operations, and in some cases may be combined. For example, the detector and synchronizer may be combined, as initial detection of a portion of the data signal used for synchronization indicates presence of a candidate data signal, and determination of the synchronization of that candidate data signal provides synchronization parameters that enable the data extractor to apply extraction filters at the correct orientation, scale and start location of a tile. Similarly, data extraction filters used within data extractor may also be used to detect portions of the data signal within the detector or synchronizer modules. The decoder architecture may be designed with a data flow in which common operations are re-used iteratively, or may be organized in separate stages in pipelined digital logic circuits so that the host data flows efficiently through the pipeline of digital signal operations with minimal need to move partially processed versions of the host data to and from a shared memory unit, such as a RAM memory.
-
FIG. 12 is a flow diagram illustrating operations of a signal generator. Each of the blocks in the diagram depict processing modules that transform the input auxiliary data (e.g., GTIN or other item identifier plus flags) into a digital payload data signal structure. For a given protocol, each block provides one or more processing stage options selected according to the protocol. Inprocessing module 300, the auxiliary data payload is processed to compute error detection bits, e.g., such as a Cyclic Redundancy Check, Parity, check sum or like error detection message symbols. Additional fixed and variable messages used in identifying the protocol and facilitating detection, such as synchronization signals may be added at this stage or subsequent stages. - Error
correction encoding module 302 transforms the message symbols of the digital payload signal into an array of encoded message elements (e.g., binary or M-ary elements) using an error correction method. Examples include block codes, BCH, Reed Solomon, convolutional codes, turbo codes, etc. -
Repetition encoding module 304 repeats and concatenates the string of symbols from the prior stage to improve robustness. For example, certain message symbols may be repeated at the same or different rates by mapping them to multiple locations within a unit area of the data channel (e.g., one unit area being a tile of bit cells, as described further below). - Repetition encoding may be removed and replaced entirely with error correction coding. For example, rather than applying convolutional encoding (1/3 rate) followed by repetition (repeat three times), these two can be replaced by convolution encoding to produce a coded payload with approximately the same length.
- Next,
carrier modulation module 306 takes message elements of the previous stage and modulates them onto corresponding carrier signals. For example, a carrier might be an array of pseudorandom signal elements, with equal number of positive and negative elements (e.g., 16, 32, 64 elements), or other waveform. We elaborate further on signal configurations below. -
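A hedged sketch of this modulation step is shown below; the carriers here are drawn from a seeded pseudorandom generator purely for illustration, whereas a real protocol would fix them as part of the signaling specification.

```python
import numpy as np

rng = np.random.default_rng(seed=7)   # stand-in for a protocol-defined carrier key

def balanced_carrier(chips_per_bit=16):
    # Equal numbers of +1 and -1 elements, pseudo-randomly ordered.
    half = chips_per_bit // 2
    return rng.permutation(np.array([1] * half + [-1] * half))

def modulate(coded_bits, chips_per_bit=16):
    """Spread each error-correction-coded bit over its own +/-1 carrier.

    Returns one chip per embedding location (mapping chips to locations is the
    job of the next stage) along with the carriers used, for later demodulation.
    """
    carriers = np.stack([balanced_carrier(chips_per_bit) for _ in coded_bits])
    symbols = 2 * np.asarray(coded_bits) - 1            # map {0,1} -> {-1,+1}
    return (carriers * symbols[:, None]).ravel(), carriers
```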
Mapping module 308 maps signal elements of each modulated carrier signal to locations within the channel. In the case where a digital host signal is provided, the locations correspond to embedding locations within the host signal. The embedding locations may be in one or more coordinate system domains in which the host signal is represented within a memory of the signal encoder. The locations may correspond to regions in a spatial domain, temporal domain, frequency domain, or some other transform domain. Stated another way, the locations may correspond to a vector of host signal features, which are modulated to encode a data signal within the features. -
Mapping module 308 also maps a synchronization signal to embedding locations within the host signal, for embodiments employing an explicit synchronization signal. An explicit synchronization signal is described further below. - To accurately recover the payload, the decoder must be able to extract estimates of the coded bits at the embedding locations within each tile. This requires the decoder to synchronize the image under analysis to determine the embedding locations. For images, where the embedding locations are arranged in two dimensional blocks within a tile, the synchronizer determines rotation, scale and translation (origin) of each tile. This may also involve approximating the geometric distortion of the tile by an affine transformation that maps the embedded signal back to its original embedding locations.
- To facilitate synchronization, the auxiliary signal may include an explicit or implicit synchronization signal. An explicit synchronization signal is an auxiliary signal separate from the encoded payload that is embedded with the encoded payload, e.g., within the same tile). An implicit synchronization signal is a signal formed with the encoded payload, giving it structure that facilitates geometric/temporal synchronization. Examples of explicit and implicit synchronization signals are provided in our previously cited patents U.S. Pat. Nos. 6,614,914, and 5,862,260.
- In particular, one example of an explicit synchronization signal is a signal comprised of a set of sine waves, with pseudo-random phase, which appear as peaks in the Fourier domain of the suspect signal. See, e.g., U.S. Pat. Nos. 6,614,914, and 5,862,260, describing use of a synchronization signal in conjunction with a robust data signal. Also see U.S. Pat. No. 7,986,807, which is hereby incorporated by reference.
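For illustration only, an explicit synchronization signal of this kind can be sketched as a sum of two-dimensional sinusoids with pseudo-random phases, which appear as peaks in the Fourier magnitude domain; the peak count, frequency band and tile size below are assumptions, not the protocol's values.

```python
import numpy as np

def synchronization_signal(tile=128, n_peaks=32, seed=11):
    """Build a spatial-domain sync signal as a sum of sinusoids with random phases."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:tile, 0:tile]
    signal = np.zeros((tile, tile))
    for _ in range(n_peaks):
        fx, fy = rng.integers(2, tile // 4, size=2)        # mid-band integer frequencies
        phase = rng.uniform(0, 2 * np.pi)
        signal += np.cos(2 * np.pi * (fx * x + fy * y) / tile + phase)
    return signal / n_peaks                                 # normalized for later scaling
```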
- Our U.S. Pat. No. 9,182,778, which is hereby incorporated by reference, provides additional methods for detecting an embedded signal with this type of structure and recovering rotation, scale and translation from these methods.
- Examples of implicit synchronization signals, and their use, are provided in U.S. Pat. Nos. 6,614,914 and 5,862,260, as well as U.S. Pat. Nos. 6,625,297 and 7,072,490, and US Published Patent Application No. 20160217547, which are hereby incorporated by reference in their entirety.
-
FIG. 13 is a diagram illustrating embedding of an auxiliary signal into host signal. As shown, the inputs are a host signal block (e.g., blocks of a host digital image) (320) and an encoded auxiliary signal (322), which is to be inserted into the signal block. The encoded auxiliary signal may include an explicit synchronization component, or the encoded payload may be formulated to provide an implicit synchronization signal.Processing block 324 is a routine of software instructions or equivalent digital logic configured to insert the mapped signal(s) into the host by adjusting the corresponding host signal sample(s) at an embedding location according to the value of the mapped signal element. For example, the mapped signal is added/subtracted from corresponding a sample value, with scale factor and threshold from the perceptual model or like mask controlling the adjustment amplitude. In implementations with an explicit synchronization signal, the encoded payload and synchronization signals may be combined and then added, or added separately with separate mask coefficients to control the signal amplitude independently. - Applying the method of
FIG. 12, the product or label identifier (e.g., in GTIN format) and additional flag or flags used by control logic are formatted into a binary sequence, which is encoded and mapped to the embedding locations of a tile. For sake of illustration, we describe an implementation of a tile having 256 by 256 embedding locations, where the embedding locations correspond to spatial domain embedding locations within an image. In particular, the spatial locations correspond to pixel samples at a configurable spatial resolution, such as 100 or 300 DPI. In this example, we will explain the case where the spatial resolution of the embedded signal is 300 DPI, for an embodiment where the resulting image with encoded data is printed on a package or label material, such as a paper, plastic or like substrate. The payload is repeated in contiguous tiles, each comprised of 256 by 256 embedding locations. With these embedding parameters, an instance of the payload is encoded in each tile, occupying a block of host image of about 1.28 by 1.28 inches. These parameters are selected to provide a printed version of the image on paper or other substrate. At this size, the payload can be redundantly encoded in several contiguous tiles, providing added robustness. An alternative for achieving desired payload capacity is to encode a portion of the payload in smaller tiles, e.g., 128 by 128, and use a protocol indicator to specify the portion of the payload conveyed in each 128 by 128 tile. Erasure codes may be used to convey different payload components per tile and then assemble the components in the decoder, as elaborated upon below. - Following the construction of the payload, error correction coding is applied to the binary sequence. This implementation applies a convolutional coder at
rate 1/4, which produces an encoded payload signal of 4096 bits. Each of these bits is modulated onto a binary antipodal, pseudorandom carrier sequence (−1, 1) oflength 16, e.g., multiply or XOR the payload bit with the binary equivalent of chip elements in its carrier to yield 4096 modulated carriers, for a signal comprising 65,536 elements. These elements map to the 65,536 embedding locations in each of the 256 by 256 tiles. - An alternative embodiment, for robust encoding on packaging employs tiles of 128 by 128 embedding locations. Through convolutional coding of an input payload at
rate 1/3 and subsequent repetition coding, an encoded payload of 1024 bits is generated. Each of these bits is modulated onto a similar carrier sequence oflength 16, and the resulting 16,384 signal elements are mapped to the 16,384 embedding locations within the 128 by 128 tile. - There are several alternatives for mapping functions to map the encoded payload to embedding locations. In one, these elements have a pseudorandom mapping to the embedding locations. In another, they are mapped to bit cell patterns of differentially encoded bit cells as described in US Published Patent Application no. 20160217547, incorporated above. In the latter, the tile size may be increased to accommodate the differential encoding of each encoded bit in a pattern of differential encoded bit cells, where the bit cells corresponding to embedding locations at a target resolution (e.g., 300 DPI).
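The arithmetic behind these two configurations can be checked directly; the snippet below simply verifies that the number of coded bits times the chips per bit fills each tile exactly.

```python
# Sanity check of the payload-to-tile arithmetic for the two configurations above.
CHIPS_PER_BIT = 16

# 256 x 256 tile: the rate 1/4 convolutional coder yields 4096 coded bits.
coded_bits_256 = 4096
assert coded_bits_256 * CHIPS_PER_BIT == 256 * 256    # 65,536 embedding locations

# 128 x 128 tile: rate 1/3 coding plus repetition yields 1024 coded bits.
coded_bits_128 = 1024
assert coded_bits_128 * CHIPS_PER_BIT == 128 * 128    # 16,384 embedding locations
```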
- Our published US Patent Application No. 20160275639, incorporated above, describes methods for inserting auxiliary signals in areas of package and label designs that have little host image variability. These methods are particularly useful for labels, including price change labels and fresh food labels. These signal encoding methods may be ported to the printing sub-system in scales used within fresh food, deli and meat departments to encode GTINs and control flags for variable weight items in the image of a label, which is then printed by the printer sub-system (typically a thermal printer) on the label and affixed to an item.
- For an explicit synchronization signal, the mapping function maps a discrete digital image of the synchronization signal to the host image block. For example, where the synchronization signal comprises a set of Fourier magnitude peaks or sinusoids with pseudorandom phase, the synchronization signal is generated in the spatial domain in a block size coextensive with the 256 by 256 tile (or other tile size, e.g., 128 by 128) at target embedding resolution.
- Various detailed examples of encoding protocols and processing stages of these protocols are provided in our prior work, such as our U.S. Pat. Nos. 6,614,914, 5,862,260, 6,674,876, and 9,117,268, which are hereby incorporated by reference, and US Patent Publication No 20160275639, previously incorporated. More background on signaling protocols, and schemes for managing compatibility among protocols, are provided in U.S. Pat. No. 7,412,072, which is hereby incorporated by reference.
- One signaling approach, which is detailed in U.S. Pat. Nos. 6,614,914, and 5,862,260, is to map elements to pseudo-random locations within a channel defined by a domain of a host signal. See, e.g., FIG. 9 of U.S. Pat. No. 6,614,914. In particular, elements of a watermark signal are assigned to pseudo-random embedding locations within an arrangement of sub-blocks within a block (referred to as a “tile”). The elements of this watermark signal correspond to error correction coded bits output from an implementation of
stage 304 of FIG. 12. These bits are modulated onto a pseudo-random carrier to produce watermark signal elements (block 306 of FIG. 12), which, in turn, are assigned to the pseudorandom embedding locations within the sub-blocks (block 308 of FIG. 12). An embedder module modulates this signal onto a host signal by increasing or decreasing host signal values at these locations for each error correction coded bit according to the values of the corresponding elements of the modulated carrier signal for that bit. -
FIG. 14 is a flow diagram illustrating a method for decoding a payload signal from a host image signal. This method is a particular embodiment of a recognition unit ofFIG. 5 , and a watermark processor ofFIG. 6 . Implementations of recognition unit and watermark processors available from Digimarc Corporation include: - Digimarc Mobile Software Development Kit; and
- Digimarc Embedded Systems SDK.
- The Embedded Systems SDK is the one typically integrated into scanner hardware.
- Corresponding encoder embodiments available from Digimarc Corporation include:
- Digimarc Barcode SDKs
- Digimarc Barcode Plugin
- Returning to
FIG. 14 , the frames are captured at a resolution preferably near the resolution at which the auxiliary signal has been encoded within the original image (e.g., 300 DPI, 100 DPI, etc.). An image up-sampling or down-sampling operation may be performed to convert the image frames supplied by the imager to a target resolution for further decoding. - The resulting image blocks supplied to the decoder from these frames may potentially include an image with the payload. At least some number of tiles of encoded signal may be captured within the field of view, if an object with encoded data is being scanned. Otherwise, no encoded tiles will be present. The objective, therefore, is to determine as efficiently as possible whether encoded tiles are present.
- In the initial processing of the decoding method, it is advantageous to select frames and blocks within frames that have image content that are most likely to contain the encoded payload. From the image passed to the decoder, the decoder selects image blocks for further analysis. The block size of these blocks is set large enough to span substantially all of a complete tile of encoded payload signal, and preferably a cluster of neighboring tiles. However, because the distance from the camera may vary, the spatial scale of the encoded signal is likely to vary from its scale at the time of encoding. This spatial scale distortion is further addressed in the synchronization process.
- For more on block selection, please see co-pending U.S. Pat. No. 9,521,291, which is hereby incorporated herein by reference.
- Please also see US Published Patent Application No. US 2016-0364623 A1, which is hereby incorporated herein by reference, for more on block selection where processing is time is more limited.
- The first stage of the decoding process filters the image to prepare it for detection and synchronization of the encoded signal (402). The decoding process sub-divides the image into blocks and selects blocks for further decoding operations. For color images, a first filtering stage converts the input color image signal (e.g., RGB values) to a color channel or channels where the auxiliary signal has been encoded. See, e.g., U.S. Pat. No. 9,117,268 for more on color channel encoding and decoding. For an image captured under red illumination by a monochrome scanner, the decoding process operates on this “red” channel sensed by the scanner. Some scanners may pulse LEDs of different color to obtain plural color or spectral samples per pixel as described in our Patent Application Publication 2013-0329006, entitled COORDINATED ILLUMINATION AND IMAGE SIGNAL CAPTURE FOR ENHANCED SIGNAL DETECTION, which is hereby incorporated by reference.
- A second filtering operation isolates the auxiliary signal from the host image. Pre-filtering is adapted for the auxiliary signal encoding format, including the type of synchronization employed. For example, where an explicit synchronization signal is used, pre-filtering is adapted to isolate the explicit synchronization signal for the synchronization process.
- In some embodiments, the synchronization signal is a collection of peaks in the Fourier domain. Prior to conversion to the Fourier domain, the image blocks are pre-filtered. See, e.g., LaPlacian pre-filter in U.S. Pat. No. 6,614,914. A window function is applied to the blocks and then a transform to the Fourier domain, applying an FFT. Another filtering operation is performed in the Fourier domain. See, e.g., pre-filtering options in U.S. Pat. Nos. 6,988,202, 6,614,914, 20120078989, which are hereby incorporated by reference.
- For more on filters, also see U.S. Pat. No. 7,076,082, which is hereby incorporated by reference. This patent describes a multi-axis filter, e.g., an oct-axis filter. Oct axis compares a discrete image sample with eight neighbors to provide a compare value (e.g., +1 for positive difference, −1 or negative difference), and sums the compare values. Different arrangements of neighbors and weights may be applied to shape the filter according to different functions. Another filter variant is a cross shaped filter, in which a sample of interest is compared with an average of horizontal neighbors and vertical neighbors, which are then similarly summed.
- Next, synchronization process (404) is executed on a filtered block to recover the rotation, spatial scale, and translation of the encoded signal tiles. This process may employ a log polar method as detailed in U.S. Pat. No. 6,614,914 or least squares approach of 20120078989 to recover rotation and scale of a synchronization signal comprised of peaks in the Fourier domain. To recover translation, the phase correlation method of U.S. Pat. No. 6,614,914 is used, or phase estimation and phase deviation methods of U.S. Pat. No. 9,182,778, which is hereby incorporated herein by reference, are used.
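A minimal sketch of a multi-axis ("oct-axis") comparison filter of the kind used as a pre-filter in this decoding flow follows; it is written for clarity rather than speed, and the neighbor weighting used in practice may differ.

```python
import numpy as np

def oct_axis(image):
    """Compare each pixel with its eight neighbors (+1/-1) and sum the comparisons."""
    img = image.astype(np.int32)
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            out += np.sign(img - neighbor)
    return out    # values in [-8, 8]; host image content is largely suppressed
```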
- Alternative methods perform synchronization on an implicit synchronization signal, e.g., as detailed in published application no. 20160217547.
- Next, the decoder steps through the embedding locations in a tile, extracting bit estimates from each location (406). This process applies, for each location, the rotation, scale and translation parameters, to extract a bit estimate from each embedding location (406). In particle, as it visits each embedding location in a tile, it transforms it to a location in the received image based on the affine transform parameters derived in the synchronization, and then samples around each location. It does this process for the embedding location and its neighbors to feed inputs to an extraction filter (e.g., oct axis or cross shaped). A bit estimate is extracted at each embedding location using filtering operations, e.g., oct axis or cross shaped filter (see above), to compare a sample at embedding locations with neighbors. The output (e.g., 1, −1) of each compare operation is summed to provide an estimate for an embedding location. Each bit estimate at an embedding location corresponds to an element of a modulated carrier signal.
- The signal decoder estimates a value of each error correction encoded bit by accumulating the bit estimates from the embedding locations of the carrier signal for that bit (408). For instance, in the encoder embodiment above, error correction encoded bits are modulated over a corresponding carrier signal with 16 elements (e.g., multiplied by or XOR with a binary anti-podal signal). A bit value is demodulated from the estimates extracted from the corresponding embedding locations of these elements. This demodulation operation multiplies the estimate by the carrier signal sign and adds the result. This demodulation provides a soft estimate for each error correction encoded bit.
- These soft estimates are input to an error correction decoder to produce the payload signal (410). For a convolutional encoded payload, a Viterbi decoder is used to produce the payload signal, including the checksum or CRC. For other forms of error correction, a compatible decoder is applied to reconstruct the payload. Examples include block codes, BCH, Reed Solomon, Turbo codes.
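The demodulation step can be sketched as follows, assuming for simplicity that chips are laid out contiguously per coded bit; a real implementation would first apply the tile's chip-to-location mapping.

```python
import numpy as np

def demodulate(estimates, carriers):
    """Soft-demodulate coded bits from per-location filter outputs.

    estimates holds one filtered value per embedding location; carriers is the
    (n_bits, chips_per_bit) +/-1 spreading matrix used at the encoder.
    """
    chips = np.asarray(estimates).reshape(carriers.shape)   # group chips by coded bit
    soft = np.sum(chips * carriers, axis=1)                  # multiply by carrier sign, sum
    hard = (soft > 0).astype(np.uint8)                       # hard decisions, if needed
    return soft, hard                                        # soft values feed the Viterbi decoder
```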
- Next, the payload is validated by computing the check sum and comparing with the decoded checksum bits (412). The check sum matches the one in the encoder, of course. For the example above, the decoder computes a CRC for a portion of the payload and compares it with the CRC portion in the payload.
- At this stage, the payload is stored in shared memory of the decoder process. The recognition unit in which the decoder process resides returns it to the controller via its interface. This may be accomplished by various communication schemes, such as IPC, shared memory within a process, DMA, etc.
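As an illustration of this validation step only, the snippet below uses the CRC-32 routine from Python's zlib; the actual check sum or CRC defined by the encoding protocol may differ.

```python
import zlib

def payload_is_valid(payload_bytes: bytes, received_crc: int) -> bool:
    """Recompute a CRC over the decoded payload and compare with the decoded CRC bits."""
    return (zlib.crc32(payload_bytes) & 0xFFFFFFFF) == received_crc
```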
- The scanner may also include a recognition unit that implements an image recognition method for identifying a product in a store's inventory as well as product labels, such as price change labels. In such a system, reference image feature sets of each product are stored in a database of the scanner's memory and linked to an item identifier for a product and/or particular label (e.g., price change label). The recognition unit extracts corresponding features from an image frame and matches them against the reference feature sets to detect a likely match. If the match criteria are satisfied, the recognition unit returns an item identifier to the controller. The recognition unit may also return spatial information, such as position, bounding box, shape or other geometric parameters for a recognized item to enable the controller to detect whether a code from another recognition unit is from the same object.
- One form of recognition system is an image fingerprint-based system. SIFT, SURF, ORB and CONGAS are some of the most popular algorithms. SIFT, SURF and ORB are each implemented in the popular OpenCV software library, e.g., version 2.3.1. CONGAS is used by Google Goggles for that product's image recognition service, and is detailed, e.g., in Neven et al, “Image Recognition with an Adiabatic Quantum Computer I. Mapping to Quadratic Unconstrained Binary Optimization,” Arxiv preprint arXiv:0804.4457, 2008.
- SIFT is an acronym for Scale-Invariant Feature Transform, a computer vision technology pioneered by David Lowe and described in various of his papers including “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, 60, 2 (2004), pp. 91-110; and “Object Recognition from Local Scale-Invariant Features,” International Conference on Computer Vision, Corfu, Greece (September 1999), pp. 1150-1157, as well as in U.S. Pat. No. 6,711,293, which is hereby incorporated herein by reference.
- SIFT works by identification and description—and subsequent detection—of local image features. The SIFT features are local and based on the appearance of the object at particular interest points, and are invariant to image scale, rotation and affine transformation. They are also robust to changes in illumination, noise, and some changes in viewpoint. In addition to these properties, they are distinctive, relatively easy to extract, allow for correct object identification with low probability of mismatch and are straightforward to match against a (large) database of local features. Object description by set of SIFT features is also robust to partial occlusion; as few as 3 SIFT features from an object can be enough to compute location and pose.
- The technique starts by identifying local image features, termed keypoints, in a reference image. This is done by convolving the image with Gaussian blur filters at different scales (resolutions), and determining differences between successive Gaussian-blurred images. Keypoints are those image features having maxima or minima of the difference of Gaussians occurring at multiple scales. (Each pixel in a difference-of-Gaussians frame is compared to its eight neighbors at the same scale, and to the corresponding pixels in each of the neighboring scales, e.g., nine pixels in each of the two adjacent scales. If the pixel value is a maximum or minimum among all these pixels, it is selected as a candidate keypoint.)
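- A brute-force sketch of this difference-of-Gaussians candidate search, written in Python with NumPy/SciPy and ignoring octaves and the other refinements of a full SIFT implementation, is:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog_keypoint_candidates(img, sigmas=(1.0, 1.6, 2.56, 4.1)):
        # Build a small Gaussian scale space and the difference-of-Gaussians stack.
        blurred = [gaussian_filter(img.astype(np.float32), s) for s in sigmas]
        dogs = [b2 - b1 for b1, b2 in zip(blurred[:-1], blurred[1:])]
        candidates = []
        # Compare each interior pixel of each interior DoG level to its 26
        # neighbors (8 at the same scale, 9 above, 9 below).
        for k in range(1, len(dogs) - 1):
            for y in range(1, img.shape[0] - 1):
                for x in range(1, img.shape[1] - 1):
                    patch = np.stack([d[y-1:y+2, x-1:x+2] for d in dogs[k-1:k+2]])
                    v = dogs[k][y, x]
                    if v == patch.max() or v == patch.min():
                        candidates.append((x, y, k))
        return candidates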
- (It will be recognized that the just-described procedure is a blob-detection method that detects space-scale extrema of a scale-localized Laplacian transform of the image. The difference of Gaussians approach is an approximation of such Laplacian operation, expressed in a pyramid setting.)
- The above procedure typically identifies many keypoints that are unsuitable, e.g., due to having low contrast (thus being susceptible to noise), or due to having poorly determined locations along an edge (the Difference of Gaussians function has a strong response along edges, yielding many candidate keypoints, but many of these are not robust to noise). These unreliable keypoints are screened out by performing a detailed fit on the candidate keypoints to nearby data for accurate location, scale, and ratio of principal curvatures. This rejects keypoints that have low contrast, or are poorly located along an edge.
- More particularly this process starts by—for each candidate keypoint—interpolating nearby data to more accurately determine keypoint location. This is often done by a Taylor expansion with the keypoint as the origin, to determine a refined estimate of maxima/minima location.
- The value of the second-order Taylor expansion can also be used to identify low contrast keypoints. If the contrast is less than a threshold (e.g., 0.03), the keypoint is discarded.
- To eliminate keypoints having strong edge responses but that are poorly localized, a variant of a corner detection procedure is applied. Briefly, this involves computing the principal curvature across the edge, and comparing to the principal curvature along the edge. This is done by solving for eigenvalues of a second order Hessian matrix.
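- The corresponding edge-response test can be sketched as follows; the finite-difference Hessian and the threshold r=10 follow Lowe's published description, while the function name and indexing are illustrative:

    import numpy as np

    def passes_edge_test(dog, y, x, r=10.0):
        # 2x2 Hessian of the DoG image at the candidate keypoint (finite differences).
        dxx = dog[y, x+1] - 2*dog[y, x] + dog[y, x-1]
        dyy = dog[y+1, x] - 2*dog[y, x] + dog[y-1, x]
        dxy = (dog[y+1, x+1] - dog[y+1, x-1] - dog[y-1, x+1] + dog[y-1, x-1]) / 4.0
        tr, det = dxx + dyy, dxx*dyy - dxy*dxy
        if det <= 0:
            return False      # principal curvatures differ in sign: reject
        # Bound the ratio of principal curvatures by comparing tr^2/det with (r+1)^2/r.
        return (tr * tr) / det < ((r + 1.0) ** 2) / r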
- Once unsuitable keypoints are discarded, those that remain are assessed for orientation, by a local image gradient function. Magnitude and direction of the gradient are calculated for every pixel in a neighboring region around a keypoint in the Gaussian blurred image (at that keypoint's scale). An orientation histogram with 36 bins is then compiled—with each bin encompassing ten degrees of orientation. Each pixel in the neighborhood contributes to the histogram, with the contribution weighted by its gradient's magnitude and by a Gaussian with σ 1.5 times the scale of the keypoint. The peaks in this histogram define the keypoint's dominant orientation. This orientation data allows SIFT to achieve rotation robustness, since the keypoint descriptor can be represented relative to this orientation.
- From the foregoing, plural keypoints at different scales are identified—each with corresponding orientations. This data is invariant to image translation, scale and rotation. 128 element descriptors are then generated for each keypoint, allowing robustness to illumination and 3D viewpoint.
- This operation is similar to the orientation assessment procedure just-reviewed. The keypoint descriptor is computed as a set of orientation histograms on (4×4) pixel neighborhoods. The orientation histograms are relative to the keypoint orientation and the orientation data comes from the Gaussian image closest in scale to the keypoint's scale. As before, the contribution of each pixel is weighted by the gradient magnitude, and by a Gaussian with σ 1.5 times the scale of the keypoint. Histograms contain 8 bins each, and each descriptor contains a 4×4 array of 16 histograms around the keypoint. This leads to a SIFT feature vector with (4×4×8=128 elements). This vector is normalized to enhance invariance to changes in illumination.
- The foregoing procedure is applied to training images to compile a reference database. An unknown image is then processed as above to generate keypoint data, and the closest-matching image in the database is identified by a Euclidean distance-like measure. (A "best-bin-first" algorithm is typically used instead of a pure Euclidean distance calculation, to achieve several orders of magnitude speed improvement.) To avoid false positives, a "no match" output is produced if the distance score for the best match is close (e.g., within 25%) to the distance score for the next-best match.
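- For illustration, a brute-force matcher with this next-best-distance check (rather than the faster best-bin-first search) might be sketched as follows; the 0.75 ratio is an illustrative assumption:

    import numpy as np

    def match_with_ratio_test(query_desc, ref_desc, ratio=0.75):
        # Nearest-neighbor search over reference descriptors; a match is kept
        # only if its distance is clearly smaller than the second-best distance.
        matches = []
        for i, q in enumerate(query_desc):
            d = np.linalg.norm(ref_desc - q, axis=1)
            best, second = np.argsort(d)[:2]
            if d[best] < ratio * d[second]:
                matches.append((i, int(best)))
        return matches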
- To further improve performance, an image may be matched by clustering. This identifies features that belong to the same reference image—allowing unclustered results to be discarded as spurious. A Hough transform can be used—identifying clusters of features that vote for the same object pose.
- An article detailing a particular hardware embodiment for performing the SIFT procedure, suitable for implementation in a next generation cell phone, is Bonato et al, “Parallel Hardware Architecture for Scale and Rotation Invariant Feature Detection,” IEEE Trans on Circuits and Systems for Video Tech, Vol. 18, No. 12, 2008.
- An alternative hardware architecture for executing SIFT techniques is detailed in Se et al, “Vision Based Modeling and Localization for Planetary Exploration Rovers,” Proc. of Int. Astronautical Congress (IAC), October, 2004.
- While SIFT is a well-known technique for generating robust local descriptors, there are others. These include GLOH (cf. Mikolajczyk et al, "Performance Evaluation of Local Descriptors," IEEE Trans. Pattern Anal. Mach. Intell., Vol. 27, No. 10, pp. 1615-1630, 2005) and SURF (cf. Bay et al, "SURF: Speeded Up Robust Features," Eur. Conf. on Computer Vision (1), pp. 404-417, 2006; Chen et al, "Efficient Extraction of Robust Image Features on Mobile Devices," Proc. of the 6th IEEE and ACM Int. Symp. on Mixed and Augmented Reality, 2007; and Takacs et al, "Outdoors Augmented Reality on Mobile Phone Using Loxel-Based Visual Feature Organization," ACM Int. Conf. on Multimedia Information Retrieval, October 2008).
- ORB refers to Oriented FAST and Rotated BRIEF, a fast, robust local feature detector. For more information, see Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski, "ORB: an efficient alternative to SIFT or SURF," Computer Vision (ICCV), 2011 IEEE International Conference on, IEEE, 2011.
- Still other fingerprinting techniques are detailed in patent publications 20090282025, 20060104598, WO2012004626 and WO2012156774 (all by LTU Technologies of France).
- Yet other fingerprinting techniques are variously known as Bag of Features, or Bag of Words, methods. Such methods extract local features from patches of an image (e.g., SIFT points), and automatically cluster the features into N groups (e.g., 168 groups), each corresponding to a prototypical local feature. A vector of occurrence counts of each of the groups (i.e., a histogram) is then determined, and serves as a reference signature for the image. To determine if a query image matches the reference image, local features are again extracted from patches of the image, and assigned to one of the earlier-defined N groups (e.g., based on a distance measure from the corresponding prototypical local features). A vector of occurrence counts is again made, and checked for correlation with the reference signature. Further information is detailed, e.g., in Nowak, et al, Sampling strategies for bag-of-features image classification, Computer Vision-ECCV 2006, Springer Berlin Heidelberg, pp. 490-503; and Fei-Fei et al, A Bayesian Hierarchical Model for Learning Natural Scene Categories, IEEE Conference on Computer Vision and Pattern Recognition, 2005; and references cited in such papers.
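- A minimal Python sketch of such a bag-of-features signature and a simple correlation-based comparison follows; the clustering of prototypes (visual words) is assumed to have been done beforehand, and all names are illustrative:

    import numpy as np

    def bof_signature(local_features, prototypes):
        # Assign each local feature to its nearest prototype ("visual word")
        # and build a normalized histogram of word occurrences.
        d = np.linalg.norm(local_features[:, None, :] - prototypes[None, :, :], axis=2)
        words = d.argmin(axis=1)
        hist = np.bincount(words, minlength=len(prototypes)).astype(np.float64)
        return hist / max(hist.sum(), 1.0)

    def bof_similarity(sig_a, sig_b):
        # Correlation between two signatures as a simple match score.
        return float(np.corrcoef(sig_a, sig_b)[0, 1])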
- In our related work, we describe methods for 3D object recognition based on capture of 2D images. See assignee's US Application Publication 2015-0016712, METHODS FOR OBJECT RECOGNITION AND RELATED ARRANGEMENTS, which is hereby incorporated by reference.
- As alternatives, several other object recognition schemes are documented in published papers, and are incorporated by reference herein. The object recognition techniques in the following can be adapted for identifying products in a store's inventory:
- Fei-Fei et al, A Bayesian Hierarchical Model for Learning Natural Scene Categories, IEEE Conference on Computer Vision and Pattern Recognition, 2005;
- Ohbuchi, et al, Distance Metric Learning and Feature Combination for Shape-Based 3D Model Retrieval, Poster Presentation, Proc. of the ACM workshop on 3D Object Retrieval, 2010.
- Lian, et al, Visual similarity based 3D shape retrieval using bag-of-features, IEEE Shape Modeling International Conference 2010; and
- Ohbuchi, et al, Accelerating bag-of-features SIFT algorithm for 3d model retrieval, Proc. SAMT 2008 Workshop on Semantic 3D Media; which are all hereby incorporated by reference.
Detection Trigger with Digital Watermarking and Other Symbologies
- There are times when an encoded object (e.g., a retail package, label or product hang tag) needs to be interpreted in different ways, e.g., depending on the symbologies detected, context and/or user intent. Consider
FIG. 15, where an object (e.g., representing one face of a retail package) includes artwork, text, and various machine-readable symbologies. In the illustrated example, the artwork includes castles, a sundial, shields, a knight and horse, scenery, etc. The text includes "VALIANT", "For the courage to get deep down clean", "ICON Label", etc. The object also includes a 1D barcode and a 2D barcode. Of course, the object may include a subset of these items, and/or include additional or different printed features and graphics. Thus, the artwork depicted in FIG. 15 is for illustrative purposes and should not limit the following discussion. The illustrated grid-like pattern (creating grid cells) virtually represents different encoding areas. That is, a grid would not typically be printed on a retail package, but is shown in FIG. 15 to help the reader visualize examples of multiple encoding areas. Moreover, encoding regions need not be rectangular in shape.
- Machine-readable data may be redundantly encoded within two-dimensional spatial areas (e.g., within some or all of the grid cells) across an image to create an enhanced or transformed image with an auxiliary data signal. The encoding can be applied to an object during printing or labeling with commercial presses, or directly by applying encoding after artwork, text and barcodes have been laid down, with ink jet, laser marking, embossing, photographic, or other marking technology. Redundant marking is particularly useful for automatic identification of objects, as it can be merged with other imagery (instead of occupying dedicated spatial area like conventional codes) and enables reliable and efficient optical reading of the machine-readable data from various different views of the object. In one embodiment, the encoding comprises digital watermarking (or a "digital watermark"). Digital watermarking, as used in this patent document, refers to an encoded signal that carries a machine-readable (or decodable) code. In some embodiments digital watermarking is designed to be less visually perceptible to a human viewer relative to an overt symbology such as a visible 1D or 2D barcode or QR code. The following patent documents describe many suitable examples of digital watermarking, e.g., U.S. Pat. Nos. 6,102,403, 6,614,914, 9,117,268, 9,245,308 and 9,380,186, and US Publication Nos. 20160217547 and 20160275639, which are each hereby incorporated by reference in its entirety. The artisan will be familiar with others.
- Returning to
FIG. 15, the retail package includes an icon 550. An icon may include, e.g., a logo, shape, graphic design, symbol, etc. Icon 550 typically does not include a machine-readable signal encoded therein. The icon 550 may include associated text and/or be differently shaped than illustrated. That is, it need not be a hexagon, nor need it be internally grey-stippled. Icon 550 may be used as an indicator of information associated with the retail package, its contents, or both. For example, icon 550 may be shaped and colored like a peanut to indicate a potential allergy or associated allergy information. In other cases icon 550 may be used as an age restriction indicator. For example, the icon may be a particularly stylized "R", perhaps placed within a colored shape (e.g., a box), which can be used to indicate suitability (or not) for children. In other cases, icon 550 includes a so-called SmartLabel label. SmartLabel was a collaborative effort to standardize a digital label format which consumers can use to access product information using their smartphones. The SmartLabel is typically associated with a visible QR code. The QR code (but not the icon) is read by a smartphone to access product information, e.g., nutrition, ingredients and allergens, in a consistent format. The SmartLabel label itself is used more as a visual cue to a shopper or consumer that related product information exists online. But real estate on a product package is often limited. Branding information, graphics, nutrition information, 1D barcodes, QR codes, etc. can take up a lot of space. E.g., consider a yogurt cup, which has very limited space on the container surface. And even if a package is not tight on space, a QR code or other visible symbology can be an eyesore. - Use of an icon with machine-readable symbologies is discussed with reference to
FIG. 16A. Image data 500 is captured by a camera or other image sensor. For example, a smartphone camera captures image data representing some or all of a product package (e.g., the package face shown in FIG. 15). One example of a suitable smartphone is discussed below relative to FIG. 19. A smartphone may represent captured image data in various ways. For example, a smartphone camera may output captured image data in RGB, RGBA or Yuv format. Thus, image data 500 can be variously represented. In our preferred embodiment, we use greyscale data for image data 500, e.g., the Y value from the Yuv, or converted luminance data from RGB data (e.g., Luma=0.2126*R+0.7152*G+0.0722*B). In some embodiments, image data 500 represents a cropped version of an image frame. For example, if image data includes 911×512 pixels, the center 400×400 pixels can be used. One purpose of cropping is to focus in on the center of the frame, which is likely the target of a captured image. In some other embodiments, image data 500 represents a filtered or processed version of captured image data. -
Image data 500 is processed by a Signal Decoder 502, which may include, e.g., a barcode decoder and/or an encoded signal decoder. One example of an encoded signal decoder is a digital watermark decoder. Image data 500 may represent a frame of imagery, portions of a frame, or streaming imagery, e.g., multiple frames. Signal Decoder 502 analyzes the image data 500 in search of an encoded signal, e.g., one which carries a code, message or payload. For example, if the image data 500 includes digital watermarking encoded therein, the Signal Decoder 502 attempts to decode 504 the digital watermarking to obtain the code, message or payload. In one example, the code, message or payload includes a GTIN number, or other product identifier such as a UPC number. If no signal is successfully decoded, Signal Decoder 502 preferably moves on to analyze other image data, e.g., another image frame(s) or another image portion. In some cases, the Signal Decoder 502 may output (or set a flag representing) a message, e.g., "No Detect" or "no signal found", or the like. - If an encoded signal is successfully decoded, flow moves to an
Icon Detector 506. Icon Detector 506 operates to detect 508 an icon, e.g., icon 550 (FIG. 15). We sometimes use the phrase "target icon" to mean a particular icon that is to be detected, or a reference icon from which templates are determined. If an icon is not detected (but the encoded signal was), a first response is presented (e.g., "Response 1" in FIG. 16A). If an icon is detected (along with the encoded signal), a second response is presented (e.g., "Response 2"). Icon Detector 506 may be configured to search the same image data 500 for the icon. That is, icon 550 must be present in the same image frame (or image portion or streaming frames) as the encoded signal was found in or searched across to yield a successful "Response 2". In other cases, Icon Detector 506 is configured to detect icon 550 within a predetermined number of image frames (e.g., 2-5 frames) relative to the encoded signal decode, or within a certain time frame (e.g., within 1 second or less). In still further cases, if an encoded signal is detected then only the icon detector runs for the next, e.g., n number of frames (e.g., 2-6 frames). In still other implementations, Signal Decoder 502 and Icon Detector 506 switch order of operations. That is, a target icon is searched for first and, only upon a successful icon detection, is an encoded signal then searched for. This alternative process is shown with respect to FIG. 16B. If an icon is detected (but the encoded signal was not), a first response is presented (e.g., "Response 1" in FIG. 16A). If an icon is detected (along with the encoded signal), a second response is presented (e.g., "Response 2"). Besides the order of operation, the technology shown in FIGS. 16A and 16B is the same.
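- A minimal sketch of the FIG. 16A flow follows. The decoder and detector are passed in as callables, and the returned labels mirror "Response 1"/"Response 2"; the function and variable names are hypothetical, and the actual decoders are as described elsewhere in this document.

    def process_frame(image_data, decode_signal, detect_icon):
        # decode_signal(image_data) -> payload or None (e.g., a GTIN)
        # detect_icon(image_data)   -> True if the target icon is found
        payload = decode_signal(image_data)
        if payload is None:
            return None                      # "no signal found"; move to next frame
        if detect_icon(image_data):
            return ("Response 2", payload)   # signal + icon detected
        return ("Response 1", payload)       # signal only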
- In another FIG. 16B embodiment, once an icon is detected, a localized encoded signal search is carried out. For example, and with reference to FIGS. 25A-D, an encoded signal is placed in or around a localized spatial area relative to an icon. In a first case, FIG. 25A, the encoded signal surrounds an icon, e.g., icon 550. The encoded signal can be provided in an N×M rectangular area, where N and M are in measurement units such as inches, dots per inch, centimeters, etc. The encoded signal can be redundantly provided in this N×M area, e.g., in a tiled manner. In some cases the icon will not include any encoding within its area, whereas in other cases the encoded signal will be provided within or on the icon. In one example, N corresponds to 1/300 inch to 4 inches, and M corresponds to 1/300 inch to 4 inches. Once an icon is detected, a signal decoder can initiate decoding of an area engulfing, surrounding or neighboring the detected icon. For example, the signal decoder can analyze image data within the N×M area. Of course, the encoded area is not limited to a rectangle. For example, a signal can be encoded within any number of areas including, e.g., the cloud shown in FIG. 25B. An image mask or layer can be used to confine the encoding to an area engulfing, surrounding or neighboring an icon. Preferably, the icon is surrounded or neighbored by an area extending 1/300 inch to 4 inches on all sides. - With reference to
FIGS. 25C and 25D, some icons may be designed so that they, themselves, can host encoded signals. The dashed lines in FIG. 25C represent a signal encoded within an icon, e.g., icon 580. For example, the encoding may be a relatively sparse signaling technology such as discussed in our US Published Patent Application Nos. US 2016-0275639 A1 and US 2017-0024840 A1, which are each hereby incorporated herein by reference in its entirety. Or, depending on the colors (if any) included within an icon, the color encoding technologies described in our U.S. Pat. Nos. 9,380,186 and 9,117,268, US Published Patent Application No. US 2016-0198064 A1, U.S. patent application Ser. No. 15/418,364, filed Jan. 27, 2017, and Ser. No. 15/261,005, filed Sep. 9, 2016, can be employed. The U.S. Pat. Nos. 9,380,186, 9,117,268, US 2016-0198064 A1, Ser. Nos. 15/418,364 and 15/261,005 patent documents are each hereby incorporated herein by reference in its entirety. Still other encoding techniques may be used to encode an icon itself. For example, a line contour change, line width modulation (LWM), Line Continuity Modulation (LCM), Line Angle Modulation (LAM), Line Frequency Modulation (LFM), Line Thickness Modulation (LTM), or a combination of these technologies can be used, e.g., as described in assignee's US Patent Application No. US 2016-0189326 A1, which is hereby incorporated herein by reference in its entirety. Returning to FIG. 25D, a LWM or LTM technique is shown by reference no. 602, with a line contour change shown by reference no. 604. - Once an icon is detected (in a
FIG. 16B implementation), image data surrounding, corresponding to, neighboring or engulfing the icon can be analyzed to decode an encoded signal. In some cases, a window (or other defined area of imagery) around (and/or including) the detected icon is searched. The window can be expanded if an initial analysis does not decode an encoded signal. For example, the window may initially extend 1/300 inch to 2 inches around the icon. If a signal is not decoded, the area can be expanded, e.g., to 2-4 inches. - We envision that the
FIGS. 16A and 16B process may operate on a smartphone, e.g., as depicted in FIG. 19. A smartphone may, at times, be concurrently (or serially) executing multiple different image and/or audio signal processing operations. For example, data from an image pipeline (e.g., providing image data collected by a camera) may be analyzed to detect 1D barcodes, 2D barcodes, encoded signals, and/or icons. The pipeline data may also be analyzed for optical character recognition and/or image recognition. Prioritizing these different operations and their corresponding output (e.g., decoded identifiers, detection indications and/or corresponding responses) can be tricky. One approach sets a predetermined time or frame count before providing a response (e.g., a UI indication of a successful read). For example, if a 1D barcode is detected at time 0 seconds, then a response will not be provided until x seconds (or milliseconds) from time 0 seconds. Image signal processing analysis continues during this time frame to determine whether any other codes, icons, characters or image features can be decoded, detected or recognized. If more than one (1) code is detected or decoded, then a prioritization can be consulted. For example, it might be determined that an icon takes precedence over all other codes or symbols, so only information associated with a successful icon detection is presented. Or, maybe a QR 2-D barcode is ranked highest, so only a response associated with the QR code is provided. Or, still further, a prioritization may indicate which response to display first, second, third and so on. Further scheduling and prioritization methods and apparatus, which can be advantageously used in the present context, are described in assignee's US Published Patent Application Nos. 20110212717, 20110161076 and 20120284012, which are each hereby incorporated herein by reference in its entirety. Regarding the 20120284012 application, see, e.g., the section headings entitled "Evidence-Based State Machines, and Blackboard-Based Systems" and "More on Middleware, Etc.". - Returning more particularly to icon detection, and in another embodiment relative to the package example in
FIG. 15, a retail package includes an encoded signal redundantly provided over its surface. For example, the package may include redundant instances of digital watermarking carrying a GTIN number in each of the grid cells (or a subset of the grid cells). Preferably, the encoding (e.g., digital watermarking) is included on all sides of the package. The package also includes an icon 550, which indicates the presence of additional information associated with the package or package contents, e.g., online information. Icon 550 may even be located near a nutrition text box printed on the package (text box not shown in FIG. 15). A smartphone camera captures image data representing a portion of the package which includes both i) the encoded signal, and ii) icon 550. The image data is provided to the process detailed in FIG. 16A. In this scenario, the encoded signal is decoded along with icon 550 being detected, triggering a certain response (e.g., "Response 2" in FIG. 16A). The certain response can cause the smartphone to provide, e.g., access to the additional information. For example, the networks, data stores and cloud-based routing described in assignee's U.S. Pat. No. 8,990,638, which is hereby incorporated herein by reference in its entirety, can be used to provide access to the additional information. (In one implementation, a remote database includes a response table or database. The table or database may include multiple responses per encoded signal identifier. If the identifier is received without an icon detection indication, then a Response 1 is provided. But, if the identifier is received with an icon detection indication, then a Response 2 is provided.) In some cases, the certain response is limited to access to the additional information. And, even though the encoded signal may carry a certain payload like a GTIN, such information preferably is not provided for user or application access. In this first scenario, it is assumed that there is an interest in the additional information since the icon 550, which indicates the ability to access additional information, was detected. Therefore, the response (e.g., "Response 2") is limited to providing access to the additional information, and not, e.g., the GTIN itself. - In a second embodiment, relative to the package example in
FIG. 15, a retail package includes an encoded signal redundantly provided over its surface. For example, the package may include redundant instances of digital watermarking carrying a GTIN number in each of the grid cells (or a subset of the grid cells). Preferably, the encoding (e.g., digital watermarking) is included on all sides of the package. The package also includes an icon 550, which indicates the presence of additional information associated with the package or package contents, e.g., online information. Icon 550 may even be located near a nutrition text box printed on the package (not shown in FIG. 15). A smartphone camera captures image data representing a portion of the package which includes i) the encoded signal, but not ii) icon 550. The image data is provided to the process detailed in FIG. 16A. In this scenario, the encoded signal is decoded but icon 550 is not detected, triggering a certain response (e.g., "Response 1" in FIG. 16A). Since icon 550 is not detected, it can be assumed that there is not a current interest in the additional information. Therefore, the response may include providing access to the GTIN information, or product information associated with the GTIN. - The algorithms, processes, image capture and functionality shown in
FIGS. 16A and 16B can be carried out on a portable or mobile device, e.g., a smartphone, tablet, smart glasses, or laptop, e.g., as discussed below with respect to FIG. 19. Signal Decoder 502 can include, e.g., a digital watermark decoder such as disclosed in U.S. Pat. Nos. 6,102,403, 6,614,914, 9,117,268, 9,245,308 and/or 9,380,186, and US Publication Nos. 20160217547 and/or 20160275639, which are each hereby incorporated by reference in its entirety. Other decoders suitable for inclusion in Signal Decoder 502 may include, e.g., a 1D or 2D barcode decoder. One example of a suitable 1D and 2D barcode detector is ZXing ("Zebra Crossing"), which is an open-source, multi-format 1D/2D barcode image processing library implemented in Java, with ports to other languages, currently found at https://github.com/zxing/zxing. - Various implementations of
Icon Detector 506 are discussed further with reference to FIGS. 17A-17C. - In
FIG. 17A, Image Data 500 is provided so that potential icon candidates can be identified 520. For example, 520 may identify many different image areas with characteristics that may be associated with icon 550. Identified candidates are passed on for processing 530 to determine whether they represent an icon, e.g., icon 550 in FIG. 15. - Let's look under the hood with reference to
FIG. 17B and FIG. 17C. -
Image data 500 can be filtered 520 for smoothing or to remove noise. For example, a bilateral filter can be employed to remove noise from the image data 500. A bilateral filter may be viewed, e.g., as a weighted average of pixels, which takes into account the variation of pixel intensities to preserve edges. See, e.g., Paris, et al., "A gentle introduction to bilateral filtering and its applications," Proceedings of SIGGRAPH '08, ACM SIGGRAPH, article no. 1, 2008-08-11, which is hereby incorporated herein by reference. Edge detection 521 can be performed on the filtered image data. For example, the Canny edge detector can be used. See, e.g., J. Canny (1986), "A computational approach to edge detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, pages 679-714, which is hereby incorporated herein by reference. The Canny-Deriche detector is another filter that could be used. See, e.g., R. Deriche (1987), Using Canny's criteria to derive an optimal edge detector recursively implemented, Int. J. Computer Vision, vol. 1, pages 167-187, which is hereby incorporated herein by reference. Or the Log Gabor filter could be used instead of, or in combination with, the above mentioned filters. See, e.g., Sylvain Fischer, Rafael Redondo, Laurent Perrinet, Gabriel Cristobal, "Sparse approximation of images inspired from the functional architecture of the primary visual areas," EURASIP Journal on Advances in Signal Processing, special issue on Image Perception, 2007. Yet another edge detector is the Sobel edge detector, e.g., which is discussed in Gao et al., "An improved Sobel edge detection," Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on, Vol. 5, IEEE, 2010, which is hereby incorporated herein by reference in its entirety. - For all (or a subset of)
contours 522 identified by the edge detector 521, it can be determined whether various criteria are met. These criteria can be determined based on the physical properties of icon 550. For example, consider an icon that is somewhat hexagonal in shape. The criteria for such an icon may include whether a contour is, e.g., a "closed contour" 523, has a pixel size or area within predetermined limits 524 (e.g., to weed out too-large and too-small areas), is convex 525, and has the correct number of sides (e.g., at least 6 if looking for a hexagonal shaped icon, or at least n sides if looking for an n-sided polygon) 526. All contours (or a subset of those meeting predetermined criteria, e.g., exactly 6 sides, within a certain size, etc.) meeting these criteria (523, 524, 525 and/or 526) can be passed to a second stage for further analysis or identified as candidate contours 528. Otherwise, contours not meeting these criteria can be discarded 527. Of course, not all of the criteria need to be met. For example, candidate contours can be identified based on successfully meeting 3 out of the 4 criteria.
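- The following is a minimal sketch of this first-stage screening using OpenCV (4.x), assuming a greyscale input and a roughly hexagonal target icon. The thresholds, and the inclusion of the bounding-box aspect-ratio check described later with reference to FIG. 20B, are illustrative assumptions rather than prescribed values.

    import cv2

    def candidate_contours(gray, min_area=100, max_area=20000, n_sides=6,
                           ar_lo=0.4, ar_hi=2.5):
        smoothed = cv2.bilateralFilter(gray, 9, 75, 75)        # 520: smoothing
        edges = cv2.Canny(smoothed, 50, 150)                    # 521: edge detection
        contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        candidates = []
        for c in contours:                                      # 522
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            area = cv2.contourArea(approx)
            if not (min_area <= area <= max_area):              # 524: size limits
                continue
            if not cv2.isContourConvex(approx):                 # 525: convexity
                continue
            if len(approx) < n_sides:                           # 526: side count
                continue
            w, h = cv2.minAreaRect(approx)[1]                   # aspect-ratio screen
            if h == 0 or not (ar_lo <= w / h <= ar_hi):
                continue
            candidates.append(approx)                           # 528
        return candidates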
- Determined candidate contour(s) can be analyzed in a second stage (FIG. 17C) to determine whether they correspond to icon 550. For example, we can use a template based approach to determine whether a candidate contour (e.g., including image data enclosed within the candidate contour) matches a template based on icon 550. An area associated with the candidate contour can be assessed. For example, a minimum bounding box can be drawn around the candidate contour. For example, the techniques described in O'Rourke, Joseph (1985), "Finding minimal enclosing boxes," International Journal of Computer and Information Sciences, 14 (3): 183-199, which is hereby incorporated herein by reference, can be used. Additionally, a minimum bounding box can be generated in software, e.g., using various scripts in MatLab from MathWorks (e.g., minBoundingBox(X), which computes the minimum bounding box of a set of 2D points, and where the input includes [x,y] coordinates corresponding to points on a candidate contour). An example open source MatLab bounding box script for minBoundingBox(X) is shown in FIGS. 18A and 18B. - The minimum bounding box helps facilitate
re-orientation 532 of the candidate contour to resolve image rotation and scale. For example, the bounding box (and its image contents) can be rotated such that one of its edges is horizontal to an image plane. And the image data within the candidate contour can be resized, e.g., according to the sizing of previously stored templates. - The candidate contour (e.g., including image content represented within the contour) may be binarized 533, e.g., if later stage matching templates are provided in binary form. Next is
template correlation 534. Here, a correlation is determined between the processed candidate contour and the matching template(s). Since we propose using a minimum bounding box, and since at least one edge of that box is preferably reoriented to a horizontal line, we suggest using four (4) templates per candidate contour (one representing 0° rotation, one representing 90° rotation, one representing 180° rotation, and one representing 270° rotation). Using four (4) templates is useful since the potential icon could be variously oriented within the minimum bounding box. One of the four different rotation angles should be a good approximation, e.g., due to bounding box re-orientation 532. Of course, additional templates at additional angles can be used, but at an efficiency cost. The templates are based on a target icon (e.g., icon 550) and can be binarized to cut back on processing time. In one correlation example, the template and the candidate contour are compared on a pixel-by-pixel basis. A multiplication (or AND) operation can be carried out for each template pixel and its corresponding candidate pixel. For example, if the template pixel value is a binary 1 but the candidate contour pixel value is a 0, then the resulting operation yields a 0. But, if the template pixel value is a binary 1 and the candidate contour pixel value is a 1, then the resulting operation yields a 1. The values of the pixel operations can be summed, yielding a result. A higher value can be used to indicate a close match. The results can be normalized 535 to aid in determining a match 538. In another embodiment, we use a cross-correlation or convolution operation to identify a match with a target icon. In still another embodiment we use a correlation coefficient, e.g., Pearson's correlation coefficient (r). For monochrome images, image 1 and image 2, the Pearson correlation coefficient is defined as:
- r = Σ(xi − xm)(yi − ym) / sqrt( Σ(xi − xm)² · Σ(yi − ym)² ), where the sums run over all pixels i,
image 1, yi is the intensity of the ith pixel inimage 2, xm is the mean intensity ofimage 1, and ym is the mean intensity ofimage 2. The correlation coefficient has the value r=1 if the two images are identical, r=0 if they are uncorrelated, and r=−1 if they are anti-correlated, for example, if one image is the negative of the other. See, e.g., J. L. Rodgers, J. L. and W. A. Nicewander, “Thirteen Ways to Look at the Correlation Coefficient”, American Statistician 42, 59-66 (1995), which is hereby incorporated herein by reference in its entirety. Here again correlation results can be optionally normalized 535 to determine whether the candidate contour matches 538 the icon. - Further embodiments for icon detection are discussed below with reference to
- Further embodiments for icon detection are discussed below with reference to FIGS. 20A-20E.
- Candidate contour selection proceeds with reference to FIG. 20A. Image data 500 is obtained from a portable device, e.g., a smartphone, such as discussed below in FIG. 19. We prefer to use greyscale imagery, as discussed above with reference to FIGS. 17A-17C. But, as mentioned above, other representations of the image data could alternatively be used. The image data 500 is filtered 520, e.g., using a bilateral filter. Such a filter preferably preserves edges while smoothing (or removing noise from) the image data 500. Edge detection is carried out at 521, e.g., using a Canny edge detector or other edge detector as discussed above. The output of the edge detector is preferably a binary image 540 representing the edges in image data 500. Contours within the binary edge image are identified in 542. For example, so-called blob detection (alternatively called "connected component labeling") can be used. See, e.g., Dillencourt et al., "A general approach to connected-component labeling for arbitrary image representations," Journal of the ACM, 39 (2): 253 (1992), which is hereby incorporated herein by reference. A "connected component labeling" process, e.g., may initially label pixels (e.g., assign a value to each pixel). For example, all pixels that are connected to each other can be given the same value or linked together (e.g., a linked list of pixels). Pixels can be clustered based on their connectivity to other pixels (or based on assigned values). Such clusters can be used as (or as a proxy for) contours. Once contours are identified, they can be refined 544 to determine whether they are suitable candidates for further analysis.
- FIG. 20B explores an embodiment of the contour refinement 544.
- Using the binary edge image 540, one or more of the contours are approximated with certain precision 545. (This 545 process can be substituted for process 542 in FIG. 20A.) Given a contour, the number of points representing the contour is reduced. In one example, the number of points is reduced such that straight lines between the points yield a suitable approximation of the contour. Suitable in this example means that the fit error (or distance error) between a contour segment and its representative straight line falls within a predetermined threshold. In another example, a predetermined number of points is used to represent the contour. It is then determined whether the contour is a closed contour 546. If not, the process stops for that particular contour, and a next contour, if available, is analyzed. Of course, this feature 546 can be integrated into the feature 545.
- If the contour is closed, it is further evaluated in 547. There, it is determined whether the closed contour has: i) at least n sides, where n is an integer, ii) an area above a minimum threshold area, and iii) a convex shape. (Instead of having each of these three criteria result in a single decision, they can be broken into 2 or 3 individual decisions.) If all of these criteria are met, flow continues to 548. If not, that particular closed contour is discarded.
FIG. 17B ,item 531. The minimum bounding box can then be evaluated 549, e.g., to determine whether its aspect ratio is within a certain range. For example, since a square has equal sides, its aspect ratio is 1. A 4:3 rectangle, on the other hand, has an aspect ratio of 1.33 (4/3). A suitable aspect ratio range can be established, e.g., based on a particular icon for evaluation. By way of example, for a SmartLabel icon, we prefer an aspect ratio of 0.4-2.5. If the bounding box aspect ratio is not within a predetermined range, the closed contour is not a candidate. If it is within the predetermined range, the contour is identified as a potential candidate contour. - One embodiment of how to determine whether a candidate contour is a match with a particular icon is discussed with reference to
FIG. 20C . - A set of candidate contours is determined or obtained, e.g., by one or more of the processes discussed with reference to
FIG. 17A, 17B, 20A or 20B . The order of which to evaluate candidates within the set of candidates can be determined, e.g., based on a first in—first out process or first in—last out process. In another example, the aspect ratio determined inFIG. 20B ,item 549, can be used to rank candidate contours. For example, if a target icon has an aspect ratio near 1, candidate contours can be ranked according to their determined aspect ratios, with the closest aspect ratio to 1 being evaluated first, and the second closest being evaluated next, and then so on. In another example, the candidate contours are ranked according to their minimum bounding box area (or an area calculated for the closed contour), with the largest area first, and the smallest area last. - For a first candidate contour, an angle of rotation (see
FIG. 21) is found 560 for the minimum bounding box found in 548. A portion of image data 500 is extracted or obtained 561 that corresponds to the area bounded by the minimum bounding box. For example, the corresponding pixels that are within the area (e.g., the corresponding spatial locations) identified by the minimum bounding box are obtained for further evaluation. In our preferred approach, however, image data 500 after filtering by 520 is obtained or extracted corresponding to the area (e.g., the corresponding spatial locations) of the minimum bounding box. The extracted or obtained image data (or filtered image data) is then oriented 562 (e.g., rotated) according to the rotation angle found in 560. We refer to this rotated, extracted image data (or filtered image data) as a "block." This orientation process helps the icon matching be more rotation invariant relative to an un-rotated block. The block can then be resized 563 to match or approximate the size of the template(s). - The image content within the block is then binarized 564, e.g., using Otsu's thresholding. See Nobuyuki Otsu (1979), "A threshold selection method from gray-level histograms," IEEE Trans. Sys., Man., Cyber. 9 (1): 62-66, which is hereby incorporated herein by reference. Otsu's thresholding assumes that an image contains two classes of pixels following a bi-modal histogram (e.g., foreground pixels and background pixels); it then calculates an optimum threshold separating the two classes so that their combined spread (e.g., intra-class variance) is minimal, or equivalently (e.g., because the sum of pairwise squared distances is constant), so that their inter-class variance is maximal. Of course, feature 564 could be combined with the resizing process in 563.
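- A sketch of this block extraction, re-orientation, resizing and Otsu binarization using OpenCV follows; the template size, the function name, and the handling of cv2.minAreaRect's angle (whose convention varies across OpenCV versions) are illustrative assumptions:

    import cv2

    def extract_block(filtered, contour, template_size=(64, 64)):
        # 548/560: minimum-area bounding box and its rotation angle.
        (cx, cy), (w, h), angle = cv2.minAreaRect(contour)
        # 561/562: rotate the image so a box edge is horizontal, then crop the box.
        rot = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
        rotated = cv2.warpAffine(filtered, rot, (filtered.shape[1], filtered.shape[0]))
        block = cv2.getRectSubPix(rotated, (int(round(w)), int(round(h))), (cx, cy))
        # 563: resize to the template size; 564: Otsu binarization.
        block = cv2.resize(block, template_size)
        _, binarized = cv2.threshold(block, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return binarized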
- Objects within the binarized block can be evaluated in 565. For example, an area associated with each object can be determined. With reference to FIG. 22A, four objects are shown in the binarized block; one of these is an object that needs to be discarded, with the remaining objects shown in FIG. 22B. In particular, we currently prefer discarding objects with an area more than 17% of the block area. An alternative evaluation technique looks for an expected pattern. For example, virtual lines can be drawn (or pixels along a virtual line can be evaluated) through a block. A pattern or ratio of on and off pixels along the line(s) can be evaluated to determine whether it meets a threshold level, pattern or ratio. For example, the left and right dashed lines in FIG. 22C only cross through certain of the objects, and the resulting on/off patterns can be compared against those expected for the target icon. (This alternative evaluation can also be used earlier in the process, e.g., on the output of edge detection 521, to do a rough check whether an expected pattern or ratio associated with an icon is present in the image data 500.) - Template matching 566, e.g., including a normalized correlation, is carried out for the processed block. For example, the template correlation and normalizing processes discussed above with respect to 534 and 535 can be carried out. If a normalized correlation value is higher than a
predetermined threshold 567, the candidate contour is accepted as a match to the target icon. If not, the candidate contour is not a match. Additional candidate contours can be evaluated according to the FIG. 20C processes if no match is found. And, unless multiple icons are being searched for, the processes need not evaluate additional candidates once an icon match is found. - It should be noted that different resizing 563 can be tried per candidate contour, which would provide better scale invariance relative to a single resizing. For example,
image data 500 can be resized at different scales and then evaluated according to 564-567. - Another embodiment of how to determine whether a candidate contour is a match with a particular icon is discussed with reference to
FIG. 20D . - Image processing flow proceeds through operations 560-565 as discussed above with respect to
FIG. 20C . A subset of remaining objects to retain is determined at 568. For example, and with reference toFIGS. 23A-23C , a resized block (after 563) is shown inFIG. 23A . The block includesobjects Binarization 564 andEvaluation 565 may yield the remaining objects shown inFIG. 23B , including objects 585. Theseobjects 585, e.g., maybe binarization artifacts associated with corners or other object structures. It would be good to remove these objects prior to template correlation. In 568 a subset of remaining objects to retain is determined, e.g., by only keeping the largest sized n number of objects, where n is an integer. For example, and again with reference toFIG. 23B , if we are looking for a targeticon including objects FIG. 23C . (Items - Next, it can be determined whether m of the n number of remaining objects are convex 569, where m and n are each integers. In this context, convex implies that any tangent to a shape will result in the object's interior only being on one side of the tangent, e.g., as shown in
FIG. 24A . A concave shape, in contrast, would have a potential tangent resulting in portions of the shape falling on both sides of the tangent line, e.g., as shown inFIG. 24B . (It should be noted, however, that if an icon included a concave shape, we could alternatively determine whether m of the n number of remaining objects were concave.) In the illustrated example (FIG. 23C ), we may decide that m=2, or if a lower false positive is required, then m=3. If the number of remaining objects is equal to (or greater than) m, flow moves on totemplate correlation 566 and comparison withthreshold 567 as discussed above with reference toFIG. 20C . If not, it is determined that the candidate does not match the target icon. - Another embodiment of how to determine whether a candidate contour is a match with a particular icon is discussed with reference to
FIG. 20E , where shape matching utilizing so-called “image moments” is employed. - Image processing flow proceeds through operations 560-561 as discussed above with respect to
FIG. 20C . Omitted, however isoperations FIG. 20C . This is because an image moment shape matching operation typically extracts rotationally and scale invariant candidate features from an image portion. Flow moves on tooperations operations FIG. 20C , respectfully. Different reference numbers are used inFIG. 20C vs.FIG. 20E since the terms “image portion” are used inFIG. 20E instead using “block” as inFIG. 20C . But, the two terms can be used interchangeable, however, since they both represent image data from a certain spatial image area. Flow continues tooperation 592, where image moments of shapes from the binarized, evaluated image portion are compared to image moments of one or more shapes in a target icon. For example, and with reference toFIG. 23C , three (3) shapes 581, 582 and 583 are intended to be matched in a target icon. Image moments for each of these shapes can be determined and stored as references. Then, moments from an image portion can be determined and compared against the references. The comparisons can be normalized and then compared against a predetermined threshold. If the nominalized comparison exceeds the threshold (or is lower than if a perfect match is a zero (0)), then the icon matches the target icon. If not, no icon is detected. - Image moments are discussed, e.g., in Jan Flusser, Tomáš Suk and Barbara Zitová, “Moments and Moment Invariants in Pattern Recognition,” 2009 John Wiley & Sons, Ltd. ISBN: 978-0-470-69987-4, which is incorporated herein by reference. Early work in the field included Hu's seven moments, e.g., see Hu, M. K.: Visual Pattern Recognition by Moment Invariants. IRE Trans. Inform. Theory 1(8) (Feb. 1962) 179-187, hereby incorporated herein by reference in its entirety.
- Another check can be added to the processes discusses above with respect to
FIG. 17A -FIG. 17C andFIGS. 20A-20D . If one of the expected objects includes a circularly shaped object, e.g.,items - As an alternative arrangement, an icon (e.g., icon 550) may include a machine-readable code encoded therein or there around. Detection of the machine-readable code triggers a response associated with the icon. In this example, instead of detection of the icon+encoded signal, the detection of the machine-readable code, alone, triggers the response associated with the icon. As a further alternative, detection of the machine-readable code+an encoded signal triggers the response associate with the icon.
- In still another implementation, the encoded signal includes a plural-bit payload. The plural-bit payload has at least one bit (e.g., a “trigger bit”) that can be set to indicate the presence of information associated with an icon or with a package. The remaining portion of the payload may including, e.g., a GTIN or UPC number. A signal decoder, upon a successful decode of a payload including a trigger bit provides access to (or indicates to a software app to provide access to) information associated with the icon.
- The components and operations of the various described embodiments shown in
FIGS. 15-17C and 20A-20E can be implemented in modules. Notwithstanding any specific discussion of the embodiments set forth herein, the term “module” may refer to software, firmware and/or circuitry configured to perform any of the methods, processes, algorithms, functions or operations described herein. Software may be embodied as a software package, code, instructions, instruction sets or data recorded on non-transitory computer readable storage mediums. Software instructions for implementing the detailed functionality can be authored by artisans without undue experimentation from the descriptions provided herein, e.g., written in C, C++, MatLab, Visual Basic, Java, Python, Tcl, Perl, Scheme, Ruby, and assembled in executable binary files, etc., in conjunction with associated data. Firmware may be embodied as code, instructions or instruction sets or data that are hard-coded (e.g., nonvolatile) in memory devices. As used herein, the term “circuitry” may include, for example, singly or in any combination, hardwired circuitry, programmable circuitry such as one or more computer processors comprising one or more individual instruction processing cores, parallel processors, state machine circuitry, or firmware that stores instructions executed by programmable circuitry. - Applicant's work also includes taking the scientific principles and natural laws on which the present technology rests, and tying them down in particularly defined implementations. For example, the implementations discussed with reference to
FIGS. 15-17C andFIGS. 20A-20E . One such realization of such implementations is electronic circuitry that has been custom-designed and manufactured to perform some or all of the component acts, as an application specific integrated circuit (ASIC). - To realize such implementations, some or all of the technology is first implemented using a general purpose computer, using software such as MatLab (from MathWorks, Inc.). A tool such as HDLCoder (also available from MathWorks) is next employed to convert the MatLab model to VHDL (an IEEE standard, and doubtless the most common hardware design language). The VHDL output is then applied to a hardware synthesis program, such as Design Compiler by Synopsis, HDL Designer by Mentor Graphics, or Encounter RTL Compiler by Cadence Design Systems. The hardware synthesis program provides output data specifying a particular array of electronic logic gates that will realize the technology in hardware form, as a special-purpose machine dedicated to such purpose. This output data is then provided to a semiconductor fabrication contractor, which uses it to produce the customized silicon part. (Suitable contractors include TSMC, Global Foundries, and ON Semiconductors.) Another specific implementation of the present disclosure includes barcode and/or encoded signal detection operating on a specifically configured smartphone (e.g.,
iPhone 7 or Android device) or other mobile device, such phone or device. The smartphone or mobile device may be configured and controlled by software (e.g., an App or operating system) resident on the smartphone device. The resident software may include, e.g., a barcode decoder, digital watermark detector and detectability measure generator module. - For the sake of further illustration,
FIG. 19 is a diagram of a portable electronic device (e.g., a smartphone, mobile device, tablet, laptop, wearable or other electronic device) in which the components of the above processes (e.g., those inFIGS. 16-17C and 20A-20E ) may be implemented. The following reference numbers refer toFIG. 19 , and not any of the other drawings, unless expressly noted. - Referring to
FIG. 19, a system for an electronic device includes bus 100, to which many devices, modules, etc. (each of which may be generically referred to as a "component") are communicatively coupled. The bus 100 may combine the functionality of a direct memory access (DMA) bus and a programmed input/output (PIO) bus. In other words, the bus 100 may facilitate both DMA transfers and direct CPU read and write instructions. In one embodiment, the bus 100 is one of the Advanced Microcontroller Bus Architecture (AMBA) compliant data buses. Although FIG. 19 illustrates an embodiment in which all components are communicatively coupled to the bus 100, it will be appreciated that one or more sub-sets of the components may be communicatively coupled to a separate bus in any suitable or beneficial manner, and that any component may be communicatively coupled to two or more buses in any suitable or beneficial manner. Although not illustrated, the electronic device can optionally include one or more bus controllers (e.g., a DMA controller, an I2C bus controller, or the like or any combination thereof), through which data can be routed between certain of the components.
CPU 102. TheCPU 102 may be any microprocessor, multi-core microprocessor, parallel processors, mobile application processor, etc., known in the art (e.g., a Reduced Instruction Set Computer (RISC) from ARM Limited, the Krait CPU product-family, any X86-based microprocessor available from the Intel Corporation including those in the Pentium, Xeon, Itanium, Celeron, Atom, Core i-series product families, etc.). Another CPU example is an Apple A10, A8 or A7. By way of further example, the A8 is built on a 64-bit architecture, includes a motion co-processor and is manufactured on a 20 nm process. TheCPU 102 runs an operating system of the electronic device, runs application programs (e.g., mobile apps such as those available through application distribution platforms such as the Apple App Store, Google Play, etc., or custom designed to include signal decoding and icon detection) and, optionally, manages the various functions of the electronic device. TheCPU 102 may include or be coupled to a read-only memory (ROM) (not shown), which may hold an operating system (e.g., a “high-level” operating system, a “real-time” operating system, a mobile operating system, or the like or any combination thereof) or other device firmware that runs on the electronic device. Encoded signal decoding and icon detection capabilities can be integrated into the operating system itself. - The electronic device may also include a
volatile memory 104 electrically coupled to bus 100. The volatile memory 104 may include, for example, any type of random access memory (RAM). Although not shown, the electronic device may further include a memory controller that controls the flow of data to and from the volatile memory 104. - The electronic device may also include a
storage memory 106 connected to the bus. The storage memory 106 typically includes one or more non-volatile semiconductor memory devices such as ROM, EPROM and EEPROM, NOR or NAND flash memory, or the like or any combination thereof, and may also include any kind of electronic storage device, such as, for example, magnetic or optical disks. In embodiments of the present invention, the storage memory 106 is used to store one or more items of software. Software can include system software, application software, middleware (e.g., Data Distribution Service (DDS) for Real Time Systems, MER, etc.), one or more computer files (e.g., one or more data files, configuration files, library files, archive files, etc.), one or more software components, or the like or any stack or other combination thereof. - Examples of system software include operating systems (e.g., including one or more high-level operating systems, real-time operating systems, mobile operating systems, or the like or any combination thereof), one or more kernels, one or more device drivers, firmware, one or more utility programs (e.g., that help to analyze, configure, optimize, maintain, etc., one or more components of the electronic device), and the like.
- Application software typically includes any application program that helps users solve problems, perform tasks, render media content, retrieve (or access, present, traverse, query, create, organize, etc.) information or information resources on a network (e.g., the World Wide Web), a web server, a file system, a database, etc. Examples of software components include device drivers, software CODECs, message queues or mailboxes, databases, etc. A software component can also include any other data or parameter to be provided to application software, a web application, or the like or any combination thereof. Examples of data files include image files, text files, audio files, video files, haptic signature files, and the like.
- Also connected to the
bus 100 is a user interface module 108. The user interface module 108 is configured to facilitate user control of the electronic device. Thus the user interface module 108 may be communicatively coupled to one or more user input devices 110. A user input device 110 can, for example, include a button, knob, touch screen, trackball, mouse, microphone (e.g., an electret microphone, a MEMS microphone, or the like or any combination thereof), an IR or ultrasound-emitting stylus, an ultrasound emitter (e.g., to detect user gestures, etc.), one or more structured light emitters (e.g., to project structured IR light to detect user gestures, etc.), one or more ultrasonic transducers, or the like or any combination thereof. - The
user interface module 108 may also be configured to indicate, to the user, the effect of the user's control of the electronic device, or any other information related to an operation being performed by the electronic device or a function otherwise supported by the electronic device. Thus the user interface module 108 may also be communicatively coupled to one or more user output devices 112. A user output device 112 can, for example, include a display (e.g., a liquid crystal display (LCD), a light emitting diode (LED) display, an active-matrix organic light-emitting diode (AMOLED) display, an e-ink display, etc.), a light, an illumination source such as a flash or torch, a buzzer, a haptic actuator, a loudspeaker, or the like or any combination thereof. In the case of an iPhone 6, the flash is a True Tone flash: a dual-color (dual-temperature) flash in which each color fires at a varying intensity, based on the scene, so that colors and skin tones stay true. - Generally, the
user input devices 110 and user output devices 112 are an integral part of the electronic device; however, in alternate embodiments, any user input device 110 (e.g., a microphone, etc.) or user output device 112 (e.g., a loudspeaker, haptic actuator, light, display, or printer) may be a physically separate device that is communicatively coupled to the electronic device (e.g., via a communications module 114). A printer encompasses many different devices for applying our encoded signals to objects, such as 2D and 3D printers, etching, engraving, flexo-printing, offset printing, embossing, laser marking, etc. The printer may also include a digital press such as HP's Indigo press. An encoded object may include, e.g., a consumer packaged product, a label, a sticker, a logo, a driver's license, a passport or other identification document, etc. Although the user interface module 108 is illustrated as an individual component, it will be appreciated that the user interface module 108 (or portions thereof) may be functionally integrated into one or more other components of the electronic device (e.g., the CPU 102, the sensor interface module 130, etc.). - Also connected to the
bus 100 is an image signal processor 116 and a graphics processing unit (GPU) 118. The image signal processor (ISP) 116 is configured to process imagery (including still-frame imagery, video imagery, or the like or any combination thereof) captured by one or more cameras 120, or by any other image sensors, thereby generating image data. Such imagery may correspond with image data 500 as shown in FIGS. 16, 17A, 17B and/or 20A. General functions typically performed by the ISP 116 can include Bayer transformation, demosaicing, noise reduction, image sharpening, filtering, or the like or any combination thereof. The GPU 118 can be configured to process the image data generated by the ISP 116, thereby generating processed image data. General functions typically performed by the GPU 118 include compressing image data (e.g., into a JPEG format, an MPEG format, or the like or any combination thereof), creating lighting effects, rendering 3D graphics, texture mapping, calculating geometric transformations (e.g., rotation, translation, etc.) into different coordinate systems, etc., and sending the compressed video data to other components of the electronic device (e.g., the volatile memory 104) via bus 100. The GPU 118 may also be configured to perform one or more video decompression or decoding processes. Image data generated by the ISP 116 or processed image data generated by the GPU 118 may be accessed by the user interface module 108, where it is converted into one or more suitable signals that may be sent to a user output device 112 such as a display, printer or speaker. The GPU 118 may also be configured to serve one or more functions of a signal decoder. In some cases the GPU 118 is involved in encoded signal decoding (e.g., FIGS. 16A and 16B, 502), while icon detection (FIGS. 16A and 16B, 506) is performed by the CPU 102. In other implementations, the GPU 118 performs both signal detection 502 (FIGS. 16A and 16B) and icon detection 506 (FIGS. 16A and 16B). In some cases, Icon Detector 506 (FIGS. 16A and 16B) is incorporated into Signal Decoder 502 (FIGS. 16A and 16B), which may execute on the CPU 102, the GPU 118 or another processing core.
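As a hedged sketch of the work split described above, the code below dispatches encoded signal decoding and icon detection as two concurrent tasks over the same frame. The functions decode_encoded_signal and detect_icon are hypothetical placeholders for Signal Decoder 502 and Icon Detector 506; on a real device one of them might instead run as a GPU kernel rather than on a thread pool.

```python
# Illustrative sketch only: dispatching encoded-signal decoding and icon detection
# as separate tasks over the same frame, mirroring the CPU/GPU split described above.
# Both detectors are plain Python placeholders executed on a thread pool.
from concurrent.futures import ThreadPoolExecutor

def decode_encoded_signal(frame) -> dict:
    # placeholder for Signal Decoder 502: e.g., synchronize and extract a payload
    return {"payload": None, "strength": 0.0}

def detect_icon(frame) -> dict:
    # placeholder for Icon Detector 506: e.g., template/shape matching for a known icon
    return {"icon_found": False, "location": None}

def analyze_frame(frame) -> dict:
    with ThreadPoolExecutor(max_workers=2) as pool:
        signal_future = pool.submit(decode_encoded_signal, frame)
        icon_future = pool.submit(detect_icon, frame)
        return {"signal": signal_future.result(), "icon": icon_future.result()}

if __name__ == "__main__":
    print(analyze_frame(frame=[[0] * 8] * 8))
```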
- Also coupled to the bus 100 is an audio I/O module 122, which is configured to encode, decode and route data to and from one or more microphone(s) 124 (any of which may be considered a user input device 110) and loudspeaker(s) 126 (any of which may be considered a user output device 112). For example, sound can be present within an ambient, aural environment (e.g., as one or more propagating sound waves) surrounding the electronic device. A sample of such ambient sound can be obtained by sensing the propagating sound wave(s) using one or more microphones 124, and the microphone(s) 124 then convert the sensed sound into one or more corresponding analog audio signals (typically, electrical signals), thereby capturing the sensed sound. The signal(s) generated by the microphone(s) 124 can then be processed by the audio I/O module 122 (e.g., to convert the analog audio signals into digital audio signals), which thereafter outputs the resultant digital audio signals (e.g., to an audio digital signal processor (DSP) such as audio DSP 128, to another module such as a song recognition module, a speech recognition module, a voice recognition module, etc., to the volatile memory 104, the storage memory 106, or the like or any combination thereof). The audio I/O module 122 can also receive digital audio signals from the audio DSP 128, convert each received digital audio signal into one or more corresponding analog audio signals and send the analog audio signals to one or more loudspeakers 126. In one embodiment, the audio I/O module 122 includes two communication channels (e.g., so that the audio I/O module 122 can transmit generated audio data and receive audio data simultaneously).
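A minimal sketch of this microphone-to-DSP path follows, assuming a synthetic waveform in place of real microphone input. The functions capture_ambient and route_to_recognizer are hypothetical stand-ins for the audio I/O module 122 and a downstream recognition module, not APIs defined in this disclosure.

```python
# Schematic sketch (not the audio I/O module's actual firmware): an analog-style
# waveform is sampled and quantized to digital samples, then routed to a placeholder
# recognition routine, mirroring the microphone -> audio I/O -> DSP path above.
import math

def capture_ambient(duration_s=0.01, sample_rate=16000, bits=16):
    amplitude = 2 ** (bits - 1) - 1
    samples = []
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate
        analog = 0.5 * math.sin(2 * math.pi * 440.0 * t)   # stand-in for sensed sound
        samples.append(int(analog * amplitude))            # quantization (A/D step)
    return samples

def route_to_recognizer(samples) -> dict:
    # placeholder for a song/speech/voice recognition module or the audio DSP 128
    return {"num_samples": len(samples), "peak": max(abs(s) for s in samples)}

if __name__ == "__main__":
    print(route_to_recognizer(capture_ambient()))
```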
- The audio DSP 128 performs various processing of digital audio signals generated by the audio I/O module 122, such as compression, decompression, equalization, mixing of audio from different sources, etc., and thereafter outputs the processed digital audio signals (e.g., to the audio I/O module 122, to another module such as a song recognition module, a speech recognition module, a voice recognition module, etc., to the volatile memory 104, the storage memory 106, or the like or any combination thereof). Generally, the audio DSP 128 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof. The audio DSP 128 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, and any other component facilitating the functions it supports (e.g., as described below). In one embodiment, the audio DSP 128 includes a core processor (e.g., an ARM® AudioDE™ processor, a Hexagon processor (e.g., QDSP6V5A)), as well as a data memory, program memory, DMA channels, one or more input buffers, one or more output buffers, etc. Although the audio I/O module 122 and the audio DSP 128 are illustrated as separate components, it will be appreciated that the audio I/O module 122 and the audio DSP 128 can be functionally integrated together. Further, it will be appreciated that the audio DSP 128 and other components such as the user interface module 108 may be (at least partially) functionally integrated together. - The
aforementioned communications module 114 includes circuitry, antennas, sensors, and any other suitable or desired technology that facilitates transmitting or receiving data (e.g., within a network) through one or more wired links (e.g., via Ethernet, USB, FireWire, etc.), or one or more wireless links (e.g., configured according to any standard or otherwise desired or suitable wireless protocols or techniques such as Bluetooth, Bluetooth Low Energy, WiFi, WiMAX, GSM, CDMA, EDGE, cellular 3G or LTE, Li-Fi (e.g., for IR- or visible-light communication), sonic or ultrasonic communication, etc.), or the like or any combination thereof. In one embodiment, the communications module 114 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof. Optionally, the communications module 114 includes cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, or the like or any combination thereof. In one embodiment, the communications module 114 includes a baseband processor (e.g., that performs signal processing and implements real-time radio transmission operations for the electronic device). - Also connected to the
bus 100 is a sensor interface module 130 communicatively coupled to one or more sensor(s) 132. Sensor 132 can, for example, include an accelerometer (e.g., for sensing acceleration, orientation, vibration, etc.), a magnetometer (e.g., for sensing the direction of a magnetic field), a gyroscope (e.g., for tracking rotation, orientation, or twist), a barometer (e.g., for sensing air pressure, from which relative elevation can be determined), a wind meter, a moisture sensor, an ambient light sensor, an IR or UV sensor or other photodetector, a pressure sensor, a temperature sensor, an acoustic vector sensor (e.g., for sensing particle velocity), a galvanic skin response (GSR) sensor, an ultrasonic sensor, a location sensor (e.g., a GPS receiver module, etc.), a gas or other chemical sensor, or the like or any combination thereof. Although separately illustrated in FIG. 19, any camera 120 or microphone 124 can also be considered a sensor 132. Generally, a sensor 132 generates one or more signals (typically, electrical signals) in the presence of some sort of stimulus (e.g., light, sound, moisture, gravitational field, magnetic field, electric field, etc.), in response to a change in applied stimulus, or the like or any combination thereof. In one embodiment, all sensors 132 coupled to the sensor interface module 130 are an integral part of the electronic device; however, in alternate embodiments, one or more of the sensors may be physically separate devices communicatively coupled to the electronic device (e.g., via the communications module 114). To the extent that any sensor 132 can function to sense user input, then such sensor 132 can also be considered a user input device 110. The sensor interface module 130 is configured to activate, deactivate or otherwise control an operation (e.g., sampling rate, sampling range, etc.) of one or more sensors 132 (e.g., in accordance with instructions stored internally or externally in volatile memory 104, storage memory 106, ROM, etc., or in accordance with commands issued by one or more components such as the CPU 102, the user interface module 108, the audio DSP 128, the cue detection module 134, or the like or any combination thereof). In one embodiment, the sensor interface module 130 can encode, decode, sample, filter or otherwise process signals generated by one or more of the sensors 132. In one example, the sensor interface module 130 can integrate signals generated by multiple sensors 132 and optionally process the integrated signal(s). Signals can be routed from the sensor interface module 130 to one or more of the aforementioned components of the electronic device (e.g., via the bus 100). In another embodiment, however, any signal generated by a sensor 132 can be routed directly (e.g., to the CPU 102) before being processed.
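The sketch below is a loose, assumption-based illustration of the control role described for sensor interface module 130: activating sensors, adjusting a sampling rate on command, and integrating readings from multiple sensors. The SensorInterface class is hypothetical and not part of this specification.

```python
# Illustrative sketch only: a toy sensor-interface controller that activates sensors,
# adjusts a sampling rate on command, and integrates readings from several sensors,
# loosely mirroring the role described for sensor interface module 130.
class SensorInterface:
    def __init__(self):
        self.active = {}          # sensor name -> sampling rate in Hz

    def activate(self, name: str, rate_hz: float):
        self.active[name] = rate_hz

    def deactivate(self, name: str):
        self.active.pop(name, None)

    def set_rate(self, name: str, rate_hz: float):
        if name in self.active:
            self.active[name] = rate_hz

    def integrate(self, readings: dict[str, float]) -> float:
        # trivial integration: average the readings from the currently active sensors
        values = [v for k, v in readings.items() if k in self.active]
        return sum(values) / len(values) if values else 0.0

if __name__ == "__main__":
    sif = SensorInterface()
    sif.activate("accelerometer", 100.0)
    sif.activate("ambient_light", 10.0)
    print(sif.integrate({"accelerometer": 0.2, "ambient_light": 0.8, "gyroscope": 0.5}))
```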
- Generally, the sensor interface module 130 may include one or more microprocessors, digital signal processors or other microcontrollers, programmable logic devices, or the like or any combination thereof. The sensor interface module 130 may also optionally include cache or other local memory device (e.g., volatile memory, non-volatile memory or a combination thereof), DMA channels, one or more input buffers, one or more output buffers, and any other component facilitating the functions it supports (e.g., as described above). In one embodiment, the sensor interface module 130 may be provided as the “Sensor Core” (Sensors Processor Subsystem (SPS)) from Qualcomm, the “frizz” from Megachips, or the like or any combination thereof. Although the sensor interface module 130 is illustrated as an individual component, it will be appreciated that the sensor interface module 130 (or portions thereof) may be functionally integrated into one or more other components (e.g., the CPU 102, the communications module 114, the audio I/O module 122, the audio DSP 128, the cue detection module 134, or the like or any combination thereof).
Concluding Remarks
- Having described and illustrated the principles of the technology with reference to specific implementations, it will be recognized that the technology can be implemented in many other, different, forms. To provide a comprehensive disclosure without unduly lengthening the specification, applicants incorporate by reference the US Patents and Patent Applications (“patent documents”) referenced above. Each of the above patent documents is incorporated herein in its entirety, including all drawings and any appendices, even where only specific portions of those documents are referenced.
- The particular combinations of elements and features in the above-detailed embodiments are exemplary only; the interchanging and substitution of these teachings with other teachings in this and the incorporated-by-reference patents/applications are also contemplated.
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/107,346 US20210217129A1 (en) | 2016-09-26 | 2020-11-30 | Detection of encoded signals and icons |
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662400083P | 2016-09-26 | 2016-09-26 | |
US201662405709P | 2016-10-07 | 2016-10-07 | |
US201662429539P | 2016-12-02 | 2016-12-02 | |
US201715448403A | 2017-03-02 | 2017-03-02 | |
US201762488661P | 2017-04-21 | 2017-04-21 | |
US15/960,408 US10853903B1 (en) | 2016-09-26 | 2018-04-23 | Detection of encoded signals and icons |
US17/107,346 US20210217129A1 (en) | 2016-09-26 | 2020-11-30 | Detection of encoded signals and icons |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/960,408 Continuation US10853903B1 (en) | 2016-09-26 | 2018-04-23 | Detection of encoded signals and icons |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210217129A1 true US20210217129A1 (en) | 2021-07-15 |
Family
ID=73554553
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/960,408 Active 2037-05-12 US10853903B1 (en) | 2016-09-26 | 2018-04-23 | Detection of encoded signals and icons |
US17/107,346 Abandoned US20210217129A1 (en) | 2016-09-26 | 2020-11-30 | Detection of encoded signals and icons |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/960,408 Active 2037-05-12 US10853903B1 (en) | 2016-09-26 | 2018-04-23 | Detection of encoded signals and icons |
Country Status (1)
Country | Link |
---|---|
US (2) | US10853903B1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610094A (en) * | 2021-08-27 | 2021-11-05 | 四川中电启明星信息技术有限公司 | Distribution room pointer instrument reading method based on rotation projection calibration |
US20220283824A1 (en) * | 2021-03-08 | 2022-09-08 | Realtek Semiconductor Corp. | Processing system and processing method for performing emphasis process on button object of user interface |
WO2024010988A1 (en) * | 2022-07-05 | 2024-01-11 | Wacom Co., Ltd. | Digital image watermarking |
WO2024182094A1 (en) * | 2023-02-27 | 2024-09-06 | Zebra Technologies Corporation | Barcode-aware object verification |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10853903B1 (en) * | 2016-09-26 | 2020-12-01 | Digimarc Corporation | Detection of encoded signals and icons |
CN110557221A (en) * | 2018-05-31 | 2019-12-10 | 索尼公司 | Electronic device, communication method, decoding method, and medium |
WO2022040574A1 (en) * | 2020-08-21 | 2022-02-24 | Beam, Inc. | Integrating overlaid digital content into displayed data via graphics processing circuitry |
US11256887B1 (en) * | 2020-09-28 | 2022-02-22 | eSmart Source Inc. | Merging RFID data and barcode data |
US11477020B1 (en) | 2021-04-30 | 2022-10-18 | Mobeus Industries, Inc. | Generating a secure random number by determining a change in parameters of digital content in subsequent frames via graphics processing circuitry |
US11475610B1 (en) | 2021-04-30 | 2022-10-18 | Mobeus Industries, Inc. | Controlling interactivity of digital content overlaid onto displayed data via graphics processing circuitry using a frame buffer |
US11682101B2 (en) | 2021-04-30 | 2023-06-20 | Mobeus Industries, Inc. | Overlaying displayed digital content transmitted over a communication network via graphics processing circuitry using a frame buffer |
US11586835B2 (en) | 2021-04-30 | 2023-02-21 | Mobeus Industries, Inc. | Integrating overlaid textual digital content into displayed data via graphics processing circuitry using a frame buffer |
US11601276B2 (en) | 2021-04-30 | 2023-03-07 | Mobeus Industries, Inc. | Integrating and detecting visual data security token in displayed data via graphics processing circuitry using a frame buffer |
US11562153B1 (en) | 2021-07-16 | 2023-01-24 | Mobeus Industries, Inc. | Systems and methods for recognizability of objects in a multi-layer display |
SE2251170A1 (en) * | 2022-10-07 | 2024-04-08 | Wiretronic Ab | Method and system for identifying embedded information |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040258274A1 (en) * | 2002-10-31 | 2004-12-23 | Brundage Trent J. | Camera, camera accessories for reading digital watermarks, digital watermarking method and systems, and embedding digital watermarks with metallic inks |
US20100005156A1 (en) * | 2006-10-06 | 2010-01-07 | Philip Wesby | System and method for data acquisition and process and processing |
US20160203352A1 (en) * | 2013-09-20 | 2016-07-14 | Flashback Survey, Inc. | Using scanable codes to obtain a service |
US20170361233A1 (en) * | 2016-06-21 | 2017-12-21 | Activision Publishing, Inc. | System and method for reading graphically-encoded identifiers from physical trading cards through image-based template matching |
US10803272B1 (en) * | 2016-09-26 | 2020-10-13 | Digimarc Corporation | Detection of encoded signals and icons |
US10853903B1 (en) * | 2016-09-26 | 2020-12-01 | Digimarc Corporation | Detection of encoded signals and icons |
Family Cites Families (72)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4628532A (en) | 1983-07-14 | 1986-12-09 | Scan Optics, Inc. | Alphanumeric handprint recognition |
US4874936A (en) * | 1988-04-08 | 1989-10-17 | United Parcel Service Of America, Inc. | Hexagonal, information encoding article, process and system |
US6614914B1 (en) | 1995-05-08 | 2003-09-02 | Digimarc Corporation | Watermark embedder and reader |
US5862260A (en) | 1993-11-18 | 1999-01-19 | Digimarc Corporation | Methods for surveying dissemination of proprietary empirical data |
US6122403A (en) | 1995-07-27 | 2000-09-19 | Digimarc Corporation | Computer system linked by using information in data objects |
US6449377B1 (en) | 1995-05-08 | 2002-09-10 | Digimarc Corporation | Methods and systems for watermark processing of line art images |
US8144368B2 (en) | 1998-01-20 | 2012-03-27 | Digimarc Corporation | Automated methods for distinguishing copies from original printed objects
US6988202B1 (en) | 1995-05-08 | 2006-01-17 | Digimarc Corporation | Pre-filtering to increase watermark signal-to-noise ratio
US6311214B1 (en) | 1995-07-27 | 2001-10-30 | Digimarc Corporation | Linking of computers based on optical sensing of digital data |
US7412072B2 (en) | 1996-05-16 | 2008-08-12 | Digimarc Corporation | Variable message coding protocols for encoding auxiliary data in media signals |
US6249603B1 (en) | 1998-06-16 | 2001-06-19 | Xerox Corporation | Efficient search for a gray-level pattern in an image |
US6264105B1 (en) * | 1998-11-05 | 2001-07-24 | Welch Allyn Data Collection, Inc. | Bar code reader configured to read fine print barcode symbols |
US6102403A (en) | 1999-01-20 | 2000-08-15 | A&L Associates Creative Games, Llc | Method for playing high-low card game |
US6711293B1 (en) | 1999-03-08 | 2004-03-23 | The University Of British Columbia | Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image |
US6625297B1 (en) | 2000-02-10 | 2003-09-23 | Digimarc Corporation | Self-orienting watermarks |
US8355525B2 (en) | 2000-02-14 | 2013-01-15 | Digimarc Corporation | Parallel processing of digital watermarking operations |
US6674876B1 (en) | 2000-09-14 | 2004-01-06 | Digimarc Corporation | Watermarking in the time-frequency domain |
US6483927B2 (en) | 2000-12-18 | 2002-11-19 | Digimarc Corporation | Synchronizing readers of hidden auxiliary data in quantization-based data hiding schemes |
US7197160B2 (en) | 2001-03-05 | 2007-03-27 | Digimarc Corporation | Geographic information systems using digital watermarks |
US7734506B2 (en) | 2002-04-22 | 2010-06-08 | Norman Ken Ouchi | Catalog, catalog query, and item identifier for a physical item |
FR2843212B1 (en) | 2002-08-05 | 2005-07-22 | Ltu Technologies | DETECTION OF A ROBUST REFERENCE IMAGE WITH LARGE PHOTOMETRIC TRANSFORMATIONS |
US7072490B2 (en) | 2002-11-22 | 2006-07-04 | Digimarc Corporation | Symmetry watermark |
US7352878B2 (en) | 2003-04-15 | 2008-04-01 | Digimarc Corporation | Human perceptual model applied to rendering of watermarked signals |
JP4059173B2 (en) * | 2003-06-27 | 2008-03-12 | 株式会社デンソーウェーブ | Optical information reading apparatus and optical information reading method |
ATE376699T1 (en) | 2004-01-06 | 2007-11-15 | Nxp Bv | METHOD FOR REPRODUCING GRAPHIC OBJECTS |
US7359563B1 (en) | 2004-04-05 | 2008-04-15 | Louisiana Tech University Research Foundation | Method to stabilize a moving image |
US8891811B2 (en) * | 2004-09-17 | 2014-11-18 | Digimarc Corporation | Hierarchical watermark detector |
WO2006053023A2 (en) | 2004-11-09 | 2006-05-18 | Digimarc Corporation | Authenticating identification and security documents |
US20060157574A1 (en) | 2004-12-21 | 2006-07-20 | Canon Kabushiki Kaisha | Printed data storage and retrieval |
US7783130B2 (en) | 2005-01-24 | 2010-08-24 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Spatial standard observer |
GB2457267B (en) | 2008-02-07 | 2010-04-07 | Yves Dassas | A method and system of indexing numerical data |
US8353457B2 (en) | 2008-02-12 | 2013-01-15 | Datalogic ADC, Inc. | Systems and methods for forming a composite image of multiple portions of an object from multiple perspectives |
EP2272014A2 (en) | 2008-04-29 | 2011-01-12 | LTU Technologies S.A.S. | Method for generating a representation of image content using image search and retrieval criteria |
US8805110B2 (en) | 2008-08-19 | 2014-08-12 | Digimarc Corporation | Methods and systems for content processing |
US8199969B2 (en) | 2008-12-17 | 2012-06-12 | Digimarc Corporation | Out of phase digital watermarking in two chrominance directions |
US9117268B2 (en) | 2008-12-17 | 2015-08-25 | Digimarc Corporation | Out of phase digital watermarking in two chrominance directions |
US9749607B2 (en) | 2009-07-16 | 2017-08-29 | Digimarc Corporation | Coordinated illumination and image signal capture for enhanced signal detection |
US9197736B2 (en) | 2009-12-31 | 2015-11-24 | Digimarc Corporation | Intuitive computing methods and systems |
US20110321082A1 (en) | 2010-06-29 | 2011-12-29 | At&T Intellectual Property I, L.P. | User-Defined Modification of Video Content |
WO2012004626A1 (en) | 2010-07-06 | 2012-01-12 | Ltu Technologies | Method and apparatus for obtaining a symmetry invariant descriptor from a visual patch of an image |
CN103190078B (en) | 2010-09-03 | 2017-12-08 | 数字标记公司 | For estimating the signal processor and method of the conversion between signal |
US9240021B2 (en) | 2010-11-04 | 2016-01-19 | Digimarc Corporation | Smartphone-based methods and systems |
WO2012156774A1 (en) | 2011-05-18 | 2012-11-22 | Ltu Technologies | Method and apparatus for detecting visual words which are representative of a specific image category |
US9367770B2 (en) * | 2011-08-30 | 2016-06-14 | Digimarc Corporation | Methods and arrangements for identifying objects |
US10326968B2 (en) | 2011-10-20 | 2019-06-18 | Imax Corporation | Invisible or low perceptibility of image alignment in dual projection systems |
US9380186B2 (en) | 2012-08-24 | 2016-06-28 | Digimarc Corporation | Data hiding for spot colors in product packaging |
US9449357B1 (en) | 2012-08-24 | 2016-09-20 | Digimarc Corporation | Geometric enumerated watermark embedding for spot colors |
US9401001B2 (en) | 2014-01-02 | 2016-07-26 | Digimarc Corporation | Full-color visibility model using CSF which varies spatially with local luminance |
US9141842B2 (en) | 2012-02-15 | 2015-09-22 | Datalogic ADC, Inc. | Time division exposure of a data reader |
US20140105450A1 (en) | 2012-10-17 | 2014-04-17 | Robert Berkeley | System and method for targeting and reading coded content |
US9224184B2 (en) | 2012-10-21 | 2015-12-29 | Digimarc Corporation | Methods and arrangements for identifying objects |
US8990638B1 (en) | 2013-03-15 | 2015-03-24 | Digimarc Corporation | Self-stabilizing network nodes in mobile discovery system |
WO2014169238A1 (en) | 2013-04-11 | 2014-10-16 | Digimarc Corporation | Methods for object recognition and related arrangements |
US9521291B2 (en) | 2013-07-19 | 2016-12-13 | Digimarc Corporation | Feature-based watermark localization in digital capture systems |
WO2015089115A1 (en) | 2013-12-09 | 2015-06-18 | Nant Holdings Ip, Llc | Feature density object classification, systems and methods |
US9565335B2 (en) | 2014-01-02 | 2017-02-07 | Digimarc Corporation | Full color visibility model using CSF which varies spatially with local luminance |
US9635378B2 (en) | 2015-03-20 | 2017-04-25 | Digimarc Corporation | Sparse modulation for robust signaling and synchronization |
US10424038B2 (en) | 2015-03-20 | 2019-09-24 | Digimarc Corporation | Signal encoding outside of guard band region surrounding text characters, including varying encoding strength |
EP2921989A1 (en) * | 2014-03-17 | 2015-09-23 | Université de Genève | Method for object recognition and/or verification on portable devices |
US10540564B2 (en) | 2014-06-27 | 2020-01-21 | Blinker, Inc. | Method and apparatus for identifying vehicle information from an image |
US9667829B2 (en) | 2014-08-12 | 2017-05-30 | Digimarc Corporation | System and methods for encoding information for printed articles |
US10113910B2 (en) | 2014-08-26 | 2018-10-30 | Digimarc Corporation | Sensor-synchronized spectrally-structured-light imaging |
US9516001B2 (en) * | 2014-09-30 | 2016-12-06 | The Nielsen Company (Us), Llc | Methods and apparatus to identify media distributed via a network |
US9747656B2 (en) | 2015-01-22 | 2017-08-29 | Digimarc Corporation | Differential modulation for robust signaling and synchronization |
US9754341B2 (en) | 2015-03-20 | 2017-09-05 | Digimarc Corporation | Digital watermarking and data hiding with narrow-band absorption materials |
US9922220B2 (en) | 2015-06-11 | 2018-03-20 | Digimarc Corporation | Image block selection for efficient time-limited decoding |
US9819950B2 (en) | 2015-07-02 | 2017-11-14 | Digimarc Corporation | Hardware-adaptable watermark systems |
EP3311360B1 (en) | 2015-07-16 | 2023-07-19 | Digimarc Corporation | Signal processors and methods for estimating geometric transformations of images for digital data extraction |
CN107157717A (en) | 2016-03-07 | 2017-09-15 | 维看公司 | Object detection from visual information to blind person, analysis and prompt system for providing |
US20170372449A1 (en) | 2016-06-24 | 2017-12-28 | Intel Corporation | Smart capturing of whiteboard contents for remote conferencing |
US10255649B2 (en) | 2016-08-15 | 2019-04-09 | Digimarc Corporation | Signal encoding for difficult environments |
US10304149B2 (en) | 2016-08-15 | 2019-05-28 | Digimarc Corporation | Signal encoding for difficult environments |
- 2018-04-23: US US15/960,408 patent/US10853903B1/en, status: Active
- 2020-11-30: US US17/107,346 patent/US20210217129A1/en, status: Abandoned
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220283824A1 (en) * | 2021-03-08 | 2022-09-08 | Realtek Semiconductor Corp. | Processing system and processing method for performing emphasis process on button object of user interface |
US11455179B1 (en) * | 2021-03-08 | 2022-09-27 | Realtek Semiconductor Corp. | Processing system and processing method for performing emphasis process on button object of user interface |
CN113610094A (en) * | 2021-08-27 | 2021-11-05 | 四川中电启明星信息技术有限公司 | Distribution room pointer instrument reading method based on rotation projection calibration |
WO2024010988A1 (en) * | 2022-07-05 | 2024-01-11 | Wacom Co., Ltd. | Digital image watermarking |
WO2024182094A1 (en) * | 2023-02-27 | 2024-09-06 | Zebra Technologies Corporation | Barcode-aware object verification |
Also Published As
Publication number | Publication date |
---|---|
US10853903B1 (en) | 2020-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210217129A1 (en) | Detection of encoded signals and icons | |
US11449698B2 (en) | Scanner with control logic for resolving package labeling conflicts | |
US11257198B1 (en) | Detection of encoded signals and icons | |
US10803272B1 (en) | Detection of encoded signals and icons | |
US11676238B2 (en) | Detecting conflicts between multiple different signals within imagery | |
US10242434B1 (en) | Compensating for geometric distortion of images in constrained processing environments | |
US10506128B1 (en) | Encoded signal systems and methods to ensure minimal robustness | |
US9311531B2 (en) | Systems and methods for classifying objects in digital images captured using mobile devices | |
US11336795B2 (en) | Encoded signal systems and methods to ensure minimal robustness | |
US10460161B1 (en) | Methods and systems for ensuring correct printing plate usage and signal tolerances | |
US11715172B2 (en) | Detecting conflicts between multiple different encoded signals within imagery, using only a subset of available image data | |
US11875485B2 (en) | Compensating for geometric distortion of images in constrained processing environments | |
US11941720B2 (en) | Detecting conflicts between multiple different encoded signals within imagery, using only a subset of available image data, and robustness checks | |
US10373299B1 (en) | Compensating for geometric distortion of images in constrained processing environments | |
Medic | Model driven optical form recognition | |
WO2016199126A1 (en) | System and method of recognizing state of a product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: DIGIMARC CORPORATION, OREGON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WEAVER, MATTHEW M.;ROGERS, ELIOT;DESHMUKH, UTKARSH;SIGNING DATES FROM 20190411 TO 20190412;REEL/FRAME:054494/0522 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |