US9064161B1 - System and method for detecting generic items in image sequence - Google Patents
- Publication number: US9064161B1 (application Ser. No. 11/811,211)
- Authority: US (United States)
- Prior art keywords: visual features, features, group, classification, parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G06V20/40: Scenes; scene-specific elements in video content
- G06K7/00: Methods or arrangements for sensing record carriers, e.g. for reading patterns
- G06K7/01: Details of methods or arrangements for sensing record carriers
- G06Q30/00: Commerce
- G06V10/464: Salient features, e.g. scale-invariant feature transforms [SIFT], using a plurality of salient features, e.g. bag-of-words [BoW] representations
- G07G1/00: Cash registers
- G07G1/0063: Checkout procedures with a code reader (e.g. barcode or RFID) with control of supplementary check-parameters, including means for detecting the geometric dimensions of the article whose code is read
- G07G3/003: Anti-theft control
- G06V20/68: Food, e.g. fruit or vegetables
Definitions
- the field of the present disclosure generally relates to techniques for detecting the presence of generic items in a video or other sequence of images, and more particularly to systems and methods for locating and tracking unidentified objects to minimize retailer losses.
- a bottom-of-basket (BoB) item detection system that can accurately detect the presence of merchandise on a cart with very few false positives and potentially lock the cash register in such a way that the cashier cannot complete the transaction without processing the BoB item.
- a preferred embodiment features a system and method for detecting the presence of objects, either known or unknown, based on groups of visual features extracted from image data
- the system is a checkout system for detecting items of merchandise on a shopping cart without the use of reflectors, markers, or other indicia on the cart or merchandise.
- the merchandise checkout system preferably includes: a feature extractor for extracting visual features from a plurality of images; a motion detector configured to detect one or more groups of the visual features present in at least two of the plurality of images; a classifier configured to generate one or more parameters characterizing each of said groups of visual features; and an alarm configured to generate an alert if the one or more parameters for any of said groups of visual features satisfy one or more classification criteria.
- the images include still or video images of the items on a moving structure such as a shopping cart, including the bottom basket of the shopping cart.
- the visual features extracted from the images are generated from the graphical and textual indicia on the merchandise packaging.
- the visual features are preferably scale-invariant features such as Scale-Invariant Feature Transform (SIFT) features and Speeded Up Robust Features (SURF), although various other feature descriptors and detectors known to those skilled in the art are also suitable.
- Extracted features may be tracked between images as a group using a geometric transform such as an affine transformation or homography transformation, for example.
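To make the group-tracking idea concrete, the affine transform relating a group of matched keypoints in one image to the same group in a later image can be fit by least squares. This is a minimal NumPy sketch; the function name and test setup are illustrative, not from the patent:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares fit of a 2-D affine transform dst ~= A @ src + t.

    src, dst: (N, 2) arrays of matched keypoint coordinates (N >= 3).
    Returns the 2x2 linear part A and the translation vector t.
    """
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    # Design matrix [x, y, 1]: each output coordinate is a linear
    # combination of the input coordinates plus a translation.
    X = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)  # shape (3, 2)
    A, t = params[:2].T, params[2]
    return A, t

# A group of features rotated slightly and translated by (12, -3):
theta = 0.05
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
src = np.random.default_rng(0).uniform(0, 100, (20, 2))
dst = src @ R.T + np.array([12.0, -3.0])
A, t = fit_affine(src, dst)
```

The recovered A and t describe the group's common rotational and translational motion between the two frames, which is exactly the quantity the later classification steps threshold on.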
- the parameters used to characterize the images preferably include a translational motion parameter, a rotational motion parameter, a speed parameter, a direction parameter, a number of tracked features parameter, a group entropy parameter, a time of image capture parameter, an aspect ratio parameter, an elapse time parameter, an edge score parameter, a SIFT feature, a SURF feature, a codebook histogram parameter, and combinations thereof.
- the classifier that distinguishes whether an item is detected or not may be selected from: one or more thresholding conditions, a linear classification, a 3-D Motion Estimation classifier, a Nearest Neighbor classifier, a Neural Network classifier, a Vector Quantization classifier, and combinations thereof.
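To illustrate the codebook-histogram and Vector Quantization idea mentioned above, the sketch below quantizes feature descriptors against a small hypothetical codebook and returns the normalized histogram of nearest-codeword counts. The toy 2-D "descriptors" stand in for real high-dimensional SIFT vectors:

```python
import numpy as np

def codebook_histogram(descriptors, codebook):
    """Quantize descriptors to their nearest codeword and return the
    normalized histogram of codeword hits."""
    d = np.asarray(descriptors, float)   # (N, D) feature descriptors
    c = np.asarray(codebook, float)      # (K, D) codebook entries
    # Squared Euclidean distance from every descriptor to every codeword.
    dists = ((d[:, None, :] - c[None, :, :]) ** 2).sum(-1)
    nearest = dists.argmin(axis=1)       # index of closest codeword
    hist = np.bincount(nearest, minlength=len(c)).astype(float)
    return hist / hist.sum()

# Toy descriptors clustered near two of three codewords:
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
descriptors = np.array([[0.1, -0.2], [0.3, 0.1], [9.8, 10.2], [10.1, 9.9]])
hist = codebook_histogram(descriptors, codebook)
# hist -> [0.5, 0.5, 0.0]: half the descriptors fall in each of the
# first two codebook cells, none in the third.
```

The resulting histogram is the kind of fixed-length summary a Nearest Neighbor or Neural Network classifier can consume regardless of how many features a group contains.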
- Another embodiment features a method of detecting an object from a sequence of images, the method comprising: extracting visual features from a plurality of images; detecting one or more groups of the extracted visual features that are present in at least two of the plurality of images; generating one or more motion parameters or appearance parameters to characterize each of the one or more groups of extracted visual features; and generating an alert if the one or more motion parameters or appearance parameters for any of said groups of visual features satisfy one or more classification criteria.
- the group of matching features may be detected and tracked in multiple images using an affine transform or homography, for example.
- FIG. 1 is a perspective view of a merchandise checkout system for a retail establishment, in accordance with a preferred embodiment
- FIG. 2 is a functional block diagram of an exemplary merchandise checkout system, in accordance with a preferred embodiment
- FIG. 3 is a flowchart of a general method of detecting an object on a moving structure, in accordance with a preferred embodiment
- FIG. 4 is a flowchart of the method of extracting scale-invariant visual features, in accordance with a preferred embodiment
- FIG. 5 is a flowchart of a particular method of detecting an object on a shopping cart, in accordance with another preferred embodiment
- FIG. 6 is a 3-D plot of feature descriptors and codebook entries, in accordance with another preferred embodiment
- FIG. 7 is a codebook histogram, in accordance with another preferred embodiment.
- FIG. 8 is a flowchart of the method of generating a codebook histogram for classification, in accordance with another preferred embodiment
- Illustrated in FIG. 1 is a merchandise checkout system 100 for detecting merchandise being purchased by customers at a checkout terminal in a retail store, for example.
- the merchandise checkout system 100 is further configured to identify objects on the shopping cart using the systems and methods taught in U.S. Pat. No. 7,100,824, which is hereby incorporated by reference herein.
- the merchandise checkout system 100 is configured to detect the presence of merchandise on the cart without necessarily identifying the merchandise.
- the merchandise checkout system 100 may be particularly useful in situations in which the merchandise on the shopping cart evades identification because the merchandise is obstructed by another object, or because the item is not represented in the database of known objects.
- the checkout system may also generate an alert to notify the cashier or checkout terminal 106 of the presence, identity, price, and quantity of the items of merchandise 116 in the shopping cart if recognized.
- Illustrated in FIG. 1 is the configuration of the merchandise checkout system 100 .
- the system 100 includes one or more visual sensors mounted on one or more sides of the checkout lane to capture a sequence of images of a shopping cart or other moving structure and any merchandise thereon. As such, the visual sensors can capture images of merchandise on a cart 108 without the items being removed from the cart.
- the set of visual sensors, which generally encompasses digital cameras and video cameras, includes a first camera 118 a mounted in the checkout counter and a second camera 118 b mounted in the checkout counter on the opposite side of the lane.
- the image data from the cameras is transmitted to a local object detection processor 103 installed in the checkout counter, or transmitted to a remote server via wired or wireless network connection. An alert may then be transmitted to the checkout terminal 106 so that appropriate action may be taken, or the information transmitted to the server to update the inventory records.
- the object detection processor 103 in the preferred embodiment detects the presence of an unidentified object by tracking scale invariant visual features that match across a sequence of images, generating one or more motion parameters and appearance parameters to characterize the motion of the matching features, and classifying the matching features based on their motion and appearance characteristics.
- visual features associated with merchandise are located, even in the presence of background objects opposite the cart and customers walking next to the cart.
- the features that satisfy the classification parameters are generally limited to unidentified items of merchandise.
- An estimate of likelihood that the remaining features are of an item of merchandise may also then be generated for purposes of determining the appropriate response.
- the processor 103 generates one of a plurality of types of alerts to notify the cashier of the presence of the item on the shopping cart.
- Illustrated in FIG. 2 is a functional block diagram of an exemplary merchandise checkout system.
- the one or more cameras 201 - 203 trained on the checkout aisle transmit image data to the object detection processor 103 via a camera interface 212 .
- the interface may be adjusted to multiplex the camera feeds and buffer the image data where necessary.
- the camera interface 212 is adapted to transmit signals to trigger the cameras 201 - 203 to capture images at a determined frame rate, the rate being tailored to capture two or more images of the items on the cart as it is pushed through the aisle.
- the system and method for dynamically capturing the sequence of images is taught in pending U.S. patent application entitled “OPTICAL FLOW FOR OBJECT RECOGNITION,” application Ser. No. 11/324,957, filed Jan. 3, 2006, which is hereby incorporated by reference herein.
- the object detection processor 103 preferably includes an image capture module 220 configured to receive and preprocess the image data from the cameras 201 - 203 . Preprocessing may include adjusting the image contrast/brightness or resolution, removing distortion including pincushion artifacts, and cropping the image to selectively remove background features or portions of the cart.
- the image data from the image capture module 220 is received by a feature extractor module 230 which locates keypoints associated with visual features in the images and generates multi-dimensional vectors characterizing each feature.
- the visual features are scale-invariant features generated with a scale-invariant feature transform (SIFT) process described in more detail in context of FIG. 4 .
- the feature comparison module 240 is configured to compare the features of different images to match SIFT features and combinations of features that are present in two or more selected images.
- the motion detector module 250 determines the speed and direction of the objects using the SIFT features common to the selected images.
- the motion detector 250 is a motion estimator that estimates movement (e.g., translational movement, rotational movement) of the features between the selected images.
- the objects need not be identified in order to track the features' motion.
- the classifier 260 is configured to filter SIFT features or groups of features based on various motion parameters and appearance parameters tailored to merchandise or other desirable objects from background and other undesirable objects. If one or more items of merchandise are detected, an alarm referred to herein as the point-of-sale (POS) alert module 270 notifies the cashier of its presence on the cart.
- the object detection processor 103 is also configured to detect and identify merchandise or other objects on the cart 108 where possible. In this configuration, extracted features are compared to a database of features of known objects and the matches used to identify the associated items of merchandise. If the object cannot be identified, the features may then be processed in the manner described herein to detect at least the presence of the object on the cart.
- the merchandise detected by the checkout system of the preferred embodiment generally includes consumable and household goods available for purchase in grocery stores, for example.
- a large percentage of such goods are marked with graphical indicia including the product name, manufacturer name, and one or more trademarks and design logos that provide the basis for identification of the object by both customers and the checkout system.
- the checkout system may be employed to detect and/or recognize any number of object types present in retail, wholesale, commercial, or residential settings.
- Illustrated in FIG. 3 is a flowchart of a method of detecting objects on shopping carts or other moving structures.
- the object detection processor 103 receives 310 at least two images or selects at least two images from a video sequence. The selected images are acquired at a rate high enough to ensure that an object appears in two or more frames, yet low enough to minimize the error associated with velocity measurements and rotation angle measurements.
- the object detection processor 103 proceeds to extract 320 features by applying the scale-invariant feature transform to locate keypoints at areas of high contrast in a scale-invariant space, generate a plurality of gradients from the image data in proximity to each keypoint, and generate a feature descriptor vector from the gradient information. This process is repeated for each SIFT feature extracted in all of the two or more images.
- the set of SIFT features derived from the two or more images may inadvertently include visual features associated with background objects and customers in addition to those of merchandise.
- the visual features from a first image are compared to the features extracted from the remaining one or more images for purposes of identifying matching features with the same motion, i.e., features that move with a similar direction and speed between the two or more images being compared.
- the matching features are detected 330 using an affine transformation, although various transforms including similarity transforms and homography transforms may also be employed.
- the affine transformation identifies subgroups of features that are common to the two or more images and estimates the translational and rotational movement of the features from the first image to the second. Multiple subgroups can be identified, each subgroup characterized by a unique translational motion, rotational motion, or combination thereof.
- each subgroup of features identifies one object or multiple objects with the same motion between the two or more images.
- Features that are present in only one of the two or more selected images are either used for a confidence estimate or discarded. Even with some of the features discarded, however, the remaining features are generally sufficient to track individual objects in the field of view and distinguish the objects with the same motion from other objects with a different motion.
- segmentation of the image into moving objects can be achieved by the technique of “optical flow,” referenced above.
- features need not be explicitly identified, but rather the image is subdivided into a grid of blocks, and each block's motion from one image to another is determined by the known technique of maximizing normalized correlation or the known technique of minimizing the sum of squared differences. Blocks having substantially the same motion can be grouped together to delineate a moving object, thus providing the same type of information that would be obtained by tracking features with an affine transform and grouping features with the same motion. If necessary for the further steps of classification of the detected moving objects, as described below, features can be extracted on the portion of the image identified as a moving object.
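The block-based alternative described above can be sketched as follows, assuming grayscale frames held as NumPy arrays. The block size, search radius, and use of SSD (rather than normalized correlation) are illustrative choices:

```python
import numpy as np

def block_motion(prev, curr, block=8, search=4):
    """Estimate per-block translation between two grayscale frames by
    minimizing the sum of squared differences (SSD) over a search window.

    Returns an (H//block, W//block, 2) array of (dy, dx) displacements.
    """
    H, W = prev.shape
    motions = np.zeros((H // block, W // block, 2), int)
    for by in range(H // block):
        for bx in range(W // block):
            y, x = by * block, bx * block
            ref = prev[y:y + block, x:x + block].astype(float)
            best, best_d = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > H or xx + block > W:
                        continue  # candidate block falls outside the frame
                    cand = curr[yy:yy + block, xx:xx + block].astype(float)
                    ssd = ((cand - ref) ** 2).sum()
                    if best is None or ssd < best:
                        best, best_d = ssd, (dy, dx)
            motions[by, bx] = best_d
    return motions

# A frame shifted right by 2 px: interior blocks should report (0, 2).
rng = np.random.default_rng(1)
prev = rng.integers(0, 256, (16, 24)).astype(float)
curr = np.roll(prev, 2, axis=1)
motions = block_motion(prev, curr)
```

Blocks reporting substantially the same (dy, dx) would then be grouped to delineate one moving object, mirroring the feature-grouping step of the affine-tracking path.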
- the objects are classified 340 based upon their motion, appearance, timing, and/or other criteria.
- visual features are classified into two groups, the first group including features characteristic of desirable objects and the second group of undesirable objects.
- the group of desirable objects consists of those objects that exhibit motion that satisfy a threshold or fall within an allowable range of translation and rotation, for example, while the group of undesirable objects fail to satisfy the translation motion requirement, rotational motion requirement, or both.
- the group of desirable objects is restricted to the set of features having the greatest horizontal motion, substantially no vertical motion, and substantially no rotation.
- the horizontal motion threshold is consistent with merchandise on the bottom basket of a cart while excluding background objects behind the checkout aisle.
- the vertical motion and rotational motion thresholds are set to exclude features associated with customers, namely the clothing and shoes of customers for example, as they walk through the aisle.
- different motion criteria may be employed for different applications of the invention including other retail or commercial environments.
- a confidence estimate is generated 350 to assess the likelihood that the remaining set of tracked features is actually associated with an item of merchandise on the cart.
- the estimate in the preferred embodiment may be based on any number of criteria including, for example, the average or median speed of the object in the horizontal or vertical directions, the number of features satisfying the translational and rotational motion criteria, the entropy of the tracked area, the aspect ratio of the area containing the features, the time since the start of the purchase transaction, the time interval between the consecutive selected images, and the resemblance of the features to an edge.
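One plausible way to combine such criteria into a numeric confidence is a weighted vote over per-cue threshold tests. The cue names, thresholds, and equal weighting below are invented for illustration; the patent leaves the combination rule open:

```python
def confidence_estimate(group, weights=None):
    """Combine per-group evidence into a confidence score in [0, 1].

    `group` is a dict of measurements for one tracked feature group.
    Each cue casts a 0/1 vote; the score is their weighted average.
    """
    votes = {
        "horizontal_speed": group["horizontal_speed"] > 20.0,  # px/frame
        "vertical_speed":   group["vertical_speed"] < 2.0,
        "num_features":     group["num_features"] >= 10,
        "edge_score":       group["edge_score"] > 0.5,
    }
    weights = weights or {k: 1.0 for k in votes}
    total = sum(weights.values())
    return sum(weights[k] for k, v in votes.items() if v) / total

group = {"horizontal_speed": 35.0, "vertical_speed": 0.5,
         "num_features": 14, "edge_score": 0.3}
score = confidence_estimate(group)  # 3 of 4 cues vote yes -> 0.75
```

The resulting score could drive a graduated response, e.g. a soft visual alert at moderate confidence and a register lock only at high confidence.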
- the confidence estimate is inferred based on the satisfaction of the classification criteria without the need to generate a numerical estimate.
- the object detection processor 103 takes action to alert 360 the cashier or other user that there is an item in the shopping cart.
- the alert can take the form of an audible alert, a visual alert, or suspension of the transaction until the cashier inspects the cart.
- a scale-invariant feature is one that can be reliably detected regardless of the scale with which the object appears in the image.
- the scale-invariant features preferably SIFT features—are extracted from each of the one or more images of a shopping cart with merchandise therein.
- Visual features are extracted from a plurality of Difference-of-Gaussian (DoG) images derived from the selected input image.
- a Difference-of-Gaussian image represents a band-pass filtered image produced by subtracting a first copy of the image blurred with a first Gaussian kernel from a second copy of the image blurred with a second Gaussian kernel.
- The process of generating DoG images 402 is repeated for multiple frequency bands, that is, at different scales, in order to accentuate objects and object features independent of their size and resolution. While image blurring is achieved using Gaussian convolution kernels of variable width, one skilled in the art will appreciate that the same results may be achieved by using a fixed-width Gaussian of appropriate variance with images of different resolutions produced by down-sampling the original input image.
- Each of the DoG images is inspected to determine the location 404 of the pixel extrema, including minima and maxima.
- an extremum must possess the highest or lowest pixel intensity among the eight adjacent pixels in the same DoG image, as well as among the neighboring pixels (either two contiguous pixels or nine adjacent pixels) in the two adjacent DoG images having the closest related band-pass filtering, i.e., the adjacent DoG images having the next highest scale and the next lowest scale, if present.
- the identified extrema, which may be referred to herein as image “keypoints,” are associated with the center points of the visual features.
- an improved estimate of the location of each extremum within a DoG image may be determined through interpolation using a 3-dimensional quadratic function, for example, to improve feature matching and stability.
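The 26-neighbour extremum test over a triple of adjacent DoG images might be sketched as follows. This is a brute-force illustration that omits the quadratic-interpolation refinement mentioned above:

```python
import numpy as np

def find_extrema(dogs):
    """Scan the middle image of three adjacent DoG images for pixels that
    are strictly greater (or smaller) than all 26 neighbours: 8 in the
    same image and 9 in each adjacent scale. Returns (row, col) keypoints."""
    lo, mid, hi = (np.asarray(d, float) for d in dogs)
    keypoints = []
    H, W = mid.shape
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            v = mid[y, x]
            patches = [lo[y-1:y+2, x-1:x+2],
                       mid[y-1:y+2, x-1:x+2],
                       hi[y-1:y+2, x-1:x+2]]
            neigh = np.concatenate([p.ravel() for p in patches])
            neigh = np.delete(neigh, 13)  # drop the centre pixel itself
            if v > neigh.max() or v < neigh.min():
                keypoints.append((y, x))
    return keypoints

# A single bright spike in the middle scale is the only extremum:
lo = np.zeros((5, 6)); hi = np.zeros((5, 6))
mid = np.zeros((5, 6)); mid[2, 3] = 5.0
kps = find_extrema([lo, mid, hi])
```

A production implementation would vectorize this scan and then refine each hit with the 3-D quadratic fit for sub-pixel accuracy.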
- the local image properties are used to assign 406 an orientation (among a 360 degree range of possible orientations) to each of the keypoints.
- the orientation is derived from an orientation histogram formed from gradient orientations at all points within a circular window around the keypoint.
- the peak in the orientation histogram, which corresponds to a dominant direction of the gradients local to a keypoint, is assigned to be the feature's orientation.
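A minimal version of this orientation assignment, assuming a square patch around the keypoint and ignoring the Gaussian weighting a full SIFT implementation would apply:

```python
import numpy as np

def dominant_orientation(patch, bins=36):
    """Assign an orientation to a keypoint from the histogram of gradient
    directions over a patch around it; the histogram peak wins."""
    patch = np.asarray(patch, float)
    gy, gx = np.gradient(patch)                  # per-pixel gradients
    mag = np.hypot(gx, gy)                       # gradient magnitudes
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    # Magnitude-weighted histogram of gradient directions.
    hist, edges = np.histogram(ang, bins=bins, range=(0, 360), weights=mag)
    peak = hist.argmax()
    return 0.5 * (edges[peak] + edges[peak + 1])  # centre of peak bin

# Brightness increasing along x: the gradient points at 0 degrees, so the
# assigned orientation is the centre of the first 10-degree bin.
ramp = np.tile(np.arange(8.0), (8, 1))
orient = dominant_orientation(ramp)
```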
- With the orientation of each keypoint assigned, the SIFT processor of the feature extractor generates 408 a feature descriptor to characterize the image data in a region surrounding each identified keypoint at its respective orientation.
- a feature descriptor is an array of gradient data from the region of the image immediately surrounding an associated keypoint.
- the surrounding region within the associated DoG image is subdivided into an M×M array of subfields aligned with the keypoint's assigned orientation.
- Each subfield is characterized by an orientation histogram having a plurality of bins, each bin representing the sum of the image's gradient magnitudes possessing a direction within a particular angular range and present within the associated subfield.
- the feature descriptor includes a 128-byte array corresponding to a 4×4 array of subfields, with each subfield including eight bins corresponding to an angular width of 45 degrees.
- the feature descriptor in the preferred embodiment further includes an identifier of the associated image, the scale of the DoG image in which the associated keypoint was identified, the orientation of the feature, and the geometric location of the keypoint in the associated DoG image.
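The 4×4-subfield, eight-bin layout can be sketched as below. This simplified version omits rotation normalization, Gaussian weighting, and trilinear interpolation, so it illustrates the descriptor's shape rather than a faithful SIFT implementation:

```python
import numpy as np

def sift_like_descriptor(patch):
    """Build a 4x4-subfield, 8-bins-per-subfield descriptor (128 values)
    from a 16x16 patch, then normalize it to unit length."""
    patch = np.asarray(patch, float)
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    desc = []
    for sy in range(4):              # 4x4 grid of 4x4-pixel subfields
        for sx in range(4):
            sl = np.s_[4*sy:4*sy+4, 4*sx:4*sx+4]
            # Magnitude-weighted 8-bin orientation histogram per subfield.
            hist, _ = np.histogram(ang[sl], bins=8, range=(0, 360),
                                   weights=mag[sl])
            desc.extend(hist)
    desc = np.array(desc)            # 16 subfields x 8 bins = 128 values
    n = np.linalg.norm(desc)
    return desc / n if n else desc

patch = np.random.default_rng(2).uniform(0, 255, (16, 16))
desc = sift_like_descriptor(patch)   # 128-dimensional unit vector
```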
- the process of generating 402 DoG images, localizing 404 pixel extrema across the DoG images, assigning 406 an orientation to each of the localized extrema, and generating 408 a feature descriptor for each of the localized extrema may then be repeated for each of the two or more images received from the one or more cameras trained on the shopping cart passing through a checkout lane.
- the SIFT methodology has also been extensively described in U.S. Pat. No. 6,711,293, issued Mar. 23, 2004, which is hereby incorporated by reference herein, and by David G. Lowe, “Object Recognition from Local Scale-Invariant Features,” Proceedings of the International Conference on Computer Vision, Corfu, Greece, September 1999, and David G. Lowe, “Local Feature View Clustering for 3D Object Recognition,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, December 2001; both of which are incorporated herein by reference.
- a scale-invariant and rotation-invariant technique referred to as Speeded Up Robust Features (SURF) is implemented.
- the SURF technique uses a Hessian matrix composed of box filters that operate on points of the image to determine the location of keypoints as well as the scale of the image data at which the keypoint is an extremum in scale space.
- the box filters approximate Gaussian second order derivative filters.
- An orientation is assigned to the feature based on Gaussian weighted, Haar-wavelet responses in the horizontal and vertical directions. A square aligned with the assigned orientation is centered about the point for purposes of generating a feature descriptor.
- Multiple Haar-wavelet responses are generated at multiple points for orthogonal directions in each of the 4×4 sub-regions that make up the square.
- Exemplary feature detectors include: the Harris detector, which finds corner-like features at a fixed scale; the Harris-Laplace detector, which uses a scale-adapted Harris function to localize points in scale-space and then selects the points for which the Laplacian-of-Gaussian attains a maximum over scale; the Hessian-Laplace detector, which localizes points in space at the local maxima of the Hessian determinant and in scale at the local maxima of the Laplacian-of-Gaussian; the Harris/Hessian Affine detector, which performs an affine adaptation of the Harris/Hessian-Laplace detector using the second moment matrix; the Maximally Stable Extremal Regions (MSER) detector, which finds regions whose interior pixels have either higher (bright extremal regions) or lower (dark extremal regions) intensity than all pixels on the outer boundary; and the salient region detector, which maximizes the entropy within the region.
- Exemplary feature descriptors include: Shape Contexts, which compute the distance and orientation histogram of other points relative to the interest point; Image Moments, which generate descriptors by taking various higher-order image moments; Jet Descriptors, which generate higher-order derivatives at the interest point; the Gradient Location and Orientation Histogram, which uses a histogram of the location and orientation of points in a window around the interest point; Gaussian derivatives; moment invariants; complex features; steerable filters; and phase-based local features known to those skilled in the art.
- Illustrated in FIG. 5 is a flowchart of the preferred method of detecting objects on a shopping cart and alerting a cashier or other user when appropriate.
- the object detection processor 103 receives 510 (or selects) at least two images from a video sequence as the shopping cart is moved through the checkout lane. Although a camera can easily acquire images at a rate of 30 frames per second, the images selected for object detection can be sampled at a much lower rate. For a consumer pushing a cart at approximately one mile per hour, the selected images should be captured at a rate of 1-5 frames per second.
- the object detection processor 103 proceeds to extract 512 features from the two images using the SIFT process to locate keypoints at areas of high contrast and generate a descriptor for each of the keypoints.
- the extracted features associated with a pair of images are compared for purposes of identifying 514 features having a similar motion or appearance between the two images.
- the set of visual features from the first image are compared to the features extracted from the second image.
- groups of features with the same or similar translational and rotational motion are detected and grouped. Multiple subgroups can be identified, each subgroup characterized by a unique translational motion, rotational motion, or combination thereof.
- two or more pairs of images are analyzed in the manner described herein. If only two images are received, decision block 516 is answered in the negative and one or more additional images are acquired. In some implementations, the two pairs of images are generated from as few as three images. When relying on three images, the set of visual features from the first image is compared to the features extracted from the second image, and the set of visual features from the second image is compared to the features extracted from the third image.
- When at least two pairs of images have been acquired and the matching features amongst each of the pairs identified, decision block 516 is answered in the affirmative. If there are no matching features common to any of the pairs of images, there is insufficient information to detect an object. In this situation, decision block 518 is answered in the affirmative and the process ended. If at least one pair of images yields matching features, however, decision block 518 is answered in the negative and the matching features are subjected to classification.
- classification parameters are generated for each of the groups of matching features of each of the pairs of images.
- the parameters are selected to distinguish objects on a cart from everything else that may pass through the checkout aisle.
- Classification parameters are generally computed for each subgroup of matching features having the same motion between a pair of images.
- a list of classification parameters may include, but is not limited to, one or more of the following: (a) horizontal shift, (b) vertical shift, (c) number of tracked features, (d) matching feature entropy, (e) time of image capture relative to start of purchase transaction, (f) aspect ratio, (g) elapse time, (h) edge score, and (i) amount of rotation.
- the horizontal shift parameter refers to the median horizontal displacement of a subgroup of matching features observed in a pair of images.
- the vertical shift parameter refers to the median vertical displacement of a subgroup of matching features observed in a pair of images.
- the number of tracked features parameter refers to the total number of matching features associated with a pair of images.
- the matching feature entropy parameter refers to the entropy associated with the pixel brightness within an area of the image bounded by the group of matching features having the same motion.
- the time of image capture parameter refers to the time—hour, minute, and/or second—the pair of images are captured by the camera relative to the beginning of the purchase transaction, i.e., the time the customer's first item was scanned in or otherwise rung up.
- the aspect ratio parameter refers to the ratio of the overall width to the overall height of a box bounding a group of matching features having the same motion.
- the elapse time parameter refers to the time interval between the capture of the two images of a pair.
- the edge score parameter is a measure of how edge-like the matching features are.
- the rotation parameter refers to the amount of angular rotation exhibited by a group of matching features with similar motion between a pair of images.
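As a rough illustration, several of the motion, appearance, and temporal parameters listed above can be computed directly from a subgroup of matched feature positions. The following is a minimal sketch; the function name, field names, and the use of the second image's bounding box for the aspect ratio are illustrative assumptions, not taken from the patent:

```python
from statistics import median

def classification_parameters(matches, t0, t1, txn_start):
    """Compute several of the classification parameters for one subgroup of
    matching features. Each match is ((x0, y0), (x1, y1)): a feature's
    position in the first and second image of the pair. t0 and t1 are the
    capture times of the two images; txn_start is the time the first item
    was rung up. Illustrative sketch only."""
    dxs = [x1 - x0 for (x0, _), (x1, _) in matches]
    dys = [y1 - y0 for (_, y0), (_, y1) in matches]
    xs = [x1 for _, (x1, _) in matches]   # bounding box in the second image
    ys = [y1 for _, (_, y1) in matches]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return {
        "horizontal_shift": median(dxs),         # (a) median horizontal displacement
        "vertical_shift": median(dys),           # (b) median vertical displacement
        "num_tracked_features": len(matches),    # (c) size of the subgroup
        "time_since_txn_start": t0 - txn_start,  # (e) capture time rel. to first scan
        "aspect_ratio": width / height if height else float("inf"),  # (f)
        "elapse_time": t1 - t0,                  # (g) interval between the pair
    }
```

The remaining parameters (entropy, edge score, rotation) would be computed from pixel data and the estimated transformation rather than from the feature coordinates alone.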
- the object detection processor 103 classifies 522 all the features of the pairs of images to distinguish whether an object is present on the cart and whether to alert the cashier.
- a group of one or more features is more likely to be classified as an object if the following classification criteria are satisfied: the horizontal motion exhibited by matching features of a pair of images exceeds a determined threshold; the vertical motion exhibited by matching features of a pair of images is zero or below a threshold; the rotation exhibited by matching features of a pair of images is zero or below a minimal threshold; the number of matching features exceeds a threshold indicating that they are statistically significant; the entropy of the area occupied by the matching features is minimal or below a threshold; the aspect ratio of a group of matching features is distinguishable from the aspect ratio of the structural members of the shopping cart; the elapse time is minimal or below a threshold; and/or the features are characterized by an edge score above a threshold.
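The criteria above amount to a conjunction of threshold tests on the classification parameters. A minimal sketch, with hypothetical parameter and threshold names (the actual thresholds would be tuned empirically):

```python
def is_valid_match_group(params, thresholds):
    """Apply the threshold tests described above to one subgroup's parameter
    dict. Both dictionaries use illustrative keys, not names from the patent."""
    return (
        abs(params["horizontal_shift"]) > thresholds["min_horizontal"]    # moving along the aisle
        and abs(params["vertical_shift"]) <= thresholds["max_vertical"]   # little vertical motion
        and abs(params.get("rotation", 0.0)) <= thresholds["max_rotation"]
        and params["num_tracked_features"] >= thresholds["min_features"]  # statistically significant
        and params.get("entropy", 0.0) <= thresholds["max_entropy"]
        and params["elapse_time"] <= thresholds["max_elapse"]
        and params.get("edge_score", 1.0) >= thresholds["min_edge_score"]
    )
```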
- the object detection processor 103 is configured to alert the cashier or other user of the presence of the item on the shopping cart or, if appropriate, intervene in the transaction.
- the type of alert or intervention depends on the number of subgroups of valid matching features, i.e., groups of features that exhibit the same affine motion, homography motion, or otherwise satisfy the classification criteria. If there is at least one group of matching features validated for each of the two pairs of images, decision block 524 is answered in the affirmative. If the valid matching features also exhibit the appropriate motion, i.e., horizontal motion in excess of a predetermined threshold, the probability of an object being on the cart is relatively high and decision block 526 is answered in the affirmative. The object detection processor 103 then temporarily suspends or interrupts 528 the purchase transaction and prompts the cashier to check for unaccounted-for items of merchandise on the cart.
- decision block 526 is answered in the negative and an alert 530 in the form of a visual, audio, or tactile cue is presented to the user through the checkout terminal 106 , for example.
- the alert 530 may also include the presentation of an image of the merchandise on the cart—with the associated detected object highlighted—so that the cashier can independently verify the presence, quantity, or identity of the merchandise therein.
- the cashier may also be alerted 530 in a similar manner if a valid match is detected in only one of the two pairs of images, in which case decision block 532 is answered in the affirmative. If there are no valid matching features in either pair of images, decision block 532 is answered in the negative, the absence of a detected object is confirmed 534 , and the customer transaction is allowed to proceed without an alert or interruption.
- groups of matching features are classified as valid or not based on the application of thresholds to a plurality of motion, appearance, and/or temporal classification parameters.
- the classification and validation are performed using other techniques including linear classification, 3-D Motion Estimation, Nearest Neighbor classification, Neural Network classification, and Vector Quantization classification, for example.
- in the 3-D Motion Estimation technique, rather than finding groups of matching features using an affine transformation (which models motion within the 2-D image plane), a full 3-D motion estimation can be used instead.
- Such methods are described in the book "Multiple View Geometry in Computer Vision" by Richard Hartley and Andrew Zisserman (Cambridge University Press, 2000), in chapters 8, 9, 10, and 11, which describe methods using multiple images from a single camera or the images of multiple cameras in combination.
- the result of such a computation is a set of matching features represented in the two images whose respective 3-D points in space move with a consistent 3-D rigid motion (a 3-D rotation and translation).
- the object represented by the features can then be verified to be an item or not based on the 3-D motion.
- an item in a shopping cart can translate in a direction parallel to the floor, but is unlikely to have a vertical translation component (unlike, say, a purse, a shoe, or pants).
- an item in a shopping cart is likely to rotate along an axis perpendicular to the floor, but should not have rotational components parallel to the floor.
- the method can also compute the 3-D structure (relative 3-D coordinates) of the object points represented by the matching features in the images, and an analysis of the 3-D structure can further be used to distinguish between items and non-items. For instance, if the 3-D structure lies primarily on a plane parallel to the floor, then it is likely to be the grill of the shopping cart rather than an item, since an item would tend to have structure extending out of that plane.
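The 3-D plausibility tests above can be sketched as a pair of geometric checks on the estimated rigid motion: the translation should be parallel to the floor and the rotation axis perpendicular to it. The coordinate convention (z pointing up), function name, and tolerance values below are illustrative assumptions:

```python
def plausible_item_motion(translation, rotation_axis, t_eps=0.1, a_eps=0.2):
    """Check whether an estimated 3-D rigid motion is consistent with an item
    riding on a cart: small vertical (z) translation component, and a rotation
    axis nearly perpendicular to the floor. Vectors are (x, y, z) with z up;
    names and tolerances are illustrative, not from the patent."""
    tx, ty, tz = translation
    ax, ay, az = rotation_axis
    # normalize the rotation axis so the test is scale-independent
    norm = (ax * ax + ay * ay + az * az) ** 0.5 or 1.0
    horizontal_translation = abs(tz) <= t_eps * max(abs(tx), abs(ty), 1e-9)
    vertical_axis = abs(az) / norm >= 1.0 - a_eps
    return horizontal_translation and vertical_axis
```

A purse being lifted, by contrast, would show a dominant vertical translation and fail the first test.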
- a set of sample images with groups of matching features are selected for purposes of training the classifier.
- the groups of matching features, referred to herein as training matches, are used to construct a classifier in the following manner.
- each training match is manually classified (ground-truthed) as either representing an item or not an item (positive and negative examples, respectively).
- a value of "1" is associated with each feature of a positive example, and a value of "0" associated with each feature of a negative example.
- the training matches are then used to classify groups of a sequence of images. For each feature of a group of matching features to be classified, a search is performed for the nearest neighbor feature from the set of training examples.
- the features of the training examples can be stored in a K-D tree, for example, for efficient searching.
- a threshold can be applied to decide whether to classify the group of matching features as an item or not.
- the threshold may be tuned upward to minimize false-positive detections, tuned downward to minimize false-negative detections, or set anywhere in between.
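The nearest-neighbor scheme above can be sketched as a per-feature vote followed by the tunable threshold. A K-D tree would replace the linear scan in practice; the function and parameter names here are illustrative:

```python
def classify_group(group_features, training_features, labels, threshold=0.5):
    """Classify a candidate group of features by nearest-neighbor vote.
    training_features: ground-truthed feature vectors; labels: 1 for features
    from positive (item) examples, 0 for negative. Returns True when the
    fraction of features voting "item" reaches the threshold. Sketch only."""
    def nearest_label(f):
        best, best_d = None, float("inf")
        for t, lab in zip(training_features, labels):
            d = sum((a - b) ** 2 for a, b in zip(f, t))  # squared Euclidean distance
            if d < best_d:
                best, best_d = lab, d
        return best
    votes = [nearest_label(f) for f in group_features]
    score = sum(votes) / len(votes)  # fraction of features nearest a positive example
    return score >= threshold
```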
- a neural network is employed to classify a set of matching features.
- an artificial neural network is trained to produce an output value of “1” for features from positive examples and “0” for features from negative examples.
- the neural network output, computed for each feature of the match, is averaged, and a threshold is applied to the averaged value.
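Given a trained network, the decision step reduces to averaging the per-feature outputs and thresholding the mean. A minimal sketch, where `net` stands in for any trained model mapping a feature vector to a score in [0, 1]:

```python
def classify_with_net(net, group_features, threshold=0.5):
    """Average a trained network's per-feature outputs (trained toward 1 for
    item features, 0 for non-item features) and threshold the mean.
    `net` is any callable; names are illustrative, not from the patent."""
    mean = sum(net(f) for f in group_features) / len(group_features)
    return mean >= threshold
```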
- Vector Quantization (VQ)
- a codebook refers to a discrete set of elements chosen to represent or quantize the infinite set of continuous-valued SIFT feature vectors.
- Each of the elements of the codebook is referred to as a codebook entry.
- a codebook is generated by mapping visual features extracted from a plurality of sample images into the associated feature space.
- a clustering algorithm is then applied to the map of features to identify clusters of features in proximity to one another as well as the centers of those clusters.
- the center points associated with the clusters are defined to be the codebook entries with which codebook histograms are formed.
- the N-dimensional space is diagrammatically represented in FIG. 6 , and each codebook entry depicted as a small circle containing a numerical identifier.
- a dimensionality-reduction algorithm may be applied to the codebook histogram to reduce the size of the N-dimensional space with minimal impact on the accuracy of the classification.
- the clustering algorithm is the k-means clustering technique taught in: “Pattern Classification” by Richard O. Duda, Peter E. Hart, and David G. Stork, pp: 526-528, 2nd Edition, John Wiley & Sons, Inc., New York, 2001, which is hereby incorporated by reference herein.
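The codebook-generation step above can be illustrated with a toy k-means loop whose converged cluster centers become the codebook entries. The simplistic first-k initialization and fixed iteration count are assumptions for brevity, not the cited Duda, Hart & Stork procedure verbatim:

```python
def build_codebook(features, k, iters=20):
    """Toy k-means over feature vectors: alternately assign each feature to
    its nearest center and recompute centers as cluster means. The resulting
    centers are the codebook entries. Illustrative sketch only."""
    centers = [tuple(f) for f in features[:k]]  # naive initialization
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(f, centers[c])))
            clusters[i].append(f)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old center
                centers[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers
```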
- the classifier is trained with training data that has been ground-truthed to affirmatively associate the data with an item of merchandise or a non-item.
- groups of matching features from training images are mapped into the feature space and the nearest-neighbor codebook entry identified.
- each SIFT feature is illustrated as a solid point and the nearest-neighbor codebook entry to which each feature collapses is indicated by an arrow.
- a codebook histogram is used to indicate the number of features of a match for which each codebook entry is the nearest neighbor.
- each element of the codebook is designated by an identifier on the horizontal axis.
- the number of features associated with each codebook entry is indicated by the count on the vertical axis. As shown, the first element of the codebook (identified by 1 on horizontal axis) has a count of three to indicate that three related features are present in the group of matching features used for training.
- the codebook histogram is also represented in vector format to the left of FIG. 7 .
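The histogram construction of FIG. 7 amounts to counting, for each codebook entry, how many features of the match collapse to it as nearest neighbor. A minimal sketch with illustrative names:

```python
def codebook_histogram(match_features, codebook):
    """Build the vector form of the codebook histogram: hist[i] is the number
    of features of the match whose nearest codebook entry is codebook[i].
    Illustrative sketch, not the patented implementation."""
    hist = [0] * len(codebook)
    for f in match_features:
        i = min(range(len(codebook)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(f, codebook[c])))
        hist[i] += 1
    return hist
```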
- a unique codebook histogram, referred to herein as a model histogram, is generated for each of a plurality of positive and negative examples identified by ground-truthing.
- actual image data is classified by extracting 810 the features from the video data of the shopping cart, identifying 820 at least one group of features exhibiting the same motion, appearance, and/or temporal criteria, generating 830 a candidate codebook histogram by locating the nearest neighbors for each of the features of the at least one group of matching features, and using 840 the candidate codebook histogram as the parameter for classification.
- the VQ classifier compares the candidate histogram to each of the plurality of model histograms to identify the most similar model codebook histogram.
- the VQ classifier then classifies the match as an item of merchandise depending on whether the model histogram identified is associated with a positive example or negative example.
- Various other methods can be used to compare the candidate codebook histogram of a match to the model codebook histograms from the ground-truthed samples including, for example, support vector machines, nearest neighbor classifiers, or neural networks.
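The simplest comparator described above, nearest model histogram, can be sketched as follows; the distance measure (squared Euclidean here) and names are illustrative, and an SVM or neural network could be substituted as noted:

```python
def vq_classify(candidate, models):
    """Return the label (1 = item, 0 = non-item) of the model codebook
    histogram most similar to the candidate histogram. `models` is a list of
    (histogram, label) pairs from ground-truthed examples. Sketch only."""
    def d2(h1, h2):
        return sum((a - b) ** 2 for a, b in zip(h1, h2))
    nearest = min(models, key=lambda m: d2(candidate, m[0]))
    return nearest[1]
```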
- One or more embodiments may be implemented with one or more computer readable media, wherein each medium may be configured to include thereon data or computer executable instructions for manipulating data. Computer executable instructions include data structures, objects, programs, routines, or other program modules that may be accessed by a processing system, such as one associated with a general-purpose computer or processor capable of performing various different functions, or one associated with a special-purpose computer capable of performing a limited number of functions.
- Computer executable instructions cause the processing system to perform a particular function or group of functions and are examples of program code means for implementing steps for methods disclosed herein.
- a particular sequence of the executable instructions provides an example of corresponding acts that may be used to implement such steps.
- Examples of computer readable media include random-access memory (“RAM”), read-only memory (“ROM”), programmable read-only memory (“PROM”), erasable programmable read-only memory (“EPROM”), electrically erasable programmable read-only memory (“EEPROM”), compact disk read-only memory (“CD-ROM”), or any other device or component that is capable of providing data or executable instructions that may be accessed by a processing system.
- Examples of mass storage devices incorporating computer readable media include hard disk drives, magnetic disk drives, tape drives, optical disk drives, and solid state memory chips.
- the term processor as used herein refers to a number of processing devices including general purpose computers, special purpose computers, application-specific integrated circuits (ASICs), and digital/analog circuits with discrete components, for example.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/811,211 US9064161B1 (en) | 2007-06-08 | 2007-06-08 | System and method for detecting generic items in image sequence |
Publications (1)
Publication Number | Publication Date |
---|---|
US9064161B1 true US9064161B1 (en) | 2015-06-23 |
Family
ID=53397165
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/811,211 Active 2028-04-06 US9064161B1 (en) | 2007-06-08 | 2007-06-08 | System and method for detecting generic items in image sequence |
Non-Patent Citations (7)
Bay et al., "SURF: Speeded Up Robust Features," Computer Vision and Image Understanding (CVIU), vol. 110, No. 3, pp. 346-359 (2008).
Duda et al., "Pattern Classification," 2nd Edition, pp. 526-528, John Wiley & Sons, Inc., New York, 2001.
Goncalves et al., U.S. Appl. No. 13/107,824, filed May 13, 2011, titled "Systems and Methods for Object Recognition Using a Large Database."
Aires, Kelson R. T., "Plane Detection Using Affine Homography," Department of Computing Engineering and Automation, Federal University of Rio Grande do Norte, Brazil, ftp://adelardo:web@users.dca.ufrn.br/artigos/CBA08a.pdf.
Lowe, "Local Feature View Clustering for 3D Object Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, Dec. 2001.
Lowe, "Object Recognition from Local Scale-Invariant Features," Proceedings of the International Conference on Computer Vision, Corfu, Greece, Sep. 1999.
Office action response filed in U.S. Appl. No. 13/107,824 on Jul. 3, 2014.
US20220377281A1 (en) * | 2021-05-20 | 2022-11-24 | Sigmastar Technology Ltd. | Object detection apparatus and method |
CN113296159B (en) * | 2021-05-20 | 2023-10-31 | 星宸科技股份有限公司 | Object sensing device and method |
CN113296159A (en) * | 2021-05-20 | 2021-08-24 | 厦门星宸科技有限公司 | Object sensing device and method |
US20220414900A1 (en) * | 2021-06-29 | 2022-12-29 | 7-Eleven, Inc. | Item identification using multiple cameras |
Similar Documents
Publication | Title
---|---
US9064161B1 (en) | System and method for detecting generic items in image sequence
US20220198550A1 (en) | System and methods for customer action verification in a shopping cart and point of sales
US8196822B2 (en) | Self checkout with visual recognition
CN111415461B (en) | Article identification method and system and electronic equipment
US9477955B2 (en) | Automatic learning in a merchandise checkout system with visual recognition
Santra et al. | A comprehensive survey on computer vision based approaches for automatic identification of products in retail store
US20090060259A1 (en) | Upc substitution fraud prevention
EP3128496B1 (en) | Vehicle identification method and system
US9740937B2 (en) | System and method for monitoring a retail environment using video content analysis with depth sensing
US20100110183A1 (en) | Automatically calibrating regions of interest for video surveillance
US9299229B2 (en) | Detecting primitive events at checkout
Fan et al. | Shelf detection via vanishing point and radial projection
WO2013033442A1 (en) | Methods and arrangements for identifying objects
JP2008538030A (en) | Method and apparatus for detecting suspicious behavior using video analysis
Bolme et al. | Simple real-time human detection using a single correlation filter
Schlecht et al. | Contour-based object detection
CN113468914A (en) | Method, device and equipment for determining purity of commodities
Siva et al. | Scene invariant crowd segmentation and counting using scale-normalized histogram of moving gradients (HoMG)
CN110705363B (en) | Commodity specification identification method and device
Popa et al. | Detecting customers' buying events on a real-life database
Jeong et al. | A comparison of keypoint detectors in the context of pedestrian counting
Fan et al. | Fast detection of retail fraud using polar touch buttons
US20240220999A1 (en) | Item verification systems and methods for retail checkout stands
Mansouri et al. | A new strategy based on spatiogram similarity association for multi-pedestrian tracking
Pan et al. | Soft margin keyframe comparison: enhancing precision of fraud detection in retail surveillance
Legal Events
Code | Title | Description
---|---|---
AS | Assignment | Owner name: EVOLUTION ROBOTICS RETAIL, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BOMAN, ROBERT;GONCALVES, LUIS;OSTROWSKI, JAMES;REEL/FRAME:024534/0245. Effective date: 20100527
AS | Assignment | Owner name: DATALOGIC ADC, INC., OREGON. Free format text: MERGER;ASSIGNOR:EVOLUTION ROBOTICS RETAIL, INC.;REEL/FRAME:034686/0102. Effective date: 20120531
STCF | Information on status: patent grant | Free format text: PATENTED CASE
CC | Certificate of correction |
FEPP | Fee payment procedure | Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 8