US20200234463A1 - Systems and methods to check-in shoppers in a cashier-less store - Google Patents

Systems and methods to check-in shoppers in a cashier-less store

Publication number
US20200234463A1
Authority
US
United States
Prior art keywords
subjects
mobile computing
identified
subject
linked
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US16/842,382
Other versions
US11200692B2 (en)
Inventor
Jordan E. Fisher
Warren Green
Daniel L. Fischetti
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Standard Cognition Corp
Original Assignee
Standard Cognition Corp
Priority claimed from US15/847,796 (US10055853B1)
Priority claimed from US15/907,112 (US10133933B1)
Priority claimed from US15/945,473 (US10474988B2)
Priority claimed from US16/255,573 (US10650545B2)
Application filed by Standard Cognition Corp
Priority to US16/842,382 (US11200692B2)
Publication of US20200234463A1
Priority to US17/383,303 (US11538186B2)
Publication of US11200692B2
Application granted
Assigned to STANDARD COGNITION, CORP. Assignment of assignors interest (see document for details). Assignors: FISHER, JORDAN E., GREEN, WARREN, FISCHETTI, DANIEL L.
Priority to US18/089,503 (US11810317B2)
Priority to US18/387,427 (US20240070895A1)
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units
    • G06K9/00771
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0853 Network architectures or network communication protocols for network security for authentication of entities using an additional device, e.g. smartcard, SIM or a different communication terminal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/131 Protocols for games, networked simulations or virtual reality
    • H04L67/18
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles
    • H04L67/306 User profiles
    • H04L67/38
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/52 Network services specially adapted for the location of the user terminal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/90 Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • H04N5/232
    • H04N5/247
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/02 Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII]
    • H04W12/0605
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/06 Authentication
    • H04W12/065 Continuous authentication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/33 Services specially adapted for particular environments, situations or purposes for indoor environments, e.g. buildings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80 Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/61 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/612 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast

Definitions

  • the present invention relates to systems that link subjects in an area of real space with user accounts linked with client applications executing on mobile computing devices.
  • Identifying subjects within an area of real space, such as people in a shopping store, and uniquely associating the identified subjects with real people or with authenticated accounts associated with responsible parties can present many technical challenges. For example, consider an image processing system deployed in a shopping store with multiple customers moving in aisles between the shelves and in open spaces within the store. Customers take items from shelves and put them in their respective shopping carts or baskets. Customers may also put items back on the shelf if they do not want them. Though the system may identify a subject in the images, and the items the subject takes, the system must also accurately identify an authentic user account responsible for the items taken by that subject.
  • One approach is facial recognition or another biometric recognition technique.
  • This approach requires access by the image processing system to databases storing the personal identifying biometric information, linked with the accounts. This is undesirable from a security and privacy standpoint in many settings.
  • a system and a method for operating the system are provided for linking subjects, such as persons in an area of real space, with user accounts.
  • the system can use image processing to identify subjects in the area of real space without requiring personal identifying biometric information.
  • the user accounts are linked with client applications executable on mobile computing devices. This function of linking identified subjects to user accounts by image and signal processing presents a complex problem of computer engineering, relating to the type of image and signal data to be processed, what processing of the image and signal data to perform, and how to determine actions from the image and signal data with high reliability.
  • a system and method are provided for linking subjects in an area of real space with user accounts.
  • the user accounts are linked with client applications executable on mobile computing devices.
  • a plurality of cameras or other sensors produce respective sequences of images in corresponding fields of view in the real space.
  • a system and method are described for determining locations of identified subjects represented in the images and matching the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space and matching locations of the mobile devices with locations of the subjects.
  • the mobile devices emit signals usable to indicate locations of the mobile devices in the area of real space.
  • the system matches the identified subjects with user accounts by identifying locations of mobile devices using the emitted signals.
  • the signals emitted by the mobile devices comprise images.
  • the client applications on the mobile devices cause display of semaphore images, which can be as simple as a particular color, on the mobile devices in the area of real space.
  • the system matches the identified subjects with user accounts by identifying locations of mobile devices by using an image recognition engine that determines locations of the mobile devices displaying semaphore images.
  • the system includes a set of semaphore images.
  • the system accepts login communications from a client application on a mobile device identifying a user account before matching the user account to an identified subject in the area of real space. After accepting login communications, the system sends a selected semaphore image from the set of semaphore images to the client application on the mobile device.
  • the system sets a status of the selected semaphore image as assigned.
  • the system receives a displayed image of the selected semaphore image, recognizes the displayed image and matches the recognized image with the assigned images from the set of semaphore images.
  • the system matches a location of the mobile device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject.
  • after matching the user account to the identified subject, the system sets the status of the recognized semaphore image to available.
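  • The semaphore image lifecycle described above (assign an image after login, recognize it in the camera images, resolve it to the account, then release it) can be sketched in Python as follows. This is a minimal illustration only: the class name SemaphorePool, its methods, and the color identifiers are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SemaphorePool:
    """Tracks which semaphore images are available or assigned (hypothetical helper)."""
    # Maps semaphore image id -> user account id, or None when the image is available.
    assignments: Dict[str, Optional[str]] = field(default_factory=dict)

    def assign(self, account_id: str) -> Optional[str]:
        """After login, pick an available image and mark it as assigned."""
        for image_id, owner in self.assignments.items():
            if owner is None:
                self.assignments[image_id] = account_id
                return image_id
        return None  # no image is currently available

    def resolve(self, recognized_image_id: str) -> Optional[str]:
        """Return the account holding a recognized image, or None if it is unassigned."""
        return self.assignments.get(recognized_image_id)

    def release(self, image_id: str) -> None:
        """After the account is linked to a subject, mark the image as available again."""
        if image_id in self.assignments:
            self.assignments[image_id] = None


pool = SemaphorePool({"red": None, "green": None, "blue": None})
image = pool.assign("account-42")   # image sent to the client application
account = pool.resolve(image)       # image recognized on a device in the camera images
pool.release(image)                 # account linked to a subject; image can be reused
```

  • Releasing the image after linking keeps the set of semaphore images small, since an image only needs to be unique among accounts that are logged in but not yet linked.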
  • the signals emitted by the mobile devices comprise radio signals indicating a service location of the mobile device.
  • the system receives location data transmitted by the client applications on the mobile devices.
  • the system matches the identified subjects with user accounts using the location data transmitted from the mobile devices.
  • the system uses the location data transmitted from the mobile device from a plurality of locations over a time interval in the area of real space to match the identified subjects with user accounts.
  • Matching the identified unmatched subject with the user account of the client application executing on the mobile device includes determining that all other mobile devices transmitting location information of unmatched user accounts are separated from the mobile device by a predetermined distance, and determining the closest unmatched identified subject to the mobile device.
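  • The two checks in the service-location matching described above can be sketched as follows. This is an illustrative reading, not the claimed implementation: the function name, the two-dimensional coordinates, and the interpretation of "separated by a predetermined distance" as "at least min_separation apart" are assumptions.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def match_device_to_subject(
    device_id: str,
    device_locations: Dict[str, Point],   # unmatched devices -> reported location
    subject_locations: Dict[str, Point],  # unmatched subject_id -> tracked location
    min_separation: float,                # the "predetermined distance" (assumed meters)
) -> Optional[str]:
    """Return the closest unmatched subject, or None if another device is too near."""
    here = device_locations[device_id]
    # Check 1: every other unmatched device must be at least min_separation away.
    for other_id, other in device_locations.items():
        if other_id != device_id and math.dist(here, other) < min_separation:
            return None  # ambiguous: defer matching until the devices separate
    if not subject_locations:
        return None
    # Check 2: pick the unmatched subject closest to the device.
    return min(subject_locations, key=lambda s: math.dist(here, subject_locations[s]))
```

  • Returning None when devices are close together reflects the idea that a match is only attempted when it is unambiguous; a later location report can retry the match.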
  • the signals emitted by the mobile devices comprise radio signals indicating acceleration and orientation of the mobile device.
  • such acceleration data is generated by an accelerometer of the mobile computing device.
  • direction data from a compass on the mobile device is also received by the processing system.
  • the system receives the accelerometer data from the client applications on the mobile devices.
  • the system matches the identified subjects with user accounts using the accelerometer data transmitted from the mobile device.
  • the system uses the accelerometer data transmitted from the mobile device from a plurality of locations over a time interval in the area of real space, together with a derivative of the data indicating the locations of identified subjects over the same time interval, to match the identified subjects with user accounts.
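  • One way to compare accelerometer data with the derivative of subject locations is to bring both into velocity form and score their difference. The sketch below is a simplified illustration under stated assumptions: the device starts at rest, accelerometer samples are already expressed in the store's coordinate frame, both series share the sampling interval dt, and the squared-difference score is a stand-in rather than the metric used in the disclosure. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, float]


def velocities_from_positions(positions: Sequence[Vec], dt: float) -> List[Vec]:
    """Finite-difference derivative of a subject's tracked positions."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(positions, positions[1:])]


def velocities_from_acceleration(accel: Sequence[Vec], dt: float) -> List[Vec]:
    """Integrate accelerometer samples, assuming the device starts at rest."""
    vx = vy = 0.0
    out: List[Vec] = []
    for ax, ay in accel:
        vx += ax * dt
        vy += ay * dt
        out.append((vx, vy))
    return out


def best_subject(accel: Sequence[Vec], tracks: Dict[str, Sequence[Vec]], dt: float) -> str:
    """Pick the subject whose velocity profile is closest to the device's."""
    device_v = velocities_from_acceleration(accel, dt)

    def score(subject_id: str) -> float:
        subject_v = velocities_from_positions(tracks[subject_id], dt)
        # Sum of squared velocity differences over the shared time steps.
        return sum((dx - sx) ** 2 + (dy - sy) ** 2
                   for (dx, dy), (sx, sy) in zip(device_v, subject_v))

    return min(tracks, key=score)
```

  • Integrating raw accelerometer samples drifts over long intervals, so a practical system would likely work over short windows; that refinement is omitted here.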
  • the system matches the identified subjects with user accounts using a trained network to identify locations of mobile devices in the area of real space based on the signals emitted by the mobile devices.
  • the signals emitted by the mobile devices include location data and accelerometer data.
  • the system includes log data structures including a list of inventory items for the identified subjects.
  • the system associates the log data structure for the matched identified subject to the user account for the identified subject.
  • the system processes a payment for the list of inventory items for the identified subject from a payment method identified in the user account linked to the identified subject.
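  • The log data structure and its association with a user account can be pictured with the following sketch. The field names, the use of integer cents, and the returned charge request are assumptions made for illustration; the disclosure does not specify them, and no real payment API is called.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LogEntry:
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class SubjectLog:
    """Inventory items attributed to one identified subject (hypothetical layout)."""
    subject_id: str
    items: List[LogEntry] = field(default_factory=list)
    account_id: Optional[str] = None  # set once the subject is matched to an account

    def total_cents(self) -> int:
        return sum(e.quantity * e.unit_price_cents for e in self.items)


def link_and_charge(log: SubjectLog, account_id: str,
                    payment_methods: Dict[str, str]) -> Dict[str, object]:
    """Associate the log with the account and build a charge request (not executed)."""
    log.account_id = account_id
    return {"method": payment_methods[account_id], "amount_cents": log.total_cents()}
```

  • Keeping account_id empty until matching succeeds mirrors the description: items are logged against an anonymous subject first and attributed to an account only after the link is made.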
  • the system matches the identified subjects with user accounts without use of personal identifying biometric information associated with the user accounts.
  • FIG. 1 illustrates an architectural level schematic of a system in which a matching engine links subjects identified by a subject tracking engine to user accounts linked with client applications executing on mobile devices.
  • FIG. 2 is a side view of an aisle in a shopping store illustrating a subject with a mobile computing device and a camera arrangement.
  • FIG. 3 is a top view of the aisle of FIG. 2 in a shopping store illustrating the subject with the mobile computing device and the camera arrangement.
  • FIG. 4 shows an example data structure for storing joints information of subjects.
  • FIG. 5 shows an example data structure for storing a subject including the information of associated joints.
  • FIG. 6 is a flowchart showing process steps for matching an identified subject to a user account using a semaphore image displayed on a mobile computing device.
  • FIG. 7 is a flowchart showing process steps for matching an identified subject to a user account using service location of a mobile computing device.
  • FIG. 8 is a flowchart showing process steps for matching an identified subject to a user account using velocity of subjects and a mobile computing device.
  • FIG. 9A is a flowchart showing a first part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 9B is a flowchart showing a second part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 9C is a flowchart showing a third part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 10 is an example architecture in which the four techniques presented in FIGS. 6 to 9C are applied in an area of real space to reliably match an identified subject to a user account.
  • FIG. 11 is a camera and computer hardware arrangement configured for hosting the matching engine of FIG. 1.
  • A system and various implementations of the subject technology are described with reference to FIGS. 1-11.
  • the system and processes are described with reference to FIG. 1, an architectural level schematic of a system in accordance with an implementation. Because FIG. 1 is an architectural diagram, certain details are omitted to improve the clarity of the description.
  • The discussion of FIG. 1 is organized as follows. First, the elements of the system are described, followed by their interconnections. Then, the use of the elements in the system is described in greater detail.
  • FIG. 1 provides a block diagram level illustration of a system 100.
  • the system 100 includes cameras 114, network nodes hosting image recognition engines 112a, 112b, and 112n, a subject tracking engine 110 deployed in a network node 102 (or nodes) on the network, mobile computing devices 118a, 118b, and 118m (collectively referred to as mobile computing devices 120), a training database 130, a subject database 140, a user account database 150, an image database 160, a matching engine 170 deployed in a network node or nodes (also known as a processing platform) 103, and a communication network or networks 181.
  • the network nodes can host only one image recognition engine, or several image recognition engines.
  • the system can also include an inventory database and other supporting data.
  • a network node is an addressable hardware device or virtual device that is attached to a network, and is capable of sending, receiving, or forwarding information over a communications channel to or from other network nodes.
  • Examples of electronic devices which can be deployed as hardware network nodes include all varieties of computers, workstations, laptop computers, handheld computers, and smartphones.
  • Network nodes can be implemented in a cloud-based server system. More than one virtual device configured as a network node can be implemented using a single physical device.
  • For the sake of clarity, only three network nodes hosting image recognition engines are shown in the system 100. However, any number of network nodes hosting image recognition engines can be connected to the subject tracking engine 110 through the network(s) 181. Similarly, three mobile computing devices are shown in the system 100. However, any number of mobile computing devices can be connected to the network node 103 hosting the matching engine 170 through the network(s) 181. Also, an image recognition engine, a subject tracking engine, a matching engine and other processing engines described herein can execute using more than one network node in a distributed architecture.
  • Network(s) 181 couples the network nodes 101a, 101b, and 101n, respectively, hosting image recognition engines 112a, 112b, and 112n, the network node 102 hosting the subject tracking engine 110, the mobile computing devices 118a, 118b, and 118m, the training database 130, the subject database 140, the user account database 150, the image database 160, and the network node 103 hosting the matching engine 170.
  • Cameras 114 are connected to the subject tracking engine 110 through network nodes hosting image recognition engines 112a, 112b, and 112n.
  • the cameras 114 are installed in a shopping store such that sets of cameras 114 (two or more) with overlapping fields of view are positioned over each aisle to capture images of real space in the store.
  • two cameras are arranged over aisle 116a, two cameras are arranged over aisle 116b, and three cameras are arranged over aisle 116n.
  • the cameras 114 are installed over aisles with overlapping fields of view.
  • the cameras are configured with the goal that customers moving in the aisles of the shopping store are present in the field of view of two or more cameras at any moment in time.
  • Cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate.
  • the cameras 114 can send respective continuous streams of images at a predetermined rate to network nodes hosting image recognition engines 112a-112n. Images captured in all the cameras covering an area of real space at the same time, or close in time, are synchronized in the sense that the synchronized images can be identified in the processing engines as representing different views of subjects having fixed positions in the real space.
  • the cameras send image frames at a rate of 30 frames per second (fps) to respective network nodes hosting image recognition engines 112a-112n.
  • Each frame has a timestamp, identity of the camera (abbreviated as “camera_id”), and a frame identity (abbreviated as “frame_id”) along with the image data.
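  • The per-frame metadata and the notion of synchronized images can be illustrated with a small sketch. The field names camera_id and frame_id follow the abbreviations given above; the Frame class, the bucketing by capture slot, and the default of 30 fps are illustrative assumptions rather than details from the disclosure.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Frame:
    timestamp: float  # capture time in seconds
    camera_id: str
    frame_id: int
    image: bytes      # raw or encoded image data


def group_synchronized(frames: Iterable[Frame], fps: float = 30.0) -> Dict[int, List[Frame]]:
    """Bucket frames by capture slot so views taken close in time are handled together."""
    slots: Dict[int, List[Frame]] = defaultdict(list)
    for frame in frames:
        # Frames whose timestamps round to the same slot index are treated as synchronized.
        slots[round(frame.timestamp * fps)].append(frame)
    return dict(slots)
```

  • With this grouping, each slot holds the different camera views of the same instant, which is the input a downstream tracking step would need to combine views of one subject.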
  • Other embodiments of the technology disclosed can use different types of sensors such as infrared or RF image sensors, ultrasound sensors, thermal sensors, Lidars, etc., to generate this data.
  • Multiple types of sensors can be used, including for example ultrasound or RF sensors in addition to the cameras 114 that generate RGB color output.
  • Multiple sensors can be synchronized in time with each other, so that frames are captured by the sensors at the same time, or close in time, and at the same frame capture rate.
  • sensors other than cameras, or sensors of multiple types can be used to produce the sequences of images utilized.
  • Cameras installed over an aisle are connected to respective image recognition engines.
  • the two cameras installed over the aisle 116a are connected to the network node 101a hosting an image recognition engine 112a.
  • the two cameras installed over aisle 116b are connected to the network node 101b hosting an image recognition engine 112b.
  • In the illustrated example, each image recognition engine 112a-112n, hosted in a network node or nodes 101a-101n, separately processes the image frames received from one camera.
  • each image recognition engine 112a, 112b, and 112n is implemented as a deep learning algorithm such as a convolutional neural network (abbreviated CNN).
  • the CNN is trained using a training database 130.
  • image recognition of subjects in the real space is based on identifying and grouping joints recognizable in the images, where the groups of joints can be attributed to an individual subject.
  • the training database 130 has a large collection of images for each of the different types of joints for subjects.
  • the subjects are the customers moving in the aisles between the shelves.
  • During training of the CNN, the system 100 is referred to as a “training system.” After training the CNN using the training database 130, the CNN is switched to production mode to process images of customers in the shopping store in real time.
  • In production mode, the system 100 is referred to as a runtime system (also referred to as an inference system).
  • the CNN in each image recognition engine produces arrays of joints data structures for images in its respective stream of images.
  • an array of joints data structures is produced for each processed image, so that each image recognition engine 112a-112n produces an output stream of arrays of joints data structures.
  • These arrays of joints data structures from cameras having overlapping fields of view are further processed to form groups of joints, and to identify such groups of joints as subjects. These groups of joints may not uniquely identify the individual in the image, or an authentic user account for the individual in the image, but can be used to track a subject in the area.
  • the subjects can be identified and tracked by the system using an identifier “subject_id” during their presence in the area of real space.
  • the system identifies the customer using joints analysis as described above, and the customer is assigned a “subject_id”.
  • This identifier is, however, not linked to the real-world identity of the subject, such as a user account, name, driver's license, email address, mailing address, credit card number, or bank account number, or to biometric identification such as fingerprints, facial recognition, hand geometry, retina scan, iris scan, or voice recognition. Therefore, the identified subject is anonymous. Details of an example technology for subject identification and tracking are presented in U.S. Pat. No. 10,055,853, issued 21 Aug. 2018, titled “Subject Identification and Tracking Using Image Recognition Engine,” which is incorporated herein by reference as if fully set forth herein.
  • the subject tracking engine 110 hosted on the network node 102 receives, in this example, continuous streams of arrays of joints data structures for the subjects from image recognition engines 112a-112n.
  • the subject tracking engine 110 processes the arrays of joints data structures and translates the coordinates of the elements in the arrays of joints data structures corresponding to images in different sequences into candidate joints having coordinates in the real space.
  • the combination of candidate joints identified throughout the real space can be considered, for the purposes of analogy, to be like a galaxy of candidate joints.
  • movement of the candidate joints is recorded so that the galaxy changes over time.
  • the output of the subject tracking engine 110 is stored in the subject database 140.
  • the subject tracking engine 110 uses logic to identify groups or sets of candidate joints having coordinates in real space as subjects in the real space.
  • each set of candidate joints is like a constellation of candidate joints at each point in time.
  • the constellations of candidate joints can move over time.
  • the logic to identify sets of candidate joints comprises heuristic functions based on physical relationships amongst joints of subjects in real space. These heuristic functions are used to identify sets of candidate joints as subjects.
  • the sets of candidate joints comprise individual candidate joints that have relationships according to the heuristic parameters with other individual candidate joints and subsets of candidate joints in a given set that has been identified, or can be identified, as an individual subject.
  • the system processes payment of items bought by the customer.
  • the system has to link the customer with a “user account” containing preferred payment method provided by the customer.
  • the “identified subject” is anonymous because information about the joints and relationships among the joints is not stored as biometric identifying information linked to an individual or to a user account.
  • the system includes a matching engine 170 (hosted on the network node 103 ) to process signals received from mobile computing devices 120 (carried by the subjects) to match the identified subjects with user accounts.
  • the matching can be performed by identifying locations of mobile devices executing client applications in the area of real space (e.g., the shopping store) and matching locations of mobile devices with locations of subjects, without use of personal identifying biometric information from the images.
  • the actual communication path to the network node 103 hosting the matching engine 170 through the network 181 can be point-to-point over public and/or private networks.
  • the communications can occur over a variety of networks 181 , e.g., private networks, VPN, MPLS circuit, or Internet, and can use appropriate application programming interfaces (APIs) and data interchange formats, e.g., Representational State Transfer (REST), JavaScript™ Object Notation (JSON), Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Java™ Message Service (JMS), and/or Java Platform Module System. All of the communications can be encrypted.
  • the communication is generally over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN)), Session Initiation Protocol (SIP), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet, inclusive of the mobile Internet, via protocols such as EDGE, 3G, 4G LTE, Wi-Fi, and WiMAX. Additionally, a variety of authorization and authentication techniques, such as username/password, Open Authorization (OAuth), Kerberos, SecureID, digital certificates and more, can be used to secure the communications.
  • the technology disclosed herein can be implemented in the context of any computer-implemented system including a database system, a multi-tenant environment, or a relational database implementation like an Oracle™ compatible database implementation, an IBM DB2 Enterprise Server™ compatible relational database implementation, a MySQL™ or PostgreSQL™ compatible relational database implementation or a Microsoft SQL Server™ compatible relational database implementation or a NoSQL™ non-relational database implementation such as a Vampire™ compatible non-relational database implementation, an Apache Cassandra™ compatible non-relational database implementation, a BigTable™ compatible non-relational database implementation or an HBase™ or DynamoDB™ compatible non-relational database implementation.
  • the technology disclosed can be implemented using different programming models like MapReduce™, bulk synchronous programming, MPI primitives, etc. or different scalable batch and stream management systems like Apache Storm™, Apache Spark™, Apache Kafka™, Apache Flink™, Truviso™, Amazon Elasticsearch Service™, Amazon Web Services™ (AWS), IBM Info-Sphere™, Borealis™, and Yahoo! S4™.
  • the cameras 114 are arranged to track multi-joint subjects (or entities) in a three-dimensional (abbreviated as 3D) real space.
  • the real space can include the area of the shopping store where items for sale are stacked in shelves.
  • a point in the real space can be represented by an (x, y, z) coordinate system.
  • Each point in the area of real space for which the system is deployed is covered by the fields of view of two or more cameras 114 .
  • FIG. 2 shows an arrangement of shelves, forming an aisle 116 a, viewed from one end of the aisle 116 a.
  • Two cameras, camera A 206 and camera B 208 , are positioned over the aisle 116 a at a predetermined distance from a roof 230 and a floor 220 of the shopping store, above the inventory display structures, such as shelves.
  • the cameras 114 comprise cameras disposed over and having fields of view encompassing respective parts of the inventory display structures and floor area in the real space.
  • a subject 240 is holding the mobile computing device 118 a and standing on the floor 220 in the aisle 116 a.
  • the mobile computing device can send and receive signals through the wireless network(s) 181 .
  • the mobile computing devices 120 communicate through a wireless network using for example a Wi-Fi protocol, or other wireless protocols like Bluetooth, ultra-wideband, and ZigBee, through wireless access points (WAP) 250 and 252 .
  • the real space can include all of the floor 220 in the shopping store from which inventory can be accessed.
  • Cameras 114 are placed and oriented such that areas of the floor 220 and shelves can be seen by at least two cameras.
  • the cameras 114 also cover at least part of the shelves 202 and 204 and floor space in front of the shelves 202 and 204 .
  • Camera angles are selected to have both steep perspective, straight down, and angled perspectives that give more full body images of the customers.
  • the cameras 114 are configured at an eight (8) foot height or higher throughout the shopping store.
  • the cameras 206 and 208 have overlapping fields of view, covering the space between a shelf A 202 and a shelf B 204 with overlapping fields of view 216 and 218 , respectively.
  • a location in the real space is represented as a (x, y, z) point of the real space coordinate system.
  • “x” and “y” represent positions on a two-dimensional (2D) plane which can be the floor 220 of the shopping store.
  • the value “z” is the height of the point above the 2D plane at floor 220 in one configuration.
  • FIG. 3 illustrates the aisle 116 a viewed from the top of FIG. 2 , further showing an example arrangement of the positions of cameras 206 and 208 over the aisle 116 a.
  • the cameras 206 and 208 are positioned closer to opposite ends of the aisle 116 a.
  • the camera A 206 is positioned at a predetermined distance from the shelf A 202 and the camera B 208 is positioned at a predetermined distance from the shelf B 204 .
  • the cameras are positioned at equal distances from each other.
  • two cameras are positioned close to the opposite ends and a third camera is positioned in the middle of the aisle. It is understood that a number of different camera arrangements are possible.
  • the image recognition engines 112 a - 112 n receive the sequences of images from cameras 114 and process images to generate corresponding arrays of joints data structures. In one embodiment, the image recognition engines 112 a - 112 n identify one of the 19 possible joints of each subject at each element of the image. The possible joints can be grouped in two categories: foot joints and non-foot joints. The 19th type of joint classification is for all non-joint features of the subject (i.e. elements of the image not classified as a joint).
  • An array of joints data structures for a particular image classifies elements of the particular image by joint type, time of the particular image, and the coordinates of the elements in the particular image.
  • the image recognition engines 112 a - 112 n are convolutional neural networks (CNN)
  • the joint type is one of the 19 types of joints of the subjects
  • the time of the particular image is the timestamp of the image generated by the source camera 114 for the particular image
  • the coordinates (x, y) identify the position of the element on a 2D image plane.
  • the output of the CNN is a matrix of confidence arrays for each image per camera.
  • the matrix of confidence arrays is transformed into an array of joints data structures.
  • a joints data structure 400 as shown in FIG. 4 is used to store the information of each joint.
  • the joints data structure 400 identifies x and y positions of the element in the particular image in the 2D image space of the camera from which the image is received.
  • a joint number identifies the type of joint identified. For example, in one embodiment, the values range from 1 to 19.
  • a value of 1 indicates that the joint is a left ankle
  • a value of 2 indicates the joint is a right ankle and so on.
  • the type of joint is selected using the confidence array for that element in the output matrix of CNN. For example, in one embodiment, if the value corresponding to the left-ankle joint is highest in the confidence array for that image element, then the value of the joint number is “1”.
  • a confidence number indicates the degree of confidence of the CNN in predicting that joint. If the value of confidence number is high, it means the CNN is confident in its prediction.
  • An integer-Id is assigned to the joints data structure to uniquely identify it.
  • the output matrix of confidence arrays per image is converted into an array of joints data structures for each image.
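The conversion from a matrix of confidence arrays to an array of joints data structures can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Joint` and `to_joints` and the nested-list layout `confidence_matrix[y][x]` are hypothetical, and the fields simply mirror those listed for the joints data structure 400 (x and y positions, joint number, confidence number, integer ID).

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, List, Optional, Sequence


@dataclass
class Joint:
    """Hypothetical stand-in for the joints data structure 400."""
    integer_id: int    # unique identifier of this joints data structure
    x: int             # x position of the element in the 2D image
    y: int             # y position of the element in the 2D image
    joint_number: int  # 1 = left ankle, 2 = right ankle, and so on up to 19
    confidence: float  # confidence of the classifier in the chosen joint type


def to_joints(
    confidence_matrix: Sequence[Sequence[Sequence[float]]],
    id_source: Optional[Iterator[int]] = None,
) -> List[Joint]:
    """Pick, for every image element, the joint type with the highest confidence.

    confidence_matrix[y][x] is assumed to be a list of 19 confidence values,
    one per joint type. Pass a shared id_source to keep IDs unique across images.
    """
    ids = id_source if id_source is not None else count(1)
    joints = []
    for y, row in enumerate(confidence_matrix):
        for x, conf in enumerate(row):
            best = max(range(len(conf)), key=conf.__getitem__)
            # Joint numbers are 1-based in the text, list indices are 0-based.
            joints.append(Joint(next(ids), x, y, best + 1, conf[best]))
    return joints
```

For example, an element whose confidence array peaks at the first position is emitted with joint number 1 (left ankle), matching the rule described above.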
  • the joints analysis includes performing a combination of k-nearest neighbors, mixture of Gaussians, and various image morphology transformations on each input image. The result comprises arrays of joints data structures which can be stored in the form of a bit mask in a ring buffer that maps image numbers to bit masks at each moment in time.
  • the tracking engine 110 is configured to receive arrays of joints data structures generated by the image recognition engines 112 a - 112 n corresponding to images in sequences of images from cameras having overlapping fields of view.
  • the arrays of joints data structures per image are sent by image recognition engines 112 a - 112 n to the tracking engine 110 via the network(s) 181 .
  • the tracking engine 110 translates the coordinates of the elements in the arrays of joints data structures corresponding to images in different sequences into candidate joints having coordinates in the real space.
  • the tracking engine 110 comprises logic to identify sets of candidate joints having coordinates in real space (constellations of joints) as subjects in the real space.
  • the tracking engine 110 accumulates arrays of joints data structures from the image recognition engines for all the cameras at a given moment in time and stores this information as a dictionary in the subject database 140 , to be used for identifying a constellation of candidate joints.
  • the dictionary can be arranged in the form of key-value pairs, where keys are camera ids and values are arrays of joints data structures from the camera.
  • this dictionary is used in heuristics-based analysis to determine candidate joints and for assignment of joints to subjects.
  • a high-level input, processing and output of the tracking engine 110 is illustrated in table 1. Details of the logic applied by the subject tracking engine 110 to create subjects by combining candidate joints and track movement of subjects in the area of real space are presented in U.S. Pat. No. 10,055,853, issued 21 Aug. 2018, titled, “Subject Identification and Tracking Using Image Recognition Engine” which is incorporated herein by reference.
  • TABLE 1
    Inputs: Arrays of joints data structures per image and, for each joints data structure: unique ID, confidence number, joint number, and (x, y) position in image space.
    Processing: Create joints dictionary. Reproject joint positions in the fields of view of cameras with overlapping fields of view to candidate joints.
    Output: List of identified subjects in the real space at a moment in time.
  • the subject tracking engine 110 uses heuristics to connect joints of subjects identified by the image recognition engines 112 a - 112 n. In doing so, the subject tracking engine 110 creates new subjects and updates the locations of existing subjects by updating their respective joint locations.
  • the subject tracking engine 110 uses triangulation techniques to project the locations of joints from 2D space coordinates (x, y) to 3D real space coordinates (x, y, z).
  • FIG. 5 shows the subject data structure 500 used to store the subject.
  • the subject data structure 500 stores the subject related data as a key-value dictionary.
  • the key is a frame_number and the value is another key-value dictionary where key is the camera_id and value is a list of 18 joints (of the subject) with their locations in the real space.
  • the subject data is stored in the subject database 140 . Every new subject is also assigned a unique identifier that is used to access the subject's data in the subject database 140 .
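The nested key-value layout of the subject data structure 500 can be sketched as below. This is only an illustrative sketch: the class name `Subject`, its method names, and the use of string camera identifiers are assumptions, while the frame_number to camera_id to list-of-18-joints nesting follows the description above.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]
JOINTS_PER_SUBJECT = 18  # per the description of the subject data structure


class Subject:
    """Hypothetical stand-in for the subject data structure 500."""

    def __init__(self, subject_id: int) -> None:
        self.subject_id = subject_id  # unique identifier used for lookups
        # frame_number -> camera_id -> list of 18 joint locations in real space
        self.frames: Dict[int, Dict[str, List[Point3D]]] = {}

    def update(self, frame_number: int, camera_id: str,
               joints: List[Point3D]) -> None:
        """Store the joint locations seen by one camera in one frame."""
        if len(joints) != JOINTS_PER_SUBJECT:
            raise ValueError("expected 18 joint locations")
        self.frames.setdefault(frame_number, {})[camera_id] = list(joints)

    def joints_at(self, frame_number: int, camera_id: str) -> List[Point3D]:
        """Return the joint locations for a frame and camera, or an empty list."""
        return self.frames.get(frame_number, {}).get(camera_id, [])
```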
  • the system identifies joints of a subject and creates a skeleton of the subject.
  • the skeleton is projected into the real space indicating the position and orientation of the subject in the real space. This is also referred to as “pose estimation” in the field of machine vision.
  • the system displays orientations and positions of subjects in the real space on a graphical user interface (GUI).
  • the image analysis is anonymous, i.e., a unique identifier assigned to a subject created through joints analysis does not identify personal identification of the subject as described above.
  • the matching engine 170 includes logic to match the identified subjects with their respective user accounts by identifying locations of mobile devices (carried by the identified subjects) that are executing client applications in the area of real space.
  • the matching engine uses multiple techniques, independently or in combination, to match the identified subjects with the user accounts.
  • the system can be implemented without maintaining biometric identifying information about users, so that biometric information about account holders is not exposed to security and privacy concerns raised by distribution of such information.
  • a customer logs in to the system using a client application executing on a personal mobile computing device upon entering the shopping store, identifying an authentic user account to be associated with the client application on the mobile device.
  • the system then sends a “semaphore” image selected from the set of unassigned semaphore images in the image database 160 to the client application executing on the mobile device.
  • the semaphore image is unique to the client application in the shopping store as the same image is not freed for use with another client application in the store until the system has matched the user account to an identified subject. After that matching, the semaphore image becomes available for use again.
  • the client application causes the mobile device to display the semaphore image, which display of the semaphore image is a signal emitted by the mobile device to be detected by the system.
  • the matching engine 170 uses the image recognition engines 112 a - n or a separate image recognition engine (not shown in FIG. 1 ) to recognize the semaphore image and determine the location of the mobile computing device displaying the semaphore in the shopping store.
  • the matching engine 170 matches the location of the mobile computing device to a location of an identified subject.
  • the matching engine 170 then links the identified subject (stored in the subject database 140 ) to the user account (stored in the user account database 150 ) linked to the client application for the duration in which the subject is present in the shopping store.
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process.
  • the matching engine 170 uses other signals in the alternative or in combination from the mobile computing devices 120 to link the identified subjects to user accounts. Examples of such signals include a service location signal identifying the position of the mobile computing device in the area of the real space, speed and orientation of the mobile computing device obtained from the accelerometer and compass of the mobile computing device, etc.
  • the system can use biometric information to assist matching a not-yet-linked identified subject to a user account.
  • the system stores “hair color” of the customer in his or her user account record.
  • the system might use for example hair color of subjects as an additional input to disambiguate and match the subject to a user account. If the user has red colored hair and there is only one subject with red colored hair in the area of real space or in close proximity of the mobile computing device, then the system might select the subject with red hair color to match the user account.
  • FIGS. 6 to 9C present process steps of four techniques usable alone or in combination by the matching engine 170 .
  • FIG. 6 is a flowchart 600 presenting process steps for a first technique for matching identified subjects in the area of real space with their respective user accounts.
  • the subjects are customers (or shoppers) moving in the store in aisles between shelves and other open spaces.
  • the process starts at step 602 .
  • the subject opens a client application on a mobile computing device and attempts to login.
  • the system verifies the user credentials at step 604 (for example, by querying the user account database 150 ) and accepts login communication from the client application to associate an authenticated user account with the mobile computing device.
  • the system determines that the user account of the client application is not yet linked to an identified subject.
  • the system sends a semaphore image to the client application for display on the mobile computing device at step 606 .
  • semaphore images include various shapes of solid colors such as a red rectangle or a pink elephant, etc.
  • a variety of images can be used as semaphores, preferably suited for high confidence recognition by the image recognition engine.
  • Each semaphore image can have a unique identifier.
  • the processing system includes logic to accept login communications from a client application on a mobile device identifying a user account before matching the user account to an identified subject in the area of real space, and after accepting login communications sends a selected semaphore image from the set of semaphore images to the client application on the mobile device.
  • the system selects an available semaphore image from the image database 160 for sending to the client application.
  • the system changes the status of the semaphore image in the image database 160 to “assigned” so that this image is not assigned to any other client application.
  • the status of the image remains as “assigned” until the process to match the identified subject to the mobile computing device is complete. After matching is complete, the status can be changed to “available.” This allows for rotating use of a small set of semaphores in a given system, simplifying the image recognition problem.
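The rotating use of a small set of semaphore images can be sketched as a simple pool with “available” and “assigned” statuses. This is a minimal sketch, assuming an in-memory dictionary in place of the image database 160; the class `SemaphorePool` and its method names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional


class SemaphorePool:
    """Hypothetical in-memory stand-in for semaphore status tracking."""

    def __init__(self, image_ids: Iterable[Hashable]) -> None:
        self._status: Dict[Hashable, str] = {i: "available" for i in image_ids}
        self._assigned_to: Dict[Hashable, str] = {}

    def assign(self, user_account: str) -> Optional[Hashable]:
        """Mark the first available image as assigned and return its identifier."""
        for image_id, status in self._status.items():
            if status == "available":
                self._status[image_id] = "assigned"
                self._assigned_to[image_id] = user_account
                return image_id
        return None  # every semaphore image is currently in use

    def release(self, image_id: Hashable) -> None:
        """Return an image to the pool once matching is complete."""
        self._status[image_id] = "available"
        self._assigned_to.pop(image_id, None)

    def account_for(self, image_id: Hashable) -> Optional[str]:
        """Look up which user account an assigned image belongs to."""
        return self._assigned_to.get(image_id)
```

Because an image is unavailable to other client applications while assigned, recognizing it in the store identifies exactly one pending login.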
  • the client application receives the semaphore image and displays it on the mobile computing device. In one embodiment, the client application also increases the brightness of the display to increase the image visibility.
  • the image is captured by one or more cameras 114 and sent to an image processing engine, referred to as WhatCNN.
  • the system uses WhatCNN at step 608 to recognize the semaphore images displayed on the mobile computing device.
  • WhatCNN is a convolutional neural network trained to process the specified bounding boxes in the images to generate a classification of hands of the identified subjects.
  • One trained WhatCNN processes image frames from one camera. In the example embodiment of the shopping store, for each hand joint in each image frame, the WhatCNN identifies whether the hand joint is empty.
  • the WhatCNN also identifies a semaphore image identifier (in the image database 160 ) or an SKU (stock keeping unit) number of the inventory item in the hand joint, a confidence value indicating the item in the hand joint is a non-SKU item (i.e. it does not belong to the shopping store inventory) and a context of the hand joint location in the image frame.
  • a WhatCNN model per camera identifies semaphore images (displayed on mobile computing devices) in hands (represented by hand joints) of subjects.
  • a coordination logic combines the outputs of WhatCNN models into a consolidated data structure listing identifiers of semaphore images in left hand (referred to as left_hand_classid) and right hand (right_hand_classid) of identified subjects (step 610 ).
  • the system stores this information in a dictionary mapping subject_id to left_hand_classid and right_hand_classid along with a timestamp, including locations of the joints in real space.
  • the details of WhatCNN are presented in U.S. patent application Ser. No. 15/907,112, filed 27 Feb. 2018, titled, “Item Put and Take Detection Using Image Recognition” which is incorporated herein by reference as if fully set forth herein.
  • the system checks if the semaphore image sent to the client application is recognized by the WhatCNN by iterating over the output of the WhatCNN models for both hands of all identified subjects. If the semaphore image is not recognized, the system sends a reminder at a step 614 to the client application to display the semaphore image on the mobile computing device and repeats process steps 608 to 612 . Otherwise, if the semaphore image is recognized by WhatCNN, the system matches a user_account (from the user account database 150 ) associated with the client application to subject_id (from the subject database 140 ) of the identified subject holding the mobile computing device (step 616 ). In one embodiment, the system maintains this mapping (subject_id-user_account) for as long as the subject is present in the area of real space. The process ends at step 618 .
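The check over both hands of all identified subjects can be sketched as follows. This is an illustrative sketch only: the function names and the plain dictionaries are assumptions, with `hand_classes` standing in for the consolidated structure that maps subject_id to left_hand_classid and right_hand_classid.

```python
from typing import Dict, Hashable, Optional, Tuple

HandClasses = Dict[Hashable, Tuple[Optional[int], Optional[int]]]


def find_subject_showing(semaphore_id: int,
                         hand_classes: HandClasses) -> Optional[Hashable]:
    """Return the subject whose left or right hand shows the semaphore image."""
    for subject_id, (left_hand_classid, right_hand_classid) in hand_classes.items():
        if semaphore_id in (left_hand_classid, right_hand_classid):
            return subject_id
    return None


def check_in(user_account: str, semaphore_id: int, hand_classes: HandClasses,
             links: Dict[Hashable, str]) -> bool:
    """Link subject_id to user_account if the semaphore image was recognized.

    Returns False when the image was not seen, in which case the caller would
    send a reminder to the client application and try again.
    """
    subject_id = find_subject_showing(semaphore_id, hand_classes)
    if subject_id is None:
        return False
    links[subject_id] = user_account
    return True
```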
  • the flowchart 700 in FIG. 7 presents process steps for a second technique for matching identified subjects with user accounts.
  • This technique uses radio signals emitted by the mobile devices indicating location of the mobile devices.
  • the process starts at step 702. The system accepts login communication from a client application on a mobile computing device, as described above in step 604 , to link an authenticated user account to the mobile computing device.
  • the system receives service location information from the mobile devices in the area of real space at regular intervals.
  • latitude and longitude coordinates of the mobile computing device emitted from a global positioning system (GPS) receiver of the mobile computing device are used by the system to determine the location.
  • in one embodiment, the service location of the mobile computing device obtained from GPS coordinates has an accuracy of 1 to 3 meters; in another embodiment, it has an accuracy of 1 to 5 meters.
  • Other techniques can be used in combination with the above technique or independently to determine the service location of the mobile computing device. Examples of such techniques include using signal strengths from different wireless access points (WAP) such as 250 and 252 shown in FIGS. 2 and 3 as an indication of how far the mobile computing device is from respective access points. The system then uses known locations of wireless access points (WAP) 250 and 252 to triangulate and determine the position of the mobile computing device in the area of real space. Other types of signals (such as Bluetooth, ultra-wideband, and ZigBee) emitted by the mobile computing devices can also be used to determine a service location of the mobile computing device.
  • the system monitors the service locations of mobile devices with client applications that are not yet linked to an identified subject at step 708 at regular intervals such as every second.
  • the system determines the distance of a mobile computing device with an unmatched user account from all other mobile computing devices with unmatched user accounts. The system compares this distance with a pre-determined threshold distance “d” such as 3 meters. If the mobile computing device is away from all other mobile devices with unmatched user accounts by at least “d” distance (step 710 ), the system determines a nearest not yet linked subject to the mobile computing device (step 714 ). The location of the identified subject is obtained from the output of the JointsCNN at step 712 .
  • the location of the subject obtained from the JointsCNN is more accurate than the service location of the mobile computing device.
  • the system performs the same process as described above in flowchart 600 to match the subject_id of the identified subject with the user_account of the client application. The process ends at a step 718 .
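The separation test and nearest-subject selection of the second technique can be sketched as below. This is a simplified sketch under assumptions: locations are taken as 2D (x, y) floor coordinates in meters, the function names are hypothetical, and no attempt is made to resolve the case where two isolated devices share the same nearest subject.

```python
import math
from typing import Dict, Hashable, Iterable, Optional, Tuple

Point = Tuple[float, float]


def is_isolated(device: Point, others: Iterable[Point], d: float = 3.0) -> bool:
    """True if the device is at least d meters from every other unmatched device."""
    return all(math.dist(device, other) >= d for other in others)


def nearest_subject(device: Point,
                    subject_locations: Dict[Hashable, Point]) -> Optional[Hashable]:
    """Return the not yet linked subject closest to the device, if any."""
    if not subject_locations:
        return None
    return min(subject_locations,
               key=lambda sid: math.dist(device, subject_locations[sid]))


def match_by_location(device_locations: Dict[str, Point],
                      subject_locations: Dict[Hashable, Point],
                      d: float = 3.0) -> Dict[str, Hashable]:
    """Map each sufficiently isolated user account to its nearest unlinked subject."""
    matches = {}
    for account, location in device_locations.items():
        others = [p for a, p in device_locations.items() if a != account]
        if is_isolated(location, others, d):
            subject_id = nearest_subject(location, subject_locations)
            if subject_id is not None:
                matches[account] = subject_id
    return matches
```

Devices that are closer than “d” to another unmatched device are left unmatched in this pass and can be retried at the next monitoring interval.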
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • the flowchart 800 in FIG. 8 presents process steps for a third technique for matching identified subjects with user accounts.
  • This technique uses signals emitted by an accelerometer of the mobile computing devices to match identified subjects with client applications.
  • the process starts at step 802 .
  • the process then accepts login communication from the client application, as described above for step 604 in the first and second techniques.
  • the system receives signals emitted from the mobile computing devices carrying data from accelerometers on the mobile computing devices in the area of real space, which can be sent at regular intervals.
  • the system calculates an average velocity of all mobile computing devices with unmatched user accounts.
  • the accelerometers provide acceleration of mobile computing devices along the three axes (x, y, z).
  • v_0 is initialized to “0” and subsequently, at every time t+1, v_t becomes v_0.
  • the system calculates moving averages of velocities of all mobile computing devices over a larger period of time such as 3 seconds which is long enough for the walking gait of an average person, or over longer periods of time.
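The velocity bookkeeping described above can be sketched as follows. The update rule v_t = v_0 + a * dt is an assumption made for this sketch (the equation itself is not reproduced in this text), as are the class name `VelocityEstimator`, the fixed sampling interval `dt`, and the 3-second default window.

```python
from collections import deque
from typing import Deque, Tuple

Vec3 = Tuple[float, float, float]


class VelocityEstimator:
    """Hypothetical per-device velocity tracker fed by accelerometer samples."""

    def __init__(self, dt: float, window_seconds: float = 3.0) -> None:
        self.dt = dt                        # seconds between accelerometer samples
        self.v: Vec3 = (0.0, 0.0, 0.0)      # v_0 is initialized to 0
        # Keep only the samples that fall inside the averaging window.
        self.window: Deque[Vec3] = deque(maxlen=max(1, round(window_seconds / dt)))

    def update(self, accel: Vec3) -> Vec3:
        """Assumed rule: v_t = v_0 + a * dt; v_t then serves as v_0 next time."""
        self.v = tuple(v + a * self.dt for v, a in zip(self.v, accel))
        self.window.append(self.v)
        return self.v

    def moving_average(self) -> Vec3:
        """Average velocity per axis over the most recent window of samples."""
        n = len(self.window)
        if n == 0:
            return (0.0, 0.0, 0.0)
        return tuple(sum(v[i] for v in self.window) / n for i in range(3))
```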
  • the system calculates Euclidean distance (also referred to as L2 norm) between velocities of all pairs of mobile computing devices with unmatched client applications to not yet linked identified subjects.
  • the velocities of subjects are derived from changes in positions of their joints with respect to time, obtained from joints analysis and stored in respective subject data structures 500 with timestamps.
  • a location of center of mass of each subject is determined using the joints analysis.
  • the velocity, or other derivative, of the center of mass location data of the subject is used for comparison with velocities of mobile computing devices.
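A subject's velocity derived from joint positions can be sketched as below. This sketch assumes the center of mass is approximated by the unweighted mean of the joint locations and that velocity is a finite difference between two timestamps; both are simplifying assumptions, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def center_of_mass(joints: Sequence[Vec3]) -> Vec3:
    """Approximate the center of mass as the mean of the joint locations."""
    n = len(joints)
    return tuple(sum(j[i] for j in joints) / n for i in range(3))


def subject_velocity(prev_joints: Sequence[Vec3], curr_joints: Sequence[Vec3],
                     dt: float) -> Vec3:
    """Finite-difference velocity of the center of mass over dt seconds."""
    c0 = center_of_mass(prev_joints)
    c1 = center_of_mass(curr_joints)
    return tuple((b - a) / dt for a, b in zip(c0, c1))
```

The resulting vector can be compared with a device velocity, for example with a Euclidean distance.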
  • when the Euclidean distance for a subject_id-user_account pair is below a threshold (threshold_0), a score_counter for that pair is incremented. The above process is performed at regular time intervals, thus updating the score_counter for each subject_id-user_account pair.
  • the system compares the score_counter values for pairs of every unmatched user account with every not yet linked identified subject (step 812 ). If the highest score is greater than threshold_1 (step 814 ), the system calculates the difference between the highest score and the second highest score (for pair of same user account with a different subject) at step 816 . If the difference is greater than threshold_2, the system selects the mapping of user_account to the identified subject at step 818 and follows the same process as described above in step 616 . The process ends at a step 820 .
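The score_counter update and the two-threshold selection can be sketched as follows. This is a minimal sketch under assumptions: the function names and dictionary layouts are hypothetical, a missing second-highest score is treated as 0, and the threshold values in any usage are placeholders rather than trained values.

```python
import math
from collections import defaultdict
from typing import DefaultDict, Dict, Hashable, Optional, Tuple

Vec3 = Tuple[float, float, float]
Scores = DefaultDict[Tuple[Hashable, str], int]


def update_scores(scores: Scores, device_velocities: Dict[str, Vec3],
                  subject_velocities: Dict[Hashable, Vec3],
                  threshold_0: float) -> None:
    """Increment score_counter for every pair whose velocities are close."""
    for account, device_v in device_velocities.items():
        for subject_id, subject_v in subject_velocities.items():
            if math.dist(device_v, subject_v) < threshold_0:
                scores[(subject_id, account)] += 1


def select_match(scores: Scores, account: str, threshold_1: int,
                 threshold_2: int) -> Optional[Hashable]:
    """Pick a subject only if its score is high and clearly ahead of the runner-up."""
    ranked = sorted(
        ((score, sid) for (sid, acc), score in scores.items() if acc == account),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not ranked:
        return None
    best_score, best_subject = ranked[0]
    if best_score <= threshold_1:
        return None
    second_score = ranked[1][0] if len(ranked) > 1 else 0
    if best_score - second_score <= threshold_2:
        return None
    return best_subject
```

Requiring a margin over the second highest score keeps the system from committing to a match while two subjects move similarly.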
  • the velocity of the hand (of the identified subject) holding the mobile computing device is used in above process instead of using the velocity of the center of mass of the subject. This improves performance of the matching algorithm.
  • the system uses training data with labels assigned to the images. During training, various combinations of the threshold values are used and the output of the algorithm is matched with ground truth labels of images to determine its performance. The values of thresholds that result in best overall assignment accuracy are selected for use in production (or inference).
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • a network ensemble is a learning paradigm where many networks are jointly used to solve a problem. Ensembles typically improve the prediction accuracy obtained from a single classifier by a factor that validates the effort and cost associated with learning multiple models.
  • the second and third techniques presented above are jointly used in an ensemble (or network ensemble). To use the two techniques in an ensemble, relevant features are extracted from application of the two techniques.
  • FIGS. 9A-9C present process steps (in a flowchart 900 ) for extracting features, training the ensemble and using the trained ensemble to predict match of a user account to a not yet linked identified subject.
  • FIG. 9A presents the process steps for generating features using the second technique that uses service location of mobile computing devices.
  • the process starts at step 902 .
  • a Count_X for the second technique is calculated indicating a number of times a service location of a mobile computing device with an unmatched user account is X meters away from all other mobile computing devices with unmatched user accounts.
  • Count_X values of all tuples of subject_id-user_account pairs are stored by the system for use by the ensemble. In one embodiment, multiple values of X are used, e.g., 1 m, 2 m, 3 m, 4 m, 5 m (steps 908 and 910 ).
  • the count is stored as a dictionary that maps tuples of subject_id-user_account to count score, which is an integer. In the example where 5 values of X are used, five such dictionaries are created at step 912 . The process ends at step 914 .
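  • The Count_X dictionaries can be sketched as follows, purely as an illustration. The sketch assumes two-dimensional locations, reads "X meters away" as "at least X meters away", and credits the count to the identified subject closest to the isolated device. The function name count_x and the layout of the frames input are assumptions.

```python
import math
from collections import Counter


def count_x(frames, x_values=(1, 2, 3, 4, 5)):
    """Build one dictionary per X mapping (subject_id, user_account) -> count.

    frames: list of (device_locations, subject_locations), one entry per time step
        device_locations:  {user_account: (x, y)} for unmatched user accounts
        subject_locations: {subject_id: (x, y)} for not yet linked subjects
    """
    counts = {x: Counter() for x in x_values}
    for devices, subjects in frames:
        if not subjects:
            continue
        for account, location in devices.items():
            # Distance from this device to the nearest other unmatched device.
            nearest_other = min(
                (math.dist(location, other)
                 for other_account, other in devices.items()
                 if other_account != account),
                default=math.inf,
            )
            # Assumption: credit the subject closest to the device.
            closest_subject = min(
                subjects, key=lambda s: math.dist(location, subjects[s])
            )
            for x in x_values:
                if nearest_other >= x:
                    counts[x][(closest_subject, account)] += 1
    return counts
```

  • With the five example values of X, the returned dictionary holds five inner dictionaries, matching the five dictionaries created at step 912.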
  • FIG. 9B presents the process steps for generating features using the third technique that uses velocities of mobile computing devices.
  • the process starts at step 920 .
  • A Count_Y for the third technique is determined, which is equal to the score_counter values indicating the number of times the Euclidean distance between a particular subject_id-user_account pair is below a threshold_0.
  • Count_Y values of all tuples of subject_id-user_account pairs are stored by the system for use by the ensemble. In one embodiment, multiple values of threshold_0 are used, e.g., five different values (steps 926 and 928 ).
  • the Count_Y is stored as a dictionary that maps tuples of subject_id-user_account to count score, which is an integer. In the example where 5 values of threshold are used, five such dictionaries are created at step 930 . The process ends at step 932 .
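  • As a sketch under stated assumptions, the Count_Y dictionaries can be built as shown below. The sketch assumes that per-time-step velocity vectors are already available for each identified subject and each mobile computing device. The function name count_y and the input layout are hypothetical.

```python
import math
from collections import Counter


def count_y(velocity_frames, thresholds):
    """Build one dictionary per threshold_0 mapping (subject_id, user_account) -> count.

    velocity_frames: list of (subject_velocities, device_velocities) per time step
        subject_velocities: {subject_id: (vx, vy)}
        device_velocities:  {user_account: (vx, vy)}
    """
    counts = {t: Counter() for t in thresholds}
    for subject_velocities, device_velocities in velocity_frames:
        for subject_id, subject_v in subject_velocities.items():
            for account, device_v in device_velocities.items():
                # Euclidean distance between the two velocity vectors.
                distance = math.dist(subject_v, device_v)
                for t in thresholds:
                    if distance < t:
                        counts[t][(subject_id, account)] += 1
    return counts
```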
  • the features from the second and third techniques are then used to create a labeled training data set and used to train the network ensemble.
  • To collect a labeled training data set, multiple subjects (shoppers) walk in an area of real space such as a shopping store. The images of these subjects are collected using cameras 114 at regular time intervals. Human labelers review the images and assign correct identifiers (subject_id and user_account) to the images in the training data.
  • the process is described in a flowchart 900 presented in FIG. 9C . The process starts at a step 940 .
  • Count_X and Count_Y dictionaries obtained from second and third techniques are compared with corresponding true labels assigned by the human labelers on the images to identify correct matches (true) and incorrect matches (false) of subject_id and user_account.
  • a binary classifier is trained using this training data set (step 944 ).
  • Commonly used methods for binary classification include decision trees, random forest, neural networks, gradient boost, support vector machines, etc.
  • a trained binary classifier is used to categorize new probabilistic observations as true or false.
  • the trained binary classifier is used in production (or inference) by giving as input Count_X and Count_Y dictionaries for subject_id-user_account tuples.
  • the trained binary classifier classifies each tuple as true or false at a step 946 .
  • the process ends at a step 948 .
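  • The training and inference steps 944 and 946 can be illustrated with a deliberately small classifier. The sketch below swaps in a one-level decision tree (a decision stump) written in plain Python so that it stays self-contained; it is not the classifier of the described system, and any of the listed methods could be used instead. Each feature vector is assumed to hold the Count_X and Count_Y values for one subject_id-user_account tuple, and the names train_stump and predict are hypothetical.

```python
def train_stump(features, labels):
    """Fit a one-level decision tree on count features.

    features: list of equal-length numeric vectors, one per subject_id-user_account tuple
    labels:   list of bool, True for a correct match according to the human labelers
    Returns (feature_index, cutoff); a tuple is classified True when
    vector[feature_index] >= cutoff.
    """
    best = None
    for j in range(len(features[0])):
        # Try every observed value of feature j as a cutoff.
        for cutoff in sorted({row[j] for row in features}):
            errors = sum(
                (row[j] >= cutoff) != label
                for row, label in zip(features, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, j, cutoff)
    return best[1], best[2]


def predict(stump, vector):
    """Classify one feature vector as a true (True) or false (False) match."""
    feature_index, cutoff = stump
    return vector[feature_index] >= cutoff
```

  • In use, the stump is trained once on the labeled tuples and then applied to the feature vector of each new subject_id-user_account tuple.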
  • the system sends a notification to the mobile computing device to open the client application. If the user accepts the notification, the client application will display a semaphore image as described in the first technique. The system will then follow the steps in the first technique to check-in the shopper (match subject_id to user_account). If the customer does not respond to the notification, the system will send a notification to an employee in the shopping store indicating the location of the unmatched customer. The employee can then walk to the customer, ask him to open the client application on his mobile computing device to check-in to the system using a semaphore image.
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • An example architecture of a system in which the four techniques presented above are applied to match a user_account to a not yet linked subject in an area of real space is presented in FIG. 10 .
  • Because FIG. 10 is an architectural diagram, certain details are omitted to improve the clarity of description.
  • the system presented in FIG. 10 receives image frames from a plurality of cameras 114 .
  • the cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate.
  • Images captured in all the cameras covering an area of real space at the same time, or close in time, are synchronized in the sense that the synchronized images can be identified in the processing engines as representing different views at a moment in time of subjects having fixed positions in the real space.
  • the images are stored in a circular buffer of image frames per camera 1002 .
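  • A per-camera circular buffer of this kind can be sketched as follows. The class name FrameBuffers, the capacity parameter and the (timestamp, frame) entry layout are assumptions made for illustration.

```python
from collections import deque


class FrameBuffers:
    """One fixed-capacity circular buffer of image frames per camera."""

    def __init__(self, camera_ids, capacity):
        # deque(maxlen=...) discards the oldest frame once the buffer is full.
        self._buffers = {cid: deque(maxlen=capacity) for cid in camera_ids}

    def push(self, camera_id, timestamp, frame):
        """Store a new frame, overwriting the oldest one when at capacity."""
        self._buffers[camera_id].append((timestamp, frame))

    def latest(self, camera_id, n):
        """Return up to n most recent (timestamp, frame) entries, oldest first."""
        frames = list(self._buffers[camera_id])
        return frames[max(len(frames) - n, 0):]
```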
  • a “subject identification” subsystem 1004 processes image frames received from cameras 114 to identify and track subjects in the real space.
  • the first image processors include subject image recognition engines such as the JointsCNN above.
  • a “semantic diffing” subsystem 1006 (also referred to as second image processors) includes background image recognition engines, which receive corresponding sequences of images from the plurality of cameras and recognize semantically significant differences in the background (i.e. inventory display structures like shelves) as they relate to puts and takes of inventory items, for example, over time in the images from each camera.
  • the second image processors receive output of the subject identification subsystem 1004 and image frames from cameras 114 as input. Details of “semantic diffing” subsystem are presented in U.S. patent application Ser. No. 15/945,466, filed 4 Apr. 2018, titled, “Predicting Inventory Events using Semantic Diffing,” and U.S. patent application Ser. No. 15/945,473, filed 4 Apr.
  • the second image processors process identified background changes to make a first set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects.
  • the first set of detections are also referred to as background detections of puts and takes of inventory items.
  • the first detections identify inventory items taken from the shelves or put on the shelves by customers or employees of the store.
  • the semantic diffing subsystem includes the logic to associate identified background changes with identified subjects.
  • a “region proposals” subsystem 1008 (also referred to as third image processors) includes foreground image recognition engines, receives corresponding sequences of images from the plurality of cameras 114 , and recognizes semantically significant objects in the foreground (i.e. shoppers, their hands and inventory items) as they relate to puts and takes of inventory items, for example, over time in the images from each camera.
  • the region proposals subsystem 1008 also receives output of the subject identification subsystem 1004 .
  • the third image processors process sequences of images from cameras 114 to identify and classify foreground changes represented in the images in the corresponding sequences of images.
  • the third image processors process identified foreground changes to make a second set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects.
  • The second set of detections is also referred to as foreground detections of puts and takes of inventory items.
  • the second set of detections identifies takes of inventory items and puts of inventory items on inventory display structures by customers and employees of the store.
  • the system described in FIG. 10 includes a selection logic 1010 to process the first and second sets of detections to generate log data structures including lists of inventory items for identified subjects. For a take or put in the real space, the selection logic 1010 selects the output from either the semantic diffing subsystem 1006 or the region proposals subsystem 1008 . In one embodiment, the selection logic 1010 uses a confidence score generated by the semantic diffing subsystem for the first set of detections and a confidence score generated by the region proposals subsystem for a second set of detections to make the selection. The output of the subsystem with a higher confidence score for a particular detection is selected and used to generate a log data structure 1012 (also referred to as a shopping cart data structure) including a list of inventory items (and their quantities) associated with identified subjects.
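  • The confidence-based selection can be sketched as below, as an illustration only. Each detection is assumed to be a dictionary with subject_id, item, quantity (positive for a take, negative for a put) and confidence keys. These field names, the function name build_logs and the rule of preferring the semantic diffing output on a tie are assumptions.

```python
from collections import Counter, defaultdict


def build_logs(detection_pairs):
    """Select one detection per put or take and accumulate shopping cart logs.

    detection_pairs: list of (semantic_diffing_detection, region_proposals_detection)
    Returns {subject_id: Counter({item: quantity})}.
    """
    logs = defaultdict(Counter)
    for background, foreground in detection_pairs:
        # Keep the output of the subsystem with the higher confidence score.
        if background["confidence"] >= foreground["confidence"]:
            chosen = background
        else:
            chosen = foreground
        logs[chosen["subject_id"]][chosen["item"]] += chosen["quantity"]
    return logs
```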
  • the system in FIG. 10 applies the four techniques for matching the identified subject (associated with the log data) to a user_account which includes a payment method such as credit card or bank account information.
  • The four techniques are applied sequentially as shown in the figure. If the process steps in flowchart 600 for the first technique produce a match between the subject and the user account then this information is used by a payment processor 1036 to charge the customer for the inventory items in the log data structure. Otherwise (step 1028 ), the process steps presented in flowchart 700 for the second technique are followed and the user account is used by the payment processor 1036 .
  • the system sends a notification to the mobile computing device to open the client application and follow the steps presented in the flowchart 600 for the first technique. If the customer does not respond to the notification, the system will send a notification to an employee in the shopping store indicating the location of the unmatched customer. The employee can then walk to the customer, ask him to open the client application on his mobile computing device to check-in to the system using a semaphore image (step 1040 ). It is understood that in other embodiments of the architecture presented in FIG. 10 , fewer than four techniques can be used to match the user accounts to not yet linked identified subjects.
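  • The sequential application of the techniques, with the notification as the last resort, can be sketched as follows. The function name match_subject is hypothetical, and each technique is assumed to be a callable that returns a user account or None.

```python
def match_subject(subject_id, techniques, notify):
    """Try each matching technique in order and fall back to a notification.

    techniques: ordered list of (name, callable); each callable takes a subject_id
                and returns a user_account, or None when it finds no match
    notify:     callable invoked with the subject_id when every technique fails
    Returns (technique_name, user_account), or None if no technique matched.
    """
    for name, technique in techniques:
        account = technique(subject_id)
        if account is not None:
            return name, account
    # No technique produced a match: prompt the customer or alert an employee.
    notify(subject_id)
    return None
```

  • Because the list of techniques is an argument, the same loop covers embodiments that use fewer than four techniques.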
  • FIG. 11 presents an architecture of a network hosting the matching engine 170 which is hosted on the network node 103 .
  • the system includes a plurality of network nodes 103 , 101 a - 101 n, and 102 in the illustrated embodiment.
  • the network nodes are also referred to as processing platforms.
  • Processing platforms (network nodes) 103 , 101 a - 101 n, and 102 and cameras 1112 , 1114 , 1116 , . . . 1118 are connected to network(s) 1181 .
  • FIG. 11 shows a plurality of cameras 1112 , 1114 , 1116 , . . . 1118 connected to the network(s).
  • the cameras 1112 to 1118 are connected to the network(s) 1181 using Ethernet-based connectors 1122 , 1124 , 1126 , and 1128 , respectively.
  • the Ethernet-based connectors have a data transfer speed of 1 gigabit per second, also referred to as Gigabit Ethernet.
  • cameras 114 are connected to the network using other types of network connections which can have a faster or slower data transfer rate than Gigabit Ethernet.
  • a set of cameras can be connected directly to each processing platform, and the processing platforms can be coupled to a network.
  • Storage subsystem 1130 stores the basic programming and data constructs that provide the functionality of certain embodiments of the present invention.
  • the various modules implementing the functionality of the matching engine 170 may be stored in storage subsystem 1130 .
  • the storage subsystem 1130 is an example of a computer readable memory comprising a non-transitory data storage medium, having computer instructions stored in the memory executable by a computer to perform all or any combination of the data processing and image processing functions described herein, including logic to link subjects in an area of real space with a user account, to determine locations of identified subjects represented in the images, match the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space by processes as described herein.
  • the computer instructions can be stored in other types of memory, including portable memory, that comprise a non-transitory data storage medium or media, readable by a computer.
  • a host memory subsystem 1132 typically includes a number of memories including a main random access memory (RAM) 1134 for storage of instructions and data during program execution and a read-only memory (ROM) 1136 in which fixed instructions are stored.
  • the RAM 1134 is used as a buffer for storing subject_id-user_account tuples matched by the matching engine 170 .
  • a file storage subsystem 1140 provides persistent storage for program and data files.
  • the storage subsystem 1140 includes four 120 Gigabyte (GB) solid state disks (SSD) in a RAID 0 (redundant array of independent disks) arrangement identified by a numeral 1142 .
  • User account data in the user account database 150 and image data in the image database 160 that are not in RAM are stored in RAID 0 .
  • the hard disk drive (HDD) 1146 is slower in access speed than the RAID 0 1142 storage.
  • the solid state disk (SSD) 1144 contains the operating system and related files for the matching engine 170 .
  • three cameras 1112 , 1114 , and 1116 are connected to the processing platform (network node) 103 .
  • Each camera has a dedicated graphics processing unit GPU 1 1162 , GPU 2 1164 , and GPU 3 1166 , to process images sent by the camera. It is understood that fewer than or more than three cameras can be connected per processing platform. Accordingly, fewer or more GPUs are configured in the network node so that each camera has a dedicated GPU for processing the image frames received from the camera.
  • the processor subsystem 1150 , the storage subsystem 1130 and the GPUs 1162 , 1164 , and 1166 communicate using the bus subsystem 1154 .
  • a network interface subsystem 1170 is connected to the bus subsystem 1154 forming part of the processing platform (network node) 103 .
  • Network interface subsystem 1170 provides an interface to outside networks, including an interface to corresponding interface devices in other computer systems.
  • the network interface subsystem 1170 allows the processing platform to communicate over the network either by using cables (or wires) or wirelessly.
  • the wireless radio signals 1175 emitted by the mobile computing devices 120 in the area of real space are received (via the wireless access points) by the network interface subsystem 1170 for processing by the matching engine 170 .
  • a number of peripheral devices such as user interface output devices and user interface input devices are also connected to the bus subsystem 1154 forming part of the processing platform (network node) 103 . These subsystems and devices are intentionally not shown in FIG. 11 to improve the clarity of the description.
  • bus subsystem 1154 is shown schematically as a single bus, alternative embodiments of the bus subsystem may use multiple busses.
  • The cameras 114 can be implemented using Chameleon3 1.3 MP Color USB3 Vision (Sony ICX445), having a resolution of 1288×964, a frame rate of 30 FPS, and 1.3 megapixels per image, with a varifocal lens having a working distance (mm) of 300-∞ and a field of view with a 1/3″ sensor of 98.2°-23.8°.
  • the system for linking subjects in an area of real space with user accounts described above also includes one or more of the following features.
  • the system includes a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images in corresponding fields of view in the real space.
  • The processing system is coupled to the plurality of cameras and includes logic to determine locations of identified subjects represented in the images.
  • the system matches the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space, and matches locations of the mobile devices with locations of the subjects.
  • The signals emitted by the mobile computing devices comprise images.
  • the signals emitted by the mobile computing devices comprise radio signals.
  • the system includes a set of semaphore images accessible to the processing system.
  • the processing system includes logic to accept login communications from a client application on a mobile computing device identifying a user account before matching the user account to an identified subject in the area of real space, and after accepting login communications the system sends a selected semaphore image from the set of semaphore images to the client application on the mobile device.
  • the processing system sets a status of the selected semaphore image as assigned.
  • the processing system receives a displayed image of the selected semaphore image.
  • the processing system recognizes the displayed image and matches the recognized semaphore image with the assigned images from the set of semaphore images.
  • the processing system matches a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject.
  • the processing system after matching the user account to the identified subject, sets the status of the recognized semaphore image as available.
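  • The assigned and available status handling for semaphore images can be sketched as follows. The class name SemaphorePool and its method names are hypothetical, and image recognition itself is outside the scope of the sketch: the recognized image is assumed to arrive as an identifier.

```python
class SemaphorePool:
    """Tracks which semaphore images are available or assigned to a user account."""

    def __init__(self, image_ids):
        self._status = {image_id: "available" for image_id in image_ids}
        self._account = {}

    def assign(self, user_account):
        """After login, pick an available image and mark it as assigned."""
        for image_id, status in self._status.items():
            if status == "available":
                self._status[image_id] = "assigned"
                self._account[image_id] = user_account
                return image_id
        return None  # every image is currently assigned

    def account_for(self, recognized_image_id):
        """Match a recognized image against the assigned images."""
        if self._status.get(recognized_image_id) != "assigned":
            return None
        return self._account[recognized_image_id]

    def release(self, image_id):
        """After the account is linked to a subject, mark the image available."""
        if image_id in self._status:
            self._account.pop(image_id, None)
            self._status[image_id] = "available"
```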
  • the client applications on the mobile computing devices transmit accelerometer data to the processing system, and the system matches the identified subjects with user accounts using the accelerometer data transmitted from the mobile computing devices.
  • the logic to match the identified subjects with user accounts includes logic that uses the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
  • the signals emitted by the mobile computing devices include location data and accelerometer data.
  • the signals emitted by the mobile computing devices comprise images.
  • the signals emitted by the mobile computing devices comprise radio signals.
  • a method of linking subjects in an area of real space with user accounts is disclosed.
  • The user accounts are linked with client applications executable on mobile computing devices.
  • The method includes using a plurality of cameras to produce respective sequences of images in corresponding fields of view in the real space. Then the method includes determining locations of identified subjects represented in the images.
  • the method includes matching the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space. Finally, the method includes matching locations of the mobile computing devices with locations of the subjects.
  • The method also includes setting a status of the selected semaphore image as assigned, receiving a displayed image of the selected semaphore image, recognizing the displayed semaphore image and matching the recognized image with the assigned images from the set of semaphore images.
  • The method includes matching a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject.
  • The method includes, after matching the user account to the identified subject, setting the status of the recognized semaphore image as available.
  • Matching the identified subjects with user accounts further includes using the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space, and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
  • the signals emitted by the mobile computing devices include location data and accelerometer data.
  • the signals emitted by the mobile computing devices comprise images.
  • the signals emitted by the mobile computing devices comprise radio signals.
  • a non-transitory computer readable storage medium impressed with computer program instructions to link subjects in an area of real space with user accounts is disclosed.
  • The user accounts are linked with client applications executable on mobile computing devices. The instructions, when executed on a processor, implement a method.
  • the method includes using a plurality of cameras to produce respective sequences of images in corresponding fields of view in the real space.
  • the method includes determining locations of identified subjects represented in the images.
  • the method includes matching the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space.
  • the method includes matching locations of the mobile computing devices with locations of the subjects.
  • the non-transitory computer readable storage medium implements the method further comprising the following steps.
  • the method includes setting a status of the selected semaphore image as assigned, receiving a displayed image of the selected semaphore image, recognizing the displayed semaphore image and matching the recognized image with the assigned images from the set of semaphore images.
  • The method includes matching a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject. After matching the user account to the identified subject, the method includes setting the status of the recognized semaphore image as available.
  • the non-transitory computer readable storage medium implements the method including matching the identified subjects with user accounts by using the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
  • the signals emitted by the mobile computing devices include location data and accelerometer data.
  • any data structures and code described or referenced above are stored according to many implementations in computer readable memory, which comprises a non-transitory computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system.
  • Examples include application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing computer-readable media now known or later developed.

Abstract

Systems and techniques are provided for linking subjects in an area of real space with user accounts. The user accounts are linked with client applications executable on mobile computing devices. A plurality of cameras are disposed above the area. The cameras in the plurality of cameras produce respective sequences of images in corresponding fields of view in the real space. A processing system is coupled to the plurality of cameras. The processing system includes logic to determine locations of subjects represented in the images. The processing system further includes logic to match the identified subjects with user accounts by identifying locations of the mobile computing devices executing client applications in the area of real space and matching locations of the mobile computing devices with locations of the subjects.

Description

    PRIORITY APPLICATION
  • This application is a continuation of U.S. patent application Ser. No. 16/255,573 (Attorney Docket No. STCG 1009-1) filed 23 Jan. 2019, which is a continuation-in-part of U.S. patent application Ser. No. 15/945,473, filed 4 Apr. 2018, now U.S. Pat. No. 10,474,988 (Attorney Docket No. STCG 1005-1), which is a continuation-in-part of U.S. patent application Ser. No. 15/907,112, filed 27 Feb. 2018, now U.S. Pat. No. 10,133,933 (Attorney Docket No. STCG 1002-1), which is a continuation-in-part of U.S. patent application Ser. No. 15/847,796, filed 19 Dec. 2017, now U.S. Pat. No. 10,055,853 (Attorney Docket No. STCG 1001-1), which claims benefit of U.S. Provisional Patent Application No. 62/542,077 (Attorney Docket No. STCG 1000-1) filed 7 Aug. 2017, which applications are incorporated herein by reference.
  • BACKGROUND
  • Field
  • The present invention relates to systems that link subjects in an area of real space with user accounts linked with client applications executing on mobile computing devices.
  • Description of Related Art
  • Identifying subjects within an area of real space, such as people in a shopping store, and uniquely associating the identified subjects with real people or with authenticated accounts associated with responsible parties can present many technical challenges. For example, consider such an image processing system deployed in a shopping store with multiple customers moving in aisles between the shelves and open spaces within the shopping store. Customers take items from shelves and put those in their respective shopping carts or baskets. Customers may also put items on the shelf, if they do not want the item. Though the system may identify a subject in the images, and the items the subject takes, the system must accurately identify an authentic user account responsible for the taken items by that subject.
  • In some systems, facial recognition, or other biometric recognition technique, might be used to identify the subjects in the images, and link them with accounts. This approach, however, requires access by the image processing system to databases storing the personal identifying biometric information, linked with the accounts. This is undesirable from a security and privacy standpoint in many settings.
  • It is desirable to provide a system that can more effectively and automatically link a subject in an area of real space to a user known to the system for providing services to the subject. Also, it is desirable to provide image processing systems by which images of large spaces are used to identify subjects without requiring personal identifying biometric information of the subjects.
  • SUMMARY
  • A system, and method for operating a system, are provided for linking subjects, such as persons in an area of real space, with user accounts. The system can use image processing to identify subjects in the area of real space without requiring personal identifying biometric information. The user accounts are linked with client applications executable on mobile computing devices. This function of linking identified subjects to user accounts by image and signal processing presents a complex problem of computer engineering, relating to the type of image and signal data to be processed, what processing of the image and signal data to perform, and how to determine actions from the image and signal data with high reliability.
  • A system and method are provided for linking subjects in an area of real space with user accounts. The user accounts are linked with client applications executable on mobile computing devices. A plurality of cameras or other sensors produce respective sequences of images in corresponding fields of view in the real space. Using these sequences of images, a system and method are described for determining locations of identified subjects represented in the images and matching the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space and matching locations of the mobile devices with locations of the subjects.
  • In one embodiment described herein, the mobile devices emit signals usable to indicate locations of the mobile devices in the area of real space. The system matches the identified subjects with user accounts by identifying locations of mobile devices using the emitted signals.
  • In one embodiment, the signals emitted by the mobile devices comprise images. In a described embodiment, the client applications on the mobile devices cause display of semaphore images, which can be as simple as a particular color, on the mobile devices in the area of real space. The system matches the identified subjects with user accounts by identifying locations of mobile devices by using an image recognition engine that determines locations of the mobile devices displaying semaphore images. The system includes a set of semaphore images. The system accepts login communications from a client application on a mobile device identifying a user account before matching the user account to an identified subject in the area of real space. After accepting login communications, the system sends a selected semaphore image from the set of semaphore images to the client application on the mobile device. The system sets a status of the selected semaphore image as assigned. The system receives a displayed image of the selected semaphore image, recognizes the displayed image and matches the recognized image with the assigned images from the set of semaphore images. The system matches a location of the mobile device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject. The system, after matching the user account to the identified subject, sets the status of the recognized semaphore image as available.
  • In one embodiment, the signals emitted by the mobile devices comprise radio signals indicating a service location of the mobile device. The system receives location data transmitted by the client applications on the mobile devices. The system matches the identified subjects with user accounts using the location data transmitted from the mobile devices. The system uses the location data transmitted from the mobile device from a plurality of locations over a time interval in the area of real space to match the identified subjects with user accounts. This matching the identified unmatched subject with the user account of the client application executing on the mobile device includes determining that all other mobile devices transmitting location information of unmatched user accounts are separated from the mobile device by a predetermined distance and determining a closest unmatched identified subject to the mobile device.
  • In one embodiment, the signals emitted by the mobile devices comprise radio signals indicating acceleration and orientation of the mobile device. In one embodiment, such acceleration data is generated by an accelerometer of the mobile computing device. In another embodiment, in addition to the accelerometer data, direction data from a compass on the mobile device is also received by the processing system. The system receives the accelerometer data from the client applications on the mobile devices. The system matches the identified subjects with user accounts using the accelerometer data transmitted from the mobile device. In this embodiment, the system uses the accelerometer data transmitted from the mobile device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space to match the identified subjects with user accounts.
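  • One way to compare the two data sources is sketched below, under stated assumptions. The derivative of the subject locations is approximated by a finite difference, and a device velocity is obtained by simple Euler integration of the accelerometer samples from an assumed initial velocity. Both choices, the fixed sampling interval dt, and the function names are assumptions rather than details of the described system.

```python
import math


def subject_velocities(positions, dt):
    """Finite-difference derivative of (x, y) subject locations sampled every dt seconds."""
    return [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


def device_velocities(accelerations, dt, v0=(0.0, 0.0)):
    """Integrate (ax, ay) accelerometer samples into velocities (Euler steps)."""
    velocities = []
    vx, vy = v0  # assumption: initial device velocity is known or zero
    for ax, ay in accelerations:
        vx, vy = vx + ax * dt, vy + ay * dt
        velocities.append((vx, vy))
    return velocities


def velocity_score(positions, accelerations, dt, threshold_0):
    """Count time steps where subject and device velocities differ by less than threshold_0."""
    pairs = zip(subject_velocities(positions, dt), device_velocities(accelerations, dt))
    return sum(1 for sv, dv in pairs if math.dist(sv, dv) < threshold_0)
```

  • A higher count for one subject and device pair than for the others over the time interval suggests that the subject is carrying that device.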
  • In one embodiment, the system matches the identified subjects with user accounts using a trained network to identify locations of mobile devices in the area of real space based on the signals emitted by the mobile devices. In such an embodiment, the signals emitted by the mobile devices include location data and accelerometer data.
  • In one embodiment, the system includes log data structures including a list of inventory items for the identified subjects. The system associates the log data structure for the matched identified subject to the user account for the identified subject.
  • In one embodiment, the system processes a payment for the list of inventory items for the identified subject from a payment method identified in the user account linked to the identified subject.
  • In one embodiment, the system matches the identified subjects with user accounts without use of personal identifying biometric information associated with the user accounts.
  • Methods and computer program products which can be executed by computer systems are also described herein.
  • Other aspects and advantages of the present invention can be seen on review of the drawings, the detailed description and the claims, which follow.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an architectural level schematic of a system in which a matching engine links subjects identified by a subject tracking engine to user accounts linked with client applications executing on mobile devices.
  • FIG. 2 is a side view of an aisle in a shopping store illustrating a subject with a mobile computing device and a camera arrangement.
  • FIG. 3 is a top view of the aisle of FIG. 2 in a shopping store illustrating the subject with the mobile computing device and the camera arrangement.
  • FIG. 4 shows an example data structure for storing joints information of subjects.
  • FIG. 5 shows an example data structure for storing a subject including the information of associated joints.
  • FIG. 6 is a flowchart showing process steps for matching an identified subject to a user account using a semaphore image displayed on a mobile computing device.
  • FIG. 7 is a flowchart showing process steps for matching an identified subject to a user account using service location of a mobile computing device.
  • FIG. 8 is a flowchart showing process steps for matching an identified subject to a user account using velocity of subjects and a mobile computing device.
  • FIG. 9A is a flowchart showing a first part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 9B is a flowchart showing a second part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 9C is a flowchart showing a third part of process steps for matching an identified subject to a user account using a network ensemble.
  • FIG. 10 is an example architecture in which the four techniques presented in FIGS. 6 to 9C are applied in an area of real space to reliably match an identified subject to a user account.
  • FIG. 11 is a camera and computer hardware arrangement configured for hosting the matching engine of FIG. 1.
  • DETAILED DESCRIPTION
  • The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not intended to be limited to the embodiments shown but is to be accorded the widest scope consistent with the principles and features disclosed herein.
  • System Overview
  • A system and various implementations of the subject technology are described with reference to FIGS. 1-11. The system and processes are described with reference to FIG. 1, an architectural level schematic of a system in accordance with an implementation. Because FIG. 1 is an architectural diagram, certain details are omitted to improve the clarity of the description.
  • The discussion of FIG. 1 is organized as follows. First, the elements of the system are described, followed by their interconnections. Then, the use of the elements in the system is described in greater detail.
  • FIG. 1 provides a block diagram level illustration of a system 100. The system 100 includes cameras 114, network nodes hosting image recognition engines 112 a, 112 b, and 112 n, a subject tracking engine 110 deployed in a network node 102 (or nodes) on the network, mobile computing devices 118 a, 118 b, 118 m (collectively referred to as mobile computing devices 120), a training database 130, a subject database 140, a user account database 150, an image database 160, a matching engine 170 deployed in a network node or nodes (also known as a processing platform) 103, and a communication network or networks 181. The network nodes can host only one image recognition engine, or several image recognition engines. The system can also include an inventory database and other supporting data.
  • As used herein, a network node is an addressable hardware device or virtual device that is attached to a network, and is capable of sending, receiving, or forwarding information over a communications channel to or from other network nodes. Examples of electronic devices which can be deployed as hardware network nodes include all varieties of computers, workstations, laptop computers, handheld computers, and smartphones. Network nodes can be implemented in a cloud-based server system. More than one virtual device configured as a network node can be implemented using a single physical device.
  • For the sake of clarity, only three network nodes hosting image recognition engines are shown in the system 100. However, any number of network nodes hosting image recognition engines can be connected to the subject tracking engine 110 through the network(s) 181. Similarly, three mobile computing devices are shown in the system 100. However, any number of mobile computing devices can be connected to the network node 103 hosting the matching engine 170 through the network(s) 181. Also, an image recognition engine, a subject tracking engine, a matching engine and other processing engines described herein can execute using more than one network node in a distributed architecture.
  • The interconnection of the elements of system 100 will now be described. Network(s) 181 couples the network nodes 101 a, 101 b, and 101 n, respectively, hosting image recognition engines 112 a, 112 b, and 112 n, the network node 102 hosting the subject tracking engine 110, the mobile computing devices 118 a, 118 b, and 118 m, the training database 130, the subject database 140, the user account database 150, the image database 160, and the network node 103 hosting the matching engine 170. Cameras 114 are connected to the subject tracking engine 110 through network nodes hosting image recognition engines 112 a, 112 b, and 112 n. In one embodiment, the cameras 114 are installed in a shopping store such that sets of cameras 114 (two or more) with overlapping fields of view are positioned over each aisle to capture images of real space in the store. In FIG. 1, two cameras are arranged over aisle 116 a, two cameras are arranged over aisle 116 b, and three cameras are arranged over aisle 116 n. The cameras 114 are installed over aisles with overlapping fields of view. In such an embodiment, the cameras are configured with the goal that customers moving in the aisles of the shopping store are present in the field of view of two or more cameras at any moment in time.
  • Cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate. The cameras 114 can send respective continuous streams of images at a predetermined rate to network nodes hosting image recognition engines 112 a-112 n. Images captured by all the cameras covering an area of real space at the same time, or close in time, are synchronized in the sense that the synchronized images can be identified in the processing engines as representing different views of subjects having fixed positions in the real space. For example, in one embodiment, the cameras send image frames at a rate of 30 frames per second (fps) to respective network nodes hosting image recognition engines 112 a-112 n. Each frame has a timestamp, identity of the camera (abbreviated as “camera_id”), and a frame identity (abbreviated as “frame_id”) along with the image data. Other embodiments of the technology disclosed can use different types of sensors such as infrared or RF image sensors, ultrasound sensors, thermal sensors, Lidars, etc., to generate this data. Multiple types of sensors can be used, including for example ultrasound or RF sensors in addition to the cameras 114 that generate RGB color output. Multiple sensors can be synchronized in time with each other, so that frames are captured by the sensors at the same time, or close in time, and at the same frame capture rate. In all of the embodiments described herein, sensors other than cameras, or sensors of multiple types, can be used to produce the sequences of images utilized.
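  • The per-frame metadata and the notion of synchronized images can be sketched as follows. The `Frame` class and the `group_synchronized` helper are hypothetical names introduced for illustration; bucketing timestamps into capture periods is one assumed way of treating frames that are "close in time" as synchronized, not a method stated in the description.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Frame:
    timestamp: float  # capture time in seconds
    camera_id: str    # identity of the source camera
    frame_id: int     # frame identity within the camera's stream
    image: bytes      # raw image data (placeholder type for the example)


def group_synchronized(frames: Iterable[Frame], fps: float = 30.0) -> Dict[int, Dict[str, Frame]]:
    """Group frames from different cameras that fall in the same capture period."""
    period = 1.0 / fps
    groups: Dict[int, Dict[str, Frame]] = defaultdict(dict)
    for frame in frames:
        slot = round(frame.timestamp / period)  # index of the nearest capture instant
        groups[slot][frame.camera_id] = frame
    return dict(groups)
```

  • Each resulting group maps a camera_id to the frame that camera captured at roughly the same moment, which is the form in which downstream engines can treat the images as different views of the same scene.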
  • Cameras installed over an aisle are connected to respective image recognition engines. For example, in FIG. 1, the two cameras installed over the aisle 116 a are connected to the network node 101 a hosting an image recognition engine 112 a. Likewise, the two cameras installed over aisle 116 b are connected to the network node 101 b hosting an image recognition engine 112 b. Each image recognition engine 112 a-112 n hosted in a network node or nodes 101 a-101 n, separately processes the image frames received from one camera each in the illustrated example.
  • In one embodiment, each image recognition engine 112 a, 112 b, and 112 n is implemented as a deep learning algorithm such as a convolutional neural network (abbreviated CNN). In such an embodiment, the CNN is trained using a training database 130. In an embodiment described herein, image recognition of subjects in the real space is based on identifying and grouping joints recognizable in the images, where the groups of joints can be attributed to an individual subject. For this joints-based analysis, the training database 130 has a large collection of images for each of the different types of joints for subjects. In the example embodiment of a shopping store, the subjects are the customers moving in the aisles between the shelves. In an example embodiment, during training of the CNN, the system 100 is referred to as a “training system.” After training the CNN using the training database 130, the CNN is switched to production mode to process images of customers in the shopping store in real time.
  • In an example embodiment, during production, the system 100 is referred to as a runtime system (also referred to as an inference system). The CNN in each image recognition engine produces arrays of joints data structures for images in its respective stream of images. In an embodiment as described herein, an array of joints data structures is produced for each processed image, so that each image recognition engine 112 a-112 n produces an output stream of arrays of joints data structures. These arrays of joints data structures from cameras having overlapping fields of view are further processed to form groups of joints, and to identify such groups of joints as subjects. These groups of joints may not uniquely identify the individual in the image, or an authentic user account for the individual in the image, but can be used to track a subject in the area. The subjects can be identified and tracked by the system using an identifier “subject_id” during their presence in the area of real space.
  • For example, when a customer enters a shopping store, the system identifies the customer using joints analysis as described above and assigns the customer a “subject_id”. This identifier is, however, not linked to the real world identity of the subject, such as a user account, name, driver's license, email addresses, mailing addresses, credit card numbers, bank account numbers, etc., or to identifying biometric information such as fingerprints, facial recognition, hand geometry, retina scan, iris scan, voice recognition, etc. Therefore, the identified subject is anonymous. Details of an example technology for subject identification and tracking are presented in U.S. Pat. No. 10,055,853, issued 21 Aug. 2018, titled, “Subject Identification and Tracking Using Image Recognition Engine” which is incorporated herein by reference as if fully set forth herein.
  • The subject tracking engine 110, hosted on the network node 102 receives, in this example, continuous streams of arrays of joints data structures for the subjects from image recognition engines 112 a-112 n. The subject tracking engine 110 processes the arrays of joints data structures and translates the coordinates of the elements in the arrays of joints data structures corresponding to images in different sequences into candidate joints having coordinates in the real space. For each set of synchronized images, the combination of candidate joints identified throughout the real space can be considered, for the purposes of analogy, to be like a galaxy of candidate joints. For each succeeding point in time, movement of the candidate joints is recorded so that the galaxy changes over time. The output of the subject tracking engine 110 is stored in the subject database 140.
  • The subject tracking engine 110 uses logic to identify groups or sets of candidate joints having coordinates in real space as subjects in the real space. For the purposes of analogy, each set of candidate joints is like a constellation of candidate joints at each point in time. The constellations of candidate joints can move over time.
  • In an example embodiment, the logic to identify sets of candidate joints comprises heuristic functions based on physical relationships amongst joints of subjects in real space. These heuristic functions are used to identify sets of candidate joints as subjects. The sets of candidate joints comprise individual candidate joints that have relationships according to the heuristic parameters with other individual candidate joints and subsets of candidate joints in a given set that has been identified, or can be identified, as an individual subject.
  • In the example of a shopping store, as the customer completes shopping and moves out of the store, the system processes payment for the items bought by the customer. In a cashier-less store, the system has to link the customer with a “user account” containing a preferred payment method provided by the customer.
  • As described above, the “identified subject” is anonymous because information about the joints and relationships among the joints is not stored as biometric identifying information linked to an individual or to a user account.
  • The system includes a matching engine 170 (hosted on the network node 103) to process signals received from mobile computing devices 120 (carried by the subjects) to match the identified subjects with user accounts. The matching can be performed by identifying locations of mobile devices executing client applications in the area of real space (e.g., the shopping store) and matching locations of mobile devices with locations of subjects, without use of personal identifying biometric information from the images.
  • The actual communication path to the network node 103 hosting the matching engine 170 through the network 181 can be point-to-point over public and/or private networks. The communications can occur over a variety of networks 181, e.g., private networks, VPN, MPLS circuit, or Internet, and can use appropriate application programming interfaces (APIs) and data interchange formats, e.g., Representational State Transfer (REST), JavaScript™ Object Notation (JSON), Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Java™ Message Service (JMS), and/or Java Platform Module System. All of the communications can be encrypted. The communication is generally over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN)), Session Initiation Protocol (SIP), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet, inclusive of the mobile Internet, via protocols such as EDGE, 3G, 4G LTE, Wi-Fi, and WiMAX. Additionally, a variety of authorization and authentication techniques, such as username/password, Open Authorization (OAuth), Kerberos, SecurID, digital certificates and more, can be used to secure the communications.
  • The technology disclosed herein can be implemented in the context of any computer-implemented system including a database system, a multi-tenant environment, or a relational database implementation like an Oracle™ compatible database implementation, an IBM DB2 Enterprise Server™ compatible relational database implementation, a MySQL™ or PostgreSQL™ compatible relational database implementation or a Microsoft SQL Server™ compatible relational database implementation or a NoSQL™ non-relational database implementation such as a Vampire™ compatible non-relational database implementation, an Apache Cassandra™ compatible non-relational database implementation, a BigTable™ compatible non-relational database implementation or an HBase™ or DynamoDB™ compatible non-relational database implementation. In addition, the technology disclosed can be implemented using different programming models like MapReduce™, bulk synchronous programming, MPI primitives, etc. or different scalable batch and stream management systems like Apache Storm™, Apache Spark™, Apache Kafka™, Apache Flink™, Truviso™, Amazon Elasticsearch Service™, Amazon Web Services™ (AWS), IBM Info-Sphere™, Borealis™, and Yahoo! S4™.
  • Camera Arrangement
  • The cameras 114 are arranged to track multi-joint subjects (or entities) in a three-dimensional (abbreviated as 3D) real space. In the example embodiment of the shopping store, the real space can include the area of the shopping store where items for sale are stacked in shelves. A point in the real space can be represented by an (x, y, z) coordinate system. Each point in the area of real space for which the system is deployed is covered by the fields of view of two or more cameras 114.
  • In a shopping store, the shelves and other inventory display structures can be arranged in a variety of manners, such as along the walls of the shopping store, or in rows forming aisles or a combination of the two arrangements. FIG. 2 shows an arrangement of shelves, forming an aisle 116 a, viewed from one end of the aisle 116 a. Two cameras, camera A 206 and camera B 208 are positioned over the aisle 116 a at a predetermined distance from a roof 230 and a floor 220 of the shopping store above the inventory display structures, such as shelves. The cameras 114 comprise cameras disposed over and having fields of view encompassing respective parts of the inventory display structures and floor area in the real space. The coordinates in real space of members of a set of candidate joints, identified as a subject, identify locations of the subject in the floor area. In FIG. 2, a subject 240 is holding the mobile computing device 118 a and standing on the floor 220 in the aisle 116 a. The mobile computing device can send and receive signals through the wireless network(s) 181. In one example, the mobile computing devices 120 communicate through a wireless network using for example a Wi-Fi protocol, or other wireless protocols like Bluetooth, ultra-wideband, and ZigBee, through wireless access points (WAP) 250 and 252.
  • In the example embodiment of the shopping store, the real space can include all of the floor 220 in the shopping store from which inventory can be accessed. Cameras 114 are placed and oriented such that areas of the floor 220 and shelves can be seen by at least two cameras. The cameras 114 also cover at least part of the shelves 202 and 204 and floor space in front of the shelves 202 and 204. Camera angles are selected to have both steep perspective, straight down, and angled perspectives that give more full body images of the customers. In one example embodiment, the cameras 114 are configured at an eight (8) foot height or higher throughout the shopping store.
  • In FIG. 2, the cameras 206 and 208 have overlapping fields of view, covering the space between a shelf A 202 and a shelf B 204 with overlapping fields of view 216 and 218, respectively. A location in the real space is represented as a (x, y, z) point of the real space coordinate system. “x” and “y” represent positions on a two-dimensional (2D) plane which can be the floor 220 of the shopping store. The value “z” is the height of the point above the 2D plane at floor 220 in one configuration.
  • FIG. 3 illustrates the aisle 116 a viewed from the top of FIG. 2, further showing an example arrangement of the positions of cameras 206 and 208 over the aisle 116 a. The cameras 206 and 208 are positioned closer to opposite ends of the aisle 116 a. The camera A 206 is positioned at a predetermined distance from the shelf A 202 and the camera B 208 is positioned at a predetermined distance from the shelf B 204. In another embodiment, in which more than two cameras are positioned over an aisle, the cameras are positioned at equal distances from each other. In such an embodiment, two cameras are positioned close to the opposite ends and a third camera is positioned in the middle of the aisle. It is understood that a number of different camera arrangements are possible.
  • Joints Data Structure
  • The image recognition engines 112 a-112 n receive the sequences of images from cameras 114 and process images to generate corresponding arrays of joints data structures. In one embodiment, the image recognition engines 112 a-112 n identify one of the 19 possible joints of each subject at each element of the image. The possible joints can be grouped in two categories: foot joints and non-foot joints. The 19th type of joint classification is for all non-joint features of the subject (i.e. elements of the image not classified as a joint).
  • Foot Joints:
      • Ankle joint (left and right)
  • Non-Foot Joints:
      • Neck
      • Nose
      • Eyes (left and right)
      • Ears (left and right)
      • Shoulders (left and right)
      • Elbows (left and right)
      • Wrists (left and right)
      • Hip (left and right)
      • Knees (left and right)
  • Not a joint
  • An array of joints data structures for a particular image classifies elements of the particular image by joint type, time of the particular image, and the coordinates of the elements in the particular image. In one embodiment, the image recognition engines 112 a-112 n are convolutional neural networks (CNN), the joint type is one of the 19 types of joints of the subjects, the time of the particular image is the timestamp of the image generated by the source camera 114 for the particular image, and the coordinates (x, y) identify the position of the element on a 2D image plane.
  • The output of the CNN is a matrix of confidence arrays for each image per camera. The matrix of confidence arrays is transformed into an array of joints data structures. A joints data structure 400 as shown in FIG. 4 is used to store the information of each joint. The joints data structure 400 identifies x and y positions of the element in the particular image in the 2D image space of the camera from which the image is received. A joint number identifies the type of joint identified. For example, in one embodiment, the values range from 1 to 19. A value of 1 indicates that the joint is a left ankle, a value of 2 indicates the joint is a right ankle and so on. The type of joint is selected using the confidence array for that element in the output matrix of CNN. For example, in one embodiment, if the value corresponding to the left-ankle joint is highest in the confidence array for that image element, then the value of the joint number is “1”.
  • A confidence number indicates the degree of confidence of the CNN in predicting that joint. A high confidence number means the CNN is confident in its prediction. An integer ID is assigned to the joints data structure to uniquely identify it. Following the above mapping, the output matrix of confidence arrays per image is converted into an array of joints data structures for each image. In one embodiment, the joints analysis includes performing a combination of k-nearest neighbors, mixture of Gaussians, and various image morphology transformations on each input image. The result comprises arrays of joints data structures which can be stored in the form of a bit mask in a ring buffer that maps image numbers to bit masks at each moment in time.
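  • The mapping from a matrix of confidence arrays to an array of joints data structures can be sketched as follows. The field names of the `Joint` class, the helper names, and the nested-list layout `matrix[y][x]` are assumptions for illustration. Only the meanings of joint numbers 1 (left ankle) and 2 (right ankle) come from the description, so the sketch does not name the remaining joint numbers.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Sequence

NUM_JOINT_TYPES = 19  # 18 joint types plus the "not a joint" class

_next_id = count(1)  # source of unique integer IDs for joints data structures


@dataclass
class Joint:
    id: int            # unique integer ID
    joint_number: int  # 1..19; 1 = left ankle, 2 = right ankle, and so on
    confidence: float  # confidence of the CNN in the selected joint type
    x: int             # column of the element in the 2D image plane
    y: int             # row of the element in the 2D image plane


def to_joint(confidences: Sequence[float], x: int, y: int) -> Joint:
    """Select the joint type with the highest confidence for one image element."""
    if len(confidences) != NUM_JOINT_TYPES:
        raise ValueError("expected one confidence value per joint type")
    best = max(range(NUM_JOINT_TYPES), key=lambda i: confidences[i])
    return Joint(next(_next_id), best + 1, confidences[best], x, y)


def to_joint_array(matrix: Sequence[Sequence[Sequence[float]]]) -> List[Joint]:
    """Convert a matrix of confidence arrays (matrix[y][x]) into an array of joints."""
    return [
        to_joint(confidences, x, y)
        for y, row in enumerate(matrix)
        for x, confidences in enumerate(row)
    ]
```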
  • Subject Tracking Engine
  • The tracking engine 110 is configured to receive arrays of joints data structures generated by the image recognition engines 112 a-112 n corresponding to images in sequences of images from cameras having overlapping fields of view. The arrays of joints data structures per image are sent by image recognition engines 112 a-112 n to the tracking engine 110 via the network(s) 181. The tracking engine 110 translates the coordinates of the elements in the arrays of joints data structures corresponding to images in different sequences into candidate joints having coordinates in the real space. The tracking engine 110 comprises logic to identify sets of candidate joints having coordinates in real space (constellations of joints) as subjects in the real space. In one embodiment, the tracking engine 110 accumulates arrays of joints data structures from the image recognition engines for all the cameras at a given moment in time and stores this information as a dictionary in the subject database 140, to be used for identifying a constellation of candidate joints. The dictionary can be arranged in the form of key-value pairs, where keys are camera ids and values are arrays of joints data structures from the camera. In such an embodiment, this dictionary is used in heuristics-based analysis to determine candidate joints and for assignment of joints to subjects. In such an embodiment, a high-level input, processing and output of the tracking engine 110 is illustrated in table 1. Details of the logic applied by the subject tracking engine 110 to create subjects by combining candidate joints and track movement of subjects in the area of real space are presented in U.S. Pat. No. 10,055,853, issued 21 Aug. 2018, titled, “Subject Identification and Tracking Using Image Recognition Engine” which is incorporated herein by reference.
  • TABLE 1
    Inputs, processing and outputs from subject tracking engine 110 in an example embodiment.
    Inputs:
      Arrays of joints data structures per image and, for each joints data structure:
        Unique ID
        Confidence number
        Joint number
        (x, y) position in image space
    Processing:
      Create joints dictionary
      Reproject joint positions in the fields of view of cameras with overlapping fields of view to candidate joints
    Output:
      List of identified subjects in the real space at a moment in time
  • Subject Data Structure
  • The subject tracking engine 110 uses heuristics to connect joints of subjects identified by the image recognition engines 112 a-112 n. In doing so, the subject tracking engine 110 creates new subjects and updates the locations of existing subjects by updating their respective joint locations. The subject tracking engine 110 uses triangulation techniques to project the locations of joints from 2D space coordinates (x, y) to 3D real space coordinates (x, y, z). FIG. 5 shows the subject data structure 500 used to store the subject. The subject data structure 500 stores the subject related data as a key-value dictionary. The key is a frame_number and the value is another key-value dictionary where key is the camera_id and value is a list of 18 joints (of the subject) with their locations in the real space. The subject data is stored in the subject database 140. Every new subject is also assigned a unique identifier that is used to access the subject's data in the subject database 140.
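  • The nested key-value layout of the subject data structure can be sketched as follows. The class name `Subject` and its methods are hypothetical and serve only to illustrate the layout described above: a dictionary keyed by frame_number whose values are dictionaries keyed by camera_id, each holding a list of 18 joint locations in real space.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) location in real space
NUM_JOINTS = 18                       # joints stored per subject


class Subject:
    def __init__(self, subject_id: int) -> None:
        self.subject_id = subject_id  # unique identifier used to look up the subject
        # frame_number -> camera_id -> list of 18 joint locations
        self.frames: Dict[int, Dict[str, List[Point3D]]] = {}

    def update(self, frame_number: int, camera_id: str, joints: Sequence[Point3D]) -> None:
        """Store the joint locations observed for one frame and one camera."""
        if len(joints) != NUM_JOINTS:
            raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
        self.frames.setdefault(frame_number, {})[camera_id] = list(joints)

    def latest_frame(self) -> Optional[int]:
        """Return the most recent frame_number recorded, or None if empty."""
        return max(self.frames) if self.frames else None
```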
  • In one embodiment, the system identifies joints of a subject and creates a skeleton of the subject. The skeleton is projected into the real space indicating the position and orientation of the subject in the real space. This is also referred to as “pose estimation” in the field of machine vision. In one embodiment, the system displays orientations and positions of subjects in the real space on a graphical user interface (GUI). In one embodiment, the image analysis is anonymous, i.e., a unique identifier assigned to a subject created through joints analysis does not identify personal identification of the subject as described above.
  • Matching Engine
  • The matching engine 170 includes logic to match the identified subjects with their respective user accounts by identifying locations of mobile devices (carried by the identified subjects) that are executing client applications in the area of real space. In one embodiment, the matching engine uses multiple techniques, independently or in combination, to match the identified subjects with the user accounts. The system can be implemented without maintaining biometric identifying information about users, so that biometric information about account holders is not exposed to security and privacy concerns raised by distribution of such information.
  • In one embodiment, a customer logs in to the system using a client application executing on a personal mobile computing device upon entering the shopping store, identifying an authentic user account to be associated with the client application on the mobile device. The system then sends a “semaphore” image selected from the set of unassigned semaphore images in the image database 160 to the client application executing on the mobile device. The semaphore image is unique to the client application in the shopping store as the same image is not freed for use with another client application in the store until the system has matched the user account to an identified subject. After that matching, the semaphore image becomes available for use again. The client application causes the mobile device to display the semaphore image, which display of the semaphore image is a signal emitted by the mobile device to be detected by the system. The matching engine 170 uses the image recognition engines 112 a-n or a separate image recognition engine (not shown in FIG. 1) to recognize the semaphore image and determine the location of the mobile computing device displaying the semaphore in the shopping store. The matching engine 170 matches the location of the mobile computing device to a location of an identified subject. The matching engine 170 then links the identified subject (stored in the subject database 140) to the user account (stored in the user account database 150) linked to the client application for the duration in which the subject is present in the shopping store. No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process.
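  • The assigned and available life cycle of semaphore images can be sketched as follows. The `SemaphorePool` class, its method names, and the string identifiers are hypothetical; the sketch only illustrates that an image handed to one client application is not reused until the corresponding user account has been matched to an identified subject.

```python
from typing import Dict, Iterable, Optional


class SemaphorePool:
    def __init__(self, image_ids: Iterable[str]) -> None:
        # Every semaphore image starts out available.
        self._status: Dict[str, str] = {image_id: "available" for image_id in image_ids}
        self._account_for_image: Dict[str, str] = {}

    def assign(self, account_id: str) -> Optional[str]:
        """After login, hand out an available image and mark it as assigned."""
        for image_id, status in self._status.items():
            if status == "available":
                self._status[image_id] = "assigned"
                self._account_for_image[image_id] = account_id
                return image_id
        return None  # no image is free; the caller can retry later

    def account_for(self, image_id: str) -> Optional[str]:
        """Look up which user account a recognized image was assigned to."""
        return self._account_for_image.get(image_id)

    def release(self, image_id: str) -> None:
        """After the account is matched to a subject, make the image available again."""
        if image_id not in self._status:
            raise KeyError(image_id)
        self._account_for_image.pop(image_id, None)
        self._status[image_id] = "available"
```

  • In this sketch, the image recognition step would call `account_for` with the identifier of the recognized semaphore image, and the matching step would call `release` once the account is linked to a subject.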
  • In other embodiments, the matching engine 170 uses other signals in the alternative or in combination from the mobile computing devices 120 to link the identified subjects to user accounts. Examples of such signals include a service location signal identifying the position of the mobile computing device in the area of the real space, speed and orientation of the mobile computing device obtained from the accelerometer and compass of the mobile computing device, etc.
  • Although embodiments are provided that do not maintain any biometric information about account holders, in some embodiments the system can use biometric information to assist in matching a not-yet-linked identified subject to a user account. For example, in one embodiment, the system stores the “hair color” of the customer in his or her user account record. During the matching process, the system might use, for example, the hair color of subjects as an additional input to disambiguate and match the subject to a user account. If the user has red hair and there is only one subject with red hair in the area of real space or in close proximity to the mobile computing device, then the system might select the subject with red hair to match the user account.
  • The flowcharts in FIGS. 6 to 9C present process steps of four techniques usable alone or in combination by the matching engine 170.
  • Semaphore Images
  • FIG. 6 is a flowchart 600 presenting process steps for a first technique for matching identified subjects in the area of real space with their respective user accounts. In the example of a shopping store, the subjects are customers (or shoppers) moving in the store in aisles between shelves and other open spaces. The process starts at step 602. As a subject enters the area of real space, the subject opens a client application on a mobile computing device and attempts to login. The system verifies the user credentials at step 604 (for example, by querying the user account database 150) and accepts login communication from the client application to associate an authenticated user account with the mobile computing device. The system determines that the user account of the client application is not yet linked to an identified subject. The system sends a semaphore image to the client application for display on the mobile computing device at step 606. Examples of semaphore images include various shapes of solid colors such as a red rectangle or a pink elephant, etc. A variety of images can be used as semaphores, preferably suited for high confidence recognition by the image recognition engine. Each semaphore image can have a unique identifier. The processing system includes logic to accept login communications from a client application on a mobile device identifying a user account before matching the user account to an identified subject in the area of real space, and after accepting login communications sends a selected semaphore image from the set of semaphore images to the client application on the mobile device.
  • In one embodiment, the system selects an available semaphore image from the image database 160 for sending to the client application. After sending the semaphore image to the client application, the system changes a status of the semaphore image in the image database 160 as “assigned” so that this image is not assigned to any other client application. The status of the image remains as “assigned” until the process to match the identified subject to the mobile computing device is complete. After matching is complete, the status can be changed to “available.” This allows for rotating use of a small set of semaphores in a given system, simplifying the image recognition problem.
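  • The rotating use of semaphore images described above can be sketched as a small pool that tracks the “available” and “assigned” statuses. This is a minimal sketch, not the implementation of the image database 160: the class name SemaphorePool, its methods, and the in-memory dictionaries are hypothetical names introduced for illustration.

```python
from typing import Dict, Optional


class SemaphorePool:
    """Tracks which semaphore images are 'available' or 'assigned'.

    Hypothetical in-memory stand-in for the status kept in the image database.
    """

    def __init__(self, image_ids):
        # Every image starts out available.
        self.status: Dict[str, str] = {i: "available" for i in image_ids}
        self.assigned_to: Dict[str, str] = {}  # image_id -> user_account

    def assign(self, user_account: str) -> Optional[str]:
        """Pick an available image, mark it assigned, and return its id."""
        for image_id, state in self.status.items():
            if state == "available":
                self.status[image_id] = "assigned"
                self.assigned_to[image_id] = user_account
                return image_id
        return None  # no free image right now; the caller may retry later

    def release(self, image_id: str) -> None:
        """Make the image available again once matching is complete."""
        self.status[image_id] = "available"
        self.assigned_to.pop(image_id, None)


pool = SemaphorePool(["red_rectangle", "pink_elephant"])
first = pool.assign("account_a")
second = pool.assign("account_b")
third = pool.assign("account_c")   # both images are in use at this point
pool.release(first)                # matching for account_a has finished
fourth = pool.assign("account_c")  # the released image is reused
```

  • Because an image is released only after matching completes, no two unmatched client applications display the same image at the same time, which is what keeps the set of semaphores small.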
  • The client application receives the semaphore image and displays it on the mobile computing device. In one embodiment, the client application also increases the brightness of the display to increase the image visibility. The image is captured by one or more cameras 114 and sent to an image processing engine, referred to as WhatCNN. The system uses WhatCNN at step 608 to recognize the semaphore images displayed on the mobile computing device. In one embodiment, WhatCNN is a convolutional neural network trained to process the specified bounding boxes in the images to generate a classification of hands of the identified subjects. One trained WhatCNN processes image frames from one camera. In the example embodiment of the shopping store, for each hand joint in each image frame, the WhatCNN identifies whether the hand joint is empty. The WhatCNN also identifies a semaphore image identifier (in the image database 160) or an SKU (stock keeping unit) number of the inventory item in the hand joint, a confidence value indicating the item in the hand joint is a non-SKU item (i.e. it does not belong to the shopping store inventory) and a context of the hand joint location in the image frame.
  • As mentioned above, two or more cameras with overlapping fields of view capture images of subjects in real space. Joints of a single subject can appear in image frames of multiple cameras in a respective image channel. A WhatCNN model per camera identifies semaphore images (displayed on mobile computing devices) in hands (represented by hand joints) of subjects. A coordination logic combines the outputs of WhatCNN models into a consolidated data structure listing identifiers of semaphore images in left hand (referred to as left_hand_classid) and right hand (right_hand_classid) of identified subjects (step 610). The system stores this information in a dictionary mapping subject_id to left_hand_classid and right_hand_classid along with a timestamp, including locations of the joints in real space. The details of WhatCNN are presented in U.S. patent application Ser. No. 15/907,112, filed 27 Feb. 2018, titled, “Item Put and Take Detection Using Image Recognition” which is incorporated herein by reference as if fully set forth herein.
  • At step 612, the system checks if the semaphore image sent to the client application is recognized by the WhatCNN by iterating over the output of the WhatCNN models for both hands of all identified subjects. If the semaphore image is not recognized, the system sends a reminder at step 614 to the client application to display the semaphore image on the mobile computing device and repeats process steps 608 to 612. Otherwise, if the semaphore image is recognized by WhatCNN, the system matches a user_account (from the user account database 150) associated with the client application to the subject_id (from the subject database 140) of the identified subject holding the mobile computing device (step 616). In one embodiment, the system maintains this mapping (subject_id-user_account) for as long as the subject is present in the area of real space. The process ends at step 618.
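  • Steps 610 to 616 can be illustrated with the consolidated dictionary that maps subject_id to left_hand_classid and right_hand_classid. The sketch below assumes that dictionary is already populated by the coordination logic; the function names and the return values "remind" and "linked" are hypothetical labels for the two branches at step 612.

```python
from typing import Dict, Optional


def find_subject_showing(image_id: str,
                         hand_classes: Dict[str, Dict[str, Optional[str]]]) -> Optional[str]:
    """Return the subject_id whose left or right hand holds the semaphore image."""
    for subject_id, hands in hand_classes.items():
        if image_id in (hands.get("left_hand_classid"),
                        hands.get("right_hand_classid")):
            return subject_id
    return None


def check_in(user_account: str, image_id: str,
             hand_classes: Dict[str, Dict[str, Optional[str]]],
             links: Dict[str, str]) -> str:
    """Link the account to the subject showing the image, or ask for a reminder."""
    subject_id = find_subject_showing(image_id, hand_classes)
    if subject_id is None:
        return "remind"               # step 614: ask the app to display the image
    links[subject_id] = user_account  # step 616: subject_id-user_account mapping
    return "linked"


# Consolidated WhatCNN output (step 610), with illustrative class ids.
hand_classes = {
    "subject_1": {"left_hand_classid": None, "right_hand_classid": "sku_1234"},
    "subject_2": {"left_hand_classid": "pink_elephant", "right_hand_classid": None},
}
links: Dict[str, str] = {}
result = check_in("account_b", "pink_elephant", hand_classes, links)
```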
  • Service Location
  • The flowchart 700 in FIG. 7 presents process steps for a second technique for matching identified subjects with user accounts. This technique uses radio signals emitted by the mobile devices indicating the location of the mobile devices. The process starts at step 702. The system accepts login communication from a client application on a mobile computing device, as described above in step 604, to link an authenticated user account to the mobile computing device. At step 706, the system receives service location information from the mobile devices in the area of real space at regular intervals. In one embodiment, latitude and longitude coordinates of the mobile computing device emitted from a global positioning system (GPS) receiver of the mobile computing device are used by the system to determine the location. In one embodiment, the service location of the mobile computing device obtained from GPS coordinates has an accuracy of between 1 and 3 meters. In another embodiment, the service location of a mobile computing device obtained from GPS coordinates has an accuracy of between 1 and 5 meters.
  • Other techniques can be used in combination with the above technique or independently to determine the service location of the mobile computing device. Examples of such techniques include using signal strengths from different wireless access points (WAP) such as 250 and 252 shown in FIGS. 2 and 3 as an indication of how far the mobile computing device is from respective access points. The system then uses known locations of wireless access points (WAP) 250 and 252 to triangulate and determine the position of the mobile computing device in the area of real space. Other types of signals (such as Bluetooth, ultra-wideband, and ZigBee) emitted by the mobile computing devices can also be used to determine a service location of the mobile computing device.
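  • One way to turn signal strengths into a position is sketched below. The description does not specify the computation, so several assumptions are made: signal strength is converted to distance with the log-distance path-loss model, the calibration values rssi_at_1m and path_loss_exponent are illustrative, and a third access point is assumed in addition to WAPs 250 and 252, because two distances alone do not fix a unique position in a plane.

```python
import math


def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Estimate distance in meters using the log-distance path-loss model.

    rssi_at_1m and path_loss_exponent are assumed calibration values.
    """
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exponent))


def trilaterate(anchors, distances):
    """Solve for (x, y) from three known access point positions and distances."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two leaves a
    # 2x2 linear system in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("access points must not be collinear")
    # Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Access points at assumed known positions (meters) and a device at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
distances = [math.dist((3.0, 4.0), a) for a in anchors]
position = trilaterate(anchors, distances)
```

  • In practice measured distances are noisy, so a least-squares fit over more than three access points would be a natural extension of this sketch.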
  • The system monitors the service locations of mobile devices with client applications that are not yet linked to an identified subject at step 708 at regular intervals such as every second. At step 708, the system determines the distance of a mobile computing device with an unmatched user account from all other mobile computing devices with unmatched user accounts. The system compares this distance with a pre-determined threshold distance “d” such as 3 meters. If the mobile computing device is away from all other mobile devices with unmatched user accounts by at least “d” distance (step 710), the system determines a nearest not yet linked subject to the mobile computing device (step 714). The location of the identified subject is obtained from the output of the JointsCNN at step 712. In one embodiment the location of the subject obtained from the JointsCNN is more accurate than the service location of the mobile computing device. At step 616, the system performs the same process as described above in flowchart 600 to match the subject_id of the identified subject with the user_account of the client application. The process ends at a step 718.
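  • Steps 708 to 714 can be sketched as follows: a device is considered only if it is at least “d” away from every other unmatched device, and it is then paired with the nearest not yet linked subject. The function names and the use of two-dimensional coordinates in meters are assumptions made for illustration; subject locations stand in for the output of the JointsCNN.

```python
import math


def isolated_devices(device_locations, d=3.0):
    """Accounts whose device is at least d from all other unmatched devices."""
    isolated = []
    for account, location in device_locations.items():
        others = (loc for acct, loc in device_locations.items() if acct != account)
        if all(math.dist(location, other) >= d for other in others):
            isolated.append(account)
    return isolated


def nearest_subject(device_location, subject_locations):
    """Return the subject_id closest to the device, or None if none remain."""
    if not subject_locations:
        return None
    return min(subject_locations,
               key=lambda s: math.dist(device_location, subject_locations[s]))


def match_by_service_location(device_locations, subject_locations, d=3.0):
    """Map subject_id -> user_account for devices that are unambiguous."""
    links = {}
    remaining = dict(subject_locations)  # not yet linked subjects
    for account in isolated_devices(device_locations, d):
        subject_id = nearest_subject(device_locations[account], remaining)
        if subject_id is not None:
            links[subject_id] = account
            remaining.pop(subject_id)
    return links


devices = {"account_a": (0.0, 0.0), "account_b": (10.0, 0.0), "account_c": (11.0, 0.0)}
subjects = {"subject_1": (0.5, 0.2), "subject_2": (10.2, 0.0), "subject_3": (11.1, 0.0)}
links = match_by_service_location(devices, subjects)
```

  • In this example only account_a is linked; the devices of account_b and account_c are 1 meter apart, which is below the threshold, so they are left for a later interval or another technique.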
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user account in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • Speed and Orientation
  • The flowchart 800 in FIG. 8 presents process steps for a third technique for matching identified subjects with user accounts. This technique uses signals emitted by an accelerometer of the mobile computing devices to match identified subjects with client applications. The process starts at step 802. At step 604, the system accepts login communication from the client application as described above in the first and second techniques. At step 806, the system receives signals emitted from the mobile computing devices carrying data from accelerometers on the mobile computing devices in the area of real space, which can be sent at regular intervals. At step 808, the system calculates an average velocity of all mobile computing devices with unmatched user accounts.
  • The accelerometers provide acceleration of mobile computing devices along the three axes (x, y, z). In one embodiment, the velocity is calculated by taking the acceleration values at small time intervals (e.g., every 10 milliseconds) to calculate the current velocity at time “t”, i.e., v_t = v_0 + a·t, where v_0 is the initial velocity, a is the measured acceleration, and t is the length of the time interval. In one embodiment, v_0 is initialized as “0” and subsequently, for every time t+1, v_t becomes v_0. The velocities along the three axes are then combined to determine an overall velocity of the mobile computing device at time “t.” Finally, at step 808, the system calculates moving averages of velocities of all mobile computing devices over a larger period of time, such as 3 seconds, which is long enough for the walking gait of an average person, or over longer periods of time.
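  • The velocity update and moving average can be sketched as below. The function names are hypothetical, the 10 millisecond interval follows the example above, and combining the three axes into a magnitude is one possible reading of “overall velocity,” labeled here as an assumption.

```python
import math
from collections import deque


def integrate_velocity(accel_samples, dt=0.010):
    """Accumulate v_t = v_0 + a*dt per axis, starting from v_0 = 0.

    accel_samples is a sequence of (ax, ay, az) readings in m/s^2,
    one per interval of dt seconds.
    """
    vx = vy = vz = 0.0
    velocities = []
    for ax, ay, az in accel_samples:
        vx += ax * dt  # the previous v_t becomes v_0 for the next interval
        vy += ay * dt
        vz += az * dt
        velocities.append((vx, vy, vz))
    return velocities


def speed(velocity):
    """Combine the three axes into one magnitude (an assumed combination)."""
    return math.sqrt(sum(component * component for component in velocity))


def moving_average(values, window):
    """Average of the most recent `window` values at each step."""
    recent = deque(maxlen=window)
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# One second of constant 1 m/s^2 acceleration along x, sampled every 10 ms.
velocities = integrate_velocity([(1.0, 0.0, 0.0)] * 100)
speeds = [speed(v) for v in velocities]
# A 3 second window corresponds to 300 samples at 10 ms per sample.
smoothed = moving_average(speeds, window=300)
```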
  • At step 810, the system calculates the Euclidean distance (also referred to as the L2 norm) between the velocities of every pairing of a mobile computing device with an unmatched client application and a not yet linked identified subject. The velocities of subjects are derived from changes in positions of their joints with respect to time, obtained from joints analysis and stored in respective subject data structures 500 with timestamps. In one embodiment, a location of the center of mass of each subject is determined using the joints analysis. The velocity, or other derivative, of the center of mass location data of the subject is used for comparison with velocities of mobile computing devices. For each subject_id-user_account pair, if the value of the Euclidean distance between their respective velocities is less than a threshold_0, a score_counter for the subject_id-user_account pair is incremented. The above process is performed at regular time intervals, thus updating the score_counter for each subject_id-user_account pair.
  • At regular time intervals (e.g., every one second), the system compares the score_counter values for pairs of every unmatched user account with every not yet linked identified subject (step 812). If the highest score is greater than threshold_1 (step 814), the system calculates the difference between the highest score and the second highest score (for pair of same user account with a different subject) at step 816. If the difference is greater than threshold_2, the system selects the mapping of user_account to the identified subject at step 818 and follows the same process as described above in step 616. The process ends at a step 820.
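  • Steps 810 to 818 can be sketched with a score_counter dictionary keyed by (subject_id, user_account). The function names are hypothetical, velocities are assumed to be three-component vectors, and treating a missing second-highest score as 0 is an assumption for the case where only one subject has scored.

```python
import math
from collections import defaultdict


def update_scores(score_counter, device_velocities, subject_velocities, threshold_0):
    """Step 810: increment the counter for pairs with similar velocities."""
    for account, device_v in device_velocities.items():
        for subject_id, subject_v in subject_velocities.items():
            if math.dist(device_v, subject_v) < threshold_0:
                score_counter[(subject_id, account)] += 1


def select_match(score_counter, user_account, threshold_1, threshold_2):
    """Steps 812-818: pick the subject only if it wins by a clear margin."""
    scores = sorted(
        ((score, subject_id)
         for (subject_id, account), score in score_counter.items()
         if account == user_account),
        reverse=True,
    )
    if not scores:
        return None
    best_score, best_subject = scores[0]
    second_score = scores[1][0] if len(scores) > 1 else 0
    if best_score > threshold_1 and best_score - second_score > threshold_2:
        return best_subject
    return None


score_counter = defaultdict(int)
for _ in range(5):  # five regular time intervals with illustrative velocities
    update_scores(
        score_counter,
        device_velocities={"account_a": (1.0, 0.0, 0.0)},
        subject_velocities={"subject_1": (1.05, 0.0, 0.0),
                            "subject_2": (0.0, 1.0, 0.0)},
        threshold_0=0.2,
    )
match = select_match(score_counter, "account_a", threshold_1=3, threshold_2=2)
```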
  • In another embodiment, when JointsCNN recognizes a hand holding a mobile computing device, the velocity of the hand (of the identified subject) holding the mobile computing device is used in above process instead of using the velocity of the center of mass of the subject. This improves performance of the matching algorithm. To determine values of the thresholds (threshold_0, threshold_1, threshold_2), the system uses training data with labels assigned to the images. During training, various combinations of the threshold values are used and the output of the algorithm is matched with ground truth labels of images to determine its performance. The values of thresholds that result in best overall assignment accuracy are selected for use in production (or inference).
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • Network Ensemble
  • A network ensemble is a learning paradigm where many networks are jointly used to solve a problem. Ensembles typically improve the prediction accuracy obtained from a single classifier by a factor that validates the effort and cost associated with learning multiple models. In the fourth technique to match user accounts to not yet linked identified subjects, the second and third techniques presented above are jointly used in an ensemble (or network ensemble). To use the two techniques in an ensemble, relevant features are extracted from application of the two techniques. FIGS. 9A-9C present process steps (in a flowchart 900) for extracting features, training the ensemble and using the trained ensemble to predict match of a user account to a not yet linked identified subject.
  • FIG. 9A presents the process steps for generating features using the second technique that uses service location of mobile computing devices. The process starts at step 902. At step 904, a Count_X for the second technique is calculated, indicating the number of times a service location of a mobile computing device with an unmatched user account is X meters away from all other mobile computing devices with unmatched user accounts. At step 906, the Count_X values of all tuples of subject_id-user_account pairs are stored by the system for use by the ensemble. In one embodiment, multiple values of X are used, e.g., 1 m, 2 m, 3 m, 4 m, 5 m (steps 908 and 910). For each value of X, the count is stored as a dictionary that maps tuples of subject_id-user_account to a count score, which is an integer. In the example where 5 values of X are used, five such dictionaries are created at step 912. The process ends at step 914.
  • FIG. 9B presents the process steps for generating features using the third technique that uses velocities of mobile computing devices. The process starts at step 920. At step 922, a Count_Y for the third technique is determined, which is equal to the score_counter value indicating the number of times the Euclidean distance between the velocities of a particular subject_id-user_account pair is below a threshold_0. At step 924, the Count_Y values of all tuples of subject_id-user_account pairs are stored by the system for use by the ensemble. In one embodiment, multiple values of threshold_0 are used, e.g., five different values (steps 926 and 928). For each value of threshold_0, the Count_Y is stored as a dictionary that maps tuples of subject_id-user_account to a count score, which is an integer. In the example where 5 values of threshold_0 are used, five such dictionaries are created at step 930. The process ends at step 932.
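  • The per-threshold dictionaries can be sketched as below for Count_Y; Count_X dictionaries would have the same shape, one per value of X. The function name, the layout of velocity_history, and the threshold values are assumptions made for illustration.

```python
import math
from collections import defaultdict


def count_y_dictionaries(velocity_history, thresholds):
    """Build one Count_Y dictionary per threshold_0 value.

    velocity_history is a list with one (device_velocities, subject_velocities)
    entry per time interval; each maps an id to a velocity vector.
    Returns threshold -> {(subject_id, user_account): integer count}.
    """
    dictionaries = {t: defaultdict(int) for t in thresholds}
    for device_velocities, subject_velocities in velocity_history:
        for account, device_v in device_velocities.items():
            for subject_id, subject_v in subject_velocities.items():
                distance = math.dist(device_v, subject_v)
                for t in thresholds:
                    if distance < t:
                        dictionaries[t][(subject_id, account)] += 1
    return dictionaries


history = [
    ({"account_a": (1.0, 0.0, 0.0)},
     {"subject_1": (1.1, 0.0, 0.0), "subject_2": (1.4, 0.0, 0.0)}),
] * 3  # the same illustrative readings over three intervals
count_y = count_y_dictionaries(history, thresholds=[0.2, 0.5])
```

  • A looser threshold counts more pairs, so the set of dictionaries gives the ensemble a view of how closely each pair agrees rather than a single yes or no.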
  • The features from the second and third techniques are then used to create a labeled training data set, which is used to train the network ensemble. To collect such a data set, multiple subjects (shoppers) walk in an area of real space such as a shopping store. The images of these subjects are collected using cameras 114 at regular time intervals. Human labelers review the images and assign correct identifiers (subject_id and user_account) to the images in the training data. The process is described in the flowchart 900 presented in FIG. 9C. The process starts at step 940. At step 942, features in the form of Count_X and Count_Y dictionaries obtained from the second and third techniques are compared with the corresponding true labels assigned by the human labelers on the images to identify correct matches (true) and incorrect matches (false) of subject_id and user_account.
  • As we have only two categories of outcome for each mapping of subject_id and user_account: true or false, a binary classifier is trained using this training data set (step 944). Commonly used methods for binary classification include decision trees, random forest, neural networks, gradient boost, support vector machines, etc. A trained binary classifier is used to categorize new probabilistic observations as true or false. The trained binary classifier is used in production (or inference) by giving as input Count_X and Count_Y dictionaries for subject_id-user_account tuples. The trained binary classifier classifies each tuple as true or false at a step 946. The process ends at a step 948.
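  • The training and inference steps (944 and 946) can be illustrated with a very small classifier. The sketch below swaps in a perceptron, a single-layer neural-network-style classifier that is not one of the methods named above, purely because it fits in a few lines without external libraries. The function names, the use of one Count_X and one Count_Y dictionary per pair, and the count values are all assumptions.

```python
def feature_vector(pair, count_x, count_y):
    """Look up the counts for one (subject_id, user_account) tuple."""
    return [count_x.get(pair, 0), count_y.get(pair, 0)]


def train_perceptron(features, labels, max_epochs=1000):
    """Train a perceptron; labels are True (match) or False (no match)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, is_match in zip(features, labels):
            target = 1.0 if is_match else -1.0
            score = bias + sum(w * xi for w, xi in zip(weights, x))
            if target * score <= 0:  # misclassified: nudge toward the target
                weights = [w + target * xi for w, xi in zip(weights, x)]
                bias += target
                mistakes += 1
        if mistakes == 0:  # every training tuple is classified correctly
            break
    return weights, bias


def classify(weights, bias, x):
    """Return True if the tuple is predicted to be a correct match."""
    return bias + sum(w * xi for w, xi in zip(weights, x)) > 0


# Illustrative counts and human-assigned labels for six tuples.
count_x = {("s1", "a"): 9, ("s2", "b"): 10, ("s3", "c"): 8,
           ("s1", "b"): 1, ("s2", "c"): 0, ("s3", "a"): 2}
count_y = {("s1", "a"): 8, ("s2", "b"): 9, ("s3", "c"): 9,
           ("s1", "b"): 0, ("s2", "c"): 2, ("s3", "a"): 1}
truth = {("s1", "a"): True, ("s2", "b"): True, ("s3", "c"): True,
         ("s1", "b"): False, ("s2", "c"): False, ("s3", "a"): False}

pairs = list(truth)
features = [feature_vector(p, count_x, count_y) for p in pairs]
labels = [truth[p] for p in pairs]
weights, bias = train_perceptron(features, labels)
predictions = {p: classify(weights, bias, f) for p, f in zip(pairs, features)}
```

  • A production system would more likely use one of the listed methods with all of the Count_X and Count_Y dictionaries as input features.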
  • If there is an unmatched mobile computing device in the area of real space after application of the above four techniques, the system sends a notification to the mobile computing device to open the client application. If the user accepts the notification, the client application will display a semaphore image as described in the first technique. The system will then follow the steps in the first technique to check in the shopper (match subject_id to user_account). If the customer does not respond to the notification, the system will send a notification to an employee in the shopping store indicating the location of the unmatched customer. The employee can then walk to the customer and ask the customer to open the client application on the mobile computing device to check in to the system using a semaphore image.
  • No biometric identifying information is used for matching the identified subject with the user account, and none is stored in support of this process. That is, there is no information in the sequences of images used to compare with stored biometric information for the purposes of matching the identified subjects with user accounts in support of this process. Thus, this logic to match the identified subjects with user accounts operates without use of personal identifying biometric information associated with the user accounts.
  • Architecture
  • An example architecture of a system in which the four techniques presented above are applied to match a user_account to a not yet linked subject in an area of real space is presented in FIG. 10. Because FIG. 10 is an architectural diagram, certain details are omitted to improve the clarity of description. The system presented in FIG. 10 receives image frames from a plurality of cameras 114. As described above, in one embodiment, the cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate. Images captured in all the cameras covering an area of real space at the same time, or close in time, are synchronized in the sense that the synchronized images can be identified in the processing engines as representing different views at a moment in time of subjects having fixed positions in the real space. The images are stored in a circular buffer of image frames per camera 1002.
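  • The per-camera circular buffer of image frames can be sketched with a bounded deque. The class name, the capacity, the timestamp tolerance, and the use of strings as stand-ins for image frames are assumptions for illustration.

```python
from collections import deque


class CameraFrameBuffers:
    """One fixed-size circular buffer of (timestamp, frame) per camera."""

    def __init__(self, camera_ids, capacity):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.buffers = {cid: deque(maxlen=capacity) for cid in camera_ids}

    def push(self, camera_id, timestamp, frame):
        self.buffers[camera_id].append((timestamp, frame))

    def synchronized(self, timestamp, tolerance):
        """Return camera_id -> frame captured closest to timestamp.

        Cameras with no frame within `tolerance` seconds are left out.
        """
        views = {}
        for camera_id, buffer in self.buffers.items():
            if not buffer:
                continue
            ts, frame = min(buffer, key=lambda item: abs(item[0] - timestamp))
            if abs(ts - timestamp) <= tolerance:
                views[camera_id] = frame
        return views


buffers = CameraFrameBuffers(["cam1", "cam2"], capacity=2)
buffers.push("cam1", 0.000, "a0")
buffers.push("cam1", 0.033, "a1")
buffers.push("cam1", 0.066, "a2")  # "a0" is overwritten
buffers.push("cam2", 0.001, "b0")
buffers.push("cam2", 0.034, "b1")
views = buffers.synchronized(0.033, tolerance=0.005)
```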
  • A “subject identification” subsystem 1004 (also referred to as first image processors) processes image frames received from cameras 114 to identify and track subjects in the real space. The first image processors include subject image recognition engines such as the JointsCNN above.
  • A “semantic diffing” subsystem 1006 (also referred to as second image processors) includes background image recognition engines, which receive corresponding sequences of images from the plurality of cameras and recognize semantically significant differences in the background (i.e. inventory display structures like shelves) as they relate to puts and takes of inventory items, for example, over time in the images from each camera. The second image processors receive output of the subject identification subsystem 1004 and image frames from cameras 114 as input. Details of “semantic diffing” subsystem are presented in U.S. patent application Ser. No. 15/945,466, filed 4 Apr. 2018, titled, “Predicting Inventory Events using Semantic Diffing,” and U.S. patent application Ser. No. 15/945,473, filed 4 Apr. 2018, titled, “Predicting Inventory Events using Foreground/Background Processing,” both of which are incorporated herein by reference as if fully set forth herein. The second image processors process identified background changes to make a first set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects. The first set of detections are also referred to as background detections of puts and takes of inventory items. In the example of a shopping store, the first detections identify inventory items taken from the shelves or put on the shelves by customers or employees of the store. The semantic diffing subsystem includes the logic to associate identified background changes with identified subjects.
  • A “region proposals” subsystem 1008 (also referred to as third image processors) includes foreground image recognition engines, receives corresponding sequences of images from the plurality of cameras 114, and recognizes semantically significant objects in the foreground (i.e. shoppers, their hands and inventory items) as they relate to puts and takes of inventory items, for example, over time in the images from each camera. The region proposals subsystem 1008 also receives output of the subject identification subsystem 1004. The third image processors process sequences of images from cameras 114 to identify and classify foreground changes represented in the images in the corresponding sequences of images. The third image processors process identified foreground changes to make a second set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects. The second set of detections are also referred to as foreground detection of puts and takes of inventory items. In the example of a shopping store, the second set of detections identifies takes of inventory items and puts of inventory items on inventory display structures by customers and employees of the store. The details of a region proposal subsystem are presented in U.S. patent application Ser. No. 15/907,112, filed 27 Feb. 2018, titled, “Item Put and Take Detection Using Image Recognition” which is incorporated herein by reference as if fully set forth herein.
  • The system described in FIG. 10 includes a selection logic 1010 to process the first and second sets of detections to generate log data structures including lists of inventory items for identified subjects. For a take or put in the real space, the selection logic 1010 selects the output from either the semantic diffing subsystem 1006 or the region proposals subsystem 1008. In one embodiment, the selection logic 1010 uses a confidence score generated by the semantic diffing subsystem for the first set of detections and a confidence score generated by the region proposals subsystem for a second set of detections to make the selection. The output of the subsystem with a higher confidence score for a particular detection is selected and used to generate a log data structure 1012 (also referred to as a shopping cart data structure) including a list of inventory items (and their quantities) associated with identified subjects.
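  • The selection logic 1010 and the log data structure 1012 can be sketched as follows. The detection fields (subject_id, sku, quantity, confidence), the sign convention for quantity, and the tie-breaking rule are assumptions; the description only states that the output with the higher confidence score is selected.

```python
def select_detection(semantic_diffing, region_proposals):
    """Keep the detection with the higher confidence score.

    On a tie the semantic diffing output is kept (an arbitrary assumption).
    """
    if semantic_diffing["confidence"] >= region_proposals["confidence"]:
        return semantic_diffing
    return region_proposals


def update_log(log, detection):
    """Apply a take (+quantity) or put (-quantity) to the subject's item list."""
    cart = log.setdefault(detection["subject_id"], {})
    sku = detection["sku"]
    cart[sku] = cart.get(sku, 0) + detection["quantity"]
    if cart[sku] <= 0:
        del cart[sku]  # the item went back on the shelf


log = {}  # subject_id -> {sku: quantity}
background = {"subject_id": "subject_1", "sku": "123", "quantity": 1, "confidence": 0.6}
foreground = {"subject_id": "subject_1", "sku": "456", "quantity": 1, "confidence": 0.9}
update_log(log, select_detection(background, foreground))
```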
  • To process a payment for the items in the log data structure 1012, the system in FIG. 10 applies the four techniques for matching the identified subject (associated with the log data) to a user_account which includes a payment method such as credit card or bank account information. In one embodiment, the four techniques are applied sequentially as shown in the figure. If the process steps in flowchart 600 for the first technique produce a match between the subject and the user account, then this information is used by a payment processor 1036 to charge the customer for the inventory items in the log data structure. Otherwise (step 1028), the process steps presented in flowchart 700 for the second technique are followed and the user account is used by the payment processor 1036. If the second technique is unable to match the user account with a subject (1030), then the process steps presented in flowchart 800 for the third technique are followed. If the third technique is unable to match the user account with a subject (1032), then the process steps in flowchart 900 for the fourth technique are followed to match the user account with a subject.
  • If the fourth technique is unable to match the user account with a subject (1034), the system sends a notification to the mobile computing device to open the client application and follow the steps presented in the flowchart 600 for the first technique. If the customer does not respond to the notification, the system will send a notification to an employee in the shopping store indicating the location of the unmatched customer. The employee can then walk to the customer and ask the customer to open the client application on the mobile computing device to check in to the system using a semaphore image (step 1040). It is understood that in other embodiments of the architecture presented in FIG. 10, fewer than four techniques can be used to match the user accounts to not yet linked identified subjects.
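  • The sequential application of the techniques, with the notification fallback, can be sketched as a simple cascade. Each technique is represented by a hypothetical callable that returns a subject_id or None; the function name, the technique labels, and the "notify" result are assumptions for illustration.

```python
def match_account(user_account, techniques):
    """Try each technique in order and stop at the first match.

    techniques is a list of (name, callable) pairs; each callable takes a
    user_account and returns a subject_id, or None if it cannot match.
    """
    for name, technique in techniques:
        subject_id = technique(user_account)
        if subject_id is not None:
            return name, subject_id
    # No technique matched: fall back to notifying the device, then an employee.
    return "notify", None


# Stand-in techniques: only the third one finds a match in this example.
techniques = [
    ("semaphore_image", lambda account: None),
    ("service_location", lambda account: None),
    ("speed_and_orientation", lambda account: "subject_7"),
    ("network_ensemble", lambda account: "subject_9"),
]
outcome = match_account("account_a", techniques)
```

  • Passing a shorter list corresponds to the embodiments in which fewer than four techniques are used.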
  • Network Configuration
  • FIG. 11 presents an architecture of a network hosting the matching engine 170 which is hosted on the network node 103. The system includes a plurality of network nodes 103, 101 a-101 n, and 102 in the illustrated embodiment. In such an embodiment, the network nodes are also referred to as processing platforms. Processing platforms (network nodes) 103, 101 a-101 n, and 102 and cameras 1112, 1114, 1116, . . . 1118 are connected to network(s) 1181.
  • FIG. 11 shows a plurality of cameras 1112, 1114, 1116, . . . 1118 connected to the network(s). A large number of cameras can be deployed in particular systems. In one embodiment, the cameras 1112 to 1118 are connected to the network(s) 1181 using Ethernet-based connectors 1122, 1124, 1126, and 1128, respectively. In such an embodiment, the Ethernet-based connectors have a data transfer speed of 1 gigabit per second, also referred to as Gigabit Ethernet. It is understood that in other embodiments, cameras 114 are connected to the network using other types of network connections which can have a faster or slower data transfer rate than Gigabit Ethernet. Also, in alternative embodiments, a set of cameras can be connected directly to each processing platform, and the processing platforms can be coupled to a network.
  • Storage subsystem 1130 stores the basic programming and data constructs that provide the functionality of certain embodiments of the present invention. For example, the various modules implementing the functionality of the matching engine 170 may be stored in storage subsystem 1130. The storage subsystem 1130 is an example of a computer readable memory comprising a non-transitory data storage medium, having computer instructions stored in the memory executable by a computer to perform all or any combination of the data processing and image processing functions described herein, including logic to link subjects in an area of real space with a user account, to determine locations of identified subjects represented in the images, match the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space by processes as described herein. In other examples, the computer instructions can be stored in other types of memory, including portable memory, that comprise a non-transitory data storage medium or media, readable by a computer.
  • These software modules are generally executed by a processor subsystem 1150. A host memory subsystem 1132 typically includes a number of memories including a main random access memory (RAM) 1134 for storage of instructions and data during program execution and a read-only memory (ROM) 1136 in which fixed instructions are stored. In one embodiment, the RAM 1134 is used as a buffer for storing subject_id-user_account tuples matched by the matching engine 170.
  • A file storage subsystem 1140 provides persistent storage for program and data files. In an example embodiment, the storage subsystem 1140 includes four 120 Gigabyte (GB) solid state disks (SSD) in a RAID 0 (redundant array of independent disks) arrangement identified by a numeral 1142. In the example embodiment, user account data in the user account database 150 and image data in the image database 160 which is not in RAM is stored in RAID 0. In the example embodiment, the hard disk drive (HDD) 1146 is slower in access speed than the RAID 0 1142 storage. The solid state disk (SSD) 1144 contains the operating system and related files for the matching engine 170.
  • In an example configuration, three cameras 1112, 1114, and 1116, are connected to the processing platform (network node) 103. Each camera has a dedicated graphics processing unit GPU 1 1162, GPU 2 1164, and GPU 3 1166, to process images sent by the camera. It is understood that fewer than or more than three cameras can be connected per processing platform. Accordingly, fewer or more GPUs are configured in the network node so that each camera has a dedicated GPU for processing the image frames received from the camera. The processor subsystem 1150, the storage subsystem 1130 and the GPUs 1162 , 1164, and 1166 communicate using the bus subsystem 1154.
  • A network interface subsystem 1170 is connected to the bus subsystem 1154 forming part of the processing platform (network node) 103. Network interface subsystem 1170 provides an interface to outside networks, including an interface to corresponding interface devices in other computer systems. The network interface subsystem 1170 allows the processing platform to communicate over the network either by using cables (or wires) or wirelessly. The wireless radio signals 1175 emitted by the mobile computing devices 120 in the area of real space are received (via the wireless access points) by the network interface subsystem 1170 for processing by the matching engine 170. A number of peripheral devices such as user interface output devices and user interface input devices are also connected to the bus subsystem 1154 forming part of the processing platform (network node) 103. These subsystems and devices are intentionally not shown in FIG. 11 to improve the clarity of the description. Although bus subsystem 1154 is shown schematically as a single bus, alternative embodiments of the bus subsystem may use multiple busses.
  • In one embodiment, the cameras 114 can be implemented using Chameleon3 1.3 MP Color USB3 Vision (Sony ICX445) cameras, having a resolution of 1288×964, a frame rate of 30 FPS, and 1.3 megapixels per image, with a varifocal lens having a working distance of 300 mm to ∞ and, with a ⅓″ sensor, a field of view of 98.2° to 23.8°.
  • Particular Implementations
  • In various embodiments, the system for linking subjects in an area of real space with user accounts described above also includes one or more of the following features.
  • The system includes a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images in corresponding fields of view in the real space. The processing system is coupled to the plurality of cameras and includes logic to determine locations of identified subjects represented in the images. The system matches the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space, and matching locations of the mobile devices with locations of the subjects.
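The location-matching step described above can be illustrated with a minimal sketch. It assumes 2D floor coordinates, a greedy closest-pair strategy, and a hypothetical `max_distance` cutoff; none of these details, nor the function and variable names, come from the disclosure.

```python
import math

def match_devices_to_subjects(device_locations, subject_locations, max_distance):
    """Greedily pair each device with the closest still-unmatched subject.

    device_locations: {user_account: (x, y)}  (hypothetical layout)
    subject_locations: {subject_id: (x, y)}   (hypothetical layout)
    max_distance: assumed cutoff beyond which no pairing is made
    """
    candidates = []
    for account, device_xy in device_locations.items():
        for subject_id, subject_xy in subject_locations.items():
            distance = math.dist(device_xy, subject_xy)
            if distance <= max_distance:
                candidates.append((distance, account, subject_id))

    # Closest pairs are considered first; each account and each subject
    # is linked at most once.
    candidates.sort()
    matches = {}
    linked_subjects = set()
    for distance, account, subject_id in candidates:
        if account in matches or subject_id in linked_subjects:
            continue
        matches[account] = subject_id
        linked_subjects.add(subject_id)
    return matches

devices = {"acct_a": (1.0, 1.0), "acct_b": (5.0, 5.0)}
subjects = {101: (1.2, 0.9), 102: (5.1, 4.8), 103: (9.0, 9.0)}
matches = match_devices_to_subjects(devices, subjects, max_distance=1.0)
```

In this sketch subject 103 stays unlinked because no device lies within the assumed cutoff.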
  • In one embodiment, the signals emitted by the mobile computing devices comprise images.
  • In one embodiment, the signals emitted by the mobile computing devices comprise radio signals.
  • In one embodiment, the system includes a set of semaphore images accessible to the processing system. The processing system includes logic to accept login communications from a client application on a mobile computing device identifying a user account before matching the user account to an identified subject in the area of real space, and after accepting login communications the system sends a selected semaphore image from the set of semaphore images to the client application on the mobile device.
  • In one such embodiment, the processing system sets a status of the selected semaphore image as assigned. The processing system receives a displayed image of the selected semaphore image. The processing system recognizes the displayed image and matches the recognized semaphore image with the assigned images from the set of semaphore images. The processing system matches a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject. The processing system, after matching the user account to the identified subject, sets the status of the recognized semaphore image as available.
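The assigned/available life cycle of a semaphore image described above can be sketched as a small state table. This is an illustrative sketch only: the class name `SemaphorePool`, the string statuses, and the first-available selection policy are assumptions, and image recognition itself is outside the sketch (the recognized image identifier is passed in directly).

```python
class SemaphorePool:
    """Tracks which semaphore images are assigned to which user account."""

    def __init__(self, image_ids):
        # Every semaphore image starts out available.
        self.status = {image_id: "available" for image_id in image_ids}
        self.assigned_to = {}

    def assign(self, user_account):
        """After login, pick an available image and mark it assigned."""
        for image_id, state in self.status.items():
            if state == "available":
                self.status[image_id] = "assigned"
                self.assigned_to[image_id] = user_account
                return image_id
        return None  # no image free; caller may retry later

    def link(self, recognized_image_id, subject_id):
        """Link a not yet linked subject to the account that was sent the image,
        then release the image for reuse."""
        if self.status.get(recognized_image_id) != "assigned":
            return None  # recognized image is not among the assigned ones
        account = self.assigned_to.pop(recognized_image_id)
        self.status[recognized_image_id] = "available"
        return (subject_id, account)

pool = SemaphorePool(["red", "blue"])
sent_image = pool.assign("acct_1")
linked = pool.link(sent_image, subject_id=7)
```

Releasing the image only after the link is made mirrors the order of steps in the paragraph above.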
  • In one embodiment, the client applications on the mobile computing devices transmit accelerometer data to the processing system, and the system matches the identified subjects with user accounts using the accelerometer data transmitted from the mobile computing devices.
  • In one such embodiment, the logic to match the identified subjects with user accounts includes logic that uses the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
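The comparison described above, between accelerometer data from a device and a derivative of a subject's locations, can be sketched as follows. The sketch assumes 2D motion, a fixed sampling interval `dt`, simple Euler integration of acceleration, and a mean Euclidean distance between the two velocity series; these choices and all names are assumptions for illustration.

```python
import math

def device_velocities(accel_samples, dt, v0=(0.0, 0.0)):
    """Integrate accelerations (m/s^2) into velocities with an Euler step."""
    velocities = []
    vx, vy = v0
    for ax, ay in accel_samples:
        vx += ax * dt
        vy += ay * dt
        velocities.append((vx, vy))
    return velocities

def subject_velocities(positions, dt):
    """Finite-difference derivative of a subject's successive locations."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(positions, positions[1:])]

def mean_velocity_distance(device_v, subject_v):
    """Average Euclidean distance between time-aligned velocity samples."""
    pairs = list(zip(device_v, subject_v))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

dt = 1.0
phone_v = device_velocities([(1.0, 0.0), (0.0, 0.0)], dt)
walker_v = subject_velocities([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], dt)
other_v = subject_velocities([(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)], dt)
d_walker = mean_velocity_distance(phone_v, walker_v)
d_other = mean_velocity_distance(phone_v, other_v)
```

A smaller distance suggests the device and the subject are moving together; raw accelerometer integration drifts in practice, so a real system would likely need filtering that this sketch omits.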
  • In one embodiment, the signals emitted by the mobile computing devices include location data and accelerometer data.
  • In one embodiment, the signals emitted by the mobile computing devices comprise images.
  • In one embodiment, the signals emitted by the mobile computing devices comprise radio signals.
  • A method of linking subjects in an area of real space with user accounts is disclosed. The user accounts are linked with client applications executable on mobile computing devices. The method includes using a plurality of cameras to produce respective sequences of images in corresponding fields of view in the real space. Then the method includes determining locations of identified subjects represented in the images. The method includes matching the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space. Finally, the method includes matching locations of the mobile computing devices with locations of the subjects.
  • In one embodiment, the method also includes setting a status of the selected semaphore image as assigned, receiving a displayed image of the selected semaphore image, recognizing the displayed semaphore image, and matching the recognized image with the assigned images from the set of semaphore images. The method includes matching a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject. Finally, the method includes, after matching the user account to the identified subject, setting the status of the recognized semaphore image as available.
  • In one embodiment, matching the identified subjects with user accounts further includes using the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
  • In one embodiment, the signals emitted by the mobile computing devices include location data and accelerometer data.
  • In one embodiment, the signals emitted by the mobile computing devices comprise images.
  • In one embodiment, the signals emitted by the mobile computing devices comprise radio signals.
  • A non-transitory computer readable storage medium impressed with computer program instructions to link subjects in an area of real space with user accounts is disclosed. The user accounts are linked with client applications executable on mobile computing devices, the instructions, when executed on a processor, implement a method. The method includes using a plurality of cameras to produce respective sequences of images in corresponding fields of view in the real space. The method includes determining locations of identified subjects represented in the images. The method includes matching the identified subjects with user accounts by identifying locations of mobile computing devices executing client applications in the area of real space. Finally, the method includes matching locations of the mobile computing devices with locations of the subjects.
  • In one embodiment, the non-transitory computer readable storage medium implements the method further comprising the following steps. The method includes setting a status of the selected semaphore image as assigned, receiving a displayed image of the selected semaphore image, recognizing the displayed semaphore image, and matching the recognized image with the assigned images from the set of semaphore images. The method includes matching a location of the mobile computing device displaying the recognized semaphore image located in the area of real space with a not yet linked identified subject. The method includes, after matching the user account to the identified subject, setting the status of the recognized semaphore image as available.
  • In one embodiment, the non-transitory computer readable storage medium implements the method including matching the identified subjects with user accounts by using the accelerometer data transmitted from the mobile computing device from a plurality of locations over a time interval in the area of real space and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
  • In one embodiment, the signals emitted by the mobile computing devices include location data and accelerometer data.
  • Any data structures and code described or referenced above are stored according to many implementations in computer readable memory, which comprises a non-transitory computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. This includes, but is not limited to, volatile memory, non-volatile memory, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing computer-readable media now known or later developed.
  • The preceding description is presented to enable the making and use of the technology disclosed. Various modifications to the disclosed implementations will be apparent, and the general principles defined herein may be applied to other implementations and applications without departing from the spirit and scope of the technology disclosed. Thus, the technology disclosed is not intended to be limited to the implementations shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein. The scope of the technology disclosed is defined by the appended claims.

Claims (27)

What is claimed is:
1. A system for linking subjects in an area of real space with user accounts, the user accounts being linked with client applications executable on mobile computing devices, comprising:
a processing system configured to receive a plurality of sequences of images of corresponding fields of view in the real space, the processing system including logic to determine locations of identified subjects represented in the images, logic to match the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space, and matching locations of the mobile devices with locations of the subjects.
2. The system of claim 1, wherein the client applications on the mobile computing devices transmit accelerometer data to the processing system, and the logic to match the identified subjects with user accounts uses the accelerometer data transmitted from the mobile computing devices.
3. The system of claim 2, wherein the logic to match the identified subjects with user accounts further comprises:
logic to calculate velocity of mobile devices using the accelerometer data transmitted from mobile computing devices;
logic to calculate distance between velocities of pairs of mobile computing devices with unmatched client applications and velocities of not yet linked identified subjects wherein the velocities of not yet linked identified subjects are calculated from changes in positions of joints of subjects over time, and
logic to match mobile computing devices with unmatched client applications to not yet linked identified subjects when the distance between the velocity of a mobile computing device and the velocity of a subject is below a first threshold.
4. The system of claim 3, wherein the logic to match mobile computing devices with unmatched client application to not yet linked identified subjects further including logic to increment a score counter for a pair of a mobile computing device with unmatched client application and a not yet linked identified subject when the distance between the velocity of the mobile computing device and the velocity of the subject is below a first threshold.
5. The system of claim 4, further comprising
logic to compare the score counters for all pairs of the mobile computing devices with unmatched client applications and the not yet linked identified subjects with a second threshold and selecting a score counter with a highest score above the second threshold; and
logic to compare the score of the selected score counter with score of a score counter with a second highest score; and matching the mobile computing device with unmatched client application to not yet linked subject corresponding to the pair of mobile computing device and the subject with the selected score counter when a difference between the scores of the selected score counter and the score of the second highest score counter is above a third threshold.
6. The system of claim 3, further comprising logic to calculate the velocities of not yet linked identified subjects from changes in center of mass of subjects over time wherein the center of mass of subjects is determined from locations of joints of corresponding subjects.
7. The system of claim 3, further comprising logic to process the plurality of sequences of images of corresponding fields of view in the real space to determine hand joints of the not yet linked identified subjects holding mobile computing devices and calculating the velocities of the hand joints of subjects over time.
8. The system of claim 5, further comprising a trained neural network to predict the first threshold, the second threshold and the third threshold.
9. The system of claim 1, further including log data structures including a list of inventory items for the identified subjects, the processing system including logic to associate the log data structure for the matched identified subject to the user account for the identified subject.
10. A method for linking subjects in an area of real space with user accounts, the user accounts being linked with client applications executable on mobile computing devices, the method including:
receiving a plurality of sequences of images of corresponding fields of view in the real space,
determining locations of identified subjects represented in the images,
matching the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space, and matching locations of the mobile devices with locations of the subjects.
11. The method of claim 10, wherein the client applications on the mobile computing devices transmit accelerometer data, and matching the identified subjects with user accounts includes using the accelerometer data transmitted from the mobile computing devices.
12. The method of claim 11, wherein matching the identified subjects with user accounts further including:
calculating velocity of mobile devices using the accelerometer data transmitted from mobile computing devices;
calculating distance between velocities of pairs of mobile computing devices with unmatched client applications and velocities of not yet linked identified subjects wherein the velocities of not yet linked identified subjects are calculated from changes in positions of joints of subjects over time, and
matching mobile computing devices with unmatched client applications to not yet linked identified subjects when the distance between the velocity of a mobile computing device and the velocity of a subject is below a first threshold.
13. The method of claim 12, wherein matching mobile computing devices with unmatched client application to not yet linked identified subjects further including incrementing a score counter for a pair of a mobile computing device with unmatched client application and a not yet linked identified subject when the distance between the velocity of the mobile computing device and the velocity of the subject is below a first threshold.
14. The method of claim 13, further including
comparing the score counters for all pairs of the mobile computing devices with unmatched client applications and the not yet linked identified subjects with a second threshold and selecting a score counter with a highest score above the second threshold; and
comparing the score of the selected score counter with score of a score counter with a second highest score; and matching the mobile computing device with unmatched client application to not yet linked subject corresponding to the pair of mobile computing device and the subject with the selected score counter when a difference between the scores of the selected score counter and the score of the second highest score counter is above a third threshold.
15. The method of claim 12, further including calculating the velocities of not yet linked identified subjects from changes in center of mass of subjects over time wherein the center of mass of subjects is determined from locations of joints of corresponding subjects.
16. The method of claim 12, further including processing the plurality of sequences of images of corresponding fields of view in the real space to determine hand joints of the not yet linked identified subjects holding mobile computing devices and calculating the velocities of the hand joints of subjects over time.
17. The method of claim 14, further including using a trained neural network to predict the first threshold, the second threshold and the third threshold.
18. The method of claim 10, further including associating a log data structure for the matched identified subject to the user account for the identified subject, wherein the log data structure includes a list of inventory items for the identified subject.
19. A non-transitory computer readable storage medium impressed with computer program instructions to link subjects in an area of real space with user accounts, the user accounts being linked with client applications executable on mobile computing devices, the instructions, when executed on a processor, implement a method comprising:
receiving a plurality of sequences of images of corresponding fields of view in the real space,
determining locations of identified subjects represented in the images,
matching the identified subjects with user accounts by identifying locations of mobile devices executing client applications in the area of real space, and
matching locations of the mobile devices with locations of the subjects.
20. The non-transitory computer readable storage medium of claim 19, wherein the client applications on the mobile computing devices transmit accelerometer data, and matching the identified subjects with user accounts includes using the accelerometer data transmitted from the mobile computing devices.
21. The non-transitory computer readable storage medium of claim 20, wherein matching the identified subjects with user accounts implementing the method further comprising:
calculating velocity of mobile devices using the accelerometer data transmitted from mobile computing devices;
calculating distance between velocities of pairs of mobile computing devices with unmatched client applications and velocities of not yet linked identified subjects wherein the velocities of not yet linked identified subjects are calculated from changes in positions of joints of subjects over time, and
matching mobile computing devices with unmatched client applications to not yet linked identified subjects when the distance between the velocity of a mobile computing device and the velocity of a subject is below a first threshold.
22. The non-transitory computer readable storage medium of claim 21, wherein matching mobile computing devices with unmatched client application to not yet linked identified subjects implementing the method further comprising:
incrementing a score counter for a pair of a mobile computing device with unmatched client application and a not yet linked identified subject when the distance between the velocity of the mobile computing device and the velocity of the subject is below a first threshold.
23. The non-transitory computer readable storage medium of claim 22, implementing the method further comprising:
comparing the score counters for all pairs of the mobile computing devices with unmatched client applications and the not yet linked identified subjects with a second threshold and selecting a score counter with a highest score above the second threshold; and
comparing the score of the selected score counter with score of a score counter with a second highest score; and matching the mobile computing device with unmatched client application to not yet linked subject corresponding to the pair of mobile computing device and the subject with the selected score counter when a difference between the scores of the selected score counter and the score of the second highest score counter is above a third threshold.
24. The non-transitory computer readable storage medium of claim 21, implementing the method further comprising:
calculating the velocities of not yet linked identified subjects from changes in center of mass of subjects over time wherein the center of mass of subjects is determined from locations of joints of corresponding subjects.
25. The non-transitory computer readable storage medium of claim 21, implementing the method further comprising:
processing the plurality of sequences of images of corresponding fields of view in the real space to determine hand joints of the not yet linked identified subjects holding mobile computing devices and calculating the velocities of the hand joints of subjects over time.
26. The non-transitory computer readable storage medium of claim 23, implementing the method further comprising:
using a trained neural network to predict the first threshold, the second threshold and the third threshold.
27. The non-transitory computer readable storage medium of claim 19, implementing the method further comprising:
associating a log data structure for the matched identified subject to the user account for the identified subject, wherein the log data structure includes a list of inventory items for the identified subject.
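The score-counter logic recited in claims 3 to 5 (and mirrored in claims 12 to 14 and 21 to 23) can be sketched as below. This is an illustration, not the claimed implementation: the dictionary layout, the function names, treating a missing runner-up as a score of 0, and the numeric threshold values are all assumptions.

```python
def update_scores(scores, velocity_distances, first_threshold):
    """Increment the counter of each (device, subject) pair whose velocity
    distance is below the first threshold for the current time step."""
    for pair, distance in velocity_distances.items():
        if distance < first_threshold:
            scores[pair] = scores.get(pair, 0) + 1
    return scores

def select_match(scores, second_threshold, third_threshold):
    """Return the winning pair, or None if no pair is a clear winner."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # The highest score must be above the second threshold.
    if not ranked or ranked[0][1] <= second_threshold:
        return None
    best_pair, best_score = ranked[0]
    # Assumption: with a single candidate the runner-up score is taken as 0.
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0
    # The margin over the second highest score must exceed the third threshold.
    if best_score - runner_up_score > third_threshold:
        return best_pair
    return None

scores = {}
for _ in range(5):
    update_scores(scores, {("dev1", "s1"): 0.1, ("dev1", "s2"): 0.9}, 0.5)
update_scores(scores, {("dev1", "s1"): 0.1, ("dev1", "s2"): 0.2}, 0.5)
winner = select_match(scores, second_threshold=3, third_threshold=2)
```

Requiring both a minimum score and a margin over the runner-up keeps a pair from being linked when two subjects move almost identically.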
US16/842,382 2017-08-07 2020-04-07 Systems and methods to check-in shoppers in a cashier-less store Active US11200692B2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US16/842,382 US11200692B2 (en) 2017-08-07 2020-04-07 Systems and methods to check-in shoppers in a cashier-less store
US17/383,303 US11538186B2 (en) 2017-08-07 2021-07-22 Systems and methods to check-in shoppers in a cashier-less store
US18/089,503 US11810317B2 (en) 2017-08-07 2022-12-27 Systems and methods to check-in shoppers in a cashier-less store
US18/387,427 US20240070895A1 (en) 2017-08-07 2023-11-06 Systems and methods to check-in shoppers in a cashier-less store

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201762542077P 2017-08-07 2017-08-07
US15/847,796 US10055853B1 (en) 2017-08-07 2017-12-19 Subject identification and tracking using image recognition
US15/907,112 US10133933B1 (en) 2017-08-07 2018-02-27 Item put and take detection using image recognition
US15/945,473 US10474988B2 (en) 2017-08-07 2018-04-04 Predicting inventory events using foreground/background processing
US16/255,573 US10650545B2 (en) 2017-08-07 2019-01-23 Systems and methods to check-in shoppers in a cashier-less store
US16/842,382 US11200692B2 (en) 2017-08-07 2020-04-07 Systems and methods to check-in shoppers in a cashier-less store

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/255,573 Continuation US10650545B2 (en) 2017-08-07 2019-01-23 Systems and methods to check-in shoppers in a cashier-less store

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/383,303 Continuation US11538186B2 (en) 2017-08-07 2021-07-22 Systems and methods to check-in shoppers in a cashier-less store

Publications (2)

Publication Number Publication Date
US20200234463A1 US20200234463A1 (en) 2020-07-23
US11200692B2 US11200692B2 (en) 2021-12-14

Family

ID=78823631

Family Applications (3)

Application Number Title Priority Date Filing Date
US16/842,382 Active US11200692B2 (en) 2017-08-07 2020-04-07 Systems and methods to check-in shoppers in a cashier-less store
US18/089,503 Active US11810317B2 (en) 2017-08-07 2022-12-27 Systems and methods to check-in shoppers in a cashier-less store
US18/387,427 Pending US20240070895A1 (en) 2017-08-07 2023-11-06 Systems and methods to check-in shoppers in a cashier-less store

Country Status (1)

Country Link
US (3) US11200692B2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10853965B2 (en) 2017-08-07 2020-12-01 Standard Cognition, Corp Directional impression analysis using deep learning
US11023850B2 (en) 2017-08-07 2021-06-01 Standard Cognition, Corp. Realtime inventory location management using deep learning
US11200692B2 (en) 2017-08-07 2021-12-14 Standard Cognition, Corp Systems and methods to check-in shoppers in a cashier-less store
US11232687B2 (en) 2017-08-07 2022-01-25 Standard Cognition, Corp Deep learning-based shopper statuses in a cashier-less store
US11250376B2 (en) 2017-08-07 2022-02-15 Standard Cognition, Corp Product correlation analysis using deep learning
US20220067390A1 (en) * 2020-09-01 2022-03-03 Lg Electronics Inc. Automated shopping experience using cashier-less systems
US11270260B2 (en) 2017-08-07 2022-03-08 Standard Cognition Corp. Systems and methods for deep learning-based shopper tracking
US11303853B2 (en) 2020-06-26 2022-04-12 Standard Cognition, Corp. Systems and methods for automated design of camera placement and cameras arrangements for autonomous checkout
US11361468B2 (en) 2020-06-26 2022-06-14 Standard Cognition, Corp. Systems and methods for automated recalibration of sensors for autonomous checkout
US11538186B2 (en) 2017-08-07 2022-12-27 Standard Cognition, Corp. Systems and methods to check-in shoppers in a cashier-less store
US11948313B2 (en) 2019-04-18 2024-04-02 Standard Cognition, Corp Systems and methods of implementing multiple trained inference engines to identify and track subjects over multiple identification intervals

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11263780B2 (en) * 2019-01-14 2022-03-01 Sony Group Corporation Apparatus, method, and program with verification of detected position information using additional physical characteristic points

Family Cites Families (327)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US1037779A (en) 1909-12-23 1912-09-03 Nernst Lamp Company Means for holding globes.
US4746830A (en) 1986-03-14 1988-05-24 Holland William R Electronic surveillance and identification
NL194847C (en) 1991-07-17 2003-04-03 John Wolfgang Halpern Pocket-sized electronic travel and commuter pass and a number of account systems.
US5745036A (en) 1996-09-12 1998-04-28 Checkpoint Systems, Inc. Electronic article security system for store which uses intelligent security tags and transaction data
US6154559A (en) 1998-10-01 2000-11-28 Mitsubishi Electric Information Technology Center America, Inc. (Ita) System for classifying an individual's gaze direction
DE19847261A1 (en) 1998-10-05 2000-04-06 Dcs Dialog Communication Syste Process and system for person recognition with model-based face finding
US7050624B2 (en) 1998-12-04 2006-05-23 Nevengineering, Inc. System and method for feature location and tracking in multiple dimensions including depth
GB2344904A (en) 1998-12-17 2000-06-21 Ibm Home stock control computer system
US7212201B1 (en) 1999-09-23 2007-05-01 New York University Method and apparatus for segmenting an image in order to locate a part thereof
NZ543166A (en) 2000-04-07 2006-12-22 Procter & Gamble Monitoring the effective velocity of items through a store or warehouse for predicting stock levels
US6768102B1 (en) 2000-06-15 2004-07-27 Chipworks Method and system for recalibration during micro-imaging to determine thermal drift
US6678413B1 (en) 2000-11-24 2004-01-13 Yiqing Liang System and method for object identification and behavior characterization using video analysis
GB0101794D0 (en) 2001-01-24 2001-03-07 Central Research Lab Ltd Monitoring responses to visual stimuli
US6584375B2 (en) 2001-05-04 2003-06-24 Intellibot, Llc System for a retail environment
US20030078849A1 (en) 2001-10-23 2003-04-24 Ncr Corporation Self-checkout system having component video camera for produce purchase monitoring
US7688349B2 (en) 2001-12-07 2010-03-30 International Business Machines Corporation Method of detecting and tracking groups of people
US7050652B2 (en) 2002-03-08 2006-05-23 Anzus, Inc. Methods and arrangements to enhance correlation
US20040099736A1 (en) 2002-11-25 2004-05-27 Yoram Neumark Inventory control and identification method
DE10324579A1 (en) 2003-05-30 2004-12-16 Daimlerchrysler Ag operating device
US20050177446A1 (en) 2004-02-11 2005-08-11 International Business Machines Corporation Method and system for supporting coordination and collaboration of multiple shoppers
KR100519782B1 (en) 2004-03-04 2005-10-07 삼성전자주식회사 Method and apparatus for detecting people using a stereo camera
ES2305930T3 (en) 2004-03-17 2008-11-01 Norbert Prof. Dr. Link DEVICE AND PROCEDURE FOR THE DETECTION AND MONITORING OF PERSONS IN AN INSPECTION AREA.
US8331723B2 (en) * 2004-03-25 2012-12-11 Ozluturk Fatih M Method and apparatus to correct digital image blur due to motion of subject or imaging device
US7631808B2 (en) 2004-06-21 2009-12-15 Stoplift, Inc. Method and apparatus for detecting suspicious activity using video analysis
US8289390B2 (en) 2004-07-28 2012-10-16 Sri International Method and apparatus for total situational awareness and monitoring
US7586492B2 (en) 2004-12-20 2009-09-08 Nvidia Corporation Real-time display post-processing using programmable hardware
CA2497795A1 (en) 2005-02-21 2006-08-21 Mohamed Almahakeri Inventory control device for counting cigarette packages
US9270841B2 (en) 2005-04-15 2016-02-23 Freeze Frame, Llc Interactive image capture, marketing and distribution
US20090041297A1 (en) 2005-05-31 2009-02-12 Objectvideo, Inc. Human detection and tracking for security applications
US7894932B2 (en) 2005-07-19 2011-02-22 Kiva Systems, Inc. Method and system for replenishing inventory items
US7894933B2 (en) 2005-07-19 2011-02-22 Kiva Systems, Inc. Method and system for retrieving inventory items
US9036028B2 (en) 2005-09-02 2015-05-19 Sensormatic Electronics, LLC Object tracking and alerts
KR100999084B1 (en) 2006-01-12 2010-12-07 오티스 엘리베이터 컴파니 Video aided system for elevator control
US9129290B2 (en) 2006-02-22 2015-09-08 24/7 Customer, Inc. Apparatus and method for predicting customer behavior
US9092807B1 (en) 2006-05-05 2015-07-28 Appnexus Yieldex Llc Network-based systems and methods for defining and managing multi-dimensional, advertising impression inventory
US20070282665A1 (en) 2006-06-02 2007-12-06 Buehler Christopher J Systems and methods for providing video surveillance data
US8013838B2 (en) 2006-06-30 2011-09-06 Microsoft Corporation Generating position information using a video camera
EP2082564A2 (en) 2006-08-24 2009-07-29 Chumby Industries, Inc. Configurable personal audiovisual device for use in networked application-sharing system
GB0617604D0 (en) 2006-09-07 2006-10-18 Zeroshift Ltd Inventory control system
US7693757B2 (en) 2006-09-21 2010-04-06 International Business Machines Corporation System and method for performing inventory using a mobile inventory robot
EP2479992B1 (en) 2006-12-04 2020-07-15 Isolynx, LLC Autonomous systems and methods for still and moving picture production
US8189926B2 (en) 2006-12-30 2012-05-29 Videomining Corporation Method and system for automatically analyzing categories in a physical space based on the visual characterization of people
US7971156B2 (en) 2007-01-12 2011-06-28 International Business Machines Corporation Controlling resource access based on user gesturing in a 3D captured image stream of the user
CN101231755B (en) 2007-01-25 2013-03-06 上海遥薇(集团)有限公司 Moving target tracking and quantity statistics method
US20080181507A1 (en) 2007-01-29 2008-07-31 Intellivision Technologies Corp. Image manipulation for videos and still images
US8587661B2 (en) 2007-02-21 2013-11-19 Pixel Velocity, Inc. Scalable system for wide area surveillance
US20080243614A1 (en) 2007-03-30 2008-10-02 General Electric Company Adaptive advertising and marketing system and method
US7961946B2 (en) 2007-05-15 2011-06-14 Digisensory Technologies Pty Ltd Method and system for background estimation in localization and tracking of objects in a smart video camera
US20090217315A1 (en) 2008-02-26 2009-08-27 Cognovision Solutions Inc. Method and system for audience measurement and targeting media
US7742952B2 (en) 2008-03-21 2010-06-22 Sunrise R&D Holdings, Llc Systems and methods of acquiring actual real-time shopper behavior data approximate to a moment of decision by a shopper
US8189855B2 (en) 2007-08-31 2012-05-29 Accenture Global Services Limited Planogram extraction based on image processing
US8843959B2 (en) 2007-09-19 2014-09-23 Orlando McMaster Generating synchronized interactive link maps linking tracked video objects to other multimedia content in real-time
ATE455325T1 (en) * 2007-11-30 2010-01-15 Ericsson Telefon Ab L M PORTABLE ELECTRONIC DEVICE HAVING MORE THAN ONE DISPLAY AREA AND METHOD FOR CONTROLLING A USER INTERFACE THEREOF
DE102008007199A1 (en) 2008-02-01 2009-08-06 Robert Bosch Gmbh Masking module for a video surveillance system, method for masking selected objects and computer program
US8897742B2 (en) * 2009-11-13 2014-11-25 William J. Johnson System and method for sudden proximal user interface
US8066571B2 (en) 2008-06-09 2011-11-29 Metaplace, Inc. System and method for enabling characters to be manifested within a plurality of different virtual spaces
US8009863B1 (en) 2008-06-30 2011-08-30 Videomining Corporation Method and system for analyzing shopping behavior using multiple sensor tracking
US8219438B1 (en) 2008-06-30 2012-07-10 Videomining Corporation Method and system for measuring shopper response to products based on behavior and facial expression
US7742623B1 (en) 2008-08-04 2010-06-22 Videomining Corporation Method and system for estimating gaze target, gaze sequence, and gaze map from video
US9147174B2 (en) 2008-08-08 2015-09-29 Snap-On Incorporated Image-based inventory control system using advanced image recognition
CN102144201A (en) 2008-09-03 2011-08-03 皇家飞利浦电子股份有限公司 Method of performing a gaze-based interaction between a user and an interactive display system
US20100103104A1 (en) 2008-10-29 2010-04-29 Electronics And Telecommunications Research Institute Apparatus for user interface based on wearable computing environment and method thereof
US20100138281A1 (en) 2008-11-12 2010-06-03 Yinying Zhang System and method for retail store shelf stock monitoring, predicting, and reporting
US8279325B2 (en) 2008-11-25 2012-10-02 Lytro, Inc. System and method for acquiring, editing, generating and outputting video data
US8577705B1 (en) 2008-12-30 2013-11-05 Videomining Corporation Method and system for rating the role of a product category in the performance of a store area
US8180107B2 (en) 2009-02-13 2012-05-15 Sri International Active coordinated tracking for multi-camera systems
US8239277B2 (en) 2009-03-31 2012-08-07 The Nielsen Company (Us), Llc Method, medium, and system to monitor shoppers in a retail or commercial establishment
WO2011035302A1 (en) 2009-09-21 2011-03-24 Checkpoint Systems, Inc. Retail product tracking system, method, and apparatus
US8438484B2 (en) 2009-11-06 2013-05-07 Sony Corporation Video preview module to enhance online video experience
US8855682B2 (en) * 2010-02-23 2014-10-07 Robert Osann, Jr. System for safe texting while driving
JP5138667B2 (en) 2009-12-22 2013-02-06 東芝テック株式会社 Self-checkout terminal
US20110209042A1 (en) 2010-02-25 2011-08-25 International Business Machines Corporation Information Technology Standard Inventory Utility
US8213680B2 (en) 2010-03-19 2012-07-03 Microsoft Corporation Proxy training data for human body tracking
JP2011209966A (en) 2010-03-29 2011-10-20 Sony Corp Image processing apparatus and method, and program
US8749630B2 (en) 2010-05-13 2014-06-10 Ecole Polytechnique Federale De Lausanne (Epfl) Method and system for automatic objects localization
JP2011253344A (en) 2010-06-02 2011-12-15 Midee Co Ltd Purchase behavior analysis device, purchase behavior analysis method and program
US20110320322A1 (en) 2010-06-25 2011-12-29 Symbol Technologies, Inc. Inventory monitoring using complementary modes for item identification
JP5269002B2 (en) 2010-06-28 2013-08-21 株式会社日立製作所 Camera placement decision support device
US8615254B2 (en) * 2010-08-18 2013-12-24 Nearbuy Systems, Inc. Target localization utilizing wireless and camera sensor fusion
US8306256B2 (en) 2010-09-16 2012-11-06 Facebook, Inc. Using camera signatures from uploaded images to authenticate users of an online system
US9494532B2 (en) 2010-09-24 2016-11-15 Siemens Energy, Inc. System and method for side-by-side inspection of a device
US10121133B2 (en) 2010-10-13 2018-11-06 Walmart Apollo, Llc Method for self-checkout with a mobile device
US8193909B1 (en) 2010-11-15 2012-06-05 Intergraph Technologies Company System and method for camera control in a surveillance system
US9171442B2 (en) 2010-11-19 2015-10-27 Tyco Fire & Security Gmbh Item identification using video recognition to supplement bar code or RFID information
US8635556B2 (en) 2010-11-30 2014-01-21 Alcatel Lucent Human readable iconic display server
US9449233B2 (en) 2010-12-01 2016-09-20 The Trustees Of The University Of Pennsylvania Distributed target tracking using self localizing smart camera networks
US8448056B2 (en) 2010-12-17 2013-05-21 Microsoft Corporation Validation analysis of human target
TWI426775B (en) 2010-12-17 2014-02-11 Ind Tech Res Inst Camera recalibration system and the method thereof
US9595127B2 (en) * 2010-12-22 2017-03-14 Zspace, Inc. Three-dimensional collaboration
EP2673999B1 (en) * 2011-02-09 2014-11-26 Andrew LLC Method for improving the location determination using proximity information
BR112013021059A2 (en) 2011-02-16 2020-10-27 Visa International Service Association Snap mobile payment systems, methods and devices
US10083453B2 (en) 2011-03-17 2018-09-25 Triangle Strategy Group, LLC Methods, systems, and computer readable media for tracking consumer interactions with products using modular sensor units
WO2012135115A2 (en) 2011-03-25 2012-10-04 Visa International Service Association In-person one-tap purchasing apparatuses, methods and systems
US8811719B2 (en) 2011-04-29 2014-08-19 Microsoft Corporation Inferring spatial object descriptions from spatial gestures
US8510166B2 (en) 2011-05-11 2013-08-13 Google Inc. Gaze tracking system
US20130027561A1 (en) 2011-07-29 2013-01-31 Panasonic Corporation System and method for improving site operations by detecting abnormalities
US20130076898A1 (en) 2011-08-01 2013-03-28 Richard Philippe Apparatus, systems, and methods for tracking medical products using an imaging unit
JP5984096B2 (en) 2011-08-30 2016-09-06 ディジマーク コーポレイション Method and mechanism for identifying an object
US20130060708A1 (en) 2011-09-06 2013-03-07 Rawllin International Inc. User verification for electronic money transfers
US10307640B2 (en) 2011-09-20 2019-06-04 Brian Francis Mooney Apparatus and method for analyzing a golf swing
US8624725B1 (en) 2011-09-22 2014-01-07 Amazon Technologies, Inc. Enhanced guidance for electronic devices having multiple tracking modes
US9177195B2 (en) 2011-09-23 2015-11-03 Shoppertrak Rct Corporation System and method for detecting, tracking and counting human objects of interest using a counting system and a data capture device
US20130142384A1 (en) 2011-12-06 2013-06-06 Microsoft Corporation Enhanced navigation through multi-sensor positioning
US8630457B2 (en) 2011-12-15 2014-01-14 Microsoft Corporation Problem states for pose tracking pipeline
CN103843024A (en) 2012-01-05 2014-06-04 维萨国际服务协会 Transaction visual capturing apparatuses, methods and systems
US9740937B2 (en) 2012-01-17 2017-08-22 Avigilon Fortress Corporation System and method for monitoring a retail environment using video content analysis with depth sensing
US20130201339A1 (en) 2012-02-08 2013-08-08 Honeywell International Inc. System and method of optimal video camera placement and configuration
US10163031B2 (en) * 2012-02-29 2018-12-25 RetailNext, Inc. Method and system for full path analysis
US20130235206A1 (en) 2012-03-12 2013-09-12 Numerex Corp. System and Method of On-Shelf Inventory Management
JP5888034B2 (en) 2012-03-16 2016-03-16 富士通株式会社 User detection device, method and program
US9892438B1 (en) 2012-05-03 2018-02-13 Stoplift, Inc. Notification system and methods for use in retail environments
US9390032B1 (en) 2012-06-27 2016-07-12 Amazon Technologies, Inc. Gesture camera configurations
US10333923B2 (en) * 2012-08-19 2019-06-25 Rajul Johri Authentication based on visual memory
US20140089673A1 (en) * 2012-09-25 2014-03-27 Aliphcom Biometric identification method and apparatus to authenticate identity of a user of a wearable device that includes sensors
JP6088792B2 (en) 2012-10-31 2017-03-01 株式会社メガチップス Image detection apparatus, control program, and image detection method
US20140188648A1 (en) 2012-12-28 2014-07-03 Wal-Mart Stores, Inc. Searching Digital Receipts At A Mobile Device
US9824384B2 (en) 2013-01-23 2017-11-21 Wal-Mart Stores, Inc. Techniques for locating an item to purchase in a retail environment
JP5356615B1 (en) 2013-02-01 2013-12-04 パナソニック株式会社 Customer behavior analysis device, customer behavior analysis system, and customer behavior analysis method
US9154191B2 (en) * 2013-02-05 2015-10-06 Empire Technology Development Llc Secure near field communication (NFC) handshake
US9693684B2 (en) 2013-02-14 2017-07-04 Facebook, Inc. Systems and methods of eye tracking calibration
US20140244207A1 (en) 2013-02-28 2014-08-28 Michael Alan Hicks Methods and apparatus to determine in-aisle locations in monitored environments
CN105190655B (en) 2013-03-04 2018-05-18 日本电气株式会社 Article management system, information processing equipment and its control method and control program
US10025486B2 (en) 2013-03-15 2018-07-17 Elwha Llc Cross-reality select, drag, and drop for augmented reality systems
US9542748B2 (en) 2013-04-08 2017-01-10 Avago Technologies General Ip (Singapore) Pte. Ltd. Front-end architecture for image processing
US20140304123A1 (en) 2013-04-09 2014-10-09 International Business Machines Corporation Electronically tracking inventory in a retail store
US20140300736A1 (en) 2013-04-09 2014-10-09 Microsoft Corporation Multi-sensor camera recalibration
AU2013205548A1 (en) 2013-04-30 2014-11-13 Canon Kabushiki Kaisha Method, system and apparatus for tracking objects of a scene
WO2014186642A2 (en) 2013-05-17 2014-11-20 International Electronic Machines Corporation Operations monitoring in an area
TW201445474A (en) 2013-05-17 2014-12-01 jun-cheng Qiu Mobile shopping method without arranging authentication device
US9907103B2 (en) * 2013-05-31 2018-02-27 Yulong Computer Telecommunication Scientific (Shenzhen) Co., Ltd. Mobile terminal, wearable device, and equipment pairing method
EP3005175A4 (en) 2013-06-05 2016-12-28 Freshub Ltd Methods and devices for smart shopping
US10176456B2 (en) 2013-06-26 2019-01-08 Amazon Technologies, Inc. Transitioning items from a materials handling facility
US10268983B2 (en) 2013-06-26 2019-04-23 Amazon Technologies, Inc. Detecting item interaction and movement
CN103366370B (en) 2013-07-03 2016-04-20 深圳市智美达科技股份有限公司 Method for tracking target in video monitoring and device
US9904946B2 (en) 2013-07-18 2018-02-27 Paypal, Inc. Reverse showrooming and merchant-customer engagement system
KR101472455B1 (en) 2013-07-18 2014-12-16 전자부품연구원 User interface apparatus based on hand gesture and method thereof
TW201504964A (en) 2013-07-23 2015-02-01 Yi-Li Huang Secure mobile device shopping system and method
US10290031B2 (en) 2013-07-24 2019-05-14 Gregorio Reid Method and system for automated retail checkout using context recognition
US9473747B2 (en) 2013-07-25 2016-10-18 Ncr Corporation Whole store scanner
US9405988B2 (en) 2013-08-13 2016-08-02 James Alves License plate recognition
US9269012B2 (en) * 2013-08-22 2016-02-23 Amazon Technologies, Inc. Multi-tracker object tracking
JP6529078B2 (en) 2013-09-06 2019-06-12 日本電気株式会社 Customer behavior analysis system, customer behavior analysis method, customer behavior analysis program and shelf system
WO2015040661A1 (en) 2013-09-19 2015-03-26 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Control method for displaying merchandising information on information terminal
US10235726B2 (en) * 2013-09-24 2019-03-19 GeoFrenzy, Inc. Systems and methods for secure encryption of real estate titles and permissions
US20150085111A1 (en) 2013-09-25 2015-03-26 Symbol Technologies, Inc. Identification using video analytics together with inertial sensor data
US9489623B1 (en) 2013-10-15 2016-11-08 Brain Corporation Apparatus and methods for backward propagation of errors in a spiking neuron network
US9881221B2 (en) 2013-10-24 2018-01-30 Conduent Business Services, Llc Method and system for estimating gaze direction of vehicle drivers
US9536177B2 (en) 2013-12-01 2017-01-03 University Of Florida Research Foundation, Inc. Distributive hierarchical model for object recognition in video
TW201525716A (en) * 2013-12-23 2015-07-01 Qisda Corp Pairing method for mobile devices
US10430776B2 (en) 2014-01-09 2019-10-01 Datalogic Usa, Inc. System and method for exception handling in self-checkout and automated data capture systems
US20150206188A1 (en) 2014-01-17 2015-07-23 Panasonic Intellectual Property Corporation Of America Item presentation method, and information display method
CN104144419B (en) * 2014-01-24 2017-05-24 腾讯科技(深圳)有限公司 Identity authentication method, device and system
US9760852B2 (en) 2014-01-28 2017-09-12 Junaid Hasan Surveillance tracking system and related methods
US9693023B2 (en) 2014-02-05 2017-06-27 Panasonic Intellectual Property Management Co., Ltd. Monitoring apparatus, monitoring system, and monitoring method
JP6369534B2 (en) 2014-03-05 2018-08-08 コニカミノルタ株式会社 Image processing apparatus, image processing method, and image processing program
WO2015133699A1 (en) 2014-03-06 2015-09-11 에스케이플래닛 주식회사 Object recognition apparatus, and recording medium in which method and computer program therefor are recorded
US20150262116A1 (en) 2014-03-16 2015-09-17 International Business Machines Corporation Machine vision technology for shelf inventory management
RU2014110361A (en) 2014-03-18 2015-09-27 ЭлЭсАй Корпорейшн IMAGE PROCESSOR CONFIGURED FOR EFFICIENT EVALUATION AND ELIMINATION OF FRONT PLAN INFORMATION ON IMAGES
US20180108001A1 (en) 2014-03-24 2018-04-19 Thomas Jason Taylor Voice triggered transactions
US9824388B2 (en) 2014-04-10 2017-11-21 Point Inside, Inc. Location assignment system and method
US10430864B2 (en) 2014-04-10 2019-10-01 Point Inside, Inc. Transaction based location assignment system and method
US9940633B2 (en) 2014-04-25 2018-04-10 Conduent Business Services, Llc System and method for video-based detection of drive-arounds in a retail setting
JP6197952B2 (en) 2014-05-12 2017-09-20 富士通株式会社 Product information output method, product information output program and control device
US20150327794A1 (en) 2014-05-14 2015-11-19 Umm Al-Qura University System and method for detecting and visualizing live kinetic and kinematic data for the musculoskeletal system
US20150332223A1 (en) 2014-05-19 2015-11-19 Square, Inc. Transaction information collection for mobile payment experience
US9760944B2 (en) 2014-06-13 2017-09-12 Lisa J. Kleinhandler Systems, methods, servers, and clients for inventory exchange
US10176452B2 (en) 2014-06-13 2019-01-08 Conduent Business Services Llc Store shelf imaging system and method
US10242393B1 (en) 2014-06-24 2019-03-26 Amazon Technologies, Inc. Determine an item and user action in a materials handling facility
US20160034027A1 (en) 2014-07-29 2016-02-04 Qualcomm Incorporated Optical tracking of a user-guided object for mobile platform user input
JP6074395B2 (en) 2014-09-12 2017-02-01 富士フイルム株式会社 Content management system, managed content generation method, managed content playback method, program, and recording medium
AU2014240213B2 (en) 2014-09-30 2016-12-08 Canon Kabushiki Kaisha System and Method for object re-identification
JP6547268B2 (en) 2014-10-02 2019-07-24 富士通株式会社 Eye position detection device, eye position detection method and eye position detection program
US20160110791A1 (en) 2014-10-15 2016-04-21 Toshiba Global Commerce Solutions Holdings Corporation Method, computer program product, and system for providing a sensor-based environment
US9349054B1 (en) 2014-10-29 2016-05-24 Behavioral Recognition Systems, Inc. Foreground detector for video analytics system
US9984403B2 (en) 2014-10-30 2018-05-29 Wal-Mart Stores, Inc. Electronic shopping cart processing system and method
US9569692B2 (en) 2014-10-31 2017-02-14 The Nielsen Company (Us), Llc Context-based image recognition for consumer market research
US9262681B1 (en) 2014-11-19 2016-02-16 Amazon Technologies, Inc. Probabilistic registration of interactions, actions or activities from multiple views
US9443164B2 (en) 2014-12-02 2016-09-13 Xerox Corporation System and method for product identification
US9449208B2 (en) 2014-12-03 2016-09-20 Paypal, Inc. Compartmentalized smart refrigerator with automated item management
US9811754B2 (en) 2014-12-10 2017-11-07 Ricoh Co., Ltd. Realogram scene analysis of images: shelf and label finding
US10169677B1 (en) 2014-12-19 2019-01-01 Amazon Technologies, Inc. Counting stacked inventory using image analysis
US10438277B1 (en) 2014-12-23 2019-10-08 Amazon Technologies, Inc. Determining an item involved in an event
JP6447108B2 (en) 2014-12-24 2019-01-09 富士通株式会社 Usability calculation device, availability calculation method, and availability calculation program
US20180003315A1 (en) 2015-01-07 2018-01-04 Weir Minerals Australia Ltd Lever float valve
US11120478B2 (en) 2015-01-12 2021-09-14 Ebay Inc. Joint-based item recognition
US20180025175A1 (en) 2015-01-15 2018-01-25 Nec Corporation Information output device, camera, information output system, information output method, and program
US20160217157A1 (en) 2015-01-23 2016-07-28 Ebay Inc. Recognition of items depicted in images
US10956856B2 (en) 2015-01-23 2021-03-23 Samsung Electronics Co., Ltd. Object recognition for a storage structure
CN105987694B (en) 2015-02-09 2019-06-07 株式会社理光 The method and apparatus for identifying the user of mobile device
US10042031B2 (en) 2015-02-11 2018-08-07 Xerox Corporation Method and system for detecting that an object of interest has re-entered a field of view of an imaging device
JP5988225B2 (en) 2015-02-25 2016-09-07 パナソニックIpマネジメント株式会社 Monitoring device and monitoring method
US9524450B2 (en) 2015-03-04 2016-12-20 Accenture Global Services Limited Digital image processing using convolutional neural networks
US10130232B2 (en) 2015-03-06 2018-11-20 Walmart Apollo, Llc Shopping facility assistance systems, devices and methods
US10810539B1 (en) 2015-03-25 2020-10-20 Amazon Technologies, Inc. Re-establishing tracking of a user within a materials handling facility
US10332089B1 (en) 2015-03-31 2019-06-25 Amazon Technologies, Inc. Data synchronization system
CN104778690B (en) 2015-04-02 2017-06-06 中国电子科技集团公司第二十八研究所 Multi-target positioning method based on camera network
GB201506444D0 (en) 2015-04-16 2015-06-03 Univ Essex Entpr Ltd Event detection and summarisation
JP5906558B1 (en) 2015-04-17 2016-04-20 パナソニックIpマネジメント株式会社 Customer behavior analysis apparatus, customer behavior analysis system, and customer behavior analysis method
US10217120B1 (en) 2015-04-21 2019-02-26 Videomining Corporation Method and system for in-store shopper behavior analysis with multi-modal sensor fusion
CN104850846B (en) 2015-06-02 2018-08-24 深圳大学 Human behavior recognition method and recognition system based on deep neural network
US11049063B2 (en) * 2015-06-04 2021-06-29 Centriq Technology, Inc. Asset communication hub
US20160358145A1 (en) * 2015-06-05 2016-12-08 Yummy Foods, Llc Systems and methods for frictionless self-checkout merchandise purchasing
US20160371726A1 (en) 2015-06-22 2016-12-22 Kabushiki Kaisha Toshiba Information processing apparatus, information processing method, and computer program product
US10210737B2 (en) 2015-06-23 2019-02-19 Cleveland State University Systems and methods for privacy-aware motion tracking with notification feedback
US10021458B1 (en) 2015-06-26 2018-07-10 Amazon Technologies, Inc. Electronic commerce functionality in video overlays
EP3113104B1 (en) 2015-06-30 2018-02-21 Softkinetic Software Method for signal processing
US10860837B2 (en) 2015-07-20 2020-12-08 University Of Maryland, College Park Deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition
US9912647B2 (en) * 2015-07-22 2018-03-06 International Business Machines Corporation Vehicle wireless internet security
US9911290B1 (en) 2015-07-25 2018-03-06 Gary M. Zalewski Wireless coded communication (WCC) devices for tracking retail interactions with goods and association to user accounts
CN105069413B (en) 2015-07-27 2018-04-06 电子科技大学 Human posture recognition method based on deep convolutional neural networks
CN106408080B (en) 2015-07-31 2019-01-01 富士通株式会社 The counting device and method of moving object
EP3128482A1 (en) 2015-08-07 2017-02-08 Xovis AG Method for calibration of a stereo camera
CA2995866A1 (en) 2015-09-03 2017-03-09 Miovision Technologies Incorporated System and method for detecting and tracking objects
US10474877B2 (en) 2015-09-22 2019-11-12 Google Llc Automated effects generation for animated content
US10110877B2 (en) 2015-10-12 2018-10-23 Dell Products, Lp Method and apparatus for depth algorithm adjustment to images based on predictive analytics and sensor feedback in an information handling system
TWI553494B (en) 2015-11-04 2016-10-11 創意引晴股份有限公司 Multi-modal fusion based Intelligent fault-tolerant video content recognition system and recognition method
US20170148005A1 (en) 2015-11-20 2017-05-25 The Answer Group, Inc. Integrated Automatic Retail System and Method
US9953217B2 (en) 2015-11-30 2018-04-24 International Business Machines Corporation System and method for pose-aware feature learning
US20170161555A1 (en) 2015-12-04 2017-06-08 Pilot Ai Labs, Inc. System and method for improved virtual reality user interaction utilizing deep-learning
US10318008B2 (en) 2015-12-15 2019-06-11 Purdue Research Foundation Method and system for hand pose detection
US10417696B2 (en) 2015-12-18 2019-09-17 Ricoh Co., Ltd. Suggestion generation based on planogram matching
WO2017107168A1 (en) 2015-12-25 2017-06-29 Intel Corporation Event-driven framework for gpu programming
WO2017123920A1 (en) 2016-01-14 2017-07-20 RetailNext, Inc. Detecting, tracking and counting objects in videos
US20170206664A1 (en) 2016-01-14 2017-07-20 James Shen Method for identifying, tracking persons and objects of interest
TW201725545A (en) 2016-01-15 2017-07-16 T Wallet Co Ltd Mobile payment method that effectively overcomes the potential risk of financial information of the user being misappropriated
WO2017151241A2 (en) 2016-01-21 2017-09-08 Wizr Llc Video processing
US10262331B1 (en) 2016-01-29 2019-04-16 Videomining Corporation Cross-channel in-store shopper behavior analysis
US20170249339A1 (en) 2016-02-25 2017-08-31 Shutterstock, Inc. Selected image subset based search
EP3420520A4 (en) 2016-02-26 2019-10-23 Imagr Limited System and methods for shopping in a physical store
JP6968399B2 (en) 2016-02-29 2021-11-17 サインポスト株式会社 Information processing system
US10497049B2 (en) 2016-03-05 2019-12-03 Home Depot Product Authority, Llc Optimistic product order reservation system and method
KR102462572B1 (en) 2016-03-17 2022-11-04 모토로라 솔루션즈, 인크. Systems and methods for training object classifiers by machine learning
US10813477B2 (en) 2016-03-22 2020-10-27 Nec Corporation Image display device, image display system, image display method, and program
US10277831B2 (en) 2016-03-25 2019-04-30 Fuji Xerox Co., Ltd. Position identifying apparatus and method, path identifying apparatus, and non-transitory computer readable medium
US20170308911A1 (en) 2016-04-21 2017-10-26 Candice Barham Shopping cart tracking system and method for analyzing shopping behaviors
US10846996B2 (en) 2016-04-25 2020-11-24 Standard Cognition Corp. Registry verification for a mechanized store using radio frequency tags
US9886827B2 (en) 2016-04-25 2018-02-06 Bernd Schoner Registry verification for a mechanized store
US10387896B1 (en) 2016-04-27 2019-08-20 Videomining Corporation At-shelf brand strength tracking and decision analytics
JP7009389B2 (en) 2016-05-09 2022-01-25 グラバンゴ コーポレイション Systems and methods for computer vision driven applications in the environment
US10891589B2 (en) 2019-03-13 2021-01-12 Simbe Robotics, Inc. Method for deploying fixed and mobile sensors for stock keeping in a store
US10354262B1 (en) 2016-06-02 2019-07-16 Videomining Corporation Brand-switching analysis using longitudinal tracking of at-shelf shopper behavior
US10645366B2 (en) 2016-06-10 2020-05-05 Lucid VR, Inc. Real time re-calibration of stereo cameras
US10282621B2 (en) 2016-07-09 2019-05-07 Grabango Co. Remote state following device
WO2018023124A1 (en) 2016-07-29 2018-02-01 ACF Technologies, Inc. Automated queuing system
US10380756B2 (en) 2016-09-08 2019-08-13 Sony Corporation Video processing system and method for object detection in a sequence of image frames
KR20180032400A (en) 2016-09-22 2018-03-30 한국전자통신연구원 Multiple object tracking apparatus based on object information of multiple cameras and method therefor
US10409548B2 (en) 2016-09-27 2019-09-10 Grabango Co. System and method for differentially locating and modifying audio sources
US10210603B2 (en) 2016-10-17 2019-02-19 Conduent Business Services Llc Store shelf imaging system and method
US20180137480A1 (en) * 2016-11-11 2018-05-17 Honey Inc. Mobile device gesture and proximity communication
US10529137B1 (en) 2016-11-29 2020-01-07 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Machine learning systems and methods for augmenting images
US20180150788A1 (en) 2016-11-30 2018-05-31 Wipro Limited Inventory control system and a method for inventory control in an establishment
US11068949B2 (en) 2016-12-09 2021-07-20 365 Retail Markets, Llc Distributed and automated transaction systems
US10165194B1 (en) 2016-12-16 2018-12-25 Amazon Technologies, Inc. Multi-sensor camera system
JP2018099317A (en) 2016-12-20 2018-06-28 大日本印刷株式会社 Display shelf, display shelf system, information processor and program
SE542415C2 (en) 2016-12-21 2020-04-28 Itab Scanflow Ab Training of an automatic in-store registration system
US10410253B2 (en) 2016-12-27 2019-09-10 Conduent Business Services, Llc Systems and methods for dynamic digital signage based on measured customer behaviors through video analytics
US11526893B2 (en) * 2016-12-29 2022-12-13 Capital One Services, Llc System and method for price matching through receipt capture
US10592771B2 (en) 2016-12-30 2020-03-17 Accenture Global Solutions Limited Multi-camera object tracking
JP6734940B2 (en) 2017-02-01 2020-08-05 株式会社日立製作所 Three-dimensional measuring device
US20180225625A1 (en) 2017-02-06 2018-08-09 Cornelius, Inc. Inventory Management System and Method
US11132737B2 (en) 2017-02-10 2021-09-28 Grabango Co. Dynamic customer checkout experience within an automated shopping environment
US20180240180A1 (en) 2017-02-20 2018-08-23 Grabango Co. Contextually aware customer item entry for autonomous shopping applications
GB2560177A (en) 2017-03-01 2018-09-05 Thirdeye Labs Ltd Training a computational neural network
GB2560387B (en) 2017-03-10 2022-03-09 Standard Cognition Corp Action identification using neural networks
US10929829B1 (en) * 2017-05-04 2021-02-23 Amazon Technologies, Inc. User identification and account access using gait analysis
US10778906B2 (en) 2017-05-10 2020-09-15 Grabango Co. Series-configured camera array for efficient deployment
US10515518B2 (en) * 2017-05-18 2019-12-24 Bank Of America Corporation System for providing on-demand resource delivery to resource dispensers
US10341606B2 (en) 2017-05-24 2019-07-02 SA Photonics, Inc. Systems and method of transmitting information from monochrome sensors
US10943088B2 (en) 2017-06-14 2021-03-09 Target Brands, Inc. Volumetric modeling to identify image areas for pattern recognition
US20180374076A1 (en) 2017-06-21 2018-12-27 Therman Wheeler Proximity based interactions via mobile devices
US10740742B2 (en) 2017-06-21 2020-08-11 Grabango Co. Linked observed human activity on video to a user account
US20190034735A1 (en) 2017-07-25 2019-01-31 Motionloft, Inc. Object detection sensors and systems
US11250376B2 (en) 2017-08-07 2022-02-15 Standard Cognition, Corp Product correlation analysis using deep learning
US10474991B2 (en) * 2017-08-07 2019-11-12 Standard Cognition, Corp. Deep learning-based store realograms
US10650545B2 (en) * 2017-08-07 2020-05-12 Standard Cognition, Corp. Systems and methods to check-in shoppers in a cashier-less store
US10445694B2 (en) 2017-08-07 2019-10-15 Standard Cognition, Corp. Realtime inventory tracking using deep learning
US10853965B2 (en) 2017-08-07 2020-12-01 Standard Cognition, Corp Directional impression analysis using deep learning
US10133933B1 (en) 2017-08-07 2018-11-20 Standard Cognition, Corp Item put and take detection using image recognition
US10055853B1 (en) 2017-08-07 2018-08-21 Standard Cognition, Corp Subject identification and tracking using image recognition
US11200692B2 (en) * 2017-08-07 2021-12-14 Standard Cognition, Corp Systems and methods to check-in shoppers in a cashier-less store
US10127438B1 (en) 2017-08-07 2018-11-13 Standard Cognition, Corp Predicting inventory events using semantic diffing
US11023850B2 (en) 2017-08-07 2021-06-01 Standard Cognition, Corp. Realtime inventory location management using deep learning
US10474988B2 (en) 2017-08-07 2019-11-12 Standard Cognition, Corp. Predicting inventory events using foreground/background processing
US11232687B2 (en) 2017-08-07 2022-01-25 Standard Cognition, Corp Deep learning-based shopper statuses in a cashier-less store
EP3665648A4 (en) 2017-08-07 2020-12-30 Standard Cognition, Corp. Item put and take detection using image recognition
GB201715509D0 (en) 2017-09-25 2017-11-08 Thirdeye Labs Ltd Person identification across multiple captured images
US11301684B1 (en) 2017-09-29 2022-04-12 Amazon Technologies, Inc. Vision-based event detection
EP3474184A1 (en) 2017-10-20 2019-04-24 Checkout Technologies srl Device for detecting the interaction of users with products arranged on a stand or display rack of a store
EP3474183A1 (en) 2017-10-20 2019-04-24 Checkout Technologies srl System for tracking products and users in a store
US11080780B2 (en) * 2017-11-17 2021-08-03 Ebay Inc. Method, system and computer-readable media for rendering of three-dimensional model data based on characteristics of objects in a real-world environment
CN108055390B (en) 2017-11-22 2021-11-02 厦门瞳景智能科技有限公司 AR method and system for determining corresponding id of client based on mobile phone screen color
US20190180272A1 (en) 2017-12-12 2019-06-13 Janathon R. Douglas Distributed identity protection system and supporting network for providing personally identifiable financial information protection services
US10460468B2 (en) 2017-12-15 2019-10-29 Motorola Mobility Llc User pose and item correlation
US20190205905A1 (en) 2017-12-31 2019-07-04 OneMarket Network LLC Machine Learning-Based Systems and Methods of Determining User Intent Propensity from Binned Time Series Data
KR20190093733A (en) 2018-01-10 2019-08-12 트라이큐빅스 인크. Items recognition system in unmanned store and the method thereof
CA2995242A1 (en) 2018-02-15 2019-08-15 Wrnch Inc. Method and system for activity classification
US20200151692A1 (en) 2018-04-18 2020-05-14 Sbot Technologies, Inc. d/b/a Caper Inc. Systems and methods for training data generation for object identification and self-checkout anti-theft
US10175340B1 (en) 2018-04-27 2019-01-08 Lyft, Inc. Switching between object detection and data transfer with a vehicle radar
AU2018204004A1 (en) 2018-06-06 2020-01-02 Canon Kabushiki Kaisha Method, system and apparatus for selecting frames of a video sequence
US10282852B1 (en) 2018-07-16 2019-05-07 Accel Robotics Corporation Autonomous store tracking system
US10535146B1 (en) 2018-07-16 2020-01-14 Accel Robotics Corporation Projected image item tracking system
US20220230216A1 (en) 2018-07-16 2022-07-21 Accel Robotics Corporation Smart shelf that combines weight sensors and cameras to identify events
US11394927B2 (en) 2018-07-16 2022-07-19 Accel Robotics Corporation Store device network that transmits power and data through mounting fixtures
US10373322B1 (en) 2018-07-16 2019-08-06 Accel Robotics Corporation Autonomous store system that analyzes camera images to track people and their interactions with items
US20210158430A1 (en) 2018-07-16 2021-05-27 Accel Robotics Corporation System that performs selective manual review of shopping carts in an automated store
US10909694B2 (en) 2018-07-16 2021-02-02 Accel Robotics Corporation Sensor bar shelf monitor
US10282720B1 (en) 2018-07-16 2019-05-07 Accel Robotics Corporation Camera-based authorization extension system
CA3112512A1 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Systems and methods to check-in shoppers in a cashier-less store
WO2020023796A2 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Realtime inventory location management using deep learning
WO2020023799A1 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Product correlation analysis using deep learning
WO2020023798A1 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Deep learning-based store realograms
JP7228670B2 (en) 2018-07-26 2023-02-24 スタンダード コグニション コーポレーション Real-time inventory tracking using deep learning
WO2020023926A1 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Directional impression analysis using deep learning
WO2020023930A1 (en) 2018-07-26 2020-01-30 Standard Cognition, Corp. Deep learning-based shopper statuses in a cashier-less store
JP2021532382A (en) 2018-07-30 2021-11-25 ポニー エーアイ インコーポレイテッド Systems and methods for calibrating in-vehicle vehicle cameras
US10257708B1 (en) * 2018-08-20 2019-04-09 OpenPath Security Inc. Device for triggering continuous application execution using beacons
US20200074432A1 (en) 2018-08-31 2020-03-05 Standard Cognition, Corp Deep learning-based actionable digital receipts for cashier-less checkout
FR3087558B1 (en) 2018-10-19 2021-08-06 Idemia Identity & Security France METHOD OF EXTRACTING CHARACTERISTICS FROM A FINGERPRINT REPRESENTED BY AN INPUT IMAGE
US11062460B2 (en) 2019-02-13 2021-07-13 Adobe Inc. Representation learning using joint semantic vectors
US20200294079A1 (en) 2019-02-21 2020-09-17 Walmart Apollo, Llc Method and apparatus for calculating promotion adjusted loyalty
US11232575B2 (en) 2019-04-18 2022-01-25 Standard Cognition, Corp Systems and methods for deep learning-based subject persistence
US11915225B2 (en) * 2019-05-03 2024-02-27 Visa International Service Association Mobile merchant payment system
GB2584400A (en) 2019-05-08 2020-12-09 Thirdeye Labs Ltd Processing captured images
US11315287B2 (en) 2019-06-27 2022-04-26 Apple Inc. Generating pose information for a person in a physical environment
KR102223570B1 (en) 2019-08-01 2021-03-05 이선호 Solar thermal powered thermoelectric generator system with solar tracker
US20210295081A1 (en) 2020-03-19 2021-09-23 Perceive Inc Tracking and analytics system
US11860289B2 (en) * 2021-09-14 2024-01-02 At&T Intellectual Property I, L.P. Moving user equipment geolocation

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11270260B2 (en) 2017-08-07 2022-03-08 Standard Cognition Corp. Systems and methods for deep learning-based shopper tracking
US11544866B2 (en) 2017-08-07 2023-01-03 Standard Cognition, Corp Directional impression analysis using deep learning
US11200692B2 (en) 2017-08-07 2021-12-14 Standard Cognition, Corp Systems and methods to check-in shoppers in a cashier-less store
US11232687B2 (en) 2017-08-07 2022-01-25 Standard Cognition, Corp Deep learning-based shopper statuses in a cashier-less store
US11250376B2 (en) 2017-08-07 2022-02-15 Standard Cognition, Corp Product correlation analysis using deep learning
US11810317B2 (en) 2017-08-07 2023-11-07 Standard Cognition, Corp. Systems and methods to check-in shoppers in a cashier-less store
US11023850B2 (en) 2017-08-07 2021-06-01 Standard Cognition, Corp. Realtime inventory location management using deep learning
US11538186B2 (en) 2017-08-07 2022-12-27 Standard Cognition, Corp. Systems and methods to check-in shoppers in a cashier-less store
US10853965B2 (en) 2017-08-07 2020-12-01 Standard Cognition, Corp Directional impression analysis using deep learning
US11948313B2 (en) 2019-04-18 2024-04-02 Standard Cognition, Corp Systems and methods of implementing multiple trained inference engines to identify and track subjects over multiple identification intervals
US11361468B2 (en) 2020-06-26 2022-06-14 Standard Cognition, Corp. Systems and methods for automated recalibration of sensors for autonomous checkout
US11303853B2 (en) 2020-06-26 2022-04-12 Standard Cognition, Corp. Systems and methods for automated design of camera placement and cameras arrangements for autonomous checkout
US11818508B2 (en) 2020-06-26 2023-11-14 Standard Cognition, Corp. Systems and methods for automated design of camera placement and cameras arrangements for autonomous checkout
US20220067390A1 (en) * 2020-09-01 2022-03-03 Lg Electronics Inc. Automated shopping experience using cashier-less systems
US11966901B2 (en) * 2020-09-01 2024-04-23 Lg Electronics Inc. Automated shopping experience using cashier-less systems

Also Published As

Publication number Publication date
US20240070895A1 (en) 2024-02-29
US11200692B2 (en) 2021-12-14
US20230140693A1 (en) 2023-05-04
US11810317B2 (en) 2023-11-07

Similar Documents

Publication Publication Date Title
US11538186B2 (en) Systems and methods to check-in shoppers in a cashier-less store
US11810317B2 (en) Systems and methods to check-in shoppers in a cashier-less store
US20220130220A1 (en) Assigning, monitoring and displaying respective statuses of subjects in a cashier-less store
US11948313B2 (en) Systems and methods of implementing multiple trained inference engines to identify and track subjects over multiple identification intervals
TWI787536B (en) Systems and methods to check-in shoppers in a cashier-less store
US11544866B2 (en) Directional impression analysis using deep learning
US11250376B2 (en) Product correlation analysis using deep learning
US10055853B1 (en) Subject identification and tracking using image recognition
US11023850B2 (en) Realtime inventory location management using deep learning
US10127438B1 (en) Predicting inventory events using semantic diffing
US20190156276A1 (en) Realtime inventory tracking using deep learning
US20200074432A1 (en) Deep learning-based actionable digital receipts for cashier-less checkout
WO2020023930A1 (en) Deep learning-based shopper statuses in a cashier-less store
US20190156275A1 (en) Systems and methods for deep learning-based notifications
US20190156273A1 (en) Deep learning-based store realograms
WO2020023926A1 (en) Directional impression analysis using deep learning
WO2020023799A1 (en) Product correlation analysis using deep learning
US20230088414A1 (en) Machine learning-based re-identification of shoppers in a cashier-less store for autonomous checkout

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: STANDARD COGNITION, CORP., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FISHER, JORDAN E.;GREEN, WARREN;FISCHETTI, DANIEL L.;SIGNING DATES FROM 20190220 TO 20190226;REEL/FRAME:061832/0017