WO2022241294A2 - Identity recognition using body characteristics associated with a face - Google Patents

Identity recognition using body characteristics associated with a face

Info

Publication number
WO2022241294A2
WO2022241294A2 (application PCT/US2022/029314)
Authority
WO
WIPO (PCT)
Prior art keywords
person
bodyprint
bodyprints
video frames
camera
Prior art date
Application number
PCT/US2022/029314
Other languages
English (en)
Other versions
WO2022241294A3 (fr)
Inventor
Nitin Gupta
Jingwen ZHU
Jonghoon Jin
Andrew C. Edwards
Floris CHABERT
Vinay Sharma
Hendrik Dahlkamp
Original Assignee
Apple Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apple Inc. filed Critical Apple Inc.
Priority to CN202280034667.6A (publication CN117377988A)
Priority to EP22729364.4A (publication EP4338136A2)
Publication of WO2022241294A2
Publication of WO2022241294A3


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 Recognition of whole body movements, e.g. for sport training
    • G06V 40/25 Recognition of walking or running movements, e.g. gait recognition

Definitions

  • An application may analyze the image to determine characteristics of the person’s face and then attempt to match the person’s face with other known faces.
  • Identity recognition is a growing field and various challenges exist related to performing recognition. For example, sometimes a video camera may not be able to perform facial recognition of a person, given a particular video feed (e.g., if the person is facing away from the camera). In these scenarios, it may be difficult to accurately perform identity recognition of the person.
  • FIG.1 is a simplified block diagram of an example system, according to some embodiments;
  • FIG.2 is another simplified block diagram illustrating at least some example techniques for providing a notification based on determining the presence of a particular person at a location, according to some embodiments;
  • FIG.3 is another simplified block diagram illustrating at least some example techniques for providing a notification based on determining the presence of a particular person at a location, according to some embodiments;
  • FIG.4 is another simplified block diagram illustrating at least some example techniques involving a user interface (UI) for providing a notification based on determining the presence of a particular person at a location, according to some embodiments;
  • FIG.5 is another simplified block diagram illustrating an example architecture of a system used to provide notifications based on determining the presence of a particular person at a location, according to some embodiments;
  • FIG.6 is a simplified flow diagram illustrating a process for
  • FIG.8 is a simplified flow diagram illustrating a process for determining whether to add a newly identified bodyprint (e.g., torso image) to a cluster (e.g., gallery) of bodyprints for a known person, according to some embodiments.
  • DETAILED DESCRIPTION [0011]
  • various examples will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the examples. However, it will also be apparent to one skilled in the art that the examples may be practiced without the specific details. Furthermore, well-known features may be omitted or simplified in order not to obscure the example being described.
  • the device may then store physical characteristic information of the particular person in association with the identity of the particular person based on having recognized the face of the particular person from the first video feed.
  • a gallery of images may be stored, where each image of the gallery includes a torso, and the images of the torso can be associated with the particular person whose face has been identified.
  • the identified face may correspond to a person in a contacts list, such that the face/person are known by an owner/user of the device.
  • the device can receive a second video feed showing a second person whose face is determined to not be recognized by the device (e.g., an obstructed view or poor image quality) or is not visible to the device (e.g., walking away from the camera).
  • the device can compare the stored physical characteristic information of the first person (e.g., the gallery of images) with additional physical characteristic information of the second person shown in the second video feed (note: the second person may be the first person; however, this is yet to be determined). Based on the comparison, the device can provide a notification indicating whether the identity of the second person corresponds to the identity of the first person.
  • the notification may be a message (e.g., a text, email, pop-up, etc.) or it may be a name or visual identifier on a screen displaying the frames or video. In this way, techniques may enable a device to identify a person without having a view or quality image of the face.
  • the device may provide notifications of the presence of a particular person in a wider range of scenarios and/or with higher precision/recall, for example, including when the person’s face may not be shown in a video feed and/or recognizable by the device.
  • the gallery of images may only be stored for a particular amount of time (e.g., one day, one week, or the like). In this way, the gallery of images may be repopulated each day (or longer) for each detected person.
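  • To make that retention policy concrete, here is a minimal sketch (not from the patent; the one-day window, the (timestamp, image) layout, and the function name are assumptions) of pruning a gallery by timestamp so that it repopulates with fresh images:

```python
import time

# Hypothetical sketch of the retention policy described above; the one-day
# window and (timestamp, image) entry layout are assumptions, not the
# patent's actual design.
RETENTION_SECONDS = 24 * 60 * 60  # e.g., one day

def prune_gallery(gallery):
    """Keep only (timestamp, image) entries newer than the retention window."""
    cutoff = time.time() - RETENTION_SECONDS
    return [(ts, img) for ts, img in gallery if ts >= cutoff]
```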
  • a first device (e.g., a resident device, such as a smart speaker, a smart digital media player, and/or any device that may be configured or otherwise intended to be relatively stationary within a location (e.g., a home)) may be communicatively connected to a first camera (e.g., positioned to observe a home entrance location near the front door) and a second camera (e.g., positioned inside the home to survey an area within the home, such as a hallway, living room area, etc.).
  • the resident device may receive a first video feed (e.g., including a first plurality of video frames) during a first time period (e.g., a first phase, during a particular weekday morning).
  • the first plurality of video frames may show a first person approaching the front door of the home, whereby at least one of the frames of the first plurality of video frames includes a face of the first person.
  • the resident device can identify an identity of the first person based at least in part on recognizing the face of the first person (e.g., utilizing a suitable facial recognition algorithm).
  • the resident device may have previously stored a faceprint of the first person (e.g., a personal contact associated with the home, such as a resident, friend, babysitter, housekeeper, or the like), whereby the faceprint corresponds to a unique multidimensional (e.g., vector-based) representation of the face.
  • a dimension of the vector may be associated with at least one characteristic of the face of the contact.
  • the facial recognition algorithm may compare a faceprint, determined from one of the frames of the first plurality of video frames, with the previously stored faceprint that is associated with the first person. Upon determining that the faceprints match, the resident device may thereby identify the identity of the first person shown in the first plurality of video frames (e.g., as matching the personal contact).
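  • As a rough illustration of the faceprint comparison just described, the sketch below checks a frame's faceprint against a previously stored reference; the 128-dimension embedding size, the Euclidean-distance metric, and the threshold value are all assumptions, since the patent does not specify them:

```python
import numpy as np

FACEPRINT_DIM = 128      # assumed embedding size
MATCH_THRESHOLD = 0.6    # assumed distance cutoff

def faceprints_match(candidate: np.ndarray, stored: np.ndarray,
                     threshold: float = MATCH_THRESHOLD) -> bool:
    """Compare a faceprint from a frame against a previously stored faceprint."""
    assert candidate.shape == stored.shape == (FACEPRINT_DIM,)
    distance = np.linalg.norm(candidate - stored)  # Euclidean distance
    return distance <= threshold
```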
  • the resident device then may identify one or more physical characteristics of the first person, and then may store physical characteristic information corresponding to the identified one or more physical characteristics. For example, the resident device may determine that at least one of the frames of the first plurality of video frames includes both the recognized face and other body features of the first person, including, but not limited to a torso, one or more arms, and/or one or more legs. The resident device then may identify one or more physical characteristics of the first person based on the other body features. These physical characteristics may also be associated with, for example, a body shape, a texture and/or color of the body of the first person, a texture and/or color of an article of clothing worn by the first person, etc.
  • physical characteristic information corresponding to the identified one or more physical characteristics may be stored by the resident device.
  • these physical characteristics may be stored (e.g., by the resident device) as a gallery of images (e.g., a gallery of torsos), and may only be stored for one day (or may be stored longer if desired).
  • the resident device may generate a bodyprint of the first person.
  • the bodyprint may include a multidimensional vector that is associated with (e.g., represents) the one or more physical characteristics of the first person, whereby a dimension of the vector may be associated with at least one characteristic of the body of the contact.
  • a bodyprint may utilize any suitable format and/or data structure that may be suitable for efficient comparison between bodyprints.
  • the bodyprint may be stored in association with the identity of the first person.
  • the resident device may utilize the frame(s) that showed the recognized face and body features of the first person to associate the identity of the first person with the bodyprint.
  • a cluster of bodyprints (e.g., the gallery of images) may be maintained, whereby a different bodyprint may be generated for respective frames of the first plurality of video frames that show at least a portion of the body of the first person.
  • video frames may respectively capture different aspects (e.g., shape, color, texture) of the person’s torso, arms, legs, and/or other body features.
  • the resident device may generate a bodyprint based on the respective frame and then associate the bodyprint with the identity of the first person.
  • the resident device may select a bodyprint to be included in the cluster of bodyprints based on determining that the bodyprint provides an acceptable information gain (e.g., above a threshold quality level) that may be subsequently used to compare against another bodyprint when performing identity recognition.
  • a bodyprint generated from a particular frame may be a candidate for inclusion within the cluster of bodyprints even though the face of the first person may not be recognizable in the particular frame. This may be the case, for example, if the resident device determines (e.g., based on executing a motion tracking algorithm) that the body shown in the particular frame is the same body that is shown in a related frame of the first plurality of frames.
  • the body (and/or body portion) shown in the related frame may include a face that is recognized by the resident device as corresponding to the identity of the first person.
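  • One plausible reading of the information-gain selection above is that a candidate bodyprint is admitted only when it differs enough from every bodyprint already in the cluster, so the gallery gains a new viewpoint. The sketch below is an assumption-laden illustration; the novelty threshold and the cluster-size cap are invented for the example:

```python
import numpy as np

MAX_CLUSTER_SIZE = 20     # assumed cap on gallery size
NOVELTY_THRESHOLD = 0.4   # assumed minimum distance to the nearest member

def maybe_add_to_cluster(cluster: list, candidate: np.ndarray) -> bool:
    """Admit a candidate bodyprint only if it adds enough new information."""
    if len(cluster) >= MAX_CLUSTER_SIZE:
        return False
    if cluster:
        nearest = min(np.linalg.norm(candidate - bp) for bp in cluster)
        if nearest < NOVELTY_THRESHOLD:  # too similar to an existing bodyprint
            return False
    cluster.append(candidate)
    return True
```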
  • the resident device receives a second plurality of video frames that includes a second person.
  • the resident device may determine that a face of the second person is not recognized or is not in view of the first device.
  • the second plurality of video frames may show the body of the second person walking away from the camera, with a portion of their face facing away from the camera (and/or otherwise obscured).
  • the second plurality of video frames may be poor image quality, such that the face of the person is not recognizable at all. Accordingly, while the body of the second person may be captured in frames of the second plurality of video frames, the resident device may determine that the face (e.g., and/or identity) of the second person is not recognized based on facial recognition. The resident device thus may attempt to determine the identity of the second person based on comparing previously stored physical characteristic information (e.g., a bodyprint of the first person) with additional physical characteristic information (e.g., a bodyprint) of the second person.
  • the resident device determines that the bodyprint of the first person matches the bodyprint of the second person (e.g., based on computing a Euclidean distance between bodyprint vectors).
  • a machine learning model of the resident device may be trained to generate the vectors and/or perform the comparison based on learning to associate a first bodyprint of a particular person with a second (e.g., different) bodyprint of the same person, whereby the training is performed over a plurality of training samples.
  • the resident device may determine that the second bodyprint matches the first bodyprint (or at least some number or percentage of the bodyprints in the gallery (e.g., cluster) of bodyprints).
  • the resident device may provide a notification (e.g., an audio announcement and/or text-based message) indicating that the identity of the second person corresponds to the identity of the first person, or that the identity of the second person is not known.
  • the resident device may store a plurality of clusters of bodyprints, respectively, for contact persons associated with the home environment. Accordingly, when attempting to determine whether a recently determined bodyprint (e.g., of the second person, in the illustration above) matches a previously stored bodyprint, the resident device may compare different bodyprints of a same cluster with the recently determined bodyprint and/or compare bodyprints of a plurality of clusters with the recently determined bodyprint.
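  • A minimal sketch of that cluster comparison follows; the distance threshold, the required match fraction, and the best-match voting scheme are assumptions for illustration, since the patent only says that some number or percentage of a cluster must match:

```python
import numpy as np
from typing import Optional

DISTANCE_THRESHOLD = 0.5   # assumed per-bodyprint match cutoff
MIN_MATCH_FRACTION = 0.3   # assumed fraction of a cluster that must match

def identify(probe: np.ndarray, clusters: dict) -> Optional[str]:
    """Return the contact whose bodyprint cluster best matches the probe."""
    best_name, best_fraction = None, 0.0
    for name, cluster in clusters.items():   # one cluster per contact person
        if not cluster:
            continue
        hits = sum(np.linalg.norm(probe - bp) <= DISTANCE_THRESHOLD
                   for bp in cluster)
        fraction = hits / len(cluster)
        if fraction >= MIN_MATCH_FRACTION and fraction > best_fraction:
            best_name, best_fraction = name, fraction
    return best_name
```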
  • the second bodyprint may be used to compare against the gallery.
  • once a good quality gallery of bodyprints is established for the contact (e.g., a person identified in a user’s contacts), the quality of the gallery will not be affected by future images (e.g., including images that are not actually bodyprints (e.g., not a torso) at all).
  • a machine learning model executing on the resident device may also (and/or alternatively) be trained to compare other physical characteristics of a first person with a second person to determine if there is a match.
  • the physical characteristics may be associated with a motion of a person over time (e.g., a gait of the person as they walk).
  • a “motionprint” that captures a unique motion of the person over time may be stored in a suitable data structure (e.g., a multidimensional vector).
  • a previously stored motionprint of a first person may be compared against a recently generated motionprint of a second person to determine if there is a match. If so, then the resident device may utilize this information to determine whether an identity of the second person matches the identity of the first person.
  • techniques described herein may enable a system to determine an identity of a person from a video feed even when the person’s face may be determined to not be recognized by the system. This may improve upon existing systems by expanding the range of scenarios in which a user’s identity may be recognized. This may, for example, improve a customer experience of a system operating in a home environment by optionally notifying users about the presence of a particular person in a location in the home. This functionality may be enabled even when a camera observing the location may not often capture a detailed (e.g., good quality) image of the face of a person at the location, thus making facial recognition difficult to accurately perform (e.g., with acceptable precision and/or recall).
  • techniques described herein provide another mechanism (e.g., a fallback mechanism or a parallel mechanism) for performing identity recognition.
  • techniques described herein may be used to determine when one or more physical characteristics of a particular person are detected to have changed. For example, a system may detect a likelihood that clothing worn by the particular person may have changed from a first time period (e.g., in the morning) to a second time period (e.g., in the afternoon). In some embodiments, this may be useful for updating bodyprints of a particular person on a suitable cadence, to ensure that the accuracy of detection remains acceptable.
  • techniques described may enable a mechanism whereby a home environment that includes multiple cameras (e.g., each receiving video feeds) can utilize respective feeds of the cameras to generate a high-quality cluster of bodyprints of a person, based on frames drawn from the multiple cameras.
  • the cluster of bodyprints of the person can further be synchronized across multiple devices (e.g., user devices, resident devices) of the home environment, so that each device has a high quality set of bodyprints (and/or faceprints) that may be used for identity detection in the home environment setting.
  • this synchronization mechanism among a plurality of devices (e.g., resident devices) of the home environment may enable more privacy controls.
  • a cluster of bodyprints may be synchronized locally among devices of the home environment.
  • a system of the present disclosure may be used in any suitable environment, such as an office building, a warehouse, a parking lot, a public park, etc.
  • FIG.1 is a simplified block diagram 100 that illustrates a system notification service operating in an example environment, according to some embodiments.
  • the example environment depicted is a home environment 101.
  • the home environment 101 may be associated with one or more people (e.g., contact persons) who have some common affiliation (e.g., family members, roommates, a housekeeper, a babysitter, etc.).
  • user 112 may represent an affiliated user (e.g., “Person A,” who may be the housekeeper).
  • the home environment 101 may include a device 102 (e.g., a tablet, a smart home controller, a smart digital media player, a home automation device (e.g., that is part of a home automation system), or the like). The device 102 (e.g., a resident device of the home) may execute a notification service 104, whereby the notification service 104 may notify the home that the detected user 114 matches the identity of user 112 (e.g., “Person A”).
  • the home environment 101 may be associated with a physical location and/or structure (e.g., a house and/or a surrounding yard or walkway, etc.), whereby one or more cameras may be positioned within the home environment 101.
  • techniques described herein may be performed within any suitable physical environment (e.g., a physical location), whereby one or more cameras may be positioned.
  • Some non-limiting examples include a house, a recreation center, an office building, an outdoor park, a parking lot, etc.
  • a camera of the home environment 101 may be positioned at any suitable location associated with the home environment 101.
  • observation camera 108 is positioned near the front door to survey the outdoor entrance to the home
  • observation camera 110 is positioned to survey an interior corridor of the home.
  • cameras may survey additional and/or alternate locations of the home environment 101, including, for example, the backyard of the home, a particular room (e.g., a living room), a garage, etc.
  • an observation camera may be a webcam, a pan-tilt-zoom (PTZ) camera, etc., which may be communicatively connected to a separate device (e.g., device 102).
  • an observation camera may be a component of a user device (e.g., a tablet, a mobile phone), which may, in turn, be connected to a separate device (e.g., device 102).
  • the device 102 itself may include an observation camera. It should be understood that any suitable arrangement may be used to communicatively connect cameras and devices, as described herein.
  • the observation cameras are communicatively connected (e.g., via a WiFi signal) to the device 102, whereby the observation cameras respectively transmit a video feed of any suitable image (e.g., frame) quality to the device 102.
  • the device 102 may be any suitable computing device that is associated with (e.g., resides in) a particular environment and is configured to receive a video feed (e.g., a plurality of video frames) from an observation camera, analyze the frames to determine if a detected person in the video frames matches a particular identity, and then perform one or more operations upon completion of the analysis (e.g., providing a notification to a user or tagging that user with their identity for logging, later review of the feed, and/or while viewing a live stream of the video).
  • the device 102 may correspond to a resident device.
  • the resident device 102 may provide a notification by announcing, for example, that a particular contact of the home environment 101 (e.g., user 112) has arrived.
  • the resident device 102 may transmit a message to one or more user devices that the particular contact has arrived. For example, an alert message may pop up on a display of a user device. It should be understood that notifications may be provided by a resident device using any suitable channel and/or method, depending, for example, on the type of resident device, a type of user device, the surrounding environment, etc.
  • the resident device may correspond to a smart TV device (e.g., a digital media player that is connected to a TV).
  • the smart TV device may be equipped to present a graphical user interface (GUI) on the TV, which may include a Picture-in-Picture (PIP) presentation.
  • the resident device may provide a notification in the form of an audiovisual (AV) feed.
  • the resident device may display a video feed (e.g., received from observation camera 108 or 110) in the inset window of the TV.
  • the resident device may enable bi-directional communication between a user in the home environment and the person (e.g., user 112) outside.
  • the resident device 102 may contain a local memory repository that is suitable for storing and processing information associated with images received from one or more cameras. This may include, for example, physical characteristic information of a person detected in an image. In some embodiments, physical characteristic information may include facial characteristics of the person (e.g., stored within a faceprint).
  • physical characteristic information may also and/or alternatively include non-facial characteristics associated with the body of the person and/or movements of the body. This information may be associated with, for example, a head of a person, a torso of the person, a gait of the person, one or more arms or legs, a body shape, etc. In some embodiments, information associated with the body of the person may be included within a bodyprint (e.g., a multidimensional vector), described further herein. In some embodiments, physical characteristic information may include information associated with a texture or color of the body of the person and/or an article of clothing worn by the person.
  • physical characteristic information may include any suitable combination of facial and/or non-facial characteristics associated with a person’s body.
  • the resident device 102 may store physical characteristic information of a person in association with an identity of the person. For example, using diagram 100 for illustration, the resident device 102 may store, among other things, a faceprint of user 112, who may be a contact of the home environment.
  • physical characteristic information (e.g., a faceprint, a bodyprint, an image, etc.) of a first person may be captured and/or stored in association with the identity of the first person (e.g., user 112) at any suitable time, and subsequently processed according to any suitable method.
  • the notification service 104 may receive images of the first person (e.g., user 112, “Person A” in FIG.1) at a time before time T1 from a suitable source (e.g., a photo library of a user device, an observation camera, etc.), generate a faceprint from the image (e.g., via the detection model 106), and store the faceprint within a set of reference faceprints for user 112. Subsequently, at time T1, the notification service 104 of resident device 102 may receive a first plurality of frames (e.g., in a video feed) from observation camera 108.
  • the detection model 106 of resident device 102 may then utilize the set of reference faceprints of the first person (e.g., user 112) to perform facial recognition.
  • the resident device 102 may identify the identity of user 112 (e.g., via the previously stored association between the identity of user 112 and facial characteristic information (e.g., a faceprint) of user 112).
  • the resident device 102 (e.g., via the detection model 106) may identify one or more physical characteristics of the body of user 112 from one or more frames of the first plurality of frames that were used to identify the face of user 112.
  • the resident device 102 may determine that the face of a second person (e.g., user 114, who may (or may not) be the same person as user 112) may not be recognizable via facial recognition (e.g., because the person may be facing away from the observation camera 110).
  • the resident device 102 (e.g., via the detection model 106) may determine additional physical characteristic information (e.g., corresponding to one or more bodyprints) of the second person shown in the second plurality of frames and then compare the additional physical characteristic information with the previously stored physical characteristic information of the first person (e.g., from the first plurality of video frames received at time T1).
  • the detection model 106 of resident device 102 may compare the additional physical characteristic information with bodyprints (and/or clusters of bodyprints) of a plurality of people (e.g., corresponding to contacts of the home environment 101), to determine if there is a match. Depending on whether a match is determined via the comparison, the notification service 104 of resident device 102 may then provide a notification (e.g., via an audio message and/or via a message to another device) indicating whether an identity of the second person shown within the second plurality of frames corresponds to the identity of the first person shown in the first plurality of frames.
  • the resident device 102 determines that the identity of the second person (e.g., user 114) corresponds to (e.g., matches) the identity of the first person (e.g., that the second person is likely “Person A,” whereby user 114 is the same person as user 112).
  • a plurality of devices may be synchronized within the home environment.
  • resident device 102 may be a first device of a plurality of resident devices (e.g., including the first device and a second device) of the home environment 101 (e.g., positioned in different locations of the home environment 101).
  • Each device may execute a notification service similar to notification service 104.
  • the first device and the second device may synchronize physical characteristic information (e.g., bodyprints) of a particular person (e.g., a contact of the home environment 101).
  • the first device and the second device may respectively maintain (e.g., at time T0) synchronized clusters of bodyprints for user 112 (e.g., a reference cluster of bodyprints), stored on each device.
  • the first device may receive the first plurality of video frames, generate one or more bodyprints (and/or faceprints), and determine that at least one generated bodyprint is associated with a higher level of information gain than at least one bodyprint of the existing cluster of bodyprints for user 112.
  • the first device may update the existing cluster with one or more new bodyprints for user 112.
  • the first device may further transmit the new bodyprints associated with the higher information gain to the second device, which may in turn update its reference cluster of bodyprints for user 112.
  • devices of the home environment 101 may be synchronized to ensure that they have up-to-date bodyprints for people.
  • devices of the home environment 101 may not synchronize bodyprints with each other.
  • the devices may exchange information (e.g., synchronization data, including images, faceprints, bodyprints, etc.) according to any suitable cadence and/or set (or subset) of data.
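  • Purely as an illustration of this local synchronization flow, the sketch below merges new bodyprints into a device's cluster and forwards them to its peers within the home; the class name, message shape, and push-style propagation are all assumptions rather than the patent's design:

```python
from dataclasses import dataclass, field

@dataclass
class ResidentDevice:
    name: str
    clusters: dict = field(default_factory=dict)  # person_id -> [bodyprints]
    peers: list = field(default_factory=list)     # other devices in the home

    def update_cluster(self, person_id, new_bodyprints, propagate=True):
        """Merge higher-information-gain bodyprints, then sync peers locally."""
        self.clusters.setdefault(person_id, []).extend(new_bodyprints)
        if propagate:
            for peer in self.peers:   # synchronization stays within the home
                peer.update_cluster(person_id, new_bodyprints, propagate=False)
```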
  • certain bodyprints may be discarded from analysis, either as a reference bodyprint for comparing to the cluster (e.g., gallery) or as a candidate for being added to the cluster.
  • the determination of whether to discard a bodyprint may be made based on one or more rules (or heuristics).
  • Example rules include discarding images with bodyprints (also referred to as torsos) that are rotated more than a threshold degree. For example, the person’s body may be bent over (e.g., to pick something up) or the camera angle may be such that the body appears at an angle.
  • Other rules include discarding images with a “roll” (e.g., a degree of rotation from vertical) that is greater than a threshold and/or discarding images that are in landscape mode (e.g., any images that are wider than they are tall).
  • an image may have a defined region of interest (e.g., a particular portion of the image that is believed to include a faceprint or a bodyprint).
  • if a detected bodyprint is too close to the edge of a region of interest, the image may be discarded.
  • any torsos to which these rules apply may be discarded or otherwise ignored, as if they were not torsos at all.
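  • The rule set above might be expressed as a simple filter, sketched below; the threshold values, field names, and the ROI-margin formulation are assumptions, since the patent gives the rules but not the numbers:

```python
from dataclasses import dataclass

MAX_ROTATION_DEG = 30.0  # assumed rotation threshold (e.g., bent-over body)
MAX_ROLL_DEG = 20.0      # assumed roll-from-vertical threshold
MIN_EDGE_MARGIN = 0.05   # assumed minimum distance from the ROI edge (fraction)

@dataclass
class TorsoDetection:
    rotation_deg: float   # how far the body is rotated
    roll_deg: float       # degree of rotation from vertical
    width: int
    height: int
    edge_margin: float    # bounding-box distance from the region-of-interest edge

def should_discard(t: TorsoDetection) -> bool:
    """Apply the rule-based filters before any bodyprint is generated."""
    return (t.rotation_deg > MAX_ROTATION_DEG     # rotated more than threshold
            or t.roll_deg > MAX_ROLL_DEG          # too much roll from vertical
            or t.width > t.height                 # landscape: wider than tall
            or t.edge_margin < MIN_EDGE_MARGIN)   # too close to the ROI edge
```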
  • Other rules may also be considered, as appropriate, for bodyprints that may be discarded with respect to addition to the cluster but may still be used for analysis against the cluster of bodyprints.
  • some bodyprints may be of a high enough level of quality that they can still be used to compare against the cluster of bodyprints for potentially identifying a person, but are too low of a level of quality to be included in the cluster.
  • a confidence score may be generated for all torsos that passed the first set of discarding rules (see rules above about rotation, roll, and/or orientation), and this confidence score may be used to determine whether to add the torso to the cluster after the comparison is complete. These torsos that are not included in the gallery are, therefore, not associated with any recognized faces.
  • any suitable technique may be used to generate a confidence score for a given bodyprint (e.g., a torso). For example, image characteristics, including some combination of saturation, brightness, sharpness, and/or contrast, may be used to generate a confidence score (also referred to herein as a quality score). In some cases, images that contain bodyprints may be analyzed for saturation levels, sharpness levels, brightness levels, and/or contrast levels. Using some combination of these techniques, a machine learning model output may include the confidence score.
  • in some embodiments, if the confidence score is below a threshold, the image is deemed to be low quality, and if the confidence score is above the threshold, the image is determined to be high quality. High quality images may be added to the cluster, while low quality images may be excluded from the cluster.
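  • A hand-rolled version of such a score might look like the sketch below; the normalization constants, the equal weighting, and the 0.5 threshold are assumptions, and the patent also contemplates obtaining this score from a trained model instead:

```python
import numpy as np

QUALITY_THRESHOLD = 0.5  # assumed cutoff between low and high quality

def quality_score(image: np.ndarray) -> float:
    """Combine saturation, brightness, sharpness, and contrast into [0, 1].

    `image` is assumed to be an HxWx3 float array with values in [0, 1].
    """
    gray = image.mean(axis=2)
    brightness = gray.mean()
    contrast = gray.std()
    saturation = (image.max(axis=2) - image.min(axis=2)).mean()
    gy, gx = np.gradient(gray)
    sharpness = np.hypot(gx, gy).mean()  # crude edge-energy proxy
    # Assumed equal weighting after rough normalization of each term.
    return float(np.clip(0.25 * brightness
                         + 0.25 * min(contrast * 4.0, 1.0)
                         + 0.25 * min(saturation * 2.0, 1.0)
                         + 0.25 * min(sharpness * 10.0, 1.0), 0.0, 1.0))

def is_high_quality(image: np.ndarray) -> bool:
    return quality_score(image) >= QUALITY_THRESHOLD
```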
  • an image may be received (for example, by the resident device 102), and a confidence score may be generated by detection model 106.
  • the confidence score may be an output of the model that is separate from the multidimensional vector, or it may be part of the vector.
  • the detection model 106 (or a different algorithm used prior to analysis by the detection model 106) may be configured to evaluate any combination of saturation, brightness, sharpness, and/or contrast to generate a confidence score.
  • the evaluation of image characteristics may be a first type of confidence score, and a second type of confidence score may be received as an output from a deep learning model that is trained with bodyprints.
  • in order for the model to be trained to output a confidence score, some training data may need to be labeled.
  • the training data may consist of both good quality and bad quality bodyprint images, and these images may need to be labeled before the training.
  • These training images may be labeled based on the two separate confidence scores determined by analysis described above (e.g., saturation, brightness, sharpness, and/or contrast). Additionally, the training data may be labeled based on a false bodyprint detection generated by a bodyprint detector.
  • the bodyprint detector detects a false bodyprint (e.g., an image that does not actually have a bodyprint (e.g., torso) in it), that image can be labeled as false.
  • Some images may be manually labeled as false.
  • an algorithm may be used to automatically generate false bodyprints (e.g., by automatically cropping a bodyprint image to remove the bodyprint from the image). For example, a bounding box may be generated in an image, and it may be moved up/down/left/right, as desired to crop out the bodyprint, thus creating a false bodyprint.
  • These auto-generated false bodyprint images may also be automatically labeled as false.
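  • The automatic negative-sample generation might be sketched as below; the (x, y, width, height) box format, the sideways shift, and the labeling convention are assumptions. The idea is to slide the bounding box off the true torso so the resulting crop no longer contains it, then label the crop false:

```python
import numpy as np

def make_false_bodyprint(image: np.ndarray, box, shift: float = 1.2):
    """Crop a region offset from the true torso box and label it False.

    `box` is an assumed (x, y, width, height) tuple for the true torso.
    """
    h, w = image.shape[:2]
    x, y, bw, bh = box
    new_x = min(max(0, x + int(bw * shift)), max(0, w - bw))  # slide sideways
    crop = image[y:y + bh, new_x:new_x + bw]
    return crop, False  # auto-labeled as a false (torso-free) bodyprint
```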
  • the deep learning model may be trained with a) bodyprint images labeled based on the image characteristic described above, and/or b) bodyprint images labeled based on the false bodyprint detection score (e.g., either true or false).
  • the model can output a vector for the input and/or a confidence score that identifies whether the image includes a bodyprint and the quality of the bodyprint.
  • an intermediary entity may perform one or more computing operations of the notification service 104. For example, consider a case in which the intermediary entity may correspond to a server device of a cloud computing platform. In this case, the resident device 102 may offload one or more operations to the remote platform.
  • FIGs.2 and 3 are simplified block diagrams illustrating at least some example techniques for providing a notification that a particular person may be detected at a location, according to some embodiments. The operations of process 200 span both FIGs.
  • the operations of FIG.3 may correspond to a second phase of the process that follows the first phase, in which the system receives and analyzes a second plurality of video frames (e.g., received during a subsequent time period), in which the system determines that a face of a second person shown in the second plurality of frames may not be recognized by the system (e.g., via facial recognition).
  • the system accordingly determines second physical characteristic information of the second person and then compares the second physical characteristic information with the previously stored first physical characteristic information to determine the identity of the second person.
  • the diagram 201 of FIG.2 depicts example states that correspond to the first phase of process 200
  • the diagram 301 of FIG.3 depicts example states that correspond to the second phase of process 200.
  • the diagram 201 may include elements that are similar to those depicted in reference to FIG.1.
  • a first person (e.g., user 205) may be similar to user 112 of FIG. 1, and an observation camera 203 may be similar to observation camera 108.
  • a data store 221 may correspond to a local repository of the system (e.g., of resident device 102) within the home environment.
  • a second person 225 may be similar to user 114 of FIG. 1
  • observation camera 223 may be similar to observation camera 110
  • system 235 may be similar to the resident device 102 of FIG.1.
  • the system may receive a first plurality of video frames during a first time period, whereby at least one of the frames shows a face of a first person.
  • the observation camera 203 may be positioned to observe activity near the entrance to the front door of the home.
  • the observation camera 203 may observe the face and body of user 205, for example, as they approach the front door of the home during the first time period (e.g., during the morning).
  • observation camera 203 may be positioned at any suitable location such that the camera 203 may capture the face and non-facial body portion of user 205.
  • a video feed may be transmitted by the observation camera 203 to the system (e.g., system 235, which may be similar to resident device 102 of FIG. 1), whereby the video feed includes the first plurality of video frames 207.
  • the system may perform one or more operations to analyze frames of the first plurality of video frames 207. For example, with respect to a representative particular frame 214 of the first plurality of frames, the system may perform object detection of one or more objects within the frame.
  • the system may generate a bounding box 211 that includes the body of user 205 (e.g., including the torso, arms, hands, legs, etc.).
  • any suitable region(s) may be captured within a bounding box (e.g., only the torso, only the arms, both the torso and head, the entire person, etc.).
  • the system may generate a bounding box 209 that includes the face (e.g., the front of the head) of user 205.
  • the system may determine which frames of the video feed to select for analysis as the first plurality of video frames 207 (e.g., every frame, or a sampling of frames).
  • a plurality of persons and/or other objects may be detected within a particular frame.
  • the system may associate (e.g., track) bounding boxes of the same object across a plurality of frames. For example, suppose that the first plurality of video frames 207 shows user 205 walking as they approach the front door. Accordingly, each frame may show the body of user 205 in a different position, according to the gait of user 205.
  • the system may determine a bounding box for the body (e.g., including the torso) of user 205 for each frame analyzed.
  • the system may then analyze characteristics of the torso as it appears in each image, to determine if it is the same torso (or a different torso, of another person). For example, torsos that are similar in size and position (e.g., from frame to frame) may be associated with the same person (e.g., user 205), while torsos that are in different positions (e.g., larger than a predefined threshold difference in distance) and/or have different sizes between frames may be determined to be different persons. In this way, the system may track and associate bounding boxes of the same person across the first plurality of video frames 207.
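  • That frame-to-frame association might be approximated with the heuristic below, in which two torso boxes in consecutive frames are treated as the same person when their centers and areas are close; the pixel and size tolerances are assumptions:

```python
import math
from dataclasses import dataclass

MAX_CENTER_SHIFT = 60.0  # assumed max center movement (pixels) between frames
MAX_SIZE_RATIO = 1.3     # assumed max relative change in box area

@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float

def same_torso(prev: Box, curr: Box) -> bool:
    """Heuristic: similar position and size across consecutive frames."""
    shift = math.hypot((prev.x + prev.w / 2) - (curr.x + curr.w / 2),
                       (prev.y + prev.h / 2) - (curr.y + curr.h / 2))
    a_prev, a_curr = prev.w * prev.h, curr.w * curr.h
    size_ratio = max(a_prev, a_curr) / max(1.0, min(a_prev, a_curr))
    return shift <= MAX_CENTER_SHIFT and size_ratio <= MAX_SIZE_RATIO
```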
  • the system may similarly track a plurality of people within the first plurality of video frames 207.
  • bounding box 211 may be included within a set of bounding boxes for the same person (e.g., user 205), which may be further associated with the bounding box 209 (and/or a set of facial bounding boxes) of the face of the same person.
  • the system may subsequently associate one or more bodyprints (e.g., respectively determined from portions of the frames corresponding to the bounding boxes of the body of the person) with the face (and/or the identity) of the same person.
  • the system identifies an identity of the first person based on recognizing the face shown in one or more frames of the first plurality of video frames. For example, continuing within the illustration of diagram 201, upon generating the bounding box 209 for the face of user 205, the system may perform a facial recognition process to recognize the face of user 205. For example, as described herein, prior to the first time period, the system may maintain and store information associated with the face of user 205.
  • a cluster of faceprints may be selected and/or stored in association with the identity of user 205 (e.g., a contact person of the home environment), whereby a faceprint may be selected for inclusion within the cluster at least in part because it provides a higher level of information gain/quality (e.g., compared to other candidate faceprints) that are suitable for performing facial recognition.
  • a machine learning model (e.g., detection model 106 of FIG.1) of the system may then compare one or more faceprints (that were generated from the face croppings 213 of the first plurality of video frames 207) with the previously stored cluster of reference faceprints of user 205 (e.g., retrieved from data store 221 of the system).
  • the system may compare the one or more faceprints with a plurality of clusters of reference faceprints (e.g., stored in data store 221), respectively associated with different contacts of the home environment.
  • the detection model may detect a match between at least one faceprint (e.g., from at least one frame, such as frame 214) and one of the faceprints of the clusters of reference faceprints stored on the system.
  • the system may thereby determine the identity 215 of the user 205 as being “Person A” (e.g., a known contact of the home environment).
  • the system may store first physical characteristic information of the first person in association with the identity of the first person.
  • the system may generate one or more body croppings 219 from the first plurality of video frames 207 (e.g., from one or more bounding boxes).
  • a body image cropping may include any suitable portion(s) of the body.
  • a body image cropping may include only the torso, include both the torso and the head, include the whole body, etc.
  • Diagram 201 depicts the whole body as being included within the set of body croppings 219.
  • a trained machine learning model (e.g., a sub-model of detection model 106) may determine first physical characteristic information 217 of the first person based on the set of body croppings.
  • the machine learning model may be trained to determine, for a given body cropping, physical characteristic information corresponding to one or more physical characteristics associated with the body. These characteristics may be associated with any suitable aspect of the body (e.g., body shape, color, texture, characteristics of clothing worn by the body, and/or movement of the body).
  • this physical characteristic information may include non-facial characteristic information that may be used to identify a person, apart from performing facial recognition.
  • these characteristics may be associated with a particular part of the body (e.g., the torso, the back of the head, or arms, etc.), and/or any suitable combination of parts thereof. In some embodiments, these characteristics may be used to uniquely identify a person (e.g., user 205, with identity 215).
  • the first physical characteristic information 217 may be collectively represented by any suitable data structure (e.g., a bodyprint). As described herein, and, similar to a faceprint, a bodyprint may include a multidimensional vector (e.g., 128 dimensions, or any suitable number of dimensions). In some embodiments, a dimension of the bodyprint may be associated with at least one characteristic of the body.
  • an association between a dimension and at least one characteristic may be dynamically determined by the machine learning model upon the generation of the vector.
  • the system may utilize a vector structure (e.g., for a bodyprint and/or faceprint) in part to efficiently store data representing aspects of a face or body of a person.
  • a vector structure may also enable more efficient comparison of vectors (e.g., determining Euclidean distances and/or cosine similarities of vectors), for example, to enable more efficient identity detection.
  • the system may perform any suitable transformations to a cropping (and/or bounding box of an image) to prepare the cropping for input into a machine learning model so that a vector (e.g., faceprint and/or bodyprint) may be generated.
  • a cropping of a torso may be resized (e.g., from a portrait image to a square image).
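  • For example, such a transformation might be as simple as the sketch below; the 224x224 square input size is an assumption about the model:

```python
from PIL import Image

MODEL_INPUT_SIZE = (224, 224)  # assumed square input expected by the model

def prepare_cropping(cropping: Image.Image) -> Image.Image:
    """Resize a body cropping (e.g., a portrait torso crop) to a square input."""
    return cropping.resize(MODEL_INPUT_SIZE, Image.BILINEAR)
```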
  • the system may associate the first physical characteristic information 217 (e.g., the bodyprint of user 205) with the identity 215 of the first person (“Person A”), based in part on the previously performed facial recognition.
  • This association may also be performed based on the system associating the body croppings 219 with the same face, as described herein (e.g., with respect to block 202).
  • any suitable association may be stored by the system with respect to images, facial characteristics (e.g., faceprints), body characteristics (e.g., bodyprints), and the identity of a person.
  • the system may store a cluster of bodyprints in association with the identity 215.
  • a bodyprint may be generated per body (image) cropping, whereby a select number of bodyprints may be included within the cluster of bodyprints (e.g., a reference cluster) based on being determined to have a higher information gain (e.g., unique body features for performing identity detection), compared with other bodyprints that were not included.
  • the one or more bodyprints and/or associations may be stored in the data store 221 for subsequent utilization, as described in reference to FIG. 3. Accordingly, block 206 may conclude the first phase of process 200 (e.g., indicated by the triangle marker “A”).
  • FIG.3 illustrates a second phase of the process 200, whereby diagram 301 of FIG. 3 depicts example states that correspond to the second phase.
  • the system receives a second plurality of video frames 227 that show a second person during a second time period, whereby a face of the second person is determined to not be recognized by the system.
  • the second time period may follow the first time period of phase one (e.g., a time period of the afternoon on the same day or different day).
  • the system may first attempt to perform facial recognition of the second person 225 based on the second plurality of video frames 227, similar to as described in reference to FIG.2 (e.g., blocks 202 and/or 204). For example, the system may determine one or more bounding boxes for particular objects in each frame. In one example, for a representative frame of the second plurality of video frames 227, a bounding box 229 may correspond to the head of the second person 225 and bounding box 231 may correspond to the body (e.g., excluding the head) portion of the second person 225.
  • the system may determine face croppings from the bounding boxes of the head of the second person 225 (e.g., potentially including a portion of the face, or not showing the face), determine faceprints from the face croppings, and then compare the faceprints against one or more reference clusters of faceprints (e.g., contacts of the home environment).
  • the machine learning model (e.g., detection model 106) of the system may determine that the face may not be recognized (e.g., since the face may not be captured adequately to enable facial recognition with an acceptable level of confidence).
  • in some embodiments, the operations of block 210, which may involve performing identity recognition based on non-facial body characteristics, may not always be performed. For example, these operations may be performed as a fallback mechanism in case facial recognition does not successfully determine the identity of the second person 225 with high confidence.
  • the system may perform operations of block 210 and/or 212 independent of whether facial recognition may successfully identify the identity of the second person.
  • the system may perform facial recognition in parallel to performing identity recognition based on non-facial body characteristics. [0050] In some examples, however, using operations of block 210 as a fallback is completely optional.
  • the system may decide whether to fall back to using bodyprints based on the quality of the face. In some cases, if the face is recognizable, the fallback to using bodyprints may not be utilized. However, if the face is not recognizable, the system may or may not fall back to using the bodyprints. In some examples, it may be advantageous to fall back to using bodyprint detection to attempt to identify a person regardless of whether the face was recognizable (e.g., in cases where the bodyprint comparison techniques yield high accuracy identifications).
  • the system may determine second physical characteristic information associated with the second person. For example, the system may determine body croppings based on the bounding boxes of detected objects in the second plurality of video frames 227. As described herein, a body cropping may include any suitable one or more portions of the body and/or associated articles of clothing. For example, a body cropping may include the head and torso of the person shown in an image. In another example a body cropping may include the whole body of the person shown in the image.
  • a body cropping may include only a torso portion of the body (and/or associated clothing).
  • a body cropping may correspond to a new image that is cropped from an original image.
  • a body cropping may correspond to a portion of the original image, whereby the parameters of the portion are determined and/or stored by the system.
  • the system may provide the one or more body croppings from respective frames of the second plurality of video frames 227 to a machine learning model (e.g., detection model 106), whereby the model may determine second physical characteristic information from a body cropping.
  • the system may determine a bodyprint for a particular body cropping, as described with respect to FIG. 2.
  • the system may determine a plurality of bodyprints from a sampling of frames of the second plurality of video frames 227.
  • the system may compare first physical characteristic information of the first person with second physical characteristic information of the second person, whereby the first physical characteristic information may be previously stored in association with the identity of the first person. For example, recall that at block 206 of FIG.2, the system stored the first physical characteristic information 217 (e.g., a bodyprint) in association with the identity of the first person.
  • the system may have stored a cluster of bodyprints in association with the identity 215 of the first person (e.g., “Person A”).
  • a bodyprint corresponding to the first physical characteristic information was generated based on a body image cropping whereby the first person was facing towards the camera.
  • a bodyprint corresponding to the second physical characteristic information was generated based on a body image cropping whereby the second person (who may or may not be the same identity as the first person) was facing away from the camera. While these aspects of croppings may differ, note also that the first person and the second person are shown as wearing the same clothing (e.g., a plaid shirt). One or more dimensions of the respective bodyprints may also capture this information, which may be used to compare bodyprints to determine if there is a match.
  • the bodyprints may be associated with a color of the shirt, a shape of the body and/or shirt, a shape of the head, a texture of the shirt, etc.
  • one or more of these characteristics may remain substantially similar (e.g., same) over a period of time, independent of whether the user may be facing towards the camera or away from the camera.
  • the machine learning model of the system may be trained to generate similar bodyprints of the same person with similar body characteristics (e.g., the same color and/or pattern of shirt, the same body shape, etc.), independent of whether the person may be facing towards or away from a camera in respective photos.
  • the machine learning model may also be trained to generate similar bodyprints of the same person’s body, despite some feature differences between the person’s body captured in different frames.
  • the model may generate similar bodyprints of the body, despite the body being shown with a slightly different size between images and/or a slightly different lighting between images.
  • the cluster of reference bodyprints (e.g., including a bodyprint for the first physical characteristic information 217) may be respectively associated with different vantage points of the body (e.g., different viewing angles, etc.), and similarly for the cluster of bodyprints generated from the second plurality of video frames (e.g., including the second physical characteristic information 233).
  • the system may be enabled to perform a wider and/or more accurate range of bodyprint comparisons and be resilient for analyzing body characteristics from different vantage points as part of that comparison.
  • the system may generate a score based on the comparison, whereby the score may correspond to a level of similarity (e.g., a Euclidean distance) between different bodyprints.
  • the system may determine a confidence score for the second physical characteristic information (e.g., bodyprint/torso), and use that score to determine whether to include the new second physical characteristic information in the cluster of bodyprints.
  • the system may provide a notification that an identity of the second person matches the identity of the first person. For example, suppose that at block 210, the system determined that the first physical characteristic information 217 matched the second physical characteristic information 233. In this example, the system may determine a score that matches a predefined threshold metric (e.g., greater than or equal to the predefined threshold, for example, 90%).
  • the system may then determine that the identity of the second person matches the identity of the first person based on the comparison. If the score does not match the threshold metric, then the system may determine that the identity of the second person does not match the identity of the first person. In the example of process 200, the system determines that there is a match, and thus, the second person 225 is likely “Person A” (e.g., having identity 215). Accordingly, in this example, where the system 235 may correspond to a smart speaker device, the system provides a notification in the form of an audio announcement, “I’ve detected someone who may be Person A.” In some embodiments, as described further in reference to FIG.4, the system may provide the notification for presentation according to any suitable method (e.g., audio, video, text, etc.).
  • FIG.4 is another simplified block diagram illustrating at least some example techniques for providing a notification based on determining the presence of a particular person at a location, according to some embodiments.
  • In diagram 400 of FIG. 4, several elements are depicted, including observation camera 402, a video feed 404, body croppings 408, a user device 406, a pop-up notification 410, and a video presentation 412.
  • Diagram 400 depicts a scenario in which observation camera 402 captures and transmits a video feed 404 to a device (e.g., resident device 102 of FIG.1), whereby the resident device analyzes the video feed 404 and transmits a notification to the user device 406 to alert a user of the presence of a particular person (e.g., “Person A”) at a particular location being observed by the observation camera 402.
  • an observation camera may be positioned at any suitable location.
  • the observation camera 402, which may be similar to observation camera 110 of FIG.1, may be positioned in an interior (e.g., living room corridor) setting. Camera 402 may be positioned at approximately waist level.
  • the camera may be positioned closer to the ground, mounted to have a bird’s eye view, or any suitable elevation such that the camera may capture at least a portion of a person’s body and/or face.
  • the position of a camera may depend on one or more factors, including, for example, a number of cameras, the location of each camera, expected traffic patterns around the location of the camera, etc. For example, suppose that observation camera 402 is a sole camera that is connected to the resident device of the home environment that performs identity detection.
  • observation camera 402 may be positioned to capture both a face of a person approaching the camera 402 during a first time period (e.g., entering the living room corridor), which may correspond to the first phase of process 200, as well as the body of the person facing/moving away from the camera 402 during a second time period (e.g., leaving the living room corridor), which may correspond to the second phase of process 200.
  • the video feed 404 shows an example in which a person may be walking away from the camera, whereby the face of the person may not be captured by the video feed 404 such that it may be recognized by the resident device.
  • there may be two or more connected cameras.
  • a first camera may be positioned to primarily capture the face of a person (e.g., entering a home), while a second camera may be positioned such that it primarily captures the backside of the person (e.g., showing primarily non-facial features of the person).
  • techniques may enable the system to identify the identity of the person captured by the video feed of the second camera (and/or the first camera, for example, when the person is leaving the home), even though the face may not be recognized by the system.
  • body croppings 408 may be generated based on a plurality of video frames of the video feed 404.
  • FIG.5 is another simplified block diagram illustrating an example architecture of a system used to provide notifications based on determining the presence of a particular person at a location (e.g., within a home environment context), according to some embodiments.
  • the diagram 500 includes a user device 502 (e.g., which may have an integrated camera component), an observation camera 504, a resident device 506, a network 508, and a remote server 522.
  • the user device 502, the observation camera 504, and the resident device 506, respectively, may be similar to any of the user devices, observation cameras, and/or resident devices described herein.
  • the remote server 522 may correspond to one or more server computers (e.g., a server cluster) of a cloud computing platform, as described herein.
  • the network 508 may include any one or a combination of many different types of networks, such as cable networks, the Internet, wireless networks, cellular networks, and other private and/or public networks.
  • the user device 502 may be any suitable computing device (e.g., a mobile phone, tablet, personal computer (PC), smart glasses, a smart watch, etc.).
  • the user device 502 will have a camera embedded as a component of the device (e.g., a mobile phone camera).
  • the user device 502 will be connected to another device (e.g., a standalone digital camera), from which it receives images (e.g., over the network 508).
  • the notification management module 512 may transmit images (e.g., image croppings generated from the photo library) to the resident device 506 for processing by the resident device 506.
  • the resident device 506 may generate faceprints (and/or clusters of reference faceprints) for contacts that correspond to images stored in the photo library on the user device 502. These images may be transmitted on any suitable cadence and/or selection algorithm.
  • the user device 502 may first encrypt images that are transmitted to the resident device 506.
  • the user device 502 and the resident device 506 may share an encryption key (e.g., a symmetric key), whereby the resident device 506 receives an encrypted image and then decrypts the image using the encryption key.
  • the encryption key may not be shared (or may be shared) with the remote server 522.
  • the images may be first transmitted to the remote server 522 (e.g., for temporary storage), and then later transmitted by the remote server 522 to the resident device 506. In some embodiments, the images may be transmitted directly to the resident device 506, without involving the remote server 522. It should be understood that one or more functions of the notification management module 512 may be performed by the resident device 506 (e.g., configuring the resident device).
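  • As a minimal sketch of such symmetric encryption of transmitted images (assuming a Python implementation and the `cryptography` package's Fernet construction, which is AES-based; this is one possible choice, not the disclosed implementation):

```python
from cryptography.fernet import Fernet

# Symmetric key shared between the user device and the resident device
# (how the key is distributed is outside the scope of this sketch).
shared_key = Fernet.generate_key()

# User device side: encrypt an image cropping before transmission.
image_bytes = b"<jpeg bytes of an image cropping>"  # placeholder payload
token = Fernet(shared_key).encrypt(image_bytes)

# Resident device side: decrypt the received token with the same key.
assert Fernet(shared_key).decrypt(token) == image_bytes
```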
  • these elements may be implemented similarly (or differently) than as described in reference to similar elements of user device 502.
  • the storage unit 550 may store images (e.g., image croppings) received by user device 502 and/or remote server 522.
  • the resident device 506 may be housed in any suitable unit (e.g., a smart TV, a smart speaker, etc.). As described herein, it should be understood that one or more of the elements described in diagram 500 (e.g., user device 502, camera 504, and/or remote server 522) may be enabled to perform one or more of the operations of resident device 506.
  • the memory 530 may include an operating system 532 and one or more application programs or services for implementing the features disclosed herein, including a communications module 534, an encryption module 536, a profile management module 538, a synchronization module 540, a model training module 542, a scoring module 544, and notification management module 546.
  • one or more application programs or services of memory 530 may be included as part of the notification service 104 of FIG.1.
  • the communications module 534 may comprise code that causes the processor 548 to generate messages, forward messages, reformat messages, and/or otherwise communicate with other entities.
  • the communications module 534 may receive (and/or transmit) images from the user device 502 and/or remote server 522.
  • the communications module 534 may also be responsible for providing notifications.
  • the communications module 534 may transmit a notification message to the user device 502 upon detecting the presence and/or identity of a person based on a plurality of video frames received from observation camera 504.
  • the communications module 534 may provide a notification using any suitable channel and/or to any suitable device.
  • the communications module 534 may provide an audible notification via a speaker I/O device 554 at a location within a home environment.
  • the communications module 534 may provide an audiovisual notification to a smart TV within a home environment.
  • a PIP display of the smart TV may display a video feed from camera 504 (e.g., showing a person walking through a corridor of a home, exiting a home, etc.).
  • the smart TV may also announce who is at the door and/or allow two-way communication via a speaker and/or microphone I/O devices of the resident device 506.
  • the encryption module 536 may comprise code that causes the processor 548 to encrypt and/or decrypt messages.
  • the encryption module 536 may receive encrypted data (e.g., an encrypted image cropping) from the remote server 522.
  • the encryption module 536 may include any suitable encryption algorithms to encrypt data in embodiments of the invention.
  • Suitable data encryption algorithms may include Data Encryption Standard (DES), triple DES, Advanced Encryption Standard (AES), etc. The encryption module 536 may also store (e.g., in storage unit 550) encryption keys (e.g., encryption and/or decryption keys) that can be used with such encryption algorithms.
  • the encryption module 536 may utilize symmetric or asymmetric encryption techniques to encrypt and/or verify data.
  • the user device 502 may contain similar code and/or keys as encryption module 536 that is suitable for encrypting/decrypting data communications with the resident device (and/or remote server 522).
  • the profile management module 538 may comprise code that causes the processor 548 to maintain and store profiles of contacts.
  • the profile management module 538 may receive images (e.g., image croppings) from one or more user devices and/or cameras, each image cropping showing a portion of a face of a contact associated with the respective user device.
  • the profile management module 538 may determine (e.g., via a trained model) facial characteristic information for a particular contact, which may include a cluster of one or more reference faceprints.
  • the profile management module 538 may also determine non-facial physical characteristic information for the particular contact, which may include a cluster of one or more reference bodyprints. In some embodiments, this module may associate an identity of the particular contact with facial characteristic information and/or non-facial characteristic information.
  • elements of the profile may be respectively updated according to any suitable cadence and/or heuristic.
  • facial characteristic information for a particular contact may be updated as new images are received that provide more information gain than the existing reference set of face images.
  • non-facial physical characteristic information associated with the body may be updated according to a predefined schedule.
  • the system may update a cluster of bodyprints daily, based in part on a heuristic that indicates that a person may be likely to change their clothing on a daily basis.
  • the system may be enabled to determine when a person’s clothing has changed, and then update the cluster of bodyprints (e.g., reference bodyprints) to be up-to-date according to the latest clothing.
  • the profile management module 538 may store any suitable information that may be utilized for comparing with recently received sensor data (e.g., frames from a video feed of camera 504) to determine an identity of a person.
  • the system may store gait characteristic information that corresponds to a unique profile of a person’s walking pattern. This gait characteristic information may be compared against reference gait information stored by the profile management module 538 to detect whether the detected gait of the person matches the reference gait profile.
  • any suitable number of bodyprints (e.g., 10, 20, etc.) may be stored within a cluster of bodyprints.
  • any suitable algorithm may be used to determine the amount of information gain provided by a particular bodyprint (e.g., a K-means clustering algorithm).
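  • One hedged illustration of such an information-gain test, written in Python with a simple distance heuristic standing in for a full K-means analysis (the threshold and maximum cluster size are hypothetical parameters):

```python
import numpy as np

def adds_information(candidate: np.ndarray,
                     cluster: list[np.ndarray],
                     min_distance: float = 0.5,
                     max_size: int = 20) -> bool:
    """Keep a candidate bodyprint only if it lies sufficiently far from
    every reference bodyprint already in the cluster (i.e., it adds
    information) and the cluster is not yet full."""
    if len(cluster) >= max_size:
        return False
    if not cluster:
        return True
    nearest = min(np.linalg.norm(candidate - ref) for ref in cluster)
    return nearest >= min_distance
```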
  • the synchronization module 540 may comprise code that causes the processor 548 to transmit and/or receive information associated with synchronizing profiles across a plurality of devices (e.g., associated with the same home environment). For example, as described herein, resident device 506 may determine one or more images (and/or associated faceprints or bodyprints) that provide a higher level of information gain when compared to existing reference images (and/or reference faceprints or bodyprints) for a particular person.
  • the synchronization module 540 may then cause these one or more images to be transmitted as synchronization data to other resident devices of the home environment, whereby the receiving devices update reference data (e.g., images, faceprints, bodyprints, etc.) based on the received synchronization data.
  • devices of the home environment may be synchronized with one another.
  • any suitable device of the home environment may be synchronized according to the synchronization data (e.g., user device 502, remote server 522, etc.).
  • one or more devices may be configured to not be synchronized with other devices of the home environment.
  • the model training module 542 may comprise code that causes the processor 548 to train a machine learning model.
  • the machine learning (ML) model may be trained to perform one or more sub-tasks, including, for example, generating physical characteristic information (e.g., captured via a bodyprint vector or a motionprint vector) or facial characteristic information (e.g., captured via a faceprint vector), and/or performing vector comparisons (e.g., determining a cosine similarity) to identify whether a face (and/or body, gait, etc.) match is detected.
  • the model training module 542 may utilize any suitable machine learning technique. Some non-limiting examples may include utilizing a neural network, support vector machines, nearest neighbor approach, or decision trees.
  • the training process may begin whereby an untrained ML model receives a plurality of images (e.g., image croppings) of a particular person.
  • This plurality of images may, respectively, include different portions of a body (e.g., including the torso) of the person.
  • one portion may be a side view, another portion may be a straight-on view, another portion may be an opposite side view, etc.
  • Some portions may have different conditions and/or backgrounds, and/or may be captured by different devices (e.g., a camera of user device 502, observation camera 504, etc.).
  • each image of the plurality of images may be labeled as portraying the body of the same person. These labels may correspond to “ground truth” data.
  • the neural network may be trained to receive as input one image of the plurality of images and output a first bodyprint of the body (e.g., torso portion) shown in the image.
  • the bodyprint may correspond to a multidimensional vector, whereby each dimension of the vector corresponds to a characteristic of the body of the person in the image (e.g., a distance between two known points on the body, a particular color or texture represented by pixels of the image, etc.).
  • the training algorithm may adjust dimensions of one or more of the bodyprints. For example, as described above, the training algorithm may utilize a backpropagation algorithm to minimize a cost function associated with the distance between bodyprints (e.g., the distance between bodyprint vectors). In some embodiments, this backpropagation algorithm may be used to tune (e.g., update) weights of nodes of the neural network.
  • the neural network may be trained to generate similar bodyprints from images of the same body, whereby the images may have varying levels of quality (e.g., being received from different cameras and/or under different lighting conditions) and/or show different portions of the body.
  • the bodyprints may later be used by the trained model for efficient comparison during body recognition (e.g., at block 210 of FIG.2).
  • the ML model may be trained to generate bodyprints (and/or faceprints, etc.) based on training samples associated with any suitable number of persons (e.g., several hundred, thousand, etc.), and produce bodyprints (and/or faceprints) for each person that are suitable for subsequent comparison between prints.
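  • A minimal training sketch consistent with the above, using PyTorch and a triplet margin loss so that bodyprints of the same labeled person are pulled together and those of different persons are pushed apart. The encoder architecture, input size, and loss are assumptions for the example; the disclosure specifies only that a neural network is trained (e.g., via backpropagation) to produce similar bodyprints for the same body:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BodyprintEncoder(nn.Module):
    """Toy encoder mapping a body cropping to a unit-length bodyprint vector."""
    def __init__(self, in_dim: int = 3 * 64 * 64, out_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(),
                                 nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

encoder = BodyprintEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
triplet_loss = nn.TripletMarginLoss(margin=0.2)

# anchor/positive: croppings labeled as the same person; negative: another person.
anchor, positive, negative = (torch.randn(8, 3, 64, 64) for _ in range(3))
loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
optimizer.zero_grad()
loss.backward()   # backpropagation tunes the node weights
optimizer.step()
```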
  • the operations of the model training module 542 may also be performed by the remote server 522.
  • the scoring module 544 may comprise code that causes the processor 548 to determine a score that corresponds to a level of similarity between first physical characteristic information (e.g., a bodyprint associated with a body of a first person) and a second physical characteristic information (e.g., a bodyprint associated with a body of a second person).
  • the scoring module 544 may utilize a trained ML model (e.g., via the model training module 542) to generate and compare bodyprints to determine whether the identity of the first person matches the identity of the second person.
  • the scoring module 544 may first generate a score by comparing faceprints between the first person and the second person. In the event that the score matches a threshold metric, the scoring module 544 may thereby determine a successful match based on the faceprints. In the event that the score does not match (e.g., is less than) the threshold metric, the scoring module 544 may generate and compare bodyprints of the first person and the second person (e.g., as a fallback mechanism). In another example, both faceprint and bodyprint recognition may be performed in parallel. In at least this way, techniques described herein enable identity recognition to be performed across a wider range of use cases and with a higher level of recall and/or precision.
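  • The cascade described above might be sketched as follows (Python; the cosine-similarity comparison and the 0.9 thresholds are illustrative assumptions, and the faceprint/bodyprint vectors are assumed to come from the trained models):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize(faceprint, bodyprint, ref_faceprints, ref_bodyprints,
              face_threshold: float = 0.9, body_threshold: float = 0.9) -> str:
    """Face-first recognition with a bodyprint fallback."""
    if faceprint is not None and ref_faceprints:
        if max(cosine(faceprint, r) for r in ref_faceprints) >= face_threshold:
            return "match-by-face"
    if bodyprint is not None and ref_bodyprints:   # fallback mechanism
        if max(cosine(bodyprint, r) for r in ref_bodyprints) >= body_threshold:
            return "match-by-body"
    return "no-match"
```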
  • the notification management module 546 may comprise code that causes the processor 548 to store and manage settings for providing notifications, as described herein.
  • the notification management module 546 may also be responsible for generating notifications that are provided by the communications module 534. It should be understood that a notification may be presented in any suitable form (e.g., text, audio, video, and/or suitable combinations).
  • the notification management module 546 may be configured to perform no operation (e.g., a “no-op”) in a particular setting.
  • the resident device 506 may be configured to only provide AV-based notifications to user device 502 if a detected person is not a known contact.
  • FIGs.6 and 7 are simplified flow diagrams illustrating a process for providing a notification based on determining the presence of a particular person at a location, according to some embodiments.
  • process 600 of FIG.6 may correspond to a first phase of the process (e.g., process 200 of FIG.2), while process 700 of FIG.7 may correspond to a second phase of the process (e.g., process 200, as described in reference to FIG.3). While the operations of process 600 and/or 700 are described as being performed by a resident device of a home environment, it should be understood that any suitable device (e.g., a user device, a server device) may be used to perform one or more operations of these processes. Process 600 and process 700 (described below) are respectively illustrated as logical flow diagrams, each operation of which represents a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof.
  • the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations.
  • computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular data types.
  • the order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the processes.
  • any, or all of the processes may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof.
  • the code may be stored on a computer-readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors.
  • the computer-readable storage medium is non-transitory.
  • a first device may receive (e.g., from a first camera) a first plurality of video frames during a first time period, whereby at least one of the frames of the first plurality of video frames includes a face of a first person.
  • one or more operations of block 602 may be similar to one or more operations of block 202 of FIG. 2.
  • the first device may be one of a plurality of devices (e.g., of the home environment), which may be communicatively connected to one another. In some embodiments, this may facilitate synchronization operations (e.g., locally within the home environment), as described herein.
  • the first device may identify an identity of the first person based on recognizing the face in at least one frame of the first plurality of frames.
  • one or more operations of block 602 may be similar to one or more operations of block 204 of FIG.2.
  • the first device may identify one or more physical characteristics of the first person from at least one frame of the first plurality of video frames that included the face.
  • the one or more characteristics are non-facial characteristics, for example, associated with a torso and/or the back of the head of the first person.
  • the one or more characteristics are associated with at least one of a gait, one or more arms, one or more legs, or a body shape.
  • the one or more physical characteristics of the first person are associated with at least one of a texture or color of at least one of a body of the first person or an article of clothing worn by the first person.
  • one or more operations of block 606 may be similar to one or more operations of phase one of FIG.2.
  • the first device may store physical characteristic information corresponding to the identified one or more physical characteristics, whereby the physical characteristic information may be stored in association with the identity of the first person based at least in part on the recognized face shown in at least one frame of the first plurality of video frames.
  • one or more operations of block 608 may be similar to one or more operations of block 206.
  • the physical characteristic information corresponds to a bodyprint of the first person.
  • the bodyprint may include a multidimensional vector, whereby a dimension of the vector may be associated with one or more physical characteristics of the first person.
  • the bodyprint may be one of a cluster of bodyprints (e.g., reference bodyprints) stored by the first device. The bodyprint may be selected for inclusion among the cluster of bodyprints based on an information gain associated with the bodyprint.
  • the first device may synchronize one or more bodyprints with one or more other devices of the home environment, for example, to ensure that the devices include a similar reference set of bodyprints (and/or faceprints).
  • the first device may update a cluster of reference bodyprints for a person based on any suitable criteria. For example, the first device may subsequently receive a third plurality of video frames during a subsequent time period. The first device may determine that at least one of the one or more identified physical characteristics of the first person may have changed (e.g., wearing a different colored shirt) based at least in part on analyzing the third plurality of video frames. The first device may accordingly update the cluster of bodyprints with updated bodyprints determined from the third plurality of video frames. In another example, the first device may update reference bodyprints according to a particular cadence (e.g., daily, weekly, the start of a new time period, etc.).
  • the first device may store more than one cluster of bodyprints for a particular person. For example, suppose that during one time period, the device stores physical characteristic information (e.g., a cluster of bodyprints) associated with the particular person wearing particular clothing (e.g., a green patterned shirt). Furthermore, suppose that during another time period, the device stores other physical characteristics (e.g., another cluster of bodyprints) associated with the particular person wearing different clothing (e.g., a red patterned shirt). In this example, the device may store both clusters of bodyprints for the same person.
  • the device may compare one or more bodyprints generated during the present time period, respectively, with one or more of the plurality of previously stored clusters of bodyprints (e.g., reference clusters).
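  • A hedged sketch of comparing a newly generated bodyprint against several previously stored reference clusters (e.g., one per outfit); the similarity definition and data layout are assumptions for the example:

```python
import numpy as np

def best_cluster_match(candidate: np.ndarray,
                       clusters: dict[str, list[np.ndarray]]) -> tuple[str, float]:
    """Return the label and score of the best-matching reference cluster.
    Assumes at least one non-empty cluster is stored."""
    def cluster_score(refs: list[np.ndarray]) -> float:
        return max(1.0 / (1.0 + np.linalg.norm(candidate - r)) for r in refs)
    label = max(clusters, key=lambda k: cluster_score(clusters[k]))
    return label, cluster_score(clusters[label])
```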
  • block 608 may conclude the first phase (e.g., indicated by triangle marker “B” in FIG.6), whereby the second phase of the process may continue during a second time period, as illustrated by process 700 of FIG.7.
  • the device may receive a second plurality of video frames during a second time period.
  • the second plurality of video frames may include a second person whose face may be determined not to be recognized by the first device.
  • one or more operations of block 610 may be similar to one or more operations of block 208 of FIG.3.
  • the second plurality of video frames may be received by the same camera (e.g., the first camera) as the first plurality of video frames.
  • the second plurality of video frames may be received by a different camera (e.g., a second camera).
  • the first device compares additional physical characteristic information of the second person identified in the second plurality of video frames with the stored physical characteristic information (e.g., the previously stored one or more bodyprint clusters) associated with the identity of the first person.
  • one or more operations of block 612 may be similar to one or more operations of block 210 of FIG.3.
  • the additional physical characteristic information corresponds to one or more physical characteristics of the second person that are of the same type as the one or more physical characteristics of the first person.
  • the comparison is performed using a machine learning model that is trained to associate a first bodyprint of the first person with a second bodyprint of the first person.
  • the first device may provide a notification indicating whether an identity of the second person corresponds to the identity of the first person based on the comparison.
  • one or more operations of block 614 may be similar to one or more operations of block 212 of FIG. 3.
  • the notification may be provided based on a confidence score that is determined from the comparison at block 612. For example, the first device may determine that the confidence score matches a threshold metric (e.g., 90%), and then provide the notification accordingly.
  • FIG.8 is a simplified flow diagram illustrating a process for determining whether to add a bodyprint image to a cluster of bodyprint images corresponding to a recognized person, according to some embodiments.
  • Process 800 is illustrated as a logical flow diagram, each operation of which represents a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof.
  • the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations.
  • computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular data types.
  • any, or all of the processes may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof.
  • the code may be stored on a computer-readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors.
  • the computer-readable storage medium is non-transitory.
  • a computing device may maintain (or generate and then maintain) a plurality of images associated with a person.
  • the plurality of images may capture a collection of physical characteristics including, but not limited to, facial characteristics and/or body characteristics (e.g., a torso) of the person.
  • the plurality (e.g., cluster and/or gallery) of images may correspond to a particular person (e.g., a contact of the user of the computing device).
  • the computing device may receive a new image (e.g., from a camera or other video recorder). This new image may have been collected from the same or a different camera as other images of this person.
  • the new image may include both facial and torso information.
  • the computing device may determine whether the new image includes a face that is identifiable (e.g., good quality) and/or is a face of the person. That is, the process 800 will determine if the face is both clear enough to determine that it’s a face, and also recognize (e.g., identify) the person as being in a list of contacts of the user. In some instances, if the face is not identifiable, then it will not be recognizable either.
  • the process 800 may not fall back to using torsoprints in the event that the face was identifiable and/or recognized. In instances where the face is identifiable but not recognizable, the process 800 will not fall back to using bodyprints (block 816), and will also not determine whether to add the new torsoprint to the cluster (blocks 808, 810). In these instances, the process 800 may end without successfully identifying any person, and may provide a notification that an unidentified person was detected.
  • the computing device may generate a score associated with the new image (and/or the torso in the image) at block 808.
  • the scores may be generated based on one or more (e.g., some combination) of the rules described above. For example, the score may be generated based on image characteristics such as saturation, contrast, brightness, and/or sharpness, whether the image is a false torso, or the like.
  • the computing device may determine whether the score is above a threshold (e.g., a quality threshold). If the score is above the threshold, the new image may be added to the cluster at block 812.
  • the new image may be discarded at block 814.
  • the computing device may identify a person based on comparison of the new image to a cluster of bodyprints associated with the person at block 816. If identified, the computing device may notify users about the presence of the known/recognized person or the presence of an unrecognized person.
  • the process 800 may only proceed to block 816 if the image did not have a face or if a face in the image was of such low quality that it was not identifiable (e.g., it would not be possible to recognize the user).
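  • A toy version of the quality gate of blocks 808-814 appears below (Python). The particular quality measures and their equal weighting are assumptions; the disclosure only says the score may reflect characteristics such as saturation, contrast, brightness, and/or sharpness:

```python
import numpy as np

def quality_score(crop: np.ndarray) -> float:
    """Score a grayscale torso crop (uint8, 0-255) from brightness,
    contrast, and a gradient-based sharpness proxy; the equal weighting
    is an illustrative choice."""
    brightness = crop.mean() / 255.0
    contrast = min(1.0, crop.std() / 128.0)
    gy, gx = np.gradient(crop.astype(float))
    sharpness = min(1.0, (gx ** 2 + gy ** 2).mean() / 100.0)
    return (brightness + contrast + sharpness) / 3.0

def maybe_add_to_cluster(crop: np.ndarray, cluster: list, threshold: float = 0.5) -> bool:
    """Blocks 808-814: add the crop to the cluster only if its quality
    score clears the threshold; otherwise discard it."""
    if quality_score(crop) > threshold:
        cluster.append(crop)   # block 812
        return True
    return False               # block 814
```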
  • Illustrative techniques for providing a notification based on determining the presence of a particular person at a location and determining whether to add a bodyprint to a cluster of bodyprints for a recognized person are described above. Some or all of these techniques may, but need not, be implemented at least partially by architectures such as those shown at least in FIGs. 1-7 above. While many of the embodiments are described above with reference to resident devices and user devices, it should be understood that other types of computing devices may be suitable to perform the techniques disclosed herein.
  • User or client devices can include any of a number of general purpose personal computers, such as desktop or laptop computers running a standard operating system, as well as cellular, wireless and handheld devices running mobile software and capable of supporting a number of networking and messaging protocols. Such a system also can include a number of workstations running any of a variety of commercially-available operating systems and other known applications for purposes such as development and database management. These devices also can include other electronic devices, such as dummy terminals, thin-clients, gaming systems and other devices capable of communicating via a network.
  • Most embodiments utilize at least one network that would be familiar to those skilled in the art for supporting communications using any of a variety of commercially-available protocols, such as TCP/IP, OSI, FTP, UPnP, NFS, CIFS, and AppleTalk.
  • the network can be, for example, a local area network, a wide-area network, a virtual private network, the Internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network, and any combination thereof.
  • the network server can run any of a variety of server or mid-tier applications, including HTTP servers, FTP servers, CGI servers, data servers, Java servers, and business application servers.
  • the server(s) also may be capable of executing programs or scripts in response to requests from user devices, such as by executing one or more applications that may be implemented as one or more scripts or programs written in any programming language, such as Java®, C, C#, or C++, or any scripting language, such as Perl, Python, or TCL, as well as combinations thereof.
  • the server(s) may also include database servers, including without limitation those commercially available from Oracle®, Microsoft®, Sybase®, and IBM®.
  • the environment can include a variety of data stores and other memory and storage media as discussed above. These can reside in a variety of locations, such as on a storage medium local to one or more of the computers, or remote from any or all of the computers across the network (e.g., in a storage-area network (SAN)).
  • any necessary files for performing the functions attributed to the computers, servers or other network devices may be stored locally and/or remotely, as appropriate.
  • each such device can include hardware elements that may be electrically coupled via a bus, the elements including, for example, at least one central processing unit (CPU), at least one input device (e.g., a mouse, keyboard, controller, touch screen or keypad), and at least one output device (e.g., a display device, printer or speaker).
  • Such a system may also include one or more storage devices, such as disk drives, optical storage devices, and solid-state storage devices such as RAM or ROM, as well as removable media devices, memory cards, flash cards, etc.
  • Such devices can include a computer-readable storage media reader, a communications device (e.g., a modem, a network card (wireless or wired), an infrared communication device, etc.), and working memory as described above.
  • the computer-readable storage media reader can be connected with, or configured to receive, a non-transitory computer-readable storage medium, representing remote, local, fixed, and/or removable storage devices as well as storage media for temporarily and/or more permanently containing, storing, transmitting, and retrieving computer-readable information.
  • the system and various devices also typically will include a number of software applications, modules, services or other elements located within at least one working memory device, including an operating system and application programs, such as a client application or browser.
  • Non-transitory storage media and computer-readable storage media for containing code, or portions of code, can include any appropriate media known or used in the art such as, but not limited to, volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data, including RAM, ROM, Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, CD-ROM, DVD or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a system device.
  • one aspect of the present technology is the gathering and use of data (e.g., images of people) to perform facial recognition.
  • the present disclosure contemplates that in some instances, this gathered data may include personally identifiable information (PII) data that uniquely identifies or can be used to contact or locate a specific person.
  • Such personal information data can include facial and/or non-facial characteristics of a person’s body, demographic data, location-based data (e.g., GPS coordinates), telephone numbers, email addresses, Twitter IDs, home addresses, or any other identifying or personal information.
  • the present disclosure recognizes that the use of such personal information data, in the present technology, can be used to the benefit of users.
  • the personal information data can be used to identify a person as being a contact (or not known contact) of a user of a user device.
  • the entities responsible for the collection, analysis, disclosure, transfer, storage, or other use of such personal information data will comply with well-established privacy policies and/or privacy practices.
  • such entities should implement and consistently use privacy policies and practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining personal information data private and secure.
  • Such policies should be easily accessible by users, and should be updated as the collection and/or use of data changes.
  • Personal information from users should be collected for legitimate and reasonable uses of the entity and not shared or sold outside of those legitimate uses. Further, such collection/sharing should occur after receiving the informed consent of the users.
  • policies and practices should be adapted for the particular types of personal information data being collected and/or accessed and adapted to applicable laws and standards, including jurisdiction-specific considerations. For instance, in the US, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly. Hence different privacy practices should be maintained for different personal data types in each country.
  • the present disclosure also contemplates embodiments in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data.
  • the present technology can be configured to allow users to select to “opt in” or “opt out” of participation in the collection of personal information data during registration for services or anytime thereafter.
  • the present disclosure contemplates providing notifications relating to the access or use of personal information.
  • a user may be notified upon downloading an app that their personal information data will be accessed and then reminded again just before personal information data is accessed by the app.
  • personal information data should be managed and handled in a way to minimize risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed.
  • data de-identification can be used to protect a user’s privacy.
  • De-identification may be facilitated, when appropriate, by removing specific identifiers (e.g., date of birth, etc.), controlling the amount or specificity of data stored (e.g., collecting location data at a city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or other methods.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Collating Specific Patterns (AREA)
  • Image Analysis (AREA)

Abstract

Techniques are disclosed for determining whether to include a bodyprint in a cluster of bodyprints associated with a recognized person. For example, a device performs facial recognition to identify the identity of a first person. The device also identifies and stores physical characteristic information of the first person, the stored information being associated with the identity of the first person based on the recognized face. The device then receives a second video feed showing an image of a second person whose face is also determined to be recognized by the device. The device then generates a quality score for the physical characteristics in the image of the user. The device may then add the image showing the physical characteristics to a cluster of images associated with the person if the quality score is above a threshold, or discard the image otherwise.
PCT/US2022/029314 2021-05-14 2022-05-13 Reconnaissance d'identité utilisant des caractéristiques corporelles associées à un visage WO2022241294A2 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202280034667.6A CN117377988A (zh) 2021-05-14 2022-05-13 利用面部关联的身体特征的身份识别
EP22729364.4A EP4338136A2 (fr) 2021-05-14 2022-05-13 Reconnaissance d'identité utilisant des caractéristiques corporelles associées à un visage

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163188949P 2021-05-14 2021-05-14
US63/188,949 2021-05-14
US202263341379P 2022-05-12 2022-05-12
US63/341,379 2022-05-12

Publications (2)

Publication Number Publication Date
WO2022241294A2 true WO2022241294A2 (fr) 2022-11-17
WO2022241294A3 WO2022241294A3 (fr) 2022-12-15

Family

ID=82016544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/029314 WO2022241294A2 (fr) 2021-05-14 2022-05-13 Reconnaissance d'identité utilisant des caractéristiques corporelles associées à un visage

Country Status (2)

Country Link
EP (1) EP4338136A2 (fr)
WO (1) WO2022241294A2 (fr)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008117333A (ja) * 2006-11-08 2008-05-22 Sony Corp 情報処理装置、情報処理方法、個人識別装置、個人識別装置における辞書データ生成・更新方法および辞書データ生成・更新プログラム
US10789820B1 (en) * 2017-09-19 2020-09-29 Alarm.Com Incorporated Appearance based access verification
US20200380299A1 (en) * 2019-05-31 2020-12-03 Apple Inc. Recognizing People by Combining Face and Body Cues

Also Published As

Publication number Publication date
WO2022241294A3 (fr) 2022-12-15
EP4338136A2 (fr) 2024-03-20

Similar Documents

Publication Publication Date Title
US11735018B2 (en) Security system with face recognition
US10372988B2 (en) Systems and methods for automatically varying privacy settings of wearable camera systems
US20220245396A1 (en) Systems and Methods of Person Recognition in Video Streams
US11710348B2 (en) Identifying objects within images from different sources
WO2017166469A1 (fr) Procédé et appareil de protection de sécurité basés sur un téléviseur intelligent
US10769909B1 (en) Using sensor data to detect events
US20170262706A1 (en) Smart tracking video recorder
US11302156B1 (en) User interfaces associated with device applications
CN108881813A (zh) 一种视频数据处理方法及装置、监控系统
CN117377988A (zh) 利用面部关联的身体特征的身份识别
EP3975132A1 (fr) Identification d'objets partiellement couverts en utilisant des couvertures simulées
US20230013117A1 (en) Identity recognition utilizing face-associated body characteristics
US20220366727A1 (en) Identity recognition utilizing face-associated body characteristics
EP4338136A2 (fr) Reconnaissance d'identité utilisant des caractéristiques corporelles associées à un visage
WO2024063969A1 (fr) Reconnaissance d'identité utilisant des caractéristiques corporelles associées à un visage
EA038335B1 (ru) Способ и система распознавания лиц и построения маршрута с помощью средства дополненной реальности
CN115118536B (zh) 分享方法、控制设备及计算机可读存储介质
KR20140067730A (ko) 로봇을 이용한 자동추적 사진촬영 제공시스템 및 방법

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22729364

Country of ref document: EP

Kind code of ref document: A2

WWE Wipo information: entry into national phase

Ref document number: 202280034667.6

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2022729364

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022729364

Country of ref document: EP

Effective date: 20231214