WO2019179441A1 - Focus tracking method and device of smart apparatus, smart apparatus, and storage medium - Google Patents

Focus tracking method and device of smart apparatus, smart apparatus, and storage medium Download PDF

Info

Publication number
WO2019179441A1
Authority
WO
WIPO (PCT)
Prior art keywords
smart device
face
center point
human body
image
Prior art date
Application number
PCT/CN2019/078747
Other languages
French (fr)
Chinese (zh)
Inventor
周子傲
谢长武
王雪松
马健
Original Assignee
北京猎户星空科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京猎户星空科技有限公司 filed Critical 北京猎户星空科技有限公司
Publication of WO2019179441A1 publication Critical patent/WO2019179441A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04847 - Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 - Manipulators not otherwise provided for
    • B25J11/0005 - Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 - Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships

Definitions

  • the present disclosure relates to the field of smart device technologies, and in particular, to a focus following method, device, smart device, and storage medium for a smart device.
  • Smart devices interact with users more and more frequently.
  • A smart device can follow a user's movement through focus following, achieving the effect that the smart device pays attention to user behavior.
  • In the related art, the smart device uses face recognition technology to collect the user's face center point, calculates the distance between the face center point and the center position of the collected image, and controls the smart device to rotate so that the user's face is located at the image center position.
  • the present disclosure proposes a focus following method of a smart device.
  • The method supplements human body key points as a following focus.
  • When the smart device does not detect face key points, it detects human body key points from the collected image as the following focus, thereby preventing the user from being lost in cases such as bowing the head or turning around, and improving the success rate and accuracy of focus following.
  • the present disclosure proposes a focus following device of a smart device.
  • the present disclosure proposes a smart device.
  • the present disclosure proposes a non-transitory computer readable storage medium.
  • the first aspect of the present disclosure provides a focus following method of a smart device, including:
  • detecting a face key point of the target user from an environment image collected by the smart device, determining a face center point according to the face key point, and controlling the smart device to perform focus following on the face center point; and
  • if the face key point is not detected from the environment image, detecting a human body key point of the target user from the environment image, determining a body center point according to the body key point, and controlling the smart device to perform focus following on the body center point.
  • In the focus following method of the smart device of the embodiment of the present disclosure, a face key point of the target user is first detected from the environment image collected by the smart device, a face center point is determined according to the face key point, and the smart device is controlled to perform focus following on the face center point. If the face key point is not detected from the environment image, a human body key point of the target user is detected from the environment image, a body center point is determined according to the body key point, and the smart device is controlled to perform focus following on the body center point. The method thus solves the technical problem in the related art that focus cannot be maintained when face key points are not detected, by using human body key points as a supplementary following focus.
  • When the smart device does not detect face key points, it detects human body key points from the collected image as the following focus, which prevents the user from being lost when bowing the head, turning around, and the like, and improves the success rate and accuracy of focus following.
  • In an embodiment, before the face key point of the target user is detected from the environment image collected by the smart device, the method further includes: identifying a center point of the environment image collected by the smart device, and, with the center point of the environment image as a reference point, generating a circular image area for focus following.
  • In an embodiment, performing focus following includes: periodically determining whether the detected face center point or body center point is within the image area; when the face center point or body center point is not within the image area, acquiring a shortest path between the face center point or body center point and the center point of the image area; acquiring, according to the shortest path, control information for controlling movement of the smart device; and moving the smart device according to the control information, so that the detected face center point or body center point falls within the image area.
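The periodic check and shortest-path movement described above can be sketched as follows (a minimal illustration in pixel coordinates; the names `FRAME_CENTER`, `REGION_RADIUS`, and `shortest_path` are assumptions for the sketch and do not appear in the patent):

```python
import math

# Illustrative sketch of the periodic focus-follow check; FRAME_CENTER,
# REGION_RADIUS, and shortest_path are hypothetical names, not from the patent.
FRAME_CENTER = (320.0, 240.0)   # center point of the environment image, in pixels
REGION_RADIUS = 72.0            # radius of the circular focus-follow image area

def shortest_path(focus_point):
    """Vector from the detected center point toward the region center, scaled so
    the point just re-enters the circular image area (the shortest path)."""
    dx = FRAME_CENTER[0] - focus_point[0]
    dy = FRAME_CENTER[1] - focus_point[1]
    dist = math.hypot(dx, dy)
    if dist <= REGION_RADIUS:
        return (0.0, 0.0)            # already inside the image area: no movement
    scale = (dist - REGION_RADIUS) / dist
    return (dx * scale, dy * scale)  # pixel offset the camera view should shift by
```

The returned offset would then be translated into device control information (for example, a pan-tilt rotation) by hardware-specific code.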
  • In an embodiment, detecting the face key point of the target user from the environment image collected by the smart device and determining the face center point according to the face key point includes: identifying a head region of the target user in the environment image according to a preset head feature; extracting face key points from the head region; if one face key point is extracted, using that face key point as the face center point; and if two or more face key points are extracted, obtaining a first center point of all the extracted face key points and using the first center point as the face center point.
  • In an embodiment, acquiring the first center point of all the extracted face key points includes: taking each face key point as a node, taking one of the nodes as a starting node, and connecting all the nodes one by one to form a key point graph covering all the nodes; and obtaining a center point of the key point graph and determining it as the first center point.
  • In an embodiment, detecting the human body key point of the target user from the collected environment image includes: identifying, from the collected environment image, a human body area located below the head area; after the human body area is identified, controlling the imaging angle of the pan-tilt camera of the smart device to move in the direction of the head region; after the imaging angle is moved, capturing an environment image; determining whether the head region is included in the environment image; if the head region is included in the environment image, identifying the face key point from the head region; and if the head region is not included in the environment image, detecting the human body key point of the target user from the environment image.
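The fallback flow in this embodiment (tilt toward the head, recapture, then choose face or body key points) can be sketched roughly as below; the detector and camera functions are hypothetical stubs supplied by the caller, not an API defined by the patent:

```python
# Illustrative sketch of the fallback flow; all callables are hypothetical stubs.
def choose_focus(capture, detect_head, detect_face_keypoints, detect_body_keypoints,
                 tilt_camera_up):
    """Tilt the pan-tilt camera toward the head region, recapture, and pick
    face key points when the head is visible, body key points otherwise."""
    tilt_camera_up()                 # move the imaging angle toward the head region
    image = capture()                # capture a fresh environment image
    head = detect_head(image)
    if head is not None:
        return ("face", detect_face_keypoints(head))
    return ("body", detect_body_keypoints(image))
```

The tuple tag ("face" or "body") indicates which kind of center point the follow step should compute next.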
  • In an embodiment, before the face key point of the target user is detected from the environment image collected by the smart device, the method further includes: performing human body recognition on the environment image; when a plurality of human bodies are identified from the environment image, acquiring a distance between each human body and the smart device; and selecting the human body closest to the smart device as the human body corresponding to the target user.
  • In an embodiment, selecting the human body closest to the smart device as the human body corresponding to the target user includes: when there are multiple human bodies closest to the smart device, querying whether a face image corresponding to any of those human bodies exists in the registered-user face image library of the smart device; if one such face image exists, using the matching human body as the human body corresponding to the target user; if no such face image exists, randomly selecting one of the human bodies closest to the smart device as the human body corresponding to the target user; and if multiple such face images exist, using the human body whose face image is queried first as the human body corresponding to the target user.
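The selection rule above can be expressed as a short decision function. This is a hedged sketch: the data shapes (body ids, a distance dict, a set of ids with registered faces) are illustrative assumptions, not structures named in the patent:

```python
import random

# Hedged sketch of the target-user selection rule; data shapes are assumptions.
def select_target(bodies, distances, registered_faces):
    """bodies: list of body ids; distances: body id -> distance to the device;
    registered_faces: ids whose faces match the registered face image library."""
    nearest = min(distances[b] for b in bodies)
    candidates = [b for b in bodies if distances[b] == nearest]
    if len(candidates) == 1:
        return candidates[0]              # a single nearest body is the target
    matched = [b for b in candidates if b in registered_faces]
    if matched:
        return matched[0]                 # take the first queried registered face
    return random.choice(candidates)      # no registered face: pick one at random
```

For example, with one body strictly nearest, that body is returned regardless of the face library; a tie falls through to the registered-face check.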
  • the second aspect of the present disclosure provides a focus following device of a smart device, including:
  • a detecting module configured to detect a face key point of the target user from the environment image collected by the smart device, and to detect a human body key point of the target user from the environment image when the face key point is not detected;
  • a determining module configured to determine a face center point according to the face key point, and to determine a body center point according to the body key point when the human body key point is detected; and
  • a control module configured to control the smart device to perform focus following on the face center point, and to control the smart device to perform focus following on the body center point when the body center point is determined.
  • the focus following device of the smart device according to the above-described embodiments of the present disclosure may further have the following additional technical features:
  • In an embodiment, the focus following device of the above embodiment further includes: a generating module configured to, before the face key point of the target user is identified from the environment image collected by the smart device, identify a center point of the environment image and, using the center point of the environment image as a reference point, generate a circular image area for focus following.
  • In an embodiment, the control module is specifically configured to: periodically determine whether the detected face center point or body center point is within the image area; when the face center point or body center point is not within the image area, acquire a shortest path between the face center point or body center point and the center point of the image area; acquire, according to the shortest path, control information for controlling movement of the smart device; and move the smart device according to the control information, so that the detected face center point or body center point falls within the image area.
  • In an embodiment, the detecting module is specifically configured to: identify a head area of the target user from the environment image according to a preset head feature; and extract the face key points from the head area.
  • In an embodiment, the determining module is specifically configured to: if one face key point is extracted, use that face key point as the face center point; and if two or more face key points are extracted, obtain a first center point of all the extracted face key points and use it as the face center point.
  • In an embodiment, the determining module is specifically configured to: take each face key point as a node, take one of the nodes as a starting node, and connect all the nodes one by one to form a key point graph covering all the nodes; and obtain a center point of the key point graph and determine it as the first center point.
  • In an embodiment, the detecting module is specifically configured to: identify, from the collected environment image, a human body region located below the head region; after the human body region is identified, control the imaging angle of the pan-tilt camera of the smart device to move in the direction of the head region; after the imaging angle is moved, capture an environment image; determine whether the head region is included in the environment image; if the head region is included, identify the face key point from the head region; and if the head region is not included, detect the human body key point of the target user from the environment image.
  • In an embodiment, the focus following device of the foregoing embodiment further includes: a human body recognition module configured to perform human body recognition on the environment image before the face key point of the target user is detected from the environment image collected by the smart device; a distance detecting module configured to acquire a distance between each human body and the smart device when a plurality of human bodies are identified from the environment image; and a selecting module configured to select the human body closest to the smart device as the human body corresponding to the target user.
  • In an embodiment, the selecting module is specifically configured to: when there are multiple human bodies closest to the smart device, query whether a face image corresponding to any of those human bodies exists in the registered-user face image library of the smart device; if one such face image exists, use the matching human body as the human body corresponding to the target user; if no such face image exists, randomly select one of the human bodies closest to the smart device as the human body corresponding to the target user; and if multiple such face images exist, use the human body whose face image is queried first as the human body corresponding to the target user.
  • In the focus following device of the smart device of the embodiment of the present disclosure, a face key point of the target user is first detected from the environment image collected by the smart device, a face center point is determined according to the face key point, and the smart device is controlled to perform focus following on the face center point. If the face key point is not detected from the environment image, a human body key point of the target user is detected from the environment image, a body center point is determined according to the body key point, and the smart device is controlled to perform focus following on the body center point. The device thus solves the technical problem in the related art that focus cannot be maintained when face key points are not detected, by using human body key points as a supplementary following focus.
  • When the smart device does not detect face key points, the device detects human body key points from the collected image as the following focus, which prevents the user from being lost when bowing the head, turning around, and the like, and improves the success rate and accuracy of focus following.
  • A third aspect of the present disclosure provides a smart device, including: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside the space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or device of the smart device; the memory is configured to store executable program code; and the processor is configured to run a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to implement the focus following method of the smart device as described in the above embodiments.
  • The fourth aspect of the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the focus following method of the smart device as described in the above embodiments.
  • FIG. 1 is a schematic flowchart of a focus following method of a smart device according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a position of a key point of a human body according to an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of a method for determining a face center point according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of a location of a face key point according to an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of a focus following method of another smart device according to an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of a focus following process according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic flowchart of a specific focus following method of a smart device according to an embodiment of the present disclosure
  • FIG. 8 is a schematic flowchart of a method for determining a target user according to an embodiment of the present disclosure
  • FIG. 9 is a schematic diagram of the principle of calculating distance by binocular vision according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a focus following device of a smart device according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a focus following device of another smart device according to an embodiment of the present disclosure.
  • FIG. 12 is a block diagram of an exemplary smart device suitable for implementing an implementation of the present disclosure, in accordance with an embodiment of the present disclosure.
  • the execution body of the focus following method of the smart device of the embodiment of the present disclosure may be a smart device that captures an image of the surrounding environment by the camera device and follows the focus on the image, such as an intelligent robot or the like.
  • FIG. 1 is a schematic flowchart of a focus following method of a smart device according to an embodiment of the present disclosure. As shown in FIG. 1, the focus following method of the smart device includes the following steps:
  • Step 101 Detect a face key point of the target user from the environment image collected by the smart device, determine a face center point according to the face key point, and control the smart device to perform focus following on the face center point.
  • the smart device may be a robot, a smart home appliance, or the like.
  • The smart device is equipped with a camera device, such as a camera, through which it can collect environment images within its monitoring range in real time. After an environment image is acquired, the image can be analyzed to identify a human body entering the monitoring range.
  • For the environment image, combined with face recognition technology, it is detected whether a face is present in the collected image.
  • The outline of an object is extracted, and the extracted outline is compared with a pre-stored face contour or human body contour.
  • If the similarity exceeds a preset threshold, it can be considered that a user is recognized in the environment image.
  • the smart device detects the face key point of the target user, and determines the face center point according to the face key point.
  • A face key point can be a facial feature of the target user, such as an eye, the nose, or the mouth.
  • The smart device can determine face key points by detecting the shapes of facial organs and the positions of different organs within the face, and then determine the face center point according to the detected face key points.
  • The camera or vision system of the smart device is controlled to follow the focus in real time and keep the focus within the following area of the collected environment image, where the following area covers a partial area of the environment image; it is not fixed within the environment image but moves in real time with the monitoring field of view.
  • The following area generally needs to cover the central area of the environment image, so that the smart device and the monitored target user can interact face to face.
  • For example, when the imaging device is mounted on the head of the robot, the camera device of the robot is controlled to perform focus following on the face center point, thereby achieving the effect that the robot always "gazes at" the target user and enhancing the user experience.
  • Step 102 If the face key point is not detected from the environment image, detect a human body key point of the target user from the environment image, determine the body center point according to the body key point, and control the smart device to perform focus following on the body center point.
  • In cases such as the user bowing the head or turning around, face key points may not be detected in the environment image.
  • In this case, the smart device detects human body key points of the target user from the environment image, where a human body key point is a key point of a part of the target user's body other than the head.
  • FIG. 2 is a schematic diagram of positions of human body key points according to an embodiment of the present disclosure. As shown in FIG. 2, the smart device identifies the contour edge of the target user's torso in the environment image; the intersections of the limbs and the torso are human body key points, and the body center point is determined according to these key points.
  • For example, the camera device of the smart device moves downward and detects the intersection point P1 of the user's neck and torso as a human body key point; this key point is taken as the body center point.
  • Alternatively, the smart device detects in the environment image that the intersections of the user's two arms with the torso are P2 and P3, and the midpoint of the line connecting P2 and P3 is taken as the body center point.
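The two body-center rules above (the neck-torso point P1, or the midpoint of the arm-torso points P2 and P3) reduce to a few lines; this is a minimal sketch with hypothetical parameter names following FIG. 2:

```python
# Minimal sketch of the body-center rules; parameter names follow FIG. 2 loosely.
def body_center(neck_torso=None, left_shoulder=None, right_shoulder=None):
    """Prefer the neck-torso intersection P1; otherwise use the midpoint of the
    two arm-torso intersections P2 and P3."""
    if neck_torso is not None:
        return neck_torso
    (x2, y2), (x3, y3) = left_shoulder, right_shoulder
    return ((x2 + x3) / 2.0, (y2 + y3) / 2.0)
```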
  • The smart device performs focus following on the body center point and keeps the focus within the following area of the collected environment image.
  • The method of performing focus following on the body center point may refer to the method of performing focus following on the face center point in the above example, and is not repeated here.
  • The focus following method of the smart device of the embodiment of the present disclosure detects a face key point of the target user from the environment image collected by the smart device, determines a face center point according to the face key point, and controls the smart device to perform focus following on the face center point. If the face key point is not detected from the environment image, a human body key point of the target user is detected from the environment image, a body center point is determined according to the body key point, and the smart device is controlled to perform focus following on the body center point. The method thus solves the technical problem that focus cannot be maintained when face key points are not detected, using human body key points as a supplementary focus: when the smart device does not detect face key points, it detects human body key points from the collected image as the following focus, which avoids losing the user when bowing the head, turning around, and the like, and improves the success rate and accuracy of focus following.
  • FIG. 3 is a schematic flowchart of a method for determining a face center point according to an embodiment of the present disclosure.
  • As shown in FIG. 3, the method for determining a face center point includes the following steps:
  • Step 201 Identify the head area of the target user.
  • The smart device presets head features according to a pre-stored head model, for example, the shape and structure of the head, its basic proportions, and its positional relationship with the human torso; the smart device then identifies the head area of the target user from the environment image according to the preset head features.
  • Step 202 Detect face key points in the head area.
  • The process of detecting face key points of the target user in the identified head area can be referred to the description of related content in the foregoing embodiments, and is not repeated here.
  • Step 203 Determine the number of detected face key points. If the number of face key points is one, step 204 is performed; if the number of face key points is two or more, step 205 is performed.
  • Step 204 Use the detected face key point as the face center point.
  • If only one face key point is detected in the target user's head area, that key point is used as the face center point; for example, if only one eye is detected, the eye is used as the target user's face center point.
  • Step 205 Acquire a first center point of all detected face key points, and use the first center point as the face center point.
  • The first center point is the center point of the key point graph enclosed by all detected face key points.
  • FIG. 4 is a schematic diagram of positions of face key points according to an embodiment of the present disclosure. As shown in FIG. 4, each face key point is used as a connection node of the key point graph; one of the nodes is used as a starting node, and all the nodes are connected one by one to form a key point graph covering all the nodes. If the resulting key point graph is a symmetric figure (as shown in FIG. 4), the midpoint of the symmetry axis of the graph is the first center point of the key point graph, and this first center point is determined as the face center point; if the key point graph is an irregular figure, the intersection of the longest axis and the shortest axis of the figure is the first center point of the key point graph, and this first center point is determined as the face center point.
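As a rough stand-in for the key point graph center described above, the centroid of all face key points gives the same result for symmetric layouts; this is a simplification for illustration, not the patent's exact axis-based construction:

```python
# Simplified stand-in for the key-point-graph center: the centroid of all face
# key points, which coincides with the graph center for symmetric layouts.
def first_center_point(keypoints):
    """keypoints: list of (x, y) face key points; returns the face center point."""
    if len(keypoints) == 1:
        return keypoints[0]          # a single key point is itself the face center
    n = float(len(keypoints))
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    return (cx, cy)
```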
  • The method for determining a face center point of the embodiment of the present disclosure determines the face center point from the detected face key points and performs focus following on the face center point, ensuring that the target user's face area is within the following area of the smart device, so that the smart device and the monitored target user can interact face to face.
  • FIG. 5 is a schematic flowchart of a focus following method of another smart device according to an embodiment of the present disclosure.
  • the focus following method includes the following steps:
  • Step 301 Acquire a reference point of an image area for focus following.
  • The smart device takes the intersection of the horizontal symmetry axis and the vertical symmetry axis of the collected environment image as the center point of the environment image, and then uses this center point as the reference point of the image area for focus following.
  • Step 302 Generate an image area for focus following.
  • The smart device takes a preset pixel value as the radius and the reference point of the image area as the center of the circle, and generates a circular image area for focus following.
  • The pixel value is preset according to the maximum pixel count of the camera device and the distance between the camera device and the target user. For example, with a 2-megapixel camera on the smart device, a large amount of experimental data on face detection areas at different user-to-camera distances shows that, when the target user is 2 meters away from the smart device, a radius of 72 pixels (the rounded average of the face detection area) ensures that the face area falls within the circular image area.
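A containment check against this circular area is a one-liner; the sketch below uses the 72-pixel example radius from the passage above, with the center point and function name as illustrative assumptions:

```python
import math

# Sketch of the circular focus-follow area check; the 72-pixel radius follows the
# example above (2-megapixel camera, target about 2 meters from the device).
def in_follow_area(point, center, radius=72.0):
    """True when the detected face/body center point lies inside the circular area."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius
```

Compared with a square region, this check naturally excludes the four corners, matching the circular area described in this embodiment.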
  • Step 303 Control the smart device to perform focus following based on the image area.
  • The smart device periodically determines whether the detected face center point is within the image area, and when the face center point is not within the image area, the smart device is controlled to move so that focus following is maintained.
  • FIG. 6 is a schematic diagram of a focus following process according to an embodiment of the present disclosure.
  • as shown in FIG. 6, a coordinate system is generated by taking the reference point of the image area as the origin and the horizontal and vertical symmetry axes of the image area as the X axis and the Y axis. The shortest path between the detected face center point and the reference point is obtained, and control information for moving the smart device is generated from this path, for example an instruction to move a certain distance (such as 5 cm) in a certain direction; the smart device is then controlled to move according to the control information, so that the detected face center point falls into the image area.
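The shortest-path step can be sketched as the straight-line offset from the face center point back to the boundary of the circular area. How this pixel offset is translated into actual movement commands (such as "move 5 cm in a direction") is device-specific and not specified in the text, so the sketch stops at the pixel offset.

```python
# Sketch of the control step: when the face center point is outside the
# circular area, move along the straight line (the shortest path) from
# the face center point toward the area's reference point, just far
# enough for the point to reach the area boundary.
import math

def control_vector(face_center: tuple[float, float],
                   reference: tuple[float, float],
                   radius_px: float) -> tuple[float, float]:
    """Pixel offset the view must shift so the face center re-enters the area."""
    dx = face_center[0] - reference[0]
    dy = face_center[1] - reference[1]
    dist = math.hypot(dx, dy)
    if dist <= radius_px:
        return (0.0, 0.0)              # already inside: no movement needed
    scale = (dist - radius_px) / dist  # just enough to reach the boundary
    return (dx * scale, dy * scale)

print(control_vector((172.0, 0.0), (0.0, 0.0), 72.0))  # approx. (100.0, 0.0)
```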
  • the focus following method of the embodiment of the present disclosure generates a circular image area for focus following according to the center point of the collected environment image and a preset pixel radius. Compared with the "井"-shaped (well-character) following area or the square following area in the related art, the four corners are removed, so that focus following based on the image area is more accurate. In addition, the focus is followed along the shortest path between the face center point and the center point of the image area, which shortens the movement time of the camera or vision system and improves the timeliness of focus following.
  • when face key points cannot be detected, the smart device detects the target user's human body key points for focus following.
  • actions such as bowing the head or turning away may last only a short time. It can be understood that, on the basis of ensuring that the focus is not lost, following the target user's face key points makes it easier for the user to notice that the smart device is observing them.
  • accordingly, the embodiment of the present disclosure proposes a specific focus tracking method of the smart device.
  • FIG. 7 is a schematic flowchart of a focus tracking method of a specific smart device according to an embodiment of the present disclosure. As shown in FIG. 7, the method includes:
  • Step 401 Identify, from the collected environment image, the human body area located below the head area.
  • when the smart device cannot collect face key points, for example because the user has lowered or turned their head, it identifies the human body area below the target user's head area in the environment image.
  • deep learning techniques are used to acquire feature models of the human body in different postures, and the collected environment image is matched against these feature models to identify human body regions in various postures such as standing, sitting, and walking.
  • Step 402 After the human body area is identified, control the imaging angle of the pan-tilt camera of the smart device to move toward the direction of the head region.
  • specifically, the shooting angle of the pan-tilt camera, or the pan-tilt camera itself, is moved in the direction in which the head region is located, that is, adjusted upward from the current shooting angle or position.
  • the camera may be moved up slowly or raised at a preset fixed speed.
  • alternatively, the camera movement can be controlled at different speeds depending on the position of the human body center point. For example, when the body center point is the intersection of the target user's neck and torso, the camera moves upward slowly at 10°/s;
  • when the body center point is located at the center of the target user's torso, the camera moves upward at 20°/s, thereby reducing the focus search time and avoiding loss of focus following.
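The variable-speed rule above can be sketched as a simple lookup; the two speed values come from the text, while the category names are illustrative labels, not terms from the disclosure.

```python
# Sketch of the variable-speed rule: move the camera upward slowly
# (10 deg/s) when the body center point is at the neck/torso
# intersection, faster (20 deg/s) when it is at the torso center.

def upward_speed(body_center_kind: str) -> float:
    """Degrees per second for the pan-tilt camera's upward motion."""
    speeds = {
        "neck_torso_intersection": 10.0,  # head likely just above: go slowly
        "torso_center": 20.0,             # head farther up: go faster
    }
    if body_center_kind not in speeds:
        raise ValueError(f"unknown body center kind: {body_center_kind}")
    return speeds[body_center_kind]

print(upward_speed("torso_center"))  # 20.0
```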
  • Step 403 After the imaging angle is moved, capture the environment image.
  • Step 404 Determine whether a head area is included in the environment image.
  • head area identification is performed on the currently collected environment image. If it is recognized that the environment image includes the head area, step 405 is performed; if it is recognized that the environment image does not include the head area, step 406 is performed.
  • Step 405 identifying a face key point from the head area.
  • the face center point is determined according to the face key point, and the face center point is subjected to focus follow.
  • Step 406 Detect a human key point of the target user from the environment image.
  • if the environment image does not include the head area, the human body key points of the target user are detected from the environment image. Further, after the human body key points are extracted, the body center point is determined according to them, and focus following is then performed on the body center point.
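The fallback logic of Steps 401-406 can be sketched as follows; the detector callables stand in for the actual face and body key point models, which the disclosure does not specify.

```python
# Sketch of the Step 401-406 fallback: prefer face key points; if the
# head area cannot be found after tilting the camera up, fall back to
# body key points so the focus is not lost.
from typing import Callable, Optional

Point = tuple[float, float]

def pick_focus(detect_face: Callable[[], Optional[list[Point]]],
               detect_body: Callable[[], Optional[list[Point]]]) -> Optional[list[Point]]:
    """Return face key points if available, otherwise body key points."""
    face = detect_face()   # Steps 403-405: re-captured image, head area check
    if face:
        return face
    return detect_body()   # Step 406: body key points as fallback

print(pick_focus(lambda: None, lambda: [(1.0, 2.0)]))  # [(1.0, 2.0)]
```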
  • the focus following method of the smart device of the embodiment of the present disclosure detects face key points on the basis of detecting human body key points. If face key points are detected, the face center point is determined from them and followed as the focus; if not, the body center point is determined from the human body key points and followed instead. On the basis of ensuring that the focus is not lost, following the target user's face improves the vividness and flexibility of interaction with the smart device.
  • FIG. 8 is a schematic flowchart of a method for determining a target user according to an embodiment of the present disclosure. As shown in FIG. 8, the method for determining a target user includes:
  • Step 501 Perform human body recognition on the environment image.
  • the smart device can identify the human body in the environment image through face detection or human body detection.
  • Step 502 When a plurality of human bodies are identified from the environment image, obtain a distance between each human body and the smart device.
  • the smart device can recognize each human body that enters the monitoring range from the collected environmental image.
  • each human body identified is regarded as a candidate.
  • for the method of human body identification, reference may be made to the description of the foregoing embodiments, and details are not repeated here.
  • the smart device acquires the distance between each human body in the environment image and the smart device. It can be understood that the closer a candidate target is to the smart device, the more likely it is that the candidate target intends to interact with the smart device; therefore, the distance between the candidate target and the smart device is used as one basis for determining whether the candidate target has an intention to interact with the smart device.
  • the distance between the candidate target and the smart device can be obtained by a depth camera or a binocular vision camera or a laser radar.
  • the smart device is configured with a depth camera, and the depth map of the candidate target is obtained through the depth camera.
  • for example, a controllable light spot, light strip, or smooth-surface structured pattern can be projected onto the surface of the candidate target by a structured light projector, an image is obtained by the image sensor in the depth camera, and the distance of the candidate target is calculated from the geometric relationship by using the triangulation principle.
  • a binocular vision camera is configured in the smart device, and the candidate target is captured by the binocular vision camera. Then, the parallax of the image captured by the binocular vision camera is calculated, and the distance between the candidate target and the smart device is calculated based on the parallax.
  • FIG. 9 is a schematic diagram of the principle of calculating binocular vision distance according to an embodiment of the present disclosure.
  • in Fig. 9, the positions O_l and O_r of the two cameras in actual space are plotted, together with the optical axes of the left and right cameras and the focal planes of the two cameras; the focal plane is at a distance f from the plane in which the two cameras lie.
  • p and p' are the positions of the same candidate target P in the images captured by the two cameras, respectively.
  • the distance from point p to the left boundary of its captured image is x_l, and the distance from point p' to the left boundary of its captured image is x_r.
  • O_l and O_r are the two cameras, which lie in the same plane; the distance between the two cameras is Z.
  • the distance b between P in Fig. 9 and the plane where the two cameras are located satisfies the following relationship:

    b = (Z × f) / d,  where d = x_l − x_r

  • d is the disparity of the same candidate target between the images captured by the two cameras of the binocular system. Since Z and f are constant values, the distance b between the candidate target and the plane of the cameras, that is, the distance between the candidate target and the smart device, can be determined according to the disparity d.
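With the quantities defined above (baseline Z between the cameras, focal distance f, disparity d = x_l − x_r), the depth follows the standard triangulation relation b = Z·f/d. A minimal sketch, with purely illustrative numbers:

```python
# Depth from binocular disparity: b = Z * f / d, where Z is the
# baseline between the two cameras, f the focal length (in pixels),
# and d = x_l - x_r the disparity of the same point in the two images.

def depth_from_disparity(baseline_m: float, focal_px: float,
                         x_left_px: float, x_right_px: float) -> float:
    """Return the distance (meters) from the camera plane to the target."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return baseline_m * focal_px / disparity

# Example: 12 cm baseline, 700 px focal length, 42 px disparity -> 2.0 m
print(depth_from_disparity(0.12, 700.0, 421.0, 379.0))  # 2.0
```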
  • in another example, a laser radar is arranged in the smart device; the laser radar emits laser light into the monitoring range, and the emitted laser is reflected when it encounters obstacles within the monitoring range.
  • the smart device receives the laser returned by each obstacle within the monitored range and generates a binary map of each obstacle based on the returned laser.
  • each binary image is fused with the environment image, and the binary image corresponding to the candidate target is identified from all the binary images.
  • the contour or size of each obstacle can be identified according to the binary map of each obstacle, and then the contour or size of each target in the environment image is matched, so that the binary map corresponding to the candidate target can be obtained.
  • the laser return time of the binary image corresponding to the candidate target is multiplied by the speed of light, and divided by 2 to obtain the distance between the candidate target and the smart device.
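The time-of-flight computation just described (return time times the speed of light, divided by 2) can be sketched directly:

```python
# Time-of-flight distance for the lidar branch: the round-trip time of
# the returned laser pulse times the speed of light, divided by 2.

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def lidar_distance(round_trip_s: float) -> float:
    """Distance to the obstacle in meters from the laser round-trip time."""
    return round_trip_s * SPEED_OF_LIGHT / 2.0

# Example: a pulse returning after ~13.34 ns corresponds to ~2 m.
print(round(lidar_distance(13.34e-9), 3))
```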
  • Step 503 Select a human body that is closest to the smart device as the human body corresponding to the target user.
  • some candidate targets may have no intention of interacting with the smart device, so the human body closest to the smart device is selected as the human body corresponding to the target user for focus tracking.
  • when there are multiple human bodies at the same closest distance to the smart device, the smart device can query the registered user face image database for face images corresponding to these human bodies, and the human body corresponding to the target user can be determined in different manners according to the query result.
  • if the face image database contains a face image corresponding to exactly one of the closest human bodies, that human body is used as the human body corresponding to the target user.
  • if the face image database contains no face image corresponding to any of the closest human bodies, one of them is randomly selected as the human body corresponding to the target user.
  • if the face image database contains face images corresponding to multiple of the closest human bodies, the human body queried first is used as the human body corresponding to the target user.
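The target-selection rule of Steps 501-503 can be sketched as follows. The body identifiers, distance map, and registered-identity set are illustrative assumptions; the disclosure only specifies the selection logic itself.

```python
# Sketch of target selection: pick the nearest body; break ties with
# the registered-face database (exactly one registered -> use it; none
# registered -> random; several registered -> first queried).
import random
from typing import Optional

def select_target(distances: dict[str, float],
                  registered: set[str],
                  rng: random.Random = random.Random(0)) -> Optional[str]:
    """Pick the target among detected bodies (id -> distance to device)."""
    if not distances:
        return None
    nearest = min(distances.values())
    candidates = [b for b, d in distances.items() if d == nearest]
    if len(candidates) == 1:
        return candidates[0]
    known = [b for b in candidates if b in registered]  # preserves query order
    if len(known) == 1:
        return known[0]                  # exactly one registered face: use it
    if not known:
        return rng.choice(candidates)    # none registered: pick randomly
    return known[0]                      # several registered: first queried

print(select_target({"a": 2.0, "b": 2.0, "c": 3.5}, {"b"}))  # b
```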
  • the focus tracking method of the smart device uses the distance between each candidate target and the smart device to select, from all candidate targets, the candidate target that intends to interact with the smart device. Compared with directly treating any person whose face is detected as the interaction target, this can reduce false starts of the smart device.
  • FIG. 10 is a schematic structural diagram of a focus following device of a smart device according to an embodiment of the present disclosure.
  • the focus following device of the smart device includes: a detecting module 110, a determining module 120, and a control module 130.
  • the detecting module 110 is configured to detect a face key point of the target user from the environment image collected by the smart device, and when the face key point is not detected from the environment image, from the environment image Detecting key points of the human body of the target user.
  • the determining module 120 is configured to determine a face center point according to the face key point, and determine a body center point according to the body key point when the body key point is detected.
  • the control module 130 is configured to control the smart device to perform focus tracking on the face center point, and when the body center point is determined, control the smart device to perform focus tracking on the body center point.
  • the control module 130 is specifically configured to: periodically determine whether the detected face center point or body center point is in the image area; when the face center point or body center point is not in the image area, obtain the shortest path between that center point and the center point of the image area; acquire, according to the shortest path, control information for controlling the movement of the smart device; and control the smart device to move according to the control information, so that the detected face center point or body center point falls within the image area.
  • the detecting module 110 is specifically configured to: identify a head area of the target user from the environment image according to a preset head feature; and from the head area Extract the key points of the face.
  • the determining module 120 is specifically configured to: if one face key point is extracted, use that face key point as the face center point; and if two or more face key points are extracted, obtain the first center point of all the extracted face key points and use the first center point as the face center point.
  • the determining module 120 is specifically configured to: use each face key point as a node, and use one of the nodes as a starting node to connect all the nodes one by one to form an overlay all. a key point graph of the node; acquiring a center point of the key point graph, and determining a center point of the key point graph as the first center point.
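The face-center rule can be sketched as follows. Using the centroid of the key point coordinates as the "center point of the key point graph" is an assumption made here for illustration; the disclosure only states that the center point of the graph covering all nodes is used.

```python
# Sketch of the face-center rule: one key point is used directly; for
# two or more, the center of the figure covering all key points is
# approximated by the centroid of the coordinates (an assumption).

def face_center(points: list[tuple[float, float]]) -> tuple[float, float]:
    if not points:
        raise ValueError("no face key points detected")
    if len(points) == 1:
        return points[0]          # a single key point is the center itself
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(points), sum(ys) / len(points))

print(face_center([(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]))  # (1.0, 1.0)
```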
  • the detecting module 110 is specifically configured to: perform identification on the collected human body region located below the head region; when the human body region is identified, control the imaging angle of the pan-tilt camera of the smart device to move toward the direction of the head region; after the imaging angle is moved, capture the environment image; determine whether the head region is included in the environment image; if the environment image includes the head region, identify the face key points from the head region; and if the environment image does not include the head region, detect the human body key points of the target user from the environment image.
  • FIG. 11 is a schematic structural diagram of a focus following device of another smart device according to an embodiment of the present disclosure. As shown in FIG. 11, on the basis of the focus following device of the smart device of the foregoing embodiment, the device further includes a human body recognition module 210, a distance detecting module 220, a selecting module 230, and a generating module 240.
  • the human body recognition module 210 is configured to perform human body recognition on the environment image before detecting a key point of the target user from the environment image collected by the smart device;
  • the distance detecting module 220 is configured to acquire a distance between each human body and the smart device when a plurality of human bodies are identified from the environment image;
  • the selecting module 230 is configured to select a human body that is closest to the smart device as a human body corresponding to the target user.
  • the generating module 240 is configured to identify the center point of the environment image collected by the smart device before the face key points of the target user are identified in that image, and to generate a circular image area for focus tracking with the center point of the environment image as the reference point.
  • the focus following device of the smart device of the embodiment of the present disclosure first detects face key points of the target user from the environment image collected by the smart device, determines the face center point according to the face key points, and controls the smart device to perform focus tracking on the face center point. If no face key points are detected from the environment image, the target user's human body key points are detected from the environment image, the body center point is determined according to the body key points, and the smart device is controlled to perform focus tracking on the body center point. The device thus solves the technical problem that focus following cannot be maintained when face key points are not detected, by using human body key points as a focus supplement: when the smart device detects no face key points, it detects human body key points from the captured image as the focus to follow, which avoids losing focus on the user when they bow their head or turn away, and improves the success rate and accuracy of focus following.
  • an embodiment of the present disclosure further provides a smart device, including: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside the space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or device of the smart device; the memory is configured to store executable program code; and the processor, by reading the executable program code stored in the memory, runs a program corresponding to the executable program code for executing the focus following method of the smart device described in the above embodiments.
  • an embodiment of the present disclosure further provides a non-transitory computer readable storage medium on which a computer program is stored, and the program, when executed by a processor, implements the focus following method of the smart device described in the above embodiments.
  • FIG. 12 illustrates a block diagram of an exemplary smart device suitable for use in implementing embodiments of the present application.
  • as shown in FIG. 12, the smart device includes a housing 310, a processor 320, a memory 330, a circuit board 340, and a power supply circuit 350.
  • the circuit board 340 is disposed inside the space enclosed by the housing 310, and the processor 320 and the memory 330 are disposed on the circuit board 340;
  • the power supply circuit 350 is configured to supply power to the respective circuits or devices of the smart device;
  • the memory 330 is used to store executable program code; and the processor 320, by reading the executable program code stored in the memory 330, runs a program corresponding to the executable program code for executing the focus following method of the smart device described in the above embodiments.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • thus, features defined with "first" or "second" may explicitly or implicitly include at least one such feature.
  • in the description of the present disclosure, "a plurality" means at least two, such as two or three, unless specifically defined otherwise.
  • Any process or method description in the flowcharts or otherwise described herein may be understood to represent a module, segment or portion of code comprising one or more executable instructions for implementing the steps of a custom logic function or process.
  • the scope of the preferred embodiments of the present disclosure includes additional implementations in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved; this will be understood by those skilled in the art to which the embodiments of the present disclosure pertain.
  • a "computer-readable medium” can be any apparatus that can contain, store, communicate, propagate, or transport a program for use in an instruction execution system, apparatus, or device, or in conjunction with the instruction execution system, apparatus, or device.
  • computer readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM).
  • the computer readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
  • portions of the present disclosure can be implemented in hardware, software, firmware, or a combination thereof.
  • multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system.
  • for example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits with logic gate circuits for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gate circuits, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and the like.
  • each functional unit in various embodiments of the present disclosure may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated modules, if implemented in the form of software functional modules and sold or used as stand-alone products, may also be stored in a computer readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. Although the embodiments of the present disclosure have been shown and described above, it can be understood that the foregoing embodiments are exemplary and are not to be construed as limiting the present disclosure; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.

Abstract

The present disclosure provides a focus tracking method and device of a smart apparatus, a smart apparatus, and a storage medium. The method comprises: detecting a face key point of a target user in an environmental image acquired by a smart apparatus, determining a face center point according to the face key point, and controlling the smart apparatus to perform focus tracking on the face center point; and if no face key point is found in the environment image, detecting a body key point of the target user in the environmental image, determining a body center point according to the body key point, and controlling the smart apparatus to perform focus tracking on the body center point. A body key point is used as a substitute in the above method to resolve the technical issue in which focus tracking cannot be maintained when no face key point has been detected, thereby preventing loss or missed detection of a focus point, and improving the success rate and accuracy of focus tracking.

Description

Focus tracking method and device of smart device, smart device, and storage medium

Cross-Reference to Related Applications

The present disclosure claims priority to Chinese Patent Application No. 201810236920.1, filed on March 21, 2018 by Beijing Orion Star Technology Co., Ltd. and entitled "Focus Tracking Method, Device, Smart Device and Storage Medium of Smart Device".
Technical Field

The present disclosure relates to the technical field of smart devices, and in particular to a focus tracking method and device of a smart device, a smart device, and a storage medium.

Background

With the development of artificial intelligence technology, the ways in which smart devices interact with users have become increasingly rich. Among other things, a smart device can follow a user's movement through a focus following method, achieving the effect that the smart device pays attention to the user's behavior.

In the related art, a smart device uses face recognition technology to collect the center point of the user's face, calculates the distance between the face center point and the center of the collected image, and controls the smart device to rotate so that the user's face is located at the center of the image.
Summary

The present disclosure proposes a focus following method of a smart device. The method uses human body key points as a focus supplement: when the smart device does not detect face key points, it detects human body key points from the collected image as the focus to follow, which prevents the focus from being lost when the user bows their head or turns away, and improves the success rate and accuracy of focus following.

The present disclosure further proposes a focus following device of a smart device.

The present disclosure further proposes a smart device.

The present disclosure further proposes a non-transitory computer readable storage medium.
An embodiment of a first aspect of the present disclosure proposes a focus following method of a smart device, including:

detecting a face key point of a target user from an environment image collected by the smart device, determining a face center point according to the face key point, and controlling the smart device to perform focus following on the face center point; and

if the face key point is not detected from the environment image, detecting a human body key point of the target user from the environment image, determining a human body center point according to the human body key point, and controlling the smart device to perform focus following on the human body center point.

In the focus following method of the smart device of the embodiment of the present disclosure, a face key point of the target user is first detected from the environment image collected by the smart device, the face center point is determined according to the face key point, and the smart device is controlled to perform focus following on the face center point; if no face key point is detected from the environment image, the target user's human body key point is detected from the environment image, the human body center point is determined according to the human body key point, and the smart device is controlled to perform focus following on the human body center point. The method thereby solves the technical problem in the prior art that focus following cannot be maintained when face key points are not detected, by using human body key points as a focus supplement: when the smart device detects no face key points, it detects human body key points from the captured image as the focus to follow, which prevents the focus from being lost when the user bows their head or turns away, and improves the success rate and accuracy of focus following.
另外,根据本公开上述实施例的智能设备的焦点跟随方法,还可以具有如下附加的技术特征:In addition, the focus following method of the smart device according to the above-described embodiments of the present disclosure may further have the following additional technical features:
在本公开一个实施例中,从智能设备采集的环境图像中识别目标用户的人脸关键点之前,还包括:识别所述智能设备所采集的环境图像的中心点,以所述环境图像的中心点为基准点,生成一个圆形用于焦点跟随的图像区域。In an embodiment of the present disclosure, before identifying a face key point of the target user from the environment image collected by the smart device, the method further includes: identifying a center point of the environment image collected by the smart device, and centering the environment image The point is the reference point and a circle is created for the image area that the focus follows.
在本公开一个实施例中,进行焦点跟随,包括:定时判断检测出的所述人脸中心点或者人体中心点是否处于所述图像区域内;当所述人脸中心点或者人体中心点未处于所述图像区域内时,获取所述人脸中心点或者人体中心点与所述图像区域中心点之间的最短路径;根据所述最短路径,获取用于控制智能设备移动的控制信息;控制所述智能设备按照所述控制信息移动,使得检测到的所述人脸中心点或者人体中心点落入所述图像区域内。In an embodiment of the present disclosure, performing focus following includes: periodically determining whether the detected face center point or the body center point is in the image area; when the face center point or the body center point is not in Obtaining, in the image area, a shortest path between the face center point or the body center point and the image area center point; acquiring control information for controlling the movement of the smart device according to the shortest path; The smart device moves according to the control information, so that the detected face center point or body center point falls within the image area.
在本公开一个实施例中,从智能设备采集的环境图像中检测目标用户的人脸关键点,根据所述人脸关键点确定人脸中心点,包括:根据预设的头部特征,从所述环境图像中识别所述目标用户的头部区域;从所述头部区域提取所述人脸关键点;如果提取出的所述人脸关键点为一个,将所述人脸关键点作为所述人脸中心点;如果提取出的所述人脸关键点为两个以及两个以上,获取提取出的所有的所述人脸关键点的第一中心点,将所述第一中心点作为所述人脸中心点。In an embodiment of the present disclosure, the face key of the target user is detected from the environment image collected by the smart device, and the face center point is determined according to the face key point, including: according to the preset head feature, Identifying a head region of the target user in the environment image; extracting the face key point from the head region; if the extracted face key point is one, the face key point is taken as a face center point; if the extracted face key points are two or more, obtaining the first center point of all the extracted face key points, the first center point is taken as The face center point.
在本公开一个实施例中,获取提取出的所有的所述人脸关键点的第一中心点,包括:将每个人脸关键点作为节点,以其中一个节点作为起始节点,将所有的节点逐个连接起来,形成一个覆盖所有节点的关键点图形;获取所述关键点图形的中心点,将所述关键点图形的中心点,确定为所述第一中心点。In an embodiment of the present disclosure, acquiring the first center point of all the extracted face key points includes: using each face key point as a node, using one of the nodes as a starting node, and all the nodes Connected one by one to form a key point graph covering all nodes; obtain a center point of the key point graph, and determine a center point of the key point graph as the first center point.
在本公开一个实施例中,从采集的环境图像中检测所述目标用户的人体关键点,包括:从采集的位于所述头部区域下方的人体区域进行识别;当识别到所述人体区域后,控制所述智能设备的云台摄像头的摄像角度向所述头部区域所在方向移动;在所述摄像角度移动后,拍摄获取环境图像;判断所述环境图像中是否包括所述头部区域;如果所述环境图像 中包括所述头部区域,则从所述头部区域识别所述人脸关键点;如果所述环境图像中未包括所述头部区域,则从所述环境图像中检测所述目标用户的人体关键点。In an embodiment of the present disclosure, detecting a key point of the human body of the target user from the collected environment image includes: identifying from a collected human body area located below the head area; and after identifying the human body area And controlling an imaging angle of the pan-tilt camera of the smart device to move in a direction of the head region; after the camera angle is moved, capturing an environment image; determining whether the head region is included in the environment image; If the head region is included in the environment image, identifying the face key point from the head region; if the head region is not included in the environment image, detecting from the environment image The key point of the human body of the target user.
在本公开一个实施例中,从智能设备采集的环境图像中检测目标用户的人脸关键点之前,还包括:对所述环境图像进行人体识别;当从所述环境图像中识别出多个人体时,获取每个人体与智能设备之间的距离;选取与所述智能设备距离最近的人体作为所述目标用户对应的人体In an embodiment of the present disclosure, before detecting a face key point of the target user from the environment image collected by the smart device, the method further includes: performing human body recognition on the environment image; and identifying a plurality of human bodies from the environment image Obtaining a distance between each human body and the smart device; selecting a human body closest to the smart device as the human body corresponding to the target user
In an embodiment of the present disclosure, selecting the human body closest to the smart device as the human body corresponding to the target user includes: when there are multiple human bodies closest to the smart device, querying whether the registered-user face image library of the smart device contains face images corresponding to those closest human bodies; if the face image library contains one face image corresponding to a closest human body, taking that human body as the human body corresponding to the target user; if the face image library contains no face image corresponding to any of the closest human bodies, randomly selecting one of the closest human bodies as the human body corresponding to the target user; and if the face image library contains multiple face images corresponding to the closest human bodies, taking the closest human body whose face image is found first as the human body corresponding to the target user.
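The selection rules above can be sketched as follows; the `(body_id, distance)` pairs and the set of registered body IDs are hypothetical data structures for illustration, not the patent's implementation:

```python
import random

def select_target_body(bodies, registered_ids):
    """Pick the target user's body from detected (body_id, distance) pairs.

    registered_ids: IDs of bodies whose faces exist in the registered-user
    face image library (hypothetical lookup result).
    """
    if not bodies:
        return None
    # Keep only the bodies at the minimum distance from the device.
    min_dist = min(dist for _, dist in bodies)
    closest = [bid for bid, dist in bodies if dist == min_dist]
    if len(closest) == 1:
        return closest[0]
    # Several bodies are equally close: prefer registered users,
    # taking the first one found; otherwise pick one at random.
    registered = [bid for bid in closest if bid in registered_ids]
    if registered:
        return registered[0]
    return random.choice(closest)
```

With a single closest body the registered-user query is skipped entirely, matching the claim, which only branches on the library when several bodies tie for the minimum distance.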
An embodiment of the second aspect of the present disclosure provides a focus tracking device of a smart device, including:
a detection module, configured to detect face key points of a target user from an environment image collected by the smart device, and, when no face key points are detected from the environment image, detect body key points of the target user from the environment image;
a determination module, configured to determine a face center point according to the face key points, and, when body key points are detected, determine a body center point according to the body key points; and
a control module, configured to control the smart device to perform focus tracking on the face center point, and, when the body center point is determined, control the smart device to perform focus tracking on the body center point.
In addition, the focus tracking device of the smart device according to the above embodiments of the present disclosure may further have the following additional technical features:
In an embodiment of the present disclosure, the focus tracking device further includes a generation module, configured to, before the face key points of the target user are recognized from the environment image collected by the smart device, identify the center point of the environment image and, using that center point as a reference point, generate a circular image region for focus tracking.
In an embodiment of the present disclosure, the control module is specifically configured to: periodically determine whether the detected face center point or body center point lies within the image region; when it does not, acquire the shortest path between the face center point or body center point and the center point of the image region; acquire, according to the shortest path, control information for controlling the movement of the smart device; and control the smart device to move according to the control information, so that the detected face center point or body center point falls within the image region.
In an embodiment of the present disclosure, the detection module is specifically configured to: identify the head region of the target user from the environment image according to preset head features, and extract the face key points from the head region;
the determination module is specifically configured to: if one face key point is extracted, take that face key point as the face center point; and if two or more face key points are extracted, acquire a first center point of all the extracted face key points and take the first center point as the face center point.
In an embodiment of the present disclosure, the determination module is specifically configured to: treat each face key point as a node, take one of the nodes as a starting node, and connect all the nodes one by one to form a key-point graph covering all the nodes; and obtain the center point of the key-point graph and determine it as the first center point.
In an embodiment of the present disclosure, the detection module is specifically configured to: perform recognition on a collected human body region located below the head region; after the human body region is recognized, control the imaging angle of the pan-tilt camera of the smart device to move toward the head region; after the imaging angle has moved, capture an environment image; determine whether the environment image includes the head region; if it does, recognize the face key points from the head region; and if it does not, detect the body key points of the target user from the environment image.
In an embodiment of the present disclosure, the focus tracking device further includes: a human body recognition module, configured to perform human body recognition on the environment image before the face key points of the target user are detected from the environment image collected by the smart device; a distance detection module, configured to acquire the distance between each human body and the smart device when multiple human bodies are recognized from the environment image; and a selection module, configured to select the human body closest to the smart device as the human body corresponding to the target user.
In an embodiment of the present disclosure, the selection module is specifically configured to: when there are multiple human bodies closest to the smart device, query whether the registered-user face image library of the smart device contains face images corresponding to those closest human bodies; if the face image library contains one face image corresponding to a closest human body, take that human body as the human body corresponding to the target user; if the face image library contains no face image corresponding to any of the closest human bodies, randomly select one of the closest human bodies as the human body corresponding to the target user; and if the face image library contains multiple face images corresponding to the closest human bodies, take the closest human body whose face image is found first as the human body corresponding to the target user.
With the focus tracking device of the embodiments of the present disclosure, face key points of the target user are first detected from the environment image collected by the smart device, a face center point is determined according to the face key points, and the smart device is controlled to perform focus tracking on the face center point; if no face key points are detected from the environment image, body key points of the target user are detected from the environment image, a body center point is determined according to the body key points, and the smart device is controlled to perform focus tracking on the body center point. The device thereby solves the technical problem in the prior art that focus tracking cannot be maintained when face key points cannot be detected. With body key points serving as a fallback focus, when the smart device detects no face key points it detects body key points from the collected image as the focus to follow, which prevents loss of focus when the user lowers or turns the head and improves the success rate and accuracy of focus tracking.
An embodiment of the third aspect of the present disclosure provides a smart device, including: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside the space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the smart device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code, by reading the executable program code stored in the memory, to implement the focus tracking method of the smart device described in the above embodiments.
An embodiment of the fourth aspect of the present disclosure provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, the focus tracking method of the smart device described in the above embodiments is implemented.
Additional aspects and advantages of the present disclosure will be set forth in part in the following description, and will in part become apparent from the following description or be learned through practice of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and/or additional aspects and advantages of the present disclosure will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a focus tracking method of a smart device according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of body key point positions according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a method for determining a face center point according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of face key point positions according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of another focus tracking method of a smart device according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a focus tracking process according to an embodiment of the present disclosure;
FIG. 7 is a schematic flowchart of a specific focus tracking method of a smart device according to an embodiment of the present disclosure;
FIG. 8 is a schematic flowchart of a method for determining a target user according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the principle of binocular vision distance calculation according to an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a focus tracking device of a smart device according to an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of another focus tracking device of a smart device according to an embodiment of the present disclosure; and
FIG. 12 is a block diagram of an exemplary smart device suitable for implementing embodiments of the present disclosure.
DETAILED DESCRIPTION
Embodiments of the present disclosure are described in detail below. Examples of the embodiments are illustrated in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements, or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.
The focus tracking method and device of a smart device according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
The execution body of the focus tracking method of the embodiments of the present disclosure may be a smart device that collects images of the surrounding environment through a camera device and follows a focus in those images, such as an intelligent robot.
FIG. 1 is a schematic flowchart of a focus tracking method of a smart device according to an embodiment of the present disclosure. As shown in FIG. 1, the focus tracking method of the smart device includes the following steps:
Step 101: detect face key points of the target user from the environment image collected by the smart device, determine a face center point according to the face key points, and control the smart device to perform focus tracking on the face center point.
In this embodiment, the smart device may be a robot, a smart home appliance, or the like.
The smart device is equipped with a camera device, such as a camera, through which it can collect environment images within its monitoring range in real time. After an environment image is acquired, the image can be analyzed to identify any human body entering the monitoring range.
Specifically, face recognition technology is applied to the environment image to detect whether a face is present in the collected image. As an example, object contours are extracted from the environment image and compared with pre-stored face or body contours; when the similarity between an extracted contour and a preset contour exceeds a preset threshold, a user is considered to have been recognized in the environment image. All users in the environment image can thus be identified by this method.
Further, if the face of the target user is present in the environment image, the smart device detects the face key points of the target user and determines the face center point according to them. The face key points may be the target user's facial features, such as the eyes, nose, and mouth. The smart device may determine the face key points by detecting the shapes of the facial organs and the positions of the different organs on the face, and then determine the face center point from the detected key points.
Further, after the smart device acquires the face center point, it takes the face center point as the focus and controls its camera device or vision system to follow the focus in real time, keeping the focus within the following region of the collected environment image. The following region covers part of the environment image; it is not fixed within the environment image but moves in real time with the monitoring field of view. The following region generally needs to cover the central area of the environment image, so that the smart device and the monitored target user can interact face to face.
For example, when the smart device is an intelligent robot whose head is the camera device, the robot's camera device is controlled to perform focus tracking with the face center point as the focus, so that the robot always appears to "gaze" at the target user, improving the user experience.
Step 102: if no face key points are detected from the environment image, detect body key points of the target user from the environment image, determine a body center point according to the body key points, and control the smart device to perform focus tracking on the body center point.
Specifically, when the target user turns around or lowers the head, face key points may not be detectable in the environment image, so the smart device detects the target user's body key points from the environment image, where the body key points are key points of the target user's body other than the head. FIG. 2 is a schematic diagram of body key point positions according to an embodiment of the present disclosure. As shown in FIG. 2, the smart device identifies the contour edges of the target user's torso in the environment image, takes the intersections of the limbs and the torso as body key points, and determines the body center point from the body key points. For example, when the user lowers the head and the smart device cannot detect face key points, the camera device of the smart device moves downward and detects the intersection P1 of the user's neck and torso as a body key point, which serves as the body center point. As another example, when the target user turns around, the smart device detects in the environment image that the intersections of the user's two arms with the torso are P2 and P3, and takes the midpoint of the line segment connecting P2 and P3 as the body center point.
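As a minimal sketch of the examples above, the body center can be taken as the mean of the detected body key points in pixel coordinates: a single neck-torso point P1 maps to itself, and the two arm-torso points P2 and P3 map to their midpoint. Treating more than two key points the same way is an assumption, not something the text specifies.

```python
def body_center_point(keypoints):
    """Body center as the mean of (x, y) body key points in pixels."""
    if not keypoints:
        raise ValueError("no body key points detected")
    n = len(keypoints)
    return (sum(x for x, _ in keypoints) / n,
            sum(y for _, y in keypoints) / n)
```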
Further, the smart device performs focus tracking with the body center point as the focus, keeping the focus within the following region of the collected environment image. For the method of performing focus tracking on the body center point, reference may be made to the method for the face center point in the above example, which is not repeated here.
With the focus tracking method of the embodiments of the present disclosure, face key points of the target user are detected from the environment image collected by the smart device, a face center point is determined according to the face key points, and the smart device is controlled to perform focus tracking on the face center point; if no face key points are detected from the environment image, body key points of the target user are detected from the environment image, a body center point is determined according to the body key points, and the smart device is controlled to perform focus tracking on the body center point. The method thereby solves the technical problem that focus tracking cannot be maintained when face key points cannot be detected. With body key points serving as a fallback focus, when the smart device detects no face key points it detects body key points from the collected image as the focus to follow, which prevents loss of focus when the user lowers or turns the head and improves the success rate and accuracy of focus tracking.
Based on the above embodiments, in order to describe the process of determining the face center point more clearly, an embodiment of the present disclosure proposes a method for determining a face center point. FIG. 3 is a schematic flowchart of a method for determining a face center point according to an embodiment of the present disclosure.
As shown in FIG. 3, the method for determining a face center point includes the following steps:
Step 201: identify the head region of the target user.
Specifically, the smart device sets head features according to a pre-stored head model, such as the shape and structure of the head, its basic proportions, and its positional relationship with the human torso, and identifies the head region of the target user from the environment image according to the preset head features.
Step 202: detect face key points in the head region.
Specifically, the face key points of the target user are detected within the identified head region. For the process of recognizing face key points from the head region, reference may be made to the related description in the above embodiments, which is not repeated here.
Step 203: determine the number of detected face key points; if one face key point is detected, perform step 204; if two or more face key points are detected, perform step 205.
Step 204: take the single detected face key point as the face center point.
Specifically, the single face key point detected in the head region of the target user serves as the face center point; for example, if only the target user's eyes are detected, the eyes serve as the face center point of the target user.
Step 205: acquire the first center point of all the detected face key points, and take the first center point as the face center point.
The first center point is the center point of the key-point graph enclosed by all the detected face key points. FIG. 4 is a schematic diagram of face key point positions according to an embodiment of the present disclosure. As shown in FIG. 4, each face key point serves as a connection node of the key-point graph; taking one of the nodes as a starting node, all the nodes are connected one by one to form a key-point graph covering all the nodes. If the resulting key-point graph is symmetric (as shown in FIG. 4), the midpoint of its axis of symmetry is taken as the first center point of the key-point graph, and this first center point is determined as the face center point; if the key-point graph is irregular, the intersection of its longest axis and shortest axis is taken as the first center point, and this first center point is determined as the face center point.
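As a rough illustration of the first center point, the sketch below uses the center of the key-point graph's bounding box; for a symmetric five-point layout like FIG. 4 this coincides with the midpoint of the axis of symmetry, but it is only an approximation of the longest-axis/shortest-axis construction described for irregular graphs, not the patent's exact geometry:

```python
def keypoint_graph_center(keypoints):
    """Approximate the key-point graph's center with its bounding-box center."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)
```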
With the method for determining a face center point of the embodiments of the present disclosure, the face center point is determined from the detected face key points. Performing focus tracking on the face center point ensures that the face region of the target user stays within the following region of the smart device, so that the smart device and the monitored target user can interact face to face.
Based on the above embodiments, before the face is detected, an image region needs to be generated in advance; this image region is the following region. FIG. 5 is a schematic flowchart of another focus tracking method of a smart device according to an embodiment of the present disclosure.
As shown in FIG. 5, the focus tracking method includes the following steps:
Step 301: acquire the reference point of the image region used for focus tracking.
Specifically, the smart device takes the intersection of the horizontal and vertical axes of symmetry of the collected environment image as the center point of the environment image, and then takes the center point of the environment image as the reference point of the image region used for focus tracking.
Step 302: generate the image region used for focus tracking.
Specifically, the smart device generates a circular image region for focus tracking, with a preset pixel value as the radius and the reference point of the image region as the center of the circle. The pixel value is preset according to the maximum pixel count of the camera device and the distance between the camera device and the target user. For example, when the camera of the smart device has 2 million pixels, the average face detection area of users at different distances from the camera device is obtained from a large amount of experimental data; when the target user is 2 meters away from the smart device, a circle with a radius of 72 pixels ensures that the face area lies within the circular image region.
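The region construction and the containment check it implies can be sketched as follows; the 72-pixel radius is the example value for a 2 MP camera with the user 2 m away, and would be recalibrated for other setups:

```python
import math

def make_focus_region(image_width, image_height, radius_px=72):
    """Circular focus-tracking region centered on the environment image.

    Returns (center_x, center_y, radius) in pixels.
    """
    return (image_width / 2, image_height / 2, radius_px)

def point_in_region(point, region):
    """True when a face/body center point lies inside the circular region."""
    cx, cy, radius = region
    return math.hypot(point[0] - cx, point[1] - cy) <= radius
```

For a 640x480 frame, for instance, the region is a 72-pixel circle around (320, 240).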
Step 303: control the image region to perform focus tracking.
Specifically, the smart device periodically determines whether the detected face center point lies within the image region; when the face center point does not lie within the image region, the smart device controls the image region to perform focus tracking.
In a specific implementation, FIG. 6 is a schematic diagram of a focus tracking process according to an embodiment of the present disclosure. As shown in FIG. 6, a coordinate system is generated with the reference point of the image region as the origin and the horizontal and vertical axes of symmetry of the image region as the X axis and Y axis. When the face center point does not lie within the image region, the shortest path between the face center point and the center point of the image region is acquired, namely the directed line segment starting at the reference point of the image region and ending at the face center point. According to the shortest path, control information for controlling the movement of the smart device is acquired, for example, moving the image region 5 centimeters along the direction of the directed line segment. The smart device is then controlled to move according to the control information, so that the detected face center point falls within the image region.
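The shortest-path correction can be sketched as the straight-line vector from the region's reference point to the detected center point; how that vector is converted into motor commands (e.g. "move 5 cm") is device-specific and not modeled here:

```python
import math

def shortest_path_correction(center_point, region_reference):
    """Distance and unit direction from the region reference point to the
    detected face/body center point, i.e. the directed line segment the
    smart device should move along."""
    dx = center_point[0] - region_reference[0]
    dy = center_point[1] - region_reference[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        return 0.0, (0.0, 0.0)  # already centered: no movement needed
    return distance, (dx / distance, dy / distance)
```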
本公开实施例的焦点跟随方法,根据采集的环境图像的中心点和预设的像素值为半径生成用于焦点跟随的圆形图像区域,相比于相关技术中的“井”字格跟随区域或方格跟随区域去掉了四个边角,使焦点跟随的图像区域更加准确,并且按照人脸中心点与图像区域中心点之间的最短路径进行焦点跟随,缩短了摄像装置或视觉系统的移动时间提高了焦点跟随的时效性。The focus following method of the embodiment of the present disclosure generates a circular image region for focus following according to a center point of the collected environment image and a preset pixel value radius, compared to the “well” character following region in the related art. Or the square following area has four corners removed, so that the image area following the focus is more accurate, and the focus is followed according to the shortest path between the center point of the face and the center point of the image area, which shortens the movement of the camera or the visual system. Time increases the timeliness of focus follow-up.
基于上述实施例,在目标用户低头或转身等无法检测到人脸关键点的情况下,智能设备检测目标用户的人体关键点进行焦点跟随。然而,用户的低头或转身等动作可能仅持续较短的时间,可以理解,在保证焦点跟随不丢失的基础上,对目标用户的人脸关键点进行焦点跟随更容易使用户观察到智能设备的“注视”效果,为了进一步提高智能设备的主动交互效果,本公开实施例提出了一种具体的智能设备的焦点跟随方法。Based on the above embodiment, in the case where the target user is unable to detect the face key point such as turning or turning, the smart device detects the target user's human key point for focus follow. However, the user's action such as bowing or turning may only last for a short period of time. It can be understood that focusing on the target user's face key points is easier to make the user observe the smart device on the basis of ensuring that the focus is not followed. In order to further improve the active interaction effect of the smart device, the embodiment of the present disclosure proposes a specific focus tracking method of the smart device.
具体而言,图7为本公开实施例所提供的一种具体的智能设备的焦点跟随方法的流程示意图,如图7所示,该方法包括:Specifically, FIG. 7 is a schematic flowchart of a focus tracking method of a specific smart device according to an embodiment of the present disclosure. As shown in FIG. 7, the method includes:
Step 401: identify the human body region located below the head region from the collected image.
When the smart device cannot collect face key points — for example, because the user has lowered his or her head — it identifies the human body region below the head region of the target user in the environment image. For example, feature models of the human body in different postures may be obtained through deep learning, and the collected environment image is matched against these feature models to identify the target user's human body region in postures such as standing, sitting, and walking.
Step 402: after the human body region is identified, control the imaging angle of the pan-tilt camera of the smart device to move toward the direction of the head region.
To enable "face-to-face" interaction between the smart device and the target user, after the human body region is identified, the imaging angle of the pan-tilt camera, or the pan-tilt camera itself, can be raised to search upward for the target user's head. Specifically, the imaging angle of the pan-tilt camera, or the camera, is moved toward the direction of the head region; that is, the shooting angle or position is adjusted upward from the current shooting angle or position.
As an example, the camera may be moved or raised slowly upward at a preset fixed speed.
As another example, the camera may be moved at different speeds depending on the position of the body center point. For instance, when the body center point is the junction of the target user's neck and torso, the camera moves slowly upward at 10°/s; when the body center point is at the center of the target user's torso, it moves upward at 20°/s. This reduces the focus search time and avoids losing focus following.
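The two-speed scheme above can be sketched as follows; the function name, the image-coordinate convention, and the pixel tolerance `tol` are illustrative assumptions, while the 10°/s and 20°/s values are taken from the example in the text.

```python
def tilt_speed_deg_per_s(body_center_y: float, neck_y: float,
                         torso_center_y: float, tol: float = 5.0) -> float:
    """Choose an upward tilt speed for the pan-tilt camera based on where the
    body center point was detected (image y-coordinates, origin at the top).

    Per the example in the text: ~10 deg/s when the body center is at the
    neck/torso junction (the head is nearby), ~20 deg/s when it is at the
    torso center (the head is farther up). The tolerance `tol` is an
    illustrative assumption, not from the source."""
    if abs(body_center_y - neck_y) <= tol:
        return 10.0   # head is near: move slowly to avoid overshooting it
    if abs(body_center_y - torso_center_y) <= tol:
        return 20.0   # head is farther away: move faster to cut search time
    return 10.0       # conservative default for in-between positions
```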
Step 403: after the imaging angle has been moved, capture an environment image.
Step 404: determine whether the environment image includes the head region.
Head region recognition is performed on the currently collected environment image. If the environment image is recognized to include the head region, step 405 is performed; if not, step 406 is performed.
It should be noted that, for the process of head region recognition on the currently collected environment image, reference may be made to the related description in the foregoing embodiments, which is not repeated here.
Step 405: identify the face key points from the head region.
It should be noted that, for the process of identifying face key points from the head region, reference may be made to the related description in the foregoing embodiments, which is not repeated here.
Further, after the face key points are identified from the head region, the face center point is determined from the face key points, and focus following is performed on the face center point.
Step 406: detect the human-body key points of the target user from the environment image.
For the process of identifying human-body key points from the environment image, reference may be made to the related description in the foregoing embodiments, which is not repeated here.
If the environment image does not include the head region, or if face key points still cannot be detected in the head region, the human-body key points of the target user are detected from the environment image. Further, after the human-body key points are extracted, the body center point is determined from them, and focus following is performed on the body center point.
In the focus following method of the smart device of this embodiment of the present disclosure, the camera is moved on the basis of the detected human-body key points in order to detect face key points. If face key points are detected, the face center point is determined from them and focus following is performed on it; if face key points cannot be detected, the body center point is determined from the human-body key points and focus following is performed on it. On the premise that focus following is not lost, focus following is performed on the target user's face key points whenever possible, which improves the vividness and flexibility of interaction with the smart device.
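The fallback order described above — face center first, body center otherwise — can be summarized in a short sketch. The function and helper names are illustrative assumptions, and keypoints are assumed to be (x, y) tuples:

```python
def choose_focus_point(face_keypoints, body_keypoints):
    """Return (focus_point, source) following the method of FIG. 7: prefer
    the face center when face key points are available, and fall back to
    the body center determined from the human-body key points otherwise."""
    def centroid(points):
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        return (sum(xs) / len(xs), sum(ys) / len(ys))

    if face_keypoints:                  # face detected: follow the face center
        return centroid(face_keypoints), "face"
    if body_keypoints:                  # otherwise follow the body center
        return centroid(body_keypoints), "body"
    return None, "none"                 # nothing detected: focus is lost
```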
Based on the foregoing embodiment, if multiple users are present in the environment image collected by the smart device, the smart device needs to identify the target user who is willing to interact with it and perform focus following on that user. As a possible implementation, the target user may be selected according to the distance between each candidate target's body and the smart device. FIG. 8 is a schematic flowchart of a method for determining a target user according to an embodiment of the present disclosure. As shown in FIG. 8, the method includes the following steps.
Step 501: perform human body recognition on the environment image.
In this embodiment, the smart device can recognize human bodies in the environment image through face detection or human body detection.
Step 502: when multiple human bodies are recognized in the environment image, obtain the distance between each human body and the smart device.
Specifically, the smart device can recognize, from the collected environment image, every human body that has entered the monitoring range. In this embodiment, each recognized human body is treated as a candidate target. For the method of human body recognition, reference may be made to the description of the foregoing embodiments, which is not repeated here.
Further, the smart device obtains the distance between each human body in the environment image and the smart device. It can be understood that the closer a candidate target is to the smart device, the more likely there is an intention to interact between the candidate target and the smart device. Therefore, in this embodiment, the distance between a candidate target and the smart device is used as one basis for judging whether the candidate target intends to interact with the smart device.
In this embodiment, the distance between a candidate target and the smart device can be obtained through a depth camera, a binocular vision camera, or a lidar.
As one possible implementation, the smart device is equipped with a depth camera, through which a depth map of the candidate target is obtained. In a specific implementation, a structured-light projector projects controllable light spots, light stripes, or smooth-surface patterns onto the surface of the candidate target, and an image is obtained by the image sensor in the depth camera. Using the geometric relationship and the triangulation principle, the three-dimensional coordinates of the candidate target are calculated, from which the distance between the candidate target and the smart device can be obtained.
As another possible implementation, a binocular vision camera is arranged in the smart device and captures the candidate target. The disparity between the images captured by the binocular vision camera is then calculated, and the distance between the candidate target and the smart device is calculated from the disparity.
FIG. 9 is a schematic diagram of the principle of distance calculation by binocular vision according to an embodiment of the present disclosure. In FIG. 9, the positions O_l and O_r of the two cameras are drawn in the actual space, together with the optical axes of the left and right cameras and the focal planes of the two cameras; the focal planes are at a distance f from the plane in which the two cameras lie.
As shown in FIG. 9, p and p′ are the positions of the same candidate target P in the two captured images. The distance from point p to the left boundary of its captured image is x_l, and the distance from point p′ to the left boundary of its captured image is x_r. O_l and O_r are the two cameras, which lie in the same plane at a distance Z from each other.
Based on the triangulation principle, the distance b between the point P in FIG. 9 and the plane of the two cameras satisfies:
(Z − (x_l − x_r)) / Z = (b − f) / b
From this it can be derived that:
b = Z · f / d
where d = x_l − x_r is the disparity between the two images of the same candidate target captured by the binocular camera. Since Z and f are fixed, the distance b between the candidate target and the plane of the cameras — that is, the distance between the candidate target and the smart device — can be determined from the disparity d.
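Under the notation of FIG. 9, the disparity-to-distance computation reduces to a one-line formula. A minimal sketch, assuming the baseline Z and focal length f are known from calibration (the function name is illustrative):

```python
def depth_from_disparity(baseline_z: float, focal_f: float,
                         x_l: float, x_r: float) -> float:
    """Stereo triangulation as in FIG. 9: b = Z * f / d, with disparity
    d = x_l - x_r. The units of `baseline_z` determine the units of the
    result; `focal_f` and the image coordinates must share the same
    (pixel) units."""
    d = x_l - x_r
    if d <= 0:
        raise ValueError("non-positive disparity: point at infinity or matching error")
    return baseline_z * focal_f / d
```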
As a further possible implementation, a lidar is arranged in the smart device, and lasers are emitted into the monitoring range through the lidar; emitted laser light that encounters an obstacle within the monitoring range is reflected. The smart device receives the laser light returned by each obstacle within the monitoring range and generates a binary map of each obstacle from the returned laser light. Each binary map is then fused with the environment image, and the binary map corresponding to the candidate target is identified from all the binary maps. Specifically, the contour or size of each obstacle can be identified from its binary map and then matched against the contour or size of each target in the environment image, so that the binary map corresponding to the candidate target is obtained. After that, the laser return time of the binary map corresponding to the candidate target is multiplied by the speed of light and divided by 2 to obtain the distance between the candidate target and the smart device.
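The final time-of-flight step — multiplying the laser return time by the speed of light and dividing by 2 for the round trip — is a one-liner; a minimal sketch (the function name is illustrative):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def lidar_distance_m(return_time_s: float) -> float:
    """Round-trip time-of-flight distance as described in the text:
    the laser return time times the speed of light, divided by 2."""
    return return_time_s * SPEED_OF_LIGHT_M_PER_S / 2.0
```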
It should be noted that other methods for calculating the distance between a candidate target and the smart device are also within the scope of the embodiments of the present disclosure.
Step 503: select the human body closest to the smart device as the human body corresponding to the target user.
Specifically, since a candidate target that is far from the smart device may have no intention of interacting with it, the human body closest to the smart device is selected as the human body corresponding to the target user for focus following.
It should be noted that there may be more than one human body closest to the smart device; for example, multiple users may stand side by side in a row while visiting the smart device, while only the guide intends to interact with it. In this case, the smart device may query the registered-user face image library for the face images corresponding to the human bodies closest to the smart device in order to determine the target user; depending on the actual situation, the human body corresponding to the target user may be determined in different ways.
In a first example, if the face image library contains a face image corresponding to one human body closest to the smart device, that human body is taken as the human body corresponding to the target user.
In a second example, if the face image library contains no face image corresponding to any human body closest to the smart device, one of the human bodies closest to the smart device is randomly selected as the human body corresponding to the target user.
In a third example, if the face image library contains face images corresponding to multiple human bodies closest to the smart device, the human body found first in the query is taken as the human body corresponding to the target user.
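The selection rule of Step 503 together with the three examples above can be sketched as follows. The data shapes — a list of `(body_id, distance, face_id)` tuples and a set standing in for the registered-user face image library — are illustrative assumptions:

```python
def pick_target_body(bodies, registered_face_ids, eps=1e-6):
    """Select the target user per Step 503: take the nearest body; when
    several bodies are equally near, prefer the first one whose face is
    found in the registered-user face image library, otherwise fall back
    to an arbitrary nearest body."""
    if not bodies:
        return None
    d_min = min(b[1] for b in bodies)
    nearest = [b for b in bodies if b[1] - d_min <= eps]  # ties: same distance
    if len(nearest) == 1:
        return nearest[0][0]
    # Several nearest bodies: take the first whose face is registered.
    for body_id, _dist, face_id in nearest:
        if face_id is not None and face_id in registered_face_ids:
            return body_id
    # No registered face among them: fall back to a (here: the first) nearest body.
    return nearest[0][0]
```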
In the focus following method of the smart device of this embodiment of the present disclosure, candidate targets that intend to interact with the smart device are screened out from all candidate targets according to the distance between each candidate target and the smart device. Compared with directly treating a person as the interaction target as soon as a face is detected, this reduces false activations of the smart device.
To implement the foregoing embodiments, an embodiment of the present disclosure further provides a focus following device of a smart device. FIG. 10 is a schematic structural diagram of a focus following device of a smart device according to an embodiment of the present disclosure.
As shown in FIG. 10, the focus following device of the smart device includes: a detecting module 110, a determining module 120, and a control module 130.
The detecting module 110 is configured to detect face key points of a target user from an environment image collected by the smart device, and, when no face key points are detected in the environment image, to detect human-body key points of the target user from the environment image.
The determining module 120 is configured to determine a face center point from the face key points, and, when human-body key points are detected, to determine a body center point from the human-body key points.
The control module 130 is configured to control the smart device to perform focus following on the face center point, and, when the body center point is determined, to control the smart device to perform focus following on the body center point.
In a possible implementation of this embodiment, the control module 130 is specifically configured to: periodically determine whether the detected face center point or body center point lies within the image region; when the face center point or body center point does not lie within the image region, obtain the shortest path between the face center point or body center point and the center point of the image region; obtain, according to the shortest path, control information for controlling movement of the smart device; and control the smart device to move according to the control information, so that the detected face center point or body center point falls within the image region.
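A minimal sketch of the control module's region check, assuming the circular image region is given by its center and a radius in pixels; the returned displacement stands in for the control information, and the function name is illustrative:

```python
import math

def control_offset(focus_point, image_center, radius_px):
    """If the face or body center point already lies inside the circular
    image region, no movement is needed; otherwise return the displacement
    along the shortest (straight-line) path from the point to the region
    center, as described for the control module."""
    dx = image_center[0] - focus_point[0]
    dy = image_center[1] - focus_point[1]
    if math.hypot(dx, dy) <= radius_px:   # already inside the circle
        return (0.0, 0.0)
    return (dx, dy)                       # move toward the region center
```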
In a possible implementation of this embodiment, the detecting module 110 is specifically configured to: identify the head region of the target user from the environment image according to preset head features; and extract the face key points from the head region.
The determining module 120 is specifically configured to: if one face key point is extracted, take that face key point as the face center point; and if two or more face key points are extracted, obtain a first center point of all the extracted face key points and take the first center point as the face center point.
In a possible implementation of this embodiment, the determining module 120 is specifically configured to: take each face key point as a node and, starting from one of the nodes, connect all the nodes one by one to form a key-point figure covering all the nodes; and obtain the center point of the key-point figure and determine it as the first center point.
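The determining module's rule can be sketched as follows. The source does not define the center of the key-point figure precisely; this sketch uses the centroid of the key points as one plausible reading (the function name is illustrative):

```python
def face_center(keypoints):
    """Determine the face center point as described above: a single key
    point is used directly; with two or more, the center of the figure
    formed by connecting all key points is taken — approximated here by
    the centroid of the vertices."""
    if len(keypoints) == 1:
        return keypoints[0]
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```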
In a possible implementation of this embodiment, the detecting module 110 is specifically configured to: identify, from the collected image, the human body region located below the head region; after the human body region is identified, control the imaging angle of the pan-tilt camera of the smart device to move toward the direction of the head region; after the imaging angle has been moved, capture an environment image; determine whether the environment image includes the head region; if the environment image includes the head region, identify the face key points from the head region; and if the environment image does not include the head region, detect the human-body key points of the target user from the environment image.
Based on the foregoing embodiment, if multiple users are present in the environment image collected by the smart device, the smart device needs to identify the target user who is willing to interact with it, perform focus following, and generate an image region for focus following. FIG. 11 is a schematic structural diagram of a focus following device of another smart device according to an embodiment of the present disclosure. As shown in FIG. 11, in addition to the focus following device of the foregoing embodiment, the device further includes: a human body recognition module 210, a distance detecting module 220, a selecting module 230, and a generating module 240.
The human body recognition module 210 is configured to perform human body recognition on the environment image before the face key points of the target user are detected from the environment image collected by the smart device.
The distance detecting module 220 is configured to obtain the distance between each human body and the smart device when multiple human bodies are recognized in the environment image.
The selecting module 230 is configured to select the human body closest to the smart device as the human body corresponding to the target user.
The generating module 240 is configured to, before the face key points of the target user are identified in the environment image collected by the smart device, identify the center point of the environment image collected by the smart device and generate, with the center point of the environment image as a reference point, a circular image region for focus following.
In the focus following device of the smart device of this embodiment of the present disclosure, face key points of the target user are first detected from the environment image collected by the smart device, the face center point is determined from the face key points, and the smart device is controlled to perform focus following on the face center point. If no face key points are detected in the environment image, human-body key points of the target user are detected from the environment image, the body center point is determined from them, and the smart device is controlled to perform focus following on the body center point. The device thereby solves the technical problem that focus following cannot be maintained when face key points cannot be detected: the human-body key points serve as a complementary focus, and when the smart device detects no face key points it detects human-body key points from the collected image as the focus to follow. This avoids losing focus when the user lowers his or her head or turns around, and improves the success rate and accuracy of focus following.
To achieve the foregoing objects, an embodiment of the present disclosure further provides a smart device, including: a housing, a processor, a memory, a circuit board, and a power supply circuit. The circuit board is arranged inside the space enclosed by the housing; the processor and the memory are arranged on the circuit board; the power supply circuit is configured to supply power to the circuits or devices of the smart device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to implement the focus following method of the smart device described in the foregoing embodiments.
To achieve the foregoing objects, an embodiment of the present disclosure further provides a non-transitory computer-readable storage medium having a computer program stored thereon, where the program, when executed by a processor, implements the focus following method of the smart device described in the foregoing embodiments.
FIG. 12 is a block diagram of an exemplary smart device suitable for implementing the embodiments of the present application. As shown in FIG. 12, the smart device includes: a housing 310, a processor 320, a memory 330, a circuit board 340, and a power supply circuit 350. The circuit board 340 is arranged inside the space enclosed by the housing 310; the processor 320 and the memory 330 are arranged on the circuit board 340; the power supply circuit 350 is configured to supply power to the circuits or devices of the smart device; the memory 330 is configured to store executable program code; and the processor 320 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 330, so as to execute the focus following method of the smart device described in the foregoing embodiments.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided there is no contradiction, those skilled in the art may combine different embodiments or examples described in this specification, and features of different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "multiple" means at least two, such as two or three, unless otherwise specifically defined.
Any process or method description in the flowcharts or otherwise described herein may be understood as representing a module, segment, or portion of code including one or more executable instructions for implementing steps of a custom logic function or process, and the scope of the preferred embodiments of the present disclosure includes additional implementations in which functions may be executed out of the order shown or discussed — including in a substantially concurrent manner or in the reverse order, depending on the functions involved — as should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in combination with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection portion (an electronic device) having one or more wires, a portable computer disk cartridge (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically — for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary — and then stored in a computer memory.
应当理解,本公开的各部分可以用硬件、软件、固件或它们的组合来实现。在上述实施方式中,多个步骤或方法可以用存储在存储器中且由合适的指令执行系统执行的软件或固件来实现。如,如果用硬件来实现和在另一实施方式中一样,可用本领域公知的下列技术中的任一项或他们的组合来实现:具有用于对数据信号实现逻辑功能的逻辑门电路的离散逻辑电路,具有合适的组合逻辑门电路的专用集成电路,可编程门阵列(PGA),现场可编程门阵列(FPGA)等。It should be understood that portions of the present disclosure can be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware and in another embodiment, it can be implemented by any one or combination of the following techniques well known in the art: discrete with logic gates for implementing logic functions on data signals Logic circuits, application specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), and the like.
本技术领域的普通技术人员可以理解实现上述实施例方法携带的全部或部分步骤是可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,该程序在执行时,包括方法实施例的步骤之一或其组合。One of ordinary skill in the art can understand that all or part of the steps carried by the method of implementing the above embodiments can be completed by a program to instruct related hardware, and the program can be stored in a computer readable storage medium. When executed, one or a combination of the steps of the method embodiments is included.
In addition, the functional units in the various embodiments of the present disclosure may be integrated into one processing module, each unit may exist physically separately, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present disclosure have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure; a person of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.

Claims (18)

  1. A focus following method for a smart device, comprising the following steps:
    detecting face key points of a target user from an environment image collected by the smart device, determining a face center point according to the face key points, and controlling the smart device to perform focus following on the face center point;
    if no face key point is detected from the environment image, detecting body key points of the target user from the environment image, determining a body center point according to the body key points, and controlling the smart device to perform focus following on the body center point.
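The face-first, body-fallback control flow of claim 1 can be sketched as follows. This is an illustrative Python sketch only; the detector and motion callables are hypothetical stand-ins, and the patent does not prescribe any particular implementation.

```python
def compute_center(keypoints):
    """Center of a set of (x, y) key points: the point itself when there
    is one, or the centroid when there are two or more."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def follow_focus(detect_face_keypoints, detect_body_keypoints, move_focus, image):
    """Claim 1 control flow: prefer the face center point; fall back to
    the body center point when no face key points are detected."""
    face_kps = detect_face_keypoints(image)
    if face_kps:
        move_focus(compute_center(face_kps))  # follow the face center point
        return "face"
    body_kps = detect_body_keypoints(image)
    if body_kps:
        move_focus(compute_center(body_kps))  # follow the body center point
        return "body"
    return None  # nothing detected: no focus target this frame
```

The fallback ordering matters: a face center gives a tighter focus target, so the body center is consulted only when the face is lost (e.g., the user turns away).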
  2. The method according to claim 1, wherein before detecting the face key points of the target user from the environment image collected by the smart device, the method further comprises:
    identifying a center point of the environment image collected by the smart device, and generating, with the center point of the environment image as a reference point, a circular image region used for focus following.
  3. The method according to claim 2, wherein performing focus following comprises:
    periodically determining whether the detected face center point or body center point is within the image region;
    when the face center point or body center point is not within the image region, obtaining a shortest path between the face center point or body center point and the center point of the image region;
    obtaining, according to the shortest path, control information for controlling movement of the smart device;
    controlling the smart device to move according to the control information, so that the detected face center point or body center point falls within the image region.
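The region check and shortest-path computation described in claims 2 and 3 reduce to simple plane geometry: the shortest path between two points is the straight segment joining them. A minimal sketch, with hypothetical names not taken from the patent:

```python
import math

def focus_correction(center, region_center, region_radius):
    """Claims 2-3 sketch: if the tracked center point lies outside the
    circular image region, return the straight-line (shortest-path)
    displacement toward the region's center point; otherwise return None,
    meaning no movement control information is needed."""
    dx = region_center[0] - center[0]
    dy = region_center[1] - center[1]
    if math.hypot(dx, dy) <= region_radius:
        return None       # center point already inside the circular region
    return (dx, dy)       # displacement vector to derive control information
```

In practice the returned displacement would be converted into device-specific control information (e.g., pan/tilt increments); that mapping is hardware-dependent and left open here, as in the claim.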
  4. The method according to any one of claims 1-3, wherein detecting the face key points of the target user from the environment image collected by the smart device and determining the face center point according to the face key points comprises:
    identifying a head region of the target user from the environment image according to preset head features;
    extracting the face key points from the head region;
    if one face key point is extracted, taking the face key point as the face center point;
    if two or more face key points are extracted, obtaining a first center point of all the extracted face key points, and taking the first center point as the face center point.
  5. The method according to claim 4, wherein obtaining the first center point of all the extracted face key points comprises:
    taking each face key point as a node, and, starting from one of the nodes, connecting all the nodes one by one to form a key point graph covering all the nodes;
    obtaining a center point of the key point graph, and determining the center point of the key point graph as the first center point.
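Claim 5 connects the key points into a graph and takes that graph's center as the first center point, but it does not fix the exact geometric definition of "center". One plausible reading, sketched below, is the center of the graph's bounding box; the centroid of the vertices would be another valid interpretation.

```python
def keypoint_graph_center(keypoints):
    """Claim 5 sketch (one interpretation): treat the key points as nodes
    of a connected graph and approximate the graph's center as the center
    of its bounding box. The patent leaves the precise definition open."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)
```

For roughly symmetric facial key points the bounding-box center and the vertex centroid nearly coincide, so either choice yields a stable face center point for tracking.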
  6. The method according to any one of claims 1-5, wherein detecting the body key points of the target user from the environment image comprises:
    performing identification on a collected body region located below the head region;
    after the body region is identified, controlling an imaging angle of a pan-tilt camera of the smart device to move toward the direction of the head region;
    after the imaging angle has moved, capturing an environment image;
    determining whether the environment image includes the head region;
    if the environment image includes the head region, identifying the face key points from the head region;
    if the environment image does not include the head region, detecting the body key points of the target user from the environment image.
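The reacquisition loop of claim 6 — tilt the pan-tilt camera toward where the head should be, recapture, then branch on whether the head is back in view — can be sketched as below. All callables and the camera interface are illustrative assumptions, not APIs from the patent.

```python
def reacquire_face(camera, detect_body_region, detect_head_region, detect_face_keypoints):
    """Claim 6 sketch: when a body region (below the head region) is
    identified, move the pan-tilt camera's imaging angle toward the head,
    capture a fresh environment image, and branch on head visibility."""
    image = camera.capture()
    if detect_body_region(image) is not None:
        camera.tilt_toward_head()   # move imaging angle toward the head region
        image = camera.capture()    # capture a new environment image
    head = detect_head_region(image)
    if head is not None:
        return ("face", detect_face_keypoints(head))  # head back in view
    return ("body", None)           # head still absent: fall back to body key points
```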
  7. The method according to any one of claims 1-6, wherein before detecting the face key points of the target user from the environment image collected by the smart device, the method further comprises:
    performing human body recognition on the environment image;
    when a plurality of human bodies are identified from the environment image, obtaining a distance between each human body and the smart device;
    selecting the human body closest to the smart device as the human body corresponding to the target user.
  8. The method according to claim 7, wherein selecting the human body closest to the smart device as the human body corresponding to the target user comprises:
    when there are a plurality of human bodies closest to the smart device, querying whether a registered-user face image library of the smart device contains face images corresponding to the human bodies closest to the smart device;
    if the face image library contains a face image corresponding to one of the human bodies closest to the smart device, taking that human body as the human body corresponding to the target user;
    if the face image library contains no face image corresponding to any of the human bodies closest to the smart device, randomly selecting one of the human bodies closest to the smart device as the human body corresponding to the target user;
    if the face image library contains face images corresponding to a plurality of the human bodies closest to the smart device, taking the first-queried human body closest to the smart device as the human body corresponding to the target user.
  9. A focus following apparatus for a smart device, comprising:
    a detection module, configured to detect face key points of a target user from an environment image collected by the smart device, and, when no face key point is detected from the environment image, detect body key points of the target user from the environment image;
    a determination module, configured to determine a face center point according to the face key points, and, when the body key points are detected, determine a body center point according to the body key points;
    a control module, configured to control the smart device to perform focus following on the face center point, and, when the body center point is determined, control the smart device to perform focus following on the body center point.
  10. The apparatus according to claim 9, further comprising:
    a generation module, configured to, before the face key points of the target user are identified in the environment image collected by the smart device, identify a center point of the environment image collected by the smart device, and generate, with the center point of the environment image as a reference point, a circular image region used for focus following.
  11. The apparatus according to claim 10, wherein the control module is specifically configured to:
    periodically determine whether the detected face center point or body center point is within the image region;
    when the face center point or body center point is not within the image region, obtain a shortest path between the face center point or body center point and the center point of the image region;
    obtain, according to the shortest path, control information for controlling movement of the smart device;
    control the smart device to move according to the control information, so that the detected face center point or body center point falls within the image region.
  12. The apparatus according to any one of claims 9-11, wherein:
    the detection module is specifically configured to: identify a head region of the target user from the environment image according to preset head features, and extract the face key points from the head region;
    the determination module is specifically configured to: if one face key point is extracted, take the face key point as the face center point; and if two or more face key points are extracted, obtain a first center point of all the extracted face key points and take the first center point as the face center point.
  13. The apparatus according to claim 12, wherein the determination module is specifically configured to:
    take each face key point as a node, and, starting from one of the nodes, connect all the nodes one by one to form a key point graph covering all the nodes;
    obtain a center point of the key point graph, and determine the center point of the key point graph as the first center point.
  14. The apparatus according to any one of claims 9-13, wherein the detection module is specifically configured to:
    perform identification on a collected body region located below the head region;
    after the body region is identified, control an imaging angle of a pan-tilt camera of the smart device to move toward the direction of the head region;
    after the imaging angle has moved, capture an environment image;
    determine whether the environment image includes the head region;
    if the environment image includes the head region, identify the face key points from the head region;
    if the environment image does not include the head region, detect the body key points of the target user from the environment image.
  15. The apparatus according to any one of claims 9-14, further comprising:
    a human body recognition module, configured to perform human body recognition on the environment image before the face key points of the target user are detected from the environment image collected by the smart device;
    a distance detection module, configured to, when a plurality of human bodies are identified from the environment image, obtain a distance between each human body and the smart device;
    a selection module, configured to select the human body closest to the smart device as the human body corresponding to the target user.
  16. The apparatus according to claim 15, wherein the selection module is specifically configured to:
    when there are a plurality of human bodies closest to the smart device, query whether a registered-user face image library of the smart device contains face images corresponding to the human bodies closest to the smart device;
    if the face image library contains a face image corresponding to one of the human bodies closest to the smart device, take that human body as the human body corresponding to the target user;
    if the face image library contains no face image corresponding to any of the human bodies closest to the smart device, randomly select one of the human bodies closest to the smart device as the human body corresponding to the target user;
    if the face image library contains face images corresponding to a plurality of the human bodies closest to the smart device, take the first-queried human body closest to the smart device as the human body corresponding to the target user.
  17. A smart device, comprising a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside a space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to the circuits or components of the smart device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to implement the focus following method for a smart device according to any one of claims 1-8.
  18. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein, when the program is executed by a processor, the focus following method for a smart device according to any one of claims 1-8 is implemented.
PCT/CN2019/078747 2018-03-21 2019-03-19 Focus tracking method and device of smart apparatus, smart apparatus, and storage medium WO2019179441A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810236920.1A CN108733280A (en) 2018-03-21 2018-03-21 Focus follower method, device, smart machine and the storage medium of smart machine
CN201810236920.1 2018-03-21

Publications (1)

Publication Number Publication Date
WO2019179441A1 true WO2019179441A1 (en) 2019-09-26

Family

ID=63941065

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/078747 WO2019179441A1 (en) 2018-03-21 2019-03-19 Focus tracking method and device of smart apparatus, smart apparatus, and storage medium

Country Status (3)

Country Link
CN (1) CN108733280A (en)
TW (1) TWI705382B (en)
WO (1) WO2019179441A1 (en)


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108733280A (en) * 2018-03-21 2018-11-02 北京猎户星空科技有限公司 Focus follower method, device, smart machine and the storage medium of smart machine
CN109373904A (en) * 2018-12-17 2019-02-22 石家庄爱赛科技有限公司 3D vision detection device and 3D vision detection method
CN109740464B (en) * 2018-12-21 2021-01-26 北京智行者科技有限公司 Target identification following method
CN109781008B (en) * 2018-12-30 2021-05-25 北京猎户星空科技有限公司 Distance measuring method, device, equipment and medium
CN110197117B (en) * 2019-04-18 2021-07-06 北京奇艺世纪科技有限公司 Human body contour point extraction method and device, terminal equipment and computer readable storage medium
CN110084207A (en) * 2019-04-30 2019-08-02 惠州市德赛西威智能交通技术研究院有限公司 Automatically adjust exposure method, device and the storage medium of face light exposure
CN111639515A (en) * 2020-01-16 2020-09-08 上海黑眸智能科技有限责任公司 Target loss retracing method, device, system, electronic terminal and storage medium
CN113518474A (en) * 2020-03-27 2021-10-19 阿里巴巴集团控股有限公司 Detection method, device, equipment, storage medium and system
CN111860403A (en) * 2020-07-28 2020-10-30 商汤国际私人有限公司 Scene information detection method and device and electronic equipment
CN112672062B (en) * 2020-08-21 2022-08-09 海信视像科技股份有限公司 Display device and portrait positioning method
CN112702652A (en) * 2020-12-25 2021-04-23 珠海格力电器股份有限公司 Smart home control method and device, storage medium and electronic device
CN113572957B (en) * 2021-06-26 2022-08-05 荣耀终端有限公司 Shooting focusing method and related equipment
CN113183157A (en) * 2021-07-01 2021-07-30 德鲁动力科技(成都)有限公司 Method for controlling robot and flexible screen interactive quadruped robot

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101167086A (en) * 2005-05-31 2008-04-23 实物视频影像公司 Human detection and tracking for security applications
CN103077403A (en) * 2012-12-30 2013-05-01 信帧电子技术(北京)有限公司 Pedestrian counting method and device
CN103890498A (en) * 2011-10-18 2014-06-25 三菱电机株式会社 Air conditioner indoor unit
WO2016202764A1 (en) * 2015-06-15 2016-12-22 Thomson Licensing Apparatus and method for video zooming by selecting and tracking an image area
CN107038418A (en) * 2017-03-24 2017-08-11 厦门瑞为信息技术有限公司 A kind of intelligent air condition dual camera follows the trail of the method for obtaining clear human body image
WO2018020275A1 (en) * 2016-07-29 2018-02-01 Unifai Holdings Limited Computer vision systems
CN108733280A (en) * 2018-03-21 2018-11-02 北京猎户星空科技有限公司 Focus follower method, device, smart machine and the storage medium of smart machine

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201001043A (en) * 2008-06-25 2010-01-01 Altek Corp Method of auto focusing on faces used by digital imaging device
JP5978639B2 (en) * 2012-02-06 2016-08-24 ソニー株式会社 Image processing apparatus, image processing method, program, and recording medium
CN104732210A (en) * 2015-03-17 2015-06-24 深圳超多维光电子有限公司 Target human face tracking method and electronic equipment
CN104935844A (en) * 2015-06-17 2015-09-23 四川长虹电器股份有限公司 Method for automatically adjusting screen orientation according to face orientation of looker and television
CN106407882A (en) * 2016-07-26 2017-02-15 河源市勇艺达科技股份有限公司 Method and apparatus for realizing head rotation of robot by face detection


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241961A (en) * 2020-01-03 2020-06-05 精硕科技(北京)股份有限公司 Face detection method and device and electronic equipment
CN111241961B (en) * 2020-01-03 2023-12-08 北京秒针人工智能科技有限公司 Face detection method and device and electronic equipment
CN111968163A (en) * 2020-08-14 2020-11-20 济南博观智能科技有限公司 Thermopile array temperature measurement method and device
CN111968163B (en) * 2020-08-14 2023-10-10 济南博观智能科技有限公司 Thermopile array temperature measurement method and device
CN112866773A (en) * 2020-08-21 2021-05-28 海信视像科技股份有限公司 Display device and camera tracking method in multi-person scene
CN112866773B (en) * 2020-08-21 2023-09-26 海信视像科技股份有限公司 Display equipment and camera tracking method in multi-person scene

Also Published As

Publication number Publication date
TW201941098A (en) 2019-10-16
CN108733280A (en) 2018-11-02
TWI705382B (en) 2020-09-21

Similar Documents

Publication Publication Date Title
WO2019179441A1 (en) Focus tracking method and device of smart apparatus, smart apparatus, and storage medium
WO2019179442A1 (en) Interaction target determination method and apparatus for intelligent device
CN111989537B (en) System and method for detecting human gaze and gestures in an unconstrained environment
CN109034013B (en) Face image recognition method, device and storage medium
JP5950973B2 (en) Method, apparatus and system for selecting a frame
WO2019179443A1 (en) Continuous wake-up method and apparatus for intelligent device, intelligent device, and storage medium
JP5001930B2 (en) Motion recognition apparatus and method
JP7113013B2 (en) Subject head tracking
JP2012022411A (en) Information processing apparatus and control method thereof, and program
JP2006343859A (en) Image processing system and image processing method
EP2198391A1 (en) Long distance multimodal biometric system and method
JP4992823B2 (en) Face detection apparatus and face detection method
JP2015184054A (en) Identification device, method, and program
CN111212226A (en) Focusing shooting method and device
WO2022021093A1 (en) Photographing method, photographing apparatus, and storage medium
CN112655021A (en) Image processing method, image processing device, electronic equipment and storage medium
JP2008203995A (en) Object shape generation method, object shape generation device and program
CN113093907B (en) Man-machine interaction method, system, equipment and storage medium
Debnath et al. Detection and controlling of drivers' visual focus of attention
Yu et al. Perspective-aware convolution for monocular 3d object detection
Kim et al. Simulation of face pose tracking system using adaptive vision switching
JP6468755B2 (en) Feature point detection system, feature point detection method, and feature point detection program
EP3246793A1 (en) Virtual reality display
CN112711324B (en) Gesture interaction method and system based on TOF camera
US11435745B2 (en) Robot and map update method using the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19770929

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18.01.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19770929

Country of ref document: EP

Kind code of ref document: A1