CN109492521B - Face positioning method and robot - Google Patents
- Publication number
- CN109492521B (application CN201811070041.2A)
- Authority
- CN
- China
- Prior art keywords
- coordinates
- image
- face
- determining
- dimensional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a face positioning method and a robot. The method acquires an image of a designated area and all three-dimensional coordinates within the designated area at the same moment, determines the plane coordinates of those three-dimensional coordinates in the image, establishes a correspondence between each three-dimensional coordinate and each plane coordinate, obtains the region position of each face in the image, and determines the position information of each face in the designated area from the correspondence and the region position. With this technical scheme, the robot can accurately determine the position of a human face relative to itself (relative distance, relative angle, and so on), and can therefore accurately locate human bodies in every direction around it, improving the accuracy of the robot's human-body positioning.
Description
Technical Field
The invention relates to the field of positioning, in particular to a face positioning method. The invention also relates to a robot.
Background
A robot is a machine device that performs work automatically. It can accept human commands, run pre-programmed routines, or act according to principles formulated with artificial-intelligence techniques. Its task is to assist or replace humans in work such as production, construction, or dangerous operations.
As the application scenarios of robots multiply, there are occasions, such as customs, airports, banks, and video teleconferences, where a specific face must be tracked. The current mainstream scheme for tracking the face closest to the robot is to capture a color image, identify the position of the face in that image, compute the difference between that position and the x coordinate of the image center, and then send a command to the robot's motion system so that the robot turns toward the target person.
However, in implementing the invention, the inventors found that the position of a face on a plane image differs from its position in real three-dimensional space: two different locations in three-dimensional space may project to the same location on the plane image. Relying solely on recognizing the face position in a color image, the existing method therefore cannot accurately determine the position of the face relative to the robot (in particular the relative distance and relative angle), so the robot cannot accurately locate human bodies in the various directions around it.
Disclosure of Invention
The invention provides a face positioning method, which solves the problem of how a robot can accurately locate human bodies in every direction around it. The method comprises the following steps:
acquiring an image of a designated area and all three-dimensional coordinates in the designated area at the same time;
determining the plane coordinates of the three-dimensional coordinates in the image, and establishing the corresponding relation between each three-dimensional coordinate and each plane coordinate;
acquiring the area position of each face contained in the image;
and determining the position information of the human face in the designated area according to the corresponding relation and the area position.
Preferably, determining the plane coordinates of the three-dimensional coordinates in the image specifically includes:
carrying out planar depth processing on the three-dimensional coordinates according to the acquisition angle of the image to generate a planar graph containing the three-dimensional coordinates;
carrying out same-scale processing on the plane graph according to the size of the image;
and mapping each three-dimensional coordinate in the planar graph after the same-proportion processing to the image, and determining the planar coordinate according to a mapping result.
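The mapping steps above can be sketched in code. The sketch below assumes a pinhole camera model; the function name `project_points` and the intrinsic parameters (fx, fy, cx, cy) are illustrative assumptions, not values given by the invention.

```python
import numpy as np

def project_points(points_xyz, fx=525.0, fy=525.0, cx=320.0, cy=240.0):
    """Map 3-D points (metres, camera frame: x right, y down, z forward)
    to plane (pixel) coordinates, and record the correspondence.

    fx, fy, cx, cy are illustrative pinhole intrinsics, not values
    taken from the invention."""
    pts = np.asarray(points_xyz, dtype=float)
    z = pts[:, 2]
    u = fx * pts[:, 0] / z + cx          # horizontal pixel coordinate
    v = fy * pts[:, 1] / z + cy          # vertical pixel coordinate
    plane = np.stack([u, v], axis=1)
    # Correspondence: rounded pixel -> original 3-D coordinate
    mapping = {(int(round(ui)), int(round(vi))): tuple(p)
               for (ui, vi), p in zip(plane, pts)}
    return plane, mapping

# A point on the optical axis lands at the principal point (cx, cy)
plane, mapping = project_points([[0.0, 0.0, 2.0], [0.5, 0.0, 2.0]])
```

The returned `mapping` dictionary plays the role of the correspondence between three-dimensional and plane coordinates; a real implementation would also handle occlusion and points behind the camera.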
Preferably, the determining, according to the correspondence and the region position, the position information of the face in the specified region includes:
generating a plurality of characteristic point coordinates according to the vertex coordinates of the region positions, wherein the characteristic point coordinates are evenly distributed in the region positions;
acquiring three-dimensional coordinates corresponding to the characteristic points according to the corresponding relation, and acquiring the distance between the three-dimensional coordinates corresponding to the characteristic points and the image acquisition equipment;
screening out the reference position points of the human face from the feature points according to the distance;
and determining the three-dimensional coordinates corresponding to the reference position points according to the corresponding relation, and taking the three-dimensional coordinates corresponding to the reference position points as the position information of the human face.
Preferably, the step of screening the reference position points of the face from the feature points according to the distance specifically includes:
determining the average of the distances of all feature point coordinates, and removing the feature point coordinates whose distances exceed the average by more than a specified threshold;
if the remaining feature point coordinates are less than the designated number, selecting the central point of the area position as the reference position point;
and if the residual feature point coordinates are not less than the specified number, selecting the feature point coordinates closest to the average value from the residual feature point coordinates as the reference position points.
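The screening steps above can be sketched as follows. The 50% threshold and the minimum count are borrowed from the specific embodiment described later; the function and parameter names are illustrative assumptions.

```python
def select_reference_point(points, distances, center, threshold=0.5, min_count=5):
    """Screen the reference position point of a face from its feature points.

    points: plane coordinates of the feature points; distances: distance of
    each corresponding 3-D point from the acquisition device; center: the
    centre of the face region, used as a fallback. threshold (50%) and
    min_count are illustrative values taken from the embodiment."""
    mean = sum(distances) / len(distances)
    # drop points whose distance exceeds the average by more than threshold
    kept = [(p, d) for p, d in zip(points, distances)
            if d <= mean * (1.0 + threshold)]
    if len(kept) < min_count:
        return center                    # too few inliers: use region centre
    # otherwise take the survivor whose distance is closest to the average
    return min(kept, key=lambda pd: abs(pd[1] - mean))[0]
```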
Preferably, after determining the position information of the face in the designated area according to the correspondence and the area position, the method further includes:
acquiring the face to be tracked, which needs to be tracked by the acquisition equipment, according to the position information;
determining the three-dimensional coordinates of the face to be tracked according to the corresponding relation;
determining a relative angle according to the three-dimensional coordinates of the face to be tracked and the coordinates of the acquisition equipment;
and instructing the acquisition equipment to move according to the relative angle.
Preferably, the three-dimensional coordinates are point cloud data;
the point cloud data is generated by scanning the designated area with a laser radar;
or the point cloud data is generated from the grayscale information in a black-and-white (depth) image corresponding to the image.
Correspondingly, the invention also proposes a robot comprising:
the acquisition module, which acquires an image of the designated area and all three-dimensional coordinates in the designated area at the same time;
the corresponding module, which is used for determining the plane coordinates of the three-dimensional coordinates in the image and establishing the correspondence between each three-dimensional coordinate and each plane coordinate;
the obtaining module, which is used for obtaining the region position of each face contained in the image;
and the determining module, which is used for determining the position information of the face in the designated area according to the correspondence and the region position.
Correspondingly, the present invention also provides a computer-readable storage medium, in which instructions are stored, and when the instructions are run on a terminal device, the terminal device is caused to execute the above-mentioned face positioning method.
Correspondingly, the present invention further provides a computer program product, which is characterized in that when the computer program product runs on a terminal device, the terminal device is caused to execute the above-mentioned face positioning method.
By applying this technical scheme, an image of a designated area and all three-dimensional coordinates within the designated area are acquired at the same moment, the plane coordinates of the three-dimensional coordinates in the image are determined, a correspondence between each three-dimensional coordinate and each plane coordinate is established, the region position of each face contained in the image is obtained, and the position information of each face in the designated area is determined from the correspondence and the region position. The robot can thus accurately determine the position of a human face relative to itself (relative distance, relative angle, and so on), accurately locate human bodies in every direction around it, and improve the accuracy of its human-body positioning.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a face positioning method proposed in the present application;
fig. 2 is a schematic structural diagram of a robot according to the present application.
Detailed Description
As described in the Background, the mainstream prior-art scheme for tracking the face closest to a robot is to capture a color image, identify the position of the face in it, compute the difference between that position and the x coordinate of the image center, and send a command to the robot motion system to turn the robot toward the target person. However, because the position of a face on a plane image differs from its position in real three-dimensional space, two different positions in three-dimensional space may appear at the same position on the plane image. Relying solely on recognizing the face position in a color image, the existing method therefore cannot accurately determine the relative position between the face and the robot.
In view of the foregoing problems, embodiments of the present invention provide a face positioning method, so that a robot can accurately locate human bodies in every direction around it. The invention not only performs well at tracking the nearest face, but can also be applied to the accurate following of targets (people, objects, and the like). The technical solution of the present invention will be described clearly and completely below with reference to the accompanying drawings.
In the embodiments of the present invention, a robot means a machine device that performs work automatically. It can accept human commands, run pre-programmed routines, or act according to principles formulated with artificial-intelligence techniques; its task is to assist or replace humans in work such as production, construction, or dangerous operations. Changes in functional structure, application environment, and the like do not affect the protection scope of the invention.
As shown in fig. 1, the face positioning method specifically includes the following steps:
s101, acquiring an image of a designated area and all three-dimensional coordinates in the designated area at the same time.
This step is intended to acquire the image and the three-dimensional coordinates over the same time period. There are many ways to acquire them, for example photography, video capture, radar scanning, or acoustic scanning; different acquisition means do not affect the protection scope of the invention, as long as an image and three-dimensional coordinates of the same area at the same time are obtained. Likewise, the nature of the acquired image (color, black-and-white, and so on) and the representation chosen for the three-dimensional coordinates do not affect the protection scope of the invention.
Preferably, the three-dimensional coordinates are point cloud data; the point cloud data is generated by scanning the designated area with a laser radar, or from the grayscale information in a black-and-white (depth) image corresponding to the image.
In a specific application scene, the robot captures a color image of one frame through a camera as the resource for analyzing the face positions. At the same time, the robot obtains, through a laser radar, point cloud information data captured at the same moment as the camera's color image.
In a specific application scenario, this step may use a depth image instead of the color image, obtaining a depth-image point cloud at the same moment. A depth image is an image containing only black, white, and gray, in which the gray level of each point represents the distance of that point from the camera. A depth-image point cloud is the depth image converted, through mathematical processing, into information data similar to laser-radar point cloud data; it can play a role similar to laser-radar point cloud information.
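The conversion from a depth image to point-cloud-like data can be sketched as below, again assuming a pinhole camera model; `depth_to_cloud` and the intrinsics are illustrative assumptions, not the invention's actual processing.

```python
import numpy as np

def depth_to_cloud(depth, fx=525.0, fy=525.0):
    """Turn a depth (black-white-gray) image into a point cloud.

    depth: (H, W) array in which each value is the distance of that pixel
    from the camera, as described above. The principal point is assumed to
    be the image centre, and fx, fy are illustrative focal lengths.
    Returns an (H*W, 3) array of (x, y, z) points."""
    h, w = depth.shape
    cx, cy = w / 2.0, h / 2.0
    v, u = np.indices((h, w))            # pixel row (v) and column (u)
    z = depth.astype(float)
    x = (u - cx) * z / fx                # back-project each pixel
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```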
S102, determining the plane coordinates of the three-dimensional coordinates in the image, and establishing the corresponding relation between each three-dimensional coordinate and each plane coordinate.
This step is intended to determine the plane coordinates of the three-dimensional coordinates in the image and to establish the correspondence between them. The coordinates obtained for the same point differ with direction, angle, and algorithm; any algorithm that yields a unique coordinate for each point falls within the protection scope of the present application.
Preferably, in order to better determine the plane coordinates of the three-dimensional coordinates in the image, the following steps are preferably performed:
(1) carrying out planar depth processing on the three-dimensional coordinates according to the acquisition angle of the image to generate a plane graph containing the three-dimensional coordinates.
(2) carrying out same-scale processing on the plane graph according to the size of the image.
(3) mapping each three-dimensional coordinate in the same-scale-processed plane graph to the image, and determining the plane coordinates from the mapping result.
In a specific application scenario, the point cloud information data is planarized at the same scale as the camera data, taking the vertical plane of the robot's facing direction (the robot's frontal direction) as reference (the flattened data is hereinafter called the planarized radar data map), and a mapping function Map() between points in the radar data and points in the plane data is obtained. The plane image formed from the radar data is then compared with the original three-dimensional data, thereby establishing the three-dimensional coordinates of all points in the acquired color image data and the surrounding environment. Same-scale planarization means converting the three-dimensional data scanned by the radar into a plane image of the same size as the area captured in the color image, while recording the correspondence between each point in the plane image and the point in the three-dimensional data before conversion.
S103, acquiring the region position of each face contained in the image.
This step aims to obtain the relative region position of each face in the image. The acquired image may be a color image, a black-and-white image, and so on, and the obtained region position may be a bounding-box region, a contour region, and so on; all fall within the scope of the present application.
And S104, determining the position information of the human face in the designated area according to the corresponding relation and the area position.
The step aims to determine the position information of the face in the designated area, and different methods and algorithms for obtaining the corresponding relation and the area position of the target image do not influence the protection scope of the invention.
Preferably, in order to better determine the position information of the face in the designated area according to the correspondence and the area position, the steps are preferably as follows:
(1) generating a plurality of feature point coordinates from the vertex coordinates of the region position, the feature point coordinates being evenly distributed within the region position.
(2) obtaining the three-dimensional coordinates corresponding to the feature points according to the correspondence, and obtaining the distance between those three-dimensional coordinates and the image acquisition device.
(3) screening the reference position point of the face out of the feature points according to the distance.
(4) determining the three-dimensional coordinate corresponding to the reference position point according to the correspondence, and taking it as the position information of the face.
Preferably, in order to better screen the reference position point of the face from the feature points according to the distance, the steps are preferably as follows:
(1) determining the average of the distances of all feature point coordinates, and removing the feature point coordinates whose distances exceed the average by more than a specified threshold.
(2) if the remaining feature point coordinates are fewer than the designated number, selecting the center point of the region position as the reference position point.
(3) if the remaining feature point coordinates are not fewer than the designated number, selecting from them the feature point coordinate closest to the average as the reference position point.
Preferably, after determining the position information of the face in the designated area according to the correspondence and the area position, the following steps are preferably performed:
(1) obtaining, according to the position information, the face that the acquisition device needs to track.
(2) determining the three-dimensional coordinates of the face to be tracked according to the correspondence.
(3) determining a relative angle from the three-dimensional coordinates of the face to be tracked and the coordinates of the acquisition device.
(4) instructing the acquisition device to move according to the relative angle.
In a specific application scene, n points are selected in the face region in an evenly distributed manner, and their distances from the robot are read from the planarized radar data map. The data of the n points are processed iteratively: points whose distance from the robot exceeds the average distance of all selected points by more than 50% are removed, and the iteration is repeated on the remaining points, extracting the points closest to the actual scene (the average). If at least half of the points are thus extracted, the single point among them closest to the average is selected; if fewer than half remain, the point at the center of the face image is selected directly. This point, the one closest to the true scene and at the same time closest to the average, serves as the standard point for the subsequent face distance.
The distance data of all faces are then compared using the standard points obtained above, and the face closest to the robot is selected.
In a specific application scene, the coordinates of the face center point in the radar point cloud space are calculated from the correspondence between the laser-radar planarized map and the laser-radar point cloud information. The coordinate data is then processed mathematically to obtain the coordinate information (for example (x1, y1, z1)) of the corresponding point on the same plane as the lidar and the camera.
In a specific application scenario, the included angle θ between the robot's facing direction and the point (x1, y1, z1), that is, the current relative angle between the selected face and the robot's facing direction, is calculated with the inverse trigonometric formula θ = arctan(x1/y1). A relatively accurate relative angle between the person and the robot is thus obtained.
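The angle computation can be checked numerically. The sketch below follows θ = arctan(x1/y1) with x1 the lateral offset and y1 the forward distance; `atan2` is used so that the sign of the turn is preserved (an implementation choice assumed here, not stated in the embodiment).

```python
import math

def relative_angle(x1, y1):
    """Relative angle (degrees) between the robot's facing direction and a
    face whose projected point is (x1, y1): theta = arctan(x1 / y1)."""
    return math.degrees(math.atan2(x1, y1))

# A face 1 m to the side and 1 m ahead is at 45 degrees;
# a face straight ahead is at 0 degrees.
```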
In a specific application scene, a motion command to rotate by the specified angle is sent to the robot motion system.
In order to further illustrate the technical idea of the present invention, the technical solution of the present invention will now be described with reference to specific application scenarios.
In this specific application scenario, the specific processing flow is as follows:
(1) acquiring a color image of one frame from the camera, as the resource for later analysis of the face position.
(2) obtaining from the laser radar point cloud information data captured at the same moment as the camera's color image, and planarizing it at the same scale as the camera data, taking the vertical plane of the robot's facing direction (the robot's frontal direction) as reference (hereinafter the planarized radar data map), to obtain the data mapping function Map(). This establishes the three-dimensional coordinates of all points in the color image data obtained in step (1) and of the surroundings.
Specifically, same-scale planarization means converting the three-dimensional data scanned by the radar into a plane image of the same size as the area captured in the color image, and recording the correspondence between each point in the plane image and the point in the three-dimensional data before conversion.
(3) All face information is identified from the color image.
(4) For each face, the following process is performed:
(a) assume the four vertex coordinates of the face region are (x1, y1), (x2, y1), (x1, y2), (x2, y2), where x2 > x1 and y2 > y1. The coordinate information of the following 9 points is taken from the planarized radar data map:
·(x1+(x2-x1)/4, y1+(y2-y1)/4)
·(x1+(x2-x1)/2, y1+(y2-y1)/4)
·(x1+(x2-x1)*3/4, y1+(y2-y1)/4)
·(x1+(x2-x1)/4, y1+(y2-y1)/2)
·(x1+(x2-x1)/2, y1+(y2-y1)/2)
·(x1+(x2-x1)*3/4, y1+(y2-y1)/2)
·(x1+(x2-x1)/4, y1+(y2-y1)*3/4)
·(x1+(x2-x1)/2, y1+(y2-y1)*3/4)
·(x1+(x2-x1)*3/4, y1+(y2-y1)*3/4)
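The nine coordinates above form a 3x3 grid at the 1/4, 1/2, and 3/4 fractions of the face region, which can be generated as follows (`grid_points` is an illustrative name, not from the patent):

```python
def grid_points(x1, y1, x2, y2):
    """The nine sample points of the face region with vertices
    (x1, y1) and (x2, y2): a 3x3 grid at the 1/4, 1/2 and 3/4
    fractions of each side, matching the list above."""
    fractions = (0.25, 0.5, 0.75)
    return [(x1 + (x2 - x1) * fu, y1 + (y2 - y1) * fv)
            for fv in fractions for fu in fractions]
```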
(b) the distances of the above 9 points from the robot are read from the planarized radar data map.
(c) points whose distance exceeds the average value by more than 50% are removed by iterative operation.
(d) after applying this algorithm, if more than 5 points remain, the point among them closest to the average is taken as the standard point, and the following steps continue; if fewer than 5 points remain, those points are discarded, the standard point is taken as (x1+(x2-x1)/2, y1+(y2-y1)/2), and the following steps continue.
Specifically, the iterative operation: average all the values, remove the value that differs most from the average, and repeat on the remaining points.
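Under the 50% criterion of step (c), the iterative operation can be read as repeatedly averaging the remaining distances and dropping the outliers until the set is stable. This is one plausible reading, sketched below; the stopping rule is an assumption, since the embodiment does not spell it out.

```python
def iterative_trim(distances, threshold=0.5):
    """Repeatedly average the remaining distances and drop those exceeding
    the average by more than `threshold` (50% here), until no value is
    dropped in a round."""
    vals = list(distances)
    while len(vals) > 1:
        mean = sum(vals) / len(vals)
        kept = [v for v in vals if v <= mean * (1.0 + threshold)]
        if len(kept) == len(vals):       # stable: nothing removed this round
            break
        vals = kept
    return vals
```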
(5) comparing all the standard-point data obtained in step (4) and selecting the face closest to the robot. The position of the nearest face has now been obtained.
(6) calculating the coordinates of the face center point in the radar point cloud space from the correspondence between the laser-radar planarized map and the laser-radar point cloud information, and then processing the coordinate data mathematically (substituting z0 for z1) to obtain the coordinate information (x1, y1, z1) of the corresponding point on the same plane as the lidar and the camera.
(7) calculating, with the inverse trigonometric formula θ = arctan(x1/y1), the included angle θ between the robot's facing direction and the point (x1, y1, z1), that is, the current relative angle between the selected face and the robot's facing direction. An accurate relative angle between the robot and the person is thus obtained.
(8) sending a motion command to the robot motion system to rotate by the designated angle.
To achieve the above technical object, the present application further proposes a robot, as shown in fig. 2, comprising:
the acquisition module 210 acquires an image of a specified area and all three-dimensional coordinates in the specified area at the same time;
a corresponding module 220, configured to determine plane coordinates of the three-dimensional coordinates in the image, and establish a corresponding relationship between each three-dimensional coordinate and each plane coordinate;
an obtaining module 230, configured to obtain the region positions of the faces included in the image;
and the determining module 240 determines the position information of the face in the designated area according to the corresponding relationship and the area position.
In a specific application scenario, the corresponding module 220 determines a plane coordinate of the three-dimensional coordinate in the image, specifically:
carrying out planar depth processing on the three-dimensional coordinates according to the acquisition angle of the image to generate a planar graph containing the three-dimensional coordinates;
carrying out same-scale processing on the plane graph according to the size of the image;
and mapping each three-dimensional coordinate in the planar graph after the same-proportion processing to the image, and determining the planar coordinate according to a mapping result.
In a specific application scenario, the determining module 240 determines the position information of the face in the designated area according to the corresponding relationship and the area position, specifically:
generating a plurality of characteristic point coordinates according to the vertex coordinates of the region positions, wherein the characteristic point coordinates are evenly distributed in the region positions;
acquiring three-dimensional coordinates corresponding to the characteristic points according to the corresponding relation, and acquiring the distance between the three-dimensional coordinates corresponding to the characteristic points and the image acquisition equipment;
screening out the reference position points of the human face from the feature points according to the distance;
and determining the three-dimensional coordinates corresponding to the reference position points according to the corresponding relation, and taking the three-dimensional coordinates corresponding to the reference position points as the position information of the human face.
In a specific application scenario, the determining module 240 screens the reference position point of the face from the feature points according to the distance, specifically:
determining the average of the distances of all feature point coordinates, and removing the feature point coordinates whose distances exceed the average by more than a specified threshold;
if the remaining feature point coordinates are less than the designated number, selecting the central point of the area position as the reference position point;
and if the residual feature point coordinates are not less than the specified number, selecting the feature point coordinates closest to the average value from the residual feature point coordinates as the reference position points.
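The screening rule above can be sketched as follows. The concrete values of the threshold and the minimum count are assumptions; the patent only says "specified threshold" and "specified number".

```python
import numpy as np

def pick_reference_point(feats, dists, region_center, threshold=0.5, min_count=3):
    """Screen feature points by distance and pick the face's reference point.

    threshold, min_count: stand-ins for the patent's "specified threshold"
    and "specified number"; their values here are illustrative assumptions.
    """
    dists = np.asarray(dists, dtype=float)
    mean = dists.mean()
    # Remove feature points whose distance exceeds the average by more than
    # the threshold (such points typically lie on the background, not the face).
    keep = [i for i, d in enumerate(dists) if d - mean <= threshold]
    if len(keep) < min_count:
        # Too few reliable points remain: fall back to the region's center point.
        return region_center
    # Otherwise take the remaining point whose distance is closest to the average.
    best = min(keep, key=lambda i: abs(dists[i] - mean))
    return feats[best]
```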
In a specific application scenario, after determining the position information of the face in the designated area according to the corresponding relation and the area position, the determining module 240 is further configured to:
acquire, according to the position information, the face that the acquisition device needs to track;
determine the three-dimensional coordinates of the face to be tracked according to the corresponding relation;
determine a relative angle from the three-dimensional coordinates of the face to be tracked and the coordinates of the acquisition device;
and instruct the acquisition device to move according to the relative angle.
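The relative-angle computation can be sketched as below. The device-centered frame convention (x right, y forward, z up) and the function name are assumptions; the patent does not define a coordinate frame.

```python
import math

def relative_angles(face_xyz, device_xyz):
    """Compute pan/tilt angles from the acquisition device to a tracked face.

    Assumes a device-centered frame with x right, y forward, z up.
    Returns (yaw, pitch) in degrees for steering the acquisition device.
    """
    dx = face_xyz[0] - device_xyz[0]
    dy = face_xyz[1] - device_xyz[1]
    dz = face_xyz[2] - device_xyz[2]
    yaw = math.degrees(math.atan2(dx, dy))                    # horizontal rotation
    pitch = math.degrees(math.atan2(dz, math.hypot(dx, dy)))  # vertical rotation
    return yaw, pitch
```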
In a specific application scenario, the three-dimensional coordinates in each module are specifically point cloud data;
the point cloud data are generated by scanning the designated area with a lidar;
or the point cloud data are generated from the color information of a black-and-white image corresponding to the image.
By applying the above technical scheme, the image of the designated area and all three-dimensional coordinates in the designated area are acquired at the same time; the plane coordinates of the three-dimensional coordinates in the image are determined, and a corresponding relation between each three-dimensional coordinate and each plane coordinate is established; the area position of each face contained in the image is obtained; and the position information of the face in the designated area is determined according to the corresponding relation and the area position. In this way, the robot can accurately determine the position of a face relative to itself (relative distance, relative angle, and the like), and can therefore accurately locate human bodies in all directions around it, improving the accuracy of the robot's human body positioning.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by hardware, or by software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes several instructions for causing a computer device (which can be a personal computer, a server, a network device, etc.) to execute the method according to the implementation scenarios of the present invention.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The serial numbers of the above implementation scenarios are for description only and do not represent their relative merits.
The above disclosure is only a few specific implementation scenarios of the present invention, however, the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.
Claims (6)
1. A face localization method, comprising: acquiring an image of a designated area and all three-dimensional coordinates in the designated area at the same time; determining the plane coordinates of the three-dimensional coordinates in the image, and establishing the corresponding relation between each three-dimensional coordinate and each plane coordinate; acquiring the area position of each face contained in the image; determining the position information of the face in a specified area according to the corresponding relation and the area position;
determining the plane coordinates of the three-dimensional coordinates in the image specifically comprises: performing planar depth processing on the three-dimensional coordinates according to the acquisition angle of the image to generate a plan view containing the three-dimensional coordinates; scaling the plan view to the same scale as the image according to the image size; and mapping each three-dimensional coordinate in the scaled plan view onto the image, and determining the plane coordinates from the mapping result;
determining the position information of the face in the designated area according to the corresponding relation and the area position specifically comprises: generating a plurality of feature point coordinates from the vertex coordinates of the area position, the feature point coordinates being evenly distributed within the area position; acquiring the three-dimensional coordinates corresponding to the feature points according to the corresponding relation, and acquiring the distance between each of those three-dimensional coordinates and the image acquisition device; screening out a reference position point of the face from the feature points according to the distances; and determining the three-dimensional coordinate corresponding to the reference position point according to the corresponding relation, and taking it as the position information of the face;
and screening out the reference position point of the face from the feature points according to the distances specifically comprises: determining the average of the distances of all the feature point coordinates, and removing the feature point coordinates whose distances exceed the average by more than a specified threshold; if fewer than a specified number of feature point coordinates remain, selecting the center point of the area position as the reference position point; and if no fewer than the specified number remain, selecting, from the remaining feature point coordinates, the one whose distance is closest to the average as the reference position point.
2. The method of claim 1, wherein after determining the position information of the face in the designated area according to the corresponding relation and the area position, the method further comprises: acquiring, according to the position information, the face that the acquisition device needs to track; determining the three-dimensional coordinates of the face to be tracked according to the corresponding relation; determining a relative angle from the three-dimensional coordinates of the face to be tracked and the coordinates of the acquisition device; and instructing the acquisition device to move according to the relative angle.
3. The method according to claim 1 or 2, wherein the three-dimensional coordinates are specifically point cloud data; the point cloud data are generated by scanning the designated area with a lidar, or the point cloud data are generated from the color information of a black-and-white image corresponding to the image.
4. A robot, characterized in that the robot comprises: an acquisition module, configured to acquire the image of a designated area and all three-dimensional coordinates in the designated area at the same time; a correspondence module, configured to determine the plane coordinates of the three-dimensional coordinates in the image and establish a corresponding relation between each three-dimensional coordinate and each plane coordinate; the acquisition module being further configured to obtain the area position of each face contained in the image; and a determining module, configured to determine the position information of the face in the designated area according to the corresponding relation and the area position;
determining the plane coordinates of the three-dimensional coordinates in the image specifically comprises: performing planar depth processing on the three-dimensional coordinates according to the acquisition angle of the image to generate a plan view containing the three-dimensional coordinates; scaling the plan view to the same scale as the image according to the image size; and mapping each three-dimensional coordinate in the scaled plan view onto the image, and determining the plane coordinates from the mapping result;
determining the position information of the face in the designated area according to the corresponding relation and the area position specifically comprises: generating a plurality of feature point coordinates from the vertex coordinates of the area position, the feature point coordinates being evenly distributed within the area position; acquiring the three-dimensional coordinates corresponding to the feature points according to the corresponding relation, and acquiring the distance between each of those three-dimensional coordinates and the image acquisition device; screening out a reference position point of the face from the feature points according to the distances; and determining the three-dimensional coordinate corresponding to the reference position point according to the corresponding relation, and taking it as the position information of the face;
and screening out the reference position point of the face from the feature points according to the distances specifically comprises: determining the average of the distances of all the feature point coordinates, and removing the feature point coordinates whose distances exceed the average by more than a specified threshold; if fewer than a specified number of feature point coordinates remain, selecting the center point of the area position as the reference position point; and if no fewer than the specified number remain, selecting, from the remaining feature point coordinates, the one whose distance is closest to the average as the reference position point.
5. A computer-readable storage medium having stored therein instructions that, when run on a terminal device, cause the terminal device to perform the face localization method according to any one of claims 1-3.
6. A computer program product, characterized in that it, when run on a terminal device, causes the terminal device to execute the face localization method of any of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811070041.2A CN109492521B (en) | 2018-09-13 | 2018-09-13 | Face positioning method and robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811070041.2A CN109492521B (en) | 2018-09-13 | 2018-09-13 | Face positioning method and robot |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109492521A CN109492521A (en) | 2019-03-19 |
CN109492521B true CN109492521B (en) | 2022-05-13 |
Family
ID=65690550
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811070041.2A Active CN109492521B (en) | 2018-09-13 | 2018-09-13 | Face positioning method and robot |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109492521B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179490B (en) * | 2019-12-13 | 2022-01-11 | 新石器慧通(北京)科技有限公司 | Movable carrier for user verification, control system and unmanned vehicle |
CN112887594B (en) * | 2021-01-13 | 2022-07-15 | 随锐科技集团股份有限公司 | Method and system for improving video conference security |
CN112511757B (en) * | 2021-02-05 | 2021-05-04 | 北京电信易通信息技术股份有限公司 | Video conference implementation method and system based on mobile robot |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106826815A (en) * | 2016-12-21 | 2017-06-13 | 江苏物联网研究发展中心 | Target object method of the identification with positioning based on coloured image and depth image |
CN107179768A (en) * | 2017-05-15 | 2017-09-19 | 上海木爷机器人技术有限公司 | A kind of obstacle recognition method and device |
CN107845061A (en) * | 2017-11-10 | 2018-03-27 | 暴风集团股份有限公司 | Image processing method, device and terminal |
CN108170166A (en) * | 2017-11-20 | 2018-06-15 | 北京理工华汇智能科技有限公司 | The follow-up control method and its intelligent apparatus of robot |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008087140A (en) * | 2006-10-05 | 2008-04-17 | Toyota Motor Corp | Speech recognition robot and control method of speech recognition robot |
CN105518702B (en) * | 2014-11-12 | 2018-06-26 | 深圳市大疆创新科技有限公司 | A kind of detection method, detection device and robot to target object |
CN106709954B (en) * | 2016-12-27 | 2020-05-15 | 上海唱风信息科技有限公司 | Method for masking human face in projection area |
CN106991688A (en) * | 2017-03-09 | 2017-07-28 | 广东欧珀移动通信有限公司 | Human body tracing method, human body tracking device and electronic installation |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106826815A (en) * | 2016-12-21 | 2017-06-13 | 江苏物联网研究发展中心 | Target object method of the identification with positioning based on coloured image and depth image |
CN107179768A (en) * | 2017-05-15 | 2017-09-19 | 上海木爷机器人技术有限公司 | A kind of obstacle recognition method and device |
CN107845061A (en) * | 2017-11-10 | 2018-03-27 | 暴风集团股份有限公司 | Image processing method, device and terminal |
CN108170166A (en) * | 2017-11-20 | 2018-06-15 | 北京理工华汇智能科技有限公司 | The follow-up control method and its intelligent apparatus of robot |
Non-Patent Citations (1)
Title |
---|
Faisal R. Al-Osaimi, "A Novel Multi-Purpose Matching Representation of Local 3D Surfaces: A Rotationally Invariant, Efficient, and Highly Discriminative Approach With an Adjustable Sensitivity", IEEE Transactions on Image Processing, vol. 25, no. 2, pp. 658-672, February 2016. *
Also Published As
Publication number | Publication date |
---|---|
CN109492521A (en) | 2019-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10573018B2 (en) | Three dimensional scene reconstruction based on contextual analysis | |
CN109816730B (en) | Workpiece grabbing method and device, computer equipment and storage medium | |
CN109492521B (en) | Face positioning method and robot | |
CN111325796A (en) | Method and apparatus for determining pose of vision device | |
CN108297115B (en) | Autonomous repositioning method for robot | |
Tang et al. | Camera self-calibration from tracking of moving persons | |
CN108171715B (en) | Image segmentation method and device | |
WO2021134285A1 (en) | Image tracking processing method and apparatus, and computer device and storage medium | |
US20120320162A1 (en) | Video object localization method using multiple cameras | |
Litomisky et al. | Removing moving objects from point cloud scenes | |
JP2018511874A (en) | Three-dimensional modeling method and apparatus | |
CN111142514B (en) | Robot and obstacle avoidance method and device thereof | |
WO2022135594A1 (en) | Method and apparatus for detecting target object, fusion processing unit, and medium | |
CN113888458A (en) | Method and system for object detection | |
WO2019091118A1 (en) | Robotic 3d scanning systems and scanning methods | |
CN116778094B (en) | Building deformation monitoring method and device based on optimal viewing angle shooting | |
CN108109148A (en) | Image solid distribution method, mobile terminal | |
Al-Temeemy et al. | Laser-based structured light technique for 3D reconstruction using extreme laser stripes extraction method with global information extraction | |
WO2021056501A1 (en) | Feature point extraction method, movable platform and storage medium | |
CN110673607B (en) | Feature point extraction method and device under dynamic scene and terminal equipment | |
EP4246452A1 (en) | Three-dimensional point cloud densification device, three-dimensional point cloud densification method, and program | |
KR20230049969A (en) | Method and apparatus for global localization | |
EP4028993A1 (en) | 3d object detection using random forests | |
CN115272402A (en) | Method and device for tracking crown block hook, electronic equipment and readable storage medium | |
CN113034345B (en) | Face recognition method and system based on SFM reconstruction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||