CN111898552B - Method and device for distinguishing person attention target object and computer equipment


Info

Publication number
CN111898552B
Authority
CN
China
Prior art keywords
face
parameter value
size parameter
image
imaging size
Prior art date
Legal status
Active
Application number
CN202010762123.4A
Other languages
Chinese (zh)
Other versions
CN111898552A (en
Inventor
董勇
杨青川
宁瑶
Current Assignee
Chengdu Xinchao Media Group Co Ltd
Original Assignee
Chengdu Xinchao Media Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Chengdu Xinchao Media Group Co Ltd filed Critical Chengdu Xinchao Media Group Co Ltd
Priority to CN202010762123.4A priority Critical patent/CN111898552B/en
Publication of CN111898552A publication Critical patent/CN111898552A/en
Application granted granted Critical
Publication of CN111898552B publication Critical patent/CN111898552B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of behavior recognition and discloses a method, a device, and computer equipment for determining whether a person is paying attention to a target object. In the method, regardless of whether the center of the target object coincides with the origin of the camera coordinate system, the current face pose angle and the current face multidimensional data of an on-site person are extracted in turn from a face image collected on site. The current face pose angle is then compared against the current face pose angle range, which is determined from the current face multidimensional data via a pre-established correspondence between face multidimensional data and face pose angle ranges. If the current face pose angle falls within that range, the on-site person, like the tester from whom the face multidimensional data and face pose angle ranges were collected, is attending to the target object. Whether a person is paying attention to the target object can thus be determined with few computing resources, realizing attention-point estimation in far-field scenes.

Description

Method and device for distinguishing person attention target object and computer equipment
Technical Field
The invention belongs to the technical field of behavior recognition, and particularly relates to a method, a device and computer equipment for determining whether a person is paying attention to a target object.
Background
In many industries it is necessary to sense the audience's points of attention in a timely manner, and thereby learn their interests and judgments, so that designs can be improved and the industry advanced. To determine whether a target object is being attended to by the audience, eye-tracking technology is generally used for attention-point estimation in near-field scenarios such as head-mounted devices. In far-field scenarios of 0.5 m or more, however, current technology can hardly track eyeball features well enough for attention-point estimation, so head pose estimation technology is generally used instead.
The head pose estimation approach works as follows: the pose angle of the head is obtained from a face image. (In three-dimensional space, the rotational pose of an object can be represented by three Euler angles: the pitch angle, rotation about the X axis of a rectangular coordinate system; the yaw angle, rotation about the Y axis; and the roll angle, rotation about the Z axis. For the head these correspond to raising, shaking, and tilting the head, as shown in figure 1.) The head pose angle indirectly indicates the gaze direction of the eyes, and if the target object lies in that direction it is taken as the estimated point of attention. The algorithm for obtaining the head pose angle from a face image generally proceeds as follows: (1) detect two-dimensional face key points in the face image; (2) match the detected two-dimensional face key points to the corresponding key points of a three-dimensional face model; (3) solve for the transformation matrix between the two-dimensional face key points and the corresponding three-dimensional face key points; (4) from the rotation part of that matrix, solve for the three Euler angles of the head relative to the camera coordinate system (the three-dimensional rectangular coordinate system whose origin is the focal center of the camera that captured the face image and whose Z axis is the optical axis).
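Step (4) above, recovering the Euler angles from the solved rotation, can be sketched in pure Python. This is a minimal illustration under one common rotation convention (Rz·Ry·Rx), not the patent's exact implementation; in practice steps (2) and (3) are typically handled by a PnP solver such as OpenCV's solvePnP.

```python
import math

def rotation_y(deg):
    """3x3 rotation matrix about the Y axis (row-major nested lists)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def euler_from_rotation(R):
    """Recover (pitch, yaw, roll) in degrees from a rotation matrix,
    assuming R = Rz(roll) @ Ry(yaw) @ Rx(pitch); other libraries may
    use a different axis order."""
    yaw = math.asin(max(-1.0, min(1.0, -R[2][0])))   # -sin(yaw) sits at R[2][0]
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))

# A pure 30-degree head turn (yaw) should come back out unchanged.
pitch, yaw, roll = euler_from_rotation(rotation_y(30.0))
```

Note that the degenerate case yaw = ±90° (gimbal lock) would need special handling in a production implementation.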
At present, head pose estimation is applicable only when the center of the target object coincides with the origin of the camera coordinate system; in that case, fixed yaw and pitch thresholds can be used to judge whether the head is attending to the target object. When the center of the target object does not coincide with the origin of the camera coordinate system, however, the camera observes from a third-person viewpoint and cannot stand in for the viewpoint of the target object. Judging with the same fixed yaw and pitch thresholds then inevitably causes large errors, or even results that are the exact opposite of what was intended, so a new technical solution is urgently needed.
Disclosure of Invention
To solve the problem that the existing head pose estimation technology has limited application scenarios and cannot be applied when the center of the target object does not coincide with the origin of the camera coordinate system, the invention aims to provide a method, a device, computer equipment and a computer-readable storage medium for determining whether a person is paying attention to a target object. Regardless of whether the center of the target object coincides with the origin of the camera coordinate system, whether a person is paying attention to the target object can be determined from a face image acquired on site using few computing resources, realizing attention-point estimation in far-field scenes.
In a first aspect, the present invention provides a method for identifying a target object of interest of a person, including:
acquiring a face image, wherein the face image comprises at least one person;
extracting a current face attitude angle and current face multidimensional data of the person from the face image;
determining a current face attitude angle range corresponding to the current face multi-dimensional data according to the corresponding relation between the face multi-dimensional data and the face attitude angle range, wherein the face attitude angle range is a face attitude angle interval which corresponds to the face multi-dimensional data and can focus on a target object;
and when the current face pose angle is within the range of the current face pose angle, judging that the person pays attention to the target object.
The invention above provides a new attention-point estimation method that works regardless of whether the center of the target object coincides with the origin of the camera coordinate system. The current face pose angle and current face multidimensional data of an on-site person are extracted in turn from a face image acquired on site, and the current face pose angle is compared against the current face pose angle range, which is determined from the current face multidimensional data via the correspondence between face multidimensional data and face pose angle ranges. If the current face pose angle lies within that range, the on-site person, like the tester from whom the face multidimensional data and face pose angle ranges were collected, is attending to the target object. Whether a person is paying attention to the target object can thus be determined with few computing resources, realizing attention-point estimation in far-field scenes and facilitating practical application and popularization.
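The core decision of the first aspect reduces to an interval containment test. The sketch below is a minimal illustration; the tuple layout and the use of only pitch and yaw are assumptions for clarity, since the patent does not fix a data format.

```python
def is_attending(current_angles, angle_range):
    """current_angles = (pitch, yaw) in degrees; angle_range =
    ((pitch_lo, pitch_hi), (yaw_lo, yaw_hi)), i.e. the current face
    pose angle range looked up from the person's current face
    multidimensional data. Returns True if the person is judged to
    be attending to the target object."""
    (p_lo, p_hi), (y_lo, y_hi) = angle_range
    pitch, yaw = current_angles
    return p_lo <= pitch <= p_hi and y_lo <= yaw <= y_hi

# A range a hypothetical lookup returned for one face position/size.
rng = ((-20.0, 10.0), (-15.0, 15.0))
hit = is_attending((-5.0, 3.0), rng)    # inside both intervals
miss = is_attending((-5.0, 40.0), rng)  # yaw falls outside its interval
```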
In one possible design, extracting the current face multidimensional data of the person from the face image includes:
extracting two-dimensional coordinate data of the human face from the human face image;
and extracting a first face imaging size parameter value from the face image, wherein the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multi-dimensional data.
Through this possible design, when the image acquisition device is a monocular camera, the face imaging size can reflect how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible; this ensures that the current face multidimensional data remain extractable and realizes attention-point estimation in far-field scenes.
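As one illustration of this design, the face two-dimensional coordinate data and the first face imaging size parameter value could be derived from a detector's bounding box; the box-center/box-size choice here is an assumption, since the patent does not prescribe which size measure to use.

```python
def face_features(bbox):
    """From a face detector's bounding box (x1, y1, x2, y2) in pixels,
    derive the face two-dimensional coordinate data (box center) and a
    first face imaging size parameter value (box width and height)."""
    x1, y1, x2, y2 = bbox
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    size = (x2 - x1, y2 - y1)
    return center, size

center, size = face_features((100, 100, 200, 240))
```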
In one possible design, extracting the current face multidimensional data of the person from the face image includes:
extracting two-dimensional coordinate data of the human face from the human face image;
extracting a first face imaging size parameter value from the face image;
and calculating to obtain a second face imaging size parameter value according to the current face attitude angle and the first face imaging size parameter value, wherein the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is a face imaging size parameter value corresponding to the face of the person when the face of the person is perpendicular to the optical axis of the image acquisition equipment under the condition that the face coordinate position is unchanged.
Through this possible design, when the image acquisition device is a monocular camera, the frontal-view face imaging size can reflect how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible; this ensures both the extractability of the current face multidimensional data and the accuracy of the subsequent attention judgment, realizing attention-point estimation in far-field scenes.
In one possible design, extracting the current face multidimensional data of the person from the face image includes:
extracting two-dimensional coordinate data of the human face from the human face image;
extracting a first face imaging size parameter value from the face image;
calculating to obtain a second face imaging size parameter value according to the current face attitude angle and the first face imaging size parameter value, wherein the second face imaging size parameter value is a face imaging size parameter value corresponding to the face of the person when the face is perpendicular to the optical axis of the image acquisition equipment under the condition that the face coordinate position is unchanged;
identifying the age of the person according to the face image;
and when the age is smaller than the preset age, correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the third face imaging size parameter value are used as the current face multi-dimensional data.
Through this possible design, when the image acquisition device is a monocular camera, the frontal-view face imaging size corrected for age factors can reflect how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible; this ensures both the extractability of the current face multidimensional data and the accuracy of the subsequent attention judgment, realizing attention-point estimation in far-field scenes.
In one possible design, extracting the current face multidimensional data from the face image includes:
identifying the age of the person according to the face image;
extracting two-dimensional coordinate data of the human face from the human face image;
extracting a first face imaging size parameter value from the face image;
and calculating to obtain a second face imaging size parameter value according to the current face attitude angle and the first face imaging size parameter value, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multi-dimensional data, and the second face imaging size parameter value is a face imaging size parameter value corresponding to the face of the person when the face of the person is perpendicular to the optical axis of the image acquisition equipment under the condition that the face coordinate position is unchanged.
Through this possible design, when the image acquisition device is a monocular camera, the frontal-view face imaging size reflects how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible and ensuring the extractability of the current face multidimensional data; at the same time, adding age as an extra dimension of the current face multidimensional data avoids errors in the final attention judgment caused by a large age difference between the on-site person and the tester, further improving the judgment accuracy and realizing attention-point estimation in far-field scenes.
In one possible design, calculating a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value includes:
importing the current face pose angle into a trigonometric function that reflects rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device;
and calculating to obtain the second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient.
Through this possible design, a true and accurate frontal-view face size parameter can be obtained, ensuring the accuracy of subsequent judgments.
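One plausible form of this trigonometric correction is sketched below; the patent does not give the exact function, so the cosine model (treating the face as roughly planar) is an assumption.

```python
import math

def frontal_size(width_px, height_px, pitch_deg, yaw_deg):
    """Approximate the frontal (second) face imaging size from the
    measured (first) size: a yaw rotation foreshortens the imaged width
    by cos(yaw) and a pitch rotation foreshortens the height by
    cos(pitch), so dividing by these cosines (the rotation
    transformation coefficients) undoes the foreshortening."""
    w = width_px / math.cos(math.radians(yaw_deg))
    h = height_px / math.cos(math.radians(pitch_deg))
    return w, h

# A pure 60-degree head turn halves the imaged width, so the
# correction doubles it back; the height is untouched.
w, h = frontal_size(60.0, 90.0, 0.0, 60.0)
```

A production implementation would also guard against angles near 90 degrees, where the cosine vanishes and the correction blows up.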
In one possible design, extracting the current face multidimensional data from the face image includes:
extracting two-dimensional coordinate data of the human face from the human face image;
extracting a first face imaging size parameter value from the face image;
identifying the age of the person according to the face image;
when the age is smaller than the preset age, correcting the first face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a fourth face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the fourth face imaging size parameter value are used as the current face multi-dimensional data.
In one possible design, extracting current face multidimensional data from the face image includes:
identifying the age of the person according to the face image;
extracting two-dimensional coordinate data of the human face from the human face image;
and extracting a first face imaging size parameter value from the face image, wherein the age, the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multi-dimensional data.
Through this possible design, when the image acquisition device is a monocular camera, the face imaging size can reflect how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible and ensuring the extractability of the current face multidimensional data; at the same time, adding age as an extra dimension of the current face multidimensional data avoids errors in the final attention judgment caused by a large age difference between the on-site person and the tester, further improving the judgment accuracy and realizing attention-point estimation in far-field scenes.
Through the possible design using the fourth face imaging size parameter value, when the image acquisition device is a monocular camera, the face imaging size corrected for age factors can reflect how far the face is from the origin of the camera coordinate system, allowing it to substitute for the Z-axis coordinate when distance measurement is impossible; this ensures both the extractability of the current face multidimensional data and the accuracy of the subsequent attention judgment, realizing attention-point estimation in far-field scenes.
In one possible design, determining a current face pose angle range corresponding to the current face multidimensional data according to a corresponding relationship between the face multidimensional data and the face pose angle range includes:
and importing the current face multidimensional data into a continuous curve fitting function, and calculating to obtain a current face attitude angle range corresponding to the current face multidimensional data, wherein the continuous curve fitting function is obtained by fitting multiple groups of actually measured face attitude angle ranges and face multidimensional data.
Through this possible design, once a limited number of groups of measured face pose angle ranges and face multidimensional data have been collected, curve fitting refines them into current face pose angle ranges for arbitrary values of the variables, reducing the collection workload while ensuring the accuracy of subsequent judgments.
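The curve-fitting idea can be sketched in one dimension with a closed-form least-squares line fitted to the measured bounds; real data would be multidimensional (and the fit likely nonlinear), and the measured numbers below are invented purely for illustration.

```python
def linear_fit(xs, ys):
    """Least-squares fit ys ≈ a + b * xs; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical measured yaw bounds (degrees) at three horizontal
# face positions (pixels), as a tester attended the target object.
xs = [100.0, 320.0, 540.0]
a_lo, b_lo = linear_fit(xs, [-35.0, -15.0, 5.0])
a_hi, b_hi = linear_fit(xs, [-5.0, 15.0, 35.0])

def yaw_range(x):
    """Continuous fitted yaw interval for any face x-coordinate,
    including positions never measured directly."""
    return a_lo + b_lo * x, a_hi + b_hi * x

lo, hi = yaw_range(320.0)  # query at one of the measured positions
```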
In a second aspect, the invention provides a device for distinguishing a target object concerned by a person, which comprises an image acquisition unit, a data extraction unit, a range determination unit and a concerned distinguishing unit which are sequentially connected in a communication manner;
the image acquisition unit is used for acquiring a face image, wherein the face image comprises at least one person;
the data extraction unit is used for extracting a current face attitude angle and current face multidimensional data of the person from the face image;
the range determining unit is used for determining a current face pose angle range corresponding to the current face multi-dimensional data according to the corresponding relation between the face multi-dimensional data and the face pose angle range, wherein the face pose angle range refers to a face pose angle interval corresponding to the face multi-dimensional data and capable of paying attention to a target object;
and the attention judging unit is used for judging that the person pays attention to the target object when the current face attitude angle is within the range of the current face attitude angle.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit and a first size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is in communication connection with the face two-dimensional coordinate extraction subunit, and is configured to extract a first face imaging size parameter value from the face image, where the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multidimensional data.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit and a second size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is respectively in communication connection with the face two-dimensional coordinate extraction subunit and the first size parameter extraction subunit, and is configured to calculate a second face imaging size parameter value according to the current face attitude angle and the first face imaging size parameter value, where the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is a face imaging size parameter value corresponding to a face of the person when the face is perpendicular to an optical axis of the image acquisition device under a condition that a face coordinate position is unchanged.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit, a second size parameter extraction subunit, an age extraction subunit and a third size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is in communication connection with the first size parameter extraction subunit, and is configured to calculate a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, where the second face imaging size parameter value is a face imaging size parameter value corresponding to a face of the person when the face is perpendicular to an optical axis of the image acquisition device under a condition that a face coordinate position is unchanged;
the age extracting subunit is used for identifying the age of the person according to the face image;
and the third size parameter extraction subunit is respectively in communication connection with the face two-dimensional coordinate extraction subunit, the second size parameter extraction subunit and the age extraction subunit, and is used for correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter when the age is smaller than the preset age to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age and takes the face two-dimensional coordinate data and the third face imaging size parameter value as the current face multi-dimensional data.
In one possible design, the data extraction unit comprises an age extraction subunit, a human face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit and a second size parameter extraction subunit;
the age extracting subunit is used for identifying the age of the person according to the face image;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is respectively in communication connection with the age extraction subunit, the face two-dimensional coordinate extraction subunit and the first size parameter extraction subunit, and is configured to calculate a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value corresponding to the person's face when, with the face coordinate position unchanged, the face is perpendicular to the optical axis of the image acquisition device.
In one possible design, the second size parameter extraction subunit comprises a coefficient acquisition sub-subunit and a size calculation sub-subunit which are communicatively connected;
the coefficient acquisition sub-subunit is configured to import the current face pose angle into a trigonometric function that reflects rotation of the face onto a plane perpendicular to the optical axis of the image acquisition device, obtaining a rotation transformation coefficient;
and the size calculation sub-subunit is configured to calculate the second face imaging size parameter value from the first face imaging size parameter value and the rotation transformation coefficient.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit, an age extraction subunit and a fourth size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is configured to extract a first face imaging size parameter value from the face image;
the age extracting subunit is used for identifying the age of the person according to the face image;
the fourth size parameter draws the subunit, communication connection respectively the people face two-dimensional coordinate draws the subunit first size parameter draw the subunit with the age draws the subunit, is used for when the age is less than when predetermineeing the age, according to the proportional relation of children's face size standard parameter and adult's face size standard parameter, rectifies first face formation of image size parameter value obtains fourth face formation of image size parameter value, wherein, children's face size standard parameter with the age corresponds, will the people face two-dimensional coordinate data with fourth face formation of image size parameter value is regarded as current people face multidimension data.
In a possible design, the range determining unit is specifically configured to import the current face multidimensional data into a continuous curve fitting function, and calculate to obtain a current face pose angle range corresponding to the current face multidimensional data, where the continuous curve fitting function is obtained by fitting according to multiple groups of measured face pose angle ranges and face multidimensional data.
In a third aspect, the present invention provides a computer device comprising a memory and a processor that are communicatively connected, wherein the memory is configured to store a computer program, and the processor is configured to read the computer program and execute the method for discriminating a person's attention target object as described in the first aspect or any one of the possible designs of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having instructions stored thereon which, when run on a computer, cause the computer to perform the method for discriminating a person's attention target object as described in the first aspect or any one of the possible designs of the first aspect.
In a fifth aspect, the present invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of discriminating a person's attention target object as described above in the first aspect or any one of the possible designs of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a diagram illustrating a head posture in a case where a head is raised, shaken, and turned in the related art.
Fig. 2 is an exemplary diagram of a positional relationship among the image capturing device, the target object, and the human face according to the present invention.
FIG. 3 is a flow chart of a method for identifying a target object of interest of a person according to the present invention.
Fig. 4 is a schematic structural diagram of an apparatus for discriminating a target object of interest of a person according to the present invention.
Fig. 5 is a schematic structural diagram of a computer device provided by the present invention.
In the above drawings: 1-an image acquisition device; 2-a target; 3-a human face; 4-imaging field of view; 5-grid.
Detailed Description
The invention is further described with reference to the following figures and specific examples. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Specific structural and functional details disclosed herein are merely illustrative of example embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present invention.
It should be understood that the term "and/or", as may appear herein, merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, B exists alone, or A and B exist at the same time. The term "/and", as may appear herein, describes another association relationship and indicates that two relationships may exist; for example, "A/and B" may mean: A exists alone, or A and B exist at the same time. In addition, the character "/", as may appear herein, generally indicates that the associated objects before and after it are in an "or" relationship.
It will be understood that when an element is referred to herein as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Conversely, if an element is referred to herein as being "directly connected" or "directly coupled" to another element, no intervening elements are present. Other words used to describe the relationship between elements should be interpreted in a similar manner (e.g., "between" versus "directly between", "adjacent" versus "directly adjacent", etc.).
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative designs, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
It should be understood that specific details are provided in the following description to facilitate a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure the examples in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring the example embodiments.
As shown in figs. 2 to 3, the method for discriminating a person's attention target object provided in the first aspect of this embodiment may be, but is not limited to being, applied to judging whether a person is paying attention to a target object (e.g., a commodity, a painting, or an elevator advertisement) in scenes such as shops, exhibition halls, and elevators; regardless of whether the center of the target object coincides with the origin of the camera coordinate system, the estimation of the point of attention in a far-field scene can be achieved based on a face image acquired on site. As shown in fig. 2, in an indoor exhibition hall space, the origin O of the camera coordinate system of the image acquisition device 1 (which may be, but is not limited to, a binocular camera or a monocular camera) does not substantially coincide with the center of the target object 2 (which may be, but is not limited to, a painting). When the face 3 is at any position in the space, the yaw angle yaw and the pitch angle pitch of the face pose relative to the camera coordinate system (i.e., the coordinate system of the image acquisition device 1), which represent the orientation of the face and the attention direction of the eyes, are different, and the technical solution provided in the first aspect of this embodiment is then needed to estimate the point of attention in the far field. The method for discriminating a person's attention target object may include, but is not limited to, the following steps S101 to S104.
S101, a face image is obtained, wherein the face image comprises at least one person.
In the step S101, the face image may be, but is not limited to, acquired by the image acquisition apparatus 1 shown in fig. 2, and the face image may be acquired when the person appears in the imaging field of view 4 of the image acquisition apparatus 1.
And S102, extracting the current face pose angle and the current face multidimensional data of the person from the face image.
In step S102, the specific manner of extracting the current face pose angle of the person from the face image is a conventional existing one, and may include, but is not limited to: (1) performing two-dimensional face key point detection on the face image; (2) matching the detected two-dimensional face key points with the corresponding face key points in a three-dimensional face model; (3) solving the rotation relation matrix between the two-dimensional face key points and the corresponding three-dimensional face key points; (4) solving the three Euler angles of the face relative to the camera coordinate system (i.e., the current face pose angle: pitch angle pitch, yaw angle yaw and roll angle roll) from the rotation relation matrix. Two-dimensional face key point detection locates, given a face image, the key regions of the face, including the eyebrows, eyes, nose, mouth, face contour and the like. Existing face key point detection methods fall roughly into three categories: those based on the Active Shape Model (ASM) and the Active Appearance Model (AAM), Cascaded Pose Regression (CPR), and deep learning methods; a two-dimensional face mark frame, two-dimensional face key regions and/or two-dimensional face key points can be detected based on any of these three existing methods. In addition, the current face pose angle can also be obtained by geometrically converting the three Euler angles of the face relative to the camera coordinate system into the three Euler angles relative to another spatial coordinate system.
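Step (4) above, recovering the Euler angles from the solved rotation matrix, can be sketched in numpy. The ZYX decomposition below is one common convention; sign and axis conventions differ between pose-estimation libraries, so the sketch is illustrative only:

```python
import numpy as np

def euler_from_rotation(R):
    # Decompose a 3x3 rotation matrix into (pitch, yaw, roll) in degrees
    # using the common ZYX convention; conventions vary between libraries,
    # so treat this as an illustrative choice, not the patent's.
    yaw = np.degrees(np.arcsin(-R[2, 0]))
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return pitch, yaw, roll

# Sanity check: a pure 30-degree rotation about the Y (yaw) axis.
t = np.radians(30.0)
R = np.array([[np.cos(t), 0.0, np.sin(t)],
              [0.0, 1.0, 0.0],
              [-np.sin(t), 0.0, np.cos(t)]])
pitch, yaw, roll = euler_from_rotation(R)
```

Steps (2) and (3) are typically delegated to a PnP solver such as OpenCV's `solvePnP`, which returns the rotation that this function then decomposes.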
In step S102, when the image acquisition device 1 is a binocular camera, the distance from the face 3 to the origin O of the camera coordinate system (i.e., the Z-axis coordinate of the face in the camera coordinate system) can be obtained by the binocular ranging principle. Meanwhile, since the face image is perpendicular to the optical axis of the image acquisition device 1 (i.e., the Z axis of the camera coordinate system), the X-axis and Y-axis coordinates of the face in the camera coordinate system can be obtained directly from the coordinate position of the face 3 in the face image. As shown in fig. 2, when the imaging plane of the image acquisition device is divided into a plurality of grids 5, each grid 5 represents a specific coordinate position in the XY plane, so the grid position occupied by the face 3 gives its X-axis and Y-axis coordinates; the X-axis, Y-axis and Z-axis coordinates of the face 3 in the camera coordinate system can then be used as the current face multidimensional data.
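The grid mapping and the binocular ranging above can be sketched as follows. The grid dimensions, focal length, baseline and disparity values are all illustrative assumptions, not figures from the patent:

```python
def grid_cell(x_px, y_px, img_w, img_h, cols=5, rows=4):
    # Map the face's pixel position to a cell of the 4x5 grid on the
    # imaging plane (fig. 2); the cell index stands in for the X/Y
    # coordinates of the face multidimensional data.
    col = min(int(x_px * cols / img_w), cols - 1)
    row = min(int(y_px * rows / img_h), rows - 1)
    return col, row

def stereo_depth(focal_px, baseline_m, disparity_px):
    # Standard binocular ranging: Z = f * B / d.
    return focal_px * baseline_m / disparity_px

cell = grid_cell(640, 360, 1280, 720)  # face at the image centre
z = stereo_depth(800.0, 0.10, 40.0)    # 2.0 m with these example values
```

Together, `(cell[0], cell[1], z)` is one concrete realization of the XYZ-style current face multidimensional data described in the text.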
S103, determining a current face pose angle range corresponding to the current face multi-dimensional data according to the corresponding relation between the face multi-dimensional data and the face pose angle range, wherein the face pose angle range refers to a face pose angle interval corresponding to the face multi-dimensional data and capable of paying attention to a target object.
In step S103, the face pose angle interval consists of a pair of values, a face pose angle upper limit and a face pose angle lower limit, which are used to confirm that the person corresponding to the face multidimensional data is paying attention to the target object when the pose angle lies within the interval. The face multidimensional data, the face pose angle upper limit value and the face pose angle lower limit value can be acquired in advance, in the same manner as in step S102, while a tester pays attention to the target object on site. As shown in fig. 2, in the indoor exhibition hall space, the XYZ coordinate system is the camera coordinate system. In a rectangular pyramid whose vertex is the focus center of the image acquisition device 1 and whose center line is the optical axis, the bottom surface of the pyramid is the imaging field of view 4. A 4 × 5 grid can then be divided on the XY plane, and the pyramid can further be divided into a plurality of rectangular pyramid spaces along the Z-axis dimension. The XYZ coordinates of each rectangular pyramid space serve as face multidimensional data, and the corresponding face pose angle upper and lower limit values need to be collected for the XYZ coordinates of each rectangular pyramid space. Thus, when the face 3 occupies a certain rectangular pyramid space, the corresponding face pose angle upper and lower limit values can be found from the obtained current XYZ coordinates (i.e., the current face multidimensional data) and used as the current face pose angle range.
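The lookup described in steps S103 and S104 can be sketched as a table keyed by the occupied rectangular pyramid space. All keys and interval values below are hypothetical calibration data, and only the yaw dimension is shown for brevity:

```python
# Hypothetical calibration table (all values illustrative): each key is
# (grid col, grid row, Z-depth bin), identifying one rectangular pyramid
# space; each value is the yaw interval measured while a tester watched
# the target from that space.
pose_range_table = {
    (2, 2, 0): (-25.0, 25.0),
    (2, 2, 1): (-15.0, 15.0),
    (0, 1, 0): (5.0, 40.0),
}

def attends_target(space_key, current_yaw):
    # Steps S103/S104: look up the pose-angle interval for the occupied
    # space and test whether the current yaw falls inside it.
    if space_key not in pose_range_table:
        return False  # no calibration collected for this space
    lo, hi = pose_range_table[space_key]
    return lo <= current_yaw <= hi

attending = attends_target((2, 2, 1), 10.0)
not_attending = attends_target((2, 2, 1), 20.0)
```

A full implementation would key the table on pitch and yaw intervals jointly, but the decision structure is the same.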
And S104, when the current face pose angle is within the range of the current face pose angle, judging that the person pays attention to the target object.
In step S104, when the current face pose angle of the person is within the current face pose angle range, it indicates that the person focuses on the target object on the spot like a test person, and it may be determined that the person focuses on the target object. Conversely, when the current face pose angle is outside the range of the current face pose angle, it may be determined that the person is not paying attention to the target object.
Therefore, the discrimination solution described in detail in steps S101 to S104 provides a new method for estimating the point of attention regardless of whether the center of the target object coincides with the origin of the camera coordinate system: based on a face image acquired on site, the current face pose angle and the current face multidimensional data of the on-site person are extracted in turn, and the current face pose angle is compared with the current face pose angle range that corresponds to the current face multidimensional data and is determined from the correspondence between face multidimensional data and face pose angle ranges. If the current face pose angle is within the current face pose angle range, the on-site person is paying attention to the target object just as the tester who collected the face multidimensional data and the face pose angle range did. Whether a person pays attention to the target object can thus be discriminated with few computer resources, realizing the estimation of the point of attention in far-field scenes and facilitating practical application and popularization.
On the basis of the technical solution of the first aspect, this embodiment further specifically proposes a possible design one for extracting the current face multidimensional data when the image acquisition device is a monocular camera; that is, extracting the current face multidimensional data of the person from the face image includes, but is not limited to, the following steps S211 to S212.
And S211, extracting two-dimensional coordinate data of the human face from the human face image.
In step S211, since the face image is perpendicular to the optical axis of the image capturing device 1 (i.e., the Z axis in the camera coordinate system), the X-axis coordinate and the Y-axis coordinate of the face in the camera coordinate system can be directly obtained based on the coordinate position of the face 3 in the face image, as shown in fig. 2, when the imaging plane of the image capturing device is divided into a plurality of grids 5, each grid 5 represents a specific coordinate position in the XY plane, and thus the grid position occupied by the face 3 is the X-axis coordinate and the Y-axis coordinate, and further the X-axis coordinate and the Y-axis coordinate currently corresponding to the face 3 can be used as the two-dimensional coordinate data of the face.
S212, extracting a first face imaging size parameter value from the face image, wherein the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multi-dimensional data.
In step S212, the first face imaging size parameter value is the imaging size of the face 3 in the face image, and can be obtained by directly comparing sizes in the captured photos. Considering that the imaging result of a monocular camera cannot be used for ranging, the first face imaging size parameter value can replace the Z-axis coordinate in the current face multidimensional data of the first aspect; that is, the distance from the face 3 to the origin O of the camera coordinate system is reflected by the face imaging size. Specifically, the first face imaging size parameter value may include, but is not limited to, the area value of the face image mark frame, the area value of any face key region, and/or the distance value between any two face key points, where the face key regions and face key points are detected from the face image (based on an existing face key point detection method). For example, a face key region may be, but is not limited to, the face region, an eye region, the nose region or the mouth region, and the distance between two face key points may be, but is not limited to, the interpupillary distance value.
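As one concrete choice of first face imaging size parameter value, the interpupillary distance in pixels can be computed from two detected pupil key points. The coordinates below are illustrative:

```python
import math

def interpupillary_px(left_pupil, right_pupil):
    # Pixel distance between the two detected pupil key points; one
    # possible first face imaging size parameter value when no depth
    # measurement is available.
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    return math.hypot(dx, dy)

# A larger imaged interpupillary distance means a closer face.
near_face = interpupillary_px((400, 300), (460, 300))  # 60 px
far_face = interpupillary_px((610, 300), (625, 300))   # 15 px
```

The same pattern applies to the other listed options (mark-frame area, key-region area): any monotone function of apparent size works as a stand-in for depth.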
Therefore, through the possible design one described in steps S211 to S212, when the image acquisition device is a monocular camera, the face imaging size can be used to reflect the distance from the face to the origin of the camera coordinate system, so that the Z-axis coordinate can be replaced when no ranging is possible, the extractability of the current face multidimensional data is ensured, and the estimation of the point of attention in far-field scenes is realized.
On the basis of the technical solution of the first aspect, this embodiment further specifically proposes another possible design two for extracting the current face multidimensional data when the image acquisition device is a monocular camera; that is, extracting the current face multidimensional data of the person from the face image includes, but is not limited to, the following steps S221 to S224.
And S221, extracting two-dimensional coordinate data of the human face from the human face image.
S222, extracting a first face imaging size parameter value from the face image.
And S223, identifying the age of the person according to the face image.
S224, when the age is smaller than the preset age, correcting the first face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a fourth face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the fourth face imaging size parameter value are used as the current face multi-dimensional data.
In steps S221 to S222, the face two-dimensional coordinate data and the first face imaging size parameter value may be obtained as in the aforementioned possible design one. In step S223, the age of the person may be identified by, but is not limited to, importing the face image into a face recognition model that has undergone deep learning training; such a model is an existing conventional one able to identify age and the like from a face image. In step S224, it is considered that adult face size parameters, such as the head, the face or the interpupillary distance, differ relatively little among adults and can serve as a basic judgment criterion; however, when one of the on-site person and the tester is an adult and the other is a child, an error would be introduced into the final attention discrimination result. The first face imaging size parameter value therefore needs to be corrected according to the proportional relationship between the child face size standard parameter and the adult face size standard parameter (that is, the size parameter is reduced when the on-site person is an adult and the tester is a child, and enlarged when the on-site person is a child and the tester is an adult) to obtain a fourth face imaging size parameter value, which then replaces the first face imaging size parameter value in the current face multidimensional data of possible design one. The current face multidimensional data of the person and the face multidimensional data of the tester are thus kept consistent, avoiding an error in the subsequent attention discrimination result and further improving its accuracy.
Therefore, through the possible design two described in steps S221 to S224, when the image acquisition device is a monocular camera, the face imaging size corrected by the age factor can be used to reflect the distance from the face to the origin of the camera coordinate system, so that the Z-axis coordinate can be replaced when no ranging is possible, the extractability of the current face multidimensional data and the accuracy of the subsequent attention discrimination result are ensured, and the estimation of the point of attention in far-field scenes is realized. Furthermore, the first face imaging size parameter value may also be corrected based on a gender factor; that is, the gender of the person is identified from the face image, and in the correction the child face size standard parameter corresponds to the age and the gender, while the adult face size standard parameter corresponds to the gender.
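The age-based correction of steps S223 and S224 can be sketched as a simple rescaling. The standard-parameter table, age bands and preset age below are hypothetical placeholders, not values from the patent:

```python
# Hypothetical standard parameters (illustrative, not from the patent):
# an adult interpupillary distance and per-age-band child values, in mm.
ADULT_IPD_MM = 63.0
CHILD_IPD_MM = {4: 48.0, 8: 54.0, 12: 58.0}
PRESET_AGE = 16

def corrected_size(first_size_px, age):
    # Scale a child's measured imaging size by the adult/child standard
    # ratio so it is comparable with calibration data from adult testers.
    if age >= PRESET_AGE:
        return first_size_px
    # Pick the closest age band not above the given age (ages below the
    # lowest band are not handled in this sketch).
    band = max(a for a in CHILD_IPD_MM if a <= age)
    return first_size_px * ADULT_IPD_MM / CHILD_IPD_MM[band]

adult_size = corrected_size(60.0, 30)  # unchanged for adults
child_size = corrected_size(48.0, 8)   # scaled toward the adult standard
```

The result plays the role of the fourth face imaging size parameter value: a child measured at 48 px is reported as roughly the size an adult would have imaged at the same distance.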
On the basis of the technical solution of the first aspect, this embodiment further specifically proposes another possible design three for extracting the current face multidimensional data when the image acquisition device is a monocular camera; that is, extracting the current face multidimensional data of the person from the face image includes, but is not limited to, the following steps S231 to S233.
And S231, extracting two-dimensional coordinate data of the human face from the human face image.
And S232, extracting a first face imaging size parameter value from the face image.
And S233, calculating a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, wherein the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value the person's face would have if, with the face coordinate position unchanged, the face were perpendicular to the optical axis of the image acquisition device.
In steps S231 and S232, the face two-dimensional coordinate data and the first face imaging size parameter value may be obtained as in the aforementioned possible design one. In step S233, it is considered that the first face imaging size parameter value is obtained at an oblique viewing angle of the image acquisition device, which would introduce an error into the final attention discrimination result; therefore, a geometric spatial rotation transformation needs to be performed according to the current face pose angle and the first face imaging size parameter value to obtain the second face imaging size parameter value at the main (frontal) viewing angle, which then replaces the first face imaging size parameter value in the current face multidimensional data of the above possible design one, so that an error in the subsequent attention discrimination result is avoided and the discrimination accuracy is further improved.
Therefore, through the possible design three described in steps S231 to S233, when the image acquisition device is a monocular camera, the face imaging size at the main viewing angle can be used to reflect the distance from the face to the origin of the camera coordinate system, so that the Z-axis coordinate can be replaced when no ranging is possible, the extractability of the current face multidimensional data and the accuracy of the subsequent attention discrimination result are ensured, and the estimation of the point of attention in far-field scenes is realized.
On the basis of the technical solution of the first aspect, this embodiment further specifically proposes another possible design four for extracting the current face multidimensional data when the image acquisition device is a monocular camera; that is, extracting the current face multidimensional data of the person from the face image includes, but is not limited to, the following steps S241 to S245.
And S241, extracting face two-dimensional coordinate data from the face image.
And S242, extracting a first face imaging size parameter value from the face image.
And S243, calculating a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, wherein the second face imaging size parameter value is the face imaging size parameter value the person's face would have if, with the face coordinate position unchanged, the face were perpendicular to the optical axis of the image acquisition device.
And S244, identifying the age of the person according to the face image.
And S245, when the age is smaller than a preset age, correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the third face imaging size parameter value are used as the current face multi-dimensional data.
In steps S241 to S243, the face two-dimensional coordinate data and the first face imaging size parameter value may be obtained as in the aforementioned possible design one, and the second face imaging size parameter value as in the aforementioned possible design three. In step S244, the age of the person may be identified by, but is not limited to, importing the face image into a face recognition model that has undergone deep learning training; such a model is an existing conventional one able to identify age and the like from a face image. In step S245, it is considered that adult main-view face size parameters, such as the head, the face or the interpupillary distance, differ relatively little among adults and can serve as a basic judgment basis; however, when one of the on-site person and the tester is an adult and the other is a child, an error would be introduced into the final attention discrimination result. The second face imaging size parameter value is therefore corrected according to the proportional relationship between the child face size standard parameter and the adult face size standard parameter (that is, the size parameter is reduced when the on-site person is an adult and the tester is a child, and enlarged when the on-site person is a child and the tester is an adult) to obtain a third face imaging size parameter value, which then replaces the second face imaging size parameter value in the current face multidimensional data of the above possible design three. The current face multidimensional data of the person and the face multidimensional data of the tester are thus kept consistent, avoiding an error and further improving the accuracy of the subsequent discrimination result.
Therefore, by the fourth possible design described in the above steps S241 to S245, when the image capturing device is a monocular camera, the distance from the face to the origin of the camera coordinate system can be reflected by using the size of the face image under the main viewing angle and corrected by the age factor, so that the Z-axis coordinate can be replaced in the case of being unable to measure the distance, the extractability of the current face multidimensional data and the accuracy of the subsequent attention discrimination result can be ensured, and the estimation of the attention point in the far-field scene can be realized. Furthermore, the second face imaging size parameter value may be corrected based on gender factors (for example, the interpupillary distance value of an adult male is 60 mm to 73 mm, the interpupillary distance value of an adult female is 55 mm to 68 mm, and different gender face imaging size parameter values may also be different), that is, the gender of the person is identified from the face image, and in the correction, the child face size standard parameter corresponds to the age and the gender, and the adult face size standard parameter corresponds to the gender.
On the basis of the technical solution of the first aspect, this embodiment further specifically proposes another possible design five for extracting the current face multidimensional data when the image acquisition device is a monocular camera; that is, extracting the current face multidimensional data of the person from the face image includes, but is not limited to, the following steps S251 to S254.
And S251, identifying the age of the person according to the face image.
And S252, extracting face two-dimensional coordinate data from the face image.
And S253, extracting a first face imaging size parameter value from the face image.
And S254, calculating a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value the person's face would have if, with the face coordinate position unchanged, the face were perpendicular to the optical axis of the image acquisition device.
In steps S251 to S254, the age may be obtained as in the aforementioned possible design four, the face two-dimensional coordinate data and the first face imaging size parameter value as in the aforementioned possible design one, and the second face imaging size parameter value as in the aforementioned possible design three. By adding a dimension, namely age, to the current face multidimensional data, errors in the final attention discrimination result caused by a large age difference between the on-site person and the tester can be avoided, and the discrimination accuracy is further improved.
Therefore, through the possible design five described in steps S251 to S254, when the image acquisition device is a monocular camera, the face imaging size at the main viewing angle can be used to reflect the distance from the face to the origin of the camera coordinate system, so that the Z-axis coordinate can be replaced when no ranging is possible and the extractability of the current face multidimensional data is ensured. Meanwhile, by adding a dimension, namely age, to the current face multidimensional data, errors in the final attention discrimination result caused by a large age difference between the on-site person and the tester can be avoided, the discrimination accuracy is further improved, and the estimation of the point of attention in far-field scenes is realized. In addition, a further dimension, namely gender, can be added to the current face multidimensional data; that is, the gender of the person is identified from the face image, and the age, the gender, the face two-dimensional coordinate data and the first face imaging size parameter value are then used as the current face multidimensional data.
For similar purposes, the age, the face two-dimensional coordinate data and the first face imaging size parameter value may also be used as the current face multidimensional data; that is, extracting the current face multidimensional data from the face image comprises: identifying the age of the person from the face image; extracting the face two-dimensional coordinate data from the face image; and extracting a first face imaging size parameter value from the face image, wherein the age, the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multidimensional data. In this way, when the image acquisition device is a monocular camera, the distance from the face to the origin of the camera coordinate system can be reflected by the face imaging size, so that it can substitute for the Z-axis coordinate when distance measurement is impossible, ensuring the extractability of the current face multidimensional data; adding the age dimension avoids errors in the final attention discrimination result caused by a large age difference between on-site personnel and test personnel, further improving discrimination accuracy and enabling estimation of the attention point in far-field scenes.
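Purely as an illustrative sketch (not part of the claimed method), the multidimensional data of this design, namely age, face two-dimensional coordinates and a first face imaging size parameter value, could be assembled from a face detector's outputs as follows; the helper names, the use of the mark-frame centre as the coordinate data, and the pupil distance as the size parameter are assumptions:

```python
from dataclasses import dataclass


@dataclass
class FaceMultidimData:
    """Current face multidimensional data: age, face two-dimensional
    coordinates, and the first face imaging size parameter value
    (here, the pupil distance in pixels)."""
    age: int
    face_x: float
    face_y: float
    first_size: float


def build_multidim_data(age, frame_center, left_pupil, right_pupil):
    # The first face imaging size parameter value is taken as the
    # distance between two face key points (the pupils).
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    pupil_distance = (dx * dx + dy * dy) ** 0.5
    return FaceMultidimData(age, frame_center[0], frame_center[1],
                            pupil_distance)


data = build_multidim_data(30, (320.0, 240.0), (300.0, 238.0), (340.0, 242.0))
```

In a real pipeline the age, mark frame and key points would come from the face detection and age estimation models; here they are supplied as literals only to keep the sketch self-contained.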
On the basis of the technical solution of any one of possible designs three to five, this embodiment further proposes a possible design six for how to calculate the second face imaging size parameter value, that is, calculating the second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, which includes, but is not limited to, the following steps S301 to S302.
And S301, importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device.
In step S301, rotating the face onto the plane means rotating the face 3 to squarely face the lens of the image acquisition device 1, and the trigonometric function can be derived by conventional geometric analysis. In the actual calculation, since the change in face pose before and after the rotation is small, the influence of the pitch angle and the roll angle on the face size parameter can be ignored, and the rotation transformation coefficient can then be calculated according to the following formula:
η = sec(θ_yaw)
where η denotes the rotation transformation coefficient, θ_yaw denotes the yaw angle in the current face pose angle, and sec() denotes the secant function.
And S302, calculating the second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient.
In step S302, the result of multiplying the first face imaging size parameter value by the rotation transformation coefficient may be used as the second face imaging size parameter value; for example, a pupil distance value is multiplied by the rotation transformation coefficient.
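Steps S301 to S302 can be sketched as follows; the function names are illustrative, and only the yaw angle is used, per the simplification that pitch and roll may be ignored:

```python
import math


def rotation_transform_coefficient(yaw_deg: float) -> float:
    # eta = sec(theta_yaw); pitch and roll are ignored, as the text
    # notes their influence on the face size parameter is negligible.
    return 1.0 / math.cos(math.radians(yaw_deg))


def second_face_size(first_size: float, yaw_deg: float) -> float:
    # Second (front-view) face imaging size parameter value: the
    # first size parameter multiplied by the rotation coefficient.
    return first_size * rotation_transform_coefficient(yaw_deg)


# A pupil distance of 60 px imaged at a 30-degree yaw corresponds to
# about 69.3 px with the face squarely facing the lens.
print(round(second_face_size(60.0, 30.0), 1))
```

Note that sec() grows without bound as the yaw approaches 90 degrees, so in practice the coefficient is only meaningful for moderate yaw angles at which a face can still be detected.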
Therefore, through the possible design six described in steps S301 to S302, a true and accurate front-view face size parameter can be obtained, ensuring the accuracy of subsequent discrimination.
On the basis of the first aspect and any one of possible designs one to six, this embodiment further proposes a possible design seven for how to accurately obtain the current face pose angle range, that is, determining the current face pose angle range corresponding to the current face multidimensional data according to the correspondence between face multidimensional data and face pose angle ranges, which comprises: importing the current face multidimensional data into a continuous curve fitting function and calculating the current face pose angle range corresponding to the current face multidimensional data, wherein the continuous curve fitting function is obtained by fitting multiple groups of actually measured face pose angle ranges and face multidimensional data.
Therefore, through the possible design seven described above, once a limited set of actually measured face pose angle ranges and face multidimensional data has been obtained, the current face pose angle ranges corresponding to other variable values can be obtained in a refined manner through curve fitting, which reduces the acquisition work while ensuring the accuracy of subsequent discrimination.
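A minimal sketch of this curve-fitting idea, using hypothetical measured samples and only one dimension of the face multidimensional data for readability (a real implementation would fit over all dimensions, and the quadratic form is an assumption):

```python
import numpy as np

# Hypothetical measured samples: one dimension of the face
# multidimensional data (face image x-coordinate) against the measured
# yaw bounds within which test personnel still attended to the target.
x = np.array([100.0, 300.0, 500.0, 700.0, 900.0])
yaw_min = np.array([-40.0, -35.0, -30.0, -22.0, -15.0])
yaw_max = np.array([10.0, 18.0, 25.0, 32.0, 40.0])

# Fit one continuous curve per bound (least-squares quadratic).
fit_min = np.polyfit(x, yaw_min, 2)
fit_max = np.polyfit(x, yaw_max, 2)


def pose_angle_range(face_x: float):
    """Current face pose angle range at an unmeasured coordinate,
    evaluated from the fitted continuous curves."""
    return (float(np.polyval(fit_min, face_x)),
            float(np.polyval(fit_max, face_x)))


low, high = pose_angle_range(400.0)
```

The fitted curves interpolate between the finite measured samples, which is exactly what lets the acquisition work stay limited.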
As shown in fig. 4, a second aspect of the present embodiment provides a virtual device for implementing the method for discriminating a person's attention target object of the first aspect or any one of its possible designs, comprising an image acquisition unit, a data extraction unit, a range determination unit and an attention discrimination unit that are communicatively connected in sequence;
the image acquisition unit is used for acquiring a face image, wherein the face image comprises at least one person;
the data extraction unit is used for extracting the current face pose angle and the current face multidimensional data of the person from the face image;
the range determination unit is used for determining the current face pose angle range corresponding to the current face multidimensional data according to the correspondence between face multidimensional data and face pose angle ranges, wherein the face pose angle range refers to the face pose angle interval, corresponding to the face multidimensional data, within which the target object can be attended to;
and the attention discrimination unit is used for judging that the person pays attention to the target object when the current face pose angle is within the current face pose angle range.
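The attention judgment itself reduces to an interval test; a sketch with illustrative names:

```python
def attends_target(current_pose_angle: float, pose_angle_range) -> bool:
    # The person is judged to attend to the target object when the
    # current face pose angle lies within the current face pose angle
    # range determined from the current face multidimensional data.
    low, high = pose_angle_range
    return low <= current_pose_angle <= high


# A yaw of 5 degrees inside a (-20, 25) degree range counts as attending.
print(attends_target(5.0, (-20.0, 25.0)))
```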
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit and a first size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is in communication connection with the face two-dimensional coordinate extraction subunit, and is configured to extract a first face imaging size parameter value from the face image, where the face two-dimensional coordinate data and the first face imaging size parameter value are used as the current face multidimensional data.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit and a second size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is configured to extract a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is respectively in communication connection with the face two-dimensional coordinate extraction subunit and the first size parameter extraction subunit, and is configured to calculate a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, where the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multi-dimensional data, and the second face imaging size parameter value is a face imaging size parameter value corresponding to a face of a person when the face is perpendicular to an optical axis of an image acquisition device under a condition that a face coordinate position is unchanged.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit, a second size parameter extraction subunit, an age extraction subunit and a third size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is in communication connection with the first size parameter extraction subunit, and is configured to calculate a second face imaging size parameter value according to the current face pose angle and the first face imaging size parameter value, where the second face imaging size parameter value is a face imaging size parameter value corresponding to a face of the person when the face is perpendicular to an optical axis of the image acquisition device under a condition that a face coordinate position is unchanged;
the age extracting subunit is used for identifying the age of the person according to the face image;
and the third size parameter extraction subunit is communicatively connected to the face two-dimensional coordinate extraction subunit, the second size parameter extraction subunit and the age extraction subunit, respectively, and is used for correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter when the age is smaller than a preset age, to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the third face imaging size parameter value are used as the current face multidimensional data.
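A sketch of this age-based correction; the preset age of 14, the standard size values, and the direction of scaling (multiplying by the adult-to-child ratio so that a child's smaller face is not read as a more distant adult face) are assumptions for illustration:

```python
def third_face_size(second_size: float, age: int,
                    child_std: float, adult_std: float,
                    preset_age: int = 14) -> float:
    # Below the preset age, correct the second face imaging size
    # parameter value by the proportional relation between the child
    # and adult face size standard parameters; otherwise return it
    # unchanged.
    if age < preset_age:
        return second_size * (adult_std / child_std)
    return second_size


# Assumed standards: child (age 8) 110 mm vs adult 140 mm face width.
print(round(third_face_size(50.0, 8, child_std=110.0, adult_std=140.0), 2))
```

The child face size standard parameter would be looked up per identified age in practice; a single pair of values is used here only to keep the sketch short.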
In one possible design, the data extraction unit comprises an age extraction subunit, a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit and a second size parameter extraction subunit;
the age extracting subunit is used for identifying the age of the person according to the face image;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the second size parameter extraction subunit is respectively in communication connection with the age extraction subunit, the face two-dimensional coordinate extraction subunit and the first size parameter extraction subunit, and is used for calculating to obtain a second face imaging size parameter value according to the current face attitude angle and the first face imaging size parameter value, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multi-dimensional data, and the second face imaging size parameter value is a face imaging size parameter value corresponding to the face of a person when the face of the person is perpendicular to the optical axis of the image acquisition equipment under the condition that the face coordinate position is unchanged.
In one possible design, the second size parameter extraction subunit comprises a coefficient acquisition sub-subunit and a size calculation sub-subunit that are communicatively connected;
the coefficient acquisition sub-subunit is used for importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device;
and the size calculation sub-subunit is used for calculating the second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient.
In one possible design, the data extraction unit comprises a face two-dimensional coordinate extraction subunit, a first size parameter extraction subunit, an age extraction subunit and a fourth size parameter extraction subunit;
the face two-dimensional coordinate extraction subunit is used for extracting face two-dimensional coordinate data from the face image;
the first size parameter extraction subunit is used for extracting a first face imaging size parameter value from the face image;
the age extracting subunit is used for identifying the age of the person according to the face image;
and the fourth size parameter extraction subunit is communicatively connected to the face two-dimensional coordinate extraction subunit, the first size parameter extraction subunit and the age extraction subunit, respectively, and is used for correcting the first face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter when the age is smaller than a preset age, to obtain a fourth face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the fourth face imaging size parameter value are used as the current face multidimensional data.
In a possible design, the range determining unit is specifically configured to import the current face multidimensional data into a continuous curve fitting function, and calculate to obtain a current face pose angle range corresponding to the current face multidimensional data, where the continuous curve fitting function is obtained by fitting according to multiple groups of measured face pose angle ranges and face multidimensional data.
For the working process, the working details and the technical effects of the foregoing apparatus provided in the second aspect of this embodiment, reference may be made to the method for determining a target object of interest by a person in the first aspect or any one of the possible designs in the first aspect, which is not described herein again.
As shown in fig. 5, a third aspect of the present embodiment provides a computer device for executing the method for discriminating a person's attention target object of the first aspect or any one of its possible designs, comprising a memory and a processor that are communicatively connected, wherein the memory is used for storing a computer program, and the processor is used for reading the computer program and executing the method for discriminating a person's attention target object of the first aspect or any one of its possible designs. For example, the memory may include, but is not limited to, a Random-Access Memory (RAM), a Read-Only Memory (ROM), a Flash Memory, a First-In First-Out memory (FIFO) and/or a First-In Last-Out memory (FILO); the processor may be, but is not limited to, a microprocessor of the STM32F105 family. In addition, the computer device may also include, but is not limited to, a power module, a display screen and other necessary components.
For the working process, working details and technical effects of the foregoing computer device provided in the third aspect of this embodiment, reference may be made to the method for determining a target object of interest of a person in the first aspect or any one of the possible designs in the first aspect, which is not described herein again.
A fourth aspect of the present embodiment provides a computer-readable storage medium storing instructions of the method for discriminating a person's attention target object of the first aspect or any one of its possible designs; when the instructions are run on a computer, the method for discriminating a person's attention target object of the first aspect or any one of its possible designs is executed. The computer-readable storage medium refers to a carrier for storing data, and may include, but is not limited to, a floppy disk, an optical disk, a hard disk, a flash memory, a flash drive and/or a memory stick, and the computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device.
For the working process, the working details and the technical effects of the foregoing computer-readable storage medium provided in the fourth aspect of this embodiment, reference may be made to the first aspect or any one of the methods that may be designed by a human to focus on an object, and details are not described herein again.
A fifth aspect of the present embodiment provides a computer program product comprising instructions which, when run on a computer, cause the computer to execute the method for discriminating a person's attention target object of the first aspect or any one of its possible designs. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device.
The embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
The above examples are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that modifications may be made to the embodiments described above, or equivalents may be substituted for some of their features, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and various other forms of products can be obtained by anyone in light of the present invention. The above detailed description should not be taken as limiting the scope of the invention, which is defined by the claims; the description may be used to interpret the claims.

Claims (4)

1. A method of discriminating a target of interest to a person, comprising:
acquiring a face image, wherein the face image comprises at least one person;
extracting a current face pose angle and current face multidimensional data of the person from the face image;
wherein extracting the current face multidimensional data of the person from the face image comprises any one of the following modes (A) to (E):
(A) Extracting two-dimensional coordinate data of the human face from the human face image; extracting a first face imaging size parameter value comprising an area value of a face image marking frame, an area value of any face key region and/or a distance value between any two face key points from the face image, wherein the face key region and the face key points are detected from the face image, and the face two-dimensional coordinate data and the first face imaging size parameter value are used as current face multi-dimensional data;
(B) Extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; and calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged;
(C) Extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged; identifying the age of the person from the face image; and when the age is smaller than a preset age, correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the third face imaging size parameter value are used as the current face multidimensional data;
(D) Identifying the age of the person from the face image; extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; and calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged;
(E) Extracting two-dimensional coordinate data of the human face from the human face image; extracting a first face imaging size parameter value comprising an area value of a face image marking frame, an area value of any face key region and/or a distance value between any two face key points from the face image, wherein the face key region and the face key points are detected from the face image; identifying the age of the person according to the face image; when the age is smaller than a preset age, correcting the first face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a fourth face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the fourth face imaging size parameter value are used as current face multi-dimensional data;
determining a current face pose angle range corresponding to the current face multidimensional data according to the correspondence between face multidimensional data and face pose angle ranges, wherein the face pose angle range is the face pose angle interval, corresponding to the face multidimensional data, within which the target object can be attended to;
and when the current face pose angle is within the current face pose angle range, judging that the person pays attention to the target object.
2. A device for discriminating a person's attention target object, characterized by comprising an image acquisition unit, a data extraction unit, a range determination unit and an attention discrimination unit that are communicatively connected in sequence;
the image acquisition unit is used for acquiring a face image, wherein the face image comprises at least one person;
the data extraction unit is used for extracting the current face pose angle and the current face multidimensional data of the person from the face image;
wherein extracting the current face multidimensional data of the person from the face image comprises any one of the following modes (A) to (E):
(A) Extracting two-dimensional coordinate data of the human face from the human face image; extracting a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key area and/or a distance value between any two face key points from the face image, wherein the face key area and the face key points are detected from the face image, and the face two-dimensional coordinate data and the first face imaging size parameter value are used as current face multi-dimensional data;
(B) Extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; and calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged;
(C) Extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged; identifying the age of the person from the face image; and when the age is smaller than a preset age, correcting the second face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a third face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the third face imaging size parameter value are used as the current face multidimensional data;
(D) Identifying the age of the person from the face image; extracting face two-dimensional coordinate data from the face image; extracting, from the face image, a first face imaging size parameter value comprising an area value of a face image mark frame, an area value of any face key region and/or a distance value between any two face key points, wherein the face key region and the face key points are detected from the face image; importing the current face pose angle into a trigonometric function reflecting rotation of the face onto a plane, to obtain a rotation transformation coefficient, wherein the plane is perpendicular to the optical axis of the image acquisition device; and calculating a second face imaging size parameter value according to the first face imaging size parameter value and the rotation transformation coefficient, wherein the age, the face two-dimensional coordinate data and the second face imaging size parameter value are used as the current face multidimensional data, and the second face imaging size parameter value is the face imaging size parameter value corresponding to the face of the person being perpendicular to the optical axis of the image acquisition device with the face coordinate position unchanged;
(E) Extracting two-dimensional coordinate data of the human face from the human face image; extracting a first face imaging size parameter value containing an area value of a face image mark frame, an area value of any face key area and/or a distance value between any two face key points from the face image, wherein the face key area and the face key points are detected from the face image; identifying the age of the person according to the face image; when the age is smaller than a preset age, correcting the first face imaging size parameter value according to the proportional relation between the child face size standard parameter and the adult face size standard parameter to obtain a fourth face imaging size parameter value, wherein the child face size standard parameter corresponds to the age, and the face two-dimensional coordinate data and the fourth face imaging size parameter value are used as current face multi-dimensional data;
a range determining unit, configured to determine, according to the correspondence between face multi-dimensional data and face pose angle ranges, the current face pose angle range corresponding to the current face multi-dimensional data, wherein a face pose angle range refers to the interval of face pose angles within which a person with the given face multi-dimensional data is able to pay attention to the target object;
and an attention judging unit, configured to judge that the person is paying attention to the target object when the current face pose angle falls within the current face pose angle range.
3. A computer device, comprising a memory and a processor in communication connection, wherein the memory is configured to store a computer program, and the processor is configured to read the computer program and execute the method for distinguishing a person's attention to a target object according to claim 1.
4. A computer-readable storage medium having stored thereon instructions which, when run on a computer, perform the method for distinguishing a person's attention to a target object according to claim 1.
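As an illustrative sketch only (not part of the claims), the two size-parameter computations above could be realized as follows. The cosine-product form of the rotation transformation coefficient, the use of yaw and pitch as the pose angles, and all function names are assumptions for illustration; the patent specifies only that a trigonometric function of the face pose angle yields the coefficient.

```python
import math

def compensate_pose(first_size: float, yaw_deg: float, pitch_deg: float) -> float:
    """Estimate the "second" face imaging size parameter value from the
    observed "first" value and the current face pose angles, per alternative (D).

    A face rotated away from the camera projects onto the image plane with its
    apparent size shrunk by roughly cos(yaw) * cos(pitch); dividing by this
    rotation transformation coefficient undoes the foreshortening, giving the
    size the face would have if it faced the optical axis directly.
    """
    coeff = math.cos(math.radians(yaw_deg)) * math.cos(math.radians(pitch_deg))
    return first_size / coeff

def correct_child_size(first_size: float,
                       child_std: float,
                       adult_std: float) -> float:
    """Scale a child's observed face size by the adult/child standard-size
    ratio to obtain the "fourth" face imaging size parameter value, per
    alternative (E), so one size-to-distance model serves all ages."""
    return first_size * (adult_std / child_std)
```

For example, a face measuring 50 units at 60° yaw would be compensated to about 100 units (cos 60° = 0.5), and a child face at 80% of the adult standard would be scaled up by a factor of 1.25.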
CN202010762123.4A 2020-07-31 2020-07-31 Method and device for distinguishing person attention target object and computer equipment Active CN111898552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010762123.4A CN111898552B (en) 2020-07-31 2020-07-31 Method and device for distinguishing person attention target object and computer equipment


Publications (2)

Publication Number Publication Date
CN111898552A CN111898552A (en) 2020-11-06
CN111898552B true CN111898552B (en) 2022-12-27

Family

ID=73183070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010762123.4A Active CN111898552B (en) 2020-07-31 2020-07-31 Method and device for distinguishing person attention target object and computer equipment

Country Status (1)

Country Link
CN (1) CN111898552B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112016482B (en) * 2020-08-31 2022-10-25 成都新潮传媒集团有限公司 Method and device for distinguishing false face and computer equipment
CN113128417B (en) * 2021-04-23 2023-04-07 南开大学 Double-region eye movement tracking method based on head posture
CN114067417B (en) * 2021-11-29 2026-03-10 Method and device for judging whether personnel pay attention to target object or not and computer equipment

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156537A (en) * 2010-02-11 2011-08-17 三星电子株式会社 Equipment and method for detecting head posture
CN105654048A (en) * 2015-12-30 2016-06-08 四川川大智胜软件股份有限公司 Multi-visual-angle face comparison method
CN106355147A (en) * 2016-08-26 2017-01-25 张艳 Acquiring method and detecting method of live face head pose detection regression apparatus
CN106503671A (en) * 2016-11-03 2017-03-15 厦门中控生物识别信息技术有限公司 The method and apparatus for determining human face posture
CN107122705A (en) * 2017-03-17 2017-09-01 中国科学院自动化研究所 Face critical point detection method based on three-dimensional face model
CN108363995A (en) * 2018-03-19 2018-08-03 百度在线网络技术(北京)有限公司 Method and apparatus for generating data
CN109961055A (en) * 2019-03-29 2019-07-02 广州市百果园信息技术有限公司 Face critical point detection method, apparatus, equipment and storage medium
CN110163087A (en) * 2019-04-09 2019-08-23 江西高创保安服务技术有限公司 A kind of human face posture recognition methods and system
WO2019206239A1 (en) * 2018-04-27 2019-10-31 Shanghai Truthvision Information Technology Co., Ltd. Systems and methods for detecting a posture of a human object
CN111161395A (en) * 2019-11-19 2020-05-15 深圳市三维人工智能科技有限公司 A method, device and electronic device for tracking facial expression
CN111259739A (en) * 2020-01-09 2020-06-09 浙江工业大学 A Face Pose Estimation Method Based on 3D Face Keypoints and Geometric Projection
CN111353461A (en) * 2020-03-11 2020-06-30 京东数字科技控股有限公司 Method, device and system for detecting attention of advertising screen and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109271923A (en) * 2018-09-14 2019-01-25 曜科智能科技(上海)有限公司 Human face posture detection method, system, electric terminal and storage medium
CN110309751A (en) * 2019-06-24 2019-10-08 火石信科(广州)科技有限公司 A recognition method for reading and writing postures in reading and writing scenes
CN111160178B (en) * 2019-12-19 2024-01-12 深圳市商汤科技有限公司 Image processing method and device, processor, electronic equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on the Technology of Extracting 3D Face Feature Points on Basis of Binocular Vision; Zhang Lei et al.; IEEE; 20091030; 1-4 *
Visual analysis of learning attention based on single-image PnP head pose estimation; Chen Ping et al.; Journal on Communications (通信学报); 20180930; 147-156 *

Also Published As

Publication number Publication date
CN111898552A (en) 2020-11-06

Similar Documents

Publication Publication Date Title
CN111414798B (en) Head posture detection method and system based on RGB-D image
JP6681729B2 (en) Method for determining 3D pose of object and 3D location of landmark point of object, and system for determining 3D pose of object and 3D location of landmark of object
WO2022121283A1 (en) Vehicle key point information detection and vehicle control
TWI383325B (en) Face expressions identification
CN105740780B (en) Method and device for detecting living human face
CN111445531B (en) Multi-view camera navigation method, device, equipment and storage medium
CN111898553B (en) Method and device for distinguishing virtual image personnel and computer equipment
CN111898552B (en) Method and device for distinguishing person attention target object and computer equipment
CN116229007B (en) Four-dimensional digital image construction method, device, equipment and medium using BIM modeling
CN111563449A (en) Real-time classroom attention detection method and system
CN118411507A (en) A method and system for constructing a semantic map of a scene with dynamic targets
CN104063689B (en) Face image identification method based on binocular stereoscopic vision
CN113516074A (en) An anti-cheating method for online examination system based on pupil tracking
JP7830521B2 (en) Method for correcting facial deformation in facial depth images, imaging apparatus, and storage medium
CN110942092A (en) Graphic image recognition method and recognition system
CN115272417A (en) Image data processing method, image processing device, and readable storage medium
CN115309113A (en) Guiding method for part assembly and related equipment
CN113569594B (en) Method and device for labeling key points of human face
CN112924037A (en) Infrared body temperature detection system and detection method based on image registration
Jiménez et al. Face tracking and pose estimation with automatic three-dimensional model construction
CN112818866B (en) Vehicle positioning methods, devices and electronic equipment
KR20230138011A (en) Annotation of 2D images
CN114663507A (en) Distance detection method and device, electronic equipment and computer storage medium
CN114299131B (en) Low obstacle detection method, device, and terminal device based on three cameras
CN120747965B (en) Finger joint point marking method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant