CN109034138B - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN109034138B
CN109034138B (application CN201811055938.8A)
Authority
CN
China
Prior art keywords
point
points
data
image
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811055938.8A
Other languages
Chinese (zh)
Other versions
CN109034138A (en)
Inventor
曾增
陈陆义
王春洁
贺文强
李雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Fenghua Intelligent Technology Co ltd
Original Assignee
Hunan Visualtouring Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Visualtouring Information Technology Co ltd filed Critical Hunan Visualtouring Information Technology Co ltd
Priority to CN201811055938.8A priority Critical patent/CN109034138B/en
Publication of CN109034138A publication Critical patent/CN109034138A/en
Application granted granted Critical
Publication of CN109034138B publication Critical patent/CN109034138B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G06V40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 - Manipulating 3D models or images for computer graphics
    • G06T19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Architecture (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides an image processing method and device. In one embodiment, the image processing method includes: obtaining three-dimensional feature point data of each point in a target image; acquiring a first region of interest corresponding to the nose in the target image; traversing and differentiating each row of data in the first region of interest to obtain a plurality of positioning points of the nasal ala in each row of data; screening a target positioning point from the plurality of positioning points as an output result of nasal ala recognition; and obtaining the recognition result of the target image according to the output result.

Description

Image processing method and device
Technical Field
The invention relates to the field of data processing, in particular to an image processing method and device.
Background
Detection of key facial feature points, also called face localization, locates the facial organs in given face image data and describes their positions with marked feature points. Mainstream methods include deep-learning-based approaches, cascaded shape regression, and the like. However, because of the diversity of human faces, existing recognition techniques often cannot produce reliable recognition results.
Disclosure of Invention
In view of the above, an object of the embodiments of the present invention is to provide an image processing method and apparatus.
The image processing method provided by the embodiment of the invention comprises the following steps:
obtaining three-dimensional characteristic point data of each point in a target image;
acquiring a first region of interest corresponding to the nose in the target image, traversing and differentiating each row of data in the first region of interest to obtain a plurality of positioning points of the nasal ala in each row of data, and screening a target positioning point from the plurality of positioning points as an output result of nasal ala recognition; and/or acquiring a second region of interest above the eyebrows in the target image, acquiring the corresponding fluctuation point in each column of data in the second region of interest, and obtaining the forehead boundary of the person in the target image from the plurality of fluctuation points; and
obtaining the recognition result of the target image according to the output result and/or the forehead boundary.
An embodiment of the present invention further provides an image processing apparatus, including:
the acquisition module is used for acquiring three-dimensional feature point data of each point in the target image;
the first acquisition module is used for acquiring a first region of interest corresponding to the nose part in the target image;
the traversing module is used for traversing and deriving each row of data in the first region of interest to obtain a plurality of positioning points of the alar part in each row of data;
the first screening module is used for screening a target positioning point from the plurality of positioning points as an output result of the identification of the nasal alar part;
and the identification module is used for obtaining an identification result of the target image according to the output result.
Compared with the prior art, the image processing method and device provided by the embodiments of the invention introduce three-dimensional data into face recognition through the three-dimensional feature point data acquired in the first region of interest of the target image. Compared with face recognition based on a two-dimensional image, the nose width and nose height information can be applied to the recognition, so the recognition of the nasal alae of the face is more reliable.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic view of an electronic terminal according to an embodiment of the present invention.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of the positions of key feature points in a face image in an example.
Fig. 4a is a schematic view of the first region of interest taken from a gray scale image in one example.
Fig. 4b is a schematic view of a first region of interest taken from a depth map in one example.
Fig. 4c is a schematic diagram of the first region of interest extracted from the depth map in one example.
Fig. 5a is a schematic diagram of fluctuation in one line of data in the first region of interest in one example.
Fig. 5b is a schematic diagram of the derivative corresponding to the fluctuation in one row of data in the first region of interest in one example.
Fig. 6 is a partial flowchart of an image processing method according to another embodiment of the present invention.
Fig. 7 is a detailed flowchart of step S304 of an image processing method according to another embodiment of the present invention.
Fig. 8a is a schematic diagram of extracted skeleton lines in an example provided by the embodiment of the present invention.
Fig. 8b is a diagram illustrating data along a selected edge of an earlobe according to an example provided by an embodiment of the present invention.
Fig. 8c is a schematic diagram of data along the edge of an earlobe at another selected angle in one example provided by an embodiment of the present invention.
Fig. 9 is a partial flowchart of an image processing method according to still another embodiment of the present invention.
Fig. 10a is a schematic diagram of depth map data corresponding to face data in an example provided by an embodiment of the present invention.
Fig. 10b is a schematic view of a second region of interest in the corresponding example of fig. 10 a.
Fig. 10c is a depth value variation plot of the column in which the black data of the face image is widest in one example.
Fig. 11 is a detailed flowchart of step S403 in the image processing method according to still another embodiment of the present invention.
Fig. 12 is a schematic functional block diagram of an image processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a block diagram of an electronic terminal 100. The electronic terminal 100 includes an image processing apparatus 110, a memory 111, a memory controller 112, a processor 113, a peripheral interface 114, an input/output unit 115, and a display unit 116. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the electronic terminal 100. For example, the electronic terminal 100 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1. The electronic terminal 100 described in this embodiment may be a computing device having an image processing capability, such as a personal computer, an image processing server, or a mobile electronic device.
The memory 111, the memory controller 112, the processor 113, the peripheral interface 114, the input/output unit 115 and the display unit 116 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The image processing device 110 includes at least one software function module which can be stored in the memory 111 in the form of software or Firmware (Firmware) or solidified in an Operating System (OS) of the electronic terminal 100. The processor 113 is configured to execute an executable module stored in the memory, such as a software functional module or a computer program included in the image processing apparatus 110.
The Memory 111 may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 111 is configured to store a program, and the processor 113 executes the program after receiving an execution instruction; the method defined by the processes disclosed in any embodiment of the present invention and executed by the electronic terminal 100 may be applied to, or implemented by, the processor 113.
The processor 113 may be an integrated circuit chip having signal processing capabilities. The Processor 113 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The peripheral interface 114 couples various input/output devices to the processor 113 and memory 111. In some embodiments, the peripheral interface 114, the processor 113, and the memory controller 112 may be implemented in a single chip. In other examples, they may each be implemented as separate chips.
The input/output unit 115 is used for a user to input data. The input/output unit 115 may be, but is not limited to, a mouse, a keyboard, and the like.
The display unit 116 provides an interactive interface (e.g., a user operation interface) between the electronic terminal 100 and a user, or is used to display image data for the user's reference. In this embodiment, the display unit may be a liquid crystal display or a touch display. In the case of a touch display, it may be a capacitive or resistive touch screen supporting single-point and multi-point touch operations, which means that the touch display can sense touch operations generated simultaneously at one or more positions on it and hand the sensed touch operations to the processor for calculation and processing.
At present, open-source face key point algorithm libraries with good performance include the dlib library and Shiqiyu's libfacedetection library, but neither meets high-precision requirements at the feature points. For different precision requirements, a large amount of time must be spent on algorithm training and debugging, which is costly; meanwhile, the data obtained from two-dimensional images cannot effectively meet the requirements of the three-dimensional world.
Please refer to fig. 2, which is a flowchart illustrating an image processing method applied to the electronic terminal shown in fig. 1 according to an embodiment of the present invention. The specific process shown in fig. 2 will be described in detail below.
Step S201, three-dimensional feature point data of each point in the target image is obtained.
In this embodiment, the target image at least includes a depth image and an RGB image of a human face.
In this embodiment, the three-dimensional point cloud may be obtained based on coded structured light and the binocular stereoscopic vision principle. Specifically, the acquired three-dimensional point cloud data is back-projected into the pixel coordinate system of the RGB camera using the intrinsic parameters of the RGB camera, so as to obtain a depth image whose pixels correspond one to one with those of the RGB image. The way the depth image and the three-dimensional point cloud are obtained is not limited to the above; other approaches are possible, for example a Kinect may acquire the depth image first, and the corresponding three-dimensional point cloud data may then be computed from the camera intrinsic parameters. Likewise, the two-dimensional pixel coordinates in the depth image of the face can be converted into the corresponding three-dimensional point coordinates.
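A minimal sketch of this back-projection, assuming a simple pinhole camera model; the function names and the intrinsic parameters fx, fy, cx, cy are illustrative assumptions standing in for whatever calibration the system actually uses:

```python
import numpy as np

def point_cloud_to_depth(points_xyz, fx, fy, cx, cy, width, height):
    """Back-project 3D points (N x 3, camera coordinates) into a depth image."""
    depth = np.zeros((height, width), dtype=np.float32)
    for x, y, z in points_xyz:
        if z <= 0:
            continue
        u = int(round(fx * x / z + cx))          # pixel column
        v = int(round(fy * y / z + cy))          # pixel row
        if 0 <= u < width and 0 <= v < height:
            # if several points land on the same pixel, keep the nearest one
            if depth[v, u] == 0 or z < depth[v, u]:
                depth[v, u] = z
    return depth

def pixel_to_point(u, v, z, fx, fy, cx, cy):
    """Inverse mapping: a depth-image pixel (u, v) with depth z back to 3D coordinates."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```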
Specifically, face feature points are detected in the RGB image, and the three-dimensional feature point data is formed by combining the two-dimensional RGB feature points with the depth image to compute three-dimensional coordinates. In one example, the dlib face detection library may be used on the RGB face image to obtain two-dimensional feature points, and the three-dimensional feature point data is computed from the pixel coordinates of these feature points in the RGB image, the values at the corresponding positions in the depth map, and the camera parameters. The face feature points identify key positions of the face, such as the nose tip, the chin, and the eye corners.
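A sketch of this landmark-to-3D step, assuming the depth map is registered pixel-to-pixel with the RGB image; the dlib 68-point model file name below is an assumption and must be supplied separately:

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# 68-point shape predictor; the file name/path is an assumption
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_3d(rgb, depth, fx, fy, cx, cy):
    """Return the 68 face feature points as (x, y, z) camera coordinates, or None."""
    faces = detector(rgb, 1)
    if not faces:
        return None
    shape = predictor(rgb, faces[0])
    pts = []
    for i in range(68):
        u, v = shape.part(i).x, shape.part(i).y   # 2D feature point in the RGB image
        z = float(depth[v, u])                    # assumes a valid depth at that pixel
        pts.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return np.array(pts)
```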
In this embodiment, the three-dimensional feature point data of the human face in the target image may also be obtained with a structured light three-dimensional measurement system. The three-dimensional data corresponding to each pixel is converted into an image to obtain the depth map acquired by the camera, where each pixel corresponds to a point of the point cloud and its value reflects the variation of the three-dimensional information of the face in the target image.
Furthermore, three-dimensional information of the target image can be acquired through TOF technology, laser radar scanning and the like.
Step S202, a first region of interest corresponding to the nose part in the target image is acquired.
In one embodiment, the key feature point data obtained with the dlib face detection library is shown in fig. 3, which gives the positions of 68 key feature points and the number of each key feature point. The positions of key feature point 1 through key feature point 68 are shown in fig. 3. As can be seen in fig. 3, key feature points 29-35 identify the nose region in the face image. Using the positions of points 29-35 among the 68 key feature points, the first region of interest corresponding to the nose can be extracted with the Rect construction of the OpenCV library: the coordinates of the three-dimensional feature point data in the target image and the length and width of the rectangle are passed to Rect to obtain the first region of interest corresponding to the nose in the target image. As shown in fig. 4a and 4b, fig. 4a is a schematic diagram of the first region of interest taken from a gray scale image in one example, and fig. 4b is a schematic diagram of the first region of interest taken from a depth map in one example. Further details of the dlib face detection library can be found at http://dlib.
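A sketch of the region-of-interest extraction, assuming dlib's 0-based landmark numbering (so the patent's points 29-35 are indices 28-34) and an illustrative margin; OpenCV's boundingRect plays the role of the Rect construction described above:

```python
import cv2
import numpy as np

def nose_roi(depth, landmarks_2d, margin=5):
    """Crop the first region of interest around the nose from the depth map."""
    nose_pts = np.asarray(landmarks_2d[28:35], dtype=np.int32)   # points 29-35
    x, y, w, h = cv2.boundingRect(nose_pts)
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    return depth[y0:y + h + margin, x0:x + w + margin]
```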
Step S203, performing traversal derivation on each line of data in the first region of interest to obtain a plurality of positioning points of the alar part of each line of data.
In this embodiment, step S203 includes: calculating the first derivative of each row of data; and acquiring the position of the valley point of the first derivative for that row, and taking the valley point as a positioning point.
In this embodiment, the width of the first region of interest is denoted w, and the region is divided into two halves, 0 to w/2 and w/2 to w. Because the depth image reflects the surface depth variation of the three-dimensional model, there is a large depth change between the nasal ala and the cheek area. Using this as the criterion, the first derivative of each row of data is computed; the trough position of the derivative is found and determined as a positioning point. As shown in fig. 5a, fig. 5a is a schematic diagram of the fluctuation in one row of data of the first region of interest in one example, where the abscissa is the horizontal coordinate of each point in that row and the ordinate is the gray value at the corresponding position. Fig. 5b is a schematic diagram of the derivative corresponding to that row; a distinct valley can be seen in the data shown in fig. 5b, and the valley point can be taken as the corresponding nasal ala position point.
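A minimal sketch of the per-row derivative and trough search, assuming the ROI is a single-channel array; the left and right halves are handled separately as described above:

```python
import numpy as np

def alar_candidates_per_row(roi):
    """For each row, return the left and right trough positions of the first derivative."""
    h, w = roi.shape
    candidates = []
    for v in range(h):
        d = np.gradient(roi[v].astype(np.float32))       # first derivative along the row
        left = int(np.argmin(d[: w // 2]))               # deepest trough, left half
        right = w // 2 + int(np.argmin(d[w // 2:]))      # deepest trough, right half
        candidates.append(((left, v), (right, v)))
    return candidates
```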
Before step S203, the white area of the first region of interest extracted from the depth map in the example of fig. 4b, which represents an area without three-dimensional points, may be processed: for convenience of subsequent calculation, skeleton extraction is performed on the white area and one skeleton line is retained. Specifically, as shown in fig. 4c, fig. 4c is a schematic diagram of the first region of interest extracted from the depth map in one example. Further, step S203 may be performed on the first region of interest after the white area has been skeletonized and one skeleton line retained.
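A sketch of the skeleton-extraction step, assuming pixels without three-dimensional points are marked by a zero depth value; scikit-image's skeletonize is used here as one possible implementation, not necessarily the one used in the patent:

```python
import numpy as np
from skimage.morphology import skeletonize

def skeleton_of_invalid_area(roi):
    """Reduce the white area (no 3D points, assumed depth == 0) to a one-pixel skeleton line."""
    invalid = roi == 0
    return skeletonize(invalid)   # boolean mask, True only on the retained skeleton line
```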
And step S204, screening out a target positioning point from the plurality of positioning points as an output result of the identification of the nasal alar.
It will be appreciated that a person's nasal alae are spaced more widely than the rest of the nose. Therefore, the two points with the widest horizontal distance among the plurality of positioning points can be taken as the nasal ala position points.
In one embodiment, the distances between the two positioning points in each row are calculated, and the two positioning points with the widest distance are selected as the output result of the nasal ala recognition.
In this embodiment, two positioning points, at least one positioning point on the left side and at least one positioning point on the right side, may be found in each line of data in the first region of interest.
In another embodiment, the distance between two positioning points in each row is calculated, two positioning points with the widest distance and two positioning points with the second widest distance are selected, and the average value of the screened data is calculated and used as the output result of the alar identification.
In this embodiment, two positioning points, one positioning point on the left side and one positioning point on the right side, may be found in each line of data in the first region of interest.
In other embodiments, several of the widest pairs of positioning points may be screened out, and the nasal ala positioning points obtained by averaging them.
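A sketch of the screening step covering the two variants above (widest pair, or average of the widest and second-widest pairs); the input is the per-row candidate list from the previous step:

```python
import numpy as np

def screen_alar_points(candidates, use_second_widest=False):
    """Pick the nasal ala output points from per-row ((uL, v), (uR, v)) candidate pairs."""
    widths = [r[0] - l[0] for l, r in candidates]
    order = np.argsort(widths)[::-1]                 # rows sorted widest-first
    best = candidates[order[0]]
    if not use_second_widest or len(order) < 2:
        return best                                   # widest pair as the output result
    second = candidates[order[1]]
    left = tuple(np.mean([best[0], second[0]], axis=0))
    right = tuple(np.mean([best[1], second[1]], axis=0))
    return left, right
```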
The corresponding points on the three-dimensional point cloud can be obtained from the feature point coordinates acquired from the image and the depth map. Introducing three-dimensional data into face recognition allows the nose width and nose height information to be used, which is more reliable than face recognition based on a two-dimensional image.
And S205, obtaining the identification result of the target image according to the output result.
In this embodiment, the image recognition method may be used for recognizing a human face; the method can also be used for screening target people; but also for the classification of race, etc.
In another embodiment, as shown in fig. 6, before step S205, the method further comprises the following steps.
Step S301, acquiring, from the target image, a plurality of point cloud data whose longitudinal difference from the nasal ala position point found in the output result is smaller than a preset value.
Using any calculated three-dimensional coordinate (x, y, z) of the nasal ala, the whole face point cloud model in the target image is traversed to find all point cloud data satisfying |yi - y| < the preset value. In this embodiment, the preset value may be in the range 0.4-0.8 mm, for example 0.5 mm.
And S302, screening the point cloud data according to the empirical parameters of the positions of the points near the earlobe to obtain the point cloud data near the earlobe.
The empirical parameters may include, among other things, a defined parameter for the lateral extent of the x-axis.
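A minimal sketch of steps S301 and S302, assuming the point cloud is an N x 3 array in millimetres; the 0.5 mm tolerance follows the example above, while the lateral x limit is an assumed empirical parameter:

```python
import numpy as np

def slice_near_alar(cloud_xyz, alar_xyz, tol=0.5):
    """Step S301: keep points whose y coordinate is within tol of the alar point's y."""
    mask = np.abs(cloud_xyz[:, 1] - alar_xyz[1]) < tol
    return cloud_xyz[mask]

def screen_near_earlobe(slice_xyz, x_min=60.0):
    """Step S302: keep points beyond a lateral x limit (the 60 mm value is an assumption)."""
    return slice_xyz[np.abs(slice_xyz[:, 0]) > x_min]
```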
And step S303, processing the point cloud data near the earlobe to obtain a depth map, and obtaining a plan map corresponding to the point cloud data near the earlobe.
In this embodiment, the empirical parameters for the positions near the earlobe may be applied to the plurality of point cloud data to reduce the amount of data.
Further, the screened three-dimensional point data is back-projected into the pixel coordinate system of the camera, using the correspondence between spatial three-dimensional points and the camera image coordinate system, to obtain the plan view.
And step S304, determining the position of the point near the earlobe according to the plan.
Step S205 may be replaced by step S305: obtaining the recognition result of the target image according to the position of the point near the earlobe and the output result.
In this embodiment, as shown in fig. 7, the step S304 includes the following steps.
Step S3041, performing a close operation on the discrete points in the plane map to obtain a continuous point set.
Step S3042, extracting the continuous point sets to obtain an ear lobe edge line.
In one embodiment, the closing operation is implemented with a combination of the dilate and erode functions in OpenCV, and the skeleton line of the data is then extracted. Fig. 8a is a schematic diagram of the extracted skeleton line in one example.
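A sketch of this closing-and-thinning step; MORPH_CLOSE in OpenCV is exactly a dilation followed by an erosion, the 5x5 kernel size is an assumption, and cv2.ximgproc.thinning requires the opencv-contrib package (any skeletonization would serve):

```python
import cv2

def earlobe_edge_line(plan_view):
    """Close gaps between the discrete projected points, then thin the result to an edge line."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    closed = cv2.morphologyEx(plan_view, cv2.MORPH_CLOSE, kernel)   # dilate, then erode
    return cv2.ximgproc.thinning(closed)                            # one-pixel-wide skeleton line
```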
Step S3043, obtaining a turning point position of the edge line of the earlobe, and taking the turning point position as a position near the earlobe.
Because a human face is an irregular curved surface, the selected point cloud near the earlobe, once projected into the image coordinate system of the side camera, yields a section of broken line as shown in the figure. Analyzing the data variation, the boundary between the side face and the ear turns there, and the peak position of the data variation is determined as the point near the earlobe.
The method in the embodiment has obvious advantages for identifying the earlobe in the side face.
In this embodiment, the algorithm for positioning the point near the earlobe may also be implemented in the following manner:
in this embodiment, the target image includes images taken at a plurality of angles. The collected data may include RGB image information and three-dimensional point cloud information. For example, images may be taken from three directions. Wherein the three directions may include a front, a left side, and a right side, respectively.
In this embodiment, a complete three-dimensional point cloud data can be obtained from the three-dimensional point cloud information acquired in three directions through a stitching algorithm. In one embodiment, the stitching of the three-dimensional point cloud information may be referenced to an intermediate camera.
Furthermore, the stitched three-dimensional model can be adjusted. The alignment operation brings the model corresponding to the complete three-dimensional point cloud data approximately to a frontal, upright pose. In one example, the nose tip point may be selected as the origin of the coordinate axes and moved by a set length along the negative z-axis; the nose tip point can also be understood as the three-dimensional coordinate corresponding to the 30th of the 68 feature points of the face image. The set length may be 100 mm.
In this embodiment, according to the three-dimensional feature point data of the nasal ala obtained in step S204, and assuming that the three-dimensional feature point on one side of the nasal ala has coordinates (x0, y0, z0), the three-dimensional data is sliced horizontally through this feature point: all point clouds whose coordinates fall within [y0 - t, y0 + t] (t being a small threshold) are screened from the full point cloud data to obtain cross-section data, which is taken as the edge line of the earlobe. Figures 8b and 8c show the data along the selected earlobe edge in one example.
In this embodiment, as shown in fig. 9, before step S205, the method further includes the following steps.
Step S401, a second region of interest on the upper side of the eyebrow in the target image is obtained.
In this embodiment, a second region of interest above the eyebrow is extracted based on data of the eyebrow portions of the 68 key feature points originally detected in the target image.
In one example, the person in the target image wears a headband, and in the depth image the headband merges with the forehead.
In an example, as shown in fig. 10a, fig. 10a is a schematic diagram of depth map data corresponding to face data in an example provided by the embodiment of the present invention. Wherein the black part in the figure represents valid data, i.e. the color is different from the background color; white is invalid data, relatively close to background. In one example, the data of the black portion in the diagram can be taken for computational analysis.
Further, the second region of interest is acquired. Taking the example shown in fig. 10a as an example, extracting the second region of interest of the face image can obtain partial depth map data, that is, the partial depth map shown in fig. 10 b.
Step S402, acquiring the fluctuation point corresponding to each column of data in the second region of interest.
In this embodiment, because there is a thickness change at the boundary between the headband and the forehead, the corresponding depth map data shows an obvious fluctuation there. Since the part above a person's forehead is relatively wider than other parts of the face, the column with the widest black data shown in fig. 10b may be selected for analysis; fig. 10c shows the depth value variation of that column. In fig. 10c, the abscissa values 150, 200, 250, 300, 350 and 400 identify the vertical positions of the points in the widest column of black data, and the ordinate values 4350, 4400, 4450 and 4500 identify their depth values. The position of the obvious peak in the figure can be judged to be the junction of the forehead and the headband; the peak shown in fig. 10c can be understood as the fluctuation point of the widest column of black data. The corresponding fluctuation points in the other columns of the second region of interest can be obtained in the same way.
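A sketch of the per-column peak search, using scipy's find_peaks as one possible implementation; the prominence threshold is an illustrative assumption in depth units:

```python
import numpy as np
from scipy.signal import find_peaks

def column_fluctuation_point(depth_column, prominence=20.0):
    """Locate the fluctuation point (forehead/headband junction) in one column of depth values."""
    peaks, props = find_peaks(depth_column.astype(np.float32), prominence=prominence)
    if len(peaks) == 0:
        return None
    return int(peaks[np.argmax(props["prominences"])])   # most prominent peak
```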
Further, the position of the intersection line of the forehead and the headband corresponds to the position where a distinct peak appears in the column data of the second region of interest.
And step S403, obtaining the forehead boundary of the person in the target image according to the plurality of fluctuation points.
Step S205 may be replaced by step S404: obtaining the recognition result of the target image according to the forehead boundary and the output result.
Further, steps S401 to S403 may not be executed on the basis of steps S201 to S205, that is, the image processing method may include the steps of:
obtaining three-dimensional characteristic point data of each point in a target image;
acquiring a second region of interest above the eyebrows in the target image;
acquiring the corresponding fluctuation point in each column of data in the second region of interest;
obtaining the forehead boundary of a person in the target image according to the plurality of fluctuation points; and
and obtaining the recognition result of the target image according to the forehead boundary.
For details of the above steps, reference may be made to the foregoing description, which is not repeated herein.
In this embodiment, as shown in fig. 11, the step S403 includes the following steps.
Step S4031, detecting the fluctuation points corresponding to each of the selected columns, and removing the data points whose deviation (in terms of standard deviation) is excessively large from the plurality of fluctuation points to obtain target fluctuation points.
In this embodiment, the widest column of data and the next widest columns may be taken; assuming N columns are selected, peak detection is performed on each column of data to screen out the most suitable peak position, so that the N columns of data yield fluctuation points corresponding to N peaks.
Further, the standard deviation of the N data is analyzed and the data with excessive deviation are rejected. In this embodiment, a reference column (assumed to be column n0) and the boundary point between the forehead and the headband corresponding to that column are obtained.
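A sketch of the standard-deviation screening in step S4031; the 2-sigma cutoff is an illustrative assumption:

```python
import numpy as np

def reject_outlier_fluctuations(columns, peaks, k=2.0):
    """Drop fluctuation points that deviate too far from the others."""
    peaks = np.asarray(peaks, dtype=np.float32)
    keep = np.abs(peaks - peaks.mean()) <= k * peaks.std()
    return [c for c, ok in zip(columns, keep) if ok], peaks[keep]
```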
Step S4032, traverse comparison is performed on the target fluctuation points corresponding to all the columns in the second region of interest, and a point where the difference value between the target fluctuation points and the adjacent target fluctuation points exceeds a set empirical threshold is removed, so as to obtain a forehead boundary.
Starting from a reference column (denoted column n0), the extracted second region of interest is traversed in two directions, where 0 and w denote the two edge columns in the horizontal direction; that is, the region is traversed both from n0 to w and from n0 to 0. Since the boundary line between the headband and the forehead is continuous, the boundary points of adjacent columns fluctuate only within a certain range, and an empirical threshold is set for the traversal search. Finally, data with the headband depth values removed is obtained, and the forehead point positions can be determined directly from the edge line. In this embodiment, the empirical threshold may be 9-15 pixels wide, for example 12 pixels wide.
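A sketch of the bidirectional traversal in step S4032, assuming a mapping from column index to detected peak position; the 12-pixel threshold follows the example above:

```python
def trace_forehead_boundary(peak_by_column, n0, width, threshold=12):
    """Traverse from the reference column n0 towards both edges, keeping only boundary
    points whose jump from the previously accepted column stays within the threshold."""
    boundary = {n0: peak_by_column[n0]}
    for cols in (range(n0 + 1, width), range(n0 - 1, -1, -1)):
        prev = boundary[n0]
        for c in cols:
            peak = peak_by_column.get(c)
            if peak is None or abs(peak - prev) > threshold:
                continue                        # reject points that jump too far
            boundary[c] = peak
            prev = peak
    return boundary
```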
Existing forehead recognition uses a skin-colour extraction algorithm to detect the boundary between the hair and the forehead, but that approach is limited by illumination (uneven lighting, reflections, shadows and so on), and if the subject wears hair accessories of different colours, skin-colour extraction is also strongly affected. The forehead recognition in this embodiment, by contrast, avoids the case where the headband cannot be recognized: it starts from the depth map, removes the data of the hair accessory, and can better locate the boundary between the forehead and the hair.
According to the image processing method, three-dimensional data is introduced into face recognition through the three-dimensional feature point data acquired in the first region of interest of the target image; compared with face recognition based on a two-dimensional image, the nose width and nose height information can be applied, so the face recognition is more reliable.
In another embodiment, the method may further include steps S201 to S204, S301 to S304, and S401 to S403; after these steps are executed, the target image is located and recognized according to the position of the point near the earlobe, the forehead boundary, and the output result to obtain a face recognition result.
Fig. 12 is a schematic diagram of functional modules of the image processing apparatus 110 shown in fig. 1 according to an embodiment of the present invention. The modules and units of the image processing apparatus 110 in the present embodiment are used for executing the steps in the above method embodiments. The image processing apparatus 110 includes: the device comprises an obtaining module 1101, a first obtaining module 1102, a traversing module 1103, a first screening module 1104 and an identifying module 1105.
The obtaining module 1101 is configured to obtain three-dimensional feature point data of each point in the target image.
The target image includes a depth image and an RGB image whose points correspond one to one, and the obtaining module 1101 is further configured to:
calculating to obtain three-dimensional point cloud data corresponding to the depth image according to the depth image and the internal parameters of the camera for obtaining the depth image;
carrying out human face characteristic point detection on the RGB image of the human face to obtain human face characteristic points in the RGB image;
and matching the human face feature points with the three-dimensional point cloud data to obtain three-dimensional feature point data.
The first obtaining module 1102 is configured to obtain a first region of interest corresponding to a nose portion in the target image.
The traversing module 1103 is configured to traverse and derive each row of data in the first region of interest to obtain a plurality of positioning points of the alar part in each row of data.
The first screening module 1104 is configured to screen a target positioning point from the plurality of positioning points as an output result of the alar identification.
The identifying module 1105 is configured to perform positioning identification on the target image according to the output result to obtain a face identification result.
In this embodiment, the image processing apparatus 110 may further include: a second acquisition module 1106, a second filtering module 1107, a processing module 1108, and a determination module 1109.
The second obtaining module 1106 is configured to obtain, from the target image, a plurality of point cloud data whose longitudinal difference from the nasal ala position point found in the output result is smaller than a preset value.
The second screening module 1107 is configured to screen the plurality of point cloud data according to the empirical parameters of the positions of the points near the earlobe to obtain point cloud data near the earlobe.
The processing module 1108 is configured to process the point cloud data near the earlobe to obtain a depth map, and obtain a plan view corresponding to the point cloud data near the earlobe.
The determining module 1109 is configured to determine a position of a point near an earlobe according to the plan view.
The identification module 1105 is further for: and positioning and identifying the target image according to the position of the ear lobe nearby point and the output result to obtain a face identification result.
In this embodiment, the determining module 1109 is further configured to perform a close operation on the discrete points in the plane map to obtain a continuous point set, extract the continuous point set to obtain an edge line of the ear lobe, obtain a turning point position of the edge line of the ear lobe, and use the turning point position as a position near the ear lobe.
In this embodiment, the image processing apparatus 110 may further include a third acquisition module, a fourth acquisition module, and an obtaining module.
The third acquisition module is configured to acquire a second region of interest above the eyebrows in the target image.
The fourth acquisition module is configured to acquire the fluctuation point corresponding to each column of data in the second region of interest.
The obtaining module is used for obtaining the forehead boundary of the person in the target image according to the plurality of fluctuation points.
The identification module 1105 is further for: and positioning and identifying the target image according to the forehead boundary and the output result to obtain a face identification result.
In this embodiment, the obtaining module is further configured to detect the fluctuation points corresponding to each of the selected columns, remove data points with an excessively large deviation from the plurality of fluctuation points to obtain target fluctuation points, traverse and compare the target fluctuation points corresponding to all the columns in the second region of interest, and remove points whose difference from adjacent target fluctuation points exceeds a set empirical threshold, so as to obtain the forehead boundary.
In this embodiment, the traversing module 1103 is further configured to calculate a first derivative of each row of data, obtain a position of a valley point of the first derivative corresponding to the row, and use the valley point as an anchor point.
In this embodiment, the first filtering module 1104 is further configured to calculate distances between two positioning points in each row, and select two positioning points with the widest distance as an output result of the alar identification.
In this embodiment, the first filtering module 1104 is further configured to calculate distances between two positioning points in each row, select two positioning points with the widest distance and two positioning points with the second widest distance, and calculate an average value of the filtered data as an output result of the alar identification.
For other details of the present embodiment, reference may be further made to the description of the above method embodiment, which is not repeated herein.
According to the image processing device, the three-dimensional data information is introduced into the face recognition through the three-dimensional feature point data in the first region of interest acquired on the target image, and compared with the face recognition of a two-dimensional image, the nose width size information and the nose height information can be applied to the face recognition, so that the recognition of the nose wing of the face has higher reliability.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. An image processing method, comprising:
obtaining three-dimensional feature point data of each point in a target image; acquiring a first region of interest corresponding to the nose in the target image, traversing and differentiating each row of data in the first region of interest to obtain a plurality of positioning points of the nasal ala in each row of data, and screening a target positioning point from the plurality of positioning points as an output result of nasal ala recognition; and/or acquiring a second region of interest above the eyebrows in the target image, acquiring the fluctuation point corresponding to each column of data in the second region of interest, detecting the fluctuation points corresponding to each of the selected columns, and removing data points with an excessively large standard deviation from the plurality of fluctuation points to obtain target fluctuation points; traversing and comparing the target fluctuation points corresponding to all columns in the second region of interest, and eliminating points whose difference from adjacent target fluctuation points exceeds a set empirical threshold, so as to obtain a forehead boundary, wherein the fluctuation points represent points of fluctuation of the depth map data; and
and obtaining the recognition result of the target image according to the output result and/or the forehead boundary.
2. The image processing method according to claim 1, wherein before the step of performing location recognition on the target image according to the output result to obtain a face recognition result, the method further comprises:
acquiring, from the target image, a plurality of point cloud data whose longitudinal difference from the nasal ala position point found in the output result is smaller than a preset value;
screening the plurality of point cloud data according to the empirical parameters of the positions of the points near the earlobe to obtain the point cloud data near the earlobe;
processing the point cloud data near the earlobe to obtain a plan view corresponding to the point cloud data near the earlobe;
determining the position of the point near the earlobe according to the plan;
the step of obtaining the recognition result of the target image according to the output result comprises the following steps: and obtaining the recognition result of the target image according to the position of the ear lobe nearby point and the output result.
3. The image processing method according to claim 2, wherein the step of determining the position of the point near the earlobe from the plan view comprises:
performing closed operation on discrete points in the plane graph to obtain a continuous point set;
extracting the continuous point sets to obtain the edge lines of the earlobes;
and acquiring the turning point position of the edge line of the earlobe, and taking the turning point position as the position near the earlobe.
4. The image processing method of claim 1, wherein the step of deriving each row of data in the first region of interest through traversal to obtain a plurality of anchor points for the alar part of the nose in each row of data comprises:
calculating a first derivative of each row of data;
and acquiring the position of the valley point of the first derivative corresponding to the row, and taking the valley point as a positioning point.
5. The image processing method of claim 1, wherein the step of screening out a target positioning point from the plurality of positioning points as an output result of the alar recognition comprises:
and calculating the distance between the two positioning points in each row, and selecting the two positioning points with the widest distance as the output result of the identification of the nasal alar part.
6. The image processing method of claim 1, wherein the step of screening out a target positioning point from the plurality of positioning points as an output result of the alar recognition comprises:
and calculating the distance between the two positioning points in each line, selecting the two positioning points with the widest distance and the two positioning points with the second widest distance, and calculating the average value of the screened data to be used as an output result of the nasal ala recognition.
7. The image processing method of claim 1, wherein the target image comprises a depth image and an RGB image whose points correspond one to one, and wherein the obtaining three-dimensional feature point data of each point in the target image comprises:
calculating to obtain three-dimensional point cloud data corresponding to the depth image according to the depth image and the internal parameters of the camera for obtaining the depth image;
carrying out human face characteristic point detection on the RGB image of the human face to obtain human face characteristic points in the RGB image;
and matching the human face feature points with the three-dimensional point cloud data to obtain three-dimensional feature point data.
8. An image processing apparatus characterized by comprising:
the acquisition module is used for acquiring three-dimensional feature point data of each point in the target image;
the first acquisition module is used for acquiring a first region of interest corresponding to the nose part in the target image;
the traversing module is used for traversing and deriving each row of data in the first region of interest to obtain a plurality of positioning points of the alar part in each row of data;
the first screening module is used for screening a target positioning point from the plurality of positioning points as an output result of the identification of the nasal alar part;
the third acquisition module is used for acquiring a second region of interest on the upper side of the eyebrow in the target image;
the fourth acquisition module is used for acquiring the corresponding fluctuation point in each column of data in the second region of interest;
the obtaining module is used for detecting the fluctuation points corresponding to each of the selected columns, eliminating data points with an excessively large standard deviation from the plurality of fluctuation points to obtain target fluctuation points, traversing and comparing the target fluctuation points corresponding to all the columns in the second region of interest, and eliminating points whose difference from adjacent target fluctuation points exceeds a set empirical threshold, so as to obtain a forehead boundary, wherein the fluctuation points represent points of fluctuation of the depth map data;
and the identification module is used for obtaining the identification result of the target image according to the output result and/or the forehead boundary.
9. The image processing apparatus according to claim 8, wherein said apparatus further comprises:
the second acquisition module is used for acquiring, from the target image, a plurality of point cloud data whose longitudinal difference from the nasal ala position point found in the output result is smaller than a preset value;
the second screening module is used for screening the plurality of point cloud data according to the experience parameters of the positions of the points near the earlobe to obtain the point cloud data near the earlobe;
the processing module is used for processing the point cloud data near the earlobe to obtain a plan view corresponding to the point cloud data near the earlobe;
the determining module is used for determining the position of the point near the earlobe according to the plan;
the identification module is further configured to: and obtaining the recognition result of the target image according to the position of the ear lobe nearby point and the output result.
CN201811055938.8A 2018-09-11 2018-09-11 Image processing method and device Active CN109034138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811055938.8A CN109034138B (en) 2018-09-11 2018-09-11 Image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811055938.8A CN109034138B (en) 2018-09-11 2018-09-11 Image processing method and device

Publications (2)

Publication Number Publication Date
CN109034138A CN109034138A (en) 2018-12-18
CN109034138B true CN109034138B (en) 2021-09-03

Family

ID=64621065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811055938.8A Active CN109034138B (en) 2018-09-11 2018-09-11 Image processing method and device

Country Status (1)

Country Link
CN (1) CN109034138B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348898B (en) * 2019-08-07 2024-04-05 杭州海康微影传感科技有限公司 Calibration method and device and camera

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1512452A (en) * 2002-12-26 2004-07-14 Kabushiki Kaisha Toshiba Individual identifying device and individual identifying method
JP2011152352A (en) * 2010-01-28 2011-08-11 Spill:Kk Apparatus and method for classifying face type, and record medium storing program for achieving the method
CN102663367A (en) * 2012-04-16 2012-09-12 电子科技大学 Three-dimensional face identification method on basis of simulated annealing algorithm
CN106909875A (en) * 2016-09-12 2017-06-30 湖南拓视觉信息技术有限公司 Face shape of face sorting technique and system
CN107330370A (en) * 2017-06-02 2017-11-07 广州视源电子科技股份有限公司 Forehead wrinkle action detection method and device and living body identification method and system
CN107346408A (en) * 2016-05-05 2017-11-14 鸿富锦精密电子(天津)有限公司 Age recognition methods based on face feature

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021550B (en) * 2014-05-22 2017-01-18 西安理工大学 Automatic positioning and proportion determining method for proportion of human face

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1512452A (en) * 2002-12-26 2004-07-14 Kabushiki Kaisha Toshiba Individual identifying device and individual identifying method
JP2011152352A (en) * 2010-01-28 2011-08-11 Spill:Kk Apparatus and method for classifying face type, and record medium storing program for achieving the method
CN102663367A (en) * 2012-04-16 2012-09-12 电子科技大学 Three-dimensional face identification method on basis of simulated annealing algorithm
CN107346408A (en) * 2016-05-05 2017-11-14 鸿富锦精密电子(天津)有限公司 Age recognition methods based on face feature
CN106909875A (en) * 2016-09-12 2017-06-30 湖南拓视觉信息技术有限公司 Face shape of face sorting technique and system
CN107330370A (en) * 2017-06-02 2017-11-07 广州视源电子科技股份有限公司 Forehead wrinkle action detection method and device and living body identification method and system

Also Published As

Publication number Publication date
CN109034138A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
AU2021240222B2 (en) Eye pose identification using eye features
CN106203305B (en) Face living body detection method and device
US11227158B2 (en) Detailed eye shape model for robust biometric applications
US9818023B2 (en) Enhanced face detection using depth information
EP3337383B1 (en) Eyelid shape estimation
EP3284011B1 (en) Two-dimensional infrared depth sensing
EP3039655B1 (en) System and method for determining the extent of a plane in an augmented reality environment
US9747493B2 (en) Face pose rectification method and apparatus
US20170045950A1 (en) Gesture Recognition Systems
US20170308736A1 (en) Three dimensional object recognition
EP2339507B1 (en) Head detection and localisation method
AU2017430168A1 (en) Detailed eye shape model for robust biometric applications
EP3241151A1 (en) An image face processing method and apparatus
CN110221732B (en) Touch projection system and touch action identification method
CN106133752A (en) Eye gaze is followed the tracks of
US11074443B2 (en) Method and device for acquiring slant value of slant image, terminal and storage medium
EP3198522A1 (en) A face pose rectification method and apparatus
US9811916B1 (en) Approaches for head tracking
TW202201275A (en) Device and method for scoring hand work motion and storage medium
KR101089847B1 (en) Keypoint matching system and method using SIFT algorithm for the face recognition
KR101582467B1 (en) Pupil acquisition method using binary of adjacent sum and control device for extracting pupil using the same
CN109034138B (en) Image processing method and device
US20150178934A1 (en) Information processing device, information processing method, and program
JP5217917B2 (en) Object detection and tracking device, object detection and tracking method, and object detection and tracking program
CN112348112B (en) Training method and training device for image recognition model and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221019

Address after: Room 1016, Block C, Haiyong International Building 2, No. 489, Lusong Road, High tech Zone, Changsha City, Hunan Province, 410221

Patentee after: Hunan Fenghua Intelligent Technology Co.,Ltd.

Address before: 3 / F, building 19, phase I, Changsha Zhongdian Software Park, 39 Jianshan Road, Changsha, Hunan 410000

Patentee before: HUNAN VISUALTOURING INFORMATION TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right