CN117292424A - Image generation method and device and electronic equipment - Google Patents

Image generation method and device and electronic equipment

Info

Publication number
CN117292424A
CN117292424A (application CN202311290779.0A)
Authority
CN
China
Prior art keywords
image
face
corrected
point
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311290779.0A
Other languages
Chinese (zh)
Inventor
胡星辉 (Hu Xinghui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zero Beam Technology Co., Ltd.
Original Assignee
Zero Beam Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zero Beam Technology Co ltd filed Critical Zero Beam Technology Co ltd
Priority to CN202311290779.0A
Publication of CN117292424A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image generation method, an image generation device, and an electronic device. The image generation method includes: determining a first image, the first image including a first face; and obtaining a second image according to the first image, a first parameter corresponding to the first face, and a second parameter corresponding to a preset standard face model, where the second image includes a second face whose pose differs from that of the first face, the first parameter includes first keypoint information of first keypoints included in the first face and pose information of the first face, and the second parameter includes second keypoint information of second keypoints included in the standard face model. This effectively increases the number of samples in the training data set for the face recognition model, and because the samples no longer need to be collected manually, the cost and time of acquiring the training data set are effectively reduced.

Description

Image generation method and device and electronic equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image generating method, an image generating device, and an electronic device.
Background
Traditional face recognition methods rely on a large-scale training data set: obtaining a good face recognition model typically requires millions of face images. To build such a data set, a large vendor must spend considerable manpower and material cost collecting more than a million pieces of face data, then download, process, and label the face data (such as face images) via the Internet before assembling it into a training data set for the face recognition model, which is a difficult and expensive undertaking at that scale. The existing approach to acquiring a training data set for face recognition therefore suffers from high cost and long acquisition time. The same problem arises when acquiring training data sets for recognizing the faces of other animals.
Disclosure of Invention
The application provides an image generation method, an image generation device, and an electronic device, aiming to solve the high cost and long acquisition time of existing methods for building face recognition training data sets.
To solve the foregoing technical problem, in a first aspect, an embodiment of the present application provides an image generation method, including: determining a first image, the first image including a first face; and obtaining a second image according to the first image, a first parameter corresponding to the first face, and a second parameter corresponding to a preset standard face model, where the second image includes a second face whose pose differs from that of the first face, the first parameter includes first keypoint information of first keypoints included in the first face and pose information of the first face, and the second parameter includes second keypoint information of second keypoints included in the standard face model.
In this implementation, after the first image is determined, image conversion processing is applied to the first image according to the first image, the first parameter corresponding to the first face, and the second parameter corresponding to the preset standard face model, yielding a processed second image in which the pose of the second face differs from that of the first face. A second image showing the same face in a different pose is thus obtained by transforming the first image: two images are produced from one, with the faces in different poses. This increases the number of samples in the training data set for the face recognition model, and because the samples no longer need to be collected manually, the cost and time of acquiring the training data set are effectively reduced.
In one possible implementation of the first aspect, the first image and the second image are two-dimensional images, and obtaining the second image according to the first image, the first parameter corresponding to the first face, and the second parameter corresponding to the preset standard face model includes: obtaining a third image according to the first keypoint information, the standard face model, and the second keypoint information, the third image being a two-dimensional image; obtaining a fourth image according to the third image and the pose information, the fourth image including a third face and being a two-dimensional image; and correcting the positions of third keypoints according to third keypoint information of the third keypoints included in the third face and index information corresponding to the third keypoints, to obtain the second image.
In this implementation, to derive the second image from the first image, the standard face model is first converted into a two-dimensional third image, the third image is then converted into a two-dimensional fourth image, and the second image is finally obtained from the third keypoint information of the third keypoints included in the third face of the fourth image together with the corresponding index information. Because the image undergoes multiple transformation and correction steps, the information of the second face in the resulting second image is more accurate, which makes training the face recognition model more convenient and improves the performance of the resulting model.
In a possible implementation of the first aspect, the method further includes: determining a side to be repaired of the second face according to the accumulated number of pixels of the second face; and symmetrically processing the side to be repaired according to a preset symmetry weight to obtain a fifth image.
In this implementation, the second face included in the second image is processed symmetrically to obtain a fifth image. On the basis of the second image, a fifth image whose facial information differs from the second face is therefore obtained; that is, three images (the first image, the second image, and the fifth image) are produced from one image, with the faces differing in pose and facial information. This effectively increases the number of samples in the training data set for the face recognition model and effectively reduces the cost and time of acquiring it.
In a possible implementation of the first aspect, obtaining the fourth image according to the third image and the pose information includes: obtaining a sixth image according to the third image and the pose information; determining points to be corrected whose projection positions on the sixth image are wrong; and adjusting the positions of the points to be corrected to their correct positions to obtain the fourth image.
In this implementation, the sixth image is obtained from the third image, the wrongly projected points on the sixth image are corrected, and the fourth image is then obtained, so the information of the third face included in the fourth image is more accurate. Consequently the information of the second face in the second image derived from the fourth image is also more accurate, which facilitates training the face recognition model and improves its performance.
In a possible implementation of the first aspect, obtaining the third image according to the first keypoint information, the standard face model, and the second keypoint information includes: obtaining a camera intrinsic matrix, a camera projection matrix, a rotation matrix, and a translation vector according to the first keypoint information and the second keypoint information; obtaining a model-view matrix and an image projection matrix according to the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector; and performing first image conversion processing on the standard face model according to the model-view matrix and the image projection matrix to obtain the third image. Obtaining the sixth image according to the third image and the pose information includes: performing second image conversion processing on the third image according to the pose information to obtain the sixth image.
In this implementation, the standard face model undergoes the first image conversion processing via the model-view matrix and the image projection matrix, which are in turn derived from the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector; this ensures the accuracy and completeness of the resulting third image. The second image conversion processing applied to the third image according to the pose information therefore also yields a sixth image of high accuracy and completeness, so that the information of the second face in the final second image is more accurate, facilitating training of the face recognition model and improving its performance.
In a possible implementation of the first aspect, the sixth image includes a fourth face and a background portion, and determining the wrongly projected points on the sixth image and adjusting them to their correct positions to obtain the fourth image includes: determining a background template; separating the projection points corresponding to the fourth face from the projection points corresponding to the background portion according to the background template; determining first points to be corrected, whose projection positions are wrong, among the projection points corresponding to the fourth face, and adjusting them to their correct positions to obtain a corrected fourth face; determining second points to be corrected, whose projection positions are wrong, among the projection points corresponding to the background portion, and adjusting them to their correct positions to obtain a corrected background portion; and obtaining the fourth image from the corrected fourth face and the corrected background portion.
In this implementation, the projection points corresponding to the fourth face and those corresponding to the background are separated according to the background template; the wrongly projected points in each group are then corrected separately, after which the fourth image is obtained. Wrongly projected points on the fourth face can thus be found more accurately and their positions corrected, making the information of the third face in the fourth image more accurate. Consequently the information of the second face in the second image derived from the fourth image is also more accurate, facilitating training of the face recognition model and improving its performance.
In a possible implementation of the first aspect, determining the first points to be corrected among the projection points corresponding to the fourth face and adjusting them to their correct positions includes: if a first point to be corrected lies outside the boundary of the image region corresponding to the fourth face, performing normalization processing on it so that its position is adjusted to the corresponding correct position.
In this implementation, correcting the first points to be corrected by normalization makes their corrected positions more accurate.
In a possible implementation of the first aspect, the first keypoint information and the pose information are obtained by inputting the first image into a preset recognition model for recognition processing.
In this implementation, feeding the first image through the preset recognition model yields the first keypoint information and the pose information more quickly and accurately.
In a second aspect, an embodiment of the present application provides an image generation apparatus for performing the image generation method of the first aspect, the apparatus including: a first processing module for determining a first image, the first image including a first face; and a second processing module for obtaining a second image according to the first image, a first parameter corresponding to the first face, and a second parameter corresponding to a preset standard face model, where the second image includes a second face whose pose differs from that of the first face, the first parameter includes first keypoint information of first keypoints included in the first face and pose information of the first face, and the second parameter includes second keypoint information of second keypoints included in the preset standard face model.
In a third aspect, embodiments of the present application provide an electronic device, including: a memory for storing a computer program, the computer program comprising program instructions; a processor configured to execute program instructions to cause an electronic device to perform the image generation method provided by the first aspect and/or any one of the possible implementation manners of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program comprising program instructions to be executed by an electronic device to cause performance of the image generation method provided by the first aspect and/or any one of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program/instruction which, when executed by a processor, implements the image generation method provided by the first aspect and/or any one of the possible implementations of the first aspect.
The beneficial effects of the application are as follows:
According to the image generation method of the application, after the first image is determined, image conversion processing is applied to the first image according to the first image, the first parameter corresponding to the first face, and the second parameter corresponding to the preset standard face model, yielding a processed second image in which the pose of the second face differs from that of the first face. A second image showing the same face in a different pose is thus obtained by transforming the first image: two images are produced from one, with the faces in different poses, which effectively increases the number of samples in the training data set for the face recognition model; and because the samples no longer need to be collected manually, the cost and time of acquiring the training data set are effectively reduced.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the following description will briefly explain the drawings used in the description of the embodiments.
FIG. 1 is a flow diagram illustrating an image generation method, according to some implementations of the present application;
FIG. 2 is a schematic flow diagram illustrating one process of obtaining a second image, according to some implementations of the present application;
FIG. 3 is a schematic flow diagram illustrating one process of obtaining a fourth image, according to some implementations of the present application;
FIG. 4 is a schematic flow diagram illustrating one process of obtaining a third image, according to some implementations of the present application;
FIG. 5 is a schematic flow diagram illustrating another process of obtaining a fourth image, according to some implementations of the present application;
FIG. 6 is a flow diagram illustrating another image generation method, according to some implementations of the present application;
FIG. 7 is a flow diagram illustrating another image generation method, according to some implementations of the present application;
FIG. 8 is a schematic diagram illustrating a configuration of an image generation apparatus according to some implementations of the present application;
FIG. 9 is a schematic structural diagram of an electronic device, according to some implementations of the present application.
Detailed Description
The technical solutions of the present application will be described in further detail below with reference to the accompanying drawings.
As described above, conventional face recognition methods rely on a large-scale training data set, and acquiring the face images takes a great deal of time and cost. Existing methods of collecting a training data set for face recognition therefore suffer from high cost and long acquisition time.
On this basis, the application provides an image generation method, an image generation device, and an electronic device that can expand the number of samples in the training data set for a face recognition model, effectively reducing the cost and time of acquiring the data set and thereby addressing the high cost and long acquisition time of existing approaches.
Next, detailed descriptions will be given of specific procedures and advantages of the image generation method provided in the present application with reference to fig. 1 to fig. 7.
The image generation method is applied to an electronic device, which may be any device capable of processing images, such as a computer, a server, or a server cluster. In one implementation of the present application, the image generation method, as shown in fig. 1, includes the following steps:
S100: a first image is determined, the first image including a first face.
Specifically, the first image is a sample image usable for a face recognition model; the face recognition model may be used for face recognition, such as unlocking a mobile phone screen or access control, and may also be used for animal recognition. The first image therefore needs to include a first face, which may be a side view or a frontal view of the subject to be acquired, or an image taken from any angle.
S200: and obtaining a second image according to the first image, the first parameter corresponding to the first face and the second parameter corresponding to the preset standard face model.
The second image includes a second face whose pose differs from that of the first face; the first parameter includes first keypoint information of first keypoints included in the first face and pose information of the first face, and the second parameter includes second keypoint information of second keypoints included in the standard face model.
The pose information of a face in this application may also be called the Euler angles of the face. With respect to a standard spatial coordinate system, it comprises the up-down flip angle (pitch, rotation about the X axis), the left-right turn angle (yaw, rotation about the Y axis), and the in-plane rotation angle (roll, rotation about the Z axis), i.e. the angles corresponding to raising the head, shaking the head, and turning the head.
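As a rough illustration of these angles, the following Python/NumPy sketch composes a rotation matrix from pitch, yaw, and roll; the patent does not fix an axis-order convention, so the Rz·Ry·Rx order used here is an assumption.

```python
import numpy as np

def euler_to_rotation(pitch: float, yaw: float, roll: float) -> np.ndarray:
    """Compose a rotation matrix from face Euler angles (radians).

    Convention assumed here: pitch rotates about X, yaw about Y,
    roll about Z, applied as R = Rz @ Ry @ Rx.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx
```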
The first keypoints included in the first face are obtained by keypoint detection on the first image and correspond to key facial regions such as the eyebrows, nose, mouth, and face contour. The first keypoint information may include the coordinates, numbering, and indices of the first keypoints. The index here refers to the numbering of the first keypoints: for example, if there are 64 first keypoints in total, they are numbered 1-64, with the eye keypoints numbered 1-10, the nose keypoints numbered 11-20, and so on. It should be noted that the application does not limit the number of keypoints or their index assignment; the above is merely an example.
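The index layout can be pictured as a small lookup table. The sketch below uses the hypothetical 64-point numbering from the example above (eyes 1-10, nose 11-20); the remaining ranges are placeholders, since the application does not prescribe a layout.

```python
# Hypothetical 64-point index layout mirroring the example above
# (eyes 1-10, nose 11-20); real schemes assign indices differently.
KEYPOINT_REGIONS = {
    "eyes": range(1, 11),
    "nose": range(11, 21),
    "mouth": range(21, 33),      # placeholder ranges for the
    "eyebrows": range(33, 43),   # remaining facial regions
    "contour": range(43, 65),
}

def region_of(index: int) -> str:
    """Return the facial region a keypoint index belongs to."""
    for region, indices in KEYPOINT_REGIONS.items():
        if index in indices:
            return region
    return "unknown"
```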
The second keypoints of the preset standard face model (a 3D face model) in this application correspond one-to-one with the first keypoints included in the first face: they likewise mark the positions of the key facial regions, and the number of second keypoints and the regions they represent are the same as for the first keypoints.
The first keypoint information of the first keypoints included in the first face, and the pose information of the first face, can be recognized and detected by a preset recognition model. In one implementation of the application, the first keypoint information and the pose information are obtained by inputting the first image into the preset recognition model for image recognition processing.
The preset recognition model may be a preset deep learning model, such as a landmark-and-pose model.
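A minimal sketch of how such a model might be wrapped is shown below; `LandmarkPoseModel` and its `predict` interface are assumptions, since the patent does not name a concrete network.

```python
import numpy as np

class LandmarkPoseModel:
    """Stand-in for the preset recognition model; the patent does not
    name a concrete network, so this interface is an assumption."""

    def predict(self, image: np.ndarray):
        """Return (keypoints, euler_angles) for the face in `image`:
        keypoints as an (N, 2) array of pixel coordinates, and
        euler_angles as (pitch, yaw, roll)."""
        raise NotImplementedError  # replace with a trained network

def first_parameters(model: LandmarkPoseModel, first_image: np.ndarray) -> dict:
    """Run recognition on the first image to obtain the first keypoint
    information and the pose information (the first parameter)."""
    keypoints, euler_angles = model.predict(first_image)
    return {"first_keypoints": keypoints, "pose": euler_angles}
```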
According to the image generation method of the application, after the first image is determined, image conversion processing is applied to the first image according to the first image, the first parameter corresponding to the first face, and the second parameter corresponding to the preset standard face model, yielding a processed second image in which the pose of the second face differs from that of the first face. A second image showing the same face in a different pose is thus obtained by transforming the first image: two images are produced from one, with the faces in different poses, which effectively increases the number of samples in the training data set for the face recognition model; and because the samples no longer need to be collected manually, the cost and time of acquiring the training data set are effectively reduced.
Next, the details of step S200, obtaining the second image according to the first image, the first parameter corresponding to the first face, and the second parameter corresponding to the preset standard face model, are described.
In one implementation of the application, the first image and the second image are two-dimensional images. As shown in fig. 2, obtaining the second image includes the following steps:
S210: A third image is obtained according to the first keypoint information, the standard face model, and the second keypoint information; the third image is a two-dimensional image.
S220: A fourth image is obtained according to the third image and the pose information; the fourth image includes a third face and is a two-dimensional image.
S230: The positions of the third keypoints are corrected according to the third keypoint information of the third keypoints included in the third face and the index information corresponding to the third keypoints, to obtain the second image.
The index information refers to the numbering of the keypoints. The third keypoints included in the third face correspond to the first keypoints included in the first face; that is, the number of third keypoints and the regions represented by keypoints with the same index should match. For example, if in the first face the eye keypoints are numbered 1-10 and the nose keypoints 11-20, then in the third face the eye keypoints should likewise be numbered 1-10 and the nose keypoints 11-20. Correcting the positions of the third keypoints according to their keypoint information and index information means judging, from the number and position of each third keypoint, whether its projection position is wrong. For example, if keypoints numbered 5 and 6, which belong to the eye region (numbers 1-10), are projected onto the nose of the third face, the index information known in advance reveals the error, and the positions of keypoints 5 and 6 are corrected back to the correct eye positions, yielding an accurate second image.
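As an illustration of this index-based check, the sketch below flags keypoints that land outside the region their index assigns them to; `region_boxes` and the `region_of` callable (e.g. the helper from the earlier index-layout sketch) are illustrative assumptions.

```python
import numpy as np

def find_misprojected(keypoints: np.ndarray,
                      region_of,            # index -> region name, e.g. the
                                            # helper sketched earlier
                      region_boxes: dict) -> list:
    """Flag keypoints whose projected position falls outside the region
    their index assigns them to (e.g. an 'eyes' point landing on the nose).
    `region_boxes` maps region name -> (x0, y0, x1, y1); illustrative only."""
    wrong = []
    for idx, (x, y) in enumerate(keypoints, start=1):
        box = region_boxes.get(region_of(idx))
        if box is None:
            continue  # no expected region known for this index
        x0, y0, x1, y1 = box
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            wrong.append(idx)  # a point to be corrected
    return wrong
```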
When converting the first image into the second image, the standard face model is first converted into the third image, the third image is converted into the two-dimensional fourth image, and the second image is finally obtained according to the third keypoint information of the third keypoints included in the third face of the fourth image and the corresponding index information. Because the conversion involves multiple transformation and correction steps, the information of the second face in the resulting second image is more accurate, which makes training the face recognition model more convenient and improves its performance.
Further, in some data acquisition pipelines, part of the training data set is obtained by applying an ordinary geometric transformation (such as mutual projection between two-dimensional and three-dimensional coordinate points) to the collected original images in order to increase the number of samples; however, images obtained by such ordinary geometric transformations suffer from severe distortion, so the resulting face recognition model performs poorly. By contrast, the image generation method of the application applies multiple transformation and correction steps to the first image, so the information of the second face in the resulting second image is more accurate, which facilitates training and improves the performance of the resulting model.
In one implementation of the application, as shown in fig. 3, obtaining the fourth image according to the third image and the pose information includes the following steps:
S221: A sixth image is obtained according to the third image and the pose information.
S222: Points to be corrected, whose projection positions on the sixth image are wrong, are determined.
S223: The positions of the points to be corrected are adjusted to their correct positions to obtain a fourth image.
Wrong projection positions can arise during conversion: when the standard face model is converted into the third image, insufficient computing power of the electronic device or errors in the projection process can produce wrongly positioned points in the third image, and these propagate into the sixth image when the third image is converted. The wrongly projected points on the sixth image therefore need to be corrected by moving each point to its correct position, so that a clearer and more accurate fourth image can be obtained.
Obtaining the sixth image from the third image, correcting the wrongly projected points on it, and then deriving the fourth image makes the information of the third face included in the fourth image more accurate. Consequently the information of the second face in the second image derived from the fourth image is also more accurate, facilitating training of the face recognition model and improving its performance.
The first keypoint information and the second keypoint information may specifically be the first keypoint coordinates and the second keypoint coordinates. In one implementation of the application, obtaining the third image according to the first keypoint information, the standard face model, and the second keypoint information, as shown in fig. 4, includes the following steps:
S211: A camera intrinsic matrix, a camera projection matrix, a rotation matrix, and a translation vector are obtained according to the first keypoint information and the second keypoint information.
S212: A model-view matrix and an image projection matrix are obtained according to the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector.
S213: First image conversion processing is performed on the standard face model according to the model-view matrix and the image projection matrix to obtain a third image.
The camera intrinsic matrix and the camera projection matrix are the parameter matrices of the camera that captured the first image; conversion between images can be performed using the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector.
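One common way to recover such matrices from 2D-3D keypoint correspondences is a PnP solve, sketched below with OpenCV; the approximate intrinsic matrix (focal length set to the image width, no lens distortion) is an assumption used when the real calibration is unavailable, not the patent's prescribed method.

```python
import cv2
import numpy as np

def estimate_camera_and_pose(model_points_3d: np.ndarray,
                             image_points_2d: np.ndarray,
                             image_size: tuple):
    """Recover the rotation matrix and translation vector linking the
    standard face model's keypoints to the first face's keypoints."""
    h, w = image_size
    focal = w  # rough focal-length guess; a calibrated value is better
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros(4)  # assume negligible lens distortion
    ok, rvec, tvec = cv2.solvePnP(
        model_points_3d.astype(np.float64),
        image_points_2d.astype(np.float64),
        camera_matrix, dist_coeffs)
    rotation, _ = cv2.Rodrigues(rvec)          # 3x3 rotation matrix
    model_view = np.hstack([rotation, tvec])   # [R | t], the model-view part
    return camera_matrix, model_view
```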
In one implementation of the application, obtaining the sixth image according to the third image and the pose information includes: performing second image conversion processing on the third image according to the pose information to obtain the sixth image.
Specifically, as described above, the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector are obtained from the first keypoint information and the second keypoint information; the model-view matrix and the image projection matrix are then computed from them; and finally the third image is converted, via the model-view matrix, the image projection matrix, and the pose information, into the two-dimensional sixth image.
Performing the first image conversion processing on the standard face model via the model-view matrix and the image projection matrix, themselves derived from the camera intrinsic matrix, the camera projection matrix, the rotation matrix, and the translation vector, ensures the accuracy and completeness of the resulting third image. The second image conversion processing applied to the third image according to the pose information therefore also yields a sixth image of high accuracy and completeness, so the information of the second face in the final second image is more accurate, facilitating training of the face recognition model and improving its performance.
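Continuing the sketch above, projecting the standard face model's vertices through the model-view and intrinsic matrices might look as follows; this plain perspective projection is one plausible reading of the first image conversion processing, not a confirmed implementation.

```python
import numpy as np

def project_model(points_3d: np.ndarray,
                  camera_matrix: np.ndarray,
                  model_view: np.ndarray) -> np.ndarray:
    """Project 3D model vertices into 2D image coordinates using the
    matrices from the previous sketch: world -> camera via [R | t],
    then camera -> pixels via the intrinsic matrix and a perspective divide."""
    ones = np.ones((points_3d.shape[0], 1))
    pts_h = np.hstack([points_3d, ones])       # homogeneous coordinates
    cam = pts_h @ model_view.T                 # (N, 3) camera-frame points
    uv = cam @ camera_matrix.T                 # (N, 3) before the divide
    return uv[:, :2] / uv[:, 2:3]              # (N, 2) pixel coordinates
```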
In one implementation of the application, the sixth image includes a fourth face and a background portion. As shown in fig. 5, determining the wrongly projected points on the sixth image and adjusting them to their correct positions to obtain the fourth image includes the following steps:
S240: A background template is determined.
S250: The projection points corresponding to the fourth face are separated from those corresponding to the background portion according to the background template.
S260: First points to be corrected, whose projection positions are wrong, are determined among the projection points corresponding to the fourth face, and their positions are adjusted to the correct positions to obtain a corrected fourth face.
S270: Second points to be corrected, whose projection positions are wrong, are determined among the projection points corresponding to the background portion, and their positions are adjusted to the correct positions to obtain a corrected background portion.
S280: A fourth image is obtained according to the corrected fourth face and the corrected background portion.
The background template, which may also be called a background mask, is a standard background for an image and may be produced with existing graphic design tools. The background template in this application is a standard template derived from the sixth image, whose pixels match the pixels of the background region of the sixth image. The projection points corresponding to the fourth face and those corresponding to the background portion can therefore be separated accurately according to the background template.
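A minimal sketch of the separation step, assuming the background template is available as a binary mask aligned with the sixth image:

```python
import numpy as np

def split_by_background(points_2d: np.ndarray,
                        background_mask: np.ndarray):
    """Separate projected points into face and background groups using a
    binary mask (True where a pixel belongs to the background). Points are
    clipped into the frame first because wrongly projected points may lie
    outside it; those points are corrected in a later step."""
    h, w = background_mask.shape
    cols = np.clip(points_2d[:, 0].astype(int), 0, w - 1)
    rows = np.clip(points_2d[:, 1].astype(int), 0, h - 1)
    on_background = background_mask[rows, cols]
    face_points = points_2d[~on_background]
    background_points = points_2d[on_background]
    return face_points, background_points
```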
Because steps S260 and S270 each correct the wrongly projected points within their own group after separation, there is no fixed order between them: S260 may be performed before S270, S270 before S260, or both simultaneously. Finally, the corrected fourth face and the corrected background portion are fused to obtain a more accurate fourth image.
Separating the projection points of the fourth face from those of the background according to the background template, and then correcting the wrongly projected points in each group before deriving the fourth image, allows the wrongly projected points on the fourth face to be found and corrected more accurately, making the information of the third face in the fourth image more accurate. Consequently the information of the second face in the second image derived from the fourth image is also more accurate, facilitating training of the face recognition model and improving its performance.
In one implementation of the application, determining the first points to be corrected among the projection points corresponding to the fourth face and adjusting them to their correct positions includes: if a first point to be corrected lies outside the boundary of the image region corresponding to the fourth face, performing normalization processing on it so that its position is adjusted to the corresponding correct position, which makes the corrected position more accurate.
Wrongly projected points in the background portion can likewise be corrected by normalization, or by other existing algorithms for correcting projection point positions, which are not described further here.
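The patent does not spell out the normalization rule, so the sketch below shows one simple option, clamping out-of-boundary points back to the face region boundary:

```python
import numpy as np

def normalize_into_region(points_2d: np.ndarray, face_box: tuple) -> np.ndarray:
    """Pull first points to be corrected that fall outside the face image
    region back inside it. Simple clamping to the region boundary is shown
    as one option; the patent's exact normalization rule is unspecified.
    `face_box` = (x0, y0, x1, y1) bounds of the face image region."""
    x0, y0, x1, y1 = face_box
    fixed = points_2d.astype(np.float64)
    fixed[:, 0] = np.clip(fixed[:, 0], x0, x1)
    fixed[:, 1] = np.clip(fixed[:, 1], y0, y1)
    return fixed
```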
In one implementation of the present application, after step S200, as shown in fig. 6, the method further includes the following steps:
S300: A side to be repaired of the second face is determined according to the accumulated number of pixels of the second face.
S400: The side to be repaired is processed symmetrically according to the preset symmetry weight to obtain a fifth image.
An existing symmetry-processing approach may be used for the face. Specifically, existing software can perform the symmetry processing of the face region in the image: if the symmetry option is enabled and an eye mask (a standard symmetric eye template) is provided, a soft-symmetry operation determines the more occluded side of the face from the accumulated number of visible pixels of the face (the second face) and applies symmetry using the visible portion from the less occluded side to repair it, obtaining a fifth image with a symmetric face. In the soft-symmetry process, the more occluded side of the face and the side with denser face pixels are determined from the face image; the denser side is then mirrored onto the more occluded side, yielding the correspondingly symmetrized face image.
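A simplified sketch of such a soft-symmetry operation is given below; the occlusion test (comparing non-zero pixel counts of the two halves) and the blending rule are assumptions standing in for the unspecified rule, and the eye-mask input is omitted.

```python
import numpy as np

def soft_symmetry(face: np.ndarray, weight: float = 0.5) -> np.ndarray:
    """Repair the more occluded half of a roughly centred face crop by
    blending in the mirrored visible half, weighted by a preset symmetry
    weight. A simplified reading of the soft-symmetry step."""
    h, w = face.shape[:2]
    half = w // 2
    left, right = face[:, :half], face[:, w - half:]
    out = face.copy()
    # The side with fewer visible (non-zero) pixels is taken as occluded.
    if np.count_nonzero(left) < np.count_nonzero(right):
        out[:, :half] = (weight * right[:, ::-1]
                         + (1 - weight) * left).astype(face.dtype)
    else:
        out[:, w - half:] = (weight * left[:, ::-1]
                             + (1 - weight) * right).astype(face.dtype)
    return out
```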
The second face included in the second image is processed symmetrically to obtain the fifth image. On the basis of the second image, a fifth image with facial information different from the second face is thus obtained; that is, three images (the first image, the second image, and the fifth image) are produced from one image, with the faces differing in pose and facial information, which effectively increases the number of samples in the training data set for the face recognition model and effectively reduces the cost and time of acquiring it.
In one implementation of the application, the image generation method is a fast, simple way to acquire face data covering rich scenes; as shown in fig. 7, it includes the following steps:
A camera or other image-capture device collects a picture containing a face (the first image). The picture is then fed into a landmark-and-pose model, which outputs the landmark and pose labels of the picture (the first keypoint information, including the first keypoint coordinates, and the pose information); the best-performing landmark-and-pose model for the target scene is used.
The second keypoint coordinates (the second keypoint information) of the provided 3D model (the standard face model) are determined, and the camera projection matrix, camera intrinsic matrix, rotation matrix, and translation vector are computed from the first and second keypoint coordinates. The model-view matrix and the projection matrix are computed from these. The 3D model (i.e. the 3D face coordinates) is then converted according to its vertex coordinates, via the model-view matrix and the projection matrix, to obtain the third image, and the third image is converted into the two-dimensional sixth image according to the pose information.
A background mask (the background template) corresponding to the third image is then computed, and the face projection points and the background projection points of the sixth image are separated according to the mask.
Whether each face projection point is correctly positioned is judged; if not, wrongly projected points outside the face region image and those inside it are handled separately. Wrongly projected points outside the face region image are normalized, their positions adjusted according to a set rule, and finally remapped to the size of the original image. Wrongly projected points in the background are then handled according to the set rule. Once both the face points and the background points with wrong projection positions have been processed, the fourth image is obtained.
According to the third keypoint information of the third keypoints corresponding to the fourth image and the corresponding indices, the projection points (the keypoints) and indices are combined, and the wrongly positioned keypoints on the fourth image are corrected (i.e. the fourth image is warped) to obtain the second image.
Soft symmetry is applied to the face included in the second image to create a frontal view; the frontal views with and without symmetry, the face points in the image, the points displayed outside, and the symmetry weight are returned, and the fifth image is obtained.
Finally, the first image, the second image, and the fifth image are output as samples of the training data set for training the face recognition model.
Compared with the data enhancement used in traditional face recognition, this image generation method improves face recognition performance: it matches the performance obtained from large, million-scale Internet data sets while introducing facial appearance variation through synthesized data. Unlike general data enhancement methods, it uses domain-specific techniques to generate composite images. By enhancing with synthetic data, the time and cost of collecting, processing, and labeling a large number of images can be reduced while the accuracy of the face recognition system improves.
In one implementation of the present application, there is provided an image generating apparatus, as shown in fig. 8, including:
the first processing module is configured to determine a first image, the first image including a first face.
The second processing module is configured to obtain a second image according to the first image, a first parameter corresponding to the first face, and a second parameter corresponding to a preset standard face model, where the second image includes a second face whose pose differs from that of the first face, the first parameter includes first keypoint information of first keypoints included in the first face and pose information of the first face, and the second parameter includes second keypoint information of second keypoints included in the standard face model.
The specific operations each processing module can perform are described in the image generation method corresponding to fig. 1. Depending on the specific steps of the image generation method, the image generation apparatus may include more or fewer processing modules to handle the corresponding content.
Referring to fig. 9, fig. 9 is a block diagram of an electronic device according to an implementation of the present application. The electronic device can include one or more processors 1002, system control logic 1008 coupled to at least one of the processors 1002, system memory 1004 coupled to the system control logic 1008, non-volatile memory (NVM) 1006 coupled to the system control logic 1008, and a network interface 1010 coupled to the system control logic 1008.
The processor 1002 may include one or more single-core or multi-core processors. The processor 1002 may include any combination of general-purpose and special-purpose processors (e.g., graphics processor, application processor, baseband processor, etc.). In implementations herein, the processor 1002 may be configured to perform the aforementioned image generation method.
In some implementations, the system control logic 1008 may include any suitable interface controller to provide any suitable interface to at least one of the processors 1002 and/or any suitable device or component in communication with the system control logic 1008.
In some implementations, the system control logic 1008 may include one or more memory controllers to provide an interface to the system memory 1004. The system memory 1004 may be used for loading and storing data and/or instructions. The system memory 1004 of the electronic device can include any suitable volatile memory in some implementations, such as suitable dynamic random access memory (Dynamic Random Access Memory, DRAM).
NVM/memory 1006 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. In some implementations, NVM/memory 1006 may include any suitable nonvolatile memory, such as flash memory, and/or any suitable nonvolatile storage device, such as at least one of a Hard Disk Drive (HDD), compact Disc (CD) Drive, digital versatile Disc (Digital Versatile Disc, DVD) Drive.
NVM/memory 1006 may include a portion of the storage resources installed on the electronic device, or it may be accessible by the device without necessarily being part of it. For example, NVM/memory 1006 may be accessed over a network via the network interface 1010.
In particular, system memory 1004 and NVM/storage 1006 may each include: a temporary copy and a permanent copy of instruction 1020. The instructions 1020 may include: instructions that, when executed by at least one of the processors 1002, cause the electronic device to implement the aforementioned image generation method. In some implementations, instructions 1020, hardware, firmware, and/or software components thereof may additionally/alternatively be disposed in system control logic 1008, network interface 1010, and/or processor 1002.
The network interface 1010 may include a transceiver to provide a radio interface for electronic devices to communicate with any other suitable device (e.g., front end module, antenna, etc.) over one or more networks. In some implementations, the network interface 1010 may be integrated with other components of the electronic device. For example, the network interface 1010 may be integrated with at least one of the processor 1002, the system memory 1004, the nvm/storage 1006, and a firmware device (not shown) having instructions that, when executed by at least one of the processor 1002, implement the image generation methods described previously.
The network interface 1010 may further include any suitable hardware and/or firmware to provide a multiple-input multiple-output radio interface. For example, network interface 1010 may be a network adapter, a wireless network adapter, a telephone modem, and/or a wireless modem.
In one implementation, at least one of the processors 1002 may be packaged together with logic for one or more controllers of the system control logic 1008 to form a system package (System In a Package, siP). In one implementation, at least one of the processors 1002 may be integrated on the same die with logic for one or more controllers of the System control logic 1008 to form a System on Chip (SoC).
The electronic device may further include: input/output (I/O) devices 1012. The I/O device 1012 may include a user interface to enable a user to interact with the electronic device, and a peripheral component interface designed so that peripheral components can also interact with the electronic device. In some implementations, the electronic device further includes a sensor for determining at least one of environmental conditions and location information associated with the electronic device.
In some implementations, the user interface may include, but is not limited to, a display (e.g., a liquid crystal display, a touch screen display, etc.), a speaker, a microphone, one or more cameras (e.g., still image cameras and/or video cameras), a flashlight (e.g., light emitting diode flash), and a keyboard.
In some implementations, the peripheral component interface may include, but is not limited to, a non-volatile memory port, an audio jack, and a power interface.
In some implementations, the sensors may include, but are not limited to, gyroscopic sensors, accelerometers, proximity sensors, ambient light sensors, and positioning units. The positioning unit may also be part of the network interface 1010 or interact with the network interface 1010 to communicate with components of a positioning network, such as global positioning system (Global Positioning System, GPS) satellites.
It should be understood that the structure illustrated in the implementation of the present invention does not constitute a specific limitation on the electronic device. In other implementations of the application, the electronic device may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Program code may be applied to input instructions to perform the functions described herein and generate output information. The output information may be applied to one or more output devices in a known manner. For purposes of implementations of the present application, a processing system includes any system having a processor such as, for example, a digital signal processor (Digital Signal Processor, DSP), microcontroller, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or microprocessor.
The program code may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. Program code may also be implemented in assembly or machine language, if desired. Indeed, the mechanisms described herein are not limited in scope to any particular programming language. In either case, the language may be a compiled or interpreted language.
One or more aspects of at least one implementation may be implemented by representative instructions stored on a computer-readable storage medium, which represent various logic in a processor, which when read by a machine, cause the machine to fabricate logic to perform the techniques described herein. These representations, referred to as "IP cores," may be stored on a tangible computer readable storage medium and provided to a plurality of customers or manufacturing facilities for loading into the manufacturing machine that actually manufactures the logic or processor.
It should be noted that in the drawings, some structural or method features may be shown in a specific arrangement and/or order. However, it should be understood that such a particular arrangement and/or ordering may not be required. Rather, in some implementations, the features can be arranged in a different manner and/or order than shown in the illustrative drawings. Additionally, the inclusion of structural or methodological features in a particular figure is not meant to imply that such features are required in all implementations, and in some implementations, such features may not be included or may be combined with other features.
It should be noted that the terms "first," "second," and the like are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
While the present application has been shown and described with reference to certain preferred implementations, those of ordinary skill in the art will understand that the foregoing detailed description, given in conjunction with specific implementations, is not intended to limit the practice of the present application to those descriptions. Various changes in form and detail, including simple inferences or substitutions, may be made therein by those skilled in the art without departing from the spirit and scope of the present application.

Claims (10)

1. An image generation method, the method comprising:
determining a first image, the first image comprising a first face;
obtaining a second image according to the first image, a first parameter corresponding to the first face and a second parameter corresponding to a preset standard face model, wherein the second image comprises a second face, the pose of the second face is different from the pose of the first face, the first parameter comprises first key point information of a first key point included in the first face and pose information of the first face, and the second parameter comprises second key point information of a second key point included in the standard face model.
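By way of illustration only (not part of the claimed subject matter), the two parameters recited in claim 1 can be modelled as plain data structures. The Python sketch below assumes one possible pose encoding (yaw/pitch/roll) and hypothetical type names; the claim itself fixes neither.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FirstParameter:
    """First parameter of claim 1: 2D key points detected on the first
    face plus its pose (the yaw/pitch/roll encoding is an assumption)."""
    key_points_2d: np.ndarray  # N x 2 pixel coordinates
    yaw: float
    pitch: float
    roll: float

@dataclass
class SecondParameter:
    """Second parameter of claim 1: 3D key points of the preset
    standard face model."""
    key_points_3d: np.ndarray  # N x 3 model coordinates
```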
2. The image generation method according to claim 1, wherein the first image and the second image are two-dimensional images, and obtaining the second image according to the first image, the first parameter corresponding to the first face and the second parameter corresponding to the preset standard face model comprises:
obtaining a third image according to the first key point information, the standard face model and the second key point information, wherein the third image is a two-dimensional image;
obtaining a fourth image according to the third image and the pose information, wherein the fourth image comprises a third face, and the fourth image is a two-dimensional image;
and correcting the position of the third key point according to third key point information of the third key point included in the third face and index information corresponding to the third key point to obtain the second image.
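Purely as an illustrative sketch of the data flow recited in claim 2: the three stage functions below are hypothetical placeholders that only show how the third and fourth images chain together; concrete sketches for the individual stages follow claims 3 to 7.

```python
import numpy as np

# Stubbed data flow of claim 2 (hypothetical stage names and bodies).
def render_third_image(first_kpt_info, model_kpts_3d, size=(256, 256)):
    return np.zeros((size[0], size[1], 3))   # two-dimensional third image

def apply_pose(third_image, pose_info):
    return third_image                        # fourth image, new pose

def correct_third_keypoints(fourth_image, kpt_info, index_info):
    return fourth_image                       # corrected second image

def second_image_from_first(first_kpt_info, pose_info, model_kpts_3d,
                            index_info):
    third = render_third_image(first_kpt_info, model_kpts_3d)
    fourth = apply_pose(third, pose_info)
    return correct_third_keypoints(fourth, first_kpt_info, index_info)
```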
3. The image generation method according to claim 2, characterized in that the method further comprises:
determining a side to be repaired corresponding to the second face according to the accumulated number of pixels of the second face;
and performing symmetric processing on the side to be repaired according to a preset symmetry weight to obtain a fifth image.
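One possible reading of claim 3, sketched in Python under the assumptions that the face is represented by an image plus a validity mask and that the symmetry axis is the vertical image midline; the function and parameter names are hypothetical.

```python
import numpy as np

def symmetric_repair(face_img, face_mask, alpha=0.5):
    """Hypothetical sketch of claim 3: mirror the more complete half of
    the face onto the sparser half.

    face_img:  H x W x 3 float array containing the second face.
    face_mask: H x W boolean array marking valid face pixels.
    alpha:     preset symmetry weight used for blending.
    """
    h, w = face_mask.shape
    mid = w // 2
    # Accumulated pixel counts per half decide the side to be repaired.
    left_count = int(face_mask[:, :mid].sum())
    right_count = int(face_mask[:, mid:].sum())

    mirrored = face_img[:, ::-1]          # horizontally flipped image
    mirrored_mask = face_mask[:, ::-1]

    # Repair only pixels the mirror covers but the original lacks,
    # restricted to the sparser half.
    repair = mirrored_mask & ~face_mask
    if left_count < right_count:
        repair[:, mid:] = False           # repair the left half
    else:
        repair[:, :mid] = False           # repair the right half

    out = face_img.copy()
    out[repair] = alpha * mirrored[repair] + (1.0 - alpha) * out[repair]
    return out
```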
4. The image generation method according to claim 2 or 3, wherein obtaining a fourth image according to the third image and the pose information comprises:
obtaining a sixth image according to the third image and the pose information;
determining a point to be corrected with an incorrect projection position on the sixth image;
and adjusting the position of the point to be corrected to the correct position corresponding to the point to be corrected, to obtain the fourth image.
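As an illustration of the determining step in claim 4, a minimal sketch under the assumption that a projection position is "wrong" when it falls outside the image bounds (the claim does not define the error criterion):

```python
import numpy as np

def find_points_to_correct(points, width, height):
    """Hypothetical sketch for claim 4: flag projected points whose
    position falls outside the sixth image, i.e. the points to be
    corrected. points is an N x 2 array of (x, y) pixel coordinates."""
    x, y = points[:, 0], points[:, 1]
    wrong = (x < 0) | (x >= width) | (y < 0) | (y >= height)
    return np.flatnonzero(wrong)   # indices of points to be corrected
```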
5. The image generation method according to claim 4, wherein obtaining a third image according to the first key point information, the standard face model and the second key point information comprises:
obtaining a camera intrinsic matrix, a camera projection matrix, a rotation matrix and a translation vector according to the first key point information and the second key point information;
obtaining a model view matrix and an image projection matrix according to the camera intrinsic matrix, the camera projection matrix, the rotation matrix and the translation vector;
according to the model view matrix and the image projection matrix, performing first image conversion processing on the standard face model to obtain the third image;
and wherein obtaining a sixth image according to the third image and the pose information comprises:
performing second image conversion processing on the third image according to the pose information, to obtain the sixth image.
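For claim 5, the camera intrinsic matrix, rotation matrix and translation vector can plausibly be recovered with a perspective-n-point solver; the sketch below uses OpenCV's solvePnP with a guessed pinhole intrinsic matrix. It is an assumption-laden illustration, not the claimed computation.

```python
import cv2
import numpy as np

def estimate_matrices(first_kpts_2d, model_kpts_3d, img_w, img_h):
    """Hypothetical sketch of claim 5: align the standard face model's
    3D key points with the detected 2D first key points."""
    f = float(max(img_w, img_h))              # assumed focal length
    K = np.array([[f, 0.0, img_w / 2.0],
                  [0.0, f, img_h / 2.0],
                  [0.0, 0.0, 1.0]])

    # Rotation (as a Rodrigues vector) and translation mapping model
    # coordinates onto the detected image key points.
    ok, rvec, tvec = cv2.solvePnP(model_kpts_3d.astype(np.float64),
                                  first_kpts_2d.astype(np.float64),
                                  K, None)
    R, _ = cv2.Rodrigues(rvec)

    # Model view matrix: rigid transform from model space to camera space.
    model_view = np.eye(4)
    model_view[:3, :3] = R
    model_view[:3, 3] = tvec.ravel()

    # Image projection matrix: intrinsics composed with the model view,
    # taking model points to homogeneous pixel coordinates.
    proj = K @ model_view[:3, :]
    return K, model_view, proj
```

In this reading, proj plays the role of the claimed image projection matrix and model_view the role of the model view matrix.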
6. The image generation method according to claim 5, wherein the sixth image comprises a fourth face and a background portion, and wherein determining a point to be corrected with an incorrect projection position on the sixth image and adjusting the position of the point to be corrected to the correct position corresponding to the point to be corrected, to obtain the fourth image, comprises:
determining a background template;
separating the projection points corresponding to the fourth face and the projection points corresponding to the background portion according to the background template;
determining a first point to be corrected with an incorrect projection position in the projection points corresponding to the fourth face, and adjusting the position of the first point to be corrected to a correct position corresponding to the first point to be corrected to obtain a corrected fourth face; and
determining a second point to be corrected with an incorrect projection position in the projection points corresponding to the background portion, and adjusting the position of the second point to be corrected to the correct position corresponding to the second point to be corrected to obtain a corrected background portion;
and obtaining the fourth image according to the corrected fourth face and the corrected background portion.
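A minimal sketch of the separating step in claim 6, assuming the background template is simply a per-point boolean flag (the claim does not specify its form):

```python
import numpy as np

def split_projections(points, background_template):
    """Hypothetical sketch for claim 6: a background template (a boolean
    flag per projected point) separates face projections from background
    projections so each group can be corrected independently."""
    background_template = np.asarray(background_template, dtype=bool)
    face_points = points[~background_template]
    background_points = points[background_template]
    return face_points, background_points
```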
7. The image generation method according to claim 6, wherein determining a first point to be corrected with an incorrect projection position in the projection points corresponding to the fourth face, and adjusting the position of the first point to be corrected to the correct position corresponding to the first point to be corrected, comprises:
if it is determined that the first point to be corrected with an incorrect projection position in the projection points corresponding to the fourth face is located outside the boundary of the image area corresponding to the fourth face, performing normalization processing on the first point to be corrected, so that the position of the first point to be corrected is adjusted to the correct position corresponding to the first point to be corrected.
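If the "normalization processing" of claim 7 is read as clamping out-of-boundary projections back into the face image area, an assumption this sketch makes explicit, it could look like:

```python
import numpy as np

def clamp_to_face_region(points, x_min, y_min, x_max, y_max):
    """Hypothetical reading of claim 7: out-of-boundary projections are
    clamped back into the image area corresponding to the fourth face."""
    fixed = np.asarray(points, dtype=np.float64).copy()
    fixed[:, 0] = np.clip(fixed[:, 0], x_min, x_max)
    fixed[:, 1] = np.clip(fixed[:, 1], y_min, y_max)
    return fixed
```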
8. The image generation method according to any one of claims 5 to 7, wherein the first key point information and the pose information are obtained by:
inputting the first image into a preset recognition model for image recognition processing, to obtain the first key point information and the pose information.
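Claim 8 leaves the preset recognition model unspecified. As a stand-in only, a publicly available landmark detector such as MediaPipe Face Mesh could supply the first key point information; the pose information could then be derived from the key points, for example via the solvePnP sketch under claim 5.

```python
import cv2
import mediapipe as mp  # stand-in detector, not the claimed model
import numpy as np

def detect_first_keypoints(image_bgr):
    """Hypothetical stand-in for claim 8's preset recognition model:
    returns N x 2 pixel coordinates of face key points, or None if no
    face is found."""
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
        result = mesh.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return None
    h, w = image_bgr.shape[:2]
    landmarks = result.multi_face_landmarks[0].landmark
    return np.array([(p.x * w, p.y * h) for p in landmarks])
```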
9. An image generation apparatus, the apparatus comprising:
a first processing module configured to determine a first image, the first image comprising a first face;
a second processing module configured to obtain a second image according to the first image, a first parameter corresponding to the first face, and a second parameter corresponding to a preset standard face model, wherein the second image comprises a second face, the pose of the second face is different from the pose of the first face, the first parameter comprises first key point information of a first key point included in the first face and pose information of the first face, and the second parameter comprises second key point information of a second key point included in the standard face model.
10. An electronic device, comprising:
a memory for storing a computer program, the computer program comprising program instructions; and
a processor for executing the program instructions to cause the electronic device to perform the image generation method of any one of claims 1-8.
CN202311290779.0A 2023-10-08 2023-10-08 Image generation method and device and electronic equipment Pending CN117292424A (en)

Priority Applications (1)

Application Number: CN202311290779.0A
Priority Date: 2023-10-08
Filing Date: 2023-10-08
Title: Image generation method and device and electronic equipment

Publications (1)

Publication Number: CN117292424A
Publication Date: 2023-12-26

Family

ID=89258352

Family Applications (1)

Application Number: CN202311290779.0A (Pending)
Title: Image generation method and device and electronic equipment
Priority Date: 2023-10-08
Filing Date: 2023-10-08

Country Status (1)

Country: CN, publication CN117292424A (en)

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination