CN112990032B - Face image processing method and device
- Publication number: CN112990032B (application CN202110309597.8A)
- Authority: CN (China)
- Prior art keywords: image, face image, face, training, occluded
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/161 - Human faces: detection; localisation; normalisation
- G06N3/04 - Neural networks: architecture, e.g. interconnection topology
- G06N3/08 - Neural networks: learning methods
- G06T3/04 - Geometric image transformations: context-preserving transformations, e.g. by using an importance map
- G06V40/171 - Human faces: local features and components; occluding parts, e.g. glasses; geometrical relationships
- G06V40/172 - Human faces: classification, e.g. identification
Abstract
The application provides a face image processing method and device. The method comprises the following steps: acquiring a face image; detecting, through a YOLOv3 detection model, whether the face in the face image is occluded by an occluder; when the detection result is that the face in the face image is occluded, restoring the occluded face image through a restoration model; and outputting the restored face image. By removing the occluder from occluded face images, the head tracking system of a flight training simulator can overcome the influence of occluders such as helmets and remain more stable and reliable during training, making flight simulation training more realistic.
Description
Technical Field
The present application relates to the field of image recognition technologies, and in particular, to a method and an apparatus for processing a face image.
Background
In simulated air-combat training, a head tracking system constructs the flight viewing angle and point of attention for the simulation, providing the pilot's head pose and position to other human-computer interaction systems. The head tracking system is therefore a key device for improving the fidelity of simulated training. Non-contact pilot head tracking systems in common use rely on either infrared point markers or natural-image acquisition. In the natural-image approach the pilot wears no helmet; a camera in the cockpit detects the pilot's face, and the viewing angle is computed by projecting the detected key points onto the 3D coordinates of a standard head model. In modern simulated air combat, however, the pilot must wear a helmet to perform training subjects, so the traditional natural-image approach cannot meet the requirement.
It is therefore desirable to provide a face image processing method and apparatus. By removing the occluder from an occluded face image, the head tracking system of a flight training simulator can overcome the influence of occluders such as helmets and remain more stable and reliable during training, making flight simulation training more realistic.
Disclosure of Invention
The embodiment of the application provides a technical scheme for removing an occluder from a face image, addressing the technical problem that facial key points are hidden by the occluder.
Specifically, the face image processing method comprises the following steps:
acquiring a face image;
detecting, through a YOLOv3 detection model, whether the face in the face image is occluded by an occluder;
when the detection result is that the face in the face image is occluded, restoring the occluded face image through a restoration model;
and outputting the restored face image.
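The control flow of these steps can be illustrated by the following Python sketch. It is an illustration only: `detector` and `restorer` stand in for the trained YOLOv3 detection model and restoration model described below, and their interfaces are assumptions, not components disclosed verbatim in this application.

```python
import numpy as np

def process_face_image(frame: np.ndarray, detector, restorer) -> np.ndarray:
    """Acquire -> detect occlusion -> restore if needed -> output.

    `detector(frame)` is assumed to return True when the face is occluded;
    `restorer(frame)` is assumed to return the de-occluded face image.
    """
    if detector(frame):          # face occluded, e.g. by a helmet
        frame = restorer(frame)  # remove the occluder via the restoration model
    return frame                 # restored (or already unoccluded) face image
```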
Further, the YOLOv3 detection model is obtained through the following steps:
acquiring a training image set;
generating a training image set database from the training image set through a generative adversarial network;
performing negative-feedback optimization of the YOLOv3 detection model according to the training image set database;
wherein acquiring the training image set specifically comprises:
acquiring a training image set composed of a plurality of face image elements occluded by an occluder and face image elements not occluded by an occluder.
Further, generating the training image set database from the training image set through the generative adversarial network specifically comprises the following steps:
optimizing the generative adversarial network through a neural network according to the training image set;
generating, from public face image elements not occluded by an occluder and through the optimized generative adversarial network, public face image elements occluded by an occluder;
and expanding the training image set database with the public face image elements not occluded by an occluder and the public face image elements occluded by an occluder.
Further, the restoration model is obtained by the following optimization steps:
acquiring a training image set;
and performing negative-feedback optimization of the restoration model through a neural network according to the training image set.
Further, the occluder is a helmet.
Further, the face image processing method further comprises:
detecting facial key points from the restored face image through a pilot head tracking system;
calculating the face viewing angle from the facial key points and the 3D coordinates of a standard model;
and adjusting, in real time and through a human-computer interaction system, the training image within the face viewing angle according to the result of the viewing-angle calculation.
The embodiment of the application also provides a face image processing apparatus.
Specifically, the face image processing apparatus comprises:
an acquisition module, used for acquiring a face image;
a detection module, used for detecting, through a YOLOv3 detection model, whether the face in the face image is occluded by an occluder;
a restoration module, used for restoring, through a restoration model, the occluded face image when the detection result is that the face in the face image is occluded;
and an output module, used for outputting the restored face image.
Further, the face image processing apparatus further comprises an optimization module, used for optimizing the YOLOv3 detection model, and specifically for:
acquiring a training image set;
generating a training image set database from the training image set through a generative adversarial network;
and performing negative-feedback optimization of the YOLOv3 detection model according to the training image set database.
Further, the optimization module is used for generating the training image set database from the training image set through the generative adversarial network, and specifically for:
acquiring a training image set;
optimizing the generative adversarial network through a neural network according to the training image set;
generating, from public face image elements not occluded by an occluder and through the optimized generative adversarial network, public face image elements occluded by an occluder;
and expanding the training image set database with the public face image elements not occluded by an occluder and the public face image elements occluded by an occluder.
Further, the optimization module is further used for optimizing the restoration model, specifically for:
acquiring a training image set;
and performing negative-feedback optimization of the restoration model through a neural network according to the training image set.
The technical scheme provided by the embodiment of the application has at least the following beneficial effects:
with the face image processing method and apparatus, the occluder can be removed from an occluded face image. The head tracking system of the simulator can thus overcome the influence of occluders such as helmets and remain more stable and reliable during training, making flight simulation training more realistic.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a flowchart of a face image processing method according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a face image processing apparatus according to an embodiment of the present application.
100 face image processing apparatus
11 acquisition module
12 detection module
13 restoration module
14 output module
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described in detail below with reference to specific embodiments and the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present application.
Referring to fig. 1, a method for processing a face image according to an embodiment of the present application includes the following steps:
s100: and acquiring a human face image.
It can be understood that, for operations such as face detection or recognition, an image containing a face must first be acquired. Specifically, the face image may be acquired by reading a real-time image captured by a camera, for example a face image captured by the camera in real time. Alternatively, it may be acquired by reading an image file stored on a computer storage medium, for example a photograph or video frame containing a face. Either source, a real-time camera image or a stored image file, serves the face image processing of this application, so acquisition can be designed in different ways without departing from the protection scope of the present application.
S200: detecting, through a YOLOv3 detection model, whether the face in the face image is occluded by an occluder.
It can be understood that when the face is covered by an occluder, the acquired image cannot be directly used for face recognition or key-point detection; when the face is unoccluded, it can. Note that the specific criterion for judging whether a face is occluded depends on the actual application. For full-face recognition, no area of the face in the acquired image may be occluded, so within the camera's viewing angle a mask, helmet, or sunglasses all count as occlusion. If only a local area of the face is to be recognized, only that area needs to be unoccluded: when identifying eye features, a mask over the mouth and nose does not affect detection, and the face is not judged to be occluded. Therefore, after an image with a face is acquired, whether the face is occluded must be detected according to the occlusion criterion that has been set. Detection and classification may be performed manually or by a detection model; for example, a face detected as occluded yields a first-class image and a face detected as unoccluded yields a second-class image. This classification scheme is illustrative and does not limit the protection scope of the present application. Among detection and classification algorithms, the YOLOv3 network has a clear structure and excellent real-time performance, and the user can choose the backbone network according to actual requirements; the YOLOv3 detection model is therefore selected to detect whether the face in the face image is occluded. For the best accuracy the Darknet-53 backbone can be chosen; for lighter and faster computation the Tiny-Darknet backbone can be used. The specific backbone adopted by the YOLOv3 detection model does not limit the protection scope of the present application.
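As an illustration of how such a detector might be invoked, the following hedged Python sketch runs a Darknet-format YOLOv3 model through OpenCV's DNN module and reports whether any detection belongs to an "occluded face" class. The file names and the two class labels are assumptions for illustration, not artifacts disclosed by this application.

```python
import cv2
import numpy as np

# Hypothetical files for a trained two-class detector mirroring the
# patent's categories: face occluded vs. face not occluded.
net = cv2.dnn.readNetFromDarknet("yolov3-face.cfg", "yolov3-face.weights")
CLASSES = ["face_occluded", "face_unoccluded"]

def detect_occlusion(image: np.ndarray, conf_thresh: float = 0.5) -> bool:
    """Run the detector once and report whether any face is occluded."""
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    for out in outputs:              # each row: box (4) + objectness + class scores
        for det in out:
            scores = det[5:]
            cls = int(np.argmax(scores))
            if scores[cls] > conf_thresh and CLASSES[cls] == "face_occluded":
                return True
    return False
```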
Further, in a preferred embodiment provided herein, the YOLOv3 detection model is obtained through the following steps:
acquiring a training image set;
generating a training image set database from the training image set through a generative adversarial network;
performing negative-feedback optimization of the YOLOv3 detection model according to the training image set database;
wherein acquiring the training image set specifically comprises:
acquiring a training image set composed of a plurality of face image elements occluded by an occluder and face image elements not occluded by an occluder.
It can be understood that the YOLOv3 network has a clear structure and excellent real-time performance, and the user can choose the YOLOv3 backbone according to actual requirements. However, to guarantee the accuracy of detecting whether the face in a face image is occluded, the YOLOv3 detection model must first be trained and optimized. Specifically, a training image set for optimizing the model is acquired; it is composed of a plurality of face image elements occluded by an occluder and face image elements not occluded by an occluder. The image elements are selected according to the specific detection scenario of the YOLOv3 model, i.e., according to the occlusion criterion of the actual detection application. A generative adversarial network (GAN) can produce output images with different effects through the adversarial game between a generative model and a discriminative model. A training image set database with the required effects can therefore be generated from the acquired training image set through the GAN, making the training data richer in number and variety and providing a large amount of data for training the YOLOv3 detection model. The occluded and unoccluded face image elements in the database are then used as input images of the YOLOv3 detection model; the model detects and classifies them, the classification accuracy of the results is computed, and the model's optimization parameters are adjusted according to that accuracy until the optimized model accurately separates occluded from unoccluded face images. That is, negative-feedback optimization of the YOLOv3 detection model is performed on the basis of the training image set database. In addition, because the target to be detected occupies a large proportion of the image and is therefore easy to identify, the convolution layers of the YOLOv3 network are simplified: the Scale3 part and its Detection part are omitted, and only the Scale1 and Scale2 parts are used for detection. To strengthen the target detection capability of Scale1 and Scale2, a plug-and-play module is inserted into the network; experiments show that adding a SENet module after the 5th and 10th convolution layers of the YOLOv3 network improves the accuracy of face detection.
The optimized YOLOv3 detection model can thus detect face images more stably, in real time, and accurately. The specific numbers of occluded and unoccluded face image elements in the training image set and the training image set database do not limit the protection scope of the present application.
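As a point of reference for the plug-and-play module mentioned above, the following Python sketch shows one standard form of a squeeze-and-excitation (SE) block such as might be inserted after a convolution layer. PyTorch and the reduction ratio are assumptions for illustration, not details fixed by this application.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (one common SENet form)."""

    def __init__(self, channels: int, reduction: int = 16):  # ratio assumed
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: global average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # excitation: reweight the feature map
```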
Further, in a preferred embodiment provided by the present application, generating the training image set database from the training image set through the generative adversarial network specifically comprises the following steps:
optimizing the generative adversarial network through a neural network according to the training image set;
generating, from public face image elements not occluded by an occluder and through the optimized generative adversarial network, public face image elements occluded by an occluder;
and expanding the training image set database with the public face image elements not occluded by an occluder and the public face image elements occluded by an occluder.
It can be understood that, by varying the relevant generation parameters, a GAN can generate a series of images with different effects from an acquired image. Through this image generation capability, the face images in the training image set can be turned into face images with a specified effect according to the set generation parameters. The relevant face image elements in the training image set can thus be multiplied rapidly to form a training image set database, which provides a large amount of training data for the YOLOv3 detection model.
Specifically, since the generation parameters of the GAN must be adjusted continually according to the image generation effect actually required, these parameters are optimized first when the training image set database is generated through the GAN. The face image elements occluded by an occluder in the acquired training image set serve as the images to be generated, i.e., the required output images of the GAN, while the unoccluded face image elements serve as its input images. When an input image is fed to the GAN, a corresponding real-time output image is generated under the initial generation parameters, and the generation parameters are adjusted according to the error between the real-time output image and the finally required effect image. When that error meets the set qualification threshold, the optimization of the GAN is complete. For example, when an acquired unoccluded face image element is fed to a PaddleGAN network, the generation parameters start at their default values and are then adjusted by comparing the network's real-time output with the image required to be generated. Neither the specific value of the error threshold nor the type of GAN selected limits the protection scope of the present application. Once the GAN has been optimized and public face image elements not occluded by an occluder have been obtained, occluded counterparts of those public face image elements can be generated according to the GAN's image generation parameters; the specific number of such public face image elements likewise does not limit the protection scope. Adding the unoccluded public face image elements and the generated occluded ones to the training image set increases the number of both kinds of elements, forming the corresponding training image set database. In this way the face image elements used for YOLOv3 training become both more numerous and more diverse, which increases the detection accuracy of the YOLOv3 detection model.
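The adversarial game described above can be summarized in the following hedged Python sketch. PyTorch is an assumption; `G` and `D` stand in for any generator mapping unoccluded faces to occluded ones and any discriminator over occluded faces.

```python
import torch
import torch.nn as nn

def gan_step(G: nn.Module, D: nn.Module, unoccluded, real_occluded, opt_g, opt_d):
    """One optimization step of the generator/discriminator game."""
    bce = nn.BCEWithLogitsLoss()
    fake = G(unoccluded)                     # generated occluded face images

    # Discriminator step: score real occluded images as 1, generated as 0.
    opt_d.zero_grad()
    real_logits = D(real_occluded)
    fake_logits = D(fake.detach())           # detach: do not update G here
    d_loss = (bce(real_logits, torch.ones_like(real_logits))
              + bce(fake_logits, torch.zeros_like(fake_logits)))
    d_loss.backward()
    opt_d.step()

    # Generator step: push generated images toward being scored as real.
    opt_g.zero_grad()
    gen_logits = D(fake)
    g_loss = bce(gen_logits, torch.ones_like(gen_logits))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```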
S300: when the detection result is that the face in the face image is occluded by an occluder, restoring the occluded face image through the restoration model.
It can be understood that the acquired face image is examined by the YOLOv3 detection model: if the face is not occluded, the image can be used directly, but if the face is occluded, it cannot be directly recognized and applied, and the occluder must first be removed. After removal, the image can be used according to the actual application requirement. For example, in a real-time image from a camera, the face may be occluded by a helmet because of the shooting angle; the image is then de-helmeted, i.e., the face in the image is restored so that face-related features can be obtained. The occluder is removed and the face image restored by a restoration model. If an initialized, untrained restoration model were used directly, both the accuracy and the precision of the restored image would be uncontrollable and the result would not meet application requirements. The restoration model used in practice can restore occluded face images to the specified requirement precisely because it has been optimized and trained for the actual application before use. Note that, in practice, if the face in the acquired image is detected as occluded, the size or color channels of the image may be adjusted according to set rules to increase restoration efficiency and accuracy.
Further, in a preferred embodiment provided herein, the restoration model is obtained by the following optimization steps:
acquiring a training image set;
and performing negative-feedback optimization of the restoration model through a neural network according to the training image set.
It can be understood that under its initial restoration parameters the restoration model cannot restore images stably, and its accuracy and precision are uncontrollable. To optimize the restoration model, the restoration parameters must therefore be adjusted until face images occluded by an occluder can be restored stably.
Specifically, since the restoration model restores a face image occluded by an occluder to one that is not, adjusting its restoration parameters first requires a training image set composed of a plurality of occluded and unoccluded face image elements. The occluded face image elements serve as the training inputs of the restoration model, and the unoccluded elements serve as its standard output images. To increase the efficiency and accuracy of the optimization, the sizes or color channels of the images in the training set can be preprocessed uniformly. When an input image is fed to the restoration model, the model restores it through the neural network under the current restoration parameters and outputs a real-time restored image. When the error between the real-time restored image and the standard output image reaches the set qualification threshold, the restoration parameters are optimized; if the error cannot reach the threshold, the model must be optimized further under the action of the neural network. That is, the restoration model is optimized by negative feedback through a neural network according to the acquired training image set. For example, in simulated air-combat training, the key feature region of the pilot's face in the acquired image, such as the eye region, must not be occluded by the helmet. When the YOLOv3 detection model finds the eye region occluded by the helmet, de-helmeting image processing by the restoration model is required, and the model must accurately remove the helmet from the face image and restore the pilot's eye region. To ensure accurate restoration in practice, a training image set composed of face image elements occluded by a helmet and elements not occluded by a helmet is acquired first, and for efficiency the image sizes can be adjusted uniformly, for example to 128 x 128 grayscale. The restoration model is then optimized through the neural network. Experiments show that a restoration model with at least 5 encoding layers and 5 decoding layers, convolution kernels of at least 5 x 5, and max pooling of at least 3 x 3, trained for at least 10000 iterations on an error function formed by the Euclidean distance between the output image and the target image, yields a model usable for restoring face images occluded by an occluder.
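Under those reported figures, a minimal PyTorch sketch of such an encoder-decoder could look as follows. The channel widths, upsampling scheme, and optimizer are assumptions added for illustration; only the 5+5 layer count, 5 x 5 kernels, 3 x 3 pooling, 128 x 128 grayscale input, and Euclidean (L2) error follow the text.

```python
import torch
import torch.nn as nn

class RestorationNet(nn.Module):
    """5-layer encoder / 5-layer decoder with 5x5 kernels and 3x3 max pooling."""

    def __init__(self):
        super().__init__()
        chans = [1, 32, 64, 128, 256, 256]   # channel widths are assumptions
        enc = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            enc += [nn.Conv2d(cin, cout, kernel_size=5, padding=2),
                    nn.ReLU(inplace=True),
                    nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]
        self.encoder = nn.Sequential(*enc)   # 128x128 -> 4x4 feature map
        dec = []
        rev = chans[::-1]
        for cin, cout in zip(rev[:-1], rev[1:]):
            dec += [nn.Upsample(scale_factor=2, mode="nearest"),
                    nn.Conv2d(cin, cout, kernel_size=5, padding=2),
                    nn.ReLU(inplace=True)]
        dec[-1] = nn.Sigmoid()               # grayscale output in [0, 1]
        self.decoder = nn.Sequential(*dec)   # 4x4 -> 128x128

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = RestorationNet()
loss_fn = nn.MSELoss()                       # Euclidean (L2) error vs. the target
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # optimizer assumed
```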
S400: outputting the restored face image.
It can be understood that when the acquired face image is detected as occluded by the YOLOv3 detection model, the occluded area of the corresponding face image can be restored by the restoration model, and the restored face image can then be output. The output face image can be further processed or directly applied according to the actual application scenario or usage requirement.
Further, in a preferred embodiment provided herein, the occluder is a helmet.
It can be understood that, for face detection in an image, anything covering the target area of the face to be identified can be regarded as an occluder. The image processing here applies directly to pilot flight training. In conventional flight training the pilot wears no helmet, and facial key points can be detected directly in the captured face image. In training related to simulated air combat, however, the pilot must wear a helmet; in the captured face image the helmet occludes the pilot's face, hindering the identification of facial key points. This directly prevents the pilot from perceiving the real-time changing scene and lowers the quality of flight training. It is therefore necessary to detect whether a captured face image is occluded by the helmet and, if so, to restore it, so that the pilot receives more accurate training instructions or training images and the quality of flight training improves.
Further, in a preferred embodiment provided by the present application, the face image processing method further comprises:
detecting facial key points from the restored face image through a pilot head tracking system;
calculating the face viewing angle from the facial key points and the 3D coordinates of the standard model;
and adjusting, in real time and through a human-computer interaction system, the training image within the face viewing angle according to the result of the viewing-angle calculation.
It can be understood that, during flight training, when the YOLOv3 detection model finds the target area of the face occluded by the helmet in the acquired image, the restoration model restores the face image, and the restored image can then be applied further. Specifically, once the target area in the restored face image has been recovered, the pilot head tracking system can detect the facial key points, and from those key points the face viewing angle is calculated with the 3D coordinates of the originally configured standard model. The human-computer interaction system then adjusts the display angle of the training image in real time according to the calculated viewing angle, and the pilot conducts flight training against the presented real-time image. This avoids the mismatch, caused by wearing a helmet, between the image within the pilot's viewing angle and the training image actually presented, improving the quality of flight training.
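The viewing-angle step, projecting detected 2D key points onto the 3D coordinates of a standard head model, is commonly solved as a perspective-n-point problem. The Python sketch below uses OpenCV for illustration; the six landmark choices, the model coordinates, and the pinhole camera approximation are assumptions, not values disclosed by this application.

```python
import cv2
import numpy as np

# Illustrative 3D coordinates of a standard head model (values assumed).
MODEL_POINTS_3D = np.array([
    (0.0, 0.0, 0.0),          # nose tip
    (0.0, -330.0, -65.0),     # chin
    (-225.0, 170.0, -135.0),  # left eye outer corner
    (225.0, 170.0, -135.0),   # right eye outer corner
    (-150.0, -150.0, -125.0), # left mouth corner
    (150.0, -150.0, -125.0),  # right mouth corner
], dtype=np.float64)

def face_view_angle(keypoints_2d: np.ndarray, frame_h: int, frame_w: int):
    """Estimate head rotation (pitch, yaw, roll in degrees) from six detected
    2D key points, matched to the standard model's 3D coordinates."""
    focal = frame_w  # common pinhole approximation: focal length ~ image width
    camera_matrix = np.array([[focal, 0.0, frame_w / 2],
                              [0.0, focal, frame_h / 2],
                              [0.0, 0.0, 1.0]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))            # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS_3D, keypoints_2d,
                                  camera_matrix, dist_coeffs)
    rot_mat, _ = cv2.Rodrigues(rvec)          # rotation vector -> 3x3 matrix
    angles, *_ = cv2.RQDecomp3x3(rot_mat)     # Euler angles in degrees
    return angles
```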
Referring to fig. 2, a face image processing apparatus 100 according to an embodiment of the present application comprises:
an acquisition module 11, used for acquiring a face image;
a detection module 12, used for detecting, through a YOLOv3 detection model, whether the face in the face image is occluded by an occluder;
a restoration module 13, used for restoring, through a restoration model, the occluded face image when the detection result is that the face in the face image is occluded;
and an output module 14, used for outputting the restored face image.
The acquisition module 11 is used for acquiring a face image.
It can be understood that, for operations such as face detection or recognition, the acquisition module 11 must first acquire an image containing a face. Specifically, the acquisition module 11 may read a real-time image captured by a camera, for example a face image captured in real time; it may also read an image file stored on a computer storage medium, for example a photograph or video frame containing a face. Either source serves the face image processing of this application, so acquisition can be designed in different ways without departing from the protection scope of the present application.
The detection module 12 is used for detecting, through the YOLOv3 detection model, whether the face in the face image is occluded by an occluder.
It can be understood that when the face is occluded, the image acquired by the acquisition module 11 cannot be directly used for face recognition or key-point detection, whereas an unoccluded face image can. As before, the specific criterion for judging occlusion depends on the application: full-face recognition requires that no area of the face be occluded, so within the camera's viewing angle a mask, helmet, or sunglasses all count as occlusion, while recognition of a local area such as the eyes is unaffected by a mask over the mouth and nose, in which case the face is not judged occluded. Therefore, after the acquisition module 11 acquires the image, the detection module 12 detects whether the face is occluded according to the set occlusion criterion, either manually or through a detection model; for example, a face detected as occluded yields a first-class image and an unoccluded face a second-class image, a classification scheme that is illustrative only and does not limit the protection scope of the present application. The YOLOv3 network is selected for its clear structure and excellent real-time performance, with the backbone chosen according to actual requirements: Darknet-53 for the best accuracy, or the lighter and faster Tiny-Darknet. The specific backbone adopted by the YOLOv3 detection model does not limit the protection scope of the present application.
The restoration module 13 is used for restoring, through a restoration model, the occluded face image when the detection result is that the face in the face image is occluded.
It can be understood that the face image acquired by the acquisition module 11 is examined by the detection module 12: if the face is not occluded, the image can be used directly; if it is occluded, it cannot be directly recognized and applied, and the restoration module 13 must remove the occluder from the acquired face image, after which the image can be used according to the actual application requirement. For example, in a real-time image, the face acquired by the acquisition module 11 may be occluded by a helmet because of the shooting angle, and the restoration module 13 then de-helmets it, restoring the face in the image so that face-related features can be obtained. The occluder is removed and the face image restored by a restoration model. If an initialized, untrained restoration model were used directly, both the accuracy and the precision of the restored image would be uncontrollable and the result would not meet application requirements; the restoration model used in practice restores occluded face images to the specified requirement because it is optimized and trained for the actual application before use. In addition, when the detection module 12 detects that the face in the acquired image is occluded, the restoration module 13 may adjust the size or color channels of the acquired face image according to set rules to increase restoration efficiency and accuracy.
The output module 14 is used for outputting the restored face image.
It can be understood that when the face image acquired by the acquisition module 11 is detected as occluded by the detection module 12, the occluded area restored by the restoration module 13 can be recovered, and the restored face image can then be output. The output face image can be further processed or directly applied according to the actual application scenario or usage requirement.
Further, in a preferred embodiment provided by the present application, the face image processing apparatus further comprises an optimization module, used for optimizing the YOLOv3 detection model, and specifically for:
acquiring a training image set;
generating a training image set database from the training image set through a generative adversarial network;
and performing negative-feedback optimization of the YOLOv3 detection model according to the training image set database.
It can be understood that the YOLOv3 network adopted by the detection module 12 has a clear structure and excellent real-time performance, with the backbone chosen according to actual requirements; however, to guarantee the detection accuracy of the detection module 12, the optimization module must train and optimize the YOLOv3 detection model. As in the method described above, the optimization module first acquires a training image set composed of face image elements occluded by an occluder and elements not occluded, selected according to the specific detection scenario, i.e., the occlusion criterion of the actual application. Through the adversarial game between a generative model and a discriminative model, the GAN generates a training image set database with the required effects from the acquired set, enriching the number and variety of the training data. The occluded and unoccluded elements of the database are then fed to the YOLOv3 detection model, the classification accuracy of its results is computed, and the model's optimization parameters are adjusted until occluded and unoccluded face images are separated accurately; that is, negative-feedback optimization is performed on the basis of the database. Because the detection target occupies a large proportion of the image and is easy to identify, the convolution layers of the YOLOv3 network are simplified: the Scale3 part and its Detection part are omitted and only Scale1 and Scale2 are used, and to strengthen their detection capability a plug-and-play module is inserted.
Experiments show that adding a SENet module after the 5th and 10th convolution layers of the YOLOv3 network improves the accuracy of face detection, so the optimized YOLOv3 detection model detects face images more stably, in real time, and accurately. The specific numbers of occluded and unoccluded face image elements in the training image set and the training image set database do not limit the protection scope of the present application.
Further, in a preferred embodiment provided by the present application, the optimization module is used for generating the training image set database from the training image set through the generative adversarial network, and specifically for:
optimizing the generative adversarial network through a neural network according to the training image set;
generating, from public face image elements not occluded by an occluder and through the optimized generative adversarial network, public face image elements occluded by an occluder;
and expanding the training image set database with the public face image elements not occluded by an occluder and the public face image elements occluded by an occluder.
It can be understood that, by varying the relevant generation parameters, a GAN can generate a series of images with different effects from an acquired image, so the face images in the training image set can be turned into face images with a specified effect, rapidly multiplying the relevant face image elements into a training image set database that provides a large amount of training data for the YOLOv3 detection model.
Specifically, since the generation parameters of the GAN must be adjusted continually according to the image generation effect actually required, the optimization module first optimizes those parameters when generating the training image set database through the GAN. The occluded face image elements of the training set acquired by the optimization module serve as the required output images of the GAN, and the unoccluded elements serve as its input images. When an input image is fed to the GAN, a real-time output image is generated under the initial generation parameters, and the generation parameters are adjusted according to the error between the real-time output image and the finally required effect image until that error meets the set qualification threshold, completing the optimization. For example, when an unoccluded face image element acquired by the optimization module is fed to a PaddleGAN network, the generation parameters start at their default values and are adjusted by comparing the network's real-time output with the image required to be generated; neither the specific error threshold nor the type of GAN selected limits the protection scope of the present application. Once optimization is complete and public face image elements not occluded by an occluder have been obtained, occluded counterparts are generated according to the GAN's image generation parameters; the specific number of such public face image elements likewise does not limit the protection scope. Both kinds of elements are added to the training image set, increasing the number of occluded and unoccluded elements alike and forming the corresponding training image set database. The face image elements used for YOLOv3 training thus become more numerous and more diverse, increasing the detection accuracy of the YOLOv3 detection model.
Further, in a preferred embodiment provided herein, the optimization module is further used for optimizing the restoration model, specifically for:
acquiring a training image set;
and performing negative-feedback optimization of the restoration model through a neural network according to the training image set.
It can be understood that the restoration model cannot stably restore the image under the initial restoration parameters, and the restoration accuracy and precision are not controllable. Therefore, the optimization module optimizes the restoration model, and firstly, the restoration parameters need to be adjusted, so that the face image blocked by the blocking object can be stably restored finally.
Specifically, since the restoration model is used to restore a face image occluded by an obstruction to a face image that is not occluded, when the optimization module adjusts the restoration parameters it first needs to acquire a training image set composed of a plurality of face image elements occluded by an obstruction and face image elements not occluded by an obstruction. The occluded face image elements in the acquired training image set serve as the training inputs of the restoration model, and the unoccluded face image elements serve as the standard output images of the restoration model. To increase the efficiency and accuracy of restoration model optimization, the sizes and color channels of the images in the training image set may be uniformly preprocessed.
When an input image is fed into the restoration model, the model restores it according to the current restoration parameters through the neural network and outputs a real-time restored image. When the error between the real-time restored image and the standard output image falls within the set qualification threshold, the restoration parameters are considered optimized. If the error cannot reach the threshold, the restoration model is further optimized under the action of the neural network; that is, the restoration model is optimized in a negative feedback manner, through the neural network, according to the training image set acquired by the optimization module.
For example, during simulated air combat training, a pilot needs the key feature regions of the acquired face image, such as the eye region, not to be occluded by the helmet. When the detection module 12 detects that the eye region of the pilot's face is occluded by the helmet, the restoration module 13 must perform image processing to remove the helmet; it therefore needs to remove the helmet from the face image accurately and restore the pilot's eye region. To ensure that the restoration module 13 restores accurately in practical applications, the optimization module first acquires a training image set composed of a plurality of face image elements occluded by helmets and a plurality of face image elements not occluded by helmets, and may uniformly resize the acquired images to increase the efficiency and accuracy of optimization.
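A minimal sketch of this negative feedback loop, again under stated assumptions: the encoder-decoder shape, the synthetic (occluded, standard output) pairs, the 128x128 size, and the 0.02 threshold are illustrative stand-ins, not the patented implementation. It shows supervised adjustment of the restoration parameters until the error against the standard output meets the qualification threshold.

```python
import torch
import torch.nn as nn

# Synthetic stand-in for a loader of (occluded input, unoccluded standard
# output) pairs; a real set would be uniformly resized, e.g. to 128x128.
restore_loader = [(torch.rand(4, 3, 128, 128), torch.rand(4, 3, 128, 128))
                  for _ in range(8)]

restorer = nn.Sequential(  # minimal encoder-decoder restoration network
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid())

criterion = nn.L1Loss()  # error between restored and standard output image
optimizer = torch.optim.Adam(restorer.parameters(), lr=1e-4)
QUALIFIED_THRESHOLD = 0.02  # assumed value, not from the application

for epoch in range(200):
    epoch_error = 0.0
    for occluded, standard in restore_loader:
        restored = restorer(occluded)        # real-time restored image
        error = criterion(restored, standard)
        optimizer.zero_grad()
        error.backward()                     # negative feedback adjustment
        optimizer.step()
        epoch_error += error.item()
    if epoch_error / len(restore_loader) < QUALIFIED_THRESHOLD:
        break  # restoration parameters meet the qualification threshold
```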
It is to be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.
Claims (9)
1. A face image processing method, characterized by comprising the following steps:
acquiring a face image;
detecting, through a Yolo_v3 detection model, whether the face in the face image is occluded by an obstruction;
when the detection result is that the face in the face image is occluded by an obstruction, restoring, through a restoration model, the face image occluded by the obstruction;
outputting the restored face image;
detecting face key points from the restored face image through a pilot head tracking system;
calculating the face visual angle according to the face key points and the 3D coordinates of a standard model;
and adjusting, in real time through a human-computer interaction system, the training image within the face visual angle according to the result of the face visual angle calculation.
2. The face image processing method according to claim 1, wherein the Yolo_v3 detection model is obtained through the following steps:
acquiring a training image set;
generating a training image set database through a generative adversarial network according to the training image set;
performing negative feedback optimization on the Yolo_v3 detection model according to the training image set database;
wherein acquiring the training image set specifically comprises:
acquiring a training image set composed of a plurality of face image elements occluded by an obstruction and face image elements not occluded by an obstruction.
3. The face image processing method according to claim 2, wherein generating a training image set database through a generative adversarial network according to the training image set specifically comprises:
optimizing the generative adversarial network through a neural network according to the training image set;
generating, through the optimized generative adversarial network, public face image elements occluded by an obstruction from public face image elements not occluded by an obstruction;
and expanding the training image set database with the public face image elements not occluded by an obstruction and the public face image elements occluded by an obstruction.
4. The face image processing method according to claim 1, wherein the restoration model is obtained through the following optimization steps:
acquiring a training image set;
and performing negative feedback optimization on the restoration model through a neural network according to the training image set.
5. The face image processing method according to claim 1, wherein the obstruction is a helmet.
6. A face image processing apparatus, characterized by comprising:
an acquisition module, configured to acquire a face image;
a detection module, configured to detect, through a Yolo_v3 detection model, whether the face in the face image is occluded by an obstruction;
a restoration module, configured to restore, through a restoration model, the face image occluded by the obstruction when the detection result is that the face in the face image is occluded by the obstruction;
an output module, configured to output the restored face image, detect face key points from the restored face image through a pilot head tracking system, calculate the face visual angle according to the face key points and the 3D coordinates of a standard model, and adjust, in real time through a human-computer interaction system, the training image within the face visual angle according to the result of the face visual angle calculation.
7. The face image processing apparatus according to claim 6, further comprising an optimization module configured to optimize the Yolo_v3 detection model, the optimization module being specifically configured to:
acquire a training image set;
generate a training image set database through a generative adversarial network according to the training image set;
and perform negative feedback optimization on the Yolo_v3 detection model according to the training image set database.
8. The face image processing apparatus according to claim 7, wherein, when generating the training image set database through the generative adversarial network according to the training image set, the optimization module is specifically configured to:
optimize the generative adversarial network through a neural network according to the training image set;
generate, through the optimized generative adversarial network, public face image elements occluded by an obstruction from public face image elements not occluded by an obstruction;
and expand the training image set database with the public face image elements not occluded by an obstruction and the public face image elements occluded by an obstruction.
9. The face image processing apparatus according to claim 7, wherein the optimization module is further configured to optimize the restoration model, and is specifically configured to:
acquire a training image set;
and perform negative feedback optimization on the restoration model through a neural network according to the training image set.
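The visual-angle calculation recited in claims 1 and 6, in which detected face key points are combined with the 3D coordinates of a standard model, has the form of a perspective-n-point problem. The sketch below illustrates one possible formulation only: the 3D landmark set and the pinhole camera matrix are generic assumed values (not figures from this application), and cv2.solvePnP is used as one standard way of solving such a problem, not necessarily the method employed here. The returned rotation vector could then drive the real-time adjustment of the training image through a human-computer interaction system.

```python
import cv2
import numpy as np

# Generic 3D standard-model landmarks (nose tip, chin, eye corners, mouth
# corners) in arbitrary model units; these are assumed demo values, not
# coordinates from the present application.
STANDARD_3D_POINTS = np.array([
    [0.0, 0.0, 0.0],           # nose tip
    [0.0, -330.0, -65.0],      # chin
    [-225.0, 170.0, -135.0],   # left eye outer corner
    [225.0, 170.0, -135.0],    # right eye outer corner
    [-150.0, -150.0, -125.0],  # left mouth corner
    [150.0, -150.0, -125.0],   # right mouth corner
], dtype=np.float64)

def face_visual_angle(keypoints_2d: np.ndarray, width: int, height: int):
    """Estimate the face visual angle (a rotation vector) from detected
    2D key points via PnP, using an assumed pinhole camera matrix."""
    camera = np.array([[width, 0, width / 2],
                       [0, width, height / 2],
                       [0, 0, 1]], dtype=np.float64)
    ok, rvec, _ = cv2.solvePnP(STANDARD_3D_POINTS, keypoints_2d, camera, None)
    return rvec if ok else None
```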
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110309597.8A CN112990032B (en) | 2021-03-23 | 2021-03-23 | Face image processing method and device |
Publications (2)
Publication Number | Publication Date
---|---
CN112990032A (en) | 2021-06-18
CN112990032B (en) | 2022-08-16
Family
ID=76333198
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110309597.8A Active CN112990032B (en) | 2021-03-23 | 2021-03-23 | Face image processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112990032B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN107958479A (en) * | 2017-12-26 | 2018-04-24 | 南京开为网络科技有限公司 | Mobile terminal 3D face augmented reality implementation method
CN108876934A (en) * | 2017-12-20 | 2018-11-23 | 北京旷视科技有限公司 | Key point annotation method, device and system, and storage medium
CN109086798A (en) * | 2018-07-03 | 2018-12-25 | 迈吉客科技(北京)有限公司 | Data annotation method and annotation device
CN109657609A (en) * | 2018-12-19 | 2019-04-19 | 新大陆数字技术股份有限公司 | Face recognition method and system
CN109961055A (en) * | 2019-03-29 | 2019-07-02 | 广州市百果园信息技术有限公司 | Face key point detection method, apparatus, device and storage medium
CN110134223A (en) * | 2018-02-08 | 2019-08-16 | 南京乐飞航空技术有限公司 | Helmet tracking system and helmet tracking method based on a nine-degree-of-freedom inertial system
CN110349152A (en) * | 2019-07-16 | 2019-10-18 | 广州图普网络科技有限公司 | Face image quality detection method and device
CN110807451A (en) * | 2020-01-08 | 2020-02-18 | 腾讯科技(深圳)有限公司 | Face key point detection method, device, equipment and storage medium
CN111222459A (en) * | 2020-01-06 | 2020-06-02 | 上海交通大学 | View-independent three-dimensional human pose recognition method for video
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN106599656A (en) * | 2016-11-28 | 2017-04-26 | 深圳超多维科技有限公司 | Display method and device, and electronic device
CN107145867A (en) * | 2017-05-09 | 2017-09-08 | 电子科技大学 | Face and face occluder detection method based on multi-task deep learning
CN109951635B (en) * | 2019-03-18 | 2021-01-12 | Oppo广东移动通信有限公司 | Photographing processing method and device, mobile terminal and storage medium
CN109977841A (en) * | 2019-03-20 | 2019-07-05 | 中南大学 | Face recognition method based on an adversarial deep learning network
CN111985281B (en) * | 2019-05-24 | 2022-12-09 | 内蒙古工业大学 | Image generation model generation method and device, and image generation method and device
CN110826519B (en) * | 2019-11-14 | 2023-08-18 | 深圳华付技术股份有限公司 | Face occlusion detection method and device, computer equipment and storage medium
CN111191616A (en) * | 2020-01-02 | 2020-05-22 | 广州织点智能科技有限公司 | Face occlusion detection method, device, equipment and storage medium
CN111414879B (en) * | 2020-03-26 | 2023-06-09 | 抖音视界有限公司 | Face occlusion degree recognition method and device, electronic equipment and readable storage medium
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant