CN111914629A - Method, apparatus, device and storage medium for generating training data for face recognition - Google Patents


Info

Publication number
CN111914629A
CN111914629A
Authority
CN
China
Prior art keywords
mask
image
face
face image
training data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010564108.9A
Other languages
Chinese (zh)
Other versions
CN111914629B (en)
Inventor
温圣召
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010564108.9A priority Critical patent/CN111914629B/en
Publication of CN111914629A publication Critical patent/CN111914629A/en
Application granted granted Critical
Publication of CN111914629B publication Critical patent/CN111914629B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G06V 40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/012 Head tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F 3/033 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F 3/0346 Pointing devices displaced or positioned by the user, with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, an apparatus, a device and a storage medium for generating training data for face recognition, relating to the technical fields of artificial intelligence, deep learning and computer vision, and in particular to the field of face recognition. The specific implementation scheme is as follows: acquiring a face image without a mask, and acquiring a mask image; calculating the spatial offset angle of the face image without a mask; rotating the mask image according to the spatial offset angle so that the mask image is spatially consistent with the face image without a mask; and fusing the rotated mask image into the face image without a mask to generate a face image with a mask. The method can obtain masked face images, thereby expanding the amount of training data available for building a face recognition model and improving the recognition accuracy of face images of people wearing masks.

Description

Method, apparatus, device and storage medium for generating training data for face recognition
Technical Field
The present application relates to the fields of artificial intelligence, deep learning and computer vision, in particular to the field of face recognition, and more specifically to a method, an apparatus, a device, and a storage medium for generating training data for face recognition.
Background
Face recognition is a biometric technology that identifies a person based on facial feature information. Face recognition products are widely applied in finance, justice, the military, public security, border inspection, government, aerospace, electric power, factories, education, medical treatment, and numerous enterprises and public institutions. As the technology matures further and social acceptance improves, face recognition is being applied in ever more fields.
In the related art, face recognition technology mainly collects and mines a large amount of face data, clusters it, and then performs matching training. A face recognition model's performance in a given scenario mainly depends on whether the training samples contain enough data for that scenario; when such data is scarce, the trained model performs poorly in that scenario.
Nowadays, as more and more people wear masks, mask wearing has become a common scenario. However, because traditional training samples contain few masked faces, existing training schemes perform poorly in the mask-wearing scenario, resulting in low recognition accuracy for face images of people wearing masks.
Disclosure of Invention
A method, an apparatus, a device, and a storage medium for generating training data for face recognition are provided, in order to solve the technical problem in the related art that, because face recognition models lack training data of masked face images, the recognition accuracy of masked face images is low.
According to a first aspect, there is provided a method of generating training data for face recognition, comprising:
acquiring a face image without wearing a mask, and acquiring a mask image;
calculating a spatial offset angle of the face image without wearing the mask;
rotating the mask image according to the spatial offset angle to make the mask image spatially consistent with the face image without the mask; and
fusing the rotated mask image into the face image without the mask to generate a face image with the mask.
In the method for generating training data for face recognition of the embodiments of the application, a face image without a mask and a mask image are first acquired; the spatial offset angle of the face image without a mask is then calculated; the mask image is rotated according to the spatial offset angle so that it is spatially consistent with the face image without a mask; and finally the rotated mask image is fused into the face image without a mask to generate a face image with a mask. In this way, masked face images can be obtained, expanding the amount of training data for the face recognition model and improving the recognition accuracy of masked face images.
According to a second aspect, there is provided an apparatus for generating training data for face recognition, comprising:
the acquisition module is used for acquiring a face image without wearing a mask and acquiring a mask image;
the calculation module is used for calculating the spatial offset angle of the face image without wearing the mask;
the adjusting module is used for rotating the mask image according to the spatial offset angle so as to make the mask image spatially consistent with the face image without wearing the mask; and
the generating module is used for fusing the rotated mask image into the face image without the mask to generate a face image with the mask.
In the apparatus for generating training data for face recognition of the embodiments of the application, the acquisition module first acquires a face image without a mask and a mask image; the calculation module then calculates the spatial offset angle of the face image without a mask; the adjusting module rotates the mask image according to the spatial offset angle so that it is spatially consistent with the face image without a mask; and finally the generating module fuses the rotated mask image into the face image without a mask to generate a face image with a mask. In this way, masked face images can be obtained, expanding the amount of training data for the face recognition model and improving the recognition accuracy of masked face images.
According to a third aspect, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of generating training data for face recognition as described in an embodiment of the above aspect.
According to a fourth aspect, there is provided a non-transitory computer-readable storage medium having stored thereon a computer program for causing a computer to perform the method of generating training data for face recognition as described in an embodiment of the above aspect.
The technology of the application solves the technical problem in the related art that, because face recognition models lack training data of masked face images, the recognition accuracy of masked face images is low.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
Fig. 1 is a schematic flowchart of a method for generating training data for face recognition according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of another method for generating training data for face recognition according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a right-handed Cartesian coordinate system for three-dimensional space provided by an embodiment of the present application;
Fig. 4 is a schematic flowchart of a further method for generating training data for face recognition according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a method for generating training data for face recognition according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a further method for generating training data for face recognition according to an embodiment of the present application;
Fig. 7 is a block diagram of an apparatus for generating training data for face recognition according to an embodiment of the present disclosure;
Fig. 8 is a block diagram of another apparatus for generating training data for face recognition according to an embodiment of the present disclosure;
Fig. 9 is a block diagram of an apparatus for generating training data for face recognition according to an embodiment of the present application;
Fig. 10 is a block diagram of a further apparatus for generating training data for face recognition according to an embodiment of the present application; and
Fig. 11 is a block diagram of an electronic device for generating training data for face recognition according to an embodiment of the application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.
A method, an apparatus, a device, and a storage medium for generating training data for face recognition according to embodiments of the present application are described below with reference to the accompanying drawings.
The embodiments of the application provide a method for generating training data for face recognition, addressing the technical problem in the related art that, because face recognition models lack training data of masked face images, the recognition accuracy of masked face images is low.
The method for generating the training data for face recognition can fuse the mask image to the face image without wearing the mask to generate the face image wearing the mask, solves the problems in the related art, and improves training efficiency.
The method for generating training data for face recognition provided in the embodiments of the present application may be executed by an electronic device, where the electronic device may be a personal computer (PC), a tablet computer, a handheld computer, or the like, which is not limited herein.
In the embodiment of the application, the electronic device can be provided with a processing component, a storage component and a driving component. Optionally, the driving component and the processing component may be integrated, the storage component may store an operating system, an application program, or other program modules, and the processing component implements the method for generating training data for face recognition provided in the embodiment of the present application by executing the application program stored in the storage component.
The masked face images obtained by the method for generating training data for face recognition are used to expand the amount of training data for building the face recognition model, thereby improving the recognition accuracy of masked face images. The face recognition model may be preset in the electronic device, that is, stored in advance in a storage space of the electronic device. This storage space is not limited to physical storage such as a hard disk; it may also be the storage space of a network drive (cloud storage) connected to the electronic device.
Fig. 1 is a flowchart illustrating a method for generating training data for face recognition according to an embodiment of the present disclosure.
The method for generating training data for face recognition in the embodiment of the application can be further executed by the device for generating training data for face recognition in the embodiment of the application, and the device can be configured in electronic equipment to fuse a mask image into a face image without wearing the mask so as to generate the face image wearing the mask.
As a possible situation, the method for generating training data for face recognition in the embodiment of the present application may also be executed at a server side, where the server may be a cloud server, and the method for generating training data for face recognition may be executed at a cloud side.
As shown in fig. 1, the method for generating training data for face recognition may include the following steps:
step 101, acquiring a face image without wearing a mask, and acquiring a mask image, wherein the number of the face images without wearing the mask can be multiple.
It should be noted that the non-mask face images described in this embodiment may be all non-mask face images in a preset face recognition model, and the all non-mask face images may include face images of different users that are not wearing masks and acquired by a camera, where the preset face recognition model may provide data support for a face recognition function. The mask image described in this embodiment may be a common mask image retrieved by the electronic device through a network, or may be a mask image acquired by a camera, which is not limited herein.
Specifically, the electronic device can directly retrieve all face images without masks from a preset face recognition model and retrieve common mask images over the network. It should be noted that the mask image described in this embodiment can be directly fused into the standard frontal face image, that is, the mask image's position in space is aligned with the standard frontal face image.
And 102, calculating the spatial offset angle of the face image without wearing the mask.
It should be noted that the spatial offset angle described in this embodiment may represent the spatial offset of the face image without a mask relative to the standard frontal face image (for example, an offset angle about each axis in space).
In the embodiment of the application, the electronic device can calculate the spatial offset angle of the face image without wearing the mask through a preset algorithm, wherein the preset algorithm can be calibrated according to the actual situation.
And 103, rotating the mask image according to the spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask.
If the mask image and the face image are not spatially consistent, the mask is likely to be placed incorrectly when the images are later fused; for example, the ear loops may not sit on the ears, or the mask body may not fully cover the mouth.
And 104, fusing the rotated mask image to a face image without wearing the mask to generate a face image with the mask.
Specifically, the electronic device can directly call all face images without wearing the mask from a preset face recognition model, retrieve common mask images through a network, and calculate the spatial offset angle of the face images without wearing the mask through a preset algorithm. The electronic device can then rotate the mask image according to the spatial offset angle so that the mask image is spatially consistent with the image of the person's face without the mask. And then the electronic equipment can respectively fuse the rotated mask images to the face images of all the non-worn masks so as to generate face images of the worn masks corresponding to the face images of all the non-worn masks.
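The fusion in step 104 can be sketched as a simple alpha composite; this assumes the mask image has already been rotated into the face's pose and carries an alpha channel marking the mask region, which the patent does not specify:

```python
import numpy as np

def fuse_mask(face: np.ndarray, mask_rgba: np.ndarray) -> np.ndarray:
    """Alpha-composite an aligned mask image onto a face image.

    face:      (H, W, 3) uint8 face image without a mask
    mask_rgba: (H, W, 4) uint8 mask image, already rotated so that it is
               spatially consistent with the face; the alpha channel
               marks the mask region (illustrative assumption).
    """
    alpha = mask_rgba[..., 3:4].astype(np.float32) / 255.0
    blended = face.astype(np.float32) * (1.0 - alpha) \
        + mask_rgba[..., :3].astype(np.float32) * alpha
    return blended.astype(np.uint8)
```

Pixels where the mask's alpha is 255 take the mask's color, pixels outside the mask region keep the face's color, and partial alpha blends the two.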
Further, in order to improve the accuracy of recognizing the face image of the mask, the electronic device may input the face image of the mask to a preset face recognition model after obtaining the face images of the mask corresponding to all the face images of the mask not worn, so as to provide more training data for the face recognition function.
In the embodiment of the application, firstly, a face image without wearing a mask is obtained, a mask image is obtained, then, a spatial offset angle of the face image without wearing the mask is calculated, the mask image is rotated according to the spatial offset angle so that the mask image is consistent with the face image without wearing the mask in space, and finally, the rotated mask image is fused to the face image without wearing the mask so as to generate the face image with wearing the mask. Therefore, the face image of the mask can be obtained, the training data quantity of the face recognition model is expanded, and the recognition accuracy of the face image of the mask is improved.
To clarify the above embodiment, and to obtain a mask image that better matches common aesthetics and usage habits, in one embodiment of the present application, as shown in fig. 2, obtaining a mask image may include the following steps:
step 201, obtaining a sample face image with a mask. The number of the sample face images can be multiple.
In the embodiment of the application, the sample face images can be obtained by having relevant personnel collect a large number of frontal face images of people wearing masks outdoors with a camera, and then screening the images according to preset rules. The preset rules can be calibrated according to the actual situation.
Specifically, relevant personnel collect a large number of frontal face images of people wearing masks outdoors with a camera and input them into the electronic device, which then screens them according to the preset rules to obtain a plurality of sample face images.
Step 202, obtaining the labeling boundary coordinates of the mask area in the sample face image with the mask.
It should be noted that the labeled boundary coordinates described in this embodiment may be three-dimensional, i.e., right-handed Cartesian coordinates in three-dimensional space. In this way, the labeled boundary coordinates completely cover the boundary of the mask region.
And step 203, extracting the mask image according to the marked boundary coordinates of the mask area.
Specifically, after obtaining a sample face image with a mask, the electronic device may take the midpoint between the two eyes in the sample face image as the origin to establish a right-handed Cartesian coordinate system in three-dimensional space, and label the boundary of the mask region in the sample face image to obtain the labeled boundary coordinates of the mask region. The electronic device then extracts the mask image from the sample face image according to the labeled boundary coordinates of the mask region. The mask image obtained in this way better matches common aesthetics and usage habits.
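The extraction in step 203 can be sketched as a crop guided by the labeled boundary. The bounding-box simplification and 2D (x, y) coordinates below are assumptions: the patent labels a 3D boundary, which would allow a tighter, polygonal cut-out.

```python
import numpy as np

def extract_mask_region(image: np.ndarray, boundary: np.ndarray) -> np.ndarray:
    """Cut the mask region out of a sample face image (step 203).

    image:    (H, W, C) sample face image with a mask
    boundary: (N, 2) array of labeled (x, y) boundary vertices
    Crops the axis-aligned bounding box of the labeled boundary.
    """
    x0, y0 = boundary.min(axis=0)
    x1, y1 = boundary.max(axis=0)
    return image[int(y0):int(y1) + 1, int(x0):int(x1) + 1]
```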
To clearly illustrate the embodiment illustrated in fig. 1, in one embodiment of the present application, calculating the spatial offset angle of the non-mask-worn face image may include acquiring a standard frontal face image and generating the spatial offset angle from the non-mask-worn face image and the standard frontal face image.
In the embodiment of the application, the standard front face image may be preset in the electronic device, and the standard front face image may be a standard example given by the face recognition system when the face image of the user is input, or may be a face image automatically generated by the face recognition system according to a preset recognition standard, which is not limited herein.
Specifically, after the electronic device acquires the face image and the mask image of the non-worn mask, the electronic device can call out the standard frontal image from the built-in storage space, and generate a spatial offset angle according to the face image and the standard frontal image of the non-worn mask, so that the spatial position of the mask image can be adjusted subsequently.
In one embodiment of the present application, the spatial offset angle may include a pitch angle, a yaw angle, and a roll angle, where the pitch angle may represent the angle of nodding up or down (fig. 3 shows the right-handed Cartesian coordinate system in three-dimensional space); the yaw angle may represent the angle of turning left or right; and the roll angle may represent the angle of tilting left or right. That is, the pitch, yaw, and roll angles together describe well the spatial pose of the face image without a mask relative to the standard frontal face image.
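The three angles can be combined into a single rotation matrix. The convention below (pitch about x, yaw about y, roll about z, composed as Rz·Ry·Rx) is an assumption, since the patent fixes no convention:

```python
import numpy as np

def rotation_matrix(pitch: float, yaw: float, roll: float) -> np.ndarray:
    """Right-handed rotation matrix from pitch/yaw/roll (radians).

    Convention assumed here: pitch about the x-axis, yaw about the
    y-axis, roll about the z-axis, composed as R = Rz @ Ry @ Rx.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx
```

With all three angles zero the result is the identity, i.e. the face is exactly in the standard frontal pose.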
To clearly illustrate the above embodiment, in an embodiment of the present application, as shown in fig. 4, the generating of the spatial offset angle from the non-mask-worn face image and the standard front face image includes the following steps:
step 401, acquiring a plurality of first key point positions in a face image without a mask.
In the embodiment of the present application, the plurality of first keypoint locations may include keypoint locations of mouth, nose, chin, ear, and the like on the face.
Step 402, a plurality of second key point positions in the standard frontal face image are obtained, wherein the plurality of first key point positions correspond to the plurality of second key point positions one to one.
In the embodiment of the present application, the plurality of second keypoint locations may also include keypoint locations such as the mouth, the nose, the chin, and the ears on the human face, and correspond to the plurality of first keypoint locations one to one.
It should be noted that the keypoint positions described in this embodiment may lie in the area that a worn mask would cover.
And 403, calculating a space offset angle according to the position of the first key point and the position of the second key point.
Specifically, after acquiring the face image without a mask and the mask image, the electronic device may retrieve a standard frontal face image from its built-in storage space. The electronic device can then establish a right-handed Cartesian coordinate system in three-dimensional space in each of the face image without a mask and the standard frontal face image, taking the center of each image as the origin, and acquire the coordinates of keypoints such as the mouth, nose, chin, and ears in both images. The electronic device then calculates the spatial offset angle based on these keypoint coordinates. In this way, the spatial offset angle of the face image without a mask relative to the standard frontal face image can be calculated accurately, so that the spatial position of the mask image can conveniently be adjusted afterwards.
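One standard way to realize the "preset algorithm" of steps 401-403 is to fit the best rotation between the corresponding 3D keypoints with the Kabsch algorithm; this particular choice is an assumption, since the patent leaves the algorithm open:

```python
import numpy as np

def estimate_rotation(src_pts: np.ndarray, dst_pts: np.ndarray) -> np.ndarray:
    """Best-fit rotation aligning reference (frontal) keypoints to the
    observed face keypoints, via the Kabsch algorithm (SVD).

    src_pts, dst_pts: (N, 3) arrays of corresponding 3D keypoints
    (mouth, nose, chin, ears, ...) in the shared coordinate system.
    Returns a 3x3 rotation matrix; pitch/yaw/roll can be read off it.
    """
    src_c = src_pts - src_pts.mean(axis=0)   # center both point sets
    dst_c = dst_pts - dst_pts.mean(axis=0)
    H = src_c.T @ dst_c                      # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])               # guard against reflections
    return Vt.T @ D @ U.T
```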
In order to clearly illustrate the embodiment shown in fig. 1, in an embodiment of the present application, as shown in fig. 5, the mask image is rotated according to a spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask, and the following steps may be included:
step 501, a plurality of third key point positions of the mask image are obtained.
In the embodiment of the present application, the plurality of third key point positions may include positions of two ear loops of the mask, a center of the mask, a highest point of the mask, a lowest point of the mask, and the like.
Step 502, determining position vectors for the plurality of third key point positions according to the spatial offset angle.
In the embodiment of the present application, a right-handed Cartesian coordinate system in three-dimensional space may likewise be established with the center of the mask image as the origin, so as to determine the position vectors of the plurality of third key point positions according to the spatial offset angle.
It should be noted that the right-handed Cartesian coordinate systems established in the face image without a mask, the standard frontal face image, and the mask image should share the same origin.
And step 503, controlling the positions of the plurality of third key points to move according to the position vectors so as to make the mask image spatially consistent with the face image without wearing the mask.
Specifically, after calculating the spatial offset angle of the face image without a mask, the electronic device may establish a right-handed Cartesian coordinate system in three-dimensional space in the mask image, taking the center of the image as the origin, acquire the coordinates of the plurality of third key point positions of the mask image, and determine their position vectors according to the spatial offset angle. The electronic device can then move the third key point positions according to these position vectors, so that the mask image becomes, and remains, spatially consistent with the face image without a mask.
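Steps 501-503 can be sketched as follows, assuming the spatial offset angle has already been turned into a 3x3 rotation matrix R; reading the per-keypoint displacement as the "position vector" is an interpretation, not the patent's exact wording:

```python
import numpy as np

def align_mask_keypoints(keypoints: np.ndarray, R: np.ndarray):
    """Move the mask's keypoints (ear loops, center, highest and lowest
    points, ...) into the face's pose, as in steps 502-503.

    keypoints: (N, 3) coordinates in the shared coordinate system
    R:         3x3 rotation matrix built from the spatial offset angle
    Returns the rotated keypoints and, per keypoint, the displacement
    R @ p - p that carries it to its new position.
    """
    rotated = keypoints @ R.T      # apply R to every row vector
    vectors = rotated - keypoints  # per-keypoint position vectors
    return rotated, vectors
```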
To further expand the training data required for face recognition, in an embodiment of the present application, after the mask image is rotated according to the spatial offset angle so that it is spatially consistent with the face image without a mask, the rotated mask image may additionally be augmented to obtain a plurality of mask images of different forms, which are then fused with the face image without a mask. It should be noted that the data augmentation described in this embodiment may include horizontal/vertical flipping, rotation, scaling, cropping, translation, contrast adjustment, and color jittering, among others, where color jittering may include color replacement.
To clearly illustrate the above embodiment, in an embodiment of the present application, as shown in fig. 6, performing data augmentation on the rotated mask image to obtain a plurality of mask images in different forms may include the following steps:
step 601, extracting mask feature information in the rotated mask image.
In the embodiment of the present application, the mask feature information may include size information of the mask, color information of the mask, position information of the mask in space, shape information of the mask, and the like.
Step 602, controlling the rotated mask image to respectively perform one or more of translation, rotation, size scaling and color replacement according to the mask feature information and a preset selection rule, so as to obtain a plurality of mask images in different forms. The preset selection rule can be calibrated according to actual conditions.
Specifically, after the electronic device completes the rotation of the mask image, one or more kinds of feature information, such as the size information of the mask, the color information of the mask, the position information of the mask in space, and the shape information of the mask, may be extracted from the rotated mask image through a preset extraction algorithm. The electronic device may then control the rotated mask image to undergo one or more of translation, rotation, size scaling, and color replacement according to the mask feature information and the preset selection rule, so as to obtain a plurality of mask images in different forms. Finally, the electronic device may fuse the mask images in different forms to the face images without wearing masks, respectively, so as to generate a plurality of face images wearing masks in different forms, where the same user identifier may correspond to face images wearing masks in a plurality of different forms. This data augmentation greatly expands the amount of training data for the face recognition model, overcoming the problems in the related art of a single mask type, a fixed occlusion position, and a single mask color in mask training data, and thereby greatly improving the recognition accuracy for face images with masks.
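As an illustration of steps 601-602, the toy sketch below represents a mask image as rows of pixel values and applies a few of the operations named above (flip, translation, color replacement) according to caller-supplied selection rules. It is a hypothetical sketch, not the patent's implementation:

```python
def hflip(img):
    """Horizontal flip of an image stored as rows of pixel values."""
    return [row[::-1] for row in img]

def translate_right(img, dx, fill=0):
    """Shift pixels right by dx columns, padding the left edge."""
    return [[fill] * dx + row[:len(row) - dx] for row in img]

def replace_color(img, old, new):
    """Color replacement, one form of color dithering."""
    return [[new if px == old else px for px in row] for row in img]

def augment_mask(img, rules):
    """Each rule (a preset selection) is a list of operations applied in
    order to the rotated mask image, yielding one variant per rule."""
    variants = []
    for ops in rules:
        out = img
        for op in ops:
            out = op(out)
        variants.append(out)
    return variants
```

In practice each operation would act on a real image array (e.g. via an image-processing library), but the selection-rule structure is the same.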
In order to solve the problem that the training data contains insufficient paired images of the same person with and without a mask, in an embodiment of the present application, the method for generating training data for face recognition may further include binding the face image without wearing the mask to the face image wearing the mask, and inputting them to a preset face recognition model to update the preset face recognition model.
Specifically, the electronic device may directly retrieve all face images without wearing masks from a preset face recognition model, retrieve a mask image from a built-in storage space, and calculate the spatial offset angle of each face image without wearing a mask. The electronic device may then rotate the mask image according to the spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask. Next, the electronic device may perform data augmentation on the rotated mask image to obtain a plurality of mask images in different forms, and fuse the plurality of mask images in different forms with the face images without wearing masks, respectively, so as to generate a plurality of face images wearing masks in different forms. Finally, the electronic device binds the face image without wearing a mask and the face images wearing masks in different forms that correspond to the same user identifier, and inputs them into the preset face recognition model to update it. This solves the problem of insufficient paired masked/unmasked data for the same person in the training data, while further improving the recognition accuracy for face images with masks.
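The flow just described can be sketched as a single generation loop. All helper callables below (`offset_angle`, `rotate_mask`, `augment`, `fuse`) are hypothetical placeholders for the steps of this embodiment and are supplied by the caller:

```python
def build_training_pairs(face_images, user_ids, mask_image,
                         offset_angle, rotate_mask, augment, fuse):
    """End-to-end sketch: compute the offset angle, align the mask,
    augment it, fuse each variant, and bind every masked image to the
    unmasked face image carrying the same user identifier."""
    pairs = []
    for face, uid in zip(face_images, user_ids):
        aligned = rotate_mask(mask_image, offset_angle(face))
        for variant in augment(aligned):
            pairs.append((uid, face, fuse(face, variant)))
    return pairs
```

Each `(user_id, unmasked, masked)` triple is what would be fed to the preset face recognition model for updating.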
It should be noted that, in this embodiment, there may be a plurality of face images of the non-worn mask corresponding to the same user identifier.
The method for generating training data for face recognition can greatly expand masked face image data, solving the problems in masked face recognition of insufficient mask data, a single mask type, a single color, and a single occlusion area, as well as the problem of too few paired masked and unmasked images of the same person. In addition, because data augmentation is used, the recognition accuracy for face images with masks is greatly improved without increasing the cost of mask data collection, using only existing data.
Fig. 7 is a block diagram illustrating an apparatus for generating training data for face recognition according to an embodiment of the present application.
The device for generating training data for face recognition in the embodiment of the application can be configured in electronic equipment to realize the fusion of mask images to face images without wearing a mask, so as to generate face images wearing the mask.
As shown in fig. 7, the apparatus 1000 for generating training data for face recognition may include: an acquisition module 100, a calculation module 200, an adjustment module 300 and a generation module 400.
The acquiring module 100 is configured to acquire a face image without wearing a mask, and acquire a mask image. The number of face images without wearing a mask may be plural.
It should be noted that the non-mask face images described in this embodiment may be all non-mask face images in a preset face recognition model, and the all non-mask face images may include face images of different users that are not wearing masks and acquired by a camera, where the preset face recognition model may provide data support for a face recognition function. The mask image described in this embodiment may be a common mask image retrieved by the electronic device through a network, or may be a mask image acquired by a camera, which is not limited herein.
Specifically, the obtaining module 100 may directly retrieve all face images without wearing a mask from a preset face recognition model provided in the electronic device, and retrieve common mask images through a network. It should be noted that the mask image described in this embodiment may be directly fused to the standard front face image, that is, the position of the mask image in space is relatively identical to the standard front face image.
The calculation module 200 is used for calculating the spatial offset angle of the face image without wearing the mask.
It should be noted that the spatial offset angle described in this embodiment may represent the spatial offset angle of the face image without wearing the mask relative to the standard frontal face image (for example, the offset angle about each axis in space).
In this embodiment, the calculation module 200 may calculate the spatial offset angle of the face image without wearing the mask through a preset algorithm, where the preset algorithm may be calibrated according to an actual situation.
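The patent leaves the preset algorithm open. As one possible sketch, the in-plane (roll) component of the offset can be estimated in closed form from matched key point positions, while pitch and yaw would generally need a 3-D head-pose solver (for example, a PnP method). The function name and the least-squares approach below are assumptions, not the patent's algorithm:

```python
import math

def roll_offset_angle(face_pts, frontal_pts):
    """Least-squares in-plane rotation taking the standard frontal key
    points onto the matched key points of the input face image."""
    def centred(pts):
        # Remove the centroid so only rotation remains.
        cx = sum(x for x, _ in pts) / len(pts)
        cy = sum(y for _, y in pts) / len(pts)
        return [(x - cx, y - cy) for x, y in pts]
    a, b = centred(face_pts), centred(frontal_pts)
    num = sum(bx * ay - by * ax for (ax, ay), (bx, by) in zip(a, b))
    den = sum(bx * ax + by * ay for (ax, ay), (bx, by) in zip(a, b))
    return math.atan2(num, den)   # radians, face relative to frontal
```

The same correspondence idea extends to three dimensions once depth is attached to the key points.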
The adjusting module 300 is configured to rotate the mask image according to the spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask.
If the mask image and the face image are not spatially consistent, the mask is likely to be placed at an incorrect position during the subsequent image fusion; for example, an ear strap of the mask may not pass over the ear, or the mask body may not completely cover the mouth.
The generating module 400 is configured to fuse the rotated mask image to a face image without a mask, so as to generate a face image with a mask.
Specifically, the obtaining module 100 may directly retrieve all face images without wearing a mask from a preset face recognition model provided in the electronic device, and retrieve common mask images through a network. The calculation module 200 then calculates the spatial offset angle of the face image without wearing the mask by a preset algorithm. Then, the adjusting module 300 may rotate the mask image according to the spatial offset angle, so that the mask image is spatially consistent with the face image without wearing the mask. The generating module 400 may fuse the rotated mask images to the face images of all the unworn masks, respectively, to generate face images of the unworn masks corresponding to the face images of all the unworn masks.
Further, in order to improve the accuracy of recognizing the face image of the mask, the generating module 400 may input the face images of the mask corresponding to all the face images of the mask not worn to the preset face recognition model to provide more training data for the face recognition function.
In the embodiment of the application, the face image without wearing the mask and the mask image are first obtained through the obtaining module; the spatial offset angle of the face image without wearing the mask is then calculated through the calculation module; the mask image is then rotated through the adjusting module according to the spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask; and finally the rotated mask image is fused to the face image without wearing the mask through the generating module, so as to generate the face image wearing the mask. In this way, a face image wearing a mask can be obtained, the amount of training data for the face recognition model is expanded, and the recognition accuracy for face images with masks is improved.
In one embodiment of the present application, as shown in fig. 8, the calculation module 200 may include an acquisition unit 210 and a generation unit 220.
The acquiring unit 210 is configured to acquire a standard frontal face image.
The generating unit 220 is configured to generate a spatial offset angle from the non-mask face image and the standard frontal face image.
In an embodiment of the present application, the generating unit 220 is specifically configured to acquire a plurality of first keypoint locations in a face image without a mask, and a plurality of second keypoint locations in a standard frontal face image, where the plurality of first keypoint locations and the plurality of second keypoint locations correspond to each other one to one, and calculate a spatial offset angle according to the first keypoint locations and the second keypoint locations.
In an embodiment of the present application, the adjusting module 300 is specifically configured to acquire a plurality of third key point positions of the mask image, determine a position vector of the plurality of third key point positions according to the spatial offset angle, and control the plurality of third key point positions to move according to the position vector, so that the mask image is spatially consistent with the face image without wearing the mask.
In an embodiment of the present application, the obtaining module 100 is specifically configured to obtain a sample face image with a mask, obtain labeled boundary coordinates of a mask region in the sample face image with the mask, and extract a mask image according to the labeled boundary coordinates of the mask region.
In an embodiment of the present application, as shown in fig. 9, the apparatus 1000 for generating training data for face recognition may further include a data expansion module 500, wherein the data expansion module 500 is configured to perform data expansion on the mask image after rotation to obtain a plurality of mask images of different shapes, and the plurality of mask images of different shapes are used for being fused with the face image without wearing a mask.
In an embodiment of the present application, the data augmentation module 500 is specifically configured to extract mask feature information in the mask image after rotation, and control the mask image after rotation to perform one or more of translation, rotation, size scaling and color replacement respectively according to the mask feature information and a preset selection rule, so as to obtain a plurality of mask images in different forms.
In an embodiment of the present application, as shown in fig. 10, the apparatus 1000 for generating training data for face recognition may further include a binding module 600, wherein the binding module 600 is configured to bind a face image without wearing a mask and a face image wearing a mask, and input the bound face image and the face image into a preset face recognition model to update the preset face recognition model.
In one embodiment of the present application, the spatial offset angles include a pitch angle, a yaw angle, and a roll angle.
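The three angles can be composed into a single rotation matrix for rotating the mask image. The composition order below, R = Rz(roll)·Ry(yaw)·Rx(pitch), is one common convention and is an assumption here — the patent does not fix an order:

```python
import math

def matmul(A, B):
    # 3x3 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_from_euler(pitch, yaw, roll):
    """R = Rz(roll) @ Ry(yaw) @ Rx(pitch) in right-hand coordinates."""
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cr, sr = math.cos(roll), math.sin(roll)
    Rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]   # pitch: about x
    Ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]   # yaw: about y
    Rz = [[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]]   # roll: about z
    return matmul(Rz, matmul(Ry, Rx))
```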
It should be noted that the foregoing explanation of the embodiment of the method for generating training data for face recognition is also applicable to the apparatus for generating training data for face recognition in this embodiment, and is not repeated herein.
With the apparatus for generating training data for face recognition of the embodiment of the application, the face image without wearing the mask and the mask image are first obtained through the obtaining module; the spatial offset angle of the face image without wearing the mask is then calculated through the calculation module; the mask image is then rotated through the adjusting module according to the spatial offset angle so that the mask image is spatially consistent with the face image without wearing the mask; and finally the rotated mask image is fused to the face image without wearing the mask through the generating module, so as to generate the face image wearing the mask. In this way, a face image wearing a mask can be obtained, the amount of training data for the face recognition model is expanded, and the recognition accuracy for face images with masks is improved.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 11, the electronic device is a block diagram of an electronic device for generating training data for face recognition according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 11, the electronic apparatus includes: one or more processors 801, memory 802, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 11 illustrates an example of a single processor 801.
The memory 802 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the method of generating training data for face recognition provided herein. A non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method of generating training data for face recognition provided herein.
The memory 802, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for generating training data for face recognition in the embodiments of the present application (for example, the apparatus 1000 for generating training data for face recognition shown in fig. 7 includes the obtaining module 100, the calculating module 200, the adjusting module 300, and the generating module 400). The processor 801 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 802, that is, implements the method of generating training data for face recognition in the above-described method embodiment.
The memory 802 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to the use of the electronic device for the method of generating training data for face recognition, and the like. Further, the memory 802 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 802 optionally includes memory located remotely from the processor 801, which may be connected via a network to the electronic device for the method of generating training data for face recognition. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method of generating training data for face recognition may further comprise: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or other means, and are exemplified by a bus in fig. 11.
The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for the method of generating training data for face recognition; examples of such input devices include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 804 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the application, the face image of the mask can be obtained, so that the training data quantity for establishing the face recognition model is expanded, and the recognition accuracy of the face image of the mask is improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders; this is not limited herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (20)

1. A method of generating training data for face recognition, comprising:
acquiring a face image without wearing a mask, and acquiring a mask image;
calculating a spatial offset angle of the face image without wearing the mask;
rotating the mask image according to the spatial offset angle to make the mask image spatially consistent with the face image without the mask; and
and fusing the rotated mask image to the non-mask-wearing face image to generate a mask-wearing face image.
2. The method of generating training data for face recognition according to claim 1, wherein said calculating a spatial offset angle of the non-mask face image comprises:
acquiring a standard front face image; and
and generating the spatial offset angle according to the facial image of the non-worn mask and the standard frontal face image.
3. The method of generating training data for face recognition according to claim 2, wherein said generating the spatial offset angle from the non-mask face image and the standard frontal face image comprises:
acquiring a plurality of first key point positions in the face image without wearing the mask;
acquiring a plurality of second key point positions in the standard frontal face image, wherein the plurality of first key point positions correspond to the plurality of second key point positions one to one; and
and calculating the space offset angle according to the first key point position and the second key point position.
4. The method of generating training data for face recognition according to claim 1, wherein said rotating the mask image according to the spatial offset angle to spatially conform the mask image to the non-mask face image comprises:
acquiring a plurality of third key point positions of the mask image;
determining position vectors of the positions of the plurality of third key points according to the space offset angle; and
and controlling the positions of the plurality of third key points to move according to the position vector so as to make the mask image consistent with the face image without wearing the mask in space.
5. The method of generating training data for face recognition according to any one of claims 1-4, wherein said acquiring a mask image comprises:
acquiring a sample face image with a mask;
acquiring labeling boundary coordinates of a mask area in the sample face image with the mask; and
and extracting the mask image according to the labeling boundary coordinates of the mask area.
6. The method of generating training data for face recognition according to claim 1, wherein after said rotating the mask image according to the spatial offset angle to make the mask image spatially consistent with the non-mask-worn face image, further comprising:
and performing data amplification on the rotated mask image to obtain a plurality of mask images in different forms, wherein the mask images in different forms are used for being fused with the face image without wearing the mask.
7. The method of generating training data for face recognition according to claim 6, wherein said data augmenting said rotated mask image to obtain a plurality of mask images of different morphologies comprises:
extracting mask characteristic information in the rotated mask image; and
and controlling the rotated mask image to respectively perform one or more of translation, rotation, size scaling and color replacement according to the mask characteristic information and a preset selection rule so as to obtain a plurality of mask images in different forms.
8. The method of generating training data for face recognition according to claim 1, further comprising:
and binding the face image of the non-mask with the face image of the mask, and inputting the bound face image into a preset face recognition model to update the preset face recognition model.
9. A method of generating training data for face recognition according to any one of claims 1 to 4, wherein the spatial offset angles include pitch, yaw and roll.
10. An apparatus for generating training data for face recognition, comprising:
the acquisition module is used for acquiring a face image without wearing a mask and acquiring a mask image;
the calculation module is used for calculating the spatial offset angle of the face image without wearing the mask;
the adjusting module is used for rotating the mask image according to the spatial offset angle so as to make the mask image spatially consistent with the face image without wearing the mask; and
and the generating module is used for fusing the rotated mask image to the face image without the mask so as to generate the face image with the mask.
11. The apparatus for generating training data for face recognition according to claim 10, wherein the computing module comprises:
an acquisition unit configured to acquire a standard frontal face image; and
a generating unit configured to generate the spatial offset angle from the non-mask face image and the standard frontal face image.
12. The apparatus for generating training data for face recognition according to claim 11, wherein the generating unit is specifically configured to:
acquiring a plurality of first key point positions in the face image without wearing the mask;
acquiring a plurality of second key point positions in the standard frontal face image, wherein the plurality of first key point positions correspond to the plurality of second key point positions one to one; and
and calculating the space offset angle according to the first key point position and the second key point position.
13. The apparatus for generating training data for face recognition according to claim 10, wherein the adjusting module is specifically configured to:
acquiring a plurality of third key point positions of the mask image;
determining position vectors of the positions of the plurality of third key points according to the space offset angle; and
and controlling the positions of the plurality of third key points to move according to the position vector so as to make the mask image consistent with the face image without wearing the mask in space.
14. The apparatus for generating training data for face recognition according to any one of claims 10 to 13, wherein the obtaining module is specifically configured to:
acquiring a sample face image with a mask;
acquiring labeling boundary coordinates of a mask area in the sample face image with the mask; and
and extracting the mask image according to the labeling boundary coordinates of the mask area.
15. An apparatus for generating training data for face recognition according to claim 10, further comprising:
and the data amplification module is used for performing data amplification on the rotated mask images to obtain a plurality of mask images in different forms, wherein the mask images in different forms are used for being fused with the face images of the mask which is not worn.
16. The apparatus for generating training data for face recognition according to claim 15, wherein the data augmentation module is specifically configured to:
extracting mask characteristic information in the rotated mask image; and
and controlling the rotated mask image to respectively perform one or more of translation, rotation, size scaling and color replacement according to the mask characteristic information and a preset selection rule so as to obtain a plurality of mask images in different forms.
17. An apparatus for generating training data for face recognition according to claim 10, further comprising:
and the binding module is used for binding the face image of the non-wearing mask with the face image of the wearing mask and inputting the face image into a preset face recognition model so as to update the preset face recognition model.
18. Apparatus for generating training data for face recognition according to any of claims 10 to 13, wherein the spatial offset angles comprise pitch, yaw and roll angles.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating training data for face recognition according to any one of claims 1 to 9.
20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of generating training data for face recognition of any one of claims 1-9.
CN202010564108.9A 2020-06-19 2020-06-19 Method, device, equipment and storage medium for generating training data for face recognition Active CN111914629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010564108.9A CN111914629B (en) 2020-06-19 2020-06-19 Method, device, equipment and storage medium for generating training data for face recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010564108.9A CN111914629B (en) 2020-06-19 2020-06-19 Method, device, equipment and storage medium for generating training data for face recognition

Publications (2)

Publication Number Publication Date
CN111914629A true CN111914629A (en) 2020-11-10
CN111914629B CN111914629B (en) 2024-06-11

Family

ID=73237943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010564108.9A Active CN111914629B (en) 2020-06-19 2020-06-19 Method, device, equipment and storage medium for generating training data for face recognition

Country Status (1)

Country Link
CN (1) CN111914629B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095829A (en) * 2014-04-29 2015-11-25 华为技术有限公司 Face recognition method and system
CN107016370A (en) * 2017-04-10 2017-08-04 电子科技大学 One kind is based on the enhanced partial occlusion face identification method of data
CN107609481A (en) * 2017-08-14 2018-01-19 百度在线网络技术(北京)有限公司 The method, apparatus and computer-readable storage medium of training data are generated for recognition of face
WO2018113523A1 (en) * 2016-12-24 2018-06-28 深圳云天励飞技术有限公司 Image processing method and device, and storage medium
CN108509915A (en) * 2018-04-03 2018-09-07 百度在线网络技术(北京)有限公司 The generation method and device of human face recognition model
CN108960087A (en) * 2018-06-20 2018-12-07 中国科学院重庆绿色智能技术研究院 A kind of quality of human face image appraisal procedure and system based on various dimensions evaluation criteria
WO2020037937A1 (en) * 2018-08-20 2020-02-27 深圳壹账通智能科技有限公司 Facial recognition method and apparatus, terminal, and computer readable storage medium
CN110909654A (en) * 2019-11-18 2020-03-24 深圳市商汤科技有限公司 Training image generation method and device, electronic equipment and storage medium
CN111144356A (en) * 2019-12-30 2020-05-12 华中师范大学 Teacher sight following method and device for remote teaching

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MD. SABBIR EJAZ; MD. RABIUL ISLAM: "Masked Face Recognition Using Convolutional Neural Network", IEEE, 16 April 2020 (2020-04-16) *
ZHANG, KUN; ZHANG, DONGPING; YANG, LI: "Research on sample-augmented face recognition algorithms", Journal of China Jiliang University, no. 02, pages 2 *
CHENG, DANTING; LIANG, JI; PAN, XIAOFANG; LU, YINGMEI; HUANG, SHIHAI; TAN, LIMEI: "Design and implementation of a face recognition system based on Tensorflow", Information Recording Materials, no. 04, 1 April 2019 (2019-04-01) *
ZHAO, JUN: "A brief analysis of technical difficulties and solutions in dynamic face recognition", China Security Protection Technology and Application, no. 02, 30 April 2018 (2018-04-30) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112487923A (en) * 2020-11-25 2021-03-12 奥比中光科技集团股份有限公司 Method and system for acquiring training data of human face head posture
CN113033278A (en) * 2020-12-09 2021-06-25 中国联合网络通信集团有限公司 Method and device for detecting wearing of mask
CN112948618A (en) * 2021-02-25 2021-06-11 成都旺小宝科技有限公司 Effective customer identification method for building sales department
CN113435273A (en) * 2021-06-15 2021-09-24 北京的卢深视科技有限公司 Data augmentation method, data augmentation device, electronic device, and storage medium
CN113435273B (en) * 2021-06-15 2022-03-25 北京的卢深视科技有限公司 Data augmentation method, data augmentation device, electronic device, and storage medium
CN113610115A (en) * 2021-07-14 2021-11-05 广州敏视数码科技有限公司 Efficient face alignment method based on gray level image
CN113610115B (en) * 2021-07-14 2024-04-12 广州敏视数码科技有限公司 Efficient face alignment method based on gray level image
CN113963237A (en) * 2021-12-22 2022-01-21 北京的卢深视科技有限公司 Model training method, mask wearing state detection method, electronic device and storage medium
CN113963183A (en) * 2021-12-22 2022-01-21 北京的卢深视科技有限公司 Model training method, face recognition method, electronic device and storage medium
CN113963237B (en) * 2021-12-22 2022-03-25 北京的卢深视科技有限公司 Model training method, mask wearing state detection method, electronic device and storage medium
CN113963183B (en) * 2021-12-22 2022-05-31 合肥的卢深视科技有限公司 Model training method, face recognition method, electronic device and storage medium

Also Published As

Publication number Publication date
CN111914629B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
CN111914629B (en) Method, device, equipment and storage medium for generating training data for face recognition
CN111914628B (en) Training method and device of face recognition model
CN106845335B (en) Gesture recognition method and device for virtual reality equipment and virtual reality equipment
US10394334B2 (en) Gesture-based control system
CN111259751B (en) Human behavior recognition method, device, equipment and storage medium based on video
CN114303120A (en) Virtual keyboard
EP3811337A1 (en) System for predicting articulated object feature location
CN111860167B (en) Face fusion model acquisition method, face fusion model acquisition device and storage medium
CN111598818A (en) Face fusion model training method and device and electronic equipment
CN111914630A (en) Method, apparatus, device and storage medium for generating training data for face recognition
US11137824B2 (en) Physical input device in virtual reality
CN104978548A (en) Visual line estimation method and visual line estimation device based on three-dimensional active shape model
CN111563855A (en) Image processing method and device
CN111709288B (en) Face key point detection method and device and electronic equipment
CN112102153A (en) Cartoon processing method and device for image, electronic equipment and storage medium
CN111899159B (en) Method, device, apparatus and storage medium for changing hairstyle
Perra et al. Adaptive eye-camera calibration for head-worn devices
CN112116525A (en) Face-changing identification method, device, equipment and computer-readable storage medium
JP2022185096A (en) Method and apparatus of generating virtual idol, and electronic device
CN112905014A (en) Interaction method and device in AR scene, electronic equipment and storage medium
CN111599002A (en) Method and apparatus for generating image
CN111881431A (en) Man-machine verification method, device, equipment and storage medium
CN105931204B (en) Picture restoring method and system
CN111768485B (en) Method and device for marking key points of three-dimensional image, electronic equipment and storage medium
CN111966852B (en) Face-based virtual face-lifting method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant