CN115114651A

CN115114651A - Reversible protection method of face privacy mask based on reversible neural network technology

Info

Publication number: CN115114651A
Application number: CN202210246639.2A
Authority: CN
Inventors: 杨杨; 时铭; 黄一洋
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2022-03-14
Filing date: 2022-03-14
Publication date: 2022-09-27

Abstract

The invention relates to a reversible protection method of a face privacy mask based on a reversible neural network technology, which overcomes the defects of imperceptibility and reversibility of face privacy protection in comparison with the prior art. The invention comprises the following steps: acquiring an original face image; constructing a reversible mask network; training a reversible mask network; acquiring a face image to be protected; generating a face privacy mask; and (4) removing the face privacy mask. The Mask-Net is used for naturally generating a Mask face, the Mask face is placed on the protected face, the Mask face is generated to be worn, the Mask face is taken down from the Mask face worn by an authorized party, and the recovered face is obtained, so that the imperceptibility and reversibility of face privacy protection are guaranteed, and meanwhile, the superior face privacy protection is realized.

Description

Reversible protection method of face privacy mask based on reversible neural network technology

Technical Field

The invention relates to the technical field of image privacy processing, in particular to a reversible protection method of a face privacy mask based on a reversible neural network technology.

Background

Computer vision technology has been widely applied to tasks such as vision recognition, and all of the technologies bring great convenience to people's daily life and also bring great risks. A large number of raw photos and videos are uploaded to the cloud or sent to third parties for analysis and recognition tasks, including facial information. However, the face information is a sensitive information containing a large amount of personal information, and if carelessly protected, the highly sensitive face information is easily accessed and illegally used by a third party or an attacker. Protecting the privacy of such data requires a technique capable of securing facial information while performing conventional computer vision technology applications.

In previous research, the original face privacy protection method is realized by performing irreversible processing on an original face by using methods such as super-pixel, fuzzy, gaussian noise, edge and low resolution. However, the original face privacy protection methods are all directed at the loss of face semantic information, and the reusability of the original face information cannot be guaranteed. With the development of deep learning technology, GAN is developing continuously, and researchers in the field of face privacy protection are focusing on GAN. Ren et al propose motion detection that learns anonymized faces to protect privacy, with minimal impact of video face anonymizers on motion detectors; maximov et al propose CIAGAN, an image and video anonymization model for generating an anti-network based on conditions; you et al propose a novel reversible face privacy protection scheme to anonymize face information and recover the original face information when needed.

It can be found through analysis of existing face privacy protection methods that these existing face privacy protection methods only retain semantic information for computer vision tasks, but permanently destroy original face information before uploading, thereby preventing third parties and legitimate users from accessing the original face information, and almost all of these methods cannot completely restore the original face. In fact, a blurred face is more noticeable to an attacker, since it is unnatural and perceptible.

Furthermore, in some applications, it is desirable that blurred faces be perceived and that the original protected faces need to be restored. For example, on a social platform, many people like to record their lives and share their photos, they want photos to look natural without revealing facial privacy to unauthorized people, and they also want the original face to be displayed to people authorized by friends (friends or family, etc.). If a criminal is caught in video surveillance, the criminal's original face also needs to be restored after privacy is protected.

Therefore, there are two problems to be solved in practical application. Firstly, how to realize the imperceptibility of the face privacy protection, namely how to naturally hide the protected face; and the reversibility is realized, namely, the blocked image can be well restored to the protected face.

Disclosure of Invention

The invention aims to solve the defects of imperceptibility and reversibility of face privacy protection in the prior art, and provides a reversible protection method of a face privacy mask based on a reversible neural network technology to solve the problems.

In order to achieve the purpose, the technical scheme of the invention is as follows:

a reversible protection method of a face privacy mask based on a reversible neural network technology comprises the following steps:

acquiring an original face image: acquiring original protected face images and forming a training data set;

construction of reversible mask network: constructing a reversible mask network based on a reversible neural network;

training of reversible mask networks: inputting a training data set into a reversible mask network for training;

acquiring a face image to be protected: acquiring a face image to be protected;

generating a face privacy mask: inputting a face image to be protected into the trained reversible mask network to generate a face image with a mask;

removing the face privacy mask: and inputting the face image with the mask into the trained reversible mask network, and removing the face privacy mask.

The construction of the reversible mask network comprises the following steps:

setting a reversible Mask network to comprise 2 submodules, namely a Mask-Net module and a reversible embedded network;

the Mask-Net module is set to comprise four modules: the structure of the encoder, the ID injection module, the decoder and the face enhancement module is set as follows:

setting the input of an encoder as a protected face, encoding the protected face through the encoder, and outputting the encoded protected face;

setting the input of an ID injection module as the identity information of the coded protected face and the source face, and migrating the identity information of the source face to the coded protected face on a characteristic level;

setting a mask face which is preliminarily generated by a decoder;

setting a mask face to be generated by using a face enhancement module to perform super-resolution reconstruction on the preliminarily generated mask face;

setting a reversible embedded network to comprise a forward process, a backward process and a loss function thereof, wherein the forward process and the backward process have the same structure;

the set-forward procedure comprises three steps: DWT, forward embedded network, IWT; setting the backward part comprises three steps: DWT, backward recovery network, IWT; the DWT is used for converting the time domain characteristics of the image into frequency domain characteristics, and the IWT is used for restoring the frequency domain characteristics into the image;

setting a forward embedded network, wherein the forward embedded network is formed by cascading N embedded blocks;

setting a backward recovery network, wherein the backward recovery network is formed by cascading N recovery blocks;

setting the overall LOSS function of the reversible embedded network,

the embedding loss function is defined as follows:

wherein the content of the first and second substances,

is equivalent to

Theta is the network parameter, T is the number of training samples,

is used for measuring the face x of a person wearing a mask _Masked And mask face x _Mask Difference between, L _Embedding (θ) represents an insertion loss;

the recovery loss function is defined as follows:

wherein the face is restored

Is equivalent to

Representing the recovery process, χ is the distribution of n, T is the number of training samples,

representing a restored face x _Recovered And protected face x _Protected Difference between, L _Recovering (θ) represents recovery loss;

the low frequency wavelet loss is defined as follows:

wherein, H () _LL Which represents the low-frequency sub-band,

representing the face x wearing the mask _Masked Low frequency sub-band and mask face x _Mask T is the number of training samples, L _{low-frequency} (θ) represents low frequency loss;

defining the total loss L _Total For the insertion loss L _Embedding Recovering the loss L _Recovering And low frequency loss L _{low-frequency} The sum of the weights of the three is expressed as follows:

L _Total ＝λ ₁ L _Embedding +λ ₂ L _Recovering +λ ₃ L _{low-frequency}

wherein λ ₁ 、λ ₂ And λ ₃ Weights for balancing different loss functions.

The training of the reversible neural network comprises the following steps:

setting training parameters: setting the learning rate to be 0.00001, the weight attenuation to be 1000, the batch size to be 16 samples and the cycle number to be 10000 times;

inputting the training data set into a Mask-Net module, and outputting a Mask face;

inputting the input protected face and the mask face into a forward embedded network to carry out mask wearing training:

the embedding process is as follows: inputting protected face x _Protected Hemian face x _Mask ；

For protected face x _Protected And mask face x _Mask Performing wavelet transformation, and defining the change of the feature map through DWT as follows:

wherein B is the batch size, H is the depth, W is the width, and C is the number of channels;

after DWT, the protected face x in the frequency domain is processed _Protected And mask face x _Mask Inputting into a forward embedded network, wherein the forward embedded network has N sameAn embedded block of the structure, for the ith embedded block in the module, the input is

And

output is as

And

the calculation formula is as follows:

where alpha is a multiplicative constant factor used as the impulse function for clamping, p (·),

eta (. cndot.) is represented by a dense block, x _Protected Is a protected face, x _Mask Is a mask face, i represents the ith embedded block;

after the Nth embedded block, the output

And

performing IWT to obtain x _Masked And missing information m, whose expression is as follows:

where m is the missing information, the face x of the wearer _Masked Obeying the same distribution as the lost information m, N representing the Nth recovery block;

randomly sampling from the Case-advertising distribution to generate side information n, which obeys the same distribution as m;

training reversible embedded network mask unloading: use of auxiliary information n to help obtain a restored face x during face-off _Recovered ；

The recovery process is as follows: input mask-on-wear face x _Masked And auxiliary information n;

to wearing mask face x _Masked Carrying out DWT processing on the auxiliary information n;

after DWT, the face x with the mask is worn in the frequency domain _Masked And auxiliary information N is input into a backward recovery network, N recovery blocks with the same structure are shared in the backward recovery network, and the input of the ith recovery block in the module is

And n ⁱ Output is n ⁱ⁺¹ And

the calculation formula is as follows:

x _Masked the face with the mask is worn, n is auxiliary information, alpha is a multiplication constant factor used as an impulse function for clamping, rho (·),

η (·) is represented by a dense block, i represents the ith embedded block;

after the Nth recovery block, output

And n ^N IWT is carried out to obtain a recovered face x _Recovered The expression is as follows:

x _Masked wearing a mask face, n is auxiliary information, x _Recovered It is the recovered face and N represents the nth recovered block.

Advantageous effects

Compared with the prior art, the reversible protection method of the face privacy Mask based on the reversible neural network technology naturally generates a Mask face by using Mask-Net, places the Mask face on the protected face, generates the face of the wearing Mask, takes the Mask face off the face of the wearing Mask on an authorized party to obtain a recovered face, and achieves superior face privacy protection while ensuring imperceptibility and reversibility of face privacy protection.

Compared with the prior art, the method has the advantages that the visual quality of the face generated on the basis of objective parameters is obviously improved, the accuracy of face recovery is improved to a certain extent, and high reusability which is difficult to realize by the existing face privacy protection technology is realized.

Drawings

FIG. 1 is a sequence diagram of the method of the present invention;

FIG. 2 is a view showing the structure of an insert according to the present invention;

fig. 3 is a diagram of a recovery block structure according to the present invention;

FIG. 4 is a graph showing experimental results of the method of the present invention and the prior art method.

Detailed Description

So that the manner in which the above recited features of the present invention can be understood and readily understood, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings, wherein:

as shown in fig. 1, the reversible protection method for a face privacy mask based on the reversible neural network technology of the present invention includes the following steps:

firstly, acquiring an original face image: original protected face images are acquired and a training data set is formed.

And secondly, constructing a reversible mask network: and constructing a reversible mask network based on the reversible neural network. The structure of the reversible network is symmetrical, so that the reversible mask network can be constructed by improving the reversible network by utilizing the symmetry of the reversible network, the high similarity between the recovered face and the original face can be greatly ensured, the high availability of the face image in the face privacy protection method is ensured, and symmetrical blocks of the reversible network are further designed to achieve the purpose of wearing the reversible mask on the face image.

The construction of the reversible mask network comprises the following steps:

(1) the reversible Mask network is set to comprise 2 submodules, namely a Mask-Net module and a reversible embedded network.

(2) The Mask-Net module is set to comprise four modules: the structure of the encoder, the ID injection module, the decoder and the face enhancement module is set as follows:

A1) setting the input of an encoder as a protected face, encoding the protected face through the encoder, and outputting the encoded protected face;

A2) setting the input of an ID injection module as the identity information of the coded protected face and the source face, and migrating the identity information of the source face to the coded protected face on a characteristic level;

A3) setting a mask face which is preliminarily generated by a decoder;

A4) and setting a mask face to be generated by performing super-resolution reconstruction on the preliminarily generated mask face by using a face enhancement module.

(3) Setting a reversible embedded network to comprise a forward process, a backward process and a loss function thereof, wherein the forward process and the backward process have the same structure;

B1) the set-forward procedure comprises three steps: DWT, forward embedded network, IWT; setting the backward part comprises three steps: DWT, backward recovery network, IWT; the DWT is used for converting the time domain characteristics of the image into frequency domain characteristics, and the IWT is used for restoring the frequency domain characteristics into the image;

B2) setting a forward embedded network, wherein the forward embedded network is formed by cascading N embedded blocks;

B3) setting a backward recovery network, and cascading N recovery blocks of the backward recovery network;

B4) setting the overall LOSS function of the reversible embedded network,

the embedding loss function is defined as follows:

wherein the content of the first and second substances,

is equivalent to

Theta is the network parameter, T is the number of training samples,

is used for measuring the face x of a person wearing a mask _Masked And mask face x _Mask Difference therebetween, L _Embedding (θ) represents an insertion loss;

the recovery loss function is defined as follows:

wherein the face is restored

Is equivalent to

the low frequency wavelet loss is defined as follows:

wherein, H () _LL Which represents the low-frequency sub-band,

Thirdly, training a reversible mask network: the training data set is input into a reversible mask network for training.

In the training process, the visual quality of the face mask generated by face changing of the simwap module is reduced, so that the original effect of the whole network cannot be realized, and the visual quality of the face mask generated by the GPEN technology is enhanced after the module. When the reversible forward embedding process is performed, part of image information is lost along with processing, a recovery process needs to be completed under the condition of no additional auxiliary information due to application requirements in an actual scene, the lost information and the auxiliary information n which needs to be used obey the same distribution due to reversible constraint of a network, and the auxiliary information with the same distribution can be obtained in a sampling mode.

The training of the reversible mask network comprises the following specific steps:

(1) setting training parameters: the learning rate was set to 0.00001, the weight decay to 1000, the batch size to 16 samples, and the number of cycles to 10000 times.

(2) And inputting the training data set into a Mask-Net module, and outputting a Mask face.

(3) The input of the input protected face and the face of the mask is input to be embedded into a network to carry out mask wearing training:

C1) the embedding process is as follows: inputting protected face x _Protected Hemian face x _Mask ；

C2) For protected face x _Protected And mask face x _Mask Performing wavelet transformation, and defining the change of the feature map through DWT as follows:

C3) after DWT, the protected face x in the frequency domain is processed _Protected And mask face x _Mask Inputting into a forward embedded network, wherein the forward embedded network has N embedded blocks with the same structure, the specific structure of the embedded blocks is shown in FIG. 2, and for the ith embedded block in the module, the input is

And

output is as

And

the calculation formula is as follows:

C4) after the Nth embedded block, the output

And

the side information n is randomly sampled from the Case-advertising distribution and follows the same distribution as m.

(4) Training reversible embedded network mask unloading: use of auxiliary information n to help obtain a restored face x during face-off _Recovered ；

D1) The recovery process is as follows: input mask-on-wear face x _Masked And auxiliary information n;

D2) to wearing mask face x _Masked Carrying out DWT processing on the auxiliary information n;

D3) throughAfter DWT, the face x is put on the mask in frequency domain _Masked And auxiliary information N are input into a backward recovery network, wherein the backward recovery network has N recovery blocks with the same structure, the structure of the recovery block is shown in FIG. 3, and the input of the ith recovery block in the module is

And n ⁱ Output is n ⁱ⁺¹ And

the calculation formula is as follows:

η (·) is represented by a dense block, i represents the ith embedded block;

D4) after the Nth recovery block, output

Fourthly, acquiring a face image to be protected: and acquiring a face image to be protected.

And fifthly, generating a face privacy mask: and inputting the face image to be protected into the trained reversible mask network to generate the face image with the mask.

Sixthly, removing the face privacy mask: and inputting the face image with the mask into the trained reversible mask network, and removing the face privacy mask.

In practical application, a face image needing to be protected and a face image prepared to be used as a mask are input into a network; obtaining a face image of a user wearing the mask, wherein the privacy information of the user cannot be obtained through the face image of the user wearing the mask; when the authorized party needs to acquire the privacy information of the user, the authorized party can obtain the original face image by inputting the face image with the mask into the network again.

Table 1 gives detailed information of parameter settings in the experiment. We set the batch size to 16 to fully utilize the GPU. According to the mainstream settings, the learning rate was set to 1e-5, the weight decay was set to 1000, the epoch parameter was set to 10000, and the effect of the three rates on the recovery effect is discussed in the next table.

TABLE 1 Experimental setup comparison Table of IMN

Table 2 will discuss in detail the effect of three% on the experiment. Lambda ₁ ，λ ₂ ，λ ₃ Is a weight which balances the different loss functions, λ ₁ And λ ₃ Parameters and mask face x _Mask And wear mask face x _Masked Related to and making them visually identical. Lambda [ alpha ] ₂ Parameter of (2) restoring face x _Recovered And protected face x _Protected In relation to that they are not only at the visual levelAnd also at the pixel level. In pursuit of the best recovery, we randomly selected 20 test images from the database, and the results are shown in table 2. We can see that the better the experimental performance as the parameter 2 increases, which means the recovered face x _Recovered Protected face x reverting to pixel level protection _Protected The closer. The recovery effect is best with a ratio of 1:3: 1.

TABLE 2 comparison of the effects of different parameter ratios

λ ₁ :λ ₂ :λ ₃	1:1:1	1:2:1	1:3:1	1:4:1
					PSNR	38.42	46.35	47.09	46.15
SSIM	0.943	0.988	0.991	0.988
					RMSE	9.945	1.579	1.437	1.706
MAE	2.108	0.905	0.829	0.930

Table 3 discusses experimental results comparison of the test images on objective parameters, and fig. 4 shows the subjective visual effect of the test images. The experiment respectively tests the protected face image x _Protected And restoring image x _Recovered Face mask x _Mask And mask-wearing face x _Masked PSNR and SSIM in between. The results show that the face image x is protected _Protected And restoring image x _Recovered The similarity between the two is very high, and the high reusability of the privacy protection method for the face image is proved. At the same time, show the face x of the mask _Mask And mask-wearing face x _Masked The similarity between the face images is very high, which proves that the face images have little damage to the protected face images when the mask is worn.

TABLE 3 comparison of objective parameters of test images

Table 4 shows the comparison of the effect with the analogous method. The high reusability method only has one reversible mosaic transformation which is proposed by You et al and protects privacy at present. Obviously, the visual quality of the processed image is much worse than the patented method. Therefore, we only compare the quality of the restored face. The experiments were performed on the same data set used. The patented method has better recovery than the You et al method. The underlying reason is that the performance of the reversible network architecture is much better than that of the autoencoder.

Table 4 comparison of effects of the present invention and the conventional method

Methods	PSNR	SSIM	RMSE	MAE
					You et.al	36.67	0.988	14.72	2.74
Proposed method	52.02	0.997	0.441	0.45

The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A reversible protection method of a face privacy mask based on a reversible neural network technology is characterized by comprising the following steps:

11) acquiring an original face image: acquiring original protected face images and forming a training data set;

12) construction of reversible mask network: constructing a reversible mask network based on a reversible neural network;

13) training of reversible mask networks: inputting a training data set into a reversible mask network for training;

14) acquiring a face image to be protected: acquiring a face image to be protected;

15) generating a face privacy mask: inputting a face image to be protected into the trained reversible mask network to generate a face image with a mask;

16) removing the face privacy mask: and inputting the face image with the mask into the trained reversible mask network, and removing the face privacy mask.

2. The reversible protection method for the face privacy mask based on the reversible neural network technology as claimed in claim 1, wherein the construction of the reversible mask network comprises the following steps:

21) setting a reversible Mask network to comprise 2 submodules, namely a Mask-Net module and a reversible embedded network;

22) the Mask-Net module is set to comprise four modules: the structure of the encoder, the ID injection module, the decoder and the face enhancement module is set as follows:

221) setting the input of an encoder as a protected face, encoding the protected face through the encoder, and outputting the encoded protected face;

222) setting the input of an ID injection module as the identity information of the coded protected face and the source face, and migrating the identity information of the source face to the coded protected face on a characteristic level;

223) setting a mask face which is preliminarily generated by a decoder;

224) setting a mask face to be generated by using a face enhancement module to perform super-resolution reconstruction on the preliminarily generated mask face;

23) setting a reversible embedded network to comprise a forward process, a backward process and a loss function thereof, wherein the forward process and the backward process have the same structure;

231) the set-forward procedure comprises three steps: DWT, forward embedded network, IWT; setting the backward part comprises three steps: DWT, backward recovery network, IWT; the DWT is used for converting the time domain characteristics of the image into frequency domain characteristics, and the IWT is used for restoring the frequency domain characteristics into the image;

232) setting a forward embedded network, wherein the forward embedded network is formed by cascading N embedded blocks;

233) setting a backward recovery network, wherein the backward recovery network is formed by cascading N recovery blocks;

234) setting the overall LOSS function of the reversible embedded network,

the embedding loss function is defined as follows:

wherein the content of the first and second substances,

is equivalent to

Theta is a network parameter, T is a number of training samples,

is used for measuring the face x of a person wearing the mask _Masked And mask face x _Mask Difference between, L _Embedding (θ) represents an insertion loss;

the recovery loss function is defined as follows:

wherein the face is restored

Is equivalent to

the low frequency wavelet loss is defined as follows:

wherein, H () _LL Which represents the low-frequency sub-band,

3. The reversible protection method for the face privacy mask based on the reversible neural network technology as claimed in claim 1, wherein the training of the reversible neural network comprises the following steps:

31) setting training parameters: setting the learning rate to be 0.00001, the weight attenuation to be 1000, the batch size to be 16 samples and the cycle number to be 10000 times;

32) inputting the training data set into a Mask-Net module, and outputting a Mask face;

33) the input of the input protected face and the face of the mask is input to be embedded into a network to carry out mask wearing training:

331) the embedding process is as follows: inputting protected face x _Protected Hemian face x _Mask ；

332) For protected face x _Protected And mask face x _Mask Performing wavelet transformation, and defining the change of the feature map through DWT as follows:

333) after DWT, the protected face x in the frequency domain is processed _Protected And mask face x _Mask Inputting into a forward embedded network, wherein N embedded blocks with the same structure are in the forward embedded network, and inputting into the ith embedded block in the module

And

output is as

And

the calculation formula is as follows:

where a is a multiplicative constant factor used as the impulse function to clamp, p (·),

334) after the Nth embedded block, the output

And

34) training reversible embedded network mask unloading: use of auxiliary information n to help obtain a restored face x during face-off _Recovered ；

341) The recovery process is as follows: input mask-on-wear face x _Masked And auxiliary information n;

342) to wearing mask face x _Masked Carrying out DWT processing on the auxiliary information n;

343) after DWT, the face x with the mask is worn in the frequency domain _Masked And auxiliary information N is input into a backward recovery network, N recovery blocks with the same structure are shared in the backward recovery network, and the input of the ith recovery block in the module is

And n ⁱ Output is n ⁱ⁺¹ And

the calculation formula is as follows:

x _Masked wearing a face, n is auxiliary information, alpha is a multiplication constant factor used as an impulse function for clamping, rho (-) and,

η (·) is represented by a dense block, i represents the ith embedded block;

344) after the Nth recovery block, output

x _Masked is to wear a mask face, n is auxiliary information, x _Recovered Is to restore the face and N represents the nth restoration block.