CN114091104A - Method and apparatus for protecting privacy information of image sample set

Info

Publication number: CN114091104A
Application number: CN202111415199.0A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: image, target, protected, sample set, label
Inventors: 李一鸣, 刘沛东, 邱伟峰, 江勇, 夏树涛
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The embodiments of the present specification provide a method and an apparatus for protecting privacy information of an image sample set. One embodiment of the method comprises: determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; determining an image whose label differs from the target label as a selected image; adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.

Description

Method and apparatus for protecting privacy information of image sample set
Technical Field
Embodiments of the present specification relate to the field of artificial intelligence, and in particular to a method and an apparatus for protecting privacy information of an image sample set.
Background
As computer software and artificial intelligence continue to develop, machine learning models are becoming ever more widely used. For example, machine learning models are widely applied in the field of image processing, and training a well-performing model for processing images requires a large number of image samples. At present, many open-source image sample sets are publicly available on the network, while many image sample sets owned by businesses or individuals are allowed only for internal use due to privacy or confidentiality concerns. An image sample set, especially a face image sample set, may contain a large amount of user privacy information, and when such data is leaked or open-sourced, the privacy information it contains may be used maliciously by attackers. For example, an attacker may exploit the correspondence between a leaked face image and its label: a fake video corresponding to the face image may be generated through AI (Artificial Intelligence) face swapping to defraud relatives of the user corresponding to the face image, which greatly threatens user privacy. Such attacks arise mainly because the attacker successfully acquires the correspondence between the face image and the label. Therefore, how to protect an image sample set has important practical significance and value.
Disclosure of Invention
Embodiments of the present specification describe a method and an apparatus for protecting privacy information of an image sample set. The method determines an image sample to be protected as a target sample, and determines an image whose label differs from the target label as a selected image. When the pixel values of the selected image are adjusted, the processing result of the image recognition model for the selected image is required to approach the processing result for the target image, so that, from the perspective of the image recognition model, the obtained adjusted image is close to the target image. This ensures that the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
According to a first aspect, there is provided a method for protecting privacy information of an image sample set, comprising: determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; determining an image whose label differs from the target label as a selected image; adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
In one embodiment, determining an image whose label differs from the target label as the selected image comprises: selecting, from the image sample set to be protected, an image whose label differs from the target label as the selected image.
In one embodiment, before adjusting the pixel values of the selected image, the method further comprises: in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure using the image sample set to be protected, to obtain the image recognition model.
In one embodiment, the processing result is an output vector of an intermediate layer of the image recognition model, and adjusting the pixel values of the selected image to obtain the adjusted image includes: determining the distance between a first output vector for the selected image and a second output vector for the target image; and adjusting the pixel values of the selected image with the objective of minimizing the distance.
In one embodiment, the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
In one embodiment, adjusting the pixel values of the selected image with the objective of minimizing the distance comprises: determining the gradient of the distance with respect to the pixel values; and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
In one embodiment, before adjusting the pixel values of the selected image, the method further comprises: in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, performing model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
In one embodiment, the processing results are output vectors of intermediate layers of the plurality of image recognition models; the adjusting the pixel value of the selected image to obtain an adjusted image includes: determining a weighting result of distances between output vectors of the plurality of image recognition models for the selected image and the target image, respectively; the pixel values of the selected image are adjusted with the goal of minimizing the weighting result.
In one embodiment, the method further comprises: determining a plurality of selected images according to the target image to generate a plurality of adjustment images; and setting the labels of the multiple adjustment images as target labels to obtain multiple protected image samples.
In one embodiment, the image recognition model includes a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
According to a second aspect, there is provided an apparatus for protecting privacy information of an image sample set, comprising: a first determining unit configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; a second determining unit configured to determine an image whose label differs from the target label as a selected image; an adjusting unit configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and a generating unit configured to set the target label as the label of the adjusted image, obtaining a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
According to a third aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method as described in any implementation of the first aspect.
According to the method and the apparatus for protecting privacy information of an image sample set provided by the embodiments, an image sample to be protected is determined as a target sample, which comprises a target image and a target label. Thereafter, an image whose label differs from the target label is determined as the selected image. Then, with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, the pixel values of the selected image are adjusted to obtain an adjusted image. Finally, the target label is set as the label of the adjusted image, yielding a protected image sample containing the adjusted image and the target label. Since the adjustment requires the processing result for the selected image to approach the processing result for the target image, the adjusted image is close to the target image from the perspective of the image recognition model. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
Drawings
FIG. 1 shows a schematic diagram of one application scenario in which embodiments of the present description may be applied;
FIG. 2 shows a flow diagram of a method for protecting privacy information of a sample set of images, according to one embodiment;
FIG. 3 is a schematic diagram showing the composition of information contained in an adjusted image;
FIG. 4 shows a flow diagram of a method for training an image recognition model;
FIG. 5 is a schematic diagram illustrating the effect of generating multiple protected image samples for the same target image;
FIG. 6 shows a schematic block diagram of an apparatus for protecting privacy information of a sample set of images according to one embodiment.
Detailed Description
The technical solutions provided in the present specification are described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. It should be noted that the embodiments and features of the embodiments in the present specification may be combined with each other without conflict.
As mentioned above, an attacker may attack the privacy information contained in an open-source image sample set or in a leaked image sample set intended for internal use only. To prevent this, some owners of image sample sets prevent malicious use of the samples by encrypting all or part of the image samples (e.g., the images or the labels) in the set. However, the existence of encryption is obvious and easily perceived by an attacker. Furthermore, since encrypting the image samples may affect their usability, the encryption approach is not particularly suitable for protecting open-source image sample sets.
To this end, embodiments of the present specification provide a method for protecting privacy information of an image sample set, so that protection of the image sample set can be achieved. Taking the case where the image of an image sample to be protected is a face image and the label is a name, fig. 1 shows a schematic diagram of an application scenario in which the embodiments of the present specification can be applied. As shown in fig. 1, the image sample set to be protected 101 includes a plurality of image samples to be protected, and each image sample to be protected may include a face image and a label. In this example, each image sample to be protected in the image sample set to be protected 101 may be processed. Take the currently processed image sample to be protected to be a face image of Zhang San with the label "Zhang San". First, this image sample to be protected in the image sample set to be protected 101 is determined as the target sample, where the target sample includes the target image 102 and the target label "Zhang San"; in this example, the target image 102 is the face image of Zhang San. Next, a face image whose label differs from "Zhang San" is determined as the selected image 103; in this example, the selected image 103 is a face image of Li Si with the label "Li Si". Then, with the objective that the processing result of the image recognition model 104 for the selected image 103 approaches the processing result for the target image 102, the pixel values of the selected image 103 are adjusted to obtain the adjusted image 105. Here, the image recognition model 104 may be trained using the image sample set to be protected 101. Finally, the target label "Zhang San" is set as the label of the adjusted image 105, resulting in a protected image sample 106 comprising the adjusted image 105 and the target label "Zhang San", for forming a protected image sample set. When the pixel values of the selected image 103 are adjusted, the processing result of the image recognition model 104 for the selected image 103 is required to approach the processing result for the target image 102, and therefore the obtained adjusted image 105 is close to the target image 102 from the perspective of the image recognition model 104. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected 101. From the human visual point of view, the adjusted image 105 still looks like the face image of Li Si (the selected image) and is not the same as the face image of Zhang San (the target image). On this basis, the protected image sample can replace the target sample, so that the correspondence between Zhang San's face image and the target label "Zhang San" is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
With continued reference to the drawings, fig. 2 shows a flow diagram of a method for protecting privacy information of an image sample set according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. As shown in fig. 2, the method for protecting privacy information of an image sample set may include the following steps:
step 201, determining the image sample to be protected in the image sample set to be protected as a target sample.
In this embodiment, the image sample to be protected in the image sample set to be protected may be determined as a target sample, where the target sample may include a target image and a target label. Here, each of the image samples to be protected in the above-mentioned image sample set to be protected may include an image and a label. For example, the image may be a natural image, an object image, an animal image, a medical image, a human face image, a fingerprint image, or the like.
Step 202, determining an image with a label different from the target label as a selected image.
In this embodiment, an image whose label differs from the target label may be determined as the selected image. Here, any image whose label differs from the target label may be so determined. In practice, in order to make the generated protected image sample more realistic and less noticeable to an attacker, an image of a type similar to that of the target image may be chosen as the selected image. For example, when the target image is a face image of a certain person, a face image of another person may be chosen as the selected image. Likewise, when the target image is a fingerprint image of a certain person, a fingerprint image of another person may be chosen as the selected image.
In some alternative implementations, an image with a label different from the target label may be selected from the image sample set to be protected as the selected image. Drawing the selected image from the image sample set to be protected makes the generated protected image sample more realistic and less easily perceived by an attacker.
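As an illustration of step 202, the selection might look like the following minimal sketch. Python is an assumption here, as the patent names no language, and the `Sample` structure and field names are hypothetical:

```python
import random
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Sample:
    image: Any   # e.g., a pixel array with values in [0, 1]
    label: str

def pick_selected_image(dataset: List[Sample], target: Sample) -> Sample:
    """Pick a sample whose label differs from the target label; drawing it
    from the set to be protected keeps the result realistic (step 202)."""
    candidates = [s for s in dataset if s.label != target.label]
    return random.choice(candidates)
```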
Step 203, taking the processing result of the pre-trained image recognition model for the selected image approaching the processing result for the target image as a target, and adjusting the pixel value of the selected image to obtain an adjusted image.
In this embodiment, an image recognition model may be trained in advance using the image sample set to be protected. For example, the image recognition model may be a Deep Neural Network (DNN). The pixel values of the selected image are then adjusted with the objective that the processing result of the image recognition model for the selected image approaches the processing result for the target image, yielding the adjusted image. Here, the processing result may be any of various processing results: for example, the image recognition model may include an input layer, a plurality of intermediate layers, and an output layer, and the processing result may be the output of any layer, such as the output vector of a certain intermediate layer or the probability vector over classes finally output by the output layer. In this example, the process of generating the adjusted image corresponds to adding knowledge about the target image to the selected image. Taking a face image as an example, fig. 3 is a schematic diagram illustrating the composition of the information contained in an adjusted image. In the example shown in fig. 3, the adjusted image 301 contains information from the selected image 302 and a perturbation 303, where the perturbation 303 contains information from the target image. In this way, the adjusted image 301 comes to contain knowledge about the target image.
And 204, setting the target label as a label of the adjustment image to obtain a protected image sample comprising the adjustment image and the target label, and forming a protected image sample set.
In this embodiment, the target label may be set as the label of the adjusted image obtained in step 203, so that a protected image sample comprising the adjusted image and the target label is obtained. The protected image samples obtained for the image samples to be protected in the set may together form a protected image sample set. In an application scenario where an open-source image sample set is protected, the owner of the image sample set to be protected can deploy the protected image sample set online as the open-source image sample set for users. In an internal-use-only application scenario, the protected image sample set may be treated as the internal-use-only image sample set. In this way, a user of the image sample set can train their own machine learning model with the protected image sample set.
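The overall flow of steps 201 to 204 could be sketched as follows, reusing the hypothetical `Sample` and `pick_selected_image` from the sketch above; `adjust_pixels` stands in for the pixel-value optimization of step 203, a concrete version of which is sketched later in this description:

```python
from typing import Any, Callable, List

def protect_sample_set(dataset: List[Sample],
                       adjust_pixels: Callable[[Any, Any], Any]) -> List[Sample]:
    """For each target sample: pick a selected image with a different label,
    adjust its pixels toward the target image's processing result, and
    relabel the adjusted image with the target label."""
    protected = []
    for target in dataset:                                      # step 201
        selected = pick_selected_image(dataset, target)         # step 202
        adjusted = adjust_pixels(selected.image, target.image)  # step 203
        protected.append(Sample(image=adjusted, label=target.label))  # step 204
    return protected
```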
In some optional implementations, before adjusting the pixel values of the selected image, the above method for protecting the privacy information of the image sample set may further include a process of training the image recognition model. For example, the image recognition model may be trained according to whether a model structure used by a target user corresponding to the protected image sample set is known. As shown in fig. 4, fig. 4 is a flowchart illustrating a method for training an image recognition model, which may specifically include the following steps:
step 401, determining whether a model structure used by a target user corresponding to a protected image sample set is known.
In this implementation, the target user corresponding to the protected image sample set may be a user who trains the machine learning model using the protected image sample set. In practice, in some application scenarios, the model structure used by the target user may be known. For example, in an application scenario where the protected image sample set is for internal use only, the model structure used by the target user may be known as the target user may be known. In other application scenarios, the model structure used by the target user may be unknown. For example, in an application scenario where the protected image sample set is an open source image sample set, since the target user is unknown, the model structure used by the target user may also be unknown.
Step 402, in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure by using the image sample set to be protected to obtain an image recognition model.
Step 403, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, performing model training on the image sample set to be protected based on a plurality of preset model structures respectively to obtain a plurality of image recognition models.
In this implementation, since the model structure used by the target user is not known, preset model structures need to be used. Here, the predetermined model structures may be determined by a skilled person in various ways; for example, several commonly used model structures may be chosen as the predetermined model structures. The image sample set to be protected can then be used to perform model training based on each of the predetermined model structures, yielding a plurality of image recognition models. For example, assuming there are 3 predetermined model structures, namely model structure A, model structure B, and model structure C, model training may be performed based on model structure A, model structure B, and model structure C respectively, using the image sample set to be protected, so that 3 image recognition models are obtained.
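A hedged sketch of this surrogate training, assuming PyTorch (not specified by the patent) and hypothetical `model_builders` that each construct a fresh network of one preset structure:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train_surrogates(model_builders, train_loader: DataLoader,
                     epochs: int = 10, lr: float = 1e-3):
    """Train one image recognition model per preset model structure
    (e.g., structures A, B, C) on the image sample set to be protected."""
    models = []
    for build in model_builders:
        model = build()                     # fresh model of one preset structure
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = loss_fn(model(images), labels)
                loss.backward()
                optimizer.step()
        models.append(model.eval())
    return models
```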
In some optional implementations, in a scenario where the model structure used by the target user is known, the trained model may be a single image recognition model, and the processing results for the selected image and for the target image may be output vectors of an intermediate layer of the image recognition model. For example, the image recognition model may include an input layer, a plurality of intermediate layers, and an output layer, and the processing result may be the output vector of any intermediate layer. Optionally, the output layer of the image recognition model may be a softmax layer, and the intermediate layer may be the layer immediately preceding the softmax layer. In this case, adjusting the pixel values of the selected image in step 203 to obtain the adjusted image may be performed as follows:
first, a distance between a first output vector for the selected image and a second output vector for the target image is determined.
In this implementation, the output vector of the intermediate layer of the image recognition model for the selected image may be determined as the first output vector, and the output vector of the intermediate layer for the target image may be determined as the second output vector. The distance between the first output vector and the second output vector is then calculated. Optionally, the distance may be one of: the Euclidean distance, the Manhattan distance, the infinity norm of the difference vector, and the like.
Then, the pixel values of the selected image are adjusted with the aim of minimizing the above-mentioned distance.
In this implementation, various optimization algorithms, such as Projected Gradient Descent (PGD), may be employed to adjust the pixel values of the selected image with the objective of minimizing the distance between the first output vector and the second output vector. Alternatively, the gradient of the distance with respect to the pixel values of the selected image may first be determined; the pixel value adjustment is then performed for a predetermined number of steps with a predetermined step size in the direction of gradient descent. With this implementation, the adjustment of the pixel values of the selected image can be achieved.
For example, assume the image sample set to be protected is

$$D = \{(x_i, y_i)\}_{i=1}^{N}$$

where $N$ denotes the size of the sample set; the selected image is $x_{selected}$ with label $y_{original}$; the target image is $x_{target}$ with label $y_{target}$, where $y_{target} \neq y_{original}$. The adjusted image $x_{modified}$ may then be obtained by:

$$x_{modified} = \mathop{\arg\min}_{x \in [0,1]^{W}} \; d\big(f(x), f(x_{target})\big)$$

where $W$ may represent the image dimension, $f$ may represent the mapping from the input to the intermediate-layer output of the image recognition model, $f(x)$ corresponds to the aforementioned first output vector, $f(x_{target})$ corresponds to the aforementioned second output vector, and $d(\cdot,\cdot)$ may represent a distance measure between the two vectors; in this example, $x$ is initialized with $x_{selected}$. In addition, in this example, the infinity norm of the difference vector may be used as the distance, so that after the pixel values of $x_{selected}$ are adjusted to obtain the adjusted image $x_{modified}$,

$$d\big(f(x_{modified}), f(x_{target})\big) = \big\|f(x_{modified}) - f(x_{target})\big\|_{\infty}.$$

According to this implementation, the protected image sample corresponding to the target sample can be generated in the scenario where the model structure used by the target user is known.
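A minimal sketch of this optimization, again assuming PyTorch; `feature_fn` is assumed to map an input image tensor to the intermediate-layer output $f(x)$ (for a model ending in a softmax layer, this could be the network truncated just before that layer, e.g. `nn.Sequential(*list(model.children())[:-1])` for simple sequential architectures, which is an assumption about the model layout). The plain gradient step mirrors the description above; a PGD-style sign step would be a common variant:

```python
import torch

def adjust_image(feature_fn, x_selected: torch.Tensor, x_target: torch.Tensor,
                 steps: int = 100, step_size: float = 1e-2) -> torch.Tensor:
    """Minimize d(f(x), f(x_target)) over pixel values, with x initialized
    to the selected image; d is the infinity norm of the difference vector."""
    with torch.no_grad():
        target_feat = feature_fn(x_target)     # second output vector (fixed)
    x = x_selected.clone().detach().requires_grad_(True)
    for _ in range(steps):
        dist = torch.norm(feature_fn(x) - target_feat, p=float("inf"))
        grad, = torch.autograd.grad(dist, x)   # gradient of distance w.r.t. pixels
        with torch.no_grad():
            x -= step_size * grad              # predetermined step in descent direction
            x.clamp_(0.0, 1.0)                 # keep x within [0, 1]^W
    return x.detach()
```

Swapping `p=float("inf")` for `p=2` or `p=1` would give the Euclidean or Manhattan variants mentioned above.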
In other alternative implementations, in a scenario where the model structure used by the target user is unknown, the trained models may include a plurality of image recognition models, and the processing results for the selected image and for the target image may be the output vectors of intermediate layers of the plurality of image recognition models. Optionally, the output layers of the plurality of image recognition models may be softmax layers, and each intermediate layer may be the layer immediately preceding the corresponding softmax layer.
In this case, adjusting the pixel values of the selected image in step 203 to obtain the adjusted image may be performed as follows:
first, the weighted result of the distances between the output vectors of the plurality of image recognition models for the selected image and the target image, respectively, is determined.
In this implementation, the distance between the intermediate-layer output vector for the selected image and that for the target image may first be determined for each image recognition model, resulting in a plurality of distances. A weighted result of the plurality of distances is then calculated. Here, the weighted result may be a weighted sum, a weighted average, or the like.
Then, the pixel values of the selected image are adjusted with the objective of minimizing the weighting result.
For example, assuming there are $M$ image recognition models, the adjusted image $x_{modified}$ may be obtained by:

$$x_{modified} = \mathop{\arg\min}_{x \in [0,1]^{W}} \; \sum_{j=1}^{M} w_j \, d\big(f_j(x), f_j(x_{target})\big)$$

where $f_j$ denotes the mapping from the input to the intermediate-layer output of the $j$-th image recognition model and $w_j$ denotes the weight of the corresponding distance (the weighted result here being, e.g., a weighted sum as described above). In this example as well, $x$ may be initialized with $x_{selected}$. With this implementation, the protected image sample corresponding to the target sample can be generated in the scenario where the model structure used by the target user is unknown.
In some optional implementations, the method for protecting privacy information of an image sample set may further include the following:
first, a plurality of selected images are determined for a target image, and a plurality of adjustment images are generated.
In this implementation, for the same target image, multiple selected images may be selected, and step 203 may be repeated multiple times, so as to generate multiple adjustment images.
Then, the labels of the multiple adjusted images are set as target labels, so that multiple protected image samples are obtained.
In this implementation, multiple protected image samples may be generated for the same target image. For example, as shown in fig. 5, fig. 5 is a schematic diagram illustrating the effect of generating multiple protected image samples for the same target image. In the example of an object image shown in fig. 5, the target image 501 shows a ship and the target label is "ship" as seen by human eyes, 3 adjustment images corresponding to the generated 3 protected image samples respectively show a cat, a head α and a horse, and the labels of the 3 adjustment images are the target label "ship". In the example of another face image shown in fig. 5, the target image 502 shows a man's face and the target label is "1133" from the viewpoint of human vision, and the 3 adjustment images corresponding to the generated 3 protected image samples show another man's face, a boy's face and a girl's face, respectively, and the labels of the 3 adjustment images are the target labels "1133".
With this implementation, multiple protected image samples can contain the information of the target sample. That is, the knowledge of the correspondence between the target image and the target label in the target sample is embodied in multiple protected image samples, enhancing that knowledge, so that a user of the image sample set can better learn it when training their own machine learning model, making the model they train more accurate.
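Generating several protected samples for one target sample could then be sketched as below, reusing the hypothetical helpers from the earlier sketches (`k` selected images per target is an assumption; fig. 5 shows three):

```python
def protect_one_target(dataset, target, adjust_pixels, k: int = 3):
    """Generate k protected samples for a single target sample: adjust k
    different selected images and relabel each with the target label."""
    protected = []
    for _ in range(k):
        selected = pick_selected_image(dataset, target)  # may repeat; dedupe if desired
        adjusted = adjust_pixels(selected.image, target.image)
        protected.append(Sample(image=adjusted, label=target.label))
    return protected
```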
Referring back to the above procedure, in the embodiments of the present specification, an adjusted image is generated by adjusting the pixel values of a selected image, and the label of the adjusted image is set to the target label, thereby generating a protected image sample. When the pixel values of the selected image are adjusted, the processing result of the image recognition model for the selected image is required to approach the processing result for the target image, and therefore the obtained adjusted image is close to the target image from the perspective of the image recognition model. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
According to an embodiment of another aspect, an apparatus for protecting privacy information of a sample set of images is provided. The above-described means for protecting the privacy information of the image sample set may be deployed in any device, platform, or device cluster having computing and processing capabilities.
Fig. 6 shows a schematic block diagram of an apparatus for protecting privacy information of a sample set of images according to one embodiment. As shown in fig. 6, the apparatus 600 for protecting privacy information of a sample set of images includes: a first determining unit 601 configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample includes a target image and a target label; a second determining unit 602 configured to determine an image whose label differs from the target label as a selected image; an adjusting unit 603 configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and a generating unit 604 configured to set the target label as the label of the adjusted image, obtaining a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
In some optional implementations of this embodiment, the second determining unit 602 is further configured to: and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
In some optional implementations of this embodiment, the apparatus 600 may further include: a first model training unit (not shown in the figure) configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, perform model training based on the model structure using the image sample set to be protected, to obtain the image recognition model.
In some optional implementations of this embodiment, the processing result is an output vector of an intermediate layer of the image recognition model, and the adjusting unit 603 is further configured to: determine the distance between a first output vector for the selected image and a second output vector for the target image; and adjust the pixel values of the selected image with the objective of minimizing the distance.
In some optional implementations of this embodiment, the distance includes one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
In some optional implementations of this embodiment, adjusting the pixel values of the selected image with the objective of minimizing the distance includes: determining the gradient of the distance with respect to the pixel values; and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
In some optional implementations of this embodiment, the apparatus 600 further includes a second model training unit (not shown in the figure) configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, perform model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
In some optional implementations of this embodiment, the processing result is an output vector of an intermediate layer of the plurality of image recognition models; the above-mentioned adjusting unit 603 is further configured to: determining a weighting result of distances between output vectors of the plurality of image recognition models for the selected image and the target image, respectively; the pixel values of the selected image are adjusted with the goal of minimizing the weighting result.
In some optional implementations of this embodiment, the apparatus 600 further includes: a third determining unit (not shown in the figure) configured to determine a plurality of selected images for the target image and generate a plurality of adjustment images; and a setting unit (not shown in the figure) configured to set the labels of the plurality of adjustment images as target labels, so as to obtain a plurality of protected image samples.
In some optional implementations of this embodiment, the image recognition model includes a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in fig. 2.
According to an embodiment of still another aspect, there is also provided a computing device including a memory and a processor, wherein the memory stores executable code, and the processor executes the executable code to implement the method described in fig. 2.
It will be further appreciated by those of ordinary skill in the art that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether these functions are performed in hardware or software depends on the particular application of the solution and its design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (22)

1. A method for protecting privacy information of a sample set of images, comprising:
determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label;
determining an image with a label different from the target label as a selected image;
adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected;
and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
2. The method of claim 1, wherein the determining an image with a label different from the target label as the selected image comprises:
and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
3. The method of claim 1, wherein prior to adjusting pixel values of the selected image, the method further comprises:
and in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure by using the to-be-protected image sample set to obtain the image recognition model.
4. The method of claim 3, wherein the processing results in an output vector of an intermediate layer of the image recognition model; the adjusting the pixel value of the selected image to obtain an adjusted image includes:
determining a distance between a first output vector for the selected image and a second output vector for the target image;
adjusting pixel values of the selected image with a goal of minimizing the distance.
5. The method of claim 4, wherein the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
6. The method of claim 4, wherein said adjusting pixel values of said selected image to minimize said distance comprises:
determining a gradient of the distance relative to the pixel value;
and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
7. The method of claim 1, wherein prior to adjusting pixel values of the selected image, the method further comprises:
in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, using the image sample set to be protected to perform model training based on each of a plurality of preset model structures, to obtain a plurality of image recognition models.
8. The method of claim 7, wherein the processing results in output vectors of intermediate layers of the plurality of image recognition models; the adjusting the pixel value of the selected image to obtain an adjusted image includes:
determining a weighted result of distances between output vectors of the plurality of image recognition models respectively for the selected image and the target image;
adjusting pixel values of the selected image with a goal of minimizing the weighting result.
9. The method of claim 1, wherein the method further comprises:
determining a plurality of selected images aiming at the target image and generating a plurality of adjusting images;
and setting the labels of the multiple adjusted images as target labels to obtain multiple protected image samples.
10. The method of claim 4 or 8, wherein the image recognition model comprises a softmax layer, and the intermediate layer is a previous layer to the softmax layer.
11. An apparatus for protecting privacy information of a sample set of images, comprising:
a first determining unit configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label;
a second determination unit configured to determine an image having a label different from the target label as a selected image;
an adjusting unit configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected;
and the generating unit is configured to set the target label as a label of the adjustment image, obtain a protected image sample comprising the adjustment image and the target label, and be used for forming a protected image sample set.
12. The apparatus of claim 11, wherein the second determining unit is further configured to:
and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
13. The apparatus of claim 11, wherein the apparatus further comprises:
and the first model training unit is configured to use the image sample set to be protected to perform model training based on the model structure to obtain the image recognition model in response to determining that the model structure used by the target user corresponding to the protected image sample set is known.
14. The apparatus of claim 13, wherein the processing results in an output vector of an intermediate layer of the image recognition model; the adjustment unit is further configured to:
determining a distance between a first output vector for the selected image and a second output vector for the target image;
adjusting pixel values of the selected image with a goal of minimizing the distance.
15. The apparatus of claim 14, wherein the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
16. The apparatus of claim 14, wherein said adjusting pixel values of said selected image to minimize said distance comprises:
determining a gradient of the distance relative to the pixel value;
and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
17. The apparatus of claim 11, wherein the apparatus further comprises:
a second model training unit configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, perform model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
18. The apparatus of claim 17, wherein the processing results in output vectors of intermediate layers of the plurality of image recognition models; the adjustment unit is further configured to:
determining a weighted result of distances between output vectors of the plurality of image recognition models respectively for the selected image and the target image;
adjusting pixel values of the selected image with a goal of minimizing the weighting result.
19. The apparatus of claim 11, wherein the apparatus further comprises:
a third determining unit configured to determine a plurality of selected images for the target image, and generate a plurality of adjustment images;
and the setting unit is configured to set the labels of the multiple adjustment images as target labels to obtain multiple protected image samples.
20. The apparatus of claim 14 or 18, wherein the image recognition model comprises a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
21. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-10.
22. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-10.
Priority Applications (1)

CN202111415199.0A, priority and filing date 2021-11-25: Method and apparatus for protecting privacy information of image sample set (status: Pending)

Publications (1)

CN114091104A, published 2022-02-25

Family

ID=80304614

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination