CN114091104A - Method and apparatus for protecting privacy information of image sample set

Info

Publication number: CN114091104A
Application number: CN202111415199.0A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: image, target, protected, sample set, label
Inventors: 李一鸣, 刘沛东, 邱伟峰, 江勇, 夏树涛
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The embodiments of the present specification provide a method and an apparatus for protecting privacy information of an image sample set. One embodiment of the method comprises: determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; determining an image whose label differs from the target label as a selected image; adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.

Description

Method and apparatus for protecting privacy information of image sample set
Technical Field
Embodiments of the present specification relate to the field of artificial intelligence, and in particular to a method and an apparatus for protecting privacy information of an image sample set.
Background
As computer software and artificial intelligence continue to develop, machine learning models are becoming ever more widely used. For example, machine learning models are widely applied in the field of image processing, and training a well-performing model for processing images requires a large number of image samples. At present, many open-source image sample sets are publicly available on the network, while many image sample sets owned by businesses or individuals are allowed only for internal use due to privacy or confidentiality concerns. An image sample set, especially a face image sample set, may contain a large amount of user privacy information, and when such data is leaked or open-sourced, the privacy information it contains may be used maliciously by attackers. For example, an attacker may exploit the correspondence between a leaked face image and its label: a fake video corresponding to the face image may be generated through AI (Artificial Intelligence) face swapping to defraud relatives of the user corresponding to the face image, which greatly threatens user privacy. Such attacks arise mainly because the attacker successfully acquires the correspondence between the face image and the label. Therefore, how to protect an image sample set has important practical significance and value.
Disclosure of Invention
Embodiments of the present specification describe a method and an apparatus for protecting privacy information of an image sample set. The method determines an image sample to be protected as a target sample, and determines an image whose label differs from the target label as a selected image. When the pixel values of the selected image are adjusted, the processing result of the image recognition model for the selected image is required to approach the processing result for the target image, so that, from the perspective of the image recognition model, the obtained adjusted image is close to the target image. This ensures that the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
According to a first aspect, there is provided a method for protecting privacy information of an image sample set, comprising: determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; determining an image whose label differs from the target label as a selected image; adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
In one embodiment, determining an image whose label differs from the target label as the selected image comprises: selecting, from the image sample set to be protected, an image whose label differs from the target label as the selected image.
In one embodiment, before adjusting the pixel values of the selected image, the method further comprises: in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure using the image sample set to be protected, to obtain the image recognition model.
In one embodiment, the processing result is an output vector of an intermediate layer of the image recognition model, and adjusting the pixel values of the selected image to obtain the adjusted image includes: determining the distance between a first output vector for the selected image and a second output vector for the target image; and adjusting the pixel values of the selected image with the objective of minimizing the distance.
In one embodiment, the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
In one embodiment, adjusting the pixel values of the selected image with the objective of minimizing the distance comprises: determining the gradient of the distance with respect to the pixel values; and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
In one embodiment, before adjusting the pixel values of the selected image, the method further comprises: in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, performing model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
In one embodiment, the processing results are output vectors of intermediate layers of the plurality of image recognition models; the adjusting the pixel value of the selected image to obtain an adjusted image includes: determining a weighting result of distances between output vectors of the plurality of image recognition models for the selected image and the target image, respectively; the pixel values of the selected image are adjusted with the goal of minimizing the weighting result.
In one embodiment, the method further comprises: determining a plurality of selected images according to the target image to generate a plurality of adjustment images; and setting the labels of the multiple adjustment images as target labels to obtain multiple protected image samples.
In one embodiment, the image recognition model includes a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
According to a second aspect, there is provided an apparatus for protecting privacy information of an image sample set, comprising: a first determining unit configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label; a second determining unit configured to determine an image whose label differs from the target label as a selected image; an adjusting unit configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and a generating unit configured to set the target label as the label of the adjusted image, obtaining a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
According to a third aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method as described in any implementation of the first aspect.
According to the method and the apparatus for protecting privacy information of an image sample set provided by the embodiments, an image sample to be protected is determined as a target sample, which comprises a target image and a target label. Thereafter, an image whose label differs from the target label is determined as the selected image. Then, with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, the pixel values of the selected image are adjusted to obtain an adjusted image. Finally, the target label is set as the label of the adjusted image, yielding a protected image sample containing the adjusted image and the target label. Since the adjustment requires the processing result for the selected image to approach the processing result for the target image, the adjusted image is close to the target image from the perspective of the image recognition model. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
Drawings
FIG. 1 shows a schematic diagram of one application scenario in which embodiments of the present description may be applied;
FIG. 2 shows a flow diagram of a method for protecting privacy information of a sample set of images, according to one embodiment;
FIG. 3 is a schematic diagram showing the composition of information contained in an adjusted image;
FIG. 4 shows a flow diagram of a method for training an image recognition model;
FIG. 5 is a schematic diagram illustrating the effect of generating multiple protected image samples for the same target image;
FIG. 6 shows a schematic block diagram of an apparatus for protecting privacy information of a sample set of images according to one embodiment.
Detailed Description
The technical solutions provided in the present specification are described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. It should be noted that the embodiments and features of the embodiments in the present specification may be combined with each other without conflict.
As mentioned above, an attacker may attack the privacy information contained in an open-source image sample set or in a leaked image sample set intended for internal use only. To prevent this, some owners of image sample sets prevent malicious use of the samples by encrypting all or part of the image samples (e.g., the images or the labels) in the set. However, the existence of encryption is obvious and easily perceived by an attacker. Furthermore, since encrypting the image samples may affect their usability, the encryption approach is not particularly suitable for protecting open-source image sample sets.
To this end, embodiments of the present specification provide a method for protecting privacy information of an image sample set, so that protection of the image sample set can be achieved. Taking the case where the image of an image sample to be protected is a face image and the label is a name, fig. 1 shows a schematic diagram of an application scenario in which the embodiments of the present specification can be applied. As shown in fig. 1, the image sample set to be protected 101 includes a plurality of image samples to be protected, and each image sample to be protected may include a face image and a label. In this example, each image sample to be protected in the image sample set to be protected 101 may be processed. Take the currently processed image sample to be protected to be a face image of Zhang San with the label "Zhang San". First, this image sample to be protected in the image sample set to be protected 101 is determined as the target sample, where the target sample includes the target image 102 and the target label "Zhang San"; in this example, the target image 102 is the face image of Zhang San. Next, a face image whose label differs from "Zhang San" is determined as the selected image 103; in this example, the selected image 103 is a face image of Li Si with the label "Li Si". Then, with the objective that the processing result of the image recognition model 104 for the selected image 103 approaches the processing result for the target image 102, the pixel values of the selected image 103 are adjusted to obtain the adjusted image 105. Here, the image recognition model 104 may be trained using the image sample set to be protected 101. Finally, the target label "Zhang San" is set as the label of the adjusted image 105, resulting in a protected image sample 106 comprising the adjusted image 105 and the target label "Zhang San", for forming a protected image sample set. When the pixel values of the selected image 103 are adjusted, the processing result of the image recognition model 104 for the selected image 103 is required to approach the processing result for the target image 102, and therefore the obtained adjusted image 105 is close to the target image 102 from the perspective of the image recognition model 104. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected 101. From the human visual point of view, the adjusted image 105 still looks like the face image of Li Si (the selected image) and is not the same as the face image of Zhang San (the target image). On this basis, the protected image sample can replace the target sample, so that the correspondence between Zhang San's face image and the target label "Zhang San" is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
With continued reference to the drawings, fig. 2 shows a flow diagram of a method for protecting privacy information of an image sample set according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. As shown in fig. 2, the method for protecting privacy information of an image sample set may include the following steps:
step 201, determining the image sample to be protected in the image sample set to be protected as a target sample.
In this embodiment, the image sample to be protected in the image sample set to be protected may be determined as a target sample, where the target sample may include a target image and a target label. Here, each of the image samples to be protected in the above-mentioned image sample set to be protected may include an image and a label. For example, the image may be a natural image, an object image, an animal image, a medical image, a human face image, a fingerprint image, or the like.
Step 202, determining an image with a label different from the target label as a selected image.
In this embodiment, an image whose label differs from the target label may be determined as the selected image. Here, any image whose label differs from the target label may be so determined. In practice, in order to make the generated protected image sample more realistic and less noticeable to an attacker, an image of a type similar to that of the target image may be chosen as the selected image. For example, when the target image is a face image of a certain person, a face image of another person may be chosen as the selected image. Likewise, when the target image is a fingerprint image of a certain person, a fingerprint image of another person may be chosen as the selected image.
In some alternative implementations, an image with a label different from the target label may be selected from the image sample set to be protected as the selected image. Drawing the selected image from the image sample set to be protected makes the generated protected image sample more realistic and less easily perceived by an attacker.
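As an illustration of step 202, the selection might look like the following minimal sketch. Python is an assumption here, as the patent names no language, and the `Sample` structure and field names are hypothetical:

```python
import random
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Sample:
    image: Any   # e.g., a pixel array with values in [0, 1]
    label: str

def pick_selected_image(dataset: List[Sample], target: Sample) -> Sample:
    """Pick a sample whose label differs from the target label; drawing it
    from the set to be protected keeps the result realistic (step 202)."""
    candidates = [s for s in dataset if s.label != target.label]
    return random.choice(candidates)
```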
Step 203, taking the processing result of the pre-trained image recognition model for the selected image approaching the processing result for the target image as a target, and adjusting the pixel value of the selected image to obtain an adjusted image.
In this embodiment, an image recognition model may be trained in advance using the image sample set to be protected. For example, the image recognition model may be a Deep Neural Network (DNN). The pixel values of the selected image are then adjusted with the objective that the processing result of the image recognition model for the selected image approaches the processing result for the target image, yielding the adjusted image. Here, the processing result may be any of various processing results: for example, the image recognition model may include an input layer, a plurality of intermediate layers, and an output layer, and the processing result may be the output of any layer, such as the output vector of a certain intermediate layer or the probability vector over classes finally output by the output layer. In this example, the process of generating the adjusted image corresponds to adding knowledge about the target image to the selected image. Taking a face image as an example, fig. 3 is a schematic diagram illustrating the composition of the information contained in an adjusted image. In the example shown in fig. 3, the adjusted image 301 contains information from the selected image 302 and a perturbation 303, where the perturbation 303 contains information from the target image. In this way, the adjusted image 301 comes to contain knowledge about the target image.
And 204, setting the target label as a label of the adjustment image to obtain a protected image sample comprising the adjustment image and the target label, and forming a protected image sample set.
In this embodiment, the target label may be set as the label of the adjusted image obtained in step 203, so that a protected image sample comprising the adjusted image and the target label is obtained. The protected image samples obtained for the image samples to be protected in the set may together form a protected image sample set. In an application scenario where an open-source image sample set is protected, the owner of the image sample set to be protected can deploy the protected image sample set online as the open-source image sample set for users. In an internal-use-only application scenario, the protected image sample set may be treated as the internal-use-only image sample set. In this way, a user of the image sample set can train their own machine learning model with the protected image sample set.
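The overall flow of steps 201 to 204 could be sketched as follows, reusing the hypothetical `Sample` and `pick_selected_image` from the sketch above; `adjust_pixels` stands in for the pixel-value optimization of step 203, a concrete version of which is sketched later in this description:

```python
from typing import Any, Callable, List

def protect_sample_set(dataset: List[Sample],
                       adjust_pixels: Callable[[Any, Any], Any]) -> List[Sample]:
    """For each target sample: pick a selected image with a different label,
    adjust its pixels toward the target image's processing result, and
    relabel the adjusted image with the target label."""
    protected = []
    for target in dataset:                                      # step 201
        selected = pick_selected_image(dataset, target)         # step 202
        adjusted = adjust_pixels(selected.image, target.image)  # step 203
        protected.append(Sample(image=adjusted, label=target.label))  # step 204
    return protected
```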
In some optional implementations, before adjusting the pixel values of the selected image, the above method for protecting the privacy information of the image sample set may further include a process of training the image recognition model. For example, the image recognition model may be trained according to whether a model structure used by a target user corresponding to the protected image sample set is known. As shown in fig. 4, fig. 4 is a flowchart illustrating a method for training an image recognition model, which may specifically include the following steps:
step 401, determining whether a model structure used by a target user corresponding to a protected image sample set is known.
In this implementation, the target user corresponding to the protected image sample set may be a user who trains the machine learning model using the protected image sample set. In practice, in some application scenarios, the model structure used by the target user may be known. For example, in an application scenario where the protected image sample set is for internal use only, the model structure used by the target user may be known as the target user may be known. In other application scenarios, the model structure used by the target user may be unknown. For example, in an application scenario where the protected image sample set is an open source image sample set, since the target user is unknown, the model structure used by the target user may also be unknown.
Step 402, in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure by using the image sample set to be protected to obtain an image recognition model.
Step 403, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, performing model training on the image sample set to be protected based on a plurality of preset model structures respectively to obtain a plurality of image recognition models.
In this implementation, since the model structure used by the target user is not known, preset model structures need to be used. Here, the predetermined model structures may be determined by a skilled person in various ways; for example, several commonly used model structures may be chosen as the predetermined model structures. The image sample set to be protected can then be used to perform model training based on each of the predetermined model structures, yielding a plurality of image recognition models. For example, assuming there are 3 predetermined model structures, namely model structure A, model structure B, and model structure C, model training may be performed based on model structure A, model structure B, and model structure C respectively, using the image sample set to be protected, so that 3 image recognition models are obtained.
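A hedged sketch of this surrogate training, assuming PyTorch (not specified by the patent) and hypothetical `model_builders` that each construct a fresh network of one preset structure:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train_surrogates(model_builders, train_loader: DataLoader,
                     epochs: int = 10, lr: float = 1e-3):
    """Train one image recognition model per preset model structure
    (e.g., structures A, B, C) on the image sample set to be protected."""
    models = []
    for build in model_builders:
        model = build()                     # fresh model of one preset structure
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = loss_fn(model(images), labels)
                loss.backward()
                optimizer.step()
        models.append(model.eval())
    return models
```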
In some optional implementations, in a scenario where the model structure used by the target user is known, the trained model may be a single image recognition model, and the processing results for the selected image and for the target image may be output vectors of an intermediate layer of the image recognition model. For example, the image recognition model may include an input layer, a plurality of intermediate layers, and an output layer, and the processing result may be the output vector of any intermediate layer. Optionally, the output layer of the image recognition model may be a softmax layer, and the intermediate layer may be the layer immediately preceding the softmax layer. In this case, adjusting the pixel values of the selected image in step 203 to obtain the adjusted image may be performed as follows:
first, a distance between a first output vector for the selected image and a second output vector for the target image is determined.
In this implementation, the output vector of the intermediate layer of the image recognition model for the selected image may be determined as the first output vector, and the output vector of the intermediate layer for the target image may be determined as the second output vector. The distance between the first output vector and the second output vector is then calculated. Optionally, the distance may be one of: the Euclidean distance, the Manhattan distance, the infinity norm of the difference vector, and the like.
Then, the pixel values of the selected image are adjusted with the aim of minimizing the above-mentioned distance.
In this implementation, various optimization algorithms, such as Projected Gradient Descent (PGD), may be employed to adjust the pixel values of the selected image with the objective of minimizing the distance between the first output vector and the second output vector. Alternatively, the gradient of the distance with respect to the pixel values of the selected image may first be determined; the pixel value adjustment is then performed for a predetermined number of steps with a predetermined step size in the direction of gradient descent. With this implementation, the adjustment of the pixel values of the selected image can be achieved.
For example, assume the image sample set to be protected is

$$D = \{(x_i, y_i)\}_{i=1}^{N}$$

where $N$ denotes the size of the sample set; the selected image is $x_{selected}$ with label $y_{original}$; the target image is $x_{target}$ with label $y_{target}$, where $y_{target} \neq y_{original}$. The adjusted image $x_{modified}$ may then be obtained by:

$$x_{modified} = \mathop{\arg\min}_{x \in [0,1]^{W}} \; d\big(f(x), f(x_{target})\big)$$

where $W$ may represent the image dimension, $f$ may represent the mapping from the input to the intermediate-layer output of the image recognition model, $f(x)$ corresponds to the aforementioned first output vector, $f(x_{target})$ corresponds to the aforementioned second output vector, and $d(\cdot,\cdot)$ may represent a distance measure between the two vectors; in this example, $x$ is initialized with $x_{selected}$. In addition, in this example, the infinity norm of the difference vector may be used as the distance, so that after the pixel values of $x_{selected}$ are adjusted to obtain the adjusted image $x_{modified}$,

$$d\big(f(x_{modified}), f(x_{target})\big) = \big\|f(x_{modified}) - f(x_{target})\big\|_{\infty}.$$

According to this implementation, the protected image sample corresponding to the target sample can be generated in the scenario where the model structure used by the target user is known.
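A minimal sketch of this optimization, again assuming PyTorch; `feature_fn` is assumed to map an input image tensor to the intermediate-layer output $f(x)$ (for a model ending in a softmax layer, this could be the network truncated just before that layer, e.g. `nn.Sequential(*list(model.children())[:-1])` for simple sequential architectures, which is an assumption about the model layout). The plain gradient step mirrors the description above; a PGD-style sign step would be a common variant:

```python
import torch

def adjust_image(feature_fn, x_selected: torch.Tensor, x_target: torch.Tensor,
                 steps: int = 100, step_size: float = 1e-2) -> torch.Tensor:
    """Minimize d(f(x), f(x_target)) over pixel values, with x initialized
    to the selected image; d is the infinity norm of the difference vector."""
    with torch.no_grad():
        target_feat = feature_fn(x_target)     # second output vector (fixed)
    x = x_selected.clone().detach().requires_grad_(True)
    for _ in range(steps):
        dist = torch.norm(feature_fn(x) - target_feat, p=float("inf"))
        grad, = torch.autograd.grad(dist, x)   # gradient of distance w.r.t. pixels
        with torch.no_grad():
            x -= step_size * grad              # predetermined step in descent direction
            x.clamp_(0.0, 1.0)                 # keep x within [0, 1]^W
    return x.detach()
```

Swapping `p=float("inf")` for `p=2` or `p=1` would give the Euclidean or Manhattan variants mentioned above.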
In other alternative implementations, in a scenario where the model structure used by the target user is unknown, the trained models may include a plurality of image recognition models, and the processing results for the selected image and for the target image may be the output vectors of intermediate layers of the plurality of image recognition models. Optionally, the output layers of the plurality of image recognition models may be softmax layers, and each intermediate layer may be the layer immediately preceding the corresponding softmax layer.
In this case, adjusting the pixel values of the selected image in step 203 to obtain the adjusted image may be performed as follows:
first, the weighted result of the distances between the output vectors of the plurality of image recognition models for the selected image and the target image, respectively, is determined.
In this implementation, the distance between the intermediate-layer output vector for the selected image and that for the target image may first be determined for each image recognition model, resulting in a plurality of distances. A weighted result of the plurality of distances is then calculated. Here, the weighted result may be a weighted sum, a weighted average, or the like.
Then, the pixel values of the selected image are adjusted with the objective of minimizing the weighting result.
For example, assuming there are $M$ image recognition models, the adjusted image $x_{modified}$ may be obtained by:

$$x_{modified} = \mathop{\arg\min}_{x \in [0,1]^{W}} \; \sum_{j=1}^{M} w_j \, d\big(f_j(x), f_j(x_{target})\big)$$

where $f_j$ denotes the mapping from the input to the intermediate-layer output of the $j$-th image recognition model and $w_j$ denotes the weight of the corresponding distance (the weighted result here being, e.g., a weighted sum as described above). In this example as well, $x$ may be initialized with $x_{selected}$. With this implementation, the protected image sample corresponding to the target sample can be generated in the scenario where the model structure used by the target user is unknown.
In some optional implementations, the method for protecting privacy information of an image sample set may further include the following:
first, a plurality of selected images are determined for a target image, and a plurality of adjustment images are generated.
In this implementation, for the same target image, multiple selected images may be selected, and step 203 may be repeated multiple times, so as to generate multiple adjustment images.
Then, the labels of the multiple adjusted images are set as target labels, so that multiple protected image samples are obtained.
In this implementation, multiple protected image samples may be generated for the same target image. For example, as shown in fig. 5, fig. 5 is a schematic diagram illustrating the effect of generating multiple protected image samples for the same target image. In the example of an object image shown in fig. 5, the target image 501 shows a ship and the target label is "ship" as seen by human eyes, 3 adjustment images corresponding to the generated 3 protected image samples respectively show a cat, a head α and a horse, and the labels of the 3 adjustment images are the target label "ship". In the example of another face image shown in fig. 5, the target image 502 shows a man's face and the target label is "1133" from the viewpoint of human vision, and the 3 adjustment images corresponding to the generated 3 protected image samples show another man's face, a boy's face and a girl's face, respectively, and the labels of the 3 adjustment images are the target labels "1133".
With this implementation, multiple protected image samples can contain the information of the target sample. That is, the knowledge of the correspondence between the target image and the target label in the target sample is embodied in multiple protected image samples, enhancing that knowledge, so that a user of the image sample set can better learn it when training their own machine learning model, making the model they train more accurate.
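Generating several protected samples for one target sample could then be sketched as below, reusing the hypothetical helpers from the earlier sketches (`k` selected images per target is an assumption; fig. 5 shows three):

```python
def protect_one_target(dataset, target, adjust_pixels, k: int = 3):
    """Generate k protected samples for a single target sample: adjust k
    different selected images and relabel each with the target label."""
    protected = []
    for _ in range(k):
        selected = pick_selected_image(dataset, target)  # may repeat; dedupe if desired
        adjusted = adjust_pixels(selected.image, target.image)
        protected.append(Sample(image=adjusted, label=target.label))
    return protected
```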
Referring back to the above procedure, in the embodiments of the present specification, an adjusted image is generated by adjusting the pixel values of a selected image, and the label of the adjusted image is set to the target label, thereby generating a protected image sample. When the pixel values of the selected image are adjusted, the processing result of the image recognition model for the selected image is required to approach the processing result for the target image, and therefore the obtained adjusted image is close to the target image from the perspective of the image recognition model. Therefore, the effect of a model trained with the protected image sample set is close to that of a model trained with the image sample set to be protected. From the human visual point of view, the adjusted image still looks like the selected image and is not the same as the target image. On this basis, the protected image sample can replace the target sample, so that the correspondence between the target image and the target label in the target sample is protected, privacy information is protected, and a user of the protected image sample set can still train a model with good effect.
According to an embodiment of another aspect, an apparatus for protecting privacy information of a sample set of images is provided. The above-described means for protecting the privacy information of the image sample set may be deployed in any device, platform, or device cluster having computing and processing capabilities.
Fig. 6 shows a schematic block diagram of an apparatus for protecting privacy information of a sample set of images according to one embodiment. As shown in fig. 6, the apparatus 600 for protecting privacy information of a sample set of images includes: a first determining unit 601 configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample includes a target image and a target label; a second determining unit 602 configured to determine an image whose label differs from the target label as a selected image; an adjusting unit 603 configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected; and a generating unit 604 configured to set the target label as the label of the adjusted image, obtaining a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
In some optional implementations of this embodiment, the second determining unit 602 is further configured to: and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
In some optional implementations of this embodiment, the apparatus 600 may further include: a first model training unit (not shown in the figure) configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, perform model training based on the model structure using the image sample set to be protected, to obtain the image recognition model.
In some optional implementations of this embodiment, the processing result is an output vector of an intermediate layer of the image recognition model, and the adjusting unit 603 is further configured to: determine the distance between a first output vector for the selected image and a second output vector for the target image; and adjust the pixel values of the selected image with the objective of minimizing the distance.
In some optional implementations of this embodiment, the distance includes one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
In some optional implementations of this embodiment, adjusting the pixel values of the selected image with the objective of minimizing the distance includes: determining the gradient of the distance with respect to the pixel values; and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
In some optional implementations of this embodiment, the apparatus 600 further includes a second model training unit (not shown in the figure) configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, perform model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
In some optional implementations of this embodiment, the processing result is an output vector of an intermediate layer of the plurality of image recognition models; the above-mentioned adjusting unit 603 is further configured to: determining a weighting result of distances between output vectors of the plurality of image recognition models for the selected image and the target image, respectively; the pixel values of the selected image are adjusted with the goal of minimizing the weighting result.
In some optional implementations of this embodiment, the apparatus 600 further includes: a third determining unit (not shown in the figure) configured to determine a plurality of selected images for the target image and generate a plurality of adjustment images; and a setting unit (not shown in the figure) configured to set the labels of the plurality of adjustment images as target labels, so as to obtain a plurality of protected image samples.
In some optional implementations of this embodiment, the image recognition model includes a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in fig. 2.
According to an embodiment of still another aspect, there is also provided a computing device including a memory and a processor, wherein the memory stores executable code, and the processor executes the executable code to implement the method described in fig. 2.
It will be further appreciated by those of ordinary skill in the art that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether these functions are performed in hardware or software depends on the particular application of the solution and its design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (22)

1. A method for protecting privacy information of a sample set of images, comprising:
determining an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label;
determining an image with a label different from the target label as a selected image;
adjusting the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected;
and setting the target label as the label of the adjusted image to obtain a protected image sample comprising the adjusted image and the target label, for forming a protected image sample set.
2. The method of claim 1, wherein the determining an image with a label different from the target label as the selected image comprises:
and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
3. The method of claim 1, wherein prior to adjusting pixel values of the selected image, the method further comprises:
and in response to determining that the model structure used by the target user corresponding to the protected image sample set is known, performing model training based on the model structure by using the to-be-protected image sample set to obtain the image recognition model.
4. The method of claim 3, wherein the processing results in an output vector of an intermediate layer of the image recognition model; the adjusting the pixel value of the selected image to obtain an adjusted image includes:
determining a distance between a first output vector for the selected image and a second output vector for the target image;
adjusting pixel values of the selected image with a goal of minimizing the distance.
5. The method of claim 4, wherein the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
6. The method of claim 4, wherein said adjusting pixel values of said selected image to minimize said distance comprises:
determining a gradient of the distance relative to the pixel value;
and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
7. The method of claim 1, wherein prior to adjusting pixel values of the selected image, the method further comprises:
in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, using the image sample set to be protected to perform model training based on each of a plurality of preset model structures, to obtain a plurality of image recognition models.
8. The method of claim 7, wherein the processing results in output vectors of intermediate layers of the plurality of image recognition models; the adjusting the pixel value of the selected image to obtain an adjusted image includes:
determining a weighted result of distances between output vectors of the plurality of image recognition models respectively for the selected image and the target image;
adjusting pixel values of the selected image with a goal of minimizing the weighting result.
9. The method of claim 1, wherein the method further comprises:
determining a plurality of selected images aiming at the target image and generating a plurality of adjusting images;
and setting the labels of the multiple adjusted images as target labels to obtain multiple protected image samples.
10. The method of claim 4 or 8, wherein the image recognition model comprises a softmax layer, and the intermediate layer is a previous layer to the softmax layer.
11. An apparatus for protecting privacy information of a sample set of images, comprising:
a first determining unit configured to determine an image sample to be protected in an image sample set to be protected as a target sample, wherein the target sample comprises a target image and a target label;
a second determination unit configured to determine an image having a label different from the target label as a selected image;
an adjusting unit configured to adjust the pixel values of the selected image with the objective that the processing result of a pre-trained image recognition model for the selected image approaches the processing result for the target image, to obtain an adjusted image, wherein the image recognition model is trained using the image sample set to be protected;
and the generating unit is configured to set the target label as a label of the adjustment image, obtain a protected image sample comprising the adjustment image and the target label, and be used for forming a protected image sample set.
12. The apparatus of claim 11, wherein the second determining unit is further configured to:
and selecting an image with a label different from the target label from the image sample set to be protected as a selected image.
13. The apparatus of claim 11, wherein the apparatus further comprises:
and the first model training unit is configured to use the image sample set to be protected to perform model training based on the model structure to obtain the image recognition model in response to determining that the model structure used by the target user corresponding to the protected image sample set is known.
14. The apparatus of claim 13, wherein the processing results in an output vector of an intermediate layer of the image recognition model; the adjustment unit is further configured to:
determining a distance between a first output vector for the selected image and a second output vector for the target image;
adjusting pixel values of the selected image with a goal of minimizing the distance.
15. The apparatus of claim 14, wherein the distance comprises one of: the Euclidean distance, the Manhattan distance, and the infinity norm of the difference vector.
16. The apparatus of claim 14, wherein said adjusting pixel values of said selected image to minimize said distance comprises:
determining a gradient of the distance relative to the pixel value;
and performing the pixel value adjustment for a predetermined number of steps with a predetermined step size in the direction of gradient descent.
17. The apparatus of claim 11, wherein the apparatus further comprises:
a second model training unit configured to, in response to determining that the model structure used by the target user corresponding to the protected image sample set is unknown, perform model training based on each of a plurality of preset model structures using the image sample set to be protected, to obtain a plurality of image recognition models.
18. The apparatus of claim 17, wherein the processing results in output vectors of intermediate layers of the plurality of image recognition models; the adjustment unit is further configured to:
determining a weighted result of distances between output vectors of the plurality of image recognition models respectively for the selected image and the target image;
adjusting pixel values of the selected image with a goal of minimizing the weighting result.
19. The apparatus of claim 11, wherein the apparatus further comprises:
a third determining unit configured to determine a plurality of selected images for the target image, and generate a plurality of adjustment images;
and the setting unit is configured to set the labels of the multiple adjustment images as target labels to obtain multiple protected image samples.
20. The apparatus of claim 14 or 18, wherein the image recognition model comprises a softmax layer, and the intermediate layer is the layer immediately preceding the softmax layer.
21. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-10.
22. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-10.
Priority Applications (1)

CN202111415199.0A, priority and filing date 2021-11-25: Method and apparatus for protecting privacy information of image sample set (status: Pending)

Publications (1)

CN114091104A, published 2022-02-25

Family

ID=80304614

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination