WO2023168903A1 - Model training and identity anonymization method, apparatus, device, storage medium and program product
Model training and identity anonymization method, apparatus, device, storage medium and program product
- Publication number: WO2023168903A1
- Application: PCT/CN2022/111704 (CN2022111704W)
- Authority: WIPO (PCT)
- Prior art keywords: identity, vectors, image, loss, network model
Classifications
- G06F21/6254 — Protecting personal data, e.g. for financial or medical purposes, by anonymising data, e.g. decorrelating personal data from the owner's identification
- G06N3/045 — Neural network architectures; combinations of networks
- G06N3/08 — Neural network learning methods
- G06T3/04 — Context-preserving geometric image transformations, e.g. by using an importance map
- G06T3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
- G06T2207/20221 — Image fusion; image merging
Definitions
- This application relates to the field of image processing technology, and in particular to a model training and identity anonymization method, apparatus, device, storage medium and program product.
- Identity anonymization, also known as de-identification, refers to removing identifiable identity features from images or videos while keeping other identity-unrelated attributes unchanged and ensuring that the anonymized pictures or videos remain visually authentic.
- In the related art, conditional generative adversarial networks are used to generate anonymized images: the pose key points of the original image are extracted and, together with the background image from which the facial area has been removed, fed into the model as conditional inputs to generate a new virtual identity that fills the vacant facial area.
- However, because this method uses the background image with the facial area removed as the model input, the images generated by the model are of poor quality.
- Embodiments of the present application provide a model training method, an identity anonymization method, a device, a computing device, a computer-readable storage medium, and a computer program product, which can improve the quality of generating identity anonymization images.
- An embodiment of this application provides a model training method, including:
- projecting a first training image to a target space through a projection module in a target network model to obtain N first virtual identity vectors, where N is a positive integer;
- extracting attribute vectors from a second training image through an attribute module in the target network model to obtain M attribute vectors, where M is a positive integer;
- image generation is performed based on the N first virtual identity vectors and the M attribute vectors to obtain an identity anonymized image of the second training image;
- a loss of the target network model is determined based on the identity anonymized image, and the target network model is trained based on the loss.
- An embodiment of this application also provides an identity anonymization method, including:
- sampling on the target space of the projection module in the target network model to obtain N virtual identity vectors, where N is a positive integer;
- attribute vectors are extracted from the image to be processed to obtain M attribute vectors, where M is a positive integer;
- image generation is performed based on the N virtual identity vectors and the M attribute vectors to obtain an identity anonymized image of the image to be processed.
- An embodiment of the present application also provides a model training device, including:
- a projection unit configured to project the first training image to the target space through the projection module in the target network model to obtain N first virtual identity vectors, where N is a positive integer;
- the attribute unit is configured to extract attribute vectors from the second training image through the attribute module in the target network model to obtain M attribute vectors, where M is a positive integer;
- a fusion unit configured to generate an image based on the N first virtual identity vectors and the M attribute vectors through the fusion module of the target network model to obtain an identity anonymized image of the second training image
- a training unit configured to determine a loss of the target network model based on the identity anonymized image, and train the target network model based on the loss.
- the embodiment of the present application also provides an identity anonymization device, including:
- a sampling unit configured to sample on the target space of the projection module in the target network model to obtain N virtual identity vectors, where N is a positive integer;
- the attribute unit is configured to extract attribute vectors of the image to be processed through the attribute module in the target network model, and obtain M attribute vectors, where M is a positive integer;
- the anonymization unit is configured to generate an image based on the N virtual identity vectors and the M attribute vectors through the fusion module of the target network model to obtain an identity anonymized image of the image to be processed.
- An embodiment of the present application also provides a computing device, including a processor and a memory.
- the memory is configured to store a computer program
- the processor is configured to call and run the computer program stored in the memory to execute the above model training method or identity anonymization method provided by embodiments of the present application.
- An embodiment of the present application also provides a chip configured to implement the above model training method or identity anonymization method provided by the embodiment of the present application.
- the chip includes: a processor configured to call and run a computer program from a memory, so that a device installed with the chip executes the above model training method or identity anonymization method provided by embodiments of the present application.
- Embodiments of the present application also provide a computer-readable storage medium configured to store a computer program.
- When the computer program is executed, the above model training method or identity anonymization method provided by the embodiments of the present application is implemented.
- An embodiment of the present application also provides a computer program product, which includes computer program instructions.
- When the computer program instructions are executed by a computer, the above model training method or identity anonymization method provided by the embodiments of the present application is implemented.
- An embodiment of the present application also provides a computer program that, when run on a computer, implements the above model training method or identity anonymization method provided by the embodiment of the present application.
- In the model training method, the first training image is projected to the target space to obtain N first virtual identity vectors, so that the target network model can fully learn the identity information in the image;
- attribute vector extraction is performed on the second training image to obtain M attribute vectors, so that the target network model can fully learn the attribute information in the image;
- image generation is performed based on the N first virtual identity vectors and the M attribute vectors to obtain the identity anonymized image of the second training image, so that the trained model can generate images carrying virtual identity information while ensuring that the attribute information of the original image remains unchanged.
- In the identity anonymization method, the target space of the projection module is sampled to obtain N virtual identity vectors, realizing the generation of virtual identity information;
- attribute vectors are extracted from the image to be processed to obtain M attribute vectors;
- image generation is performed based on the N virtual identity vectors and the M attribute vectors to obtain the identity anonymized image of the image to be processed, generating an image that carries virtual identity information (that is, hides the true identity) while ensuring that the attribute information of the image to be processed remains unchanged. In other words, in the embodiments of the present application, an independent virtual identity is generated by the target network model during identity anonymization, without the need to remove facial areas from images, thereby improving the fidelity and resolution of identity anonymization.
- Figure 1A is a schematic diagram of a real image provided by an embodiment of the present application.
- Figures 1B-1D are schematic diagrams of the identity anonymization images corresponding to Figure 1A provided by the embodiment of the present application;
- Figure 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
- Figure 3 is a schematic flow chart of the model training method provided by the embodiment of the present application.
- FIGS. 4 to 6 are schematic structural diagrams of the target network model provided by the embodiment of the present application.
- Figure 7 is a schematic structural diagram of the fusion module provided by the embodiment of the present application.
- Figure 8 is a schematic structural diagram of the target network model provided by the embodiment of the present application.
- Figures 9 and 10 are schematic diagrams of contrast loss determination provided by embodiments of the present application.
- Figure 11 is a schematic flow chart of the identity anonymization method provided by the embodiment of the present application.
- Figure 12 is a schematic diagram of the projection module provided by the embodiment of the present application.
- Figure 13 is a schematic diagram of identity anonymization image determination provided by the embodiment of this application.
- Figure 14 is a schematic block diagram of a model training device provided by an embodiment of the present application.
- Figure 15 is a schematic block diagram of an identity anonymization device provided by an embodiment of the present application.
- Figure 16 is a schematic block diagram of a computing device provided by an embodiment of the present application.
- B corresponding to A means that B is associated with A.
- B can be determined based on A.
- determining B based on A does not mean determining B only based on A.
- B can also be determined based on A and/or other information.
- In this application, words such as "first", "second" and "third" are used to distinguish similar items that are identical or have basically the same functions and effects. Those skilled in the art can understand that such words do not limit quantity or order of execution, and do not necessarily indicate that the items are different.
- Artificial intelligence (AI) is a theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
- In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a way similar to human intelligence.
- Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
- Artificial intelligence technology is a comprehensive subject that covers a wide range of fields, including both hardware-level technology and software-level technology.
- Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics.
- Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
- Machine Learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how computers can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their performance.
- Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent. Its applications cover all fields of artificial intelligence.
- Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
- Figure 1A is a real image, and FIGS. 1B to 1D are identity anonymized images of FIG. 1A. Comparing Figure 1A with Figures 1B to 1D, it can be seen that Figures 1B to 1D remove the identifiable identity features (Identity) in Figure 1A while leaving other identity-unrelated attributes unchanged and remaining visually authentic.
- Scenario 1: The embodiments of this application can be applied to privacy protection scenarios. For example, for pictures or videos involving human faces, the method of the embodiments of this application can be used to replace the real identity with a virtual identity, so that subsequent tasks such as detection can continue to be performed without compromising privacy. In addition, users can use the method of the embodiments of this application to hide their identity when posting pictures or videos, to avoid leaking real information.
- Scenario 2 The embodiments of the present application can be applied to the scene of generating a virtual image.
- For example, the technical solutions of the embodiments of the present application can be used to generate virtual identities, such as fixing the identity latent variables and replacing background pictures to generate pictures or videos of a specific virtual image in different scenes.
- Scenario 1 and Scenario 2 take a human face as the target as an example. The method in the embodiments of the present application can also be applied to anonymizing the identity of targets other than human faces, such as animals or vehicles in the image to be processed.
- the methods of the embodiments of the present application can be applied to intelligent transportation systems.
- An Intelligent Traffic System, also known as an Intelligent Transportation System, applies advanced science and technology (information technology, computer technology, data communication technology, sensor technology, electronic control technology, automatic control theory, operations research, artificial intelligence, etc.) to transportation, strengthening the connections among vehicles, roads and users, thereby forming a comprehensive transportation system that ensures safety, improves efficiency, improves the environment and saves energy.
- For example, when this application is combined with intelligent transportation, a vehicle-mounted device may collect the user's face image, anonymize the identity of the collected face image using the method of the embodiments of this application, and then send it to other devices for task analysis, such as illegal driving analysis or intelligent driving analysis.
- Figure 2 is a schematic diagram of a system architecture involved in an embodiment of the present application, including user equipment 101, data collection equipment 102, training equipment 103, execution equipment 104, database 105, content library 106, I/O interface 107 and target network model 108 .
- the data collection device 102 is configured to read training data from the content library 106 and store the read training data in the database 105 .
- the training data involved in the embodiment of the present application includes first training images, second training images, and third training images.
- the first training images, second training images, and third training images are all used to train the target network model.
- the user device 101 is configured to perform annotation operations on data in the database 105 .
- the training device 103 trains the target network model 108 based on the training data maintained in the database 105, so that the trained target network model 108 can generate an identity anonymized image of the image to be processed.
- the target network model 108 obtained through training by the training device 103 can be applied to different systems or devices.
- the execution device 104 is configured with an I/O interface 107 for data interaction with external devices.
- the image to be processed sent by the user device 101 is received through the I/O interface.
- the computing module 109 in the execution device 104 uses the trained target network model 108 to process the input image to be processed, outputs the identity anonymized image to the user device 101 for display, or inputs it into other task models for processing other tasks.
- the user device 101 may include a mobile phone, a tablet computer, a notebook computer, a handheld computer, a mobile internet device (MID), or other terminal devices on which a browser can be installed.
- the execution device 104 may be a server. There can be one or more servers. When there are multiple servers, at least one of the following situations may exist: at least two servers are configured to provide different services, and at least two servers are configured to provide the same service; for example, the same service is provided in a load balancing manner.
- the above-mentioned server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms. A server can also serve as a node of a blockchain.
- the execution device 104 is connected to the user device 101 through the network.
- the network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephony network.
- Figure 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
- the above-mentioned data collection device 102, the user device 101, the training device 103 and the execution device 104 may be the same device.
- the above-mentioned database 105 can be distributed on one server or multiple servers, and the above-mentioned content library 106 can be distributed on one server or multiple servers.
- This application provides a target network model, which is used to perform identity anonymization processing on targets (such as faces) in images to be processed, and to generate identity anonymized images of the images to be processed. Therefore, in some embodiments, the target network model may be referred to as an identity anonymization model, or an identity anonymizer.
- Figure 3 is a schematic flowchart of the model training method provided by the embodiment of the present application.
- the execution subject of the embodiment of the present application is a device with a model training function, such as a model training device.
- the model training device may be a computing device, or a part of the computing device.
- the following description takes the execution subject as a computing device as an example.
- the method in the embodiment of this application includes:
- the computing device projects the first training image to the target space through the projection module in the target network model, and obtains N first virtual identity vectors, where N is a positive integer.
- The first training image in this embodiment of the present application is a training image in the training data. It should be noted that if the first training image is a face image, it is obtained with the user's permission and consent, and the collection, use and processing of the relevant image data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
- the process of training the model using each first training image is basically similar.
- a first training image is used as an example for explanation.
- the embodiment of the present application projects the first training image into the target space through the target network model to obtain one or more virtual identity vectors of the first training image, so that the target network model learns the identity information of the first training image. After the target network model has fully learned the identity information, when actually performing identity anonymization processing, the target space of the target network model can be directly sampled to generate a virtual identity vector.
- the embodiments of this application mainly involve the concepts of attribute vectors and virtual identity vectors.
- the virtual identity vector is a vector corresponding to the virtual identity information
- the virtual identity information is the identity information after hiding the identifiable identity features, such as facial information after hiding the identifiable identity features of the face.
- the attribute vector is the vector corresponding to the attribute information.
- Other feature information in the image other than identifiable identity features is called attribute information, such as background information.
- the target network model of the embodiment of the present application can generate an independent virtual identity vector.
- Figure 4 is a schematic structural diagram of the target network model provided by the embodiment of the present application.
- the target network model of the embodiment of the present application includes a projection module, an attribute module and a fusion module.
- the projection module is configured to project the first training image into the target space to obtain N first virtual identity vectors of the first training image.
- N is a positive integer.
- the embodiment of the present application does not limit the value of N and can be set according to actual needs.
- the attribute module is configured to perform attribute vector extraction on the second training image to extract M attribute vectors of the second training image.
- M is a positive integer.
- the embodiment of the present application does not limit the value of M and can be set according to actual needs. In some embodiments, M equals N.
- the fusion module is configured to perform image generation based on the above-mentioned N first virtual identity vectors and M attribute vectors to obtain an identity anonymized image of the second training image.
- When N is a positive integer greater than 1, the N first virtual identity vectors respectively correspond to different resolutions.
- the projection module is configured to generate a virtual identity vector of the target in the second training image.
- the virtual identity vector hides the true identity characteristics of the target in the second training image.
- The attribute module is configured to extract the attribute vectors of the second training image; the attribute vectors retain features other than the true identity features of the target in the second training image. In this way, after the fusion module performs image generation based on the above virtual identity vectors and attribute vectors, an anonymized image that hides the target identity in the second training image, that is, an identity anonymized image, can be obtained.
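To make the three-module layout concrete, the following PyTorch sketch shows how a projection module, attribute module and fusion module could be composed. The class and argument names are illustrative assumptions; the patent fixes only the roles of the three modules, not their implementations.

```python
import torch
import torch.nn as nn

class TargetNetworkModel(nn.Module):
    """Minimal sketch of the three-module target network model."""
    def __init__(self, projection: nn.Module, attribute: nn.Module, fusion: nn.Module):
        super().__init__()
        self.projection = projection  # image -> N virtual identity vectors
        self.attribute = attribute    # image -> M attribute vectors
        self.fusion = fusion          # (identity vectors, attribute vectors) -> image

    def forward(self, first_image: torch.Tensor, second_image: torch.Tensor) -> torch.Tensor:
        identity_vectors = self.projection(first_image)   # N first virtual identity vectors
        attribute_vectors = self.attribute(second_image)  # M attribute vectors
        return self.fusion(identity_vectors, attribute_vectors)  # identity anonymized image
```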
- In some embodiments, the projection module includes a first projection unit and a second projection unit, and the target space includes a first space Z and a second space W.
- The computing device can project the first training image to the target space through the projection module in the following way to obtain the N first virtual identity vectors:
- extract the prior identity information of the first training image; project the prior identity information into the first space Z through the first projection unit to obtain N identity latent vectors; and project the N identity latent vectors into the second space W through the second projection unit to obtain the N first virtual identity vectors.
- For example, the prior identity information of the first training image is extracted through a pre-trained recognition model. Then, the prior identity information is projected into the first space Z through the first projection unit to obtain N identity latent vectors, and the N identity latent vectors are projected into the second space W through the second projection unit to obtain the N first virtual identity vectors.
- the first space Z and the second space W may be different hidden spaces.
- the embodiment of the present application places no restrictions on the first space Z and the second space W.
- the first space is a latent space Z
- the latent space Z conforms to a standard Gaussian distribution.
- the above-mentioned first projection unit can project the prior identity information into the first space Z in the following manner to obtain N identity latent vectors:
- the prior identity information is projected into the mean and variance of the first space through the first projection unit; sampling is performed based on the mean and variance of the first space to obtain N identity latent vectors.
- the first projection unit is a variational autoencoder (VAE), such as a conditional variational autoencoder (CVAE).
- A conditional variational autoencoder is a generative network that learns the distribution of the data through an encoder to obtain latent variables, and then restores the latent variables to the original form of the data through a decoder.
- Conditional variational autoencoders can learn the distribution of data and then sample to generate new data, often used for image generation.
- the VAE projects the prior identity information into the mean and variance of the first space. Then, sampling is performed based on the mean and variance of the first space to obtain N identity latent vectors of the first training image.
- The above-mentioned first space is a latent space Z that conforms to the standard Gaussian distribution. Therefore, in order to enhance the expressive ability of the latent space, the embodiment of the present application generates different latent vectors at different resolution levels, for example generating N identity latent vectors, which is equivalent to constructing a Z+ space containing multiple identity latent vectors.
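The following is a minimal sketch of the first projection unit as a VAE-style head with reparameterized sampling, assuming the prior identity information is a flat feature vector (e.g. 512-d from a pre-trained face recognition model) and that each of the N identity latent vectors has its own mean/variance head; the layer sizes and the use of log-variances are implementation assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class FirstProjectionUnit(nn.Module):
    """VAE-style first projection unit: prior identity info -> N latents in Z+."""
    def __init__(self, id_dim: int = 512, latent_dim: int = 512, n_latents: int = 3):
        super().__init__()
        self.n_latents = n_latents
        self.to_mu = nn.ModuleList(nn.Linear(id_dim, latent_dim) for _ in range(n_latents))
        self.to_logvar = nn.ModuleList(nn.Linear(id_dim, latent_dim) for _ in range(n_latents))

    def forward(self, prior_id: torch.Tensor):
        # Reparameterization: z_i = mu_i + sigma_i * eps, with eps ~ N(0, I)
        latents, mus, logvars = [], [], []
        for i in range(self.n_latents):
            mu, logvar = self.to_mu[i](prior_id), self.to_logvar[i](prior_id)
            eps = torch.randn_like(mu)
            latents.append(mu + torch.exp(0.5 * logvar) * eps)
            mus.append(mu)
            logvars.append(logvar)
        # N identity latent vectors in Z+, plus the statistics needed for the KL term
        return latents, mus, logvars
```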
- the second space W is obtained from the latent space Z, for example, obtained by linear or nonlinear mapping from the latent space Z.
- the embodiment of the present application does not limit the network structure of the second projection unit, for example, it is a mapping network.
- the mapping network is composed of multiple fully connected layers.
- In this way, by projecting the prior identity information of the first training image into the latent space (i.e., the target space) of the projection module, the projection module can fully learn the identity information of the first training image, so that realistic virtual identity vectors can subsequently be generated.
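A sketch of the second projection unit as a mapping network follows; the patent says only that it is composed of multiple fully connected layers, so the depth and activation below are assumptions.

```python
import torch.nn as nn

def make_mapping_network(latent_dim: int = 512, n_layers: int = 8) -> nn.Sequential:
    """Mapping network: fully connected layers projecting Z-space latents into W space."""
    layers = []
    for _ in range(n_layers):
        layers += [nn.Linear(latent_dim, latent_dim), nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)

# Usage: each of the N identity latent vectors is mapped independently.
# w_vectors = [mapping(z) for z in z_vectors]
```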
- the second training image is any image in the training data set.
- the second training image and the first training image may be the same image or different images.
- the attribute module in the embodiment of the present application is configured to learn attribute information of the second training image to generate M attribute vectors.
- the embodiments of this application do not limit the network model of the attribute module.
- In some embodiments, the attribute module includes an encoding unit and a decoding unit.
- In this case, the attribute vectors of the second training image can be extracted in the following way to obtain the M attribute vectors:
- the second training image is input into the encoding unit to obtain feature information of the second training image; the feature information is input into the decoding unit to obtain the M attribute vectors.
- In some embodiments, the encoding unit includes multiple feature extraction layers, the decoding unit also includes multiple feature extraction layers, and there is a skip connection between at least one feature extraction layer in the encoding unit and at least one feature extraction layer in the decoding unit.
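The sketch below illustrates such an encoder-decoder attribute module with skip connections. The channel widths, number of scales, and the choice to read one attribute feature map per decoder scale are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class AttributeModule(nn.Module):
    """Encoder-decoder with skip connections; emits one attribute map per scale."""
    def __init__(self, chs=(3, 64, 128, 256)):
        super().__init__()
        self.enc = nn.ModuleList(
            nn.Conv2d(chs[i], chs[i + 1], 3, stride=2, padding=1) for i in range(len(chs) - 1))
        self.dec = nn.ModuleList(
            nn.ConvTranspose2d(chs[i + 1] * (1 if i == len(chs) - 2 else 2), chs[i], 4, stride=2, padding=1)
            for i in reversed(range(len(chs) - 1)))

    def forward(self, x: torch.Tensor):
        skips = []
        for enc in self.enc:            # encoding unit: multiple feature extraction layers
            x = torch.relu(enc(x))
            skips.append(x)
        attributes, h = [], None
        for i, dec in enumerate(self.dec):  # decoding unit with skip connections
            inp = skips[-1] if i == 0 else torch.cat([h, skips[-1 - i]], dim=1)
            h = torch.relu(dec(inp))
            attributes.append(h)        # one spatial attribute map per resolution
        return attributes               # M attribute vectors (feature maps), coarse to fine
```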
- Example 1: splice the N first virtual identity vectors to obtain a spliced first virtual identity vector, and splice the M attribute vectors to obtain a spliced attribute vector; the spliced first virtual identity vector and the spliced attribute vector are then combined and input into the fusion module to generate the identity anonymized image.
- For example, the spliced first virtual identity vector and the spliced attribute vector are concatenated and then input into the fusion module to generate the identity anonymized image.
- Alternatively, the spliced first virtual identity vector and the spliced attribute vector are added together and then input into the fusion module to generate the identity anonymized image.
- Example 2: the fusion module includes multiple layers of different resolutions. In this case, the fusion module can perform image generation based on the N first virtual identity vectors and the M attribute vectors in the following way to obtain the identity anonymized image of the second training image:
- the N first virtual identity vectors are used as styles and the M attribute vectors are used as noise, and they are input into the corresponding resolution layers to obtain the identity anonymized image of the second training image.
- For example, N is 3, M is 4, and the fusion module includes 4 layers of different resolutions. The 3 first virtual identity vectors are denoted as first virtual identity vector 1, first virtual identity vector 2 and first virtual identity vector 3, and the 4 attribute vectors are denoted as attribute vector 1, attribute vector 2, attribute vector 3 and attribute vector 4. The four resolution layers are denoted, in order of resolution, as resolution layer 1, resolution layer 2, resolution layer 3 and resolution layer 4.
- Suppose first virtual identity vector 1 corresponds to the lower-resolution resolution layers 1 and 2, first virtual identity vector 2 corresponds to the medium-resolution resolution layer 3, and first virtual identity vector 3 corresponds to the highest-resolution resolution layer 4, while the four attribute vectors correspond to the four resolution layers in order of resolution.
- First virtual identity vector 1 is input into resolution layer 1 to obtain feature information 1. Attribute vector 1 is merged with feature information 1 and input into resolution layer 2 together with first virtual identity vector 1 to obtain feature information 2. Attribute vector 2 is merged with feature information 2 and input into resolution layer 3 together with first virtual identity vector 2 to obtain feature information 3. Attribute vector 3 is merged with feature information 3 and input into resolution layer 4 together with first virtual identity vector 3, and after attribute vector 4 is merged with the output, the identity anonymized image of the second training image is generated.
- the fusion module is a style-based generator (StyleGAN2).
- In some embodiments, an AdaIN layer is included between two adjacent resolution layers of the fusion module. For example, an affine transform (AT) is performed on first virtual identity vector i+1; after the output feature information i of the i-th resolution layer is merged with attribute vector i, it is input into the AdaIN layer together with the affine-transformed first virtual identity vector i+1, the AdaIN operation is performed, and the AdaIN result is input into the (i+1)-th resolution layer.
- The fusion module in the embodiment of the present application can also be another adversarial model such as StyleGAN3 or ProGAN.
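The following is a minimal sketch of one such resolution layer, with the identity vector acting as the style (via an affine transform feeding AdaIN) and the attribute map playing the role StyleGAN2 gives to noise. Upsampling, channel sizes, and the assumption that the attribute map is shape-matched to the feature map are all illustrative.

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: the affine-transformed style modulates the features."""
    def __init__(self, w_dim: int, channels: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)
        self.affine = nn.Linear(w_dim, channels * 2)  # affine transform (AT) of the style

    def forward(self, feat: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        scale, bias = self.affine(w).chunk(2, dim=1)
        return self.norm(feat) * (1 + scale[..., None, None]) + bias[..., None, None]

class FusionBlock(nn.Module):
    """One resolution layer of the fusion module."""
    def __init__(self, w_dim: int, in_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2)
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.adain = AdaIN(w_dim, out_ch)

    def forward(self, feat: torch.Tensor, attribute: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        feat = self.conv(self.up(feat))
        feat = feat + attribute      # merge the attribute vector as "noise"
        return self.adain(feat, w)   # modulate with the identity style
```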
- When the structure of the fusion module is different, the method for determining the identity anonymized image of the second training image may also be different, which is not limited by the embodiments of the present application.
- the model training process of the embodiment of the present application is introduced.
- For example, the first training image Xs is passed through a pre-trained face recognition model to generate prior identity information. Then, the prior identity information is input into the VAE, which projects it into the first space Z to obtain N identity latent vectors; for example, N=3 identity latent vectors are obtained, corresponding to three different resolutions: low, medium and high. Next, the N identity latent vectors are input into the mapping network, which projects them from the first space Z into the second space W to obtain N first virtual identity vectors.
- the second training image Xt is input into the autoencoder, and after the second training image Xt is processed by the autoencoder, M attribute vectors are generated. Finally, the M attribute vectors are used as noise and the N first virtual identity vectors are used as styles and input into each layer of StyleGAN2 to obtain the identity anonymized image Ys,t of the second training image output by StyleGAN2.
- the first training image and the second training image are input into the target network model to obtain the identity anonymized image of the second training image output by the target network model. Then, the following S304 is performed to train the target network model.
- the target network model outputs the identity anonymized image of the second training image, and the loss of the target network model is determined based on the identity anonymized image.
- the identity anonymized image is input into a judgment model, which is a pre-trained model that can predict the degree of anonymization of the identity anonymized image.
- the identity anonymized image is input into the judgment model, the judgment model identifies the identity anonymized image, and determines the recognition result as a loss of the target network model. If the recognition accuracy is high, it means that the anonymization effect of the current target network model is not ideal. At this time, the parameters in the target network model are adjusted according to the loss of the target network model.
- Then, a new first training image and second training image are selected to perform the above steps S301 to S304, and training of the target network model continues until the target network model reaches the training end condition.
- the training end conditions at least include that the number of training times reaches the preset number, or the degree of anonymization of the model reaches the expected effect.
- In some embodiments, the embodiment of the present application imposes a KL divergence constraint L_kl on the N identity latent vectors in the first space Z, to ensure that the identity information is projected to the standard Gaussian distribution.
- In this case, determining the loss of the target network model based on the identity anonymized image may include: determining the loss of the target network model based on the identity anonymized image and the divergence constraint.
- For example, the divergence constraint L_kl of the N identity latent vectors can be determined through the following formula (1), the standard closed form of the KL divergence between N(μ_i, σ_i²) and the standard Gaussian distribution:
- L_kl = (1/2) Σ_{i=1}^{N} (μ_i² + σ_i² − log σ_i² − 1)    (1)
- where μ_i and σ_i² are the mean and variance corresponding to the i-th identity latent vector among the N identity latent vectors.
- It should be noted that the above formula (1) is just an example; the way of determining the divergence constraint of the N identity latent vectors in the embodiment of the present application includes but is not limited to the above formula (1), and may, for example, be a variant of formula (1) or another way of calculating a divergence constraint.
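Formula (1) can be written directly as code. The sketch below follows the VAE head sketched earlier; working with log-variances is an implementation assumption made for numerical stability.

```python
import torch

def kl_divergence_constraint(mus: list, logvars: list) -> torch.Tensor:
    """KL divergence between each latent's N(mu, sigma^2) and the standard
    Gaussian, summed over the N identity latent vectors (formula (1))."""
    loss = 0.0
    for mu, logvar in zip(mus, logvars):
        loss = loss + 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1)
    return loss
```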
- In some embodiments, the above-mentioned second space is obtained from the first space through nonlinear mapping and follows a complex non-Gaussian distribution.
- If unconstrained, the distribution of the intermediate latent space W is not uniform: the real identity vectors gather into multiple different centers and do not overlap with the generated virtual identity vectors, so a virtual identity vector cannot produce a reasonable face identity.
- Therefore, the embodiment of this application proposes to use a contrast loss to constrain the latent vectors of the second space W (i.e., the first virtual identity vectors), so that latent vectors from the same identity are pulled together, latent vectors from different identities repel each other, and all latent vectors are evenly distributed throughout the space.
- the embodiment of this application can also determine identity loss in the following ways:
- Step 1 obtain the third training image
- Step 2 Process the third training image through the projection reference module to obtain N second virtual identity vectors
- Step 3 Determine the identity loss based on the N first virtual identity vectors and N second virtual identity vectors.
- the above-mentioned third training image and the first training image are two different images of the first target.
- the third training image and the first training image are two different face images of the same user.
- the above-mentioned projection reference module has the same network structure as the projection module and is updated according to the projection module.
- For example, the projection reference module is momentum-updated according to the projection module, that is, the projection reference module is slowly updated as the projection module is updated.
- For example, the projection reference module can be updated according to the following formula (2):
- P_θ'(t) = (1 − β) · P_θ'(t−1) + β · P_θ(t)    (2)
- where P_θ'(t) is the projection reference module parameter after the t-th update, P_θ'(t−1) is the projection reference module parameter after the (t−1)-th update, P_θ(t) is the projection module parameter after the t-th update, and β is a small value, such as 0.01.
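Formula (2) is an exponential moving average, as used for MoCo-style momentum encoders; a minimal sketch:

```python
import torch

@torch.no_grad()
def momentum_update(projection_ref: torch.nn.Module, projection: torch.nn.Module, beta: float = 0.01):
    """Formula (2): the projection reference module slowly tracks the
    projection module with a small update rate beta."""
    for p_ref, p in zip(projection_ref.parameters(), projection.parameters()):
        p_ref.mul_(1.0 - beta).add_(p, alpha=beta)
```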
- That is, the embodiment of the present application sets up a projection reference module whose network structure is completely consistent with that of the projection module, to constrain the first virtual identity vectors output by the projection module.
- the first training image is input into the projection module to obtain N first virtual identity vectors of the first training image
- the third training image is input into the projection reference module to obtain N second virtual identity vectors of the third training image.
- Since the first training image and the third training image are images of the same target, and the network structures of the projection module and the projection reference module are consistent, the difference between the N first virtual identity vectors and the N second virtual identity vectors should be small once model training is completed.
- Therefore, the projection module in the target network model can be trained based on the N first virtual identity vectors and the N second virtual identity vectors corresponding to the first training image, so that the projection module can generate virtual identity vectors that meet the requirements.
- the methods for determining identity loss based on N first virtual identity vectors and N second virtual identity vectors include but are not limited to the following:
- Method 1 Determine the difference between N first virtual identity vectors and N second virtual identity vectors at different resolutions, and determine the sum of the differences, or the average of the differences, as the identity loss.
- For example, if N is 3, determine difference 1 between first virtual identity vector 1 and second virtual identity vector 1, difference 2 between first virtual identity vector 2 and second virtual identity vector 2, and difference 3 between first virtual identity vector 3 and second virtual identity vector 3.
- The sum of differences 1, 2 and 3, or their average, is determined as the identity loss.
- Method 2: Maintain N dynamic lists K, which store the representations in the second space W+ of all the different target identities (such as face identities) in the entire training set.
- the identity loss can be determined in the following way:
- Step 31 For the i-th first virtual identity vector among the N first virtual identity vectors, use the i-th second virtual identity vector to update the virtual identity vector corresponding to the first target in the i-th dynamic list.
- the i-th dynamic list includes virtual identity vectors of different targets at the i-th resolution, and i is a positive integer from 1 to N.
- That is, each of the N second virtual identity vectors corresponds to a dynamic list. For example, when N is 3, corresponding to low, medium and high resolution respectively, there are also three dynamic lists: a first dynamic list corresponding to low resolution, a second dynamic list corresponding to medium resolution, and a third dynamic list corresponding to high resolution.
- Step 32 Determine the identity sub-loss corresponding to the i-th first virtual identity vector according to the i-th first virtual identity vector and the updated i-th dynamic list.
- the first training image and the third training image are two different images of the first target j.
- The first training image Xj is input into the projection module to obtain N first virtual identity vectors Wj, and the third training image is input into the projection reference module to obtain N second virtual identity vectors Wj'.
- the i-th dynamic list Ki includes second virtual identity vectors of different targets at the i-th resolution, and the i-th dynamic list Ki is updated in real time.
- the i-th second virtual identity vector is used to update the virtual identity vector kj corresponding to the first target j in the i-th dynamic list Ki, that is, kj is updated to Wj'.
- the identity sub-loss i corresponding to the i-th first virtual identity vector is determined.
- The embodiment of the present application does not limit the method of determining the identity sub-loss corresponding to the i-th first virtual identity vector.
- For example, loss functions such as center loss or triplet loss can be used to determine the identity sub-loss corresponding to the i-th first virtual identity vector based on the i-th first virtual identity vector and the updated i-th dynamic list.
- In some embodiments, determining the identity sub-loss corresponding to the i-th first virtual identity vector among the N first virtual identity vectors may include the following steps:
- Step 321: Obtain the first ratio between the i-th second virtual identity vector and the first preset value, multiply the first ratio by the i-th first virtual identity vector to obtain a first result, and perform an exponential operation on the first result to obtain the first operation value;
- Step 322: Obtain the second ratio between each second virtual identity vector in the updated i-th dynamic list and the first preset value, multiply each second ratio by the i-th first virtual identity vector to obtain a second result, and perform an exponential operation on each second result to obtain the second operation value corresponding to each second virtual identity vector;
- Step 323: Determine the sum of the second operation values corresponding to the second virtual identity vectors, obtain the third ratio of the first operation value to that sum, and perform a logarithmic operation on the third ratio to obtain the third operation value;
- Step 324: Determine the negative of the third operation value as the identity sub-loss corresponding to the i-th first virtual identity vector.
- In some embodiments, the identity sub-loss L_c is determined using a contrast loss in the form of InfoNCE (Information Noise Contrastive Estimation), a contrastive loss function based on mutual information.
- For example, the identity sub-loss L_c(i) corresponding to the i-th first virtual identity vector is determined according to the following formula (3):
- L_c(i) = −log [ exp(w_j · K[j] / τ) / Σ_{k=1}^{K} exp(w_j · K[k] / τ) ]    (3)
- where w_j is the i-th first virtual identity vector of the first target j, K[j] is the i-th second virtual identity vector of the first target j, τ is the first preset value (a temperature), K[k] is the i-th second virtual identity vector corresponding to the k-th target in the i-th dynamic list, and K is the total number of targets included in the i-th dynamic list.
- Step 33 Determine the sum of the identity sub-losses corresponding to the N first virtual identity vectors as the identity loss of the target network model.
- After the identity sub-losses corresponding to the N first virtual identity vectors are determined, their sum is determined as the identity loss.
- For example, when N is 3, the identity sub-loss corresponding to each of the three first virtual identity vectors is determined, and the sum of the three identity sub-losses is determined as the identity loss of the model.
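A minimal sketch of formula (3) and step 33 follows, assuming each dynamic list is kept as a (K, D) tensor of second virtual identity vectors produced by the momentum-updated projection reference module, and that the temperature τ = 0.07 (an assumed value; the patent leaves the first preset value unspecified).

```python
import torch
import torch.nn.functional as F

def identity_sub_loss(w_j: torch.Tensor, dynamic_list: torch.Tensor, j: int, tau: float = 0.07) -> torch.Tensor:
    """InfoNCE over one dynamic list (formula (3)): row j of `dynamic_list`
    is the positive K[j] belonging to the same target as w_j."""
    logits = dynamic_list @ w_j / tau  # (K,) similarities w_j . K[k] / tau
    # cross_entropy computes -log softmax(logits)[j], i.e. formula (3)
    return F.cross_entropy(logits[None, :], torch.tensor([j]))

def identity_loss(w_list, dynamic_lists, j: int, tau: float = 0.07) -> torch.Tensor:
    """Step 33: sum the identity sub-losses over the N resolutions."""
    return sum(identity_sub_loss(w, K, j, tau) for w, K in zip(w_list, dynamic_lists))
```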
- the loss of the target network model is determined based on the identity anonymization image and divergence constraints, including the following steps:
- the reconstruction loss between the identity anonymized image and the second training image is determined, and the loss of the target network model is determined based on the reconstruction loss, divergence constraints and identity loss.
- the difference between the identity anonymized image and the second training image is determined as the reconstruction loss.
- the sum of the differences between each pixel of the identity anonymized image and the corresponding pixel of the second training image is determined as the reconstruction loss.
- In some embodiments, the reconstruction loss L_rec is determined according to the following formula (4):
- L_rec = ‖Y_{s,t} − X_t‖_1    (4)
- where Y_{s,t} is the identity anonymized image, X_t is the second training image, and ‖·‖_1 is the 1-norm operation.
- the loss of the target network model is determined based on the reconstruction loss, divergence constraint and identity loss. For example, the weighted sum of reconstruction loss, divergence constraints, and identity loss is determined as the final loss of the target network model.
- the embodiments of the present application also include determining the identity contrast loss of the identity anonymized image. For example, the following steps are included:
- Step A Determine a first distance between the identity anonymized image and the first training image, a second distance between the identity anonymized image and the second training image, and a third distance between the first training image and the second training image;
- Step B Determine the contrast loss based on the first distance, the second distance and the third distance
- first distance, second distance and third distance can be determined by any distance method such as cosine distance.
- Example 1 After the first distance, the second distance and the third distance are determined according to step A, the sum of the first distance, the second distance and the third distance is determined as the contrast loss.
- Example 2 Determine the sum of the square of the difference between the second distance and the third distance and the first distance; determine the difference between the preset value and the sum as the contrast loss.
- In some embodiments, the contrast loss L_ICL is determined according to the following formula (5):
- L_ICL = C − [ ( cos(z_id(Y_{s,t}), z_id(X_t)) − cos(z_id(X_s), z_id(X_t)) )² + cos(z_id(Y_{s,t}), z_id(X_s)) ]    (5)
- where z_id(·) denotes the 512-dimensional identity vector representation of an image, C is the preset value, cos(z_id(Y_{s,t}), z_id(X_s)) is the first distance between the identity anonymized image and the first training image, cos(z_id(Y_{s,t}), z_id(X_t)) is the second distance between the identity anonymized image and the second training image, and cos(z_id(X_s), z_id(X_t)) is the third distance between the first training image and the second training image.
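A sketch of formula (5), following Example 2 above; z_anon, z_src and z_tgt are the 512-d identity vectors of Y_{s,t}, X_s and X_t extracted by a pre-trained recognition model, and preset = 1.0 is an assumed value of C, not one given in the patent.

```python
import torch
import torch.nn.functional as F

def identity_contrast_loss(z_anon: torch.Tensor, z_src: torch.Tensor,
                           z_tgt: torch.Tensor, preset: float = 1.0) -> torch.Tensor:
    """Contrast loss of formula (5), using cosine similarity as the distance."""
    d1 = F.cosine_similarity(z_anon, z_src, dim=-1)  # anonymized image vs. first training image
    d2 = F.cosine_similarity(z_anon, z_tgt, dim=-1)  # anonymized image vs. second training image
    d3 = F.cosine_similarity(z_src, z_tgt, dim=-1)   # first vs. second training image
    return (preset - ((d2 - d3).pow(2) + d1)).mean()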
- In this case, the loss of the target network model is determined based on the reconstruction loss, divergence constraint, identity loss and contrast loss. For example, the weighted sum of the reconstruction loss, divergence constraint, identity loss and contrast loss is determined as the loss of the target network model.
- the adversarial loss of the model is also determined. For example, the adversarial loss is determined based on the identity anonymized image and the first training image.
- For example, the adversarial loss L_GAN is determined according to the following formula (6), the standard adversarial objective:
- L_GAN = min_G max_D E[log D(X_s)] + E[log(1 − D(Y_{s,t}))]    (6)
- where D is the discriminator, G is the generator, E(·) represents the expected value over the distribution, D(X_s) is the discrimination result of the discriminator for the first training image X_s, and D(Y_{s,t}) is the discrimination result for the identity anonymized image Y_{s,t}.
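Formula (6) can be sketched as below, using the non-saturating form commonly used in practice (an implementation assumption); d_real = D(X_s) and d_fake = D(Y_{s,t}) are raw discriminator logits.

```python
import torch
import torch.nn.functional as F

def adversarial_losses(d_real: torch.Tensor, d_fake: torch.Tensor):
    """Discriminator and generator terms of the adversarial objective (formula (6)).
    softplus(-x) equals -log sigmoid(x)."""
    loss_d = F.softplus(-d_real).mean() + F.softplus(d_fake).mean()  # max log D(X_s) + log(1 - D(Y_{s,t}))
    loss_g = F.softplus(-d_fake).mean()                              # generator tries to fool D
    return loss_d, loss_g
```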
- the loss of the target network model can be determined based on the reconstruction loss, divergence constraint, identity loss, contrast loss and adversarial loss.
- For example, the weighted sum of the reconstruction loss, divergence constraint, identity loss, contrast loss and adversarial loss is determined as the loss of the target network model.
- the embodiments of this application do not limit the size of the weight values corresponding to the reconstruction loss, divergence constraint, identity loss, contrast loss and adversarial loss, and can be determined according to actual needs.
- For example, the reconstruction loss, divergence constraint, identity loss, contrast loss and adversarial loss are weighted to obtain the loss L_total of the target network model according to the following formula (7):
- L_total = λ_rec · L_rec + λ_kl · L_kl + λ_c · L_c + λ_ICL · L_ICL + λ_GAN · L_GAN    (7)
- where λ_rec, λ_kl, λ_c, λ_ICL and λ_GAN are the weights corresponding to the respective losses.
- It should be noted that the weights corresponding to each loss in the above formula (7) are only examples; the weights in the embodiments of this application include but are not limited to these, and can be determined as needed.
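Formula (7) as code; the equal default weights below are placeholders, since the patent leaves the weights to be chosen according to actual needs.

```python
def total_loss(l_rec, l_kl, l_c, l_icl, l_gan, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five losses (formula (7))."""
    lam = weights
    return lam[0] * l_rec + lam[1] * l_kl + lam[2] * l_c + lam[3] * l_icl + lam[4] * l_gan
```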
- The embodiment of the present application achieves identity anonymization by generating first virtual identity vectors corresponding to different resolutions, which can improve the resolution of anonymization. For example, an anonymization result with a resolution of 1024² can be generated with fewer picture artifacts and higher fidelity.
- In addition, the embodiment of the present application does not rely on key point regression models or segmentation models during model training; that is, the face area in the image is not removed, and the posture, details and occlusions in the original image are retained.
- the first training image is projected to the target space through the projection module, and N first virtual identity vectors are obtained, so that the target network model can fully analyze the identity information in the image.
- the identity anonymized image of the second training image is obtained, so that the trained model can generate an image carrying virtual identity information while ensuring that the attribute information of the original image remains unchanged. That is, this application provides a new target network model.
- In this way, the target network model can: 1. independently generate a virtual identity while learning the identity information in the first training image; 2. fully learn the attribute information in the training image. The facial area in the image does not need to be removed during the entire learning process, and no real identity information is needed for guidance.
- the target network model is trained using the clear supervision targets of the face-swapping task, which improves the fidelity and resolution of the identity anonymization generated by the target network model, so that the trained model can produce high-quality identity anonymized images.
- FIG 11 is a schematic flowchart of an identity anonymization method provided by an embodiment of the present application.
- the identity anonymization method shown in Figure 11 uses the above-trained target network model for identity anonymization processing. As shown in Figure 11, the method includes:
- the embodiment of the present application uses the first training image to train the projection module, so that the projection module can fully learn the identity information in the first training image.
- N virtual identity vectors can be obtained by sampling the target space of the projection module.
- the implementation methods of the above S401 include but are not limited to the following:
- Method 1: Sampling is performed based on the mean and variance of the target space of the trained projection module to obtain N virtual identity vectors. For example, randomly sample from the variance of the target space and add the sample to the mean of the target space to obtain one virtual identity vector; repeat this step to obtain N virtual identity vectors.
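- For illustration, a minimal sketch of Method 1 in PyTorch; the tensor shapes are assumptions:

```python
import torch

def sample_virtual_ids(mu, sigma, n):
    """Method 1: draw N virtual identity vectors from the target space,
    given its learned mean `mu` and standard deviation `sigma` (e.g. shape [512])."""
    noise = torch.randn(n, mu.shape[-1])   # random sample of the variance term
    return mu + sigma * noise              # add to the mean of the target space
```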
- Method 2: The target space includes the first space and the second space, and the target network model includes the second projection unit. In this case, the target space of the projection module in the target network model can be sampled as follows to obtain the N virtual identity vectors:
- the first projection unit in the projection module is no longer used; only the second projection unit in the projection module is used for projection.
- sampling is performed in the first space Z that conforms to the standard Gaussian distribution to obtain N identity latent vectors, and then the N identity latent vectors are input into the second projection unit.
- the second projection unit projects N identity latent vectors into W space to obtain N virtual identity vectors.
- Figure 12 takes N = 3, with a mapping network as the second projection unit, as an example; the projection module in the embodiment of the present application is not limited to what is shown in Figure 12.
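- For illustration, a minimal sketch of Method 2 in PyTorch; the stand-in mapping network is an assumption, not the disclosed architecture:

```python
import torch
import torch.nn as nn

# Stand-in mapping network (the second projection unit); the real
# architecture is not specified at this level of detail.
mapping = nn.Sequential(
    nn.Linear(512, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 512),
)

def sample_virtual_ids_z_to_w(n=3):
    """Method 2: sample N identity latent vectors in the standard-Gaussian
    first space Z, then project them to the second space W."""
    z = torch.randn(n, 512)       # N identity latent vectors in Z
    with torch.no_grad():
        w = mapping(z)            # N virtual identity vectors in W
    return w
```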
- the first training image is used to train the first space so that the variance and mean of the first space conform to the standard Gaussian distribution.
- sampling is performed on the first space to generate N identity latent vectors.
- sampling is performed based on the mean and variance of the first space to obtain N identity latent vectors.
- Random sampling is performed based on the variance of the first space, and the sample is added to the mean of the first space to obtain one identity latent vector.
- the N identity latent vectors are projected to the second space through the second projection unit to obtain N virtual identity vectors.
- the attribute module in the embodiment of the present application is configured to extract attribute information in the image to be processed.
- the attribute module includes a coding unit and a decoding unit. At this time, the following method can be used to extract the attribute vector of the image to be processed to obtain M attribute vectors:
- the above-mentioned encoding unit may include multiple feature extraction layers.
- the above-mentioned decoding unit may also include multiple feature extraction layers; wherein the feature extraction layer may include a convolutional layer, etc.
- the M attribute vectors generated above can correspond to different resolutions.
- For example, the attribute module is implemented as an autoencoder.
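- For illustration, a toy encoder-decoder of this kind in PyTorch; the layer and channel counts are assumptions, with M = 3 attribute vectors returned:

```python
import torch
import torch.nn as nn

class AttributeModule(nn.Module):
    """Toy encoder-decoder that emits M attribute vectors at different
    resolutions; the architecture here is illustrative only."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec1 = nn.Sequential(nn.Upsample(scale_factor=2),
                                  nn.Conv2d(128, 64, 3, padding=1))
        self.dec2 = nn.Sequential(nn.Upsample(scale_factor=2),
                                  nn.Conv2d(64, 32, 3, padding=1))

    def forward(self, x):
        feat = self.enc(x)        # feature information of the input image
        a1 = self.dec1(feat)      # attribute map at a lower resolution
        a2 = self.dec2(a1)        # attribute map at a higher resolution
        return [feat, a1, a2]     # M = 3 attribute vectors
```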
- image generation is performed based on N virtual identity vectors and M attribute vectors to obtain an identity anonymized image of the image to be processed.
- For example, the N virtual identity vectors and the M attribute vectors are input into the fusion module to obtain the identity anonymized image of the image to be processed.
- Example 1: Concatenate the N virtual identity vectors, and likewise concatenate the M attribute vectors; after fusing the concatenated virtual identity vector and the concatenated attribute vector, input the result into the fusion module.
- For example, the concatenated virtual identity vector and attribute vector are further concatenated and then input into the fusion module; alternatively, they are added element-wise and then input into the fusion module.
- The fusion module includes multiple different resolution layers. According to the resolutions they correspond to, the N virtual identity vectors can be used as styles and the M attribute vectors as noise, input into the corresponding resolution layers to obtain the identity anonymized image of the image to be processed.
- the fusion module is StyleGAN2.
- An AdaIN layer is included between two adjacent resolution layers of the fusion module. For the i-th resolution layer, the (i+1)-th virtual identity vector is subjected to an affine transformation; the output feature of the i-th resolution layer is merged with the i-th attribute vector and input, together with the affine-transformed (i+1)-th virtual identity vector, into the AdaIN layer; the AdaIN operation is performed, and its result is input into the (i+1)-th resolution layer.
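- For illustration, a minimal sketch of such an AdaIN step in PyTorch; the channel sizes and the single affine layer are assumptions rather than the disclosed architecture:

```python
import torch
import torch.nn as nn

class AdaINLayer(nn.Module):
    """AdaIN between two resolution layers: the (i+1)-th virtual identity
    vector is affine-transformed into per-channel scale and bias, which
    re-normalize the feature map fused from the i-th layer output and the
    i-th attribute vector."""
    def __init__(self, id_dim=512, channels=256):
        super().__init__()
        self.affine = nn.Linear(id_dim, 2 * channels)  # affine transformation

    def forward(self, feat, id_vec, eps=1e-5):
        scale, bias = self.affine(id_vec).chunk(2, dim=-1)  # [B, C] each
        scale = scale[:, :, None, None]
        bias = bias[:, :, None, None]
        mu = feat.mean(dim=(2, 3), keepdim=True)
        sigma = feat.std(dim=(2, 3), keepdim=True)
        return scale * (feat - mu) / (sigma + eps) + bias
```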
- The fusion module in the embodiment of this application can also be another adversarial model, such as StyleGAN3 or ProGAN.
- the identity anonymization process in the embodiment of the present application is introduced.
- sampling is performed in the first space Z of the projection module, and N identity latent vectors are obtained.
- For example, N = 3 identity latent vectors are obtained, and these three identity latent vectors respectively correspond to three different resolutions: low, medium and high.
- N identity latent vectors are input into the mapping network, and the N identity latent vectors are projected from the first space Z to the second space W through the mapping network to obtain N virtual identity vectors.
- The image X_t to be processed is input into the autoencoder; after the autoencoder processes it, M attribute vectors are generated.
- the M attribute vectors are used as noise and the N virtual identity vectors are used as styles, which are input into each layer of StyleGAN2 to obtain the identity anonymized image Ys,t of the image to be processed output by StyleGAN2.
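- For illustration, a compact end-to-end inference sketch of this process in PyTorch; the module interfaces and the 512-dimensional latent size are assumptions:

```python
import torch

def anonymize(x_t, mapping, attribute_module, stylegan2, n=3):
    """End-to-end inference sketch: sample identities in Z, map to W,
    extract attributes from the image to be processed X_t, and fuse.
    `mapping`, `attribute_module` and `stylegan2` are the trained modules;
    their exact interfaces here are assumptions."""
    z = torch.randn(n, 512)            # N identity latent vectors from Z
    styles = mapping(z)                # N virtual identity vectors in W (styles)
    attrs = attribute_module(x_t)      # M attribute vectors (noise)
    y_st = stylegan2(styles, attrs)    # fused per resolution layer -> Y_s,t
    return y_st
```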
- the identity anonymization method provided by the embodiment of the present application samples on the target space of the projection module in the target network model to obtain N virtual identity vectors.
- the attribute vector of the image to be processed is extracted to obtain M attribute vectors
- image generation is performed through the fusion module of the target network model based on the N virtual identity vectors and the M attribute vectors, to obtain the identity anonymized image of the image to be processed. That is to say, the target network model of the embodiment of the present application can independently generate a virtual identity.
- Figure 14 is a schematic block diagram of a model training device provided by an embodiment of the present application.
- the training device 10 may be a computing device or a part of a computing device.
- the model training device 10 includes:
- the projection unit 11 is configured to project the first training image to the target space through the projection module in the target network model to obtain N first virtual identity vectors, where N is a positive integer;
- the attribute unit 12 is configured to extract attribute vectors from the second training image through the attribute module in the target network model to obtain M attribute vectors, where M is a positive integer;
- the fusion unit 13 is configured to perform image generation based on the N first virtual identity vectors and the M attribute vectors through the fusion module of the target network model to obtain an identity anonymized image of the second training image;
- the training unit 14 is configured to determine the loss of the target network model based on the identity anonymized image, and train the target network model based on the loss.
- the projection module includes a first projection unit and a second projection unit
- the target space includes a first space and a second space
- the projection unit 11 is further configured to extract the a priori identity information of the first training image; project the a priori identity information to the first space through the first projection unit to obtain N identity latent vectors; and project the N identity latent vectors to the second space through the second projection unit to obtain the N first virtual identity vectors.
- the projection unit 11 is further configured to project the a priori identity information into the mean and variance of the first space through the first projection unit, and perform sampling based on the mean and variance of the first space to obtain the N identity latent vectors.
- the training unit 14 is further configured to determine the divergence constraints of the N identity latent vectors; and determine the loss of the target network model according to the identity anonymized image and the divergence constraints.
- the N first virtual identity vectors respectively correspond to different resolutions.
- the first projection unit is a variational autoencoder.
- the training unit 14 is also configured to obtain a third training image, where the third training image and the first training image are two different images of a first target; project the third training image to the target space through the projection reference module in the target network model to obtain N second virtual identity vectors, where the projection reference module has the same network structure as the projection module and is updated according to the projection module; determine the identity loss according to the N first virtual identity vectors and the N second virtual identity vectors; and determine the loss of the target network model according to the identity anonymized image, the divergence constraint and the identity loss.
- the training unit 14 is further configured to, for the i-th second virtual identity vector among the N second virtual identity vectors, use the i-th second virtual identity vector to update the virtual identity vector corresponding to the first target in the i-th dynamic list, where the i-th dynamic list includes virtual identity vectors of different targets at the i-th resolution and i is a positive integer from 1 to N; determine the identity sub-loss corresponding to the i-th first virtual identity vector according to the i-th first virtual identity vector and the updated i-th dynamic list; and determine the sum of the identity sub-losses respectively corresponding to the N first virtual identity vectors as the identity loss.
- the training unit 14 is further configured to obtain a first ratio of the i-th second virtual identity vector to a first preset value, multiply the first ratio by the i-th first virtual identity vector to obtain a first result, and perform an exponential operation on the first result to obtain a first operation value; obtain, in the updated i-th dynamic list, the second ratio of each second virtual identity vector to the first preset value, and for each second ratio, multiply it by the corresponding i-th first virtual identity vector to obtain a second result and perform an exponential operation on the second result to obtain the second operation value corresponding to each second virtual identity vector; determine the sum of the second operation values corresponding to the second virtual identity vectors, obtain the third ratio of the first operation value to that sum, and perform a logarithmic operation on the third ratio to obtain a third operation value; and determine the negative of the third operation value as the identity sub-loss corresponding to the i-th first virtual identity vector.
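- For illustration, a minimal PyTorch sketch of this identity sub-loss, which has the form of an InfoNCE-style contrastive loss; the temperature value 0.07 and the tensor shapes are assumptions:

```python
import torch

def identity_sub_loss(v_i, w_pos, dyn_list, tau=0.07):
    """InfoNCE-style identity sub-loss at the i-th resolution.

    v_i:      i-th first virtual identity vector, shape [512]
    w_pos:    i-th second virtual identity vector of the same (first)
              target, already written into the dynamic list, shape [512]
    dyn_list: updated i-th dynamic list, one second virtual identity
              vector per target, shape [K, 512]
    tau:      first preset value (temperature); 0.07 is an assumption
    """
    num = torch.exp((w_pos / tau) @ v_i)           # first operation value
    den = torch.exp((dyn_list / tau) @ v_i).sum()  # sum of second operation values
    return -torch.log(num / den)                   # negative of third operation value
```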
- the attribute module includes an encoding unit and a decoding unit.
- the attribute unit 12 is also configured to perform feature extraction on the second training image through the encoding unit to obtain feature information of the second training image, and decode the feature information through the decoding unit to obtain the M attribute vectors.
- the fusion module includes a plurality of different resolution layers, and the fusion unit 13 is further configured to, according to the resolutions corresponding to the N first virtual identity vectors, use the N first virtual identity vectors as styles and the M attribute vectors as noise, and input them into the corresponding resolution layers to obtain the identity anonymized image of the second training image.
- the training unit 14 is further configured to determine a reconstruction loss between the identity anonymized image and the second training image; according to the reconstruction loss, the divergence constraint and the identity loss, Determine the loss of the target network model.
- the training unit 14 is further configured to determine a first distance between the identity anonymized image and the first training image, a second distance between the identity anonymized image and the second training image, and a third distance between the first training image and the second training image; determine a contrast loss according to the first distance, the second distance and the third distance; and determine the loss of the target network model according to the reconstruction loss, the divergence constraint, the identity loss and the contrast loss.
- the training unit 14 is further configured to determine the sum of the square of the difference between the second distance and the third distance, and the first distance, and determine the difference between the preset value and that sum as the contrast loss.
- the training unit 14 is further configured to determine an adversarial loss based on the identity anonymized image and the first training image, and determine the weighted sum of the reconstruction loss, the divergence constraint, the identity loss, the contrast loss and the adversarial loss as the loss of the target network model.
- the device embodiments and method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, they will not be repeated here.
- the device shown in Figure 14 can execute the model training method embodiment shown in Figure 3, and the foregoing and other operations and/or functions of each module in the device respectively implement the corresponding method embodiments of the computing device; for the sake of brevity, they will not be repeated here.
- FIG 15 is a schematic block diagram of an identity anonymization device provided by an embodiment of the present application.
- the identity anonymization device 20 may be a computing device or a part of the computing device. As shown in Figure 15, the identity anonymization device 20 includes:
- the sampling unit 21 is configured to sample on the target space of the projection module in the target network model to obtain N virtual identity vectors, where N is a positive integer;
- the attribute unit 22 is configured to extract attribute vectors of the image to be processed through the attribute module in the target network model, and obtain M attribute vectors, where M is a positive integer;
- the anonymization unit 23 is configured to generate an image based on the N virtual identity vectors and the M attribute vectors through the fusion module of the target network model to obtain an identity anonymized image of the image to be processed.
- the target space includes a first space and a second space
- the target network model includes a second projection unit
- the sampling unit 21 is also configured to sample on the first space to obtain N identity latent vectors, and project the N identity latent vectors to the second space through the second projection unit to obtain the N virtual identity vectors.
- the mean and variance of the first space satisfy the standard Gaussian distribution
- the sampling unit 21 is also configured to perform sampling based on the mean and variance of the first space to obtain the N identity latent vectors.
- the N virtual identity vectors respectively correspond to different resolutions.
- the attribute module includes an encoding unit and a decoding unit.
- the attribute unit 22 is also configured to perform feature extraction on the image to be processed through the encoding unit to obtain the feature information of the image to be processed;
- the feature information is decoded by the decoding unit to obtain M attribute vectors.
- the fusion module includes multiple different resolution layers
- the anonymization unit 23 is also configured to use the N virtual identity vectors as styles and the M attribute vectors as noise, input them into the corresponding resolution layers, and obtain the identity anonymized image of the image to be processed.
- the device embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, they will not be repeated here.
- the device shown in Figure 15 can execute the identity anonymization method embodiment shown in Figure 11, and the foregoing and other operations and/or functions of each module in the device respectively implement the corresponding method embodiments of the computing device; for the sake of brevity, they will not be repeated here.
- this functional module can be implemented in the form of hardware, can also be implemented through instructions in the form of software, or can also be implemented through a combination of hardware and software modules.
- each step of the method embodiment in the embodiment of the present application can be completed through the integrated logic circuit of the hardware in the processor and/or instructions in the form of software.
- the steps of the method disclosed in the embodiments of the present application can be directly embodied as being completed by a hardware decoding processor, or completed by a combination of hardware and software modules in the decoding processor.
- the software module may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, register, etc.
- the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps in the above method embodiment in combination with its hardware.
- Figure 16 is a schematic block diagram of a computing device provided by an embodiment of the present application.
- the computing device is configured to execute the above method embodiment.
- the computing device 30 may include:
- a memory 31 and a processor 32, where the memory 31 is configured to store a computer program 33 and transmit the program code of the computer program 33 to the processor 32.
- the processor 32 can call and run the computer program 33 from the memory 31 to implement the method in the embodiment of the present application.
- the processor 32 may be configured to perform the above method steps according to instructions in the computer program 33 .
- the processor 32 may include but is not limited to:
- a Digital Signal Processor (DSP)
- an Application-Specific Integrated Circuit (ASIC)
- a Field-Programmable Gate Array (FPGA)
- the memory 31 includes but is not limited to:
- Non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory. Volatile memory may be Random Access Memory (RAM), which is used as an external cache.
- By way of example and not limitation, many forms of RAM are available, such as:
- static random access memory (Static RAM, SRAM)
- dynamic random access memory (Dynamic RAM, DRAM)
- synchronous dynamic random access memory (Synchronous DRAM, SDRAM)
- double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM)
- enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM)
- synchlink dynamic random access memory (Synchlink DRAM, SLDRAM)
- direct rambus random access memory (Direct Rambus RAM, DR RAM)
- the computer program 33 can be divided into one or more modules, which are stored in the memory 31 and executed by the processor 32 to complete the method described in this application.
- the one or more modules may be a series of computer program instruction segments capable of completing specific functions. The instruction segments are used to describe the execution process of the computer program 33 in the computing device.
- the computing device 30 may also include:
- a transceiver 34, where the transceiver 34 can be connected to the processor 32 or the memory 31.
- the processor 32 can control the transceiver 34 to communicate with other devices, for example, it can send information or data to other devices, or receive information or data sent by other devices.
- Transceiver 34 may include a transmitter and a receiver.
- the transceiver 34 may also include an antenna, and the number of antennas may be one or more.
- the components of the computing device 30 are connected through a bus system, where in addition to the data bus, the bus system also includes a power bus, a control bus and a status signal bus.
- Embodiments of the present application provide a computer storage medium on which a computer program is stored. When the computer program is executed by a computer, the computer can perform the method of the above method embodiment. In other words, embodiments of the present application also provide a computer program product containing instructions, which when executed by a computer causes the computer to perform the method of the above method embodiments.
- Embodiments of the present application provide a computer program product or computer program.
- the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
- the processor of the computing device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computing device performs the method of the above method embodiment.
Claims (21)
- A model training method, performed by a computing device, comprising: projecting a first training image to a target space through a projection module in a target network model to obtain N first virtual identity vectors, N being a positive integer; extracting attribute vectors from a second training image through an attribute module in the target network model to obtain M attribute vectors, M being a positive integer; performing image generation based on the N first virtual identity vectors and the M attribute vectors through a fusion module of the target network model to obtain an identity anonymized image of the second training image; and determining a loss of the target network model according to the identity anonymized image, and training the target network model according to the loss.
- The method according to claim 1, wherein the projection module comprises a first projection unit and a second projection unit, the target space comprises a first space and a second space, and the projecting a first training image to a target space through a projection module in a target network model to obtain N first virtual identity vectors comprises: extracting a priori identity information of the first training image; projecting the a priori identity information to the first space through the first projection unit to obtain N identity latent vectors; and projecting the N identity latent vectors to the second space through the second projection unit to obtain the N first virtual identity vectors.
- The method according to claim 2, wherein the projecting the a priori identity information to the first space through the first projection unit to obtain N identity latent vectors comprises: projecting the a priori identity information into a mean and a variance of the first space through the first projection unit; and sampling based on the mean and the variance of the first space to obtain the N identity latent vectors.
- The method according to claim 2 or 3, further comprising: determining a divergence constraint of the N identity latent vectors; wherein the determining a loss of the target network model according to the identity anonymized image comprises: determining the loss of the target network model according to the identity anonymized image and the divergence constraint.
- The method according to claim 4, further comprising: obtaining a third training image, the third training image and the first training image being two different images of a first target; projecting the third training image to the target space through a projection reference module in the target network model to obtain N second virtual identity vectors, the projection reference module having the same network structure as the projection module and being updated according to the projection module; and determining an identity loss according to the N first virtual identity vectors and the N second virtual identity vectors; wherein the determining the loss of the target network model according to the identity anonymized image and the divergence constraint comprises: determining the loss of the target network model according to the identity anonymized image, the divergence constraint and the identity loss.
- The method according to claim 5, wherein the determining an identity loss according to the N first virtual identity vectors and the N second virtual identity vectors comprises: for an i-th second virtual identity vector among the N second virtual identity vectors, updating, with the i-th second virtual identity vector, the virtual identity vector corresponding to the first target in an i-th dynamic list, wherein the i-th dynamic list includes virtual identity vectors of different targets at an i-th resolution, and i is a positive integer from 1 to N; determining, according to the i-th first virtual identity vector and the updated i-th dynamic list, an identity sub-loss corresponding to the i-th first virtual identity vector; and determining a sum of the identity sub-losses respectively corresponding to the N first virtual identity vectors as the identity loss.
- The method according to claim 6, wherein the determining the identity sub-loss corresponding to the i-th first virtual identity vector comprises: obtaining a first ratio of the i-th second virtual identity vector to a first preset value, multiplying the first ratio by the i-th first virtual identity vector to obtain a first result, and performing an exponential operation on the first result to obtain a first operation value; obtaining a second ratio of each second virtual identity vector in the updated i-th dynamic list to the first preset value, and for each second ratio, multiplying the second ratio by the corresponding i-th first virtual identity vector to obtain a second result and performing an exponential operation on the second result to obtain a second operation value corresponding to each second virtual identity vector; determining a sum of the second operation values corresponding to the second virtual identity vectors, obtaining a third ratio of the first operation value to the sum, and performing a logarithmic operation on the third ratio to obtain a third operation value; and determining the negative of the third operation value as the identity sub-loss corresponding to the i-th first virtual identity vector.
- The method according to any one of claims 1 to 7, wherein the attribute module comprises an encoding unit and a decoding unit, and the extracting attribute vectors from the second training image through the attribute module in the target network model to obtain M attribute vectors comprises: performing feature extraction on the second training image through the encoding unit to obtain feature information of the second training image; and decoding the feature information through the decoding unit to obtain the M attribute vectors.
- The method according to any one of claims 1 to 7, wherein the fusion module comprises a plurality of different resolution layers, and the performing image generation based on the N first virtual identity vectors and the M attribute vectors through the fusion module of the target network model to obtain the identity anonymized image of the second training image comprises: according to the resolutions corresponding to the N first virtual identity vectors, inputting the N first virtual identity vectors as styles and the M attribute vectors as noise into the corresponding resolution layers to obtain the identity anonymized image of the second training image.
- The method according to claim 5, wherein the determining the loss of the target network model according to the identity anonymized image, the divergence constraint and the identity loss comprises: determining a reconstruction loss between the identity anonymized image and the second training image; and determining the loss of the target network model according to the reconstruction loss, the divergence constraint and the identity loss.
- The method according to claim 10, further comprising: determining a first distance between the identity anonymized image and the first training image, a second distance between the identity anonymized image and the second training image, and a third distance between the first training image and the second training image; and determining a contrast loss according to the first distance, the second distance and the third distance; wherein the determining the loss of the target network model according to the reconstruction loss, the divergence constraint and the identity loss comprises: determining the loss of the target network model according to the reconstruction loss, the divergence constraint, the identity loss and the contrast loss.
- The method according to claim 11, wherein the determining a contrast loss according to the first distance, the second distance and the third distance comprises: determining a sum of the square of the difference between the second distance and the third distance, and the first distance; and determining a difference between a preset value and the sum as the contrast loss.
- The method according to claim 11, wherein if the fusion module is an adversarial network, the determining the loss of the target network model according to the reconstruction loss, the divergence constraint, the identity loss and the contrast loss comprises: determining an adversarial loss according to the identity anonymized image and the first training image; and determining a weighted sum of the reconstruction loss, the divergence constraint, the identity loss, the contrast loss and the adversarial loss as the loss of the target network model.
- An identity anonymization method, performed by a computing device, comprising: sampling on a target space of a projection module in a target network model to obtain N virtual identity vectors, N being a positive integer; extracting attribute vectors from an image to be processed through an attribute module in the target network model to obtain M attribute vectors, M being a positive integer; and performing image generation based on the N virtual identity vectors and the M attribute vectors through a fusion module of the target network model to obtain an identity anonymized image of the image to be processed.
- The method according to claim 14, wherein the target space comprises a first space and a second space, the target network model comprises a second projection unit, and the sampling on the target space of the projection module in the target network model to obtain N virtual identity vectors comprises: sampling on the first space to obtain N identity latent vectors; and projecting the N identity latent vectors to the second space through the second projection unit to obtain the N virtual identity vectors.
- The method according to claim 15, wherein the mean and variance of the first space satisfy a standard Gaussian distribution, and the sampling on the first space to obtain N identity latent vectors comprises: performing sampling based on the mean and variance of the first space to obtain the N identity latent vectors.
- A model training apparatus, comprising: a projection unit configured to project a first training image to a target space through a projection module in a target network model to obtain N first virtual identity vectors, N being a positive integer; an attribute unit configured to extract attribute vectors from a second training image through an attribute module in the target network model to obtain M attribute vectors, M being a positive integer; a fusion unit configured to perform image generation based on the N first virtual identity vectors and the M attribute vectors through a fusion module of the target network model to obtain an identity anonymized image of the second training image; and a training unit configured to determine a loss of the target network model according to the identity anonymized image and train the target network model according to the loss.
- An identity anonymization apparatus, comprising: a sampling unit configured to sample on a target space of a projection module in a target network model to obtain N virtual identity vectors, N being a positive integer; an attribute unit configured to extract attribute vectors from an image to be processed through an attribute module in the target network model to obtain M attribute vectors, M being a positive integer; and an anonymization unit configured to perform image generation based on the N virtual identity vectors and the M attribute vectors through a fusion module of the target network model to obtain an identity anonymized image of the image to be processed.
- A computing device, comprising a processor and a memory, the memory being configured to store a computer program, and the processor being configured to execute the computer program to implement the method according to any one of claims 1 to 13, or to implement the method according to any one of claims 14 to 16.
- A computer-readable storage medium configured to store a computer program, the computer program, when executed by a computer, implementing the method according to any one of claims 1 to 13, or implementing the method according to any one of claims 14 to 16.
- A computer program product, comprising a computer program or instructions, the computer program or instructions, when executed by a processor, implementing the method according to any one of claims 14 to 16, or implementing the method according to any one of claims 1 to 13.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022566254A JP2024513274A (ja) | 2022-03-10 | 2022-08-11 | モデル訓練方法及び装置、アイデンティティ匿名化方法及び装置、機器、記憶媒体並びにコンピュータプログラム |
EP22773378.9A EP4270232A4 (en) | 2022-03-10 | 2022-08-11 | MODEL TRAINING METHOD AND APPARATUS, IDENTITY ANONYMIZATION METHOD AND APPARATUS, APPARATUS, STORAGE MEDIUM AND PROGRAM PRODUCT |
KR1020227038590A KR20230133755A (ko) | 2022-03-10 | 2022-08-11 | 모델 트레이닝 방법 및 장치, 아이덴티티 익명화 방법 및 장치, 디바이스, 저장 매체, 그리고 프로그램 제품 |
US18/076,073 US20230290128A1 (en) | 2022-03-10 | 2022-12-06 | Model training method and apparatus, deidentification method and apparatus, device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210234385.2 | 2022-03-10 | ||
CN202210234385.2A CN114936377A (zh) | 2022-03-10 | Model training and identity anonymization method, apparatus, device and storage medium |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/076,073 Continuation US20230290128A1 (en) | 2022-03-10 | 2022-12-06 | Model training method and apparatus, deidentification method and apparatus, device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023168903A1 true WO2023168903A1 (zh) | 2023-09-14 |
Family
ID=82862564
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/111704 WO2023168903A1 (zh) | 2022-03-10 | 2022-08-11 | Model training and identity anonymization method and apparatus, device, storage medium and program product |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114936377A (zh) |
WO (1) | WO2023168903A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118213048A (zh) * | 2023-11-20 | 2024-06-18 | 清华大学 | Image processing method, model training method, device, medium and product |
- 2022-03-10: CN202210234385.2A filed in CN, published as CN114936377A (status: pending)
- 2022-08-11: PCT/CN2022/111704 filed, published as WO2023168903A1 (application filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021258920A1 (zh) * | 2020-06-24 | 2021-12-30 | 百果园技术(新加坡)有限公司 | Generative adversarial network training method, and image face-swapping and video face-swapping methods and apparatuses |
CN113033511A (zh) * | 2021-05-21 | 2021-06-25 | 中国科学院自动化研究所 | Face anonymization method based on manipulating decoupled identity representations |
CN113642409A (zh) * | 2021-07-15 | 2021-11-12 | 上海交通大学 | Face anonymization system and method, and terminal |
CN114120041A (zh) * | 2021-11-29 | 2022-03-01 | 暨南大学 | Few-shot classification method based on dual adversarial variational autoencoders |
CN114139198A (zh) * | 2021-11-29 | 2022-03-04 | 杭州电子科技大学 | Face generation privacy protection method based on hierarchical k-anonymity identity replacement |
Non-Patent Citations (1)
Title |
---|
See also references of EP4270232A4 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274316A (zh) * | 2023-10-31 | 2023-12-22 | 广东省水利水电科学研究院 | River surface flow velocity estimation method, apparatus, device and storage medium |
CN117274316B (zh) * | 2023-10-31 | 2024-05-03 | 广东省水利水电科学研究院 | River surface flow velocity estimation method, apparatus, device and storage medium |
CN117688538A (zh) * | 2023-12-13 | 2024-03-12 | 上海深感数字科技有限公司 | Interactive education management method and system based on digital identity security |
CN117688538B (zh) * | 2023-12-13 | 2024-06-07 | 上海深感数字科技有限公司 | Interactive education management method and system based on digital identity security |
CN118536163A (zh) * | 2024-06-04 | 2024-08-23 | 北京高科数聚技术有限公司 | Automobile enterprise operation data management method and system |
Also Published As
Publication number | Publication date |
---|---|
CN114936377A (zh) | 2022-08-23 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| ENP | Entry into the national phase | Ref document number: 2022773378; Country of ref document: EP; Effective date: 20220929
| WWE | WIPO information: entry into national phase | Ref document number: 2022566254; Country of ref document: JP
| WWE | WIPO information: entry into national phase | Ref document number: 202237065423; Country of ref document: IN
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22773378; Country of ref document: EP; Kind code of ref document: A1
| WWE | WIPO information: entry into national phase | Ref document number: 11202254237E; Country of ref document: SG