WO2023284416A1 - Data processing method and device - Google Patents
Data processing method and device
- Publication number
- WO2023284416A1 (PCT/CN2022/094556)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- generator
- loss
- output
- teacher
- image
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks
- G06N3/045—Combinations of networks
- G06N3/047—Probabilistic or stochastic networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/088—Non-supervised learning, e.g. competitive learning
Description
- Embodiments of the present disclosure relate to the technical field of computer and network communication, and in particular, to a data processing method and device.
- in the related art, model compression of a deep learning network is formulated as a multi-stage task, which includes multiple operations such as network structure search, distillation, pruning, and quantization.
- GAN: Generative Adversarial Network
- Embodiments of the present disclosure provide a data processing method and device to improve the model compression efficiency of a generative adversarial network, and realize image processing through a generative adversarial network on a lightweight device.
- an embodiment of the present disclosure provides a data processing method, applicable to a generative adversarial network obtained through model distillation, and the data processing method includes:
- the generative adversarial network includes the first generator, the second generator, and a discriminator
- the model distillation is a process of alternately training the first generator and the second generator
- the model size of the first generator is smaller than the model size of the second generator.
- an embodiment of the present disclosure provides a data processing device, suitable for a generative adversarial network obtained through model distillation, and the data processing device includes:
- an obtaining module, configured to obtain the image to be processed
- a processing module, configured to process the image through the first generator to obtain a processed image
- the generative adversarial network includes the first generator, the second generator, and a discriminator
- the model distillation is a process of alternately training the first generator and the second generator, and the model size of the first generator is smaller than the model size of the second generator.
- an embodiment of the present disclosure provides an electronic device, including: at least one processor and a memory;
- the memory stores computer-executable instructions
- the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor executes the data processing method described in the above first aspect and various possible designs of the first aspect.
- an embodiment of the present disclosure provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the data processing method described in the above first aspect and various possible designs of the first aspect is implemented.
- the embodiments of the present disclosure provide a computer program product, the computer program product includes computer-executable instructions, and when a processor executes the computer-executable instructions, the data processing method described in the above first aspect and various possible designs of the first aspect can be realized.
- the embodiments of the present disclosure provide a computer program, the computer program includes computer-executable instructions, and when a processor executes the computer-executable instructions, the data processing method described in the above first aspect and various possible designs of the first aspect can be realized.
- the generative adversarial network includes a first generator, a second generator, and a discriminator.
- the model size of the first generator is smaller than the model size of the second generator.
- the first generator and the second generator in the generative adversarial network are trained alternately, and in each round of training, the training of the first generator is guided by the optimized second generator.
- the first generator obtained through model distillation processes the image to be processed.
- on the one hand, the multi-stage model compression process is abandoned in favor of model compression in a single model distillation stage, which reduces the complexity of model compression and improves its efficiency; on the other hand, the online distillation method, in which the first generator and the second generator are trained alternately during model distillation, improves the model training effect of the first generator and the quality of the images processed by the first generator.
- the finally obtained first generator is small enough in model scale to adapt to lightweight devices with weak computing power, while the quality of images processed by the first generator remains high.
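The alternating ("online") distillation described above can be sketched with a toy numeric example. This is an illustrative assumption, not the patented implementation: the two generators are reduced to scalar parameters and all losses to squared errors, whereas the actual method trains neural networks with adversarial and distillation losses.

```python
# Toy sketch of alternating online distillation. Assumption: each "generator"
# is a single scalar parameter; losses are squared errors.

def alternating_distillation(target, steps=200, lr=0.1):
    teacher = 0.0  # second (larger) generator
    student = 0.0  # first (smaller) generator
    for _ in range(steps):
        # Step 1: adjust the second generator toward the training target
        # (stands in for its adversarial + reconstruction optimization).
        teacher -= lr * 2.0 * (teacher - target)
        # Step 2: guide the first generator with the *adjusted* second
        # generator via a distillation loss, within the same round.
        student -= lr * 2.0 * (student - teacher)
    return teacher, student

teacher, student = alternating_distillation(3.0)
```

Because the student is always pulled toward the teacher's freshly optimized output rather than a frozen snapshot, it tracks the teacher throughout training, which is the point of the online scheme.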
- FIG. 1 is an example diagram of an application scenario provided by an embodiment of the present disclosure
- FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present disclosure
- FIG. 3 is a first schematic flow diagram of a training process of the generative adversarial network in the data processing method provided by an embodiment of the present disclosure
- FIG. 4 is a second schematic flow diagram of the training process of the generative adversarial network in the data processing method provided by an embodiment of the present disclosure
- FIG. 5 is an example diagram of a model structure of a generative adversarial network provided by an embodiment of the present disclosure
- FIG. 6 is a structural block diagram of a data processing device provided by an embodiment of the present disclosure.
- FIG. 7 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present disclosure.
- FIG. 1 is an example diagram of an application scenario provided by an embodiment of the present disclosure.
- the application scenario shown in Figure 1 is an image processing scenario.
- the devices involved include a terminal 101 and a server 102, and the terminal 101 communicates with the server 102 through a network, for example.
- the server 102 is used to train the deep learning model, and deploy the trained deep learning model to the terminal 101 .
- the terminal 101 performs image processing through a deep learning model.
- the deep learning model is Generative Adversarial Networks (GAN).
- the server 102 deploys the generator in the trained generative adversarial network to the terminal.
- the terminal 101 is a lightweight device with relatively weak computing power (such as a camera, a mobile phone, or a smart home appliance), suitable for deploying a small-scale deep learning model. Therefore, how to obtain a smaller generator suited to deployment on a lightweight device while improving its image processing effect is an urgent problem to be solved.
- Model compression is one of the ways to train deep learning models with small model sizes.
- the current model compression methods for generative adversarial networks still have the following shortcomings:
- 1) the mature model compression technology in the field of deep learning is not customized for generative adversarial networks, and lacks exploration of the complex characteristics and structure of generative adversarial networks;
- 2) the model compression process includes multiple stages such as network structure search, distillation, pruning, and quantization, which require much time and many computing resources;
- 3) the compressed generative adversarial network still consumes substantial computing resources and is difficult to apply to lightweight devices.
- an embodiment of the present disclosure provides a data processing method.
- a model compression method suitable for generative adversarial networks is designed.
- one-step compression of generative adversarial networks is realized through model distillation, which reduces the complexity of model compression and improves the efficiency of model compression.
- the training effect of the smaller generator is improved through the online distillation method in which the smaller generator and the larger generator are alternately trained.
- the resulting generator is smaller in size, suitable for lightweight devices, and the processed image quality is better.
- the data processing method provided by the embodiments of the present disclosure may be applied in a terminal or a server.
- when the method is applied to a terminal, real-time processing of images collected by the terminal can be realized.
- when the method is applied to a server, it can realize the processing of images sent by the terminal.
- the terminal device may be a personal digital assistant (PDA) device, a handheld device with a wireless communication function (such as a smartphone or a tablet computer), a computing device (such as a personal computer (PC)), a vehicle-mounted device, a wearable device (such as a smart watch or a smart bracelet), or a smart home device (such as a smart display device), etc.
- FIG. 2 is a first schematic flowchart of a data processing method provided by an embodiment of the present disclosure. As shown in Figure 2, the data processing method includes:
- the image to be processed may be an image captured by the terminal in real time, or one or more frames of images acquired from a video captured by the terminal in real time.
- the image to be processed may be an image input by the user or selected by the user.
- the user inputs the image to be processed on the display interface of the terminal, or selects the image to be processed.
- the server receives the image input or selected by the user and sent by the terminal.
- the image to be processed may be an image played on the terminal in real time.
- the terminal acquires the image or video frame being played. Thereby, the processing of the image played on the terminal in real time is realized.
- the image to be processed is an image in a database pre-stored on the terminal and/or the server.
- a database storing a plurality of images to be processed is pre-established at the terminal, and images to be processed are obtained from the database when image processing is performed.
- the model size of the first generator is smaller than the model size of the second generator; therefore, compared with the first generator, the second generator has stronger image processing capability: it can extract more detailed image features and produce higher-quality images, but its image processing consumes more computing resources.
- model distillation is used to train the generative adversarial network.
- the first generator and the second generator are trained alternately, that is, online distillation is performed on the first generator and the second generator, so that the optimized second generator guides the optimization of the first generator, allowing the first generator, whose model size is smaller than the second generator's, to approach the second generator in image processing quality.
- the training process of the generative adversarial network can be carried out on the server. Considering that the computing power of the terminal is weak, the first generator with a smaller model size after model distillation can be deployed on the terminal.
- the image to be processed is input into the first generator directly, or after preprocessing operations such as cropping, denoising, and enhancement, and the processed image is obtained.
- in this way, model compression of the generative adversarial network is realized, the efficiency and effect of model compression are improved, and a first generator with a small model size and high output image quality is obtained, which is especially suitable for deployment on lightweight devices for image processing, improving the processing efficiency and quality of images on lightweight devices.
- the training process of the generative adversarial network is carried out separately from the process of applying it to image processing. For example, after the generative adversarial network is trained on the server, the trained student generator (the first generator) is deployed on the terminal, and image processing is performed through the student generator. After the server updates the generative adversarial network, the student generator can be redeployed on the terminal.
- FIG. 3 is a schematic flow diagram of a training process of the generative adversarial network in the data processing method provided by an embodiment of the present disclosure, that is, a schematic diagram of an alternate training process of the first generator and the second generator in the generative adversarial network.
- an alternate training process of the first generator and the second generator in the generative adversarial network includes:
- the sample data includes a sample image and a reference image corresponding to the sample image.
- for example, in depth estimation of an image, the sample data includes the sample image and the real depth map of the sample image; in face recognition, the sample data includes the sample image and the real face label map of the sample image, in which, for example, the location of each face can be manually marked.
- the sample image is processed by the second generator to obtain the processed sample image output by the second generator, which is referred to as the output image of the second generator for concise description.
- through the discriminator, the authenticity of the reference image corresponding to the sample image and of the output image of the second generator is discriminated, and the adversarial loss of the second generator is determined.
- the second generator makes its own output image close to the reference image corresponding to the sample image, and the discriminator tries to distinguish the output image of the second generator from the reference image corresponding to the sample image.
- the adversarial loss reflects the loss value of the authenticity discrimination performed by the discriminator on the output image of the second generator and the reference image corresponding to the sample image.
- in the process of discriminating, through the discriminator, the reference image corresponding to the sample image and the output image of the second generator, both the reference image and the output image of the second generator are input to the discriminator, and the discriminator respectively judges whether each of them comes from the sample data. Finally, the adversarial loss of the second generator (the teacher generator) is calculated according to the output of the discriminator when the reference image is input, the output of the discriminator when the output image of the second generator is input, and the adversarial loss function.
- the output of the discriminator is 1, which means that the input data of the discriminator comes from sample data
- the output of the discriminator is 0, which means that the input data of the discriminator does not come from sample data
- for example, determine the expected value of the discriminator's output when the reference image corresponding to the sample image is input to the discriminator, determine the expected value of the difference obtained by subtracting from 1 the discriminator's output when the output image of the second generator is input to the discriminator, and sum the two expected values to obtain the adversarial loss of the second generator.
- the adversarial loss function used to calculate the adversarial loss of the second generator is expressed as:
- L_GAN(G_T, D) is the adversarial loss function
- G_T represents the second generator
- D represents the discriminator
- x represents the sample image
- y represents the reference image corresponding to the sample image
- G_T(x) represents the output image of the second generator for the sample image
- E_{x,y}[·] represents the expectation under the sample data {x, y}
- E_{x}[·] represents the expectation under the sample data x.
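The formula itself does not survive in this extraction. Following the standard conditional GAN objective and the symbol definitions above, a plausible reconstruction is (whether the original uses the logarithmic form or raw discriminator outputs cannot be recovered from the text):

```latex
\mathcal{L}_{\mathrm{GAN}}(G_T, D) =
  \mathbb{E}_{\{x,y\}}\big[\log D(y)\big]
  + \mathbb{E}_{\{x\}}\big[\log\big(1 - D(G_T(x))\big)\big]
```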
- in one implementation, the loss value of the second generator is the adversarial loss of the second generator; that is, the adversarial loss obtained by the above calculation is directly used as the loss value for training the second generator.
- the second generator's loss value includes the second generator's reconstruction loss in addition to the adversarial loss.
- a possible implementation of S301 includes: processing the sample image through the second generator to obtain the output image of the second generator; discriminating, through the discriminator, the authenticity of the reference image corresponding to the sample image and of the output image of the second generator to determine the adversarial loss of the second generator; and determining the loss value of the second generator according to the adversarial loss and the difference between the reference image corresponding to the sample image and the output image of the second generator. Therefore, the loss value of the second generator considers both the adversarial loss from the discriminator's image discrimination and the reconstruction loss reflecting the difference between the reference image and the output image of the second generator, improving the comprehensiveness and accuracy of the loss value and thereby the training effect of the second generator.
- the difference between the reference image corresponding to the sample image and the output image of the second generator is determined, and the reconstruction loss of the second generator is calculated according to the difference.
- the reconstruction loss function used to calculate the reconstruction loss of the second generator is expressed as:
- L_recon(G_T, D) is the reconstruction loss function of the second generator
- y − G_T(x) is the difference between the reference image corresponding to the sample image and the output image of the second generator.
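The reconstruction loss formula is likewise missing from the extraction. A plausible reconstruction consistent with the symbol definitions (the choice of the L1 norm is an assumption; an L2 norm is equally common):

```latex
\mathcal{L}_{\mathrm{recon}}(G_T, D) =
  \mathbb{E}_{\{x,y\}}\big[\,\lVert y - G_T(x) \rVert_1\,\big]
```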
- the second generator can be adjusted according to the optimization objective function to complete a training session of the second generator.
- the optimization objective function is, for example, a function that maximizes the loss value or a function that minimizes the loss value; the optimization algorithm used in the adjustment process of the second generator, such as the gradient descent algorithm, is not limited here.
- the optimization objective function includes maximizing the adversarial loss with respect to the discriminator and minimizing the adversarial loss with respect to the second generator (the teacher generator).
- the optimization direction of the discriminator is to maximize the adversarial loss, so as to improve the discrimination ability of the discriminator; the optimization goal of the second generator is to minimize the adversarial loss, so that the output image of the second generator approaches the reference image corresponding to the sample image and the discriminator judges the output image of the second generator to be from the sample data.
- the optimization objective function further includes minimizing the reconstruction loss with respect to the second generator; that is, by adjusting the second generator to minimize the reconstruction loss, the output image of the second generator approaches the reference image corresponding to the sample image, improving the image quality of the output image of the second generator.
- the optimization objective function of the second generator is expressed as:
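The objective itself is missing from the extraction. A plausible reconstruction combining the two requirements above (the weighting factor λ is an assumption):

```latex
G_T^{*} = \arg\min_{G_T}\,\max_{D}\;
  \mathcal{L}_{\mathrm{GAN}}(G_T, D)
  + \lambda\,\mathcal{L}_{\mathrm{recon}}(G_T, D)
```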
- S303 Determine a distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator.
- the sample image is processed by the adjusted second generator, and the sample image is processed by the first generator. Since the model scale of the second generator is larger than that of the first generator, differences exist between the data obtained by processing the sample image through the adjusted second generator and the data obtained by processing it through the first generator, and from these differences the distillation loss between the adjusted second generator and the first generator is determined.
- examples of distillation losses and procedures for determining them are provided below.
- the distillation loss between the adjusted second generator and the first generator comprises an output distillation loss between the adjusted second generator and the first generator.
- the network layer includes an input layer, an intermediate layer and an output layer
- the output distillation loss is the distillation loss between the output layer of the second generator and the output layer of the first generator, reflecting the difference between the output image of the second generator and the output image of the first generator.
- a possible implementation of S303 includes: using the first generator and the adjusted second generator to process the sample image respectively, obtaining the output image of the first generator and the output image of the second generator; and determining the output distillation loss according to the difference between the output image of the first generator and the output image of the second generator.
- the difference between the output image of the first generator and the output image of the second generator can be obtained by comparing the two images, for example, by comparing each pixel in the output image of the first generator with the pixel at the corresponding position in the output image of the second generator, or by comparing the output image of the first generator as a whole with the output image of the second generator.
- the optimization of the first generator is guided by the output distillation loss reflecting the difference between the output image of the first generator and the output image of the second generator, so that the output image of the first generator gradually approaches the output image of the adjusted second generator, which is beneficial to improving the image quality of images processed by the first generator.
- the output distillation loss includes a structural similarity loss and/or a perceptual loss between the output image of the first generator and the output image of the second generator.
- the structural similarity loss mimics how the Human Visual System (HVS) observes images, focusing on local structural differences between the output image of the first generator and the output image of the second generator, including differences in brightness, contrast, etc.
- the perceptual loss focuses on the difference in feature representation between the output image of the first generator and the output image of the second generator.
- for example, the output image of the second generator and the output image of the first generator are compared in terms of their local structural differences, and the structural similarity loss is obtained.
- feature extraction is performed on the output image of the first generator and the output image of the second generator through a feature extraction network, and the perceptual loss between the two output images is determined, for example, by comparing the extracted features of the output image of the first generator with those of the output image of the second generator.
- the output distillation loss is determined based on the structural similarity loss and/or the perceptual loss: for example, the output distillation loss is the structural similarity loss, or the perceptual loss, or a weighted sum of the structural similarity loss and the perceptual loss.
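A minimal sketch of the three combination options just described (the weights are illustrative assumptions; the patent does not fix specific values):

```python
def output_distillation_loss(ssim_loss=None, perceptual_loss=None,
                             w_ssim=1.0, w_perc=1.0):
    """Combine structural similarity and/or perceptual losses into the
    output distillation loss. Weights w_ssim/w_perc are assumptions."""
    if ssim_loss is None and perceptual_loss is None:
        raise ValueError("at least one loss term is required")
    total = 0.0
    if ssim_loss is not None:       # option 1: SSIM loss alone
        total += w_ssim * ssim_loss
    if perceptual_loss is not None:  # option 2: perceptual loss alone
        total += w_perc * perceptual_loss
    return total                     # option 3: weighted sum of both
```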
- the structural similarity loss and/or perceptual loss determine the difference between the output image of the first generator and the output image of the second generator from one or more aspects such as human vision and feature representation, improving the comprehensiveness and accuracy of the output distillation loss and thereby the training effect of the first generator.
- the process of determining the structural similarity loss includes: determining the brightness estimate of the output image of the second generator, the brightness estimate of the output image of the first generator, the contrast estimate of the output image of the second generator, the contrast estimate of the output image of the first generator, and the structural similarity estimate between the output image of the second generator and the output image of the first generator; and determining, according to these parameters, the structural similarity loss between the output image of the first generator and the output image of the second generator.
- for example, calculate the pixel mean and pixel standard deviation of the output image of the second generator, calculate the pixel mean and pixel standard deviation of the output image of the first generator, and calculate the covariance between the pixels of the output image of the second generator and the pixels of the output image of the first generator.
- the brightness estimation and contrast estimation of the output image are determined as the pixel mean value and pixel standard deviation of the output image respectively.
- a structural similarity estimate between the output image of the second generator and the output image of the first generator is determined as a covariance between pixels of the output image of the second generator and pixels of the output image of the first generator.
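The statistics listed above can be sketched in a few lines. This is a hypothetical helper over whole images; practical SSIM implementations usually compute these statistics over local windows:

```python
import numpy as np

def ssim_statistics(img_t, img_s):
    """Whole-image statistics used by the structural similarity loss:
    pixel means (brightness estimates), standard deviations (contrast
    estimates), and the cross-covariance (structural similarity estimate)."""
    mu_t, mu_s = float(img_t.mean()), float(img_s.mean())
    sigma_t, sigma_s = float(img_t.std()), float(img_s.std())
    # covariance between teacher-output pixels and student-output pixels
    sigma_ts = float(((img_t - mu_t) * (img_s - mu_s)).mean())
    return mu_t, mu_s, sigma_t, sigma_s, sigma_ts
```

For identical images the cross-covariance equals the variance, so the structure term of SSIM reaches its maximum.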
- L_SSIM(p_t, p_s) represents the structural similarity loss function
- p_t and p_s represent the output image of the second generator and the output image of the first generator, respectively
- μ_t and μ_s represent the brightness estimates of the output image of the second generator and of the first generator, and σ_t and σ_s represent the corresponding contrast estimates
- σ_ts denotes the structural similarity estimate between the output image of the second generator and the output image of the first generator.
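The loss formula does not survive extraction. The standard SSIM form consistent with the symbols above is a plausible reconstruction (the stabilizing constants C1 and C2 belong to the standard definition and are an assumption here):

```latex
\mathrm{SSIM}(p_t, p_s) =
  \frac{(2\mu_t\mu_s + C_1)\,(2\sigma_{ts} + C_2)}
       {(\mu_t^2 + \mu_s^2 + C_1)\,(\sigma_t^2 + \sigma_s^2 + C_2)},
\qquad
\mathcal{L}_{\mathrm{SSIM}}(p_t, p_s) = 1 - \mathrm{SSIM}(p_t, p_s)
```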
- the process of determining the perceptual loss includes: inputting the output image of the first generator and the output image of the second generator into a feature extraction network, respectively, and obtaining the features of the output image of the first generator and the features of the output image of the second generator produced by a preset network layer of the feature extraction network; and determining a feature reconstruction loss and/or a style reconstruction loss based on the difference between the features of the output image of the first generator and the features of the output image of the second generator.
- the perceptual loss includes a feature reconstruction loss and/or a style reconstruction loss: the feature reconstruction loss reflects the difference between the lower-level (more concrete) feature representation of the output image of the first generator and that of the output image of the second generator, and is used to encourage the output image of the first generator to have a feature representation similar to that of the output image of the second generator;
- the style reconstruction loss reflects the difference between the more abstract style features (e.g. color, texture, pattern) of the output image of the first generator and those of the output image of the second generator, and is used to encourage the output image of the first generator to have style features similar to those of the output image of the second generator.
- because different network layers of the same feature extraction network extract features at different levels of abstraction: the features of the output image of the first generator and of the output image of the second generator extracted by a network layer that extracts low-level features are obtained, and the feature reconstruction loss is determined according to the difference between them; the features of the output image of the first generator and of the output image of the second generator extracted by a network layer that extracts abstract features are obtained, and the style reconstruction loss is determined according to the difference between them.
- image features are extracted through different feature extraction networks, where one feature extraction network is good at extracting underlying feature representations, and the other feature extraction network is good at extracting abstract style features.
- the feature reconstruction loss and the style reconstruction loss are respectively determined.
- the feature extraction network is a Visual Geometry Group (VGG) network, where the VGG network is a deep convolutional neural network that can be used to extract the features of the output image of the first generator and the features of the output image of the second generator. Therefore, features at different abstraction levels of the output image of the first generator and the output image of the second generator can be obtained from different network layers of the same VGG network, or from network layers of different VGG networks.
- the feature reconstruction loss function used to calculate the feature reconstruction loss is expressed as:
- L_fea(p_t, p_s) = (1 / (C_j × H_j × W_j)) · ||φ_j(p_t) − φ_j(p_s)||₂²
- L_fea(p_t, p_s) represents the feature loss function, used to calculate the feature reconstruction loss between the output image p_t of the second generator and the output image p_s of the first generator
- φ_j(p_t) represents the feature activation values (i.e. features) of the output image of the second generator extracted by the j-th layer of the VGG network φ
- φ_j(p_s) represents the feature activation values of the output image of the first generator extracted by the j-th layer of the VGG network φ
- C_j × H_j × W_j represents the dimensionality of the feature activation values output by the j-th layer of the VGG network φ.
- style reconstruction loss function used to calculate the style reconstruction loss is expressed as:
- L_style(p_t, p_s) represents the style loss function, used to calculate the style reconstruction loss between the output image p_t of the second generator and the output image p_s of the first generator
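The two perceptual-loss terms can be sketched as follows. This is a hedged sketch following the common feature/style reconstruction formulation: the feature reconstruction loss is a mean squared difference over a C×H×W feature map, and the style reconstruction loss compares Gram matrices of the features. The Gram-matrix form and the names `phi_t`, `phi_s` are assumptions; in practice the inputs would be activations taken from a layer of a VGG-style network.

```python
def feature_reconstruction_loss(phi_t, phi_s):
    """Mean squared difference between two feature maps.
    phi_t, phi_s: list of channels, each a flat list of activations."""
    n = sum(len(ch) for ch in phi_t)  # C*H*W normalizer
    return sum((a - b) ** 2
               for ct, cs in zip(phi_t, phi_s)
               for a, b in zip(ct, cs)) / n

def gram(phi):
    """Gram matrix of a feature map: normalized channel-by-channel inner products."""
    n = len(phi) * len(phi[0])  # C*H*W normalizer
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in phi] for ci in phi]

def style_reconstruction_loss(phi_t, phi_s):
    """Squared Frobenius distance between the two Gram matrices."""
    g_t, g_s = gram(phi_t), gram(phi_s)
    return sum((g_t[i][j] - g_s[i][j]) ** 2
               for i in range(len(g_t)) for j in range(len(g_t)))
```

Because the Gram matrix discards spatial layout and keeps only channel correlations, the style term compares texture and color statistics rather than exact pixel placement, which matches the "abstract style features" described above.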
- the distillation loss is backpropagated, and the model parameters of the first generator are adjusted during the backpropagation process, so that the first generator is optimized in the direction of minimizing the distillation loss.
- the distillation loss includes the output distillation loss; the output distillation loss is backpropagated, and the model parameters of the first generator are adjusted during the backpropagation process, so that the first generator is optimized in the direction of minimizing the output distillation loss.
- the concept and determination process of the output distillation loss can refer to the description of the preceding steps, and will not be repeated here.
- the online loss of the first generator relative to the second generator also includes a total variation loss of the output image of the first generator.
- the total variation loss of the output image of the first generator is used to reflect the spatial smoothness of the output image of the first generator, and optimizing the first generator through the total variation loss can improve the spatial smoothness of the output image of the first generator and thereby improve image quality.
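The total variation loss can be sketched as below. This assumes the common anisotropic form (sum of squared differences between horizontally and vertically adjacent pixels); the patent text does not reproduce its exact TV formula, so the details here are illustrative.

```python
def total_variation_loss(img):
    """img: 2-D list (rows of pixel values). Sums squared differences
    between horizontally and vertically adjacent pixels; a noisy image
    scores high, a spatially smooth image scores low."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:                          # vertical neighbor
                tv += (img[i + 1][j] - img[i][j]) ** 2
            if j + 1 < w:                          # horizontal neighbor
                tv += (img[i][j + 1] - img[i][j]) ** 2
    return tv
```

A constant image has zero total variation, while a checkerboard-like noisy patch scores high, which is why minimizing this term suppresses noise in the generator's output.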
- a possible implementation of S304 includes: weighting and summing the distillation loss and the total variation loss to obtain the online loss of the first generator; and adjusting the first generator according to the online loss of the first generator.
- the weights corresponding to the distillation loss and the total variation loss can be determined by professionals based on experience and experimental processes.
- in this way, both the difference in image processing between the first generator and the second generator and the noise in the image output by the first generator are taken into account, and by weighting the distillation loss and the total variation loss the two are balanced, which is beneficial to improving the training effect of the first generator.
- the distillation loss includes the output distillation loss
- the output distillation loss includes the structural similarity loss between the output image of the first generator and the output image of the second generator
- the perceptual loss includes the feature reconstruction loss and/or the style reconstruction loss between the output image of the first generator and the output image of the second generator
- the online distillation loss function used to calculate the online loss of the first generator is expressed as:
- L_kd(p_t, p_s) = λ_ssim·L_ssim + λ_fea·L_fea + λ_style·L_style + λ_tv·L_tv.
- L_kd(p_t, p_s) represents the loss function of the first generator
- λ_ssim, λ_fea, λ_style, and λ_tv represent the weight corresponding to the structural similarity loss L_ssim, the weight corresponding to the feature reconstruction loss L_fea, the weight corresponding to the style reconstruction loss L_style, and the weight corresponding to the total variation loss L_tv, respectively.
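The weighted online loss described above is a straightforward linear combination. The sketch below mirrors that structure; the default weight values are placeholders, since the text states the weights are tuned by practitioners based on experience and experiment.

```python
def online_distillation_loss(l_ssim, l_fea, l_style, l_tv,
                             lam_ssim=1.0, lam_fea=1.0,
                             lam_style=1.0, lam_tv=1.0):
    """Weighted sum of the four loss terms:
    L_kd = lam_ssim*L_ssim + lam_fea*L_fea + lam_style*L_style + lam_tv*L_tv.
    Weight defaults are illustrative placeholders, not values from the source."""
    return (lam_ssim * l_ssim + lam_fea * l_fea
            + lam_style * l_style + lam_tv * l_tv)
```

Raising one weight (e.g. `lam_tv`) shifts the optimization of the first generator toward that objective, which is how the balance between fidelity to the teacher and output smoothness is controlled.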
- the second generator and the first generator are distilled online, that is, the second generator and the first generator perform training synchronously.
- in the current training epoch, the first generator is optimized using only the adjusted second generator.
- on the one hand, the first generator is trained in an environment with a discriminator but does not need to be tightly bound to the discriminator, so that it can be trained more flexibly and compressed further; on the other hand, the optimization of the first generator does not require real labels: the first generator only learns the output of the second generator, which has a similar structure and a larger model size, which effectively reduces the difficulty for the first generator of fitting real labels.
- the first generator is a student generator and the second generator is a teacher generator.
- the model structure of the student generator is similar to that of the teacher generator.
- the scale and complexity of the model of the teacher generator are larger than those of the student generator, so the teacher generator has a stronger learning ability and can better guide the training of the student generator during the distillation process.
- the teacher generator includes a first teacher generator and a second teacher generator, where the model capacity of the first teacher generator is greater than the model capacity of the student generator, and the model depth of the second teacher generator is greater than the model depth of the student generator.
- distilling the student generator with two different teacher generators from two complementary dimensions can provide a complementary, comprehensive distillation loss for the student generator during model distillation, as follows: the first teacher generator compensates the student generator in terms of model capacity (that is, model width, also known as the number of channels of the model), enabling it to capture more detailed image information that the student generator cannot capture by itself; the second teacher generator compensates the student generator in terms of model depth, achieving better image quality.
- the student generator is similar to the first teacher generator and the second teacher generator in terms of model structure, both of which are deep learning models including four network layers.
- the number of channels of the middle layer of the first teacher generator is a multiple of the number of channels of the middle layer of the student generator, wherein the multiple is greater than 1. Therefore, the relationship between the first teacher generator and the student generator is established succinctly through the multiple relationship, which is more conducive to the calculation of the channel distillation loss in the subsequent embodiments.
- the number of network layers of the second teacher generator is greater than the number of network layers of the student generator.
- one or more network layers are added before each upsampling network layer and each downsampling network layer of the student generator to obtain the second teacher generator.
- a deep residual network (Resnet) module is added before each upsampling network layer and each downsampling network layer of the student generator to obtain the second teacher generator.
- the loss value of the first teacher generator can be determined according to the sample data and the discriminator, where the sample data includes the sample image and the reference image of the sample image; the first teacher generator is adjusted according to the loss value of the first teacher generator; the loss value of the second teacher generator is determined according to the sample data and the discriminator; the second teacher generator is adjusted according to the loss value of the second teacher generator; and the student generator is adjusted according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator.
- the adjustment of the first teacher generator and the adjustment of the second teacher generator can refer to the adjustment of the second generator in the previous embodiment.
- the difference from the previous embodiment is that, when adjusting the student generator, it is necessary to determine the distillation loss between the first teacher generator and the student generator and the distillation loss between the second teacher generator and the student generator.
- the determination process of the distillation loss between the first teacher generator and the student generator, and the determination process of the distillation loss between the second teacher generator and the student generator, can refer to the process of determining the distillation loss between the second generator and the first generator in the previous embodiment, and will not be repeated here.
- the discriminator includes a first discriminator and a second discriminator, and there is a shared convolutional layer between the first discriminator and the second discriminator.
- the first teacher generator uses the first discriminator, and the second teacher generator uses the second discriminator. Thus, fully considering that the model structures of the first teacher generator and the second teacher generator are similar but not identical, the first discriminator and the second discriminator, which share convolutional layers, are used to train the first teacher generator and the second teacher generator respectively, improving the effect and efficiency of model training.
- FIG. 4 is a second schematic flow diagram of the training process of the generative adversarial network in the data processing method provided by an embodiment of the present disclosure, that is, a schematic flow diagram of an alternate training process of the student generator, the first teacher generator, and the second teacher generator in the generative adversarial network.
- an alternate training process of the student generator, the first teacher generator, and the second teacher generator in the generative adversarial network includes:
- the loss value of the first teacher generator includes an adversarial loss of the first teacher generator.
- the adversarial loss function used to calculate the adversarial loss of the first teacher generator can be expressed as:
- the loss value of the first teacher generator also includes the reconstruction loss of the first teacher generator.
- the reconstruction loss function used to calculate the reconstruction loss of the first teacher generator can be expressed as:
- the optimization objective function of the first teacher generator is expressed as:
- the loss value of the second teacher generator includes an adversarial loss of the second teacher generator.
- the adversarial loss function used to calculate the adversarial loss of the second teacher generator can be expressed as:
- the loss value of the second teacher generator also includes the reconstruction loss of the second teacher generator.
- the reconstruction loss function used to calculate the reconstruction loss of the second teacher generator can be expressed as:
- the optimization objective function of the second teacher generator is expressed as:
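The "expressed as:" formulas above were not reproduced in this extraction. As a hedged sketch, a standard conditional-GAN formulation consistent with the surrounding description (an adversarial loss against a discriminator, an L1-style reconstruction loss against the reference image, and their weighted combination as the optimization objective, applied analogously to each teacher generator) would be:

```latex
% Hedged sketch of the elided teacher-generator objectives; symbol names
% and the L1 reconstruction form are assumptions, not the patent's formulas.
% G_T: a teacher generator; D: its discriminator; x: sample image;
% y: reference image; \lambda: reconstruction weight.
\begin{aligned}
L_{\mathrm{adv}}(G_T, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G_T(x))\big)\big] \\
L_{\mathrm{rec}}(G_T)    &= \mathbb{E}_{x,y}\big[\lVert y - G_T(x)\rVert_1\big] \\
G_T^{*}                  &= \arg\min_{G_T}\max_{D}\;
  L_{\mathrm{adv}}(G_T, D) + \lambda\, L_{\mathrm{rec}}(G_T)
\end{aligned}
```

Under this form, the first teacher generator is trained against the first discriminator and the second teacher generator against the second discriminator, each with its own adversarial and reconstruction terms.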
- the sample image is processed by the adjusted first teacher generator, the adjusted second teacher generator, and the student generator respectively.
- since the model capacity of the first teacher generator is larger than the model capacity of the student generator, and the model depth of the second teacher generator is larger than the model depth of the student generator, the distillation loss between the student generator and the adjusted first teacher generator and the distillation loss between the student generator and the adjusted second teacher generator can be determined from the output images obtained by processing the sample image.
- the student generator is adjusted, and in each training round the optimization of the student generator is guided by the optimized first teacher generator and the optimized second teacher generator, integrating the two teacher generators to improve the training effect of the student generator.
- the distillation loss between the first teacher generator and the student generator includes an output distillation loss between the first teacher generator and the student generator, that is, the distillation loss between the output layer of the first teacher generator and the output layer of the student generator.
- the output distillation loss may include: a structural similarity loss and/or a perceptual loss between the output image of the first teacher generator and the output image of the student generator.
- the perceptual loss may include: a feature reconstruction loss and/or a style reconstruction loss between the output image of the first teacher generator and the output image of the student generator.
- the distillation loss between the second teacher generator and the student generator includes an output distillation loss between the second teacher generator and the student generator, that is, the distillation loss between the output layer of the second teacher generator and the output layer of the student generator.
- the output distillation loss may include: a structural similarity loss and/or a perceptual loss between the output image of the second teacher generator and the output image of the student generator.
- the perceptual loss may include: a feature reconstruction loss and/or a style reconstruction loss between the output image of the second teacher generator and the output image of the student generator.
- the model depth of the first teacher generator is the same as the model depth of the student generator, but the model capacity of the first teacher generator is larger, that is, the number of channels of its convolutional layers is greater, so it is able to capture details that the student generator cannot.
- the distillation loss between the teacher generator and the student generator only includes the output distillation loss; that is, in the model distillation process only the information of the output layer of the teacher generator is distilled, or in other words, only the discrepancy between the output image of the teacher generator and the output image of the student generator is considered, without taking into account the information of the intermediate layers of the teacher generator.
- the information of the middle layer of the first teacher generator, that is, channel-granularity information, can be used as one of the supervisory signals for the optimization process of the student generator to further improve the training effect of the student generator.
- the distillation loss between the first teacher generator and the student generator includes the output distillation loss and the channel distillation loss between the first teacher generator and the student generator, where the channel distillation loss is the distillation loss between the middle layer of the first teacher generator and the middle layer of the student generator, reflecting the difference between the features of the sample image extracted by the middle layer of the first teacher generator and the features of the sample image extracted by the middle layer of the student generator. Therefore, combining the output distillation loss and the channel distillation loss as the supervisory information for the optimization of the student generator can realize multi-granularity model distillation and improve the effect of model distillation.
- a possible implementation of S405 includes: processing the sample image with the student generator, the adjusted first teacher generator, and the adjusted second teacher generator respectively; determining the first output distillation loss according to the output image of the student generator and the output image of the first teacher generator, where the first output distillation loss is the distillation loss between the output layer of the first teacher generator and the output layer of the student generator; determining the channel distillation loss according to the feature map output by the middle layer of the student generator and the feature map output by the middle layer of the first teacher generator; determining the second output distillation loss according to the output image of the student generator and the output image of the second teacher generator, where the second output distillation loss is the distillation loss between the output layer of the second teacher generator and the output layer of the student generator; and adjusting the student generator according to the first output distillation loss, the channel distillation loss, and the second output distillation loss.
- the feature map output by the middle layer of the student generator refers to the feature of the sample image extracted by the middle layer of the student generator, including the feature map value output by each channel in the middle layer of the student generator.
- the feature map output by the middle layer of the first teacher generator refers to the features of the sample image extracted by the middle layer of the first teacher generator, including the features output by each channel in the middle layer of the first teacher generator map value.
- when the model depth of the student generator is the same as that of the first teacher generator, for each intermediate layer, the difference between the feature map output by that intermediate layer of the student generator and the feature map output by the corresponding intermediate layer of the first teacher generator is determined, and the channel distillation loss is determined accordingly.
- the student generator is adjusted according to the first output distillation loss, the channel distillation loss, and the second output distillation loss.
- a channel convolution layer is connected between the middle layer of the first teacher generator and the middle layer of the student generator, and the channel convolution layer is used to establish the mapping relation between the channels of the middle layer of the first teacher generator and the channels of the middle layer of the student generator. Based on the channel convolution layer, each channel of the middle layer of the student generator has a corresponding channel among the channels of the middle layer of the first teacher generator; without changing the number of channels of the middle layer of the student generator, the channel convolution layer expands the channel number of the middle layer of the student generator in the process of determining the channel distillation loss.
- a possible implementation of determining the channel distillation loss includes: determining the attention weight of each channel in the middle layer of the student generator according to the feature map output by each channel in the middle layer of the student generator; determining the attention weight of each channel in the middle layer of the first teacher generator according to the feature map output by each channel in the middle layer of the first teacher generator; and determining the channel distillation loss from the difference between the attention weights of the channels mapped to each other in the middle layer of the student generator and the middle layer of the first teacher generator. Here, the attention weight of a channel is used to measure the importance of the channel.
- the attention weight of the channel can be calculated based on the pixels on the feature map output by the channel, for example, the sum or mean of all pixels on the feature map output by the channel , determined as the attention weight of the channel.
- the channels that are mapped to each other are determined, the attention weights of the channels mapped to each other are compared, the difference between those attention weights is determined, and the channel distillation loss is determined in turn.
- the channel convolution layer is a 1×1 learnable convolution layer, and the channels of the intermediate layer of the student generator are mapped through the 1×1 learnable convolution layer to the channels of the corresponding intermediate layer of the first teacher generator, so that the number of channels of the middle layer of the student generator is upscaled to be consistent with the number of channels of the corresponding middle layer of the first teacher generator.
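The 1×1 channel convolution can be sketched as a per-pixel linear combination of the input channels. This is an illustrative, framework-free sketch (in practice this would be a learnable 1×1 convolution layer); the weight matrix here is supplied by hand rather than learned.

```python
def channel_1x1_conv(student_feats, weight):
    """Maps C_s student channels to C_t teacher channels with a 1x1
    convolution: each output channel is a per-pixel linear combination
    of the input channels, so spatial size is unchanged while the
    channel count is expanded.
    student_feats: C_s channels, each a flat list of H*W values.
    weight: C_t x C_s matrix (learnable in practice)."""
    n_pix = len(student_feats[0])
    return [[sum(weight[o][c] * student_feats[c][p]
                 for c in range(len(student_feats)))
             for p in range(n_pix)]
            for o in range(len(weight))]
```

With a 3×2 weight matrix, two student channels are upscaled to three output channels, matching the idea of aligning the student's middle layer with the wider teacher's channel count without changing the student model itself.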
- the attention weight of the channel can be calculated according to each pixel on the feature map output by the channel and the size of the feature map output by the channel.
- the pixels of each row on the feature map output by the channel can be added to obtain the pixel sum corresponding to each row; the pixel sums of the rows are added to obtain the pixel sum corresponding to the feature map; and the pixel sum is averaged according to the size of the feature map to obtain the attention weight of the channel.
- the calculation formula of the attention weight of the channel can be expressed as:
- w_c = (1 / (H × W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)
- w_c represents the attention weight of channel c
- H is the height of the feature map output by channel c
- W is the width of the feature map output by channel c
- u_c(i, j) is the pixel at position (i, j) on the feature map.
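The row-sum procedure above reduces to a mean over the feature map. A minimal sketch:

```python
def channel_attention_weight(feature_map):
    """w_c = (1/(H*W)) * sum over all pixels u_c(i, j):
    sum each row, sum the row totals, then divide by the map size."""
    h = len(feature_map)
    w = len(feature_map[0])
    row_sums = [sum(row) for row in feature_map]   # per-row pixel sums
    return sum(row_sums) / (h * w)                 # average over H*W
```

For example, a 2×2 feature map `[[1, 2], [3, 4]]` yields a weight of 10/4 = 2.5; channels whose activations are larger on average receive larger importance weights.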
- the determination process of the attention weights of each channel in the middle layer of the first teacher generator can refer to the relevant description of the student generator, and will not be repeated here.
- for each intermediate layer of the student generator and the corresponding intermediate layer of the first teacher generator, the difference between the attention weights of each pair of channels mapped to each other is determined, and the channel distillation loss is determined according to these differences, the number of samples of the feature maps in the middle layer of the student generator and the middle layer of the first teacher generator, and the number of mapped channels of the feature maps.
- the accuracy of the channel distillation loss is improved by considering not only the attention weights of each pair of channels mapped to each other, but also the number of channels in the intermediate layer and the number of feature-map samples in the intermediate layer.
- the channel distillation loss function used to calculate the channel distillation loss can be expressed as:
- L_CD = (1 / (n × c)) Σ_{i=1..n} Σ_{j=1..c} (w_ij^t − w_ij^s)², where w_ij^t and w_ij^s denote the attention weights of the j-th pair of mutually mapped channels for the i-th feature-map sample in the first teacher generator and the student generator respectively
- n represents the number of samples of the feature map
- c represents the number of channels mapped by the feature map
- the channel distillation loss is weighted according to the channel loss weight factor to obtain a weighted result, and the student generator is tuned according to the weighted result, the first output distillation loss, and the second output distillation loss. Therefore, by adjusting the channel loss weight factor, the degree of influence of the channel distillation loss on the optimization process of the student generator can be adjusted, improving the flexibility of the training of the student generator.
- in the process of adjusting the student generator according to the weighted result of the channel distillation loss and the channel loss weight factor, the first output distillation loss, and the second output distillation loss, the online loss of the student generator with respect to the first teacher generator and the online loss of the student generator with respect to the second teacher generator can be determined based on the first output distillation loss and the second output distillation loss respectively.
- the online loss of the student generator relative to the first teacher generator, the online loss of the student generator relative to the second teacher generator, and the weighted result of the channel distillation loss and the channel loss weight factor are weighted and summed, balancing these loss values by weighting, to obtain the final loss value of the multi-granularity online distillation of the student generator.
- the model parameters of the student generator are adjusted to realize the optimization of the student generator.
- the online loss of the student generator relative to the first teacher generator includes: the output distillation loss between the student generator and the first teacher generator.
- the online loss of the student generator with respect to the first teacher generator contains: the output distillation loss between the student generator and the first teacher generator and the total variation loss of the output image of the student generator.
- the online loss of the student generator with respect to the second teacher generator consists of: the output distillation loss between the student generator and the second teacher generator.
- the online loss of the student generator with respect to the second teacher generator contains: the output distillation loss between the student generator and the second teacher generator and the total variation loss of the output image of the student generator.
- the objective loss function used to calculate the final loss value of the multi-grain online distillation of the student generator can be expressed as:
- ⁇ CD represents the channel loss weight factor.
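The final multi-granularity objective combines the two online losses with the λ_CD-scaled channel distillation loss. The additive form below is a sketch inferred from the weighted-sum description above, not a reproduction of the patent's elided formula:

```python
def total_student_loss(l_kd_t1, l_kd_t2, l_cd, lam_cd=1.0):
    """Final loss of multi-granularity online distillation:
    the student's online loss w.r.t. the first (wider) teacher, plus its
    online loss w.r.t. the second (deeper) teacher, plus the channel
    distillation loss scaled by the channel loss weight factor lam_cd."""
    return l_kd_t1 + l_kd_t2 + lam_cd * l_cd
```

Setting `lam_cd` to zero recovers output-only distillation, while larger values make the channel-granularity supervisory signal from the wider teacher's middle layer more influential.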
- FIG. 5 is an example diagram of a model structure of a generative adversarial network provided by an embodiment of the present disclosure.
- the generative adversarial network includes two teacher generators, a student generator G S and two discriminators that are partially shared.
- the two teacher generators include a wider teacher generator and a deeper teacher generator
- the two discriminators share the previous multiple convolutional layers.
- the wider teacher generator is the first teacher generator in the foregoing embodiments; compared with the student generator, the deeper teacher generator is equivalent to the student generator with multiple Resnet modules inserted before and after its sampling layers, and its model depth is greater than that of the student generator. It can be seen that this deeper teacher generator is the second teacher generator in the foregoing embodiments.
- the middle layer of the wider teacher generator and the middle layer of the student generator are connected with a channel convolution layer (not marked in Figure 5).
- the channel convolution layer here is used to establish the mapping relationship between the channels of the middle layer of the wider teacher generator and the channels of the middle layer of the student generator, which facilitates the calculation of the channel distillation loss between the wider teacher generator and the student generator.
- the sample image (the contour map of the high-heeled shoes in Fig. 5) is fed into the wider teacher generator, the student generator, and the deeper teacher generator respectively.
- the GAN loss of the wider teacher generator, that is, the adversarial loss of the first teacher generator in the previous example, is determined through the partially shared discriminator, the output image of the wider teacher generator, and the ground-truth label (the reference image in the previous example), and is used to adjust the wider teacher generator. Similarly, the GAN loss of the deeper teacher generator, that is, the adversarial loss of the second teacher generator in the previous example, is used to adjust the deeper teacher generator.
- the channel distillation loss is calculated based on the difference between the middle layers of the wider teacher generator and the student generator. Based on the output image of the wider teacher generator, the output image of the student generator, and the output image of the deeper teacher generator, the output distillation loss between the wider teacher generator and the student generator and the output distillation loss between the deeper teacher generator and the student generator are determined.
- the channel distillation loss and the two classes of output distillation losses are used to tune the student generator.
- FIG. 6 is a structural block diagram of a data processing device provided in an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown.
- the data processing device includes: an acquisition module 601 and a processing module 602 .
- the processing module 602 is configured to process the image through the first generator to obtain a processed image.
- the data processing device is applicable to a generative adversarial network obtained by model distillation
- the generative adversarial network includes the first generator, the second generator and the discriminator
- the model distillation alternately trains the first generator and the second generator, and the model size of the first generator is smaller than the model size of the second generator.
- an alternate training process of the first generator and the second generator in the generative adversarial network includes:
- the first generator is adjusted.
- determining the loss value of the second generator includes:
- the loss value of the second generator is determined.
- the network layer includes an input layer, an intermediate layer, and an output layer; determining the distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator includes:
- the output distillation loss is determined; the output distillation loss is the distillation loss between the output layer of the second generator and the output layer of the first generator.
- the output distillation loss is determined according to the difference between the output image of the first generator and the output image of the second generator, including:
- the output distillation loss is determined.
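The claims below specify that this output distillation term can build on a structural similarity loss derived from the luminance and contrast of the two output images. An SSIM-style sketch of such a term (the constants and the exact similarity form are assumptions, not taken from the patent):

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def structural_similarity_loss(img_a, img_b, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - (luminance term * contrast term), SSIM-style: luminance is
    compared via the means, contrast via the standard deviations."""
    mu_a, mu_b = mean(img_a), mean(img_b)
    sa, sb = std(img_a), std(img_b)
    luminance = (2 * mu_a * mu_b + c1) / (mu_a * mu_a + mu_b * mu_b + c1)
    contrast = (2 * sa * sb + c2) / (sa * sa + sb * sb + c2)
    return 1.0 - luminance * contrast
```

The loss is 0 for identical images and grows as the means or variances diverge, which is the behavior the output distillation step relies on.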
- the perceptual loss includes a feature reconstruction loss and/or a style reconstruction loss, and performing feature extraction on the output image of the first generator and the output image of the second generator through a feature extraction network to determine the perceptual loss between the output image of the first generator and the output image of the second generator includes:
- a feature reconstruction loss and/or a style reconstruction loss is determined based on the difference between the features of the output image of the first generator and the features of the output image of the second generator.
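A minimal sketch of the two perceptual-loss terms over toy `[channel][position]` feature lists; the squared-error distances and the Gram-matrix formulation of the style reconstruction loss are conventional choices and assumptions here:

```python
def gram(features):
    """Gram matrix of a [channel][position] feature map."""
    c = len(features)
    n = len(features[0])
    return [[sum(features[i][k] * features[j][k] for k in range(n))
             for j in range(c)] for i in range(c)]

def feature_reconstruction_loss(f_student, f_teacher):
    """Squared-error distance between raw extracted features."""
    return sum((a - b) ** 2
               for ch_s, ch_t in zip(f_student, f_teacher)
               for a, b in zip(ch_s, ch_t))

def style_reconstruction_loss(f_student, f_teacher):
    """Squared-error distance between Gram matrices of the features."""
    gs, gt = gram(f_student), gram(f_teacher)
    return sum((gs[i][j] - gt[i][j]) ** 2
               for i in range(len(gs)) for j in range(len(gs)))
```

In practice the features would come from a preset layer of the feature extraction network applied to the two output images.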
- adjusting the first generator according to the distillation loss includes:
- the first generator is adjusted.
- adjusting the first generator according to the distillation loss and the total variation loss includes:
- the first generator is adjusted according to the online loss of the first generator.
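A sketch of the total variation term and the weighted online loss described above; the specific weights are illustrative assumptions:

```python
def total_variation_loss(img):
    """Sum of absolute differences between horizontally and vertically
    neighboring pixels of a 2D image (encourages smooth outputs)."""
    h, w = len(img), len(img[0])
    tv = sum(abs(img[y][x + 1] - img[y][x]) for y in range(h) for x in range(w - 1))
    tv += sum(abs(img[y + 1][x] - img[y][x]) for y in range(h - 1) for x in range(w))
    return tv

def online_loss(distillation_loss, tv_loss, w_distill=1.0, w_tv=0.1):
    # Weighted summation of the distillation loss and the total variation loss.
    return w_distill * distillation_loss + w_tv * tv_loss
```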
- the first generator is a student generator and the second generator is a teacher generator.
- the teacher generator includes a first teacher generator and a second teacher generator, the model capacity of the first teacher generator is larger than that of the student generator, and the model depth of the second teacher generator is greater than that of the student generator.
- the discriminator includes a first discriminator and a second discriminator, with a shared convolutional layer between the first discriminator and the second discriminator; an alternate training process of the first generator and the second generator in the generative adversarial network includes:
- the sample data includes the sample image and the reference image of the sample image
- the network layer includes an input layer, an intermediate layer, and an output layer.
- adjusting the student generator according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator includes:
- the sample image is processed respectively to obtain the output image of the student generator, the output image of the first teacher generator, and the output image of the second teacher generator
- the first output distillation loss is determined; the first output distillation loss is the distillation loss between the output layer of the first teacher generator and the output layer of the student generator;
- the channel distillation loss is determined; the channel distillation loss is the distillation loss between the intermediate layer of the first teacher generator and the intermediate layer of the student generator;
- the second output distillation loss is determined; the second output distillation loss is the distillation loss between the output layer of the second teacher generator and the output layer of the student generator;
- a channel convolution layer is connected between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; the channel convolution layer is used to establish a mapping relationship between the channels of the intermediate layer of the first teacher generator and the channels of the intermediate layer of the student generator. Determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator includes:
- the channel distillation loss is determined according to the difference between the attention weights of the channels mapped to each other.
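A sketch of this channel-attention comparison: per-channel attention weights are derived from pooled feature maps (global average pooling followed by a softmax over channels is an assumed choice), and the loss compares the weights of channels paired by the channel-mapping convolution:

```python
import math

def attention_weights(feature_maps):
    """Per-channel attention: global average pooling followed by a softmax
    over channels (the pooling/softmax choice is an assumption)."""
    pooled = [sum(ch) / len(ch) for ch in feature_maps]
    exps = [math.exp(p) for p in pooled]
    total = sum(exps)
    return [e / total for e in exps]

def channel_distillation_loss(student_maps, teacher_maps, channel_mapping):
    """channel_mapping lists (teacher_channel, student_channel) index pairs,
    as established by the channel convolution layer."""
    ws = attention_weights(student_maps)
    wt = attention_weights(teacher_maps)
    return sum((wt[t] - ws[s]) ** 2 for t, s in channel_mapping)
```

When the mapped channels attend to the same content, the weight differences, and hence the loss, go to zero.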
- adjusting the student generator according to the first output distillation loss, the channel distillation loss and the second output distillation loss includes:
- the channel distillation loss is weighted to obtain the weighted result
- the device provided in this embodiment can be used to implement the technical solution of the above method embodiment, and its implementation principle and technical effect are similar, so this embodiment will not repeat them here.
- the electronic device 700 may be a terminal device or a server.
- the terminal device may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and vehicle-mounted terminals (such as vehicle-mounted navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
- the electronic device shown in FIG. 7 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.
- an electronic device 700 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 701, which can execute various appropriate actions and processes according to a program stored in the read-only memory (ROM) 702 or a program loaded from the storage device 708 into the random access memory (RAM) 703.
- various programs and data necessary for the operation of the electronic device 700 are also stored.
- the processing device 701, ROM 702, and RAM 703 are connected to each other through a bus 704.
- An input/output (I/O) interface 705 is also connected to the bus 704.
- an input device 706 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.
- a storage device 708 including, for example, a magnetic tape, a hard disk, etc.
- the communication means 709 may allow the electronic device 700 to communicate with other devices wirelessly or by wire to exchange data. While FIG. 7 shows electronic device 700 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
- embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, where the computer program includes program codes for executing the methods shown in the flowcharts.
- the computer program may be downloaded and installed from a network via communication means 709, or from storage means 708, or from ROM 702.
- when the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
- the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
- a computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
- the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
- the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is made to execute the methods shown in the above-mentioned embodiments.
- Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
- the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
- each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
- the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
- exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
- a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
- a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
- machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
- a data processing method is provided, which is applicable to a generative adversarial network obtained through model distillation, and the data processing method includes: acquiring an image to be processed; and processing the image through a first generator to obtain a processed image; wherein the generative adversarial network includes the first generator, a second generator, and a discriminator, the model distillation is a process of alternately training the first generator and the second generator, and the model size of the first generator is smaller than the model size of the second generator.
- an alternate training process of the first generator and the second generator in the generative adversarial network includes: determining the loss value of the second generator according to the sample data and the discriminator, the sample data including the sample image and the reference image corresponding to the sample image; adjusting the second generator according to the loss value of the second generator; determining the distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator; and adjusting the first generator according to the distillation loss.
- the determining the loss value of the second generator according to the sample data and the discriminator includes: processing the sample image through the second generator to obtain the output image of the second generator; performing real/fake discrimination on the reference image corresponding to the sample image and the output image of the second generator through the discriminator to determine the adversarial loss of the second generator; determining the reconstruction loss of the second generator according to the difference between the reference image corresponding to the sample image and the output image of the second generator; and determining the loss value of the second generator according to the adversarial loss and the reconstruction loss.
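A minimal sketch of such a generator loss value, assuming a non-saturating log adversarial term from the discriminator's score and an L1 reconstruction term against the reference image; the weight `w_rec` and the exact GAN loss form are assumptions, not taken from the patent:

```python
import math

def adversarial_loss(disc_score_fake):
    """Non-saturating generator loss from the discriminator's probability
    that the generated image is real (score in (0, 1])."""
    return -math.log(disc_score_fake)

def reconstruction_loss(output_img, reference_img):
    """Mean absolute (L1) difference to the reference image."""
    return sum(abs(o - r) for o, r in zip(output_img, reference_img)) / len(output_img)

def generator_loss(disc_score_fake, output_img, reference_img, w_rec=10.0):
    # Loss value = adversarial loss + weighted reconstruction loss.
    return adversarial_loss(disc_score_fake) + w_rec * reconstruction_loss(output_img, reference_img)
```

The loss vanishes only when the discriminator is fully fooled and the output matches the reference exactly.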
- the network layer includes an input layer, an intermediate layer, and an output layer, and determining the distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator includes: processing the sample image through the first generator and the adjusted second generator respectively to obtain the output image of the first generator and the output image of the second generator; and determining the output distillation loss according to the difference between the output image of the first generator and the output image of the second generator, the output distillation loss being the distillation loss between the output layer of the second generator and the output layer of the first generator.
- the determining the output distillation loss according to the difference between the output image of the first generator and the output image of the second generator includes:
- the perceptual loss includes a feature reconstruction loss and/or a style reconstruction loss, and performing feature extraction on the output image of the first generator and the output image of the second generator respectively through a feature extraction network to determine the perceptual loss between the output image of the first generator and the output image of the second generator includes: determining the feature reconstruction loss and/or the style reconstruction loss.
- the adjusting the first generator according to the distillation loss includes: determining the total variation loss of the output image of the first generator; and adjusting the first generator according to the distillation loss and the total variation loss.
- the adjusting the first generator according to the distillation loss and the total variation loss includes: performing a weighted summation of the distillation loss and the total variation loss to obtain the online loss of the first generator; and adjusting the first generator according to the online loss of the first generator.
- the first generator is a student generator
- the second generator is a teacher generator
- the teacher generator includes a first teacher generator and a second teacher generator, the model capacity of the first teacher generator is larger than the model capacity of the student generator, and the model depth of the second teacher generator is greater than the model depth of the student generator.
- the discriminator includes a first discriminator and a second discriminator, with a shared convolutional layer between the first discriminator and the second discriminator, and an alternate training process of the first generator and the second generator in the generative adversarial network includes: determining the loss value of the first teacher generator according to the sample data and the first discriminator, the sample data including the sample image and the reference image of the sample image; adjusting the first teacher generator according to the loss value of the first teacher generator; determining the loss value of the second teacher generator according to the sample data and the second discriminator; adjusting the second teacher generator according to the loss value of the second teacher generator; and adjusting the student generator according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator.
- the network layer includes an input layer, an intermediate layer, and an output layer
- adjusting the student generator includes: respectively processing the sample image through the student generator, the adjusted first teacher generator, and the adjusted second teacher generator to obtain the output image of the student generator, the output image of the first teacher generator, and the output image of the second teacher generator; determining the first output distillation loss according to the output image of the student generator and the output image of the first teacher generator, the first output distillation loss being the distillation loss between the output layer of the first teacher generator and the output layer of the student generator; determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator, the channel distillation loss being the distillation loss between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; determining a second output distillation loss according to the output image of the student generator and the output image of the second teacher generator, the second output distillation loss being the distillation loss between the output layer of the second teacher generator and the output layer of the student generator; and adjusting the student generator according to the first output distillation loss, the channel distillation loss, and the second output distillation loss.
- a channel convolution layer is connected between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; the channel convolution layer is used to establish a mapping relationship between the channels of the intermediate layer of the first teacher generator and the channels of the intermediate layer of the student generator. Determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator includes: determining the attention weight of each channel in the intermediate layer of the student generator according to the feature map output by each channel in the intermediate layer of the student generator; determining the attention weight of each channel in the intermediate layer of the first teacher generator according to the feature map output by each channel in the intermediate layer of the first teacher generator; and determining the channel distillation loss according to the difference between the attention weights of the mutually mapped channels.
- the adjusting the student generator according to the first output distillation loss, the channel distillation loss and the second output distillation loss includes: according to the channel loss weighting factor , weighting the channel distillation loss to obtain a weighted result; adjusting the student generator according to the weighted result, the first output distillation loss, and the second output distillation loss.
- a data processing device applicable to a generative adversarial network obtained through model distillation includes: an acquisition module, configured to acquire an image to be processed; and a processing module, configured to process the image through a first generator to obtain a processed image; wherein the generative adversarial network includes the first generator, a second generator, and a discriminator,
- the model distillation is a process of alternately training the first generator and the second generator, the model size of the first generator being smaller than the model size of the second generator.
- an alternate training process of the first generator and the second generator in the generative adversarial network includes: determining the loss value of the second generator according to the sample data and the discriminator, the sample data including the sample image and the reference image corresponding to the sample image; adjusting the second generator according to the loss value of the second generator; determining the distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator; and adjusting the first generator according to the distillation loss.
- the determining the loss value of the second generator according to the sample data and the discriminator includes: processing the sample image through the second generator to obtain the output image of the second generator; performing real/fake discrimination on the reference image corresponding to the sample image and the output image of the second generator through the discriminator to determine the adversarial loss of the second generator; determining the reconstruction loss of the second generator according to the difference between the reference image corresponding to the sample image and the output image of the second generator; and determining the loss value of the second generator according to the adversarial loss and the reconstruction loss.
- the network layer includes an input layer, an intermediate layer, and an output layer, and determining the distillation loss between the adjusted second generator and the first generator according to the sample image, the adjusted second generator, and the first generator includes: processing the sample image through the first generator and the adjusted second generator respectively to obtain the output image of the first generator and the output image of the second generator; and determining the output distillation loss according to the difference between the output image of the first generator and the output image of the second generator, the output distillation loss being the distillation loss between the output layer of the second generator and the output layer of the first generator.
- the determining the output distillation loss according to the difference between the output image of the first generator and the output image of the second generator includes:
- the perceptual loss includes a feature reconstruction loss and/or a style reconstruction loss, and performing feature extraction on the output image of the first generator and the output image of the second generator respectively through a feature extraction network to determine the perceptual loss between the output image of the first generator and the output image of the second generator includes: determining the feature reconstruction loss and/or the style reconstruction loss.
- the adjusting the first generator according to the distillation loss includes: determining a total variation loss of the output image of the first generator; and adjusting the first generator according to the distillation loss and the total variation loss.
- the adjusting the first generator according to the distillation loss and the total variation loss includes: performing a weighted summation of the distillation loss and the total variation loss to obtain the online loss of the first generator; and adjusting the first generator according to the online loss of the first generator.
- the first generator is a student generator
- the second generator is a teacher generator
- the teacher generator includes a first teacher generator and a second teacher generator, the model capacity of the first teacher generator is larger than the model capacity of the student generator, and the model depth of the second teacher generator is greater than the model depth of the student generator.
- the discriminator includes a first discriminator and a second discriminator, with a shared convolutional layer between the first discriminator and the second discriminator, and an alternate training process of the first generator and the second generator in the generative adversarial network includes: determining the loss value of the first teacher generator according to the sample data and the first discriminator, the sample data including the sample image and the reference image of the sample image; adjusting the first teacher generator according to the loss value of the first teacher generator; determining the loss value of the second teacher generator according to the sample data and the second discriminator; adjusting the second teacher generator according to the loss value of the second teacher generator; and adjusting the student generator according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator.
- the network layer includes an input layer, an intermediate layer, and an output layer
- adjusting the student generator includes: respectively processing the sample image through the student generator, the adjusted first teacher generator, and the adjusted second teacher generator to obtain the output image of the student generator, the output image of the first teacher generator, and the output image of the second teacher generator; determining the first output distillation loss according to the output image of the student generator and the output image of the first teacher generator, the first output distillation loss being the distillation loss between the output layer of the first teacher generator and the output layer of the student generator; determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator, the channel distillation loss being the distillation loss between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; determining a second output distillation loss according to the output image of the student generator and the output image of the second teacher generator, the second output distillation loss being the distillation loss between the output layer of the second teacher generator and the output layer of the student generator; and adjusting the student generator according to the first output distillation loss, the channel distillation loss, and the second output distillation loss.
- a channel convolution layer is connected between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; the channel convolution layer is used to establish a mapping relationship between the channels of the intermediate layer of the first teacher generator and the channels of the intermediate layer of the student generator. Determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator includes: determining the attention weight of each channel in the intermediate layer of the student generator according to the feature map output by each channel in the intermediate layer of the student generator; determining the attention weight of each channel in the intermediate layer of the first teacher generator according to the feature map output by each channel in the intermediate layer of the first teacher generator; and determining the channel distillation loss according to the difference between the attention weights of the mutually mapped channels.
- the adjusting the student generator according to the first output distillation loss, the channel distillation loss and the second output distillation loss includes: according to the channel loss weighting factor , weighting the channel distillation loss to obtain a weighted result; adjusting the student generator according to the weighted result, the first output distillation loss, and the second output distillation loss.
- an electronic device including: at least one processor and a memory;
- the memory stores computer-executable instructions
- the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor executes the data processing method described in the above first aspect and the various possible designs of the first aspect.
- a computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the data processing method described in the above first aspect and the various possible designs of the first aspect is implemented.
- a computer program product includes computer-executable instructions, and when a processor executes the computer-executable instructions, the data processing method described in the above first aspect and the various possible designs of the first aspect is implemented.
- the embodiments of the present disclosure provide a computer program, the computer program includes computer-executable instructions, and when the processor executes the computer-executable instructions, the above-mentioned first aspect and various possible designs of the first aspect can be realized.
Claims (19)
- 一种数据处理方法,适用于通过模型蒸馏得到的生成式对抗网络,所述数据处理方法包括:获取待处理的图像;通过第一生成器对所述图像进行处理,得到处理后的图像;其中,所述生成式对抗网络包括所述第一生成器、第二生成器和判别器,所述模型蒸馏为交替训练所述第一生成器和所述第二生成器的过程,所述第一生成器的模型规模小于所述第二生成器的模型规模。
- 根据权利要求1所述的数据处理方法,所述生成式对抗网络中所述第一生成器与所述第二生成器的一次交替训练过程包括:根据样本数据和所述判别器,确定所述第二生成器的损失值,所述样本数据中包括样本图像和所述样本图像对应的参考图像;根据所述第二生成器的损失值,调整所述第二生成器;根据所述样本图像、调整后的第二生成器和所述第一生成器,确定调整后的第二生成器与所述第一生成器之间的蒸馏损失;根据所述蒸馏损失,调整所述第一生成器。
- 根据权利要求2所述的数据处理方法,所述根据样本数据和所述判别器,确定所述第二生成器的损失值,包括:通过所述第二生成器对所述样本图像进行处理,得到所述第二生成器的输出图像;通过所述判别器对所述样本图像对应的参考图像和所述第二生成器的输出图像进行真假判别,确定所述第二生成器的对抗损失;根据所述样本图像对应的参考图像与所述第二生成器的输出图像的差异,确定所述第二生成器的重建损失;根据所述对抗损失和所述重建损失,确定所述第二生成器的损失值。
- 根据权利要求2所述的数据处理方法,在所述生成式对抗网络中,网络层包括输入层、中间层和输出层,所述根据所述样本图像、调整后的第二生成器和所述第一生成器,确定调整后的第二生成器与所述第一生成器之间的蒸馏损失,包括:通过所述第一生成器、调整后的第二生成器,分别处理所述样本图像,得到所述第一生成器的输出图像和所述第二生成器的输出图像;根据所述第一生成器的输出图像与所述第二生成器的输出图像之间的差异,确定输出蒸馏损失,所述输出蒸馏损失为所述第二生成器的输出层与所述第一生成器的输出层之间的蒸馏损失。
- 根据权利要求4所述的数据处理方法,所述根据所述第一生成器的输出图像与所述第二生成器的输出图像之间的差异,确定输出蒸馏损失,包括:根据所述第二生成器的输出图像的亮度和对比度以及所述第一生成器的输出图像的亮度和对比度,确定所述第二生成器的输出图像与所述第一生成器的输出图像之间的结构相似化损失;通过特征提取网络对所述第一生成器的输出图像和所述第二生成器的输出图像分别进行特征提取,确定所述第一生成器的输出图像与所述第二生成器的输出图像之间的感知损失;根据所述结构相似化损失和所述感知损失,确定所述输出蒸馏损失。
- The data processing method according to claim 5, wherein the perceptual loss comprises a feature reconstruction loss and/or a style reconstruction loss, and performing feature extraction on the output image of the first generator and the output image of the second generator respectively through the feature extraction network to determine the perceptual loss between the output image of the first generator and the output image of the second generator comprises: inputting the output image of the first generator and the output image of the second generator respectively into the feature extraction network to obtain features of the output image of the first generator and features of the output image of the second generator output by a preset network layer of the feature extraction network; and determining the feature reconstruction loss and/or the style reconstruction loss according to a difference between the features of the output image of the first generator and the features of the output image of the second generator.
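Claim 6's two perceptual terms are commonly computed as a direct feature difference (feature reconstruction) and a Gram-matrix difference (style reconstruction). In the sketch below, a fixed random linear map with ReLU stands in for the pretrained feature extraction network; that stand-in, and the use of Gram matrices for the style term, are assumptions, not details recited by the claim.

```python
import numpy as np

# Sketch of claim 6: feature reconstruction loss + style reconstruction loss
# computed on features from a fixed "feature extraction network".

rng = np.random.default_rng(42)
W = rng.standard_normal((16, 64))                 # placeholder feature extractor

def extract(img):                                 # img: flat 64-dim "image"
    return np.maximum(W @ img, 0.0)               # ReLU features, shape (16,)

def perceptual_losses(img_a, img_b):
    fa, fb = extract(img_a), extract(img_b)
    feat_loss = np.mean((fa - fb) ** 2)           # feature reconstruction loss
    ga = np.outer(fa, fa) / fa.size               # Gram matrix of features
    gb = np.outer(fb, fb) / fb.size
    style_loss = np.mean((ga - gb) ** 2)          # style reconstruction loss
    return feat_loss, style_loss

a = rng.random(64)
feat_same, style_same = perceptual_losses(a, a)        # identical inputs
feat_diff, style_diff = perceptual_losses(a, rng.random(64))
```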
- The data processing method according to claim 2, wherein adjusting the first generator according to the distillation loss comprises: determining a total variation loss of the output image of the first generator; and adjusting the first generator according to the distillation loss and the total variation loss.
- The data processing method according to claim 7, wherein adjusting the first generator according to the distillation loss and the total variation loss comprises: performing a weighted summation of the distillation loss and the total variation loss to obtain an online loss of the first generator; and adjusting the first generator according to the online loss of the first generator.
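Claims 7 and 8 can be sketched directly: the total variation loss sums absolute differences between neighboring pixels (penalizing noisy student output), and the online loss is a weighted sum of the distillation and total variation terms. The weight values below are illustrative; the claims do not fix them.

```python
import numpy as np

# Sketch of claims 7-8: total variation loss and the weighted online loss.

def total_variation_loss(img):
    dh = np.abs(np.diff(img, axis=0)).sum()       # vertical neighbor differences
    dw = np.abs(np.diff(img, axis=1)).sum()       # horizontal neighbor differences
    return dh + dw

def online_loss(distill_loss, tv_loss, w_distill=1.0, w_tv=1e-3):
    # Claim 8: weighted summation of the two terms (weights are assumptions).
    return w_distill * distill_loss + w_tv * tv_loss

smooth = np.full((4, 4), 0.5)                     # constant image: zero TV
noisy = np.indices((4, 4)).sum(axis=0) % 2        # checkerboard: maximal TV
tv_smooth = total_variation_loss(smooth)
tv_noisy = total_variation_loss(noisy)
```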
- The data processing method according to any one of claims 1 to 8, wherein the first generator is a student generator and the second generator is a teacher generator.
- The data processing method according to claim 9, wherein the teacher generator comprises a first teacher generator and a second teacher generator, the model capacity of the first teacher generator is greater than the model capacity of the student generator, and the model depth of the second teacher generator is greater than the model depth of the student generator.
- The data processing method according to claim 10, wherein the discriminator comprises a first discriminator and a second discriminator, a shared convolutional layer exists between the first discriminator and the second discriminator, and one alternate training round of the first generator and the second generator in the generative adversarial network comprises: determining a loss value of the first teacher generator according to sample data and the first discriminator, the sample data comprising a sample image and a reference image of the sample image; adjusting the first teacher generator according to the loss value of the first teacher generator; determining a loss value of the second teacher generator according to the sample data and the second discriminator; adjusting the second teacher generator according to the loss value of the second teacher generator; and adjusting the student generator according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator.
- The data processing method according to claim 11, wherein in the generative adversarial network, network layers comprise an input layer, intermediate layers, and an output layer, and adjusting the student generator according to the sample image, the adjusted first teacher generator, and the adjusted second teacher generator comprises: processing the sample image separately through the student generator, the adjusted first teacher generator, and the adjusted second teacher generator to obtain an output image of the student generator, an output image of the first teacher generator, and an output image of the second teacher generator; determining a first output distillation loss according to the output image of the student generator and the output image of the first teacher generator, the first output distillation loss being the distillation loss between the output layer of the first teacher generator and the output layer of the student generator; determining a channel distillation loss according to a feature map output by an intermediate layer of the student generator and a feature map output by an intermediate layer of the first teacher generator, the channel distillation loss being the distillation loss between the intermediate layer of the first teacher generator and the intermediate layer of the student generator; determining a second output distillation loss according to the output image of the student generator and the output image of the second teacher generator, the second output distillation loss being the distillation loss between the output layer of the second teacher generator and the output layer of the student generator; and adjusting the student generator according to the first output distillation loss, the channel distillation loss, and the second output distillation loss.
- The data processing method according to claim 12, wherein a channel convolutional layer is connected between the intermediate layer of the first teacher generator and the intermediate layer of the student generator, the channel convolutional layer being used to establish a mapping relationship between channels in the intermediate layer of the first teacher generator and channels in the intermediate layer of the student generator, and determining the channel distillation loss according to the feature map output by the intermediate layer of the student generator and the feature map output by the intermediate layer of the first teacher generator comprises: determining attention weights of the channels in the intermediate layer of the student generator according to feature maps output by the channels in the intermediate layer of the student generator; determining attention weights of the channels in the intermediate layer of the first teacher generator according to feature maps output by the channels in the intermediate layer of the first teacher generator; and determining the channel distillation loss according to differences between the attention weights of mutually mapped channels in the intermediate layer of the student generator and the intermediate layer of the first teacher generator.
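One plausible realization of claim 13 derives each channel's attention weight by global average pooling its feature map and normalizing across channels, then compares the weights of mutually mapped channels. In the sketch, the channel convolutional layer's mapping is reduced to an index pairing, and the pooling-plus-softmax attention is an assumption; the claim only requires weights derived from the channel feature maps.

```python
import numpy as np

# Sketch of claim 13: channel attention weights and the channel distillation loss.

def channel_attention(feature_maps):              # feature_maps: shape (C, H, W)
    pooled = feature_maps.mean(axis=(1, 2))       # global average pool per channel
    e = np.exp(pooled - pooled.max())             # numerically stable softmax
    return e / e.sum()                            # attention weight per channel

def channel_distillation_loss(student_maps, teacher_maps, mapping):
    ws = channel_attention(student_maps)
    wt = channel_attention(teacher_maps)
    # mapping[i] = teacher channel mapped to student channel i (stand-in for
    # the mapping established by the channel convolutional layer).
    return float(np.mean((ws - wt[mapping]) ** 2))

rng = np.random.default_rng(7)
student = rng.random((4, 8, 8))                   # 4 student channels
teacher = rng.random((16, 8, 8))                  # 16 teacher channels
mapping = np.array([0, 5, 9, 13])                 # illustrative channel pairing
loss = channel_distillation_loss(student, teacher, mapping)
self_loss = channel_distillation_loss(student, student, np.arange(4))
```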
- The data processing method according to claim 12, wherein adjusting the student generator according to the first output distillation loss, the channel distillation loss, and the second output distillation loss comprises: weighting the channel distillation loss according to a channel loss weight factor to obtain a weighted result; and adjusting the student generator according to the weighted result, the first output distillation loss, and the second output distillation loss.
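Claim 14's combination reduces to scaling the channel term before summing it with the two output distillation terms. The sum form and the weight factor value below are illustrative assumptions; the claim only requires the channel loss to be weighted before the combination.

```python
# Sketch of claim 14: weight the channel distillation loss, then combine it
# with the two output distillation losses into the student's objective.

def student_total_loss(out_loss_1, channel_loss, out_loss_2, channel_weight=0.5):
    weighted_channel = channel_weight * channel_loss   # claim 14's weighted result
    return out_loss_1 + weighted_channel + out_loss_2

total = student_total_loss(0.2, 0.4, 0.3)
```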
- A data processing device, applicable to a generative adversarial network obtained through model distillation, the data processing device comprising: an acquisition module configured to acquire an image to be processed; and a processing module configured to process the image through a first generator to obtain a processed image; wherein the generative adversarial network comprises the first generator, a second generator, and a discriminator, the model distillation is a process of alternately training the first generator and the second generator, and the model scale of the first generator is smaller than the model scale of the second generator.
- An electronic device, comprising: at least one processor and a memory; wherein the memory stores computer-executable instructions, and the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the data processing method according to any one of claims 1 to 14.
- A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the data processing method according to any one of claims 1 to 14.
- A computer program product comprising computer-executable instructions which, when executed by a processor, implement the data processing method according to any one of claims 1 to 14.
- A computer program comprising computer-executable instructions which, when executed by a processor, implement the data processing method according to any one of claims 1 to 14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22841056.9A EP4354343A1 (en) | 2021-07-15 | 2022-05-23 | Data processing method and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110802048.4 | 2021-07-15 | ||
CN202110802048.4A CN113449851A (zh) | 2021-07-15 | 2021-07-15 | Data processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023284416A1 (zh) | 2023-01-19 |
Family
ID=77816289
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/094556 WO2023284416A1 (zh) | Data processing method and device | 2021-07-15 | 2022-05-23 |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4354343A1 (zh) |
CN (1) | CN113449851A (zh) |
WO (1) | WO2023284416A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113449851A (zh) * | 2021-07-15 | 2021-09-28 | 北京字跳网络技术有限公司 | Data processing method and device |
CN114092678A (zh) * | 2021-11-29 | 2022-02-25 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus, electronic device, and storage medium |
CN116797782A (zh) * | 2022-03-09 | 2023-09-22 | 北京字跳网络技术有限公司 | Image semantic segmentation method and apparatus, electronic device, and storage medium |
CN116797466A (zh) * | 2022-03-14 | 2023-09-22 | 腾讯科技(深圳)有限公司 | Image processing method, apparatus, device, and readable storage medium |
CN115936980B (zh) * | 2022-07-22 | 2023-10-20 | 北京字跳网络技术有限公司 | Image processing method and apparatus, electronic device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106548190A (zh) * | 2015-09-18 | 2017-03-29 | 三星电子株式会社 | Model training method and apparatus, and data recognition method |
CN111967573A (zh) * | 2020-07-15 | 2020-11-20 | 中国科学院深圳先进技术研究院 | Data processing method, apparatus, device, and computer-readable storage medium |
CN112052948A (zh) * | 2020-08-19 | 2020-12-08 | 腾讯科技(深圳)有限公司 | Network model compression method and apparatus, storage medium, and electronic device |
CN113095475A (zh) * | 2021-03-02 | 2021-07-09 | 华为技术有限公司 | Neural network training method, image processing method, and related devices |
CN113449851A (zh) * | 2021-07-15 | 2021-09-28 | 北京字跳网络技术有限公司 | Data processing method and device |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110458765B (zh) * | 2019-01-25 | 2022-12-02 | 西安电子科技大学 | Image quality enhancement method based on perception-preserving convolutional network |
CN112465111B (zh) * | 2020-11-17 | 2024-06-21 | 大连理工大学 | Three-dimensional voxel image segmentation method based on knowledge distillation and adversarial training |
CN113065635A (zh) * | 2021-02-27 | 2021-07-02 | 华为技术有限公司 | Model training method, image enhancement method, and device |
2021
- 2021-07-15 CN CN202110802048.4A patent/CN113449851A/zh active Pending
2022
- 2022-05-23 EP EP22841056.9A patent/EP4354343A1/en active Pending
- 2022-05-23 WO PCT/CN2022/094556 patent/WO2023284416A1/zh active Application Filing
Non-Patent Citations (2)
Title |
---|
Hanting Chen, Yunhe Wang, Han Shu, Changyuan Wen, Chunjing Xu, Boxin Shi, Chao Xu, Chang Xu: "Distilling Portable Generative Adversarial Networks for Image Translation", arXiv.org, Cornell University Library, 7 March 2020 (2020-03-07), XP081617036 * |
Ren Yuxi, Wu Jie, Xiao Xuefeng, Yang Jianchao: "Online Multi-Granularity Distillation for GAN Compression", 2021 IEEE/CVF International Conference on Computer Vision (ICCV), IEEE, 10 October 2021 (2021-10-10), pages 6773-6783, XP034093124, DOI: 10.1109/ICCV48922.2021.00672 * |
Also Published As
Publication number | Publication date |
---|---|
CN113449851A (zh) | 2021-09-28 |
EP4354343A1 (en) | 2024-04-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023284416A1 (zh) | Data processing method and device | |
WO2020155907A1 (zh) | Method and apparatus for generating a comic-style transfer model | |
CN109800732B (zh) | Method and apparatus for generating a cartoon avatar generation model | |
CN111476306A (zh) | Artificial-intelligence-based object detection method, apparatus, device, and storage medium | |
WO2022253061A1 (zh) | Speech processing method and related device | |
CN111275721A (zh) | Image segmentation method and apparatus, electronic device, and storage medium | |
CN112990053B (zh) | Image processing method, apparatus, device, and storage medium | |
CN111738010B (zh) | Method and apparatus for generating a semantic matching model | |
CN112115900B (zh) | Image processing method, apparatus, device, and storage medium | |
CN113555032B (zh) | Multi-speaker scene recognition and network training method and apparatus | |
CN111312223B (zh) | Training method and apparatus for a speech segmentation model, and electronic device | |
CN114420135A (zh) | Attention-mechanism-based voiceprint recognition method and apparatus | |
CN117094362B (zh) | Task processing method and related apparatus | |
CN112037305B (zh) | Method, device, and storage medium for reconstructing tree-like structures in images | |
CN114429658A (zh) | Method for acquiring facial keypoint information, and method and apparatus for generating facial animation | |
CN111128131B (zh) | Speech recognition method and apparatus, electronic device, and computer-readable storage medium | |
WO2023185516A1 (zh) | Training method and recognition method for an image recognition model, apparatus, medium, and device | |
CN111312224B (zh) | Training method and apparatus for a speech segmentation model, and electronic device | |
CN111311609B (zh) | Image segmentation method and apparatus, electronic device, and storage medium | |
JP7504192B2 (ja) | Method and apparatus for retrieving images | |
CN115116117A (zh) | Method for acquiring learning engagement data based on a multimodal fusion network | |
CN115424060A (zh) | Model training method, image classification method, and apparatus | |
CN111291640B (zh) | Method and apparatus for gait recognition | |
CN113609957A (zh) | Human behavior recognition method and terminal | |
CN113762037A (zh) | Image recognition method, apparatus, device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22841056 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 18579178 Country of ref document: US Ref document number: 2022841056 Country of ref document: EP |
ENP | Entry into the national phase |
Ref document number: 2022841056 Country of ref document: EP Effective date: 20240112 |
NENP | Non-entry into the national phase |
Ref country code: DE |