WO2021114832A1 - Sample image data enhancement method and apparatus, electronic device, and storage medium - Google Patents

Sample image data enhancement method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021114832A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
generator
network model
generation network
training
Prior art date
Application number
PCT/CN2020/118440
Other languages
English (en)
French (fr)
Inventor
赵霄鸿
刘莉红
刘玉宇
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021114832A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Definitions

  • This application relates to the field of computer image processing technology, and in particular to a method, device, electronic device, and computer-readable storage medium for enhancing sample image data.
  • When the sample image data of a certain type is scarce, the sample image data of that type can be enhanced first. Further, inputting the enhanced sample image data of that type into an image detection and classification model for training allows the model to achieve a higher accuracy rate when detecting and classifying a related image.
  • In general, sample image data enhancement methods can be divided into supervised and unsupervised data enhancement methods.
  • Supervised data enhancement can be divided into single-sample image data enhancement and multi-sample image data enhancement.
  • Unsupervised data enhancement can be divided into generating new data and learning enhancement strategies.
  • Supervised data enhancement applies preset data transformation rules to expand data on the basis of existing data.
  • Single-sample image data enhancement includes geometric operations such as flipping and rotation, and color transformations such as adding noise and blurring.
  • The advantage of this type of method is obvious, namely ease of operation, but it carries a risk of overfitting.
  • Multi-sample image data enhancement differs from single-sample enhancement in that it uses multiple sample images to produce new sample images; examples include SMOTE, SamplePairing, and mixup. The inventors found that all three methods attempt to make discrete sample points continuous in order to fit the true distribution, but the added sample images still lie, in feature space, within the region enclosed by the known small-sample image points.
  • Moreover, this type of method has some potential problems. SMOTE, for example, synthesizes the same number of sample images for each minority-class sample image; on the one hand this increases the possibility of overlap between classes, and on the other hand it generates some samples that provide no useful information.
  • Unsupervised data enhancement methods mainly fall into two types: using a model to learn a data enhancement method suited to the current task, such as AutoAugment; and using a model to learn the data distribution and randomly generate images consistent with the distribution of the training data set, such as the generative adversarial network (GAN).
  • The basic idea of AutoAugment is to find the best image transformation strategy from the data itself and to learn different enhancement methods for different tasks: randomly select 5 of 16 commonly used, pre-prepared data enhancement operations, then pick out, through training and validation, the combination of enhancement operations that actually achieves data enhancement.
  • This method can learn the best data enhancement method for each task, and is more flexible and more targeted than the preset data transformation rules used in supervised data enhancement.
  • At the same time, the inventors recognized that the disadvantage of this method is equally obvious: it consumes excessive computing resources and is difficult to realize when computing resources are limited.
  • This application provides a sample image data enhancement method, device, electronic device, and computer-readable storage medium, whose main purpose is to enhance sample image data based on an adversarial generative network to generate extended sample images.
  • To achieve the above purpose, the present application provides a sample image data enhancement method, which includes the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • To achieve the above purpose, the present application also provides a sample image data enhancement device, which includes:
  • a sample image acquisition module, configured to acquire a sample image;
  • a network model acquisition module, configured to acquire a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image;
  • an annotated image acquisition module, configured to acquire an annotated image, generated from the sample image, that marks the region of interest;
  • a mask image acquisition module, configured to acquire a mask image generated by masking the regions of the annotated image other than the region of interest; and
  • an extended image generation module, configured to input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • In addition, the present application also provides an electronic device including a memory and a processor, where computer-readable instructions are stored in the memory; when the computer-readable instructions are executed by the processor, the processor is caused to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • In addition, the present application also provides a computer-readable storage medium in which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the processor is caused to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • In the above scheme, the target adversarial generative network model is generated by training the initial adversarial generative network model with image blocks of the region of interest in the sample image, and inputting the annotated image and the mask image into the target adversarial generative network model can generate extended sample images. The approach does not rely on a pre-trained model, requires few training resources, and adds no computational complexity while not reducing the capacity of the network.
  • Moreover, network model training can take a single sample image as input, without requiring a large number of sample images. Further, after the enhanced sample image data of the given type is used to train an image detection and classification model, the accuracy of the model in detecting and classifying a related image can be improved.
  • FIG. 1 is a diagram of an implementation environment of the sample image data enhancement method provided by an embodiment of the application;
  • FIG. 2 is a flowchart of the sample image data enhancement method provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of the training principle of the initial adversarial generative network model in the sample image data enhancement method provided by an embodiment of the application;
  • FIG. 4 is a schematic structural diagram of the generator G_n (when n < N) of the initial adversarial generative network model in the sample image data enhancement method provided by an embodiment of the application;
  • FIG. 5 is a schematic diagram of the input and output principles of the sample image, annotated image, and extended sample image in the sample image data enhancement method provided by an embodiment of the application;
  • FIG. 6 is a program module diagram of a preferred embodiment of the sample image data enhancement device provided by an embodiment of the application.
  • Specifically, the embodiments of the present application provide a sample image data enhancement method, device, electronic device, and storage medium.
  • The present application can be applied to smart transportation scenarios, thereby promoting the construction of smart cities.
  • The sample image data enhancement method is used to perform data enhancement on a sample image to generate an extended sample image.
  • The extended sample image can be used, among other purposes, to train an image detection and classification model and improve the accuracy of that model.
  • FIG. 1 is an application environment diagram of a preferred embodiment of a method for enhancing sample image data of this application.
  • The sample image data enhancement method can be applied to an electronic device 1, which includes, but is not limited to, terminal equipment with computing capability such as servers, server clusters, mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, and wearable devices.
  • the electronic device 1 may include a processor 12, a memory 11, a network interface 13, and a communication bus 14.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium may be non-volatile or volatile.
  • The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory 11.
  • the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
  • In other embodiments, the readable storage medium may also be an external memory 11 of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the electronic device 1.
  • In this embodiment, the readable storage medium of the memory 11 is generally used to store the program of the sample image data enhancement device 10 installed in the electronic device 1 (such as a sample image data enhancement program).
  • the memory 11 can also be used to temporarily store data that has been output or will be output.
  • The processor 12 may in some embodiments be a central processing unit (CPU), microprocessor, or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the program of the sample image data enhancement device 10.
  • The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the communication bus 14 is used to realize the connection and communication between these components.
  • FIG. 1 only shows the electronic device 1 with components 11-14, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
  • the electronic device 1 may also include a user interface.
  • The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with voice recognition capability, and a voice output device such as a speaker or earphones.
  • the user interface may also include a standard wired interface and a wireless interface.
  • the electronic device 1 may also include a display, and the display may also be called a display screen or a display unit.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, and the like.
  • the display is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
  • the electronic device 1 further includes a touch sensor.
  • the area provided by the touch sensor for the user to perform touch operations is called a touch area.
  • the touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like.
  • the touch sensor includes not only a contact type touch sensor, but also a proximity type touch sensor and the like.
  • the touch sensor may be a single sensor, or may be, for example, a plurality of sensors arranged in an array.
  • the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor.
  • the display and the touch sensor are stacked to form a touch display screen. The device detects the touch operation triggered by the user based on the touch screen.
  • The electronic device 1 may also include a radio frequency (RF) circuit, sensors, an audio circuit, and so on, which will not be described in detail here.
  • In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, may include an operating system and the program of the sample image data enhancement device 10; when executing the program of the sample image data enhancement device 10 stored in the memory 11, the processor 12 implements the following steps S21, S22, S23, S24, and S25.
  • Step S21: Obtain a sample image.
  • Specifically, the sample image may be a car damage image, and the car damage image may include a scratched area or a cracked area of the vehicle body.
  • In addition, the number of sample images may be one.
  • Step S22: Obtain a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image.
  • Specifically, the region of interest is a car body scratch region or a car body crack region in the car damage image. There may be one, two, or more regions of interest in the sample image.
  • The image blocks of the region of interest in the sample image can be obtained by cropping them from the sample image.
  • The target adversarial generative network model may be installed in the electronic device 1.
  • In some embodiments, the process of training the initial adversarial generative network model to generate the target adversarial generative network model may be performed in the electronic device 1; that is, the electronic device 1 trains the initial adversarial generative network model with image blocks of the region of interest in the sample image to generate the target adversarial generative network model.
  • In some other embodiments, this process may be performed in another electronic device; that is, the other electronic device trains the initial adversarial generative network model with image blocks of the region of interest in the sample image to generate the target adversarial generative network model, and the trained target adversarial generative network model is then installed in the electronic device 1.
  • The following describes the process of training the initial adversarial generative network model to generate the target adversarial generative network model.
  • As shown in FIG. 3, the initial adversarial generative network model may be a fully convolutional pyramid adversarial generative network model.
  • Specifically, the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N.
  • The output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks x_0, x_1, ..., x_N and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2.
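As a concrete illustration, the sketch below builds such a set of image blocks from a single cropped region of interest. It is a minimal sketch under stated assumptions, not the patent's implementation: the helper name, the scale factor, and the bounding-box ROI format are all assumptions, and, following the coarse-to-fine training order described below, index N is treated here as the coarsest scale.

```python
# Hypothetical sketch: crop the ROI image block and build the multi-scale
# blocks x_0, x_1, ..., x_N. build_image_pyramid, roi_box, and scale_factor
# are illustrative names, not identifiers from the patent.
import cv2

def build_image_pyramid(sample_image, roi_box, num_scales=8, scale_factor=0.75):
    """Return [x_0, ..., x_N], from the full-resolution ROI crop (index 0)
    down to the coarsest scale (index N)."""
    x, y, w, h = roi_box                          # ROI given as a bounding box
    patch = sample_image[y:y + h, x:x + w]        # image block of the ROI
    pyramid = []
    for n in range(num_scales + 1):               # N + 1 scales in total
        scale = scale_factor ** n                 # index N is the coarsest scale
        size = (max(int(w * scale), 1), max(int(h * scale), 1))
        pyramid.append(cv2.resize(patch, size, interpolation=cv2.INTER_AREA))
    return pyramid
```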
  • The steps of training the initial adversarial generative network model to generate the target adversarial generative network model may include:
  • when n = N, inputting the noise image z_n into the generator G_n to obtain the output image x̃_n, i.e. x̃_N = G_N(z_N); inputting the output image x̃_n and the image block x_n into the discriminator D_n; and performing alternating iterative training on the generator G_n and the discriminator D_n;
  • when n < N (n being a natural number), inputting the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} into the generator G_n to obtain the output image x̃_n, i.e. x̃_n = G_n(z_n, (x̃_{n+1})↑r); inputting the output image x̃_n and the image block x_n into the discriminator D_n; and performing alternating iterative training on the generator G_n and the discriminator D_n; and
  • saving the trained generators, or the trained generators and discriminators, as the target adversarial generative network model.
  • The output image x̃_n of the generator G_n may also be called a fake image, and the symbol ↑r denotes upsampling by a factor of r; that is, the sampled image (x̃_{n+1})↑r denotes the image obtained by upsampling the output image x̃_{n+1} of the generator G_{n+1} by a factor of r.
  • Further, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N can be trained and fixed in sequence, from G_N to G_0 and from D_N to D_0. Specifically, the target adversarial generative network model can be obtained by training in a coarse-to-fine manner: first G_N and D_N are trained; once their training is complete, G_N and D_N are fixed, and then the training of G_{N-1} and D_{N-1} is performed, and so on, until G_0 and D_0 have been trained and fixed, thereby obtaining the target adversarial generative network model.
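The progressive schedule above can be pictured with the following minimal training-loop sketch. It is an illustration under stated assumptions, not the patent's code: train_step_discriminator and train_step_generator are hypothetical helpers standing in for the alternating WGAN-GP updates described below, the tensors are assumed to be shaped (1, C, H, W), and the indexing (n = N coarsest) follows the coarse-to-fine order just described.

```python
# Hypothetical sketch of the coarse-to-fine schedule: train (G_N, D_N) first,
# freeze them, then move to the next finer scale until (G_0, D_0) are done.
import torch
import torch.nn.functional as F

def train_pyramid(generators, discriminators, pyramid, noise_images, steps=2000):
    N = len(generators) - 1
    x_prev = None                                  # x̃_{n+1} from the coarser scale
    for n in range(N, -1, -1):                     # G_N, D_N first, ..., then G_0, D_0
        G_n, D_n = generators[n], discriminators[n]
        x_n, z_n = pyramid[n], noise_images[n]
        up = None if x_prev is None else F.interpolate(x_prev, size=x_n.shape[-2:])
        for _ in range(steps):                     # alternating iterative training
            train_step_discriminator(D_n, G_n, x_n, z_n, up)   # hypothetical helper
            train_step_generator(G_n, D_n, x_n, z_n, up)       # hypothetical helper
        for p in list(G_n.parameters()) + list(D_n.parameters()):
            p.requires_grad_(False)                # fix the finished scale
        with torch.no_grad():
            x_prev = G_n(z_n, up)                  # x̃_n, passed to the next scale
    return generators
```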
  • Furthermore, when n = N, the generator G_n may include a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n.
  • As shown in FIG. 4, when n < N, the generator G_n may include a first adder 41, a convolutional neural network 42, and a second adder 43.
  • The first adder 41 is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network 42, and the second adder 43 is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n. That is, the output image x̃_n can be expressed by the following formula: x̃_n = (x̃_{n+1})↑r + ψ_n(z_n + (x̃_{n+1})↑r)
  • where ψ_n denotes the convolutional neural network of the generator G_n, which may be a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
  • It can be understood that, in some other embodiments, when n = N the generator G_n may also adopt the architecture of the first adder 41, the convolutional neural network 42, and the second adder 43, except that the first adder 41 can directly provide the noise image z_n to the convolutional neural network 42, and the output image of the convolutional neural network 42 can be output directly via the second adder 43 as the output image x̃_N of the generator G_N.
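A minimal PyTorch sketch of this generator structure is given below, assuming 3-channel images; the channel width, the LeakyReLU slope, and the plain final convolution are assumptions, not values from the patent. It implements the residual form x̃_n = (x̃_{n+1})↑r + ψ_n(z_n + (x̃_{n+1})↑r), with the n = N variant passing the noise straight through.

```python
# Hypothetical sketch of the generator G_n in FIG. 4: first adder, 5-layer
# 3×3 Conv-BN-LeakyReLU network ψ_n, second adder.
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.2, inplace=True),
    )

class GeneratorGn(nn.Module):
    def __init__(self, channels=3, width=32):
        super().__init__()
        self.psi = nn.Sequential(                  # ψ_n: 5-layer fully conv net
            conv_block(channels, width),
            conv_block(width, width),
            conv_block(width, width),
            conv_block(width, width),
            nn.Conv2d(width, channels, kernel_size=3, padding=1),
        )

    def forward(self, z_n, upsampled_prev=None):
        if upsampled_prev is None:                 # n = N: noise goes straight in
            return self.psi(z_n)
        mixed = z_n + upsampled_prev               # first adder
        return upsampled_prev + self.psi(mixed)    # second adder (residual output)
```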
  • Still further, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss,
  • and the discriminator D_n is a Markovian discriminator.
  • In the process of training the initial adversarial generative network model to generate the target adversarial generative network model, the training loss of the generator G_n and the corresponding discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec.
  • The training loss of the generator G_n and the discriminator D_n is formulated as follows: min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n), where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
  • The reconstruction loss l_rec may satisfy the following conditions:
  • when n = N, the noise image z_N is a random noise image z*, G_N(z*) denotes the output image x̃_N of the generator G_N, and the reconstruction loss of the generator G_N and the discriminator D_N is l_rec = ‖G_N(z*) − x_N‖²;
  • when n < N, the noise image z_n is 0, G_n(0, (x̃_{n+1})↑r) denotes the output image x̃_n of the generator G_n, and the reconstruction loss of the generator G_n and the discriminator D_n is l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖².
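Sketched below is one common way such losses are computed for a WGAN-GP critic, together with the squared-error reconstruction terms above. It is a hedged illustration: the gradient-penalty and reconstruction weights (lambda_gp, lambda_rec) are assumed names and values, not taken from the patent.

```python
# Hypothetical sketch of the per-scale training losses: WGAN-GP adversarial
# loss with gradient penalty, plus the reconstruction loss l_rec.
import torch

def gradient_penalty(D_n, real, fake, lambda_gp=0.1):
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    grads, = torch.autograd.grad(D_n(interp).sum(), interp, create_graph=True)
    return lambda_gp * ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

def discriminator_loss(D_n, x_n, x_fake):
    # the critic maximizes D(real) − D(fake); written here as a loss to minimize
    return D_n(x_fake).mean() - D_n(x_n).mean() + gradient_penalty(D_n, x_n, x_fake)

def generator_loss(D_n, G_n, x_n, x_fake, z_rec, up_rec, lambda_rec=10.0):
    l_adv = -D_n(x_fake).mean()                  # adversarial term for G_n
    x_rec = G_n(z_rec, up_rec)                   # z* when n = N, 0 when n < N
    l_rec = ((x_rec - x_n) ** 2).mean()          # squared reconstruction error
    return l_adv + lambda_rec * l_rec
```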
  • Step S23: Obtain an annotated image, generated from the sample image, that marks the region of interest.
  • Specifically, in some embodiments, manual annotation may be used: for example, the region of interest is framed on the sample image by operating the electronic device, thereby producing the annotated image.
  • In some other embodiments, the electronic device may also directly receive an annotated image, with the region of interest already annotated, sent by an external device.
  • Step S24: Obtain a mask image generated by masking the regions of the annotated image other than the region of interest.
  • Specifically, in some embodiments, the electronic device may generate the mask image by masking the regions of the annotated image other than the region of interest.
  • In some other embodiments, the electronic device may also directly receive such a mask image, generated by masking the regions of the annotated image other than the region of interest, sent by an external device.
  • Specifically, the masking process may be an operation that sets the gray values of the regions outside the region of interest to 0 and sets the gray values of the region of interest to 1 (or 255).
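For illustration, a minimal sketch of this masking operation is shown below, assuming the region of interest is given as a bounding box from the frame-selection annotation; the function and parameter names are illustrative only.

```python
# Hypothetical sketch: set pixels outside the ROI to 0 and pixels inside
# the ROI to 255 (or 1), producing the mask image of step S24.
import numpy as np

def make_mask(annotated_shape, roi_box, fg_value=255):
    """annotated_shape: (H, W) of the annotated image;
    roi_box: (x, y, w, h) of the framed region of interest."""
    mask = np.zeros(annotated_shape[:2], dtype=np.uint8)  # outside ROI -> 0
    x, y, w, h = roi_box
    mask[y:y + h, x:x + w] = fg_value                     # inside ROI -> 255 (or 1)
    return mask
```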
  • The input and output relationships among the sample image, the annotated image, and the extended sample image involved in steps S23 and S24 may be as shown in FIG. 5.
  • Step S25: Input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • Specifically, in some embodiments, when the trained generators G_0, G_1, ..., G_N are saved as the target adversarial generative network model in step S22, inputting the annotated image and the mask image into the target adversarial generative network model in step S25 yields the extended sample image.
  • In some other embodiments, when the trained generators and discriminators are both saved as the target adversarial generative network model in step S22, the annotated image and the mask image are input into the generators G_0, G_1, ..., G_N of the target adversarial generative network model in step S25 to obtain an output image; the output image can be further judged by a discriminator, and the output image whose judgment result is real is used as the extended sample image.
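That variant can be sketched as follows. The patent does not spell out how the annotated image and the mask condition the generator pyramid, so, purely as an illustrative assumption, this sketch seeds the coarsest scale with the masked annotated image, resamples scale by scale with fresh noise, and retries until the discriminator judges an output real; the sampling rate r and the acceptance threshold are likewise assumptions.

```python
# Hypothetical sketch of step S25 when generators and discriminators are
# both saved: generate candidates and keep one the discriminator accepts.
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate_extended_sample(generators, discriminator, annotated, mask,
                             max_tries=20, threshold=0.0, r=4 / 3):
    N = len(generators) - 1
    seed = annotated * (mask > 0).float()              # keep only the ROI content
    x = None
    for _ in range(max_tries):
        x = F.interpolate(seed, scale_factor=(1 / r) ** N)   # assumed coarsest size
        for n in range(N, -1, -1):                     # coarse to fine
            up = F.interpolate(x, scale_factor=r)      # (x̃_{n+1})↑r
            x = generators[n](torch.randn_like(up), up)  # fresh noise per scale
        if discriminator(x).mean() > threshold:        # output judged "real"
            return x                                   # the extended sample image
    return x                                           # fall back to last candidate
```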
  • In the sample image data enhancement method proposed by this application, the target adversarial generative network model is generated by training the initial adversarial generative network model with image blocks of the region of interest in the sample image, and inputting the annotated image and the mask image into the target adversarial generative network model can generate extended sample images. The method does not rely on a pre-trained model and requires few training resources; without reducing the capacity of the network, it adds no computational complexity or parameter-tuning effort, yet it can enhance small-sample image data in an implicit way to obtain extended sample images.
  • Moreover, network model training can take a single sample image as input, without requiring a large number of sample images. Further, after the enhanced sample image data of the given type is used to train an image detection and classification model, the accuracy of the model in detecting and classifying a related image can be improved.
  • Further, by adopting an adversarial generative network model and adversarial learning, it is possible to generate extended sample images realistic enough to pass for genuine, which can further improve the accuracy of an image detection and classification model trained with the extended sample images.
  • The adversarial generative network model can also generate varied data while still obeying the original data distribution, and it consumes far less computing resources than methods such as AutoAugment.
  • Further, the initial adversarial generative network model includes the multiple generators G_0, G_1, ..., G_N and the multiple discriminators D_0, D_1, ..., D_N, so the target adversarial generative network model can generate extended sample images at multiple sizes while maintaining global structure and texture features, effectively improving the accuracy of an image detection and classification model trained with the extended sample images.
  • In addition, the target adversarial generative network model can generate multiple extended sample images upon receiving the annotated image and the mask image; it can be seen that, once model training is complete, extending the sample images is relatively simple.
  • Further, when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder.
  • The first adder superimposes the noise image z_n and the sampled image (x̃_{n+1})↑r and provides the result to the convolutional neural network, and the second adder superimposes the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n. That is, residual learning defines the learning scheme at each level of the pyramid, so that the generator G_n learns the details missing from the image on top of each level's input and can generate more realistic extended sample images.
  • Further, the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model,
  • in which the convolutional neural network of the generator adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks; it can produce multiple extended sample images of any size and any aspect ratio, which also helps improve the accuracy of an image detection and classification model trained with the extended sample images.
  • Further, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; it converges faster, can generate higher-quality samples, and offers a stable training procedure that requires almost no parameter tuning to complete model training successfully.
  • Further, the discriminator is a Markovian discriminator, which helps the extended sample image retain high resolution and fine detail, so that the quality of the extended sample image is higher.
  • Further, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0.
  • This progressive training also helps reduce the consumption of computing resources.
  • Further, the adversarial loss l_adv and the reconstruction loss l_rec also help obtain a better target adversarial generative network model, so that high-quality extended sample images can be obtained.
  • Further, applying the sample image data enhancement method and device to car damage image data allows data enhancement of small-sample images such as car body scratch regions or car body crack regions, alleviating the imbalance of car damage image samples and improving the performance of the car damage image detection and classification model.
  • The sample image data enhancement method and device belong to unsupervised learning; compared with the supervised learning common in deep learning, they do not rely on a pre-trained model, need no massive car damage dataset, and require no large amount of computing resources, greatly reducing data collection costs and training resources.
  • In addition, the fully convolutional pyramid adversarial generative network model can generate, from coarse to fine, extended car damage sample images that obey the original car damage sample image distribution yet differ from the original images, which further helps improve the performance of the car damage image detection and classification model.
  • In other embodiments, the program of the sample image data enhancement device 10 may also be divided into one or more modules, which are stored in the memory 11 and executed by the processor 12 to complete this application.
  • A module referred to in this application is a series of computer program instruction segments capable of completing a specific function.
  • Referring to FIG. 6, it is a program module diagram of a preferred embodiment of the sample image data enhancement device 10 in FIG. 1.
  • The sample image data enhancement device 10 can be divided into: a sample image acquisition module 101, a network model acquisition module 102, an annotated image acquisition module 103, a mask image acquisition module 104, and an extended image generation module 105.
  • The functions or operation steps implemented by the modules 101-105 are all similar to steps S21 through S25 above and are not detailed here; exemplarily:
  • the sample image acquisition module 101 is used to acquire a sample image;
  • the network model acquisition module 102 is configured to acquire a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image;
  • the annotated image acquisition module 103 is configured to acquire an annotated image, generated from the sample image, that marks the region of interest;
  • the mask image acquisition module 104 is configured to acquire a mask image generated by masking the regions of the annotated image other than the region of interest; and
  • the extended image generation module 105 is configured to input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • In addition, an embodiment of the present application also proposes a computer-readable storage medium, which includes a sample image data enhancement device; when executed by a processor, the sample image data enhancement device implements the following operations: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  • Preferably, the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N; the output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks and the noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2. In the steps of training the initial adversarial generative network model to generate the target adversarial generative network model:
  • when n = N, the noise image z_n is input into the generator G_n to obtain the output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n;
  • when n < N (n being a natural number), the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} are input into the generator G_n to obtain the output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n; and
  • the trained generators, or the trained generators and discriminators, are saved as the target adversarial generative network model.
  • Preferably, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0;
  • when n = N, the generator G_n includes a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n; and
  • when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder;
  • the first adder is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network;
  • the second adder is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n.
  • Preferably, the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model; the convolutional neural network of the generator G_n adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
  • Preferably, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss;
  • the discriminator D_n is a Markovian discriminator; and
  • the training loss of the generator G_n and the discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec, formulated as min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n), where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
  • Preferably, the reconstruction loss l_rec satisfies the following conditions:
  • when n = N, the noise image z_N is a random noise image z*, and the reconstruction loss of the generator G_N and the discriminator D_N is l_rec = ‖G_N(z*) − x_N‖²; and
  • when n < N, the noise image z_n is 0, and the reconstruction loss of the generator G_n and the discriminator D_n is l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖².
  • Preferably, the sample image includes a car damage image, and the region of interest includes a car body scratch region or a car body crack region in the car damage image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A sample image data enhancement method and apparatus, an electronic device, and a computer-readable storage medium. The sample image data enhancement method comprises: acquiring a sample image (S21); acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image (S22); acquiring an annotated image, generated from the sample image, that marks the region of interest (S23); acquiring a mask image generated by masking the regions of the annotated image other than the region of interest (S24); and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image (S25). The sample image data enhancement method requires few training resources and generates extended sample images of high quality; it can be applied in smart transportation scenarios, thereby advancing the construction of smart cities.

Description

Sample image data enhancement method and apparatus, electronic device, and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on May 28, 2020 with application number 202010468756.4 and entitled "Sample image data enhancement method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer image processing technology, and in particular to a sample image data enhancement method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With continuous socioeconomic development and the gradual improvement of living standards, computer technology has been widely adopted in production and daily life; computer image processing technology in particular has become one of the important technology types in current computer applications.
In computer image processing technology, how to enhance sample image data is very important when solving the problem of imbalanced sample image distributions in real industrial projects. For example, in computer image detection and classification, if the number of sample images of a certain type (such as car damage images) is small, the image detection and classification model receives little training on that type of sample image, and erroneous detection and classification results may then occur when a related image is detected and classified.
Therefore, to mitigate the imbalanced distribution of sample images, when the sample image data of a certain type is scarce, the sample image data of that type can be enhanced first. Further, inputting the enhanced sample image data of that type into an image detection and classification model for training allows the model to achieve a higher accuracy rate when detecting and classifying a related image.
In general, sample image data enhancement methods can be divided into supervised and unsupervised data enhancement methods. Supervised data enhancement can be divided into single-sample image data enhancement and multi-sample image data enhancement; unsupervised data enhancement can be divided into generating new data and learning enhancement strategies.
Supervised data enhancement applies preset data transformation rules to expand data on the basis of existing data. Single-sample image data enhancement includes geometric operations such as flipping and rotation, and color transformations such as adding noise and blurring. The advantage of this type of method is obvious, namely ease of operation, but it carries a risk of overfitting. Multi-sample image data enhancement differs from single-sample enhancement in that it uses multiple sample images to produce new sample images, such as SMOTE, SamplePairing, and mixup; the inventors found that all three methods attempt to make discrete sample points continuous in order to fit the true distribution, but the added sample images still lie, in feature space, within the region enclosed by the known small-sample image points. Moreover, this type of method has some potential problems. SMOTE, for example, synthesizes the same number of sample images for each minority-class sample image; on the one hand this increases the possibility of overlap between classes, and on the other hand it generates some samples that provide no useful information.
Unsupervised data enhancement methods mainly fall into two types: using a model to learn a data enhancement method suited to the current task, such as AutoAugment, and using a model to learn the data distribution and randomly generate images consistent with the distribution of the training data set, such as the generative adversarial network (GAN). The basic idea of AutoAugment is to find the best image transformation strategy from the data itself and to learn different enhancement methods for different tasks: randomly select 5 of 16 commonly used, pre-prepared data enhancement operations, and pick out, through training and validation, the combination of enhancement operations that achieves data enhancement. This method can learn the best data enhancement method for each task and is more flexible and more targeted than the preset data transformation rules used in supervised data enhancement. At the same time, the inventors recognized that the disadvantage of this method is equally obvious: it consumes excessive computing resources and is difficult to realize when computing resources are limited.
Summary
This application provides a sample image data enhancement method and apparatus, an electronic device, and a computer-readable storage medium, whose main purpose is to enhance sample image data based on an adversarial generative network to generate extended sample images.
To achieve the above purpose, this application provides a sample image data enhancement method, which includes the following steps:
acquiring a sample image;
acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image;
acquiring an annotated image, generated from the sample image, that marks the region of interest;
acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and
inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
To achieve the above purpose, this application also provides a sample image data enhancement apparatus, which includes:
a sample image acquisition module, configured to acquire a sample image;
a network model acquisition module, configured to acquire a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image;
an annotated image acquisition module, configured to acquire an annotated image, generated from the sample image, that marks the region of interest;
a mask image acquisition module, configured to acquire a mask image generated by masking the regions of the annotated image other than the region of interest; and
an extended image generation module, configured to input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
In addition, to achieve the above purpose, this application also provides an electronic device, which includes a memory and a processor, where computer-readable instructions are stored in the memory; when the computer-readable instructions are executed by the processor, the processor is caused to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
In addition, to achieve the above purpose, this application also provides a computer-readable storage medium in which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the processor is caused to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
In the sample image data enhancement method and apparatus, electronic device, and computer-readable storage medium proposed by this application, the target adversarial generative network model is generated by training the initial adversarial generative network model with image blocks of the region of interest in the sample image, and inputting the annotated image and the mask image into the target adversarial generative network model can generate extended sample images. The method does not rely on a pre-trained model and requires few training resources; without reducing the capacity of the network, it adds no computational complexity or parameter-tuning effort, yet it can enhance small-sample images in an implicit way to obtain extended sample images. Moreover, network model training can take a single sample image as input, without requiring a large number of sample images. Further, after the enhanced sample image data of the given type is used to train an image detection and classification model, the accuracy of the model in detecting and classifying a related image can be improved.
Description of Drawings
FIG. 1 is a diagram of an implementation environment of the sample image data enhancement method provided by an embodiment of this application;
FIG. 2 is a flowchart of the sample image data enhancement method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the training principle of the initial adversarial generative network model in the sample image data enhancement method provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of the generator G_n (when n < N) of the initial adversarial generative network model in the sample image data enhancement method provided by an embodiment of this application;
FIG. 5 is a schematic diagram of the input and output principles of the sample image, annotated image, and extended sample image in the sample image data enhancement method provided by an embodiment of this application;
FIG. 6 is a program module diagram of a preferred embodiment of the sample image data enhancement device provided by an embodiment of this application.
The realization of the purpose, functional characteristics, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of the Embodiments
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Specifically, the embodiments of this application provide a sample image data enhancement method and apparatus, an electronic device, and a storage medium; this application can be applied to smart transportation scenarios, thereby promoting the construction of smart cities. The sample image data enhancement method is used to perform data enhancement on a sample image to generate an extended sample image, and the extended sample image can be used, among other purposes, to train an image detection and classification model and to improve the accuracy of that model.
Referring to FIG. 1, FIG. 1 is an application environment diagram of a preferred embodiment of the sample image data enhancement method of this application. The sample image data enhancement method can be applied to an electronic device 1, which includes, but is not limited to, terminal equipment with computing capability such as servers, server clusters, mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, and wearable devices.
The electronic device 1 may include a processor 12, a memory 11, a network interface 13, and a communication bus 14.
The memory 11 includes at least one type of readable storage medium, which may be non-volatile or volatile. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory 11. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory 11 of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the electronic device 1.
In this embodiment, the readable storage medium of the memory 11 is generally used to store the program of the sample image data enhancement device 10 installed in the electronic device 1 (such as a sample image data enhancement program). The memory 11 can also be used to temporarily store data that has been output or will be output.
The processor 12 may in some embodiments be a central processing unit (CPU), microprocessor, or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the program of the sample image data enhancement device 10.
The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.
The communication bus 14 is used to realize connection and communication between these components.
FIG. 1 only shows the electronic device 1 with components 11-14, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
Optionally, the electronic device 1 may also include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with voice recognition capability, and a voice output device such as a speaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.
Optionally, the electronic device 1 may also include a display, which may also be called a display screen or a display unit. In some embodiments it may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
Optionally, the electronic device 1 further includes a touch sensor. The area provided by the touch sensor for the user to perform touch operations is called a touch area. The touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only contact-type touch sensors but also proximity-type touch sensors and the like. In addition, the touch sensor may be a single sensor or multiple sensors arranged, for example, in an array.
In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, and the device detects touch operations triggered by the user based on the touch display screen.
Optionally, the electronic device 1 may also include a radio frequency (RF) circuit, sensors, an audio circuit, and so on, which will not be described in detail here.
In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, may include an operating system and the program of the sample image data enhancement device 10; when executing the program of the sample image data enhancement device 10 stored in the memory 11, the processor 12 implements the following steps S21, S22, S23, S24, and S25.
Step S21: a sample image is acquired.
Specifically, the sample image may be a car damage image, and the car damage image may include a car body scratch region or a car body crack region. In addition, the number of sample images may be one.
Step S22: a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image is acquired.
Specifically, the region of interest is a car body scratch region or a car body crack region in the car damage image. There may be one, two, or more regions of interest in the sample image. The image blocks of the region of interest in the sample image can be obtained by cropping them from the sample image.
The target adversarial generative network model can be installed in the electronic device 1.
In some embodiments, the process of training the initial adversarial generative network model to generate the target adversarial generative network model can be performed in the electronic device 1; that is, the electronic device 1 trains the initial adversarial generative network model with image blocks of the region of interest in the sample image to generate the target adversarial generative network model.
In some other embodiments, this process can be performed in another electronic device; that is, the other electronic device trains the initial adversarial generative network model with image blocks of the region of interest in the sample image to generate the target adversarial generative network model, and the trained target adversarial generative network model is then installed in the electronic device.
The following describes the process of training the initial adversarial generative network model to generate the target adversarial generative network model.
As shown in FIG. 3, the initial adversarial generative network model may be a fully convolutional pyramid adversarial generative network model. Specifically, the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N. The output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks x_0, x_1, ..., x_N and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2.
The steps of training the initial adversarial generative network model to generate the target adversarial generative network model may include:
when n = N, inputting the noise image z_n into the generator G_n to obtain the output image x̃_n, i.e. the output image x̃_N = G_N(z_N); inputting the output image x̃_n and the image block x_n into the discriminator D_n; and performing alternating iterative training on the generator G_n and the discriminator D_n;
when n < N (n being a natural number), inputting the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} into the generator G_n to obtain the output image x̃_n, i.e. the output image x̃_n = G_n(z_n, (x̃_{n+1})↑r); inputting the output image x̃_n and the image block x_n into the discriminator D_n; and performing alternating iterative training on the generator G_n and the discriminator D_n; and
saving the trained generators, or the trained generators and the discriminators, as the target adversarial generative network model.
Here the output image x̃_n of the generator G_n may also be called a fake image, and the symbol ↑r may denote upsampling by a factor of r; that is, the sampled image (x̃_{n+1})↑r may denote the image obtained by upsampling the output image x̃_{n+1} of the generator G_{n+1} by a factor of r.
Further, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N may be trained and fixed in sequence, from G_N to G_0 and from D_N to D_0. Specifically, the target adversarial generative network model may be obtained by training in a coarse-to-fine manner: for example, G_N and D_N are trained first; once the training of G_N and D_N is complete, G_N and D_N are fixed, and then the training of G_{N-1} and D_{N-1} is performed, ..., until G_0 and D_0 have been trained and fixed, thereby obtaining the target adversarial generative network model.
Still further, when n = N, the generator G_n may include a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n.
As shown in FIG. 4, when n < N, the generator G_n may include a first adder 41, a convolutional neural network 42, and a second adder 43. The first adder 41 is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network 42, and the second adder 43 is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n. That is, the output image x̃_n can be expressed by the following formula:

x̃_n = (x̃_{n+1})↑r + ψ_n(z_n + (x̃_{n+1})↑r)

where ψ_n denotes the convolutional neural network of the generator G_n, which may be a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
It can be understood that, in some other embodiments, when n = N the generator G_n may also adopt the architecture of the first adder 41, the convolutional neural network 42, and the second adder 43, except that the first adder 41 can directly provide the noise image z_n to the convolutional neural network 42, and the output image of the convolutional neural network 42 can also be output directly via the second adder 43 as the output image x̃_N of the generator G_N.
Still further, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss, and the discriminator D_n is a Markovian discriminator. In the process of training the initial adversarial generative network model to generate the target adversarial generative network model, the training loss of the generator G_n and the corresponding discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec; the training loss of the generator G_n and the discriminator D_n is formulated as follows:

min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n)

where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
The reconstruction loss l_rec may satisfy the following conditions:
when n = N, the noise image z_N is a random noise image z*, and G_N(z*) denotes the output image x̃_N of the generator G_N; the reconstruction loss of the generator G_N and the discriminator D_N is:

l_rec = ‖G_N(z*) − x_N‖²

when n < N, the noise image z_n is 0, and G_n(0, (x̃_{n+1})↑r) denotes the output image x̃_n of the generator G_n; the reconstruction loss of the generator G_n and the discriminator D_n is:

l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖²
Step S23: an annotated image, generated from the sample image, that marks the region of interest is acquired.
Specifically, in some embodiments, manual annotation may be used: for example, the region of interest is framed on the sample image by operating the electronic device, thereby producing the annotated image. However, in some other embodiments, the electronic device may also directly receive an annotated image, with the region of interest already annotated, sent by an external device.
Step S24: a mask image generated by masking the regions of the annotated image other than the region of interest is acquired.
Specifically, in some embodiments, the electronic device may generate the mask image by masking the regions of the annotated image other than the region of interest. However, in some other embodiments, the electronic device may also directly receive such a mask image, generated by masking the regions of the annotated image other than the region of interest, sent by an external device. Specifically, the masking process may be an operation that sets the gray values of the regions outside the region of interest to 0 and sets the gray values of the region of interest to 1 (or 255).
The input and output relationships among the sample image, the annotated image, and the extended sample image involved in steps S23 and S24 may be as shown in FIG. 5.
Step S25: the annotated image and the mask image are input into the target adversarial generative network model to generate an extended sample image.
Specifically, in some embodiments, when the trained generators G_0, G_1, ..., G_N are saved as the target adversarial generative network model in step S22, inputting the annotated image and the mask image into the target adversarial generative network model in step S25 yields the extended sample image.
In some other embodiments, when the trained generators and discriminators are saved as the target adversarial generative network model in step S22, the annotated image and the mask image are input into the generators G_0, G_1, ..., G_N of the target adversarial generative network model in step S25 to obtain an output image; the output image can be further judged by a discriminator, and the output image whose judgment result is real is taken as the extended sample image.
In the sample image data enhancement method proposed by this application, the target adversarial generative network model is generated by training the initial adversarial generative network model with image blocks of the region of interest in the sample image, and inputting the annotated image and the mask image into the target adversarial generative network model can generate extended sample images. The method does not rely on a pre-trained model and requires few training resources; without reducing the capacity of the network, it adds no computational complexity or parameter-tuning effort, yet it can enhance small-sample images in an implicit way to obtain extended sample images. Moreover, network model training can take a single sample image as input, without requiring a large number of sample images. Further, after the enhanced sample image data of the given type is used to train an image detection and classification model, the accuracy of the model in detecting and classifying a related image can be improved.
Further, by adopting an adversarial generative network model and adversarial learning, it is possible to generate extended sample images realistic enough to pass for genuine, which can further improve the accuracy of an image detection and classification model trained with the extended sample images. In addition, the adversarial generative network model can generate varied data while still obeying the original data distribution, and it consumes far less computing resources than methods such as AutoAugment.
Further, the initial adversarial generative network model includes the multiple generators G_0, G_1, ..., G_N and the multiple discriminators D_0, D_1, ..., D_N, so the target adversarial generative network model can generate extended sample images at multiple sizes while maintaining global structure and texture features, effectively improving the accuracy of an image detection and classification model trained with the extended sample images. In addition, the target adversarial generative network model can generate multiple extended sample images upon receiving the annotated image and the mask image; it can be seen that, once model training is complete, extending the sample images is relatively simple.
Further, when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder; the first adder superimposes the noise image z_n and the sampled image (x̃_{n+1})↑r and provides the result to the convolutional neural network, and the second adder superimposes the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n. That is, residual learning defines the learning scheme at each level of the pyramid, so that the generator G_n learns the details missing from the image on top of each level's input and can generate more realistic extended sample images.
Further, the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model; for example, the convolutional neural network of the generator adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks. It can produce multiple extended sample images of any size and any aspect ratio, which also helps improve the accuracy of an image detection and classification model trained with the extended sample images.
Further, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; it converges faster, can generate higher-quality samples, and offers a stable training procedure that requires almost no parameter tuning to complete model training successfully.
Further, the discriminator is a Markovian discriminator, which helps the extended sample image retain high resolution and fine detail, so that the quality of the extended sample image is higher.
Further, in the step of training the initial adversarial generative network model with the image blocks to generate the target adversarial generative network model, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0; this progressive training also helps reduce the consumption of computing resources.
Further, the adversarial loss l_adv and the reconstruction loss l_rec also help obtain a better target adversarial generative network model, so that high-quality extended sample images can be obtained.
Further, applying the sample image data enhancement method and apparatus to car damage image data allows data enhancement of small-sample images such as car body scratch regions or car body crack regions, alleviating the imbalance of car damage image samples and improving the performance of the car damage image detection and classification model. The sample image data enhancement method and apparatus belong to unsupervised learning; compared with the supervised learning common in deep learning, they do not rely on a pre-trained model, need no massive car damage dataset, and require no large amount of computing resources, greatly reducing data collection costs and training resources. In addition, the fully convolutional pyramid adversarial generative network model can generate, from coarse to fine, extended car damage sample images that obey the original car damage sample image distribution yet differ from the original car damage sample images, which further helps improve the performance of the car damage image detection and classification model.
In other embodiments, the program of the sample image data enhancement device 10 may also be divided into one or more modules, which are stored in the memory 11 and executed by the processor 12 to complete this application. A module referred to in this application is a series of computer program instruction segments capable of completing a specific function. As shown in FIG. 6, which is a program module diagram of a preferred embodiment of the sample image data enhancement device 10 in FIG. 1, the sample image data enhancement device 10 can be divided into: a sample image acquisition module 101, a network model acquisition module 102, an annotated image acquisition module 103, a mask image acquisition module 104, and an extended image generation module 105. The functions or operation steps implemented by the modules 101-105 are all similar to steps S21 through S25 above and are not detailed here; exemplarily:
the sample image acquisition module 101 is configured to acquire a sample image;
the network model acquisition module 102 is configured to acquire a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image;
the annotated image acquisition module 103 is configured to acquire an annotated image, generated from the sample image, that marks the region of interest;
the mask image acquisition module 104 is configured to acquire a mask image generated by masking the regions of the annotated image other than the region of interest; and
the extended image generation module 105 is configured to input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
In addition, an embodiment of this application also proposes a computer-readable storage medium, which includes a sample image data enhancement device; when executed by a processor, the sample image data enhancement device implements the following operations:
acquiring a sample image;
acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of the region of interest in the sample image;
acquiring an annotated image, generated from the sample image, that marks the region of interest;
acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and
inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
Preferably, the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N; the output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2. In the steps of training the initial adversarial generative network model to generate the target adversarial generative network model:
when n = N, the noise image z_n is input into the generator G_n to obtain the output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n;
when n < N (n being a natural number), the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} are input into the generator G_n to obtain the output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n; and
the trained generators, or the trained generators and the discriminators, are saved as the target adversarial generative network model.
Preferably, in the steps of training the initial adversarial generative network model to generate the target adversarial generative network model, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0;
when n = N, the generator G_n includes a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n; and
when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder; the first adder is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network, and the second adder is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n.
Preferably, the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model, and the convolutional neural network of the generator G_n adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
Preferably, the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; the discriminator D_n is a Markovian discriminator; and the training loss of the generator G_n and the discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec, formulated as follows:

min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n)

where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
Preferably, the reconstruction loss l_rec satisfies the following conditions:
when n = N, the noise image z_N is a random noise image z*, and the reconstruction loss of the generator G_N and the discriminator D_N is:

l_rec = ‖G_N(z*) − x_N‖²

when n < N, the noise image z_n is 0, and the reconstruction loss of the generator G_n and the discriminator D_n is:

l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖²
Preferably, the sample image includes a car damage image, and the region of interest includes a car body scratch region or a car body crack region in the car damage image.
The specific implementation of the computer-readable storage medium of this application is substantially the same as that of the sample image data enhancement method and the electronic device described above, and will not be repeated here.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, apparatus, article, or method that includes that element.
The serial numbers of the above embodiments of this application are for description only and do not represent the superiority or inferiority of the embodiments. Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disc), including several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not therefore limit its patent scope; any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. A sample image data enhancement method, wherein the method includes the following steps:
    acquiring a sample image;
    acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image;
    acquiring an annotated image, generated from the sample image, that marks the region of interest;
    acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and
    inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  2. The sample image data enhancement method according to claim 1, wherein the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N; the output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2; in the process of training the initial adversarial generative network model to generate the target adversarial generative network model:
    when n = N, the noise image z_n is input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n;
    when n < N (n being a natural number), the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} are input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n; and
    the trained generators G_0, G_1, ..., G_N, or the trained generators G_0, G_1, ..., G_N and the discriminators D_0, D_1, ..., D_N, are saved as the target adversarial generative network model.
  3. The sample image data enhancement method according to claim 2, wherein, in the process of training the initial adversarial generative network model to generate the target adversarial generative network model, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0;
    when n = N, the generator G_n includes a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n; and
    when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder; the first adder is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network, and the second adder is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n.
  4. The sample image data enhancement method according to claim 2, wherein the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model, and the convolutional neural network of the generator G_n adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
  5. The sample image data enhancement method according to claim 2, wherein the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; the discriminator D_n is a Markovian discriminator; and the training loss of the generator G_n and the discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec, formulated as follows:
    min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n)
    where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
  6. The sample image data enhancement method according to claim 5, wherein the reconstruction loss l_rec satisfies the following conditions:
    when n = N, the noise image z_N is a random noise image z*, and the reconstruction loss of the generator G_N and the discriminator D_N is:
    l_rec = ‖G_N(z*) − x_N‖²
    when n < N, the noise image z_n is 0, and the reconstruction loss of the generator G_n and the discriminator D_n is:
    l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖²
  7. The sample image data enhancement method according to claim 1, wherein the sample image includes a car damage image, and the region of interest includes a car body scratch region or a car body crack region in the car damage image.
  8. A sample image data enhancement apparatus, wherein the apparatus includes:
    a sample image acquisition module, configured to acquire a sample image;
    a network model acquisition module, configured to acquire a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image;
    an annotated image acquisition module, configured to acquire an annotated image, generated from the sample image, that marks the region of interest;
    a mask image acquisition module, configured to acquire a mask image generated by masking the regions of the annotated image other than the region of interest; and
    an extended image generation module, configured to input the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  9. An electronic device, wherein the electronic device includes a memory and a processor, computer-readable instructions are stored in the memory, and the computer-readable instructions, when executed by the processor, cause the processor to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  10. The electronic device according to claim 9, wherein the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N; the output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2; in the process of training the initial adversarial generative network model to generate the target adversarial generative network model:
    when n = N, the noise image z_n is input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n;
    when n < N (n being a natural number), the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} are input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n; and
    the trained generators G_0, G_1, ..., G_N, or the trained generators G_0, G_1, ..., G_N and the discriminators D_0, D_1, ..., D_N, are saved as the target adversarial generative network model.
  11. The electronic device according to claim 10, wherein, in the process of training the initial adversarial generative network model to generate the target adversarial generative network model, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0;
    when n = N, the generator G_n includes a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n; and
    when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder; the first adder is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network, and the second adder is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n.
  12. The electronic device according to claim 10, wherein the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model, and the convolutional neural network of the generator G_n adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
  13. The electronic device according to claim 10, wherein the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; the discriminator D_n is a Markovian discriminator; and the training loss of the generator G_n and the discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec, formulated as follows:
    min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n)
    where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
  14. The electronic device according to claim 13, wherein the reconstruction loss l_rec satisfies the following conditions:
    when n = N, the noise image z_N is a random noise image z*, and the reconstruction loss of the generator G_N and the discriminator D_N is:
    l_rec = ‖G_N(z*) − x_N‖²
    when n < N, the noise image z_n is 0, and the reconstruction loss of the generator G_n and the discriminator D_n is:
    l_rec = ‖G_n(0, (x̃_{n+1})↑r) − x_n‖²
  15. The electronic device according to claim 9, wherein the sample image includes a car damage image, and the region of interest includes a car body scratch region or a car body crack region in the car damage image.
  16. A computer-readable storage medium, wherein computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring a sample image; acquiring a target adversarial generative network model generated by training an initial adversarial generative network model with image blocks of a region of interest in the sample image; acquiring an annotated image, generated from the sample image, that marks the region of interest; acquiring a mask image generated by masking the regions of the annotated image other than the region of interest; and inputting the annotated image and the mask image into the target adversarial generative network model to generate an extended sample image.
  17. The computer-readable storage medium according to claim 16, wherein the initial adversarial generative network model includes multiple generators G_0, G_1, ..., G_N and multiple discriminators D_0, D_1, ..., D_N corresponding to the generators G_0, G_1, ..., G_N; the output image sizes of the generators G_0, G_1, ..., G_N increase in the order G_0, G_1, ..., G_N; the image blocks include multiple image blocks x_0, x_1, ..., x_N of sequentially increasing size; and the input of the initial adversarial generative network model includes the image blocks and noise images z_0, z_1, ..., z_N, where N is a natural number greater than or equal to 2; in the process of training the initial adversarial generative network model to generate the target adversarial generative network model:
    when n = N, the noise image z_n is input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n;
    when n < N (n being a natural number), the noise image z_n and the sampled image (x̃_{n+1})↑r of the output image x̃_{n+1} of the generator G_{n+1} are input into the generator G_n to obtain an output image x̃_n, the output image x̃_n and the image block x_n are input into the discriminator D_n, and alternating iterative training is performed on the generator G_n and the discriminator D_n; and
    the trained generators G_0, G_1, ..., G_N, or the trained generators G_0, G_1, ..., G_N and the discriminators D_0, D_1, ..., D_N, are saved as the target adversarial generative network model.
  18. The computer-readable storage medium according to claim 17, wherein, in the process of training the initial adversarial generative network model to generate the target adversarial generative network model, the multiple discriminators D_0, D_1, ..., D_N corresponding to the multiple generators G_0, G_1, ..., G_N are trained and fixed in sequence, from G_N to G_0 and from D_N to D_0;
    when n = N, the generator G_n includes a convolutional neural network, which receives the noise image z_n and outputs the output image x̃_n; and
    when n < N, the generator G_n includes a first adder, a convolutional neural network, and a second adder; the first adder is used to superimpose the noise image z_n and the sampled image (x̃_{n+1})↑r and provide the result to the convolutional neural network, and the second adder is used to superimpose the output image of the convolutional neural network with the sampled image (x̃_{n+1})↑r to form the output image x̃_n.
  19. The computer-readable storage medium according to claim 17, wherein the initial adversarial generative network model includes a fully convolutional pyramid adversarial generative network model, and the convolutional neural network of the generator G_n adopts a 5-layer fully convolutional network composed of 3×3 Conv-BN-LeakyReLU blocks.
  20. The computer-readable storage medium according to claim 17, wherein the generator G_n adopts WGAN-GP, which can provide a gradient penalty loss; the discriminator D_n is a Markovian discriminator; and the training loss of the generator G_n and the discriminator D_n includes the adversarial loss l_adv and the reconstruction loss l_rec, formulated as follows:
    min_{G_n} max_{D_n} l_adv(G_n, D_n) + λ l_rec(G_n)
    where min_{G_n} max_{D_n} l_adv(G_n, D_n) denotes the minimax objective in which the adversarial loss of the generator G_n is minimized while the adversarial loss of the discriminator D_n is maximized, λ denotes a hyperparameter, and l_rec(G_n) denotes the reconstruction loss of the generator G_n.
PCT/CN2020/118440 2020-05-28 2020-09-28 Sample image data enhancement method and apparatus, electronic device, and storage medium WO2021114832A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010468756.4 2020-05-28
CN202010468756.4A CN111666994A (zh) 2020-05-28 Sample image data enhancement method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021114832A1 true WO2021114832A1 (zh) 2021-06-17

Family

ID=72385186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118440 WO2021114832A1 (zh) 2020-05-28 2020-09-28 样本图像数据增强方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN111666994A (zh)
WO (1) WO2021114832A1 (zh)


Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666994A (zh) * 2020-05-28 2020-09-15 平安科技(深圳)有限公司 样本图像数据增强方法、装置、电子设备及存储介质
CN111931772B (zh) * 2020-09-18 2021-02-09 平安科技(深圳)有限公司 医学图像处理方法、装置、设备及存储介质
CN112329932A (zh) * 2020-10-30 2021-02-05 深圳市优必选科技股份有限公司 生成对抗网络的训练方法、装置及终端设备
CN112381730B (zh) * 2020-11-12 2024-02-02 上海航天计算机技术研究所 一种遥感影像数据扩增方法
CN112396005A (zh) * 2020-11-23 2021-02-23 平安科技(深圳)有限公司 生物特征图像识别方法、装置、电子设备及可读存储介质
CN112785599B (zh) * 2020-12-25 2024-05-28 深兰工业智能创新研究院(宁波)有限公司 图像扩展方法及装置
CN113435358B (zh) * 2021-06-30 2023-08-11 北京百度网讯科技有限公司 用于训练模型的样本生成方法、装置、设备、程序产品
CN113327221A (zh) * 2021-06-30 2021-08-31 北京工业大学 融合roi区域的图像合成方法、装置、电子设备及介质
CN113962360B (zh) * 2021-10-09 2024-04-05 西安交通大学 一种基于gan网络的样本数据增强方法及系统
JP2023082567A (ja) * 2021-12-02 2023-06-14 株式会社日立製作所 システムおよびプログラム
CN116797814A (zh) * 2022-12-28 2023-09-22 中建新疆建工集团第三建设工程有限公司 智慧工地安全管理系统
CN116030158B (zh) * 2023-03-27 2023-07-07 广州思德医疗科技有限公司 基于风格生成对抗网络模型的病灶图像生成方法及装置


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599869B (zh) * 2016-12-22 2019-12-03 安徽大学 一种基于多任务卷积神经网络的车辆属性识别方法
US10262236B2 (en) * 2017-05-02 2019-04-16 General Electric Company Neural network training image generation system
CN110868598B (zh) * 2019-10-17 2021-06-22 上海交通大学 基于对抗生成网络的视频内容替换方法及系统

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108510482A (zh) * 2018-03-22 2018-09-07 姚书忠 基于阴道镜图像的宫颈癌检测方法、装置、设备及介质
US20190371450A1 (en) * 2018-05-30 2019-12-05 Siemens Healthcare Gmbh Decision Support System for Medical Therapy Planning
CN110189336A (zh) * 2019-05-30 2019-08-30 上海极链网络科技有限公司 图像生成方法、系统、服务器及存储介质
CN110516747A (zh) * 2019-08-29 2019-11-29 电子科技大学 基于对抗生成网络和自编码结合的肺结节良恶性分类方法
CN111160135A (zh) * 2019-12-12 2020-05-15 太原理工大学 基于改进的Faster R-cnn的尿红细胞病变识别与统计方法和系统
CN111666994A (zh) * 2020-05-28 2020-09-15 平安科技(深圳)有限公司 样本图像数据增强方法、装置、电子设备及存储介质

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469279A (zh) * 2021-07-22 2021-10-01 凌云光技术股份有限公司 一种字符样本集的扩增方法、系统及装置
CN113642621A (zh) * 2021-08-03 2021-11-12 南京邮电大学 基于生成对抗网络的零样本图像分类方法
CN113610161A (zh) * 2021-08-09 2021-11-05 东南数字经济发展研究院 一种基于图像分类技术的目标检测数据标注方法
CN114663275A (zh) * 2022-04-01 2022-06-24 西北大学 一种基于风格对抗生成网络stylegan2的脸谱图像生成方法
CN114663275B (zh) * 2022-04-01 2024-03-15 西北大学 一种基于风格对抗生成网络stylegan2的脸谱图像生成方法
CN115481694A (zh) * 2022-09-26 2022-12-16 南京星环智能科技有限公司 一种训练样本集的数据增强方法、装置、设备及存储介质
CN115481694B (zh) * 2022-09-26 2023-09-05 南京星环智能科技有限公司 一种训练样本集的数据增强方法、装置、设备及存储介质
CN116051683A (zh) * 2022-12-20 2023-05-02 中国科学院空天信息创新研究院 一种基于风格自组的遥感图像生成方法、存储介质及设备
CN116051683B (zh) * 2022-12-20 2023-07-04 中国科学院空天信息创新研究院 一种基于风格自组的遥感图像生成方法、存储介质及设备

Also Published As

Publication number Publication date
CN111666994A (zh) 2020-09-15

Similar Documents

Publication Publication Date Title
WO2021114832A1 (zh) 样本图像数据增强方法、装置、电子设备及存储介质
WO2020199468A1 (zh) 图像分类方法、装置及计算机可读存储介质
WO2021098362A1 (zh) 视频分类模型构建、视频分类的方法、装置、设备及介质
CN111160533B (zh) 一种基于跨分辨率知识蒸馏的神经网络加速方法
CN107977707B (zh) 一种对抗蒸馏神经网络模型的方法及计算设备
WO2022001623A1 (zh) 基于人工智能的图像处理方法、装置、设备及存储介质
WO2022105125A1 (zh) 图像分割方法、装置、计算机设备及存储介质
US20200279358A1 (en) Method, device, and system for testing an image
WO2021139302A1 (zh) 图像交通信号灯检测方法、装置、电子设备及存储介质
WO2022105117A1 (zh) 一种图像质量评价的方法、装置、计算机设备及存储介质
WO2019055093A1 (en) EXTRACTION OF SPATIO-TEMPORAL CHARACTERISTICS FROM A VIDEO
WO2022012179A1 (zh) 生成特征提取网络的方法、装置、设备和计算机可读介质
WO2023035531A1 (zh) 文本图像超分辨率重建方法及其相关设备
WO2024041479A1 (zh) 一种数据处理方法及其装置
CN110211195B (zh) 生成图像集合的方法、装置、电子设备和计算机可读存储介质
CN111126347B (zh) 人眼状态识别方法、装置、终端及可读存储介质
WO2020125062A1 (zh) 一种图像融合方法及相关装置
Ding et al. Full‐reference image quality assessment using statistical local correlation
CN116051811B (zh) 区域识别方法、装置、计算机设备及计算机可读存储介质
WO2024040870A1 (zh) 文本图像生成、训练、文本图像处理方法以及电子设备
Wang et al. Multi‐level feature fusion network for crowd counting
WO2022134338A1 (zh) 领域适应方法、装置、电子设备及存储介质
CN113298265B (zh) 一种基于深度学习的异构传感器潜在相关性学习方法
CN114972861A (zh) 对抗样本生成方法、装置、设备及存储介质
CN113362249A (zh) 文字图像合成方法、装置、计算机设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20899920

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20899920

Country of ref document: EP

Kind code of ref document: A1