WO2023035425A1 - Autoencoder training method and components, and abnormal image detection method and components - Google Patents

Autoencoder training method and components, and abnormal image detection method and components

Info

Publication number
WO2023035425A1
WO2023035425A1 · PCT/CN2021/134411 · CN2021134411W
Authority
WO
WIPO (PCT)
Prior art keywords
image
autoencoder
vector
loss
discriminator
Prior art date
Application number
PCT/CN2021/134411
Other languages
English (en)
French (fr)
Inventor
赵冰
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2023035425A1 publication Critical patent/WO2023035425A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application relates to the field of computer technology, in particular to an autoencoder training method and components, and an abnormal image detection method and components.
  • Identity mapping means that, since the purpose of the autoencoder is to make the reconstructed output image as similar as possible to the original image, the autoencoder tends to copy the input directly to the output when the constraints are insufficient, because doing so is rated best. Identity mapping causes the autoencoder to learn nothing meaningful: it reconstructs both normal and abnormal samples well and cannot distinguish between the two.
  • Excessive generalization means that when a normal image and an abnormal image are fairly similar, the trained autoencoder can hardly distinguish the two and still reconstructs unseen abnormal images well.
  • the purpose of the present application is to provide an autoencoder training method and components, and an abnormal image detection method and components, so as to improve the detection accuracy of the autoencoder for abnormal images.
  • the specific scheme is as follows:
  • an autoencoder training method including:
  • the sampling vector is sampled from a preset uniform distribution and has the same dimension as the target vector;
  • the reconstructed area is an area corresponding to the partial area in the reconstructed image
  • the current autoencoder is determined as an autoencoder capable of reconstructing normal images only, so as to detect abnormal images by using the current autoencoder.
  • it also includes:
  • Re-acquire sample images from the normal image set and perform the subsequent steps to iteratively train the updated autoencoder, updated vector discriminator, and updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
  • the random occlusion of a part of the sample image to obtain a training sample includes:
  • a rectangular frame is used to randomly block a part of the sample image to obtain the training sample.
  • the aspect ratio of the rectangular frame is 1:1, and the rectangular frame occupies 10% of the sample image.
  • the present application provides a method for detecting abnormal images, including:
  • inputting the image to be detected into the trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected; the autoencoder is trained according to the above-mentioned training method;
  • if the similarity between the target image and the image to be detected is less than a preset threshold, it is determined that the image to be detected is an abnormal image; otherwise, it is determined that the image to be detected is a normal image.
  • if the image to be detected is a traffic surveillance image, the abnormal image includes traffic violations; if the image to be detected is a medical image, the abnormal image includes a lesion.
  • an autoencoder training device comprising:
  • An acquisition module configured to acquire a sample image from a normal image set
  • an occlusion module configured to randomly occlude a partial area of the sample image to obtain a training sample;
  • a processing module configured to input the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector, and obtains a reconstructed image based on the target vector;
  • a first calculation module configured to calculate a first loss between the reconstructed image and the sample image
  • a second calculation module configured to calculate a second loss between the target vector and a sampling vector by using the vector discriminator in an initial state;
  • the sampling vector is sampled from a preset uniform distribution and has the same dimension as the target vector;
  • a third calculation module configured to use a reconstruction discriminator in an initial state to calculate a third loss between the reconstructed area and the partial area; the reconstructed area is an area corresponding to the partial area in the reconstructed image;
  • a fourth calculation module configured to determine a comprehensive loss based on the first loss, the second loss and the third loss
  • An output module configured to determine the current autoencoder as an autoencoder capable of reconstructing only normal images if the comprehensive loss meets a preset convergence condition, so as to detect abnormal images using the current autoencoder.
  • an abnormal image detection device including:
  • a detection module configured to input the image to be detected into the trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected; the autoencoder is trained according to the above training method;
  • a determining module configured to determine that the image to be detected is an abnormal image if the similarity between the target image and the image to be detected is less than a preset threshold; otherwise, determine that the image to be detected is a normal image.
  • the present application provides an electronic device, including:
  • a memory configured to store a computer program;
  • a processor configured to execute the computer program to implement any method disclosed above.
  • the present application provides a readable storage medium for storing a computer program, wherein when the computer program is executed by a processor, any method disclosed above is implemented.
  • the present application provides an autoencoder training method, including: acquiring a sample image from a normal image set; randomly occluding a partial area of the sample image to obtain a training sample; inputting the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector; calculating a first loss between the reconstructed image and the sample image; calculating a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector; calculating a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area; determining a comprehensive loss based on the first loss, the second loss, and the third loss; and, if the comprehensive loss meets a preset convergence condition, determining the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
  • the present application presets an autoencoder in the initial state, a vector discriminator in the initial state, and a reconstruction discriminator in the initial state.
  • the sample image is randomly occluded and then sent to the autoencoder for training, which can change the training process into an image repair task.
  • the image repair task can enable the autoencoder to learn deeper image features (such as: occluded areas and the contextual information between unoccluded regions), thereby reducing the possibility of identity mapping.
  • the vector discriminator can constrain the sample vector to approach the uniform distribution, that is, it constrains the vectors the autoencoder produces when encoding images to approach the uniform distribution; the vectors obtained by encoding abnormal images will then also approach the uniform distribution, so the abnormal images output by the autoencoder will be close to normal images.
  • the autoencoder's ability to reconstruct abnormal images is thereby constrained, that is, the autoencoder cannot accurately reconstruct abnormal images and reconstructs only normal images well; the autoencoder will then detect abnormal images more easily, which reduces its generalization ability.
  • within one iteration, this application trains the autoencoder, the vector discriminator, and the reconstruction discriminator separately on the same sample image, so that the autoencoder improves its ability to reconstruct images, the vector discriminator improves its ability to distinguish the sample vector from the sampling vector (thereby constraining the sample vector to be closer to the uniform distribution), and the reconstruction discriminator improves its ability to distinguish the original occluded region from the reconstructed occluded region, which in turn lets the autoencoder improve its reconstruction ability with the help of the reconstruction discriminator. This adversarial training and learning reduces the possibility of identity mapping and the generalization ability of the autoencoder, and finally makes the autoencoder reconstruct only normal images well, thereby improving its detection accuracy for abnormal images.
  • the present application provides an autoencoder training method and components, an abnormal image detection method and components, and the components include: devices, equipment and readable storage media, which also have the above-mentioned technical effects.
  • Fig. 1 is a kind of autoencoder training method flow chart disclosed in the present application
  • Fig. 2 is a flow chart of an abnormal image detection method disclosed in the present application.
  • FIG. 3 is a schematic diagram of an autoencoder training device disclosed in the present application.
  • Fig. 4 is a schematic diagram of an abnormal image detection device disclosed in the present application.
  • FIG. 5 is a schematic diagram of an electronic device disclosed in the present application.
  • FIG. 6 is a schematic diagram of an autoencoder training process disclosed in the present application.
  • the present application provides a training scheme that can improve the detection accuracy of an autoencoder for abnormal images.
  • the embodiment of the present application discloses an autoencoder training method, including:
  • randomly occluding a partial area in the sample image to obtain a training sample includes: using a rectangular frame to randomly occlude a partial area in the sample image to obtain a training sample.
  • the aspect ratio of the rectangular frame is 1:1, and the rectangular frame accounts for 10% of the sample image.
  • the aspect ratio of the rectangular frame can be other, and the ratio of the rectangular frame to the sample image can also be flexibly adjusted.
  • the specific means of blocking a part of the sample image is: adding Gaussian noise to this part of the area.
  • the first loss between the reconstructed image and the sample image can be calculated with any loss function, that is, each pixel of the reconstructed image is compared with the corresponding pixel of the sample image, and the first loss is determined based on the differences between pixels at the same positions in the two images.
  • the sampling vector is sampled from a preset uniform distribution and has the same dimension as the target vector.
  • S104, S105, and S106 may be executed in parallel, or may be executed separately in no particular order.
  • S110, S111, and S112 may be executed in parallel, or may be executed separately in no particular order.
  • the autoencoder in the initial state can be built on a convolutional neural network, or the generator of a generative adversarial network can be used.
  • the vector discriminator of the initial state and the reconstruction discriminator can adopt the discriminator in the generative adversarial network.
  • the sample image is randomly occluded and then sent to the autoencoder for training, which can change the training process into an image repair task.
  • the image repair task can enable the autoencoder to learn deeper image features (such as: occluded areas and the contextual information between unoccluded regions), thereby reducing the possibility of identity mapping.
  • the vector discriminator can constrain the sample vector to approach the uniform distribution, that is, it constrains the vectors obtained when the autoencoder encodes images to approach the uniform distribution; the vectors obtained by encoding abnormal images will then also approach the uniform distribution, so the abnormal images output by the autoencoder will be close to normal images.
  • the autoencoder's ability to reconstruct abnormal images is thereby constrained, that is, the autoencoder cannot accurately reconstruct abnormal images and reconstructs only normal images well; the autoencoder will then detect abnormal images more easily, which reduces its generalization ability.
  • an autoencoder in an initial state, a vector discriminator in an initial state, and a reconstruction discriminator in an initial state are preset.
  • within one iteration, the same sample image is used to train the autoencoder, the vector discriminator, and the reconstruction discriminator separately, so that the autoencoder improves its ability to reconstruct images, the vector discriminator improves its ability to distinguish the sample vector from the sampling vector (thereby constraining the sample vector to be closer to the uniform distribution), and the reconstruction discriminator improves its ability to distinguish the original occluded region from the reconstructed occluded region, which in turn lets the autoencoder improve its reconstruction ability with the help of the reconstruction discriminator. This adversarial training and learning reduces the possibility of identity mapping and the generalization ability of the autoencoder, and finally makes the autoencoder reconstruct only normal images well, thereby improving its detection accuracy for abnormal images.
  • an abnormal image detection method including:
  • the autoencoder trained in this application reconstructs only normal images well and cannot reconstruct abnormal images. Therefore, if the image output by the autoencoder differs considerably from the input image (i.e., their similarity is small), the current input image is an abnormal image; otherwise, it is a normal image.
  • if the image to be detected is a traffic surveillance image, the abnormal image includes traffic violations; if the image to be detected is a medical image, the abnormal image includes a lesion.
  • the image to be detected may be a traffic surveillance image, a shopping mall surveillance image, a medical image, and the like.
  • when the image to be detected is a traffic surveillance image, the normal image set in S101 of the above embodiment consists of a plurality of traffic images that contain no violations.
  • when the image to be detected is a medical image, the normal image set in S101 of the above embodiment consists of a plurality of medical images that contain no lesions. It can be seen that the normal image set used to train the autoencoder must correspond to the usage scenario of the autoencoder. Of course, the autoencoder can also be used to examine images to determine whether behaviors such as theft or rioting are present.
  • An autoencoder training device provided in the embodiment of the present application is introduced below.
  • the autoencoder training device described below and the autoencoder training method described above may refer to each other.
  • an autoencoder training device including:
  • An acquisition module 301 configured to acquire a sample image from a normal image set
  • an occlusion module 302 configured to randomly occlude a partial area of the sample image to obtain a training sample;
  • the processing module 303 is used to input the training samples into the autoencoder in the initial state, so that the autoencoder encodes the training samples to obtain a target vector, and obtains a reconstructed image based on the target vector;
  • a first calculation module 304 configured to calculate a first loss between the reconstructed image and the sample image
  • the second calculation module 305 is used to calculate the second loss between the target vector and the sampling vector by using the vector discriminator in the initial state; the sampling vector is sampled from a preset uniform distribution and has the same dimension as the target vector;
  • the third calculation module 306 is used to calculate the third loss between the reconstructed area and the partial area by using the reconstruction discriminator in the initial state; the reconstructed area is the area corresponding to the partial area in the reconstructed image;
  • a fourth calculation module 307 configured to determine a comprehensive loss based on the first loss, the second loss and the third loss
  • the output module 308 is configured to determine the current autoencoder as an autoencoder capable of reconstructing only normal images if the comprehensive loss meets the preset convergence condition, so as to use the current autoencoder to detect abnormal images.
  • it also includes:
  • the first update module is used to update the parameters of the self-encoder based on the comprehensive loss to obtain the updated self-encoder if the comprehensive loss does not meet the preset convergence condition;
  • the second update module is used to update the parameters of the vector discriminator based on the second loss to obtain the updated vector discriminator;
  • the third update module is used to update the parameters of the reconstruction discriminator based on the third loss to obtain the updated reconstruction discriminator;
  • an iteration module configured to re-acquire sample images from the normal image set and perform the subsequent steps, so as to iteratively train the updated autoencoder, the updated vector discriminator, and the updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
  • the occlusion module is specifically configured to randomly occlude a partial area of the sample image with a rectangular frame to obtain the training sample.
  • the aspect ratio of the rectangular frame is 1:1, and the rectangular frame covers 10% of the sample image.
  • this embodiment provides an autoencoder training device, which can reduce the possibility of identity mapping and the generalization ability of the autoencoder, so that the autoencoder can only have a good reconstruction ability for normal images. This improves the detection accuracy of the autoencoder for abnormal images.
  • An abnormal image detection device provided in an embodiment of the present application is introduced below.
  • the abnormal image detection device described below and the abnormal image detection method described above may refer to each other.
  • an abnormal image detection device including:
  • a detection module 401 configured to input the image to be detected into the trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected; the autoencoder is trained according to the above-mentioned training method;
  • the determining module 402 is configured to determine that the image to be detected is an abnormal image if the similarity between the target image and the image to be detected is less than a preset threshold; otherwise, determine the image to be detected to be a normal image.
  • if the image to be detected is a traffic surveillance image, the abnormal image includes traffic violations; if the image to be detected is a medical image, the abnormal image includes a lesion.
  • this embodiment provides an abnormal image detection device that can accurately detect abnormal images.
  • an electronic device including:
  • Memory 501 for storing computer programs
  • the processor 502 is configured to execute the computer program, so as to implement the method disclosed in any of the foregoing embodiments.
  • a readable storage medium provided by an embodiment of the present application is introduced below, and a readable storage medium described below and any method, device, and device described above may refer to each other.
  • a readable storage medium is used to store a computer program, wherein when the computer program is executed by a processor, any method disclosed in the foregoing embodiments is implemented. Regarding the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, and details are not repeated here.
  • the following embodiments add a dual adversarial learning module on the basis of a denoising autoencoder, namely: an image reconstruction adversarial learning module and a latent space constraint adversarial learning module. Please refer to FIG. 6 for details.
  • a denoising autoencoder generally performs a denoising task, that is, randomly distributed noise points are added to the training image (a normal image) before it is input into the autoencoder for reconstruction.
  • to increase the difficulty of image reconstruction, this embodiment changes the denoising task into an image inpainting task, that is, a rectangular occlusion is added to the image before it is input into the autoencoder for reconstruction.
  • a modified denoising autoencoder is used to reconstruct the occluded image.
  • the image inpainting task can enable the autoencoder to learn the context information of the occluded area and the unoccluded area, that is, to promote the autoencoder to learn deeper image features and reduce the possibility of identity mapping.
  • the image reconstruction confrontation learning module can improve the image restoration effect of the occluded area.
  • this module borrows the adversarial learning idea of generative adversarial networks, and the autoencoder performs adversarial learning against this module.
  • the discriminator in this module gradually improves its ability to compare the reconstructed rectangular occluded region with the corresponding region of the original image, thereby pushing the autoencoder to generate more realistic reconstructed images to confuse the discriminator.
  • when training is complete, the occluded area repaired by the autoencoder is almost indistinguishable from the corresponding area of the original image.
  • at this point, the image repaired by the autoencoder is also almost indistinguishable from the original image.
  • the latent space constraint adversarial learning module can inhibit the generalization ability of the autoencoder.
  • the discriminator in this module can constrain the encoding vector of the input sample to conform to or be close to a uniform distribution, that is, to make the sample vector be distributed in any position of the latent space with the same probability.
  • specifically, in each iteration, the sample vector and a vector randomly sampled from the uniform distribution (that is, U in Figure 6) are sent to the discriminator, which discriminates between the two, thereby pushing the autoencoder to encode input samples into vectors that conform to or approximate the sampled vectors.
  • when the discriminator cannot distinguish whether a vector comes from random uniform sampling or from a sample encoding, the distribution of normal samples in the latent space conforms to the uniform distribution on (0, 1). The latent space can then be regarded as filled with normal samples following the uniform distribution, so at test time the encoding vector of an abnormal sample is close to existing normal-sample encoding vectors; the abnormal sample input to the autoencoder will then differ considerably from the reconstructed image output by the autoencoder, that is, the autoencoder cannot reconstruct abnormal samples well, which makes it easier to distinguish abnormal samples.
  • in each iteration, the autoencoder outputs the sample vector and the reconstructed image; the image reconstruction adversarial learning module then crops the reconstructed occluded region from the reconstructed image and compares it with the occluded region of the original image, and the latent space constraint adversarial learning module compares the sample vector with the randomly sampled vector. Three losses are thus obtained in one iteration; their weighted combination is used to update the autoencoder, while the image reconstruction adversarial learning module updates its own discriminator with the loss it outputs, and the latent space constraint adversarial learning module updates its own discriminator with the loss it outputs. Training iterates in this way until the weighted combination of the three losses no longer changes, at which point the convergence condition is considered met, the autoencoder is output, and the two adversarial learning modules are discarded.
  • after the autoencoder and the dual adversarial learning modules are trained, all parameters of the autoencoder are frozen during the inference phase. Only the autoencoder is used for inference, and the reconstruction similarity of the whole image is used as the criterion for abnormality; when the similarity is less than the threshold, the image is identified as abnormal.
  • the autoencoder can run on a server or an edge device with certain computing power.
  • autoencoders required in different scenarios can be trained, including but not limited to monitoring video analysis, medical image detection, intelligent transportation system and other scenarios.
  • the structure of the autoencoder and discriminator does not need to be changed in any way, and only needs to be retrained using the data set in the new scene.
  • the autoencoder trained in this embodiment was experimentally verified on the three open-source datasets MNIST, CIFAR10, and UCSD-Ped2.
  • with AUC (Area Under the Curve) as the performance metric, the scheme achieved excellent classification performance of 0.940, 0.642, and 0.941 on the above three datasets, respectively.
  • AUC is a performance evaluation metric for classification tasks; its value lies in the range 0-1, and a larger value indicates better performance.
  • MNIST is a dataset of handwritten digits 0-9;
  • CIFAR10 is a 10-class natural image dataset;
  • UCSD-Ped2 is a dedicated anomaly detection dataset for surveillance scenes.
  • this embodiment pushes the autoencoder to learn deeper features and makes the encoding vectors of abnormal samples close to those of normal samples, which reduces the generalization ability of the autoencoder, also solves the identity mapping problem, and improves the anomaly detection performance of the autoencoder.
  • RAM: random access memory
  • ROM: read-only memory
  • EPROM: electrically programmable ROM
  • EEPROM: electrically erasable programmable ROM
  • registers, hard disk, removable disk, CD-ROM, or any other known form of readable storage medium

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An autoencoder training method and components, and an abnormal image detection method and components. In a given iteration, the method trains the autoencoder, the vector discriminator, and the reconstruction discriminator separately on the same sample image, so that the autoencoder improves its ability to reconstruct images, the vector discriminator constrains the sample vector to approach a uniform distribution, and the reconstruction discriminator improves its ability to distinguish the original occluded region from the reconstructed occluded region. Through this adversarial training and learning, the possibility of identity mapping and the generalization ability of the autoencoder are reduced, so that the autoencoder ultimately reconstructs only normal images well, thereby improving its detection accuracy for abnormal images.

Description

Autoencoder training method and components, and abnormal image detection method and components
This application claims priority to the Chinese patent application filed with the China Patent Office on September 8, 2021, with application number 202111046489.2 and title "自编码器训练方法及组件,异常图像检测方法及组件" (Autoencoder training method and components, and abnormal image detection method and components), the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to an autoencoder training method and components, and an abnormal image detection method and components.
Background
At present, existing autoencoders suffer from the identity mapping problem and from excessive generalization.
Identity mapping means that, since the purpose of an autoencoder is to make the reconstructed output image as similar as possible to the original image, the autoencoder tends to copy the input directly to the output when the constraints are insufficient, because doing so is rated best. Identity mapping causes the autoencoder to learn nothing meaningful: it reconstructs both normal and abnormal samples well and cannot distinguish between the two.
Excessive generalization means that when normal images and abnormal images are fairly similar, the trained autoencoder can hardly distinguish them and still reconstructs unseen abnormal images well.
Therefore, how to improve the detection accuracy of autoencoders for abnormal images is a problem to be solved by those skilled in the art.
Summary
In view of this, the purpose of the present application is to provide an autoencoder training method and components, and an abnormal image detection method and components, so as to improve the detection accuracy of the autoencoder for abnormal images. The specific scheme is as follows.
In a first aspect, the present application provides an autoencoder training method, including:
acquiring a sample image from a normal image set;
randomly occluding a partial area of the sample image to obtain a training sample;
inputting the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector;
calculating a first loss between the reconstructed image and the sample image;
calculating a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector;
calculating a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area;
determining a comprehensive loss based on the first loss, the second loss, and the third loss;
if the comprehensive loss meets a preset convergence condition, determining the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
Preferably, the method further includes:
if the comprehensive loss does not meet the preset convergence condition, updating the parameters of the autoencoder based on the comprehensive loss to obtain an updated autoencoder;
updating the parameters of the vector discriminator based on the second loss to obtain an updated vector discriminator;
updating the parameters of the reconstruction discriminator based on the third loss to obtain an updated reconstruction discriminator;
re-acquiring a sample image from the normal image set and performing the subsequent steps, so as to iteratively train the updated autoencoder, the updated vector discriminator, and the updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
Preferably, randomly occluding a partial area of the sample image to obtain a training sample includes:
randomly occluding a partial area of the sample image with a rectangular frame to obtain the training sample.
Preferably, the aspect ratio of the rectangular frame is 1:1, and the rectangular frame covers 10% of the sample image.
In a second aspect, the present application provides an abnormal image detection method, including:
inputting an image to be detected into a trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected, the autoencoder being trained according to the training method described above;
if the similarity between the target image and the image to be detected is less than a preset threshold, determining that the image to be detected is an abnormal image; otherwise, determining that the image to be detected is a normal image.
Preferably, if the image to be detected is a traffic surveillance image, the abnormal image includes a traffic violation; if the image to be detected is a medical image, the abnormal image includes a lesion.
In a third aspect, the present application provides an autoencoder training device, including:
an acquisition module configured to acquire a sample image from a normal image set;
an occlusion module configured to randomly occlude a partial area of the sample image to obtain a training sample;
a processing module configured to input the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector;
a first calculation module configured to calculate a first loss between the reconstructed image and the sample image;
a second calculation module configured to calculate a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector;
a third calculation module configured to calculate a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area;
a fourth calculation module configured to determine a comprehensive loss based on the first loss, the second loss, and the third loss;
an output module configured to, if the comprehensive loss meets a preset convergence condition, determine the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
In a fourth aspect, the present application provides an abnormal image detection device, including:
a detection module configured to input an image to be detected into a trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected, the autoencoder being trained according to the training method described above;
a determination module configured to, if the similarity between the target image and the image to be detected is less than a preset threshold, determine that the image to be detected is an abnormal image; otherwise, determine that the image to be detected is a normal image.
In a fifth aspect, the present application provides an electronic device, including:
a memory configured to store a computer program;
a processor configured to execute the computer program to implement any method disclosed above.
In a sixth aspect, the present application provides a readable storage medium for storing a computer program, wherein, when executed by a processor, the computer program implements any method disclosed above.
It can be seen from the above scheme that the present application provides an autoencoder training method, including: acquiring a sample image from a normal image set; randomly occluding a partial area of the sample image to obtain a training sample; inputting the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector; calculating a first loss between the reconstructed image and the sample image; calculating a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector; calculating a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area; determining a comprehensive loss based on the first loss, the second loss, and the third loss; and, if the comprehensive loss meets a preset convergence condition, determining the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
As can be seen, the present application presets an autoencoder in an initial state, a vector discriminator in an initial state, and a reconstruction discriminator in an initial state. In a given iteration, the sample image is randomly occluded and then fed to the autoencoder for training, which turns the training process into an image inpainting task. The inpainting task enables the autoencoder to learn deeper image features (such as the contextual information between occluded and unoccluded regions), thereby reducing the possibility of identity mapping. Meanwhile, the vector discriminator constrains the sample vector to approach the uniform distribution, that is, it constrains the vectors the autoencoder produces when encoding images to approach the uniform distribution; the vectors obtained by encoding abnormal images will then also approach the uniform distribution, so the abnormal images output by the autoencoder will be close to normal images. The autoencoder's ability to reconstruct abnormal images is thereby constrained: it cannot accurately reconstruct abnormal images and reconstructs only normal images well, so abnormal images are easier to detect and the generalization ability of the autoencoder is reduced. Within one iteration, the present application trains the autoencoder, the vector discriminator, and the reconstruction discriminator separately on the same sample image, so that the autoencoder improves its ability to reconstruct images, the vector discriminator improves its ability to distinguish the sample vector from the sampling vector (thereby constraining the sample vector to be closer to the uniform distribution), and the reconstruction discriminator improves its ability to distinguish the original occluded region from the reconstructed occluded region, which in turn pushes the autoencoder to improve its reconstruction ability with the help of the reconstruction discriminator. This adversarial training and learning reduces the possibility of identity mapping and the generalization ability of the autoencoder, and ultimately makes the autoencoder reconstruct only normal images well, thereby improving its detection accuracy for abnormal images.
Accordingly, the autoencoder training method and components and the abnormal image detection method and components provided by the present application, where the components include the device, the apparatus, and the readable storage medium, also have the above technical effects.
Brief Description of the Drawings
In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an autoencoder training method disclosed in the present application;
Fig. 2 is a flowchart of an abnormal image detection method disclosed in the present application;
Fig. 3 is a schematic diagram of an autoencoder training device disclosed in the present application;
Fig. 4 is a schematic diagram of an abnormal image detection device disclosed in the present application;
Fig. 5 is a schematic diagram of an electronic device disclosed in the present application;
Fig. 6 is a schematic diagram of an autoencoder training process disclosed in the present application.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
At present, existing autoencoders suffer from the identity mapping problem and from excessive generalization. To this end, the present application provides a training scheme that can improve the detection accuracy of the autoencoder for abnormal images.
Referring to Fig. 1, an embodiment of the present application discloses an autoencoder training method, including:
S101. Acquire a sample image from a normal image set.
S102. Randomly occlude a partial area of the sample image to obtain a training sample.
In a specific implementation, randomly occluding a partial area of the sample image to obtain a training sample includes: randomly occluding a partial area of the sample image with a rectangular frame to obtain the training sample. The aspect ratio of the rectangular frame is 1:1, and the rectangular frame covers 10% of the sample image. Of course, the aspect ratio of the rectangular frame may be different, and the proportion of the sample image covered by the rectangular frame may also be adjusted flexibly.
The specific means of occluding the partial area of the sample image is to add Gaussian noise to that area.
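As a non-authoritative illustration of this occlusion step, the following Python sketch (assuming PyTorch-style image tensors; the function name occlude_with_noise and the standard-normal noise are our own choices, not prescribed by the application) masks a square block covering roughly 10% of the image with Gaussian noise and also returns the block position for later use:

```python
import torch

def occlude_with_noise(image: torch.Tensor, area_ratio: float = 0.10):
    """Randomly occlude a square region of `image` (C, H, W) with Gaussian noise.

    Sketch of S102: the block has aspect ratio 1:1, covers `area_ratio` of the
    image, and is filled with Gaussian noise. Returns (occluded image, region).
    """
    c, h, w = image.shape
    side = int((area_ratio * h * w) ** 0.5)            # square block, ~10% of the image area
    top = torch.randint(0, h - side + 1, (1,)).item()
    left = torch.randint(0, w - side + 1, (1,)).item()
    occluded = image.clone()
    occluded[:, top:top + side, left:left + side] = torch.randn(c, side, side)
    return occluded, (top, left, side)
```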
S103. Input the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector.
S104. Calculate a first loss between the reconstructed image and the sample image.
The first loss between the reconstructed image and the sample image may be calculated with any loss function, that is, each pixel of the reconstructed image is compared with the corresponding pixel of the sample image, and the first loss is determined based on the differences between pixels at the same positions in the two images.
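A minimal sketch of such a pixel-wise first loss, assuming PyTorch tensors (the L1 distance chosen here is our own assumption; the application only requires some pixel-wise loss function):

```python
import torch
import torch.nn.functional as F

def first_loss(reconstructed: torch.Tensor, sample: torch.Tensor) -> torch.Tensor:
    """Pixel-wise reconstruction loss between reconstructed image and sample image (S104)."""
    return F.l1_loss(reconstructed, sample)   # mean absolute difference over all pixels
```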
S105. Calculate a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state; the sampling vector is sampled from a preset uniform distribution and has the same dimension as the target vector.
S106. Calculate a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state; the reconstructed area is the area of the reconstructed image corresponding to the partial area.
In this embodiment, S104, S105, and S106 may be executed in parallel or separately in any order, and likewise S110, S111, and S112 may be executed in parallel or separately in any order.
S107. Determine a comprehensive loss based on the first loss, the second loss, and the third loss.
S108. Judge whether the comprehensive loss meets a preset convergence condition; if so, execute S109; otherwise, execute S110.
S109. Determine the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
S110. Update the parameters of the autoencoder based on the comprehensive loss to obtain an updated autoencoder.
S111. Update the parameters of the vector discriminator based on the second loss to obtain an updated vector discriminator.
S112. Update the parameters of the reconstruction discriminator based on the third loss to obtain an updated reconstruction discriminator, and execute S101 to re-acquire a sample image from the normal image set, then perform the subsequent steps to iteratively train the updated autoencoder, the updated vector discriminator, and the updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
It should be noted that the autoencoder in the initial state may be built on a convolutional neural network, or the generator of a generative adversarial network may be used. The vector discriminator and the reconstruction discriminator in the initial state may use the discriminator of a generative adversarial network.
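A minimal, non-authoritative sketch of such components, assuming PyTorch, 64x64 RGB inputs, and small convolutional networks (layer sizes, the latent dimension, and the class names ConvAutoencoder and MLPDiscriminator are illustrative choices, not prescribed by the application):

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Convolutional encoder-decoder: image -> target vector -> reconstructed image."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32 (assumes 64x64 input)
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim), nn.Sigmoid(),     # keep codes in [0, 1]
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)            # target vector
        return z, self.decoder(z)      # target vector, reconstructed image

class MLPDiscriminator(nn.Module):
    """Illustrative vector discriminator; a small CNN over image patches would
    play the same role for the reconstruction discriminator."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

    def forward(self, v):
        return self.net(v)             # raw logit: sampled ("real") vs encoded ("fake")
```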
In a given iteration, the sample image is randomly occluded and then fed to the autoencoder for training, which turns the training process into an image inpainting task. The inpainting task enables the autoencoder to learn deeper image features (such as the contextual information between occluded and unoccluded regions), thereby reducing the possibility of identity mapping.
The vector discriminator constrains the sample vector to approach the uniform distribution, that is, it constrains the vectors obtained when the autoencoder encodes images to approach the uniform distribution; the vectors obtained by encoding abnormal images will then also approach the uniform distribution, so the abnormal images output by the autoencoder will be close to normal images. The autoencoder's ability to reconstruct abnormal images is thereby constrained: it cannot accurately reconstruct abnormal images and reconstructs only normal images well, so abnormal images become easier to detect and the generalization ability of the autoencoder is reduced.
As can be seen, this embodiment presets an autoencoder in an initial state, a vector discriminator in an initial state, and a reconstruction discriminator in an initial state. Within one iteration, the autoencoder, the vector discriminator, and the reconstruction discriminator are trained separately on the same sample image, so that the autoencoder improves its ability to reconstruct images, the vector discriminator improves its ability to distinguish the sample vector from the sampling vector (thereby constraining the sample vector to be closer to the uniform distribution), and the reconstruction discriminator improves its ability to distinguish the original occluded region from the reconstructed occluded region, which in turn pushes the autoencoder to improve its reconstruction ability with the help of the reconstruction discriminator. This adversarial training and learning reduces the possibility of identity mapping and the generalization ability of the autoencoder, and ultimately makes the autoencoder reconstruct only normal images well, thereby improving its detection accuracy for abnormal images.
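Building on the illustrative modules above, the following Python sketch shows one possible training iteration under our own assumptions: binary cross-entropy adversarial losses, equal loss weights, and separate optimizers are our choices, since the application only specifies that a comprehensive loss is formed from the three losses and that each discriminator is updated with its own loss. Here `vec_disc` is assumed to be a vector discriminator and `rec_disc` a patch discriminator over image regions; `region` comes from the occlusion sketch above.

```python
import torch
import torch.nn.functional as F

def train_step(ae, vec_disc, rec_disc, opt_ae, opt_vec, opt_rec,
               sample, training_sample, region):
    """One iteration covering S103-S112 (loss forms and weights are assumptions)."""
    top, left, side = region
    real_region = sample[:, :, top:top + side, left:left + side]

    # S103: encode the occluded training sample and reconstruct the image
    target_vec, recon = ae(training_sample)
    recon_region = recon[:, :, top:top + side, left:left + side]

    # S104: first loss (pixel-wise reconstruction loss)
    loss1 = F.l1_loss(recon, sample)

    # S105: second loss -- the encoder tries to make its vector look "uniform" to vec_disc
    logits_vec = vec_disc(target_vec)
    loss2 = F.binary_cross_entropy_with_logits(logits_vec, torch.ones_like(logits_vec))

    # S106: third loss -- the reconstructed occluded region should fool rec_disc
    logits_rec = rec_disc(recon_region)
    loss3 = F.binary_cross_entropy_with_logits(logits_rec, torch.ones_like(logits_rec))

    # S107 + S110: comprehensive loss (equal weights assumed) updates the autoencoder
    comprehensive = loss1 + loss2 + loss3
    opt_ae.zero_grad(); comprehensive.backward(); opt_ae.step()

    # S111: update the vector discriminator with its own loss
    sampling_vec = torch.rand_like(target_vec)       # sampled from the preset uniform distribution
    real_logit = vec_disc(sampling_vec)
    fake_logit = vec_disc(target_vec.detach())
    loss_vec = (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
                + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit)))
    opt_vec.zero_grad(); loss_vec.backward(); opt_vec.step()

    # S112: update the reconstruction discriminator with its own loss
    real_logit = rec_disc(real_region)
    fake_logit = rec_disc(recon_region.detach())
    loss_rec = (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
                + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit)))
    opt_rec.zero_grad(); loss_rec.backward(); opt_rec.step()

    return comprehensive.item()    # monitored against the convergence condition (S108)
```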
Referring to Fig. 2, an embodiment of the present application discloses an abnormal image detection method, including:
S201. Input an image to be detected into the autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected; the autoencoder is trained according to the training method described above.
S202. Judge whether the similarity between the target image and the image to be detected is less than a preset threshold; if so, execute S203; otherwise, execute S204.
S203. Determine that the image to be detected is an abnormal image.
S204. Determine that the image to be detected is a normal image.
As described above, the autoencoder trained in the present application reconstructs only normal images well and cannot reconstruct abnormal images. Therefore, if the image output by the autoencoder differs considerably from the input image (that is, their similarity is small), the current input image is an abnormal image; otherwise, the current input image is a normal image.
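A minimal sketch of this decision rule, assuming a trained autoencoder `ae` as above; the concrete similarity measure (here a simple 1 - MSE proxy) and the threshold value are our own assumptions, since the application only requires comparing an overall reconstruction similarity against a preset threshold:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def is_abnormal(ae, image: torch.Tensor, threshold: float = 0.9) -> bool:
    """S201-S204: reconstruct the image and compare its similarity to a preset threshold."""
    _, target_image = ae(image.unsqueeze(0))          # vector to be detected, target image
    mse = F.mse_loss(target_image, image.unsqueeze(0))
    similarity = 1.0 - mse.item()                     # crude similarity proxy
    return similarity < threshold                     # below threshold -> abnormal image
```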
In a specific implementation, if the image to be detected is a traffic surveillance image, the abnormal image includes a traffic violation; if the image to be detected is a medical image, the abnormal image includes a lesion. The image to be detected may thus be a traffic surveillance image, a shopping mall surveillance image, a medical image, and the like.
When the image to be detected is a traffic surveillance image, the normal image set in S101 of the above embodiment consists of a plurality of traffic images that contain no violations. When the image to be detected is a medical image, the normal image set in S101 consists of a plurality of medical images that contain no lesions. It can be seen that the normal image set used to train the autoencoder must correspond to the usage scenario of the autoencoder. Of course, the autoencoder may also be used to examine images to determine whether behaviors such as theft or rioting are present.
As can be seen, abnormal images can be detected accurately with the autoencoder provided by this embodiment.
An autoencoder training device provided by an embodiment of the present application is introduced below; the autoencoder training device described below and the autoencoder training method described above may refer to each other.
Referring to Fig. 3, an embodiment of the present application discloses an autoencoder training device, including:
an acquisition module 301 configured to acquire a sample image from a normal image set;
an occlusion module 302 configured to randomly occlude a partial area of the sample image to obtain a training sample;
a processing module 303 configured to input the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector;
a first calculation module 304 configured to calculate a first loss between the reconstructed image and the sample image;
a second calculation module 305 configured to calculate a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector;
a third calculation module 306 configured to calculate a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area;
a fourth calculation module 307 configured to determine a comprehensive loss based on the first loss, the second loss, and the third loss;
an output module 308 configured to, if the comprehensive loss meets a preset convergence condition, determine the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
In a specific implementation, the device further includes:
a first update module configured to, if the comprehensive loss does not meet the preset convergence condition, update the parameters of the autoencoder based on the comprehensive loss to obtain an updated autoencoder;
a second update module configured to update the parameters of the vector discriminator based on the second loss to obtain an updated vector discriminator;
a third update module configured to update the parameters of the reconstruction discriminator based on the third loss to obtain an updated reconstruction discriminator;
an iteration module configured to re-acquire a sample image from the normal image set and perform the subsequent steps, so as to iteratively train the updated autoencoder, the updated vector discriminator, and the updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
In a specific implementation, the occlusion module is specifically configured to:
randomly occlude a partial area of the sample image with a rectangular frame to obtain the training sample.
In a specific implementation, the aspect ratio of the rectangular frame is 1:1, and the rectangular frame covers 10% of the sample image.
For more specific working processes of the modules and units in this embodiment, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be repeated here.
As can be seen, this embodiment provides an autoencoder training device that can reduce the possibility of identity mapping and the generalization ability of the autoencoder, ultimately making the autoencoder reconstruct only normal images well and thereby improving its detection accuracy for abnormal images.
An abnormal image detection device provided by an embodiment of the present application is introduced below; the abnormal image detection device described below and the abnormal image detection method described above may refer to each other.
Referring to Fig. 4, an embodiment of the present application discloses an abnormal image detection device, including:
a detection module 401 configured to input an image to be detected into a trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected, the autoencoder being trained according to the training method described above;
a determination module 402 configured to, if the similarity between the target image and the image to be detected is less than a preset threshold, determine that the image to be detected is an abnormal image; otherwise, determine that the image to be detected is a normal image.
In a specific implementation, if the image to be detected is a traffic surveillance image, the abnormal image includes a traffic violation; if the image to be detected is a medical image, the abnormal image includes a lesion.
For more specific working processes of the modules and units in this embodiment, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be repeated here.
As can be seen, this embodiment provides an abnormal image detection device that can detect abnormal images accurately.
An electronic device provided by an embodiment of the present application is introduced below; the electronic device described below and any of the methods and devices described above may refer to each other.
Referring to Fig. 5, an embodiment of the present application discloses an electronic device, including:
a memory 501 configured to store a computer program;
a processor 502 configured to execute the computer program to implement the method disclosed in any of the foregoing embodiments.
A readable storage medium provided by an embodiment of the present application is introduced below; the readable storage medium described below and any of the methods, devices, and apparatuses described above may refer to each other.
A readable storage medium is used to store a computer program, wherein, when executed by a processor, the computer program implements any method disclosed in the foregoing embodiments. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be repeated here.
To introduce the present application more clearly, the following embodiment adds a dual adversarial learning module on the basis of a denoising autoencoder, namely an image reconstruction adversarial learning module and a latent space constraint adversarial learning module; see Fig. 6 for details.
A denoising autoencoder normally performs a denoising task, that is, randomly distributed noise points are added to a training image (a normal image) before it is fed into the autoencoder for reconstruction. To increase the difficulty of image reconstruction, this embodiment changes the denoising task into an image inpainting task, that is, a rectangular occlusion is added to the image before it is fed into the autoencoder for reconstruction.
The improved denoising autoencoder is used to reconstruct the occluded image. Compared with an image denoising task, the image inpainting task enables the autoencoder to learn the contextual information between the occluded and unoccluded regions, that is, it pushes the autoencoder to learn deeper image features and reduces the possibility of identity mapping.
The image reconstruction adversarial learning module improves the inpainting quality of the occluded region. This module borrows the adversarial learning idea of generative adversarial networks, and the autoencoder learns adversarially against it. The discriminator in this module gradually improves its ability to compare the reconstructed rectangular occluded region with the corresponding region of the original image, thereby pushing the autoencoder to generate more realistic reconstructions to confuse the discriminator. When training is complete, the occluded region repaired by the autoencoder is almost indistinguishable from the corresponding region of the original image; at that point the image repaired by the autoencoder is, of course, also almost indistinguishable from the original image.
The latent space constraint adversarial learning module suppresses the generalization ability of the autoencoder. The discriminator in this module constrains the encoding vector of an input sample to conform to, or be close to, a uniform distribution, that is, it makes the sample vector occupy any position of the latent space with equal probability. Specifically, in each iteration, the sample vector and a vector randomly sampled from the uniform distribution (U in Fig. 6) are fed into the discriminator to discriminate between the two, thereby pushing the autoencoder to encode input samples into vectors that conform to or approximate the sampled ones. When the discriminator can no longer tell whether a vector comes from random uniform sampling or from a sample encoding, the distribution of normal samples in the latent space conforms to the uniform distribution on (0, 1). The latent space can then be regarded as filled with normal samples following the uniform distribution, so at test time the encoding vector of an abnormal sample is close to existing normal-sample encoding vectors; the abnormal sample fed to the autoencoder therefore differs considerably from the reconstructed image output by the autoencoder, that is, the autoencoder cannot reconstruct abnormal samples well, which makes it easier to distinguish them.
Based on the training process shown in Fig. 6, in each iteration the autoencoder outputs the sample vector and the reconstructed image; the image reconstruction adversarial learning module then crops the reconstructed occluded region from the reconstructed image and compares it with the occluded region of the original image, and the latent space constraint adversarial learning module compares the sample vector with the randomly sampled vector. Three losses are therefore obtained in one iteration: their weighted combination is used to update the autoencoder, while the image reconstruction adversarial learning module updates its own discriminator with the loss it outputs, and the latent space constraint adversarial learning module updates its own discriminator with the loss it outputs. Training iterates in this way until the weighted combination of the three losses no longer changes, at which point the convergence condition is considered met; the autoencoder is output, and the image reconstruction adversarial learning module and the latent space constraint adversarial learning module are discarded.
After the autoencoder and the dual adversarial learning modules have been trained, all parameters of the autoencoder are frozen in the inference phase. Only the autoencoder is used for inference, and the reconstruction similarity of the whole image serves as the criterion for abnormality: when the similarity is below the threshold, the image is identified as abnormal. At inference time the autoencoder can run on a server or on an edge device with sufficient computing power.
Following this embodiment, autoencoders required in different scenarios can be trained, including but not limited to surveillance video analysis, medical image detection, and intelligent transportation systems. When applied to a new scenario, the structures of the autoencoder and the discriminators need not be changed at all; only retraining with the dataset of the new scenario is required.
The autoencoder trained in this embodiment was experimentally verified on the three open-source datasets MNIST, CIFAR10, and UCSD-Ped2. With AUC (Area Under the Curve) as the performance metric, the scheme achieved excellent classification performance of 0.940, 0.642, and 0.941 on the three datasets, respectively. AUC is a performance evaluation metric for classification tasks; its value lies in the range 0-1, and larger values indicate better performance.
MNIST is a dataset of handwritten digits 0-9, CIFAR10 is a 10-class natural image dataset, and UCSD-Ped2 is a dedicated anomaly detection dataset for surveillance scenes.
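As an illustration of how such an AUC score can be computed from per-image anomaly scores, the sketch below uses scikit-learn's roc_auc_score; defining the anomaly score as the reconstruction error, and the variable names, are our own assumptions and are not part of the reported experiments:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_auc(reconstruction_errors: np.ndarray, labels: np.ndarray) -> float:
    """AUC of anomaly detection: a higher reconstruction error should indicate an abnormal image.

    `reconstruction_errors`: one error per test image (e.g., mean squared error).
    `labels`: 1 for abnormal images, 0 for normal images.
    """
    return roc_auc_score(labels, reconstruction_errors)
```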
As can be seen, this embodiment pushes the autoencoder to learn deeper features and makes the encoding vectors of abnormal samples close to those of normal samples, which reduces the generalization ability of the autoencoder, also solves the identity mapping problem, and improves the anomaly detection performance of the autoencoder.
References in the present application to "first", "second", "third", "fourth", and so on (if present) are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, or device comprising a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or are inherent to such processes, methods, or devices.
It should be noted that descriptions involving "first", "second", and the like in the present application are for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments can be combined with one another, but only on the basis that they can be implemented by a person of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be implemented, such a combination should be considered not to exist and to fall outside the protection scope claimed by the present application.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the art.
Specific examples have been used herein to explain the principles and implementations of the present application; the descriptions of the above embodiments are only intended to help understand the method of the present application and its core ideas. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application based on the ideas of the present application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (10)

  1. An autoencoder training method, characterized by comprising:
    acquiring a sample image from a normal image set;
    randomly occluding a partial area of the sample image to obtain a training sample;
    inputting the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector;
    calculating a first loss between the reconstructed image and the sample image;
    calculating a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector;
    calculating a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area;
    determining a comprehensive loss based on the first loss, the second loss, and the third loss;
    if the comprehensive loss meets a preset convergence condition, determining the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
  2. The method according to claim 1, characterized by further comprising:
    if the comprehensive loss does not meet the preset convergence condition, updating the parameters of the autoencoder based on the comprehensive loss to obtain an updated autoencoder;
    updating the parameters of the vector discriminator based on the second loss to obtain an updated vector discriminator;
    updating the parameters of the reconstruction discriminator based on the third loss to obtain an updated reconstruction discriminator;
    re-acquiring a sample image from the normal image set and performing the subsequent steps, so as to iteratively train the updated autoencoder, the updated vector discriminator, and the updated reconstruction discriminator until the comprehensive loss meets the preset convergence condition.
  3. The method according to claim 1, characterized in that randomly occluding a partial area of the sample image to obtain a training sample comprises:
    randomly occluding a partial area of the sample image with a rectangular frame to obtain the training sample.
  4. The method according to claim 3, characterized in that the aspect ratio of the rectangular frame is 1:1, and the rectangular frame covers 10% of the sample image.
  5. An abnormal image detection method, characterized by comprising:
    inputting an image to be detected into a trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected, the autoencoder being trained by the method according to any one of claims 1 to 4;
    if the similarity between the target image and the image to be detected is less than a preset threshold, determining that the image to be detected is an abnormal image; otherwise, determining that the image to be detected is a normal image.
  6. The method according to claim 5, characterized in that, if the image to be detected is a traffic surveillance image, the abnormal image includes a traffic violation.
  7. An autoencoder training device, characterized by comprising:
    an acquisition module configured to acquire a sample image from a normal image set;
    an occlusion module configured to randomly occlude a partial area of the sample image to obtain a training sample;
    a processing module configured to input the training sample into an autoencoder in an initial state, so that the autoencoder encodes the training sample to obtain a target vector and obtains a reconstructed image based on the target vector;
    a first calculation module configured to calculate a first loss between the reconstructed image and the sample image;
    a second calculation module configured to calculate a second loss between the target vector and a sampling vector by using a vector discriminator in an initial state, the sampling vector being sampled from a preset uniform distribution and having the same dimension as the target vector;
    a third calculation module configured to calculate a third loss between a reconstructed area and the partial area by using a reconstruction discriminator in an initial state, the reconstructed area being the area of the reconstructed image corresponding to the partial area;
    a fourth calculation module configured to determine a comprehensive loss based on the first loss, the second loss, and the third loss;
    an output module configured to, if the comprehensive loss meets a preset convergence condition, determine the current autoencoder as an autoencoder capable of reconstructing only normal images, so as to detect abnormal images using the current autoencoder.
  8. An abnormal image detection device, characterized by comprising:
    a detection module configured to input an image to be detected into a trained autoencoder, so that the autoencoder encodes the image to be detected to obtain a vector to be detected and obtains a target image based on the vector to be detected, the autoencoder being trained by the method according to any one of claims 1 to 4;
    a determination module configured to, if the similarity between the target image and the image to be detected is less than a preset threshold, determine that the image to be detected is an abnormal image; otherwise, determine that the image to be detected is a normal image.
  9. An electronic device, characterized by comprising:
    a memory configured to store a computer program;
    a processor configured to execute the computer program to implement the method according to any one of claims 1 to 6.
  10. A readable storage medium, characterized by being used to store a computer program, wherein, when executed by a processor, the computer program implements the method according to any one of claims 1 to 6.
PCT/CN2021/134411 2021-09-08 2021-11-30 自编码器训练方法及组件,异常图像检测方法及组件 WO2023035425A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111046489.2 2021-09-08
CN202111046489.2A CN113487521A (zh) 2021-09-08 2021-09-08 自编码器训练方法及组件,异常图像检测方法及组件

Publications (1)

Publication Number Publication Date
WO2023035425A1 true WO2023035425A1 (zh) 2023-03-16

Family

ID=77946487

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/134411 WO2023035425A1 (zh) 2021-09-08 2021-11-30 自编码器训练方法及组件,异常图像检测方法及组件

Country Status (2)

Country Link
CN (1) CN113487521A (zh)
WO (1) WO2023035425A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372791A (zh) * 2023-12-08 2024-01-09 齐鲁空天信息研究院 细粒度定向能毁伤区域检测方法、装置及存储介质

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734669B (zh) * 2021-01-07 2022-12-02 苏州浪潮智能科技有限公司 一种基于改进降噪自编码器的异常检测模型的训练方法
CN113487521A (zh) * 2021-09-08 2021-10-08 苏州浪潮智能科技有限公司 自编码器训练方法及组件,异常图像检测方法及组件
CN116403269B (zh) * 2023-05-17 2024-03-26 智慧眼科技股份有限公司 一种遮挡人脸解析方法、系统、设备及计算机存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520503A (zh) * 2018-04-13 2018-09-11 湘潭大学 一种基于自编码器和生成对抗网络修复人脸缺损图像的方法
CN112101426A (zh) * 2020-08-26 2020-12-18 东南大学 基于自编码器的无监督学习图像异常检测方法
EP3798916A1 (en) * 2019-09-24 2021-03-31 Another Brain Transformation of data samples to normal data
CN112734669A (zh) * 2021-01-07 2021-04-30 苏州浪潮智能科技有限公司 一种基于改进降噪自编码器的异常检测模型的训练方法
CN113487521A (zh) * 2021-09-08 2021-10-08 苏州浪潮智能科技有限公司 自编码器训练方法及组件,异常图像检测方法及组件

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112435258A (zh) * 2020-12-17 2021-03-02 深圳市华汉伟业科技有限公司 一种图像检测模型的构建方法、图像检测方法及装置
CN112419318A (zh) * 2020-12-17 2021-02-26 深圳市华汉伟业科技有限公司 一种基于多路级联反馈的异常检测方法及装置、存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520503A (zh) * 2018-04-13 2018-09-11 湘潭大学 一种基于自编码器和生成对抗网络修复人脸缺损图像的方法
EP3798916A1 (en) * 2019-09-24 2021-03-31 Another Brain Transformation of data samples to normal data
CN112101426A (zh) * 2020-08-26 2020-12-18 东南大学 基于自编码器的无监督学习图像异常检测方法
CN112734669A (zh) * 2021-01-07 2021-04-30 苏州浪潮智能科技有限公司 一种基于改进降噪自编码器的异常检测模型的训练方法
CN113487521A (zh) * 2021-09-08 2021-10-08 苏州浪潮智能科技有限公司 自编码器训练方法及组件,异常图像检测方法及组件

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372791A (zh) * 2023-12-08 2024-01-09 齐鲁空天信息研究院 细粒度定向能毁伤区域检测方法、装置及存储介质
CN117372791B (zh) * 2023-12-08 2024-03-22 齐鲁空天信息研究院 细粒度定向能毁伤区域检测方法、装置及存储介质

Also Published As

Publication number Publication date
CN113487521A (zh) 2021-10-08

Similar Documents

Publication Publication Date Title
WO2023035425A1 (zh) 自编码器训练方法及组件,异常图像检测方法及组件
Mayer et al. Exposing fake images with forensic similarity graphs
US20230022943A1 (en) Method and system for defending against adversarial sample in image classification, and data processing terminal
Ma et al. Towards a universal model for cross-dataset crowd counting
CN107886048A (zh) 目标跟踪方法及系统、存储介质及电子终端
CN112597864B (zh) 一种监控视频异常检测方法及装置
CN113269722A (zh) 生成对抗网络的训练方法、及高分辨率图像重建方法
CN112598579A (zh) 面向监控场景的图像超分辨率方法、装置及存储介质
CN110399826B (zh) 一种端到端人脸检测和识别方法
CN113344826B (zh) 图像处理方法、装置、电子设备及存储介质
KR102606734B1 (ko) 생체 검출 방법 및 장치
US11756288B2 (en) Image processing method and apparatus, electronic device and storage medium
Wu et al. Reversible contrast enhancement for medical images with background segmentation
CN110135428B (zh) 图像分割处理方法和装置
Liao et al. First step towards parameters estimation of image operator chain
Pocevičiūtė et al. Unsupervised anomaly detection in digital pathology using GANs
Li et al. Image inpainting based on contextual coherent attention GAN
CN113537145A (zh) 目标检测中误、漏检快速解决的方法、装置及存储介质
CN113158773A (zh) 一种活体检测模型的训练方法及训练装置
CN111601181A (zh) 生成视频指纹数据的方法及装置
CN116229535A (zh) 人脸检测模型的训练方法、人脸检测方法及装置
KR102526415B1 (ko) 준지도 학습 방식의 단일 영상 깊이 추정 시스템 및 방법과 이를 위한 컴퓨터 프로그램
Guo et al. Image saliency detection based on geodesic‐like and boundary contrast maps
Jayageetha et al. Medical image quality assessment using CSO based deep neural network
CN112052863B (zh) 一种图像检测方法及装置、计算机存储介质、电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21956596

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE