CN114387451A - Training method, device and medium for abnormal image detection model - Google Patents

Training method, device and medium for abnormal image detection model

Info

Publication number
CN114387451A
CN114387451A CN202210036055.2A CN202210036055A CN114387451A CN 114387451 A CN114387451 A CN 114387451A CN 202210036055 A CN202210036055 A CN 202210036055A CN 114387451 A CN114387451 A CN 114387451A
Authority
CN
China
Prior art keywords
image
sample
feature vector
training
model
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210036055.2A
Other languages
Chinese (zh)
Inventor
袁得嵛
孟玉颜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Original Assignee
PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Application filed by PEOPLE'S PUBLIC SECURITY UNIVERSITY OF CHINA
Priority to CN202210036055.2A
Publication of CN114387451A

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F18/00 Pattern recognition
            • G06F18/20 Analysing
              • G06F18/24 Classification techniques
                • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
                • G06F18/243 Classification techniques relating to the number of classes
                  • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 Computing arrangements based on biological models
            • G06N3/02 Neural networks
              • G06N3/04 Architecture, e.g. interconnection topology
                • G06N3/045 Combinations of networks
                • G06N3/048 Activation functions
              • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a training method, a device and a medium for an abnormal image detection model. After a plurality of image training samples and corresponding sample labels are obtained, each image training sample is input, as input data, into a preset improved abnormal image detection network model to obtain an output detection result for that sample; the initial network weights of the improved abnormal image detection network model are weights pre-trained on a sample training data set from the ImageNet database, a computer vision recognition project; and the improved abnormal image detection network model is iteratively trained based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model. The method improves both the training efficiency and the detection accuracy of the model.

Description

Training method, device and medium for abnormal image detection model
Technical Field
The application relates to the technical field of image detection, in particular to a training method, a device and a medium for an abnormal image detection model.
Background
For social entities (telecommunication service operators and internet service providers), abnormal image identification technology needs to report quickly and conveniently whether a detection target is suspected violation information, thereby helping the relevant organizations and individuals avoid the corresponding legal risks; for law enforcement, the examination and identification of violation images is essentially a judicial act, the identification result should be provided by a suitably qualified organization, and detection technology can assist the examination and identification and improve working efficiency. In conclusion, it is important to build a system that automatically identifies abnormal images.
At present, common methods for identifying abnormal images include a traditional machine learning method based on extraction of feature information such as shape, color, gradient and optical flow and combined with a classifier, and a deep learning technology combined with automatic feature extraction and classification.
However, the prediction accuracy of conventional machine learning methods is low, while deep learning requires a large amount of training sample data and a basic deep neural network architecture; when the training sample data are obtained by horizontally flipping, vertically flipping, and adjusting the brightness of the original images, a basic deep neural network model is prone to overfitting during training, which reduces the accuracy of image detection.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method, an apparatus, and a medium for training an abnormal image detection model, so as to solve the above problems in the prior art and improve both the training efficiency and the detection accuracy of the model.
In a first aspect, a method for training an abnormal image detection model is provided, and the method may include:
acquiring a plurality of image training samples and corresponding sample labels;
for each image training sample, inputting the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extracting an image feature vector from the image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detecting the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weights of the improved abnormal image detection network model are weights pre-trained on a sample training data set from the ImageNet database, a computer vision recognition project;
and performing iterative training on the pre-trained detection model based on the detection result of each image training sample and the sample label of the corresponding sample to obtain a trained abnormal image detection model.
In an optional implementation, the input network structure of the improved abnormal image detection network model includes 1 convolutional layer, 1 max pooling layer, and 1 channel attention mechanism module;
the convolutional layer is used for performing feature extraction on the input data to obtain a primary image feature vector;
the channel attention mechanism module is used for compressing the primary image feature vector to obtain a global compressed feature vector of the primary image feature vector; processing the global compressed feature vector per feature channel to obtain a feature channel weight vector of the primary image feature vector; multiplying the feature channel weight vector by the primary image feature vector; and outputting a secondary image feature vector focused on the feature channels;
and the max pooling layer is used for obtaining the max-pooled image feature vector corresponding to the secondary image feature vector output by the channel attention mechanism module.
In an optional implementation, the residual network structure configured in the improved abnormal image detection network model includes a plurality of configured residual units connected in sequence;
each configured residual unit comprises 1 initial residual unit and 1 channel attention mechanism module; the channel attention mechanism module is embedded before the last convolutional layer in the initial residual unit.
In an optional implementation, when the configured residual network structure includes 4 configured residual units connected in sequence, the first configured residual unit includes 3 convolutional layers, the second includes 4 convolutional layers, the third includes 6 convolutional layers, and the fourth includes 3 convolutional layers, where the convolution kernels in each convolutional layer have the same size.
In an optional implementation, the output network structure of the improved abnormal image detection network model includes 1 channel attention mechanism module, 1 average pooling layer and 1 fully connected layer;
the channel attention mechanism module is used for compressing the image feature vector output by the configured residual network structure to obtain a global compressed feature vector of the image feature vector; processing the global compressed feature vector per feature channel to obtain a feature channel weight vector of the image feature vector; multiplying the feature channel weight vector by the image feature vector; and outputting the secondary image feature vector of the output network structure;
the average pooling layer is used for averaging the secondary image feature vector of the output network structure to obtain a compressed feature vector corresponding to the secondary image feature vector;
and the fully connected layer is used for obtaining the detection result of the image training sample based on the obtained compressed feature vector.
In an alternative implementation, obtaining a plurality of image training samples includes:
acquiring a candidate image sample;
processing the candidate image sample by adopting a preset sample quantity expansion mode to obtain a processed candidate image sample;
and determining the candidate image sample and the processed candidate image sample as an image training sample.
In an optional implementation, after obtaining the trained abnormal image detection model, the method further includes:
acquiring at least one image to be detected;
and inputting the at least one image to be detected as input data into the trained abnormal image detection model to obtain a detection result of the at least one image to be detected.
In a second aspect, there is provided an abnormal image detection model training apparatus, which may include:
the acquisition unit is used for acquiring a plurality of image training samples and corresponding sample labels;
the input unit is used for, for each image training sample, inputting the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extracting an image feature vector from the image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detecting the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weights of the improved abnormal image detection network model are weights pre-trained on a sample training data set from the ImageNet database, a computer vision recognition project;
and the training unit is used for carrying out iterative training on the improved abnormal image detection network model based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model.
In an optional implementation, the input network structure of the improved abnormal image detection network model includes 1 convolutional layer, 1 max pooling layer, and 1 channel attention mechanism module;
the convolutional layer is used for performing feature extraction on the input data to obtain a primary image feature vector;
the channel attention mechanism module is used for compressing the primary image feature vector to obtain a global compressed feature vector of the primary image feature vector; processing the global compressed feature vector per feature channel to obtain a feature channel weight vector of the primary image feature vector; multiplying the feature channel weight vector by the primary image feature vector; and outputting a secondary image feature vector focused on the feature channels;
and the max pooling layer is used for obtaining the max-pooled image feature vector corresponding to the secondary image feature vector output by the channel attention mechanism module.
In an optional implementation, the residual network structure configured in the improved abnormal image detection network model includes a plurality of configured residual units connected in sequence;
each configured residual unit comprises 1 initial residual unit and 1 channel attention mechanism module; the channel attention mechanism module is embedded before the last convolutional layer in the initial residual unit.
In an optional implementation, when the configured residual network structure includes 4 configured residual units connected in sequence, the first configured residual unit includes 3 convolutional layers, the second includes 4 convolutional layers, the third includes 6 convolutional layers, and the fourth includes 3 convolutional layers, where the convolution kernels in each convolutional layer have the same size.
In an optional implementation, the output network structure of the improved abnormal image detection network model includes 1 channel attention mechanism module, 1 average pooling layer and 1 fully connected layer;
the channel attention mechanism module is used for compressing the image feature vector output by the configured residual network structure to obtain a global compressed feature vector of the image feature vector; processing the global compressed feature vector per feature channel to obtain a feature channel weight vector of the image feature vector; multiplying the feature channel weight vector by the image feature vector; and outputting the secondary image feature vector of the output network structure;
the average pooling layer is used for averaging the secondary image feature vector of the output network structure to obtain a compressed feature vector corresponding to the secondary image feature vector;
and the fully connected layer is used for obtaining the detection result of the image training sample based on the obtained compressed feature vector.
In an optional implementation, the obtaining unit is specifically configured to:
acquiring a candidate image sample;
processing the candidate image sample by adopting a preset sample quantity expansion mode to obtain a processed candidate image sample;
and determining the candidate image sample and the processed candidate image sample as an image training sample.
In an optional implementation, the obtaining unit is further configured to obtain at least one image to be detected;
and the input unit is used for inputting the at least one image to be detected as input data into the trained abnormal image detection model to obtain the detection result of the at least one image to be detected.
In a third aspect, an electronic device is provided, which includes a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor adapted to perform the method steps of any of the above first aspects when executing a program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, having stored therein a computer program which, when executed by a processor, performs the method steps of any of the above first aspects.
According to the training method of the abnormal image detection model, after a plurality of image training samples and corresponding sample labels are obtained, each image training sample is input as input data into the input network structure of a preset improved abnormal image detection network model; the residual network structure configured in the improved abnormal image detection network model extracts image feature vectors from the image information output by the input network structure; and the output network structure of the improved abnormal image detection network model detects the extracted image feature vectors to obtain the output detection result of the image training sample. The initial network weights of the improved abnormal image detection network model are weights pre-trained on a sample training data set from the ImageNet database, a computer vision recognition project. The improved abnormal image detection network model is then iteratively trained based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model. In this method, channel attention mechanism modules are introduced between the structures of the original residual network so that the feature information within the channels is exploited. After the channel attention mechanism modules are introduced, the parameter information obtained by training on the ImageNet data set is transferred to the field of abnormal image detection, which improves the accuracy of image detection.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a schematic flowchart of a training method for an abnormal image detection model according to an embodiment of the present disclosure;
fig. 2 is a schematic model structure diagram of an improved abnormal image detection network model according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a training apparatus for an abnormal image detection model according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without any creative effort belong to the protection scope of the present application.
For convenience of understanding, terms referred to in the embodiments of the present application are explained below:
Transfer learning is a machine learning method that uses existing knowledge to solve problems in a new domain that is different from, but still related to, the original samples or problems. It aims to reuse existing knowledge by transferring and fine-tuning a network structure, and is used to address neural network training when the target domain has few or no labeled samples. The core idea is to transplant the model trained on task A to task B.
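The weight-reuse idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that model parameters are stored in a name-to-values dictionary; the function `init_from_pretrained` and parameter names such as `conv1.weight` and `se1.fc1.weight` are hypothetical and do not come from the application. Parameters whose name and size match are copied from the model trained on task A, and everything else keeps its fresh initialization.

```python
def init_from_pretrained(target_params, pretrained_params):
    """Copy pre-trained values into target_params where name and length match.

    Returns the list of parameter names that were transferred.
    """
    transferred = []
    for name, values in target_params.items():
        source = pretrained_params.get(name)
        if source is not None and len(source) == len(values):
            target_params[name] = list(source)
            transferred.append(name)
    return transferred


# Hypothetical parameter tables, flattened to plain lists for simplicity.
pretrained = {"conv1.weight": [0.5, -0.2, 0.1], "fc.weight": [0.3] * 1000}
target = {
    "conv1.weight": [0.0, 0.0, 0.0],  # same name and size: reused
    "se1.fc1.weight": [0.01, 0.01],   # new attention module: absent from pretrained
    "fc.weight": [0.0, 0.0],          # new two-class head: size differs, kept as is
}
transferred = init_from_pretrained(target, pretrained)
```

In this toy setup only `conv1.weight` is transferred; the added attention module and the resized classification head would be learned during fine-tuning on task B.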
The squeeze-and-excitation (SE) module is the core module of the squeeze-and-excitation network (SENet), a typical channel attention network. Two operations, squeeze and excitation, constitute the main part of the SE module:
The squeeze operation first performs global average pooling on an input feature map of size H×W×C (H is the height of the input feature map, W is its width, and C is the number of feature channels), turning the feature map into a 1×C feature vector with a global receptive field; the global compressed feature of the current feature map is then obtained through two fully connected layers (whose input and output feature channels have the same dimension) using a ReLU activation function;
The excitation operation derives the weight of each feature channel using the sigmoid activation function. Finally, through the scale operation, the output feature channel weight vector is multiplied by the original input feature map, and the weighted feature map is used as the input of the next network layer. This completes the recalibration of the original features along the channel dimension, so that the extracted features are more discriminative than the original ones, which improves classification performance.
The SE module was introduced to address the loss caused by ignoring the different importance of the channels of a feature map during convolution and pooling. Conventional convolution and pooling treat every channel of the feature map as equally important by default, whereas in practical problems the importance of different channels differs.
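The squeeze, excitation and scale steps described above can be sketched in plain Python. This is an illustrative sketch, not the applicant's implementation: the feature map is stored channel-first (C×H×W) as nested lists, the two fully connected layers keep the channel dimension unchanged as the text states, and the weight matrices passed in (`w1`, `b1`, `w2`, `b2`) are hypothetical placeholders for learned parameters.

```python
import math


def global_average_pool(feature_map):
    """Squeeze: reduce each H x W channel to its mean, giving a length-C vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def fully_connected(vector, weights, bias):
    """Dense layer; weights holds one row of input coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, bias)]


def se_module(feature_map, w1, b1, w2, b2):
    """Return (channel_weights, rescaled_feature_map) for a C x H x W input."""
    z = global_average_pool(feature_map)
    hidden = [max(0.0, v) for v in fully_connected(z, w1, b1)]  # ReLU
    logits = fully_connected(hidden, w2, b2)
    weights = [1.0 / (1.0 + math.exp(-v)) for v in logits]      # sigmoid
    # Scale: multiply every value of a channel by that channel's weight.
    scaled = [[[w * x for x in row] for row in ch] for ch, w in zip(feature_map, weights)]
    return weights, scaled


# Two channels of size 2 x 2; identity weights are for illustration only.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
channel_weights, out = se_module(fmap, identity, zeros, identity, zeros)
```

With these toy inputs the first channel, which has the larger mean activation, receives the larger weight, so the rescaled output emphasizes it relative to the second channel.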
Fig. 1 is a schematic flowchart of a training method for an abnormal image detection model according to an embodiment of the present application. As shown in fig. 1, the method may include:
Step S110: acquire a plurality of image training samples and corresponding sample labels.
Considering that real-world images vary in brightness, saturation, angle, position and the like, after the candidate image samples are obtained, they are processed using a preset sample quantity expansion mode to obtain processed candidate image samples; that is, a data augmentation preprocessing operation is performed on the candidate image samples;
then, the candidate image samples and the processed candidate image samples are determined as image training samples, so as to obtain a plurality of image training samples and the sample label corresponding to each image training sample, where a sample label is identification information characterizing the image training sample as a normal image sample or an abnormal image sample; that is, the image training samples include normal image samples and abnormal image samples.
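A minimal sketch of one possible sample quantity expansion mode follows. The application does not fix the exact transformations at this point, so the choice of horizontal flip, vertical flip and a brightness factor of 1.2 is an assumption (the flips and brightness change are mentioned in the Background), as are the grayscale nested-list image format and the label encoding (1 = abnormal).

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return image[::-1]


def adjust_brightness(image, factor):
    """Scale pixel values by factor and clamp them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def expand_samples(candidates):
    """Return the original (image, label) pairs plus their augmented copies."""
    samples = []
    for image, label in candidates:
        samples.append((image, label))
        samples.append((flip_horizontal(image), label))
        samples.append((flip_vertical(image), label))
        samples.append((adjust_brightness(image, 1.2), label))
    return samples


# One 2 x 2 grayscale candidate image; label 1 = abnormal (assumed encoding).
candidates = [([[10, 20], [30, 250]], 1)]
training_samples = expand_samples(candidates)
```

Each augmented copy inherits the label of the candidate it was derived from, so the candidate and processed samples together form the labeled training set.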
Step S120: for each image training sample, input the image training sample as input data into a preset improved abnormal image detection network model to obtain an output detection result of the image training sample.
In the big data era, training a deep learning model to a satisfactory result consumes enormous data and computing resources, but in practice the data set available for training is usually small, and a network model trained only on such a limited data set easily fails to converge or overfits. Neural networks such as VGGNet, DenseNet, EfficientNet and ResNet are therefore trained on a sample training data set from the ImageNet database, a computer vision recognition project, to obtain weight files of the model parameters, that is, the model parameter values; the network model trained in this way is the improved abnormal image detection network model.
Therefore, before this step is executed, a sample training data set in the ImageNet database is obtained, and an original network (such as a ResNet network) is iteratively trained on that data set to obtain the improved abnormal image detection network model and its pre-trained model parameter values.
For each image training sample, the image training sample is input as input data into the input network structure of the preset improved abnormal image detection network model; the residual network structure configured in the improved abnormal image detection network model extracts image feature vectors from the image information output by the input network structure; and the output network structure of the improved abnormal image detection network model detects the extracted image feature vectors and outputs the detection result of the image training sample.
Fig. 2 is a schematic diagram of a model structure of an improved abnormal image detection network model provided in an embodiment of the present application, and as shown in fig. 2, the model structure of the improved abnormal image detection network model is as follows:
(1) The input network structure of the improved abnormal image detection network model may comprise 1 convolutional layer, 1 max pooling layer and 1 channel attention mechanism module, namely an SE module;
the convolutional layer is used for performing feature extraction on the input data to obtain a primary image feature vector;
the channel attention mechanism module is used for compressing the primary image feature vector to obtain a global compressed feature vector of the primary image feature vector; processing the global compressed feature vector per feature channel to obtain a feature channel weight vector of the primary image feature vector; multiplying the feature channel weight vector by the primary image feature vector; and outputting a secondary image feature vector focused on the feature channels; the channel attention mechanism module thereby corrects the extracted features.
The max pooling layer is used for obtaining the max-pooled image feature vector corresponding to the secondary image feature vector output by the channel attention mechanism module.
The module structure of the channel attention mechanism module may include a global average pooling layer, a first fully connected (FC) layer connected to the global average pooling layer, a ReLU activation function layer connected to the first fully connected layer, a second fully connected layer connected to the ReLU activation function layer, and a sigmoid activation function layer connected to the second fully connected layer.
The global average pooling layer reduces the number of parameters and the amount of computation; the first fully connected layer discriminates the result of the previous step; the ReLU (rectified linear unit) activation function is then applied to reduce the resulting error.
(2) The residual network structure configured in the improved abnormal image detection network model may comprise a plurality of configured residual units connected in sequence; each configured residual unit comprises 1 initial residual unit and 1 channel attention mechanism module; the channel attention mechanism module is embedded in the initial residual unit before the last convolutional layer.
As shown in fig. 2, when the configured residual network structure includes 4 configured residual units connected in sequence, then from left to right the first configured residual unit includes 3 convolutional layers, the second includes 4 convolutional layers, the third includes 6 convolutional layers, and the fourth includes 3 convolutional layers, where the convolution kernels in each convolutional layer have the same size. The 4 configured residual units differ mainly in their number of convolution kernels, which differs because of their different numbers of layers (i.e., different dimensionalities of the convolution kernels).
It is to be understood that in the first configured residual unit, the channel attention mechanism module is embedded between the 2nd and 3rd convolutional layers; in the second configured residual unit, it is embedded between the 3rd and 4th convolutional layers; and so on. The embedding positions of the channel attention mechanism modules in the other configured residual units are not described in detail in this embodiment of the application.
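The embedding rule (channel attention module placed before the last convolutional layer of each unit) can be made concrete with a short layout sketch. The layer names `conv1`, `se`, and so on, and the helper `configured_unit_layers`, are hypothetical labels for illustration; the shortcut connection of each residual unit is not modeled.

```python
def configured_unit_layers(num_convs):
    """List the layers of one configured residual unit, with SE before the last conv."""
    layers = [f"conv{i}" for i in range(1, num_convs + 1)]
    layers.insert(num_convs - 1, "se")
    return layers


CONVS_PER_UNIT = [3, 4, 6, 3]  # convolutional layers per unit, as in the fig. 2 embodiment
layout = [configured_unit_layers(n) for n in CONVS_PER_UNIT]

for index, layers in enumerate(layout, start=1):
    print(f"unit {index}: {' -> '.join(layers)}")
```

For the third unit this places the module between the 5th and 6th convolutional layers, which follows from the same rule applied to the first two units in the text.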
(3) The output network structure of the improved abnormal image detection network model may include 1 channel attention mechanism module, 1 average pooling layer and 1 fully connected layer;
the channel attention mechanism module is used for compressing the image feature vector output by the configured residual network structure to obtain a global compressed feature vector of the image feature vector; processing the global compressed feature vector according to the feature channels to obtain a feature channel weight vector of the image feature vector; and multiplying the feature channel weight vector by the image feature vector to output a secondary image feature vector focused on the output network structure;
the average pooling layer is used for averaging the secondary image feature vector corresponding to the output network structure to obtain a compressed feature vector corresponding to the secondary image feature vector;
and the fully connected layer is used for obtaining the detection result of the image training sample based on the obtained compressed feature vector.
The output network structure further refines the obtained image features through the channel attention mechanism module, then averages the resulting image feature vectors with the average pooling layer, which reduces the increase in estimation variance caused by the limited neighborhood size; finally, the fully connected layer outputs the judgment result to perform classification.
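The three stages of the output network structure can be illustrated with a small numeric sketch. It rests on assumptions not stated in the description: the channel weights are produced here by a single linear map followed by a sigmoid, and all function names, matrices and numbers are illustrative only.

```python
import math


def squeeze(feature_maps):
    """Global average of each channel: one number per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def channel_weights(squeezed, gate_matrix):
    """Map the squeezed vector to per-channel weights in (0, 1) with a sigmoid gate."""
    weights = []
    for row in gate_matrix:
        score = sum(w * z for w, z in zip(row, squeezed))
        weights.append(1.0 / (1.0 + math.exp(-score)))
    return weights


def reweight(feature_maps, weights):
    """Multiply every value in a channel by that channel's weight."""
    return [[[w * v for v in row] for row in fmap]
            for fmap, w in zip(feature_maps, weights)]


def fully_connected(vector, fc_matrix):
    """One linear layer producing a score per class."""
    return [sum(w * x for w, x in zip(row, vector)) for row in fc_matrix]


# Two 2x2 channels; all numbers are illustrative.
features = [[[1.0, 3.0], [5.0, 7.0]],
            [[0.0, 2.0], [2.0, 0.0]]]
gate = [[0.0, 0.0], [0.0, 0.0]]  # zero gate, so every weight is sigmoid(0) = 0.5
fc = [[1.0, 0.0], [0.0, 1.0]]    # identity layer: class scores equal the pooled vector

weights = channel_weights(squeeze(features), gate)
attended = reweight(features, weights)
pooled = squeeze(attended)       # average pooling of the reweighted maps
scores = fully_connected(pooled, fc)
print(weights, pooled, scores)
```

In a trained model the gate and the fully connected matrix would be learned parameters; the zero gate is chosen here only so the arithmetic is easy to follow by hand.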
S130, performing iterative training on the pre-trained detection model, i.e., the improved abnormal image detection network model initialized with the pre-trained network weights, based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model.
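The idea of iterating over detection results and sample labels can be sketched with a deliberately small stand-in. The description does not specify the loss function or the optimizer, so the cross-entropy gradient step, the learning rate, the epoch count and the two-dimensional toy samples below are all assumptions, and a single linear classifier replaces the full network.

```python
import math


def predict(weights, bias, x):
    """Probability that sample x is abnormal (label 1)."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def train(samples, labels, epochs=200, lr=0.5):
    """Repeatedly compare predictions with labels and adjust the parameters."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Gradient of the cross-entropy loss with respect to the score.
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy feature vectors standing in for image training samples (0 = normal, 1 = abnormal).
samples = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]
weights, bias = train(samples, labels)
predictions = [round(predict(weights, bias, x)) for x in samples]
print(predictions)
```

In the described method the parameters would start from ImageNet pre-trained weights rather than from zeros, which is the point of the transfer step.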
Furthermore, at least one image to be detected can be obtained in the application stage; and inputting at least one image to be detected as input data into the trained abnormal image detection model to obtain a detection result of at least one image to be detected.
To apply the trained abnormal image detection model, a Windows client window program is built with PyQt. At the input end, a single image to be detected or a set of images to be detected (for example, all images in a folder) is selected. After the images are loaded, the trained model parameters are loaded and the input images are classified using the trained abnormal image detection model, i.e., the improved residual model. If an image is judged to match the characteristics of an abnormal image, the judgment result is displayed in text form and an early warning is issued.
After the abnormal image detection model is trained, an image detection system can be set up. The system may use a client-server (C/S) architecture in which the abnormal image detection model resides on the server, so that the manpower, material resources and financial resources spent on checking abnormal images can be greatly reduced.
For the detection of a single image, the image is input into the server; the server performs operations such as adaptive image scaling on it, judges whether the input image is an abnormal image according to predefined abnormal image rules, and outputs the result and its likelihood in text form through an associated client.
For the detection of batch images, the folder is uploaded to the server; the server classifies the images in the folder, and the results and their likelihoods are finally output through an associated client.
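The batch case can be sketched as a loop over a folder. The names here are hypothetical: `classify` stands for whatever callable wraps the trained model and returns a label with a likelihood, `dummy_classify` is a placeholder used only to make the sketch runnable, and the set of file suffixes is an assumption.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types


def detect_folder(folder, classify):
    """Apply `classify` to every image file in `folder` and format text results.

    `classify` is a hypothetical callable that takes a path and returns
    (label, likelihood); in the described system it would wrap the trained model.
    """
    lines = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            label, likelihood = classify(path)
            lines.append(f"{path.name}: {label} ({likelihood:.2f})")
    return lines


def dummy_classify(path):
    """Placeholder that stands in for the trained model."""
    return ("abnormal", 0.9) if "bad" in path.name else ("normal", 0.8)
```

A call such as `detect_folder("uploaded_images", dummy_classify)` would return one text line per image, which a client could then display; the folder name is illustrative.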
According to the training method of the abnormal image detection model provided above, after a plurality of image training samples and corresponding sample labels are obtained, each image training sample is used as input data and fed into the input network structure of a preset improved abnormal image detection network model; the residual network structure configured in the improved abnormal image detection network model extracts an image feature vector from the image information output by the input network structure; and the output network structure of the improved abnormal image detection network model detects the extracted image feature vector to output a detection result for the image training sample. The initial network weight value of the improved abnormal image detection network model is a network weight value pre-trained on a sample training data set in the ImageNet database, a computer vision system recognition project. The pre-trained detection model is then iteratively trained based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model. The method introduces channel attention mechanism modules into the original residual network structure so that the feature information within the channels is exploited. After the channel attention mechanism modules are introduced, the parameter information obtained by training on the ImageNet data set is transferred to the field of abnormal image detection, which improves both the training efficiency and the detection accuracy of the model.
Corresponding to the above method, an embodiment of the present application further provides a training apparatus for an abnormal image detection model, as shown in fig. 3, the training apparatus for an abnormal image detection model includes:
an obtaining unit 310, configured to obtain a plurality of image training samples and corresponding sample labels;
an input unit 320, configured to, for each image training sample, input the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extract an image feature vector from image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detect the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weight value of the improved abnormal image detection network model is a network weight value pre-trained on a sample training data set in the ImageNet database, a computer vision system recognition project;
the training unit 330 is configured to perform iterative training on the improved abnormal image detection network model based on the detection result of each image training sample and the sample label of the corresponding sample, so as to obtain a trained abnormal image detection model.
In an optional implementation, the input network structure of the improved abnormal image detection network model includes 1 convolutional layer, 1 max pooling layer, and 1 channel attention mechanism module;
the convolutional layer is used for extracting features of the input data to obtain a primary image feature vector;
the channel attention mechanism module is used for compressing the primary image feature vector to obtain a global compressed feature vector of the primary image feature vector; processing the global compressed feature vector of the primary image feature vector according to a feature channel to obtain a feature channel weight vector of the primary image feature vector, multiplying the feature channel weight vector by the primary image feature vector, and outputting a secondary image feature vector focused on the feature channel;
and the max pooling layer is used for acquiring the maximum image feature vector corresponding to the secondary image feature vector output by the channel attention mechanism module.
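The max pooling step at the end of the input network structure can be illustrated on a single feature map. The window size is not given in this passage, so the non-overlapping 2x2 window below is an assumption, and `max_pool` is a hypothetical helper name.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + dr][c + dc]
                 for dr in range(size) for dc in range(size))
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]


# One 4x4 channel of the secondary image feature vector (illustrative values).
feature_map = [[1, 2, 0, 1],
               [3, 4, 1, 0],
               [0, 1, 5, 6],
               [1, 0, 7, 8]]
pooled = max_pool(feature_map)
print(pooled)  # [[4, 1], [1, 8]]
```

Each output value is the maximum of one window, so the 4x4 map shrinks to 2x2 while the strongest responses are kept.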
In an optional implementation, the residual network structure configured in the improved abnormal image detection network model includes a plurality of configured residual units connected in sequence;
each configured residual unit comprises 1 initial residual unit and 1 channel attention mechanism module; the channel attention mechanism module is embedded before the last convolutional layer in the initial residual unit.
In an optional implementation, when the configured residual network structure includes 4 configured residual units connected in sequence, the first configured residual unit of the 4 configured residual units includes 3 convolutional layers, the second configured residual unit includes 4 convolutional layers, the third configured residual unit includes 6 convolutional layers, and the fourth configured residual unit includes 3 convolutional layers, where the convolution kernels in each convolutional layer have the same size.
In an optional implementation, the output network structure of the improved abnormal image detection network model includes 1 channel attention mechanism module, 1 average pooling layer and 1 fully connected layer;
the channel attention mechanism module is used for compressing the image feature vector output by the configured residual network structure to obtain a global compressed feature vector of the image feature vector; processing the global compressed feature vector of the image feature vector according to a feature channel to obtain a feature channel weight vector of the image feature vector, multiplying the feature channel weight vector of the image feature vector by the image feature vector, and outputting a secondary image feature vector focused on the output network structure;
the average pooling layer is used for averaging the secondary image feature vector corresponding to the output network structure to obtain a compressed feature vector corresponding to the secondary image feature vector;
and the fully connected layer is used for obtaining the detection result of the image training sample based on the obtained compressed feature vector.
In an optional implementation, the obtaining unit 310 is specifically configured to:
acquiring a candidate image sample;
processing the candidate image sample by adopting a preset sample quantity expansion mode to obtain a processed candidate image sample;
and determining the candidate image sample and the processed candidate image sample as an image training sample.
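The sample expansion step can be sketched on images represented as nested lists. The specific expansion operations are not named in this passage, so the horizontal flip and the 90-degree rotation below are assumed examples, and the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror each row of the image."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def expand(candidates):
    """Return the original candidates together with their processed copies."""
    training = []
    for image in candidates:
        training.append(image)
        training.append(flip_horizontal(image))
        training.append(rotate_90(image))
    return training


candidates = [[[1, 2], [3, 4]]]  # one 2x2 candidate image sample
training_samples = expand(candidates)
print(len(training_samples))  # 3
```

Both the candidate image sample and its processed versions end up in the training set, so one candidate yields three training samples in this sketch; each processed copy would keep the sample label of its source image.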
In an optional implementation, the obtaining unit 310 is further configured to obtain at least one image to be detected;
and the input unit 320 is further configured to input the at least one image to be detected, as input data, into the trained abnormal image detection model to obtain a detection result of the at least one image to be detected.
The functions of the functional units of the training apparatus for the abnormal image detection model provided in the foregoing embodiments of the present application may be implemented through the foregoing method steps, and therefore, detailed working processes and beneficial effects of the units in the training apparatus for the abnormal image detection model provided in the embodiments of the present application are not repeated herein.
An electronic device is further provided in the embodiment of the present application, as shown in fig. 4, and includes a processor 410, a communication interface 420, a memory 430, and a communication bus 440, where the processor 410, the communication interface 420, and the memory 430 complete communication with each other through the communication bus 440.
A memory 430 for storing computer programs;
the processor 410, when executing the program stored in the memory 430, implements the following steps:
acquiring a plurality of image training samples and corresponding sample labels;
for each image training sample, inputting the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extracting an image feature vector from image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detecting the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weight value of the improved abnormal image detection network model is a network weight value pre-trained on a sample training data set in the ImageNet database, a computer vision system recognition project;
and performing iterative training on the improved abnormal image detection network model based on the detection result of each image training sample and the sample label of the corresponding sample to obtain a trained abnormal image detection model.
The aforementioned communication bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Since the implementation and the beneficial effects of each component of the electronic device in the foregoing embodiment can be understood by referring to the steps of the embodiment shown in Fig. 1, the detailed working processes and beneficial effects of the electronic device provided in the embodiment of the present application are not repeated herein.
In yet another embodiment provided by the present application, a computer-readable storage medium is further provided, in which instructions are stored, and when the instructions are executed on a computer, the instructions cause the computer to execute the training method of the abnormal image detection model in any one of the above embodiments.
In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method for training an abnormal image detection model according to any one of the above embodiments.
As will be appreciated by one of skill in the art, the embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the true scope of the embodiments of the present application.
It is apparent that those skilled in the art can make various changes and modifications to the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims of the embodiments of the present application and their equivalents, the embodiments of the present application are also intended to include such modifications and variations.

Claims (10)

1. A method for training an abnormal image detection model, the method comprising:
acquiring a plurality of image training samples and corresponding sample labels, wherein the plurality of image training samples comprise normal image training samples and abnormal image training samples;
for each image training sample, inputting the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extracting an image feature vector from image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detecting the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weight value of the improved abnormal image detection network model is a network weight value pre-trained on a sample training data set in the ImageNet database, a computer vision system recognition project;
and performing iterative training on the improved abnormal image detection network model based on the detection result of each image training sample and the sample label of the corresponding sample to obtain a trained abnormal image detection model.
2. The method of claim 1, wherein the input network structure of the improved abnormal image detection network model comprises 1 convolutional layer, 1 max pooling layer, and 1 channel attention mechanism module;
the convolutional layer is used for extracting features of the input data to obtain a primary image feature vector;
the channel attention mechanism module is used for compressing the primary image feature vector to obtain a global compressed feature vector of the primary image feature vector; processing the global compressed feature vector of the primary image feature vector according to a feature channel to obtain a feature channel weight vector of the primary image feature vector, multiplying the feature channel weight vector by the primary image feature vector, and outputting a secondary image feature vector focused on the feature channel;
and the max pooling layer is used for acquiring the maximum image feature vector corresponding to the secondary image feature vector output by the channel attention mechanism module.
3. The method of claim 1, wherein the residual network structure configured in the improved abnormal image detection network model comprises a plurality of configured residual units connected in sequence;
each configured residual unit comprises 1 initial residual unit and 1 channel attention mechanism module; the channel attention mechanism module is embedded before the last convolutional layer in the initial residual unit.
4. The method of claim 3, wherein when the configured residual network structure includes 4 configured residual units connected in sequence, the first configured residual unit of the 4 configured residual units includes 3 convolutional layers, the second configured residual unit includes 4 convolutional layers, the third configured residual unit includes 6 convolutional layers, and the fourth configured residual unit includes 3 convolutional layers, wherein the convolution kernels in each convolutional layer are the same in size.
5. The method of claim 1, wherein the output network structure of the improved abnormal image detection network model comprises 1 channel attention mechanism module, 1 average pooling layer, and 1 fully connected layer;
the channel attention mechanism module is used for compressing the image feature vector output by the configured residual network structure to obtain a global compressed feature vector of the image feature vector; processing the global compressed feature vector of the image feature vector according to a feature channel to obtain a feature channel weight vector of the image feature vector, multiplying the feature channel weight vector of the image feature vector by the image feature vector, and outputting a secondary image feature vector focused on the output network structure;
the average pooling layer is used for averaging the secondary image feature vector corresponding to the output network structure to obtain a compressed feature vector corresponding to the secondary image feature vector;
and the fully connected layer is used for obtaining the detection result of the image training sample based on the obtained compressed feature vector.
6. The method of claim 1, wherein obtaining a plurality of image training samples comprises:
acquiring a candidate image sample;
processing the candidate image sample by adopting a preset sample quantity expansion mode to obtain a processed candidate image sample;
and determining the candidate image sample and the processed candidate image sample as an image training sample.
7. The method of claim 1, wherein after obtaining the trained anomaly image detection model, the method further comprises:
acquiring at least one image to be detected;
and inputting the at least one image to be detected as input data into the trained abnormal image detection model to obtain a detection result of the at least one image to be detected.
8. An apparatus for training an abnormal image detection model, the apparatus comprising:
the acquisition unit is used for acquiring a plurality of image training samples and corresponding sample labels;
the input unit is used for, for each image training sample, inputting the image training sample as input data into an input network structure of a preset improved abnormal image detection network model, extracting an image feature vector from image information output by the input network structure by using a residual network structure configured in the improved abnormal image detection network model, and detecting the extracted image feature vector by using an output network structure of the improved abnormal image detection network model to obtain an output detection result of the image training sample; the initial network weight value of the improved abnormal image detection network model is a network weight value pre-trained on a sample training data set in the ImageNet database, a computer vision system recognition project;
and the training unit is used for carrying out iterative training on the improved abnormal image detection network model based on the detection result of each image training sample and the sample label of the corresponding sample to obtain the trained abnormal image detection model.
9. An electronic device, characterized in that the electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1-7 when executing a program stored on a memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202210036055.2A 2022-01-10 2022-01-10 Training method, device and medium for abnormal image detection model Pending CN114387451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210036055.2A CN114387451A (en) 2022-01-10 2022-01-10 Training method, device and medium for abnormal image detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210036055.2A CN114387451A (en) 2022-01-10 2022-01-10 Training method, device and medium for abnormal image detection model

Publications (1)

Publication Number Publication Date
CN114387451A true CN114387451A (en) 2022-04-22

Family

ID=81202282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210036055.2A Pending CN114387451A (en) 2022-01-10 2022-01-10 Training method, device and medium for abnormal image detection model

Country Status (1)

Country Link
CN (1) CN114387451A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114862857A (en) * 2022-07-07 2022-08-05 合肥高斯智能科技有限公司 Industrial product appearance abnormity detection method and system based on two-stage learning

Similar Documents

Publication Publication Date Title
CN109086811B (en) Multi-label image classification method and device and electronic equipment
JP6994588B2 (en) Face feature extraction model training method, face feature extraction method, equipment, equipment and storage medium
CN109740689B (en) Method and system for screening error labeling data of image semantic segmentation
CN112132143B (en) Data processing method, electronic device and computer readable medium
CN110135505B (en) Image classification method and device, computer equipment and computer readable storage medium
CN111680705B (en) MB-SSD method and MB-SSD feature extraction network suitable for target detection
CN112884782B (en) Biological object segmentation method, apparatus, computer device, and storage medium
CN113608916B (en) Fault diagnosis method and device, electronic equipment and storage medium
CN114187311A (en) Image semantic segmentation method, device, equipment and storage medium
CN112668640B (en) Text image quality evaluation method, device, equipment and medium
CN111666932A (en) Document auditing method and device, computer equipment and storage medium
CN115880298A (en) Glass surface defect detection method and system based on unsupervised pre-training
CN112651468A (en) Multi-scale lightweight image classification method and storage medium thereof
CN111310837A (en) Vehicle refitting recognition method, device, system, medium and equipment
CN114387451A (en) Training method, device and medium for abnormal image detection model
CN113128522B (en) Target identification method, device, computer equipment and storage medium
CN117611907A (en) Method and device for detecting flip image and training model, electronic equipment and medium
CN111507420A (en) Tire information acquisition method, tire information acquisition device, computer device, and storage medium
CN111079930A (en) Method and device for determining quality parameters of data set and electronic equipment
CN115713669A (en) Image classification method and device based on inter-class relation, storage medium and terminal
CN111127327B (en) Picture inclination detection method and device
CN113283388A (en) Training method, device and equipment of living human face detection model and storage medium
CN113298102B (en) Training method and device for target classification model
CN112434717B (en) Model training method and device
CN114399432A (en) Target identification method, device, equipment, medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination