CN115035058A

CN115035058A - Self-coding network medical image anomaly detection method

Info

Publication number: CN115035058A
Application number: CN202210626455.9A
Authority: CN
Inventors: 杨绍武; 蓝龙; 徐利洋; 李欣阳; 余宝仑
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2022-06-02
Filing date: 2022-06-02
Publication date: 2022-09-09

Abstract

The invention discloses a self-coding network medical image anomaly detection method comprising the following steps: S1, selecting set images as the data set of a medical image anomaly detection task, and dividing the foreground of each set image into superpixel blocks to obtain a superpixel block mask of the set image; S2, randomly selecting a plurality of superpixel block masks of the set image, and covering the original image of the set image with the selected masks; S3, inputting the covered set image into a self-coding network for training, wherein the self-coding network adjusts its network parameters using the weighted sum of the mean square error and the structural similarity error as the loss function; and S4, when anomaly detection is performed on a medical image, inputting the image into the trained self-coding network, obtaining a difference image from the input image and the output image, and calculating the abnormal region from the difference image. The method improves the ability of the self-coding network to reconstruct normal samples, so that abnormal images are better distinguished by their reconstruction error.

Description

Self-coding network medical image anomaly detection method

Technical Field

The invention relates to the technical field of medical detection, in particular to a self-coding network medical image anomaly detection method.

Background

In the field of computer vision, many mature algorithms have been put into practical use, such as face recognition, target tracking, and target detection systems, all of which rely on machine learning algorithms at their core. Analysis has shown, however, that many of these algorithms are susceptible to out-of-distribution samples, which can lead to erroneous and overconfident decisions.

Therefore, how to organize, mine, and apply large amounts of medical image data so as to improve the capability of computer-aided diagnosis is an important problem in medical image research. The human eye can suffer perceptual gaps, also known as "blind sight", and because medical images are complex, a human reader sometimes fails to notice every abnormality; a computer can therefore serve as a diagnostic aid.

Most existing deep-learning-based medical image anomaly detection algorithms depend on large amounts of pixel-level label data, that is, labels that mark the abnormal regions. Labeling medical images usually requires substantial manpower and material resources; moreover, abnormal medical images are diverse, so supervised learning cannot cover every kind of abnormality. For this reason, it is necessary to develop a self-coding network medical image anomaly detection method.

Disclosure of Invention

The invention aims to provide a self-coding network medical image anomaly detection method to overcome the defects in the prior art.

In order to achieve the purpose, the technical scheme adopted by the invention is as follows:

A self-coding network medical image anomaly detection method comprises the following steps:

S1, selecting set images as the data set of the medical image anomaly detection task, and dividing the foreground of each set image into superpixel blocks using a simple linear iterative clustering algorithm to obtain a superpixel block mask of the set image;

S2, randomly selecting a plurality of superpixel block masks of the set image, and covering the original image of the set image with the selected superpixel block masks;

S3, inputting the covered set image into a self-coding network for training, wherein the self-coding network adjusts its network parameters using, as the loss function, the weighted sum of the mean square error and the structural similarity error between the output image and the original, uncovered set image, so that the difference image of the output image and the input image of the self-coding network reaches a first preset threshold value;

and S4, when anomaly detection is performed on a medical image, inputting the image into the trained self-coding network, obtaining a difference image from the input image and the output image, and calculating the abnormal region from the difference image.

Further, the pixel value filled in the covering area in step S2 is an average value of the pixel values of the corresponding original image area.

Further, in the step S3:

the expression of the mean square error is:

$$\mathrm{MSE}(I,K)=\frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\bigl[I(i,j)-K(i,j)\bigr]^{2}$$

in the formula, I and K represent the input image and the reconstructed image respectively, and m and n represent the length and width of the images respectively;

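the structural similarity error expression is as follows:

$$\mathrm{SSIM}(x,y)=f\bigl(l(x,y),\,c(x,y),\,s(x,y)\bigr)$$

where l is brightness, c is contrast, and s is structure, each defined as follows:

$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^{2}+\mu_y^{2}+C_1},\qquad c(x,y)=\frac{2\sigma_x\sigma_y+C_2}{\sigma_x^{2}+\sigma_y^{2}+C_2},\qquad s(x,y)=\frac{\sigma_{xy}+C_3}{\sigma_x\sigma_y+C_3}$$

in the formula, $\mu_x$ and $\mu_y$ are the means of all pixels in the respective image blocks, $\sigma_x^{2}$ and $\sigma_y^{2}$ are the variances of the pixel values of the image blocks, $\sigma_{xy}$ is the corresponding covariance, and the terms $C_1$, $C_2$, $C_3$ are added to prevent the denominators from being 0; if $C_3=C_2/2$, the simplified structural similarity error expression is as follows:

$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^{2}+\mu_y^{2}+C_1)(\sigma_x^{2}+\sigma_y^{2}+C_2)}$$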

the expression of the loss function is:

$$\mathrm{Loss}(x,y)=w\cdot\mathrm{SSIM}(x,y)+(1-w)\cdot\mathrm{MSE}(x,y).$$

Further, calculating the abnormal region according to the difference image in step S4 specifically includes: setting the pixels in the difference image whose values are smaller than a second set threshold as normal pixels, and distinguishing noise regions among the remaining regions by the number of pixels in each connected region.

Further, the pixel values in the noise region are reduced by an order of magnitude.

Further, the set image is a brain nuclear magnetic resonance image.

Compared with the prior art, the invention has the following advantage: the self-coding network medical image anomaly detection method provided by the invention adds a superpixel block foreground covering strategy to a semi-supervised anomaly detection framework based on a self-coding network, which improves the ability of the self-coding network to reconstruct normal samples, so that abnormal regions are better distinguished by reconstruction errors.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of the self-coding network medical image anomaly detection method of the invention.

FIG. 2 is a schematic diagram of a superpixel block mask partitioned using a simple linear iterative clustering algorithm in the present invention.

FIG. 3 is a diagram of the reconstruction training strategy based on image processing in the present invention.

FIG. 4 is a schematic diagram of covering the original image of a set image with the superpixel block mask in the present invention.

FIG. 5 is a graph of the mean square error loss function in the present invention.

Detailed Description

The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more readily understand the advantages and features of the invention and so that its scope of protection is defined more clearly.

Referring to FIG. 1, this embodiment discloses a self-coding network medical image anomaly detection method, which includes:

Step S1: selecting set images (in this embodiment, brain nuclear magnetic resonance images) as the data set of the medical image anomaly detection task, and dividing the foreground of each set image into superpixel blocks using a simple linear iterative clustering algorithm to obtain a superpixel block mask of the set image, as shown schematically in FIG. 2.

In this embodiment, superpixel division groups pixels according to the similarity of their features to form superpixels, so that a small number of superpixels replaces a large number of pixels in representing the features of the image. This reduces the complexity of subsequent image processing, and superpixel generation is therefore commonly used as a preprocessing step for image segmentation algorithms.

This embodiment uses the simple linear iterative clustering (SLIC) algorithm, a classical superpixel generation algorithm. Its idea is as follows. The image is first converted from the RGB color space to the CIE-Lab color space, which consists of three channels: one luminance channel and two color channels. The components L, a, and b denote the luminance, the green-to-red component, and the blue-to-yellow component, respectively. This color space is designed around human color perception; compared with the RGB color space, it better matches human visual characteristics and is easier to transform and adjust. Each pixel can then be regarded as a five-dimensional vector (L, a, b, x, y), and a distance measure is constructed for these vectors, so that the similarity between two pixels is measured by the distance between their vectors: the smaller the distance, the more similar the pixels, and the larger the distance, the less similar.
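The text does not spell out the distance measure. For orientation only, the form commonly given for SLIC combines a color distance $d_c$ and a spatial distance $d_s$ as follows. The symbols $S$ (grid interval for $N$ pixels and $K$ superpixels) and $m_c$ (compactness weight) come from that common formulation and are assumptions here, not part of the source:

$$d_c=\sqrt{(L_j-L_i)^{2}+(a_j-a_i)^{2}+(b_j-b_i)^{2}},\qquad d_s=\sqrt{(x_j-x_i)^{2}+(y_j-y_i)^{2}}$$

$$D=\sqrt{d_c^{2}+\left(\frac{d_s}{S}\right)^{2}m_c^{2}},\qquad S=\sqrt{N/K}$$

A larger $m_c$ gives spatial proximity more weight and yields more compact, regular superpixels; a smaller $m_c$ lets blocks follow color boundaries more closely.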

In this embodiment, grouping all pixels of the image foreground into superpixel blocks can be regarded as a pixel-level clustering problem. The SLIC algorithm borrows the idea of the K-means clustering algorithm. It first generates K seed points, then computes the distance from every other pixel to the seed points and adds each pixel to the set of its closest seed point, until all pixels have been visited. It then computes the mean vector of the pixels in each of the K sets to obtain K new cluster centers, searches the surrounding pixels for those closest to each center's vector and adds them to that center's category, updates the cluster centers once all pixels have been classified, and iterates in this way until convergence. The SLIC algorithm is used to generate the mask of the brain nuclear magnetic resonance image corresponding to the superpixel partition.

The advantage of using the SLIC algorithm in this embodiment is that the generated superpixel blocks are compact and regular, and that it can segment grayscale images as well as color images, so it is applicable to brain nuclear magnetic resonance images. The algorithm also needs few parameters: by default, only the number of superpixels to pre-divide has to be given. SLIC performs well in both the compactness of the generated superpixels and the running speed.
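The clustering loop described above can be sketched in NumPy for a grayscale image. This is a simplified illustration rather than the embodiment's implementation: it compares every foreground pixel with every center instead of searching a local window, it does not enforce connectivity, and the function name `slic_like_superpixels`, the `compactness` value, and the label convention (-1 for background) are assumptions made for the example.

```python
import numpy as np

def slic_like_superpixels(image, foreground, k=16, compactness=0.1, n_iter=10):
    """Cluster foreground pixels on (intensity, row, col) features, K-means style.

    Simplified sketch: unlike full SLIC, each pixel is compared with all k
    centres and connectivity is not enforced. Returns a label map with -1 for
    background pixels and 0..k-1 for superpixel blocks.
    """
    rows, cols = np.nonzero(foreground)
    n = rows.size
    step = np.sqrt(n / k)  # approximate spacing between superpixel centres
    # One feature vector per pixel: intensity plus scaled spatial position.
    feats = np.stack([
        image[rows, cols].astype(float),
        rows * compactness / step,
        cols * compactness / step,
    ], axis=1)
    # Seed the centres with k foreground pixels spread evenly in scan order.
    centres = feats[np.linspace(0, n - 1, k).astype(int)].copy()
    for _ in range(n_iter):
        # Assign every pixel to its nearest centre.
        dist = ((feats[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        # Move each centre to the mean feature vector of its members.
        for c in range(k):
            members = feats[assign == c]
            if members.size:
                centres[c] = members.mean(axis=0)
    labels = np.full(image.shape, -1, dtype=int)
    labels[rows, cols] = assign
    return labels

# Toy example: a random "image" with a circular foreground.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
yy, xx = np.mgrid[:32, :32]
foreground = (yy - 16) ** 2 + (xx - 16) ** 2 < 14 ** 2
labels = slic_like_superpixels(image, foreground, k=12)
```

In practice a library implementation of SLIC would normally be preferred; the sketch only makes the assign-then-update iteration concrete.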

Step S2: randomly selecting a plurality of superpixel block masks of the set image, and covering the original image of the set image with the selected superpixel block masks, wherein the pixel value filled into each covered region is the average of the pixel values of the corresponding region of the original image.

The main idea of this embodiment is to exploit the reconstruction capability of the self-coding network and train the network only on normal samples, so that the network model reconstructs normal samples well. When an abnormal sample is input into the self-coding network, a large reconstruction error occurs, so an abnormality can be identified from the size of the error. Different reconstruction strategies may suit images from different application fields; here the input image is preprocessed before being input into the self-coding network, and the framework of this strategy is shown in FIG. 3.
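The principle of training on normal samples only and flagging large reconstruction errors can be illustrated with a deliberately simple stand-in. The sketch below uses a linear encoder/decoder obtained by principal component analysis in place of the self-coding network, and synthetic 64-pixel vectors in place of images; both substitutions, and all names and sizes, are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" samples: 64-pixel vectors close to a 4-dimensional subspace.
basis = rng.normal(size=(4, 64))
normal_train = rng.normal(size=(200, 4)) @ basis + 0.01 * rng.normal(size=(200, 64))

# "Training" on normal samples only: keep the top 4 principal directions.
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:4]

def reconstruct(x):
    """Encode to 4 coefficients, then decode back to pixel space."""
    code = (x - mean) @ components.T
    return code @ components + mean

def reconstruction_error(x):
    """Mean squared difference between a sample and its reconstruction."""
    return float(np.mean((x - reconstruct(x)) ** 2))

normal_test = rng.normal(size=4) @ basis + 0.01 * rng.normal(size=64)
abnormal_test = normal_test.copy()
abnormal_test[20:30] += 3.0  # a local deviation never seen during training

print(reconstruction_error(normal_test), reconstruction_error(abnormal_test))
```

The abnormal sample cannot be expressed by what was learned from normal data, so its reconstruction error is much larger; the same reasoning underlies the nonlinear self-coding network.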

In view of the characteristics of medical images, this embodiment designs a superpixel foreground masking strategy; the masking used in the reconstruction strategy is described below. From the superpixel block mask obtained in step S1, a plurality of superpixel blocks are randomly selected and only the masks of the selected blocks are retained. The corresponding regions of the original image are covered by these masks, and each covered region is filled with the average of the pixel values of the corresponding region of the original image, as shown in FIG. 4.
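A minimal sketch of this covering step is given below, assuming the superpixel partition is available as an integer label map in which -1 marks background. That convention, the function name `mask_superpixels`, and the number of blocks chosen are assumptions for the example.

```python
import numpy as np

def mask_superpixels(image, labels, n_blocks, rng):
    """Cover n_blocks randomly chosen superpixel blocks with their own mean value.

    Returns the covered image and a boolean map of the covered pixels.
    """
    ids = np.unique(labels[labels >= 0])  # -1 marks background
    chosen = rng.choice(ids, size=min(n_blocks, ids.size), replace=False)
    covered = image.astype(float).copy()
    for block_id in chosen:
        region = labels == block_id
        covered[region] = image[region].mean()  # fill with the region average
    return covered, np.isin(labels, chosen)

# Toy example: an 8x8 image split into four 4x4 blocks, two of which are covered.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
labels = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 4, axis=0), 4, axis=1)
covered, covered_mask = mask_superpixels(image, labels, n_blocks=2, rng=rng)
```

During training, `covered` would be the network input while the untouched `image` serves as the reconstruction target.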

Step S3: inputting the covered set image into a self-coding network for training, wherein the self-coding network adjusts its network parameters using, as the loss function, the weighted sum of the mean square error and the structural similarity error between the output image and the original, uncovered set image, so that the difference image of the output image and the input image of the self-coding network reaches a first preset threshold value.

In this embodiment, the superpixel foreground masking strategy is mainly inspired by the context autoencoder. Like a classical self-coding network, it encodes the input image into a latent feature space, learns a compact feature representation, and reconstructs the original image with a decoder. A general self-coding network, however, only compresses the image information and does not sufficiently learn the semantic information of the image content. This embodiment therefore provides a new reconstruction method in which the self-coding network learns to fill in large missing portions of the input image; it cannot obtain hints from nearby pixels and instead has to understand deeper semantic information. Based on the study of context autoencoders and the analysis of brain nuclear magnetic resonance images, this embodiment provides a self-coding network reconstruction strategy based on a superpixel foreground mask.

Since the regions of a single whole organ in the human body have similar physiological structure, parts with the same structure appear with similar color in a medical image, and a lesion is generally more likely to appear within a connected region of the same physiological structure. The SLIC algorithm clusters pixels by combining their color similarity and spatial distance, so each generated superpixel block is a contiguous region of similar color. Covering the region of a superpixel block can therefore simulate a pathological abnormality to some extent, while the network is trained to fill in the covered part from the context of the covered region, so that the self-coding network better learns the semantic features of normal samples.

In the network model training process, the embodiment utilizes the weighted sum of the mean square error and the structural similarity error between the output image and the original uncovered image as the training loss function to adjust the model parameters.

In this embodiment, the mean square error (MSE) loss, which is often used as the loss function of regression models, computes the mean of the squared differences between the predicted values and the labels. When the network model outputs multidimensional data, each output can be regarded as a point in a vector space; if the distance between two points in this space is measured by the Euclidean distance, the mean square error loss corresponds to the squared L2 distance between the two points, averaged over the dimensions. The expression is as follows:

$$\mathrm{MSE}(I,K)=\frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\bigl[I(i,j)-K(i,j)\bigr]^{2}$$

The graph of this function is shown in FIG. 5.

In the above expression, I and K denote two images, which in the self-coding network are the input image and the reconstructed image respectively, and m and n denote the length and width of the images respectively. The mean square error curve is smooth, continuous, and differentiable everywhere, which makes it convenient to use with gradient descent in a neural network model, so it is a common loss function. As the error decreases, the gradient also decreases, which helps convergence to the minimum. The smaller the mean square error, the better the self-coding network model reconstructs the image as a whole.
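As a small worked example of the mean square error and its gradient, the following NumPy sketch evaluates both on 2 x 2 toy images. The function names and the toy values are assumptions for illustration.

```python
import numpy as np

def mse(I, K):
    """Mean square error between two images of identical m x n shape."""
    I = np.asarray(I, dtype=float)
    K = np.asarray(K, dtype=float)
    m, n = I.shape
    return float(((I - K) ** 2).sum() / (m * n))

def mse_gradient(I, K):
    """Gradient of the MSE with respect to the reconstructed image K."""
    I = np.asarray(I, dtype=float)
    K = np.asarray(K, dtype=float)
    return 2.0 * (K - I) / I.size

I = np.zeros((2, 2))
one_big_error = np.array([[0.0, 0.0], [0.0, 2.0]])  # a single pixel off by 2
many_small_errors = np.full((2, 2), 0.5)            # every pixel off by 0.5

print(mse(I, one_big_error))      # 1.0
print(mse(I, many_small_errors))  # 0.25
```

Because the differences are squared, a single pixel that is off by 2 costs four times as much as all four pixels being off by 0.5, and the gradient `2 (K - I) / (mn)` shrinks to zero as the reconstruction approaches the input.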

In this embodiment, the mean square error loss measures the deviation at each individual pixel of the image, so a large deviation at a single pixel can have a large influence on the overall loss; the structural similarity loss compensates for this deficiency. The structural similarity index (SSIM) focuses more on the regional characteristics of images. In some cases two images differ only slightly in brightness and their important content is nearly the same, yet their mean square error can be large. Unlike absolute-error measures such as the mean square error, the structural similarity loss therefore agrees better with human visual perception. The structural similarity index compares images from three aspects, luminance, contrast, and structure, and can be expressed as follows:

SSIM(x，y)＝f(l(x，y)，c(x，y)，s(x，y))

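In the expression, l, c, and s are defined as follows:

$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^{2}+\mu_y^{2}+C_1},\qquad c(x,y)=\frac{2\sigma_x\sigma_y+C_2}{\sigma_x^{2}+\sigma_y^{2}+C_2},\qquad s(x,y)=\frac{\sigma_{xy}+C_3}{\sigma_x\sigma_y+C_3}$$

where $\mu_x$ and $\mu_y$ are the means of all pixels in the respective image blocks, $\sigma_x^{2}$ and $\sigma_y^{2}$ are the variances of the pixel values of the image blocks, $\sigma_{xy}$ is the corresponding covariance, and the terms $C_1$, $C_2$, $C_3$ are added to prevent the denominators from being 0. If $C_3=C_2/2$, the expression simplifies to:

$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^{2}+\mu_y^{2}+C_1)(\sigma_x^{2}+\sigma_y^{2}+C_2)}$$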

The weighted sum of the mean square error and the structural similarity error is used as the loss function for training the self-coding network, and its expression is as follows:

$$\mathrm{Loss}(x,y)=w\cdot\mathrm{SSIM}(x,y)+(1-w)\cdot\mathrm{MSE}(x,y)$$
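The weighted loss can be sketched in NumPy as follows. Several choices here are assumptions rather than details from the text: the structural similarity error is taken as 1 - SSIM so that a smaller value is better, SSIM is computed once over the whole image rather than over sliding windows, the constants use the commonly quoted defaults C1 = (0.01 R)^2 and C2 = (0.03 R)^2 for data range R, and w = 0.5 is an arbitrary weight.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM (with C3 = C2 / 2) computed over the whole image at once."""
    c1 = (0.01 * data_range) ** 2  # assumed default constants
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

def combined_loss(x, y, w=0.5):
    """Weighted sum of a structural similarity error and the mean square error."""
    ssim_error = 1.0 - global_ssim(x, y)  # assumption: error = 1 - SSIM
    mse = float(np.mean((x - y) ** 2))
    return w * ssim_error + (1.0 - w) * mse

rng = np.random.default_rng(0)
original = rng.random((16, 16))
inverted = 1.0 - original  # same brightness statistics, opposite structure

print(combined_loss(original, original), combined_loss(original, inverted))
```

A perfect reconstruction gives a loss of approximately zero, while the inverted image is penalized by both terms. For training a network, the same computation would be written with a differentiable tensor library.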

Step S4: when anomaly detection is performed on a medical image, inputting the image into the trained self-coding network, obtaining a difference image from the input image and the output image, and calculating the abnormal region from the difference image.

Compared with anomaly detection methods that can only classify samples into two classes, this embodiment obtains pixel-level abnormal region detection results. Anomaly detection methods based on image segmentation can output the classification of each pixel directly, but they usually require label data to train the network. This embodiment has a structural advantage in that no pixel-level label data is needed in the training stage: the self-coding network obtains a pixel-level anomaly detection result for a sample from the difference image between the reconstructed image and the input image, that is, it judges whether each pixel is abnormal. In general, a pixel with a high value in the difference image may be an abnormal pixel, and several connected abnormal pixels form an abnormal region. Because noise may be introduced when the medical image is reconstructed, small connected regions that are not abnormal may also appear in the difference image. The abnormal region calculation adopted in this embodiment is therefore described below.

In this embodiment, a pixel value threshold is set first, and pixels of the difference image whose values are below the threshold are considered not abnormal. In the ideal case the reconstructed image and the input image are identical and every pixel of the difference image is 0; since this almost never happens, the embodiment sets a small pixel value as the threshold and treats pixels below it as normal by default. Noise interference is then removed from the remaining pixels. Analysis of medical images shows that an abnormal region is usually a relatively large connected region, so abnormal regions and noise regions can be distinguished by the number of pixels in each connected region. Once the noise regions have been identified, the pixel values of their pixels are reduced by one order of magnitude, that is, divided by 10, which reduces the interference of noise when classifying image pixels as abnormal.
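The abnormal region calculation can be sketched as below. The threshold of 0.1, the minimum region size of 5 pixels, the use of 4-connectivity, and the function names are assumptions for the example; the text only states that a small threshold is used and that regions are separated by their pixel counts.

```python
import numpy as np
from collections import deque

def connected_regions(binary):
    """Label 4-connected regions by breadth-first search; return labels and sizes."""
    labels = np.zeros(binary.shape, dtype=int)
    sizes = {}
    current = 0
    for start in zip(*np.nonzero(binary)):
        if labels[start]:
            continue  # already part of an earlier region
        current += 1
        labels[start] = current
        queue, count = deque([start]), 0
        while queue:
            r, c = queue.popleft()
            count += 1
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < binary.shape[0] and 0 <= nc < binary.shape[1]
                        and binary[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = current
                    queue.append((nr, nc))
        sizes[current] = count
    return labels, sizes

def anomaly_map(input_image, output_image, pixel_threshold=0.1, min_region_size=5):
    """Return the noise-attenuated difference image and a boolean abnormal mask."""
    diff = np.abs(input_image.astype(float) - output_image.astype(float))
    candidate = diff >= pixel_threshold  # pixels below the threshold are normal
    labels, sizes = connected_regions(candidate)
    scores = diff.copy()
    abnormal = np.zeros(diff.shape, dtype=bool)
    for region_id, size in sizes.items():
        region = labels == region_id
        if size < min_region_size:
            scores[region] /= 10.0  # noise region: reduce by one order of magnitude
        else:
            abnormal[region] = True
    return scores, abnormal

# Toy example: a 4x4 "lesion" and one isolated noisy pixel.
inp = np.zeros((10, 10))
inp[2:6, 2:6] = 1.0
inp[8, 8] = 1.0
out = np.zeros((10, 10))  # stand-in for the network's reconstruction
scores, abnormal = anomaly_map(inp, out)
```

Here the 16-pixel block is kept as an abnormal region, while the isolated pixel is treated as noise and its difference value is divided by 10.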

This embodiment introduces a superpixel foreground masking strategy into a semi-supervised image anomaly detection framework based on a self-coding network, trains the network to fill in covered regions from context information, and improves the network's ability to extract semantic information from normal images and to reconstruct them.

The mean square error loss (MSE) and the structural similarity index loss (SSIM) assess the reconstruction from the overall and the local perspective, respectively. When the self-coding network is trained, their weighted sum is used as the loss function constraining the training; the closer the reconstructed medical image is to the comparison image, the better. This overcomes the defect that a network trained with MSE alone as the loss function is too easily influenced by the error of a single pixel.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, the patentee may make various changes or modifications within the scope of the appended claims, and these fall within the scope of protection of the invention as long as they do not exceed the scope described in the claims.

Claims

1. A self-coding network medical image anomaly detection method is characterized by comprising the following steps:

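S1, selecting set images as the data set of a medical image anomaly detection task, and dividing the foreground of each set image into superpixel blocks using a simple linear iterative clustering algorithm to obtain a superpixel block mask of the set image;

S2, randomly selecting a plurality of superpixel block masks of the set image, and covering the original image of the set image with the selected superpixel block masks;

S3, inputting the covered set image into a self-coding network for training, wherein the self-coding network adjusts its network parameters using, as the loss function, the weighted sum of the mean square error and the structural similarity error between the output image and the original, uncovered set image, so that the difference image of the output image and the input image of the self-coding network reaches a first preset threshold value;

and S4, when anomaly detection is performed on a medical image, inputting the image into the trained self-coding network, obtaining a difference image from the input image and the output image, and calculating the abnormal region from the difference image.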

2. The self-coding network medical image anomaly detection method according to claim 1, wherein the pixel value filled into the covered region in step S2 is the average of the pixel values of the corresponding region of the original image.

3. The self-coding network medical image anomaly detection method according to claim 1, wherein in the step S3:

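the expression of the mean square error is:

$$\mathrm{MSE}(I,K)=\frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\bigl[I(i,j)-K(i,j)\bigr]^{2}$$

in the formula, I and K represent the input image and the reconstructed image respectively, and m and n represent the length and width of the images respectively;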

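the structural similarity error expression is as follows:

$$\mathrm{SSIM}(x,y)=f\bigl(l(x,y),\,c(x,y),\,s(x,y)\bigr)$$

where l is brightness, c is contrast, and s is structure, each defined as follows:

$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^{2}+\mu_y^{2}+C_1},\qquad c(x,y)=\frac{2\sigma_x\sigma_y+C_2}{\sigma_x^{2}+\sigma_y^{2}+C_2},\qquad s(x,y)=\frac{\sigma_{xy}+C_3}{\sigma_x\sigma_y+C_3}$$

in the formula, $\mu_x$ and $\mu_y$ are the means of all pixels in the respective image blocks, $\sigma_x^{2}$ and $\sigma_y^{2}$ are the variances of the pixel values of the image blocks, $\sigma_{xy}$ is the corresponding covariance, and the terms $C_1$, $C_2$, $C_3$ are added to prevent the denominators from being 0; if $C_3=C_2/2$, the simplified structural similarity error expression is as follows:

$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^{2}+\mu_y^{2}+C_1)(\sigma_x^{2}+\sigma_y^{2}+C_2)}$$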

the expression of the loss function is:

$$\mathrm{Loss}(x,y)=w\cdot\mathrm{SSIM}(x,y)+(1-w)\cdot\mathrm{MSE}(x,y).$$

4. The self-coding network medical image anomaly detection method according to claim 1, wherein calculating the abnormal region according to the difference image in step S4 specifically includes: setting the pixels in the difference image whose values are smaller than a second set threshold as normal pixels, and distinguishing noise regions among the remaining regions by the number of pixels in each connected region.

5. The self-encoding network medical image anomaly detection method according to claim 4, wherein pixel values in the noise region are reduced by an order of magnitude.

6. The self-coding network medical image anomaly detection method according to claim 1, wherein the set image is a brain nuclear magnetic resonance image.