Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a HER2 image classification system based on a multi-convolution neural network, which uses a combination of networks to extract the features of a HER2 image and then fuses the image information extracted by the different networks using a linear weighted fusion algorithm, thereby obtaining higher classification precision.
In a first aspect, the invention provides a HER2 image classification system based on a multi-convolution neural network;
a HER2 image classification system based on a multi-convolution neural network, comprising:
an image acquisition module configured to: acquiring an HER2 image to be classified;
a pre-processing module configured to: preprocessing a HER2 image to be classified;
an image feature extraction module configured to: inputting the preprocessed HER2 image to be classified into a multi-convolution neural network to obtain the fusion characteristic of the HER2 image;
a classification module configured to: based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image;
the multi-convolution neural network comprises a feature extractor and a feature fusion device, the feature extractor adopts at least two different convolution neural networks to simultaneously extract features of the HER2 images to be classified, and the feature fusion device fuses the features extracted by the different convolution neural networks by using a linear weighting fusion method to obtain fusion features.
Further, before fusing the features extracted by different convolutional neural networks, the feature fusion device needs to flatten and correct the features extracted by different convolutional neural networks to obtain the features with the same size.
Further, a network training module is configured to: acquire an original training set, filter the original training set with a median filter, input the filtered training set into a data augmentor for data augmentation, and train the multi-convolution neural network with the augmented training set.
Further, the data augmentor is configured to: flip an input HER2 image; crop the flipped HER2 image to obtain sub-images, wherein each sub-image has the same category label as the input HER2 image; and discard sub-images of a significantly different type than the input HER2 image.
Further, the feature extractor adopts three convolutional neural networks of VGG16, ResNet18 and AlexNet to simultaneously extract features of the HER2 image to be classified.
Further, the VGG16 is composed of 1 input layer, 13 convolutional layers, 5 pooling layers, and 2 fully-connected layers.
Further, the ResNet18 is composed of 17 convolutional layers and 1 fully-connected layer.
Further, the AlexNet is composed of 5 convolutional layers, 3 pooling layers and 1 fully-connected layer.
In a second aspect, the present invention also provides an electronic device, including:
a memory for non-transitory storage of computer readable instructions; and
a processor for executing the computer readable instructions,
wherein the computer readable instructions, when executed by the processor, perform the steps of:
acquiring an HER2 image to be classified;
preprocessing a HER2 image to be classified;
inputting the preprocessed HER2 image to be classified into a multi-convolution neural network to obtain the fusion characteristic of the HER2 image;
based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image;
the multi-convolution neural network comprises a feature extractor and a feature fusion device, the feature extractor adopts at least two different convolution neural networks to simultaneously extract features of the HER2 images to be classified, and the feature fusion device fuses the features extracted by the different convolution neural networks by using a linear weighting fusion method to obtain fusion features.
In a third aspect, the present invention also provides a storage medium storing non-transitory computer readable instructions, wherein the non-transitory computer readable instructions, when executed by a computer, perform the steps of:
acquiring an HER2 image to be classified;
preprocessing a HER2 image to be classified;
inputting the preprocessed HER2 image to be classified into a multi-convolution neural network to obtain the fusion characteristic of the HER2 image;
based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image;
the multi-convolution neural network comprises a feature extractor and a feature fusion device, the feature extractor adopts at least two different convolution neural networks to simultaneously extract features of the HER2 images to be classified, and the feature fusion device fuses the features extracted by the different convolution neural networks by using a linear weighting fusion method to obtain fusion features.
Compared with the prior art, the invention has the beneficial effects that:
according to the HER2 image classification system based on the multi-convolution neural network, the characteristic extraction module uses various neural networks, and different convolution neural networks can extract characteristics of HER2 images in different scales, so that the network learns richer HER2 image information, and the HER2 image classification accuracy is improved.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
All data in the embodiments are obtained and used lawfully, on the basis of compliance with laws and regulations and with user consent.
Example one
The embodiment provides a HER2 image classification system based on a multi-convolution neural network;
as shown in fig. 1, a HER2 image classification system based on a multi-convolution neural network adopts networks with different structures to perform feature extraction of HER2 images, fuses image information extracted by different networks by using a linear weighted fusion algorithm, optimizes weights in the linear weighted fusion, and performs classification output of HER2 images by using a classifier. The system specifically comprises an image acquisition module, a preprocessing module, a fusion feature extraction module, a classification module and a network training module.
An image acquisition module configured to: HER2 images to be classified are acquired.
As shown in fig. 3, the four HER2 images from the Stanford University tissue microarray database are of the types 0, 1+, 2+ and 3+, respectively, each image being 1504 × 1440 pixels in size.
A pre-processing module configured to: the HER2 image to be classified is preprocessed, specifically, the original training set is filtered by a median filter.
During the acquisition, transmission and conversion of HER2 images (such as imaging, scanning, transmission and display), all of which take place in a complex environment, HER2 images are disturbed to varying degrees by visible or invisible noise, resulting in degraded image quality. In addition to visual degradation, noise may also mask important image details. Preprocessing can remove different types of noise; at the same time, preprocessing simplifies the data, increases the training speed of the network model, and improves the reliability of feature extraction and recognition.
In one embodiment, considering the characteristics of the HER2 image, a median filter is selected to remove redundant information and reduce the noise of the image, wherein the median filter has a kernel size of 3 × 3.
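The 3 × 3 median filtering step can be sketched as follows. This is a minimal NumPy implementation for a single-channel image, for illustration only; a real pipeline would typically call a library routine such as OpenCV's medianBlur:

```python
import numpy as np

def median_filter_3x3(img: np.ndarray) -> np.ndarray:
    """Apply a 3x3 median filter to a single-channel image.

    Edge pixels are handled by replicating the border ("edge" padding).
    """
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

# A salt-noise spike in a flat region is removed by the median.
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255  # isolated noise pixel
filtered = median_filter_3x3(img)
print(filtered[2, 2])  # -> 10
```

The Python loop above is only for clarity; on full 1504 × 1440 HER2 images a vectorized or library implementation would be used instead.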
An image feature extraction module configured to: and inputting the preprocessed HER2 image to be classified into a trained multi-convolution neural network to obtain the fusion characteristic of the HER2 image. As shown in fig. 2, the multi-convolution neural network includes a feature extractor and a feature fusion device.
(1) Feature extractor
The feature extractor adopts at least two different convolutional neural networks to simultaneously extract features of the HER2 image to be classified.
Considering that a single convolutional neural network or residual network cannot extract multiple abstract features of an image, the feature extractor uses multiple pre-trained networks to extract features.
As one implementation, the feature extractor adopts the three convolutional neural networks VGG16, ResNet18 and AlexNet to simultaneously extract features of the HER2 image to be classified.
The ResNet18 has a deeper network structure, which improves classification accuracy, and the residual blocks in the ResNet18 network reduce the probability of gradient vanishing. The traditional ResNet18 is composed of 17 convolutional layers, a Softmax layer and 1 fully-connected layer. Because the outputs of the individual networks differ, in order to make the outputs of the three networks consistent, the invention adjusts the final layers: the fully-connected layer and the Softmax layer of the traditional network are deleted, and a new fully-connected layer is added at the end. Therefore, the ResNet18 of the invention is composed of 17 convolutional layers and 1 fully-connected layer, and adopts the ReLU function as the activation function, whose formula is f(x) = max(0, x): when the input x is less than 0 the output is 0, and when x is greater than 0 the output is x. This activation function makes the network converge more quickly. The ResNet18 network input is an RGB image of 224 × 224 × 3.
The VGG16 network structure is very regular and easily modified, and small convolution kernels in series have fewer parameters than a single larger convolution kernel used alone. The VGG16 network input is an RGB image of 224 × 224 × 3. The traditional VGG16 is composed of 1 input layer, 13 convolutional layers, 5 pooling layers, 3 fully-connected layers and 1 output layer. The invention keeps the original structure unchanged except that the last two fully-connected layers, two dropout layers, two activation layers and the softmax layer are deleted, and one fully-connected layer is added to adjust the output size. Therefore, the VGG16 of the invention is composed of 1 input layer, 13 convolutional layers, 5 pooling layers and 2 fully-connected layers (the last fully-connected layer serving as the output layer).
The AlexNet network uses dropout layers to randomly deactivate neurons in the hidden layers; the neurons dropped in each iteration do not disappear, so successive iterations effectively train different network architectures that share weights, which makes the learned network more robust. The fully-connected layers have the same structure as an artificial neural network, with a very large number of nodes and connections, so dropout layers are introduced to remove insufficiently activated units. The traditional AlexNet network consists of 5 convolutional layers, 3 pooling layers, 2 fully-connected layers and 1 Softmax layer. The invention deletes the layers behind the average pooling layer (two fully-connected layers, two dropout layers, two activation layers and one Softmax layer) and adds one fully-connected layer at the end, so the AlexNet of the invention consists of 5 convolutional layers, 3 pooling layers and 1 fully-connected layer. The AlexNet network input is an RGB image of 224 × 224 × 3.
(2) Feature fusion device
The feature fusion device fuses features extracted by different convolutional neural networks by using a linear weighted fusion method to obtain fusion features.
As shown in fig. 5, in order to fuse feature matrices output by different networks, before fusing features extracted by different convolutional neural networks by using a linear weighted fusion algorithm, a feature fusion device needs to flatten and correct the features extracted by the different convolutional neural networks, and the method specifically includes: firstly, flattening the characteristic matrix extracted by each convolutional neural network by using a flattening layer, namely, outputting multidimensional input in a one-dimensional mode, wherein the flattening layer does not influence the size of a batch; and then correcting the flattened features of different sizes by using the full-connection layer to obtain the features of the same size, so that fusion of different models is facilitated.
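The flatten-and-correct step can be sketched as follows. This NumPy sketch uses untrained, randomly initialized fully-connected weights and a hypothetical common size of 128; the feature-map shapes are illustrative stand-ins for the three backbones' outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-network feature maps for a batch of 4 images
# (channel and spatial sizes differ per backbone).
f_vgg = rng.normal(size=(4, 512, 7, 7))
f_res = rng.normal(size=(4, 512, 1, 1))
f_alex = rng.normal(size=(4, 256, 6, 6))

def flatten_and_correct(feat: np.ndarray, out_dim: int, rng) -> np.ndarray:
    """Flatten each sample (the flattening layer keeps the batch dimension),
    then map to a common size with a fully-connected layer so that features
    from different backbones can be fused."""
    flat = feat.reshape(feat.shape[0], -1)                     # (batch, C*H*W)
    w = rng.normal(scale=0.01, size=(flat.shape[1], out_dim))  # FC weights (stand-in)
    return flat @ w                                            # (batch, out_dim)

common = 128
g1 = flatten_and_correct(f_vgg, common, rng)
g2 = flatten_and_correct(f_res, common, rng)
g3 = flatten_and_correct(f_alex, common, rng)
print(g1.shape, g2.shape, g3.shape)  # all (4, 128)
```

After this correction the three feature sets have identical shapes and can be combined element-wise by the linear weighted fusion of formula (1).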
The optimal weight problem is a multi-objective decision optimization problem under constraint conditions, the feature fusion device adopts a linear weighting method to convert the multi-objective optimization problem into a single-objective optimization problem, and a mathematical model is established:
max f(x) = ω1·f1 + ω2·f2 + ω3·f3 (1)
wherein f1, f2 and f3 respectively represent the features of the feature matrices extracted by the VGG16, ResNet18 and AlexNet neural networks after flattening and correction, on which linear weighted fusion is then performed; ω1, ω2 and ω3 respectively represent the weights of f1, f2 and f3.
The process of solving for the optimal weights can be divided into two steps: the first step is to set a random feasible solution, and the second step is to optimize it. First, the initial weights ω1, ω2, ω3 are assigned as 0.10, 0.15, 0.75 (with ω1 + ω2 + ω3 = 1 and 0 < ω1 < 1, 0 < ω2 < 1, 0 < ω3 < 1), and the optimization target is accuracy. Through experimental adjustment the weights are optimized, and the highest accuracy is finally obtained when ω1 = 0.40, ω2 = 0.35 and ω3 = 0.25, as shown in fig. 6.
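With the optimal weights reported above, the linear weighted fusion of formula (1) reduces to a weighted element-wise sum of the corrected features. A minimal sketch (feature values are synthetic):

```python
import numpy as np

def linear_weighted_fusion(f1, f2, f3, w=(0.40, 0.35, 0.25)):
    """Formula (1): f(x) = w1*f1 + w2*f2 + w3*f3, with w1 + w2 + w3 = 1."""
    assert abs(sum(w) - 1.0) < 1e-9, "weights must sum to 1"
    return w[0] * f1 + w[1] * f2 + w[2] * f3

# Toy corrected features of identical shape (batch of 2, dimension 4).
f1 = np.ones((2, 4))
f2 = 2 * np.ones((2, 4))
f3 = 4 * np.ones((2, 4))
fused = linear_weighted_fusion(f1, f2, f3)
print(fused[0, 0])  # 0.40*1 + 0.35*2 + 0.25*4 = 2.1
```

The weights here are the experimentally optimized values from the text; in practice they would be tuned against validation accuracy as described.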
A classification module configured to: and based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image.
As an embodiment, a classification output of the HER2 image is performed using an SVM classifier.
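Classifying the fused features with an SVM can be sketched as follows, assuming scikit-learn; the fused features and labels below are synthetic placeholders, with classes 0 to 3 standing for the HER2 scores 0, 1+, 2+ and 3+:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical fused features for 40 training images, 4 HER2 classes.
X = rng.normal(size=(40, 128))
y = np.repeat([0, 1, 2, 3], 10)
X += y[:, None]  # shift class means apart so the toy data is separable

clf = SVC(kernel="rbf").fit(X, y)   # SVM classifier on fused features
train_acc = (clf.predict(X) == y).mean()
print(train_acc)
```

In the actual system the inputs to `fit` would be the fused features produced by the multi-convolution neural network for the training set, not synthetic data.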
A network training module configured to: acquire an original training set, filter the original training set with a median filter, input the filtered training set into the data augmentor for data augmentation, and train the multi-convolution neural network with the augmented training set.
As an embodiment, the training and test sets used for the multi-convolution neural network training are HER2 images from the Stanford University tissue microarray database, which includes four types (0, 1+, 2+, 3+) with 334 HER2 images of each type, each image 1504 × 1440 pixels in size. IHC measured the HER2 receptor protein expression in each image, as shown in fig. 3, and the HER2 score was assigned as follows: 3+ (greater than 10% of invasive cancer cells show strong, intact and uniform cell membrane staining); 2+ (weak to moderate complete membrane staining observed in more than 10% of tumor cells); 1+ (incomplete membrane staining that is very weak and barely detectable in more than 10% of tumor cells); 0 (no staining, or less than 10% of invasive cancer cells show incomplete and weak cell membrane staining).
As an embodiment, after the training set needs to be filtered by a median filter, data amplification is performed, specifically:
the HER2 image belongs to the field of medical images, with few common datasets. The data collected from hospitals can vary widely and are not necessarily uniform in form. Sometimes, the doctor needs manual marking, and the workload is very large. However, a small amount of data may over-fit the multirollected neural network during training, which means that the network performs well in the training data, but does not perform well in the test data, resulting in a degraded performance of the multirolleted neural network. Therefore, in order to obtain higher precision, the original data set is expanded, and the problem of insufficient data is solved. Since the HER2 pathology images are rotation invariant, the researcher can analyze HER2 pathology images from different directions with no impact on the diagnosis. But for a multi-convolution neural network it is an image that is completely different from before the rotational translation, but its class is the same. Thus, the data augmentor is configured to: inputting a HER2 image in the training set after filtering, and turning over the input HER2 image; cutting the turned HER2 image by using a cutting method to obtain sub-images, wherein each sub-image is considered to have the same class label as the original image, and the size of the cut image is 224 multiplied by 224; sub-images of a significantly different type than the input HER2 image are discarded (a method of manual observation may be used). As shown in fig. 4, the input HER2 image a belongs to the category of 3+, and the sub-image (a) should belong to 1+, and is discarded if it is not consistent with the type of the original image; and the sub-image (b) belonging to 3+ is retained. Eventually, approximately 5000 HER2 images were obtained.
In the experiment, 400 HER2 images (100 images each of 0, 1+, 2+ and 3+) from the Stanford University tissue microarray database were used as the test set, and 4000 images were used as the training set. The Adam optimization algorithm was used to update the network weight parameters, with a learning rate of 0.001, a batch size of 16, an input size of 224 × 224, and 500 iterations; during testing, the weights and the batch normalization statistics computed over the entire training data set were used, and this normalization provided better results. The multi-convolution neural network was trained with an end-to-end stochastic gradient descent method.
In the experiment, in order to evaluate the classification result of the HER2 pathology image, the evaluation indices used were Accuracy, Precision, Recall, and the F1 value (F1 Score). The calculation formula of the accuracy rate is:
Accuracy = (TP + TN)/(TP + TN + FP + FN)
wherein TP (true positive) represents the number of cases correctly predicted as positive; FN (false negative) represents the number of positive cases mispredicted as negative; FP (false positive) represents the number of negative cases mispredicted as positive; and TN (true negative) represents the number of cases correctly predicted as negative.
The calculation formula of the precision rate is:
Precision = TP/(TP + FP)
The calculation formula of the recall rate is:
Recall = TP/(TP + FN)
The calculation formula of the F1 value is:
F1 = 2 × Precision × Recall/(Precision + Recall)
to verify the impact of data preprocessing and data enhancement on the classification results, the raw data set without processing is first input into a multi-convolution neural network for training and testing to get the left graph in fig. 7. And then carrying out smooth filtering processing on the original data, then translating, rotating, shearing and amplifying the data, updating network weight parameters by using an Adam optimization algorithm, wherein the learning rate is 0.001, the batch size is 16, the input size is 224 multiplied by 224, and the iteration number is 500, so that the right graph in the graph 7 is obtained.
As can be seen from the data comparison in fig. 7, if the original data is not subjected to data filtering and data amplification in the multi-convolution neural network training process, the accuracy on the training set is high, but the accuracy on the test set is low, and the original data fluctuates all the time and is not stable, which results in over-fitting. However, after smoothing filtering and rotation of the HER2 data set, a high accuracy was achieved on both the training set and the test set. Therefore, the experiments show that the data preprocessing and the data enhancement of the HER2 pathological image in the multi-convolution neural network training process can avoid overfitting and improve the identification accuracy.
In order to verify that the multi-convolution neural network can better extract information from a HER2 image for efficient classification, the commonly used single models VGG16, VGG19 and AlexNet were compared with the multi-convolution neural network. The evaluation indices of ResNet18, the single model with the best effect, are shown in table 1, together with experimental data for the pairwise combinations of VGG16, ResNet18 and AlexNet. In order to evaluate the multi-convolution neural network better, the recall rate and the F1 value are added for comparison; the loss curves are shown in fig. 8. The data obtained in the experiments prove the effectiveness of the multi-convolution neural network.
TABLE 1 Experimental results for various models
Method | Accuracy | Recall | F1 value
Multi-convolution neural network | 0.931 | 0.948 | 0.947
VGG16+ResNet18 | 0.923 | 0.924 | 0.925
VGG16+AlexNet | 0.916 | 0.916 | 0.915
AlexNet+ResNet18 | 0.917 | 0.916 | 0.921
ResNet18 | 0.906 | 0.906 | 0.905
To verify that combining the SVM classifier with the pre-trained multi-convolution neural network gives the best results, various classifiers were combined with the multi-convolution neural network; the results are shown in fig. 9. The experimental results show that the system of the invention is efficient and robust. The HER2 image classification system based on the multi-convolution neural network of the present invention is of great help to primary and advanced pathologists in the classification of HER2-stained images.
The HER2 image classification system based on the multi-convolution neural network considers that a single convolutional neural network or residual network cannot extract the multiple abstract features of an image well. Therefore, feature extraction is performed with several pre-trained networks; specifically, a combined network of VGG16, AlexNet and ResNet18 extracts features from the HER2 image, the image information extracted by the different networks is fused with a linear weighted fusion algorithm, the weights in the linear weighted fusion are then optimized through multiple experiments, and finally a fully-connected layer produces the classification output of the HER2 image, forming an automatic classification system and obtaining higher HER2 image classification accuracy.
Example two
The present embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the following steps:
acquiring an HER2 image to be classified;
preprocessing a HER2 image to be classified;
inputting the preprocessed HER2 image to be classified into a multi-convolution neural network to obtain the fusion characteristic of the HER2 image;
based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image;
the multi-convolution neural network comprises a feature extractor and a feature fusion device, the feature extractor adopts at least two different convolution neural networks to simultaneously extract features of the HER2 images to be classified, and the feature fusion device fuses the features extracted by the different convolution neural networks by using a linear weighting fusion method to obtain fusion features.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, it is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
EXAMPLE III
The present embodiments also provide a computer-readable storage medium storing computer instructions that, when executed by a processor, perform the steps of:
acquiring an HER2 image to be classified;
preprocessing a HER2 image to be classified;
inputting the preprocessed HER2 image to be classified into a multi-convolution neural network to obtain the fusion characteristic of the HER2 image;
based on the fusion characteristics, a classifier is adopted to obtain the category of the HER2 image;
the multi-convolution neural network comprises a feature extractor and a feature fusion device, the feature extractor adopts at least two different convolution neural networks to simultaneously extract features of the HER2 images to be classified, and the feature fusion device fuses the features extracted by the different convolution neural networks by using a linear weighting fusion method to obtain fusion features.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.