Disclosure of Invention
To solve the above problems, the invention provides a method for constructing a detection and classification model for pathological squamous epithelial cells, which comprises constructing an abnormal cell detection model, constructing a visual field image judgment model, and constructing a sample judgment model;
the construction of the abnormal cell detection model comprises the following steps:
Step 1: detecting suspected diseased cells, wherein the detection training comprises candidate box extraction, classification and localization, and reward-and-punishment convergence;
Step 2: optimizing the detection result, which comprises optimization training and optimization testing, wherein the optimization training comprises visual field image feature extraction, prediction, and comparison convergence;
the construction of the visual field image judgment model comprises judgment of a single visual field image; the judgment of a single visual field image comprises judgment training and judgment testing; the judgment training comprises sub-image feature extraction, comprehensive prediction, and comparison convergence;
the construction of the sample judgment model comprises sample diagnosis based on single visual field images; the sample diagnosis based on single visual field images comprises sample diagnosis training and sample diagnosis testing; the sample diagnosis training comprises sequence composition, state integration, and comparison convergence.
As a preferred technical solution, the candidate box extraction step in the detection training is: a visual field image of the cell sample is input, and the detection network extracts candidate boxes according to the generation and modification principles.
As a preferred technical solution, the classification and localization step in the detection training is: features corresponding to lesions are extracted from each candidate box based on the current training state of the detection network, a classification result for the candidate box is obtained through feature selection and feature analysis, and the position of the candidate box is adjusted to obtain the final localization.
As a preferred technical solution, the visual field image feature extraction step in the optimization training is: a visual field image containing the detection boxes generated in Step 1 is input, and features related to true positives and false positives are extracted from it based on the current training state of the detection network.
As a preferred technical solution, the sub-image feature extraction step in the judgment training is: the detection result of the abnormal cell detection model is input into the detection network as a visual field image, the detection boxes in it are extracted as sub-images, and the features corresponding to lesions in each sub-image are extracted based on the current training state of the detection network.
As a preferred technical solution, the comprehensive prediction step in the judgment training is: the features of the sub-images are aggregated into feature information that represents the whole visual field image; after convolution, pooling and activation operations, this feature information is input into a fully connected classification network, so that the original pixel information of the image is mapped to feature information and then to classification information, giving the judgment result for the whole visual field image.
As a preferred technical solution, the judgment testing step is: a detection result output by the abnormal cell detection model is input as a visual field image into the trained visual field image judgment network, the detection boxes in it are extracted as sub-images, the features of the sub-images are acquired and aggregated, the aggregated features are taken as the features of the whole visual field image, and after convolution, pooling and activation operations they are input into the fully connected classification network to obtain the final judgment of the visual field image.
As a preferred technical solution, the sequence composition step in the sample diagnosis training is: the visual field images are ranked by positive confidence according to the judgment results of the visual field image judgment model, and the top 10 visual field images are selected as one sequence.
As a preferred technical solution, the state integration step in the sample diagnosis training is: the features of the whole visual field image obtained in the visual field image judgment model are taken as the representative features of that visual field image; the elements of a sequence are input one at a time; at each position, the output of the previous position is combined with the input of the current position and used as the input of the RNN model at the current position, and the output of the current position is obtained through the convolution, pooling and activation operations of the RNN; this continues until the output of the last position is obtained, which is passed through a fully connected layer to obtain classification information, and the sample judgment result is output.
As a preferred technical solution, the sample diagnosis testing step is: the integrated features of the 10 visual field images with the highest confidence in the sample are input into the trained sample judgment network; the RNN model processes them position by position, obtaining the output of each position through its convolution, pooling and activation operations until the output of the last position is obtained, which is passed through a fully connected layer to obtain the judgment result for the current sample.
Beneficial effects: the construction of the detection and classification model for pathological squamous epithelial cells comprises constructing an abnormal cell detection model, a visual field image judgment model and a sample judgment model. The three models are interlocked and progress layer by layer: each builds on the detection result of the preceding model and subjects it to optimized detection, detailed re-diagnosis and integrated re-diagnosis. The diagnosis result is thereby checked at multiple levels, which improves its accuracy and yields a complete sample diagnosis method.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
Unless defined otherwise, all terms (including technical and scientific terms) used in disclosing the invention have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs. By way of further guidance, definitions of terms are included to better understand the teachings of the present invention.
To solve the above problems, the invention provides a method for constructing a detection and classification model for pathological squamous epithelial cells, which comprises constructing an abnormal cell detection model, constructing a visual field image judgment model, and constructing a sample judgment model;
the construction of the abnormal cell detection model comprises the following steps:
Step 1: detecting suspected diseased cells;
Step 2: optimizing the detection result.
Construction of abnormal cell detection model
As shown in fig. 1, in the abnormal cell detection model, a microscope image of cells is input into a detection network as a visual field image, and an abnormal cell detection result for the visual field image is obtained through abnormal cell detection followed by abnormal cell refinement. Abnormal cell detection is Step 1 of the model construction, "detecting suspected diseased cells", and is intended to locate and classify lesion-related features in the visual field image. Abnormal cell refinement is Step 2 of the model construction, "optimizing the detection result", and optimizes the result of Step 1 by identifying true positives and reducing false positives.
Step 1: detecting suspected diseased cells
This step is based on the Faster R-CNN deep learning framework and detects abnormal cells using the annotation boxes labeled by professional doctors as the detection information.
In some embodiments, the step of detecting suspected diseased cells comprises detection training and detection testing.
In some embodiments, the detection training includes candidate box extraction, classification and localization, and reward-and-punishment convergence.
In some embodiments, the candidate box extraction step is: a visual field image of the cell sample is input, and the detection network extracts candidate boxes according to the generation and modification principles.
In some embodiments, the field of view map is 1024 x 1024 in size.
In some embodiments, the generation and modification principles govern the scale and size of the candidate boxes.
In some embodiments, the generation principle defines each pixel of the feature map output by the last convolutional layer of the pre-trained network as an anchor, at which k candidate boxes can be generated, each corresponding to one combination of scale and aspect ratio.
In some embodiments, the generation principle uses 3 scales (128, 256 and 512) and 3 aspect ratios (1:2, 1:1 and 2:1), so that each anchor position yields 9 candidate boxes.
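By way of illustration only, the generation of 9 candidate boxes at one anchor can be sketched as follows. The function name `generate_anchors`, the (x1, y1, x2, y2) box format, and the convention that each box keeps an area of scale squared with the ratio read as width to height are assumptions borrowed from common Faster R-CNN practice; the embodiments above do not specify them.

```python
import math

def generate_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return one (x1, y1, x2, y2) box per (scale, ratio) pair, centred on (cx, cy).

    Assumption: ratio = width / height and every box has area scale**2.
    """
    boxes = []
    for scale in scales:
        for ratio in ratios:
            w = scale * math.sqrt(ratio)   # wider boxes for larger ratios
            h = scale / math.sqrt(ratio)   # height shrinks so the area is preserved
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

# One anchor at the centre of a 1024 x 1024 visual field image.
anchors = generate_anchors(512, 512)
print(len(anchors))  # 3 scales x 3 ratios
```

With 3 scales and 3 aspect ratios, k equals 9 for every anchor position.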
In some embodiments, the modification principle is as follows: the annotation boxes labeled in advance by a professional doctor are used to fine-tune the candidate boxes and to remove candidate boxes whose size does not meet the requirement; finally, an overlap-based merging method is used to merge candidate boxes whose degree of overlap exceeds a fixed threshold, completing the modification of the candidate boxes.
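As a non-limiting sketch of an overlap-based merge, the example below measures overlap as intersection over union (IoU) and applies greedy non-maximum suppression, which is the technique commonly paired with Faster R-CNN. The embodiment names neither the overlap measure nor the threshold value, so `iou`, `merge_overlapping`, the per-box scores and the threshold of 0.5 are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def merge_overlapping(boxes, scores, threshold):
    """Greedy merge: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than `threshold`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
merged = merge_overlapping(boxes, scores, threshold=0.5)
print(merged)  # the second box is absorbed by the first
```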
In some embodiments, the classification and localization step is: features corresponding to lesions are extracted from each candidate box based on the current training state of the detection network, a classification result for the candidate box is obtained through feature selection and feature analysis, and the position of the candidate box is adjusted to obtain the final localization.
In some embodiments, the feature selection and feature analysis comprise convolution, pooling and activation; the convolution parameters are 3 × 256, 3 × 512, 3 × 1024; the pooling adopts max pooling; the activation adopts the ReLU function.
The most important part of a convolutional neural network is the filter, also called the kernel. A filter converts a sub-matrix of nodes in the current layer into a unit node matrix in the next layer, that is, a node matrix whose length and width are both 1 but whose depth is unrestricted. The length and width of the sub-matrix processed by the filter are specified manually and are called the size of the filter; common sizes are 3 × 3 and 5 × 5. Because the depth processed by the filter always equals the depth of the node matrix of the current layer, only two dimensions need to be specified for the filter size even though the node matrix is three-dimensional. The other setting that must be specified manually is the depth of the resulting unit node matrix, which is called the depth of the filter. In summary, the size of a filter refers to the size of its input node matrix, and its depth refers to the depth of its output unit node matrix. Within a convolutional layer, the same filter parameters are applied at every position; sharing the filter parameters makes the response to image content independent of where that content appears.
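The filter operation described above can be illustrated with a minimal pure-Python sketch. It assumes stride 1 and no padding, and computes cross-correlation, as most CNN libraries do under the name convolution; `conv2d` and the nested-list layout are illustrative choices, not part of the embodiments.

```python
def conv2d(image, filters):
    """Stride-1, unpadded convolution.

    image:   [depth][height][width] nested lists
    filters: [out_depth][depth][k][k] nested lists
    returns: [out_depth][height - k + 1][width - k + 1]
    """
    depth, height, width = len(image), len(image[0]), len(image[0][0])
    k = len(filters[0][0])
    out = []
    for f in filters:  # one output plane per filter: the filter "depth"
        plane = []
        for y in range(height - k + 1):
            row = []
            for x in range(width - k + 1):
                # The same weights f are reused at every (y, x): parameter sharing.
                s = 0.0
                for d in range(depth):
                    for dy in range(k):
                        for dx in range(k):
                            s += image[d][y + dy][x + dx] * f[d][dy][dx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

image = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
sum_filter = [[[1, 1, 1], [1, 1, 1], [1, 1, 1]]]      # sums each 3 x 3 window
centre_filter = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]]   # copies the window centre
result = conv2d(image, [sum_filter, centre_filter])
print(result)
```

Here the filter size is 3 × 3 and the filter depth is 2, so a 1 × 4 × 4 input becomes a 2 × 2 × 2 output.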
A pooling layer placed between convolutional layers effectively reduces the size of the matrix and hence the number of parameters in the final fully connected layers; pooling therefore both speeds up computation and helps prevent overfitting. The computation in a pooling filter is not a weighted sum of nodes but a simpler maximum or average operation. A pooling layer that takes the maximum is called a max pooling layer, and one that takes the average is called an average pooling layer.
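A minimal sketch of the two pooling variants follows, assuming non-overlapping 2 × 2 windows; the window size and the helper name `pool2d` are assumptions, since the embodiments do not state them.

```python
def pool2d(plane, size=2, mode="max"):
    """Non-overlapping size x size pooling over one 2-D feature plane."""
    if mode == "max":
        reduce = max
    else:
        reduce = lambda window: sum(window) / len(window)
    out = []
    for y in range(0, len(plane) - size + 1, size):
        row = []
        for x in range(0, len(plane[0]) - size + 1, size):
            window = [plane[y + dy][x + dx]
                      for dy in range(size) for dx in range(size)]
            row.append(reduce(window))
        out.append(row)
    return out

plane = [[1, 3, 2, 0],
         [4, 6, 1, 1],
         [0, 2, 9, 5],
         [1, 1, 3, 7]]
max_pooled = pool2d(plane, mode="max")
avg_pooled = pool2d(plane, mode="avg")
print(max_pooled, avg_pooled)
```

Either variant reduces the 4 × 4 plane to 2 × 2, a fourfold reduction in the number of values passed to later layers.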
Each neuron node in a neural network receives the output values of the neurons in the previous layer as its input and passes its own output to the next layer; a neuron node in the input layer passes the input attribute values directly to the next layer (a hidden layer or the output layer). In a multi-layer neural network, there is a functional relationship between the output of a node in one layer and the input of a node in the next layer, and this function is called the activation function. Traditional neural networks mainly adopted the sigmoid or tanh function, whose output is bounded and can readily serve as the input of the next layer. In recent years, the ReLU function and its variants, such as Leaky ReLU, PReLU and RReLU, have come into use.
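For reference, the activation functions named above can be written as follows; the Leaky ReLU slope of 0.01 is a commonly used default and is an assumption here.

```python
import math

def sigmoid(x):
    """Bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Bounded in (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Zero for negative inputs, identity otherwise; unbounded above."""
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    """Like ReLU, but negative inputs are scaled by a small slope."""
    return x if x > 0 else slope * x

for value in (-2.0, 0.0, 3.0):
    print(value, sigmoid(value), tanh(value), relu(value), leaky_relu(value))
```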
In some embodiments, the reward-and-punishment convergence step is: the classification result obtained by the detection network is compared with the information annotated by the doctor, and the network parameters are modified through reward and punishment until the network reaches the optimal convergence effect, at which point the detection training is complete.
In some embodiments, the optimal convergence effect means that the loss on the training set, after oscillating, gradually converges and then remains stable.
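One possible way to test for such a stable loss is sketched below. The embodiments give no numerical criterion, so the window of 5 recent values, the tolerance of 1e-3 and the function name `has_converged` are purely hypothetical.

```python
def has_converged(losses, window=5, tolerance=1e-3):
    """Return True when the last `window` training losses vary by less
    than `tolerance`, i.e. the oscillation has died down."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    return max(recent) - min(recent) < tolerance

oscillating = [1.0, 0.6, 0.7, 0.4, 0.45, 0.3]
settled = [1.0, 0.5, 0.3001, 0.3, 0.3002, 0.3001, 0.3]
print(has_converged(oscillating), has_converged(settled))
```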
In some embodiments, the detection testing step is: a visual field image is input into the trained abnormal cell detection network, and the detection and classification results and the position information of the detection boxes are obtained through convolution, pooling and activation operations; the convolution parameters are 3 × 256, 3 × 512, 3 × 1024; the pooling adopts max pooling; the activation adopts the ReLU function.
Step 2: optimizing the detection result
The result of the suspected diseased cell detection in Step 1 contains a certain proportion of false positives. The detected abnormal cells are therefore re-examined using a DenseNet-based deep learning framework, which distinguishes true positives from false positives and thereby reduces the number of false-positive detections.
In some embodiments, the step of optimizing test results comprises optimizing training and optimizing testing.
In some embodiments, the optimization training includes visual field image feature extraction, prediction, and comparison convergence.
In some embodiments, the visual field image feature extraction step is: a visual field image containing the detection boxes generated in Step 1 is input, and features related to true positives and false positives are extracted from it based on the current training state of the detection network.
In some embodiments, the prediction step is: the extracted features are input into the detection network, and the original pixel information of the image is mapped to a classification result, namely the prediction, through convolution, pooling and activation operations; the convolution parameters are 1 × 256, 1 × 512, 3 × 256, 3 × 512, 3 × 1024; the pooling adopts max pooling and average pooling; the activation adopts the ReLU function.
In some embodiments, the comparison convergence step is: the prediction obtained by the detection network is compared with the result annotated by the doctor, and where they are inconsistent the model automatically modifies the mapping relation, until the network reaches the optimal convergence effect and the optimization training is complete.
In some embodiments, the optimization testing step is: a visual field image containing suspected diseased cells output by the detection network is input into the trained abnormal cell detection network, the corresponding feature information in the visual field image is extracted, and the classification result of the optimized detection is obtained through convolution, pooling and activation operations.
Construction of visual field image judgment model
As shown in fig. 2, in the visual field image judgment model, the detection result of the abnormal cell detection model is input into the detection network as a visual field image, the detection boxes in it are extracted as sub-images, the features of the sub-images are acquired and aggregated, the aggregated features are taken as the features of the whole visual field image, and a fully connected network judges whether the visual field image is abnormal.
In some embodiments, the construction of the visual field image judgment model includes judgment of a single visual field image.
Judgment of a single visual field image
On the basis of the detection model, each visual field image is examined again using the detection boxes already detected, completing the final judgment of the visual field image.
In some embodiments, the judgment of a single visual field image includes judgment training and judgment testing.
In some embodiments, the judgment training includes sub-image feature extraction, comprehensive prediction, and comparison convergence.
In some embodiments, the sub-image feature extraction step is: the detection result of the abnormal cell detection model is input into the detection network as a visual field image, the detection boxes in it are extracted as sub-images, and the features corresponding to lesions in each sub-image are extracted based on the current training state of the detection network.
In some embodiments, the number of sub-images is not less than 5; if fewer than 5 sub-images are available, the sub-image with the highest confidence is duplicated until there are 5.
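The padding rule above can be sketched as follows. Representing each sub-image as a (confidence, crop) pair and returning an empty list when no detection box exists are assumptions; the embodiments do not address the empty case.

```python
def pad_sub_images(sub_images, minimum=5):
    """sub_images: list of (confidence, crop) pairs.

    Returns a list with at least `minimum` entries by repeating the
    highest-confidence sub-image; longer lists are returned unchanged.
    """
    if not sub_images:
        return []  # assumption: nothing to duplicate
    best = max(sub_images, key=lambda item: item[0])
    padded = list(sub_images)
    while len(padded) < minimum:
        padded.append(best)
    return padded

padded = pad_sub_images([(0.9, "crop_a"), (0.6, "crop_b")])
print(padded)
```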
In some embodiments, the comprehensive prediction step is: the features of the sub-images are aggregated into feature information that represents the whole visual field image; after convolution, pooling and activation operations, this feature information is input into a fully connected classification network, so that the original pixel information of the image is mapped to feature information and then to classification information, giving the judgment result for the whole visual field image.
In the present application, full connection means that, in a fully connected neural network, every node of one layer is connected by an edge to every node of the next layer; it is used to integrate the extracted features, and a fully connected layer can integrate the class-discriminative local information from the convolutional or pooling layers.
In some embodiments, the full connection is divided into two layers: the first layer maps 256 nodes to 4096 nodes, and the second layer maps 4096 nodes to 2 nodes.
In some embodiments, the features of the sub-images are aggregated by max pooling.
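A minimal sketch of this aggregation followed by the two fully connected layers is given below. It assumes that max pooling over sub-images means an element-wise maximum over their 256-dimensional feature vectors and that a ReLU sits between the two layers; the random weights stand in for trained parameters, and all function names are illustrative.

```python
import random

def aggregate_max(sub_features):
    """Element-wise maximum over the feature vectors of all sub-images."""
    return [max(values) for values in zip(*sub_features)]

def relu(x):
    return max(0.0, x)

def dense(vector, weights, biases, activation=None):
    """Fully connected layer: every input node feeds every output node."""
    out = [sum(w * v for w, v in zip(row, vector)) + b
           for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

def make_layer(n_in, n_out, rng):
    """Random placeholder weights (assumption: real weights come from training)."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify_field(sub_features, layer1, layer2):
    pooled = aggregate_max(sub_features)              # 256 values
    hidden = dense(pooled, *layer1, activation=relu)  # 256 -> 4096
    return dense(hidden, *layer2)                     # 4096 -> 2 class scores

rng = random.Random(0)
layer1 = make_layer(256, 4096, rng)
layer2 = make_layer(4096, 2, rng)
sub_features = [[rng.random() for _ in range(256)] for _ in range(5)]
logits = classify_field(sub_features, layer1, layer2)
print(logits)
```

The element-wise maximum makes the aggregated feature independent of the order of the sub-images.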
In some embodiments, the comparison convergence step is the same as the comparison convergence step in the abnormal cell detection model.
In some embodiments, the judgment testing step is: a detection result output by the abnormal cell detection model is input as a visual field image into the trained visual field image judgment network, the detection boxes in it are extracted as sub-images, the features of the sub-images are acquired and aggregated, the aggregated features are taken as the features of the whole visual field image, and after convolution, pooling and activation operations they are input into the fully connected classification network to obtain the final judgment of the visual field image.
Construction of sample judgment model
As shown in fig. 3, in the sample judgment model, the 10 visual field images with the highest positive confidence are selected, the integrated feature information of these visual field images obtained from the visual field image judgment model is input into the network, and state integration is performed sequentially from high confidence to low confidence using an RNN-based deep learning framework, completing the diagnosis at the sample level.
In some embodiments, the sample judgment model comprises sample diagnosis based on single visual field images.
Sample diagnosis based on single visual field images
On the basis of the visual field image judgment model, the integrated feature information of the visual field images with the highest confidence is combined sequentially to complete the sample diagnosis.
In some embodiments, the sample diagnosis based on single visual field images includes sample diagnosis training and sample diagnosis testing.
In some embodiments, the sample diagnosis training comprises sequence composition, state integration, and comparison convergence.
In some embodiments, the sequence composition step is: the visual field images are ranked by positive confidence according to the judgment results of the visual field image judgment model, and the top 10 visual field images are selected as one sequence.
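The sequence composition can be sketched as a sort followed by a cut at 10 entries. The pairing of each visual field image with a (positive_confidence, feature_vector) tuple and the name `build_sequence` are illustrative assumptions.

```python
def build_sequence(fields, length=10):
    """fields: list of (positive_confidence, feature_vector) pairs.

    Returns the `length` entries with the highest positive confidence,
    ordered from highest to lowest.
    """
    ranked = sorted(fields, key=lambda item: item[0], reverse=True)
    return ranked[:length]

# 15 hypothetical visual field images with confidences 0.00, 0.05, ..., 0.70.
fields = [(i / 20, [float(i)]) for i in range(15)]
sequence = build_sequence(fields)
print([confidence for confidence, _ in sequence])
```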
In some embodiments, the state integration step is: the features of the whole visual field image obtained in the visual field image judgment model are taken as the representative features of that visual field image; the elements of a sequence are input one at a time; at each position, the output of the previous position is combined with the input of the current position and used as the input of the RNN model at the current position, and the output of the current position is obtained through the convolution, pooling and activation operations of the RNN; this continues until the output of the last position is obtained, which is passed through the full connection to obtain classification information, and the sample judgment result is output; the convolution parameters are 1 × 256, 1 × 512, 3 × 256, 3 × 512, 3 × 1024; the pooling adopts max pooling; the activation adopts the sigmoid function; the full connection is divided into two layers, the first mapping 256 nodes to 1024 nodes and the second mapping 1024 nodes to 2 nodes.
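The recurrence described above, in which the previous output is combined with the current input at each position, can be illustrated with a simplified sketch. It substitutes a plain (Elman-style) recurrent cell with sigmoid activation for the convolution and pooling operations of the embodiment, and uses small hypothetical dimensions (4 input features, 3 hidden nodes) and random placeholder weights instead of trained ones.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rnn_step(x, h_prev, w_x, w_h, b):
    """h_t = sigmoid(W_x x_t + W_h h_{t-1} + b).

    This is equivalent to applying one weight matrix to the
    concatenation of the current input and the previous output.
    """
    return [
        sigmoid(sum(wx * xi for wx, xi in zip(w_x[j], x))
                + sum(wh * hi for wh, hi in zip(w_h[j], h_prev))
                + b[j])
        for j in range(len(b))
    ]

def run_rnn(sequence, w_x, w_h, b):
    """Feed the sequence position by position; return the last output."""
    h = [0.0] * len(b)
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h

rng = random.Random(0)
n_in, n_hidden = 4, 3  # hypothetical sizes for illustration
w_x = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
# 10 visual field feature vectors, ordered from high to low confidence.
sequence = [[rng.random() for _ in range(n_in)] for _ in range(10)]
final_state = run_rnn(sequence, w_x, w_h, b)
print(final_state)
```

In the embodiment, the final state would then pass through the two fully connected layers to yield the two-class sample judgment.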
In some embodiments, the comparison convergence step is the same as the comparison convergence step in the abnormal cell detection model.
In some embodiments, the sample diagnosis testing step is: the integrated features of the 10 visual field images with the highest confidence in the sample are input into the trained sample judgment network; the RNN model processes them position by position, obtaining the output of each position through its convolution, pooling and activation operations until the output of the last position is obtained, which is passed through a fully connected layer to obtain the judgment result for the current sample.
The detection and classification model for pathological squamous epithelial cells comprises an abnormal cell detection model, a visual field image judgment model and a sample judgment model. The three models are interlocked and progress layer by layer: each builds on the detection result of the preceding model and subjects it to optimized detection, detailed re-diagnosis and integrated re-diagnosis. The diagnosis result is thereby checked at multiple levels, which improves its accuracy and yields a complete sample diagnosis method.
Finally, it should be understood that the above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.