CN112561863B - Medical image multi-classification recognition system based on improved ResNet - Google Patents

Medical image multi-classification recognition system based on improved ResNet

Info

Publication number
CN112561863B
CN112561863B (application CN202011406222.5A)
Authority
CN
China
Prior art keywords
convolution
layer
size
kernels
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011406222.5A
Other languages
Chinese (zh)
Other versions
CN112561863A (en)
Inventor
李玲
梁楫坤
崔红花
张海蓉
黄玉兰
姚桂锦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University filed Critical Jilin University
Priority to CN202011406222.5A priority Critical patent/CN112561863B/en
Publication of CN112561863A publication Critical patent/CN112561863A/en
Application granted granted Critical
Publication of CN112561863B publication Critical patent/CN112561863B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0012 - Biomedical image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30101 - Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of medical image processing, and particularly relates to a deep-learning-based system for fine-grained classification and recognition of granulocyte images. A positioning module extracts features from an input granulocyte picture with a Hourglass network model, locates every cell in the picture, crops out each located cell so that a single complete cell remains, and normalizes the size of all cropped cells. A classification module classifies the granulocytes located by the positioning module with a purpose-built deep learning classification model. The system can assist clinicians in completing granulocyte classification, recognition, and counting tasks accurately and efficiently, reduce errors caused by subjectivity, lower the workload of doctors, and support their disease judgments. It effectively handles cell classification under imbalanced data and fine-grained classification among granulocytes, and improves the network's classification and recognition performance.

Description

Medical image multi-classification recognition system based on improved ResNet
Technical Field
The invention belongs to the technical field of medical image processing, and particularly relates to a medical image multi-classification recognition system based on improved ResNet.
Background
There are three broad types of blood cells in the human body: red blood cells, white blood cells, and platelets, granulocytes being a major class of white blood cell. The recognition and classification of granulocytes is an active research area compared with other cell types, because granulocytes are responsible for the body's immunity. Counting granulocytes in the bone marrow provides physicians with valuable information and aids many important diagnoses, such as of leukemia and AIDS. Granulocyte identification and counting are performed manually under a microscope, which is not only time-consuming but also has a high error rate.
At present, the clinical examination of granulocytes relies on manual microscopy. Manual microscopy can reach an accuracy above 95%, but it is inefficient, classification is slow, and accuracy depends on the experience and condition of the examiner. In the field of medical image processing, with the great progress of imaging technology, computer-assisted medical diagnosis is a clear trend: on one hand, the development of imaging technology brings massive medical data; on the other hand, images of blood samples can be generated, and accurate computer assistance helps accelerate the diagnosis of diseases, reduce the workload of doctors, improve working efficiency, and deliver more accurate and efficient diagnostic results.
Deep learning is a newer field of machine learning research whose motivation is to build models that simulate the analytical learning of the human brain. Deep learning is data-driven: it can imitate the visual mechanisms of the human brain to automatically learn abstract features of data at every level, and thus better reflect the essential characteristics of the data. At present, deep learning is applied to lesion classification, segmentation, and recognition in medical images, as well as to brain-function research.
Because manual omissions and imbalance between the numbers of samples of different cell types often arise during data set collection, the classification and recognition performance of deep-learning network models is not ideal, and the accuracy is low for cell classes with small amounts of data.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a medical image multi-classification recognition system based on improved ResNet: a fine-grained granulocyte classification, recognition, and statistics system built on improved ResNet for imbalanced data sets. The system comprises a preprocessing module, a positioning module, and a classification module. The preprocessing module enlarges the capacity of the data set through data enhancement while reducing the background interference caused by uneven staining. The positioning and recognition modules use pre-trained network parameters as the initial values of the learning network, extract features from the images over different channels simultaneously, apply pooled downsampling according to the spatial position of the field of view, and perform feature extraction and feature fusion on the images. By analyzing granulocyte images collected under a microscope, the system assists clinicians in completing granulocyte classification, recognition, and counting tasks accurately and efficiently, reduces errors caused by subjectivity, lowers the workload of doctors, and supports their disease judgments. The system effectively handles cell classification under imbalanced data and fine-grained classification among granulocytes, and improves the network's classification and recognition performance.
A medical image multi-classification recognition system based on improved ResNet comprises a positioning module and a classification module, wherein the positioning module utilizes a Hourglass network model to perform feature extraction on an input granulocyte picture, positions all cells in the granulocyte picture respectively, cuts out the positioned cells, leaves single complete cells, and performs size normalization processing on all the cut cells;
the classification module classifies the granulocytes positioned by the positioning module by adopting the constructed deep learning classification model:
the network structure of the constructed deep learning classification model is as follows:
the first layer is a convolution layer, the number of convolution kernels is 64, and the size of each convolution kernel is 7 x 7; the second layer is a normalization layer and an activation function layer; the third layer is a pooling layer using max pooling with a pooling size of 3 x 3; the fourth layer is a ResNet-Block classification model; and the fifth layer and the sixth layer are TBC-Block classification models;
the ResNet-Block classification model comprises two branches, wherein the first layer of the first branch is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, the second layer is the convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 x 3, the third layer is the convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, the second branch is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, and a BN layer and an activation function layer are added after each convolution layer of each branch;
the TBC-Block modules of the fifth layer and the sixth layer have the same structure and respectively comprise three branches and three fully-connected layers, wherein the first branch is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 x 3, the second branch is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, and a BN layer and an activation function layer are added after each convolution layer of each branch;
and finally, the network ends with three fully-connected layers, followed by an activation function and then a Softmax classifier; the Softmax classifier classifies the cells and outputs the category of each cell.
The Hourglass network module is of a symmetrical structure and comprises four lower convolution layer groups and four upper convolution layer groups;
the first lower convolution layer group comprises a first convolution layer, a second convolution layer and a third convolution layer, wherein the size of each convolution kernel of the first convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the second convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the third convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the second lower convolution layer group comprises the fourth, fifth and sixth convolution layers, wherein the size of each convolution kernel of the fourth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fifth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the sixth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the third lower convolution layer group comprises the seventh, eighth and ninth convolution layers, wherein the size of each convolution kernel of the seventh convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the eighth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the ninth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the fourth lower convolution layer group includes the tenth, eleventh and twelfth convolution layers, wherein the size of each convolution kernel of the tenth convolution layer is 1 x 1, the number of convolution kernels is 256, the size of each convolution kernel of the eleventh convolution layer is 3 x 3, the number of convolution kernels is 128, the size of each convolution kernel of the twelfth convolution layer is 1 x 1, and the number of convolution kernels is 256;
the first upper convolution layer group comprises the thirteenth, fourteenth and fifteenth convolution layers, wherein the size of each convolution kernel of the thirteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fourteenth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the fifteenth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the second upper convolution layer group includes the sixteenth, seventeenth, and eighteenth convolution layers, wherein the size of each convolution kernel of the sixteenth convolution layer is 1 x 1, the number of convolution kernels is 256, the size of each convolution kernel of the seventeenth convolution layer is 3 x 3, the number of convolution kernels is 128, the size of each convolution kernel of the eighteenth convolution layer is 1 x 1, and the number of convolution kernels is 256;
the third upper convolution layer group includes the nineteenth, twentieth, and twenty-first convolution layers, where each convolution kernel of the nineteenth convolution layer has a size of 1 x 1, the number of convolution kernels is 256, each convolution kernel of the twentieth convolution layer has a size of 3 x 3, the number of convolution kernels is 128, each convolution kernel of the twenty-first convolution layer has a size of 1 x 1, and the number of convolution kernels is 256;
the fourth upper convolution layer group includes the twenty-second, twenty-third, and twenty-fourth convolution layers, where each convolution kernel of the twenty-second convolution layer is 1 x 1 in size, the number of convolution kernels is 256, each convolution kernel of the twenty-third convolution layer is 3 x 3 in size, the number of convolution kernels is 128, each convolution kernel of the twenty-fourth convolution layer is 1 x 1 in size, and the number of convolution kernels is 256;
adding a pooling layer after each lower convolution layer group, wherein the size of the pooling layer is 2 x 2, and the step length is 2;
adding an upper sampling layer after each upper convolution layer group;
the training process of the Hourglass network model adopted by the positioning module comprises the following steps:
collecting 2000 granulocyte pictures as a training set, and carrying out normalization operation on the sizes of the collected pictures;
manually marking each cell and the category of each cell in each picture in the training set to obtain a marked training set;
step two, expanding the capacity of the labeled training set by a data enhancement method applied to each image and reducing the background interference caused by uneven staining;
and step three, inputting the labeled training set processed in step two into the Hourglass network model of the positioning module for training, enabling the Hourglass network model to learn the characteristics of the various cells labeled in the training set, and obtaining a trained model when the positioning accuracy of the Hourglass network model on the labeled cells reaches 98%, where positioning accuracy = (number of labeled cells the model locates) / (total number of manually labeled cells) x 100%.
The training process of the constructed deep learning classification model adopted by the classification module comprises the following contents:
inputting the processed labeled training set into the constructed deep learning classification model adopted by the classification module for training, outputting parameters that can identify the morphological characteristic information of the various cells, classifying through the fully-connected layers, and outputting the type of each cell; a trained classification model is obtained when the classification accuracy of the constructed deep learning classification model on the cells labeled in the training set reaches 90%, where classification accuracy = (number of labeled cells assigned their correct type) / (total number of manually labeled cells) x 100%.
The counting module counts different types of cells output by the classification module respectively and generates a cell count classification report.
The invention has the beneficial effects that:
the method combines the traditional image processing algorithm and the target recognition network to carry out fine-grained classification, recognition and statistics on granulocytes under a microscope, adopts key point detection and an Anchor-Free-based image positioning network, greatly reduces the calculated amount of the network, improves the network performance, adopts a novel constructed fine-grained classification framework, and effectively improves the recognition accuracy, the judgment precision and the robustness.
Drawings
FIG. 1 is a schematic diagram of a Hourglass network model in the positioning module according to the present invention;
FIG. 2 is a structural diagram of a deep learning classification model constructed in the classification module of the present invention;
FIG. 3 is a diagram of ResNet-Block in the classifier architecture of the present invention;
FIG. 4 is a schematic diagram of TBC-Block in the classifier structure of the present invention.
Detailed Description
The invention discloses a medical image multi-classification recognition system based on improved ResNet, which comprises two modules: a positioning module and a classification module, wherein:
the positioning module utilizes a Hourglass network model to perform feature extraction on an input granulocyte picture, positions all cells in the granulocyte picture respectively, positions the cells by extracting the central point of a target cell, cuts out the positioned cells, only leaves single complete cells in a visual field, and performs size normalization processing on all the cut cells;
as shown in fig. 1, the Hourglass network module has a symmetrical structure and comprises four lower convolution layer groups and four upper convolution layer groups; and cross-layer connection is carried out in the upper and lower symmetrical convolution layer groups through feature map fusion:
the first lower convolution layer group comprises the first, second and third convolution layers, wherein the size of each convolution kernel of the first convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the second convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the third convolution layer is 1 x 1, the number of the convolution kernels is 256, and the first lower convolution layer group and the fourth upper convolution layer group are connected through a convolution structure identical to the first lower convolution layer group;
the second lower convolution layer group comprises the fourth, fifth and sixth convolution layers, wherein the size of each convolution kernel of the fourth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fifth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the sixth convolution layer is 1 x 1, the number of the convolution kernels is 256, and the second lower convolution layer group and the third upper convolution layer group are connected through a convolution structure identical to the second lower convolution layer group;
the third lower convolution layer group comprises the seventh, eighth and ninth convolution layers, wherein the size of each convolution kernel of the seventh convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the eighth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the ninth convolution layer is 1 x 1, the number of the convolution kernels is 256, and the third lower convolution layer group and the second upper convolution layer group are connected through a convolution structure identical to the third lower convolution layer group;
the fourth lower convolution layer group includes the tenth, eleventh, and twelfth convolution layers, wherein the size of each convolution kernel of the tenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the eleventh convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the twelfth convolution layer is 1 x 1, the number of the convolution kernels is 256, and the fourth lower convolution layer group is connected to the first upper convolution layer group through a convolution structure identical to the fourth lower convolution layer group;
the first upper convolution layer group comprises a thirteenth convolution layer group, a fourteenth convolution layer group and a fifteenth convolution layer group, wherein the size of each convolution kernel of the thirteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fourteenth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the fifteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, and the input features of the fourth lower convolution layer group are subjected to feature fusion;
the second upper convolution layer group comprises sixteenth, seventeenth and eighteenth convolution layers, wherein the size of each convolution kernel of the sixteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the seventeenth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the eighteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, and the input features of the third lower convolution layer group are subjected to feature fusion;
the third upper convolution layer group comprises the nineteenth, twentieth and twenty-first convolution layers, wherein the size of each convolution kernel of the nineteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the twentieth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the twenty-first convolution layer is 1 x 1, the number of the convolution kernels is 256, and feature fusion is carried out on the input features from the second lower convolution layer group;
the fourth upper convolution layer group comprises the twenty-second, twenty-third and twenty-fourth convolution layers, wherein the size of each convolution kernel of the twenty-second convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the twenty-third convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the twenty-fourth convolution layer is 1 x 1, the number of the convolution kernels is 256, and feature fusion is carried out on the input features from the first lower convolution layer group;
a pooling layer is added after each lower convolution layer group; max pooling with a pooling size of 2 x 2 and a stride of 2 downsamples the features input to it, so after each pooling step the output feature map is one half the size of the input feature map;
an upsampling layer is added after each group of the upper convolution layers.
And after each upper convolution layer group, an upsampling layer upsamples the input feature map so that the output feature map is twice the size of the input: the width and the height are both expanded to twice their original values, with the interpolated values obtained by bilinear interpolation.
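To make the symmetric structure above concrete, the following PyTorch sketch assembles four lower (Bottom-Up) groups, four upper (Top-Down) groups, and same-structured skip connections. It is a minimal illustration under stated assumptions, not the patented implementation: a 256-channel input, BN plus Leaky ReLU after every convolution, skip features taken from each lower group's output before pooling, and fusion by element-wise addition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_group():
    # 1 x 1 (256 kernels) -> 3 x 3 (128 kernels) -> 1 x 1 (256 kernels),
    # the bottleneck described for every lower/upper convolution layer group
    return nn.Sequential(
        nn.Conv2d(256, 256, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.01),
        nn.Conv2d(256, 128, 3, padding=1), nn.BatchNorm2d(128), nn.LeakyReLU(0.01),
        nn.Conv2d(128, 256, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.01),
    )

class Hourglass(nn.Module):
    def __init__(self):
        super().__init__()
        self.lower = nn.ModuleList(conv_group() for _ in range(4))
        self.skip = nn.ModuleList(conv_group() for _ in range(4))   # cross-layer connections
        self.upper = nn.ModuleList(conv_group() for _ in range(4))

    def forward(self, x):
        skips = []
        for i in range(4):                       # Bottom-Up: conv group, then 2 x 2 stride-2 pooling
            x = self.lower[i](x)
            skips.append(self.skip[i](x))
            x = F.max_pool2d(x, 2, stride=2)
        for i in range(4):                       # Top-Down: conv group, then 2x bilinear upsampling
            x = self.upper[i](x)
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
            x = x + skips[3 - i]                 # fuse with the symmetric lower group's features
        return x

feat = Hourglass()(torch.randn(1, 256, 64, 64))  # spatial size is restored: (1, 256, 64, 64)
```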
The invention trains the Hourglass network model in a supervised manner and evaluates the effectiveness of the model on the recognition task by means of the intersection over union (IoU).
The Hourglass network is a feature extraction network with an hourglass-shaped structure: each stage reduces the picture from high resolution to low resolution through a Bottom-Up process and restores it from low resolution to high resolution through a Top-Down process. The structure contains multiple pooling and upsampling steps, and the upsampling can combine features at multiple resolutions to better extract the key features of the image.
And in the Bottom-Up process in Hourglass, the convolution layers extract features from the image with convolution kernels of size 3 x 3. Convolution maps each position of the image to a new value through a linear transformation, with the formula

$$y_{i,j} = \sum_{m}\sum_{n} w_{m,n}\, x_{i+m,\, j+n} + b$$

that is, the product of the input patch and the weights plus an offset value (bias): a vector inner product plus an offset. From this point of view, multi-layer convolution performs a layer-by-layer mapping, and the whole structure constitutes one complex function.
The Leaky ReLU is used as the activation function:

$$\mathrm{LeakyReLU}(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases}$$

where α is a small positive slope. The Leaky ReLU is a widely used variant of the ReLU activation function that gives negative inputs a small non-zero output. Because its derivative is never zero, it reduces the occurrence of silent neurons and permits gradient-based learning, solving the problem that ReLU neurons cannot learn once their input enters the negative interval.
In these steps, pooling is adopted for feature downsampling, in max-pooling mode; in a convolutional neural network, the pooling layer performs feature fusion and dimensionality reduction.
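The three operations just described, convolution as a linear map plus bias, Leaky ReLU activation, and max pooling, compose one Bottom-Up step. The sketch below shows them in PyTorch; the 3-to-64 channel counts are chosen only for illustration.

```python
import torch
import torch.nn as nn

step = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),  # y = sum(w * x) + bias at every image position
    nn.LeakyReLU(0.01),                          # small non-zero slope for negative inputs
    nn.MaxPool2d(kernel_size=2, stride=2),       # feature fusion and dimensionality reduction
)

x = torch.randn(1, 3, 512, 512)                  # e.g. one size-normalized 512 x 512 picture
print(step(x).shape)                             # torch.Size([1, 64, 256, 256])
```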
As shown in fig. 2, the classification module classifies the granulocytes located by the location module by using the constructed deep learning classification model:
the network structure of the constructed deep learning classification model is as follows:
the first layer is a convolution layer, the number of convolution kernels is 64, and the size of each convolution kernel is 7 x 7; the second layer is a normalization layer and an activation function layer, where the activation function is the Leaky ReLU; the third layer is a pooling layer using max pooling with a pooling size of 3 x 3; the fourth layer is a ResNet-Block classification model; and the fifth layer and the sixth layer are TBC-Block classification models;
the ResNet-Block classification model comprises two branches, as shown in fig. 3. The first layer of the first branch is a convolution layer with 128 convolution kernels of size 1 x 1; the second layer is a convolution layer with 128 convolution kernels of size 3 x 3; the third layer is a convolution layer with 128 convolution kernels of size 1 x 1. The second branch is a convolution layer with 128 convolution kernels of size 1 x 1. A BN layer and an activation function layer are added after each convolution layer of each branch, and the feature maps of the two branches are fused to give a new feature map Y;
a branch is added to ResNet-Block, the current output is directly transmitted to the next layer of network, the operation of the current layer is skipped, and meanwhile, the gradient of the next layer of network is directly transmitted to the previous layer of network in the backward propagation process, so that the problem of gradient disappearance of the deep layer of network is solved, but the ResNet-Block increases the complexity of the network and makes the training more complex.
The fifth layer and the sixth layer are optimized into TBC-Block classification models by means of the novel Tied Block Convolution module. Tied Block Convolution applies the idea of group convolution to the convolution operation but, unlike group convolution, performs the operation with convolution kernels whose weights are shared across the groups, yielding the final output feature map.
As shown in fig. 4, the TBC-Block modules of the fifth layer and the sixth layer have the same structure; each includes three branches, where the first branch is a convolution layer with 256 convolution kernels of size 3 x 3, the second branch is a convolution layer with 256 convolution kernels of size 3 x 3, and the third layer is a convolution layer with 256 convolution kernels of size 1 x 1 that receives the fused feature maps of the first two branches; a BN layer and an activation function layer are added after each convolution layer of each branch. The third branch fuses the input feature map x with the output of the first two branches to obtain a new feature map Y.
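The sketch below illustrates both ideas in PyTorch: a tied block convolution that applies one shared filter bank to B channel groups (B = 2 is an illustrative choice), and the three-branch TBC-Block built from it. Fusion by addition and the 1 x 1 layer's output width (set to the block's input width so the residual addition type-checks) are assumptions.

```python
import torch
import torch.nn as nn

class TiedBlockConv2d(nn.Module):
    """One filter bank with shared weights, applied to each of B channel groups."""
    def __init__(self, in_ch, out_ch, k, B=2):
        super().__init__()
        assert in_ch % B == 0 and out_ch % B == 0
        self.B = B
        self.conv = nn.Conv2d(in_ch // B, out_ch // B, k, padding=k // 2)

    def forward(self, x):
        n, c, h, w = x.shape
        x = x.reshape(n * self.B, c // self.B, h, w)  # fold groups into the batch axis,
        return self.conv(x).reshape(n, -1, h, w)      # so one conv serves every group

def tbc(cin, cout, k):
    return nn.Sequential(TiedBlockConv2d(cin, cout, k),
                         nn.BatchNorm2d(cout), nn.LeakyReLU(0.01))

class TBCBlock(nn.Module):
    def __init__(self, in_ch=256):
        super().__init__()
        self.branch1 = tbc(in_ch, 256, 3)   # first branch: 3 x 3, 256 kernels
        self.branch2 = tbc(in_ch, 256, 3)   # second branch: 3 x 3, 256 kernels
        self.fuse1x1 = tbc(256, in_ch, 1)   # 1 x 1 layer fed by the fused branches

    def forward(self, x):
        z = self.branch1(x) + self.branch2(x)   # fuse the two 3 x 3 branches
        return x + self.fuse1x1(z)              # third branch: fuse with the input x -> Y

y = TBCBlock(256)(torch.randn(1, 256, 16, 16))  # -> (1, 256, 16, 16)
```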
And finally, the network ends with three fully-connected layers, followed by an activation function and a Softmax classifier; the Softmax classifier classifies the cells according to the extracted feature vector and outputs the category of each cell. With ReLU as the final activation function of the network, the feature vector obtained from the deep convolutional stages and the three fully-connected layers is passed to the SoftMax classifier, which yields the image category.
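Assembled end to end, the six layers plus the fully-connected head look roughly as follows in PyTorch, reusing the ResNetBlock and TBCBlock sketches above. The 1 x 1 channel adapter, the global pooling before the head, the fully-connected widths, and the five-class output are all illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, 7, stride=2, padding=3),   # layer 1: 64 kernels, 7 x 7
    nn.BatchNorm2d(64), nn.LeakyReLU(0.01),     # layer 2: normalization + activation
    nn.MaxPool2d(3, stride=2, padding=1),       # layer 3: 3 x 3 max pooling
    ResNetBlock(64),                            # layer 4 (outputs 128 channels)
    nn.Conv2d(128, 256, 1),                     # assumed adapter between block widths
    TBCBlock(256),                              # layer 5
    TBCBlock(256),                              # layer 6
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # assumed pooling before the FC head
    nn.Linear(256, 128), nn.ReLU(),             # three fully-connected layers, with
    nn.Linear(128, 64), nn.ReLU(),              # ReLU as the final activation function
    nn.Linear(64, 5),                           # e.g. 5 cell categories (assumption)
    nn.Softmax(dim=1),                          # Softmax classifier outputs the category
)

probs = model(torch.randn(1, 3, 128, 128))      # 128 x 128 network input, as in the text
```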
A convolutional neural network has the property of invariance: it can classify an object robustly even when the object appears in different places. CNNs can be invariant to translation, viewpoint, size, and illumination, or to combinations of these.
The data enhancement of the invention uses methods such as random rotation transforms, stretching transforms, addition of Gaussian noise, and random reordering of the image's three RGB channels.
Rotation/reflection transform: the image is randomly rotated by a certain angle, changing the orientation of its content; rotations of 45, 90, and 180 degrees are used;
stretching transform: the input rectangular image is stretched into a square image with side length equal to the image width;
flip transform: the image is flipped along the horizontal direction;
noise perturbation: each RGB pixel of the image is randomly perturbed; here, Gaussian noise is added to the image.
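A torchvision sketch of the augmentation pipeline above; the target size, noise standard deviation, and flip probability are illustrative assumptions.

```python
import torch
import torchvision.transforms as T
from PIL import Image

def add_gaussian_noise(img, std=0.05):
    # perturb every pixel with Gaussian noise, clamped back to the valid range
    return (img + std * torch.randn_like(img)).clamp(0.0, 1.0)

augment = T.Compose([
    T.Resize((512, 512)),                            # stretch the rectangle into a square
    T.RandomChoice([T.RandomRotation((45, 45)),      # rotate by exactly 45,
                    T.RandomRotation((90, 90)),      # 90,
                    T.RandomRotation((180, 180))]),  # or 180 degrees
    T.RandomHorizontalFlip(p=0.5),                   # flip along the horizontal direction
    T.ToTensor(),
    T.Lambda(add_gaussian_noise),                    # noise perturbation
    T.Lambda(lambda t: t[torch.randperm(3)]),        # randomly reorder the R, G, B channels
])

out = augment(Image.new("RGB", (300, 480)))          # tensor of shape (3, 512, 512)
```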
The training process of the Hourglass network model adopted by the positioning module comprises the following steps:
collecting 2000 granulocyte pictures as a training set, and carrying out normalization operation on the sizes of the collected pictures to ensure that the pixel size of each picture is 512 x 512;
manually marking each cell and the category of each cell in each picture in the training set, wherein each image contains 6 granulocytes, to obtain a labeled training set;
expanding the capacity of the labeled training set by a data enhancement method applied to each image while reducing the background interference caused by uneven staining;
firstly, each image in the labeled training set is rotated (for example by 45, 90, and 180 degrees) and stretch-transformed so that the rectangular image becomes a square image with side length equal to the image width; Gaussian noise is then added to the resulting square image, each RGB pixel is randomly perturbed, and the order of the three RGB channels is randomly changed, reducing the background interference caused by uneven staining;
and then the proportion of each cell type in each image of the labeled training set is counted, a cell enhancement weight factor is assigned to each type according to its share of the cells in the image, and the cells of different types in each image are augmented to different degrees so as to balance the classes.
And a data set in PASCAL VOC format is produced, mainly to provide training samples for the target recognition network; unlabeled data in the cell sample images are not taken as samples of the data set.
And step three, inputting the labeling training set processed in the step two into a Hourglass network model of a positioning module for training, enabling the Hourglass network model to learn the characteristics of various cells labeled in the labeling training set, adopting a back propagation algorithm and a random gradient descent method for training the Hourglass network, performing back propagation iteration to update the weight of each layer according to the magnitude of the Loss value of forward propagation, stopping the training model until the value of the model tends to converge, and obtaining the trained model when the positioning accuracy of the Hourglass network model to the cells labeled in the labeling training set is 98%, wherein the positioning accuracy of the Hourglass network model to the cells in the training labeling training set is 100% of the images of all the cells in the labeling training set/the number of all the cells manually labeled in the labeling training set.
The training process of the constructed deep learning classification model adopted by the classification module comprises the following contents:
inputting the processed labeled training set into the constructed deep learning classification model adopted by the classification module for training, outputting parameters that can identify the morphological characteristic information of the various cells, classifying through the fully-connected layers, and outputting the type of each cell; a trained classification model is obtained when the classification accuracy of the constructed deep learning classification model on the cells labeled in the training set reaches 90%, where classification accuracy = (number of labeled cells assigned their correct type) / (total number of manually labeled cells) x 100%.
The positioning module obtains a feature map through the Hourglass network during training from the acquired key points; the feature map is obtained through a Gaussian kernel as follows:

$$Y_{xyc} = \exp\!\left(-\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2}\right)$$

where $\sigma_p$ is the target scale-adaptive standard deviation and $\tilde{p}$ is the key point p mapped to feature-map coordinates. The computed coordinates are dispersed over a heatmap

$$Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$$

where R is the scale of the output size, W is the width of the image, H is the height of the image, and C is the number of object-detection classes. If two Gaussian functions of the same class C overlap, the element-wise maximum is chosen. The objective function, the Focal Loss of a pixel-level logistic regression, is trained as follows:

$$L_k = -\frac{1}{N} \sum_{xyc} \begin{cases} \left(1 - \hat{Y}_{xyc}\right)^{\alpha} \log\left(\hat{Y}_{xyc}\right), & Y_{xyc} = 1 \\ \left(1 - Y_{xyc}\right)^{\beta} \left(\hat{Y}_{xyc}\right)^{\alpha} \log\left(1 - \hat{Y}_{xyc}\right), & \text{otherwise} \end{cases}$$

where α and β are the focal-loss hyper-parameters (α = 2 and β = 4 in the experiments), $\hat{Y}_{xyc}$ is the predicted center-point value, $Y_{xyc} = 1$ denotes a positive sample and any other value a negative sample ($Y_{xyc}$ being the ground-truth center-point value given by the Gaussian model), and N is the number of key points in the image; the factor 1/N in front of the formula normalizes all the focal-loss values. The loss is computed between the predicted values and the ground-truth values.
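The two formulas above translate directly into code. The sketch below renders ground-truth centers as Gaussians with an element-wise maximum and evaluates the pixel-level focal loss with α = 2 and β = 4; the small epsilon for numerical stability is an added assumption.

```python
import torch

def gaussian_heatmap(height, width, centers, sigma):
    # render each key point as a 2-D Gaussian; overlapping Gaussians
    # of the same class keep the element-wise maximum
    ys = torch.arange(height, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(width, dtype=torch.float32).view(1, -1)
    heat = torch.zeros(height, width)
    for cx, cy in centers:
        g = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        heat = torch.maximum(heat, g)
    return heat

def center_focal_loss(pred, gt, alpha=2, beta=4, eps=1e-6):
    pos = gt.eq(1).float()                    # Y_xyc = 1 marks a positive sample
    pos_term = pos * (1 - pred) ** alpha * torch.log(pred + eps)
    neg_term = (1 - pos) * (1 - gt) ** beta * pred ** alpha * torch.log(1 - pred + eps)
    n = pos.sum().clamp(min=1)                # N = number of key points in the image
    return -(pos_term + neg_term).sum() / n

gt = gaussian_heatmap(128, 128, centers=[(30.0, 40.0), (90.0, 70.0)], sigma=3.0)
loss = center_focal_loss(torch.rand(128, 128), gt)
```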
Obtaining the result coordinates of the positions of the cells in the large image, cutting out the positioned cells so that only single complete cells remain in the visual field, and performing size normalization to build a new data set for the classification model;
the positions of the cells are obtained through positioning by the positioning module, the cells in the picture are cut, a new data set with only single cells in the visual field is obtained, and classification interference caused by the similarity of adjacent cell structures is reduced.
Sending the cells into the ResNet + TBC-Block-based target classification network for training; a trained classification network model is obtained when the classification accuracy of the network on the cells labeled in the training set reaches 90%, where classification accuracy = (number of labeled cells the network classifies correctly) / (total number of manually labeled cells) x 100%, yielding the trained deep learning classification model. A new network architecture is designed with the novel Tied Block Convolution module, which improves the network's classification of fine-grained cells; the classification model is evaluated by its mean average precision (mAP). According to the magnitude of the forward-propagation loss value, backpropagation iteratively updates the weights of each layer until the loss tends to converge; when the loss no longer decreases, the network is considered trained.
The model uses a ResNet-based deep learning recognition network as the backbone, with images input to the network at a size of 128 x 128.
The counting module counts different types of cells output by the classification module respectively and generates a cell count classification report.
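The counting step itself is simple; a sketch follows, with the cell-type names purely illustrative.

```python
from collections import Counter

def cell_count_report(predicted_types):
    """Tally the classifier's per-cell outputs and report count and share per type."""
    counts = Counter(predicted_types)
    total = sum(counts.values())
    return {cell_type: {"count": n, "share": n / total} for cell_type, n in counts.items()}

report = cell_count_report(["neutrophil", "eosinophil", "neutrophil", "basophil"])
# {'neutrophil': {'count': 2, 'share': 0.5}, 'eosinophil': {...}, 'basophil': {...}}
```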
In the above steps, pooling is adopted for feature downsampling, and a maximum pooling mode is used.
The classification module is trained with the backpropagation algorithm and stochastic gradient descent: backpropagation iteratively updates the weights of each layer according to the magnitude of the forward-propagation loss value, and training stops when the loss value of the model tends to converge. For the imbalanced data set, the classification module uses Class-Balanced Focal Loss as the model's loss function, which improves the model's ability to handle imbalanced data.
The formula of the Class-Balanced Focal Loss function is as follows:

$$CB_{\mathrm{focal}}(p, y) = \frac{1 - \beta}{1 - \beta^{n_y}}\, L(p, y)$$

wherein $(1 - \beta^{n})/(1 - \beta)$ is the effective number of samples of each class of sample data, n is the number of samples of the class, β ∈ (0, 1) is a hyper-parameter, and L(p, y) is the Focal Loss function, where p is the predicted probability of each class and y is the true class;
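A sketch of the Class-Balanced Focal Loss above in PyTorch; the focal exponent gamma, the value of beta, and the weight renormalization are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def class_balanced_focal_loss(logits, labels, samples_per_class, beta=0.999, gamma=2.0):
    n = torch.as_tensor(samples_per_class, dtype=torch.float32)
    weights = (1.0 - beta) / (1.0 - beta ** n)        # inverse effective number of samples
    weights = weights / weights.sum() * len(n)        # renormalize across classes
    p = F.softmax(logits, dim=1)
    pt = p.gather(1, labels.unsqueeze(1)).squeeze(1)  # probability of each true class y
    focal = (1.0 - pt) ** gamma * torch.log(pt + 1e-6)
    return -(weights[labels] * focal).mean()

loss = class_balanced_focal_loss(torch.randn(4, 5), torch.tensor([0, 2, 1, 4]),
                                 samples_per_class=[500, 120, 80, 40, 10])
```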
The research of the invention has been verified on a data set obtained from clinical cases. The system can classify and count granulocytes quickly and accurately, and the generalization ability of the model is reliable. Manual evaluation of granulocytes today is subjective, loosely standardized, and time-consuming: an experienced doctor needs about ten hours to examine 1000 granulocyte pictures, whereas the system processes 1000 granulocyte pictures in only 180 seconds, greatly reducing the workload of doctors. The system is highly accurate in practical application and reduces errors caused by subjectivity. It can assist, and partly replace, doctors in granulocyte classification and counting, and has good application prospects.

Claims (5)

1. A medical image multi-classification recognition system based on improved ResNet is characterized by comprising a positioning module and a classification module, wherein the positioning module utilizes a Hourglass network model to perform feature extraction on an input granulocyte picture, positions all cells in the granulocyte picture respectively, cuts out the positioned cells, leaves single complete cells, and performs size normalization processing on all the cut cells;
the classification module classifies the granulocytes positioned by the positioning module by adopting the constructed deep learning classification model:
the network structure of the constructed deep learning classification model is as follows:
the first layer is a convolution layer, the number of convolution kernels is 64, and the size of each convolution kernel is 7 x 7; the second layer is a normalization layer and an activation function layer; the third layer is a pooling layer using max pooling with a pooling size of 3 x 3; the fourth layer is a ResNet-Block classification model; and the fifth layer and the sixth layer are TBC-Block classification models;
the ResNet-Block classification model comprises two branches, wherein the first layer of the first branch is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, the second layer is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, the second branch is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, and a BN layer and an activation function layer are added after each convolution layer of each branch;
the TBC-Block modules of the fifth layer and the sixth layer have the same structure and respectively comprise three branches and three fully-connected layers, wherein the first branch is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 x 3, the second branch is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, and a BN layer and an activation function layer are added after each convolution layer of each branch;
and finally, the network ends with three fully-connected layers, followed by an activation function and then a Softmax classifier, where the Softmax classifier classifies the cells and outputs the category of each cell.
2. The improved ResNet-based medical image multi-classification recognition system as claimed in claim 1, wherein the Hourglass network module is a symmetric structure comprising four lower convolution layer groups and four upper convolution layer groups;
the first lower convolution layer group comprises a first convolution layer, a second convolution layer and a third convolution layer, wherein the size of each convolution kernel of the first convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the second convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the third convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the second lower convolution layer group comprises the fourth, fifth and sixth convolution layers, wherein the size of each convolution kernel of the fourth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fifth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the sixth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the third lower convolution layer group comprises the seventh, eighth and ninth convolution layers, wherein the size of each convolution kernel of the seventh convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the eighth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the ninth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the fourth lower convolution layer group includes the tenth, eleventh and twelfth convolution layers, wherein the size of each convolution kernel of the tenth convolution layer is 1 x 1, the number of convolution kernels is 256, the size of each convolution kernel of the eleventh convolution layer is 3 x 3, the number of convolution kernels is 128, the size of each convolution kernel of the twelfth convolution layer is 1 x 1, and the number of convolution kernels is 256;
the first upper convolution layer group comprises the thirteenth, fourteenth and fifteenth convolution layers, wherein the size of each convolution kernel of the thirteenth convolution layer is 1 x 1, the number of the convolution kernels is 256, the size of each convolution kernel of the fourteenth convolution layer is 3 x 3, the number of the convolution kernels is 128, the size of each convolution kernel of the fifteenth convolution layer is 1 x 1, and the number of the convolution kernels is 256;
the second upper convolution layer group comprises sixteenth, seventeenth and eighteenth convolution layers, wherein the size of each convolution kernel of the sixteenth convolution layer is 1 x 1, the number of convolution kernels is 256, the size of each convolution kernel of the seventeenth convolution layer is 3 x 3, the number of convolution kernels is 128, the size of each convolution kernel of the eighteenth convolution layer is 1 x 1, and the number of convolution kernels is 256;
the third upper convolution layer group comprises the nineteenth, twentieth and twenty-first convolution layers, wherein the size of each convolution kernel of the nineteenth convolution layer is 1 x 1, the number of convolution kernels is 256, the size of each convolution kernel of the twentieth convolution layer is 3 x 3, the number of convolution kernels is 128, the size of each convolution kernel of the twenty-first convolution layer is 1 x 1, and the number of convolution kernels is 256;
the fourth upper convolution layer group includes the twenty-second, twenty-third, and twenty-fourth convolution layers, where each convolution kernel of the twenty-second convolution layer is 1 x 1 in size, the number of convolution kernels is 256, each convolution kernel of the twenty-third convolution layer is 3 x 3 in size, the number of convolution kernels is 128, each convolution kernel of the twenty-fourth convolution layer is 1 x 1 in size, and the number of convolution kernels is 256;
adding a pooling layer after each lower convolution layer group, wherein the size of the pooling layer is 2 x 2, and the step length is 2;
an upsampling layer is added after each group of the upper convolution layers.
3. The improved ResNet-based medical image multi-classification recognition system as claimed in claim 2, wherein the training process of the Hourglass network model adopted by the positioning module comprises the following steps:
collecting 2000 granulocyte pictures as a training set, and carrying out normalization operation on the sizes of the collected pictures;
manually marking each cell and the category of each cell in each picture in the training set to obtain a marked training set;
step two, expanding the capacity of the labeled training set by a data enhancement method applied to each image and reducing the background interference caused by uneven staining;
and step three, inputting the labeled training set processed in step two into the Hourglass network model of the positioning module for training, enabling the Hourglass network model to learn the characteristics of the various cells labeled in the training set, and obtaining a trained model when the positioning accuracy of the Hourglass network model on the labeled cells reaches 98%, where positioning accuracy = (number of labeled cells the model locates) / (total number of manually labeled cells) x 100%.
4. The improved ResNet-based medical image multi-class recognition system according to claim 3, wherein the training process of the constructed deep learning classification model adopted by the classification module comprises the following steps:
inputting the processed labeled training set into the constructed deep learning classification model adopted by the classification module for training, outputting parameters that can identify the morphological characteristic information of the various cells, classifying through the fully-connected layers, and outputting the type of each cell; a trained classification model is obtained when the classification accuracy of the constructed deep learning classification model on the cells labeled in the training set reaches 90%, where classification accuracy = (number of labeled cells assigned their correct type) / (total number of manually labeled cells) x 100%.
5. The improved ResNet-based medical image multi-classification recognition system according to claim 4, further comprising a counting module, wherein the counting module counts different types of cells outputted from the classification module respectively, and generates a cell count classification report.
CN202011406222.5A 2020-12-03 2020-12-03 Medical image multi-classification recognition system based on improved ResNet Active CN112561863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011406222.5A CN112561863B (en) 2020-12-03 2020-12-03 Medical image multi-classification recognition system based on improved ResNet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011406222.5A CN112561863B (en) 2020-12-03 2020-12-03 Medical image multi-classification recognition system based on improved ResNet

Publications (2)

Publication Number Publication Date
CN112561863A CN112561863A (en) 2021-03-26
CN112561863B true CN112561863B (en) 2022-06-10

Family

ID=75048229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011406222.5A Active CN112561863B (en) 2020-12-03 2020-12-03 Medical image multi-classification recognition system based on improved ResNet

Country Status (1)

Country Link
CN (1) CN112561863B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408463B (en) * 2021-06-30 2022-05-10 吉林大学 Cell image small sample classification system based on distance measurement
CN114387467B (en) * 2021-12-09 2022-07-29 哈工大(张家口)工业技术研究院 Medical image classification method based on multi-module convolution feature fusion
CN114366047B (en) * 2022-01-27 2023-05-09 上海国民集团健康科技有限公司 Multi-task neural network pulse condition data processing method, system and terminal
CN115393351B (en) * 2022-10-27 2023-01-24 北京大学第三医院(北京大学第三临床医学院) Method and device for judging cornea immune state based on Langerhans cells
CN116012367B (en) * 2023-02-14 2023-09-12 山东省人工智能研究院 Deep learning-based stomach mucosa feature and position identification method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034045A (en) * 2018-07-20 2018-12-18 中南大学 A kind of leucocyte automatic identifying method based on convolutional neural networks
CN111986103A (en) * 2020-07-20 2020-11-24 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and computer storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996009598A1 (en) * 1994-09-20 1996-03-28 Neopath, Inc. Cytological slide scoring apparatus
US5933519A (en) * 1994-09-20 1999-08-03 Neo Path, Inc. Cytological slide scoring apparatus
CN109166100A (en) * 2018-07-24 2019-01-08 中南大学 Multi-task learning method for cell count based on convolutional neural networks
CN110705639A (en) * 2019-09-30 2020-01-17 吉林大学 Medical sperm image recognition system based on deep learning
CN111931764A (en) * 2020-06-30 2020-11-13 华为技术有限公司 Target detection method, target detection framework and related equipment
CN112017161A (en) * 2020-08-06 2020-12-01 杭州深睿博联科技有限公司 Pulmonary nodule detection method and device based on central point regression

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Chuanguang Yang et al.; Gated Convolutional Networks with Hybrid Connectivity for Image Classification; arXiv; 2019-11-28; pp. 1-8 *
Seung-Taek Kim et al.; Lightweight Stacked Hourglass Network for Human Pose Estimation; Applied Sciences; 2020-09-17; pp. 1-15 *
Xudong Wang et al.; Tied Block Convolution: Leaner and Better CNNs with Shared Thinner Filters; arXiv; 2020-09-25; pp. 1-27 *
Wu Fenqi et al.; A deep learning model for automatic recognition of bone marrow erythroid and granulocytic cells; Journal of Jilin University (Information Science Edition); Nov. 2020; Vol. 38, No. 6; pp. 729-736 *

Also Published As

Publication number Publication date
CN112561863A (en) 2021-03-26

Similar Documents

Publication Publication Date Title
CN112561863B (en) Medical image multi-classification recognition system based on improved ResNet
Dong et al. Evaluations of deep convolutional neural networks for automatic identification of malaria infected cells
CN111985536B Gastroscopic pathology image classification method based on weakly supervised learning
Oztel et al. Mitochondria segmentation in electron microscopy volumes using deep convolutional neural network
Pan et al. Mitosis detection techniques in H&E stained breast cancer pathological images: A comprehensive review
Hyeon et al. Diagnosing cervical cell images using pre-trained convolutional neural network as feature extractor
CN113378792B (en) Weak supervision cervical cell image analysis method fusing global and local information
CN112381164B (en) Ultrasound image classification method and device based on multi-branch attention mechanism
CN111860406A (en) Blood cell microscopic image classification method based on regional confusion mechanism neural network
CN111429407A (en) Chest X-ray disease detection device and method based on two-channel separation network
de Oliveira et al. Classification of Normal versus Leukemic Cells with Data Augmentation and Convolutional Neural Networks.
Yonekura et al. Improving the generalization of disease stage classification with deep CNN for glioma histopathological images
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN104933415B (en) A kind of visual remote sensing image cloud sector detection method in real time
Bondre et al. Review on leaf diseases detection using deep learning
Jabbar et al. Diagnosis of malaria infected blood cell digital images using deep convolutional neural networks
CN114972202A (en) Ki67 pathological cell rapid detection and counting method based on lightweight neural network
CN114283326A (en) Underwater target re-identification method combining local perception and high-order feature reconstruction
El Alaoui et al. Deep stacked ensemble for breast cancer diagnosis
Pavithra et al. An Overview of Convolutional Neural Network Architecture and Its Variants in Medical Diagnostics of Cancer and Covid-19
Zhang et al. Blood cell image classification based on image segmentation preprocessing and CapsNet network model
Yan et al. Two and multiple categorization of breast pathological images by transfer learning
Dwivedi et al. EMViT-Net: A novel transformer-based network utilizing CNN and multilayer perceptron for the classification of environmental microorganisms using microscopic images
Liu et al. One-stage attention-based network for image classification and segmentation on optical coherence tomography image
Kong et al. Toward large-scale histopathological image analysis via deep learning

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant