CN114494197A - Cerebrospinal fluid cell identification and classification method for complex small samples - Google Patents
Cerebrospinal fluid cell identification and classification method for complex small samples
- Publication number
- CN114494197A (application CN202210094305.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- training
- model
- cerebrospinal fluid
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 210000001175 cerebrospinal fluid Anatomy 0.000 title claims abstract description 58
- 238000000034 method Methods 0.000 title claims abstract description 50
- 238000012549 training Methods 0.000 claims abstract description 77
- 210000004027 cell Anatomy 0.000 claims abstract description 52
- 239000012535 impurity Substances 0.000 claims abstract description 14
- 238000012360 testing method Methods 0.000 claims abstract description 13
- 238000013508 migration Methods 0.000 claims abstract description 7
- 230000005012 migration Effects 0.000 claims abstract description 7
- 210000004698 lymphocyte Anatomy 0.000 claims abstract description 6
- 210000000440 neutrophil Anatomy 0.000 claims abstract description 6
- 238000001914 filtration Methods 0.000 claims abstract description 5
- 210000005087 mononuclear cell Anatomy 0.000 claims abstract description 3
- 238000001514 detection method Methods 0.000 claims description 18
- 238000013526 transfer learning Methods 0.000 claims description 12
- 238000013135 deep learning Methods 0.000 claims description 10
- 238000013507 mapping Methods 0.000 claims description 10
- 230000001464 adherent effect Effects 0.000 claims description 8
- 230000000877 morphologic effect Effects 0.000 claims description 8
- 238000007781 pre-processing Methods 0.000 claims description 8
- 230000006870 function Effects 0.000 claims description 7
- 230000008569 process Effects 0.000 claims description 7
- 230000011218 segmentation Effects 0.000 claims description 7
- 238000013145 classification model Methods 0.000 claims description 5
- 238000009499 grossing Methods 0.000 claims description 4
- 230000035772 mutation Effects 0.000 claims description 4
- 238000012545 processing Methods 0.000 claims description 4
- 238000000926 separation method Methods 0.000 claims description 4
- 238000012546 transfer Methods 0.000 claims description 4
- 238000013519 translation Methods 0.000 claims description 4
- 239000013598 vector Substances 0.000 claims description 4
- 238000003384 imaging method Methods 0.000 abstract 1
- 238000003745 diagnosis Methods 0.000 description 9
- 238000003748 differential diagnosis Methods 0.000 description 8
- 230000000694 effects Effects 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 230000021164 cell adhesion Effects 0.000 description 5
- 230000002411 adverse Effects 0.000 description 4
- 210000001616 monocyte Anatomy 0.000 description 4
- 230000000191 radiation effect Effects 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000005484 gravity Effects 0.000 description 3
- 238000005070 sampling Methods 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 208000025222 central nervous system infectious disease Diseases 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 210000002540 macrophage Anatomy 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 206010003445 Ascites Diseases 0.000 description 1
- 102000001554 Hemoglobins Human genes 0.000 description 1
- 108010054147 Hemoglobins Proteins 0.000 description 1
- 206010027202 Meningitis bacterial Diseases 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 206010057249 Phagocytosis Diseases 0.000 description 1
- 208000032851 Subarachnoid Hemorrhage Diseases 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 208000022971 Tuberculous meningitis Diseases 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 201000009904 bacterial meningitis Diseases 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000036755 cellular response Effects 0.000 description 1
- 210000003169 central nervous system Anatomy 0.000 description 1
- 208000037976 chronic inflammation Diseases 0.000 description 1
- 230000006020 chronic inflammation Effects 0.000 description 1
- 230000002380 cytological effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 239000012634 fragment Substances 0.000 description 1
- 238000012760 immunocytochemical staining Methods 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 208000001223 meningeal tuberculosis Diseases 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000008782 phagocytosis Effects 0.000 description 1
- 210000004910 pleural fluid Anatomy 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 230000001902 propagating effect Effects 0.000 description 1
- 238000003908 quality control method Methods 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 230000002485 urinary effect Effects 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- 230000009385 viral infection Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a cerebrospinal fluid cell identification and classification method for complex small samples, comprising the following steps. S1: acquire cell images from a slide sample with a microscope stitching imaging platform, the images containing monocytes, lymphocytes and neutrophils. S2: preprocess the resulting image set; because the collected sample may contain background impurities caused by lens contamination or improper handling, and individual cells may overlap and adhere to one another, this step mainly consists of filtering and denoising the image, removing irrelevant factors and separating adhered cells. S3: perform model migration (transfer) training on the small sample set. S4: use the BP algorithm to reversely fine-tune the weights and thresholds of the obtained model. S5: identify the cerebrospinal fluid cell images of the test set with the trained model and further optimize the algorithm. The invention can effectively identify the different types of human cerebrospinal fluid cells.
Description
Technical Field
The invention relates to the technical field of medical cell identification and classification, and in particular to a cerebrospinal fluid cell identification and classification method for complex small samples.
Background
Cerebrospinal fluid (CSF) cytology, which includes the total cell count and cytological classification, is one of the most important tools available to neurologists and provides essential first-hand information about the central nervous system and the range of pathological conditions affecting it. CSF samples must be processed promptly, ideally within 1 hour of collection. Normal CSF is dominated by T lymphocytes, with a small number of monocytes/macrophages and occasional B lymphocytes. A markedly increased cell count dominated by neutrophils is typical of bacterial meningitis, in which case intracellular bacterial evidence should be sought. A cellular background consisting mainly of lymphocytes and monocytes is common in viral infection and chronic inflammation. A mixed cellular response may be seen in tuberculous meningitis. Macrophages that have phagocytosed erythrocytes, or macrophages containing fragments of hemoglobin degradation products, both indicate an older subarachnoid hemorrhage. When abnormal cells suggestive of a tumor are found under the microscope, the judgment must combine clinical information with immunocytochemical staining.
In recent years, metagenomic sequencing has received wide attention and has some value in detecting the pathogens of central nervous system infections, but it still has shortcomings: specimens are easily contaminated, which affects the detection result, and the overall sensitivity of pathogen detection is limited; false-negative and false-positive results occur easily, and sometimes the detection result cannot be interpreted at all; and the cost of detection is high, which limits wide application. Metagenomic sequencing therefore cannot yet replace traditional diagnostic methods.
To date, most clinical laboratories count and classify cerebrospinal fluid cells manually: 100 cells are counted directly under the microscope and sorted by nuclear morphology into mononuclear cells (lymphocytes and monocytes) and polymorphonuclear cells. This procedure is cumbersome, time-consuming and labor-intensive; because operators differ in proficiency and in how strictly they follow the standard, the results are highly subjective, poorly reproducible and prone to large errors; internal and inter-laboratory quality control cannot be performed; and the turnaround time is long. The method therefore cannot meet clinical needs well and is unsuitable for the large-scale clinical workload of modern hospitals. Compared with blood and urine, cerebrospinal fluid samples are small, so the volume sampled for manual counting is small and counting accuracy cannot be guaranteed. These problems could be solved, at least in part, if automated cell detection of cerebrospinal fluid specimens were fully or partially realized. At present there is no dedicated automated analyzer for counting and classifying cerebrospinal fluid cells.
With the development of automated cell detection technology, many researchers have in recent years tried to count and analyze cerebrospinal fluid cells with various cell analyzers (e.g., fully automatic urine sediment analyzers and blood analyzers). Some new models of blood cell analyzers now include body fluid analysis functions, making automated counting and classification of cells in pleural fluid, ascites and similar fluids possible in the laboratory. However, because cerebrospinal fluid samples are small, the measurement principles and internal designs of these instruments limit their application to cerebrospinal fluid specimens. In addition, from the moment the cerebrospinal fluid cells are sampled, the sample slide contains a certain amount of bacterial impurities and adhered cells, which strongly affects the identification and classification of cerebrospinal fluid cells.
Automatic cerebrospinal fluid cell identification based on deep learning can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse effect of subjective factors on microscopy results, assist doctors in counting and classifying cells, and greatly improve the diagnosis rate. It can also be integrated with the resources of higher-level medical institutions, so that the overall diagnostic approach becomes standardized and uniform, the reach of high-quality medical resources into primary-level medical institutions is greatly extended, and the differential diagnosis level of primary-level hospitals is raised. Building a deep-learning-based automatic cerebrospinal fluid cell identification system is therefore of great significance for improving the diagnosis rate of central nervous system infections and alleviating problems such as regional disparities in medical care and misdiagnosis by junior and primary-level physicians, so that the vast majority of patients ultimately benefit.
Disclosure of Invention
1. Technical problem to be solved
The invention aims to solve the following problems in the prior art: because of the particularity of cerebrospinal fluid, the sample volume is small and the measurement principles and internal designs of existing instruments limit their application to cerebrospinal fluid samples; in addition, from the moment the cerebrospinal fluid cells are sampled, the sample slide contains a certain amount of bacterial impurities and adhered cells, which strongly affects the identification and classification of cerebrospinal fluid cells.
2. Technical scheme
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for identifying and classifying cerebrospinal fluid cells in complex small samples comprises the following steps:
s1: acquiring an image of a sample slide by using an automatic scanning platform of a microscope to obtain a complete image set of a cerebrospinal fluid cell slide with a plurality of cells;
s2: preprocessing the obtained image set, filtering and denoising the image, removing irrelevant factors in the image, separating and processing mutually adhered cells, and forming a training set and a test set in batches by using the obtained sample image set;
s3: carrying out model transfer training on the small sample set, and carrying out transfer learning on the small sample data set by using a deep learning network trained in the similar field;
s4: carrying out reverse fine adjustment on the weight and the threshold value of the trained model by using a BP algorithm, and further optimizing the model;
s5: and inputting the test set into the model, and outputting a result, namely a cerebrospinal fluid cell identification result.
Preferably, in S1 the cerebrospinal fluid cytology slide is placed on a motorized translation stage of a microscope; a software system is used to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record its size; the software platform then acquires and stitches the images to obtain a complete picture of the cytology specimen; this step is repeated for subsequent slide image acquisition.
Preferably, the specific steps of preprocessing the image set in S2 are as follows:
step 1: to remove irrelevant impurities from the sample background, the image is first subjected to background separation: a binary image is obtained with the maximum between-class variance method (Otsu's method); the contour of the target in the binary image is smoothed with a morphological opening operation, which also removes non-target impurities from the background; finally, the contour edge information of the target is obtained with the Canny edge detection algorithm;
step 2: adhered cells are split by concave point detection; a concave point of adhered cells is the point of maximum local curvature inside the concave region formed where two or more near-circular objects overlap and adhere to each other; a single near-circular object shows no abrupt change of curvature, so such concave points appear only when two or more cells are adhered;
and step 3: ellipse fitting; to recover the contour boundary lost where targets adhere, the algorithm uses the prior knowledge that the targets are generally near-circular and completes the segmentation of the adhered cells with a least-squares ellipse fitting method.
Preferably, in S3 a multi-layer ResNet model trained in another field is adopted; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classification types; then, in a pre-training migration mode, the current parameters of the multi-layer ResNet are used as the initial parameters of the invention, and the network is trained with the image data processed in S2. The specific process is as follows:
1) the number of nodes of the first network layer, i.e. the number of input-layer nodes, is determined by the dimension of the input data;
2) the data are fed into the residual network units and the network is trained layer by layer with the training data; owing to the characteristic of the ResNet network, namely the identity mapping of its residual units, the output of each module is the current input plus the residual;
3) because an already-trained ResNet network is used for network transfer learning, its trained parameters serve as the initial training parameters of the model, which saves part of the training time and of the training samples and makes the model well suited to training and learning on small samples.
Preferably, the specific steps of optimizing the model in S4 are:
1) after training is finished, labeled data are added at the topmost layer of the ResNet and the model is trained in a supervised manner, i.e. the relevant parameters of the network are fine-tuned with the back-propagation (BP) algorithm;
2) the labeled data of each class are fed to the topmost layer of the ResNet and the weights and thresholds of the ResNet are fine-tuned with the BP algorithm; this supervised training further reduces the training error and improves the accuracy of the transfer-learning identification model.
Preferably, in S5 the test set data are input into the trained classification model; after the multi-layer ResNet mapping, the number of nodes in the output layer equals the number of recognition states, and the input vector activates the corresponding class node in the output layer.
Preferably, among the class nodes of S5, monocytes correspond to node 0, lymphocytes to node 1 and neutrophils to node 2.
3. Advantageous effects
Compared with the prior art, the invention has the advantages that:
(1) The invention effectively solves the difficulty of feature extraction caused by background impurities in the acquired images and is therefore well suited to cerebrospinal fluid cell identification against a complex background. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adhered cells, the invention also uses the near-circularity of the cells to predict their center points for segmentation, which broadens the conditions under which the invention can be applied.
(2) According to the invention, the cerebrospinal fluid cell automatic identification technology based on deep learning can help neurologists to quickly establish a more scientific differential diagnosis model, reduce adverse effects of the doctors on microscopic examination results due to subjective factors, help the doctors to count and classify cells, and greatly improve the diagnosis rate; and the system can be mutually fused with high-level medical institution resources, so that the overall diagnosis mode tends to be standard and uniform, the radiation effect of high-quality medical resources to basic-level medical institutions is greatly improved, and the differential diagnosis level of basic-level hospitals is improved.
Drawings
FIG. 1 is a block diagram of the technical process of the cerebrospinal fluid cell identification and classification method for complex small samples according to the present invention;
FIG. 2 is a schematic diagram of concave points in the cerebrospinal fluid cell identification and classification method for complex small samples according to the present invention;
FIG. 3 is a ResNet model for transfer learning in the present invention;
fig. 4 is an example of transfer learning in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Example 1:
referring to fig. 1, a method for identifying and classifying cerebrospinal fluid cells in complex small samples comprises the following steps:
s1: acquiring an image of a sample slide by using an automatic scanning platform of a microscope to obtain a complete image set of a cerebrospinal fluid cell slide with a plurality of cells;
the cerebrospinal fluid cell slide is placed on a motorized translation stage of a microscope; a software system is used to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record its size; the software platform then acquires and stitches the images to obtain a complete picture of the cell specimen, and this step is repeated for subsequent slide image acquisition (an illustrative stitching sketch is given below);
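As an illustration only, the following Python sketch shows how individual fields of view acquired while scanning the slide could be stitched into one complete specimen picture with OpenCV; the tile file layout and the helper name load_tiles are assumptions, not part of the patented platform.

```python
import cv2
import glob

def load_tiles(pattern="slide_tiles/*.png"):
    """Load the field-of-view images acquired while scanning the slide (assumed file layout)."""
    return [cv2.imread(p) for p in sorted(glob.glob(pattern))]

def stitch_tiles(tiles):
    """Stitch overlapping tiles into a single specimen picture with OpenCV's stitcher."""
    stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)  # SCANS mode suits flat, scanned scenes
    status, panorama = stitcher.stitch(tiles)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama

if __name__ == "__main__":
    full_image = stitch_tiles(load_tiles())
    cv2.imwrite("specimen_full.png", full_image)
```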
s2: preprocessing the obtained image set, filtering and denoising the image, removing irrelevant factors in the image, separating and processing mutually adhered cells, and forming a training set and a test set in batches by using the obtained sample image set;
the specific steps of preprocessing the image set are as follows (an illustrative code sketch follows step 3):
step 1: to remove irrelevant impurities from the sample background, the image is first subjected to background separation: a binary image is obtained with the maximum between-class variance method (Otsu's method); the contour of the target in the binary image is smoothed with a morphological opening operation, which also removes non-target impurities from the background; finally, the contour edge information of the target is obtained with the Canny edge detection algorithm;
step 2: adhered cells are split by concave point detection; a concave point of adhered cells is the point of maximum local curvature inside the concave region formed where two or more near-circular objects overlap and adhere to each other; a single near-circular object shows no abrupt change of curvature, so such concave points appear only when two or more cells are adhered;
and step 3: ellipse fitting; to recover the contour boundary lost where targets adhere, the algorithm uses the prior knowledge that the targets are generally near-circular and completes the segmentation of the adhered cells with a least-squares ellipse fitting method;
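A minimal Python/OpenCV sketch of step 1 above, given purely for illustration; the median-blur denoising, kernel size and Canny thresholds are assumed values, not prescribed by the invention.

```python
import cv2

def preprocess(gray):
    """Background separation as in step 1: Otsu binarization, morphological opening, Canny edges."""
    gray = cv2.medianBlur(gray, 3)                                       # filtering / denoising
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)   # maximum between-class variance
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)            # smooth contours, drop small impurities
    edges = cv2.Canny(opened, 50, 150)                                   # contour edge information of the targets
    return opened, edges

if __name__ == "__main__":
    img = cv2.imread("csf_field.png", cv2.IMREAD_GRAYSCALE)
    mask, edges = preprocess(img)
```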
s3: carrying out model transfer training on the small sample set, and carrying out transfer learning on the small sample data set by using a deep learning network trained in the similar field;
A multi-layer ResNet model trained in another field is adopted; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classification types; then, in a pre-training migration mode, the current parameters of the multi-layer ResNet are used as the initial parameters of the invention, and the network is trained with the image data processed in S2 (a minimal transfer-learning setup sketch follows the steps below). The specific process is as follows:
1) the number of nodes of the first network layer, i.e. the number of input-layer nodes, is determined by the dimension of the input data;
2) the data are fed into the residual network units and the network is trained layer by layer with the training data; owing to the characteristic of the ResNet network, namely the identity mapping of its residual units, the output of each module is the current input plus the residual;
3) because an already-trained ResNet network is used for network transfer learning, its trained parameters serve as the initial training parameters of the model, which saves part of the training time and of the training samples and makes the model well suited to training and learning on small samples;
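Purely as an illustration, a PyTorch sketch of the transfer setup described in these steps; the choice of torchvision's ImageNet-pretrained ResNet-18 and the decision to freeze the backbone are assumptions, since the invention only specifies a multi-layer ResNet trained in another field.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(num_classes=3, freeze_backbone=True):
    """Take a ResNet pretrained elsewhere, keep everything before the fully connected layer,
    and attach a new 3-node output head for monocyte / lymphocyte / neutrophil."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained weights as initial parameters
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)           # new output part with three nodes
    return model

model = build_transfer_model()
optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)
```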
s4: carrying out reverse fine adjustment on the weight and the threshold value of the trained model by using a BP algorithm, and further optimizing the model;
the specific steps of optimizing the model are as follows:
1) after training is finished, labeled data are added at the topmost layer of the ResNet and the model is trained in a supervised manner, i.e. the relevant parameters of the network are fine-tuned with the back-propagation (BP) algorithm;
2) the labeled data of each class are fed to the topmost layer of the ResNet and the weights and thresholds of the ResNet are fine-tuned with the BP algorithm; this supervised training further reduces the training error and improves the accuracy of the transfer-learning identification model.
S5: and inputting the test set into the model, and outputting a result, namely a cerebrospinal fluid cell identification result.
The test set data are input into the trained classification model; after the multi-layer ResNet mapping, the number of nodes in the output layer equals the number of recognition states, and the input vector activates the corresponding class node in the output layer; among the class nodes, monocytes correspond to node 0, lymphocytes to node 1 and neutrophils to node 2.
The invention effectively solves the difficulty of feature extraction caused by background impurities in the acquired images and is therefore well suited to cerebrospinal fluid cell identification against a complex background. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adhered cells, the invention also uses the near-circularity of the cells to predict their center points for segmentation, which broadens the conditions under which the invention can be applied.
The deep-learning-based automatic cerebrospinal fluid cell identification technology of the invention can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse effect of subjective factors on microscopy results, assist doctors in counting and classifying cells, and greatly improve the diagnosis rate. It can also be integrated with the resources of higher-level medical institutions, so that the overall diagnostic approach becomes standardized and uniform, the reach of high-quality medical resources into primary-level medical institutions is greatly extended, and the differential diagnosis level of primary-level hospitals is raised.
Example 2:
referring to figs. 1-4, a method for identifying and classifying cerebrospinal fluid cells in complex small samples comprises the following steps:
s1: acquiring an image of a sample slide by using an automatic scanning platform of a microscope to obtain a complete image set of a cerebrospinal fluid cell slide with a plurality of cells;
the cerebrospinal fluid cell slide is placed on a motorized translation stage of a microscope; a software system is used to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record its size; the software platform then acquires and stitches the images to obtain a complete picture of the cell specimen, and this step is repeated for subsequent slide image acquisition;
s2: preprocessing the obtained image set, filtering and denoising the image, removing irrelevant factors in the image, separating and processing mutually adhered cells, and forming a training set and a test set in batches by using the obtained sample image set;
the specific steps of preprocessing the image set are as follows:
step 1: to remove irrelevant impurities from the sample background, the image is first subjected to background separation: a binary image is obtained with the maximum between-class variance method (Otsu's method); the contour of the target in the binary image is smoothed with a morphological opening operation, which also removes non-target impurities from the background; finally, the contour edge information of the target is obtained with the Canny edge detection algorithm;
step 2: adhered cells are split by concave point detection; a concave point of adhered cells is the point of maximum local curvature inside the concave region formed where two or more near-circular objects overlap and adhere to each other; a single near-circular object shows no abrupt change of curvature, so such concave points appear only when two or more cells are adhered;
1) concave point detection:
First, corner detection is performed on the target contour with an improved Curvature Scale Space (CSS) algorithm. The improved CSS algorithm keeps all true corner points at a relatively low scale and then compares the curvature of every candidate corner with an adaptive local threshold to remove redundant corners. The adaptive local threshold of a candidate corner is usually determined from the curvatures of its neighborhood (region of support, ROS); candidate corners whose absolute curvature is below the local threshold are eliminated. Among the candidate corners, some points are detected as local curvature maxima yet differ only very slightly from their neighbors within the region of support, so a suitable region must be chosen when selecting the support area.
The adaptive local threshold is set as
T(p) = C·k̄ = C · (1/(R1 + R2 + 1)) · Σ_{i=p−R1}^{p+R2} |k(i)| (1)
where k̄ is the mean curvature over the neighborhood region, p is the position of the candidate corner, k(i) is the curvature at contour point i, R1 and R2 define the size of the support region on either side of p, and C is a coefficient;
2) grouping contour segments:
The contour of the occluded (adhered) region is divided into several contour segments by the concave points obtained in 1). Since not every contour segment corresponds to a single object, several contour segments may belong to the same object and must therefore be grouped. For a contour segment s_i and another contour segment s_j within a certain neighborhood of it, s_i and s_j are put into the same group if they satisfy the grouping conditions, which consist of the following three constraints:
condition 1: if the Average Distance Deviation (ADD) of the ellipse fitted after the segments are merged into one group is smaller than the average distance deviation of the ellipse fitted to either contour segment alone before merging, the contour segments are put into the same group.
condition 2: if the center of gravity of the ellipse fitted after the segments are merged is close to the centers of gravity of the ellipses fitted to each contour segment separately, the segments can be put into one group.
condition 3: if the centers of gravity of the ellipses fitted separately to any two contour segments s_i and s_j are close to each other, the two segments can be put into one group;
and step 3: ellipse fitting; to recover the contour boundary lost where targets adhere, the algorithm uses the prior knowledge that the targets are generally near-circular and completes the segmentation of the adhered cells with a least-squares ellipse fitting method (an illustrative splitting sketch is given below);
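For illustration only, a simplified Python/OpenCV sketch of the concave-point splitting and least-squares ellipse fitting described in steps 2 and 3; using convexity defects in place of the improved CSS corner detector, and the defect-depth threshold, are simplifying assumptions.

```python
import cv2
import numpy as np

def split_adhered_cells(mask, min_defect_depth=5.0):
    """Split adhered, near-circular cells: find concave points on each contour,
    cut the contour at those points, and fit a least-squares ellipse to each segment."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    ellipses = []
    for cnt in contours:
        hull = cv2.convexHull(cnt, returnPoints=False)
        defects = cv2.convexityDefects(cnt, hull)
        # indices of concave points (deep convexity defects stand in for curvature extrema)
        concave_idx = sorted(d[0][2] for d in defects
                             if d[0][3] / 256.0 > min_defect_depth) if defects is not None else []
        if len(concave_idx) < 2:                       # single cell: fit one ellipse directly
            if len(cnt) >= 5:
                ellipses.append(cv2.fitEllipse(cnt))
            continue
        # cut the contour into segments between consecutive concave points (wrap around the end)
        concave_idx.append(concave_idx[0] + len(cnt))
        for a, b in zip(concave_idx[:-1], concave_idx[1:]):
            seg = np.take(cnt, range(a, b), axis=0, mode='wrap')
            if len(seg) >= 5:                          # cv2.fitEllipse needs at least 5 points
                ellipses.append(cv2.fitEllipse(seg))   # least-squares ellipse fit
    return ellipses
```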
s3: carrying out model transfer training on the small sample set, and carrying out transfer learning on the small sample data set by using a deep learning network trained in the similar field;
A multi-layer ResNet model trained in another field is adopted; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classification types; then, in a pre-training migration mode, the current parameters of the multi-layer ResNet are used as the initial parameters of the invention, and the network is trained with the image data processed in S2 (step 2 above). An example of transfer learning is shown in fig. 4. The specific implementation is as follows:
ResNet is composed of a number of convolution modules connected in series, each containing a convolution layer and a pooling layer; fig. 3 shows a ResNet module. During training, the target mapping of a unit (i.e. the optimal solution to be approached) is assumed to be F(x) + x, and the module output is y + x, so the goal of training becomes making y approach F(x). In other words, the identical part x is removed from before and after the mapping so that small variations (the residual) are highlighted.
Expressed mathematically:
y = F(x, {W_i}) + W_s·x (2)
where x is the input of the residual unit, y is its output, F(x, {W_i}) is the residual mapping to be learned, {W_i} are the convolution layers inside the residual unit, and W_s is a convolution with a 1 × 1 kernel whose role is to reduce or raise the dimension of x so that it matches the size of the output y (since a summation is required).
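As a minimal sketch of equation (2), and not the exact architecture of the invention, a PyTorch residual block with an optional 1 × 1 projection shortcut W_s:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = F(x, {W_i}) + W_s x -- two 3x3 convolutions as F, optional 1x1 projection as W_s."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.f = nn.Sequential(                       # residual mapping F(x, {W_i})
            nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False),
            nn.BatchNorm2d(out_ch))
        # W_s: 1x1 convolution that matches the dimensions of x to those of y when needed
        self.shortcut = (nn.Identity() if stride == 1 and in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, 1, stride, bias=False))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.f(x) + self.shortcut(x))   # residual plus (projected) input

block = ResidualBlock(64, 128, stride=2)
y = block(torch.randn(1, 64, 56, 56))                     # -> shape (1, 128, 28, 28)
```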
The specific process is as follows:
1) the number of nodes of the first network layer, i.e. the number of input-layer nodes, is determined by the dimension of the input data;
2) the data are fed into the residual network units and the network is trained layer by layer with the training data; owing to the characteristic of the ResNet network, namely the identity mapping of its residual units, the output of each module is the current input plus the residual.
3) In the invention, the amount of training data collected does not reach the hundreds of thousands of training samples usually required for deep learning; however, because an already-trained ResNet network is used for network migration and its trained parameters serve as the initial training parameters of the model, part of the training time and of the training samples is saved, and the model is well suited to training and learning on small samples;
s4: carrying out reverse fine adjustment on the weight and the threshold value of the trained model by using a BP algorithm, and further optimizing the model;
the specific implementation measures are as follows:
(1) Model pre-training: the migrated weights are used as the initial weights of the new network, and their values are updated by the gradient descent algorithm during training (a minimal fine-tuning loop sketch follows the steps below).
Gradient descent algorithm:
1) loop over the training set, from the first training sample to the last:
① calculate the gradient of the loss function with respect to the weights w and the biases b for the i-th training sample; in this way a weight gradient and a bias gradient are eventually obtained for every training sample;
② calculate the sum of the weight gradients of all training samples;
③ calculate the sum of the bias gradients of all training samples.
2) after the above calculations have been completed, perform the following:
① using the sums obtained in ② and ③ above, calculate the average weight gradient and the average bias gradient over all samples;
② update the weights and the biases with the averaged gradients:
w ← w − η·ḡ_w (3)
b ← b − η·ḡ_b (4)
where η is the learning rate and ḡ_w, ḡ_b are the averaged gradients of the weights and biases.
The above process is repeated until the loss function converges and no longer changes.
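An illustrative PyTorch sketch of this pre-training/fine-tuning loop (gradient descent driven by back-propagation); the dataset object, batch size, learning rate and loss function are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def finetune(model, train_set, epochs=20, lr=1e-3):
    """Supervised fine-tuning: forward pass, cross-entropy loss, back-propagation,
    gradient-descent update, repeated until the loss stops improving."""
    loader = DataLoader(train_set, batch_size=32, shuffle=True)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):
        running = 0.0
        for images, labels in loader:          # labels: 0 monocyte, 1 lymphocyte, 2 neutrophil
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()                    # back-propagate the output error (BP algorithm)
            optimizer.step()                   # gradient-descent weight/bias update
            running += loss.item()
        print(f"epoch {epoch}: mean loss {running / len(loader):.4f}")
    return model
```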
(2) Backward fine-tuning: the ResNet network is trained in a supervised manner to reduce the training error and improve the accuracy of the classification model; the BP algorithm proceeds as follows:
1) inputting a training set;
2) for each sample x in the training set, set the corresponding activation a^1 of the input layer;
forward propagation: for each layer l = 2, 3, …, L compute z^l = w^l·a^(l−1) + b^l and a^l = σ(z^l) (5)
3) because the network output differs from the actual (target) result, calculate the error produced at the output layer:
δ^L = ∇_a C ⊙ σ′(z^L) (6)
4) back-propagate the error obtained in the previous step from the output layer to the hidden layers:
δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ′(z^l) (7)
5) use gradient descent to update the training parameters and iterate continuously until convergence (an illustrative back-propagation sketch is given below).
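A minimal NumPy sketch of equations (5)-(7) for a small fully connected network, included only to make the notation concrete; the layer sizes and the quadratic cost are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(weights, biases, x, y):
    """One sample: forward pass (5), then delta^L = grad_a C ⊙ sigma'(z^L)  (6)
    and delta^l = ((w^{l+1})^T delta^{l+1}) ⊙ sigma'(z^l)  (7); quadratic cost assumed."""
    a, activations, zs = x, [x], []
    for w, b in zip(weights, biases):                            # forward propagation, eq. (5)
        z = w @ a + b
        zs.append(z)
        a = sigmoid(z)
        activations.append(a)
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])        # output-layer error, eq. (6)
    grads_w = [delta @ activations[-2].T]
    grads_b = [delta]
    for l in range(2, len(weights) + 1):                         # back-propagate layer by layer, eq. (7)
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        grads_w.insert(0, delta @ activations[-l - 1].T)
        grads_b.insert(0, delta)
    return grads_w, grads_b

# toy usage: a 4-input, 5-hidden, 3-output network
rng = np.random.default_rng(0)
weights = [rng.standard_normal((5, 4)), rng.standard_normal((3, 5))]
biases = [rng.standard_normal((5, 1)), rng.standard_normal((3, 1))]
gw, gb = backprop(weights, biases, rng.standard_normal((4, 1)), np.array([[1.], [0.], [0.]]))
```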
s5: and inputting the test set into the model, and outputting a result, namely a cerebrospinal fluid cell identification result.
The test set data are input into the trained classification model; after the multi-layer ResNet mapping, the number of nodes in the output layer equals the number of recognition states, and the input vector activates the corresponding class node in the output layer; among the class nodes, monocytes correspond to node 0, lymphocytes to node 1 and neutrophils to node 2 (an inference sketch is given below).
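For illustration, a short PyTorch inference sketch of this node mapping; the resizing transform and file handling are assumptions.

```python
import torch
from torchvision import transforms
from PIL import Image

CLASS_NODES = {0: "monocyte", 1: "lymphocyte", 2: "neutrophil"}

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def classify_cell(model, image_path):
    """Run one cell image through the trained model and report the activated output node."""
    model.eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    node = int(torch.argmax(logits, dim=1))      # index of the activated class node
    return node, CLASS_NODES[node]

# example: node, name = classify_cell(model, "cell_0001.png")
```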
The invention effectively solves the difficulty of feature extraction caused by background impurities in the acquired images and is therefore well suited to cerebrospinal fluid cell identification against a complex background. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adhered cells, the invention also uses the near-circularity of the cells to predict their center points for segmentation, which broadens the conditions under which the invention can be applied.
The deep-learning-based automatic cerebrospinal fluid cell identification technology of the invention can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse effect of subjective factors on microscopy results, assist doctors in counting and classifying cells, and greatly improve the diagnosis rate. It can also be integrated with the resources of higher-level medical institutions, so that the overall diagnostic approach becomes standardized and uniform, the reach of high-quality medical resources into primary-level medical institutions is greatly extended, and the differential diagnosis level of primary-level hospitals is raised.
The above description covers only preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereto; any equivalent substitution or modification of the technical solutions and inventive concept of the present invention that a person skilled in the art could readily conceive of within the technical scope disclosed herein shall fall within the protection scope of the present invention.
Claims (7)
1. A method for identifying and classifying cerebrospinal fluid cells in complex small samples, characterized by comprising the following steps:
s1: acquiring an image of a sample slide by using an automatic scanning platform of a microscope to obtain a complete image set of a cerebrospinal fluid cell slide with a plurality of cells;
s2: preprocessing the obtained image set, filtering and denoising the image, removing irrelevant factors in the image, separating and processing mutually adhered cells, and forming a training set and a test set in batches by using the obtained sample image set;
s3: carrying out model transfer training on the small sample set, and carrying out transfer learning on the small sample data set by using a deep learning network trained in the similar field;
s4: carrying out reverse fine adjustment on the weight and the threshold value of the trained model by using a BP algorithm, and further optimizing the model;
s5: and inputting the test set into the model, and outputting a result, namely a cerebrospinal fluid cell identification result.
2. The method for identifying and classifying cerebrospinal fluid cells in complex small samples as claimed in claim 1, wherein in S1 the cerebrospinal fluid cell slide is placed on a motorized translation stage of a microscope; a software system is used to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record its size; the software platform then acquires and stitches the images to obtain a complete picture of the cell specimen; and this step is repeated for subsequent slide image acquisition.
3. The method for identifying and classifying cerebrospinal fluid cells in complex small samples according to claim 1, wherein the preprocessing of the image set in S2 comprises the following steps:
step 1: to remove irrelevant impurities from the sample background, the image is first subjected to background separation: a binary image is obtained with the maximum between-class variance method (Otsu's method); the contour of the target in the binary image is smoothed with a morphological opening operation, which also removes non-target impurities from the background; finally, the contour edge information of the target is obtained with the Canny edge detection algorithm;
step 2: adhered cells are split by concave point detection; a concave point of adhered cells is the point of maximum local curvature inside the concave region formed where two or more near-circular objects overlap and adhere to each other; a single near-circular object shows no abrupt change of curvature, so such concave points appear only when two or more cells are adhered;
and step 3: ellipse fitting; to recover the contour boundary lost where targets adhere, the algorithm uses the prior knowledge that the targets are generally near-circular and completes the segmentation of the adhered cells with a least-squares ellipse fitting method.
4. The method for identifying and classifying cerebrospinal fluid cells in complex small samples according to claim 1, wherein in S3 a multi-layer ResNet model trained in another field is adopted; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classification types; then, in a pre-training migration mode, the current parameters of the multi-layer ResNet are used as the initial parameters of the invention, and the network is trained with the image data processed in S2, the specific process being as follows:
1) the number of nodes of the first network layer, i.e. the number of input-layer nodes, is determined by the dimension of the input data;
2) the data are fed into the residual network units and the network is trained layer by layer with the training data; owing to the characteristic of the ResNet network, namely the identity mapping of its residual units, the output of each module is the current input plus the residual;
3) because an already-trained ResNet network is used for network transfer learning, its trained parameters serve as the initial training parameters of the model, which saves part of the training time and of the training samples and makes the model well suited to training and learning on small samples.
5. The method for identifying and classifying cerebrospinal fluid cells in complex small samples as claimed in claim 1, wherein the step of optimizing the model in S4 comprises:
1) after training is finished, labeled data are added at the topmost layer of the ResNet and the model is trained in a supervised manner, i.e. the relevant parameters of the network are fine-tuned with the back-propagation (BP) algorithm;
2) the labeled data of each class are fed to the topmost layer of the ResNet and the weights and thresholds of the ResNet are fine-tuned with the BP algorithm; this supervised training further reduces the training error and improves the accuracy of the transfer-learning identification model.
6. The method for identifying and classifying cerebrospinal fluid cells in complex small samples as claimed in claim 1, wherein in S5 the test set data are input into the trained classification model; after the multi-layer ResNet mapping, the number of nodes in the output layer equals the number of recognition states, and the input vector activates the corresponding class node in the output layer.
7. The method as claimed in claim 6, wherein among the class nodes in S5, monocytes correspond to node 0, lymphocytes to node 1 and neutrophils to node 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210094305.8A CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210094305.8A CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114494197A true CN114494197A (en) | 2022-05-13 |
Family
ID=81477483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210094305.8A Pending CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114494197A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115100646A (en) * | 2022-06-27 | 2022-09-23 | 武汉兰丁智能医学股份有限公司 | Cell image high-definition rapid splicing identification marking method |
CN116823823A (en) * | 2023-08-29 | 2023-09-29 | 天津市肿瘤医院(天津医科大学肿瘤医院) | Artificial intelligence cerebrospinal fluid cell automatic analysis method |
WO2024000288A1 (en) * | 2022-06-29 | 2024-01-04 | 深圳华大生命科学研究院 | Image stitching method, and gene sequencing system and corresponding gene sequencer |
CN117576098A (en) * | 2024-01-16 | 2024-02-20 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476266A (en) * | 2020-02-27 | 2020-07-31 | 武汉大学 | Non-equilibrium type leukocyte classification method based on transfer learning |
CN113723199A (en) * | 2021-08-03 | 2021-11-30 | 南京邮电大学 | Airport low visibility detection method, device and system |
WO2021247868A1 (en) * | 2020-06-03 | 2021-12-09 | Case Western Reserve University | Classification of blood cells |
-
2022
- 2022-01-26 CN CN202210094305.8A patent/CN114494197A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476266A (en) * | 2020-02-27 | 2020-07-31 | 武汉大学 | Non-equilibrium type leukocyte classification method based on transfer learning |
WO2021247868A1 (en) * | 2020-06-03 | 2021-12-09 | Case Western Reserve University | Classification of blood cells |
CN113723199A (en) * | 2021-08-03 | 2021-11-30 | 南京邮电大学 | Airport low visibility detection method, device and system |
Non-Patent Citations (4)
Title |
---|
HUANHUAN YIN et al.: "Research on Recognition and Classification System of Cerebrospinal Fluid Cells Based on Small Samples", 2021 IEEE International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI), 29 October 2021 (2021-10-29), pages 149-152 *
SAHAR ZAFARI et al.: "Segmentation of Partially Overlapping Nanoparticles Using Concave Points", Advances in Visual Computing, 18 December 2015 (2015-12-18), pages 187-197, XP047332049, DOI: 10.1007/978-3-319-27857-5_17 *
LIU ZAIHAO: "Image segmentation of adhered near-circular objects based on concave point and centroid detection" (基于凹点和重心检测的粘连类圆形目标图像分割), China Master's Theses Full-text Database, Information Science and Technology, no. 02, 15 February 2020 (2020-02-15), pages 138-1382 *
YIN HUANHUAN: "Design and implementation of a cerebrospinal fluid cell microscopic image recognition and classification system" (脑脊液细胞显微图像识别与分类系统设计与实现), Wanfang Data Knowledge Service Platform, 1 November 2023 (2023-11-01), pages 1-86 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115100646A (en) * | 2022-06-27 | 2022-09-23 | 武汉兰丁智能医学股份有限公司 | Cell image high-definition rapid splicing identification marking method |
WO2024000288A1 (en) * | 2022-06-29 | 2024-01-04 | 深圳华大生命科学研究院 | Image stitching method, and gene sequencing system and corresponding gene sequencer |
CN116823823A (en) * | 2023-08-29 | 2023-09-29 | 天津市肿瘤医院(天津医科大学肿瘤医院) | Artificial intelligence cerebrospinal fluid cell automatic analysis method |
CN116823823B (en) * | 2023-08-29 | 2023-11-14 | 天津市肿瘤医院(天津医科大学肿瘤医院) | Artificial intelligence cerebrospinal fluid cell automatic analysis method |
CN117576098A (en) * | 2024-01-16 | 2024-02-20 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
CN117576098B (en) * | 2024-01-16 | 2024-04-19 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114494197A (en) | Cerebrospinal fluid cell identification and classification method for small-complexity sample | |
US5933519A (en) | Cytological slide scoring apparatus | |
US4965725A (en) | Neural network based automated cytological specimen classification system and method | |
US5544650A (en) | Automated specimen classification system and method | |
JP2831282B2 (en) | Interaction method of object evaluation | |
Shafique et al. | Computer-assisted acute lymphoblastic leukemia detection and diagnosis | |
US20130094750A1 (en) | Methods and systems for segmentation of cells for an automated differential counting system | |
Wu et al. | A hematologist-level deep learning algorithm (BMSNet) for assessing the morphologies of single nuclear balls in bone marrow smears: algorithm development | |
Parab et al. | Red blood cell classification using image processing and CNN | |
EP0745243B9 (en) | Automated cytological specimen classification methods | |
Beksaç et al. | An artificial intelligent diagnostic system on differential recognition of hematopoietic cells from microscopic images | |
KR102624956B1 (en) | Method for detecting cells with at least one malformation in a cell sample | |
JP3916395B2 (en) | Test system with sample pretreatment function | |
CN114332855A (en) | Unmarked leukocyte three-classification method based on bright field microscopic imaging | |
Simon et al. | Shallow cnn with lstm layer for tuberculosis detection in microscopic images | |
KR20010017092A (en) | Method for counting and analyzing morphology of blood cell automatically | |
CN112613505A (en) | Cell micronucleus identification, positioning and counting method based on deep learning | |
CN112036334A (en) | Method, system and terminal for classifying visible components in sample to be detected | |
Evangeline et al. | Computer aided system for human blood cell identification, classification and counting | |
CN113139928A (en) | Training method of pulmonary nodule detection model and pulmonary nodule detection method | |
Mustafa et al. | Malaria parasite diagnosis using computational techniques: a comprehensive review | |
Muhamad et al. | A deep learning method for detecting leukemia in real images | |
Priyankara et al. | An extensible computer vision application for blood cell recognition and analysis | |
CN112819057A (en) | Automatic identification method of urinary sediment image | |
CN112924452A (en) | Blood examination auxiliary system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |