CN114494197A - Cerebrospinal fluid cell identification and classification method for small-complexity sample - Google Patents
Cerebrospinal fluid cell identification and classification method for small-complexity sample
- Publication number
- CN114494197A CN114494197A CN202210094305.8A CN202210094305A CN114494197A CN 114494197 A CN114494197 A CN 114494197A CN 202210094305 A CN202210094305 A CN 202210094305A CN 114494197 A CN114494197 A CN 114494197A
- Authority
- CN
- China
- Prior art keywords
- cerebrospinal fluid
- training
- model
- image
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 210000001175 cerebrospinal fluid Anatomy 0.000 title claims abstract description 57
- 238000000034 method Methods 0.000 title claims abstract description 44
- 238000012549 training Methods 0.000 claims abstract description 67
- 210000004027 cell Anatomy 0.000 claims abstract description 49
- 238000012360 testing method Methods 0.000 claims abstract description 15
- 239000012535 impurity Substances 0.000 claims abstract description 14
- 210000001616 monocyte Anatomy 0.000 claims abstract description 7
- 210000004698 lymphocyte Anatomy 0.000 claims abstract description 6
- 210000000440 neutrophil Anatomy 0.000 claims abstract description 6
- 238000012546 transfer Methods 0.000 claims abstract description 6
- 239000000853 adhesive Substances 0.000 claims abstract description 3
- 230000001070 adhesive effect Effects 0.000 claims abstract description 3
- 239000011521 glass Substances 0.000 claims abstract 2
- 238000001514 detection method Methods 0.000 claims description 16
- 238000013135 deep learning Methods 0.000 claims description 13
- 238000013526 transfer learning Methods 0.000 claims description 13
- 238000013507 mapping Methods 0.000 claims description 10
- 230000006870 function Effects 0.000 claims description 7
- 230000008569 process Effects 0.000 claims description 6
- 230000011218 segmentation Effects 0.000 claims description 6
- 230000008859 change Effects 0.000 claims description 5
- 238000013145 classification model Methods 0.000 claims description 5
- 238000013508 migration Methods 0.000 claims description 5
- 230000005012 migration Effects 0.000 claims description 5
- 230000000877 morphologic effect Effects 0.000 claims description 4
- 238000007781 pre-processing Methods 0.000 claims description 4
- 238000013519 translation Methods 0.000 claims description 4
- 238000011109 contamination Methods 0.000 abstract 1
- 238000003384 imaging method Methods 0.000 abstract 1
- 238000005457 optimization Methods 0.000 abstract 1
- 238000003745 diagnosis Methods 0.000 description 12
- 238000003748 differential diagnosis Methods 0.000 description 8
- 210000003169 central nervous system Anatomy 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 208000035473 Communicable disease Diseases 0.000 description 5
- 230000002411 adverse Effects 0.000 description 4
- 230000021164 cell adhesion Effects 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 230000000191 radiation effect Effects 0.000 description 4
- 230000003044 adaptive effect Effects 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000005484 gravity Effects 0.000 description 3
- 230000001580 bacterial effect Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 210000002540 macrophage Anatomy 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- 206010003445 Ascites Diseases 0.000 description 1
- 102000001554 Hemoglobins Human genes 0.000 description 1
- 108010054147 Hemoglobins Proteins 0.000 description 1
- 108010017480 Hemosiderin Proteins 0.000 description 1
- 206010027202 Meningitis bacterial Diseases 0.000 description 1
- 206010027259 Meningitis tuberculous Diseases 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 206010057249 Phagocytosis Diseases 0.000 description 1
- 208000032851 Subarachnoid Hemorrhage Diseases 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 208000022971 Tuberculous meningitis Diseases 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 201000009904 bacterial meningitis Diseases 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000036755 cellular response Effects 0.000 description 1
- 208000037976 chronic inflammation Diseases 0.000 description 1
- 230000006020 chronic inflammation Effects 0.000 description 1
- 230000002380 cytological effect Effects 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 239000012634 fragment Substances 0.000 description 1
- 238000012760 immunocytochemical staining Methods 0.000 description 1
- 208000001223 meningeal tuberculosis Diseases 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000008782 phagocytosis Effects 0.000 description 1
- 210000004910 pleural fluid Anatomy 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 230000009385 viral infection Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention discloses a method for identifying and classifying cerebrospinal fluid cells from complex small samples. The specific steps are as follows. S1: acquire cell images from a glass-slide sample on a microscope stitching-imaging platform, including images of monocytes, lymphocytes and neutrophils. S2: preprocess the resulting image set; because the collected cell samples contain background impurities caused by lens contamination or improper handling, and some individual cells overlap and adhere to one another, the images are filtered and denoised, irrelevant elements are removed, and adherent cells are separated. S3: perform transfer training of the model on the small sample set. S4: use the back-propagation (BP) algorithm to fine-tune the weights and thresholds of the resulting model. S5: use the trained model to identify cerebrospinal fluid cell images in the test set and further optimize the algorithm. The invention can effectively identify the different types of human cerebrospinal fluid cells.
Description
Technical Field
The invention relates to the technical field of medical cell identification and classification, and in particular to a method for identifying and classifying cerebrospinal fluid cells from complex small samples.
Background
Cerebrospinal fluid (CSF) cytology is one of the most important tools available to neurologists. It comprises a total cell count and a cytological classification, and provides important first-hand information about the central nervous system and the range of pathological conditions affecting it. CSF samples must be processed immediately, ideally within one hour of collection. Normal CSF contains mainly T lymphocytes, a small number of monocyte-derived macrophages and the occasional B lymphocyte. A markedly increased cell count dominated by neutrophils under the microscope is typical of bacterial meningitis and calls for a further search for intracellular bacteria; a background dominated by lymphocytes and monocytes is more common in viral infection and chronic inflammation; a mixed cellular reaction can occur in tuberculous meningitis; macrophages that have phagocytosed erythrocytes or contain fragments of hemoglobin degradation products (the latter are called hemosiderin-laden cells) indicate old subarachnoid hemorrhage; and when atypical cells suggestive of a tumor are found under the microscope, the judgment must combine clinical findings with immunocytochemical staining.
Metagenomic sequencing has attracted wide attention in recent years and has a certain value in detecting the pathogens of central nervous system infections, but it still has shortcomings: specimens are easily contaminated, which affects the results and limits the overall sensitivity of pathogen detection; false-negative and false-positive results are common and can even make the results uninterpretable; and the cost is high, which limits widespread use. Metagenomic sequencing therefore cannot yet replace traditional diagnostic methods.
To date, most clinical laboratories still count and classify CSF cells manually: mononuclear cells (lymphocytes and monocytes) and polymorphonuclear cells are counted separately, directly under the microscope, according to nuclear morphology, up to a total of 100 cells. The procedure is cumbersome, time-consuming and labor-intensive; because operators differ in proficiency and adherence to protocol, it is highly subjective, poorly reproducible and error-prone; neither intra-laboratory nor inter-laboratory quality control is possible; and the turnaround time is long, so it cannot meet clinical needs well and is unsuited to the large-scale clinical workload of a modern hospital. Compared with blood and urine, CSF sample volumes are small, and the small sampling volume used in manual counting cannot guarantee counting accuracy. Fully or partially automating cell detection in CSF specimens would solve these problems to some extent, yet no dedicated automatic analyzer for counting and classifying CSF cells currently exists.
With the development of automated cell-detection technology, many researchers have in recent years tried to count and analyze CSF cells with various cell analyzers, such as automatic urine-sediment analyzers and hematology analyzers. Some newer hematology analyzers now include a body-fluid analysis mode, making it possible for laboratories to count and classify cells in pleural fluid, ascites and similar fluids automatically. However, because of the particular nature of CSF, sample volumes are small, and the measurement principles and internal design of these instruments limit their application to CSF specimens. In addition, CSF slides often contain bacterial impurities and adherent cells, which strongly affects the identification and classification of CSF cells.
Automatic CSF cell identification based on deep learning can help neurologists quickly build more scientific differential-diagnosis models, reduce the adverse effect of subjective factors on microscopy results, assist doctors with cell counting and classification, and greatly improve the diagnosis rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic model more standardized and unified, greatly extending the reach of high-quality medical resources to primary medical institutions and raising the differential-diagnosis capability of primary hospitals. Building a deep-learning-based automatic CSF cell identification system is therefore of great significance for improving the diagnosis rate of central nervous system infections and for addressing regional disparities in medical care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.
Summary of the Invention
1. Technical Problem to Be Solved
The purpose of the invention is to address the problems of the prior art: because of the particular nature of cerebrospinal fluid, sample volumes are small, and the measurement principles and internal design of existing instruments limit their application to the examination of cerebrospinal fluid specimens; in addition, cerebrospinal fluid slides contain a certain amount of bacterial impurities and adherent cells, which strongly affects the identification and classification of cerebrospinal fluid cells. To this end, a method for identifying and classifying cerebrospinal fluid cells from complex small samples is proposed.
2. Technical Solution
To achieve the above object, the invention adopts the following technical solution:
A method for identifying and classifying cerebrospinal fluid cells from complex small samples, comprising the following steps:
S1: acquire images of the sample slide with an automatic microscope scanning platform to obtain a complete image set of a cerebrospinal fluid cell slide containing many cells;
S2: preprocess the resulting image set: filter and denoise the images, remove irrelevant elements from the pictures, separate cells that adhere to one another, and split the resulting sample images into batches to form a training set and a test set;
S3: perform transfer training of the model on the small sample set, using a deep-learning network trained in a related field to carry out transfer learning on the small-sample data set;
S4: fine-tune the weights and thresholds of the trained model with the back-propagation (BP) algorithm to further optimize the model;
S5: feed the test set into the model; the output is the cerebrospinal fluid cell identification result.
Preferably, in S1 the cerebrospinal fluid cell slide is placed on the motorized translation stage of the microscope; the software system locates the diagonal coordinate points of the slide's scanning range, determines the scanning range of the image and records the size of the scanned image area; the software platform then stitches the acquired images into one complete cell-sample picture, and this step is repeated for subsequent slide image acquisitions.
Preferably, the specific steps for preprocessing the image set in S2 are:
Step 1: to deal with irrelevant impurities in the sample background, first separate the background of the image, obtain a binary image with the maximum between-class variance (Otsu) method, and smooth the contours of the targets in the binary image with a morphological opening operation; the opening operation removes background impurities that are not targets, and the Canny edge detection algorithm then extracts the contour edge information of the targets;
Step 2: use concave-point detection to segment adherent cells; a concave point of adherent cells is the point of maximum local curvature in the concave region formed when two or more roughly circular targets overlap and stick together; for a nearly circular object there is no abrupt change of curvature unless two or more cells are present;
Step 3: ellipse fitting; to recover the contour boundary that an adherent target loses through adhesion, the algorithm uses the prior knowledge that targets are generally roughly circular and applies least-squares ellipse fitting to complete the splitting of adherent cells.
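A minimal sketch of Steps 1 to 3 is given below for illustration. OpenCV is assumed as the implementation library (the patent does not name one), and the kernel size, Canny thresholds and the helper name `preprocess_and_fit` are illustrative assumptions, not part of the disclosure.

```python
import cv2

def preprocess_and_fit(gray):
    """Otsu binarization, morphological opening, Canny edges and
    least-squares ellipse fitting for roughly circular CSF cells."""
    # Step 1: maximum between-class variance (Otsu) -> binary image
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Morphological opening smooths contours and removes background impurities
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # Canny edge detection extracts the contour edge information
    edges = cv2.Canny(opened, 50, 150)

    # Step 3: least-squares ellipse fit on each sufficiently large contour
    contours, _ = cv2.findContours(opened, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    ellipses = [cv2.fitEllipse(c) for c in contours if len(c) >= 5]
    return opened, edges, ellipses
```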
Preferably, in S3 a multi-layer ResNet model already trained in another field is used; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classes. With this pre-trained transfer, the parameters of the current multi-layer ResNet are taken as the initial parameters of the invention, and the network is then trained with the image data processed in S2. The specific procedure is:
1) determine the number of nodes of the first network layer, i.e. the number of input-layer nodes, from the dimensionality of the input data;
2) feed the data into the residual network units; following the property of the ResNet network, namely the identity mapping of the residual network, the output of each module is its current input plus the residual, and the network is trained layer by layer with the training data;
3) use the already-trained ResNet network for transfer learning, taking its well-trained parameters as the initial training parameters of this model; this saves part of the training time and training samples and is well suited to learning and training on small samples.
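One way to realize this transfer step is sketched below in PyTorch; the use of torchvision, the choice of ResNet-18 with ImageNet weights, and the decision to freeze the transferred layers at first are assumptions made for illustration, since the patent only specifies a multi-layer ResNet pretrained elsewhere with a three-node output.

```python
import torch.nn as nn
from torchvision import models

# Load a ResNet pretrained in another domain (ImageNet here, as an assumption)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Keep everything before the fully connected layer and replace the head
# with three output nodes: monocyte, lymphocyte, neutrophil.
model.fc = nn.Linear(model.fc.in_features, 3)

# Optionally freeze the transferred layers so only the new head is trained first
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False
```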
Preferably, the specific steps for optimizing the model in S4 are:
1) after training is complete, the model is trained in a supervised manner by adding labeled data at the topmost layer of the ResNet, i.e. the relevant parameters of the network are fine-tuned with the back-propagation (BP) algorithm;
2) the labeled data of each class are fed into the topmost layer of the ResNet, and the weights and thresholds of the ResNet are fine-tuned with the BP algorithm; this supervised training further reduces the training error and improves the accuracy of the transfer-learning recognition model.
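A minimal supervised fine-tuning loop for this step might look as follows; the cross-entropy loss, SGD optimizer, learning rate, epoch count and the `train_loader` of labeled cell patches are illustrative assumptions rather than values stated in the patent.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

model.train()
for epoch in range(20):                      # epoch count is an assumption
    for images, labels in train_loader:      # labeled CSF cell images
        optimizer.zero_grad()
        outputs = model(images)              # forward pass through the ResNet
        loss = criterion(outputs, labels)    # error at the output layer
        loss.backward()                      # back-propagate the error (BP)
        optimizer.step()                     # adjust weights and biases
```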
Preferably, in S5 the test-set data are fed into the trained classification model; after the multi-layer ResNet mapping, the number of output-layer nodes equals the number of recognition states, and the input vector activates the corresponding class node at the output layer.
Preferably, among the class nodes in S5, node 0 corresponds to monocytes, node 1 to lymphocytes and node 2 to neutrophils.
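Inference over the test set with this node-to-class mapping could then be sketched as follows; `test_loader` and the dictionary name are assumptions, and `model` is the fine-tuned network from the previous sketches.

```python
import torch

CLASS_NAMES = {0: "monocyte", 1: "lymphocyte", 2: "neutrophil"}

model.eval()
predictions = []
with torch.no_grad():
    for images, _ in test_loader:
        logits = model(images)
        nodes = logits.argmax(dim=1)          # index of the activated class node
        predictions.extend(CLASS_NAMES[int(n)] for n in nodes)
```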
3. Beneficial Effects
Compared with the prior art, the advantages of the invention are:
(1) The invention effectively solves the problem that background impurities in the acquired images make feature extraction difficult, and adapts well to cerebrospinal fluid cell recognition against complex backgrounds. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained model parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adherent cells, the invention also exploits the near-circularity of the cells and predicts their center points for segmentation, which broadens the range of situations to which the invention applies.
(2) In the invention, deep-learning-based automatic cerebrospinal fluid cell identification can help neurologists quickly build more scientific differential-diagnosis models, reduce the adverse effect of subjective factors on microscopy results, assist doctors with cell counting and classification, and greatly improve the diagnosis rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic model more standardized and unified, greatly extending the reach of high-quality medical resources to primary medical institutions and raising the differential-diagnosis capability of primary hospitals. Building a deep-learning-based automatic cerebrospinal fluid cell identification system is therefore of great significance for improving the diagnosis rate of central nervous system infections and for addressing regional disparities in medical care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.
Brief Description of the Drawings
Fig. 1 is a technical flowchart of the method for identifying and classifying cerebrospinal fluid cells from complex small samples proposed by the invention;
Fig. 2 is a schematic diagram of concave points in the method proposed by the invention;
Fig. 3 shows the ResNet model used for transfer learning in the invention;
Fig. 4 shows an example of transfer learning in the invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them.
Example 1:
Referring to Fig. 1, a method for identifying and classifying cerebrospinal fluid cells from complex small samples comprises the following steps:
S1: acquire images of the sample slide with an automatic microscope scanning platform to obtain a complete image set of a cerebrospinal fluid cell slide containing many cells;
Place the cerebrospinal fluid cell slide on the motorized translation stage of the microscope; use the software system to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record the size of the scanned image area; use the software platform to stitch the acquired images into one complete cell-sample picture, and repeat this step for subsequent slide image acquisitions;
S2: preprocess the resulting image set: filter and denoise the images, remove irrelevant elements from the pictures, separate cells that adhere to one another, and split the resulting sample images into batches to form a training set and a test set;
The specific steps for preprocessing the image set are:
Step 1: to deal with irrelevant impurities in the sample background, first separate the background of the image, obtain a binary image with the maximum between-class variance (Otsu) method, and smooth the contours of the targets in the binary image with a morphological opening operation; the opening operation removes background impurities that are not targets, and the Canny edge detection algorithm then extracts the contour edge information of the targets;
Step 2: use concave-point detection to segment adherent cells; a concave point of adherent cells is the point of maximum local curvature in the concave region formed when two or more roughly circular targets overlap and stick together; for a nearly circular object there is no abrupt change of curvature unless two or more cells are present;
Step 3: ellipse fitting; to recover the contour boundary that an adherent target loses through adhesion, the algorithm uses the prior knowledge that targets are generally roughly circular and applies least-squares ellipse fitting to complete the splitting of adherent cells;
S3: perform transfer training of the model on the small sample set, using a deep-learning network trained in a related field to carry out transfer learning on the small-sample data set;
A multi-layer ResNet model already trained in another field is used; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classes. With this pre-trained transfer, the parameters of the current multi-layer ResNet are taken as the initial parameters of the invention, and the network is then trained with the image data processed in S2. The specific procedure is:
1) determine the number of nodes of the first network layer, i.e. the number of input-layer nodes, from the dimensionality of the input data;
2) feed the data into the residual network units; following the property of the ResNet network, namely the identity mapping of the residual network, the output of each module is its current input plus the residual, and the network is trained layer by layer with the training data;
3) use the already-trained ResNet network for transfer learning, taking its well-trained parameters as the initial training parameters of this model; this saves part of the training time and training samples and is well suited to learning and training on small samples;
S4: fine-tune the weights and thresholds of the trained model with the BP algorithm to further optimize the model;
The specific steps for optimizing the model are:
1) after training is complete, the model is trained in a supervised manner by adding labeled data at the topmost layer of the ResNet, i.e. the relevant parameters of the network are fine-tuned with the back-propagation (BP) algorithm;
2) the labeled data of each class are fed into the topmost layer of the ResNet, and the weights and thresholds of the ResNet are fine-tuned with the BP algorithm; this supervised training further reduces the training error and improves the accuracy of the transfer-learning recognition model.
S5: feed the test set into the model; the output is the cerebrospinal fluid cell identification result.
The test-set data are fed into the trained classification model; after the multi-layer ResNet mapping, the number of output-layer nodes equals the number of recognition states, and the input vector activates the corresponding class node at the output layer; among the class nodes, node 0 corresponds to monocytes, node 1 to lymphocytes and node 2 to neutrophils.
In the invention, the problem that background impurities in the acquired images make feature extraction difficult is effectively solved, and the method adapts well to cerebrospinal fluid cell recognition against complex backgrounds. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained model parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adherent cells, the invention also exploits the near-circularity of the cells and predicts their center points for segmentation, which broadens the range of situations to which the invention applies.
In the invention, deep-learning-based automatic cerebrospinal fluid cell identification can help neurologists quickly build more scientific differential-diagnosis models, reduce the adverse effect of subjective factors on microscopy results, assist doctors with cell counting and classification, and greatly improve the diagnosis rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic model more standardized and unified, greatly extending the reach of high-quality medical resources to primary medical institutions and raising the differential-diagnosis capability of primary hospitals. Building a deep-learning-based automatic cerebrospinal fluid cell identification system is therefore of great significance for improving the diagnosis rate of central nervous system infections and for addressing regional disparities in medical care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.
Example 2:
Referring to Figs. 1 to 4, a method for identifying and classifying cerebrospinal fluid cells from complex small samples comprises the following steps:
S1: acquire images of the sample slide with an automatic microscope scanning platform to obtain a complete image set of a cerebrospinal fluid cell slide containing many cells;
Place the cerebrospinal fluid cell slide on the motorized translation stage of the microscope; use the software system to locate the diagonal coordinate points of the slide's scanning range, determine the scanning range of the image and record the size of the scanned image area; use the software platform to stitch the acquired images into one complete cell-sample picture, and repeat this step for subsequent slide image acquisitions;
S2: preprocess the resulting image set: filter and denoise the images, remove irrelevant elements from the pictures, separate cells that adhere to one another, and split the resulting sample images into batches to form a training set and a test set;
The specific steps for preprocessing the image set are:
Step 1: to deal with irrelevant impurities in the sample background, first separate the background of the image, obtain a binary image with the maximum between-class variance (Otsu) method, and smooth the contours of the targets in the binary image with a morphological opening operation; the opening operation removes background impurities that are not targets, and the Canny edge detection algorithm then extracts the contour edge information of the targets;
Step 2: use concave-point detection to segment adherent cells; a concave point of adherent cells is the point of maximum local curvature in the concave region formed when two or more roughly circular targets overlap and stick together; for a nearly circular object there is no abrupt change of curvature unless two or more cells are present;
1) Concave-point detection:
First, corner points on the target contour are detected with an improved curvature scale space (CSS) algorithm. This improved CSS algorithm retains all true corner points at a relatively low scale and then compares the curvature of every candidate corner point with an adaptive local threshold to remove redundant corners. In general, the adaptive local threshold of a candidate corner is determined from the curvature of its neighborhood region, and candidate corners whose absolute curvature is below their local threshold are eliminated. Among the corner candidates, some points are detected as local maxima of curvature even though the differences between neighboring points within their region of support (ROS) are very small, so a suitable region must also be chosen when selecting the region of support.
The adaptive local threshold is set as follows:
where the first term is the mean curvature of the neighborhood region, p denotes the position of the candidate corner point, R1 and R2 are the sizes of the region of support, and C is a coefficient;
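The threshold equation itself appears only as an image in the original document and is not reproduced in the text. The standard adaptive-threshold form from the CSS corner-detection literature, which uses exactly the quantities named here, is (as an assumption, not a verbatim reproduction of the patent's equation):

$$T(p) \;=\; C \cdot \bar{\kappa}(p) \;=\; \frac{C}{R_1 + R_2 + 1} \sum_{i = p - R_1}^{p + R_2} \bigl|\kappa(i)\bigr|$$

where $\kappa(i)$ is the curvature at contour point i.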
2) Contour-segment grouping:
The concave points obtained in 1) are used to split the contour of the adherent region into several contour segments. Because not every contour segment corresponds to a separate target, several contour segments may belong to the same target, so the segments belonging to the same target must be grouped together. For a contour segment s_i and another contour segment s_j within a certain neighborhood of it, if s_i and s_j satisfy the grouping conditions they are placed in the same group; the grouping method involves the following three constraints:
Condition 1: if the average distance deviation (ADD) produced by the ellipse fitted to the grouped segments is smaller than the average distance deviation produced by the ellipse fitted to any single contour segment before grouping, these contour segments are placed in the same group.
Condition 2: if the center of gravity of the ellipse fitted to the grouped segments is close to the centers of gravity of the ellipses fitted to each contour segment individually, the segments can be placed in one group.
Condition 3: if the centers of gravity of the ellipses fitted separately to any two contour segments s_i and s_j are very close to each other, the segments can be placed in one group;
Step 3: ellipse fitting; to recover the contour boundary that an adherent target loses through adhesion, the algorithm uses the prior knowledge that targets are generally roughly circular and applies least-squares ellipse fitting to complete the splitting of adherent cells;
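A compact sketch of the adhesion-splitting idea is given below. As a simplification it substitutes OpenCV convexity defects for the improved CSS curvature-maximum detector described above, omits the contour-segment grouping step, and fits a least-squares ellipse to each resulting contour segment; the function name `split_adherent_cells` and the defect-depth threshold are illustrative assumptions.

```python
import cv2
import numpy as np

def split_adherent_cells(contour, min_depth=5.0):
    """Split an adherent-cell contour at concave points and fit ellipses.

    Convexity defects stand in for the CSS curvature-maximum concave points
    of the patent; contour-segment grouping (conditions 1-3) is omitted."""
    hull = cv2.convexHull(contour, returnPoints=False)
    defects = cv2.convexityDefects(contour, hull)
    if defects is None:
        return [cv2.fitEllipse(contour)] if len(contour) >= 5 else []

    # Indices of sufficiently deep concave points along the contour
    concave_idx = sorted(int(d[0][2]) for d in defects
                         if d[0][3] / 256.0 > min_depth)
    if len(concave_idx) < 2:
        return [cv2.fitEllipse(contour)] if len(contour) >= 5 else []

    # Cut the closed contour into segments between consecutive concave points
    segments = []
    for a, b in zip(concave_idx, concave_idx[1:] + [concave_idx[0]]):
        seg = contour[a:b] if a < b else np.vstack((contour[a:], contour[:b]))
        if len(seg) >= 5:                    # cv2.fitEllipse needs >= 5 points
            segments.append(seg)

    return [cv2.fitEllipse(seg) for seg in segments]
```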
S3: perform transfer training of the model on the small sample set, using a deep-learning network trained in a related field to carry out transfer learning on the small-sample data set;
A multi-layer ResNet model already trained in another field is used; the part of the model before the fully connected layer is retained, and the output part is given three output nodes according to the required classes. With this pre-trained transfer, the parameters of the current multi-layer ResNet model are taken as the initial parameters of the invention, and the network is then trained with the image data processed in Step 2 of claim 3; an example of transfer learning is shown in Fig. 4. The specific implementation is as follows:
ResNet is composed of multiple convolution modules connected in series; each convolution module includes a convolution layer and a pooling layer, and Fig. 3 shows one ResNet module. During training, the target mapping of the unit (i.e. the optimal solution to be approached) is taken as F(x)+x while the output is y+x, so the training objective becomes making y approach F(x); in other words, the main part x that is identical before and after the mapping is removed, which highlights the small change, i.e. the residual.
Expressed mathematically:
$y = F(x, \{W_i\}) + W_s x$  (2)
Here x is the input of the residual unit, y is its output, F(x) is the target mapping, and $\{W_i\}$ are the convolution layers in the residual unit. $W_s$ is a convolution with a 1×1 kernel whose role is to reduce or increase the dimensionality of x so that it matches the size of the output y (because the two must be summed).
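As a sketch of equation (2), a basic residual unit with the optional 1×1 projection $W_s$ could be written in PyTorch as follows; the two-convolution form of F(x), the batch normalization and the class name are illustrative assumptions.

```python
import torch.nn as nn

class ResidualUnit(nn.Module):
    """y = F(x, {W_i}) + W_s * x, with W_s a 1x1 convolution that is used
    only when the input and output shapes differ."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.F = nn.Sequential(                  # the residual mapping F(x)
            nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.Ws = nn.Identity()                  # identity shortcut by default
        if stride != 1 or in_ch != out_ch:       # 1x1 conv to match dimensions
            self.Ws = nn.Conv2d(in_ch, out_ch, 1, stride, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.F(x) + self.Ws(x))   # residual plus shortcut
```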
The specific procedure is:
1) determine the number of nodes of the first network layer, i.e. the number of input-layer nodes, from the dimensionality of the input data;
2) feed the data into the residual network units; following the property of the ResNet network, namely the identity mapping of the residual network, the output of each module is its current input plus the residual, and the network is trained layer by layer with the training data;
3) the training data collected in the invention do not reach the hundreds of thousands of training samples that deep learning normally requires, but because the already-trained ResNet network is transferred and its well-trained parameters serve as the initial training parameters of this model, part of the training time and training samples is saved, which is well suited to learning and training on small samples;
S4: fine-tune the weights and thresholds of the trained model with the BP algorithm to further optimize the model;
The specific implementation measures are as follows:
(1) Model pre-training treats the transferred weights as the initial weights of the new network; during training, their values are changed by the gradient-descent algorithm.
Gradient-descent algorithm:
1) For each training sample, from 0 up to the number of samples in the training set:
① compute the gradients of the weight w and the bias b of the i-th training sample with respect to the loss function; in the end we obtain the gradient values of the weights and biases for every training sample;
② compute the sum of the gradients of the weights w over all training samples;
③ compute the sum of the gradients of the biases b over all training samples.
2) After the above computation, perform the following:
① using the results of steps ② and ③ above, compute the average of the gradients of the weights and biases over all samples;
② use the update formula to update the weight and bias values (a standard form of this update is sketched below).
Repeat the above process until the loss function converges.
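The update formula referred to in step ② appears in the original only as an image. The standard full-batch gradient-descent update consistent with the averaging described above is, as an assumption:

$$w \;\leftarrow\; w - \frac{\eta}{m}\sum_{i=1}^{m}\frac{\partial C_i}{\partial w}, \qquad b \;\leftarrow\; b - \frac{\eta}{m}\sum_{i=1}^{m}\frac{\partial C_i}{\partial b}$$

where $\eta$ is the learning rate, $m$ the number of training samples and $C_i$ the loss on the i-th sample.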
(2) Reverse fine-tuning means supervised training of the ResNet network to reduce the training error and improve the accuracy of the classification model. Steps of the BP algorithm:
1) input the training set;
2) for each sample x in the training set, set the activation $a^1$ corresponding to the input layer;
Forward propagation:
3) because the output differs from the actual result, compute the error produced by the output layer:
$\delta^L = \nabla_a C \odot \sigma'(z^L)$  (6)
4) back-propagate the error obtained in the previous step from the output layer to the hidden layers:
$\delta^l = \bigl((w^{l+1})^{T} \delta^{l+1}\bigr) \odot \sigma'(z^l)$  (7)
5) use gradient descent to train the parameters, iterating until convergence:
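The update rule that step 5) points to is likewise only an image in the original; the standard back-propagation parameter updates consistent with equations (6) and (7) are, as an assumption:

$$w^{l} \;\leftarrow\; w^{l} - \frac{\eta}{m}\sum_{x}\delta^{x,l}\,(a^{x,l-1})^{T}, \qquad b^{l} \;\leftarrow\; b^{l} - \frac{\eta}{m}\sum_{x}\delta^{x,l}$$

where the sum runs over the samples x of a batch, $\eta$ is the learning rate and $m$ the batch size.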
S5: feed the test set into the model; the output is the cerebrospinal fluid cell identification result.
The test-set data are fed into the trained classification model; after the multi-layer ResNet mapping, the number of output-layer nodes equals the number of recognition states, and the input vector activates the corresponding class node at the output layer; among the class nodes, node 0 corresponds to monocytes, node 1 to lymphocytes and node 2 to neutrophils.
In the invention, the problem that background impurities in the acquired images make feature extraction difficult is effectively solved, and the method adapts well to cerebrospinal fluid cell recognition against complex backgrounds. Because already-trained model parameters are used as the initial training parameters of the invention, part of the training time is saved, and trained model parameters are generally more reliable than randomly initialized ones, so the method is suitable for learning and training on small samples. For samples with adherent cells, the invention also exploits the near-circularity of the cells and predicts their center points for segmentation, which broadens the range of situations to which the invention applies.
In the invention, deep-learning-based automatic cerebrospinal fluid cell identification can help neurologists quickly build more scientific differential-diagnosis models, reduce the adverse effect of subjective factors on microscopy results, assist doctors with cell counting and classification, and greatly improve the diagnosis rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic model more standardized and unified, greatly extending the reach of high-quality medical resources to primary medical institutions and raising the differential-diagnosis capability of primary hospitals. Building a deep-learning-based automatic cerebrospinal fluid cell identification system is therefore of great significance for improving the diagnosis rate of central nervous system infections and for addressing regional disparities in medical care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.
The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited thereto. Any equivalent replacement or modification made, within the technical scope disclosed by the invention, by a person skilled in the art according to the technical solution of the invention and its inventive concept shall fall within the scope of protection of the invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210094305.8A CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210094305.8A CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114494197A true CN114494197A (en) | 2022-05-13 |
Family
ID=81477483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210094305.8A Pending CN114494197A (en) | 2022-01-26 | 2022-01-26 | Cerebrospinal fluid cell identification and classification method for small-complexity sample |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114494197A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115100646A (en) * | 2022-06-27 | 2022-09-23 | 武汉兰丁智能医学股份有限公司 | Cell image high-definition rapid splicing identification marking method |
CN116823823A (en) * | 2023-08-29 | 2023-09-29 | 天津市肿瘤医院(天津医科大学肿瘤医院) | An artificial intelligence method for automatic analysis of cerebrospinal fluid cells |
WO2024000288A1 (en) * | 2022-06-29 | 2024-01-04 | 深圳华大生命科学研究院 | Image stitching method, and gene sequencing system and corresponding gene sequencer |
CN117576098A (en) * | 2024-01-16 | 2024-02-20 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476266A (en) * | 2020-02-27 | 2020-07-31 | 武汉大学 | Non-equilibrium type leukocyte classification method based on transfer learning |
CN113723199A (en) * | 2021-08-03 | 2021-11-30 | 南京邮电大学 | Airport low visibility detection method, device and system |
WO2021247868A1 (en) * | 2020-06-03 | 2021-12-09 | Case Western Reserve University | Classification of blood cells |
-
2022
- 2022-01-26 CN CN202210094305.8A patent/CN114494197A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476266A (en) * | 2020-02-27 | 2020-07-31 | 武汉大学 | Non-equilibrium type leukocyte classification method based on transfer learning |
WO2021247868A1 (en) * | 2020-06-03 | 2021-12-09 | Case Western Reserve University | Classification of blood cells |
CN113723199A (en) * | 2021-08-03 | 2021-11-30 | 南京邮电大学 | Airport low visibility detection method, device and system |
Non-Patent Citations (4)
Title |
---|
HUANHUAN YIN 等: "Research on Recognition and Classification System of Cerebrospinal Fluid Cells Based on Small Samples", 《2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE, ELECTRONIC INFORMATION ENGINEERING AND INTELLIGENT CONTROL TECHNOLOGY (CEI)》, 29 October 2021 (2021-10-29), pages 149 - 152 * |
SAHAR ZAFARI 等: "Segmentation of Partially Overlapping Nanoparticles Using Concave Points", 《ADVANCES IN VISUAL COMPUTING》, 18 December 2015 (2015-12-18), pages 187 - 197, XP047332049, DOI: 10.1007/978-3-319-27857-5_17 * |
刘宰豪: "基于凹点和重心检测的粘连类圆形目标图像分割", 《中国优秀硕士学位论文全文数据库 信息科技辑》, no. 02, 15 February 2020 (2020-02-15), pages 138 - 1382 * |
尹欢欢: "脑脊液细胞显微图像识别与分类系统设计与实现", 《万方数据知识服务平台》, 1 November 2023 (2023-11-01), pages 1 - 86 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115100646A (en) * | 2022-06-27 | 2022-09-23 | 武汉兰丁智能医学股份有限公司 | Cell image high-definition rapid splicing identification marking method |
WO2024000288A1 (en) * | 2022-06-29 | 2024-01-04 | 深圳华大生命科学研究院 | Image stitching method, and gene sequencing system and corresponding gene sequencer |
CN116823823A (en) * | 2023-08-29 | 2023-09-29 | 天津市肿瘤医院(天津医科大学肿瘤医院) | An artificial intelligence method for automatic analysis of cerebrospinal fluid cells |
CN116823823B (en) * | 2023-08-29 | 2023-11-14 | 天津市肿瘤医院(天津医科大学肿瘤医院) | Artificial intelligence cerebrospinal fluid cell automatic analysis method |
CN117576098A (en) * | 2024-01-16 | 2024-02-20 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
CN117576098B (en) * | 2024-01-16 | 2024-04-19 | 武汉互创联合科技有限公司 | Cell division balance evaluation method and device based on segmentation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114494197A (en) | Cerebrospinal fluid cell identification and classification method for small-complexity sample | |
CN106248559B (en) | A kind of five sorting technique of leucocyte based on deep learning | |
CN101809589B (en) | Methods and systems for processing biological specimens utilizing multiple wavelengths | |
CN110473167B (en) | Urine sediment image recognition system and method based on deep learning | |
CN112784767A (en) | Cell example segmentation algorithm based on leukocyte microscopic image | |
CN108257124A (en) | A kind of white blood cell count(WBC) method and system based on image | |
CN101799926B (en) | Ki-67 immunohistochemical pathological image automatic quantitative analysis system | |
CN113628199B (en) | Pathological picture stained tissue area detection method, pathological picture stained tissue area detection system and prognosis state analysis system | |
CN113902669A (en) | Method and system for reading urine exfoliative cell fluid-based smear | |
CN112036334A (en) | Method, system and terminal for classifying visible components in sample to be detected | |
CN114332855A (en) | Unmarked leukocyte three-classification method based on bright field microscopic imaging | |
Rani et al. | Automatic Evaluations of Human Blood Using Deep Learning Concepts | |
KR20010017092A (en) | Method for counting and analyzing morphology of blood cell automatically | |
KR20200136004A (en) | Method for detecting cells with at least one malformation in a cell sample | |
CN112432902A (en) | Automatic detection system and method for judging cell number through peripheral blood cell morphology | |
CN110414317B (en) | Capsule network-based automatic white blood cell classification and counting method | |
CN112001315A (en) | Bone marrow cell classification and identification method based on transfer learning and image texture features | |
Sinha et al. | Detection of leukemia disease using convolutional neural network | |
CN113222944B (en) | Cell nucleus segmentation method and system and device for auxiliary analysis of cancer based on pathological images | |
CN114387596A (en) | Cytopathological smear automatic interpretation system | |
Priyankara et al. | An extensible computer vision application for blood cell recognition and analysis | |
CN112819057A (en) | Automatic identification method of urinary sediment image | |
CN114742803B (en) | Platelet aggregation detection method combining deep learning and digital image processing algorithm | |
Cheng et al. | Application of image recognition technology in pathological diagnosis of blood smears | |
CN113222928B (en) | Urine cytology artificial intelligence urothelial cancer identification system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |