CN114494197A - Cerebrospinal fluid cell identification and classification method for complex small samples

Publication number
CN114494197A
Authority
CN
China
Prior art keywords
cerebrospinal fluid
training
model
image
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210094305.8A
Other languages
Chinese (zh)
Inventor
屈剑锋
万亚辉
刘金卓
满石林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202210094305.8A
Publication of CN114494197A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/12 Edge-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10056 Microscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30024 Cell structures in vitro; Tissue sections in vitro

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for identifying and classifying cerebrospinal fluid cells in complex small samples. The method comprises the following steps. S1: use a microscope stitching imaging platform to acquire cell images from slide samples, including images of monocytes, lymphocytes and neutrophils. S2: preprocess the resulting image set. Because the collected cell samples contain background impurities caused by lens contamination or improper handling, and some cells overlap and adhere to one another, preprocessing mainly filters and denoises the images, removes irrelevant content, and separates adherent cells. S3: train the model on the small sample set by transfer learning. S4: starting from the resulting model, use the backpropagation (BP) algorithm to fine-tune the model's weights and thresholds. S5: use the trained model to identify cerebrospinal fluid cell images in the test set, and further optimize the algorithm. The invention can effectively identify the different types of human cerebrospinal fluid cells.

Figure 202210094305

Description

A method for identifying and classifying cerebrospinal fluid cells in complex small samples

Technical field

The invention relates to the technical field of medical cell identification and classification, and in particular to a method for identifying and classifying cerebrospinal fluid cells in complex small samples.

Background

Cerebrospinal fluid (CSF) cytology is one of the most important tools available to neurologists. It comprises the total cell count and the cytological differential, and provides important first-hand information about the central nervous system and the range of pathological conditions affecting it. CSF samples must be processed immediately, as far as possible within 1 hour of collection. Normal CSF cells are predominantly T lymphocytes, with a small number of monocyte-macrophages and occasional B lymphocytes. A marked increase in cell count with neutrophils predominating under the microscope is seen mostly in bacterial meningitis and calls for a further search for evidence of intracellular bacteria. A cellular background dominated by lymphocytes and monocytes is seen mostly in viral infection and chronic inflammation. A mixed cellular reaction may occur in tuberculous meningitis. Macrophages that have phagocytosed erythrocytes, and macrophages containing fragments of hemoglobin degradation products (the latter are called hemosiderin-laden cells), both indicate an old subarachnoid hemorrhage. When atypical cells found under the microscope raise suspicion of a tumor, the judgment must combine clinical findings with immunocytochemical staining.

In recent years metagenomic sequencing has received wide attention. It has some value for detecting pathogens in infectious diseases of the central nervous system, but shortcomings remain: specimens are easily contaminated, which affects the results and limits the overall sensitivity of pathogen detection; false-negative and false-positive results occur easily and can even leave the results impossible to analyze; and the high cost of testing limits its wide use. Metagenomic sequencing therefore cannot yet replace traditional diagnostic methods.

To date, most clinical laboratories still count and classify the cells in cerebrospinal fluid manually. Mononuclear cells (lymphocytes and monocytes) and polynuclear cells are counted separately under the microscope according to nuclear morphology, 100 cells in total. This method is cumbersome, time-consuming and laborious. Because operators differ in proficiency and in how closely they follow standard practice, it is highly subjective; its results have low reproducibility and large errors; neither internal nor inter-laboratory quality control is possible; and the turnaround time is long. It cannot adequately meet clinical needs and is unsuited to large-scale clinical work in modern hospitals. Compared with blood and urine, the available volume of cerebrospinal fluid is small. The amount sampled for manual counting is small as well, so the precision of the count cannot be guaranteed. Fully or partly automating cell detection in cerebrospinal fluid specimens would solve these problems to some extent. At present there is no dedicated automatic analyzer for counting and classifying cerebrospinal fluid cells.

With the development of automated cell detection, many researchers in recent years have tried to count and analyze cerebrospinal fluid cells with various cell analyzers, such as fully automated urine sediment analyzers and hematology analyzers. Some newer hematology analyzers add a body-fluid mode, which makes it possible for laboratories to count and classify cells in pleural fluid, ascites and similar fluids automatically. However, because of the particular nature of cerebrospinal fluid, the sample volume is small, and the operating principles and internal design of these instruments limit their use on cerebrospinal fluid specimens. In addition, sample slides contain some bacterial impurities and adherent cells, which strongly affects the identification and classification of cerebrospinal fluid cells.

Automatic identification of cerebrospinal fluid cells based on deep learning can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse influence of subjective factors on microscopy results, help physicians count and classify cells, and greatly improve the diagnostic rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic approach more standardized and uniform, greatly extending the reach of high-quality medical resources to primary-care institutions, and raising the level of differential diagnosis in primary hospitals. Building a deep-learning-based automatic identification system for cerebrospinal fluid cells is therefore of great significance for raising the diagnostic rate of infectious diseases of the central nervous system and for addressing regional disparities in care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.

Summary of the invention

1. Technical problem to be solved

The purpose of the invention is to solve the following problems in the prior art: because of the particular nature of cerebrospinal fluid, the sample volume is small; the operating principles and internal design of existing instruments limit their use on cerebrospinal fluid specimens; and sample slides contain some bacterial impurities and adherent cells, which strongly affects the identification and classification of cerebrospinal fluid cells. To this end, a method for identifying and classifying cerebrospinal fluid cells in complex small samples is proposed.

2. Technical solution

To achieve the above purpose, the invention adopts the following technical solution:

A method for identifying and classifying cerebrospinal fluid cells in complex small samples, comprising the following steps:

S1: Use an automated microscope scanning platform to image the sample slides, obtaining a complete image set of cerebrospinal fluid cell slides containing multiple cells;

S2: Preprocess the resulting image set: filter and denoise the images, remove irrelevant content, separate cells that adhere to one another, and divide the resulting sample images in batches into a training set and a test set;

S3: Train the model on the small sample set by transfer learning, using a deep learning network already trained in a related domain;

S4: Use the BP algorithm to fine-tune the weights and thresholds of the trained model by backpropagation, further optimizing the model;

S5: Input the test set into the model; the output is the cerebrospinal fluid cell identification result.

Preferably, in S1 the cerebrospinal fluid cell slide is placed on the motorized translation stage of the microscope. The software system locates the two diagonal corner coordinates of the area to be scanned, which determines the scan range, and the size of the scanned area is recorded. The software platform then stitches the acquired images into one complete image of the cell sample. This step is repeated for each subsequent slide.
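The scan-range step can be illustrated with a minimal sketch that turns two diagonal corner points into a grid of tile positions for later stitching. The function name `tile_origins`, the field-of-view parameters `fov_w` and `fov_h`, and the `overlap` fraction are assumptions made for illustration; the patent does not specify how its software system plans the scan.

```python
import math

def tile_origins(corner_a, corner_b, fov_w, fov_h, overlap=0.1):
    """Return the (x, y) origin of every tile needed to cover the rectangle
    spanned by two diagonal corner points.

    fov_w / fov_h: size of one camera field of view, in stage units.
    overlap: fraction of each tile shared with its neighbour, so that a
             stitching step has common content to align on (assumed value).
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # The two corners may be given in any order.
    x0, x1 = sorted((corner_a[0], corner_b[0]))
    y0, y1 = sorted((corner_a[1], corner_b[1]))
    step_x = fov_w * (1 - overlap)
    step_y = fov_h * (1 - overlap)
    # Enough tiles that the last one reaches the far edge of the range.
    nx = max(1, math.ceil((x1 - x0 - fov_w) / step_x) + 1)
    ny = max(1, math.ceil((y1 - y0 - fov_h) / step_y) + 1)
    # Row-major order: scan each row left to right, then move down.
    return [(x0 + i * step_x, y0 + j * step_y)
            for j in range(ny) for i in range(nx)]
```

Recording `nx`, `ny` and the step sizes is one way to "note the size of the scanned range": together they fix where each tile belongs in the stitched image.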

Preferably, the specific steps for preprocessing the image set in S2 are:

Step 1: To deal with irrelevant impurities in the sample background, first separate the background from the image. A binary image is obtained with the maximum between-class variance (Otsu) method, and a morphological opening is applied to smooth the contours of the targets in the binary image; the opening also removes non-target impurities from the background. Finally, the Canny edge detection algorithm is used to obtain the contour edges of the targets;

Step 2: Segment adherent cells by concave point detection. A concave point of adherent cells is the point of locally maximal curvature in the concave region formed where two or more roughly circular targets overlap and adhere. A near-circular contour has no abrupt change in curvature; such a change appears only where two or more cells adhere;

Step 3: Ellipse fitting. To recover the contour boundary that a target loses through adhesion, the algorithm uses the prior knowledge that targets are generally roughly circular and fits ellipses by the least-squares method, completing the segmentation of adherent cells.
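The thresholding in Step 1 can be sketched as follows. This is a plain-Python illustration of the maximum between-class variance criterion on a flat list of 8-bit grey levels, not the patent's implementation; the names `otsu_threshold` and `binarize` are assumptions, and whether cells end up above or below the threshold depends on staining and illumination.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of grey levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(pixels, t):
    """Map each pixel to 1 if it is brighter than t, otherwise 0."""
    return [1 if p > t else 0 for p in pixels]
```

In practice an image library would normally supply thresholding, morphological opening, Canny edge detection and ellipse fitting; the sketch only makes the selection criterion explicit.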

Preferably, S3 uses a multi-layer ResNet model that has already been trained in another domain. The part of the model before the fully connected layer is retained, and the output part is given three output nodes, matching the number of classes required. Following the pre-training transfer approach, the parameters of this multi-layer ResNet serve as the initial parameters of the invention, and the network is then trained on the image data processed in S2. The specific process is:

1) Determine the number of nodes in the first layer of the network, that is, the number of input-layer nodes, from the dimension of the input data;

2) Feed the data into the residual units. By the defining property of ResNet, the identity mapping of the residual network, the output of each module is its input plus the residual. The training data are used to train the network layer by layer;

3) Perform transfer learning with the already trained ResNet, using its well-trained parameters as the initial training parameters of the present model. This saves part of the training time and reduces the number of training samples needed, which makes it well suited to learning from small samples.
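The identity-mapping property in item 2) can be written as y = x + F(x). The sketch below shows it with two small fully connected layers standing in for F; a real ResNet uses convolutions and batch normalization, so the dense layers, the final ReLU, and all function names here are simplifying assumptions.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]

def linear(weights, bias, v):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def residual_block(x, w1, b1, w2, b2):
    """Return relu(x + F(x)) with F(x) = W2 * relu(W1 * x + b1) + b2.

    The skip connection adds the unchanged input to the learned residual,
    so a block whose weights are all zero simply passes x through.
    """
    residual = linear(w2, b2, relu(linear(w1, b1, x)))
    return relu([xi + ri for xi, ri in zip(x, residual)])
```

Because the block starts from the identity and only has to learn a correction, stacking many such blocks stays trainable, which is part of why a pretrained ResNet is a convenient starting point for transfer learning.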

Preferably, the specific steps for optimizing the model in S4 are:

1) After training is complete, train the model in a supervised manner by adding label data at the top layer of the ResNet, that is, use the backpropagation (BP) algorithm to fine-tune the relevant parameters of the network;

2) Input the labeled data of each class into the top layer of the ResNet and fine-tune the weights and thresholds (bias terms) of the ResNet with the BP algorithm. This supervised training further reduces the training error and improves the accuracy of the transfer-learning recognition model.
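As a rough illustration of supervised fine-tuning by gradient descent, the sketch below updates the weights `W` and thresholds (biases) `b` of a three-node softmax output layer under a cross-entropy loss. It is a simplification: it assumes the feature vectors come from an already trained backbone and updates only the output layer, whereas the patent describes fine-tuning the ResNet's parameters. The loss function, learning rate and function names are all assumptions.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(W, b, x):
    """Class probabilities of the output layer for feature vector x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk
                    for row, bk in zip(W, b)])

def mean_loss(W, b, X, y):
    """Mean cross-entropy of the labelled samples (X, y)."""
    return -sum(math.log(forward(W, b, x)[t]) for x, t in zip(X, y)) / len(X)

def train_step(W, b, X, y, lr=0.5):
    """One full-batch gradient-descent step; updates W and b in place."""
    n = len(X)
    grad_W = [[0.0] * len(W[0]) for _ in W]
    grad_b = [0.0] * len(b)
    for x, t in zip(X, y):
        p = forward(W, b, x)
        for k in range(len(b)):
            # Gradient of the loss with respect to logit k.
            delta = p[k] - (1.0 if k == t else 0.0)
            grad_b[k] += delta / n
            for j, xj in enumerate(x):
                grad_W[k][j] += delta * xj / n
    for k in range(len(b)):
        b[k] -= lr * grad_b[k]
        for j in range(len(W[k])):
            W[k][j] -= lr * grad_W[k][j]
```

In a full network the same `delta` would be propagated backwards through the earlier layers by the chain rule; deep learning frameworks do this automatically.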

Preferably, in S5 the test set data are input into the trained classification model. After mapping through the multi-layer ResNet, the input vector activates the node of the corresponding class in the output layer, whose number of nodes equals the number of classes to be recognized.

Preferably, the class nodes in S5 are assigned as follows: monocytes are node 0, lymphocytes are node 1, and neutrophils are node 2.
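The node assignment above can be applied to a network output with a short sketch. Treating the most strongly activated output node as the prediction is an assumption (the patent only says the corresponding node is activated), and `CLASS_NAMES` and `predict_label` are illustrative names.

```python
# Node numbering as stated in the text: 0 monocyte, 1 lymphocyte, 2 neutrophil.
CLASS_NAMES = {0: "monocyte", 1: "lymphocyte", 2: "neutrophil"}

def predict_label(outputs):
    """Return (node index, class name) for the most activated output node."""
    if len(outputs) != len(CLASS_NAMES):
        raise ValueError("expected one activation per class node")
    node = max(range(len(outputs)), key=outputs.__getitem__)
    return node, CLASS_NAMES[node]
```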

3. Beneficial effects

Compared with the prior art, the advantages of the invention are:

(1) The invention effectively addresses the difficulty of feature extraction caused by background impurities in the acquired images, and adapts well to identifying cerebrospinal fluid cells against complex backgrounds. Because the parameters of an already trained model are used as the initial training parameters, part of the training time is saved; trained parameters are also generally more reliable than randomly chosen ones, so the approach suits learning from small samples. For samples with adherent cells, the invention uses the near-circular shape of the cells to predict their center points and segment them, which broadens the range of cases the invention can handle.

(2) In the invention, automatic identification of cerebrospinal fluid cells based on deep learning can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse influence of subjective factors on microscopy results, help physicians count and classify cells, and greatly improve the diagnostic rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic approach more standardized and uniform, greatly extending the reach of high-quality medical resources to primary-care institutions, and raising the level of differential diagnosis in primary hospitals. Building a deep-learning-based automatic identification system for cerebrospinal fluid cells is therefore of great significance for raising the diagnostic rate of infectious diseases of the central nervous system and for addressing regional disparities in care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.

Description of drawings

Fig. 1 is a technical flowchart of the method for identifying and classifying cerebrospinal fluid cells in complex small samples proposed by the invention;

Fig. 2 is a schematic diagram of concave points in the method proposed by the invention;

Fig. 3 shows the ResNet model used for transfer learning in the invention;

Fig. 4 shows an example of transfer learning in the invention.

Detailed description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them.

Embodiment 1:

Referring to Fig. 1, a method for identifying and classifying cerebrospinal fluid cells in complex small samples comprises the following steps:

S1: Use an automated microscope scanning platform to image the sample slides, obtaining a complete image set of cerebrospinal fluid cell slides containing multiple cells;

The cerebrospinal fluid cell slide is placed on the motorized translation stage of the microscope. The software system locates the two diagonal corner coordinates of the area to be scanned, which determines the scan range, and the size of the scanned area is recorded. The software platform then stitches the acquired images into one complete image of the cell sample. This step is repeated for each subsequent slide;

S2: Preprocess the resulting image set: filter and denoise the images, remove irrelevant content, separate cells that adhere to one another, and divide the resulting sample images in batches into a training set and a test set;

The specific steps for preprocessing the image set are:

Step 1: To deal with irrelevant impurities in the sample background, first separate the background from the image. A binary image is obtained with the maximum between-class variance (Otsu) method, and a morphological opening is applied to smooth the contours of the targets in the binary image; the opening also removes non-target impurities from the background. Finally, the Canny edge detection algorithm is used to obtain the contour edges of the targets;

Step 2: Segment adherent cells by concave point detection. A concave point of adherent cells is the point of locally maximal curvature in the concave region formed where two or more roughly circular targets overlap and adhere. A near-circular contour has no abrupt change in curvature; such a change appears only where two or more cells adhere;

Step 3: Ellipse fitting. To recover the contour boundary that a target loses through adhesion, the algorithm uses the prior knowledge that targets are generally roughly circular and fits ellipses by the least-squares method, completing the segmentation of adherent cells;
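The concave point detection of Step 2 can be approximated on a polygonal contour by checking which way the boundary turns at each point. The sketch below flags every point where the turn direction is opposite to the overall orientation of the contour; it is a simplified stand-in for the local-curvature-maximum criterion in the text, and the function names and the `stride` parameter are assumptions.

```python
def signed_area(contour):
    """Shoelace formula: positive for counter-clockwise contours
    (in a y-up coordinate system), negative for clockwise ones."""
    closed = contour[1:] + contour[:1]
    return 0.5 * sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(contour, closed))

def concave_points(contour, stride=1):
    """Return the contour points at which the boundary turns inwards.

    contour: ordered list of (x, y) boundary points of one connected region.
    stride:  how many points away the two neighbours are taken; larger
             values smooth out pixel-level jaggedness on real contours.
    """
    n = len(contour)
    orientation = 1 if signed_area(contour) > 0 else -1
    found = []
    for i in range(n):
        px, py = contour[(i - stride) % n]
        cx, cy = contour[i]
        nx, ny = contour[(i + stride) % n]
        # z-component of (current - previous) x (next - current).
        cross = (cx - px) * (ny - cy) - (cy - py) * (nx - cx)
        if cross * orientation < 0:
            found.append(contour[i])
    return found
```

For two overlapping round cells the flagged points cluster in the two notches where the outlines meet. A fuller implementation would keep only the sharpest point of each cluster and split the contour between paired notches before fitting an ellipse to each part.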

S3: Train the model on the small sample set by transfer learning, using a deep learning network already trained in a related domain;

A multi-layer ResNet model that has already been trained in another domain is used. The part of the model before the fully connected layer is retained, and the output part is given three output nodes, matching the number of classes required. Following the pre-training transfer approach, the parameters of this multi-layer ResNet serve as the initial parameters of the invention, and the network is then trained on the image data processed in S2. The specific process is:

1) Determine the number of nodes in the first layer of the network, that is, the number of input-layer nodes, from the dimension of the input data;

2) Feed the data into the residual units. By the defining property of ResNet, the identity mapping of the residual network, the output of each module is its input plus the residual. The training data are used to train the network layer by layer;

3) Perform transfer learning with the already trained ResNet, using its well-trained parameters as the initial training parameters of the present model. This saves part of the training time and reduces the number of training samples needed, which makes it well suited to learning from small samples;

S4: Use the BP algorithm to fine-tune the weights and thresholds of the trained model by backpropagation, further optimizing the model;

The specific steps for optimizing the model are:

1) After training is complete, train the model in a supervised manner by adding label data at the top layer of the ResNet, that is, use the backpropagation (BP) algorithm to fine-tune the relevant parameters of the network;

2) Input the labeled data of each class into the top layer of the ResNet and fine-tune the weights and thresholds (bias terms) of the ResNet with the BP algorithm. This supervised training further reduces the training error and improves the accuracy of the transfer-learning recognition model.

S5: Input the test set into the model; the output is the cerebrospinal fluid cell identification result.

The test set data are input into the trained classification model. After mapping through the multi-layer ResNet, the input vector activates the node of the corresponding class in the output layer, whose number of nodes equals the number of classes to be recognized. Among the class nodes, monocytes are node 0, lymphocytes are node 1, and neutrophils are node 2.
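Once each cell image has been assigned a class node, the per-class tallies give the kind of differential count that the background section describes being done by hand. The sketch below is an illustrative post-processing step not described in the patent; `differential_count` and the percentage output are assumptions.

```python
from collections import Counter

# Index in this tuple = output node number given in the text.
CLASS_NAMES = ("monocyte", "lymphocyte", "neutrophil")

def differential_count(predicted_nodes):
    """Map each class name to (count, percentage of all classified cells)."""
    counts = Counter(predicted_nodes)
    total = len(predicted_nodes)
    result = {}
    for node, name in enumerate(CLASS_NAMES):
        n = counts.get(node, 0)
        result[name] = (n, 100.0 * n / total if total else 0.0)
    return result
```

For example, `differential_count([0, 1, 1, 2, 1])` reports one monocyte, three lymphocytes and one neutrophil, together with their shares of the five classified cells.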

The invention effectively addresses the difficulty of feature extraction caused by background impurities in the acquired images, and adapts well to identifying cerebrospinal fluid cells against complex backgrounds. Because the parameters of an already trained model are used as the initial training parameters, part of the training time is saved; trained parameters are also generally more reliable than randomly chosen ones, so the approach suits learning from small samples. For samples with adherent cells, the invention uses the near-circular shape of the cells to predict their center points and segment them, which broadens the range of cases the invention can handle.

In the invention, automatic identification of cerebrospinal fluid cells based on deep learning can help neurologists quickly establish a more scientific differential diagnosis model, reduce the adverse influence of subjective factors on microscopy results, help physicians count and classify cells, and greatly improve the diagnostic rate. It can also be integrated with the resources of high-level medical institutions, making the overall diagnostic approach more standardized and uniform, greatly extending the reach of high-quality medical resources to primary-care institutions, and raising the level of differential diagnosis in primary hospitals. Building a deep-learning-based automatic identification system for cerebrospinal fluid cells is therefore of great significance for raising the diagnostic rate of infectious diseases of the central nervous system and for addressing regional disparities in care and misdiagnosis by junior and primary-care physicians, ultimately benefiting patients.

Example 2:

Referring to Figures 1-4, a cerebrospinal fluid cell identification and classification method for complex small samples comprises the following steps:

S1: Acquire images of the sample slides using an automatic microscope scanning platform to obtain a complete image set of cerebrospinal fluid cell slides containing multiple cells.

Place the cerebrospinal fluid cell slide on the motorized translation stage of the microscope. Using the software system, mark two diagonally opposite coordinate points to define the scanning range of the slide, and record the size of the scanned range. The software platform then stitches the acquired images into one complete picture of the cell sample. Repeat this step for each subsequent slide.
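The scan-range step above can be illustrated with a small sketch that turns two diagonal corner points into a grid of stage positions. The field-of-view size, the overlap fraction, and the serpentine ordering are assumptions made for illustration; the source does not specify how the software system plans the tiles.

```python
import math

def scan_grid(corner_a, corner_b, fov_w, fov_h, overlap=0.1):
    """Return (rows, cols, positions) of tile centres covering the rectangle
    spanned by two diagonal corner points, visited in serpentine order.
    fov_w, fov_h and overlap are hypothetical parameters."""
    x0, x1 = sorted((corner_a[0], corner_b[0]))
    y0, y1 = sorted((corner_a[1], corner_b[1]))
    step_x = fov_w * (1.0 - overlap)
    step_y = fov_h * (1.0 - overlap)
    # Number of tiles needed along each axis (at least one).
    cols = max(1, math.ceil((x1 - x0 - fov_w) / step_x) + 1)
    rows = max(1, math.ceil((y1 - y0 - fov_h) / step_y) + 1)
    positions = []
    for r in range(rows):
        # Reverse every other row so the stage never jumps back across the slide.
        order = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in order:
            positions.append((x0 + fov_w / 2 + c * step_x,
                              y0 + fov_h / 2 + r * step_y))
    return rows, cols, positions
```

The recorded range size corresponds to `(x1 - x0, y1 - y0)`; the overlap between neighbouring tiles is what a stitching routine would use to align them.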

S2: Preprocess the resulting image set: filter and denoise the images, remove irrelevant elements, separate cells that adhere to one another, and divide the resulting sample image set in batches into a training set and a test set.

The specific preprocessing steps are as follows:

Step 1: To deal with irrelevant impurities in the sample background, first separate the background from the image. A binary image is obtained with the maximum between-class variance method (Otsu's method), and a morphological opening operation smooths the contours of the targets in the binary image; the opening operation also removes background impurities that are not targets. Finally, the Canny edge detection algorithm extracts the contour edge information of the targets.
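As a minimal sketch of the first two operations in Step 1, the following pure-Python code computes the maximum between-class variance threshold and applies a morphological opening. The 3×3 square structuring element and the convention that brighter pixels are foreground are assumptions; in practice an image library would be used, and Canny edge detection is not reproduced here.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0-255) that maximises the between-class
    variance; pixels with value > t form one class, the rest the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the class <= t
    sum0 = 0    # intensity sum of the class <= t
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def binarize(gray, t):
    """Mark pixels brighter than t as foreground (1); invert for dark cells."""
    return [[1 if p > t else 0 for p in row] for row in gray]

def _morph(img, reduce_fn):
    """Apply reduce_fn (all -> erosion, any -> dilation) over each 3x3 window.
    Pixels outside the image are ignored."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(y - 1, y + 2)
                      for i in range(x - 1, x + 2)
                      if 0 <= j < h and 0 <= i < w]
            out[y][x] = 1 if reduce_fn(window) else 0
    return out

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return _morph(_morph(img, all), any)
```

Opening keeps blobs that can contain the structuring element and deletes isolated specks, which is why it doubles as impurity removal in this step.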

Step 2: Segment adherent cells using concave point detection. A concave point of adherent cells is the point of locally maximal curvature in the concave region formed where two or more near-circular targets overlap and adhere. The contour of a single near-circular target has no abrupt change in curvature; such a change appears only where two or more cells adhere.

1) Concave point detection:

First, corner points on the target contour are detected with an improved Curvature Scale Space (CSS) algorithm. The improved CSS algorithm retains all true corners at a relatively low scale and then compares the curvature of every candidate corner with an adaptive local threshold to remove redundant corners. The adaptive local threshold of a candidate corner is generally determined from the curvature of its neighborhood, and candidates whose absolute curvature falls below their local threshold are eliminated. Some candidates are detected as local curvature maxima even though the difference between adjacent points within their region of support (ROS) is very small, so the region of support must also be chosen appropriately.

The adaptive local threshold is set as follows:

$$T(p) = C \times \bar{K} = C \times \frac{1}{R_1 + R_2 + 1} \sum_{i=p-R_2}^{p+R_1} \lvert K(i) \rvert \quad (1)$$

where $\bar{K}$ is the mean curvature of the neighborhood region, $p$ is the position of the candidate corner, $R_1$ and $R_2$ are the sizes of the support region, and $C$ is a coefficient.
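The thresholding rule described above can be sketched as follows, reading the adaptive local threshold as the coefficient C times the mean absolute curvature over the support region. The curvature values are taken as given; the fixed support sizes `r1`, `r2` and the default `c=1.5` are illustrative assumptions, not values from the source.

```python
def adaptive_threshold(kappa, p, r1, r2, c=1.5):
    """C times the mean absolute curvature over the support region
    [p - r2, p + r1] of a closed contour (indices wrap around)."""
    n = len(kappa)
    window = [abs(kappa[(p + k) % n]) for k in range(-r2, r1 + 1)]
    return c * sum(window) / len(window)

def filter_corners(kappa, r1=3, r2=3, c=1.5):
    """Keep local maxima of |curvature| whose absolute curvature reaches
    their own adaptive local threshold; weaker candidates are dropped."""
    n = len(kappa)
    candidates = [p for p in range(n)
                  if abs(kappa[p]) > abs(kappa[p - 1])
                  and abs(kappa[p]) >= abs(kappa[(p + 1) % n])]
    return [p for p in candidates
            if abs(kappa[p]) >= adaptive_threshold(kappa, p, r1, r2, c)]
```

A sharp corner dominates the mean of its own neighborhood and survives, while a small bump on an otherwise smooth arc barely exceeds its neighbors and is removed.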

2) Contour segment grouping:

The concave points obtained in 1) split the contour of the adhesion region into multiple contour segments. A contour segment does not necessarily correspond to a separate target; several segments may belong to the same target, so segments belonging to the same target need to be grouped together. For a contour segment s_i and another contour segment s_j within a certain neighborhood of it, s_i and s_j are placed in the same group if they satisfy the grouping conditions. The grouping method uses the following three conditions:

Condition 1: If the average distance deviation (ADD) of the ellipse fitted to the combined segments is smaller than the ADD of the ellipse fitted to any one of the segments individually, the segments are placed in the same group.

Condition 2: If the center of gravity of the ellipse fitted to the combined segments is close to the centers of gravity of the ellipses fitted to each segment individually, the segments can be placed in one group.

Condition 3: If the centers of gravity of the ellipses fitted separately to any two contour segments s_i and s_j are very close to each other, the segments can be placed in one group.

Step 3: Ellipse fitting. To recover the contour boundary that an adherent target loses through adhesion, the algorithm uses the prior knowledge that targets are generally near-circular and fits ellipses with a least-squares method to complete the segmentation of the adherent cells.
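The fitting and center-comparison ideas above can be sketched with an algebraic least-squares conic fit. This is one simple least-squares formulation (normalizing the constant term to 1 after centering the points), chosen for brevity; the source does not say which variant it uses, and the helper names and the tolerance `tol` are hypothetical. Each segment needs at least five points.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ellipse_center(points):
    """Least-squares fit of A u^2 + B uv + C v^2 + D u + E v = 1 in
    coordinates centred on the point mean; returns the conic centre (x, y)."""
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    rows = []
    for x, y in points:
        u, v = x - mx, y - my
        rows.append([u * u, u * v, v * v, u, v])
    # Normal equations (M^T M) theta = M^T 1.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atb = [sum(r[i] for r in rows) for i in range(5)]
    a, b, c, d, e = solve_linear(ata, atb)
    det = 4 * a * c - b * b          # positive when the conic is an ellipse
    u0 = (b * e - 2 * c * d) / det   # centre: gradient of the conic is zero
    v0 = (b * d - 2 * a * e) / det
    return mx + u0, my + v0

def same_cell(seg_a, seg_b, tol):
    """Condition 3 style check: the centres of the ellipses fitted to the
    two segments separately lie within tol of each other."""
    (ax, ay), (bx, by) = fit_ellipse_center(seg_a), fit_ellipse_center(seg_b)
    return math.hypot(ax - bx, ay - by) < tol
```

Because the fit needs only part of the boundary, an arc that survives the adhesion is enough to estimate the hidden remainder of the cell outline.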

S3: Perform transfer training of the model for the small sample set, using a deep learning network trained in a related field to carry out transfer learning on the small-sample dataset.

A multi-layer ResNet model already trained in another field is adopted. The part of the model preceding the fully connected layer is retained, and the output part is given three output nodes according to the required classes. Pre-training transfer is then applied: the parameters of the existing multi-layer ResNet model serve as the initial parameters of the present invention, and the network is trained on the image data processed in step 2 of claim 3. An example of transfer learning is shown in Figure 4. The specific implementation is as follows:

ResNet consists of multiple convolution modules connected in series, each comprising a convolution layer and a pooling layer; Figure 3 shows one ResNet module. During training, the target mapping of the unit (the optimal solution to be approached) is assumed to be F(x)+x and the output is y+x, so the training objective becomes making y approach F(x). In other words, the main component x that is identical before and after the mapping is removed, which highlights the small changes (the residual).

Expressed mathematically:

$$y = F(x, \{W_i\}) + W_s x \quad (2)$$

Here x is the input of the residual unit, y is its output, F(x) is the target mapping, and {W_i} are the convolution layers in the residual unit. W_s is a convolution with a 1×1 kernel that reduces or increases the dimension of x so that it matches the size of the output y (because the two must be summed).
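To make the structure of equation (2) concrete, here is a toy residual unit on plain vectors. Dense matrices stand in for the convolution layers, and F is assumed to be two layers with a ReLU in between; both choices are simplifications for illustration and are not taken from the source.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def relu(v):
    return [max(0.0, a) for a in v]

def residual_unit(x, w1, w2, ws=None):
    """y = F(x, {W_i}) + W_s x with F(x) = W2 relu(W1 x).
    When ws is None the shortcut is the identity, which requires the
    output of F to have the same size as x."""
    fx = matvec(w2, relu(matvec(w1, x)))
    shortcut = x if ws is None else matvec(ws, x)
    return [f + s for f, s in zip(fx, shortcut)]
```

When all weights in F are zero the unit simply passes its input through the shortcut, which is the sense in which the layers only have to learn the residual.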

The specific procedure is:

1) Determine the number of nodes in the first layer of the network, i.e., the number of input-layer nodes, from the dimension of the input data.

2) Feed the data into the residual network units. By the characteristic of ResNet, namely the identity mapping of the residual network, the output of each module is its current input plus the residual; the network is trained layer by layer with the training data.

3) The training data collected in the present invention do not reach the hundreds of thousands of training samples that deep learning typically requires. However, because an already-trained ResNet network is transferred and its well-trained parameters serve as the initial training parameters of this model, part of the training time and training samples is saved, which makes the approach well suited to learning from small samples.
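The parameter transfer described above can be sketched independently of any framework: keep every pretrained parameter except the old fully connected head, then attach a new three-node head. The parameter names (`fc.weight`, `fc.bias`, and so on), the dictionary layout, and the small Gaussian initialization are hypothetical; a real implementation would use the model-loading facilities of its deep learning library.

```python
import random

def init_from_pretrained(pretrained, feature_dim, num_classes=3, seed=0):
    """Copy every pretrained parameter except the old fully connected head,
    then attach a freshly initialised head with num_classes outputs.
    The copy is shallow: backbone values are shared with `pretrained`."""
    rng = random.Random(seed)
    params = {name: value for name, value in pretrained.items()
              if not name.startswith("fc.")}
    # New classification head: one weight row per output node.
    params["fc.weight"] = [[rng.gauss(0.0, 0.01) for _ in range(feature_dim)]
                           for _ in range(num_classes)]
    params["fc.bias"] = [0.0] * num_classes
    return params
```

Only the new head starts from random values; everything before it starts from the transferred parameters and is adjusted during the subsequent training.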

S4: Fine-tune the weights and thresholds of the trained model in reverse using the BP (backpropagation) algorithm to further optimize the model.

The specific implementation is as follows:

(1) Model pre-training treats the transferred weights as the initial weights of the new network; their values are changed by the gradient descent algorithm during training.

Gradient descent algorithm:

1) For i from 0 up to the number of training samples:

① Compute the gradient of the loss function with respect to the weight w and the bias b for the i-th training sample. This yields gradient values of the weights and biases for every training sample.

② Compute the sum of the gradients with respect to the weight w over all training samples.

③ Compute the sum of the gradients with respect to the bias b over all training samples.

2) After completing the calculations above, perform the following:

① Using the results of steps ② and ③ above, compute the average gradient of the weights and biases over all samples.

② Update the weight and bias values using the following formulas:

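$$w := w - \eta \, \frac{\partial J(w, b)}{\partial w} \quad (3)$$

$$b := b - \eta \, \frac{\partial J(w, b)}{\partial b} \quad (4)$$

where $\eta$ is the learning rate and $J(w, b)$ is the loss averaged over the training samples.

Repeat the above process until the loss function converges and no longer changes.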
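The gradient descent procedure above can be sketched on a one-variable linear model. The model, the halved squared-error loss, the learning rate `lr`, and the stopping tolerance `tol` are assumptions chosen to keep the example small; the loop structure (per-sample gradients, sums, averages, update, repeat until the loss stops changing) follows the listed steps.

```python
def batch_gradient_descent(xs, ys, lr=0.05, tol=1e-12, max_iter=100000):
    """Fit y ~ w*x + b by full-batch gradient descent on the halved
    mean squared error."""
    w, b = 0.0, 0.0
    m = len(xs)
    prev_loss = float("inf")
    for _ in range(max_iter):
        grad_w_sum, grad_b_sum, loss = 0.0, 0.0, 0.0
        for x, y in zip(xs, ys):          # 1) loop over the training samples
            err = (w * x + b) - y
            grad_w_sum += err * x         # per-sample gradient w.r.t. w, summed
            grad_b_sum += err             # per-sample gradient w.r.t. b, summed
            loss += 0.5 * err * err
        loss /= m
        w -= lr * grad_w_sum / m          # 2) update with the averaged gradients
        b -= lr * grad_b_sum / m
        if abs(prev_loss - loss) < tol:   # stop once the loss no longer changes
            break
        prev_loss = loss
    return w, b
```

For example, `batch_gradient_descent([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])` approaches the line with slope 2 and intercept 1 that generated the data.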

(2) Reverse fine-tuning means supervised training of the ResNet network to reduce the training error and improve the accuracy of the classification model. The steps of the BP algorithm are:

1) Input the training set.

2) For each sample x in the training set, set the activation $a^1$ of the input layer:

Forward propagation:

$$z^l = w^l a^{l-1} + b^l, \qquad a^l = \sigma(z^l) \quad (5)$$

3) Because the output differs from the actual result, compute the error produced at the output layer:

$$\delta^L = \nabla_a C \odot \sigma'(z^L) \quad (6)$$

4) Backpropagate the error obtained in the previous step from the output layer to the hidden layers:

$$\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l) \quad (7)$$

5) Train the parameters with gradient descent, iterating until convergence:

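$$w^l \rightarrow w^l - \frac{\eta}{m} \sum_x \delta^{x,l} (a^{x,l-1})^T, \qquad b^l \rightarrow b^l - \frac{\eta}{m} \sum_x \delta^{x,l} \quad (8)$$

where $\delta^{x,l}$ and $a^{x,l-1}$ are the error and activation computed for sample $x$, $m$ is the number of samples, and $\eta$ is the learning rate.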
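The BP steps above (forward propagation, the output error (6), the backpropagated error (7), and the gradient descent update) can be sketched for a small fully connected network. The sigmoid activation, the quadratic cost C = 0.5·||a^L − y||², and the nested-list data layout are assumptions for illustration; the source applies the algorithm to a ResNet, which is not reproduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, biases, x):
    """Return the weighted inputs z and the activations a of every layer;
    the first activation is the input x itself."""
    activations, zs = [x], []
    for w, b in zip(weights, biases):
        z = [sum(wi * ai for wi, ai in zip(row, activations[-1])) + bi
             for row, bi in zip(w, b)]
        zs.append(z)
        activations.append([sigmoid(v) for v in z])
    return zs, activations

def backprop(weights, biases, x, y):
    """Gradients of C = 0.5 * ||a_out - y||^2 for all weights and biases."""
    _, acts = forward(weights, biases, x)
    # Output error: (a_out - y) times sigma'(z), with sigma' = a * (1 - a).
    delta = [(a - t) * a * (1.0 - a) for a, t in zip(acts[-1], y)]
    grads_w, grads_b = [], []
    for l in range(len(weights) - 1, -1, -1):
        grads_w.insert(0, [[d * a for a in acts[l]] for d in delta])
        grads_b.insert(0, delta[:])
        if l > 0:
            # Propagate back: (w^T delta) times sigma'(z) of the layer below.
            w = weights[l]
            delta = [sum(w[k][j] * delta[k] for k in range(len(delta)))
                     * acts[l][j] * (1.0 - acts[l][j])
                     for j in range(len(acts[l]))]
    return grads_w, grads_b

def gradient_step(weights, biases, batch, eta):
    """One in-place update using gradients averaged over (x, y) pairs."""
    m = len(batch)
    grads = [backprop(weights, biases, x, y) for x, y in batch]
    for gw, gb in grads:
        for l in range(len(weights)):
            for j in range(len(weights[l])):
                biases[l][j] -= eta / m * gb[l][j]
                for k in range(len(weights[l][j])):
                    weights[l][j][k] -= eta / m * gw[l][j][k]
```

All gradients in `gradient_step` are computed before any parameter changes, so every sample in the batch sees the same weights.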

S5: Input the test set into the model; the output is the cerebrospinal fluid cell identification result.

The test set data are input into the trained classification model. After mapping through the multi-layer ResNet, the number of output-layer nodes equals the number of classes to be recognized, and each input vector activates the node of its class in the output layer. Among the class nodes, monocytes are node 0, lymphocytes are node 1, and neutrophils are node 2.
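The node-to-class mapping above can be sketched as a small lookup. Selecting the node with the largest activation is an assumed decision rule, and the function names are hypothetical; the node indices and class names come from the text.

```python
from collections import Counter

# Output node index -> cell class, as listed in the description.
CLASS_NAMES = {0: "monocyte", 1: "lymphocyte", 2: "neutrophil"}

def predict_label(outputs):
    """Return (node index, class name) of the most strongly activated node."""
    idx = max(range(len(outputs)), key=lambda i: outputs[i])
    return idx, CLASS_NAMES[idx]

def count_cells(batch_outputs):
    """Tally predicted classes over the output vectors of many cells."""
    return Counter(predict_label(o)[1] for o in batch_outputs)
```

Tallying the predictions per class gives the cell counts that the description mentions as an aid to diagnosis.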

The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any equivalent replacement or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims (7)

1. A cerebrospinal fluid cell identification and classification method for complex small samples, characterized by comprising the following steps:
S1: acquiring images of the sample slides using an automatic microscope scanning platform to obtain a complete image set of cerebrospinal fluid cell slides containing multiple cells;
S2: preprocessing the resulting image set by filtering and denoising the images, removing irrelevant elements, separating cells that adhere to one another, and dividing the resulting sample image set in batches into a training set and a test set;
S3: performing transfer training of the model for the small sample set, using a deep learning network trained in a related field to carry out transfer learning on the small-sample dataset;
S4: fine-tuning the weights and thresholds of the trained model in reverse using the BP algorithm to further optimize the model;
S5: inputting the test set into the model, the output being the cerebrospinal fluid cell identification result.

2. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 1, characterized in that in S1 the cerebrospinal fluid cell slide is placed on the motorized translation stage of the microscope; the software system is used to mark two diagonally opposite coordinate points that define the scanning range of the slide, and the size of the scanned range is recorded; the software platform stitches the acquired images into one complete picture of the cell sample; and this step is repeated for each subsequent slide.

3. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 1, characterized in that the specific steps of preprocessing the image set in S2 are:
Step 1: to deal with irrelevant impurities in the sample background, first separating the background from the image, obtaining a binary image with the maximum between-class variance method, and smoothing the contours of the targets in the binary image with a morphological opening operation, the opening operation also removing background impurities that are not targets; and finally extracting the contour edge information of the targets with the Canny edge detection algorithm;
Step 2: segmenting adherent cells using concave point detection, a concave point of adherent cells being the point of locally maximal curvature in the concave region formed where two or more near-circular targets overlap and adhere, the contour of a single near-circular target having no abrupt change in curvature, such a change appearing only where two or more cells adhere;
Step 3: ellipse fitting, wherein, to recover the contour boundary that an adherent target loses through adhesion, the algorithm uses the prior knowledge that targets are generally near-circular and fits ellipses with a least-squares method to complete the segmentation of the adherent cells.

4. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 1, characterized in that in S3 a multi-layer ResNet model already trained in another field is adopted, the part of the model preceding the fully connected layer is retained, the output part is given three output nodes according to the required classes, pre-training transfer is applied so that the parameters of the existing multi-layer ResNet serve as the initial parameters of the present invention, and the network is trained on the image data processed in S2, the specific procedure being:
1) determining the number of nodes in the first layer of the network, i.e., the number of input-layer nodes, from the dimension of the input data;
2) feeding the data into the residual network units, wherein, by the characteristic of ResNet, namely the identity mapping of the residual network, the output of each module is its current input plus the residual, and training the network layer by layer with the training data;
3) performing transfer learning with the already-trained ResNet network, using its well-trained parameters as the initial training parameters of this model, which saves part of the training time and training samples and is well suited to learning from small samples.

5. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 1, characterized in that the specific steps of optimizing the model in S4 are:
1) after training is complete, performing supervised training of the model by adding label data at the topmost layer of the ResNet, i.e., fine-tuning the relevant parameters of the network with the backpropagation (BP) algorithm;
2) inputting the labeled data of each class into the topmost layer of the ResNet and fine-tuning the weights and thresholds of the ResNet with the BP algorithm, the supervised training further reducing the training error and improving the accuracy of the transfer learning recognition model.

6. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 1, characterized in that in S5 the test set data are input into the trained classification model; after mapping through the multi-layer ResNet, the number of output-layer nodes equals the number of classes to be recognized, and each input vector activates the node of its class in the output layer.

7. The cerebrospinal fluid cell identification and classification method for complex small samples according to claim 6, characterized in that among the class nodes in S5, monocytes are node 0, lymphocytes are node 1, and neutrophils are node 2.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210094305.8A CN114494197A (en) 2022-01-26 2022-01-26 Cerebrospinal fluid cell identification and classification method for small-complexity sample

Publications (1)

Publication Number Publication Date
CN114494197A true CN114494197A (en) 2022-05-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination