CN108564114B - A method for automatic identification of human fecal leukocytes based on machine learning - Google Patents
- Publication number: CN108564114B
- Application number: CN201810262889.9A
- Authority
- CN
- China
- Prior art keywords
- area
- image
- value range
- pixels
- binary image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Image Analysis (AREA)
Description
Technical field
The invention relates to a method for automatically identifying and detecting leukocytes in human feces, based on digital image processing and, in particular, machine learning.
Background
Leukocytes in human feces are usually found in mucous stools and in stools containing pus and blood, and most of them are segmented neutrophils. Because the cells degenerate, they are often swollen and their structure is indistinct. When they are numerous, clumped together, and have incomplete or ruptured membranes, they are also called pus cells and indicate severe infection. Unlike blood and urine, feces contain large amounts of impurities such as food residue, so the image background is complex; in addition, some leukocyte membranes are no longer intact and adhere to impurities. Both factors make leukocytes difficult to segment. Because of this degeneration, leukocyte morphology also varies widely and is hard to recognize with morphology-based methods.
In 2014 the applicant filed a patent for an automatic detection method for leukocytes in bronchoalveolar lavage smears (patent No. 2014103696359). That invention converts microscope images of bronchoalveolar lavage fluid smears to grayscale and binarizes them, then screens candidates by the external shape features and internal features of leukocytes to identify them. Because it detects leukocytes by image processing alone, it is computationally expensive in practice.
Summary of the invention
The object of the invention is to address the complex background of fecal microscopic images and the varied morphology of leukocytes with an automatic identification method for leukocytes in human feces. The method is based on digital image processing and machine learning and identifies leukocytes automatically in fecal microscopic images, aiming for high detection accuracy and a low miss rate so as to meet the needs of clinical testing.
The technical solution provided by the invention is a machine-learning-based method for automatic identification of human fecal leukocytes, comprising the following steps:
Step 1: Pretreat the human feces by dilution, stirring, and sedimentation, and acquire microscopic images of the fecal sample with a biological microscope;
Step 2: Convert the image from step 1 to a grayscale image;
Step 3: Apply local inverse binarization to the grayscale image from step 2 to obtain a binary image;
Step 4: Perform hole filling on the binary image from step 3, filling regions whose area is smaller than S1, where S1 ranges from 180 to 230 pixels;
Step 5: Apply morphological dilation to the binary image from step 4 using a circular structuring element of radius R1; fill holes in the dilated binary image; then apply morphological erosion to the hole-filled binary image using a circular structuring element of radius R2; R1 ranges from 2 to 5 pixels, R2 ranges from 1 to 3 pixels, and R1 is greater than R2;
Step 6: Label the connected regions of the binary image from step 5 and compute the area and bounding rectangle of each connected region; retain the regions whose area is greater than S2, whose area is more than D1 of the bounding-rectangle area, and whose bounding-rectangle width and height are greater than H1 and less than H2; S2 ranges from 1400 to 1550 pixels, D1 from 60% to 80%, H1 from 30 to 50 pixels, and H2 from 100 to 120 pixels;
Step 7: For each region retained in step 6, crop the binary image to the bounding rectangle; for the cropped binary image, first compute the center and radius of its largest inscribed circle and retain only regions whose inscribed-circle radius is greater than R3; then keep only the points of the binary image that lie closer than L1 to the circle center; R3 ranges from 15 to 20 pixels and L1 from 20 to 30 pixels;
Step 8: For the binary image from step 7, compute the minimum enclosing circle and retain the binary images whose foreground area is at least D2 of the enclosing-circle area; D2 ranges from 72% to 90%;
Step 9: Recompute and correct the bounding-rectangle coordinates for the binary image from step 8; retain the regions whose bounding-rectangle width and height are both greater than H3 and whose width and height differ by less than H4 in absolute value; H3 ranges from 50 to 57 pixels and H4 from 12 to 18 pixels;
Step 10: For each region retained in step 9, crop the corresponding grayscale and color images using the newly computed bounding-rectangle coordinates; compute the variance, mean gray level, and sharpness of the grayscale crop; retain the regions whose variance is greater than F1, whose mean gray level is greater than G1, and whose sharpness is greater than Q1; F1 ranges from 295 to 308, G1 from 55 to 70, and Q1 from 11 to 20;
The remaining steps are divided into two processes, classifier training and actual detection;
Steps 11 to 16 are the classifier training process; steps 17 to 20 are the actual detection process;
Step 11: Compare the coordinates of the regions retained in step 10 with the coordinates of the corresponding manually boxed leukocyte regions, remove the true leukocytes, and crop the color images of the remaining regions as the impurities of the negative sample set; the positive sample set used for training consists of the manually boxed leukocytes; the parameters used in steps 4 to 10 are all chosen so that the leukocytes in every sample fall within the regions retained after step 10;
Step 12: Normalize the sample images of all sample sets, then extract Gabor features and LBP features from each of the three color channels;
Step 13: Preprocess the features extracted from the sample set in step 12;
Step 13-1: First apply PCA (Principal Component Analysis) to the feature vectors, retaining only 99% of the principal components;
Step 13-2: Apply 'L2' normalization to the features obtained in step 13-1;
Step 13-3: Standardize the features obtained in step 13-2;
Step 14: Using k-fold cross-validation, divide the sample set into N equal parts, N>3, each time using one or two parts as the training set and the rest as the test set;
Step 15: Train an SVM (Support Vector Machine) model on the training set from step 14 to obtain classification model 1;
Step 16: Train classification model 2 on the basis of classification model 1 obtained in step 15;
Step 17: Crop the color images corresponding to the regions retained in step 10, extract and combine the Gabor and LBP features, and preprocess the data by the method of step 13 to obtain the feature vectors to be tested;
Step 18: Input the feature vectors from step 17 into classifier 1 to obtain a classification result;
Step 19: For the color images classified as leukocytes in step 18, extract HOG (Histogram of Oriented Gradients) features, preprocess the data by the method of step 13, and feed the processed feature vectors into classifier 2 to obtain the final classification result;
Step 20: Mark all leukocytes detected in step 19 on the color microscopic image from step 1.
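Steps 17 to 19 form a two-stage cascade: only candidates accepted by classifier 1 are passed on to classifier 2. The following is a minimal sketch of that control flow, assuming each classifier is a callable that returns True for "leukocyte". The names `detect`, `feat1`, `clf1`, `feat2`, and `clf2` are hypothetical stand-ins for the feature extractors and trained SVMs, not part of the patent.

```python
from typing import Callable, Iterable, List, TypeVar

Region = TypeVar("Region")

def detect(regions: Iterable[Region],
           feat1: Callable[[Region], list], clf1: Callable[[list], bool],
           feat2: Callable[[Region], list], clf2: Callable[[list], bool]) -> List[Region]:
    """Return the regions accepted by both classifiers (steps 18 and 19)."""
    accepted = []
    for region in regions:
        # Stage 1: Gabor + LBP features go to classifier 1.
        if not clf1(feat1(region)):
            continue
        # Stage 2: HOG features are computed only for stage-1 positives.
        if clf2(feat2(region)):
            accepted.append(region)
    return accepted

# Toy usage with stand-in features and classifiers (not real SVMs).
regions = [3, 8, 12, 20]
result = detect(regions,
                feat1=lambda r: [r], clf1=lambda f: f[0] > 5,
                feat2=lambda r: [r], clf2=lambda f: f[0] < 15)
print(result)  # [8, 12]
```

Because the second feature extractor runs only on stage-1 positives, the more expensive second check is paid for a small fraction of the candidates.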
Further, the local binarization in step 3 processes a square local region with a side length of 11 pixels, takes the mean gray level of that region as the threshold, and scales the threshold by a factor of 0.98.
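The local inverse binarization can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the patented implementation: a pixel becomes foreground when it is darker than 0.98 times the mean of its 11*11 window, and the window is clipped at the image border. Both the comparison direction and the border handling are assumptions, since the text does not spell them out. The function name `local_inverse_binarize` is hypothetical.

```python
def local_inverse_binarize(gray, side=11, scale=0.98):
    """Inverse local-mean threshold: 1 where pixel < scale * mean of its window."""
    h, w = len(gray), len(gray[0])
    half = side // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The window is clipped at the image border (an assumption).
            y0, y1 = max(0, y - half), min(h, y + half + 1)
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            total = sum(gray[j][i] for j in range(y0, y1) for i in range(x0, x1))
            mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 1 if gray[y][x] < scale * mean else 0
    return out

# A bright 5x5 background with one dark pixel in the centre.
img = [[200] * 5 for _ in range(5)]
img[2][2] = 50
mask = local_inverse_binarize(img)
print(mask[2][2], mask[0][0])  # 1 0
```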
Further, the sharpness in step 10 is computed by the re-blur (secondary blurring) method, as follows:
Step 10-1: Compute the gradient value at every pixel of the grayscale image to obtain the original gradient map.
Step 10-2: Apply a Gaussian blur to the grayscale image, with a 9*9 kernel and a variance of 1.5.
Step 10-3: Compute the gradient value at every pixel of the image blurred in step 10-2 to obtain the blurred gradient map.
Step 10-4: Compute the mean absolute difference between the original gradient map from step 10-1 and the blurred gradient map from step 10-3; this is the sharpness value.
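Steps 10-1 to 10-4 can be sketched in plain Python as below. Several details are assumptions because the text leaves them open: the gradient is taken as the magnitude of forward differences, borders are replicated during blurring, and the value 1.5 is passed as the Gaussian standard deviation `sigma` (if it is meant literally as a variance, `sigma` would be the square root of 1.5). All function names are hypothetical.

```python
import math

def gaussian_kernel(size=9, sigma=1.5):
    half = size // 2
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur(img, size=9, sigma=1.5):
    """Separable Gaussian blur with border replication."""
    h, w = len(img), len(img[0])
    k = gaussian_kernel(size, sigma)
    half = size // 2
    clamp = lambda v, hi: min(max(v, 0), hi - 1)
    rows = [[sum(k[d + half] * img[y][clamp(x + d, w)] for d in range(-half, half + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(k[d + half] * rows[clamp(y + d, h)][x] for d in range(-half, half + 1))
             for x in range(w)] for y in range(h)]

def gradient(img):
    """Gradient magnitude from forward differences (zero at the last row/column)."""
    h, w = len(img), len(img[0])
    return [[math.hypot(img[y][min(x + 1, w - 1)] - img[y][x],
                        img[min(y + 1, h - 1)][x] - img[y][x])
             for x in range(w)] for y in range(h)]

def sharpness(gray):
    g0 = gradient(gray)          # step 10-1
    g1 = gradient(blur(gray))    # steps 10-2 and 10-3
    n = len(gray) * len(gray[0])
    # Step 10-4: mean absolute difference of the two gradient maps.
    return sum(abs(a - b) for r0, r1 in zip(g0, g1) for a, b in zip(r0, r1)) / n

edge = [[0] * 8 + [255] * 8 for _ in range(16)]   # sharp vertical edge
flat = [[128] * 16 for _ in range(16)]            # no detail at all
print(sharpness(edge) > sharpness(flat))  # True
```

The idea is that blurring changes a sharp image a lot and an already blurry image very little, so a larger difference indicates a sharper crop.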
Further, step 12 is carried out as follows:
Scale the image to 80*80, then extract Gabor features and LBP features from each of the three color channels. For the Gabor features, the scale is 15; the 8 orientations are {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}; the wavelength is π/4; the spatial aspect ratio is 1.0; the standard deviation is 1.5; the phase offset is 0; and 8× downsampling is used to reduce the feature dimension. The LBP (Local Binary Pattern) features use the uniform pattern. The Gabor and LBP features are then combined.
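The feature dimensions reported in the embodiment (2400 Gabor, 78 LBP, 2478 combined) can be checked with a little arithmetic. The sketch below assumes one Gabor response map per orientation and channel, downsampled by 8 along both axes. For LBP, the text does not state the number of sampling points; 24 neighbors with a rotation-invariant uniform histogram (neighbors + 2 bins) is one reading that is consistent with 78 and is labeled as an assumption. The function names are hypothetical.

```python
def gabor_dim(side=80, downsample=8, orientations=8, channels=3):
    """One response map per orientation and channel, downsampled in both axes."""
    per_map = (side // downsample) ** 2
    return per_map * orientations * channels

def riu2_lbp_bins(neighbors):
    """Histogram bins of a rotation-invariant uniform LBP with `neighbors` samples."""
    return neighbors + 2

gabor = gabor_dim()
lbp = riu2_lbp_bins(24) * 3   # neighbors=24 is an assumption, not stated in the text
print(gabor, lbp, gabor + lbp)  # 2400 78 2478
```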
Further, in step 15:
When training the SVM model, the RBF kernel is selected, gamma is chosen automatically, the penalty coefficient C is 2, and the weight ratio of positive to negative samples is 2:1;
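To make these settings concrete, the sketch below shows the RBF kernel and how a class-weight ratio rescales the penalty per class in a class-weighted soft-margin SVM. It assumes that "automatic" gamma means 1 divided by the number of features, as in common SVM libraries; the text does not confirm this. The function names and class labels are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def per_class_penalty(C, weights):
    """Class-weighted soft margin: class k is penalised with C * weights[k]."""
    return {label: C * w for label, w in weights.items()}

n_features = 1342                      # dimension after step 13 in the embodiment
gamma = 1.0 / n_features               # assumed meaning of "automatic" gamma
penalties = per_class_penalty(C=2, weights={"leukocyte": 2, "impurity": 1})
print(penalties)                       # {'leukocyte': 4, 'impurity': 2}
print(rbf_kernel([1.0, 0.0], [1.0, 0.0], gamma))  # 1.0
```

Weighting the positive class more heavily makes a missed leukocyte cost more than a falsely accepted impurity, which fits the stated goal of a low miss rate.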
Step 16 is carried out as follows:
Step 16-1: Use all negative samples that model 1 of step 15 falsely detected as leukocytes as the negative sample set of model 2. The positive sample set is the rescaled manually boxed leukocytes.
Step 16-2: Extract HOG (Histogram of Oriented Gradients) features from the grayscale images of the sample set. The cell size is 8*8 with 9 gradient orientations; the block size is 2*2; the stride is 8. This yields a 3600-dimensional feature vector.
Step 16-3: Apply the data preprocessing of step 13 to the sample-set features to obtain a 2320-dimensional feature vector.
Step 16-4: Train an SVM model in the same way as model 1 to obtain classification model 2, which further removes falsely detected impurities while retaining the leukocytes. The final choice is the RBF kernel with gamma chosen automatically; the penalty coefficient C is 1; the weight ratio of positive to negative samples is 6:1; all other parameters are defaults.
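The HOG descriptor length follows from the window size and the parameters in step 16-2. The sketch below uses the standard sliding-block count. The text does not state the rescaled sample size used for model 2, so the window sizes tried here are assumptions: with these parameters an 80*80 window gives 2916 dimensions, while the reported 3600 dimensions correspond to 10*10 block positions, for example an 88*88 window. The function name `hog_dim` is hypothetical.

```python
def hog_dim(width, height, cell=8, block_cells=2, stride=8, bins=9):
    """Length of a HOG descriptor for a width x height window."""
    block_px = cell * block_cells
    blocks_x = (width - block_px) // stride + 1
    blocks_y = (height - block_px) // stride + 1
    return blocks_x * blocks_y * block_cells ** 2 * bins

print(hog_dim(80, 80))  # 2916
print(hog_dim(88, 88))  # 3600
```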
The invention extracts features from the sample images, trains SVM classifiers on the extracted features, and then applies the trained classifiers to identify leukocytes. The method has the advantages of accurate identification, high speed, high efficiency, and low computational cost.
Description of drawings
FIG. 1 is the detection flow chart of the automatic identification method for leukocytes in human feces according to the invention.
FIG. 2 is the flow chart for training the corresponding classification models.
FIG. 3 shows the final automatic identification result.
Detailed description
A detection method for leukocytes in human feces comprises the following steps:
Step 1: Pretreat the human feces by dilution, stirring, sedimentation, and similar operations, and acquire microscopic images of the fecal sample with a biological microscope.
Step 2: Convert the image from step 1 to a grayscale image.
Step 3: Apply local inverse binarization to the grayscale image from step 2 to obtain a binary image. The local region is a square with a radius of 11; the threshold is the mean gray level, scaled by a factor of 0.98.
Step 4: Perform hole filling on the binary image from step 3, then remove regions whose area is smaller than 200.
Step 5: Apply morphological dilation to the binary image from step 4 using a circular structuring element of radius 4. Fill holes in the dilated binary image. Then apply morphological erosion to the hole-filled binary image using a circular structuring element of radius 2.
Step 6: Label the connected regions of the binary image from step 5 and compute the area and bounding rectangle of each connected region. Retain the regions whose area is greater than 1500, whose area is more than 70% of the bounding-rectangle area, and whose bounding-rectangle width and height are greater than 35 and less than 110.
Step 7: For each region retained in step 6, crop the binary image to the bounding rectangle. For the cropped binary image, first compute the center and radius of its largest inscribed circle and retain only regions whose inscribed-circle radius is greater than 18. Then keep only the points of the binary image that lie closer than 24 to the circle center.
Step 8: For the binary image from step 7, compute the minimum enclosing circle and retain the binary images whose foreground area is at least 84% of the enclosing-circle area.
Step 9: Recompute and correct the bounding-rectangle coordinates for the binary image from step 8. Retain the regions whose bounding-rectangle width and height are both greater than 53 and whose width and height differ by less than 15 in absolute value.
Step 10: For each region retained in step 9, crop the corresponding grayscale and color images using the newly computed bounding-rectangle coordinates. Compute the variance, mean gray level, and sharpness of the grayscale crop. Retain the regions whose variance is greater than 300, whose mean gray level is greater than 60, and whose sharpness is greater than 15.
The remaining steps are divided into two processes, classifier training and actual detection. Steps 11 to 16 are the classifier training process. Steps 17 to 20 are the actual detection process.
Step 11: Compare the coordinates of the regions retained in step 10 with the coordinates of the corresponding manually boxed leukocyte regions, remove the true leukocytes, and crop the color images of the remaining regions as the impurities of the negative sample set. The positive sample set used for training consists of the manually boxed leukocytes. The parameters used in steps 4 to 10 are all chosen so that the leukocytes in every sample fall within the regions retained after step 10.
Step 12: For all sample sets, scale the images to 80*80, then extract Gabor features and LBP features from each of the three color channels. For the Gabor features, the scale is 15; the 8 orientations are {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}; the wavelength is π/4; the spatial aspect ratio is 1.0; the standard deviation is 1.5; the phase offset is 0. 8× downsampling is used to reduce the feature dimension, giving 2400 Gabor dimensions in total. The LBP (Local Binary Pattern) features use the uniform pattern and contribute 78 dimensions. Combining the Gabor and LBP features yields a 2478-dimensional feature vector.
Step 13: Preprocess the features extracted from the sample set in step 12 to obtain a 1342-dimensional feature vector.
Step 14: Using k-fold cross-validation, divide the sample set into 5 equal parts, each time using 1 part as the training set and the remaining 4 parts as the test set.
Step 15: Train an SVM (Support Vector Machine) model on the training set from step 14 to obtain classification model 1. Based on the cross-validation results, the final choice is the RBF kernel with gamma chosen automatically; the penalty coefficient C is 2; the weight ratio of positive to negative samples is 2:1; all other parameters are defaults.
Step 16: Train classification model 2 on the basis of classification model 1 obtained in step 15.
Step 17: Crop the color images corresponding to the regions retained in step 10, extract and combine the Gabor and LBP features, and preprocess the data by the method of step 13 to obtain the feature vectors to be tested.
Step 18: Input the feature vectors from step 17 into classifier 1 to obtain a classification result.
Step 19: For the color images classified as leukocytes in step 18, extract HOG (Histogram of Oriented Gradients) features, preprocess the data by the method of step 13, and feed the processed feature vectors into classifier 2 to obtain the final classification result.
Step 20: Mark all leukocytes detected in step 19 on the color microscopic image from step 1.
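The geometric screens in steps 6 and 9 reduce to a few comparisons on each region's area and bounding rectangle. The sketch below uses the embodiment's thresholds and assumes the region statistics have already been measured. The `Region` class and both function names are hypothetical, and for brevity the same statistics are fed to both screens, whereas in the method step 9 operates on the rectangle recomputed after steps 7 and 8.

```python
from dataclasses import dataclass

@dataclass
class Region:
    area: int      # foreground pixels in the connected region
    width: int     # bounding-rectangle width in pixels
    height: int    # bounding-rectangle height in pixels

def passes_step6(r, min_area=1500, min_fill=0.70, min_side=35, max_side=110):
    fill = r.area / (r.width * r.height)   # share of the rectangle that is foreground
    return (r.area > min_area and fill > min_fill
            and min_side < r.width < max_side
            and min_side < r.height < max_side)

def passes_step9(r, min_side=53, max_diff=15):
    return (r.width > min_side and r.height > min_side
            and abs(r.width - r.height) < max_diff)

round_cell = Region(area=2900, width=60, height=62)
long_debris = Region(area=2900, width=100, height=40)
print(passes_step6(round_cell), passes_step9(round_cell))    # True True
print(passes_step6(long_debris), passes_step9(long_debris))  # True False
```

The second screen is stricter about squareness, which is why an elongated fragment can survive step 6 and still be rejected in step 9.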
The sharpness in step 10 is computed by the re-blur (secondary blurring) method, as follows:
Step 10-1: Compute the gradient value at every pixel of the grayscale image to obtain the original gradient map.
Step 10-2: Apply a Gaussian blur to the grayscale image, with a 9*9 kernel and a variance of 1.5.
Step 10-3: Compute the gradient value at every pixel of the image blurred in step 10-2 to obtain the blurred gradient map.
Step 10-4: Compute the mean absolute difference between the original gradient map from step 10-1 and the blurred gradient map from step 10-3; this is the sharpness value.
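Step 10 combines this sharpness value with the variance and mean gray level of the crop. A minimal sketch of the combined filter with the embodiment's thresholds follows. It assumes the sharpness has been computed separately and uses the population variance, which the text does not specify; the function name `keep_region` is hypothetical.

```python
from statistics import fmean, pvariance

def keep_region(gray_patch, sharpness, min_var=300, min_mean=60, min_sharp=15):
    """Step 10 filter with the embodiment's thresholds."""
    pixels = [p for row in gray_patch for p in row]
    return (pvariance(pixels) > min_var      # enough contrast inside the crop
            and fmean(pixels) > min_mean     # not too dark
            and sharpness > min_sharp)       # in focus

textured = [[40, 120], [120, 40]]      # mean 80, variance 1600
uniform = [[90, 90], [90, 90]]         # mean 90, variance 0
print(keep_region(textured, sharpness=20))  # True
print(keep_region(uniform, sharpness=20))   # False
```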
Step 13 is carried out as follows:
Step 13-1: First apply PCA (Principal Component Analysis) to the 2478-dimensional feature vectors, retaining only 99% of the principal components, to obtain 1342-dimensional feature vectors.
Step 13-2: Apply 'L2' normalization to the features obtained in step 13-1.
Step 13-3: Standardize the features obtained in step 13-2.
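The three preprocessing stages can be sketched as below. The sketch assumes that "retaining 99%" means keeping the leading components that explain 99% of the variance, which is one common reading but is not confirmed by the text; it takes the PCA eigenvalues as given rather than computing them. L2 normalization is applied per sample and standardization per feature column. All function names are hypothetical.

```python
import math

def components_for_variance(eigenvalues, keep=0.99):
    """Smallest number of leading components whose variance share reaches `keep`."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= keep:
            return count
    return len(eigenvalues)

def l2_normalize(vec):
    """Scale one sample to unit Euclidean length (step 13-2)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def standardize(samples):
    """Zero mean and unit variance per feature column (step 13-3)."""
    cols = list(zip(*samples))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in samples]

print(components_for_variance([90.0, 9.5, 0.4, 0.1]))  # 2
print(l2_normalize([3.0, 4.0]))                         # [0.6, 0.8]
print(standardize([[1.0, 5.0], [3.0, 5.0]]))            # [[-1.0, 0.0], [1.0, 0.0]]
```

In a deployed pipeline the projection, means, and standard deviations would presumably be fitted on the training set and reused unchanged during detection, since step 17 applies "the method of step 13" to new crops.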
Step 16 is carried out as follows:
Step 16-1: Use all negative samples that model 1 of step 15 falsely detected as leukocytes as the negative sample set of model 2. The positive sample set is the rescaled manually boxed leukocytes.
Step 16-2: Extract HOG (Histogram of Oriented Gradients) features from the grayscale images of the sample set. The cell size is 8*8 with 9 gradient orientations; the block size is 2*2; the stride is 8. This yields a 3600-dimensional feature vector.
Step 16-3: Apply the data preprocessing of step 13 to the sample-set features to obtain a 2320-dimensional feature vector.
Step 16-4: Train an SVM model in the same way as model 1 to obtain classification model 2, which further removes falsely detected impurities while retaining the leukocytes. The final choice is the RBF kernel with gamma chosen automatically; the penalty coefficient C is 1; the weight ratio of positive to negative samples is 6:1; all other parameters are defaults.
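Step 16-1 is a form of hard-negative mining: model 2 is trained on exactly the impurities that model 1 gets wrong. A minimal sketch of the selection follows, assuming model 1 is available as a callable that returns True for "leukocyte". The names `mine_hard_negatives` and `clf1`, and the scores used as stand-in samples, are hypothetical.

```python
def mine_hard_negatives(negatives, clf1):
    """Negatives that model 1 wrongly labels as leukocytes (its false positives)."""
    return [sample for sample in negatives if clf1(sample)]

# Stand-in for trained model 1: calls anything with score > 0.5 a leukocyte.
clf1 = lambda score: score > 0.5
impurities = [0.1, 0.7, 0.3, 0.9]   # hypothetical scores of known impurities
hard = mine_hard_negatives(impurities, clf1)
print(hard)  # [0.7, 0.9]
```

Concentrating the second model on these difficult cases lets it use a different feature (HOG) and a stronger positive weight (6:1) without having to separate the easy impurities again.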
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810262889.9A CN108564114B (en) | 2018-03-28 | 2018-03-28 | A method for automatic identification of human fecal leukocytes based on machine learning |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810262889.9A CN108564114B (en) | 2018-03-28 | 2018-03-28 | A method for automatic identification of human fecal leukocytes based on machine learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN108564114A (en) | 2018-09-21 |
| CN108564114B (en) | 2022-05-27 |
Family
ID=63533547
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810262889.9A Expired - Fee Related CN108564114B (en) | 2018-03-28 | 2018-03-28 | A method for automatic identification of human fecal leukocytes based on machine learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN108564114B (en) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109447119A (en) * | 2018-09-26 | 2019-03-08 | 电子科技大学 | Cast recognition methods in the arena with SVM is cut in a kind of combining form credit |
| CN109919054B (en) * | 2019-02-25 | 2023-04-07 | 电子科技大学 | Machine vision-based reagent card automatic classification detection method |
| CN109975196B (en) * | 2019-03-01 | 2021-10-08 | 深圳大学 | A kind of reticulocyte detection method and system thereof |
| CN110070024B (en) * | 2019-04-16 | 2020-05-05 | 温州医科大学 | A method, system and mobile phone for skin pressure injury thermal imaging image recognition |
| CN110070138B (en) * | 2019-04-26 | 2021-09-21 | 河南萱闱堂医疗信息科技有限公司 | Method for automatically scoring excrement picture before endoscope detection of colon |
| CN110415212A (en) * | 2019-06-18 | 2019-11-05 | 平安科技(深圳)有限公司 | Abnormal cell detection method, device and computer readable storage medium |
| CN112540039A (en) * | 2020-12-31 | 2021-03-23 | 北京博奥体质宝健康科技有限公司 | Method for directly calculating number of adherent living cells |
| CN113989628B (en) * | 2021-10-27 | 2022-08-26 | 哈尔滨工程大学 | Underwater signal lamp positioning method based on weak direction gradient |
Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103154732A (en) * | 2010-08-05 | 2013-06-12 | 艾博特健康公司 | Method and device for automatic analysis of whole blood samples by microscopic images |
| CN103679184A (en) * | 2013-12-06 | 2014-03-26 | 河海大学 | Method for leukocyte automatic identification based on relevant vector machine |
| CN104156951A (en) * | 2014-07-30 | 2014-11-19 | 电子科技大学 | Leukocyte detecting method aiming at bronchoalveolar lavage smear |
| CN104198355A (en) * | 2014-07-16 | 2014-12-10 | 电子科技大学 | Automatic detection method for red cells in feces |
| CN105404887A (en) * | 2015-07-05 | 2016-03-16 | 中国计量学院 | White blood count five-classification method based on random forest |
| WO2016086023A1 (en) * | 2014-11-24 | 2016-06-02 | Massachusetts Institute Of Technology | Systems, apparatus, and methods for analyzing blood cell dynamics |
| CN105654107A (en) * | 2015-09-21 | 2016-06-08 | 长春迪瑞医疗科技股份有限公司 | Visible component classification method based on SVM |
| CN106295588A (en) * | 2016-08-17 | 2017-01-04 | 电子科技大学 | The automatic identifying method of leukocyte in a kind of leucorrhea micro-image |
| CN106682633A (en) * | 2016-12-30 | 2017-05-17 | 四川沃文特生物技术有限公司 | Method for classifying and identifying visible components of microscopic excrement examination images based on machine vision |
| CN106897682A (en) * | 2017-02-15 | 2017-06-27 | 电子科技大学 | Leucocyte automatic identifying method in a kind of leukorrhea based on convolutional neural networks |
| CN106980839A (en) * | 2017-03-31 | 2017-07-25 | 宁波摩视光电科技有限公司 | A kind of method of automatic detection bacillus in leukorrhea based on HOG features |
| WO2017145172A1 (en) * | 2016-02-23 | 2017-08-31 | Sigtuple Technologies Private Limited | System and method for extraction and analysis of samples under a microscope |
| CN107408197A (en) * | 2015-03-11 | 2017-11-28 | 西门子公司 | Systems and methods for deconvolutional network-based classification of cellular images and videos |
-
2018
- 2018-03-28 CN CN201810262889.9A patent/CN108564114B/en not_active Expired - Fee Related
Patent Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103154732A (en) * | 2010-08-05 | 2013-06-12 | 艾博特健康公司 | Method and device for automatic analysis of whole blood samples by microscopic images |
| CN103679184A (en) * | 2013-12-06 | 2014-03-26 | 河海大学 | Method for leukocyte automatic identification based on relevant vector machine |
| CN104198355A (en) * | 2014-07-16 | 2014-12-10 | 电子科技大学 | Automatic detection method for red cells in feces |
| CN104156951A (en) * | 2014-07-30 | 2014-11-19 | 电子科技大学 | Leukocyte detecting method aiming at bronchoalveolar lavage smear |
| WO2016086023A1 (en) * | 2014-11-24 | 2016-06-02 | Massachusetts Institute Of Technology | Systems, apparatus, and methods for analyzing blood cell dynamics |
| CN107408197A (en) * | 2015-03-11 | 2017-11-28 | 西门子公司 | Systems and methods for deconvolutional network-based classification of cellular images and videos |
| CN105404887A (en) * | 2015-07-05 | 2016-03-16 | 中国计量学院 | Five-class white blood cell classification method based on random forest |
| CN105654107A (en) * | 2015-09-21 | 2016-06-08 | 长春迪瑞医疗科技股份有限公司 | Visible component classification method based on SVM |
| WO2017145172A1 (en) * | 2016-02-23 | 2017-08-31 | Sigtuple Technologies Private Limited | System and method for extraction and analysis of samples under a microscope |
| CN106295588A (en) * | 2016-08-17 | 2017-01-04 | 电子科技大学 | A method for automatic identification of leukocytes in leukorrhea microscopic images |
| CN106682633A (en) * | 2016-12-30 | 2017-05-17 | 四川沃文特生物技术有限公司 | Method for classifying and identifying visible components of microscopic excrement examination images based on machine vision |
| CN106897682A (en) * | 2017-02-15 | 2017-06-27 | 电子科技大学 | A method for automatic identification of leukocytes in leukorrhea based on convolutional neural networks |
| CN106980839A (en) * | 2017-03-31 | 2017-07-25 | 宁波摩视光电科技有限公司 | A method for automatic detection of bacilli in leukorrhea based on HOG features |
Non-Patent Citations (4)
| Title |
|---|
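| Acute leukemia classification by using SVM and K-Means clustering; Jakkrich Laosai et al.; "2014 International Electrical Engineering Congress (iEECON)"; 20141016; pp. 1-4 * |
| Analyzing Microscopic Images of Peripheral Blood Smear Using Deep Learning; Dheeraj Mundhra et al.; "DLMIA 2017, ML-CDS 2017: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support"; 20170909; pp. 178-185 * |
| Research on a leukocyte image recognition method based on multi-space mixed-attribute fusion; 郝连旺 (Hao Lianwang); "China Doctoral and Master's Dissertations Full-text Database (Doctoral), Medicine and Health Sciences"; 20150115 (No. 01, 2015); p. E059-26 * |
| Research on feature extraction and classification algorithms for leukocyte images; 张敏淑 (Zhang Minshu); "China Doctoral and Master's Dissertations Full-text Database (Master's), Medicine and Health Sciences"; 20170415 (No. 04, 2017); p. E060-37 * |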
Also Published As
| Publication number | Publication date |
|---|---|
| CN108564114A (en) | 2018-09-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN108564114B (en) | A method for automatic identification of human fecal leukocytes based on machine learning | |
| CN113435460B (en) | A recognition method for bright crystal granular limestone images | |
| CN111462076B (en) | Full-slice digital pathological image fuzzy region detection method and system | |
| CN107256558B (en) | Unsupervised automatic cervical cell image segmentation method and system | |
| CN106295588B (en) | A method for automatic identification of leukocytes in leukorrhea microscopic images | |
| CN105261017B (en) | Method for extracting pedestrian regions of interest by image segmentation based on road surface constraints | |
| CN111598856A (en) | Method and system for automatic detection of chip surface defects based on defect-oriented multi-point localization neural network | |
| CN112215800B (en) | Overlapping Chromosome Identification and Segmentation Method Based on Machine Learning | |
| CN108197606A (en) | A method for recognizing abnormal cells in pathological sections based on multi-scale dilated convolution | |
| CN108446729A (en) | Egg embryo classification method based on convolutional neural networks | |
| CN106022231A (en) | Multi-feature-fusion-based technical method for rapid detection of pedestrians | |
| CN110110667B (en) | Processing method and system of diatom image and related components | |
| CN110135271A (en) | A cell sorting method and device | |
| CN108090906A (en) | A cervical image processing method and device based on region proposal | |
| CN109544564A (en) | A medical image segmentation method | |
| CN116188786B (en) | Image segmentation system for hepatic duct and biliary tract calculus | |
| CN102073872B (en) | Image-based method for identifying shape of parasite egg | |
| CN109190590A (en) | An arena crystallization recognition method, device, computer equipment and storage medium | |
| CN112017165A (en) | Lacrimal river height detection method based on deep learning | |
| CN102636656B (en) | Calibration method of full-automatic urine visible component analyser | |
| CN107657220A (en) | An automatic detection method for mold in leukorrhea based on HOG features and SVM | |
| Plissiti et al. | Automated segmentation of cell nuclei in PAP smear images | |
| CN114494318A (en) | A method of corneal contour extraction based on Otsu algorithm based on corneal dynamic deformation video | |
| CN116433978A (en) | Automatic generation and automatic labeling method and device for high-quality flaw image | |
| CN104966064A (en) | Pedestrian ahead distance measurement method based on visual sense |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220527 |