WO2020042704A1 - Chromosome recognition method based on deep learning - Google Patents

Chromosome recognition method based on deep learning

Info

Publication number
WO2020042704A1
Authority
WO
WIPO (PCT)
Prior art keywords
chromosome
deep learning
image
network
model
Prior art date
Application number
PCT/CN2019/090230
Other languages
English (en)
French (fr)
Inventor
宋宁
吴朝玉
秦玉磊
周磊
杨杰
Original Assignee
杭州德适生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州德适生物科技有限公司
Priority to US17/272,254 (patent US11436493B2)
Publication of WO2020042704A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30024 Cell structures in vitro; Tissue sections in vitro

Definitions

  • The invention relates to a chromosome recognition method based on deep learning, and belongs to the technical field of chromosome recognition.
  • Human chromosomal disease is a syndrome comprising a series of clinical symptoms caused by congenital abnormalities in chromosome number or structure. Affected children mainly show intellectual disability, developmental delay, and congenital malformations. It can also cause miscarriage and stillbirth. These outcomes are hard for any family to bear. The prevalence of this condition among pregnancies in China is about 5%-10%, and it accounts for more than half of miscarried embryos. These figures are rising year by year, and the Chinese government and related institutions have begun to pay attention to chromosomal diseases.
  • The clinical method for examining human chromosomal disease is to culture somatic cells, obtain a stained, banded karyotype sample through a series of operations, photograph it digitally, and then analyze and identify the chromosome image.
  • At present, chromosome images are analyzed and identified essentially by hand. Laboratory physicians need extensive training before they can recognize each chromosome type, and the workload is heavy. Even for an experienced physician, analyzing and identifying a patient's chromosomes usually takes more than two weeks. Manual recognition is also highly subjective and easily affected by the external environment, so its accuracy is limited.
  • The object of the present invention is to provide a deep learning-based chromosome recognition method that identifies chromosome types automatically, accurately, and efficiently, improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. The method also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.
  • A chromosome recognition method based on deep learning includes the following steps:
  • the first step is to obtain independent chromosome images
  • the second step is to calculate the manual features of the chromosomes
  • the third step is to perform basic image processing on the chromosomes
  • the fourth step is to establish a deep learning model
  • the fifth step is to predict the type of the chromosome based on the deep learning model.
  • The invention uses deep learning to identify chromosome types automatically, accurately, and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.
  • the second step includes the following steps:
  • the third step includes the following steps:
  • The chromosome image is scaled to bs pixels along its longest axis, and the other axis is scaled proportionally. Image sizes differ between chromosomes.
  • Because the algorithm and framework require a consistent input size, the present invention processes all images to a uniform size; the rule is to scale according to the longer axis of the image.
  • the fourth step includes the following steps:
  • the backbone network model is based on the ResNet residual network structure
  • The classifier of the model is an MLP (multilayer perceptron) network. The point of this choice is that it allows an end-to-end network to be built, without separately training an SVM classifier on the extracted features. The model uses two MLP classifiers, one for chromosome type recognition and one for polarity recognition. The weight dimensions of the type classifier are (ms + ns) * 24; those of the polarity classifier are (ms + ns) * ms and ms * 2. The type classifier outputs the predicted probabilities of the 24 chromosome categories, and the polarity classifier outputs the predicted probabilities of the two polarities, that is, long arm down or long arm up. Here ms comes from global pooling of the last ms features extracted by the residual network, and ns comes from the additional manually extracted features;
  • The model combines deep learning features with hand-designed features: classification considers the CNN result together with the relative skeleton length of the chromosome, the area ratio relative to the circumscribed rectangle, the area ratio relative to the convex hull, and the eccentricity.
  • This construction benefits from the data dividend of deep learning on large-scale data sets while giving the features considered by the algorithm a degree of interpretability, which previous literature and methods had not considered.
  • exp(x) is short for exponential, that is, the exponential function e^x;
  • t is the true gold-standard label.
  • For category classification, its value is between 0 and 23, representing chromosome 1 through chromosome Y;
  • For polarity classification, its value is 0 or 1, representing long arm up and long arm down, respectively;
  • the fifth step includes the following steps:
  • For chromosomes whose category confidence is low, the category is predicted directly by a lookup based on relative length: from the ratio of the chromosome's length to that of the longest chromosome, the closest chromosome category is obtained from a lookup table. The relative length table used for the lookup is computed from the standard chromosome map.
  • it also includes a sixth step of establishing an evaluation system for chromosome recognition results
  • The evaluation indicators are: accuracy, sensitivity and specificity, precision and recall, and the F1 index. Assuming there are only two classes, denoted positive and negative, the counts are:
  • TP: the number of instances correctly classified as positive, that is, instances that are actually positive and are classified as positive by the deep learning model
  • FP: the number of instances incorrectly classified as positive, that is, instances that are actually negative but are classified as positive by the deep learning model
  • FN: the number of instances incorrectly classified as negative, that is, instances that are actually positive but are classified as negative by the deep learning model
  • TN: the number of instances correctly classified as negative, that is, instances that are actually negative and are classified as negative by the deep learning model
  • These five evaluation indicators range from 0 to 1; the higher the score, the better the classification.
  • Sensitivity and recall have the same definition. Sensitivity is reported as a pair with specificity, and precision is reported as a pair with recall, but the formulas for sensitivity and recall are identical. A sound evaluation system makes it possible to track the recognition performance of the invention and to improve it in a timely manner.
  • bs is a number having 32 and 64 as factors, and its value is 256. Since a chromosome image may be up to 310 pixels long, and 256 is the number nearest to 310 that has 32 and 64 as factors, choosing 256 pixels satisfies the image size requirement and also helps the final feature-map size after pooling conform to deep learning rules of thumb, which simplifies data processing and accuracy control.
  • The rotation angle is kept within plus or minus 30 degrees. Flipping includes horizontal and vertical flipping: horizontal flipping increases sample diversity, while vertical flipping changes the polarity label.
  • The rotation angle should not be too large, because the polarity must be determined. If the image is rotated too far, the direction of the long arm changes, and with it the polarity.
  • Keeping the rotation angle within plus or minus 30 degrees provides sample diversity without changing the polarity.
  • The normalization step first computes the mean and standard deviation of each chromosome image, and then obtains the normalized image as Image_new = (Image_old - μ) / σ, where:
  • μ is the mean value of the image
  • σ is the standard deviation of the image
  • Image_old is the original image
  • Image_new is the normalized image
  • After this step, every image theoretically has zero mean and a standard deviation of 1.
  • the purpose of this step is to make the input of the network as standard and consistent as possible, making it easier for network training to converge.
  • The residual network is built from BasicBlock residual blocks, using 4 groups of BasicBlocks containing 3, 6, 27, and 3 blocks respectively. The purpose of the residual block is to train the CNN (convolutional neural network) by fitting the residual of the predicted output features, so that high-dimensional features are extracted progressively for the final classification.
  • hs = 80. Experiments show that 80 layers is a good choice. More layers do not significantly improve accuracy; on the contrary, with a limited number of samples a deeper network cannot be trained adequately, and it occupies more GPU memory, which hinders wide adoption. A shallower network reduces accuracy: with too few layers, the network fits the sample categories poorly and adapts poorly to sample diversity.
  • ms is preferably 256.
  • The residual network extracts a final set of 256 features, that is, 256 neurons, which meets the accuracy requirements of the present invention while keeping processing fast and resource usage low.
  • the present invention has the following beneficial effects:
  • The invention uses deep learning to identify chromosome types automatically, accurately, and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.
  • FIG. 1 shows an image after padding with white pixels
  • FIG. 2 shows an original chromosome image
  • FIG. 3 shows the chromosome image of FIG. 2 after normalization
  • FIG. 4 shows the chromosome image of FIG. 3 after random rotation
  • FIG. 5 shows the chromosome image of FIG. 3 after random flipping.
  • A chromosome recognition method based on deep learning includes the following steps:
  • the first step is to obtain independent chromosome images
  • the second step is to calculate the manual characteristics of the chromosome, which includes the following steps:
  • the third step is to perform basic image processing on the chromosome, which includes the following steps:
  • The chromosome image is scaled to bs pixels along its longest axis, and the other axis is scaled proportionally. Image sizes differ between chromosomes.
  • Because the algorithm and framework require a consistent input size, the present invention processes all images to a uniform size; the rule is to scale according to the longer axis of the image.
  • bs is a number having 32 and 64 as factors, and its value is 256. Since a chromosome image may be up to 310 pixels long, and 256 is the number nearest to 310 that has 32 and 64 as factors, choosing 256 pixels satisfies the image size requirement and also helps the final feature-map size after pooling conform to deep learning rules of thumb, which simplifies data processing and accuracy control.
  • The rotation angle is kept within plus or minus 30 degrees. Flipping includes horizontal and vertical flipping: horizontal flipping increases sample diversity (see FIG. 4),
  • while vertical flipping changes the polarity label (see FIG. 5).
  • The rotation angle should not be too large, because the polarity must be determined. If the image is rotated too far, the direction of the long arm changes, and with it the polarity.
  • Keeping the rotation angle within plus or minus 30 degrees provides sample diversity without changing the polarity.
  • The normalization step first computes the mean and standard deviation of each chromosome image, and then obtains the normalized image as Image_new = (Image_old - μ) / σ, where:
  • μ is the mean value of the image
  • σ is the standard deviation of the image
  • Image_old is the original image
  • Image_new is the normalized image
  • After this step, every image theoretically has zero mean and a standard deviation of 1; see FIGS. 2-3.
  • the purpose of this step is to make the input of the network as standard and consistent as possible, making it easier for network training to converge.
  • the fourth step is to establish a deep learning model, which includes the following steps:
  • the backbone network model is based on the ResNet residual network structure
  • S1: the residual network is built from BasicBlock residual blocks, using 4 groups of BasicBlocks containing 3, 6, 27, and 3 blocks respectively
  • The purpose of the residual block is to train the CNN (convolutional neural network) by fitting the residual of the predicted output features, so that high-dimensional features are extracted progressively for the final classification.
  • The classifier of the model is an MLP (multilayer perceptron) network. The point of this choice is that it allows an end-to-end network to be built, without separately training an SVM classifier on the extracted features. The model uses two MLP classifiers, one for chromosome type recognition and one for polarity recognition. The weight dimensions of the type classifier are (ms + ns) * 24; those of the polarity classifier are (ms + ns) * ms and ms * 2. The type classifier outputs the predicted probabilities of the 24 chromosome categories, and the polarity classifier outputs the predicted probabilities of the two polarities, that is, long arm down or long arm up. Here ms comes from global pooling of the last ms features extracted by the residual network, and ns comes from the additional manually extracted features.
  • exp(x) is short for exponential, that is, the exponential function e^x;
  • t is the true gold-standard label.
  • For category classification, its value is between 0 and 23, representing chromosome 1 through chromosome Y;
  • For polarity classification, its value is 0 or 1, representing long arm up and long arm down, respectively;
  • the fifth step is to predict the type of the chromosome based on the deep learning model, which includes the following steps:
  • For chromosomes whose category confidence is low, the category is predicted directly by a lookup based on relative length: from the ratio of the chromosome's length to that of the longest chromosome, the closest chromosome category is obtained from a lookup table.
  • The relative length table used for the lookup is computed from the standard chromosome map. Length-based prediction can be understood as a corrective prediction method. The relative proportions are given in a table in the original publication.
  • the sixth step is to establish an evaluation system for the results of chromosome recognition.
  • The evaluation indicators are: accuracy, sensitivity and specificity, precision and recall, and the F1 index. Assuming there are only two classes, denoted positive and negative, the counts are:
  • TP: the number of instances correctly classified as positive, that is, instances that are actually positive and are classified as positive by the deep learning model
  • FP: the number of instances incorrectly classified as positive, that is, instances that are actually negative but are classified as positive by the deep learning model
  • FN: the number of instances incorrectly classified as negative, that is, instances that are actually positive but are classified as negative by the deep learning model
  • TN: the number of instances correctly classified as negative, that is, instances that are actually negative and are classified as negative by the deep learning model.
  • These five evaluation indicators range from 0 to 1; the higher the score, the better the classification.
  • Sensitivity and recall have the same definition. Sensitivity is reported as a pair with specificity, and precision is reported as a pair with recall, but the formulas for sensitivity and recall are identical. A sound evaluation system makes it possible to track the recognition performance of the invention and to improve it in a timely manner.
  • To verify the recognition performance, the inventors collected, organized, and labeled 80254 metaphase chromosome images, comprising 77878 normal samples and 2376 abnormal samples.
  • The present invention was developed on this data set and can recognize category and polarity for both normal and abnormal samples, so it generalizes well.
  • The accuracy results are based on the test sample set, and the validation method is 10-fold cross-validation. According to the cross-validation results, the performance achieved on the test sample set is: for category prediction, accuracy 0.9803, sensitivity 0.9766, specificity 0.9991, precision 0.9796, recall 0.9766, F1 score 0.9779; for polarity prediction, accuracy 0.9897, sensitivity 0.9895, specificity 0.9895, precision 0.9895, recall 0.9895, F1 score 0.9895.
  • These experiments show that the present invention uses deep learning to identify chromosome types automatically, accurately, and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy.
  • It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Genetics & Genomics (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

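The invention discloses a chromosome recognition method based on deep learning, belonging to the technical field of chromosome recognition. At present, chromosome analysis is essentially manual: laboratory physicians need extensive training before they can recognize each chromosome type, and the workload is heavy. Even for an experienced physician, analyzing and identifying a patient's chromosomes generally takes more than two weeks. Manual recognition is also highly subjective and easily affected by the external environment, so its accuracy is limited. The invention uses deep learning to recognize chromosome types accurately and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, and can be applied on a large scale.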

Description

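Chromosome recognition method based on deep learning
Technical Field
The present invention relates to a chromosome recognition method based on deep learning, and belongs to the technical field of chromosome recognition.
Background
Human chromosomal disease is a syndrome comprising a series of clinical symptoms caused by congenital abnormalities in chromosome number or structure. Affected children mainly show intellectual disability, developmental delay, and congenital malformations. It can also cause miscarriage and stillbirth. These outcomes are hard for any family to bear. The prevalence of this condition among pregnancies in China is about 5%-10%, and it accounts for more than half of miscarried embryos. These figures are rising year by year, and the Chinese government and related institutions have begun to pay attention to chromosomal diseases.
The clinical method for examining human chromosomal disease is to culture somatic cells, obtain a stained, banded karyotype sample through a series of operations, photograph it digitally, and then analyze and identify the chromosome image. At present, chromosome images are analyzed and identified essentially by hand. Laboratory physicians need extensive training before they can recognize each chromosome type, and the workload is heavy. Even for an experienced physician, analyzing and identifying a patient's chromosomes generally takes more than two weeks. Manual recognition is also highly subjective and easily affected by the external environment, so its accuracy is limited.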
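Summary of the Invention
In view of the shortcomings of the prior art, the object of the present invention is to provide a deep learning-based chromosome recognition method that identifies chromosome types automatically, accurately, and efficiently, improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy, while reducing physicians' workload, remaining free of external interference, following a concise and reasonable procedure, being applicable on a large scale, and being simple to deploy.
To achieve the above object, the technical solution of the present invention is:
A chromosome recognition method based on deep learning, comprising the following steps:
Step 1: obtain independent chromosome images;
Step 2: compute the handcrafted features of the chromosomes;
Step 3: apply basic image processing to the chromosomes;
Step 4: build a deep learning model;
Step 5: predict the chromosome type with the deep learning model.
The invention uses deep learning to identify chromosome types automatically, accurately, and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.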
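As a preferred technical measure,
the second step comprises the following steps:
a) extract the skeleton of the chromosome using morphological operations and a skeleton-extraction algorithm, and compute its length;
b) divide this chromosome length by the length of the longest chromosome in the same cell to obtain the relative length;
c) from the single-chromosome image, compute: the ratio of its area to that of its circumscribed rectangle, the ratio of its area to that of its convex hull, and its eccentricity.
These three indicators measure the morphological characteristics of the chromosome: whether its area is large, whether it is fairly convex, and whether it is nearly round. The above features take part in the final model construction; integrating manually extracted features into a deep network is an innovation that makes the workflow of the invention more reasonable and orderly.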
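Purely as an illustration, the relative length of step b) and the circumscribed-rectangle ratio of step c) might be computed as in the following sketch. The function names, the list-of-rows binary mask, and the choice of an axis-aligned bounding rectangle are assumptions made for the example; the source does not specify whether the circumscribed rectangle is axis-aligned or of minimum area, and the skeleton length is taken here as an already computed input.

```python
def relative_length(skeleton_length, longest_skeleton_length):
    """Skeleton length divided by the longest skeleton length in the same cell."""
    if longest_skeleton_length <= 0:
        raise ValueError("longest_skeleton_length must be positive")
    return skeleton_length / longest_skeleton_length


def bounding_rectangle_ratio(mask):
    """Foreground area divided by the area of the axis-aligned bounding rectangle.

    `mask` is a list of rows; truthy entries mark chromosome pixels.
    The axis-aligned rectangle is an assumption of this sketch.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return len(pixels) / (height * width)
```

A plus-shaped mask of 5 pixels inside a 3x3 bounding rectangle gives a ratio of 5/9, and a chromosome a quarter as long as the longest one in its cell has a relative length of 0.25.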
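As a preferred technical measure, the third step comprises the following steps:
a) scale the chromosome image to bs pixels along its longest axis, and scale the other axis proportionally. Image sizes differ between chromosomes, but because the algorithm and framework require a consistent input size, the invention processes all images to a uniform size; the rule is to scale according to the longer axis of the image.
b) pad the scaled image with white pixels (value 255) to a square of bs*bs pixels. White is used because the original background of chromosome images is white. White padding matches the characteristics of chromosome images, reduces the difficulty of image processing, and improves recognition efficiency.
c) before training the deep network, apply rotation and flipping to the images as data augmentation;
d) normalize all input images so that the inputs are as standardized and consistent as possible and network training converges more easily.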
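Steps a) and b) can be sketched as follows, as a minimal example only. The helper names are hypothetical, the image is a plain list of pixel rows, and centring the image on the white canvas is an assumption: the source states only that the image is padded with white to bs x bs, not where the padding goes. The actual pixel resampling is left to an image library.

```python
def scaled_size(height, width, bs=256):
    """Target size so that the longer side equals bs, keeping the aspect ratio."""
    scale = bs / max(height, width)
    return max(1, round(height * scale)), max(1, round(width * scale))


def pad_to_square(image, bs=256, fill=255):
    """Place `image` (a list of rows) on a bs x bs canvas filled with white.

    Centring is an assumption of this sketch.
    """
    h, w = len(image), len(image[0])
    if h > bs or w > bs:
        raise ValueError("image must already fit inside bs x bs")
    top, left = (bs - h) // 2, (bs - w) // 2
    canvas = [[fill] * bs for _ in range(bs)]
    for r, row in enumerate(image):
        canvas[top + r][left:left + w] = row
    return canvas
```

For a 310 x 62 pixel chromosome image, `scaled_size(310, 62)` gives a 256 x 51 target, which is then padded to 256 x 256.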
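As a preferred technical measure, the fourth step comprises the following steps:
S1: build the model structure: the backbone network is based on the ResNet residual network structure;
S2: using residual learning greatly improves the effectiveness of feature extraction and makes it possible to build a deep network while avoiding overfitting the training set, which improves model accuracy; the depth of the model is hs layers;
S3: the classifier of the model is an MLP (multilayer perceptron) network. The point of this choice is that it allows an end-to-end network to be built, without separately training an SVM classifier on the extracted features. The model uses two MLP classifiers, one for chromosome type recognition and one for polarity recognition. The weight dimensions of the type classifier are (ms+ns)*24; those of the polarity classifier are (ms+ns)*ms and ms*2. The type classifier outputs the predicted probabilities of the 24 chromosome categories, and the polarity classifier outputs the predicted probabilities of the 2 polarities, i.e. long arm down or long arm up. Here ms comes from global pooling of the last ms features extracted by the residual network, and ns comes from the additional manually extracted features;
S4: the input of the MLP classifiers is set to (ms+ns) neurons because length information has been an important criterion in previous chromosome classification literature. The model therefore combines deep learning features with hand-designed features: classification considers the CNN result together with the relative skeleton length of the chromosome, the area ratio relative to the circumscribed rectangle, the area ratio relative to the convex hull, and the eccentricity. This construction benefits from the data dividend of deep learning on large-scale data sets while giving the features considered by the algorithm a degree of interpretability, which previous literature and methods had not considered.
S5: the loss function of the model is the cross-entropy loss, defined mathematically as:
$$\mathrm{Loss}(x, t) = -\log\left(\frac{\exp(x[t])}{\sum_{j=1}^{N_{cls}} \exp(x[j])}\right)$$
where exp(x) is short for exponential, i.e. the exponential function e^x;
x is the result vector output by the MLP classifier, and N_cls is the total number of classes to be predicted. For chromosome type classification, x has 24 dimensions and N_cls=24; for polarity classification, x has 2 dimensions and N_cls=2. j is the summation index used to accumulate each element x[j] of the vector x;
t is the true gold-standard label. For category classification, its value is between 0 and 23, representing chromosome 1 through chromosome Y; for polarity classification, its value is 0 or 1, representing long arm up and long arm down, respectively;
The whole function takes the negative logarithm of a probability value, which makes it convenient to minimize. The fraction inside the logarithm is interpreted as follows, taking category prediction as an example: among all predicted category results x[j], j=1,2,...,24, it is the probability of the category corresponding to the gold-standard label t;
S6: the ADAM optimizer is used when training the deep learning model.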
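The loss of step S5 can be checked numerically with a short sketch. This is an illustrative, pure-Python version assuming a 0-based label t, as in the 0-23 range given for the gold-standard label; the function name is hypothetical, and subtracting the maximum logit is a standard numerical-stability device rather than something the source describes.

```python
import math


def cross_entropy(x, t):
    """Return -log(exp(x[t]) / sum_j exp(x[j])) for logits x and 0-based label t."""
    m = max(x)  # subtract the maximum logit for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in x))
    return log_sum - x[t]
```

With 24 equal logits the predicted probability of every category is 1/24, so the loss equals log(24); the loss shrinks as the logit of the gold-standard category grows relative to the others.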
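As a preferred technical measure, the fifth step comprises the following steps:
a) using the deep learning model, the MLP classifiers output 24 probability values for category prediction and 2 probability values for polarity prediction. Most chromosomes are predicted correctly with very high confidence. The predicted probabilities over all categories sum to 1. For example, if the predicted probability that a chromosome image belongs to the first category is 0.9, to the second 0.05, to the third 0.05, and so on, then by the maximum-probability rule the image is taken to be a category-1 chromosome.
b) in the deep learning prediction, if the probability p that the chromosome belongs to category a is the largest of the 24 category probabilities, the chromosome is assigned to category a with confidence p. If p is less than 0.7, the confidence is considered low. For chromosomes with low category confidence, the category is predicted directly by a lookup based on relative length: from the ratio of the chromosome's length to that of chromosome 1, the longest chromosome, the category whose tabulated relative value is closest is obtained from a lookup table. The relative length table used for the lookup is computed from the standard chromosome map.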
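The decision rule of steps a) and b) might look like the sketch below. The function name and the `length_table` argument (a mapping from category index to tabulated relative length) are hypothetical; the table values used in the usage note are toy numbers for illustration only and are not taken from the relative length table of the publication.

```python
def predict_category(probabilities, relative_length, length_table, threshold=0.7):
    """Pick the most probable category, or fall back to a relative-length lookup.

    `length_table` maps a category index to its tabulated relative length;
    its contents are supplied by the caller (hypothetical in this sketch).
    """
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    confidence = probabilities[best]
    if confidence >= threshold:
        return best, confidence, "network"
    # Low confidence: choose the category whose tabulated length is closest.
    nearest = min(length_table, key=lambda k: abs(length_table[k] - relative_length))
    return nearest, confidence, "length_table"
```

With a toy table `{0: 1.0, 1: 0.9, 2: 0.8}`, probabilities `[0.4, 0.35, 0.25]` fall below the 0.7 threshold, so a chromosome with relative length 0.82 is assigned to index 2 by the lookup, whereas probabilities `[0.9, 0.05, 0.05]` are accepted directly as index 0.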
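As a preferred technical measure, the method further comprises a sixth step of establishing an evaluation system for the chromosome recognition results.
The evaluation indicators are: accuracy, sensitivity and specificity, precision and recall, and the F1 index. Assuming there are only two classes, denoted positive and negative, the counts are:
1) TP: the number of instances correctly classified as positive, i.e. instances that are actually positive and are classified as positive by the deep learning model;
2) FP: the number of instances incorrectly classified as positive, i.e. instances that are actually negative but are classified as positive by the deep learning model;
3) FN: the number of instances incorrectly classified as negative, i.e. instances that are actually positive but are classified as negative by the deep learning model;
4) TN: the number of instances correctly classified as negative, i.e. instances that are actually negative and are classified as negative by the deep learning model;
$$\mathrm{accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$
$$\mathrm{sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{specificity} = \frac{TN}{TN + FP}$$
$$\mathrm{precision} = \frac{TP}{TP + FP}$$
$$\mathrm{recall} = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
These 5 evaluation indicators range from 0 to 1; the higher the score, the better the classification.
Sensitivity and recall have the same definition. Sensitivity is reported as a pair with specificity, and precision is reported as a pair with recall, but the formulas for sensitivity and recall are identical. A sound evaluation system makes it possible to track the recognition performance of the invention and to improve it in a timely manner.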
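As a worked example, the indicators can be computed from the four counts with the standard binary-classification formulas. The function name is hypothetical, and the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification indicators from the four confusion counts.

    Assumes all denominators are non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    recall = sensitivity  # same formula as sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For TP=8, FP=2, FN=2, TN=88 this gives an accuracy of 0.96, sensitivity, precision and recall of 0.8, a specificity of 88/90, and an F1 index of 0.8. For the 24-category task the counts would be formed per category (one category against the rest) before averaging; how the source averages over categories is not stated.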
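As a preferred technical measure, bs is a number having 32 and 64 as factors, and its value is 256. Since a chromosome image may be up to 310 pixels long, and 256 is the number nearest to 310 that has 32 and 64 as factors, choosing 256 pixels satisfies the image size requirement and also helps the final feature-map size after pooling conform to deep learning rules of thumb, which simplifies data processing and accuracy control.
The rotation angle is kept within plus or minus 30 degrees. Flipping includes horizontal and vertical flipping: horizontal flipping increases sample diversity, while vertical flipping changes the polarity label. The rotation angle should not be too large, because the polarity must be determined. If the image is rotated too far, the direction of the long arm changes, and with it the polarity. Keeping the rotation angle within plus or minus 30 degrees provides sample diversity without changing the polarity.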
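The label handling described above can be sketched as follows: a horizontal flip leaves the polarity label unchanged, a vertical flip toggles it, and rotation angles are drawn from the plus or minus 30 degree range. The function names and the list-of-rows image are assumptions of the example, and the rotation itself (which needs interpolation) is left to an image library.

```python
import random


def horizontal_flip(image, polarity):
    """Mirror each row; the long-arm direction, and so the label, is unchanged."""
    return [row[::-1] for row in image], polarity


def vertical_flip(image, polarity):
    """Reverse the row order; long arm up becomes long arm down and vice versa."""
    return image[::-1], 1 - polarity


def sample_rotation_angle(rng=random, max_degrees=30.0):
    """Draw a rotation angle within the +/- 30 degree range given in the text."""
    return rng.uniform(-max_degrees, max_degrees)
```

For example, vertically flipping a sample labelled 0 yields the mirrored image with label 1, so both polarities can be generated from one annotated image.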
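As a preferred technical measure, the normalization step first computes the mean and standard deviation of each chromosome image, and then obtains the normalized image according to the following formula:
$$\mathrm{Image}_{new} = \frac{\mathrm{Image}_{old} - \mu}{\sigma}$$
where μ is the image mean and σ is the image standard deviation; Image_old is the original image and Image_new is the normalized image. After this step, every image theoretically has zero mean and a standard deviation of 1. The purpose of this step is to make the network inputs as standardized and consistent as possible, so that network training converges more easily.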
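A minimal per-image normalization sketch is given below. The function name is hypothetical, the image is a list of pixel rows, and the use of the population standard deviation is an assumption, since the source does not say which estimator is used.

```python
import statistics


def normalize(image):
    """Return (image - mean) / std computed over all pixels of one image."""
    values = [v for row in image for v in row]
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("a constant image has zero standard deviation")
    return [[(v - mu) / sigma for v in row] for row in image]
```

After this transformation the pixel values of each image have zero mean and unit standard deviation, whatever the original brightness and contrast.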
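As a preferred technical measure, in S1 the residual network is built from BasicBlock residual blocks, using 4 groups of BasicBlocks containing 3, 6, 27, and 3 blocks respectively. The purpose of the residual block is to train the CNN (convolutional neural network) by fitting the residual of the predicted output features, so that high-dimensional features are extracted progressively for the final classification.
As a preferred technical measure, in S6 the parameters of the ADAM optimizer are set to β1=0.9 and β2=0.99; the learning rate is initially 0.01 and decreases as the number of iterations grows; the total number of training iterations is 120, and the batch size is 256.
hs=80. Experiments show that 80 layers is a good choice. More layers do not significantly improve accuracy; on the contrary, with a limited number of samples a deeper network cannot be trained adequately, and it occupies more GPU memory, which hinders wide adoption. A shallower network reduces accuracy: with too few layers, the network fits the sample categories poorly and adapts poorly to sample diversity.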
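The stated depth is consistent with the usual way ResNet layers are counted, as the following arithmetic sketch shows. It assumes the common convention of one stem convolution, two convolutions per BasicBlock, and one final layer; the source does not say how the 80 layers are counted, so this is a plausibility check rather than the authors' definition.

```python
def resnet_depth(blocks_per_group, convs_per_block=2, stem_layers=1, head_layers=1):
    """Layer count under the common ResNet convention (an assumption here)."""
    return stem_layers + convs_per_block * sum(blocks_per_group) + head_layers
```

With the block counts 3, 6, 27, 3 this gives 1 + 2 * 39 + 1 = 80, matching hs=80; the same convention yields 18 for block counts 2, 2, 2, 2 and 34 for 3, 4, 6, 3.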
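ms ranges from 256 to 4096, and ns=4. ms is preferably 256: the more neurons there are, the more training samples and computing resources are required. Having the residual network extract a final set of 256 features, i.e. 256 neurons, meets the accuracy requirements of the invention while keeping processing fast and resource usage low.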
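With ms=256 and ns=4, the weight dimensions (ms+ns)*24 for the type classifier and (ms+ns)*ms, ms*2 for the polarity classifier work out as in the sketch below. The function name is hypothetical, and bias terms are not counted because the source lists only the weight dimensions.

```python
def head_weight_counts(ms=256, ns=4, n_types=24, n_polarities=2):
    """Number of weights per layer in the two MLP classifiers (biases excluded)."""
    fused = ms + ns  # pooled CNN features plus handcrafted features
    return {
        "type_head": [fused * n_types],
        "polarity_head": [fused * ms, ms * n_polarities],
    }
```

The fused input therefore has 260 values; the type classifier has 260 * 24 = 6240 weights, and the polarity classifier has 260 * 256 = 66560 weights followed by 256 * 2 = 512.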
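Compared with the prior art, the present invention has the following beneficial effects:
The invention uses deep learning to identify chromosome types automatically, accurately, and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens recognition and sorting time, and completes automatic classification and sorting of chromosomes with high accuracy. It also reduces physicians' workload, is not subject to external interference, follows a concise and reasonable procedure, can be applied on a large scale, and is simple to deploy.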
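Brief Description of the Drawings
FIG. 1 shows an image after padding with white pixels;
FIG. 2 shows an original chromosome image;
FIG. 3 shows the chromosome image of FIG. 2 after normalization;
FIG. 4 shows the chromosome image of FIG. 3 after random rotation;
FIG. 5 shows the chromosome image of FIG. 3 after random flipping.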
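Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
On the contrary, the invention covers any alternatives, modifications, equivalent methods, and schemes within its spirit and scope as defined by the claims. Further, to give the public a better understanding of the invention, some specific details are described at length below. Those skilled in the art can fully understand the invention without the description of these details.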
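A chromosome recognition method based on deep learning, comprising the following steps:
Step 1: obtain independent chromosome images;
Step 2: compute the handcrafted features of the chromosomes, comprising the following steps:
a) extract the skeleton of the chromosome using morphological operations and a skeleton-extraction algorithm, and compute its length;
b) divide this chromosome length by the length of the longest chromosome in the same cell to obtain the relative length;
c) from the single-chromosome image, compute: the ratio of its area to that of its circumscribed rectangle, the ratio of its area to that of its convex hull, and its eccentricity.
These three indicators measure the morphological characteristics of the chromosome: whether its area is large, whether it is fairly convex, and whether it is nearly round. The above features take part in the final model construction; integrating manually extracted features into a deep network is an innovation that makes the workflow of the invention more reasonable and orderly.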
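The convex-hull ratio and the eccentricity of step c) could be computed along the following lines. This is a sketch under stated assumptions: the mask is a list of rows with truthy chromosome pixels, the hull is taken over pixel corner points (Andrew's monotone chain algorithm), and eccentricity is defined as that of the ellipse with the same second moments as the region, a common image-analysis convention that the source does not spell out. All function names are hypothetical.

```python
import math


def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Shoelace formula."""
    s = 0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2


def _pixels(mask):
    return [(c, r) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def convex_hull_ratio(mask):
    """Pixel area divided by the area of the convex hull of the pixel corners."""
    pixels = _pixels(mask)
    corners = [(x + dx, y + dy) for x, y in pixels for dx in (0, 1) for dy in (0, 1)]
    return len(pixels) / polygon_area(convex_hull(corners))


def eccentricity(mask):
    """Eccentricity of the ellipse with the same second moments (assumed definition)."""
    pixels = _pixels(mask)
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    common = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    major = (sxx + syy + common) / 2
    minor = (sxx + syy - common) / 2
    if major == 0:
        return 0.0
    return math.sqrt(max(0.0, 1 - minor / major))
```

A filled rectangle has a convex-hull ratio of 1, a bent or L-shaped region has a smaller ratio, a straight one-pixel-wide line has eccentricity 1, and a square has eccentricity 0.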
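Step 3: apply basic image processing to the chromosomes, comprising the following steps:
a) scale the chromosome image to bs pixels along its longest axis, and scale the other axis proportionally. Image sizes differ between chromosomes, but because the algorithm and framework require a consistent input size, the invention processes all images to a uniform size; the rule is to scale according to the longer axis of the image. bs is a number having 32 and 64 as factors, and its value is 256. Since a chromosome image may be up to 310 pixels long, and 256 is the number nearest to 310 that has 32 and 64 as factors, choosing 256 pixels satisfies the image size requirement and also helps the final feature-map size after pooling conform to deep learning rules of thumb, which simplifies data processing and accuracy control.
b) pad the scaled image with white pixels (value 255) to a square of 256x256 pixels. White is used because the original background of chromosome images is white; see FIG. 1. White padding matches the characteristics of chromosome images, reduces the difficulty of image processing, and improves recognition efficiency.
c) before training the deep network, apply rotation and flipping to the images as data augmentation. The rotation angle is kept within plus or minus 30 degrees. Flipping includes horizontal and vertical flipping: horizontal flipping increases sample diversity (see FIG. 4), while vertical flipping changes the polarity label (see FIG. 5). The rotation angle should not be too large, because the polarity must be determined. If the image is rotated too far, the direction of the long arm changes, and with it the polarity. Keeping the rotation angle within plus or minus 30 degrees provides sample diversity without changing the polarity.
d) normalize all input images so that the inputs are as standardized and consistent as possible and network training converges more easily. The normalization step first computes the mean and standard deviation of each chromosome image, and then obtains the normalized image according to the following formula:
$$\mathrm{Image}_{new} = \frac{\mathrm{Image}_{old} - \mu}{\sigma}$$
where μ is the image mean and σ is the image standard deviation; Image_old is the original image and Image_new is the normalized image. After this step, every image theoretically has zero mean and a standard deviation of 1; see FIGS. 2-3. The purpose of this step is to make the network inputs as standardized and consistent as possible, so that network training converges more easily.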
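Step 4: build a deep learning model, comprising the following steps:
S1: build the model structure: the backbone network is based on the ResNet residual network structure. The residual network is built from BasicBlock residual blocks, using 4 groups of BasicBlocks containing 3, 6, 27, and 3 blocks respectively. The purpose of the residual block is to train the CNN (convolutional neural network) by fitting the residual of the predicted output features, so that high-dimensional features are extracted progressively for the final classification.
S2: using residual learning greatly improves the effectiveness of feature extraction and makes it possible to build a deep network while avoiding overfitting the training set, which improves model accuracy. The depth of the model is 80 layers. Experiments show that 80 layers is a good choice. More layers do not significantly improve accuracy; on the contrary, with a limited number of samples a deeper network cannot be trained adequately, and it occupies more GPU memory, which hinders wide adoption. A shallower network reduces accuracy: with too few layers, the network fits the sample categories poorly and adapts poorly to sample diversity.
S3: the classifier of the model is an MLP (multilayer perceptron) network. The point of this choice is that it allows an end-to-end network to be built, without separately training an SVM classifier on the extracted features. The model uses two MLP classifiers, one for chromosome type recognition and one for polarity recognition. The weight dimensions of the type classifier are (ms+ns)*24; those of the polarity classifier are (ms+ns)*ms and ms*2. The type classifier outputs the predicted probabilities of the 24 chromosome categories, and the polarity classifier outputs the predicted probabilities of the 2 polarities, i.e. long arm down or long arm up. Here ms comes from global pooling of the last ms features extracted by the residual network, and ns comes from the additional manually extracted features.
ms=256 and ns=4. The more neurons there are, the more training samples and computing resources are required. Having the residual network extract a final set of 256 features, i.e. 256 neurons, meets the accuracy requirements of the invention while keeping processing fast and resource usage low.
S4: the input of the MLP classifiers is set to (ms+ns) neurons because length information has been an important criterion in previous chromosome classification literature. The model therefore combines deep learning features with hand-designed features: classification considers the CNN result together with the relative skeleton length of the chromosome, the area ratio relative to the circumscribed rectangle, the area ratio relative to the convex hull, and the eccentricity. This construction benefits from the data dividend of deep learning on large-scale data sets while giving the features considered by the algorithm a degree of interpretability, which previous literature and methods had not considered.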
S5,模型的损失函数Loss Function设置为交叉熵函数Cross-Entropy Loss,其定义的数学表达式如下:
Figure PCTCN2019090230-appb-000010: Loss(x, t) = -log( exp(x[t]) / Σ_j exp(x[j]) )
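where exp(x) denotes the exponential function e^x;
x is the result vector output by the MLP classifier, and N_cls is the total number of classes to be predicted; for chromosome type classification, x has 24 dimensions and N_cls=24; for polarity classification, x has 2 dimensions and N_cls=2; j is the index used to sum over every element x[j] of the vector x;
t is the ground-truth (gold-standard) label; for type classification its value lies between 0 and 23, representing chromosome 1 through chromosome Y; for polarity classification its value is 0 or 1, representing long arm up and long arm down respectively;
the whole function takes the negative logarithm of a probability, which makes it convenient to minimize; the fraction inside the logarithm, taking type prediction as an example, is the probability assigned to the class of the gold-standard label t among all predicted class results x[j], j=1,2,...,24.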
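The cross-entropy loss described in S5 is the negative logarithm of the softmax probability of the true class. A minimal sketch in plain Python follows; it assumes the label t is a 0-based index into x, and it uses the log-sum-exp rearrangement (an implementation choice not mentioned in the text) so that large scores do not overflow.

```python
import math


def cross_entropy(x, t):
    """-log( exp(x[t]) / sum_j exp(x[j]) ), computed in a numerically stable way."""
    m = max(x)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in x))
    return log_sum_exp - x[t]


# A classifier that is undecided over 24 types incurs a loss of log(24).
undecided = cross_entropy([0.0] * 24, 3)
```

The loss is close to 0 when the score of the true class dominates and grows as probability mass moves to the wrong classes.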
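S6: The deep learning model is trained with the ADAM optimizer. Its parameters are set to β1=0.9 and β2=0.99; the learning rate is initially 0.01 and decreases as the number of iterations increases; the total number of training iterations is 120, and the batch size is 256.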
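For reference, a single ADAM update with the stated hyperparameters can be written out as below. This is a scalar sketch of the standard ADAM rule, not the training code of the invention: the epsilon value is an assumption, the 120 iterations are read here as epochs, and the linear decay in `decayed_lr` is purely hypothetical because the text only says that the learning rate decreases.

```python
import math


def adam_step(param, grad, m, v, step, lr=0.01, beta1=0.9, beta2=0.99, eps=1e-8):
    """One ADAM update for a scalar parameter; step counts from 1."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** step)             # bias correction
    v_hat = v / (1 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def decayed_lr(epoch, base_lr=0.01, total_epochs=120):
    """Hypothetical linear decay from base_lr towards 0 over the training run."""
    return base_lr * (1 - epoch / total_epochs)


# Usage: first update of a parameter at 0.0 with gradient 1.0.
p, m, v = adam_step(0.0, 1.0, 0.0, 0.0, step=1, lr=decayed_lr(0))
```

Because of the bias correction, the first update has a magnitude close to the learning rate itself, independent of the gradient scale.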
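Step 5: predict the chromosome type with the deep learning model, comprising the following steps:
a) Using the deep learning model, the MLP classifiers output 24 probability values for the type prediction and 2 probability values for the polarity prediction. Most chromosomes are predicted correctly with very high confidence. The probabilities over all types sum to 1. For example, if the predicted probability that a chromosome image is type 1 is 0.9, type 2 is 0.05, type 3 is 0.05, ..., then by the maximum-probability rule the image is taken to be a type 1 chromosome.
b) In the deep learning prediction, if the probability p that the chromosome belongs to class a is the largest of the 24 class probabilities, the chromosome is assigned to class a with confidence p. If p is less than 0.7, the confidence is considered low. For chromosomes with low type confidence, the type is predicted directly by a lookup based on relative length: from the ratio of the chromosome's length to that of chromosome 1, the longest chromosome, a lookup table gives the chromosome type whose relative value is closest. The relative-length table is computed from the standard chromosome map. The length-based prediction can be understood as a corrective prediction method. The relative ratios are listed in the following table: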
Figure PCTCN2019090230-appb-000011
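The decision rule of step b) can be sketched as follows. The threshold of 0.7 comes from the text; everything else is illustrative. In particular, `EXAMPLE_TABLE` holds made-up placeholder values for three classes and does not reproduce the relative-length table of the patent, and the function name `predict_type` is hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.7  # below this, the network prediction is not trusted


def predict_type(probs, relative_length, length_table):
    """Return (class index, source of the decision)."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    if probs[best] >= CONFIDENCE_THRESHOLD:
        return best, "network"
    # Low confidence: pick the class whose tabulated relative length is closest.
    nearest = min(length_table, key=lambda k: abs(length_table[k] - relative_length))
    return nearest, "length-table"


# Placeholder values only; NOT the relative lengths from the patent's table.
EXAMPLE_TABLE = {0: 1.00, 1: 0.90, 2: 0.75}

confident = predict_type([0.90, 0.05, 0.05], 0.50, EXAMPLE_TABLE)
uncertain = predict_type([0.40, 0.35, 0.25], 0.78, EXAMPLE_TABLE)
```

In the first call the network output is kept; in the second the maximum probability is only 0.40, so the class with the closest tabulated relative length is returned instead.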
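Step 6: establish an evaluation system for the chromosome recognition results.
The evaluation metrics are: accuracy, sensitivity and specificity, precision and recall, and the F1 score. Assuming there are only two target classes, denoted positive and negative, the following quantities are defined:
1) TP: the number of instances correctly classified as positive, i.e. instances that are actually positive and are classified as positive by the deep learning model;
2) FP: the number of instances wrongly classified as positive, i.e. instances that are actually negative but are classified as positive by the deep learning model;
3) FN: the number of instances wrongly classified as negative, i.e. instances that are actually positive but are classified as negative by the deep learning model;
4) TN: the number of instances correctly classified as negative, i.e. instances that are actually negative and are classified as negative by the deep learning model.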
Figure PCTCN2019090230-appb-000012
Figure PCTCN2019090230-appb-000013
Figure PCTCN2019090230-appb-000014
Figure PCTCN2019090230-appb-000015
Figure PCTCN2019090230-appb-000016
Figure PCTCN2019090230-appb-000017
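These evaluation metrics (five distinct ones, since sensitivity and recall coincide) all lie between 0 and 1; the higher the score, the better the classification.
Sensitivity and recall share the same definition: sensitivity is reported as a pair with specificity, and precision is reported as a pair with recall, but sensitivity and recall are computed by the same formula. A sound evaluation system makes it possible to monitor the recognition performance of the invention and to improve it promptly.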
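The metrics named above follow the standard definitions in terms of TP, FP, FN and TN. The sketch below uses those standard formulas; it assumes all denominators are non-zero, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    recall = sensitivity  # same formula as sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Usage with illustrative counts (not taken from the patent's experiments).
scores = binary_metrics(tp=8, fp=2, fn=2, tn=88)
```

For a 24-class problem these quantities are typically computed per class in a one-versus-rest fashion and then averaged; the text does not state which averaging was used.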
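To verify the recognition performance, the inventors collected, curated and labeled 80,254 metaphase chromosome images, comprising 77,878 normal samples and 2,376 abnormal samples. The invention was developed on this dataset; it recognizes type and polarity for both normal and abnormal samples and therefore has good generality. The reported results are measured on the test sample sets using 10-fold cross-validation. According to the cross-validation results, the performance achieved on the test sets is:
i. Type prediction:
accuracy 0.9803, sensitivity 0.9766, specificity 0.9991, precision 0.9796, recall 0.9766, F1 score 0.9779
ii. Polarity prediction:
accuracy 0.9897, sensitivity 0.9895, specificity 0.9895, precision 0.9895, recall 0.9895, F1 score 0.9895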
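The 10-fold cross-validation protocol mentioned above can be sketched as an index split. This is a generic illustration, not the evaluation code behind the reported numbers: the shuffling seed is arbitrary, the function name `k_fold_indices` is hypothetical, and the text does not say whether folds were formed per image, per cell or per patient.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


# Usage: every sample is used for testing exactly once across the 10 folds.
splits = list(k_fold_indices(80254, k=10))
```

The model is trained ten times, each time on nine folds and evaluated on the remaining one, and the metrics are then aggregated over the ten test folds.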
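The above experiments show that the deep learning method of the invention recognizes chromosome types automatically, accurately and efficiently. Compared with existing recognition techniques, it improves the efficiency of karyotype analysis, shortens the time needed for recognition and ordering, completes the automatic classification and ordering of chromosomes with high accuracy, reduces the workload of physicians, is not affected by external interference, and has a concise and reasonable workflow that is simple to deploy and suitable for large-scale adoption.
The above are merely preferred embodiments of the invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.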

Claims (10)

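  1. A chromosome recognition method based on deep learning, characterized by comprising the following steps:
    Step 1: obtain independent chromosome images;
    Step 2: compute hand-crafted features of the chromosome;
    Step 3: perform basic image processing on the chromosome;
    Step 4: build a deep learning model;
    Step 5: predict the chromosome type based on the deep learning model.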
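  2. The chromosome recognition method based on deep learning according to claim 1, characterized in that
    Step 2 comprises the following steps:
    a) extract the skeleton of the chromosome using morphological operations and a skeleton extraction algorithm, and compute its length;
    b) divide this chromosome length by the length of the longest chromosome in the same cell to obtain the relative length ratio;
    c) compute from the single chromosome image: the area ratio relative to the bounding rectangle, the area ratio relative to its convex hull, and the eccentricity.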
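  3. The chromosome recognition method based on deep learning according to claim 1, characterized in that
    Step 3 comprises the following steps:
    a) scale the chromosome image along its longest axis to bs pixels, and scale the other axis proportionally;
    b) pad the scaled image with white pixels;
    c) before training the deep network, apply rotation and flipping to the images as data augmentation;
    d) normalize all input images so that the image inputs are as standardized and consistent as possible and the network training converges more easily.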
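  4. The chromosome recognition method based on deep learning according to claim 1, characterized in that
    Step 4 comprises the following steps:
    S1: define the model structure: the backbone network is based on the ResNet residual network architecture;
    S2: residual learning greatly improves the effectiveness of the features extracted by the model and makes it possible to build a deep network without overfitting the training set, which improves the accuracy of the model; the depth of the model is hs layers;
    S3: the classifiers of the model are MLP (multi-layer perceptron) networks; the key point of this choice is that it yields an end-to-end network, with no need to train a separate SVM classifier on the extracted features; the model uses two MLP classifiers, one for chromosome type recognition and one for polarity recognition; the neuron parameters of the type classifier are (ms+ns)*24; those of the polarity classifier are (ms+ns)*ms and ms*2; the type classifier outputs the predicted probabilities of the 24 chromosome types, and the polarity classifier outputs the predicted probabilities of the 2 polarities, i.e. long arm down or long arm up; ms comes from global pooling of the last ms features extracted by the residual network, and ns comes from the additional hand-crafted features;
    S4: the MLP classifier neuron parameter is set to (ms+ns);
    S5: the loss function of the model is the cross-entropy loss, defined by the following expression:
    Figure PCTCN2019090230-appb-100001
    where exp(x) denotes the exponential function e^x;
    x is the result vector output by the MLP classifier, and N_cls is the total number of classes to be predicted; for chromosome type classification, x has 24 dimensions and N_cls=24; for polarity classification, x has 2 dimensions and N_cls=2; j is the index used to sum over every element x[j] of the vector x;
    t is the ground-truth (gold-standard) label; for type classification its value lies between 0 and 23, representing chromosome 1 through chromosome Y; for polarity classification its value is 0 or 1, representing long arm up and long arm down respectively;
    the whole function takes the negative logarithm of a probability, which makes it convenient to minimize; the fraction inside the logarithm, taking type prediction as an example, is the probability assigned to the class of the gold-standard label t among all predicted class results x[j], j=1,2,...,24;
    S6: the deep learning model is trained with the ADAM optimizer.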
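  5. The chromosome recognition method based on deep learning according to claim 1, characterized in that
    Step 5 comprises the following steps:
    a) using the deep learning model, the MLP classifiers output 24 probability values for the type prediction and 2 probability values for the polarity prediction; most chromosomes are predicted correctly with very high confidence;
    b) for chromosomes whose type confidence in the deep learning prediction is low, the type is predicted directly by a lookup based on relative length; from the ratio of the chromosome's length to that of chromosome 1, the longest chromosome, a lookup table gives the chromosome type whose relative value is closest; the relative-length table is computed from the standard chromosome map.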
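  6. The chromosome recognition method based on deep learning according to any one of claims 1-5, characterized by further comprising Step 6: establish an evaluation system for the chromosome recognition results,
    the evaluation metrics being: accuracy, sensitivity and specificity, precision and recall, and the F1 score; assuming there are only two target classes, denoted positive and negative, the following quantities are defined:
    1) TP: the number of instances correctly classified as positive, i.e. instances that are actually positive and are classified as positive by the deep learning model;
    2) FP: the number of instances wrongly classified as positive, i.e. instances that are actually negative but are classified as positive by the deep learning model;
    3) FN: the number of instances wrongly classified as negative, i.e. instances that are actually positive but are classified as negative by the deep learning model;
    4) TN: the number of instances correctly classified as negative, i.e. instances that are actually negative and are classified as negative by the deep learning model;
    Figure PCTCN2019090230-appb-100002
    Figure PCTCN2019090230-appb-100003
    Figure PCTCN2019090230-appb-100004
    Figure PCTCN2019090230-appb-100005
    Figure PCTCN2019090230-appb-100006
    Figure PCTCN2019090230-appb-100007
    these evaluation metrics (five distinct ones, since sensitivity and recall coincide) all lie between 0 and 1; the higher the score, the better the classification.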
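  7. The chromosome recognition method based on deep learning according to claim 3, characterized in that
    bs is a number divisible by 32 and 64, and its value is set to 256;
    the rotation angle is kept within ±30 degrees, and flipping includes horizontal and vertical flips; horizontal flipping increases sample diversity, while vertical flipping changes the polarity label.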
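  8. The chromosome recognition method based on deep learning according to claim 7, characterized in that
    the normalization is performed as follows: for each chromosome image, first compute its mean and standard deviation, then obtain the normalized image with the following formula:
    Figure PCTCN2019090230-appb-100008
    where μ is the image mean and σ is the image standard deviation; Image_old is the original image and Image_new is the normalized image; after this step, every image theoretically has zero mean and unit standard deviation.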
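  9. The chromosome recognition method based on deep learning according to claim 4, characterized in that
    S1: the residual network is built from the residual structure of the BasicBlock; four groups of BasicBlocks are used, containing 3, 6, 27 and 3 BasicBlocks respectively; the main purpose of the residual basic block is to train the CNN (convolutional neural network) by fitting the residual of the predicted output features, so that high-dimensional features are progressively extracted for the final classification.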
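  10. The chromosome recognition method based on deep learning according to claim 9, characterized in that
    S6: the parameters of the ADAM optimizer are set to β1=0.9 and β2=0.99; the learning rate is initially 0.01 and decreases as the number of iterations increases; the total number of training iterations is 120 and the batch size is 256; hs=80; ms ranges from 256 to 4096; ns=4.
PCT/CN2019/090230 2018-08-27 2019-06-06 Chromosome recognition method based on deep learning WO2020042704A1 (zh)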

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/272,254 US11436493B2 (en) 2018-08-27 2019-06-06 Chromosome recognition method based on deep learning

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810979111.XA CN109300111B (zh) 2018-08-27 2018-08-27 Chromosome recognition method based on deep learning
CN201810979111.X 2018-08-27

Publications (1)

Publication Number Publication Date
WO2020042704A1 true WO2020042704A1 (zh) 2020-03-05

Family

ID=65165558

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090230 WO2020042704A1 (zh) 2018-08-27 2019-06-06 Chromosome recognition method based on deep learning

Country Status (3)

Country Link
US (1) US11436493B2 (zh)
CN (1) CN109300111B (zh)
WO (1) WO2020042704A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037173A (zh) * 2020-08-04 2020-12-04 湖南自兴智慧医疗科技有限公司 Chromosome detection method and apparatus, and electronic device
CN113408505A (zh) * 2021-08-19 2021-09-17 北京大学第三医院(北京大学第三临床医学院) Deep-learning-based chromosome polarity recognition method and system

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
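CN109300111B (zh) * 2018-08-27 2020-05-12 杭州德适生物科技有限公司 Chromosome recognition method based on deep learning
JP2021531812A (ja) * 2019-02-21 2021-11-25 中國醫藥大學附設醫院China Medical University Hospital Test model for chromosomal abnormalities, test system therefor, and test method for chromosomal abnormalities
CN110390312A (zh) * 2019-07-29 2019-10-29 北京航空航天大学 Automatic chromosome classification method and classifier based on a convolutional neural network
CN110533672B (zh) * 2019-08-22 2022-10-28 杭州德适生物科技有限公司 Chromosome sorting method based on band recognition
CN110533684B (zh) * 2019-08-22 2022-11-25 杭州德适生物科技有限公司 Chromosome karyotype image segmentation method
WO2021074694A1 (en) * 2019-10-17 2021-04-22 Metasystems Hard & Software Gmbh Methods for automated chromosome analysis
CN110879996A (zh) * 2019-12-03 2020-03-13 上海北昂医药科技股份有限公司 Method for locating and ranking chromosome metaphase spreads
CN111325711A (zh) * 2020-01-16 2020-06-23 杭州德适生物科技有限公司 Deep-learning-based quality evaluation method for chromosome metaphase images
CN111612744A (zh) * 2020-04-30 2020-09-01 西交利物浦大学 Method for generating a curved-chromosome image straightening model, application of the model, system, readable storage medium and computer device
CN112330652A (zh) * 2020-11-13 2021-02-05 深圳大学 Deep-learning-based chromosome recognition method, apparatus and computer device
CN112487941B (zh) * 2020-11-26 2023-03-14 华南师范大学 Method, system and storage medium for recognizing chromosome clusters and chromosome instances
CN114331031B (zh) * 2021-12-08 2022-12-09 北京华清安地建筑设计有限公司 Method and system for recognizing and evaluating traditional architectural features
CN115220623B (zh) * 2021-12-17 2023-12-05 深圳市瑞图生物技术有限公司 Chromosome image analysis method, device and storage medium
CN115147661B (zh) * 2022-07-25 2023-07-25 浙大城市学院 Chromosome classification method, apparatus, device and readable storage medium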

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
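CN104331712A (zh) * 2014-11-24 2015-02-04 齐齐哈尔格林环保科技开发有限公司 Automatic classification method for algae cell images
CN105957092A (zh) * 2016-05-31 2016-09-21 福州大学 Self-learning feature extraction method for mammography images for computer-aided diagnosis
CN107784324A (zh) * 2017-10-17 2018-03-09 杭州电子科技大学 Multi-class white blood cell recognition method based on a deep residual network
CN109300111A (zh) * 2018-08-27 2019-02-01 杭州德适生物科技有限公司 Chromosome recognition method based on deep learning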

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4122518A (en) * 1976-05-17 1978-10-24 The United States Of America As Represented By The Administrator Of The National Aeronautics & Space Administration Automated clinical system for chromosome analysis
US4656594A (en) * 1985-05-06 1987-04-07 National Biomedical Research Foundation Operator-interactive automated chromosome analysis system producing a karyotype
CN1120445C (zh) * 2000-01-13 2003-09-03 北京工业大学 Dynamic neuron fuzzy computation method for automatic recognition of human chromosome patterns
CN1957353A (zh) * 2004-02-10 2007-05-02 皇家飞利浦电子股份有限公司 Genetic algorithm for optimizing genomics-based medical diagnostic tests
CN101520890B (zh) * 2008-12-31 2011-04-20 广东威创视讯科技股份有限公司 Automatic segmentation method for touching chromosomes based on grayscale feature images
CN101710417B (zh) * 2009-11-06 2012-06-06 广东威创视讯科技股份有限公司 Chromosome image processing method and system
US9607202B2 (en) * 2009-12-17 2017-03-28 University of Pittsburgh—of the Commonwealth System of Higher Education Methods of generating trophectoderm and neurectoderm from human embryonic stem cells
WO2012061669A2 (en) * 2010-11-05 2012-05-10 Cytognomix,Inc. Centromere detector and method for determining radiation exposure from chromosome abnormalities
WO2013113707A1 (en) * 2012-02-01 2013-08-08 Ventana Medical Systems, Inc. System for detecting genes in tissue samples
CN106340016B (zh) * 2016-08-31 2019-03-22 湖南品信生物工程有限公司 DNA quantitative analysis method based on cell microscope images
CN107463802A (zh) * 2017-08-02 2017-12-12 南昌大学 Method for predicting prokaryotic protein acetylation sites
US10496924B1 (en) * 2018-08-07 2019-12-03 Capital One Services, Llc Dictionary DGA detector model

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
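CN112037173A (zh) * 2020-08-04 2020-12-04 湖南自兴智慧医疗科技有限公司 Chromosome detection method and apparatus, and electronic device
CN112037173B (zh) * 2020-08-04 2024-04-05 湖南自兴智慧医疗科技有限公司 Chromosome detection method and apparatus, and electronic device
CN113408505A (zh) * 2021-08-19 2021-09-17 北京大学第三医院(北京大学第三临床医学院) Deep-learning-based chromosome polarity recognition method and system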

Also Published As

Publication number Publication date
CN109300111B (zh) 2020-05-12
US11436493B2 (en) 2022-09-06
US20210312285A1 (en) 2021-10-07
CN109300111A (zh) 2019-02-01

Similar Documents

Publication Publication Date Title
WO2020042704A1 (zh) Chromosome recognition method based on deep learning
Khosravi et al. Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization
US11842487B2 (en) Detection model training method and apparatus, computer device and storage medium
Khalifa et al. Deep transfer learning models for medical diabetic retinopathy detection
Kwasigroch et al. Deep CNN based decision support system for detection and assessing the stage of diabetic retinopathy
Wang et al. Diabetic retinopathy stage classification using convolutional neural networks
Sharma et al. Crowdsourcing for chromosome segmentation and deep classification
Başaran et al. Convolutional neural network approach for automatic tympanic membrane detection and classification
Khan et al. Cataract detection using convolutional neural network with VGG-19 model
WO2021062904A1 (zh) TMB classification method and system based on pathological images, and TMB analysis apparatus
Theodorakopoulos et al. Hep-2 cells classification via fusion of morphological and textural features
Hu et al. Classification of metaphase chromosomes using deep convolutional neural network
CN112381178B (zh) Medical image classification method based on multi-loss feature learning
Yüzkat et al. Multi-model CNN fusion for sperm morphology analysis
Zhu et al. Screening of common retinal diseases using six-category models based on EfficientNet
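CN114913923A (zh) Cell type identification method for single-cell chromatin accessibility sequencing data
CN114494263B (zh) Medical image lesion detection method, system and device incorporating clinical information
CN112712122A (zh) Neural-network-based classification and detection method and system for corneal ulcers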
Li et al. HEp-2 specimen classification via deep CNNs and pattern histogram
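CN114792385A (zh) Few-shot fine-grained image classification method with pyramid-split dual attention
TW201913565A (zh) Embryo image evaluation method and system
CN105528791B (zh) Quality evaluation device for hand-drawn touch-screen images and evaluation method thereof
WO2023160666A1 (zh) Object detection method, and object detection model training method and apparatus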
Fan et al. Detecting Glaucoma in the Ocular Hypertension Treatment Study using deep learning: implications for clinical trial endpoints
Peng et al. Identification of incorrect karyotypes using deep learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19854530

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19854530

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10.08.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19854530

Country of ref document: EP

Kind code of ref document: A1