WO2023240820A1 - Chromosome karyotype analysis module - Google Patents
Chromosome karyotype analysis module
- Publication number
- WO2023240820A1 (application PCT/CN2022/120115)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- chromosome
- model
- karyotype analysis
- loss
- analysis module
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Abstract
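The present invention provides a chromosome karyotype analysis module comprising a hardware part and a data analysis program part. The data analysis program part includes data analysis models, which are used respectively for marking the original stained image, chromosome localization, chromosome sorting, and recognition of balanced chromosomal translocations. The module performs karyotype analysis by artificial intelligence, which avoids visual comparison by eye, saves time, and improves efficiency.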
Description
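The present invention belongs to the field of bioinformatics and specifically relates to a chromosome karyotype analysis module.
In the field of genetic disease diagnosis (prenatal diagnosis), fetal cells can be obtained by puncture of the amniotic cavity, the umbilical vessels, or the chorionic villi. After in vitro culture, the cells are harvested, prepared on slides, and banded, and karyotype analysis is then performed. The analysis is used in clinical medicine, premarital examination, and prenatal and postnatal care: diagnosing fetal chromosomal abnormalities allows intervention against birth defects such as reproductive dysfunction, abnormal secondary sexual characteristics, ambiguous external genitalia, multiple congenital malformations, intellectual disability, and behavioral abnormalities.
In the conventional workflow, a chromosome analysis system uses a camera to capture real-time images of the chromosomes observed under the microscope and transfers them to a computer. Chromosome image analysis software is then used for image adjustment, separation of touching and overlapping chromosomes, karyotype identification and arrangement, report design, and similar operations. After confirmation by the examining physician, a clear, illustrated chromosome examination report can be printed.
However, conventional karyotype analysis demands a high level of professional skill and is time-consuming and labor-intensive. Moreover, the material published so far mainly concerns statistics on chromosome rearrangement and pairing, and lacks programmatic analysis of specific karyotypes.
Chinese patent 202110600361.X discloses a chromosome karyotype analysis method, system, terminal device, and storage medium. The method comprises: performing feature extraction and feature fusion on a chromosome microscopic image to output a target feature map; generating several anchor boxes from the target feature map and extracting several regions of interest of the same size from the anchor boxes using a preset detection algorithm; constructing a karyotype analysis model from the regions of interest and the feature information; and inputting the chromosome microscopic image to be identified into the karyotype analysis model to output the corresponding karyotype analysis result. That invention automatically classifies, segments, and analyzes chromosome karyotypes, improving the accuracy of karyotype detection.
Chinese patent 202011161688.3 discloses an automated chromosome karyotype analysis and abnormality detection method. The method combines an attention mechanism with a convolutional neural network and localizes targets in two stages. The first stage coarsely localizes the target region. The second stage adds an attention mechanism within the target region from the first stage, extracts deeper semantic features to predict the target mask, and at the same time uses the class prediction and detection-box regression tasks to coarsely localize the target position and perform segmentation. Finally, the trained model is used to segment and detect chromosome images, which achieves accurate chromosome segmentation and abnormality detection and thereby automates karyotype analysis.
However, the above methods have weak discriminating power for chromosomal translocations and need further optimization to obtain a karyotype analysis module with higher analytical accuracy.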
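Summary of the Invention
To solve the above problems, the present invention provides a chromosome karyotype analysis module.
In the present invention, "chromosome karyotype analysis" takes metaphase chromosomes as its object of study. Chromosomes are analyzed, compared, sorted, and numbered according to features such as chromosome length, centromere position, the ratio of the long arm to the short arm, and the presence or absence of satellites, with the aid of banding techniques, and a diagnosis is made from variations in chromosome structure and number.
In the present invention, "module" generally refers to a bioinformatics module, that is, a bioinformatic analysis tool, which may include a hardware part and a software part.
In the present invention, "chromosome banding technique" means that, after some special treatment or specific staining, chromosomes display a series of consecutive light and dark bands; such chromosomes are called banded chromosomes.
The chromosome karyotype analysis module provided by the present invention aims to normalize staining, so that images captured under different staining conditions have a similar, consistent appearance when input.
In one aspect, the present invention provides a chromosome karyotype analysis module.
The chromosome karyotype analysis module is used to analyze chromosome karyotypes processed by a banding technique.
The karyotype analysis reports output by the module fall into 17 classes in total. The module can address abnormal chromosome counts and screen for congenital diseases caused by them, including congenital ovarian dysgenesis (Turner syndrome), trisomy X syndrome, XYY (super-male) syndrome, and Klinefelter syndrome.
Specifically, the chromosome karyotype analysis module comprises a hardware part and a data analysis program part.
The hardware part is a computer.
Specifically, the data analysis program part is a combination of data analysis models.
The data models are used respectively for marking the original stained image, chromosome localization, chromosome sorting, and recognition of balanced chromosomal translocations.
The parameters of the chromosome localization model are: batch_size of 16 and epoch of 200; base model yolov5x.pt. GIoU is used as the bounding-box regression loss; the complete loss function, which uses binary cross-entropy loss (BCE loss), consists of three parts: bounding-box regression loss, confidence prediction loss, and class prediction loss.
The chromosome sorting model is based on Mask R-CNN, with the image size set to 896×896, a learning rate of 0.001, a momentum hyperparameter of 0.9, epoch set to 1000, batch_size set to 16, and the pre-trained model resnet101 as the backbone network.
The loss function used by the chromosome sorting network is classification error + detection error + segmentation error; when computing the segmentation error, a binary sigmoid cross-entropy loss is applied to each pixel.
The recognition model for balanced chromosomal translocations is the DM-K core model.
The specific parameters of the DM-K core model are: the base model is Vgg-16, Vgg-19, resnet-101, or senet; the learning rate is set to 0.001; training runs for 100 epochs; batch_size can be set to 32 or 64; and the loss function is cross-entropy.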
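The hyperparameters stated for the three models can be gathered in one place, which makes them easier to compare. The following is a minimal sketch, not the applicant's code: the `TrainConfig` class and its field names are hypothetical, the image sizes are those given in Example 1, and `resnet-101` is picked arbitrarily from the four base models listed for the DM-K core model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters stated in the text."""
    base_model: str
    epochs: int
    batch_size: int
    image_size: Optional[int] = None      # side of the square input, in pixels
    learning_rate: Optional[float] = None  # None where the text gives no value
    momentum: Optional[float] = None


# Chromosome localization (yolov5x): the text states no learning rate.
LOCALIZATION = TrainConfig(base_model="yolov5x.pt", epochs=200,
                           batch_size=16, image_size=896)

# Chromosome sorting (Mask R-CNN with a resnet101 backbone).
SORTING = TrainConfig(base_model="resnet101", epochs=1000, batch_size=16,
                      image_size=896, learning_rate=0.001, momentum=0.9)

# Balanced-translocation recognition (DM-K core model); batch_size may be 32 or 64.
TRANSLOCATION = TrainConfig(base_model="resnet-101", epochs=100, batch_size=32,
                            image_size=256, learning_rate=0.001)
```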
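The sample type is metaphase spreads at the 400-550G band level.
In another aspect, the present invention provides a chromosome karyotype analysis system.
The chromosome karyotype analysis system includes the aforementioned chromosome karyotype analysis module.
The chromosome karyotype analysis system further includes a data acquisition module.
The data acquisition module is used to collect chromosome banding data that meet preset requirements.
Use of the data acquisition module includes a chromosome staining step.
The karyotype analysis system further includes a report generation module.
The report generation module is used to generate an analysis report from the analysis results of the chromosome karyotype analysis module.
Beneficial effects of the present invention:
(1) Visual comparison by eye is avoided, saving time and improving efficiency;
(2) High diagnostic performance (area under the ROC curve of 0.98; accuracy of up to 80%).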
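Figure 1 is an example of an original chromosome image;
Figure 2 is an example of manual chromosome annotation;
Figure 3 is an example of the algorithm's chromosome prediction;
Figure 4 is a schematic of feature extraction;
Figure 5 is an original chromosome image;
Figure 6 is an example of a manually arranged karyogram;
Figure 7 is an example of the karyogram predicted by the algorithm;
Figure 8 is a comparison diagram of a balanced chromosomal translocation;
Figure 9 shows the label distribution for chromosome image detection;
Figure 10 shows an analysis of the training results for chromosome image detection: loss function, precision, and recall.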
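The present invention is described in further detail below with reference to specific examples. The examples only illustrate the invention and do not limit it. Unless otherwise specified, experimental methods for which no specific conditions are given in the examples are carried out under conventional conditions, and the materials, reagents, and the like used in the examples are commercially available.
Example 1: A chromosome karyotype analysis module and its detection performance
The chromosome karyotype analysis module includes a computer and data analysis software.
The data analysis software performs the analysis using models, whose training and validation details are as follows:
Sample source: 1,000 original chromosome images from the First Affiliated Hospital of Guangzhou Medical University.
Sample collection requirements (standard): metaphase spreads at the 400-550G band level, of moderate morphology, with G-bands generally clear and no obvious contamination.
The samples were labeled systematically. Of the 1,000 samples, 800 were used for model training, 100 for model validation, and 100 for model testing.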
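The 800/100/100 partition can be sketched as follows. The text does not say how samples were assigned to each subset, so the seeded random shuffle is an assumption, and `split_samples` is a hypothetical helper name.

```python
import random


def split_samples(sample_ids, n_train=800, n_val=100, n_test=100, seed=0):
    """Partition sample IDs into disjoint training, validation, and test sets.

    A seeded shuffle (an assumption, not stated in the text) keeps the
    split reproducible across runs.
    """
    ids = list(sample_ids)
    if n_train + n_val + n_test != len(ids):
        raise ValueError("subset sizes must add up to the number of samples")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


train_ids, val_ids, test_ids = split_samples(range(1000))
```

Keeping the test subset untouched until the end matters here because the reported accuracy and area under the ROC curve are stated for the test-set samples.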
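The procedure includes the following steps:
(1) Abnormal chromosome number: object detection algorithm
Marking of the original stained image; an original chromosome image is shown in Figure 1.
The method used is an object detection algorithm. After weighing the options, the yolov5 algorithm was selected for its speed, accuracy, and flexible configuration. yolov5 comes in several variants, including yolov5s, yolov5m, yolov5l, and yolov5x; after comparison, the yolov5x model was chosen.
Data input format: the image size is set to 896×896;
Method: (1) hyperparameter settings, including network parameters and training hyperparameters, with batch_size of 16 and 200 training epochs; (2) selection of different backbone networks as the feature extraction layer to improve the network structure, with yolov5x.pt as the pre-trained model; (3) selection of the loss function. The classification loss in the training stage uses binary cross-entropy loss (BCE loss); the complete loss function consists of three parts: bounding-box regression loss, confidence prediction loss, and class prediction loss.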
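For the final loss function, GIoU is used as the bounding-box regression loss, as follows:
In the above formula:
S denotes the grid size;
S×S denotes 13×13, 26×26, and 52×52;
B denotes the boxes;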
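The applicant's full loss formula is not reproduced in this text, so the sketch below covers only the GIoU term in its commonly used form: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest axis-aligned box enclosing boxes A and B, and the regression loss is 1 - GIoU. The function names and the `(x1, y1, x2, y2)` box layout are assumptions for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection is empty when the boxes do not overlap on either axis.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Bounding-box regression loss: 0 for a perfect match, up to 2 at worst."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, this loss still varies when the predicted and target boxes do not overlap at all, which gives the regression a useful signal early in training.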
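A comparison with manually annotated chromosomes is shown in Figures 2-3, where Figure 2 is an example of manual chromosome annotation and Figure 3 is an example of the algorithm's chromosome prediction.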
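The text does not spell out how a count abnormality is derived from the detections. One plausible reading is that the detected boxes in a metaphase spread are counted and compared with the normal human count of 46. The sketch below follows that reading; the detection dictionary layout, the function names, and the 0.5 confidence threshold are all hypothetical.

```python
NORMAL_COUNT = 46  # chromosomes in a normal human diploid cell


def count_chromosomes(detections, conf_threshold=0.5):
    """Count detections whose confidence reaches the (hypothetical) threshold."""
    return sum(1 for det in detections if det["confidence"] >= conf_threshold)


def check_count(detections, conf_threshold=0.5):
    """Return the chromosome count and whether it differs from 46."""
    n = count_chromosomes(detections, conf_threshold)
    return {"count": n, "abnormal": n != NORMAL_COUNT}


# Example: 47 confident detections plus two low-confidence ones.
example = [{"confidence": 0.9}] * 47 + [{"confidence": 0.2}] * 2
result = check_count(example)
```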
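(2) Chromosome classification and sorting
Instance segmentation is used, based on Mask R-CNN.
Data input format: the image size is set to 896×896;
Hyperparameter settings: learning rate 0.001, momentum hyperparameter 0.9, epoch set to 1000, batch_size set to 16;
Model selection: the backbone network uses the pre-trained model resnet101.
Loss function: the loss function used by the network is classification error + detection error + segmentation error; when computing the segmentation error, a binary sigmoid cross-entropy loss is applied to each pixel.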
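The loss composition described above can be illustrated with a small pure-Python sketch. It assumes the per-pixel losses are averaged over the mask and that the three terms are summed without weights, as the text's "+" suggests; the function names are hypothetical and this is not the Mask R-CNN implementation itself.

```python
import math


def pixel_bce_with_logits(logit, target):
    """Binary sigmoid cross-entropy for one pixel, in a numerically stable form.

    Equivalent to -(t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))).
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mask_loss(mask_logits, target_mask):
    """Mean per-pixel binary cross-entropy over a 2-D mask (lists of rows)."""
    losses = [
        pixel_bce_with_logits(z, t)
        for row_z, row_t in zip(mask_logits, target_mask)
        for z, t in zip(row_z, row_t)
    ]
    return sum(losses) / len(losses)


def total_loss(classification_error, detection_error, segmentation_error):
    """Unweighted sum of the three terms named in the text."""
    return classification_error + detection_error + segmentation_error
```

Because each pixel is scored independently with a sigmoid, masks for different chromosome classes do not compete with each other, which suits touching or overlapping chromosomes.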
Finally, after each chromosome instance has been segmented, the instances are grouped and sorted according to chromosome class.
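The grouping and sorting step can be sketched as below, assuming each segmented instance carries a predicted class label and that the output follows the usual karyogram order (1 to 22, then X and Y). The dictionary keys and the function name are hypothetical.

```python
KARYOGRAM_ORDER = [str(i) for i in range(1, 23)] + ["X", "Y"]


def arrange_karyogram(instances):
    """Group segmented instances by predicted class and order the groups.

    `instances` is a list of dicts with a 'label' key holding one of
    '1'..'22', 'X', or 'Y'. Returns (label, instances) pairs in karyogram
    order, skipping classes with no instance.
    """
    groups = {label: [] for label in KARYOGRAM_ORDER}
    for inst in instances:
        label = inst["label"]
        if label not in groups:
            raise ValueError("unknown chromosome class: %r" % (label,))
        groups[label].append(inst)
    return [(label, groups[label]) for label in KARYOGRAM_ORDER if groups[label]]


demo = arrange_karyogram([
    {"label": "X", "id": "c"},
    {"label": "2", "id": "b"},
    {"label": "1", "id": "a"},
    {"label": "1", "id": "d"},
])
```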
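The activation function of each convolutional layer is the rectified linear unit (ReLU):
$$f(x) = \max(0, x)$$
The Softmax function used for multi-class classification:
$$\mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$
The cross-entropy loss function, where $y_i$ is the true label indicator and $\hat{y}_i$ the predicted probability for class $i$:
$$L = -\sum_{i} y_i \log \hat{y}_i$$
The formula by which a convolutional layer computes its output feature map:
$$Z_k = f(W_k * X)$$
where $X$ denotes the input image, $Z_k$ is the $k$-th output feature map, $W_k$ is the weight of the $k$-th feature map, $*$ is the two-dimensional convolution operator, and $f(\cdot)$ denotes the nonlinear activation function.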
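The four building blocks named above (ReLU, Softmax, cross-entropy, and a convolutional feature map) can be written out in a few lines of plain Python. This is an illustrative sketch with hypothetical function names. As in common deep learning frameworks, the "convolution" is computed as a cross-correlation (no kernel flip), without padding or bias.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes the rest."""
    return max(0.0, x)


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def feature_map(image, kernel):
    """Z_k = f(W_k * X) with f = ReLU applied element-wise."""
    return [[relu(v) for v in row] for row in conv2d_valid(image, kernel)]


def softmax(scores):
    """Softmax with the maximum subtracted first for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Cross-entropy for a one-hot label: minus log of the true-class probability."""
    return -math.log(probs[true_index])
```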
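A schematic of feature extraction is shown in Figure 4.
The classification and sorting results are shown in Figures 5-7, where Figure 5 is an example of an original chromosome image, Figure 6 is an example of a manually arranged karyogram, and Figure 7 is an example of the karyogram predicted by the algorithm.
(3) Recognition of balanced chromosomal translocations
Data input format: for each chromosome group, normal and abnormal chromosomes are divided into two classes for input; each image is padded with 0 around its border to a size of 256×256.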
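The zero-padding step can be sketched as follows. The text only says that the image is padded with 0 around it to 256×256; centering the crop on the canvas is an assumption, and `pad_to_square` is a hypothetical helper operating on a grayscale image stored as a list of rows.

```python
def pad_to_square(image, size=256, fill=0):
    """Place a 2-D image at the center of a size x size canvas filled with `fill`."""
    height, width = len(image), len(image[0])
    if height > size or width > size:
        raise ValueError("image is larger than the target canvas")
    top = (size - height) // 2
    left = (size - width) // 2
    canvas = [[fill] * size for _ in range(size)]
    for r in range(height):
        # Overwrite the matching slice of the canvas row with the image row.
        canvas[top + r][left:left + width] = list(image[r])
    return canvas


padded = pad_to_square([[1, 2], [3, 4]], size=4)
```

Padding rather than resizing keeps the relative lengths of the chromosomes intact, which may matter when the classifier compares a normal chromosome with a translocated one.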
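Model selection: the base model is chosen from Vgg-16, Vgg-19, resnet-101, senet, and so on.
Hyperparameter settings: the learning rate is set to 0.001, training runs for 100 epochs, and batch_size can be set to 32 or 64.
This yields the DM-K core model.
Training process: the original image is input for inference to obtain a score; binary cross-entropy is then used as the loss function, and the parameters are updated by backpropagation.
The cross-entropy loss function, where $y \in \{0, 1\}$ is the true label and $\hat{y}$ the predicted probability:
$$L = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right]$$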
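The training loop described above (forward pass to a score, binary cross-entropy, gradient update) can be illustrated on a toy model. In this sketch a single linear scorer stands in for the convolutional base model, and plain gradient descent stands in for whatever optimizer was actually used; both are assumptions, as are the function names. Only the learning rate of 0.001 comes from the text.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-12):
    """BCE for one sample; the probability is clipped to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def train_step(weights, bias, features, label, lr=0.001):
    """One forward pass, loss evaluation, and gradient-descent update."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    prob = 1.0 / (1.0 + math.exp(-score))   # sigmoid turns the score into a probability
    loss = binary_cross_entropy(prob, label)
    grad_score = prob - label               # dL/dscore for sigmoid followed by BCE
    new_weights = [w - lr * grad_score * x for w, x in zip(weights, features)]
    new_bias = bias - lr * grad_score
    return new_weights, new_bias, loss


w, b = [0.0, 0.0], 0.0
w, b, first_loss = train_step(w, b, [1.0, 2.0], 1)
for _ in range(200):
    w, b, last_loss = train_step(w, b, [1.0, 2.0], 1)
```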
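After the above models have been built, each is validated and tested, forming the chromosome karyotype analysis system.
Integrating the above techniques turns the intelligent karyotype analysis system into a practical application.
For analysis, chromosome banding image data are input into the above models, and an analysis report is obtained.
The karyotype analysis module of this example reaches an accuracy of up to 80% on the test-set samples, with an area under the ROC curve of 0.98, indicating that the module of the present application has good diagnostic performance and is suitable for clinical diagnosis.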
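The two reported metrics measure different things, which a short sketch makes concrete. The functions below are generic textbook definitions with hypothetical names, not the applicant's evaluation code; the area under the ROC curve is computed through its rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
acc = accuracy(labels, [1 if s >= 0.5 else 0 for s in scores])
```

Accuracy depends on the decision threshold applied to the scores, whereas the area under the ROC curve summarizes ranking quality across all thresholds, so the two figures need not move together.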
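Example 2: Chromosome sorting performance
Sorting performance was tested using the sorting method described in Example 1.
On the basis of localization, instance segmentation based on Mask R-CNN is used. For the final training, the image size is set to 896×896, the learning rate to 0.001, the momentum hyperparameter to 0.9, epoch to 1000, and batch_size to 16, and the backbone network uses the pre-trained model resnet101.
The loss function used by the network is classification error + detection error + segmentation error; when computing the segmentation error, a binary sigmoid cross-entropy loss is applied to each pixel.
Results: on the 100 cases used for model testing, with the physician's judgment as the standard, the sorting accuracy of the present invention was 20% higher than that of Leica CytoVision, a significant difference, indicating that the sorting of the present invention performs better.
Example 3: Recognition of balanced chromosomal translocations
The recognition performance of the present application for balanced chromosomal translocations was tested using the method described in Example 1.
An image classification approach is used, relying mainly on transfer learning. The base model is chosen from Vgg-16, Vgg-19, resnet-101, senet, and so on; the learning rate is set to 0.001; training runs for 100 epochs; batch_size can be set to 32 or 64; and each image is padded with 0 around its border to a size of 256×256. This yields the DM-K core model.
Specific process: the original image is input for inference to obtain a score; binary cross-entropy is then used as the loss function, and the parameters are updated by backpropagation.
About balanced translocations:
When two or more chromosomes of an individual break and the fragments are exchanged (translocation), generally with no loss of genetic material, or with too little loss to change the individual's phenotype (physical appearance), the translocation is clinically called a balanced translocation (Figure 8).
Individuals carrying the structural abnormality of a balanced chromosomal translocation are called balanced translocation carriers. Their phenotype is normal, but they can pass the structurally abnormal chromosomes on to their offspring, causing miscarriage, fetal death, neonatal death, or the birth of children with malformations. In the general population, the incidence of balanced translocation carriers is about 1/200-1/400.
The detection rate of the present application for balanced chromosomal translocations is 85%.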
Claims (9)
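- A chromosome karyotype analysis module, characterized in that it comprises a hardware part and a data analysis program part; the data analysis program part includes data analysis models, which are used respectively for marking the original stained image, chromosome localization, chromosome sorting, and recognition of balanced chromosomal translocations; the chromosome sorting model is based on Mask R-CNN, with the image size set to 896×896, a learning rate of 0.001, a momentum hyperparameter of 0.9, epoch set to 1000, batch_size set to 16, and the pre-trained model resnet101 as the backbone network.
- The chromosome karyotype analysis module according to claim 1, characterized in that the parameters of the chromosome localization model are: batch_size of 16 and epoch of 200; base model yolov5x.pt; GIoU is used as the bounding-box regression loss, and the complete loss function, which uses binary cross-entropy loss, consists of three parts: bounding-box regression loss, confidence prediction loss, and class prediction loss.
- The chromosome karyotype analysis module according to claim 1, characterized in that the recognition model for balanced chromosomal translocations is the DM-K core model.
- The chromosome karyotype analysis module according to claim 3, characterized in that the parameters of the DM-K core model are: the learning rate is set to 0.001, training runs for 100 epochs, batch_size can be set to 32 or 64, and the loss function is cross-entropy.
- The chromosome karyotype analysis module according to claim 1, characterized in that the loss function used by the chromosome sorting network is classification error + detection error + segmentation error, and when computing the segmentation error, a binary sigmoid cross-entropy loss is applied to each pixel.
- The chromosome karyotype analysis module according to claim 1, characterized in that the sample type is metaphase spreads at the 400-550G band level.
- A chromosome karyotype analysis system, characterized in that it comprises the chromosome karyotype analysis module according to any one of claims 1-6.
- The chromosome karyotype analysis system according to claim 7, characterized in that it further comprises a data acquisition module, which is used to collect chromosome banding data.
- The chromosome karyotype analysis system according to claim 7, characterized in that it further comprises a report generation module, which is used to generate an analysis report from the analysis results of the chromosome karyotype analysis module.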
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210691618.1 | 2022-06-17 | ||
CN202210691618.1A CN115188413A (zh) | 2022-06-17 | 2022-06-17 | Chromosome karyotype analysis module |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023240820A1 (zh) | 2023-12-21 |
Family
ID=83514205
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/120115 WO2023240820A1 (zh) | 2022-06-17 | 2022-09-21 | Chromosome karyotype analysis module |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115188413A (zh) |
WO (1) | WO2023240820A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274294B (zh) * | 2023-09-18 | 2024-06-04 | 笑纳科技(苏州)有限公司 | A homologous chromosome segmentation method |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
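CN110533672A (zh) * | 2019-08-22 | 2019-12-03 | 杭州德适生物科技有限公司 | A chromosome sorting method based on band recognition |
WO2020168511A1 (zh) * | 2019-02-21 | 2020-08-27 | 中国医药大学附设医院 | Chromosome abnormality detection model, detection system thereof, and chromosome abnormality detection method |
CN112052813A (zh) * | 2020-09-15 | 2020-12-08 | 中国人民解放军军事科学院军事医学研究院 | Method and apparatus for identifying interchromosomal translocations, electronic device, and readable storage medium |
CN112288706A (zh) * | 2020-10-27 | 2021-01-29 | 武汉大学 | An automated chromosome karyotype analysis and abnormality detection method |
CN113223614A (zh) * | 2021-05-31 | 2021-08-06 | 上海澜澈生物科技有限公司 | A chromosome karyotype analysis method, system, terminal device, and storage medium |
CN114026644A (zh) * | 2019-03-28 | 2022-02-08 | 相位基因组学公司 | Systems and methods for karyotyping by sequencing |
CN114026647A (zh) * | 2019-04-12 | 2022-02-08 | 欧洲分子生物学实验室 | Comprehensive detection of single-cell genetic structural variations |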
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
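CN106934392B (zh) * | 2017-02-28 | 2020-05-26 | 西交利物浦大学 | Vehicle logo recognition and attribute prediction method based on a multi-task learning convolutional neural network |
CN107358626B (zh) * | 2017-07-17 | 2020-05-15 | 清华大学深圳研究生院 | A method for computing disparity using a conditional generative adversarial network |
CN108230338B (zh) * | 2018-01-11 | 2021-09-28 | 温州大学 | A stereo image segmentation method based on a convolutional neural network |
CN108918532B (zh) * | 2018-06-15 | 2021-06-11 | 长安大学 | A rapid road traffic sign damage detection system and detection method thereof |
CN113658174B (zh) * | 2021-09-02 | 2023-09-19 | 北京航空航天大学 | Micronucleomics image detection method based on deep learning and image processing algorithms |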
-
2022
- 2022-06-17 CN CN202210691618.1A patent/CN115188413A/zh active Pending
- 2022-09-21 WO PCT/CN2022/120115 patent/WO2023240820A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN115188413A (zh) | 2022-10-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22946493; Country of ref document: EP; Kind code of ref document: A1 |