CN112102952A

CN112102952A - Method for identifying pathological category based on distance calculation method and related device

Info

Publication number: CN112102952A
Application number: CN202010857223.5A
Authority: CN
Inventors: 车拴龙; 余霆嵩; 罗丕福; 卢芳; 李学锋; 刘斯; 刘莹; 林万里
Original assignee: Guangzhou Kingmed Diagnostics Group Co ltd; Guangzhou Kingmed Diagnostics Central Co Ltd
Current assignee: Guangzhou Kingmed Diagnostics Group Co ltd; Guangzhou Kingmed Diagnostics Central Co Ltd
Priority date: 2020-08-24
Filing date: 2020-08-24
Publication date: 2020-12-18
Anticipated expiration: 2040-08-24
Also published as: CN112102952B

Abstract

An embodiment of the present invention discloses a method for identifying a pathological category based on a distance calculation method. The method comprises: acquiring characteristic parameters to be diagnosed, which include N characteristic parameters corresponding to N kinds of features; acquiring preset standard parameters, which include M×N standard parameters of the N kinds of features for each of M pathological categories; using a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathological category, obtaining M predicted distances; and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances. By calculating and comparing the characteristic parameters to be diagnosed against the standard parameters of each pathological category one by one, the method automates pathological diagnosis and improves its objectivity and accuracy. A system, computer device and storage medium for identifying pathological categories based on the distance calculation method are also proposed.

Description

Method and related equipment for identifying pathological categories based on distance calculation method

Technical field

The invention relates to the field of computer technology, and in particular to a method and related equipment for identifying a pathological category based on a distance calculation method.

Background

Pathological diagnosis studies the causes and pathogenesis of disease, the changes in morphological structure, function and metabolism of the diseased body during the disease process, and the outcome of the disease, thereby providing the theoretical and practical basis for diagnosis, treatment and prevention. Among the various tumor examination methods, pathological diagnosis is the most reliable; it is regarded as the "gold standard" and serves as the final diagnosis of a disease.

From clinical symptoms to traditional HE pathological morphology, many lesions resemble one another and are easily confused across a variety of diseases, both benign and malignant. A misdiagnosis can cause a serious medical accident. Diagnosing a disease therefore requires differential diagnosis against many similar lesions. In neoplastic diseases, for example, various proteins are expressed to different degrees and many different genes are mutated. The status of these proteins and genes is referred to as the characteristic parameters, and the neoplastic disease itself is referred to as the resulting phenomenon. What a doctor can detect most directly are the characteristic parameters. The doctor must comprehensively analyze the expression of the characteristic parameters and integrate personal experience with knowledge from books and literature: after reviewing the test reports and imaging reports together with the medical history and clinical data, the doctor selects N more likely lesions and then, through comprehensive analysis and consultation, arrives at the most probable diagnosis, that is, a prediction of the resulting phenomenon. However, predicting the resulting phenomenon from personal experience of varying levels is heavily influenced by individual subjectivity. In addition, interpretations of the resulting phenomenon differ considerably between medical institutions and between doctors. This reduces the objectivity of pathological category identification, affects the quality of diagnosis and treatment to a certain extent, lowers the accuracy of pathological category identification, and leads to a high rate of missed pathological diagnoses.

Summary of the invention

In view of the above problems, it is necessary to provide a method, system, computer device and storage medium for identifying pathological categories based on a distance calculation method, so as to improve the objectivity and accuracy of pathological diagnosis.

A method for identifying a pathological category based on a distance calculation method, the method comprising:

acquiring characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number;

acquiring preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features for each of M pathological categories, and M is a natural number;

using a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances;

determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

A system for identifying pathological categories based on a distance calculation method, the system comprising:

a first parameter acquisition module, configured to acquire characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number;

a second parameter acquisition module, configured to acquire preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features for each of M pathological categories, and M is a natural number;

a calculation module, configured to use a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances;

a diagnosis module, configured to determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

A computer device, comprising a memory and a processor, where the memory stores a computer program and, when the computer program is executed by the processor, the processor performs the following steps:

A computer-readable storage medium storing a computer program, where, when the computer program is executed by a processor, the processor performs the following steps:

The above method, system, computer device and storage medium for identifying pathological categories based on a distance calculation method acquire the characteristic parameters to be diagnosed; acquire preset standard parameters; use a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances; and determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances. Performing pathological diagnosis with a variety of distance-based calculation methods improves the accuracy and objectivity of pathological diagnosis.

Description of drawings

To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

In the drawings:

FIG. 1 is a flowchart of a method for identifying a pathological category based on a distance calculation method in one embodiment;

FIG. 2 is a flowchart of a method for calculating a predicted distance in one embodiment;

FIG. 3 is a flowchart of a method for calculating a predicted distance in another embodiment;

FIG. 4 is a flowchart of a method for determining a pathological category in one embodiment;

FIG. 5 is a structural block diagram of a system for identifying a pathological category based on a distance calculation method in one embodiment;

FIG. 6 is a structural block diagram of a computer device in one embodiment.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

As shown in FIG. 1, in one embodiment, a method for identifying a pathological category based on a distance calculation method is provided. The method can be applied to either a terminal or a server; this embodiment takes application to a server as an example. The method specifically includes the following steps:

Step 102: acquire characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number.

The characteristic parameters to be diagnosed are parameters that reflect the pathological features of the pathological section to be diagnosed, and they include multiple characteristic parameters corresponding to multiple kinds of features. In a specific embodiment, the pathological section to be diagnosed is an epithelial ovarian malignant tumor, and its seven characteristic parameters may be the values corresponding to Pax-8, WT-1, CA125, P53, CEA, ER and PVHL. Specifically, the characteristic parameters to be diagnosed can be obtained by analyzing the pathological section with a pathological analysis instrument.

Step 104: acquire preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features for each of M pathological categories, and M is a natural number.

The preset standard parameters are set according to the magnitude or range of the N characteristic parameters under each pathological category. The standard parameters correspond one-to-one with the characteristic parameters to be diagnosed, that is, each pathological category has N standard parameters, so M pathological categories have M×N standard parameters. Continuing the epithelial ovarian malignant tumor example of step 102, which has seven characteristic parameters, the possible pathological categories are serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma. Each pathological category has N standard parameters. For example, for serous adenocarcinoma the seven standard parameters, namely the values for Pax-8, WT-1, CA125, P53, CEA, ER and PVHL, are 95%, 95%, 95%, 95%, 95%, 75% and 5%, respectively.
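One possible in-memory layout for the M×N standard parameters is sketched below. The names `FEATURES`, `CATEGORIES`, `standard` and `standard_for` are illustrative assumptions and do not come from the patent. Only the serous adenocarcinoma row is given in the text, so the remaining rows are left as NaN placeholders rather than filled with invented values.

```python
import numpy as np

# Feature order follows the text (N = 7).
FEATURES = ["Pax-8", "WT-1", "CA125", "P53", "CEA", "ER", "PVHL"]

# Pathological categories named in the text (M = 5).
CATEGORIES = [
    "serous adenocarcinoma",
    "mucinous adenocarcinoma",
    "endometrioid adenocarcinoma",
    "clear cell adenocarcinoma",
    "metastatic adenocarcinoma",
]

# M x N table of standard parameters, stored as fractions in [0, 1].
# Only the first row is stated in the text; the other rows stay NaN
# here because their values are not given.
standard = np.full((len(CATEGORIES), len(FEATURES)), np.nan)
standard[0] = [0.95, 0.95, 0.95, 0.95, 0.95, 0.75, 0.05]


def standard_for(category: str) -> np.ndarray:
    """Return the N standard parameters of one category (hypothetical helper)."""
    return standard[CATEGORIES.index(category)]


print(standard.shape)
print(dict(zip(FEATURES, standard_for("serous adenocarcinoma"))))
```

Storing the table as one array keeps each category's N standard parameters aligned column by column with the N characteristic parameters to be diagnosed, which is what the one-to-one correspondence in step 104 requires.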

Step 106: use a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances.

The preset distance calculation method is a predetermined way of quantifying how similar the characteristic parameters to be diagnosed are to the standard parameters. It may be one or more of the following measures: Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity and/or Pearson correlation coefficient. The choice can be made according to the properties of each measure and the properties of the standard parameters and/or the characteristic parameters to be diagnosed. The predicted distance is the quantified degree of similarity between the characteristic parameters to be diagnosed and the standard parameters of each pathological category. Specifically, the N characteristic parameters to be diagnosed are compared with the N standard parameters of each of the M pathological categories using the preset distance calculation method, yielding M predicted distances. Calculating the predicted distance in this way quantifies the similarity between the characteristic parameters to be diagnosed and the standard parameters, which makes the calculation more objective. Because the preset distance calculation method can draw on several distance measures, the accuracy of the predicted distance is also improved.

Step 108: determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

Specifically, the pathological category is determined from the values of the M predicted distances and from whether the predicted distance is positively or inversely correlated with the degree of similarity. For example, suppose the five pathological categories serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma have predicted distances of 0.7, 0.5, 0.4, 0.4 and 0.5, and the predicted distance is positively correlated with the degree of similarity, that is, the larger the predicted distance, the higher the similarity. The pathological category corresponding to the characteristic parameters to be diagnosed is then the one with the predicted distance of 0.7, namely serous adenocarcinoma. By calculating and comparing the characteristic parameters to be diagnosed against the standard parameters of each pathological category one by one, pathological diagnosis is automated and its objectivity and accuracy are ensured.
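Steps 106 and 108 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses the Euclidean distance as the preset method, and the function names, the `larger_is_more_similar` flag and the two-category toy data are all hypothetical. The first call reproduces the five-category example from the text.

```python
import numpy as np


def predicted_distances(sample, standard):
    """Step 106: Euclidean distance from the sample to each category's standard row."""
    sample = np.asarray(sample, dtype=float)      # shape (N,)
    standard = np.asarray(standard, dtype=float)  # shape (M, N)
    return np.sqrt(((standard - sample) ** 2).sum(axis=1))  # shape (M,)


def pick_category(distances, categories, larger_is_more_similar):
    """Step 108: choose the category by the size of its predicted distance."""
    distances = np.asarray(distances, dtype=float)
    idx = np.argmax(distances) if larger_is_more_similar else np.argmin(distances)
    return categories[int(idx)]


# Example from the text: the predicted distance is positively correlated
# with similarity, so the largest value wins.
CATEGORIES = [
    "serous adenocarcinoma",
    "mucinous adenocarcinoma",
    "endometrioid adenocarcinoma",
    "clear cell adenocarcinoma",
    "metastatic adenocarcinoma",
]
chosen = pick_category([0.7, 0.5, 0.4, 0.4, 0.5], CATEGORIES,
                       larger_is_more_similar=True)
print(chosen)  # serous adenocarcinoma

# Hypothetical two-category, three-feature data. Euclidean distance is
# inversely related to similarity, so the smallest distance wins.
toy_standard = [[0.9, 0.9, 0.1], [0.1, 0.2, 0.8]]
toy = predicted_distances([0.8, 0.9, 0.2], toy_standard)
toy_choice = pick_category(toy, ["category A", "category B"],
                           larger_is_more_similar=False)
print(toy_choice)  # category A
```

Making the direction of the comparison an explicit argument reflects the text's point that the decision depends on whether the chosen measure rises or falls with similarity.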

In the above method for identifying pathological categories based on a distance calculation method, characteristic parameters to be diagnosed are acquired, which include N characteristic parameters corresponding to N kinds of features; preset standard parameters are acquired, which include M×N standard parameters of the N kinds of features for each of M pathological categories; a preset distance calculation method is used to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathological category, giving M predicted distances; and the pathological category corresponding to the characteristic parameters to be diagnosed is determined according to the magnitudes of the M predicted distances. Using a variety of distance-based calculation methods to compare the characteristic parameters to be diagnosed against the standard parameters of each pathological category one by one automates pathological diagnosis and improves its objectivity and accuracy.

As shown in FIG. 2, in one embodiment, using a preset distance calculation method to calculate the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances, includes:

Step 106A: calculate the first distance and/or the second distance between each characteristic parameter and the corresponding standard parameters of the M pathological categories, to obtain M×N feature distances;

Step 106B: fuse the N feature distances corresponding to each pathological category, to obtain M predicted distances.

The first distance and the second distance are two types of distance, classified by how they correlate with the degree of similarity; the two types correlate in opposite directions. For example, if the first distance is inversely correlated with the degree of similarity, then the second distance is positively correlated with it. A feature distance is the distance between a single characteristic parameter and its corresponding standard parameter; it may likewise be one or more of the Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity and/or Pearson correlation coefficient. Specifically, the distance between each of the N characteristic parameters and the corresponding standard parameters of the M pathological categories is calculated as a first distance and/or a second distance, giving M×N feature distances; the N feature distances of each pathological category are then fused to give M predicted distances. Fusion here means combining multiple indicators in one calculation: for example, a weight can be set for each indicator according to its importance and a weighted sum calculated, or the indicators can be fused adaptively according to preset rules. In this embodiment, fusing the feature distances takes into account the influence of every feature distance on the predicted distance, which further ensures that the predicted distance is determined accurately.
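Steps 106A and 106B can be illustrated with the sketch below. It makes two assumptions that the text leaves open: the per-feature distance is the absolute difference |x_i − y_i| (which is what the Euclidean, Manhattan and Chebyshev distances all reduce to for a single scalar), and fusion is a plain average. The function names and the toy numbers are hypothetical.

```python
import numpy as np


def feature_distances(sample, standard):
    """Step 106A: M x N matrix of per-feature distances |x_i - y_ji|."""
    sample = np.asarray(sample, dtype=float)      # shape (N,)
    standard = np.asarray(standard, dtype=float)  # shape (M, N)
    return np.abs(standard - sample)              # broadcasts over the M rows


def fuse_mean(feature_dist):
    """Step 106B: fuse the N feature distances of each category by averaging."""
    return np.asarray(feature_dist, dtype=float).mean(axis=1)  # shape (M,)


# Hypothetical data: M = 2 categories, N = 3 features.
sample = [0.8, 0.9, 0.2]
standard = [[0.9, 0.9, 0.1],
            [0.1, 0.2, 0.8]]

fd = feature_distances(sample, standard)
fused = fuse_mean(fd)
print(fd.shape)  # (2, 3)
print(fused)
```

Averaging treats every feature as equally important; the weighted variant described in the text replaces the mean with a weighted sum.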

As shown in FIG. 3, in one embodiment, fusing the N feature distances corresponding to each pathological category to obtain M predicted distances includes:

Step 106B1: acquire the preset weight corresponding to each feature distance;

Step 106B2: perform a weighted calculation on the N feature distances corresponding to each pathological category and their preset weights, to obtain M predicted distances.

Specifically, a preset weight is first determined for each feature distance; the weight can be set according to how strongly that feature distance affects the correctness of the pathological diagnosis. A weighted calculation is then performed on the N feature distances of each pathological category and their preset weights, giving M predicted distances. In this embodiment the predicted distance is obtained by a weighted calculation, a fusion method that is simple and fast, which speeds up the calculation of the predicted distance.
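The weighted calculation of step 106B2 can be written compactly as a weighted sum. The symbols below are introduced for illustration and do not appear in the original: d_{j,i} denotes the feature distance of the i-th feature for the j-th pathological category, w_{j,i} its preset weight, and D_j the predicted distance of the j-th category.

$$D_j = \sum_{i=1}^{N} w_{j,i}\, d_{j,i}, \qquad j = 1, \dots, M$$

If the same weight is used for a given feature across all categories, w_{j,i} reduces to w_i. The text does not say whether the weights are normalized, so no normalization is assumed here.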

In one embodiment, the first distance is at least one of the Euclidean distance, Minkowski distance, Manhattan distance and Chebyshev distance, and the second distance is the cosine similarity and/or the Pearson correlation coefficient.

The Euclidean distance measures the absolute distance between points in a multidimensional space. Its formula is:

$$\mathrm{dist}(X_1, Y_1) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$

where dist(X₁,Y₁) is the Euclidean distance, x_i is the i-th characteristic parameter, and y_i is the i-th standard parameter corresponding to the i-th characteristic parameter.

The Minkowski distance is a generalization of the Euclidean distance and a general expression covering several distance formulas. Its formula is:

$$\mathrm{dist}(X_2, Y_2) = \left( \sum_{i=1}^{N} |x_i - y_i|^p \right)^{1/p}$$

where dist(X₂,Y₂) is the Minkowski distance, x_i is the i-th characteristic parameter, y_i is the i-th standard parameter corresponding to the i-th characteristic parameter, and p is a constant.

The Manhattan distance derives from the city block distance and is the sum of the distances along each dimension. Its formula is:

$$\mathrm{dist}(X_3, Y_3) = \sum_{i=1}^{N} |x_i - y_i|$$

where dist(X₃,Y₃) is the Manhattan distance, x_i is the i-th characteristic parameter, and y_i is the i-th standard parameter corresponding to the i-th characteristic parameter.

The Chebyshev distance is a metric on a vector space in which the distance between two points is defined as the maximum absolute difference between their coordinates. Its formula is:

$$\mathrm{dist}(X_4, Y_4) = \max_{1 \le i \le N} |x_i - y_i|$$

where dist(X₄,Y₄) is the Chebyshev distance, x_i is the i-th characteristic parameter, and y_i is the i-th standard parameter corresponding to the i-th characteristic parameter.

The Euclidean, Minkowski, Manhattan and Chebyshev distances are negatively correlated with the degree of similarity, that is, the larger the first distance, the lower the similarity.

The cosine similarity uses the cosine of the angle between two vectors in a vector space to measure the difference between two individuals. Its formula is:

$$\mathrm{sim}(X, Y) = \frac{\sum_{i=1}^{N} x_i y_i}{\sqrt{\sum_{i=1}^{N} x_i^2}\,\sqrt{\sum_{i=1}^{N} y_i^2}}$$

where sim(X,Y) is the cosine similarity, x_i is the i-th characteristic parameter, and y_i is the i-th standard parameter corresponding to it.

The Pearson correlation coefficient measures whether two data sets lie on a line, that is, the linear relationship between interval variables. Its formula is:

$$r(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

where r(X,Y) is the Pearson correlation coefficient, x_i is the i-th characteristic parameter, y_i is the i-th standard parameter corresponding to it, and x̄ and ȳ are the means of the characteristic parameters and of the standard parameters, respectively.

The cosine similarity and the Pearson correlation coefficient are positively correlated with the degree of similarity, that is, the larger the second distance, the higher the similarity. The first distance and the second distance each suit different measurement scenarios, so an appropriate first distance or second distance is selected according to the application scenario of the characteristic parameters to be diagnosed, to further improve the accuracy of pathological diagnosis.
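The six measures named above can be sketched with NumPy as follows. This is an illustrative sketch using the standard textbook definitions of each measure; the function names and the sample vectors are assumptions, not taken from the patent.

```python
import numpy as np


def _as_arrays(x, y):
    """Convert both inputs to float arrays of equal length."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    return x, y


def euclidean(x, y):
    x, y = _as_arrays(x, y)
    return float(np.sqrt(np.sum((x - y) ** 2)))


def minkowski(x, y, p):
    x, y = _as_arrays(x, y)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))


def manhattan(x, y):
    x, y = _as_arrays(x, y)
    return float(np.sum(np.abs(x - y)))


def chebyshev(x, y):
    x, y = _as_arrays(x, y)
    return float(np.max(np.abs(x - y)))


def cosine_similarity(x, y):
    # Undefined when either vector is all zeros (division by zero).
    x, y = _as_arrays(x, y)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def pearson(x, y):
    # Undefined when either vector is constant (zero variance).
    x, y = _as_arrays(x, y)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))


# Hypothetical vectors: y is exactly 2 * x.
x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(euclidean(x, y), manhattan(x, y), chebyshev(x, y))
print(minkowski(x, y, 3), cosine_similarity(x, y), pearson(x, y))
```

With p = 1 the Minkowski distance equals the Manhattan distance, and with p = 2 it equals the Euclidean distance, which is the sense in which the text calls it a generalization.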

In one embodiment, acquiring the preset weight corresponding to each feature distance includes: when the feature distance is a first distance, the preset weight is negative; when the feature distance is a second distance, the preset weight is positive.

Specifically, because the first distance is negatively correlated with the degree of similarity, its preset weight is set to a negative number; because the second distance is positively correlated with the degree of similarity, its preset weight is set to a positive number. The predicted distance calculated with these preset weights is therefore positively correlated with the degree of similarity, which makes it convenient to perform pathological diagnosis from the predicted distance and further improves the efficiency of pathological diagnosis.
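The sign convention can be sketched as below. The `kinds` labels, the weight magnitudes and the toy feature distances are hypothetical; the only rule taken from the text is that first-type distances receive negative weights and second-type distances receive positive weights.

```python
import numpy as np


def signed_weights(kinds, magnitudes):
    """Negative weight for a 'first' distance, positive for a 'second' distance."""
    for kind in kinds:
        if kind not in ("first", "second"):
            raise ValueError(f"unknown distance kind: {kind!r}")
    signs = np.array([-1.0 if kind == "first" else 1.0 for kind in kinds])
    return signs * np.abs(np.asarray(magnitudes, dtype=float))


def weighted_predicted_distances(feature_dist, kinds, magnitudes):
    """Weighted sum of each category's feature distances, shape (M,)."""
    return np.asarray(feature_dist, dtype=float) @ signed_weights(kinds, magnitudes)


# Hypothetical values for two categories: column 0 holds a first-type
# distance, column 1 holds a second-type similarity.
feature_dist = [[0.2, 0.99],   # category A: close and well aligned
                [1.5, 0.40]]   # category B: far and poorly aligned
scores = weighted_predicted_distances(feature_dist,
                                      kinds=["first", "second"],
                                      magnitudes=[0.5, 0.5])
print(scores)
```

Because every term now rises with similarity, the category with the largest fused value is the best match regardless of which mix of measures was used.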

In one embodiment, fusing the N feature distances corresponding to each pathological category to obtain M predicted distances includes: when the feature distance is a first distance, the predicted distance is negatively correlated with the feature distance; when the feature distance is a second distance, the predicted distance is positively correlated with the feature distance.

As shown in FIG. 4, in one embodiment, determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances includes:

Step 108A: determine the share of each predicted distance in the M predicted distances as the probability of the corresponding pathological category;

Step 108B: determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the probability of each pathological category.

In this embodiment, the ratio of each predicted distance to the sum of the M predicted distances is calculated and taken as the probability of the corresponding pathological category, and the pathological category corresponding to the characteristic parameters to be diagnosed is determined from these probabilities. For example, if the five pathological categories serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma have predicted distances of 0.7, 0.5, 0.4, 0.4 and 0.5, the probabilities of the categories are 28%, 20%, 16%, 16% and 20%, respectively. Serous adenocarcinoma, with a probability of 28%, is therefore the pathological category corresponding to the characteristic parameters to be diagnosed. Using the share of each predicted distance in the M predicted distances as the probability of the corresponding pathological category is simple and fast to calculate, and it improves both the accuracy and the efficiency of pathological diagnosis.
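The arithmetic of steps 108A and 108B for the example above can be checked with a short sketch. The function name is an assumption. The calculation is only meaningful when the predicted distances are non-negative and positively correlated with similarity, as in the example.

```python
def category_probabilities(distances):
    """Step 108A: share of each predicted distance in the total."""
    total = sum(distances)
    if total == 0:
        raise ValueError("predicted distances sum to zero")
    return [d / total for d in distances]


CATEGORIES = [
    "serous adenocarcinoma",
    "mucinous adenocarcinoma",
    "endometrioid adenocarcinoma",
    "clear cell adenocarcinoma",
    "metastatic adenocarcinoma",
]

# Predicted distances from the example in the text; their sum is 2.5.
probs = category_probabilities([0.7, 0.5, 0.4, 0.4, 0.5])

# Step 108B: the category with the highest probability is selected.
best = CATEGORIES[probs.index(max(probs))]

print([f"{p:.0%}" for p in probs])  # ['28%', '20%', '16%', '16%', '20%']
print(best)                         # serous adenocarcinoma
```

Dividing by the total does not change the ranking of the categories; it only rescales the predicted distances so that they sum to 1 and can be reported as probabilities.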

As shown in FIG. 5, in one embodiment, a system for identifying pathological categories based on a distance calculation method is provided, the system including:

a first parameter acquisition module 502, configured to acquire characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number;

a second parameter acquisition module 504, configured to acquire preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features corresponding to each of M pathological categories, and M is a natural number;

a calculation module 506, configured to calculate, using a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances; and

a diagnosis module 508, configured to determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

In one embodiment, the calculation module includes:

a distance calculation unit, configured to calculate a first distance and/or a second distance between each characteristic parameter and the corresponding standard parameters of the M pathological categories, to obtain M×N characteristic distances; and

a distance fusion unit, configured to perform fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances.

In one embodiment, the distance fusion unit includes:

a weight acquisition subunit, configured to acquire the preset weight corresponding to each characteristic distance; and

a fusion calculation subunit, configured to perform a weighted calculation on the N characteristic distances corresponding to each pathological category and the corresponding preset weights, to obtain the M predicted distances.
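The weighted fusion performed by these subunits might look like the following sketch. It assumes the weighted calculation is a plain weighted sum and that one preset weight is shared by all categories for a given feature; the text fixes neither detail, and the function name and numeric values are hypothetical.

```python
from typing import List, Sequence


def fuse_distances(characteristic_distances: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Fuse an M x N table of characteristic distances into M predicted distances."""
    predicted = []
    for row in characteristic_distances:
        if len(row) != len(weights):
            raise ValueError("each category needs exactly one distance per weight")
        # Weighted sum of the N characteristic distances of this category.
        predicted.append(sum(w * d for w, d in zip(weights, row)))
    return predicted


# Hypothetical values: M = 2 pathological categories, N = 3 features.
characteristic_distances = [
    [0.9, 0.8, 0.7],  # category 1
    [0.2, 0.4, 0.1],  # category 2
]
weights = [0.5, 0.3, 0.2]  # illustrative preset weights, one per feature

predicted_distances = fuse_distances(characteristic_distances, weights)
```

Here `predicted_distances` holds one fused value per category, which the diagnosis module would then compare.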

In one embodiment, the diagnosis module includes:

a probability calculation unit, configured to determine the proportion of each predicted distance among the M predicted distances as the probability of the corresponding pathological category; and

a pathological diagnosis unit, configured to determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the probability of each pathological category.

FIG. 6 shows the internal structure of a computer device in one embodiment. The computer device may specifically be a server, including but not limited to a high-performance computer or a high-performance computer cluster. As shown in FIG. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the method for identifying pathological categories based on a distance calculation method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the method for identifying pathological categories based on a distance calculation method. Those skilled in the art will understand that the structure shown in FIG. 6 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In one embodiment, the method for identifying pathological categories based on a distance calculation method provided by the present application may be implemented as a computer program that runs on the computer device shown in FIG. 6. The memory of the computer device may store the program modules that make up the system for identifying pathological categories based on a distance calculation method, for example, the first parameter acquisition module 502, the second parameter acquisition module 504, the calculation module 506, and the diagnosis module 508.

A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the following steps: acquiring characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number; acquiring preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features corresponding to each of M pathological categories, and M is a natural number; calculating, using a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances; and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

In one embodiment, calculating, using a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances, includes: calculating a first distance and/or a second distance between each characteristic parameter and the corresponding standard parameters of the M pathological categories, to obtain M×N characteristic distances; and performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances.

In one embodiment, performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances, includes: acquiring the preset weight corresponding to each characteristic distance; and performing a weighted calculation on the N characteristic distances corresponding to each pathological category and the corresponding preset weights, to obtain the M predicted distances.

In one embodiment, the first distance is at least one of Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance, and the second distance is cosine similarity and/or the Pearson correlation coefficient.
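For reference, the named measures can be written out with their standard textbook definitions. The sketch below assumes each characteristic parameter and each standard parameter is a numeric vector of equal length, which the text does not state explicitly; the function names are illustrative only.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")


def minkowski(a: Sequence[float], b: Sequence[float], p: float) -> float:
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    _check(a, b)
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return minkowski(a, b, 2)


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    return minkowski(a, b, 1)


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute coordinate difference."""
    _check(a, b)
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    _check(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    _check(a, b)
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])
```

The first four functions grow as two vectors move apart, whereas cosine similarity and the Pearson correlation coefficient grow as they become more alike, which is why the two groups are treated differently in the fusion step.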

In one embodiment, acquiring the preset weight corresponding to each characteristic distance includes: when the characteristic distance is a first distance, the preset weight is a negative number; and when the characteristic distance is a second distance, the preset weight is a positive number.

In one embodiment, performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances, includes: when the characteristic distance is a first distance, the predicted distance is negatively correlated with the characteristic distance; and when the characteristic distance is a second distance, the predicted distance is positively correlated with the characteristic distance.
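The sign convention can be illustrated with a small sketch. It assumes a weighted-sum fusion and hypothetical weight magnitudes of 1.0; only the signs (negative for a first distance, positive for a second distance) come from the text, and all names and values are illustrative.

```python
from typing import List, Sequence

FIRST = "first"    # distance-type measure: a smaller value means a closer match
SECOND = "second"  # similarity-type measure: a larger value means a closer match


def signed_weights(kinds: Sequence[str], magnitudes: Sequence[float]) -> List[float]:
    """Negative weight for first distances, positive weight for second distances."""
    weights = []
    for kind, magnitude in zip(kinds, magnitudes):
        if magnitude <= 0:
            raise ValueError("weight magnitudes must be positive")
        if kind == FIRST:
            weights.append(-magnitude)
        elif kind == SECOND:
            weights.append(magnitude)
        else:
            raise ValueError(f"unknown distance kind: {kind!r}")
    return weights


def predicted_distance(distances: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of one category's characteristic distances."""
    return sum(w * d for w, d in zip(weights, distances))


# Feature 1 is scored with a first distance, feature 2 with a second distance.
weights = signed_weights([FIRST, SECOND], [1.0, 1.0])

near_category = [0.5, 0.9]  # small distance, high similarity
far_category = [3.0, 0.2]   # large distance, low similarity

near_score = predicted_distance(near_category, weights)
far_score = predicted_distance(far_category, weights)
```

With these signs the better-matching category receives the larger fused value (`near_score > far_score`), which appears consistent with the earlier example that selects the category with the largest share.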

In one embodiment, determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances includes: determining the proportion of each predicted distance among the M predicted distances as the probability of the corresponding pathological category; and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the probability of each pathological category.

A computer-readable storage medium storing a computer program, wherein, when executed by a processor, the computer program implements the following steps: acquiring characteristic parameters to be diagnosed, where the characteristic parameters to be diagnosed include N characteristic parameters corresponding to N kinds of features, and N is a natural number; acquiring preset standard parameters, where the preset standard parameters include M×N standard parameters of the N kinds of features corresponding to each of M pathological categories, and M is a natural number; calculating, using a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances; and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be considered within the scope of this specification.

The above embodiments represent only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, all of which fall within its scope of protection. Therefore, the scope of protection of this patent shall be determined by the appended claims.

Claims

1. A method for identifying a pathological category based on a distance calculation method, comprising:

acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N kinds of features, and N is a natural number;

acquiring preset standard parameters, wherein the preset standard parameters comprise M×N standard parameters of the N kinds of features respectively corresponding to M pathological categories, and M is a natural number;

calculating, by a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances;

and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

2. The method for identifying a pathological category according to claim 1, wherein calculating, by a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances, comprises:

calculating a first distance and/or a second distance between each characteristic parameter and the corresponding standard parameters of the M pathological categories, to obtain M×N characteristic distances;

and performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances.

3. The method according to claim 2, wherein performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances, comprises:

acquiring a preset weight corresponding to each characteristic distance;

and performing a weighted calculation on the N characteristic distances corresponding to each pathological category and the corresponding preset weights, to obtain the M predicted distances.

4. The method according to claim 2, wherein the first distance is at least one of a Euclidean distance, a Minkowski distance, a Manhattan distance, and a Chebyshev distance, and the second distance is a cosine similarity and/or a Pearson correlation coefficient.

5. The method for identifying a pathological category according to claim 4, wherein acquiring the preset weight corresponding to each characteristic distance comprises:

when the characteristic distance is a first distance, the preset weight is a negative number;

and when the characteristic distance is a second distance, the preset weight is a positive number.

6. The method according to claim 4, wherein performing fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances, comprises:

when the characteristic distance is a first distance, the predicted distance is negatively correlated with the characteristic distance;

and when the characteristic distance is a second distance, the predicted distance is positively correlated with the characteristic distance.

7. The method for identifying a pathological category according to claim 1, wherein determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances comprises:

determining the proportion of each predicted distance among the M predicted distances as the probability of the corresponding pathological category;

and determining the pathological category corresponding to the characteristic parameters to be diagnosed according to the probability of each pathological category.

8. A system for identifying a pathological category based on a distance calculation method, the system comprising:

a first parameter acquisition module, configured to acquire characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N kinds of features, and N is a natural number;

a second parameter acquisition module, configured to acquire preset standard parameters, wherein the preset standard parameters comprise M×N standard parameters of the N kinds of features respectively corresponding to M pathological categories, and M is a natural number;

a calculation module, configured to calculate, by a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category, to obtain M predicted distances;

and a diagnosis module, configured to determine the pathological category corresponding to the characteristic parameters to be diagnosed according to the magnitudes of the M predicted distances.

9. The system for identifying a pathological category according to claim 8, wherein the calculation module comprises:

a distance calculation unit, configured to calculate a first distance and/or a second distance between each characteristic parameter and the corresponding standard parameters of the M pathological categories, to obtain M×N characteristic distances;

and a distance fusion unit, configured to perform fusion calculation on the N characteristic distances corresponding to each pathological category, to obtain the M predicted distances.

10. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the method for identifying a pathological category based on a distance calculation method according to any one of claims 1 to 7.

11. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method for identifying a pathological category based on a distance calculation method according to any one of claims 1 to 7.