CN112102952A - Method for identifying pathological category based on distance calculation method and related device - Google Patents

Info

Publication number
CN112102952A
Authority
CN
China
Prior art keywords
distance
characteristic
distances
pathological
diagnosed
Prior art date
Legal status
Granted
Application number
CN202010857223.5A
Other languages
Chinese (zh)
Other versions
CN112102952B (en)
Inventor
车拴龙
余霆嵩
罗丕福
卢芳
李学锋
刘斯
刘莹
林万里
Current Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Original Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Abstract

An embodiment of the invention discloses a method for identifying a pathology category based on a distance calculation method. The method comprises: acquiring characteristic parameters to be diagnosed, which comprise N characteristic parameters corresponding to N characteristics; acquiring preset standard parameters, which comprise M × N standard parameters, namely the standard parameters of the N characteristics for each of M pathology categories; calculating, with a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathology category, to obtain M predicted distances; and determining the pathology category corresponding to the characteristic parameters to be diagnosed according to the M predicted distances. Comparing the characteristic parameters to be diagnosed with the standard parameters of each pathology category one by one automates pathological diagnosis and improves its objectivity and accuracy. A system, a computer device, and a storage medium for identifying a pathology category based on a distance calculation method are also provided.

Description

Method for identifying pathological category based on distance calculation method and related device
Technical Field
The invention relates to the technical field of computers, in particular to a method for identifying pathological types based on a distance calculation method and related equipment.
Background
Pathological diagnosis studies the cause and pathogenesis of a disease, together with the morphological, functional, and metabolic changes that occur during the disease and its outcome, thereby providing the theoretical and practical basis for diagnosing, treating, and preventing the disease. Among the examination methods for tumors, pathological diagnosis is the most reliable; it is regarded as the "gold standard" and as the final diagnosis of a disease.
From clinical symptoms to traditional HE (hematoxylin-eosin) morphology, many similar lesions are easily confused across a variety of diseases, both benign and malignant, and a misdiagnosis can cause a serious medical accident. Diagnosis therefore requires differential diagnosis among many similar lesions. Neoplastic diseases, for example, show varying degrees of expression of various proteins and variations in various genes. The status of these proteins and genes is referred to as the characteristic parameters, and the neoplastic disease is referred to as the result phenomenon. What doctors can observe and measure most directly is the characteristic parameters; they must analyze the expression of these parameters comprehensively, drawing on their accumulated experience and on knowledge from the literature, and then predict the result phenomenon. In practice, the doctor reads the examination and imaging reports, combines them with the medical history and clinical data to shortlist N possible lesion types, and arrives at the preferred diagnosis through comprehensive analysis and consultation. At present, however, this prediction rests on the experience of individuals at different levels and is strongly subjective. Interpretations of the result phenomenon also differ widely between medical institutions and between doctors, which reduces the objectivity of pathology category identification, affects the quality of diagnosis and treatment, and leads to low identification accuracy and a high rate of missed diagnoses.
Disclosure of Invention
In view of the above, there is a need to provide a method, system, computer device and storage medium for identifying a pathology category based on a distance calculation method, so as to improve objectivity and accuracy of pathological diagnosis.
A method of identifying a pathology category based on a distance calculation method, the method comprising:
acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
A system for identifying a category of pathology based on a distance calculation method, the system comprising:
a first parameter acquisition module, used for acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
the second parameter acquisition module is used for acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
the calculation module is used for respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and the diagnosis module is used for determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M prediction distances.
A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of:
acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
With the above method, system, computer device, and storage medium for identifying a pathology category based on a distance calculation method, the characteristic parameters to be diagnosed are acquired; the preset standard parameters are acquired; the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathology category is calculated with a preset distance calculation method to obtain M predicted distances; and the pathology category corresponding to the characteristic parameters to be diagnosed is determined according to the M predicted distances. Performing pathological diagnosis with distance-based calculation improves its accuracy and objectivity.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Wherein:
FIG. 1 is a flow diagram of a method for identifying a pathology category based on a distance calculation method in one embodiment;
FIG. 2 is a flow diagram of a method for predicted distance calculation in one embodiment;
FIG. 3 is a flow chart of a method of calculating a predicted distance in another embodiment;
FIG. 4 is a flow diagram of a pathology category determination method in one embodiment;
FIG. 5 is a block diagram of a system for identifying a pathology category based on a distance computation method, according to one embodiment;
FIG. 6 is a block diagram of a computer device in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, in one embodiment, a method for identifying a pathology category based on a distance calculation method is provided. The method may be applied to a terminal or a server; this embodiment takes a server as an example. The method specifically comprises the following steps:
Step 102, acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number.
The characteristic parameters to be diagnosed reflect the pathological characteristics of the pathological section to be diagnosed and comprise multiple characteristic parameters corresponding to multiple characteristics. In one embodiment, the pathological section to be diagnosed is from an ovarian epithelial malignancy, and the corresponding 7 characteristic parameters may be the values for Pax-8, WT-1, CA125, P53, CEA, ER, and PVHL. Specifically, the characteristic parameters to be diagnosed can be obtained by analyzing the pathological section with a pathological analysis instrument.
Step 104, acquiring preset standard parameters, wherein the preset standard parameters comprise M × N standard parameters of N characteristics respectively corresponding to M pathology categories, and M is a natural number.
The preset standard parameters are set according to the magnitude or range of the N characteristic parameters under each pathology category. They correspond one-to-one to the characteristic parameters to be diagnosed: each pathology category has N standard parameters, so the M pathology categories have M × N standard parameters in total. Continuing the ovarian epithelial malignancy example from step 102, there are 7 characteristic parameters, and the pathology categories include serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma, and metastatic adenocarcinoma. Each pathology category has N standard parameters; for example, the values of the 7 standard parameters for serous adenocarcinoma, i.e., Pax-8, WT-1, CA125, P53, CEA, ER, and PVHL, are 95%, 75%, and 5%, respectively.
Step 106, calculating, with a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category, to obtain M predicted distances.
The preset distance calculation method is a preset quantitative method for comparing how similar the characteristic parameters to be diagnosed are to the standard parameters. It may be one or more of the Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity, and Pearson correlation coefficient, selected according to the properties of each measure and of the standard parameters and/or the characteristic parameters to be diagnosed. The predicted distance is a quantified value of the degree of similarity between the characteristic parameters to be diagnosed and the standard parameters of a pathology category. Specifically, the N characteristic parameters to be diagnosed are compared, using the preset distance calculation method, with the N standard parameters of each of the M pathology categories, yielding M predicted distances. Calculating the predicted distance in this way quantifies the degree of similarity between the characteristic parameters to be diagnosed and the standard parameters, which makes the calculation objective. Because the preset distance calculation method can draw on several distance measures, the accuracy of the predicted distance is also improved.
Step 108, determining the pathology category corresponding to the characteristic parameters to be diagnosed according to the M predicted distances.
Specifically, the pathology category corresponding to the characteristic parameters to be diagnosed is determined from the values of the M predicted distances and from whether the predicted distance is positively or negatively correlated with the degree of similarity. For example, suppose the predicted distances for the 5 pathology categories serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma, and metastatic adenocarcinoma are 0.7, 0.5, 0.4, 0.4, and 0.5, respectively, and the predicted distance is positively correlated with the degree of similarity, i.e., the larger the predicted distance, the higher the similarity. The pathology category corresponding to the characteristic parameters to be diagnosed is then the one with the predicted distance of 0.7, namely serous adenocarcinoma. Comparing the characteristic parameters to be diagnosed with the standard parameters of each pathology category one by one automates pathological diagnosis and helps ensure its objectivity and accuracy.
In the above method for identifying a pathology category based on a distance calculation method, the characteristic parameters to be diagnosed, comprising N characteristic parameters corresponding to N characteristics, are acquired; the preset standard parameters, comprising M × N standard parameters of the N characteristics for the M pathology categories, are acquired; the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathology category is calculated with a preset distance calculation method to obtain M predicted distances; and the pathology category corresponding to the characteristic parameters to be diagnosed is determined according to the M predicted distances. Comparing the characteristic parameters to be diagnosed with the standard parameters of each pathology category one by one, using distance-based calculation, automates pathological diagnosis and improves its objectivity and accuracy.
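Steps 102 to 108 can be sketched end to end as follows. This is a minimal illustration rather than the patented implementation: it assumes the Euclidean distance as the preset distance calculation method, so the closest category is the one with the smallest predicted distance. The function names and the toy values are hypothetical and do not come from the source.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def identify_category(features, standards, distance=euclidean):
    """Return (best_category, distances) for one sample.

    features  : the N characteristic parameters to be diagnosed (step 102)
    standards : dict mapping each of the M categories to its N standard
                parameters (step 104)
    """
    # Step 106: one predicted distance per pathology category (M in total).
    distances = {name: distance(features, ref) for name, ref in standards.items()}
    # Step 108: Euclidean distance falls as similarity rises, so take the minimum.
    best = min(distances, key=distances.get)
    return best, distances


# Hypothetical toy values (N = 3 features, M = 2 categories), not from the patent.
standards = {
    "category_a": [0.9, 0.8, 0.1],
    "category_b": [0.2, 0.1, 0.9],
}
best, distances = identify_category([0.85, 0.7, 0.2], standards)
```

With a measure that is positively correlated with similarity, such as cosine similarity, the selection in step 108 would use `max` instead of `min`.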
As shown in fig. 2, in an embodiment, the calculating the predicted distance between the characteristic parameter to be diagnosed and the standard parameter corresponding to each pathology category by using a preset distance calculation method to obtain M predicted distances includes:
step 106A, respectively calculating a first distance and/or a second distance between each characteristic parameter and a standard parameter corresponding to M pathological categories to obtain M multiplied by N characteristic distances;
and step 106B, respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
The first distance and the second distance are two classes of distance, distinguished by how they correlate with the degree of similarity: for example, one class may be positively correlated with the degree of similarity and the other negatively correlated. The characteristic distance is the distance between a single characteristic parameter and its corresponding standard parameter, and it may likewise be one or more of the Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity, and Pearson correlation coefficient. Specifically, the distance between each of the N characteristic parameters and the corresponding standard parameter of each of the M pathology categories is calculated as a first distance and/or a second distance, giving M × N characteristic distances; the N characteristic distances of each pathology category are then fused to obtain M predicted distances. Fusion calculation combines several indexes into one value; for example, each index may be assigned a weight according to its importance and the indexes summed with those weights, or the indexes may be fused adaptively according to a preset rule. In this embodiment, fusion calculation takes the influence of every characteristic distance on the predicted distance into account, which further helps ensure that the predicted distance is determined accurately.
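Steps 106A and 106B can be sketched as building an M × N matrix and collapsing each row. The sketch assumes the absolute difference as the per-feature distance (for a single feature, the Euclidean, Manhattan, and Chebyshev distances all reduce to it) and a plain sum as the fusion rule; both choices, the function names, and the toy values are illustrative assumptions.

```python
def feature_distance_matrix(features, standards):
    """Step 106A: return an M x N list of per-feature distances.

    features  : list of N characteristic parameters
    standards : list of M rows, each holding N standard parameters
    """
    return [[abs(x - s) for x, s in zip(features, row)] for row in standards]


def fuse(matrix, combine=sum):
    """Step 106B: collapse each row of N feature distances into one value."""
    return [combine(row) for row in matrix]


# Hypothetical toy values: N = 3 features, M = 2 categories.
features = [0.5, 0.25, 1.0]
standards = [
    [0.5, 0.75, 0.0],
    [1.0, 0.25, 1.0],
]
matrix = feature_distance_matrix(features, standards)
predicted = fuse(matrix)
```

Passing a different `combine` callable, for instance `max`, swaps the fusion rule without touching the distance step.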
As shown in fig. 3, in an embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes:
step 106B1, acquiring a preset weight corresponding to each characteristic distance;
and step 106B2, performing weighted calculation according to the N characteristic distances corresponding to each pathology type and the corresponding preset weights to obtain M predicted distances.
Specifically, a preset weight of each feature distance is determined first, and the preset weight can be set according to the influence of each feature distance on the correctness of pathological diagnosis. And then, performing weighted calculation according to the N characteristic distances corresponding to each pathology category and the corresponding preset weights to obtain M predicted distances. It can be understood that, in this embodiment, the predicted distance is obtained through weighting calculation, and the fusion calculation method is simple and fast, so that the speed of calculating the predicted distance is increased.
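The weighted calculation in step 106B2 can be written compactly as below. The notation is an assumption introduced here, not defined in the source: $d_{j,i}$ stands for the characteristic distance between the i-th characteristic parameter and the i-th standard parameter of pathology category j, $w_{j,i}$ for its preset weight, and $D_j$ for the resulting predicted distance.

$$D_j = \sum_{i=1}^{N} w_{j,i}\, d_{j,i}, \qquad j = 1, \dots, M$$

If a characteristic uses the same weight for every category, $w_{j,i}$ reduces to $w_i$.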
In one embodiment, the first distance is at least one of the Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance, and the second distance is the cosine similarity and/or the Pearson correlation coefficient.
The first distance is at least one of the Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance. The Euclidean distance is the absolute distance between two points in a multidimensional space:
$$\mathrm{dist}(X_1, Y_1) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$
where $\mathrm{dist}(X_1, Y_1)$ is the Euclidean distance, $x_i$ is the i-th characteristic parameter, and $y_i$ is the i-th standard parameter, corresponding to the i-th characteristic parameter.
The Minkowski distance generalizes the Euclidean distance and is a general form covering several distance metrics:
$$\mathrm{dist}(X_2, Y_2) = \left( \sum_{i=1}^{N} |x_i - y_i|^{p} \right)^{1/p}$$
where $\mathrm{dist}(X_2, Y_2)$ is the Minkowski distance, $x_i$ is the i-th characteristic parameter, $y_i$ is the i-th standard parameter corresponding to the i-th characteristic parameter, and p is a constant.
The Manhattan distance derives from the city block distance and is the sum of the distances along each dimension:
$$\mathrm{dist}(X_3, Y_3) = \sum_{i=1}^{N} |x_i - y_i|$$
where $\mathrm{dist}(X_3, Y_3)$ is the Manhattan distance, $x_i$ is the i-th characteristic parameter, and $y_i$ is the i-th standard parameter corresponding to the i-th characteristic parameter.
The Chebyshev distance is a metric on a vector space in which the distance between two points is the maximum absolute difference between their coordinates:
$$\mathrm{dist}(X_4, Y_4) = \max_{1 \le i \le N} |x_i - y_i|$$
where $\mathrm{dist}(X_4, Y_4)$ is the Chebyshev distance, $x_i$ is the i-th characteristic parameter, and $y_i$ is the i-th standard parameter corresponding to the i-th characteristic parameter. The Euclidean, Minkowski, Manhattan, and Chebyshev distances are negatively correlated with the degree of similarity, i.e., the greater the first distance, the lower the similarity.
The second distance is the cosine similarity and/or the Pearson correlation coefficient. The cosine similarity measures the difference between two individuals by the cosine of the angle between their two vectors in a vector space:
$$\mathrm{sim}(X, Y) = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert}$$
where $\mathrm{sim}(X, Y)$ is the cosine similarity, X is the characteristic parameter, and Y is the standard parameter corresponding to the characteristic parameter X.
The Pearson correlation coefficient measures whether two data sets lie on a line, that is, the strength of the linear relationship between two variables:
$$r(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$
where $r(X, Y)$ is the Pearson correlation coefficient, X is the characteristic parameter, Y is the standard parameter corresponding to the characteristic parameter X, and $\bar{x}$ and $\bar{y}$ are the means of X and Y. The cosine similarity and the Pearson correlation coefficient are positively correlated with the degree of similarity, i.e., the greater the second distance, the higher the similarity. The first distance and the second distance each suit different measurement scenarios, so the appropriate one is selected according to the application scenario of the characteristic parameters to be diagnosed, which further improves the accuracy of pathological diagnosis.
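The six measures named above follow their standard textbook definitions and can be sketched in plain Python as follows. The function names are illustrative; the source does not prescribe an implementation, and the sketch omits input validation such as length checks.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p (p = 1 is Manhattan, p = 2 is Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Raises ZeroDivisionError if either vector is all zeros.
    return dot / (norm_x * norm_y)


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    # Raises ZeroDivisionError if either input is constant.
    return numerator / denominator
```

The first four functions return values that grow as the inputs diverge, while the last two return values that grow as the inputs align, which is the distinction the text draws between the first and second distance.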
In one embodiment, obtaining the preset weight corresponding to each feature distance includes: when the characteristic distance is the first distance, the preset weight is a negative number; when the characteristic distance is the second distance, the preset weight is a positive number.
Specifically, the first distance is negatively correlated with the degree of similarity, so its preset weight is set to a negative number; the second distance is positively correlated with the degree of similarity, so its preset weight is set to a positive number. The predicted distance calculated with these preset weights is therefore positively correlated with the degree of similarity, which makes it convenient to perform pathological diagnosis from the predicted distance and further improves diagnostic efficiency.
In one embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes: when the characteristic distance is the first distance, the predicted distance is in negative correlation with the characteristic distance; when the characteristic distance is the second distance, the predicted distance is positively correlated with the characteristic distance.
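The sign convention can be sketched as follows: each characteristic distance carries a label saying which measure produced it, and the weight's sign is derived from that label so that the fused value rises with similarity. The labels, the triple layout, and the function names are assumptions made for this illustration.

```python
# Measures whose value grows as similarity falls (first distance).
FIRST = {"euclidean", "minkowski", "manhattan", "chebyshev"}
# Measures whose value grows as similarity rises (second distance).
SECOND = {"cosine", "pearson"}


def signed_weight(kind, magnitude):
    """Return a negative weight for a first distance, positive for a second."""
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    if kind in FIRST:
        return -magnitude
    if kind in SECOND:
        return magnitude
    raise ValueError(f"unknown distance kind: {kind}")


def predicted_score(feature_distances):
    """Fuse (kind, value, magnitude) triples for one pathology category."""
    return sum(signed_weight(kind, mag) * value for kind, value, mag in feature_distances)


# Hypothetical toy values: one category close to the sample, one far from it.
close = predicted_score([("manhattan", 0.25, 1.0), ("cosine", 0.75, 1.0)])
far = predicted_score([("manhattan", 0.75, 1.0), ("cosine", 0.25, 1.0)])
```

Because both kinds of measure now push the score in the same direction, the category with the largest fused value is the most similar one.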
As shown in fig. 4, in an embodiment, determining the pathology category corresponding to the feature parameter to be diagnosed according to the magnitudes of the M predicted distances includes:
step 108A, determining the proportion of each predicted distance in the M predicted distances as the probability of the corresponding pathological category;
and step 108B, determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological type.
In this embodiment, the ratio of each predicted distance to the sum of the M predicted distances is calculated and taken as the probability that the characteristic parameter to be diagnosed belongs to the corresponding pathology category; the pathology category is then determined according to these probabilities. For example, for 5 pathology categories (serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma, and metastatic gonadal carcinoma), the predicted distances are 0.7, 0.5, 0.4, 0.4 and 0.5 in sequence, which sum to 2.5, so the corresponding probabilities are 28%, 20%, 16%, 16% and 20%. Serous adenocarcinoma, with the highest probability of 28%, is therefore the pathology category corresponding to the characteristic parameter to be diagnosed. It can be understood that using the proportion of each predicted distance among the M predicted distances as the probability of the corresponding pathology category keeps the calculation simple and fast while improving both the accuracy and the efficiency of pathological diagnosis.
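Steps 108A and 108B can be sketched in Python as below. The function names are hypothetical, and the sketch assumes the predicted distances are non-negative with a positive sum. In the example data, only the value 0.7 for serous adenocarcinoma and its 28% share are taken directly from the text; the assignment of the remaining values to categories, and the second 0.4 (chosen so that the total is 2.5 and 0.7 corresponds to 28%), are assumptions.

```python
def category_probabilities(predicted):
    """Map each category to its share of the summed predicted distances."""
    values = list(predicted.values())
    total = sum(values)
    if total <= 0 or any(v < 0 for v in values):
        raise ValueError("predicted distances must be non-negative with a positive sum")
    return {name: value / total for name, value in predicted.items()}


def most_probable_category(predicted):
    """Return the category whose probability is highest."""
    probabilities = category_probabilities(predicted)
    return max(probabilities, key=probabilities.get)


# Illustrative values; the mapping beyond the first entry is assumed.
example = {
    "serous adenocarcinoma": 0.7,
    "mucinous adenocarcinoma": 0.5,
    "endometrioid adenocarcinoma": 0.4,
    "clear cell adenocarcinoma": 0.4,  # assumed value, see lead-in
    "metastatic gonadal carcinoma": 0.5,
}
```

Because the normalisation only divides by a common total, the category with the largest predicted distance is always the one selected; the probabilities mainly add an interpretable score to the ranking.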
As shown in fig. 5, in one embodiment, a system for identifying a category of pathology based on a distance calculation method is proposed, the system comprising:
a first parameter obtaining module 502, configured to obtain a feature parameter to be diagnosed, where the feature parameter to be diagnosed includes N feature parameters corresponding to N features, where N is a natural number;
a second parameter obtaining module 504, configured to obtain preset standard parameters, where the preset standard parameters include M × N standard parameters of N features corresponding to M types of pathology categories, where M is a natural number;
a calculating module 506, configured to calculate predicted distances between the feature parameter to be diagnosed and the standard parameter corresponding to each pathology category by using a preset distance calculating method, respectively, so as to obtain M predicted distances;
and the diagnosis module 508 is configured to determine a pathology category corresponding to the feature parameter to be diagnosed according to the magnitudes of the M predicted distances.
In one embodiment, the calculation module comprises:
the distance calculation unit is used for calculating a first distance and/or a second distance between each characteristic parameter and the standard parameters corresponding to the M pathological categories respectively to obtain M multiplied by N characteristic distances;
and the distance fusion unit is used for respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
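The two units of the calculation module can be pictured as a matrix step followed by a row-wise reduction. The sketch below is one possible reading, assuming each characteristic parameter is a numeric vector, the standard parameters are given as M rows of N vectors, and Euclidean distance is the default first distance. All names (`feature_distance_matrix`, `fuse_rows`) are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def feature_distance_matrix(features, standards, distance=euclidean):
    """Distance calculation unit: build the M x N characteristic distances.

    features: list of N vectors (the parameters to be diagnosed).
    standards: list of M rows, each holding N standard vectors.
    Row m of the result holds the N distances for pathology category m.
    """
    n = len(features)
    matrix = []
    for row in standards:
        if len(row) != n:
            raise ValueError("each category needs exactly N standard parameters")
        matrix.append([distance(f, s) for f, s in zip(features, row)])
    return matrix


def fuse_rows(matrix, weights):
    """Distance fusion unit: reduce each row of N distances to one value."""
    return [sum(w * d for w, d in zip(weights, row)) for row in matrix]
```

Passing a different `distance` callable (for example a cosine similarity) switches the unit from a first distance to a second distance without changing the matrix layout; the caller then supplies weights with the matching sign.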
In one embodiment, the distance fusion unit includes:
the weight obtaining subunit is used for obtaining a preset weight corresponding to each characteristic distance;
and the fusion calculation subunit is used for performing weighted calculation according to the N characteristic distances corresponding to each pathology category and the corresponding preset weights to obtain M predicted distances.
In one embodiment, the diagnostic module comprises:
a probability calculation unit for determining the ratio of each of the predicted distances in the M predicted distances as the probability of the corresponding pathology category;
and the pathological diagnosis unit is used for determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological category.
FIG. 6 is a diagram illustrating the internal structure of a computer device in one embodiment. The computer device may specifically be a server, including but not limited to a high-performance computer or a cluster of high-performance computers. As shown in fig. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the method for identifying a pathology category based on a distance calculation method. The internal memory may likewise store a computer program that, when executed by the processor, causes the processor to perform the method for identifying a pathology category based on a distance calculation method. Those skilled in the art will appreciate that the structure shown in fig. 6 is merely a block diagram of the parts relevant to the disclosed solution and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than those shown, combine certain components, or arrange the components differently.
In one embodiment, the method for identifying a pathology category based on a distance calculation method provided herein may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 6. The memory of the computer device may store the program modules constituting the system for identifying a pathology category based on a distance calculation method, for example, the first parameter obtaining module 502, the second parameter obtaining module 504, the calculating module 506, and the diagnosing module 508.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number; acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number; respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances; and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
In one embodiment, the calculating, by using a preset distance calculation method, the predicted distances between the characteristic parameter to be diagnosed and the standard parameters corresponding to each pathology category to obtain M predicted distances includes: respectively calculating a first distance and/or a second distance between each characteristic parameter and the standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; and respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
In one embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes: acquiring a preset weight corresponding to each characteristic distance; and performing weighted calculation according to the N characteristic distances corresponding to each pathology category and the corresponding preset weight to obtain M predicted distances.
In one embodiment, the first distance is at least one of a euclidean distance, a minkowski distance, a manhattan distance, and a chebyshev distance, and the second distance is a cosine similarity and/or a pearson correlation coefficient.
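The four first distances listed here are closely related: Manhattan and Euclidean distance are the Minkowski distance with order p = 1 and p = 2, and Chebyshev distance is its limit as p grows without bound. A minimal Python sketch, assuming equal-length numeric vectors (the function names are illustrative, not from the patent):

```python
def minkowski(x, y, p):
    """Minkowski distance of order p (p >= 1) between two vectors."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def manhattan(x, y):
    """Minkowski distance with p = 1: sum of absolute differences."""
    return minkowski(x, y, 1)


def euclidean(x, y):
    """Minkowski distance with p = 2: straight-line distance."""
    return minkowski(x, y, 2)


def chebyshev(x, y):
    """Limit of the Minkowski distance as p grows: largest absolute difference."""
    return max(abs(a - b) for a, b in zip(x, y))
```

All four are zero for identical vectors and grow as the vectors diverge, which is why a first distance falls as similarity rises.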
In an embodiment, the obtaining the preset weight corresponding to each feature distance includes: when the characteristic distance is a first distance, the preset weight is a negative number; and when the characteristic distance is a second distance, the preset weight is a positive number.
In one embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes: when the feature distance is a first distance, the predicted distance is inversely related to the feature distance; when the characteristic distance is a second distance, the predicted distance is positively correlated with the characteristic distance.
In one embodiment, the determining, according to the magnitudes of the M predicted distances, a pathology category corresponding to the feature parameter to be diagnosed includes: determining the proportion of each predicted distance in the M predicted distances as the probability of the corresponding pathological category; and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological type.
A computer-readable storage medium storing a computer program, the computer program when executed by a processor implementing the steps of: acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number; acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number; respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances; and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
In one embodiment, the calculating, by using a preset distance calculation method, the predicted distances between the characteristic parameter to be diagnosed and the standard parameters corresponding to each pathology category to obtain M predicted distances includes: respectively calculating a first distance and/or a second distance between each characteristic parameter and the standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; and respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
In one embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes: acquiring a preset weight corresponding to each characteristic distance; and performing weighted calculation according to the N characteristic distances corresponding to each pathology category and the corresponding preset weight to obtain M predicted distances.
In one embodiment, the first distance is at least one of a euclidean distance, a minkowski distance, a manhattan distance, and a chebyshev distance, and the second distance is a cosine similarity and/or a pearson correlation coefficient.
In an embodiment, the obtaining the preset weight corresponding to each feature distance includes: when the characteristic distance is a first distance, the preset weight is a negative number; and when the characteristic distance is a second distance, the preset weight is a positive number.
In one embodiment, the fusion calculation of the N feature distances corresponding to each pathology category to obtain M predicted distances includes: when the feature distance is a first distance, the predicted distance is inversely related to the feature distance; when the characteristic distance is a second distance, the predicted distance is positively correlated with the characteristic distance.
In one embodiment, the determining, according to the magnitudes of the M predicted distances, a pathology category corresponding to the feature parameter to be diagnosed includes: determining the proportion of each predicted distance in the M predicted distances as the probability of the corresponding pathological category; and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological type.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is specific and detailed, it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (11)

1. A method for identifying a category of pathology based on a distance calculation method, comprising:
acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M predicted distances.
2. The method for identifying pathological types according to claim 1, wherein the step of calculating the predicted distance between the characteristic parameter to be diagnosed and the standard parameter corresponding to each pathological type by using a preset distance calculation method to obtain M predicted distances comprises:
respectively calculating a first distance and/or a second distance between each characteristic parameter and the standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances;
and respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
3. The method according to claim 2, wherein the fusion calculation of the N characteristic distances corresponding to each pathology category to obtain M predicted distances comprises:
acquiring a preset weight corresponding to each characteristic distance;
and performing weighted calculation according to the N characteristic distances corresponding to each pathology category and the corresponding preset weight to obtain M predicted distances.
4. The method according to claim 2, wherein the first distance is at least one of a euclidean distance, a minkowski distance, a manhattan distance, and a chebyshev distance, and the second distance is a cosine similarity and/or a pearson correlation coefficient.
5. The method for identifying pathological types according to claim 4, wherein the obtaining of the preset weight corresponding to each characteristic distance comprises:
when the characteristic distance is a first distance, the preset weight is a negative number;
and when the characteristic distance is a second distance, the preset weight is a positive number.
6. The method according to claim 4, wherein the fusion calculation of the N characteristic distances corresponding to each pathology category to obtain M predicted distances comprises:
when the feature distance is a first distance, the predicted distance is inversely related to the feature distance;
when the characteristic distance is a second distance, the predicted distance is positively correlated with the characteristic distance.
7. The method for identifying pathological categories according to claim 1, wherein the determining pathological categories corresponding to the feature parameters to be diagnosed according to the magnitude of the M predicted distances includes:
determining the proportion of each predicted distance in the M predicted distances as the probability of the corresponding pathological category;
and determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological type.
8. A system for identifying a category of pathology based on a distance calculation method, the system comprising:
the first parameter acquisition module is used for acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
the second parameter acquisition module is used for acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics respectively corresponding to M pathological categories, and M is a natural number;
the calculation module is used for respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category by adopting a preset distance calculation method to obtain M prediction distances;
and the diagnosis module is used for determining the pathological type corresponding to the characteristic parameter to be diagnosed according to the size of the M prediction distances.
9. The system for identifying a pathology category according to claim 8, wherein said calculation module comprises:
the distance calculation unit is used for calculating a first distance and/or a second distance between each characteristic parameter and the standard parameters corresponding to the M pathological categories respectively to obtain M multiplied by N characteristic distances;
and the distance fusion unit is used for respectively carrying out fusion calculation on the N characteristic distances corresponding to each pathology category to obtain M predicted distances.
10. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program performs the steps of the method for identifying a category of pathology based on a distance calculation method according to any one of claims 1 to 7.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method for identifying a category of pathology based on a distance calculation method according to any one of claims 1 to 7.
CN202010857223.5A 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment Active CN112102952B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857223.5A CN112102952B (en) 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment

Publications (2)

Publication Number Publication Date
CN112102952A true CN112102952A (en) 2020-12-18
CN112102952B CN112102952B (en) 2024-05-14

Family

ID=73754457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857223.5A Active CN112102952B (en) 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment

Country Status (1)

Country Link
CN (1) CN112102952B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113418920A (en) * 2021-05-14 2021-09-21 广州金域医学检验中心有限公司 Section staining quality interpretation method and device, computer equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100331688A1 (en) * 2009-06-30 2010-12-30 Tatsuro Baba Automatic diagnosis support apparatus, ultrasonic diagnosis apparatus, and automatic diagnosis support method
JP2011005026A (en) * 2009-06-26 2011-01-13 Toshiba Corp Ultrasonic diagnosis apparatus and automatic diagnosis support apparatus
CA2798337A1 (en) * 2012-12-04 2014-06-04 University Of Winnipeg Cardiovascular pulse wave analysis method and system
CN107133487A (en) * 2017-06-21 2017-09-05 深圳市海威达信息技术有限公司 A kind of disease prevention and cure method, device and disease prevention and cure system
CN107680677A (en) * 2017-10-11 2018-02-09 四川大学 Neuropsychiatric disease sorting technique based on brain network analysis
CN108764280A (en) * 2018-04-17 2018-11-06 中国科学院计算技术研究所 A kind of medical data processing method and system based on symptom vector
CN109582797A (en) * 2018-12-13 2019-04-05 泰康保险集团股份有限公司 Obtain method, apparatus, medium and electronic equipment that classification of diseases is recommended
CN110299205A (en) * 2019-07-23 2019-10-01 上海图灵医疗科技有限公司 Biomedicine signals characteristic processing and evaluating method, device and application based on artificial intelligence
US20190313920A1 (en) * 2018-03-16 2019-10-17 Tata Consultancy Services Limited System and method for classification of coronary artery disease based on metadata and cardiovascular signals
CN110827929A (en) * 2019-11-05 2020-02-21 中山大学 Disease classification code recognition method and device, computer equipment and storage medium
CN111079021A (en) * 2019-12-20 2020-04-28 腾讯科技(深圳)有限公司 Method, device, server and storage medium for recommending medical information content
CN111248913A (en) * 2020-01-21 2020-06-09 山东师范大学 Chronic obstructive pulmonary disease prediction system, equipment and medium based on transfer learning
EP3745947A1 (en) * 2018-01-30 2020-12-09 IRCCS Centro Neurolesi "Bonino-Pulejo" Method for detecting a conversion from mild cognitive impairment to alzheimer disease
WO2021004168A1 (en) * 2019-07-10 2021-01-14 江苏博子岛智能科技有限公司 Artificial intelligence-type medical data integration system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZIA, S. S., et al.: "Schematic Cycle of Case-Based Reasoning Technique Implements in Clinical Decision Support Systems Used for Diagnosis of Liver Disease", SINDH UNIVERSITY RESEARCH JOURNAL - SCIENCE SERIES, vol. 47, no. 2, pages 215 - 220 *
LIU Chang: "Research on Key Technologies of Computer-Aided Diagnosis of Cardiovascular Diseases", China Master's Theses Full-text Database, Medicine and Health Sciences, no. 2, pages 062 - 24 *

Also Published As

Publication number Publication date
CN112102952B (en) 2024-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Country or region after: China
Address after: 510700 No. 10, helix 3 Road, International Biological Island, Huangpu District, Guangzhou City, Guangdong Province
Applicant after: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY
Applicant after: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.
Address before: 510330 Guangdong Guangzhou Haizhuqu District Xingang East Road 2429, 3rd floor.
Applicant before: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY
Country or region before: China
Applicant before: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.
GR01 Patent grant