CN112102952B - Method for identifying pathology category based on distance calculation method and related equipment - Google Patents

Info

Publication number
CN112102952B
Authority
CN
China
Prior art keywords
distance
characteristic
distances
pathology
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010857223.5A
Other languages
Chinese (zh)
Other versions
CN112102952A (en)
Inventor
车拴龙
余霆嵩
罗丕福
卢芳
李学锋
刘斯
刘莹
林万里
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Original Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kingmed Diagnostics Group Co ltd, Guangzhou Kingmed Diagnostics Central Co Ltd
Priority to CN202010857223.5A
Publication of CN112102952A
Application granted
Publication of CN112102952B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Theoretical Computer Science (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention discloses a method for identifying pathology categories based on a distance calculation method. The method comprises: obtaining characteristic parameters to be diagnosed, which comprise N characteristic parameters corresponding to N characteristics; acquiring preset standard parameters, which comprise M multiplied by N standard parameters, namely the N characteristics for each of M pathology categories; calculating, with a preset distance calculation method, the predicted distance between the characteristic parameters to be diagnosed and the standard parameters of each pathology category, to obtain M predicted distances; and determining, according to the M predicted distances, the pathology category corresponding to the characteristic parameters to be diagnosed. Because the characteristic parameters to be diagnosed are compared one by one with the standard parameters of each pathology category, pathological diagnosis is automated and its objectivity and accuracy are improved. In addition, a system, a computer device and a storage medium for identifying pathology categories based on a distance calculation method are also provided.

Description

Method for identifying pathology category based on distance calculation method and related equipment
Technical Field
The invention relates to the technical field of computers, in particular to a method for identifying pathological categories based on a distance calculation method and related equipment.
Background
Pathological diagnosis studies the cause and pathogenesis of a disease, the morphological structure and the functional and metabolic changes of the diseased organism during the course of the disease, and the outcome of the disease, thereby providing the necessary theoretical and practical basis for diagnosing, treating and preventing disease. Among the various methods of examining tumors, pathological diagnosis is the most reliable; it is known as the "gold standard" and serves as the final diagnosis of a disease.
Many lesions resemble one another, from their clinical symptoms to their morphology under traditional HE (hematoxylin-eosin) staining, and are easily confused with a variety of diseases, both benign and malignant. A misdiagnosis can cause a serious medical accident, so diagnosing a disease requires differential diagnosis against many similar lesions. Neoplastic diseases, for example, show varying degrees of expression of a variety of proteins and variations in a variety of genes. The occurrence of these proteins and genes is referred to as a characteristic parameter, and the neoplastic disease is referred to as a result phenomenon. What a doctor can observe and detect most directly is the characteristic parameters; their expression must be analyzed comprehensively, drawing on previously accumulated experience and on knowledge from books and the literature, before the result phenomenon is predicted. The doctor reads the examination and imaging reports, combines them with the medical history and clinical data, selects N comparable lesions, and arrives at the most likely diagnosis through comprehensive analysis and diagnostic discussion. At present, however, predicting the result phenomenon relies on individual experience of varying levels and is strongly subjective. The predicted result phenomena also differ widely between medical institutions and between doctors, which reduces the objectivity of pathology category identification. As a result, the quality of diagnosis and treatment suffers to some extent, the accuracy of pathology category identification is low, and the missed-diagnosis rate of pathological diagnosis is high.
Disclosure of Invention
Based on this, it is necessary to provide a method, a system, a computer device and a storage medium for identifying pathology categories based on a distance calculation method, so as to improve objectivity and accuracy of pathology diagnosis.
A method of identifying pathology categories based on a distance calculation method, the method comprising:
Acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
Acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
Respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances;
and determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
A system for identifying pathology categories based on a distance calculation method, the system comprising:
the first parameter acquisition module is used for acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
The second parameter acquisition module is used for acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
The calculation module is used for respectively calculating the prediction distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology type by adopting a preset distance calculation method to obtain M prediction distances;
And the diagnosis module is used for determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
Acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
Acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
Respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances;
and determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
Acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
Acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
Respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances;
and determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
With the above method, system, computer device and storage medium for identifying pathology categories based on a distance calculation method, the characteristic parameters to be diagnosed are acquired; the preset standard parameters are acquired; the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathology category are calculated with a preset distance calculation method to obtain M predicted distances; and the pathology category corresponding to the characteristic parameters to be diagnosed is determined according to the M predicted distances. Performing pathological diagnosis with a plurality of distance-based calculation methods improves the accuracy and objectivity of pathological diagnosis.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Wherein:
FIG. 1 is a flow diagram of a method of identifying pathology categories based on a distance calculation method in one embodiment;
FIG. 2 is a flow chart of a method of calculating a predicted distance in one embodiment;
FIG. 3 is a flowchart of a method of calculating a predicted distance in another embodiment;
FIG. 4 is a flow diagram of a method of pathology class determination in one embodiment;
FIG. 5 is a block diagram of a system for identifying pathology categories based on a distance calculation method in one embodiment;
FIG. 6 is a block diagram of a computer device in one embodiment.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, in one embodiment, a method for identifying a pathology class based on a distance calculation method is provided, and the method for identifying a pathology class based on a distance calculation method is applied to a terminal or a server, and this embodiment is exemplified by being applied to a server. The method for identifying the pathology class based on the distance calculation method specifically comprises the following steps:
Step 102, obtaining characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number.
The characteristic parameters to be diagnosed are parameters for reflecting pathological characteristics of pathological sections to be diagnosed, and the characteristic parameters to be diagnosed comprise a plurality of characteristic parameters corresponding to a plurality of characteristics. In one embodiment, the pathological section to be diagnosed is an ovarian epithelial malignancy, and the 7 corresponding characteristic parameters may be values corresponding to Pax-8, WT-1, CA125, P53, CEA, ER, and PVHL. Specifically, the characteristic parameters to be diagnosed can be obtained after the pathological section is analyzed by a pathological analysis instrument.
Step 104, obtaining preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number.
The preset standard parameters are set according to the magnitudes or ranges of the N characteristic parameters under each pathology category. The standard parameters correspond one-to-one with the characteristic parameters to be diagnosed; that is, each pathology category has N standard parameters, so the M pathology categories have M multiplied by N standard parameters in total. Continuing the ovarian epithelial malignancy example from step 102, there are 7 characteristic parameters, and the possible pathology categories include serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma. Each pathology category has N standard parameters; for serous adenocarcinoma, for example, the 7 standard parameters for Pax-8, WT-1, CA125, P53, CEA, ER and PVHL take values such as 95%, 75% and 5%.
Step 106, respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances.
The preset distance calculation method is a predefined quantification method for comparing how similar the characteristic parameters to be diagnosed are to the standard parameters. It may use one or more of the following measures: Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity and Pearson correlation coefficient. The measure can be chosen according to the properties of each distance and the properties of the standard parameters and/or the characteristic parameters to be diagnosed. The predicted distance is a quantized value of the degree of similarity between the characteristic parameters to be diagnosed and the standard parameters of one pathology category. Specifically, the N characteristic parameters to be diagnosed are compared, by the preset distance calculation method, with the N standard parameters of each of the M pathology categories, which yields M predicted distances. Calculating the predicted distances in this way quantifies the degree of similarity explicitly, which improves the objectivity of the calculation. Because the preset distance calculation method can draw on several distance measures, the accuracy of the calculated predicted distance also improves.
Step 108, determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
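Step 106 can be illustrated with a minimal sketch. It assumes the Euclidean distance as the preset measure and stores the standard parameters as a dictionary keyed by category name; the function names, category names and numeric values are hypothetical and are not taken from the patent.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length sequences of numbers."""
    if len(x) != len(y):
        raise ValueError("both parameter lists must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def predicted_distances(to_diagnose, standards, metric=euclidean):
    """Return one predicted distance per pathology category (M values in total)."""
    return {category: metric(to_diagnose, params)
            for category, params in standards.items()}

# Hypothetical data: N = 3 features, M = 2 pathology categories.
standards = {
    "category_a": [0.9, 0.8, 0.1],
    "category_b": [0.2, 0.1, 0.9],
}
sample = [0.9, 0.8, 0.1]
distances = predicted_distances(sample, standards)
```

Passing the measure as the `metric` argument keeps the comparison loop independent of which distance is preset.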
Specifically, the pathology category corresponding to the characteristic parameters to be diagnosed is determined from the values of the M predicted distances and from whether the predicted distance is positively or negatively correlated with the degree of similarity. For example, suppose the predicted distances for the 5 pathology categories serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma are 0.7, 0.5, 0.4, 0.4 and 0.5 in sequence, and the predicted distance is positively correlated with the degree of similarity, i.e. the greater the predicted distance, the higher the degree of similarity. The pathology category corresponding to the characteristic parameters to be diagnosed is then the one whose predicted distance is 0.7, namely serous adenocarcinoma. Because the characteristic parameters to be diagnosed are compared one by one with the standard parameters of each pathology category, the pathological diagnosis is automated and its objectivity and accuracy are ensured.
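The decision rule of step 108 might be sketched as follows. The function name `pick_category`, the flag `higher_is_more_similar` and the numeric values are assumptions made for illustration; the flag encodes whether the predicted distance is positively or negatively correlated with similarity.

```python
def pick_category(predicted, higher_is_more_similar=True):
    """Return the category whose predicted distance indicates the best match."""
    if not predicted:
        raise ValueError("no predicted distances given")
    # Positive correlation with similarity: take the largest value.
    # Negative correlation (a raw distance): take the smallest value.
    chooser = max if higher_is_more_similar else min
    return chooser(predicted, key=predicted.get)

# Illustrative values only; they are not the values of the patent's example.
predicted = {
    "serous adenocarcinoma": 0.7,
    "mucinous adenocarcinoma": 0.5,
    "endometrioid adenocarcinoma": 0.4,
    "clear cell adenocarcinoma": 0.3,
    "metastatic adenocarcinoma": 0.6,
}
best = pick_category(predicted)
smallest = pick_category(predicted, higher_is_more_similar=False)
```

If two categories tie, `max` and `min` return the first one encountered, so a real system would need an explicit tie-breaking policy.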
According to the method for identifying the pathological category based on the distance calculation method, the characteristic parameters to be diagnosed are obtained, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics; acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively; respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances; according to the M predicted distances, the pathological category corresponding to the characteristic parameter to be diagnosed is determined, and the characteristic parameter to be diagnosed is calculated and compared with the standard parameter corresponding to each pathological category one by adopting a plurality of methods based on distance calculation, so that the automation of pathological diagnosis is realized, and the objectivity and the accuracy of the pathological diagnosis are improved.
As shown in fig. 2, in one embodiment, a preset distance calculation method is adopted to calculate the predicted distances between the feature parameters to be diagnosed and the standard parameters corresponding to each pathology category, so as to obtain M predicted distances, including:
Step 106A, respectively calculating a first distance and/or a second distance of each characteristic parameter and standard parameters corresponding to M pathological categories to obtain M multiplied by N characteristic distances;
Step 106B, respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology type to obtain M predicted distances.
The first distance and the second distance are two types of distance, classified by how they correlate with the degree of similarity: the first distance is negatively correlated with the degree of similarity, and the second distance is positively correlated with it. The feature distance is the distance between a single characteristic parameter and the corresponding standard parameter, and may likewise be one or more of Euclidean distance, Minkowski distance, Manhattan distance, Chebyshev distance, cosine similarity and Pearson correlation coefficient. Specifically, the distance between each of the N characteristic parameters and the corresponding standard parameter of each of the M pathology categories is calculated as a first distance and/or a second distance, giving M multiplied by N feature distances; the N feature distances of each pathology category are then fused to obtain M predicted distances. Fusion calculation is a method of combining several indexes: for example, each index can be given a weight according to its importance and the indexes combined by weighted summation, or the indexes can be fused adaptively according to a preset rule. Fusing the feature distances takes the influence of every feature distance on the predicted distance into account, which further safeguards the accuracy of the predicted distance.
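Step 106A can be sketched as building an M by N table of feature distances. The sketch assumes the absolute difference as the per-feature distance (for a single value, the Euclidean, Minkowski, Manhattan and Chebyshev distances all reduce to it); the function name and the data are hypothetical.

```python
def feature_distances(to_diagnose, standards):
    """Return an M x N table: one distance per category and per feature."""
    table = {}
    for category, params in standards.items():
        if len(params) != len(to_diagnose):
            raise ValueError(f"{category}: expected {len(to_diagnose)} standard parameters")
        # Absolute difference between each characteristic parameter and
        # the matching standard parameter of this category.
        table[category] = [abs(x - y) for x, y in zip(to_diagnose, params)]
    return table

# Hypothetical data: N = 2 features, M = 2 pathology categories.
sample = [0.5, 0.25]
standards = {"category_a": [0.5, 0.75], "category_b": [1.0, 0.25]}
table = feature_distances(sample, standards)
```

Each row of the table holds the N feature distances that step 106B then fuses into one predicted distance.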
As shown in fig. 3, in one embodiment, fusion calculation is performed on N feature distances corresponding to each pathology category, to obtain M predicted distances, which includes:
Step 106B1, obtaining a preset weight corresponding to each characteristic distance;
Step 106B2, carrying out weighted calculation according to N characteristic distances corresponding to each pathology type and corresponding preset weights to obtain M prediction distances.
Specifically, a preset weight of each characteristic distance is first determined, and the preset weight can be set according to the influence of each characteristic distance on the correctness of pathological diagnosis. And then, carrying out weighted calculation according to N characteristic distances corresponding to each pathology type and corresponding preset weights to obtain M prediction distances. It can be appreciated that in this embodiment, the predicted distance is obtained through weighted calculation, and the fusion calculation method is simple and fast, so that the speed of calculating the predicted distance is improved.
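The weighted calculation of steps 106B1 and 106B2 can be sketched as a weighted sum over each category's N feature distances. The weights and feature distances below are hypothetical; the text only says that weights are set according to each feature distance's influence on diagnostic correctness.

```python
def fuse(feature_table, weights):
    """Weighted sum of the N feature distances of each category."""
    fused = {}
    for category, dists in feature_table.items():
        if len(dists) != len(weights):
            raise ValueError(f"{category}: expected {len(weights)} feature distances")
        fused[category] = sum(w * d for w, d in zip(weights, dists))
    return fused

# Hypothetical feature distances (M = 2 categories, N = 2 features) and weights.
feature_table = {"category_a": [0.0, 0.5], "category_b": [0.5, 0.0]}
weights = [0.75, 0.25]
predicted = fuse(feature_table, weights)
```

Here the first feature carries three times the weight of the second, so a mismatch on it moves the predicted distance more.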
In one embodiment, the first distance is at least one of Euclidean distance, Minkowski distance, Manhattan distance and Chebyshev distance, and the second distance is cosine similarity and/or the Pearson correlation coefficient.
In the formulas below, x_i denotes the i-th characteristic parameter, y_i denotes the i-th standard parameter corresponding to the i-th characteristic parameter, and the sums and the maximum run over i = 1, ..., N.
The Euclidean distance is the absolute distance between points in a multidimensional space:
$$\mathrm{dist}(X_1, Y_1) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$
where dist(X_1, Y_1) denotes the Euclidean distance.
The Minkowski distance is a generalization of the Euclidean distance and a general form covering several distance measures:
$$\mathrm{dist}(X_2, Y_2) = \left( \sum_{i=1}^{N} |x_i - y_i|^{p} \right)^{1/p}$$
where dist(X_2, Y_2) denotes the Minkowski distance and p is a constant.
The Manhattan distance derives from the city block distance and is the sum of the distances along each dimension:
$$\mathrm{dist}(X_3, Y_3) = \sum_{i=1}^{N} |x_i - y_i|$$
where dist(X_3, Y_3) denotes the Manhattan distance.
The Chebyshev distance is a measure in vector space in which the distance between two points is defined as the maximum absolute difference between their coordinate values:
$$\mathrm{dist}(X_4, Y_4) = \max_{i} |x_i - y_i|$$
where dist(X_4, Y_4) denotes the Chebyshev distance.
The Euclidean, Minkowski, Manhattan and Chebyshev distances are negatively correlated with the degree of similarity, i.e. the larger the first distance, the lower the degree of similarity.
The second distance is cosine similarity and/or the Pearson correlation coefficient. The cosine similarity is the cosine of the angle between two vectors in vector space and is used to measure the difference between two individuals:
$$\mathrm{sim}(X, Y) = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert} = \frac{\sum_{i=1}^{N} x_i y_i}{\sqrt{\sum_{i=1}^{N} x_i^2} \, \sqrt{\sum_{i=1}^{N} y_i^2}}$$
where sim(X, Y) denotes the cosine similarity, X denotes the characteristic parameter, and Y denotes the standard parameter corresponding to X.
The Pearson correlation coefficient measures whether two data sets lie on a straight line, i.e. the linear relationship between interval variables:
$$r(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$
where r(X, Y) denotes the Pearson correlation coefficient, X denotes the characteristic parameter, Y denotes the standard parameter corresponding to X, and \bar{x} and \bar{y} are the means of the components of X and Y.
The cosine similarity and the Pearson correlation coefficient are positively correlated with the degree of similarity, i.e. the greater the second distance, the higher the degree of similarity. The first distance and the second distance each suit different measurement scenarios, so the appropriate one is selected according to the application scenario of the characteristic parameters to be diagnosed, which further improves the accuracy of pathological diagnosis.
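The six measures named above can be written out in a few lines of standard-library Python. This is a sketch using the textbook definitions of each measure; the function names and the error handling are assumptions, not part of the patent.

```python
import math

def _check(x, y):
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")

def minkowski(x, y, p):
    """Minkowski distance of order p (p = 1: Manhattan, p = 2: Euclidean)."""
    _check(x, y)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def euclidean(x, y):
    return minkowski(x, y, 2)

def manhattan(x, y):
    _check(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    _check(x, y)
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    """Cosine of the angle between x and y; undefined for a zero vector."""
    _check(x, y)
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(x, y)) / norm

def pearson(x, y):
    """Pearson correlation coefficient; undefined for constant input."""
    _check(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)
```

The first four functions return values that grow as the inputs diverge, while the last two return values that grow as the inputs align, which is the distinction between the first and second distance.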
In one embodiment, obtaining a preset weight corresponding to each feature distance includes: when the characteristic distance is the first distance, the preset weight is a negative number; when the feature distance is the second distance, the preset weight is a positive number.
Specifically, because the first distance is negatively correlated with the degree of similarity, its preset weight is set to a negative number; because the second distance is positively correlated with the degree of similarity, its preset weight is set to a positive number. The predicted distance calculated with these preset weights is therefore positively correlated with the degree of similarity, which makes the subsequent pathological diagnosis based on the predicted distance more convenient and efficient.
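The sign convention can be sketched as a small helper that turns a positive importance value into a signed preset weight. The helper name, the measure labels and the idea of a separate "importance" value are assumptions introduced for illustration.

```python
# Measures grouped as in the text: first distances fall as similarity rises,
# second distances rise with it.
FIRST_DISTANCES = {"euclidean", "minkowski", "manhattan", "chebyshev"}
SECOND_DISTANCES = {"cosine", "pearson"}

def signed_weight(measure, importance):
    """Return a negative weight for a first distance, a positive one for a second."""
    if importance <= 0:
        raise ValueError("importance must be positive")
    if measure in FIRST_DISTANCES:
        return -importance
    if measure in SECOND_DISTANCES:
        return importance
    raise ValueError(f"unknown distance measure: {measure}")

# One feature measured with a first distance, one with a second distance.
weights_example = [signed_weight("euclidean", 0.5), signed_weight("cosine", 0.5)]
```

With weights signed this way, a larger fused value always means a closer match, whichever mix of measures produced the feature distances.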
In one embodiment, the fusion calculation is performed on N feature distances corresponding to each pathology category to obtain M predicted distances, which includes: when the characteristic distance is the first distance, the predicted distance and the characteristic distance are in negative correlation; when the feature distance is the second distance, the predicted distance is positively correlated with the feature distance.
As shown in fig. 4, in one embodiment, determining a pathology category corresponding to the feature parameter to be diagnosed according to the magnitudes of the M prediction distances includes:
Step 108A, determining the proportion of each predicted distance in the sum of the M predicted distances as the probability of the corresponding pathology category;
and step 108B, determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the probability of each pathological category.
In this embodiment, the ratio of each predicted distance to the sum of the M predicted distances is calculated and taken as the probability of the corresponding pathology category, and the pathology category corresponding to the characteristic parameters to be diagnosed is determined from these probabilities. For example, if the predicted distances for the 5 pathology categories serous adenocarcinoma, mucinous adenocarcinoma, endometrioid adenocarcinoma, clear cell adenocarcinoma and metastatic adenocarcinoma are 0.7, 0.5, 0.4, 0.4 and 0.5 in sequence, their sum is 2.5 and the probabilities of the corresponding pathology categories are 28%, 20%, 16%, 16% and 20% in sequence. Serous adenocarcinoma, with a probability of 28%, is therefore the pathology category corresponding to the characteristic parameters to be diagnosed. Using the share of each predicted distance in the M predicted distances as the basis for diagnosis keeps the calculation simple and fast, which improves the accuracy and efficiency of pathological diagnosis.
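Steps 108A and 108B can be sketched as a normalization followed by an arg-max. The distance values below are assumed (0.7, 0.5, 0.4, 0.4 and 0.5), chosen so that 0.7 maps to the 28% quoted in the example; the sketch also assumes non-negative predicted distances that are positively correlated with similarity.

```python
def category_probabilities(predicted):
    """Share of each predicted distance in the sum of all M predicted distances."""
    if any(d < 0 for d in predicted.values()):
        raise ValueError("this sketch assumes non-negative predicted distances")
    total = sum(predicted.values())
    if total == 0:
        raise ValueError("the predicted distances sum to zero")
    return {category: d / total for category, d in predicted.items()}

# Assumed values; the assignment to the last three categories is illustrative.
predicted = {
    "serous adenocarcinoma": 0.7,
    "mucinous adenocarcinoma": 0.5,
    "endometrioid adenocarcinoma": 0.4,
    "clear cell adenocarcinoma": 0.4,
    "metastatic adenocarcinoma": 0.5,
}
probabilities = category_probabilities(predicted)
diagnosis = max(probabilities, key=probabilities.get)
```

Because dividing by a positive sum preserves order, the category with the highest probability is the same one that has the largest predicted distance; the normalization mainly makes the result readable as a share.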
As shown in fig. 5, in one embodiment, a system for identifying pathology categories based on a distance calculation method is provided, the system comprising:
The first parameter obtaining module 502 is configured to obtain feature parameters to be diagnosed, where the feature parameters to be diagnosed include N feature parameters corresponding to N features, where N is a natural number;
The second parameter obtaining module 504 is configured to obtain preset standard parameters, where the preset standard parameters include M×N standard parameters of the N features corresponding to each of M pathology categories, where M is a natural number;
The calculating module 506 is configured to calculate, by using a preset distance calculation method, the predicted distance between the feature parameters to be diagnosed and the standard parameters corresponding to each pathology category, so as to obtain M predicted distances;
The diagnosis module 508 is configured to determine the pathology category corresponding to the feature parameters to be diagnosed according to the magnitudes of the M predicted distances.
In one embodiment, the computing module includes:
The distance calculation unit is used for calculating the first distance and/or the second distance of each characteristic parameter and the standard parameter corresponding to the M pathological categories respectively to obtain M multiplied by N characteristic distances;
And the distance fusion unit is used for respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances.
In one embodiment, the distance fusion unit includes:
the weight acquisition subunit is used for acquiring a preset weight corresponding to each characteristic distance;
And the fusion calculation subunit is used for carrying out weighted calculation according to N characteristic distances corresponding to each pathology category and the corresponding preset weights to obtain M prediction distances.
In one embodiment, the diagnostic module includes:
The probability calculation unit is used for determining the ratio of each predicted distance to the sum of the M predicted distances as the probability of the corresponding pathology category;
The pathology diagnosis unit is used for determining the pathology category corresponding to the feature parameter to be diagnosed according to the probability of each pathology category.
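The module structure above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the implementation described in the patent: each feature parameter is assumed to be a numeric vector, cosine similarity (one of the second-type measures) is used for every feature, the weights are positive, and all function names, category names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def calculation_module(sample, standards, weights):
    """Return one fused predicted distance per pathology category."""
    predicted = {}
    for category, standard_features in standards.items():
        # N feature distances for this category (one row of the M x N table).
        feature_distances = [
            cosine_similarity(s, t) for s, t in zip(sample, standard_features)
        ]
        # Weighted fusion of the N feature distances into one value.
        predicted[category] = sum(w * d for w, d in zip(weights, feature_distances))
    return predicted


def diagnosis_module(predicted):
    """Turn predicted distances into probabilities and pick the largest."""
    total = sum(predicted.values())
    if total <= 0:
        raise ValueError("sum of predicted distances must be positive")
    probabilities = {c: v / total for c, v in predicted.items()}
    return max(probabilities, key=probabilities.get), probabilities


# Hypothetical input: N = 2 features, M = 2 categories.
sample = [[1.0, 0.0], [0.0, 1.0]]
standards = {
    "category A": [[1.0, 0.0], [0.0, 1.0]],
    "category B": [[0.0, 1.0], [1.0, 1.0]],
}
weights = [0.5, 0.5]

best, probabilities = diagnosis_module(calculation_module(sample, standards, weights))
print(best, probabilities)
```

The two parameter obtaining modules are represented here simply by the `sample` and `standards` inputs; in practice they would read the values from wherever the feature parameters and preset standard parameters are stored.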
FIG. 6 illustrates an internal block diagram of a computer device in one embodiment. The computer device may be, in particular, a server including, but not limited to, a high performance computer and a high performance computer cluster. As shown in fig. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by a processor, causes the processor to implement a method of identifying a pathology class based on a distance calculation method. The internal memory may also have stored therein a computer program which, when executed by the processor, causes the processor to perform a method for identifying a pathology class based on a distance calculation method. It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, the method for identifying pathology categories based on a distance calculation method provided by the present application may be implemented in the form of a computer program that can run on a computer device as shown in fig. 6. The memory of the computer device may store the program modules that constitute the system for identifying pathology categories based on a distance calculation method, such as the first parameter acquisition module 502, the second parameter acquisition module 504, the calculation module 506, and the diagnostic module 508.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number; acquiring preset standard parameters, wherein the preset standard parameters comprise M×N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number; respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances; and determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
In one embodiment, the calculating, by using a preset distance calculating method, the predicted distances between the feature parameter to be diagnosed and the standard parameter corresponding to each pathology type, to obtain M predicted distances includes: respectively calculating a first distance and/or a second distance of each characteristic parameter and standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; and respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances.
In one embodiment, the fusing calculation is performed on the N feature distances corresponding to each pathology category to obtain M predicted distances, which includes: acquiring a preset weight corresponding to each characteristic distance; and carrying out weighted calculation according to N characteristic distances corresponding to each pathology type and corresponding preset weights to obtain M predicted distances.
In one embodiment, the first distance is at least one of Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance, and the second distance is cosine similarity and/or Pearson correlation coefficient.
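The measures named here have standard textbook definitions, sketched below in plain Python. Treating each feature parameter as a numeric vector is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p (p >= 1)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Minkowski distance with p = 2."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Minkowski distance with p = 1."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest coordinate-wise difference (limit of Minkowski as p grows)."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("undefined for a zero vector")
    return dot / norm


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity(
        [x - mean_a for x in a], [y - mean_b for y in b]
    )
```

The first group grows as two vectors become less alike, while the second group grows as they become more alike, which is why the two groups are handled with opposite signs during fusion.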
In one embodiment, the obtaining the preset weight corresponding to each feature distance includes: when the characteristic distance is a first distance, the preset weight is a negative number; and when the characteristic distance is a second distance, the preset weight is a positive number.
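A minimal sketch of this sign convention, assuming the fusion is a plain weighted sum (the text says only "weighted calculation"). The labels `"first"` and `"second"`, the function names and the numeric values are hypothetical.

```python
def signed_weight(kind, magnitude):
    """Negative weight for a first (dissimilarity) distance,
    positive weight for a second (similarity) distance."""
    if magnitude <= 0:
        raise ValueError("magnitude must be positive")
    if kind == "first":
        return -magnitude
    if kind == "second":
        return magnitude
    raise ValueError(f"unknown distance kind: {kind!r}")


def fuse(feature_distances, kinds, magnitudes):
    """Weighted sum of the N feature distances of one pathology category."""
    return sum(
        signed_weight(kind, magnitude) * distance
        for distance, kind, magnitude in zip(feature_distances, kinds, magnitudes)
    )


# One Euclidean-type distance and one cosine-type similarity (assumed values).
print(fuse([1.0, 0.9], ["first", "second"], [0.5, 2.0]))
```

With these signs, a larger first distance lowers the fused value and a larger second distance raises it, so a higher fused value consistently indicates a closer match. Note that a fused value can become negative under this scheme, in which case the ratio-based probability step would need additional handling that the text does not specify.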
In one embodiment, the fusing calculation is performed on the N feature distances corresponding to each pathology category to obtain M predicted distances, which includes: when the characteristic distance is a first distance, the predicted distance and the characteristic distance are in negative correlation; when the feature distance is the second distance, the predicted distance is positively correlated with the feature distance.
In one embodiment, the determining the pathology category corresponding to the feature parameter to be diagnosed according to the magnitudes of the M predicted distances includes: determining the ratio of each predicted distance to the sum of the M predicted distances as the probability of the corresponding pathology category; and determining the pathology category corresponding to the feature parameter to be diagnosed according to the probability of each pathology category.
A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor performs the steps of: acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number; acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number; respectively calculating the predicted distances between the characteristic parameters to be diagnosed and the standard parameters corresponding to each pathological category by adopting a preset distance calculation method to obtain M predicted distances; and determining the pathological category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances.
In one embodiment, the calculating, by using a preset distance calculating method, the predicted distances between the feature parameter to be diagnosed and the standard parameter corresponding to each pathology type, to obtain M predicted distances includes: respectively calculating a first distance and/or a second distance of each characteristic parameter and standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; and respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances.
In one embodiment, the fusing calculation is performed on the N feature distances corresponding to each pathology category to obtain M predicted distances, which includes: acquiring a preset weight corresponding to each characteristic distance; and carrying out weighted calculation according to N characteristic distances corresponding to each pathology type and corresponding preset weights to obtain M predicted distances.
In one embodiment, the first distance is at least one of Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance, and the second distance is cosine similarity and/or Pearson correlation coefficient.
In one embodiment, the obtaining the preset weight corresponding to each feature distance includes: when the characteristic distance is a first distance, the preset weight is a negative number; and when the characteristic distance is a second distance, the preset weight is a positive number.
In one embodiment, the fusing calculation is performed on the N feature distances corresponding to each pathology category to obtain M predicted distances, which includes: when the characteristic distance is a first distance, the predicted distance and the characteristic distance are in negative correlation; when the feature distance is the second distance, the predicted distance is positively correlated with the feature distance.
In one embodiment, the determining the pathology category corresponding to the feature parameter to be diagnosed according to the magnitudes of the M predicted distances includes: determining the ratio of each predicted distance to the sum of the M predicted distances as the probability of the corresponding pathology category; and determining the pathology category corresponding to the feature parameter to be diagnosed according to the probability of each pathology category.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (8)

1. A method for identifying pathology categories based on a distance calculation method, comprising:
Acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
Acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
calculating the predicted distances between the feature parameters to be diagnosed and the standard parameters corresponding to each pathology type by adopting a preset distance calculation method to obtain M predicted distances, wherein the method comprises the following steps: respectively calculating a first distance or a second distance of each characteristic parameter and standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances, wherein the fusion calculation comprises the following steps: acquiring a preset weight corresponding to each characteristic distance; weighting calculation is carried out according to N characteristic distances corresponding to each pathology type and corresponding preset weights, and M predicted distances are obtained;
Determining the pathology category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances, wherein the determining comprises: determining the ratio of each predicted distance to the sum of the M predicted distances as the probability of the corresponding pathology category; and determining the pathology category corresponding to the characteristic parameter to be diagnosed according to the probability of each pathology category.
2. The method of claim 1, wherein the first distance is at least one of Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance, and the second distance is cosine similarity or Pearson correlation coefficient.
3. The method for identifying pathological category based on distance calculation method according to claim 2, wherein the obtaining the preset weight corresponding to each characteristic distance comprises:
when the characteristic distance is a first distance, the preset weight is a negative number;
And when the characteristic distance is a second distance, the preset weight is a positive number.
4. The method for identifying pathological categories based on a distance calculation method according to claim 2, wherein the respectively performing fusion calculation on the N feature distances corresponding to each pathological category to obtain M predicted distances includes:
when the characteristic distance is a first distance, the predicted distance and the characteristic distance are in negative correlation;
When the feature distance is the second distance, the predicted distance is positively correlated with the feature distance.
5. A system for identifying pathology categories based on a distance calculation method, the system for identifying pathology categories based on a distance calculation method comprising:
the first parameter acquisition module is used for acquiring characteristic parameters to be diagnosed, wherein the characteristic parameters to be diagnosed comprise N characteristic parameters corresponding to N characteristics, and N is a natural number;
The second parameter acquisition module is used for acquiring preset standard parameters, wherein the preset standard parameters comprise M multiplied by N standard parameters of N characteristics corresponding to M pathology categories respectively, and M is a natural number;
The calculation module is configured to calculate, by using a preset distance calculation method, prediction distances between the feature parameter to be diagnosed and standard parameters corresponding to each pathology category, to obtain M prediction distances, where the calculation module includes: respectively calculating a first distance or a second distance of each characteristic parameter and standard parameters corresponding to the M pathological categories to obtain M multiplied by N characteristic distances; respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances, wherein the fusion calculation comprises the following steps: acquiring a preset weight corresponding to each characteristic distance; weighting calculation is carried out according to N characteristic distances corresponding to each pathology type and corresponding preset weights, and M predicted distances are obtained;
The diagnosis module is used for determining the pathology category corresponding to the characteristic parameter to be diagnosed according to the M predicted distances, which comprises: determining the ratio of each predicted distance to the sum of the M predicted distances as the probability of the corresponding pathology category; and determining the pathology category corresponding to the characteristic parameter to be diagnosed according to the probability of each pathology category.
6. The system for identifying pathology categories based on a distance calculation method according to claim 5, wherein the calculation module comprises:
The distance calculation unit is used for calculating a first distance or a second distance of each characteristic parameter and the standard parameters corresponding to the M pathological categories respectively to obtain M multiplied by N characteristic distances;
And the distance fusion unit is used for respectively carrying out fusion calculation on N characteristic distances corresponding to each pathology category to obtain M predicted distances.
7. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method of identifying a pathology class based on a distance calculation method according to any one of claims 1 to 4 when the computer program is executed by the processor.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of identifying a pathology class based on a distance calculation method according to any one of claims 1 to 4.
CN202010857223.5A 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment Active CN112102952B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857223.5A CN112102952B (en) 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857223.5A CN112102952B (en) 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment

Publications (2)

Publication Number Publication Date
CN112102952A CN112102952A (en) 2020-12-18
CN112102952B true CN112102952B (en) 2024-05-14

Family

ID=73754457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857223.5A Active CN112102952B (en) 2020-08-24 2020-08-24 Method for identifying pathology category based on distance calculation method and related equipment

Country Status (1)

Country Link
CN (1) CN112102952B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113418920A (en) * 2021-05-14 2021-09-21 广州金域医学检验中心有限公司 Section staining quality interpretation method and device, computer equipment and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011005026A (en) * 2009-06-26 2011-01-13 Toshiba Corp Ultrasonic diagnosis apparatus and automatic diagnosis support apparatus
CA2798337A1 (en) * 2012-12-04 2014-06-04 University Of Winnipeg Cardiovascular pulse wave analysis method and system
CN107133487A (en) * 2017-06-21 2017-09-05 深圳市海威达信息技术有限公司 A kind of disease prevention and cure method, device and disease prevention and cure system
CN107680677A (en) * 2017-10-11 2018-02-09 四川大学 Neuropsychiatric disease sorting technique based on brain network analysis
CN108764280A (en) * 2018-04-17 2018-11-06 中国科学院计算技术研究所 A kind of medical data processing method and system based on symptom vector
CN109582797A (en) * 2018-12-13 2019-04-05 泰康保险集团股份有限公司 Obtain method, apparatus, medium and electronic equipment that classification of diseases is recommended
CN110299205A (en) * 2019-07-23 2019-10-01 上海图灵医疗科技有限公司 Biomedicine signals characteristic processing and evaluating method, device and application based on artificial intelligence
CN110827929A (en) * 2019-11-05 2020-02-21 中山大学 Disease classification code recognition method and device, computer equipment and storage medium
CN111079021A (en) * 2019-12-20 2020-04-28 腾讯科技(深圳)有限公司 Method, device, server and storage medium for recommending medical information content
CN111248913A (en) * 2020-01-21 2020-06-09 山东师范大学 Chronic obstructive pulmonary disease prediction system, equipment and medium based on transfer learning
EP3745947A1 (en) * 2018-01-30 2020-12-09 IRCCS Centro Neurolesi "Bonino-Pulejo" Method for detecting a conversion from mild cognitive impairment to alzheimer disease
WO2021004168A1 (en) * 2019-07-10 2021-01-14 江苏博子岛智能科技有限公司 Artificial intelligence-type medical data integration system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5611546B2 (en) * 2009-06-30 2014-10-22 株式会社東芝 Automatic diagnosis support apparatus, ultrasonic diagnosis apparatus, and automatic diagnosis support program
EP3539464A1 (en) * 2018-03-16 2019-09-18 Tata Consultancy Services Limited System and method for classification of coronary artery disease based on metadata and cardiovascular signals

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Schematic Cycle of Case-Based Reasoning Technique Implements in Clinical Decision Support Systems Used for Diagnosis of Liver Disease; Zia, S. S., et al.; Sindh University Research Journal - Science Series; Vol. 47 (No. 2); pp. 215-220 *
Research on Key Technologies of Computer-Aided Diagnosis of Cardiovascular Diseases; Liu Chang; China Master's Theses Full-text Database, Medicine and Health Sciences (No. 2); E062-24 *

Also Published As

Publication number Publication date
CN112102952A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN110120040B (en) Slice image processing method, slice image processing device, computer equipment and storage medium
CN110363387A (en) Portrait analysis method, device, computer equipment and storage medium based on big data
CN111524137B (en) Cell identification counting method and device based on image identification and computer equipment
CN109887562B (en) Similarity determination method, device, equipment and storage medium for electronic medical records
CN109817339B (en) Patient grouping method and device based on big data
CN112419321A (en) X-ray image identification method and device, computer equipment and storage medium
CN109146891B (en) Hippocampus segmentation method and device applied to MRI and electronic equipment
US20210241178A1 (en) Computationally derived cytological image markers for predicting risk of relapse in acute myeloid leukemia patients following bone marrow transplantation images
CN113077434A (en) Method, device and storage medium for lung cancer identification based on multi-modal information
CN111554402A (en) Machine learning-based method and system for predicting postoperative recurrence risk of primary liver cancer
CN112102952B (en) Method for identifying pathology category based on distance calculation method and related equipment
CN111666890A (en) Spine deformation crowd identification method and device, computer equipment and storage medium
CN112785420A (en) Credit scoring model training method and device, electronic equipment and storage medium
CN117349630A (en) Method and system for biochemical data analysis
CN113887866A (en) Method and device for generating human living environment evaluation index
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN113554668A (en) Skin mirror image melanoma segmentation method, device and related components
CN109493975B (en) Chronic disease recurrence prediction method, device and computer equipment based on xgboost model
CN110276802B (en) Method, device and equipment for positioning pathological tissue in medical image
CN112233742A (en) Medical record document classification system, equipment and storage medium based on clustering
CN113436725B (en) Data processing method, system, computer device and computer readable storage medium
Hoffmann et al. The risk function of the goodness-of-fit tests for tail models
CN113345588A (en) Rapid attribute reduction method for incomplete data set
Tasya et al. Breast Cancer Detection Using Convolutional Neural Network with EfficientNet Architecture
Balakrishnan et al. Breast Cancer Recognition Using Integrated Lasso Based Artificial Intelligence Approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 510700 No. 10, helix 3 Road, International Biological Island, Huangpu District, Guangzhou City, Guangdong Province

Applicant after: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY

Applicant after: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.

Address before: 510330 Guangdong Guangzhou Haizhuqu District Xingang East Road 2429, 3rd floor.

Applicant before: GUANGZHOU KINGMED CENTER FOR CLINICAL LABORATORY

Country or region before: China

Applicant before: GUANGZHOU KINGMED DIAGNOSTICS GROUP Co.,Ltd.

GR01 Patent grant