CN111985837A - Risk analysis method, device and equipment based on hierarchical clustering and storage medium - Google Patents
- Publication number
- CN111985837A (application number CN202010895439.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- distance
- risk analysis
- clustering
- hospital
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/231—Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/20—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Business, Economics & Management (AREA)
- Health & Medical Sciences (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Quality & Reliability (AREA)
- Evolutionary Biology (AREA)
- Development Economics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Educational Administration (AREA)
- Bioinformatics & Computational Biology (AREA)
- Game Theory and Decision Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- General Engineering & Computer Science (AREA)
- Tourism & Hospitality (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of artificial intelligence and discloses a risk analysis method, device, equipment, and storage medium based on hierarchical clustering, applicable to the field of smart healthcare. The method comprises: acquiring initial data, the initial data indicating drug sales data of a plurality of hospitals and being time-series data; calculating the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; generating a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters; and performing risk analysis according to the clustering tree to obtain a risk analysis result.
Description
Technical Field
The invention relates to the field of medical data, in particular to a risk analysis method, device, equipment and storage medium based on hierarchical clustering.
Background
Risk control refers to the measures and methods a risk manager takes to eliminate or reduce the likelihood of a risk event, or to reduce the losses incurred when a risk event occurs. Risk control is very important in fields such as e-commerce, credit card fraud prevention, and medical insurance fund fraud prevention.
Existing schemes generally look for candidate anomalies with an anomaly recognition model based on, for example, correlation analysis or statistical analysis. However, the data is often noisy, so the results are often unsatisfactory. Moreover, for high-dimensional data these methods are prone to the curse of dimensionality, which distorts the analysis result.
Disclosure of Invention
The invention provides a risk analysis method, device, equipment, and storage medium based on hierarchical clustering, which are used to avoid the curse of dimensionality when processing time-series data.
A first aspect of an embodiment of the present invention provides a risk analysis method based on hierarchical clustering, including: acquiring initial data, the initial data indicating drug sales data of a plurality of hospitals and being time-series data; calculating the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; generating a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters; and performing risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, the calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients includes: determining the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; inputting the drug sales amount Y_i and the drug sales amount Y_j into the preset similarity formula to generate the correlation coefficient of hospital i and hospital j (the formula itself is not reproduced in this text), wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and ρ_ij is the correlation coefficient of hospital i and hospital j; calculating the correlation coefficient between every other pair of hospitals to obtain a plurality of other correlation coefficients, wherein no such pair consists of both hospital i and hospital j; and generating a plurality of target correlation coefficients, which include the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, in a second implementation manner of the first aspect of the embodiment of the present invention, the generating a distance matrix between multiple hospitals according to the multiple target correlation coefficients includes: calculating initial distances between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances; generating a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
Optionally, in a third implementation manner of the first aspect of the embodiment of the present invention, the calculating an initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain a plurality of initial distances includes: calling a preset distance formula (not reproduced in this text) to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d(i, j) denotes the distance between hospital i and hospital j.
Optionally, in a fourth implementation manner of the first aspect of the embodiment of the present invention, the performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters, includes: pruning the distance matrix to obtain a pruned distance matrix; and performing hierarchical clustering on the pruned distance matrix to generate the clustering tree.
Optionally, in a fifth implementation manner of the first aspect of the embodiment of the present invention, the pruning the distance matrix to obtain a pruned distance matrix includes: converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, in a sixth implementation manner of the first aspect of the embodiment of the present invention, the performing hierarchical clustering on the pruned distance matrix to generate a clustering tree includes: calling a preset matrix distance formula (not reproduced in this text) to calculate the distances between data points in the pruned distance matrix to obtain a plurality of distances, wherein D denotes the distance between any two data points; and merging the two closest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iterating the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, so as to generate the clustering tree.
A second aspect of the embodiments of the present invention provides a risk analysis device based on hierarchical clustering, including: an acquisition module, configured to acquire initial data, the initial data indicating drug sales data of a plurality of hospitals and being time-series data; a calculation module, configured to calculate the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; a generating module, configured to generate a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; a clustering module, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters; and an analysis module, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, in a first implementation manner of the second aspect of the embodiment of the present invention, the calculation module includes: a determination unit, configured to determine the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; an input unit, configured to input the drug sales amount Y_i and the drug sales amount Y_j into the preset similarity formula to generate the correlation coefficient of hospital i and hospital j (the formula itself is not reproduced in this text), wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and ρ_ij is the correlation coefficient of hospital i and hospital j; a first calculating unit, configured to calculate the correlation coefficient between every other pair of hospitals to obtain a plurality of other correlation coefficients, wherein no such pair consists of both hospital i and hospital j; and a first generating unit, configured to generate a plurality of target correlation coefficients, which include the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, in a second implementation manner of the second aspect of the embodiment of the present invention, the generating module includes: the second calculation unit is used for calculating the initial distance between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances; a second generating unit configured to generate a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
Optionally, in a third implementation manner of the second aspect of the embodiment of the present invention, the second calculating unit is specifically configured to: call a preset distance formula (not reproduced in this text) to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d(i, j) denotes the distance between hospital i and hospital j.
Optionally, in a fourth implementation manner of the second aspect of the embodiment of the present invention, the clustering module includes: a pruning unit, configured to prune the distance matrix to obtain a pruned distance matrix; and a clustering unit, configured to perform hierarchical clustering on the pruned distance matrix to generate a clustering tree.
Optionally, in a fifth implementation manner of the second aspect of the embodiment of the present invention, the pruning unit is specifically configured to: converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, in a sixth implementation manner of the second aspect of the embodiment of the present invention, the clustering unit is specifically configured to: call a preset matrix distance formula (not reproduced in this text) to calculate the distances between data points in the pruned distance matrix to obtain a plurality of distances, wherein D denotes the distance between any two data points; and merge the two closest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iterate the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, so as to generate the clustering tree.
A third aspect of an embodiment of the present invention provides a risk analysis device based on hierarchical clustering, including a memory and at least one processor, where the memory stores instructions, and the memory and the at least one processor are interconnected by a line; the at least one processor invokes the instructions in the memory to cause the hierarchical cluster-based risk analysis device to perform the hierarchical cluster-based risk analysis method described above.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores instructions that, when executed by a processor, implement the steps of the risk analysis method based on hierarchical clustering according to any of the above embodiments.
In the technical scheme provided by the embodiment of the invention, initial data is acquired, the initial data indicating drug sales data of a plurality of hospitals and being time-series data; the correlation coefficient between any two different hospitals is calculated according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; a distance matrix among the plurality of hospitals is generated according to the plurality of target correlation coefficients; pruning and hierarchical clustering operations are performed on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters; and risk analysis is performed according to the clustering tree to obtain a risk analysis result. The embodiment of the invention performs noise reduction and pruning on the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of a risk analysis apparatus based on hierarchical clustering according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another embodiment of a risk analysis device based on hierarchical clustering according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a risk analysis device based on hierarchical clustering in the embodiment of the present invention.
Detailed Description
The invention provides a risk analysis method, device, equipment, and storage medium based on hierarchical clustering, which perform noise reduction and pruning on time-series data, thereby avoiding the curse of dimensionality and enhancing the reliability of risk analysis results.
To enable those skilled in the art to better understand the scheme of the invention, the embodiments of the invention are described below with reference to the accompanying drawings.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to FIG. 1, a flowchart of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention, the method specifically includes:
101. Acquire initial data, the initial data indicating drug sales data of a plurality of hospitals and being time-series data.
The server acquires initial data indicating drug sales data of a plurality of hospitals, the initial data being time-series data. Taking medical insurance risk control as an example, the underlying data is medical insurance settlement data; by grouping and aggregating it, the server can obtain the daily sales data of each drug at each hospital, which is time-series data.
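The grouping and aggregation step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(hospital_id, day, amount)` and the function name `daily_sales_series` are assumptions introduced here.

```python
from collections import defaultdict
from datetime import date


def daily_sales_series(records, days):
    """Group settlement records into one daily sales series per hospital.

    records: iterable of (hospital_id, day, amount) tuples (hypothetical layout).
    days: ordered list of dates that defines the time axis.
    Returns {hospital_id: [total amount for each day in `days`]}.
    """
    index = {d: k for k, d in enumerate(days)}
    series = defaultdict(lambda: [0.0] * len(days))
    for hospital_id, day, amount in records:
        if day in index:  # ignore records outside the analysis window
            series[hospital_id][index[day]] += amount
    return dict(series)


days = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)]
records = [
    ("H1", date(2020, 1, 1), 120.0),
    ("H1", date(2020, 1, 1), 30.0),
    ("H2", date(2020, 1, 2), 75.0),
    ("H1", date(2020, 1, 3), 50.0),
]
series = daily_sales_series(records, days)
```

Days without any record are filled with 0.0 so that every hospital's series has the same length, which the later pairwise comparison needs.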
It should be noted that the server analyzes the time-series data of each hospital, finds out a possibly abnormal hospital from the time-series data by using the risk analysis method based on hierarchical clustering proposed in this embodiment, and then performs visualization processing on the data of the abnormal hospital to discover and verify the reason of the abnormality and prompt the reason.
It is to be understood that the execution subject of the present invention may be a risk analysis device based on hierarchical clustering, and may also be a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject.
102. Calculate the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients.
The server calculates the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients. The target correlation coefficient between any two hospitals is the correlation coefficient between their drug sales. The drug sales may be those of a specific drug, and the analysis may focus on drugs with a high unit price and high medical insurance fund expenditure, such as drugs for treating cancer and drugs for treating cardiovascular diseases, which is not limited herein.
103. Generate a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients.
The server generates a distance matrix between a plurality of hospitals according to the plurality of target correlation coefficients.
Specifically, the server calculates an initial distance between any two different hospitals according to the multiple target correlation coefficients to obtain multiple initial distances; the server generates a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
The server calculates the initial distances as follows: the server calls a second preset formula (not reproduced in this text) to calculate the distance corresponding to each target correlation coefficient, obtaining a plurality of initial distances, where d(i, j) denotes the distance between hospital i and hospital j.
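Because the distance formula is not available in this text, the sketch below assumes one widely used mapping from a correlation coefficient to a distance, d(i, j) = sqrt(2 * (1 - ρ_ij)), which gives 0 for perfectly correlated series and 2 for perfectly anti-correlated ones. The mapping, the function names, and the sample coefficients are assumptions for illustration only; the patent's own formula may differ.

```python
from math import sqrt


def correlation_to_distance(rho):
    """Assumed mapping d = sqrt(2 * (1 - rho)); not confirmed by the source."""
    # Clamp at zero to absorb rounding when rho is marginally above 1.
    return sqrt(max(0.0, 2.0 * (1.0 - rho)))


def distance_matrix(hospitals, rho):
    """Build a symmetric distance matrix from pairwise correlation coefficients.

    hospitals: ordered list of hospital identifiers.
    rho: {(id_a, id_b): coefficient} with one entry per unordered pair.
    """
    n = len(hospitals)
    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            key = (hospitals[a], hospitals[b])
            r = rho[key] if key in rho else rho[(hospitals[b], hospitals[a])]
            dist[a][b] = dist[b][a] = correlation_to_distance(r)
    return dist


hospitals = ["A", "B", "C"]
rho = {("A", "B"): 1.0, ("A", "C"): -1.0, ("B", "C"): 0.0}
dist = distance_matrix(hospitals, rho)
```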
104. Perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters.
The server performs pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters. Specifically, the server prunes the distance matrix to obtain a pruned distance matrix, and then performs hierarchical clustering on the pruned distance matrix to generate the clustering tree.
In this embodiment, clustering uses a bottom-up (agglomerative) merging method. The merging algorithm of hierarchical clustering computes the similarity between data points or categories, merges the two most similar ones, and iterates this process. Similarity is determined from distance: the smaller the distance, the higher the similarity, so at each iteration the two closest data points or categories are merged, and the sequence of merges forms the clustering tree. Because the server has already obtained the pruned distance matrix, it can merge hospitals according to the distances between them to generate the clustering tree.
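The bottom-up merging loop can be sketched in a few lines. The text does not say how the distance between two multi-member categories is computed, so this sketch assumes single linkage (the smallest member-to-member distance); the function name and the sample matrix are likewise illustrative.

```python
def agglomerative_merges(dist):
    """Bottom-up clustering with single linkage (an assumed linkage rule).

    dist: symmetric matrix of pairwise distances between data points.
    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [frozenset([k]) for k in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the two closest clusters under single linkage.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = min(dist[p][q] for p in clusters[x] for q in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), d))
        merged = clusters[x] | clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


dist = [
    [0.0, 0.2, 0.3, 1.5],
    [0.2, 0.0, 0.4, 1.6],
    [0.3, 0.4, 0.0, 1.4],
    [1.5, 1.6, 1.4, 0.0],
]
merges = agglomerative_merges(dist)
```

The ordered merge list is one way to represent the clustering tree: each entry is an internal node whose height is the merge distance.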
105. Perform risk analysis according to the clustering tree to obtain a risk analysis result.
The server performs risk analysis according to the clustering tree to obtain a risk analysis result.
The clustering tree is obtained through hierarchical clustering, and each cluster can represent one class of hospitals in the hierarchy. The clusters are compared with a preset rule: if third-class, second-class, and first-class hospitals each cluster together, there is no problem, whereas if hospital A falls into a cluster of a different level, hospital A may be abnormal. In addition, after hierarchical clustering, the server can determine the distance between the two closest hospitals, denoted a, and the distance between the last two clusters to be merged, denoted b, and then compare a and b. If b > 3a, the hospitals in the last-merged cluster are determined to be abnormal, and the fewer hospitals the abnormal cluster contains, the higher the suspicion of abnormality.
The criterion for determining abnormality may be set according to actual conditions; for example, the condition "b > 3a" may be replaced with "b > 4a" or "b > 2a", which is not limited herein.
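The threshold rule can be illustrated with a short sketch. It assumes the clustering tree is available as an ordered list of merges `(members_a, members_b, distance)`, a hypothetical representation, and it reads "the last-merged cluster" as the smaller of the two clusters joined in the final merge, which is an interpretation rather than something the text states.

```python
def flag_last_merge(merges, ratio=3.0):
    """Apply the b > ratio * a rule to an ordered list of merges.

    merges: list of (members_a, members_b, distance) in merge order.
    Returns the members of the smaller cluster in the final merge when the
    rule fires, otherwise an empty list.
    """
    a = min(m[2] for m in merges)  # distance between the two closest hospitals
    b = merges[-1][2]              # distance between the last two clusters merged
    if b > ratio * a:
        left, right, _ = merges[-1]
        # Assumption: the smaller side of the last merge is the suspect one.
        return min(left, right, key=len)
    return []


merges = [([0], [1], 0.2), ([2], [0, 1], 0.3), ([3], [0, 1, 2], 1.4)]
flagged = flag_last_merge(merges)
```

Here a = 0.2 and b = 1.4, so the default rule b > 3a fires and hospital 3 is returned; raising `ratio` to 8.0 makes the same tree pass.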
The embodiment of the invention performs noise reduction and pruning on the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.
Referring to FIG. 2, another flowchart of the risk analysis method based on hierarchical clustering according to the embodiment of the present invention, the method specifically includes:
201. Acquire initial data, the initial data indicating drug sales data of a plurality of hospitals and being time-series data.
The server acquires initial data indicating drug sales data of a plurality of hospitals, the initial data being time-series data. Taking medical insurance risk control as an example, the underlying data is medical insurance settlement data; by grouping and aggregating it, the server can obtain the daily sales data of each drug at each hospital, which is time-series data.
It should be noted that the server analyzes the time-series data of each hospital, finds out a possibly abnormal hospital from the time-series data by using the risk analysis method based on hierarchical clustering proposed in this embodiment, and then performs visualization processing on the data of the abnormal hospital to discover and verify the reason of the abnormality and prompt the reason.
It is to be understood that the execution subject of the present invention may be a risk analysis device based on hierarchical clustering, and may also be a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject.
202. Calculate the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients.
The server calculates the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients. The target correlation coefficient between any two hospitals is the correlation coefficient between their drug sales. The drug sales may be those of a specific drug, and the analysis may focus on drugs with a high unit price and high medical insurance fund expenditure, such as drugs for treating cancer and drugs for treating cardiovascular diseases, which is not limited herein.
Specifically, the server determines the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; the server inputs the drug sales amount Y_i and the drug sales amount Y_j into the preset similarity formula to generate the correlation coefficient of hospital i and hospital j (the formula itself is not reproduced in this text), where Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and ρ_ij is the correlation coefficient of hospital i and hospital j; the server calculates the correlation coefficient between every other pair of hospitals to obtain a plurality of other correlation coefficients, where no such pair consists of both hospital i and hospital j; and the server generates a plurality of target correlation coefficients, which include the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
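The similarity formula is missing from this text, but the surrounding definitions (sales series Y_i and Y_j, <> as the mean, ρ_ij as the correlation coefficient) are consistent with the Pearson correlation written in terms of time averages. The sketch below assumes that form, ρ_ij = (<Y_i Y_j> - <Y_i><Y_j>) / sqrt((<Y_i^2> - <Y_i>^2)(<Y_j^2> - <Y_j>^2)); both the formula and the function names are assumptions, not the patent's confirmed definition.

```python
from itertools import combinations
from math import sqrt


def mean(values):
    """Time average <.> of a series."""
    return sum(values) / len(values)


def correlation(y_i, y_j):
    """Pearson correlation of two equal-length series (assumed formula)."""
    if len(y_i) != len(y_j):
        raise ValueError("series must have the same length")
    cov = mean([a * b for a, b in zip(y_i, y_j)]) - mean(y_i) * mean(y_j)
    var_i = mean([a * a for a in y_i]) - mean(y_i) ** 2
    var_j = mean([b * b for b in y_j]) - mean(y_j) ** 2
    if var_i <= 0 or var_j <= 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_i * var_j)


def pairwise_correlations(series):
    """Return {(id_a, id_b): rho} for every unordered pair of hospitals."""
    return {
        (a, b): correlation(series[a], series[b])
        for a, b in combinations(sorted(series), 2)
    }


series = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}
rho = pairwise_correlations(series)
```

In this toy input, A and B move together exactly while C moves exactly opposite to both, so the coefficients sit at the two extremes of the range.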
203. Generate a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients.
The server generates a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients.
Specifically, the server calculates an initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain a plurality of initial distances; the server generates a distance matrix based on the plurality of initial distances, the distance matrix indicating the distance between any two hospitals.
Calculating the initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain the plurality of initial distances comprises: the server calls a second preset formula to calculate the distance corresponding to each target correlation coefficient to obtain the plurality of initial distances, where d(i, j) denotes the distance between hospital i and hospital j as given by the second preset formula.
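Step 203 can be illustrated with a short sketch. The second preset formula is not reproduced in this text, so the mapping d = sqrt(2(1 - rho)) used below is only an assumption: it is a common way to turn a correlation coefficient into a distance, not necessarily the formula of the patent. The function names and the coefficient values are hypothetical.

```python
from math import sqrt


def correlation_to_distance(rho):
    """Assumed mapping d = sqrt(2 * (1 - rho)); not confirmed by the source."""
    return sqrt(max(0.0, 2.0 * (1.0 - rho)))


def distance_matrix(names, rho):
    """Symmetric matrix (list of lists) with zeros on the diagonal."""
    n = len(names)
    d = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            key = (names[a], names[b])
            r = rho[key] if key in rho else rho[(names[b], names[a])]
            d[a][b] = d[b][a] = correlation_to_distance(r)
    return d


names = ["A", "B", "C"]
# Hypothetical correlation coefficients for the three pairs.
rho = {("A", "B"): 1.0, ("A", "C"): -1.0, ("B", "C"): 0.0}
d = distance_matrix(names, rho)
```

Under this assumed mapping, perfectly correlated hospitals are at distance 0 and perfectly anti-correlated hospitals are at distance 2.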
204. Perform a pruning operation on the distance matrix to obtain the pruned distance matrix.
The server prunes the distance matrix to obtain the pruned distance matrix. Specifically, the server converts the distance matrix into an undirected graph; the server generates a minimum spanning tree from the undirected graph by using a preset algorithm; and the server prunes the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
From the distance matrix between hospitals, the undirected graph is obtained by taking the hospitals as nodes and the distances between hospitals as edge weights. Converting the undirected graph into a minimum spanning tree achieves the purpose of pruning and denoising. In the pruned distance matrix, if two hospitals are connected by an edge of the minimum spanning tree, the distance between them is preserved; otherwise, the distance between them is set to a large value, for example a value greater than 500 such as 1000 or 2000, which is not limited herein.
It should be noted that, in a given undirected graph G = (V, E), (u, v) represents the edge connecting vertex u and vertex v (the vertices being hospitals), and w(u, v) represents the weight of this edge (the distance between the two hospitals). If T is an acyclic subset of E that connects all vertices and minimizes the total weight w(T), then T is a minimum spanning tree of G. A spanning tree of a connected graph with n nodes is a minimal connected subgraph of the original graph: it contains all n nodes of the original graph and the fewest edges (n - 1) needed to keep the graph connected. The minimum spanning tree can be computed with Kruskal's algorithm or Prim's algorithm.
It should also be noted that, because actual data is usually noisy, the original distance matrix calculated from it is noisy as well, which may interfere with the final analysis result. The distance matrix is essentially a graph. Pruning uses the idea of the minimum spanning tree: only the distances along the edges of the minimum spanning tree are retained, and all other distances are set to a large value. This greatly reduces the noise in the pruned distance matrix, which benefits the subsequent analysis.
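The pruning in step 204 can be sketched as below. This is a minimal illustration using Prim's algorithm, one of the two algorithms the text names; the constant `LARGE`, the function names and the example matrix are assumptions made for the sketch.

```python
LARGE = 1000.0  # stand-in for pruned distances (the text suggests values above 500)


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns the tree edges as a set of (u, v) index pairs with u < v.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = set()
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.add((min(u, parent[u]), max(u, parent[u])))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    return edges


def prune(dist, large=LARGE):
    """Keep distances on tree edges; replace every other off-diagonal entry."""
    edges = minimum_spanning_tree(dist)
    n = len(dist)
    return [[0.0 if i == j
             else dist[i][j] if (min(i, j), max(i, j)) in edges
             else large
             for j in range(n)] for i in range(n)]


# Hypothetical distance matrix for four hospitals.
dist = [
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 6.0],
    [3.0, 5.0, 6.0, 0.0],
]
pruned = prune(dist)
```

In this example the tree keeps the edges 0-1, 1-2 and 0-3, so the three remaining pairwise distances are replaced by `LARGE`.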
205. Perform hierarchical clustering on the pruned distance matrix to generate a clustering tree.
The server performs hierarchical clustering on the pruned distance matrix to generate a clustering tree. Specifically, the server calls a preset matrix distance formula to calculate the distance between the data points in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is the Euclidean distance $D = \sqrt{\sum_k (x_k - y_k)^2}$, where $x_k$ and $y_k$ are the k-th components of the two data points and D represents the distance between any two data points; the server merges the two closest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iteratively executes the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, thereby generating the clustering tree.
In this embodiment, clustering uses a bottom-up merging method: the merging algorithm of hierarchical clustering computes the similarity between data points or categories, combines the two most similar ones, and repeats this process iteratively. Similarity is determined from the distance between data points: the smaller the distance, the higher the similarity, and the two closest data points or categories are combined at each step to generate the clustering tree. Because the server has already obtained the pruned distance matrix, it can combine hospitals according to the distances between them to generate the clustering tree. The distance used is the Euclidean distance D between data points; the smaller the Euclidean distance, the higher the similarity.
In the present application, data points are hospitals: one hospital is one data point, and a data combination is two combined data points. Suppose the pruned distance matrix in the embodiment of the present invention includes six hospitals A, B, C, D, E and F, that is, data points A, B, C, D, E and F. After data point B (hospital B) and data point C (hospital C) are combined, the category (B, C) is obtained, leaving data category A, data category (B, C), data category D, data category E and data category F; the distance matrix between these data categories is then recalculated.
It is understood that the distance between two data points is calculated with the preset matrix distance formula. The distance between a data combination and another data point is calculated as follows: for example, the distance from the data combination (B, C) to the data point A is the mean of the distance from B to A and the distance from C to A, i.e. $D((B,C),A) = \frac{D(B,A) + D(C,A)}{2}$. The distance between two data combinations is calculated as follows: the distance from each data point in one combination to each data point in the other combination is calculated, and the mean of all these distances is taken as the distance between the two data combinations. This method is more computationally intensive, but its results are more reasonable than those of the first two methods. For example, the distance from the data combination (A, E) to the data combination (B, C) is $D((A,E),(B,C)) = \frac{D(A,B) + D(A,C) + D(E,B) + D(E,C)}{4}$.
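The bottom-up merging with mean distances between combinations can be sketched as follows. This is a minimal illustration that takes the point-to-point distances as given; the names `average_linkage` and `cluster_tree`, the merge-history representation and the distance values are assumptions made for the sketch, not part of the patent.

```python
from itertools import combinations


def average_linkage(c1, c2, dist):
    """Mean of all point-to-point distances between two clusters."""
    total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def cluster_tree(points, dist):
    """Bottom-up merging; returns the merge history.

    Each entry is (cluster1, cluster2, distance), in merge order.
    """
    clusters = [(p,) for p in points]
    merges = []
    while len(clusters) > 1:
        # Find the two closest clusters under the mean-distance rule.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(pair[0], pair[1], dist))
        merges.append((c1, c2, average_linkage(c1, c2, dist)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical distances between four hospitals.
dist = {
    frozenset(("A", "B")): 4.0, frozenset(("A", "C")): 6.0,
    frozenset(("A", "E")): 9.0, frozenset(("B", "C")): 1.0,
    frozenset(("B", "E")): 8.0, frozenset(("C", "E")): 10.0,
}
merges = cluster_tree(["A", "B", "C", "E"], dist)
```

With these values, B and C merge first at distance 1.0, then A joins (B, C) at the mean distance (4.0 + 6.0) / 2 = 5.0, and E joins last. The merge history plays the role of the clustering tree here; each recomputation of the mean is what makes this rule more expensive than simpler linkage rules.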
206. Perform risk analysis according to the clustering tree to obtain a risk analysis result.
The server performs risk analysis according to the clustering tree to obtain a risk analysis result.
The clustering tree is obtained through hierarchical clustering, and each cluster (level of the hierarchy) can represent one class of hospitals. The clusters are compared with a preset rule: if tertiary, secondary and primary hospitals are each clustered with hospitals of their own level, there is no problem; if a hospital A falls into a cluster of a different level, hospital A may be abnormal. In addition, after hierarchical clustering, the distance a between the two closest hospitals and the distance b between the two clusters merged last can be determined and compared. If b > 3a, the hospitals in the cluster merged last are determined to be abnormal, and the fewer hospitals the abnormal cluster contains, the higher the degree of suspicion.
The criterion for determining an abnormality may be set according to actual conditions. For example, the condition "b > 3a" may be replaced with "b > 4a" or "b > 2a", which is not limited herein.
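The threshold rule can be sketched as below, with the multiplier exposed as a parameter. This is only an illustration: the function name, the merge-history format and the choice to report the smaller of the two clusters joined last are assumptions, since the text does not specify how the suspect cluster is selected.

```python
def flag_last_merge(merges, factor=3.0):
    """Apply the b > factor * a rule to a merge history.

    merges: list of (cluster1, cluster2, distance) in merge order.
    Returns the suspect cluster, or None when the rule does not fire.
    """
    a = merges[0][2]        # distance between the two closest hospitals
    c1, c2, b = merges[-1]  # the two clusters joined last and their distance
    if b <= factor * a:
        return None
    # Assumption: the smaller of the two clusters is treated as the suspect one.
    return min(c1, c2, key=len)


# Hypothetical merge history for four hospitals.
merges = [
    (("B",), ("C",), 1.0),
    (("A",), ("B", "C"), 5.0),
    (("E",), ("A", "B", "C"), 9.0),
]
suspect = flag_last_merge(merges)             # 9.0 > 3 * 1.0, so a cluster is flagged
relaxed = flag_last_merge(merges, factor=10)  # 9.0 <= 10 * 1.0, nothing is flagged
```

Changing `factor` corresponds to replacing "b > 3a" with "b > 4a" or "b > 2a" as the text allows.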
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
The risk analysis method based on hierarchical clustering in the embodiment of the present invention is described above. The risk analysis device based on hierarchical clustering in the embodiment of the present invention is described below with reference to fig. 3. An embodiment of the risk analysis device based on hierarchical clustering includes:
an obtaining module 301, configured to obtain initial data, where the initial data is used to indicate drug sales data of multiple hospitals, and the initial data is time-series data;
a calculating module 302, configured to calculate correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain multiple target correlation coefficients;
a generating module 303, configured to generate a distance matrix between multiple hospitals according to the multiple target correlation coefficients;
a clustering module 304, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, where the clustering tree includes multiple clusters;
an analysis module 305, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
Referring to fig. 4, another embodiment of the risk analysis device based on hierarchical clustering according to the embodiment of the present invention includes:
an obtaining module 301, configured to obtain initial data, where the initial data is used to indicate drug sales data of multiple hospitals, and the initial data is time-series data;
a calculating module 302, configured to calculate correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain multiple target correlation coefficients;
a generating module 303, configured to generate a distance matrix between multiple hospitals according to the multiple target correlation coefficients;
a clustering module 304, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, where the clustering tree includes multiple clusters;
an analysis module 305, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, the calculating module 302 includes:
a determination unit 3021, configured to determine the drug sales amount $Y_i$ of hospital i and the drug sales amount $Y_j$ of hospital j, respectively;
an input unit 3022, configured to input the drug sales amount $Y_i$ and the drug sales amount $Y_j$ into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is $\rho_{ij} = \frac{\langle Y_i Y_j \rangle - \langle Y_i \rangle \langle Y_j \rangle}{\sqrt{(\langle Y_i^2 \rangle - \langle Y_i \rangle^2)(\langle Y_j^2 \rangle - \langle Y_j \rangle^2)}}$, where $Y_i$ denotes the drug sales amount of hospital i, $Y_j$ denotes the drug sales amount of hospital j, i and j are positive integers, $\langle \cdot \rangle$ denotes the mean value, and $\rho_{ij}$ is the correlation coefficient of hospital i and hospital j;
a first calculating unit 3023, configured to calculate the correlation coefficient between every other pair of hospitals to obtain a plurality of other correlation coefficients, where no other pair consists of both hospital i and hospital j;
a first generating unit 3024, configured to generate a plurality of target correlation coefficients including the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, the generating module 303 includes:
a second calculating unit 3031, configured to calculate initial distances between any two different hospitals according to the multiple target correlation coefficients, so as to obtain multiple initial distances;
a second generating unit 3032, configured to generate a distance matrix based on the plurality of initial distances, where the distance matrix is used to indicate a distance between any two hospitals.
Optionally, the second calculating unit 3031 is specifically configured to:
call a preset distance formula to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, where d(i, j) denotes the distance between hospital i and hospital j as given by the preset distance formula.
Optionally, the clustering module 304 includes:
a pruning unit 3041, configured to perform a pruning operation on the distance matrix to obtain a pruned distance matrix;
a clustering unit 3042, configured to perform hierarchical clustering on the pruned distance matrix, and generate a clustering tree.
Optionally, the pruning unit 3041 is specifically configured to:
converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, the clustering unit 3042 is specifically configured to:
calling a preset matrix distance formula to calculate the distance between the data points in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is the Euclidean distance $D = \sqrt{\sum_k (x_k - y_k)^2}$, where $x_k$ and $y_k$ are the k-th components of the two data points and D represents the distance between any two data points; and merging the two closest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iteratively executing the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, thereby generating a clustering tree.
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
Figs. 3 and 4 describe the risk analysis device based on hierarchical clustering in the embodiment of the present invention in detail from the perspective of modular functional entities; the risk analysis device based on hierarchical clustering is described in detail below from the perspective of hardware processing.
Fig. 5 is a schematic structural diagram of a risk analysis device based on hierarchical clustering according to an embodiment of the present invention. The risk analysis device 500 based on hierarchical clustering may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 510 (e.g., one or more processors), a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations for the risk analysis device 500 based on hierarchical clustering. Further, the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the risk analysis device 500 based on hierarchical clustering, the series of instruction operations stored in the storage medium 530.
The hierarchical clustering-based risk analysis device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the hierarchical clustering based risk analysis device architecture shown in fig. 5 does not constitute a limitation of hierarchical clustering based risk analysis devices, and may include more or fewer components than shown, or combine certain components, or a different arrangement of components. The processor 510 may perform the functions of the obtaining module 301, the calculating module 302, the generating module 303, the clustering module 304 and the analyzing module 305 in the above embodiments.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the hierarchical clustering based risk analysis method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A risk analysis method based on hierarchical clustering is characterized by comprising the following steps:
acquiring initial data, wherein the initial data is used for indicating drug sales data of a plurality of hospitals, and the initial data is time sequence data;
calculating correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients;
generating a distance matrix among a plurality of hospitals according to the plurality of target correlation coefficients;
pruning and hierarchical clustering operations are carried out on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters;
and performing risk analysis according to the clustering tree to obtain a risk analysis result.
2. The risk analysis method based on hierarchical clustering according to claim 1, wherein the calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients comprises:
respectively determining the drug sales amount $Y_i$ of hospital i and the drug sales amount $Y_j$ of hospital j;
inputting the drug sales amount $Y_i$ and the drug sales amount $Y_j$ into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is $\rho_{ij} = \frac{\langle Y_i Y_j \rangle - \langle Y_i \rangle \langle Y_j \rangle}{\sqrt{(\langle Y_i^2 \rangle - \langle Y_i \rangle^2)(\langle Y_j^2 \rangle - \langle Y_j \rangle^2)}}$, where $Y_i$ denotes the drug sales amount of hospital i, $Y_j$ denotes the drug sales amount of hospital j, i and j are positive integers, $\langle \cdot \rangle$ denotes the mean value, and $\rho_{ij}$ is the correlation coefficient of hospital i and hospital j;
calculating correlation coefficients between any other two hospitals to obtain a plurality of other correlation coefficients, wherein the any other two hospitals do not include hospital i and hospital j at the same time;
generating a plurality of target correlation coefficients including the correlation coefficients for Hospital i and Hospital j and the plurality of other correlation coefficients.
3. The hierarchical clustering-based risk analysis method according to claim 1, wherein the generating a distance matrix between hospitals according to the plurality of target correlation coefficients comprises:
calculating initial distances between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances;
generating a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
4. The risk analysis method based on hierarchical clustering according to claim 3, wherein the calculating an initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain a plurality of initial distances comprises:
5. The risk analysis method based on hierarchical clustering according to any one of claims 1-4, wherein the pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters, comprise:
pruning the distance matrix to obtain a pruned distance matrix;
and performing hierarchical clustering on the pruned distance matrix to generate a clustering tree.
6. The risk analysis method based on hierarchical clustering according to claim 5, wherein the pruning operation on the distance matrix to obtain the pruned distance matrix comprises:
converting the distance matrix into an undirected graph;
generating a minimum spanning tree by using a preset algorithm and the undirected graph;
and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
7. The risk analysis method based on hierarchical clustering according to claim 5, wherein the hierarchical clustering of the pruned distance matrix to generate a clustering tree comprises:
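calling a preset matrix distance formula to calculate the distance between the data points in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is the Euclidean distance $D = \sqrt{\sum_k (x_k - y_k)^2}$, where $x_k$ and $y_k$ are the k-th components of the two data points and D represents the distance between any two data points;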
and performing hierarchical clustering on two nearest data points in the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and performing the hierarchical clustering process in an iterative manner until the distance matrix is converted into a plurality of clusters to generate a clustering tree.
8. A risk analysis device based on hierarchical clustering is characterized by comprising:
an acquisition module, configured to acquire initial data, wherein the initial data is used for indicating drug sales data of a plurality of hospitals, and the initial data is time-series data;
a calculation module, configured to calculate correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients;
a generating module, configured to generate a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients;
a clustering module, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters;
and an analysis module, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
9. A hierarchical clustering-based risk analysis device, characterized in that the hierarchical clustering-based risk analysis device comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the hierarchical cluster-based risk analysis device to perform the hierarchical cluster-based risk analysis method of any one of claims 1-7.
10. A computer-readable storage medium storing instructions that, when executed by a processor, implement a hierarchical clustering based risk analysis method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010895439.0A CN111985837A (en) | 2020-08-31 | 2020-08-31 | Risk analysis method, device and equipment based on hierarchical clustering and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111985837A (en) | 2020-11-24 |
Family
ID=73440480
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010895439.0A Pending CN111985837A (en) | 2020-08-31 | 2020-08-31 | Risk analysis method, device and equipment based on hierarchical clustering and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111985837A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||