CN111985837A - Risk analysis method, device and equipment based on hierarchical clustering and storage medium - Google Patents

Info

Publication number
CN111985837A
Authority
CN
China
Prior art keywords
data
distance
risk analysis
clustering
hospital
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010895439.0A
Other languages
Chinese (zh)
Inventor
郭建福
张旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Medical and Healthcare Management Co Ltd
Original Assignee
Ping An Medical and Healthcare Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Medical and Healthcare Management Co Ltd
Priority to CN202010895439.0A
Publication of CN111985837A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Educational Administration (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses a risk analysis method, device, equipment and storage medium based on hierarchical clustering, applied to the field of smart healthcare. The method comprises the following steps: acquiring initial data, wherein the initial data indicates drug sales data of a plurality of hospitals and is time-series data; calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; generating a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters; and performing risk analysis according to the clustering tree to obtain a risk analysis result.

Description

Risk analysis method, device and equipment based on hierarchical clustering and storage medium
Technical Field
The invention relates to the field of medical data, in particular to a risk analysis method, device, equipment and storage medium based on hierarchical clustering.
Background
Risk control refers to a risk manager taking various measures to eliminate or reduce the likelihood that a risk event occurs, or to reduce the losses incurred when a risk event does occur. In fields such as e-commerce, credit card fraud prevention and medical insurance fund fraud prevention, risk control is a very important direction.
In existing schemes, candidate abnormal results are generally found through anomaly recognition models such as correlation analysis and statistical analysis, but the data are often noisy and the results obtained are often unsatisfactory. Moreover, for high-dimensional data, such methods easily fall into the curse of dimensionality, which distorts the analysis result.
Disclosure of Invention
The invention provides a risk analysis method, device, equipment and storage medium based on hierarchical clustering, which are used for avoiding the curse of dimensionality when time-series data are processed.
A first aspect of an embodiment of the present invention provides a risk analysis method based on hierarchical clustering, including: acquiring initial data, wherein the initial data indicates drug sales data of a plurality of hospitals and is time-series data; calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; generating a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters; and performing risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, the calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients includes: respectively determining the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; inputting the drug sales amount Y_i and the drug sales amount Y_j into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is
Figure BDA0002658318220000011
wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and p_ij is the correlation coefficient of hospital i and hospital j; calculating correlation coefficients between any other two hospitals to obtain a plurality of other correlation coefficients, wherein the any other two hospitals do not include hospital i and hospital j at the same time; and generating a plurality of target correlation coefficients including the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, in a second implementation manner of the first aspect of the embodiment of the present invention, the generating a distance matrix between multiple hospitals according to the multiple target correlation coefficients includes: calculating initial distances between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances; generating a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
Optionally, in a third implementation manner of the first aspect of the embodiment of the present invention, the calculating an initial distance between any two different hospitals according to the multiple target correlation coefficients to obtain multiple initial distances includes: calling a preset distance formula to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d(i, j) represents the distance between hospital i and hospital j, and the preset distance formula is as follows:
Figure BDA0002658318220000021
Optionally, in a fourth implementation manner of the first aspect of the embodiment of the present invention, the performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree comprising a plurality of clusters includes: pruning the distance matrix to obtain a pruned distance matrix; and performing hierarchical clustering on the pruned distance matrix to generate a clustering tree.
Optionally, in a fifth implementation manner of the first aspect of the embodiment of the present invention, the pruning the distance matrix to obtain a pruned distance matrix includes: converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, in a sixth implementation manner of the first aspect of the embodiment of the present invention, the performing hierarchical clustering on the pruned distance matrix to generate a clustering tree includes: calling a preset matrix distance formula to calculate the distance of each data point in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is
Figure BDA0002658318220000022
where D represents the distance between any two data points; and merging the two nearest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iterating the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, thereby generating a clustering tree.
A second aspect of the embodiments of the present invention provides a risk analysis device based on hierarchical clustering, including: an acquisition module, used for acquiring initial data, wherein the initial data indicates drug sales data of a plurality of hospitals and is time-series data; a calculation module, used for calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; a generating module, used for generating a distance matrix among the plurality of hospitals according to the plurality of target correlation coefficients; a clustering module, used for performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters; and an analysis module, used for performing risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, in a first implementation manner of the second aspect of the embodiment of the present invention, the calculation module includes: a determination unit for respectively determining the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; an input unit for inputting the drug sales amount Y_i and the drug sales amount Y_j into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is
Figure BDA0002658318220000031
wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and p_ij is the correlation coefficient of hospital i and hospital j; a first calculating unit for calculating correlation coefficients between any other two hospitals to obtain a plurality of other correlation coefficients, wherein the any other two hospitals do not include hospital i and hospital j at the same time; and a first generating unit configured to generate a plurality of target correlation coefficients including the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, in a second implementation manner of the second aspect of the embodiment of the present invention, the generating module includes: the second calculation unit is used for calculating the initial distance between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances; a second generating unit configured to generate a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
Optionally, in a third implementation manner of the second aspect of the embodiment of the present invention, the second calculating unit is specifically configured to: calling a preset distance formula to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d (i, j) represents the distance between hospital i and hospital j, and the preset distance formula is as follows:
Figure BDA0002658318220000032
Optionally, in a fourth implementation manner of the second aspect of the embodiment of the present invention, the clustering module includes: a pruning unit for carrying out a pruning operation on the distance matrix to obtain a pruned distance matrix; and a clustering unit for carrying out hierarchical clustering on the pruned distance matrix to generate a clustering tree.
Optionally, in a fifth implementation manner of the second aspect of the embodiment of the present invention, the pruning unit is specifically configured to: converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, in a sixth implementation manner of the second aspect of the embodiment of the present invention, the clustering unit is specifically configured to: calling a preset matrix distance formula to calculate the distance of each data point in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is
Figure BDA0002658318220000033
where D represents the distance between any two data points; and merging the two nearest data points according to the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and iterating the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, thereby generating a clustering tree.
A third aspect of an embodiment of the present invention provides a risk analysis device based on hierarchical clustering, including a memory and at least one processor, where the memory stores instructions, and the memory and the at least one processor are interconnected by a line; the at least one processor invokes the instructions in the memory to cause the hierarchical cluster-based risk analysis device to perform the hierarchical cluster-based risk analysis method described above.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores instructions that, when executed by a processor, implement the steps of the risk analysis method based on hierarchical clustering according to any of the above embodiments.
According to the technical scheme provided by the embodiment of the invention, initial data are obtained, wherein the initial data indicate the drug sales data of a plurality of hospitals and are time-series data; a correlation coefficient between any two different hospitals is calculated according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients; a distance matrix among the plurality of hospitals is generated according to the plurality of target correlation coefficients; pruning and hierarchical clustering operations are carried out on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters; and risk analysis is performed according to the clustering tree to obtain a risk analysis result. In the embodiment of the invention, the time-series data undergo noise reduction and pruning, so that the curse of dimensionality is avoided and the reliability of the risk analysis result is enhanced.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of a risk analysis apparatus based on hierarchical clustering according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another embodiment of a risk analysis device based on hierarchical clustering according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a risk analysis device based on hierarchical clustering in the embodiment of the present invention.
Detailed Description
The invention provides a risk analysis method, device, equipment and storage medium based on hierarchical clustering, which perform noise reduction and pruning on time-series data, avoid the curse of dimensionality and enhance the reliability of risk analysis results.
To enable those skilled in the art to better understand the scheme of the invention, the embodiments of the invention are described below in conjunction with the accompanying drawings.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to FIG. 1, a flowchart of a risk analysis method based on hierarchical clustering according to an embodiment of the present invention specifically includes the following steps:
101. Initial data indicating drug sales data of a plurality of hospitals is acquired, and the initial data is time-series data.
The server acquires initial data indicating drug sales data of a plurality of hospitals, the initial data being time-series data. Taking medical insurance risk control as an example, the basic data is medical insurance settlement data, and by grouping and aggregating it the server can obtain the daily sales data of each drug in each hospital, which is time-series data.
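The grouping and aggregation described above can be illustrated with a minimal sketch. The record layout (the keys "hospital", "drug", "day" and "amount"), the function name daily_sales_by_hospital and the sample values are hypothetical and are not taken from the source; days without a settlement record are assumed to count as zero sales.

```python
from collections import defaultdict
from datetime import date


def daily_sales_by_hospital(records, drug, days):
    """Return {hospital: [sales on days[0], sales on days[1], ...]} for one drug.

    Hypothetical record layout: each record is a dict with the keys
    "hospital", "drug", "day" and "amount".
    """
    totals = defaultdict(float)
    hospitals = set()
    for rec in records:
        hospitals.add(rec["hospital"])
        if rec["drug"] == drug:
            totals[(rec["hospital"], rec["day"])] += rec["amount"]
    # Assumption: a day with no settlement record counts as zero sales.
    return {h: [totals.get((h, d), 0.0) for d in days] for h in sorted(hospitals)}


records = [
    {"hospital": "A", "drug": "drug_x", "day": date(2020, 1, 1), "amount": 120.0},
    {"hospital": "A", "drug": "drug_x", "day": date(2020, 1, 1), "amount": 80.0},
    {"hospital": "A", "drug": "drug_x", "day": date(2020, 1, 2), "amount": 150.0},
    {"hospital": "B", "drug": "drug_x", "day": date(2020, 1, 2), "amount": 60.0},
    {"hospital": "B", "drug": "drug_y", "day": date(2020, 1, 1), "amount": 999.0},
]
days = [date(2020, 1, 1), date(2020, 1, 2)]
series = daily_sales_by_hospital(records, "drug_x", days)
```

Each hospital ends up with one equally long series per drug, which is the shape the later correlation step needs.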
It should be noted that the server analyzes the time-series data of each hospital, finds out a possibly abnormal hospital from the time-series data by using the risk analysis method based on hierarchical clustering proposed in this embodiment, and then performs visualization processing on the data of the abnormal hospital to discover and verify the reason of the abnormality and prompt the reason.
It is to be understood that the execution subject of the present invention may be a risk analysis device based on hierarchical clustering, and may also be a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject.
102. Calculating the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients.
The server calculates the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients. The target correlation coefficient between any two hospitals is the correlation coefficient between their drug sales, where the drug sales refer to the sales of a specific drug. The analysis can focus on drugs with a higher unit price and higher medical insurance fund expenditure, such as drugs for treating cancer and drugs for treating cardiovascular diseases, which is not limited herein.
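The preset similarity formula is available only as an image in this text, so the following is a minimal sketch under the assumption that it is the standard Pearson correlation coefficient written with mean brackets <>, which matches the symbol definitions given for Y_i, Y_j and <>. The function names and the sample series are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correlation(y_i, y_j):
    """Pearson correlation expressed with means (assumed form of the formula).

    (<Yi*Yj> - <Yi><Yj>) / sqrt((<Yi^2> - <Yi>^2) * (<Yj^2> - <Yj>^2))
    """
    if not y_i or len(y_i) != len(y_j):
        raise ValueError("series must be non-empty and of equal length")
    m_i, m_j = mean(y_i), mean(y_j)
    cov = mean([a * b for a, b in zip(y_i, y_j)]) - m_i * m_j
    var_i = mean([a * a for a in y_i]) - m_i ** 2
    var_j = mean([b * b for b in y_j]) - m_j ** 2
    if var_i <= 0 or var_j <= 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_i * var_j)


def pairwise_correlations(series):
    """Correlation for every unordered pair of different hospitals."""
    names = sorted(series)
    return {
        (a, b): correlation(series[a], series[b])
        for idx, a in enumerate(names)
        for b in names[idx + 1:]
    }


series = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
rho = pairwise_correlations(series)
```

With n hospitals this yields n(n-1)/2 coefficients, one per unordered pair, which corresponds to the "plurality of target correlation coefficients".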
103. And generating a distance matrix among a plurality of hospitals according to the plurality of target correlation coefficients.
The server generates a distance matrix between a plurality of hospitals according to the plurality of target correlation coefficients.
Specifically, the server calculates an initial distance between any two different hospitals according to the multiple target correlation coefficients to obtain multiple initial distances; the server generates a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
The calculating, by the server, of the initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain the plurality of initial distances includes: the server calls a second preset formula to calculate the distance corresponding to each target correlation coefficient to obtain the plurality of initial distances, where d(i, j) represents the distance between hospital i and hospital j, and the second preset formula is as follows:
Figure BDA0002658318220000051
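The distance formula is also available only as an image here. The sketch below assumes the commonly used correlation-to-distance transform d(i, j) = sqrt(2 * (1 - p_ij)), which maps a correlation of 1 to distance 0 and a correlation of -1 to distance 2; the source may use a different transform. The function names and sample coefficients are hypothetical.

```python
from math import sqrt


def correlation_to_distance(rho):
    """Assumed transform: d = sqrt(2 * (1 - rho)). Not confirmed by the source."""
    # Clamp to [-1, 1] to guard against tiny floating-point overshoot.
    rho = max(-1.0, min(1.0, rho))
    return sqrt(2.0 * (1.0 - rho))


def distance_matrix(names, rho):
    """Build a symmetric matrix with zeros on the diagonal.

    rho maps (names[i], names[j]) with i < j to a correlation coefficient.
    """
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = correlation_to_distance(rho[(names[i], names[j])])
            dist[i][j] = dist[j][i] = d
    return dist


names = ["A", "B", "C"]
rho = {("A", "B"): 1.0, ("A", "C"): -1.0, ("B", "C"): 0.0}
dist = distance_matrix(names, rho)
```

Whatever transform is used, the resulting matrix should be symmetric with a zero diagonal so that it can be read as an undirected weighted graph in the pruning step.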
104. Pruning and hierarchical clustering operations are carried out on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters.
The server performs pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters. Specifically, the server prunes the distance matrix to obtain a pruned distance matrix, and the server carries out hierarchical clustering on the pruned distance matrix to generate the clustering tree.
In this embodiment, clustering uses a bottom-up merging (agglomerative) method. The merging algorithm of hierarchical clustering computes the similarity between every two categories of data points, combines the two most similar ones, and iterates this process. Similarity is determined from the distance between data points: the smaller the distance, the higher the similarity, and the two data points or categories that are closest to each other are combined to generate the clustering tree. Because the server has already obtained the pruned distance matrix, it can combine different hospitals according to the distances between them to generate the clustering tree.
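The bottom-up merging loop can be sketched as follows. The source does not state in this text how the distance between two categories is measured, so single linkage (the smallest distance between any member of one category and any member of the other) is used here purely as an assumption. The function name agglomerative and the sample matrix are hypothetical.

```python
def agglomerative(dist):
    """Bottom-up clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, distance) tuples,
    from which a clustering tree (dendrogram) can be drawn.
    Assumption: single linkage is used as the inter-cluster distance.
    """
    clusters = {i: (i,) for i in range(len(dist))}  # cluster id -> member indices
    merges = []
    next_id = len(dist)

    def linkage(a, b):
        return min(dist[p][q] for p in clusters[a] for q in clusters[b])

    while len(clusters) > 1:
        ids = sorted(clusters)
        # Pick the two closest clusters among all unordered pairs.
        a, b = min(
            ((x, y) for k, x in enumerate(ids) for y in ids[k + 1:]),
            key=lambda pair: linkage(*pair),
        )
        merges.append((clusters[a], clusters[b], linkage(a, b)))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


dist = [
    [0.0, 0.2, 0.3, 1.5],
    [0.2, 0.0, 0.25, 1.4],
    [0.3, 0.25, 0.0, 1.6],
    [1.5, 1.4, 1.6, 0.0],
]
merges = agglomerative(dist)
```

In this sample, hospitals 0, 1 and 2 merge at small distances, while hospital 3 joins only in the final merge at a much larger distance.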
105. Performing risk analysis according to the clustering tree to obtain a risk analysis result.
The server performs risk analysis according to the clustering tree to obtain a risk analysis result.
The clustering tree is obtained through hierarchical clustering, and each cluster can represent the hierarchical structure of one class of hospitals. The clusters are compared with a preset rule: if third-class, second-class and first-class hospitals are each clustered together with hospitals of their own class, there is no problem, whereas if a hospital A falls into a cluster of a different class, hospital A may be abnormal. In addition, after hierarchical clustering, the distance between the two closest hospitals, denoted a, and the distance between the last two clusters to be merged, denoted b, can be determined and compared. If b > 3a, the hospitals in the last merged cluster are determined to be abnormal, and the fewer hospitals the abnormal cluster contains, the higher the degree of suspicion.
The criterion for determining an abnormality may be set according to actual conditions; for example, the condition "b > 3a" may be replaced with "b > 4a" or "b > 2a", which is not limited herein.
In the embodiment of the invention, the time-series data undergo noise reduction and pruning, so that the curse of dimensionality is avoided and the reliability of the risk analysis result is enhanced. This scheme can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.
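The threshold rule can be sketched as below. The merge history is assumed to be a list of (members_a, members_b, distance) tuples in merge order, and, as one possible reading of the text, the smaller of the two clusters joined in the last merge is the one flagged. The function name flag_last_merge, the ratio parameter and the sample history are hypothetical.

```python
def flag_last_merge(merges, ratio=3.0):
    """Apply the rule b > ratio * a to a merge history.

    a: distance of the first merge (the two closest hospitals).
    b: distance of the final merge (the last two clusters joined).
    Assumption: the smaller side of the final merge is treated as suspicious.
    Returns the flagged member indices, or an empty tuple.
    """
    if not merges:
        return ()
    a = merges[0][2]
    left, right, b = merges[-1]
    if b > ratio * a:
        return min(left, right, key=len)
    return ()


# Hypothetical merge history for four hospitals indexed 0..3.
merges = [((0,), (1,), 0.2), ((2,), (0, 1), 0.25), ((3,), (2, 0, 1), 1.4)]
suspects_default = flag_last_merge(merges)           # b = 1.4 > 3 * 0.2
suspects_strict = flag_last_merge(merges, ratio=8.0)  # b = 1.4 is not > 8 * 0.2
```

Exposing the multiplier as a parameter reflects the statement that "b > 3a" may be replaced with other conditions according to actual needs.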
Referring to FIG. 2, another flowchart of the risk analysis method based on hierarchical clustering according to the embodiment of the present invention specifically includes the following steps:
201. Initial data indicating drug sales data of a plurality of hospitals is acquired, and the initial data is time-series data.
The server acquires initial data indicating drug sales data of a plurality of hospitals, the initial data being time-series data. Taking medical insurance risk control as an example, the basic data is medical insurance settlement data, and by grouping and aggregating it the server can obtain the daily sales data of each drug in each hospital, which is time-series data.
It should be noted that the server analyzes the time-series data of each hospital, finds out a possibly abnormal hospital from the time-series data by using the risk analysis method based on hierarchical clustering proposed in this embodiment, and then performs visualization processing on the data of the abnormal hospital to discover and verify the reason of the abnormality and prompt the reason.
It is to be understood that the execution subject of the present invention may be a risk analysis device based on hierarchical clustering, and may also be a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject.
202. Calculating the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients.
The server calculates the correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients. The target correlation coefficient between any two hospitals is the correlation coefficient between their drug sales, where the drug sales refer to the sales of a specific drug. The analysis can focus on drugs with a higher unit price and higher medical insurance fund expenditure, such as drugs for treating cancer and drugs for treating cardiovascular diseases, which is not limited herein.
Specifically, the server respectively determines the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j; the server inputs the drug sales amount Y_i and the drug sales amount Y_j into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is
Figure BDA0002658318220000071
wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and p_ij is the correlation coefficient of hospital i and hospital j; the server calculates the correlation coefficient between any other two hospitals to obtain a plurality of other correlation coefficients, wherein the any other two hospitals do not include hospital i and hospital j at the same time; and the server generates a plurality of target correlation coefficients including the correlation coefficient of hospital i and hospital j and the plurality of other correlation coefficients.
203. And generating a distance matrix among a plurality of hospitals according to the plurality of target correlation coefficients.
The server generates a distance matrix between a plurality of hospitals according to the plurality of target correlation coefficients.
Specifically, the server calculates an initial distance between any two different hospitals according to the multiple target correlation coefficients to obtain multiple initial distances; the server generates a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
Calculating the initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain the plurality of initial distances comprises: the server calls a second preset formula to calculate the distance corresponding to each target correlation coefficient, obtaining the plurality of initial distances, where d(i, j) denotes the distance between hospital i and hospital j, and the second preset formula is:
Figure BDA0002658318220000072
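The second preset formula is also available only as a figure. The sketch below assumes the mapping d(i, j) = sqrt(2 * (1 - p_ij)), which is commonly paired with correlation coefficients but is not confirmed by the source; any decreasing function of the coefficient could be substituted. The helper names and input values are hypothetical.

```python
from math import sqrt


def correlation_to_distance(p):
    """Assumed mapping, not taken from the source figure: d = sqrt(2 * (1 - p))."""
    # Clamp at zero so rounding slightly above p = 1 cannot produce a negative radicand.
    return sqrt(max(0.0, 2.0 * (1.0 - p)))


def distance_matrix(hospitals, coefficients):
    """Build a symmetric matrix with zeros on the diagonal from {(i, j): p_ij}."""
    index = {name: k for k, name in enumerate(hospitals)}
    n = len(hospitals)
    matrix = [[0.0] * n for _ in range(n)]
    for (i, j), p in coefficients.items():
        d = correlation_to_distance(p)
        matrix[index[i]][index[j]] = d
        matrix[index[j]][index[i]] = d
    return matrix


hospitals = ["A", "B", "C"]
coefficients = {("A", "B"): 1.0, ("A", "C"): -1.0, ("B", "C"): 0.0}
dist = distance_matrix(hospitals, coefficients)
```

Under this assumed mapping, perfectly correlated hospitals are at distance 0 and perfectly anti-correlated hospitals are at distance 2.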
204. Perform a pruning operation on the distance matrix to obtain the pruned distance matrix.
The server prunes the distance matrix to obtain the pruned distance matrix. Specifically, the server converts the distance matrix into an undirected graph; the server generates a minimum spanning tree by using a preset algorithm and the undirected graph; and the server prunes the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
From the distance matrix between hospitals, the undirected graph is obtained by taking the hospitals as nodes and the distances between the hospitals as the edge weights. Pruning and denoising are achieved by converting the undirected graph into the minimum spanning tree. In the pruned distance matrix, if two hospitals are connected by an edge of the minimum spanning tree, the distance between them is preserved; otherwise, the distance between them is set to a large value, for example a value greater than 500 such as 1000 or 2000, which is not limited herein.
It should be noted that in a given undirected graph G = (V, E), (u, v) denotes the edge connecting vertex u and vertex v (each vertex is a hospital, and each edge carries the distance between two hospitals), and w(u, v) denotes the weight of this edge. If T is a subset of E that connects all vertices, contains no cycle, and minimizes the total weight w(T), then T is the minimum spanning tree of G. A spanning tree of a connected graph with n nodes is a minimal connected subgraph of the original graph: it contains all n nodes of the original graph and has the fewest edges (n - 1) that keep the graph connected. The minimum spanning tree can be determined using the Kruskal algorithm or the Prim algorithm.
It should also be noted that, because actual data is usually noisy, the original distance matrix calculated from it is also noisy, which may interfere with the final analysis result. The distance matrix is essentially a graph. Pruning uses the idea of the minimum spanning tree: only the distances along the edges of the minimum spanning tree are retained, and all other distances are set to a large value. The noise in the pruned distance matrix is greatly reduced, which benefits the subsequent analysis.
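The pruning step can be sketched as follows, using Prim's algorithm on a dense matrix (the text allows either Kruskal or Prim). The function names, the sample matrix, and the default large value of 1000 are illustrative choices; the text only requires a large value.

```python
def minimum_spanning_tree_edges(dist):
    """Prim's algorithm on a symmetric distance matrix; returns edges as (u, v) with u < v."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge linking each node to the tree
    parent = [-1] * n
    if n:
        best[0] = 0.0  # start growing the tree from node 0
    edges = set()
    for _ in range(n):
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.add((min(u, parent[u]), max(u, parent[u])))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def prune(dist, large=1000.0):
    """Keep distances on minimum-spanning-tree edges; set every other off-diagonal entry to `large`."""
    edges = minimum_spanning_tree_edges(dist)
    n = len(dist)
    pruned = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                on_tree = (min(i, j), max(i, j)) in edges
                pruned[i][j] = dist[i][j] if on_tree else large
    return pruned


# Hypothetical distances between four hospitals.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 6],
    [3, 5, 6, 0],
]
tree_edges = minimum_spanning_tree_edges(dist)
pruned = prune(dist)
```

For n hospitals the tree keeps n - 1 distances, so here three of the six pairwise distances survive and the other three are replaced by the large value.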
205. Perform hierarchical clustering on the pruned distance matrix to generate a clustering tree.
The server performs hierarchical clustering on the pruned distance matrix to generate the clustering tree. Specifically, the server calls a preset matrix distance formula to calculate the distance of each data point in the pruned distance matrix to obtain a plurality of distances, where the preset matrix distance formula is
Figure BDA0002658318220000081
D denotes the distance between any two data points. The server merges the two closest data points according to the plurality of distances to obtain a plurality of data categories, where the data categories include data points and data combinations, and iteratively executes the hierarchical clustering process until the distance matrix is converted into a plurality of clusters, generating the clustering tree.
In this embodiment, clustering uses a bottom-up (agglomerative) merging method. The merging algorithm of hierarchical clustering computes the similarity between data points, combines the two most similar data points, and iterates this process repeatedly. Similarity is determined from the distance between the data points of each category and all other data points: the smaller the distance, the higher the similarity. The two closest data points or categories are combined, and the clustering tree is generated. Because the server has already obtained the pruned distance matrix, it can combine different hospitals according to the distances between them to generate the clustering tree. The distance used is the Euclidean distance D between data points; the smaller the Euclidean distance, the higher the similarity.
In the present application, data points are hospitals: one hospital is one data point, and a data combination is two combined data points. Suppose the pruned distance matrix in the embodiment of the present invention covers six hospitals A, B, C, D, E and F, that is, data points A, B, C, D, E and F. After data point B (hospital B) and data point C (hospital C) are combined, the category (B, C) is obtained. The resulting data categories are A, (B, C), D, E and F, and the distance matrix between the data categories is recalculated.
It is understood that the distance between two data points is calculated using the preset matrix distance formula. The distance between a data combination and another data point is calculated as follows: for example, the distance from the data combination (B, C) to the data point A is the mean of the distance from B to A and the distance from C to A, i.e.
Figure BDA0002658318220000082
The distance between two data combinations is calculated as follows: the distance from each data point in one combination to each data point in the other combination is calculated, and the mean of all these distances is taken as the distance between the two data combinations. This method is more computationally intensive, but its result is more reasonable than that of the first two methods. For example, the distance from the data combination (A, E) to the data combination (B, C) is
Figure BDA0002658318220000083
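The bottom-up merging described above, with the distance between combinations taken as the mean of all cross-pair distances, can be sketched as follows. This is an illustration rather than the patented implementation: it takes point-to-point distances directly from a matrix instead of the preset matrix distance formula, which is not reproduced in this text. The function names and the sample matrix are hypothetical.

```python
from itertools import combinations


def average_linkage(cluster_a, cluster_b, dist):
    """Mean of the distances between every member of one cluster and every member of the other."""
    total = sum(dist[i][j] for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def hierarchical_clustering(dist):
    """Repeatedly merge the two closest clusters; returns (cluster_a, cluster_b, distance) per merge."""
    clusters = [(k,) for k in range(len(dist))]  # every data point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], dist),
        )
        merges.append((a, b, average_linkage(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical distances between four hospitals, indexed 0 to 3.
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
merges = hierarchical_clustering(dist)
```

In this example hospitals 0 and 1 merge first at distance 1, hospitals 2 and 3 merge next at distance 2, and the two combinations merge last at (4 + 5 + 3 + 6) / 4 = 4.5. The ordered list of merges is one way to represent the clustering tree.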
206. Perform risk analysis according to the clustering tree to obtain a risk analysis result.
The server performs risk analysis according to the clustering tree to obtain the risk analysis result.
The clustering tree is obtained through hierarchical clustering, and each cluster (hierarchical structure) can represent the hierarchy of one class of hospitals. The clusters are compared with a preset rule: if third-class, second-class and first-class hospitals are each clustered together, there is no problem; if hospital A falls into a different level from its own, hospital A may be abnormal. In addition, after hierarchical clustering, the distance between the two closest hospitals is determined and denoted a, and the distance between the last two clusters to be merged is determined and denoted b. Then a and b are compared: if b > 3a, the hospitals in the last-merged cluster are determined to be abnormal, and the fewer hospitals the abnormal cluster contains, the higher the degree of suspicion.
The criterion for determining an abnormality may be set according to actual conditions; for example, the condition "b > 3a" may be replaced with "b > 4a" or "b > 2a", which is not limited herein.
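The b > 3a rule can be sketched as a small function over a merge history. The input format, a list of (cluster_a, cluster_b, distance) tuples in merge order, is a hypothetical representation of the clustering tree. Reporting the smaller of the last two clusters is an interpretation of the remark that fewer hospitals means higher suspicion, not something the text states explicitly.

```python
def flag_abnormal(merges, ratio=3.0):
    """Return the hospitals flagged by the b > ratio * a rule, or an empty list."""
    if not merges:
        return []
    a = merges[0][2]             # first merge: distance between the two closest hospitals
    left, right, b = merges[-1]  # last merge: the final two clusters and their distance
    if b > ratio * a:
        # Assumption: the smaller of the two last-merged clusters is the suspicious one.
        return list(min(left, right, key=len))
    return []


# Hypothetical merge history for four hospitals indexed 0 to 3.
merges = [
    ((0,), (1,), 1.0),
    ((0, 1), (2,), 1.2),
    ((0, 1, 2), (3,), 5.0),
]
flagged_default = flag_abnormal(merges)           # b = 5.0 exceeds 3 * a = 3.0
flagged_strict = flag_abnormal(merges, ratio=6)   # b = 5.0 does not exceed 6 * a = 6.0
```

The `ratio` parameter corresponds to the adjustable threshold mentioned above (2, 3 or 4 in the text's examples).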
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
The risk analysis method based on hierarchical clustering in the embodiment of the present invention has been described above. Referring to Fig. 3, the risk analysis device based on hierarchical clustering in the embodiment of the present invention is described below. One embodiment of the risk analysis device based on hierarchical clustering includes:
an obtaining module 301, configured to obtain initial data, where the initial data is used to indicate drug sales data of multiple hospitals, and the initial data is time-series data;
a calculating module 302, configured to calculate correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain multiple target correlation coefficients;
a generating module 303, configured to generate a distance matrix between multiple hospitals according to the multiple target correlation coefficients;
a clustering module 304, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, where the clustering tree includes multiple clusters;
an analysis module 305, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
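The way the five modules hand data to one another can be sketched as a simple pipeline. The class name, field names and stub stages below are hypothetical and only illustrate the order of the calls; each stage would be replaced by a real implementation of the corresponding module.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RiskAnalysisDevice:
    obtain: Callable[[], Any]        # obtaining module 301: returns the initial time-series data
    calculate: Callable[[Any], Any]  # calculating module 302: data -> correlation coefficients
    generate: Callable[[Any], Any]   # generating module 303: coefficients -> distance matrix
    cluster: Callable[[Any], Any]    # clustering module 304: distance matrix -> clustering tree
    analyze: Callable[[Any], Any]    # analysis module 305: clustering tree -> risk analysis result

    def run(self):
        initial_data = self.obtain()
        coefficients = self.calculate(initial_data)
        matrix = self.generate(coefficients)
        tree = self.cluster(matrix)
        return self.analyze(tree)


# Stub stages that only record the order in which they are called.
trace = []


def stage(name):
    def run(*args):
        trace.append(name)
        return name
    return run


device = RiskAnalysisDevice(stage("301"), stage("302"), stage("303"), stage("304"), stage("305"))
result = device.run()
```

Passing the stages in as callables keeps each module replaceable, for example to swap the pruning or clustering strategy without touching the others.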
Referring to fig. 4, another embodiment of the risk analysis device based on hierarchical clustering according to the embodiment of the present invention includes:
an obtaining module 301, configured to obtain initial data, where the initial data is used to indicate drug sales data of multiple hospitals, and the initial data is time-series data;
a calculating module 302, configured to calculate correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain multiple target correlation coefficients;
a generating module 303, configured to generate a distance matrix between multiple hospitals according to the multiple target correlation coefficients;
a clustering module 304, configured to perform pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, where the clustering tree includes multiple clusters;
an analysis module 305, configured to perform risk analysis according to the clustering tree to obtain a risk analysis result.
Optionally, the calculating module 302 includes:
a determination unit 3021, configured to respectively determine the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j;
an input unit 3022, configured to input the drug sales amount Y_i and the drug sales amount Y_j into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is
Figure BDA0002658318220000101
Here, Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and p_ij is the correlation coefficient of hospital i and hospital j;
a first calculating unit 3023, configured to calculate correlation coefficients between any other two hospitals to obtain a plurality of other correlation coefficients, where the any other two hospitals do not include hospital i and hospital j at the same time;
a first generating unit 3024 configured to generate a plurality of target correlation coefficients including the correlation coefficients of hospital i and hospital j and the plurality of other correlation coefficients.
Optionally, the generating module 303 includes:
a second calculating unit 3031, configured to calculate initial distances between any two different hospitals according to the multiple target correlation coefficients, so as to obtain multiple initial distances;
a second generating unit 3032, configured to generate a distance matrix based on the plurality of initial distances, where the distance matrix is used to indicate a distance between any two hospitals.
Optionally, the second calculating unit 3031 is specifically configured to:
calling a preset distance formula to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d(i, j) represents the distance between hospital i and hospital j, and the preset distance formula is as follows:
Figure BDA0002658318220000102
Optionally, the clustering module 304 includes:
a pruning unit 3041, configured to perform a pruning operation on the distance matrix to obtain a pruned distance matrix;
a clustering unit 3042, configured to perform hierarchical clustering on the pruned distance matrix, and generate a clustering tree.
Optionally, the pruning unit 3041 is specifically configured to:
converting the distance matrix into an undirected graph; generating a minimum spanning tree by using a preset algorithm and the undirected graph; and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
Optionally, the clustering unit 3042 is specifically configured to:
calling a preset matrix distance formula to calculate the distance of each data point in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is
Figure BDA0002658318220000103
D represents the distance between any two data points; and performing hierarchical clustering on two nearest data points in the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and performing the hierarchical clustering process in an iterative manner until the distance matrix is converted into a plurality of clusters to generate a clustering tree.
In the embodiment of the present invention, noise reduction and pruning are applied to the time-series data, which avoids the curse of dimensionality and enhances the reliability of the risk analysis result. The scheme can also be applied in the field of smart healthcare, thereby promoting the construction of smart cities.
Figs. 3 and 4 describe the risk analysis device based on hierarchical clustering in the embodiment of the present invention in detail from the perspective of modular functional entities. The risk analysis device based on hierarchical clustering in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 5 is a schematic structural diagram of a risk analysis device based on hierarchical clustering according to an embodiment of the present invention. The risk analysis device 500 based on hierarchical clustering may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 510 (e.g., one or more processors), a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage medium 530 may be transient or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations for the risk analysis device 500 based on hierarchical clustering. Further, the processor 510 may be configured to communicate with the storage medium 530 and to execute the series of instruction operations in the storage medium 530 on the risk analysis device 500 based on hierarchical clustering.
The hierarchical clustering-based risk analysis device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the hierarchical clustering based risk analysis device architecture shown in fig. 5 does not constitute a limitation of hierarchical clustering based risk analysis devices, and may include more or fewer components than shown, or combine certain components, or a different arrangement of components. The processor 510 may perform the functions of the obtaining module 301, the calculating module 302, the generating module 303, the clustering module 304 and the analyzing module 305 in the above embodiments.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the hierarchical clustering based risk analysis method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A risk analysis method based on hierarchical clustering is characterized by comprising the following steps:
acquiring initial data, wherein the initial data is used for indicating drug sales data of a plurality of hospitals, and the initial data is time sequence data;
calculating correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients;
generating a distance matrix among a plurality of hospitals according to the plurality of target correlation coefficients;
pruning and hierarchical clustering operations are carried out on the distance matrix to generate a clustering tree, wherein the clustering tree comprises a plurality of clusters;
and performing risk analysis according to the clustering tree to obtain a risk analysis result.
2. The risk analysis method based on hierarchical clustering according to claim 1, wherein the calculating a correlation coefficient between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients comprises:
respectively determining the drug sales amount Y_i of hospital i and the drug sales amount Y_j of hospital j;
inputting the drug sales amount Y_i and the drug sales amount Y_j into a preset similarity formula to generate the correlation coefficient of hospital i and hospital j, wherein the preset similarity formula is
Figure FDA0002658318210000011
wherein Y_i denotes the drug sales amount of hospital i, Y_j denotes the drug sales amount of hospital j, i and j are positive integers, <> denotes the mean value, and p_ij is the correlation coefficient of hospital i and hospital j;
calculating correlation coefficients between any other two hospitals to obtain a plurality of other correlation coefficients, wherein the any other two hospitals do not include hospital i and hospital j at the same time;
generating a plurality of target correlation coefficients including the correlation coefficients for Hospital i and Hospital j and the plurality of other correlation coefficients.
3. The hierarchical clustering-based risk analysis method according to claim 1, wherein the generating a distance matrix between hospitals according to the plurality of target correlation coefficients comprises:
calculating initial distances between any two different hospitals according to the target correlation coefficients to obtain a plurality of initial distances;
generating a distance matrix based on the plurality of initial distances, the distance matrix indicating a distance between any two hospitals.
4. The risk analysis method based on hierarchical clustering according to claim 3, wherein the calculating an initial distance between any two different hospitals according to the plurality of target correlation coefficients to obtain a plurality of initial distances comprises:
calling a preset distance formula to calculate the distance corresponding to each target correlation coefficient to obtain a plurality of initial distances, wherein d (i, j) represents the distance between hospital i and hospital j, and the preset distance formula is as follows:
Figure FDA0002658318210000021
5. The risk analysis method based on hierarchical clustering according to any one of claims 1-4, wherein the performing pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, the clustering tree comprising a plurality of clusters, comprises:
pruning the distance matrix to obtain a pruned distance matrix;
and performing hierarchical clustering on the pruned distance matrix to generate a clustering tree.
6. The risk analysis method based on hierarchical clustering according to claim 5, wherein the pruning operation on the distance matrix to obtain the pruned distance matrix comprises:
converting the distance matrix into an undirected graph;
generating a minimum spanning tree by using a preset algorithm and the undirected graph;
and pruning the distance matrix based on the minimum spanning tree to obtain the pruned distance matrix.
7. The risk analysis method based on hierarchical clustering according to claim 5, wherein the hierarchical clustering of the pruned distance matrix to generate a clustering tree comprises:
calling a preset matrix distance formula to calculate the distance of each data point in the pruned distance matrix to obtain a plurality of distances, wherein the preset matrix distance formula is
Figure FDA0002658318210000022
D represents the distance between any two data points;
and performing hierarchical clustering on two nearest data points in the plurality of distances to obtain a plurality of data categories, wherein the data categories comprise data points and data combinations, and performing the hierarchical clustering process in an iterative manner until the distance matrix is converted into a plurality of clusters to generate a clustering tree.
8. A risk analysis device based on hierarchical clustering is characterized by comprising:
the acquisition module is used for acquiring initial data, wherein the initial data is used for indicating the drug sales data of a plurality of hospitals, and the initial data is time sequence data;
the calculation module is used for calculating correlation coefficients between any two different hospitals according to a preset similarity formula and the initial data to obtain a plurality of target correlation coefficients;
the generating module is used for generating a distance matrix among a plurality of hospitals according to the plurality of target correlation coefficients;
the clustering module is used for pruning and hierarchical clustering operations on the distance matrix to generate a clustering tree, and the clustering tree comprises a plurality of clusters;
and the analysis module is used for carrying out risk analysis according to the clustering tree to obtain a risk analysis result.
9. A hierarchical clustering-based risk analysis device, characterized in that the hierarchical clustering-based risk analysis device comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the hierarchical cluster-based risk analysis device to perform the hierarchical cluster-based risk analysis method of any one of claims 1-7.
10. A computer-readable storage medium storing instructions that, when executed by a processor, implement a hierarchical clustering based risk analysis method according to any one of claims 1 to 7.
CN202010895439.0A 2020-08-31 2020-08-31 Risk analysis method, device and equipment based on hierarchical clustering and storage medium Pending CN111985837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010895439.0A CN111985837A (en) 2020-08-31 2020-08-31 Risk analysis method, device and equipment based on hierarchical clustering and storage medium

Publications (1)

Publication Number Publication Date
CN111985837A true CN111985837A (en) 2020-11-24

Family

ID=73440480

Country Status (1)

Country Link
CN (1) CN111985837A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination