CN116628538A - Patient clustering method and device based on graph alignment neural network and computer equipment - Google Patents


Info

Publication number
CN116628538A
Authority
CN
China
Prior art keywords: matrix, graph, alignment, neural network, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310924938.1A
Other languages
Chinese (zh)
Inventor
刘哲
宋琳璇
马川
方黎明
葛春鹏
涂文轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202310924938.1A priority Critical patent/CN116628538A/en
Publication of CN116628538A publication Critical patent/CN116628538A/en
Pending legal-status Critical Current


Classifications

    • G06F18/2323 Non-hierarchical clustering techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G06F18/2155 Generating training patterns; bootstrap methods characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naive labelling
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus false rejection rate
    • G06N3/042 Knowledge-based neural networks; logical representations of neural networks
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/048 Activation functions
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G16H10/60 ICT specially adapted for the handling or processing of patient-specific medical or healthcare data, e.g. electronic patient records
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Discrete Mathematics (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application relates to a patient clustering method, apparatus and computer device based on a graph alignment neural network, in the technical field of neural networks. The method comprises the following steps: acquiring medical graph data corresponding to patients; generating a feature matrix and a graph adjacency matrix from the medical graph data; and inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a cluster allocation matrix corresponding to the patients. The graph alignment neural network includes at least one of a feature alignment rule, a class-center alignment rule, and a minimum-entropy alignment rule. Clustering patients with the graph alignment neural network reduces the workload of doctors, improves clustering efficiency, and improves clustering accuracy. At the same time, the network parameters are adjusted during training through the feature alignment rule, class-center alignment rule and minimum-entropy alignment rule, further improving the accuracy of patient classification.

Description

Patient clustering method and device based on graph alignment neural network and computer equipment
Technical Field
The present application relates to the field of neural networks, and in particular, to a method, an apparatus, and a computer device for clustering patients based on graph alignment neural networks.
Background
With the development of information digitization, enterprises, public institutions, hospitals and schools have largely digitized their information. Taking a hospital as an example, from the moment a patient enters the hospital, steps such as online registration, online record filing, consultation and hospitalization are all digitally processed, and the data of every patient are stored. As the number of hospital patients grows, so does the volume of digitized patient information. To archive and retrieve patient information more effectively, it needs to be classified before being stored.
In conventional practice, a doctor is usually required to assign categories based on patient information and then store the information accordingly. This increases the doctor's workload, and when a large amount of patient information must be classified at once, it is time-consuming and of low accuracy.
Disclosure of Invention
Based on the foregoing, it is necessary to provide a patient clustering method, device and computer equipment based on a graph alignment neural network.
In a first aspect, the present application provides a patient clustering method based on a graph alignment neural network, the method comprising: acquiring medical graph data corresponding to patients; generating a feature matrix and a graph adjacency matrix from the medical graph data; and inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a cluster allocation matrix corresponding to the patients. The graph alignment neural network includes at least one of a feature alignment rule, a class-center alignment rule, and a minimum-entropy alignment rule.
In one embodiment, the medical graph data comprises graph nodes, node features, and relationships among the graph nodes, and generating the feature matrix and the graph adjacency matrix from the medical graph data includes: generating the feature matrix from the node features in the medical graph data; and generating the graph adjacency matrix from the graph nodes and the relationships among them.
In one embodiment, inputting the feature matrix and the graph adjacency matrix into the graph alignment neural network to generate the cluster allocation matrix corresponding to the patients includes: regularizing the feature matrix to obtain a regularized feature matrix; multiplying the regularized feature matrix by the graph adjacency matrix and feeding the product into the graph alignment neural network, which is a multi-layer network in which each layer contains a multi-layer perceptron; performing iterative training layer by layer through the graph alignment neural network and outputting the clustering probability of each layer; and computing the cluster allocation matrix corresponding to the patients from the per-layer clustering probabilities.
In one embodiment, performing sequential iterative training through the graph alignment neural network and outputting the clustering probability of each layer includes, for each layer of the network: the first layer of the multi-layer perceptron performs feature extraction on the input data to obtain a correlation matrix; the second layer of the multi-layer perceptron performs probability prediction from the correlation matrix and outputs a clustering probability; and the loss of the multi-layer perceptron outputs is computed through at least one of the feature alignment rule, the class-center alignment rule and the minimum-entropy alignment rule, and the parameters of the graph alignment neural network are updated accordingly.
In one embodiment, the feature extraction performed by the first layer of the multi-layer perceptron on the input data includes: computing a representation matrix from the regularized feature matrix; regularizing the representation matrix to obtain a regularized representation matrix; and computing a feature correlation matrix and a representation correlation matrix from the regularized feature matrix and the regularized representation matrix.
In one embodiment, the probability prediction performed by the second layer of the multi-layer perceptron from the correlation matrix, with output of the clustering probability, includes: reshaping the representation matrix to obtain a reshaped representation matrix; and outputting the clustering probability through the representation matrix and the reshaped representation matrix.
In one embodiment, computing the loss of the multi-layer perceptron outputs through at least one of the feature alignment rule, the class-center alignment rule and the minimum-entropy alignment rule, and updating the parameters of the graph alignment neural network, includes: based on the feature alignment rule, performing feature alignment between the feature correlation matrix and the representation correlation matrix, the resulting divergence value serving as a first loss parameter; based on the class-center alignment rule, embedding the average cluster-center representation of the medical graph data into the representation matrix to obtain a cluster-center representation matrix, computing a correlation matrix from it, and performing feature alignment between this correlation matrix and the identity matrix, the resulting divergence value serving as a second loss parameter; based on the minimum-entropy alignment rule, sharpening the clustering probability to generate a sharpened probability, and generating a third loss parameter from the clustering probability and the sharpened probability; and updating the parameters of the graph alignment neural network according to at least one of the first, second and third loss parameters.
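A rough numerical sketch of the three loss parameters described above. The patent only names "divergence values"; the squared-Frobenius stand-in used for the first two divergences, the KL form of the third, and the sharpening exponent 1/T are all assumptions, and the function names are illustrative:

```python
import numpy as np

def row_normalize(M, eps=1e-12):
    """Normalize each row to unit L2 norm before taking inner products."""
    return M / (np.linalg.norm(M, axis=1, keepdims=True) + eps)

def feature_alignment_loss(X, H):
    """First loss: divergence between the inner products (correlation
    matrices) of the normalized attribute matrix X and representation
    matrix H; squared Frobenius distance is a stand-in here."""
    Sx = row_normalize(X) @ row_normalize(X).T
    Sh = row_normalize(H) @ row_normalize(H).T
    return float(np.mean((Sx - Sh) ** 2))

def center_alignment_loss(C):
    """Second loss: align the cluster-center correlation matrix with the
    identity, pushing distinct cluster centers toward orthogonality."""
    S = row_normalize(C) @ row_normalize(C).T
    return float(np.mean((S - np.eye(C.shape[0])) ** 2))

def entropy_alignment_loss(P, T=0.5, eps=1e-12):
    """Third loss: divergence between the predicted clustering probability
    matrix P and its temperature-sharpened version (exponent 1/T assumed)."""
    S = P ** (1.0 / T)
    S = S / S.sum(axis=1, keepdims=True)
    P = P + eps
    return float(np.sum(P * np.log(P / (S + eps))))
```

Each of the three values, or any subset, can then be back-propagated to update the network parameters, matching the "at least one of" phrasing above.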
In a second aspect, the present application also provides a patient clustering device based on a graph alignment neural network, the device comprising: the acquisition module is used for acquiring medical chart data corresponding to the patient; the preprocessing module is used for generating a feature matrix and a graph adjacency matrix according to the medical graph data; the neural network module is used for inputting the characteristic matrix and the graph adjacent matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient; the graph alignment neural network includes: at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program implements the graph-alignment neural network based patient clustering method as described in the first aspect above.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a patient clustering method based on a graph alignment neural network as described in the first aspect above.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the patient clustering method based on a graph alignment neural network as described in the first aspect above.
According to the patient clustering method, apparatus and computer device based on a graph alignment neural network, the medical graph data corresponding to the patients is acquired, the corresponding feature matrix and graph adjacency matrix are generated from the medical graph data, and the feature matrix and graph adjacency matrix are input into the graph alignment neural network, which classifies the patients and generates the corresponding cluster allocation matrix. The graph alignment neural network includes feature alignment rules, class-center alignment rules, and minimum-entropy alignment rules. Clustering patients with the graph alignment neural network reduces doctors' workload, improves clustering efficiency, and improves clustering accuracy. At the same time, through the feature alignment rule, class-center alignment rule and minimum-entropy alignment rule, the parameters of the graph alignment neural network can be adjusted and optimized during training, further improving the accuracy of patient classification.
Drawings
FIG. 1 is a diagram of an application environment for a patient clustering method based on a graph alignment neural network in one embodiment;
FIG. 2 is a flow diagram of a method of patient clustering based on graph alignment neural networks in one embodiment;
FIG. 3 is a flow chart of a training method of the graph alignment neural network in one embodiment;
FIG. 4 is a flow chart of a training method of the graph alignment neural network in another embodiment;
FIG. 5 is a block diagram of a patient clustering device based on a graph alignment neural network in one embodiment;
fig. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The patient clustering method based on the graph alignment neural network provided by the embodiment of the application can be applied to an application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. For example, a patient or a doctor uploads patient information through a terminal, a server receives the patient information and stores the patient information into a data storage system, and when a patient clustering method based on a graph alignment neural network needs to be executed, the server acquires corresponding data to execute a corresponding method. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
A privacy computing platform can be built on the server; it represents data as a graph structure and then processes the data through graph computation. This graph-based processing protects the privacy of patient data while improving the efficiency and accuracy of data processing. To effectively analyze the graph data formed from a large amount of patient information, a graph alignment neural network is also built on the server. Graph neural networks attract attention because of their strong graph computing power; however, when labeled data is limited, a conventional graph neural network generally cannot make full use of the graph information, leading to the over-smoothing problem. The application therefore proposes a graph alignment neural network to solve this problem.
The privacy computing platform is an important data processing tool that can effectively compute on and analyze data while protecting its privacy. Because data volume grows rapidly and data labels are expensive to acquire, semi-supervised learning has become an important research topic. Semi-supervised learning aims to extract the effective information in the data when labels are scarce, so that it can be applied accurately to downstream tasks. Graph semi-supervised tasks aim to construct an efficient structure that fully mines the hidden information in graph data. However, lacking high-quality labels for guidance, models tend to over-smooth when mining graph data. For example, the widely used graph convolutional neural network obtains the effective information present in features by applying Laplacian smoothing to them, and the resulting graph node representations are applied to downstream tasks such as node classification and clustering. When more than two graph convolution layers are used, however, node information from different categories may be fused into the representation of the current node, making nodes difficult to distinguish in downstream tasks; that is, over-smoothing occurs.
A graph neural network is a deep learning model based on a graph structure. It has strong graph computing power, and through its computation the characteristics of graph data can be mined more thoroughly, improving the model's accuracy and generalization ability. In practice, however, only a small portion of the data is labeled and the majority is unlabeled. Traditional supervised learning methods, such as the support vector machine (SVM) and logistic regression (LR), can only be trained on labeled data and therefore perform poorly when labeled data is limited. In contrast, a graph neural network can also be trained with unlabeled data, improving classification performance. Its basic idea is to combine the information of a node and its neighboring nodes to generate the node's representation vector. Specifically, the graph neural network learns the node classification task by iteratively updating the node representation vectors; in each iteration, each node's representation vector is a function of its own information and that of its neighbors.
However, with limited labeled data, most graph neural network methods cannot make full use of the available graph information, leading to over-smoothing. When labels are scarce, the node representation vectors may become too similar for the classifier to accurately judge the differences between them, degrading classification performance. In graph neural networks, the smoothness problem is a common challenge, mainly because a node's feature representation changes as the number and structure of its neighbors change. Solutions typically introduce regularization terms or constraints in the node embedding or in the graph convolution layers to ensure smoothness between the embeddings or feature representations of neighboring nodes.
In addition to the graph convolutional network (GCN), other graph neural network models, such as GraphSAGE and GAT, also suffer from the smoothing problem.
Therefore, a deep graph neural network framework is needed that can solve the smoothing problem and fully mine the structure and feature information of graph data. The graph alignment neural network of this framework can fully mine the essential information of the graph, improve the performance of graph node representations in downstream tasks, and solve the over-smoothing problem.
In one embodiment, as shown in fig. 2, a patient clustering method based on a graph alignment neural network is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps:
step S202, medical chart data corresponding to a patient are acquired.
Specifically, the medical graph data is generated by the privacy computing platform based on the patients' information and stored in the data storage system; when the method of the application is executed, the corresponding medical graph data is retrieved from it. The medical graph data comprises graph nodes, node features, and relationships between graph nodes. For example, graph nodes represent patients, node features represent patient information, and a relationship between nodes represents, for instance, two patients diagnosed by the same doctor at the same hospital.
Step S204, generating a feature matrix and a graph adjacent matrix according to the medical graph data.
Specifically, after the medical graph data is acquired, the feature matrix X and the graph adjacency matrix A are generated from it. The feature matrix represents each patient's information; the graph adjacency matrix represents the relationship between every pair of patients. For example, if the medical graph data contains n patients, the corresponding graph adjacency matrix is an n×n matrix in which each element represents the relationship between a pair of patients: 1 if a relationship exists and 0 otherwise.
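As a minimal sketch of the two matrices just described (function and variable names are illustrative, not from the patent; symmetry of the relationships is assumed), the construction might look like:

```python
import numpy as np

def build_matrices(node_features, edges, n):
    """Assemble the feature matrix X and graph adjacency matrix A.

    node_features: one feature vector per patient (graph node)
    edges: (i, j) pairs, one per patient-patient relationship
    n: number of patients
    """
    X = np.asarray(node_features, dtype=float)  # one row per patient
    A = np.zeros((n, n))                        # n x n relationship matrix
    for i, j in edges:
        A[i, j] = 1.0   # relationship present -> 1 (absent entries stay 0)
        A[j, i] = 1.0   # patient relationships are taken as symmetric here
    return X, A
```

Whether self-loops are added or the matrix is symmetrized is not specified in the patent; both choices here are assumptions.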
And S206, inputting the feature matrix and the graph adjacent matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient.
Specifically, the graph alignment neural network includes at least one of a feature alignment rule, a class-center alignment rule, and a minimum-entropy alignment rule. It may contain a single alignment rule (only the feature alignment rule, only the class-center alignment rule, or only the minimum-entropy alignment rule), any combination of two of them, or all three at once. After the feature matrix X is obtained, it is first regularized to obtain a regularized feature matrix X̃; this regularization mitigates over-smoothing of the network's output. The regularized feature matrix X̃ is then multiplied by the graph adjacency matrix A, and the product serves as the input of the graph alignment neural network. Finally, the network outputs a cluster allocation matrix, which represents the patient categories: patients suffering from the same disease are clustered into the same cluster, thereby achieving patient classification. During training, the loss parameters are computed through at least one of the feature alignment rule, class-center alignment rule and minimum-entropy alignment rule, and the network parameters are then optimized by back-propagating these losses, making patient classification more accurate. Classifying patients with the graph alignment neural network also improves classification efficiency and reduces doctors' workload.
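The preprocessing just described can be sketched as follows. The choice of row-wise L2 normalization and the multiplication order A·X̃ are assumptions; the patent only states that the regularized features are multiplied by the adjacency matrix:

```python
import numpy as np

def regularize_features(X, eps=1e-12):
    """Row-wise L2 normalization of the feature matrix X, producing the
    regularized matrix (one plausible form of the regularization step;
    the exact norm is not fixed by the patent)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / (norms + eps)

def network_input(X, A):
    """Multiply the regularized feature matrix by the graph adjacency
    matrix A; the product is the input of the graph alignment network."""
    return A @ regularize_features(X)
```

With A · X̃, each row of the input aggregates the normalized features of a patient's related patients, which is the usual reading of this kind of product.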
In this embodiment of the application, the medical graph data corresponding to the patients is acquired, the corresponding feature matrix and graph adjacency matrix are generated, and finally the feature matrix and graph adjacency matrix are input into the graph alignment neural network, which classifies the patients and generates the corresponding cluster allocation matrix. The graph alignment neural network includes feature alignment rules, class-center alignment rules, and minimum-entropy alignment rules. Clustering patients with the graph alignment neural network reduces doctors' workload, improves clustering efficiency, and improves clustering accuracy. At the same time, through the three alignment rules, the parameters of the graph alignment neural network can be adjusted and optimized during training, further improving the accuracy of patient classification.
In this embodiment, the graph alignment neural network adopts three distinctive alignment rules to thoroughly exploit the information hidden in medical graph data when labels are insufficient. First, to better exploit the specifics of the attributes, a feature alignment rule is proposed: the inner product of the attribute matrix is aligned with that of the embedding matrix. This helps ensure that the attributes match the structure of the graph, yielding a more accurate embedded representation. Second, to properly utilize higher-order neighbor information, a cluster-center alignment rule is proposed: the inner product of the cluster-center matrix is aligned with the identity matrix. This helps maintain the relationships between higher-order neighbors and extracts richer graph information. Finally, to obtain reliable predictions with few labels, a minimum-entropy alignment rule is established, which aligns the prediction probability matrix with its sharpened version. This helps maximize classification accuracy, especially when labeled data is scarce.
In one embodiment, generating the feature matrix and the graph adjacency matrix from the medical graph data comprises: generating a feature matrix according to the node features in the medical chart data; and generating a graph adjacency matrix according to the graph nodes and the relationships among the graph nodes in the medical graph data.
Specifically, the medical graph data includes graph nodes, node features, and relationships between the graph nodes. A feature matrix for all patients is generated from the node features, i.e., the patient information; a graph adjacency matrix for all patients is generated from the graph nodes and the relationships between graph nodes, i.e., the patients and the patient-to-patient relationships.
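As a minimal sketch of this step, the following code builds a feature matrix and a symmetric 0-1 adjacency matrix from patient records and a relationship list. The data schema (a dict of per-patient features and an edge list) and all names are illustrative; the embodiment does not fix a concrete data format.

```python
import numpy as np

def build_matrices(patients, edges, feature_names):
    """Build the feature matrix X (n x d) from per-patient node features and
    the graph adjacency matrix A (n x n) from patient-to-patient relations."""
    ids = sorted(patients)
    index = {pid: k for k, pid in enumerate(ids)}
    n, d = len(ids), len(feature_names)
    X = np.zeros((n, d))
    for pid, feats in patients.items():
        X[index[pid]] = [feats[name] for name in feature_names]
    A = np.zeros((n, n))
    for i, j in edges:                      # A[i][j] = 1 iff patients i, j are related
        A[index[i], index[j]] = A[index[j], index[i]] = 1.0
    return X, A
```
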
In one embodiment, inputting the feature matrix and the graph adjacency matrix into the graph alignment neural network and generating the cluster allocation matrix corresponding to the patients includes: regularizing the feature matrix to obtain a regularized feature matrix; multiplying the regularized feature matrix by the graph adjacency matrix and then inputting the product into the graph alignment neural network, where the graph alignment neural network is a multi-layer network and each network layer comprises a multi-layer perceptron; sequentially performing iterative training through the graph alignment neural network and outputting the clustering probability corresponding to each network layer; and calculating the cluster allocation matrix corresponding to the patients according to the clustering probability of each network layer.
Specifically, the feature matrix X is first normalized to obtain a regularized feature matrix. The regularized feature matrix is then multiplied by the graph adjacency matrix A as the input of the graph alignment neural network. The graph adjacency matrix A is n×n, where n is the number of patients; A[i][j] = 1 if there is a certain relationship between patients i and j, otherwise A[i][j] = 0. The graph alignment neural network comprises multiple network layers, each containing a multi-layer perceptron (MLP): within each layer, one MLP layer performs feature extraction and another MLP layer performs probability calculation, and each layer outputs the clustering probabilities corresponding to all patients. The per-layer clustering probabilities are averaged to obtain the final output of the graph alignment neural network, i.e., the cluster allocation matrix.
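The input preparation described above can be sketched as follows. L1 row normalization and the multiplication order A @ X_hat (the only conformable order for an n×n adjacency matrix and an n×d feature matrix, which aggregates each patient's neighbor features) are assumptions; the embodiment only states that the regularized feature matrix is multiplied by A.

```python
import numpy as np

def gann_input(X, A):
    """Row-normalize the feature matrix X and multiply by the adjacency
    matrix A to form the graph alignment neural network input."""
    X_hat = X / np.maximum(X.sum(axis=1, keepdims=True), 1e-12)
    return X_hat, A @ X_hat  # (regularized features, aggregated input), both n x d
```
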
For example, the graph alignment neural network is trained using sequential iterations, with one MLP layer for feature extraction and one MLP layer with a normalized exponential function (softmax) for probability calculation. The parameters in each network layer of the graph alignment neural network model are shared. In addition, the model parameters are optimized through at least one of the feature alignment rule, the class center alignment rule, and the minimum entropy alignment rule. Training the graph alignment neural network provides reliable node representations with a small amount of labeled data. The graph alignment neural network model receives the product of the regularized feature matrix and the graph adjacency matrix A as input, performs iterative training over multiple layers, and then averages the clustering probabilities output by each layer to obtain the final output of the model.
The mathematical formulation of the graph alignment neural network model is defined layer by layer: f1 denotes the nonlinear function part of the layer equation (feature extraction), and f2 denotes the function part that maps the extracted representation to clustering probabilities. l is the index of the current network layer, ranging from 1 to L. The parameters of each network layer in the model are shared and updated sequentially: the optimal learning weight matrix of layer l initializes the optimal learning weight matrix of layer l+1, i.e., the best-trained weight coefficients of the previous layer initialize the coefficients of the next layer. Following the weighted-sum strategy of the adaptive graph convolutional neural network model (AdaGCN), the final loss function can be written as

L = L_CE + β · L_CCA + γ · L_ME

where L_CCA is the loss function of the cluster center alignment rule, L_CE is the cross-entropy loss function, and L_ME is the minimum entropy alignment loss function. In addition, β and γ are learnable parameters used to coordinate all the objectives.
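A minimal sketch of this weighted-sum objective, assuming the cross-entropy, cluster-center-alignment, and minimum-entropy terms have already been computed as scalars; beta and gamma are described as learnable in the model but are plain coefficients here.

```python
def overall_loss(l_ce, l_cca, l_me, beta=1.0, gamma=1.0):
    """Weighted sum of the three objectives, in the AdaGCN-style
    weighted-sum strategy described above."""
    return l_ce + beta * l_cca + gamma * l_me
```
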
In one embodiment, the training process of the graph alignment neural network model is as follows:
input: regularized graph adjacency matrixRegularized feature matrixModel layer number L, modelSum step sizep。
And (3) outputting: the distribution matrix Z is clustered.
Step 1: initializing:calculating a feature correlation matrix Initial iteration s=0.
Step 2: for the followinglFrom 0 to L.
Step 3: when s < p.
Step 4: acquisition node embedding:
step 5: by the formulaObtaining an embedded correlation matrix->
Step 6: by the formulaClustering centers for training nodes in a calculation equationAnd pass through the formulaIts corresponding inner product is calculated, where i represents the current cluster and m represents the number of samples in cluster i.
Step 7: calculating a probability prediction matrix:
step 8: by minimizing the equationLoss in (a)To update the model parameters.
Step 9: if it isThens+ = 1
Step 10: return to
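The outer layer loop of the algorithm above can be sketched as a forward-only pass: at each layer the input is smoothed once more by A, pushed through the shared feature-extraction MLP and the probability MLP, and the per-layer cluster probabilities are averaged into the cluster allocation matrix Z. The tanh activation and shared weight matrices W1, W2 are assumptions; the loss-driven weight updates of steps 6-8 are omitted.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward_gann(X_hat, A, W1, W2, L=3):
    """Average the per-layer clustering probabilities into Z (n x C)."""
    H, probs = X_hat, []
    for _ in range(L):
        H = A @ H                      # one more hop of neighbor smoothing
        E = np.tanh(H @ W1)            # MLP layer 1: node embedding
        probs.append(softmax(E @ W2))  # MLP layer 2 + softmax: cluster probabilities
    return sum(probs) / L              # cluster allocation matrix Z
```
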
In one embodiment, as shown in fig. 3, a training method of a graph alignment neural network is provided, including the following steps:
step S302: and the first layer of the multi-layer perceptron performs feature extraction on the input data to obtain a correlation matrix.
Specifically, each network layer of the graph alignment neural network contains a multi-layer perceptron (MLP) with a two-layer structure. The first MLP layer performs feature extraction on the input data, where the input data is the data fed to the graph alignment neural network, i.e., the product of the regularized feature matrix and the graph adjacency matrix A. During feature extraction, a characterization matrix is calculated from the regularized feature matrix: the first MLP layer processes the regularized feature matrix to obtain the characterization matrix, which is then regularized to obtain the regularized characterization matrix E. Finally, the feature correlation matrix and the characterization correlation matrix are calculated from the regularized feature matrix and the regularized characterization matrix, respectively.
Step S304: and the second layer of the multi-layer perceptron carries out probability prediction according to the correlation matrix and outputs clustering probability.
Specifically, the second MLP layer includes a normalized exponential function (softmax) layer. During probability calculation, the characterization matrix is first deformed to obtain a deformed characterization matrix; the clustering probabilities are then output from the characterization matrix and the deformed characterization matrix. For example, the second MLP layer with softmax computes the prediction probability matrices from the characterization matrix and the deformed characterization matrix, i.e., the clustering probabilities.
Step S306: and calculating the loss of the output parameters of the multi-layer perceptron through at least one of a characteristic alignment rule, a class center alignment rule and a minimum entropy alignment rule, and updating parameters of the graph alignment neural network.
Specifically, during model training, the feature alignment rule uses the KL divergence to measure the difference between the feature correlation matrix F and the characterization correlation matrix, performing feature alignment and generating a first loss parameter. The class center alignment rule uses the KL divergence to measure the difference between the correlation matrix and the identity matrix, performing feature alignment and generating a second loss parameter. The minimum entropy alignment rule aligns the prediction probability matrix with its sharpened result, generating a third loss parameter. The parameters of the graph alignment neural network are then updated with the loss parameters generated by at least one alignment rule. For example, only one loss parameter may be used to update the parameters of the graph alignment neural network: only the first, only the second, or only the third. Any combination of two loss parameters may also be used, or all three loss parameters may be used simultaneously.
In this embodiment, the losses calculated by the three alignment rules in the graph alignment neural network model are added to the loss between the model's prediction and the ground truth, and the model parameters are updated by back-propagating the loss function. The feature alignment rule and the minimum entropy alignment rule not only improve the extraction of robust representations by the graph alignment neural network model, but also help extract more accurate cluster centers for subsequent calculation; by weighting the losses of the three alignment rules and then optimizing the model through back propagation, the classification of patients by the graph alignment neural network becomes more accurate.
According to the embodiment of the application, the problems in the traditional technology are solved by providing the graph alignment neural network model. Firstly, in order to fully utilize node characteristic data in medical chart data, characteristic alignment rules are provided based on the concept of a chart automatic encoder, and the content of node characterization is enriched by aligning a characteristic correlation matrix with a chart node characterization correlation matrix. Regularizing an original feature matrix, calculating a corresponding similarity matrix, selecting the first k samples in a top-k mode, marking the corresponding similarity as 1, and marking the rest as 0, thereby obtaining a feature correlation matrix. In addition, the graph node representation can be obtained through the multi-layer perceptron, the inner product of the graph node representation is calculated to obtain a graph node representation correlation matrix, and the graph node representation correlation matrix and the characteristic correlation matrix are subjected to cross entropy loss.
Secondly, in order to fully utilize the higher-order neighbor information in the graph and ensure the correctness of the connection, we propose a category center alignment rule. The inner product of the class center characterization matrix is computed and aligned with the identity matrix. In the semi-supervision task, a small number of labeled samples exist, the corresponding node characterization is used for calculating an average value to serve as the center characterization of the current category, and the category center characterization matrix is close to the identity matrix. By minimizing intra-class distances and eliminating inter-class noise, we can avoid introducing excessive incorrect neighbor noise to node characterization when the model obtains higher-order neighbor information.
Finally, we propose a minimum entropy alignment rule in order to be able to get more accurate class prediction results in semi-supervised tasks. The entropy of the prediction result of the model is made lower by aligning the prediction probability matrix with the result of its sharpening.
In one embodiment, the calculating the loss of the output parameters of the multi-layer perceptron through at least one of the feature alignment rule, the class center alignment rule and the minimum entropy alignment rule, and updating the parameters of the graph alignment neural network includes: based on a feature alignment rule, performing feature alignment on the feature correlation matrix and the characterization correlation matrix to generate a divergence value as a first loss parameter; embedding an average representation cluster center of the medical chart data into a representation matrix based on a class center alignment rule to obtain a cluster center representation matrix; calculating a correlation matrix according to the cluster center characterization matrix; performing feature alignment on the correlation matrix and the identity matrix to generate a divergence value as a second loss parameter; sharpening the clustering probability based on a minimum entropy alignment rule to generate a sharpening probability; generating a third loss parameter according to the clustering probability and the sharpening probability; and updating parameters of the graph alignment neural network according to at least one of the first loss parameter, the second loss parameter and the third loss parameter.
Specifically, based on a feature alignment rule, feature alignment is performed on the feature correlation matrix and the characterization correlation matrix, and a divergence value is generated as a first loss parameter. Namely, the difference between the two correlation matrixes is measured by using the KL divergence, characteristic alignment is carried out, and the KL divergence value is used as a first loss parameter.
Illustratively, the graph alignment neural network model is a multi-layer network structure. Starting from the first layer, the initial input is the regularized feature matrix multiplied by the graph adjacency matrix A; after smoothing, the input of the second layer is the regularized feature matrix of the second layer multiplied by the graph adjacency matrix A, and so on for each subsequent layer. A Dropout layer is added after the input data to increase the robustness of model training. A linear layer is then used to generate the hidden node representation, i.e., feature extraction is performed with one MLP layer. When the dimension of the linear layer is large enough, even a purely linear layer can achieve a good representation effect. For the activation function, a nonlinear activation function is used, specifically expressed as follows:
where the first-layer input is the product of the regularized feature matrix and the adjacency matrix, and the smoothed adjacency matrix products form the inputs of the subsequent layers. l indicates the layer at which the current model is located, l = 1, ..., L, and the weight matrices are trainable parameters. The hidden-layer representation trained at the current layer is normalized to generate the final node embedding; relu and sigmoid denote the relu and sigmoid activation functions respectively, and a minimum value is added for numerical stability. The model shares weights among all layers; their iterative updates are sequential, with the best-trained weight coefficients of the upper layer used to initialize the weight coefficients of the next layer. The obtained hidden-layer vector is normalized to obtain the final node representation vector.
The KL divergence value is calculated by taking the inner-product matrix of the original feature matrix as the real feature label of the input, and the corresponding inner-product matrix of the hidden-layer vectors as the learned distribution, so as to measure the difference between the current hidden node representation distribution and the true feature distribution. Since the value of the real feature distribution label is fixed, its entropy is a constant, and the corresponding loss can therefore be expressed as a cross-entropy loss function. In this embodiment, the node feature matrix is used to calculate the corresponding similarity matrix, with cosine similarity as the calculation standard, as follows:
S_ij = (x_i · x_j) / (||x_i||_2 ||x_j||_2)

where · denotes the dot product and ||·||_2 is the L2 norm. S_ij is an element of the similarity matrix S, representing the similarity of node i and node j in terms of their node features. Because the underlying graph structure is sparse, to construct the corresponding 0-1 feature similarity matrix, the entries of S below a certain threshold are set to 0. In addition, top-k(·) means that the k nodes with the highest similarity value to the i-th node are selected, and the rest are assigned 0.
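The cosine-similarity and top-k binarization just described can be sketched as follows; k and the threshold are hyperparameters, and keeping a node's own entry in its top-k set is an implementation assumption.

```python
import numpy as np

def feature_correlation(X, k=2, threshold=0.0):
    """Cosine-similarity matrix of the node features, thresholded and
    binarized so that each row keeps a 1 only for its k most similar nodes."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    S = (X @ X.T) / np.maximum(norms @ norms.T, 1e-12)  # S_ij = cos(x_i, x_j)
    S = np.where(S < threshold, 0.0, S)                  # zero out weak similarities
    F = np.zeros_like(S)
    for i in range(S.shape[0]):
        top = np.argsort(S[i])[-k:]                      # k nodes most similar to node i
        F[i, top] = (S[i, top] > 0).astype(float)
    return F
```
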
Specifically, based on the class center alignment rule, the average representations of the medical graph data are embedded at the cluster centers to obtain the cluster center characterization matrix; the correlation matrix is calculated from the cluster center characterization matrix; and the correlation matrix is feature-aligned with the identity matrix to generate a divergence value as the second loss parameter. That is, from the regularized characterization matrix E, the average representations are embedded at the cluster centers to obtain the cluster center characterization matrix, from which the correlation matrix is computed; the class center alignment rule then aligns this correlation matrix with the identity matrix, the KL divergence measures the difference between them, and the KL divergence value serves as the second loss parameter.
Illustratively, a small portion of the labeled sample is extracted for computation of the class center representation based on the hidden layer representation obtained above. Specifically, training set samples with known labels are used in the training process, and the samples are classified according to categories, so that calculation of the class center hidden layer representation is performed. For example, the data set has seven classes, each class has 20 samples with labels as training sets, and then the hidden layer representation of the class center of a certain class of the current data set is the average value of the hidden layer representations of the 20 samples. And after obtaining the class center hidden layer representation matrix, calculating an inner product matrix as the distribution condition of the current class center representation. And calculating the KL divergence value of the unit diagonal matrix and the current class center representation distribution to measure the quality of the current class center representation. The method comprises the following steps:
where i represents the current cluster and m represents the number of samples in cluster i, and the aligned quantity is the inner product of the class center characterization matrix. The identity matrix is used as the alignment target in the loss function because the representation of the current class center should be closer to itself and farther from the other class centers. The aim of finding the class centers from the samples with known labels is to minimize the unreliability of the training task in such semi-supervised settings by exploiting the known information: given this deterministic criterion for the model, the node representations obtained are more accurate.
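A sketch of this class-center alignment term under stated assumptions: the per-class centers are averaged from labelled embeddings and L2-normalized, their inner-product matrix is turned into row distributions with a softmax, and the KL divergence from a (smoothed) identity matrix is returned. The softmax and smoothing steps are assumptions to make both sides proper distributions; the embodiment only states that the inner product is aligned with the identity.

```python
import numpy as np

def class_center_loss(E, labels):
    """KL divergence between the identity matrix and the distribution of
    class-center inner products, computed from labelled embeddings E."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    centers = np.stack([E[labels == c].mean(axis=0) for c in classes])
    centers = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    G = centers @ centers.T                        # class-center inner products
    P = np.exp(G) / np.exp(G).sum(axis=1, keepdims=True)
    C = len(classes)
    Q = np.eye(C) * (1 - 1e-6) + 1e-6 / C          # smoothed identity target
    return float(np.sum(Q * np.log(Q / P)))        # KL(identity || P) >= 0
```
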
Specifically, based on the minimum entropy alignment rule, the clustering probability is sharpened to generate the sharpening probability, and a third loss parameter is generated from the clustering probability and the sharpening probability. That is, feature alignment is performed between the clustering probability and the sharpening probability, and the KL divergence value measuring the difference between them is used as the third loss parameter.
Illustratively, the first two alignment rules optimize the node representation by mining the graph attributes and structure data. However, creating an accurate prediction probability matrix with a small number of labels for semi-supervised tasks remains a significant challenge. To overcome this problem, the embodiment of the application proposes a minimum entropy alignment rule that aligns the prediction probability matrix with its sharpened result. Specifically, one MLP layer is first used to ensure that the dimensions of the learned embeddings and clusters are consistent. Softmax is then used to generate the prediction probability matrix, i.e., the cluster allocation matrix Z. Furthermore, the cluster relation probabilities of all samples are also calculated through the sharpening operation.
where node i belongs to a cluster c, ranging from 1 to C. temp is a hyperparameter (temperature) that controls the sharpness of the classification distribution, and relu denotes the relu activation function. The distance between the prediction probability matrix Z and the sharpening result is then reduced to optimize the network. The loss function of this rule may be expressed as:
where N represents the size of the cluster, the two terms inside the norm represent the true (sharpened) value and the predicted value respectively, and ||·|| is the L2 norm. Unlabeled examples are most beneficial when the clusters have little overlap. The result is optimized by reducing the entropy of the prediction, thereby increasing the confidence of the classification result.
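The sharpening operation and the rule's loss can be sketched as follows: each probability is raised to the power 1/temp and renormalized (so a lower temperature pushes each row toward one-hot), and the loss is the squared L2 distance between Z and its sharpened version divided by N; taking N as the number of rows of Z is an assumption of this sketch.

```python
import numpy as np

def sharpen(Z, temp=0.5):
    """Temperature sharpening: Z_ic^(1/temp), renormalized per row."""
    P = Z ** (1.0 / temp)
    return P / P.sum(axis=1, keepdims=True)

def min_entropy_loss(Z, temp=0.5):
    """Squared L2 distance between Z and sharpen(Z), averaged over rows."""
    S = sharpen(Z, temp)
    return float(np.sum((Z - S) ** 2) / Z.shape[0])
```
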
Specifically, the parameters of the graph alignment neural network are updated according to at least one of the first loss parameter, the second loss parameter and the third loss parameter. For example, the loss calculated by the three alignment rules is back-propagated through the loss function, updating the model parameters. The model parameter update may be performed using only the first loss parameter, the model parameter update may be performed using only the second loss parameter, or the model parameter update may be performed using only the third loss parameter; it can be appreciated that any two loss parameters can also be used to update the model parameters; the model parameters can also be updated using three loss parameters simultaneously.
The graph alignment neural network in the embodiment of the application can fully mine feature and structure information in semi-supervised tasks. Through the feature alignment rule, the class center alignment rule, and the minimum entropy alignment rule, it effectively utilizes the essential data of the graph and enhances the accuracy of the graph node representation from three angles (feature, structure, and prediction result), thereby avoiding the over-smoothing problem. Specifically, through the feature alignment rule, the graph alignment neural network can make full use of the input node feature information when training the node representation, fusing the node feature information relevant to downstream tasks into the node representation and thus enriching its meaning. The class center alignment rule addresses the collapse of node representations during high-layer model training: as the high-order adjacency matrix tends toward a fully connected state, the neighbor range of each node expands and the number of nodes included in each local structure increases, so the distinction between local structures becomes less obvious, the node representations all tend to be similar, the resolution between nodes drops, and the representations trained in higher layers can collapse. To remedy this defect of the node representations, a small number of nodes with known labels are chosen, the corresponding class center representations are calculated, and their distribution is aligned with the identity matrix, thereby solving the collapse problem.
In one embodiment, as shown in FIG. 4, a training method for a graph alignment neural network is provided.
In step S401, the graph adjacency matrix and the regularized feature matrix are prepared and multiplied together as input.
Step S402, instantiating a model, initializing corresponding parameters, setting super parameters, and ending the preparation stage.
Step S403, the training stage carries out model iterative training according to the layer number L in the super parameter, and the trained parameter is transferred between the layer numbers.
Step S404, forward propagation is carried out, the intermediate result H output by the model is used for calculating loss by using the three proposed alignment rules, and the loss of the model prediction result and the real result is calculated.
In step S405, the model parameters are updated by backward propagation through the loss function.
Step S406, after training, the final prediction probability matrix of all layers is averaged to be the final result.
According to the embodiment of the application, the effective information in the graph data can be fully mined through the graph alignment neural network, and reliable node representations can be generated with a small number of labels in semi-supervised tasks. In terms of feature utilization, the feature alignment rule effectively associates the feature information with the node characterization information; in terms of mining high-order neighbor information, the class center alignment rule uses a small number of labels to avoid the neighbor noise caused by a fully connected adjacency matrix; and in terms of node prediction accuracy in semi-supervised tasks, the minimum entropy alignment rule greatly reduces the entropy of the final prediction result and yields more accurate predictions.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily performed sequentially, but may be performed alternately or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides a patient clustering device based on the graph alignment neural network, which is used for realizing the patient clustering method based on the graph alignment neural network. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitations in the embodiments of the patient clustering device based on the graph alignment neural network provided below can be referred to above for the limitations of the patient clustering method based on the graph alignment neural network, which are not repeated here.
In one embodiment, as shown in fig. 5, there is provided a patient clustering device based on a graph alignment neural network, comprising: an acquisition module 100, a preprocessing module 200, and a neural network module 300, wherein:
the acquiring module 100 is configured to acquire medical chart data corresponding to a patient.
And the preprocessing module 200 is used for generating a feature matrix and a graph adjacency matrix according to the medical graph data.
The neural network module 300 is configured to input the feature matrix and the graph adjacency matrix into the graph alignment neural network to generate the cluster allocation matrix corresponding to the patients; the graph alignment neural network includes: at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
The preprocessing module 200 is further configured to generate a feature matrix according to the node features in the medical map data; and generating a graph adjacency matrix according to the graph nodes and the relationships among the graph nodes in the medical graph data.
The neural network module 300 is further configured to perform regularization processing on the feature matrix to obtain a regularized feature matrix; multiply the regularized feature matrix by the graph adjacency matrix and input the product into the graph alignment neural network, where the graph alignment neural network is a multi-layer network and each network layer comprises a multi-layer perceptron; sequentially perform iterative training through the graph alignment neural network and output the clustering probability corresponding to each network layer; and calculate the cluster allocation matrix corresponding to the patients according to the clustering probability of each network layer.
The neural network module 300 is further configured to, for each network layer: perform feature extraction on the input data with the first layer of the multi-layer perceptron to obtain a correlation matrix; perform probability prediction according to the correlation matrix with the second layer of the multi-layer perceptron and output the clustering probability; and calculate the loss of the output parameters of the multi-layer perceptron through at least one of the feature alignment rule, the class center alignment rule, and the minimum entropy alignment rule, and update the parameters of the graph alignment neural network.
The neural network module 300 is further configured to calculate a characterization matrix according to the regularized feature matrix; regularizing the characterization matrix to obtain a regularized characterization matrix; and calculating a characteristic correlation matrix and a characterization correlation matrix according to the regularized characteristic matrix and the regularized characterization matrix.
The neural network module 300 is further configured to perform deformation processing on the characterization matrix to obtain a deformed characterization matrix; and outputting the clustering probability through the characterization matrix and the deformed characterization matrix.
The neural network module 300 is further configured to perform feature alignment on the feature correlation matrix and the characterization correlation matrix based on a feature alignment rule, and generate a divergence value as a first loss parameter; embedding an average representation cluster center of the medical chart data into a representation matrix based on a class center alignment rule to obtain a cluster center representation matrix; calculating a correlation matrix according to the cluster center characterization matrix; performing feature alignment on the correlation matrix and the identity matrix to generate a divergence value as a second loss parameter; sharpening the clustering probability based on a minimum entropy alignment rule to generate a sharpening probability; generating a third loss parameter according to the clustering probability and the sharpening probability; and updating parameters of the graph alignment neural network according to at least one of the first loss parameter, the second loss parameter and the third loss parameter.
The various modules in the above-described graph-alignment neural network-based patient clustering device may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing medical graph data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a patient clustering method based on a graph alignment neural network.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures relevant to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
acquiring medical graph data corresponding to a patient; generating a feature matrix and a graph adjacency matrix according to the medical graph data; and inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient; wherein the graph alignment neural network includes at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
In one embodiment, the processor when executing the computer program further performs the steps of:
generating the feature matrix according to the node features in the medical graph data; and generating the graph adjacency matrix according to the graph nodes and the relationships among the graph nodes in the medical graph data.
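The step above can be sketched as assembling a node-by-feature matrix X and a node-by-node adjacency matrix A. The representation below (a dict of node features, a list of undirected edges, self-loops on the diagonal) is an assumed illustrative encoding, not the patent's data format.

```python
import numpy as np

def build_matrices(node_features, edges, num_nodes):
    """Sketch: assemble the feature matrix X and graph adjacency matrix A
    from medical graph data. `node_features` maps node index -> feature
    vector; `edges` is a list of (i, j) node pairs. All names and the
    self-loop convention are assumptions for illustration."""
    feat_dim = len(next(iter(node_features.values())))
    X = np.zeros((num_nodes, feat_dim))
    for i, f in node_features.items():
        X[i] = f
    A = np.eye(num_nodes)       # self-loops keep each node's own features
    for i, j in edges:          # undirected relationships between nodes
        A[i, j] = 1.0
        A[j, i] = 1.0
    return X, A
```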
In one embodiment, the processor when executing the computer program further performs the steps of:
regularizing the feature matrix to obtain a regularized feature matrix; multiplying the regularized feature matrix by the graph adjacency matrix, and then inputting the result into the graph alignment neural network, where the graph alignment neural network is a multi-layer network and each layer includes a multi-layer perceptron; sequentially performing iterative training through the graph alignment neural network, and outputting a clustering probability corresponding to each layer of the network; and calculating the clustering distribution matrix corresponding to the patient according to the clustering probability of each layer of the network.
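A minimal sketch of the regularize-multiply-propagate flow described above, under stated assumptions: "regularizing" is read as row-wise L2 normalization (the patent does not fix the norm), each layer is reduced to a single ReLU perceptron weight matrix, and `weight_stack` is an illustrative name.

```python
import numpy as np

def l2_normalize(M, eps=1e-12):
    # Row-wise L2 regularization -- one common reading of "regularizing
    # the feature matrix"; the exact norm is not specified in the patent.
    return M / (np.linalg.norm(M, axis=1, keepdims=True) + eps)

def propagate(X, A, weight_stack):
    """Sketch: the regularized features are multiplied by the adjacency
    matrix before entering each layer's perceptron. `weight_stack` holds
    one weight matrix per layer (an assumption -- the patent only says
    'multi-layer network'). Returns the per-layer outputs."""
    H = l2_normalize(X)
    outputs = []
    for W in weight_stack:
        H = np.maximum(A @ H @ W, 0.0)  # neighbor aggregation + ReLU layer
        outputs.append(H)
    return outputs
```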
In one embodiment, the processor when executing the computer program further performs the steps of:
for each layer of the network: a first layer of the multi-layer perceptron performs feature extraction on input data to obtain a correlation matrix; a second layer of the multi-layer perceptron performs probability prediction according to the correlation matrix and outputs a clustering probability; and the loss of the output parameters of the multi-layer perceptron is calculated through at least one of the feature alignment rule, the class center alignment rule and the minimum entropy alignment rule, and the parameters of the graph alignment neural network are updated.
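The two-stage structure of each layer's perceptron can be sketched as below: stage one extracts features (whose normalized Gram matrix serves as the correlation matrix), stage two produces softmaxed cluster probabilities. Weight shapes and the Gram-matrix reading of "correlation matrix" are assumptions for illustration.

```python
import numpy as np

def layer_forward(H_in, W1, W2):
    """Sketch of one layer's two-stage perceptron. Returns the extracted
    features Z, a correlation matrix S (cosine-similarity Gram matrix of
    Z -- an assumed interpretation), and cluster probabilities P."""
    Z = np.maximum(H_in @ W1, 0.0)                         # stage 1: features
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    S = Zn @ Zn.T                                          # correlation matrix
    logits = Z @ W2                                        # stage 2: prediction
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    return Z, S, P
```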
In one embodiment, the processor when executing the computer program further performs the steps of:
calculating a characterization matrix according to the regularized feature matrix; regularizing the characterization matrix to obtain a regularized characterization matrix; and calculating a feature correlation matrix and a characterization correlation matrix according to the regularized feature matrix and the regularized characterization matrix.
In one embodiment, the processor when executing the computer program further performs the steps of:
performing deformation processing on the characterization matrix to obtain a deformed characterization matrix; and outputting the clustering probability through the characterization matrix and the deformed characterization matrix.
In one embodiment, the processor when executing the computer program further performs the steps of:
performing feature alignment on the feature correlation matrix and the characterization correlation matrix based on the feature alignment rule to generate a divergence value as a first loss parameter; embedding the average representation cluster centers of the medical graph data into the characterization matrix based on the class center alignment rule to obtain a cluster center characterization matrix, calculating a correlation matrix according to the cluster center characterization matrix, and performing feature alignment on the correlation matrix and an identity matrix to generate a divergence value as a second loss parameter; sharpening the clustering probability based on the minimum entropy alignment rule to generate a sharpened probability, and generating a third loss parameter according to the clustering probability and the sharpened probability; and updating the parameters of the graph alignment neural network according to at least one of the first loss parameter, the second loss parameter and the third loss parameter.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring medical graph data corresponding to a patient; generating a feature matrix and a graph adjacency matrix according to the medical graph data; and inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient; wherein the graph alignment neural network includes at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
In one embodiment, the computer program when executed by the processor further performs the steps of:
generating the feature matrix according to the node features in the medical graph data; and generating the graph adjacency matrix according to the graph nodes and the relationships among the graph nodes in the medical graph data.
In one embodiment, the computer program when executed by the processor further performs the steps of:
regularizing the feature matrix to obtain a regularized feature matrix; multiplying the regularized feature matrix by the graph adjacency matrix, and then inputting the result into the graph alignment neural network, where the graph alignment neural network is a multi-layer network and each layer includes a multi-layer perceptron; sequentially performing iterative training through the graph alignment neural network, and outputting a clustering probability corresponding to each layer of the network; and calculating the clustering distribution matrix corresponding to the patient according to the clustering probability of each layer of the network.
In one embodiment, the computer program when executed by the processor further performs the steps of:
for each layer of the network: a first layer of the multi-layer perceptron performs feature extraction on input data to obtain a correlation matrix; a second layer of the multi-layer perceptron performs probability prediction according to the correlation matrix and outputs a clustering probability; and the loss of the output parameters of the multi-layer perceptron is calculated through at least one of the feature alignment rule, the class center alignment rule and the minimum entropy alignment rule, and the parameters of the graph alignment neural network are updated.
In one embodiment, the computer program when executed by the processor further performs the steps of:
calculating a characterization matrix according to the regularized feature matrix; regularizing the characterization matrix to obtain a regularized characterization matrix; and calculating a feature correlation matrix and a characterization correlation matrix according to the regularized feature matrix and the regularized characterization matrix.
In one embodiment, the computer program when executed by the processor further performs the steps of:
performing deformation processing on the characterization matrix to obtain a deformed characterization matrix; and outputting the clustering probability through the characterization matrix and the deformed characterization matrix.
In one embodiment, the computer program when executed by the processor further performs the steps of:
performing feature alignment on the feature correlation matrix and the characterization correlation matrix based on the feature alignment rule to generate a divergence value as a first loss parameter; embedding the average representation cluster centers of the medical graph data into the characterization matrix based on the class center alignment rule to obtain a cluster center characterization matrix, calculating a correlation matrix according to the cluster center characterization matrix, and performing feature alignment on the correlation matrix and an identity matrix to generate a divergence value as a second loss parameter; sharpening the clustering probability based on the minimum entropy alignment rule to generate a sharpened probability, and generating a third loss parameter according to the clustering probability and the sharpened probability; and updating the parameters of the graph alignment neural network according to at least one of the first loss parameter, the second loss parameter and the third loss parameter.
In one embodiment, a computer program product is provided, comprising a computer program that when executed by a processor implements the graph-alignment neural network-based patient clustering method of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the steps of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided in the present application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, data processing logic units based on quantum computing, or the like, but are not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of technical features, it should be considered within the scope of this description.
The foregoing embodiments illustrate only a few implementations of the application; although they are described in detail, they are not to be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the application shall be subject to the appended claims.

Claims (10)

1. A method for clustering patients based on graph alignment neural networks, the method comprising:
acquiring medical graph data corresponding to a patient;
generating a feature matrix and a graph adjacency matrix according to the medical graph data;
inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient; wherein the graph alignment neural network includes at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
2. The method of claim 1, wherein the medical graph data comprises graph nodes, node features, and relationships among the graph nodes, and generating a feature matrix and a graph adjacency matrix according to the medical graph data comprises:
generating the feature matrix according to the node features in the medical graph data;
and generating a graph adjacency matrix according to the graph nodes and the relationships among the graph nodes in the medical graph data.
3. The method of claim 1, wherein said inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient comprises:
regularizing the feature matrix to obtain a regularized feature matrix;
multiplying the regularized feature matrix by the graph adjacency matrix, and then inputting the result into the graph alignment neural network; wherein the graph alignment neural network is a multi-layer network and each layer includes a multi-layer perceptron;
sequentially performing iterative training through the graph alignment neural network, and outputting a clustering probability corresponding to each layer of the network;
and calculating the clustering distribution matrix corresponding to the patient according to the clustering probability of each layer of the network.
4. The method of claim 3, wherein sequentially performing iterative training through the graph alignment neural network and outputting the clustering probability corresponding to each layer of the network comprises:
for each layer of the network:
a first layer of the multi-layer perceptron performs feature extraction on input data to obtain a correlation matrix;
the second layer of the multi-layer perceptron carries out probability prediction according to the correlation matrix and outputs clustering probability;
and calculating the loss of the output parameters of the multi-layer perceptron through at least one of a feature alignment rule, a class center alignment rule and a minimum entropy alignment rule, and updating the parameters of the graph alignment neural network.
5. The method of claim 4, wherein the feature extraction of the input data by the first layer of the multi-layer perceptron comprises:
calculating a characterization matrix according to the regularized feature matrix;
regularizing the characterization matrix to obtain a regularized characterization matrix;
and calculating a feature correlation matrix and a characterization correlation matrix according to the regularized feature matrix and the regularized characterization matrix.
6. The method of claim 5, wherein the second layer of the multi-layer perceptron performing probability prediction according to the correlation matrix and outputting the clustering probability comprises:
performing deformation processing on the characterization matrix to obtain a deformed characterization matrix;
and outputting the clustering probability through the characterization matrix and the deformed characterization matrix.
7. The method of claim 6, wherein calculating the loss of the output parameters of the multi-layer perceptron through at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule, and updating the parameters of the graph alignment neural network comprises:
based on a feature alignment rule, performing feature alignment on the feature correlation matrix and the characterization correlation matrix to generate a divergence value as a first loss parameter;
embedding the average representation cluster centers of the medical graph data into the characterization matrix based on a class center alignment rule to obtain a cluster center characterization matrix; calculating a correlation matrix according to the cluster center characterization matrix; and performing feature alignment on the correlation matrix and an identity matrix to generate a divergence value as a second loss parameter;
sharpening the clustering probability based on a minimum entropy alignment rule to generate a sharpened probability; and generating a third loss parameter according to the clustering probability and the sharpened probability;
and updating parameters of the graph alignment neural network according to at least one of the first loss parameter, the second loss parameter and the third loss parameter.
8. A graph-aligned neural network-based patient clustering device, the device comprising:
the acquisition module is used for acquiring medical graph data corresponding to a patient;
the preprocessing module is used for generating a feature matrix and a graph adjacency matrix according to the medical graph data;
the neural network module is used for inputting the feature matrix and the graph adjacency matrix into a graph alignment neural network to generate a clustering distribution matrix corresponding to the patient; wherein the graph alignment neural network includes at least one of a feature alignment rule, a class center alignment rule, and a minimum entropy alignment rule.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any one of claims 1 to 7.
CN202310924938.1A 2023-07-26 2023-07-26 Patient clustering method and device based on graph alignment neural network and computer equipment Pending CN116628538A (en)

Publications (1)

Publication Number Publication Date
CN116628538A true CN116628538A (en) 2023-08-22

Family

ID=87592511

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807422A (en) * 2021-09-07 2021-12-17 南京邮电大学 Weighted graph convolutional neural network score prediction model fusing multi-feature information
CN115661550A (en) * 2022-11-17 2023-01-31 之江实验室 Graph data class imbalance classification method and device based on generation countermeasure network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LINXUAN SONG ET AL.: "GANN: Graph Alignment Neural Network for Semi-Supervised Learning", arXiv:2303.07778v1, page 3 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination