CN117540247A - Comprehensive decision method, system and medium for preference learning based on graph neural network - Google Patents


Info

Publication number
CN117540247A
Authority
CN
China
Prior art keywords
preference
graph
scheme
neural network
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311445543.XA
Other languages
Chinese (zh)
Inventor
林荣恒
孟振华
陈硕
吴步丹
邹华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202311445543.XA priority Critical patent/CN117540247A/en
Publication of CN117540247A publication Critical patent/CN117540247A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a comprehensive decision method, system and medium for preference learning based on a graph neural network, wherein the method comprises the following steps: in stage 1, the preference relations among schemes are mapped onto a graph structure, and a preference relation graph is constructed from these relations; in stage 2, the preference relation prediction problem is converted into an edge classification problem on the graph, edge features on the graph are obtained with a multi-layer perceptron model, and preference information on the graph is mined by means of the expressive power of the graph neural network; in stage 3, a comparator neural network is constructed, which takes the pairwise scheme preferences of the previous stage as input, outputs a preference score for each scheme, and ranks the scheme preferences according to the scores. The method solves the problem of generating excessive invalid preference information during comprehensive decision making, improves ranking prediction performance, and at the same time ensures the interpretability of the comprehensive decision process, so that the obtained decision results are more credible and reliable.

Description

Comprehensive decision method, system and medium for preference learning based on graph neural network
Technical Field
The invention relates to the technical field of comprehensive decision analysis, in particular to a preference learning method, a preference learning system and a preference learning medium based on a graph neural network.
Background
Decision making is the process of comparing, judging, evaluating and selecting an optimal solution from a number of possible solutions in order to accomplish a task or pursue a goal, and is one of the most basic human activities. With the development of internet technology, and with the help of large amounts of knowledge and data, decision making has gradually become automated and intelligent, and machine learning methods represented by preference learning have gradually become a research focus in the field of intelligent decision making. Preference learning is a classification method based on observed preference information; from the viewpoint of supervised learning, preference learning trains on a set of items with preferences over labels or other items, and predicts the preferences of all items. In general, the predicted preference relations need to form a total order, so preference learning is in essence the same as comprehensive decision making: ranking a finite set of alternatives from best to worst and selecting the preferred alternative.
Existing preference learning methods can generally be divided into parameterized and non-parameterized methods. A parameterized method uses a parameterized scoring function and learns the parameters of that function from training data; training is faster but the expressive power is weaker. A non-parameterized method defines a distribution over scoring functions and learns the scoring function directly from the training data; it is more expressive but computationally expensive. For most complex decision scenarios, the decision examples and related decision data involved are large in scale, so the advantages of parameterized preference learning methods are more pronounced and research on them is also richer. For example, the nonlinear ranking SVM method applies a Support Vector Machine (SVM) to improve the retrieval accuracy of click data; the scoring function is mainly learned by minimizing the distance between scheme pairs, formalizing the preference learning problem as a constrained optimization problem. The linear RankNet method uses a probabilistic cost function to learn a scoring function, which can be any model that is differentiable with respect to its parameters, meaning the probabilistic loss function is not tied to a fixed machine learning model; the probability of a scheme preference is then predicted from the scoring function. However, preference learning methods such as RankNet rely strongly on assumptions about the ranking function, and such assumptions tend to be unrealistic in practice and may lead to incorrect rankings.
Preference learning on a graph can be seen as ranking the nodes in the graph, and related methods fall into two major categories: topology-based methods and deep learning methods. Topology-based methods can vary significantly in performance between different data sets because they do not learn to rank from examples, whereas deep learning methods mostly rank nodes based on the inherent graph structure, where the relationships between nodes sometimes represent competition or collaboration rather than preference, thus affecting preference prediction. If the schemes in preference learning are modeled as nodes of a graph, the preference relations among schemes are modeled as edges of the graph, and a GNN is used to extract scheme features, the performance of the parameterized preference learning method can be improved, and the problem of generating excessive invalid preference information in comprehensive decision making can be solved.
Therefore, how to perform preference learning on a graph structure and how to improve the expression capability of the parameterized preference learning method by means of the expression capability of GNN is a problem to be solved urgently.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a comprehensive decision method, system and medium for preference learning based on a graph neural network, comprising three stages: preference relation graph construction, preference relation prediction and scheme preference ordering.
The first aspect of the invention discloses a comprehensive decision method for preference learning based on a graph neural network, which comprises the following steps:
s1: preprocessing data in a decision problem, wherein the decision problem consists of an alternative scheme and an evaluation criterion, and decision scheme information with characteristic representation is obtained;
s2: mapping the decision scheme information obtained in the step S1 into a graph structure, and constructing a preference relation directed graph according to preference relation among the decision schemes, wherein nodes on the directed graph represent schemes, and edges represent preference relation among the schemes;
s3: obtaining scheme preference characteristics on the preference relation directed graph by using a multi-layer perceptron model, and mining scheme preference data on the graph by using the expression capacity of the graph neural network, wherein the scheme preference data is a paired scheme;
s4: inputting the paired schemes obtained in the step S3 into a comparator neural network, outputting preference scores of each scheme, and correcting relevant positive inconsistent preferences;
s5: and sorting according to the preference scores of the schemes to obtain the final result of the decision problem, wherein the scheme with the forefront sorting is the optimal scheme of the problem.
According to the method of the first aspect of the present invention, preprocessing the data in the decision problem in the step S1 includes: and removing repeated data, processing missing values and abnormal values, and performing data conversion and standardization.
According to the method of the first aspect of the present invention, the preference relationship between the schemes in the step S2 is determined by the preference supervision information size of the scheme pair, and the characteristic value of the node is the evaluation criterion evaluation value of the alternative scheme.
According to the method of the first aspect of the present invention, the step S3 of obtaining the scheme preference feature on the preference relation directed graph by using the multi-layer perceptron model includes the following steps:
s31: randomly sampling neighbors of nodes on the preference relation directed graph, wherein the number of the neighbors sampled by each hop is not more than the aggregation layer number;
s32: generating an embedded representation of a node on the graph, specifically, firstly aggregating the characteristics of two-hop neighbors to generate an embedded representation of a one-hop neighbor, and then aggregating the embedded representation of the one-hop to generate an embedded representation of the node;
s33: inputting the embedded representation of the node in the step S32 into a fully-connected network to obtain a characteristic value of the node, and inputting the characteristic value into a multi-layer perceptron;
s34: and the characteristic values of adjacent nodes output by the multi-layer perceptron are subjected to nonlinear conversion to obtain the characteristic representation of the upper edge of the graph.
According to the method of the first aspect of the present invention, the step S4 includes the steps of:
s41: comparing the node data on the preference relation directed graph in pairs according to the category to which the node data belongs, and constructing scheme pairs;
s42: inputting the scheme pair into a three-layer unbiased comparator neural network, and outputting preference probabilities of two nodes in a node pair corresponding to the scheme pair;
s43: the two-node preference probabilities are converted to preference scores for each node using nonlinear transformation.
According to the method of the first aspect of the present invention, in the step S3, edge features on the preference relationship graph are obtained by using the graph neural network, and node neighbors on the graph are sampled and aggregated mainly according to a neighbor aggregation mechanism to obtain node features, where an expression of neighbor aggregation is:
$$h_v^{t+1} = \sigma\left( W \cdot \left( h_v^{t} \,\Vert\, \mathrm{AGG}_{t+1}\left( \left\{ h_u^{t}, \forall u \in N_v \right\} \right) \right) \right)$$

where $\sigma$ is a sigmoid function, $\Vert$ denotes feature concatenation, $h_v^{t}$ is the feature representation of node $v$ at time $t$, $h_u^{t}$ is the feature representation of the neighbor node $u$ of node $v$ at time $t$, $N_v$ is the neighbor set of node $v$, $W$ is a weight matrix, and $\mathrm{AGG}_{t+1}$ is an aggregation function that may contain a variety of operators, including an average operator.
According to the method of the first aspect of the present invention, in the step S4, the comparator neural network satisfies transitivity, antisymmetry and reflexivity, wherein for inputs $(x, y) \in D \times D$, with $D$ the input domain, the comparator neural network $p$ can be expressed as

$$p(x, y) = \sum_i w_i\, f\big( h_i(x, y) - h_i(y, x) \big) + b$$

where $w_i$ is the output weight of the $i$-th hidden-layer neuron, $f$ is an antisymmetric function (i.e. $f(-z) = -f(z)$), $h_i(x, y)$ and $h_i(y, x)$ are the outputs of the $i$-th hidden-layer neuron for the inputs $(x, y)$ and $(y, x)$ respectively, and $b$ is the bias, which is zero in the unbiased comparator.

In view of the antisymmetric nature of $f$ and $b = 0$,

$$p(y, x) = \sum_i w_i\, f\big( h_i(y, x) - h_i(x, y) \big) = -\sum_i w_i\, f\big( h_i(x, y) - h_i(y, x) \big) = -p(x, y),$$

i.e. $p(x, y) = -p(y, x)$, so antisymmetry holds. Similarly, reflexivity is also satisfied: since $f(0) = 0$, it follows that $p(x, x) = 0$. Because $f$ keeps the sign of its input consistent, i.e. $\mathrm{sign}(f(z)) = \mathrm{sign}(z)$, if $p(x, y) > 0$ and $p(y, z) > 0$ then $p(x, z) = f\big( (x - y) + (y - z) \big) > 0$, i.e. the comparator neural network $p$ satisfies transitivity.
In the method according to the first aspect of the present invention, in the step S4, the comparator neural network outputs a preference score for the scheme pair $(x, y)$ and, based on the probabilistic ranking framework, predicts the probability that scheme $x$ is better than scheme $y$. Letting $o_{xy} = f(x) - f(y)$, the predicted probability is $P_{xy} = \frac{e^{o_{xy}}}{1 + e^{o_{xy}}}$, and the probability loss $C_{xy}$ uses the cross-entropy loss function:

$$L_{rank} = \sum_{(x, y)} C_{xy}, \qquad C_{xy} = -\bar{P}_{xy} \log P_{xy} - \left( 1 - \bar{P}_{xy} \right) \log\left( 1 - P_{xy} \right) = -\bar{P}_{xy}\, o_{xy} + \log\left( 1 + e^{o_{xy}} \right)$$

where $L_{rank}$ is the ranking loss and $\bar{P}_{xy}$ is the expected probability that scheme $x$ is better than scheme $y$. When $\bar{P}_{xy} = 1$, $C_{xy} = -\log P_{xy} = \log\left( 1 + e^{-o_{xy}} \right)$, i.e. when the desired result is that $x$ ranks better than $y$, the larger $o_{xy}$ is, the smaller the loss; when $\bar{P}_{xy} = 0$, $C_{xy} = -\log\left( 1 - P_{xy} \right) = \log\left( 1 + e^{o_{xy}} \right)$, i.e. when the desired result is that $y$ ranks better than $x$, the smaller $o_{xy}$ is, the smaller the loss; when $\bar{P}_{xy} = \frac{1}{2}$, $C_{xy} = \frac{1}{2} o_{xy} + \log\left( 1 + e^{-o_{xy}} \right)$, i.e. when it is not known which of $x$ and $y$ should rank higher, the loss is minimal at $o_{xy} = 0$. After obtaining the loss function, the parameters of the comparator neural network can be obtained through gradient descent.
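The pairwise probability loss C_xy can be checked numerically. The sketch below implements the closed form C = −P̄·o + log(1 + eᵒ); the function name is hypothetical:

```python
import math

def rank_loss(o_xy, p_bar):
    """RankNet-style probability loss for a scheme pair with score gap
    o_xy = f(x) - f(y) and expected probability p_bar that x beats y:
    C = -p_bar * o_xy + log(1 + exp(o_xy))."""
    return -p_bar * o_xy + math.log1p(math.exp(o_xy))
```

With p_bar = 1 the loss shrinks as o_xy grows, with p_bar = 0 it shrinks as o_xy falls, and with p_bar = 1/2 it is minimized at o_xy = 0, matching the three cases described above.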
The second aspect of the invention discloses a comprehensive decision system based on the preference learning of the graph neural network, which comprises a computer device, wherein the computer device is used for executing the steps of the comprehensive decision method based on the preference learning of the graph neural network in the first aspect.
A third aspect of the present invention discloses a computer-readable storage medium, on which a computer program is stored to implement the comprehensive decision method for preference learning based on a graph neural network according to the first aspect.
In summary, the scheme provided by the invention has the following technical effects: the method utilizes the expression capability of the graph neural network to improve the performance of the parameterized preference learning method by mapping the decision problem into the graph structure, and comprises three stages of preference relation graph construction, preference relation prediction and scheme preference ordering. In the construction stage of the preference relation diagram, a decision scheme and decision criteria of a decision problem are respectively converted into nodes and node characteristics of the preference relation diagram, and then edges of the preference relation diagram are constructed according to different categories of the scheme, wherein the edges of the preference relation diagram represent known preference relations among schemes; in the preference relation prediction stage, the preference relation prediction of the alternative scheme is converted into the edge classification of the preference relation graph, in addition, edge characteristics on the preference graph are obtained by constructing a multi-layer perceptron model, and the preference information on the graph is mined by means of the expression capacity of the graph neural network; in the scheme preference ordering stage, the comparator neural network receives preferences of alternatives, the neural network is structured so that preference data of the alternatives are output in a preference score form, all the alternatives are ordered according to the preference score, and the earlier ordered alternatives are more suitable for decision selection. 
The method solves the problem that excessive invalid preference information is generated during comprehensive decision making in the prior art, uses the expressive power of the graph neural network to mine the preference information on the graph, helps the decision maker establish preference relations more scientifically, reduces the influence of subjectivity on decision results, improves ranking prediction performance, and ensures the interpretability of the comprehensive decision process, so that the obtained decision results are more credible and reliable.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a diagram showing the relationship between the various aspects of the present invention;
fig. 2 is a flowchart of a preference learning method according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a comprehensive decision method for preference learning based on a graph neural network, which comprises three stages of preference relation graph construction, preference relation prediction and scheme preference ordering. In the first stage, the preference relation among schemes is mapped to a graph structure, and a preference relation graph is constructed according to the preference relation; in the second stage, mining preference information on the preference relation graph by means of the expression capacity of the graph neural network so as to obtain characteristic representation of the preference relation; in the third stage, a comparator neural network is used to determine a preference score for each solution and the preferences of the solutions are ranked according to the score.
As shown in FIG. 1, the invention can be decomposed into three parts: (1) preference relation graph construction, (2) preference relation prediction and (3) scheme preference ordering; the relations among the components are shown in FIG. 1. Part (1), preference relation graph construction, is mainly used to map the comprehensive decision problem onto a graph structure, so that the scheme preference ordering problem is converted into an edge classification problem on the graph. Part (2), preference relation prediction, is mainly used to predict the scheme categories on the preference relation graph constructed in (1) and determine the preference data of each scheme combination. Part (3), scheme preference ordering, receives the pairwise alternative preference information from (2) and evaluates the preference scores of all alternatives through the comparator neural network, thereby obtaining the ranking of the alternatives and the decision result.
The preference learning method based on the graph neural network comprises 5 specific processes, and a schematic diagram of the disclosed embodiment of the invention is shown in fig. 2. The embodiment provides a preference learning method and is used for comprehensive decision making, and specifically comprises the following steps:
s1: the data in the decision questions are preprocessed, wherein the general decision questions are mostly composed of alternative schemes (objects) and evaluation criteria (features), and decision scheme information with feature representation is obtained.
Firstly, preprocessing decision data in the comprehensive decision problem (S1), wherein the expression form of the decision problem is mainly a combination of alternative schemes (schemes) and evaluation criteria (characteristics), the alternative schemes represent possible solutions, the evaluation criteria are used for measuring the characteristics of the schemes, and decision scheme preference information with characteristic representation can be obtained by applying the evaluation criteria to the alternative schemes.
Further, in S1, the decision problem is expressed as $X = \{x_1, x_2, \ldots, x_m\}$, the set of $m$ alternatives, and $C = \{c_1, c_2, \ldots, c_n\}$, the set of $n$ evaluation criteria, where $x_{ij}$ is the evaluation value of the $i$-th alternative $x_i$ with respect to the $j$-th evaluation criterion $c_j$. According to the relation between the alternative set and the evaluation criteria, the original judgment matrix $R$ can be composed as:

$$R = \left( x_{ij} \right)_{m \times n} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}$$
since the dimensions of the evaluation criteria are different, the evaluation needs to unify the dimensions, that is, the decision data needs to be standardized:
$$\tilde{x}_{ij} = \begin{cases} \dfrac{x_{ij} - \min_{i \in M} x_{ij}}{\max_{i \in M} x_{ij} - \min_{i \in M} x_{ij}}, & c_j \in I_1 \\[2ex] \dfrac{\max_{i \in M} x_{ij} - x_{ij}}{\max_{i \in M} x_{ij} - \min_{i \in M} x_{ij}}, & c_j \in I_2 \end{cases}$$

where $x_{ij}$ is the criterion evaluation value, $\tilde{x}_{ij}$ is the criterion evaluation value after preprocessing, $I_1$ is the set of benefit-type criteria, i.e. criteria for which a higher evaluation value is better, $I_2$ is the set of cost-type criteria, i.e. criteria for which a smaller evaluation value is better, and $M = \{1, 2, \ldots, m\}$. Furthermore, in preference learning, a preference model needs to be trained from existing preference decision data, so preference supervision information $L = \{0, 1, 2\}$ is needed, corresponding to different preference categories, i.e. for each alternative $x_i$ the corresponding preference label is $y_i \in L$.
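The standardization step can be sketched as follows, assuming min-max normalization with the benefit/cost distinction between $I_1$ and $I_2$ described above; the function name and the exact normalization form are illustrative assumptions:

```python
import numpy as np

def normalize_matrix(R, benefit_cols):
    """Min-max normalize a judgment matrix R (m schemes x n criteria).

    benefit_cols: set of column indices belonging to I1 (benefit criteria:
    larger raw value is better); all other columns are treated as cost
    criteria in I2 (smaller raw value is better). A sketch of the
    standardization step in S1."""
    R = np.asarray(R, dtype=float)
    out = np.empty_like(R)
    for j in range(R.shape[1]):
        col = R[:, j]
        span = col.max() - col.min()
        if span == 0:                      # constant criterion: no information
            out[:, j] = 0.0
            continue
        if j in benefit_cols:              # benefit: higher raw value -> closer to 1
            out[:, j] = (col - col.min()) / span
        else:                              # cost: lower raw value -> closer to 1
            out[:, j] = (col.max() - col) / span
    return out
```

All normalized values fall in [0, 1] regardless of the original units, so criteria with different dimensions become comparable.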
It should be noted that, in this embodiment, the decision maker needs to sort and clean the decision data to ensure the accuracy and consistency of the data, including removing duplicate data, processing missing values and outliers, and performing data conversion and standardization. Through these steps, the quality of the decision data can be improved, and a reliable basis is provided for subsequent decision analysis. In addition, the decision maker needs to distinguish and unify dimension processing on evaluation criteria, which can be various indexes and factors, such as cost, benefit, risk, feasibility and the like, and the decision maker can allocate corresponding weights for each criterion according to own requirements and decision targets and apply the evaluation criteria to alternative schemes. Through preprocessing and evaluation, the decision scheme obtained by the decision maker contains characteristic representation information, and the characteristic representation information can help the decision maker to better understand the advantages and disadvantages of the alternative scheme, and make trade-offs and comparisons under different evaluation criteria. The form of the decision scheme information obtained after pretreatment is as follows:
TABLE 1 Pre-processed decision scheme preference information form
Where each row represents an alternative: the first column gives the category (decision relevance) of the scheme, where a larger number indicates a better category and a more preferred scheme, and the columns from the second onward give the feature vector of each scheme.
S2: and mapping the decision scheme information into a graph structure, and constructing a preference relation directed graph according to preference relation among schemes, wherein nodes on the graph represent the schemes, and edges represent the preference relation among the schemes.
And secondly, based on decision data of the step S1, mapping decision scheme information into a graph structure, constructing a preference relation directed graph according to preference relation among schemes (S2), wherein each node on the graph represents one scheme, and the sides represent the preference relation among the schemes, so that the preference relation between different schemes can be more clearly understood by constructing the preference relation directed graph.
In the step S2, a preference relation graph $G = (V, E)$ is constructed, wherein a node $v \in V$ is an option in the alternative set $X$, an edge $e \in E$ is a preference relation between alternatives, the judgment of the preference relation is determined by the preference supervision information of the scheme pair, and the feature value of a node is the evaluation criterion value of the corresponding alternative. The key to introducing the graph here is to map the preference decision task to the graph domain, where it can also be regarded as a classification task. For example, to classify the preference relation between two alternatives: if scheme a is better than scheme b, the classification result is 1, otherwise 0; thus if $(v_i, v_j) \in E$, then scheme $v_i$ is preferred to scheme $v_j$. If a scheme set is $X = \{x_1, x_2, x_3\}$ and the preference relation between the schemes is $x_1 > x_3 > x_2$, a preference relation graph $G = (V, E)$ is constructed with $E = \{e_1 = (x_1, x_3), e_2 = (x_2, x_3)\}$, and the label set corresponding to $E$ is $L = \{1, 0\}$. The main goal of preference learning is to predict the preference relation between schemes; mapping this task to the graph domain therefore amounts to classifying edges in the preference graph, but since these relations are typically unknown, the edge classes in the graph are also unknown. It should be noted that the edges of the preference graph in this embodiment are directed, indicating the relative merit of one scheme over another; for example, if scheme A is considered better than scheme B, a directed edge points from node A to node B. In this way, the whole preference relation directed graph forms a complex network structure showing the relative preference relations among the schemes.
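The worked example above (X = {x1, x2, x3} with x1 > x3 > x2 and label set L = {1, 0}) can be reproduced with a small labeling routine; the adjacency-list representation is an illustrative sketch, not the patent's data structure:

```python
def label_preference_edges(edges, known_prefs):
    """Label candidate edges of the preference graph: edge (u, v) gets
    class 1 when u is preferred to v, class 0 when v is preferred to u,
    and None when the preference is unknown (to be predicted in stage 2)."""
    labels = []
    for u, v in edges:
        if (u, v) in known_prefs:
            labels.append(1)
        elif (v, u) in known_prefs:
            labels.append(0)
        else:
            labels.append(None)
    return labels
```

Applied to the example's edge set, the first two edges receive exactly the labels {1, 0} stated above.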
The establishment of the preference relation directed graph is helpful for a decision maker to analyze and understand the good and bad relation among different schemes more intuitively, and the representation mode of the graph can provide more intuitive and easier-to-understand decision information, so that the decision maker can grasp the characteristics and potential influence of each scheme more accurately. Meanwhile, a decision maker can quantitatively or qualitatively quantify the preference relation among schemes according to own judgment and experience. In addition, various decision analysis methods and techniques, such as analytic hierarchy process, fuzzy comprehensive evaluation and the like, can be used for assisting in constructing the preference relation directed graph. The methods and techniques may help decision makers build preference relationships more scientifically and reduce the impact of subjectivity on decision results.
S3: and designing a multi-layer perceptron model to obtain scheme preference characteristics on the preference relation directed graph, and mining preference information on the graph by using the expression capacity of the graph neural network.
In order to acquire scheme preference characteristics on the preference relation directed graph, a multi-layer perceptron model (S3) is designed, the preference characteristics of each scheme can be acquired by learning the relation between nodes on the graph constructed by the S2, and meanwhile, the preference information on the graph is mined by utilizing the expression capacity of the graph neural network, so that the performance and the accuracy of the model are further improved.
It should be noted that, the multi-layer perceptron model in this embodiment is composed of multiple neural network layers, each layer contains multiple neurons, each node in the preference relation graph is represented as an input of the multi-layer perceptron model, and the preference characteristics of each scheme are obtained through the forward propagation process of the model. Through the training process of the multi-layer perceptron model, weight and bias parameters can be adjusted, so that the model can learn the relation between nodes on the graph and capture the preference characteristics of the scheme. In addition, in order to further improve the performance and accuracy of the model, the graph neural network is combined with the multi-layer perceptron model, so that the connection relation between nodes on the graph can be fully utilized, and richer preference information can be mined. The design of the graph neural network can consider the neighbor information of the nodes and the characteristics of the nodes, and can obtain more global graph preference relation expression by aggregating the neighbor information of the nodes. Meanwhile, the understanding and representing capability of the model to the scheme can be enriched by utilizing the characteristic vector of the node.
In the step S3, the differences between adjacent node features on the preference relation graph are fed into the multi-layer perceptron, and the feature of each edge on the preference relation graph is obtained accordingly:

$$h_{uv} = \sigma\left( W_1 h_u - W_2 h_v \right)$$

where $h_{uv}$ is the feature vector of the edge $uv$, $\sigma$ is a sigmoid function, $W_1$ and $W_2$ are weight matrices, and $h_u$ and $h_v$ are the feature representations of nodes $u$ and $v$ respectively. Based on the feature representations of the edges, the classes of all edges on the preference relation graph are predicted using the graph neural network. Most variants of graph neural networks follow a message-passing paradigm, in which the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighbors. In order to aggregate local neighborhood information of nodes with a more general inductive framework, GraphSAGE is adopted as the preference relation prediction model. For a given preference relation graph G, the neighborhood aggregation policy of GraphSAGE is
h_v^{t+1} = σ(W·(h_v^t ‖ AGG_{t+1}({h_u^t, ∀u ∈ N_v})))

where h_v^t is the feature representation of node v at time t, h_u^t is the feature representation of node v's neighbor u at time t, σ is the sigmoid function, N_v is the neighbor node set of node v, ‖ denotes concatenation, W is the weight matrix, and AGG_{t+1} is an aggregation function that contains three operators: average, LSTM, and pooling. The loss function of the preference relation prediction stage is a cross-entropy loss:
L_class = −Σ_{uv∈E} [ŷ_uv·log(h_uv) + (1 − ŷ_uv)·log(1 − h_uv)]

where L_class is the class loss, h_uv is the feature vector of edge uv and also serves as the predicted class of edge uv, and ŷ_uv is the true class of edge uv.
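The GraphSAGE neighborhood aggregation above, instantiated with the mean (average) aggregator, can be sketched as follows; the adjacency structure and weight values are illustrative, and a real model would learn W by minimizing the cross-entropy loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sage_mean_step(h, neighbors, W):
    """One GraphSAGE layer with the mean aggregator:
    h_v' = sigmoid(W @ concat(h_v, mean({h_u : u in N(v)})))."""
    d = h.shape[1]
    out = np.empty((h.shape[0], W.shape[0]))
    for v, nbrs in neighbors.items():
        agg = h[nbrs].mean(axis=0) if nbrs else np.zeros(d)
        out[v] = sigmoid(W @ np.concatenate([h[v], agg]))
    return out

h = np.eye(3)                       # 3 nodes with 3-dim one-hot features
neighbors = {0: [1, 2], 1: [0], 2: [0]}
W = np.ones((2, 6)) * 0.1           # maps concat(3 + 3) -> 2 dims
h_next = sage_mean_step(h, neighbors, W)
print(h_next.shape)  # (3, 2)
```

GraphSAGE also supports LSTM and pooling aggregators; the mean aggregator is shown here only because it is the simplest of the three.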
In some preference learning scenarios, a decision maker may only be interested in particular scheme preference relations under particular conditions, rather than in a ranking of all schemes. Therefore, when predicting the preference relations, it is necessary to select scheme nodes in the preference relation graph according to the specific preference conditions and create m different preference relation subgraphs G_k = (V_k, E_k) (k = 1, 2, …, m); the edges in each subgraph can then be classified accordingly.
Correspondingly, ŷ_uv^k can be understood as the true class of the edge uv in the k-th preference relation subgraph.
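The subgraph construction described above can be sketched as follows; the scheme attributes and the `cost` condition are purely illustrative names, not part of the method:

```python
def preference_subgraph(nodes, edges, condition):
    """Induce the subgraph G_k = (V_k, E_k) of schemes meeting `condition`.

    `nodes` maps a scheme id to its attribute dict; `edges` is a list of
    (u, v) pairs meaning "u is preferred to v". An edge survives only if
    both of its endpoints satisfy the preference condition.
    """
    V_k = {n for n, attrs in nodes.items() if condition(attrs)}
    E_k = [(u, v) for (u, v) in edges if u in V_k and v in V_k]
    return V_k, E_k

nodes = {"a": {"cost": 3}, "b": {"cost": 9}, "c": {"cost": 4}}
edges = [("a", "b"), ("a", "c"), ("c", "b")]
V_k, E_k = preference_subgraph(nodes, edges, lambda a: a["cost"] < 5)
print(sorted(V_k), E_k)  # ['a', 'c'] [('a', 'c')]
```

Running this for each of the m preference conditions yields the m subgraphs G_1, …, G_m whose edges are then classified independently.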
S4: the paired scheme preference data obtained in S3 is input to the comparator neural network, the preference score of each scheme is output, and any inconsistent preferences are corrected.
Based on S3, the paired scheme preference data is acquired and input into the comparator neural network (S4). The network outputs the preference score of each scheme and corrects the relevant inconsistent preferences, so that the merits of each scheme are evaluated more comprehensively through the operation of the comparator neural network.
It should be noted that, in this embodiment, the comparator neural network processes the comparison between two schemes with shared weights and learns their similarities or differences. Through the learning process of the comparator neural network, the preference score of each scheme can be obtained, and the schemes can be ranked or classified by score, so that the decision maker can better perform decision analysis and comparison. In addition, the comparator neural network can detect and correct possible inconsistencies in the preferences: for example, if an obvious preference inconsistency exists between two schemes, the comparator neural network corrects it by adjusting the scores, thereby improving the accuracy and consistency of the decision. The design and training of the comparator neural network must fully consider the preferences of the decision maker and the characteristics of the decision problem; before S4 is applied, it must therefore be ensured that the architecture and parameter settings of the comparator neural network match the requirements of the decision problem and that the network is fully trained and validated.
The objective of constructing the comparator neural network in S4 is to achieve a consistent and unique ranking of the alternatives, with preference data as input and scheme preference scores as output. Specifically, the preference data is converted into a matrix of pairwise scores, and the comparator neural network is trained to minimize the pairwise ranking loss, which ensures that its output scores reflect the true pairwise preferences. After each training iteration, the scores are updated based on the output of the comparator neural network, and the process is repeated until convergence. The comparator neural network is based on a probabilistic ranking framework: it outputs the preference score of the scheme pair (x, y) and predicts the probability that scheme x is better than scheme y. The probability loss C_xy is again computed with a cross-entropy loss function:
C_xy = −P̄_xy·log(P_xy) − (1 − P̄_xy)·log(1 − P_xy),   L_rank = Σ_{(x,y)} C_xy

where L_rank is the ranking loss, P̄_xy is the desired probability that scheme x is better than scheme y, P_xy = 1/(1 + exp(−o_xy)) is the predicted probability, and o_xy = f(x) − f(y). When P̄_xy = 1, C_xy = −log(P_xy) = log(1 + exp(−o_xy)); that is, when the desired result is that x ranks better than y, the larger o_xy is, the smaller the loss. When P̄_xy = 0, C_xy = −log(1 − P_xy) = log(1 + exp(o_xy)); that is, when the desired result is that y ranks better than x, the smaller o_xy is, the smaller the loss. When P̄_xy = 1/2, C_xy = o_xy/2 + log(1 + exp(−o_xy)); that is, when no preference between x and y is specified, the loss is minimized at o_xy = 0. After the loss function is obtained, the parameters of the comparator neural network can be learned by (stochastic) gradient descent.
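The pairwise loss C_xy can be computed in a numerically stable form. The sketch below uses the algebraic simplification C_xy = −P̄_xy·o_xy + log(1 + exp(o_xy)), which is equivalent to the cross-entropy form above once P_xy = 1/(1 + exp(−o_xy)) is substituted:

```python
import numpy as np

def pairwise_rank_loss(o_xy, p_bar):
    """RankNet-style cross-entropy loss C_xy for one scheme pair (x, y).

    o_xy = f(x) - f(y) is the score difference; p_bar is the desired
    probability that x is preferred to y (1, 0, or 0.5 for a tie).
    """
    # Stable form of -p_bar*log(P) - (1-p_bar)*log(1-P), P = sigmoid(o_xy)
    return -p_bar * o_xy + np.log1p(np.exp(o_xy))

# x clearly preferred (p_bar = 1): loss shrinks as o_xy grows
assert pairwise_rank_loss(3.0, 1.0) < pairwise_rank_loss(0.0, 1.0)
# y clearly preferred (p_bar = 0): loss shrinks as o_xy falls
assert pairwise_rank_loss(-3.0, 0.0) < pairwise_rank_loss(0.0, 0.0)
# tie (p_bar = 0.5): loss is minimal at o_xy = 0
assert pairwise_rank_loss(0.0, 0.5) < pairwise_rank_loss(1.0, 0.5)
```

Summing this quantity over all scheme pairs gives L_rank, which is then minimized by gradient descent.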
Further, in order to improve training efficiency and perform multiple optimization tasks simultaneously, the invention adopts a joint training method for the loss functions in S3 and S4. The total loss function of the preference learning method based on the graph neural network can therefore be expressed as
L_loss = α·L_rank + β·L_class
where 0 < α, β < 1 represent the importance of the tasks at the different stages.
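The joint objective is a straightforward weighted sum; the weights in the sketch below are illustrative, since the method only requires 0 < α, β < 1:

```python
def joint_loss(l_rank, l_class, alpha=0.5, beta=0.5):
    """Weighted joint objective L_loss = alpha*L_rank + beta*L_class.

    alpha and beta trade off the ranking task (S4) against the edge
    classification task (S3); 0.5/0.5 is an illustrative default.
    """
    return alpha * l_rank + beta * l_class

assert joint_loss(2.0, 4.0, alpha=0.25, beta=0.75) == 3.5
```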
S5: the schemes are sorted according to their preference scores to obtain the final selection result of the decision problem, where the top-ranked scheme is the optimal scheme for the problem.
All the alternatives are ranked according to the scheme preference scores obtained in step S4, giving the final selection result of the decision problem (S5). The top-ranked scheme is regarded as the optimal scheme for the problem, because it attains the highest preference score after all alternatives and evaluation criteria have been considered; this ranking process helps the decision maker make an accurate and reasonable decision.
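The final ranking step of S5 reduces to sorting by score; the scheme names and scores below are illustrative:

```python
def rank_schemes(scores):
    """Sort alternatives by descending preference score (step S5)."""
    return sorted(scores, key=scores.get, reverse=True)

scores = {"plan_a": 0.71, "plan_b": 0.93, "plan_c": 0.42}
ranking = rank_schemes(scores)
print(ranking[0])  # plan_b -- the top-ranked (optimal) scheme
```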
It should be noted that, in this embodiment, the ranking process of S5 lets the decision maker better understand the relative merits of the alternatives and use this information in decision making. Top-ranked schemes typically have higher preference scores, meaning they better satisfy the various evaluation criteria. A decision maker can therefore focus on the top-ranked schemes and study their characteristics and potential impact in greater depth. The ranking provides an intuitive reference, enabling decision makers to understand the merits of the alternatives more fully and to make decisions with greater confidence. The decision maker can weigh the factors according to the preference scores, combined with his or her own experience and knowledge, and select the scheme that best meets the requirements and goals. Furthermore, the ranking result of the S5 stage is not an absolute optimum, but a preferable choice obtained, given the current decision requirements, after considering all alternatives and evaluation criteria. The complexity and subjectivity of decision problems mean that there is no universal, absolute optimum, so the decision maker still needs to combine his or her own judgment with the actual situation when using the result of S5 to make the final decision.
In addition, the embodiment of the invention also provides a comprehensive decision system based on the preference learning of the graph neural network, which comprises computer equipment, wherein the computer equipment is used for executing the steps of the comprehensive decision method based on the preference learning of the graph neural network in the first aspect.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium stores a computer program to realize the comprehensive decision method based on the preference learning of the graph neural network in the first aspect.
In summary, the invention provides a comprehensive decision method for preference learning based on a graph neural network, which improves the comprehensive decision process based on parameterized preference learning by applying the graph neural network. The method can solve the problem in the prior art of generating a large amount of invalid preference information, improves ranking prediction performance, and ensures that the comprehensive decision result is interpretable, so that the obtained decision result is more credible and reliable and provides strong support for comprehensive decision making. The method has broad potential: it can play a role in a variety of decision scenarios and help decision makers make accurate and reasonable decisions.
Note that the technical features of the above embodiments may be combined arbitrarily; for brevity, not all possible combinations of these technical features are described, but as long as a combination of technical features contains no contradiction, it should be regarded as within the scope of this description. The foregoing examples represent only a few embodiments of the present application; their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the protection scope of the present application. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (10)

1. The comprehensive decision method for preference learning based on the graph neural network is characterized by comprising the following steps of:
s1: preprocessing data in a decision problem, wherein the decision problem consists of an alternative scheme and an evaluation criterion, and obtaining decision scheme information with characteristic representation through preprocessing;
s2: mapping the decision scheme information obtained in the step S1 into a graph structure, and constructing a preference relation directed graph according to preference relation among the decision schemes, wherein nodes on the directed graph represent schemes, and edges represent preference relation among the schemes;
s3: obtaining scheme preference characteristics on the preference relation directed graph by using a multi-layer perceptron model, and mining scheme preference data on the graph by using the expression capacity of the graph neural network, wherein the scheme preference data is a paired scheme;
s4: inputting the paired schemes obtained in the step S3 into a comparator neural network, outputting preference scores of each scheme, and correcting any inconsistent preferences;
s5: and sorting according to the preference scores of the schemes to obtain a final result of the decision problem, wherein the scheme with the forefront sorting is the optimal scheme of the decision problem.
2. The comprehensive decision method based on preference learning of a graph neural network according to claim 1, wherein preprocessing the data in the decision problem in step S1 includes: and removing repeated data, processing missing values and abnormal values, and performing data conversion and standardization.
3. The comprehensive decision method based on preference learning of a neural network according to claim 1, wherein the preference relation between schemes in the step S2 is determined by the size of preference supervision information of a scheme pair, and the characteristic value of the node is an evaluation criterion evaluation value of the alternative scheme.
4. The comprehensive decision method based on preference learning of a graph neural network according to claim 1, wherein the step S3 of obtaining the scheme preference feature on the preference relation directed graph by using a multi-layer perceptron model comprises the steps of:
s31: randomly sampling neighbors of nodes on the preference relation directed graph, wherein the number of the neighbors sampled by each hop is not more than the aggregation layer number;
s32: generating an embedded representation of a node on the graph, specifically, firstly aggregating the characteristics of two-hop neighbors to generate an embedded representation of a one-hop neighbor, and then aggregating the embedded representation of the one-hop to generate an embedded representation of the node;
s33: inputting the embedded representation of the node in the step S32 into a fully-connected network to obtain a characteristic value of the node, and inputting the characteristic value into a multi-layer perceptron;
s34: the characteristic values of adjacent nodes output by the multi-layer perceptron are subjected to nonlinear conversion to obtain the feature representations of the edges on the graph.
5. The comprehensive decision method based on preference learning of the graph neural network according to claim 2, wherein the step S4 includes the steps of:
s41: comparing the node data on the preference relation directed graph in pairs according to the category to which the node data belongs, and constructing scheme pairs;
s42: inputting the scheme pair into a three-layer unbiased comparator neural network, and outputting preference probabilities of two nodes in a node pair corresponding to the scheme pair;
s43: the two-node preference probabilities are converted to preference scores for each node using nonlinear transformation.
6. The comprehensive decision method based on preference learning of the graph neural network according to claim 3, wherein in the step S3, edge features on the preference relation graph are obtained by using the graph neural network, and node neighbors on the graph are sampled and aggregated mainly according to a neighbor aggregation mechanism to obtain node features, and an expression of neighbor aggregation is as follows:
h_v^{t+1} = σ(W·(h_v^t ‖ AGG_{t+1}({h_u^t, ∀u ∈ N_v})))

where σ represents the sigmoid function, ‖ represents feature concatenation, h_v^t is the feature representation of node v at time t, h_u^t is the feature representation of node v's neighbor u at time t, N_v is the neighbor set of node v, W represents the weight matrix, and AGG_{t+1} is an aggregation function that may contain a variety of operators, including an average operator.
7. The comprehensive decision method based on preference learning of the graph neural network according to claim 4, wherein in the step S4, the comparator neural network satisfies transitivity, antisymmetry, and reflexivity; with x and y drawn from the input domain, the comparator neural network p can be expressed as

p(x, y) = Σ_i w_i·f(h_i^{(x,y)} − h_i^{(y,x)})

where w_i is the output weight of the i-th hidden-layer neuron, f is an antisymmetric function, h_i^{(x,y)} and h_i^{(y,x)} are the outputs of the i-th hidden-layer neuron for the inputs (x, y) and (y, x) respectively, and b is the bias of the hidden-layer neurons;

in view of the antisymmetric nature of f,

p(y, x) = Σ_i w_i·f(h_i^{(y,x)} − h_i^{(x,y)}) = −p(x, y),

i.e. p(x, y) = −p(y, x), so antisymmetry holds; similarly, reflexivity, i.e. p(x, x) = 0, is also satisfied. Since f keeps the sign of its input consistent, i.e. sign(f(z)) = sign(z), it follows that if p(x, y) > 0 and p(y, z) > 0, then p(x, z) = f(x − y + y − z) > 0, i.e. the comparator neural network p satisfies transitivity.
8. The comprehensive decision method based on preference learning of the graph neural network according to claim 7, wherein in the step S4, the comparator neural network is based on a probabilistic ranking framework: it outputs the preference score of the scheme pair (x, y) and predicts the probability that scheme x is better than scheme y, and the probability loss C_xy is likewise computed with a cross-entropy loss function:

C_xy = −P̄_xy·log(P_xy) − (1 − P̄_xy)·log(1 − P_xy)

where P̄_xy is the desired probability that scheme x is better than scheme y, P_xy = 1/(1 + exp(−o_xy)) is the predicted probability, and o_xy = f(x) − f(y). When P̄_xy = 1, C_xy = −log(P_xy) = log(1 + exp(−o_xy)); that is, when the desired result is that x ranks better than y, the larger o_xy is, the smaller the loss. When P̄_xy = 0, C_xy = −log(1 − P_xy) = log(1 + exp(o_xy)); that is, when the desired result is that y ranks better than x, the smaller o_xy is, the smaller the loss. When P̄_xy = 1/2, C_xy = o_xy/2 + log(1 + exp(−o_xy)); that is, when no preference between x and y is specified, the loss is minimized at o_xy = 0. After the loss function is obtained, the parameters of the comparator neural network can be obtained through gradient descent.
9. A comprehensive decision system based on preference learning of a graph neural network, comprising computer equipment, characterized in that: the computer device is configured to perform the steps of the comprehensive decision method for preference learning based on a graph neural network as claimed in any one of claims 1 to 8.
10. A computer-readable storage medium, characterized by: the computer readable storage medium stores thereon a computer program for implementing the comprehensive decision method for preference learning based on a graph neural network according to any one of claims 1 to 8.
CN202311445543.XA 2023-11-02 2023-11-02 Comprehensive decision method, system and medium for preference learning based on graph neural network Pending CN117540247A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311445543.XA CN117540247A (en) 2023-11-02 2023-11-02 Comprehensive decision method, system and medium for preference learning based on graph neural network


Publications (1)

Publication Number Publication Date
CN117540247A true CN117540247A (en) 2024-02-09

Family

ID=89783233



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination