CN117669739B - Agent-based intelligent negotiation strategy optimization method and system - Google Patents

Agent-based intelligent negotiation strategy optimization method and system

Info

Publication number
CN117669739B
CN117669739B
Authority
CN
China
Prior art keywords
negotiating
subject
potential target
potential
negotiation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410108055.8A
Other languages
Chinese (zh)
Other versions
CN117669739A (en)
Inventor
Wu Jinghua (伍京华)
Zhang Ya (张亚)
Cao Ruiyang (曹瑞阳)
Sun Yi (孙怡)
Feng Cuiyang (冯翠洋)
Zhou Guangjuan (周广娟)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology Beijing CUMTB
Original Assignee
China University of Mining and Technology Beijing CUMTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology Beijing CUMTB filed Critical China University of Mining and Technology Beijing CUMTB
Priority to CN202410108055.8A priority Critical patent/CN117669739B/en
Publication of CN117669739A publication Critical patent/CN117669739A/en
Application granted granted Critical
Publication of CN117669739B publication Critical patent/CN117669739B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to the technical field of artificial intelligence, and provides an Agent-based intelligent negotiation strategy optimization method and system. The method comprises the following steps: calculating, based on a local similarity link prediction algorithm, the similarity between a negotiating subject and different negotiating objects in a social graph network model to determine a potential target object for negotiation; predicting the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to the node characteristics and structural balance characteristics of the potential target object; and, based on a reinforcement learning algorithm, automatically matching and updating the negotiation strategy according to the potential relationship type between the negotiating subject and the potential target object. The intelligent processing capability of the negotiating subject in the automatic negotiation process is improved, the problems of complexity and uncertainty in automatic negotiation are effectively addressed, the intelligence and adaptivity of the negotiation process are promoted, and new possibilities are provided for realizing intelligent and efficient resource allocation.

Description

Agent-based intelligent negotiation strategy optimization method and system
Technical Field
The application relates to the technical field of artificial intelligence, in particular to an Agent-based intelligent negotiation strategy optimization method and system.
Background
Current automated negotiation mainly addresses the fact that manual negotiation is constrained by location and time and is limited to simple decision-making problems. With the rapid development of artificial intelligence, automatic negotiation techniques are required to become more intelligent and autonomous in order to improve efficiency in production and daily life. In actual business scenarios, the demand for auto-negotiation functionality is growing, and especially in scenarios involving large-scale resource allocation and negotiation, auto-negotiation techniques can significantly improve efficiency and fairness.
In complex negotiation processes, an automatic negotiation system must also be able to handle factors such as information asymmetry and relationship background. Analyzing and classifying negotiating opponents before formal negotiation, and adopting different negotiation strategies and skills for different types of opponents, has therefore become a necessary and critical task.
Thus, there is a need to provide a solution to the above-mentioned deficiencies of the prior art.
Disclosure of Invention
The application aims to provide an Agent-based intelligent negotiation strategy optimization method and system, which are used for solving or relieving the problems in the prior art.
In order to achieve the above object, the present application provides the following technical solutions:
The application provides an Agent-based intelligent negotiation strategy optimization method, which comprises the following steps: step S101, calculating the similarity between a negotiating subject and different negotiating objects in a pre-constructed social graph network model based on a local similarity link prediction algorithm, so as to determine a potential target object for negotiation; step S102, predicting the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to the node characteristics and structural balance characteristics of the potential target object, wherein the node characteristics represent the local aggregation relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model, and the structural balance characteristics characterize the structural relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model; step S103, based on a reinforcement learning algorithm, automatically matching a negotiation strategy and updating the negotiation strategy according to the potential relationship type between the negotiating subject and the potential target object.
Preferably, step S101 includes: determining the network nodes corresponding to all possible target objects for auto-negotiation in the social graph network model; calculating the Adamic-Adar similarity between the network node corresponding to each possible negotiating object and the network node corresponding to the negotiating subject in the social graph network model based on a local similarity link prediction algorithm; and sorting the Adamic-Adar similarities between the network nodes corresponding to all possible negotiating objects and the network node corresponding to the negotiating subject, and determining the negotiating object corresponding to the maximum Adamic-Adar similarity as the potential target object.
Preferably, the following formula is adopted:

$$S_{AA}(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k(z)}$$

for calculating the Adamic-Adar similarity S_AA(x, y) between the network node corresponding to the negotiating object x and the network node corresponding to the negotiating subject y; wherein Γ(x) and Γ(y) respectively represent the sets of network nodes directly connected to the negotiating object x and to the negotiating subject y in the social graph network model; z is a common neighbor node of the negotiating object x and the negotiating subject y in the social graph network model; and k(z) characterizes the degree of node z.
Preferably, the following formula is adopted:

$$P = \sigma(w^{T}x + b) = \frac{1}{1 + e^{-(w^{T}x + b)}}$$

for determining the potential relationship type between the negotiating subject and the potential target object; wherein x is the feature vector containing the node characteristics and structural balance characteristics of the potential target object; w is the feature weight vector obtained based on the maximum likelihood method in the logistic regression prediction algorithm; T represents the matrix transpose; b is the bias term in the logistic regression prediction algorithm obtained based on the maximum likelihood method; and P represents the predicted probability of the potential relationship type between the negotiating subject and the potential target object.
Preferably, in step S103, based on the Q-Learning algorithm, a discount factor of the Q-Learning algorithm is determined according to a type of a potential relationship between the negotiation subject and the potential target object, and a state-action value in the Q-Learning algorithm is updated.
Preferably, the following formula is adopted:

$$Q(s_t, p_t) \leftarrow Q(s_t, p_t) + \alpha \left[ r(t+1) + \gamma \max_{p_{t+1}} Q(s_{t+1}, p_{t+1}) - Q(s_t, p_t) \right]$$

for updating the state-action value in the Q-Learning algorithm;

wherein s_t and s_{t+1} are respectively the states of the negotiating subject at the current time t and the next time t+1; p_t and p_{t+1} are respectively the optimal actions selectable by the negotiating subject in the current state s_t and in the new state s_{t+1} at the next time t+1; α is the learning rate of the Q-Learning algorithm; γ is the discount factor; r(t+1) is the instant reward obtained after the negotiating subject performs the optimal action p_t; Q(s_t, p_t) is the state-action value when the negotiating subject takes the corresponding action p_t in state s_t; and Q(s_{t+1}, p_{t+1}) is the state-action value when the negotiating subject takes the corresponding action p_{t+1} in the new state s_{t+1}.
The embodiment of the application also provides an Agent-based intelligent negotiation strategy optimization system, which comprises: an opponent screening module configured to calculate the similarity between a negotiating subject and different negotiating objects in a pre-constructed social graph network model based on a local similarity link prediction algorithm, so as to determine a potential target object for negotiation; an opponent classification module configured to predict the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to the node characteristics and structural balance characteristics of the potential target object, the node characteristics representing the local aggregation relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model, and the structural balance characteristics characterizing the structural relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model; and a policy negotiation module configured to automatically match the negotiation strategy according to the potential relationship type between the negotiating subject and the potential target object based on a reinforcement learning algorithm, and to update the negotiation strategy.
The beneficial effects are that:
According to the Agent-based intelligent negotiation strategy optimization method provided by the embodiment of the application, first, the similarity between a negotiating subject and different negotiating objects in a pre-constructed social graph network model is calculated based on a local similarity link prediction algorithm so as to determine the potential target object for negotiation, so that a suitable negotiating opponent is selected in a targeted manner and a foundation is laid for the negotiation process; then, the potential relationship type between the negotiating subject and the potential target object is predicted based on a logistic regression prediction method according to the node characteristics and structural balance characteristics of the potential target object, so that the negotiating subject can better understand and predict the behavior pattern of the negotiating opponent and adjust the negotiation strategy in a targeted manner; finally, according to the potential relationship type between the negotiating subject and the potential target object, the negotiation strategy is automatically matched and updated based on a reinforcement learning algorithm, which influences the negotiation outcome and helps the negotiating subject cope more flexibly with various types of negotiating opponents.
Therefore, through the screening of negotiating opponents and the optimization of negotiation strategies in the automatic negotiation process, the intelligent processing capability of the negotiating subject is improved, negotiation efficiency and effect are further improved, the problems of complexity and uncertainty in automatic negotiation are effectively addressed, the intelligence and adaptivity of the negotiation process are promoted, and new possibilities are provided for realizing intelligent and efficient resource allocation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. Wherein:
FIG. 1 is a flow chart of an Agent-based intelligent negotiation strategy optimization method according to some embodiments of the present application;
FIG. 2 is a schematic illustration of 16 structural balance features provided according to some embodiments of the present application;
fig. 3 is a schematic structural diagram of an Agent-based intelligent negotiation policy optimization system according to some embodiments of the present application.
Detailed Description
The application will be described in detail below with reference to the drawings in connection with embodiments. The examples are provided by way of explanation of the application and not limitation of the application. Indeed, it will be apparent to those skilled in the art that modifications and variations can be made in the present application without departing from the scope or spirit of the application. For example, features illustrated or described as part of one embodiment can be used on another embodiment to yield still a further embodiment. Accordingly, it is intended that the present application encompass such modifications and variations as fall within the scope of the appended claims and their equivalents.
Agent-based auto-negotiation refers to a group of Agents with conflicting interests and potential for collaboration attempting to reach a mutually acceptable agreement in a complex environment, thereby enabling efficient allocation of resources. This form of negotiation allows negotiating Agents to conduct negotiation interactions in place of humans, and its application can provide new solutions for business negotiations, political negotiations, legal disputes and the like.
In the prior art, automatic negotiation based on simple satisfaction calculation or weighted scoring cannot adaptively adjust strategies as the negotiation progresses, and has difficulty adapting to and coping with change, so intelligent decision-making and optimization cannot be realized. On the one hand, the preferences of the participants may in practice be complex and variable and cannot simply be measured by weights and scores, so simple satisfaction calculation or weighted scoring may fail to accurately capture the real preferences and needs of the parties. On the other hand, these approaches are limited to known parameters and have difficulty coping accurately with information asymmetry; for example, the range of each decision variable, or each party's preference over the issues, needs to be known in advance, but in practice these parameters are difficult to determine.
Improving the degree of automation and intelligence of the negotiating Agent has therefore become a necessary and critical task for an automatic negotiation system: the Agent should be able to handle factors such as information asymmetry and relationship background in complex negotiation processes, analyze and classify negotiating opponents before formal negotiation, predict the opponents' behaviors and demands, adopt different negotiation strategies and skills for different types of opponents, and autonomously adjust the negotiation strategy during the negotiation.
On this basis, the application provides an Agent-based intelligent negotiation strategy optimization method. By analyzing and classifying negotiating opponents, the relationship background of the negotiating parties can be understood, so that more targeted strategies and skills are available for the negotiation, effectively avoiding the situation in the prior art where simple satisfaction calculation or weight-scoring-based automatic negotiation is limited to known parameters and has difficulty handling information asymmetry accurately. In addition, reinforcement learning is used to continuously and autonomously learn and perceive changes during the negotiation and to autonomously adjust the negotiation strategy for different opponent types, so that more accurate decisions and adjustments can be made during the negotiation, the level of negotiation intelligence is improved, the dynamic nature of negotiation is better matched, and the influence of various factors during the negotiation is fully considered while the strategy is dynamically adjusted. This provides a basis and support for optimizing opponent screening, classification and negotiation strategies in the automatic negotiation process, thereby improving the intelligent processing capability of automatic negotiation, improving negotiation efficiency and effect, addressing the problems of complexity and uncertainty in automatic negotiation, and promoting the intelligence and adaptivity of the negotiation process.
As shown in fig. 1, the Agent-based intelligent negotiation strategy optimization method includes:
Step S101, calculating the similarity between the negotiating subject and different negotiating objects in the pre-constructed social graph network model based on the local similarity link prediction algorithm to determine the negotiating potential target object.
In the application, a suitable negotiating opponent is selected in a targeted manner through the social graph network model of the Agent system, laying a foundation for the subsequent negotiation process. Specifically, in a multi-Agent system, according to the six degrees of separation theory, every user in the Agent system can establish a connection with any other user through a chain of acquaintances. A social graph network model is therefore constructed by modeling the social relationship network of the negotiating Agents, to characterize the connections between the negotiating subject and the different negotiating Agents. Link prediction in the social network is then used to mine potential target objects and to screen negotiating opponents with cooperation potential.
The interaction histories and connection patterns between negotiating Agents in the social graph network model reveal the similarity and the strength of connection between different negotiating Agents. First, the search space for auto-negotiation is determined, i.e. the network nodes corresponding to all possible target objects for auto-negotiation in the social graph network model. Specifically, a negotiating subject is selected and, based on the six degrees of separation theory, the negotiating Agents directly connected to the negotiating subject are established as its first-degree relationships; the first-degree relationships are then expanded to establish the second-degree relationships connected to those Agents, and so on, cycling until a preset requirement is met. The social network established in this way is the search space for auto-negotiation, as sketched in the example below.
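A minimal sketch of this expansion step is given below, assuming the social graph network model is available as a plain adjacency mapping; the function name, the degree limit and the node budget are illustrative assumptions rather than details given in the application.

```python
from collections import deque

def build_search_space(graph, subject, max_degree=6, max_nodes=None):
    """Breadth-first expansion of the negotiating subject's relationships:
    collect every agent reachable within max_degree relationship hops,
    optionally stopping once max_nodes candidates have been collected.

    graph maps each node to an iterable of directly connected nodes.
    Returns a dict: candidate negotiating object -> degree of relationship.
    """
    degree_of = {subject: 0}
    queue = deque([subject])
    while queue:
        node = queue.popleft()
        if degree_of[node] >= max_degree:
            continue
        for neighbor in graph.get(node, ()):
            if neighbor not in degree_of:
                degree_of[neighbor] = degree_of[node] + 1
                queue.append(neighbor)
                if max_nodes is not None and len(degree_of) - 1 >= max_nodes:
                    queue.clear()
                    break
    degree_of.pop(subject, None)   # the subject itself is not a candidate
    return degree_of
```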
Then, based on a local similarity link prediction algorithm, the Adamic-Adar similarity between the network node corresponding to each possible negotiating object and the network node corresponding to the negotiating subject in the social graph network model is calculated. Specifically, the following formula is adopted:

$$S_{AA}(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k(z)}$$

which determines the Adamic-Adar similarity S_AA(x, y) between the network node corresponding to the negotiating object x and the network node corresponding to the negotiating subject y. Here, Γ(x) and Γ(y) respectively represent the sets of network nodes directly connected to the negotiating object x and to the negotiating subject y in the social graph network model; z is a common neighbor node of the negotiating object x and the negotiating subject y in the social graph network model; and k(z) characterizes the degree of node z, i.e. the number of edges adjacent to node z in the social graph network model.
Finally, negotiating Agents with cooperation potential are screened out based on the ranking of the predicted connection strengths between the negotiating Agents. That is, the Adamic-Adar similarities between the network nodes corresponding to all possible negotiating objects and the network node corresponding to the negotiating subject are sorted, and the negotiating object corresponding to the largest Adamic-Adar similarity is determined as the potential target object.
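A minimal sketch of this screening step is given below, again assuming the social graph is a plain adjacency mapping; the function names are illustrative. It scores every candidate negotiating object against the negotiating subject with the Adamic-Adar index and returns the highest-scoring candidate as the potential target object.

```python
import math

def adamic_adar(graph, x, y):
    """Adamic-Adar similarity: sum over common neighbors z of x and y of
    1 / log(k(z)), where k(z) is the degree of node z."""
    common = set(graph.get(x, ())) & set(graph.get(y, ()))
    score = 0.0
    for z in common:
        k_z = len(graph.get(z, ()))
        if k_z > 1:                       # skip degree-1 neighbors (log 1 = 0)
            score += 1.0 / math.log(k_z)
    return score

def select_potential_target(graph, subject, candidates):
    """Rank the candidate negotiating objects by Adamic-Adar similarity to
    the negotiating subject and return the best candidate with its score."""
    scored = {c: adamic_adar(graph, subject, c) for c in candidates}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

The candidates passed in would typically be the nodes collected by the search-space expansion sketched above.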
Step S102, predicting the potential relation type between the negotiation subject and the potential target object based on the logistic regression prediction method according to the node characteristics and the structure balance characteristics of the potential target object.
In the application, the relevant characteristic attributes of the negotiating Agents are constructed using feature engineering, and a regression prediction model is used to predict the potential relationship type between the negotiating subject and the negotiating opponent, so that the negotiating subject can better understand and predict the negotiating opponent's behavior pattern and adjust the negotiation strategy in a targeted manner. In other words, the negotiating Agents are classified using feature engineering, the relevant characteristic attributes are selected, and the regression prediction model predicts the potential relationship type between the negotiating subject and the negotiating opponent from the node characteristics and structural balance characteristics of the negotiating opponent.
The node characteristics represent the local aggregation relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model. In the social graph network model, the in-degree of a node is the number of edges pointing to the node, and the out-degree of a node is the number of edges starting from the node. As shown in Table 1, the positive in-degree (positive out-degree) is the number of positive relationships pointing to (starting from) the node; the negative in-degree (negative out-degree) is the number of negative relationships pointing to (starting from) the node; and the total in-degree (total out-degree) is the number of all relationships (positive and negative) pointing to (starting from) the node. Moreover, in the social graph network model, the historical data on the relationship types (positive or negative) that a node has sent or received reflects the node's preference and tendency with respect to relationship type, while the node's total in-degree and out-degree reflect its degree of connection (i.e. connection strength) in the entire social graph network model. Table 1 is as follows:
Table 1 Node characteristic representation and meaning
Positive in-degree: the number of positive relationships pointing to the node
Negative in-degree: the number of negative relationships pointing to the node
Total in-degree: the number of all relationships (positive and negative) pointing to the node
Positive out-degree: the number of positive relationships starting from the node
Negative out-degree: the number of negative relationships starting from the node
Total out-degree: the number of all relationships (positive and negative) starting from the node
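As a minimal sketch of how the six node characteristics above could be extracted from a signed, directed edge list, the function below counts positive, negative and total in-degree and out-degree; the edge representation as (source, target, sign) tuples and the function name are assumptions for illustration.

```python
def node_features(edges, node):
    """Extract the six node characteristics of `node` from a signed, directed
    edge list of (source, target, sign) tuples, where sign is +1 or -1."""
    pos_in = neg_in = pos_out = neg_out = 0
    for src, dst, sign in edges:
        if dst == node:
            if sign > 0:
                pos_in += 1
            else:
                neg_in += 1
        if src == node:
            if sign > 0:
                pos_out += 1
            else:
                neg_out += 1
    return [pos_in, neg_in, pos_in + neg_in,      # positive / negative / total in-degree
            pos_out, neg_out, pos_out + neg_out]  # positive / negative / total out-degree
```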
The structural balance features characterize the structural relationships between the network node corresponding to the potential target object and the other network nodes in the social graph network model. In the application, taking a source node and a target node as an example, the intersections of edges of different types and different directions between the nodes are considered, and 16 structural balance characteristics are defined according to the edge direction and edge type information of the nodes, as shown in FIG. 2; the representation and meaning of each structural balance characteristic are shown in Table 2. Table 2 is as follows:
Table 2 representation and meaning of structural balance characteristics
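FIG. 2 and Table 2 define the 16 features individually. As an illustrative assumption consistent with the description above (a common construction for signed, directed graphs), the sketch below counts, for each common neighbor of the source and target nodes, the combination of edge direction and edge sign on the two connecting edges, which yields 2 x 2 x 2 x 2 = 16 counts; the node symbols u, v, w, the edge-list representation and the function name are hypothetical.

```python
from collections import Counter
from itertools import product

def structural_balance_features(edges, u, v):
    """Count, for each common neighbor w of source node u and target node v,
    the (direction, sign) combination of the u-w edge and the w-v edge;
    2 directions x 2 signs per edge gives 16 possible combinations."""
    sign_of = {(src, dst): sign for src, dst, sign in edges}

    def neighbors(node):
        return ({dst for src, dst, _ in edges if src == node}
                | {src for src, dst, _ in edges if dst == node})

    counts = Counter()
    for w in neighbors(u) & neighbors(v):
        # Consider both possible directions of the edge between u and w,
        # and between w and v, keeping only edges that actually exist.
        for uw, wv in product([(u, w), (w, u)], [(w, v), (v, w)]):
            if uw in sign_of and wv in sign_of:
                key = (uw[0] == u, sign_of[uw], wv[0] == w, sign_of[wv])
                counts[key] += 1

    ordering = list(product([True, False], [1, -1], [True, False], [1, -1]))
    return [counts[key] for key in ordering]
```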
In the present application, there are two types of potential relationship between the negotiating subject and the potential target object: a positive relationship and a negative relationship. The positive relationship characterizes positive, friendly, cooperative, supportive and similar relations between two nodes; the negative relationship characterizes negative, hostile, competitive, conflicting and similar relations between two nodes. The method is mainly based on a logistic regression prediction method, and predicts the potential relationship type between the negotiating subject and the potential target object (the negotiating opponent) from the known node characteristics and structural balance characteristics of the potential target object.
Specifically, the node features and structural balance features in the social graph network model are extracted to obtain a feature vector x (comprising the 6 node features and the 16 structural balance features), i.e. the node features and structural features together form a feature vector x = (x_1, x_2, ..., x_22), where each element of the feature vector x represents one feature. In the logistic regression model, every sample in the training set has such a feature vector x, where x_j is the value of the j-th feature.
During training of the logistic regression model, the feature weight vector w is obtained by maximum likelihood estimation (MLE), i.e. the parameter values w that maximize the probability of the observed data are selected as the estimate of the logistic regression model. Once the feature weight vector w has been obtained, it is used to predict new samples: the feature vector x is substituted into the logistic regression model, the probability that the sample belongs to a positive relationship is calculated, and a classification decision is then made according to a set threshold.
Specifically, the following formula is adopted:

$$P = \sigma(w^{T}x + b) = \frac{1}{1 + e^{-(w^{T}x + b)}}$$

and the type of potential relationship between the negotiating subject and the potential target object is determined. Here, x is the feature vector containing the node features and structural balance features of the potential target object; w is the feature weight vector obtained by the maximum likelihood method; T denotes the matrix transpose; b is the bias term obtained by the maximum likelihood method; and P is the predicted probability of the potential relationship type between the negotiating subject and the potential target object. The feature vector x comprises the 6 node features and the 16 structural balance features, i.e. x = (x_1, x_2, ..., x_22); each feature corresponds to an element of the feature weight vector, i.e. w = (w_1, w_2, ..., w_22).
The feature weight vector w and the bias term b are obtained by the maximum likelihood method during training of the logistic regression model. In the actual training process, the feature weight vector w and the bias term b are adjusted according to the features and classification labels of the training data, so that the model fits the training data well and can predict the outcome for new samples. That is, the feature vectors x of all data in the training set are obtained through feature engineering, and the training set is then used to train the logistic regression model to obtain the feature weight vector w and the bias term b.
When determining the type of potential relationship between the negotiating subject and the potential target object, the resulting predicted probability P is compared with a preset relationship threshold: when the predicted probability P is greater than or equal to the preset relationship threshold, the relationship between the negotiating subject and the potential target object is positive; when the predicted probability P is smaller than the preset relationship threshold, the relationship between the negotiating subject and the potential target object is negative.
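As an illustration of this classification step, the sketch below assumes scikit-learn's LogisticRegression (whose fitting procedure is a regularized form of maximum likelihood estimation) applied to 22-dimensional feature vectors labelled 1 for a positive and 0 for a negative relationship; the function names and the example threshold of 0.5 are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_relation_classifier(X_train, y_train):
    """Fit a logistic regression classifier on 22-dimensional feature vectors
    (6 node characteristics + 16 structural balance characteristics).
    Note: scikit-learn applies L2 regularization by default, a regularized
    variant of plain maximum-likelihood estimation."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)          # learns the weight vector w and bias b
    return model

def predict_relation_type(model, features, threshold=0.5):
    """Compare the predicted probability of a positive relationship against
    a preset relationship threshold and return the relationship type."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    p = model.predict_proba(x)[0, 1]     # probability of the positive class
    return ("positive" if p >= threshold else "negative"), p
```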
Step S103, based on reinforcement learning algorithm, according to the potential relation type between the negotiation subject and the potential target object, the negotiation strategy is automatically matched and updated.
In the application, the potential relationship type between the negotiating subject and the potential target object is predicted based on the logistic regression method, and negotiation strategies are automatically matched to negotiating opponents of different relationship types. Specifically, based on the Q-Learning algorithm, the discount factor of the Q-Learning algorithm is determined according to the type of potential relationship between the negotiating subject and the potential target object: when facing a negotiating opponent with a positive relationship, the negotiating subject tends to adopt a higher discount factor, so that it tends to choose long-term cooperation and common interests in order to reach an agreement more readily; when facing a negotiating opponent with a negative relationship, it tends to adopt a lower discount factor, so that it pays more attention to immediate interests and can cope with the short-term strategies the opponent may adopt, thereby effectively protecting its own interests.
First, the state space S and the action space P of the negotiating subject are defined. The state space S comprises: the historical actions of the negotiation, the behavioral characteristics of the negotiating opponent, and the current resource allocation. The action space P is the set of actions the negotiating subject can take; the specific actions include: proposing a different agreement scheme, modifying an existing proposal, and rejecting the opponent's proposal. The action space P covers the various negotiating actions that may occur during the negotiation, to ensure that the negotiating subject has sufficient flexibility to find the best negotiation strategy. The negotiating subject's current state in the state space S can influence its psychologically expected price, while the action space P affects whether the counterpart's (negotiating opponent's) offer is accepted.
Then, a reward function r(t) is used to evaluate the merit of each negotiation action performed by the negotiating subject, where the reward function is calculated based on the current state of the negotiating subject and the negotiation action taken, t denotes the negotiation time, i.e. the t-th negotiation round, and the reward function reflects the negotiating subject's psychologically expected price in the t-th round. In each negotiation round, the negotiating subject chooses a negotiation action based on its current state and, according to the outcome of the execution (i.e. the feedback of the negotiating opponent, for example a different agreement scheme is proposed, an existing proposal is modified, or the opponent's proposal is rejected) and the feedback of the reward function (i.e. the instant reward r(t+1) obtained by the negotiating subject in that round), updates the Q value (i.e. the state-action value) in the Q-Learning algorithm so as to optimize its policy selection.
In the Q-Learning algorithm, the Q value (i.e. the state-action value) represents the expected benefit of taking an action in a given state, and is used to measure how good different actions are in that state. Specifically, the Q value is a function of a state-action pair, typically denoted Q(s, p), where s represents the current state and p represents the action taken. The higher the Q value, the greater the expected benefit of that action in that state, and the better that action is considered as a choice. In the Q-Learning algorithm, the Q value is continuously updated and optimized through interaction with the environment, so that the Agent can learn an optimal action strategy.
Here, the expected benefit represented by the Q value and the instant reward r(t+1) are related but not identical. Specifically, the Q value represents the expected benefit of taking an action in a given state; it is a function or table that evaluates the value of each state-action combination, is updated based on the Agent's historical experience of interacting with the environment, and gradually converges to the optimal value function through iterative learning. The instant reward r(t+1), by contrast, is the reward obtained immediately after the Agent performs the action; it is the result of a specific time step and reflects the immediate effect of the Agent's action, where t+1 denotes the next time step.
In the Q-Learning algorithm, the instant reward r(t+1) is used as part of the Q value update. Specifically, according to the update rule of the Q-Learning algorithm, the Q value is updated from the current state, the instant reward obtained after the action is performed, the next state, and the optimal action in the next state. The formula is as follows:

$$Q(s_t, p_t) \leftarrow Q(s_t, p_t) + \alpha \left[ r(t+1) + \gamma \max_{p_{t+1}} Q(s_{t+1}, p_{t+1}) - Q(s_t, p_t) \right]$$

and the Q value in the Q-Learning algorithm is updated accordingly. Here, s_t and s_{t+1} are respectively the states of the negotiating subject at the current time t and the next time t+1; p_t and p_{t+1} are respectively the optimal actions selectable by the negotiating subject in the current state s_t and in the new state s_{t+1} at the next time t+1; α is the learning rate of the Q-Learning algorithm; r(t+1) is the instant reward obtained after the negotiating subject performs the optimal action p_t; Q(s_t, p_t) is the Q value when the negotiating subject takes the corresponding action p_t in state s_t; and Q(s_{t+1}, p_{t+1}) is the Q value when the negotiating subject takes the corresponding action p_{t+1} in the new state s_{t+1}. γ is the discount factor, 0 ≤ γ ≤ 1, which characterizes the importance of future rewards relative to the current state in the Q-Learning algorithm; the discount factor determines how much weight the negotiating subject gives to future rewards when making decisions. When the discount factor approaches 0, the negotiating subject is more concerned with the instant reward than with future rewards; when the discount factor approaches 1, the negotiating subject pays more attention to the long-term cumulative reward.
In the application, when the negotiating subject faces a friendly negotiating opponent, i.e. the potential relationship type between the negotiating subject and the potential target object is positive, the negotiating subject tends to adopt a higher discount factor and thus tends to choose long-term cooperation and common interests in order to reach an agreement more readily. When facing an unfriendly negotiating opponent, i.e. the potential relationship type is negative, the negotiating subject tends to adopt a lower discount factor, so that it pays more attention to immediate interests and can cope with the short-term strategies the negotiating opponent may adopt, thereby effectively protecting its own interests.
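To make the policy-matching step concrete, the sketch below shows a minimal tabular Q-Learning agent whose discount factor is chosen from the predicted relationship type. The specific values (gamma = 0.9 for a positive relationship, gamma = 0.3 for a negative one), the epsilon-greedy exploration, the learning rate and all names are illustrative assumptions rather than parameters given in the application.

```python
import random
from collections import defaultdict

class NegotiationQLearner:
    """Tabular Q-Learning for negotiation strategy selection, where the
    discount factor gamma depends on the predicted relationship type."""

    def __init__(self, actions, relation_type, alpha=0.1, epsilon=0.1):
        self.q = defaultdict(float)          # (state, action) -> Q value
        self.actions = actions               # e.g. propose / modify / reject
        self.alpha = alpha                   # learning rate
        self.epsilon = epsilon               # exploration rate
        # Higher gamma for positive (friendly) opponents -> long-term focus;
        # lower gamma for negative opponents -> focus on immediate rewards.
        self.gamma = 0.9 if relation_type == "positive" else 0.3

    def choose_action(self, state):
        """Epsilon-greedy action selection over the negotiation action space."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Q(s_t, p_t) += alpha * (r(t+1) + gamma * max_p Q(s_{t+1}, p) - Q(s_t, p_t))."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In use, one learner instance would be created for a negotiating opponent once its relationship type has been predicted in step S102, and update() would be called after every negotiation round with the instant reward returned by the reward function.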
By predicting the potential relationship type between negotiating opponents of different types and the negotiating subject, negotiation strategies can be matched more effectively, improving the intelligence and efficiency of negotiation. The Q-Learning algorithm is further used to adaptively adjust the negotiation strategy, giving the negotiating subject flexibility and intelligence and enabling it to adapt better to different types of negotiating opponents, thereby improving the negotiation success rate and effect. On the one hand, this helps improve the adaptability and flexibility of the intelligent system in the decision-making process; on the other hand, it effectively saves labor cost and improves the accuracy and efficiency of negotiation decisions.
As shown in fig. 3, an embodiment of the present application further provides an Agent-based intelligent negotiation policy optimization system, including: an opponent screening module 301, an opponent classification module 302, and a policy negotiation module 303.
Wherein the adversary screening module 301 is configured to calculate similarities between negotiating subjects and different negotiating objects in a pre-constructed social graph network model based on a local similarity link prediction algorithm to determine potential target objects for negotiations.
The opponent classification module 302 is configured to predict the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to the node characteristics and structural balance characteristics of the potential target object; the node characteristics represent the local aggregation relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model; the structural balance features characterize the structural relations between the network node corresponding to the potential target object and the other network nodes in the social graph network model.
The policy negotiation module 303 is configured to automatically match and update the negotiation policy according to the type of potential relationship between the negotiation subject and the potential target object based on the reinforcement learning algorithm.
The Agent-based intelligent negotiation strategy optimization system provided by the embodiment of the application can realize the steps and the flow of any Agent-based intelligent negotiation strategy optimization method embodiment, and achieve the same technical effects, and are not described in detail herein.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (5)

1. The intelligent negotiation strategy optimization method based on the Agent is characterized by comprising the following steps of:
step S101, calculating the similarity between a negotiating subject and different negotiating objects in a pre-constructed social graph network model based on a local similarity link prediction algorithm so as to determine negotiating potential target objects;
Step S102, predicting the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to node characteristics and structure balance characteristics of the potential target object; the node characteristics represent local aggregation relations between the network node corresponding to the potential target object and other network nodes in the social graph network model; the structural balance features characterize structural relations between the network node corresponding to the potential target object and other network nodes in the social graph network model;
Step S103, based on the Q-Learning algorithm, determining a discount factor of the Q-Learning algorithm according to the potential relationship type between the negotiating subject and the potential target object, and according to the formula:

$$Q(s_t, p_t) \leftarrow Q(s_t, p_t) + \alpha \left[ r(t+1) + \gamma \max_{p_{t+1}} Q(s_{t+1}, p_{t+1}) - Q(s_t, p_t) \right]$$

updating the state-action value in the Q-Learning algorithm to automatically match the negotiation strategy;

wherein s_t and s_{t+1} are respectively the states of the negotiating subject at the current time t and the next time t+1; p_t and p_{t+1} are respectively the optimal actions selectable by the negotiating subject in the current state s_t and in the new state s_{t+1} at the next time t+1; α is the learning rate of the Q-Learning algorithm; γ is the discount factor; r(t+1) is the instant reward obtained after the negotiating subject performs the optimal action p_t;

Q(s_t, p_t) is the state-action value when the negotiating subject takes the corresponding action p_t in state s_t; and Q(s_{t+1}, p_{t+1}) is the state-action value when the negotiating subject takes the corresponding action p_{t+1} in the new state s_{t+1}.
2. The Agent-based intelligent negotiation strategy optimization method according to claim 1, wherein step S101 comprises:
determining all possible network nodes corresponding to the target object which are automatically negotiated in the social graph network model;
Calculating the Adamic-Adar similarity between the network node corresponding to each possible negotiating object and the network node corresponding to the negotiating subject in the social graph network model based on a local similarity link prediction algorithm;

And sorting the Adamic-Adar similarities between the network nodes corresponding to all possible negotiating objects and the network node corresponding to the negotiating subject, and determining the negotiating object corresponding to the maximum Adamic-Adar similarity as the potential target object.
3. The Agent-based intelligent negotiation strategy optimization method according to claim 2, wherein the following formula is adopted:

$$S_{AA}(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k(z)}$$

for calculating the Adamic-Adar similarity S_AA(x, y) between the network node corresponding to the negotiating object x and the network node corresponding to the negotiating subject y;

wherein Γ(x) and Γ(y) respectively represent the sets of network nodes directly connected to the negotiating object x and to the negotiating subject y in the social graph network model; z is a common neighbor node of the negotiating object x and the negotiating subject y in the social graph network model; and k(z) characterizes the degree of node z.
4. The Agent-based intelligent negotiation strategy optimization method according to claim 1, wherein the following formula is adopted:

$$P = \sigma(w^{T}x + b) = \frac{1}{1 + e^{-(w^{T}x + b)}}$$

for determining the potential relationship type between the negotiating subject and the potential target object;

wherein x is the feature vector containing the node characteristics and structural balance characteristics of the potential target object; w is the feature weight vector obtained based on the maximum likelihood method in the logistic regression prediction algorithm; T represents the matrix transpose; b is the bias term in the logistic regression prediction algorithm obtained based on the maximum likelihood method; and P represents the predicted probability of the potential relationship type between the negotiating subject and the potential target object.
5. An Agent-based intelligent negotiation strategy optimization system, comprising:
The opponent screening module is configured to calculate the similarity between the negotiating subject and different negotiating objects in the pre-constructed social graph network model based on a local similarity link prediction algorithm, so as to determine the potential target object for negotiation;

The opponent classification module is configured to predict the potential relationship type between the negotiating subject and the potential target object based on a logistic regression prediction method according to node characteristics and structure balance characteristics of the potential target object; the node characteristics represent local aggregation relations between the network node corresponding to the potential target object and other network nodes in the social graph network model; the structural balance features characterize structural relations between the network node corresponding to the potential target object and other network nodes in the social graph network model;
The policy negotiation module is configured to determine, based on the Q-Learning algorithm, a discount factor of the Q-Learning algorithm according to the potential relationship type between the negotiating subject and the potential target object, and according to the formula:

$$Q(s_t, p_t) \leftarrow Q(s_t, p_t) + \alpha \left[ r(t+1) + \gamma \max_{p_{t+1}} Q(s_{t+1}, p_{t+1}) - Q(s_t, p_t) \right]$$

update the state-action value in the Q-Learning algorithm to automatically match the negotiation strategy;

wherein s_t and s_{t+1} are respectively the states of the negotiating subject at the current time t and the next time t+1; p_t and p_{t+1} are respectively the optimal actions selectable by the negotiating subject in the current state s_t and in the new state s_{t+1} at the next time t+1; α is the learning rate of the Q-Learning algorithm; γ is the discount factor; r(t+1) is the instant reward obtained after the negotiating subject performs the optimal action p_t;

Q(s_t, p_t) is the state-action value when the negotiating subject takes the corresponding action p_t in state s_t; and Q(s_{t+1}, p_{t+1}) is the state-action value when the negotiating subject takes the corresponding action p_{t+1} in the new state s_{t+1}.
CN202410108055.8A 2024-01-26 2024-01-26 Agent-based intelligent negotiation strategy optimization method and system Active CN117669739B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410108055.8A CN117669739B (en) 2024-01-26 2024-01-26 Agent-based intelligent negotiation strategy optimization method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410108055.8A CN117669739B (en) 2024-01-26 2024-01-26 Agent-based intelligent negotiation strategy optimization method and system

Publications (2)

Publication Number Publication Date
CN117669739A CN117669739A (en) 2024-03-08
CN117669739B (en) 2024-05-24

Family

ID=90084741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410108055.8A Active CN117669739B (en) 2024-01-26 2024-01-26 Agent-based intelligent negotiation strategy optimization method and system

Country Status (1)

Country Link
CN (1) CN117669739B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408090A (en) * 2021-05-31 2021-09-17 上海师范大学 Node relation obtaining method based on symbolic network and storage medium
CN115169637A (en) * 2022-05-26 2022-10-11 中国工商银行股份有限公司 Social relationship prediction method, device, equipment and medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408090A (en) * 2021-05-31 2021-09-17 上海师范大学 Node relation obtaining method based on symbolic network and storage medium
CN115169637A (en) * 2022-05-26 2022-10-11 中国工商银行股份有限公司 Social relationship prediction method, device, equipment and medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Ayan Sengupta et al. An Autonomous Negotiating Agent Framework with Reinforcement Learning based Strategies and Adaptive Strategy Switching Mechanism. ACM: AAMAS '21, 2021, pp. 1163-1172. *
Sai Munikoti et al. Challenges and Opportunities in Deep Reinforcement Learning With Graph Neural Networks: A Comprehensive Review of Algorithms and Applications. IEEE Transactions on Neural Networks and Learning Systems (Early Access), 2023-06-26, pp. 1-21. *
Zhiheng Xi et al. The Rise and Potential of Large Language Model Based Agents: A Survey. arXiv, 2023-09-19, pp. 1-86. *

Also Published As

Publication number Publication date
CN117669739A (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN110222848A (en) The determination method and device for the integrated model that computer executes
Mao et al. Learning multi-agent communication under limited-bandwidth restriction for internet packet routing
Sequeira et al. Interestingness Elements for Explainable Reinforcement Learning through Introspection.
CN116363452B (en) Task model training method and device
Amini et al. A BOA-based adaptive strategy with multi-party perspective for automated multilateral negotiations
Sallam et al. IMODEII: an Improved IMODE algorithm based on the Reinforcement Learning
Yousefli A fuzzy ant colony approach to fully fuzzy resource constrained project scheduling problem
Li et al. Computation offloading for tasks with bound constraints in multi-access edge computing
CN117669739B (en) Agent-based intelligent negotiation strategy optimization method and system
CN111292062B (en) Network embedding-based crowd-sourced garbage worker detection method, system and storage medium
CN116738923B (en) Chip layout optimization method based on reinforcement learning with constraint
Zineb et al. Cognitive radio networks management using an ANFIS approach with QoS/QoE mapping scheme
Verma et al. Making smart homes smarter: optimizing energy consumption with human in the loop
Zhao et al. Building Innovative Service Composition Based on Two‐Way Selection in Cloud Manufacturing Environment
CN114298376A (en) Software project scheduling method based on heuristic discrete artificial bee colony algorithm
CN114819442A (en) Operational research optimization method and device and computing equipment
Masadeh et al. Selector-actor-critic and tuner-actor-critic algorithms for reinforcement learning
Rajavel et al. Cognitive Fuzzy-based Behavioral Learning System for Augmenting the Automated Multi-issue Negotiation in the E-commerce Applications
Hu et al. Evolving constrained reinforcement learning policy
Sedlak et al. Active Inference on the Edge: A Design Study
US11941500B2 (en) System for engagement of human agents for decision-making in a dynamically changing environment
Louati et al. A multilevel agent-based approach for trustworthy service selection in social networks
Philipp et al. On automating decentralized multi-step service combination
Toranzo et al. Intention reconsideration like uncertain dichotomous choice model
Zhou et al. Collaborative optimization of manufacturing service allocation via multi-task transfer learning evolutionary approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Wu Jinghua

Inventor after: Zhang Ya

Inventor after: Cao Ruiyang

Inventor after: Sun Yi

Inventor after: Feng Cuiyang

Inventor after: Zhou Guangjuan

Inventor before: Wu Jinghua

Inventor before: Zhang Ya

Inventor before: Cao Ruiyang

Inventor before: Sun Yi

Inventor before: Feng Cuiyang

Inventor before: Zhou Guangjuan

GR01 Patent grant