CN104166750B

CN104166750B - RoboCup rescue cooperation method based on weighted synergy graphs

Info

Publication number: CN104166750B
Application number: CN201410274653.9A
Authority: CN
Inventors: 高翔; 梁志伟; 汪伟亚
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2014-06-18
Filing date: 2014-06-18
Publication date: 2017-07-11
Anticipated expiration: 2034-06-18
Also published as: CN104166750A

Abstract

The invention provides a RoboCup rescue cooperation method based on weighted synergy graphs. A role-based weighted synergy graph is defined, and the capability of each individual agent and the synergy among multiple agents are modeled; an agent cooperation team is established and the interactions among agents are effectively modeled; and the optimal role assignment for executing a task is estimated with a weighted synergy graph learning algorithm. The learning algorithm learns a weighted synergy graph from examples of role-task assignments and from observations obtained in routine simulations and competitions, and yields a near-optimal role-task assignment policy. The method is applied to the RoboCup rescue competition platform, and a large number of simulation experiments show that the weighted synergy graph approach can form a near-optimal agent cooperation team.

Description

RoboCup rescue cooperation method based on weighted synergy graphs

Technical field

The present invention relates to a RoboCup rescue cooperation method based on weighted synergy graphs.

Background technology

In many dynamic situations, because of the requirements of the task and the constraints of the environment, heterogeneous agents with different functions and responsibilities are brought together to cooperate on complex tasks. As a new field of RoboCup, the RoboCup Rescue simulation competition provides a series of standard tasks to promote research and development in distributed artificial intelligence, intelligent robotics and related fields.

The main purpose of the RoboCup Rescue Simulation System (RCRSS) is to build a multi-agent search, rescue and cooperation system. When a disaster occurs, disaster information is integrated on the simulation platform, and methods such as fire-spread prediction and action planning provide decision support for the disaster scenario. RCRSS has six classes of controllable agents: the hospital (Ambulance Centre, AC), the medical team (Ambulance Team, AT), the police office (Police Office, PO), the police (Police Force, PF), the fire department (Fire Station, FS) and the fire brigade (Fire Brigade, FB). The system also contains civilian agents (Civilian), which are the objects to be rescued. The multi-agent simulation environment is shown in Fig. 1.

In the current competition field, the following methods exist for cooperative rescue:

Task auction methods, in which agents bid for a task and the most effective bidder obtains it;

The ASyMTRe algorithm, which models agent functions as schemas and defines a task as a set of output schemas; multi-agent teams that can complete the task are formed, and the feasible teams are ranked according to a utility function;

Multi-robot task allocation (MRTA) methods, which use the performance estimated after task assignment as a cooperation value to distribute multiple tasks among roles;

Auction-based MRTA techniques, which are also widely used; the capabilities of a robot are modeled as a set of service resources, and roles are then assigned;

A recently posed problem is how an agent with a certain degree of autonomy can cooperate with unknown teammates by learning; its role assignment depends on the learning results obtained after observing specific agents. Somchaya Liemhetcharat of Carnegie Mellon University proposed a learning method based on a graph model, in which the model is built from training data gathered during agent cooperation.

None of the above methods models the synergy of an agent team. In RCRSS simulation, what matters more is how to form an effective agent rescue team strategy in the dynamic environment of the RoboCup rescue simulation, rather than focusing only on the behavior and actions of a single agent.

In RCRSS, the problems that agents encounter are random, and at the initial moment both the capability of each agent and the cooperation among agents are unknown. To find a task assignment policy that completes each task and obtains the maximum value in a dynamic environment, two problems must be addressed: 1. how a single agent can adapt to an unknown teammate by learning; 2. how to form an efficient agent team in a dynamic environment. The interactions between agents and within the agent team are therefore observed in a synergy graph, and a weighted synergy graph covering cooperation and task assignment based on agent functions is obtained.

Summary of the invention

The object of the invention is to provide a RoboCup rescue cooperation method based on weighted synergy graphs. It addresses the following problem: in a dynamic rescue simulation environment, where the individual capabilities and the team cooperation ability of the robot agents are unknown at the initial moment, how to form an efficient multi-agent team quickly and thereby find the optimal role-task assignment policy for the agents.

The technical solution of the invention is as follows:

A RoboCup rescue cooperation method based on weighted synergy graphs, in which:

A role-based weighted synergy graph is defined, and the capability of each individual agent and the synergy among multiple agents are modeled;

An agent cooperation team is established, and the interactions among agents are modeled;

The optimal role assignment for executing a task is estimated with a weighted synergy graph learning algorithm.

Further, the capability of each agent to handle a single task at the initial moment of the rescue is modeled, and the effect of agents cooperating on a task is quantified as F(A); if F({PF, FB}) > F({PF, AT}), the combination (PF, FB) is chosen to complete the task.

Further, to model the synergy among agents, a compatibility function φ is introduced, defined as φ: ℝ⁺ → ℝ⁺, where φ is a monotonically decreasing function that takes a distance as its argument and outputs a compatibility. A shorter distance means a higher compatibility, and a higher compatibility means a better synergy.

Further, the weighted synergy graph model S is a tuple {G, C}, where:

G = {V, E} is a weighted connected graph;

Each a_i ∈ A is represented by a vertex v_i ∈ V, where a_i denotes an agent;

e = (v_i, v_j, w_{i,j}) ∈ E is an edge between vertices v_i and v_j with weight w_{i,j} ∈ Z⁺;

Each vertex has a self-loop;

C = {C_1, ..., C_N}, where C_i = {C_{i,1}, ..., C_{i,M}} and C_{i,α} is the capability of a_i in role r_α.

Further, using the weighted synergy graph model, the capabilities of the three agent types PF, FB and AT in the dynamically changing disaster model are represented by normal distributions;

A team synergy function is defined: for a role policy π: R → A, the synergy of the assigned agent team is the average of the pairwise synergies of the agents assigned to the roles;

The synergy function returns a normal distribution, which represents the performance of the agent team under the role assignment policy.

Further, the optimal rescue team is obtained with an approximation algorithm on the basis of the weighted synergy graph: following a partitioning idea, the number of agents required by a task is assumed to be fixed, and the number of agents required by each of its subtasks is also fixed. If the team size n is unknown, the approximation algorithm is run iteratively with increasing n, all suitable values of n are returned, and the best size is then selected.

Further, in the weighted synergy graph, the normal distributions that represent agent capabilities are converted into their means in order to select the optimal team.

Further, the optimal weighted synergy graph is obtained with the weighted synergy graph learning algorithm:

An initial weighted synergy graph is generated at random, and the agents' task-handling capabilities are estimated from the examples T;

The agent team executes rescue policies, and the optimal weighted synergy graph is obtained from the log-likelihood of the examples.

Further, the weighted synergy graph learning algorithm is specifically:

An initial weighted synergy graph is generated at random: the vertices are created, edges with random weights are added until the graph is connected, and each vertex is given a self-loop;

While the agents carry out fire fighting, the space of weighted synergy graphs is explored, and a neighbouring graph structure is generated in one of the following four ways:

If an edge weight is below the maximum weight, the weight of a random edge is increased by 1;

If an edge weight is above the minimum weight, the weight of a random edge is decreased by 1;

A new edge is added between two vertices;

An edge whose removal does not disconnect the graph is removed;

If the neighbouring weighted synergy graph gives better fire-fighting performance, it replaces the current weighted synergy graph; finally, the optimal weighted synergy graph is obtained.

The beneficial effects of the invention are as follows. A learning algorithm is proposed that learns a weighted synergy graph from examples of role-task assignments and from observations obtained in routine simulations and competitions, and that obtains a near-optimal role-task assignment policy. The method is applied to the RoboCup rescue competition platform, and a large number of simulation experiments show that the weighted synergy graph approach can form a near-optimal agent cooperation team.

Brief description of the drawings

Fig. 1 shows the RoboCup Rescue simulation environment of the embodiment.

Fig. 2 is a fully connected graph of the rescue simulation based on task relations.

Fig. 3 is the revised synergy graph.

Fig. 4 is the synergy graph with compatibility.

Fig. 5 is a simple rescue weighted synergy graph model.

Fig. 6 is the flow chart of the optimal assignment policy algorithm.

Fig. 7 is a sketch of rescue agent cooperation.

Fig. 8 shows the interactive learning curves of the weighted synergy graph.

Fig. 9 shows the effect of rescue agents rescuing at random.

Fig. 10 shows the effect after the weighted synergy graph is used.

Fig. 11 shows the initial stage of the competition simulation.

Fig. 12 shows the result with the auction algorithm.

Fig. 13 shows the result with the weighted synergy graph.

Fig. 14 is the score statistics chart comparing the weighted synergy graph with the auction algorithm and a random cooperation strategy.

Specific embodiment

The preferred embodiment of the invention is described in detail below with reference to the accompanying drawings.

The embodiment uses the weighted synergy graph to model the six agent roles of the rescue simulation platform, uses the structure of the weighted graph to compose an ideal team, and tests how different role assignments affect the rescue performance of the whole rescue team.

In the embodiment, using the weighted synergy graph model, the capabilities of the three agent types PF, FB and AT in the dynamically changing disaster model are represented by normal distributions. An agent cooperation team is represented as a set, and weights are added to the synergy graph model to improve its expressiveness. The learning algorithm is trained on a small amount of data: only one data point is used for the agent of each role. The embodiment also simulates how different team compositions affect the urban rescue outcome and then estimates the optimal team and task assignment policy. The solution is divided into three steps:

A. A role-based weighted synergy graph is defined, and the capability of each individual agent and the synergy among multiple agents are modeled;

B. An agent cooperation team is established, and the interactions among agents are effectively modeled;

C. The optimal role assignment for executing a task is estimated with the weighted synergy graph learning algorithm.

For the weighted synergy graph in the RoboCup rescue simulation environment, the task environment is defined first; the weighted synergy graph is then defined in the rescue simulation environment; finally, the rescue simulation model based on the weighted synergy graph is established.

Define task environment

At the start of the rescue simulation, an earthquake has struck the city: civilians are trapped in the ruins, buildings are on fire and roads are blocked by roadblocks. A group of rescue teams (FB, PF, AT) is deployed in the city with the goals of rescuing as many citizens as possible, containing building fires as far as possible and reducing the losses the city suffers. PF can clear roadblocks, AT rescues injured civilians, and FB extinguishes building fires. Because at the start of the simulation the agents have not yet cooperated and their capabilities are unknown, a rescue team with the highest cooperation efficiency has to be found.

To complete a complex rescue task T, a model F can be defined, where F(A) is an instance of this model. To better express the cooperative relations among the three types of mobile agents during rescue, their links in completing the same task network can be used to formulate a weighted connected graph model: vertices represent agents, edges represent the relations between agents, and weights represent the cooperation coefficient d. The smaller d is, the better the cooperation and the lower the cost.

Fig. 2 is a fully connected graph of the rescue simulation based on task relations. A fully connected graph, however, cannot capture the transitivity of cooperation between agents: for example, if PF cooperates well with FB and FB cooperates well with AT, then PF cooperates well with AT. The fully connected graph of Fig. 2 is therefore modified into the connected graph of Fig. 3 by removing the edge (PF, AT). To model the inverse relation between the cooperation coefficient d of the agents and their task-based relation, a weighting function w can be introduced: from the shortest distance between PF and FB in the graph, w(d(PF, FB)) is obtained, where w(d) ≥ 0 and w(d) is inversely related to d. As a simplifying assumption, the weights are taken to capture the task-based agent relations completely.
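The shortest-distance idea in the preceding paragraph can be illustrated with a minimal Python sketch. The unit edge weights, the helper name `shortest_distance` and the choice w(d) = 1/d are assumptions made for illustration; the text only requires that w(d) be non-negative and decrease with d.

```python
import heapq

def shortest_distance(edges, source, target):
    """Dijkstra over an undirected weighted graph given as (u, v, weight) triples."""
    adjacency = {}
    for u, v, weight in edges:
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")

def weight_fn(d):
    # Hypothetical choice of w: non-negative and decreasing in d.
    return 1.0 / d

# Connected graph in the spirit of Fig. 3: the (PF, AT) edge is removed.
# Unit weights are assumed purely for illustration.
EDGES = [("PF", "FB", 1), ("FB", "AT", 1)]
d_pf_fb = shortest_distance(EDGES, "PF", "FB")
d_pf_at = shortest_distance(EDGES, "PF", "AT")
print(d_pf_fb, d_pf_at, weight_fn(d_pf_fb), weight_fn(d_pf_at))
```

With the (PF, AT) edge removed, the relation between PF and AT is derived through FB, which is how the graph encodes transitivity.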

Defining the weighted synergy graph in the rescue simulation environment

Next, a simple model is made of each agent's capability of handling a single task at the initial moment of the rescue, and the effect of agents cooperating on a task is quantified as F(A). Suppose that AT and FB can both cooperate well with PF, but PF1 handles individual tasks better than PF2; if F({PF, FB}) > F({PF, AT}), the combination (PF, FB) is chosen to complete the task. Although F is an unknown distribution, it can still be decomposed over multiple subtasks, each of which can be treated as a normal distribution: its single peak represents the agents' mean performance on the task, and its symmetric spread represents the variability of the agents' capability. Definition 1 of the synergy graph is given below:

A: the agent team. T: the complex task received. M: the subtasks obtained by decomposing the complex task T. F: 2^A → X, where X is an M-dimensional random variable with unknown distribution. F represents the field disaster model of the rescue simulation, and F(A) is the performance that the agent team A obtains in this world model; the value of F(A) depends on how team A handles the M subtasks. V: computes the overall value of completing the complex task T on the basis of the M subtasks. For example, V_task(X) = V_sum(X) if and only if X(m) > τ for every subtask m; if any subtask falls below the set threshold τ, the value of the task is 0. X(m) denotes the m-th element of X. The optimal agent team A* satisfies V(F(A*)) ≥ V(F(A)) for every team A.
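The threshold rule for V in Definition 1 can be sketched as follows. The text does not define V_sum, so taking it to be the plain sum of the subtask outcomes is an assumption, as are the function names and the sample numbers.

```python
def v_sum(x):
    # Assumed reading of V_sum: the plain sum of the subtask outcomes.
    return sum(x)

def v_task(x, tau):
    """Return V_sum(x) only if every subtask outcome exceeds tau, else 0."""
    if all(value > tau for value in x):
        return v_sum(x)
    return 0.0

# Illustrative outcomes for M = 2 subtasks with threshold tau = 2.
print(v_task([3.0, 4.0], tau=2.0))  # both subtasks clear the threshold
print(v_task([3.0, 1.0], tau=2.0))  # one subtask falls below it
```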

The idea of synergy is now brought into the rescue simulation environment. In the rescue simulation, a function V is defined from the number of rescued civilians and the state of the buildings in the city. For each role assignment policy π, every role must be assigned an agent type. An agent type a ∈ A can be assigned to several roles; for example, there may be two roles r_α and r_β such that π(r_α) = π(r_β), which means that several agents of the same type play these roles. Definition 2 is therefore given:

There are N agent types (N = 6) and M roles. To find the optimal role assignment policy π* with respect to the value function V, the performance of a role assignment policy is assumed to be normally distributed. Each agent type a_i ∈ A is associated with M normal distributions {C_{i,1}, ..., C_{i,M}}, where C_{i,α} is the capability of agent a_i when it handles the task on its own in role r_α. To model the synergy among agents effectively, a compatibility function φ is introduced; the higher the compatibility, the better the synergy. Compatibility is transitive: if a₁ and a₂ are highly compatible and a₂ and a₃ are highly compatible, then a₁ and a₃ are highly compatible. φ is defined as a monotonically decreasing function that takes a distance as its argument and outputs a compatibility, so that a short distance represents a high compatibility. As shown in Fig. 4, the distances between the three agents PF, AT and FB represent their compatibility: PF and FB are highly compatible, although each has a low compatibility with itself; likewise, by transitivity, the compatibility of PF and AT is also high.
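The text only requires φ to be monotonically decreasing in the distance. The two forms below (a reciprocal and an exponential decay with a half-life parameter) are hypothetical choices that satisfy this requirement; neither is stated in the source.

```python
import math

def phi_fraction(d):
    """Hypothetical compatibility: the reciprocal of the distance."""
    return 1.0 / d

def phi_decay(d, half_life=2.0):
    """Hypothetical compatibility: halves every `half_life` units of distance."""
    return math.exp(-d * math.log(2.0) / half_life)

# Compatibility falls as the graph distance grows, for both candidate forms.
for distance in (1, 2, 3):
    print(distance, phi_fraction(distance), phi_decay(distance))
```

Either form preserves the ordering of distances, so a pair of agents that is closer in the graph is always at least as compatible as a pair that is farther apart.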

Rescue simulation model based on the weighted synergy graph

The weighted synergy graph model S is a tuple {G, C}, where:

G = {V, E} is a weighted connected graph;

Each a_i ∈ A is represented by a vertex v_i ∈ V, and e = (v_i, v_j, w_{i,j}) ∈ E is an edge between vertices v_i and v_j with weight w_{i,j} ∈ Z⁺. In addition, each vertex has a self-loop.

C = {C_1, ..., C_N}, where C_i = {C_{i,1}, ..., C_{i,M}} and C_{i,α} is the capability of a_i in role r_α. To obtain the compatibility φ between agents from the distance between vertices of the weighted synergy graph, the pairwise synergy function is adjusted. The pairwise synergy between agents a_i and a_j assigned to roles r_α and r_β is: S₂(a_i, a_j, r_α, r_β) = φ(d(v_i, v_j))·(C_{i,α} + C_{j,β}) (1)

Here d(v_i, v_i) = w_{i,i}, and for i ≠ j, d(v_i, v_j) is the shortest distance between vertices v_i and v_j in the weighted synergy graph. The synergy S(A) of a team in the graph S is the average of the pairwise synergies of the team.

Fig. 5 shows a simple rescue weighted synergy graph model. S(A) returns a set of normal distributions that represent task performance.

Here w_{a,a′} = w(d(v_a, v_a′)), and the capabilities with which the agents handle the subtasks are assumed to be independent. By analogy, a team synergy function is formulated: for a role policy π: R → A, the synergy of the assigned agent team is the average of the pairwise synergies (1) of the agents assigned to the roles.

The synergy function therefore returns a normal distribution, which represents the performance of the agent team under the role assignment policy.
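A minimal Python sketch of equation (1) and of the team synergy as an average of pairwise synergies is given here. It rests on several assumptions that are not spelled out in the text: capabilities are independent normal variables, the pair terms are treated as independent when the variances are combined, and the capability numbers, the constant distance and the choice φ(d) = 1/d are made up for illustration.

```python
from itertools import combinations
from typing import NamedTuple

class Normal(NamedTuple):
    mean: float
    var: float

def pair_synergy(phi_d, cap_i, cap_j):
    """Equation (1): phi(d) * (C_i + C_j) for independent normal capabilities."""
    return Normal(phi_d * (cap_i.mean + cap_j.mean),
                  phi_d ** 2 * (cap_i.var + cap_j.var))

def team_synergy(policy, capability, distance, phi):
    """Average of the pairwise synergies over all pairs of roles.

    policy maps role -> agent type; capability maps (agent type, role) -> Normal.
    Treating the pair terms as independent is a simplifying assumption.
    """
    pairs = list(combinations(sorted(policy), 2))
    mean = 0.0
    var = 0.0
    for r_a, r_b in pairs:
        a_i, a_j = policy[r_a], policy[r_b]
        s2 = pair_synergy(phi(distance(a_i, a_j)),
                          capability[(a_i, r_a)], capability[(a_j, r_b)])
        mean += s2.mean
        var += s2.var
    count = len(pairs)
    return Normal(mean / count, var / count ** 2)

# Illustrative numbers only: two roles, FB in r1 and AT in r2, graph distance 2.
POLICY = {"r1": "FB", "r2": "AT"}
CAPABILITY = {("FB", "r1"): Normal(10.0, 4.0), ("AT", "r2"): Normal(6.0, 1.0)}
result = team_synergy(POLICY, CAPABILITY, lambda a, b: 2, lambda d: 1.0 / d)
print(result)
```

Because a scaled sum of independent normal variables is again normal, the result can be carried around as a (mean, variance) pair, which is what makes the later ranking step cheap.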

Building the cooperation team with the weighted synergy graph

The optimal rescue team is obtained with an approximation algorithm on the basis of the weighted synergy graph. Because the number of rescue teams in a rescue task is limited by the external environment, a partitioning idea is used: the number of agents required by a task is assumed to be fixed, and the number of agents required by each of its subtasks is also fixed. If the team size n is unknown, the algorithm can be run iteratively with increasing n; all suitable values of n are returned, and the best size is then selected. In the weighted synergy graph, to rank teams and select the optimal one, the normal distributions that represent agent capabilities are converted into their means, which simplifies the comparison. The pseudocode is as follows:

A ← RandomTeam(𝒜, n)    // generate a random team of n agents
(N_{A,1}, ..., N_{A,M}) ← S(A)    // synergy of the team on each of the M subtasks
v ← V(Evaluate(N_{A,1}, ρ), ..., Evaluate(N_{A,M}, ρ))    // value of the team's ability to complete the task
for k = 1 to k_max do
{
    A′ ← RandomTeamNeighbor(A)    // if a required agent is not available, substitute a neighbouring agent
    v′ ← V(Evaluate(N_{A′,1}, ρ), ..., Evaluate(N_{A′,M}, ρ))
    if P(v, v′, Temp(k, k_max)) > random() then    // annealing acceptance test; Temp is the temperature function
        A ← A′, v ← v′
}
return A

The above algorithm yields a near-optimal team; the optimal role assignment policy is sought next.
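The team-formation pseudocode above can be sketched in runnable Python as a simulated-annealing search. The linear cooling schedule, the exponential acceptance rule, the per-agent skill table and the additive score are all assumptions made for illustration; in the method itself the score would come from the weighted synergy graph through S, Evaluate and V.

```python
import math
import random

def anneal_team(agents, n, score, k_max=500, seed=0):
    """Search for a high-scoring team of n distinct agents by simulated annealing."""
    rng = random.Random(seed)
    team = rng.sample(agents, n)                 # RandomTeam
    value = score(team)
    for k in range(1, k_max + 1):
        outside = [a for a in agents if a not in team]
        if not outside:
            break                                # every agent is already on the team
        neighbour = list(team)                   # RandomTeamNeighbor: swap one member
        neighbour[rng.randrange(n)] = rng.choice(outside)
        new_value = score(neighbour)
        temperature = max(1e-9, 1.0 - k / k_max)  # assumed linear cooling schedule
        # Improvements are always accepted; a worse team is accepted with a
        # probability that shrinks as the temperature falls.
        if new_value >= value or math.exp((new_value - value) / temperature) > rng.random():
            team, value = neighbour, new_value
    return team, value

# Hypothetical per-agent skill values standing in for the synergy-based score.
SKILL = {"FB1": 5.0, "FB2": 3.0, "PF1": 4.0, "PF2": 1.0, "AT1": 2.0}
team, value = anneal_team(sorted(SKILL), 3, lambda t: sum(SKILL[a] for a in t))
print(team, value)
```

Accepting occasional worse teams early on is what lets the search leave a poor starting team; as the temperature falls the search behaves like plain hill climbing.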

Establishing the optimal cooperative assignment policy

Given the weighted synergy graph S, the aim is to use S to estimate the optimal role assignment policy π*. The synergy function S, however, returns a normal distribution. An evaluation function is therefore defined that uses a risk factor ρ ∈ (0, 1) to convert a normal distribution into a number: Evaluate(N(μ, σ²), ρ) = μ + σ·Φ⁻¹(ρ), where Φ⁻¹ is the inverse of the standard normal cumulative distribution function. When ρ = 1/2, the returned value equals the mean of the distribution; when ρ > 1/2 (ρ < 1/2), the returned value increases (decreases) with the variance. With an evaluation function for ranking distributions in place, the role assignment policy is explored further through the task of agents extinguishing building fires, and the estimated optimal team is formed. The algorithm starts from a random assignment policy and generates neighbouring policies by changing the agent type of one role. The score of each new policy is computed with Evaluate, using the standard server RoboRescue-V1,1 and the temperature simulator provided by RCRSS.
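The risk-adjusted conversion of a normal distribution into a single number can be sketched with the standard library. The closed form μ + σ·Φ⁻¹(ρ) is an assumed reading that matches the behaviour described in the text (the mean is returned at ρ = 1/2); the function name and the sample values are illustrative.

```python
from statistics import NormalDist

def evaluate(mean, std, rho):
    """Map N(mean, std^2) to a number using risk factor rho in (0, 1).

    Assumed form: mean + std * inverse standard normal CDF at rho.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return mean + std * NormalDist().inv_cdf(rho)

# Illustrative team performance N(10, 2^2) under three risk attitudes.
for rho in (0.1, 0.5, 0.9):
    print(rho, evaluate(10.0, 2.0, rho))
```

A risk factor below 1/2 penalises uncertain teams, while a factor above 1/2 rewards them, so the same synergy graph can yield different rankings for cautious and optimistic planners.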

To use the weighted synergy graph model described above, the model has to be learned from data. A weighted synergy graph learning algorithm that learns from examples of role assignment policies is proposed below. V is unknown, but examples T = {(π₁, V₁), ..., (π_|T|, V_|T|)} of V can be obtained. Although V(π) is a distribution, only one sample is used for each policy in T.

Fig. 6 is the flow chart of the optimal assignment policy algorithm. The learning algorithm for the weighted synergy graph has two steps:

1. An initial weighted synergy graph is generated at random, and the agents' task-handling capabilities are estimated from the examples T;

2. The agent team executes rescue policies. Because many different weighted synergy graphs are possible, the team is allowed to explore the space of weighted synergy graphs, and a new weighted synergy graph is obtained from the log-likelihood of the examples.

In the pseudocode, an initial weighted synergy graph is generated by the function RandomGraphStructure: it creates the vertices, adds edges with random weights until the graph is connected, and gives each vertex a self-loop. Then, while the agents carry out fire fighting, the space of weighted synergy graphs is explored with NeighborGraphStructure, which generates a neighbouring graph structure in one of the following four ways:

a. If an edge weight is below the maximum weight V_max, the weight of a random edge is increased by 1;

b. If an edge weight is above the minimum weight V_min, the weight of a random edge is decreased by 1;

c. A new edge is added between two vertices;

d. An edge whose removal does not disconnect the graph is removed.

Because edge weights are increased and decreased in steps of 1, all combinations of positive integer edge weights can be explored, while cases c and d change which vertices are directly connected. The pseudocode of the weighted synergy graph learning algorithm is as follows:

G ← RandomGraphStructure(A)    // randomly generate an initial weighted synergy graph structure
C ← EstimateCapabilities(G, R, T)    // estimate the agents' capabilities for the current graph
S ← (G, C), l ← LogLikelihood(S, T)    // log-likelihood of the examples
for k = 1 to k_max do {    // iterate k_max times, exploring the graph space through neighbouring structures
    G′ ← NeighborGraphStructure(G)
    C′ ← EstimateCapabilities(G′, R, T)
    S′ ← (G′, C′), l′ ← LogLikelihood(S′, T)
    if P(l, l′, Temp(k, k_max)) > random() then    // the neighbouring graph explains the fire-fighting performance better
        G ← G′, S ← S′, l ← l′    // replace the current weighted synergy graph
}
return S
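The four neighbour moves used by NeighborGraphStructure can be sketched as follows. The edge representation (a dictionary keyed by vertex pairs in vertex-list order, with (v, v) for self-loops), the uniform choice among applicable moves and the rule that self-loops are never removed are assumptions made for this illustration.

```python
import random

def is_connected(vertices, edges):
    """Depth-first check that every vertex is reachable from the first one."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = set(), [vertices[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adjacency[node] - seen)
    return len(seen) == len(vertices)

def neighbor_graph_structure(vertices, edges, w_min, w_max, rng):
    """Return a copy of `edges` changed by one of the four moves a to d."""
    moves = []
    for key, weight in edges.items():
        if weight < w_max:
            moves.append(("increase", key))          # move a
        if weight > w_min:
            moves.append(("decrease", key))          # move b
        u, v = key
        if u != v and is_connected(vertices, [k for k in edges if k != key]):
            moves.append(("remove", key))            # move d, self-loops are kept
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            if (u, v) not in edges:
                moves.append(("add", (u, v)))        # move c
    kind, key = rng.choice(moves)
    result = dict(edges)
    if kind == "increase":
        result[key] += 1
    elif kind == "decrease":
        result[key] -= 1
    elif kind == "remove":
        del result[key]
    else:
        result[key] = rng.randint(w_min, w_max)
    return result

VERTICES = ["PF", "FB", "AT"]
GRAPH = {("PF", "PF"): 1, ("FB", "FB"): 1, ("AT", "AT"): 1,
         ("PF", "FB"): 1, ("FB", "AT"): 1}
neighbour = neighbor_graph_structure(VERTICES, GRAPH, 1, 5, random.Random(1))
print(neighbour)
```

Restricting removals to edges that keep the graph connected guarantees that a shortest distance exists between every pair of agents in each candidate graph.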

Estimating agent capabilities

The EstimateCapabilities method mentioned above deals with N agents and M roles, so NM normal distributions have to be estimated. Assume that |T| > 2NM. Since the structure of the weighted synergy graph is known, the distance between two vertices v_i and v_j can be computed: if v_i ≠ v_j, d(v_i, v_j) is the shortest distance between them; if v_i = v_j, d(v_i, v_j) = w_{i,i}, the weight of the self-loop on vertex v_i. From the distance, φ gives the pairwise compatibility of the agents. EstimateCapabilities estimates the normal distributions so as to maximise the log-likelihood of the training examples T. Each single training example (π, V(π)) ∈ T yields one equation in the means and variances of the agent capabilities.

Fig. 7 is a sketch of rescue agent cooperation. It shows an example of a weighted synergy graph with three mobile agent types and two roles in the rescue simulation. Suppose the agent capabilities C_{i,α} are unknown and the role assignment policy is π = (r₁ → FB, r₂ → AT). The synergy between the two agents then follows from equation (1), and the log-likelihood of V(π) is obtained from the resulting normal distribution.

Each example in T corresponds to an expression in the means and variances of the agent capabilities. To find the distributions that maximise the log-likelihood of T, the sum of the log-likelihood functions over all examples must be maximised. The means are first estimated with a least-squares solver; the variances are then estimated with a non-linear solver, given the estimated means.
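The quantity that EstimateCapabilities maximises can be sketched as a sum of normal log-likelihoods, one term per training example. The function names, the `predict` callback and the sample numbers are assumptions for illustration; the least-squares and non-linear solvers mentioned in the text are not reproduced here.

```python
import math

def normal_log_likelihood(value, mean, var):
    """Log density of N(mean, var) at the observed value."""
    return -0.5 * math.log(2.0 * math.pi * var) - (value - mean) ** 2 / (2.0 * var)

def total_log_likelihood(examples, predict):
    """Sum the log-likelihood over examples (policy, observed value).

    predict(policy) is assumed to return the (mean, variance) that the
    current weighted synergy graph assigns to that policy.
    """
    return sum(normal_log_likelihood(value, *predict(policy))
               for policy, value in examples)

# One illustrative example: a policy observed to score 8.0.
EXAMPLES = [("r1->FB, r2->AT", 8.0)]
good = total_log_likelihood(EXAMPLES, lambda policy: (8.0, 1.25))
bad = total_log_likelihood(EXAMPLES, lambda policy: (2.0, 1.25))
print(good, bad)
```

A graph whose predicted mean lies closer to the observed score receives the higher log-likelihood, which is the signal the learning loop uses when it compares neighbouring graphs.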

Experiment and result

The RoboCup Rescue simulation competition requires the agents to cooperate and complete the rescue task within 300 cycles. After the competition ends, the rescue result is judged by the score V. The score is determined mainly by the number of citizens rescued, the degree to which roads are cleared and the number of building fires extinguished. Each agent shows its own characteristics and role in the rescue, and the agents complement one another: only efficient cooperation lets the agents play their roles well and obtain a higher score. The score therefore indirectly reflects the cooperation ability among the agents and the efficiency of the algorithm. The tests below evaluate the weighted synergy graph in three respects.

First, the improvement in interactive cooperation efficiency

The algorithm is applied to the RoboCup Rescue simulation platform to verify whether the weighted synergy graph can capture the interactions of agents in a dynamic environment. The test uses algorithms for the cooperation of four kinds of agents in RoboCup Rescue and the Istanbul scenario of the RoboCup 2013 competition, which contains 46 rescue agents. To test the interactivity and learning behaviour of the weighted synergy graph, at the start of the simulation each of the 46 agents is first assigned at random an algorithm that controls it, so the coordination strategy of one run is an assignment of cooperation algorithms to the 46 agents. Multiple comparison simulations (ABC) are run, and 400 samples of police agents (PF) are drawn from them at random. After each simulation ends, a weighted sum is computed from the health points (hp) of the civilians, the number of surviving agents and the degree to which buildings are burned. Six rounds of cross learning are first carried out on the 400 samples: 300 samples are drawn from the data as the training group and the remaining 100 samples form the test group. Because the samples are too few to compute the variance of the agent capabilities C_{i,α} in the weighted synergy graph accurately, the variance is set to a constant, and the optimal role assignment policy is then estimated with the weighted synergy graph learned in the six rounds of cross comparison. To obtain a curve of the interactions, the log-likelihood values of 10 randomly selected samples are recorded every 200 iterations and plotted. Fig. 8 shows the log-likelihood curves of 50 samples randomly selected from the test samples. As Fig. 8 shows, the weighted synergy graph effectively models the interactions of the RoboCup rescue task assignment policies in the test, which demonstrates the validity of interactive learning with the weighted synergy graph.

2nd, the improvement of intelligent body Team Decision Making

In order to test the improvement cooperated using the intelligent body after weighting cooperative figure, compete what is provided using Robocup2013 Kobe maps are tested, and contrast uses the effect after preceding and use, it can be seen in figure 9 that when weighting cooperative figure is not used, In construction zone of catching fire, only Liang Zhi fire brigades and a police agent being rescued, but the one of near zone Fire brigade and a police) rescue cooperation is not engaged in, the buildings intensity of a fire that causes to catch fire is larger, and rescue effect is undesirable.

Figure 10 is the use of the effect after weighting cooperative figure, be can see in figure, is originally not engaged in disappearing for rescue cooperation Anti- team and police agent have participated in the putting out the groups of building that catch fire of the task, and police effectively cleans up roadblock, 4 fire-fightings Team also controls rapidly the intensity of a fire, and disaster area is fallen below into minimum, it can be seen that, on rescue strategies, the cooperation between intelligent body Efficiency has very big improvement.

3. Improvement of the overall effect

The weighted synergy graph is compared with an auction algorithm and a random cooperation strategy. Multiple comparison tests are first carried out on the Kobe map; maps from the RoboCup 2013 competition are then selected, and the competition score of each strategy on each map is recorded and compared. Fig. 11 shows the initial stage of the simulated match, Fig. 12 shows the result using the auction algorithm, and Fig. 13 shows the result using the weighted synergy graph.

Ten comparison tests are then carried out on each of the different maps, and the resulting scores are compared, as shown in Table 1:

Table 1: Score statistics

Test map (10 runs)	VC2	Berlin2	Kobe3	Istanbul	Mexico2	Eindhoven	Paris3
Auction algorithm, highest score	140.12	97.67	67.42	17.34	19.54	20.25	134.68
Weighted synergy graph, highest score	159.22	120.45	80.55	19.23	30.1	26.43	157.11
Auction algorithm, lowest score	125.42	89.65	60.78	12.95	13.43	14.22	110.76
Weighted synergy graph, lowest score	143.09	102.56	70.92	16.45	18.7	21.33	147.43
Auction algorithm, average score	132.45	94.25	65.4	15.41	16.21	17.65	120.98
Weighted synergy graph, average score	155.26	108.17	73.36	18.11	21.67	24.12	153.88

The data above are plotted as a bar chart in Fig. 14. The comparison shows that the weighted synergy graph achieves a higher average score than the auction algorithm on all seven maps (for example, 155.26 versus 132.45 on VC2). Introducing the weighted synergy graph thus substantially improves the overall efficiency of cooperative agent rescue, which demonstrates the validity of using the weighted synergy graph for multi-agent task allocation.
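The size of the improvement can be read off Table 1 with a few lines of arithmetic. The Python sketch below computes the relative gain in average score of the weighted synergy graph over the auction algorithm for each map; the numbers are copied from Table 1, and the helper name `relative_gain` is introduced here for illustration only.

```python
maps = ["VC2", "Berlin2", "Kobe3", "Istanbul", "Mexico2", "Eindhoven", "Paris3"]
auction_avg = [132.45, 94.25, 65.4, 15.41, 16.21, 17.65, 120.98]
weighted_avg = [155.26, 108.17, 73.36, 18.11, 21.67, 24.12, 153.88]

def relative_gain(baseline, improved):
    """Percentage by which `improved` exceeds `baseline`."""
    return (improved - baseline) / baseline * 100.0

# Relative gain of the weighted synergy graph over the auction algorithm per map.
gains = {
    name: relative_gain(auction, weighted)
    for name, auction, weighted in zip(maps, auction_avg, weighted_avg)
}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```

A relative measure is used because the absolute scores differ by almost an order of magnitude between maps such as VC2 and Istanbul.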

Based on the characteristics of the RoboCup Rescue simulation competition, the embodiment first defines the role-based weighted synergy graph and models the abilities and the synergy of the agents in the match. It then defines the cooperative agent team in the match, and uses the learning algorithm and the task assignment policy based on the weighted synergy graph to carry out cooperative agent fire-fighting in the rescue simulation match. Experimental simulation tests demonstrate the validity of the method.

Claims

1. A RoboCup rescue cooperation method based on weighted synergy, characterized in that:

The ability of each agent to handle an individual task at the initial moment of the rescue is modeled, and the effect of agents cooperatively performing a given task is quantified as F(A), where F(A) is the quantized value of the effect of the agents cooperatively performing the task and A denotes the agent team; if F({PF, FB}) > F({PF, AT}), where PF denotes the rescue team that removes roadblocks, FB denotes the rescue team that extinguishes building fires, and AT denotes the rescue team that rescues injured civilians, the combination {PF, FB} is selected to complete the task;

The cooperation state of the agents is modeled by introducing a compatibility function φ, where φ is a monotonically decreasing function that takes a distance as its argument and outputs a compatibility: the shorter the distance, the higher the compatibility, and the higher the compatibility, the better the synergy;

2. The RoboCup rescue cooperation method based on weighted synergy according to claim 1, characterized in that the weighted synergy graph model S is a tuple {G, C}, where:

G = {V, E} is a weighted connected graph;

e = (v_i, v_j, w_{i,j}) ∈ E is an edge between vertices v_i and v_j carrying weight w_{i,j} ∈ Z₊;

Each vertex has a self-loop edge;

C is such that C_i = {C_{i,1}, ..., C_{i,M}}, where C_{i,α} is the ability of agent a_i in role r_α, and M is the number of roles each agent can take on.
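As an illustration of the structure in claim 2, the following Python sketch stores a weighted synergy graph and checks the stated properties (positive integer weights, a self-loop on every vertex, connectivity). The class name `WeightedSynergyGraph`, its fields, the role name `clear_road`, and the representation of each ability C_{i,α} as a (mean, variance) pair are assumptions made for this example and are not part of the claims.

```python
from dataclasses import dataclass, field

@dataclass
class WeightedSynergyGraph:
    vertices: set                                   # agent identifiers (V)
    edges: dict = field(default_factory=dict)       # frozenset({vi, vj}) -> w_ij
    abilities: dict = field(default_factory=dict)   # (agent, role) -> (mean, variance)

    def add_edge(self, vi, vj, weight):
        """Add an undirected edge; weights must be positive integers."""
        if not (isinstance(weight, int) and weight > 0):
            raise ValueError("edge weights must be positive integers")
        self.edges[frozenset((vi, vj))] = weight

    def has_self_loops(self):
        """True if every vertex carries a self-loop edge."""
        return all(frozenset((v,)) in self.edges for v in self.vertices)

    def is_connected(self):
        """Depth-first search from an arbitrary vertex."""
        if not self.vertices:
            return True
        start = next(iter(self.vertices))
        seen, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for edge in self.edges:
                if current in edge:
                    for other in edge - seen:
                        seen.add(other)
                        stack.append(other)
        return seen == self.vertices

# Example with one agent of each kind; all values are placeholders.
graph = WeightedSynergyGraph(vertices={"PF", "FB", "AT"})
for agent in graph.vertices:
    graph.add_edge(agent, agent, 1)      # self-loop on each vertex
graph.add_edge("PF", "FB", 1)
graph.add_edge("FB", "AT", 2)
graph.abilities[("PF", "clear_road")] = (5.0, 1.0)
```

Keying edges by `frozenset` makes the graph undirected and lets a self-loop be stored as a one-element set.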

3. The RoboCup rescue cooperation method based on weighted synergy according to claim 2, characterized in that: using the weighted synergy graph model, the abilities of the three kinds of agents PF, FB and AT in the dynamically changing disaster model are represented by Gaussian distributions;

S(\pi) = \frac{1}{\binom{|R|}{2}} \cdot \sum_{r_\alpha, r_\beta \in R} S_2(\pi(r_\alpha), \pi(r_\beta), r_\alpha, r_\beta) \qquad (4)
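Equation (4) averages a pairwise synergy S_2 over all pairs of roles. The Python sketch below evaluates it for a small role assignment. Several pieces are assumptions for illustration only: the pairwise term is taken to be φ(d) multiplied by the sum of the two agents' mean abilities, φ(d) = 1/d is just one possible monotonically decreasing function, and the agent names, role names, distances and ability values are placeholders.

```python
from itertools import combinations

def phi(distance):
    """Assumed compatibility function: monotonically decreasing in distance."""
    return 1.0 / distance

def pair_synergy(agent_a, agent_b, role_a, role_b, distance, mean_ability):
    """Assumed S_2: compatibility times the sum of the two mean abilities."""
    d = distance[frozenset((agent_a, agent_b))]
    return phi(d) * (mean_ability[(agent_a, role_a)] + mean_ability[(agent_b, role_b)])

def team_synergy(assignment, distance, mean_ability):
    """Equation (4): average of S_2 over all unordered pairs of roles."""
    if len(assignment) < 2:
        raise ValueError("at least two roles are needed to form a pair")
    pairs = list(combinations(sorted(assignment), 2))   # binom(|R|, 2) pairs
    total = sum(
        pair_synergy(assignment[ra], assignment[rb], ra, rb, distance, mean_ability)
        for ra, rb in pairs
    )
    return total / len(pairs)

# Placeholder role assignment pi: role -> agent.
assignment = {"clear_road": "PF1", "extinguish": "FB1", "rescue": "AT1"}
distance = {
    frozenset(("PF1", "FB1")): 1,
    frozenset(("FB1", "AT1")): 2,
    frozenset(("PF1", "AT1")): 3,
}
mean_ability = {
    ("PF1", "clear_road"): 4.0,
    ("FB1", "extinguish"): 6.0,
    ("AT1", "rescue"): 2.0,
}
score = team_synergy(assignment, distance, mean_ability)
```

With these placeholder values the three pairwise terms are 10, 2 and 4, so `score` is their average, 16/3.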

4. The RoboCup rescue cooperation method based on weighted synergy according to claim 3, characterized in that the optimal rescue team is obtained on the basis of the weighted synergy graph using an approximation algorithm: following a partitioning approach, the number of agents required by a task is assumed to be fixed, and the number of agents required by each subtask of the task is also fixed; if the team size n is unknown, the approximation algorithm is run iteratively with increasing n, the results for all suitable values of n are returned, and the optimal team size is then selected.

5. The RoboCup rescue cooperation method based on weighted synergy according to claim 4, characterized in that, in the weighted synergy graph, the normal distributions representing the agent abilities are converted into their means in order to select the optimal team.

6. The RoboCup rescue cooperation method based on weighted synergy according to any one of claims 1-5, characterized in that the optimal weighted synergy graph is obtained by the learning algorithm of the weighted synergy graph:

The agent team executes the rescue strategy, and the optimal weighted synergy graph is obtained from the log-likelihood estimation function of the examples.

7. The RoboCup rescue cooperation method based on weighted synergy according to claim 6, characterized in that the learning algorithm of the weighted synergy graph is specifically:

An initial weighted synergy graph is generated: vertices are generated at random, and edges with arbitrary weights are added to the graph at random so that the synergy graph is connected, with a self-loop edge on each vertex;

While the agents carry out fire-fighting, the space of weighted synergy graphs is explored, and a neighboring graph structure is generated by one of the following four operations:

Adding a new edge between two vertices;

Removing an edge whose removal does not disconnect the graph;

If the neighboring weighted synergy graph yields better fire-fighting performance, it replaces the current weighted synergy graph; the optimal weighted synergy graph is finally obtained.
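The search in claim 7 can be sketched as a simple hill climb over graph structures. The Python sketch below is an illustration under stated assumptions, not the claimed implementation: it implements only the two neighbor operations listed above (adding an edge, removing an edge that does not disconnect the graph), the function names and the `max_weight` parameter are hypothetical, and `score` is a stand-in for the fire-fighting performance measure (for example, a log-likelihood of observed rescue outcomes).

```python
import random

def is_connected(vertices, edges):
    """Depth-first search over an edge dict keyed by frozenset({u, v})."""
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for edge in edges:
            if current in edge:
                for other in edge - seen:
                    seen.add(other)
                    stack.append(other)
    return seen == set(vertices)

def neighbor(vertices, edges, rng, max_weight=3):
    """Return a neighboring graph: add one edge, or remove one removable edge."""
    candidate = dict(edges)
    nodes = sorted(vertices)
    missing = [frozenset((u, v)) for i, u in enumerate(nodes)
               for v in nodes[i + 1:] if frozenset((u, v)) not in candidate]
    removable = [e for e in candidate if len(e) == 2]   # self-loops always stay
    if missing and (not removable or rng.random() < 0.5):
        candidate[rng.choice(missing)] = rng.randint(1, max_weight)
    elif removable:
        edge = rng.choice(removable)
        weight = candidate.pop(edge)
        if not is_connected(vertices, candidate):
            candidate[edge] = weight    # removal would disconnect the graph; undo
    return candidate

def learn(vertices, edges, score, iterations, rng):
    """Hill climbing: a neighbor replaces the current graph only if it scores better."""
    best, best_score = dict(edges), score(edges)
    for _ in range(iterations):
        candidate = neighbor(vertices, best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

# Toy run: a path graph with self-loops and a placeholder score (edge count).
vertices = {"PF", "FB", "AT"}
initial = {frozenset((v,)): 1 for v in vertices}
initial[frozenset(("PF", "FB"))] = 1
initial[frozenset(("FB", "AT"))] = 1
best, best_score = learn(vertices, initial, len, 50, random.Random(0))
```

Because a candidate is accepted only when it scores strictly better, the returned graph never scores worse than the initial one, and connectivity and self-loops are preserved by construction.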