CN104063541A - Hierarchical decision making mechanism-based multirobot cooperation method - Google Patents

Hierarchical decision making mechanism-based multirobot cooperation method

Info

Publication number
CN104063541A
Authority
CN
China
Prior art keywords
role
opponent
decision making
ball
football
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410274560.6A
Other languages
Chinese (zh)
Other versions
CN104063541B (en)
Inventor
梁志伟
沈萍
刘娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201410274560.6A priority Critical patent/CN104063541B/en
Publication of CN104063541A publication Critical patent/CN104063541A/en
Application granted granted Critical
Publication of CN104063541B publication Critical patent/CN104063541B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Toys (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a multi-robot cooperation method based on a hierarchical decision-making mechanism. The players first select a formation for the current match situation according to the position of the ball. All players then vote to elect the player currently best suited to be the center forward (CF), and roles are assigned to the remaining players. Each player then checks whether it is the CF. If it is, it walks to the ball and dribbles, and a desired behavior prediction model (DOBMP) models the opponent's speed mathematically so that the CF's walk-or-kick decision module can choose between kicking and dribbling. Players that are not the CF receive their role assignments and walk to their assigned positions in the selected formation. The method thus performs the CF selection and the role assignment of all other players in sequence, builds the DOBMP for the CF's dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation introduced by the role assignment function, which keeps role switching fluent as the ball position changes continuously.

Description

Multi-robot cooperation method based on a hierarchical decision-making mechanism
Technical field
The present invention relates to a multi-robot cooperation method based on a hierarchical decision-making mechanism.
Background
The two most influential robot soccer world cups today are FIRA (Federation of International Robot-soccer Association) and RoboCup. The biggest difference between them is that FIRA allows a team to use traditional centralized control, which is equivalent to all teammates being controlled by a single brain, whereas RoboCup requires distributed control, which is equivalent to each team member having its own brain and therefore being an independent agent. This calls for in-depth study of multi-agent systems (MAS), in which multiple agents plan how to cooperate and compete in order to accomplish a goal task, and use evolutionary algorithms and swarm intelligence to reach an overall performance target.
In the RoboCup 3D simulation league, a match cannot be won by individual play alone; it requires coordination and cooperation among all team members. The league is mainly a test of how multiple agents achieve efficient cooperation and tenacious competition in a complex dynamic environment. The number of players per team in the RoboCup 3D simulation environment grew from 6 agents in 2010 to 9 in 2011 and to 11 at present, which places higher demands on multi-agent cooperation.
In recent years, the coordination of multiple robots has been explored to varying degrees both in China and abroad. For example, FC Portugal (Portugal) addresses the player role assignment problem with the Iterated Optimal Assignment (IOA) method, which seeks a bounded optimum based on the well-known greedy algorithm, combined with a role exchange mechanism. Drawing on observation of human soccer, others have proposed an imitation learning mechanism to unify complex human behavior with robot motion; however, because the basic framework of imitation learning is not well understood, a suitable interactive interface is also difficult to obtain. The UT Austin Villa team (USA) designs its target framework with a subtask-set optimization method and uses a dynamic role assignment algorithm to coordinate the positioning of the whole team. The BoldHearts team (UK) uses a coalition algorithm, intended to build a strong coalition team that meets the demands of the external environment; its action parameters can be optimized by the algorithm, and it adopts the gradient-free Infotaxis search strategy while optimizing the rate of information gain. The Robocanes team (USA) uses a spatio-temporal model matching method to build motion models and their internal states, draws on the walking engine of the German B-Human team, and optimizes the parameter configurations of different behaviors with a genetic algorithm and the SARSA learning algorithm.
All of the above methods require some optimization mechanism and learning method; for the role assignment problem, their computational cost is high and their update speed is slow. These problems need to be addressed in multi-robot cooperation.
Summary of the invention
The object of the invention is to provide a multi-robot cooperation method based on a hierarchical decision-making mechanism that achieves effective cooperation across the whole multi-robot team. The method performs the selection of the center forward (CF, the ball holder) and the role assignment of all other players in sequence, builds the DOBMP model for the CF's dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation introduced by the role assignment function, which keeps role switching fluent as the ball position changes continuously.
The technical solution of the invention is:
A multi-robot cooperation method based on a hierarchical decision-making mechanism, in which:
The players select a formation for the match according to the position of the ball;
All players then vote to elect the player each considers currently best suited to be the center forward (CF), after which the other roles are assigned;
Each player determines whether it is the CF. If it is, it walks to the ball and dribbles, and the desired behavior prediction model is used to model the opponent's speed mathematically for the CF's walk-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it there;
If it is not the CF, it receives its role assignment, walks to the assigned position, and takes up the selected formation.
Further, using the desired behavior prediction model to model the opponent's speed for the CF's walk-or-kick decision module comprises:
From the opponent's average speed and current position, calculating the time T the opponent needs to reach the ball; since the time our player needs to execute the kick is also known, a threshold is set to predict whether our robot can successfully kick the ball to the target point;
Assuming the opponent can block our kick within time t, the smaller the value of T-t, the more likely our side is to complete the kick successfully;
When the value of T-t is less than the set threshold, the kick is considered achievable, and the ball is kicked to the target point.
Further, if the opponent still manages to block our kick after the decision has been made, the instantaneous speed table built for the opponent is modified; that is, if our kick fails, a penalty value p is applied to the speed table:
$$p = \frac{V_{err}}{n} = \frac{V_{rea} - V_{ave}}{2} \qquad (3)$$
where V_err is the difference between the opponent's true speed and its average speed, and n is the number of sampled instantaneous speeds.
Further, a dynamic programming optimization algorithm is used to reduce the amount of computation:
First, the distance from each agent to the first role position is calculated; the role assignment function y_r is then used to calculate the distances for all possible combinations of agents reaching the first and second positions, and the lowest-cost assignment of each pair of agents to these two positions is stored;
For the k-th agent, the new assignment is built on the assignment of k-1 agents to the positions {p_1, ..., p_{k-1}}: the role assignment function y_r is used to calculate the distances for all possible combinations of agents reaching {p_1, ..., p_{k-1}}, and the lowest-cost assignment of each subset of k-1 agents to {p_1, ..., p_{k-1}} is stored;
The distance from each agent to the k-th position p_k is then added, and the lowest-cost assignment of all agents to these positions is calculated.
Further, the lowest-cost assignment is calculated on the following principle: if any subset has a lower positioning cost, the complete assignment containing it must also have a lower cost.
Further, voting uses a voting system with different weights.
Further, in the voting system, the bytes of the communication message are allocated as follows:
Further, for the dynamic assignment of player roles, the role assignment function y_r is used to achieve optimal positioning:
Assignments are selected in lexicographic order, so that among all possible assignments, the total distance walked by all agents is the shortest;
On the shortest paths, if the paths of two players intersect, a collision can occur; by the triangle inequality, the role assignment function y_r obtains a lower cost by swapping the two players' target positions.
The beneficial effects of the invention are as follows. Supported by the voting communication system, the method performs the selection of the center forward (CF) and the role assignment of the other players in sequence, and updates all player roles synchronously. For the CF's kicking decision, the DOBMP model is used to analyze and decide. For the computational cost of role updates, a dynamic programming function greatly reduces the amount of computation, which speeds up role updates considerably and keeps role switching fluent as the ball position changes.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hierarchical decision-making mechanism optimization process.
Fig. 2 is a schematic diagram of formation selection.
Fig. 3 is a diagram of the overall formation positioning.
Fig. 4 is a schematic diagram illustrating minimum-cost positioning.
Fig. 5 is a flow chart of the decision process in which DOBMP models the opponent's speed for the CF's walk-or-kick decision module.
Fig. 6 shows formation positioning under different play modes.
Fig. 7 is a schematic diagram of role switching during attack.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.
Based on the RoboCup 3D simulation platform, the embodiment designs a multi-robot cooperation method based on hierarchical decision making that achieves effective cooperation across the whole multi-robot team. The strategy rests on three components of the hierarchical decision-making mechanism: the role assignment function, the voting communication system, and the desired behavior prediction model (Desired of Behavior Prediction Model, DOBMP). It performs the selection of the center forward (Center Forward, CF) and the role assignment of all other players in sequence, builds the DOBMP model for the CF's dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation introduced by the role assignment function, which keeps role switching fluent as the ball position changes continuously.
Embodiment
Apollo3D's strategy uses a hierarchical decision-making mechanism (Hierarchical Decision Making, HDM), shown in Fig. 1. Hierarchical decision making means that the players first decide, from the current position of the ball, which formation to adopt for the match; all players then vote to elect the player each considers currently best suited to be the ball-holding CF. The CF is the key to a match: whether it passes the ball or dribbles forward itself is central to the whole team's choice of strategy. Throughout the decision process, the mapping between roles and players is not fixed. At the current moment robot A may be best placed to receive the ball and so be the CF; at the next moment, because an opponent intercepts, A may be unable to dribble through and pass to a teammate instead, after which A changes role according to its position at that moment. The roles and positions of the other players are likewise determined by their own positions at the current moment. Finally, a coordination mechanism provides communication among all players and synchronous role updates: each player sends its own position and the position of the ball through the communication system, so that every player knows the optimal positioning of its teammates. The players thereby reach agreement, which facilitates cooperation. The CF is selected on the following criteria: whether the player has fallen, whether it can see the ball, whether the ball is in front of or behind it, its distance from the ball, whether it is the goalkeeper, and whether it was the CF in the previous decision cycle. Each criterion carries a different weight.
Formation selection
As in human soccer, the RoboCup 3D simulation league defines play modes for different situations, such as kick-off (Kick-Off), goal kick (Goal_Kick), throw-in (Throw_In) and corner kick (Corner_Kick). From a soccer point of view, a team's overall strategy divides into two systems, attack and defense. A player's choice of action depends on which side controls the ball: when our side has the ball the team enters the attack state, and when the opponent has the ball it enters the defense state. The different formations are shown in Fig. 2.
A team formation is defined relative to the position of the ball. Fig. 3 shows the overall positioning of the team when the ball is at the center of the field. The formation divides into an attacking part and a defending part. The attacking role positions are obtained by adding fixed offsets to the coordinates of the ball, and comprise CF, WFL, WFR, SFL, SFR, CAM and FF. The only special case is the CF role: it is always the player nearest the ball, and its target position is the position of the ball itself. A line is drawn from the center of the goal to the ball; the defensive players CDM, CBL and CBR are positioned on this line, with additional offsets relative to the goal line. The position of the goalkeeper GK is largely independent of its teammates, to ensure our goal is not left unguarded; if the GK happens to be the best CF candidate at some moment, another player is assigned the GK role and stands at the center of the goal.
Role assignment function y_r
Once the overall formation is determined, the key step is the dynamic assignment of player roles. The role assignment function y_r is used to achieve optimal positioning: given the external state information as input, the function calculates the best match between players and roles at the current moment. The function must satisfy three requirements:
(1) Nearest position: among all possible assignments, each agent takes the position nearest to it, so that the total distance walked by all agents is the shortest. This requires selecting assignments in lexicographic order.
(2) Collision avoidance: players should avoid colliding with other players as far as possible while moving to their assigned positions.
(3) Dynamic consistency: given a fixed set of target positions, if y_r outputs assignment m at time T, it continues to output m while the players move toward their target positions.
With n players there are n! possible assignments. The external state information is given, in particular the positions of the n players and the n target positions. Each assignment is represented by an n-tuple of costs sorted in descending order. The n! feasible assignments obtained in this way are compared by the lexicographic order of their cost tuples, as shown in Fig. 4 and Table 1.
The costs of the various assignments, sorted lexicographically:
Table 1: Ordering of positioning costs
Lexicographic ordering makes it easy to obtain the minimum called for by requirement (1). If the paths of two players intersect, a collision can occur; by the triangle inequality, y_r obtains a lower cost by swapping their target positions.
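The lexicographic comparison described above can be illustrated with a brute-force sketch. It assumes that the cost of an assignment is the tuple of straight-line walking distances sorted in descending order; the function names and the sample coordinates are hypothetical and do not come from the patent.

```python
from itertools import permutations
from math import hypot

def assignment_cost(players, targets, perm):
    """Distances walked under one assignment, sorted in descending order.

    perm[i] is the index of the target given to player i.
    """
    return tuple(sorted(
        (hypot(px - targets[j][0], py - targets[j][1])
         for (px, py), j in zip(players, perm)),
        reverse=True))

def best_assignment(players, targets):
    """Enumerate all n! assignments and keep the lexicographically smallest cost tuple."""
    return min(permutations(range(len(targets))),
               key=lambda perm: assignment_cost(players, targets, perm))

# Hypothetical example: two players and two target positions on a line.
players = [(0.0, 0.0), (4.0, 0.0)]
targets = [(5.0, 0.0), (1.0, 0.0)]
best = best_assignment(players, targets)
```

Python compares tuples lexicographically, so `min` with this key selects the assignment whose longest walk is shortest, breaking ties on the next longest walk. In the example, sending each player to its near target costs (1.0, 1.0), which beats the crossing assignment (5.0, 3.0).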
Voting communication system
For all players of the team to reach their respective target positions accurately, they must agree on, and have no doubt about, the positioning each is executing. If every player knew the exact positions of the ball and of its teammates on the field, no coordination between players would be needed, because each could independently calculate the optimal positioning. The problem is that each player's field of view is limited to 120° and the perception information it receives is noisy, so observed objects carry errors in both distance and angle and accurate position information cannot be obtained. Fortunately, Simspark allows agents to communicate with one another: players can exchange messages every two simulation cycles (40 ms). The bandwidth of this channel is limited, however: only one player's message can be sent at a time, and each message is limited to 20 bytes.
The 3D simulation environment provides a so-called audio system: every two cycles (40 ms) a robot can broadcast ('say') a message, and the other robots receive ('hear') it in the next simulation cycle. A receiver cannot tell which agent a message came from, so the sender's player number must be included in the message. Every message sent or received is limited to 20 bytes of ASCII characters, and some ASCII characters are not allowed. To compress the data, Apollo3D divides the field into a 5000 × 5000 grid and encodes messages with 83 characters in the range '*' to '~', so that a 20-character message can take 83^20 distinct values.
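As a rough illustration of why an 83-character alphabet suits the 5000 × 5000 grid, the sketch below packs one grid cell into four characters, since 83^4 = 47,458,321 exceeds the 25,000,000 cells. The exact character set and message layout used by Apollo3D are not given in the text; the alphabet chosen here (the first 83 characters starting at '*') and the function names are assumptions.

```python
# Assumed alphabet: the first 83 characters starting at '*' (the source does not list them).
ALPHABET = "".join(chr(c) for c in range(ord("*"), ord("*") + 83))
GRID = 5000  # the field is divided into GRID x GRID cells

def encode_cell(ix, iy, width=4):
    """Pack a grid cell (ix, iy) into `width` base-83 characters."""
    if not (0 <= ix < GRID and 0 <= iy < GRID):
        raise ValueError("cell index outside the grid")
    value = ix * GRID + iy
    chars = []
    for _ in range(width):
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def decode_cell(text):
    """Inverse of encode_cell: recover (ix, iy) from the base-83 string."""
    value = 0
    for ch in text:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return divmod(value, GRID)

message = encode_cell(1234, 4321)
```

Under these assumptions a position costs 4 of the 20 available bytes, which leaves room for the player number, the ball position and the voting fields.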
The allocation of the message bytes is shown in Table 2. Bytes 14-18 deserve particular attention, as they form the basis of Apollo3D's hierarchical decision-making system. In addition, Apollo3D encrypts the messages each player 'says' and decrypts those it 'hears', to protect our communication and provide some resistance to interference.
Table 2: Allocation of communication message bytes
It must be stressed that relying only on the positioning information most recently received over the channel is inadvisable. Because of noise, players falling over, accumulated self-localization error during a match, and messages from the server being lost or delayed, the information a player receives can be quite inaccurate. A voting system with different weights is therefore used, so that the whole team adopts one consistent positioning even when some messages are lost or some players send erroneous data.
Specifically, in a match the ball carrier has the most important task and is called the CF. The CF is selected on the following criteria: whether the player has fallen, whether it can see the ball, whether the ball is in front of or behind it, its distance from the ball, whether it is the goalkeeper, and whether it was the CF in the previous decision cycle. Each criterion carries a different weight, expressed as a probability in the interval (0, 1).
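A weighted vote of this kind might look like the following sketch. The text states only that each criterion has its own weight in (0, 1); the weight values, the way distance enters the score, and the tie-breaking rule below are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical weights in (0, 1); the source does not give the actual values.
WEIGHTS = {"standing": 0.9, "sees_ball": 0.6, "ball_in_front": 0.5,
           "not_goalkeeper": 0.4, "was_cf": 0.3}

def cf_score(state):
    """Sum the weights of the satisfied criteria plus a term that favors short ball distance."""
    score = sum(w for key, w in WEIGHTS.items() if state[key])
    return score + 1.0 / (1.0 + state["ball_distance"])

def cast_vote(beliefs):
    """One voter's choice; beliefs maps player number -> state as this voter estimates it."""
    return max(sorted(beliefs), key=lambda num: cf_score(beliefs[num]))

def elect_cf(votes):
    """Majority vote over the received ballots; ties go to the lowest player number."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(num for num, c in counts.items() if c == top)

beliefs = {
    2: {"standing": False, "sees_ball": True, "ball_in_front": True,
        "not_goalkeeper": True, "was_cf": True, "ball_distance": 0.5},
    7: {"standing": True, "sees_ball": True, "ball_in_front": True,
        "not_goalkeeper": True, "was_cf": False, "ball_distance": 1.0},
}
my_vote = cast_vote(beliefs)
```

Because every player tallies the same ballots with the same deterministic rule, the team can still agree on one CF when a few ballots are missing or wrong, which is the purpose the text gives for voting.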
Desired behavior prediction model
In MAS, predicting the behavior of other agents is a highly challenging research topic. In theory, a single agent can observe other agents' behavior directly and build a fixed behavior model, but such a model can only be built when there are many repeated interactions between the agents. In the RoboCup 3D simulation league, an opponent's behavior cannot be predicted by simple observation, and in a match that changes in real time there are rarely enough interactions to build a useful model.
The embodiment designs a desired behavior prediction model, DOBMP, to predict the best behavior of a single agent under given conditions. DOBMP makes assumptions, based on theoretical analysis, about what the other agents will do, and describes their expected behavior by analyzing the error in their best behavior. The DOBMP model can be used to decide when to shoot, when to pass and when best to hold the ball. The embodiment uses DOBMP to model the opponent's speed for the CF's walk-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it there. The flow chart of the whole decision is shown in Fig. 5.
During a match, the opponent's walking speed is first sampled over several cycles, and its instantaneous speed V_i is calculated:
$$V_i = \frac{\sqrt{(y_c - y_b)^2 + (x_c - x_b)^2}}{\Delta t} \qquad (1)$$
where (x_b, y_b) is the opponent's position at the previous sampling instant and (x_c, y_c) is its position at the current instant. The opponent's average speed is obtained as the harmonic mean:
$$V_{ave} = \frac{1}{\frac{1}{n}\sum_{i=1}^{n}\frac{1}{V_i}} = \frac{n}{\frac{1}{V_1} + \frac{1}{V_2} + \dots + \frac{1}{V_n}} \qquad (2)$$
From the opponent's average speed and current position, the time T the opponent needs to reach the ball can be calculated. The time our player needs to execute the kick is also known, so a threshold can be set to predict whether our robot can successfully kick the ball to the target point. Assume the opponent can block our kick within time t; the smaller the value of T-t, the more likely our side is to complete the kick successfully. When the value of T-t is less than the set threshold, the kick is considered achievable, and the ball is kicked to the target point. If the opponent still manages to block our kick after the decision has been made, the predicted average speed of the opponent was inaccurate, and the instantaneous speed table built for the opponent should be modified. That is, if our kick fails, a penalty value p is applied to the speed table:
$$p = \frac{V_{err}}{n} = \frac{V_{rea} - V_{ave}}{2} \qquad (3)$$
where V_err is the difference between the opponent's true speed and its average speed, and n is the number of sampled instantaneous speeds. The true speed is obtained from the time the opponent takes to walk from its initial position to its final position (the position of the ball) and the distance between the two positions.
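Equations (1) and (2) and the threshold rule can be put together in a short sketch. The decision rule is coded exactly as the text states it (kick when T - t is below the threshold); the function names, the sample positions and the numeric values of t and the threshold are assumptions for illustration only.

```python
from math import hypot

def instantaneous_speed(prev, curr, dt):
    """Equation (1): displacement between two samples divided by the sampling interval."""
    return hypot(curr[0] - prev[0], curr[1] - prev[1]) / dt

def harmonic_mean_speed(speeds):
    """Equation (2): harmonic mean of the sampled speeds (each must be positive)."""
    return len(speeds) / sum(1.0 / v for v in speeds)

def should_kick(opponent_pos, ball_pos, v_ave, t, threshold):
    """Rule as stated in the text: kick when T - t is less than the threshold.

    T is the time the opponent needs to reach the ball at its average speed;
    t is the time within which the opponent is assumed able to block the kick.
    """
    T = hypot(ball_pos[0] - opponent_pos[0], ball_pos[1] - opponent_pos[1]) / v_ave
    return (T - t) < threshold

# Hypothetical opponent track sampled once per second.
track = [(0.0, 0.0), (0.3, 0.4), (0.9, 1.2)]
speeds = [instantaneous_speed(a, b, 1.0) for a, b in zip(track, track[1:])]
v_ave = harmonic_mean_speed(speeds)
decision = should_kick((0.0, 0.0), (3.0, 4.0), 1.0, t=4.0, threshold=1.5)
```

The harmonic mean weights slow samples more heavily than the arithmetic mean would, so a brief burst of speed does not dominate the estimate; here speeds of 0.5 and 1.0 give an average of about 0.667 rather than 0.75.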
Dynamic programming optimization method based on hierarchical decision making
The four modules of the hierarchical decision-making mechanism have been described above. Building on the voting communication mechanism and the desired behavior model, they yield the role assignment and kicking process by which the 11 roles of a team are assigned to its 11 robots. The goalkeeper always guards the goal and the CF is always the player nearest the ball; the remaining nine role positions are determined by the function y_r. If the goalkeeper happens to be the player nearest the ball, that is, when the GK is the CF, y_r must evaluate 10! = 3,628,800 different assignments, calculate the cost of each, and select the optimal cost in lexicographic order. All of this must be completed within one 0.02 s simulation cycle, which is why a dynamic programming function (Dynamic Planning Function) optimization algorithm is used to reduce the amount of computation.
Let A and P denote the sets of n agents and n positions, and let the assignment be m := y_r(A, P). If any subset has a lower positioning cost, the complete assignment containing it must also have a lower cost. For the k-th agent, the new assignment is built on the assignment of k-1 agents to the positions {p_1, ..., p_{k-1}}. Table 3 shows the dynamic programming process for three robots as an example. First the distance from each of the three agents to the first role position is calculated. The role assignment function y_r is then used to calculate the distances for all possible combinations of agents reaching the first two positions, and the lowest-cost assignment of each pair of agents to these two positions is stored. Finally the distance from each agent to the third position is added, and the lowest-cost assignment of all agents to the three positions is calculated.
Table 3: Positioning assignment scheme for three robots
For n agents, the dynamic programming runs for n iterations, and the number of evaluations per agent is given by a binomial sum of order n-1:
$$\sum_{k=1}^{n}\binom{n-1}{k-1} = \sum_{k=0}^{n-1}\binom{n-1}{k} = 2^{n-1} \qquad (4)$$
When 11 agents take part in a match, the 10 agents other than the goalkeeper take part in role assignment. With the dynamic programming optimization the amount of computation is n·2^{n-1} = 10 × 2^9 = 5120, whereas the unoptimized role assignment algorithm requires 10! = 3,628,800 evaluations. The amount of computation is greatly reduced, and with it the time cost of role switching.
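The subset-based dynamic programming can be sketched as follows. This is a minimal illustration that assumes the descending-sorted distance tuple as the cost and relies on the stated property that a cheaper partial assignment never leads to a costlier complete one; the bitmask representation, the function name and the returned evaluation counter are choices made here, not details from the patent.

```python
from math import hypot

def dp_assignment(agents, positions):
    """Assign agents to positions p_1..p_n with dynamic programming over agent subsets.

    Returns (cost tuple, agent index for each position in order, number of evaluations).
    """
    n = len(agents)
    dist = [[hypot(ax - px, ay - py) for (px, py) in positions] for (ax, ay) in agents]
    # bitmask of agents already used -> (sorted cost tuple, agents in position order)
    best = {0: ((), ())}
    evaluations = 0
    for k in range(n):  # extend every stored subset by one agent sent to position k
        nxt = {}
        for mask, (cost, order) in best.items():
            for a in range(n):
                if mask & (1 << a):
                    continue  # agent a already has a position in this subset
                evaluations += 1
                cand_cost = tuple(sorted(cost + (dist[a][k],), reverse=True))
                key = mask | (1 << a)
                if key not in nxt or cand_cost < nxt[key][0]:
                    nxt[key] = (cand_cost, order + (a,))
        best = nxt
    cost, order = best[(1 << n) - 1]
    return cost, order, evaluations

# Hypothetical three-robot example in the spirit of Table 3.
agents = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
positions = [(5.0, 0.0), (1.0, 0.0), (2.0, 2.0)]
cost, order, evaluations = dp_assignment(agents, positions)
```

Only the best partial assignment is kept for each subset of agents, so the loop performs n·2^{n-1} extensions in total: 12 for three agents and 5120 for ten, compared with 10! = 3,628,800 complete assignments under exhaustive enumeration.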
Experimental verification
In all experiments, the DrawAnnotation function of Roboviz is used to display each of our players' current role name above its head. The meaning of each role is explained in Fig. 5. Our team, Apollo3D, is the blue robots, and the red robots are the opponents.
Experiment 1: formation positioning under different play modes
This experiment examines formation positioning under the different play modes of the RoboCup 3D simulation league. Fig. 6 shows the formation positioning under different play modes: (a) positioning of both sides before kick-off; (b) positioning for our left corner kick; (c) positioning for our goal kick; (d) positioning for our corner kick in the penalty area. CF, WFL, WFR, SFL, SFR, CAM and FF are the attacking roles of the team. WFL, WFR, SFL, SFR and CAM follow closely behind the CF in a rectangle, standing at its four corners and center and facing the ball as far as possible, so that each player is as close to the ball as the formation allows. The FF always stands in front of the opponent's penalty area. Repeated experiments show that the CF's shot may be intercepted by the opponent or be off-angle; in that case the FF can quickly take a favorable position and become the CF at the next moment, which is very effective at following up the shot. CDM, CBL, CBR and GK are responsible for defending our own half. If our side is attacking, the CDM moves up to midfield, to guard against an opponent counterattack or to support our players in mounting a counterattack of our own.
Experiment 2: role switching and DOBMP model verification
Fig. 7 shows positioning and role switching in attack. In panel (a), player No. 2 is the CF. Blocked by an opposing player, it falls, and its role switches quickly to CAM; player No. 7 is now facing the ball and nearest to it, so its role switches quickly to CF, as shown in panels (b) and (c). When player No. 7 is also blocked by an opponent, it applies the DOBMP model and decides to kick the ball to the target point, which is the position of player No. 3; No. 3 switches to the CF role, while No. 2 and No. 7 switch to SFL and CAM respectively, as shown in panel (d). The Simspark match platform specifies that when more than two players of the same side are within a circle of radius 1 m centered on the ball, the players farther from the ball are automatically moved away. Player No. 2 is therefore moved away by the platform when it approaches the ball; at the same time No. 5 switches to CF and No. 3 switches to SFL, as shown in panel (e).
The hierarchical decision-making mechanism based on role assignment thus performs, supported by the voting communication system, the selection of the center forward (CF) and the role assignment of the other players in sequence, and updates all player roles synchronously. For the CF's kicking decision, the DOBMP model is used to analyze and decide. For the computational cost of role updates, a dynamic programming function greatly reduces the amount of computation, which speeds up role updates considerably and keeps role switching fluent as the ball position changes.

Claims (8)

1. A multi-robot cooperation method based on a hierarchical decision-making mechanism, characterized in that:
the players select a formation according to the position of the ball in order to respond to the match;
all players then vote for the player that each considers to be the best ball-holding forward at that moment, after which the other roles are assigned;
each player determines whether it is the ball-holding forward; if it is, it runs to the ball and dribbles, and an ideal behavior prediction model is used to model the opponent's speed mathematically for the ball-holding forward's walk-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it to the target point;
if it is not the ball-holding forward, then after the other roles have been assigned it runs to its position point and carries out formation selection.
2. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 1, characterized in that using the ideal behavior prediction model to model the opponent's speed mathematically for the ball-holding forward's walk-or-kick decision module is specifically:
the time T that the opponent needs to reach the ball position is calculated from the opponent's average speed and its current position; the time that our player takes to perform the kicking action is also known, and a threshold is set in order to predict whether our robot will succeed in kicking the ball to the target point;
assuming that the opponent can stop our kick within time t, the smaller the value of T-t, the greater the likelihood that our side completes the kick successfully;
when the value of T-t is less than the set threshold, the kick is considered able to succeed, and the ball is then kicked to the target point.
3. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 2, characterized in that: if the opponent can still stop our kick after the decision has been made, the established table of the opponent's instantaneous speeds is changed; that is, if our side has failed the kick, a penalty value p is applied to the speed table:
$$p = \frac{V_{err}}{n} = \frac{V_{rea} - V_{ave}}{2} \qquad (3)$$
where $V_{err}$ is the difference between the opponent's true speed $V_{rea}$ and its average speed $V_{ave}$, and n is the number of sampled instantaneous speeds.
4. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in any one of claims 1 to 3, characterized in that a dynamic programming function optimization algorithm is used to reduce the amount of computation:
first, the distance from each agent to the first role position is calculated; the role assignment function yr is then used to calculate the distance values of all possible combinations of agents reaching the first and second positions, and the lowest-cost positioning combination is saved for every pair of agents reaching these two positions;
for the k-th agent, a new positioning is built on the k-1 agents reaching the positions $\{p_1, \ldots, p_{k-1}\}$: the role assignment function yr is used to calculate the distance values of all possible combinations of agents reaching the positions $\{p_1, \ldots, p_{k-1}\}$, and the lowest-cost positioning combination is saved for every pair of agents reaching the positions $\{p_1, \ldots, p_{k-1}\}$;
the distance from each agent to the position $p_k$ is then assigned, and the lowest-cost positioning combination of all agents reaching these three different positions is calculated.
5. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 4, characterized in that, when the lowest-cost positioning combination is calculated: if any subset has a lower positioning cost, the cost of the complete positioning that contains that subset positioning must also be lower.
6. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 5, characterized in that voting is carried out using a voting system with different weights.
7. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 6, characterized in that, in the voting system, the bytes of the communication message are allocated as follows:
8. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 7, characterized in that, for the dynamic assignment of player roles, the role assignment function yr is used to achieve the best positioning:
the selection is made in lexicographic order: among all possible positionings of the agents, the one in which the sum of the distances walked by all agents is the shortest is chosen;
on the shortest paths, a collision can occur when the paths of two players intersect; by the triangle inequality, the role assignment function yr obtains a lower cost by exchanging the target positions of the two players.
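
The exchange of target positions in claim 8 can be illustrated with a small geometric sketch. It is not the patent's role assignment function: the names `paths_cross` and `resolve_crossing` are hypothetical, paths are assumed to be straight segments from each player to its target, and only proper crossings (not touching endpoints or collinear overlaps) are detected.

```python
import math


def _orient(a, b, c):
    """Twice the signed area of triangle a, b, c (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def paths_cross(a1, t1, a2, t2):
    """True if segment a1->t1 properly intersects segment a2->t2."""
    d1, d2 = _orient(a1, t1, a2), _orient(a1, t1, t2)
    d3, d4 = _orient(a2, t2, a1), _orient(a2, t2, t1)
    return d1 * d2 < 0 and d3 * d4 < 0


def resolve_crossing(a1, t1, a2, t2):
    """Return the (possibly exchanged) targets for player 1 and player 2."""
    if paths_cross(a1, t1, a2, t2):
        return t2, t1  # exchanging targets removes the crossing
    return t1, t2


# Two players whose straight paths form an X.
p1, p2 = (0.0, 0.0), (0.0, 2.0)
g1, g2 = (2.0, 2.0), (2.0, 0.0)
new_g1, new_g2 = resolve_crossing(p1, g1, p2, g2)
before = math.dist(p1, g1) + math.dist(p2, g2)
after = math.dist(p1, new_g1) + math.dist(p2, new_g2)
```

When two straight paths cross, the crossing point splits each path in two, and applying the triangle inequality to the two triangles formed shows that the exchanged paths are shorter in total, which is why `after` is smaller than `before` here.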
CN201410274560.6A 2014-06-18 2014-06-18 Multi-Robotics Cooperation Method based on hierarchical decision making mechanism Expired - Fee Related CN104063541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410274560.6A CN104063541B (en) 2014-06-18 2014-06-18 Multi-Robotics Cooperation Method based on hierarchical decision making mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410274560.6A CN104063541B (en) 2014-06-18 2014-06-18 Multi-Robotics Cooperation Method based on hierarchical decision making mechanism

Publications (2)

Publication Number Publication Date
CN104063541A true CN104063541A (en) 2014-09-24
CN104063541B CN104063541B (en) 2017-12-01

Family

ID=51551254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410274560.6A Expired - Fee Related CN104063541B (en) 2014-06-18 2014-06-18 Multi-Robotics Cooperation Method based on hierarchical decision making mechanism

Country Status (1)

Country Link
CN (1) CN104063541B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109254584A (en) * 2018-09-13 2019-01-22 鲁东大学 Role allocating method, device, computer equipment and storage medium based on multiple agent
CN109794937A (en) * 2019-01-29 2019-05-24 南京邮电大学 A kind of Soccer robot collaboration method based on intensified learning
CN110377048A (en) * 2019-06-26 2019-10-25 沈阳航空航天大学 A kind of unmanned aerial vehicle group defensive disposition method based on genetic algorithm
CN110520813A (en) * 2017-05-05 2019-11-29 赫尔实验室有限公司 It is mobile throughout the multiple agent confrontation type of label formation using the transformation of RADON cumulative distribution and canonical correlation analysis prediction
CN111954564A (en) * 2018-01-21 2020-11-17 斯塔特斯公司 Method and system for interactive, exposable and improved game and player performance prediction in team sports
CN112221160A (en) * 2020-10-22 2021-01-15 厦门渊亭信息科技有限公司 Role distribution system based on random game
CN113001545A (en) * 2021-03-01 2021-06-22 北方工业大学 Robot control method and device and robot
CN113799143A (en) * 2021-11-18 2021-12-17 广东隆崎机器人有限公司 Safe cooperation method and device of multiple robots in working area
CN114986503A (en) * 2022-05-31 2022-09-02 江苏经贸职业技术学院 Multi-robot cooperative motion method and football robot

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1617170A (en) * 2003-09-19 2005-05-18 索尼株式会社 Environment identification device and method, route design device and method and robot
CN103292804A (en) * 2013-05-27 2013-09-11 浙江大学 Monocular natural vision landmark assisted mobile robot positioning method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
罗真: "对抗性环境下多机器人协作关键技术的研究", 《中国博士学位论文全文数据库》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110520813A (en) * 2017-05-05 2019-11-29 赫尔实验室有限公司 It is mobile throughout the multiple agent confrontation type of label formation using the transformation of RADON cumulative distribution and canonical correlation analysis prediction
CN110520813B (en) * 2017-05-05 2022-06-24 赫尔实验室有限公司 System, computer-implemented method, and storage medium for predicting multi-agent movement
CN111954564A (en) * 2018-01-21 2020-11-17 斯塔特斯公司 Method and system for interactive, exposable and improved game and player performance prediction in team sports
CN109254584A (en) * 2018-09-13 2019-01-22 鲁东大学 Role allocating method, device, computer equipment and storage medium based on multiple agent
CN109254584B (en) * 2018-09-13 2021-08-17 鲁东大学 Role distribution method and device based on multiple intelligent agents, computer equipment and storage medium
CN109794937A (en) * 2019-01-29 2019-05-24 南京邮电大学 A kind of Soccer robot collaboration method based on intensified learning
CN109794937B (en) * 2019-01-29 2021-10-01 南京邮电大学 Football robot cooperation method based on reinforcement learning
CN110377048A (en) * 2019-06-26 2019-10-25 沈阳航空航天大学 A kind of unmanned aerial vehicle group defensive disposition method based on genetic algorithm
CN112221160B (en) * 2020-10-22 2022-05-17 厦门渊亭信息科技有限公司 Role distribution system based on random game
CN112221160A (en) * 2020-10-22 2021-01-15 厦门渊亭信息科技有限公司 Role distribution system based on random game
CN113001545A (en) * 2021-03-01 2021-06-22 北方工业大学 Robot control method and device and robot
CN113799143B (en) * 2021-11-18 2022-04-19 广东隆崎机器人有限公司 Safe cooperation method and device of multiple robots in working area
CN113799143A (en) * 2021-11-18 2021-12-17 广东隆崎机器人有限公司 Safe cooperation method and device of multiple robots in working area
CN114986503A (en) * 2022-05-31 2022-09-02 江苏经贸职业技术学院 Multi-robot cooperative motion method and football robot

Also Published As

Publication number Publication date
CN104063541B (en) 2017-12-01

Similar Documents

Publication Publication Date Title
CN104063541A (en) Hierarchical decision making mechanism-based multirobot cooperation method
Zhang et al. Efficient communication in multi-agent reinforcement learning via variance based control
Zhang et al. Game of drones: Multi-UAV pursuit-evasion game with online motion planning by deep reinforcement learning
CN112269396B (en) Unmanned aerial vehicle cluster cooperative confrontation control method for eagle pigeon-imitated intelligent game
Xia et al. Cooperative task assignment and track planning for multi-UAV attack mobile targets
CN105119733B (en) Artificial intelligence system and its state transition method, server, communication system
Chen et al. Research on the approach of task decomposition in soccer robot system
CN112034714A (en) Grouping time-varying formation enclosure tracking control method and system
CN104090573A (en) Robot soccer dynamic decision-making device and method based on ant colony algorithm
CN102065446B (en) Topology control system and method orienting group mobile environment
CN111157002B (en) Aircraft 3D path planning method based on multi-agent evolutionary algorithm
Shi et al. Research on self-adaptive decision-making mechanism for competition strategies in robot soccer
CN109397294A (en) A kind of robot cooperated localization method based on BA-ABC converged communication algorithm
Zhang et al. Multi-robot cooperation strategy in game environment using deep reinforcement learning
Li et al. Cooperative multi-agent reinforcement learning with hierarchical relation graph under partial observability
CN110427046A (en) A kind of three-dimensional smooth random walk unmanned aerial vehicle group mobility model
Haobin et al. Robot soccer confrontation decision-making technology based on MOGM: Multi-objective game model
Kiourt et al. Social reinforcement learning in game playing
Xuanyu et al. Multi-robot collaboration based on Markov decision process in Robocup3D soccer simulation game
Jiang et al. A specified-time multi-agent hunting scheme with fairness consideration
Lu et al. 3D humanoid robot multi-gait switching and optimization
Cui et al. Role allocation tactics of soccer robots on RoboCup3D simulation platform
Zhan et al. [Retracted] Cooperation Mode of Soccer Robot Game Based on Improved SARSA Algorithm
Yang et al. Fuzzy theory based single belief state generation for partially observable real-time strategy games
Verhoeven Team Behavior of Artificial Intelligence Bots in Games

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171201

CF01 Termination of patent right due to non-payment of annual fee