CN104063541A - Hierarchical decision making mechanism-based multirobot cooperation method - Google Patents
Hierarchical decision making mechanism-based multirobot cooperation method
- Publication number
- CN104063541A (application CN201410274560.6A)
- Authority
- CN
- China
- Prior art keywords
- role
- opponent
- decision making
- ball
- football
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Toys (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a multi-robot cooperation method based on a hierarchical decision making mechanism. The players first select a formation for the match according to the position of the ball; all players then vote for the player they currently consider the best center forward (CF), and roles are assigned to the remaining players. Each player determines whether it is the CF. If it is, it walks to the ball and dribbles, and a desired of behavior prediction model (DOBPM) mathematically models the opponent's speed for the CF's walk-or-kick decision module. Players that are not the CF receive their role assignments, walk to their position points, and carry out formation selection. The method thus realizes, in sequence, the selection of the CF and the role assignment of all other players, builds the DOBPM for the CF dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation caused by the role assignment function, so that role rotation remains fluent as the position of the ball changes continuously.
Description
Technical field
The present invention relates to a multi-robot cooperation method based on a hierarchical decision making mechanism.
Background art
FIRA (Federation of International Robot-soccer Association) and RoboCup are currently the two most influential robot soccer world cup competitions. The biggest difference between them is that FIRA allows a team to use traditional centralized control, which is equivalent to all teammates being controlled by a single brain, whereas RoboCup requires distributed control, which is equivalent to each team member having its own brain and therefore being an independent agent. This calls for in-depth study of multi-agent systems (MAS), in which multiple agents cooperate and compete according to a plan to accomplish a given task, and in which evolutionary algorithms and swarm intelligence are used to achieve a breakthrough in overall performance.
In the RoboCup 3D simulation league, a match cannot be won by individual ability alone; all team members must coordinate and cooperate. The league mainly examines how multiple agents achieve efficient cooperation and tenacious competition in a complex dynamic environment. The number of players per team in the RoboCup 3D simulation environment grew from 6 agents in 2010 to 9 in 2011 and to 11 at present, which places higher demands on multi-agent cooperation.
In recent years the coordination mechanism of multi-robot teams has been explored to varying degrees both in China and abroad. For example, FC Portugal (Portugal) addresses the player role assignment problem with the Iterated Optimal Assignment (IOA) method, which seeks a bounded optimum on the basis of the well-known greedy algorithm, combined with a role exchange mechanism. Drawing on human soccer, some researchers propose to establish an imitation learning mechanism that unifies complex human behavior with robot motion; however, the basic framework of imitation learning is not yet well understood and the interaction interface is hard to obtain. The UT Austin Villa team (USA) applies a subtask-set optimization method to design its target framework and uses a dynamic role assignment algorithm to coordinate the positioning of the whole team. The BoldHearts team (UK) uses a coalition algorithm, which aims to build a strong coalition team that meets the demands of the external environment and can optimize its action parameters algorithmically; it also adopts the gradient-free Infotaxis decision search algorithm, which locally maximizes the rate of information gain. The RoboCanes team (USA) uses a spatio-temporal model matching method to build motion models and their internal states, draws on the walking engine of the German B-Human team, and optimizes the parameter configurations of different behaviors with a genetic algorithm and the SARSA learning algorithm.
All of the above methods require some optimization mechanism and learning method, and for the role assignment problem their computational cost is high and their update speed is slow. These are problems that need to be addressed and solved in multi-robot cooperation.
Summary of the invention
The object of the invention is to provide a multi-robot cooperation method based on a hierarchical decision making mechanism that achieves effective cooperation across the whole multi-robot team. The method realizes, in sequence, the selection of the center forward (the ball holder) and the role assignment of all other players, establishes the DOBPM model for the center forward's dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation caused by the role assignment function, ensuring fluent role rotation as the position of the ball changes continuously.
The technical solution of the invention is:
A multi-robot cooperation method based on a hierarchical decision making mechanism, in which:
The players select a formation for the match according to the position of the ball;
All players then vote for the player they currently consider the best center forward, after which the other roles are assigned;
Each player determines whether it is the center forward. If it is, it runs to the ball and dribbles, and the ideal behavior prediction model mathematically models the opponent's speed for the center forward's walk-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it there;
If it is not the center forward, then after the other roles have been assigned it runs to its position point and carries out formation selection.
Further, using the ideal behavior prediction model to mathematically model the opponent's speed for the center forward's walk-or-kick decision module specifically comprises:
From the opponent's average speed and current position, the time T the opponent needs to reach the ball is calculated; the time our player needs to perform the kick is also known, and a threshold is set to predict whether our robot can successfully kick the ball to the target point;
Suppose the opponent can block our kick within time t; the smaller T-t is, the more likely our side is to complete the kick successfully;
When T-t is less than the set threshold, the kick is considered able to succeed, and the ball is kicked to the target point.
Further, if the opponent still blocks our kick after the decision has been made, the instantaneous speed table built for the opponent is modified; that is, if our side fails the kick, a penalty value p is applied to the speed table:
where V_err is the difference between the opponent's true speed and its average speed, and n is the number of sampled instantaneous speeds.
Further, a dynamic programming optimization algorithm is used to reduce the amount of computation:
First, the distance from each agent to the first role position is calculated; the role assignment function y_r then calculates the distances for all possible combinations of agents reaching the first and second positions, and the lowest-cost assignment of each pair of agents to these two positions is saved;
For k agents, the new assignment is built on the assignments of k-1 agents to positions {p_1, ..., p_(k-1)}: the role assignment function y_r calculates the distances for all possible combinations of agents reaching positions {p_1, ..., p_(k-1)}, and the lowest-cost assignment of each group of agents to positions {p_1, ..., p_(k-1)} is saved;
The distance from each agent to the k-th position p_k is then assigned, and the lowest-cost assignment of all agents to these positions is calculated.
Further, when calculating the lowest-cost assignment: if an assignment has a lower cost on any subset, the cost of the complete assignment containing it must also be lower.
Further, voting is carried out with a weighted voting system in which different conditions carry different weights.
Further, in the voting system, the bytes of the communication message are allocated as follows:
Further, for the dynamic assignment of player roles, the role assignment function y_r is used to achieve the best positioning:
Assignments are selected in lexicographic order, so that among all possible assignments of agents to positions the total distance walked by all agents is the shortest;
On the shortest paths, two players whose paths intersect may collide; by the triangle inequality, the role assignment function y_r obtains a lower cost by exchanging the target positions of the two players.
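The exchange step described above can be illustrated with a short sketch. It is not the patent's implementation: the helper names `segments_cross` and `swap_if_crossing` are hypothetical, and the sketch assumes straight-line paths on a 2D field. When two straight paths properly cross, the triangle inequality guarantees that exchanging the two targets gives a smaller total walking distance.

```python
import math


def segments_cross(p1, q1, p2, q2):
    """Return True if segment p1->q1 properly crosses segment p2->q2."""
    def orient(a, b, c):
        # Sign of the cross product (b - a) x (c - a).
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    d1 = orient(p2, q2, p1)
    d2 = orient(p2, q2, q1)
    d3 = orient(p1, q1, p2)
    d4 = orient(p1, q1, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def swap_if_crossing(a1, t1, a2, t2):
    """Return the targets for players at a1 and a2, exchanged if their paths cross."""
    if segments_cross(a1, t1, a2, t2):
        return t2, t1
    return t1, t2


def total_distance(a1, t1, a2, t2):
    """Sum of the two straight-line walking distances."""
    return math.dist(a1, t1) + math.dist(a2, t2)
```

For example, players at (0, 0) and (0, 2) heading for (2, 2) and (2, 0) cross at (1, 1); after the exchange each walks 2 units instead of about 2.83.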
The beneficial effects of the invention are as follows. Supported by the voting communication system, the method realizes in sequence the selection of the center forward CF and the role assignment of the other players, and updates all player roles synchronously. The CF's kick decision is made by analysis with the DOBPM model. For the computational cost of role updates, dynamic programming greatly reduces the amount of computation, which speeds up role updates considerably and ensures fluent role rotation as the position of the ball changes.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hierarchical decision making mechanism optimization process.
Fig. 2 is a schematic diagram of formation selection.
Fig. 3 shows the overall positioning formation.
Fig. 4 is a schematic diagram illustrating minimum-cost positioning.
Fig. 5 is the decision flowchart in which DOBPM mathematically models the opponent's speed for the CF walk-or-kick decision module.
Fig. 6 shows the formation positioning under different play modes.
Fig. 7 is a schematic diagram of role rotation during attack.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.
The embodiment, built on the RoboCup 3D simulation platform, provides a multi-robot cooperation method based on hierarchical decision making that achieves effective cooperation across the whole multi-robot team. The strategy comprises three aspects under the hierarchical decision making mechanism: the role assignment function, the voting communication system, and the ideal behavior prediction model (Desired of Behavior Prediction Model, DOBPM). It realizes, in sequence, the selection of the center forward (CenterForward, CF) and the role assignment of all other players, establishes the DOBPM model for the CF dribbling decision module, and finally uses a dynamic programming algorithm to reduce the high-dimensional computation caused by the role assignment function, ensuring fluent role rotation as the position of the ball changes continuously.
Example
The Apollo3D strategy adopts a hierarchical decision making mechanism (Hierarchical Decision Making, HDM), as shown in Fig. 1. Under hierarchical decision making, the players first decide from the current position of the ball which formation to adopt for the match; all players then vote for the player they currently consider the best ball holder, the CF. The CF is the key to a soccer match: whether it should pass or dribble forward itself is central to the strategy of the whole team. Throughout the decision process, the mapping between roles and players is not fixed. At the current moment, robot A may be best placed to receive the ball and so be the CF; at the next moment, if an opponent's interception prevents A from breaking through by dribbling, A passes the ball to a teammate, and after the pass A changes role according to the positioning at that moment. The roles and positions of the other players are likewise determined by their own positions at the current moment. Finally, a coordination mechanism provides communication among all players and synchronous role updates: the communication system sends each player's own position and the position of the ball, so that every player knows the best positioning of its teammates. The players thereby reach agreement, which facilitates cooperation among them. The selection of the CF is based on the following: whether the player has fallen, whether it can see the ball, whether the ball is in front of or behind it, its distance from the ball, whether it is the goalkeeper, and whether it was the CF in the previous decision cycle. Each of these conditions carries a different weight.
Formation selection
As in human soccer, the RoboCup 3D simulation league defines play modes for different situations, such as kick-off (Kick-Off), goal kick (Goal_Kick), throw-in (Throw_In) and corner kick (Corner_Kick). From the standpoint of the match, the team's overall strategy divides into two systems, attack and defense. A player's action selection depends on which side controls the ball: when our side controls the ball the team attacks, and when the opponent controls it the team defends. The different formations are shown in Fig. 2.
The team formation depends on the position of the ball. Fig. 3 shows the overall positioning of the team when the ball is at the center of the field. The formation divides into an attacking part and a defending part. The role positions of the attacking part are obtained by adding a certain offset to the coordinates of the ball, and comprise CF, WFL, WFR, SFL, SFR, CAM and FF. The only special role is CF: it is always the player nearest the ball, and its position is set to the position of the ball. A line is drawn from the center of our goal to the ball; the defending players CDM, CBL and CBR are all positioned on this line, with a certain offset measured from the goal line. The position of the goalkeeper GK is essentially unaffected by its teammates, in order to guarantee that our goal is not left unguarded; if GK becomes the best candidate for CF at some moment, another player is assigned the GK role and stands at the center of the goal.
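The positioning rule described above can be sketched as follows. This is only an illustration under stated assumptions: the offset values, the distances along the goal-ball line, the goal position at (-15, 0) and the function name `formation` are all hypothetical and do not come from the patent, which gives no numeric values.

```python
import math

# Hypothetical offsets (metres) added to the ball position for attacking roles.
ATTACK_OFFSETS = {
    "CF": (0.0, 0.0),     # CF stands at the ball itself
    "WFL": (-1.0, 2.0),
    "WFR": (-1.0, -2.0),
    "SFL": (-3.0, 2.0),
    "SFR": (-3.0, -2.0),
    "CAM": (-2.0, 0.0),
    "FF": (4.0, 0.0),
}

# Hypothetical distances (metres) from our goal centre along the goal-ball line.
DEFENCE_DISTANCES = {"CBL": 1.5, "CBR": 2.5, "CDM": 4.0}


def formation(ball, goal_center=(-15.0, 0.0)):
    """Return a dict mapping each of the 11 roles to a target position."""
    positions = {
        role: (ball[0] + dx, ball[1] + dy)
        for role, (dx, dy) in ATTACK_OFFSETS.items()
    }
    gx, gy = goal_center
    length = math.hypot(ball[0] - gx, ball[1] - gy)
    # Unit vector from our goal centre toward the ball.
    if length > 0:
        ux, uy = (ball[0] - gx) / length, (ball[1] - gy) / length
    else:
        ux, uy = 1.0, 0.0
    for role, d in DEFENCE_DISTANCES.items():
        positions[role] = (gx + ux * d, gy + uy * d)
    positions["GK"] = goal_center  # goalkeeper stays at the goal centre
    return positions
```

With the ball at the field center, `formation((0.0, 0.0))` places CF at the ball and the three defenders on the x-axis between the goal and the ball.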
Role assignment function y_r
Once the overall formation has been determined, the key is the dynamic assignment of player roles. The role assignment function y_r is used to achieve the best positioning: given the external state information as input, the function computes the best matching between players and roles at the current moment. The function must satisfy three properties:
(1) Nearest position: among all possible assignments, each agent takes the position nearest to it, so that the total distance walked by all agents is the shortest; this requires selection in lexicographic order.
(2) Obstacle avoidance: players should avoid colliding with other players as far as possible while moving to their assigned positions.
(3) Dynamic consistency: given a set of target positions, if y_r outputs assignment m at time T, then y_r continues to output m while the players move toward their target positions.
With n players there are n! possible assignments. Given the external state information, in particular the positions of the n players and the n target positions, the cost of one assignment is represented by an n-tuple sorted in descending order. The costs of the n! feasible assignments are then compared in lexicographic order, as shown in Fig. 4 and Table 1.
The costs of the assignments sorted lexicographically:
Table 1 Ordering of positioning costs
Lexicographic ordering readily yields the minimum required by property 1. If the paths of two players intersect, a collision may occur; by the triangle inequality, the function y_r obtains a lower cost by exchanging their target positions.
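The lexicographic comparison of assignment costs can be sketched by brute force. This is a minimal illustration, not the patent's y_r: the function names are hypothetical and Euclidean distance is assumed as the cost of one player reaching one position. Each assignment is scored by its distances sorted longest first, and Python's tuple comparison then provides the lexicographic order directly.

```python
import math
from itertools import permutations


def cost_tuple(agents, targets, perm):
    """Distances of one assignment, sorted in descending order.

    perm[i] is the index of the target given to agent i.
    """
    distances = (math.dist(agents[i], targets[j]) for i, j in enumerate(perm))
    return tuple(sorted(distances, reverse=True))


def best_assignment(agents, targets):
    """Try all n! assignments and return the lexicographically smallest one."""
    return min(
        permutations(range(len(targets))),
        key=lambda perm: cost_tuple(agents, targets, perm),
    )
```

For two agents at (0, 0) and (4, 0) with targets (1, 0) and (5, 0), the assignment (0, 1) has cost tuple (1, 1) and beats (1, 0) with cost tuple (5, 3). Enumerating n! assignments is only practical for small n, which motivates the dynamic programming optimization discussed later in the document.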
Voting communication system
For all players to reach their target positions accurately, they must act consistently and carry out role positioning without ambiguity. If each player knew the exact positions of the ball and of its teammates on the field, no coordination among players would be needed, because each could independently compute the best positioning. The problem is that each player has a field of view of only 120° and the perception it receives is noisy, so the objects it sees have errors in both distance and angle, and accurate position information is unavailable. Fortunately, SimSpark allows agents to communicate with each other every 40 ms, but the bandwidth of this channel is limited: only one player can send a message at a time, and the message content is limited to 20 bytes.
The 3D simulation environment provides an audio system that lets each robot 'say' (broadcast) a message every two cycles (40 ms). The other robots 'hear' this message in the next simulation cycle but cannot tell which agent it came from, so the sender's player number must be included in the message. All messages sent and received are limited to 20 bytes of ASCII characters, some of which are not permitted. To compress the data, Apollo3D divides the field into a 5000×5000 grid and encodes with 83 characters between '*' and '~', so that a message can take 83^20 distinct values.
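The grid-based compression can be sketched as a base-83 encoding of a grid cell. The text does not say which 83 characters Apollo3D uses, so the alphabet below (the 83 consecutive ASCII characters starting at '*') is an assumption, as are the function names and the choice to pack both grid indices into one number. Since 83^4 = 47,458,321 exceeds the 25,000,000 cells of a 5000×5000 grid, four characters suffice for one position.

```python
# Assumed alphabet: 83 consecutive ASCII characters starting at '*'.
ALPHABET = "".join(chr(c) for c in range(ord("*"), ord("*") + 83))
GRID = 5000  # the field is divided into GRID x GRID cells


def encode_cell(ix, iy, width=4):
    """Encode grid indices (ix, iy) as a fixed-width base-83 string."""
    if not (0 <= ix < GRID and 0 <= iy < GRID):
        raise ValueError("grid index out of range")
    value = ix * GRID + iy
    chars = []
    for _ in range(width):
        value, remainder = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_cell(text):
    """Inverse of encode_cell: return the grid indices (ix, iy)."""
    value = 0
    for ch in text:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return divmod(value, GRID)
```

Under this layout a player position and the ball position together occupy 8 of the 20 available bytes, leaving room for the player number and the voting fields.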
The allocation of the message bytes is shown in Table 2 below; bytes 14-18 are noteworthy as the basis of the hierarchical decision making system used by Apollo3D. In addition, Apollo3D applies an encryption and decryption strategy to the 'say' messages each player sends and the 'hear' messages it receives, to secure our communication and add some resistance to interference.
Table 2 Allocation of communication message bytes
It must be emphasized that relying only on the positioning information most recently received over the communication channel is inadvisable. Noise, falls, and accumulated self-localization error during the match, as well as lost or delayed messages from the server, make the information a player receives even less accurate. A voting system with different weights is therefore used, so that the whole team adopts a unified positioning even when some messages are lost or a player sends erroneous data.
In the weighted voting system, the dribbler has the heaviest task during play and is called the CF. The selection of the CF is based on the following: whether the player has fallen, whether it can see the ball, whether the ball is in front of or behind it, its distance from the ball, whether it is the goalkeeper, and whether it was the CF in the previous decision cycle. Each of these conditions carries a different weight, expressed as a probability in the interval (0, 1).
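A weighted vote of this kind might look like the sketch below. The patent lists the criteria but not the weights or the way they are combined, so every numeric factor, the multiplicative combination, the tie-break by lowest player number, and the names `PlayerState`, `suitability` and `elect_cf` are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PlayerState:
    number: int
    fallen: bool
    sees_ball: bool
    ball_in_front: bool
    distance_to_ball: float
    is_goalkeeper: bool
    was_cf: bool


def suitability(p):
    """Combine the listed criteria into one weight (all factors hypothetical)."""
    score = 1.0 / (1.0 + p.distance_to_ball)  # nearer to the ball is better
    score *= 0.2 if p.fallen else 1.0
    score *= 1.0 if p.sees_ball else 0.5
    score *= 1.0 if p.ball_in_front else 0.7
    score *= 0.3 if p.is_goalkeeper else 1.0
    score *= 1.0 if p.was_cf else 0.9         # mild preference for the previous CF
    return score


def elect_cf(ballots):
    """Tally (candidate_number, weight) ballots and return the winner.

    Ballots that were lost in transmission are simply absent from the list;
    ties are broken in favour of the lowest player number.
    """
    totals = defaultdict(float)
    for candidate, weight in ballots:
        totals[candidate] += weight
    return min(totals, key=lambda c: (-totals[c], c))
```

Because every player tallies the same received ballots with the same rule, the team converges on one CF even when an individual ballot is missing or wrong, which is the property the text asks of the voting system.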
Ideal behavior prediction model
In MAS, predicting the behavior of other agents is a challenging research topic. In theory, a single agent can directly observe the behavior of other agents and build a fixed behavior model from it, but a model can only be built when there are many repeated interactions between agents. In the RoboCup 3D simulation league, an opponent's behavior cannot be predicted by simple observation, and the real-time course of a match rarely offers enough interactions to build a useful model.
The embodiment provides an ideal behavior prediction model, DOBPM, to predict the best behavior of a single agent under given conditions. DOBPM makes assumptions, based on theoretical analysis, about what other agents will do, and describes their expected behavior by analyzing the error relative to their best behavior. The DOBPM model can be used to decide when to shoot, when to pass, and the best moment to hold the ball. The embodiment uses DOBPM to mathematically model the opponent's speed for the CF walk-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it there. The overall decision flowchart is shown in Fig. 5.
During the match, the opponent's walking speed is first sampled over several cycles and its instantaneous speed V_i is calculated:
where (x_b, y_b) is the opponent's position at the previous sampling moment and (x_c, y_c) is its position at the current moment. The opponent's average speed can then be obtained as the harmonic mean:
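The two formulas are not reproduced in this text, so the following is a sketch under assumptions rather than the patent's exact expressions. It assumes the instantaneous speed is the Euclidean distance between consecutive sampled positions divided by a sampling interval `dt`, and it uses the standard harmonic mean n / (1/V_1 + ... + 1/V_n), skipping zero speeds to avoid division by zero. The function names are hypothetical.

```python
import math


def instantaneous_speeds(positions, dt):
    """Speed between each pair of consecutive sampled positions.

    positions: list of (x, y) samples, oldest first; dt: sampling interval in seconds.
    """
    return [math.dist(a, b) / dt for a, b in zip(positions, positions[1:])]


def harmonic_mean_speed(speeds):
    """Harmonic mean of the positive speeds; 0.0 if the opponent never moved."""
    moving = [v for v in speeds if v > 0]
    if not moving:
        return 0.0
    return len(moving) / sum(1.0 / v for v in moving)
```

For samples (0, 0), (1, 0), (3, 0) taken one second apart, the instantaneous speeds are 1 and 2 and their harmonic mean is 4/3. The harmonic mean weights slow samples more heavily than the arithmetic mean does, so it gives a more conservative estimate of how fast the opponent covers a fixed distance.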
From the opponent's average speed and current position, the time T the opponent needs to reach the ball can be calculated. The time our player needs to perform the kick is also known, so a threshold can be set to predict whether our robot can successfully kick the ball to the target point. Suppose the opponent can block our kick within time t; the smaller T-t is, the more likely our side is to complete the kick successfully. When T-t is less than the set threshold, the kick is considered able to succeed, and the ball is kicked to the target point. If the opponent still blocks our kick after the decision has been made, the predicted average speed of the opponent was inaccurate, and the instantaneous speed table built for the opponent should be modified. That is, if our side fails the kick, a penalty value p is applied to the speed table:
where V_err is the difference between the opponent's true speed and its average speed, and n is the number of sampled instantaneous speeds. The true speed is obtained from the time the opponent takes to run from its initial position to its final position (the position of the ball) and the distance between the two positions.
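The kick-or-dribble rule can be sketched as below. The rule is coded exactly as worded in the text (kick when T - t is below the threshold); the function names, the straight-line distance assumption and the treatment of a stationary opponent are assumptions. The penalty formula for p is not reproduced in the text, so it is deliberately left out.

```python
import math


def opponent_arrival_time(opponent_pos, ball_pos, avg_speed):
    """Time T for the opponent to reach the ball at its average speed."""
    if avg_speed <= 0:
        return math.inf  # assumed: a stationary opponent never arrives
    return math.dist(opponent_pos, ball_pos) / avg_speed


def choose_action(T, t, threshold):
    """Rule as worded in the text: kick when T - t is below the threshold.

    T: opponent's time to reach the ball; t: the time term from the text;
    threshold: tuning value, not specified in the source.
    """
    return "kick" if (T - t) < threshold else "dribble"
```

With an opponent at (3, 4), the ball at the origin and an average speed of 2.5, T is 2.0 s; for t = 1.5 s and a threshold of 1.0 s the rule returns "kick", while a distant opponent with T = 10 s leads to "dribble".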
Dynamic programming optimization method based on hierarchical decision making
The modules of the hierarchical decision making mechanism have each been described above. On the basis of the voting communication mechanism and the ideal behavior model, role assignment allocates the 11 roles of a team to its 11 robots. The goalkeeper always takes the role of guarding the goal and the CF is always the player nearest the ball; the remaining nine role positions are all produced by the role assignment function y_r. If the goalkeeper happens to be the player nearest the ball, that is, GK is the CF, y_r must evaluate 10! = 3,628,800 different assignments, compute the cost of each, and select the best cost in lexicographic order. All of this must be completed within one 0.02 s simulation cycle, which calls for a dynamic programming (Dynamic Planning Function) optimization algorithm to reduce the amount of computation.
Let A and P denote the sets of the n agents and of their positions, and let the assignment be m := y_r(A, P). If an assignment has a lower cost on any subset, then the cost of the complete assignment containing it must also be lower. For k agents, the new assignment is built on the assignments of k-1 agents to positions {p_1, ..., p_(k-1)}. Table 3 shows the dynamic programming process for three robots as an example. First the distances from the three agents to the first role position are calculated; the role assignment function y_r then calculates the distances for all possible combinations of agents reaching the first two positions and saves the lowest-cost assignment of each pair of agents to these two positions. The distance from each agent to the third position is then assigned, and the lowest-cost assignment of all agents to the three positions is calculated.
Table 3 Positioning assignment scheme for three robots
For n agents the dynamic programming runs n iterations, each equivalent to evaluating a binomial expansion of order n-1:
When 11 agents take part in a match, 10 agents (all except the goalkeeper) take part in role assignment. With the dynamic programming optimization the amount of computation is n × 2^(n-1) = 10 × 2^9 = 5120, whereas the unoptimized role assignment algorithm requires 10! = 3,628,800 evaluations. The computation is greatly reduced, and so is the time cost of role switching.
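The subset-based dynamic programming can be sketched as follows. This is an illustration rather than the patent's code: it assumes Euclidean distances, the descending-sorted cost tuple compared lexicographically, and hypothetical function names. For every subset of k agents it stores the best assignment to the first k positions, extending the stored results for subsets of k-1 agents; the counter confirms the n × 2^(n-1) figure given in the text.

```python
import math
from itertools import combinations, permutations


def dp_assignment(agents, targets):
    """Return ((cost_tuple, assignment), evaluations).

    assignment[j] is the index of the agent sent to target j.
    """
    n = len(agents)
    d = [[math.dist(a, t) for t in targets] for a in agents]
    # best[S]: lowest-cost way for the agents in S to fill targets 0..len(S)-1.
    best = {frozenset(): ((), ())}
    evaluations = 0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            s = frozenset(subset)
            candidates = []
            for a in subset:
                # Agent a takes target k-1; the others reuse the stored optimum.
                prev_cost, prev_assign = best[s - {a}]
                evaluations += 1
                cost = tuple(sorted(prev_cost + (d[a][k - 1],), reverse=True))
                candidates.append((cost, prev_assign + (a,)))
            best[s] = min(candidates)
    return best[frozenset(range(n))], evaluations


def brute_force_cost(agents, targets):
    """Lowest cost tuple over all n! assignments, for checking small cases."""
    return min(
        tuple(sorted((math.dist(agents[a], targets[j])
                      for j, a in enumerate(p)), reverse=True))
        for p in permutations(range(len(agents)))
    )
```

For 4 agents the sketch performs 4 × 2^3 = 32 evaluations and returns the same cost tuple as the brute-force search over 24 permutations; for 10 agents the count is 5120 against 3,628,800.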
Experimental verification
All experiments are all to use DrawAnnotation function in Roboviz that we sportsman is overhead located to show in role's title of current time, each role's implication has explanation in Fig. 5, we Apollo3D is blue robot, and red robot is opponent.
Experiment one: the formation occupy-place under different match modes
This experiment is mainly that as shown in picture group 6, Fig. 6 is the formation occupy-place under different match modes for the formation occupy-place situation under different match modes in RoboCup3D emulation match, wherein, (a) kick off before both sides' occupy-place figure; (b) our left side corner-kick occupy-place figure; (c) our croquet occupy-place figure; (d) our forbidden zone corner-kick occupy-place figure.CF, WFL, WFR, SFL, SFR, CAM and FF are responsible for the role of attack in whole troop, wherein WFL, WFR, SFL, SFR and CAM follow CF closely to form after one's death rectangle, stand in respectively rectangular four angles and center, and self is as far as possible towards ball, like this can be in the situation that guaranteeing formation, each sportsman is nearest apart from ball position.FF stands in before opponents' goal forbidden zone all the time, shows through many experiments: the shooting of CF may be tackled by the other side or shooting angle has deviation, and this is that FF can occupy vantage point as soon as possible, and being switched to next CF constantly, to remedy shooting effect splendid; CDM, CBL, CBR and GK role are the defence tasks of bearing oneself half-court, if we when attack state, CDM can occupy position, midfield, this be for prevent opponent strike back or scoop out we sportsman form our counterattack.
Experiment two: role rotation and DOBMP modelling verification
What picture group 7 was described is occupy-place and the role switching of attack part, and in a figure, No. 2 sportsmen are forward CF, owing to being subject to stopping of opposing team, fall down, the role of No. 2 switches to rapidly CAM, now No. 7 sportsmen are towards ball and nearest apart from ball, and its role switches to rapidly CF, as shown in figure b, c; When No. 7 sportsmen are also tackled by opponent, the judgement of application DOBMP model plays ball to i.e. No. 3 positions of Player of impact point, and No. 3 wheels are changed to CF role simultaneously, and No. 2 and No. 7 simultaneous wheels are changed to SFL and CAM, as schemed as shown in d; Due to Simspark match platform specifies: while having 2 Tongfang sportsmen of surpassing in 1 meter of circle of radius centered by ball, can automatically spring open all sportsmen far away apart from ball, so No. 2 sportsmen are automatically springed open by platform during near ball, No. 5 wheels are changed to CF simultaneously, No. 3 wheel is changed to SFL, as shown in figure e.
The hierarchical decision-making mechanism based on role assignment works as follows: supported by the voting communication system, it first selects the forward ball holder CF, then assigns the roles of the other players, and updates all player roles synchronously. For CF's kick decision, the DOBMP model is used to analyze and decide. For the computational cost of role updates, a dynamic programming function greatly reduces the amount of computation, which speeds up role updates considerably and keeps role rotation smooth as the position of the ball changes.
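As a rough illustration of selecting CF by weighted voting, the sketch below tallies one vote per player. The weights, the data layout and the tie-breaking rule are hypothetical; the text states only that a voting system with different weights is used.

```python
from collections import defaultdict

def elect_cf(votes, weights):
    """Pick the forward ball holder (CF) from weighted votes.

    votes:   {voter_number: candidate_number}
    weights: {voter_number: weight}; voters without an entry count as 1.0.
    Both structures are assumptions made for this sketch.
    """
    tally = defaultdict(float)
    for voter, candidate in votes.items():
        tally[candidate] += weights.get(voter, 1.0)
    # Highest weighted total wins. Ties go to the lower player number,
    # so every agent resolves a tie the same way.
    return min(tally, key=lambda c: (-tally[c], c))
```

Because every agent runs the same deterministic tally on the same broadcast votes, all agents arrive at the same CF without a further round of communication.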
Claims (8)
1. A multi-robot cooperation method based on a hierarchical decision-making mechanism, characterized in that:
The players select a formation according to the position of the ball in order to respond to the match;
All players then vote for the forward ball holder that each considers best at that moment, after which the other roles are assigned;
Each player determines whether it is the forward ball holder. If it is, it runs to the ball and dribbles; an ideal behavior prediction model is used to model the opponent's speed mathematically for the forward ball holder's dribble-or-kick decision module, which decides whether to kick the ball to the target point or to dribble it to the target point;
If it is not the forward ball holder, then after the other roles have been assigned it runs to its location point and carries out formation selection.
2. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 1, characterized in that using the ideal behavior prediction model to model the opponent's speed mathematically for the forward ball holder's dribble-or-kick decision module specifically comprises:
Calculating, from the opponent's average velocity and its current position, the time T the opponent needs to reach the ball position; at the same time, taking the time our player needs to execute the kicking action as known, and setting a threshold to predict whether our robot will succeed in kicking the ball to the target point;
Supposing the opponent can block our kick within time t, the smaller the value of T-t, the more likely it is that our side completes the kick successfully;
When the value of T-t is less than the set threshold, the kick is considered able to succeed, and the ball is then kicked to the target point.
3. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 2, characterized in that: if, after the decision has been made, the opponent still blocks our kick, the established table of the opponent's instantaneous velocities is modified; that is, if our side fails the kick, a penalty value p is applied to the velocity table:
where V_err is the difference between the opponent's true velocity and its average velocity, and n is the number of sampled instantaneous velocities.
4. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in any one of claims 1-3, characterized in that a dynamic programming function optimization algorithm is used to reduce the amount of computation:
First, the distance from each agent to the first role position is calculated; the role assignment function yr is then used to calculate the distances for all possible combinations in which agents reach the first and second positions, and the lowest-cost positioning combination with which each pair of agents reaches these two positions is saved;
For the k-th agent, a new positioning is built on the k-1 agents reaching positions {p_1, ..., p_{k-1}}: the role assignment function yr is used to calculate the distances for all possible combinations in which the agents reach positions {p_1, ..., p_{k-1}}, and the lowest-cost positioning combination with which each pair of agents reaches positions {p_1, ..., p_{k-1}} is saved;
Subsequently, the distance from each agent to the k-th position p_k is assigned, and the lowest-cost positioning combination with which all agents reach these three different positions is calculated.
5. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 4, characterized in that, when the lowest-cost positioning combination is calculated: if the positioning of any subset has a lower cost, the complete positioning that contains this subset positioning must also have a lower cost.
6. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 5, characterized in that voting is carried out with a voting system that uses different weights.
7. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 6, characterized in that, in the voting system, the bytes of the communication message are allocated as follows:
8. The multi-robot cooperation method based on a hierarchical decision-making mechanism as claimed in claim 7, characterized in that, in the dynamic assignment of player roles, the role assignment function yr is used to achieve optimal positioning:
Selection is made in lexicographic order: among all possible positioning arrangements of the agents, the one in which the total distance walked by all agents is shortest is chosen;
On the shortest paths, when the paths of two players intersect, a collision may occur; by the triangle inequality, the role assignment function yr obtains a lower cost by exchanging the two players' target positions.
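The dynamic programming idea in claims 4 and 5 (keep the lowest-cost way for each set of k-1 agents to fill positions p_1 to p_{k-1}, then extend it with position p_k) can be sketched as a DP over subsets of agents. This is a minimal illustration, not the patented function yr: the cost used here is plain summed Euclidean distance, and the function name and data layout are assumptions.

```python
import math
from functools import lru_cache

def assign_roles(agents, positions):
    """Return (total_cost, assignment) for equal-length lists of (x, y).

    assignment[i] is the index of the agent sent to positions[i].
    Cost is the summed straight-line distance (an assumption).
    """
    n = len(agents)
    assert n == len(positions)
    dist = [[math.dist(a, p) for p in positions] for a in agents]

    @lru_cache(maxsize=None)
    def best(mask):
        # mask is a bitset of agents; together they fill positions 0..k-1.
        k = bin(mask).count("1")
        if k == 0:
            return 0.0, ()
        options = []
        for a in range(n):
            if mask & (1 << a):
                # Agent a takes position k-1; the remaining agents reuse
                # the saved optimum for the first k-1 positions.
                cost, prev = best(mask & ~(1 << a))
                options.append((cost + dist[a][k - 1], prev + (a,)))
        return min(options)

    return best((1 << n) - 1)
```

Each subset of agents is solved once and reused, so the work grows on the order of n·2^n rather than n! for exhaustive enumeration. With summed distance as the cost, an optimal assignment also never leaves two straight-line paths crossing, since swapping the two targets would shorten the total by the triangle inequality.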
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410274560.6A CN104063541B (en) | 2014-06-18 | 2014-06-18 | Multi-Robotics Cooperation Method based on hierarchical decision making mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104063541A (en) | 2014-09-24 |
CN104063541B (en) | 2017-12-01 |
Family
ID=51551254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410274560.6A Expired - Fee Related CN104063541B (en) | 2014-06-18 | 2014-06-18 | Multi-Robotics Cooperation Method based on hierarchical decision making mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104063541B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1617170A (en) * | 2003-09-19 | 2005-05-18 | 索尼株式会社 | Environment identification device and method, route design device and method and robot |
CN103292804A (en) * | 2013-05-27 | 2013-09-11 | 浙江大学 | Monocular natural vision landmark assisted mobile robot positioning method |
Non-Patent Citations (1)
Title |
---|
罗真: "对抗性环境下多机器人协作关键技术的研究", 《中国博士学位论文全文数据库》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110520813A (en) * | 2017-05-05 | 2019-11-29 | 赫尔实验室有限公司 | It is mobile throughout the multiple agent confrontation type of label formation using the transformation of RADON cumulative distribution and canonical correlation analysis prediction |
CN110520813B (en) * | 2017-05-05 | 2022-06-24 | 赫尔实验室有限公司 | System, computer-implemented method, and storage medium for predicting multi-agent movement |
CN111954564A (en) * | 2018-01-21 | 2020-11-17 | 斯塔特斯公司 | Method and system for interactive, exposable and improved game and player performance prediction in team sports |
CN109254584A (en) * | 2018-09-13 | 2019-01-22 | 鲁东大学 | Role allocating method, device, computer equipment and storage medium based on multiple agent |
CN109254584B (en) * | 2018-09-13 | 2021-08-17 | 鲁东大学 | Role distribution method and device based on multiple intelligent agents, computer equipment and storage medium |
CN109794937A (en) * | 2019-01-29 | 2019-05-24 | 南京邮电大学 | A kind of Soccer robot collaboration method based on intensified learning |
CN109794937B (en) * | 2019-01-29 | 2021-10-01 | 南京邮电大学 | Football robot cooperation method based on reinforcement learning |
CN110377048A (en) * | 2019-06-26 | 2019-10-25 | 沈阳航空航天大学 | A kind of unmanned aerial vehicle group defensive disposition method based on genetic algorithm |
CN112221160B (en) * | 2020-10-22 | 2022-05-17 | 厦门渊亭信息科技有限公司 | Role distribution system based on random game |
CN112221160A (en) * | 2020-10-22 | 2021-01-15 | 厦门渊亭信息科技有限公司 | Role distribution system based on random game |
CN113001545A (en) * | 2021-03-01 | 2021-06-22 | 北方工业大学 | Robot control method and device and robot |
CN113799143B (en) * | 2021-11-18 | 2022-04-19 | 广东隆崎机器人有限公司 | Safe cooperation method and device of multiple robots in working area |
CN113799143A (en) * | 2021-11-18 | 2021-12-17 | 广东隆崎机器人有限公司 | Safe cooperation method and device of multiple robots in working area |
CN114986503A (en) * | 2022-05-31 | 2022-09-02 | 江苏经贸职业技术学院 | Multi-robot cooperative motion method and football robot |
Also Published As
Publication number | Publication date |
---|---|
CN104063541B (en) | 2017-12-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171201 |