CN106681149B - Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning - Google Patents

Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning

Info

Publication number
CN106681149B
Authority
CN
China
Prior art keywords
bumblebee
abdomen
pendulum
mode
robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710018201.8A
Other languages
Chinese (zh)
Other versions
CN106681149A (en)
Inventor
郑能干
金梦洁
马倩
王雪菲
江毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201710018201.8A
Publication of CN106681149A
Application granted
Publication of CN106681149B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric, the criterion being a learning criterion
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric, involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric, involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Toys (AREA)

Abstract

The invention discloses a bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning, belonging to the technical field of insect robots. The control method comprises: (1) according to a set target pattern of bumblebee abdomen swinging, establishing a Q table using a reinforcement learning algorithm, the Q table recording the cumulative discounted reward obtained by each action under each class of abdomen-swing pattern; (2) monitoring the bumblebee's abdomen-swing pattern at the current moment, looking it up in the Q table, and, according to the maximum-reward principle, issuing actions in real time to control the bumblebee's abdomen swing. The invention uses the algorithm to establish the interaction mechanism between the bumblebee's abdomen-swing behavior and the visual stimulator, so a large number of biological experiments are not needed to obtain the stimulus-response relationship. The invention induces the bumblebee's motor behavior with visual stimuli and uses a reinforcement learning algorithm as the control algorithm, effectively forming a closed-loop control system that can accurately control the bumblebee's abdomen-swing behavior.

Description

Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning
Technical field
The present invention relates to the technical field of insect robots, and in particular to a bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning.
Background art
An insect robot is a novel animal robot system formed by combining an insect, as the carrier, with a control module. Because this kind of special-purpose robot relies on the carrier's own locomotion, there is no need to design complex mechanical and power systems, and it has broad application prospects in fields such as scientific research, national defense and security, and industrial production.
Controlling insect flight is the primary problem in insect robot research. An insect robot generally comprises three parts: a behavior stimulation module, an insect behavior acquisition and analysis module, and a control-instruction decision module.
At present, behavior stimulation in insect robots is mainly based on electrical stimulation, with the electrical stimulation parameters corresponding to different motion states adjusted manually. In electrical stimulation, silicon or filament electrodes are surgically implanted into the insect's muscles or nervous system, and a microcontroller generates square-wave pulses that are delivered to the stimulation sites in order to control behaviors such as turning. Although electrically stimulated insect robots have made significant progress, several problems remain:
First, implanting the stimulation electrodes can damage the insect's muscles or nervous system, degrading subsequent control performance. Second, prolonged electrical stimulation fatigues the insect and reduces its sensitivity to the stimulation. Third, because insects are small (the muscles and brain of most insects have a volume of about 1 cubic millimeter), a single electrical stimulus may involve several sensory-motor loops, or even the entire muscle or brain, and so cause unpredictable behavioral responses. Fourth, manual adjustment of stimulation parameters requires a large number of biological experiments in advance to obtain the stimulus-response relationship, which is time-consuming, and different individuals still respond quite differently to the same stimulation parameters; manual adjustment is therefore inefficient and imprecise.
Developing non-implanted control methods avoids the damage to the insect body described above. Numerous studies have shown that optic flow plays a key role in many important insect behaviors, such as flight speed control, altitude control, depth measurement, and flight distance measurement, which provides a theoretical basis for optic-flow-based insect robots.
Methods of providing visual stimuli in insect vision and electrophysiology research include striped printed paper, projectors, displays, and LED arrays. The most common visual stimulator is built from an LED array. For example, Chinese patent application No. 201410422437.4 provides a bumblebee flight controller that both matches the characteristics of the bumblebee compound eye and can generate customized optic flow stimuli. However, effective control of bumblebee flight using optic flow stimulation within a reinforcement learning framework has not yet been reported.
Summary of the invention
The present invention provides a bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning. A reinforcement learning algorithm drives the bumblebee's abdomen swing from an initial state to a target state, overcoming the imprecise control and the need for a large number of biological experiments that result from manual adjustment of stimulation parameters in the prior art.
A bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning, comprising:
(1) according to a set target pattern of bumblebee abdomen swinging, establishing a Q table using a reinforcement learning algorithm, the Q table recording the cumulative discounted reward obtained by each action under each class of abdomen-swing pattern;
(2) monitoring the bumblebee's abdomen-swing pattern at the current moment, looking it up in the Q table, and, according to the maximum-reward principle, issuing actions in real time to control the bumblebee's abdomen swing;
The action is a visual stimulus with varying parameters formed by light beams moving around the bumblebee.
The invention models the insect robot control problem with a reinforcement learning algorithm. The virtual reality system presents optic flow information directly to the bumblebee's compound eyes and induces abdomen-swing behavior; a high-definition camera captures the abdominal motion in real time, and a computer image algorithm extracts the bumblebee's abdomen-swing pattern, which serves as the environment state; the algorithm changes the optic flow parameters, the LED screen changes its motion parameters accordingly, and the result is presented to the bumblebee again, forming a closed-loop control system.
The maximum-reward principle is the greedy policy.
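The greedy lookup described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the Q table is assumed to be a list of rows (one per discrete abdomen-swing pattern) with one value per action, and the numbers in the example table are made up.

```python
def greedy_action(q_table, state):
    """Return the index of the action with the largest Q value in `state`.

    Ties are broken in favor of the lowest action index.
    """
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical 2-state, 3-action Q table (values invented for illustration).
example_q = [
    [0.1, 0.7, 0.3],
    [0.5, 0.2, 0.4],
]
best_in_state_0 = greedy_action(example_q, 0)
```

In real-time control, the selected index would be mapped to a visual stimulus parameter and sent to the LED system.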
The action is generated by a virtual reality system composed of several LED displays, and the LED displays are controlled by a CAN-bus-based LED control system.
Specifically, a CAN-bus-based LED display provides the optic flow input for the bumblebee. Each LED display has a 32 × 8 structure: 32 pins of the microcontroller drive the LED columns, and another 8 pins drive the LED rows. The system consists of a PC control terminal, a CAN bus master node, and several slave nodes containing LED driver modules. The PC control terminal includes an input unit and a first serial communication service unit, which sends data packets containing image information and control commands to the master node of the CAN bus. The master node includes a second serial communication service unit, which receives the data sent by the client, and a CAN bus communication service unit, which encodes the received data and places it on the CAN bus. Each slave node includes a second CAN bus communication service unit, which receives data from the CAN bus and passes it to the corresponding LED driver module. The microcontroller used in the LED driver module is an MSP430F149; the LED driver module drives the LEDs to display images and control image motion.
The invention records the bumblebee's abdomen-swing pattern as an indicator of its flight behavior. Numerous studies have found that the insect abdomen plays a key role in flight control, because the position of the abdomen determines the direction of the force generated by the flapping wings.
Acquisition and analysis of the abdomen-swing behavior are performed by a high-definition camera and a particle filter algorithm, which yield the time series of the abdomen deflection angle. The camera model is ANC HD1080P, with an acquisition frame rate of 30 frames/second; it is connected to the computer by a USB cable and transfers images to the computer for subsequent behavior analysis. Particle filtering is a widely used classical algorithm for object tracking. Before the experiment, a fixed point is chosen as the tracking reference point, and a rectangular region of the bumblebee abdomen in the image is chosen as the tracking target; the color histogram of this region is used as the feature for computing the target position in the next frame, from which the abdomen deflection angle is calculated. After the angle is obtained, the abdomen-swing pattern is computed with a sliding-window method.
Preferably, the abdomen-swing pattern is characterized by a 6-dimensional vector composed of the maximum, minimum, mean absolute value, number of peaks and troughs, number of points above the mean, and variance of the abdomen deflection angle within a single sliding window.
Preferably, the width of the sliding window is 200 to 400 ms; more preferably, 200 ms.
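As a rough sketch of the sliding-window feature extraction, the following computes the six quantities listed above for each window of deflection angles. Several details are assumptions not stated in the text: peaks and troughs are counted as interior points where the slope changes sign, the variance is the population variance, and the window step (`step`) is a hypothetical parameter. At 30 frames/second, a 200 ms window covers 6 samples.

```python
from statistics import mean, pvariance

FRAME_RATE = 30                                   # frames/second, from the text
WINDOW_SAMPLES = round(0.2 * FRAME_RATE)          # 200 ms window -> 6 samples


def swing_features(angles):
    """Return the 6-D abdomen-swing pattern for one window of angles."""
    mu = mean(angles)
    mean_abs = mean(abs(a) for a in angles)
    # Peaks and troughs: interior points where the slope changes sign.
    peaks_troughs = sum(
        1
        for prev, cur, nxt in zip(angles, angles[1:], angles[2:])
        if (cur - prev) * (nxt - cur) < 0
    )
    above_mean = sum(1 for a in angles if a > mu)
    return [max(angles), min(angles), mean_abs,
            peaks_troughs, above_mean, pvariance(angles)]


def windowed_features(series, width=WINDOW_SAMPLES, step=WINDOW_SAMPLES):
    """Slide a window over the angle series; `step` is an assumed parameter."""
    return [
        swing_features(series[i:i + width])
        for i in range(0, len(series) - width + 1, step)
    ]


# Made-up deflection angles (degrees) for illustration only.
demo = swing_features([0.0, 5.0, 0.0, -5.0, 0.0, 5.0])
```

Because the six components have different scales, an implementation might normalize them before computing distances; the text does not say whether this is done.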
The invention uses a reinforcement learning algorithm to establish the interaction mechanism between the bumblebee and the visual stimulator. Reinforcement learning is a process of continual interaction with the environment, through which an optimal policy mapping environment states to actions can be obtained. Reinforcement learning comprises three main parts: environment, action, and reward function.
The environment is the bumblebee's abdomen-swing pattern, represented by the 6-dimensional vector described above;
The action is the adjustment of external visual stimulus parameters available to the reinforcement learning algorithm; preferably, the Q table contains 5 to 20 actions. The visual stimulus parameter is the angular velocity of image rotation; specifically, there are 10 selectable angular velocities: [2.05, 2.89, 3.97, 4.88, 6.20, 7.21, 8.72, 11.02, 14.5, 22.16] rad/s. The visual stimulus parameters may also include image brightness, direction of motion, width, and so on.
The reward function is determined mainly by the Euclidean distance between the bumblebee's current abdomen-swing pattern and the target abdomen-swing pattern. Let the target state be the vector x_g, with corresponding discrete state k_g; let the bumblebee's abdomen-swing pattern at the current time t be x_t, with corresponding discrete state k_t; and let the abdomen-swing pattern at the previous moment be x_{t-1}. With dist_1 the distance between x_t and x_g, and dist_2 the distance between x_{t-1} and x_g, the reward function can be written as:
R = 0.2, if (dist_1 > dist_2 and k_t ≠ k_g);
R = +1, if (dist_1 > dist_2 and k_t = k_g);
R = -1, otherwise.
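The piecewise reward can be transcribed directly. This is a minimal sketch: the function and argument names are hypothetical, and the comparison direction (dist_1 > dist_2) is copied exactly as it appears in the text, so a reader implementing this may want to confirm it against the original filing.

```python
import math


def reward(x_t, x_prev, x_goal, k_t, k_goal):
    """Piecewise reward, transcribed from the text.

    x_t, x_prev, x_goal: feature vectors (current, previous, target).
    k_t, k_goal: discrete state labels of the current and target patterns.
    """
    dist1 = math.dist(x_t, x_goal)      # current pattern to target
    dist2 = math.dist(x_prev, x_goal)   # previous pattern to target
    if dist1 > dist2 and k_t != k_goal:
        return 0.2
    if dist1 > dist2 and k_t == k_goal:
        return 1.0
    return -1.0
```

For example, `reward([2, 0], [1, 0], [0, 0], k_t=1, k_goal=0)` falls in the first branch.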
The reward function evaluates the quality of an action produced by reinforcement learning; the goal of reinforcement learning is to maximize the accumulated reward.
Preferably, the Q table is established as follows:
(1) initialize the Q table, the learning rate α, the discount factor γ, and the abdomen-swing pattern cluster centers;
(2) obtain the bumblebee's abdomen-swing pattern x_t at time t;
(3) discretize the state using sequential K-means and adjust the abdomen-swing pattern class centers;
(4) select action a_t according to the epsilon-greedy policy;
(5) execute action a_t to obtain the reward r_{t+1} and the bumblebee's abdomen-swing pattern x_{t+1} at the next moment;
(6) repeat steps (3) and (4) to select action a_{t+1};
(7) update the Q table according to the following formula;
Q(x_t, a_t) ← Q(x_t, a_t) + α[r_{t+1} + γQ(x_{t+1}, a_{t+1}) - Q(x_t, a_t)];
(8) return to step (2) and repeat until the Q table converges.
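The steps above have the form of the on-policy SARSA update, and can be sketched as the loop below. This is an illustration under stated assumptions, not the patent's code: the discretization of step (3) is omitted (states are already integer labels), the loop runs for a fixed number of episodes instead of testing convergence, the hyperparameter values are arbitrary, and `toy_step` is a hypothetical three-state stand-in for the bumblebee and camera.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def sarsa(env_step, n_states, n_actions, start_state, episodes=200,
          steps=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular SARSA; env_step(state, action) returns (reward, next_state)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]   # step (1)
    for _ in range(episodes):
        s = start_state                                # step (2)
        a = epsilon_greedy(q[s], epsilon, rng)         # step (4)
        for _ in range(steps):
            r, s_next = env_step(s, a)                 # step (5)
            a_next = epsilon_greedy(q[s_next], epsilon, rng)  # step (6)
            # Step (7): Q(x_t,a_t) += alpha*[r + gamma*Q(x_t+1,a_t+1) - Q(x_t,a_t)]
            q[s][a] += alpha * (r + gamma * q[s_next][a_next] - q[s][a])
            s, a = s_next, a_next
    return q


def toy_step(state, action):
    """Hypothetical environment: states 0..2 on a line, target state 2.

    Action 1 moves toward the target, action 0 moves away from it.
    """
    nxt = min(state + 1, 2) if action == 1 else max(state - 1, 0)
    return (1.0 if nxt == 2 else -1.0), nxt


q_learned = sarsa(toy_step, n_states=3, n_actions=2, start_state=0)
```

In the setting described by the text, the action index would select one of the angular velocities and `env_step` would be replaced by presenting the stimulus and reading back the next abdomen-swing pattern.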
Through the interaction between the reinforcement learning algorithm and the environment, an optimal policy mapping environment states to actions is obtained. The parameter of the next visual stimulus can then be determined from the bumblebee's current abdomen-swing pattern, forming a closed-loop control system. In actual control, the abdomen-swing pattern approaches the target state quickly and accurately under reinforcement learning control.
The beneficial effects of the invention are:
(1) The invention targets the bumblebee's visual pathway and uses optic flow stimulation, eliminating the injury to the bumblebee caused by implanted methods.
(2) The interaction mechanism between the bumblebee's abdomen-swing behavior and the visual stimulator is established with a reinforcement learning algorithm, so a large number of biological experiments are not needed to obtain the stimulus-response relationship.
(3) The invention induces the bumblebee's motor behavior with visual stimuli and uses a reinforcement learning algorithm as the control algorithm, effectively forming a closed-loop control system that can accurately control the bumblebee's abdomen-swing behavior.
Brief description of the drawings
Fig. 1 is the control flow chart of the bumblebee robot closed-loop control system of the invention based on virtual reality and reinforcement learning.
Fig. 2 is a structural schematic diagram of the components of the bumblebee robot closed-loop control system of the invention based on virtual reality and reinforcement learning.
Fig. 3 is a schematic diagram of the CAN-bus-based LED display module.
Fig. 4 is a schematic diagram of the bumblebee behavior acquisition and analysis process and its results.
Fig. 5 is a schematic diagram of the reinforcement learning interaction process.
Fig. 6 shows the process of extracting bumblebee abdomen-swing features with a sliding window.
Fig. 7 shows the learning process of the reinforcement learning algorithm.
Fig. 8 shows the control curves of the SSC dimension for one bumblebee under reinforcement learning control and under random-policy control.
Fig. 9 shows the deviation between the bumblebee's SSC dimension and the target SSC dimension under reinforcement learning control and under random-policy control.
Fig. 10 is the distribution of the Q table learned by the algorithm.
Detailed description of the embodiments
The invention is further described below with reference to an embodiment.
Embodiment 1
As shown in Fig. 1, Fig. 1(A) is the control loop of the bumblebee robot closed-loop control system of the invention based on virtual reality and reinforcement learning. In this loop, the bumblebee's compound eyes first receive an external optic flow stimulus, and an abdomen-swing response follows. To capture the abdomen-swing behavior, the invention tracks the tip of the bumblebee's abdomen with a particle filter algorithm, calculates its swing angle, and then extracts features over a sliding window. The extracted abdomen-swing features serve as the environment state of the reinforcement learning algorithm, which selects the stimulation parameter for the next moment, thus forming a closed learning loop. Fig. 1(B) shows the physical composition of the same system, corresponding to Fig. 1(A): a CAN-bus-based LED display that provides the optic flow stimulus; a high-definition camera that records the abdomen-swing behavior; and a computer that analyzes and processes the images and runs the reinforcement learning algorithm.
As shown in Fig. 2, the experimental platform for the bumblebee robot control method based on optic flow stimulation and reinforcement learning comprises: vibration isolation tables 1 and 2; an LED display 3 placed on the vibration isolation table, used to generate specific visual stimuli; a pillar 4 for mounting the micromanipulator fixing bracket; a micromanipulator fixing bracket 5 for adjusting the position at which the bumblebee is fixed; a high-definition camera 6 for filming the abdomen-swing behavior; and a universal mount 7. For later extension of the system, the design also includes a high-speed camera 8 and a bracket 9 for holding it. Apart from the PC, the whole system is placed on the vibration isolation table to prevent desktop vibration from affecting the experiment.
As shown in Fig. 3, a CAN-bus-based LED display provides the optic flow input for the bumblebee. Each LED display has a 32 × 8 structure: 32 pins of the microcontroller drive the LED columns, and another 8 pins drive the LED rows. The system consists of a PC control terminal, a CAN bus master node, and several slave nodes containing LED driver modules. The PC control terminal includes an input unit and a first serial communication service unit, which sends data packets containing image information and control commands to the master node of the CAN bus. The master node includes a second serial communication service unit, which receives the data sent by the client, and a CAN bus communication service unit, which encodes the received data and places it on the CAN bus. Each slave node includes a second CAN bus communication service unit, which receives data from the CAN bus and passes it to the corresponding LED driver module. The microcontroller used in the LED driver module is an MSP430F149; the LED driver module drives the LEDs to display images and control image motion.
In the present invention, a single bright stripe is used as the visual input to the bumblebee robot, thereby inducing the bumblebee's motor behavior.
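To make the single-stripe stimulus concrete, the sketch below builds one 8-row by 32-column on/off frame with a bright vertical stripe whose column advances with the rotation angular velocity. The geometry is an assumption made only for illustration: the text does not state how many columns span the full circle around the bumblebee, so `total_cols` (defaulting to a single 32-column panel) and `stripe_width` are hypothetical parameters.

```python
import math

ROWS, COLS = 8, 32   # one panel, per the 32 x 8 structure in the text


def stripe_frame(t, omega, stripe_width=1, total_cols=COLS):
    """Return a ROWS x total_cols frame (lists of 0/1) at time t seconds.

    omega is the rotation angular velocity in rad/s. Assumes, hypothetically,
    that `total_cols` columns are spread evenly over a full 2*pi ring.
    """
    phase = (omega * t) % (2 * math.pi)
    start = int(phase / (2 * math.pi) * total_cols) % total_cols
    lit = {(start + i) % total_cols for i in range(stripe_width)}
    row = [1 if c in lit else 0 for c in range(total_cols)]
    return [row[:] for _ in range(ROWS)]


frame0 = stripe_frame(t=0.0, omega=2.05)       # stripe at column 0
frame_half = stripe_frame(t=1.0, omega=math.pi)  # half a turn later
```

A real controller would pack such frames into the data packets sent over the serial link and CAN bus; that packet format is not described in the text and is not sketched here.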
Fig. 4 is a schematic diagram of the acquisition and data analysis of bumblebee flight behavior in the invention. The abdomen-swing behavior is captured by a high-definition camera placed behind the bumblebee; the camera model is ANC HD1080P. The bumblebee's abdomen-swing frequency is at most about 7 Hz, so the camera frame rate is set to 30 frames/second. In the figure, (x_0, y_0) is the point at which the bumblebee is fixed: the bumblebee is attached at its back to a thin bent iron rod with a bending angle of 45 degrees and a diameter of 1 mm. The operator selects the abdominal region to be tracked; denoting the center of this region by (x, y), the abdomen-swing angle is given by β = arctan[(x - x_0)/(y - y_0)].
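The angle formula can be evaluated as follows. This small sketch uses `math.atan2` in place of a plain arctangent of the ratio, which is a substitution made here to avoid division by zero when y equals y_0; the two agree whenever y - y_0 is positive. The function name and the choice of degrees as the output unit are assumptions.

```python
import math


def abdomen_angle(x, y, x0, y0):
    """Abdomen-swing angle beta = arctan[(x - x0) / (y - y0)], in degrees.

    (x0, y0) is the fixed point and (x, y) the center of the tracked region,
    both in image coordinates.
    """
    return math.degrees(math.atan2(x - x0, y - y0))


# Made-up pixel coordinates for illustration only.
beta = abdomen_angle(x=110.0, y=210.0, x0=100.0, y0=200.0)
```

Applying this to the tracked center in every frame gives the deflection-angle time series that the sliding-window step consumes.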
Fig. 5 illustrates the interaction process of the reinforcement learning algorithm in the invention. In general, reinforcement learning maintains an action-value function Q(x_t, a_t) mapping environment states to actions; it represents the cumulative reward obtainable by taking action a_t in state x_t. Once this value function has been obtained, the action a_t with the largest value in the current state x_t is generally selected at each step. The value function is updated through the continual interaction between reinforcement learning and the environment.
At time t, the bumblebee's abdomen-swing pattern is x_t, which is discretized with the sequential K-means algorithm. An action a_t is then selected, that is, the current optic flow stimulation parameter is adjusted. After the bumblebee's compound eyes receive the optic flow stimulus, a corresponding abdomen-swing behavior is produced. The abdomen-swing behavior parameter is the deflection angle obtained by the particle filter algorithm. As shown in Fig. 6, a sliding window of 200 ms is used here: all deflection angle data within the window are taken, and six features are computed, namely the maximum, minimum, mean absolute value, number of peaks and troughs, number of points above the mean, and variance, as the bumblebee's abdomen-swing pattern. This 6-dimensional feature is the state x_{t+1} shown in the figure. The immediate reward r_{t+1} is obtained by evaluating the reward function. The update formula of the reinforcement learning algorithm is:
Q(x_t, a_t) ← Q(x_t, a_t) + α[r_{t+1} + γQ(x_{t+1}, a_{t+1}) - Q(x_t, a_t)].
Here α is the learning rate of the algorithm and γ is the discount factor. The detailed algorithm flow is shown in Fig. 7.
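The state discretization mentioned above can be sketched with a standard sequential (online) K-means update: each new pattern is assigned to its nearest center, and that center is moved toward the pattern by a step of 1/n, where n is the number of samples assigned to it so far. The class name, the way the initial centers are supplied, and counting each initial center as one sample are assumptions; the text does not give these details.

```python
import math


class SequentialKMeans:
    """Online K-means: assign a sample to a cluster and update that center."""

    def __init__(self, initial_centers):
        self.centers = [list(c) for c in initial_centers]
        # Assumption: each initial center counts as one observed sample.
        self.counts = [1] * len(self.centers)

    def update(self, x):
        """Return the discrete state (cluster index) of x and adapt its center."""
        k = min(range(len(self.centers)),
                key=lambda i: math.dist(x, self.centers[i]))
        self.counts[k] += 1
        n = self.counts[k]
        # Running-mean update: c <- c + (x - c) / n
        self.centers[k] = [c + (xi - c) / n
                           for c, xi in zip(self.centers[k], x)]
        return k


# Two made-up 2-D centers for illustration; real patterns would be 6-D.
km = SequentialKMeans([[0.0, 0.0], [10.0, 10.0]])
state_a = km.update([2.0, 0.0])
state_b = km.update([9.0, 10.0])
```

The returned index plays the role of the discrete state k_t used to index the Q table, and the same discretizer applied to the target pattern would give k_g.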
The reward function is determined mainly by the Euclidean distance between the bumblebee's current abdomen-swing pattern and the target abdomen-swing pattern. Let the target state be the vector x_g, with corresponding discrete state k_g; let the bumblebee's abdomen-swing pattern at the current time t be x_t, with corresponding discrete state k_t; and let the abdomen-swing pattern at the previous moment be x_{t-1}. With dist_1 the distance between x_t and x_g, and dist_2 the distance between x_{t-1} and x_g, the reward function can be written as:
R = 0.2, if (dist_1 > dist_2 and k_t ≠ k_g);
R = +1, if (dist_1 > dist_2 and k_t = k_g);
R = -1, otherwise.
With a reinforcement learning algorithm, the correspondence between optic flow stimulation parameters and the bumblebee's abdomen-swing pattern does not need to be known in advance. Through continual interactive learning, the algorithm learns the mapping from environment state (abdomen-swing pattern) to action (optic flow stimulation parameter) and uses it as the subsequent closed-loop real-time control policy, avoiding the need to run many experiments manually and analyze the resulting data.
To verify the effectiveness of the control mechanism based on reinforcement learning, we also compared it with a plain random algorithm. Figs. 8 to 10 show the data we collected from bumblebees and the results of the algorithm control experiments. Figs. 8 and 9 are control curves for the bumblebee; they show how the SSC dimension of the current state approaches the SSC dimension of the target state x_g (SSC is the number of peaks and troughs within the sliding window; it reflects the frequency of abdominal motion and is the most important feature describing the bumblebee's abdomen swing).
The line with triangles is the target SSC value, the line with circles is the curve under reinforcement learning control, and the line with × marks is the curve under random algorithm control. It can be seen that:
1. Under reinforcement learning control, every step moves the SSC closer to the target state; the deviation between each step's state and the target state (corresponding to the Euclidean distance in Fig. 10) decreases monotonically, and the final deviation tends to 0. Under the random algorithm, by contrast, the control curve and its deviation are random throughout, with no clear control trend.
2. Under reinforcement learning control, the bumblebee reaches the target value within at most 8 steps and then remains stable, whereas the random algorithm does not drive the process toward the target value, and states close to the target occur without regularity.
3. The curve under reinforcement learning control is smoother, whereas the curve under random algorithm control fluctuates strongly.
These results and analysis show that, based on reinforcement learning, optic flow stimulation can control the bumblebee's abdomen swing well, laying a foundation for subsequent optic-flow-based bumblebee robots.

Claims (5)

1. A bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning, comprising:
(1) according to a set target pattern of bumblebee abdomen swinging, establishing a Q table using a reinforcement learning algorithm, the Q table recording the cumulative discounted reward obtained by each action under each class of abdomen-swing pattern;
(2) monitoring the bumblebee's abdomen-swing pattern at the current moment, looking it up in the Q table, and, according to the maximum-reward principle, issuing actions in real time to control the bumblebee's abdomen swing;
wherein reinforcement learning comprises an environment, actions, and a reward function; the environment is the bumblebee's abdomen-swing pattern; the action is the adjustment of external visual stimulus parameters available to the reinforcement learning algorithm, namely visual stimuli with different parameters formed by light beams moving around the bumblebee; and the reward function is determined by the Euclidean distance between the bumblebee's current abdomen-swing pattern and the target abdomen-swing pattern;
and wherein the bumblebee's abdomen-swing behavior is acquired and analyzed using a high-definition camera and a particle filter algorithm, and the abdomen-swing pattern is characterized by a 6-dimensional vector composed of the maximum, minimum, mean absolute value, number of peaks and troughs, number of points above the mean, and variance of the abdomen deflection angle within a single sliding window.
2. The bumblebee robot abdomen-swing control method according to claim 1, characterized in that the width of the sliding window is 200 to 400 ms.
3. The bumblebee robot abdomen-swing control method according to claim 1, characterized in that the action is generated by a virtual reality system composed of several LED displays, and the LED displays are controlled by a CAN-bus-based LED control system.
4. The bumblebee robot abdomen-swing control method according to claim 1, characterized in that the Q table contains 5 to 20 actions.
5. The bumblebee robot abdomen-swing control method according to claim 1, characterized in that the Q table is established as follows:
(1) initialize the Q table, the learning rate α, the discount factor γ, and the abdomen-swing pattern cluster centers;
(2) obtain the bumblebee's abdomen-swing pattern x_t at time t;
(3) cluster and discretize the state using sequential K-means, and adjust the abdomen-swing pattern class centers;
(4) select action a_t according to the epsilon-greedy policy;
(5) execute action a_t to obtain the reward r_{t+1} and the bumblebee's abdomen-swing pattern x_{t+1} at the next moment;
(6) repeat steps (3) and (4) to select action a_{t+1};
(7) update the Q table according to the following formula;
Q(x_t, a_t) ← Q(x_t, a_t) + α[r_{t+1} + γQ(x_{t+1}, a_{t+1}) - Q(x_t, a_t)];
(8) return to step (2) and repeat until the Q table converges.
CN201710018201.8A 2017-01-11 2017-01-11 Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning Active CN106681149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710018201.8A CN106681149B (en) 2017-01-11 2017-01-11 Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710018201.8A CN106681149B (en) 2017-01-11 2017-01-11 Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning

Publications (2)

Publication Number Publication Date
CN106681149A CN106681149A (en) 2017-05-17
CN106681149B CN106681149B (en) 2019-11-19

Family

ID=58849400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710018201.8A Active CN106681149B (en) Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning 2017-01-11 2017-01-11

Country Status (1)

Country Link
CN (1) CN106681149B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112427843B (en) * 2020-11-09 2021-09-03 716th Research Institute of China Shipbuilding Industry Corporation Ship multi-mechanical-arm welding spot cooperative welding method based on QMIX reinforcement learning algorithm
CN112965595B (en) * 2021-02-26 2023-09-05 Tsinghua Shenzhen International Graduate School Modeling and predicting method for nerve electrical stimulation simulation touch sense
CH718639A2 (en) * 2021-05-20 2022-11-30 Ai Redefined Inc SYSTEMS AND METHODS OF REINFORCEMENT LEARNING IN REAL TIME.
CN114888829B (en) * 2022-04-29 2023-09-22 Zhejiang University Bumblebee robot flight deflection behavior control method and system based on lightweight brain-computer interface

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000271350A (en) * 1999-03-26 2000-10-03 Bandai Co Ltd Insect robot
CN104216300A (en) * 2014-08-25 2014-12-17 Zhejiang University LED control system based on CAN bus and bumblebee flight controller
CN105438305A (en) * 2015-11-13 2016-03-30 Anhui University of Technology Six-limb insect motion mode determination method, bionic six-limb insect robot and use method thereof
CN105620568A (en) * 2016-03-17 2016-06-01 East China Jiaotong University Flexible worm-shaped robot and control method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
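A reinforcement learning model simulating insect visual-behavioral decision-making; Ma Qi et al.; Acta Biophysica Sinica; 2008-06-30; Vol. 24 (No. 3); pp. 211-220 *
Design of a flight simulator and its application to the study of bumblebee yaw behavior; Gong Fan; China Master's Theses Full-text Database (electronic journal); 2015-12-15 (No. 12); pp. I140-231 *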

Also Published As

Publication number Publication date
CN106681149A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106681149B (en) Bumblebee robot abdomen-swing control method based on virtual reality and reinforcement learning
Mischiati et al. Internal models direct dragonfly interception steering
Bender et al. A comparison of visual and haltere-mediated feedback in the control of body saccades in Drosophila melanogaster
Hofmann et al. Sensory flow shaped by active sensing: sensorimotor strategies in electric fish
Chen et al. Simulating prosthetic vision: I. Visual models of phosphenes
CN104541306B (en) Neurobehavioral animation system
CN110134165A (en) Reinforcement learning method and system for environmental monitoring and control
Minegishi et al. Construction of a brain–machine hybrid system to evaluate adaptability of an insect
CN106861043A (en) The portable nerve regulator control system of animal population agreement
CN103461166B (en) Three-arm maze device oriented to animal robot control training, and training method
Taylor et al. New experimental approaches to the biology of flight control systems
Gray et al. A method for recording behavior and multineuronal CNS activity from tethered insects flying in virtual space
CN108596259A (en) Method for generating an artificial intelligence training dataset for object recognition
KR20140058448A (en) System comprising an oculometer, method implemented on such a system and corresponding computer program product
CN109015635A (en) Service robot control method based on brain-machine interaction
Deng et al. A VR-based BCI interactive system for UAV swarm control
Zhou et al. Progresses of animal robots: A historical review and perspectiveness
Kanzaki et al. Insect–machine hybrid system for understanding and evaluating sensory-motor control by sex pheromone in Bombyx mori
US20180214700A1 (en) Stimulation field templates to be applied across patient populations
CN108153357B (en) Intelligent management method and system for greenhouse
Zhang et al. An automatic control system for ratbot navigation
Schmeling et al. Synchronization of wing beat cycle of the desert locust, Schistocerca gregaria, by periodic light flashes
CN108113638A (en) Dynamic visual acuity detection method
CN114888829B (en) Bumblebee robot flight deflection behavior control method and system based on lightweight brain-computer interface
Xie et al. Biomimetic control of pan-tilt-zoom camera for visual tracking based-on an autonomous helicopter

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant