CN105205533B - Developmental automaton with a brain cognitive mechanism and its learning method
- Publication number
- CN105205533B (application number CN201510628233.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Feedback Control In General (AREA)
Abstract
The invention relates to a developmental automaton with a brain cognitive mechanism and to its learning method, and belongs to the field of intelligent robotics. The developmental automaton comprises an internal state set, a system output set, an internal operation behavior set, a state transition equation, a reward signal, a system evaluation function, a system action selection probability, and a dopamine response differential signal. Built on the learning automaton framework, the automaton and its learning method provide a mathematical model of the autonomous development process of a system that has strong generalization ability and a wide range of application. The method combines the sensorimotor system with an intrinsic motivation mechanism, which improves the self-learning and adaptive abilities of the system, with the stated aim of achieving intelligence in the true sense.
Description
Technical field
The invention relates to a developmental automaton with a brain cognitive mechanism and to its learning method, and belongs to the field of intelligent robotics.
Technical background
Learning and memory are the essence of intelligent behavior in humans and animals. Many human and animal skills form and mature gradually as the nervous system learns and organizes itself. Studying and simulating the neural activity and self-regulation mechanisms of humans and animals, and endowing intelligent robots with them, is an important research topic in artificial intelligence and control science.

In 1996, J. Weng was the first to propose the idea of autonomous mental development for robots. He held that an intelligent agent should be modeled on the human brain and, under the control of an intrinsic developmental program, develop its intelligence by interacting with an unknown environment through sensors and effectors. Brooks et al. emphasized that a robot gradually develops its intelligence by learning through interaction with a teacher and the environment. Drawing on neuroscience research, they proposed computational models that simulate regions such as the prefrontal cortex, hypothalamus, and hippocampus of humans and animals in order to handle challenges in complex environments, which also involves the sensorimotor system. Early cognitive development begins with the formation and development of the coordination mechanism of the sensorimotor system, and the sensorimotor system is in turn continually coordinated and refined as intrinsic motivation forms and develops. The neuroscience literature shows that, when humans and animals learn, the cerebral cortex, the basal ganglia, and the cerebellum work in parallel, each in its own characteristic way. In the pathways related to human and animal movement, the cerebellum and the basal ganglia lie on either side of the route that carries motor signals from the cerebral cortex to the spinal cord, and they can take part in the initiation and control of any behavioral action.
Among related patents, the invention patent with application number CN200910086990.4 proposes an operant automaton model based on automaton theory and applies the model to the autonomous learning control of robots. The patent with application number CN201310656943.5 applies the principle of operant conditioning to image processing and improves the accuracy and speed with which the system processes images. The patent with application number 201410101272.0 addresses the low learning efficiency and poor adaptability of traditional robots by proposing a bionic intelligent control method that raises the intelligence level of the robot. The patent with application number 201410163756.8 proposes an autonomous mental development cloud robot system based on cloud computing; the system relieves the robot of compute-intensive tasks and also allows knowledge to be shared among different robots. However, none of the above patents concerns a learning system that simulates the cognitive mechanism of the human brain.
Summary of the invention
To address the above technical problem, the invention takes the biological sensorimotor system as its theoretical basis and introduces the intrinsic motivation mechanism from psychology to drive learning. It provides a developmental automaton with a brain cognitive mechanism and a learning method for it, improving the autonomous developmental cognitive ability of a robot.
The developmental automaton with a brain cognitive mechanism comprises an internal state set, a system output set, an internal operation behavior set, a state transition equation, a reward signal, a system evaluation function, a system action selection probability, and a dopamine response differential signal:
(1) $SC = [s_1, s_2, \ldots, s_j]$ denotes the finite set of internal states and corresponds to the sensory cortex of the cerebral cortex; $s_j$ is the $j$-th state and $j$ is the number of internal states.
(2) $MC = [y_1, y_2, \ldots, y_i]$ denotes the system output set and corresponds to the motor cortex of the cerebral cortex; $y_i$ is the $i$-th output and $i$ is the number of outputs.
(3) $CbA = [a_1, a_2, \ldots, a_k]$ denotes the internal operation behavior set and corresponds to the cerebellum; $a_k$ is the $k$-th internal action and $k$ is the number of internal actions.
(4) $f: s(t) \times a(t) \to s(t+1)$ is the state transition equation: the state $s(t+1)$ at time $t+1$ is determined jointly by the state $s(t)$ and the operation behavior $a(t)$ at time $t$, and is generally determined by the environment or by a model.
(5) $r(t) = r(s(t), a(t))$ denotes the reward signal obtained when the system, in internal state $s(t)$ at time $t$, takes the internal operation behavior $a(t)$ and the state transfers to $s(t+1)$; it corresponds to the signal sent by the thalamus.
(6) The input signal from the cerebral cortex consists of two parts, sensory cortex information and motor cortex information, which together form the input to the striatum. Therefore:

$$CC = \{SC, MC\} \tag{1}$$

The striatum mainly serves as the evaluation mechanism that predicts how good the orientation of the organism's operation is; more precisely, it evaluates the quality of the orientation produced by the intrinsic motivation mechanism. The system evaluation function is defined as:

$$BG_{strio}(t) = r(t+1) + \gamma r(t+2) + \gamma^2 r(t+3) + \cdots \tag{2}$$

where $\gamma \in [0, 1]$ is the discount factor. Because of the intrinsic motivation mechanism, the evaluation function $BG_{strio}$ gradually approaches 0, which ensures that the system finally reaches a stable state. Let $\eta$ be the orientation core of the intrinsic motivation mechanism, whose main function is to guide the direction of autonomous cognition. The value of $\eta$ lies in $[\eta_{min}, \eta_{max}]$, that is, between the function values of the worst and the best orientation. The intrinsic motivation orientation function in the striatum is then defined by formula (3):

$$\eta(t) = \frac{1 - e^{-\lambda BG_{strio}(t)}}{1 + e^{-\lambda BG_{strio}(t)}} \tag{3}$$

where $\lambda$ is the parameter of the orientation function. The difference between the orientation function at two adjacent moments, $\theta(t) = \eta(t) - \eta(t-1)$, is used to judge the orientation degree of the system: if $\theta(t) > 0$, the orientation value at time $t$ is larger than at time $t-1$; conversely, if $\theta(t) < 0$, the orientation value at time $t$ is smaller than at time $t-1$.
(7) In the learning process of the basal ganglia, the matrix of the striatum mainly performs action selection. The most important feature of learning driven by the intrinsic motivation mechanism is that the action to execute is selected according to its probability. The action selection function of the matrix is implemented with the Boltzmann probability rule, which is well known, and this realizes the probabilistic selection mechanism of the learning automaton. First define:

$$A = Boltz_T\{E(s, a_k),\ k = 1, 2, \ldots, m\} \iff p(a = a_k) = \frac{e^{E(s, a_k)/T}}{\sum_{k=1}^{m} e^{E(s, a_k)/T}} \tag{4}$$

where $m$ denotes the $m$-th internal action (so the sum runs over all $m$ internal actions), $A$ denotes the Boltzmann probability rule, and $p(a = a_k)$ is the action selection probability.
Following the definition in formula (4), the action selection probability output by the striatal matrix, $BG_{matrix}(s, a)$, replaces $p(a = a_k)$, and substituting formula (2) into formula (4) gives formula (5):

$$BG_{matrix}(s, a) = \frac{e^{BG_{strio}(SC(t), a_k)/T}}{\sum_{k=1}^{m} e^{BG_{strio}(SC(t), a_k)/T}} \tag{5}$$

where $T$ is the temperature constant, which expresses how random the action selection is: the larger $T$, the more random the selection, and the smaller $T$, the less random. As $T$ tends to zero, the selection probability of the action with the largest $BG_{strio}(SC(t), a_k)$ tends to 1. The value of $T$ decreases gradually over time, which reflects that the system accumulates empirical knowledge during learning and evolves from an unstable system into a stable one.
(8) The dopamine released by the substantia nigra pars compacta serves as the guidance signal for action evaluation. It is used to improve behavior so that actions yield the maximum future reward and more accurate actions are executed. The evaluation function determined by the striatum at time $t+1$ is:

$$BG_{strio}(t+1) = r(t+2) + \gamma r(t+3) + \gamma^2 r(t+4) + \cdots \tag{6}$$

Multiplying formula (6) by $\gamma$ and adding $r(t+1)$ reproduces the right-hand side of formula (2), which gives formula (7):

$$BG_{strio}(t) = r(t+1) + \gamma BG_{strio}(t+1) \tag{7}$$

This shows that, at time $t$, the evaluation function $BG_{strio}(t)$ can be expressed through the evaluation function $BG_{strio}(t+1)$ at time $t+1$. However, because of prediction errors in the initial stage, the value of $BG_{strio}(t)$ expressed through $BG_{strio}(t+1)$ is not equal to the actual value. The reward information output by the thalamus and the output of the striatum therefore need to be processed in the substantia nigra pars compacta, which releases dopamine $SN_{DPA}$ to adjust the evaluation value. The dopamine response differential signal is given by formula (8):

$$SN_{DPA} = r(t+1) + \gamma BG_{strio}(t+1) - BG_{strio}(t) \tag{8}$$
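Formulas (3), (5), and (8) can be sketched numerically as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the orientation function is evaluated through the mathematically equivalent form tanh(λ·BG/2) to avoid overflow, and the Boltzmann rule subtracts the maximum value before exponentiating, which leaves the probabilities unchanged.

```python
import math


def orientation(bg_strio: float, lam: float) -> float:
    """Formula (3): eta = (1 - exp(-lam*BG)) / (1 + exp(-lam*BG)).

    The expression equals tanh(lam * BG / 2), which is used here
    because it cannot overflow for large |lam * BG|.
    """
    return math.tanh(lam * bg_strio / 2.0)


def boltzmann_probs(values, temperature):
    """Formula (5): Boltzmann (softmax) probabilities over action values."""
    top = max(values)  # shift by the maximum for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def dopamine_signal(reward_next, bg_next, bg_now, gamma=0.9):
    """Formula (8): SN_DPA = r(t+1) + gamma * BG(t+1) - BG(t)."""
    return reward_next + gamma * bg_next - bg_now


if __name__ == "__main__":
    print(orientation(0.5, lam=1.0))
    print(boltzmann_probs([0.1, 0.4, 0.2], temperature=0.5))
    print(dopamine_signal(1.0, 2.0, 2.5))
```

Lowering `temperature` concentrates the probability on the action with the largest evaluation value, which matches the behavior described for T tending to zero.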
The learning method of the developmental automaton with a brain cognitive mechanism comprises the following steps:
(1) Initialization: set the learning step counter to $t = 0$ and the number of learning iterations to $step_{max}$; initialize all parameters and synaptic weights, so that every internal operation behavior is equally likely to be executed when the experiment starts;
(2) Perceive the current state $SC(t)$;
(3) Compute the evaluation function $BG_{strio}(t)$ in the striatum; because of the intrinsic motivation mechanism, compute the orientation function $\eta(t)$ from the current value of $BG_{strio}(t)$;
(4) According to the quality of the orientation, compute the action selection probability $BG_{matrix}(s, a)$ of the striatal matrix by formula (5), and let the cerebellum execute the action $a(t)$;
(5) According to the state transition equation, the state changes from $SC(t)$ to $SC(t+1)$;
(6) The thalamus sends the immediate reward $r(t)$ and triggers the dopamine response, which adjusts the evaluation value;
(7) The motor cortex outputs the action $y(t)$;
(8) Repeat steps (2) to (7) until $t = step_{max}$; learning then ends.
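The steps above can be sketched as a small tabular loop. Everything specific here is an assumption made for illustration: the source uses synaptic weights rather than a table, `toy_env_step` is a hypothetical five-state chain standing in for the state transition equation and reward, the state-level evaluation is taken as the largest action value, and `alpha`, `t_decay`, and `t_min` are invented parameters. The source does not specify how η enters the selection step, so the sketch only logs θ(t).

```python
import math
import random


def boltzmann_probs(values, temperature):
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def toy_env_step(state, action, n_states):
    """Hypothetical stand-in for the state transition f and the reward r."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    if nxt == n_states - 1:
        return 0, 1.0  # goal reached: reward 1, restart from state 0
    return nxt, 0.0


def learn(step_max=5000, n_states=5, n_actions=2, gamma=0.9, alpha=0.1,
          lam=1.0, t_start=1.0, t_min=0.05, t_decay=0.999, seed=0):
    rng = random.Random(seed)
    # Step (1): equal initial values give equal initial action probabilities.
    bg = [[0.0] * n_actions for _ in range(n_states)]
    temperature, state, eta_prev = t_start, 0, 0.0
    theta_log = []
    for _ in range(step_max):
        # Steps (2)-(3): perceive the state, evaluate it, compute eta(t).
        eta = math.tanh(lam * max(bg[state]) / 2.0)  # same as formula (3)
        theta_log.append(eta - eta_prev)             # theta(t), logged only
        eta_prev = eta
        # Step (4): Boltzmann action selection, formula (5).
        probs = boltzmann_probs(bg[state], temperature)
        action = rng.choices(range(n_actions), weights=probs)[0]
        # Step (5): state transition.
        nxt, reward = toy_env_step(state, action, n_states)
        # Step (6): dopamine-like differential signal, formula (8).
        delta = reward + gamma * max(bg[nxt]) - bg[state][action]
        bg[state][action] += alpha * delta
        # Step (7), the motor output y(t), is omitted in this toy setting.
        state = nxt
        temperature = max(t_min, temperature * t_decay)  # T shrinks over time
    return bg, theta_log


if __name__ == "__main__":
    table, _ = learn()
    for s, row in enumerate(table):
        print(s, [round(v, 3) for v in row])
```

In this toy chain the learned value of moving toward the goal from the state next to it ends up larger than the value of moving away, and the shrinking temperature makes the selection increasingly deterministic.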
Compared with the prior art, the developmental automaton with a brain cognitive mechanism and its learning method first provide, on the basis of the learning automaton framework, a mathematical model of the autonomous development process of a system that has strong generalization ability and a wide range of application. Second, the method combines the sensorimotor system with the intrinsic motivation mechanism, which improves the self-learning and adaptive abilities of the system, with the stated aim of achieving intelligence in the true sense.
Brief description of the drawings
Fig. 1 is the structure diagram of the system of the invention;
Fig. 2 is the learning flow chart of the invention;
Fig. 3 shows the response curves of each state in the balance control of the coaxial two-wheeled robot of the embodiment;
Fig. 4 shows the simulated evaluation function and error curves in the balance control of the coaxial two-wheeled robot of the embodiment;
Fig. 5 shows the simulation result of the anti-disturbance experiment of the embodiment;
Fig. 6 compares the evaluation function curves of the learning method of the embodiment and the traditional learning automaton method;
Fig. 7 compares the error curves of the learning method of the embodiment and the traditional learning automaton method.
Embodiment
The invention is further described below with reference to the drawings and a specific embodiment.
A coaxial two-wheeled robot is taken as the embodiment. The system structure is shown in Fig. 1, and learning follows the flow of steps shown in Fig. 2.
A nonholonomic two-wheeled self-balancing robot is an inherently unstable system. Before it can perform any motion, it must first be able to keep its balance, so posture balance is the primary condition for motion control of a coaxial two-wheeled robot. To verify the validity, robustness, and superiority of the proposed developmental automaton with a brain cognitive mechanism, this embodiment takes the coaxial two-wheeled robot as its object and studies how the robot, in an unknown environment, eventually masters the balance-control skill through autonomous learning.
During the experiments, the four output quantities of the robot must satisfy the following conditions: the angular velocities of the right and left wheels, $\theta_r$ and $\theta_l$, are below 3.489 rad/s; the inclination of the body satisfies $\alpha < 0.1744$ rad; and the angular velocity of the robot's swing rod satisfies $\beta < 3.489$ rad/s. The discount factor is $\gamma = 0.9$ and the sampling time is 0.01 s. In each experiment, learning is stopped and a new experiment is started when the number of trials exceeds 1000 or when the number of balanced steps in a single trial exceeds 20000. If the robot still keeps its balance after 20000 steps in one trial, it is considered to have learned the balance-control skill. After each failed trial, the initial state and all weights are reset to random values within a certain range, and learning starts again.
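The limits and stopping rules quoted above can be encoded as a short check. This is a sketch under stated assumptions: the class and function names are hypothetical, and the limits are applied to magnitudes (absolute values), which the text does not state explicitly.

```python
from dataclasses import dataclass

# Limits and counts quoted in the embodiment (rad/s, rad, and counts).
WHEEL_RATE_LIMIT = 3.489
TILT_LIMIT = 0.1744
ROD_RATE_LIMIT = 3.489
MAX_TRIALS = 1000
MAX_BALANCE_STEPS = 20000


@dataclass
class RobotOutputs:
    wheel_rate_right: float  # theta_r, rad/s
    wheel_rate_left: float   # theta_l, rad/s
    tilt: float              # alpha, rad
    rod_rate: float          # beta, rad/s


def is_balanced(o: RobotOutputs) -> bool:
    """True while all four outputs stay inside their limits.

    Assumption: the limits bound the magnitude of each quantity.
    """
    return (abs(o.wheel_rate_right) < WHEEL_RATE_LIMIT
            and abs(o.wheel_rate_left) < WHEEL_RATE_LIMIT
            and abs(o.tilt) < TILT_LIMIT
            and abs(o.rod_rate) < ROD_RATE_LIMIT)


def experiment_status(trials: int, balance_steps: int) -> str:
    """Classify an experiment according to the stopping rules in the text."""
    if balance_steps > MAX_BALANCE_STEPS:
        return "learned"   # balance kept beyond 20000 steps in one trial
    if trials > MAX_TRIALS:
        return "stopped"   # more than 1000 trials without success
    return "running"
```

A simulation loop would call `is_balanced` after every step to decide whether the current trial has failed, and `experiment_status` to decide whether to stop the experiment.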
Experiment 1: Balance control experiment
In an unknown environment without disturbance, the robot applied the proposed method and learned continuously. It went through 42 trials and completed the experiment in the 43rd trial, needing about 220 steps, i.e. about 2.2 s, to learn the balance-control skill. This demonstrates its fast autonomous learning ability and the effectiveness of the invention. The response curves of each state variable and the simulated evaluation function and error curves for the first 3000 steps of the simulation are shown in Fig. 3 and Fig. 4.
Experiment 2: Anti-disturbance experiment
When a real system runs, its input and output signals are always disturbed to some degree by external noise, and inaccurate detection devices introduce a certain error into the state variables. To simulate a real environment, a pulse signal of amplitude 25 is added to each input state variable once the robot has kept its balance for 9800 steps after learning balance control. If the robot withstands the pulse disturbance and keeps its balance, the experiment is considered successful and shows that the invention has a certain robustness. Fig. 5 shows the output response of each state after the pulse signal is added. It can be seen that after 200 steps, i.e. about 2 s, the robot returns to the equilibrium position.
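A disturbance of this kind can be sketched as below. The helper names are hypothetical, and three details are assumptions not given in the text: the pulse lasts a single step, the equilibrium is at zero for every state component, and recovery is judged with a user-chosen tolerance.

```python
def inject_pulse(state, step, pulse_step=9800, amplitude=25.0):
    """Return a copy of `state`, adding the pulse to every component
    at `pulse_step` (assumed to last one step)."""
    if step == pulse_step:
        return [x + amplitude for x in state]
    return list(state)


def steps_to_recover(trajectory, pulse_index, tolerance):
    """Count the steps after the pulse until every state component is
    within `tolerance` of zero; return None if that never happens."""
    for k, state in enumerate(trajectory[pulse_index + 1:], start=1):
        if all(abs(x) <= tolerance for x in state):
            return k
    return None
```

With the 0.01 s sampling time of the embodiment, a returned count of 200 would correspond to about 2 s.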
Experiment 3: Comparison of this embodiment with the traditional learning automaton
Because the invention introduces the intrinsic motivation mechanism to drive the autonomous learning of the robot, it helps reduce the error of the system and speeds up the convergence of the algorithm. To demonstrate the superiority of the invention, balance control experiments were carried out on the coaxial two-wheeled robot with both the traditional learning automaton algorithm and the invention, and the results were analyzed. The parameters of the two algorithms were set identically in the experiments. Fig. 6 and Fig. 7 compare the evaluation function and error curves of the two algorithms over the first 2000 steps. Fig. 6 shows that the invention completes the learning of the balance-control skill in about 220 steps, i.e. 2.2 s, whereas the traditional learning automaton method completes it only after about 600 steps, i.e. 6 s, so the invention converges faster than the traditional learning automaton method. Fig. 7 shows that the error range of the invention is smaller than that of the traditional learning automaton method, which is more favorable to the stability of the system.
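As a quick check of the step-to-time figures quoted in the experiments, assume that each learning step takes exactly one sampling period $\Delta t = 0.01$ s (the symbols $N$ and $\Delta t$ are introduced here only for illustration):

$$
\begin{aligned}
t &= N \, \Delta t \\
220 \times 0.01\ \text{s} &= 2.2\ \text{s} && \text{(proposed method, Experiments 1 and 3)} \\
600 \times 0.01\ \text{s} &= 6\ \text{s} && \text{(traditional learning automaton)} \\
200 \times 0.01\ \text{s} &= 2\ \text{s} && \text{(recovery after the pulse, Experiment 2)} \\
\frac{600}{220} &\approx 2.7
\end{aligned}
$$

Under this assumption the reported step counts agree with the reported times, and the proposed method needs roughly 2.7 times fewer steps than the traditional method in the reported comparison.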
Claims (2)
- 1. A developmental automaton with a brain cognitive mechanism, characterized in that it comprises an internal state set, a system output set, an internal operation behavior set, a state transition equation, a reward signal, a system evaluation function, a system action selection probability, and a dopamine response differential signal;

  (1) $SC = [s_1, s_2, \ldots, s_j]$ denotes the finite set of internal states and corresponds to the sensory cortex of the cerebral cortex; $s_j$ is the $j$-th state and $j$ is the number of internal states;

  (2) $MC = [y_1, y_2, \ldots, y_i]$ denotes the system output set and corresponds to the motor cortex of the cerebral cortex; $y_i$ is the $i$-th output and $i$ is the number of outputs;

  (3) $CbA = [a_1, a_2, \ldots, a_k]$ denotes the internal operation behavior set and corresponds to the cerebellum; $a_k$ is the $k$-th internal action and $k$ is the number of internal actions;

  (4) $f: s(t) \times a(t) \to s(t+1)$ is the state transition equation: the state $s(t+1)$ at time $t+1$ is determined jointly by the state $s(t)$ and the operation behavior $a(t)$ at time $t$, and is determined by the environment or by a model;

  (5) $r(t) = r(s(t), a(t))$ denotes the reward signal obtained when the system, in internal state $s(t)$ at time $t$, takes the internal operation behavior $a(t)$ and the state transfers to $s(t+1)$; it corresponds to the signal sent by the thalamus;

  (6) the input signal from the cerebral cortex consists of two parts, sensory cortex information and motor cortex information, which together form the input to the striatum, therefore:

  $$CC = \{SC, MC\} \tag{1}$$

  the striatum mainly serves as the evaluation mechanism that predicts how good the orientation of the organism's operation is, and more precisely evaluates the quality of the orientation produced by the intrinsic motivation mechanism; the system evaluation function is defined as:

  $$BG_{strio}(t) = r(t+1) + \gamma r(t+2) + \gamma^2 r(t+3) + \cdots \tag{2}$$

  where $\gamma \in [0, 1]$ is the discount factor; because of the intrinsic motivation mechanism, the evaluation function $BG_{strio}$ gradually approaches 0, which ensures that the system finally reaches a stable state; $\eta$ is defined as the orientation core of the intrinsic motivation mechanism, whose main function is to guide the direction of autonomous cognition; the value of $\eta$ lies in $[\eta_{min}, \eta_{max}]$, that is, between the function values of the worst and the best orientation; the intrinsic motivation orientation function in the striatum is then defined by formula (3):

  $$\eta(t) = \frac{1 - e^{-\lambda BG_{strio}(t)}}{1 + e^{-\lambda BG_{strio}(t)}} \tag{3}$$

  where $\lambda$ is the parameter of the orientation function; the difference between the orientation function at two adjacent moments, $\theta(t) = \eta(t) - \eta(t-1)$, is used to judge the orientation degree of the system: if $\theta(t) > 0$, the orientation value at time $t$ is larger than at time $t-1$; conversely, if $\theta(t) < 0$, the orientation value at time $t$ is smaller than at time $t-1$;

  (7) in the learning process of the basal ganglia, the matrix of the striatum mainly performs action selection; the most important feature of learning driven by the intrinsic motivation mechanism is that the action to execute is selected according to its probability; the action selection function of the matrix is implemented with the Boltzmann probability rule, which is well known, thereby realizing the probabilistic selection mechanism of the learning automaton; first define:

  $$A = Boltz_T\{E(s, a_k),\ k = 1, 2, \ldots, m\} \iff p(a = a_k) = \frac{e^{E(s, a_k)/T}}{\sum_{k=1}^{m} e^{E(s, a_k)/T}} \tag{4}$$

  where $m$ denotes the $m$-th internal action, $A$ denotes the Boltzmann probability rule, and $p(a = a_k)$ is the action selection probability; following the definition in formula (4), the action selection probability output by the striatal matrix, $BG_{matrix}(s, a)$, replaces $p(a = a_k)$, and substituting formula (2) into formula (4) gives formula (5):

  $$BG_{matrix}(s, a) = \frac{e^{BG_{strio}(SC(t), a_k)/T}}{\sum_{k=1}^{m} e^{BG_{strio}(SC(t), a_k)/T}} \tag{5}$$

  where $T$ is the temperature constant, which expresses how random the action selection is: the larger $T$, the more random the selection, and the smaller $T$, the less random; as $T$ tends to zero, the selection probability of the action corresponding to $BG_{strio}(SC(t), a_k)$ tends to 1; the value of $T$ decreases gradually over time, which reflects that the system accumulates empirical knowledge during learning and evolves from an unstable system into a stable one;

  (8) the dopamine released by the substantia nigra pars compacta serves as the guidance signal for action evaluation and is used to improve behavior so that actions yield the maximum future reward and more accurate actions are executed; the evaluation function determined by the striatum at time $t+1$ is:

  $$BG_{strio}(t+1) = r(t+2) + \gamma r(t+3) + \gamma^2 r(t+4) + \cdots \tag{6}$$

  combining formula (2) and formula (6) gives formula (7):

  $$BG_{strio}(t) = r(t+1) + \gamma BG_{strio}(t+1) \tag{7}$$

  this shows that, at time $t$, the evaluation function $BG_{strio}(t)$ can be expressed through the evaluation function $BG_{strio}(t+1)$ at time $t+1$; however, because of prediction errors in the initial stage, the value of $BG_{strio}(t)$ expressed through $BG_{strio}(t+1)$ is not equal to the actual value, so the reward information output by the thalamus and the output of the striatum need to be processed in the substantia nigra pars compacta, which releases dopamine $SN_{DPA}$ to adjust the evaluation value; the dopamine response differential signal is given by formula (8):

  $$SN_{DPA} = r(t+1) + \gamma BG_{strio}(t+1) - BG_{strio}(t) \tag{8}$$
- 2. The developmental automaton with a brain cognitive mechanism according to claim 1, characterized in that its learning method comprises the following steps:

  (1) initialization: set the learning step counter to $t = 0$ and the number of learning iterations to $step_{max}$; initialize all parameters and synaptic weights, so that every internal operation behavior is equally likely to be executed when the experiment starts;

  (2) perceive the current state $SC(t)$;

  (3) compute the evaluation function $BG_{strio}(t)$ in the striatum; because of the intrinsic motivation mechanism, compute the orientation function $\eta(t)$ from the current value of $BG_{strio}(t)$;

  (4) according to the quality of the orientation, compute the action selection probability $BG_{matrix}(s, a)$ of the striatal matrix by the formula, and let the cerebellum execute the action $a(t)$;

  (5) according to the state transition equation, the state changes from $SC(t)$ to $SC(t+1)$;

  (6) the thalamus sends the immediate reward $r(t)$ and triggers the dopamine response, which adjusts the evaluation value;

  (7) the motor cortex outputs the action $y(t)$;

  (8) repeat steps (2) to (7) until $t = step_{max}$; learning then ends.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510628233.0A CN105205533B (en) | 2015-09-29 | 2015-09-29 | Development automatic machine and its learning method with brain Mechanism of Cognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105205533A CN105205533A (en) | 2015-12-30 |
CN105205533B true CN105205533B (en) | 2018-01-05 |
Family
ID=54953202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510628233.0A Expired - Fee Related CN105205533B (en) | 2015-09-29 | 2015-09-29 | Development automatic machine and its learning method with brain Mechanism of Cognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105205533B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105690392B (en) * | 2016-04-14 | 2017-11-28 | Soochow University | Motion planning and robot control method and apparatus based on actor reviewer's method |
CN105824251B (en) * | 2016-05-18 | 2018-06-15 | Chongqing University of Posts and Telecommunications | It is a kind of based on neural network it is bionical become warm behavioral approach |
CN106598058A (en) * | 2016-12-20 | 2017-04-26 | North China University of Science and Technology | Intrinsically motivated extreme learning machine autonomous development system and operating method thereof |
CN107894715A (en) * | 2017-11-13 | 2018-04-10 | North China University of Science and Technology | The cognitive development method of robot pose path targetpath optimization |
CN108646550B (en) * | 2018-04-03 | 2022-03-22 | Jiangsu Jiangrong Intelligent Technology Co., Ltd. | Multi-agent formation method based on behavior selection |
CN109002887A (en) * | 2018-08-10 | 2018-12-14 | North China University of Science and Technology | The heuristic curiosity cognitive development system of biology and its operation method |
CN109212975B (en) * | 2018-11-13 | 2021-05-28 | North China University of Technology | Cognitive learning method with development mechanism for perception action |
CN112558605B (en) * | 2020-12-06 | 2022-12-16 | Beijing University of Technology | Robot behavior learning system based on striatum structure and learning method thereof |
CN113255765B (en) | 2021-05-25 | 2024-03-19 | Nanjing University of Aeronautics and Astronautics | Cognitive learning method based on brain mechanism |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101599137A (en) * | 2009-07-15 | 2009-12-09 | Beijing University of Technology | Autonomous operant conditioning reflex automat and the application in realizing intelligent behavior |
US7668796B2 (en) * | 2006-01-06 | 2010-02-23 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Automata learning algorithms and processes for providing more complete systems requirements specification by scenario generation, CSP-based syntax-oriented model construction, and R2D2C system requirements transformation |
CN101673354A (en) * | 2009-06-12 | 2010-03-17 | Beijing University of Technology | Operant conditioning reflex automatic machine and application thereof in control of biomimetic autonomous learning |
Non-Patent Citations (3)
Title |
---|
A study on autonomous learning mechanism of Cognitive robot; Shi Tao; Control and Decision Conference; 20150525; full text * |
Research on robust bionic learning algorithm in balance control for the robot; Shi Tao; Control and Decision Conference; 20150525; full text * |
An intrinsically motivated FRBF network autonomous learning algorithm; Ren Hongge; Journal of Hebei United University (Natural Science Edition); 20150731; Vol. 37, No. 3; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN105205533A (en) | 2015-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105205533B (en) | Development automatic machine and its learning method with brain Mechanism of Cognition | |
Newell | Change in movement and skill: Learning, retention, and transfer | |
Casellato et al. | Adaptive robotic control driven by a versatile spiking cerebellar network | |
Tamosiunaite et al. | Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives | |
Holland et al. | Robots with internal models a route to machine consciousness? | |
Caligiore et al. | TRoPICALS: a computational embodied neuroscience model of compatibility effects. | |
Salt et al. | Parameter optimization and learning in a spiking neural network for UAV obstacle avoidance targeting neuromorphic processors | |
Caligiore et al. | Integrating reinforcement learning, equilibrium points, and minimum variance to understand the development of reaching: a computational model. | |
CN107894715A (en) | The cognitive development method of robot pose path targetpath optimization | |
Gumbsch et al. | Autonomous identification and goal-directed invocation of event-predictive behavioral primitives | |
Weng et al. | Modulation for emergent networks: Serotonin and dopamine | |
Zhang et al. | Overview of deep reinforcement learning improvements and applications | |
CN103886367B (en) | A kind of bionic intelligence control method | |
John et al. | Modelling 3D saccade generation by feedforward optimal control | |
CN112405542B (en) | Musculoskeletal robot control method and system based on brain inspiring multitask learning | |
Hilleli et al. | Toward deep reinforcement learning without a simulator: An autonomous steering example | |
Guo et al. | WWN-9: Cross-domain synaptic maintenance and its application to object groups recognition | |
Houbre et al. | Balancing exploration and exploitation: a neurally inspired mechanism to learn sensorimotor contingencies | |
Preux et al. | A generic architecture for adaptive agents based on reinforcement learning | |
CN112525194A (en) | Cognitive navigation method based on endogenous and exogenous information of hippocampus-striatum | |
Ren et al. | A computational model of cognitive development for the motor skill learning from curiosity | |
CN112766317A (en) | Neural network weight training method based on memory playback and computer equipment | |
CN109002887A (en) | The heuristic curiosity cognitive development system of biology and its operation method | |
Saquib | Visual Tracking with Spiking Neural Networks in an Oculomotor Controller for a Biomimetic Model of the Eye | |
Vaidya et al. | Reducing catastrophic forgetting in self organizing maps with internally-induced generative replay (student abstract) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180105; termination date: 20180929 |