CN109784400A - Agent behavior training method, apparatus, system, storage medium and device - Google Patents
- Publication number
- CN109784400A CN109784400A CN201910028902.9A CN201910028902A CN109784400A CN 109784400 A CN109784400 A CN 109784400A CN 201910028902 A CN201910028902 A CN 201910028902A CN 109784400 A CN109784400 A CN 109784400A
- Authority
- CN
- China
- Prior art keywords
- data
- decision
- auxiliary
- intelligent body
- behavioral
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
This application relates to an agent behavior training method. The method includes: obtaining decision data from an action process performed by an expert, where the decision data includes a set of multiple pairs of decision behavior data and corresponding decision observation data; obtaining auxiliary data from an auxiliary action process, where the auxiliary data includes a set of multiple pairs of auxiliary behavior data and corresponding auxiliary observation data; and performing autonomous model learning based on the decision data and the auxiliary data to obtain an agent behavior model. The technical solution of the present invention improves the success rate of agent behavior training, saves model training time, and improves the adaptability and accuracy of the agent model in various situations.
Description
Technical field
This application relates to the field of device control technology, and in particular to an agent behavior training method, apparatus, system, storage medium and device.
Background art

With advances in science and technology, society as a whole is developing toward intelligence and automation, and more and more behaviors rely on agents for their realization, for example grasping actions, assembly actions, and actions that move objects.

Artificial intelligence brings unlimited possibilities to the future development of agents. Neural network models can be trained by various methods such as supervised, semi-supervised, reinforcement, or imitation learning, so that an agent controlled by the network model can autonomously learn to perform various actions.

Imitation learning means learning from examples provided by a demonstrator. Multiple groups of the expert's decision data are obtained from the demonstration process; each group of decision data includes state data and corresponding action data, and all state-action pairs are gathered into a new set. The states can then be used as features and the actions as labels, and classification learning (for discrete actions) or regression learning (for continuous actions) is performed to obtain an optimal policy model.

However, it should be noted that during neural network training, imitation learning alone fails to produce good training results in many situations.
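The imitation-learning recipe described in this background section (states as features, actions as labels, regression for continuous actions) can be illustrated with a minimal sketch. The synthetic data and the linear least-squares policy below are invented for illustration only and are not the training procedure claimed by this application:

```python
import numpy as np

# Hypothetical demonstration data: each row pairs an observed state
# (feature) with the expert's continuous action (label).
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 4))            # observation vectors
true_w = np.array([[0.5], [-1.0], [0.3], [2.0]])
actions = states @ true_w                      # expert actions (noiseless here)

# Behavior cloning as regression: fit a linear policy by least squares.
w, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The learned policy maps a new observation to a predicted action.
policy = lambda s: s @ w
```

Because the synthetic actions are exactly linear in the states, the fitted policy recovers the generating weights; with real demonstrations one would use a richer model (e.g., a neural network) and noisy labels.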
Summary of the invention
Based on this, the present invention provides an agent behavior training method, apparatus, system, storage medium and device.

A first aspect of the present invention provides an agent behavior training method, which includes:

obtaining decision data from an action process performed by an expert, where the decision data includes multiple items of decision behavior data and corresponding decision observation data;

obtaining auxiliary data from an auxiliary action process, where the auxiliary data includes multiple items of auxiliary behavior data and corresponding auxiliary observation data; and

performing autonomous model learning based on the decision data and the auxiliary data to obtain an agent behavior model.

Further, performing autonomous model learning based on the decision data and the auxiliary data to obtain the agent behavior model includes:

training an initial model based on the decision data and the auxiliary data to obtain a pre-trained model; and

performing autonomous learning on the pre-trained model to obtain the agent behavior model.

Further, performing autonomous model learning based on the decision data and the auxiliary data to obtain the agent behavior model includes:

performing autonomous learning on the initial model based on the decision data and the auxiliary data to obtain the agent behavior model.
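The decision data and auxiliary data described in the method above are sets of (behavior, observation) pairs. A minimal container for such data, with all class and field names invented for illustration, might look like:

```python
from dataclasses import dataclass, field

@dataclass
class Sample:
    observation: tuple   # sensor readings at one moment
    behavior: tuple      # the action taken at (or following) that moment

@dataclass
class TrainingData:
    decision: list = field(default_factory=list)   # expert demonstration pairs
    auxiliary: list = field(default_factory=list)  # auxiliary (e.g. erroneous) pairs

ds = TrainingData()
ds.decision.append(Sample(observation=(0.1, 0.2), behavior=(1.0,)))
ds.auxiliary.append(Sample(observation=(0.3, 0.4), behavior=(-1.0,)))
```

Keeping the two sets separate matches the method's later step of treating expert and auxiliary samples differently during training.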
Further, obtaining the decision data from the action process performed by the expert includes:

obtaining the decision behavior data for multiple current moments during the action process performed by the expert; and

obtaining the decision observation data for the multiple current moments sent by a first sensor during the action process performed by the expert, where the decision behavior data of a current moment corresponds to the decision observation data of that moment;

or

obtaining related information of the decision behavior data for multiple current moments sent by a second sensor during the action process performed by the expert;

parsing the related information to generate the decision behavior data of multiple previous moments; and

obtaining the decision observation data for the multiple previous moments sent by the first sensor during the action process performed by the expert, where the decision behavior data of a previous moment corresponds to the decision observation data of that moment.

Further, obtaining the auxiliary data from the auxiliary action process includes:

obtaining the auxiliary behavior data for multiple current moments during the auxiliary action process; and

obtaining the auxiliary observation data for the multiple current moments sent by the first sensor during the auxiliary action process, where the auxiliary behavior data of a current moment corresponds to the auxiliary observation data of that moment;

or

obtaining related information of the auxiliary behavior data for multiple current moments sent by the second sensor during the auxiliary action process;

obtaining the behavior data of multiple previous moments according to the related information; and

obtaining the auxiliary observation data for the multiple previous moments sent by the first sensor during the auxiliary action process, where the auxiliary behavior data of a previous moment corresponds to the auxiliary observation data of that moment.
A second aspect of the present invention provides an agent behavior training control device, which includes:

a decision data acquisition module, configured to obtain the decision data from the action process performed by the expert, where the decision data includes multiple items of decision behavior data and corresponding decision observation data;

an auxiliary data acquisition module, configured to obtain the auxiliary data from the auxiliary action process, where the auxiliary data includes multiple items of auxiliary behavior data and corresponding auxiliary observation data; and

a behavior model generation module, configured to perform autonomous model learning based on the decision data and the auxiliary data to obtain the agent behavior model.
A third aspect of the present invention provides an agent behavior training system, which includes:

a behavior data generating means, configured to generate the decision behavior data and the auxiliary behavior data and send them to the control device;

a first sensor, configured to obtain the decision observation data and the auxiliary observation data and send them to the control device; and

a control device, configured to obtain the decision data from the action process performed by the expert, where the decision data includes multiple items of decision behavior data and corresponding decision observation data; obtain the auxiliary data from the auxiliary action process, where the auxiliary data includes multiple items of auxiliary behavior data and corresponding auxiliary observation data; and perform autonomous model learning based on the decision data and the auxiliary data to obtain the agent behavior model.

Further, the agent behavior training system also includes:

an agent, configured to execute the expert behavior and the auxiliary behavior under teaching.
Further, the first sensor includes:

an image sensor, configured to obtain image data of the agent at a certain moment;

a force sensor, configured to obtain force feedback data of the agent at a certain moment;

an encoder, configured to obtain motion feedback data of a driving unit of the agent at a certain moment;

a range finder, configured to obtain distance-related ranging data of the agent at a certain moment;

a speed or acceleration measuring device, configured to obtain speed or acceleration measurement data of the agent at a certain moment;

a current or voltage measuring device, configured to obtain current or voltage measurement data of the agent at a certain moment;

a timer, configured to obtain the specific time data of a certain moment; and/or

a temperature sensor, configured to obtain temperature data of the agent at a certain moment.
Further, the behavior data generating means includes a control unit, configured to generate the decision behavior data and the auxiliary behavior data.

Further, the behavior data generating means includes a second sensor and a control unit;

the second sensor is configured to obtain related information of the decision behavior data and the auxiliary behavior data for multiple current moments; and

the control unit is configured to obtain the behavior data of multiple previous moments according to the related information.

Further, the second sensor includes an image sensor and an encoder.
A fourth aspect of the present invention provides a robot system, which includes the agent behavior training system of any of the above items.

A fifth aspect of the present invention provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the agent behavior training method of any of the above items when executing the computer program.

A sixth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program implements the agent behavior training method of any of the above items when executed by a processor.
Because the auxiliary data of the auxiliary behavior and the decision data of the expert behavior are jointly input into the initial model during model training, model training time is saved and the adaptability and accuracy of the agent model in various situations are improved.
Description of drawings

Fig. 1 is a first flow diagram of an agent behavior training method in one embodiment;

Fig. 2 is a second flow diagram of an agent behavior training method in one embodiment;

Fig. 3 is a third flow diagram of an agent behavior training method in one embodiment;

Fig. 4 is a fourth flow diagram of an agent behavior training method in one embodiment;

Fig. 5 is a fifth flow diagram of an agent behavior training method in one embodiment;

Fig. 6 is a sixth flow diagram of an agent behavior training method in one embodiment;

Fig. 7 is a first structural diagram of an embodiment of a robot system;

Fig. 8 is a second structural diagram of an embodiment of a robot system;

Fig. 9 is a first structural block diagram of an agent training device;

Fig. 10 is a second structural block diagram of an agent training device;

Fig. 11 is a first structural block diagram of an agent training system in one embodiment;

Fig. 12 is a second structural block diagram of an agent training system in one embodiment;

Fig. 13 is a first structural block diagram of a behavior data generating means of a robot in one embodiment;

Fig. 14 is a second structural block diagram of a behavior data generating means of a robot in one embodiment.
Specific embodiments

In order to make the objects, technical solutions and advantages of the application more clearly understood, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are only used to explain the application and are not intended to limit it.
In one embodiment, as shown in Fig. 1, an agent behavior training method is provided, including the following steps:

Step S100: obtain the decision data from the action process performed by an expert.
Specifically, the agent may be an independent computer control device that realizes machine learning, or any of various automation devices (for example, in the industrial, medical, entertainment, transportation, or service fields) or measuring devices that include such a control device. For convenience of description, this embodiment is further described taking a robot as the agent, where a robot can be regarded as an advanced automation device. In some embodiments, the amount of motion of the robot in the following examples may refer to the amount of motion of any part of the robot, such as the amount of motion of the end effector.

Specifically, as shown in Fig. 3 or Fig. 4, a robot can form various types of manipulators by connecting multiple joints and links in series or in parallel, each joint being a driving unit; examples include serial manipulators such as four-axis and six-axis robots, and parallel manipulators. In some embodiments, an end effector, such as a suction cup or a gripper, is fixed to the output end of the terminal shaft of the manipulator; therefore, the end of the manipulator described in this embodiment may refer to the end effector of the manipulator, or to the terminal shaft of the manipulator.
Here, decision data refers to the aggregated set of pairs formed by the observation data obtained at a certain moment and the corresponding behavior data obtained at that moment.
Specifically, the action process may include, but is not limited to: the action of grabbing a target object from bulk or regularly arranged objects (as shown in Fig. 8); the action of assembling a target object (as shown in Fig. 7); the action of putting down an object (drawing omitted); the action of moving from one position to another (drawing omitted); or a combination of some or all of the above actions.
In one embodiment, the decision data is obtained during an action process in which the robot performs the expert behavior under teaching.

Specifically, the robot may be guided by an operator, or driven by control instructions generated by a controller, to perform the expert behavior. For example, the robot completes the assembly action of building blocks under the guidance of an operator; as another example, it completes the assembly action of building blocks according to motion-amount instructions sent by the controller to each driving unit of the robot.
Further, in some embodiments, in the case where the robot performs the expert behavior driven by instructions generated by the behavior data generating means:

the behavior data may include, but is not limited to: the target pose (X, Y, Z, U, V, W coordinates) or position (X, Y coordinates) of each step executed by the robot, as output by the controller for each step of the expert action process; or the amounts of motion (rotation and/or translation) of the corresponding driving units of the robot, calculated from the target pose or position based on the kinematic equations; or the amount of motion of the robot.
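As a hedged illustration of converting a target position into driving-unit motion amounts via kinematic equations, here is a closed-form inverse-kinematics sketch for a hypothetical planar two-link arm. The link lengths and function names are invented for illustration; the manipulators contemplated in this application may have four, six, or more axes with more involved kinematics:

```python
import math

def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Joint angles (radians) placing a planar 2-link arm's end at (x, y)."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("target out of reach")
    theta2 = math.acos(c2)                       # elbow-down solution
    k1 = l1 + l2 * math.cos(theta2)
    k2 = l2 * math.sin(theta2)
    theta1 = math.atan2(y, x) - math.atan2(k2, k1)
    return theta1, theta2

def two_link_fk(theta1, theta2, l1=1.0, l2=1.0):
    """Forward kinematics: end position from the joint angles."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

The forward-kinematics check mirrors how a controller could validate the computed joint motion amounts against the commanded target pose.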
The control device obtains the observation data acquired and sent by the first sensor. Specifically, the observation data may include, but is not limited to: image data acquired and sent by the image sensor, or the pose or position of the robot (e.g., of the robot's end effector) extracted from that image data; ranging data acquired and sent by the distance measuring sensor; force (force/torque) feedback data acquired and sent by the force sensor; the amount-of-motion (rotation and/or translation) data of the robot's driving units acquired and sent by the encoder; speed or acceleration data acquired and sent by the speed or acceleration measuring device; current or voltage measurement data acquired and sent by the current or voltage measuring device; time data acquired and sent by the timer; and temperature data acquired and sent by the thermometer.
For example, as shown in Fig. 7, taking training the robot in an assembly behavior (e.g., assembling object M2 onto object M1) as an example: multiple groups of decision data are obtained during the action process performed by the expert. Specifically, the behavior data may be the target pose or position of the robot's next step, or the amount of motion of the driving units, output by the behavior data generating means at a certain moment; and the corresponding observation data may be the image data (or pose or position), force feedback data, encoder feedback data, speed or acceleration data, and/or current or voltage data sent by each first sensor at that moment. During the action process performed by the expert, the multiple groups of decision data obtained are sent to the control unit of the robot. In some embodiments, the multiple groups of decision data in the expert's action process need to include at least the decision data under the successful-assembly state.

In some embodiments, when the control device obtains image data, it may use the image data directly as observation data, or extract the robot's pose or position from the image data and use that pose or position as observation data.
Further, in some embodiments, in the case where the operator guides the robot to perform the expert behavior:

since in this case there is no explicit behavior instruction to serve as behavior data, the behavior data, or related information about it, can be obtained indirectly through certain second sensors. Here, the first sensor and the second sensor may include sensors of the same type, for example, an image sensor and an encoder. In some embodiments, identical sensors among the first and second sensors can be merged into one sensor; that is, the data obtained can serve both as behavior data and as observation data. For example, the motion-amount data of the driving unit sent by the encoder at the current moment can serve as the observation data of the current moment and also as the behavior data of the previous moment. As another example, the pose or position of the robot obtained from the image acquired by the image sensor at the current moment may serve as the behavior data of the previous moment, and also as the observation data of the robot at the current moment.
For example, as shown in Fig. 8, take training the robot to grasp objects from bulk, where "bulk" means that multiple objects M are scattered in an irregular state. Multiple groups of decision data (behavior data and corresponding observation data) are obtained during the action process performed by the expert. Specifically, the behavior data of a certain current moment may be the pose or position of the robot extracted from the image sent by the image sensor at the next moment, or the amount of motion of the robot obtained from the robot poses or positions extracted from the images of the current moment and the next moment. The observation data of the current moment may be the information sent by each first sensor at that moment, such as: force feedback data from the force sensor (e.g., pressure sensors arranged on the fingers obtain the magnitude and/or direction of force when completing the grasping action, or a multi-dimensional force sensor arranged at the output end of the robot's terminal shaft obtains the change of force or torque at the output end during grasping), driving unit feedback data (e.g., the angle of motor rotation or movement), speed or acceleration data (the robot's speed or acceleration during motion), and/or current or voltage data (e.g., the current or voltage value input to the motor). In addition, the robot's pose or position at the current moment can also be extracted from the image data of the current moment.

Specifically, the behavior data may include, but is not limited to: the target pose or position, or the amount of motion of each driving unit of the robot.

In some embodiments, the multiple groups of decision data in the expert's action process need to include at least the decision data at the moment of a successful grasp.
As another example, take training the robot to move (translate and/or rotate) from one position to another. Multiple groups of decision data are obtained during the action process performed by the expert. Specifically, the behavior data may include the pose of the robot's actuator extracted from the images acquired by the image sensor at each moment of the robot's motion; and the corresponding observation data of each moment may include, for example, the distance to the target position fed back as ranging data (e.g., a rangefinder such as an infrared rangefinder installed on the robot feeds back the distance to the target position), driving unit feedback data, speed or acceleration data, and so on. Specifically, the multiple groups of decision data in the expert's action process need to include at least the decision data at the moment of reaching the target position.
In one embodiment, the decision data is obtained from an action process in which the expert itself performs the action.

Specifically, the expert may be an operator or another robot. For example, the decision data of an operator realizing an assembly behavior may be obtained: image data of the operator executing the assembly process, shot and sent by the image sensors at multiple current moments, may be used to obtain the operator's behavior data of the previous moment and observation data of the current moment during the assembly process. In addition, force sensors may be installed on the person's hand, and the force sensors may feed back observation data during the person's hand-held assembly action process.

Specifically, during the execution of various expert behaviors such as object grasping and assembly, the image data obtained under multiple states may be 3D images, 2D images, or video images. The image sensor may include, but is not limited to: a camera, a video camera, a scanner, or other devices with related functions (mobile phones, computers, etc.). The number of image sensors may be any number greater than or equal to 1.

Specifically, the image sensor may be arranged on the robot or fixed at a certain position outside the robot; the image sensor, the image sensor together with the robot (so-called "hand-eye"), and the robot are calibrated in advance.
Step S200: obtain the auxiliary data from the auxiliary action process.

Specifically, the auxiliary data includes a set of multiple pairs of auxiliary behavior data and corresponding auxiliary observation data. Multiple groups of status data and corresponding action data, obtained during action processes that clearly assist in achieving the predetermined action purpose, are input into the model as auxiliary data. Specifically, when executing a certain trajectory to reach a certain destination, data obtained during certain auxiliary action trajectories, such as erroneous expert-behavior trajectories that may bump into obstacles (i.e., auxiliary behavior), can be used as auxiliary data.

In some embodiments, the decision data described in the above examples can be assigned a corresponding positive value, and the auxiliary data a corresponding negative value.
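The positive/negative value assignment above can be sketched as signed sample weights in a loss function, so that the model is pulled toward expert samples and pushed away from auxiliary ones. The weight magnitudes and function names below are invented for illustration:

```python
import numpy as np

def weighted_imitation_loss(pred, target, weights):
    """Per-sample squared error scaled by signed sample weights."""
    per_sample = np.sum((pred - target) ** 2, axis=1)
    return float(np.mean(weights * per_sample))

pred    = np.array([[0.0], [0.0]])
target  = np.array([[1.0], [1.0]])
# +1.0 for a decision (expert) sample, -1.0 for an auxiliary sample.
weights = np.array([1.0, -1.0])

loss = weighted_imitation_loss(pred, target, weights)  # the two errors cancel here
```

Minimizing such a loss rewards matching expert behavior and penalizes matching the auxiliary (erroneous) behavior, which is one way to read the patent's positive/negative value assignment.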
Step S300: based on the decision data and the auxiliary data, train the initial model to obtain a pre-trained model, and perform autonomous learning based on the pre-trained model to obtain the agent behavior model.

The autonomous learning process lets the agent generate some action trajectories based on the pre-trained model, then defines a standard to judge the difference between these trajectories and the expert action trajectories obtained during teaching, and then updates the strategy of the pre-trained model according to this difference so that the trajectories it generates next time are closer to the expert behavior, until, as judged by the standard, the action trajectories generated based on the pre-trained model are close enough to the expert's action trajectories; the model obtained at that point is the final agent behavior model.

Specifically, the standard described in the above examples can be obtained by various methods such as empirical values, machine learning, or random values; in some embodiments, this standard can be represented by a fitted neural network.

By using the above learning method, since the auxiliary data of the auxiliary behavior and the decision data of the expert behavior are jointly input into the initial model during model training, model training time is saved and the adaptability and accuracy of the agent model in various situations are improved.
In some embodiments, step S300 includes the following method steps:

S310: based on the decision data and the auxiliary data, train the initial model to obtain the pre-trained model.

The decision observation data and auxiliary observation data are used as features, and the decision behavior data and auxiliary behavior data as labels, for classification learning (for discrete actions) or regression learning (for continuous actions); the parameters of the initial model are continuously updated to obtain the pre-trained model.

S320: perform autonomous learning based on the pre-trained model to obtain the agent behavior model.

The autonomous learning process is as described above: the agent generates action trajectories based on the pre-trained model, a standard judges the difference between these trajectories and the expert action trajectories obtained during teaching, and the strategy of the pre-trained model is updated according to this difference until the generated trajectories are close enough to the expert's; the model obtained at that point is the final agent behavior model. The standard can likewise be obtained by empirical values, machine learning, random values, or, in some embodiments, be represented by a fitted neural network.
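The generate-judge-update loop of S320 can be sketched abstractly as follows. The linear policy, the "standard" (here a simple mean-squared-difference threshold rather than a fitted neural network), and the gradient update are all simplified stand-ins invented for illustration:

```python
import numpy as np

def autonomous_learning(expert_traj, init_policy, lr=0.5, tol=1e-3, max_iters=1000):
    """Iteratively nudge a linear policy toward expert behavior.

    expert_traj: array whose rows are (state..., expert_action).
    init_policy: initial weight vector mapping state -> action.
    The 'standard' here is mean squared trajectory difference vs. `tol`.
    """
    w = init_policy.copy()
    states, expert_actions = expert_traj[:, :-1], expert_traj[:, -1]
    for _ in range(max_iters):
        actions = states @ w                 # generate a trajectory
        diff = actions - expert_actions      # judge against the expert teaching
        if np.mean(diff ** 2) < tol:         # standard satisfied: stop
            break
        grad = states.T @ diff / len(states) # update the strategy
        w -= lr * grad
    return w

# Synthetic teaching data generated by a hypothetical expert policy.
states = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
expert_w = np.array([0.7, -0.2])
traj = np.column_stack([states, states @ expert_w])
learned = autonomous_learning(traj, init_policy=np.zeros(2))
```

The loop terminates when the standard judges the generated trajectory close enough to the expert's, mirroring the stopping condition described in the text.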
By using the above learning method, the initial model is trained on the decision data obtained from the expert's behavior to obtain the pre-trained model, autonomous learning is performed based on the pre-trained model, and the learned agent model is finally obtained; therefore, the adaptability and accuracy of the trained agent model in completing behavior actions in various situations are improved. In addition, the training time of the agent behavior model can be reduced.
In some embodiments, step S300 includes the following method step:
S330: based on the decision data, carry out autonomous learning of the initial model to obtain the intelligent body behavior model.
The autonomous learning process lets the intelligent body generate action trajectories based on the initial model, defines a standard for judging the difference between these trajectories and the expert action trajectories obtained during teaching, and then updates the strategy of the initial model according to this difference, so that the trajectories it generates next time are closer to the expert's behavior. When, according to the standard, the action trajectories generated by the initial model are judged close enough to the expert's action trajectories, the model obtained is the final intelligent body behavior model.
As shown in Fig. 3, in some embodiments, step S100, obtaining the decision data in the action process of the expert, includes the following method steps:
S110: obtain the decision behavior data at multiple current moments in the action process of the expert;
S130: obtain the decision observation data at the multiple current moments sent by the first sensor in the action process of the expert; wherein the decision behavior data at a current moment corresponds to the decision observation data at that current moment.
As shown in Fig. 4, in some embodiments, step S100, obtaining the decision data in the action process of the expert, includes the following method steps:
S120: obtain the relevant information of the decision behavior data at multiple current moments sent by the second sensor in the action process of the expert;
S140: obtain the decision behavior data at multiple last moments according to the relevant information;
S160: obtain the decision observation data at the multiple last moments sent by the first sensor in the action process of the expert; wherein the decision behavior data at a last moment corresponds to the decision observation data at that last moment.
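Steps S120-S160 can be illustrated as follows: relevant information reported at each current moment (here, hypothetical encoder positions) is parsed into the behavior executed at the last moment, and each recovered behavior is paired with the first sensor's observation from that same last moment. All names and readings are illustrative assumptions.

```python
def align_last_moment(encoder_positions, observations):
    """Pair each recovered last-moment action with the observation
    recorded by the first sensor at that same last moment."""
    pairs = []
    for t in range(1, len(encoder_positions)):
        # Relevant information at moment t yields the behavior of moment t-1.
        action_last = encoder_positions[t] - encoder_positions[t - 1]
        pairs.append((observations[t - 1], action_last))
    return pairs

positions = [0.0, 1.0, 3.0, 6.0]   # hypothetical second-sensor readings
obs = ["o0", "o1", "o2", "o3"]     # hypothetical first-sensor observations
data = align_last_moment(positions, obs)
```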
As shown in Fig. 5, in some embodiments, step S200, obtaining the auxiliary data in the action process of executing the auxiliary behavior, includes the following method steps:
S210: obtain the auxiliary behavior data at multiple current moments in the action process of executing the auxiliary behavior;
S230: obtain the supplementary observation data at the multiple current moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a current moment corresponds to the supplementary observation data at that current moment; or
As shown in Fig. 6, in some embodiments, step S200, obtaining the auxiliary data in the action process of executing the auxiliary behavior, includes the following method steps:
S220: obtain the relevant information of the auxiliary behavior data at multiple current moments sent by the second sensor in the action process of executing the auxiliary behavior;
S230: obtain the auxiliary behavior data at multiple last moments according to the relevant information;
S240: obtain the supplementary observation data at the multiple last moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a last moment corresponds to the supplementary observation data at that last moment.
It should be understood that, although the steps in the flowcharts of Figs. 1, 2, 3, 4, 5 and 6 are displayed in the sequence indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in Figs. 1, 2, 3, 4, 5 and 6 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 9, an intelligent body behavior training control device is provided. The intelligent body behavior training device includes a decision data obtaining module 100, an auxiliary data obtaining module 200 and a behavior model generation module 300.
The decision data obtaining module 100 is configured to obtain the decision data in the action process of the expert.
The auxiliary data obtaining module 200 is configured to obtain the auxiliary data in the action process of executing the auxiliary behavior.
The behavior model generation module 300 is configured to train the initial model based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
As shown in Fig. 10, in some embodiments, the behavior model generation module 300 includes a pre-trained model generating unit 310 and a first behavior generating unit 320.
The pre-trained model generating unit 310 is configured to train the initial model based on the decision data and the auxiliary data to obtain the pre-trained model.
The first behavior generating unit 320 is configured to carry out autonomous learning on the pre-trained model to obtain the intelligent body behavior model.
In some embodiments, the behavior model generation module 300 includes a second behavior generating unit 330.
The second behavior generating unit 330 is configured to carry out autonomous learning of the initial model based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
In some embodiments, the decision data obtaining module includes a current decision behavior data generation module and a current decision observation data generation module.
The current decision behavior data generation module is configured to obtain the decision behavior data at multiple current moments in the action process of the expert.
The current decision observation data generation module is configured to obtain the decision observation data at the multiple current moments sent by the first sensor in the action process of the expert; wherein the decision behavior data at a current moment corresponds to the decision observation data at that current moment.
In some embodiments, the decision data obtaining module includes a current decision relevant information acquisition unit, a last decision behavior data generating unit and a last decision observation data generating unit.
The current decision relevant information acquisition unit is configured to obtain the relevant information of the decision behavior data at multiple current moments sent by the second sensor in the action process of the expert.
The last decision behavior data generating unit is configured to obtain the decision behavior data at multiple last moments according to the relevant information.
The last decision observation data generating unit is configured to obtain the decision observation data at the multiple last moments sent by the first sensor in the action process of the expert; wherein the decision behavior data at a last moment corresponds to the decision observation data at that last moment.
In some embodiments, the auxiliary data obtaining module includes a current auxiliary behavior data generation module and a current supplementary observation data generation module.
The current auxiliary behavior data generation module is configured to obtain the auxiliary behavior data at multiple current moments in the action process of executing the auxiliary behavior.
The current supplementary observation data generation module is configured to obtain the supplementary observation data at the multiple current moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a current moment corresponds to the supplementary observation data at that current moment.
In some embodiments, the auxiliary data obtaining module includes a current auxiliary relevant information acquisition unit, a last auxiliary behavior data generating unit and a last supplementary observation data generating unit.
The current auxiliary relevant information acquisition unit is configured to obtain the relevant information of the auxiliary behavior data at multiple current moments sent by the second sensor in the action process of executing the auxiliary behavior.
The last auxiliary behavior data generating unit is configured to obtain the auxiliary behavior data at multiple last moments according to the relevant information.
The last supplementary observation data generating unit is configured to obtain the supplementary observation data at the multiple last moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a last moment corresponds to the supplementary observation data at that last moment.
For the specific limitations of the intelligent body behavior training control device, reference may be made to the limitations of the intelligent body behavior training method above, which are not repeated here. Each module in the above intelligent body behavior training control device may be implemented wholly or partly by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in a computer device, or stored in software form in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, as shown in Fig. 11, an intelligent body behavior training system is provided, including a control device 400, a first sensor 500 and a behavior data generating means 600.
The control device 400 is configured to obtain the decision data in the action process of the expert, the decision data including multiple decision behavior data and corresponding decision observation data; to obtain the auxiliary data in the action process of executing the auxiliary behavior, the auxiliary data including multiple auxiliary behavior data and corresponding supplementary observation data; and to carry out model autonomous learning based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
The behavior data generating means 600 is configured to generate the decision behavior data and the auxiliary behavior data, and to send the decision behavior data and the auxiliary behavior data to the control device.
The first sensor 500 is configured to obtain the decision observation data and the supplementary observation data in the expert's action process, and to send the decision observation data and the supplementary observation data to the control device.
In one embodiment, as shown in Fig. 12, the intelligent body training system further includes an intelligent body 700, for executing the behavior of the expert and the auxiliary behavior under teaching.
Specifically, the first sensor 500 includes but is not limited to:
an imaging sensor, for obtaining image data of the intelligent body at a certain moment;
a force sensor, for obtaining force feedback data of the intelligent body at a certain moment;
an encoder, for obtaining motion feedback data of a driving unit of the intelligent body at a certain moment;
a range finder, for obtaining distance-related ranging data of the intelligent body at a certain moment;
a speed or acceleration measurer, for obtaining speed or acceleration measurement data of the intelligent body at a certain moment;
a current or voltage measurer, for obtaining current or voltage measurement data of the intelligent body at a certain moment;
a timer, for obtaining specific time data at a certain moment;
a temperature sensor, for obtaining temperature data of the intelligent body at a certain moment.
As shown in Fig. 13, in some embodiments, the behavior data generating means 600 includes a control unit 610.
The control unit 610 is configured to generate the decision behavior data and the auxiliary behavior data.
As shown in Fig. 14, in some embodiments, the behavior data generating means 600 includes a second sensor 620 and a control unit 610.
The second sensor 620 is configured to obtain the relevant information of the decision behavior data and the auxiliary behavior data at multiple current moments, and to send it to the control unit.
The control unit 610 is configured to parse the relevant information and generate the decision behavior data or the auxiliary behavior data at multiple last moments.
Specifically, the second sensor 620 may include but is not limited to an imaging sensor and an encoder.
It should be noted that when the first sensor 500 also includes, for example, an imaging sensor and an encoder, these may be arranged separately and independently of the imaging sensor and encoder included in the second sensor 620; alternatively, the imaging sensor and encoder may be shared, i.e., by parsing the relevant information captured by the imaging sensor and encoder at a certain current moment, both the decision behavior data and auxiliary behavior data of the last moment and the decision observation data and supplementary observation data of the current moment can be generated.
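The shared-sensor arrangement can be sketched as follows: a single current-moment reading serves double duty, yielding both the current observation and, by differencing against the previous reading, the behavior executed at the last moment. The readings and names below are hypothetical.

```python
def parse_shared_reading(prev_reading, curr_reading):
    """One shared encoder/imaging reading serves double duty."""
    observation_now = curr_reading              # current-moment observation data
    action_last = curr_reading - prev_reading   # last-moment behavior data
    return observation_now, action_last

obs_now, act_last = parse_shared_reading(prev_reading=1.5, curr_reading=2.0)
```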
Specifically, the control device 400 and the control unit 610 may be provided separately and independently, or may be combined into one device (for example, as shown in Figs. 5 and 6, the control device 400 and the control unit 610 are merged, and the control device 400 uniformly implements the robot behavior training method of the control device 400 and the behavior data generation method of the control unit 610, etc.).
The control device 400 and the control unit 610 may be a programmable logic controller (Programmable Logic Controller, PLC), a field programmable gate array (Field-Programmable Gate Array, FPGA), a computer (Personal Computer, PC), an industrial control computer (Industrial Personal Computer, IPC), a server, etc. The control device generates program instructions according to a program fixed in advance, in combination with manually input information or parameters and/or data collected by the external first sensor and/or second sensor (such as an imaging sensor).
For the specific limitations of the control device, reference may be made to the limitations of the intelligent body training method above, which are not repeated here. Each module in the above control device may be implemented wholly or partly by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in a computer device, or stored in software form in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, the present invention also provides a multi-agent system including the intelligent body behavior training system described in the above embodiments.
For the associated description of the intelligent body behavior training system, refer to the above embodiments; it is not repeated here.
It should be noted that the intelligent body and/or sensors mentioned in the above intelligent body behavior training method, behavior training control device, behavior training system or intelligent system may be a real intelligent body and sensors in a real environment, or may be a virtual intelligent body and/or sensors on a simulation platform that achieve, through a simulated environment, the effect of connecting a real intelligent body and/or sensors. The control device that has completed behavior training in the virtual environment can then be transplanted to the real environment to control or retrain the real intelligent body and sensors, which can save the resources and time of the training process.
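The virtual-to-real transplanting described above can be sketched as pre-training on simulated teaching data and then briefly retraining the same parameters on real-world data. Everything below (the one-parameter policy, the toy data sets, the update rule) is an illustrative assumption, not the disclosed implementation.

```python
def train(policy, samples, lr=0.1, steps=200):
    """One-parameter gradient fit of a linear policy to (obs, action) pairs."""
    for _ in range(steps):
        for obs, action in samples:
            # Squared-error gradient step toward the demonstrated action.
            policy -= lr * 2.0 * (policy * obs - action) * obs
    return policy

sim_data = [(1.0, 1.9), (2.0, 3.8)]    # simulated teaching data (action = 1.9*obs)
real_data = [(1.0, 2.0), (2.0, 4.0)]   # scarcer real-world data (action = 2.0*obs)

pretrained = train(0.0, sim_data)                     # trained entirely in simulation
transferred = train(pretrained, real_data, steps=20)  # brief retraining on real data
```

Because the transplanted parameters start close to the real-environment optimum, the retraining phase needs far fewer steps than training from scratch, which is the resource and time saving the embodiment describes.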
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. When executing the computer program, the processor performs the steps of: obtaining the decision data in the action process of the expert; obtaining the auxiliary data in the action process of executing the auxiliary behavior; and training the initial model based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the following steps are performed: obtaining the decision data in the action process of the expert; obtaining the auxiliary data in the action process of executing the auxiliary behavior; and training the initial model based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above embodiment methods can be completed by instructing relevant hardware through a computer program. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit can be implemented either in the form of hardware or in the form of a software functional unit.
Unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by those skilled in the technical field to which the present invention belongs. The terms used in the description of the present invention in this specification are only for the purpose of describing specific embodiments and are not intended to limit the present invention.
The terms "first", "second", "third", "S110", "S120", "S130", etc. (if present) in the claims, the specification and the above drawings of the present invention are used to distinguish similar objects and need not describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or robot that includes a series of steps or modules is not necessarily limited to the steps or modules clearly listed, but may include other steps or modules that are not clearly listed or are inherent to the process, method, system, product or robot.
It should be noted that those skilled in the art will also know that the embodiments described in this specification are preferred embodiments, and the structures and modules involved are not necessarily essential to the present invention.
The above embodiments only express several embodiments of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present invention, and these all belong to the protection scope of the present invention. Therefore, the protection scope of the patent of the present invention shall be subject to the appended claims.
Claims (15)
1. An intelligent body behavior training method, characterized in that the intelligent body behavior training method includes:
obtaining the decision data in the action process of the expert; wherein the decision data includes multiple decision behavior data and corresponding decision observation data;
obtaining the auxiliary data in the action process of executing the auxiliary behavior; wherein the auxiliary data includes multiple auxiliary behavior data and corresponding supplementary observation data; and
carrying out model autonomous learning based on the decision data and the auxiliary data to obtain an intelligent body behavior model.
2. The intelligent body behavior training method according to claim 1, characterized in that carrying out model autonomous learning based on the decision data and the auxiliary data to obtain the intelligent body behavior model includes:
training an initial model based on the decision data and the auxiliary data to obtain a pre-trained model; and
carrying out autonomous learning on the pre-trained model to obtain the intelligent body behavior model.
3. The intelligent body behavior training method according to claim 1, characterized in that carrying out model autonomous learning based on the decision data and the auxiliary data to obtain the intelligent body behavior model includes:
carrying out autonomous learning of an initial model based on the decision data and the auxiliary data to obtain the intelligent body behavior model.
4. The intelligent body behavior training method according to claim 1, 2 or 3, characterized in that obtaining the decision data in the action process of the expert includes:
obtaining the decision behavior data at multiple current moments in the action process of the expert; and
obtaining the decision observation data at the multiple current moments sent by a first sensor in the action process of the expert; wherein the decision behavior data at a current moment corresponds to the decision observation data at that current moment; or
obtaining the relevant information of the decision behavior data at multiple current moments sent by a second sensor in the action process of the expert;
parsing the relevant information to generate the decision behavior data at multiple last moments; and
obtaining the decision observation data at the multiple last moments sent by the first sensor in the action process of the expert; wherein the decision behavior data at a last moment corresponds to the decision observation data at that last moment.
5. The intelligent body behavior training method according to claim 1, 2 or 3, characterized in that obtaining the auxiliary data in the action process of executing the auxiliary behavior includes:
obtaining the auxiliary behavior data at multiple current moments in the action process of executing the auxiliary behavior; and
obtaining the supplementary observation data at the multiple current moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a current moment corresponds to the supplementary observation data at that current moment; or
obtaining the relevant information of the auxiliary behavior data at multiple current moments sent by the second sensor in the action process of executing the auxiliary behavior;
obtaining the auxiliary behavior data at multiple last moments according to the relevant information; and
obtaining the supplementary observation data at the multiple last moments sent by the first sensor in the action process of executing the auxiliary behavior; wherein the auxiliary behavior data at a last moment corresponds to the supplementary observation data at that last moment.
6. An intelligent body behavior training control device, characterized in that the intelligent body behavior training control device includes:
a decision data obtaining module, for obtaining the decision data in the action process of the expert; wherein the decision data includes multiple decision behavior data and corresponding decision observation data;
an auxiliary data obtaining module, for obtaining the auxiliary data in the action process of executing the auxiliary behavior; wherein the auxiliary data includes multiple auxiliary behavior data and corresponding supplementary observation data; and
a behavior model generation module, for carrying out model autonomous learning based on the decision data and the auxiliary data to obtain an intelligent body behavior model.
7. An intelligent body behavior training system, characterized in that the intelligent body behavior training system includes:
a behavior data generating means, for generating the decision behavior data and the auxiliary behavior data, and sending the decision behavior data and the auxiliary behavior data to the control device;
a first sensor, for obtaining the decision observation data and the supplementary observation data, and sending the decision observation data and the supplementary observation data to the control device; and
a control device, for obtaining the decision data in the action process of the expert, wherein the decision data includes multiple decision behavior data and corresponding decision observation data; obtaining the auxiliary data in the action process of executing the auxiliary behavior, wherein the auxiliary data includes multiple auxiliary behavior data and corresponding supplementary observation data; and carrying out model autonomous learning based on the decision data and the auxiliary data to obtain an intelligent body behavior model.
8. The intelligent body behavior training system according to claim 7, characterized in that the intelligent body behavior training system further includes:
an intelligent body, for executing the behavior of the expert and the auxiliary behavior under teaching.
9. The intelligent body behavior training system according to claim 7 or 8, characterized in that the first sensor includes:
an imaging sensor, for obtaining image data of the intelligent body at a certain moment;
a force sensor, for obtaining force feedback data of the intelligent body at a certain moment;
an encoder, for obtaining motion feedback data of a driving unit of the intelligent body at a certain moment;
a range finder, for obtaining distance-related ranging data of the intelligent body at a certain moment;
a speed or acceleration measurer, for obtaining speed or acceleration measurement data of the intelligent body at a certain moment;
a current or voltage measurer, for obtaining current or voltage measurement data of the intelligent body at a certain moment;
a timer, for obtaining specific time data at a certain moment; and/or
a temperature sensor, for obtaining temperature data of the intelligent body at a certain moment.
10. The intelligent body behavior training system according to claim 7 or 8, characterized in that the behavior data generating means includes a control unit;
the control unit being for generating the decision behavior data and the auxiliary behavior data.
11. The intelligent body behavior training system according to claim 7 or 8, characterized in that the behavior data generating means includes a second sensor and a control unit;
the second sensor being for obtaining the relevant information of the decision behavior data and the auxiliary behavior data at multiple current moments; and
the control unit being for obtaining the behavior data at multiple last moments according to the relevant information.
12. The intelligent body behavior training system according to claim 11, characterized in that the second sensor includes an imaging sensor and an encoder.
13. A multi-agent system, characterized in that the multi-agent system includes the intelligent body behavior training system of any one of claims 7-12.
14. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the intelligent body behavior training method of any one of claims 1-5.
15. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the intelligent body behavior training method of any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910028902.9A CN109784400A (en) | 2019-01-12 | 2019-01-12 | Intelligent body Behavioral training method, apparatus, system, storage medium and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109784400A true CN109784400A (en) | 2019-05-21 |
Family
ID=66500352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910028902.9A Pending CN109784400A (en) | 2019-01-12 | 2019-01-12 | Intelligent body Behavioral training method, apparatus, system, storage medium and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109784400A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112230647A (en) * | 2019-06-28 | 2021-01-15 | 鲁班嫡系机器人(深圳)有限公司 | Intelligent power system behavior model, training method and device for trajectory planning |
CN112287728A (en) * | 2019-07-24 | 2021-01-29 | 鲁班嫡系机器人(深圳)有限公司 | Intelligent agent trajectory planning method, device, system, storage medium and equipment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10301616A (en) * | 1997-04-25 | 1998-11-13 | Tokico Ltd | Teaching device for robot |
CN105291111A (en) * | 2015-11-27 | 2016-02-03 | 深圳市神州云海智能科技有限公司 | Patrol robot |
US20170028553A1 (en) * | 2015-07-31 | 2017-02-02 | Fanuc Corporation | Machine learning device, robot controller, robot system, and machine learning method for learning action pattern of human |
JP2017030135A (en) * | 2015-07-31 | 2017-02-09 | ファナック株式会社 | Machine learning apparatus, robot system, and machine learning method for learning workpiece take-out motion |
CN108115681A (en) * | 2017-11-14 | 2018-06-05 | 深圳先进技术研究院 | Learning by imitation method, apparatus, robot and the storage medium of robot |
US20180215039A1 (en) * | 2017-02-02 | 2018-08-02 | Brain Corporation | Systems and methods for assisting a robotic apparatus |
2019-01-12: CN CN201910028902.9A patent/CN109784400A/en, active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109760050A (en) | Robot behavior training method, device, system, storage medium and equipment | |
Kaspar et al. | Sim2real transfer for reinforcement learning without dynamics randomization | |
Mayer et al. | A system for robotic heart surgery that learns to tie knots using recurrent neural networks | |
Paxton et al. | Prospection: Interpretable plans from language by predicting the future | |
Lin et al. | Evolutionary digital twin: A new approach for intelligent industrial product development | |
CN109784400A (en) | Intelligent body Behavioral training method, apparatus, system, storage medium and equipment | |
Bazzi et al. | Robustness in human manipulation of dynamically complex objects through control contraction metrics | |
Kim et al. | Learning and generalization of dynamic movement primitives by hierarchical deep reinforcement learning from demonstration | |
Kurrek et al. | Ai motion control–a generic approach to develop control policies for robotic manipulation tasks | |
Xu et al. | Dexterous manipulation from images: Autonomous real-world rl via substep guidance | |
Su et al. | A ROS based open source simulation environment for robotics beginners | |
Pairet et al. | Learning and generalisation of primitives skills towards robust dual-arm manipulation | |
Tjomsland et al. | Human-robot collaboration via deep reinforcement learning of real-world interactions | |
Nehmzow | Flexible control of mobile robots through autonomous competence acquisition | |
De Magistris et al. | Teaching a robot pick and place task using recurrent neural network | |
Mohan et al. | Towards reasoning and coordinating action in the mental space | |
Claassens | An RRT-based path planner for use in trajectory imitation | |
Nawrocka et al. | Neural network control for robot manipulator | |
CN113927593B (en) | Mechanical arm operation skill learning method based on task decomposition | |
Benotsmane et al. | Survey on artificial intelligence algorithms used in industrial robotics | |
Rocchi et al. | A generic simulator for underactuated compliant hands | |
Ramírez et al. | Human behavior learning in joint space using dynamic time warping and neural networks | |
Ichiwara et al. | Multimodal time series learning of robots based on distributed and integrated modalities: Verification with a simulator and actual robots | |
Arie et al. | Reinforcement learning of a continuous motor sequence with hidden states | |
Fallas-Hernández et al. | OSCAR: A low-cost, open-source robotic platform design for cognitive research |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 2019-05-21