CN112140101A

CN112140101A - Trajectory planning method, device and system

Info

Publication number: CN112140101A
Application number: CN202010174770.3A
Authority: CN
Inventors: 朱文飞; 何德裕
Original assignee: Robotics Robotics Shenzhen Ltd
Current assignee: Robotics Robotics Shenzhen Ltd
Priority date: 2019-06-28
Filing date: 2020-03-13
Publication date: 2020-12-29
Also published as: CN111958584A

Abstract

The application relates to a trajectory planning method, device and system. The trajectory planning method comprises the following steps: acquiring initial position information and target position information of an agent; acquiring an intelligent power system behavior model; and inputting the initial and target position information into the intelligent power system behavior model and outputting a planned trajectory instruction. Because the planned trajectory is output by an intelligent power system behavior model built for trajectory planning, the precision of trajectory planning can be improved; in addition, the generalization capability of trajectory planning can be improved while that precision is maintained.

Description

Trajectory planning method, device and system

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a trajectory planning method, apparatus, and system.

Background

With the development of science and technology, there is growing demand for artificial-intelligence-based trajectory planning for agents; existing artificial-intelligence-based trajectory planning, however, suffers from defects such as low precision and poor generalization.

Disclosure of Invention

Based on the above, the invention provides an intelligent agent trajectory planning method, device and system.

The invention discloses a trajectory planning method, which comprises the following steps: acquiring initial and target position information of an agent;

acquiring an intelligent power system behavior model; and

inputting the initial and target position information into the intelligent power system behavior model, and outputting a planning track instruction.

Preferably, when the intelligent agent is a manipulator, the planning trajectory instruction is:

the motion acceleration and/or angular acceleration corresponding to each discrete point in the planned trajectory consisting of a plurality of discrete points.

The invention provides a trajectory planning method, which comprises the following steps:

acquiring target position information of an agent;

acquiring a current track of the agent;

acquiring an intelligent power system behavior model; and

inputting the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction.

Preferably, when the intelligent agent is a manipulator, the planned track instruction is:

the motion acceleration and/or the angular acceleration corresponding to the intelligent agent at the next moment.

The invention provides a trajectory planning method, which comprises the following steps:

acquiring a reference auxiliary parameter of the intelligent agent;

acquiring target position information of the agent;

acquiring current auxiliary parameters of the agent;

calculating an error between the reference auxiliary parameter and the current auxiliary parameter;

acquiring a current track of the agent; and

inputting the current error, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction.

Preferably, the acquiring the reference auxiliary parameter of the agent includes:

acquiring initial and target position information of the agent;

acquiring a behavior model of an intelligent power system of an intelligent agent;

inputting the initial and target position information into the intelligent power system behavior model, and outputting a reference instruction of a planning track instruction; and

acquiring a reference auxiliary parameter in the process of executing the reference instruction by the intelligent agent.

Preferably, when the intelligent agent is a manipulator, the planning trajectory instruction and/or the correction trajectory instruction is:

the motion acceleration and/or the angular acceleration of the intelligent body corresponding to each discrete point in the planned trajectory and/or the corrected trajectory consisting of a plurality of discrete points.

The invention provides a trajectory planning method, which comprises the following steps:

acquiring target position information of an agent;

acquiring a current track of the agent;

acquiring or generating current obstacle information of an obstacle; and

inputting the current obstacle information, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction.

The invention provides a trajectory planning method, which comprises the following steps:

acquiring target position information of an agent;

acquiring a current track of the agent; wherein the current trajectory includes at least current location information;

judging whether the intelligent agent is located in a resistant domain or not according to the current position information;

if yes, deflecting the target position information according to the current position increment information to generate new target position information;

inputting the current track and the new target position information into an intelligent power system behavior model to generate a new planned track instruction; and

if not, inputting the current track and the target position information into the intelligent power system behavior model, and outputting a planning track instruction.

The invention provides a trajectory planning device, comprising:

the position information acquisition module is used for acquiring initial and target position information of the intelligent agent;

the behavior model acquisition module is used for acquiring the behavior model of the intelligent power system; and

the planning track generation module is used for inputting the initial and target position information into the intelligent power system behavior model and outputting a planning track instruction.

The invention provides an intelligent agent track planning device, which comprises:

the target position acquisition module is used for acquiring target position information of the intelligent agent;

the current track acquisition module is used for acquiring the current track of the intelligent agent;

the behavior model acquisition module is used for acquiring an intelligent power system behavior model; and

the planned track generation module is used for inputting the current track and the target position information into the intelligent power system behavior model and outputting a planned track instruction.

The invention provides a trajectory planning device, comprising:

the reference auxiliary parameter acquisition module is used for acquiring reference auxiliary parameters of the intelligent agent;

a current auxiliary parameter obtaining module, configured to obtain a current auxiliary parameter of the agent;

an error calculation module for calculating an error between the reference auxiliary parameter and the current auxiliary parameter;

the current track acquisition module is used for acquiring the current track of the intelligent agent; and

the track generation module is used for inputting the current error, the current track and the target position information into the intelligent power system behavior model and outputting a planned track instruction and/or a corrected track instruction.

Preferably, the reference auxiliary parameter acquiring module includes:

an initial and target acquisition unit for acquiring initial and target position information of the agent;

the hybrid power model acquisition unit is used for acquiring an intelligent power system behavior model;

a reference target result generating unit, configured to input the initial and target position information into the intelligent power system behavior model, and output a reference instruction of a planned trajectory instruction; and

the reference auxiliary parameter acquisition unit is used for acquiring the reference auxiliary parameters in the process of executing the reference instruction by the intelligent agent.

The invention provides a trajectory planning device, comprising:

the current obstacle information acquisition or generation module is used for acquiring or generating current obstacle information;

and the track generation module is used for inputting the current obstacle information, the current track and the target position information into the intelligent power system behavior model and outputting a planned track instruction and/or a corrected track instruction.

The invention provides a trajectory planning device, comprising:

the target position acquisition module is used for acquiring the target position of the intelligent agent;

the current track acquisition module is used for acquiring the current track of the intelligent agent; wherein the current trajectory includes at least current location information;

the judging module is used for judging whether the intelligent agent is positioned in the resistant domain or not according to the current position information;

if so, deflecting the target position information according to the current position increment information to generate new target position information;

the new planned track generation module is used for inputting the current track and the new target position information into an intelligent power system behavior model to generate a new planned track instruction; and

if not, the planning track generation module inputs the current track and the target position into the intelligent power system behavior model and outputs a planning track instruction.

The invention provides an intelligent agent system, which comprises a control device and an intelligent agent;

the control device is used for acquiring initial and target position information of the intelligent agent; acquiring the intelligent power system behavior model; inputting the initial and target position information into an intelligent power system behavior model, and outputting a planning track instruction; or

Acquiring target position information of an agent; acquiring a current track of an agent; acquiring an intelligent power system behavior model; inputting the current track and the target position information into an intelligent power system behavior model, and outputting a planning track instruction; or

Acquiring a reference auxiliary parameter of the intelligent agent; acquiring target position information of an agent; acquiring current auxiliary parameters of the intelligent agent; calculating an error between the reference auxiliary parameter and the current auxiliary parameter; acquiring a current track of an agent; inputting the current error, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction; or

Acquiring target position information of an agent; acquiring a current track of an agent; acquiring or generating current obstacle information of an obstacle; inputting the current obstacle information, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction; or

Acquiring target position information of an agent; acquiring a current track of the agent; wherein the current trajectory includes at least current location information; judging whether the intelligent agent is located in a resistant domain or not according to the current position information; if yes, deflecting the target position information according to the current position increment information to generate new target position information; inputting the current track and the new target position information into the intelligent power system behavior model to generate a new planned track instruction; if not, inputting the current track and the target position into the intelligent power system behavior model, and outputting a planning track instruction;

and the intelligent agent is used for executing corresponding track motion according to the planning track instruction and/or the correction track instruction.

The invention provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the trajectory planning method of any of the above when executing the computer program.

The invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the trajectory planning method of any of the above.

By adopting the technical scheme of the invention, the planned trajectory is output by adopting the intelligent power system behavior model based on trajectory planning, so that the accuracy of trajectory planning can be improved; in addition, the generalization capability of the track planning can be improved on the premise of ensuring the precision.

Drawings

FIG. 1 is a schematic diagram of a first process of trajectory planning training in one embodiment;

FIG. 2 is a first flow diagram of trajectory planning in one embodiment;

FIG. 3 is a second flow diagram of trajectory planning in one embodiment;

FIG. 4 is a third flow diagram of trajectory planning in one embodiment;

FIG. 5 is a diagram illustrating a first process for obtaining reference auxiliary parameters of an agent in trajectory planning, according to an embodiment;

FIG. 6 is a fourth flowchart of trajectory planning in one embodiment;

FIG. 7 is a fifth flowchart of trajectory planning in one embodiment;

FIG. 8 is a first block diagram of an intelligent system in one embodiment;

FIG. 9 is a second schematic diagram of an intelligent system in one embodiment;

FIG. 10 is a schematic diagram of a third configuration of an intelligent system in one embodiment;

FIG. 11 is a first block diagram of a trajectory planning training device in accordance with an embodiment;

FIG. 12 is a first block diagram of a trajectory planner in one embodiment;

FIG. 13 is a block diagram showing a second configuration of the trajectory planner in one embodiment;

FIG. 14 is a block diagram of a third configuration of a trajectory planner in one embodiment;

FIG. 15 is a first block diagram of a reference auxiliary parameter acquisition module of the trajectory planning device in accordance with an embodiment;

FIG. 16 is a fourth block diagram showing the construction of a trajectory planning apparatus according to an embodiment;

FIG. 17 is a block diagram showing a fifth configuration of the trajectory planner in one embodiment;

FIG. 18 is a first block diagram of an intelligent powertrain behavior model in one embodiment;

FIG. 19 is a second block diagram of an intelligent powertrain behavior model in one embodiment;

FIG. 20 is a first block diagram of a trajectory planning training system in accordance with an embodiment;

FIG. 21 is a block diagram of a first configuration of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, a trajectory planning training method is provided; the method is described as applied to the intelligent system in fig. 21. The trajectory planning training method is used for trajectory planning training of an agent, where the agent is any intelligent entity capable of implementing the trajectory planning training method of this embodiment, such as a manipulator 800 (shown in fig. 8, 9 or 10) or a humanoid robot (drawing omitted). In one embodiment, the manipulator may be any of various types of manipulator formed by connecting a plurality of joints and links in series or in parallel, each joint being a driving unit, such as a four-axis manipulator (not shown) or a six-axis manipulator (shown in fig. 8, 9 or 10). In one embodiment, as shown in fig. 8, 9 or 10, the end of the manipulator 800 may be provided with various actuators 810, which perform specific actions such as grasping or releasing. For ease of understanding, this embodiment takes a manipulator as the example agent in the detailed description that follows.

Continuing with fig. 1, the trajectory planning training method includes the following steps:

Step S110: acquiring a teaching trajectory of the teaching subject during execution of the teaching behavior.

In one embodiment, the teaching trajectory is acquired and sent in real time by various sensors and/or encoders while the teaching subject performs the teaching behavior; alternatively, the teaching trajectory is obtained from a server or a memory.

In particular, the teaching action may include, but is not limited to: a trajectory plan (shown in fig. 9) for picking up the target object M1 from bulk or regularly arranged objects; trajectory planning (shown in fig. 8) for assembling the target objects M1 and M2; trajectory planning for dropping the target (the drawing is omitted); trajectory planning for driving the target object to move from one position to another position (the drawing is omitted); a trajectory plan (shown in fig. 10) for driving the target object M1 to avoid the obstacle F during movement; trajectory planning (drawings omitted) for grabbing objects in rest or motion; or a combination of some or all of the above actions in each trajectory plan.

In one embodiment, taking a manipulator as the example agent, the teaching trajectory may comprise, for each sampling point that the teaching subject passes through in a coordinate space (e.g., Cartesian space) while performing the teaching behavior: the position information, the motion velocity and/or angular velocity of the teaching subject, and the motion acceleration and/or angular acceleration of the teaching subject.

Specifically, the location information includes, but is not limited to: 6d coordinates (pose) or 2d coordinates.

Further, in one embodiment, taking the 6d coordinate as an example, the pose is written x y z u v w. For an intelligent power system behavior model that needs to learn a specific trajectory plan, the following are acquired at each sampling point: the pose x y z u v w; the motion velocity and/or angular velocity of the teaching subject, dx dy dz du dv dw; and the motion acceleration and/or angular acceleration of the teaching subject, ddx ddy ddz ddu ddv ddw; where d denotes the derivative.

When the agent is something other than a manipulator, the teaching trajectory and the coordinate system change accordingly. For example, when the intelligent power system behavior model is applied to liquid level control of an industrial water tank, the teaching trajectory becomes the liquid level height, its first derivative and its second derivative, and is no longer expressed in a Cartesian coordinate system.

In one embodiment, taking a manipulator as an example, the position information of a sampling point can be obtained from the forward kinematics equations of the manipulator, using the motion amounts fed back by the encoder of each driving unit; alternatively, the position information of the actuator of the manipulator can be acquired through a position sensor, or generated by a conventional or artificial intelligence algorithm from image data acquired by an image sensor.
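As a minimal sketch of the forward kinematics step, assume a hypothetical planar arm with two revolute joints, which is far simpler than the four-axis or six-axis manipulators described here. The function name, link lengths and joint-angle convention are illustrative assumptions and are not taken from the application.

```python
import math


def planar_forward_kinematics(theta1, theta2, l1=0.3, l2=0.2):
    """Return the (x, y) end position of a hypothetical 2-link planar arm.

    theta1 is the first joint angle measured from the x axis; theta2 is the
    second joint angle measured relative to the first link (both in radians).
    l1 and l2 are assumed link lengths in meters.
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

In use, the encoder readings of the two joints would be passed in at each sampling instant, and the returned pair would be stored as the position information of that sampling point.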

In one embodiment, the motion velocity and/or angular velocity of the teaching subject at each sampling point can be measured by a velocity or angular velocity sensor, or estimated in some other way; for example, from the change in position information between adjacent time instants and the time interval between them.

In one embodiment, the motion acceleration and/or the angular acceleration of the teaching subject at each sampling point can be measured by the acceleration and angular acceleration sensors; or by some means estimating the acceleration of motion and/or the angular acceleration.

In one embodiment, if the motion velocity and/or angular velocity and the motion acceleration and/or angular acceleration cannot be measured directly, the sampling interval should be as short as possible, so that these quantities can be estimated with sufficient accuracy by a difference method.
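The difference method mentioned above can be sketched as follows, assuming uniformly spaced samples of a single coordinate. The application does not say which difference scheme is used; central differences are chosen here purely for illustration, and the function name is hypothetical.

```python
def finite_differences(positions, dt):
    """Estimate velocity and acceleration from uniformly sampled positions.

    Central differences are used at interior samples; the two end samples
    reuse the nearest interior estimate.
    """
    n = len(positions)
    if n < 3:
        raise ValueError("need at least three samples")
    vel = [0.0] * n
    acc = [0.0] * n
    for k in range(1, n - 1):
        vel[k] = (positions[k + 1] - positions[k - 1]) / (2.0 * dt)
        acc[k] = (positions[k + 1] - 2.0 * positions[k] + positions[k - 1]) / (dt * dt)
    # Copy the nearest interior estimate to the two end points.
    vel[0], vel[-1] = vel[1], vel[-2]
    acc[0], acc[-1] = acc[1], acc[-2]
    return vel, acc
```

The shorter the sampling interval dt, the smaller the truncation error of these estimates, which is why a short sampling time is desirable; sensor noise is amplified by the division by dt, so in practice some smoothing may also be needed.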

Specifically, the teaching trajectory is acquired by the teaching subject during the teaching action.

In one embodiment, the "teaching subject" may be the agent itself; the following description takes a manipulator as the example teaching subject.

Specifically, the manipulator can be driven to perform the teaching behavior directly by an operator, or by control instructions, target position information or the like generated through a controller. For example: the manipulator completes a building-block assembly action while being moved by an operator. For another example: the manipulator completes the building-block assembly action according to motion amount commands for each of its driving units sent by a controller, where the controller includes but is not limited to a PC, a PAD, a mobile terminal, and the like. For another example: the manipulator is controlled indirectly through virtual reality (VR) technology to complete the action performed in VR; using VR allows the manipulator to be controlled to perform finer actions. For another example: a device fitted with a gravity sensor or another velocity or acceleration sensor, such as a mobile terminal or a PAD, is moved in order to generate control commands, target position information or the like that control the motion of the manipulator.

In one embodiment, the "teaching subject" may be a third party other than the agent; for example, a sensor is mounted on the hand of a person, and the sensor feeds back the teaching trajectory of that person while performing the teaching behavior.

Step S120: acquiring an initial model of the intelligent power system behavior model.

In one embodiment, the training device obtains an initial model of the intelligent power system behavior model from a server or memory or the like.

The intelligent power system behavior model (Behavior Modeling of Intelligent Dynamic Systems) combines a conventional linear power (dynamic) system with an artificial-intelligence-based method of modeling nonlinear power systems. Planning the trajectory of an agent on the basis of this model improves the accuracy of trajectory planning; in addition, in some embodiments, the generalization capability of trajectory planning can be improved while that accuracy is maintained. A linear power system is easy to analyze and control, while artificial intelligence methods make it possible to learn a nonlinear function: the system parameters are adjusted by learning from decision data to obtain the desired nonlinear intelligent power system behavior model, which removes the need to tune the system parameters manually.

When the intelligent power system behavior model is used for trajectory planning learning, the learning effect is determined by the hyper-parameters of the model. Hyper-parameters are parameters that are set before the model begins training, as opposed to parameters obtained through training. These hyper-parameters, such as the gain of the linear power system part and the number of Gaussian functions in the nonlinear part, depend on the designer's requirements for the trajectory, such as the duration of the trajectory and its maximum range. The designer first derives, from these subjective requirements and by means of a set of objective algorithms, the optimal hyper-parameters of the intelligent power system behavior model so as to achieve the best learning effect; the model configured with these hyper-parameters is then used as the initial model of the intelligent power system behavior model.

In one embodiment, as shown in fig. 18, the intelligent power system behavior model for trajectory planning includes: a reference unit 910, a learning unit 920, and a coordination unit 930.

The reference unit 910 is configured to generate a reference trajectory instruction. It supplies the whole model with a reference trajectory instruction generated by a conventional linear power system model, which ensures that the manipulator reaches the specified target position within the specified time. Specifically, any conventional power system module that satisfies this condition may be used, such as a second-order damping module or another power module of higher or lower order. The choice of power module depends on the characteristics of the controlled system; for example, since manipulator control is based on acceleration, a second-order damping module is preferred.
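A second-order damping module of this kind can be sketched for a single coordinate as below. The equation form, the gain values (alpha = 25, beta = alpha / 4, which makes the system critically damped) and the explicit Euler integration are assumptions chosen for illustration; the application does not fix them.

```python
def reference_rollout(y0, goal, alpha=25.0, beta=6.25, dt=0.001, duration=1.5):
    """Integrate y'' = alpha * (beta * (goal - y) - y') with explicit Euler.

    Returns a list of (position, velocity, acceleration) samples. The
    acceleration column plays the role of the reference trajectory
    instruction for an acceleration-controlled manipulator.
    """
    y, v = y0, 0.0
    samples = []
    for _ in range(int(round(duration / dt))):
        a = alpha * (beta * (goal - y) - v)  # acceleration command
        samples.append((y, v, a))
        y, v = y + dt * v, v + dt * a  # both updates use the old state
    samples.append((y, v, alpha * (beta * (goal - y) - v)))
    return samples
```

With these assumed gains the position approaches the goal without oscillation and settles well within the simulated duration, which is the property the reference unit is meant to guarantee before any learned modulation is added.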

The learning unit 920 is configured to modulate the reference trajectory instruction using an artificial-intelligence-based nonlinear power system method and to output a planned trajectory instruction; this modulation allows the manipulator to learn to plan an arbitrary continuous, smooth trajectory. Any nonlinear system model may be used here, such as a Gaussian mixture model or a neural network.
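One way to picture the modulation is as a learned term added to the reference acceleration. The sketch below assumes a normalized sum of Gaussian basis functions of a phase variable s; the names h, c and w mirror the hyper-parameter and weight symbols that appear later in this description, but the exact functional form, the scaling by s and the gain values are assumptions.

```python
import math


def forcing_term(s, weights, centers, widths):
    """Normalized weighted sum of Gaussian basis functions, scaled by s."""
    psi = [math.exp(-h * (s - c) ** 2) for c, h in zip(centers, widths)]
    total = sum(psi)
    if total < 1e-12:
        return 0.0
    return s * sum(w * p for w, p in zip(weights, psi)) / total


def modulated_acceleration(y, v, goal, s, weights, centers, widths,
                           alpha=25.0, beta=6.25):
    """Reference acceleration plus the learned modulation (assumed form)."""
    reference = alpha * (beta * (goal - y) - v)
    return reference + forcing_term(s, weights, centers, widths)
```

Because the modulation is multiplied by s, it vanishes as s decays to zero, so under these assumptions the reference system alone determines the final approach to the target.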

The coordination unit 930 is configured to coordinate the parameter (for example, a time parameter) under which the learning unit modulates the reference trajectory instruction. First, it ensures that the learning unit finishes modulating the reference trajectory instruction under the specified parameter (for example, within the specified time), which ensures the stability of the system. Second, it can coordinate multiple nonlinear power systems, for example by keeping them synchronized in time. In one embodiment, a manipulator control system is composed of multiple independent nonlinear power systems. For example, when the manipulator is to be controlled to move in three-dimensional space, its motions in the x, y and z directions are handled by three independent nonlinear power systems, and the three systems can be synchronized by sharing the same coordination unit. There are also many possible choices of coordination unit. For example, when the nonlinear power system should be periodic, an oscillating system can be chosen as the coordination unit; when it should be non-periodic, a simple first-order linear system can be used.
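For the non-periodic case, a first-order linear coordination system can be sketched as a phase variable that decays from 1 toward 0. The equation form, the names alpha_s and tau, and the default values are assumptions for illustration only.

```python
def phase_rollout(alpha_s=4.0, tau=1.0, dt=0.001, duration=1.0):
    """Integrate tau * ds/dt = -alpha_s * s from s = 1 with explicit Euler."""
    s = 1.0
    phases = [s]
    for _ in range(int(round(duration / dt))):
        s += dt * (-alpha_s * s / tau)
        phases.append(s)
    return phases


# One shared phase sequence drives the x, y and z subsystems, so all three
# finish their modulation at the same time.
shared = phase_rollout()
phase_by_axis = {"x": shared, "y": shared, "z": shared}
```

Increasing tau stretches the same phase profile over a longer time, which is one simple way a single time parameter could change the duration of all three subsystems together.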

In one embodiment, as shown in fig. 19, the intelligent power system behavior model for trajectory planning comprises: a reference unit 910, a learning unit 920, a coordination unit 930, and a correction unit 940.

The correction unit 940 is configured to correct at least part of the reference trajectory instruction or at least part of the planned trajectory instruction, and to generate a corrected trajectory instruction.

Specifically, in one embodiment, as shown in fig. 19, the correction unit corrects at least a portion of the planned trajectory instruction generated by the learning unit.

Further, in one embodiment, the correction unit may provide three functions: 1. correcting the trajectory to avoid an obstacle when one is encountered; 2. compensating after obstacle avoidance is finished, so that the system returns to the normal trajectory as soon as possible; 3. correcting errors on the basis of force feedback, to address problems that arise from contact.

Further, in one embodiment, the three functions above can be implemented respectively as follows:

1. Obstacle avoidance

(1) A human-like obstacle avoidance module corrects the direction of the acceleration of the controlled object, so that the angle between the direction of motion of the agent and the direction of the obstacle relative to the agent, and/or the distance between the agent and the obstacle, increases and the agent moves away from the obstacle.

2. Trajectory recovery

(1) A minimum trajectory error module measures the error between the current trajectory and the expected trajectory of the controlled object and, with the aim of minimizing that error, corrects the motion trajectory of the object using a PD control algorithm.

3. Error correction based on force feedback

(1) An experience review module records the experience (force feedback data) of one successful completion of the task, reviews that past experience during subsequent operation, and minimizes the error between the current force feedback data and the recorded force feedback data.

(2) The experience review is combined with a phase suspension assist module: when the correction unit instructs a correction of the reference trajectory, the update of the output of the learning unit is suspended by extending the time parameter of the coordination unit, so that generation of that segment of the trajectory slows down until the correction unit exits the correction.
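The three correction functions can be sketched as small helper functions. All function names, gains and formulas below are illustrative assumptions: the avoidance term is a common planar steering-angle heuristic, the recovery term is a plain PD law, and the phase suspension stretches an assumed time parameter tau in proportion to an error magnitude (which could be a trajectory error or a force feedback error). None of these is stated in the application in this form.

```python
import math


def avoidance_acceleration(pos, vel, obstacle, gamma=10.0, beta=3.0):
    """Planar steering term that turns the velocity away from the obstacle."""
    ox, oy = obstacle[0] - pos[0], obstacle[1] - pos[1]
    if math.hypot(vel[0], vel[1]) < 1e-9 or math.hypot(ox, oy) < 1e-9:
        return (0.0, 0.0)
    # Signed angle from the velocity to the obstacle direction.
    phi = math.atan2(vel[0] * oy - vel[1] * ox, vel[0] * ox + vel[1] * oy)
    # Turn rate with the opposite sign of phi, fading for large angles.
    turn = -gamma * phi * math.exp(-beta * abs(phi))
    # (-vy, vx) is the velocity rotated by +90 degrees.
    return (-vel[1] * turn, vel[0] * turn)


def pd_recovery(y, v, y_ref, v_ref, kp=100.0, kd=20.0):
    """PD term that pulls the current state back toward the expected one."""
    return kp * (y_ref - y) + kd * (v_ref - v)


def suspended_phase_step(s, error, alpha_s=4.0, tau=1.0, dt=0.01, k_stop=50.0):
    """One Euler step of the phase with tau stretched by the current error."""
    tau_eff = tau * (1.0 + k_stop * abs(error))
    return s + dt * (-alpha_s * s / tau_eff)
```

In this sketch the avoidance term is zero when the obstacle lies exactly along the direction of motion, a known limitation of this heuristic; a practical system would need an additional rule for that case.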

For convenience of understanding, a specific intelligent power system behavior model is taken as an example for further detailed description below:

In one embodiment, the model may include three subsystems, corresponding respectively to the x, y and z dimensions. The model is used to plan trajectories that involve translation only, without rotation. The specific expression is as follows:

[expression not reproduced in this text]

Taking the subsystem of the intelligent power system behavior model in the y dimension as an example, the expression comprises three parts:

a second-order damping system, which serves as the reference power system of the whole system;

a nonlinear learning system based on a Gaussian mixture model; and

a first-order linear system, which serves as the reference coordination system.

In the above expression, [α_x, α_y, α_z] and [β_x, β_y, β_z] are hyper-parameters of the second-order linear system, which need to be set in advance; [N_x, N_y, N_z], [h_xi, h_yi, h_zi] and [c_xi, c_yi, c_zi] are hyper-parameters of the nonlinear Gaussian mixture function term, which need to be set in advance; C and α_s are hyper-parameters of the first-order linear system, which need to be set in advance;

[w_xi, w_yi, w_zi] are the weights of the Gaussian functions, which are the parameters to be learned and are obtained by training.
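The expression itself is not available in this text. For orientation only, a well-known formulation whose hyper-parameters line up with most of the symbols listed above is the dynamic movement primitive. The y-dimension form below is that textbook formulation, given as an assumption and not as the application's own expression. The symbols τ (time scaling), g (target position) and s (phase variable) are introduced here for illustration, and the role of the hyper-parameter C cannot be recovered from the text, so it does not appear.

```latex
\begin{aligned}
\tau^{2}\,\ddot{y} &= \alpha_y\bigl(\beta_y\,(g - y) - \tau\,\dot{y}\bigr) + f_y(s),\\
f_y(s) &= \frac{\sum_{i=1}^{N_y} w_{yi}\,\psi_{yi}(s)}{\sum_{i=1}^{N_y} \psi_{yi}(s)}\; s,
\qquad \psi_{yi}(s) = \exp\bigl(-h_{yi}\,(s - c_{yi})^{2}\bigr),\\
\tau\,\dot{s} &= -\alpha_s\, s .
\end{aligned}
```

Read this way, the first line is a second-order damping system with an added modulation term, the second line is a Gaussian-basis learning term with N_y basis functions of width h_yi and center c_yi, and the third line is a first-order linear coordination system.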

The trajectory planning problem of the manipulator can be regarded as a process for establishing a dynamic system to transfer from an initial state to a target state, and then the process can be established into a nonlinear dynamic system model expressed by an intelligent power system behavior model. Based on the model, the robot can have the capability of generating various complex motion tracks by a teaching training method, for example, the robot can write words. Meanwhile, the generalization capability of the intelligent power system behavior model enables the robot to freely change the elapsed time of the tracks and the starting and ending points of the tracks, and the robot can still generate similar tracks.

And S130, training an initial model according to the teaching track to obtain an intelligent power system behavior model.

During training, the model first generates, based on the reference unit, the characteristic trajectory to be fitted from the teaching trajectory; the initial model is then trained on this characteristic trajectory, based on the learning unit and the coordination unit, to obtain the intelligent power system behavior model. In other words, the training of the initial model is performed on the learning unit.

The specific training method differs depending on the learning unit of the model. For example, in one embodiment, when a Gaussian mixture model is used as the learning unit as described above, the Gaussian mixture model is composed of several pre-designed Gaussian functions mixed with different weights, and these weights are the parameters to be learned during training. In one embodiment, training can be completed by directly computing the weight of each Gaussian function with the Locally Weighted Regression method.
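The weight computation can be sketched as follows. This is an illustrative sketch, not the patented procedure: it assumes the characteristic trajectory is the forcing term left over after subtracting the reference system from the demonstrated acceleration, and that each Gaussian fits a local linear model in the phase variable. The names `target_forcing` and `fit_weights_lwr` are hypothetical.

```python
import math


def target_forcing(y, dy, ddy, goal, alpha, beta):
    """Forcing value the learning unit must produce at one demonstrated sample."""
    return ddy - alpha * (beta * (goal - y) - dy)


def fit_weights_lwr(phase, f_target, centers, widths):
    """Closed-form locally weighted regression, one weight per Gaussian.

    For each Gaussian, minimizes sum_t psi(s_t) * (f_t - w * s_t)^2 over w.
    """
    weights = []
    for c, h in zip(centers, widths):
        num = 0.0
        den = 0.0
        for s, f in zip(phase, f_target):
            psi = math.exp(-h * (s - c) ** 2)
            num += s * psi * f
            den += s * s * psi
        weights.append(num / den if den > 1e-12 else 0.0)
    return weights
```

Because each weight has a closed-form solution, no iterative optimization is needed, which is consistent with the short training time described in the text.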

The track planning training method based on the intelligent power system behavior model has the following beneficial effects:

first, since the intelligent power system behavior model can reproduce a teaching trajectory from as few as one teaching trajectory, while also allowing the start point, end point and duration of the trajectory to be changed arbitrarily, fewer training samples are needed and the model is trained faster;

second, the intelligent power system behavior model itself trains very quickly, so the time spent training the model is saved;

third, the test result can be estimated directly from the training result, so the time cost of testing after training is avoided.

In one embodiment, as shown in fig. 2, a trajectory planning method is provided, which is exemplified by the application of the method to the intelligent system in fig. 8, 9 or 10, and the method includes the following method steps:

step S210, acquiring initial and target position information of the intelligent agent;

in one embodiment, initial position information and desired target position information that are preset and/or input by a user in real time are obtained.

Step S230, acquiring an intelligent power system behavior model;

specifically, an intelligent power system behavior model is obtained from a memory or a server;

for the intelligent power system behavior model, reference is made to the description of the model shown in fig. 18 in the above embodiment, and details are not repeated here.

And step S250, inputting the initial and target position information into the intelligent power system behavior model, and outputting a planning track instruction.

The planning track instruction is different according to different output requirements of model design.

In one embodiment, taking the case where the agent is a manipulator as an example, the motion trajectory is composed of a plurality of discrete points, and the planned trajectory instruction may be the motion acceleration and/or angular acceleration of the agent at each discrete point, which together form the planned trajectory instruction for the whole motion of the agent.
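As a rough illustration of steps S210 to S250, the sketch below turns an initial and a target position into one acceleration command per discrete point. It assumes the same second-order reference system as a typical dynamic-movement-primitive model and treats the learned term as optional per-axis callables of the phase; `plan_accelerations` and all parameter values are hypothetical.

```python
def plan_accelerations(start, goal, n_steps, dt=0.01,
                       alpha=25.0, beta=6.25, alpha_s=4.0, forcing=None):
    """Return (commands, final_position).

    commands[i] is the acceleration tuple issued at discrete point i.
    forcing, if given, is a list of functions f_k(s), one per axis.
    """
    pos = [float(p) for p in start]
    vel = [0.0] * len(pos)
    s = 1.0  # phase of the coordination system
    commands = []
    for _ in range(n_steps):
        acc = []
        for k in range(len(pos)):
            f = forcing[k](s) if forcing else 0.0
            acc.append(alpha * (beta * (goal[k] - pos[k]) - vel[k]) + f)
        commands.append(tuple(acc))
        # Explicit Euler update of the simulated agent state.
        for k in range(len(pos)):
            pos[k] += vel[k] * dt
            vel[k] += acc[k] * dt
        s += -alpha_s * s * dt
    return commands, tuple(pos)
```

Changing `start`, `goal` or the number of steps yields a new command sequence without retraining, which is the generalization property the text refers to.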

The track planning method based on the intelligent power system behavior model has the following beneficial effects:

on one hand, because the intelligent power system behavior model is easy to use and analyze, basing the trajectory planning on this model can improve the precision of the trajectory planning.

On the other hand, because the model draws on control-theoretic ideas and allows additional terms to be added, it can quickly realize additional functions such as disturbance rejection, trajectory recovery and obstacle avoidance on top of the original model without retraining a new model, which improves the generalization capability of the model.

In one embodiment, as shown in fig. 3, a trajectory planning method is provided, the method comprising the following method steps:

step S220, acquiring target position information of the intelligent agent;

step S240, acquiring the current track of the intelligent agent;

in one embodiment, taking a robot as an example, the current trajectory may include: current position information, current velocity and/or angular velocity;

specifically, the current velocity and/or the angular velocity may be an actual value acquired by a sensor or an estimated value calculated according to some method.

Step S260, acquiring an intelligent power system behavior model;

And step S280, inputting the current track and the target position information into an intelligent power system behavior model, and outputting a planned track instruction.

And generating a planned track instruction at the next moment according to the current track and the target position information.

Specifically, the planned trajectory instructions differ according to model design, such as: taking the manipulator as an example, the planned trajectory command may be a motion acceleration and/or an angular acceleration of the manipulator at the next moment.

Specifically, the motion acceleration and/or the angular acceleration may be directly output to the manipulator as a trajectory planning instruction.

In one embodiment, if it is difficult to control the manipulator directly with the motion acceleration and/or the angular acceleration, an existing or future-developed calculation method may be used to convert the motion acceleration and/or the angular acceleration into information that can control the manipulator, such as the position information at the next moment.
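One simple way to perform such a conversion is numerical integration. The sketch below uses a semi-implicit Euler step and is only one possible choice; the function name and the fixed time step `dt` are assumptions.

```python
def next_state(position, velocity, acceleration, dt):
    """Convert an acceleration command into the next velocity and position.

    Semi-implicit Euler: update the velocity first, then move with the new velocity.
    """
    new_velocity = [v + a * dt for v, a in zip(velocity, acceleration)]
    new_position = [p + v * dt for p, v in zip(position, new_velocity)]
    return new_position, new_velocity
```

The returned position can then be sent to a position-controlled manipulator, while the returned velocity is kept as part of the current trajectory for the next planning step.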

The trajectory planning of the intelligent agent is completed through the planning trajectory instruction generated in real time, so that the accuracy of the trajectory planning is improved.

In one embodiment, as shown in fig. 4, a trajectory planning method is provided. Some trajectory planning tasks require auxiliary parameters in order to complete the final trajectory planning, for example force/moment-dependent trajectory planning (such as object grabbing or object fitting). The method comprises the following method steps:

step S310, acquiring a reference auxiliary parameter of the agent;

step S320, acquiring target position information of the intelligent agent;

the following embodiments will be described in further detail with respect to the method of acquiring the reference auxiliary parameter.

The control device acquires reference auxiliary parameters acquired by various sensors (such as force/torque sensors) and transmitted in real time, or acquires the reference auxiliary parameters from a server or a memory.

Step S330, acquiring the current auxiliary parameters of the agent;

step S340, calculating a current error between the reference auxiliary parameter and the current auxiliary parameter;

step S350, acquiring the current track of the intelligent agent;

and step S360, inputting the current error, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction.

Specifically, the planned trajectory instruction and/or the corrected trajectory instruction differ according to model design, such as: taking the manipulator as an example, the planned trajectory instruction and/or the corrected trajectory instruction may be a motion acceleration and/or an angular acceleration of the manipulator at the next moment.

For the intelligent power system behavior model, reference is made to the description of the model shown in fig. 19 or 20 in the above embodiment, and details are not repeated here.

If grabbing succeeds, the force/moment trajectory of any grabbing process should be the same as the force/moment reference trajectory; therefore, if the force/moment trajectory of the current grab differs from the force/moment reference trajectory, the manipulator is in an erroneous grabbing state. In this case, following the idea of feedback control, the error between the current force/moment and the reference force/moment is used to calculate the target position information of the actuator of the manipulator at the next moment, and a planned trajectory instruction is output accordingly;

specifically, the sensor reads the force/moment information at the end of the manipulator at the current moment, and a position increment is added to the next-moment position produced by the normal planning of the intelligent power system behavior model; the position increment is calculated from the error between the current force/moment and the reference force/moment, thereby completing the trajectory planning for the task of grabbing the target object.
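The position increment described above can be illustrated with a simple proportional feedback law. This is a sketch under the assumption that the increment is proportional to the force error along each axis; the text does not specify the actual mapping, and `gain` is a hypothetical tuning parameter.

```python
def force_feedback_position(planned_position, reference_force, current_force, gain):
    """Add a position increment proportional to the force error on each axis."""
    return [p + gain * (f_ref - f_cur)
            for p, f_ref, f_cur in zip(planned_position, reference_force, current_force)]
```

When the measured force equals the reference force, the increment is zero and the normally planned position is returned unchanged.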

Specifically, according to the above embodiment, the intelligent power system behavior model includes a reference unit, a learning unit, a coordination unit, and a modification unit, and the description of each unit refers to the above embodiment, and is not repeated herein.

In one embodiment, when the current error is zero, the current trajectory and the target position information are input into the intelligent power system behavior model, and a planned trajectory command is output, wherein the correction unit does not participate in the operation.

When the current error is not zero, the correction unit is started; its specific working method differs depending on the specific structure of the correction unit;

in one embodiment, when the correction unit is an experience review module, the correction unit corrects the planned trajectory instruction generated jointly by the learning unit and the reference unit, using an experience review method to minimize the error between the current force/moment feedback data and the reference force/moment feedback data, so as to generate a corrected trajectory instruction.

In one embodiment, when the correction unit is an experience review module combined with a phase suspension assist module, then while the correction unit is issuing corrections to the reference trajectory, the update of the learning unit output is suspended by extending the time parameter of the coordination unit, so that generation of that section of the trajectory slows down until the correction unit finishes the correction.
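Phase suspension of this kind is often realized by slowing the decay of the phase variable in proportion to the error. The sketch below shows that common form; it is an assumption that the coordination unit is a first-order decaying phase, and `stop_gain` is a hypothetical parameter.

```python
def phase_rate(s, error, alpha_s=4.0, stop_gain=10.0):
    """Time derivative of the phase; a larger error slows the phase down."""
    return -alpha_s * s / (1.0 + stop_gain * abs(error))
```

With zero error the phase decays at its nominal rate; as the error grows the rate approaches zero, so the learning unit output nearly stops advancing until the correction is finished.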

Further, in an embodiment, before step S350, the method may further include:

step S370, judging whether the current error is zero; step S380, if so, inputting the current trajectory and the target position information into the intelligent power system behavior model, and outputting a planned trajectory instruction; step S390, if not, executing step S360.

In one embodiment, as shown in fig. 5, step S310, acquiring the reference auxiliary parameter of the agent includes the following method steps:

step S311, acquiring initial and target position information of the agent;

setting the initial and target position information to be the same as the initial and target position information of the teaching behavior of the teaching subject during the trajectory planning training described in the above embodiment;

step S312, acquiring an intelligent power system behavior model;

Step S313, inputting the initial and target position information into the intelligent power system behavior model, and outputting a reference instruction for the trajectory planning of the agent;

step S314, acquiring the reference auxiliary parameter in the process of executing the reference instruction by the agent.

As shown in fig. 9, taking the manipulator grabbing a target object as an example, the manipulator repeats once the grabbing trajectory it followed during model training and completes a successful grab; during this successful grab, the force/moment information of the actuator at each moment is recorded by a force/moment sensor mounted on the actuator of the manipulator, and is called the force/moment reference auxiliary parameter.

In one embodiment, as shown in fig. 6, taking obstacle avoidance as an example, the trajectory planning method includes the following steps:

step S410, acquiring target position information of the agent;

step S420, acquiring the current track of the intelligent agent;

in one embodiment, when the agent is a manipulator, the current trajectory includes current position information, current motion velocity and/or angular velocity of the manipulator, and current motion acceleration and/or angular acceleration of the manipulator;

step S430, obtaining or generating current obstacle information of the obstacle;

specifically, the obstacle information may be a current position increment, a current angle increment, and/or the current position information of the obstacle, and the like;

the current position increment refers to the current distance between the current obstacle position and the current position of the agent; the current angle increment refers to the current angle between the current obstacle position and the current position of the agent.

In one embodiment, an image sensor or other obstacle detector may be disposed on or around the agent to detect current location information of the obstacle;

and step S440, inputting the current obstacle information, the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction and/or a corrected track instruction.

The working method of the correction unit is different according to the difference of the specific structure of the correction unit;

in one embodiment, when the correction unit is a human-like obstacle avoidance module, the acceleration direction of the controlled object is corrected, so that the included angle between the motion direction of the object and the direction of the obstacle relative to the object is increased, and the object is far away from the obstacle.
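A planar sketch of such a steering correction is given below. It assumes the widely used form in which the velocity is rotated by 90 degrees away from the obstacle and scaled by a function of the angle between the motion direction and the obstacle direction; the function name and the parameters `gamma` and `beta` are hypothetical, and the text does not state which formula the module actually uses.

```python
import math


def avoidance_acceleration(position, velocity, obstacle, gamma=100.0, beta=2.0):
    """Acceleration term that turns the velocity away from the obstacle (2-D)."""
    dx, dy = obstacle[0] - position[0], obstacle[1] - position[1]
    vx, vy = velocity
    if math.hypot(vx, vy) < 1e-12 or math.hypot(dx, dy) < 1e-12:
        return (0.0, 0.0)
    cross = vx * dy - vy * dx          # > 0: obstacle lies to the left of the motion
    dot = vx * dx + vy * dy
    phi = math.atan2(abs(cross), dot)  # angle between motion and obstacle direction
    gain = gamma * phi * math.exp(-beta * phi)
    if cross > 0.0:
        turn = (vy, -vx)               # rotate velocity by -90 degrees: steer right
    else:
        turn = (-vy, vx)               # rotate velocity by +90 degrees: steer left
    return (gain * turn[0], gain * turn[1])
```

The term fades once the obstacle is well off the direction of motion, and it is exactly zero when the obstacle is dead ahead, a known limitation of this particular form.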

Further, in one embodiment, the correction unit includes a minimum trajectory error module, which measures the error between the current trajectory of the controlled object and its expected trajectory after obstacle avoidance is finished, and corrects the motion trajectory of the object with a PD control algorithm, with the aim of minimizing this error.
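A PD correction of this kind can be sketched as follows; the gains `kp` and `kd` and the function name are hypothetical, and the expected position and velocity are assumed to come from the undisturbed plan.

```python
def pd_correction(expected_pos, current_pos, expected_vel, current_vel, kp, kd):
    """Acceleration correction from position error (P) and velocity error (D)."""
    return [kp * (pe - pc) + kd * (ve - vc)
            for pe, pc, ve, vc in zip(expected_pos, current_pos,
                                      expected_vel, current_vel)]
```

The correction is zero when the current trajectory already matches the expected trajectory, so it only acts while the agent is returning to the plan.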

After the correction by the correction unit is completed (the obstacle information is zero) or before the correction is started, a planned trajectory command may be output by the learning unit and the reference unit.

The entire trajectory planning instruction may include only the planned trajectory instruction (when there is no obstacle); only the corrected trajectory instruction; or a combination of the planned trajectory instruction and the corrected trajectory instruction.

In one embodiment, as shown in fig. 7, a trajectory planning method is provided, the method comprising the following method steps:

step S510, acquiring target position information of the agent;

step S520, acquiring the current track of the intelligent agent; wherein the current track at least comprises current position information;

step S530, judging whether the intelligent agent is currently located in the resistant domain or not according to the current position information;

specifically, the resistant domain may be set in advance as a certain range around the obstacle;

whether the manipulator is located in the resistant domain is judged according to the current position information of the manipulator;

step S540, if yes, deflecting the target position information according to the current position increment information to generate new target position information;

wherein the current position increment refers to the relative position increment (distance increment and deflection amount) of the current position of the agent with respect to the obstacle or the resistant domain.

In one embodiment, taking the manipulator as an example, the direction in which the new target position deviates from the original target position is determined by the relative position between the current end of the manipulator and the obstacle (the vector between the end and the obstacle), and the deviation distance is obtained by multiplying the distance increment between the current end of the manipulator and the obstacle by an arbitrary scaling factor.

Step S550, inputting the current track and the new target position information into an intelligent power system behavior model, and generating a new planned track instruction;

And step S560, if not, inputting the current track and the target position information into the intelligent power system behavior model, and outputting a planned track instruction.
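Steps S530 to S560 can be sketched as below. The sketch assumes a spherical resistant domain of radius `domain_radius` around the obstacle and interprets the distance increment as the depth by which the agent has entered the domain; both are assumptions, as are the function and parameter names.

```python
import math


def deflect_goal(goal, agent_pos, obstacle_pos, domain_radius, scale):
    """Return the target to use for planning at this step.

    Outside the resistant domain the original goal is returned (step S560);
    inside it the goal is shifted along the obstacle-to-agent direction by
    scale * penetration depth (step S540).
    """
    away = [p - o for p, o in zip(agent_pos, obstacle_pos)]
    dist = math.sqrt(sum(a * a for a in away))
    if dist >= domain_radius or dist < 1e-12:
        return list(goal)
    depth = domain_radius - dist
    return [g + scale * depth * a / dist for g, a in zip(goal, away)]
```

The returned target is then fed, together with the current trajectory, into the behavior model in place of the original target position.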

By adopting the track planning method, the target position is adjusted in real time, so that the track planning precision can be improved; in addition, the generalization ability of the model can be improved.

It should be understood that, although the steps in the flowcharts of fig. 1, 2, 3, 4, 5, 6, 7, etc. are shown sequentially as indicated by the arrows, the steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1, 2, 3, 4, 5, 6, 7, etc. may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 11, there is provided a trajectory planning training device, the device comprising:

a teaching track obtaining module 110, configured to obtain a teaching track in a teaching behavior process executed by a teaching body;

an initial model generation module 120, configured to obtain an initial model of an intelligent power system behavior model;

and the behavior model generation module 130 is used for training the initial model according to the teaching track and outputting the intelligent power system behavior model.

In one embodiment, as shown in fig. 12, there is provided a trajectory planning apparatus, including:

a location information obtaining module 210 for obtaining initial and target location information of the agent;

a behavior model obtaining module 230, configured to obtain an intelligent power system behavior model;

and a planned trajectory generation module 250, configured to input the initial and target position information into the intelligent power system behavior model, and output a planned trajectory instruction.

In one embodiment, as shown in fig. 13, there is provided a trajectory planning apparatus, including:

a target position obtaining module 220, configured to obtain target position information of the agent;

a current trajectory obtaining module 240, configured to obtain a current trajectory of the agent;

a behavior model obtaining module 260, configured to obtain an intelligent power system behavior model;

and the planned track generation module 280 is used for inputting the current track and the target position information into the intelligent power system behavior model and outputting a planned track instruction.

In one embodiment, as shown in fig. 14, there is provided a trajectory planning apparatus, including:

a reference auxiliary parameter obtaining module 310, configured to obtain a reference auxiliary parameter of the agent;

a target position obtaining module 320, configured to obtain target position information of the agent;

a current auxiliary parameter obtaining module 330, configured to obtain current auxiliary parameters of the agent;

an error calculation module 340, configured to calculate a current error between the reference auxiliary parameter and the current auxiliary parameter;

a current trajectory obtaining module 350, configured to obtain a current trajectory of the agent;

and the track generation module 360 is used for inputting the current error, the current track and the target position information into the intelligent power system behavior model and outputting a planned track instruction and/or a corrected track instruction.

Further, in one embodiment, as shown in fig. 15, a reference auxiliary parameter obtaining module is provided, which includes:

an initial and target position acquisition unit 311 for acquiring initial and target position information of the agent;

a behavior model obtaining unit 312, configured to obtain an intelligent power system behavior model;

a reference instruction generating unit 313, configured to input the initial and target position information into an intelligent power system behavior model, and output a reference instruction of trajectory planning;

a reference auxiliary parameter obtaining unit 314, configured to obtain a reference auxiliary parameter during the execution of the reference instruction by the agent.

In one embodiment, as shown in fig. 16, there is provided a trajectory planning apparatus, including:

a target position obtaining module 410, configured to obtain target position information of the agent;

a current trajectory obtaining module 420, configured to obtain a current trajectory of the agent; wherein the current track at least comprises current position information;

a current obstacle information obtaining or generating module 430, configured to obtain or generate current obstacle information;

a behavior model obtaining module 440, configured to obtain an intelligent power system behavior model;

and the trajectory generation module 450 is configured to input the current obstacle information, the current trajectory, and the target position information into the intelligent power system behavior model, and output a planned trajectory instruction and/or a corrected trajectory instruction.

In one embodiment, as shown in fig. 17, there is provided a trajectory planning apparatus, including:

a target location obtaining module 510, configured to obtain a target location of the agent;

a current trajectory obtaining module 520, configured to obtain a current trajectory of the agent; wherein the current track at least comprises current position information;

a judging module 530, configured to judge whether the agent is located in the resistant domain according to the current location information;

a new planned trajectory generation module 550, configured to input the current trajectory and the new target position information into the intelligent power system behavior model to generate a new planned trajectory instruction;

and a planned trajectory generation module 560, configured to, if not, input the current trajectory and the target position into the intelligent power system behavior model and output a planned trajectory instruction.

For the specific limitations of the trajectory planning training device and the trajectory planning device, reference may be made to the above limitations of the trajectory planning training method and the trajectory planning method, which are not described herein again. All or part of the modules in the trajectory planning training device and the trajectory planning device can be realized by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

In one embodiment, as shown in fig. 20, there is provided an agent training system, the system comprising:

a teaching body 500 for performing a teaching action;

a teaching track generating device 600, configured to obtain or generate a teaching track according to the teaching behavior;

the training device 710 is used for acquiring a teaching track in the process of teaching behavior executed by a teaching main body; acquiring an initial model of an intelligent power system behavior model; and training the initial model according to the teaching track, and outputting an intelligent power system behavior model.

For the specific definition of the training device, reference may be made to the above definition of the trajectory planning training method, which is not described herein again.

In one embodiment, as shown in fig. 8, 9 or 10, there is provided an intelligent system comprising: agent 800 and control device 720;

control means 720 for obtaining initial and target location information of the agent; acquiring an intelligent power system behavior model; inputting the initial and target position information into an intelligent power system behavior model, and outputting a planning track instruction; or

Acquiring target position information of an agent; acquiring a current track of the agent; wherein the current trajectory includes at least current location information; judging whether the intelligent agent is located in a resistant domain or not according to the current position information; if yes, deflecting the target position information according to the current position increment information to generate new target position information; inputting the current track and the new target position information into the intelligent power system behavior model to generate a new planned track instruction; if not, inputting the current track and the target position into the intelligent power system behavior model, and outputting a planning track instruction.

And the agent 800 is configured to execute a corresponding trajectory motion according to the planned trajectory instruction and/or the corrected trajectory instruction.

For the specific definition of the control device, reference may be made to the above definition of the trajectory planning method, which is not described herein again.

Further, in one embodiment, the intelligent system further comprises the agent training system of the above embodiment.

The training device and the control device may be a Programmable Logic Controller (PLC), a Field Programmable Gate Array (FPGA), a Computer (PC), an Industrial Personal Computer (IPC), a server, or the like. The control device generates program instructions according to a preset program by combining information and parameters input manually or data collected by an external first sensor and/or second sensor (such as an image sensor).

In one embodiment, as shown in fig. 21, a computer device is provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the trajectory planning training method and/or the trajectory planning method described in the above embodiments when executing the computer program.

In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the trajectory planning training method and/or the trajectory planning method described in the above embodiments.

In one embodiment, a computer-readable storage medium having stored thereon the intelligent power system behavior model for trajectory planning described in the above embodiments is provided.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

It should be noted that the agent, the teaching subject, the training device, the control device, and/or the sensor mentioned in the trajectory planning training method, the trajectory planning method, the training device, the trajectory planning device, the training system, or the intelligent system, etc. may be a real object in a real environment, or a virtual object in a simulation platform, in which case the effect of connecting the real object is achieved through the simulation environment. A control device that has completed behavior training in the virtual environment can be transferred to the real environment to control or retrain the real object, which saves resources and time in the training process.

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.

The terms "first," "second," "third," "S110," "S120," "S130," and the like in the claims and in the description and in the drawings above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover non-exclusive inclusions. For example: a process, method, system, article, or robot that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but includes other steps or modules not explicitly listed or inherent to such process, method, system, article, or robot.

It should be noted that the embodiments described in the specification are preferred embodiments, and the structures and modules involved are not necessarily essential to the invention, as will be understood by those skilled in the art.

The above-mentioned embodiments express only several embodiments of the present invention, and although their description is relatively specific and detailed, it is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A trajectory planning method, characterized in that the method comprises:

acquiring initial and target position information of an agent;

acquiring an intelligent power system behavior model; and

2. The trajectory planning method according to claim 1, wherein when the agent is a robot, the trajectory planning instruction is:

3. A trajectory planning method, characterized in that the method comprises:

acquiring target position information of an agent;

acquiring a current track of the agent;

acquiring an intelligent power system behavior model; and

4. The trajectory planning method according to claim 3, wherein when the agent is a robot, the trajectory planning instruction is:

5. A trajectory planning method, characterized in that the method comprises:

acquiring a reference auxiliary parameter of an agent;

acquiring target position information of the agent;

acquiring current auxiliary parameters of the agent;

acquiring a current track of the agent; and

6. The trajectory planning method according to claim 5, wherein the acquiring of a reference auxiliary parameter of an agent comprises:

acquiring initial and target position information of the agent;

7. The trajectory planning method according to claim 5 or 6, wherein when the agent is a manipulator, the planned trajectory instruction and/or the revised trajectory instruction is/are:

8. A trajectory planning method, characterized in that the method comprises:

acquiring target position information of an agent;

acquiring a current track of the agent;

acquiring or generating current obstacle information of an obstacle; and

9. A trajectory planning method, characterized in that the method comprises:

acquiring target position information of an agent;

10. A trajectory planning apparatus, characterized in that the apparatus comprises:

and a planning track generation module, configured to input the initial and target position information into the intelligent power system behavior model and output a planning track instruction.

11. An agent trajectory planning apparatus, the apparatus comprising:

12. A trajectory planning apparatus, characterized in that the apparatus comprises:

13. The trajectory planning apparatus according to claim 12, wherein the reference auxiliary parameter acquisition module comprises:

14. A trajectory planning apparatus, characterized in that the apparatus comprises:

15. A trajectory planning apparatus, characterized in that the apparatus comprises:

16. An agent system, characterized in that the system comprises a control device and an agent;

17. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the trajectory planning method according to any one of claims 1-9 when executing the computer program.

18. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the trajectory planning method according to any one of claims 1-9.