KR101819323B1 - Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor - Google Patents
- Publication number
- KR101819323B1 (application KR1020160032597A)
- Authority
- KR
- South Korea
- Prior art keywords
- unit
- behaviors
- trajectory
- demonstration
- behavior
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1656—Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664—Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/163—Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/1653—Programme controls characterised by the control loop parameters identification, estimation, stiffness, accuracy, error analysis
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1674—Programme controls characterised by safety, monitoring, diagnostic
- B25J9/1676—Avoiding collision or forbidden zones
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Manipulator (AREA)
Abstract
A method and apparatus generate a task behavior trajectory for a robot based on imitation learning and motion composition. A method for generating a task behavior trajectory of a robot according to an embodiment of the present invention includes: obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; classifying the divided unit behaviors into groups of common unit behaviors; retrieving, from a database storing a learned probability model for each unit behavior, the probability model corresponding to the unit behavior of each classified group; generating a representative trajectory for the unit behavior of each classified group using the retrieved probability models; and reproducing the demonstration trajectory based on the generated representative trajectories.
Description
The present invention relates to a technique for generating a task behavior trajectory of a robot based on imitation learning, in which the robot imitates human behavior. More particularly, the present invention relates to a method and apparatus for generating a behavior trajectory of a robot that can actively divide and analyze a series of consecutive human task motions and reproduce them as one continuous action.
Early robots were limited to performing repetitive tasks on behalf of humans in order to automate production sites and reduce manned operation. Recently, however, service robots that require complex interaction with people, such as guide robots and educational robots, have been developed, and due to the diversification of products, factory robots are also required to be scalable to new jobs.
Also, home service robots that replace or assist people's work in the home are expected in the near future, and their research and development is actively being carried out.
In this environment, imitation learning is being studied as a method for ensuring the scalability of robots. When a robot is required to perform a new task in a new environment, imitating human behavior is a learning-based method that broadens the range of actions the robot can perform. Instead of requiring a person to explicitly specify the goals and conditions of a task, imitation learning offers a generalized approach to various tasks by drawing on demonstrations in which those goals are already reflected. It is also an intuitive, user-friendly way of teaching, since imitating demonstrations resembles how people themselves learn new behaviors.
In research on imitation learning, methods for modeling the dynamic characteristics of unit behaviors and methods for generating adaptive trajectories from basic motions have mainly been studied. Unit behavior modeling learns the essential, noise-free features of a motion from multiple demonstrations. Adaptive trajectory generation modifies a basic trajectory according to external conditions, such as a changed target point. These methods are effective for learning a single motion and applying it in the various environments the robot faces.
However, when a person actually performs a task, he or she performs a series of task motions consecutively, so a natural demonstration trajectory contains several motions. Existing algorithms, on the other hand, are limited to a single unit behavior; for the robot to learn a task, a human must isolate each motion and demonstrate it to the robot separately, which is a clear limitation.
Embodiments of the present invention provide a method and apparatus for generating a behavior trajectory of a robot that, based on imitation learning and motion composition, can actively divide, analyze, and reproduce the series of consecutive motions that occur when a person performs a task.
More specifically, embodiments of the present invention provide a method and apparatus that model the dynamic characteristics of each unit behavior with a probability model and, using the learned probability models, generate a motion trajectory with which the robot can reproduce a human demonstration trajectory as one continuous action.
A method for generating a task behavior trajectory of a robot according to an embodiment of the present invention includes: obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; classifying the divided unit behaviors into groups of common unit behaviors; retrieving, from a database storing a learned probability model for each unit behavior, the probability model corresponding to the unit behavior of each classified group; generating a representative trajectory for the unit behavior of each classified group using the retrieved probability models; and reproducing the demonstration trajectory based on the generated representative trajectories.
The dividing into the unit behaviors may include distinguishing, in the demonstration trajectory, motion sections in which a motion is performed from stop sections between the motion sections, and dividing the demonstration trajectory into the unit behaviors based on the distinguished motion sections and stop sections.
The dividing into the unit behaviors may calculate an energy value for the demonstration trajectory and classify an interval in which the calculated energy value is equal to or greater than a predetermined threshold value as a motion section.
Further, a method for generating a behavior trajectory of a robot according to an embodiment of the present invention may include: obtaining a symbol for each of the retrieved probability models; and expressing the divided unit behaviors as a sequence of the obtained symbols, wherein the reproducing of the demonstration trajectory arranges and connects the generated representative trajectories according to the sequence of symbols to reproduce the demonstration trajectory.
The reproducing of the demonstration trajectory may reproduce the demonstration trajectory by deforming and connecting the generated representative trajectories according to a preset target position.
The searching for the probability model may retrieve the probability model corresponding to the unit behavior of each classified group based on the similarity between the unit behavior of each classified group and the probability models stored in the database.
The searching for the probability model may retrieve the probability model corresponding to the unit behavior of each classified group from the database and train the retrieved probability model using the unit behavior of each classified group, and the generating of the representative trajectories may generate a representative trajectory for each of the divided unit behaviors using the trained probability model.
Further, a method for generating a task behavior trajectory of a robot according to an embodiment of the present invention may include: when a probability model corresponding to at least one of the unit behaviors of the classified groups is not retrieved from the database, generating a probability model for the dynamic characteristics of that unit behavior; and storing the generated probability model in the database for learning.
An apparatus for generating a task behavior trajectory of a robot according to an embodiment of the present invention includes: a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; a classification unit for classifying the divided unit behaviors into groups of common unit behaviors; a database for storing a learned probability model for each of the unit behaviors; a search unit for retrieving from the database the probability model corresponding to the unit behavior of each classified group; a generation unit for generating representative trajectories for the unit behaviors of the classified groups using the retrieved probability models; and a reproduction unit for reproducing the demonstration trajectory based on the generated representative trajectories.
The division unit may distinguish, in the demonstration trajectory, motion sections in which a motion is performed from stop sections between the motion sections, and divide the demonstration trajectory into the unit behaviors based on the distinguished sections.
The division unit may calculate an energy value for the demonstration trajectory and classify an interval in which the calculated energy value is equal to or greater than a predetermined threshold value as a motion section.
Furthermore, an apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may include: an acquisition unit for obtaining a symbol for each of the retrieved probability models; and a representation unit for expressing the divided unit behaviors as a sequence of the obtained symbols, wherein the reproduction unit arranges and connects the generated representative trajectories in the order of the symbol sequence to reproduce the demonstration trajectory.
The reproduction unit may reproduce the demonstration trajectory by deforming and connecting the generated representative trajectories according to a preset target position.
The search unit may retrieve the probability model corresponding to the unit behavior of each classified group based on the similarity between the unit behavior of each classified group and the probability models stored in the database.
Furthermore, when the probability model corresponding to the unit behavior of each classified group is retrieved from the database, the apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may further include a learning unit for training the retrieved probability model using the unit behavior of the corresponding group.
Furthermore, the apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may further include a model generation unit which, when a probability model corresponding to at least one of the unit behaviors of the classified groups is not retrieved from the database, generates a probability model for the dynamic characteristics of that unit behavior and stores the generated probability model in the database.
According to embodiments of the present invention, the robot can actively divide and analyze a series of consecutive motions that occur when a person performs a task, based on imitation learning and motion composition, and reproduce them as one continuous action.
Therefore, according to embodiments of the present invention, the robot can learn simply by being shown the entire work process, without a person separately demonstrating each individual motion, and a more flexible behavior trajectory can be generated by specifying the target point of each step in the work process.
According to embodiments of the present invention, convenience is improved by minimizing user intervention when teaching the robot a new behavior, and the robot can actively extract the necessary information even when the user's proficiency is low, so learning can be performed effectively.
According to embodiments of the present invention, when learning a similar task behavior that recombines previously learned unit behaviors, learning can proceed quickly based on the existing analysis, and various applications become possible by combining the learned unit behaviors.
Further, according to the embodiments of the present invention, the following effects can be obtained.
First, by imitating the way humans learn behavior and applying it to robots, the functions of a robot can be extended intuitively and easily, which makes this approach easier to apply than existing learning algorithms in terms of user scenarios.
Second, the robot can learn effectively even when the performer's proficiency is low, because the robot analyzes and extracts the necessary information from the demonstration rather than requiring the user to provide perfectly accurate demonstration data.
Third, human actions are divided, analyzed, and stored, so they can be applied and extended simply by recombining them.
FIG. 1 is a flowchart illustrating a process of learning a probability model for unit behavior according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of generating a work behavior trajectory of a robot according to an embodiment of the present invention.
FIG. 3 shows an example of dividing a demonstration trajectory by distinguishing stop sections from motion sections based on kinetic energy.
FIG. 4 is a diagram illustrating an example of a result obtained by connecting representative trajectories of a unit behavior model generated by the method according to the present invention.
FIG. 5 is a block diagram illustrating a configuration of an apparatus for generating a work behavior trajectory of a robot according to an embodiment of the present invention.
Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings. The present invention, however, is not limited by these embodiments. Like reference numerals in the drawings denote like elements.
Embodiments according to the present invention provide a method and apparatus by which a robot can actively divide and analyze a series of consecutive motions that occur when a person performs a task, based on imitation learning and motion composition, and reproduce them as one continuous action.
In this case, embodiments of the present invention divide the demonstration trajectory of a user or another human into a plurality of unit behaviors, classify the unit behaviors into groups of common unit behaviors, generate and train a probability model for the unit behavior of each group, and reproduce the human demonstration trajectory on the robot using the learned probability model of each unit behavior.
For each of the classified unit behaviors, the intrinsic dynamic characteristics of the motion can be learned with a probability model based on a GMM (Gaussian Mixture Model).
FIG. 1 is a flowchart illustrating a process of learning a probability model for unit behavior according to an embodiment of the present invention.
Referring to FIG. 1, a process of generating and learning a probability model acquires a human demonstration trajectory and divides the obtained demonstration trajectory into a plurality of unit behaviors (S110, S120).
Here, the demonstration trajectory obtained in step S110 can be acquired through various motion capture techniques as data recording a person demonstrating the task the robot is to learn. For example, an inertial measurement unit (IMU) can be mounted on each joint of a human arm to acquire its orientation; the orientations are then applied to a human body model, and the joint values are calculated so as to match the IMU orientation value at each joint.
If the human-model motion is applied directly to a robot, there is the problem that the robot's link lengths and joint characteristics differ from a human's. For a humanoid robot comparatively similar to a human, however, the motion can be transferred through inverse kinematics. Since the characteristics of an action are expressed mainly through the position trajectory, i.e., the three-dimensional position and orientation of the hand, a similar motion can be obtained by matching the trajectory of the robot hand to the trajectory of the human hand in the obtained demonstration trajectory. However, the position of the elbow must also be considered, since the arm configuration can vary with the elbow position even when the position and orientation of the hand coincide.
Even with inverse kinematics, the link lengths of the human and the robot differ, resulting in a difference in workspace. Therefore, a robot model whose link lengths match the human body model is prepared, and the human demonstration motion is converted into a joint trajectory of the robot by matching, through inverse kinematics, the characteristic elements of the motion represented by the hand and elbow positions.
Step S120 divides the obtained demonstration trajectory into a plurality of unit behaviors; the demonstration trajectory, or the robot joint trajectory obtained from it, can be divided into unit behaviors using kinetic energy. For example, when a person performs a series of actions, a brief pause usually occurs as each step ends. Accordingly, in step S120, the demonstration trajectory is divided into a plurality of unit behaviors by distinguishing the motion sections, in which a motion is performed, from the stop sections between them, thereby obtaining the unit behavior of each stage of the action.
As described above, the characteristics of a motion are expressed mainly through the three-dimensional position and orientation of the hand. The information needed here is whether the motion is in a dynamic or a static state; computing the actual kinetic energy of the whole arm is not only computationally complex but also tied to the specific characteristics of the robot, so a simplified formula can be applied. For example, the present invention ignores the distributed mass of the robot arm and assumes the mass is concentrated at the hand. Kinetic energy is the product of mass and the square of velocity, but since only relative energy values are needed, the constant mass can be dropped. Because orientation is also considered and an angular velocity exists, the rotational term is included with the rotational inertia treated as a fixed value, giving Equation (1) below.
[Equation 1]

E = w1‖v‖² + w2‖ω‖²

where v and ω are the linear and angular velocities of the hand. Here, w1 and w2 are weights determined by the ratio between the kinetic energy due to the position change and that due to the rotation (i.e., by the mass and the rotational inertia), and appropriate values are selected according to the results of segmenting actual data.
Also, it is important to remove noise through a low-pass filter (LPF), since the energy value computed from the measured trajectory contains high-frequency noise introduced by differentiating the position data.
Accordingly, in step S120, the stop sections and motion sections of the demonstration trajectory can be distinguished based on the energy value obtained through Equation (1): a predetermined threshold is applied, intervals in which the energy value exceeds the threshold are classified as motion sections, and the intervals between these motion sections are classified as stop sections.
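This energy-thresholding segmentation can be sketched as follows; the weights, filter coefficient, and threshold below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def segment_motion(pos, ang_vel, dt=0.01, w1=1.0, w2=0.1,
                   alpha=0.2, threshold=0.05):
    """Split a demonstration trajectory into motion/stop samples.

    pos:     (T, 3) hand positions
    ang_vel: (T, 3) hand angular velocities
    Returns a boolean mask (True = motion sample).
    """
    # Simplified "kinetic energy" of Equation (1): mass and rotational
    # inertia are treated as constants folded into the weights w1, w2.
    vel = np.gradient(pos, dt, axis=0)
    energy = w1 * np.sum(vel**2, axis=1) + w2 * np.sum(ang_vel**2, axis=1)

    # First-order low-pass filter to suppress differentiation noise.
    filtered = np.empty_like(energy)
    acc = energy[0]
    for t, e in enumerate(energy):
        acc = alpha * e + (1 - alpha) * acc
        filtered[t] = acc

    return filtered >= threshold

def split_sections(mask):
    """Turn the boolean mask into (start, end) index pairs per motion."""
    edges = np.flatnonzero(np.diff(mask.astype(int)))
    bounds = np.r_[0, edges + 1, len(mask)]
    return [(s, e) for s, e in zip(bounds[:-1], bounds[1:]) if mask[s]]
```

On a trajectory that moves, pauses, then moves again, `split_sections` yields one index range per motion section, mirroring the A–D sections of FIG. 3.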
For example, FIG. 3 shows the position of the hand along the demonstration trajectory together with its energy value, and it can be seen that the motion sections and the intervening stop sections are clearly distinguished by the energy value. The upper graph in FIG. 3 is the three-dimensional position trajectory of the hand in the demonstration, with the x, y, and z coordinates drawn as red, green, and blue lines, respectively. In the kinetic-energy graph, the boundaries between the motion sections (A, B, C, and D) and the stop sections are drawn as red lines, and the images at the bottom are simulation snapshots showing the moment of each divided motion.
If it is divided into unit behaviors in step S120, the divided unit behaviors are classified into groups of common unit behaviors (S130).
At this time, in step S130, unit behaviors having the same meaning are classified into the same group based on the similarity of the trajectories of the divided unit behaviors, so that the unit behaviors can be classified into a plurality of groups.
It is then determined whether a probability model corresponding to the unit behavior of each group classified in step S130 exists in the database; if so, the probability model is trained using that unit behavior (S140, S170).
The database (DB) in the present invention accepts new demonstrations and is trained incrementally; each learned behavior is stored as a behavior model, i.e., a probability model, rather than as the raw demonstration data. Therefore, the similarity used for classification must be defined between a newly introduced demonstration trajectory and a previously learned model.
Here, in step S140, whether a probability model for the corresponding group exists in the database can be determined using a similarity between data and model, so as to decide which behavior model the demonstration data relates to. The similarity between data and model is defined as follows: the distribution at each time step is extracted from the learned model by regression, and a dissimilarity based on the Mahalanobis distance is defined. Given the input trajectory sample ξ_t at time step t and the distribution N(μ_t, Σ_t) at each time step computed by GMR (Gaussian Mixture Regression), the Mahalanobis dissimilarity can be written as Equation (2) below.

[Equation 2]

D = (1/T) Σ_{t=1}^{T} √( (ξ_t − μ_t)ᵀ Σ_t⁻¹ (ξ_t − μ_t) )
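A minimal sketch of such a data-to-model dissimilarity, assuming the GMR regression has already produced a mean and covariance per time step (averaging the per-step distances over the trajectory is our assumption about the exact form):

```python
import numpy as np

def mahalanobis_dissimilarity(traj, means, covs):
    """Average Mahalanobis distance between a trajectory and a
    per-time-step Gaussian (e.g. obtained from GMR).

    traj:  (T, D) observed trajectory samples
    means: (T, D) GMR means per time step
    covs:  (T, D, D) GMR covariances per time step
    """
    dists = []
    for x, mu, cov in zip(traj, means, covs):
        diff = x - mu
        d2 = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
        dists.append(np.sqrt(d2))
    return float(np.mean(dists))
```

A trajectory identical to the model's means scores 0; larger values mean the demonstration fits the model less well.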
Since the demonstration trajectory may be distorted in time, time alignment using DTW (Dynamic Time Warping) can be applied. DTW finds and matches sample pairs so as to minimize the total distance between them; on this basis, the comparison minimizes the dissimilarity when the demonstration trajectory is fitted to the model.
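The alignment step can use a textbook dynamic-programming DTW, sketched here for 1-D sequences (the absolute-difference cost and boundary handling are standard defaults, not taken from the patent):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of match / insertion / deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]
```

Note that a time-stretched copy of a sequence has DTW distance 0, which is exactly the temporal-distortion tolerance the comparison needs.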
In the overall system, when demonstration data arrives, the model with the minimum dissimilarity is found and trained incrementally; when the minimum dissimilarity exceeds a predetermined reference value, it is determined that no similar model exists and a new model is created.
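This decision rule can be sketched as follows; the threshold value is a hypothetical placeholder.

```python
def select_or_create(dissimilarities, threshold=5.0):
    """Return (index, is_new): which model to update, or a new-model flag.

    dissimilarities: list of data-to-model dissimilarities, one per
    stored behavior model (empty if the database is empty).
    """
    if not dissimilarities:
        return None, True          # empty database: always create a new model
    best = min(range(len(dissimilarities)), key=dissimilarities.__getitem__)
    if dissimilarities[best] > threshold:
        return None, True          # nothing similar enough: create a new model
    return best, False             # train the best-matching model further
```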
That is, if it is determined in step S140 that no probability model corresponding to the unit behavior of a certain group exists in the database, a probability model for the dynamic characteristics of the unit behavior of each such group is generated, a symbol is assigned to each generated probability model, and the models are stored in the database (S150, S160).
In this case, each new probability model is numbered, and the number is used as the symbol representing the corresponding motion. In step S150, the process of performing the task can thus be expressed as a sequence of symbols according to the kinds of unit behaviors appearing in the demonstration trajectory.
The GMM (Gaussian Mixture Model) is a probabilistic model for nonlinearly distributed data samples and is well suited to modeling behavior trajectories, whose nonlinear characteristics can produce various patterns. Moreover, as a probability model it is robust against noise and some spatial distortion. It also supports the incremental learning required by the system or apparatus proposed by the present invention. There are also incremental learning models that combine DTW (Dynamic Time Warping), to overcome temporal distortion of the trajectory, with PCA (Principal Component Analysis) to reduce dimensionality.
The probability model generated in steps S150 and S160 and stored in the database is then trained with the unit behaviors divided from the human demonstration trajectory (S170).
That is, the classified demonstration data can be modeled and learned effectively through the probability models stored in the database.
FIG. 2 is a flowchart illustrating a method for generating a task behavior trajectory of a robot according to an embodiment of the present invention. Unlike the process of FIG. 1, it is assumed here that all probability models in the database have already been trained.
Referring to FIG. 2, the method according to an embodiment of the present invention acquires a human demonstration trajectory to be reproduced and divides the obtained demonstration trajectory into a plurality of unit behaviors (S210, S220).
Here, steps S210 and S220 are the same as steps S110 and S120 in Fig. 1, and a description thereof will be omitted.
If it is divided into unit behaviors in step S220, the divided unit behaviors are classified into groups of common unit behaviors (S230).
At this time, step S230 is the same as step S130 in FIG. 1, and a description thereof will be omitted.
If the unit behaviors have been classified into groups in step S230, the probability model corresponding to the unit behavior of each classified group is retrieved from the database, and the symbol of each retrieved probability model is obtained (S240, S250).
At this time, the probability model corresponding to the unit behavior of each classified group can be further trained with the unit behavior of that group.
Of course, if the database contains no probability model for some group, steps S150 through S170 of FIG. 1 may be performed; in FIG. 2, it is assumed that probability models are found for all groups, and that case is not described further.
When the symbols of the probability models retrieved in step S240 are obtained in step S250, the unit behaviors of the demonstration trajectory to be reproduced are represented as a symbol sequence, and representative trajectories of the retrieved probability models are generated (S260, S270).
When a representative trajectory has been generated for each probability model in step S270, the representative trajectories are arranged and connected according to the order of the symbol sequence expressed in step S260, thereby reproducing the demonstration trajectory (S280).
Here, in step S280, the generated representative trajectories are deformed and connected according to the preset target positions to reproduce the demonstration trajectory.
Specifically, to generate the task trajectory, the robot receives the symbol sequence expressing the task steps and the target positions of the motions, determined by analyzing the demonstration trajectory and the objects being interacted with. In the adaptive trajectory generation and connection step, representative trajectories are generated from each behavior model according to the list of unit behaviors appearing in the symbol sequence, and they are connected and transformed according to the task goal.
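In outline, the reproduction step walks the symbol sequence, emits each model's representative trajectory, and shifts each segment to start where the previous one ended. This naive translation-only connection is a sketch; it leaves the velocity discontinuities that the MTM step is meant to remove.

```python
import numpy as np

def connect_by_sequence(symbol_seq, rep_trajs):
    """Concatenate representative trajectories in symbol-sequence order.

    symbol_seq: e.g. ["A", "B", "A"]
    rep_trajs:  dict symbol -> (T_i, 3) representative trajectory
    Each segment is translated so it starts at the previous end point.
    """
    pieces = []
    cursor = None
    for sym in symbol_seq:
        seg = np.asarray(rep_trajs[sym], dtype=float)
        if cursor is not None:
            seg = seg - seg[0] + cursor   # translate to previous end point
        pieces.append(seg)
        cursor = seg[-1]
    return np.vstack(pieces)
```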
The MTM (Motion Trajectory Morphing) algorithm generates, for a single behavior trajectory, a trajectory that stably reaches the target point while remaining similar to the original trajectory. The algorithm is generally applicable to multidimensional trajectories.
Although the intrinsic characteristics of a behavior mainly lie in the end-effector trajectory, the end-effector trajectory alone cannot capture all characteristics of the behavior; it is therefore reasonable to model the behavior, or its probability model, on the joint trajectory. On the other hand, since the target point is given as a three-dimensional position, there is a gap between the representative trajectory generated from the model and the target point. To resolve this, the joint values of the original behavior trajectory are taken as the initial values, and the joint values corresponding to the given target point, obtained through inverse kinematics, are approached asymptotically as the target values in joint space.
The MTM algorithm can perform this effectively when connecting the trajectories obtained from the behavior models at each stage of the task. Where a target point is separately given for a task step, the end point of that motion and the start point of the next motion are modified to match the target point; otherwise, as shown in FIG. 4, the trajectories are connected so that a continuous trajectory is obtained.
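One common way to realize such morphing is to add a progressively growing offset so the trajectory ends exactly at the new target while early samples stay close to the original. This linear-offset blend is our own illustrative stand-in for the MTM algorithm, whose exact formulation the patent does not spell out here.

```python
import numpy as np

def morph_to_target(traj, target):
    """Deform a trajectory so its last point lands on `target`.

    The correction offset grows linearly from 0 at the start to the
    full end-point error at the end, preserving the trajectory's shape
    near its beginning.
    """
    traj = np.asarray(traj, dtype=float)
    error = np.asarray(target, dtype=float) - traj[-1]
    ramp = np.linspace(0.0, 1.0, len(traj))[:, None]
    return traj + ramp * error
```

Applying this to the end of one step and the start of the next pulls both onto a shared target point, which removes the connection-point discontinuity in the spirit of FIG. 4.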
In the graph of FIG. 4, the blue line shows that simply connecting the representative trajectories leaves a discontinuity at each connection point, while the red line shows a single trajectory in which the start and end points of each step have been adjusted through the MTM algorithm so that the discontinuities disappear.
As described above, in the method according to embodiments of the present invention, the robot actively divides and analyzes a series of consecutive motions that occur when a person performs a task, based on imitation learning and motion composition. The robot can thus learn simply by being shown the entire work process, without each individual motion being demonstrated separately, and a more flexible behavior trajectory can be generated by specifying the target point of each step in the work process.
In addition, the method according to embodiments of the present invention minimizes user intervention when teaching the robot a new behavior, improving convenience, and allows the robot to actively extract the necessary information even when the user's proficiency is low, so that learning can be performed effectively.
In addition, when learning a similar task behavior that recombines previously learned unit behaviors, the method according to embodiments of the present invention can learn quickly based on the existing analysis results.
FIG. 5 is a block diagram illustrating the configuration of an apparatus for generating a work behavior trajectory of a robot according to an embodiment of the present invention, that is, an apparatus for performing the method described with reference to FIGS. 1 to 4.
Referring to FIG. 5, the apparatus according to an embodiment of the present invention includes a database, a dividing unit, a classifying unit, a searching unit, a learning unit, an acquiring unit, an expressing unit, a generating unit, and a reproducing unit.
The database (DB) is a means for storing a probability model for each of the unit behaviors together with its symbol.
Here, as the database accepts new demonstrations, each probability model is learned incrementally; the learned behaviors are stored as behavioral models or probability models, and the demonstration data itself need not be stored.
The dividing unit divides a demonstration trajectory, obtained by demonstrating a task, into unit behaviors.
At this time, the dividing unit may distinguish, in the demonstration trajectory, operation intervals in which motion is performed from the stop intervals between them, and divide the demonstration trajectory into unit behaviors on that basis; for example, it may calculate an energy value of the demonstration trajectory and treat every interval whose energy value is equal to or greater than a preset threshold as an operation interval.
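The energy-thresholded segmentation described above can be sketched as follows, using a scalar velocity signal and a squared-velocity energy measure as illustrative assumptions; the function name and the energy proxy are not taken from the patent.

```python
def segment_by_energy(velocities, threshold):
    """Split a demonstration into motion intervals (energy >= threshold)
    and stop intervals, returning (label, start, end) tuples over sample
    indices. The squared-velocity energy proxy is an assumption."""
    energy = [v * v for v in velocities]
    segments = []
    start = 0
    moving = energy[0] >= threshold
    for i in range(1, len(energy)):
        now_moving = energy[i] >= threshold
        if now_moving != moving:
            # boundary between a motion interval and a stop interval
            segments.append(("motion" if moving else "stop", start, i))
            start, moving = i, now_moving
    segments.append(("motion" if moving else "stop", start, len(energy)))
    return segments
```

Each "motion" interval then becomes one candidate unit behavior, with the "stop" intervals serving as the natural cut points.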
The classifying unit classifies the divided unit behaviors into groups of common unit behaviors.
At this time, the classifying unit may group together unit behaviors that correspond to the same behavior, for example based on the similarity between their trajectories.
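One plausible way to group common unit behaviors, assuming similarity is measured on the trajectories themselves, is greedy clustering under a dynamic-time-warping (DTW) distance. The patent does not prescribe this particular measure, so the sketch below is only an assumption.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D trajectories."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]

def group_behaviors(behaviors, tol):
    """Greedily assign each unit behavior to the first group whose
    representative (first member) is within DTW distance tol."""
    groups = []
    for b in behaviors:
        for g in groups:
            if dtw_distance(g[0], b) < tol:
                g.append(b)
                break
        else:
            groups.append([b])  # no close group: start a new one
    return groups
```

DTW tolerates the small timing differences between repetitions of the same behavior, which plain pointwise distances would penalize.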
The searching unit retrieves, from the database, a probability model corresponding to the unit behavior of each of the classified groups.
At this time, the searching unit may retrieve the probability model based on the similarity between the unit behavior of each classified group and the probability models stored in the database.
The learning unit learns the probability models stored in the database using the classified unit behaviors.
At this time, when no probability model corresponding to a unit behavior of the classified groups is retrieved from the database, the learning unit may generate a probability model for the dynamic characteristics of that unit behavior and store it in the database.
When a probability model corresponding to the unit behavior of each of the classified groups is retrieved from the database, the learning unit learns the retrieved probability model using the unit behaviors of the classified groups.
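How retrieval and incremental learning might interact can be sketched with a deliberately simplified model store: each "probability model" is reduced to a running mean trajectory, and mean absolute deviation stands in for model likelihood. The `retrieve_and_update` helper and the database layout are hypothetical, not the patent's representation.

```python
def model_distance(model_mean, behavior):
    """Mean absolute deviation of an (equal-length) behavior from a stored
    model's mean trajectory -- a crude stand-in for model likelihood."""
    return sum(abs(m - x) for m, x in zip(model_mean, behavior)) / len(model_mean)

def retrieve_and_update(db, behavior, tol):
    """Return the best-matching model symbol and fold the new behavior
    into its running mean; return None if nothing is close enough."""
    best = min(db, key=lambda s: model_distance(db[s]["mean"], behavior),
               default=None)
    if best is None or model_distance(db[best]["mean"], behavior) >= tol:
        return None
    m = db[best]
    m["count"] += 1
    # incremental mean update: mu += (x - mu) / n
    m["mean"] = [mu + (x - mu) / m["count"] for mu, x in zip(m["mean"], behavior)]
    return best
```

A real implementation would store genuine probabilistic models (e.g., HMMs or GMMs) and update their parameters incrementally, but the control flow — match first, then refine the matched model with the new demonstration — is the same.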
The acquiring unit obtains the symbol of each of the retrieved probability models.
Here, the symbol may be assigned when the corresponding probability model is generated.
The expressing unit expresses the divided unit behaviors as a sequence of the obtained symbols.
The generating unit generates a representative trajectory for the unit behavior of each of the classified groups using the retrieved probability models.
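As a minimal stand-in for generating a representative trajectory from a learned probability model, the sketch below uses a per-timestep average over a group's equal-length demonstrations; the patent's models are probabilistic (the average is used here purely for illustration, and assumes the demonstrations have been resampled to a common length).

```python
def representative_trajectory(demos):
    """Per-timestep mean over equal-length demonstrations of one group --
    an illustrative stand-in for regression over a learned probability
    model (e.g., GMR over a GMM)."""
    n = len(demos)
    return [sum(d[t] for d in demos) / n for t in range(len(demos[0]))]
```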
The reproducing unit reproduces the demonstration trajectory based on the generated representative trajectories.
At this time, the reproducing unit may reproduce the demonstration trajectory by arranging and connecting the generated representative trajectories according to the order of the symbol sequence.
Furthermore, the reproducing unit may reproduce the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.
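Putting the reproducing unit's two behaviors together, a sketch that concatenates the representative trajectory of each symbol in sequence order and then warps the result to end at a preset target value might look like this; the linear endpoint warp and the function name are illustrative choices, not the patent's method.

```python
def reproduce(sequence, trajectories, target=None):
    """Concatenate per-symbol representative trajectories in sequence
    order; optionally warp the result so it ends exactly at target,
    fading the correction in linearly from the start."""
    out = []
    for sym in sequence:
        out.extend(trajectories[sym])
    if target is not None and len(out) > 1:
        gap = target - out[-1]
        n = len(out)
        out = [x + gap * i / (n - 1) for i, x in enumerate(out)]
    return out
```

Usage: `reproduce(["A", "B"], {"A": rep_a, "B": rep_b}, target=goal)` rebuilds the demonstrated task from two learned unit behaviors and bends the reproduction onto a new goal.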
Although not all of these operations are illustrated in FIG. 5, the apparatus of FIG. 5 may perform all of the operations described above with reference to FIGS. 1 through 4 and may include all of the corresponding features.
The system or apparatus described above may be implemented as hardware components, software components, and/or a combination of both. For example, the systems, apparatuses, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field-programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications on the OS, and may access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single element, but those skilled in the art will recognize that it may comprise a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively. The software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, so as to be interpreted by the processing device or to provide instructions or data to it. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.
The method according to the embodiments may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code produced by a compiler as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments. For example, suitable results may be achieved even if the described techniques are performed in a different order from the described method, and/or if components of the described systems, structures, devices, and circuits are combined or arranged in a different form, or are replaced or substituted by other components or their equivalents.
Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.
Claims (16)
Dividing a demonstration trajectory into unit behaviors;
Classifying the divided unit behaviors into groups of common unit behaviors;
Retrieving a probability model corresponding to a unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
Generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
Reproducing the demonstration trajectory based on the generated representative trajectories,
in a method of generating a work behavior trajectory of a robot, wherein
The step of dividing into unit behaviors
comprises dividing the demonstration trajectory into operation intervals, in which motion is performed, and stop intervals between the operation intervals, and dividing the demonstration trajectory into the unit behaviors based on the divided operation intervals and stop intervals, wherein an energy value of the demonstration trajectory is calculated and an interval in which the calculated energy value is equal to or greater than a preset threshold value is determined as an operation interval.
Dividing a demonstration trajectory into unit behaviors;
Classifying the divided unit behaviors into groups of common unit behaviors;
Retrieving a probability model corresponding to a unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
Generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
Reproducing the demonstration trajectory based on the generated representative trajectories,
in a method of generating a work behavior trajectory of a robot, further comprising:
Obtaining symbols of each of the retrieved probability models; And
Expressing the divided unit behaviors as a sequence of the obtained symbols
Further comprising:
The step of reproducing the demonstration trajectory
comprises reproducing the demonstration trajectory by arranging and connecting the generated representative trajectories according to the order of the sequence of symbols.
The step of reproducing the demonstration trajectory
comprises reproducing the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.
The step of retrieving the probability model
comprises retrieving the probability model corresponding to the unit behavior of each of the classified groups based on the similarity between the unit behavior of each of the classified groups and the probability models stored in the database.
Dividing a demonstration trajectory into unit behaviors;
Classifying the divided unit behaviors into groups of common unit behaviors;
Retrieving a probability model corresponding to a unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
Generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
Reproducing the demonstration trajectory based on the generated representative trajectories,
in a method of generating a work behavior trajectory of a robot, wherein
The step of retrieving the probability model
comprises retrieving the probability model corresponding to the unit behavior of each of the classified groups from the database and learning the retrieved probability model using the unit behaviors of each of the classified groups, and
The step of generating the representative trajectories
comprises generating a representative trajectory for each of the divided unit behaviors using the learned probability model.
Dividing a demonstration trajectory into unit behaviors;
Classifying the divided unit behaviors into groups of common unit behaviors;
Retrieving a probability model corresponding to a unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
Generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
Reproducing the demonstration trajectory based on the generated representative trajectories,
in a method of generating a work behavior trajectory of a robot, the method further comprising:
Generating a probability model for the dynamic characteristics of at least one unit behavior when a probability model corresponding to the at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database; and
Storing the generated probability model in the database and learning the stored probability model.
A dividing unit for dividing a demonstration trajectory into unit behaviors;
A classifier for classifying the divided unit behaviors into groups of common unit behaviors;
A database for storing a learned probability model for each of the unit behaviors;
A search unit for searching the database for a probability model corresponding to the unit behavior of each of the classified groups;
A generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
A reproduction section for reproducing the demonstration trajectory based on the generated representative trajectories,
in an apparatus for generating a work behavior trajectory of a robot, wherein
The divider
divides the demonstration trajectory into operation intervals, in which motion is performed, and stop intervals between the operation intervals, divides the demonstration trajectory into the unit behaviors based on the divided operation intervals and stop intervals, calculates an energy value of the demonstration trajectory, and determines as an operation interval each interval in which the calculated energy value is equal to or greater than a preset threshold value.
A dividing unit for dividing a demonstration trajectory into unit behaviors;
A classifier for classifying the divided unit behaviors into groups of common unit behaviors;
A database for storing a learned probability model for each of the unit behaviors;
A search unit for searching the database for a probability model corresponding to the unit behavior of each of the classified groups;
A generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
A reproduction section for reproducing the demonstration trajectory based on the generated representative trajectories,
in an apparatus for generating a work behavior trajectory of a robot, further comprising:
An obtaining unit for obtaining a symbol of each of the retrieved probability models; and
An expressing unit for expressing the divided unit behaviors as a sequence of the obtained symbols,
Further comprising:
The reproducing unit
reproduces the demonstration trajectory by arranging and connecting the generated representative trajectories according to the order of the sequence of symbols.
The reproducing unit
reproduces the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.
The search unit
retrieves the probability model corresponding to the unit behavior of each of the classified groups based on the similarity between the unit behavior of each of the classified groups and the probability models stored in the database.
A dividing unit for dividing a demonstration trajectory into unit behaviors;
A classifier for classifying the divided unit behaviors into groups of common unit behaviors;
A database for storing a learned probability model for each of the unit behaviors;
A search unit for searching the database for a probability model corresponding to the unit behavior of each of the classified groups;
A generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
A reproduction section for reproducing the demonstration trajectory based on the generated representative trajectories,
in an apparatus for generating a work behavior trajectory of a robot, wherein
the apparatus further comprises a learning unit which, when a probability model corresponding to the unit behavior of each of the classified groups is retrieved from the database, learns the retrieved probability model using the unit behaviors of the classified groups.
A dividing unit for dividing a demonstration trajectory into unit behaviors;
A classifier for classifying the divided unit behaviors into groups of common unit behaviors;
A database for storing a learned probability model for each of the unit behaviors;
A search unit for searching the database for a probability model corresponding to the unit behavior of each of the classified groups;
A generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
A reproduction section for reproducing the demonstration trajectory based on the generated representative trajectories,
in an apparatus for generating a work behavior trajectory of a robot, wherein
the apparatus further comprises a learning unit which, when a probability model corresponding to at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database, generates a probability model for the dynamic characteristics of the at least one unit behavior and stores the generated probability model in the database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160032597A KR101819323B1 (en) | 2016-03-18 | 2016-03-18 | Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160032597A KR101819323B1 (en) | 2016-03-18 | 2016-03-18 | Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170108526A KR20170108526A (en) | 2017-09-27 |
KR101819323B1 true KR101819323B1 (en) | 2018-01-16 |
Family
ID=60036230
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020160032597A KR101819323B1 (en) | 2016-03-18 | 2016-03-18 | Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101819323B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20230115139A (en) | 2022-01-26 | 2023-08-02 | 경북대학교 산학협력단 | Motion imitation robot control device and method based on artificial neural network, and a computer-readable storage medium |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109590986B (en) * | 2018-12-03 | 2022-03-29 | 日照市越疆智能科技有限公司 | Robot teaching method, intelligent robot and storage medium |
CN110524544A (en) * | 2019-10-08 | 2019-12-03 | 深圳前海达闼云端智能科技有限公司 | A kind of control method of manipulator motion, terminal and readable storage medium storing program for executing |
CN111890357B (en) * | 2020-07-01 | 2023-07-04 | 广州中国科学院先进技术研究所 | Intelligent robot grabbing method based on action demonstration teaching |
CN112847336B (en) * | 2020-12-24 | 2023-08-22 | 达闼机器人股份有限公司 | Action learning method and device, storage medium and electronic equipment |
2016-03-18: Application KR1020160032597A filed (KR); patent KR101819323B1 granted, active IP right.
Non-Patent Citations (2)
Title |
---|
Proceedings of the 29th ICROS (Institute of Control, Robotics and Systems) Annual Conference, 2014, pp. 197-198 (May 2014)*
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 43, NO. 3, pp. 730-740 (2013.5.)* |
Also Published As
Publication number | Publication date |
---|---|
KR20170108526A (en) | 2017-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101819323B1 (en) | Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor | |
WO2017159614A1 (en) | Learning service provision device | |
US10474934B1 (en) | Machine learning for computing enabled systems and/or devices | |
JP6640060B2 (en) | Robot system | |
EP3431229A1 (en) | Action information generation device | |
KR102139513B1 (en) | Autonomous driving control apparatus and method based on ai vehicle in the loop simulation | |
US10102449B1 (en) | Devices, systems, and methods for use in automation | |
Copot et al. | Predictive control of nonlinear visual servoing systems using image moments | |
CN106030457A (en) | Tracking objects during processes | |
WO2015175739A1 (en) | Robotic task demonstration interface | |
US11914761B1 (en) | Systems and methods for virtual artificial intelligence development and testing | |
Wang et al. | Perception of demonstration for automatic programing of robotic assembly: framework, algorithm, and validation | |
KR101577711B1 (en) | Method for learning task skill using temporal and spatial relation | |
US10162737B2 (en) | Emulating a user performing spatial gestures | |
US20200160210A1 (en) | Method and system for predicting a motion trajectory of a robot moving between a given pair of robotic locations | |
Lloyd et al. | Programming contact tasks using a reality-based virtual environment integrated with vision | |
Magyar et al. | Guided stochastic optimization for motion planning | |
Liu et al. | Grasp pose learning from human demonstration with task constraints | |
KR101329642B1 (en) | Modeling method for learning task skill and robot using thereof | |
Österberg | Skill Imitation Learning on Dual-arm Robotic Systems | |
KR101676541B1 (en) | Method for Learning Task Skill and Robot Using Thereof | |
Wu et al. | Video driven adaptive grasp planning of virtual hand using deep reinforcement learning | |
Herrmann et al. | Motion Data and Model Management for Applied Statistical Motion Synthesis. | |
Ahmadzadeh et al. | Visuospatial skill learning | |
WO2022190435A1 (en) | Command script assistance system, command script assistance method, and command script assistance program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |