KR101819323B1 - Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor - Google Patents

Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor Download PDF

Info

Publication number
KR101819323B1
Authority
KR
South Korea
Prior art keywords
unit
behaviors
trajectory
demonstration
behavior
Prior art date
Application number
KR1020160032597A
Other languages
Korean (ko)
Other versions
KR20170108526A (en)
Inventor
조성호
조수민
Original Assignee
Korea Advanced Institute of Science and Technology (KAIST)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Korea Advanced Institute of Science and Technology (KAIST)
Priority to KR1020160032597A
Publication of KR20170108526A
Application granted
Publication of KR101819323B1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1628Programme controls characterised by the control loop
    • B25J9/163Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1628Programme controls characterised by the control loop
    • B25J9/1653Programme controls characterised by the control loop parameters identification, estimation, stiffness, accuracy, error analysis
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1674Programme controls characterised by safety, monitoring, diagnostic
    • B25J9/1676Avoiding collision or forbidden zones

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Manipulator (AREA)

Abstract

A method and apparatus are provided for generating a task behavior trajectory of a robot based on imitation learning and motion composition. A method for generating a work behavior trajectory of a robot according to an embodiment of the present invention includes: obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; classifying the divided unit behaviors into groups of common unit behaviors; retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors; generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and reproducing the demonstration trajectory based on the generated representative trajectories.

Description

TECHNICAL FIELD: The present invention relates to a method and apparatus for generating a task motion trajectory of a robot based on imitation learning and motion composition.

More particularly, the present invention relates to a technique for generating a work behavior trajectory of a robot based on imitation learning, in which the robot imitates human behavior, and to a method and apparatus that can reproduce a continuous series of actions by actively dividing and analyzing the consecutive operations in a human demonstration.

Early robots were limited to performing repetitive tasks on behalf of humans in order to automate production sites. Recently, however, service robots such as guide robots and educational robots that require complex interaction with people have been developed, and due to the diversification of products, factory robots are also required to have scalability so that they can be applied to new jobs.

Also, in the near future, home service robots that replace or assist people's work in the home are also considered, and research and development thereof is actively being carried out.

In this environment, imitation learning is being studied as a method for ensuring the scalability of robots. When a robot is required to perform a new task in a new environment, imitating human behavior provides a learning-based way to broaden the range of actions the robot can perform. Rather than requiring a person's goals and constraints to be specified explicitly, it offers a generalized approach to various tasks because those goals and constraints are already reflected in the human demonstrations. Imitating demonstrations is also an intuitive, user-friendly method, resembling the way people themselves learn behaviors.

Research on imitation learning has mainly studied methods for modeling the dynamic characteristics of a unit behavior and methods for generating an adaptive trajectory from a basic motion. Unit behavior modeling learns the essential, noise-free features of a motion from multiple demonstrations. Adaptive trajectory generation modifies a basic trajectory according to external conditions, such as a changed target point. These methods are effective for learning a single motion and applying it in the various environments a robot faces.

However, when a person actually performs a task, he or she performs a series of operations consecutively, so a natural demonstration trajectory contains several motions. Existing algorithms, on the other hand, are limited to a single unit of action; for the robot to learn a task behavior, a human must isolate each motion and demonstrate it separately, which is a clear limitation.

Embodiments of the present invention provide a method and apparatus for generating a behavior trajectory of a robot, based on imitation learning and motion composition, with which the robot can actively divide, analyze, and reproduce the series of consecutive operations that occur when a person performs a task.

More specifically, embodiments of the present invention provide a method and apparatus that model the dynamic characteristics of each unit behavior with a probability model and generate a motion trajectory of a robot that reproduces a human demonstration trajectory as a continuous action using the learned probability models.

A method for generating a work behavior trajectory of a robot according to an embodiment of the present invention includes: obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; classifying the divided unit behaviors into groups of common unit behaviors; retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors; generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and reproducing the demonstration trajectory based on the generated representative trajectories.

The dividing into unit behaviors may distinguish, from the demonstration trajectory, operation sections in which a motion is performed and stop sections between the operation sections, and may divide the demonstration trajectory into the unit behaviors based on the distinguished operation sections and stop sections.

The dividing into unit behaviors may calculate an energy value for the demonstration trajectory and classify an interval in which the calculated energy value is equal to or greater than a predetermined threshold value as an operation section.

Further, the method for generating a behavior trajectory of a robot according to an embodiment of the present invention may include: obtaining a symbol of each retrieved probability model; and expressing the divided unit behaviors as a sequence of the obtained symbols. The step of reproducing the demonstration trajectory may reproduce the demonstration trajectory by arranging and connecting the generated representative trajectories according to the order of the symbol sequence.

The step of reproducing the demonstration trajectory may reproduce the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.

The searching for the probability model may search for the probability model corresponding to the unit behavior of each classified group based on the similarity between that unit behavior and the probability models stored in the database.

The searching for the probability model may retrieve the probability model corresponding to the unit behavior of each classified group from the database and further train the retrieved probability model using that unit behavior, and the step of generating representative trajectories may generate a representative trajectory for each of the divided unit behaviors using the trained probability model.

Further, the method for generating a work behavior trajectory of a robot according to an embodiment of the present invention may include, when a probability model corresponding to at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database: generating a probability model for the dynamic characteristics of that unit behavior; and storing the generated probability model in the database and training it.

The apparatus for generating a task behavior trajectory of a robot according to an embodiment of the present invention includes: a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors; a classifying unit for classifying the divided unit behaviors into groups of common unit behaviors; a database for storing a learned probability model for each of the unit behaviors; a search unit for retrieving a probability model corresponding to the unit behavior of each of the classified groups from the database; a generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and a reproducing unit for reproducing the demonstration trajectory based on the generated representative trajectories.

The division unit may distinguish, from the demonstration trajectory, operation sections in which a motion is performed and stop sections between the operation sections, and may divide the demonstration trajectory into the unit behaviors based on the distinguished operation sections and stop sections.

The division unit may calculate an energy value for the demonstration trajectory and classify an interval in which the calculated energy value is equal to or greater than a predetermined threshold value as an operation section.

Furthermore, the apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may include: an acquisition unit for acquiring a symbol of each retrieved probability model; and an expression unit for expressing the divided unit behaviors as a sequence of the obtained symbols, wherein the reproducing unit arranges and connects the generated representative trajectories in the order of the symbol sequence to reproduce the demonstration trajectory.

The reproducing unit can reproduce the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.

The search unit may search for the probability model corresponding to the unit behavior of each classified group based on the similarity between that unit behavior and the probability models stored in the database.

Furthermore, the apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may further include a learning unit that, when the probability model corresponding to the unit behavior of each classified group is retrieved from the database, trains the retrieved probability model using the unit behavior of that group.

Furthermore, the apparatus for generating a behavior trajectory of a robot according to an embodiment of the present invention may further include a model generating unit that, when a probability model corresponding to at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database, generates a probability model for the dynamic characteristics of that unit behavior and stores the generated probability model in the database.

According to the embodiments of the present invention, the robot can actively divide and analyze the series of consecutive operations that occur when a person performs a task, based on imitation learning and motion composition, and reproduce them as a continuous action.

Therefore, according to the embodiments of the present invention, the robot can learn simply by being shown the entire work process, without the user having to demonstrate individual actions separately, and a more flexible behavior trajectory can be generated by specifying a target point for each operation step in the work process.

According to the embodiments of the present invention, it is possible to improve convenience by minimizing the user's intervention when teaching the robot a new behavior, and the robot can actively extract the necessary information and learn effectively even if the user's skill level is low.

According to the embodiments of the present invention, when learning a similar task behavior that recombines previously learned unit behaviors, learning can proceed quickly based on the existing analysis, and various applications become possible by combining the learned unit behaviors.

Further, according to the embodiments of the present invention, the following effects can be obtained.

First, by imitating the way humans learn behavior and applying it to robots, it is possible to intuitively and easily extend the functions of robots, which can be applied more easily than existing learning algorithms in terms of user scenarios.

Second, the robot can learn effectively even if the demonstrator's proficiency is low, because the robot analyzes and extracts the necessary information from the demonstration rather than requiring the user to provide accurate demonstration data.

Third, human actions are divided, analyzed, and stored individually, so they can be applied and extended simply by recombining them.

FIG. 1 is a flowchart illustrating a process of learning a probability model for unit behavior according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of generating a work behavior trajectory of a robot according to an embodiment of the present invention.
FIG. 3 shows an example of dividing a demonstration trajectory by distinguishing stationary sections from operation sections based on kinetic energy.
FIG. 4 is a diagram illustrating an example of a result obtained by connecting representative trajectories of a unit behavior model generated by the method according to the present invention.
FIG. 5 is a block diagram illustrating a configuration of an apparatus for generating a work behavior trajectory of a robot according to an embodiment of the present invention.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not limited to or restricted by these embodiments. In addition, the same reference numerals in the drawings denote the same members.

The embodiments according to the present invention provide a method and apparatus with which a robot can actively divide, analyze, and reproduce the series of consecutive operations that occur when a person performs a task, based on imitation learning and motion composition.

Specifically, embodiments of the present invention divide the demonstration trajectory of a user into a plurality of unit behaviors, classify the unit behaviors into groups of common unit behaviors, generate and train a probability model for each group of unit behaviors, and reproduce the human demonstration trajectory on the robot using the trained probability model of each unit behavior.

For each classified unit behavior, the intrinsic dynamic characteristics of the motion can be learned with a probability model based on a Gaussian Mixture Model (GMM).

FIG. 1 is a flowchart illustrating a process of learning a probability model for unit behavior according to an embodiment of the present invention.

Referring to FIG. 1, a process of generating and learning a probability model acquires a human demonstration trajectory and divides the obtained demonstration trajectory into a plurality of unit behaviors (S110, S120).

Here, the demonstration trajectory obtained in step S110 can be acquired through various motion capture techniques as data recorded while a person demonstrates the task that the robot is to learn. For example, an inertial measurement unit (IMU) is mounted on each joint of a human arm to acquire its orientation, and the orientation values are applied to a human body model to calculate the joint values that match the IMU orientation at each joint.

If the motion of the human body model is applied directly to a robot, there is a problem in that the link lengths and joint characteristics of the robot differ from those of the human. However, for a humanoid robot that is comparatively similar to a human, the motion can be transferred through inverse kinematics. Since the characteristics of an action are mainly expressed through the trajectory of the three-dimensional position and orientation of the hand, a similar motion can be obtained by matching the trajectory of the robot hand to the trajectory of the human hand in the obtained demonstration trajectory. However, the position of the elbow also needs to be considered, because the arm posture can vary with the elbow position even when the position and orientation of the hand coincide.

Even when inverse kinematics is used, the link lengths of the human and the robot differ, resulting in different workspaces. Therefore, a robot model whose link lengths match the human body model is prepared, and the human demonstration motion is converted into the joint trajectory of the robot by matching, through inverse kinematics, the characteristic elements of the motion represented by the hand and elbow positions.

Step S120 divides the obtained demonstration trajectory into a plurality of unit behaviors; the obtained demonstration trajectory, or the robot joint trajectory derived from it, can be divided into unit behaviors using kinetic energy. For example, when a person performs a series of actions, a temporary halt usually occurs as each step ends. Accordingly, in step S120, the demonstration trajectory is divided into a plurality of unit behaviors by distinguishing the operation sections in which a motion is performed from the stop sections between them, and taking the motion in each operation section as one unit behavior.

As described above, the characteristics of a motion are mainly expressed through the three-dimensional position and orientation of the hand. The only information needed here is whether the motion is in a dynamic or a static state, and computing the actual kinetic energy of the whole arm is not only computationally complex but also tied to the specific characteristics of the robot, so a simplified formula can be applied. For example, the present invention ignores the mass distribution of the arm and assumes that the mass is concentrated at the hand. Kinetic energy is proportional to the mass times the square of the velocity, but because only relative energy values are needed, the constant mass can be dropped. Since orientation must also be considered and an angular velocity exists, the rotational term is included with the rotational inertia treated as a fixed value, which gives Equation (1) below.

[Equation 1]

E(t) = w_1 \lVert v(t) \rVert^2 + w_2 \lVert \omega(t) \rVert^2

where v(t) and \omega(t) denote the linear and angular velocity of the hand, respectively.

Here, w_1 and w_2 are weights that account for the ratio between the kinetic energy due to translation (mass) and that due to rotation (rotational inertia), and appropriate values are selected according to the segmentation results on actual data.

Also, it is important to remove noise through a low-pass filter (LPF) since the energy value of Equation 1 is proportional to the square of the speed and the differential value is generally susceptible to noise.

Accordingly, in step S120, the stop sections and operation sections of the demonstration trajectory can be distinguished based on the energy value obtained through Equation (1): a predetermined threshold is applied, sections in which the energy value exceeds the threshold are classified as operation sections, and the intervals between these operation sections are classified as stop sections.
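
As a concrete illustration of this segmentation step, the following Python sketch computes the simplified energy of Equation (1) from a sampled hand trajectory, low-pass filters it, and extracts the operation sections by thresholding. The sampling period, filter parameters, weights, and threshold are illustrative assumptions, not values given in the present disclosure.

import numpy as np
from scipy.signal import butter, filtfilt

def segment_demonstration(pos, orient, dt=0.01, w1=1.0, w2=0.1, threshold=0.05):
    """Split a demonstration into motion intervals using the simplified kinetic energy.

    pos    : (T, 3) hand positions
    orient : (T, 3) hand orientation (e.g. Euler angles)
    Returns a list of (start, end) index pairs where the energy is above threshold.
    """
    # Numerical derivatives give linear and angular velocity estimates.
    v = np.gradient(pos, dt, axis=0)
    w = np.gradient(orient, dt, axis=0)

    # Simplified energy of Equation (1): mass and rotational inertia folded into w1, w2.
    energy = w1 * np.sum(v ** 2, axis=1) + w2 * np.sum(w ** 2, axis=1)

    # Low-pass filter: differentiated values are noise-sensitive, as noted above.
    b, a = butter(2, 0.1)                 # 2nd-order filter, normalized cutoff 0.1
    energy = filtfilt(b, a, energy)

    # Intervals where the filtered energy exceeds the threshold are operation sections.
    active = energy >= threshold
    edges = np.flatnonzero(np.diff(active.astype(int)))
    bounds = np.concatenate(([0], edges + 1, [len(active)]))
    return [(s, e) for s, e in zip(bounds[:-1], bounds[1:]) if active[s]]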

For example, FIG. 3 shows the position of the hand along the demonstration trajectory together with its energy value, and it can be seen that the operation sections and the stop sections between them are clearly distinguished by the energy value. The upper graph in FIG. 3 is the three-dimensional position trajectory of the hand in the demonstration trajectory, with the x, y, and z coordinates drawn as red, green, and blue lines, respectively. In the kinetic energy graph, the boundaries between the motion sections (A, B, C, and D) and the stop sections are marked with red lines, and the bottom images are simulation results showing the instant of each divided motion.

If it is divided into unit behaviors in step S120, the divided unit behaviors are classified into groups of common unit behaviors (S130).

At this time, in step S130, unit behaviors having the same meaning are classified into the same group based on the similarity of the trajectories of the divided unit behaviors, so that the unit behaviors can be classified into a plurality of groups.
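
This grouping by trajectory similarity can be illustrated with the following Python sketch, which assigns each divided unit behavior to a group whose exemplar trajectory lies within a DTW distance threshold. The specific distance measure and the threshold value are assumptions for illustration only; any alignment-based similarity could be substituted.

import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic-time-warping distance between two trajectories."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def group_unit_behaviors(segments, threshold=5.0):
    """Assign each segment to the first existing group whose exemplar is close enough."""
    groups = []                            # list of (exemplar trajectory, [segment indices])
    for idx, seg in enumerate(segments):
        for exemplar, members in groups:
            if dtw_distance(seg, exemplar) < threshold:
                members.append(idx)
                break
        else:
            groups.append((seg, [idx]))    # no similar group found: start a new one
    return groups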

It is then determined whether a probability model corresponding to the unit behavior of each group classified in step S130 exists in the database; if such a probability model exists, it is trained using the corresponding unit behavior (S140, S170).

The database (DB) in the present invention accepts new demonstrations and is learned incrementally; each learned behavior is stored as a behavior model, i.e., a probability model, rather than as the raw demonstration data. Therefore, the similarity measure used for classification should be defined as the similarity between a newly introduced demonstration trajectory and a previously learned model.

Here, in step S140, whether a probability model of the corresponding group exists in the database can be determined using a similarity between data and model, which indicates which behavior model the demonstration data relates to. The similarity between data and model is defined as follows: the distribution at each time step is extracted from the learned model by regression, and a dissimilarity based on the Mahalanobis distance is defined. Given an input trajectory sample \xi_t and the distribution N(\mu_t, \Sigma_t) at each time step t computed by GMR (Gaussian Mixture Regression), the Mahalanobis-distance dissimilarity can be expressed as Equation (2) below.

[Equation 2]

D(\xi, M) = \sum_{t} \sqrt{(\xi_t - \mu_t)^\top \Sigma_t^{-1} (\xi_t - \mu_t)}

At this time, since the demonstration trajectory may be distorted in time, time alignment using DTW (Dynamic Time Warping) can be applied. DTW finds and matches sample pairs so as to minimize the distance between them; based on this alignment, the comparison is performed on the minimized dissimilarity obtained when the demonstration trajectory is fitted to the model.
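
A rough Python sketch of this comparison is given below: the demonstration trajectory is aligned by DTW to the per-time-step Gaussians obtained from the model by regression, and the Mahalanobis distance accumulated along the optimal alignment is used as the dissimilarity. The normalization at the end is an assumption for illustration, not the exact form of Equation (2).

import numpy as np

def mahalanobis(x, mu, cov):
    d = x - mu
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

def model_dissimilarity(traj, means, covs):
    """traj: (T1, D) demonstration; means: (T2, D) and covs: (T2, D, D) from GMR.

    Returns the DTW-aligned accumulated Mahalanobis distance, normalized by length.
    """
    T1, T2 = len(traj), len(means)
    D = np.full((T1 + 1, T2 + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, T1 + 1):
        for j in range(1, T2 + 1):
            c = mahalanobis(traj[i - 1], means[j - 1], covs[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[T1, T2] / max(T1, T2)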

In the whole system, when demonstration data is received, the model with the minimum dissimilarity is retrieved and learned incrementally, and when the minimum dissimilarity exceeds a predetermined reference value, it is determined that no similar model exists and a new model is generated.

That is, if it is determined in step S140 that no probability model corresponding to the unit behavior of a particular group exists in the database, a probability model for the dynamic characteristics of the unit behavior of each such group is generated, a symbol is allocated to each generated probability model, and the models are stored in the database (S150, S160).

In this case, each new probability model is numbered, and this number is used as the symbol representing the corresponding motion. In step S150, the procedure for performing the task can therefore be expressed as a sequence of symbols according to the kinds of unit behaviors appearing in the demonstration trajectory.

A GMM (Gaussian Mixture Model) is a probabilistic model for nonlinearly distributed data samples, and is suitable for modeling behavior trajectories whose nonlinear characteristics can produce various patterns. Moreover, as a probability model it has the advantage of being robust against noise and some spatial distortion. Incremental learning, which suits the system or apparatus proposed by the present invention, is also possible, and there is an incremental learning model that combines DTW (Dynamic Time Warping) to overcome temporal distortion of the trajectory and Principal Component Analysis (PCA) to reduce the dimensionality.
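
For illustration, the following Python sketch fits a GMM to time-augmented trajectory samples of one unit behavior and recovers a representative trajectory by Gaussian mixture regression (GMR), conditioning each component on time. The component count, the normalization of time to [0, 1], and the equal treatment of all demonstrations are assumptions, not values specified in the present disclosure.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_unit_behavior(demos, n_components=5):
    """demos: list of (T, D) joint trajectories belonging to the same unit behavior."""
    data = np.vstack([
        np.column_stack([np.linspace(0.0, 1.0, len(d)), d]) for d in demos
    ])                                     # each row is [t, q_1, ..., q_D]
    gmm = GaussianMixture(n_components=n_components, covariance_type='full')
    gmm.fit(data)
    return gmm

def representative_trajectory(gmm, n_steps=100):
    """GMR: condition the joint Gaussians on time t to get E[q | t] at each step."""
    ts = np.linspace(0.0, 1.0, n_steps)
    out = []
    for t in ts:
        num, den = 0.0, 0.0
        for k in range(gmm.n_components):
            mu, cov, pi = gmm.means_[k], gmm.covariances_[k], gmm.weights_[k]
            mu_t, mu_q = mu[0], mu[1:]
            s_tt, s_qt = cov[0, 0], cov[1:, 0]
            h = pi * np.exp(-0.5 * (t - mu_t) ** 2 / s_tt) / np.sqrt(2 * np.pi * s_tt)
            num = num + h * (mu_q + s_qt / s_tt * (t - mu_t))
            den = den + h
        out.append(num / den)
    return np.array(out)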

The probability model generated in steps S150 and S160 and stored in the database is then trained with the unit behaviors obtained by dividing the human demonstration trajectory (S170).

That is, the classified demonstration data can be learned effectively through the probability models stored in the database.

FIG. 2 is a flowchart illustrating a method for generating a work behavior trajectory of a robot according to an exemplary embodiment of the present invention. In contrast to the learning process of FIG. 1, it is assumed here that all probability models have already been learned and stored in the database.

Referring to FIG. 2, the method according to an embodiment of the present invention acquires a human demonstration trajectory to be reproduced and divides the obtained demonstration trajectory into a plurality of unit behaviors (S210, S220).

Here, steps S210 and S220 are the same as steps S110 and S120 in Fig. 1, and a description thereof will be omitted.

If it is divided into unit behaviors in step S220, the divided unit behaviors are classified into groups of common unit behaviors (S230).

At this time, step S230 is the same as step S130 in FIG. 1, and a description thereof will be omitted.

Once the unit behaviors have been classified into groups in step S230, the probability model corresponding to the unit behavior of each classified group is retrieved from the database, and the symbol of each retrieved probability model is obtained (S240, S250).

At this time, the probability model corresponding to the unit behavior of each classified group can be additionally trained with the unit behavior of that group.

Of course, if no related probability model exists in the database for some group, steps S150 through S170 of FIG. 1 may be performed; in FIG. 2, however, it is assumed that probability models for all groups are found, and that case is not described further.

When the symbols of the probability models retrieved in step S250 have been obtained, the unit behaviors of the demonstration trajectory to be reproduced are represented as a symbol sequence, and representative trajectories of the retrieved probability models are generated (S260, S270).

When a representative trajectory has been generated for each probability model in step S270, the representative trajectories are arranged and connected according to the order of the symbol sequence expressed in step S260, thereby reproducing the demonstration trajectory (S280).

Here, in step S280, the generated representative trajectories are deformed and connected according to the preset target positions to reproduce the demonstration trajectory.

Specifically, to generate a work trajectory, the robot receives the symbol sequence expressing the work steps, obtained by analyzing the demonstration trajectory, together with the target position of each operation determined by the object being interacted with. In the adaptive trajectory generation and linking step, a representative trajectory is generated from each behavior model according to the list of unit behaviors appearing in the symbol sequence, and the trajectories are linked and transformed according to the task goal.

The MTM (Motion Trajectory Morphing) algorithm generates, from a single behavior trajectory, a trajectory that stably reaches the target point while remaining similar to the original. It thus reaches a given target position effectively while keeping the trajectory stable, and it is generally applicable to multidimensional trajectories.

Although the intrinsic characteristics of a behavior lie mainly in the end-effector trajectory, that trajectory alone cannot capture all the characteristics of the behavior, so it is reasonable to base the behavior (probability) model on the joint trajectory. On the other hand, since the target point is given as a three-dimensional position, a gap arises between the representative trajectory generated from the model and the target point. This difference can be overcome by taking the joint values of the original behavior trajectory as the initial values and asymptotically approaching, in joint space, the joint values corresponding to the given target point obtained through inverse kinematics.

The MTM algorithm can also perform this effectively when connecting the trajectories obtained from the behavior models at each stage of the work. Where a target point is given separately for a work step, the end point of that operation and the start point of the next operation are modified according to the target point; otherwise, as shown in FIG. 4, the trajectories are simply joined so that a continuous trajectory is obtained.
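
A simplified Python sketch of this linking step is shown below: representative trajectories are concatenated in symbol-sequence order, each piece is shifted so that it starts where the previous one ended, and pieces with a separately given target point are progressively bent toward it. This is an illustrative morph under those assumptions, not the exact MTM algorithm of the present disclosure.

import numpy as np

def morph_to_target(traj, target):
    """Shift a (T, D) trajectory progressively so that its final point equals `target`."""
    traj = np.asarray(traj, dtype=float)
    offset = np.asarray(target, dtype=float) - traj[-1]
    alpha = np.linspace(0.0, 1.0, len(traj))[:, None]   # 0 at the start, 1 at the end
    return traj + alpha * offset

def connect_trajectories(reps, targets=None):
    """reps: representative trajectories in symbol-sequence order.
    targets: optional per-step goal points; when a step has no target, it is only
    shifted so that it begins where the previous step ended, keeping the result continuous."""
    pieces, prev_end = [], None
    for i, rep in enumerate(reps):
        rep = np.asarray(rep, dtype=float)
        if prev_end is not None:
            rep = rep + (prev_end - rep[0])              # remove the jump at the joint
        if targets is not None and targets[i] is not None:
            rep = morph_to_target(rep, targets[i])
        pieces.append(rep)
        prev_end = rep[-1]
    return np.vstack(pieces)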

Here, the blue line in the graph of FIG. 4 shows that a discontinuity appears at the connection point when the representative trajectories are simply concatenated, while the red line shows that, by adjusting the start and end points of each step's trajectory with the MTM algorithm, the discontinuity disappears and a single continuous trajectory is obtained.

As described above, with the method according to the embodiments of the present invention, the robot actively divides and analyzes the series of consecutive operations that occur when a person performs a task, based on imitation learning and motion composition. The robot can therefore learn simply by being shown the entire work process, without the individual actions having to be demonstrated separately, and a more flexible behavior trajectory can be generated by specifying a target point at each operation step of the work process.

In addition, the method according to embodiments of the present invention minimizes the user's intervention when teaching the robot a new behavior, enhancing convenience, and allows the robot to actively extract the necessary information so that learning can be performed effectively even if the user's skill level is low.

In addition, the method according to the embodiments of the present invention can learn quickly based on existing analysis results when learning a similar task behavior that recombines previously learned unit behaviors, and supports various applications by combining the learned unit behaviors.

FIG. 5 is a block diagram illustrating the configuration of an apparatus for generating a work behavior trajectory of a robot according to an embodiment of the present invention, that is, an apparatus for performing the method described with reference to FIGS. 1 to 4.

Referring to FIG. 5, an apparatus 500 according to an embodiment of the present invention includes a division unit 510, a classifying unit 520, a search unit 530, a model generating unit 540, a learning unit 550, an acquisition unit 560, an expression unit 570, a generating unit 580, a reproducing unit 590, and a database (DB).

The database (DB) is a means for storing a probability model for each of the unit behaviors together with symbols.

Here, the database accepts new demonstrations and each probability model is learned incrementally; the learned behaviors are stored as behavior models, i.e., probability models, rather than as the raw demonstration data.

The division unit 510 acquires the demonstration trajectory of the human work behavior and divides it into a plurality of unit behaviors.

At this time, the division unit 510 may distinguish, in the demonstration trajectory, the operation sections in which a motion is performed and the stop sections between them, and may divide the demonstration trajectory into unit behaviors based on the distinguished operation and stop sections. For example, the division unit 510 calculates an energy value for the demonstration trajectory, classifies intervals in which the calculated energy value is equal to or greater than a predetermined threshold as operation sections and the intervals between them as stop sections, and divides the demonstration trajectory into a plurality of unit behaviors by taking the motion in each operation section as one unit behavior.

The classifying unit 520 classifies the unit behaviors divided by the division unit 510 into groups of common unit behaviors.

At this time, the classifying unit 520 classifies the unit behaviors having the same meaning into the same group based on the similarity of the trajectories of the divided unit behaviors, thereby classifying the unit behaviors into a plurality of groups.

The search unit 530 searches a probability model corresponding to the unit behavior of each of the classified groups from the database.

At this time, the search unit 530 can retrieve the probability model of each group from the database by using the similarity between data and model to determine which behavior model the demonstration data is associated with, and can select the model with the highest similarity as the probability model of the corresponding unit behavior.

The model generating unit 540 generates a probability model for the dynamic characteristics of a unit behavior when no probability model corresponding to that unit behavior among the unit behaviors of the classified groups is found in the database, and stores the generated probability model in the database.

At this time, the model generating unit 540 can generate a probability model for the dynamic characteristics of the unit behavior based on the GMM, and assign a specific symbol to the generated probability model.

When the probability model corresponding to the unit behavior of each of the groups classified from the database is retrieved, the learning unit 550 learns the retrieved probability model using the unit behavior of each of the classified groups.

The acquisition unit 560 acquires the symbol of each probability model retrieved for the unit behaviors of the demonstration trajectory to be reproduced.

Here, the symbol may be assigned when the corresponding probability model is generated.

The expression unit 570 expresses the divided unit behaviors of the demonstration trajectory as a sequence of the obtained symbols.

The generating unit 580 generates representative trajectories of the unit behaviors of each of the classified groups using the retrieved probability model.

The reproducing unit 590 reproduces the demonstration trajectory based on the generated representative trajectories.

At this time, the reproducing unit 590 can reproduce the demonstration trajectory by arranging and connecting the representative trajectories according to the sequence of symbols produced by the expression unit 570.

Furthermore, the reproducing unit 590 can reproduce the demonstration trajectory by modifying and connecting the generated representative trajectories according to the preset target positions.

Although not all of it is illustrated in FIG. 5, the apparatus of FIG. 5 may perform all of the operations described above with reference to FIGS. 1 through 4 and may include all of their contents.

The system or apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the systems, devices, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system, and may access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the processing device is sometimes described as being used singly, but those skilled in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device, or permanently or temporarily in a transmitted signal wave, so as to be interpreted by or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments, or may be those known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments, the invention is not limited to the disclosed embodiments. For example, appropriate results may be achieved even if the described techniques are performed in a different order than described, and/or components of the described systems, structures, devices, or circuits are combined or coupled in a different form than described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

Claims (16)

A method for generating a work behavior trajectory of a robot, comprising:
obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
classifying the divided unit behaviors into groups of common unit behaviors;
retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
reproducing the demonstration trajectory based on the generated representative trajectories,
wherein the dividing into unit behaviors distinguishes, from the demonstration trajectory, operation intervals in which a motion is performed and stop intervals between the operation intervals, divides the demonstration trajectory into the unit behaviors based on the distinguished operation intervals and stop intervals, calculates an energy value for the demonstration trajectory, and classifies an interval in which the calculated energy value is equal to or greater than a predetermined threshold value as an operation interval.
(Claims 2 and 3: deleted)

A method for generating a work behavior trajectory of a robot, comprising:
obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
classifying the divided unit behaviors into groups of common unit behaviors;
retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
reproducing the demonstration trajectory based on the generated representative trajectories,
the method further comprising:
obtaining a symbol of each of the retrieved probability models; and
expressing the divided unit behaviors as a sequence of the obtained symbols,
wherein the reproducing of the demonstration trajectory reproduces the demonstration trajectory by arranging and connecting the generated representative trajectories according to the order of the sequence of symbols.
The method of claim 4, wherein the reproducing of the demonstration trajectory reproduces the demonstration trajectory by modifying and connecting the generated representative trajectories according to a preset target position.
The method according to claim 1 or 4, wherein the retrieving of the probability model searches for the probability model corresponding to the unit behavior of each of the classified groups based on a similarity between the unit behavior of each of the classified groups and the probability models stored in the database.
A method for generating a work behavior trajectory of a robot, comprising:
obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
classifying the divided unit behaviors into groups of common unit behaviors;
retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
reproducing the demonstration trajectory based on the generated representative trajectories,
wherein the retrieving of the probability model retrieves the probability model corresponding to the unit behavior of each of the classified groups from the database and trains the retrieved probability model using the unit behavior of each of the classified groups, and
the generating of the representative trajectories generates a representative trajectory for each of the divided unit behaviors using the trained probability model.
A method for generating a work behavior trajectory of a robot, comprising:
obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
classifying the divided unit behaviors into groups of common unit behaviors;
retrieving a probability model corresponding to the unit behavior of each of the classified groups from a database storing a learned probability model for each of the unit behaviors;
generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
reproducing the demonstration trajectory based on the generated representative trajectories,
the method further comprising:
generating a probability model for the dynamic characteristics of at least one unit behavior when a probability model corresponding to the at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database; and
storing the generated probability model in the database and training it.
An apparatus for generating a work behavior trajectory of a robot, comprising:
a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
a classifying unit for classifying the divided unit behaviors into groups of common unit behaviors;
a database for storing a learned probability model for each of the unit behaviors;
a search unit for retrieving a probability model corresponding to the unit behavior of each of the classified groups from the database;
a generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
a reproducing unit for reproducing the demonstration trajectory based on the generated representative trajectories,
wherein the division unit distinguishes, from the demonstration trajectory, operation intervals in which a motion is performed and stop intervals between the operation intervals, divides the demonstration trajectory into the unit behaviors based on the distinguished operation intervals and stop intervals, calculates an energy value for the demonstration trajectory, and classifies an interval in which the calculated energy value is equal to or greater than a preset threshold value as an operation interval.
(Claims 10 and 11: deleted)

An apparatus for generating a work behavior trajectory of a robot, comprising:
a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
a classifying unit for classifying the divided unit behaviors into groups of common unit behaviors;
a database for storing a learned probability model for each of the unit behaviors;
a search unit for retrieving a probability model corresponding to the unit behavior of each of the classified groups from the database;
a generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
a reproducing unit for reproducing the demonstration trajectory based on the generated representative trajectories,
the apparatus further comprising:
an acquisition unit for obtaining a symbol of each of the retrieved probability models; and
an expression unit for expressing the divided unit behaviors as a sequence of the obtained symbols,
wherein the reproducing unit arranges and connects the generated representative trajectories according to the sequence of symbols to reproduce the demonstration trajectory.
The apparatus of claim 12, wherein the reproducing unit reproduces the demonstration trajectory by transforming and connecting the generated representative trajectories according to a preset target position.
The apparatus according to claim 9 or 12, wherein the search unit retrieves the probability model corresponding to the unit behavior of each of the classified groups based on a similarity between the unit behavior of each of the classified groups and the probability models stored in the database.
An apparatus for generating a work behavior trajectory of a robot, comprising:
a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
a classifying unit for classifying the divided unit behaviors into groups of common unit behaviors;
a database for storing a learned probability model for each of the unit behaviors;
a search unit for retrieving a probability model corresponding to the unit behavior of each of the classified groups from the database;
a generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
a reproducing unit for reproducing the demonstration trajectory based on the generated representative trajectories,
the apparatus further comprising a learning unit that, when the probability model corresponding to the unit behavior of each of the classified groups is retrieved from the database, trains the retrieved probability model using the unit behaviors of the classified groups.
An apparatus for generating a work behavior trajectory of a robot, comprising:
a division unit for obtaining a demonstration trajectory of a human task behavior and dividing the demonstration trajectory into a plurality of unit behaviors;
a classifying unit for classifying the divided unit behaviors into groups of common unit behaviors;
a database for storing a learned probability model for each of the unit behaviors;
a search unit for retrieving a probability model corresponding to the unit behavior of each of the classified groups from the database;
a generating unit for generating representative trajectories of the unit behaviors of the classified groups using the retrieved probability models; and
a reproducing unit for reproducing the demonstration trajectory based on the generated representative trajectories,
the apparatus further comprising a model generating unit that, when a probability model corresponding to at least one unit behavior among the unit behaviors of the classified groups is not retrieved from the database, generates a probability model for the dynamic characteristics of the at least one unit behavior and stores the generated probability model in the database.
KR1020160032597A 2016-03-18 2016-03-18 Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor KR101819323B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020160032597A KR101819323B1 (en) 2016-03-18 2016-03-18 Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020160032597A KR101819323B1 (en) 2016-03-18 2016-03-18 Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor

Publications (2)

Publication Number Publication Date
KR20170108526A KR20170108526A (en) 2017-09-27
KR101819323B1 true KR101819323B1 (en) 2018-01-16

Family

ID=60036230

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020160032597A KR101819323B1 (en) 2016-03-18 2016-03-18 Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor

Country Status (1)

Country Link
KR (1) KR101819323B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230115139A (en) 2022-01-26 2023-08-02 경북대학교 산학협력단 Motion imitation robot control device and method based on artificial neural network, and a computer-readable storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109590986B (en) * 2018-12-03 2022-03-29 日照市越疆智能科技有限公司 Robot teaching method, intelligent robot and storage medium
CN110524544A (en) * 2019-10-08 2019-12-03 深圳前海达闼云端智能科技有限公司 A kind of control method of manipulator motion, terminal and readable storage medium storing program for executing
CN111890357B (en) * 2020-07-01 2023-07-04 广州中国科学院先进技术研究所 Intelligent robot grabbing method based on action demonstration teaching
CN112847336B (en) * 2020-12-24 2023-08-22 达闼机器人股份有限公司 Action learning method and device, storage medium and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Proceedings of the 29th ICROS (Institute of Control, Robotics and Systems) Annual Conference, pp. 197-198 (May 2014)*
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 43, NO. 3, pp. 730-740 (2013.5.)*

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230115139A (en) 2022-01-26 2023-08-02 경북대학교 산학협력단 Motion imitation robot control device and method based on artificial neural network, and a computer-readable storage medium

Also Published As

Publication number Publication date
KR20170108526A (en) 2017-09-27

Similar Documents

Publication Publication Date Title
KR101819323B1 (en) Method for Generating Robot Task Motion Based on Imitation Learning and Motion Composition and Apparatus Therefor
WO2017159614A1 (en) Learning service provision device
US10474934B1 (en) Machine learning for computing enabled systems and/or devices
JP6640060B2 (en) Robot system
EP3431229A1 (en) Action information generation device
KR102139513B1 (en) Autonomous driving control apparatus and method based on ai vehicle in the loop simulation
US10102449B1 (en) Devices, systems, and methods for use in automation
Copot et al. Predictive control of nonlinear visual servoing systems using image moments
CN106030457A (en) Tracking objects during processes
WO2015175739A1 (en) Robotic task demonstration interface
US11914761B1 (en) Systems and methods for virtual artificial intelligence development and testing
Wang et al. Perception of demonstration for automatic programing of robotic assembly: framework, algorithm, and validation
KR101577711B1 (en) Method for learning task skill using temporal and spatial relation
US10162737B2 (en) Emulating a user performing spatial gestures
US20200160210A1 (en) Method and system for predicting a motion trajectory of a robot moving between a given pair of robotic locations
Lloyd et al. Programming contact tasks using a reality-based virtual environment integrated with vision
Magyar et al. Guided stochastic optimization for motion planning
Liu et al. Grasp pose learning from human demonstration with task constraints
KR101329642B1 (en) Modeling method for learning task skill and robot using thereof
Österberg Skill Imitation Learning on Dual-arm Robotic Systems
KR101676541B1 (en) Method for Learning Task Skill and Robot Using Thereof
Wu et al. Video driven adaptive grasp planning of virtual hand using deep reinforcement learning
Herrmann et al. Motion Data and Model Management for Applied Statistical Motion Synthesis.
Ahmadzadeh et al. Visuospatial skill learning
WO2022190435A1 (en) Command script assistance system, command script assistance method, and command script assistance program

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant