WO2019098044A1

WO2019098044A1 - Robot motion adjustment device, motion control system, and robot system

Info

Publication number: WO2019098044A1
Application number: PCT/JP2018/040696
Authority: WO
Inventors: 浩司白土; 高志南本
Original assignee: 三菱電機株式会社
Priority date: 2017-11-14
Filing date: 2018-11-01
Publication date: 2019-05-23
Also published as: CN111344120A; DE112018005832B4; CN111344120B; JPWO2019098044A1; DE112018005832T5; JP6696627B2

Abstract

The present invention adjusts the motion of a robot so that excessive load is not applied to a target object, and facilitates the adjustment. A robot control device (111) sends a motion command value to a robot (120) and instructs the robot (120) equipped with an end effector (130) to perform an operation on a target object (200). The force applied to the end effector (130) according to the instruction is detected by an external sensor (142). A motion adjustment device (112) performs learning using the detection results of the external sensor (142), and adjusts and updates the motion command value acquired from the robot control device (111).

Description

Motion adjustment device for robot, motion control system and robot system

The present invention relates to industrial robots and service robots for non-manufacturing industries. In particular, the present invention relates to a motion adjustment device and a motion control system that adjust the motion of a robot so that an end effector mounted on the robot reaches a target position and orientation, and to a robot system that includes the motion adjustment device and the motion control system.

In the conventional industrial robot system, the relationship between the robot and the work object is precisely positioned, and there are many system configurations in which the robot repeats the work with high speed and high accuracy under the positioned environment. On the other hand, in recent years, robot systems utilizing a plurality of external sensors such as force sensors or vision sensors are increasing. Such a robot system is used in an environment in which the robot and the work object are not precisely positioned, and controls robot operation according to the detection result of the external sensor.

For example, such a robot system is used in a situation where the position and orientation of the work object or the surrounding environment are unknown. As another example, such a robot system is used in a situation where the position and orientation of the work object or the surrounding environment change. Specific examples include a bin picking operation, an insertion operation accompanied by a surface-following operation, and a fitting operation of parts such as a connector. Further, service robots for non-manufacturing industries are premised on working in variously changing environments, and the motion of the robot is similarly controlled using a plurality of sensors.

In a control system of a robot utilizing these sensors, it is necessary to adjust a plurality of control parameters in order to adjust the operation of the robot. By properly adjusting the control parameters, the operation of the robot becomes appropriate, and the performance of the robot system is secured. However, adjustment of control parameters is not easy and often requires specialized knowledge. Therefore, several automatic adjustment means have been proposed to facilitate adjustment of control parameters. For example, Patent Document 1 discloses a robot system that accelerates the motion of a robot by learning.

Patent Document 1: JP 2017-94438 A

In the conventional robot system, the magnitude of the load acting on the work object due to the motion of the robot is not taken into consideration during learning. Therefore, in the robot operation obtained by learning, the load acting on the work target may not be of an appropriate magnitude, and an excessive load may act on the work target. An object of the present invention is to provide a motion adjustment device, a motion control system, and a robot system that can adjust the motion of the robot so that an excessive load does not act on the work target, and that make the adjustment of the motion of the robot easy.

The robot motion adjustment device according to the present invention is used in a robot system that includes a robot equipped with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs work on a work target. The motion adjustment device includes a command value learning unit that performs learning based on the detected force acting on the end effector and adjusts the operation command value transmitted from the robot control device to the robot to control the operation of the robot.

Further, the motion control system of the present invention is used in a robot system in which a robot equipped with an end effector performs work on a work target. The motion control system includes a robot control device that transmits motion command values to the robot to control the motion of the robot, and a command value learning unit that adjusts the motion command value by performing learning based on a force acting on the end effector detected by a sensor.

Further, the robot system of the present invention includes a robot on which an end effector is mounted, a robot control device that transmits an operation command value to the robot to control the operation of the robot, and a command value learning unit that performs learning with the force acting on the end effector detected by a sensor as an input and adjusts the operation command value. In this robot system, the robot performs work on a work target.

According to the motion adjustment device, the motion control system, and the robot system of the present invention, the motion of the robot can be adjusted so that an excessive load does not act on the work target, and the adjustment of the motion of the robot can be facilitated.

FIG. 1 is a block diagram showing an example of a system configuration of a robot system provided with a motion adjustment device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing an example of a specific hardware configuration for implementing the robot control device and the motion adjustment device according to Embodiment 1 of the present invention.
FIG. 3 is a block diagram showing a configuration example of the motion adjustment device according to Embodiment 1 of the present invention and its peripheral blocks.
FIG. 4 is a diagram for explaining the operation of the motion adjustment device according to Embodiment 1 of the present invention.
FIG. 5 is a diagram showing an example of the velocity pattern before update in the robot system according to Embodiment 1 of the present invention.
FIG. 6 is a flowchart showing an example of the process flow of the motion control system according to Embodiment 1 of the present invention.
A diagram for explaining the operation of the motion adjustment device according to Embodiment 2 of the present invention.
A diagram showing an example of the initial value of the velocity pattern in the robot system according to Embodiment 2 of the present invention.
A diagram showing an example of the detected value of the force sensor in the robot system according to Embodiment 2 of the present invention.
A diagram showing an example of the velocity pattern after update in the robot system according to Embodiment 2 of the present invention.
A diagram showing another example of the velocity pattern after update in the robot system according to Embodiment 2 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 3 of the present invention and its peripheral blocks.
A block diagram showing a configuration example of the command value learning unit according to Embodiment 3 of the present invention and its peripheral blocks.
A diagram showing an example of the work performed by the robot system according to Embodiment 3 of the present invention.
A flowchart showing an example of the process flow of the learning processing unit according to Embodiment 3 of the present invention.
A flowchart showing an example of the flow of the pre-processing performed by the learning processing unit according to Embodiment 3 of the present invention.
A flowchart showing an example of the flow of the learning process performed by the learning processing unit according to Embodiment 3 of the present invention.
A diagram showing an example of the velocity pattern during a trial in the robot system according to Embodiment 3 of the present invention.
A diagram showing an example of the force information acquired during a trial in the robot system according to Embodiment 3 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 4 of the present invention and its peripheral blocks.
A flowchart showing an example of the flow of the pre-processing performed by the motion adjustment device according to Embodiment 4 of the present invention.
A flowchart showing an example of the flow of the learning process performed by the motion adjustment device according to Embodiment 4 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 5 of the present invention and its peripheral blocks.
A block diagram showing a configuration example of the motion learning unit according to Embodiment 5 of the present invention.
A block diagram showing another configuration example of the motion adjustment device according to Embodiment 5 of the present invention and its peripheral blocks.

Embodiment 1
FIG. 1 is a block diagram showing an example of a system configuration of a robot system 100 provided with a motion adjustment device according to a first embodiment of the present invention. As shown in FIG. 1, the robot system 100 includes a motion control system 110, a robot 120, an end effector 130, an internal sensor 141, and an external sensor 142. The motion control system 110 in turn includes a robot control device 111 and a motion adjustment device 112. The robot control device 111 is also called a robot controller.

The robot control device 111 transmits an operation command value for controlling the operation of the robot 120 to the robot 120 based on the detection results of the internal sensor 141 and the external sensor 142, and controls the operation of the robot 120. An end effector 130 such as a robot hand is attached to the robot 120. The end effector 130 directly works on the work target 200. As the end effector 130, an appropriate type of end effector 130 is selected according to each operation performed by the robot system 100. A surrounding environment 300 exists around the work target 200.

The surrounding environment 300 includes, for example, a part to which the work target 200 is to be assembled, a jig for positioning the work target 200, a tool (such as an electric screwdriver) for processing the work target 200, a parts feeder for supplying the work target 200 to the robot 120, a belt conveyor for conveying the work target 200, and the like. In addition, an external sensor 142 such as a camera for imaging the work target 200 may be treated as part of the surrounding environment 300. This is because the robot 120 or the end effector 130 may come into contact with the external sensor 142 when, for example, the external sensor 142 is fixed at a predetermined position around the robot 120.

The operation command value output from the robot control device 111 is, for example, information indicating a target position and a target posture at each time of the end effector 130 mounted on the robot 120, that is, a position command value. When the motion command value represents the target position of the end effector 130 at each time, the motion command value also indicates the moving speed of the end effector 130 between the respective times. Therefore, the position command value can also be considered as a speed command value representing a target motion speed of the robot.
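
As a minimal sketch of why a time-stamped position command also fixes the speed, the following Python snippet derives the speed implied between consecutive command times by finite differences. The function name, the one-dimensional simplification, and the sample values are illustrative assumptions and do not come from the patent.

```python
def implied_speeds(times, positions):
    """Average speed between consecutive position commands (finite difference)."""
    if len(times) != len(positions):
        raise ValueError("times and positions must have the same length")
    speeds = []
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        if dt <= 0:
            raise ValueError("command times must be strictly increasing")
        speeds.append((positions[k] - positions[k - 1]) / dt)
    return speeds


# Hypothetical 1-D position commands: time in seconds, position in millimetres.
command_times = [0.0, 0.5, 1.0, 2.0]
command_positions = [0.0, 10.0, 30.0, 40.0]
example_speeds = implied_speeds(command_times, command_positions)
```

Here the three command intervals imply speeds of 20, 40, and 10 mm/s, which is the sense in which a position command value can be read as a speed command value.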

Further, the operation command value output from the robot control device 111 may be a target operation speed of the robot 120 or a speed command value representing a target moving speed of the end effector 130. The target operating speed or target moving speed is given by the speed between each point in time of the movement of the robot 120 or the speed between each point on the path. Furthermore, the motion command value may be an acceleration command value representing a target acceleration of the motion of the robot 120 or a target acceleration of the movement of the end effector 130. The motion command value may take various forms as long as it directly controls the motion of the robot 120.

The motion adjustment device 112 adjusts and updates the motion command value generated by the robot control device 111 in accordance with the detection result of the external sensor 142 and a constraint condition given from the outside. That is, the motion adjustment device 112 adjusts the motion of the robot 120. In other words, the motion adjustment device 112 adjusts the correspondence between the detection results of the internal sensor 141 and the external sensor 142 and the motion command value output from the robot control device 111, and updates the correspondence to reflect the adjustment result. The adjustment of the operation command value can also be described as correction or modification of the operation command value.

When the updated operation command value is present, the robot control device 111 outputs the updated operation command value to the robot 120. The operation adjustment device 112 may update the operation command value with reference to not only the detection result of the external sensor 142 but also the detection result of the internal sensor 141. The constraint conditions may be stored in advance inside the motion adjustment device 112 or the robot control device 111.

The robot system 100 according to the present embodiment performs two processes, an adjustment process of adjusting and updating an operation command value, and a work process of performing an operation on the work target 200 using the updated operation command value. In other words, the operation of the robot system 100 includes the adjustment phase and the work phase, and the adjustment process is a process of the robot system 100 in the adjustment phase. The work process is a process of the robot system 100 in the work phase. The operation adjustment device 112 adjusts the operation command value so as to be an optimal operation command value in the adjustment process. However, the adjustment process and the work process do not have to be completely separated. For example, the robot system 100 may be configured such that the motion adjustment device 112 calculates an optimal motion command value as needed, even while the work on the work target 200 is being performed. In this configuration, the robot system 100 updates the operation command value at a predetermined timing as needed, such as when an operation command value more appropriate than the currently used operation command value is calculated. This point is the same as in the following embodiments.

FIG. 2 is a diagram showing an example of a specific hardware configuration for realizing the robot control device 111 and the operation adjustment device 112. As shown in FIG. 2, the robot control device 111 and the operation adjustment device 112 are realized by causing the processor 401 to execute a program stored in the memory 402. The processor 401 and the memory 402 are connected by a data bus 403. The memory 402 includes volatile memory and non-volatile memory, and temporary information is stored in the volatile memory. The robot control device 111 and the operation adjustment device 112 may be configured integrally or separately. For example, the robot control device 111 and the operation adjustment device 112 may be connected via a network or the like. Also in the following embodiments, the robot control device 111 and the operation adjustment device 112 can be realized by the same hardware configuration.

In the robot system 100, the operation control system 110 outputs an operation command value based on data acquired by the internal sensor 141 and the external sensor 142, forming a control system in which the robot 120 operates following the operation command value. Examples of the internal sensor 141 include a sensor for acquiring the position of a joint of the robot 120, a sensor for acquiring the operation speed of a joint, and a sensor for acquiring the current value of a motor that drives the joint. In the robot system 100, the robot control device 111, the robot 120, and the internal sensor 141 form a position control system that positions the end effector 130. Sensors that acquire the position of a joint include, for example, an encoder that detects the amount of rotation of a motor, a resolver, and a potentiometer. A tachometer or the like can be used as a sensor that acquires the motion speed of a joint. In addition, a gyro sensor, an inertial sensor, or the like may be used as an internal sensor to acquire information on the robot 120 itself.

By feedback control based on the internal sensor 141, the robot system 100 configures a position control robot system that performs material handling work and the like. Here, the material handling operation is an operation for transferring and transporting materials and parts. This position control robot system is called a feedback control system based on the internal sensor 141. In feedback control based on the internal sensor 141, control parameters include position control gain, speed control gain, current control gain, and filter design parameters used for feedback control. As a filter used for feedback control, a moving average filter, a low pass filter, a band pass filter, a high pass filter, etc. can be considered. The feedback control based on the internal sensor 141 is control for the robot 120 to operate in accordance with the operation command value. In other words, feedback control based on the internal sensor 141 is control that is performed to realize the operation command value.
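
To make the filter design parameters mentioned above concrete, here is a minimal Python sketch of two of the named filter types: a moving average filter, whose design parameter is the window length, and a first-order low pass filter, whose design parameter is a smoothing factor. The class names and the exponential-smoothing form of the low pass filter are assumptions chosen for illustration, not details given in the patent.

```python
from collections import deque


class MovingAverage:
    """Moving average filter; the design parameter is the window length."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self._samples = deque(maxlen=window)  # oldest sample drops out automatically

    def update(self, value):
        self._samples.append(value)
        return sum(self._samples) / len(self._samples)


class LowPass:
    """First-order low pass filter; the design parameter is alpha in (0, 1]."""

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self._alpha = alpha
        self._state = None

    def update(self, value):
        if self._state is None:
            self._state = value  # initialise on the first sample
        else:
            self._state += self._alpha * (value - self._state)
        return self._state
```

A longer window or a smaller alpha smooths sensor noise more strongly but adds lag to the feedback loop, which is one reason such parameters need adjustment.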

On the other hand, examples of the external sensor 142 include a force sensor, a vision sensor such as a camera, a tactile sensor, and a touch sensor. The external sensor 142 measures the contact state and positional relationship between the robot 120 and the work target 200 or the surrounding environment 300. In the robot system 100, the robot control device 111, the operation adjustment device 112, the robot 120, and the external sensor 142 form a sensor feedback control system based on the external sensor 142. Alternatively, the robot system 100 need not perform sensor feedback control based on the sensor signal output from the external sensor 142, and may simply use the sensor signal from the external sensor 142 as a trigger signal. In this case, the robot system 100 switches the control parameters of the feedback control based on the internal sensor 141 in response to the trigger signal. A sensor feedback control system based on the external sensor 142 is constructed as an outer loop of the position control robot system.

The sensor feedback control system based on the external sensor 142 senses the positional relationship between the robot 120, the robot arm, or the end effector 130 and the work target 200 or the surrounding environment 300, as well as the behavior on contact, in terms of acceleration, velocity, position, attitude, distance, force, moment, and the like. Furthermore, the sensor feedback control system based on the external sensor 142 controls the operation of the robot 120 so as to obtain a desired positional relationship or force response based on the sensing result. In other words, the sensor feedback control system based on the external sensor 142 corrects the operation command value so as to obtain a desired positional relationship or force response. In the sensor feedback control system based on the external sensor 142, the control parameters include the force control gain related to force control, impedance parameters, the gain related to visual servo control, visual impedance parameters, and the setting parameters of the filter used for feedback control.

In the case where control is performed based on the internal sensor 141 and the external sensor 142, a control parameter that needs adjustment may be simply referred to as a parameter hereinafter. Specific examples of sensors used as the internal sensor 141 or the external sensor 142 include a current sensor, a joint position sensor, a joint speed sensor, a temperature sensor, a distance sensor, a camera, an RGB-D sensor, a proximity sensor, a tactile sensor, and a force sensor. Further, the measurement target of the internal sensor 141 or the external sensor 142 may be the position and orientation of the robot 120, the position and orientation of the end effector 130, the position and orientation of a workpiece serving as the work target 200, the position and orientation of an operator, and the like.

FIG. 3 is a block diagram showing an example of the configuration of the operation adjustment apparatus 112 according to Embodiment 1 of the present invention and peripheral blocks. FIG. 3 shows a part of the configuration of the robot system 100 extracted. The operation adjustment device 112 includes a command value learning unit 113. In FIG. 3, the sensor 140 is a combination of the internal sensor 141 and the external sensor 142 into one. As described above, various sensors 140 can be considered. However, in the robot system 100 according to the present embodiment, the sensor 140 includes at least a force sensor that detects an external force acting on the end effector 130 due to the operation of the robot 120. This force sensor is an external sensor 142. Note that including at least a force sensor as the sensor 140 is the same as in the following embodiments.

A force sensor measures external force acting on the end effector 130 and is used to perform force control or impedance control. Control of the force that the end effector 130 applies to the work target 200 or the surrounding environment 300 is called force control. Also, controlling the operation of the robot 120 according to the detection result of the force sensor is referred to as force control. In force control, a target work force is set, and the magnitude of the force applied to the work target 200 or the surrounding environment 300 is controlled.
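
One simple way to picture force control with a target work force is a proportional law that turns the force error into a velocity correction. The sketch below is only an illustration under stated assumptions: the proportional law, the linear spring model of the contacted surface, and all names and numbers are hypothetical and are not specified in the patent.

```python
def force_control_velocity(target_force, measured_force, gain):
    """Velocity correction proportional to the force error (P control)."""
    return gain * (target_force - measured_force)


def simulate_pressing(target_force, stiffness, gain, dt, steps):
    """Press against a spring-like surface and return the final contact force."""
    depth = 0.0  # how far the end effector has pressed into the surface, in metres
    for _ in range(steps):
        measured = stiffness * max(depth, 0.0)  # toy stand-in for the force sensor
        depth += force_control_velocity(target_force, measured, gain) * dt
    return stiffness * max(depth, 0.0)


# Hypothetical values: 5 N target, 1000 N/m surface, 0.01 (m/s)/N gain, 10 ms cycle.
final_force = simulate_pressing(target_force=5.0, stiffness=1000.0,
                                gain=0.01, dt=0.01, steps=300)
```

With these values the simulated contact force settles at the 5 N target. A larger gain responds faster but can overshoot or oscillate on a stiff surface, which illustrates why the gain is one of the items that must be adjusted.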

On the other hand, in impedance control, impedance characteristics (spring, damper, inertia) related to the contact force generated when the end effector 130 and the work target 200 contact each other are defined and used for control. A contact force may also be generated when the end effector 130 comes into contact with the surrounding environment 300, or when the work target 200 held by the end effector 130 comes into contact with the surrounding environment 300. The impedance characteristics are represented by impedance parameters.
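
The spring, damper, and inertia characteristics are commonly expressed as a second-order relation M*a + D*v + K*x = F between the external force F and the displacement x of the end effector. The following Python sketch assumes that common form, which the patent does not state explicitly, and integrates it one control cycle at a time. The parameter names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImpedanceParams:
    inertia: float    # M, virtual inertia
    damping: float    # D, virtual damper
    stiffness: float  # K, virtual spring


def impedance_step(params, x, v, external_force, dt):
    """Advance M*a + D*v + K*x = F by one semi-implicit Euler step."""
    a = (external_force - params.damping * v - params.stiffness * x) / params.inertia
    v_next = v + a * dt
    x_next = x + v_next * dt  # uses the updated velocity
    return x_next, v_next


# Hypothetical parameters and a single step under a constant 2 N external force.
params = ImpedanceParams(inertia=1.0, damping=2.0, stiffness=4.0)
x1, v1 = impedance_step(params, x=0.0, v=0.0, external_force=2.0, dt=0.5)
```

Under a constant force the displacement settles at F/K, so the stiffness sets how far the end effector yields, while the damping and inertia shape how it gets there.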

In force control, it is necessary to determine a target value of force control. Further, in impedance control, it is necessary to determine control characteristics using impedance parameters. Furthermore, in any of the force control and the impedance control, it is necessary to determine the gain that contributes to the responsiveness of the control, and there are many adjustment items. In conventional robot systems, many parameter adjustments have been made for the purpose of performing tasks stably. In this case, system characteristics including the response of operation of the robot 120, mechanical rigidity, and the like are identified, and one parameter set that stably responds regardless of conditions or states is found. However, in the operation of the robot 120 accompanied by the contact with the work target 200, the contact state between the work target 200 and the end effector 130 changes as the movement progresses. Therefore, adjustment of the parameter set needs to be performed in consideration of the transition of the contact state. This adjustment was not easy because it would be performed by trial and error.

In the robot system 100 according to the present embodiment, the motion adjustment device 112 updates the motion command value to control the motion of the robot 120 to be appropriate. Constraint conditions are input to the operation adjustment device 112. The constraint conditions include the upper limit value or the lower limit value of the force information detected by the force sensor. Hereinafter, the operation command value output from the operation control system 110 will be described as the speed command value. The speed command value is a target moving speed of the end effector 130 with respect to each point on the moving path of the end effector 130. At this time, the time-series speed command value is the speed pattern for each point. The speed command value may be a target operating speed of the robot 120 for each point in time of work.

In the velocity pattern, a target velocity Vi (i = 1, 2, 3, ...) and a switching position Pi of the target velocity (i = 1, 2, 3, ...) are defined. The switching position may instead be specified by a switching time or by a switching parameter. An example of the switching parameter is the progress rate of the operation command value based on position and time. Further, the switching position Pi of the target speed may be the start point of switching of the target speed or the completion point of switching of the target speed. Further, the switching position Pi of the target velocity may be a point at which the operation velocity detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target velocity.
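
A velocity pattern of this kind can be held as a list of switching positions plus a list of target velocities. The sketch below is one possible representation, assuming a piecewise-constant pattern in which n switching positions separate n + 1 target velocities. The class name, that indexing convention, and the sample values are illustrative assumptions rather than the patent's own definition.

```python
from bisect import bisect_right


class VelocityPattern:
    """Piecewise-constant target velocity over the end effector's path position."""

    def __init__(self, switch_positions, target_velocities):
        if list(switch_positions) != sorted(switch_positions):
            raise ValueError("switch positions must be in ascending order")
        if len(target_velocities) != len(switch_positions) + 1:
            raise ValueError("need one more velocity than switch positions")
        self.switch_positions = list(switch_positions)
        self.target_velocities = list(target_velocities)

    def target_velocity(self, position):
        """Return the target velocity that applies at the given path position."""
        segment = bisect_right(self.switch_positions, position)
        return self.target_velocities[segment]


# Hypothetical pattern: fast approach, slower near contact, slowest at the end.
pattern = VelocityPattern(switch_positions=[10.0, 25.0],
                          target_velocities=[50.0, 20.0, 5.0])
```

In this sketch a switching position marks the start of the next target velocity. The other readings described above, such as a completion point, would need a ramp model on top of this lookup.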

FIG. 4 is a diagram for explaining the operation of the operation adjustment device 112 according to the first embodiment of the present invention. As shown in FIG. 4, consider the case where the end effector 130 mounted on the robot 120 moves from the position P0 to the position P3. A force sensor 143 is attached to the robot 120 as an external sensor 142. The force sensor 143 measures an external force acting on the end effector 130.

FIG. 5 is a view showing an example of a velocity pattern before update in the robot system 100 according to the first embodiment of the present invention. In FIG. 5, the horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130. In the velocity pattern of FIG. 5, the target velocity changes while the end effector 130 moves from P0 to P3. The motion adjustment device 112 updates the speed pattern based on the detection result of the force sensor 143.

FIG. 6 is a flow chart showing an example of the process flow of the operation control system 110 according to the first embodiment of the present invention. Here, it is assumed that the upper limit value and the lower limit value of the force information detected by the force sensor 143 and the upper limit value of the working time are included as the constraint conditions. First, in step S10, the robot control device 111 determines an initial value of the velocity pattern. Next, in step S11, the robot control device 111 controls the operation of the robot 120 and tries a task. When the adjustment process and the work process are not completely separated as described above, a part of the normal work in the robot system 100 may be treated as a trial.

Next, in step S12, the operation adjustment device 112 determines whether the constraint condition is satisfied. That is, in step S12, the operation adjustment device 112 determines whether the detection value of the force sensor 143 falls within the range between the upper limit value and the lower limit value defined by the constraint condition, and whether the constraint on the working time is satisfied. When evaluating the detection value of the force sensor 143, for example, the maximum detection value is compared with the upper limit value of the constraint condition, and the minimum detection value is compared with the lower limit value of the constraint condition. In step S12, the operation adjustment device 112 may use, instead of the detection value of the force sensor 143 itself, an evaluation value calculated from the detection value. One example is an evaluation value calculated by an evaluation function that takes the detection value of the force sensor 143 and the tact time as inputs. In that case, the operation adjustment device 112 may determine in step S12 whether the evaluation value is within the limit range.
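
The check in step S12 can be sketched as follows: the largest force reading of a trial is compared with the upper limit, the smallest with the lower limit, and the working time with its upper limit. The function name, argument names, and sample data are hypothetical, and the evaluation-function variant is not covered here.

```python
def constraints_satisfied(force_samples, work_time,
                          force_lower, force_upper, time_upper):
    """Step S12 style check on one trial's force readings and working time."""
    if not force_samples:
        raise ValueError("at least one force sample is required")
    force_ok = (max(force_samples) <= force_upper
                and min(force_samples) >= force_lower)
    time_ok = work_time <= time_upper
    return force_ok and time_ok


# Hypothetical trial data: forces in newtons, working time in seconds.
trial_forces = [-0.4, 1.2, 7.9, 3.1]
ok = constraints_satisfied(trial_forces, work_time=4.2,
                           force_lower=-1.0, force_upper=8.0, time_upper=5.0)
too_hard = constraints_satisfied(trial_forces, work_time=4.2,
                                 force_lower=-1.0, force_upper=5.0, time_upper=5.0)
```

With an 8 N upper limit the trial passes. With a 5 N upper limit the 7.9 N peak violates the constraint, so the process would move on to the adjustment in step S13.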

When it is determined in step S12 that the constraint condition is satisfied, the process of the operation control system 110 is temporarily ended, and the work is thereafter performed with the updated speed pattern. On the other hand, when it is determined in step S12 that the constraint condition is not satisfied, the process of the operation control system 110 proceeds to step S13. In step S13, the operation adjustment device 112 adjusts and updates the speed pattern. For example, the operation adjustment device 112 calculates a correction coefficient and adjusts the velocity pattern by multiplying the velocity pattern used in the trial by the correction coefficient. When the process of step S13 ends, the process of the operation control system 110 returns to step S11.
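
The loop of steps S10 to S13 can be sketched in Python as below. The patent does not say how the correction coefficient is computed, so the rule used here (scale the whole pattern by the ratio of the force limit to the observed peak, with a safety margin) is purely an assumption, as are the function names and the toy trial model. For brevity only an upper limit on force is checked.

```python
def adjust_velocity_pattern(initial_pattern, run_trial, force_upper,
                            margin=0.9, max_trials=20):
    """Repeat trial (S11), check (S12), and rescale (S13) until the peak force fits."""
    pattern = list(initial_pattern)                      # S10: initial velocity pattern
    for _ in range(max_trials):
        peak_force = max(run_trial(pattern))             # S11: try the task
        if peak_force <= force_upper:                    # S12: constraint satisfied?
            return pattern
        coefficient = margin * force_upper / peak_force  # S13: assumed correction rule
        pattern = [v * coefficient for v in pattern]
    raise RuntimeError("no pattern satisfied the constraint within max_trials")


def toy_trial(pattern):
    """Stand-in for a real trial: contact force grows linearly with speed."""
    return [0.5 * v for v in pattern]


# Hypothetical initial pattern (mm/s) and a 10 N force limit.
adjusted = adjust_velocity_pattern([50.0, 20.0, 5.0], toy_trial, force_upper=10.0)
```

In this toy case the first trial peaks at 25 N, the pattern is scaled by 0.36, and the second trial peaks at about 9 N, so the loop ends after one adjustment. A real system would replace `toy_trial` with an actual robot trial and would typically need a less naive update rule.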

The operation control system 110 according to the first embodiment of the present invention performs the above processing. As described above, the operation control system 110 according to the first embodiment of the present invention adjusts the speed pattern in a learning manner based on data obtained by a plurality of trials. In other words, the motion control system 110 according to the first embodiment of the present invention adjusts the speed pattern, which is the motion command value, using machine learning or optimization.

In the above description, the upper limit value of the working time is included in the constraint conditions, but it is not essential, and another condition may be used instead. Further, instead of giving the upper limit value of the working time as a constraint condition, the condition may be that the working time is minimized once the other conditions are satisfied. Furthermore, although the above description covers the case where the operation control system 110 updates the operation command value so as to satisfy the given constraints, a configuration in which the operation control system 110 adjusts and updates the control parameters is also conceivable. Furthermore, although FIG. 1 shows a configuration example in which the robot control device 111 and the operation adjustment device 112 are separately provided, the robot control device 111 may be configured to incorporate the operation adjustment device 112.

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted such that the detection value of the force sensor 143 falls within a predetermined range. Here, the detection value of the force sensor 143 represents the magnitude of the external force acting on the end effector 130. In other words, the detection value of the force sensor 143 is information representing the magnitude of the force applied to the work target 200 or the surrounding environment 300 due to the operation of the robot 120. Therefore, according to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the operation of the robot 120 can be adjusted so that the force applied to the work target 200 or the surrounding environment 300 has an appropriate magnitude, that is, so that an excessive load does not act on the work target 200 or the surrounding environment 300, and the adjustment of the operation of the robot 120 can be facilitated.

As described above, by using the force sensor 143 to learn and adjust the motion command value so that the force response falls within the desired range, high-quality robot work that does not damage the work target can be realized. Furthermore, high-speed work can be realized by adding the work time to the constraint condition.

In addition, although the motion adjustment device 112, the motion control system 110, and the robot system 100 according to the present embodiment use the magnitude of the force detected by the force sensor 143 as a constraint condition, it is also possible to detect a moment, torque, current value, or the like and to use the upper or lower limit of any of these as a constraint condition. In this way, a limit value can be set for the contact situation between the robot 120 or the end effector 130 and the outside world, and an operation command value can be searched for within a desired range. As a result, an operation that does not damage the work target 200 can be realized.

Furthermore, as a constraint condition, the relative position and orientation with the surrounding environment 300 and the position and orientation of the robot 120 can be added. By adding either of these upper limits or lower limits to the constraint conditions, robot work with reduced interference with the surrounding environment 300 can be realized while realizing high-quality work. As a result, it is possible to obtain a remarkable effect of increasing the system operation rate. The effects described above are similarly obtained in the other embodiments.

Second Embodiment
The configurations of the operation adjustment device, the operation control system, and the robot system of the present embodiment are the same as those shown in FIG. The motion adjustment device 112, the motion control system 110, and the robot system 100 according to the present embodiment divide the motion command given to the robot 120 for a series of tasks into a plurality of sections, and adjust the motion command value for each section. In the following description, it is assumed that the operation command value output from the operation control system is a speed command value.

FIG. 7 is a diagram for explaining the operation of the operation adjustment device 112 according to the second embodiment of the present invention. As shown in FIG. 7, an operation of moving the end effector 130 mounted on the robot 120 from the position P0 to the position P3 is considered. Position P0 which is an initial position is a start point of work, and position P3 is an end point of work. The end effector 130 passes through the positions P1 and P2 while moving from the position P0 to the position P3.

In robot system 100 of the present embodiment, the path from the start point of work to the end point of work is divided into a plurality of sections. In other words, in the robot system 100 according to the present embodiment, the operation of the robot 120 from the start of one operation to the end of the operation is divided into a plurality of sections. Here, a section S1 is from the position P0 to the position P1, a section S2 from the position P1 to the position P2, and a section S3 from the position P2 to the position P3. Further, the target moving speed of section S1 is V1, the target moving speed of section S2 is V2, and the target moving speed of section S3 is V3. The robot system 100 according to the present embodiment adjusts and updates the operation command value for each of the divided sections. Specifically, the robot system 100 adjusts the target moving speed of the section S1, the target moving speed of the section S2, and the target moving speed of the section S3.

In robot system 100 of the present embodiment, positions P1 and P2, which are the division points between sections, are determined in advance according to the contents of the work. Positions P1 and P2 are positions at which the sections switch, and may be referred to as switching positions. Although the number of sections is illustrated as three here, it is not necessarily limited to three. Furthermore, although the sections are defined spatially by position here, the work may instead be divided temporally from the start time of the work to the end time of the work.
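The section layout described above can be sketched as a lookup from the position of the end effector to the target speed of the section that contains it. This is only an illustration: the class name `SpeedPattern`, its methods, and the treatment of positions as scalar distances along the path are assumptions that do not appear in the original text.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class SpeedPattern:
    """Piecewise-constant target speed over sections S1..SN (hypothetical type).

    points: division points P0..PN as scalar path positions (assumption).
    speeds: target speeds V1..VN, one per section.
    """
    points: List[float]
    speeds: List[float]

    def __post_init__(self) -> None:
        if len(self.points) != len(self.speeds) + 1:
            raise ValueError("need one more division point than sections")
        if any(b <= a for a, b in zip(self.points, self.points[1:])):
            raise ValueError("division points must be strictly increasing")

    def section_index(self, position: float) -> int:
        """Return the 0-based index of the section containing position."""
        if not self.points[0] <= position <= self.points[-1]:
            raise ValueError("position outside the work path")
        # bisect_right counts the division points <= position, so a switching
        # position belongs to the section that starts there.
        idx = bisect_right(self.points, position) - 1
        return min(idx, len(self.speeds) - 1)  # the end point belongs to SN

    def target_speed(self, position: float) -> float:
        return self.speeds[self.section_index(position)]


# Three sections with the same initial speed, as in V1 = V2 = V3 = Vini.
pattern = SpeedPattern(points=[0.0, 0.10, 0.15, 0.20], speeds=[0.05, 0.05, 0.05])
```

Treating a switching position as the start of the next section is one possible convention; the text also allows it to be the completion point of the speed switch.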

The upper limit value Flim of the detection result of the force sensor 143 is given to the operation control system 110 according to the present embodiment as a constraint condition. The flow of processing of the operation control system 110 according to the present embodiment is basically the same as the flow shown in FIG. However, the speed pattern will be adjusted for each segment. First, in step S10 of FIG. 6, the robot control device 111 determines an initial value of the velocity pattern. FIG. 8 is a diagram showing an example of initial values of velocity patterns in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 8, the horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130. In FIG. 8, the initial value of the velocity pattern is V1 = V2 = V3 = Vini.

Next, in step S11, the robot control device 111 controls the operation of the robot 120 and tries a task. FIG. 9 is a diagram showing an example of detection values of force sensor 143 in robot system 100 according to the second embodiment of the present invention. In FIG. 9, the horizontal axis is the position P of the end effector 130, and the vertical axis is the detection value F of the force sensor 143. FIG. 9 shows values detected by the force sensor 143 when the robot 120 is operated at the initial value of the velocity pattern shown in FIG.

Next, in step S12, the operation adjustment device 112 determines whether the constraint condition is satisfied. That is, in step S12, the operation adjustment device 112 determines whether the detected value of the force sensor 143 in each section is equal to or less than the upper limit Flim defined by the constraint condition. As the detection value of the force sensor 143 used for determination, for example, the maximum value among the detection values of the force sensor 143 in each section is used. In step S12, when the detection value of the force sensor 143 is equal to or less than Flim in all the sections, the operation adjustment device 112 determines that the constraint condition is satisfied. On the other hand, if there is at least one section in which the detection value of the force sensor 143 exceeds the upper limit Flim in step S12, the operation adjustment device 112 determines that the constraint condition is not satisfied.
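The determination in step S12 can be sketched as follows: take the maximum detected force in each section and compare it with Flim. The function names, the (position, force) sample format, and the use of the force magnitude are assumptions made for this illustration. A section without samples is reported as 0.0 here.

```python
from typing import List, Sequence, Tuple


def section_max_forces(
    samples: Sequence[Tuple[float, float]],
    points: Sequence[float],
) -> List[float]:
    """Return the maximum detected force magnitude in each section.

    samples: (position, force) pairs logged during one trial (assumed format).
    points:  division points P0..PN in increasing order.
    """
    n_sections = len(points) - 1
    maxima = [0.0] * n_sections
    for position, force in samples:
        for i in range(n_sections):
            last = i == n_sections - 1
            # The final section also includes the end point PN.
            below_upper = position <= points[i + 1] if last else position < points[i + 1]
            if points[i] <= position and below_upper:
                maxima[i] = max(maxima[i], abs(force))
                break
    return maxima


def violating_sections(maxima: Sequence[float], f_lim: float) -> List[int]:
    """Return 0-based indices of sections whose maximum force exceeds f_lim."""
    return [i for i, f_max in enumerate(maxima) if f_max > f_lim]


samples = [(0.02, 1.0), (0.08, 2.0), (0.11, 6.5), (0.14, 4.0), (0.18, 3.0)]
maxima = section_max_forces(samples, points=[0.0, 0.10, 0.15, 0.20])
violations = violating_sections(maxima, f_lim=5.0)
constraint_satisfied = not violations  # satisfied only if no section exceeds Flim
```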

When it is determined in step S12 that the constraint condition is satisfied, the process of the operation control system 110 is temporarily ended, and the work with the updated speed pattern is performed thereafter. On the other hand, when it is determined in step S12 that the constraint condition is not satisfied, the process of the operation control system 110 proceeds to step S13. In step S13, the operation adjustment device 112 adjusts the velocity pattern so as to reduce the target velocity of the section in which the detection value of the force sensor 143 exceeds the upper limit Flim, and updates the velocity pattern.

In the example shown in FIG. 9, in the section S2, the detection value Fmax2 of the force sensor 143 exceeds the upper limit value Flim. On the other hand, the detected value Fmax1 of the force sensor 143 in the section S1 and the detected value Fmax3 of the force sensor 143 in the section S3 do not exceed the upper limit value Flim. Therefore, in step S12, the operation adjustment device 112 determines that the constraint is not satisfied. In step S13, the operation adjustment device 112 adjusts the speed pattern such that the target speed V2 in the section S2 decreases. The operation control system 110 according to the second embodiment of the present invention performs the above processing. FIG. 10 is a diagram showing an example of the updated velocity pattern in the robot system 100 according to the second embodiment of the present invention. In FIG. 10, the horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130.

In the above description, the upper limit Flim of the detection result of the force sensor 143 is given as the constraint condition, but the shortest operation time may be added as a constraint condition. In this case, since Fmax1 and Fmax3 do not exceed the upper limit Flim in FIG. 9, the operation adjustment device 112 adjusts the speed pattern in step S13 so as to increase the target velocity V1 in the section S1 and the target velocity V3 in the section S3. By adjusting the speed pattern in this manner, the working time can be further shortened. FIG. 11 is a diagram showing another example of the updated velocity pattern in the robot system 100 according to the second embodiment of the present invention. In FIG. 11, the horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130.
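The two variants of step S13 (slow down only the violating sections, or additionally speed up the sections that stay within Flim) can be sketched as below. The text does not state how much the target speed changes per update, so the multiplicative factors `slow_factor` and `fast_factor`, like the function name itself, are purely hypothetical.

```python
from typing import List, Sequence


def update_speed_pattern(
    speeds: Sequence[float],
    maxima: Sequence[float],
    f_lim: float,
    slow_factor: float = 0.8,   # hypothetical step size, not from the source
    fast_factor: float = 1.1,   # hypothetical step size, not from the source
    shorten_time: bool = False,
) -> List[float]:
    """Return updated target speeds, one per section.

    speeds: current target speeds V1..VN.
    maxima: maximum detected force per section from the last trial.
    """
    updated = []
    for v, f_max in zip(speeds, maxima):
        if f_max > f_lim:
            updated.append(v * slow_factor)   # reduce speed where Flim is exceeded
        elif shorten_time:
            updated.append(v * fast_factor)   # use the remaining margin to go faster
        else:
            updated.append(v)                 # leave compliant sections unchanged
    return updated


speeds = [0.05, 0.05, 0.05]
maxima = [2.0, 6.5, 3.0]
slowed = update_speed_pattern(speeds, maxima, f_lim=5.0)
faster = update_speed_pattern(speeds, maxima, f_lim=5.0, shorten_time=True)
```

With these example values only the second section is slowed in the first call, which corresponds to the situation of FIG. 10, while the second call also raises the first and third sections as in FIG. 11.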

When the operation command value is the speed command value, as shown in FIGS. 10 and 11, the dividing points P1 and P2 are positions where the target speed is switched. The division points P1 and P2 may be start points of switching of the target speed, or may be completion points of switching of the target speed. Further, the dividing points P1 and P2 may be points that guarantee that the operation speed detected by the internal sensor 141 falls within a predetermined error range from the target speed.

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted for each section. Since only the section in which the detection value of the force sensor 143 is larger than the predetermined value is adjusted so as to slow down the operation, the work as a whole is not unnecessarily slowed down, the operation of the robot 120 can be adjusted so that no excessive load is applied to the work target 200 or the surrounding environment 300, and the adjustment of the operation of the robot 120 is facilitated. Furthermore, if the sections in which the detected value of the force sensor 143 is smaller than the predetermined value are adjusted so that the operation becomes faster, the work as a whole can also be made faster.

As described above, according to the operation adjustment device 112, the operation control system 110 and the robot system 100 of the present embodiment, learning and updating the optimum operation command value for each section makes it possible to design fine-grained motion command values that could not be achieved by conventional adjustment, and as a result, high-speed, high-quality robot work can be realized.

Third Embodiment
FIG. 12 is a block diagram showing a configuration example of the operation adjustment device 112b according to the third embodiment of the present invention and peripheral blocks. FIG. 12 shows a part of the configuration of the robot system 100 extracted. The operation adjustment device 112 b includes a command value learning unit 113 b. The configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. 1 except that the motion adjustment device 112 is replaced with the motion adjustment device 112 b. The operation adjustment apparatus 112b according to the present embodiment is different from the operation adjustment apparatus 112 according to the second embodiment in that division information is input. The division information includes information on the initial value of the division position and the initial value of the operation command value in each division. The division position is the position of division points Pi at both ends of each division, and is, for example, a position where the target value of the operation speed is switched. When the internal sensor 141 or the external sensor 142 detects that the end effector 130 has reached a predetermined position, the target value of the operation speed is switched.

The motion adjustment device, motion control system, and robot system of the present embodiment adjust the motion command value for each section, as in the second embodiment. By adjusting the operation command value for each section, the command value learning unit 113b can make the adjustment so that a section where a collision or the like occurs is operated at low speed and the other sections are operated at high speed. With this learning unit, an operation command value that realizes high-speed work can be learned automatically. The command value learning unit 113b automatically learns an operation command value corresponding to each section. For the sake of simplicity, the operation adjustment device 112b is described as adjusting only the operation command value without adjusting the control parameter.

In the motion adjustment device, motion control system, and robot system of the present embodiment, the command value learning unit 113b updates each operation command value using the division information, the constraint condition, the detection value of the sensor 140, and the motion command value before updating. The division information is defined so as to divide the operation command value into N divisions. Each division point is defined as Pi (i = 0, 1, 2, ..., N + 1). Here, N is a natural number. The start point and the end point of the operation are also included in the division points, and the start point is P0. The section between the division point Pi and the division point immediately before it is called the division Si (i = 0, 1, 2, ..., N).

In the motion adjustment device, motion control system, and robot system of the present embodiment, it is assumed that divisions are defined each time the work state changes. For example, in the case of a fitting operation using a force sensor, the dividing points Pi are defined before and after a contact phenomenon occurs between parts to be fitted, and before and after a change in contact state. The division point Pi is defined in accordance with the change in the expected contact state, and the speed of the entire operation can be increased by changing the target value of the operation command such as the position, velocity, or acceleration suitable for each. At this time, it is a feature of the operation adjustment device, the operation control system, and the robot system of the present embodiment that the position of the dividing point Pi and the command value pattern of each division Si are defined from past trial information.

The constraint condition input to the command value learning unit 113b is a condition that defines the boundary between the success and failure of the work with respect to the speeded-up work. In the speeded-up work operation, there is a risk that the end effector 130 strongly collides with the work target 200 due to an error in position control of the end effector 130 or the like. If a strong collision occurs, the end effector 130 or the work target 200 may be damaged, resulting in work failure. In consideration of such past work failures, the user defines the constraint conditions at design time, or the constraint conditions are defined from past trial data, so that operation command values for performing high-speed, low-impact work can be generated.

Constraint conditions include a position limit range, a posture limit range, an upper limit value of operating speed, a lower limit value of operating speed, an upper limit value of force, a lower limit value of force, an upper limit value of moment, a lower limit value of moment, and the like. In particular, when the positions and orientations of the robot 120 and the work object 200 can be acquired, limit values defined by the upper or lower limit of the position and orientation of the robot 120, or of the relative position and orientation of the robot 120 and the surrounding environment 300, can be entered as constraint conditions.
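A set of upper and lower limit values of this kind can be represented, for illustration, as a table of optional bounds per measured quantity. The type `Limit`, the function `violated`, and the quantity names and numbers below are hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Limit:
    """Optional lower and upper bound for one measured quantity."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def contains(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def violated(constraints: Dict[str, Limit], measured: Dict[str, float]) -> List[str]:
    """Return the names of constrained quantities that are out of range.

    A constrained quantity that was not measured counts as a violation,
    because the constraint cannot be confirmed.
    """
    return [
        name
        for name, limit in constraints.items()
        if name not in measured or not limit.contains(measured[name])
    ]


# Example values are illustrative only.
constraints = {
    "force_n": Limit(upper=5.0),
    "moment_nm": Limit(lower=-0.5, upper=0.5),
    "speed_m_s": Limit(lower=0.01, upper=0.2),
}
measured = {"force_n": 6.2, "moment_nm": 0.1, "speed_m_s": 0.05}
out_of_range = violated(constraints, measured)
```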

Also, data acquired by the internal sensor 141 or the external sensor 142 is referred to as sensor information. For sensor information, pre-processing such as processing for removing noise by filter processing and processing for extracting only a value that exceeds a threshold is performed as necessary.
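The two kinds of pre-processing mentioned above can be sketched as follows. The text does not specify the filter, so a causal moving average is used here as one simple example of noise removal; the function names and the window length are assumptions.

```python
from collections import deque
from typing import Iterable, List, Sequence, Tuple


def moving_average(values: Iterable[float], window: int) -> List[float]:
    """Smooth a signal with a causal moving average over the last `window` samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` latest samples
    smoothed = []
    for value in values:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


def keep_above(values: Sequence[float], threshold: float) -> List[Tuple[int, float]]:
    """Return (sample index, value) pairs whose magnitude exceeds the threshold."""
    return [(i, v) for i, v in enumerate(values) if abs(v) > threshold]


raw_force = [0.0, 3.0, 6.0, 3.0]
filtered = moving_average(raw_force, window=2)
significant = keep_above(filtered, threshold=2.0)
```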

The operation command value refers to a control command value that can be input to the position control system of the robot system 100. The operation command value may be simply referred to as a command value. The motion of the robot 120 is controlled by the motion of the motor of each axis. The operation command value includes, for example, a position command value for controlling the operation of the motor, a speed command value, a current command value, and the like. Also, the motion adjustment device 112 b can equivalently generate time-series data of the position command value from the velocity pattern generated from the profile representing the relationship between time and velocity, and input it to the robot control device 111. The motion command value can also be generated inside the robot control device 111.
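Generating time-series position command values from a velocity profile amounts to integrating velocity over time. A minimal sketch is given below, assuming a uniformly sampled one-dimensional profile and trapezoidal integration; neither the sampling scheme nor the function name comes from the original text.

```python
from typing import List, Sequence


def position_commands(velocities: Sequence[float], dt: float, start: float = 0.0) -> List[float]:
    """Integrate a sampled velocity profile into position command values.

    velocities: velocity at times 0, dt, 2*dt, ... (assumed uniform sampling).
    Returns one position per velocity sample, beginning at `start`.
    """
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    positions = [start]
    for v_prev, v_next in zip(velocities, velocities[1:]):
        # Trapezoidal rule: average velocity over the interval times its length.
        positions.append(positions[-1] + 0.5 * (v_prev + v_next) * dt)
    return positions


# A short accelerate, cruise, decelerate profile.
commands = position_commands([0.0, 1.0, 1.0, 0.0], dt=0.5)
```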

The operation adjustment device 112b according to the present embodiment takes out the command value inside the robot control device 111, and adjusts and updates the operation command value according to the sensor information obtained as a response when the robot 120 performs the work. This point is the same as in the other embodiments. Hereinafter, the operation command value output from the operation control system is described as the speed command value. As another configuration, a configuration may be considered in which the motion adjustment device 112b passes not the motion command value itself but the parameters necessary for generating the motion command value to the robot control device 111. For example, the motion adjustment device 112b can also input to the robot control device 111 only the division positions and the target value of the motion speed in each division. In this case, the robot control device 111 generates an operation command value based on the input division positions and target values of the operation speed.

The operation adjustment device 112b includes a command value learning unit 113b. The command value learning unit 113b adjusts and updates the operation command value. The command value learning unit 113b obtains a new operation command value based on the division information, the constraint condition, the operation command value before updating, and the detection value of the sensor 140. When obtaining a new motion command value, the command value learning unit 113b evaluates the speed of work and the quality of work using an evaluation function, and searches for a high-speed operation that is unlikely to damage the work target 200. The operation adjustment device 112b may also be configured to adjust and update control parameters used by the robot control device 111. Adjustment and update of control parameters are also performed by the command value learning unit 113b.

FIG. 13 is a block diagram showing a configuration example of the command value learning unit 113b according to the third embodiment of the present invention and peripheral blocks. FIG. 13 shows a part of the configuration of the robot system 100 extracted. The command value learning unit 113 b includes a storage unit 114 and a learning processing unit 115. An example of a search method in the command value learning unit 113b will be described with reference to FIG. Here, it is assumed that the number of divisions is previously defined as N = 4. Further, it is assumed that a speed target value which is a value of the target speed defined in each section is used as the operation command value. In addition, the command value learning unit 113b realizes high-speed work by adjusting the speed target value in each section.

FIG. 14 is a diagram showing an example of operations performed by the robot system 100 according to the third embodiment of the present invention. As shown in FIG. 14, the robot system 100 performs an operation of inserting the first part 210 into the second part 310. FIG. 14 illustrates the change in relative position between the first part 210 and the second part 310 as the work progresses, with the work progressing in the order of (a), (b), (c), (d). The first part 210 corresponds to the work target 200, and the second part 310 corresponds to the surrounding environment 300.

The first component 210 is provided with a hole 211. On the other hand, the second component 310 is provided with a protrusion 311. When the first part 210 is inserted into the second part 310, the protrusion 311 is inserted into the hole 211. The first part 210 is made of a first material. On the other hand, the second part 310 includes a portion 312 formed of the first material and a portion 313 formed of the second material. When inserting the first part 210 into the second part 310, a change occurs in the contact between the first part 210 and the second part 310.

In the example shown in FIG. 14, in accordance with the progress of the work from (b) to (d), the contact portion and the contact state between the parts change. The contact state includes the material of each part of the contact portion, the size of the contact portion, and the like. The change in the contact state changes the frictional force generated at the contact portion. In (b) of FIG. 14, a friction force between the outer shapes of the first component 210 and the second component 310 is generated. In (c) of FIG. 14, the contact between the hole 211 and the protrusion 311 is further added, so the frictional force is increased. Due to the change in the frictional force generated between the parts, the detection result of the force sensor 143 also changes. That is, in the fitting work of the parts, the insertion work of the connector, etc., the reaction force between the parts changes in accordance with the progress of the work. The force sensor detects the reaction force between the parts.

As shown in FIG. 13, the command value learning unit 113b stores the force information detected by the sensor 140 and the velocity pattern acquired from the robot control device 111 in the storage unit 114. It is assumed that the robot system 100 can operate the robot 120 by designating a velocity pattern when trying the work in order to adjust the operation command value. As offline processing, the learning processing unit 115 updates the velocity pattern based on the force information, the velocity pattern, the division information, and the constraint condition stored in the storage unit 114, and outputs the updated velocity pattern to the robot control device 111.

Here, although one velocity pattern is stored in the robot control device 111, when adjusting the operation command value, the operation adjustment device 112b prompts the task to be tried with a plurality of velocity patterns that use this one velocity pattern as a reference. As a result, when adjusting the operation command value, the robot system 100 tries the task under various conditions. The operation adjustment device 112b adjusts the operation command value based on data obtained in trials under various conditions. For example, the robot system 100 performs Na trials with different operation command values, including operation command values different from the one stored in the robot control device 111. The operation adjustment device 112b takes the data obtained from the Na trials as input, performs learning once, and updates the operation command value. When Nb sets are performed, with Na trials as one set, in many cases the operation command value converges and no further improvement occurs. Here, Na and Nb are integers of 1 or more.

As described above, in the robot system 100 according to the present embodiment, a trial is performed using one or more set operation command values, and an evaluation value is generated based on the obtained force sensor data. The operation adjustment device 112b updates the operation command value based on each evaluation value. In updating the operation command value, the operation adjustment device 112b generates one or more operation command values and executes the trial again. If there is one operation command value, the operation adjustment device 112b ends the update of the operation command value when the evaluation value converges in the graph in which the evaluation value is plotted. If there are a plurality of motion command values, the motion adjustment device 112b ends the update of the motion command value when the evaluation value converges in the graph in which only the minimum of the evaluation values corresponding to the motion command values is plotted. In this case, the motion adjustment device 112b updates the motion command value to the one that gave the minimum evaluation value.
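The stopping rule described above, in which only the minimum evaluation value of each update round is tracked and the update ends once that value stops changing, can be sketched as follows. The text does not define a numerical convergence criterion, so the tolerance `tol` and the number of rounds `patience` are hypothetical, as are the function names.

```python
from typing import List, Sequence


def record_best(history: List[float], evaluation_values: Sequence[float]) -> None:
    """Append the minimum evaluation value of one round of trials to the history."""
    history.append(min(evaluation_values))


def has_converged(history: Sequence[float], tol: float = 1e-3, patience: int = 3) -> bool:
    """Return True if the best value varied by at most `tol` over the last
    `patience` updates (hypothetical criterion)."""
    if len(history) < patience + 1:
        return False
    recent = history[-(patience + 1):]
    return max(recent) - min(recent) <= tol


# Each inner list holds the evaluation values of one round of trials.
history: List[float] = []
rounds = ([12.0, 10.0], [9.0, 8.0], [7.5, 8.1], [7.5, 7.9], [7.6, 7.5], [7.5, 7.7])
for values in rounds:
    record_best(history, values)

done = has_converged(history)
```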

FIG. 15 is a flowchart showing an example of a process flow of the learning processing unit 115 according to the third embodiment of the present invention. As shown in FIG. 15, first, in step S100, the learning processing unit 115 performs preprocessing as a preparation stage. Next, in step S200, the learning processing unit 115 performs a learning process.

FIG. 16 is a flowchart showing an example of the flow of preprocessing performed by the learning processing unit 115 according to the third embodiment of the present invention. Note that FIG. 16 also describes operations performed by blocks other than the learning processing unit 115 in order to explain the operations. First, in step S101, the robot control device 111 sets control parameters for performing force sense control. Next, in step S102, the robot control device 111 operates the robot 120 to try a task. Next, in step S103, the command value learning unit 113b acquires data obtained in the trial. The data obtained by each trial is called trial data. The trial data includes force information detected in each trial and a velocity pattern used in each trial. The force information is time-series data acquired by the force sensor 143 at predetermined time intervals in each trial, and is also referred to as a force waveform. Next, in step S104, the storage unit 114 stores the data acquired in step S103.

Next, in step S105, the learning processing unit 115 determines whether K or more pieces of trial data have been acquired. Here, K is a natural number and is preset. If K or more trial data have not been acquired yet, the process returns to step S102. On the other hand, if K or more trial data have been acquired, the process proceeds to step S106. Therefore, when the process proceeds to step S106, K trial data D1j (j = 1, 2, 3, ..., K) are acquired and stored in the storage unit 114.

Next, in step S106, the learning processing unit 115 defines the division position based on the K trial data stored in the storage unit 114. The division position is the position of division points at both ends of each division. The position of the dividing point corresponds, for example, to the position of the end effector 130. The position of the dividing point is the switching position of the speed target value. The position of the dividing point may be a start point of switching of the target speed, or may be a completion point of switching of the target speed. In addition, the position of the dividing point may be a point at which it is ensured that the operation speed detected by the internal sensor 141 falls within a predetermined error range from the target speed.

The learning processing unit 115 defines, for example, the division position based on the average and the variance of the K trial data. The learning processing unit 115 can automatically determine the position of the dividing point by setting the dividing point before and after the force waveform largely changes, paying attention to the rate of change of the force waveform. Alternatively, the user can manually determine the point at which the state change occurs according to the work content as the division point.
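One way to place division points before and after a large change of the force waveform is sketched below. It averages the K force waveforms and marks the samples that bracket each run of steep slope. Only the mean is used here, not the variance, and the function names, the common position grid for all trials, and the slope threshold are assumptions for illustration.

```python
from statistics import mean
from typing import List, Sequence


def mean_waveform(trials: Sequence[Sequence[float]]) -> List[float]:
    """Average K force waveforms sampled at the same positions."""
    return [mean(column) for column in zip(*trials)]


def division_points(
    positions: Sequence[float],
    force: Sequence[float],
    slope_threshold: float,
) -> List[float]:
    """Return division points bracketing each run of steep force change.

    The start and end of the path are always included.
    """
    points = [positions[0]]
    in_run = False
    for k in range(1, len(positions)):
        slope = (force[k] - force[k - 1]) / (positions[k] - positions[k - 1])
        steep = abs(slope) > slope_threshold
        if steep != in_run:
            # Entering a steep run: last sample before the change.
            # Leaving a steep run: first sample after the change.
            points.append(positions[k - 1])
        in_run = steep
    points.append(positions[-1])
    return sorted(set(points))


positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
trials = [
    [0.0, 0.0, 0.0, 4.0, 4.0, 4.0],
    [0.0, 0.0, 0.0, 6.0, 6.0, 6.0],
]
averaged = mean_waveform(trials)
points = division_points(positions, averaged, slope_threshold=1.0)
```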

Next, in step S107, the learning processing unit 115 determines whether the division position is defined. If the section position is not defined, the process returns to step S106. If the section position is defined, the process proceeds to step S108. Next, in step S108, the learning processing unit 115 defines a speed target value for each section. The speed target value is calculated based on the upper limit value of the force designated by the user and the target tact time.

Specifically, the learning processing unit 115 sets a standard working speed for completing the work by the target tact time as the overall speed target value Vdn. Next, the learning processing unit 115 defines the speed upper limit value Vmax based on the upper limit value of the force. The relationship between the speed when the end effector 130 collides with the work target 200 or the surrounding environment 300 and the external force applied to the end effector 130 at that time can be obtained in advance based on the rigidity information of the work target and the like. The learning processing unit 115 can obtain the speed upper limit value Vmax with reference to a table or the like storing this relationship.
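Looking up the speed upper limit value Vmax from a stored relationship between collision speed and external force can be sketched with linear interpolation. The table values below are invented for illustration, and the assumption that force increases strictly with speed is not stated in the original text.

```python
from typing import Sequence, Tuple


def speed_upper_limit(table: Sequence[Tuple[float, float]], f_lim: float) -> float:
    """Return the speed at which the tabulated force reaches f_lim.

    table: (speed, force) pairs sorted by speed, with force strictly
           increasing with speed (assumption).
    If f_lim lies above the table, the highest tabulated speed is returned.
    """
    if f_lim < table[0][1]:
        raise ValueError("force limit is below the tabulated range")
    for (v0, f0), (v1, f1) in zip(table, table[1:]):
        if f0 <= f_lim <= f1:
            # Linear interpolation between neighbouring table entries.
            return v0 + (v1 - v0) * (f_lim - f0) / (f1 - f0)
    return table[-1][0]


# Hypothetical table: collision speed [m/s] versus external force [N].
table = [(0.0, 0.0), (0.1, 10.0), (0.2, 30.0)]
v_max = speed_upper_limit(table, f_lim=20.0)
```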

The learning processing unit 115 determines the speed target value Vd using the overall speed target value Vdn and the speed upper limit value Vmax. The speed target value Vd is larger than 0 and smaller than the speed upper limit value Vmax. The speed target value Vd is set so as to gradually approach Vdn. For example, under the condition 0 < Vd < Vdn < Vmax, the learning processing unit 115 defines a plurality of speed target values Vd using random numbers so that the speed parameters are dispersed to some extent. As described above, in step S108, the learning processing unit 115 determines the speed target values Vd so that they are spread out within the determined range. Next, in step S109, the learning processing unit 115 determines whether the speed target value is defined. If the speed target value has not been defined, the process returns to step S108. If the speed target value is defined, the pre-processing ends. By the pre-processing, an initial value when performing the learning processing is determined.
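Drawing dispersed initial speed target values with random numbers could look like the sketch below. The uniform distribution, the lower floor `floor_ratio * v_dn` that keeps speeds away from zero, and the fixed seed are assumptions; the text only requires the values to be spread out under the condition on Vd, Vdn and Vmax.

```python
import random
from typing import List


def initial_speed_targets(
    n_sections: int,
    n_candidates: int,
    v_dn: float,
    v_max: float,
    floor_ratio: float = 0.1,  # hypothetical floor that keeps Vd above zero
    seed: int = 0,
) -> List[List[float]]:
    """Return n_candidates speed patterns, each with one Vd per section.

    Every Vd is drawn uniformly between floor_ratio * v_dn and v_dn, so it is
    positive and, in practice, below v_dn, which is itself below v_max.
    """
    if not 0.0 < v_dn < v_max:
        raise ValueError("expected 0 < Vdn < Vmax")
    rng = random.Random(seed)  # seeded for reproducible trials
    low = floor_ratio * v_dn
    return [
        [rng.uniform(low, v_dn) for _ in range(n_sections)]
        for _ in range(n_candidates)
    ]


candidates = initial_speed_targets(n_sections=4, n_candidates=5, v_dn=0.08, v_max=0.15)
```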

FIG. 17 is a flow chart showing an example of the flow of learning processing performed by the learning processing unit 115 according to the third embodiment of the present invention. Note that FIG. 17 also describes operations performed by blocks other than the learning processing unit 115 for the description of the operations. First, in step S201, the robot control device 111 operates the robot 120 to try a task. Next, in step S202, the command value learning unit 113b acquires trial data obtained by the trial. Next, in step S203, the storage unit 114 stores the trial data acquired in step S202.

Next, in step S204, the learning processing unit 115 determines whether M or more trial data have been acquired. Here, M is a natural number and is preset. If M or more trial data have not been acquired yet, the process returns to step S201. On the other hand, if M or more trial data have been acquired, the process proceeds to step S205. Therefore, when the process proceeds to step S205, M trial data D2j (j = 1, 2, 3, ..., M) are acquired and stored in the storage unit 114. Note that if M sets of division positions and speed target values are defined, the trials in step S201 are executed using a different set of division positions and speed target values each time. Therefore, when the process proceeds to step S205, M trial data D2j corresponding to the M sets of division positions and speed target values are stored.

Next, in step S205, the learning processing unit 115 calculates an evaluation value for each of the M pieces of trial data based on the constraint condition. The calculated evaluation values are stored. Next, in step S206, the learning processing unit 115 obtains the division positions and speed target values corresponding to the trial data having the best evaluation value among the M pieces of trial data. Next, in step S207, the learning processing unit 115 compares the best of the newly obtained M evaluation values with the evaluation values obtained in the past, and determines whether the best evaluation value has converged. If it has converged, the process proceeds to step S209, a process for completing the adjustment is performed, and the adjustment of the operation command value is completed. When the adjustment is completed, the division positions and speed target values that yielded the best evaluation value become the adjustment result of the operation command value. If the evaluation value has not yet converged, the process proceeds to step S208.

Next, in step S208, the learning processing unit 115 newly defines M sets of division positions and speed target values, thereby updating the division positions and speed target values. The M sets differ from one another in division position or speed target value. That is, in step S208, the learning processing unit 115 newly sets M sets of operation command values. Each of the M sets of operation command values has, as parameters, division positions and the speed target values corresponding to the divisions. In each set of operation command values, the number of division positions is one more than the number of divisions, and the number of speed target values equals the number of divisions. When the process of step S208 is completed, the process returns to step S201.

As described above, in the learning process, the robot system 100 performs M trials of the work based on the set division positions and the speed target value set for each section. The M trials are performed under conditions in which the division positions or the speed target values differ. Every time M trials are completed, the learning processing unit 115 updates the positions of the division points and the speed target value for each section.
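The loop of steps S201 to S209 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: `run_trial` and `evaluate` are hypothetical callbacks standing in for the robot trial and the evaluation function, a smaller evaluation value is assumed to be better, the convergence test is a simple improvement threshold, and the update in step S208 is replaced by a plain random perturbation around the best set found so far.

```python
import random


def perturb(params, rng, scale=0.1):
    """Return a new (division positions, speed targets) set near `params`.

    Assumed stand-in for step S208: inner division points and speed targets
    receive Gaussian noise; the first and last division points stay fixed.
    """
    positions, speeds = params
    span = positions[-1] - positions[0]
    inner = sorted(
        min(max(p + rng.gauss(0.0, scale) * span, positions[0]), positions[-1])
        for p in positions[1:-1]
    )
    new_speeds = [max(1e-6, v * (1.0 + rng.gauss(0.0, scale))) for v in speeds]
    return ([positions[0], *inner, positions[-1]], new_speeds)


def learn_command_values(run_trial, evaluate, initial_sets,
                         tol=1e-3, max_rounds=50, rng=None):
    """Repeat M trials per round until the best evaluation value converges."""
    rng = rng or random.Random(0)
    candidates = list(initial_sets)          # M sets of command values
    best_params, best_value = None, float("inf")
    for _ in range(max_rounds):
        # S201 to S204: run one trial per candidate set and keep the data.
        trials = [(p, run_trial(p)) for p in candidates]
        # S205: evaluation value for each of the M pieces of trial data.
        scored = [(evaluate(data), p) for p, data in trials]
        # S206: set with the best (here: smallest) evaluation value.
        round_value, round_params = min(scored, key=lambda s: s[0])
        # S207: compare with the best value obtained in the past.
        improvement = best_value - round_value
        if round_value < best_value:
            best_params, best_value = round_params, round_value
        if improvement < tol:
            break                            # S209: adjustment completed
        # S208: define M new sets around the current best.
        candidates = [perturb(best_params, rng) for _ in candidates]
    return best_params, best_value
```

In practice the perturbation step would be replaced by whichever learning or optimization method is chosen; the surrounding trial, evaluation, and convergence structure stays the same.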

FIG. 18 is a diagram showing an example of a velocity pattern at the time of trial in the robot system 100 according to the third embodiment of the present invention. FIG. 19 is a diagram showing an example of force information acquired at the time of trial in the robot system 100 according to the third embodiment of the present invention. In FIG. 18 and FIG. 19, P0 to P3 are positions of division points, and S1 to S4 indicate four divisions. Further, in FIG. 18, V1 to V4 represent speed target values in each section. FIG. 19 shows force information acquired in the trial based on the velocity pattern shown in FIG.

In the assembly operation as shown in FIG. 14, the reaction force between the first part 210 held by the end effector 130 and the second part 310 can exceed the limit value in the vicinity of the position where the parts come into contact. In this case, the amount by which the force exceeds the limit value can be evaluated as a limit excess amount. In FIG. 19, in the section S2, the magnitude F of the force detected by the force sensor 143 exceeds the limit value L0. When the magnitude F of the detected force exceeds the limit value L0, the limit excess amount DH is obtained as the difference between the magnitude F of the detected force and the limit value L0. If there is a section in which the limit excess amount DH is larger than a set threshold, the speed target value of that section needs to be adjusted.
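The limit excess amount DH can be computed per section from logged trial data, for example as in the following sketch. The function name, the data layout (one position and one force magnitude per sample), and the numbers in the example are assumptions for illustration.

```python
from bisect import bisect_right


def limit_excess_per_section(positions, forces, division_points, limit):
    """Return the largest limit excess DH = F - L0 observed in each section.

    `division_points` must be sorted; section i spans division_points[i] to
    division_points[i + 1]. Sections where F never exceeds L0 report 0.0.
    """
    n_sections = len(division_points) - 1
    excess = [0.0] * n_sections
    for x, f in zip(positions, forces):
        # Index of the section that contains position x (clamped at the ends).
        idx = min(max(bisect_right(division_points, x) - 1, 0), n_sections - 1)
        excess[idx] = max(excess[idx], f - limit)
    return excess


if __name__ == "__main__":
    dh = limit_excess_per_section(
        positions=[0.5, 1.5, 2.5, 3.5],
        forces=[2.0, 7.0, 3.0, 4.0],
        division_points=[0.0, 1.0, 2.0, 3.0, 4.0],
        limit=5.0,
    )
    threshold = 0.5  # assumed threshold on DH
    print([i for i, v in enumerate(dh) if v > threshold])  # sections to adjust
```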

In FIG. 19, the force F detected in the section S2 is large. Therefore, with respect to the velocity pattern shown in FIG. 18, the learning processing unit 115 adjusts the velocity pattern so that the velocity target value V2 in the section S2 becomes smaller. Further, the learning processing unit 115 also adjusts the positions of division points P1 and P2 which are both ends of the division S2. In the velocity pattern shown in FIG. 18, the division point P1 is a point at which the target velocity value starts to be lowered, and the division point P2 is a point at which the target velocity value starts to be increased. That is, the learning processing unit 115 also adjusts the position of the point at which the change of the speed target value is started. These adjustments are made based on constraints.

For example, when the limit value L0 is set for the magnitude F of the force as a constraint condition, an evaluation function is defined such that the evaluation value related to the magnitude F of the force is 0 for trials in which F does not exceed the upper limit L0. If the evaluation value related to the magnitude F of the force does not become 0, the learning processing unit 115 continues updating the speed target value V2 and the positions of the division points P1 and P2 to adjust the operation command value. Together with this adjustment, the evaluation function can be defined so that the work is performed as fast as possible. In FIG. 19, in the sections S1, S3 and S4, the detected force magnitude F has a margin DL with respect to the limit value L0. Here, the margin DL is the amount remaining up to the limit value L0, or an index based on that amount. The amount remaining up to the limit value L0 is defined as the difference between the limit value L0 and the magnitude F of the detected force. When the margin DL is larger than 0, the speed target value can be adjusted and updated in the increasing direction. Such adjustment makes it possible to perform the work as fast as possible.
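The use of the limit excess amount DH and the margin DL can be illustrated with a simple update rule. The proportional rule and the `gain` parameter below are assumptions chosen for clarity; the description states only that the speed target value is lowered where the limit is exceeded and may be raised where a margin remains.

```python
def adjust_speed_targets(speeds, peak_forces, limit, v_max, gain=0.5):
    """Lower the speed where F exceeds L0, raise it where a margin DL remains.

    speeds       -- current speed target value of each section
    peak_forces  -- largest force magnitude F observed in each section
    limit        -- limit value L0 (must be positive)
    v_max        -- speed upper limit value Vmax
    gain         -- assumed step-size factor between 0 and 1
    """
    updated = []
    for v, f in zip(speeds, peak_forces):
        if f > limit:
            dh = f - limit                       # limit excess amount DH
            v_new = v * (1.0 - gain * dh / f)    # slow down in proportion to DH
        else:
            dl = limit - f                       # margin DL
            v_new = v * (1.0 + gain * dl / limit)
        updated.append(min(v_new, v_max))        # never exceed Vmax
    return updated


if __name__ == "__main__":
    # Section 0 exceeds the limit, section 1 has margin (values assumed).
    print(adjust_speed_targets([0.1, 0.1], [8.0, 2.0], limit=4.0, v_max=0.2))
```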

These adjustments are performed in steps S205 and S206 in FIG. At this time, a machine learning or optimization method using an evaluation function can be applied to obtain the positions of the division points Pi and the speed target values that give the best evaluation value. Examples of such methods include reinforcement learning, Bayesian optimization, and particle swarm optimization. With these methods, an operation command value that gives the best evaluation value can be set. For example, assume that an evaluation function Fq expressed by equation (1) is defined using the force F(t) detected at each time point during the work and the work time T. By adjusting the operation command value so that the evaluation value calculated by the evaluation function Fq becomes smaller, the learning processing unit 115 can obtain an operation command value for which the force F(t) and the work time T become smaller. As shown in FIG. 17, the adjustment is completed when the evaluation value obtained by the evaluation function converges.
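Equation (1) is not reproduced in this text. Purely as an assumed example of an evaluation function that becomes smaller as both the force F(t) and the work time T become smaller, a weighted sum of the peak force and the work time could be written as below. The weights `w_force` and `w_time` and the choice of the peak value are hypothetical and should not be read as the form of equation (1).

```python
def evaluation_fq(forces, work_time, w_force=1.0, w_time=1.0):
    """Assumed evaluation function: weighted peak force plus weighted time.

    forces     -- force samples F(t) recorded during one trial
    work_time  -- work time T of the trial
    A smaller return value means a better trial.
    """
    peak_force = max(abs(f) for f in forces)
    return w_force * peak_force + w_time * work_time


if __name__ == "__main__":
    print(evaluation_fq([1.0, -3.0, 2.0], work_time=2.0))
```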

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted for each section. Therefore, the operation of the robot 120 can be adjusted so that the work as a whole is not unnecessarily slowed and so that an excessive load does not act on the operation target 200 or the surrounding environment 300, and this adjustment can be made easily. Furthermore, if a section in which the detected value of the force sensor 143 is smaller than a predetermined value is adjusted so that the operation becomes faster, the work as a whole can also be made faster.

As described above, according to the operation adjustment device 112, the operation control system 110 and the robot system 100 of the present embodiment, learning and updating the optimum operation command value for each section makes it possible to design fine-grained operation command values that could not be achieved by conventional adjustment, and as a result, high-speed, high-quality robot work can be realized. Specifically, according to the operation adjustment device 112, the operation control system 110 and the robot system 100 of the present embodiment, the work time can be shortened while the reaction force between the fitted parts is suppressed in work such as fitting of parts or insertion of connectors.

Fourth Embodiment
FIG. 20 is a block diagram showing a configuration example of the operation adjustment device 112c according to the fourth embodiment of the present invention and surrounding blocks. Other configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. FIG. 20 shows an extracted part of the configuration of the robot system 100. The operation adjustment device 112c of the present embodiment includes a command value learning unit 113b and a command value classification unit 116.

The command value classification unit 116 receives the operation command value before updating from the robot control device 111, sensor information that is a detection value of the sensor 140 from the sensor 140, and a constraint condition from the outside. Based on these inputs, the command value classification unit 116 divides the operation command value using, for example, the position of the end effector 130 or the progress rate of the command value, defines division points Pi (i = 0, 1, 2, ..., N + 1), and outputs them as classification information. The command value learning unit 113b is the same as that shown in FIG.

The operation adjustment device 112c according to the present embodiment applies, for example, machine learning to determine how a feature space is to be divided, using feature amounts of the sensor information and the constraint condition, and generates the division points Pi from the class information determined for the divided feature space. The operation adjustment device 112c performs pre-processing and learning processing as in the processing shown in FIG. FIG. 21 is a flow chart showing an example of the flow of pre-processing performed by the operation adjustment device 112c according to the fourth embodiment of the present invention. FIG. 22 is a flow chart showing an example of the flow of learning processing performed by the operation adjustment device 112c according to the fourth embodiment of the present invention.

The pre-processing shown in FIG. 21 differs from the processing shown in FIG. 16 in that the number of divisions is also defined in addition to the division positions in step S106b. For example, the divisions can be generated automatically based on waveform features. As waveform features, for example, for position data, velocity data, force data, and force change rate data acquired in time series, clustering is performed using as input the maximum value or the frequency distribution of the data in each fixed time interval Tsmp. For the clustering, a clustering method such as the k-means method, which is a type of machine learning, can be used to define a break for each characteristic portion of the waveform history. For example, assume that X kinds of waveform features are defined in this way.

The original data can then be labeled based on the acquired clusters. For example, by defining the similarity S(i) (where i = 1, 2, 3, ..., X) of the target input to each of the X existing clusters, it can be expressed as a percentage which group's attributes the input is closest to. The input can then be labeled as belonging to the group with the largest percentage. In this way, a label L(t) for each time, defined with the time t as a variable, is obtained. In step S106b, the number of divisions and the division positions can be defined by placing breaks at all or some of the points where the label changes.
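The labeling and break detection described above can be sketched as follows, assuming that cluster centroids have already been obtained (for example by k-means). Expressing the similarity S(i) as a normalized inverse distance is an assumption of this sketch, and the function names are hypothetical.

```python
def label_by_similarity(feature, centroids):
    """Return (label, shares): the closest cluster and S(i) as percentages.

    Assumption: similarity is the inverse Euclidean distance to each
    centroid, normalized so that the shares sum to 100.
    """
    dists = [sum((a - b) ** 2 for a, b in zip(feature, c)) ** 0.5
             for c in centroids]
    inverse = [1.0 / (d + 1e-9) for d in dists]
    total = sum(inverse)
    shares = [100.0 * v / total for v in inverse]
    return shares.index(max(shares)), shares


def division_points_from_labels(labels):
    """Return the time indices t at which the label L(t) changes."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]


if __name__ == "__main__":
    centroids = [(0.0,), (1.0,), (5.0,)]          # assumed cluster centers
    features = [(0.1,), (0.2,), (0.9,), (1.1,), (4.8,), (5.2,)]
    labels = [label_by_similarity(f, centroids)[0] for f in features]
    print(labels, division_points_from_labels(labels))
```

The number of divisions then follows from the number of breaks kept; using only some of the label changes, as the description allows, corresponds to filtering the returned list.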

On the other hand, the learning process shown in FIG. 22 differs from the process shown in FIG. 17 in three steps: step S211, step S212 and step S213. In step S211, a first evaluation value is obtained using an evaluation function for learning the number of divisions and the division positions, based on the sensor information, the operation command values and control parameters, and the constraint conditions. In step S212, the number of divisions and the division positions are learned and updated based on the first evaluation value. In step S213, a second evaluation value for learning the operation command value is obtained. Therefore, the learning process shown in FIG. 22 learns the operation command value after learning the number of divisions and the division positions.

By including the above processing, a framework for automatically learning the division information is added. This eliminates the need to design the division information in advance using prior knowledge, and provides the particular effect of shortening the design time.

Fifth Embodiment
FIG. 23 is a block diagram showing a configuration example of an operation adjustment device 112d according to a fifth embodiment of the present invention and peripheral blocks. Other configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. FIG. 23 shows an extracted part of the configuration of the robot system 100. The operation adjustment device 112d of the present embodiment includes a command value classification unit 116 and an operation learning unit 117. The command value classification unit 116 is similar to that shown in FIG.

FIG. 24 is a block diagram showing a configuration example of the operation learning unit 117 according to the fifth embodiment of the present invention. The operation learning unit 117 includes a command value learning unit 113b and a parameter learning unit 118. The command value learning unit 113b is the same as that shown in FIG. The operation learning unit 117 receives the operation command value and the control parameter before updating from the robot control device 111. The constraint condition is input to the operation learning unit 117 from the outside, sensor information is input from the sensor 140, and classification information is input from the command value classification unit 116. These input signals are supplied to the command value learning unit 113b and the parameter learning unit 118.

The parameter learning unit 118 does not adjust values that directly specify the robot's behavior, such as a position command value, a speed command value, or an acceleration command value. Instead, it adjusts the gain of a sensor feedback control system based on an external sensor, impedance parameters, filter design parameters, and the like. That is, the parameter learning unit 118 adjusts control parameters of the feedback control system. The parameter learning unit 118 receives the classification information, the sensor information, the constraint condition, the command value, and the control parameter as inputs, and uses them to update the input control parameter to a control parameter that satisfies the constraint condition. Machine learning can be used when updating the control parameters. As an example, the parameter learning unit 118 updates the control parameter so that the evaluation value obtained by a previously defined evaluation function becomes larger, and repeats the update until the evaluation value converges asymptotically. Depending on how the evaluation function is defined, the parameter learning unit 118 may instead update the control parameter so that the evaluation value becomes smaller.

Here, in FIG. 24, the parameter learning unit 118 is illustrated as a configuration independent of the command value learning unit 113b. However, the parameter learning unit 118 and the command value learning unit 113b do not necessarily have to perform independent processing. For example, the parameter learning unit 118 and the command value learning unit 113b can perform their processing simultaneously using a single evaluation function. The parameter learning unit 118 adjusts the control parameter for each section. Further, the number of divisions used in the command value learning unit 113b and the number of divisions used in the parameter learning unit 118 are not necessarily the same. For example, the number of divisions used in the parameter learning unit 118 may be larger than the number of divisions used in the command value learning unit 113b.

In addition, the parameter learning unit 118 can update not only control parameters in the sensor feedback control system based on the external sensor 142 but also control parameters in the feedback control system based on the internal sensor 141. As a result, higher-quality, high-speed robot work can be realized.

FIG. 25 is a block diagram showing another configuration example of the operation adjustment device 112d according to the fifth embodiment of the present invention and peripheral blocks. FIG. 25 shows a configuration example without the command value learning unit 113b. In this configuration example, the operation adjustment device 112d does not update the operation command value, but updates only the control parameter.

Sixth Embodiment
When adjusting the velocity pattern, the motion adjustment device, motion control system and robot system of the present embodiment determine an upper limit value or a lower limit value for the speed target value in each section Si, and define the search space in each section based on the rigidity of the work object and the assembly quality constraints of the work object. According to the motion adjustment device, the motion control system, and the robot system of the present embodiment, operation command values or control parameters that are feasible but would cause a problem in assembly quality are excluded from the search space and are not searched. Therefore, the search can converge on an operation command value or control parameter that realizes high-speed assembly within the range that satisfies the work quality required by the user. As a result, a remarkable effect is obtained: the adjusted robot can secure the work quality without increasing the reaction force applied to the work object and without damaging it.
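One simple way to keep candidate speed target values inside such a per-section search space is to clamp them to the section's bounds, as in the sketch below. Clamping is an assumption of this sketch; rejecting and redrawing out-of-range candidates would be an equally valid reading. The function name and the bound values are hypothetical.

```python
def clamp_to_search_space(speeds, bounds):
    """Clamp each section's speed target value to its (lower, upper) bounds.

    speeds -- candidate speed target value for each section Si
    bounds -- one (lower, upper) pair per section, e.g. derived from the
              rigidity and assembly quality constraints of the work object
    """
    if len(speeds) != len(bounds):
        raise ValueError("one (lower, upper) pair is needed per section")
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(speeds, bounds)]


if __name__ == "__main__":
    candidate = [0.30, 0.02, 0.12]
    limits = [(0.05, 0.25), (0.05, 0.10), (0.05, 0.25)]  # assumed bounds
    print(clamp_to_search_space(candidate, limits))
```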

DESCRIPTION OF SYMBOLS 100 robot system, 110 motion control system, 111 robot control device, 112, 112b, 112c, 112d motion adjustment device, 113, 113b command value learning unit, 114 storage unit, 115 learning processing unit, 116 command value classification unit, 117 operation learning unit, 120 robot, 130 end effector, 140 sensor, 141 internal sensor, 142 external sensor, 143 force sensor, 200 work object, 210 first part, 211 hole, 300 surrounding environment, 310 second part, 311 projection, 401 processor, 402 memory, 403 data bus.

Claims

1. A robot operation adjustment device used in a robot system comprising a robot on which an end effector is mounted and a robot control device that controls an operation of the robot, the robot performing work on a work object, the operation adjustment device comprising a command value learning unit that performs learning using, as an input, a force acting on the end effector detected by an external sensor included in the robot system, and adjusts an operation command value transmitted from the robot control device to the robot to control the operation of the robot.
2. The operation adjustment device according to claim 1, wherein the command value learning unit adjusts the operation command value by performing learning with a range of the force acting on the end effector as a constraint condition.
3. The operation adjustment device according to claim 2, wherein the command value learning unit adjusts the operation command value by performing learning with an upper limit of the time required for the work as a constraint condition.
4. The operation adjustment device according to any one of claims 1 to 3, wherein the operation command value is a speed command value that is a target value of the movement speed of the end effector or a target value of the operation speed of the robot.
5. The operation adjustment device according to any one of claims 2 to 4, wherein the command value learning unit adjusts the operation command value for each of a plurality of divisions into which the work is divided from start to completion.
6. The operation adjustment device according to claim 5, further comprising a command value classification unit that generates the plurality of divisions by dividing the work from start to completion, wherein the command value learning unit adjusts the operation command value for each of the divisions generated by the command value classification unit.
7. The operation adjustment device according to claim 6, wherein the command value classification unit adjusts the position of a division point for dividing the work into the divisions.
8. The operation adjustment device according to any one of the above claims, wherein the command value learning unit adjusts the operation command value by performing learning with an upper limit or a lower limit of a force, a moment, a torque, or a current value as a constraint condition.
9. The operation adjustment device according to any one of claims 2 to 8, wherein the command value learning unit adjusts the operation command value by performing learning with an upper limit or a lower limit of either the position and orientation of the robot or the relative position and orientation with respect to the surrounding environment as a constraint condition.
10. The operation adjustment device according to claim 5 or 6, wherein the command value learning unit performs evaluation based on an evaluation function every M trials of the work (M is a natural number) and adjusts the operation command value.
11. The operation adjustment device according to claim 5 or 6, further comprising a parameter learning unit that learns control parameters in at least one of feedback control based on an internal sensor provided in the robot system and feedback control based on the external sensor, wherein the parameter learning unit updates the control parameters by performing learning based on division information, which is information on the divisions, and sensor information obtained from the external sensor over a plurality of trials.
12. A robot operation adjustment device used in a robot system comprising a robot on which an end effector is mounted and a robot control device that controls an operation of the robot, the robot performing work on a work object, the operation adjustment device comprising a parameter learning unit that performs learning based on a force acting on the end effector detected by an external sensor included in the robot system, and learns control parameters in at least one of feedback control of the operation of the robot based on an internal sensor included in the robot system and feedback control of the operation of the robot based on the external sensor.
13. The operation adjustment device according to claim 12, wherein the parameter learning unit adjusts the control parameters by performing learning with a range of the force acting on the end effector as a constraint condition.
14. The operation adjustment device according to claim 13, wherein the parameter learning unit adjusts the control parameters by performing learning with an upper limit of the time required for the work as a constraint condition.
15. The operation adjustment device according to any one of claims 12 to 14, wherein the parameter learning unit adjusts the control parameters for each of a plurality of divisions into which the work is divided from start to completion.
16. The operation adjustment device according to claim 15, further comprising a command value classification unit that generates the plurality of divisions by dividing the work from start to completion, wherein the parameter learning unit adjusts the control parameters for each of the divisions generated by the command value classification unit.
17. The operation adjustment device according to claim 16, wherein the command value classification unit adjusts the position of a division point for dividing the work into the divisions.
18. A motion control system comprising: the operation adjustment device according to any one of claims 1 to 17; and a robot control device that controls the operation of the robot based on the operation command value or the control parameter adjusted by the operation adjustment device.
19. A robot system comprising: the motion control system according to claim 18; and the robot controlled by the motion control system.