WO2019098044A1 - Robot motion adjustment device, motion control system, and robot system - Google Patents

Robot motion adjustment device, motion control system, and robot system

Info

Publication number
WO2019098044A1
WO2019098044A1 PCT/JP2018/040696 JP2018040696W WO2019098044A1 WO 2019098044 A1 WO2019098044 A1 WO 2019098044A1 JP 2018040696 W JP2018040696 W JP 2018040696W WO 2019098044 A1 WO2019098044 A1 WO 2019098044A1
Authority
WO
WIPO (PCT)
Prior art keywords
robot
command value
adjustment device
motion
learning
Prior art date
Application number
PCT/JP2018/040696
Other languages
French (fr)
Japanese (ja)
Inventor
浩司 白土
高志 南本
Original Assignee
三菱電機株式会社
Priority date
Filing date
Publication date
Application filed by 三菱電機株式会社
Priority to CN201880072607.7A (CN111344120B)
Priority to DE112018005832.8T (DE112018005832B4)
Priority to JP2019523125A (JP6696627B2)
Publication of WO2019098044A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J13/00: Controls for manipulators
    • B25J13/08: Controls for manipulators by means of sensing devices, e.g. viewing or touching devices
    • B25J13/085: Force or torque sensors
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1628: Programme controls characterised by the control loop
    • B25J9/163: Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00: Program-control systems
    • G05B2219/30: NC systems
    • G05B2219/39: Robotics, robotics to robotics hand
    • G05B2219/39529: Force, torque sensor in wrist, end effector

Definitions

  • the present invention relates to industrial robots and service robots for non-manufacturing industries.
  • In particular, the present invention relates to a motion adjustment device and a motion control system that adjust the motion of a robot so that an end effector mounted on the robot reaches a target position and orientation, and to a robot system that includes the motion adjustment device and the motion control system.
  • such a robot system is used in a situation where the position and orientation of an object to be worked or the surrounding environment is unknown.
  • a robot system is used in a situation where the position and orientation of an object to be worked on or the surrounding environment changes.
  • Specific examples include a bin picking operation, an insertion operation accompanied by a surface-following (copying) motion, and a fitting operation of parts such as a connector.
  • work under variously changing environments is premised, and the motion of the robot is similarly controlled using a plurality of sensors.
  • Patent Document 1 discloses a robot system that accelerates the motion of a robot by learning.
  • An object of the present invention is to provide a motion adjustment device, a motion control system, and a robot system that can adjust the motion of the robot so that an excessive load does not act on the work target, and that make the motion of the robot easy to adjust.
  • The robot motion adjustment device is used in a robot system that includes a robot equipped with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs work on a work target.
  • The motion adjustment device includes a command value learning unit that performs learning based on the detected force acting on the end effector and adjusts the operation command value that the robot control device transmits to the robot to control the motion of the robot.
  • The motion control system includes a robot control device that is used in a robot system in which a robot equipped with an end effector performs a task on a work target, and that transmits motion command values to the robot to control the motion of the robot.
  • The motion control system also includes a command value learning unit that adjusts the motion command value by performing learning based on a force acting on the end effector detected by a sensor.
  • The robot system includes a robot on which an end effector is mounted, a robot control device that transmits an operation command value to the robot to control the operation of the robot, and a command value learning unit that performs learning using, as an input, a force acting on the end effector detected by a sensor and adjusts the operation command value.
  • In the robot system, the robot performs work on a work target.
  • the motion of the robot can be adjusted so that an excessive load does not act on the work target, and the adjustment of the motion of the robot can be facilitated.
  • FIG. 1 is a block diagram showing an example of a system configuration of a robot system 100 provided with a motion adjustment device according to a first embodiment of the present invention.
  • the robot system 100 includes a motion control system 110, a robot 120, an end effector 130, an internal sensor 141, and an external sensor 142.
  • the motion control system 110 also includes a robot control device 111 and a motion adjustment device 112.
  • The robot control device 111 is also called a robot controller.
  • the robot control device 111 transmits an operation command value for controlling the operation of the robot 120 to the robot 120 based on the detection results of the internal sensor 141 and the external sensor 142, and controls the operation of the robot 120.
  • An end effector 130 such as a robot hand is attached to the robot 120.
  • the end effector 130 directly works on the work target 200.
  • an appropriate type of end effector 130 is selected according to each operation performed by the robot system 100.
  • a surrounding environment 300 exists around the work target 200.
  • The surrounding environment 300 includes, for example, a part to which the work target 200 is to be assembled, a jig for positioning the work target 200, a tool (such as an electric screwdriver) for processing the work target 200, a parts feeder for supplying the work target 200, a belt conveyor for conveying the work target 200 to the robot 120, and the like.
  • An external sensor 142, such as a camera that images the work target 200, may also be treated as part of the surrounding environment 300. This is because the robot 120 or the end effector 130 may contact the external sensor 142 when, for example, the external sensor 142 is fixed at a predetermined position around the robot 120.
  • the operation command value output from the robot control device 111 is, for example, information indicating a target position and a target posture at each time of the end effector 130 mounted on the robot 120, that is, a position command value.
  • Because the position command value represents the target position of the end effector 130 at each time, it also determines the moving speed of the end effector 130 between those times.
  • The position command value can therefore also be regarded as a speed command value representing a target motion speed of the robot 120.
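  • As a minimal sketch of why a time series of position command values implies a speed command, the following Python function takes finite differences between consecutive target positions. The function name, the argument layout, and the use of Cartesian coordinates are assumptions made for illustration; the source does not specify how the command values are stored.

```python
import math
from typing import List, Sequence


def speeds_from_position_commands(
    times: Sequence[float],
    positions: Sequence[Sequence[float]],
) -> List[float]:
    """Return the speed implied between each pair of consecutive position commands.

    times[k] is the time of the k-th command and positions[k] is the target
    position of the end effector at that time (hypothetical data layout).
    """
    if len(times) != len(positions):
        raise ValueError("times and positions must have the same length")
    speeds = []
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        if dt <= 0:
            raise ValueError("command times must be strictly increasing")
        # Distance travelled between two consecutive targets divided by the interval.
        speeds.append(math.dist(positions[k], positions[k - 1]) / dt)
    return speeds
```

  • For N position commands the function returns N - 1 speeds, one per interval, which matches the idea that the speed is defined between the respective times.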
  • the operation command value output from the robot control device 111 may be a target operation speed of the robot 120 or a speed command value representing a target moving speed of the end effector 130.
  • the target operating speed or target moving speed is given by the speed between each point in time of the movement of the robot 120 or the speed between each point on the path.
  • the motion command value may be an acceleration command value representing a target acceleration of the motion of the robot 120 or a target acceleration of the movement of the end effector 130.
  • the motion command value may take various forms as long as it directly controls the motion of the robot 120.
  • The motion adjustment device 112 adjusts and updates the motion command value generated by the robot control device 111 in accordance with the detection result of the external sensor 142 and constraint conditions given from the outside. That is, the motion adjustment device 112 adjusts the motion of the robot 120. In other words, the motion adjustment device 112 adjusts the correspondence between the detection results of the internal sensor 141 and the external sensor 142 and the motion command value output from the robot control device 111, and updates that correspondence to reflect the adjustment result.
  • The adjustment of the operation command value can also be described as a correction or modification of the operation command value.
  • When an updated operation command value is present, the robot control device 111 outputs the updated operation command value to the robot 120.
  • the operation adjustment device 112 may update the operation command value with reference to not only the detection result of the external sensor 142 but also the detection result of the internal sensor 141.
  • the constraint conditions may be stored in advance inside the motion adjustment device 112 or the robot control device 111.
  • the robot system 100 performs two processes, an adjustment process of adjusting and updating an operation command value, and a work process of performing an operation on the work target 200 using the updated operation command value.
  • The operation of the robot system 100 thus includes an adjustment phase and a work phase.
  • the adjustment process is a process of the robot system 100 in the adjustment phase.
  • the work process is a process of the robot system 100 in the work phase.
  • the operation adjustment device 112 adjusts the operation command value so as to be an optimal operation command value in the adjustment process.
  • the adjustment process and the work process do not have to be completely separated.
  • the robot system 100 may be configured such that the motion adjustment device 112 calculates an optimal motion command value as needed, even while the work on the work target 200 is being performed.
  • the robot system 100 updates the operation command value at a predetermined timing as needed, such as when an operation command value more appropriate than the currently used operation command value is calculated. This point is the same as in the following embodiments.
  • FIG. 2 is a diagram showing an example of a specific hardware configuration for realizing the robot control device 111 and the operation adjustment device 112.
  • the robot control device 111 and the operation adjustment device 112 are realized by causing the processor 401 to execute a program stored in the memory 402.
  • the processor 401 and the memory 402 are connected by a data bus 403.
  • the memory 402 is provided with volatile memory and non-volatile memory, and temporary information is stored in the volatile memory.
  • the robot control device 111 and the operation adjustment device 112 may be configured integrally or separately.
  • the robot control device 111 and the operation adjustment device 112 may be connected via a network or the like.
  • the robot control device 111 and the operation adjustment device 112 can be realized by the same hardware configuration.
  • the operation control system 110 outputs an operation command value based on data acquired by the internal sensor 141 and the external sensor 142, and configures a control system in which the robot 120 operates following the operation command value.
  • Examples of the internal sensor 141 include a sensor for acquiring the position of a joint of the robot 120, a sensor for acquiring the operation speed of a joint, and a sensor for acquiring the current value of a motor that operates the joint.
  • the robot system 100 configures a position control system that positions the end effector 130 by the robot control device 111, the robot 120, and the internal sensor 141.
  • Examples of a sensor that acquires the position of a joint include an encoder that detects the amount of rotation of a motor, a resolver, and a potentiometer.
  • A tachometer or the like can be used as a sensor that acquires the motion speed of a joint.
  • A gyro sensor, an inertial sensor, or the like may also be used to acquire information on the robot 120 itself.
  • the robot system 100 configures a position control robot system that performs material handling work and the like.
  • the material handling operation is an operation for transferring and transporting materials and parts.
  • This position control robot system is called a feedback control system based on the internal sensor 141.
  • The control parameters of this feedback control system include a position control gain, a speed control gain, a current control gain, and the design parameters of the filter used for feedback control.
  • Examples of a filter used for feedback control include a moving average filter, a low pass filter, a band pass filter, and a high pass filter.
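  • To make the idea of filter design parameters concrete, the sketch below implements two of the filter types named above: a moving average filter, whose design parameter is the window length, and a first-order low pass filter, whose design parameter is a smoothing factor. The class names, the parameter names `window` and `alpha`, and the discrete-time exponential form of the low pass filter are assumptions for illustration, not details taken from the source.

```python
from collections import deque
from typing import Deque, Optional


class MovingAverageFilter:
    """Average of the most recent `window` samples."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._buf: Deque[float] = deque(maxlen=window)

    def update(self, x: float) -> float:
        self._buf.append(x)  # the oldest sample is dropped automatically
        return sum(self._buf) / len(self._buf)


class FirstOrderLowPass:
    """Discrete first-order low pass filter: y <- y + alpha * (x - y)."""

    def __init__(self, alpha: float) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self._alpha = alpha
        self._y: Optional[float] = None

    def update(self, x: float) -> float:
        # The first sample initializes the filter state.
        self._y = x if self._y is None else self._y + self._alpha * (x - self._y)
        return self._y
```

  • A larger window or a smaller `alpha` gives stronger smoothing of a sensor signal at the cost of more lag, which is the kind of trade-off that makes these values tuning parameters.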
  • the feedback control based on the internal sensor 141 is control for the robot 120 to operate in accordance with the operation command value. In other words, feedback control based on the internal sensor 141 is control that is performed to realize the operation command value.
  • examples of the external sensor 142 include a force sensor, a vision sensor such as a camera, a tactile sensor, a touch sensor, and the like.
  • the external sensor 142 measures the contact state and positional relationship between the robot 120 and the work target 200 or the surrounding environment 300.
  • the robot system 100 configures a sensor feedback control system based on the external sensor 142 by the robot control device 111, the operation adjustment device 112, the robot 120, and the external sensor 142.
  • The robot system 100 need not use sensor feedback control based on the sensor signal output from the external sensor 142 and may instead simply use that sensor signal as a trigger signal. In this case, the robot system 100 switches the control parameters of the feedback control based on the internal sensor 141 in response to the trigger signal.
  • a sensor feedback control system based on the external sensor 142 is constructed as an outer loop of a position control robot system.
  • The sensor feedback control system based on the external sensor 142 senses the positional relationship and the contact behavior between the robot 120 (its arm or the end effector 130) and the work target 200 or the surrounding environment 300 in terms of acceleration, velocity, position, attitude, distance, force, moment, and the like. Furthermore, the sensor feedback control system based on the external sensor 142 controls the operation of the robot 120 so as to obtain a desired positional relationship or force response based on the sensing result. In other words, the sensor feedback control system based on the external sensor 142 corrects the operation command value so as to obtain a desired positional relationship or force response.
  • The control parameters of this sensor feedback control system include a force control gain related to force control, an impedance parameter, a gain related to visual servo control, a visual impedance parameter, and the setting parameters of a filter used for feedback control.
  • a control parameter that needs adjustment may be simply referred to as a parameter hereinafter.
  • Specific examples of sensors used as the internal sensor 141 or the external sensor 142 include a current value sensor, a joint position sensor, a joint speed sensor, a temperature sensor, a distance sensor, a camera, an RGB-D sensor, a proximity sensor, a tactile sensor, and a force sensor.
  • The measurement target of the internal sensor 141 or the external sensor 142 may be the position and orientation of the robot 120, the position and orientation of the end effector 130, the position and orientation of a workpiece that is the work target 200, the position and orientation of an operator, and the like.
  • FIG. 3 is a block diagram showing an example of the configuration of the operation adjustment apparatus 112 according to Embodiment 1 of the present invention and peripheral blocks.
  • FIG. 3 shows a part of the configuration of the robot system 100 extracted.
  • the operation adjustment device 112 includes a command value learning unit 113.
  • The sensor 140 collectively represents the internal sensor 141 and the external sensor 142. As described above, various sensors 140 can be considered. However, in the robot system 100 according to the present embodiment, the sensor 140 includes at least a force sensor that detects an external force acting on the end effector 130 due to the operation of the robot 120. This force sensor is an external sensor 142. The sensor 140 likewise includes at least a force sensor in the following embodiments.
  • a force sensor measures external force acting on the end effector 130 and is used to perform force control or impedance control. Control of the force that the end effector 130 applies to the work target 200 or the surrounding environment 300 is called force control. Also, controlling the operation of the robot 120 according to the detection result of the force sensor is referred to as force control. In force control, a target work force is set, and the magnitude of the force applied to the work target 200 or the surrounding environment 300 is controlled.
  • In impedance control, impedance characteristics (spring, damper, inertia) related to the contact force generated when the end effector 130 and the work target 200 contact each other are defined and used for control.
  • A contact force may also be generated when the end effector 130 contacts the surrounding environment 300, or when the work target 200 held by the end effector 130 contacts the surrounding environment 300.
  • the impedance characteristic is represented by an impedance parameter.
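  • As an illustration only, a commonly used form of such an impedance characteristic relates the displacement of the end effector from its target position to the contact force through inertia, damper, and spring terms. The symbols M, D, K, x, and F below are assumptions introduced for this sketch; the source names the spring, damper, and inertia characteristics but does not give an equation.

```latex
M\,\ddot{x} + D\,\dot{x} + K\,x = F
```

  • Under this assumed form, x is the displacement from the target position, F is the contact force, and M, D, and K (inertia, damping, and stiffness) play the role of the impedance parameters.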
  • the motion adjustment device 112 updates the motion command value to control the motion of the robot 120 to be appropriate.
  • Constraint conditions are input to the operation adjustment device 112.
  • the constraint conditions include the upper limit value or the lower limit value of the force information detected by the force sensor.
  • In the following description, the operation command value output from the operation control system 110 is assumed to be a speed command value.
  • the speed command value is a target moving speed of the end effector 130 with respect to each point on the moving path of the end effector 130. At this time, the time-series speed command value is the speed pattern for each point.
  • the speed command value may be a target operating speed of the robot 120 for each point in time of work.
  • The switching of the target speed may also be specified by a switching time or by another switching parameter rather than by a position.
  • An example of such a parameter is the progress rate of the operation command value, based on position or time.
  • The switching position Pi of the target speed may be the point at which switching of the target speed starts or the point at which it is complete.
  • Alternatively, the switching position Pi may be a point at which the operation velocity detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target velocity.
  • FIG. 4 is a diagram for explaining the operation of the operation adjustment device 112 according to the first embodiment of the present invention. As shown in FIG. 4, consider the case where the end effector 130 mounted on the robot 120 moves from the position P0 to the position P3. A force sensor 143 is attached to the robot 120 as an external sensor 142. The force sensor 143 measures an external force acting on the end effector 130.
  • FIG. 5 is a view showing an example of a velocity pattern before update in the robot system 100 according to the first embodiment of the present invention.
  • The horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130.
  • the target velocity changes while the end effector 130 moves from P0 to P3.
  • the motion adjustment device 112 updates the speed pattern based on the detection result of the force sensor 143.
  • FIG. 6 is a flow chart showing an example of the process flow of the operation control system 110 according to the first embodiment of the present invention.
  • the robot control device 111 determines an initial value of the velocity pattern.
  • In step S11, the robot control device 111 controls the operation of the robot 120 and performs a trial of the task.
  • When the adjustment process and the work process are not completely separated, as described above, a part of the normal work in the robot system 100 may be treated as a trial.
  • In step S12, the operation adjustment device 112 determines whether the constraint condition is satisfied. That is, it determines whether the detection value of the force sensor 143 falls within the range between the upper limit value and the lower limit value defined by the constraint condition, and whether the constraint on the working time is satisfied.
  • the operation adjustment device 112 may use not the detection value itself of the force sensor 143 but an evaluation value obtained by calculation from the detection value. As an example of the evaluation value, an evaluation value calculated with an evaluation function using the detection value of the force sensor 143 and the tact time as an input can be considered.
  • the operation adjustment device 112 may determine whether the evaluation value is within the limit range.
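  • The check in step S12 can be sketched as follows. The `Constraint` class, its field names, and the weighted-sum form of the evaluation function are assumptions chosen for illustration; the source says only that the force limits and the working time are constraints and that an evaluation value may be computed from the force detection value and the tact time.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Constraint:
    force_upper: float                       # upper limit of the detected force
    force_lower: float = float("-inf")       # lower limit (unbounded by default)
    max_work_time: Optional[float] = None    # optional limit on the working time


def constraint_satisfied(
    forces: Sequence[float], work_time: float, c: Constraint
) -> bool:
    """Return True if every force sample and the working time are within limits."""
    force_ok = all(c.force_lower <= f <= c.force_upper for f in forces)
    time_ok = c.max_work_time is None or work_time <= c.max_work_time
    return force_ok and time_ok


def evaluation_value(
    forces: Sequence[float],
    tact_time: float,
    w_force: float = 1.0,
    w_time: float = 1.0,
) -> float:
    """Hypothetical evaluation function: weighted sum of peak force and tact time.

    `forces` must contain at least one sample.
    """
    return w_force * max(abs(f) for f in forces) + w_time * tact_time
```

  • With an evaluation function of this kind, the decision in step S12 reduces to comparing a single number against a limit range, as the text above describes.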
  • When it is determined in step S12 that the constraint condition is satisfied, the process of the operation control system 110 is temporarily ended, and the work is thereafter performed with the updated speed pattern. On the other hand, when it is determined in step S12 that the constraint condition is not satisfied, the process of the operation control system 110 proceeds to step S13.
  • In step S13, the operation adjustment device 112 adjusts and updates the speed pattern.
  • In step S13, the operation adjustment device 112 calculates, for example, a correction coefficient and adjusts the velocity pattern by multiplying the velocity pattern used in the trial by that coefficient.
  • the operation control system 110 according to the first embodiment of the present invention performs the above processing. As described above, the operation control system 110 according to the first embodiment of the present invention adjusts the speed pattern in a learning manner based on data obtained by a plurality of trials. In other words, the motion control system 110 according to the first embodiment of the present invention adjusts the speed pattern, which is the motion command value, using machine learning or optimization.
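  • The trial, check, and update cycle of steps S11 to S13 can be sketched as a simple loop. This is not the learning method of the source, which is described only as machine learning or optimization; the proportional rule used here for the correction coefficient (force limit divided by peak force) is a stand-in assumption, and `run_trial` is a hypothetical callback that executes one trial and returns the force samples.

```python
from typing import Callable, List, Sequence


def adjust_speed_pattern(
    initial_pattern: Sequence[float],
    run_trial: Callable[[List[float]], Sequence[float]],
    force_limit: float,
    max_trials: int = 20,
) -> List[float]:
    """Scale the whole speed pattern until the peak force is within the limit."""
    if force_limit <= 0:
        raise ValueError("force_limit must be positive")
    pattern = list(initial_pattern)
    for _ in range(max_trials):
        forces = run_trial(pattern)               # S11: try the task
        peak = max(abs(f) for f in forces)
        if peak <= force_limit:                   # S12: constraint satisfied
            return pattern
        coeff = force_limit / peak                # S13: assumed correction coefficient
        pattern = [v * coeff for v in pattern]    # multiply the trial pattern
    raise RuntimeError("no pattern satisfying the constraint was found")
```

  • For example, with a toy model in which the force is twice the speed, calling `adjust_speed_pattern([10.0, 20.0, 10.0], lambda p: [2.0 * v for v in p], 20.0)` halves the pattern after one failed trial and then stops.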
  • Although the upper limit value of the working time is included in the constraint condition in the above description, this is not essential, and another condition may be used. Instead of giving the upper limit value of the working time as a constraint condition, the condition may be that the working time is the shortest possible once the other conditions are satisfied. Furthermore, although the above description covers the case where the operation control system 110 updates the operation command value so as to satisfy the given constraints, a configuration in which the operation control system 110 adjusts and updates a control parameter is also conceivable. Furthermore, although FIG. 1 shows a configuration example in which the robot control device 111 and the operation adjustment device 112 are separately provided, the robot control device 111 may be configured to incorporate the operation adjustment device 112.
  • the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted such that the detection value of the force sensor 143 falls within a predetermined range.
  • the detection value of the force sensor 143 represents the magnitude of the external force acting on the end effector 130.
  • the detection value of the force sensor 143 is information representing the magnitude of the force applied to the work target 200 or the surrounding environment 300 due to the operation of the robot 120.
  • Therefore, the operation of the robot 120 can be adjusted so that the force applied to the work target 200 or the surrounding environment 300 has an appropriate magnitude, that is, so that an excessive load does not act on the work target 200 or the surrounding environment 300, and the adjustment of the operation of the robot 120 is facilitated.
  • Although the motion adjustment device 112, the motion control system 110, and the robot system 100 use the magnitude of the force detected by the force sensor 143 as a constraint condition, it is also possible to detect a moment, a torque, a current value, or the like and to use upper and lower limits on any of these as constraints. In this way, limit values can be set for the contact situation between the robot 120 or the end effector 130 and the outside world, and an operation command value can be searched for within a desired range. As a result, an operation that does not damage the work target 200 can be realized.
  • The relative position and orientation with respect to the surrounding environment 300 and the position and orientation of the robot 120 can also be added as constraint conditions.
  • robot work with reduced interference with the surrounding environment 300 can be realized while realizing high-quality work.
  • the effects described above are similarly obtained in the other embodiments.
  • the configurations of the operation adjustment device, the operation control system, and the robot system of the present embodiment are the same as those shown in FIG.
  • The motion adjustment device 112, the motion control system 110, and the robot system 100 according to the present embodiment divide the motion command given to the robot 120 for a series of tasks into a plurality of sections and adjust the motion command value for each section. In the following description, it is assumed that the operation command value output from the operation control system is a speed command value.
  • FIG. 7 is a diagram for explaining the operation of the operation adjustment device 112 according to the second embodiment of the present invention. As shown in FIG. 7, an operation of moving the end effector 130 mounted on the robot 120 from the position P0 to the position P3 is considered. Position P0 which is an initial position is a start point of work, and position P3 is an end point of work. The end effector 130 passes through the positions P1 and P2 while moving from the position P0 to the position P3.
  • the path from the start point of work to the end point of work is divided into a plurality of sections.
  • the operation of the robot 120 from the start of one operation to the end of the operation is divided into a plurality of sections.
  • Section S1 runs from the position P0 to the position P1, section S2 from the position P1 to the position P2, and section S3 from the position P2 to the position P3.
  • The target moving speeds of sections S1, S2, and S3 are V1, V2, and V3, respectively.
  • the robot system 100 according to the present embodiment adjusts and updates the operation command value for each of the divided sections. Specifically, the robot system 100 adjusts the target moving speed of the section S1, the target moving speed of the section S2, and the target moving speed of the section S3.
  • Positions P1 and P2, which are the division points between sections, are determined in advance according to the content of the work.
  • Positions P1 and P2 are positions at which the sections switch, and may be referred to as switching positions.
  • Although the number of sections is three in this example, it is not necessarily limited to three.
  • Although the sections are defined spatially by position here, the work may instead be divided temporally, from the start time of the work to the end time of the work.
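  • A section-wise speed pattern of this kind can be represented as a list of switching positions and a list of per-section target speeds. The sketch below looks up the target speed for a given position; treating position as a scalar distance along the path and switching to the next section exactly at the switching position are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def target_speed(
    position: float,
    switch_positions: Sequence[float],
    section_speeds: Sequence[float],
) -> float:
    """Return the target speed of the section that contains `position`.

    switch_positions holds the sorted division points (for example P1 and P2),
    and section_speeds holds one speed per section (for example V1, V2, V3).
    """
    if len(section_speeds) != len(switch_positions) + 1:
        raise ValueError("need exactly one more speed than switching positions")
    # bisect_right counts how many switching positions are <= position,
    # which is the index of the current section.
    return section_speeds[bisect_right(switch_positions, position)]
```

  • The same lookup works for a temporal division if the switching positions are replaced by switching times and the position argument by the elapsed time.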
  • the upper limit value Flim of the detection result of the force sensor 143 is given to the operation control system 110 according to the present embodiment as a constraint condition.
  • The flow of processing of the operation control system 110 according to the present embodiment is basically the same as the flow shown in FIG. However, the speed pattern is adjusted for each section.
  • the robot control device 111 determines an initial value of the velocity pattern.
  • FIG. 8 is a diagram showing an example of initial values of velocity patterns in the robot system 100 according to Embodiment 2 of the present invention.
  • the horizontal axis is the position P of the end effector 130, and the vertical axis is the target moving velocity V of the end effector 130.
  • FIG. 9 is a diagram showing an example of detection values of force sensor 143 in robot system 100 according to the second embodiment of the present invention.
  • The horizontal axis is the position P of the end effector 130, and the vertical axis is the detection value F of the force sensor 143.
  • FIG. 9 shows values detected by the force sensor 143 when the robot 120 is operated at the initial value of the velocity pattern shown in FIG.
  • In step S12, the operation adjustment device 112 determines whether the constraint condition is satisfied. That is, it determines whether the detected value of the force sensor 143 in each section is equal to or less than the upper limit Flim defined by the constraint condition. As the detection value of the force sensor 143 used for this determination, for example, the maximum of the detection values of the force sensor 143 in each section is used. When the detection value of the force sensor 143 is equal to or less than Flim in all the sections, the operation adjustment device 112 determines that the constraint condition is satisfied. On the other hand, if there is at least one section in which the detection value of the force sensor 143 exceeds the upper limit Flim, the operation adjustment device 112 determines that the constraint condition is not satisfied.
  • When it is determined in step S12 that the constraint condition is satisfied, the process of the operation control system 110 is temporarily ended, and the work is thereafter performed with the updated speed pattern. On the other hand, when it is determined in step S12 that the constraint condition is not satisfied, the process of the operation control system 110 proceeds to step S13.
  • In step S13, the operation adjustment device 112 adjusts the velocity pattern so as to reduce the target velocity of each section in which the detection value of the force sensor 143 exceeds the upper limit Flim, and updates the velocity pattern.
  • FIG. 10 is a diagram showing an example of the updated velocity pattern in the robot system 100 according to the second embodiment of the present invention.
  • the horizontal axis is the position P of the end effector 130
  • the vertical axis is the target moving velocity V of the end effector 130.
  • FIG. 11 is a diagram showing another example of the updated velocity pattern in the robot system 100 according to the second embodiment of the present invention.
  • the horizontal axis is the position P of the end effector 130
  • the vertical axis is the target moving velocity V of the end effector 130.
  • the dividing points P1 and P2 are positions where the target speed is switched.
  • the division points P1 and P2 may be start points of switching of the target speed, or may be completion points of switching of the target speed. Further, the dividing points P1 and P2 may be points that guarantee that the operation speed detected by the internal sensor 141 falls within a predetermined error range from the target speed.
  • The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted for each section. Because only the sections in which the detection value of the force sensor 143 is larger than the predetermined value are slowed down, the work as a whole is not unnecessarily delayed, the motion of the robot 120 can be adjusted so that no excessive load acts on the work target 200 or the surrounding environment 300, and the adjustment of the motion of the robot 120 is made easier. Furthermore, if the sections in which the detection value of the force sensor 143 is smaller than the predetermined value are adjusted so that the motion becomes faster, the work as a whole can also be made faster.
  • By learning and updating the optimum operation command value for each section, it becomes possible to design fine-grained motion command values that could not be realized by conventional adjustment, and as a result, high-speed, high-quality robot work can be realized.
  • FIG. 12 is a block diagram showing a configuration example of the operation adjustment device 112b according to the third embodiment of the present invention and peripheral blocks.
  • FIG. 12 shows a part of the configuration of the robot system 100 extracted.
  • the operation adjustment device 112 b includes a command value learning unit 113 b.
  • the configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. 1 except that the motion adjustment device 112 is replaced with the motion adjustment device 112 b.
  • the operation adjustment apparatus 112b according to the present embodiment is different from the operation adjustment apparatus 112 according to the second embodiment in that division information is input.
  • the division information includes information on the initial value of the division position and the initial value of the operation command value in each division.
  • the division position is the position of division points Pi at both ends of each division, and is, for example, a position where the target value of the operation speed is switched.
  • When the internal sensor 141 or the external sensor 142 detects that the end effector 130 has reached a predetermined position, the target value of the operation speed is switched.
  • the motion adjustment device, motion control system, and robot system of the present embodiment adjust the motion command value for each section, as in the second embodiment.
  • The command value learning unit 113b adjusts the command values so that sections in which a collision or the like occurs are performed at low speed and the other sections are performed at high speed.
  • the command value learning unit 113b automatically learns an operation command value corresponding to each section.
  • the operation adjustment device 112 b is described as adjusting only the operation command value without adjusting the control parameter.
  • The command value learning unit 113b receives the classification information, the constraint condition, the detection value of the sensor 140, and the motion command value before updating, and uses them to update the motion command value of each section.
  • the division information is defined to divide the operation command value into N divisions.
  • N is a natural number.
  • the start point and the end point of the operation are also included in the division point, and the start point is P0.
  • In the motion adjustment device, motion control system, and robot system of the present embodiment, it is assumed that divisions are defined each time the work state changes.
  • the dividing points Pi are defined before and after a contact phenomenon occurs between parts to be fitted, and before and after a change in contact state.
  • the division point Pi is defined in accordance with the change in the expected contact state, and the speed of the entire operation can be increased by changing the target value of the operation command such as the position, velocity, or acceleration suitable for each.
  • the position of the dividing point Pi and the command value pattern of each division Si are defined from past trial information.
  • the constraint condition input to the command value learning unit 113b is a condition that defines the boundary between the success and failure of the work with respect to the speeded-up work.
  • the end effector 130 strongly collides with the work target 200 due to an error in position control of the end effector 130 or the like. If a strong collision occurs, the end effector 130 or the work target 200 may be damaged, resulting in work failure.
  • When the user defines the constraint conditions at design time, or the constraint conditions are defined from past trial data, operation command values for performing high-speed, low-impact work can be generated.
  • Constraint conditions include a position limit range, a posture limit range, upper and lower limit values of the operating speed, upper and lower limit values of the force, upper and lower limit values of the moment, and the like.
  • As constraint conditions, limit values such as upper or lower limits can be entered for the position and orientation of the robot 120, for the relative position and orientation of the robot 120 and the work target 200, or for the relative position and orientation of the robot 120 and the surrounding environment 300.
  • Data acquired by the internal sensor 141 or the external sensor 142 is referred to as sensor information.
  • pre-processing such as processing for removing noise by filter processing and processing for extracting only a value that exceeds a threshold is performed as necessary.
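The two kinds of pre-processing mentioned above can be sketched as follows. The text does not specify the filter type, so the trailing moving average used here is an assumption, as are the function names.

```python
def moving_average(samples, window):
    """Smooth a sampled signal with a trailing moving average.

    The moving average stands in for the unspecified noise filter.
    """
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def above_threshold(samples, threshold):
    """Keep only (index, value) pairs whose value exceeds the threshold."""
    return [(i, s) for i, s in enumerate(samples) if s > threshold]
```

A typical use would be to filter the raw force waveform first and then extract the samples that exceed a threshold, so that later stages only see the significant contact events.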
  • the operation command value refers to a control command value that can be input to the position control system of the robot system 100.
  • the operation command value may be simply referred to as a command value.
  • the motion of the robot 120 is controlled by the motion of the motor of each axis.
  • the operation command value includes, for example, a position command value for controlling the operation of the motor, a speed command value, a current command value, and the like.
  • the motion adjustment device 112 b can equivalently generate time-series data of the position command value from the velocity pattern generated from the profile representing the relationship between time and velocity, and input it to the robot control device 111.
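Generating position command values from a time-velocity profile amounts to integrating the velocity over time. The sketch below uses a simple forward-Euler sum; the function name, the fixed sampling period `dt`, and the one-dimensional position are assumptions made for illustration.

```python
def position_commands(velocity_profile, dt, p0=0.0):
    """Integrate a sampled velocity profile into position command values.

    velocity_profile: target velocity at each control period.
    dt:               control period in seconds (assumed constant).
    p0:               starting position.
    """
    positions = [p0]
    for v in velocity_profile:
        # Forward-Euler step: next position = current position + v * dt.
        positions.append(positions[-1] + v * dt)
    return positions
```

The resulting list is the kind of time-series data that could be passed to a position control system in place of the velocity pattern itself.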
  • the motion command value can also be generated inside the robot control device 111.
  • The operation adjustment device 112b takes out the command value inside the robot control device 111, and adjusts and updates the operation command value according to the sensor information obtained as a response when the robot 120 performs the work. This point is the same as in the other embodiments.
  • the operation command value output from the operation control system is described as the speed command value.
  • a configuration may be considered in which the motion adjustment device 112 b passes not the motion command value itself but parameters necessary for generating the motion command value to the robot control device 111.
  • The motion adjustment device 112b can also input to the robot control device 111 only the division positions and the target value of the motion speed in each division. In this case, the robot control device 111 generates an operation command value based on the input division positions and target values of the motion speed.
  • the operation adjustment device 112 b includes a command value learning unit 113 b.
  • the command value learning unit 113b adjusts and updates the operation command value.
  • the command value learning unit 113b obtains a new operation command value based on the classification information, the constraint condition, the operation command value before updating, and the detection value of the sensor 140.
  • The command value learning unit 113b is designed to evaluate the speed and the quality of the work using an evaluation function, and to search for a high-speed operation that is unlikely to damage the work target 200.
  • the operation adjustment device 112 b may also be configured to adjust and update control parameters used by the robot control device 111. Adjustment and update of control parameters are also performed by the command value learning unit 113b.
  • FIG. 13 is a block diagram showing a configuration example of the command value learning unit 113b according to the third embodiment of the present invention and peripheral blocks.
  • FIG. 13 shows a part of the configuration of the robot system 100 extracted.
  • the command value learning unit 113 b includes a storage unit 114 and a learning processing unit 115.
  • An example of a search method in the command value learning unit 113b will be described with reference to FIG.
  • Here, N = 4.
  • a speed target value which is a value of the target speed defined in each section is used as the operation command value.
  • the command value learning unit 113b realizes high-speed work by adjusting the speed target value in each section.
  • FIG. 14 is a diagram showing an example of operations performed by the robot system 100 according to the third embodiment of the present invention. As shown in FIG. 14, the robot system 100 performs an operation of inserting the first part 210 into the second part 310.
  • FIG. 14 illustrates the change in relative position between the first part 210 and the second part 310 as the work progresses, and in the order of (a), (b), (c), (d) It shows how work is progressing.
  • the first part 210 corresponds to the work target 200
  • the second part 310 corresponds to the surrounding environment 300.
  • the first component 210 is provided with a hole 211.
  • the second component 310 is provided with a protrusion 311.
  • the protrusion 311 is inserted into the hole 211.
  • the first part 210 is made of a first material.
  • the second part 310 includes a portion 312 formed of the first material and a portion 313 formed of the second material.
  • the contact portion and the contact state between the parts change.
  • the contact state includes the material of each part of the contact portion, the size of the contact portion, and the like.
  • the change in the contact state changes the frictional force generated at the contact portion.
  • a friction force between the outer shapes of the first component 210 and the second component 310 is generated.
  • the contact between the hole 211 and the protrusion 311 is further added, so the frictional force is increased. Due to the change in the frictional force generated between the parts, the detection result of the force sensor 143 also changes. That is, in the fitting work of the parts, the insertion work of the connector, etc., the reaction force between the parts changes in accordance with the progress of the work.
  • the force sensor detects the reaction force between the parts.
  • the command value learning unit 113 b stores the force information detected by the sensor 140 and the velocity pattern acquired from the robot control device 111 in the storage unit 114. It is assumed that the robot system 100 can operate the robot 120 by designating a velocity pattern when attempting to perform an operation for adjusting the operation command value.
  • the learning processing unit 115 updates the velocity pattern based on the force information, the velocity pattern, the classification information, and the constraint condition stored in the storage unit 114, and outputs the velocity pattern to the robot control apparatus 111 as the offline processing.
  • When adjusting the operation command value, the operation adjustment device 112b causes the work to be tried not only with one reference velocity pattern but also with a plurality of other velocity patterns. As a result, when adjusting the operation command value, the robot system 100 performs trials under various conditions.
  • the operation adjustment device 112 b adjusts the operation command value based on data obtained in trials under various conditions. For example, the robot system 100 performs the trial Na times with different operation command values including operation command values different from the operation command value stored in the robot control device 111.
  • The operation adjustment device 112b takes as input the data obtained from Na trials, performs learning once, and updates the operation command value. With Na trials counted as one set, in many cases the operation command value converges after Nb sets of trials and no further improvement occurs.
  • Na and Nb are integers of 1 or more.
  • a trial is performed using one or more set operation command values, and an evaluation value is generated based on the obtained force sensor data.
  • The operation adjustment device 112b updates the operation command value based on each evaluation value. In updating the operation command value, the operation adjustment device 112b generates one or more operation command values and executes the trial again. When there is a single operation command value, the operation adjustment device 112b ends the update of the operation command value once the plotted evaluation values have converged.
  • The motion adjustment device 112b then ends the update of the motion command value. In this case, when a plurality of motion command values have been updated, the motion adjustment device 112b adopts the motion command value that gives the minimum evaluation value.
  • FIG. 15 is a flowchart showing an example of a process flow of the learning processing unit 115 according to the third embodiment of the present invention. As shown in FIG. 15, first, in step S100, the learning processing unit 115 performs preprocessing as a preparation stage. Next, in step S200, the learning processing unit 115 performs a learning process.
  • FIG. 16 is a flowchart showing an example of the flow of preprocessing performed by the learning processing unit 115 according to the third embodiment of the present invention. Note that FIG. 16 also describes operations performed by blocks other than the learning processing unit 115 in order to explain the operations.
  • the robot control device 111 sets control parameters for performing force sense control.
  • the robot control device 111 operates the robot 120 to try a task.
  • the command value learning unit 113b acquires data obtained in the trial.
  • the data obtained by each trial is called trial data.
  • the trial data includes force information detected in each trial and a velocity pattern used in each trial.
  • the force information is time-series data acquired by the force sensor 143 at predetermined time intervals in each trial, and is also referred to as a force waveform.
  • the storage unit 114 stores the data acquired in step S103.
  • In step S105, the learning processing unit 115 determines whether K or more pieces of trial data have been acquired.
  • the learning processing unit 115 defines the division position based on the K trial data stored in the storage unit 114.
  • the division position is the position of division points at both ends of each division.
  • the position of the dividing point corresponds, for example, to the position of the end effector 130.
  • the position of the dividing point is the switching position of the speed target value.
  • the position of the dividing point may be a start point of switching of the target speed, or may be a completion point of switching of the target speed.
  • the position of the dividing point may be a point at which it is ensured that the operation speed detected by the internal sensor 141 falls within a predetermined error range from the target speed.
  • the learning processing unit 115 defines, for example, the division position based on the average and the variance of the K trial data.
  • the learning processing unit 115 can automatically determine the position of the dividing point by setting the dividing point before and after the force waveform largely changes, paying attention to the rate of change of the force waveform. Alternatively, the user can manually determine the point at which the state change occurs according to the work content as the division point.
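The automatic placement of dividing points from the rate of change of the force waveform can be sketched as follows. The threshold on |dF/dP| and the function name are assumptions; the text only states that points are set before and after the waveform changes greatly.

```python
def find_division_points(positions, forces, rate_threshold):
    """Return positions just before and just after steep force changes.

    positions, forces: equally long sampled sequences (force versus
                       end-effector position).
    rate_threshold:    |dF/dP| above which the change counts as "large"
                       (an assumed tuning parameter).
    """
    points = []
    was_steep = False
    for i in range(1, len(forces)):
        dp = positions[i] - positions[i - 1]
        if dp == 0:
            continue  # skip repeated position samples
        steep = abs((forces[i] - forces[i - 1]) / dp) > rate_threshold
        if steep != was_steep:
            # The waveform starts or stops changing rapidly here.
            points.append(positions[i - 1])
            was_steep = steep
    return points
```

For a waveform that is flat, rises sharply, and is flat again, the function returns one point before the rise and one after it, which matches the idea of bracketing each change in contact state.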
  • In step S107, the learning processing unit 115 determines whether the division position is defined. If the division position is not defined, the process returns to step S106. If the division position is defined, the process proceeds to step S108. Next, in step S108, the learning processing unit 115 defines a speed target value for each section. The speed target value is calculated based on the upper limit value of the force designated by the user and the target tact time.
  • the learning processing unit 115 sets a standard working speed for completing the work by the target tact time as the overall speed target value Vdn.
  • the learning processing unit 115 defines the speed upper limit value Vmax based on the upper limit value of the force.
  • the relationship between the speed when the end effector 130 collides with the work target 200 or the surrounding environment 300 and the external force applied to the end effector 130 at that time can be obtained in advance based on the rigidity information of the work target and the like.
  • the learning processing unit 115 can obtain the speed upper limit value Vmax with reference to a table or the like storing this relationship.
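A table lookup of this kind can be sketched as follows. The table layout, the linear interpolation between entries, and the function name are assumptions; the text only says that a stored relationship between collision speed and external force is consulted.

```python
def speed_upper_limit(table, force_limit):
    """Look up the speed upper limit Vmax for a given force upper limit.

    table: (speed, force) pairs sorted by increasing speed, with force
           strictly increasing (hypothetical, e.g. derived from the
           rigidity information of the work target).
    """
    if force_limit <= table[0][1]:
        return table[0][0]
    for (v0, f0), (v1, f1) in zip(table, table[1:]):
        if f0 <= force_limit <= f1:
            # Linear interpolation between neighbouring table entries.
            return v0 + (v1 - v0) * (force_limit - f0) / (f1 - f0)
    # The limit is above every tabulated force: use the fastest entry.
    return table[-1][0]
```

Clamping to the last table entry is a conservative choice for this sketch, since it avoids extrapolating beyond speeds for which the force relationship is known.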
  • the learning processing unit 115 determines the target velocity Vd using the entire velocity target value Vdn and the velocity upper limit value Vmax.
  • the speed target value Vd is larger than 0 and smaller than the speed upper limit value Vmax.
  • The target velocity Vd is set so as to gradually approach Vdn. For example, under the condition 0 < Vd < Vdn < Vmax, the learning processing unit 115 defines a plurality of speed target values Vd using random numbers so that the speed parameters are dispersed to some extent.
  • The learning processing unit 115 determines the speed target values Vd so that they are spread out within the determined range.
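Drawing dispersed initial speed target values with random numbers can be sketched as follows. The uniform distribution, the sampling bounds of 20% and 95% of Vdn, and the function name are assumptions; the text only requires that each Vd lie above 0 and below Vdn, with Vdn below Vmax.

```python
import random


def sample_speed_targets(v_dn, v_max, n_sections, n_sets, seed=0):
    """Draw n_sets candidate lists of per-section speed targets Vd."""
    if not 0.0 < v_dn < v_max:
        raise ValueError("expected 0 < Vdn < Vmax")
    rng = random.Random(seed)
    # Sample strictly inside (0, Vdn); the 20 % and 95 % bounds are
    # illustrative and keep the values away from both ends of the range.
    low, high = 0.2 * v_dn, 0.95 * v_dn
    return [[rng.uniform(low, high) for _ in range(n_sections)]
            for _ in range(n_sets)]
```

Each inner list is one candidate velocity pattern, so trying all of them gives the spread of initial conditions that the learning process starts from.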
  • In step S109, the learning processing unit 115 determines whether the speed target value is defined. If the speed target value has not been defined, the process returns to step S108. If the speed target value is defined, the pre-processing ends. The pre-processing determines the initial values used when performing the learning process.
  • FIG. 17 is a flow chart showing an example of the flow of learning processing performed by the learning processing unit 115 according to the third embodiment of the present invention. Note that FIG. 17 also describes operations performed by blocks other than the learning processing unit 115 for the description of the operations.
  • the robot control device 111 operates the robot 120 to try a task.
  • the command value learning unit 113b acquires trial data obtained by the trial.
  • the storage unit 114 stores the trial data acquired in step S202.
  • In step S204, the learning processing unit 115 determines whether M or more pieces of trial data have been acquired.
  • In step S205, the learning processing unit 115 calculates an evaluation value for each of the M pieces of trial data based on the constraint condition. The calculated evaluation values are stored.
  • In step S206, the learning processing unit 115 obtains the division positions and speed target values corresponding to the trial data having the best evaluation value among the M pieces of trial data.
  • In step S207, the learning processing unit 115 compares the best of the newly obtained M evaluation values with the evaluation values obtained in the past, and determines whether the best evaluation value has converged. If it has converged, the process proceeds to step S209, a process for completing the adjustment is performed, and the adjustment of the operation command value is completed. When the adjustment is completed, the division positions and speed target values that gave the best evaluation value become the adjustment result of the operation command value. On the other hand, if the value has not yet converged, the process proceeds to step S208.
  • In step S208, the learning processing unit 115 newly defines M sets of division positions and speed target values, and updates the division positions and speed target values. The M sets differ from one another in division position or speed target value. That is, in step S208, the learning processing unit 115 newly sets M sets of operation command values. Each of the M sets of operation command values has, as parameters, division positions and the speed target values corresponding to them. In each set of operation command values, there is one more division position than the number of divisions, and there are as many speed target values as divisions.
  • the robot system 100 performs M times of trial work based on the set segment position and the speed target value set for each segment.
  • the trial operation M times is performed under the condition that the division position or the speed target value is different. Every time M trials are completed, the learning processing unit 115 updates the position of the dividing point for each section and the speed target value for each section.
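The loop of M trials, evaluation, selection of the best set, and convergence check can be sketched as follows. The random perturbation used to generate new candidate sets is a stand-in chosen for brevity and is not the search method of the embodiment; `run_trial`, `step`, `tol`, and `max_rounds` are hypothetical names and parameters.

```python
import random


def learn_command_values(run_trial, initial, m=5, tol=1e-3,
                         max_rounds=50, step=0.1, seed=0):
    """Repeat sets of M trials until the best evaluation value converges.

    run_trial: callable taking a parameter list (division positions and
               speed targets flattened together) and returning an
               evaluation value, smaller being better (assumed).
    """
    rng = random.Random(seed)

    def perturb(params):
        # Define M candidate sets that differ from one another (S208).
        return [[p + rng.uniform(-step, step) for p in params]
                for _ in range(m)]

    candidates = perturb(initial)
    best_params, best_value = None, None
    for _ in range(max_rounds):
        values = [run_trial(c) for c in candidates]           # S201-S205
        i = min(range(m), key=values.__getitem__)             # S206
        converged = (best_value is not None
                     and abs(best_value - values[i]) < tol)   # S207
        if best_value is None or values[i] < best_value:
            best_params, best_value = candidates[i], values[i]
        if converged:
            break                                             # S209
        candidates = perturb(best_params)
    return best_params, best_value
```

In a real system `run_trial` would operate the robot and score the recorded force waveform; in a quick check it can be replaced by any scalar function of the parameters.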
  • FIG. 18 is a diagram showing an example of a velocity pattern at the time of trial in the robot system 100 according to the third embodiment of the present invention.
  • FIG. 19 is a diagram showing an example of force information acquired at the time of trial in the robot system 100 according to the third embodiment of the present invention.
  • P0 to P3 are positions of division points, and S1 to S4 indicate four divisions.
  • V1 to V4 represent speed target values in each section.
  • FIG. 19 shows force information acquired in the trial based on the velocity pattern shown in FIG.
  • The reaction force between the end effector 130 holding the first part 210 and the second part 310 can become larger than the limit value in the vicinity of the position where contact between the parts occurs. In this case, the amount by which the force exceeds the limit value can be evaluated as the limit excess amount.
  • the magnitude F of the force detected by the force sensor 143 exceeds the limit value L0.
  • The limit excess amount DH is obtained as the difference between the magnitude F of the detected force and the limit value L0. When there is a division in which the limit excess amount DH is larger than a set threshold, the speed target value of that division needs to be adjusted.
  • the learning processing unit 115 adjusts the velocity pattern so that the velocity target value V2 in the section S2 becomes smaller. Further, the learning processing unit 115 also adjusts the positions of division points P1 and P2 which are both ends of the division S2.
  • the division point P1 is a point at which the target velocity value starts to be lowered
  • the division point P2 is a point at which the target velocity value starts to be increased. That is, the learning processing unit 115 also adjusts the position of the point at which the change of the speed target value is started.
  • limiting value L0 is set to the magnitude F of force as a constraint condition
  • evaluation is made such that the evaluation value related to the magnitude F of the force is 0 for trials that do not exceed the limiting value L0 as the upper limit.
  • the learning processing unit 115 continues updating the positions of the speed target value V2 and the division points P1 and P2 to adjust the operation command value.
  • In this adjustment, an evaluation function can be defined so that the work is performed as fast as possible.
  • the detected force magnitude F has a margin DL with respect to the limit value L0.
  • The margin DL is the amount remaining up to the limit value L0, or an index based on that amount.
  • the amount up to the limit value L0 is defined by the difference between the limit value L0 and the magnitude F of the detected force.
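The limit excess amount DH and the margin DL can be computed per division as sketched below. Using the peak force of each division and clamping both quantities at zero are assumptions of this sketch, as is the function name.

```python
def excess_and_margin(section_forces, limit):
    """Return (DH, DL) for each division.

    section_forces: one list of force magnitudes F per division.
    limit:          the limit value L0.
    DH is how far the peak force exceeds L0; DL is how much room the
    peak force leaves below L0. At most one of them is non-zero.
    """
    result = []
    for samples in section_forces:
        peak = max(samples)
        dh = max(0.0, peak - limit)   # limit excess amount DH = F - L0
        dl = max(0.0, limit - peak)   # margin DL = L0 - F
        result.append((dh, dl))
    return result
```

A division with DH greater than zero is a candidate for a lower speed target value, while a division with a large DL is a candidate for speeding up.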
  • a machine learning or optimization method using an evaluation function can be applied to obtain the position and speed target value of the division point Pi that makes the evaluation value the best.
  • techniques such as reinforcement learning, Bayesian optimization, and particle swarm optimization are exemplified.
  • By adjusting the operation command value so that the evaluation value calculated by the evaluation function Fq becomes smaller, the learning processing unit 115 can obtain an operation command value for which the force F(t) and the work time T become smaller.
  • the adjustment is completed when the evaluation value obtained by the evaluation function converges.
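One possible form of such an evaluation function is sketched below. The definition of Fq is not given in this passage, so the weighted sum, the weights `w_force` and `w_time`, and the use of the time-integrated excess over L0 are all assumptions; the sketch only preserves the stated properties that the force term is 0 when the limit is never exceeded and that smaller values are better.

```python
def evaluate_trial(forces, dt, limit, w_force=1.0, w_time=1.0):
    """Hypothetical evaluation value combining force excess and work time.

    forces: force magnitude F(t) sampled every dt seconds during a trial.
    limit:  the limit value L0.
    """
    work_time = len(forces) * dt
    # The force term is zero for trials that never exceed the limit.
    excess = sum(max(0.0, f - limit) for f in forces) * dt
    return w_force * excess + w_time * work_time
```

With this shape, shortening the work reduces the score only as long as it does not push the force above the limit, which is the trade-off the learning process is searching over.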
  • The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. According to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 is adjusted for each section. Therefore, the motion of the robot 120 can be adjusted so that the work as a whole is not unnecessarily delayed and no excessive load acts on the work target 200 or the surrounding environment 300, and the adjustment of the motion of the robot 120 is made easier. Furthermore, if the sections in which the detection value of the force sensor 143 is smaller than the predetermined value are adjusted so that the motion becomes faster, the work as a whole can also be made faster.
  • By learning and updating the optimum operation command value for each section, it becomes possible to design fine-grained motion command values that could not be realized by conventional adjustment, and as a result, high-speed, high-quality robot work can be realized.
  • According to the operation control system 110 and the robot system 100 of the present embodiment, in part-fitting work, connector-insertion work, and the like, the work time can be shortened while the reaction force between the fitted parts is suppressed.
  • FIG. 20 is a block diagram showing a configuration example of the operation adjustment apparatus 112c according to the fourth embodiment of the present invention and surrounding blocks. Other configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. FIG. 20 shows a part of the configuration of the robot system 100 extracted.
  • the operation adjustment device 112 c of the present embodiment includes a command value learning unit 113 b and a command value classification unit 116.
  • the command value learning unit 113b is the same as that shown in FIG.
  • The operation adjustment device 112c applies, for example, machine learning to divide a feature space using the feature amounts of the sensor information and the constraint condition, determines class information for the divided feature space, and generates the division points Pi from that class information.
  • the operation adjustment device 112c performs pre-processing and learning processing as in the processing shown in FIG.
  • FIG. 21 is a flow chart showing an example of the flow of pre-processing performed by the operation adjustment device 112 c according to the fourth embodiment of the present invention.
  • FIG. 22 is a flow chart showing an example of the flow of learning processing performed by the operation adjustment apparatus 112 c according to the fourth embodiment of the present invention.
  • the pre-processing shown in FIG. 21 differs from the processing shown in FIG. 16 in that the number of divisions is also defined in addition to the division positions in step S106 b.
  • partitions can be generated automatically based on waveform features.
  • As waveform features, for example, clustering is performed on position data, velocity data, force data, and force change rate data acquired in time series, taking as input the maximum value or frequency distribution of the data in each fixed time window Tsmp.
  • clustering it is possible to define a break for each characteristic history of a waveform using a clustering method such as k-means method which is a type of machine learning. For example, it is assumed that X kinds of waveform features are defined based on this.
  • the label L (t) of each time defined with the time t as a variable is used.
  • step S106b it is possible to define the number of divisions and the position of division as a break at all or some parts where a change in label occurs.
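The chain from windowed features to labels L(t) to division breaks can be sketched as follows. This is a small pure-Python k-means on a single scalar feature, chosen so the example is self-contained; the one-dimensional feature, the evenly spaced initial centres, and all function names are assumptions.

```python
def window_max(samples, window):
    """Maximum of each consecutive block of `window` samples (one Tsmp)."""
    return [max(samples[i:i + window])
            for i in range(0, len(samples), window)]


def kmeans_1d(values, k, iterations=20):
    """Plain Lloyd's k-means on scalars; returns one label per value."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the data range (deterministic).
    centres = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(v - centres[j]))
                  for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return labels


def label_breaks(labels):
    """Window indices at which the label L(t) changes."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
```

The number of breaks plus one gives a candidate number of divisions, and each break index multiplied by the window length gives a candidate division position in samples.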
  • the learning process shown in FIG. 22 is different from the process shown in FIG. 17 in three processes of step S211, step S212 and step S213.
  • In step S211, a first evaluation value is obtained using an evaluation function for learning the number of divisions and the division positions based on the sensor information, the operation command values and control parameters, and the constraint conditions.
  • In step S212, the number of divisions and the division positions are learned and updated based on the first evaluation value.
  • In step S213, a second evaluation value for learning an operation command value is obtained. Therefore, the learning process shown in FIG. 22 learns the operation command value after learning the number of divisions and the division positions.
  • FIG. 23 is a block diagram showing a configuration example of a motion adjustment device 112d according to a fifth embodiment of the present invention and peripheral blocks. Other configurations of the motion adjustment device, motion control system, and robot system of the present embodiment are the same as those shown in FIG. FIG. 23 shows an extracted part of the configuration of the robot system 100.
  • The motion adjustment device 112d of the present embodiment includes a command value classification unit 116 and a motion learning unit 117.
  • The command value classification unit 116 is similar to that shown in FIG.
  • FIG. 24 is a block diagram showing a configuration example of the motion learning unit 117 according to the fifth embodiment of the present invention.
  • The motion learning unit 117 includes a command value learning unit 113b and a parameter learning unit 118.
  • The command value learning unit 113b is the same as that shown in FIG.
  • The motion learning unit 117 receives the motion command value and the control parameters before updating from the robot control device 111. It also receives the constraint conditions from the outside, sensor information from the sensor 140, and classification information from the command value classification unit 116. These input signals are passed to the command value learning unit 113b and the parameter learning unit 118.
  • The parameter learning unit 118 does not adjust quantities that directly specify robot behavior, such as a position command value, a speed command value, or an acceleration command value. Instead, it adjusts the gains of the sensor feedback control system based on the external sensor, impedance parameters, filter design parameters, and the like. That is, the parameter learning unit 118 adjusts the control parameters of the feedback control system.
  • The parameter learning unit 118 receives the classification information, the sensor information, the constraint conditions, the command value, and the control parameters as input, and uses them to update the input control parameters to control parameters that satisfy the constraint conditions.
  • Machine learning can be used when updating the control parameters.
  • The parameter learning unit 118 updates the control parameters so that the evaluation value obtained by a previously defined evaluation function becomes larger, and repeats the update until the value converges.
  • The parameter learning unit 118 updates the control parameters so that the evaluation value becomes smaller.
  • The parameter learning unit 118 is illustrated as a configuration independent of the command value learning unit 113b.
  • The parameter learning unit 118 and the command value learning unit 113b do not necessarily have to perform independent processing. For example, the parameter learning unit 118 and the command value learning unit 113b can perform their processing simultaneously using a single evaluation function.
  • The parameter learning unit 118 adjusts the control parameters for each section.
  • The number of divisions used in the command value learning unit 113b and the number of divisions used in the parameter learning unit 118 are not necessarily the same.
  • The number of divisions used in the parameter learning unit 118 may be larger than the number of divisions used in the command value learning unit 113b.
  • The parameter learning unit 118 can update not only the control parameters of the sensor feedback control system based on the external sensor 142 but also the control parameters of the feedback control system based on the internal sensor 141. As a result, high-speed robot work can be realized with higher quality.
  • FIG. 25 is a block diagram showing another configuration example of the motion adjustment device 112d according to the fifth embodiment of the present invention and peripheral blocks.
  • FIG. 25 shows a configuration example without the command value learning unit 113b.
  • The motion adjustment device 112d does not update the motion command value, but updates only the control parameters.
  • The motion adjustment device, motion control system, and robot system of the present embodiment determine an upper limit or a lower limit for the speed target value in each section Si, and define the search space in each section based on the rigidity of the work target and the assembly quality constraints of the work target. A motion command value or control parameter that is achievable but would cause a problem in assembly quality is therefore not searched. Consequently, the learning can converge on a motion command value or control parameter that realizes high-speed assembly within the range that satisfies the work quality specified by the user. As a result, the adjusted robot can secure work quality without increasing the reaction force applied to the work target and without damaging it.
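
The section breaks at label changes and the per-section speed limits described in the list above can be illustrated with a minimal Python sketch. The function names, the label sequence, and the numeric bounds are all hypothetical and are not taken from the patent; the sketch only shows how a break can be placed where L(t) changes and how a candidate speed target can be kept inside the search space of each section Si.

```python
def breaks_from_labels(labels):
    """Return every index t at which the label L(t) differs from L(t-1)."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]


def clamp_section_speeds(candidates, bounds):
    """Keep each section's candidate speed target inside its (lower, upper) limits."""
    if len(candidates) != len(bounds):
        raise ValueError("one (lower, upper) pair is needed per section")
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(candidates, bounds)]


# Hypothetical label sequence L(t) produced by some clustering step.
labels = [0, 0, 0, 1, 1, 2, 2, 2]
breaks = breaks_from_labels(labels)

# Hypothetical speed limits per section Si; units are an assumption (mm/s).
bounds = [(0.0, 50.0), (0.0, 5.0), (0.0, 20.0)]
speeds = clamp_section_speeds([80.0, 3.0, 25.0], bounds)
```

Clamping is only one way to restrict a search space; a learner could equally sample candidates from within the bounds in the first place.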

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Manipulator (AREA)

Abstract

The present invention adjusts the motion of a robot so that excessive load is not applied to a target object, and facilitates the adjustment. A robot control device (111) sends a motion command value to a robot (120) and instructs the robot (120) equipped with an end effector (130) to perform an operation on a target object (200). The force applied to the end effector (130) according to the instruction is detected by an external sensor (142). A motion adjustment device (112) performs learning using the detection results of the external sensor (142), and adjusts and updates the motion command value acquired from the robot control device (111).

Description

Motion adjustment device for robot, motion control system, and robot system
The present invention relates to industrial robots, service robots for non-manufacturing industries, and the like. In particular, the present invention relates to a motion adjustment device and a motion control system that adjust the motion of a robot so that an end effector mounted on the robot reaches a target position and orientation, and to a robot system including the motion adjustment device and the motion control system.
In many conventional industrial robot systems, the relationship between the robot and the work target is precisely positioned, and the robot repeats its work at high speed and with high accuracy in that positioned environment. In recent years, by contrast, robot systems that use multiple external sensors, such as force sensors or vision sensors, have been increasing. Such a robot system is used in an environment in which the robot and the work target are not precisely positioned, and controls the robot motion according to the detection results of the external sensors.
For example, such a robot system is used in situations where the position and orientation of the object to be worked on, or the surrounding environment, is unknown. As another example, such a robot system is used in situations where the position and orientation of the object to be worked on, or the surrounding environment, changes. Specific examples include bin picking, insertion work involving a surface-following motion, and fitting of parts such as connectors. In the field of service robots for non-manufacturing industries, work in a variously changing environment is assumed, and the motion of the robot is likewise controlled using multiple sensors.
In a robot control system that uses these sensors, multiple control parameters must be adjusted in order to adjust the motion of the robot. When the control parameters are adjusted appropriately, the motion of the robot becomes appropriate and the performance of the robot system is secured. However, adjusting control parameters is not easy and often requires specialized knowledge. Several automatic adjustment means have therefore been proposed to make control parameter adjustment easier. For example, Patent Document 1 discloses a robot system that speeds up the motion of a robot through learning.
Patent Document 1: JP 2017-94438 A
In the conventional robot system, the learning does not take into account the magnitude of the load that acts on the work target as a result of the robot's motion. Consequently, in the robot motion obtained by learning, the load acting on the work target may not be of an appropriate magnitude, and an excessive load may act on the work target. An object of the present invention is to obtain a motion adjustment device, a motion control system, and a robot system that can adjust the motion of a robot so that no excessive load acts on the work target, and that make the adjustment of the robot's motion easier.
The robot motion adjustment device of the present invention is used in a robot system that includes a robot equipped with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs work on a work target. The motion adjustment device includes a command value learning unit that performs learning using as input the force acting on the end effector detected by an external sensor, and adjusts the motion command value that is transmitted from the robot control device to the robot to control the motion of the robot.
The motion control system of the present invention is used in a robot system in which a robot equipped with an end effector performs work on a work target. The motion control system includes a robot control device that transmits a motion command value to the robot to control the motion of the robot, and a command value learning unit that performs learning using as input the force acting on the end effector detected by a sensor and adjusts the motion command value.
The robot system of the present invention includes a robot equipped with an end effector, a robot control device that transmits a motion command value to the robot to control the motion of the robot, and a command value learning unit that performs learning using as input the force acting on the end effector detected by a sensor and adjusts the motion command value. In this robot system, the robot performs work on a work target.
According to the motion adjustment device, the motion control system, and the robot system of the present invention, the motion of the robot can be adjusted so that no excessive load acts on the work target, and the adjustment of the robot's motion is made easier.
FIG. 1 is a block diagram showing an example of the system configuration of a robot system provided with a motion adjustment device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing an example of a specific hardware configuration for realizing the robot control device and the motion adjustment device according to Embodiment 1 of the present invention.
FIG. 3 is a block diagram showing a configuration example of the motion adjustment device according to Embodiment 1 of the present invention and peripheral blocks.
FIG. 4 is a diagram for explaining the operation of the motion adjustment device according to Embodiment 1 of the present invention.
FIG. 5 is a diagram showing an example of a speed pattern before updating in the robot system according to Embodiment 1 of the present invention.
FIG. 6 is a flowchart showing an example of the processing flow of the motion control system according to Embodiment 1 of the present invention.
FIG. 7 is a diagram for explaining the operation of the motion adjustment device according to Embodiment 2 of the present invention.
FIG. 8 is a diagram showing an example of the initial value of the speed pattern in the robot system according to Embodiment 2 of the present invention.
FIG. 9 is a diagram showing an example of the detected values of the force sensor in the robot system according to Embodiment 2 of the present invention.
FIG. 10 is a diagram showing an example of a speed pattern after updating in the robot system according to Embodiment 2 of the present invention.
FIG. 11 is a diagram showing another example of a speed pattern after updating in the robot system according to Embodiment 2 of the present invention.
FIG. 12 is a block diagram showing a configuration example of the motion adjustment device according to Embodiment 3 of the present invention and peripheral blocks.
FIG. 13 is a block diagram showing a configuration example of the command value learning unit according to Embodiment 3 of the present invention and peripheral blocks.
FIG. 14 is a diagram showing an example of the work performed by the robot system according to Embodiment 3 of the present invention.
FIG. 15 is a flowchart showing an example of the processing flow of the learning processing unit according to Embodiment 3 of the present invention.
FIG. 16 is a flowchart showing an example of the flow of the preprocessing performed by the learning processing unit according to Embodiment 3 of the present invention.
FIG. 17 is a flowchart showing an example of the flow of the learning process performed by the learning processing unit according to Embodiment 3 of the present invention.
FIG. 18 is a diagram showing an example of a speed pattern during a trial in the robot system according to Embodiment 3 of the present invention.
FIG. 19 is a diagram showing an example of the force information acquired during a trial in the robot system according to Embodiment 3 of the present invention.
FIG. 20 is a block diagram showing a configuration example of the motion adjustment device according to Embodiment 4 of the present invention and peripheral blocks.
FIG. 21 is a flowchart showing an example of the flow of the preprocessing performed by the motion adjustment device according to Embodiment 4 of the present invention.
FIG. 22 is a flowchart showing an example of the flow of the learning process performed by the motion adjustment device according to Embodiment 4 of the present invention.
FIG. 23 is a block diagram showing a configuration example of the motion adjustment device according to Embodiment 5 of the present invention and peripheral blocks.
FIG. 24 is a block diagram showing a configuration example of the motion learning unit according to Embodiment 5 of the present invention.
FIG. 25 is a block diagram showing another configuration example of the motion adjustment device according to Embodiment 5 of the present invention and peripheral blocks.
Embodiment 1
FIG. 1 is a block diagram showing an example of the system configuration of a robot system 100 provided with a motion adjustment device according to Embodiment 1 of the present invention. As shown in FIG. 1, the robot system 100 includes a motion control system 110, a robot 120, an end effector 130, an internal sensor 141, and an external sensor 142. The motion control system 110 includes a robot control device 111 and a motion adjustment device 112. The robot control device is also called a robot controller.
The robot control device 111 transmits a motion command value for controlling the motion of the robot 120 to the robot 120, based on the detection results of the internal sensor 141 and the external sensor 142, and thereby controls the motion of the robot 120. An end effector 130 such as a robot hand is attached to the robot 120. The end effector 130 acts directly on the work target 200. An appropriate type of end effector 130 is selected for each kind of work performed by the robot system 100. A surrounding environment 300 exists around the work target 200.
The surrounding environment 300 is, for example, a part onto which the work target 200 is assembled, a jig that positions the work target 200, a tool (such as an electric screwdriver) that processes the work target 200, a parts feeder that supplies the work target 200, a safety cover that surrounds the robot 120, or a belt conveyor that conveys the work target 200. The external sensor 142, such as a camera that images the work target, may also be treated as part of the surrounding environment. This is because the robot 120 or the end effector 130 may come into contact with the external sensor 142, for example when the external sensor 142 is fixed at a predetermined position around the robot 120.
The motion command value output from the robot control device 111 is, for example, information representing the target position and target orientation, at each time, of the end effector 130 mounted on the robot 120; that is, a position command value. When the motion command value represents the target position of the end effector 130 at each time, it also implies the moving speed of the end effector 130 between those times. The position command value can therefore also be regarded as a speed command value representing the target motion speed of the robot.
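
The relationship between a position command sequence and the speed it implies can be sketched as follows. This is an illustrative Python example, not part of the patent: the function name, the sample times, and the Cartesian target positions are hypothetical, and only the position component (not the orientation) is considered.

```python
import math


def implied_speeds(times, positions):
    """Average end effector speed between consecutive position command samples."""
    if len(times) != len(positions):
        raise ValueError("times and positions must have the same length")
    speeds = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Straight-line distance between successive target positions.
        speeds.append(math.dist(positions[i], positions[i - 1]) / dt)
    return speeds


# Hypothetical position command values (x, y, z) at three times.
times = [0.0, 0.5, 1.0]
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
speeds = implied_speeds(times, positions)
```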
The motion command value output from the robot control device 111 may also be a speed command value representing the target motion speed of the robot 120 or the target moving speed of the end effector 130. The target motion speed or target moving speed is given as the speed between successive points in time of the motion of the robot 120, or as the speed between successive points on the path. Furthermore, the motion command value may be an acceleration command value representing the target acceleration of the motion of the robot 120 or the target acceleration of the movement of the end effector 130. The motion command value may take various forms as long as it directly controls the motion of the robot 120.
The motion adjustment device 112 adjusts and updates the motion command value generated by the robot control device 111, according to the detection result of the external sensor 142 and constraint conditions given from the outside. That is, the motion adjustment device 112 adjusts the motion of the robot. In other words, the motion adjustment device 112 adjusts the correspondence between the detection results of the internal sensor 141 and the external sensor 142 and the motion command value output from the robot control device 111, and updates that correspondence to reflect the adjustment result. The adjustment of the motion command value can also be described as a modification or a correction of the motion command value.
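
As a rough illustration of adjusting a command value from an external sensor reading under a constraint, the following Python sketch scales a speed command down when the measured peak force exceeds a limit and cautiously up otherwise. This simple rule is an assumption made for illustration only; it is not the learning method of the patent, and the function name, step size, and limits are hypothetical.

```python
def adjust_speed_scale(scale, peak_force, force_limit, step=0.1, max_scale=2.0):
    """Return an updated speed scale factor for the next trial.

    The constraint is assumed to be "peak force must not exceed force_limit".
    """
    if peak_force > force_limit:
        # Constraint violated: slow the motion down.
        return scale * (1.0 - step)
    # Constraint satisfied: try a slightly faster motion, up to max_scale.
    return min(scale * (1.0 + step), max_scale)


slower = adjust_speed_scale(1.0, peak_force=12.0, force_limit=10.0)
faster = adjust_speed_scale(0.5, peak_force=5.0, force_limit=10.0)
capped = adjust_speed_scale(1.95, peak_force=5.0, force_limit=10.0)
```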
When an updated motion command value exists, the robot control device 111 outputs the updated motion command value to the robot 120. The motion adjustment device 112 may update the motion command value with reference not only to the detection result of the external sensor 142 but also to the detection result of the internal sensor 141. The constraint conditions may be stored in advance inside the motion adjustment device 112 or the robot control device 111.
The robot system 100 of the present embodiment performs two processes: an adjustment process that adjusts and updates the motion command value, and a work process that performs work on the work target 200 using the updated motion command value. In other words, the operation of the robot system 100 has an adjustment phase and a work phase. The adjustment process is the processing of the robot system 100 in the adjustment phase, and the work process is the processing of the robot system 100 in the work phase. In the adjustment process, the motion adjustment device 112 adjusts the motion command value so that it becomes an optimal motion command value. However, the adjustment process and the work process need not be completely separated. For example, the robot system 100 may be configured so that the motion adjustment device 112 keeps calculating an optimal motion command value while work on the work target 200 is being performed. In this configuration, the robot system 100 updates the motion command value at a predetermined timing as needed, for example when a motion command value more appropriate than the one currently in use has been calculated. The same applies to the subsequent embodiments.
FIG. 2 is a diagram showing an example of a specific hardware configuration for realizing the robot control device 111 and the motion adjustment device 112. The robot control device 111 and the motion adjustment device 112 are realized by a processor 401 executing a program stored in a memory 402. The processor 401 and the memory 402 are connected by a data bus 403. The memory 402 includes volatile memory and non-volatile memory, and temporary information is stored in the volatile memory. The robot control device 111 and the motion adjustment device 112 may be configured as a single unit or as separate units. For example, the robot control device 111 and the motion adjustment device 112 may be connected via a network or the like. In the subsequent embodiments as well, the robot control device 111 and the motion adjustment device 112 can be realized with the same hardware configuration.
The robot system 100 forms a control system in which the motion control system 110 outputs a motion command value based on the data acquired by the internal sensor 141 and the external sensor 142, and the robot 120 moves so as to follow the motion command value. Examples of the internal sensor 141 include a sensor that acquires the position of a robot joint, a sensor that acquires the motion speed of a joint, and a sensor that acquires the current value of the motor that drives a joint. In the robot system 100, the robot control device 111, the robot 120, and the internal sensor 141 form a position control system that positions the end effector 130. Sensors that acquire the position of a robot joint include, for example, an encoder that detects the amount of motor rotation, a resolver, and a potentiometer. A tachometer, for example, can be used as a sensor that acquires the motion speed of a joint. Gyro sensors, inertial sensors, and the like may also be used as internal sensors to obtain information on the robot 120 itself.
Through feedback control based on the internal sensor 141, the robot system 100 forms a position-controlled robot system that performs material handling work and the like. Here, material handling work is the work of transferring and conveying materials, parts, and so on. This position-controlled robot system is called a feedback control system based on the internal sensor 141. In feedback control based on the internal sensor 141, the control parameters include the position control gain, the speed control gain, the current control gain, and the design parameters of the filters used for feedback control. Filters used for feedback control include a moving average filter, a low-pass filter, a band-pass filter, and a high-pass filter. Feedback control based on the internal sensor 141 is the control that makes the robot 120 move according to the motion command value. In other words, feedback control based on the internal sensor 141 is the control performed to realize the motion command value.
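
Of the filters listed above, the moving average filter is the simplest to illustrate. The following Python sketch is a generic implementation, not code from the patent; the class name is hypothetical, and the window length stands in for the kind of filter design parameter that would be tuned.

```python
from collections import deque


class MovingAverageFilter:
    """Average of the most recent `window` samples of a sensor signal."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque discards the oldest sample automatically.
        self._samples = deque(maxlen=window)

    def update(self, value):
        """Add one sample and return the current filtered value."""
        self._samples.append(value)
        return sum(self._samples) / len(self._samples)


# Hypothetical noisy joint speed samples filtered with a window of 3.
ma_filter = MovingAverageFilter(window=3)
outputs = [ma_filter.update(v) for v in [3.0, 6.0, 9.0, 12.0]]
```

A longer window smooths more strongly but adds delay to the feedback loop, which is why the window length is a parameter worth adjusting.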
Examples of the external sensor 142, on the other hand, include a force sensor, a vision sensor such as a camera, a tactile sensor, and a touch sensor. The external sensor 142 measures the contact state and the positional relationship between the robot 120 and the work target 200 or the surrounding environment 300. In the robot system 100, the robot control device 111, the motion adjustment device 112, the robot 120, and the external sensor 142 form a sensor feedback control system based on the external sensor 142. In some cases, rather than performing sensor feedback control based on the sensor signal output from the external sensor 142, the robot system 100 uses that sensor signal simply as a trigger signal. In this case, the robot system 100 switches the control parameters of the feedback control based on the internal sensor 141 when the trigger signal occurs. The sensor feedback control system based on the external sensor 142 is built as an outer loop of the position-controlled robot system.
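
Using an external sensor signal purely as a trigger for switching control parameters can be sketched as below. This Python example is illustrative only: the gain values, the threshold, and the choice to latch the contact gains once triggered are assumptions, not details from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gains:
    position: float
    velocity: float


# Hypothetical gain sets for the internal-sensor feedback loop.
FREE_SPACE = Gains(position=100.0, velocity=20.0)
IN_CONTACT = Gains(position=30.0, velocity=10.0)


def gains_over_time(force_samples, threshold):
    """Select the gain set for each sample, switching once the trigger fires."""
    triggered = False
    used = []
    for force in force_samples:
        # The force reading is used only as a trigger, not as a feedback signal.
        if not triggered and abs(force) >= threshold:
            triggered = True
        used.append(IN_CONTACT if triggered else FREE_SPACE)
    return used


used = gains_over_time([0.1, 0.2, 5.0, 0.3], threshold=2.0)
```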
 外界センサ142に基づくセンサフィードバック制御システムは、加速度、速度、位置姿勢、距離、力、モーメント等によって、ロボット120、ロボットアームまたはエンドエフェクタ130と、作業対象200または周辺環境300との位置関係、接触挙動等をセンシングする。さらに、外界センサ142に基づくセンサフィードバック制御システムは、センシング結果に基づいて、所望の位置関係または力応答を得るようにロボット120の動作を制御する。言い換えると、外界センサ142に基づくセンサフィードバック制御システムは、所望の位置関係または力応答を得るように動作指令値を修正する。外界センサ142に基づくセンサフィードバック制御システムにおいて、制御パラメータとしては、力覚制御に関する力制御ゲイン、インピーダンスパラメータ、ビジュアルサーボ制御に関するゲイン、ビジュアルインピーダンスパラメータ、フィードバック制御に用いられるフィルタの設定パラメータなどがある。 The sensor feedback control system based on the external sensor 142 detects the positional relationship between the robot 120, the robot arm or the end effector 130, and the work object 200 or the surrounding environment 300, by contact with acceleration, velocity, position, attitude, distance, force, moment, etc. Sensing behavior etc. Furthermore, a sensor feedback control system based on the external sensor 142 controls the operation of the robot 120 so as to obtain a desired positional relationship or force response based on the sensing result. In other words, the sensor feedback control system based on the external sensor 142 corrects the operation command value so as to obtain a desired positional relationship or force response. In the sensor feedback control system based on the external sensor 142, control parameters include force control gain related to force control, impedance parameter, gain related to visual servo control, visual impedance parameter, setting parameter of a filter used for feedback control.
 内界センサ141および外界センサ142に基づいて制御を行う場合に、調整が必要となる制御パラメータを、以後では単にパラメータと呼ぶことがある。ここで、内界センサ141または外界センサ142として使用されるセンサとしては、具体的には、電流値センサ、関節位置センサ、関節速度センサ、温度距離センサ、カメラ、RGB-Dセンサ、近接覚センサ、触覚センサ、力センサ等が考えられる。また、内界センサ141または外界センサ142の計測対象は、ロボット120の位置姿勢、エンドエフェクタ130の位置姿勢、作業対象200となるワークの位置姿勢、作業者の位置姿勢等が考えられる。 In the case where control is performed based on the internal sensor 141 and the external sensor 142, a control parameter that needs adjustment may be simply referred to as a parameter hereinafter. Here, as the sensors used as the internal sensor 141 or the external sensor 142, specifically, a current value sensor, a joint position sensor, a joint speed sensor, a temperature distance sensor, a camera, an RGB-D sensor, a proximity sensor , Tactile sensors, force sensors, etc. can be considered. Further, the measurement target of the internal sensor 141 or the external sensor 142 may be the position and orientation of the robot 120, the position and orientation of the end effector 130, the position and orientation of a work to be the operation target 200, the position and orientation of an operator, and the like.
FIG. 3 is a block diagram showing a configuration example of the motion adjustment device 112 according to Embodiment 1 of the present invention together with its peripheral blocks. FIG. 3 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112 includes a command value learning unit 113. In FIG. 3, the sensor 140 represents the internal sensor 141 and the external sensor 142 combined into one. As described above, a wide variety of sensors can serve as the sensor 140. In the robot system 100 of the present embodiment, however, the sensor 140 includes at least a force sensor that detects an external force acting on the end effector 130 as a result of the motion of the robot 120. This force sensor is an external sensor 142. The sensor 140 likewise includes at least a force sensor in the subsequent embodiments.
The force sensor measures the external force acting on the end effector 130 and is used to perform force control or impedance control. Controlling the force that the end effector 130 applies to the work target 200 or the surrounding environment 300 is called force control. Controlling the motion of the robot 120 in accordance with the detection result of the force sensor is called force-sense control. In force control, a target working force is set, and the magnitude of the force applied to the work target 200 or the surrounding environment 300 is controlled.
In impedance control, on the other hand, impedance characteristics (spring, damper, inertia) relating to the contact force that arises, for example, when the end effector 130 and the work target 200 come into contact are defined and used for control. A contact force may also arise when the end effector 130 contacts the surrounding environment 300, or when the work target 200 gripped by the end effector 130 contacts the surrounding environment 300. The impedance characteristics are expressed by impedance parameters.
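As an illustration of what the spring, damper, and inertia parameters mean, the following minimal Python sketch integrates a one-dimensional mass-spring-damper response to a constant contact force. The second-order model, the function name `simulate_impedance`, and all numerical values are assumptions made for this example; the text does not specify the form of the impedance model.

```python
def simulate_impedance(force, mass, damping, stiffness, dt=0.001, steps=5000):
    """Integrate mass*a + damping*v + stiffness*x = force and return the final x.

    Assumed one-dimensional model; semi-implicit Euler integration.
    """
    x = 0.0  # displacement of the end effector from its reference position
    v = 0.0  # velocity of that displacement
    for _ in range(steps):
        a = (force - damping * v - stiffness * x) / mass
        v += a * dt
        x += v * dt
    return x


# Hypothetical parameters: inertia 1 kg, damper 20 N*s/m, spring 100 N/m.
displacement = simulate_impedance(force=5.0, mass=1.0, damping=20.0, stiffness=100.0)
print(displacement)  # settles near force / stiffness
```

In this sketch a stiffer spring yields a smaller steady displacement for the same contact force, which is one reason the choice of impedance parameters shapes the force response.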
In force control, the target value of the force control must be determined. In impedance control, the control characteristics must be determined by means of the impedance parameters. In both cases, gains that contribute to control responsiveness must also be determined, so there are many items to adjust. In conventional robot systems, parameters have mostly been adjusted with the aim of performing the work stably. In that case, system characteristics including the responsiveness of the motion of the robot 120, its mechanical rigidity, and the like are identified, and a single parameter set that responds stably regardless of conditions or state is sought. However, in a motion of the robot 120 that involves contact with the work target 200, the contact state between the work target 200 and the end effector 130 changes as the motion progresses. The parameter set therefore has to be adjusted with the transitions of the contact state taken into account. Such adjustment has to be carried out by trial and error and has not been easy.
In the robot system 100 of the present embodiment, the motion adjustment device 112 updates the motion command value so that the motion of the robot 120 becomes appropriate. Constraint conditions are input to the motion adjustment device 112. The constraint conditions include an upper limit value or a lower limit value of the force information detected by the force sensor. In the following, the motion command value output from the motion control system 110 is assumed to be a speed command value. The speed command value is the target moving speed of the end effector 130 at each point on the movement path of the end effector 130. The time series of speed command values then forms a speed pattern over those points. The speed command value may instead be the target operating speed of the robot 120 at each point in time during the work.
In the speed pattern, target speeds Vi (i = 1, 2, 3, ...) and switching positions Pi (i = 1, 2, 3, ...) of the target speed are defined. The switching position may instead be specified as a switching time or as a parameter for switching. An example of such a parameter is the progress rate of the motion command value with respect to position or time. The switching position Pi of the target speed may be the point at which switching of the target speed starts or the point at which it is completed. The switching position Pi may also be a point at which the operating speed detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target speed.
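One possible way to hold such a speed pattern in software is sketched below in Python. The class name `SpeedPattern`, its field layout, and the numerical values are hypothetical; the sketch also assumes that the target speed changes instantaneously at each switching position, which is only one of the interpretations of Pi allowed above.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SpeedPattern:
    # target_speeds[i] applies until switch_positions[i] is reached
    # (hypothetical layout: one more speed than switching positions).
    target_speeds: list
    switch_positions: list

    def target_speed_at(self, position):
        """Return the target speed that applies at the given path position."""
        index = bisect_right(self.switch_positions, position)
        return self.target_speeds[index]


# Illustrative values: V1, V2, V3 in m/s and P1, P2 in m along the path.
pattern = SpeedPattern(target_speeds=[0.20, 0.05, 0.10],
                       switch_positions=[0.30, 0.45])
print(pattern.target_speed_at(0.10))
print(pattern.target_speed_at(0.40))
```

A position that coincides exactly with a switching position is assigned to the following target speed in this sketch.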
FIG. 4 is a diagram for explaining the operation of the motion adjustment device 112 according to Embodiment 1 of the present invention. As shown in FIG. 4, consider a case in which the end effector 130 mounted on the robot 120 moves from position P0 to position P3. A force sensor 143 is attached to the robot 120 as the external sensor 142. The force sensor 143 measures the external force acting on the end effector 130.
FIG. 5 is a diagram showing an example of the speed pattern before updating in the robot system 100 according to Embodiment 1 of the present invention. In FIG. 5, the horizontal axis is the position P of the end effector 130 and the vertical axis is the target moving speed V of the end effector 130. In the speed pattern of FIG. 5, the target speed changes while the end effector 130 moves from P0 to P3. The motion adjustment device 112 updates the speed pattern based on the detection result of the force sensor 143.
FIG. 6 is a flowchart showing an example of the processing flow of the motion control system 110 according to Embodiment 1 of the present invention. Here, the constraint conditions are assumed to include an upper limit value and a lower limit value of the force information detected by the force sensor 143 and an upper limit value of the work time. First, in step S10, the robot control device 111 determines the initial value of the speed pattern. Next, in step S11, the robot control device 111 controls the motion of the robot 120 to perform a trial of the work. As described above, part of the normal work of the robot system 100 may be treated as a trial, for example when the adjustment process and the work process are not completely separated.
Next, in step S12, the motion adjustment device 112 determines whether the constraint conditions are satisfied. That is, in step S12, the motion adjustment device 112 determines whether the detection value of the force sensor 143 lies between the upper limit value and the lower limit value specified by the constraint conditions, and whether the work time constraint is satisfied. When evaluating the detection value of the force sensor 143, for example, the maximum detection value is compared with the upper limit value of the constraint condition and the minimum detection value is compared with the lower limit value. In step S12, the motion adjustment device 112 may use, instead of the detection value of the force sensor 143 itself, an evaluation value calculated from the detection value. One example is an evaluation value calculated by an evaluation function that takes the detection value of the force sensor 143 and the takt time as inputs. In step S12, the motion adjustment device 112 may then determine whether this evaluation value is within its limit range.
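The comparison described for step S12 can be sketched in Python as follows. The function name `constraints_satisfied`, its argument names, and the sample values are hypothetical, and the sketch covers only the plain limit check, not the evaluation-function variant.

```python
def constraints_satisfied(force_samples, force_lower, force_upper,
                          work_time, work_time_upper):
    """Return True if one trial meets the force and work time constraints."""
    peak = max(force_samples)    # compared with the upper limit value
    trough = min(force_samples)  # compared with the lower limit value
    force_ok = force_lower <= trough and peak <= force_upper
    time_ok = work_time <= work_time_upper
    return force_ok and time_ok


# Illustrative trial data: forces in N, times in s.
print(constraints_satisfied([1.0, 4.5, 2.0], 0.0, 5.0, 8.0, 10.0))
print(constraints_satisfied([1.0, 6.0, 2.0], 0.0, 5.0, 8.0, 10.0))
```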
If it is determined in step S12 that the constraint conditions are satisfied, the processing of the motion control system 110 ends for the time being, and subsequent work is performed with the updated speed pattern. If it is determined in step S12 that the constraint conditions are not satisfied, the processing of the motion control system 110 proceeds to step S13. In step S13, the motion adjustment device 112 adjusts and updates the speed pattern. For example, the motion adjustment device 112 calculates a correction coefficient and multiplies the speed pattern used in the trial by that coefficient. When step S13 is completed, the processing of the motion control system 110 returns to step S11.
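The loop over steps S10 to S13 can be sketched in Python as below. The text does not state how the correction coefficient is computed, so the rule used here (the ratio of the force limit to the measured peak force, with a safety margin) is an assumption, as are the function name `adjust_speed_pattern`, the callback `run_trial`, and the toy force model. Only the force upper limit is checked, for brevity.

```python
def adjust_speed_pattern(initial_speeds, run_trial, force_upper,
                         margin=0.9, max_trials=20):
    """Repeat trials until the peak force is within the upper limit."""
    speeds = list(initial_speeds)                # S10: initial speed pattern
    for _ in range(max_trials):
        peak_force = run_trial(speeds)           # S11: perform one trial
        if peak_force <= force_upper:            # S12: constraint check
            return speeds
        # S13: assumed rule for the correction coefficient (not from the text).
        coefficient = margin * force_upper / peak_force
        speeds = [v * coefficient for v in speeds]
    raise RuntimeError("no feasible speed pattern found within max_trials")


def toy_trial(speeds):
    """Hypothetical stand-in for the robot: peak force grows with top speed."""
    return 40.0 * max(speeds)


adjusted = adjust_speed_pattern([0.2, 0.2, 0.2], toy_trial, force_upper=5.0)
print(adjusted)
```

In a real system `run_trial` would execute the work on the robot 120 and return the measured force; the toy model is included only so that the sketch runs on its own.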
The motion control system 110 according to Embodiment 1 of the present invention performs the processing described above. In this way, the motion control system 110 according to Embodiment 1 adjusts the speed pattern in a learning manner based on data obtained from multiple trials. In other words, the motion control system 110 according to Embodiment 1 adjusts the speed pattern, which is the motion command value, using machine learning or an optimization method.
In the above description, the upper limit value of the work time is included in the constraint conditions, but this is not essential, and another condition may be used. Instead of giving an upper limit value of the work time, the constraint condition may be that the work time is minimized while the other conditions are satisfied. The above description covers the case in which the motion control system 110 updates the motion command value so as to satisfy the given constraint conditions, but a configuration in which the motion control system 110 adjusts and updates the control parameters is also conceivable. Furthermore, although FIG. 1 shows a configuration example in which the robot control device 111 and the motion adjustment device 112 are provided separately, the robot control device 111 may be configured to incorporate the motion adjustment device 112.
The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. With them, the motion of the robot 120 is adjusted so that the detection value of the force sensor 143 falls within a predetermined range. The detection value of the force sensor 143 represents the magnitude of the external force acting on the end effector 130. In other words, it is information representing the magnitude of the force applied to the work target 200 or the surrounding environment 300 as a result of the motion of the robot 120. Therefore, the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment can adjust the motion of the robot 120 so that the force applied to the work target 200 or the surrounding environment 300 has an appropriate magnitude, that is, so that no excessive load acts on the work target 200 or the surrounding environment 300, and they also make the adjustment of the motion of the robot 120 easier.
As described above, by using the force sensor 143 to adjust the motion command value in a learning manner so that the force response falls within a desired range, high-quality robot work that does not damage the item being worked on can be realized. Furthermore, adding the work time to the constraint conditions makes high-speed work achievable as well.
The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment use the magnitude of the force detected by the force sensor 143 as a constraint condition, but a moment, torque, current value, or the like may instead be detected, and either its upper limit or its lower limit may be used as a constraint condition. This makes it possible to place a limit on the contact state between the robot 120 or the end effector 130 and the outside world and to search for a motion command value within a desired range. As a result, work that does not damage the work target 200 can be realized.
The position and orientation relative to the surrounding environment 300, or the position and orientation of the robot 120, can also be added as constraint conditions. Adding either an upper limit or a lower limit of these quantities to the constraint conditions enables robot work that suppresses interference with the surrounding environment 300 while still achieving high-quality work. This in turn yields the notable effect of raising the operating rate of the system. The effects described above are obtained in the other embodiments as well.
Embodiment 2.
The configurations of the motion adjustment device, the motion control system, and the robot system of the present embodiment are the same as those shown in FIG. 1. The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment divide the motion command given to the robot 120 for a series of work into a plurality of sections and adjust the motion command value for each section. In the following, the motion command value output from the motion control system is assumed to be a speed command value.
FIG. 7 is a diagram for explaining the operation of the motion adjustment device 112 according to Embodiment 2 of the present invention. As shown in FIG. 7, consider work in which the end effector 130 mounted on the robot 120 is moved from position P0 to position P3. Position P0, the initial position, is the start point of the work, and position P3 is the end point of the work. The end effector 130 passes through positions P1 and P2 while moving from position P0 to position P3.
In the robot system 100 of the present embodiment, the path from the start point of the work to its end point is divided into a plurality of sections. In other words, the motion of the robot 120 from the start of one piece of work to its end is divided into a plurality of sections. Here, the range from position P0 to position P1 is section S1, the range from position P1 to position P2 is section S2, and the range from position P2 to position P3 is section S3. The target moving speed of section S1 is V1, that of section S2 is V2, and that of section S3 is V3. The robot system 100 of the present embodiment adjusts and updates the motion command value for each section. Specifically, the robot system 100 adjusts the target moving speeds of sections S1, S2, and S3 individually.
In the robot system 100 of the present embodiment, the positions P1 and P2, which are the division points between sections, are determined in advance according to the content of the work. Positions P1 and P2 are the positions at which the section changes and may also be called switching positions. Three sections are used here as an example, but the number of sections is not limited to three. In addition, although the sections are defined spatially by position here, the period from the start of the work to its end may instead be divided in time.
The motion control system 110 of the present embodiment is given an upper limit value Flim of the detection result of the force sensor 143 as a constraint condition. The processing flow of the motion control system 110 of the present embodiment is basically the same as the flowchart shown in FIG. 6, except that the speed pattern is adjusted for each section. First, in step S10 of FIG. 6, the robot control device 111 determines the initial value of the speed pattern. FIG. 8 is a diagram showing an example of the initial value of the speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 8, the horizontal axis is the position P of the end effector 130 and the vertical axis is the target moving speed V of the end effector 130. In FIG. 8, the initial value of the speed pattern is V1 = V2 = V3 = Vini.
Next, in step S11, the robot control device 111 controls the motion of the robot 120 to perform a trial of the work. FIG. 9 is a diagram showing an example of the detection values of the force sensor 143 in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 9, the horizontal axis is the position P of the end effector 130 and the vertical axis is the detection value F of the force sensor 143. FIG. 9 shows the values detected by the force sensor 143 when the robot 120 is operated with the initial speed pattern shown in FIG. 8.
Next, in step S12, the motion adjustment device 112 determines whether the constraint condition is satisfied. That is, in step S12, the motion adjustment device 112 determines whether the detection value of the force sensor 143 in each section is equal to or less than the upper limit value Flim specified by the constraint condition. The detection value used for this determination is, for example, the maximum of the detection values of the force sensor 143 in each section. If the detection value of the force sensor 143 is equal to or less than Flim in every section, the motion adjustment device 112 determines that the constraint condition is satisfied. If there is even one section in which the detection value of the force sensor 143 exceeds the upper limit value Flim, the motion adjustment device 112 determines that the constraint condition is not satisfied.
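The per-section check of step S12 can be sketched in Python as follows. The function names, the representation of a trial as (position, force) pairs, and the numerical values are hypothetical. Samples that fall exactly on a division point are assigned to the later section in this sketch.

```python
from bisect import bisect_right


def section_peak_forces(samples, division_points):
    """Return the maximum force per section.

    samples: iterable of (position, force) pairs from one trial.
    division_points: [P0, P1, ..., PN]; consecutive points bound one section.
    """
    inner_points = division_points[1:-1]
    peaks = [float("-inf")] * (len(division_points) - 1)
    for position, force in samples:
        index = bisect_right(inner_points, position)
        peaks[index] = max(peaks[index], force)
    return peaks


def violating_sections(peaks, force_limit):
    """Return the indices (0-based) of sections whose peak exceeds the limit."""
    return [i for i, peak in enumerate(peaks) if peak > force_limit]


# Illustrative trial: positions in m, forces in N, Flim = 5.0 N.
samples = [(0.05, 1.0), (0.20, 2.0), (0.35, 6.5), (0.40, 3.0), (0.60, 2.5)]
peaks = section_peak_forces(samples, [0.0, 0.30, 0.45, 0.70])
print(peaks, violating_sections(peaks, 5.0))
```

An empty list from `violating_sections` corresponds to the constraint condition being satisfied in every section.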
If it is determined in step S12 that the constraint condition is satisfied, the processing of the motion control system 110 ends for the time being, and subsequent work is performed with the updated speed pattern. If it is determined in step S12 that the constraint condition is not satisfied, the processing of the motion control system 110 proceeds to step S13. In step S13, the motion adjustment device 112 adjusts the speed pattern so that the target speed decreases in each section in which the detection value of the force sensor 143 exceeded the upper limit value Flim, and updates the speed pattern.
In the example shown in FIG. 9, the detection value Fmax2 of the force sensor 143 in section S2 exceeds the upper limit value Flim. In contrast, the detection value Fmax1 in section S1 and the detection value Fmax3 in section S3 do not exceed the upper limit value Flim. In step S12, the motion adjustment device 112 therefore determines that the constraint condition is not satisfied. In step S13, the motion adjustment device 112 adjusts the speed pattern so that the target speed V2 in section S2 decreases. The motion control system 110 according to Embodiment 2 of the present invention performs the processing described above. FIG. 10 is a diagram showing an example of the updated speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 10, the horizontal axis is the position P of the end effector 130 and the vertical axis is the target moving speed V of the end effector 130.
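The update of step S13 for this example can be sketched in Python as below. The fixed reduction factor `slow_factor` and the numerical values are assumptions; the text only states that the target speed of the violating section is made smaller.

```python
def slow_down_violating_sections(speeds, peak_forces, force_limit,
                                 slow_factor=0.8):
    """Lower the target speed only in sections whose peak force exceeds the limit."""
    return [v * slow_factor if f > force_limit else v
            for v, f in zip(speeds, peak_forces)]


# Illustrative values: V1 = V2 = V3 = Vini = 0.2 m/s, Flim = 5.0 N.
initial_speeds = [0.2, 0.2, 0.2]
peak_forces = [2.0, 6.5, 2.5]   # Fmax1, Fmax2, Fmax3 from the trial
updated_speeds = slow_down_violating_sections(initial_speeds, peak_forces, 5.0)
print(updated_speeds)
```

Here only V2 is reduced, while V1 and V3 keep their initial value.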
In the above description, the upper limit value Flim of the detection result of the force sensor 143 is given as the constraint condition, but minimizing the work time may be added as a further constraint condition. In that case, because Fmax1 and Fmax3 in FIG. 9 do not exceed the upper limit value Flim, in step S13 the motion adjustment device 112 adjusts the speed pattern so that the target speed V1 in section S1 and the target speed V3 in section S3 increase. Adjusting the speed pattern in this way makes it possible to shorten the work time further. FIG. 11 is a diagram showing another example of the updated speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 11, the horizontal axis is the position P of the end effector 130 and the vertical axis is the target moving speed V of the end effector 130.
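The variant with the work time criterion can be sketched in Python as follows. The factors `slow_factor` and `fast_factor`, the function names, and the work time estimate (section length divided by target speed, ignoring acceleration) are assumptions made for illustration.

```python
def rebalance_speeds(speeds, peak_forces, force_limit,
                     slow_factor=0.8, fast_factor=1.2):
    """Slow sections that exceed the force limit and speed up the others."""
    return [v * (slow_factor if f > force_limit else fast_factor)
            for v, f in zip(speeds, peak_forces)]


def estimated_work_time(division_points, speeds):
    """Estimate the work time assuming instantaneous speed changes."""
    lengths = [b - a for a, b in zip(division_points, division_points[1:])]
    return sum(length / v for length, v in zip(lengths, speeds))


points = [0.0, 0.30, 0.45, 0.70]          # P0..P3 in m (illustrative)
peak_forces = [2.0, 6.5, 2.5]             # Fmax1..Fmax3 in N, Flim = 5.0 N
slowed_only = [0.2, 0.16, 0.2]            # only V2 reduced
rebalanced = rebalance_speeds([0.2, 0.2, 0.2], peak_forces, 5.0)
print(estimated_work_time(points, slowed_only))
print(estimated_work_time(points, rebalanced))
```

Because raising V1 and V3 may push the force in those sections above Flim, the result would still have to be confirmed by a further trial and the check of step S12.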
When the motion command value is a speed command value, the division points P1 and P2 are the positions at which the target speed is switched, as shown in FIGS. 10 and 11. The division points P1 and P2 may be the points at which switching of the target speed starts or the points at which it is completed. The division points P1 and P2 may also be points at which the operating speed detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target speed.
The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. With them, the motion of the robot 120 is adjusted for each section. Because only the sections in which the detection value of the force sensor 143 exceeds the predetermined value are slowed down, the motion of the robot 120 can be adjusted so that no excessive load acts on the work target 200 or the surrounding environment 300 without needlessly slowing the work as a whole, and the adjustment of the motion of the robot 120 is also made easier. Furthermore, if the sections in which the detection value of the force sensor 143 is below the predetermined value are adjusted to move faster, the work as a whole can be made faster still.
As described above, the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment learn and update the optimum motion command value for each section. This enables a fine-grained design of motion command values that conventional adjustment could not achieve and, as a result, realizes high-speed, high-quality robot work.
Embodiment 3.
FIG. 12 is a block diagram showing a configuration example of the motion adjustment device 112b according to Embodiment 3 of the present invention together with its peripheral blocks. FIG. 12 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112b includes a command value learning unit 113b. The configurations of the motion adjustment device, the motion control system, and the robot system of the present embodiment are the same as those shown in FIG. 1, except that the motion adjustment device 112 is replaced with the motion adjustment device 112b. The motion adjustment device 112b of the present embodiment differs from the motion adjustment device 112 of Embodiment 2 in that section information is input to it. The section information includes the initial values of the section positions and the initial value of the motion command value in each section. A section position is the position of a division point Pi at either end of a section, for example a position at which the target value of the operating speed is switched. When the internal sensor 141 or the external sensor 142 detects that the end effector 130 has reached a predetermined position, the target value of the operating speed is switched.
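The position-triggered switching of the target speed can be sketched in Python as below. The class name `SpeedSwitcher`, its interface, and the numerical values are hypothetical. The sketch also assumes that the work progresses in one direction, so that a section is never re-entered once a division point has been passed.

```python
class SpeedSwitcher:
    """Switch the target speed when the measured position reaches a division point."""

    def __init__(self, switch_positions, target_speeds):
        if len(target_speeds) != len(switch_positions) + 1:
            raise ValueError("need exactly one more speed than switching positions")
        self._switch_positions = list(switch_positions)
        self._target_speeds = list(target_speeds)
        self._section = 0  # index of the section currently being executed

    def update(self, measured_position):
        """Advance past every division point already reached; return the target speed."""
        while (self._section < len(self._switch_positions)
               and measured_position >= self._switch_positions[self._section]):
            self._section += 1
        return self._target_speeds[self._section]


# Illustrative initial section information: P1, P2 in m and V1..V3 in m/s.
switcher = SpeedSwitcher(switch_positions=[0.30, 0.45],
                         target_speeds=[0.24, 0.16, 0.24])
for position in (0.10, 0.31, 0.29, 0.50):
    print(position, switcher.update(position))
```

Keeping the section index as state means that sensor noise around a division point does not switch the target speed back and forth.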
The motion adjustment device, motion control system, and robot system of the present embodiment adjust the motion command value for each section, as in the second embodiment. Adjusting the motion command value section by section allows the command value learning unit 113b to act as a learner that sets low-speed motion in sections where a collision or the like occurs and high-speed motion in all other sections. Such a learner can automatically learn motion command values that achieve high-speed work. The command value learning unit 113b automatically learns the motion command value corresponding to each section. For simplicity, the following description assumes that the motion adjustment device 112b adjusts only the motion command value and does not adjust the control parameters.
In the motion adjustment device, motion control system, and robot system of the present embodiment, the command value learning unit 113b takes as inputs the section information, the constraint conditions, the detection values of the sensor 140, and the motion command values before updating, and updates each motion command value. The section information is defined in order to divide the motion command value into N sections. The division points used for this division are defined as Pi (i = 0, 1, 2, ..., N+1), where N is a natural number. Here, the start point and the end point of the motion are also counted as division points, and the start point is P0. The interval between the division point Pi and the division point immediately preceding it is called section Si (i = 0, 1, 2, ..., N).
In the motion adjustment device, motion control system, and robot system of the present embodiment, a section is assumed to be defined each time the work state changes. For example, in a fitting operation that uses a force sensor, the division points Pi are defined before and after the parts being fitted come into contact and before and after the contact state changes. The division points Pi are defined according to the expected changes in contact state, and the overall work is sped up by switching to target values of the motion command, such as position, velocity, and acceleration, that suit each section. A feature of the motion adjustment device, motion control system, and robot system of the present embodiment is that appropriate positions of the division points Pi and the command value pattern of each section Si are defined from past trial information.
The constraint condition input to the command value learning unit 113b defines the boundary between work success and work failure for the sped-up work. In a sped-up work motion, there is a risk that the end effector 130 collides strongly with the work target 200 because of, for example, position control errors of the end effector 130. A strong collision may damage the end effector 130 or the work target 200 and cause the work to fail. Taking such past work failures into account, motion command values for high-speed, low-impact work can be generated either by having the user define the constraint conditions at design time or by defining the constraint conditions from past trial data.
Examples of constraint conditions include a limit range of position, a limit range of posture, upper and lower limits of motion speed, upper and lower limits of force, and upper and lower limits of moment. In particular, when the positions and postures of the robot 120 and the work target 200 can be acquired, a limit value defined as either an upper or a lower limit on the position and posture of the robot 120, or on the relative position and posture between the robot 120 and the surrounding environment 300, can be input as a constraint condition.
Data acquired by the internal sensor 141 or the external sensor 142 is called sensor information. Where necessary, the sensor information is preprocessed, for example by filtering to remove noise or by extracting only values that exceed a threshold.
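By way of non-limiting illustration, the two kinds of preprocessing mentioned above can be sketched in Python as follows. The embodiment does not specify the filter type; the trailing moving average, the function names `moving_average` and `extract_above`, and their parameters are assumptions made only for this sketch.

```python
from typing import List, Tuple


def moving_average(samples: List[float], window: int) -> List[float]:
    """Smooth a signal with a trailing moving average (one possible noise filter)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(samples):
        running_sum += value
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= samples[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


def extract_above(samples: List[float], threshold: float) -> List[Tuple[int, float]]:
    """Keep only (index, value) pairs whose value exceeds the threshold."""
    return [(i, value) for i, value in enumerate(samples) if value > threshold]
```

For example, `extract_above(moving_average(raw, 5), limit)` would return the smoothed samples that exceed `limit`, where `raw` and `limit` are hypothetical inputs.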
A motion command value is a control command value that can be input to the position control system of the robot system 100. It is sometimes simply called a command value. The motion of the robot 120 is controlled through the motion of the motor of each axis. Motion command values include, for example, position command values, speed command values, and current command values for controlling the motors. The motion adjustment device 112b can also generate equivalent time-series data of position command values from a velocity pattern derived from a profile of velocity against time, and input that data to the robot control device 111. Motion command values can also be generated inside the robot control device 111.
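As one possible illustration of deriving position command values from a velocity pattern, the following Python sketch integrates a sampled velocity pattern with a fixed time step. The forward Euler integration, the function name `positions_from_velocity`, and the parameters `dt` and `x0` are assumptions for this sketch; the embodiment does not prescribe an integration method.

```python
from typing import List


def positions_from_velocity(velocities: List[float], dt: float, x0: float = 0.0) -> List[float]:
    """Integrate sampled velocities (forward Euler) into a position time series.

    velocities[k] is assumed constant over the k-th interval of length dt.
    The result has one more element than the input and starts at x0.
    """
    positions = [x0]
    for v in velocities:
        positions.append(positions[-1] + v * dt)
    return positions
```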
The motion adjustment device 112b of the present embodiment takes out the command value held inside the robot control device 111, then adjusts and updates the motion command value according to the sensor information obtained as the response when the robot 120 performs the work. The same applies to the other embodiments. In the following description, the motion command value output from the motion control system is assumed to be a speed command value. As an alternative configuration, the motion adjustment device 112b may pass to the robot control device 111 not the motion command value itself but the parameters needed to generate it. For example, the motion adjustment device 112b may input only the section positions and the target value of the motion speed in each section to the robot control device 111. In that case, the robot control device 111 generates the motion command value from the input section positions and motion speed target values.
The motion adjustment device 112b includes the command value learning unit 113b, which adjusts and updates the motion command value. The command value learning unit 113b obtains a new motion command value based on the section information, the constraint conditions, the motion command value before updating, and the detection values of the sensor 140. When obtaining a new motion command value, the command value learning unit 113b is designed to evaluate the speed and the quality of the work with an evaluation function and to search for a fast motion that is unlikely to damage the work target 200. The motion adjustment device 112b may also be configured to adjust and update the control parameters used by the robot control device 111. The control parameters are likewise adjusted and updated by the command value learning unit 113b.
FIG. 13 is a block diagram showing a configuration example of the command value learning unit 113b according to the third embodiment of the present invention, together with its peripheral blocks. FIG. 13 shows an extracted part of the configuration of the robot system 100. The command value learning unit 113b includes a storage unit 114 and a learning processing unit 115. An example of the search method used in the command value learning unit 113b is described with reference to FIG. 13. The number of sections is assumed to be defined in advance as N = 4. The speed target value, which is the value of the target speed defined for each section, is used as the motion command value. The command value learning unit 113b achieves high-speed work by adjusting the speed target value in each section.
FIG. 14 is a diagram showing an example of the work performed by the robot system 100 according to the third embodiment of the present invention. As shown in FIG. 14, the robot system 100 inserts a first part 210 into a second part 310. FIG. 14 illustrates how the relative position of the first part 210 and the second part 310 changes as the work progresses, in the order (a), (b), (c), (d). The first part 210 corresponds to the work target 200, and the second part 310 corresponds to the surrounding environment 300.
The first part 210 has a hole 211, and the second part 310 has a protrusion 311. When the first part 210 is inserted into the second part 310, the protrusion 311 enters the hole 211. The first part 210 is made of a first material. The second part 310, by contrast, has a portion 312 made of the first material and a portion 313 made of a second material. As the first part 210 is inserted into the second part 310, the contact state between the two parts changes.
In the example shown in FIG. 14, the contacting portions of the parts and the contact state change as the work progresses from (b) to (d). The contact state includes the material of each part at the contact portion, the area of the contact portion, and so on. A change in the contact state changes the frictional force generated at the contact portion. In (b) of FIG. 14, a frictional force arises between the outer contours of the first part 210 and the second part 310. In (c) of FIG. 14, contact between the hole 211 and the protrusion 311 is added, so the frictional force increases. The change in frictional force between the parts also changes the detection result of the force sensor 143. In other words, in work such as fitting parts or inserting connectors, the reaction force between the parts changes as the work progresses, and the force sensor detects this reaction force.
As shown in FIG. 13, the command value learning unit 113b stores in the storage unit 114 the force information detected by the sensor 140 and the velocity pattern acquired from the robot control device 111. When trying the work in order to adjust the motion command value, the robot system 100 is assumed to be able to operate the robot 120 with a specified velocity pattern. Based on the force information, velocity pattern, section information, and constraint conditions stored in the storage unit 114, the learning processing unit 115 updates the velocity pattern and outputs it to the robot control device 111 as offline processing.
Although one velocity pattern is stored in the robot control device 111, when the motion command value is adjusted, the motion adjustment device 112b prompts the work to be tried with several kinds of velocity patterns for that single reference velocity pattern. As a result, the robot system 100 performs trials under various conditions during adjustment, and the motion adjustment device 112b adjusts the motion command value based on the data obtained from those trials. For example, the robot system 100 performs Na trials, each with a different motion command value, including motion command values that differ from the one stored in the robot control device 111. The motion adjustment device 112b takes the data obtained from the Na trials as input, performs learning once, and updates the motion command value. With Na trials as one set, performing Nb sets of trials causes the motion command value to converge in many cases, after which no further improvement occurs. Here, Na and Nb are integers of 1 or more.
As described above, the robot system 100 of the present embodiment performs trials using one or more set motion command values and generates an evaluation value from the force sensor data obtained. The motion adjustment device 112b updates the motion command value based on each evaluation value. In updating, the motion adjustment device 112b generates one or more motion command values, and the trials are performed again. When there is one motion command value, the motion adjustment device 112b ends the update if the evaluation value has converged in a graph plotting the evaluation values. When there are multiple motion command values, the motion adjustment device 112b ends the update if the evaluation value has converged in a graph plotting only the results for which the evaluation value corresponding to the motion command value is the minimum. In the latter case, the motion adjustment device 112b updates to the motion command value that gave the minimum evaluation value.
FIG. 15 is a flowchart showing an example of the processing flow of the learning processing unit 115 according to the third embodiment of the present invention. As shown in FIG. 15, the learning processing unit 115 first performs preprocessing as a preparatory stage in step S100, and then performs learning processing in step S200.
FIG. 16 is a flowchart showing an example of the flow of the preprocessing performed by the learning processing unit 115 according to the third embodiment of the present invention. To explain the operation, FIG. 16 also shows operations performed by blocks other than the learning processing unit 115. First, in step S101, the robot control device 111 sets the control parameters for force control. Next, in step S102, the robot control device 111 operates the robot 120 to try the work. Next, in step S103, the command value learning unit 113b acquires the data obtained in that trial. The data obtained in each trial is called trial data. The trial data includes the force information detected in the trial and the velocity pattern used in the trial. The force information is time-series data acquired by the force sensor 143 at predetermined time intervals during the trial, and is also called a force waveform. Next, in step S104, the storage unit 114 stores the data acquired in step S103.
Next, in step S105, the learning processing unit 115 determines whether K or more pieces of trial data have been acquired. Here, K is a natural number set in advance. If K or more pieces of trial data have not yet been acquired, the process returns to step S102. If K or more pieces have been acquired, the process proceeds to step S106. Thus, by the time the process reaches step S106, K pieces of trial data D1j (j = 1, 2, 3, ..., K) have been acquired and stored in the storage unit 114.
Next, in step S106, the learning processing unit 115 defines the section positions based on the K pieces of trial data stored in the storage unit 114. A section position is the position of a division point at either end of a section. The position of a division point corresponds, for example, to a position of the end effector 130, and is the position at which the speed target value is switched. It may be the point at which switching of the target speed starts or the point at which switching is completed. It may also be a point at which the motion speed detected by the internal sensor 141 is guaranteed to be within a predetermined error range of the target speed.
The learning processing unit 115 defines the section positions based on, for example, the mean and variance of the K pieces of trial data. By focusing on the rate of change of the force waveform and placing division points before and after the force waveform changes sharply, the learning processing unit 115 can determine the positions of the division points automatically. Alternatively, the user can manually designate as division points the points at which a state change occurs in the work at hand.
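As a minimal sketch of the automatic placement of division points, the following Python function brackets each run of samples in which the force waveform changes sharply. It is assumed here that the input is a single waveform, for instance the sample-wise mean of the K force waveforms; the function name `find_division_points` and the parameter `rate_threshold` are hypothetical and are not taken from the embodiment.

```python
from typing import List


def find_division_points(force: List[float], rate_threshold: float) -> List[int]:
    """Return sample indices just before and just after each sharp force change.

    A change is "sharp" when the absolute difference between consecutive
    samples exceeds rate_threshold (an assumed, user-chosen value).
    """
    points = []
    in_sharp_run = False
    for i in range(len(force) - 1):
        sharp = abs(force[i + 1] - force[i]) > rate_threshold
        if sharp and not in_sharp_run:
            points.append(i)  # last sample before the change starts
            in_sharp_run = True
        elif not sharp and in_sharp_run:
            points.append(i)  # first sample after the change ends
            in_sharp_run = False
    if in_sharp_run:
        points.append(len(force) - 1)  # change runs to the end of the data
    return points
```

The returned indices would still have to be mapped to end effector positions before they can serve as section positions.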
Next, in step S107, the learning processing unit 115 determines whether the section positions have been defined. If not, the process returns to step S106. If they have, the process proceeds to step S108. In step S108, the learning processing unit 115 defines a speed target value for each section. The speed target value is calculated from the upper limit of force specified by the user and the target takt time.
Specifically, the learning processing unit 115 sets the standard work speed needed to complete the work within the target takt time as the overall speed target value Vdn. Next, the learning processing unit 115 defines the speed upper limit Vmax based on the upper limit of force. The relationship between the speed at which the end effector 130 collides with the work target 200 or the surrounding environment 300 and the external force then applied to the end effector 130 can be obtained in advance from, for example, stiffness information of the work target. The learning processing unit 115 can obtain the speed upper limit Vmax by referring to a table or the like that stores this relationship.
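The table lookup for Vmax could, for example, be sketched in Python as below. The table layout (pairs of collision speed and resulting force, sorted by speed, with force strictly increasing), the use of linear interpolation, and the name `speed_upper_limit` are all assumptions made for this illustration.

```python
from typing import List, Tuple


def speed_upper_limit(table: List[Tuple[float, float]], force_limit: float) -> float:
    """Return the largest collision speed whose force stays within force_limit.

    table holds (speed, force) pairs sorted by speed; force is assumed to
    increase strictly with speed. Values between entries are interpolated
    linearly. If the limit lies beyond the table, the last speed is returned.
    """
    prev_speed, prev_force = table[0]
    if force_limit < prev_force:
        raise ValueError("force limit is below the smallest tabulated force")
    for speed, force in table[1:]:
        if force >= force_limit:
            ratio = (force_limit - prev_force) / (force - prev_force)
            return prev_speed + ratio * (speed - prev_speed)
        prev_speed, prev_force = speed, force
    return prev_speed
```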
The learning processing unit 115 determines the speed target value Vd using the overall speed target value Vdn and the speed upper limit Vmax. The speed target value Vd is greater than 0 and less than the speed upper limit Vmax, and is set so that it gradually approaches Vdn. For example, under the condition 0 < Vd < Vdn < Vmax, the learning processing unit 115 uses random numbers to define multiple speed target values Vd so that the speed parameters are spread out to some degree. In this way, in step S108 the learning processing unit 115 determines speed target values Vd that are scattered within the prescribed range. Next, in step S109, the learning processing unit 115 determines whether the speed target values have been defined. If not, the process returns to step S108. If they have, the preprocessing ends. The preprocessing thus determines the initial values used in the learning processing.
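The random initialization in step S108 might be sketched in Python as follows. The uniform distribution, the sampling band given by `low_frac` and `high_frac`, the fixed `seed`, and the function name `initial_speed_targets` are assumptions of this sketch; the embodiment states only that random numbers are used under the condition 0 < Vd < Vdn < Vmax.

```python
import random
from typing import List


def initial_speed_targets(n_sections: int, vdn: float, vmax: float, n_sets: int,
                          seed: int = 0, low_frac: float = 0.2,
                          high_frac: float = 0.95) -> List[List[float]]:
    """Draw n_sets candidate lists of per-section speed targets Vd.

    Every Vd is sampled uniformly from [low_frac * Vdn, high_frac * Vdn],
    which keeps 0 < Vd < Vdn as long as 0 < low_frac <= high_frac < 1.
    """
    if not 0.0 < vdn < vmax:
        raise ValueError("expected 0 < Vdn < Vmax")
    if not 0.0 < low_frac <= high_frac < 1.0:
        raise ValueError("expected 0 < low_frac <= high_frac < 1")
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    return [[rng.uniform(low_frac * vdn, high_frac * vdn) for _ in range(n_sections)]
            for _ in range(n_sets)]
```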
FIG. 17 is a flowchart showing an example of the flow of the learning processing performed by the learning processing unit 115 according to the third embodiment of the present invention. To explain the operation, FIG. 17 also shows operations performed by blocks other than the learning processing unit 115. First, in step S201, the robot control device 111 operates the robot 120 to try the work. Next, in step S202, the command value learning unit 113b acquires the trial data obtained in that trial. Next, in step S203, the storage unit 114 stores the trial data acquired in step S202.
Next, in step S204, the learning processing unit 115 determines whether M or more pieces of trial data have been acquired. Here, M is a natural number set in advance. If M or more pieces of trial data have not yet been acquired, the process returns to step S201. If M or more pieces have been acquired, the process proceeds to step S205. Thus, by the time the process reaches step S205, M pieces of trial data D2j (j = 1, 2, 3, ..., M) have been acquired and stored in the storage unit 114. If M sets of section positions and speed target values have been defined, each trial in step S201 is performed with a different set. Consequently, by the time the process reaches step S205, M pieces of trial data D2j corresponding to the M sets of section positions and speed target values have been stored.
Next, in step S205, the learning processing unit 115 computes an evaluation value for each of the M pieces of trial data based on the constraint conditions. The computed evaluation values are stored. Next, in step S206, the learning processing unit 115 finds the section positions and speed target values corresponding to the trial data with the best evaluation value among the M pieces. Next, in step S207, the learning processing unit 115 compares the best of the M newly obtained evaluation values with the evaluation values obtained in the past and determines whether the result has converged to the best evaluation value. If it has converged, the process proceeds to step S209, where the processing to complete the adjustment is performed, and the adjustment of the motion command value ends. At that point, the section positions and speed target values that gave the best evaluation value become the result of the adjustment. If it has not yet converged, the process proceeds to step S208.
Next, in step S208, the learning processing unit 115 newly defines M sets of section positions and speed target values, thereby updating them. The M sets differ from one another in section positions or speed target values. That is, in step S208 the learning processing unit 115 sets M new motion command values. Each of the M motion command values has as parameters the section positions and the speed target values corresponding to those section positions. Each motion command value has one more section position than the number of sections and as many speed target values as there are sections. When step S208 is complete, the process returns to step S201.
As described above, in the learning processing the robot system 100 performs M trial runs based on the set section positions and the speed target values set for each section. The M trial runs are each performed under conditions that differ in section positions or speed target values. Each time M trials are completed, the learning processing unit 115 updates the positions of the division points and the speed target value for each section.
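The loop of steps S201 to S208 can be outlined in Python as shown below. This is only a sketch under stated assumptions: the callbacks `run_trial`, `evaluate`, and `propose`, the convergence test based on the improvement of the best evaluation value falling below `tol`, and the cap `max_sets` are hypothetical. A smaller evaluation value is treated as better, in line with the description of the minimum evaluation value above.

```python
from typing import Callable, List, Tuple, TypeVar

Candidate = TypeVar("Candidate")  # one set of section positions and speed targets
Trial = TypeVar("Trial")          # trial data (e.g. force waveform and velocity pattern)


def learn(candidates: List[Candidate],
          run_trial: Callable[[Candidate], Trial],
          evaluate: Callable[[Trial], float],
          propose: Callable[[Candidate, int], List[Candidate]],
          tol: float = 1e-6,
          max_sets: int = 100) -> Tuple[Candidate, float]:
    """Repeat sets of M trials until the best evaluation value stops improving."""
    best_candidate, best_score = None, float("inf")
    for _ in range(max_sets):
        # S201 to S205: try every candidate once and score the trial data.
        scored = [(evaluate(run_trial(c)), c) for c in candidates]
        # S206: pick the candidate with the best (smallest) evaluation value.
        set_score, set_candidate = min(scored, key=lambda pair: pair[0])
        improvement = best_score - set_score
        if set_score < best_score:
            best_candidate, best_score = set_candidate, set_score
        # S207: assumed convergence test on the improvement over past sets.
        if improvement < tol:
            break
        # S208: define M new candidates around the current best.
        candidates = propose(best_candidate, len(candidates))
    return best_candidate, best_score
```

In an actual system, `run_trial` would operate the robot and return the recorded trial data, and `propose` would generate M new sets of section positions and speed target values.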
FIG. 18 is a diagram showing an example of a velocity pattern used in a trial in the robot system 100 according to the third embodiment of the present invention. FIG. 19 is a diagram showing an example of the force information acquired during a trial in the robot system 100 according to the third embodiment of the present invention. In FIG. 18 and FIG. 19, P0 to P3 are the positions of the division points and S1 to S4 denote the four sections. In FIG. 18, V1 to V4 denote the speed target values in the respective sections. FIG. 19 shows the force information acquired in a trial that used the velocity pattern shown in FIG. 18.
In assembly work such as that shown in FIG. 14, the reaction force acting between the end effector 130 holding the first part 210 and the second part 310 may exceed the limit value near the position where the parts come into contact. In that case, the amount of force beyond the limit value can be evaluated as a limit excess amount. In FIG. 19, the magnitude F of the force detected by the force sensor 143 exceeds the limit value L0 in section S2. When the detected force magnitude F exceeds the limit value L0, the limit excess amount DH is obtained as the difference between F and L0. If there is a section in which the limit excess amount DH is larger than a set threshold, the speed target value of that section needs to be adjusted.
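As an illustrative sketch, the limit excess amount DH can be computed per section from sampled positions and forces as follows. The function name `limit_excess_per_section`, the representation of sections by a sorted list `boundaries` with one more entry than the number of sections, and the use of the peak force within each section are assumptions of this sketch.

```python
from bisect import bisect_right
from typing import List


def limit_excess_per_section(positions: List[float], forces: List[float],
                             boundaries: List[float], limit: float) -> List[float]:
    """Return, for each section, the largest amount by which F exceeds the limit.

    boundaries is a sorted list of division point positions; section k spans
    boundaries[k] to boundaries[k + 1]. Sections that never exceed the limit
    report 0.0. Samples outside the boundaries are assigned to the nearest
    end section.
    """
    n_sections = len(boundaries) - 1
    excess = [0.0] * n_sections
    for x, f in zip(positions, forces):
        k = min(max(bisect_right(boundaries, x) - 1, 0), n_sections - 1)
        excess[k] = max(excess[k], f - limit)
    return excess
```

A section whose entry is larger than the set threshold would be a candidate for lowering its speed target value.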
In FIG. 19, the force F detected in section S2 is large. The learning processing unit 115 therefore adjusts the velocity pattern of FIG. 18 so that the speed target value V2 of section S2 becomes smaller. The learning processing unit 115 also adjusts the positions of the division points P1 and P2 at the two ends of section S2. In the velocity pattern of FIG. 18, division point P1 is where the speed target value starts to decrease and division point P2 is where it starts to increase. In other words, the learning processing unit 115 also adjusts the positions at which changes in the speed target value begin. These adjustments are made on the basis of the constraint conditions.
For example, when a limit value L0 is set on the force magnitude F as a constraint condition, the evaluation function is defined so that the evaluation value relating to F is 0 for any trial in which F does not exceed the upper limit L0. As long as the evaluation value relating to F is not 0, the learning processing unit 115 keeps updating the speed target value V2 and the positions of the division points P1 and P2, thereby adjusting the motion command value. The evaluation function can also be defined so that, at the same time, the task is performed as fast as possible. In FIG. 19, the detected force magnitude F in sections S1, S3 and S4 has a margin DL with respect to the limit value L0. Here, the margin DL is either the amount remaining up to the limit value L0 or an index derived from that amount, where the amount remaining up to L0 is defined as the difference between L0 and the detected force magnitude F. When the margin DL is greater than 0, the speed target value can be adjusted and updated upward. Adjustments of this kind make the task as fast as possible.
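One simple way to picture this two-sided adjustment is a proportional rule that lowers the speed target where the limit is exceeded and raises it where a margin remains. The rule, the gains `gain_down` and `gain_up`, and the normalization by L0 are assumptions chosen for illustration; the description itself leaves the update to a learning or optimization method.

```python
def update_speed_targets(speeds, peak_forces, limit, gain_down=0.5, gain_up=0.1):
    """Return new speed targets, one per section.

    speeds: current speed target values V1..Vn.
    peak_forces: largest force magnitude F seen in each section.
    limit: the limit value L0.
    gain_down, gain_up: hypothetical tuning gains (not from the source).
    """
    if len(speeds) != len(peak_forces):
        raise ValueError("one peak force is needed per section")
    updated = []
    for speed, peak in zip(speeds, peak_forces):
        if peak > limit:
            excess = (peak - limit) / limit      # DH, normalized by L0
            speed *= max(0.0, 1.0 - gain_down * excess)
        else:
            margin = (limit - peak) / limit      # DL, expressed as an index
            speed *= 1.0 + gain_up * margin
        updated.append(speed)
    return updated


print(update_speed_targets([100.0, 100.0], [12.0, 5.0], limit=10.0))
```

A section whose peak force sits exactly at L0 has DL = 0 and keeps its speed target, which matches the statement that the target is raised only when DL is greater than 0.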
These adjustments are performed in steps S205 and S206 of FIG. 17. Machine learning or an optimization method using an evaluation function can be applied to find the positions of the division points Pi and the speed target values that give the best evaluation value. Examples include reinforcement learning, Bayesian optimization and particle swarm optimization. With these methods, the motion command value that gives the best evaluation value can be set. Suppose, for example, that an evaluation function Fq is defined by equation (1) using the force F(t) detected at each point in time during the task and the task time T. By adjusting the motion command value so that the evaluation value calculated with Fq becomes smaller, the learning processing unit 115 can find a motion command value that reduces both the force F(t) and the task time T. As shown in FIG. 17, the adjustment is complete when the evaluation value obtained from the evaluation function has converged.
[Equation (1): evaluation function Fq. The formula is an image (JPOXMLDOC01-appb-M000001) and is not reproduced in this text.]
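Because equation (1) is not reproduced in this text, the following sketch does not claim to be Fq. It assumes one plausible form, a weighted sum of the force in excess of L0 and the task time T, so that the force term is 0 for trials that stay within the limit and a smaller value is better. The weights, the function names and the convergence test are all hypothetical.

```python
def evaluation_value(forces, task_time, limit, w_force=1.0, w_time=1.0):
    """Assumed evaluation value: smaller is better.

    forces: force samples F(t) recorded during one trial.
    task_time: the task time T of that trial.
    limit: the limit value L0.
    """
    # The force term vanishes when no sample exceeds the limit.
    excess = sum(max(0.0, abs(f) - limit) for f in forces)
    return w_force * excess + w_time * task_time


def has_converged(history, tolerance=1e-3, window=3):
    """True when the last window + 1 evaluation values differ by <= tolerance."""
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return max(recent) - min(recent) <= tolerance


print(evaluation_value([5.0, 12.0, 8.0], task_time=2.0, limit=10.0))
```

Any of the optimizers named in the description could minimize such a value over the speed targets and division points; the convergence check stands in for the stopping condition of FIG. 17.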
The motion adjustment device 112, motion control system 110 and robot system 100 of this embodiment are configured as described above. With them, the motion of the robot 120 is adjusted section by section. The motion of the robot 120 can therefore be adjusted so that no excessive load acts on the work target 200 or the surrounding environment 300, without needlessly slowing the task as a whole, and the adjustment itself becomes easier. Furthermore, if the device is configured so that sections in which the detected value of the force sensor 143 is smaller than a predetermined value are adjusted to run faster, the task as a whole can be made faster still.
As described above, the motion adjustment device 112, motion control system 110 and robot system 100 of this embodiment learn and update the optimal motion command value for each section. This makes possible a fine-grained design of motion command values that conventional adjustment could not achieve and, as a result, fast, high-quality robot work. Specifically, in tasks such as fitting parts together or inserting connectors, the task time can be shortened while the reaction force between the mating parts is kept low.
Embodiment 4.
FIG. 20 is a block diagram showing a configuration example of the motion adjustment device 112c according to Embodiment 4 of the present invention together with its peripheral blocks. The rest of the configuration of the motion adjustment device, motion control system and robot system of this embodiment is the same as shown in FIG. 1. FIG. 20 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112c of this embodiment includes a command value learning unit 113b and a command value sectioning unit 116.
The command value sectioning unit 116 receives the pre-update motion command value from the robot control device 111, the sensor information (the detected values of the sensor 140) from the sensor 140, and the constraint conditions from outside. From these inputs, the command value sectioning unit 116 defines division points Pi (i = 0, 1, 2, ..., N+1) that divide the motion command value into sections, using the position of the end effector 130 or the like, or the command value progress rate, and outputs them as section information. The command value learning unit 113b is the same as that shown in FIG. 12.
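As an illustration of how section information keyed on the command value progress rate might be used, the sketch below looks up the section that contains a given progress value. It assumes that the progress rate runs from 0 to 1 and that P0 and P(N+1) are the start and end of the task; neither assumption is stated in the text, and the function name is hypothetical.

```python
from bisect import bisect_right


def section_index(progress, points):
    """Return the 0-based index of the section containing `progress`.

    points: sorted division points P0..P(N+1), assumed to include both ends,
            so that there are len(points) - 1 sections.
    """
    if not points[0] <= progress <= points[-1]:
        raise ValueError("progress lies outside the divided range")
    index = bisect_right(points, progress) - 1
    # The end point itself belongs to the last section.
    return min(index, len(points) - 2)


points = [0.0, 0.3, 0.6, 1.0]
print([section_index(p, points) for p in (0.0, 0.3, 0.59, 1.0)])
```

A division point is treated as the start of the section to its right, which is one possible convention among several.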
The motion adjustment device 112c of this embodiment uses feature quantities of the sensor information and the constraint conditions to determine the space to be divided, for example by applying machine learning, and generates the current division points Pi from the class information in the feature space so divided. Like the processing shown in FIG. 15, the motion adjustment device 112c performs preprocessing and learning processing. FIG. 21 is a flowchart showing an example of the preprocessing performed by the motion adjustment device 112c according to Embodiment 4 of the present invention, and FIG. 22 is a flowchart showing an example of its learning processing.
The preprocessing of FIG. 21 differs from the processing of FIG. 16 in that step S106b defines the number of sections in addition to the section positions. For example, sections can be generated automatically from waveform features. As waveform features, the maximum value or the frequency distribution of the data in each fixed time interval Tsmp is taken, for example, from position data, velocity data, force data and force change rate data acquired as time series, and clustering is performed on this input. A clustering method such as k-means, a type of machine learning, can be used to define boundaries between characteristic stretches of the waveform history. Suppose that X kinds of waveform feature have been defined in this way.
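The windowing and clustering step can be sketched with a small pure-Python k-means. For brevity the example uses a single signal (force) and one feature per window (the maximum magnitude), whereas the description allows several signals and also frequency distributions. The function names, the evenly spaced initial centers and the sample data are assumptions.

```python
def window_maxima(series, window):
    """Largest magnitude in each consecutive block of `window` samples (Tsmp)."""
    return [max(abs(x) for x in series[i:i + window])
            for i in range(0, len(series), window)]


def kmeans_1d(values, k, iterations=20):
    """Cluster scalar features into k groups; returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly between min and max.
    centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)] if k > 1 else [lo]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centers[c]))

    for _ in range(iterations):
        labels = [nearest(v) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = sum(members) / len(members)
    return [nearest(v) for v in values], centers


force = [0.1, 0.2, 0.1, 0.2, 5.0, 6.0, 5.5, 5.8, 0.3, 0.2, 0.1, 0.2]
features = window_maxima(force, window=2)
labels, centers = kmeans_1d(features, k=2)
print(features, labels)
```

Here k plays the role of X, the number of waveform feature types; how X itself is chosen is not covered by this sketch.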
Next, the original data can be labeled on the basis of the clusters obtained. For example, a similarity S(i) (i = 1, 2, 3, ..., X) between the input in question and each of the X clusters can be defined, expressing as a percentage how close the input is to the features of each group. The input is then labeled as belonging to the group with the largest percentage. Let L(t) be the label at each time, defined with time t as the variable. In step S106b, the number and positions of the sections can be defined by placing a boundary at all or some of the points where the label changes.
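The labeling and boundary placement described above can be sketched as follows. The inverse-distance similarity is only one possible way to obtain percentages and is an assumption, as are the function names and the fixed window duration used to turn label indices into times.

```python
def similarity_percentages(value, centers):
    """Similarity S(i) of `value` to each cluster center, summing to 100."""
    weights = [1.0 / (abs(value - c) + 1e-9) for c in centers]
    total = sum(weights)
    return [100.0 * w / total for w in weights]


def label_of(value, centers):
    """Index of the cluster with the largest similarity percentage."""
    scores = similarity_percentages(value, centers)
    return max(range(len(scores)), key=scores.__getitem__)


def division_points_from_labels(labels, window_time):
    """Times at which the label L(t) changes between consecutive windows."""
    return [i * window_time
            for i in range(1, len(labels))
            if labels[i] != labels[i - 1]]


centers = [0.2, 5.9]
features = [0.2, 0.2, 6.0, 5.8, 0.3, 0.2]
labels = [label_of(v, centers) for v in features]
print(labels, division_points_from_labels(labels, window_time=0.5))
```

This version places a boundary at every label change; the description also allows using only some of the changes, for example to suppress very short sections.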
The learning processing of FIG. 22, in turn, differs from the processing of FIG. 17 in three steps: S211, S212 and S213. In step S211, a first evaluation value is obtained from the sensor information, the motion command value, the control parameters and the constraint conditions, using an evaluation function for learning the number and positions of the sections. In step S212, the number and positions of the sections are learned and updated on the basis of the first evaluation value. In step S213, a second evaluation value for learning the motion command value is obtained. The learning processing of FIG. 22 thus learns the number and positions of the sections first and the motion command value afterward.
Including the above processing adds a framework for learning the section information automatically. The section information no longer has to be designed in advance from prior knowledge, which gives the particular benefit of a shorter design time.
Embodiment 5.
FIG. 23 is a block diagram showing a configuration example of the motion adjustment device 112d according to Embodiment 5 of the present invention together with its peripheral blocks. The rest of the configuration of the motion adjustment device, motion control system and robot system of this embodiment is the same as shown in FIG. 1. FIG. 23 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112d of this embodiment includes the command value sectioning unit 116 and a motion learning unit 117. The command value sectioning unit 116 is the same as that shown in FIG. 20.
FIG. 24 is a block diagram showing a configuration example of the motion learning unit 117 according to Embodiment 5 of the present invention. The motion learning unit 117 includes a command value learning unit 113b and a parameter learning unit 118. The command value learning unit 113b is the same as that shown in FIG. 20. The motion learning unit 117 receives the pre-update motion command value and control parameters from the robot control device 111, the constraint conditions from outside, the sensor information from the sensor 140, and the section information from the command value sectioning unit 116. These input signals are passed to the command value learning unit 113b and the parameter learning unit 118.
The parameter learning unit 118 adjusts not the direct behavior of the robot, such as the position, velocity and acceleration command values, but the gains, impedance parameters, filter design parameters and the like of the sensor feedback control system based on the external sensor. That is, the parameter learning unit 118 adjusts the control parameters of the feedback control system. Taking the section information, sensor information, constraint conditions, command values and control parameters as input, it updates the input control parameters to control parameters that satisfy the constraint conditions. Machine learning can be used for this update. As an example, the parameter learning unit 118 updates the control parameters so that the evaluation value given by a predefined evaluation function becomes larger, and repeats the computation until the value converges asymptotically. Depending on how the evaluation function is defined, the parameter learning unit 118 may instead update the control parameters so that the evaluation value becomes smaller.
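The repeat-until-converged update of a control parameter can be pictured with a simple step-halving search on a single gain. This is a stand-in chosen for brevity, not the learning method of the description: the function name, the step schedule and the toy evaluation function are assumptions, and in a real system each call to `evaluate` would correspond to one or more robot trials.

```python
def tune_gain(evaluate, gain, step=0.5, min_step=1e-4, max_iterations=1000):
    """Search for a gain that maximizes `evaluate`; returns (gain, value).

    evaluate: hypothetical callable mapping a gain to an evaluation value
              (larger is better).
    """
    best = evaluate(gain)
    for _ in range(max_iterations):
        if step < min_step:
            break  # updates have become negligibly small: treat as converged
        scored = [(evaluate(g), g) for g in (gain + step, gain - step)]
        top_value, top_gain = max(scored)
        if top_value > best:
            best, gain = top_value, top_gain
        else:
            step *= 0.5  # no improvement at this step size: refine
    return gain, best


# Toy evaluation function whose best gain is 2.0 (illustrative only).
gain, value = tune_gain(lambda k: -(k - 2.0) ** 2, gain=0.0)
print(round(gain, 3), round(value, 6))
```

For an evaluation function that should be minimized instead, the same routine can be applied to its negative. Running it once per section gives one parameter set per section.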
In FIG. 24, the parameter learning unit 118 is shown as a component independent of the command value learning unit 113b. However, the two units do not necessarily have to perform independent processing. For example, the parameter learning unit 118 and the command value learning unit 113b can also process simultaneously using a single evaluation function. The parameter learning unit 118 adjusts the control parameters section by section. The number of sections used by the command value learning unit 113b and the number used by the parameter learning unit 118 are not necessarily the same; for example, the parameter learning unit 118 may use more sections than the command value learning unit 113b.
The parameter learning unit 118 can update not only the control parameters of the sensor feedback control system based on the external sensor 142 but also those of the feedback control system based on the internal sensor 141. As a result, higher-quality and faster robot work can be achieved.
FIG. 25 is a block diagram showing another configuration example of the motion adjustment device 112d according to Embodiment 5 of the present invention together with its peripheral blocks. FIG. 25 shows a configuration without the command value learning unit 113b. In this configuration, the motion adjustment device 112d does not update the motion command value and updates only the control parameters.
Embodiment 6.
When adjusting the velocity pattern, the motion adjustment device, motion control system and robot system of this embodiment set an upper or lower limit on the speed target value of each section Si, and define the search space of each section on the basis of the rigidity of the work target and constraints on the assembly quality of the work target. As a result, the search no longer visits motion command values or control parameters that are feasible but would cause assembly quality problems. The search can therefore converge on motion command values or control parameters that achieve fast assembly within the range that defines the work quality the user requires. This gives the particular benefit that the adjusted robot maintains a work quality in which the reaction force applied to the work target stays small and the work target is not damaged.
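Restricting the search space per section can be sketched as a clamp applied to every candidate the optimizer proposes. The function name and the bound values below are hypothetical; in practice the bounds would be derived from the rigidity and assembly quality constraints mentioned above.

```python
def clamp_to_search_space(speed_targets, bounds):
    """Clamp each section's speed target to its (lower, upper) bound.

    speed_targets: candidate speed target values, one per section Si.
    bounds: list of (lower, upper) pairs, one per section (assumed inputs).
    """
    if len(speed_targets) != len(bounds):
        raise ValueError("one (lower, upper) pair is needed per section")
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(speed_targets, bounds)]


candidate = [120.0, 5.0, 50.0]
bounds = [(10.0, 100.0), (10.0, 30.0), (10.0, 100.0)]
print(clamp_to_search_space(candidate, bounds))
```

Clamping is the simplest option; an optimizer could equally sample only inside the bounds so that out-of-range candidates are never generated.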
DESCRIPTION OF SYMBOLS: 100 robot system, 110 motion control system, 111 robot control device, 112, 112b, 112c, 112d motion adjustment device, 113, 113b command value learning unit, 114 storage unit, 115 learning processing unit, 116 command value sectioning unit, 117 motion learning unit, 120 robot, 130 end effector, 140 sensor, 141 internal sensor, 142 external sensor, 143 force sensor, 200 work target, 210 first part, 211 hole, 300 surrounding environment, 310 second part, 311 protrusion, 401 processor, 402 memory, 403 data bus.

Claims (19)

  1.  A motion adjustment device for a robot, used in a robot system that comprises a robot fitted with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs a task on a work target,
     the motion adjustment device comprising a command value learning unit that performs learning with, as input, a force acting on the end effector detected by an external sensor of the robot system, and thereby adjusts a motion command value transmitted from the robot control device to the robot to control the motion of the robot.
  2.  The motion adjustment device according to claim 1, wherein the command value learning unit adjusts the motion command value by performing learning with a range of the force acting on the end effector as a constraint condition.
  3.  The motion adjustment device according to claim 2, wherein the command value learning unit adjusts the motion command value by performing learning with an upper limit on the time required for the task as a constraint condition.
  4.  The motion adjustment device according to any one of claims 1 to 3, wherein the motion command value is a speed command value that is a target value of the movement speed of the end effector or a target value of the motion speed of the robot.
  5.  The motion adjustment device according to any one of claims 2 to 4, wherein the command value learning unit adjusts the motion command value for each of a plurality of sections into which the period from the start to the end of the task is divided.
  6.  The motion adjustment device according to claim 5, further comprising a command value sectioning unit that generates the plurality of sections by dividing the period from the start to the end of the task,
     wherein the command value learning unit adjusts the motion command value for each of the sections generated by the command value sectioning unit.
  7.  The motion adjustment device according to claim 6, wherein the command value sectioning unit adjusts the positions of division points that divide the task into the sections.
  8.  The motion adjustment device according to any one of claims 2 to 7, wherein the command value learning unit adjusts the motion command value by performing learning with either an upper limit or a lower limit of a force, a moment, a torque or a current value as a constraint condition.
  9.  The motion adjustment device according to any one of claims 2 to 8, wherein the command value learning unit adjusts the motion command value by performing learning with either an upper limit or a lower limit of the position and orientation of the robot, or of its position and orientation relative to the surrounding environment, as a constraint condition.
  10.  The motion adjustment device according to claim 5 or 6, wherein the command value learning unit performs an evaluation based on an evaluation function every M trials of the task (M being a natural number) and adjusts the motion command value.
  11.  The motion adjustment device according to claim 5 or 6, further comprising a parameter learning unit that learns control parameters in at least one of feedback control based on an internal sensor of the robot system and feedback control based on the external sensor,
     wherein the parameter learning unit updates the control parameters by performing learning based on section information, which is information on the sections, and sensor information obtained from the external sensor over a plurality of trials.
  12.  A motion adjustment device for a robot, used in a robot system that comprises a robot fitted with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs a task on a work target,
     the motion adjustment device comprising a parameter learning unit that performs learning with, as input, a force acting on the end effector detected by an external sensor of the robot system, and thereby learns control parameters in at least one of feedback control of the motion of the robot based on an internal sensor of the robot system and feedback control of the motion of the robot based on the external sensor.
  13.  The motion adjustment device according to claim 12, wherein the parameter learning unit adjusts the control parameters by performing learning with a range of the force acting on the end effector as a constraint condition.
  14.  The motion adjustment device according to claim 13, wherein the parameter learning unit adjusts the control parameters by performing learning with an upper limit on the time required for the task as a constraint condition.
  15.  The motion adjustment device according to any one of claims 12 to 14, wherein the parameter learning unit adjusts the control parameters for each of a plurality of sections into which the period from the start to the end of the task is divided.
  16.  The motion adjustment device according to claim 15, further comprising a command value sectioning unit that generates the plurality of sections by dividing the period from the start to the end of the task,
     wherein the parameter learning unit adjusts the control parameters for each of the sections generated by the command value sectioning unit.
  17.  The motion adjustment device according to claim 16, wherein the command value sectioning unit adjusts the positions of division points that divide the task into the sections.
  18.  A motion control system comprising:
     the motion adjustment device according to any one of claims 1 to 17; and
     a robot control device that controls the motion of the robot on the basis of the motion command value or the control parameters adjusted by the motion adjustment device.
  19.  A robot system comprising:
     the motion control system according to claim 18; and
     the robot controlled by the motion control system.
PCT/JP2018/040696 2017-11-14 2018-11-01 Robot motion adjustment device, motion control system, and robot system WO2019098044A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201880072607.7A CN111344120B (en) 2017-11-14 2018-11-01 Robot motion adjusting device, motion control system, and robot system
DE112018005832.8T DE112018005832B4 (en) 2017-11-14 2018-11-01 ROBOT MOTION ADJUSTMENT DEVICE, MOTION CONTROL SYSTEM AND ROBOT SYSTEM
JP2019523125A JP6696627B2 (en) 2017-11-14 2018-11-01 Robot movement adjustment device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017219048 2017-11-14
JP2017-219048 2017-11-14

Publications (1)

Publication Number Publication Date
WO2019098044A1 true WO2019098044A1 (en) 2019-05-23

Family

ID=66539027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/040696 WO2019098044A1 (en) 2017-11-14 2018-11-01 Robot motion adjustment device, motion control system, and robot system

Country Status (4)

Country Link
JP (1) JP6696627B2 (en)
CN (1) CN111344120B (en)
DE (1) DE112018005832B4 (en)
WO (1) WO2019098044A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200301510A1 (en) * 2019-03-19 2020-09-24 Nvidia Corporation Force estimation using deep learning
WO2020246059A1 (en) * 2019-06-06 2020-12-10 三菱電機株式会社 Parameter calculation device, robot control system, and robot system
WO2020255312A1 (en) * 2019-06-19 2020-12-24 三菱電機株式会社 Operation adjustment device for robot, operation control system, and robot system
JP2021013999A (en) * 2019-07-16 2021-02-12 ファナック株式会社 Robot control device
CN114367958A (en) * 2020-10-16 2022-04-19 精工爱普生株式会社 Force control parameter adjustment method
CN114474039A (en) * 2020-10-27 2022-05-13 精工爱普生株式会社 Method for supporting adjustment of parameter set of robot and information processing apparatus
CN114599488A (en) * 2019-10-28 2022-06-07 株式会社安川电机 Machine learning data generation device, machine learning device, work system, computer program, machine learning data generation method, and work machine manufacturing method

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102020103854B4 (en) 2020-02-14 2022-06-15 Franka Emika Gmbh Machine learning of a successfully completed robot application
JP2022054043A (en) * 2020-09-25 2022-04-06 セイコーエプソン株式会社 Method, program and information processing device for performing display about control parameter of robot
CN114095872A (en) * 2021-11-24 2022-02-25 南京工程学院 Quick positioning system and method based on machine vision feedback

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0852674A (en) * 1994-08-12 1996-02-27 Kobe Steel Ltd Position attitude determining method for manipulator
JP2008142810A (en) * 2006-12-07 2008-06-26 Fanuc Ltd Robot controller
JP2014128857A (en) * 2012-12-28 2014-07-10 Yaskawa Electric Corp Robot teaching system and robot teaching method
JP2016016488A (en) * 2014-07-09 2016-02-01 ファナック株式会社 Robot program correction system
JP2016215359A (en) * 2015-05-14 2016-12-22 ファナック株式会社 Processing system for adjusting processing tool rotation number and work-piece feeding speed

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05108108A (en) * 1991-05-10 1993-04-30 Nok Corp Compliance control method and controller
US8886359B2 (en) * 2011-05-17 2014-11-11 Fanuc Corporation Robot and spot welding robot with learning control function
US9517556B2 (en) * 2012-06-29 2016-12-13 Mitsubishi Electric Corporation Robot control apparatus and robot control method
EP2749974A2 (en) 2012-12-28 2014-07-02 Kabushiki Kaisha Yaskawa Denki Robot teaching system, robot teaching assistant device, and robot teaching method
JP5893664B2 (en) 2014-04-14 2016-03-23 ファナック株式会社 Robot control device for controlling a robot to be moved according to an applied force
DE102014216514B3 (en) 2014-08-20 2015-09-10 Kuka Roboter Gmbh Method for programming an industrial robot and associated industrial robots
CN106142081B (en) 2015-05-14 2021-03-02 发那科株式会社 Machining system for adjusting rotating speed of machining tool and feeding speed of workpiece
DE102015011910A1 (en) 2015-09-11 2017-03-16 Kuka Roboter Gmbh Method and system for controlling a robot assembly
JP6333795B2 (en) 2015-11-24 2018-05-30 ファナック株式会社 Robot system with simplified teaching and learning performance improvement function by learning
JP2017159428A (en) * 2016-03-11 2017-09-14 セイコーエプソン株式会社 Control device, robot, and robot system

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200301510A1 (en) * 2019-03-19 2020-09-24 Nvidia Corporation Force estimation using deep learning
CN113950393B (en) * 2019-06-06 2023-11-17 三菱电机株式会社 Parameter calculation device, robot control system, and robot system
WO2020246059A1 (en) * 2019-06-06 2020-12-10 三菱電機株式会社 Parameter calculation device, robot control system, and robot system
WO2020246005A1 (en) * 2019-06-06 2020-12-10 三菱電機株式会社 Parameter calculation device, robot control system, and robot system
JP6833115B1 (en) * 2019-06-06 2021-02-24 三菱電機株式会社 Parameter calculator, robot control system, and robot system
CN113950393A (en) * 2019-06-06 2022-01-18 三菱电机株式会社 Parameter calculation device, robot control system, and robot system
WO2020255312A1 (en) * 2019-06-19 2020-12-24 三菱電機株式会社 Operation adjustment device for robot, operation control system, and robot system
JPWO2020255312A1 (en) * 2019-06-19 2021-11-25 三菱電機株式会社 Robot motion adjustment device, motion control system and robot system
JP7098062B2 (en) 2019-06-19 2022-07-08 三菱電機株式会社 Robot motion adjustment device, motion control system and robot system
JP2021013999A (en) * 2019-07-16 2021-02-12 ファナック株式会社 Robot control device
JP7448317B2 (en) 2019-07-16 2024-03-12 ファナック株式会社 Robot control device
CN114599488A (en) * 2019-10-28 2022-06-07 株式会社安川电机 Machine learning data generation device, machine learning device, work system, computer program, machine learning data generation method, and work machine manufacturing method
CN114367958A (en) * 2020-10-16 2022-04-19 精工爱普生株式会社 Force control parameter adjustment method
US11839978B2 (en) 2020-10-16 2023-12-12 Seiko Epson Corporation Method of adjusting force control parameter
CN114367958B (en) * 2020-10-16 2023-12-22 精工爱普生株式会社 Force control parameter adjusting method
JP7528709B2 (en) 2020-10-16 2024-08-06 セイコーエプソン株式会社 Force control parameter adjustment method
CN114474039B (en) * 2020-10-27 2023-06-06 精工爱普生株式会社 Method for supporting adjustment of parameter set of robot and information processing apparatus
CN114474039A (en) * 2020-10-27 2022-05-13 精工爱普生株式会社 Method for supporting adjustment of parameter set of robot and information processing apparatus

Also Published As

Publication number Publication date
CN111344120A (en) 2020-06-26
DE112018005832B4 (en) 2023-11-02
CN111344120B (en) 2023-04-07
JPWO2019098044A1 (en) 2019-11-21
DE112018005832T5 (en) 2020-07-30
JP6696627B2 (en) 2020-05-20

Similar Documents

Publication Publication Date Title
WO2019098044A1 (en) Robot motion adjustment device, motion control system, and robot system
US10618164B2 (en) Robot system having learning control function and learning control method
US9421687B2 (en) Robot control apparatus and robot control method
US20190015972A1 (en) Robot system and robot teaching method
US10350749B2 (en) Robot control device having learning control function
JP5480198B2 (en) Spot welding robot with learning control function
US20170090459A1 (en) Machine tool for generating optimum acceleration/deceleration
JP5971342B2 (en) Robot control parameter adjustment method, robot system, and robot control apparatus
JP6006256B2 (en) Robot controller with functions to simplify teaching work and improve operation performance
WO2019116891A1 (en) Robot system and robot control method
EP1587162B1 (en) Self-calibrating sensor orienting system
CN109202894B (en) Robot performing learning control and control method thereof
CN109421049B (en) Robot system
WO2021139373A1 (en) Hybrid control method, apparatus and system for robot arm
KR102033241B1 (en) Device and method controlling a movement speed of robot
EP1586421B1 (en) Self-calibrating orienting system for a manipulating device
US20220009101A1 (en) Control device, control method, and non-transitory recording medium
CN107398903B (en) Track control method for industrial mechanical arm execution end
JP7098062B2 (en) Robot motion adjustment device, motion control system and robot system
US9827673B2 (en) Robot controller inhibiting shaking of tool tip in robot equipped with travel axis
CN111989193A (en) Method and control system for controlling motion trail of robot
JP2009045678A (en) Method for judging success or failure of operation of robot, and robot system
JP7284874B1 (en) ROBOT CONTROL DEVICE AND ROBOT CONTROL METHOD
Sun et al. Motion-reproduction system adaptable to position fluctuation of picking objects based on image information
JP7426333B2 (en) robot control device

Legal Events

Date Code Title Description
ENP Entry into the national phase
Ref document number: 2019523125; Country of ref document: JP; Kind code of ref document: A

121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 18879027; Country of ref document: EP; Kind code of ref document: A1

122 EP: PCT application non-entry in European phase
Ref document number: 18879027; Country of ref document: EP; Kind code of ref document: A1