WO2017134735A1 - Robot system, robot optimization system, and robot operation plan learning method - Google Patents


Info

Publication number
WO2017134735A1
Authority
WO
WIPO (PCT)
Prior art keywords
trajectory
robot
evaluation
unit
cost
Prior art date
Application number
PCT/JP2016/052979
Other languages
English (en)
Japanese (ja)
Inventor
祐太 是枝
敬介 藤本
潔人 伊藤
宣隆 木村
Original Assignee
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by 株式会社日立製作所 filed Critical 株式会社日立製作所
Priority to PCT/JP2016/052979
Publication of WO2017134735A1

Classifications

    • G — PHYSICS
    • G05 — CONTROLLING; REGULATING
    • G05B — CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 — Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 — Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion — electric

Definitions

  • the present invention relates to a robot motion planning method and motion planning system.
  • Robot motion generation is most easily performed by teaching playback.
  • In teaching playback, a remote controller called a teaching pendant is used, or a person directly grasps and moves the robot, to record a trajectory (a set of postures interpolating between the initial posture and the target posture); this is "teaching". The recorded trajectory is then reproduced faithfully ("playback") (Patent Document 1).
  • the motion plan is a technology that determines the trajectory of the motion of the robot based on some criteria with the input of the initial posture, the target posture and the surrounding environment.
  • For a vehicle-type (car) robot, the motion planning problem is that of searching for a route from one point to another.
  • For an industrial arm-type robot, the motion planning problem is that of searching for how to move into a specified posture.
  • A motion planning method currently in use samples a large number of postures interpolating between the initial posture and the target posture, and determines the trajectory so as to minimize a cost based on a predetermined cost function (Patent Document 2).
  • In another approach, pairs of the robot's current posture and the action to take in that posture are generated without explicitly computing a motion trajectory (Patent Document 3).
  • As prior art disclosing a method for supporting generation of robot motion by means such as learning, there is a pet robot that optimizes selection of the reaction expected by the operator based on operator evaluations (Patent Document 4).
  • Patent Document 1: JP 2006-346792 A; Patent Document 2: JP 2015-160253 A; Patent Document 3: JP 2005-56185 A; Patent Document 4: JP 11-175132 A
  • In teaching playback, the initial posture and the target posture are fixed. When the shape or position of the work object or the work environment changes, the teaching must therefore be redone, and teaching playback cannot substitute for a motion planning method.
  • In the pet robot of Patent Document 4, the operation is only selected from a predetermined list, and no method is disclosed for acquiring operations that are not explicitly given.
  • One aspect of the present invention is a robot system comprising: a trajectory generation unit that generates one or more trajectory candidates that the robot can take in order to reach a predetermined target state from the start state of the robot; a trajectory feature extraction unit that extracts, from a trajectory candidate, motion features that characterize the trajectory candidate; a trajectory selection criterion recording unit that records a trajectory selection criterion for calculating the appropriateness of a trajectory candidate from its motion features; a trajectory cost calculation unit that calculates, from the motion features and based on the trajectory selection criterion, a trajectory cost that is an index of the appropriateness of the trajectory candidate; a trajectory calculation unit that determines, using the trajectory cost, the trajectory candidate to be adopted as the robot's motion trajectory and outputs an operation signal; an operation unit that operates the robot based on the operation signal; an evaluation input unit that receives an evaluation on an ordinal scale from an operator for the operation result of the robot; an evaluation interpretation unit that, based on the trend of the input evaluations, determines the update amount of the trajectory cost of the trajectory candidate corresponding to the evaluated operation result; and a learning unit that changes the trajectory selection criterion recorded in the trajectory selection criterion recording unit so that the cost calculated by the trajectory cost calculation unit coincides with the trajectory cost after being updated by the update amount.
  • Another aspect of the present invention is a robot optimization system comprising: a trajectory generation unit that generates one or more trajectory candidates that the robot can take in order to reach a predetermined target state from the start state of the robot; a trajectory feature extraction unit that extracts, from a trajectory candidate, motion features that characterize the trajectory candidate; a trajectory selection criterion recording unit that records a trajectory selection criterion for calculating the appropriateness of a trajectory candidate from its motion features; a trajectory cost calculation unit that calculates a trajectory cost that is an index of the appropriateness of the trajectory candidate; a trajectory calculation unit that determines, using the trajectory cost, the candidate trajectory to be adopted as the robot's motion trajectory; a simulation unit that calculates, based on a predetermined calculation model, at least one of the robot motion and the influence of the robot motion on a predetermined virtual physical environment; a display unit that visualizes at least one of the robot motion and its influence on the predetermined virtual physical environment as a simulation result; an evaluation input unit that receives an evaluation on an ordinal scale from an operator for the simulation result of the display unit; an evaluation interpretation unit that determines the update amount of the trajectory cost of the trajectory candidate corresponding to the evaluated result; and a learning unit that changes the trajectory selection criterion recorded in the trajectory selection criterion recording unit so that the cost calculated by the trajectory cost calculation unit coincides with the trajectory cost after being updated by the update amount.
  • Another aspect of the present invention is a robot motion plan learning method comprising: a trajectory generation process for generating one or more trajectory candidates that the robot can take in order to reach a predetermined target state from the start state of the robot; a trajectory feature extraction process for extracting, from a trajectory candidate, motion features that characterize the trajectory candidate; a trajectory cost calculation process for calculating, based on a trajectory selection criterion for calculating the appropriateness of a trajectory candidate from its motion features, a trajectory cost that is an index of that appropriateness; a process of receiving an evaluation input on an ordinal scale; and a learning process that changes the trajectory selection criterion based on the evaluation input.
  • FIG. 1: Configuration diagram of an embodiment of the present invention
  • FIG. 2: Perspective view of a SCARA robot
  • FIG. 3: Block diagram showing the trajectory generation unit
  • FIG. 4: Block diagram showing the trajectory cost calculation unit
  • FIG. 5: Block diagram showing the candidate trajectory recording unit
  • FIG. 6: Table showing the data structure of the candidate trajectory recording unit
  • FIG. 7: Block diagram showing the trajectory calculation unit
  • FIG. 8: Block diagram showing the operation unit
  • FIG. 9: Perspective view showing a situation in which the evaluation input unit is used
  • FIG. 10: Plan view of an evaluation input unit with a different configuration
  • FIG. 11: Block diagram showing the automatic evaluation unit
  • FIG. 12: Table showing the data structure of the criterion recording unit
  • FIG. 13: Block diagram showing the evaluation interpretation unit
  • FIG. 14: Block diagram showing the initial function (cost initialization) unit
  • FIG. 15: Plan view showing the operation of a car-type robot
  • FIG. 16: Block diagram showing a different configuration
  • FIG. 17: Block diagram showing the trajectory calculation unit in that configuration
  • notations such as “first”, “second”, and “third” are attached to identify the constituent elements, and do not necessarily limit the number or order.
  • a number for identifying a component is used for each context, and a number used in one context does not necessarily indicate the same configuration in another context. Further, it does not preclude that a component identified by a certain number also functions as a component identified by another number.
  • FIG. 2 is a perspective view of an example of the SCARA robot.
  • the SCARA robot is, for example, a robot 200 having four joints as shown in FIG.
  • the shape of the robot is not limited to the configuration shown in FIG. 2, and may have five or more joints, or may be provided with other driving units such as a gripper.
  • the robot 200 can constitute all or part of the operation unit 106. Alternatively, the robot 200 may be remote and operate according to a command from the operation unit 106.
  • FIG. 1 shows an example of a system for controlling the robot 200 having the configuration shown in FIG.
  • This system can be configured, for example, by a server 201 connected to the robot 200 of FIG. 2 directly or via a network.
  • the server 201 includes a processing device 202, a storage device 203, an input device 204, and an output device 205.
  • the storage device 203 can be configured by a known magnetic disk device, a semiconductor memory, or a combination thereof.
  • functions such as calculation and control are realized in cooperation with other hardware by executing a program (software) stored in the storage device 203 by the processing device 202.
  • A program executed by the server 201, a function thereof, or a means for realizing the function may be referred to as a "function", "means", "part", "unit", "module", or the like.
  • a function in which the storage device 203 of the server 201 stores specific data or a means for realizing the function may be referred to as a “recording unit”.
  • the trajectory generation unit 101 generates one or more trajectory candidates that the robot 200 can take in order to reach a predetermined target state (target position) from a predetermined start state.
  • the trajectory candidates are selected so as to satisfy conditions such that the robot 200 does not collide with surrounding objects or with itself, and that each joint follows the robot motion model. This will be described in detail later with reference to FIG.
  • the trajectory feature extraction unit 102 extracts the motion features of each candidate trajectory (an index that well represents the trajectory properties) by referring to the values of variables that define the trajectory.
  • the trajectory selection criterion recording unit 104 records, as trajectory selection criteria, a calculation criterion for calculating, as a quantitative value, whether or not the trajectory is favorable for the operator from the motion characteristics of the robot motion.
  • the calculation standard can be expressed by, for example, a weight parameter between each neuron of the neural network. However, it is not limited to the above as long as it is a criterion for evaluating the appropriateness from the motion characteristics of the trajectory.
  • the trajectory cost calculation unit 103 receives the motion features extracted by the trajectory feature extraction unit 102 from the candidate trajectories generated by the trajectory generation unit 101.
  • the trajectory cost calculation unit 103 calculates the appropriateness of the candidate trajectory as a quantitative numerical value by applying the trajectory selection criterion recorded in the trajectory selection criterion recording unit 104 to the input motion feature, and calculates this value. Output as cost.
  • the track cost calculation unit 103 will be described in detail later with reference to FIG.
  • the trajectory selection criterion recording unit 104 records conversion parameters for calculating the appropriateness of trajectory candidates from the motion features extracted by the trajectory feature extraction unit 102 from the trajectory candidates generated by the trajectory generation unit 101.
  • the trajectory cost calculation unit 103 calculates a trajectory cost that is an index of the appropriateness of trajectory candidates based on the conversion parameter.
  • the trajectory calculation unit 105 determines the trajectory of the robot 200 so as to reduce the cost, and outputs a motor signal that moves the robot 200 according to the trajectory. This will be described in detail later with reference to FIG.
  • the operation unit 106 drives the four joints of the SCARA robot 200 based on the motor signal. Based on the operation of the robot 200 generated by the operation unit, for example, the operator gives an input of “good” or “bad” by the evaluation input unit 107. This will be described in detail later with reference to FIG.
  • the evaluation interpretation unit 108 determines a new cost for the motion features of the evaluated motion, using the tendency of multiple evaluations from the operator.
  • the new cost output by the evaluation interpretation unit 108 is higher for features of motion trajectories that received a "bad" evaluation and lower for features that received a "good" evaluation.
  • the cost of motion features irrelevant to the evaluation is determined so as not to change.
  • the learning unit 109 changes the trajectory selection criterion of the trajectory selection criterion recording unit 104 so that the cost calculated by the trajectory cost calculation unit 103 matches the cost calculated by the evaluation interpretation unit 108. By repeating the increase / decrease in costs and learning, the trajectory selection criteria for the motion features that show a strong evaluation tendency are changed.
  • the motion feature input to the trajectory cost calculation unit 103 is, for example, a distance x between the tip of the robot and the operation target.
  • the cost calculated by the trajectory cost calculation unit 103 can be calculated by c (x) which is a function for calculating the cost.
  • the trajectory selection criterion recording unit 104 stores the definition of c(x) as the calculation criterion. For example, if the distance x is 10 mm, 20 mm, and 30 mm, the cost for each is c(10), c(20), and c(30). Here, it is assumed that the costs calculated by the trajectory cost calculation unit 103 satisfy c(10) < c(20) < c(30).
  • the cost indicates that the smaller value is the “good” trajectory.
  • suppose the operator's evaluation input via the evaluation input unit 107 ranks 20 mm as best ("good"), followed by 10 mm and then 30 mm.
  • the operator's evaluation is an order scale indicating the order of candidates.
  • the learning unit 109 corrects the calculation criterion c (x) of the trajectory selection criterion recording unit 104 so as to approach the operator's evaluation.
  • c(x) is changed so that c(20) < c(10) < c(30).
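  • As an illustrative sketch of this example (not the disclosed implementation), the following toy code adjusts an assumed one-parameter cost function c(x) = (x − m)² until its ordering over the distances 10 mm, 20 mm, and 30 mm matches the operator's ordinal evaluation; the parameter m and the update rule are hypothetical.

```python
# Toy sketch: a cost over the hand-to-target distance x, with an assumed
# learnable parameter m (hypothetical, for illustration). m is nudged until
# the cost ordering matches the operator's evaluation c(20) < c(10) < c(30).

def cost(x, m):
    return (x - m) ** 2

def matches_ranking(m):
    # operator's ordinal evaluation: 20 mm best, then 10 mm, then 30 mm
    return cost(20, m) < cost(10, m) < cost(30, m)

m = 0.0          # initial criterion prefers the smallest distance
while not matches_ranking(m):
    m += 0.5     # shift the preferred distance upward

print(m)   # → 15.5
```

Any m in the open interval (15, 20) reproduces the operator's ordering; the loop stops at the first such value it reaches.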
  • the robot 200 of FIG. 2, having three joints and one push-down member (pusher), will be described as an example.
  • the trajectory generation unit 101 connects the current posture [θ_0^0, θ_1^0, θ_2^0, t^0] to the target posture [θ_0^L, θ_1^L, θ_2^L, t^L] given by the operator.
  • these postures and trajectories may have differential elements (velocity, acceleration) of the respective variables and conditions other than the posture, and are not limited to the above as long as they are elements for constituting the trajectory of the robot 200.
  • the target posture may be a set of a plurality of postures. The setting of the target state is not limited to that given by the operator, and may be automatically given from a robot control system or the like.
  • sampling of postures need not be random; it may be performed at regular intervals or based on a predetermined rule such as a quasi-random sequence.
  • the posture pair generation unit 302 generates a plurality of posture pairs from postures that are neighboring in the sense of (Equation 3). [Equation 3]
  • the definition of the neighborhood may be an L1 norm. It is assumed that ε uses a threshold set in advance by the operator or the like. The larger the set value ε, the more posture pairs there are, leading to an increase in calculation time; the smaller ε is, the fewer posture pairs there are, and a good trajectory may not be obtained.
  • a value may be given such that the number of pairs of predetermined postures is about 10 on average. Note that the value setting example is not limited to the above example.
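  • The posture-pair generation step described above can be sketched as follows; the L1-norm neighborhood test and all numeric values are illustrative, not taken from the disclosure.

```python
import itertools

# Sketch of posture-pair generation: sampled postures [theta0, theta1,
# theta2, t] are paired when their L1 distance is below the threshold eps
# (the set value epsilon in the text). All numbers are illustrative.

def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def posture_pairs(postures, eps):
    return [(i, j)
            for i, j in itertools.combinations(range(len(postures)), 2)
            if l1(postures[i], postures[j]) < eps]

postures = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.0, 0.0],   # neighbor of the first posture
    [2.0, 2.0, 2.0, 2.0],   # far from both
]
print(posture_pairs(postures, eps=1.0))   # → [(0, 1)]
```

Raising eps toward the L1 distances of the far posture would add more pairs, reflecting the trade-off against computation time noted above.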
  • the trajectory feature extraction unit 102 creates a feature quantity q from the trajectory by aggregating the posture values it contains.
  • the feature quantity q is at least one of, or a combination of: the Euclidean distance d between the hand (pusher tip) positions in the two postures, the change amount Δθ_i of each joint angle between the two postures, the velocity ω_i of each joint between the two postures, and the minimum distance l (clearance) between the robot and obstacles.
  • since a feature is any information representing characteristics of the trajectory relevant to the robot operation, various features can be adopted; the features are not limited to the above.
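  • As a hedged illustration of the feature extraction described above, the following sketch computes the hand distance d, joint-angle changes, and joint velocities for one posture pair; the function name, the fixed time step, and the assumption that hand positions are given directly (rather than computed by forward kinematics) are all illustrative.

```python
import math

# Illustrative extraction of motion features named in the text for one
# posture pair: hand Euclidean distance d, joint-angle changes, and joint
# velocities over a fixed time step dt. Hand positions are assumed inputs.

def extract_features(pose_a, pose_b, hand_a, hand_b, dt=0.1):
    d = math.dist(hand_a, hand_b)                     # hand travel
    dtheta = [b - a for a, b in zip(pose_a, pose_b)]  # joint-angle changes
    omega = [x / dt for x in dtheta]                  # joint velocities
    return {"d": d, "dtheta": dtheta, "omega": omega}

q = extract_features([0.0, 1.0], [0.1, 1.2],
                     hand_a=(0.0, 0.0), hand_b=(0.3, 0.4))
print(q["d"])   # → 0.5
```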
  • the trajectory cost calculation unit 103 will be described with reference to FIG.
  • the input preprocessing unit 401 applies PCA (principal component analysis) to the feature quantity q, and performs decorrelation of the input.
  • the calculation unit 402 receives the uncorrelated feature value as an input, and uses the neural network weight parameter recorded in the trajectory selection reference recording unit 104, and outputs a numerical value (cost) corresponding to the feature value q by the neural network.
  • the cost calculation model used by the calculation unit 402 is not limited to a neural network; any function whose parameters can change the contribution of each feature value may be used, such as a linear combination of the feature values of q or random forest regression.
  • in the case of a linear combination, the trajectory selection criterion recording unit 104 records the coefficient of each feature; in the case of random forest regression, it records the configuration of the decision trees. In this way, a cost is calculated for each trajectory candidate.
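  • A minimal sketch of the cost calculation with a linear-combination criterion (one of the alternatives the text explicitly allows) might look as follows; the feature names and weights are invented for illustration.

```python
# Minimal sketch of the trajectory cost calculation with the trajectory
# selection criterion held as a plain linear combination of feature values;
# feature names and weights are illustrative, not from the disclosure.

def trajectory_cost(features, weights):
    return sum(weights[name] * value for name, value in features.items())

criterion = {"d": 1.0, "clearance": -0.5}  # prefer short, well-cleared moves
c = trajectory_cost({"d": 0.4, "clearance": 0.2}, criterion)
print(round(c, 6))   # → 0.3
```

Evaluating this for every trajectory candidate yields the per-candidate costs used by the trajectory calculation unit.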
  • FIG. 5 shows a modification of the embodiment of FIG.
  • the present embodiment may include a candidate trajectory recording unit 501 as shown in FIG.
  • the trajectory generation unit 101 generates a candidate trajectory for each execution, but the candidate trajectory recording unit 501 records the trajectory candidates and feature quantities q calculated in advance by the trajectory generation unit 101 and the trajectory feature extraction unit 102.
  • the trajectory cost calculation unit 103 uses the recorded trajectory candidates and the feature quantity q for cost calculation.
  • FIG. 6 shows the data structure of the candidate trajectory recording unit 501.
  • Reference numeral 601 in FIG. 6A denotes a posture data structure sampled by the posture sampling unit 301.
  • Each row of 601 indicates all variables of one posture. For example, in posture A, the angle of joint 0 is 0.1, the angle of joint 1 is 2.1, the angle of joint 2 is 0.5, and the pusher elongation t is 5.0.
  • FIG. 6B records a posture 604 that forms a posture pair with each posture 603, and a memory address 605 that records data of the posture pair.
  • posture B is connected to postures A, D, and E, and information on the connection is recorded in memories 0001, 0002, and 0003, respectively.
  • 606 records the feature quantity 607 of the posture pair.
  • for example, the connection from posture B to posture A is stored at memory address 0001; the Euclidean distance d of the hand position between the two postures is 0.6, and the angle change amount Δθ_0 of joint 0 between the two postures is −0.1.
  • the trajectory calculation unit 105 will be described with reference to FIG.
  • the minimum cost route search unit 701 searches for a trajectory that minimizes the cost.
  • the collision determination unit 702 is used to determine whether or not the robot is in contact with an obstacle, and routes that contact an obstacle are excluded from the search.
  • the operation characteristic recording unit 703 records various operation characteristics of the robot 200 necessary for converting the trajectory into a drive signal for the actuator of the robot 200. It is assumed that the robot operation characteristic data is acquired in advance and stored in the storage device 203.
  • the operation signal generation unit 704 receives the trajectory of the minimum cost, uses the operation characteristics of the robot 200, and outputs a PWM (pulse width modulation) signal as an operation signal for operating the robot 200.
  • the operation signal is not limited to the PWM signal as long as it is a signal for driving the actuator of the robot.
  • FIG. 7 is used again to describe the trajectory calculation unit 105 in a configuration in which the trajectory is a set of posture pairs.
  • the cost of a candidate trajectory can be regarded as a graph in which the vertices of posture are connected by edges with weights of costs.
  • the minimum cost route search unit 701 determines the trajectory using Dijkstra's method so that the total cost is minimized. However, Dijkstra's method may be replaced by any minimum-cost search algorithm.
  • the collision determination unit 702 is used to determine whether or not the robot is in contact with an obstacle, and routes that contact an obstacle are excluded from the search.
  • the operation characteristic recording unit 703 and the operation signal generation unit 704 have the same configuration as described above.
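  • The minimum-cost route search over the posture-pair graph can be sketched with a plain Dijkstra implementation as below; the graph, vertex names, and edge costs are illustrative.

```python
import heapq

# Sketch of the minimum-cost route search: posture vertices connected by
# edges weighted with trajectory costs, searched with plain Dijkstra (the
# text notes any minimum-cost search algorithm may be substituted).

def dijkstra(graph, start, goal):
    # graph: {vertex: [(neighbor, edge_cost), ...]}
    heap = [(0.0, start, [start])]
    seen = set()
    while heap:
        cost, v, path = heapq.heappop(heap)
        if v == goal:
            return cost, path
        if v in seen:
            continue
        seen.add(v)
        for w, c in graph.get(v, []):
            if w not in seen:
                heapq.heappush(heap, (cost + c, w, path + [w]))
    return float("inf"), []          # goal unreachable

graph = {"start": [("A", 1.0), ("B", 4.0)],
         "A": [("B", 1.0), ("goal", 5.0)],
         "B": [("goal", 1.0)]}
print(dijkstra(graph, "start", "goal"))  # → (3.0, ['start', 'A', 'B', 'goal'])
```

Edges touching an obstacle would simply be omitted from the graph before the search, mirroring the collision determination step.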
  • the sensor unit 801 observes the state of the robot with encoders of each joint of the robot 200. However, the sensor unit may be either an internal sensor or an external sensor.
  • the controller unit 802 receives the difference between the state of the robot 200 and the command signal, and determines the motor output by PID control (Proportional-Integral-Derivative Controller).
  • the actuator unit 803 inputs the motor output and drives the actuator of the robot 200.
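  • The PID control loop described above can be sketched as follows; the gains, time step, and the toy first-order joint response are assumptions for illustration, not values from the disclosure.

```python
# Sketch of the PID control in the operation unit: the controller takes the
# error between the commanded and sensed joint state and produces a motor
# output. Gains and the simple integrator "joint" model are illustrative.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def output(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
target, angle = 1.0, 0.0
for _ in range(5000):
    angle += pid.output(target - angle) * 0.01   # toy joint response
print(round(angle, 3))   # → 1.0
```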
  • FIG. 9 shows a situation where evaluation is performed by the evaluation input unit 107 (901).
  • the evaluation input unit 901 is a switch having a binary input (Equation 4) of “good” and “bad”. [Equation 4]
  • the operator 900 can make an evaluation by directly viewing the operation of the robot 902 on site. Alternatively, when the robot 902 is in a remote place, evaluation can be performed by transmitting audio and video data acquired at the remote site for the operator to monitor.
  • the robot 200 (902) may have various shapes such as the one shown in FIG. 2 and the one shown in FIG.
  • the evaluation value is not limited to 2 as long as it is an ordinal scale.
  • for example, an operation by which the robot 902 correctly accomplishes its purpose can be considered appropriate, while an operation that fails to accomplish the purpose, or an unfavorable operation such as one that does not secure sufficient clearance, can be considered inappropriate.
  • the evaluation input unit 107 (901) in a different configuration will be described with reference to FIG.
  • the evaluation input unit 107 receives a ranking input obtained by the operator 900 observing the operation of the robot 902 and comparing it with past operations. For example, in a three-step evaluation, a smaller numerical value is better. In the example of FIG. 10, the evaluation of Motion A, evaluated in the past, can be referred to, and Motion D can be evaluated by comparison with it. The ranking may include equalities and inequalities.
  • the evaluation input unit 107 may receive an operation ranking input obtained by the operator 900 comparing the operations of two or more robots 902 having the same specifications.
  • FIG. 11 shows an example of a different configuration that does not require operator involvement.
  • an automatic evaluation unit 1100 is provided.
  • the motion sensing unit 1101 is an optical three-dimensional motion measurement device that measures the tip position of the robot 902, for example.
  • the motion sensing unit is not limited to the optical three-dimensional motion measurement device as long as it is a sensor that measures the robot 902 itself.
  • the environment sensing unit 1102 is, for example, a camera that observes a change in the position of the operation target of the robot 902. For example, the environment sensing unit 1102 determines whether the part is correctly inserted into a different part.
  • environment sensing is not limited to a camera as long as it is a sensor that measures the environment surrounding the robot 902. In the above example, two types of sensing units are provided, but only one of the sensing units may be provided, or another sensing unit may be added.
  • FIG. 12 shows an example of the data structure of the criterion recording unit 1103. It records a criterion that evaluates the operation as "good" ("RESULT" value "1") when both criterion "A" 1201, indicating whether the tip position of the robot 902 is more than a predetermined distance from the work target, and criterion "B" 1202, indicating whether the component is correctly inserted into the mating component, are satisfied.
  • the criterion may include an arbitrary calculation formula or conditional branch. The operation can be evaluated by applying information that can be acquired from various sensing units to the conditions defined by the determination criteria.
  • the determination unit 1104 outputs an evaluation value of the operation based on the inputs from the operation sensing unit 1101 and the environment sensing unit 1102 and the determination standard 1200.
  • the evaluation value may be either an order scale or a ranking of actions.
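  • A minimal sketch of applying recorded criteria such as "A" and "B" of FIG. 12 to sensed values might look as follows; the threshold value and argument names are illustrative.

```python
# Sketch of the automatic evaluation unit: sensed quantities are applied to
# recorded criteria — here criterion A (clearance above a threshold) and
# criterion B (part correctly inserted) — yielding a "good"/"bad" result.
# The threshold and field names are illustrative, not from the disclosure.

def evaluate(tip_clearance_mm, inserted, min_clearance_mm=5.0):
    criterion_a = tip_clearance_mm >= min_clearance_mm
    criterion_b = inserted
    return "good" if (criterion_a and criterion_b) else "bad"

print(evaluate(tip_clearance_mm=8.0, inserted=True))    # → good
print(evaluate(tip_clearance_mm=8.0, inserted=False))   # → bad
```

Arbitrary calculation formulas or conditional branches, as the text allows, would replace the two boolean checks here.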
  • the evaluation interpretation unit 108 will be described with reference to FIG.
  • the evaluation organizing unit 1301 divides the operation/evaluation pairs by the evaluation of the operator 900 into two groups based on a threshold set by the user. In this embodiment, the two groups are "good" and "bad", the same as the operator 900's evaluations. If the evaluations are "very good", "good", "bad", "very bad", and so on, they may simply be divided into two groups such as {"very good", "good"} and {"bad", "very bad"}.
  • the cost update amount determination unit 1302 determines the cost difference r′_k before and after learning so that the cost decreases for a "good" evaluation and increases for a "bad" evaluation. Here, with f denoting the neural network function before the update and f′ the network after the update, r′_k is defined by (Equation 5). [Equation 5]
  • if the costs associated with these motion features were simply decreased for a "good" evaluation and increased for a "bad" evaluation, the costs of motion features involved universally in many operations, such as the joint-angle change Δθ_i between two postures, would change unstably.
  • therefore, in addition to decreasing the cost for a "good" evaluation and increasing it for a "bad" evaluation, the cost update amount is determined so that costs related to motion features irrelevant to the operator's evaluation (neither "good" nor "bad") do not change. For example, for neural network weights irrelevant to the evaluation, a cost update amount that cancels the update is determined.
  • in the configuration in which the evaluation input unit outputs rankings, the evaluation interpretation unit 108 takes the set of evaluation orders as input and outputs an evaluation order in which cycles and contradictions have been resolved, using the Schulze method.
  • the rearrangement of evaluation ranks is not limited to the Schulze method; any voting method that can resolve cycles and contradictions in the set of evaluation orders may be used.
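  • The Schulze method mentioned above can be sketched as a widest-path computation over pairwise preference counts, as below; the ballots are illustrative, and this is a generic implementation rather than the one in the disclosure.

```python
# Sketch of resolving contradictory pairwise rankings with the Schulze
# method: d[a][b] counts how often trajectory a was ranked above b, and the
# strongest-path matrix p (Floyd–Warshall-style widest path) decides the
# final order even when the raw preferences contain a cycle.

def schulze_order(candidates, ballots):
    # d[a][b]: number of ballots ranking a strictly above b
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ranking in ballots:
        for i, a in enumerate(ranking):
            for b in ranking[i + 1:]:
                d[a][b] += 1
    # strongest path strengths
    p = {a: {b: d[a][b] if d[a][b] > d[b][a] else 0 for b in candidates}
         for a in candidates}
    for k in candidates:
        for a in candidates:
            for b in candidates:
                if a != b and a != k and b != k:
                    p[a][b] = max(p[a][b], min(p[a][k], p[k][b]))
    wins = {a: sum(p[a][b] > p[b][a] for b in candidates if b != a)
            for a in candidates}
    return sorted(candidates, key=lambda a: -wins[a])

ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(schulze_order(["A", "B", "C"], ballots))   # → ['A', 'B', 'C']
```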
  • the learning unit 109 updates the trajectory selection criterion recording unit using the steepest descent method so as to minimize (Equation 7) computed from the differences r′_k. [Equation 7]
  • the trajectory selection criterion recording unit may also be updated using different optimization algorithms such as AdaGrad and RMSProp.
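  • A hedged sketch of this learning update follows: the criterion is simplified here to a linear weight vector (standing in for the text's neural network), trained by steepest descent so that computed costs approach the target costs produced by the evaluation interpretation unit; the data and learning rate are illustrative.

```python
# Sketch of the learning unit's update with a linear cost model: the weights
# are adjusted by steepest descent on a squared-error loss so the computed
# cost matches the target cost from the evaluation interpretation unit.

def train(weights, samples, lr=0.1, epochs=500):
    # samples: list of (feature_vector, target_cost)
    for _ in range(epochs):
        for x, target in samples:
            pred = sum(w * xi for w, xi in zip(weights, x))
            err = pred - target
            # gradient of 0.5 * err**2 with respect to each weight
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights

samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
w = train([0.0, 0.0], samples)
print([round(v, 3) for v in w])   # → [2.0, -1.0]
```

Swapping this inner loop for an adaptive step size would give the AdaGrad/RMSProp variants mentioned above.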
  • in the configuration in which the evaluation input unit outputs rankings, the learning unit 109 uses the steepest descent method to adjust each weight of the neural network so as to minimize (Equation 8) over the set of ranked feature-quantity pairs {[q_g, q_b]_0, [q_g, q_b]_1, ...} output by the evaluation interpretation unit.
  • here, σ(·) is a sigmoid function.
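  • The ranking-based variant can be sketched with a logistic (sigmoid) pairwise loss over ranked pairs (q_g ranked better than q_b), as below; the linear cost model, learning rate, and data are illustrative stand-ins for the neural network of the text.

```python
import math

# Sketch of ranking-based learning: for each ranked pair (q_g better than
# q_b), the weights are pushed so that cost(q_g) < cost(q_b), by descending
# the logistic pairwise loss log(1 + exp(cost(q_g) - cost(q_b))).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_ranking(weights, pairs, lr=0.5, epochs=200):
    for _ in range(epochs):
        for qg, qb in pairs:   # qg should end up cheaper than qb
            diff = sum(w * (g - b) for w, g, b in zip(weights, qg, qb))
            grad = sigmoid(diff)   # derivative of log(1 + e^diff) w.r.t. diff
            weights = [w - lr * grad * (g - b)
                       for w, g, b in zip(weights, qg, qb)]
    return weights

def cost(q, w):
    return sum(wi * qi for wi, qi in zip(w, q))

pairs = [([0.2, 0.1], [0.8, 0.9])]          # first trajectory preferred
w = train_ranking([0.0, 0.0], pairs)
print(cost([0.2, 0.1], w) < cost([0.8, 0.9], w))   # → True
```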
  • FIG. 14 is a diagram showing another modification of the embodiment.
  • a cost initialization unit 1401 may be provided that takes an operator-defined initial objective function as input and outputs an initial value for the trajectory selection criterion recording unit.
  • for example, the initial objective function is the total movement amount of the joints and Δt (Equation 9). [Equation 9]
  • the initial objective function is not limited to the joint movement amount, as long as it is a mapping from the feature quantity to a real number.
  • the trajectory selection criterion is computed using the learning unit so that the output for the feature quantity q_i is f_init(q_i).
  • here, (Equation 10) denotes the planar coordinates and (Equation 11) the orientation of the robot. [Equation 10] [Equation 11]
  • the trajectory generation unit 101, the trajectory cost calculation unit 103, the trajectory selection criterion recording unit 104, the evaluation input unit 107, the evaluation interpretation unit 108, and the learning unit 109 can be implemented by the corresponding processing of the first embodiment (FIG. 1).
  • the trajectory feature extraction unit 1602 creates a feature quantity q from the trajectory by aggregating the posture values it contains.
  • the feature quantity q is the Euclidean distance (Equation 12) between the two postures, the radius of rotation R of the trajectory, and the speeds ν_0 and ν_1 in the two postures.
  • the feature is not limited to the above, as long as it is information representing characteristics of the trajectory relevant to the operation of the robot. [Equation 12]
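  • a minimal sketch of such feature extraction for a planar trajectory (the circumradius-based turning-radius estimate and the endpoint-difference speed estimate are assumptions for illustration, not the patent's definitions):

```python
import math

def extract_features(trajectory, dt=1.0):
    """Aggregate the postures of a planar trajectory [(x, y), ...] into a
    feature vector: endpoint Euclidean distance, an approximate turning
    radius, and the speeds near the two endpoints."""
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    dist = math.hypot(x1 - x0, y1 - y0)

    # Circumradius of first/middle/last point as a crude turning-radius
    # estimate; collinear points give an infinite radius (straight line).
    (ax, ay), (bx, by), (cx, cy) = (trajectory[0],
                                    trajectory[len(trajectory) // 2],
                                    trajectory[-1])
    a = math.hypot(bx - cx, by - cy)
    b = math.hypot(ax - cx, ay - cy)
    c = math.hypot(ax - bx, ay - by)
    area2 = abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))  # 2 * area
    radius = float("inf") if area2 == 0 else (a * b * c) / (2 * area2)

    # Speeds from the first and last posture increments.
    v0 = math.hypot(trajectory[1][0] - x0, trajectory[1][1] - y0) / dt
    v1 = math.hypot(x1 - trajectory[-2][0], y1 - trajectory[-2][1]) / dt
    return dist, radius, v0, v1
```

  for a straight-line trajectory the radius is infinite, while three points on a unit circle yield a radius of 1.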
  • the trajectory calculation unit 1605 will be described with reference to FIG.
  • the minimum cost path search unit 1701 searches for a trajectory with the minimum cost calculated by the trajectory cost calculation unit 103 from the candidate trajectories generated by the trajectory generation unit 101, and outputs the trajectory.
  • the collision determination unit 1702 determines whether the robot comes into contact with an obstacle, and any route that contacts an obstacle is excluded from the search.
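  • the combination of the minimum cost path search unit 1701 and the collision determination unit 1702 can be sketched as a filter-then-argmin over the candidate trajectories (function names are illustrative; only the listed postures are collision-checked here, not the interpolated motion):

```python
def plan_trajectory(candidates, trajectory_cost, in_collision):
    """Discard candidate trajectories whose sampled postures touch an
    obstacle, then return the cheapest remaining trajectory
    (None if every candidate collides)."""
    feasible = [t for t in candidates
                if not any(in_collision(p) for p in t)]
    return min(feasible, key=trajectory_cost) if feasible else None

# Toy 1-D example: postures are scalars, an obstacle occupies [2, 3],
# and the trajectory cost is the total path length.
in_obstacle = lambda p: 2.0 <= p <= 3.0
path_length = lambda t: sum(abs(b - a) for a, b in zip(t, t[1:]))
candidates = [
    [0.0, 2.5, 5.0],             # collides: 2.5 lies inside the obstacle
    [0.0, 1.0, 4.0, 5.0],        # feasible, length 5
    [0.0, 1.5, 3.5, 6.0, 5.0],   # feasible, length 7
]
best = plan_trajectory(candidates, path_length, in_obstacle)
```

  the colliding candidate is removed before the cost comparison, so the length-5 detour is selected even though the straight route would be cheaper.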
  • the simulation process 1611 simulates the operation of the robot when the robot performs the trajectory calculated by the trajectory calculation unit 1605 in the real world.
  • in the simulation, for example, the robot operation itself and the influence of that operation on a predetermined virtual physical environment are output as simulation results.
  • the simulation output is not limited to the operation of the robot itself; it may cover any event expected when the robot operates in the real world, such as the deviation, caused by disturbances acting on the robot, between the trajectory calculated by the trajectory calculation unit 1605 and the trajectory realized in the simulation, or the impact of the robot on the environment.
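  • as a minimal illustration (the additive Gaussian disturbance model and the function names are assumptions, not the patent's simulator), such a simulation can return both the realized trajectory and its deviation from the plan:

```python
import numpy as np

def simulate(planned, disturbance_std=0.05, seed=0):
    """Execute the planned postures under an additive Gaussian disturbance
    and report the realized trajectory plus the maximum per-posture
    deviation from the plan (one 'expected event' a simulation result
    can expose to the operator)."""
    rng = np.random.default_rng(seed)
    planned = np.asarray(planned, dtype=float)
    realized = planned + rng.normal(0.0, disturbance_std, planned.shape)
    deviation = float(np.max(np.linalg.norm(realized - planned, axis=-1)))
    return realized, deviation

# Example: a three-posture planar trajectory.
realized, dev = simulate([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
```

  with the disturbance set to zero the deviation vanishes, which makes the sketch easy to sanity-check.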
  • the display unit 1612 presents the trajectory calculated by the trajectory calculation unit 1605 to the operator on the display.
  • a trajectory on a plane can be displayed as shown in FIG.
  • the vehicle-type robot starts from position 1501, follows a trajectory passing through position 1503, and arrives at position 1502.
  • the display form may be converted into a stereoscopic image and displayed.
  • the example of a vehicle-type robot has been described above; however, the same processing can also be applied to other robots, such as the SCARA robot shown in FIG.
  • the same effects as in the first embodiment can be obtained without actually preparing or operating the robot.
  • the server 201 described in the above embodiments may be configured as a single computer, or any part of the input device 204, the output device 205, the processing device 202, and the storage device 203 may be configured as separate computers connected via a network; the idea of the invention is equivalent and unchanged.
  • functions equivalent to those configured by software can also be realized by hardware such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit); such embodiments are also included in the scope of the present invention.
  • the present invention is not limited to the above-described embodiment, and includes various modifications.
  • a part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
  • Trajectory generation unit: 101; Trajectory feature extraction unit: 102; Trajectory cost calculation unit: 103; Trajectory selection reference recording unit: 104; Trajectory calculation unit: 105; Operation unit: 106; Evaluation input unit: 107; Evaluation interpretation unit: 108; Learning unit: 109; Robot: 200; Server: 201

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Manipulator (AREA)

Abstract

The present invention relates to a robot operation plan learning method comprising: a trajectory generation process for generating one or more candidates for the trajectory to be followed by a robot such that the robot is brought from a start position to a predetermined target position; a trajectory feature extraction process for extracting, from each trajectory candidate, an operational feature serving as a property of the trajectory candidate; a trajectory cost calculation process for calculating, from the operational features, trajectory costs that serve as an index of the suitability of the trajectory candidate, on the basis of a trajectory selection criterion for computing the suitability of a trajectory candidate from its operational feature; a trajectory calculation process for determining, using the trajectory cost, the trajectory candidate to be adopted as the operating trajectory of the robot; a demonstration process for operating the robot on the basis of the operating trajectory determined by the trajectory calculation process and/or a simulation process for simulating the operation of the robot on the basis of the operating trajectory; and a learning process for receiving an evaluation, on an ordinal scale, made during the demonstration process and/or the simulation process, and changing the trajectory selection criterion on the basis of the evaluation input.
PCT/JP2016/052979 2016-02-02 2016-02-02 Robot system, robot optimization system, and robot operation plan learning method WO2017134735A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/052979 WO2017134735A1 (fr) 2016-02-02 2016-02-02 Robot system, robot optimization system, and robot operation plan learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/052979 WO2017134735A1 (fr) 2016-02-02 2016-02-02 Robot system, robot optimization system, and robot operation plan learning method

Publications (1)

Publication Number Publication Date
WO2017134735A1 true WO2017134735A1 (fr) 2017-08-10

Family

ID=59500337

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/052979 WO2017134735A1 (fr) 2016-02-02 2016-02-02 Robot system, robot optimization system, and robot operation plan learning method

Country Status (1)

Country Link
WO (1) WO2017134735A1 (fr)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04112302A (ja) * 1990-09-03 1992-04-14 Matsushita Electric Ind Co Ltd Fuzzy inference device
JPH0561844A (ja) * 1991-08-01 1993-03-12 Fujitsu Ltd Self-learning processing method for an adaptive data processing device
JPH11175132A (ja) * 1997-12-15 1999-07-02 Omron Corp Robot, robot system, robot learning method, robot system learning method, and recording medium
JP2002269530A (ja) * 2001-03-13 2002-09-20 Sony Corp Robot device, behavior control method for robot device, program, and recording medium
JP2013193194A (ja) * 2012-03-22 2013-09-30 Toyota Motor Corp Trajectory generation device, moving body, trajectory generation method, and program


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019176477A1 (fr) * 2018-03-14 2019-09-19 オムロン株式会社 Dispositif de contrôle de robot
JP2019155554A (ja) * 2018-03-14 2019-09-19 オムロン株式会社 ロボットの制御装置
US11673266B2 (en) 2018-03-14 2023-06-13 Omron Corporation Robot control device for issuing motion command to robot on the basis of motion sequence of basic motions
CN111195906A (zh) * 2018-11-20 2020-05-26 西门子工业软件有限公司 用于预测机器人的运动轨迹的方法和系统
CN111195906B (zh) * 2018-11-20 2023-11-28 西门子工业软件有限公司 用于预测机器人的运动轨迹的方法和系统
CN113260936A (zh) * 2018-12-26 2021-08-13 三菱电机株式会社 移动体控制装置、移动体控制学习装置及移动体控制方法
CN113260936B (zh) * 2018-12-26 2024-05-07 三菱电机株式会社 移动体控制装置、移动体控制学习装置及移动体控制方法
US20220143829A1 (en) * 2020-11-10 2022-05-12 Kabushiki Kaisha Yaskawa Denki Determination of robot posture
US11717965B2 (en) * 2020-11-10 2023-08-08 Kabushiki Kaisha Yaskawa Denki Determination of robot posture
CN112894822A (zh) * 2021-02-01 2021-06-04 配天机器人技术有限公司 机器人运动轨迹规划方法、机器人及计算机存储介质
CN112894822B (zh) * 2021-02-01 2023-12-15 配天机器人技术有限公司 机器人运动轨迹规划方法、机器人及计算机存储介质

Similar Documents

Publication Publication Date Title
Ibarz et al. How to train your robot with deep reinforcement learning: lessons we have learned
Long et al. Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning
Francis et al. Long-range indoor navigation with prm-rl
Kyrarini et al. Robot learning of industrial assembly task via human demonstrations
WO2017134735A1 (fr) Robot system, robot optimization system, and robot operation plan learning method
Bency et al. Neural path planning: Fixed time, near-optimal path generation via oracle imitation
Fu et al. One-shot learning of manipulation skills with online dynamics adaptation and neural network priors
JP6951659B2 (ja) Task execution system, task execution method, and learning device and learning method therefor
US20190143517A1 (en) Systems and methods for collision-free trajectory planning in human-robot interaction through hand movement prediction from vision
US9387589B2 (en) Visual debugging of robotic tasks
US20180036882A1 (en) Layout setting method and layout setting apparatus
Petrič et al. Smooth continuous transition between tasks on a kinematic control level: Obstacle avoidance as a control problem
Frank et al. Efficient motion planning for manipulation robots in environments with deformable objects
WO2019009350A1 (fr) Route generation method, route generation system, and route generation program
Kshirsagar et al. Specifying and synthesizing human-robot handovers
CN115605326A (zh) 用于控制机器人的方法和机器人控制器
JP7295421B2 (ja) Control device and control method
WO2020246482A1 (fr) Control device, system, learning device, and control method
Frank et al. Using gaussian process regression for efficient motion planning in environments with deformable objects
JP2020508888A (ja) System, apparatus, and method for a robot to learn and execute skills
Thomaz et al. Mobile robot path planning using genetic algorithms
Sturm et al. Unsupervised body scheme learning through self-perception
KR102332314B1 (ko) Apparatus and method for calibrating coordinate values between a robot and a camera
Sturm et al. Adaptive body scheme models for robust robotic manipulation.
Maldonado-Valencia et al. Planning and visual-servoing for robotic manipulators in ROS

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16889225

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16889225

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP