WO2022003833A1 - Positioning control device and machine learning device - Google Patents

Publication number
WO2022003833A1
Authority
WO
WIPO (PCT)
Prior art keywords
motor
vibration
speed control
parameter
motor speed
Prior art date
Application number
PCT/JP2020/025698
Other languages
French (fr)
Japanese (ja)
Inventor
翔 堀内
宏武 勅使
直弥 田島
Original Assignee
三菱電機株式会社
Application filed by 三菱電機株式会社
Priority to JP2021512951A priority Critical patent/JPWO2022003833A1/ja
Priority to PCT/JP2020/025698 priority patent/WO2022003833A1/en
Publication of WO2022003833A1 publication Critical patent/WO2022003833A1/en

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D3/00: Control of position or direction
    • G05D3/10: Control of position or direction without using feedback

Definitions

  • the present disclosure relates to a positioning control device and a machine learning device that control the positioning of a motor by a motor speed control parameter that controls the speed of the motor.
  • the motor is electrically connected to a control device such as a programmable logic controller (PLC) via an amplifier.
  • the user inputs the motor speed control parameter, which is a parameter necessary for controlling the speed of the motor.
  • motor speed control parameters are position, operating pulse speed and acceleration rate.
  • the PLC outputs a pulse signal, which is a command signal, to the amplifier based on the motor speed control parameter input by the user, and the amplifier controls the motor based on the pulse signal. That is, the current consumption of the system, the load on the motor and the takt time are determined by the motor speed control parameters input by the user.
  • the motor speed control parameters are determined based on the user's experience, etc., according to the motor information including the motor type, equipment, and product. If the system is not performing the intended operation, the user redetermines and inputs the motor speed control parameters. This process is repeated until the system performs the intended operation.
  • with the motor speed control parameters, for example, if the acceleration parameter is set large to shorten the tact time, the vibration of the equipment or the power consumption may increase as a result, leading to negative results in aspects other than the tact time. Therefore, the user needs to consider the tact time, the vibration of the equipment, the power consumption, and so on, so that each stays within its permissible range. However, there is a trade-off among tact time, vibration, and power consumption, and it is difficult to derive, based on experience, balanced speed control parameters that give a short tact time, low vibration, and low power consumption.
  • Patent Document 1 discloses a control system including a machine that processes a workpiece, a higher-level device that is located above one or more control devices that control the machine and that adjusts the servo gain used in machining by the machine, and a machine learning device that performs machine learning of the adjustment of the servo gain of the machine.
  • the machine learning device determines the adjustment action for the servo gain of the machine based on the machine learning result and the state data of the adjustment of the servo gain of the machine, and changes the servo gain of the machine.
  • the present disclosure has been made in view of the above, and an object thereof is to obtain a positioning control device capable of shortening the tact time as compared with the conventional case and extending the life of a control system including a motor.
  • the present disclosure is a positioning control device that is electrically connected to a motor via an amplifier and controls the motor, and that includes a parameter storage unit, a vibration data acquisition unit, a machine learning unit, and an output unit.
  • the parameter storage unit stores parameters, which are information necessary for determining the motor speed control parameter (a parameter for controlling the motor) and which include the motor information and the allowable ranges of the vibration and the tact time of the motor.
  • the vibration data acquisition unit acquires vibration data, which is the vibration of the motor installation location detected by the vibration sensor.
  • the machine learning unit uses a trained model, which has learned the correlation between the vibration of the motor installation location and the motor speed control parameter from the parameters and vibration data, to determine from the parameters and vibration data the motor speed control parameter with which the vibration is within the allowable range.
  • the output unit outputs a pulse for controlling the amplifier to the amplifier based on the determined motor speed control parameter.
  • the positioning control device has the effect of shortening the tact time as compared with the conventional case and extending the life of the control system including the motor.
  • Brief description of the drawings: block diagrams showing examples of the configuration of a control system including the positioning control device according to the first embodiment and the third embodiment, and a block diagram schematically showing an example of the functional configuration of the machine learning unit included in the positioning control device according to the first embodiment.
  • FIG. 1 is a block diagram showing an example of a configuration of a control system including a positioning control device according to the first embodiment.
  • the control system 1 includes a positioning control device 10, an amplifier 30, a motor 50, and a vibration sensor 61.
  • the positioning control device 10 is a device that obtains, from the information of the motor 50 to be connected and the vibration data acquired from the vibration sensor 61 of the equipment to which the positioning control device 10 is connected, the motor speed control parameters with which the vibration is optimum within a predetermined range.
  • the positioning control device 10 includes a parameter storage unit 11, a sensor value acquisition unit 12, a machine learning unit 13, a motor speed control parameter output unit 14, and a pulse output unit 15.
  • An example of the positioning control device 10 is a PLC or a positioning motion unit.
  • the parameter storage unit 11 stores parameters that are information about the motor 50 to be controlled.
  • the parameters are information necessary for determining the motor speed control parameters, and include information of the motor 50 input by the user and operating conditions of the motor 50.
  • the information of the motor 50 is information including the type of the motor 50, the power capacity, the rated speed, and the size of the motor 50.
  • the operating condition is a condition to be satisfied by the control system 1 when the motor 50 is controlled by the set motor speed control parameter. Operating conditions include allowable ranges and priority items.
  • the permissible range is a condition that defines the tact time during operation of the control system 1, the current consumption of the motor 50, and the maximum value of the vibration of the motor 50.
  • the priority item is a condition indicating an item that the user wants to optimize among the tact time, the current consumption of the motor 50, and the vibration of the motor 50.
  • in the first embodiment, allowable ranges are set for the vibration of the motor 50 and for the tact time, which has a trade-off relationship with the vibration of the motor 50, and the vibration of the motor 50 is set as the priority item.
  • the sensor value acquisition unit 12 holds vibration data, which is data obtained from the vibration sensor 61, and outputs the vibration data to the machine learning unit 13.
  • the machine learning unit 13 learns the motor speed control parameter that provides the optimum vibration of the motor 50 from the values in the parameter storage unit 11 and the vibration data obtained from the sensor value acquisition unit 12.
  • the machine learning unit 13 corresponds to the machine learning device.
  • for the vibration to be optimum, it is sufficient that the vibration and tact time of the motor 50 are within the allowable range; if there are a plurality of motor speed control parameters with which the vibration and the tact time are within the allowable range, one motor speed control parameter is selected according to a predetermined criterion.
  • the optimum vibration is such that the current consumption of the motor 50 is also within the allowable range.
  • the machine learning unit 13 gives priority to the item specified as the priority item and determines the motor speed control parameter so as not to exceed the allowable range.
  • for the priority item, the motor speed control parameters are set so as not to exceed the allowable range; for items other than the priority item, the allowable range may or may not be exceeded.
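  • The selection rule described above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the `Candidate` record, the `select_parameter` function, the numeric values, and the choice of "lowest vibration" as the predetermined criterion are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record: one motor speed control parameter and its predicted outcome."""
    accel_rate: int      # acceleration/deceleration rate (illustrative units)
    vibration: float     # predicted vibration magnitude
    tact_time: float     # predicted tact time in seconds


def select_parameter(candidates: Sequence[Candidate],
                     max_vibration: float,
                     max_tact_time: float) -> Optional[Candidate]:
    """Return the in-range candidate with the lowest vibration, or None if none qualifies."""
    allowed = [c for c in candidates
               if c.vibration <= max_vibration and c.tact_time <= max_tact_time]
    if not allowed:
        return None
    # Assumed criterion: vibration is the priority item, so the smallest value wins.
    return min(allowed, key=lambda c: c.vibration)


candidates = [
    Candidate(accel_rate=1, vibration=0.5, tact_time=4.0),  # tact time too long
    Candidate(accel_rate=2, vibration=1.0, tact_time=2.5),  # within both ranges
    Candidate(accel_rate=3, vibration=2.0, tact_time=1.5),  # within both ranges
    Candidate(accel_rate=4, vibration=3.5, tact_time=1.0),  # vibration too large
]
chosen = select_parameter(candidates, max_vibration=2.5, max_tact_time=3.0)
print(chosen)
```

  • Two candidates satisfy both allowable ranges here, so the assumed criterion decides between them; a different criterion (for example, shortest tact time) would only change the `key` function.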
  • Motor speed control parameters include position, start-up pulse count, run pulse speed, run pulse count and acceleration / deceleration rate.
  • the motor speed control parameter output unit 14 outputs the motor speed control parameter obtained from the machine learning unit 13 to the pulse output unit 15.
  • the pulse output unit 15 outputs a pulse, which is a command signal for controlling the amplifier 30, to the amplifier 30 based on the motor speed control parameter obtained from the motor speed control parameter output unit 14.
  • the motor speed control parameter output unit 14 and the pulse output unit 15 correspond to the output unit.
  • the amplifier 30 is a device that is electrically connected to the positioning control device 10 and the motor 50 and controls the motor 50 by a pulse output from the positioning control device 10.
  • the motor 50 is an electric power device that can be controlled by the amplifier 30.
  • Examples of the motor 50 are a servomotor having an encoder for position detection and a stepping motor without an encoder.
  • the vibration sensor 61 measures the vibration at the location where the motor 50 is installed, and outputs the measured vibration data to the positioning control device 10.
  • FIG. 2 is a block diagram schematically showing an example of the functional configuration of the machine learning unit included in the positioning control device according to the first embodiment.
  • the machine learning unit 13 includes a data acquisition unit 131, a model generation unit 132, a trained model storage unit 133, and an inference unit 134.
  • the data acquisition unit 131 acquires motor speed control parameters, motor 50 information, allowable ranges, priority items, and vibration data as learning data.
  • the motor speed control parameter is a value set in the motor speed control parameter output unit 14.
  • the information and the allowable range of the motor 50 are values stored in the parameter storage unit 11.
  • the permissible range includes the tact time, the current consumption of the motor 50 and the vibration of the motor 50, but in the first embodiment, the permissible range regarding the vibration of the motor 50 and the tact time is used.
  • the vibration data is vibration data detected by the vibration sensor 61 and acquired by the sensor value acquisition unit 12 when the motor 50 is driven by the set motor speed control parameters.
  • the model generation unit 132 learns the motor speed control parameter that provides the optimum vibration according to learning data created based on a combination of the motor speed control parameters, the information of the motor 50, the allowable range, and the vibration data output from the data acquisition unit 131. That is, it generates a trained model that infers, from the information of the motor 50 and the allowable ranges of the vibration and the tact time of the motor 50, the motor speed control parameter that provides the optimum vibration, that is, the parameter with which the vibration and the tact time fall within the allowable ranges.
  • the trained model is a model that learns the correlation between the vibration of the installation location of the motor 50 and the motor speed control parameter from the information of the motor 50, the allowable range of the vibration and tact time of the motor 50, and the vibration data.
  • the learning data is data in which the motor speed control parameter, the information of the motor 50, the allowable range of the vibration and the tact time of the motor 50, and the vibration data are associated with each other.
  • as the learning algorithm used by the model generation unit 132, known algorithms such as supervised learning, unsupervised learning, and reinforcement learning can be used.
  • in reinforcement learning, an agent that is the acting subject in a certain environment observes the parameters of the environment in the current state and decides the action to be taken.
  • the environment changes dynamically depending on the action of the agent, and the agent is rewarded according to the change in the environment.
  • the agent repeats this process and learns the action policy that yields the most reward through a series of actions.
  • Q-learning and TD-learning are known as typical methods of reinforcement learning.
  • the general update equation of the action value function Q(s, a) is expressed by the following equation (1).
  • $$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{1}$$
  • in equation (1), s_t represents the state of the environment at time t, and a_t represents the action at time t. The action a_t changes the state to s_{t+1}. r_{t+1} represents the reward received through that change of state, γ represents the discount rate, and α represents the learning coefficient. γ is in the range 0 < γ ≤ 1, and α is in the range 0 < α ≤ 1.
  • the motor speed control parameter corresponds to the action a_t, and the information of the motor 50 and the allowable range of the vibration of the motor 50 correspond to the state s_t; what is learned is the best action a_t in the state s_t at time t.
  • the state of the motor is input as the state, the motor speed control parameter is input as the action, and the vibration of the motor 50 is input as the result.
  • the permissible range is used as the criterion for the reward.
  • in the update expressed by equation (1), if the action value Q of the action a having the highest Q value at time t + 1 is larger than the action value Q of the action a executed at time t, the action value Q is increased; in the opposite case, the action value Q is reduced. In other words, the action value function Q(s, a) is updated so that the action value Q of the action a at time t approaches the best action value at time t + 1. As a result, the best action value in a certain environment is sequentially propagated to the action values in the preceding environments.
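  • A single update of the action value function can be sketched as below, using the standard tabular Q-learning rule in which Q(s_t, a_t) moves toward r_{t+1} + γ max Q(s_{t+1}, a) by the learning coefficient α. The function name `q_update`, the dictionary-based table, and the state and action labels are assumptions chosen for illustration; the disclosure does not specify a data structure.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s_t, a_t)."""
    # Best action value available in the next state s_{t+1}.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(s_t, a_t) toward the target r_{t+1} + gamma * best_next.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical labels: two acceleration settings and two states.
actions = ["accel_low", "accel_high"]
q = defaultdict(float)               # unseen (state, action) pairs start at 0.0
q[("s1", "accel_low")] = 1.0         # pretend s1 already has a known good action

new_value = q_update(q, "s0", "accel_high", reward=1.0,
                     next_state="s1", actions=actions)
print(new_value)
```

  • Because the best value in the next state is larger than the current value of the executed action, the update raises Q for ("s0", "accel_high"), which is the propagation behavior described above.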
  • the model generation unit 132 includes a reward calculation unit 141 and a function update unit 142.
  • the reward calculation unit 141 calculates the reward based on the motor speed control parameter, the information of the motor 50, the allowable range of the vibration and tact time of the motor 50, and the vibration data.
  • the reward calculation unit 141 calculates the reward r based on the magnitude of the vibration obtained from the vibration data and the allowable range of the vibration of the motor 50.
  • the threshold value defined by the allowable range of the magnitude of vibration is defined as the first threshold value. For example, if the magnitude of vibration ≤ the first threshold value, the reward r is increased (for example, a reward of "1" is given), while if the magnitude of vibration > the first threshold value, the reward r is decreased (for example, a reward of "-1" is given).
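  • The threshold rule for the reward can be written as a short function. This is only a sketch: the name `vibration_reward` and the sample values are assumptions, while the reward values 1 and -1 follow the example given in the text.

```python
def vibration_reward(vibration: float, first_threshold: float) -> int:
    """Return +1 when the vibration is within the allowable range, otherwise -1."""
    return 1 if vibration <= first_threshold else -1


# Illustrative values only; real thresholds come from the stored allowable range.
within = vibration_reward(0.2, first_threshold=0.5)
at_limit = vibration_reward(0.5, first_threshold=0.5)
exceeded = vibration_reward(0.7, first_threshold=0.5)
print(within, at_limit, exceeded)
```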
  • the function update unit 142 updates the function for determining the motor speed control parameter that produces the optimum vibration according to the reward calculated by the reward calculation unit 141, and outputs the function to the trained model storage unit 133.
  • in the case of Q-learning, the action value function Q(s_t, a_t) expressed by equation (1) is used as the function for calculating the motor speed control parameter that provides the optimum vibration. The above learning is executed repeatedly.
  • the trained model storage unit 133 stores the action value function Q(s_t, a_t) updated by the function update unit 142, that is, the trained model.
  • the inference unit 134 infers the motor speed control parameter using the trained model stored in the trained model storage unit 133. That is, by inputting the information of the motor 50 acquired by the data acquisition unit 131, the allowable ranges of the vibration and the tact time of the motor 50, and the priority items into this trained model, the inference unit 134 can output the motor speed control parameters inferred from that information, those allowable ranges, and those priority items. Further, by analyzing the value of the vibration data detected by the vibration sensor 61 with the trained model, together with the information of the motor 50, the allowable ranges of the vibration and the tact time of the motor 50, and the priority items, the value of the acquired vibration data can be reflected in the motor speed control parameter as a feedback value.
  • the inference unit 134 has been described as outputting the motor speed control parameter using the trained model learned by the model generation unit 132 of the positioning control device 10 connected to the motor 50. However, the inference unit 134 may acquire a trained model from the outside, such as from another positioning control device 10 connected to another motor 50, and output the motor speed control parameter based on the acquired trained model.
  • FIG. 3 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to the first embodiment.
  • the data acquisition unit 131 acquires motor speed control parameters, motor 50 information, allowable range, and vibration data as learning data (step S11).
  • an allowable range for vibration and tact time of the motor 50 is set.
  • the data acquisition unit 131 acquires priority items from the parameter storage unit 11 (step S12).
  • the model generation unit 132 calculates the reward based on the motor speed control parameter, the information of the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information of the motor 50, and the allowable range, and determines whether to increase or decrease the reward based on the relationship between the magnitude of the vibration and the predetermined first threshold value.
  • the first threshold is, in one example, the value of vibration defined in the permissible range.
  • the reward calculation unit 141 determines whether the value of the vibration data when the motor 50 is operated by the motor speed control parameter is less than the first threshold value (step S13).
  • the first threshold value is the value of vibration allowed for the device connected to the motor 50 when the motor 50 is operated.
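  • if the value of the vibration data is less than the first threshold value (Yes in step S13), the reward calculation unit 141 increases the reward (step S14).
  • otherwise (No in step S13), the reward calculation unit 141 reduces the reward (step S15).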
  • the reward may be increased or decreased.
  • after step S14 or step S15, the function update unit 142 of the model generation unit 132 updates the action value function Q(s_t, a_t) based on the reward calculated by the reward calculation unit 141 (step S16).
  • the action value function Q(s_t, a_t) is the function expressed by equation (1) and stored in the trained model storage unit 133.
  • the process then returns to step S11. That is, the machine learning unit 13 repeatedly executes the processing from step S11 to step S16 above, and stores the generated action value function Q(s_t, a_t) as a trained model.
  • the learning data at the time of acquisition of the data is accumulated.
  • the trained model is stored in the trained model storage unit 133 provided inside the machine learning unit 13, but the trained model storage unit 133 may be provided outside the machine learning unit 13.
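  • The learning procedure of steps S11 to S16 can be sketched as a small loop. Everything concrete here is an assumption made so the example runs on its own: the four acceleration settings, the linear `measure_vibration` stand-in for the vibration sensor, the single state label, the round-robin choice of actions, and the values of α and γ. It is a toy illustration of the flow, not the implementation of the disclosure.

```python
from collections import defaultdict

ACTIONS = [1, 2, 3, 4]    # hypothetical acceleration-rate settings
FIRST_THRESHOLD = 2.5     # hypothetical allowable vibration
STATE = "motor_50"        # single illustrative state


def measure_vibration(accel_rate):
    """Stand-in for the vibration sensor: vibration grows with acceleration."""
    return 1.0 * accel_rate


def train(steps=200, alpha=0.5, gamma=0.5):
    """Repeat steps S11 to S16 and return the learned action value table."""
    q = defaultdict(float)
    for step in range(steps):
        action = ACTIONS[step % len(ACTIONS)]       # S11: try each setting in turn
        vibration = measure_vibration(action)       # S11: acquire vibration data
        # S13 to S15: increase or decrease the reward against the first threshold.
        reward = 1 if vibration < FIRST_THRESHOLD else -1
        # S16: update the action value function with the Q-learning rule.
        best_next = max(q[(STATE, a)] for a in ACTIONS)
        q[(STATE, action)] += alpha * (reward + gamma * best_next - q[(STATE, action)])
    return q


q_table = train()
best_action = max(ACTIONS, key=lambda a: q_table[(STATE, a)])
print(best_action, dict(q_table))
```

  • After training, settings whose vibration stays below the threshold end up with higher action values than those that exceed it, so the greedy choice falls within the allowable range.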
  • besides reinforcement learning, supervised learning, unsupervised learning, semi-supervised learning, or other known learning algorithms may be used to machine-learn the motor speed control parameter that provides the optimum vibration from the motor speed control parameters, the information of the motor 50, the allowable ranges of the vibration and the tact time of the motor 50, and the vibration data. Machine learning using these learning algorithms can also reduce the vibration of the entire system.
  • FIG. 4 is a flowchart showing an example of a processing procedure of a motor control method in the positioning control device according to the first embodiment.
  • the configuration of the initial control system 1 is determined by the user.
  • the parameter storage unit 11 stores values including the information of the motor 50, the allowable range, and the priority item (step S31).
  • the information of the motor 50, the allowable range, and the priority item are set by the user via an input unit (not shown).
  • an allowable range for vibration and tact time of the motor 50 is set.
  • the sensor value acquisition unit 12 acquires vibration data from the vibration sensor 61 (step S32) and holds it.
  • the inference unit 134 of the machine learning unit 13 analyzes the values in the parameter storage unit 11 and the value of the acquired vibration data using the trained model stored in the trained model storage unit 133, and sets the motor speed control parameter that provides the optimum vibration in the motor speed control parameter output unit 14 (step S33).
  • the motor speed control parameter output unit 14 outputs the set motor speed control parameter to the pulse output unit 15 (step S34).
  • the pulse output unit 15 outputs a pulse to the amplifier 30 based on the motor speed control parameter from the motor speed control parameter output unit 14 (step S35).
  • the process then returns to step S32, where the sensor value acquisition unit 12 acquires vibration data, which is the vibration of the motor 50.
  • the processes from steps S32 to S35 are repeatedly executed.
  • in step S33, the inference unit 134 analyzes the information of the motor 50, the permissible range, the priority items, and the value of the vibration data using the trained model, and sets the motor speed control parameter that provides the optimum vibration in the motor speed control parameter output unit 14.
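  • The repeated cycle of steps S32 to S35 can be sketched with plain callables standing in for the hardware. The function `control_cycle`, the `fake_infer` rule, the stored value `max_vibration`, and the sample readings are all hypothetical; in the disclosure the inference is performed by the trained model and the output is a pulse train to the amplifier 30.

```python
def control_cycle(read_vibration, infer_parameter, output_pulse, stored_values, cycles):
    """Run the acquire, infer, output cycle a fixed number of times."""
    history = []
    for _ in range(cycles):
        vibration = read_vibration()                            # step S32
        parameter = infer_parameter(stored_values, vibration)   # step S33
        output_pulse(parameter)                                 # steps S34 and S35
        history.append((vibration, parameter))
    return history


# Step S31: values set by the user (illustrative only).
stored_values = {"max_vibration": 2.5}
readings = iter([1.0, 3.0, 2.0])   # stand-in for successive sensor readings
sent = []                          # stand-in for pulses sent to the amplifier


def fake_infer(values, vibration):
    """Placeholder for the trained model: back off when vibration is too high."""
    return 2 if vibration > values["max_vibration"] else 4


history = control_cycle(lambda: next(readings), fake_infer, sent.append,
                        stored_values, cycles=3)
print(history)
```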
  • FIG. 5 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the first embodiment.
  • the vibration sensor 61 is not provided in the motor 50, but in the product 52, which is a device including the motor 50 and the drive unit 51 driven by the motor 50.
  • the machine learning unit 13 can optimize the vibration value of the product 52 including the drive unit 51 and the like as well as the motor 50.
  • FIG. 6 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the first embodiment.
  • the vibration sensor 61 is provided in a system 53 including a plurality of products 52.
  • An amplifier 30 is electrically connected to the motor 50 of each product 52 in the system 53.
  • a system 53 is a multi-axis control system including a plurality of motors 50 and products 52.
  • the machine learning unit 13 can optimize the vibration value of the system 53 including the plurality of products 52.
  • the motor 50 in FIG. 1, the product 52 in FIG. 5, and the system 53 in FIG. 6, in each of which the vibration sensor 61 is installed, each correspond to the installation location of the motor 50.
  • the machine learning unit 13 of the positioning control device 10 learns the motor speed control parameter in which the vibration data is within the allowable range according to the learning data, and generates a trained model.
  • the learning data is created based on a combination of vibration data from the vibration sensor 61, information on the motor 50, motor speed control parameters, and allowable ranges for vibration and takt time of the motor 50.
  • the vibration sensor 61 is provided in a motor 50, a product 52 including a drive unit 51 connected to the motor 50, or a system 53 including a plurality of products 52.
  • the machine learning unit 13 sets the motor speed control parameter in which the vibration falls within the permissible range by analyzing the information of the motor 50, the permissible range, and the value of the vibration data using the trained model.
  • the tact time can be shortened, and the product 52 or the system 53 can be operated without giving excessive vibration to the motor 50, so the burden on the motor 50 can be reduced and the life of the motor 50 can be extended.
  • even when a higher-level device is provided above the positioning control device 10, the function related to machine learning is provided in the positioning control device 10. Therefore, the load on the higher-level device can be reduced.
  • the servo gain which is a control value is determined by the feedback control from the motor 50, but the motor speed control parameter is determined by the output control of the positioning control device 10. Therefore, it can be applied not only to devices such as servo motors and amplifiers having an encoder, which are expensive and capable of feedback control, but also to devices such as stepping motors and amplifiers which do not have a feedback mechanism.
  • FIG. 7 is a block diagram showing an example of the configuration of the control system including the positioning control device according to the second embodiment.
  • the control system 1 of the second embodiment further includes a current consumption measuring device 62 in the motor 50.
  • the current consumption measuring device 62 is a device that acquires a current consumption value, which is a value of the current consumed at the installation location.
  • the current consumption measuring device 62 outputs the measured current consumption value to the positioning control device 10.
  • the positioning control device 10 further includes a current consumption acquisition unit 16.
  • the current consumption acquisition unit 16 acquires and holds the current consumption value output from the current consumption measuring device 62.
  • the machine learning unit 13 learns motor speed control parameters with which the specified priority item falls within the allowable range, according to the learning data created based on the combination of the motor speed control parameter, the information of the motor 50, the allowable range, and the vibration data and the current consumption value of the motor 50.
  • the allowable range of the tact time is set in advance in addition to the allowable range specified by the priority item. That is, the machine learning unit 13 generates a trained model that infers, from the information of the motor 50, the allowable range, the priority item, the vibration data of the motor 50, and the current consumption value, the motor speed control parameter with which the tact time is within the allowable range and the specified priority item is within the allowable range.
  • the priority item specified is vibration or current consumption.
  • the trained model is a model that has learned the correlation between the vibration and the current consumption value at the installation location of the motor 50 and the motor speed control parameter, from the information of the motor 50, the allowable range, the vibration data, and the current consumption value.
  • the machine learning unit 13 may learn motor speed control parameters with which the designated priority item falls within the permissible range, or may learn motor speed control parameters with which not only the designated priority item but also items other than the priority item fall within the permissible range. Further, the machine learning unit 13 analyzes the information of the motor 50, the allowable range, the value of the vibration data of the motor 50, and the current consumption value by using the trained model, and outputs motor speed control parameters with which, for the motor 50 having that motor information, the priority item falls within the permissible range.
  • FIG. 8 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to the second embodiment.
  • the data acquisition unit 131 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the current consumption value as learning data (step S51). Further, the data acquisition unit 131 acquires priority items from the parameter storage unit 11 (step S52). The model generation unit 132 determines whether the priority item is vibration or current consumption (step S53).
  • the model generation unit 132 calculates the reward based on the motor speed control parameter, the information of the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the current consumption value, and determines the magnitude of the vibration. The increase or decrease of the reward is determined based on the relationship with the first threshold value and the relationship between the current consumption value and the second threshold value. The second threshold value is, in one example, the current consumption value defined in the allowable range.
  • the reward calculation unit 141 determines whether the value of the vibration data when the motor 50 is operated by the motor speed control parameter is less than the first threshold value (step S54).
  • When the value of the vibration data is less than the first threshold value, the reward calculation unit 141 increases the reward (step S55). If items other than the priority item are not considered, the reward is increased whenever the vibration data, which is the priority item, is less than the first threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the second threshold value and the current consumption value. In one example, the reward calculation unit 141 makes the reward larger when the difference between the second threshold value and the current consumption value is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the second threshold value and the current consumption value is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the reward may be increased as the current consumption value becomes smaller and decreased as the current consumption value becomes larger. Further, when the current consumption value exceeds the second threshold value and falls outside the permissible range, the reward may be reduced.
  • Even in these cases, vibration is the priority item, so it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the increase or decrease of the reward than the difference between the second threshold value and the current consumption value. As a result, when both the vibration and the current consumption are within the permissible range, the reward is higher, and the action value can be raised, compared with the case where the vibration is within the permissible range but the current consumption is not.
  • When the value of the vibration data is not less than the first threshold value, the reward calculation unit 141 reduces the reward (step S56). If items other than the priority item are not considered, the reward is reduced whenever the vibration data, which is the priority item, is larger than the first threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the second threshold value and the current consumption value. In one example, the reward calculation unit 141 makes the absolute value of the reward to be reduced smaller when the difference between the second threshold value and the current consumption value is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the second threshold value and the current consumption value is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the absolute value of the reward to be reduced may be made smaller as the current consumption value becomes smaller and larger as the current consumption value becomes larger. Further, when the current consumption value exceeds the second threshold value and falls outside the permissible range, the absolute value of the reward to be reduced may be increased. However, even in these cases, vibration is the priority item, so it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the increase or decrease of the reward than the difference between the second threshold value and the current consumption value.
  • When the priority item is current consumption, the model generation unit 132 calculates the reward based on the motor speed control parameter, the information of the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the current consumption value, and determines whether to increase or decrease the reward based on the relationship between the magnitude of the vibration and the first threshold value and the relationship between the current consumption value and the second threshold value. Here, the reward calculation unit 141 determines whether the current consumption value when the motor 50 is operated by the motor speed control parameter is less than the second threshold value (step S57).
  • When the current consumption value is less than the second threshold value, the reward calculation unit 141 increases the reward (step S58). If items other than the priority item are not considered, the reward is increased whenever the current consumption value, which is the priority item, is less than the second threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the second threshold value and the current consumption value. In one example, the reward calculation unit 141 makes the reward larger when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the reward may be increased as the value of the vibration data becomes smaller and decreased as the value of the vibration data becomes larger. Further, when the value of the vibration data exceeds the first threshold value and falls outside the permissible range, the reward may be reduced.
  • Even in these cases, current consumption is the priority item, so it is desirable that the difference between the second threshold value and the current consumption value contribute more to the increase or decrease of the reward than the difference between the first threshold value and the value of the vibration data. As a result, when both the vibration and the current consumption are within the permissible range, the reward is higher, and the action value can be raised, compared with the case where the current consumption is within the permissible range but the vibration is not.
  • When the current consumption value is not less than the second threshold value, the reward calculation unit 141 reduces the reward (step S59). If items other than the priority item are not considered, the reward is reduced whenever the current consumption value, which is the priority item, is larger than the second threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the second threshold value and the current consumption value.
  • In one example, the reward calculation unit 141 makes the absolute value of the reward to be reduced smaller when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the absolute value of the reward to be reduced may be made smaller as the value of the vibration data becomes smaller and larger as the value of the vibration data becomes larger. Further, when the value of the vibration data exceeds the first threshold value and falls outside the permissible range, the absolute value of the reward to be reduced may be made larger.
  • Even in these cases, current consumption is the priority item, so it is desirable that the difference between the second threshold value and the current consumption value contribute more to the increase or decrease of the reward than the difference between the first threshold value and the value of the vibration data.
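  • The sign-based reward rules of steps S53 to S59 can be illustrated with the following minimal Python sketch. It is not the implementation of the reward calculation unit 141. The function name `reward_delta`, the weights `w_priority` and `w_other`, and the treatment of a value exactly equal to its threshold as out of range are assumptions made for illustration. The only property taken from the description is that the priority item contributes more to the reward than the other item.

```python
def reward_delta(vibration, current, first_threshold, second_threshold,
                 priority="vibration", w_priority=1.0, w_other=0.25):
    """Return a signed reward change for one run of the motor (illustrative only).

    The priority item decides the sign of the main term (steps S54/S57).
    The non-priority item adds a smaller term, so a reward that is being
    reduced is reduced less when the other item is still within range.
    """
    if not w_priority > w_other >= 0:
        raise ValueError("the priority item must carry the larger weight")

    # "Less than the threshold" is treated as within range (assumption:
    # a value exactly equal to the threshold counts as out of range).
    vibration_ok = vibration < first_threshold
    current_ok = current < second_threshold

    if priority == "vibration":
        priority_ok, other_ok = vibration_ok, current_ok
    elif priority == "current":
        priority_ok, other_ok = current_ok, vibration_ok
    else:
        raise ValueError("priority must be 'vibration' or 'current'")

    delta = w_priority if priority_ok else -w_priority
    delta += w_other if other_ok else -w_other
    return delta
```

  • With these assumed weights, a run in which both items are within range yields 1.25, a run in which only the priority item is within range yields 0.75, and a run in which only the non-priority item is within range yields -0.75 rather than -1.25.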
  • The learned model storage unit 133 stores the action-value function Q(s_t, a_t) expressed by equation (1).
  • The machine learning unit 13 repeatedly executes the processes from step S51 to step S60 described above, and stores the generated action-value function Q(s_t, a_t) as a learned model. Further, when a feedback value regarding an item other than the vibration and the current consumption, such as the takt time, is acquired, the learning data at the time of acquiring that data is accumulated.
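  • Equation (1) is not reproduced in this part of the description. Assuming that it is the standard one-step Q-learning update, the repeated learning step can be sketched in Python as follows. The table layout, the learning rate `alpha`, and the discount factor `gamma` are hypothetical choices for illustration, not values taken from the disclosure.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s_t, a_t).

    q_table maps (state, action) pairs to action values; unseen pairs
    default to 0.0. `actions` lists the candidate motor speed control
    parameter settings available in next_state.
    """
    # Highest action value reachable from the next state.
    best_next = max(q_table[(next_state, a)] for a in actions)
    # Temporal-difference target and update.
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# A fresh, empty action-value table (all values start at 0.0).
q_table = defaultdict(float)
```

  • In this sketch, the table that results from repeating the update plays the role of the learned model that is stored.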
  • FIG. 9 is a flowchart showing an example of a processing procedure of a motor control method in the positioning control device according to the second embodiment.
  • First, the initial configuration of the control system 1 is determined by the user. Then, the parameter storage unit 11 stores values including the information of the motor 50, the allowable range, and the priority item (step S71). The information of the motor 50, the allowable range, and the priority item are set by the user via an input unit (not shown). Further, the sensor value acquisition unit 12 acquires vibration data from the vibration sensor 61 (step S72) and holds it. Further, the current consumption acquisition unit 16 acquires and holds the current consumption value from the current consumption measuring device 62 (step S73).
  • The inference unit 134 of the machine learning unit 13 analyzes the values in the parameter storage unit 11, the value of the acquired vibration data, and the current consumption value using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the set priority item falls within the allowable range (step S74). If the priority item is vibration, the motor speed control parameter that provides the optimum vibration within the permissible range is determined. At this time, the motor speed control parameter may be determined so that the current consumption, which is an item other than the priority item, is also within the allowable range. If the priority item is current consumption, the motor speed control parameter that provides the optimum current consumption within the permissible range is determined. At this time, the motor speed control parameter may be determined so that the vibration, which is an item other than the priority item, is also within the allowable range.
  • the motor speed control parameter output unit 14 outputs the set motor speed control parameter to the pulse output unit 15 (step S75).
  • the pulse output unit 15 outputs a pulse to the amplifier 30 based on the motor speed control parameter from the motor speed control parameter output unit 14 (step S76).
  • In step S72, the sensor value acquisition unit 12 acquires vibration data representing the vibration of the motor 50. Further, when the motor 50 is driven, the current consumption value of the motor 50 is detected by the current consumption measuring device 62 and acquired by the current consumption acquisition unit 16 as described in step S73. Then, as described above, the processes from steps S72 to S76 are repeatedly executed. In one example, the value of the vibration data acquired by the sensor value acquisition unit 12 changes beyond the permissible range, or the current consumption value acquired by the current consumption acquisition unit 16 changes beyond the permissible range.
  • In step S74, the inference unit 134 analyzes the information of the motor 50, the permissible range, the priority item, the value of the vibration data, and the current consumption value using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the priority item falls within the permissible range.
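  • The repeated processing of steps S72 to S76 can be pictured with the Python sketch below. It assumes, purely for illustration, that the trained model is a table of action values keyed by a coarse state (whether vibration and current consumption are within range) and by a candidate parameter tuple. The callbacks `read_vibration`, `read_current`, and `output_pulses` are hypothetical stand-ins for the vibration sensor 61, the current consumption measuring device 62, and the pulse output unit 15.

```python
def select_parameter(q_table, state, candidates):
    """Step S74 (sketch): choose the candidate with the highest action value."""
    return max(candidates, key=lambda c: q_table.get((state, c), 0.0))


def control_loop(q_table, candidates, read_vibration, read_current,
                 output_pulses, first_threshold, second_threshold, cycles):
    """Run a fixed number of sense, infer, and output cycles; return the choices."""
    history = []
    for _ in range(cycles):
        vibration = read_vibration()   # step S72: acquire vibration data
        current = read_current()       # step S73: acquire current consumption
        # Coarse state: is each item inside its allowable range?
        state = (vibration < first_threshold, current < second_threshold)
        params = select_parameter(q_table, state, candidates)  # step S74
        output_pulses(params)          # steps S75 and S76: drive the amplifier
        history.append(params)
    return history
```

  • When a measured value leaves its allowable range, the state changes and a different candidate may be selected in the next cycle, which corresponds to the re-setting of the parameter described for step S74.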
  • FIG. 10 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the second embodiment.
  • the same components as those in FIGS. 1, 5 and 7 are designated by the same reference numerals, the description thereof will be omitted, and the parts different from those in FIGS. 1, 5 and 7 will be described.
  • the vibration sensor 61 and the current consumption measuring device 62 are provided not in the motor 50 but in the product 52 which is a device including the motor 50 and the drive unit 51 driven by the motor 50.
  • the machine learning unit 13 can optimize the vibration value of the product 52 including the drive unit 51 and the like as well as the motor 50.
  • FIG. 11 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the second embodiment.
  • the same components as those in FIGS. 1, 6 and 7 are designated by the same reference numerals, the description thereof will be omitted, and the parts different from those in FIGS. 1, 6 and 7 will be described.
  • the vibration sensor 61 and the current consumption measuring device 62 are provided in the system 53 including the plurality of products 52.
  • a system 53 is a multi-axis control system including a plurality of motors 50 and products 52.
  • the machine learning unit 13 can optimize the vibration value of the system 53 including the plurality of products 52.
  • the machine learning unit 13 of the positioning control device 10 learns the motor speed control parameter in which the set priority item is within the permissible range according to the learning data, and generates a trained model.
  • the learning data is created based on a combination of vibration data from the vibration sensor 61, current consumption value from the current consumption measuring device 62, information on the motor 50, motor speed control parameters, and an allowable range.
  • the vibration sensor 61 and the current consumption measuring device 62 are provided in a motor 50, a product 52 including a drive unit 51 connected to the motor 50, or a system 53 including a plurality of products 52.
  • The machine learning unit 13 analyzes the information of the motor 50, the permissible range, the value of the vibration data, and the current consumption value using the trained model, and sets the motor speed control parameter so that the takt time is within the permissible range and the set priority item is within the allowable range.
  • When the current consumption is set as the priority item, the motor 50 is driven by the set motor speed control parameter, so that the takt time can be shortened and the motor 50 does not consume excessive power.
  • The product 52 or the system 53 can thus be operated so as to save power in the entire system 53.
  • the servo gain which is a control value
  • The motor speed control parameter is determined by the output control of the positioning control device 10. Therefore, the life or the energy saving of the product 52 or the system 53 in which the motor 50 is provided can be improved not only for equipment that is expensive and capable of feedback control, such as a servomotor and an amplifier having an encoder, but also for equipment that has no feedback mechanism, such as a stepping motor and its amplifier.
  • the machine learning unit 13 is designed to learn motor speed control parameters that fall within the permissible range for all items, not just the items specified in the priority items. This makes it possible to set motor speed control parameters that maintain the minimum current consumption value while suppressing vibration within the permissible range when vibration is prioritized. On the contrary, when the current consumption is prioritized, the motor speed control parameter can be set so as to maintain the minimum vibration while suppressing the current consumption within the allowable range.
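  • The selection goal described above (keep the priority item within its allowable range, then make the other item as small as possible) can be shown with a plain search over recorded trials. This Python sketch is only an illustration of that goal. It does not use the trained model, and the record layout with the keys `params`, `vibration`, and `current` is hypothetical.

```python
def pick_parameters(trials, priority_key, other_key, priority_limit):
    """Return the trial that keeps the priority item below its limit and
    has the smallest value of the other item, or None if no trial qualifies.

    Each trial is a dict holding the parameters that were used and the
    measurements observed with them.
    """
    # Keep only trials whose priority item is within the allowable range.
    feasible = [t for t in trials if t[priority_key] < priority_limit]
    if not feasible:
        return None
    # Among those, prefer the smallest value of the non-priority item.
    return min(feasible, key=lambda t: t[other_key])
```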
  • FIG. 12 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the third embodiment.
  • The control system 1 of the third embodiment further includes a takt time measuring device 63, which is provided not in the motor 50 but in the product 52, a device including the motor 50 and the drive unit 51 driven by the motor 50. That is, the takt time measuring device 63 is provided in the configuration of FIG.
  • the takt time measuring device 63 is a device that measures takt time using a camera, a sensor, or the like.
  • the tact time measuring device 63 outputs the measured tact time to the positioning control device 10.
  • the positioning control device 10 further includes a simulator unit 17 and a tact time acquisition unit 18.
  • the simulator unit 17 is built in the positioning control device 10, and simulates the takt time from the motor speed control parameters output by the motor speed control parameter output unit 14.
  • the simulator unit 17 reflects the tact time acquired by the simulation in the output.
  • the tact time acquisition unit 18 holds the tact time output from the tact time measuring device 63 and the tact time simulated by the simulator unit 17.
  • the value to be held may be only one of the tact time output from the tact time measuring device 63 and the tact time simulated by the simulator unit 17.
  • The machine learning unit 13 learns the motor speed control parameter with which the designated priority item falls within the allowable range, according to learning data created based on a combination of the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the takt time.
  • the specified priority is vibration or takt time.
  • The trained model is a model that has learned, from the information of the motor 50, the allowable range, the vibration data, and the takt time, the correlation between the motor speed control parameter and the vibration and takt time at the installation location of the motor 50.
  • The machine learning unit 13 analyzes the information of the motor 50, the permissible range, the value of the vibration data of the motor 50, and the takt time using the trained model, and outputs a motor speed control parameter with which the priority item falls within the permissible range for the motor 50 described by that information.
  • FIG. 13 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to the third embodiment.
  • the data acquisition unit 131 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the tact time as learning data (step S91). Further, the data acquisition unit 131 acquires priority items from the parameter storage unit 11 (step S92). The model generation unit 132 determines whether the priority item is vibration or takt time (step S93).
  • When the priority item is vibration, the model generation unit 132 calculates the reward based on the motor speed control parameter, the information of the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the takt time, and determines whether to increase or decrease the reward based on the relationship between the magnitude of the vibration and the first threshold value and the relationship between the takt time and the third threshold value. The third threshold value is, in one example, the value of the takt time defined in the permissible range.
  • the reward calculation unit 141 determines whether the value of the vibration data when the motor 50 is operated by the motor speed control parameter is less than the first threshold value (step S94).
  • When the value of the vibration data is less than the first threshold value, the reward calculation unit 141 increases the reward (step S95). If items other than the priority item are not considered, the reward is increased whenever the vibration data, which is the priority item, is less than the first threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the third threshold value and the takt time. In one example, the reward calculation unit 141 makes the reward larger when the difference between the third threshold value and the takt time is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the third threshold value and the takt time is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the reward may be increased as the takt time becomes shorter and decreased as the takt time becomes longer. Further, when the takt time exceeds the third threshold value and falls outside the permissible range, the reward may be reduced.
  • Even in these cases, vibration is the priority item, so it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the increase or decrease of the reward than the difference between the third threshold value and the takt time.
  • When the value of the vibration data is not less than the first threshold value, the reward calculation unit 141 reduces the reward (step S96). If items other than the priority item are not considered, the reward is reduced whenever the vibration data, which is the priority item, is larger than the first threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the third threshold value and the takt time. In one example, the reward calculation unit 141 makes the absolute value of the reward to be reduced smaller when the difference between the third threshold value and the takt time is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the third threshold value and the takt time is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the absolute value of the reward to be reduced may be made smaller as the takt time becomes shorter and larger as the takt time becomes longer. Further, when the takt time exceeds the third threshold value and falls outside the permissible range, the absolute value of the reward to be reduced may be increased. However, even in these cases, vibration is the priority item, so it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the increase or decrease of the reward than the difference between the third threshold value and the takt time.
  • When the priority item is takt time, the model generation unit 132 calculates the reward based on the motor speed control parameter, the information of the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, and the takt time, and determines whether to increase or decrease the reward based on the relationship between the magnitude of the vibration and the first threshold value and the relationship between the takt time and the third threshold value. Here, the reward calculation unit 141 determines whether the takt time when the motor 50 is operated by the motor speed control parameter is less than the third threshold value (step S97).
  • When the takt time is less than the third threshold value, the reward calculation unit 141 increases the reward (step S98). If items other than the priority item are not considered, the reward is increased whenever the takt time, which is the priority item, is less than the third threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the third threshold value and the takt time. In one example, the reward calculation unit 141 makes the reward larger when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the reward may be increased as the value of the vibration data becomes smaller and decreased as the value of the vibration data becomes larger. Further, when the value of the vibration data exceeds the first threshold value and falls outside the permissible range, the reward may be reduced.
  • Even in these cases, takt time is the priority item, so it is desirable that the difference between the third threshold value and the takt time contribute more to the increase or decrease of the reward than the difference between the first threshold value and the value of the vibration data. As a result, when both the vibration and the takt time are within the permissible range, the reward is higher, and the action value can be raised, compared with the case where the takt time is within the permissible range but the vibration is not.
  • When the takt time is not less than the third threshold value, the reward calculation unit 141 reduces the reward (step S99). If items other than the priority item are not considered, the reward is reduced whenever the takt time, which is the priority item, is larger than the third threshold value. However, when items other than the priority item are also to be kept within the permissible range, the reward may be determined according to both the difference between the first threshold value and the value of the vibration data and the difference between the third threshold value and the takt time.
  • In one example, the reward calculation unit 141 makes the absolute value of the reward to be reduced smaller when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative.
  • The reward may be determined depending on whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the permissible range, the absolute value of the reward to be reduced may be made smaller as the value of the vibration data becomes smaller and larger as the value of the vibration data becomes larger. Further, when the value of the vibration data exceeds the first threshold value and falls outside the permissible range, the absolute value of the reward to be reduced may be made larger.
  • Even in these cases, takt time is the priority item, so it is desirable that the difference between the third threshold value and the takt time contribute more to the increase or decrease of the reward than the difference between the first threshold value and the value of the vibration data.
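  • The magnitude-based variant of the reward, in which the reward follows the size of the difference from each threshold rather than only its sign, can be sketched in Python as follows. Dividing each difference by its threshold and combining the two terms linearly with the weights `w_priority` and `w_other` are assumptions made for illustration. The description only requires that the priority item contribute more than the other item.

```python
def shaped_reward(vibration, takt_time, first_threshold, third_threshold,
                  priority="vibration", w_priority=1.0, w_other=0.25):
    """Return a reward that grows as each item moves further inside its range.

    A positive margin means the item is below its threshold (within range);
    a negative margin means it is above the threshold (out of range).
    """
    if first_threshold <= 0 or third_threshold <= 0:
        raise ValueError("thresholds must be positive")
    if not w_priority > w_other >= 0:
        raise ValueError("the priority item must carry the larger weight")

    # Relative margins, so that items with different units are comparable.
    vibration_margin = (first_threshold - vibration) / first_threshold
    takt_margin = (third_threshold - takt_time) / third_threshold

    if priority == "vibration":
        return w_priority * vibration_margin + w_other * takt_margin
    if priority == "takt_time":
        return w_priority * takt_margin + w_other * vibration_margin
    raise ValueError("priority must be 'vibration' or 'takt_time'")
```

  • With this form, a shorter takt time always raises the reward, and an out-of-range priority item lowers the reward more strongly than an out-of-range non-priority item of the same relative size.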
  • The learned model storage unit 133 stores the action-value function Q(s_t, a_t) expressed by equation (1).
  • The machine learning unit 13 repeatedly executes the processes from step S91 to step S100 described above, and stores the generated action-value function Q(s_t, a_t) as a learned model. Further, when a feedback value regarding an item other than the vibration and the takt time, such as the current consumption, is acquired, the learning data at the time of acquiring that data is accumulated.
  • FIG. 14 is a flowchart showing an example of a processing procedure of a motor control method in the positioning control device according to the third embodiment.
  • First, the initial configuration of the control system 1 is determined by the user. Then, the parameter storage unit 11 stores values including the information of the motor 50, the allowable range, and the priority item (step S111). The information of the motor 50, the allowable range, and the priority item are set by the user via an input unit (not shown). Further, the sensor value acquisition unit 12 acquires and holds vibration data from the vibration sensor 61 (step S112). Further, the takt time acquisition unit 18 acquires and holds the takt time from the takt time measuring device 63 (step S113).
  • The inference unit 134 of the machine learning unit 13 analyzes the values in the parameter storage unit 11, the value of the acquired vibration data, and the takt time using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the set priority item falls within the allowable range (step S114). If the priority item is vibration, the motor speed control parameter that provides the optimum vibration within the permissible range is determined. At this time, the motor speed control parameter may be determined so that the takt time, which is an item other than the priority item, is also within the allowable range. If the priority item is takt time, the motor speed control parameter that provides the optimum takt time within the permissible range is determined. At this time, the motor speed control parameter may be determined so that the vibration, which is an item other than the priority item, is also within the allowable range.
  • the motor speed control parameter output unit 14 outputs the set motor speed control parameter to the pulse output unit 15 (step S115).
  • the pulse output unit 15 outputs a pulse to the amplifier 30 based on the motor speed control parameter from the motor speed control parameter output unit 14 (step S116).
  • the simulator unit 17 simulates the control system 1 from the motor speed control parameters obtained from the motor speed control parameter output unit 14 and calculates the takt time (step S117).
  • the simulator unit 17 outputs the calculated tact time to the tact time acquisition unit 18.
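  • The description does not state how the simulator unit 17 derives the takt time. As one hedged illustration, the Python sketch below estimates the time of a single positioning move from a travel distance, an operating pulse speed, and an acceleration rate. It assumes a symmetric trapezoidal speed profile that starts and ends at rest, and it assumes that the takt time equals the time of this one move. The function name and the units are hypothetical.

```python
import math


def simulate_move_time(distance_pulses, pulse_speed, acceleration):
    """Estimate the duration in seconds of one move with a trapezoidal profile.

    distance_pulses: travel distance in pulses
    pulse_speed:     operating (cruise) speed in pulses per second
    acceleration:    acceleration and deceleration rate in pulses per second^2
    """
    if distance_pulses <= 0 or pulse_speed <= 0 or acceleration <= 0:
        raise ValueError("all arguments must be positive")

    # Distance covered while accelerating to cruise speed and braking to rest.
    ramp_distance = pulse_speed ** 2 / acceleration

    if distance_pulses >= ramp_distance:
        # Trapezoid: two ramps of v/a each plus a cruise segment,
        # which simplifies to d/v + v/a.
        return distance_pulses / pulse_speed + pulse_speed / acceleration

    # Triangle: the move is too short to reach cruise speed.
    return 2.0 * math.sqrt(distance_pulses / acceleration)
```

  • Under these assumptions, a move of 10000 pulses at 1000 pulses per second with an acceleration of 2000 pulses per second squared takes 10.5 seconds.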
  • This drives the motor 50, after which processing continues at step S112.
  • the vibration of the motor 50 is detected by the vibration sensor 61 provided in the motor 50.
  • the sensor value acquisition unit 12 acquires vibration data, which is the vibration of the motor 50.
  • The takt time of the motor 50 is acquired by the takt time acquisition unit 18 as described in step S113.
  • the processes from steps S112 to S116 are repeatedly executed.
  • In one example, the value of the vibration data acquired by the sensor value acquisition unit 12 changes beyond the permissible range, or the takt time acquired by the takt time acquisition unit 18 changes beyond the permissible range.
  • In step S114, the inference unit 134 analyzes the information of the motor 50, the allowable range, the priority item, the value of the vibration data, and the takt time using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the priority item falls within the allowable range.
  • FIG. 15 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
  • the same components as those in FIGS. 1, 6 and 12 are designated by the same reference numerals, the description thereof will be omitted, and the parts different from those in FIGS. 1, 6 and 12 will be described.
  • the vibration sensor 61 and the takt time measuring device 63 are provided in a system 53 including a plurality of products 52.
  • a system 53 is a multi-axis control system including a plurality of motors 50 and products 52.
  • the machine learning unit 13 can optimize the vibration value of the system 53 including the plurality of products 52.
  • machine learning can be performed in consideration of the tact time in addition to the vibration of the motor 50 in the configuration of the first embodiment.
  • machine learning may be performed in consideration of the tact time in addition to the vibration and the current consumption of the motor 50.
  • FIG. 16 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
  • the control system 1 of FIG. 16 shows a case where the third embodiment is applied to the configuration of FIG. 10 of the second embodiment.
  • the control system 1 further includes a takt time measuring device 63 in the product 52.
  • the positioning control device 10 further includes a simulator unit 17 and a tact time acquisition unit 18.
  • FIG. 17 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
  • the control system 1 of FIG. 17 shows a case where the third embodiment is applied to the configuration of FIG. 11 of the second embodiment.
  • the control system 1 further includes a takt time measuring device 63 in a system 53 including a plurality of products 52.
  • the positioning control device 10 further includes a simulator unit 17 and a tact time acquisition unit 18.
  • the machine learning unit 13 learns, according to the learning data created based on the combination of the motor speed control parameter, the information of the motor 50, the allowable range, the vibration data of the motor 50, the current consumption value, and the takt time, motor speed control parameters for which the item designated as the priority item is within the permissible range. That is, a trained model is generated that infers, from the information of the motor 50, the vibration data of the motor 50, the current consumption value, and the takt time, the motor speed control parameter for which the designated priority item falls within the allowable range.
  • the priority items specified are vibration, current consumption or takt time.
  • the trained model is a model that has learned, from the information of the motor 50, the allowable range, the vibration data, the current consumption, and the takt time, the correlation between the vibration at the installation location of the motor 50, the current consumption, the takt time, and the motor speed control parameter.
  • the machine learning unit 13 may learn motor speed control parameters for which the designated priority item falls within the permissible range, or may learn motor speed control parameters for which not only the designated priority item but also the items other than the priority item fall within the permissible range. Further, the machine learning unit 13 analyzes the information of the motor 50, the allowable range and the priority item, the value of the vibration data of the motor 50, the current consumption value, and the takt time by using the trained model, and outputs motor speed control parameters for which the priority item is within the permissible range.
  • the machine learning method in the machine learning unit 13 is a combination of the methods shown in the second and third embodiments, and the basic processing is the same, so the description thereof will be omitted.
  • the model generation unit 132 determines whether the priority item is vibration, current consumption, or takt time. Then, in each case, the reward calculation unit 141 increases or decreases the reward based on the relationship between the priority item and the threshold value. At this time, the reward may be determined by considering only whether the priority item is within the allowable range, or by considering whether both the priority item and the items other than the priority item are within the allowable range.
  • the motor control method using the trained model in the positioning control device 10 is a combination of those shown in the second and third embodiments, and the basic processing is the same, so the description thereof will be omitted.
  • the inference unit 134 analyzes the value of the parameter storage unit 11, the value of the acquired vibration data, the current consumption value, and the takt time using the trained model, and sets, in the motor speed control parameter output unit 14, the motor speed control parameter for which the priority item is within the allowable range.
  • FIG. 18 is a diagram showing an example of a speed command in the positioning control device according to the third embodiment.
  • the horizontal axis represents time and the vertical axis represents pulse frequency or velocity.
  • the time from speed 0 until the command speed is reached is called the actual acceleration time.
  • the time from the command speed until speed 0 is reached is called the actual deceleration time.
  • the area of the part where the command speed is continued during the speed command represents the moving distance.
  • the takt time is the time from speed 0, through acceleration up to the command speed, until speed 0 is reached again after deceleration.
  • FIG. 19 is a diagram showing an example of learning results of motor speed control parameters in the positioning control device according to the third embodiment.
  • the horizontal axis represents time and the vertical axis represents pulse frequency or velocity.
  • the speed command curve F1 shows a speed command curve with motor speed control parameters set so that the tact time is short.
  • the actual acceleration time and the actual deceleration time are shortened. That is, sudden acceleration and deceleration are performed, and the current consumption increases and the vibration also increases.
  • the speed command curve F2 shows a speed command curve with motor speed control parameters set so that the tact time is longer than that of the speed command curve F1.
  • the actual acceleration time and the actual deceleration time are longer than those of the speed command curve F1. Therefore, the vibration can be reduced as compared with the case of the speed command curve F1, but the acceleration and deceleration are too slow.
  • the command speed, which is the steady operation speed, is lower than that of the speed command curve F1.
  • the current consumption becomes large and the tact time becomes long.
  • the speed command curve F3 has a longer tact time and a smaller command speed than the speed command curve F1, and has a shorter tact time and a larger command speed than the speed command curve F2.
  • the actual acceleration time and the actual deceleration time are between the speed command curve F1 and the speed command curve F2, and the vibration is suppressed.
  • the current consumption can be reduced as compared with the speed command curve F2.
  • the tact time can also be shortened as compared with the speed command curve F2. That is, in the speed command curve F3, the vibration, the current consumption, and the tact time are all within the permissible range and are balanced values.
  • the machine learning unit 13 will set motor speed control parameters such as the speed command curve F3.
  • the machine learning unit 13 of the positioning control device 10 learns the motor speed control parameter in which the set priority item is within the permissible range according to the learning data, and generates a trained model.
  • the learning data is created based on a combination of vibration data from the vibration sensor 61, tact time from the tact time measuring device 63, information on the motor 50, motor speed control parameters, and an allowable range.
  • the vibration sensor 61, the current consumption measuring device 62, and the takt time measuring device 63 are provided in the motor 50, the product 52 including the drive unit 51 connected to the motor 50, or the system 53 including the plurality of products 52.
  • the machine learning unit 13 analyzes the information of the motor 50, the permissible range, the value of the vibration data, and the takt time by using the trained model, and sets the motor speed control parameters for which the set priority item is within the permissible range.
  • when vibration is set as the priority item, driving the motor 50 with the set motor speed control parameter operates the product 52 or the system 53 without applying excessive vibration to the motor 50, so the burden on the motor 50 can be reduced and the life of the motor 50 can be extended.
  • when the takt time is set as the priority item, the takt time can be shortened and the production efficiency of the entire system 53 can be improved by driving the motor 50 with the set motor speed control parameter.
  • the servo gain which is a control value
  • the motor speed control parameter is determined by the output control of the positioning control device 10. Therefore, the life, energy saving, and production efficiency of the product 52 or the system 53 in which the motor 50 is provided can be improved not only for expensive equipment capable of feedback control, such as a servomotor having an encoder and its amplifier, but also for equipment having no feedback mechanism, such as a stepping motor and its amplifier.
  • the gain is only adjusted and the effect of shortening the tact time is small.
  • the motor speed control parameter is adjusted, so that, depending on the restrictions, it is possible to significantly reduce the takt time compared to the conventional technology.
  • the motor speed control parameter can be set so as to maintain the minimum vibration and the current consumption while keeping the takt time within the allowable range.
  • FIG. 20 is a diagram schematically showing an example of a hardware configuration that realizes the positioning control device according to the first, second, and third embodiments.
  • the processor 101 and the memory 102 are connected via the bus line 103.
  • An example of the processor 101 is a CPU (Central Processing Unit) or a system LSI (Large Scale Integration).
  • Examples of the memory 102 are a RAM (Random Access Memory) or a ROM (Read Only Memory), which are main storage devices, and an HDD (Hard Disk Drive) or an SSD (Solid State Drive), which are auxiliary storage devices.
  • a part or all of the functions of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the tact time acquisition unit 18 are performed by the processor 101.
  • some or all of the functions are realized by the processor 101 and software, firmware, or a combination of software and firmware.
  • the software or firmware is written as a program and stored in the memory 102.
  • the processor 101 reads and executes the program stored in the memory 102, whereby a part or all of the functions of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18 are realized.
  • the memory 102 of the positioning control device 10 stores a program that results in the execution of the steps performed by a part or all of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18.
  • the program stored in the memory 102 can also be said to cause a computer to perform the procedures or methods performed by a part or all of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18.
  • the configurations shown in the above embodiments are examples; they can be combined with another known technique or with each other, and a part of a configuration can be omitted or changed without departing from the gist.
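The speed command described for FIG. 18 (actual acceleration time, command speed, actual deceleration time, takt time) can be illustrated with a small calculation. The following is only a sketch under stated assumptions and not the method of the disclosure: the class `SpeedCommand`, the function `takt_time`, the linear ramps, and the units are hypothetical, and the sketch assumes the moving distance equals the area under the whole speed curve, ramps included.

```python
from dataclasses import dataclass


@dataclass
class SpeedCommand:
    """Hypothetical trapezoidal speed command (illustrative names and units)."""
    command_speed: float  # steady pulse frequency, pulses per second
    accel_time: float     # actual acceleration time, seconds
    decel_time: float     # actual deceleration time, seconds
    total_pulses: float   # moving distance expressed in pulses


def takt_time(cmd: SpeedCommand) -> float:
    """Time from speed 0, up to the command speed, and back down to speed 0."""
    # Assumption: linear ramps, so the pulses issued while ramping equal
    # the area of the two triangles under the speed curve.
    ramp_pulses = 0.5 * cmd.command_speed * (cmd.accel_time + cmd.decel_time)
    if ramp_pulses > cmd.total_pulses:
        raise ValueError("move is too short to reach the command speed")
    # The remaining pulses are issued at the constant command speed.
    cruise_time = (cmd.total_pulses - ramp_pulses) / cmd.command_speed
    return cmd.accel_time + cruise_time + cmd.decel_time
```

For the same command speed and moving distance, shorter ramps give a shorter takt time in this model, which corresponds to the trade-off against vibration and current consumption discussed for the speed command curves F1 to F3.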

Abstract

The present disclosure relates to a positioning control device that is electrically coupled to a motor via an amplifier and that controls the motor, the positioning control device comprising a parameter storage unit, a vibration data acquisition unit, a machine learning unit, and an outputting unit. The parameter storage unit stores a parameter including information on the motor and a tolerance range for motor vibration and takt time, the parameter being information necessary for determining a motor speed control parameter which is a parameter for controlling the motor. The vibration data acquisition unit acquires vibration data which is vibration, as detected by a vibration sensor, at a location where the motor is installed. Using a learned model obtained by learning, from said parameter and said vibration data, a correlation between vibrations at locations where the motor is to be installed and said motor speed control parameter, the machine learning unit determines, from said parameter and said vibration data, said motor speed control parameter that allows the vibration to fit within said tolerance range. On the basis of the determined motor speed control parameter, the outputting unit outputs, to the amplifier, pulses for controlling the amplifier.

Description

Positioning control device and machine learning device
The present disclosure relates to a positioning control device and a machine learning device that perform positioning control of a motor using motor speed control parameters, which control the speed of the motor.
In a system such as a production line using a motor, the motor is electrically connected to a control device such as a programmable logic controller (PLC) via an amplifier. In such a system, the user inputs motor speed control parameters, which are the parameters necessary for controlling the speed of the motor. Examples of motor speed control parameters are position, operating pulse speed, and acceleration rate. Based on the motor speed control parameters input by the user, the PLC outputs a pulse signal, which is a command signal, to the amplifier, and the amplifier controls the motor based on the pulse signal. That is, the current consumption of the system, the load on the motor, and the takt time are determined by the motor speed control parameters input by the user.
Normally, the motor speed control parameters are determined based on the user's experience and the like, according to motor information including the motor type, the equipment, and the product. If the system does not perform the intended operation, the user determines and inputs the motor speed control parameters again. This process is repeated until the system performs the intended operation. In addition, when the motor speed control parameters are determined, setting a large acceleration parameter to shorten the takt time, for example, may increase the vibration of the equipment or the power consumption, that is, it may have a negative effect on items other than the takt time. Therefore, the user needs to take care that the takt time, the vibration of the equipment, the power consumption, and so on fall within their permissible ranges. However, takt time, vibration, and power consumption are in a trade-off relationship, and it is difficult to derive from experience speed control parameters that balance all three: a short takt time, low vibration, and low power consumption.
Patent Document 1 discloses a control system including a machine that processes a workpiece, a higher-level device that is positioned above one or more control devices controlling the machine and that adjusts the servo gain used in machining by the machine, and a machine learning device that performs machine learning of the adjustment of the servo gain of the machine. In the control system described in Patent Document 1, the machine learning device determines a servo gain adjustment action for the machine based on the machine learning result of the servo gain adjustment and on state data, and changes the servo gain of the machine.
Japanese Unexamined Patent Publication No. 2018-097680
However, in the control system described in Patent Document 1, machine learning is performed by a higher-level device positioned above the control device that controls the machine, and the servo gain of the machine is adjusted. As a result, only the followability to the position command is optimized, and there is a problem that the takt time cannot be shortened. Further, the control system described in Patent Document 1 does not take into account the vibration caused by the motor, which greatly affects the life of the system, and is therefore insufficient in terms of extending the life of the system.
The present disclosure has been made in view of the above, and an object thereof is to obtain a positioning control device that can shorten the takt time compared with the conventional art and extend the life of a control system including a motor.
In order to solve the above-mentioned problems and achieve the object, the present disclosure provides a positioning control device that is electrically connected to a motor via an amplifier and controls the motor, the positioning control device including a parameter storage unit, a vibration data acquisition unit, a machine learning unit, and an output unit. The parameter storage unit stores parameters that are information necessary for determining a motor speed control parameter, which is a parameter for controlling the motor, and that include information of the motor and allowable ranges for the vibration of the motor and the takt time. The vibration data acquisition unit acquires vibration data, which is the vibration at the installation location of the motor detected by a vibration sensor. The machine learning unit uses a trained model that has learned, from the parameters and the vibration data, the correlation between the vibration at the installation location of the motor and the motor speed control parameter, and determines, from the parameters and the vibration data, a motor speed control parameter for which the vibration falls within the allowable range. The output unit outputs, to the amplifier, pulses for controlling the amplifier based on the determined motor speed control parameter.
The positioning control device according to the present disclosure has the effect of shortening the takt time compared with the conventional art and extending the life of the control system including the motor.
FIG. 1 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the first embodiment.
FIG. 2 is a block diagram schematically showing an example of the functional configuration of the machine learning unit included in the positioning control device according to the first embodiment.
FIG. 3 is a flowchart showing an example of the learning processing procedure of the machine learning unit included in the positioning control device according to the first embodiment.
FIG. 4 is a flowchart showing an example of the processing procedure of the motor control method in the positioning control device according to the first embodiment.
FIG. 5 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the first embodiment.
FIG. 6 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the first embodiment.
FIG. 7 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the second embodiment.
FIG. 8 is a flowchart showing an example of the learning processing procedure of the machine learning unit included in the positioning control device according to the second embodiment.
FIG. 9 is a flowchart showing an example of the processing procedure of the motor control method in the positioning control device according to the second embodiment.
FIG. 10 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the second embodiment.
FIG. 11 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the second embodiment.
FIG. 12 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the third embodiment.
FIG. 13 is a flowchart showing an example of the learning processing procedure of the machine learning unit included in the positioning control device according to the third embodiment.
FIG. 14 is a flowchart showing an example of the processing procedure of the motor control method in the positioning control device according to the third embodiment.
FIG. 15 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
FIG. 16 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
FIG. 17 is a block diagram showing another example of the configuration of the control system including the positioning control device according to the third embodiment.
FIG. 18 is a diagram showing an example of a speed command in the positioning control device according to the third embodiment.
FIG. 19 is a diagram showing an example of learning results of motor speed control parameters in the positioning control device according to the third embodiment.
FIG. 20 is a diagram schematically showing an example of a hardware configuration that realizes the positioning control device according to the first, second, and third embodiments.
Hereinafter, the positioning control device and the machine learning device according to embodiments of the present disclosure will be described in detail with reference to the drawings.
Embodiment 1.
FIG. 1 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the first embodiment. The control system 1 includes a positioning control device 10, an amplifier 30, a motor 50, and a vibration sensor 61.
The positioning control device 10 is a device that obtains, from the information of the motor 50 and from vibration data acquired from the vibration sensor 61 of the equipment to which the positioning control device 10 is connected, motor speed control parameters such that the vibration of the connected motor 50 is optimal within a predetermined range. The positioning control device 10 includes a parameter storage unit 11, a sensor value acquisition unit 12, a machine learning unit 13, a motor speed control parameter output unit 14, and a pulse output unit 15. An example of the positioning control device 10 is a PLC or a positioning motion unit.
The parameter storage unit 11 stores parameters that are information about the motor 50 to be controlled. The parameters are information necessary for determining the motor speed control parameters, and include the information of the motor 50 input by the user and the operating conditions of the motor 50. The information of the motor 50 includes the type, power capacity, rated speed, and size of the motor 50. The operating conditions are conditions that the control system 1 should satisfy when the motor 50 is controlled with the set motor speed control parameters. The operating conditions include an allowable range and a priority item. The allowable range is a condition that defines the maximum values of the takt time, the current consumption of the motor 50, and the vibration of the motor 50 during operation of the control system 1. The priority item is a condition indicating which of the takt time, the current consumption of the motor 50, and the vibration of the motor 50 the user wants to optimize. In the first embodiment, it is assumed that allowable ranges are set for the vibration of the motor 50 and for the takt time, which is in a trade-off relationship with the vibration of the motor 50, and that the vibration of the motor 50 is set as the priority item.
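As an illustration of what the parameter storage unit 11 holds, the following is a minimal sketch under assumptions rather than the data layout of the disclosure. The class names, field names, and units (`MotorInfo`, `AllowableRange`, `StoredParameters`, watts, rpm, and so on) are hypothetical; only the grouping into motor information, allowable range, and priority item follows the text.

```python
from dataclasses import dataclass
from enum import Enum


class PriorityItem(Enum):
    """Item the user wants to optimize (names are illustrative)."""
    VIBRATION = "vibration"
    CURRENT_CONSUMPTION = "current_consumption"
    TAKT_TIME = "takt_time"


@dataclass(frozen=True)
class MotorInfo:
    """Information of the motor; units are assumptions of this sketch."""
    motor_type: str
    power_capacity_w: float
    rated_speed_rpm: float
    size_mm: float


@dataclass(frozen=True)
class AllowableRange:
    """Maximum values that must not be exceeded during operation."""
    max_takt_time_s: float
    max_current_a: float
    max_vibration: float


@dataclass(frozen=True)
class StoredParameters:
    """Motor information plus operating conditions (range and priority)."""
    motor: MotorInfo
    allowable: AllowableRange
    priority: PriorityItem


# Example corresponding to the first embodiment: vibration is the priority item.
params = StoredParameters(
    motor=MotorInfo("stepping", 100.0, 3000.0, 42.0),
    allowable=AllowableRange(max_takt_time_s=2.0, max_current_a=1.5, max_vibration=0.3),
    priority=PriorityItem.VIBRATION,
)
```

Frozen dataclasses are used here so that the stored conditions cannot be modified accidentally while a parameter search is running; this is a design choice of the sketch only.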
The sensor value acquisition unit 12 holds the vibration data obtained from the vibration sensor 61 and outputs the vibration data to the machine learning unit 13.
The machine learning unit 13 learns, from the values in the parameter storage unit 11 and the vibration data obtained from the sensor value acquisition unit 12, the motor speed control parameters that give the optimum vibration of the motor 50. The machine learning unit 13 corresponds to the machine learning device. It is sufficient that the vibration and the takt time of the motor 50 are within the allowable ranges; when a plurality of motor speed control parameters satisfy this, one motor speed control parameter is selected according to a predetermined criterion. Alternatively, as described in the second embodiment, the optimum vibration is a vibration for which the current consumption of the motor 50 is also within the allowable range. At this time, the machine learning unit 13 gives priority to the item designated as the priority item and determines the motor speed control parameter so that the allowable range is not exceeded. However, while the motor speed control parameter is set so that the item designated as the priority item does not exceed its allowable range, items other than the priority item may either exceed their allowable ranges or be kept within them. The motor speed control parameters include position, number of start-up pulses, operating pulse speed, number of operating pulses, and acceleration/deceleration rate.
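The selection among several admissible motor speed control parameters can be sketched as follows. This is a hedged illustration only: the disclosure does not state the predetermined criterion, so the sketch assumes "smallest predicted vibration, ties broken by shorter takt time". The names `Candidate` and `select_parameters` and the predicted values are hypothetical and stand in for whatever the trained model would provide.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """Hypothetical candidate parameter with its predicted outcomes."""
    accel_rate: float
    predicted_vibration: float
    predicted_takt_time: float


def select_parameters(
    candidates: Sequence[Candidate],
    max_vibration: float,
    max_takt_time: Optional[float] = None,
) -> Optional[Candidate]:
    """Pick one candidate whose priority item (vibration) is within range.

    If max_takt_time is None, the non-priority item is allowed to exceed
    its allowable range, as the text permits.
    """
    feasible = [
        c for c in candidates
        if c.predicted_vibration <= max_vibration
        and (max_takt_time is None or c.predicted_takt_time <= max_takt_time)
    ]
    if not feasible:
        return None
    # Assumed criterion: lowest vibration first, then shortest takt time.
    return min(feasible, key=lambda c: (c.predicted_vibration, c.predicted_takt_time))
```

Passing `max_takt_time=None` models the case in which only the priority item is constrained, while passing a value models the case in which the non-priority item is also kept within its allowable range.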
The motor speed control parameter output unit 14 outputs the motor speed control parameters obtained from the machine learning unit 13 to the pulse output unit 15.
The pulse output unit 15 outputs pulses, which are command signals for controlling the amplifier 30, to the amplifier 30 based on the motor speed control parameters obtained from the motor speed control parameter output unit 14. The motor speed control parameter output unit 14 and the pulse output unit 15 correspond to the output unit.
The amplifier 30 is a device that is electrically connected to the positioning control device 10 and the motor 50 and controls the motor 50 according to the pulses output from the positioning control device 10.
The motor 50 is an electric power device that can be controlled by the amplifier 30. Examples of the motor 50 are a servomotor having an encoder for position detection and a stepping motor having no encoder.
The vibration sensor 61 measures the vibration at the location where the motor 50 is installed and outputs the measured vibration data to the positioning control device 10.
The machine learning unit 13 provided in the positioning control device 10 will now be described in detail. FIG. 2 is a block diagram schematically showing an example of the functional configuration of the machine learning unit included in the positioning control device according to Embodiment 1. The machine learning unit 13 includes a data acquisition unit 131, a model generation unit 132, a trained model storage unit 133, and an inference unit 134.
The data acquisition unit 131 acquires the motor speed control parameters, the information on the motor 50, the allowable ranges, the priority item, and the vibration data as learning data. The motor speed control parameters are the values set in the motor speed control parameter output unit 14. The information on the motor 50 and the allowable ranges are values stored in the parameter storage unit 11. The allowable ranges cover the takt time, the current consumption of the motor 50, and the vibration of the motor 50, but Embodiment 1 uses only the allowable ranges for the vibration of the motor 50 and the takt time. The vibration data is the data detected by the vibration sensor 61 and acquired by the sensor value acquisition unit 12 while the motor 50 is driven with the set motor speed control parameters.
The model generation unit 132 learns the motor speed control parameters that give the optimum vibration according to learning data created from a combination of the motor speed control parameters, the information on the motor 50, the allowable ranges, and the vibration data output from the data acquisition unit 131. That is, it generates a trained model that infers, from the information on the motor 50 and the allowable ranges for the vibration of the motor 50 and the takt time, the motor speed control parameters that give the optimum vibration, namely parameters for which the vibration stays within its allowable range while the takt time stays within its allowable range. The trained model is a model that has learned, from the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the vibration data, the correlation between the vibration at the installation location of the motor 50 and the motor speed control parameters. Here, the learning data is data in which the motor speed control parameters, the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the vibration data are associated with one another.
The model generation unit 132 can use a known learning algorithm such as supervised learning, unsupervised learning, or reinforcement learning. As an example, the case where reinforcement learning is applied will be described. In reinforcement learning, an agent, which is the acting entity in an environment, observes the parameters of the environment that represent the current state and decides which action to take. The environment changes dynamically as a result of the agent's actions, and the agent is given a reward according to the change. The agent repeats this process and learns the action policy that yields the largest reward over a series of actions. Q-learning and TD-learning are known as representative reinforcement learning methods. For example, in Q-learning, a general update formula for the action value function Q(s, a) is given by the following formula (1).
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{1}$$
In formula (1), s_t represents the state of the environment at time t, and a_t represents the action at time t. The action a_t changes the state to s_{t+1}. r_{t+1} represents the reward obtained as a result of that change of state, γ represents the discount rate, and α represents the learning coefficient. γ is in the range 0 < γ ≤ 1, and α is in the range 0 < α ≤ 1. The motor speed control parameters serve as the action a_t, the information on the motor 50 and the allowable range of the vibration of the motor 50 serve as the state s_t, and the best action a_t in the state s_t at time t is learned. In one example, the state of the motor is input as the state, the motor speed control parameters are input as the action, and the vibration of the motor 50 is input as the result. The allowable range is used as the criterion for the reward.
The update formula (1) increases the action value Q of the action a executed at time t if the action value Q of the action with the highest Q value at time t+1 is larger than it, and decreases it in the opposite case. In other words, the action value function Q(s, a) is updated so that the action value Q of the action a at time t approaches the best action value at time t+1. As a result, the best action value in a given environment propagates successively to the action values in the environments that precede it.
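The update described by formula (1) can be sketched in tabular form as follows. This is a minimal illustration, not the implementation of the disclosure: the function name `q_update`, the dictionary-based Q table, and the string labels used for states and actions are all assumptions made for the example.

```python
from collections import defaultdict


def q_update(q, s, a, r_next, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update to q in place and return the new Q(s, a).

    q       : mapping from (state, action) to action value (hypothetical layout)
    r_next  : reward r_{t+1} received after taking action a in state s
    actions : all actions available in s_next
    """
    # Best action value reachable from the next state: max_a Q(s_{t+1}, a)
    best_next = max(q[(s_next, a2)] for a2 in actions)
    # Move Q(s_t, a_t) toward r_{t+1} + gamma * best_next by a fraction alpha
    q[(s, a)] += alpha * (r_next + gamma * best_next - q[(s, a)])
    return q[(s, a)]


# Hypothetical example: two candidate parameter sets labelled "slow" and "fast".
q = defaultdict(float)
actions = ["slow", "fast"]
q[("s1", "slow")] = 1.0
new_value = q_update(q, "s0", "fast", r_next=1.0, s_next="s1",
                     actions=actions, alpha=0.5, gamma=0.9)
```

With these illustrative numbers the target is 1.0 + 0.9 × 1.0 = 1.9, so an initial value of 0 moves halfway to 0.95.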
When a trained model is generated by reinforcement learning as described above, the model generation unit 132 includes a reward calculation unit 141 and a function update unit 142.
The reward calculation unit 141 calculates a reward based on the motor speed control parameters, the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the vibration data. Specifically, the reward calculation unit 141 calculates a reward r based on the magnitude of the vibration obtained from the vibration data and the allowable range of the vibration of the motor 50. The threshold defined by the allowable range of the vibration magnitude is referred to as the first threshold. For example, when the vibration magnitude is less than the first threshold, the reward r is increased (for example, a reward of "1" is given), whereas when the vibration magnitude is greater than the first threshold, the reward r is decreased (for example, a reward of "-1" is given).
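The threshold rule above can be written as a small function. The sketch below assumes the example rewards of 1 and -1 from the text; the function name and the `reward_on_tie` argument are hypothetical, and the handling of a vibration exactly equal to the first threshold is left configurable because the description allows either outcome.

```python
def vibration_reward(vibration, first_threshold, reward_on_tie=1.0):
    """Return the reward r for one measured vibration magnitude.

    vibration       : magnitude derived from the vibration data
    first_threshold : threshold defined by the allowable vibration range
    reward_on_tie   : reward used when vibration == first_threshold (assumption)
    """
    if vibration < first_threshold:
        return 1.0   # vibration within the allowable range: increase the reward
    if vibration > first_threshold:
        return -1.0  # vibration outside the allowable range: decrease the reward
    return reward_on_tie


reward_ok = vibration_reward(0.2, 0.5)
reward_bad = vibration_reward(0.8, 0.5)
```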
The function update unit 142 updates, according to the reward calculated by the reward calculation unit 141, the function for determining the motor speed control parameters that give the optimum vibration, and outputs the updated function to the trained model storage unit 133. In Q-learning, for example, the action value function Q(s_t, a_t) expressed by formula (1) is used as the function for calculating the motor speed control parameters that give the optimum vibration. This learning is executed repeatedly.
The trained model storage unit 133 stores the action value function Q(s_t, a_t) updated by the function update unit 142, that is, the trained model.
The inference unit 134 infers the motor speed control parameters using the trained model stored in the trained model storage unit 133. That is, by inputting into the trained model the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the priority item acquired by the data acquisition unit 131, the inference unit 134 can output the motor speed control parameters inferred from those inputs. In addition, by analyzing with the trained model the value of the vibration data detected by the vibration sensor 61 together with the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the priority item, the acquired vibration data value can be reflected in the motor speed control parameters as a feedback value.
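One way to picture the inference step is a greedy lookup in a learned Q table. The sketch below is only an assumption about how such a lookup could look: the class `MotorSpeedParams`, the function `infer_params`, the state label, and all numeric values are hypothetical, and only the five field meanings are taken from the parameter list given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so they can be dict keys
class MotorSpeedParams:
    position: int             # target position
    startup_pulse_count: int  # start-up pulse count
    run_pulse_speed: int      # operating pulse speed
    run_pulse_count: int      # operating pulse count
    accel_decel_rate: int     # acceleration/deceleration rate


def infer_params(q_table, state, candidates):
    """Return the candidate with the highest learned action value for this state.

    Candidates missing from q_table are treated as having value 0.0 (assumption).
    """
    return max(candidates, key=lambda p: q_table.get((state, p), 0.0))


# Hypothetical candidates and learned values for a state labelled "motor-A".
fast = MotorSpeedParams(1000, 10, 2000, 500, 50)
slow = MotorSpeedParams(1000, 10, 800, 500, 20)
q_table = {("motor-A", fast): -0.4, ("motor-A", slow): 0.7}
chosen = infer_params(q_table, "motor-A", [fast, slow])
```

Here `chosen` is the slower parameter set because its stored action value is higher.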
In the above description, the inference unit 134 outputs the motor speed control parameters using the trained model learned by the model generation unit 132 of the positioning control device 10 connected to the motor 50. However, the inference unit 134 may instead acquire a trained model from outside, for example from another positioning control device 10 connected to another motor 50, and output the motor speed control parameters based on the acquired trained model.
Next, the learning process of the machine learning unit 13 will be described. FIG. 3 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to Embodiment 1. First, the data acquisition unit 131 acquires the motor speed control parameters, the information on the motor 50, the allowable ranges, and the vibration data as learning data (step S11). Here, allowable ranges are set for the vibration of the motor 50 and the takt time. The data acquisition unit 131 also acquires the priority item from the parameter storage unit 11 (step S12). Here, it is assumed that vibration is set as the priority item.
Next, the model generation unit 132 calculates a reward based on the motor speed control parameters, the information on the motor 50, and the allowable ranges. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameters, the information on the motor 50, and the allowable ranges, and decides whether to increase or decrease the reward based on a predetermined relationship between the vibration magnitude and the first threshold. In one example, the first threshold is the vibration value specified in the allowable range. Here, the reward calculation unit 141 determines whether the value of the vibration data obtained when the motor 50 is operated with the motor speed control parameters is less than the first threshold (step S13). The first threshold is the vibration value permitted for the equipment connected to the motor 50 when the motor 50 is operated.
When the value of the vibration data is less than the first threshold (Yes in step S13), the reward calculation unit 141 increases the reward (step S14). When the value of the vibration data is greater than the first threshold (No in step S13), the reward calculation unit 141 decreases the reward (step S15). When the value of the vibration data is equal to the first threshold, the reward may be either increased or decreased.
After step S14 or step S15, the function update unit 142 of the model generation unit 132 updates the action value function Q(s_t, a_t) based on the reward calculated by the reward calculation unit 141 (step S16). The action value function Q(s_t, a_t) is the function expressed by formula (1) and stored in the trained model storage unit 133.
The process then returns to step S11. That is, the machine learning unit 13 repeatedly executes steps S11 to S16 and stores the generated action value function Q(s_t, a_t) as the trained model. When a feedback value other than vibration, namely one relating to current consumption or takt time, is acquired, the learning data at the time the data was acquired is accumulated.
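The loop of steps S11 to S16 can be sketched as a small training routine. This is a toy illustration under several assumptions that are not in the description: a single fixed state, a discrete list of candidate actions, epsilon-greedy selection of the parameters to try, and a caller-supplied `measure_vibration` function that stands in for driving the motor 50 and reading the vibration sensor 61. All names are hypothetical.

```python
import random
from collections import defaultdict


def train(measure_vibration, state, actions, first_threshold,
          episodes=200, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Repeat the S11-S16 cycle and return the learned Q table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        # S11/S12: choose the parameters to try (epsilon-greedy is an assumption)
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        vibration = measure_vibration(action)
        # S13-S15: raise the reward below the first threshold, lower it otherwise
        # (a tie is treated as a decrease here; the text permits either choice)
        reward = 1.0 if vibration < first_threshold else -1.0
        # S16: update the action value function; the state does not change here
        best_next = max(q[(state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q


# Toy plant: vibration grows linearly with a hypothetical acceleration rate.
rates = [10, 50, 100]
q_table = train(lambda rate: 0.01 * rate, "motor", rates, first_threshold=0.6)
best_rate = max(rates, key=lambda a: q_table[("motor", a)])
```

In this toy setting the rate of 100 always yields a vibration above the threshold, so its action value stays below that of the best in-range rate.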
In the machine learning unit 13 according to Embodiment 1, the trained model is stored in the trained model storage unit 133 provided inside the machine learning unit 13, but the trained model storage unit 133 may be provided outside the machine learning unit 13.
Besides reinforcement learning, the motor speed control parameters that give the optimum vibration may be machine-learned from the motor speed control parameters, the information on the motor 50, the allowable ranges for the vibration of the motor 50 and the takt time, and the vibration data using supervised learning, unsupervised learning, semi-supervised learning, or another known learning algorithm. Machine learning using these learning algorithms can also reduce the vibration of the entire system.
Next, a method of inferring the motor speed control parameters using the trained model stored in the machine learning unit 13 will be described. FIG. 4 is a flowchart showing an example of the processing procedure of the motor control method in the positioning control device according to Embodiment 1.
First, the user determines the initial configuration of the control system 1. The parameter storage unit 11 then stores values including the information on the motor 50, the allowable ranges, and the priority item (step S31). These values are set by the user via an input unit (not shown). Here, allowable ranges are set for the vibration of the motor 50 and the takt time. The sensor value acquisition unit 12 acquires vibration data from the vibration sensor 61 (step S32) and holds it. The inference unit 134 of the machine learning unit 13 then analyzes the values in the parameter storage unit 11 and the acquired vibration data value using the trained model stored in the trained model storage unit 133, and sets the motor speed control parameters that give the optimum vibration in the motor speed control parameter output unit 14 (step S33).
The motor speed control parameter output unit 14 outputs the set motor speed control parameters to the pulse output unit 15 (step S34). The pulse output unit 15 outputs pulses to the amplifier 30 based on the motor speed control parameters from the motor speed control parameter output unit 14 (step S35).
The motor 50 is thereby driven. When the motor 50 is driven, the vibration sensor 61 provided on the motor 50 detects the vibration of the motor 50. Then, as described for step S32, the sensor value acquisition unit 12 acquires the vibration data representing the vibration of the motor 50. Steps S32 to S35 are then executed repeatedly as described above. In one example, when the value of the vibration data acquired by the sensor value acquisition unit 12 changes so as to exceed the allowable range, the inference unit 134 analyzes, in step S33, the information on the motor 50, the allowable ranges, the priority item, and the vibration data value using the trained model, and sets the motor speed control parameters that give the optimum vibration in the motor speed control parameter output unit 14.
The above description covers the case where the vibration sensor 61 is provided on the motor 50, but the embodiment is not limited to this. FIG. 5 is a block diagram showing another example of the configuration of the control system including the positioning control device according to Embodiment 1. In the following, components identical to those in FIG. 1 are given the same reference numerals and their description is omitted; only the differences from FIG. 1 are described. In FIG. 5, the vibration sensor 61 is provided not on the motor 50 but on the product 52, which is a device including the motor 50 and the drive unit 51 driven by the motor 50. This allows the machine learning unit 13 to optimize the vibration value not only of the motor 50 but of the product 52 as a whole, including the drive unit 51 and other parts.
FIG. 6 is a block diagram showing still another example of the configuration of the control system including the positioning control device according to Embodiment 1. In the following, components identical to those in FIGS. 1 and 5 are given the same reference numerals and their description is omitted; only the differences from FIGS. 1 and 5 are described. In FIG. 6, the vibration sensor 61 is provided on a system 53 including a plurality of products 52. An amplifier 30 is electrically connected to the motor 50 of each product 52 in the system 53. In one example, such a system 53 is a multi-axis control system including a plurality of motors 50 and products 52. This allows the machine learning unit 13 to optimize the vibration value of the system 53 including the plurality of products 52. The motor 50 in FIG. 1, the product 52 in FIG. 5, and the system 53 in FIG. 6, which are the places where the vibration sensor 61 is installed, each constitute the installation location of the motor 50.
In Embodiment 1, the machine learning unit 13 of the positioning control device 10 learns, according to the learning data, the motor speed control parameters for which the vibration data falls within the allowable range, and generates a trained model. The learning data is created from a combination of the vibration data from the vibration sensor 61, the information on the motor 50, the motor speed control parameters, and the allowable ranges for the vibration of the motor 50 and the takt time. The vibration sensor 61 is provided on the motor 50, on the product 52 including the drive unit 51 connected to the motor 50, or on the system 53 including a plurality of products 52. The machine learning unit 13 then sets motor speed control parameters for which the vibration falls within the allowable range by analyzing the information on the motor 50, the allowable ranges, and the vibration data value using the trained model. Driving the motor 50 with the set motor speed control parameters makes it possible to shorten the takt time while operating the product 52 or the system 53 without subjecting the motor 50 to excessive vibration, which reduces the load on the motor 50 and extends its life.
In addition, when a host device is provided above the positioning control device 10, the machine learning functions reside in the positioning control device 10 itself, so the load on the host device can be reduced compared with the conventional technique in which machine learning is performed by a host device of a plurality of control devices.
Furthermore, unlike the conventional technique, the servo gain, which is a control value, is not determined by feedback control from the motor 50; instead, the motor speed control parameters are determined by the output control of the positioning control device 10. The technique can therefore be applied not only to expensive equipment capable of feedback control, such as a servomotor with an encoder and its amplifier, but also to equipment without a feedback mechanism, such as a stepping motor and its amplifier.
Embodiment 2.
FIG. 7 is a block diagram showing an example of the configuration of the control system including the positioning control device according to Embodiment 2. In the following, components identical to those in Embodiment 1 are given the same reference numerals and their description is omitted; only the differences from Embodiment 1 are described. The control system 1 of Embodiment 2 further includes a current consumption measuring device 62 on the motor 50. The current consumption measuring device 62 acquires the current consumption value, which is the value of the current consumed at its installation location, and outputs the measured current consumption value to the positioning control device 10.
The positioning control device 10 further includes a current consumption acquisition unit 16. The current consumption acquisition unit 16 acquires and holds the current consumption value output from the current consumption measuring device 62.
The machine learning unit 13 learns the motor speed control parameters for which the designated priority item falls within its allowable range, according to learning data created from a combination of the motor speed control parameters, the information on the motor 50, the allowable ranges, and the vibration data and current consumption value of the motor 50. Here, it is assumed that an allowable range for the takt time is set in advance in addition to the allowable range of the designated priority item. In other words, the machine learning unit 13 generates a trained model that infers, from the information on the motor 50, the allowable ranges, the priority item, and the vibration data and current consumption value of the motor 50, the motor speed control parameters for which the takt time is within its allowable range and the designated priority item falls within its allowable range. The designated priority item is vibration or current consumption. That is, the trained model is a model that has learned, from the information on the motor 50, the allowable ranges, the vibration data, and the current consumption value, the correlation among the vibration at the installation location of the motor 50, the current consumption value, and the motor speed control parameters. The machine learning unit 13 may learn motor speed control parameters for which only the designated priority item falls within its allowable range, or parameters for which items other than the priority item also fall within their allowable ranges. The machine learning unit 13 also analyzes the information on the motor 50, the allowable ranges, and the vibration data value and current consumption value of the motor 50 using the trained model, and outputs motor speed control parameters for which the priority item falls within its allowable range for the motor 50 described by that information.
FIG. 8 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to Embodiment 2. First, the data acquisition unit 131 acquires the motor speed control parameters, the information on the motor 50, the allowable ranges, and the vibration data and current consumption value of the motor 50 as learning data (step S51). The data acquisition unit 131 also acquires the priority item from the parameter storage unit 11 (step S52). The model generation unit 132 determines whether the priority item is vibration or current consumption (step S53).
When the priority item is vibration ("vibration" in step S53), the model generation unit 132 calculates a reward based on the motor speed control parameters, the information on the motor 50, and the allowable ranges. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameters, the information on the motor 50, the allowable ranges, and the vibration data and current consumption value of the motor 50, and decides whether to increase or decrease the reward based on predetermined relationships between the vibration magnitude and the first threshold and between the current consumption value and the second threshold. In one example, the second threshold is the current consumption value specified in the allowable range. Here, the reward calculation unit 141 determines whether the value of the vibration data obtained when the motor 50 is operated with the motor speed control parameters is less than the first threshold (step S54).
When the value of the vibration data is less than the first threshold (Yes in step S54), the reward calculation unit 141 increases the reward (step S55). If items other than the priority item are not taken into account, the reward is increased whenever the vibration data, which is the priority item, is less than the first threshold. If, however, items other than the priority item are also to be kept within their allowable ranges, a reward according to the difference between the first threshold and the vibration data value and a reward according to the difference between the second threshold and the current consumption value can additionally be defined. In one example, the reward calculation unit 141 increases the reward more when the difference between the second threshold and the current consumption value is positive than when it is negative. The reward may be determined by whether this difference is positive or negative, or according to its magnitude. That is, even within the allowable range, the reward may be increased as the current consumption value becomes smaller and decreased as it becomes larger. The reward may also be decreased when the current consumption value exceeds the second threshold and is thus outside the allowable range. In these cases, however, vibration is the priority item, so it is desirable that the magnitude of the difference between the first threshold and the vibration data value contribute more to the increase or decrease of the reward than the magnitude of the difference between the second threshold and the current consumption value. As a result, when both the vibration and the current consumption are within their allowable ranges, the reward is higher than when the vibration is within its allowable range but the current consumption is not, and the action value can be raised.
When the value of the vibration data is larger than the first threshold value (No in step S54), the reward calculation unit 141 decreases the reward (step S56). If items other than the priority item are not considered, the reward is decreased whenever the vibration data, which is the priority item, is larger than the first threshold value. If, however, items other than the priority item are also to be kept within the allowable range, a reward that depends on the difference between the first threshold value and the value of the vibration data and a reward that depends on the difference between the second threshold value and the current consumption value can additionally be defined. In one example, the reward calculation unit 141 makes the absolute value of the reward reduction smaller when the difference between the second threshold value and the current consumption value is positive than when that difference is negative. The reward may be determined by whether this difference is positive or negative, or by its magnitude. That is, even within the allowable range, the absolute value of the reward reduction may be made smaller as the current consumption value becomes smaller and larger as the current consumption value becomes larger. The absolute value of the reward reduction may also be made larger when the current consumption value exceeds the second threshold value and is therefore outside the allowable range. In these cases as well, because vibration is the priority item, it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the change in the reward than the difference between the second threshold value and the current consumption value.
When the priority item is current consumption (current consumption in step S53), the model generation unit 132 calculates a reward based on the motor speed control parameter, the information on the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data and current consumption value of the motor 50. It then decides whether to increase or decrease the reward based on a predetermined relationship between the magnitude of the vibration and the first threshold value and a predetermined relationship between the current consumption value and the second threshold value. Here, the reward calculation unit 141 determines whether the current consumption value obtained when the motor 50 is operated with the motor speed control parameter is less than the second threshold value (step S57).
When the current consumption value is less than the second threshold value (Yes in step S57), the reward calculation unit 141 increases the reward (step S58). If items other than the priority item are not considered, the reward is increased whenever the current consumption value, which is the priority item, is less than the second threshold value. If, however, items other than the priority item are also to be kept within the allowable range, a reward that depends on the difference between the first threshold value and the value of the vibration data and a reward that depends on the difference between the second threshold value and the current consumption value can additionally be defined. In one example, the reward calculation unit 141 increases the reward more when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative. The reward may be determined by whether this difference is positive or negative, or by its magnitude. That is, even within the allowable range, the reward may be increased as the value of the vibration data becomes smaller and decreased as it becomes larger. The reward may also be decreased when the value of the vibration data exceeds the first threshold value and is therefore outside the allowable range. In these cases, however, because current consumption is the priority item, it is desirable that the difference between the second threshold value and the current consumption value contribute more to the change in the reward than the difference between the first threshold value and the value of the vibration data. As a result, when both the vibration and the current consumption are within the allowable range, the reward is higher than when the current consumption is within the allowable range but the vibration is not, and the action value can be raised.
When the current consumption value is larger than the second threshold value (No in step S57), the reward calculation unit 141 decreases the reward (step S59). If items other than the priority item are not considered, the reward is decreased whenever the current consumption value, which is the priority item, is larger than the second threshold value. If, however, items other than the priority item are also to be kept within the allowable range, a reward that depends on the difference between the first threshold value and the value of the vibration data and a reward that depends on the difference between the second threshold value and the current consumption value can additionally be defined. In one example, the reward calculation unit 141 makes the absolute value of the reward reduction smaller when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative. The reward may be determined by whether this difference is positive or negative, or by its magnitude. That is, even within the allowable range, the absolute value of the reward reduction may be made smaller as the value of the vibration data becomes smaller and larger as the value of the vibration data becomes larger. The absolute value of the reward reduction may also be made larger when the value of the vibration data exceeds the first threshold value and is therefore outside the allowable range. In these cases as well, because current consumption is the priority item, it is desirable that the difference between the second threshold value and the current consumption value contribute more to the change in the reward than the difference between the first threshold value and the value of the vibration data.
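The reward rules of steps S53 to S59 can be illustrated with a minimal sketch. The description does not give a concrete formula, so the function name `compute_reward`, the normalized margins, the base value of plus or minus 1, and the weights `w_priority` and `w_other` are all assumptions chosen only to satisfy the stated properties: the sign of the reward follows the priority item, and the priority item contributes more than the other item. Both thresholds are assumed to be positive.

```python
def compute_reward(priority, vibration, current,
                   vib_threshold, cur_threshold,
                   w_priority=1.0, w_other=0.25):
    """Return a reward for one trial run (illustrative, not the patent's formula).

    priority      -- "vibration" or "current" (the priority item, step S53)
    vib_threshold -- first threshold value (assumed > 0)
    cur_threshold -- second threshold value (assumed > 0)
    w_priority, w_other -- hypothetical weights, with w_priority > w_other
    """
    if priority not in ("vibration", "current"):
        raise ValueError("priority must be 'vibration' or 'current'")

    # Normalized margin: positive when the measured value is below its threshold.
    vib_margin = (vib_threshold - vibration) / vib_threshold
    cur_margin = (cur_threshold - current) / cur_threshold

    if priority == "vibration":
        main, other = vib_margin, cur_margin
    else:
        main, other = cur_margin, vib_margin

    # Steps S54/S57: the priority item alone decides increase or decrease.
    base = 1.0 if main > 0 else -1.0

    # The non-priority item only adjusts the size of the increase or decrease.
    return base + w_priority * main + w_other * other
```

With these weights, a run in which both items are within range scores higher than a run in which only the priority item is within range, and a failing run is penalized less when the non-priority item is within range.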
After step S55, S56, S58, or S59, the function update unit 142 of the model generation unit 132 updates the action value function Q(s_t, a_t) based on the reward calculated by the reward calculation unit 141 (step S60). The action value function Q(s_t, a_t) is the function expressed by equation (1) and is stored in the trained model storage unit 133.
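As a sketch of step S60, the following assumes that equation (1) has the form of the standard tabular Q-learning update, Q(s_t, a_t) ← Q(s_t, a_t) + α(r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)). The table layout, the function name `update_q`, and the default learning rate `alpha` and discount factor `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def update_q(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q       -- mapping from (state, action) to value; unseen pairs default to 0.0
    actions -- candidate actions available in next_state
    """
    # Best value reachable from the next state under the current estimate.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference error between the target and the current estimate.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Example: an empty table, so the first update moves Q toward the reward.
q_table = defaultdict(float)
first = update_q(q_table, "s0", "a0", 1.0, "s1", ["a0", "a1"])
```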
The process then returns to step S51. That is, the machine learning unit 13 repeatedly executes steps S51 to S60 and stores the generated action value function Q(s_t, a_t) as a trained model. In addition, when a feedback value relating to takt time, which is neither vibration nor current consumption, is acquired, the learning data from the time that value was acquired is accumulated.
Next, a method of inferring the motor speed control parameter using the trained model stored in the machine learning unit 13 will be described. FIG. 9 is a flowchart showing an example of the processing procedure of a motor control method in the positioning control device according to the second embodiment.
First, the user determines the initial configuration of the control system 1. The parameter storage unit 11 then stores values including the information on the motor 50, the allowable range, and the priority item (step S71). The information on the motor 50, the allowable range, and the priority item are set by the user via an input unit (not shown). The sensor value acquisition unit 12 acquires vibration data from the vibration sensor 61 (step S72) and holds it. Further, the current consumption acquisition unit 16 acquires the current consumption value from the current consumption measuring device 62 (step S73) and holds it.
After that, the inference unit 134 of the machine learning unit 13 analyzes the values in the parameter storage unit 11, together with the acquired vibration data value and current consumption value, using the trained model. It then sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the set priority item falls within the allowable range (step S74). When the priority item is vibration, a motor speed control parameter that gives the optimum vibration within the allowable range is determined. In this case, the motor speed control parameter may be determined so that the current consumption, which is not the priority item, also falls within the allowable range. When the priority item is current consumption, a motor speed control parameter that gives the optimum current consumption within the allowable range is determined. In this case, the motor speed control parameter may be determined so that the vibration, which is not the priority item, also falls within the allowable range.
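The description does not state how the inference unit 134 derives a parameter from the trained model. One plausible reading, sketched below under that assumption, is a greedy choice: among a finite set of candidate motor speed control parameters, pick the one with the highest learned action value for the current state. The function name `infer_parameters`, the state encoding, and the candidate tuples of (operating pulse speed, acceleration rate) are hypothetical.

```python
def infer_parameters(q, state, candidates):
    """Return the candidate parameter set with the highest learned action value.

    q          -- mapping from (state, action) to value (the trained model)
    state      -- hashable description of motor info, limits, and sensor readings
    candidates -- iterable of hashable parameter sets to choose from
    """
    # Unseen (state, action) pairs are treated as having value 0.0.
    return max(candidates, key=lambda action: q.get((state, action), 0.0))


# Example with a hypothetical state and two candidate parameter sets.
state = ("stepper", "priority=vibration", "vibration=high")
trained_q = {
    (state, (1000, 50)): 0.2,
    (state, (2000, 80)): 0.7,
}
chosen = infer_parameters(trained_q, state, [(1000, 50), (2000, 80)])
```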
The motor speed control parameter output unit 14 outputs the set motor speed control parameter to the pulse output unit 15 (step S75). The pulse output unit 15 outputs pulses to the amplifier 30 based on the motor speed control parameter received from the motor speed control parameter output unit 14 (step S76).
The motor 50 is thereby driven. When the motor 50 is driven, its vibration is detected by the vibration sensor 61 provided on the motor 50. Then, as described for step S72, the sensor value acquisition unit 12 acquires the vibration data representing the vibration of the motor 50. Likewise, when the motor 50 is driven, the current consumption measuring device 62 detects the current consumption value of the motor 50, and, as described for step S73, the current consumption acquisition unit 16 acquires that value. The processing of steps S72 to S76 is then executed repeatedly as described above. In one example, when the value of the vibration data acquired by the sensor value acquisition unit 12 or the current consumption value acquired by the current consumption acquisition unit 16 changes so as to exceed the allowable range, the inference unit 134, in step S74, analyzes the information on the motor 50, the allowable range and priority item, the value of the vibration data, and the current consumption value using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the priority item falls within the allowable range.
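The repeated cycle of steps S72 to S76 can be sketched as a simple loop. This follows the "in one example" reading above, in which new parameters are inferred only when a reading leaves the allowable range. All callables (`read_vibration`, `read_current`, `infer`, `drive`) and the `limits` dictionary are hypothetical stand-ins for the sensor value acquisition unit 12, the current consumption acquisition unit 16, the inference unit 134, and the output path through the pulse output unit 15.

```python
def control_loop(read_vibration, read_current, infer, drive,
                 limits, params, cycles):
    """Run steps S72 to S76 `cycles` times and return the parameters used."""
    history = []
    for _ in range(cycles):
        vibration = read_vibration()   # step S72: acquire vibration data
        current = read_current()       # step S73: acquire current consumption

        # Re-infer only when a reading is at or beyond its allowable limit.
        out_of_range = (vibration >= limits["vibration"]
                        or current >= limits["current"])
        if out_of_range:
            params = infer(vibration, current)   # step S74

        drive(params)                  # steps S75 and S76: output to the amplifier
        history.append(params)
    return history
```

Passing the sensors and the drive path in as callables keeps the sketch testable without hardware; a real device would replace them with its own I/O.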
In the above description, the vibration sensor 61 and the current consumption measuring device 62 are provided on the motor 50, but the embodiment is not limited to this. FIG. 10 is a block diagram showing another example of the configuration of a control system including the positioning control device according to the second embodiment. In the following, components identical to those in FIGS. 1, 5, and 7 are given the same reference numerals and their description is omitted; only the parts that differ from FIGS. 1, 5, and 7 are described. In FIG. 10, the vibration sensor 61 and the current consumption measuring device 62 are provided not on the motor 50 but on a product 52, which is an apparatus including the motor 50 and a drive unit 51 driven by the motor 50. This allows the machine learning unit 13 to optimize the vibration value not only of the motor 50 but of the product 52 as a whole, including the drive unit 51.
FIG. 11 is a block diagram showing another example of the configuration of a control system including the positioning control device according to the second embodiment. In the following, components identical to those in FIGS. 1, 6, and 7 are given the same reference numerals and their description is omitted; only the parts that differ from FIGS. 1, 6, and 7 are described. In FIG. 11, the vibration sensor 61 and the current consumption measuring device 62 are provided on a system 53 that includes a plurality of products 52. In one example, such a system 53 is a multi-axis control system including a plurality of motors 50 and products 52. This allows the machine learning unit 13 to optimize the vibration value of the system 53 including the plurality of products 52.
In the second embodiment, the machine learning unit 13 of the positioning control device 10 learns, in accordance with learning data, a motor speed control parameter with which the set priority item falls within the allowable range, and generates a trained model. The learning data is created based on a combination of the vibration data from the vibration sensor 61, the current consumption value from the current consumption measuring device 62, the information on the motor 50, the motor speed control parameter, and the allowable range. The vibration sensor 61 and the current consumption measuring device 62 are provided on the motor 50, on the product 52 including the drive unit 51 connected to the motor 50, or on the system 53 including a plurality of products 52. The machine learning unit 13 then analyzes the information on the motor 50, the allowable range, the value of the vibration data, and the current consumption value using the trained model, and sets a motor speed control parameter with which the takt time is within the allowable range and the set priority item falls within the allowable range. When vibration is set as the priority item, driving the motor 50 with the set motor speed control parameter makes it possible to operate the product 52 or the system 53 without subjecting the motor 50 to excessive vibration while still allowing the takt time to be shortened, which reduces the load on the motor 50 and extends its life. When current consumption is set as the priority item, driving the motor 50 with the set motor speed control parameter makes it possible to operate the product 52 or the system 53 without the motor 50 consuming excessive power while still allowing the takt time to be shortened, which saves power across the entire system 53.
Unlike the conventional technique, the servo gain, which is a control value, is not determined by feedback control from the motor 50; instead, the motor speed control parameter is determined by output control of the positioning control device 10. Therefore, the life or energy efficiency of the product 52 or the system 53 in which the motor 50 is provided can be improved not only with expensive equipment capable of feedback control, such as a servomotor and amplifier with an encoder, but also with equipment that has no feedback mechanism, such as a stepping motor and amplifier.
In addition, the machine learning unit 13 learns motor speed control parameters with which not only the item designated as the priority item but all items fall within the allowable range. Thus, when vibration is prioritized, a motor speed control parameter can be set that keeps the vibration within the allowable range while maintaining a minimum current consumption value. Conversely, when current consumption is prioritized, a motor speed control parameter can be set that keeps the current consumption within the allowable range while maintaining minimum vibration.
Embodiment 3.
FIG. 12 is a block diagram showing an example of the configuration of a control system including the positioning control device according to the third embodiment. In the following, configurations identical to those of the first and second embodiments are given the same reference numerals and their description is omitted; only the parts that differ from the first and second embodiments are described. The control system 1 of the third embodiment further includes a takt time measuring device 63, which is provided not on the motor 50 but on the product 52, an apparatus including the motor 50 and the drive unit 51 driven by the motor 50. That is, the takt time measuring device 63 is added to the configuration of FIG. 5. The takt time measuring device 63 is a device that measures takt time using a camera, a sensor, or the like, and outputs the measured takt time to the positioning control device 10.
The positioning control device 10 further includes a simulator unit 17 and a takt time acquisition unit 18. The simulator unit 17 is built into the positioning control device 10 and simulates the takt time from the motor speed control parameter output by the motor speed control parameter output unit 14. The simulator unit 17 reflects the takt time obtained by the simulation in its output.
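The description does not say how the simulator unit 17 computes the takt time. As one illustrative possibility, the sketch below treats the takt time as the duration of a single positioning move with a symmetric trapezoidal speed profile, using the travel distance, the operating pulse speed, and the acceleration rate. The function name `simulate_move_time`, the units, and the assumption that the motor starts and ends at rest with equal acceleration and deceleration are not taken from the source.

```python
import math


def simulate_move_time(distance, max_speed, accel):
    """Estimate the time of one move with a symmetric trapezoidal profile.

    distance  -- travel distance in pulses (> 0)
    max_speed -- operating pulse speed in pulses per second (> 0)
    accel     -- acceleration and deceleration rate in pulses per second^2 (> 0)
    """
    if distance <= 0 or max_speed <= 0 or accel <= 0:
        raise ValueError("distance, max_speed and accel must be positive")

    # Distance consumed by accelerating to max_speed and decelerating to rest.
    ramp_distance = max_speed ** 2 / accel

    if distance >= ramp_distance:
        # Trapezoid: ramp up, cruise at max_speed, ramp down.
        return distance / max_speed + max_speed / accel

    # Triangle: the move is too short to reach max_speed.
    return 2.0 * math.sqrt(distance / accel)
```

For example, a move of 10000 pulses at 1000 pulses per second with an acceleration rate of 2000 pulses per second squared takes 10.5 seconds under these assumptions.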
The takt time acquisition unit 18 holds the takt time output from the takt time measuring device 63 and the takt time simulated by the simulator unit 17. It may hold only one of these two values.
The machine learning unit 13 learns a motor speed control parameter with which the designated priority item falls within the allowable range, in accordance with learning data created based on a combination of the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data and takt time of the motor 50. In other words, it generates a trained model that infers, from the information on the motor 50 and the vibration data and takt time of the motor 50, a motor speed control parameter with which the designated priority item falls within the allowable range. The designated priority item is vibration or takt time. The trained model is a model that has learned, from the information on the motor 50, the allowable range, the vibration data, and the takt time, the correlation among the vibration at the installation location of the motor 50, the takt time, and the motor speed control parameter. The machine learning unit 13 also analyzes the information on the motor 50, the allowable range, and the vibration data value and takt time of the motor 50 using the trained model, and outputs a motor speed control parameter with which the priority item falls within the allowable range for the motor 50 described by that information.
FIG. 13 is a flowchart showing an example of the procedure of the learning process of the machine learning unit included in the positioning control device according to the third embodiment. First, the data acquisition unit 131 acquires the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data and takt time of the motor 50 as learning data (step S91). The data acquisition unit 131 also acquires the priority item from the parameter storage unit 11 (step S92). The model generation unit 132 determines whether the priority item is vibration or takt time (step S93).
When the priority item is vibration (vibration in step S93), the model generation unit 132 calculates a reward based on the motor speed control parameter, the information on the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data and takt time of the motor 50. It then decides whether to increase or decrease the reward based on a predetermined relationship between the magnitude of the vibration and the first threshold value and a predetermined relationship between the takt time and a third threshold value. In one example, the third threshold value is the takt time value specified in the allowable range. Here, the reward calculation unit 141 determines whether the value of the vibration data obtained when the motor 50 is operated with the motor speed control parameter is less than the first threshold value (step S94).
When the value of the vibration data is less than the first threshold value (Yes in step S94), the reward calculation unit 141 increases the reward (step S95). If items other than the priority item are not considered, the reward is increased whenever the vibration data, which is the priority item, is less than the first threshold value. If, however, items other than the priority item are also to be kept within the allowable range, a reward that depends on the difference between the first threshold value and the value of the vibration data and a reward that depends on the difference between the third threshold value and the takt time can additionally be defined. In one example, the reward calculation unit 141 increases the reward more when the difference between the third threshold value and the takt time is positive than when that difference is negative. The reward may be determined by whether this difference is positive or negative, or by its magnitude. That is, even within the allowable range, the reward may be increased as the takt time becomes shorter and decreased as it becomes longer. The reward may also be decreased when the takt time exceeds the third threshold value and is therefore outside the allowable range. In these cases, however, because vibration is the priority item, it is desirable that the difference between the first threshold value and the value of the vibration data contribute more to the change in the reward than the difference between the third threshold value and the takt time. As a result, when both the vibration and the takt time are within the allowable range, the reward is higher than when the vibration is within the allowable range but the takt time is not, and the action value can be raised.
When the value of the vibration data is larger than the first threshold value (No in step S94), the reward calculation unit 141 decreases the reward (step S96). If items other than the priority item are not considered, the reward is decreased whenever the vibration data, which is the priority item, is larger than the first threshold value. However, when items other than the priority item are also to be kept within the allowable range, a reward corresponding to the difference between the first threshold value and the value of the vibration data and a reward corresponding to the difference between the third threshold value and the takt time can additionally be defined. In one example, the reward calculation unit 141 makes the absolute value of the reward reduction smaller when the difference between the third threshold value and the takt time is positive than when that difference is negative. Here, the reward may be determined by whether the difference between the third threshold value and the takt time is positive or negative, or according to the magnitude of that difference. That is, even within the allowable range, the absolute value of the reward reduction may be made smaller as the takt time becomes shorter and larger as the takt time becomes longer. Further, the absolute value of the reward reduction may be made larger when the takt time is outside the allowable range, that is, larger than the third threshold value. In these cases as well, vibration is the priority item, so it is desirable that the magnitude of the difference between the first threshold value and the value of the vibration data contribute more to the increase or decrease of the reward than the magnitude of the difference between the third threshold value and the takt time.
When the priority item is the takt time (takt time in step S93), the model generation unit 132 calculates the reward based on the motor speed control parameter, the information on the motor 50, and the allowable range. Specifically, the reward calculation unit 141 of the model generation unit 132 acquires the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data and takt time of the motor 50, and determines whether to increase or decrease the reward based on the predetermined relationship between the magnitude of the vibration and the first threshold value and the predetermined relationship between the takt time and the third threshold value. Here, the reward calculation unit 141 determines whether the takt time obtained when the motor 50 is operated with the motor speed control parameter is less than the third threshold value (step S97).
When the takt time is less than the third threshold value (Yes in step S97), the reward calculation unit 141 increases the reward (step S98). If items other than the priority item are not considered, the reward is increased whenever the takt time, which is the priority item, is less than the third threshold value. However, when items other than the priority item are also to be kept within the allowable range, a reward corresponding to the difference between the first threshold value and the value of the vibration data and a reward corresponding to the difference between the third threshold value and the takt time can additionally be defined. In one example, the reward calculation unit 141 increases the reward more when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative. Here, the reward may be determined by whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the allowable range, the reward may be increased as the value of the vibration data becomes smaller and decreased as the value of the vibration data becomes larger. Further, the reward may be decreased when the value of the vibration data is outside the allowable range, that is, larger than the first threshold value. In these cases, however, the takt time is the priority item, so it is desirable that the magnitude of the difference between the third threshold value and the takt time contribute more to the increase or decrease of the reward than the magnitude of the difference between the first threshold value and the value of the vibration data. As a result, when both the vibration and the takt time are within the allowable range, the reward is higher than when the takt time is within the allowable range but the vibration is outside it, and the action value can be raised.
When the takt time is larger than the third threshold value (No in step S97), the reward calculation unit 141 decreases the reward (step S99). If items other than the priority item are not considered, the reward is decreased whenever the takt time, which is the priority item, is larger than the third threshold value. However, when items other than the priority item are also to be kept within the allowable range, a reward corresponding to the difference between the first threshold value and the value of the vibration data and a reward corresponding to the difference between the third threshold value and the takt time can additionally be defined. In one example, the reward calculation unit 141 makes the absolute value of the reward reduction smaller when the difference between the first threshold value and the value of the vibration data is positive than when that difference is negative. Here, the reward may be determined by whether the difference between the first threshold value and the value of the vibration data is positive or negative, or according to the magnitude of that difference. That is, even within the allowable range, the absolute value of the reward reduction may be made smaller as the value of the vibration data becomes smaller and larger as the value of the vibration data becomes larger. Further, the absolute value of the reward reduction may be made larger when the value of the vibration data is outside the allowable range, that is, larger than the first threshold value. In these cases as well, the takt time is the priority item, so it is desirable that the magnitude of the difference between the third threshold value and the takt time contribute more to the increase or decrease of the reward than the magnitude of the difference between the first threshold value and the value of the vibration data.
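The reward rules of steps S94 to S99 can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function name, the normalized margins, and the values of step, w_priority and w_other are all hypothetical, and the case where a value equals its threshold, which the description leaves open, is treated here as outside the allowable range. The only property taken from the description is that the priority item decides the sign of the reward and contributes more to its size than the other item.

```python
def reward(priority, vibration, takt_time, first_threshold, third_threshold,
           step=1.0, w_priority=1.0, w_other=0.25, consider_other=True):
    """Return an illustrative reward for one trial (all weights are assumptions)."""
    # Margins are positive when the item is inside its allowable range.
    margins = {
        "vibration": (first_threshold - vibration) / first_threshold,
        "takt_time": (third_threshold - takt_time) / third_threshold,
    }
    if priority not in margins:
        raise ValueError("priority must be 'vibration' or 'takt_time'")
    if w_other >= w_priority:
        raise ValueError("the priority item must contribute more than the other item")
    other = "takt_time" if priority == "vibration" else "vibration"

    # Steps S94/S97: the priority item alone decides increase or decrease.
    base = step if margins[priority] > 0 else -step
    if not consider_other:
        return base
    # Optional refinement: both margins adjust the reward, and the priority
    # item is weighted more heavily than the non-priority item.
    return base + w_priority * margins[priority] + w_other * margins[other]
```

With vibration as the priority item, first threshold 1.0 and third threshold 4.0, a trial with vibration 0.5 and takt time 2.0 scores higher than one with the same vibration and takt time 6.0, which matches the ordering described above.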
After step S95, S96, S98, or S99, the function update unit 142 of the model generation unit 132 updates the action value function Q(s_t, a_t) based on the reward calculated by the reward calculation unit 141 (step S100). The action value function Q(s_t, a_t) is the function expressed by equation (1) and is stored in the trained model storage unit 133.
The process then returns to step S91. That is, the machine learning unit 13 repeatedly executes steps S91 to S100 and stores the generated action value function Q(s_t, a_t) as the trained model. Further, when a feedback value relating to current consumption, that is, an item other than the vibration and the takt time, is acquired, the learning data obtained at that time is accumulated.
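As an illustration of step S100, the following Python sketch updates a tabular action value function. Equation (1) is not reproduced in this passage, so the sketch assumes the standard one-step Q-learning update Q(s_t, a_t) ← Q(s_t, a_t) + α(r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)); the learning rate alpha, the discount factor gamma, and the dictionary representation are assumptions made for the example.

```python
from collections import defaultdict


def update_q(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one assumed Q-learning update and return the new Q(state, action)."""
    # Best value obtainable from the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(state, action) toward the bootstrapped target.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and actions are plain labels here.
q_table = defaultdict(float)
update_q(q_table, "s0", "a0", 1.0, "s1", ["a0", "a1"])
```

In the setting described here, the action would correspond to a choice of motor speed control parameter and the state to the observed vibration data and takt time.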
Next, a method of inferring the motor speed control parameter using the trained model stored in the machine learning unit 13 will be described. FIG. 14 is a flowchart showing an example of the processing procedure of a motor control method in the positioning control device according to the third embodiment.
First, the user determines the initial configuration of the control system 1. The parameter storage unit 11 then stores values including the information on the motor 50, the allowable range, and the priority item (step S111). The information on the motor 50, the allowable range, and the priority item are set by the user via an input unit (not shown). The sensor value acquisition unit 12 acquires vibration data from the vibration sensor 61 (step S112) and holds it. Further, the takt time acquisition unit 18 acquires the takt time from the takt time measuring device 63 (step S113) and holds it.
After that, the inference unit 134 of the machine learning unit 13 analyzes the values in the parameter storage unit 11, the acquired value of the vibration data, and the acquired takt time using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the set priority item falls within the allowable range (step S114). When the priority item is vibration, a motor speed control parameter that gives the optimum vibration within the allowable range is determined. In this case, the motor speed control parameter may be determined so that the takt time, which is an item other than the priority item, also falls within the allowable range. When the priority item is the takt time, a motor speed control parameter that gives the optimum takt time within the allowable range is determined. In this case, the motor speed control parameter may be determined so that the vibration, which is an item other than the priority item, also falls within the allowable range.
The motor speed control parameter output unit 14 outputs the set motor speed control parameter to the pulse output unit 15 (step S115). The pulse output unit 15 outputs pulses to the amplifier 30 based on the motor speed control parameter from the motor speed control parameter output unit 14 (step S116).
In parallel with steps S115 and S116, the simulator unit 17 simulates the control system 1 using the motor speed control parameter obtained from the motor speed control parameter output unit 14 and calculates the takt time (step S117). The simulator unit 17 outputs the calculated takt time to the takt time acquisition unit 18.
The motor 50 is thereby driven. When the motor 50 is driven, the vibration sensor 61 provided on the motor 50 detects the vibration of the motor 50. After that, as described for step S112, the sensor value acquisition unit 12 acquires vibration data representing the vibration of the motor 50. Further, after the pulses are output to the amplifier 30 in step S116 and the takt time is calculated in step S117, the takt time acquisition unit 18 acquires the takt time of the motor 50 as described for step S113. Steps S112 to S116 are then executed repeatedly as described above. In one example, when the value of the vibration data acquired by the sensor value acquisition unit 12 changes so as to exceed the allowable range, or the takt time acquired by the takt time acquisition unit 18 changes so as to exceed the allowable range, the inference unit 134 in step S114 analyzes the information on the motor 50, the allowable range, the priority item, the value of the vibration data, and the takt time using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the priority item falls within the allowable range.
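The re-inference condition and the parameter selection in step S114 can be sketched as follows. This is only an illustration under stated assumptions: the trained model is represented as a dictionary of action values keyed by (state, parameter set), the candidate parameter sets are hashable tuples, and both function names are hypothetical.

```python
def needs_reinference(vibration, takt_time, first_threshold, third_threshold):
    """Return True when either monitored value exceeds its allowable range."""
    return vibration > first_threshold or takt_time > third_threshold


def infer_parameters(q_table, state, candidates):
    """Pick the candidate parameter set with the highest learned action value."""
    # Unseen (state, parameter) pairs default to an action value of 0.0.
    return max(candidates, key=lambda params: q_table.get((state, params), 0.0))
```

A control loop would call needs_reinference after each acquisition of vibration data and takt time, and call infer_parameters only when it returns True.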
The above description shows the case where the vibration sensor 61 and the takt time measuring device 63 are provided on the product 52, but the embodiment is not limited to this. FIG. 15 is a block diagram showing another example of the configuration of a control system including the positioning control device according to the third embodiment. In the following, components identical to those in FIGS. 1, 6 and 12 are given the same reference numerals and their description is omitted; only the differences from FIGS. 1, 6 and 12 are described. In FIG. 15, the vibration sensor 61 and the takt time measuring device 63 are provided on a system 53 including a plurality of products 52. In one example, such a system 53 is a multi-axis control system including a plurality of motors 50 and products 52. This allows the machine learning unit 13 to optimize the vibration value of the system 53 including the plurality of products 52.
The above description also shows the case where, in the configuration of the first embodiment, machine learning is performed in consideration of the takt time in addition to the vibration of the motor 50. However, in the configuration of the second embodiment, machine learning may be performed in consideration of the takt time in addition to the vibration and the current consumption of the motor 50.
FIG. 16 is a block diagram showing another example of the configuration of a control system including the positioning control device according to the third embodiment. In the following, components identical to those in FIGS. 1, 10 and 12 are given the same reference numerals and their description is omitted; only the differences from FIGS. 1, 10 and 12 are described. The control system 1 of FIG. 16 shows the case where the third embodiment is applied to the configuration of FIG. 10 of the second embodiment. The control system 1 further includes a takt time measuring device 63 on the product 52. The positioning control device 10 further includes a simulator unit 17 and a takt time acquisition unit 18.
FIG. 17 is a block diagram showing another example of the configuration of a control system including the positioning control device according to the third embodiment. In the following, components identical to those in FIGS. 1, 6, 11 and 12 are given the same reference numerals and their description is omitted; only the differences from FIGS. 1, 6, 11 and 12 are described. The control system 1 of FIG. 17 shows the case where the third embodiment is applied to the configuration of FIG. 11 of the second embodiment. The control system 1 further includes a takt time measuring device 63 on the system 53 including a plurality of products 52. The positioning control device 10 further includes a simulator unit 17 and a takt time acquisition unit 18.
The machine learning unit 13 learns a motor speed control parameter with which the designated priority item falls within the allowable range, according to learning data created based on a combination of the motor speed control parameter, the information on the motor 50, the allowable range, and the vibration data, current consumption value, and takt time of the motor 50. That is, it generates a trained model that infers, from the information on the motor 50 and the vibration data, current consumption value, and takt time of the motor 50, a motor speed control parameter with which the designated priority item falls within the allowable range. The designated priority item is vibration, current consumption, or takt time. The trained model is a model that has learned, from the information on the motor 50, the allowable range, the vibration data, the current consumption, and the takt time, the correlation among the vibration at the installation location of the motor 50, the current consumption, the takt time, and the motor speed control parameter. The machine learning unit 13 may learn a motor speed control parameter with which the designated priority item falls within the allowable range, or one with which not only the designated priority item but also the items other than the priority item fall within the allowable range. Further, the machine learning unit 13 analyzes the information on the motor 50, the allowable range, the priority item, and the value of the vibration data, the current consumption value, and the takt time of the motor 50 using the trained model, and outputs a motor speed control parameter with which the priority item falls within the allowable range for the motor 50 having that information.
The machine learning method in the machine learning unit 13 is a combination of the methods shown in the second and third embodiments, and the basic processing is the same, so its description is omitted. In one example, in step S53 of FIG. 8, the model generation unit 132 determines which of vibration, current consumption, and takt time is the priority item. Then, in each case, the reward calculation unit 141 increases or decreases the reward based on the relationship between the priority item and the threshold value. Here, the reward may be determined by considering only whether the priority item is within the allowable range, or by considering whether both the priority item and the items other than the priority item are within the allowable range.
Likewise, the motor control method using the trained model in the positioning control device 10 is a combination of the methods shown in the second and third embodiments, and the basic processing is the same, so its description is omitted. In one example, in step S114 of FIG. 14, the inference unit 134 analyzes the values in the parameter storage unit 11 and the acquired value of the vibration data, current consumption value, and takt time using the trained model, and sets, in the motor speed control parameter output unit 14, a motor speed control parameter with which the set priority item falls within the allowable range.
FIG. 18 is a diagram showing an example of a speed command in the positioning control device according to the third embodiment. In this figure, the horizontal axis represents time and the vertical axis represents pulse frequency, that is, speed. In the speed command, the time taken to go from speed 0 to the command speed is called the actual acceleration time, and the time taken to go from the command speed to speed 0 is called the actual deceleration time. The area of the portion of the speed command in which the command speed is maintained represents the travel distance. The time taken to accelerate from speed 0 to the command speed and then decelerate back to speed 0 is called the takt time.
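The relationship between the quantities in FIG. 18 can be made concrete with a short Python sketch. It assumes an ideal trapezoidal speed command with linear ramps and takes the travel distance as the full area under the trapezoid, including the ramps; the description above mentions only the constant-speed portion, so this is an assumption of the example, as are the function name and the units (pulses and seconds).

```python
def takt_time(distance, command_speed, accel_time, decel_time):
    """Takt time of an assumed ideal trapezoidal speed command.

    distance is in pulses, command_speed in pulses per second, and the
    actual acceleration and deceleration times in seconds.
    """
    if distance <= 0 or command_speed <= 0:
        raise ValueError("distance and command_speed must be positive")
    if accel_time < 0 or decel_time < 0:
        raise ValueError("ramp times must be non-negative")
    # Distance covered during the two linear ramps (area of two triangles).
    ramp_distance = command_speed * (accel_time + decel_time) / 2
    if ramp_distance > distance:
        raise ValueError("command speed is never reached for this distance")
    # Remaining distance is covered at the constant command speed.
    constant_time = (distance - ramp_distance) / command_speed
    return accel_time + constant_time + decel_time
```

Under these assumptions the takt time equals distance / command_speed + (accel_time + decel_time) / 2, so both shorter ramps and a higher command speed shorten it, at the cost of the sharper acceleration discussed for FIG. 19.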
FIG. 19 is a diagram showing an example of a learning result of the motor speed control parameter in the positioning control device according to the third embodiment. In this figure, the horizontal axis represents time and the vertical axis represents pulse frequency, that is, speed. The speed command curve F1 is the speed command curve for a motor speed control parameter set so that the takt time is short. In the speed command curve F1, the actual acceleration time and the actual deceleration time are short. That is, sudden acceleration and sudden deceleration are performed, so the current consumption increases and the vibration also increases.
The speed command curve F2 is the speed command curve for a motor speed control parameter set so that the takt time is longer than that of the speed command curve F1. In the speed command curve F2, the actual acceleration time and the actual deceleration time are longer than in the speed command curve F1. The vibration can therefore be reduced compared with the speed command curve F1, but the acceleration and deceleration become too gradual. Further, since the command speed, which is the steady operation speed, is lower than in the speed command curve F1, the current consumption increases and the takt time also becomes longer.
The speed command curve F3 has a longer takt time and a lower command speed than the speed command curve F1, and a shorter takt time and a higher command speed than the speed command curve F2. The actual acceleration time and the actual deceleration time lie between those of the speed command curves F1 and F2, so the vibration is suppressed. Since the command speed is higher than in the speed command curve F2, the current consumption can be reduced compared with the speed command curve F2. The takt time can also be made shorter than in the speed command curve F2. That is, in the speed command curve F3, the vibration, the current consumption, and the takt time all fall within the allowable range and take well-balanced values. The machine learning unit 13 sets a motor speed control parameter that yields a curve such as the speed command curve F3.
In the third embodiment, the machine learning unit 13 of the positioning control device 10 learns, according to the learning data, a motor speed control parameter with which the set priority item falls within the allowable range, and generates a trained model. The learning data is created based on a combination of the vibration data from the vibration sensor 61, the takt time from the takt time measuring device 63, the information on the motor 50, the motor speed control parameter, and the allowable range. The vibration sensor 61, the current consumption measuring device 62, and the takt time measuring device 63 are provided on the motor 50, on the product 52 including the drive unit 51 connected to the motor 50, or on the system 53 including a plurality of products 52. The machine learning unit 13 then analyzes the information on the motor 50, the allowable range, the value of the vibration data, and the takt time using the trained model, and thereby sets a motor speed control parameter with which the set priority item falls within the allowable range. When vibration is set as the priority item, driving the motor 50 with the set motor speed control parameter allows the product 52 or the system 53 to operate without applying excessive vibration to the motor 50, which reduces the load on the motor 50 and extends its life. When the takt time is set as the priority item, driving the motor 50 with the set motor speed control parameter shortens the takt time and improves the production efficiency of the entire system 53.
Unlike the conventional technique, in which a servo gain serving as the control value is determined by feedback control from the motor 50, the motor speed control parameter is determined by the output control of the positioning control device 10. Therefore, the life, energy saving, and production efficiency of the product 52 or the system 53 provided with the motor 50 can be improved not only with expensive equipment capable of feedback control, such as a servomotor with an encoder and its amplifier, but also with equipment that has no feedback mechanism, such as a stepping motor and its amplifier. Further, feedback control using a position command as in the conventional technique only adjusts the gain and has little effect on shortening the takt time, whereas the third embodiment adjusts the motor speed control parameter, so that, depending on the constraints, the takt time can be shortened significantly compared with the conventional technique.
Further, motor speed control parameters are learned such that not only the item designated as the priority item but all items fall within the allowable range. As a result, when vibration is prioritized, a motor speed control parameter can be set that keeps the vibration within the allowable range while maintaining the minimum current consumption and takt time. Conversely, when the takt time is prioritized, a motor speed control parameter can be set that keeps the takt time within the allowable range while maintaining the minimum vibration and current consumption.
 Here, the hardware configuration of the positioning control device 10 described in the first, second, and third embodiments is described. FIG. 20 schematically shows an example of a hardware configuration that realizes the positioning control device according to the first, second, and third embodiments.
 In the positioning control device 10, a processor 101 and a memory 102 are connected via a bus line 103. Examples of the processor 101 are a CPU (Central Processing Unit) and a system LSI (Large Scale Integration). Examples of the memory 102 are a RAM (Random Access Memory) and a ROM (Read Only Memory), which are main storage devices, and an HDD (Hard Disk Drive) and an SSD (Solid State Drive), which are auxiliary storage devices.
 When some or all of the functions of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18 are realized by the processor 101, those functions are realized by the processor 101 together with software, firmware, or a combination of software and firmware. The software or firmware is written as a program and stored in the memory 102. The processor 101 realizes some or all of the functions of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18 by reading and executing the program stored in the memory 102.
 In that case, the memory 102 of the positioning control device 10 stores a program that, when executed, results in the execution of the steps performed by some or all of the sensor value acquisition unit 12, the machine learning unit 13, the motor speed control parameter output unit 14, the pulse output unit 15, the current consumption acquisition unit 16, the simulator unit 17, and the takt time acquisition unit 18. The program stored in the memory 102 can also be said to cause a computer to execute the procedures or methods performed by some or all of these units.
 The configurations shown in the above embodiments are examples. They can be combined with other known techniques, the embodiments can be combined with one another, and part of a configuration can be omitted or changed without departing from the gist.
 1 control system, 10 positioning control device, 11 parameter storage unit, 12 sensor value acquisition unit, 13 machine learning unit, 14 motor speed control parameter output unit, 15 pulse output unit, 16 current consumption acquisition unit, 17 simulator unit, 18 takt time acquisition unit, 30 amplifier, 50 motor, 51 drive unit, 52 product, 53 system, 61 vibration sensor, 62 current consumption measuring device, 63 takt time measuring device, 131 data acquisition unit, 132 model generation unit, 133 trained model storage unit, 134 inference unit, 141 reward calculation unit, 142 function update unit.

Claims (9)

  1.  A positioning control device that is electrically connected to a motor via an amplifier and controls the motor, the positioning control device comprising:
     a parameter storage unit that stores a parameter that is information necessary for determining a motor speed control parameter, which is a parameter for controlling the motor, the parameter including information on the motor and an allowable range of vibration and takt time of the motor;
     a vibration data acquisition unit that acquires vibration data, which is vibration at an installation location of the motor detected by a vibration sensor;
     a machine learning unit that, using a trained model that has learned from the parameter and the vibration data a correlation between the vibration at the installation location of the motor and the motor speed control parameter, determines from the parameter and the vibration data the motor speed control parameter with which the vibration falls within the allowable range; and
     an output unit that, based on the determined motor speed control parameter, outputs to the amplifier a pulse for controlling the amplifier.
  2.  The positioning control device according to claim 1, further comprising a current consumption acquisition unit that acquires a current consumption value at the installation location of the motor measured by a current consumption measuring device, wherein
     the parameter further includes an allowable range of current consumption of the motor,
     the trained model has learned, from the parameter, the vibration data, and the current consumption value, a correlation among the vibration at the installation location of the motor, the current consumption value, and the motor speed control parameter, and
     the machine learning unit determines the motor speed control parameter from the parameter, the vibration data, and the current consumption value using the trained model.
  3.  The positioning control device according to claim 2, wherein the machine learning unit uses the trained model to determine the motor speed control parameter such that, among items including the vibration and the current consumption, a priority item selected by a user is prioritized while items other than the priority item also fall within their allowable ranges.
  4.  The positioning control device according to claim 1, further comprising a takt time acquisition unit that acquires a takt time at the installation location of the motor measured by a takt time measuring device, wherein
     the parameter further includes an allowable range of the takt time of the motor,
     the trained model has learned, from the parameter, the vibration data, and the takt time, a correlation among the vibration at the installation location of the motor, the takt time, and the motor speed control parameter, and
     the machine learning unit determines the motor speed control parameter from the parameter, the vibration data, and the takt time using the trained model.
  5.  The positioning control device according to claim 4, wherein the machine learning unit uses the trained model to determine the motor speed control parameter such that, among items including the vibration and the takt time, a priority item selected by a user is prioritized while items other than the priority item also fall within their allowable ranges.
  6.  The positioning control device according to claim 2, further comprising a takt time acquisition unit that acquires a takt time at the installation location of the motor measured by a takt time measuring device, wherein
     the parameter further includes an allowable range of the takt time of the motor,
     the trained model has learned, from the parameter, the vibration data, the current consumption value, and the takt time, a correlation among the vibration at the installation location of the motor, the current consumption value, the takt time, and the motor speed control parameter, and
     the machine learning unit determines the motor speed control parameter from the parameter, the vibration data, the current consumption value, and the takt time using the trained model.
  7.  The positioning control device according to claim 6, wherein the machine learning unit uses the trained model to determine the motor speed control parameter such that, among items including the vibration, the current consumption, and the takt time, a priority item selected by a user is prioritized while items other than the priority item also fall within their allowable ranges.
  8.  The positioning control device according to claim 1, wherein the machine learning unit includes:
     a data acquisition unit that acquires a combination of the parameter and the vibration data;
     a learning unit that learns the correlation between the vibration at the installation location of the motor and the motor speed control parameter in accordance with learning data created based on the combination of the parameter and the vibration data, and generates the trained model; and
     a trained model storage unit that stores the trained model.
  9.  A machine learning device that learns a correlation between vibration at an installation location of a motor and a motor speed control parameter, which is a parameter for controlling the motor, the machine learning device comprising:
     a data acquisition unit that acquires a combination of a parameter and vibration data, the parameter being information necessary for determining the motor speed control parameter and including information on the motor and an allowable range of vibration and takt time of the motor, and the vibration data being vibration at the installation location of the motor detected by a vibration sensor;
     a learning unit that learns the correlation between the vibration at the installation location of the motor and the motor speed control parameter in accordance with learning data created based on the combination of the parameter and the vibration data, and generates a trained model; and
     a trained model storage unit that stores the trained model.
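
As a rough illustration of the machine learning device of claim 9, the sketch below learns from (motor speed control parameter, measured vibration) pairs which parameter settings keep vibration inside the allowable range. The reference signs list names a reward calculation unit 141 and a function update unit 142, so the sketch uses a reward and a value update. The specific reward (+1 or -1), the stateless incremental update rule, and the names `reward`, `LearningUnit`, `update`, and `best_action` are assumptions made for illustration, not the method of the claims.

```python
from typing import Dict, Tuple

Action = Tuple[float, float]  # (operating pulse speed, acceleration rate), hypothetical


def reward(vibration: float, limit: float) -> float:
    """Hypothetical reward: +1 if vibration is within the limit, otherwise -1."""
    return 1.0 if vibration <= limit else -1.0


class LearningUnit:
    """Keeps one learned value per parameter setting (a stateless value table)."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        # This dictionary plays the role of the trained model storage.
        self.values: Dict[Action, float] = {}

    def update(self, action: Action, vibration: float, limit: float) -> None:
        """Move the stored value for `action` toward the observed reward."""
        old = self.values.get(action, 0.0)
        r = reward(vibration, limit)
        self.values[action] = old + self.learning_rate * (r - old)

    def best_action(self) -> Action:
        """Return the parameter setting with the highest learned value."""
        if not self.values:
            raise ValueError("no learning data has been observed yet")
        return max(self.values, key=lambda a: self.values[a])
```

Each call to `update` corresponds to one acquired combination of parameter and vibration data. A fuller model would also condition on the stored parameter (motor information and allowable ranges), which this sketch omits.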
PCT/JP2020/025698 2020-06-30 2020-06-30 Positioning control device and machine learning device WO2022003833A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021512951A JPWO2022003833A1 (en) 2020-06-30 2020-06-30
PCT/JP2020/025698 WO2022003833A1 (en) 2020-06-30 2020-06-30 Positioning control device and machine learning device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/025698 WO2022003833A1 (en) 2020-06-30 2020-06-30 Positioning control device and machine learning device

Publications (1)

Publication Number Publication Date
WO2022003833A1 (en) 2022-01-06

Family

ID=79314981

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/025698 WO2022003833A1 (en) 2020-06-30 2020-06-30 Positioning control device and machine learning device

Country Status (2)

Country Link
JP (1) JPWO2022003833A1 (en)
WO (1) WO2022003833A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017064837A (en) * 2015-09-29 2017-04-06 ファナック株式会社 Machine learning method and machine learning device for learning operation command for electric motor, and machine tool equipped with machine learning device
JP2018097680A (en) * 2016-12-14 2018-06-21 ファナック株式会社 Control system and machine learning device
JP2018186610A (en) * 2017-04-25 2018-11-22 株式会社安川電機 System, evaluation device, and evaluation method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4355273A (en) * 1980-09-15 1982-10-19 Xerox Corporation Servo capture system
JP2681969B2 (en) * 1988-02-17 1997-11-26 株式会社安川電機 Coulomb friction compensation method by variable structure system
JP2950149B2 (en) * 1994-05-30 1999-09-20 株式会社デンソー Auto tuning controller
JPH1063339A (en) * 1996-08-26 1998-03-06 Mori Seiki Co Ltd Controller of numerical value control machine tool
JP3465786B2 (en) * 1999-02-23 2003-11-10 横河電機株式会社 Linear actuator
JP6538787B2 (en) * 2017-09-12 2019-07-03 ファナック株式会社 Motor control device and motor control method

Also Published As

Publication number Publication date
JPWO2022003833A1 (en) 2022-01-06

Legal Events

Date Code Title Description
ENP Entry into the national phase (Ref document number: 2021512951; Country of ref document: JP; Kind code of ref document: A)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20943386; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20943386; Country of ref document: EP; Kind code of ref document: A1)