WO2022153936A1 - Machine learning device - Google Patents
Machine learning device
- Publication number
- WO2022153936A1 (PCT/JP2022/000336)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- condition
- swing
- learning
- machine learning
- rocking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01H—MEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
- G01H13/00—Measuring resonant frequency
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M99/00—Subject matter not provided for in other groups of this subclass
- G01M99/005—Testing of complete machines, e.g. washing-machines or mobile phones
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Program-control systems
- G05B19/02—Program-control systems electric
- G05B19/18—Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of program data in numerical form
- G05B19/4093—Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of program data in numerical form characterised by part programming, e.g. entry of geometrical information as taken from a technical drawing, combining this with machining and material information to obtain control information, named part program, for the NC machine
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/33—Director till display
- G05B2219/33034—Online learning, training
Definitions
- the present invention relates to a machine learning device.
- Conventionally, oscillating cutting (swing cutting) is known, in which cutting is performed while the work and the cutting tool are fed in the machining direction and, at the same time, reciprocally vibrated relative to each other (see, for example, Patent Document 1).
- In oscillating cutting, in order to shred chips, it is necessary to set oscillating conditions such as frequency and amplitude so that the tool swings idle (air cuts) with respect to the work at regular intervals.
- The operator confirms whether an overlap between the previous path and the current path occurs, that is, whether chips are shredded, by checking the waveform of the set swing condition.
- However, evaluation data on the finish of the machined work, such as the surface roughness, roundness, and dimensional accuracy of a work machined by oscillating cutting, is often worse than when oscillating cutting is not applied. This is caused by the vibration of oscillating cutting, changes in the cutting amount, and so on, but the mechanism by which the evaluation data deteriorates is very complicated. Therefore, it is required to calculate a swing condition that causes the tool to swing idle and also achieves good evaluation data of the machined work.
- The machine learning device learns the swing conditions of a machine tool that performs swing machining, in which machining is performed while a tool and a work are swung relative to each other. The machine learning device includes:
- a setting condition acquisition unit that acquires the setting conditions for the swing machining;
- a label acquisition unit that acquires, as a label, the evaluation data of the work machined by the machine tool; and
- a learning unit that performs supervised learning using pairs of the setting conditions and the label as teacher data.
- The learning unit includes a learning model that learns the swing conditions for optimizing the evaluation data of the machined work.
- Alternatively, the machine learning device learns the swing conditions of a machine tool that performs swing machining, in which machining is performed while a tool and a work are swung relative to each other. The machine learning device includes:
- a setting condition acquisition unit that acquires the setting conditions for the swing machining as state information;
- a determination information acquisition unit that acquires, as determination information, the evaluation data of the work machined by the machine tool;
- an action information output unit that outputs action information indicating how the swing conditions should be changed with respect to the current state;
- a reward calculation unit that calculates the reward value in reinforcement learning based on the determination information;
- a value function update unit that updates, based on the state information, the action information, and the reward, a value function that determines the value of the swing conditions of the machine tool; and
- a swing condition output unit that outputs, based on the value function, the optimum swing condition that optimizes the evaluation data of the machined work.
- FIG. 1 is a diagram showing an outline of the control system 1 according to the first embodiment.
- The purpose of the control system 1 is to use machine learning to calculate a swing condition under which the tool swings idle in the machine tool 100 and good evaluation data of the machined work is achieved.
- In the first embodiment, the control system 1 uses machine learning to calculate a swing condition under which the tool swings idle in the machine tool 100 and good surface roughness data is achieved.
- the control system 1 includes a machine tool 100, a numerical control device 200, a machined surface analysis device 300, and a machine learning device 400.
- the machine tool 100 and the numerical control device 200 are connected in a one-to-one pair so as to be able to communicate with each other.
- the machine tool 100 and the numerical control device 200 may be directly connected via a connection interface, or may be connected via a network such as a LAN (Local Area Network).
- the numerical control device 200, the machined surface analysis device 300, and the machine learning device 400 are each directly connected via a connection interface or connected via a network, and can communicate with each other.
- the network is, for example, a LAN constructed in a factory, the Internet, a public telephone network, or a combination thereof.
- the specific communication method in the network and whether it is a wired connection or a wireless connection are not particularly limited.
- the machine tool 100 processes the work by using oscillating cutting according to the control of the numerical control device 200.
- The machine tool 100 has the general components needed for oscillating cutting, such as a tool, a spindle, and a feed shaft.
- the numerical control device 200 is a device for controlling the machine tool 100.
- the numerical control device 200 includes a machining condition setting unit 210 and a swing condition setting unit 220.
- The machining condition setting unit 210 sets the setting conditions, which include the machining conditions and the swing conditions, for performing swing machining in the machine tool 100.
- The machining condition setting unit 210 outputs the setting conditions it has set to the machine learning device 400.
- The setting conditions comprise machining conditions, which include at least one of the tool feed speed, spindle rotation speed, coordinate values, tool cutting edge, tool type, and work material in the machine tool 100, and swing conditions, which include at least one of the swing frequency and the swing amplitude of the machine tool 100.
- the swing condition setting unit 220 sets the swing condition output from the machine learning device 400 in the numerical control device 200, and outputs the swing condition to the machine tool 100.
- the machined surface analysis device 300 measures and calculates the surface roughness data of the machined workpiece machined by the machine tool 100, and outputs the calculated surface roughness data to the machine learning device 400.
- the surface roughness data includes, for example, at least one of arithmetic mean roughness, maximum height, maximum peak height, maximum valley depth, average height, maximum cross-sectional height and load length ratio.
- The surface roughness data can be calculated, for example, by measuring the line roughness at a plurality of locations on the side surface of the cylinder and taking the average as the surface roughness (arithmetic mean roughness Ra), or by measuring the line roughness at a plurality of locations on the side surface of the cylinder and taking the maximum value (maximum height Rz).
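The two aggregation methods described above can be sketched as follows. This is a minimal illustration, assuming each measured location yields a list of profile heights; the function names and the sample data are hypothetical and not taken from the patent.

```python
from statistics import mean


def line_ra(profile):
    """Arithmetic mean roughness of one profile: mean |z - mean line|."""
    center = mean(profile)
    return mean(abs(z - center) for z in profile)


def line_rz(profile):
    """Maximum height of one profile: highest peak minus deepest valley."""
    return max(profile) - min(profile)


def surface_ra(profiles):
    """Average the line Ra values measured at several locations."""
    return mean(line_ra(p) for p in profiles)


def surface_rz(profiles):
    """Take the largest line Rz among the measured locations."""
    return max(line_rz(p) for p in profiles)


# Hypothetical height profiles from two locations on the cylinder side surface.
profiles = [
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 2.0, 0.0, -2.0],
]
ra = surface_ra(profiles)
rz = surface_rz(profiles)
```

Either value could then serve as the surface roughness label passed to the machine learning device 400.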
- the machine learning device 400 performs machine learning using the setting conditions received from the numerical control device 200 and the surface roughness data received from the machined surface analysis device 300. Then, the machine learning device 400 builds a learning model for outputting the swing condition by performing machine learning.
- the machine learning device 400 includes a setting condition acquisition unit 410, a label acquisition unit 420, a learning unit 430, a learning model storage unit 440, and a swing condition output unit 450.
- the setting condition acquisition unit 410 acquires the machining conditions for rocking and the setting conditions including the rocking conditions from the numerical control device 200.
- the label acquisition unit 420 acquires the surface roughness data of the machined work by the machine tool 100 from the machined surface analysis device 300 as a label.
- the label is the correct output that should correspond to the input in machine learning.
- the setting conditions for the swinging process to be learned and the label of the surface roughness data are paired and input to the learning unit 430.
- This set of setting conditions and labels corresponds to teacher data in machine learning.
- the learning unit 430 builds a learning model by performing machine learning based on the input teacher data. That is, the learning unit 430 learns the swing condition for optimizing the surface roughness of the processed work.
- the learning model constructed by the learning unit 430 is output to the learning model storage unit 440.
- the learning model storage unit 440 stores the learning model learned by the learning unit 430.
- the learning model stored in the learning model storage unit 440 is used by the swing condition output unit 450.
- Once a learning model has been constructed, the learning unit 430 may update it by performing further supervised learning on the learning model stored in the learning model storage unit 440.
- The machine learning device 400 may share the learning model stored in the learning model storage unit 440 with other machine learning devices. If the learning model is shared by a plurality of machine learning devices, supervised learning can be distributed among them, so the efficiency and accuracy of supervised learning can be improved.
- the rocking condition output unit 450 outputs the optimum rocking condition for optimizing the surface roughness of the processed work based on the learning model stored in the learning model storage unit 440. Further, the swing condition output unit 450 includes a chip shredding condition calculation unit 451 and a swing condition upper limit calculation unit 452.
- the chip shredding condition calculation unit 451 calculates the chip shredding rocking condition that enables the work chips to be shredded by the rocking of the machine tool 100. Then, the swing condition output unit 450 outputs the optimum swing condition that satisfies the chip shredding swing condition based on the learning model.
- To calculate the chip shredding swing condition, combinations of swing conditions capable of shredding chips may be held in advance as table data, or a relational expression among the swing conditions that produces an overlap between the previous path and the current path may be calculated from the machining path derived from the swing conditions.
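One possible form of such a relational expression is sketched below. It assumes a purely sinusoidal swing superimposed on a constant feed, which is an illustrative model and not something the patent specifies; the names `feed_per_rev`, `amplitude`, and `osc_per_rev` (oscillations per spindle revolution) are hypothetical.

```python
import math


def min_path_gap(feed_per_rev, amplitude, osc_per_rev):
    """Smallest distance between the current and the previous tool path.

    Assumed path model: z(theta) = F*theta/(2*pi) + A*sin(I*theta), where
    theta is the spindle angle. One revolution apart,
    z(theta) - z(theta - 2*pi) = F + 2*A*sin(pi*I)*cos(I*theta - pi*I),
    whose minimum over theta is F - 2*A*|sin(pi*I)|.
    """
    return feed_per_rev - 2.0 * amplitude * abs(math.sin(math.pi * osc_per_rev))


def shreds_chips(feed_per_rev, amplitude, osc_per_rev):
    """True when the current path reaches the previous one (an air cut occurs)."""
    return min_path_gap(feed_per_rev, amplitude, osc_per_rev) <= 0.0


# Half an oscillation per revolution with enough amplitude overlaps the paths.
ok = shreds_chips(feed_per_rev=0.1, amplitude=0.06, osc_per_rev=0.5)
# A whole number of oscillations per revolution keeps the paths parallel.
parallel = shreds_chips(feed_per_rev=0.1, amplitude=0.06, osc_per_rev=1.0)
```

Under this model, a table of shredding-capable combinations could be precomputed by evaluating `shreds_chips` over a grid of swing conditions.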
- the swing condition upper limit calculation unit 452 calculates the upper limit swing condition that does not exceed the preset upper limit value. Then, the swing condition output unit 450 outputs the optimum swing condition satisfying the upper limit swing condition based on the learning model.
- The preset upper limit value is, for example, an upper limit of the frequency or amplitude in swing machining, or an upper limit of the speed or acceleration in swing machining. Whether the upper limit swing condition is satisfied can be determined by, for example, holding the various upper limit values as parameters, specifying them in the machining program, or calculating the speed and acceleration from the swing conditions (frequency and amplitude).
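The upper limit check can be sketched as follows, assuming a sinusoidal swing so that the peak speed is 2πfA and the peak acceleration is (2πf)²A. The `SwingLimits` container and the numeric limits are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SwingLimits:
    """Hypothetical preset upper limits (e.g. held as parameters)."""
    max_frequency_hz: float
    max_amplitude_mm: float
    max_speed_mm_s: float
    max_accel_mm_s2: float


def within_limits(frequency_hz, amplitude_mm, limits):
    """Check frequency, amplitude, and the speed/acceleration derived from them."""
    omega = 2.0 * math.pi * frequency_hz
    peak_speed = omega * amplitude_mm       # peak of d/dt [A*sin(omega*t)]
    peak_accel = omega ** 2 * amplitude_mm  # peak of d2/dt2 [A*sin(omega*t)]
    return (
        frequency_hz <= limits.max_frequency_hz
        and amplitude_mm <= limits.max_amplitude_mm
        and peak_speed <= limits.max_speed_mm_s
        and peak_accel <= limits.max_accel_mm_s2
    )


limits = SwingLimits(100.0, 0.5, 50.0, 20000.0)
accepted = within_limits(50.0, 0.1, limits)   # peak speed about 31.4 mm/s
rejected = within_limits(100.0, 0.1, limits)  # peak speed about 62.8 mm/s
```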
- FIG. 2 is a flowchart showing a flow of a learning model construction process by the machine learning device 400 according to the first embodiment.
- In step S1, the setting condition acquisition unit 410 acquires the setting conditions, which include the machining conditions and the swing conditions for swing machining, from the numerical control device 200.
- In step S2, the label acquisition unit 420 acquires the surface roughness data of the work machined by the machine tool 100 from the machined surface analysis device 300 as a label.
- In step S3, when teacher data pairing the setting conditions with the label is input, the learning unit 430 executes machine learning based on the input teacher data.
- the learning unit 430 performs supervised learning by, for example, regression analysis, neural network, least squares method, stepwise method, or the like.
- Supervised learning may be performed by online learning, batch learning, or mini-batch learning.
- Online learning is a learning method in which supervised learning is performed immediately each time a label is input and teacher data is created.
- Batch learning is a learning method in which a plurality of teacher data are collected while labels are input and teacher data is created repeatedly, and supervised learning is then performed using all of the collected teacher data.
- mini-batch learning is a learning method that is intermediate between online learning and batch learning, in which supervised learning is performed each time teacher data is accumulated to some extent.
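The difference between the three schedules can be sketched as a generator that groups incoming teacher data into the sets used for each supervised update. The function name and the `mini_batch_size` parameter are hypothetical; the patent does not state how much data "to some extent" means.

```python
def update_batches(samples, mode, mini_batch_size=2):
    """Yield the groups of teacher data used for each supervised update."""
    if mode == "online":
        # Update immediately for every new teacher datum.
        for sample in samples:
            yield [sample]
    elif mode == "batch":
        # Collect everything first, then update once on all of it.
        collected = list(samples)
        if collected:
            yield collected
    elif mode == "mini-batch":
        # Update each time a small amount of teacher data has accumulated.
        pending = []
        for sample in samples:
            pending.append(sample)
            if len(pending) == mini_batch_size:
                yield pending
                pending = []
        if pending:
            yield pending
    else:
        raise ValueError(f"unknown mode: {mode}")


online = list(update_batches([1, 2, 3], "online"))
batch = list(update_batches([1, 2, 3], "batch"))
mini = list(update_batches([1, 2, 3], "mini-batch"))
```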
- In step S4, the learning unit 430 determines whether or not to end the supervised learning.
- The conditions for ending supervised learning can be determined arbitrarily.
- For example, the learning unit 430 may end the supervised learning when the error between the output of the neural network and the label becomes equal to or less than a predetermined value. The learning unit 430 may also end the supervised learning when it has been repeated a predetermined number of times.
- If the supervised learning is to end, the process proceeds to step S5.
- Otherwise, the process returns to step S3.
- In step S5, the learning unit 430 outputs the learning model constructed by the supervised learning up to that point to the learning model storage unit 440, where it is stored. When new teacher data is acquired, the machine learning device 400 can perform further machine learning on the learning model. The machine learning device 400 then ends this process.
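The loop formed by steps S3 to S5 can be sketched as below. As an assumption made only for brevity, the learning model is a one-variable linear regression trained by gradient descent; the patent lists regression analysis, neural networks, and least squares as options without fixing one. The learning rate, tolerance, and sample data are hypothetical.

```python
def train(teacher_data, lr=0.1, max_iters=10000, tol=1e-6):
    """Fit label = w*x + b on (x, label) pairs; return (w, b, iterations)."""
    w, b = 0.0, 0.0
    n = len(teacher_data)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        # Step S3: one supervised update (gradient of the mean squared error).
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in teacher_data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in teacher_data) / n
        w -= lr * grad_w
        b -= lr * grad_b
        # Step S4: end when the error is small enough; the loop bound
        # covers the "predetermined number of repetitions" condition.
        mse = sum((w * x + b - y) ** 2 for x, y in teacher_data) / n
        if mse <= tol:
            break
    # Step S5: the fitted parameters stand in for the stored learning model.
    return w, b, iterations


# Hypothetical teacher data: swing amplitude -> measured roughness (2x + 1).
w, b, iterations = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```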
- FIG. 3 is a flowchart showing a flow of output processing of swing conditions by the machine learning device 400 according to the first embodiment.
- In step S11, the setting condition acquisition unit 410 acquires the setting conditions, which include the machining conditions and the swing conditions for swing machining, from the numerical control device 200.
- In step S12, the chip shredding condition calculation unit 451 calculates the chip shredding swing condition that enables the chips of the work to be shredded by the swinging of the machine tool 100.
- In step S13, the swing condition upper limit calculation unit 452 calculates the upper limit swing condition that does not exceed the preset upper limit value.
- In step S14, the swing condition output unit 450 outputs the optimum swing condition that optimizes the surface roughness of the machined work, based on the setting conditions acquired in step S11 and the learning model stored in the learning model storage unit 440.
- In step S15, the swing condition output unit 450 determines whether or not the optimum swing condition output in step S14 satisfies the chip shredding swing condition calculated in step S12.
- If the condition is satisfied, the process proceeds to step S16.
- Otherwise, the process returns to step S14.
- In step S16, the swing condition output unit 450 determines whether or not the optimum swing condition output in step S14 satisfies the upper limit swing condition calculated in step S13.
- If the condition is satisfied, the process proceeds to step S17.
- Otherwise, the process returns to step S14.
- In step S17, the swing condition output unit 450 outputs the optimum swing condition to the numerical control device 200, and then ends the process.
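The output flow of FIG. 3 can be sketched as a search over candidate swing conditions. This is one interpretation, assuming that a return to step S14 means trying the next-best candidate; the candidate list, the `predict_roughness` model, and the two check functions are hypothetical stand-ins.

```python
def output_optimum_swing_condition(
    candidates, predict_roughness, shreds_chips, within_upper_limit
):
    """Return the best-predicted candidate that passes both checks, or None."""
    # Step S14: rank candidates by the roughness the learning model predicts.
    for condition in sorted(candidates, key=predict_roughness):
        # Step S15: chip shredding check; step S16: upper limit check.
        if shreds_chips(condition) and within_upper_limit(condition):
            return condition  # Step S17: output to the numerical control device.
    return None


# Hypothetical candidates as (frequency_hz, amplitude_mm) pairs.
candidates = [(50.0, 0.02), (50.0, 0.06), (200.0, 0.06)]
best = output_optimum_swing_condition(
    candidates,
    predict_roughness=lambda c: c[1],        # toy model: roughness grows with amplitude
    shreds_chips=lambda c: c[1] >= 0.05,     # toy shredding condition
    within_upper_limit=lambda c: c[0] <= 100.0,
)
```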
- As described above, the machine learning device 400 includes a setting condition acquisition unit 410 that acquires the setting conditions for swing machining, a label acquisition unit 420 that acquires the surface roughness data of the work machined by the machine tool 100 as a label, and a learning unit 430 that performs supervised learning using pairs of setting conditions and labels as teacher data.
- The learning unit 430 includes a learning model that learns the swing condition for optimizing the surface roughness of the machined work.
- As a result, the machine learning device 400 can learn an optimum swing condition that takes surface roughness into account, in contrast to the conventional technique, in which it is difficult to set a swing condition with surface roughness in mind.
- the machine learning device 400 further includes a swing condition output unit 450 that outputs an optimum swing condition that optimizes the surface roughness of the machined work based on the learning model.
- As a result, the machine learning device 400 can output an optimum swing condition that takes surface roughness into account, in contrast to the conventional technique, in which it is difficult to set a swing condition with surface roughness in mind. Further, since the machine learning device 400 can automate the setting of the swing condition, the burden on the operator can be reduced.
- The swing condition output unit 450 includes a chip shredding condition calculation unit 451 that calculates a chip shredding swing condition that enables the chips of the work to be shredded, and the swing condition output unit 450 outputs, based on the learning model, the optimum swing condition that satisfies the chip shredding swing condition. As a result, the machine learning device 400 can set an optimum swing condition under which chips are shredded and which is optimized with surface roughness taken into account.
- The swing condition output unit 450 includes a swing condition upper limit calculation unit 452 that calculates an upper limit swing condition that does not exceed a preset upper limit value, and the swing condition output unit 450 outputs, based on the learning model, the optimum swing condition that satisfies the upper limit swing condition.
- As a result, the machine learning device 400 can set an optimum swing condition under which chips are shredded, the upper limit of the swing condition is not exceeded, and the surface roughness is taken into account.
- control system 10 according to the second embodiment will be described.
- the same components as those of the first embodiment are designated by the same reference numerals, and the description thereof will be omitted or simplified.
- the control system 10 according to the second embodiment is mainly different from the first embodiment in that reinforcement learning is used instead of supervised learning, and other configurations include the same configurations as those of the first embodiment.
- FIG. 4 is a diagram showing an outline of the control system 10 according to the second embodiment.
- The purpose of the control system 10 is to use machine learning to calculate a swing condition under which the tool swings idle in the machine tool 100 and good surface roughness data is achieved.
- The control system 10 includes a machine tool 100, a numerical control device 200, a machined surface analysis device 300, and a machine learning device 500.
- the machine tool 100 and the numerical control device 200 are connected in a one-to-one pair so as to be able to communicate with each other.
- the machine tool 100 and the numerical control device 200 may be directly connected via a connection interface, or may be connected via a network such as a LAN (Local Area Network).
- the numerical control device 200, the machined surface analysis device 300, and the machine learning device 500 are each directly connected via a connection interface or connected via a network, and can communicate with each other.
- the network is, for example, a LAN constructed in a factory, the Internet, a public telephone network, or a combination thereof.
- the specific communication method in the network and whether it is a wired connection or a wireless connection are not particularly limited.
- the machine tool 100, the numerical control device 200, and the machined surface analysis device 300 have the same configurations as those in the first embodiment as described above.
- the machine learning device 500 is a device that performs reinforcement learning. Prior to the explanation of each functional block included in the machine learning device 500, first, the basic mechanism of reinforcement learning will be described.
- the agent (corresponding to the machine learning device 500 in the present embodiment) observes the state of the environment, selects a certain action, and changes the environment based on the action. As the environment changes, some reward is given and the agent learns better behavioral choices (decision-making). Whereas supervised learning gives a complete answer, rewards in reinforcement learning are often fragmentary values based on some changes in the environment. Therefore, the agent learns to choose an action to maximize the total reward for the future.
- Any learning method can be used for reinforcement learning, but the following description takes as an example Q-learning, which is a method of learning the value Q(s, a) of selecting an action a under a state s of a certain environment.
- The purpose of Q-learning is to select, from the actions a that can be taken in a certain state s, the action a with the highest value Q(s, a) as the optimum action.
- The agent selects various actions a under a certain state s and, based on the reward given for each action, learns to select better actions, thereby learning the correct value Q(s, a).
- The update of the value Q(s, a) can be expressed, for example, by the following equation (1):
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{1}$$
- In equation (1), s_t represents the state of the environment at time t, and a_t represents the action at time t.
- The state changes to s_{t+1} as a result of the action a_t, and r_{t+1} represents the reward obtained by that change of state.
- The term with max is the Q value obtained when the action a with the highest Q value known at that time is selected under the state s_{t+1}, multiplied by γ.
- γ is a parameter satisfying 0 < γ ≤ 1 and is called the discount rate.
- α is the learning coefficient and lies in the range 0 < α ≤ 1.
- Equation (1) represents a method of updating the value Q(s_t, a_t) of the action a_t in the state s_t based on the reward r_{t+1} returned as a result of trying a_t.
- This update equation indicates that if the value max_a Q(s_{t+1}, a) of the best action in the next state s_{t+1} reached by the action a_t is larger than the value Q(s_t, a_t) of the action a_t in the state s_t, then Q(s_t, a_t) is increased, and if it is smaller, Q(s_t, a_t) is decreased. In other words, the value of an action in a given state is brought closer to the value of the best action in the state that follows it. The size of the adjustment depends on the discount rate γ and the reward r_{t+1}, but in essence the value of the best action in a given state propagates back to the value of the action in the state preceding it.
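The update described above can be sketched in tabular form as follows. The rule implemented is the standard Q-learning update, Q(s_t, a_t) ← Q(s_t, a_t) + α(r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)); the state and action names and the parameter values are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action known so far in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(state, action) toward reward + gamma * best_next by a fraction alpha.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ["increase_amplitude", "decrease_amplitude"]
q[("s1", "decrease_amplitude")] = 2.0

# 0.0 + 0.5 * (1.0 + 0.9 * 2.0 - 0.0) = 1.4
new_value = q_update(q, "s0", "increase_amplitude", 1.0, "s1", actions)
```

Because the best next-state value (2.0) exceeds the current value (0.0), the update increases Q for the pair just tried, as the text describes.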
- Q-learning may use a technique called DQN (Deep Q-Network).
- Specifically, the value function Q may be constructed using an appropriate neural network, and the value Q(s, a) may be calculated by adjusting the parameters of the neural network so that the network approximates the value function Q.
- Using DQN makes it possible to shorten the time required for Q-learning to converge.
- DQN is described in detail in, for example, the following non-patent document.
- The machine learning device 500 performs the Q-learning described above. Specifically, the machine learning device 500 takes the setting conditions (machining conditions, swing conditions, etc.) set in the numerical control device 200 as the state s, takes a change of the swing conditions in the numerical control device 200 for that state s as the action a, and learns the value function Q used to select the action.
- the machine learning device 500 observes the state s such as the setting conditions (machining conditions, swing conditions, etc.) set in the numerical control device 200, and determines the action a.
- The machine learning device 500 receives a reward each time the action a is performed.
- The machine learning device 500 searches for the optimum action a by trial and error so as to maximize the total future reward. In this way, the machine learning device 500 can select the optimum action a for a state s such as the setting conditions (machining conditions, swing conditions, etc.) set in the numerical control device 200.
- That is, based on the value function Q learned by the machine learning device 500, selecting the action a with the maximum value of Q makes it possible to select the action a that yields the minimum (optimum) surface roughness data.
- The machine learning device 500 includes a setting condition acquisition unit 510, a determination information acquisition unit 520, an action information output unit 530, a learning unit 540, a value function storage unit 550, and a swing condition output unit 560.
- the setting condition acquisition unit 510 acquires the setting conditions (machining conditions, swing conditions, etc.) set in the numerical control device 200 from the numerical control device 200 as state information (state s). This state s corresponds to the environment state s in Q-learning.
- the state s in the second embodiment indicates the setting conditions (machining conditions, rocking conditions, etc.) set in the numerical control device 200.
- the setting conditions include machining conditions including the spindle rotation speed and feed rate of the machine tool 100, and rocking conditions including the swing amplitude and swing frequency of the machine tool 100.
- the judgment information acquisition unit 520 acquires the judgment information for calculating the reward for performing Q-learning. Specifically, the determination information acquisition unit 520 acquires the surface roughness data of the machined work by the machine tool 100 as the determination information for calculating the reward for performing Q-learning.
- the action information output unit 530 transmits the action information (action a) generated by the learning unit 540 to the numerical control device 200.
- The numerical control device 200 modifies the current state s, that is, the currently set setting conditions, based on this action a, and thereby transitions to the next state s' (that is, the changed swing condition is executed on the machine tool 100).
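The state transition described above can be sketched as follows. The field names, the representation of an action as increments to the swing frequency and amplitude, and the numeric values are assumptions for illustration; the patent only states that the action indicates how the swing condition should be changed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SettingConditions:
    """State s: machining conditions plus swing conditions (hypothetical fields)."""
    spindle_speed_rpm: float
    feed_rate_mm_rev: float
    swing_frequency_hz: float
    swing_amplitude_mm: float


@dataclass(frozen=True)
class SwingAction:
    """Action a: a change applied to the swing conditions only."""
    d_frequency_hz: float = 0.0
    d_amplitude_mm: float = 0.0


def apply_action(state, action):
    """Return the next state s' with the swing conditions changed by the action."""
    return replace(
        state,
        swing_frequency_hz=state.swing_frequency_hz + action.d_frequency_hz,
        swing_amplitude_mm=state.swing_amplitude_mm + action.d_amplitude_mm,
    )


s = SettingConditions(3000.0, 0.1, 25.0, 0.05)
s_next = apply_action(s, SwingAction(d_frequency_hz=5.0))
```

Keeping the state immutable makes it usable as a dictionary key when a tabular value function is indexed by (state, action).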
- the learning unit 540 learns the value Q (s, a) when selecting a certain action a under the state s of a certain environment.
- the learning unit 540 includes a reward calculation unit 541, a value function update unit 542, and an action information generation unit 543.
- the reward calculation unit 541 calculates the reward when the action a is selected under a certain state s, based on the determination information.
- The reward calculation unit 541 sets the reward to a positive value when the surface roughness data of the machined work is less than a predetermined threshold value, and to a negative value when the surface roughness data exceeds the predetermined threshold value.
- The reward calculation unit 541 determines the predetermined threshold value based on an approximate expression of the theoretical surface roughness. Specifically, the reward calculation unit 541 uses as the threshold the value obtained by multiplying the approximate expression of the theoretical surface roughness by a correction coefficient a.
- The predetermined threshold value is expressed as a × F² / (8R), where F represents the feed amount [mm] and R represents the tool radius.
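The threshold and reward calculation can be sketched as below. The reward magnitudes of ±1, the zero reward when the roughness exactly equals the threshold, and the use of millimetres for the tool radius are assumptions; the patent specifies only the sign of the reward and the form a × F² / (8R).

```python
def roughness_threshold(feed_mm, tool_radius_mm, correction=1.0):
    """Threshold a * F^2 / (8 R) from the theoretical roughness approximation."""
    return correction * feed_mm ** 2 / (8.0 * tool_radius_mm)


def reward(roughness_mm, feed_mm, tool_radius_mm, correction=1.0):
    """Positive reward below the threshold, negative reward above it."""
    threshold = roughness_threshold(feed_mm, tool_radius_mm, correction)
    if roughness_mm < threshold:
        return 1.0
    if roughness_mm > threshold:
        return -1.0
    return 0.0  # assumption: the source does not define the equal case


# F = 0.2 mm, R = 0.8 mm gives 0.04 / 6.4 = 0.00625 mm.
threshold = roughness_threshold(0.2, 0.8)
good = reward(0.005, 0.2, 0.8)
bad = reward(0.010, 0.2, 0.8)
```

Raising the correction coefficient loosens the target: with `correction=2.0` the same 0.010 mm measurement falls below the threshold and earns a positive reward.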
- The value function update unit 542 performs Q-learning based on the state s, the action a, the state s' that results when the action a is applied to the state s, and the reward value calculated as described above, thereby updating the value function Q stored in the value function storage unit 550.
- the value function Q may be updated by online learning, batch learning, or mini-batch learning.
- Online learning is a learning method in which the value function Q is updated immediately each time the state s transitions to a new state s' as a result of applying a certain action a to the current state s.
- Batch learning is a learning method in which learning data are collected by repeatedly applying a certain action a to the current state s so that the state s transitions to a new state s', and the value function Q is then updated using all of the collected learning data.
- Mini-batch learning is a learning method, intermediate between online learning and batch learning, in which the value function Q is updated each time a certain amount of learning data has accumulated.
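- The difference between these update timings can be sketched with the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)). The sketch below is only an illustration: the class name `QLearner`, the table representation of Q, the learning rate α = 0.1, the discount factor γ = 0.9, and the action names are assumptions, not details taken from this description.

```python
from collections import defaultdict


class QLearner:
    def __init__(self, actions, batch_size=1, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # Q(s, a), zero-initialised
        self.actions = list(actions)
        self.batch_size = batch_size  # 1 = online; larger = (mini-)batch
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self._buffer = []

    def _update(self, s, a, r, s_next):
        # Standard Q-learning temporal-difference update.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        td_target = r + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (td_target - self.q[(s, a)])

    def observe(self, s, a, r, s_next):
        """Record a transition; update Q once enough data has accumulated."""
        self._buffer.append((s, a, r, s_next))
        if len(self._buffer) >= self.batch_size:
            for transition in self._buffer:
                self._update(*transition)
            self._buffer.clear()


online = QLearner(actions=["raise", "lower"], batch_size=1)
online.observe("s0", "raise", 1.0, "s1")   # Q is updated immediately

mini = QLearner(actions=["raise", "lower"], batch_size=2)
mini.observe("s0", "raise", 1.0, "s1")     # buffered, Q not yet changed
q_before = mini.q[("s0", "raise")]
mini.observe("s1", "lower", -1.0, "s0")    # batch full, both updates applied
```

- Setting `batch_size` to the total number of collected transitions would correspond to batch learning in this sketch.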
- the action information generation unit 543 generates the action a in the process of Q-learning, and outputs the generated action a to the action information output unit 530. Specifically, the action information generation unit 543 selects the action a in the Q-learning process with respect to the current state s.
- the action a in the second embodiment includes how the swing condition should be changed with respect to the current state s.
- The action information generation unit 543 may select the action a' by a known method, such as the greedy method of selecting the action a' with the highest value Q (s, a) among the currently estimated action values, or the ε-greedy method of selecting an action a' at random with a certain small probability ε and otherwise selecting the action a' with the highest value Q (s, a).
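- The ε-greedy selection mentioned above can be sketched as follows. The function name, the dictionary representation of Q, the default ε of 0.1, and the action names are assumptions made for illustration.

```python
import random


def select_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)                             # random action
    return max(actions, key=lambda a: q.get((state, a), 0.0))  # greedy action


q = {("s", "increase_amplitude"): 0.5, ("s", "decrease_amplitude"): -0.2}
actions = ["increase_amplitude", "decrease_amplitude"]
greedy_choice = select_action(q, "s", actions, epsilon=0.0)
```

- With ε = 0 the function reduces to the plain greedy method, so `greedy_choice` is the action with the highest stored value.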
- the value function storage unit 550 is a storage device that stores the value function Q.
- the value function Q stored in the value function storage unit 550 is updated by the value function update unit 542.
- Based on the value function Q updated by the value function update unit 542 performing Q-learning, the machine learning device 500 generates an action a (hereinafter referred to as "optimized action information") as a swing condition for causing the machine tool 100 to perform rocking machining that maximizes the value Q (s, a).
- The swing condition output unit 560 acquires the value function Q stored in the value function storage unit 550. As described above, this value function Q is updated by the value function update unit 542 performing Q-learning. Then, based on the value function Q, the swing condition output unit 560 generates, as the optimized action information, the optimum swing condition for optimizing the surface roughness of the processed work, and outputs the generated optimum swing condition (optimized action information) to the numerical control device 200.
- the swing condition output unit 560 has a chip shredding condition calculation unit 561 and a swing condition upper limit calculation unit 562.
- the chip shredding condition calculation unit 561 calculates the chip shredding rocking condition that enables the work chips to be shredded by the rocking of the machine tool 100. Then, the swing condition output unit 560 outputs the optimum swing condition that satisfies the chip shredding swing condition.
- The swing condition upper limit calculation unit 562 calculates the upper limit swing condition that does not exceed the preset upper limit value. Then, the swing condition output unit 560 outputs the optimum swing condition that satisfies the upper limit swing condition.
- The numerical control device 200 corrects the currently set swing condition based on the optimum swing condition (optimized action information) and generates an operation command, so that the machine tool 100 can operate such that the surface roughness of the machined work is minimized (optimized).
- FIG. 5 is a flowchart showing the flow of the value function update process by the machine learning device 500 according to the second embodiment.
- In step S21, the setting condition acquisition unit 510 acquires the setting conditions as state information from the numerical control device 200.
- The acquired setting conditions are output to the value function update unit 542 and the action information generation unit 543.
- These setting conditions (state information) correspond to the state s of the environment in Q-learning.
- In step S22, the action information generation unit 543 generates a swing condition as new action information and outputs the generated action information (action a) to the numerical control device 200 via the action information output unit 530.
- On receiving the action information, the numerical control device 200 drives the machine tool 100 to perform rocking machining of the work in the state s', in which the swing condition of the current state s has been changed based on the received action information.
- This action information corresponds to the action a in Q-learning.
- In step S23, the determination information acquisition unit 520 acquires the surface roughness data of the work machined by the machine tool 100 as the determination information for calculating the reward for Q-learning.
- In step S24, the reward calculation unit 541 calculates the reward based on the input determination information (surface roughness data of the processed work).
- Specifically, the reward calculation unit 541 determines whether or not the surface roughness data of the processed work is less than a predetermined threshold value. If the surface roughness data is less than the predetermined threshold value (YES), the process proceeds to step S25. On the other hand, if the surface roughness data exceeds the predetermined threshold value (NO), the process proceeds to step S26.
- In step S25, the reward calculation unit 541 sets the reward to a positive value.
- In step S26, the reward calculation unit 541 sets the reward to a negative value.
- In step S27, the value function update unit 542 updates the value function Q stored in the value function storage unit 550 based on the reward value calculated above. Then, the learning unit 540 returns to step S21, and by repeating the above processing, the value function Q converges to an appropriate value.
- The learning unit 540 may terminate the above processing on the condition that it has been repeated a predetermined number of times or for a predetermined time.
- the operation of the machine learning device 500 has been described above, but for example, the process of calculating the reward value is an example and is not limited to this.
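- The loop of steps S21 to S27 can be sketched end to end as follows. This is a toy illustration under stated assumptions: the three candidate amplitudes, the index-shift actions, the made-up `measure_roughness` stand-in for the measured surface roughness, and all numeric parameters are hypothetical, and the loop ends after a fixed number of repetitions.

```python
import random
from collections import defaultdict

AMPLITUDES = [0.5, 1.0, 1.5]   # candidate swing amplitudes (hypothetical)
ACTIONS = [-1, 0, 1]           # move to a lower / the same / a higher amplitude index


def measure_roughness(amplitude):
    """Stand-in for the measured surface roughness (made-up model)."""
    return abs(amplitude - 1.0) + 0.1


def train(repetitions=200, threshold=0.3, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    s = 0                                                  # S21: acquire the state
    for _ in range(repetitions):
        if rng.random() < epsilon:                         # S22: generate an action
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q[(s, x)])
        s_next = min(max(s + a, 0), len(AMPLITUDES) - 1)   # changed swing condition
        roughness = measure_roughness(AMPLITUDES[s_next])  # S23: determination info
        r = 1.0 if roughness < threshold else -1.0         # S24 to S26: reward
        best_next = max(q[(s_next, x)] for x in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])  # S27: update Q
        s = s_next
    return q


q = train()
```

- In this toy setup only the middle amplitude yields a roughness below the threshold, so the value learned for moving from the lowest amplitude to the middle one becomes positive.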
- FIG. 6 is a flowchart showing a flow of processing for outputting a swing condition by the machine learning device 500 according to the second embodiment.
- In step S31, the swing condition output unit 560 acquires the value function Q stored in the value function storage unit 550.
- As described above, this value function Q has been updated by the value function update unit 542 performing Q-learning.
- In step S32, based on the value function Q, the swing condition output unit 560 generates the optimum swing condition by selecting, as the optimum action, the action a having the highest value Q (s, a) among the actions a that can be taken, for example, in the currently set state s.
- In step S33, the chip shredding condition calculation unit 561 calculates the chip shredding swing condition that enables the chips of the work to be shredded by the rocking of the machine tool 100.
- In step S34, the swing condition upper limit calculation unit 562 calculates the upper limit swing condition that does not exceed the preset upper limit value.
- In step S35, the swing condition output unit 560 determines whether or not the optimum swing condition generated in step S32 satisfies the chip shredding swing condition calculated in step S33. If the optimum swing condition satisfies the chip shredding swing condition (YES), the process proceeds to step S36. On the other hand, if the optimum swing condition does not satisfy the chip shredding swing condition (NO), the process returns to step S31.
- In step S36, the swing condition output unit 560 determines whether or not the optimum swing condition generated in step S32 satisfies the upper limit swing condition calculated in step S34. If the optimum swing condition satisfies the upper limit swing condition (YES), the process proceeds to step S37. On the other hand, if the optimum swing condition does not satisfy the upper limit swing condition (NO), the process returns to step S31.
- In step S37, the swing condition output unit 560 outputs the generated optimum swing condition (optimized action information) to the numerical control device 200.
- the numerical control device 200 corrects the currently set state s (that is, the currently set swing condition) based on the optimum swing condition, and generates an operation command. Then, the numerical control device 200 sends the generated operation command to the machine tool 100, so that the machine tool 100 can operate so that the surface roughness data of the machined work becomes optimum (minimum).
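- The output processing above can be sketched as a constrained selection. Instead of looping back to step S31 when a check fails, this sketch walks the candidates in descending order of Q and returns the first one that passes both checks, which is one possible reading of the flowchart rather than the described procedure itself. The candidate representation as (amplitude, frequency) pairs and the two amplitude-based checks are placeholders, since the actual chip shredding and upper limit calculations are not reproduced here.

```python
def optimum_swing_condition(q_values, min_amplitude, max_amplitude):
    """Return the highest-valued candidate that passes both checks, or None.

    q_values maps a candidate (amplitude, frequency) pair to its value Q.
    """
    for candidate in sorted(q_values, key=q_values.get, reverse=True):  # S32
        amplitude, _frequency = candidate
        shreds_chips = amplitude >= min_amplitude   # S35 (placeholder rule)
        within_limit = amplitude <= max_amplitude   # S36 (placeholder rule)
        if shreds_chips and within_limit:
            return candidate                        # S37: output
    return None


q_values = {(0.4, 1.5): 0.9, (0.8, 1.5): 0.7, (2.5, 0.5): 0.8}
best = optimum_swing_condition(q_values, min_amplitude=0.5, max_amplitude=2.0)
```

- Here the two highest-valued candidates are rejected (one fails the placeholder chip shredding check, the other exceeds the placeholder upper limit), so the third candidate is returned.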
- The machine learning device 500 includes: the setting condition acquisition unit 510, which acquires the setting conditions for rocking machining as state information; the judgment information acquisition unit 520, which acquires the surface roughness data of the work machined by the machine tool 100 as judgment information; the action information output unit 530, which outputs action information indicating how the swing condition should be changed with respect to the current state; the reward calculation unit 541, which calculates the reward value in reinforcement learning based on the judgment information; the value function update unit 542, which updates the value function that determines the value of the swing condition of the machine tool 100 based on the state information, the action information, and the reward; and the swing condition output unit 560, which outputs, based on the value function, an optimum swing condition that optimizes the surface roughness of the machined work.
- Compared with the conventional technique, in which it is difficult to set swing conditions that take surface roughness into account, the machine learning device 500 can output an optimum swing condition optimized in consideration of the surface roughness. Further, since the machine learning device 500 can automate the setting of the swing condition, the burden on the operator can be reduced.
- The reward calculation unit 541 sets the reward to a positive value when the surface roughness data of the processed work is less than a predetermined threshold value, and sets the reward to a negative value when the surface roughness data exceeds the predetermined threshold value. As a result, the machine learning device 500 can determine the value of the reward in consideration of the surface roughness data.
- the reward calculation unit 541 determines a predetermined threshold value based on an approximate expression of theoretical surface roughness. Thereby, the machine learning device 500 can determine the value of the reward in consideration of the theoretical surface roughness.
- The swing condition output unit 560 includes the chip shredding condition calculation unit 561, which calculates a chip shredding swing condition that enables the chips of the work to be shredded, and the swing condition output unit 560 outputs the optimum swing condition that satisfies the chip shredding swing condition. As a result, the machine learning device 500 can set an optimum swing condition under which chips are shredded and which is optimized in consideration of the surface roughness.
- The swing condition output unit 560 includes the swing condition upper limit calculation unit 562, which calculates an upper limit swing condition that does not exceed a preset upper limit value, and the swing condition output unit 560 outputs the optimum swing condition that satisfies the upper limit swing condition.
- As a result, the machine learning device 500 can set an optimum swing condition under which chips are shredded, which does not exceed the upper limit of the swing condition, and which is optimized in consideration of the surface roughness.
- Next, the control system 20 according to the third embodiment will be described.
- the same components as those of the first embodiment are designated by the same reference numerals, and the description thereof will be omitted or simplified.
- The control system 20 according to the third embodiment differs from the first embodiment mainly in that the roundness data of the machined work is used instead of the surface roughness data of the machined work; the other configurations are the same as those of the first embodiment.
- FIG. 7 is a diagram showing an outline of the control system 20 according to the third embodiment.
- The purpose of the control system 20 is to use machine learning to calculate a swing condition under which the tool swings in the machine tool 100 and good roundness data can be achieved.
- the control system 20 includes a machine tool 100, a numerical control device 200, a machined surface analysis device 600, and a machine learning device 700.
- the machine tool 100 and the numerical control device 200 are connected in a one-to-one pair so as to be able to communicate with each other.
- the machine tool 100 and the numerical control device 200 may be directly connected via a connection interface, or may be connected via a network such as a LAN (Local Area Network).
- the numerical control device 200, the machined surface analysis device 600, and the machine learning device 700 are each directly connected via a connection interface or connected via a network, and can communicate with each other.
- the network is, for example, a LAN constructed in a factory, the Internet, a public telephone network, or a combination thereof.
- the specific communication method in the network and whether it is a wired connection or a wireless connection are not particularly limited.
- the machine tool 100 and the numerical control device 200 have the same configuration as that of the first embodiment as described above.
- The machined surface analysis device 600 measures and calculates the roundness data of the work machined by the machine tool 100, and outputs the calculated roundness data to the machine learning device 700.
- the machine learning device 700 performs machine learning using the setting conditions received from the numerical control device 200 and the roundness data received from the machined surface analysis device 600. Then, the machine learning device 700 builds a learning model for outputting the swing condition by performing machine learning.
- the machine learning device 700 includes a setting condition acquisition unit 710, a label acquisition unit 720, a learning unit 730, a learning model storage unit 740, and a swing condition output unit 750.
- The setting condition acquisition unit 710 acquires, from the numerical control device 200, the setting conditions including the machining conditions and the swing conditions for rocking machining.
- the label acquisition unit 720 acquires the roundness data of the machined work by the machine tool 100 from the machined surface analysis device 600 as a label.
- the label is the correct output that should correspond to the input in machine learning.
- This set of setting conditions and labels corresponds to teacher data in machine learning.
- the learning unit 730 builds a learning model by performing machine learning based on the input teacher data. That is, the learning unit 730 learns the swing condition that optimizes the roundness of the processed work.
- the learning model constructed by the learning unit 730 is output to the learning model storage unit 740.
- the learning model storage unit 740 stores the learning model learned by the learning unit 730.
- the learning model stored in the learning model storage unit 740 is used by the swing condition output unit 750.
- The learning unit 730 may update a learning model that has already been constructed by performing further supervised learning on the learning model stored in the learning model storage unit 740.
- The machine learning device 700 may share the learning model stored in the learning model storage unit 740 with other machine learning devices. If the learning model is shared among a plurality of machine learning devices, each machine learning device can perform further supervised learning in a distributed manner, so that the efficiency and accuracy of supervised learning can be improved.
- the rocking condition output unit 750 outputs the optimum rocking condition for optimizing the roundness of the processed work based on the learning model stored in the learning model storage unit 740. Further, the swing condition output unit 750 includes a chip shredding condition calculation unit 751 and a swing condition upper limit calculation unit 752.
- the chip shredding condition calculation unit 751 calculates the chip shredding rocking condition that enables the work chips to be shredded by the rocking of the machine tool 100. Then, the swing condition output unit 750 outputs the optimum swing condition that satisfies the chip shredding swing condition based on the learning model.
- the swing condition upper limit calculation unit 752 calculates the upper limit swing condition that does not exceed the preset upper limit value. Then, the swing condition output unit 750 outputs the optimum swing condition satisfying the upper limit swing condition based on the learning model.
- FIG. 8 is a flowchart showing a flow of a learning model construction process by the machine learning device 700 according to the third embodiment.
- In step S41, the setting condition acquisition unit 710 acquires, from the numerical control device 200, the setting conditions including the machining conditions and the swing conditions for rocking machining.
- In step S42, the label acquisition unit 720 acquires the roundness data of the work machined by the machine tool 100 from the machined surface analysis device 600 as a label.
- In step S43, when teacher data consisting of a set of the setting conditions and the label is input, the learning unit 730 executes machine learning based on the input teacher data.
- The learning unit 730 performs supervised learning by, for example, regression analysis, a neural network, the least squares method, the stepwise method, or the like.
- In step S44, the learning unit 730 determines whether or not to end the supervised learning.
- The conditions for ending supervised learning can be determined arbitrarily.
- For example, the learning unit 730 may end the supervised learning when the error between the output of the neural network and the label becomes equal to or less than a predetermined value. Further, the learning unit 730 may end the supervised learning when the supervised learning has been repeated a predetermined number of times.
- If the supervised learning is to be ended (YES), the process proceeds to step S45.
- On the other hand, if the supervised learning is not to be ended (NO), the process returns to step S43.
- In step S45, the learning unit 730 outputs the learning model constructed by the supervised learning up to that point to the learning model storage unit 740, which stores it. Further, when new teacher data is acquired, the machine learning device 700 can perform further machine learning on the learning model. After that, the machine learning device 700 ends this process.
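- The construction loop of steps S43 to S45, including the two example end conditions, can be sketched as a least squares fit by gradient descent. This is only an illustration: it assumes a linear model that predicts roundness from a single hypothetical input such as the swing amplitude, and the function name, learning rate, tolerance, and sample values are all made up.

```python
def fit(samples, lr=0.1, tolerance=1e-6, max_epochs=10_000):
    """Fit roundness ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for epoch in range(1, max_epochs + 1):
        # S43: one pass of learning over the teacher data (input x, label y)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
        # S44: end when the error is small enough (or the repetitions run out)
        mse = sum((w * x + b - y) ** 2 for x, y in samples) / n
        if mse <= tolerance:
            break
    return w, b, epoch   # S45: the fitted parameters play the role of the model


# Made-up teacher data lying exactly on y = 2x + 1
w, b, epochs = fit([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

- On this made-up data the loop ends through the error condition well before the repetition limit, with w close to 2 and b close to 1.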
- FIG. 9 is a flowchart showing a flow of output processing of swing conditions by the machine learning device 700 according to the third embodiment.
- In step S51, the setting condition acquisition unit 710 acquires, from the numerical control device 200, the setting conditions including the machining conditions and the swing conditions for rocking machining.
- In step S52, the chip shredding condition calculation unit 751 calculates the chip shredding swing condition that enables the chips of the work to be shredded by the rocking of the machine tool 100.
- In step S53, the swing condition upper limit calculation unit 752 calculates the upper limit swing condition that does not exceed the preset upper limit value.
- In step S54, the swing condition output unit 750 outputs the optimum swing condition that optimizes the roundness of the processed work, based on the setting conditions acquired in step S51 and the learning model stored in the learning model storage unit 740.
- In step S55, the swing condition output unit 750 determines whether or not the optimum swing condition output in step S54 satisfies the chip shredding swing condition calculated in step S52.
- If the optimum swing condition satisfies the chip shredding swing condition (YES), the process proceeds to step S56.
- On the other hand, if the optimum swing condition does not satisfy the chip shredding swing condition (NO), the process returns to step S54.
- In step S56, the swing condition output unit 750 determines whether or not the optimum swing condition output in step S54 satisfies the upper limit swing condition calculated in step S53.
- If the optimum swing condition satisfies the upper limit swing condition (YES), the process proceeds to step S57.
- On the other hand, if the optimum swing condition does not satisfy the upper limit swing condition (NO), the process returns to step S54.
- In step S57, the swing condition output unit 750 outputs the optimum swing condition to the numerical control device 200, and then ends the process.
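- The role of the learning model in this output processing can be sketched as follows. For brevity the sketch filters the candidates by both conditions first and then picks the one with the smallest predicted roundness, which differs in order from the flowchart; it also assumes that a smaller roundness value is better. The candidate dictionaries, the amplitude-based checks, and `toy_model` (a stand-in for the stored learning model) are hypothetical.

```python
def choose_swing_condition(candidates, predict_roundness, min_amplitude, max_amplitude):
    """Pick the feasible candidate with the smallest predicted roundness."""
    feasible = [
        c for c in candidates
        if min_amplitude <= c["amplitude"] <= max_amplitude  # S55 and S56 (placeholders)
    ]
    if not feasible:
        return None
    return min(feasible, key=predict_roundness)              # S54 and S57


def toy_model(condition):
    # Made-up stand-in for the stored learning model.
    return (condition["amplitude"] - 1.0) ** 2 + 0.01 * condition["frequency"]


candidates = [
    {"amplitude": 0.2, "frequency": 1.0},
    {"amplitude": 0.9, "frequency": 1.5},
    {"amplitude": 1.6, "frequency": 0.5},
]
chosen = choose_swing_condition(candidates, toy_model, 0.5, 2.0)
```

- The first candidate is excluded by the placeholder lower bound, and of the remaining two the toy model predicts the smaller roundness for the second.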
- The machine learning device 700 includes: the setting condition acquisition unit 710, which acquires the setting conditions for rocking machining; the label acquisition unit 720, which acquires the roundness data of the work machined by the machine tool 100 as a label; and the learning unit 730, which performs supervised learning using a set of the setting conditions and the label as teacher data. The learning unit 730 constructs a learning model that learns the swing condition optimizing the roundness of the processed work.
- Compared with the conventional technique, in which it is difficult to set swing conditions that take roundness into account, the machine learning device 700 can learn an optimum swing condition optimized in consideration of the roundness.
- the machine learning device 700 further includes a swing condition output unit 750 that outputs an optimum swing condition that optimizes the roundness of the machined work based on the learning model.
- Compared with the conventional technique, in which it is difficult to set swing conditions that take roundness into account, the machine learning device 700 can output an optimum swing condition optimized in consideration of the roundness. Further, since the machine learning device 700 can automate the setting of the swing condition, the burden on the operator can be reduced.
- The swing condition output unit 750 includes the chip shredding condition calculation unit 751, which calculates a chip shredding swing condition that enables the chips of the work to be shredded, and the swing condition output unit 750 outputs, based on the learning model, the optimum swing condition that satisfies the chip shredding swing condition. As a result, the machine learning device 700 can set an optimum swing condition under which chips are shredded and which is optimized in consideration of the roundness.
- The swing condition output unit 750 includes the swing condition upper limit calculation unit 752, which calculates an upper limit swing condition that does not exceed a preset upper limit value, and the swing condition output unit 750 outputs, based on the learning model, the optimum swing condition that satisfies the upper limit swing condition.
- As a result, the machine learning device 700 can set an optimum swing condition under which chips are shredded, which does not exceed the upper limit of the swing condition, and which is optimized in consideration of the roundness.
- Next, the control system 30 according to the fourth embodiment will be described.
- the same components as those of the third embodiment are designated by the same reference numerals, and the description thereof will be omitted or simplified.
- The control system 30 according to the fourth embodiment differs from the third embodiment mainly in that reinforcement learning is used instead of supervised learning; the other configurations are the same as those of the third embodiment.
- FIG. 10 is a diagram showing an outline of the control system 30 according to the fourth embodiment.
- The purpose of the control system 30 is to use machine learning to calculate a swing condition under which the tool swings in the machine tool 100 and good roundness data can be achieved.
- the control system 30 includes a machine tool 100, a numerical control device 200, a machined surface analysis device 600, and a machine learning device 800.
- the machine tool 100 and the numerical control device 200 are connected in a one-to-one pair so as to be able to communicate with each other.
- the machine tool 100 and the numerical control device 200 may be directly connected via a connection interface, or may be connected via a network such as a LAN (Local Area Network).
- the numerical control device 200, the machined surface analysis device 600, and the machine learning device 800 are each directly connected via a connection interface or connected via a network, and can communicate with each other.
- the network is, for example, a LAN constructed in a factory, the Internet, a public telephone network, or a combination thereof.
- the specific communication method in the network and whether it is a wired connection or a wireless connection are not particularly limited.
- the machine tool 100, the numerical control device 200, and the machined surface analysis device 600 have the same configurations as those in the third embodiment as described above.
- the machine learning device 800 is a device that performs reinforcement learning. Since the processing related to reinforcement learning is the same as that of the machine learning device 500 according to the second embodiment, the description thereof will be omitted.
- The machine learning device 800 includes a setting condition acquisition unit 810, a judgment information acquisition unit 820, an action information output unit 830, a learning unit 840, a value function storage unit 850, and a swing condition output unit 860.
- the setting condition acquisition unit 810 acquires the setting conditions (machining conditions, swing conditions, etc.) set in the numerical control device 200 from the numerical control device 200 as state information (state s).
- This state s corresponds to the environment state s in Q-learning.
- the state s in the fourth embodiment indicates the setting conditions (machining conditions, rocking conditions, etc.) set in the numerical control device 200.
- the setting conditions include machining conditions including the spindle rotation speed and feed rate of the machine tool 100, and rocking conditions including the swing amplitude and swing frequency of the machine tool 100.
- the judgment information acquisition unit 820 acquires the judgment information for calculating the reward for performing Q-learning. Specifically, the determination information acquisition unit 820 acquires the roundness data of the processed work by the machine tool 100 as the determination information for calculating the reward for performing Q-learning.
- the action information output unit 830 transmits the action information (action a) generated by the learning unit 840 to the numerical control device 200.
- The numerical control device 200 changes the current state s, that is, the currently set setting conditions, based on this action a, and thereby transitions to the next state s' (that is, rocking machining under the changed swing conditions is executed on the machine tool 100).
- the learning unit 840 learns the value Q (s, a) when selecting a certain action a under the state s of a certain environment.
- the learning unit 840 includes a reward calculation unit 841, a value function update unit 842, and an action information generation unit 843.
- the reward calculation unit 841 calculates the reward when the action a is selected under a certain state s, based on the determination information.
- The reward calculation unit 841 sets the reward to a positive value when the roundness data of the processed work is less than a predetermined threshold value, and sets the reward to a negative value when the roundness data exceeds the predetermined threshold value.
- The value function update unit 842 performs Q-learning based on the state s, the action a, the state s' reached when the action a is applied to the state s, and the reward value calculated as described above, thereby updating the value function Q stored in the value function storage unit 850.
- the value function Q may be updated by online learning, batch learning, or mini-batch learning.
- the action information generation unit 843 generates the action a in the process of Q-learning, and outputs the generated action a to the action information output unit 830. Specifically, the action information generation unit 843 selects the action a in the Q-learning process with respect to the current state s.
- the action a in the fourth embodiment includes how the swing condition should be changed with respect to the current state s.
- The action information generation unit 843 may select the action a' by a known method, such as the greedy method of selecting the action a' with the highest value Q (s, a) among the currently estimated action values, or the ε-greedy method of selecting an action a' at random with a certain small probability ε and otherwise selecting the action a' with the highest value Q (s, a).
- the value function storage unit 850 is a storage device that stores the value function Q.
- the value function Q stored in the value function storage unit 850 is updated by the value function update unit 842.
- Based on the value function Q updated by the value function update unit 842 performing Q-learning, the machine learning device 800 generates an action a (hereinafter referred to as "optimized action information") as a swing condition for causing the machine tool 100 to perform rocking machining that maximizes the value Q (s, a).
- The swing condition output unit 860 acquires the value function Q stored in the value function storage unit 850. As described above, this value function Q is updated by the value function update unit 842 performing Q-learning. Then, based on the value function Q, the swing condition output unit 860 generates, as the optimized action information, the optimum swing condition for optimizing the roundness of the processed work, and outputs the generated optimum swing condition (optimized action information) to the numerical control device 200.
- the swing condition output unit 860 has a chip shredding condition calculation unit 861 and a swing condition upper limit calculation unit 862.
- the chip shredding condition calculation unit 861 calculates the chip shredding rocking condition that enables the work chips to be shredded by the rocking of the machine tool 100. Then, the swing condition output unit 860 outputs the optimum swing condition that satisfies the chip shredding swing condition.
- The swing condition upper limit calculation unit 862 calculates the upper limit swing condition that does not exceed the preset upper limit value. Then, the swing condition output unit 860 outputs the optimum swing condition that satisfies the upper limit swing condition.
- The numerical control device 200 corrects the currently set swing condition based on the optimum swing condition (optimized action information) and generates an operation command, so that the machine tool 100 can operate such that the roundness of the machined work is optimized.
- FIG. 11 is a flowchart showing the flow of the value function update process by the machine learning device 800 according to the fourth embodiment.
- In step S61, the setting condition acquisition unit 810 acquires the setting condition as state information from the numerical control device 200.
- The acquired setting conditions are output to the value function update unit 842 and the action information generation unit 843.
- This setting condition (state information) corresponds to the state s of the environment in Q-learning.
- In step S62, the action information generation unit 843 generates a swing condition as new action information and outputs the generated action information (action a) to the numerical control device 200 via the action information output unit 830.
- On receiving the action information, the numerical control device 200 drives the machine tool 100 to perform swing machining of the workpiece in the state s', in which the swing condition of the current state s has been changed based on the received action information. As described above, this action information corresponds to the action a in Q-learning.
- In step S63, the determination information acquisition unit 820 acquires the roundness data of the workpiece machined by the machine tool 100 as the determination information used to calculate the reward for Q-learning.
- In step S64, the reward calculation unit 841 calculates the reward based on the input determination information (roundness data of the machined workpiece).
- Specifically, the reward calculation unit 841 determines whether the roundness data of the machined workpiece is less than a predetermined threshold value. If the roundness data is less than the threshold value (YES), the process proceeds to step S65. If the roundness data exceeds the threshold value (NO), the process proceeds to step S66.
- In step S65, the reward calculation unit 841 sets the reward value to a positive value.
- In step S66, the reward calculation unit 841 sets the reward value to a negative value.
- In step S67, the value function update unit 842 updates the value function Q stored in the value function storage unit 850 based on the reward value calculated above. The learning unit 840 then returns to step S61, and by repeating the above processing the value function Q converges to an appropriate value.
- The learning unit 840 may end the above processing on the condition that it has been repeated a predetermined number of times or for a predetermined time.
- The operation of the machine learning device 800 has been described above. The process of calculating the reward value, for example, is only one example, and the operation is not limited to it.
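- The value function update loop of FIG. 11 can be sketched as tabular Q-learning. This is a minimal illustration rather than the implementation described in the patent: the discretized state (frequency level, amplitude level), the action set, the learning parameters, the threshold, and the `simulated_roundness` stand-in for machining and measurement are all assumptions made for the example.

```python
import random
from collections import defaultdict

ALPHA = 0.1      # learning rate (assumed value)
GAMMA = 0.9      # discount factor (assumed value)
EPSILON = 0.2    # exploration probability (assumed value)
THRESHOLD = 2.5  # roundness threshold in arbitrary units (assumed value)

# Hypothetical actions: change (frequency level, amplitude level) by one step.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]
LEVELS = 6  # each level is an integer in 0..5 (assumed discretization)


def apply_action(state, action):
    """Return the next state s' with both levels clamped to the valid range."""
    return tuple(min(LEVELS - 1, max(0, s + a)) for s, a in zip(state, action))


def simulated_roundness(state):
    """Stand-in for machining plus measurement (S63); purely illustrative."""
    freq, amp = state
    return 1.0 + abs(freq - 3) + abs(amp - 2)


def reward_from_roundness(roundness, threshold=THRESHOLD):
    """S64 to S66: positive below the threshold, negative otherwise.

    Treating a value exactly equal to the threshold as negative is an assumption.
    """
    return 1.0 if roundness < threshold else -1.0


def choose_action(q, state, rng):
    """Epsilon-greedy choice of the next swing-condition change (S62)."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def update_q(q, state, action, reward, next_state):
    """Standard tabular Q-learning update (S67)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])


def train(episodes=200, steps=20, seed=0):
    """Repeat S61 to S67 a fixed number of times and return the Q table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = (rng.randrange(LEVELS), rng.randrange(LEVELS))  # S61
        for _ in range(steps):
            action = choose_action(q, state, rng)                # S62
            next_state = apply_action(state, action)
            roundness = simulated_roundness(next_state)          # S63
            reward = reward_from_roundness(roundness)            # S64 to S66
            update_q(q, state, action, reward, next_state)       # S67
            state = next_state
    return q
```

- Stopping after a fixed number of repetitions corresponds to the termination condition mentioned above; a time-based limit would work equally well in this sketch.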
- FIG. 12 is a flowchart showing a flow of processing for outputting a swing condition by the machine learning device 800 according to the fourth embodiment.
- In step S71, the swing condition output unit 860 acquires the value function Q stored in the value function storage unit 850.
- As described above, this value function Q is updated by the value function update unit 842 performing Q-learning.
- In step S72, based on the value function Q, the swing condition output unit 860 generates the optimum swing condition by selecting, as the optimized action, the action a with the highest value Q(s, a) among the actions a that can be taken in, for example, the currently set state s.
- In step S73, the chip shredding condition calculation unit 861 calculates the chip shredding swing condition under which the chips of the workpiece can be shredded by the swinging of the machine tool 100.
- In step S74, the swing condition upper limit calculation unit 862 calculates the upper limit swing condition, which does not exceed the preset upper limit value.
- In step S75, the swing condition output unit 860 determines whether the optimum swing condition generated in step S72 satisfies the chip shredding swing condition calculated in step S73.
- If the optimum swing condition satisfies the chip shredding swing condition (YES), the process proceeds to step S76.
- If the optimum swing condition does not satisfy the chip shredding swing condition (NO), the process returns to step S71.
- In step S76, the swing condition output unit 860 determines whether the optimum swing condition generated in step S72 satisfies the upper limit swing condition calculated in step S74.
- If the optimum swing condition satisfies the upper limit swing condition (YES), the process proceeds to step S77.
- If the optimum swing condition does not satisfy the upper limit swing condition (NO), the process returns to step S71.
- In step S77, the swing condition output unit 860 outputs the generated optimum swing condition (optimized action information) to the numerical control device 200.
- The numerical control device 200 corrects the currently set state s (that is, the currently set swing condition) based on the optimum swing condition and generates an operation command. The numerical control device 200 then sends the generated operation command to the machine tool 100, so that the machine tool 100 can operate in a way that optimizes the roundness data of the machined workpiece.
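- The output flow of FIG. 12 can be sketched as a constrained selection over the value function. The sketch below is a simplification under stated assumptions: a swing condition is modeled as a hypothetical (frequency, amplitude) pair, the chip shredding check is passed in as a caller-supplied predicate because the text does not give its formula, and instead of returning to step S71 when a check fails, the candidates are simply tried in descending order of Q(s, a).

```python
def select_swing_condition(q, state, candidates, shreds_chips, upper_limit):
    """Return the highest-valued candidate that passes both checks, or None.

    q            : dict mapping (state, swing_condition) to a value Q(s, a)
    candidates   : iterable of (frequency, amplitude) pairs (hypothetical form)
    shreds_chips : predicate standing in for the chip shredding swing condition
    upper_limit  : (max_frequency, max_amplitude), the preset upper limit values
    """
    # S72: rank the possible actions by their learned value, best first.
    ranked = sorted(candidates, key=lambda a: q.get((state, a), 0.0), reverse=True)
    max_freq, max_amp = upper_limit
    for freq, amp in ranked:
        if not shreds_chips(freq, amp):        # S75: chip shredding check
            continue
        if freq > max_freq or amp > max_amp:   # S76: upper limit check
            continue
        return (freq, amp)                     # S77: output to the controller
    return None  # no candidate satisfies both conditions
```

- For example, with `q = {("s", (1.0, 0.1)): 0.9, ("s", (9.0, 0.5)): 0.8, ("s", (2.0, 0.5)): 0.5}`, a predicate requiring an amplitude of at least 0.3, and an upper limit of `(5.0, 1.0)`, the first candidate fails the chip shredding check, the second exceeds the frequency limit, and `(2.0, 0.5)` is returned.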
- The machine learning device 800 includes: a setting condition acquisition unit 810 that acquires setting conditions for swing machining as state information; a determination information acquisition unit 820 that acquires roundness data of the workpiece machined by the machine tool 100 as determination information; an action information output unit 830 that outputs action information indicating how the swing condition should be changed with respect to the current state; a reward calculation unit 841 that calculates the reward value in reinforcement learning based on the determination information; a value function update unit 842 that updates, based on the state information, the action information, and the reward, the value function that determines the value of the swing condition of the machine tool 100; and a swing condition output unit 860 that outputs, based on the value function, an optimum swing condition that optimizes the roundness of the machined workpiece.
- As a result, compared with the conventional technique, in which it is difficult to set the swing condition in consideration of roundness, the machine learning device 800 can output an optimum swing condition optimized in consideration of roundness. Further, because the machine learning device 800 can automate the setting of the swing condition, the burden on the operator can be reduced.
- The reward calculation unit 841 sets the reward to a positive value when the roundness data of the machined workpiece is less than a predetermined threshold value, and to a negative value when the roundness data exceeds the predetermined threshold value. As a result, the machine learning device 800 can determine the reward value in consideration of the roundness data.
- The swing condition output unit 860 includes a chip shredding condition calculation unit 861 that calculates a chip shredding swing condition under which the chips of the workpiece can be shredded, and the swing condition output unit 860 outputs an optimum swing condition that satisfies the chip shredding swing condition. As a result, the machine learning device 800 can set an optimum swing condition under which chips are shredded and that is optimized in consideration of roundness.
- The swing condition output unit 860 includes a swing condition upper limit calculation unit 862 that calculates an upper limit swing condition that does not exceed a preset upper limit value, and the swing condition output unit 860 outputs an optimum swing condition that satisfies the upper limit swing condition.
- As a result, the machine learning device 800 can set an optimum swing condition under which chips are shredded, the upper limit of the swing condition is not exceeded, and roundness is taken into account.
- The machine learning devices 400, 500, 700, 800 described above use the surface roughness data or the roundness data of the machined workpiece, but machine learning devices according to other embodiments may use other data as the evaluation data of the machined workpiece.
- For example, the evaluation data of the machined workpiece may include surface roughness, roundness, or dimensional accuracy.
- Here, dimensional accuracy indicates whether the workpiece has been machined to the shape specified by the machining program.
- When supervised learning is used, the machine learning device is a machine learning device that learns the swing conditions of a machine tool that performs machining while swinging a tool and a workpiece relative to each other. It includes a setting condition acquisition unit that acquires the setting conditions for swing machining, a label acquisition unit that acquires the evaluation data of the workpiece machined by the machine tool as a label, and a learning unit that performs supervised learning using pairs of the setting conditions and the label as training data. The learning unit includes a learning model that learns swing conditions that optimize the evaluation data of the machined workpiece.
- The machine learning device may further include a swing condition output unit that outputs, based on the learning model, an optimum swing condition that optimizes the evaluation data of the machined workpiece.
- As a result, compared with the conventional technique, in which it is difficult to set the swing condition in consideration of the evaluation data of the machined workpiece, the machine learning device can output a swing condition optimized in consideration of that evaluation data. Further, because the machine learning device can automate the setting of the swing condition, the burden on the operator can be reduced.
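- The supervised variant can be sketched as a regression from setting conditions to the evaluation data, followed by a search over candidate swing conditions. The text does not specify the model, so the k-nearest-neighbour regressor below is a swapped-in stand-in chosen for brevity. The feature layout (feed speed, spindle speed, swing frequency, swing amplitude), the class and function names, and the assumption that a lower evaluation value is better are all illustrative.

```python
import math


class NearestNeighbourModel:
    """Tiny k-nearest-neighbour regressor standing in for the learning model."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []

    def fit(self, conditions, labels):
        """Store (setting condition, evaluation label) pairs as training data."""
        self.samples = list(zip(conditions, labels))

    def predict(self, condition):
        """Average the labels of the k closest stored setting conditions."""
        if not self.samples:
            raise ValueError("model has not been fitted")
        nearest = sorted(
            self.samples, key=lambda s: math.dist(s[0], condition)
        )[: self.k]
        return sum(label for _, label in nearest) / len(nearest)


def best_swing_condition(model, machining, candidates):
    """Pick the candidate swing condition with the lowest predicted evaluation.

    machining  : tuple of machining-condition features, e.g. (feed, spindle)
    candidates : iterable of swing-condition tuples, e.g. (frequency, amplitude)
    """
    return min(candidates, key=lambda swing: model.predict(machining + swing))
```

- The features are used unscaled here for simplicity. In practice, quantities with very different ranges, such as spindle speed and amplitude, would normally be normalized before computing distances.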
- When reinforcement learning is used, the machine learning device is a machine learning device that learns the swing conditions of a machine tool that performs machining while swinging a tool and a workpiece relative to each other. It includes: a setting condition acquisition unit that acquires the setting conditions for swing machining as state information; a determination information acquisition unit that acquires the evaluation data of the workpiece machined by the machine tool as determination information; an action information output unit that outputs action information indicating how the swing condition should be changed with respect to the current state; a reward calculation unit that calculates the reward value in reinforcement learning based on the determination information; a value function update unit that updates, based on the state information, the action information, and the reward, the value function that determines the value of the swing condition of the machine tool; and a swing condition output unit that outputs, based on the value function, an optimum swing condition that optimizes the evaluation data of the machined workpiece. Further, the reward calculation unit sets the reward to a positive value when the evaluation data of the machined workpiece is less than a predetermined threshold value, and to a negative value when the evaluation data exceeds the predetermined threshold value.
- As a result, compared with the conventional technique, in which it is difficult to set the swing condition in consideration of the evaluation data of the machined workpiece, the machine learning device can output a swing condition optimized in consideration of that evaluation data. Further, because the machine learning device can automate the setting of the swing condition, the burden on the operator can be reduced.
- The setting conditions according to the first to fourth embodiments include a machining condition, which includes at least one of the tool feed speed, the spindle rotation speed, the coordinate values, the tool cutting edge, the tool type, and the work material in the machine tool 100, and a swing condition, which includes at least one of the swing frequency and the swing amplitude of the machine tool 100.
- the machine learning devices 400, 500, 700, 800 can acquire setting conditions including appropriate machining conditions and swing conditions.
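- One way to represent such setting conditions in software is a pair of small record types that enforce the "at least one of" requirement. The class names, field names, and field types below are hypothetical and chosen only for illustration; the text does not prescribe a data format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MachiningCondition:
    """Machining condition: at least one field must be provided."""
    feed_speed: Optional[float] = None
    spindle_speed: Optional[float] = None
    coordinates: Optional[Tuple[float, ...]] = None
    tool_cutting_edge: Optional[str] = None
    tool_type: Optional[str] = None
    work_material: Optional[str] = None

    def __post_init__(self):
        if all(value is None for value in vars(self).values()):
            raise ValueError("machining condition needs at least one item")


@dataclass(frozen=True)
class SwingCondition:
    """Swing condition: swing frequency and/or swing amplitude."""
    frequency: Optional[float] = None
    amplitude: Optional[float] = None

    def __post_init__(self):
        if self.frequency is None and self.amplitude is None:
            raise ValueError("swing condition needs frequency or amplitude")


@dataclass(frozen=True)
class SettingCondition:
    """Setting condition handed to the learning device as one record."""
    machining: MachiningCondition
    swing: SwingCondition
```

- For example, `SettingCondition(MachiningCondition(feed_speed=0.1), SwingCondition(frequency=1.5))` is accepted, while `SwingCondition()` raises `ValueError`.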
- The surface roughness data according to the first embodiment and the second embodiment include at least one of the arithmetic mean roughness, the maximum height, the maximum peak height, the maximum valley depth, the mean height, the maximum cross-sectional height, and the load length ratio.
- the machine learning devices 400 and 500 can acquire surface roughness data by an appropriate method.
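- Several of the roughness parameters listed above can be computed directly from a measured height profile. The sketch below uses the usual textbook definitions over a single evaluation span, with the arithmetic mean of the samples taken as the mean line. Filtering, sampling-length handling, and the remaining parameters are omitted, and the function name and the dictionary keys are assumptions for illustration.

```python
def roughness_parameters(profile):
    """Compute basic roughness parameters from a list of profile heights.

    Ra : arithmetic mean of the absolute deviations from the mean line
    Rp : maximum peak height above the mean line
    Rv : maximum valley depth below the mean line
    Rz : maximum height, taken here as Rp + Rv
    """
    if not profile:
        raise ValueError("profile must contain at least one sample")
    mean_line = sum(profile) / len(profile)
    deviations = [z - mean_line for z in profile]
    ra = sum(abs(d) for d in deviations) / len(deviations)
    rp = max(deviations)
    rv = -min(deviations)
    return {"Ra": ra, "Rp": rp, "Rv": rv, "Rz": rp + rv}
```

- For the profile `[0.0, 2.0, 0.0, -2.0]`, the mean line is 0, so the function returns Ra = 1.0, Rp = 2.0, Rv = 2.0, and Rz = 4.0.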
- the machine learning devices 400, 500, 700, 800 may be shared by a plurality of numerical control devices 200.
- the machine learning devices 400, 500, 700, 800 can share and update the learning model, the value function, and the like by the plurality of numerical control devices 200.
- machine learning devices 400, 500, 700, 800 may be provided on the cloud server.
- the machine learning devices 400 and 500 can share and update the learning model, the value function, and the like on the cloud server.
- The numerical control device 200 and the machine learning devices 400, 500, 700, 800 described above each include an arithmetic processing unit such as a CPU (Central Processing Unit). They also include an auxiliary storage device such as an HDD (Hard Disk Drive) that stores various control programs such as application software and an OS (Operating System), and a main storage device such as a RAM (Random Access Memory) that stores data temporarily required for the arithmetic processing unit to execute a program.
- The arithmetic processing unit reads the application software and the OS from the auxiliary storage device, deploys them to the main storage device, and performs arithmetic processing based on them. Further, the numerical control device 200 and the machine learning devices 400, 500, 700, 800 control the various hardware included in each device based on the calculation results. The functional blocks of the present embodiment are realized in this way. That is, the above-described embodiment can be realized by the cooperation of hardware and software.
- the numerical control device 200 can be realized by incorporating application software for realizing the embodiment into the control device of a general machine tool 100.
- the machine learning devices 400, 500, 700, 800 can be realized by incorporating application software for realizing the present embodiment into a general personal computer.
- Because the machine learning devices 400, 500, 700, 800 involve a large amount of computation for machine learning, high-speed processing becomes possible if, for example, a GPU (Graphics Processing Unit) is mounted on a personal computer and used for the arithmetic processing associated with machine learning by a technique called GPGPU (General-Purpose computing on Graphics Processing Units). Furthermore, for even faster processing, the machine learning devices 400 and 500 may construct a computer cluster from a plurality of computers equipped with such GPUs and perform parallel processing on the computers included in the cluster.
- control systems 1, 10, 20, and 30 can be realized by hardware, software, or a combination thereof. Further, the control method performed by the above control systems 1, 10, 20, and 30 can also be realized by hardware, software, or a combination thereof.
- Here, being realized by software means being realized by a computer reading and executing a program.
- Non-transitory computer-readable media include various types of tangible storage media.
- Examples of non-transitory computer-readable media include magnetic recording media (for example, hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROMs (Read Only Memory), CD-Rs, CD-R/Ws, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)).
- 1 Control system; 10 Control system; 100 Machine tool; 200 Numerical control device; 300 Machining surface analysis device; 400 Machine learning device; 410 Setting condition acquisition unit; 420 Label acquisition unit; 430 Learning unit; 440 Learning model storage unit; 450 Swing condition output unit; 500 Machine learning device; 510 Setting condition acquisition unit; 520 Determination information acquisition unit; 530 Action information output unit; 540 Learning unit; 550 Value function storage unit; 560 Swing condition output unit
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Automation & Control Theory (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Geometry (AREA)
- Human Computer Interaction (AREA)
- Manufacturing & Machinery (AREA)
- Numerical Control (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202280009199.7A CN116783561A (zh) | 2021-01-14 | 2022-01-07 | Machine learning device |
| JP2022575565A JP7538258B2 (ja) | 2021-01-14 | 2022-01-07 | Machine learning device |
| US18/260,342 US20240302802A1 (en) | 2021-01-14 | 2022-01-07 | Machine learning device |
| DE112022000207.7T DE112022000207T5 (de) | 2021-01-14 | 2022-01-07 | Machine learning device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021004401 | 2021-01-14 | ||
| JP2021-004401 | 2021-01-14 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022153936A1 (ja) | 2022-07-21 |
Family
ID=82447343
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2022/000336 Ceased WO2022153936A1 (ja) | 2021-01-14 | 2022-01-07 | Machine learning device |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20240302802A1 |
| JP (1) | JP7538258B2 |
| CN (1) | CN116783561A |
| DE (1) | DE112022000207T5 |
| WO (1) | WO2022153936A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7313585B1 (ja) * | 2022-08-05 | 2023-07-24 | 三菱電機株式会社 | Drive condition determination device and drive condition determination method |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2018181217A (ja) * | 2017-04-20 | 2018-11-15 | ファナック株式会社 | Acceleration/deceleration control device |
| JP2020157425A (ja) * | 2019-03-27 | 2020-10-01 | 株式会社ジェイテクト | Support device and support method for grinding machine |
| JP6775720B1 (ja) * | 2020-03-24 | 2020-10-28 | 三菱電機株式会社 | Numerical control device |
Family Cites Families (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
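| CN107000152B (zh) * | 2014-11-26 | 2019-10-15 | 三菱电机株式会社 | Numerical control device |
| JP6557198B2 (ja) * | 2016-09-06 | 2019-08-07 | ファナック株式会社 | Numerical control device |
| JP6499709B2 (ja) * | 2017-04-14 | 2019-04-10 | ファナック株式会社 | Control device for machine tool that performs oscillation cutting |
| JP6503002B2 (ja) * | 2017-04-20 | 2019-04-17 | ファナック株式会社 | Control device for machine tool that performs oscillation cutting |
| JP6608879B2 (ja) * | 2017-07-21 | 2019-11-20 | ファナック株式会社 | Machine learning device, numerical control device, numerical control system, and machine learning method |
| JP6595537B2 (ja) | 2017-07-27 | 2019-10-23 | ファナック株式会社 | Control device for machine tool that performs oscillation cutting |
| JP2019185125A (ja) * | 2018-04-02 | 2019-10-24 | ファナック株式会社 | Control device and machine learning device |
| JP6802213B2 (ja) * | 2018-04-26 | 2020-12-16 | ファナック株式会社 | Tool selection device and machine learning device |
| JP7044734B2 (ja) * | 2019-03-28 | 2022-03-30 | ファナック株式会社 | Servo control device |
| CN114746203B (zh) * | 2019-12-03 | 2023-08-18 | 三菱电机株式会社 | Control device, electric discharge machine, and machine learning device |
| CN111079690B (zh) * | 2019-12-27 | 2022-05-20 | 华中科技大学 | Spindle and workpiece vibration prediction method based on stacked sparse autoencoder network |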
-
2022
- 2022-01-07 DE DE112022000207.7T patent/DE112022000207T5/de active Pending
- 2022-01-07 US US18/260,342 patent/US20240302802A1/en active Pending
- 2022-01-07 WO PCT/JP2022/000336 patent/WO2022153936A1/ja not_active Ceased
- 2022-01-07 CN CN202280009199.7A patent/CN116783561A/zh active Pending
- 2022-01-07 JP JP2022575565A patent/JP7538258B2/ja active Active
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
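| JP7313585B1 (ja) * | 2022-08-05 | 2023-07-24 | 三菱電機株式会社 | Drive condition determination device and drive condition determination method |
| WO2024029074A1 (ja) * | 2022-08-05 | 2024-02-08 | 三菱電機株式会社 | Drive condition determination device and drive condition determination method |
| CN119404160A (zh) * | 2022-08-05 | 2025-02-07 | 三菱电机株式会社 | Drive condition determination device and drive condition determination method |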
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2022153936A1 | 2022-07-21 |
| CN116783561A (zh) | 2023-09-19 |
| US20240302802A1 (en) | 2024-09-12 |
| JP7538258B2 (ja) | 2024-08-21 |
| DE112022000207T5 (de) | 2023-09-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
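| JP6608879B2 (ja) | | Machine learning device, numerical control device, numerical control system, and machine learning method |
| CN111857052B (zh) | | Machine learning device, numerical control system, and machine learning method |
| US20180210431A1 (en) | | Action information learning device, action information optimization system and computer readable medium |
| JP6457563B2 (ja) | | Numerical control device and machine learning device |
| JP6063013B1 (ja) | | Numerical control device having a machining condition adjustment function that suppresses the occurrence of chatter or tool wear/breakage |
| CN108388205B (zh) | | Learning model construction device and control information optimization device |
| US10747193B2 (en) | | Machine learning apparatus, servo control apparatus, servo control system, and machine learning method |
| JP6564432B2 (ja) | | Machine learning device, control system, control device, and machine learning method |
| JP6348098B2 (ja) | | Simulation device for wire electric discharge machine having a function of determining core welding position using machine learning |
| JP2017030067A (ja) | | Machining apparatus with control device having machining time measurement function and on-machine measurement function |
| JP6077617B1 (ja) | | Machine tool that generates optimum speed distribution |
| CN110286645B (zh) | | Machine learning device, servo control device, servo control system, and machine learning method |
| US10698380B2 (en) | | Numerical controller |
| JP7158604B1 (ja) | | Numerical control device, learning device, inference device, and numerical control method |
| JP7364699B2 (ja) | | Machine learning device, computer device, control system, and machine learning method |
| CN110875703A (zh) | | Machine learning device, control system, and machine learning method |
| WO2022153936A1 (ja) | | Machine learning device |
| JPWO2021111530A1 (ja) | | Control device, electric discharge machine, and machine learning device |
| JP6740263B2 (ja) | | Machine learning device, servo motor control device, servo motor control system, and machine learning method |
| WO2021187268A1 (ja) | | Machine learning device, numerical control system, setting device, numerical control device, and machine learning method |
| CN117600887A (zh) | | Machine tool vibration reduction method, apparatus, device, and storage medium |
| WO2022224450A1 (ja) | | Machine learning device, acceleration/deceleration adjustment device, and computer-readable storage medium |
| JP6952941B1 (ja) | | Machining condition setting device, machining condition setting method, and electric discharge machining apparatus |
| CN120598332B (zh) | | Flexible job shop scheduling optimization method, system, device, and medium |
| WO2022003833A1 (ja) | | Positioning control device and machine learning device |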
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22739345 Country of ref document: EP Kind code of ref document: A1 |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 112022000207 Country of ref document: DE |
|
| ENP | Entry into the national phase |
Ref document number: 2022575565 Country of ref document: JP Kind code of ref document: A |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 18260342 Country of ref document: US |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 202280009199.7 Country of ref document: CN |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 22739345 Country of ref document: EP Kind code of ref document: A1 |