CN112518425A

CN112518425A - Intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning

Info

Publication number: CN112518425A
Application number: CN202011462218.0A
Authority: CN
Inventors: 杨文安; 刘学为; 郭宇
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2020-12-10
Filing date: 2020-12-10
Publication date: 2021-03-19
Anticipated expiration: 2040-12-10
Also published as: CN112518425B

Abstract

An intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning comprises the following steps: acquiring a plurality of source tasks and a target task according to different wear curves of several cutters; initializing model parameters and maximum iteration time; detecting the wear state of the current cutter, performing feature extraction and dimension reduction on the wear data, constructing a training sample set for a weighted extreme learning machine, and training the weighted extreme learning machine; executing a machining action, observing the wear state of the current cutter, and calculating the state similarity and return similarity between each source task and the target task; calculating the probability that each sample in a source task belongs to the target sample set; acquiring the task similarity, and transferring a fixed number of samples from each source sample set to the target sample set; updating the Q value by using a Q learning mechanism based on the weighted extreme learning machine, and adding new tool wear data to the target sample set; and constructing a state equation and an observation equation of the particle filter model.

Description

Intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning

Technical Field

The disclosure belongs to the field of tool wear detection of numerical control manufacturing equipment, and particularly relates to an intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning.

Background

The most common faults in digital machining are tool wear and tool damage, which directly affect manufacturing cost, product quality and production efficiency and, in serious cases, cause major equipment shutdown accidents. Detecting the tool state is therefore important for ensuring the safety of the machining system, keeping machining running smoothly, reducing production cost and improving production efficiency. However, the machining process involves physical reactions and the conversion and transfer of matter and energy, and its complexity and uncertainty make the state parameters of the tool difficult to detect during machining. Whether the goal is to implement advanced machining process control algorithms and strategies, or to perform parameter optimization, state monitoring and fault diagnosis of the tool, the prerequisite is that the state information of the tool can be effectively obtained during machining. Because machining process detection technology is still limited, many advanced machining process control algorithms and strategies remain at the stage of theoretical discussion and are difficult to apply to actual machining, and many machining production systems cannot use measures such as fault diagnosis and state monitoring to improve operational safety and reliability. Virtual measurement techniques have been developed to solve these problems.

The Chinese patent application "a method for monitoring the wear of a numerical control machine tool cutter based on deep learning" (CN201711117628) collects three-phase current signals of the spindle motor of the numerical control machine tool, intercepts from them the current signals corresponding to the cutter to be detected, normalizes each segment of the current signal, and inputs each normalized segment into a sparse auto-encoding network for training, thereby detecting the wear of the cutter.

The Chinese patent application for 'a cutter wear monitoring method' (CN103465107A) optimizes the initial weight and threshold of a neural network by collecting acoustic emission signals of a cutter in various wear states, current signals of a spindle motor and a feed motor in a machine tool, cutting speed, cutting depth and feed amount, and predicts the cutter wear degree by using the trained neural network. However, the method has the disadvantages of large data volume, unbalanced data distribution, long time for training the neural network and low accuracy, and the obtained neural network can only detect a single cutter type, so that the method is not strong in applicability.

The European patent application "Tool monitor" (EP0334341A2) collects the acoustic emission signal of a tool of the manufacturing equipment with an acoustic emission sensor, preprocesses the signal with a band-pass filter, and obtains the tool state by comparing the acoustic emission signal collected in real time with the acoustic emission signal recorded when the tool breaks. However, the method cannot capture the implicit relationship between the acoustic emission signal and the cutter state, so the filtered acoustic emission signal cannot completely reflect the real-time state of the cutter. The simple comparison and judgment must be carried out offline, the judgment result depends mainly on past experience, and the accuracy of the detection result is insufficient.

The European patent application "Cutting tool wear monitor" (EP0165745A2) detects the wear state of a rotary cutter by collecting the short-circuit current, open-circuit voltage and power signals generated on the workpiece during machining and combining them with empirical analysis. The generated current, voltage and power increase gradually as the cutter wears, until a sharp increase indicates that the cutter has been damaged by excessive wear or breakage. However, the current and the voltage are easily affected by external factors, so the detection result cannot accurately represent the real-time wear state of the cutter. The change in current and voltage only indicates whether the cutter is worn, not the real-time wear amount, so the requirement of real-time detection cannot be met.

Disclosure of Invention

In view of at least one aspect of the above problems, an intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning is provided. It uses a multi-source migration reinforcement learning model and a particle filter model in combination with a virtual detection method, and effectively addresses the problems that current cutter wear detection systems handle only a single detection target and are insufficiently accurate.

Specifically, in view of the defects of existing cutter wear detection methods, the disclosure aims to provide an intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning that can detect different types of cutters, has good accuracy and has strong data processing capacity. The method uses the idea of migration learning to reduce the difference between the wear state data of different cutters, so that the model can detect the wear states of different cutters, needs less training data and trains more rapidly. It also uses a Weighted Extreme Learning Machine (WELM) algorithm to adjust the imbalance among training samples, which makes the training samples more effective and improves the accuracy of the model.

The embodiment of the disclosure is realized by adopting the following technical scheme:

the intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning is characterized by comprising the following steps:

step one: constructing an intelligent machining cutter wear prediction system based on multi-source migration reinforcement learning;

step two: acquiring a plurality of source tasks and a target task according to different wear curves of several cutters;

step three: initializing model parameters and maximum iteration time;

step four: detecting the wear state of the current cutter, performing feature extraction and dimension reduction on wear data, constructing a training sample set of a weighted extreme learning machine, and training the weighted extreme learning machine;

step five: executing a machining action, observing the wear state of the current cutter, simultaneously observing the next wear state and return of the cutter, and calculating the state similarity and return similarity between each source task and the target task;

step six: calculating the probability that each sample in the source task belongs to the target sample set, and performing descending order arrangement on the samples of each tool abrasion source task based on the probability;

step seven: acquiring task similarity, and transferring a fixed number of samples from each source sample set to a target sample set;

step eight: updating the Q value by using a Q learning mechanism based on a weighted extreme learning machine, adding new tool wear data to a target sample set, and then rolling a time window forward to discard the oldest sample;

step nine: constructing a state equation and an observation equation of a particle filter model; and

step ten: and predicting the wear state of the cutter for intelligent machining.

According to the embodiment of the disclosure, the intelligent machining tool wear prediction system in step one comprises a tool state virtual detection object, a virtual perception model, an empirical model and a particle filter model. The tool state virtual detection objects are different cutters, from which the source tasks and the target task are respectively obtained. The virtual perception model is mainly used for determining the observation equation of the particle filter model and comprises four parts: tool wear data acquisition, tool wear data feature selection, tool wear data feature dimension reduction, and multi-source migration reinforcement learning based on a weighted extreme learning machine (WELM). The tool wear data are acquired online through sensors and offline through a microscope; features are selected using statistical features, time domain features and frequency domain features; the selected features are reduced in dimension using the Kernel Principal Component Analysis (KPCA) method; and the WELM-based multi-source migration reinforcement learning consists of a sample space migration stage, a task space migration stage and a WELM-based Q learning stage. The empirical model is mainly used for determining the state equation of the particle filter model and consists of various tool wear empirical models, such as a tool wear rate model and a tool wear amount model. The particle filter model consists of the observation equation and the state equation and completes the virtual detection of the tool wear state of the target task.

According to the embodiment of the disclosure, the model parameters in step three comprise the WELM learning rate, the WELM discount factor, the Q value updating probability, the number L of neurons in the hidden layer of the WELM, the width parameters of the Gaussian kernels, the matching degree control parameters Z_P and Z_Q, the probability control parameter Z_X, the source task sample number and the target task sample number.

According to the embodiment of the disclosure, in the fourth step, the wear state of the current tool is detected by using an online and offline measurement method, and the wear data is subjected to feature extraction and dimension reduction and then used for constructing a training sample set.

For example, in the fourth step, sensors are used for online measurement of the tool state and a microscope is used for offline measurement of the tool wear state; the wear data are preprocessed with different feature extraction and dimension reduction methods, and the preprocessed data are used as the training sample set of the WELM.
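As an illustration of the feature extraction mentioned above, the following minimal sketch computes a few common time domain and statistical features from one window of sensor signal. The disclosure does not list which features are used, so the particular set (mean, RMS, peak-to-peak, kurtosis) and the function name `time_domain_features` are assumptions made for the example.

```python
import math


def time_domain_features(signal):
    """Return simple time domain features of one signal window.

    The choice of features is illustrative only; the source text does not
    specify which statistical or time domain features are extracted.
    """
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    variance = sum((x - mean) ** 2 for x in signal) / n
    peak_to_peak = max(signal) - min(signal)
    # Kurtosis as the normalized fourth central moment (0.0 for a flat signal).
    if variance > 0:
        kurtosis = (sum((x - mean) ** 4 for x in signal) / n) / variance ** 2
    else:
        kurtosis = 0.0
    return {
        "mean": mean,
        "rms": rms,
        "peak_to_peak": peak_to_peak,
        "kurtosis": kurtosis,
    }


features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Feature vectors of this kind would then be passed to a dimension reduction step such as KPCA before being used as WELM training samples.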

According to the embodiment of the disclosure, in the fifth step, the obtained state similarity and the return similarity are used for quantifying the similarity between each sample of the source task and the target task; in the sixth step, the state similarity and the return similarity are used for calculating the probability that each sample in the source task belongs to the target sample set.

According to the embodiment of the disclosure, the method for calculating the state similarity and the return similarity in the step five is as follows:

Suppose that the sample sets of the kth source task and of the target task are given, and consider the jth source sample and the ith target sample. The state similarity and the return similarity between these two samples are calculated by equations (1) and (2), respectively.

[Equations (1) and (2): formula images not reproduced in this text.]

In these equations, w_ij is the similarity weight, and δ_s, δ_q and δ_sa are the width parameters of the Gaussian kernels. Usually δ_s, δ_q and δ_sa are set to the same order of magnitude as the corresponding numerators ||s'_i - s'_j||, |Q_i - Q_j| and ||(s_j, a_j) - (s_i, a_i)||, which keeps the two similarities and w_ij within a reasonable range. It is clear that the more similar two samples are, the larger the value given by these equations. Likewise, as can be seen from equation (2), the closer the two Q values Q_i and Q_j are to each other, the larger the return similarity.
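As an illustration only, the sketch below computes Gaussian-kernel similarities of the kind described above. Equations (1) and (2) are not legible in this text, so the kernel form exp(-(d/δ)^2), the multiplication of each kernel by w_ij, the dictionary layout of a sample and all function names are assumptions, not the formulas of the disclosure.

```python
import math


def gaussian_kernel(distance, width):
    """Assumed kernel form: exp(-(distance / width)^2)."""
    return math.exp(-((distance / width) ** 2))


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarities(target, source, delta_s, delta_q, delta_sa):
    """Return (state_similarity, return_similarity) for two samples.

    Each sample is a dict with keys 's' (state list), 'a' (action),
    's_next' (next state list) and 'q' (Q value). This layout is hypothetical.
    """
    # Similarity weight w_ij from the distance between state-action pairs.
    w_ij = gaussian_kernel(
        euclidean(target["s"] + [target["a"]], source["s"] + [source["a"]]),
        delta_sa,
    )
    state_sim = w_ij * gaussian_kernel(
        euclidean(target["s_next"], source["s_next"]), delta_s
    )
    return_sim = w_ij * gaussian_kernel(abs(target["q"] - source["q"]), delta_q)
    return state_sim, return_sim


sample = {"s": [0.1, 0.2], "a": 1.0, "s_next": [0.15, 0.25], "q": 0.5}
other = {"s": [0.4, 0.1], "a": 0.0, "s_next": [0.6, 0.2], "q": 0.1}
print(similarities(sample, sample, 1.0, 1.0, 1.0))
print(similarities(sample, other, 1.0, 1.0, 1.0))
```

Under these assumptions identical samples give similarities of 1, and the values shrink toward 0 as the samples move apart, which matches the qualitative behaviour the text describes.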

According to the embodiment of the present disclosure, in the sixth step, the probability that each sample in the source task belongs to the target sample set is used as a sample migration weight, and the value determines the possibility of sample migration.

According to the embodiment of the disclosure, in the sixth step, the probability calculation method that each sample in the source task belongs to the target sample set is as follows:

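[Equation (3): formula image not reproduced in this text.]

In the formula, the computed probability is referred to as the sample migration weight, and the two terms on which it depends respectively represent a state similarity and a return similarity, the definitions of which are similar to equations (1) and (2).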
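The descending ordering used in step six can be sketched as follows. Because equation (3) is not legible here, the migration weights are simply passed in as precomputed numbers; the function names `rank_by_migration_weight` and `select_top` are hypothetical.

```python
def rank_by_migration_weight(samples, weights):
    """Return the samples sorted by migration weight, largest first."""
    order = sorted(range(len(samples)), key=lambda j: weights[j], reverse=True)
    return [samples[j] for j in order]


def select_top(samples, weights, n):
    """Pick the n source samples most likely to belong to the target set."""
    return rank_by_migration_weight(samples, weights)[:n]


chosen = select_top(["a", "b", "c"], [0.2, 0.9, 0.5], 2)
```

Sorting once per source task and slicing the head of the list keeps the later migration step a simple prefix selection.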

According to the embodiment of the disclosure, in the seventh step, the task similarity is calculated according to the bayesian probability analysis theory, and the value is used for determining the number of samples migrated from each source task to the target task set.

For example, the task similarity in step seven is calculated as follows.

From the above analysis, the matching degrees of the state and of the return can be calculated as follows:

[Equations (4) and (5): formula images not reproduced in this text.]

In the formulas, Z_P and Z_Q are control parameters. Provided the values do not overflow, Z_P and Z_Q can be set to 1, i.e. Z_P = Z_Q = 1. Multiplying equation (4) by equation (5) gives the total matching degree between the ith target sample and the source task S_k.

[Formula image not reproduced in this text.]

Thus the likelihood ratio X_k between the source task and the target task can be obtained:

[Formula image not reproduced in this text.]

In the formula, P(S_k) is the prior of the model S_k and Z_X is the likelihood control parameter. Z_X plays the same role as Z_P and Z_Q and, provided the values do not overflow, can be set to 1, i.e. Z_X = 1. Normalizing the likelihood ratios of all source tasks gives the task similarity between the source task S_k and the target task T. In total, (n_S - n_T) samples need to be migrated from the M source tasks to supplement the target samples.
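The normalization and sample allocation described above can be sketched as follows. The normalization divides each likelihood ratio by their sum; splitting the (n_S - n_T) samples in proportion to task similarity, with a largest-remainder rule so the counts add up exactly, is an assumption made for the example, as are the function names.

```python
def task_similarities(likelihood_ratios):
    """Normalize likelihood ratios X_k so that they sum to 1."""
    total = sum(likelihood_ratios)
    if total <= 0:
        raise ValueError("likelihood ratios must have a positive sum")
    return [x / total for x in likelihood_ratios]


def allocate_samples(similarities, n_to_migrate):
    """Split n_to_migrate samples across source tasks (assumed proportional rule)."""
    raw = [s * n_to_migrate for s in similarities]
    counts = [int(r) for r in raw]
    remainder = n_to_migrate - sum(counts)
    # Hand the leftover samples to the tasks with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda k: raw[k] - counts[k], reverse=True)
    for k in order[:remainder]:
        counts[k] += 1
    return counts


sims = task_similarities([2.0, 1.0, 1.0])
counts = allocate_samples(sims, 10)
```

Combined with a per-task ranking by migration weight, each count would select that many top-ranked samples from the corresponding source task.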

According to an embodiment of the present disclosure, in the step eight, the maximum Q value or the random Q value is selected with different probabilities to update the Q value.
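A common way to realize this kind of selection is an epsilon-greedy rule combined with the standard Q learning update. The sketch below assumes that the random choice is taken with probability epsilon and the maximum Q value otherwise; the text does not say which probability applies to which branch, and the function names are hypothetical.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values maps each candidate action to its current Q estimate.
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_values.get)


def q_update(q, reward, max_next_q, eta, gamma):
    """Standard Q learning update with learning rate eta and discount gamma."""
    return q + eta * (reward + gamma * max_next_q - q)


greedy = choose_action({"keep_cutting": 0.7, "change_tool": 0.2}, epsilon=0.0)
new_q = q_update(0.0, reward=1.0, max_next_q=2.0, eta=0.5, gamma=0.9)
```

The action labels in the usage lines are placeholders; the disclosure does not enumerate the machining actions.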

In the step eight, the rolling time window is used for updating the target sample set, so that the condition that the learning speed of the weighted extreme learning machine is too low due to the fact that the sample set is too large is avoided.
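A rolling time window of this kind can be sketched with a bounded double-ended queue, which drops the oldest sample automatically when a new one is appended. The window size and the helper name `make_target_set` are assumptions for the example.

```python
from collections import deque


def make_target_set(initial_samples, window_size):
    """Create a target sample set that keeps only the newest window_size samples."""
    return deque(initial_samples, maxlen=window_size)


target_set = make_target_set([1, 2, 3], window_size=3)
target_set.append(4)  # the oldest sample (1) is discarded
print(list(target_set))
```

Bounding the set this way keeps the WELM training cost roughly constant as new wear data arrive.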

According to the embodiment of the disclosure, the WELM-based Q learning mechanism in step eight is as follows: the WELM is used to approximate the Q-value function of reinforcement learning.

Suppose s_l = [s_l1, s_l2, ..., s_lm]^T ∈ R^m represents a system state with m dimensions and a_l ∈ R represents the action of the Q learning agent. The input vector of the WELM is the state-action pair x_l = (s_l, a_l)^T = [s_l1, s_l2, ..., s_lm, a_l]^T ∈ R^(m+1), and the output of the WELM is the estimated Q value corresponding to (s_l, a_l), where l = 1, 2, ..., L is the index of the training sample. α_l = [α_l1, α_l2, ..., α_l(m+1)] and ω = [ω_1, ω_2, ..., ω_L]^T are the input and output weight vectors, and β = [β_1, β_2, ..., β_L]^T is the bias of the hidden layer nodes. To ensure the learning efficiency of the WELM, the number of hidden layer nodes should equal the number of training samples, i.e. the number of hidden layer nodes is L.
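To make the structure above concrete, the following sketch trains a small weighted extreme learning machine on state-action inputs. It uses the usual closed-form weighted ELM solution, in which the output weights ω solve (I/C + H^T W H) ω = H^T W t. The sigmoid activation, the random input weights in [-1, 1], the regularization constant C, the plain Gauss-Jordan solver and every function name are assumptions made for illustration, not details taken from the disclosure.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def hidden_layer(x, alpha, beta):
    """Sigmoid hidden layer output for one input vector x."""
    return [
        1.0 / (1.0 + math.exp(-(sum(a * v for a, v in zip(row, x)) + b)))
        for row, b in zip(alpha, beta)
    ]


def train_welm(X, targets, sample_weights, n_hidden, C=100.0, seed=0):
    """Fit output weights omega; input weights alpha and biases beta stay random."""
    rng = random.Random(seed)
    dim = len(X[0])
    alpha = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    beta = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden_layer(x, alpha, beta) for x in X]
    # Weighted, regularized normal equations: (I/C + H^T W H) omega = H^T W t.
    A = [
        [
            sum(w * h[i] * h[j] for w, h in zip(sample_weights, H))
            + (1.0 / C if i == j else 0.0)
            for j in range(n_hidden)
        ]
        for i in range(n_hidden)
    ]
    b = [
        sum(w * h[i] * t for w, h, t in zip(sample_weights, H, targets))
        for i in range(n_hidden)
    ]
    return alpha, beta, solve_linear(A, b)


def predict(x, model):
    alpha, beta, omega = model
    return sum(o * h for o, h in zip(omega, hidden_layer(x, alpha, beta)))


# Each input is a state-action pair x_l = [s_l1, a_l]; targets are Q values.
X = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.0], [1.0, 1.0]]
q_targets = [0.0, 1.0, 1.0, 2.0]
weights = [1.0, 1.0, 2.0, 2.0]
model = train_welm(X, q_targets, weights, n_hidden=len(X))
print(predict([1.0, 1.0], model))
```

Setting `n_hidden` to the number of training samples follows the recommendation in the text; the per-sample weights are where class imbalance would be compensated.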

According to an embodiment of the present disclosure, the architecture of the multi-source migration learning algorithm in step eight is as follows. The task space of multi-source migration reinforcement learning is Γ = (S, T) = (S_1, ..., S_k, ..., S_M, T), which includes M source tasks and one unknown target task T. Each source task contains n_S samples and the target task contains n_T samples. On the one hand, collecting samples is costly, and n_T samples do not train task T well. On the other hand, using the M source tasks and extracting the required samples costs little compared with resampling from the current environment. Therefore, a Knowledge Transfer (KT) technique is applied to the WELM-based Q learning to improve the learning speed of the target task T. The mapping of KT is described as follows:

[Formula image not reproduced in this text.]

In the formula, the two sample-set symbols denote the sample sets corresponding to the source task and the target task, respectively, and Λ denotes the optimal set of transferred samples.

Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has at least the following beneficial effects:

(1) the migration learning idea adopted by the embodiment of the disclosure has at least the following advantages:

1) the demand for training data volume is smaller when the cutter detection model is trained, and the advantage of the method is that the model training speed is obviously improved;

2) the model obtained by transfer learning training has stronger generalization capability, i.e. it classifies better on data outside the training set, which improves the applicability of the model;

3) the transfer learning training process is more robust, the number of trainable parameters can be reduced by 100%, so that the training is more stable, the debugging is easier, and the accuracy of the tool detection model is improved.

(2) The reinforcement learning model based on the weighted extreme learning machine adopted by the embodiment of the disclosure has at least the following advantages:

1) the tool wear virtual detection system has strong learning capacity and, after learning enough abnormal tool wear states, can integrate what it has learned;

2) the cutter wear virtual detection system can train unbalanced data, and the applicability and accuracy of the system are improved;

3) the tool wear virtual detection system has continuous self-improvement capability, and the understanding and comprehension of the tool wear condition in the intelligent machining process are gradually improved.

(3) The virtual measurement technology adopted by the embodiment of the disclosure has at least the following advantages:

1) the tool wear virtual detection system has the functions of error compensation and fault diagnosis, so that the precision and the reliability of tool wear virtual measurement are improved;

2) the virtual detection system for the cutter wear can comprehensively use a plurality of measurable cutter state information to carry out state estimation, diagnosis and trend analysis on the wear state of the cutter to be detected;

3) the tool wear virtual detection system can acquire microcosmic real-time state information of the detected tool on line so as to meet the requirement of tool wear virtual detection;

therefore, in the embodiment of the disclosure, the intelligent machining tool wear prediction method based on the multi-source sample migration reinforcement learning at least solves the problems that the tool state information existing in the intelligent machining process cannot be detected on line in real time, so that real-time control is difficult to perform and the machining quality is difficult to ensure.

Drawings

The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure. In the drawings:

FIG. 1 is a system block diagram of a method for intelligent machining tool wear prediction based on multi-source sample migration reinforcement learning according to an embodiment of the present disclosure; and

FIG. 2 is a block diagram of a WELM-based reinforcement Q learning system, according to an embodiment of the present disclosure.

Detailed Description

The technical solution of the present disclosure is further specifically described below by way of examples and with reference to the accompanying drawings. In the specification, the same or similar reference numerals denote the same or similar components. The following description of the embodiments of the present disclosure with reference to the accompanying drawings is intended to explain the general inventive concept of the present disclosure and should not be construed as limiting the present disclosure.

Furthermore, in the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details.

Virtual measurement is an automatic measurement technology that has developed alongside computer technology and modern measurement technology. Based on the mathematical relationship between easily measured process variables and a process variable that is difficult to measure directly, it uses the easily measured variables, together with various mathematical calculation and estimation methods, to measure the variable of interest. It overcomes shortcomings of hardware detection technology such as closedness, lack of flexibility and low response speed, and offers advantages such as functionalization, modularization, data sharing and low cost. Applied to cutter wear detection, virtual measurement can predict the cutter wear that will result from actual machining, reveal in advance the wear and damage problems that may occur, suggest targeted and feasible modifications, and optimize the machining parameters. It is therefore important for ensuring product quality, greatly reducing machining cost, shortening the machining cycle and improving machining efficiency. Virtual measurement of tool wear mainly comprises a tool wear state equation and a tool wear observation equation. The state equation represents the inherent change of the tool state during machining and describes how the tool degradation evolves.

The observation equation represents the mapping from the current tool state to the tool wear state. However, the intelligent machining process involves many variables, mixed variable types and strong nonlinear coupling among variables, and the tool wear state is disturbed by uncertain factors such as the workpiece material, the machining conditions and the state of the machining equipment, so this mapping is difficult to describe with a mathematical model. The existing solution is to represent the nonlinear mapping from the current tool state to the tool wear state with an intelligent learning model, so the accuracy of the virtual detection result depends mainly on how that model is constructed.
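The interplay of a state equation and an observation equation can be illustrated with one step of a bootstrap particle filter. In this sketch the state equation is a hypothetical constant wear rate with Gaussian noise and the observation equation is the wear value plus Gaussian noise; in the disclosure these roles are played by the empirical wear models and the learned mapping, so every formula, parameter value and name below is a stand-in.

```python
import math
import random


def particle_filter_step(particles, observation, wear_rate, process_std, obs_std, rng):
    """Propagate, weight and resample particles; return (particles, estimate)."""
    # State equation (assumed): wear grows by wear_rate plus Gaussian noise.
    predicted = [
        max(0.0, p + wear_rate + rng.gauss(0.0, process_std)) for p in particles
    ]
    # Observation equation (assumed): observation = wear + Gaussian noise.
    weights = [
        math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample in proportion to the weights to avoid particle degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


rng = random.Random(0)
particles = [0.0] * 500
for observed_wear in (0.1, 0.2, 0.3):
    particles, estimate = particle_filter_step(
        particles, observed_wear, wear_rate=0.1, process_std=0.02, obs_std=0.05, rng=rng
    )
print(estimate)
```

The estimate after each step is the weighted mean of the propagated particles, which is the quantity a virtual detection system would report as the current wear state.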

A tool wear intelligent learning model built with traditional machine learning usually requires the data to follow the same distribution, so a large amount of data from a batch of identical workpieces is needed for training. For single-piece and small-batch production, however, little data is available for workpieces machined with the same parameters, and the training requirement cannot be met. The embodiment of the disclosure therefore uses data obtained under other parameters to assist training and realizes variable-parameter tool wear detection. Unlike detection under fixed parameters, the signal acquired in variable-parameter detection is influenced by two factors: the actual wear condition of the tool and the change of the cutting parameters. Data obtained under different parameters therefore follow different distributions. To detect the tool state under one set of parameters using tool wear data from other parameters, the embodiment of the disclosure uses a migration learning method to solve the problem of insufficient data in single-piece, small-batch production, because migration learning allows the tool state detection model for one set of parameters to learn from differently distributed signal data obtained under other parameters.

However, the tool wear sample data obtained in practical applications are almost always class-imbalanced, i.e. the sample size of one class is much larger than that of one or more other classes. For such imbalanced classification problems, traditional machine learning methods mainly classify with intelligent learning models based on empirical risk minimization or structural risk minimization. Empirical risk minimization seeks the lowest possible classification error rate on the training set, which leads to many misclassifications of minority-class samples. Structural risk minimization seeks to maximize the inter-class margin, and the influence of the majority class generally pushes the separating surface toward the minority class, which degrades the classifier's recognition of the minority class. For large-scale data sets, tool wear data sets in particular, traditional algorithms cannot extract data features conveniently and effectively, so the classification error rate increases and the time cost of searching for the separating surface is high.
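One common way a weighted learner counteracts such imbalance is to weight each sample by the inverse of its class frequency, so that every class contributes the same total weight. The disclosure does not state which weighting scheme it uses, so the rule below and the name `class_balanced_weights` are assumptions for illustration.

```python
from collections import Counter


def class_balanced_weights(labels):
    """Weight each sample by 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


labels = ["normal", "normal", "normal", "worn"]
weights = class_balanced_weights(labels)
print(weights)
```

With these weights the single minority sample carries as much total weight as the three majority samples combined, which keeps a weighted least-squares fit from ignoring the minority class.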

In this document, so-called migration learning is to use existing knowledge to learn new knowledge, and the core is to find the similarity between the existing knowledge and the new knowledge. In the transfer learning, the existing knowledge is called a source domain, the new knowledge to be learned is called a target domain, the source domain and the target domain are different but have certain correlation, the distribution difference between the source domain and the target domain needs to be reduced, and the knowledge transfer is carried out, so that the data calibration is realized. For example, the source domain target domain distinction is expressed as follows: the target domain can be different in data distribution, characteristic dimension and model output change condition relative to the source domain, and the target domain is better modeled by organically utilizing the knowledge in the source domain. In addition, under the condition that target domain calibration data is lack, the migration learning can well utilize the calibrated data in the related field to finish the calibration of the data. It should be noted that if the similarity between the source domain and the target domain is not sufficient, the migration result is not ideal, and a so-called negative migration situation occurs. Therefore, finding the source domain and the target domain with the highest similarity is the most important precondition for the whole migration process.

Further, herein, the expression "domain" may include data features and feature distributions, which are subjects of machine learning; the expression "source domain" is a domain that is already known; the expression "target domain" is a domain to be learned; the expression "task" includes an objective function and a learning result, which is a result of learning, and can be understood as a classifier.

As shown in fig. 1, a system block diagram of an intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning according to an embodiment of the present disclosure is schematically shown. Referring to fig. 1, the intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning according to an embodiment of the present disclosure may include the following steps.

In the first step, an intelligent machining tool wear prediction system based on multi-source migration reinforcement learning is constructed, as shown in fig. 1.

Specifically, in step one, the tool wear prediction system (also referred to as a tool wear state virtual detection system) may include a tool state virtual detection object, a virtual perception model, an empirical model and a particle filter model. The virtual detection objects of the cutter state are different cutters, and a source task and a target task are respectively obtained from the virtual detection objects; the virtual perception model is mainly used for determining an observation equation of a particle filter model and comprises four parts, namely tool wear data acquisition, tool wear data feature selection, tool wear data feature dimension reduction and multi-source migration reinforcement learning based on a Weighted Extreme Learning Machine (WELM), wherein the tool wear data are acquired online through a sensor and acquired offline through a microscope respectively, the tool wear data are subjected to feature selection by utilizing statistical features, time domain features and frequency domain features, the selected features are subjected to feature dimension reduction by utilizing a Kernel Principal Component Analysis (KPCA) method, and the multi-source migration reinforcement learning based on the WELM comprises a sample space migration stage, a task space migration stage and a Q learning stage based on the WELM; the empirical model is mainly used for determining a state equation of the particle filter model and consists of various tool wear empirical models, such as a tool wear rate model, a tool wear amount model and the like; the particle filter model consists of an observation equation and a state equation and completes virtual detection of the tool wear state of the target task.

In step two, a plurality of source tasks S_1, S_2, ..., S_M and a target task T are obtained according to the different wear curves of several tools.

In step three, model parameters and maximum iteration time are initialized.

Specifically, in step three, the model parameters to be initialized include the WELM learning rate η, the WELM discount factor γ, the Q value update probability ε_0, the number L of WELM hidden layer neurons, the Gaussian kernel width parameters δ_s, δ_sa and δ_q, the matching degree control parameters Z_P and Z_Q, the probability control parameter Z_X, the source task sample number n_S and the target task sample number n_T.
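As an illustration only, the parameters listed for step three could be grouped in a single configuration object. The following is a minimal sketch: the class name `ModelParams`, the field names and every default value except Z_P = Z_Q = Z_X = 1 are hypothetical placeholders and are not values given in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelParams:
    """Step-three parameters. Defaults are placeholders, not disclosed values."""
    eta: float = 0.1            # WELM learning rate (eta), placeholder
    gamma: float = 0.9          # WELM discount factor (gamma), placeholder
    epsilon_0: float = 0.1      # Q value update probability (epsilon_0), placeholder
    n_hidden: int = 50          # number L of WELM hidden layer neurons, placeholder
    delta_s: float = 1.0        # Gaussian kernel width delta_s, placeholder
    delta_sa: float = 1.0       # Gaussian kernel width delta_sa, placeholder
    delta_q: float = 1.0        # Gaussian kernel width delta_q, placeholder
    z_p: float = 1.0            # matching degree control parameter Z_P
    z_q: float = 1.0            # matching degree control parameter Z_Q
    z_x: float = 1.0            # probability control parameter Z_X
    n_source: int = 100         # source task sample number n_S, placeholder
    n_target: int = 20          # target task sample number n_T, placeholder
    max_iterations: int = 1000  # maximum iteration count, placeholder

    def __post_init__(self):
        # Basic sanity checks so that an invalid configuration fails early.
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")
        if not 0.0 <= self.epsilon_0 <= 1.0:
            raise ValueError("epsilon_0 must lie in [0, 1]")
        if min(self.n_hidden, self.n_source, self.n_target, self.max_iterations) <= 0:
            raise ValueError("counts must be positive")
        if min(self.delta_s, self.delta_sa, self.delta_q) <= 0.0:
            raise ValueError("kernel widths must be positive")


params = ModelParams()
```

Freezing the dataclass keeps the initialized values fixed for the rest of a run, which makes experiments easier to reproduce.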

In step four, the current tool wear state is detected, feature extraction and dimension reduction are performed on the wear data, a WELM training sample set D is constructed, and the WELM is trained.

Specifically, in step four, a sensor is used for online measurement of the tool state, a microscope is used for offline measurement of the tool wear state, different feature extraction and dimension reduction methods are used to preprocess the wear data, and the preprocessed data are used as the training sample set of the WELM.
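The feature extraction part of step four can be illustrated with a few common statistical and time domain features computed over one window of sensor samples. This is a minimal sketch: the disclosure does not list the individual features, so the function name `time_domain_features` and the particular features chosen here (mean, RMS, peak, standard deviation, kurtosis, crest factor) are assumptions, and the KPCA dimension reduction is not shown.

```python
import math
from statistics import fmean, pstdev


def time_domain_features(signal):
    """Return a few common time domain features of one signal window."""
    if len(signal) == 0:
        raise ValueError("signal window must not be empty")
    mean = fmean(signal)
    rms = math.sqrt(fmean(x * x for x in signal))
    peak = max(abs(x) for x in signal)
    std = pstdev(signal)  # population standard deviation
    # Kurtosis as the fourth standardized moment; set to 0.0 for a constant window.
    kurtosis = fmean((x - mean) ** 4 for x in signal) / std ** 4 if std > 0 else 0.0
    # Crest factor: ratio of the peak amplitude to the RMS value.
    crest = peak / rms if rms > 0 else 0.0
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "std": std,
        "kurtosis": kurtosis,
        "crest": crest,
    }


features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Each window of raw data then yields one feature vector, and the resulting vectors would be passed to the dimension reduction stage before being placed in the training sample set.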

In step five, the machining action is executed, the current tool wear state is observed together with the next tool wear state and the return, and the state similarity and return similarity between each source task and the target task are calculated.

Specifically, the method for calculating the state similarity and the return similarity in step five is as follows:

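Suppose [sample definitions not reproduced in this text]. The state similarity and the return similarity between the two are calculated by equations (1) and (2):

[Equations (1) and (2) are not reproduced in this text.]

In the formulas, [the symbol explanations are not reproduced in this text].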
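Because equations (1) and (2) are not available here, the following is only a hedged sketch of one common way to define such similarities: a Gaussian kernel of the distance between two vectors. The function name `gaussian_similarity`, the exact kernel form and the use of a single width parameter `delta` are assumptions and are not confirmed as the formulas of this disclosure.

```python
import math


def gaussian_similarity(u, v, delta):
    """Gaussian kernel similarity exp(-||u - v||^2 / (2 * delta^2)).

    Returns 1.0 for identical vectors and decays toward 0.0 as they move apart.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * delta ** 2))


# Hypothetical example: compare a source state with a target state,
# and a source return with a target return (returns wrapped as 1-D vectors).
state_sim = gaussian_similarity([0.10, 0.25], [0.12, 0.24], delta=1.0)
return_sim = gaussian_similarity([1.0], [0.8], delta=1.0)
```

With this kind of kernel, a smaller distance between the compared quantities gives a similarity closer to 1.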

In step six, the probability that each sample in the source task belongs to the target sample set is calculated, and the samples of each tool wear virtual detection source task are sorted in descending order of this probability.

Specifically, in step six, the probability that each sample in the source task belongs to the target sample set is calculated as follows:

[Equation not reproduced in this text.]

In the formula, [symbol not reproduced] is referred to as the sample migration weight, and [symbols not reproduced] respectively represent [quantities not reproduced], the definitions of which are similar to equations (1) and (2).
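The descending sort described in step six, and the selection of a fixed number of top-ranked samples for migration, can be sketched as follows. The function names `rank_source_samples` and `select_for_migration` are hypothetical, and the probabilities are assumed to have been computed beforehand.

```python
def rank_source_samples(samples, probabilities):
    """Pair samples with their probabilities, sorted by descending probability."""
    if len(samples) != len(probabilities):
        raise ValueError("samples and probabilities must have the same length")
    # sorted() is stable, so samples with equal probability keep their input order.
    order = sorted(range(len(samples)), key=lambda i: probabilities[i], reverse=True)
    return [(samples[i], probabilities[i]) for i in order]


def select_for_migration(ranked, n_migrate):
    """Take the n_migrate highest-ranked samples from one source task."""
    if n_migrate < 0:
        raise ValueError("n_migrate must be non-negative")
    return [sample for sample, _ in ranked[:n_migrate]]


ranked = rank_source_samples(["a", "b", "c"], [0.2, 0.9, 0.5])
migrated = select_for_migration(ranked, 2)
```

Here `ranked` is `[("b", 0.9), ("c", 0.5), ("a", 0.2)]` and `migrated` is `["b", "c"]`.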

In step seven, task similarity is obtained, and a fixed number of samples are migrated from each source sample set to the target sample set.

Specifically, the task similarity in step seven is calculated as follows:

[Equations (4) and (5) are not reproduced in this text.]

In the formulas, Z_P and Z_Q are control parameters; provided that the values do not overflow the normal numerical range, Z_P and Z_Q may be set to 1, i.e. Z_P = Z_Q = 1. Multiplying equation (4) by equation (5) gives the total degree of match between the i-th target sample and the source task S_k, i.e.

[Equation not reproduced in this text.]

In the formula, P(S_k) is the prior of the model S_k, and Z_X is a likelihood control parameter. Z_X has the same effect as Z_P and Z_Q; provided that the values do not overflow the normal numerical range, Z_X may be set to 1, i.e. Z_X = 1. The likelihoods of all source tasks can then be normalized:

[Equation not reproduced in this text.]
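The final normalization can be illustrated as dividing each source task likelihood by the sum over all source tasks, so that the normalized values sum to 1. This is a minimal sketch under that assumption; the function name `normalize_likelihoods` and the example values are hypothetical, and the unnormalized likelihoods themselves are taken as given.

```python
def normalize_likelihoods(likelihoods):
    """Scale non-negative per-task likelihoods so that they sum to 1."""
    if any(value < 0 for value in likelihoods.values()):
        raise ValueError("likelihoods must be non-negative")
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    return {task: value / total for task, value in likelihoods.items()}


task_similarity = normalize_likelihoods({"S1": 2.0, "S2": 1.0, "S3": 1.0})
```

For these example values the result is 0.5 for S1 and 0.25 each for S2 and S3.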

In step eight, the Q value is updated using the WELM-based Q learning mechanism, new tool wear data are added to the target sample set, and the time window is then rolled forward to discard the oldest samples.
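The rolling time window on the target sample set can be sketched with a bounded double-ended queue, which drops the oldest entry automatically once the window is full. The class name `RollingSampleSet` and the window size used below are hypothetical.

```python
from collections import deque


class RollingSampleSet:
    """Target sample set that keeps only the most recent max_size samples."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        # A deque with maxlen discards items from the left when it overflows.
        self._samples = deque(maxlen=max_size)

    def add(self, sample):
        """Append a new sample; the oldest one is dropped if the window is full."""
        self._samples.append(sample)

    def as_list(self):
        """Return the current window contents, oldest first."""
        return list(self._samples)

    def __len__(self):
        return len(self._samples)


window = RollingSampleSet(max_size=3)
for wear_sample in [1, 2, 3, 4, 5]:
    window.add(wear_sample)
```

After the loop the window holds `[3, 4, 5]`, so the size of the training set stays bounded no matter how many new measurements arrive.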

Specifically, the WELM-based Q learning mechanism in step eight is as follows:

In this disclosure, WELM is used to approximate the reinforcement learning Q-value function.
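For orientation, the standard Q learning update and an epsilon-greedy choice between the maximum Q value and a random one can be sketched as below. This is the textbook form, shown as an assumption about how the learning rate η, the discount factor γ and the probability ε_0 enter; it is not confirmed as the exact update of this disclosure, and the function names are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Return a random action index with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q_sa, reward, next_q_values, eta, gamma):
    """Standard Q learning step: Q <- Q + eta * (r + gamma * max Q' - Q)."""
    target = reward + gamma * max(next_q_values)
    return q_sa + eta * (target - q_sa)


rng = random.Random(0)
action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0, rng=rng)
new_q = q_update(0.0, reward=1.0, next_q_values=[0.5, 2.0], eta=0.5, gamma=0.9)
```

In a function-approximation setting, the value returned by `q_update` would serve as the training target for the approximator at the visited state-action pair rather than being stored in a table.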

Suppose s_l = [s_l1, s_l2, ..., s_lm]^T ∈ R^m represents a system state having m dimensions, and a_l ∈ R represents the action of the Q learning agent. The input vector of the WELM is the state-action pair x_l = (s_l, a_l)^T = [s_l1, s_l2, ..., s_lm, a_l]^T ∈ R^(m+1), and the output of the WELM is the estimated Q value corresponding to (s_l, a_l), where l = 1, 2, ..., L is the index of the training sample. α_l = [α_l1, α_l2, ..., α_l(m+1)] and ω = [ω_1, ω_2, ..., ω_L]^T are the input and output weight vectors, and β = [β_1, β_2, ..., β_L]^T is the bias of the hidden layer nodes. To ensure the learning efficiency of the WELM, the number of hidden layer nodes should be equal to the number of training samples, i.e. the number of hidden layer nodes is L.
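The forward pass of such a network can be sketched as follows: the state-action pair is mapped through L hidden nodes with input weights α_l and biases β_l, and the Q estimate is the ω-weighted sum of the hidden outputs. This is a minimal sketch: the class name `ElmQApproximator`, the sigmoid activation and the uniform random initialization are assumptions, and the training of the output weights (including the sample weighting of the WELM) is omitted.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class ElmQApproximator:
    """Single hidden layer network mapping (state, action) to a Q estimate."""

    def __init__(self, state_dim, n_hidden, rng):
        self.state_dim = state_dim
        input_dim = state_dim + 1  # m state components plus one action
        # Input weights alpha and biases beta are drawn at random and kept fixed.
        self.alpha = [[rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
                      for _ in range(n_hidden)]
        self.beta = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        # Output weights omega start at zero; fitting them is not shown here.
        self.omega = [0.0] * n_hidden

    def hidden(self, x):
        """Hidden layer outputs for one input vector x of length m + 1."""
        return [sigmoid(sum(a * xi for a, xi in zip(row, x)) + b)
                for row, b in zip(self.alpha, self.beta)]

    def q_value(self, state, action):
        """Q estimate for one state-action pair."""
        if len(state) != self.state_dim:
            raise ValueError("state has the wrong dimension")
        x = list(state) + [action]
        return sum(w * h for w, h in zip(self.omega, self.hidden(x)))


net = ElmQApproximator(state_dim=3, n_hidden=5, rng=random.Random(0))
q_before_training = net.q_value([0.1, 0.2, 0.3], 1.0)
```

Because only the output weights would need to be fitted, training such a network reduces to a linear least squares problem over the hidden layer outputs.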

Specifically, the architecture of the multi-source migration learning algorithm in step eight is shown in Fig. 2. In the present disclosure, the task space for multi-source migration reinforcement learning is Γ = (S, T) = (S_1, ..., S_k, ..., S_M, T), which includes M source tasks and an unknown target task T. Each source task contains n_S samples and the target task contains n_T samples. On the one hand, collecting samples is costly, and n_T samples are not enough to train task T well. On the other hand, the cost of using the M source tasks and extracting the required samples from them is small compared with resampling from the current environment. Therefore, a Knowledge Transfer (KT) technique is applied to the WELM-based Q learning to improve the learning speed of the target task T. The mapping of KT is described as follows:

[The mapping formula and its symbol definitions are not reproduced in this text.]

In step nine, a state equation and an observation equation of the particle filter model are constructed.

In step ten, the tool wear state of the target task is virtually detected.
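How a state equation and an observation equation combine in a particle filter can be sketched with a basic bootstrap filter. This is a simplified illustration under stated assumptions: the state equation is taken to be a constant wear rate plus Gaussian noise (standing in for the empirical model), the observation equation is the wear value plus Gaussian noise (standing in for the output of the virtual perception model), and the function name, the parameters and the example numbers are hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles, wear_rate,
                    process_std, obs_std, rng, initial_wear=0.0):
    """Bootstrap particle filter returning one wear estimate per observation."""
    particles = [initial_wear] * n_particles
    estimates = []
    for z in observations:
        # Assumed state equation: wear grows by wear_rate plus process noise.
        particles = [p + wear_rate + rng.gauss(0.0, process_std) for p in particles]
        # Assumed observation equation: z = wear + Gaussian measurement noise.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # The estimate is the weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Hypothetical observations following a wear rate of 0.1 per step.
observed = [0.1 * k for k in range(1, 11)]
wear_estimates = particle_filter(observed, n_particles=500, wear_rate=0.1,
                                 process_std=0.02, obs_std=0.05,
                                 rng=random.Random(0))
```

The predict, weight and resample cycle is the same regardless of which empirical wear model or perception model supplies the two equations.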

Embodiments of the present disclosure also provide an intelligent machining tool wear prediction system based on multi-source sample migration reinforcement learning, where the system includes a memory and a processor, where the memory stores instructions, and when the instructions are executed by the processor, the method as described above is implemented.

The foregoing detailed description has set forth numerous embodiments of the present disclosure via the use of schematics, block diagrams, flowcharts, and/or examples. Where such diagrams, flowcharts, and/or examples contain one or more functions and/or operations, it will be understood by those within the art that each function and/or operation within such diagrams, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of structures, hardware, software, firmware, or virtually any combination thereof. In one embodiment, portions of the subject matter described in embodiments of the present disclosure may be implemented by Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs), or other integrated circuits. However, those skilled in the art will recognize that some aspects of the embodiments disclosed herein, in whole or in part, can be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of skill in the art in light of this disclosure. In addition, those skilled in the art will appreciate that the mechanisms of the subject matter described herein are capable of being distributed as a program product in a variety of forms, and that an illustrative embodiment of the subject matter described herein applies regardless of the particular type of signal bearing media used to actually carry out the distribution. 
Examples of signal bearing media include, but are not limited to: recordable type media such as floppy disks, hard disk drives, Compact Disks (CDs), Digital Versatile Disks (DVDs), digital tape, computer memory, etc.; and a transmission type medium such as a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).

While the foregoing specification illustrates and describes embodiments of the disclosure, it is to be understood that the disclosure is not limited to the forms disclosed herein, which should not be regarded as excluding other embodiments. The disclosure may be used in various other combinations, modifications, and environments, and is capable of changes within the scope of the inventive concept described herein, commensurate with the above teachings or the skill or knowledge of the relevant art. Modifications and variations effected by those skilled in the art without departing from the spirit and scope of the disclosure are to be protected by the following claims.

Claims

1. An intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning is characterized by comprising the following steps:

step three: initializing model parameters and maximum iteration time;

2. The intelligent machining tool wear prediction method of claim 1, wherein in the fourth step, the wear state of the current tool is detected by using online and offline measurement methods, and the wear data is subjected to feature extraction and dimension reduction and then used for constructing a training sample set.

3. The intelligent machining tool wear prediction method according to claim 1 or 2, characterized in that in the fifth step, the obtained state similarity and return similarity are used to quantify the degree of similarity between each sample of the source task and the target task; in the sixth step, the state similarity and the return similarity are used for calculating the probability that each sample in the source task belongs to the target sample set.

4. The intelligent machining tool wear prediction method according to claim 1 or 2, characterized in that in the sixth step, the probability that each sample in the source task belongs to the target sample set is used as a sample migration weight, and the value determines the possibility of sample migration.

5. The intelligent machining tool wear prediction method of claim 1 or 2, wherein in the seventh step, task similarity is calculated according to Bayesian probability analysis theory, and the value is used for determining the number of samples to be migrated from each source task to the target task set.

6. The intelligent machining tool wear prediction method of claim 1 or 2, wherein in step eight, the Q value is updated by selecting the maximum Q value or a random Q value with different probabilities.

7. The intelligent machining tool wear prediction method of claim 1 or 2, wherein in step eight, a rolling time window is used for updating the target sample set, so as to avoid the learning speed of the weighted extreme learning machine becoming too slow because the sample set is too large.

8. An intelligent machining tool wear prediction system based on multi-source sample migration reinforcement learning, characterized in that the intelligent machining tool wear prediction system comprises a memory and a processor, the memory having stored thereon instructions that, when executed by the processor, implement the method according to any one of claims 1-7.