CN112518425B

CN112518425B - Intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning

Info

Publication number: CN112518425B
Application number: CN202011462218.0A
Authority: CN
Inventors: 杨文安; 刘学为; 郭宇
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2020-12-10
Filing date: 2020-12-10
Publication date: 2022-10-04
Anticipated expiration: 2040-12-10
Also published as: CN112518425A

Abstract

An intelligent machining tool wear prediction method based on multi-source sample transfer reinforcement learning comprises the following steps: acquiring a plurality of source tasks and a target task according to the different wear curves of several tools; initializing the model parameters and the maximum iteration time; detecting the wear state of the current tool, performing feature extraction and dimension reduction on the wear data, constructing a training sample set for a weighted extreme learning machine, and training the weighted extreme learning machine; executing a machining action, observing the wear state of the current tool, and calculating the state similarity and return similarity between each source task and the target task; calculating the probability that each sample in a source task belongs to the target sample set; acquiring the task similarity, and transferring a fixed number of samples from each source sample set to the target sample set; updating the Q value using a Q learning mechanism based on the weighted extreme learning machine, and adding new tool wear data to the target sample set; and constructing a state equation and an observation equation of a particle filter model.

Description

Intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning

Technical Field

The disclosure belongs to the field of tool wear detection of numerical control manufacturing equipment, and particularly relates to an intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning.

Background

The most common faults in digital machining are tool wear and tool breakage, which directly affect manufacturing cost, product quality and production efficiency and, in serious cases, cause major equipment shutdown accidents. Detecting the tool state is therefore important for keeping the machining system safe, keeping machining running smoothly, reducing production cost and improving production efficiency. However, the machining process involves physical reactions and the conversion and transfer of matter and energy, and its complexity and uncertainty make the tool state parameters difficult to detect during machining. Whether the goal is to implement advanced process control algorithms and strategies, or to perform parameter optimization, state monitoring and fault diagnosis of the tool, the prerequisite is that the tool state information during machining can be obtained effectively. Because machining process detection technology is still limited, many advanced process control algorithms and strategies remain at the stage of theoretical discussion and are difficult to apply to actual machining, and many production systems cannot improve their operational safety and reliability through measures such as fault diagnosis and state monitoring. Virtual measurement techniques have been developed to address these problems.

The Chinese invention patent application 'a method for monitoring the wear of a numerical control machine tool cutter based on deep learning' (CN 201711117628) collects three-phase current signals of the spindle motor of a numerical control machine tool, intercepts from them the current signals corresponding to the tool to be detected, normalizes each segment of current signal, and inputs each normalized segment into a sparse autoencoder network for training, thereby detecting tool wear.

The Chinese patent application 'a cutter wear monitoring method' (CN 103465107A) acquires acoustic emission signals of a tool in various wear states, current signals of the spindle motor and feed motor of the machine tool, and the cutting speed, cutting depth and feed amount; it optimizes the initial weights and thresholds of a neural network with a genetic algorithm and predicts the degree of tool wear with the trained network. However, the method involves a large data volume and an unbalanced data distribution, the neural network takes a long time to train and has low accuracy, and the resulting network can only detect a single tool type, so its applicability is limited.

The European patent application "Tool monitor" (EP 0334341 A2) uses an acoustic emission sensor to collect the acoustic emission signal of a tool of manufacturing equipment, preprocesses the signal with a band-pass filter, and determines the tool state by comparing the signal collected in real time with the acoustic emission signal recorded when a tool breaks. However, the method cannot capture the implicit features that relate the acoustic emission signal to the tool state, so the filtered signal does not fully reflect the real-time state of the tool; the simple comparison has to be carried out offline, the judgment depends mainly on past experience, and the detection accuracy is insufficient.

The European patent application "Cutting tool wear monitor" (EP 0165745 A2) detects the wear state of a rotary tool by collecting the short-circuit current, open-circuit voltage and power signals generated on the workpiece during machining and combining them with empirical analysis: the generated current, voltage and power increase gradually as the tool wears, and a sharp increase indicates that the tool has failed through excessive wear or breakage. However, current and voltage are easily affected by external factors, so the detection result does not accurately represent the real-time wear state of the tool; the changes in current and voltage only show whether the tool is worn, not how much it has worn, and the requirement of real-time detection cannot be met.

Disclosure of Invention

In view of at least one aspect of the above problems, an intelligent machining tool wear prediction method based on multi-source sample transfer reinforcement learning is provided. It uses a multi-source transfer reinforcement learning model and a particle filter model in combination with a virtual detection method, and thereby effectively solves the problems that current tool wear detection systems in machining have a single detection target and insufficient accuracy.

Specifically, in view of the shortcomings of existing tool wear detection methods, the disclosure aims to provide an intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning that can detect different types of tools, has good accuracy and has strong data processing capacity. The method uses the idea of transfer learning to reduce the difference between the wear state data of different tools, so that the model can detect the wear states of different tools, needs less training data and trains more quickly. It also uses a Weighted Extreme Learning Machine (WELM) algorithm to compensate for the imbalance among training samples, which makes the training samples more effective and improves the accuracy of the model.

The embodiment of the disclosure is realized by adopting the following technical scheme:

the intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning is characterized by comprising the following steps:

step one: constructing an intelligent machining cutter wear prediction system based on multi-source migration reinforcement learning;

step two: acquiring a plurality of source tasks and a target task according to different wear curves of several cutters;

step three: initializing model parameters and maximum iteration time;

step four: detecting the wear state of the current cutter, performing feature extraction and dimension reduction on wear data, constructing a training sample set of a weighted extreme learning machine, and training the weighted extreme learning machine;

step five: executing a machining action, observing the wear state of the current cutter, simultaneously observing the next wear state and return of the cutter, and calculating the state similarity and return similarity between each source task and the target task;

step six: calculating the probability that each sample in the source task belongs to the target sample set, and sorting the samples of each cutter wear source task in descending order of that probability;

step seven: acquiring task similarity, and transferring a fixed number of samples from each source sample set to a target sample set;

step eight: updating the Q value by using a Q learning mechanism based on a weighted extreme learning machine, adding new tool wear data to a target sample set, and then rolling a time window forward to discard the oldest sample;

step nine: constructing a state equation and an observation equation of a particle filter model; and

step ten: and predicting the wear state of the cutter for intelligent machining.

According to the embodiment of the disclosure, the system for predicting the wear of the intelligent machining tool in the first step comprises a tool state virtual detection object, a virtual perception model, an empirical model and a particle filter model. The virtual detection objects of the cutter state are different cutters, and a source task and a target task are respectively obtained from the virtual detection objects; the virtual perception model is mainly used for determining an observation equation of a particle filter model and comprises four parts, namely tool wear data acquisition, tool wear data feature selection, tool wear data feature dimension reduction and multi-source migration reinforcement learning based on WELM, wherein the tool wear data are acquired online through a sensor and are acquired offline through a microscope respectively, the tool wear data are subjected to feature selection through statistical features, time domain features and frequency domain features, the selected features are subjected to feature dimension reduction through Kernel Principal Component Analysis (KPCA) method, and the multi-source migration reinforcement learning based on WELM consists of a sample space migration stage, a task space migration stage and a WELM-based Q learning stage; the empirical model is mainly used for determining a state equation of the particle filter model and consists of various tool wear empirical models, such as a tool wear rate model, a tool wear loss model and the like; the particle filter model consists of an observation equation and a state equation and completes virtual detection of the tool wear state of the target task.
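The interplay of a state equation and an observation equation in a particle filter can be illustrated with a minimal bootstrap filter. This is only a sketch under stated assumptions: the constant wear rate in the state equation, the identity observation equation, the noise levels and the function name `particle_filter` are all hypothetical stand-ins; in the disclosure the state equation comes from tool wear empirical models and the observation equation from the WELM-based virtual perception model.

```python
import math
import random


def particle_filter(observations, n_particles=500, wear_rate=0.01,
                    process_std=0.005, obs_std=0.02, seed=0):
    """Bootstrap particle filter for a scalar wear value.

    State equation (assumed):       w_t = w_{t-1} + wear_rate + process noise
    Observation equation (assumed): z_t = w_t + observation noise
    """
    rng = random.Random(seed)
    particles = [0.0] * n_particles
    estimates = []
    for z in observations:
        # Predict: propagate every particle through the state equation.
        particles = [p + wear_rate + rng.gauss(0.0, process_std)
                     for p in particles]
        # Update: weight each particle by the Gaussian likelihood of z.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Synthetic example: wear grows linearly and is observed with noise.
true_wear = [0.01 * (t + 1) for t in range(50)]
noise = random.Random(1)
observed = [w + noise.gauss(0.0, 0.02) for w in true_wear]
estimated = particle_filter(observed)
```

Resampling at every step keeps the sketch short; a practical filter would typically resample only when the effective number of particles drops.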

According to the embodiment of the disclosure, the model parameters in the third step comprise the WELM learning rate, the WELM discount factor, the Q value update probability, the number of WELM hidden layer neurons $L$, the width parameters of the Gaussian kernels, the matching degree control parameters $Z_P$ and $Z_Q$, the probability control parameter $Z_X$, the number of source task samples and the number of target task samples.

According to the embodiment of the disclosure, in the fourth step, the wear state of the current tool is detected by using an online and offline measurement method, and the wear data is subjected to feature extraction and dimension reduction and then used for constructing a training sample set.

For example, in the fourth step, a sensor is used for online measurement of the tool state and a microscope is used for offline measurement of the tool wear state; the wear data are preprocessed with different feature extraction and dimension reduction methods, and the preprocessed data are used as the training sample set of the WELM.
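As an illustration of the feature extraction part of this step, the sketch below computes a few common time-domain features and one frequency-domain feature from a sensor signal. The particular features, the function names and the synthetic 50 Hz test signal are assumptions chosen for illustration; the disclosure does not list the exact features, and the KPCA dimension reduction that follows is not shown here.

```python
import cmath
import math


def time_domain_features(signal):
    """Mean, RMS, peak and (non-excess) kurtosis of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    var = sum((x - mean) ** 2 for x in signal) / n
    fourth = sum((x - mean) ** 4 for x in signal) / n
    # Kurtosis is defined as 0.0 here for a constant signal.
    kurtosis = fourth / (var ** 2) if var > 0 else 0.0
    return {"mean": mean, "rms": rms, "peak": peak, "kurtosis": kurtosis}


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC component (naive O(n^2) DFT)."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n


# Synthetic signal: a 50 Hz sine sampled at 1000 Hz for 0.2 s.
fs = 1000
sig = [math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
feats = time_domain_features(sig)
feats["dominant_hz"] = dominant_frequency(sig, fs)
```

The naive DFT keeps the example free of external dependencies; for real sensor data an FFT routine would be the usual choice.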

According to the embodiment of the disclosure, in the fifth step, the obtained state similarity and the return similarity are used for quantifying the similarity between each sample of the source task and the target task; in the sixth step, the state similarity and the return similarity are used for calculating the probability that each sample in the source task belongs to the target sample set.

According to the embodiment of the disclosure, the method for calculating the state similarity and the return similarity in the step five is as follows:

Given the sample sets of the $k$-th source task and of the target task, consider the $j$-th source sample and the $i$-th target sample. The state similarity and the return similarity between these two samples are calculated by equations (1) and (2), respectively.

In these formulas, $w_{ij}$ is the similarity weight, and $\delta_s$, $\delta_q$ and $\delta_{sa}$ are the width parameters of the Gaussian kernels. Usually $\delta_s$, $\delta_q$ and $\delta_{sa}$ are set to the same order of magnitude as the corresponding numerators $\|s'_i, s'_j\|$, $|Q_i - Q_j|$ and $\|(s_j, a_j), (s_i, a_i)\|$, which keeps the similarities and $w_{ij}$ within a reasonable range. Clearly, the more similar two samples are, the larger the computed similarity. Likewise, as can be seen from equation (2), the closer the two Q values $Q_i$ and $Q_j$ are to each other, the larger the return similarity.
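The following sketch shows how Gaussian-kernel similarities of this kind could be computed. It is an assumption-laden illustration, not the disclosed formulas: the squared-exponential kernel form, the use of Euclidean distance, the product with the weight `w`, and the dictionary layout of a sample are all hypothetical choices made to match the description of the width parameters and numerators above.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def gaussian_kernel(distance, width):
    # Assumed squared-exponential form; the disclosed equations may differ.
    return math.exp(-(distance ** 2) / (width ** 2))


def similarities(target, source, delta_s=1.0, delta_q=1.0, delta_sa=1.0):
    """Return (state similarity, return similarity) for two samples.

    Each sample is a dict with keys 's' (state), 'a' (action),
    's_next' (next state) and 'q' (Q value); this layout is hypothetical.
    """
    # Similarity weight from the distance between state-action pairs.
    w = gaussian_kernel(
        euclidean(target["s"] + [target["a"]], source["s"] + [source["a"]]),
        delta_sa)
    state_sim = w * gaussian_kernel(
        euclidean(target["s_next"], source["s_next"]), delta_s)
    return_sim = w * gaussian_kernel(abs(target["q"] - source["q"]), delta_q)
    return state_sim, return_sim


t = {"s": [0.1, 0.2], "a": 1.0, "s_next": [0.15, 0.25], "q": 0.5}
near = {"s": [0.1, 0.2], "a": 1.0, "s_next": [0.16, 0.25], "q": 0.55}
far = {"s": [0.9, 0.8], "a": 0.0, "s_next": [1.0, 0.9], "q": 2.0}
near_sim = similarities(t, near)
far_sim = similarities(t, far)
```

With widths of the same order of magnitude as the distances, both similarities stay in the interval (0, 1], and the nearby sample scores higher than the distant one.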

According to the embodiment of the present disclosure, in the sixth step, the probability that each sample in the source task belongs to the target sample set is used as a sample migration weight, and the value determines the possibility of sample migration.

According to the embodiment of the disclosure, the probability that each sample in the source task belongs to the target sample set in step six is calculated by equation (3). In equation (3), the resulting probability is referred to as the sample migration weight, and the two quantities on which it depends are defined similarly to equations (1) and (2).
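Once a migration probability is available for every source sample, the descending ordering of step six is a plain sort. The sketch below assumes the probabilities have already been computed; the function name and the toy data are hypothetical.

```python
def rank_source_samples(samples, probabilities):
    """Sort source samples by descending migration probability."""
    order = sorted(range(len(samples)),
                   key=lambda j: probabilities[j], reverse=True)
    ranked_samples = [samples[j] for j in order]
    ranked_probs = [probabilities[j] for j in order]
    return ranked_samples, ranked_probs


# Toy example with three source samples.
ranked, probs = rank_source_samples(["a", "b", "c"], [0.1, 0.9, 0.5])
```

The highest-ranked samples of each source task are then the first candidates for migration to the target sample set.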

According to the embodiment of the disclosure, in the seventh step, the task similarity is calculated according to the bayesian probability analysis theory, and the value is used for determining the number of samples migrated from each source task to the target task set.

For example, the task similarity in step seven is calculated as follows.

From the above analysis, the matching degree of the state and the matching degree of the return can be calculated by equations (4) and (5), where $Z_P$ and $Z_Q$ are control parameters. Provided the usual values do not overflow, $Z_P$ and $Z_Q$ can be set to 1, i.e. $Z_P = Z_Q = 1$. Multiplying equation (4) by equation (5) gives the total matching degree between the $i$-th target sample and the source task $S_k$.

From this, the likelihood ratio $X_k$ between the source task and the target task can be obtained, where $P(S_k)$ is the prior of model $S_k$ and $Z_X$ is the control parameter of the likelihood. $Z_X$ plays the same role as $Z_P$ and $Z_Q$ and, provided the usual values do not overflow, can be set to 1, i.e. $Z_X = 1$. Normalizing the likelihood ratios of all source tasks gives the task similarity between each source task $S_k$ and the target task $T$. In addition, $(n_S - n_T)$ samples need to be migrated from the $M$ source tasks to supplement the target samples.
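The normalization and the sample budget can be sketched as follows. Splitting the $(n_S - n_T)$ migrated samples across source tasks in proportion to task similarity, and rounding by largest remainder, are assumptions made for illustration; the disclosure only states that the task similarity determines how many samples each source task contributes. All function names are hypothetical.

```python
def task_similarities(likelihoods):
    """Normalize per-task likelihood ratios so that they sum to 1."""
    total = sum(likelihoods)
    return [x / total for x in likelihoods]


def allocate_quota(similarities, n_source, n_target):
    """Split the n_source - n_target migrated samples across source tasks.

    Assumed rule: proportional to task similarity, with largest-remainder
    rounding so that the quotas add up to the budget exactly.
    """
    budget = n_source - n_target
    raw = [s * budget for s in similarities]
    quota = [int(r) for r in raw]
    leftovers = budget - sum(quota)
    by_fraction = sorted(range(len(raw)),
                         key=lambda k: raw[k] - quota[k], reverse=True)
    for k in by_fraction[:leftovers]:
        quota[k] += 1
    return quota


sims = task_similarities([2.0, 1.0, 1.0])
quota = allocate_quota(sims, n_source=100, n_target=20)
```

Each source task would then contribute its top-ranked samples up to its quota.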

According to an embodiment of the present disclosure, in the step eight, the maximum Q value or the random Q value is selected with different probabilities to update the Q value.

In the step eight, the rolling time window is used for updating the target sample set, so that the condition that the learning speed of the weighted extreme learning machine is too low due to the fact that the sample set is too large is avoided.
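A minimal sketch of these two ideas follows. It assumes a standard temporal-difference form for the Q update and interprets "maximum or random Q value" as the choice of the next-state Q value; the function name, the parameter names `eta`, `gamma`, `epsilon` and the window length are hypothetical. The rolling time window is modeled with a fixed-length `deque`, which discards the oldest sample automatically.

```python
import random
from collections import deque


def update_q(q_sa, reward, next_q_values, eta, gamma, epsilon, rng):
    """One Q update using either the maximum or a random next-state Q value.

    With probability epsilon a random Q value is used, otherwise the maximum
    (an assumed reading of the selection rule described in the text).
    """
    if rng.random() < epsilon:
        next_q = rng.choice(next_q_values)   # random Q value
    else:
        next_q = max(next_q_values)          # maximum Q value
    return q_sa + eta * (reward + gamma * next_q - q_sa)


rng = random.Random(0)
new_q = update_q(0.0, 1.0, [0.5, 2.0], eta=0.5, gamma=0.9,
                 epsilon=0.0, rng=rng)

# Rolling time window over the target sample set: keep the newest 3 samples.
window = deque(maxlen=3)
for sample in range(5):
    window.append(sample)
```

Bounding the window length bounds the size of the training set, which is what keeps retraining of the learner fast.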

According to the embodiment of the disclosure, the WELM-based Q learning mechanism in step eight is as follows: the WELM is used to approximate the Q-value function of reinforcement learning.

Suppose $s_l = [s_{l1}, s_{l2}, \ldots, s_{lm}]^T \in R^m$ represents a system state with $m$ dimensions and $a_l \in R$ represents the action of the Q learning agent. The input vector of the WELM is the state-action pair $x_l = (s_l, a_l)^T = [s_{l1}, s_{l2}, \ldots, s_{lm}, a_l]^T \in R^{m+1}$, and the output of the WELM is the estimated Q value corresponding to $(s_l, a_l)$, where $l = 1, 2, \ldots, L$. $\alpha_l = [\alpha_{l1}, \alpha_{l2}, \ldots, \alpha_{l(m+1)}]$ and $\omega = [\omega_1, \omega_2, \ldots, \omega_L]^T$ are the input and output weight vectors, and $\beta = [\beta_1, \beta_2, \ldots, \beta_L]^T$ is the bias of the hidden layer nodes. To ensure the learning effect of the WELM, the number of hidden layer nodes should be equal to the number of training samples, i.e. the number of hidden layer nodes is $L$.
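A weighted extreme learning machine used as a Q-value approximator can be sketched in pure Python as below. This is an illustrative sketch, not the disclosed implementation: it assumes sigmoid hidden nodes with random input weights `alpha` and biases `beta`, and it solves the regularized weighted least-squares problem $(I/C + H^T W H)\,\omega = H^T W t$ for the output weights `omega`. The regularization constant `c`, the class name and the toy data are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


class WELM:
    def __init__(self, n_inputs, n_hidden, c=1e3, seed=0):
        rng = random.Random(seed)
        # Random input weights and biases stay fixed; only omega is learned.
        self.alpha = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                      for _ in range(n_hidden)]
        self.beta = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.c = c
        self.omega = [0.0] * n_hidden

    def hidden(self, x):
        """Sigmoid hidden-layer outputs for one input vector."""
        return [1.0 / (1.0 + math.exp(-(sum(a * xi for a, xi in zip(al, x)) + b)))
                for al, b in zip(self.alpha, self.beta)]

    def fit(self, xs, targets, weights):
        """Weighted ridge regression on the hidden-layer outputs."""
        h = [self.hidden(x) for x in xs]
        n = len(self.beta)
        a = [[sum(w * row[i] * row[j] for w, row in zip(weights, h))
              + (1.0 / self.c if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        b = [sum(w * row[i] * t for w, row, t in zip(weights, h, targets))
             for i in range(n)]
        self.omega = solve(a, b)

    def predict(self, x):
        return sum(o * hj for o, hj in zip(self.omega, self.hidden(x)))


# Toy state-action pairs (m = 1 state dimension plus one action) and Q targets.
xs = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.0], [1.0, 1.0], [0.2, 1.0]]
qs = [0.0, 0.5, 1.0, 2.0, 0.3]
ws = [1.0, 1.0, 1.0, 2.0, 1.0]   # per-sample weights
model = WELM(n_inputs=2, n_hidden=5)
model.fit(xs, qs, ws)
weighted_sse = sum(w * (model.predict(x) - q) ** 2
                   for w, x, q in zip(ws, xs, qs))
```

Following the text, the toy example uses as many hidden nodes as training samples. The per-sample weights let under-represented samples count more in the fit.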

According to an embodiment of the present disclosure, the architecture of the multi-source transfer learning algorithm in step eight is as follows. The task space for multi-source migration reinforcement learning in this disclosure is $\Gamma = (S, T) = (S_1, \ldots, S_k, \ldots, S_M, T)$, which includes $M$ source tasks and an unknown target task $T$. Each source task contains $n_S$ samples and the target task contains $n_T$ samples. On the one hand, collecting samples is costly, and $n_T$ samples are not enough to train task $T$ well. On the other hand, using the $M$ source tasks and extracting the required samples from them costs less than resampling from the current environment. Therefore, a Knowledge Transfer (KT) technique is applied to the WELM-based Q learning to improve the learning speed of the target task $T$. In the KT mapping, the two sample-set symbols denote the sample sets corresponding to the source tasks and the target task, respectively, and $\Lambda$ denotes the optimal set of transferred samples.

Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has at least the following beneficial effects:

(1) The migration learning idea adopted by the embodiment of the disclosure has at least the following advantages:

1) Less training data is needed to train the tool detection model, and model training is noticeably faster;

2) A model trained with transfer learning generalizes better: it classifies data outside the training set more reliably, which improves the applicability of the model;

3) The transfer learning training process is more robust, the number of trainable parameters can be reduced by 100%, so that the training is more stable, the debugging is easier, and the accuracy of the tool detection model is improved.

(2) The reinforcement learning model based on the weighted extreme learning machine adopted by the embodiment of the disclosure has at least the following advantages:

1) The tool wear virtual detection system has strong learning capacity and can learn abnormal tool wear states thoroughly enough to draw on them comprehensively;

2) The cutter wear virtual detection system can train unbalanced data, and the applicability and accuracy of the system are improved;

3) The tool wear virtual detection system has continuous self-improvement capability, and the understanding and comprehension of the tool wear condition in the intelligent machining process are gradually improved.

(3) The virtual measurement technology adopted by the embodiment of the disclosure has at least the following advantages:

1) The cutter wear virtual detection system has the functions of error compensation and fault diagnosis, so that the precision and the reliability of cutter wear virtual measurement are improved;

2) The virtual detection system for the cutter wear can comprehensively use a plurality of measurable cutter state information to carry out state estimation, diagnosis and trend analysis on the wear state of the cutter to be detected;

3) The tool wear virtual detection system can acquire microcosmic real-time state information of the detected tool on line so as to meet the requirement of tool wear virtual detection;

Therefore, in the embodiment of the disclosure, the intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning at least solves the problem that tool state information in the intelligent machining process cannot be detected online in real time, which makes real-time control difficult and machining quality hard to guarantee.

Drawings

The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure. In the drawings:

FIG. 1 is a system block diagram of a method for intelligent machining tool wear prediction based on multi-source sample migration reinforcement learning according to an embodiment of the present disclosure; and

FIG. 2 is a block diagram of a WELM-based reinforcement learning system in accordance with an embodiment of the present disclosure.

Detailed Description

The technical solution of the present disclosure is further specifically described below by way of examples and with reference to the accompanying drawings. In the specification, the same or similar reference numerals denote the same or similar components. The following description of the embodiments of the present disclosure with reference to the accompanying drawings is intended to explain the general inventive concept of the present disclosure and should not be construed as limiting the present disclosure.

Furthermore, in the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details.

Virtual measurement is an automatic measurement technology that developed alongside computer technology and modern measurement technology. Based on the mathematical relationship between easily measured process variables and a process variable that is difficult to measure directly, it estimates the latter from the former through mathematical calculation and estimation methods. It overcomes the closed design, inflexibility and slow response of hardware detection technology, and offers functionalization, modularization, data sharing and low cost. Applied to tool wear detection, virtual measurement can predict the tool wear after actual machining, reveal in advance the wear and breakage problems that may occur, propose feasible modifications in a targeted manner and optimize the machining parameters; this is important for ensuring product quality, greatly reducing machining cost, shortening the machining cycle and improving machining efficiency. Virtual measurement of tool wear mainly comprises a tool wear state equation and a tool wear observation equation. The state equation represents the inherent state change of the tool during machining and describes how the tool degradation evolves. The observation equation represents the mapping from the current tool state to the tool wear state. However, the intelligent machining process involves many variables of mixed types with strong nonlinear coupling among them, and the tool wear state is disturbed by uncertain factors such as the workpiece material, the machining conditions and the state of the machining equipment, so this mapping is difficult to describe with a mathematical model. The existing solution is to represent the nonlinear mapping from the current tool state to the tool wear state with an intelligent learning model, so the accuracy of the virtual detection result depends mainly on how that model is constructed.

A tool wear intelligent learning model built with traditional machine learning usually requires the data to follow the same distribution, so a large amount of data is needed for training when a batch of identical workpieces is machined. In single-piece and small-batch production, however, little data is available for workpieces machined with the same parameters, and the training requirement cannot be met. The embodiment of the disclosure therefore uses data acquired under other parameters to assist training and realizes variable-parameter tool wear detection. Unlike tool wear detection under fixed parameters, the signal acquired in variable-parameter detection is influenced by two factors: the actual wear condition of the tool and the change in cutting parameters. Data acquired under different parameters therefore follow different distributions. To detect the tool state under one set of parameters using tool wear data from other parameters, the embodiment uses transfer learning to solve the problem of insufficient data in single-piece, small-batch production, because transfer learning lets the tool state detection model for one set of parameters learn from differently distributed signal data acquired under other parameters.

However, the tool wear sample data obtained in practical applications is almost always class-imbalanced, that is, one class has far more samples than one or more of the other classes. For such imbalanced classification problems, traditional machine learning methods mainly classify with intelligent learning models based on empirical risk minimization or structural risk minimization. Empirical risk minimization seeks the lowest possible classification error rate on the training set, which leads to many minority-class samples being misclassified; structural risk minimization seeks to maximize the margin between classes, but the influence of the majority class generally pushes the separating surface toward the minority class, which degrades the classifier's recognition of the minority class. For large data sets, and tool wear data sets in particular, traditional algorithms cannot extract data features conveniently and effectively, so the classification error rate increases and the time cost of finding the separating surface is high.
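One common way for a weighted learner to counter class imbalance is to give every sample a weight inversely proportional to the size of its class, so that each class carries the same total weight. The sketch below illustrates that scheme; it is an assumption for illustration, since the disclosure does not state which weighting rule its WELM uses, and the labels are made up.

```python
from collections import Counter


def class_balanced_weights(labels):
    """Weight each sample by 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Eight samples of a majority class and two of a minority class.
labels = ["normal"] * 8 + ["worn"] * 2
weights = class_balanced_weights(labels)
```

Here each "normal" sample gets weight 0.125 and each "worn" sample gets weight 0.5, so both classes contribute a total weight of 1.0 to the training objective.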

In this document, migration learning (transfer learning) means using existing knowledge to learn new knowledge; its core is to find the similarity between the two. In transfer learning, the existing knowledge is called the source domain and the new knowledge to be learned is called the target domain. The source and target domains are different but correlated, and knowledge is transferred by reducing the distribution difference between them, which makes data calibration possible. For example, the target domain may differ from the source domain in data distribution, feature dimension or the way the model output varies, and the target domain is modeled better by making good use of the knowledge in the source domain. In addition, when calibrated data is lacking in the target domain, transfer learning can use calibrated data from a related field to complete the calibration. It should be noted that if the source and target domains are not sufficiently similar, the transfer result is poor, a situation known as negative transfer. Finding source and target domains with the highest similarity is therefore the most important prerequisite for the whole transfer process.

Further, herein, the expression "domain" may include data features and feature distributions, which are subjects of machine learning; the expression "source domain" is a domain that is already known; the expression "target domain" is a domain to be learned; the expression "task" includes an objective function and a learning result, which is a result of learning, and can be understood as a classifier.

FIG. 1 schematically shows a system block diagram of an intelligent machining tool wear prediction method based on multi-source sample migration reinforcement learning according to an embodiment of the present disclosure. Referring to FIG. 1, the method may include the following steps.

In the first step, an intelligent machining tool wear prediction system based on multi-source migration reinforcement learning is constructed, as shown in fig. 1.

Specifically, in step one, the tool wear prediction system (also referred to as a tool wear state virtual detection system) may include a tool state virtual detection object, a virtual perception model, an empirical model and a particle filter model. The virtual detection objects of the cutter state are different cutters, and a source task and a target task are respectively obtained from the virtual detection objects; the virtual perception model is mainly used for determining an observation equation of a particle filter model and comprises four parts, namely tool wear data acquisition, tool wear data feature selection, tool wear data feature dimension reduction and multi-source migration reinforcement learning based on a Weighted Extreme Learning Machine (WELM), wherein the tool wear data are acquired online through a sensor and acquired offline through a microscope respectively, the tool wear data are subjected to feature selection by utilizing statistical features, time domain features and frequency domain features, the selected features are subjected to feature dimension reduction by utilizing a Kernel Principal Component Analysis (KPCA) method, and the multi-source migration reinforcement learning based on the WELM comprises a sample space migration stage, a task space migration stage and a Q learning stage based on the WELM; the empirical model is mainly used for determining a state equation of the particle filter model and consists of various tool wear empirical models, such as a tool wear rate model, a tool wear loss model and the like; the particle filter model consists of an observation equation and a state equation and completes virtual detection of the tool wear state of the target task.

In step two, a plurality of source tasks $S_1, S_2, \ldots, S_M$ and a target task $T$ are obtained according to the different wear curves of several tools.

In step three, model parameters and maximum iteration time are initialized.

Specifically, in step three, the model parameters to be initialized include the WELM learning rate $\eta$, the WELM discount factor $\gamma$, the Q-value update probability $\epsilon_0$, the number of WELM hidden-layer neurons $L$, the Gaussian-kernel width parameters $\delta_s$, $\delta_{sa}$ and $\delta_q$, the matching-degree control parameters $Z_P$ and $Z_Q$, the probability control parameter $Z_X$, the number of source-task samples $n_S$, and the number of target-task samples $n_T$.
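The parameters listed for step three can be grouped into a single configuration object. The following is a minimal sketch only: the class name `TransferQConfig`, the field names and every default value are assumptions chosen for illustration, since the disclosure names the parameters but does not give their values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferQConfig:
    """Step-three parameters; all defaults are illustrative placeholders."""
    eta: float = 0.1            # WELM learning rate
    gamma: float = 0.9          # WELM discount factor
    epsilon0: float = 0.1       # Q-value update probability
    hidden_neurons: int = 20    # number of WELM hidden-layer neurons L
    delta_s: float = 1.0        # Gaussian-kernel width for states
    delta_sa: float = 1.0       # Gaussian-kernel width for state-action pairs
    delta_q: float = 1.0        # Gaussian-kernel width for Q values
    z_p: float = 1.0            # matching-degree control parameter Z_P
    z_q: float = 1.0            # matching-degree control parameter Z_Q
    z_x: float = 1.0            # probability control parameter Z_X
    n_source: int = 50          # samples per source task n_S
    n_target: int = 10          # samples in the target task n_T
    max_iterations: int = 1000  # stands in for the "maximum iteration time"

    def __post_init__(self):
        # Kernel widths appear as divisors, so they must be strictly positive.
        for name in ("delta_s", "delta_sa", "delta_q"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        # Probabilities and discount factors are confined to [0, 1].
        for name in ("gamma", "epsilon0"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1]")


config = TransferQConfig(n_target=15)
```

Freezing the dataclass keeps the initialized parameters fixed for the rest of a run, which matches their role as values set once before iteration begins.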

In step four, the wear state of the current tool is detected, feature extraction and dimension reduction are performed on the wear data, a WELM training sample set D is constructed, and the WELM is trained.

Specifically, in step four, a sensor is used for online measurement of the tool state and a microscope is used for offline measurement of the tool wear state. The wear data are preprocessed with the feature extraction and dimension reduction methods, and the preprocessed data are used as the WELM training sample set.
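As an illustration of the preprocessing in step four, the sketch below computes a few statistical, time-domain and frequency-domain features from one sensor signal. The particular features (mean, standard deviation, RMS, peak-to-peak, kurtosis, dominant frequency) and all function names are assumptions; the disclosure does not list which features are used. The KPCA dimension-reduction stage is not shown, and the naive DFT is written in plain Python only to keep the example dependency-free.

```python
import cmath
import math


def time_domain_features(signal):
    """Statistical and time-domain features of one signal window."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    fourth = sum((x - mean) ** 4 for x in signal) / n
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "rms": math.sqrt(sum(x * x for x in signal) / n),
        "peak_to_peak": max(signal) - min(signal),
        # Kurtosis as fourth central moment over squared variance.
        "kurtosis": fourth / (var ** 2) if var > 0 else 0.0,
    }


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC bin of a naive O(n^2) DFT."""
    n = len(signal)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def extract_features(signal, sample_rate):
    """Combine time-domain and frequency-domain features into one dict."""
    features = time_domain_features(signal)
    features["dominant_freq_hz"] = dominant_frequency(signal, sample_rate)
    return features


# Synthetic 5 Hz vibration sampled at 64 Hz for one second.
demo_signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
demo_features = extract_features(demo_signal, sample_rate=64)
```

In practice the feature vectors from many windows would be stacked and passed to the dimension-reduction stage before being used as WELM training samples.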

In step five, the machining action is executed, the current tool wear state is observed together with the next tool wear state and the return, and the state similarity and return similarity between each source task and the target task are calculated.

Specifically, the state similarity and the return similarity in step five are calculated as follows.

Suppose two samples, indexed $i$ and $j$, are given. The state similarity and the return similarity between them are calculated by equations (1) and (2), respectively.

[Equations (1) and (2) are not reproduced in this text.]

In the equations, the weighting terms are similarity weights, and $\delta_s$, $\delta_q$ and $\delta_{sa}$ are the width parameters of the Gaussian kernels. Usually $\delta_s$, $\delta_q$ and $\delta_{sa}$ are set to the same order of magnitude as the corresponding numerators $\|s'_i - s'_j\|$, $|Q_i - Q_j|$ and $\|(s_j, a_j) - (s_i, a_i)\|$. As can be seen from equation (1), the closer the two states $s'_i$ and $s'_j$ are, the larger the state similarity. Also, as can be seen from equation (2), the closer the two Q values $Q_i$ and $Q_j$ are, the larger the return similarity.
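The behaviour described for step five (similarity grows as two states, or two Q values, move closer, with a Gaussian-kernel width setting the scale) can be illustrated with a short sketch. The exact form of equations (1) and (2) is not available here, so the kernel $\exp(-d^2/\delta^2)$ used below is an assumption, as are the function names.

```python
import math


def gaussian_similarity(u, v, width):
    """Assumed kernel form exp(-||u - v||^2 / width^2), with values in (0, 1]."""
    if width <= 0:
        raise ValueError("width must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / width ** 2)


def state_similarity(next_state_i, next_state_j, delta_s):
    """Similarity of two next states s'_i and s'_j."""
    return gaussian_similarity(next_state_i, next_state_j, delta_s)


def return_similarity(q_i, q_j, delta_q):
    """Similarity of two Q values Q_i and Q_j."""
    return gaussian_similarity([q_i], [q_j], delta_q)


def pair_weight(state_i, action_i, state_j, action_j, delta_sa):
    """Similarity weight between two state-action pairs."""
    return gaussian_similarity(list(state_i) + [action_i],
                               list(state_j) + [action_j], delta_sa)


near = return_similarity(1.0, 1.1, delta_q=1.0)
far = return_similarity(1.0, 2.0, delta_q=1.0)
```

Choosing each width on the same order of magnitude as the typical distance keeps the kernel away from its saturated regions, where every pair would look either identical or unrelated.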

In step six, the probability that each sample in a source task belongs to the target sample set is calculated, and the samples of each tool wear virtual detection source task are sorted in descending order of this probability.

Specifically, in step six, the probability that each sample in a source task belongs to the target sample set is calculated as follows.

[The equation is not reproduced in this text.]

In the equation, the resulting probability is referred to as the sample-migration weight, and the state similarity and return similarity terms it uses are defined in the same way as in equations (1) and (2).
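The ranking in step six can be sketched as follows. Because the probability equation is not available here, the way the two similarities are combined (their product, averaged over the target samples) is a hypothetical choice, as are the Gaussian kernel form, the dictionary keys and the function names.

```python
import math


def _kernel(u, v, width):
    # Assumed Gaussian kernel form, exp(-||u - v||^2 / width^2).
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / width ** 2)


def migration_weight(source_sample, target_samples, delta_s, delta_q):
    """Hypothetical weight: mean over target samples of
    state similarity multiplied by return similarity."""
    total = 0.0
    for target in target_samples:
        state_sim = _kernel(source_sample["next_state"],
                            target["next_state"], delta_s)
        return_sim = _kernel([source_sample["q"]], [target["q"]], delta_q)
        total += state_sim * return_sim
    return total / len(target_samples)


def rank_source_samples(source_samples, target_samples,
                        delta_s=1.0, delta_q=1.0):
    """Return (weight, sample) pairs sorted by descending weight."""
    scored = [(migration_weight(s, target_samples, delta_s, delta_q), s)
              for s in source_samples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


target_set = [{"next_state": [0.1], "q": 1.0}]
source_set = [
    {"id": "far", "next_state": [2.0], "q": 3.0},
    {"id": "near", "next_state": [0.1], "q": 1.0},
]
ranked = rank_source_samples(source_set, target_set)
```

After sorting, the samples at the head of each source task's list are the ones most likely to be useful to the target task.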

In step seven, task similarity is obtained, and a fixed number of samples are migrated from each source sample set to the target sample set.

Specifically, the task similarity in step seven is calculated as follows.

[Equations (4) and (5) are not reproduced in this text.]

In the equations, $Z_P$ and $Z_Q$ are control parameters. Provided that the values do not overflow the usual range, $Z_P$ and $Z_Q$ may be set to 1, i.e. $Z_P = Z_Q = 1$. Multiplying equation (4) by equation (5) gives the total degree of match between the $i$-th target sample and the source task $S_k$.

[The equation for the total degree of match is not reproduced in this text.]

In the equation, $P(S_k)$ is the prior of model $S_k$ and $Z_X$ is the likelihood control parameter. $Z_X$ has the same effect as $Z_P$ and $Z_Q$; provided that the values do not overflow the usual range, $Z_X$ may be set to 1, i.e. $Z_X = 1$. The likelihood ratios of all active tasks can then be normalized.

[The normalization equation is not reproduced in this text.]
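The normalization at the end of step seven, and one way of turning the normalized task similarities into per-task sample counts, can be sketched as below. The proportional allocation with largest-remainder rounding is an assumption made for illustration; the disclosure states only that the task similarity determines how many samples are migrated from each source task. Function names and input values are hypothetical.

```python
def normalize_likelihoods(likelihoods):
    """Scale per-task likelihoods so that they sum to 1."""
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must have a positive sum")
    return {task: value / total for task, value in likelihoods.items()}


def transfer_quotas(normalized, budget):
    """Split a sample budget across source tasks in proportion to their
    normalized similarity, using largest-remainder rounding."""
    raw = {task: share * budget for task, share in normalized.items()}
    quotas = {task: int(value) for task, value in raw.items()}
    leftover = budget - sum(quotas.values())
    # Hand remaining samples to the tasks with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda t: raw[t] - quotas[t], reverse=True)
    for task in by_remainder[:leftover]:
        quotas[task] += 1
    return quotas


def select_transfer_samples(ranked_by_task, quotas):
    """Take the top-ranked samples of each source task up to its quota."""
    return {task: ranked_by_task[task][:quotas[task]] for task in quotas}


shares = normalize_likelihoods({"S1": 3.0, "S2": 1.0})
quotas = transfer_quotas(shares, budget=8)
chosen = select_transfer_samples(
    {"S1": list(range(10)), "S2": list(range(10, 20))}, quotas)
```

The ranked lists passed to `select_transfer_samples` are assumed to be already sorted in descending order of sample-migration weight.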

In step eight, the Q value is updated using the WELM-based Q learning mechanism, new tool wear data are added to the target sample set, and the time window is then rolled forward to discard the oldest samples.
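The bookkeeping in step eight can be sketched with a bounded queue for the rolling time window and a Q-learning target that uses either the maximum or a randomly chosen next Q value. The class and function names are hypothetical, and the standard target form `reward + gamma * Q_next` is an assumption about how the selected Q value enters the update.

```python
import random
from collections import deque


class RollingSampleSet:
    """Target sample set bounded by a rolling time window."""

    def __init__(self, window_size):
        # A deque with maxlen drops the oldest entry once the window is full.
        self._samples = deque(maxlen=window_size)

    def add(self, sample):
        self._samples.append(sample)

    def samples(self):
        return list(self._samples)


def q_learning_target(reward, next_q_values, gamma, epsilon, rng):
    """With probability epsilon use a random next Q value,
    otherwise use the maximum next Q value."""
    if rng.random() < epsilon:
        chosen = rng.choice(next_q_values)
    else:
        chosen = max(next_q_values)
    return reward + gamma * chosen


window = RollingSampleSet(window_size=3)
for step in range(5):
    window.add(step)

greedy_target = q_learning_target(
    reward=1.0, next_q_values=[0.5, 2.0], gamma=0.9,
    epsilon=0.0, rng=random.Random(0))
```

Bounding the target sample set keeps the WELM training cost from growing with every new measurement.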

Specifically, the WELM-based Q learning mechanism in step eight is as follows.

In this disclosure, the WELM is used to approximate the Q-value function of reinforcement learning.

Suppose $s_l = [s_{l1}, s_{l2}, \ldots, s_{lm}]^T \in \mathbb{R}^m$ represents a system state with $m$ dimensions and $a_l \in \mathbb{R}$ represents the action of the Q learning agent. The input vector of the WELM is the state-action pair $x_l = (s_l, a_l)^T = [s_{l1}, s_{l2}, \ldots, s_{lm}, a_l]^T \in \mathbb{R}^{m+1}$, and the output of the WELM is the estimated Q value corresponding to $(s_l, a_l)$, where $l = 1, 2, \ldots, L$ is the index of the training sample. $\alpha_l = [\alpha_{l1}, \alpha_{l2}, \ldots, \alpha_{l(m+1)}]$ and $\omega = [\omega_1, \omega_2, \ldots, \omega_L]^T$ are the input and output weight vectors, and $\beta = [\beta_1, \beta_2, \ldots, \beta_L]^T$ is the bias of the hidden layer nodes. To ensure the learning efficiency of the WELM, the number of hidden layer nodes should be equal to the number of training samples, i.e. the number of hidden layer nodes is $L$.
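A WELM-style Q-value approximator can be sketched as a single hidden layer with fixed random input weights and biases, whose output weights are obtained in closed form. This sketch assumes that "weighted" means per-sample weights in a ridge-regularized least-squares fit, $(H^T W H + \lambda I)\,\omega = H^T W t$; the sigmoid activation, the ridge term and all names are likewise assumptions, and the learning rate $\eta$ is not modelled. Plain Python is used to keep the example dependency-free.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


class WeightedELM:
    """Random hidden layer plus weighted ridge-regression output weights."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Input weights (alpha) and biases (beta) are random and never trained.
        self.alpha = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                      for _ in range(n_hidden)]
        self.beta = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.omega = [0.0] * n_hidden
        self.ridge = ridge

    def _hidden(self, x):
        return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
                for row, b in zip(self.alpha, self.beta)]

    def fit(self, inputs, targets, sample_weights):
        h = [self._hidden(x) for x in inputs]
        k = len(self.beta)
        # Normal equations: (H^T W H + ridge * I) omega = H^T W t.
        lhs = [[sum(w * row[i] * row[j] for w, row in zip(sample_weights, h))
                + (self.ridge if i == j else 0.0) for j in range(k)]
               for i in range(k)]
        rhs = [sum(w * row[i] * t
                   for w, row, t in zip(sample_weights, h, targets))
               for i in range(k)]
        self.omega = solve_linear(lhs, rhs)

    def predict(self, x):
        return sum(o * hi for o, hi in zip(self.omega, self._hidden(x)))


# Each input is a state-action pair [s, a]; each target is a Q-value estimate.
pairs = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.0], [1.0, 1.0]]
q_targets = [0.1, 0.6, 0.4, 0.9]
weights = [1.0, 1.0, 0.5, 0.5]
model = WeightedELM(n_inputs=2, n_hidden=len(pairs))
model.fit(pairs, q_targets, weights)
```

The number of hidden nodes is set to the number of training samples here, following the text above; the per-sample weights could, for example, carry lower values for migrated source samples than for target samples.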

Specifically, the architecture of the multi-source migration learning algorithm in step eight is shown in Fig. 2. The task space of the multi-source migration reinforcement learning in this disclosure is $\Gamma = (S, T) = (S_1, \ldots, S_k, \ldots, S_M, T)$, which includes $M$ source tasks and an unknown target task $T$. Each source task contains $n_S$ samples and the target task contains $n_T$ samples. On the one hand, collecting samples is costly, and $n_T$ samples are not enough to train task $T$ well. On the other hand, using the $M$ source tasks and extracting the required samples from them costs little compared with resampling from the current environment. Therefore, a Knowledge Transfer (KT) technique is applied to the WELM-based Q learning to improve the learning speed of the target task $T$. The mapping of KT is described as follows:

[The KT mapping equation is not reproduced in this text.]

In the equation, the two sample-set symbols denote the sample sets corresponding to the source tasks and the target task, respectively, and $\Lambda$ denotes the optimal set of transferred samples.

In step nine, the state equation and the observation equation of the particle filter model are constructed.

In step ten, the tool wear state of the target task is virtually detected.
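Steps nine and ten can be illustrated with a bootstrap particle filter. The sketch assumes a hypothetical state equation in which wear grows by a constant rate per step plus Gaussian process noise, and a hypothetical observation equation in which the virtual perception model reports the wear plus Gaussian measurement noise. Neither form, nor any of the numerical values or names, is taken from the disclosure.

```python
import math
import random


def particle_filter_step(particles, observation, wear_rate,
                         process_std, obs_std, rng):
    """One predict, weight, resample cycle of a bootstrap particle filter."""
    # Predict: assumed state equation, wear grows by wear_rate plus noise.
    predicted = [max(0.0, p + wear_rate + rng.gauss(0.0, process_std))
                 for p in particles]
    # Weight: assumed observation equation, observation = wear + noise.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample in proportion to the weights to limit degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track_wear(observations, n_particles=500, wear_rate=0.01,
               process_std=0.002, obs_std=0.01, seed=0):
    """Run the filter over a sequence of observed wear values."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 0.05) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles, estimate = particle_filter_step(
            particles, z, wear_rate, process_std, obs_std, rng)
        estimates.append(estimate)
    return estimates


# Noise-free synthetic observations: wear rises by 0.01 per step.
synthetic = [0.01 * (k + 1) for k in range(30)]
wear_estimates = track_wear(synthetic)
```

In the disclosed system the state equation would come from the empirical wear models and the observation equation from the virtual perception model; the linear forms above merely stand in for them.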

Embodiments of the present disclosure also provide an intelligent machining tool wear prediction system based on multi-source sample migration reinforcement learning. The system includes a memory and a processor, the memory having instructions stored thereon that, when executed by the processor, implement the method described above.

The foregoing detailed description has set forth numerous embodiments of the present disclosure via the use of diagrams, block diagrams, flowcharts, and/or examples. Insofar as such schematics, flowcharts, and/or examples contain one or more functions and/or operations, it will be understood by those within the art that each function and/or operation within such schematics, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of structures, hardware, software, firmware, or virtually any combination thereof. In one embodiment, portions of the subject matter described in embodiments of the present disclosure may be implemented by Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs), or other integrated circuits. However, those skilled in the art will recognize that some aspects of the embodiments disclosed herein, in whole or in part, can be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of skill in the art in light of this disclosure. In addition, those skilled in the art will appreciate that the mechanisms of the subject matter described herein are capable of being distributed as a program product in a variety of forms, and that an illustrative embodiment of the subject matter described herein applies regardless of the particular type of signal bearing media used to actually carry out the distribution. Examples of signal bearing media include, but are not limited to: recordable type media such as floppy disks, hard disk drives, Compact Disks (CDs), Digital Versatile Disks (DVDs), digital magnetic tape, computer memory, etc.; and transmission type media such as a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).

While the foregoing specification illustrates and describes the practice of the disclosure, it is to be understood that the disclosure is not limited to the forms disclosed herein and is not to be regarded as excluding other embodiments. It may be used in various other combinations, modifications, and environments, and is capable of changes within the scope of the inventive concept described herein, commensurate with the above teachings or the skill or knowledge of the relevant art. Modifications and variations effected by those skilled in the art without departing from the spirit and scope of the disclosure are to be protected by the following claims.

Claims

1. An intelligent machining cutter wear prediction method based on multi-source sample migration reinforcement learning is characterized by comprising the following steps:

step one: constructing an intelligent machining cutter wear prediction system based on multi-source migration reinforcement learning, wherein the intelligent machining cutter wear prediction system comprises cutter state virtual detection objects, and the cutter state virtual detection objects are different cutters;

step three: initializing model parameters and maximum iteration time;

step ten: the wear state of the cutter for intelligent processing is predicted,

wherein, in step four, the wear state of the current cutter is detected using online and offline measurement methods, and the wear data, after feature extraction and dimension reduction, are used to construct a training sample set.

2. The intelligent machining tool wear prediction method of claim 1, wherein in the fifth step, the similarity between each sample of the source task and the target task is quantified by using the obtained state similarity and the return similarity; in the sixth step, the state similarity and the return similarity are used for calculating the probability that each sample in the source task belongs to the target sample set.

3. The method for predicting the wear of the intelligent machining tool according to claim 1, wherein in the sixth step, the probability that each sample in the source task belongs to the target sample set is used as a sample migration weight, and the value determines the possibility of sample migration.

4. The intelligent machining tool wear prediction method of claim 1, wherein in step seven, task similarity is calculated according to Bayesian probability analysis theory, this value being used to determine the number of samples to migrate from each source task to the target task set.

5. The intelligent machining tool wear prediction method of claim 1 wherein in step eight, the maximum Q value or a random Q value is selected with different probabilities to update the Q value.

6. The intelligent machining tool wear prediction method of claim 1, wherein in step eight, the rolling time window is used to update the target sample set to avoid too large a sample set resulting in too slow a learning rate of the weighted extreme learning machine.

7. An intelligent machining tool wear prediction system based on multi-source sample migration reinforcement learning, characterized in that the intelligent machining tool wear prediction system comprises a memory and a processor, the memory having stored thereon instructions that, when executed by the processor, implement the method according to any one of claims 1-6.