CN113378725A

CN113378725A - Cutter fault diagnosis method, equipment and storage medium based on multi-scale-channel attention network

Info

Publication number: CN113378725A
Application number: CN202110662716.8A
Authority: CN
Inventors: 袁东风; 狄子钧; 周晓天; 李东阳; 梁道君
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2021-06-15
Filing date: 2021-06-15
Publication date: 2021-09-10
Anticipated expiration: 2041-06-15
Also published as: CN113378725B

Abstract

The invention relates to a cutter fault diagnosis method, equipment and a storage medium based on a multi-scale-channel attention network, wherein the method comprises the following steps: (1) data acquisition; (2) data preprocessing; (3) constructing a multi-scale-channel attention network model; (4) training; (5) testing. The method uses the multi-scale-channel attention network to diagnose tool wear faults and introduces a channel attention mechanism into the residual connections of the multi-scale network, so that the different degrees of importance that the machine tool spindle vibration signals in the three directions have for the tool wear state classification task are mapped into the feature learning process and the data are fused better.

Description

Cutter fault diagnosis method, equipment and storage medium based on multi-scale-channel attention network

Technical Field

The invention relates to the field of intelligent manufacturing product quality control, in particular to a cutter fault diagnosis method, equipment and a storage medium based on a multi-scale-channel attention network.

Background

The numerical control machine tool is used as an industrial master machine and is widely applied in the production and processing process. The cutter is used as a cutting tool of a numerical control machine tool, and the real-time health state of the cutter directly influences the machining efficiency and the product quality of the machine tool. The cutter and the workpiece are in direct contact and interact, inevitable abrasion damage can be generated in the high-speed cutting process of the cutter, and the accurate monitoring of the abrasion state of the cutter is helpful for avoiding the product quality problem caused by cutter failure.

At present, manual detection is still used on the actual industrial shop floor; it is time-consuming and labor-intensive and is prone to operator-induced errors. As manufacturing shifts from automation to intelligence, the deep integration of artificial intelligence technology with manufacturing becomes key. It is therefore desirable to introduce deep learning into fault diagnosis to replace the traditional manual, direct-contact detection method and thereby achieve accurate tool fault diagnosis.

A typical intelligent detection process collects monitoring signals (such as cutting force, vibration, acoustic emission and spindle current signals) and analyzes the mapping between those signals and the tool wear state. Many researchers in China and abroad have studied this problem. Hu et al. proposed a multiscale network (MSNet) with a three-branch structure in which each branch has a different convolutional depth, so that features at different levels of a one-dimensional vibration signal can be extracted and then combined in a fully connected layer. Lepeng et al. used a one-dimensional convolutional network for preliminary feature extraction from the vibration and acoustic emission signals of the machine tool spindle and worktable during machining, then fed the signal features into a long short-term memory network for analysis to obtain an evaluation of the tool wear state, reaching a classification accuracy of 93.8%. Hsieh et al. converted the vibration time-domain signals into frequency-domain signals with the fast Fourier transform, extracted frequency-domain features with a class mean scatter criterion, and fed the features into a single-layer convolutional neural network to classify the tool wear state; their experiments showed that combining the Z-direction vibration signal with the X- or Y-direction vibration signal gives a better classification result. To improve gradient propagation in the model, the cited article adds residual connections between convolutional layers with the same feature size.

However, the above methods have the following problems: (1) limited by the validity and homogeneity of the data set, they can only convert a single-channel vibration signal into a single-channel image, and the features of a single channel are poor, which may reduce the classification accuracy of tool faults; (2) because feature extraction and feature analysis are completed in separate steps, the self-learning capability of the model is weak and the accuracy of the final tool fault diagnosis is low; (3) limited by a simple model structure and lacking an effective data fusion method, they fuse the vibration signal features of different directions poorly and cannot extract the data features effectively.

Disclosure of Invention

Aiming at the problems, the invention provides a tool fault diagnosis method based on a multi-scale-channel attention network.

According to the invention, through the cutter abrasion test platform, a machine tool spindle vibration signal and a cutter abrasion value which accord with the actual industrial production field are collected, and the vibration signal comprises X, Y, Z directions. Based on the idea of feature fusion, X, Y, Z vibration signals in three directions are spliced to form a three-channel multi-channel feature map.

The method adopts the convolutional neural network to extract the characteristics of the characteristic diagram and carry out classification tasks, and the convolutional neural network has excellent self-adaptive characteristic learning capability and can well mine the potential characteristics of the data.

To map the different degrees of importance that the vibration signals in the three directions of the machine tool spindle have for the tool wear state classification task into the feature learning process, and thereby fuse the data better, the invention improves the multi-scale network by introducing channel attention into it.

The invention also provides computer equipment and a storage medium.

The technical scheme of the invention is as follows:

a cutter fault diagnosis method based on a multi-scale-channel attention network comprises the following steps:

(1) data acquisition:

collecting the vibration signals of the X, Y and Z axes of the machine tool spindle and the tool wear value after each cutting pass, wherein the spindle coordinate system containing the X, Y and Z axes is established as a right-handed Cartesian coordinate system;

(2) data preprocessing:

classifying the tool wear stage according to the tool wear value, and classifying the vibration signals by taking the classification result as a label;

segmenting the vibration signals of the X, Y and Z axes of the machine tool spindle in time order into slices of length n², and splicing the slices of the X, Y and Z axes to construct an n × n × 3 three-channel input feature map, where n is the height or width of the feature map when a vibration signal slice is converted into a feature map;

selecting a training set, a verification set and a test set from the data after data preprocessing;

(3) constructing a multi-scale-channel attention network model:

the multi-scale-channel attention network model comprises an input layer, three branches and a full connection layer;

the three branches comprise a first branch, a second branch and a third branch, and the first branch comprises 5 convolutional layers; the second branch comprises 2 convolutional layers; the third branch comprises 1 convolutional layer;

residual connections are provided between the second convolutional layer of the first branch and the first convolutional layer of the second branch, between the fourth convolutional layer of the first branch and the second convolutional layer of the second branch, and between the first convolutional layer of the second branch and the first convolutional layer of the third branch; the results of the residual additions are input to the third convolutional layer of the first branch, the fifth convolutional layer of the first branch, and the second convolutional layer of the second branch, respectively;

the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch are each input to the channel attention module before the residual connection is made.

(4) Training: inputting the training set into a multi-scale-channel attention network model for training, and recording the accuracy and loss function of the training set in each training period during training;

(5) Testing: inputting the test set into the trained multi-scale-channel attention network model, and outputting the tool wear stage corresponding to the test set data.
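The slicing and splicing of step (2) above can be sketched as follows. This is a minimal pure-Python illustration rather than the patent's implementation: the function name `build_feature_maps`, the row-major reshape of each slice, and the dropping of an incomplete trailing slice are all assumptions.

```python
from typing import List, Sequence

FeatureMap = List[List[List[float]]]  # indexed as [row][col][channel]


def build_feature_maps(x: Sequence[float], y: Sequence[float],
                       z: Sequence[float], n: int) -> List[FeatureMap]:
    """Cut three synchronous axis signals into n*n-sample slices and
    stack each X/Y/Z slice triple into one n x n x 3 feature map."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("the three axis signals must have equal length")
    slice_len = n * n
    num_slices = len(x) // slice_len  # assumption: drop the incomplete tail
    maps: List[FeatureMap] = []
    for s in range(num_slices):
        start = s * slice_len
        fmap = [
            [
                # assumption: each 1-D slice is reshaped row by row
                [x[start + r * n + c],
                 y[start + r * n + c],
                 z[start + r * n + c]]
                for c in range(n)
            ]
            for r in range(n)
        ]
        maps.append(fmap)
    return maps
```

Each returned map has n rows, n columns and 3 channels, with channel 0, 1 and 2 holding the X, Y and Z samples taken at the same instant.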

Preferably, according to the invention, 80% of the preprocessed data is used as the training set and 20% as the test set; within the training set, 80% is used for training and 20% as the validation set.
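The two-level split described above might be implemented as in the sketch below. The function name `split_dataset`, the random shuffling and the fixed seed are assumptions; the text only gives the proportions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], seed: int = 0
                  ) -> Tuple[List[T], List[T], List[T]]:
    """Return (train, validation, test): 80/20 first, then 80/20 again."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: shuffle before splitting
    n_trainval = int(len(shuffled) * 0.8)
    trainval, test = shuffled[:n_trainval], shuffled[n_trainval:]
    n_train = int(len(trainval) * 0.8)
    return trainval[:n_train], trainval[n_train:], test
```

With these proportions the three subsets hold 64%, 16% and 20% of the preprocessed samples.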

Further preferably, the first branch, the second branch and the third branch are expressed mathematically by formula (I), formula (II) and formula (III), respectively:

$$x^{(1)}_{i_1} = f^{(1)}_{i_1}\left(x^{(1)}_{i_1-1}\right) \tag{I}$$

$$x^{(2)}_{i_2} = f^{(2)}_{i_2}\left(x^{(2)}_{i_2-1}\right) \tag{II}$$

$$x^{(3)}_{i_3} = f^{(3)}_{i_3}\left(x^{(3)}_{i_3-1}\right) \tag{III}$$

In formulas (I), (II) and (III), $i_1 = 1, 2, \ldots, 5$; $i_2 = 1, 2$; $i_3 = 1$;

$x^{(1)}_{i_1}$ denotes the output feature map of the $i_1$-th convolutional layer of the first branch, $f^{(1)}_{i_1}$ denotes the $i_1$-th convolutional layer of the first branch, and $x^{(1)}_{i_1-1}$ denotes the output feature map of the $(i_1-1)$-th convolutional layer of the first branch;

$x^{(2)}_{i_2}$ denotes the output feature map of the $i_2$-th convolutional layer of the second branch, $f^{(2)}_{i_2}$ denotes the $i_2$-th convolutional layer of the second branch, and $x^{(2)}_{i_2-1}$ denotes the output feature map of the $(i_2-1)$-th convolutional layer of the second branch;

$x^{(3)}_{i_3}$ denotes the output feature map of the $i_3$-th convolutional layer of the third branch, $f^{(3)}_{i_3}$ denotes the $i_3$-th convolutional layer of the third branch, and $x^{(3)}_{i_3-1}$ denotes the output feature map of the $(i_3-1)$-th convolutional layer of the third branch.

Further preferably, the convolution kernel of each convolution layer is 3 × 3, and a ReLU nonlinear activation function is added after each convolution layer.

More preferably, the output feature map scales of the 5 convolutional layers of the first branch are 64 × 64 × 3, 32 × 32 × 3, 16 × 16 × 3, 8 × 8 × 3 and 4 × 4 × 3, respectively; the output feature map scales of the 2 convolutional layers of the second branch are 32 × 32 × 3 and 8 × 8 × 3, respectively; the output feature map scale of the 1 convolutional layer of the third branch is 32 × 32 × 3.
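The text gives the kernel size and the output scales but does not state the strides, the padding, or whether pooling is used. The sketch below only checks that one hypothetical choice reproduces the listed scales: it assumes a 64 × 64 input to every branch, 3 × 3 kernels with padding 1, and the per-layer strides shown in the test values. The helper names are illustrative.

```python
from typing import List, Sequence


def conv_output_size(size: int, kernel: int = 3, stride: int = 1,
                     padding: int = 1) -> int:
    """Standard spatial output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def branch_scales(input_size: int, strides: Sequence[int]) -> List[int]:
    """Height/width after each convolutional layer of one branch."""
    sizes: List[int] = []
    size = input_size
    for stride in strides:  # assumption: one stride value per layer
        size = conv_output_size(size, stride=stride)
        sizes.append(size)
    return sizes
```

Under these assumptions, strides (1, 2, 2, 2, 2) give 64, 32, 16, 8, 4 for the first branch, strides (2, 4) give 32, 8 for the second, and stride (2) gives 32 for the third, so every residual connection joins feature maps of equal scale.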

Further preferably, the channel attention module obtains a global receptive field by applying global average pooling and global maximum pooling separately, and inputs each pooled result into a two-layer neural network whose weights are shared between the two inputs; the first layer has 1 neuron and a ReLU activation function, and the second layer has 3 neurons; the two outputs are then added to generate the channel weights, and finally the channel weights are normalized to the interval (0, 1) by a normalization layer.
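A forward pass of such a channel attention module could look like the following sketch. It is not the patent's code: the function name, the parameter layout (`w1`, `b1`, `w2`, `b2`), the [channel][row][col] indexing and, in particular, the use of a sigmoid as the normalization layer are assumptions.

```python
import math
from typing import List, Sequence

Channel = List[List[float]]  # one channel, indexed [row][col]


def channel_attention_weights(fmap: Sequence[Channel],
                              w1: Sequence[float], b1: float,
                              w2: Sequence[float], b2: Sequence[float]
                              ) -> List[float]:
    """Return one weight in (0, 1) per channel of a 3-channel feature map."""
    # Global average pooling and global maximum pooling, one value per channel.
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]

    def shared_mlp(v: Sequence[float]) -> List[float]:
        # First layer: 1 neuron followed by ReLU.
        hidden = max(0.0, sum(w * x for w, x in zip(w1, v)) + b1)
        # Second layer: 3 neurons, one per channel.
        return [w * hidden + b for w, b in zip(w2, b2)]

    # The same two-layer network processes both pooled vectors.
    summed = [a + m for a, m in zip(shared_mlp(avg), shared_mlp(mx))]
    # Assumption: the "normalization layer" is a sigmoid.
    return [1.0 / (1.0 + math.exp(-s)) for s in summed]
```

The single hidden neuron acts as a bottleneck: the three pooled channel statistics are compressed to one value and then expanded back to three channel weights.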

Preferably, the output feature maps of the shallow-branch convolutional layers, namely the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch, are input to the channel attention module; channel weights are obtained through the pooling operations and are then multiplied with the original output feature map, so that the channel weights are mapped onto the output feature map of the shallow-branch convolutional layer; the feature map carrying the channel weights is residual-connected with the output feature map of the corresponding deep-branch convolutional layer, namely the second convolutional layer of the first branch, the fourth convolutional layer of the first branch, and the first convolutional layer of the second branch, and the result is input to the next convolutional layer of the deep branch.

Preferably, in step (1), a KS903 vibration sensor is used to collect the vibration signals of the X, Y and Z axes of the machine tool spindle at a sampling frequency of 10240 Hz, and a 19JC digital universal tool microscope is used to collect the tool wear value.

Preferably, in the step (2), tool wear stage classification is performed according to the tool wear value, and the classification result is used as a label to classify the vibration signal; the method specifically comprises the following steps:

drawing a curve graph of the maximum wear value of the secondary flank of the full life cycle of the tool and a time domain graph of a full life cycle vibration signal of the X axis of the machine tool spindle; the abscissa of the graph of the maximum wear value is the number of feed times, and the ordinate is the maximum wear value; the abscissa of the time domain graph of the vibration signal is time, and the ordinate is the amplitude of the vibration signal;

dividing the vibration signals into three categories according to the change of the slope of the maximum wear value curve, wherein when the slope of the maximum wear value curve is not more than 0.01, the corresponding vibration signals are in a rapid initial wear stage, when the slope of the maximum wear value curve is between 0.01 and 0.3, the corresponding vibration signals are in a steady-state wear stage, and when the slope of the maximum wear value curve is not less than 0.3, the corresponding vibration signals are in a rapid wear stage;
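The slope thresholds above translate into a simple labelling rule, sketched below. The thresholds and stage names are taken from the text; the function names, the estimate of the slope as the difference in maximum wear value between consecutive passes, and the reuse of the first difference for the first pass are assumptions, since the text does not say how the slope is computed or in which units.

```python
from typing import List, Sequence

STAGES = ("rapid initial wear", "steady-state wear", "rapid wear")


def wear_stage(slope: float) -> int:
    """Map a wear-curve slope to a stage index into STAGES."""
    if slope <= 0.01:
        return 0
    if slope < 0.3:
        return 1
    return 2


def label_passes(vb_max: Sequence[float]) -> List[int]:
    """Label every cutting pass from the maximum wear value curve."""
    if len(vb_max) < 2:
        raise ValueError("need at least two wear measurements")
    # Assumption: slope = change in maximum wear value per pass.
    diffs = [b - a for a, b in zip(vb_max, vb_max[1:])]
    slopes = [diffs[0]] + diffs  # the first pass reuses the first difference
    return [wear_stage(s) for s in slopes]
```

Every vibration slice recorded during a pass would then inherit the label of that pass.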

Preferably, according to the invention, in step (4), a cross-entropy function L is used as the loss function, as shown in formula (IV):

$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i \tag{IV}$$

In formula (IV), $y_i$ denotes the actual value, $\hat{y}_i$ denotes the predicted value, and N is the number of samples.
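For a three-class problem, the cross-entropy loss is usually evaluated over one-hot labels and predicted class probabilities. The sketch below shows that common form; the function name, the one-hot encoding and the small `eps` clamp that avoids log(0) are assumptions, not details given in the text.

```python
import math
from typing import Sequence


def cross_entropy(y_true: Sequence[Sequence[float]],
                  y_pred: Sequence[Sequence[float]],
                  eps: float = 1e-12) -> float:
    """Mean cross entropy over N samples.

    y_true holds one-hot labels, y_pred holds class probabilities.
    """
    n = len(y_true)
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        # Only the true class contributes, because the other labels are 0.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / n
```

A model that assigns probability 0.5 to the correct class of every sample has a loss of log 2, and a perfect prediction has a loss of 0.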

A computer device comprising a memory storing a computer program and a processor implementing the steps of a multi-scale-channel attention network based tool failure diagnosis method when executing the computer program.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for tool fault diagnosis based on a multiscale-channel attention network.

The invention has the beneficial effects that:

1. Based on the mapping between the machine tool spindle vibration signal and the tool wear state, a convolutional neural network is used to diagnose tool faults.

2. Based on the idea of feature fusion, vibration signals in three directions are used as three channels of an input feature graph, the vibration signals of three axes of a machine tool spindle are spliced, and then the spliced vibration signals are input into a convolutional neural network.

3. In order to more effectively fuse the vibration signals in the three directions, the channel attention mechanism is introduced into the multi-scale convolutional neural network, and different importance degrees of the vibration signals in the three directions to the tool state classification are mapped to the characteristic learning process by utilizing the interdependency among the channels.

Drawings

FIG. 1(a) is a graph of maximum flank wear values for a full life cycle of a tool;

FIG. 1(b) is a time domain diagram of a full-life-cycle vibration signal in the X direction of a machine tool spindle;

FIG. 2 is a schematic structural diagram of a multi-scale-channel attention network model;

FIG. 3(a) is a graph of accuracy for a multi-scale-channel attention network model training process;

FIG. 3(b) is a graph of a loss function of a multi-scale-channel attention network model training process;

FIG. 4 is a schematic diagram of a confusion matrix;

Detailed Description

The invention is further described below with reference to the drawings and examples, but is not limited thereto.

Example 1

(1) data acquisition:

To collect real data that match actual industrial production scenarios, the invention designs a tool wear test platform. The platform is used to collect the vibration signals of the X, Y and Z axes of the machine tool spindle and the tool wear value after each cutting pass, wherein the spindle coordinate system containing the X, Y and Z axes is established as a right-handed Cartesian coordinate system;

(2) data preprocessing:

the slope of the tool wear value curve reflects the degree of tool wear in the current stage, the tool wear stage is classified according to the tool wear value, and the vibration signals are classified by taking the classification result as a label;

segmenting the vibration signals of the X, Y and Z axes of the machine tool spindle in time order into slices of length n², and splicing the slices of the X, Y and Z axes to construct an n × n × 3 three-channel input feature map, where n is the height or width of the feature map when a vibration signal slice is converted into a feature map, and n is a user-defined size;

80% of the preprocessed data is used as the training set and 20% as the test set; within the training set, 80% is used for training and 20% as the validation set.

(3) Constructing a multi-scale-channel attention network model:

The multi-scale network comprises three branches whose convolutional depths differ: the deeper network branches extract more global information, the shallower network branches extract more local information, and the features of the different levels are finally fused in a fully connected layer. The multi-scale network places residual connections between convolutional layers of different branches that have the same output feature map scale, which aids gradient propagation in the network and improves its performance.

Before the residual connection is made for a convolutional layer of a shallower network branch, its output feature map is input to the channel attention module, which captures the correlation of the X, Y and Z vibration signals through pooling operations and produces the channel attention weights; the channel weights are then multiplied element-wise with the output feature map, so that the feature map carries the channel weights. The feature map with channel attention weights is then residual-connected with the output feature map of the convolutional layer of the deeper network branch, so that the different importance of the three directions is mapped into the feature learning process.

As shown in fig. 2, the multi-scale-channel attention network model includes an input layer, three branches, and a full connection layer;

Identity mapping is implemented between convolutional layers of different branches that have the same output feature map scale: the feature map of the shallow convolutional branch is residual-connected with the feature map of the deep convolutional branch, and the result is input to the next convolutional layer of the deep branch. Residual connections are provided between the second convolutional layer of the first branch and the first convolutional layer of the second branch, between the fourth convolutional layer of the first branch and the second convolutional layer of the second branch, and between the first convolutional layer of the second branch and the first convolutional layer of the third branch; the results of the residual additions are input to the third convolutional layer of the first branch, the fifth convolutional layer of the first branch, and the second convolutional layer of the second branch, respectively. In the forward pass, a residual connection helps the features in the network perform identity mapping, so that when the output of the shallow network is already optimal it is passed directly to the deep network; in the backward pass, it carries the gradient, so that a deeper model can be trained successfully and the performance of the network improves.

The output feature maps of the shallow-branch convolutional layers are input to the channel attention module before the residual connection is made; specifically, this applies to the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch.

The three branches comprise a first branch, a second branch and a third branch, and the mathematical expressions are respectively shown as formula (I), formula (II) and formula (III):

In formulas (I), (II) and (III), i₁ = 1, 2, …, 5; i₂ = 1, 2; i₃ = 1.

The convolution kernel for each convolutional layer is 3 × 3, and a ReLU nonlinear activation function is added after each convolutional layer.

The output feature map scales of the 5 convolutional layers of the first branch are 64 × 64 × 3, 32 × 32 × 3, 16 × 16 × 3, 8 × 8 × 3 and 4 × 4 × 3, respectively; the output feature map scales of the 2 convolutional layers of the second branch are 32 × 32 × 3 and 8 × 8 × 3, respectively; the output feature map scale of the 1 convolutional layer of the third branch is 32 × 32 × 3.

The channel attention module obtains a global receptive field by applying global average pooling and global maximum pooling separately, and inputs each pooled result into a two-layer neural network whose weights are shared between the two inputs; the first layer has 1 neuron and a ReLU activation function, and the second layer has 3 neurons. The two outputs are then added to generate the channel weights, and finally the channel weights are normalized to the interval (0, 1) by a normalization layer.

The output feature map of a shallow-branch convolutional layer is input to the channel attention module before the residual connection is made, and the interdependency among channels is used to capture the correlation of the X, Y and Z vibration signals, so that the different importance of the three directions is mapped into the feature extraction process through channel attention learning. The output feature maps of the shallow-branch convolutional layers, namely the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch, are input to the channel attention module; channel weights are obtained through the pooling operations and are then multiplied with the original output feature map, so that the channel weights are mapped onto the output feature map of the shallow-branch convolutional layer. The feature map carrying the channel weights is residual-connected with the output feature map of the corresponding deep-branch convolutional layer, namely the second convolutional layer of the first branch, the fourth convolutional layer of the first branch, and the first convolutional layer of the second branch, and the result is input to the next convolutional layer of the deep branch.

Taking the output feature maps $x^{(1)}_{2}$ and $x^{(2)}_{1}$ of the convolutional layers as an example:

$$\tilde{x}^{(1)}_{3} = x^{(1)}_{2} + A\left(x^{(2)}_{1}\right)$$

where $\tilde{x}^{(1)}_{3}$ denotes the input of the third convolutional layer of the first branch, $A$ denotes the channel attention module, $x^{(1)}_{2}$ denotes the output feature map of the second convolutional layer of the first branch, and $x^{(2)}_{1}$ denotes the output feature map of the first convolutional layer of the second branch. Through the efficient channel attention module, the weights of channel features with strong influence are raised and the weights of channel features with weak influence are suppressed.
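The attention-weighted residual connection described in this step can be sketched as follows. The function name and the [channel][row][col] indexing are assumptions, and the channel weights are taken as given; per the description, each shallow-branch channel is scaled by its weight before being added to the deep-branch feature map of the same scale.

```python
from typing import List, Sequence

Channel = List[List[float]]  # one channel, indexed [row][col]


def attention_residual(deep: Sequence[Channel],
                       shallow: Sequence[Channel],
                       weights: Sequence[float]) -> List[Channel]:
    """Scale each shallow channel by its weight and add it to the deep map."""
    return [
        [
            [d + w * s for d, s in zip(d_row, s_row)]
            for d_row, s_row in zip(d_ch, s_ch)
        ]
        for d_ch, s_ch, w in zip(deep, shallow, weights)
    ]
```

A weight near 0 leaves the deep-branch channel almost unchanged, while a weight near 1 adds the shallow-branch channel in full, which is how the relative importance of the X, Y and Z directions enters the fused feature map.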

(5) Testing: the test set is input to the trained multi-scale-channel attention network model, which outputs the tool wear stage corresponding to the test set data. The accuracy of each class of the test set is recorded and a confusion matrix is drawn. Four metrics, namely accuracy, precision, recall and F1 score, are used to describe the performance of the network.
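The four metrics can all be derived from the confusion matrix. The sketch below uses the standard definitions; the function name, the per-class reporting of precision, recall and F1, and the convention that rows are true classes and columns are predicted classes are assumptions, since the text does not say how the metrics are averaged.

```python
from typing import Dict, List, Sequence, Union

Metrics = Dict[str, Union[float, List[float]]]


def classification_metrics(cm: Sequence[Sequence[int]]) -> Metrics:
    """cm[i][j] = number of samples of true class i predicted as class j."""
    k = len(cm)
    total = sum(map(sum, cm))
    precision: List[float] = []
    recall: List[float] = []
    f1: List[float] = []
    for c in range(k):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(k))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    accuracy = sum(cm[c][c] for c in range(k)) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Called on a 3 × 3 confusion matrix for the three wear stages, the function returns the overall accuracy and one precision, recall and F1 value per stage.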

Example 2

The tool fault diagnosis method based on the multi-scale-channel attention network according to Example 1, with the following differences:

In step (1), a KS903 vibration sensor is used to collect the vibration signals of the X, Y and Z axes of the machine tool spindle at a sampling frequency of 10240 Hz, and a 19JC digital universal tool microscope is used to collect the tool wear value. Data acquisition is carried out on the tool wear test platform. The workpiece is cut on a CNC vertical machining center (VDF-850); the tool is a three-flute end mill with a diameter of 10 mm, and the workpiece is a cylindrical No. 45 steel workpiece. The spindle speed of the CNC machine tool is 2000 r/min and the feed rate is 764 mm/min. The milling depth and width are 2 mm and 5 mm, respectively. To accelerate tool wear, dry cutting without cutting fluid is used. The total machining length of each milling cutter is about 110 m, at which point the cutter is severely worn and shows small-area chipping. For each tool, 35 to 47 sets of test data are acquired, and each set of milling takes 4 minutes and 17 seconds. After each set of cutting is finished, the 19JC digital universal tool microscope is used to measure the tool wear value, and the maximum wear width VB_max of the minor flank is used as the label for classifying the tool wear stage.

In the step (2), carrying out tool wear stage classification according to the tool wear value, and classifying the vibration signals by taking a classification result as a label; the method specifically comprises the following steps:

The vibration signals are divided into three classes according to the change in slope of the maximum wear value curve, corresponding to three tool wear stages: a rapid initial wear stage, a steady-state wear stage and a rapid wear stage. The maximum wear value over the full life cycle of the tool is shown in FIG. 1(a). As FIG. 1(a) shows, the tool wears faster during passes 1 to 5, where the slope of the wear curve is larger. During passes 6 to 41, the wear value increases uniformly until it reaches a limit value; this stage is the effective working time of the tool. During passes 42 to 47, the wear value rises rapidly and leads to tool failure, and the slope of the wear curve increases sharply. Although the initial wear stage is not obvious in FIG. 1(a), the time-domain plot in FIG. 1(b) shows a large vibration amplitude in that stage, which is caused by the rough surface and high contact stress of the new cutting edge and by surface defects such as decarburized and oxide layers on the new cutting edge. Based on this analysis, the vibration signal is divided into three stages.

Before the vibration signal is input to the multi-scale-channel attention network model, it is segmented in time order into slices of length 64², and the X, Y and Z vibration signal slices are then spliced as the three channels of an input feature map. Three contemporaneous signal slices in the X, Y and Z directions are thus assembled into a 64 × 64 × 3 input feature map, which is input to the multi-scale-channel attention network model.
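As a rough worked example of how many feature maps one milling pass yields, the sketch below combines the stated sampling frequency (10240 Hz), pass duration (4 minutes 17 seconds) and slice length (64²). It assumes the whole pass is recorded and that slices do not overlap; neither is stated in the text, and the function name is illustrative.

```python
from typing import Tuple


def slices_per_pass(sampling_hz: int, seconds: int, n: int) -> Tuple[int, int, int]:
    """Return (samples per axis, full n*n slices, leftover samples)."""
    samples = sampling_hz * seconds
    slice_len = n * n
    # Assumption: non-overlapping slices, incomplete tail discarded.
    return samples, samples // slice_len, samples % slice_len


samples, full_slices, leftover = slices_per_pass(10240, 4 * 60 + 17, 64)
```

Under these assumptions each axis records 2,631,680 samples per pass, which gives 642 full slices of 4096 samples with 2048 samples left over; one slice spans 4096 / 10240 = 0.4 s of signal.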

In step (4), the number of training iterations is set to 100, and a cross-entropy function L is used as the loss function, as shown in formula (IV):

$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i \tag{IV}$$

In formula (IV), $y_i$ denotes the actual value, $\hat{y}_i$ denotes the predicted value, and N is the number of samples.

During training, the accuracy and the loss function value of the training set and the validation set are recorded in each training period. After training is finished, the model that performs best on the validation set is saved for testing, and the training accuracy curve and the training loss curve are drawn: FIG. 3(a) is the accuracy curve of the multi-scale-channel attention network model training process, and FIG. 3(b) is the loss function curve of the training process.

The accuracy of each class of the test set is recorded and a confusion matrix is drawn; FIG. 4 is a schematic diagram of the confusion matrix. Table 1 lists the four evaluation metrics of the multi-scale-channel attention network model. The performance of the model is described by four metrics: accuracy, precision, recall and F1 score.
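The bookkeeping described here (per-period metrics plus retention of the best validation model) can be sketched independently of any deep learning framework. In the sketch below, `train_one_epoch` and `evaluate` are hypothetical callables that return an (accuracy, loss) pair, and "best" is assumed to mean highest validation accuracy; the text does not specify the selection criterion.

```python
from typing import Callable, Dict, List, Tuple

EpochResult = Tuple[float, float]  # (accuracy, loss)


def run_training(num_epochs: int,
                 train_one_epoch: Callable[[int], EpochResult],
                 evaluate: Callable[[int], EpochResult],
                 ) -> Tuple[Dict[str, List[float]], int]:
    """Record per-epoch metrics and return them with the best epoch index."""
    history: Dict[str, List[float]] = {
        "train_acc": [], "train_loss": [], "val_acc": [], "val_loss": []}
    best_epoch, best_val_acc = -1, float("-inf")
    for epoch in range(num_epochs):
        train_acc, train_loss = train_one_epoch(epoch)
        val_acc, val_loss = evaluate(epoch)
        history["train_acc"].append(train_acc)
        history["train_loss"].append(train_loss)
        history["val_acc"].append(val_acc)
        history["val_loss"].append(val_loss)
        # Assumption: the best model has the highest validation accuracy.
        if val_acc > best_val_acc:
            best_epoch, best_val_acc = epoch, val_acc
            # A real implementation would save the model weights here.
    return history, best_epoch
```

The recorded history is what the accuracy and loss curves are drawn from, and the returned epoch index identifies which saved model to load for testing.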

TABLE 1

As Table 1 shows, the accuracy, precision, recall and F1 score are all high, which indicates that the multi-scale-channel attention network model performs well.

Example 3

A computer device comprising a memory storing a computer program and a processor, wherein the processor implements the steps of the multi-scale-channel attention network based tool fault diagnosis method of Example 1 or 2 when executing the computer program.

Example 4

A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the multi-scale-channel attention network based tool fault diagnosis method of Example 1 or 2.

Claims

1. The cutter fault diagnosis method based on the multi-scale-channel attention network is characterized by comprising the following steps of:

(1) data acquisition:

(2) data preprocessing:

(3) constructing a multi-scale-channel attention network model:

the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch are each input into the channel attention module before the residual connection;

2. The tool fault diagnosis method based on the multi-scale-channel attention network as claimed in claim 1, wherein the mathematical expressions of the three branches, namely the first branch, the second branch and the third branch, are shown in formula (I), formula (II) and formula (III), respectively:

in formulas (I), (II) and (III), i₁ = 1, 2, …, 5; i₂ = 1, 2; i₃ = 1; the two feature-map symbols for the third branch denote, respectively, the output feature map of the i₃-th convolutional layer of the third branch and the output feature map of the (i₃−1)-th convolutional layer of the third branch;

further preferably, the convolution kernel of each convolution layer is 3 × 3, and a ReLU nonlinear activation function is added after each convolution layer;

3. The method as claimed in claim 2, wherein the channel attention module obtains a global receptive field by using a global average pooling operation and a global max pooling operation, and inputs the two pooled results into a shared two-layer neural network, in which the first layer has 1 neuron with a ReLU activation function and the second layer has 3 neurons; the two outputs are then added to generate the channel weights, and the channel weights are finally normalized to the interval (0,1) by a normalization layer.

4. The multi-scale-channel attention network-based tool fault diagnosis method according to claim 2, characterized in that the output feature maps of the shallow-branch convolutional layers, namely the output feature map of the first convolutional layer of the second branch, the output feature map of the second convolutional layer of the second branch, and the output feature map of the first convolutional layer of the third branch, are input into the feature attention module; channel weights are obtained through the pooling operations and multiplied by the original output feature map, so that the channel weights are mapped onto the output feature map of the shallow-branch convolutional layer; the feature map carrying the channel weights is then residually connected with the output feature map of the corresponding deep-branch convolutional layer, namely the second convolutional layer of the first branch, the fourth convolutional layer of the first branch, and the first convolutional layer of the second branch, and the result is input into the next convolutional layer of the deep branch.

5. The tool fault diagnosis method based on the multi-scale-channel attention network according to claim 1, characterized in that in the step (1), a vibration sensor of model KS903 is used to collect vibration signals along the X, Y and Z axes of the machine tool spindle at a sampling frequency of 10240 Hz, and the tool wear value is measured using a 19JC digital universal tool microscope.

6. The multi-scale-channel attention network-based tool fault diagnosis method according to claim 1, wherein 80% of the preprocessed data is used as the training set and 20% as the test set, and, within the training set, 80% is used for training and 20% is used as the validation set.

7. The tool fault diagnosis method based on the multi-scale-channel attention network according to claim 1, wherein in the step (2), tool wear stages are classified according to the tool wear values, and the vibration signals are labeled with the classification results; the method specifically comprises the following steps:

the vibration signals are divided into three categories according to the change in slope of the maximum wear value curve: when the slope of the maximum wear value curve is not more than 0.01, the corresponding vibration signals belong to the rapid initial wear stage; when the slope is between 0.01 and 0.3, the corresponding vibration signals belong to the steady-state wear stage; and when the slope is not less than 0.3, the corresponding vibration signals belong to the rapid wear stage.

8. The tool fault diagnosis method based on the multi-scale-channel attention network according to any one of claims 1-7, wherein in the step (4), a cross-entropy function L is used as the loss function, as shown in formula (IV):

$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i \qquad \text{(IV)}$$

in formula (IV), $y_i$ denotes the actual value, $\hat{y}_i$ denotes the predicted value, and N is the number of samples.

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of a multi-scale-channel attention network-based tool fault diagnosis method.

10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of a multi-scale-channel attention network-based tool fault diagnosis method.
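As an illustration of the channel attention computation recited in claims 3 and 4, the following is a minimal sketch in plain Python, not the claimed implementation. Several points are assumptions: the feature map is represented as three channels, each flattened to a list of values; the normalization layer is taken to be a sigmoid, since the claims only state that the weights are normalized to (0,1); and the parameter names `w1`, `b1`, `w2`, `b2`, the function names and all numeric values are hypothetical.

```python
import math


def shared_mlp(pooled, w1, b1, w2, b2):
    """Two-layer network shared by both pooled vectors.

    First layer: 1 neuron with ReLU. Second layer: 3 neurons (one per channel).
    """
    hidden = max(0.0, sum(w * x for w, x in zip(w1, pooled)) + b1)
    return [w * hidden + b for w, b in zip(w2, b2)]


def channel_weights(feature_map, w1, b1, w2, b2):
    """Channel weights in (0, 1) from global average and global max pooling."""
    avg_pooled = [sum(ch) / len(ch) for ch in feature_map]
    max_pooled = [max(ch) for ch in feature_map]
    avg_out = shared_mlp(avg_pooled, w1, b1, w2, b2)
    max_out = shared_mlp(max_pooled, w1, b1, w2, b2)
    # Add the two outputs, then squash with a sigmoid (assumed normalization).
    return [1.0 / (1.0 + math.exp(-(a + m))) for a, m in zip(avg_out, max_out)]


def weighted_residual(shallow, deep, weights):
    """Scale each shallow-branch channel by its weight, then add the deep-branch map."""
    return [[w * s + d for s, d in zip(s_ch, d_ch)]
            for w, s_ch, d_ch in zip(weights, shallow, deep)]


# Made-up three-channel feature maps (e.g. one channel per vibration axis).
shallow = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [-1.0, 0.0, 1.0]]
deep = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
weights = channel_weights(shallow, w1=[0.2, -0.1, 0.4], b1=0.0,
                          w2=[1.0, -1.0, 0.5], b2=[0.0, 0.0, 0.0])
fused = weighted_residual(shallow, deep, weights)
```

In this sketch the fused map plays the role of the input to the next convolutional layer of the deep branch; in a trained network the parameters of the shared two-layer network would be learned rather than fixed by hand.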