CN112686372A - Product performance prediction method based on depth residual GRU neural network - Google Patents

Product performance prediction method based on depth residual GRU neural network

Info

Publication number
CN112686372A
CN112686372A (application CN202011577005.2A)
Authority
CN
China
Prior art keywords
neural network, GRU, residual, assembly, GRU neural
Legal status
Pending
Application number
CN202011577005.2A
Other languages
Chinese (zh)
Inventor
钟百鸿
王琳
钟诗胜
林琳
Current Assignee
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai filed Critical Harbin Institute of Technology Weihai
Priority claimed from CN202011577005.2A
Publication of CN112686372A
Legal status: Pending

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 — Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 — Computing systems specially adapted for manufacturing

Abstract

The invention discloses a product performance prediction method based on a deep residual GRU neural network, which comprises the following steps: step one, constructing a deep residual GRU neural network model; step two, predicting the product performance based on the deep residual GRU neural network, wherein the prediction result is used to guide the assembly of the parts of a complex precision product. The invention provides a novel deep learning method, the deep residual GRU neural network (DRGRUNN), which organically integrates the advantages of the GRU neural network and the residual neural network so as to improve the network's ability to learn the features of assembly parameters and realize high-precision prediction of the performance of complex precision products.

Description

Product performance prediction method based on depth residual GRU neural network
Technical Field
The invention relates to a prediction method, in particular to a product performance prediction method based on a depth residual GRU neural network.
Background
Industry 4.0 links production flows with intelligent production technology and is leading a new round of industrial technological revolution. Applying Industry 4.0 digitization technology to the assembly of complex precision products makes it possible to predict their assembly performance effectively and accurately, improve product production efficiency, and precisely control product assembly quality.
In order to ensure high-quality and reliable operation, complex precision products must have excellent assembly quality. Tolerance analysis is a powerful tool for predicting product assembly quality; however, complex precision products usually have complex structures, many assembled parts, compact assembly spaces, and high reliability requirements, and the fit tolerances of their parts are limited by measurement cost, measurement technology, and the like, so it is difficult to characterize product assembly quality by effectively measuring the tolerances of the assembled parts. In actual assembly, a trial-and-error method is generally combined with performance testing to assemble the parts of complex precision products, and assembly quality is characterized by performance test values. However, this manual trial-and-error matching leads to a large number of invalid assemblies, which reduces production efficiency. On the other hand, the assembly parameters of different parts influence one another, so it is difficult to establish a performance prediction model for complex precision products purely through theoretical derivation such as dynamics.
With the development of Industry 4.0 digitization technology, data-driven deep learning has become a useful tool for performance prediction. Deep learning methods can automatically learn features from the original input data and achieve performance prediction with higher precision, and various deep learning methods have been applied to performance prediction. However, for the traditional artificial neural network, effectively extracting the association relationships in the data through network weight optimization is a difficult task. The network weights are adjusted by back-propagating error information, and the gradient of the error function gradually becomes inaccurate after being back-propagated through many layers, leading to the catastrophic consequences of gradient explosion and gradient vanishing. As a result, the parameters of the trainable layers of the network cannot be effectively optimized, and it is difficult to effectively learn the correlation features between earlier and later parts of the original input data, which results in poor performance prediction.
The GRU neural network is a deep learning method developed in recent years; it has information memory capacity and learns the sequential correlation features of data more easily than traditional neural networks. Research shows that deeper neural networks help extract subtle features of data; however, training a deep GRU neural network is difficult, mainly because gradient explosion and gradient vanishing easily occur during training.
The assembly parameters of complex precision products greatly affect product performance. Assembly parameters refer to a series of parameters that characterize product features and are generated during product assembly, such as dimensional tolerances, form and position tolerances, and surface roughness, and also include parameter items generated during debugging and inspection before product forming, such as gyro rotor drift test values and coil component resistance values. The information correlation among the assembly parameters of parts of complex precision products from different assembly units influences product performance. However, for this correlation information among assembly parameters, the feature learning ability of the traditional artificial neural network is often unsatisfactory, and the high-dimensional features learned at its output layer lack sufficient discrimination ability to accurately predict product performance. Developing a deep learning method that effectively learns the associated features of assembly parameters and thereby improves the prediction accuracy of product performance is therefore an urgent task.
Disclosure of Invention
In order to overcome the defects of the technology, the invention provides a product performance prediction method based on a depth residual GRU neural network.
In order to solve the technical problems, the invention adopts the technical scheme that: a product performance prediction method based on a depth residual GRU neural network comprises the following steps:
step one, constructing a depth residual GRU neural network model;
step two, predicting the product performance based on the deep residual GRU neural network, wherein the prediction result is used to guide the assembly of the parts of the complex precision product.
Further, in the first step, a deep residual GRU neural network model is built by adopting the GRU neural network and the residual neural network, the learning capability of the assembly parameter association characteristics is enhanced by means of the information memory capability of the GRU neural network, a deep network structure is built through a residual connecting structure of the residual neural network, and the product assembly parameter characteristics are extracted.
Further, a residual connection structure is introduced into the GRU neural network to obtain two residual GRU structures (collectively referred to as RG) and a deep residual GRU neural network formed by stacking RGs.
Further, two residual GRU structures are: basic RG structure, RG structure with BN and Dropout;
the basic RG structure consists of two ReLU layers and two GRU layers, and the RG structure with BN and Dropout consists of two BN layers, two ReLU layers, two GRU layers and two Dropout layers.
Further, the deep residual GRU neural network is formed by stacking N RG blocks and is then connected, after a flattening operation, to a fully connected layer that outputs the result.
Further, the BN layer refers to a batch normalization layer; the BN layer is added to the RG to optimize the neural network training process, and the specific processing is as follows:
μ = (1/n) Σ_{i=1}^{n} x_i    Formula 7
σ = (1/n) Σ_{i=1}^{n} (x_i − μ)²    Formula 8
x̂_n = (x_n − μ) / √(σ + ε)    Formula 9
y_n = α·x̂_n + β    Formula 10
In the formulas, x_n and y_n respectively represent the input and output of the batch normalization layer; n represents the size of the batch normalization layer, namely the number of samples; α and β are trainable parameters; ε is a constant close to 0; σ is the variance of the batch normalization layer input samples; μ is the mean of the batch normalization layer input samples.
Further, the ReLU layer is a ReLU activation function, which is used for nonlinear transformation in the neural network to optimize the training process; as shown in Formula 11, its derivative is either 1 or 0:
y = max(x, 0)    Formula 11
In the formula, x and y respectively represent the input and output of the ReLU activation function.
Further, a Dropout layer is added to the RG to optimize the neural network training process by randomly discarding neurons at a drop rate in the range (0, 1).
Further, in the deep residual GRU network, network parameters are optimized, and the process is as follows:
let l-thRG input be RGlOutput is RGl+1Then the RG performs the following calculation procedure:
RGl+1=RGl+F(RGl,Wl RG) Formula 12
Wherein F represents a residual function, Wl RGRepresenting trainable weight parameters within the RG; l-th denotes the l-th RG block;
by means of a recursive relationship, the following results can be obtained:
Figure BDA0002864575920000041
wherein i represents the number of RGs (RG)iDenotes the input of the ith RG block, Wi RG) Represents a trainable weight parameter in the ith RG, L is the total number of RGs;
in the backward propagation, the loss function is recorded as E, and the loss function can be obtained by a chain method:
Figure BDA0002864575920000042
in the formula (I), the compound is shown in the specification,
Figure BDA0002864575920000043
representing a derivation process;
formula 14 shows that trainable parameters are optimized in a depth residual GRU neural network by a linear superposition mode;
when the neural network is trained with the cross-entropy error as the loss function, the output-layer features are converted into the range [0, 1] by the softmax function, where the softmax function is:
y_i = exp(x_i) / Σ_{j=1}^{M} exp(x_j)    Formula 15
In the formula, x and y are, in order, the input and output of the softmax function, M is the number of categories, and i and j are indices of the output-layer neurons; the cross-entropy loss E is given by Formula 16:
E = −Σ_{j=1}^{M} target_j · log(y_j)    Formula 16
In the formula, target_j represents the true category label of the sample;
after the cross-entropy loss is calculated, the network is iteratively trained by gradient descent, as shown in Formula 17, to adjust the network parameters and obtain optimized network parameters:
w_{i+1} = w_i − λ·(∂E/∂w_i)    Formula 17
In the formula, w_i is the neural network weight parameter obtained after the i-th iteration, and λ is the learning rate.
Further, in the second step, the performance of the complex precision product is predicted based on the deep residual GRU neural network: the assembly parameters of the parts of the complex precision product are taken as the input of the deep residual GRU neural network and the product performance classification result as the output, the relationship between assembly parameters and performance is established, the performance of the assembled product is predicted from the part assembly parameters before assembly begins, and whether to assemble the complex precision product is determined according to the model prediction result.
The invention discloses a product performance prediction method based on a deep residual GRU neural network and provides a novel deep learning method, the deep residual GRU neural network (DRGRUNN), which organically integrates the advantages of the GRU neural network and the residual neural network so as to improve the network's ability to learn the features of assembly parameters and realize high-precision prediction of the performance of complex precision products.
DRGRUNN strengthens the learning of the associated features of assembly parameters by means of the GRU's information memory ability and constructs a deep network structure through the residual connection structure to extract richer and more subtle assembly parameter features, so that the trainable parameters of the deep network are easier to optimize; it therefore has better feature learning ability than a traditional neural network. In addition, BN and Dropout are used to further optimize the training process, so that DRGRUNN can extract richer and finer associated assembly parameter features by constructing a deeper network and achieve high-precision classification prediction of the performance of complex precision products. Using DRGRUNN to guide the assembly of the parts of complex precision products effectively reduces invalid assembly of product parts and improves product production efficiency.
Drawings
FIG. 1 is a flow chart of the assembly of the components of the product based on the performance prediction of the deep residual GRU of the present invention.
Fig. 2 is a schematic view of a GRU structure of the present invention.
Fig. 3 is a schematic diagram of a residual block structure according to the present invention.
Fig. 4 is a diagram of the basic RG structure, one of the two residual GRU structures of the present invention.
Fig. 5 is a diagram of the RG structure with BN and Dropout, the other of the two residual GRU structures of the present invention.
Fig. 6 is a schematic diagram of a deep residual GRU neural network structure according to the present invention.
FIG. 7 is a schematic diagram of the assembly of the gyro rotor and the coil component of the power follow-up gyroscope according to the embodiment of the present invention.
FIG. 8 is a flow chart of the assembly of the gyro rotor and the coil assembly according to the embodiment of the present invention.
Fig. 9 is a training accuracy graph corresponding to an increase in the number of layers of the performance prediction models for the conventional artificial neural network, the conventional GRU neural network, and the depth residual GRU neural network in the embodiment of the present invention.
Fig. 10 is a graph showing the variation in test accuracy corresponding to the increase in the number of layers of the performance prediction models of the conventional artificial neural network, the conventional GRU neural network, and the depth residual GRU neural network in the embodiment of the present invention.
FIG. 11 is a comparison graph of training accuracy of a conventional artificial neural network and a conventional GRU neural network performance prediction model in an embodiment of the present invention.
Fig. 12 is a comparison graph of the test accuracy of the performance prediction model of the conventional artificial neural network and the conventional GRU neural network in the embodiment of the present invention.
Fig. 13 is a comparison graph of the training accuracy of the performance prediction model of the conventional GRU neural network and the depth residual GRU neural network in the embodiment of the present invention.
Fig. 14 is a comparison graph of the test accuracy of the conventional GRU neural network and the depth residual GRU neural network performance prediction model in the embodiment of the present invention.
FIG. 15 is a dimension-reduction visualization of the high-dimensional features of the assembled positive and negative sample observations at the DRGRUNN_BD input layer, RG_4, RG_7, and flattening-layer outputs, with the number of RGs being 11, in an embodiment of the invention.
In the figure: 1. a gyro rotor; 2. a coil component.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
The invention discloses a product performance prediction method based on a deep residual GRU neural network, which combines the advantages of the GRU neural network and the residual neural network to improve the network's ability to learn the associated features of the assembly parameters of complex precision products and finally obtain high-precision performance prediction capability; the basic flow is shown in FIG. 1 and comprises the following two processing steps:
step one, constructing a depth residual GRU neural network model;
step two, predicting the product performance based on the deep residual GRU neural network, wherein the prediction result is used to guide the assembly of the parts of the complex precision product.
The deep residual GRU neural network model is constructed from the GRU neural network and the residual neural network. Therefore, to facilitate understanding of the scheme of the invention, the basic theory of the GRU neural network and the residual neural network is first introduced, and then the design principle and model structure of the combination of the two networks, namely the deep residual GRU neural network, are emphasized.
One, gated recurrent unit (GRU) neural network
A traditional gated recurrent unit (GRU) neural network is constructed by stacking multiple GRUs. The GRU, as a variant of the recurrent neural network (RNN), solves the long-term dependence problem of information learning by introducing a gating mechanism; compared with another RNN variant, the long short-term memory (LSTM) network, it has fewer parameters and trains more efficiently in certain respects. The GRU is shown schematically in FIG. 2, where x(t), h(t), r, z, and c represent the input, output, reset gate, update gate, and short-term memory of the unit at time t, σ is the sigmoid activation function, ⊙ denotes element-wise multiplication of vectors, and ⊕ denotes vector addition. The update gate determines how much memory information of the previous cell is retained, and the reset gate combines the new input with the memory information of the previous cell. The GRU update formulas are shown in Formulas 1 to 4:
z = σ(W_z·h(t−1) + U_z·x(t))    Formula 1
r = σ(W_r·h(t−1) + U_r·x(t))    Formula 2
c = tanh(W_c·(r ⊙ h(t−1)) + U_c·x(t))    Formula 3
h(t) = z ⊙ h(t−1) + (1 − z) ⊙ c    Formula 4
In the formulas, W_z, W_r, W_c, U_z, U_r, and U_c are trainable weight matrices.
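For concreteness, the update defined by Formulas 1 to 4 can be traced in a few lines of NumPy. The following sketch is illustrative only: the random weights, the toy dimensions, and the gate convention of Formula 4 follow the reconstruction above and are not code from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U):
    """One GRU time step following Formulas 1-4."""
    z = sigmoid(W["z"] @ h_prev + U["z"] @ x_t)        # update gate, Formula 1
    r = sigmoid(W["r"] @ h_prev + U["r"] @ x_t)        # reset gate,  Formula 2
    c = np.tanh(W["c"] @ (r * h_prev) + U["c"] @ x_t)  # short-term memory, Formula 3
    return z * h_prev + (1.0 - z) * c                  # new hidden state,  Formula 4

# toy dimensions: 10 assembly parameters in, 40 hidden units (assumed for illustration)
d_in, d_h = 10, 40
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(d_h, d_h)) for k in ("z", "r", "c")}
U = {k: rng.normal(scale=0.1, size=(d_h, d_in)) for k in ("z", "r", "c")}
h = gru_step(rng.normal(size=d_in), np.zeros(d_h), W, U)
```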
Two, residual neural network
The residual neural network was originally applied to image processing and is formed by stacking multiple residual blocks. The basic residual block structure is shown in FIG. 3 and mainly includes two channels: one directly connects the input and the output (identity mapping), and the other passes through a weight-update path composed of two weight layers and two linear rectification (ReLU) activation function layers. The layer-connection structure of the residual block makes network parameter optimization easier and helps reduce the risks of gradient vanishing and gradient explosion during network training.
In FIG. 3, X and H(X) are the input and output of the residual block, respectively; W_1 and W_2 are the weight layers of the residual block, which are mainly convolution layers and help extract local features of the image; F(X) is the residual mapping. The residual block performs the following operations:
F(X) = W_2·α(W_1·X)    Formula 5
H(X) = α(F(X) + X)    Formula 6
where α refers to the ReLU activation function.
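A minimal NumPy sketch of Formulas 5 and 6, with random matrices standing in for the two weight layers purely for illustration, is as follows.

```python
import numpy as np

def relu(x):                      # the activation alpha in Formulas 5-6
    return np.maximum(x, 0.0)

def residual_block(X, W1, W2):
    F = W2 @ relu(W1 @ X)         # residual mapping, Formula 5
    return relu(F + X)            # identity shortcut plus activation, Formula 6

d = 8
rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.1, size=(d, d))
W2 = rng.normal(scale=0.1, size=(d, d))
H = residual_block(rng.normal(size=d), W1, W2)
```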
Three, constructing the deep residual GRU neural network model
The GRU neural network has information memory capacity, and this advantage can be used to learn the association relationships among the part assembly data of complex precision products. The residual block structure in the residual neural network can effectively alleviate the difficulty of training deep neural networks; this advantage can be used to construct a deep GRU neural network that extracts finer and richer features from the assembly data of complex precision products and improves the network's performance prediction capability. Therefore, organically integrating the advantages of the GRU neural network and the residual neural network forms an effective way to achieve high-precision performance prediction of complex precision products based on assembly parameters.
The present invention introduces a residual connection structure into the GRU neural network and proposes two residual GRU structures (collectively referred to as RG), as shown in FIGS. 4 and 5: FIG. 4 is the basic RG structure, composed of two ReLU layers and two GRU layers, and FIG. 5 is the RG structure with batch normalization (BN) and Dropout, composed of two BN layers, two ReLU layers, two GRU layers, and two Dropout layers. A deep residual GRU neural network (DRGRUNN) is obtained by stacking RGs, as shown in FIG. 6: it is stacked from N RG blocks and then connected, through a flattening (Flatten) operation, to a fully connected layer (FC) that outputs the result.
As shown in FIGS. 4, 5, and 6, DRGRUNN forms residual connections between multiple GRUs, which makes it possible to construct a deep GRU network that extracts more and richer features than a conventional GRU neural network. Compared with the traditional residual neural network, the residual-block weight layers in DRGRUNN consist mainly of GRUs rather than convolution layers; the advantage is that the GRUs learn the sequential correlation features among the parameters rather than the local features learned by convolution layers.
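One possible reading of the two RG variants of FIGS. 4 and 5 in Keras (the framework named later in this description) is sketched below; the exact layer ordering inside each block and the assumption that the block input already has `units` features (so the identity shortcut can be added directly) are interpretations, not details taken from the patent.

```python
from tensorflow.keras import layers

def basic_rg(x, units):
    # Basic RG (Fig. 4): two ReLU layers and two GRU layers plus an identity shortcut.
    y = layers.Activation("relu")(x)
    y = layers.GRU(units, return_sequences=True)(y)
    y = layers.Activation("relu")(y)
    y = layers.GRU(units, return_sequences=True)(y)
    return layers.Add()([x, y])      # RG_{l+1} = RG_l + F(RG_l, W_l^RG)

def rg_bd(x, units, drop=0.5):
    # RG with BN and Dropout (Fig. 5): two BN, two ReLU, two GRU and two Dropout layers.
    y = x
    for _ in range(2):
        y = layers.BatchNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.GRU(units, return_sequences=True)(y)
        y = layers.Dropout(drop)(y)
    return layers.Add()([x, y])
```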
Further, the training process of the depth residual GRU neural network is optimized.
The BN layer refers to a batch normalization (BN) layer, a technique for normalizing input features. Adding a BN layer to the RG optimizes the neural network training process and effectively alleviates the internal covariate shift that occurs during training. The specific processing is as follows:
μ = (1/n) Σ_{i=1}^{n} x_i    Formula 7
σ = (1/n) Σ_{i=1}^{n} (x_i − μ)²    Formula 8
x̂_n = (x_n − μ) / √(σ + ε)    Formula 9
y_n = α·x̂_n + β    Formula 10
In the formulas, x_n and y_n are the input and output of the batch normalization layer, respectively; n is the batch size, i.e., the number of samples; α and β are trainable parameters; ε is a constant close to 0; σ is the variance of the batch normalization layer input samples; μ is the mean of the batch normalization layer input samples.
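A minimal NumPy sketch of the batch normalization forward pass of Formulas 7 to 10 (with σ denoting the batch variance, as above) is given below for illustration.

```python
import numpy as np

def batch_norm(x, alpha, beta, eps=1e-5):
    mu = x.mean(axis=0)                    # batch mean,     Formula 7
    var = ((x - mu) ** 2).mean(axis=0)     # batch variance, Formula 8
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalization,  Formula 9
    return alpha * x_hat + beta            # scale and shift, Formula 10

x = np.random.default_rng(2).normal(size=(64, 40))   # batch of 64 samples, 40 features
y = batch_norm(x, alpha=np.ones(40), beta=np.zeros(40))
```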
The ReLU layer is a ReLU activation function, which is used for nonlinear transformation in the neural network; it optimizes the training process and further reduces the risks of gradient vanishing and gradient explosion in deep neural network training. As shown in Formula 11, its derivative is either 1 or 0:
y = max(x, 0)    Formula 11
In the formula, x and y respectively represent the input and output of the ReLU activation function.
The Dropout layer serves as a regularization technique: Dropout is added to the RG and randomly discards a certain proportion of neurons (a drop rate in the range (0, 1)), which alleviates model overfitting and improves generalization ability. In addition, when the Dropout technique is applied to the input layer, it can be regarded as a way of adding noise.
To enable the deep residual GRU neural network to predict the performance of complex products more accurately, its network parameters are optimized as follows:
as shown in FIG. 6, assume that the l-thRG input is RGlOutput is RGl+1Then the RG performs the following calculation procedure:
RGl+1=RGl+F(RGl,Wl RG) Formula 12
Wherein F represents a residual function, Wl RGRepresenting trainable weight parameters within the RG; l-th denotes the l-th RG block;
by means of a recursive relationship (Recursively), the following result can be obtained:
Figure BDA0002864575920000101
wherein i represents the number of RGs (RG)iDenotes the input of the ith RG block, Wi RG) Represents a trainable weight parameter in the ith RG, L is the total number of RGs;
in the backward propagation, the loss function is recorded as E, and the loss function can be obtained by a chain method:
Figure BDA0002864575920000102
in the formula (I), the compound is shown in the specification,
Figure BDA0002864575920000111
representing a derivation process;
equation 14 shows that the trainable parameters are optimized in the depth residual GRU neural network by a linear superposition method, so that the problem of gradient explosion and gradient disappearance caused by increasing the number of network layers is more effectively suppressed in back propagation.
When the neural network is trained with the cross-entropy error as the loss function, the output-layer features are converted into the range [0, 1] by the softmax function, where the softmax function is:
y_i = exp(x_i) / Σ_{j=1}^{M} exp(x_j)    Formula 15
In the formula, x and y are, in order, the input and output of the softmax function, M is the number of categories, and i and j are indices of the output-layer neurons; the cross-entropy loss E is given by Formula 16:
E = −Σ_{j=1}^{M} target_j · log(y_j)    Formula 16
In the formula, target_j represents the true category label of the sample.
After the cross-entropy loss is calculated, the network is iteratively trained by gradient descent, as shown in Formula 17, to adjust the network parameters and obtain optimized network parameters:
w_{i+1} = w_i − λ·(∂E/∂w_i)    Formula 17
In the formula, w_i is the neural network weight parameter obtained after the i-th iteration, and λ is the learning rate.
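A minimal NumPy sketch of Formulas 15 to 17 for the two-class case, with placeholder logits and a symbolic gradient value, is as follows.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())               # Formula 15 (shifted for numerical stability)
    return e / e.sum()

def cross_entropy(y, target):
    return -np.sum(target * np.log(y))    # Formula 16

logits = np.array([1.2, -0.3])            # M = 2 classes (positive / negative sample)
target = np.array([1.0, 0.0])             # true class label
y = softmax(logits)
E = cross_entropy(y, target)

# Formula 17: w_{i+1} = w_i - lambda * dE/dw_i (gradient value chosen symbolically here)
w, lam, dE_dw = 0.5, 0.001, 0.02
w_next = w - lam * dE_dw
```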
Finally, the assembly parameters of the parts of the complex precision product are taken as the input of the deep residual GRU neural network and the product performance classification result as the output; the relationship between assembly parameters and performance is established, the post-assembly product performance is predicted from the part assembly parameters before assembly begins, and whether to carry out the assembly is determined according to the model prediction result, thereby reducing invalid assembly and improving product production efficiency and assembly quality.
The statistical-classification-based DRGRUNN disclosed in the present invention was implemented in Keras 2.3.1 with TensorFlow 2.1.0 as its backend library. Keras is highly modular, simple, and extensible, and is considered one of the most popular deep learning development tools. The experiments were performed on a computer equipped with an i7-8750 central processing unit and an NVIDIA QUADRO P1000 GPU.
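As an illustration of how such a model could be assembled in this environment, the following hedged sketch builds a DRGRUNN_BD-style network; treating the ten assembly parameters as a length-10 sequence of scalars, lifting them to 40 features with an initial GRU layer so the shortcuts match, and the mapping from the depth index N to the number of RG blocks are assumptions rather than details from the patent.

```python
from tensorflow.keras import layers, models

def rg_bd(x, units, drop=0.5):
    # RG block with BN and Dropout (see the sketch in the model-construction section).
    y = x
    for _ in range(2):
        y = layers.BatchNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.GRU(units, return_sequences=True)(y)
        y = layers.Dropout(drop)(y)
    return layers.Add()([x, y])

def build_drgrunn_bd(n_rg=11, units=40, n_params=10, n_classes=2):
    inp = layers.Input(shape=(n_params, 1))               # 10 assembly parameters as a sequence
    x = layers.GRU(units, return_sequences=True)(inp)     # lift input to `units` features (assumption)
    for _ in range(n_rg):
        x = rg_bd(x, units)
    x = layers.Flatten()(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inp, out)

model = build_drgrunn_bd()
```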
Examples
The method for predicting product performance based on the deep residual GRU neural network disclosed in the present invention is further described in detail with reference to specific embodiments.
In order to verify that the product performance prediction method based on the deep residual GRU neural network disclosed by the invention can realize the performance prediction of complex precision products, the method is applied to a power follow-up gyroscope to verify its effectiveness in assembly feature learning. The specific experimental verification process is as follows:
(1) Experimental data collection
The experimental data of this part were collected on a power follow-up gyroscope assembly production line. The power follow-up gyroscope is one of the core devices of an inertial navigation system; as shown in FIG. 7, it mainly consists of a gyro rotor and a coil component and is used for navigation, tracking, and positioning of aircraft. It has a complex structure, high assembly precision requirements, and many assembled parts, and is a typical representative of complex precision products. Therefore, the assembly of the gyro rotor and the coil component is taken as an example to verify the effectiveness of the proposed method (DRGRUNN); the assembly flow of the power follow-up gyroscope is shown in FIG. 8.
Due to the limitations of measurement cost and measurement technology, it is difficult to effectively measure the fit tolerances of the parts of the gyro rotor, so the overall assembly quality of the gyro rotor subunit is characterized by the drift-test maximum electric signal values in the +X, +Y, +Z, −Y, and −Z directions, abbreviated in order as +X, +Y, +Z, −Y, and −Z. The same applies to the coil component, whose assembly quality is characterized by the precession coil resistance, speed-stabilizing coil resistance, reference coil resistance, modulation coil resistance, and total resistance, abbreviated in order as R_p, R_ss, R_r, R_m, and R_t. In actual assembly, even if a qualified gyro rotor and a qualified coil component are assembled together, the product performance may not meet the requirements, and the gyro rotor must be repeatedly assembled and disassembled for adjustment, resulting in a large number of invalid assemblies.
To address the assembly problem of complex precision products such as the power follow-up gyroscope, 574 assembly samples were collected from actual assembly. By means of statistical classification, 546 positive assembly samples with good product performance that meet the product assembly quality requirements and 28 negative assembly samples were obtained; the sample composition is shown in Table 1.
TABLE 1 Gyro assembly sample composition
Number of positive samples    Number of negative samples    Total
546                           28                            574
This embodiment converts the performance prediction problem into a binary classification problem, i.e., predicting from the assembly parameters whether an assembly sample is a positive sample or a negative sample. A positive assembly sample means that the assembled gyroscope has good product performance. That is, the input of DRGRUNN is the gyro rotor and coil component assembly parameters, and the output is whether the sample is positive. If the prediction result is a positive sample, the performance of the assembled gyro rotor and coil component meets the product quality requirements and the assembly can be carried out; otherwise, the assembly is not carried out. The performance classification prediction result helps guide the assembly of product parts, effectively reduces invalid assembly, and improves product production efficiency.
(2) Hyper-parameter settings
As can be seen from Table 1, the experimental samples are class-imbalanced, which is unfavorable for training the neural network model. SMOTE is a technique for handling class imbalance; its basic idea is to analyze the minority-class samples and synthesize new samples from them to add to the data set, thereby balancing the classes and expanding the sample set. The SMOTE oversampling technique was used to process the samples, yielding 546 samples in each of the positive and negative classes. A 5-fold cross-validation scheme was adopted to divide the data set for training and evaluating the network: specifically, the data set was divided into 5 subsets, and in each experiment 4 subsets were used to train the model while 1 subset served as the test set; the experiment was repeated 5 times so that each subset was used as the test set in turn. The initialization and selection of the hyper-parameters in the developed deep learning method are described in detail below.
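The balancing and splitting step could be expressed as in the following sketch; the use of imbalanced-learn's SMOTE and scikit-learn's StratifiedKFold, as well as the placeholder random data, are assumptions — the text names SMOTE and 5-fold cross-validation but not any particular library.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import StratifiedKFold

# X: 574 x 10 assembly-parameter matrix, y: 1 = positive, 0 = negative (placeholder data)
rng = np.random.default_rng(3)
X = rng.normal(size=(574, 10))
y = np.array([1] * 546 + [0] * 28)

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)   # 546 samples per class

splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(splitter.split(X_bal, y_bal)):
    X_train, X_test = X_bal[train_idx], X_bal[test_idx]
    y_train, y_test = y_bal[train_idx], y_bal[test_idx]
    # ... train one model per fold and evaluate it on the held-out subset ...
```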
In deep learning, there is currently no generally accepted method for setting and optimizing hyper-parameters such as the learning rate, the number of hidden layers, and the number of neurons; in this embodiment they were set as follows, with reference to empirical suggestions. A learning rate that is too large or too small is unfavorable for model training, so the learning rate was set to 0.001; the number of hidden-layer neurons was set to 40; setting a momentum coefficient helps escape local optima, and it was set to 0.9; an appropriate batch size helps speed up model training, and it was set to 64; L2 regularization helps improve the generalization ability of the model, and the L2 weight decay coefficient was set to 0.0001; the initialization method follows the prior art and avoids gradient vanishing and gradient explosion during training; the Dropout drop rate was set to 0.5. To account for the influence of the number of network layers on the model's prediction performance, N was set to 2, 5, 8, and 11 in this embodiment.
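For reference, the hyper-parameters listed above can be gathered in one place; packaging them into an SGD-with-momentum optimizer and an L2 regularizer, as in the sketch below, is an assumption consistent with Formula 17 and the momentum and weight-decay coefficients mentioned in the text.

```python
from tensorflow.keras import optimizers, regularizers

HYPERPARAMS = {
    "learning_rate": 0.001,        # too large or too small hinders training
    "hidden_units": 40,            # neurons per hidden (GRU) layer
    "momentum": 0.9,               # helps escape local optima
    "batch_size": 64,              # speeds up training
    "l2_weight_decay": 1e-4,       # improves generalization
    "dropout_rate": 0.5,           # Dropout drop rate
    "depth_settings": [2, 5, 8, 11],  # values of N compared in the experiments
}

sgd = optimizers.SGD(learning_rate=HYPERPARAMS["learning_rate"],
                     momentum=HYPERPARAMS["momentum"])
l2 = regularizers.l2(HYPERPARAMS["l2_weight_decay"])
# model.compile(optimizer=sgd, loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(..., batch_size=HYPERPARAMS["batch_size"])
```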
(3) Comparison of Performance
This section presents the experimental results; the comparison methods are the traditional artificial neural network and the traditional GRU neural network. In addition to the influence of model depth on prediction accuracy, the change in model performance after adding the BN and Dropout techniques is also considered; detailed experimental results are given in Table 2.
TABLE 2 Average accuracy of the experimental results (5-fold cross-validation) and their standard deviations (in %)
In Table 2, N is the total number of GRU and FC layers used in a method and is an indirect representation of model depth; "+" indicates that the BN and Dropout techniques are not used, and "BD" indicates that they are used. The bold entries in the table are the optimal experimental results.
a) Effect of increasing RG number on model predictive Performance
Increasing the number of RGs to form deeper networks is helpful to extract more subtle features of the assembly data, and improves the accuracy of performance prediction. Fig. 9 and 10 show graphs of training and testing accuracy changes of the conventional artificial neural network, the conventional GRU neural network and the depth residual GRU neural network at different model depths.
Although deep neural networks are difficult to train, as shown in FIGS. 9 and 10 and Table 2, as model depth increases the methods proposed by the present invention (both DRGRUNN_BD and DRGRUNN) not only essentially maintain their original performance but even improve slightly, which is rarely seen in high-precision performance prediction. In contrast, the average training and test accuracies of the traditional artificial neural network and the traditional GRU neural network degrade to different degrees as model depth increases; in the most severe case, the traditional GRU neural network with batch normalization and Dropout layers (GRU_BD) loses 42.85% and 43.44% in average training and test accuracy, respectively, as the number of network layers increases from N = 2 to N = 11. This is because, without the residual connection structure, it is difficult to optimize the trainable parameters of such a many-layered deep network in the traditional artificial neural network and the traditional GRU neural network. Therefore, compared with these traditional networks, the residual connection structure of the proposed method makes the trainable parameters easier to optimize, giving it better feature learning ability and hence better prediction ability.
In addition, the maximum average training accuracy and maximum average test accuracy of the deep residual GRU neural network are 100% (DRGRUNN with N = 2, 5, 8, 11) and 98.45% (DRGRUNN_BD with N = 11), respectively. That is, when the proposed method is preferably used to guide the assembly of the gyro rotor and the coil component, the accuracy reaches 98.45%, which effectively reduces invalid assembly and greatly improves product production efficiency.
b) Comparison between predicted performance results of traditional neural networks
Fig. 11 and 12 show the training and testing accuracy experiment results of the traditional artificial neural network and the traditional GRU neural network under different model depths.
Comparing the traditional artificial neural network with the traditional GRU neural network, as shown in FIGS. 11 and 12 and Table 2, the GRU results have the highest accuracy and the smallest standard deviation across the different model depths. The optimal average training and test accuracies of GRU are 100% and 97.90%, respectively (GRU with N = 2), an improvement of 0.73% and 0.76% over the optimal results of GRU_BD (99.27% and 97.16%, GRU_BD with N = 2). This indicates that adding the BN and Dropout techniques to the traditional GRU neural network does not yield good generalization performance; in particular, for deeper models (GRU_BD with N = 5, 8, 11) it even causes more serious performance degradation, largely because Dropout randomly discards neuron nodes with a certain probability and thus damages the GRU's information memory ability. The optimal average training and test accuracies of GRU are improved by 14.67% and 13.73%, respectively, over the optimal results of the traditional artificial neural network (87.20% and 86.08%, TANN with N = 5), which indicates that the GRU's information memory ability can effectively capture the relationship between assembly parameters and performance.
Overall, the conventional GRU neural network (GRU) learns the assembly parameter features more effectively and therefore achieves better results than the conventional artificial neural network (TANN).
c) Comparison between GRU neural network performance prediction results
Fig. 13 and 14 show the training and testing accuracy experiment results of the depth residual GRU neural network and the conventional GRU neural network under different model depths.
Compared with the conventional GRU neural network, as shown in FIGS. 13 and 14 and Table 2, the proposed methods (DRGRUNN_BD and DRGRUNN) obtain the optimal experimental results. At the same model depths, the average training accuracy of DRGRUNN reaches 100%, and the average test accuracy of DRGRUNN_BD is superior to that of the traditional GRU neural network (both GRU_BD and GRU). In particular, GRU_BD exhibits severe performance degradation as model depth increases (GRU_BD with N = 5, 8, 11), whereas DRGRUNN_BD, with the residual connection structure added, not only shows no performance degradation but obtains the optimal result (DRGRUNN_BD with N = 11). This is because the residual connection structure in DRGRUNN_BD adds the information of the earlier input directly to the later output, which makes the optimization of the trainable network parameters easier and effectively reduces the risks of gradient vanishing and gradient explosion during back propagation, verifying the effectiveness of applying the residual connection structure in the GRU network. On the other hand, comparing the results of DRGRUNN_BD with those of DRGRUNN, the average training accuracy of DRGRUNN_BD (with the BN and Dropout techniques) is not as good as that of DRGRUNN, but its average test accuracy is better; that is, adding BN and Dropout to the residual-block network has some impact on training accuracy but helps improve the generalization ability of the network, which is beneficial for practical application.
Overall, the optimal average training accuracy of the proposed method (100%, DRGRUNN with N = 2, 5, 8, 11) is on a par with the optimal result of the traditional GRU neural network (100%, GRU with N = 2), and the optimal average test accuracy (98.45%, DRGRUNN_BD with N = 11) is 0.56% higher than the optimal result of the traditional GRU neural network (97.90%, GRU with N = 2); at the same model depth (both with N = 11), the improvements reach 76.27% and 79.16%, respectively. That is, the residual connection structure of the proposed deep residual GRU neural network makes the optimization of trainable network parameters easier and thus yields better results, verifying the effectiveness of applying the residual connection structure in the traditional GRU neural network.
d) Model output high-dimensional feature dimension reduction visualization
To visually demonstrate the feature-extraction ability of the proposed model, an unsupervised dimension-reduction method, t-distributed stochastic neighbor embedding (t-SNE), is used to visualize the features extracted at the model input layer, some intermediate layers, and the flattening layer. The t-SNE visualization reduces the high-dimensional features to two dimensions; information loss and distortion may occur during dimension reduction, but the visualization is used only to judge intuitively whether the high-dimensional features are separable and does not participate in the training of the neural network.
In this example, the assembly parameter features extracted by the optimal model of the proposed method (DRGRUNN_BD with N = 11), which has the best generalization ability, are visualized. As shown in FIG. 15, at the model input layer the high-dimensional features of the assembly data have not yet been extracted, and the positive and negative assembly sample observations are highly mixed together. After feature extraction by RG_4 and RG_7, the model basically has the ability to distinguish the high-dimensional features, although a small portion of positive and negative sample observations are still mixed. In the output of the model's flattening layer, the positive and negative assembly samples can be correctly distinguished after feature extraction, without aliasing of the positive and negative sample observations; that is, the model achieves high-precision classification prediction of the gyroscope's performance.
Through the dimension-reduction visualization, the outputs of the positive and negative assembly sample observations at each layer of the model were examined; the results show that the developed deep residual GRU neural network can effectively extract the high-dimensional features of the part assembly parameters of the power follow-up gyroscope and achieve high-precision classification prediction of gyroscope performance.
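The layer-wise visualization can be reproduced along the following lines; the untrained stand-in model, the layer name "flatten", and the random placeholder data are assumptions — in practice the trained DRGRUNN_BD and the balanced assembly samples would be used.

```python
import numpy as np
from sklearn.manifold import TSNE
from tensorflow.keras import layers, models

# toy stand-in for the trained DRGRUNN_BD and the balanced sample set
inp = layers.Input(shape=(10, 1))
x = layers.GRU(40, return_sequences=True)(inp)
x = layers.Flatten(name="flatten")(x)
out = layers.Dense(2, activation="softmax")(x)
model = models.Model(inp, out)
X = np.random.default_rng(4).normal(size=(200, 10, 1))

# collect the features output by one layer, then embed them into 2-D with t-SNE
feature_extractor = models.Model(model.input, model.get_layer("flatten").output)
features = feature_extractor.predict(X)
embedded = TSNE(n_components=2, random_state=0).fit_transform(features)
# scatter-plot `embedded`, colouring points by positive / negative label
```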
Therefore, this embodiment verifies the superiority of DRGRUNN's feature extraction by comparison with traditional neural networks (including the artificial neural network and the GRU neural network). Compared with the traditional GRU neural network, DRGRUNN improves the average training accuracy by 0.009% and the average test accuracy by 2.2%; if the BN and Dropout techniques are considered, the improvements are as high as 75.95% and 79.16%. Compared with the traditional artificial neural network, the average training and test accuracies of DRGRUNN are improved by 24.69% and 23.40%, which shows that the GRU in the proposed method can effectively extract the correlation features among assembly parameters and confirms the rationality of applying the residual connection structure in the GRU neural network. Moreover, the dimension-reduction visualization of the high-dimensional features of the optimal DRGRUNN model in this embodiment shows that the proposed method can distinguish the high-dimensional features of the assembly parameters of complex precision products and achieve high-precision prediction of product performance.
In conclusion, the invention establishes a new deep learning method, namely a deep residual GRU neural network (DRGRUNN), which is used for improving the performance classification prediction capability of complex precision products. Specifically, the DRGRUNN learns the associated characteristics of the assembly parameters of the complex precision product by means of the GRU neural network information memory capacity, and the residual connection structure is integrated into the GRU neural network, so that a deeper network model can be constructed, the fine characteristics of the assembly parameters can be extracted, and the trainable parameters of the deep network can be optimized more easily. In addition, DRGRUNN further improves the model generalization capability by using BN and Dropout technologies. Therefore, DRGRUNN can extract more abundant and fine assembly parameter correlation characteristics by constructing a deeper network, and realize high-precision classification prediction of the performance of the complex precision product. DRGRUNN is used for guiding the assembly of parts of complex precision products, so that the invalid assembly of the parts of the products is effectively reduced, and the production efficiency of the products is improved.
The establishment of a high-precision performance prediction model helps effectively control the assembly quality of each assembly link during the assembly process and improves product production efficiency; its comprehensive application in the manufacturing industry helps promote intelligent manufacturing, accelerate the upgrading of manufacturing modes, realize industrial automated production, reduce manufacturing costs, and improve enterprise benefits. Further, the method provided by the present invention can be applied not only to performance prediction but also to other tasks such as anomaly detection and fault diagnosis.
The above embodiments are not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make variations, modifications, additions or substitutions within the technical scope of the present invention.

Claims (10)

1. A product performance prediction method based on a depth residual GRU neural network is characterized in that: the method comprises the following steps:
step one, constructing a depth residual GRU neural network model;
step two, predicting the product performance based on the deep residual GRU neural network, wherein the prediction result is used to guide the assembly of the parts of the complex precision product.
2. The method of claim 1 for predicting product performance based on a depth residual GRU neural network, wherein: in the first step, a GRU neural network model and a residual error neural network are adopted to construct a deep residual error GRU neural network model, the learning capability of the assembly parameter correlation characteristics is enhanced by means of the information memory capability of the GRU neural network, a deep network structure is constructed through a residual error connection structure of the residual error neural network, and the product assembly parameter characteristics are extracted.
3. The method of claim 2 for predicting product performance based on a depth residual GRU neural network, wherein: the residual connection structure is introduced into the GRU neural network to obtain two residual GRU structures (collectively referred to as RG) and a deep residual GRU neural network formed by stacking RGs.
4. The method of claim 3 for predicting product performance based on a depth residual GRU neural network, wherein: the two residual GRU structures are: basic RG structure, RG structure with BN and Dropout;
the basic RG structure consists of two ReLU layers and two GRU layers, and the RG structure with BN and Dropout consists of two BN layers, two ReLU layers, two GRU layers and two Dropout layers.
5. The method of claim 4 for predicting product performance based on a depth residual GRU neural network, wherein: the deep residual GRU neural network is formed by stacking N RG blocks and is then connected, after a flattening operation, to a fully connected layer that outputs the result.
6. The method of claim 4 or 5 for predicting product performance based on a depth residual GRU neural network, wherein: the BN layer refers to a batch normalization layer, and the BN layer is added to the RG to optimize the neural network training process; the specific processing is as follows:
μ = (1/n) Σ_{i=1}^{n} x_i    Formula 7
σ = (1/n) Σ_{i=1}^{n} (x_i − μ)²    Formula 8
x̂_n = (x_n − μ) / √(σ + ε)    Formula 9
y_n = α·x̂_n + β    Formula 10
In the formulas, x_n and y_n respectively represent the input and output of the batch normalization layer; n represents the size of the batch normalization layer, namely the number of samples; α and β are trainable parameters; ε is a constant close to 0; σ is the variance of the batch normalization layer input samples; μ is the mean of the batch normalization layer input samples.
7. The method of claim 4 or 5 for predicting product performance based on a depth residual GRU neural network, wherein: the ReLU layer is a ReLU activation function, which is used for nonlinear transformation in the neural network to optimize the training process; as shown in Formula 11, its derivative is either 1 or 0:
y = max(x, 0)    Formula 11
In the formula, x and y respectively represent the input and output of the ReLU activation function.
8. The method of claim 4 or 5 for predicting product performance based on a depth residual GRU neural network, wherein: a Dropout layer is added to the RG, and the neural network training process is optimized by randomly discarding neurons at a drop rate in the range (0, 1).
9. The method of claim 4 or 5 for predicting product performance based on a depth residual GRU neural network, wherein: in a deep residual GRU network, network parameters are optimized, with the process:
let the input of the l-th RG be RG_l and its output be RG_{l+1}; then the RG performs the following calculation:
RG_{l+1} = RG_l + F(RG_l, W_l^RG)    Formula 12
where F represents the residual function and W_l^RG represents the trainable weight parameters within the l-th RG block;
by applying this relationship recursively, the following result can be obtained:
RG_L = RG_l + Σ_{i=l}^{L−1} F(RG_i, W_i^RG)    Formula 13
where RG_i denotes the input of the i-th RG block, W_i^RG represents the trainable weight parameters in the i-th RG, and L is the total number of RGs;
in back propagation, the loss function is denoted as E, and by the chain rule:
∂E/∂RG_l = (∂E/∂RG_L)·(∂RG_L/∂RG_l) = (∂E/∂RG_L)·(1 + ∂/∂RG_l Σ_{i=l}^{L−1} F(RG_i, W_i^RG))    Formula 14
Formula 14 shows that the trainable parameters in the deep residual GRU neural network are optimized in a linearly superposed manner;
when the neural network is trained with the cross-entropy error as the loss function, the output-layer features are converted into the range [0, 1] by the softmax function, where the softmax function is:
y_i = exp(x_i) / Σ_{j=1}^{M} exp(x_j)    Formula 15
In the formula, x and y are, in order, the input and output of the softmax function, M is the number of categories, and i and j are indices of the output-layer neurons; the cross-entropy loss E is given by Formula 16:
E = −Σ_{j=1}^{M} target_j · log(y_j)    Formula 16
In the formula, target_j represents the true category label of the sample;
after the cross-entropy loss is calculated, the network is iteratively trained by gradient descent, as shown in Formula 17, to adjust the network parameters and obtain optimized network parameters:
w_{i+1} = w_i − λ·(∂E/∂w_i)    Formula 17
In the formula, w_i is the neural network weight parameter obtained after the i-th iteration, and λ is the learning rate.
10. The method of claim 1 for predicting product performance based on a depth residual GRU neural network, wherein: in the second step, the performance of the complex precision product is predicted based on the deep residual GRU neural network: the assembly parameters of the parts of the complex precision product are taken as the input of the network and the product performance classification result as the output, the relationship between assembly parameters and performance is established, the post-assembly product performance is predicted from the part assembly parameters before assembly begins, and whether to carry out the assembly is determined according to the model prediction result.
Application CN202011577005.2A (priority date 2020-12-28, filing date 2020-12-28): Product performance prediction method based on depth residual GRU neural network — Pending — published as CN112686372A.

Publication of CN112686372A: 2021-04-20
Family ID: 75452377


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2021-04-20)