CN115438692A

CN115438692A - Equipment state anomaly detection method based on variational automatic encoder

Info

Publication number: CN115438692A
Application number: CN202210981082.7A
Authority: CN
Inventors: 雷文平; 闫灏; 李晨阳; 陈宏�; 陈磊; 李凌均; 王丽雅; 李沁远
Original assignee: Zhengzhou University
Current assignee: Zhengzhou University
Priority date: 2022-08-16
Filing date: 2022-08-16
Publication date: 2022-12-06

Abstract

The invention discloses an equipment state anomaly detection method based on a variational automatic encoder (VAE), belonging to the technical field of intelligent diagnosis and monitoring of mechanical systems. The method comprises the following steps: collecting vibration signals while the mechanical equipment runs normally; preprocessing the signals to obtain multi-modal data comprising time-domain, frequency-domain and autocorrelation data, and fusing the data to obtain a training set; constructing a VAE network; training the network with the training set; after the network has converged, selecting sample data with normal and abnormal labels as a test set to test the network and determine an anomaly threshold; and setting the anomaly threshold in the VAE network to perform anomaly detection on the equipment. By fusing multiple modalities, the invention addresses the limited accuracy of detection based on single-domain signals, so that the anomaly detection results have good accuracy and precision and the method is suitable for mechanical equipment working under different working conditions.

Description

Equipment state anomaly detection method based on variational automatic encoder

Technical Field

The invention relates to an anomaly detection method for a rolling bearing, in particular to an equipment state anomaly detection method based on a variational automatic encoder, and belongs to the technical field of intelligent diagnosis and monitoring of mechanical systems.

Background

In the long-term operation process of equipment such as motors, bearings and gear boxes, the operation state can gradually become worse, and if the equipment cannot be maintained in time, major accidents can be caused. Therefore, it is very important to monitor the operation state of the equipment and detect the fault of the equipment in an early stage, and the vibration of the equipment as an effective index of the operation state of the equipment has very important research value. In the operation process, the working condition of mechanical equipment is always complex and changeable, and the change of load can cause the change of the vibration characteristic of the rolling bearing to a great extent, so that the judgment of the operation state of the equipment becomes more complex and difficult. Therefore, the condition detection and fault diagnosis of the rolling bearing under variable working conditions have become one of the important development directions in the field of mechanical vibration.

Currently, most existing anomaly detection methods are based on a single-modality signal, such as the time domain or the frequency domain. For example, the invention patent with application publication number CN 113532893A, on anomaly detection based on device vibration, judges anomalies using frequency-domain data. However, the energy of a bearing in the frequency domain shifts toward the middle and high frequency bands, and the time-domain signal is sensitive to defects of the rolling bearing but insensitive to amplitude and frequency. The analysis of a single-domain signal therefore lacks accuracy in its detection results and is not suitable for anomaly detection of mechanical equipment working under different working conditions.

Disclosure of Invention

The purpose of the invention is to provide an equipment state anomaly detection method based on a variational automatic encoder, in which the data of three modal signals (time domain, frequency domain and autocorrelation) are fused and used as the input of a variational automatic encoder network model, and test data in normal and abnormal states are used to calculate an anomaly threshold, so that the anomaly detection result has good accuracy and the method is better suited to anomaly detection of mechanical equipment working under different working conditions.

In order to realize the purpose, the invention adopts the following technical scheme: a device state abnormity detection method based on a variational automatic encoder comprises the following steps:

S1, collecting vibration signals:

using the selected sampling frequency f, acquire the original time-series vibration data X_n of the mechanical equipment in normal operation under different working conditions;

S2, preprocessing the signals to obtain multi-modal data and fusing them:

intercept the original data X_n according to the selected signal length s (s ∈ Z, s ≠ 0, and s is even), shifting the interception window backward by l = s/2 after each interception, to obtain m sample data X_i of equal length (m ∈ Z, m ≠ 0; i ∈ (0, m)), with s data points in each sample;

apply the corresponding time-domain, frequency-domain and autocorrelation transformations to the m samples X_i, select the first n = s/2 data points of each result and apply 0-1 normalization to obtain, respectively, the waveform sample data Y_i, the spectrum sample data F_i and the autocorrelation sample data R_i; change the shape structure of the data, superpose and fuse the sample data of the three modalities, and apply -5 to 5 normalization to the fused data, which then serve as the training set;

S3, constructing a variational automatic encoder network;

the VAE network is composed of an encoder and a decoder, and the training set preprocessed in step S2 is input into the network as the original sample data set x. The encoder network is expressed as the parameterized probabilistic model q_φ(z|x), where φ represents the network parameter set of the encoder; the decoder network is expressed as p_θ(x|z), where θ(w, b) represents the network parameter set of the decoder, w is the initial weight and b is the bias. The intermediate hidden layer variable output by the encoder network is z, whose distribution is p(z|x); the decoder network takes z as its input and finally outputs generated data x̂ approaching the original sample data x.

The network uses the sum of the reconstruction loss between the original sample data x and the generated data x̂ and the KL divergence between the encoder network q_φ(z|x) and the distribution p(z|x) of the hidden variable z as the total loss function and the objective function of network optimization. The total loss function Loss is expressed as:

$$\mathrm{Loss} = L_{\mathrm{recon}}(x, \hat{x}) + D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z \mid x)\big)$$

where L_recon is the reconstruction loss, computed over the N samples of the input data (N is the total number of samples in the input data), and D_KL is the KL divergence;

S4, training the VAE network with the training set:

randomly shuffle the order of the data in the training set preprocessed in step S2, divide the shuffled training set according to a preset batch size, input the batches into the VAE network constructed in step S3 for training, and automatically update the network parameters φ and θ(w, b); after the network has converged, stop training, fix the network parameters and stop updating them;

S5, selecting a test set to test the VAE network and determine the anomaly threshold:

select sample data with normal and abnormal labels, preprocess them as in step S2, and input them into the network as a test set; apply 0-1 normalization to the reconstruction loss of each group of data in the test set and take the result as the prediction score of that group; then use each prediction score in turn as a threshold to divide the prediction scores of all groups: groups below the threshold are labeled normal and groups above it abnormal; compare these labels with the true labels to calculate the F1-score under that threshold, and take the threshold corresponding to the maximum F1-score as the anomaly threshold of the network;

S6, setting the anomaly threshold in the VAE network to detect equipment anomalies:

take the anomaly threshold determined in step S5 as the anomaly threshold of the VAE network, and use the VAE network constructed in step S3 to perform anomaly detection on equipment vibration signals under different loads, obtaining the anomaly detection result.

In step S2, the specific method for obtaining sample data by applying the corresponding time-domain, frequency-domain and autocorrelation transformations to the m samples X_i is as follows: for each X_i, select the first n = s/2 data points and apply 0-1 normalization to obtain the waveform sample data Y_i; for each X_i, obtain the frequency spectrum by Fast Fourier Transform (FFT), select the first n = s/2 data points, which suffice because the spectrum of a real sequence is conjugate-symmetric, and apply 0-1 normalization to obtain the spectrum sample data F_i; for each X_i, perform the autocorrelation transformation to obtain the autocorrelation data of the signal, select the first n = s/2 data points and apply 0-1 normalization to obtain the autocorrelation sample data R_i.
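The three transformations described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `minmax01` and `three_modalities` are hypothetical, and the text does not state whether the spectrum is a magnitude spectrum or how the autocorrelation is defined, so a magnitude spectrum and a linear autocorrelation over non-negative lags (mean not removed) are assumed here.

```python
import numpy as np

def minmax01(v):
    """0-1 normalization of a 1-D array; a constant array maps to zeros."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def three_modalities(x_i):
    """Return (Y_i, F_i, R_i), each of length n = s/2, for one sample X_i of length s."""
    x_i = np.asarray(x_i, dtype=float)
    s = x_i.size
    n = s // 2
    # Time domain: first n points of the raw waveform.
    y = minmax01(x_i[:n])
    # Frequency domain: first n FFT bins (assumption: magnitude spectrum).
    f = minmax01(np.abs(np.fft.fft(x_i))[:n])
    # Autocorrelation: lags 0..n-1 (assumption: linear, mean not removed).
    r_full = np.correlate(x_i, x_i, mode="full")  # lags -(s-1)..(s-1)
    r = minmax01(r_full[s - 1 : s - 1 + n])
    return y, f, r
```

For long samples, the direct `np.correlate` call is quadratic in s; an FFT-based autocorrelation would be a natural substitute, with the same caveat about the assumed definition.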

In step S2, the sample data of the three modalities are superposed and fused as follows: the sample data Y_i, F_i and R_i of each modality are each reshaped from a two-dimensional tensor with the original structure (n, 1) into a three-dimensional tensor (1, a, b) (where a × b = n, a, b ∈ Z, and a, b ≠ 0);

the three tensors are then stacked to obtain a three-dimensional tensor (3, a, b) as the input of the network; that is, the input has three channels, each channel holds two-dimensional data of size a × b, and the three modalities are ordered as (Y_i, F_i, R_i).
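The reshaping and stacking can be illustrated with the short sketch below. The function name `fuse` is hypothetical, and the text does not say whether the -5 to 5 normalization is applied per sample or over the whole training set; a min-max scaling over one fused tensor is assumed here purely for illustration.

```python
import numpy as np

def fuse(y, f, r, a, b, lo=-5.0, hi=5.0):
    """Stack three length-n modalities (n = a*b) into a (3, a, b) tensor scaled to [lo, hi]."""
    # Reshape each modality from n points to (1, a, b), then stack as channels.
    channels = [np.asarray(v, dtype=float).reshape(1, a, b) for v in (y, f, r)]
    fused = np.concatenate(channels, axis=0)  # shape (3, a, b), order (Y_i, F_i, R_i)
    # Assumption: -5 to 5 normalization is a min-max scaling of the fused tensor.
    span = fused.max() - fused.min()
    unit = (fused - fused.min()) / span if span > 0 else np.zeros_like(fused)
    return lo + (hi - lo) * unit
```

With n = 4096 and a = b = 64, the result has shape (3, 64, 64).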

In step S3, the encoder network has two outputs, μ and log σ². Let the distribution of the intermediate hidden layer variable z obey p(z|x) = N(μ, σ²); z is generated by reparameterization from μ and σ, with the expression z = μ + σ × ε, where ε is sampled from the standard normal distribution N(0, 1).
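A minimal NumPy sketch of this reparameterization is given below. The function name `reparameterize` and the use of a NumPy random generator are illustrative assumptions; in a real network the same expression would be written in the training framework so that gradients flow through μ and log σ².

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), given the encoder outputs mu and log(sigma^2)."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    sigma = np.exp(0.5 * log_var)        # sigma recovered from log(sigma^2)
    eps = rng.standard_normal(mu.shape)  # noise independent of the network parameters
    return mu + sigma * eps
```

Because the randomness is isolated in ε, z is a deterministic, differentiable function of μ and σ, which is what allows training by gradient descent.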

In step S4, when the training set is used to train the network, the hyper-parameters of the VAE network, including the learning rate, are set first and the network parameters are initialized; the network parameters φ and θ(w, b) are then automatically updated by gradient descent to reduce the network loss; training stops after the network converges, and the converged network parameters are fixed as the final network parameters of the VAE network.

The invention has the beneficial effects that:

1) The invention adopts a multimode fusion anomaly detection method to solve the problem that the analysis of single-domain signals lacks the accuracy of detection results, fuses the data of three modal signals of time domain, frequency domain and autocorrelation data and then uses the fused data as the input of a network model of a variational automatic encoder, and calculates anomaly threshold values by using data sets in normal and abnormal states, so that the anomaly detection result has good accuracy and precision, and is more suitable for anomaly detection of mechanical equipment working under different working conditions.

2) The invention adopts a multimode fusion anomaly detection method, and the practical test set experiment results prove that the anomaly detection can be carried out on the equipment under various working conditions, and the detection results have high precision; the result of the diagnosis performance contrast experiment proves that under the condition of multiple working conditions, all indexes of the diagnosis performance of the method are superior to those of the conventional detection method of a contrast group, and the method has higher accuracy; the result of the single-mode anomaly detection experiment proves that the total anomaly detection capability after multi-mode fusion is greatly improved.

Drawings

FIG. 1 is a flow chart of the steps of the anomaly detection method of the present invention;

FIG. 2 is a schematic diagram illustrating a clipping manner of a time-domain waveform signal in step S2 according to the present invention;

FIG. 3 is a schematic structural diagram of the VAE network constructed in step S3 of the present invention;

FIG. 4 is a graph of the results of VAE network anomaly detection on different data sets;

FIG. 5 is a comparison of results using different anomaly detection methods;

FIG. 6 is a graph comparing the results of anomaly detection using different modalities of data.

Detailed Description

The invention is further explained below with reference to the figures and the embodiments.

The embodiment is as follows: as shown in fig. 1 to 6, the method for detecting abnormal device status based on a variational automatic encoder provided by the present invention comprises the following steps:

S1, collecting vibration signals:

in this embodiment, the experimental data used as the original data X_n are the bearing fault diagnosis dataset of Case Western Reserve University (CWRU) and the bearing dataset of Jiangnan University (JNU), China.

The CWRU bearing fault diagnosis dataset is provided by the Case Western Reserve University Bearing Data Center. The data cover four states: normal (N), ball (B) fault, inner race (IR) fault and outer race (OR) fault. The drive-end vibration signals were collected at several motor speeds with a sampling frequency of 12 kHz. The dataset is divided into four working conditions according to motor speed: working condition 0, 1796 rpm; working condition 1, 1772 rpm; working condition 2, 1750 rpm; working condition 3, 1730 rpm.

The Jiangnan University (JNU) dataset is a bearing dataset collected by Jiangnan University, China. It covers four health conditions: normal, inner ring fault, outer ring fault and rolling element fault. The vibration signals were sampled with an accelerometer at a sampling frequency of 50 kHz under three rotational speeds: 600 rpm (working condition 0), 800 rpm (working condition 1) and 1000 rpm (working condition 2).

In this embodiment, training uses only vibration data recorded under normal conditions, while testing uses a labeled test set composed of vibration data recorded under normal conditions and under the various fault types.

S2, preprocessing the signals to obtain multi-modal data and fusing them:

in this example, the signal length s is set to 8192, and after each interception the window is shifted by l = 4096 data points, in the manner shown in FIG. 2. The intercepted sample data X_i are preprocessed by the corresponding time-domain, frequency-domain and autocorrelation transformations, in the following specific steps:

1) select the first 4096 data points of each group of data and apply 0-1 normalization to obtain the waveform sample data Y_i;

2) convert each group of data into frequency-domain data by Fast Fourier Transform (FFT); because the spectrum of a real sequence is conjugate-symmetric, select the first 4096 points of the frequency-domain data and apply 0-1 normalization to obtain the frequency-domain sample data F_i;

3) perform the autocorrelation transformation on each group of data to obtain the autocorrelation data of the signal, select the first 4096 points and apply 0-1 normalization to obtain the autocorrelation sample data R_i.

The sample data of the three modalities are superposed and fused, and the fused data undergo -5 to 5 normalization to serve as the training set input to the network. The data fusion steps are as follows: the sample data Y_i, F_i and R_i of each modality are each reshaped from a two-dimensional tensor with the original structure (4096, 1) into a three-dimensional tensor (1, 64, 64); the three tensors are stacked to obtain a three-dimensional tensor (3, 64, 64) as the input of the network; that is, the input has three channels, each channel holds two-dimensional data of size 64 × 64, and the three modalities are ordered as (Y_i, F_i, R_i).
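The interception with a half-length shift used in this embodiment (s = 8192, l = 4096) can be sketched as follows. The function name `intercept` is hypothetical, and discarding any trailing points that do not fill a complete window is an assumption, since the text does not say how the end of the record is handled.

```python
import numpy as np

def intercept(x_n, s=8192):
    """Cut a 1-D signal into windows of length s, shifting by l = s/2 each time."""
    x_n = np.asarray(x_n, dtype=float)
    if s <= 0 or s % 2 != 0:
        raise ValueError("s must be a positive even integer")
    l = s // 2
    if x_n.size < s:
        return np.empty((0, s))
    # Assumption: trailing points that do not fill a whole window are dropped.
    m = (x_n.size - s) // l + 1
    return np.stack([x_n[k * l : k * l + s] for k in range(m)])
```

Consecutive windows overlap by 50%, so the second half of one sample X_i equals the first half of the next.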

S3, constructing a variational automatic encoder network;

the VAE network is composed of an encoder and a decoder, and the structure of the network is shown in FIG. 3. The training set preprocessed in step S2 is input into the network as the original sample data set x. The encoder network is expressed as the parameterized probabilistic model q_φ(z|x), where φ represents the network parameter set of the encoder; the decoder network is expressed as p_θ(x|z), where θ(w, b) represents the network parameter set of the decoder, w is the initial weight and b is the bias. The encoder network has two outputs, μ and log σ²; the intermediate hidden layer variable output by the encoder network is z, whose distribution is p(z|x); the decoder network takes z as its input and finally outputs generated data x̂ approximating the original sample data x.

The encoder network is composed of 4 fully-connected layers; the parameters of each layer are shown in Table 1. The input data first pass through layers fc1 and fc12, which reduce the data dimension to 400, and then through a Leaky ReLU activation layer; the result is fed to two fully-connected layers, fc21 and fc22, whose outputs form the output of the encoder network. Each output has 300 dimensions, representing respectively the mean μ and the logarithm of the variance, log σ², of the hidden variable z.

Table 1 shows the parameters of the encoder network structure

So that the gradient can propagate continuously through the whole neural network and the network can be trained by gradient descent, the hidden variable z is obtained by reparameterization. The distribution of the intermediate hidden variable z is assumed to obey p(z|x) = N(μ, σ²), and z is generated from μ and σ as z = μ + σ × ε, where ε is sampled from the standard normal distribution N(0, 1);

The decoder network consists of three fully-connected layers and two activation function layers; its structure parameters are shown in Table 2. The hidden variable z first passes through two fully-connected layers, fc3 and fc34, which change its dimension to 800, followed by a Leaky ReLU activation layer; a fully-connected layer fc4 then changes the dimension to 4096, the same as the original input data, and a final Sigmoid activation layer produces the output.

Table 2 shows the decoder network configuration parameters

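The network uses the sum of the reconstruction loss between the original sample data x and the generated data x̂ and the KL divergence between the encoder network q_φ(z|x) and the distribution p(z|x) of the hidden variable z as the total loss function and the objective function of network optimization, where L_recon denotes the reconstruction loss, computed over the N samples of the input data (N is the total number of samples in the input data). The KL divergence is:

$$D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z \mid x)\big) = \int q_\phi(z \mid x)\,\log\frac{q_\phi(z \mid x)}{p(z \mid x)}\,dz$$

In this example, when q_φ(z|x) and p(z|x) are both assumed to be normally distributed, i.e. q_φ(z|x) obeys N(μ, σ²) and p(z|x) obeys N(μ₁, σ₁²),

$$D_{\mathrm{KL}} = \log\frac{\sigma_1}{\sigma} + \frac{\sigma^2 + (\mu - \mu_1)^2}{2\sigma_1^2} - \frac{1}{2}$$

In particular, assuming that p(z|x) follows the standard normal distribution N(0, 1), the KL divergence is:

$$D_{\mathrm{KL}} = \frac{1}{2}\left(\mu^2 + \sigma^2 - \log\sigma^2 - 1\right)$$

The total loss function is:

$$\mathrm{Loss} = L_{\mathrm{recon}}(x, \hat{x}) + \frac{1}{2}\left(\mu^2 + \sigma^2 - \log\sigma^2 - 1\right)$$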
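For reference, the KL divergence between two univariate normal distributions has the well-known closed form log(σ₁/σ) + (σ² + (μ − μ₁)²)/(2σ₁²) − 1/2, which reduces to ½(μ² + σ² − log σ² − 1) when the second distribution is N(0, 1). The sketch below evaluates both; the function names are hypothetical, and summing over the latent dimensions is an assumption about how the per-dimension terms are combined.

```python
import numpy as np

def kl_normal(mu, sigma, mu1, sigma1):
    """KL( N(mu, sigma^2) || N(mu1, sigma1^2) ) for scalar parameters."""
    return np.log(sigma1 / sigma) + (sigma**2 + (mu - mu1) ** 2) / (2.0 * sigma1**2) - 0.5

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ) from the encoder outputs mu and log(sigma^2).

    Assumption: the per-dimension terms are summed over the latent dimensions.
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)
```

Working with log σ² rather than σ keeps the variance positive without constraining the encoder output.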

S4, training the VAE network with the training set:

after the variational automatic encoder network is constructed, the network parameters are set: the initial weights w and biases b of the network are randomly initialized, the optimizer is set to Adam with learning rate lr = 0.0001 and momentum coefficients (0.5, 0.99), and batch normalization is enabled; the network also uses dropout to prevent overfitting.

Before training, all gradients of the network are set to zero, the batch size is set to 64 and the number of training epochs to 50. The training set is then input and training starts: each iteration performs forward propagation through the VAE network to obtain the reconstructed samples and computes the total loss function; the network then differentiates automatically and back-propagates, updating the parameters of the whole network model by gradient descent.

The order of the data in the training set preprocessed in step S2 is randomly shuffled, the shuffled training set is divided according to the set batch size, and the batches are input into the VAE network constructed in step S3 for training, automatically updating the network parameters φ and θ(w, b); after the network has converged, training stops and parameter updating ceases, and the converged network parameters are fixed as the final network parameters of the VAE network.

S5, selecting a test set to test the VAE network and determine the anomaly threshold:

the trained network parameters are first fixed: automatic differentiation, back-propagation and parameter updating are disabled, dropout is not used, and the batch-normalization values obtained during training continue to be used. Sample data with normal and abnormal labels are then selected, preprocessed as in step S2, and input into the network as a test set.

The reconstruction loss of each group of data in the test set is 0-1 normalized to give the prediction score of that group, and each prediction score is used in turn as a threshold to divide the prediction scores of all groups: groups below the threshold are classified as normal and groups above it as abnormal. The resulting labels are compared with the true labels to calculate the F1-score under that threshold, and the threshold corresponding to the maximum F1-score is taken as the anomaly threshold of the network. The F1-score is an index measuring the accuracy of a binary classification model; it is the harmonic mean of the model's precision and recall, with a maximum value of 1 and a minimum value of 0.
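The threshold search described above can be sketched as follows. The function names are hypothetical, labels are taken as 0 for normal and 1 for abnormal, and a score exactly equal to the threshold is assumed to count as normal, since the text only specifies "below" and "above".

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN), with 1 = abnormal as the positive class."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def select_threshold(recon_loss, labels):
    """Return (threshold, best_f1) by trying every normalized score as the threshold."""
    loss = np.asarray(recon_loss, dtype=float)
    labels = np.asarray(labels, dtype=int)
    span = loss.max() - loss.min()
    scores = (loss - loss.min()) / span if span > 0 else np.zeros_like(loss)
    best_t, best_f1 = 0.0, -1.0
    for t in np.unique(scores):
        # Assumption: a score equal to the threshold is classified as normal.
        pred = (scores > t).astype(int)
        f1 = f1_score(labels, pred)
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1
```

For example, reconstruction losses `[0.1, 0.2, 0.3, 0.8, 0.9]` with labels `[0, 0, 0, 1, 1]` separate cleanly, and the search selects the normalized score of the third sample as the threshold.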

In this example, the network was trained using the normal vibration data of the CWRU dataset under working conditions 0, 1, 2 and 3, and tested using the vibration data under normal conditions and all fault types under working conditions 0, 1, 2 and 3 as a labeled test set. The test results show that the network model achieved an anomaly detection accuracy (Accuracy) of 99.96%, an F1-score of 1, and an area under the ROC curve (AUC) of 1.

The network was likewise trained using the normal vibration data of the JNU dataset under working conditions 0, 1 and 2, and tested using the vibration data under normal conditions and all fault types under working conditions 0, 1 and 2 as a labeled test set. The test results show that the network model achieved an anomaly detection accuracy (Accuracy) of 94.90%, an F1-score of 0.9483, and an area under the ROC curve (AUC) of 0.9793.

The test results are shown in FIG. 4. It can be seen that the method provided by this patent performs anomaly detection on different datasets and under multiple working conditions, and that the detection results have high precision.
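Two of the evaluation indexes quoted above, Accuracy and AUC, can be computed from the true labels and the prediction scores as sketched below. The function names are hypothetical; the AUC is computed through its pairwise interpretation (the fraction of abnormal/normal pairs in which the abnormal sample receives the higher score, ties counting one half), which equals the area under the ROC curve.

```python
import numpy as np

def accuracy(labels, scores, threshold):
    """Fraction of samples classified correctly when score > threshold means abnormal."""
    labels = np.asarray(labels, dtype=int)
    pred = (np.asarray(scores, dtype=float) > threshold).astype(int)
    return float(np.mean(pred == labels))

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of abnormal and normal scores."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]  # abnormal samples
    neg = scores[labels == 0]  # normal samples
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size))
```

Under this definition an AUC below 0.5 means that normal samples tend to receive higher scores than abnormal ones.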

To verify the diagnostic performance of the method provided by the invention, a control group was set up for experimental comparison. The anomaly detection methods PCA, HBOS and LOF were selected as controls, the experiments were performed on the Jiangnan University (JNU) dataset, and AUC, F1-score and Accuracy were selected as evaluation indexes. The control methods were trained using normal time-domain vibration data under working conditions 0, 1 and 2, and tested using time-domain vibration data under normal conditions and all fault types under working conditions 0, 1 and 2 as a labeled test set. The experimental results are shown in FIG. 5.

It can be seen that, under the three evaluation indexes AUC, F1-score and Accuracy, HBOS scores 0.5990, 0.744 and 66.10% respectively, PCA scores 0.8986, 0.8465 and 96.19% respectively, and LOF scores 0.8982, 0.8485 and 96.19% respectively.

Meanwhile, to show the influence of the multi-modal fusion method proposed in this example on anomaly detection, a comparison was set up in which the proposed model performs anomaly detection on each single-modality dataset; apart from the experimental data, all network parameters were kept the same. The experimental results are shown in FIG. 6.

Under the three evaluation indexes AUC, F1-score and Accuracy, the waveform alone scores 0.6901, 0.7442 and 65.97% respectively, the spectrum alone scores 0.8843, 0.8133 and 81.61% respectively, and the autocorrelation alone scores 0.1872, 0.6655 and 49.80% respectively. This shows that the multi-modal fusion method combines the strengths of the single-modality data for anomaly detection, so that the overall anomaly detection capability is greatly improved.

The invention adopts a multimode fusion anomaly detection method, and the experimental results of an actual test set prove that the anomaly detection method can be used for carrying out anomaly detection on equipment under various working conditions, and the detection results have high precision; the result of the diagnosis performance contrast experiment proves that under the condition of multiple working conditions, all indexes of the diagnosis performance of the method are superior to those of the conventional detection method of a contrast group, and the method has higher accuracy; the result of the single-mode anomaly detection experiment proves that the total anomaly detection capability after multi-mode fusion is greatly improved.

The above description is only for the purpose of illustrating the technical solutions of the present invention and not for the purpose of limiting the same, and other modifications or equivalent substitutions made by those skilled in the art to the technical solutions of the present invention should be covered within the scope of the claims of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. An equipment state anomaly detection method based on a variational automatic encoder, characterized in that the method comprises the following steps:

S1, collecting vibration signals:

S2, preprocessing the signals to obtain multi-modal data and fusing them:

intercepting the original data X_n according to the selected signal length s (s ∈ Z, s ≠ 0, and s is even), wherein the interception window is shifted backward by l = s/2 after each interception, to obtain m sample data X_i of equal length (m ∈ Z, m ≠ 0; i ∈ (0, m)), with s data points in each sample;

applying the corresponding time-domain, frequency-domain and autocorrelation transformations to the m samples X_i, selecting the first n = s/2 data points of each result and applying 0-1 normalization to obtain, respectively, the waveform sample data Y_i, the spectrum sample data F_i and the autocorrelation sample data R_i; changing the shape structure of the data, superposing and fusing the sample data of the three modalities, and applying -5 to 5 normalization to the fused data, which then serve as the training set;

S3, constructing a variational automatic encoder network;

the encoder network is expressed as the parameterized probabilistic model q_φ(z|x), wherein φ represents the network parameter set of the encoder;

the network uses the sum of the reconstruction loss between the original sample data x and the generated data x̂ and the KL divergence between the encoder network q_φ(z|x) and the distribution p(z|x) of the hidden variable z as the total loss function and the objective function of network optimization, and the total loss function Loss is expressed as:

$$\mathrm{Loss} = L_{\mathrm{recon}}(x, \hat{x}) + D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z \mid x)\big)$$

wherein L_recon is the reconstruction loss, computed over the N samples of the input data (N is the total number of samples in the input data), and D_KL is the KL divergence;

S4, training the VAE network with the training set:

automatically updating the network parameters φ and θ(w, b); stopping training after the network has converged, fixing the network parameters and stopping parameter updating;

selecting sample data with normal and abnormal labels, preprocessing them as in step S2, and inputting them into the network as a test set for testing; applying 0-1 normalization to the reconstruction loss of each group of data in the test set to obtain the prediction score of that group, and then using each prediction score in turn as a threshold to divide the prediction scores of all groups: groups below the threshold are classified as normal and groups above it as abnormal; finally comparing with the true labels to calculate the F1-score under that threshold, and taking the threshold corresponding to the maximum F1-score as the anomaly threshold of the network;

2. The equipment state anomaly detection method based on a variational automatic encoder as claimed in claim 1, characterized in that: in step S2, the specific method for obtaining the sample data by applying the corresponding time-domain, frequency-domain and autocorrelation transformations to the m samples X_i is as follows:

for each X_i, selecting the first n = s/2 data points and applying 0-1 normalization to obtain the waveform sample data Y_i; for each X_i, obtaining the frequency spectrum by Fast Fourier Transform (FFT), selecting the first n = s/2 data points according to the conjugate symmetry of the spectrum of a real sequence, and applying 0-1 normalization to obtain the spectrum sample data F_i; for each X_i, performing the autocorrelation transformation to obtain the autocorrelation data of the signal, selecting the first n = s/2 data points and applying 0-1 normalization to obtain the autocorrelation sample data R_i.

3. The equipment state anomaly detection method based on a variational automatic encoder as claimed in claim 1, characterized in that: in step S2, the method for superposing and fusing the sample data of the three modalities is as follows:

the sample data Y_i, F_i and R_i of each modality are each reshaped from a two-dimensional tensor with the original structure (n, 1) into a three-dimensional tensor (1, a, b) (where a × b = n, a, b ∈ Z, and a, b ≠ 0);

4. The equipment state anomaly detection method based on a variational automatic encoder as claimed in claim 1, characterized in that: in step S3, the encoder network has two outputs, μ and log σ²; the distribution of the intermediate hidden layer variable z is assumed to obey p(z|x) = N(μ, σ²), and z is generated by reparameterization from μ and σ, with the expression z = μ + σ × ε, where ε is sampled from the standard normal distribution N(0, 1).

5. The equipment state anomaly detection method based on a variational automatic encoder as claimed in claim 1, characterized in that: in step S4, when the training set is used to train the network, the hyper-parameters of the VAE network, including the learning rate, are set first, the network parameters are initialized, and the network parameters are automatically updated by gradient descent