CN116448419A

CN116448419A - Zero sample bearing fault diagnosis method based on depth model high-dimensional parameter multi-target efficient optimization

Info

Publication number: CN116448419A
Application number: CN202310237866.3A
Authority: CN
Inventors: 张睿; 白晓露; 张永梅; 潘理虎; 胡立华; 谢斌红
Original assignee: Taiyuan University of Science and Technology
Current assignee: Taiyuan University of Science and Technology
Priority date: 2023-03-14
Filing date: 2023-03-14
Publication date: 2023-07-18

Abstract

The invention discloses a zero sample bearing fault diagnosis method based on high-dimensional parameter multi-objective efficient optimization of a depth model, and relates to the field of intelligent bearing fault diagnosis. The method comprises the following steps: constructing a visual feature extractor based on high-dimensional space domain conversion; constructing a component optimization strategy for the visual feature extractor based on a surrogate-assisted (proxy-assisted) multi-objective evolutionary algorithm; constructing a semantic matrix with the aid of statistical features of the sequence signals; and constructing a visual-semantic self-encoding zero sample mapping strategy with outlier interpolation. The invention addresses technical bottlenecks in existing rotating machinery fault diagnosis, namely insufficient extraction of discriminative features, inefficient balancing of detection performance during model reconstruction, imbalanced sample quantities across classes, and limited identification of abnormal/zero samples. It thereby helps ensure safe and reliable operation of equipment, reduces spare part cost, machine downtime and maintenance time as far as possible, and provides a new approach to key problems that restrict the development of bearing fault diagnosis in actual engineering scenarios.

Description

Zero sample bearing fault diagnosis method based on depth model high-dimensional parameter multi-target efficient optimization

Technical Field

The invention relates to the field of intelligent diagnosis of bearing faults, in particular to a zero sample bearing fault diagnosis method based on high-dimensional parameter multi-objective efficient optimization of a depth model.

Background

In modern industry, rotating machinery is indispensable, from large-scale manufacturing sectors such as wind power generation, aerospace components and mining machinery to the automated production equipment of small enterprises. To meet the demands of ever-evolving industrial production and manufacturing, mechanical devices are designed to be increasingly sophisticated, complex and intelligent. However, the failure rate of mechanical equipment gradually increases under the influence of the operating environment, the manufacturing process and other factors.

Conventional fault diagnosis generally relies on model-based or signal-processing-based methods. However, as modern mechanical devices grow in scale and complexity, model-based methods struggle to build a numerical model that reliably reflects the working characteristics of the mechanical system, while signal-processing-based methods depend too heavily on the technical experience of personnel and have difficulty extracting effective fault features from massive data. Data-driven intelligent fault diagnosis algorithms, by virtue of their strong feature extraction capability, are gradually replacing complex numerical modeling and elaborate signal processing, and have become a research hotspot in rotating machinery fault diagnosis at home and abroad. However, existing data-driven intelligent fault diagnosis methods depend heavily on large amounts of ideal labeled data. In actual engineering scenarios, although equipment accumulates massive data through long-term operation, data usable for training an intelligent fault diagnosis model are lacking, as shown specifically below.

(1) Fault types without historical training data, i.e., fault diagnosis under zero sample conditions. Compared with the typical fault types produced under laboratory procedures, rolling bearings in actual engineering scenarios readily develop complex and varied fault types under severe working conditions such as heavy load and strong impact, so that no labeled or unlabeled historical training data are available for training an intelligent fault diagnosis model. How to use existing condition monitoring data to identify fault types with no historical record (i.e., unseen faults) without shutting down for inspection, and to improve the accuracy of that identification, is a key problem in fault diagnosis at the present stage that urgently needs to be solved.

(2) The structure of mainstream fault diagnosis models is generally chosen by manual design, which depends largely on the designer's experience. The final model structure often has to be determined through repeated experiments, multiple indexes such as diagnosis speed and detection performance are difficult to balance, the process is time-consuming and inefficient, and the generalization of the diagnosis model is difficult to guarantee. How to provide an optimal model component structure with trusted output while automatically balancing and optimizing diagnostic model performance therefore still requires deeper study.

Disclosure of Invention

The invention provides a zero sample bearing fault diagnosis method based on high-dimensional parameter multi-objective efficient optimization of a depth model, which aims to solve technical bottlenecks in existing rotating machinery fault diagnosis, namely insufficient extraction of discriminative features, inefficient balancing of detection performance during model reconstruction, imbalanced sample quantities, and limited identification of abnormal/zero samples.

The invention is realized by the following technical scheme: a zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model comprises the following steps:

1) Construction of the visual feature extractor for high-dimensional space domain conversion:

Bearing fault diagnosis is usually carried out by analyzing only the vibration signals acquired by sensors, yet the fault feature information contained in the collected one-dimensional time-domain sequence is not obvious and the correlation among features is insufficiently expressed. To address this, the invention uses a deep convolutional neural network model to construct a visual feature extractor for high-dimensional space domain conversion. First, sensors collect sequence signals under various states, and the collected data are mapped into a high-dimensional Gramian Angular Summation Field (GASF). Features are then decomposed in the high-dimensional space by fully connected layers to explore which features are usable in the high-dimensional representation of a data sample, and, together with the convolutional neural network, the visual features of the data sample are extracted efficiently. The GASF transformation is given by formula (1), and the overall framework of the visual feature extractor is shown in Table 1;

$$\mathrm{GASF} = \tilde{X}^{T}\tilde{X} - \left(\sqrt{I - \tilde{X}^{2}}\right)^{T}\sqrt{I - \tilde{X}^{2}} \qquad (1)$$

where $\tilde{X}$ is the converted polar coordinate sequence and $I$ is the unit row vector $[1, \dots, 1]$. The conversion encodes the time series in a polar coordinate system and exposes different information granularities of the samples; each element of the Gramian matrix is the trigonometric function value of an angle, and the Gramian angular summation field is constructed from the trigonometric function of the sum of angles;

Table 1. Overall framework of the visual feature extractor
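As an illustration of the GASF conversion in step 1), the following is a minimal NumPy sketch. It assumes the series is first min-max rescaled to [-1, 1] so that the polar angle arccos is defined; the rescaling rule, the function name `gasf` and the synthetic test signal are assumptions made for the example and are not specified in the text.

```python
import numpy as np

def gasf(series):
    """Gramian Angular Summation Field of a 1-D series (sketch)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    # Assumed preprocessing: min-max rescale to [-1, 1] (requires hi > lo).
    x_tilde = np.clip((2.0 * x - hi - lo) / (hi - lo), -1.0, 1.0)
    # With phi = arccos(x_tilde):
    # cos(phi_i + phi_j) = x_i * x_j - sqrt(1 - x_i^2) * sqrt(1 - x_j^2)
    comp = np.sqrt(1.0 - x_tilde ** 2)
    return np.outer(x_tilde, x_tilde) - np.outer(comp, comp)

# Synthetic stand-in for a vibration segment (hypothetical data).
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
image = gasf(signal)  # 64 x 64 matrix that a CNN could take as an image
```

The resulting matrix is symmetric, and its diagonal holds cos(2φ_i), so the original rescaled values can be recovered from it.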

2) Construction of the visual feature extractor component optimization strategy based on a surrogate-assisted multi-objective evolutionary algorithm:

The visual feature extractor for high-dimensional space domain conversion constructed above is only a rough framework: only the number of nodes in the last fully connected visual feature extraction layer is determined, while the settings of many internal structural components and parameters/hyperparameters remain unknown. Configuring the visual feature extractor purely by manual experience suffers from subjective, poorly justified choices and a high redesign cost. The invention therefore constructs a surrogate-assisted (proxy-assisted) evolutionary algorithm based on a constrained Dropout neural network (Constrained Dropout Neural Network based Surrogate-assisted Evolution Algorithm, CDNNEA) as the component optimization strategy for the visual feature extractor, and uses it to adaptively search the internal components of the deep model that extracts visual features. The optimization objective is extended to three dimensions: identification accuracy, model complexity and training time. In addition, to reduce the time and hardware cost of model training and evaluation within a single optimization process, a highly scalable surrogate model is constructed to assist the search strategy. The specific implementation is as follows:

The overall framework of the CDNNEA algorithm is shown in Fig. 2. The inputs are: the maximum number of evaluations MaxFE, the evaluation counter FE, the population size P, the decision variable dimension d, the cost function f, the number σ of solutions to be truly evaluated, and the evaluation ratio ρ. The output is the non-dominated solution set (X, Y) of the real problem. The specific steps are as follows:

(1) Generate the initial solutions: X = LatinHypercube(11d-1), Y = f(X);

(2) FE = 11d-1;

(3) Start the algorithm iteration: WHILE FE ≤ MaxFE DO;

(4) Train the surrogate model on the training data set: C-dropout = TrainingData(X, Y);

(5) Search the non-dominated solution set in the population: (X₁, Y₁, ρ₁, ρ₂) = Estimate(P, C-dropout);

(6) Select σ of the obtained non-dominated solutions for real evaluation according to the management criterion: X₂ = Selection(X₁, Y₁, ρ₁, ρ₂, C-dropout, σ), Y₂ = f(X₂);

(7) X' = X ∪ X₂, Y' = Y ∪ Y₂;

(8) Update the training data set: (X, Y) = Update(X', Y', 11d-1, σ);

(9) FE = FE + 1, ρ₁ = ρ₂;

(10) END WHILE;

Step (1) generates the initial solutions: a set of initial samples is drawn, evaluated with the real objective function, and used to build the initial training data set. Steps (3) and (4) begin the main iterative process of the algorithm, in which step (4) trains the surrogate (proxy) model on the training data set. Step (5) performs the optimization search for non-dominated solutions in the population; the objective evaluations involved rely only on the computationally cheap surrogate model, without expensive real computation. Finally, because the guidance provided by the initially built surrogate model is not necessarily correct, the quality of the non-dominated solutions and the estimation accuracy of the model would be difficult to guarantee without an update strategy. Step (6) therefore selects σ of the obtained non-dominated solutions for real evaluation according to the management criterion, and step (8) updates the training data set.

In step (1), before optimization starts, CDNNEA uses Latin hypercube sampling to generate 11d-1 uniformly distributed sampling points in the decision space, where d is the decision variable dimension, and evaluates them with the real objective function to be optimized. The resulting decision variables and their corresponding objective values form the data set used to train the surrogate model. Suitability of the sampling number set to 11d-1 can be referred to.
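The loop in steps (1) to (10) can be sketched as follows. This is only a structural outline under stated assumptions: the toy bi-objective function `f`, the 1-nearest-neighbour `surrogate_predict` (a placeholder for the C-dropout network), the random candidate generation (a placeholder for the PeEA search), taking the first σ candidates as the management criterion, and keeping the newest 11d-1 points as the update rule are all hypothetical stand-ins, since the text does not spell these components out here.

```python
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n, d):
    """n points in [0, 1)^d with one point per stratum along every axis."""
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + rng.random((n, d))) / n

def non_dominated(Y):
    """Boolean mask of rows of Y not dominated by another row (minimisation)."""
    mask = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        dominators = np.all(Y <= y, axis=1) & np.any(Y < y, axis=1)
        mask[i] = not dominators.any()
    return mask

def f(X):
    """Toy 'expensive' bi-objective function (hypothetical stand-in)."""
    f1 = X[:, 0]
    g = 1.0 + X[:, 1:].sum(axis=1)
    return np.column_stack([f1, g * (1.0 - np.sqrt(f1 / g))])

def surrogate_predict(X_train, Y_train, X_query):
    """1-nearest-neighbour surrogate; placeholder for the C-dropout model."""
    dist = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    return Y_train[dist.argmin(axis=1)]

d, sigma, pop = 3, 2, 50
n_init = 11 * d - 1
X = latin_hypercube(n_init, d)                        # step (1)
Y = f(X)
fe = n_init                                           # step (2)
max_fe = n_init + 5
while fe <= max_fe:                                   # step (3)
    # step (4): the nearest-neighbour placeholder needs no explicit training
    candidates = rng.random((pop, d))                 # step (5): search (random here)
    predicted = surrogate_predict(X, Y, candidates)
    front = candidates[non_dominated(predicted)]
    X2 = front[:sigma]                                # step (6): pick up to sigma points
    Y2 = f(X2)                                        # real evaluation
    X, Y = np.vstack([X, X2]), np.vstack([Y, Y2])     # step (7)
    X, Y = X[-n_init:], Y[-n_init:]                   # step (8): assumed update rule
    fe += 1                                           # step (9)

front_mask = non_dominated(Y)
pareto_X, pareto_Y = X[front_mask], Y[front_mask]     # output (X, Y)
```

Only `f` is called on real decision vectors; every other evaluation inside the loop goes through the cheap surrogate, which is the point of the scheme.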

3) Semantic matrix construction under the assistance of sequence signal statistical characteristics:

Supervised classification usually uses one-hot encoding as label information, so its labels are unrelated to one another and it cannot classify unseen classes. The auxiliary information of seen and unseen classes, however, is related. This auxiliary information mainly comes from manually defined semantic descriptions, word vectors obtained by natural language processing, or a mixture of the two; it indirectly establishes an association between seen and unseen classes, so that unseen classes can be classified correctly.

Manually defined semantic description information takes two main forms, binary description and continuous value description, and is mainly a summary of category characteristics made by people on the basis of prior knowledge. The design of semantic description information for bearing fault sequence signals must satisfy the following conditions:

(1) Semantic: the information can be obtained from human-readable descriptions;

(2) Discriminative: the semantic descriptions of different fault types are different;

(3) Consistent: once the number of attributes is determined, the dimension of the fault semantic description vector is fixed, and the same position of the vector represents the same attribute.

The binary description indicates whether each attribute is abnormal. The attributes of a fault semantic description can be the various indexes of the sensor; an attribute takes the value 1 if abnormal and 0 if normal. Each fault is represented by a vector whose attribute values are 0 or 1, and this vector is the fault semantic description. With the binary representation the semantic description vector is easy to construct, but the description of the attributes is incomplete, because in many cases the state of an attribute under a fault is not definite.

The continuous value representation tends to describe faults more accurately than the binary representation, but it is harder to construct and more strongly affected by individual subjectivity. The attributes therefore usually have to be scored by domain experts, with a higher score indicating a more serious anomaly, and the final continuous attribute value is obtained by averaging the scores of several experts. However, zero sample learning has been little studied in fault diagnosis, and the semantic information needed for continuous-valued fault descriptions is difficult to express, so the construction of semantic matrices for sequence signals in fault diagnosis still requires further study.

Statistical analysis follows the theoretical statistical laws of random processes and mainly applies methods of probability and statistics to explore and discover the random behavior of things. In intelligent bearing fault diagnosis, the vibration frequency is uncertain because of the experimental environment and the bearing material, so the collected signals contain random components and are difficult to describe with exact mathematical expressions. Such random signal data, however, often exhibit specific statistical regularities after extensive experiments. Interference signals in intelligent bearing fault diagnosis are often random and are influenced by the internal structure of the material, and acoustic deflection, acoustic beam distortion and similar conditions can occur in vibration acceleration signal identification, so the features must conform to the nature of the defect signals, and an appropriate number of features has an important influence on defect identification. To find semantic vectors that both conform to the nature of the fault signals and are appropriate in number, the following are selected: root mean square value, square-root amplitude, absolute average amplitude, standard deviation, maximum value, minimum value, peak-to-peak value, kurtosis, skewness, eighth-order moment coefficient, sixteenth-order moment coefficient, waveform index, peak index, pulse index, margin index, kurtosis index, skewness index, mean square spectrum, spectrum center of gravity, frequency domain variance, correlation factor, harmonic factor and spectral origin moment. These 24 time-domain or frequency-domain features serve as the statistical feature semantic representation model of bearing faults, as follows:

Root mean square value; square-root amplitude; absolute average amplitude;

Standard deviation; maximum value $X_{max} = \max|x_i|$; minimum value $X_{min} = \min|x_i|$;

Peak-to-peak value $V_{pp} = \max|x_i| - \min|x_i|$; kurtosis (two forms);

Skewness; eighth-order moment coefficient; sixteenth-order moment coefficient;

Waveform index; peak index; pulse index; margin index;

Kurtosis index; skewness index;

Mean square spectrum; spectrum center of gravity;

Frequency domain variance; correlation factor;

Harmonic factor; spectral origin moment.
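To make the idea of a statistical-feature semantic vector concrete, the sketch below computes a subset of the time-domain features listed above. The maximum, minimum and peak-to-peak values follow the expressions given in the text; the remaining definitions are common textbook forms and are assumptions here, since the corresponding formulas are not reproduced in this section. The function name and the synthetic signal are hypothetical.

```python
import numpy as np

def time_domain_features(x):
    """Subset of time-domain statistics for one signal segment (sketch)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))
    sra = np.mean(np.sqrt(abs_x)) ** 2        # square-root amplitude
    abs_mean = abs_x.mean()                   # absolute average amplitude
    std = x.std()
    peak = abs_x.max()
    centered = x - x.mean()
    return {
        "rms": rms,
        "square_root_amplitude": sra,
        "abs_mean": abs_mean,
        "std": std,
        "max": abs_x.max(),                        # X_max = max|x_i|
        "min": abs_x.min(),                        # X_min = min|x_i|
        "peak_to_peak": abs_x.max() - abs_x.min(), # V_pp as written in the text
        "skewness": np.mean(centered ** 3) / std ** 3,   # assumed definition
        "kurtosis": np.mean(centered ** 4) / std ** 4,   # assumed definition
        "waveform_index": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
        "margin_index": peak / sra,
    }

# Synthetic stand-in for one vibration segment (hypothetical data).
signal = np.sin(np.linspace(0.0, 20.0 * np.pi, 2048))
semantic_vector = np.array(list(time_domain_features(signal).values()))
```

Stacking one such vector per fault class, extended with the frequency-domain quantities, would give a semantic matrix with one fixed position per attribute, which is what the consistency condition above requires.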

4) Building a visual-semantic self-coding zero sample mapping strategy of outlier interpolation:

Once the visual-semantic mapping is established, the similarity between any unseen class data and the unseen class prototypes can be calculated, and the unseen classes are classified on the basis of this similarity. The invention adds a constraint of specific semantic information in the mapping layer to constrain the reconstruction, realizing supervised learning of the projection function. Semantic attribute descriptions or word vectors serve as the transferred knowledge, the hidden layer is set to the sample semantic attributes, an autoencoder maps the visual features into the semantic space, and a decoder reconstructs the original visual features; the structure is shown in Fig. 3. The specific steps are as follows:

The objective function of the zero sample learning model is constructed as follows:

$$\min_{W, W^{*}} \left\| X - W^{*} W X \right\|_F^2 \quad \text{s.t.} \quad WX = S \qquad (8)$$

In the formula, the input sample data is $X \in R^{d \times N}$, where $d$ is the feature dimension of a sample and $N$ is the total number of samples; the projection matrix is $W \in R^{k \times d}$, where $k$ is the dimension of the sample attributes; and the sample attributes are $S \in R^{k \times N}$. To simplify the model, let $W^{*} = W^{T}$. Because the hard constraint $WX = S$ is difficult to satisfy while solving, the above formula is rewritten as:

$$\min_{W} \left\| X - W^{T} S \right\|_F^2 + \lambda \left\| WX - S \right\|_F^2 \qquad (9)$$

where $\|\cdot\|_F$ is the Frobenius norm, the first term $\| X - W^{T} S \|_F^2$ is the zero sample feature learning term, the second term $\| WX - S \|_F^2$ is the visual-semantic constraint term that constrains the projection matrix $W$, and $\lambda$ is a hyperparameter that balances the two terms. Taking the derivative of formula (9) with respect to $W$ and simplifying with the properties of the matrix trace gives:

$$-2SX^{T} + 2SS^{T}W + 2\lambda WXX^{T} - 2\lambda SX^{T} \qquad (10)$$

Setting formula (10) equal to 0 gives:

$$SS^{T}W + \lambda WXX^{T} = SX^{T} + \lambda SX^{T} \qquad (11)$$

Let $A = SS^{T}$, $B = \lambda XX^{T}$ and $C = (1+\lambda)SX^{T}$; the above formula can then be written as:

$$AW + WB = C \qquad (12)$$

Formula (12) is a Sylvester equation, and solving it with the Bartels-Stewart algorithm yields the optimal visual-semantic mapping matrices $W$ and $W^{T}$. Meanwhile, to eliminate the influence of incomplete or abnormal data on the soundness of the mapping matrix $W$ during the solution of the Sylvester equation, the invention applies mean interpolation on the basis of the mapping matrix $W$: if abnormal values are detected, they are replaced by mean interpolation within the current attribute column. That is, abnormal values are interpolated by a moving average window method, in which the non-abnormal values of the column are summed and averaged to give the interpolation value, this value is assigned to the missing entries, and the interpolated column finally replaces the original column. Experiments show that this mean interpolation greatly improves the mapping similarity from the visual space to the semantic space and effectively handles abnormal data in the visual-semantic mapping process of the model.
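The two operations described above, solving AW + WB = C and replacing abnormal entries by the column mean, can be sketched in NumPy as follows. Note the substitution: instead of the Bartels-Stewart algorithm named in the text, this sketch solves the equivalent Kronecker-product linear system, which is only practical for small k and d (in SciPy, `scipy.linalg.solve_sylvester` is the usual library route). The function names, the random data, the hand-made abnormality mask, and the choice of which matrix is imputed are all assumptions for illustration.

```python
import numpy as np

def impute_column_mean(M, abnormal):
    """Replace flagged entries by the mean of the unflagged entries of their column."""
    M = np.array(M, dtype=float)              # copy, so the input is untouched
    abnormal = np.asarray(abnormal, dtype=bool)
    for j in range(M.shape[1]):
        bad = abnormal[:, j]
        if bad.any() and not bad.all():
            M[bad, j] = M[~bad, j].mean()
    return M

def solve_mapping(X, S, lam):
    """Solve A W + W B = C with A = S S^T, B = lam X X^T, C = (1 + lam) S X^T."""
    k, d = S.shape[0], X.shape[0]
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1.0 + lam) * (S @ X.T)
    # Column-major vec: vec(A W + W B) = (I_d kron A + B^T kron I_k) vec(W)
    K = np.kron(np.eye(d), A) + np.kron(B.T, np.eye(k))
    w = np.linalg.solve(K, C.reshape(-1, order="F"))
    return w.reshape((k, d), order="F")

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 40))   # d = 5 visual features, N = 40 samples
S = rng.standard_normal((3, 40))   # k = 3 semantic attributes
W = solve_mapping(X, S, lam=0.2)
S_hat = W @ X                      # encoder: visual -> semantic
X_hat = W.T @ S                    # decoder: semantic -> visual

# Mean interpolation on a small attribute table (rows: samples, columns: attributes).
attrs = np.array([[1.0, 10.0], [3.0, np.nan], [100.0, 30.0]])
mask = ~np.isfinite(attrs)
mask[2, 0] = True                  # flag 100.0 as abnormal by hand
clean = impute_column_mean(attrs, mask)
```

How abnormal values are detected is not stated in the text, so the sketch takes the mask as an input rather than guessing a detection rule.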

Finally, in the label prediction stage for unseen samples, the predicted attributes of an unseen sample are compared with the attribute prototypes of the unseen classes using formula (13) together with cosine similarity (Cosine Similarity), and the label of the unseen sample is thereby predicted:

$$f(x_i) = \arg\min_{j} d\left(\hat{s}_i, s_j\right) \qquad (13)$$

where $\hat{s}_i$ is the predicted attribute of the i-th sample in the target domain, $s_j$ is the prototype attribute of the j-th unseen class, $d(\cdot)$ is the cosine distance, and $f(\cdot)$ is the predicted sample label.
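The label prediction step can be illustrated with a short NumPy sketch: each predicted attribute vector is assigned the label of the unseen-class prototype at the smallest cosine distance. The function name, the two-dimensional attribute vectors and the class names are hypothetical and chosen only to keep the example readable.

```python
import numpy as np

def predict_labels(S_pred, prototypes, labels):
    """Assign each predicted attribute vector to the nearest prototype (cosine distance).

    S_pred: (n, k) predicted attributes, one row per sample.
    prototypes: (c, k) attribute prototypes, one row per unseen class.
    """
    a = S_pred / np.linalg.norm(S_pred, axis=1, keepdims=True)
    b = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cosine_distance = 1.0 - a @ b.T           # (n, c)
    return [labels[j] for j in cosine_distance.argmin(axis=1)]

prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = ["inner_race", "outer_race"]         # hypothetical unseen classes
S_pred = np.array([[0.9, 0.1], [0.2, 0.7]])   # e.g. rows of (W @ X_test).T
predicted = predict_labels(S_pred, prototypes, labels)
```

Because the comparison uses direction only, the scale of the predicted attributes does not affect the assigned label.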

Further, in step (4) of step 2), the surrogate model is trained on the training data set using a scalable surrogate model based on a constrained Dropout neural network (C-dropout). Taking the difference in distribution among the outputs of different sub-models as the starting point, a sample padding mechanism and a loss constraint term are added to improve the credibility of the surrogate model when solving multi-objective problems. For a training set $T = \{(X_i, Y_i)\}$ $(i = 1, 2, \dots, d)$ composed of samples with batch size $d$, where $d$ is consistent with the number of decision variables, the purpose of the original back-propagation is to minimize the mean square error function shown in formula (2). In the C-dropout process, each batch of training data $X_i$ is copied, stacked and input as new samples. The purpose is to simulate two forward passes of the same data and obtain two distributions of model predictions, $P_1 = P(X_i | Y_i)$ and $P_2 = P'(X'_i | Y'_i)$. Meanwhile, owing to the dynamically changing structure of the Dropout network, this way of stacking new samples also augments, to some extent, the scarce truly evaluated samples in expensive optimization problems. The goal of minimizing the error loss then becomes formula (3), and the problem of reducing the discrepancy between sub-models is converted into how to constrain the output distributions $P_1$ and $P_2$. Because the Szechwan correlation coefficient effectively measures the degree of correlation between variables, it is applied to measure the inconsistency of the two outputs, giving the constraint term shown in formula (4). This constraint term is combined with $l_2$ to form the final training loss function shown in formula (5), which reduces the degrees of freedom of the parameters in the original network space. After the error is obtained, the model performs back-propagation for the set number of iterations, updating the weights and biases by gradients and the chain rule, and the training stage is finally completed;
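The shape of such a loss can be sketched as below. Formulas (2) to (5) are not reproduced in this section, so everything specific here is an assumption: the one-hidden-layer network, the inverted-dropout forward pass, the use of the Pearson correlation coefficient (via `np.corrcoef`) as a stand-in for the correlation measure named in the text, the weight `alpha`, and the way the error and constraint terms are added together. The sketch only shows the mechanism of copying a batch, running two stochastic passes and penalizing disagreement.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(X, W1, W2, drop_rate):
    """One stochastic forward pass of a one-hidden-layer network with dropout."""
    h = np.maximum(X @ W1, 0.0)                   # ReLU hidden layer
    keep = rng.random(h.shape) >= drop_rate       # independent mask per row
    h = h * keep / (1.0 - drop_rate)              # inverted dropout scaling
    return h @ W2

def c_dropout_loss(X, Y, W1, W2, drop_rate=0.2, alpha=1.0):
    """Assumed loss: mean squared error of both passes plus a consistency penalty."""
    stacked = np.vstack([X, X])                   # copy and stack the batch
    P = forward(stacked, W1, W2, drop_rate)
    P1, P2 = P[: len(X)], P[len(X):]              # two sub-model outputs, same inputs
    mse = 0.5 * (np.mean((P1 - Y) ** 2) + np.mean((P2 - Y) ** 2))
    r = np.corrcoef(P1.ravel(), P2.ravel())[0, 1] # Pearson correlation (stand-in)
    return mse + alpha * (1.0 - r)

X = rng.standard_normal((8, 4))                   # batch of decision vectors
Y = rng.standard_normal((8, 2))                   # truly evaluated objective values
W1 = rng.standard_normal((4, 16))
W2 = rng.standard_normal((16, 2))
loss = c_dropout_loss(X, Y, W1, W2)
```

With `drop_rate=0.0` the two passes coincide, the penalty vanishes, and the value reduces to the ordinary mean squared error, which is a convenient sanity check on the construction.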

Further, in step (5) of step 2), the optimization search for non-dominated solutions in the population is performed on the surrogate model with the PeEA algorithm, which performs well on high-dimensional multi-objective problems. Through its estimation of the Pareto front curvature and of the similarity between solutions in the high-dimensional space, PeEA searches for well-performing solutions even when characteristics such as the shape and continuity of the Pareto front are unknown, and the resulting candidate solution set provides the individuals that require real evaluation to guide the update of the surrogate model;

In its own environmental selection strategy, PeEA first uses an achievement scalarizing function to locate the key points that best represent the curvature of the Pareto front, and approximates the front curvature from the ratio of the distances between these points. Then, according to the estimated curvature, a suitable index is selected through the constructed adaptive scalarizing function to maximize the convergence of the algorithm. Meanwhile, a dedicated similarity measure improves the consistency of the solution set when outliers exist in the high-dimensional environment. In PeEA, for a point $x = (x_1, x_2, \dots, x_m)$ on the front of an m-objective optimization problem, the Pareto front shape is estimated by formula (6):

$$x_1^{q} + x_2^{q} + \dots + x_m^{q} = 1 \qquad (6)$$

where q is a positive parameter representing the curvature of the front; for 0<q<1, q=1 and q>1, the front is concave, linear and convex, respectively. To determine q, PeEA first normalizes the objectives using the minimum and extreme points on each objective, so as to provide a hyperplane equidistant from the objective axes as the reference plane. When the estimated front shape is concave or linear, population convergence is measured in a linear scalarization form; when the estimated front shape is convex, population convergence is measured in a Chebyshev distance form.
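The switch between the two convergence measures can be illustrated as follows. This is a simplified reading, not the PeEA procedure itself: the min-max normalization, the plain sum as the linear form and the plain maximum as the Chebyshev form are assumptions, and the curvature parameter q is taken as given rather than estimated.

```python
import numpy as np

def normalise(F):
    """Min-max normalise each objective column to [0, 1] (assumed normalisation)."""
    F = np.asarray(F, dtype=float)
    ideal, worst = F.min(axis=0), F.max(axis=0)
    return (F - ideal) / np.where(worst > ideal, worst - ideal, 1.0)

def convergence(F, q):
    """Convergence score per solution (smaller is better), chosen by front shape."""
    F = np.asarray(F, dtype=float)
    if q <= 1.0:                  # concave or linear front, in the text's convention
        return F.sum(axis=1)      # linear form
    return F.max(axis=1)          # Chebyshev form for a convex front

F = np.array([[0.2, 0.6], [0.5, 0.5]])
linear_scores = convergence(F, q=0.5)      # sums of the objectives
chebyshev_scores = convergence(F, q=2.0)   # largest objective per solution
```

The example shows why the choice matters: the first solution ranks better under the linear form, while the second ranks better under the Chebyshev form.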

In step 2), when CDNNEA is applied to the visual feature extraction model, the test time is the total time required to identify the test set, the computational cost is measured by the number of floating-point operations (FLOPs) of the model, and the test error is calculated by formula (7):

Where t (i) represents the true label, p (i) is the predicted label, and batch is the batch size.

Compared with the prior art, the zero sample bearing fault diagnosis method based on depth model high-dimensional parameter multi-objective efficient optimization has the following beneficial effects: (1) Current mainstream rolling bearing fault diagnosis models mostly remain at the stage of training on large numbers of labeled samples and cannot cope with the complex/unseen new faults that are prominent in actual engineering tests. The invention therefore takes bearing fault diagnosis under few-sample and zero-sample conditions as its research object, establishes a coupling between labeled fault categories and newly added unseen fault categories through an embedding space at the visual and semantic levels, extracts semantic vectors of known and unseen faults to build the semantic space, learns a subspace shared by visual features and semantic attributes, obtains visual feature prototypes of unknown faults with the mapping function, and finally builds a zero sample diagnosis model for complex/unseen faults that predicts and identifies unknown fault categories; (2) In intelligent bearing fault diagnosis, acoustic deflection, acoustic beam distortion and similar conditions can occur in vibration acceleration signal identification, so that the semantic information of a signal sample is difficult to describe with an exact mathematical expression. To construct semantic vectors that reflect the nature of the fault signals with an appropriate number of features, 24 time-domain/frequency-domain features, such as the root mean square value, square-root amplitude, absolute average amplitude, mean square spectrum, spectrum center of gravity and eighth-order moment coefficient, are selected as the statistical feature semantic representation model of bearing faults, with the aim of finding an appropriate number of features that conform to the nature of the defect signals; (3) Designing the components of the visual feature extraction model in the zero sample diagnosis method, such as structures and parameters, is extremely time-consuming or even infeasible if everything is adjusted manually, and the structures and parameters of a manually designed model are very likely to be redundant. The invention improves model generalization and reconstruction performance and balances the conflict between efficiency and accuracy of the diagnosis model: it constructs a search space for the model components to be optimized, determines the parameters to be optimized and takes them as decision variables, sets the constraints and search range of each variable, formulates evaluation indexes that reflect requirements such as evaluation speed, fault sample evaluation accuracy and model computational complexity, and selects a suitable Pareto optimal solution according to the specific engineering requirements to configure the trusted components of the diagnosis model. This finally balances the speed and accuracy of the reconstructed model and extracts higher-quality visual features of samples more efficiently.

Drawings

FIG. 1 is a CDNNEA application flowchart of step 2) of the present invention.

FIG. 2 is a flowchart of the overall framework of Algorithm 1 (CDNNEA) of step 2) of the present invention.

Fig. 3 is a diagram showing the structure of the visual-semantic map according to step 4) of the present invention.

Fig. 4 shows the average IGD values obtained over the DTLZ test problem for three proxy approaches (the optimal results are marked in bold) in an embodiment of the invention.

FIG. 5 shows the average IGD values obtained over WFG test questions (optimal results are marked in bold) for three proxy approaches in an embodiment of the invention.

FIG. 6 shows the average IGD values obtained for the 3-m DTLZ test problem for CDNNEA versus six comparison algorithms in an embodiment of the invention.

FIG. 7 shows the average IGD values obtained for the 40-d DTLZ test problem for CDNNEA versus six comparison algorithms in an embodiment of the invention.

Fig. 8 is a table of CWRU data set partitioning descriptions in an embodiment of the present invention.

Fig. 9 is a table showing the preset ranges of parameters before optimizing the visual characteristic extraction model according to an embodiment of the present invention.

FIG. 10 is a table of parameter values attached to three sets of preference solutions in an embodiment of the present invention.

FIG. 11 is a table showing the performance of the feature extraction model according to the embodiment of the present invention, compared with four comparison algorithms.

FIG. 12 is a sample distribution of unknown class identification samples of a zero sample diagnostic model in an embodiment of the present invention.

Detailed Description

The invention is further described below in conjunction with specific embodiments.

A zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model comprises the following steps:

1) Constructing a visual feature extractor with high-dimensional spatial-domain conversion using a convolutional neural network depth model. First, sequence signals in the various states are collected by a sensor, and the collected data are mapped into a high-dimensional Gramian angular summation field (GASF). The features are then decomposed in the high-dimensional space by fully connected layers so as to find usable features of the data samples in the high-dimensional space, and, together with the convolutional neural network, the visual features of the data samples are extracted efficiently. The GASF transformation is given by formula (1), and the overall framework of the visual feature extractor is constructed accordingly;

In formula (1), the sequence variable is the time series after conversion to polar coordinates, and I is the unit row vector [1, …, 1]. The conversion encodes the time series in a polar coordinate system and exposes the different information granularities of the samples. Each element of the Gram matrix is the trigonometric function value of an angle, and the Gramian angular summation field is constructed from the trigonometric function of the sum of angles;
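The conversion described above can be illustrated with a minimal sketch. It assumes the commonly used GASF definition, in which the series is rescaled to [-1, 1], each value is converted to an angle by arccos, and each matrix entry is cos(φ_i + φ_j); the min-max rescaling and the function name `gasf` are assumptions, since formula (1) is not reproduced in the text. The series is assumed not to be constant.

```python
import numpy as np

def gasf(series):
    """Gramian angular summation field of a 1-D series (assumed standard form)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    # Assumed min-max rescaling to [-1, 1] so that arccos is defined.
    x_scaled = np.clip(2.0 * (x - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(x_scaled)  # polar-coordinate angle of every time point
    # Entry (i, j) is cos(phi_i + phi_j): the "summation" in the field's name.
    return np.cos(phi[:, None] + phi[None, :])

signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))  # illustrative signal
image = gasf(signal)
```

The resulting matrix is symmetric and can be fed to a convolutional network as a single-channel image.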

2) Constructing a proxy-assisted evolutionary algorithm based on a constrained Dropout neural network (CDNNEA) as the component optimization strategy of the visual feature extractor. The strategy adaptively searches the internal components of the depth model used for visual feature extraction and extends the optimization target to three dimensions: recognition accuracy, model complexity and training time. To reduce the time cost and hardware cost of model training and evaluation in a single-objective optimization process, a highly scalable proxy model is constructed to assist the search strategy. The specific implementation steps are as follows:

The overall framework of the CDNNEA algorithm takes the following inputs: the maximum number of evaluations MaxFE, the evaluation counter FE, the population size P, the decision variable dimension d, the cost function f, the number σ of truly evaluated solutions and the evaluation ratio ρ. Its output is the non-dominated solution set (X, Y) of the real problem. The specific steps are as follows:

① generate the initial solutions: X = LatinHypercube(11d-1), Y = f(X);

② FE = 11d-1;

③ start the algorithm iteration: WHILE FE ≤ MaxFE DO;

⑦ X′ = X ∪ X₂, Y′ = Y ∪ Y₂;

⑧ update the training data set: (X, Y) = Update(X′, Y′, 11d-1, σ);

⑨ FE = FE + 1, ρ₁ = ρ₂;

⑩ END WHILE;
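The initialization in steps ① and ② can be sketched as follows. This is only an illustration under stated assumptions: `latin_hypercube` is a hand-written sampler, `toy_objectives` is a hypothetical stand-in for the expensive cost function f, and the unit bounds are chosen for the example.

```python
import numpy as np

def latin_hypercube(n_points, lower, upper, rng):
    """Draw n_points samples with exactly one sample per stratum in every dimension."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    d = lower.size
    # One random point inside each of n_points equal-width strata per dimension.
    strata = (np.arange(n_points)[:, None] + rng.random((n_points, d))) / n_points
    # Shuffle the strata independently in every dimension.
    for j in range(d):
        strata[:, j] = rng.permutation(strata[:, j])
    return lower + strata * (upper - lower)

def toy_objectives(X):
    """Hypothetical two-objective stand-in for the expensive cost function f."""
    f1 = np.sum(X ** 2, axis=1)
    f2 = np.sum((X - 1.0) ** 2, axis=1)
    return np.column_stack([f1, f2])

d = 4                                    # decision variable dimension (example value)
rng = np.random.default_rng(0)
X = latin_hypercube(11 * d - 1, np.zeros(d), np.ones(d), rng)   # step 1
Y = toy_objectives(X)
FE = X.shape[0]                          # step 2: FE = 11d-1 real evaluations used
```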

In step ④, the proxy model is trained with the training data set. The proxy is a scalable constrained Dropout neural network (C-dropout): starting from the differences between the output distributions of different sub-models, a sample filling mechanism and a loss constraint term are added to improve the credibility of the proxy model when solving multi-objective problems. For a training set T = {(X_i, Y_i)} (i = 1, 2, …, d) composed of samples with batch size d, where d equals the number of decision variables, the original purpose of back-propagation is to minimize the mean square error function of formula (2). In the C-dropout process, the training data X_i of each batch are copied and stacked with the originals as new input samples, which simulates two forward passes of the same data and yields two distributions of the model predictions, P₁ = P(X_i | Y_i) and P₂ = P′(X′_i | Y′_i). Because the Dropout network structure changes dynamically, stacking new samples in this way also augments, to some extent, the scarce truly evaluated samples of an expensive optimization problem. The error loss to be minimized then becomes formula (3), and the problem of reducing sub-model variability becomes that of constraining the output distributions P₁ and P₂. Because the Szechwan correlation coefficient measures the degree of correlation between variables effectively, it is used to measure the inconsistency of the two outputs, which gives the constraint term of formula (4); this term is combined with the l₂ term to form the final training loss function of formula (5), which reduces the degrees of freedom of the parameters in the original network space. After the error is obtained, the model back-propagates for the set number of iterations, updating the weights and biases using the gradients and the chain rule, which completes the training stage;
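The idea of copying a batch, propagating it twice through a Dropout network and penalizing disagreement between the two outputs can be sketched as below. Formulas (2) to (5) are not reproduced in the text, so everything specific here is an assumption: the tiny one-hidden-layer network, the Pearson correlation used as a stand-in for the correlation coefficient named in the text, and the hypothetical weights `alpha` and `beta`.

```python
import numpy as np

def forward(X, W1, W2, p_drop, rng):
    """One stochastic forward pass of a tiny one-hidden-layer network."""
    hidden = np.tanh(X @ W1)
    mask = rng.random(hidden.shape) >= p_drop   # fresh Dropout mask for every row
    hidden = hidden * mask / (1.0 - p_drop)     # inverted-dropout scaling
    return hidden @ W2

def c_dropout_loss(X, Y, W1, W2, p_drop, alpha, beta, rng):
    """Assumed loss: MSE on the stacked batch + inconsistency penalty + l2 term."""
    # Copy the batch and stack it, so each sample is propagated twice
    # under two different Dropout masks.
    X2 = np.vstack([X, X])
    Y2 = np.vstack([Y, Y])
    out = forward(X2, W1, W2, p_drop, rng)
    out_a, out_b = out[: len(X)], out[len(X):]
    mse = np.mean((out - Y2) ** 2)
    # Assumed inconsistency penalty: 1 - Pearson correlation of the two outputs.
    corr = np.corrcoef(out_a.ravel(), out_b.ravel())[0, 1]
    l2 = np.sum(W1 ** 2) + np.sum(W2 ** 2)
    return mse + alpha * (1.0 - corr) + beta * l2

rng = np.random.default_rng(0)
d = 5                                            # batch size = number of decision variables
X = rng.random((d, d))
Y = rng.random((d, 3))
W1 = rng.normal(scale=0.5, size=(d, 8))
W2 = rng.normal(scale=0.5, size=(8, 3))
loss = c_dropout_loss(X, Y, W1, W2, p_drop=0.2, alpha=1.0, beta=1e-4, rng=rng)
```

With `p_drop = 0` the two passes coincide and the penalty vanishes, which is the limiting case the constraint term is meant to approach.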

In step ⑤, the non-dominated solution set of the population is searched. The PeEA algorithm, which performs well on high-dimensional multi-objective problems, searches the optimal solution set on the proxy model. By evaluating the curvature of the Pareto front and the similarity between solutions in the high-dimensional space, it finds well-performing solutions even when characteristics such as the shape and continuity of the Pareto front are unknown, and the resulting candidate solution set supplies the individuals that need real evaluation to guide the update of the proxy model;

In its environmental selection strategy, PeEA first uses an achievement scalarizing function to locate the key points that best represent the curvature of the Pareto front, and approximates the front curvature from the ratio of the distances between these points. An adaptive scalarizing function built from the estimated curvature then selects a suitable indicator to maximize the convergence of the algorithm. In addition, a dedicated similarity measure improves the consistency of the solution set when outliers exist in a high-dimensional environment. In PeEA, for a point x = (x₁, x₂, …, x_m) on the front of an m-objective optimization problem, the Pareto front shape is estimated by formula (6):

where q is a positive parameter representing the curvature of the front; for 0 < q < 1, q = 1 and q > 1 the front is concave, linear and convex respectively. To determine q, PeEA first normalizes the objectives using the minimum point and the extreme point on each objective, so as to provide a hyperplane equidistant from the objective axes as the base plane. When the estimated front shape is concave or linear, population convergence is measured in a linear scaling form; when the estimated front shape is convex, population convergence is measured in a Chebyshev distance form.
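The switch between the two convergence measures can be sketched as follows. This is a simplified illustration, not PeEA itself: the objectives are assumed to be already normalized, the linear form is taken to be an unweighted sum, and the Chebyshev form is taken to be the largest objective value; the function name `convergence` is hypothetical.

```python
import numpy as np

def convergence(objs, q):
    """Convergence score of normalized objective vectors; smaller is better."""
    objs = np.asarray(objs, dtype=float)
    if q <= 1.0:
        # Front labelled concave or linear in the text (q <= 1): linear form.
        return objs.sum(axis=-1)
    # Front labelled convex in the text (q > 1): Chebyshev form.
    return objs.max(axis=-1)

points = np.array([[0.2, 0.3, 0.5],
                   [0.1, 0.1, 0.9]])
linear_scores = convergence(points, q=1.0)
chebyshev_scores = convergence(points, q=2.0)
```

The two measures can rank the same points differently, which is why the estimate of q matters for selection.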

FIG. 1 shows the flowchart of applying CDNNEA to the visual feature extraction model. First, the training set and the test set are divided in proportion and a convolutional neural network model is preset. The decision variable dimension of the CDNNEA algorithm, i.e. the parameters to be optimized in the network model, is then determined, for example: the number of convolution kernels, the convolution kernel size, the activation function in the convolution layers, the pooling mode, the number of nodes of the fully connected layer, the gradient descent function, the learning rate and the batch size. All parameters other than the activation function, pooling mode, learning rate and gradient descent function are integer-valued and are therefore rounded at the first decimal place. The CDNNEA optimization flow is then entered, and the network parameters are optimized in each iteration to minimize three objectives: the test time, the amount of computation and the test error of the network model. In this example, the test time is the total time required to recognize the test set, the amount of computation is measured by the model's floating point operations per second (FLOPS), and the test error is calculated by formula (7).
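Because the three objectives are minimized together, candidate networks are compared by Pareto dominance rather than by a single score. A minimal sketch of a non-dominated filter follows; the candidate values are invented purely for illustration.

```python
import numpy as np

def non_dominated(Y):
    """Boolean mask of the non-dominated rows of Y (all objectives minimized)."""
    Y = np.asarray(Y, dtype=float)
    keep = np.ones(len(Y), dtype=bool)
    for i in range(len(Y)):
        # A row dominates row i if it is no worse everywhere and better somewhere.
        dominates_i = np.all(Y <= Y[i], axis=1) & np.any(Y < Y[i], axis=1)
        keep[i] = not dominates_i.any()
    return keep

# Hypothetical (test time, amount of computation, test error) triples.
candidates = np.array([
    [1.2, 3.0e6, 0.05],
    [0.8, 5.0e6, 0.07],
    [1.5, 3.5e6, 0.06],   # worse than the first row on every objective
    [0.9, 2.0e6, 0.10],
])
mask = non_dominated(candidates)
```

Only the third candidate is removed; the other three are mutually incomparable trade-offs.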

3) Constructing a semantic matrix with the assistance of the statistical characteristics of the sequence signal. Characteristic quantities that are appropriate in number and conform to the nature of the defect signal have an important influence on defect identification. To find semantic vectors that reflect the nature of the defect signal and are optimal in number, 24 time domain or frequency domain features are selected as the statistical feature semantic characterization model of bearing faults: root mean square value, square root amplitude, absolute average amplitude, standard deviation, maximum value, minimum value, peak-to-peak value, kurtosis, skewness, eighth-order moment coefficient, sixteenth-order moment coefficient, waveform index, peak index, pulse index, margin index, kurtosis index, skewness index, mean square spectrum, spectrum center of gravity, frequency domain variance, correlation factor, harmonic factor and spectral origin moment. The time domain or frequency domain features are as follows:

Root mean square value; square root amplitude; absolute average amplitude; standard deviation; maximum value X_max = max|x_i|; minimum value X_min = min|x_i|;

Peak-to-peak value V_pp = max|x_i| - min|x_i|; kurtosis; kurtosis; skewness; eighth-order moment coefficient; sixteenth-order moment coefficient; waveform index; peak index; pulse index; margin index; kurtosis index; skewness index; mean square spectrum; spectrum center of gravity; frequency domain variance; correlation factor; harmonic factor; spectral origin moment.
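A few of the listed time domain features can be computed as in the sketch below. Only the maximum, minimum and peak-to-peak expressions survive in the text; the remaining expressions use the definitions commonly found in the bearing diagnosis literature and are assumptions, as are the dictionary keys and the use of the population standard deviation.

```python
import numpy as np

def time_domain_features(x):
    """Subset of the statistical features, using commonly used definitions."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))            # root mean square value
    sra = np.mean(np.sqrt(abs_x)) ** 2        # square root amplitude
    ama = np.mean(abs_x)                      # absolute average amplitude
    x_max = abs_x.max()                       # X_max = max|x_i|, as in the text
    x_min = abs_x.min()                       # X_min = min|x_i|, as in the text
    return {
        "rms": rms,
        "square_root_amplitude": sra,
        "absolute_average_amplitude": ama,
        "standard_deviation": np.std(x),      # population form (assumption)
        "max": x_max,
        "min": x_min,
        "peak_to_peak": x_max - x_min,        # V_pp, as in the text
        "waveform_index": rms / ama,
        "peak_index": x_max / rms,
        "pulse_index": x_max / ama,
        "margin_index": x_max / sra,
    }

features = time_domain_features([3.0, -4.0])
```

Stacking such values for every fault class gives one row of the semantic matrix per class.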

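4) Constructing a visual-semantic self-encoding zero sample mapping strategy with outlier interpolation. A constraint of specific semantic information is added to the mapping layer to constrain the reconstruction, which gives supervised learning of the projection function. Semantic attribute descriptions or word vectors serve as the transferred knowledge, the information of the hidden layer is set to the semantic attributes of the samples, the encoder of an autoencoder maps the visual features into the semantic space, and the decoder reconstructs the original visual features. The objective function of the zero sample learning model consists of a zero sample feature learning term and a visual-semantic constraint term ‖WX - S‖²_F weighted by λ. Taking its derivative with respect to W and simplifying with the properties of the matrix trace gives:

$$-2SX^{T} + 2SS^{T}W + 2\lambda WXX^{T} - 2\lambda SX^{T} \tag{10}$$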

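Setting formula (10) equal to 0 gives:

$$SS^{T}W + \lambda WXX^{T} = SX^{T} + \lambda SX^{T} \tag{11}$$

Let A = SS^T, B = λXX^T and C = (1+λ)SX^T; formula (11) can then be written as:

$$AW + WB = C \tag{12}$$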

Equation (12) is a Sylvester equation, which is solved with the Bartels-Stewart algorithm to obtain the optimal visual-semantic mapping matrices W and W^T. Meanwhile, based on mean interpolation theory, if outliers are found in the mapping matrix W, they are replaced by mean interpolation along the current attribute column. That is, the abnormal data are interpolated by a moving average window method: the non-abnormal values of the column are summed and averaged, the average is assigned to the abnormal values as interpolation data, and the interpolated column finally replaces the original column;
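For small matrices, the equation AW + WB = C can be checked numerically with the sketch below. It swaps in a plain Kronecker-product vectorization in place of the Bartels-Stewart algorithm so that only NumPy is needed; SciPy's `scipy.linalg.solve_sylvester` solves the same form of equation. The random matrices X and S and the value of λ are illustrative assumptions.

```python
import numpy as np

def solve_sylvester_kron(A, B, C):
    """Solve AW + WB = C by vectorization (not Bartels-Stewart; small sizes only)."""
    k, n = C.shape
    # With column-major vec: vec(AW + WB) = (I_n kron A + B^T kron I_k) vec(W).
    M = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(k))
    w = np.linalg.solve(M, C.reshape(-1, order="F"))
    return w.reshape((k, n), order="F")

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 40))   # hypothetical visual features, one column per sample
S = rng.normal(size=(3, 40))   # hypothetical semantic attributes, one column per sample
lam = 0.5                      # hypothetical value of the balance parameter

A = S @ S.T
B = lam * X @ X.T
C = (1.0 + lam) * S @ X.T
W = solve_sylvester_kron(A, B, C)
residual = np.linalg.norm(A @ W + W @ B - C)
```

The vectorized system grows with the product of the two dimensions, so a dedicated Sylvester solver is preferable for feature dimensions of realistic size.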

Finally, in the label prediction stage, the attributes derived for an unseen sample are compared with the attribute prototypes of the unseen classes using formula (13) and cosine similarity, and the label of the unseen sample is predicted.
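The prediction step can be sketched as a nearest-prototype search under cosine similarity. Formula (13) is not reproduced in the text, so the sketch below assumes the usual form: project the visual features with W, then pick the class whose attribute prototype has the highest cosine similarity. The function name, labels and toy values are hypothetical.

```python
import numpy as np

def predict_labels(X_unseen, W, prototypes, labels):
    """Assign each column of X_unseen the label of the most similar prototype."""
    S_pred = W @ X_unseen                                   # projected attributes, (k, n)
    S_pred = S_pred / np.linalg.norm(S_pred, axis=0, keepdims=True)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    similarity = P @ S_pred                                 # cosine similarities, (classes, n)
    return [labels[i] for i in similarity.argmax(axis=0)]

W_demo = np.eye(2)                                          # toy mapping matrix
prototypes = np.array([[1.0, 0.0],                          # one attribute row per unseen class
                       [0.0, 1.0]])
X_demo = np.array([[0.9, 0.2],
                   [0.1, 0.7]])                             # two samples, one per column
predicted = predict_labels(X_demo, W_demo, prototypes, labels=[6, 7])
```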

5) Evaluation analysis was performed using the procedure described above:

5.1 CDNNEA visual feature extractor component optimization policy performance verification:

(1) Comparison algorithm:

To analyze the performance of the CDNNEA algorithm comprehensively, three different proxy models, C-dropout, Dropout and Kriging, are each tested in combination with the PeEA algorithm in this embodiment, and the comparison demonstrates the feasibility of using C-dropout as a scalable proxy model. CDNNEA is then compared with six advanced algorithms in the field. All experiments of this embodiment were implemented in MATLAB R2020b on a computer with an Intel Core i5-9400F CPU and the Microsoft Windows operating system, and the results of the comparison algorithms were obtained on the PlatEMO platform.

(2) Test problem set:

The DTLZ and WFG test problems are used as benchmark test problems. For each test problem, the maximum number of decision variables and the maximum number of objectives are set to 100 and 20 respectively. On the WFG test set, the number of position-related parameters γ is set to m-1 when the number of objectives m is 3 or 5, and to 2(m-1) when m is 10 or 20.

(3) Parameter setting:

(1) Number of independent runs: for each test problem, each algorithm was run independently 20 times.

(2) Maximum number of true evaluations: in addition to the 11d-1 training data of the initial true samples, 120 additional true solutions were used to test the performance of the comparison algorithm, so the maximum number of true evaluations was set to 11d+119.

(4) C-dropout related parameters: the neuron deactivation probabilities p₁ and p₂ of the two layers are 0.2 and 0.5 respectively, the learning rate is set to 0.01, the batch size equals the number of decision variables, the number of model training iterations I_train is set to 1 × 10⁴, and the number of test iterations I_test is set to 100.

(5) PeEA algorithm parameters: the population size was set to 50 and the maximum number of evaluations for the proxy model was set to 20.

(6) Mutation operator: the polynomial mutation mode is adopted, the mutation probability is set to be 1/d, and the distribution index is set to be 20.

(7) Crossover operator: the simulated binary crossover mode is adopted, the crossover probability is set to be 1.0, and the distribution index is set to be 20.
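The mutation setting in item (6) can be illustrated with the basic form of polynomial mutation. This is a sketch of the textbook operator, not the PlatEMO implementation; the bounds and the parent vector are invented for the example.

```python
import numpy as np

def polynomial_mutation(x, lower, upper, eta, rng):
    """Basic polynomial mutation with per-variable probability 1/d."""
    x = np.asarray(x, dtype=float).copy()
    d = x.size
    for j in range(d):
        if rng.random() < 1.0 / d:              # mutation probability 1/d
            u = rng.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
            # A larger distribution index eta concentrates delta near zero.
            x[j] = x[j] + delta * (upper[j] - lower[j])
    return np.clip(x, lower, upper)

rng = np.random.default_rng(1)
lower = np.zeros(10)
upper = np.ones(10)
parent = np.full(10, 0.5)
child = polynomial_mutation(parent, lower, upper, eta=20.0, rng=rng)
```

With a distribution index of 20 the perturbation is small, so most children stay close to their parents.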

(4) Evaluation index:

The inverted generational distance (IGD) is selected as the evaluation index of the experiment.

It is calculated by formula (14):

$$\mathrm{IGD}(P, Q) = \frac{\sum_{v \in P} \operatorname{dist}(v, Q)}{|P|} \tag{14}$$

where P is a set of objective vectors uniformly distributed on the true Pareto front and Q is the set of objective vectors obtained by the algorithm. dist(v, Q) is the Euclidean distance from an objective vector v ∈ P to its nearest vector in Q. This embodiment sets |P| to 10000, i.e. 10000 points are uniformly sampled on the Pareto front as reference points when the IGD index is calculated.
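The IGD definition above translates directly into a few lines of NumPy. The sketch below uses a three-point reference set purely for illustration; the embodiment uses 10000 reference points.

```python
import numpy as np

def igd(P, Q):
    """Mean distance from each reference point in P to its nearest point in Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Pairwise Euclidean distances, shape (|P|, |Q|).
    dists = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    return dists.min(axis=1).mean()

P_ref = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])   # toy reference front
Q_found = np.array([[0.0, 1.0], [1.0, 0.0]])             # toy obtained set
value = igd(P_ref, Q_found)
```

A smaller value means the obtained set is both closer to and better spread along the reference front.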

In addition, in this embodiment the performance difference between CDNNEA and the comparison algorithms is tested statistically with the Wilcoxon rank sum test at a significance level of 0.05. The symbols '+', '-' and '=' indicate that CDNNEA performs significantly better than, significantly worse than, and not significantly differently from the comparison algorithm, respectively.

(5) Feasibility of the C-dropout proxy model:

FIG. 4 and FIG. 5 show the average IGD values obtained on the DTLZ and WFG problem sets when the PeEA algorithm is assisted by the three proxy approaches Kriging, Dropout and C-dropout. In the experiment, the decision variable dimension d (20, 40, 60, 100) is varied with the number of objectives fixed at m = 3 to check the scalability of C-dropout in a high-dimensional decision space. The results show that, among the 64 test instances, CDNNEA outperforms the other two approaches on 40 instances and is near-optimal on 10 instances; on the remaining 14 instances its performance is slightly worse, because the model management part may miss some solutions with better real evaluations owing to the trade-off between the diversity and the convergence of the solutions. Overall, C-dropout performs better as the proxy model, and compared with the Kriging proxy model CDNNEA obtains better or similar IGD values on most test problems. This further demonstrates the applicability of C-dropout to high-dimensional MOPs and verifies the feasibility of using C-dropout as the proxy model in this embodiment.

(6) Comparison experimental results of CDNNEA and advanced algorithm:

The algorithms compared with CDNNEA include two evolutionary algorithms without proxy assistance (ARMOEA, PeEA) and four representative proxy-assisted evolutionary algorithms (EDN-ARMOEA, ParEGO, K-RVEA, MOEA/D-EGO). The comparison focuses on how the decision space dimension and the number of objectives affect the algorithms, so two groups of experiments are carried out on the DTLZ test set: one with the number of objectives fixed at m = 3 and the decision variable dimension d varied (20, 40, 60, 100), and one with the decision variable dimension fixed at d = 40 and the number of objectives m varied (3, 5, 10, 20). ParEGO is designed only for multi-objective problems with no more than 4 objectives, so it is compared only on the 3-objective problems. The average IGD results obtained by the seven algorithms are shown in FIG. 6 and FIG. 7. Across the 46 test problems, CDNNEA achieves the best performance on 23 and near-optimal performance on 15, and these cases are the ones with higher decision variable dimensions or larger numbers of objectives, which indicates that CDNNEA solves problems in a high-dimensional space better than the other algorithms.

5.2 Bearing failure dataset application instances

To evaluate the effectiveness and accuracy of the zero sample bearing fault diagnosis method based on high-dimensional parameter multi-objective efficient optimization of a depth model, a rolling bearing is taken as the test object and the bearing data of Case Western Reserve University (CWRU) in the United States are used.

The CWRU data set is provided by the Case Western Reserve University Bearing Data Center and is widely used for rolling bearing fault diagnosis. The test bed mainly comprises a 2 horsepower motor, bearing accelerometers, a torque sensor and a dynamometer. The test bearings at the drive end and the fan end are of types 6205-2RS JEM SKF and 6203 respectively. Single-point faults are introduced on the rolling elements, inner race and outer race of the bearing by electrical discharge machining, with fault sizes of 7 mils, 14 mils and 21 mils. An acceleration sensor is installed on the bearing housing at the motor drive end and at the fan end to collect the vibration acceleration signals of the faulty bearing; the vibration signals are collected by a 16-channel data recorder at a sampling frequency of 12 kHz.

In this example, using the sensors from the different sources on the drive end and fan end bearing housings, 10 types of faults of different severity on the rolling elements, inner raceway and outer raceway of the drive end rolling bearing are studied. The sample set for the 10 types of faults is generated with a window of 1024 time points and a window step of 1000 time points; each state has 1200 samples, for 12000 samples in total. A detailed description of the experimental data is shown in FIG. 8.
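The sample generation described above amounts to a sliding window over each vibration record. A minimal sketch, using an integer ramp as a stand-in for a real record and assuming the record is at least one window long:

```python
import numpy as np

def segment(signal, window=1024, step=1000):
    """Cut a 1-D record into overlapping windows of `window` points every `step` points."""
    signal = np.asarray(signal)
    n_windows = (len(signal) - window) // step + 1
    return np.stack([signal[i * step : i * step + window] for i in range(n_windows)])

record = np.arange(5024)        # stand-in for a vibration record
samples = segment(record)       # one row per sample
```

With a step of 1000 and a window of 1024, consecutive samples overlap by 24 points.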

(1) The reliability of the CDNNEA strategy optimization model component is verified:

First, the CWRU dataset is divided into a training set and a test set in a 6:1 ratio, and the visual feature extraction model shown in table 1 is preset. The decision variable dimension of the CDNNEA algorithm, i.e. the parameters to be optimized in the network model, is then determined. For the CWRU dataset the visual feature extraction model contains 16 parameters in total: the numbers of convolution kernels of the three layers (channel_1, channel_2, channel_3), the convolution kernel sizes of the three layers (c_s_1, c_s_2, c_s_3), the three activation functions in the convolution layers (act_1, act_2, act_3), the pooling modes of the three layers (pool_1, pool_2, pool_3), the number of nodes of the first fully connected layer (f_1), the gradient descent function (Op), the learning rate (l_rate) and the batch size (Batch); the range of each parameter is set as shown in FIG. 9. All parameters other than the activation function, pooling mode, learning rate and gradient descent function are integer-valued and are therefore rounded at the first decimal place. FIG. 10 lists the decision variable values of three solutions with different preferences found for the visual feature extraction model after the last iteration of the algorithm. The item A solution network has the shortest test time, the item B solution network has the smallest amount of computation, and the item C solution model has the lowest test error. The four comparison algorithms ARMOEA, PeEA, EDN-ARMOEA and K-RVEA are also applied to the component optimization of the visual feature extraction model, and the performance comparison is shown in FIG. 11. With the maximum number of evaluations set to 216, the network architecture found by the CDNNEA algorithm has higher recognition accuracy while keeping the computation time low.
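The rounding of the integer-valued decision variables can be sketched as follows. The ordering of the 16 entries and the example vector are assumptions made for illustration; the encoding of the activation function, pooling mode and gradient descent function is not described in the text, so those entries are passed through unchanged.

```python
import math

# Assumed ordering of the 16 decision variables named in the text.
ORDER = ["channel_1", "channel_2", "channel_3", "c_s_1", "c_s_2", "c_s_3",
         "act_1", "act_2", "act_3", "pool_1", "pool_2", "pool_3",
         "f_1", "Op", "l_rate", "Batch"]
# Parameters that are rounded; the remaining ones are left as they are.
INT_PARAMS = ["channel_1", "channel_2", "channel_3",
              "c_s_1", "c_s_2", "c_s_3", "f_1", "Batch"]

def decode(vector):
    """Map a real-valued decision vector to named settings, rounding the integer ones."""
    if len(vector) != len(ORDER):
        raise ValueError("expected 16 decision variables")
    config = dict(zip(ORDER, vector))
    for name in INT_PARAMS:
        # Round half up at the first decimal place.
        config[name] = int(math.floor(config[name] + 0.5))
    return config

example = [31.6, 64.2, 127.5, 3.4, 4.5, 2.6, 0, 1, 2, 0, 1, 0, 99.7, 1, 0.003, 31.5]
config = decode(example)
```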

(2) Zero sample bearing diagnostic method performance verification

To verify the validity of the zero sample method, the output of the penultimate fully connected layer of the item C solution model constructed by the CDNNEA algorithm is first extracted and used as the visual feature X in the zero sample learning model. The semantic features S come from the semantic attribute relation matrix built from the statistical features. The mapping matrix W is solved from the visual features X_Y of the visible class data in the training set together with the visible class semantic features S_Y in the semantic space. The visual features X_Z of the unseen data in the test set are then mapped by W to semantic vectors, which are compared with the original unseen class semantic feature matrix S_Z by cosine similarity to obtain the classification result. In the verification, the classes with Class Label 1, 2, 3, 4 and 5 in the CWRU dataset are set as visible classes and the remaining 5 classes are set as unseen classes. FIG. 12 shows the recognition accuracy obtained on the unseen classes by the model trained on the visible classes under these settings; the classification accuracy for the 5 unseen classes of the CWRU dataset is 82.43%. The analysis shows that, with the method of this embodiment, such samples are recognized effectively even though no unseen class sample is used during training.

The scope of the present invention is not limited to the above embodiments, and various modifications and alterations of the present invention will become apparent to those skilled in the art, and any modifications, improvements and equivalents within the spirit and principle of the present invention are intended to be included in the scope of the present invention.

Claims

1. A zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model is characterized in that: the method comprises the following steps:

In formula (1), the sequence variable is the time series after conversion to polar coordinates, and I is the unit row vector [1, …, 1]; the conversion encodes the time series in a polar coordinate system and exposes the different information granularities of the samples; each element of the Gram matrix is the trigonometric function value of an angle, and the Gramian angular summation field is constructed from the trigonometric function of the sum of angles;

① generate the initial solutions: X = LatinHypercube(11d-1), Y = f(X);

② FE = 11d-1;

③ start the algorithm iteration: WHILE FE ≤ MaxFE DO;

⑦ X′ = X ∪ X₂, Y′ = Y ∪ Y₂;

⑧ update the training data set: (X, Y) = Update(X′, Y′, 11d-1, σ);

⑨ FE = FE + 1, ρ₁ = ρ₂;

⑩ END WHILE;

3) constructing a semantic matrix with the assistance of the statistical characteristics of the sequence signal: characteristic quantities that are appropriate in number and conform to the nature of the defect signal have an important influence on defect identification; to find semantic vectors that reflect the nature of the defect signal and are optimal in number, 24 time domain or frequency domain features are selected as the statistical feature semantic characterization model of bearing faults: root mean square value, square root amplitude, absolute average amplitude, standard deviation, maximum value, minimum value, peak-to-peak value, kurtosis, skewness, eighth-order moment coefficient, sixteenth-order moment coefficient, waveform index, peak index, pulse index, margin index, kurtosis index, skewness index, mean square spectrum, spectrum center of gravity, frequency domain variance, correlation factor, harmonic factor and spectral origin moment;

4) constructing a visual-semantic self-encoding zero sample mapping strategy with outlier interpolation: a constraint of specific semantic information is added to the mapping layer to constrain the reconstruction, which gives supervised learning of the projection function; semantic attribute descriptions or word vectors serve as the transferred knowledge, the information of the hidden layer is set to the semantic attributes of the samples, the encoder of an autoencoder maps the visual features into the semantic space, and the decoder reconstructs the original visual features.

2. The zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model according to claim 1, characterized in that: in step ④ of step 2), the proxy model is trained with the training data set; the proxy is a scalable constrained Dropout neural network (C-dropout), in which, starting from the differences between the output distributions of different sub-models, a sample filling mechanism and a loss constraint term are added to improve the credibility of the proxy model when solving multi-objective problems; for a training set T = {(X_i, Y_i)} (i = 1, 2, …, d) composed of samples with batch size d, where d equals the number of decision variables, the original purpose of back-propagation is to minimize the mean square error function of formula (2); in the C-dropout process, the training data X_i of each batch are copied and stacked with the originals as new input samples, which simulates two forward passes of the same data and yields two distributions of the model predictions, P₁ = P(X_i | Y_i) and P₂ = P′(X′_i | Y′_i); because the Dropout network structure changes dynamically, stacking new samples in this way also augments, to some extent, the scarce truly evaluated samples of an expensive optimization problem; the error loss to be minimized then becomes formula (3), and the problem of reducing sub-model variability becomes that of constraining the output distributions P₁ and P₂; because the Szechwan correlation coefficient measures the degree of correlation between variables effectively, it is used to measure the inconsistency of the two outputs, which gives the constraint term of formula (4); this term is combined with the l₂ term to form the final training loss function of formula (5), which reduces the degrees of freedom of the parameters in the original network space; after the error is obtained, the model back-propagates for the set number of iterations, updating the weights and biases using the gradients and the chain rule, which completes the training stage;

3. The zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model according to claim 2, characterized in that: in step ⑤ of step 2), the non-dominated solution set of the population is searched; the PeEA algorithm, which performs well on high-dimensional multi-objective problems, searches the optimal solution set on the proxy model; by evaluating the curvature of the Pareto front and the similarity between solutions in the high-dimensional space, it finds well-performing solutions even when characteristics such as the shape and continuity of the Pareto front are unknown, and the resulting candidate solution set supplies the individuals that need real evaluation to guide the update of the proxy model;

where q is a positive parameter representing the curvature of the front; for 0 < q < 1, q = 1 and q > 1 the front is concave, linear and convex respectively; to determine q, PeEA first normalizes the objectives using the minimum point and the extreme point on each objective, so as to provide a hyperplane equidistant from the objective axes as the base plane; when the estimated front shape is concave or linear, population convergence is measured in a linear scaling form; when the estimated front shape is convex, population convergence is measured in a Chebyshev distance form.

4. The zero sample bearing fault diagnosis method based on high-dimensional parameter multi-target efficient optimization of a depth model according to claim 3, characterized in that: in the application of CDNNEA to the visual feature extraction model in step 2), the test time is the total time required to recognize the test set, the amount of computation is measured by the model's floating point operations per second (FLOPS), and the test error is calculated by formula (7):

5. The depth model-based high-dimensional parameter multi-objective efficient optimizing zero-sample bearing fault diagnosis method according to claim 1, wherein the method is characterized by comprising the following steps of: the time domain or frequency domain features of the statistical feature semantic characterization model used as the bearing faults in the step 3) are as follows:

Root mean square value; square root amplitude; absolute average amplitude; standard deviation; maximum value X_max = max|x_i|; minimum value X_min = min|x_i|;

Peak-to-peak value V_pp = max|x_i| - min|x_i|; kurtosis; kurtosis; skewness; eighth-order moment coefficient; sixteenth-order moment coefficient; waveform index; peak index; pulse index; margin index; kurtosis index; skewness index; mean square spectrum; spectrum center of gravity; frequency domain variance; correlation factor; harmonic factor; spectral origin moment.

6. The depth model-based high-dimensional parameter multi-objective efficient optimizing zero-sample bearing fault diagnosis method according to claim 1, wherein the method is characterized by comprising the following steps of: in step 4), the objective function of constructing the zero sample learning model is as follows:

where ‖·‖_F is the Frobenius norm; the first term, ‖X - W^T S‖²_F, is the zero sample feature learning term; the second term, ‖WX - S‖²_F, is the visual-semantic constraint term used to constrain the projection matrix W; and λ is a hyperparameter that balances the two terms; taking the derivative of the objective and simplifying with the properties of the matrix trace gives:

$$-2SX^{T} + 2SS^{T}W + 2\lambda WXX^{T} - 2\lambda SX^{T} \tag{10}$$

Setting formula (10) equal to 0 gives:

$$SS^{T}W + \lambda WXX^{T} = SX^{T} + \lambda SX^{T} \tag{11}$$

Let A = SS^T, B = λXX^T and C = (1+λ)SX^T; the above formula can then be written as:

$$AW + WB = C \tag{12}$$