CN116028891B - Industrial anomaly detection model training method and device based on multi-model fusion - Google Patents

Industrial anomaly detection model training method and device based on multi-model fusion Download PDF

Info

Publication number
CN116028891B
CN116028891B CN202310123067.3A CN202310123067A CN116028891B CN 116028891 B CN116028891 B CN 116028891B CN 202310123067 A CN202310123067 A CN 202310123067A CN 116028891 B CN116028891 B CN 116028891B
Authority
CN
China
Prior art keywords
teacher
model
tensor
layer
student
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310123067.3A
Other languages
Chinese (zh)
Other versions
CN116028891A (en
Inventor
刘通
郏维强
王玉柱
韩松岭
张梦璘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202310123067.3A priority Critical patent/CN116028891B/en
Publication of CN116028891A publication Critical patent/CN116028891A/en
Application granted granted Critical
Publication of CN116028891B publication Critical patent/CN116028891B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an industrial anomaly detection model training method and device based on multi-model fusion, wherein the method comprises the following steps: step one, preprocessing after acquiring sensor data; step two, respectively inputting the sensor characteristic tensor obtained by pretreatment into a plurality of teacher models and student models, and obtaining the characteristics of each network layer in the models; mapping the middle layer tensor in the characteristics into a public space tensor; step four, carrying out weighted average on public space tensors of all teacher models to obtain teacher weighted tensors corresponding to the public space tensors of students, and transversely splicing task layer vectors of all the teacher models into a teacher task layer splicing vector; step five, obtaining distillation loss, task loss and prediction loss of the model, and obtaining total loss by weighting and summing; and step six, repeating the steps, minimizing total loss, updating the parameters of the neural network of the student model until convergence, finally fixing the parameters of the neural network of the student model, obtaining a target model, and finishing training.

Description

Industrial anomaly detection model training method and device based on multi-model fusion
Technical Field
The invention relates to the field of industrial equipment anomaly detection, in particular to an industrial anomaly detection model training method and device based on multi-model fusion.
Background
In the industrial field, correctly identifying the type of equipment anomaly helps the operation and maintenance personnel to lock the problem more quickly, so as to take corresponding measures in time. With the widespread use of industrial sensors, a large number of monitoring data for critical devices can be collected. The data-driven abnormality detection method has been developed, and by monitoring sensor data in real time, it is possible to dynamically identify whether an abnormality occurs in the apparatus, and to identify the type of the abnormality.
The deep learning neural network-based industrial abnormality detection method is gaining attention, and the deep learning-based industrial abnormality detection method has the following advantages: 1. the dependence on the characteristic engineering is less, and the end-to-end training can be realized; 2. the model has flexible structure and strong fitting capability, and can extract complex modes in data; however, the deep learning method has higher requirements on the labeled data set, and a larger amount of labeled data is often required to achieve a better prediction effect.
In the field of industrial anomaly detection, the difficulty of data annotation is high, and marked data is generally difficult to obtain; in addition, industrial data relates to data security and business confidentiality, equipment operation data of different factories and departments cannot be shared, and original data is difficult to obtain; in addition, the industrial equipment structure and the operation environment are complex, and it is difficult to grasp all abnormal types at first; there is therefore a need for an iterative model to take into account the newly discovered and defined anomaly types.
Typically, multiple models are trained for the same model of device, different factories, or the same factory over different historic periods; the prediction effect can be effectively improved by utilizing the existing model, wherein the effect of the integrated model can be improved by integrating a plurality of sub-models through traditional integrated learning; however, the ensemble learning method has the following problems: 1. all the submodels need to participate in calculation, and when the number of the submodels is large, the calculation pressure is obviously increased; 2. all sub-models are generally required to classify several identical categories, whereas in the field of industrial anomaly detection new anomaly types often occur, the anomaly categories supported by the different period models differ.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides an industrial anomaly detection model training method and device based on multi-model fusion, and the specific technical scheme is as follows:
an industrial anomaly detection model training method based on multi-model fusion comprises the following steps:
step one, preprocessing after acquiring sensor data;
step two, respectively inputting the sensor characteristic tensor obtained by preprocessing into a plurality of teacher models and student models, and obtaining the characteristics output by each network layer in the models, wherein the characteristics comprise middle layer tensors and task layer vectors;
step three, mapping the middle layer tensor of the teacher model and the middle layer tensor of the student model into a teacher public space tensor and a student public space tensor respectively;
step four, obtaining and averaging all the teacher public space tensors in a weighted manner according to the attention coefficient of each teacher public space tensor to obtain a teacher weighted tensor corresponding to the student public space tensor, and transversely splicing all the teacher model task layer vectors into one-dimensional teacher task layer splicing vector;
step five, comparing the public space tensor of the students with the corresponding teacher weighting tensor to obtain distillation loss; comparing the task layer vector of the student model with the teacher task layer splicing vector to obtain task loss; comparing the label marked by the data set with the task layer vector of the student model to obtain a prediction loss; obtaining a total loss based on the distillation loss, the mission loss and the predicted loss;
and step six, repeating the step one to the step five, minimizing the total loss, updating the neural network parameters of the student model until the neural network parameters of the student model are converged and fixed, obtaining a target model, and finishing training.
Further, the first step specifically comprises: converting sensor data into sensor feature tensors using a single-layer LSTM network
Figure SMS_1
Wherein->
Figure SMS_2
Is the time window size of the sensor data, i.e. the data length,/->
Figure SMS_3
Hidden layer dimensions for sensor feature tensors.
Further, the second step specifically includes the following substeps:
s21, tensor of sensor characteristics
Figure SMS_5
Respectively input pre-trained +.>
Figure SMS_7
A teacher model for->
Figure SMS_9
A plurality of teacher models, each model having +.>
Figure SMS_6
A plurality of intermediate layers; for->
Figure SMS_8
No. H of the personal model>
Figure SMS_10
Intermediate layers, the output intermediate layer tensor is
Figure SMS_11
The method comprises the steps of carrying out a first treatment on the surface of the Calculating to obtain the total->
Figure SMS_4
Middle layer tensors of the individual teacher models;
s22, tensor of sensor characteristics
Figure SMS_12
Inputting a student model, for the +.>
Figure SMS_13
Intermediate layer, calculated as->
Figure SMS_14
Intermediate layer tensor of student model of layer->
Figure SMS_15
S23, for the first
Figure SMS_16
The final layer of each teacher model is used for calculating and obtaining teacher task vector +.>
Figure SMS_17
The method comprises the steps of carrying out a first treatment on the surface of the For the final layer of the student model, the student task vector is calculated>
Figure SMS_18
The method comprises the steps of carrying out a first treatment on the surface of the The student task vector->
Figure SMS_19
Is equal to the sum of all teacher model task vector dimensions plus the number of categories that are newly present in the dataset.
Further, the third step specifically includes the following substeps:
s31, converting the middle layer tensor of the teacher model into a teacher public space tensor with the same dimension, for the first
Figure SMS_20
No. H of the teacher model>
Figure SMS_21
Individual layers with corresponding tensor of teacher's public space>
Figure SMS_22
Wherein->
Figure SMS_23
Representing a nonlinear transformation implemented by a convolutional neural network layer;
s32, if nonlinear transformation
Figure SMS_24
Parameter->
Figure SMS_25
Is a solidIf yes, calculate to get the total->
Figure SMS_26
Personal teacher public space tensor->
Figure SMS_27
The method comprises the steps of carrying out a first treatment on the surface of the Otherwise, update the nonlinear transformation by step S33, step S34->
Figure SMS_28
Parameter->
Figure SMS_29
S33, regarding the first
Figure SMS_30
No. H of the teacher model>
Figure SMS_31
Teacher public space tensor corresponding to each layer>
Figure SMS_32
By nonlinear transformation->
Figure SMS_33
Mapping it to middle layer tensor +.>
Figure SMS_34
Teacher middle layer reconstruction tensor with same dimension>
Figure SMS_35
S34, comparing the intermediate layer tensor of the teacher model
Figure SMS_36
Reconstructing tensor with teacher interlayer>
Figure SMS_37
Calculate reconstruction error +.>
Figure SMS_38
Figure SMS_39
Wherein,,
Figure SMS_40
reconstructing a loss function; minimizing +.>
Figure SMS_41
Update nonlinear transformation->
Figure SMS_42
Parameter of->
Figure SMS_43
Until reconstruction error->
Figure SMS_44
Less than threshold->
Figure SMS_45
Or the iteration step number is satisfied, the parameter is fixed +.>
Figure SMS_46
S35, converting the middle layer tensor of the student model into a student public space tensor with the same dimension, for the first
Figure SMS_47
Individual layers (I)>
Figure SMS_48
The method comprises the steps of carrying out a first treatment on the surface of the Wherein->
Figure SMS_49
Is a nonlinear transformation realized by a neural network layer, and the parameters are +.>
Figure SMS_50
The method comprises the steps of carrying out a first treatment on the surface of the The dimensions of the student public space tensors are the same as those of the teacher public space tensor in the step S31.
Further, the step four specifically includes the following substeps:
s41, based on the first
Figure SMS_51
Layered student public space tensor->
Figure SMS_52
First->
Figure SMS_53
Personal teacher model->
Figure SMS_54
Layer teacher public space tensor
Figure SMS_55
Through the attention mechanism, the attention coefficient of the tensor of the teacher public space is obtained>
Figure SMS_56
The expression is:
Figure SMS_57
s42, according to the attention coefficient
Figure SMS_58
Weighted average of all teacher public space tensors to obtain +.>
Figure SMS_59
Layer-corresponding teacher-weighted tensor->
Figure SMS_60
The expression is:
Figure SMS_61
s43, splicing all task layer vectors of the teacher model into one-dimensional teacher task layer splicing vector
Figure SMS_62
Wherein if +_appears in the annotation dataset>
Figure SMS_63
New abnormal categories are spliced on the teacher task layer splicing vector with a length of +.>
Figure SMS_64
To obtain a new teacher task layer splicing vector, and the expression is:
Figure SMS_65
wherein,,
Figure SMS_66
for vector concatenation operations, ++>
Figure SMS_67
For length +.>
Figure SMS_68
Is zero vector,/->
Figure SMS_69
The number of anomaly categories that are newly present in the dataset.
Further, the fifth step specifically includes the following substeps:
s51, comparing the student public space tensor of the kth layer of the student model
Figure SMS_70
The kth teacher-weighted tensor corresponding thereto>
Figure SMS_71
Obtaining distillation loss->
Figure SMS_72
The expression is:
wherein,,
Figure SMS_74
as a loss function;
wherein,,
Figure SMS_75
for the number of intermediate layers of the student model, +.>
Figure SMS_76
As a mean square error loss function, the expression is:
Figure SMS_77
s52, comparing the task layer vectors of the student model, namely the output vectors
Figure SMS_78
And the corresponding teacher task layer splicing vector +.>
Figure SMS_79
Obtaining a soft target loss function>
Figure SMS_80
The expression is:
Figure SMS_81
s53, comparing the vector output by the student model with a small quantity of marked data sets
Figure SMS_82
One-hot representation corresponding to labeling correct category +.>
Figure SMS_83
Obtaining predicted loss->
Figure SMS_84
The expression is:
Figure SMS_85
wherein,,
Figure SMS_86
、/>
Figure SMS_87
respectively correct category of independent heat indicates +.>
Figure SMS_88
Bit-corresponding value, student model task vector +.>
Figure SMS_89
Probability value of bit prediction->
Figure SMS_90
For the total number of categories, its value and student model task vector +.>
Figure SMS_91
Is uniform in length;
s54, distillation loss
Figure SMS_92
Loss of->
Figure SMS_93
、/>
Figure SMS_94
Weighted summation to obtain the final total loss
Figure SMS_95
I.e. the total loss function expression is:
Figure SMS_96
wherein,,
Figure SMS_97
、/>
Figure SMS_98
is a super parameter.
Further, the sixth step specifically includes the following substeps:
s61, repeating the first step to the fifth step, and minimizing the loss function by using a gradient descent algorithm
Figure SMS_99
Neural network parameters for student model +.>
Figure SMS_100
Nonlinear transformation neural network parameters->
Figure SMS_101
Updating;
s62, when model loss function
Figure SMS_102
Lowering to threshold->
Figure SMS_103
Or after the iteration reaches the preset times, finishing training; fixing the student model parameters->
Figure SMS_104
The student model is the target model.
An industrial anomaly detection model training device based on multi-model fusion comprises one or more processors, and is used for realizing the industrial anomaly detection model training method based on multi-model fusion.
A computer readable storage medium having stored thereon a program which, when executed by a processor, implements the method for training an industrial anomaly detection model based on multi-model fusion.
Compared with other methods, the method has the following advantages:
1. by adopting a multi-model fusion method, information of a plurality of teacher models is fused, and the existing models can be reused to obtain models capable of identifying more abnormal categories;
2. the training can be completed through a large amount of non-labeling data under the condition of not contacting with training data of a teacher model, and the dependence on an industrial labeling data set during training of an industrial anomaly detection model is reduced;
3. compared with the traditional integrated learning method, all teacher models are not required to participate in calculation during prediction, and once training is completed, only a single student model is required to be used for prediction, so that the consumption of calculation resources can be reduced;
4. the target model can realize the identification and prediction of the new abnormal type through a small amount of marked new abnormal type training data.
Drawings
FIG. 1 is a schematic flow chart of an industrial anomaly detection model training method based on multi-model fusion;
fig. 2 is a schematic structural diagram of an industrial anomaly detection model training device based on multi-model fusion according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, the present invention will be further described in detail with reference to the drawings and examples of the specification.
In this embodiment, the detection of the industrial abnormal signal needs to correctly identify the type of the equipment fault according to the sensor signal, and in this scenario, the first teacher model may identify 6 different types of abnormalities, and the other type is "normal state" (7 types in total); the second teacher model can identify 7 different types of anomalies, another is "normal state" (8 categories total); the existing dataset contains an abnormal type which does not appear in multiple types, and the problems of the scene can be abstracted into multi-classification problems; the data sets of the different periods are not already available; the data currently available are large amounts of unlabeled sensor data collected recently, and small amounts of manually labeled data.
Based on the above examples, as shown in fig. 1, the method for training the industrial anomaly detection model based on multi-model fusion provided by the invention comprises the following steps:
step one, preprocessing is performed after sensor data are acquired.
Wherein, the sensor data specifically is: assuming that there is
Figure SMS_105
A plurality of sensors selected to +.>
Figure SMS_106
The sensor data for the length and the time window size is a 2D time series matrix +.>
Figure SMS_107
Wherein each column is data over a time step, for each time step:
Figure SMS_108
Figure SMS_109
wherein each row of the array is data acquired by a single sensor within a time window, < >>
Figure SMS_110
Is->
Figure SMS_111
The individual sensors are at the moment->
Figure SMS_112
Is a reading of (2); likewise, for the sensor +.>
Figure SMS_113
The time sequence within the selected time window is:
Figure SMS_114
one embodiment of the invention is to use a singleThe LSTM network of layer is regarded as the data processing module of the sensor, carry on the preconditioning to the sensor data, namely: sensor data
Figure SMS_115
Inputting the sensor characteristic tensor into the LSTM network, and calculating to obtain the sensor characteristic tensor as
Figure SMS_116
Wherein->
Figure SMS_117
Is the dimension of the LSTM network layer output tensor hidden layer.
And step two, respectively inputting the sensor characteristic tensor obtained by preprocessing into a plurality of teacher models and student models, and obtaining the characteristics output by each network layer in the models, wherein the characteristics comprise middle layer tensors and task layer vectors.
Specifically, the sensor characteristic tensor
Figure SMS_118
Respectively inputting a plurality of teacher models and student models, and calculating middle layer tensors and task layer vectors of the teacher models and the student models; the teacher models are pre-trained models, and the neural network parameters of the models are fixed; the plurality of teacher models are consistent with the input form of the student models, wherein the input form is consistent with the characteristics, the format and the dimension of input data used by the models; the categories output by the teacher model and the student model among the plurality of teacher models may not be identical, but the same categories exist; the parameters of the student model need to be determined through subsequent steps in an iterative manner; the middle layer tensor refers to output results of all the neural network layers except the last layer in the model; the task layer vector refers to a class probability vector of the output of the last layer of the model.
In the embodiment of the invention, the plurality of teacher models are models with two different structures, and comprise a teacher model 1 and a teacher model 2, wherein the teacher model 1 consists of four convolutional neural network CNN layers and a full connection layer, and the output of the models is probability distribution of 7 categories; the teacher model 2 consists of two superimposed long-short-time memory LSTM layers and a full-connection layer, and the output of the model is probability distribution of 8 categories; the student model consists of three Self-attention layers and a full-connection layer; a new data set has a new type of anomaly; the output of the student model is a probability distribution of 16 categories, where 16 categories are the categories in the two teacher models that are not deduplicated, plus the total number of categories for the new category in the dataset.
The second step specifically comprises the following substeps:
s21, tensor of sensor characteristics
Figure SMS_119
Respectively input pre-trained +.>
Figure SMS_122
A teacher model for->
Figure SMS_124
A plurality of teacher models, each model having +.>
Figure SMS_120
A plurality of intermediate layers; for->
Figure SMS_123
No. H of the personal model>
Figure SMS_125
Intermediate layers, the output intermediate layer tensor is
Figure SMS_126
The method comprises the steps of carrying out a first treatment on the surface of the Calculating to obtain the total->
Figure SMS_121
Middle layer tensors of the individual teacher models;
in the embodiment of the invention, for four CNN layers of the teacher model 1, the input sensor characteristic tensor
Figure SMS_127
Calculating to obtain correspondingThe middle layer tensors of the teacher model are respectively:
a first layer:
Figure SMS_128
a second layer:
Figure SMS_129
third layer:
Figure SMS_130
fourth layer:
Figure SMS_131
wherein,,
Figure SMS_132
、/>
Figure SMS_135
、/>
Figure SMS_138
、/>
Figure SMS_133
nonlinear transformation corresponding to four CNN layers in the teacher model 1 is respectively carried out; />
Figure SMS_137
、/>
Figure SMS_139
、/>
Figure SMS_140
Respectively teacher model 1->
Figure SMS_134
Layer intermediate layer tensor width, height,Dimension in depth direction, & gt>
Figure SMS_136
=(1,2,3,4);
For the two LSTM layers of teacher model 2, the tensor is characterized by the sensor
Figure SMS_141
The intermediate layer tensors of the corresponding teacher model are calculated as follows:
a first layer:
Figure SMS_142
a second layer:
Figure SMS_143
wherein,,
Figure SMS_144
、/>
Figure SMS_145
respectively corresponding nonlinear transformation of two LSTM layers in the teacher model 2;
Figure SMS_146
、/>
Figure SMS_147
respectively teacher model 2->
Figure SMS_148
Dimension in the width and height directions of the tensor of the layer intermediate layer.
S22, tensor of sensor characteristics
Figure SMS_149
Inputting a student model, for the +.>
Figure SMS_150
Intermediate layer, calculated as->
Figure SMS_151
Intermediate layer tensor of student model of layer->
Figure SMS_152
In this embodiment, for two Self-saturation layers of the student model, the tensor is characterized by the sensor
Figure SMS_153
The intermediate layer tensors of the corresponding student models are obtained through calculation respectively:
a first layer:
Figure SMS_154
a second layer:
Figure SMS_155
wherein,,
Figure SMS_156
、/>
Figure SMS_157
respectively carrying out nonlinear transformation corresponding to two Self-saturation layers in the student model; />
Figure SMS_158
、/>
Figure SMS_159
Respectively student models->
Figure SMS_160
Dimension in the width and height directions of the tensor of the layer intermediate layer.
S23, for the first
Figure SMS_161
The final layer of each teacher model is used for calculating and obtaining teacher task vector +.>
Figure SMS_162
The method comprises the steps of carrying out a first treatment on the surface of the For the final layer of the student model, the student task vector is calculated>
Figure SMS_163
The method comprises the steps of carrying out a first treatment on the surface of the The student task vector->
Figure SMS_164
The dimension of the task vector is equal to the sum of the dimensions of all teacher model tasks, and the newly-appearing category number in the data set is added;
in this embodiment, for the teacher model 1, the teacher model 2, and the student models, the corresponding task layer vectors are respectively:
Figure SMS_165
the dimension of the task layer vector of the student model is the sum of the dimensions of the task layer vectors of all teacher models, 15 dimensions are added, and the newly-appearing category number and 1 category in the data set are added.
Step three, mapping the middle layer tensor of the teacher model and the middle layer tensor of the student model into a teacher public space tensor and a student public space tensor respectively, wherein the dimensions of the teacher public space tensor and the student public space tensor are the same, and the method comprises the following substeps:
s31, converting the middle layer tensor of the teacher model into a teacher public space tensor with the same dimension, for the first
Figure SMS_166
No. H of the teacher model>
Figure SMS_167
Individual layers with corresponding tensor of teacher's public space>
Figure SMS_168
Wherein->
Figure SMS_169
Is a convolution of 1*1 convolution kernelA nonlinear transformation consisting of a neural network layer and a convolutional neural network layer of the teacher model;
taking the middle layer tensor of the second layer of the teacher model 1 as an example, the expression of the corresponding teacher public space tensor is:
Figure SMS_170
wherein,,
Figure SMS_171
、/>
Figure SMS_172
the convolution transformation of the second layer tensor and the convolution transformation of the 1*1 convolution kernel of the teacher model 1 are respectively; />
Figure SMS_173
、/>
Figure SMS_174
The corresponding dimension numbers of the public space tensor in the width direction and the height direction are respectively;
s32, if nonlinear transformation
Figure SMS_175
Parameter->
Figure SMS_176
Is fixed, the total ∈is calculated>
Figure SMS_177
Personal teacher public space tensor->
Figure SMS_178
The method comprises the steps of carrying out a first treatment on the surface of the In this embodiment, the public space tensors of all teachers are calculated as follows:
Figure SMS_179
wherein, for the transformation of the middle layer tensor of the teacher model into the teacher public space tensor
Figure SMS_180
If the neural network parameters of the teacher model +.>
Figure SMS_181
If not, obtaining the neural network parameters through the following steps of content training
Figure SMS_182
S33, regarding the first
Figure SMS_183
No. H of the teacher model>
Figure SMS_184
Teacher public space tensor corresponding to each layer>
Figure SMS_185
Transformation by nonlinear reconstruction->
Figure SMS_186
Mapping it to middle layer tensor +.>
Figure SMS_187
Teacher intermediate layer reconstruction tensor with same dimension
Figure SMS_188
The expression is:
Figure SMS_189
wherein the nonlinear reconstruction transformation
Figure SMS_190
Is composed of two layers of convolutional neural networks, +.>
Figure SMS_191
Figure SMS_192
Two convolution transforms in the nonlinear reconstruction transform respectively;
s34, comparing the intermediate layer tensor of the teacher model
Figure SMS_193
Reconstructing tensor with teacher interlayer>
Figure SMS_194
Calculate reconstruction error +.>
Figure SMS_195
The expression is:
Figure SMS_196
wherein,,
Figure SMS_197
reconstruction of the loss function by gradient descent method to minimize +.>
Figure SMS_198
Updating nonlinear transformation->
Figure SMS_199
Parameter of->
Figure SMS_200
Until reconstruction error->
Figure SMS_201
Less than threshold->
Figure SMS_202
Or the iteration step number is satisfied, the parameter is fixed +.>
Figure SMS_203
S35, converting the middle layer tensor of the student model into a student public space tensor with the same dimension; in an embodiment, the intermediate layer tensor of the student model
Figure SMS_204
、/>
Figure SMS_205
The method is converted into a student public space tensor, and the expression is as follows:
Figure SMS_206
Figure SMS_207
intermediate layer tensor of the student model>
Figure SMS_208
、/>
Figure SMS_209
The dimension of the transformed student public space tensor is consistent with that of the teacher public space tensor;
for the first
Figure SMS_210
Individual layers (I)>
Figure SMS_211
The method comprises the steps of carrying out a first treatment on the surface of the Wherein->
Figure SMS_212
The nonlinear transformation is realized by a neural network layer, taking the middle layer tensor of a second layer of a student model as an example:
Figure SMS_213
wherein,,
Figure SMS_214
、/>
Figure SMS_215
the convolution transformation is corresponding to the second layer tensor of the student model and the convolution transformation is corresponding to the 1*1 convolution kernel; />
Figure SMS_216
、/>
Figure SMS_217
The corresponding dimension of the public space tensor in the height direction is wide and is consistent with the dimension of the public space tensor of the teacher.
Step four, obtaining and averaging all the teacher public space tensors in a weighted manner according to the attention coefficient of each teacher public space tensor to obtain a teacher weighted tensor corresponding to the student public space tensor, and transversely splicing all the teacher model task layer vectors into a one-dimensional teacher task layer splicing vector, wherein the method specifically comprises the following substeps:
s41, based on the first
Figure SMS_218
Layered student public space tensor->
Figure SMS_219
First->
Figure SMS_220
Personal teacher model->
Figure SMS_221
Layer teacher public space tensor
Figure SMS_222
Through the attention mechanism, the attention coefficient of the tensor of the teacher public space is obtained>
Figure SMS_223
The expression is:
Figure SMS_224
s42, according to the attention coefficient
Figure SMS_225
Weighted average of all teacher public space tensors to obtain +.>
Figure SMS_226
Layer-corresponding teacher-weighted tensor->
Figure SMS_227
The expression is:
Figure SMS_228
s43, splicing all task layer vectors of the teacher model into one-dimensional teacher task layer splicing vector
Figure SMS_229
Wherein if +_appears in the annotation dataset>
Figure SMS_230
New abnormal categories are spliced on the teacher task layer splicing vector with a length of +.>
Figure SMS_231
To obtain a new teacher task layer splicing vector, and the expression is:
Figure SMS_232
wherein,,
Figure SMS_233
for vector concatenation operations, ++>
Figure SMS_234
For length +.>
Figure SMS_235
All zero vectors of (2), here, +.>
Figure SMS_236
The number of anomaly categories that are newly present in the dataset.
Step five, comparing the public space tensor of the students with the corresponding teacher weighting tensor to obtain distillation loss; comparing the task layer vector of the student model with the teacher task layer splicing vector to obtain task loss; comparing the label marked by the data set with the task layer vector of the student model to obtain the prediction loss for a small amount of marked data sets; based on the distillation loss, the task loss and the predicted loss, obtaining total loss, specifically comprising the following substeps:
s51, comparing the student public space tensor of the kth layer of the student model
Figure SMS_237
The kth teacher-weighted tensor corresponding thereto>
Figure SMS_238
Obtaining distillation loss->
Figure SMS_239
The expression is:
Figure SMS_240
wherein,,
Figure SMS_241
for the number of intermediate layers of the student model, +.>
Figure SMS_242
As a mean square error loss function, the expression is:
Figure SMS_243
s52, comparing the task layer vectors of the student model, namely the output vectors
Figure SMS_244
And the corresponding teacher task layer splicing vector +.>
Figure SMS_245
Obtaining a soft target loss function>
Figure SMS_246
The expression is:
Figure SMS_247
s53, comparing the vector output by the student model with a small quantity of marked data sets
Figure SMS_248
One-hot representation corresponding to labeling correct category +.>
Figure SMS_249
Obtaining predicted loss->
Figure SMS_250
The expression is:
Figure SMS_251
wherein,,
Figure SMS_252
、/>
Figure SMS_253
respectively correct category of independent heat indicates +.>
Figure SMS_254
Bit-corresponding value, student model task vector +.>
Figure SMS_255
Probability values for bit predictions;
s54, distillation loss
Figure SMS_256
Loss of->
Figure SMS_257
、/>
Figure SMS_258
Weighted summation to obtain the most significantTotal loss of end
Figure SMS_259
I.e. the total loss function expression is:
Figure SMS_260
wherein,,
Figure SMS_261
、/>
Figure SMS_262
is a superparameter, let ∈ ->
Figure SMS_263
、/>
Figure SMS_264
Step six, repeating the step one to the step five, minimizing the total loss, updating the neural network parameters of the student model until converging and fixing the neural network parameters of the student model, obtaining a target model, and completing training, wherein the method specifically comprises the following substeps:
s61, repeating the first step to the fifth step, and minimizing the loss function by using a gradient descent algorithm
Figure SMS_265
Neural network parameters for student model +.>
Figure SMS_266
Nonlinear transformation neural network parameters->
Figure SMS_267
Updating;
s62, when model loss function
Figure SMS_268
Lowering to threshold->
Figure SMS_269
Or stack ofAfter the times of generation reach a certain number of times (10000 steps), training is finished; fixing the student model parameters->
Figure SMS_270
The student model is the target model.
Corresponding to the embodiment of the industrial anomaly detection model training method based on multi-model fusion, the invention also provides an embodiment of the industrial anomaly detection model training device based on multi-model fusion.
Referring to fig. 2, an industrial anomaly detection model training device based on multi-model fusion according to an embodiment of the present invention includes one or more processors configured to implement the industrial anomaly detection model training method based on multi-model fusion in the above embodiment.
The embodiment of the industrial anomaly detection model training method based on multi-model fusion can be applied to any equipment with data processing capability, and the equipment with data processing capability can be equipment or a device such as a computer. The apparatus embodiments may be implemented by software, or may be implemented by hardware or a combination of hardware and software. Taking software implementation as an example, the device in a logic sense is formed by reading corresponding computer program instructions in a nonvolatile memory into a memory by a processor of any device with data processing capability. In terms of hardware, as shown in fig. 2, a hardware structure diagram of an apparatus with any data processing capability where the industrial anomaly detection model training device based on multi-model fusion is located is shown in fig. 2, and in addition to a processor, a memory, a network interface, and a nonvolatile memory shown in fig. 2, the apparatus with any data processing capability where the apparatus is located in an embodiment generally includes other hardware according to an actual function of the apparatus with any data processing capability, which is not described herein again.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The embodiment of the invention also provides a computer readable storage medium, wherein a program is stored on the computer readable storage medium, and when the program is executed by a processor, the industrial anomaly detection model training method based on multi-model fusion in the embodiment is realized.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any of the data processing enabled devices described in any of the previous embodiments. The computer readable storage medium may also be an external storage device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), or the like, provided on the device. Further, the computer readable storage medium may include both internal storage units and external storage devices of any data processing device. The computer readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing apparatus, and may also be used for temporarily storing data that has been output or is to be output.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the foregoing detailed description of the invention has been provided, it will be apparent to those skilled in the art that modifications may be made to the embodiments described in the foregoing examples, and that certain features may be substituted for those illustrated and described herein. Modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (5)

1. The industrial anomaly detection model training method based on multi-model fusion is characterized by comprising the following steps of:
step one, preprocessing after acquiring sensor data;
step two, respectively inputting the sensor characteristic tensor obtained by preprocessing into a plurality of teacher models and student models, and obtaining the characteristics output by each network layer in the models, wherein the characteristics comprise middle layer tensors and task layer vectors;
step three, mapping the middle layer tensor of the teacher model and the middle layer tensor of the student model into a teacher public space tensor and a student public space tensor respectively;
step four, obtaining and averaging all the teacher public space tensors in a weighted manner according to the attention coefficient of each teacher public space tensor to obtain a teacher weighted tensor corresponding to the student public space tensor, and transversely splicing all the teacher model task layer vectors into one-dimensional teacher task layer splicing vector;
step five, comparing the public space tensor of the students with the corresponding teacher weighting tensor to obtain distillation loss; comparing the task layer vector of the student model with the teacher task layer splicing vector to obtain task loss; comparing the label marked by the data set with the task layer vector of the student model to obtain a prediction loss; obtaining a total loss based on the distillation loss, the mission loss and the predicted loss;
step six, repeating the step one to the step five, minimizing the total loss, updating the neural network parameters of the student model until the neural network parameters of the student model are converged and fixed, obtaining a target model, and finishing training;
the first step is specifically as follows: converting sensor data into sensor feature tensors using a single-layer LSTM networkWherein->
Figure QLYQS_2
Is the time window size of the sensor data, i.e. the data length,/->
Figure QLYQS_3
Hidden layer dimensions that are sensor feature tensors;
the second step specifically comprises the following substeps:
s21, tensor of sensor characteristics
Figure QLYQS_5
Respectively input pre-trained +.>
Figure QLYQS_7
A teacher model for->
Figure QLYQS_9
A plurality of teacher models, each model having +.>
Figure QLYQS_6
A plurality of intermediate layers; for->No. H of the personal model>
Figure QLYQS_10
Intermediate layers whose output intermediate layer tensor is +.>
Figure QLYQS_11
The method comprises the steps of carrying out a first treatment on the surface of the Calculating to obtain the total->
Figure QLYQS_4
Middle layer tensors of the individual teacher models;
s22, tensor of sensor characteristics
Figure QLYQS_12
Inputting a student model, for the +.>
Figure QLYQS_13
Intermediate layer, calculated as->
Figure QLYQS_14
Intermediate layer tensor of student model of layer->
Figure QLYQS_15
S23, for the first
Figure QLYQS_16
The final layer of each teacher model is used for calculating and obtaining teacher task vector +.>
Figure QLYQS_17
The method comprises the steps of carrying out a first treatment on the surface of the For the final layer of the student model, the student task vector is calculated>
Figure QLYQS_18
The method comprises the steps of carrying out a first treatment on the surface of the The student task vector->
Figure QLYQS_19
The dimension of the task vector is equal to the sum of the dimensions of all teacher model tasks, and the newly-appearing category number in the data set is added;
the third step specifically comprises the following substeps:
s31, converting the middle layer tensor of the teacher model into a teacher public space tensor with the same dimension, for the first
Figure QLYQS_20
No. H of the teacher model>
Figure QLYQS_21
Individual layers with corresponding tensor of teacher's public space>
Figure QLYQS_22
Wherein->
Figure QLYQS_23
Representing a nonlinear transformation implemented by a convolutional neural network layer;
s32, if nonlinear transformation
Figure QLYQS_24
Parameter->
Figure QLYQS_25
Is fixed, the total ∈is calculated>
Figure QLYQS_26
Personal teacher public space tensor->
Figure QLYQS_27
The method comprises the steps of carrying out a first treatment on the surface of the Otherwise, update the nonlinear transformation by step S33, step S34->
Figure QLYQS_28
Parameter->
Figure QLYQS_29
S33, regarding the first
Figure QLYQS_30
No. H of the teacher model>
Figure QLYQS_31
Teacher public space tensor corresponding to each layer>
Figure QLYQS_32
By non-linear transformation
Figure QLYQS_33
Mapping it to middle layer tensor +.>
Figure QLYQS_34
Teacher middle layer reconstruction tensor with same dimension>
Figure QLYQS_35
S34, comparing the intermediate layer tensor of the teacher model
Figure QLYQS_36
Reconstructing tensor with teacher interlayer>
Figure QLYQS_37
Calculate reconstruction error +.>
Figure QLYQS_38
Figure QLYQS_39
Wherein,,
Figure QLYQS_40
reconstructing a loss function; minimizing +.>
Figure QLYQS_41
Update nonlinear transformation->
Figure QLYQS_42
Parameters of (2)
Figure QLYQS_43
Until reconstruction error->
Figure QLYQS_44
Less than threshold->
Figure QLYQS_45
Or the iteration step number is satisfied, the parameter is fixed +.>
Figure QLYQS_46
S35, converting the intermediate layer tensor of the student model into student public air with the same dimensionTensor of the first
Figure QLYQS_47
The number of layers of the composite material,
Figure QLYQS_48
the method comprises the steps of carrying out a first treatment on the surface of the Wherein->
Figure QLYQS_49
Is a nonlinear transformation realized by a neural network layer, and the parameters are +.>
Figure QLYQS_50
The method comprises the steps of carrying out a first treatment on the surface of the The student public space tensor with the same dimension is consistent with the teacher public space tensor in dimension in the step S31;
the fourth step comprises the following substeps:
s41, based on the first
Figure QLYQS_51
Layered student public space tensor->
Figure QLYQS_52
First->
Figure QLYQS_53
Personal teacher model->
Figure QLYQS_54
Teacher public space tensor of layer>
Figure QLYQS_55
Through the attention mechanism, the attention coefficient of the tensor of the teacher public space is obtained>
Figure QLYQS_56
The expression is:
Figure QLYQS_57
s42, according to the attention coefficient
Figure QLYQS_58
Weighted average of all teacher public space tensors to obtain +.>
Figure QLYQS_59
Layer-corresponding teacher-weighted tensor->
Figure QLYQS_60
The expression is:
Figure QLYQS_61
s43, splicing all task layer vectors of the teacher model into one-dimensional teacher task layer splicing vector
Figure QLYQS_62
Wherein if +_appears in the annotation dataset>
Figure QLYQS_63
New abnormal categories are spliced on the teacher task layer splicing vector with a length of +.>
Figure QLYQS_64
To obtain a new teacher task layer splicing vector, and the expression is:
Figure QLYQS_65
wherein,,
Figure QLYQS_66
for vector concatenation operations, ++>
Figure QLYQS_67
For length +.>
Figure QLYQS_68
Is zero vector,/->
Figure QLYQS_69
The number of anomaly categories that are newly present in the dataset.
2. The industrial anomaly detection model training method based on multi-model fusion according to claim 1, wherein the fifth step specifically comprises the following sub-steps:
s51, comparing the student public space tensor of the kth layer of the student model
Figure QLYQS_70
The kth teacher weighted tensor corresponding to the same
Figure QLYQS_71
Obtaining distillation loss->
Figure QLYQS_72
The expression is:
Figure QLYQS_73
wherein->
Figure QLYQS_74
As a loss function;
wherein,,
Figure QLYQS_75
for the number of intermediate layers of the student model, +.>
Figure QLYQS_76
As a mean square error loss function, the expression is:
Figure QLYQS_77
s52, comparing the task layer vectors of the student model, namely the output vectors
Figure QLYQS_78
And the corresponding teacher task layer splicing vector +.>
Figure QLYQS_79
Obtaining a soft target loss function>
Figure QLYQS_80
The expression is:
Figure QLYQS_81
s53, comparing the vector output by the student model with a small quantity of marked data sets
Figure QLYQS_82
One-hot representation corresponding to labeling correct category +.>
Figure QLYQS_83
Obtaining predicted loss->
Figure QLYQS_84
The expression is:
Figure QLYQS_85
wherein,,
Figure QLYQS_86
、/>
Figure QLYQS_87
respectively correct category of independent heat indicates +.>
Figure QLYQS_88
Bit-corresponding value, student model task vector +.>
Figure QLYQS_89
Probability value of bit prediction->
Figure QLYQS_90
For the total number of categories, its value and student model task vector +.>
Figure QLYQS_91
Is uniform in length;
s54, distillation loss
Figure QLYQS_92
Loss of->
Figure QLYQS_93
、/>
Figure QLYQS_94
Weighted summation, resulting in a final total loss +.>
Figure QLYQS_95
I.e. the total loss function expression is:
Figure QLYQS_96
wherein,,
Figure QLYQS_97
、/>
Figure QLYQS_98
is a super parameter.
3. The industrial anomaly detection model training method based on multi-model fusion according to claim 2, wherein the step six specifically comprises the following sub-steps:
s61, repeating the first step to the fifth step, and minimizing the loss function by using a gradient descent algorithm
Figure QLYQS_99
Neural network parameters for student model +.>
Figure QLYQS_100
Nonlinear transformation neural network parameters->
Figure QLYQS_101
Updating;
s62, when model loss function
Figure QLYQS_102
Lowering to threshold->
Figure QLYQS_103
Or after the iteration reaches the preset times, finishing training; fixing the student model parameters->
Figure QLYQS_104
The student model is the target model.
4. An industrial anomaly detection model training device based on multi-model fusion, comprising one or more processors configured to implement the industrial anomaly detection model training method based on multi-model fusion of any one of claims 1 to 3.
5. A computer-readable storage medium, having stored thereon a program which, when executed by a processor, implements an industrial anomaly detection model training method based on multi-model fusion as claimed in any one of claims 1 to 3.
CN202310123067.3A 2023-02-16 2023-02-16 Industrial anomaly detection model training method and device based on multi-model fusion Active CN116028891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310123067.3A CN116028891B (en) 2023-02-16 2023-02-16 Industrial anomaly detection model training method and device based on multi-model fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310123067.3A CN116028891B (en) 2023-02-16 2023-02-16 Industrial anomaly detection model training method and device based on multi-model fusion

Publications (2)

Publication Number Publication Date
CN116028891A CN116028891A (en) 2023-04-28
CN116028891B true CN116028891B (en) 2023-07-14

Family

ID=86091403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310123067.3A Active CN116028891B (en) 2023-02-16 2023-02-16 Industrial anomaly detection model training method and device based on multi-model fusion

Country Status (1)

Country Link
CN (1) CN116028891B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117668622B (en) * 2024-02-01 2024-05-10 山东能源数智云科技有限公司 Training method of equipment fault diagnosis model, fault diagnosis method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408570A (en) * 2021-05-08 2021-09-17 浙江智慧视频安防创新中心有限公司 Image category identification method and device based on model distillation, storage medium and terminal
CN114170478A (en) * 2021-12-09 2022-03-11 中山大学 Defect detection and positioning method and system based on cross-image local feature alignment
CN114240892A (en) * 2021-12-17 2022-03-25 华中科技大学 Unsupervised industrial image anomaly detection method and system based on knowledge distillation
CN115346207A (en) * 2022-08-03 2022-11-15 北京交通大学 Method for detecting three-dimensional target in two-dimensional image based on example structure correlation
CN115471645A (en) * 2022-11-15 2022-12-13 南京信息工程大学 Knowledge distillation anomaly detection method based on U-shaped student network
CN115526332A (en) * 2022-08-17 2022-12-27 阿里巴巴(中国)有限公司 Student model training method and text classification system based on pre-training language model

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162799B (en) * 2018-11-28 2023-08-04 腾讯科技(深圳)有限公司 Model training method, machine translation method, and related devices and equipment
US11487944B1 (en) * 2019-12-09 2022-11-01 Asapp, Inc. System, method, and computer program for obtaining a unified named entity recognition model with the collective predictive capabilities of teacher models with different tag sets using marginal distillation
CN111160409A (en) * 2019-12-11 2020-05-15 浙江大学 Heterogeneous neural network knowledge reorganization method based on common feature learning
CN113052768B (en) * 2019-12-27 2024-03-19 武汉Tcl集团工业研究院有限公司 Method, terminal and computer readable storage medium for processing image
CN111611377B (en) * 2020-04-22 2021-10-29 淮阴工学院 Knowledge distillation-based multi-layer neural network language model training method and device
US20220076136A1 (en) * 2020-09-09 2022-03-10 Peyman PASSBAN Method and system for training a neural network model using knowledge distillation
CN112116030B (en) * 2020-10-13 2022-08-30 浙江大学 Image classification method based on vector standardization and knowledge distillation
CN112418343B (en) * 2020-12-08 2024-01-05 中山大学 Multi-teacher self-adaptive combined student model training method
CN112801209B (en) * 2021-02-26 2022-10-25 同济大学 Image classification method based on dual-length teacher model knowledge fusion and storage medium
CN114241282B (en) * 2021-11-04 2024-01-26 河南工业大学 Knowledge distillation-based edge equipment scene recognition method and device
CN114067819B (en) * 2021-11-22 2024-06-21 南京工程学院 Speech enhancement method based on cross-layer similarity knowledge distillation
CN114936605A (en) * 2022-06-09 2022-08-23 五邑大学 Knowledge distillation-based neural network training method, device and storage medium
CN115481316A (en) * 2022-09-01 2022-12-16 贵州大学 Multi-model fusion knowledge distillation recommendation model
CN115690708A (en) * 2022-10-21 2023-02-03 苏州轻棹科技有限公司 Method and device for training three-dimensional target detection model based on cross-modal knowledge distillation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408570A (en) * 2021-05-08 2021-09-17 浙江智慧视频安防创新中心有限公司 Image category identification method and device based on model distillation, storage medium and terminal
CN114170478A (en) * 2021-12-09 2022-03-11 中山大学 Defect detection and positioning method and system based on cross-image local feature alignment
CN114240892A (en) * 2021-12-17 2022-03-25 华中科技大学 Unsupervised industrial image anomaly detection method and system based on knowledge distillation
CN115346207A (en) * 2022-08-03 2022-11-15 北京交通大学 Method for detecting three-dimensional target in two-dimensional image based on example structure correlation
CN115526332A (en) * 2022-08-17 2022-12-27 阿里巴巴(中国)有限公司 Student model training method and text classification system based on pre-training language model
CN115471645A (en) * 2022-11-15 2022-12-13 南京信息工程大学 Knowledge distillation anomaly detection method based on U-shaped student network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于组合神经网络的教师评价模型研究;刘彩红;唐万梅;;重庆师范大学学报(自然科学版)(第04期);全文 *

Also Published As

Publication number Publication date
CN116028891A (en) 2023-04-28

Similar Documents

Publication Publication Date Title
Tkachenko et al. Model and principles for the implementation of neural-like structures based on geometric data transformations
Oh et al. A tutorial on quantum convolutional neural networks (QCNN)
Jia et al. Quantum neural network states: A brief review of methods and applications
US9361586B2 (en) Method and system for invariant pattern recognition
Furukawa SOM of SOMs
Yuan et al. Quantum image edge detection algorithm
CN116028891B (en) Industrial anomaly detection model training method and device based on multi-model fusion
CN113821668A (en) Data classification identification method, device, equipment and readable storage medium
CN116206158A (en) Scene image classification method and system based on double hypergraph neural network
Lee et al. Application of domain-adaptive convolutional variational autoencoder for stress-state prediction
Egbo et al. Forecasting students’ enrollment using neural networks and ordinary least squares regression models
Rai Advanced deep learning with R: Become an expert at designing, building, and improving advanced neural network models using R
Chen et al. Total variation based tensor decomposition for multi‐dimensional data with time dimension
CN116128575A (en) Item recommendation method, device, computer apparatus, storage medium, and program product
Zhang et al. The Role of Knowledge Creation‐Oriented Convolutional Neural Network in Learning Interaction
Christiansen et al. Optimization of neural networks for time-domain simulation of mooring lines
JP7118882B2 (en) Variable transformation device, latent parameter learning device, latent parameter generation device, methods and programs thereof
CN116992937A (en) Neural network model restoration method and related equipment
CN113496119B (en) Method, electronic device and computer readable medium for extracting metadata in table
CN114511092A (en) Graph attention mechanism implementation method based on quantum circuit
JP7047665B2 (en) Learning equipment, learning methods and learning programs
CN112232261A (en) Method and device for fusing image sequences
CN114998990B (en) Method and device for identifying safety behaviors of personnel on construction site
Sundararaghavan et al. Methodology for estimation of intrinsic dimensions and state variables of microstructures
Zhang et al. Calibrated multivariate regression networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant