CN116415201A - Ship main power abnormality detection method based on deep concentric learning

Info

Publication number
CN116415201A
Authority
CN
China
Prior art keywords: concentric, learning, sample, abnormal, samples
Legal status: Granted
Application number
CN202310667541.9A
Other languages: Chinese (zh)
Other versions: CN116415201B (en)
Inventor
Zhong Baihong (钟百鸿)
Zhao Minghang (赵明航)
Zhong Shisheng (钟诗胜)
Fu Xuyun (付旭云)
Zhang Yongjian (张永健)
Current Assignee
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Application filed by Harbin Institute of Technology Weihai
Priority to CN202310667541.9A
Publication of CN116415201A
Application granted
Publication of CN116415201B
Status: Active

Classifications

    • G06F18/2433 - Pattern recognition; classification techniques; single-class perspective, e.g. one-against-all classification; novelty detection; outlier detection
    • G06F18/10 - Pattern recognition; pre-processing; data cleansing
    • G06F18/213 - Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/217 - Validation; performance evaluation; active pattern learning techniques
    • G06N3/0455 - Auto-encoder networks; encoder-decoder networks
    • G06N3/09 - Supervised learning
    • Y02T90/00 - Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation


Abstract

The invention relates to the technical field of ship propulsion power detection, in particular to a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.

Description

Ship main power abnormality detection method based on deep concentric learning
Technical Field
The invention relates to the technical field of ship propulsion power detection, in particular to a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.
Background
The ship is important strategic equipment for guaranteeing national defense safety, developing the national economy, and building a strong maritime nation. The ship main engine is the core equipment of the ship power system, and whether it can run reliably directly determines the navigation safety of the ship. According to civil ship statistics, mechanical faults account for 22% of the causes of marine accidents, ranking first among all causes; within mechanical failures, faults of the ship main engine account for 45% of accidents. An investigation of marine failure risk during 2015-2017 by the maritime insurer The Swedish Club shows that the share of claims attributable to marine mechanical equipment rose from 35% to 48% of all marine claims, with main engine failure claims accounting for about one third of all mechanical equipment claims. Therefore, detecting abnormal states of the ship main engine in time makes it possible to discover potential faults in advance, take corresponding maintenance measures, and ensure the navigation safety of the ship.
The sensor data collected from the ship main engine provide a large amount of information about its operating state. By performing anomaly detection on the main engine sensor data, the health state of the main engine can be grasped in a timely manner. Existing anomaly detection algorithms can be broadly divided into three categories according to label availability: supervised, semi-supervised, and unsupervised anomaly detection algorithms. In supervised anomaly detection, labels are necessary. A typical supervised-learning strategy is to train a binary classifier under label guidance to achieve anomaly detection; in this case, existing classification algorithms such as support vector machines and neural network classification models of various advanced structures can be designed as anomaly detectors. However, because the abnormal modes of the ship main engine state are numerous, a supervised anomaly detection method can effectively identify only those abnormal modes that have already appeared; it cannot effectively identify unknown abnormal modes, since a binary classifier trained under supervised learning cannot learn discriminative features of unknown abnormal samples. Further, the main engine state anomaly detection task often suffers from a severe class imbalance problem: the data set contains a large amount of state monitoring data from normal operation but very little abnormal-state data. In this case, many classifiers tend to learn the characteristics of the normal data and fail to capture the characteristics of the abnormal data effectively, so abnormal main engine states are misclassified as normal states.
Most existing anomaly detection algorithms still have limitations when facing ship main engine state monitoring data that are high-dimensional and structurally complex, with imbalanced classes, few abnormal samples, and many abnormal modes. In recent studies, deep learning methods have demonstrated excellent performance in handling anomalies in high-dimensional data with complex distributions. Among them, deep autoencoder networks (DAE) are often used for deep representation learning. Typically, the intermediate hidden-layer features of the DAE are used as a new latent representation for other anomaly detection techniques, or the latent representation is combined with other features to boost anomaly detection performance. The latent representation obtained by the DAE is typically low-dimensional and can assist a simple classifier in identifying anomalies. However, the distribution into which the DAE maps the input data in the latent space is generally of arbitrary shape, which can hardly be applied directly to the ship main engine state anomaly detection task and may also be detrimental to the stability of the anomaly detection algorithm. Another key issue is that most DAE-based representation learning methods are separated from the anomaly detector into two independent processes, which may produce sub-optimal or task-irrelevant representations. Therefore, there is a need to develop new representation learning methods to address the challenges of current ship main engine state anomaly detection.
Disclosure of Invention
Aiming at the defects and shortcomings in the prior art, the invention provides a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.
The invention is achieved by the following measures:
the ship main power abnormality detection method based on deep concentric learning is characterized by comprising the following steps of:
step 1: data preprocessing: the monitoring data acquired from the ship main engine state sensors are divided into a training data set and a testing data set;
step 2: constructing a deep concentric learning DCL model and using it for representation learning; in the training stage, nonlinear feature learning is performed on the training data set using DCL, with reconstruction learning and concentric learning executed synchronously;
step 3: optimizing the network weights to obtain the optimized deep concentric learning DCL: first, the reconstruction loss and the concentric loss are calculated respectively; then the weights are optimized by stochastic gradient descent SGD to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to step 2;
step 4: anomaly detection: first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection result is output according to the anomaly score of the sample.
In step 1 of the invention, the monitoring data obtained from the ship main engine state sensors are divided into a training data set and a testing data set, wherein the training data set is used to train the model and the testing data set is used to evaluate the performance of the network.
The construction of the deep concentric learning DCL model in step 2 comprises the following steps:
step 2-1: reconstruction learning is used to obtain a low-dimensional latent representation and comprises an encoding stage and a decoding stage; in the encoding stage, the encoder encodes the input data and maps it into a low-dimensional latent space, obtaining the low-dimensional latent representation:

$$\mathbf{z} = f_e(\mathbf{x}; \mathbf{W}_e) \qquad (1)$$

where $f_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $\mathbf{x} \in \mathbb{R}^D$ is the encoder input data; $\mathbf{z} \in \mathbb{R}^d$ ($d \ll D$) is the encoder output data, a low-dimensional latent representation of the input data; and $\mathbf{W}_e$ is the learnable weight parameter of the encoder;

in the decoding stage, the decoder dominates and works to reconstruct the input data from the low-dimensional latent representation:

$$\hat{\mathbf{x}} = f_d(\mathbf{z}; \mathbf{W}_d) \qquad (2)$$

where $f_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{\mathbf{x}}$ is the reconstructed data; and $\mathbf{W}_d$ is the learnable weight parameter of the decoder; the weight parameters of the deep autoencoder network are obtained by minimizing the reconstruction error, with the mean square error MSE taken as the reconstruction loss function:

$$L_{rec}(\mathbf{W}) = \frac{1}{N} \sum_{i=1}^{N} \left\| \mathbf{x}_i - \hat{\mathbf{x}}_i \right\|_2^2 \qquad (3)$$

where $L_{rec}$ represents the reconstruction loss; $\mathbf{W} = \{\mathbf{W}_e, \mathbf{W}_d\}$ are the weight parameters that the deep autoencoder network learns through training; $\mathbf{x}_i$ is the encoder input data; and $\hat{\mathbf{x}}_i$ is the reconstructed data;
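For illustration, a minimal PyTorch sketch of the reconstruction-learning part (formulas (1)-(3)) is given below; the class name, layer sizes, and tensors are illustrative assumptions, not taken from the patent (the patent's actual layer stack, with BN and LeakyReLU, is reported later in the description):

```python
import torch
import torch.nn as nn

class DAE(nn.Module):
    """Sketch of a deep autoencoder for reconstruction learning."""
    def __init__(self, input_size: int, latent_dim: int = 2):
        super().__init__()
        # Encoder f_e: maps x in R^D to a low-dimensional latent z (Eq. 1)
        self.encoder = nn.Sequential(
            nn.Linear(input_size, 60), nn.LeakyReLU(),
            nn.Linear(60, latent_dim),
        )
        # Decoder f_d: reconstructs x_hat from z (Eq. 2)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 60), nn.LeakyReLU(),
            nn.Linear(60, input_size),
        )

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        x_hat = self.decoder(z)
        return z, x_hat

x = torch.randn(128, 84)                       # batch of 84-dimensional samples
model = DAE(input_size=84)
z, x_hat = model(x)
rec_loss = nn.functional.mse_loss(x_hat, x)    # MSE reconstruction loss (Eq. 3)
```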
step 2-2: constructing the concentric loss for guiding concentric learning; in the concentric latent space, given the concentric center $\mathbf{c}$, the distance from a low-dimensional latent representation $\mathbf{z}_i$ to the concentric center is measured by the Euclidean distance:

$$d_i(\mathbf{x}_i; \mathbf{W}) = \left\| f_e(\mathbf{x}_i; \mathbf{W}) - \mathbf{c} \right\|_2 \qquad (4)$$

where $\mathbf{W}$ is the learnable network weight parameter and $\mathbf{z}_i = f_e(\mathbf{x}_i; \mathbf{W})$ is the low-dimensional latent representation of the $i$-th input data $\mathbf{x}_i$ in the latent space; $d_i(\mathbf{x}_i; \mathbf{W})$ is abbreviated as $d_i$. With the concentric center $\mathbf{c}$ set by default to the origin of the latent space coordinates, the general form of the concentric loss is defined as:

$$L_{con} = \frac{1}{N} \sum_{i=1}^{N} \left[ (1 - y_i)\, \max(0,\; d_i - R_1)^2 + y_i\, \max(0,\; R_2 - d_i)^2 \right], \quad 0 < R_1 < R_2 \qquad (5)$$

where $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, with normal samples labeled 0 and abnormal samples labeled 1; and $R_1$ and $R_2$ denote the concentric inner boundary radius and the concentric outer boundary radius, respectively, with default settings $R_1 = 0.5$ and $R_2 = 1$;
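Under these definitions, formula (5) is a squared-hinge penalty on the latent distance. The following is a minimal sketch, assuming the max-squared form reconstructed above; the function name and tensors are illustrative:

```python
import torch

def concentric_loss(z, y, r1=0.5, r2=1.0):
    """Concentric loss (Eq. 5): normal samples (y=0) are pulled inside
    radius r1, abnormal samples (y=1) are pushed outside radius r2.
    The concentric center c defaults to the latent-space origin."""
    d = torch.linalg.norm(z, dim=1)                       # distance to center (Eq. 4)
    loss_normal = (1 - y) * torch.clamp(d - r1, min=0) ** 2
    loss_abnormal = y * torch.clamp(r2 - d, min=0) ** 2
    return (loss_normal + loss_abnormal).mean()

z = torch.randn(8, 2)                                     # latent representations
y = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1], dtype=torch.float32)
print(concentric_loss(z, y))
```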
when the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is:

$$L_{con}^{nor} = \frac{1}{N} \sum_{i:\, y_i = 0} \max(0,\; d_i - R_1)^2 \qquad (6)$$

as can be seen from the above, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $L_{con}^{nor} > 0$, and a large constraint is imposed on it, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $L_{con}^{nor} = 0$ and no additional constraint is imposed on it; minimizing $L_{con}^{nor}$ ensures that normal samples are mapped into the concentric inner boundary of latent-space radius $R_1$, i.e. the latent representations of normal samples are distributed within the concentric inner boundary; the gradient of $L_{con}^{nor}$ is:

$$\frac{\partial L_{con}^{nor}}{\partial \mathbf{W}} = \frac{2}{N} \sum_{i:\, y_i = 0} \max(0,\; d_i - R_1)\, \frac{\partial d_i}{\partial \mathbf{W}} \qquad (7)$$

where $\partial L_{con}^{nor} / \partial \mathbf{W}$ denotes the partial derivative of the concentric loss caused by normal samples with respect to the learnable network weight parameter $\mathbf{W}$; $\mathbf{x}_i$ denotes the $i$-th encoder input sample; $\mathbf{c}$ denotes the concentric center position; $d_i$ is the distance from the low-dimensional latent representation $\mathbf{z}_i$ corresponding to $\mathbf{x}_i$ to the concentric center $\mathbf{c}$; $\partial d_i / \partial \mathbf{W}$ denotes the partial derivative of $d_i$ with respect to $\mathbf{W}$; and $R_1$ denotes the concentric inner boundary radius;
when the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is:

$$L_{con}^{abn} = \frac{1}{N} \sum_{i:\, y_i = 1} \max(0,\; R_2 - d_i)^2 \qquad (8)$$

from the above, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $L_{con}^{abn} > 0$, and a constraint is imposed on it until it is pushed out of the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $L_{con}^{abn} = 0$ and no additional constraint is imposed on it; minimizing $L_{con}^{abn}$ ensures that abnormal samples are mapped outside the concentric outer boundary of latent-space radius $R_2$, i.e. the latent representations of abnormal samples are distributed outside the concentric outer boundary; the gradient of $L_{con}^{abn}$ is:

$$\frac{\partial L_{con}^{abn}}{\partial \mathbf{W}} = -\frac{2}{N} \sum_{i:\, y_i = 1} \max(0,\; R_2 - d_i)\, \frac{\partial d_i}{\partial \mathbf{W}} \qquad (9)$$

where $\partial L_{con}^{abn} / \partial \mathbf{W}$ denotes the partial derivative of the concentric loss caused by abnormal samples with respect to the learnable network weight parameter $\mathbf{W}$; $\mathbf{x}_i$ denotes the $i$-th encoder input sample; $\mathbf{c}$ denotes the concentric center position; $d_i$ is the distance from the low-dimensional latent representation $\mathbf{z}_i$ corresponding to $\mathbf{x}_i$ to the concentric center $\mathbf{c}$; $\partial d_i / \partial \mathbf{W}$ denotes the partial derivative of $d_i$ with respect to $\mathbf{W}$; and $R_2$ denotes the concentric outer boundary radius;
the concentric loss overcomes the adverse effect of class imbalance through two weight factors:

$$L_{con}^{w} = \frac{1}{N} \sum_{i=1}^{N} \left[ \lambda_{nor} (1 - y_i)\, \max(0,\; d_i - R_1)^2 + \lambda_{abn}\, y_i\, \max(0,\; R_2 - d_i)^2 \right] \qquad (10)$$

where $N_{nor}$ is the number of normal samples, $N_{abn}$ is the number of abnormal samples, and $N = N_{nor} + N_{abn}$; $\lambda_{nor}$ is the normal-sample weight factor and $\lambda_{abn}$ is the abnormal-sample weight factor; in actual use, $\lambda_{nor}$ is set to the number of abnormal samples, i.e. $\lambda_{nor} = N_{abn}$, and $\lambda_{abn}$ is set to the number of normal samples, i.e. $\lambda_{abn} = N_{nor}$;
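A hedged sketch of the class-balanced variant of formula (10), assuming the practical setting $\lambda_{nor} = N_{abn}$ and $\lambda_{abn} = N_{nor}$ described above; names are illustrative, and the sketch assumes both classes are present in the batch:

```python
import torch

def weighted_concentric_loss(z, y, r1=0.5, r2=1.0):
    """Class-balanced concentric loss (Eq. 10): the normal-sample term is
    weighted by the abnormal count and vice versa, so the rare class is
    not drowned out."""
    d = torch.linalg.norm(z, dim=1)
    n_abn = y.sum()
    n_nor = y.numel() - n_abn
    lam_nor, lam_abn = n_abn, n_nor           # swapped counts as weight factors
    loss = (lam_nor * (1 - y) * torch.clamp(d - r1, min=0) ** 2
            + lam_abn * y * torch.clamp(r2 - d, min=0) ** 2)
    return loss.mean()
```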
Step 2-3: setting an anomaly score and a concentric decision boundary: the anomaly score is used to measure the anomaly degree of the sample, the higher the anomaly score, the greater the anomaly probability of the sample, and in deep concentric learning DCL, the low-dimensional potential characterization obtained through characterization learning retains key information of the input data, and therefore, the anomaly score can be obtained by calculating the distance from the low-dimensional potential characterization to the concentric center, as shown in the following formula:
$$s_i = \left\| \mathbf{z}_i - \mathbf{c} \right\|_2 = d_i \qquad (11)$$

where $s_i$ denotes the anomaly score corresponding to the $i$-th sample;

after the anomaly score of each sample is obtained, an anomaly threshold is also required to identify whether each sample is a normal or an abnormal sample; in the concentric latent space, this anomaly threshold is referred to as the concentric decision boundary;

the concentric decision boundary can be obtained after training by the P-quantile method and is abbreviated $d_{th}$; the specific flow is as follows: the distances from all sample depth features to the concentric center are calculated according to formula (4); the distances of all samples to the concentric center are arranged from small to large as $D = \{ d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)} \}$; assuming the number of abnormal samples in the training set is $M$, the P-quantile method applied to $D$ yields the sample concentric distance $d_{th} = d_{(N-M)}$, i.e. the concentric distance of the sample at position $N - M$ in $D$;

after the anomaly score of each sample is obtained, samples whose anomaly score is higher than $d_{th}$ are regarded as abnormal samples, thereby realizing anomaly detection:

$$\hat{y}_i = \begin{cases} 0, & s_i \le d_{th} \\ 1, & s_i > d_{th} \end{cases} \qquad (12)$$

where $\hat{y}_i = 0$ indicates a normal sample and $\hat{y}_i = 1$ indicates an abnormal sample;
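A minimal sketch of the P-quantile concentric decision boundary and the decision rule of formulas (11)-(12), assuming the distances have already been computed; the helper names are illustrative:

```python
import torch

def concentric_decision_boundary(d_train: torch.Tensor, m_abnormal: int) -> float:
    """Sort all training distances ascending and take the value at
    position N - M (1-based), i.e. index N - M - 1 (0-based)."""
    d_sorted, _ = torch.sort(d_train)
    return d_sorted[d_sorted.numel() - m_abnormal - 1].item()

def detect(d_test: torch.Tensor, d_th: float) -> torch.Tensor:
    # the anomaly score is the distance itself; above threshold -> abnormal (1)
    return (d_test > d_th).long()
```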
step 2-4: constructing an optimization objective function: given a data set comprising
$N_{nor}$ normal samples and $N_{abn}$ abnormal samples, the optimization objective function of supervised deep concentric learning DCL training is constructed as:

$$L = L_{rec}^{nor} + L_{con}^{nor} + L_{con}^{abn} \qquad (13)$$

under the condition of considering class imbalance, the optimization objective function of supervised deep concentric learning DCL training is constructed as:

$$L = L_{rec}^{nor} + \lambda_{nor} L_{con}^{nor} + \lambda_{abn} L_{con}^{abn} \qquad (14)$$

the optimization objective function of deep concentric learning (DCL) includes three components: (1) $L_{rec}^{nor}$: characterizes the reconstruction error caused by the deep autoencoder network; $L_{rec}^{nor}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples; (2) $L_{con}^{nor}$: characterizes the concentric loss of normal samples mapped outside the concentric inner boundary; the smaller $L_{con}^{nor}$, the better; (3) $L_{con}^{abn}$: characterizes the concentric loss of abnormal samples mapped within the concentric outer boundary; the smaller $L_{con}^{abn}$, the better;
step 3: network weight optimization: first, the reconstruction loss and the concentric loss are calculated respectively; then the weights are optimized by stochastic gradient descent SGD to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to step 2;
step 4: anomaly detection: in the testing stage, first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection result is output according to the anomaly score of the sample.
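Putting the steps together, a minimal end-to-end training and scoring sketch is given below; it assumes the DAE model and the weighted_concentric_loss / concentric_decision_boundary helpers sketched earlier, the data tensors and abnormal proportion are placeholders, and the optimizer settings follow the hyperparameters reported later in the description:

```python
import torch

model = DAE(input_size=84)
opt = torch.optim.SGD(model.parameters(), lr=0.001,
                      momentum=0.9, weight_decay=0.0001)

x_train = torch.randn(512, 84)                 # placeholder training data
y_train = (torch.rand(512) < 0.1).float()      # ~10% abnormal, illustrative

for epoch in range(200):                       # set number of iterations
    z, x_hat = model(x_train)
    normal = y_train == 0
    # reconstruction loss over normal samples only (Eq. 13/14)
    rec_loss = torch.nn.functional.mse_loss(x_hat[normal], x_train[normal])
    con_loss = weighted_concentric_loss(z, y_train)
    loss = rec_loss + con_loss                 # joint minimization
    opt.zero_grad()
    loss.backward()
    opt.step()

# Testing stage: score test samples by distance to the concentric center
with torch.no_grad():
    z_train, _ = model(x_train)
    d_th = concentric_decision_boundary(
        torch.linalg.norm(z_train, dim=1), m_abnormal=int(y_train.sum()))
    z_test, _ = model(torch.randn(64, 84))     # placeholder test data
    scores = torch.linalg.norm(z_test, dim=1)  # anomaly scores (Eq. 11)
    pred = (scores > d_th).long()              # 1 = abnormal (Eq. 12)
```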
According to the invention, a new deep representation learning method, deep concentric learning (DCL), is constructed so that a new latent representation is learned to effectively separate samples of different classes and thereby improve ship main engine state anomaly detection performance; this addresses the problem that detection performance is unsatisfactory because the optimization objectives of traditional DAE-based representation learning and the anomaly detection task are inconsistent. DCL uses the DAE for representation learning, mapping high-dimensional, structurally complex ship main engine state input data into a low-dimensional latent representation through its strong nonlinear mapping capability. To ensure that the obtained low-dimensional latent representation is sufficiently discriminative to separate samples of different classes, a concentric loss is constructed, and training of the DAE is supervised jointly with the reconstruction loss. Considering the diversity of ship main engine state abnormal samples, in the latent space DCL maps normal samples of the main engine state inside the concentric inner boundary and abnormal samples outside the concentric outer boundary, which provides a larger mapping area for unknown abnormal samples of the ship main engine state; at the same time, DCL allows a significant concentric separation between the concentric inner and outer boundaries to separate the latent representations of different classes. Further, by setting two weight factors on the concentric loss to balance the concentric losses caused by normal and abnormal samples of the ship main engine state, DCL overcomes the class imbalance problem. DCL can not only be trained in an end-to-end manner, but the hyperparameters related to the concentric loss can also be dynamically adjusted during training (or preset). More importantly, DCL gives an interpretable anomaly score by measuring the distance from a sample to the concentric center in the latent space, thereby realizing the ship main engine state anomaly detection task.
Drawings
FIG. 1 is a block diagram of a deep concentric learning DCL model in accordance with the present invention.
FIG. 2 is a schematic diagram of concentric loss build-up in the present invention.
FIG. 3 is a flow chart of anomaly detection based on deep concentric learning DCL in the present invention.
Fig. 4 is an experimental result of 13 anomaly detection data sets in the embodiment of the present invention, wherein (a) in fig. 4 is an experimental result under an AUC-ROC evaluation index, and (b) in fig. 4 is an experimental result under an AUC-PR evaluation index.
Fig. 5 is a schematic diagram showing a change of a loss curve on different anomaly detection data sets according to an embodiment of the present invention, where (a) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD1, (b) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD2, (c) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD3, and (d) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD 4.
FIG. 6 is a schematic diagram of a visual representation of the potential characterization of a test sample on different anomaly detection datasets in an embodiment of the present invention, where (a) in FIG. 6 is a visual representation of the potential characterization of an AD1 dataset test sample, (b) in FIG. 6 is a visual representation of the potential characterization of an AD2 dataset test sample, (c) in FIG. 6 is a visual representation of the potential characterization of an AD3 dataset test sample, (d) in FIG. 6 is a visual representation of the potential characterization of an AD4 dataset test sample, and (e) in FIG. 6 is a visual representation of the potential characterization of an AD5 dataset test sample, and (f) in FIG. 6 is a visual representation of the potential characterization of an AD6 dataset test sample.
Detailed Description
The invention will be further described with reference to the drawings and examples.
In order to obtain discriminative latent representations from the high-dimensional ship main engine state input data and so improve anomaly detection performance, the invention constructs a novel deep representation learning method, deep concentric learning (DCL). The flow of detecting abnormal states of the ship main engine with DCL is shown in fig. 3 and mainly comprises two stages: a training stage and a testing stage. The training stage optimizes the weights of the DCL, and the testing stage uses the optimized DCL to execute the ship main engine state anomaly detection task. The specific flow is as follows:
first, data preprocessing: the monitoring data acquired from the ship main engine state sensors are divided into a training data set and a test data set, where the training data set is used to train the model and the test data set is used to evaluate network performance; second, representation learning: in the training stage, nonlinear feature learning is performed on the training data set using DCL, with reconstruction learning and concentric learning executed synchronously;
third, network weight optimization: first, the reconstruction loss and the concentric loss are calculated respectively; the weights are then optimized by stochastic gradient descent (SGD) to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to the second step;
fourth, anomaly detection: in the testing stage, first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection task is realized according to the anomaly score of the sample.
The overall framework of deep concentric learning (DCL) of the present invention is shown in fig. 1 and includes two important representation learning processes: reconstruction learning and concentric learning. Reconstruction learning is directed at learning a low-dimensional latent representation from the input data, while concentric learning is directed at promoting the obtained low-dimensional latent representation to be sufficiently discriminative for samples of different classes. Notably, in DCL the two processes of reconstruction learning and concentric learning are performed simultaneously; together they promote the distinguishability of samples between different classes, which is very promising for improving anomaly detection performance. In fig. 1, the upper part is reconstruction learning, where the obtained low-dimensional representation is of arbitrary shape and can hardly be used directly for the anomaly detection task; the lower part is deep concentric learning, in which the obtained low-dimensional representation is sufficiently discriminative to execute the anomaly detection task directly.
In reconstruction learning, a deep autoencoder is used, meaning that high-dimensional, structurally complex input data can be nonlinearly mapped into a low-dimensional latent representation, and the obtained latent representation retains as much of the key information in the input data as possible while the reconstruction loss is minimized. However, the low-dimensional latent representation obtained by reconstruction learning alone is of arbitrary shape, as shown in the upper part of fig. 1, and can hardly be used directly for anomaly detection; it may even impair anomaly detection performance. One straightforward reason is that the reconstruction loss is designed to compress the data dimensions, which is inconsistent with the optimization objective of the anomaly detection task.
Concentric learning solves the above problems of reconstruction learning by guiding the low-dimensional latent representations obtained by reconstruction learning to form a concentric distribution in the latent space, as shown in the lower part of fig. 1. Considering the diversity of abnormal samples, in the concentric latent space normal samples are guided to map inside the concentric inner boundary while abnormal samples are mapped outside the concentric outer boundary, which provides a larger mapping area for diversified, unpredictable abnormal patterns. Under the concentric distribution, normal and abnormal samples can be identified directly by measuring the distance of a sample from the concentric center. Therefore, to obtain a low-dimensional latent representation that can be used directly for anomaly detection, a loss that guides concentric learning must be designed, referred to herein as the concentric loss.
Reconstruction learning is used to obtain a low-dimensional latent representation, which includes two stages of encoding and decoding, as shown in the upper part of fig. 1.
In the encoding stage, the encoder encodes the input data and maps it to a low-dimensional latent space, obtaining the low-dimensional latent representation of formula (1): $\mathbf{z} = f_e(\mathbf{x}; \mathbf{W}_e)$, where $f_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $\mathbf{x} \in \mathbb{R}^D$ is the encoder input data; $\mathbf{z} \in \mathbb{R}^d$ is the encoder output data, a low-dimensional latent representation used to characterize the input data, typically with $d \ll D$; and $\mathbf{W}_e$ is the learnable weight parameter of the encoder.

In the decoding stage, the decoder dominates and works to reconstruct the input data from the low-dimensional latent representation, as in formula (2): $\hat{\mathbf{x}} = f_d(\mathbf{z}; \mathbf{W}_d)$, where $f_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{\mathbf{x}}$ is the reconstructed data; and $\mathbf{W}_d$ is the learnable weight parameter of the decoder. The weight parameters of the deep autoencoder network may be obtained by minimizing the reconstruction error, with the mean square error (MSE) taken as the reconstruction loss function, as in formula (3): $L_{rec}(\mathbf{W}) = \frac{1}{N} \sum_{i=1}^{N} \| \mathbf{x}_i - \hat{\mathbf{x}}_i \|_2^2$, where $L_{rec}$ represents the reconstruction loss, $\mathbf{W}$ the weight parameters that the deep autoencoder network learns through training, $\mathbf{x}_i$ the encoder input data, and $\hat{\mathbf{x}}_i$ the reconstructed data.
The concentric loss is used to guide concentric learning. The latent space obtained by conventional DAE-based representation learning is arbitrarily shaped, as shown in part (a) of fig. 2. To guide the low-dimensional latent representations into a concentric distribution in the latent space, the concentric loss not only applies a large penalty to normal-sample latent representations mapped outside the concentric inner boundary so that they fall within the concentric inner boundary, but also applies a large penalty to abnormal-sample latent representations mapped within the concentric outer boundary so that they fall outside the concentric outer boundary, as shown in part (b) of fig. 2. By minimizing the concentric loss, normal samples are forced to map inside the concentric inner boundary and abnormal samples outside the concentric outer boundary, which creates a significant concentric separation gap between the low-dimensional latent representations of normal and abnormal samples, as shown in part (c) of fig. 2.
In the concentric latent space, given the concentric center $\mathbf{c}$, the distance from a low-dimensional latent representation $\mathbf{z}_i$ to the concentric center may be measured using the Euclidean distance, as in formula (4): $d_i(\mathbf{x}_i; \mathbf{W}) = \| f_e(\mathbf{x}_i; \mathbf{W}) - \mathbf{c} \|_2$, where $\mathbf{W}$ is the learnable network weight parameter and $\mathbf{z}_i = f_e(\mathbf{x}_i; \mathbf{W})$ is the low-dimensional latent representation of the $i$-th input data $\mathbf{x}_i$ in the latent space; $d_i(\mathbf{x}_i; \mathbf{W})$ may be abbreviated as $d_i$. The invention sets the concentric center $\mathbf{c}$ by default to the origin of the latent space coordinates.
The general form of the concentric loss can be defined as in formula (5): $L_{con} = \frac{1}{N} \sum_{i=1}^{N} [ (1 - y_i) \max(0, d_i - R_1)^2 + y_i \max(0, R_2 - d_i)^2 ]$, where $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, with normal samples labeled 0 and abnormal samples labeled 1; and $R_1$ and $R_2$ denote the concentric inner boundary radius and concentric outer boundary radius respectively, generally with $0 < R_1 < R_2$. The invention's default setting is $R_1 = 0.5$ and $R_2 = 1$, which achieves desirable results in practice.
When the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is that of formula (6): $L_{con}^{nor} = \frac{1}{N} \sum_{i: y_i = 0} \max(0, d_i - R_1)^2$. As can be seen, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $L_{con}^{nor} > 0$, and a large constraint is imposed on it, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $L_{con}^{nor} = 0$ and no additional constraint is imposed on it. Minimizing $L_{con}^{nor}$ ensures that normal samples are mapped into the concentric inner boundary of latent-space radius $R_1$, i.e. the latent representations of normal samples are distributed within the concentric inner boundary. The gradient of $L_{con}^{nor}$ is given by formula (7): $\partial L_{con}^{nor} / \partial \mathbf{W} = \frac{2}{N} \sum_{i: y_i = 0} \max(0, d_i - R_1)\, \partial d_i / \partial \mathbf{W}$, where the symbols are as defined for formulas (4)-(6): $\mathbf{x}_i$ is the $i$-th encoder input sample, $\mathbf{c}$ the concentric center position, $d_i$ the distance from the latent representation $\mathbf{z}_i$ of $\mathbf{x}_i$ to $\mathbf{c}$, $\partial d_i / \partial \mathbf{W}$ the partial derivative of $d_i$ with respect to the learnable network weight parameter $\mathbf{W}$, and $R_1$ the concentric inner boundary radius.
When the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is that of formula (8): $L_{con}^{abn} = \frac{1}{N} \sum_{i: y_i = 1} \max(0, R_2 - d_i)^2$. As can be seen, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $L_{con}^{abn} > 0$, and a constraint is imposed on it until it is pushed out of the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $L_{con}^{abn} = 0$ and no additional constraint is imposed on it. Minimizing $L_{con}^{abn}$ ensures that abnormal samples are mapped outside the concentric outer boundary of latent-space radius $R_2$, i.e. the latent representations of abnormal samples are distributed outside the concentric outer boundary. The gradient of $L_{con}^{abn}$ is given by formula (9): $\partial L_{con}^{abn} / \partial \mathbf{W} = -\frac{2}{N} \sum_{i: y_i = 1} \max(0, R_2 - d_i)\, \partial d_i / \partial \mathbf{W}$, with the symbols as defined above and $R_2$ the concentric outer boundary radius.
consider that in most anomaly detection tasks, there is typically a problem of class imbalance, i.e., there are far more normal samples than anomalous samples. In such an unbalanced case, the deep learning method may have insufficient feature capturing ability for the abnormal sample, causing the abnormal sample to be misclassified as a normal sample. To address this problem, the concentricity penalty overcomes the adverse effects of class imbalance by two weight factors, as shown in the following equation:
$L_{con}^{w} = \frac{1}{N} \sum_{i=1}^{N} [ \lambda_{nor} (1 - y_i) \max(0, d_i - R_1)^2 + \lambda_{abn}\, y_i \max(0, R_2 - d_i)^2 ]$ (formula (10)), where $N_{nor}$ is the number of normal samples, $N_{abn}$ the number of abnormal samples, and $N = N_{nor} + N_{abn}$; $\lambda_{nor}$ is the normal-sample weight factor and $\lambda_{abn}$ the abnormal-sample weight factor. For ease of setting, in actual use $\lambda_{nor}$ can be set to the number of abnormal samples, i.e. $\lambda_{nor} = N_{abn}$, and $\lambda_{abn}$ can be set to the number of normal samples, i.e. $\lambda_{abn} = N_{nor}$.
The anomaly score is used to measure the degree of abnormality of a sample. The higher the anomaly score, the greater the probability that the sample is abnormal. In deep concentric learning (DCL), the low-dimensional latent representation obtained by representation learning preserves the key information of the input data. Thus, the anomaly score can be obtained by calculating the distance from the low-dimensional latent representation to the concentric center, as in formula (11): $s_i = \| \mathbf{z}_i - \mathbf{c} \|_2 = d_i$, where $s_i$ denotes the anomaly score of the $i$-th sample.

After the anomaly score of each sample is obtained, an anomaly threshold is also required to identify whether each sample is a normal or an abnormal sample. In the concentric latent space, this anomaly threshold is referred to as the concentric decision boundary. The concentric decision boundary can be obtained by the P-quantile method after training is finished and is abbreviated $d_{th}$. The specific flow is as follows: calculate the distances from all sample depth features to the concentric center according to formula (4); arrange the distances of all samples to the concentric center from small to large as $D = \{ d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)} \}$; assuming the number of abnormal samples in the training set is $M$, the P-quantile method applied to $D$ gives the sample concentric distance $d_{th} = d_{(N-M)}$, i.e. the concentric distance of the sample at position $N - M$ in $D$.

After the anomaly score of each sample is obtained, samples whose anomaly score is higher than $d_{th}$ are regarded as abnormal samples, thereby realizing anomaly detection as in formula (12): $\hat{y}_i = 0$ when $s_i \le d_{th}$ indicates a normal sample, and $\hat{y}_i = 1$ when $s_i > d_{th}$ indicates an abnormal sample.
Given a data set comprising $N_{nor}$ normal samples and $N_{abn}$ abnormal samples, the optimization objective function for supervised deep concentric learning (DCL) training is constructed as in formula (13): $L = L_{rec}^{nor} + L_{con}^{nor} + L_{con}^{abn}$. Taking class imbalance into account, the optimization objective function is constructed as in formula (14): $L = L_{rec}^{nor} + \lambda_{nor} L_{con}^{nor} + \lambda_{abn} L_{con}^{abn}$.

The optimization objective function of deep concentric learning (DCL) includes three components:

(1) $L_{rec}^{nor}$: characterizes the reconstruction error caused by the deep autoencoder network. For the obtained low-dimensional latent representation to preserve the key information in the input data well, the smaller the reconstruction error, the better. In practice, normal samples are expected to be well reconstructed while abnormal samples are not; thus $L_{rec}^{nor}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples.

(2) $L_{con}^{nor}$: characterizes the concentric loss resulting from normal samples being mapped outside the concentric inner boundary. For the latent representations of normal samples to be distributed within the concentric inner boundary, the smaller $L_{con}^{nor}$, the better.

(3) $L_{con}^{abn}$: characterizes the concentric loss resulting from abnormal samples being mapped within the concentric outer boundary. For the low-dimensional latent representations of abnormal samples to be distributed outside the concentric outer boundary, the smaller $L_{con}^{abn}$, the better.

By minimizing $L$, a latent space in which the low-dimensional latent representations exhibit a concentric distribution can be guaranteed. In this latent space, the low-dimensional latent representations not only preserve well the important information contained in the normal-sample input data, but normal samples are also mapped inside the concentric inner boundary and abnormal samples outside the concentric outer boundary.
The present invention uses an 84-dimensional feature vector as input.
The data set details are shown in table 1.
Gaussian white noise of 60 dB was added to the dataset, which contains 3500 samples covering 14 health states: normal (no fault), intake manifold pressure reduction, compression ratio reduction in cylinders No. 1 to No. 6, and fuel injection quantity reduction in cylinders No. 1 to No. 6, with 250 samples per health state.
Except for the normal state, all other states are abnormal states.
The invention combines the normal state with each of the remaining abnormal states in turn, forming 13 anomaly detection data sets denoted AD1, AD2, ..., AD13 (see below for a construction sketch). Each dataset is randomly divided into a training data set and a test data set at a ratio of 7:3.
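A sketch of this dataset construction under stated assumptions (random placeholder features standing in for the real monitoring data; scikit-learn's train_test_split for the 7:3 division; variable names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: 3500 x 84 feature matrix; state labels 0..13, where 0 is normal.
X = np.random.randn(3500, 84)
state = np.repeat(np.arange(14), 250)

datasets = {}
for k in range(1, 14):                        # AD1 ... AD13
    mask = (state == 0) | (state == k)        # normal + one abnormal state
    X_k = X[mask]
    y_k = (state[mask] != 0).astype(int)      # 0 = normal, 1 = abnormal
    datasets[f"AD{k}"] = train_test_split(
        X_k, y_k, test_size=0.3, random_state=0, stratify=y_k)
```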
DCL mainly introduces concentric learning (i.e. joint supervision with the concentric loss) on top of DAE-based representation learning, and overcomes the class imbalance problem through the two weight factors in the concentric loss. The invention uses three methods, DAE, DCL_N, and DCL, in comparative experiments. DAE: only the mean square error loss is used as the optimization target; DCL_N: the mean square error loss and a concentric loss that does not consider class imbalance are used as optimization targets; DCL: the mean square error loss and the concentric loss that considers class imbalance are used as optimization targets.
TABLE 1 Ship main engine health state types and corresponding sample numbers
State No. 0: normal (no fault) - 250 samples
State No. 1: intake manifold pressure reduction - 250 samples
States No. 2-7: compression ratio reduction, cylinders No. 1-6 - 250 samples each
States No. 8-13: fuel injection quantity reduction, cylinders No. 1-6 - 250 samples each
Total: 14 health states, 3500 samples
The hyperparameters related to the network structure follow the prior art and are set to FC(input_size, 60, BN, LeakyReLU) - FC(60, 30, BN, LeakyReLU) - FC(30, 10, BN, LeakyReLU) - FC(10, 2, None, None) - FC(2, 10, BN, LeakyReLU) - FC(10, 30, BN, LeakyReLU) - FC(30, 60, BN, LeakyReLU) - FC(60, output_size, None, None). Here FC denotes a fully connected layer, BN a batch normalization layer, and LeakyReLU a LeakyReLU activation function layer; input_size and output_size are the feature dimensions of the DCL input and output vectors respectively, and are equal to each other. The parameters in brackets denote, respectively, the FC input neuron count, the output neuron count, the batch normalization following the FC, and the activation function following the BN. It can be seen that the encoder output is 2-dimensional, i.e. the latent representation dimension is 2. The hyperparameters for network weight optimization follow the existing literature and are set as follows: the number of iterations is 200; an SGD optimizer is adopted with momentum 0.9 and weight decay 0.0001; the initial learning rate is 0.001 and then decays by a factor of 0.1 every 66 iterations; the batch size is set to 128.
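For illustration, the FC stack described above could be written in PyTorch roughly as follows (a sketch; the fc_block helper and variable names are assumptions, not from the patent):

```python
import torch
import torch.nn as nn

def fc_block(n_in, n_out, bn=True, act=True):
    # FC(n_in, n_out, BN?, LeakyReLU?) following the notation above
    layers = [nn.Linear(n_in, n_out)]
    if bn:
        layers.append(nn.BatchNorm1d(n_out))
    if act:
        layers.append(nn.LeakyReLU())
    return layers

input_size = output_size = 84
encoder = nn.Sequential(
    *fc_block(input_size, 60), *fc_block(60, 30), *fc_block(30, 10),
    *fc_block(10, 2, bn=False, act=False),     # 2-D latent representation
)
decoder = nn.Sequential(
    *fc_block(2, 10), *fc_block(10, 30), *fc_block(30, 60),
    *fc_block(60, output_size, bn=False, act=False),
)

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=66, gamma=0.1)
```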
The present invention uses two commonly used evaluation indices, AUC-ROC (Area Under the Receiver Operating Characteristic Curve) and AUC-PR (Area Under the Precision-Recall Curve), to comprehensively evaluate the detection performance of the considered methods. AUC-ROC focuses on the relationship between two indicators, the true positive rate (TPR) and the false positive rate (FPR): the model performs best when AUC-ROC = 1, while AUC-ROC = 0.5 means the model has no discriminating ability. AUC-PR focuses on the relationship between precision and recall rather than the true and false positive rates. The higher the AUC-PR, the better the detection performance; the model performs best when AUC-PR = 1.
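A small sketch of computing the two indices with scikit-learn, assuming 0/1 labels and distance-based anomaly scores; note that average_precision_score is a common estimate of AUC-PR, an assumption rather than the patent's stated implementation:

```python
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = [0, 0, 0, 1, 1]            # 0 = normal, 1 = abnormal (illustrative)
scores = [0.2, 0.4, 0.3, 1.3, 0.9]  # distance-based anomaly scores
auc_roc = roc_auc_score(y_true, scores)
auc_pr = average_precision_score(y_true, scores)
print(f"AUC-ROC={auc_roc:.3f}, AUC-PR={auc_pr:.3f}")
```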
To verify the validity of the algorithm, multiple sets of experiments were organized. First, the performance of the constructed method is evaluated on the 13 anomaly detection data sets; then, the detection performance of the constructed method under insufficient abnormal samples is explored by changing the proportion of abnormal samples in the training data set; finally, convergence analysis is performed on the constructed method. The DCL constructed by the invention was implemented in PyTorch 1.8.0, and all experiments were performed on a computer configured with an Intel(R) Core(TM) i9-9900K CPU @ 3.60 GHz.
The performance on the different data sets is compared as follows: Table 2 gives the performance of the considered methods on the 13 anomaly detection datasets.
To avoid the influence of random factors, each result is the average of 5 repeated trials. From the experimental results in Table 2, the following conclusions can be drawn.
(1) The performance of DCL_N is better than that of DAE on both AUC-ROC and AUC-PR. More specifically, DCL_N achieved a 0.89% improvement over DAE in average AUC-ROC and a 0.76% improvement in average AUC-PR. DCL_N differs from DAE in that DCL_N uses the concentric loss; its better detection performance indicates that the concentric loss enables DCL_N to obtain a discriminative latent representation that separates normal and abnormal samples.
(2) Of the three methods considered, DCL gave the best detection performance of 100% in terms of both average AUC-ROC and average AUC-PR. In contrast to DCL_N, DCL uses the concentric loss that accounts for class imbalance. Although the ratio of normal to abnormal samples is the same in the 13 data sets considered, the two weight factors in the concentric loss used by DCL produce a greater loss and play a stronger regularization role, thus achieving a performance improvement over DCL_N.
The influence of different training anomaly ratios (namely under the conditions of insufficient anomaly samples and unbalanced categories) on the DCL detection performance is examined below. Here, the training anomaly ratio refers to a ratio of an anomaly sample used in the training process to an anomaly sample in the training data set. In training, experimental verification was performed by varying the proportion of abnormal samples on the training dataset.
Fig. 4 shows the experimental results of the considered methods on the 13 anomaly detection datasets. Each result is the average test performance of each method over 5 repeated trials on each dataset. It can be seen that as the training anomaly proportion increases, the performances of DAE, DCL_N, and DCL show a gradually increasing trend, which indicates that appropriately adding abnormal samples to the training data set helps to learn the discriminative information between normal and abnormal samples and improves the separability of samples between different classes. However, when the training anomaly proportion is small, the detection performance of DAE is relatively poor, while DCL_N and DCL maintain higher detection performance, indicating that the constructed method still has excellent feature learning capability under class imbalance; DCL in particular, since it considers the class imbalance problem, achieves better detection performance than DCL_N. The performance improvement obtained by DCL suggests that it is necessary to consider the class imbalance problem in anomaly detection tasks, and the concentric loss can prompt representation learning to obtain more discriminative latent representations to distinguish samples between different classes.
TABLE 2 Detection performance (%) of the different methods on the test sample sets (table image not reproduced)
In DCL, two losses are used: the reconstruction loss (MSE loss) and the concentric loss, whose sum constitutes the total loss. Fig. 5 shows the loss curves of DCL on the test samples of the AD1, AD2, AD3, and AD4 anomaly detection datasets. It can be seen that the total loss gradually converges to a small value as the network weights are optimized. The total loss is initially large, mainly due to the concentric loss; thereafter, as the concentric loss decreases, the reconstruction loss dominates the total loss. Observing the change of the loss curves on the different data sets shows that the concentric loss has good convergence.
To further examine whether the latent representations obtained by DCL are sufficiently discriminative to distinguish samples between different classes, fig. 6 shows the latent representation visualization results of DCL on the test samples of the AD1 through AD6 datasets. It can be seen that, on the different data sets, the latent representations of the vast majority of normal samples fall within the concentric inner boundary (i.e. the concentric circle of radius R1) and the latent representations of the vast majority of abnormal samples fall outside the concentric outer boundary (i.e. the concentric circle of radius R2), with significant separation between normal and abnormal samples. The visualization results on the different data sets indicate that the latent representations obtained by DCL carry sufficient discriminative information to separate normal and abnormal samples.
The invention constructs a novel deep characterization learning method, namely Deep Concentric Learning (DCL), which promotes learning of effective potential characterization to improve the state abnormality detection performance of the ship host. The main novelty of DCL is that a concentric loss-inducing potential representation is constructed to form a concentric potential space. In this concentric potential space, normal samples are mapped inside the concentric inner boundary and abnormal samples are mapped outside the concentric outer boundary. And the normal sample and the abnormal sample can be well separated by setting the concentric inner boundary radius and the concentric outer boundary radius to form a concentric interval in the training process, so that the separability among different categories is remarkably improved.
The effectiveness of the DCL in improving the abnormality detection performance is verified through experimental results on a ship host data set. Specifically, dcl_n achieves a detection performance improvement of 0.89%, 0.76% for the mean AUC-ROC and the mean AUC-PR, respectively, over the 13 anomaly detection datasets considered, compared to the traditional DAE-based approach; further, among the 3 methods considered (i.e., DAE, dcl_ N, DCL), DCL that considers the class imbalance problem achieves the highest detection performance of 100%. At different training anomaly ratios, both dcl_n and DCL achieved better detection performance than DAE. The performance improvement obtained by the DCL shows that the problem of unbalanced category is necessary to be considered in the abnormal detection task, and the DCL can obviously promote the characteristic learning to obtain the discriminative potential characteristic so as to improve the abnormal detection performance of the ship host state.
Finally, DCL can be applied not only to the task of detecting abnormal states of the ship host, but also to other pattern recognition tasks that require improved separability between different classes.

Claims (5)

1. A ship main power abnormality detection method based on deep concentric learning, characterized by comprising the following steps:
step 1: data preprocessing: monitoring data acquired from the ship host state sensors are divided into a training data set and a testing data set;
step 2: constructing a deep concentric learning DCL model and performing representation learning with it: in the training stage, the DCL performs nonlinear feature learning on the training data set, executing reconstruction learning and concentric learning synchronously;
step 3: optimizing the network weights to obtain the optimized deep concentric learning DCL: first, the reconstruction loss and the concentric loss are calculated separately; then the weights are optimized by stochastic gradient descent SGD so as to minimize the reconstruction loss and the concentric loss; finally, it is judged whether the set number of iterations has been reached; if yes, the optimized DCL is obtained; if not, the method returns to step 2;
step 4: detecting the abnormal state: first, the ship host state test data set is input into the optimized DCL to obtain the low-dimensional potential representations; then, the distance from each low-dimensional potential representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship host state abnormality detection result is output according to the anomaly scores of the samples.
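As a reading aid, the detection stage of step 4 can be sketched as follows; `encode` stands for the trained DCL encoder, the concentric center is assumed to be the latent-space origin (the default in claim 3), and `d_p` is the decision boundary obtained as in claim 4. All names are illustrative, not part of the claim language.

```python
# Sketch of claim 1, step 4: encode test data, score by distance to the
# concentric center, and threshold the scores with the decision boundary d_p.
import numpy as np

def detect_anomalies(x_test: np.ndarray, encode, d_p: float) -> np.ndarray:
    z = encode(x_test)                     # low-dimensional potential representations
    scores = np.linalg.norm(z, axis=1)     # distance to concentric center = anomaly score
    return (scores > d_p).astype(int)      # 1 = abnormal host state, 0 = normal
```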
2. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 1, wherein the construction of the deep concentric learning DCL model in step 2 comprises the following steps:
step 2-1: reconstruction learning is used to obtain a low-dimensional potential representation and comprises two stages, encoding and decoding; in the encoding stage, the encoder encodes the input data and maps it into a low-dimensional potential space, obtaining the low-dimensional potential representation, as shown in the following formula:

$$Z = \phi_e(X;\, W_e) \tag{1}$$

wherein $\phi_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $X$ is the encoder input data; $Z$ is the encoder output data, a low-dimensional potential representation of the input data, $Z \in \mathbb{R}^{N \times d}$ with $d$ smaller than the input dimension; $W_e$ is a learnable weight parameter of the encoder;
in the decoding stage, the decoder dominates, working to reconstruct the low-dimensional potential representation back into the input data, as shown in the following formula:

$$\hat{X} = \phi_d(Z;\, W_d) \tag{2}$$

wherein $\phi_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{X}$ is the reconstruction data; $W_d$ is a learnable weight parameter of the decoder; the weight parameters of the depth self-encoder network are obtained by minimizing the reconstruction error, with the mean square error MSE taken as the reconstruction loss function, as follows:

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left\| x_i - \hat{x}_i \right\|^2 \tag{3}$$

wherein $\mathcal{L}_{\mathrm{MSE}}$ denotes the reconstruction loss; $W = \{W_e, W_d\}$ are the weight parameters that the depth self-encoder network learns through training;
step 2-2: constructing the concentricity loss for guiding concentric learning;
step 2-3: setting the anomaly score and the concentric decision boundary;
step 2-4: constructing the optimization objective function.
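A minimal sketch of the reconstruction-learning stage of step 2-1, assuming a fully connected depth self-encoder implemented in PyTorch; the layer widths and latent dimension are illustrative choices, not values specified in the claim.

```python
# Sketch of step 2-1: encoder phi_e maps input X to latent Z (formula (1)),
# decoder phi_d reconstructs X_hat from Z (formula (2)), and the MSE between
# X and X_hat serves as the reconstruction loss (formula (3)).
import torch
import torch.nn as nn

class DeepAutoEncoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(       # phi_e: nonlinear transformations
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(       # phi_d: nonlinear transformations
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, in_dim),
        )

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)                 # low-dimensional potential representation
        x_hat = self.decoder(z)             # reconstruction of the input
        return z, x_hat
```

In the training stage, the latent representation z feeds the concentric learning branch while x_hat feeds the reconstruction loss, so the two learning tasks share a single encoder.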
3. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 2, wherein in the concentric potential space in step 2-2, given a concentric center $c$, the distance from a low-dimensional potential representation $z_i$ to the concentric center is measured by the Euclidean distance, as follows:

$$d_i = d(z_i, c) = \left\| z_i - c \right\|_2 \tag{4}$$

wherein $z_i$ is the low-dimensional potential representation obtained by mapping the $i$-th input data $x_i$ into the potential space, and $d(z_i, c)$ is abbreviated as $d_i$; the concentric center $c$ is set by default to the coordinate origin of the potential space; the general form of the concentricity loss is defined as follows:

$$\mathcal{L}_{\mathrm{C}} = \frac{1}{N}\sum_{i=1}^{N}\Big[(1-y_i)\,\ell_{\mathrm{n}}(d_i) + y_i\,\ell_{\mathrm{a}}(d_i)\Big] \tag{5}$$

$$\ell_{\mathrm{n}}(d_i) = \max\left(0,\; d_i - R_1\right)^2, \qquad \ell_{\mathrm{a}}(d_i) = \max\left(0,\; R_2 - d_i\right)^2$$

wherein $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, a normal sample being labeled 0 and an abnormal sample being labeled 1; $R_1$ and $R_2$ denote the concentric inner boundary radius and the concentric outer boundary radius, respectively, with $R_1 < R_2$;
considering the case where the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is:

$$\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} = \frac{1}{N}\sum_{i:\, y_i = 0} \max\left(0,\; d_i - R_1\right)^2 \tag{6}$$

As can be seen from the above formula, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} > 0$, and a large constraint is imposed on the sample, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} = 0$, and no additional constraint needs to be imposed; minimizing $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$ therefore ensures that normal samples are mapped into the concentric inner boundary with potential-space radius $R_1$, i.e. that the potential representations of normal samples are distributed within the concentric inner boundary; the gradient of $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$ is computed as shown in the following formula:

$$\frac{\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{n}}}{\partial W} = \frac{2}{N}\sum_{i:\, y_i = 0} \max\left(0,\; d_i - R_1\right)\,\frac{\partial\, d\!\left(\phi_e(x_i; W),\, c\right)}{\partial W} \tag{7}$$

wherein $\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{n}} / \partial W$ denotes the partial derivative of the concentric loss caused by normal samples with respect to the learnable network weight parameters $W$; $x_i$ denotes the $i$-th encoder input sample; $c$ denotes the concentric center position; $d_i$ denotes the distance from the low-dimensional potential representation $z_i$ corresponding to $x_i$ to the concentric center $c$; $\partial d_i / \partial W$ denotes the partial derivative of $d_i$ with respect to the learnable network weight parameters $W$; $R_1$ denotes the concentric inner boundary radius;
considering the case where the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is:

$$\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} = \frac{1}{N}\sum_{i:\, y_i = 1} \max\left(0,\; R_2 - d_i\right)^2 \tag{8}$$

From the above formula, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} > 0$, and a constraint is imposed on the sample until it is pushed outside the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} = 0$, and no additional constraint is imposed; minimizing $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$ ensures that abnormal samples are mapped outside the concentric outer boundary with potential-space radius $R_2$, i.e. that the potential representations of abnormal samples are distributed outside the concentric outer boundary; the gradient of $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$ is computed as shown in the following formula:

$$\frac{\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{a}}}{\partial W} = -\frac{2}{N}\sum_{i:\, y_i = 1} \max\left(0,\; R_2 - d_i\right)\,\frac{\partial\, d\!\left(\phi_e(x_i; W),\, c\right)}{\partial W} \tag{9}$$

wherein $\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{a}} / \partial W$ denotes the partial derivative of the concentric loss caused by abnormal samples with respect to the learnable network weight parameters $W$; the remaining symbols are as defined for formula (7); $R_2$ denotes the concentric outer boundary radius;
the concentricity loss overcomes the adverse effect caused by class imbalance through two weight factors, as shown in the following formula:

$$\mathcal{L}_{\mathrm{C}} = \frac{1}{N}\sum_{i=1}^{N}\Big[\lambda_{\mathrm{n}}\,(1-y_i)\,\max\left(0,\; d_i - R_1\right)^2 + \lambda_{\mathrm{a}}\, y_i\,\max\left(0,\; R_2 - d_i\right)^2\Big] \tag{10}$$

wherein $N_{\mathrm{n}}$ is the number of normal samples and $N_{\mathrm{a}}$ is the number of abnormal samples, with $N = N_{\mathrm{n}} + N_{\mathrm{a}}$; $\lambda_{\mathrm{n}}$ is the normal sample weight factor and $\lambda_{\mathrm{a}}$ is the abnormal sample weight factor; in actual use, $\lambda_{\mathrm{n}}$ is set to the number of abnormal samples, i.e. $\lambda_{\mathrm{n}} = N_{\mathrm{a}}$, and $\lambda_{\mathrm{a}}$ is set to the number of normal samples, i.e. $\lambda_{\mathrm{a}} = N_{\mathrm{n}}$.
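The weighted concentricity loss of claim 3 might be implemented as sketched below; the squared-hinge form follows formulas (5)-(10) as reconstructed above, and the PyTorch-specific details are assumptions.

```python
# Sketch of the class-imbalance-weighted concentricity loss (formula (10)):
# normal latents (y=0) are penalized outside radius R1, abnormal latents (y=1)
# are penalized inside radius R2; lam_n = N_a and lam_a = N_n.
import torch

def concentric_loss(z, y, r1, r2, lam_n, lam_a, center=None):
    if center is None:
        center = torch.zeros(z.shape[1], device=z.device)    # default: origin
    d = torch.norm(z - center, dim=1)                        # d_i, formula (4)
    loss_n = (1.0 - y) * torch.clamp(d - r1, min=0.0) ** 2   # normal-sample term
    loss_a = y * torch.clamp(r2 - d, min=0.0) ** 2           # abnormal-sample term
    return (lam_n * loss_n + lam_a * loss_a).mean()
```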
4. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 3, wherein the anomaly score in step 2-3 is obtained by calculating the distance from the low-dimensional potential representation to the concentric center, as shown in the following formula:

$$s_i = d(z_i, c) = \left\| z_i - c \right\|_2 \tag{11}$$

wherein $s_i$ denotes the anomaly score corresponding to the $i$-th sample;
the concentric decision boundary, abbreviated $d_P$, is obtained by the P-quantile method after training is finished; the specific flow is as follows: the distances between the depth features of all samples and the concentric center are calculated according to formula (4); all samples are arranged in ascending order of their distance to the concentric center, giving $D = \{d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)}\}$; assuming that the number of abnormal samples in the training set is $M$, the P-quantile method takes the sample concentric distance at position $N - M$ in $D$ as the decision boundary, i.e.:

$$d_P = d_{(N-M)} \tag{12}$$

after the anomaly score of each sample is obtained, a sample whose anomaly score is higher than $d_P$ is regarded as an abnormal sample, thereby realizing anomaly detection, as shown in the following formula:

$$\hat{y}_i = \begin{cases} 0\ (\text{normal sample}), & s_i \le d_P \\ 1\ (\text{abnormal sample}), & s_i > d_P \end{cases} \tag{13}$$
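A sketch of the P-quantile decision boundary of claim 4; `distances` is assumed to hold the concentric distances of all N training samples, M of which are abnormal, and the choice of the (N-M)-th order statistic follows the reconstruction above.

```python
# Sketch of claim 4: the decision boundary d_P is the sample concentric
# distance at position N-M in the ascending arrangement D, and samples whose
# anomaly score exceeds d_P are flagged as abnormal.
import numpy as np

def p_quantile_threshold(distances: np.ndarray, n_abnormal: int) -> float:
    d_sorted = np.sort(distances)                            # D = {d(1) <= ... <= d(N)}
    return float(d_sorted[len(d_sorted) - n_abnormal - 1])   # 0-based index N-M-1

def detect(scores: np.ndarray, d_p: float) -> np.ndarray:
    return (scores > d_p).astype(int)                        # 1 = abnormal, 0 = normal
```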
5. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 2, wherein step 2-4 specifically comprises the following steps:
given a data set comprising $N_{\mathrm{n}}$ normal samples and $N_{\mathrm{a}}$ abnormal samples, and taking class imbalance into account, the optimization objective function for supervised deep concentric learning DCL training is constructed as follows:

$$\min_{W}\ \mathcal{L} = \mathcal{L}_{\mathrm{MSE}} + \lambda_{\mathrm{n}}\,\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} + \lambda_{\mathrm{a}}\,\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} \tag{14}$$

wherein the weight factors $\lambda_{\mathrm{n}} = N_{\mathrm{a}}$ and $\lambda_{\mathrm{a}} = N_{\mathrm{n}}$ compensate for class imbalance as in formula (10); the optimization objective function of deep concentric learning DCL comprises three components: (1) $\mathcal{L}_{\mathrm{MSE}}$: characterizes the reconstruction error caused by the depth self-encoder network; $\mathcal{L}_{\mathrm{MSE}}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples; (2) $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$: characterizes the concentric loss of normal samples mapped outside the concentric inner boundary; the smaller $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$, the better; (3) $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$: characterizes the concentric loss of abnormal samples mapped within the concentric outer boundary; the smaller $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$, the better.
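Putting the pieces together, one optimization step on the objective of claim 5 might look as follows; this reuses the `DeepAutoEncoder` and `concentric_loss` sketches given after claims 2 and 3, and the radii, batch handling, and optimizer settings are illustrative assumptions.

```python
# Sketch of one SGD step on the DCL objective (claim 5): reconstruction loss
# over normal samples only, plus the weighted concentricity loss over all
# samples. `model` and `concentric_loss` come from the sketches above.
import torch

def dcl_training_step(model, optimizer, x, y, r1=1.0, r2=4.0):
    optimizer.zero_grad()
    z, x_hat = model(x)
    normal = y == 0
    recon = ((x[normal] - x_hat[normal]) ** 2).sum(dim=1).mean()  # L_MSE, normal only
    n_a = float(y.sum())                   # abnormal samples in the batch
    n_n = float(y.numel()) - n_a           # normal samples in the batch
    conc = concentric_loss(z, y.float(), r1, r2, lam_n=n_a, lam_a=n_n)
    loss = recon + conc                    # total loss = MSE loss + concentric loss
    loss.backward()
    optimizer.step()                       # stochastic gradient descent update
    return float(loss)
```

In practice the weight factors would more likely be computed once from the full training set rather than per batch; the per-batch counts above are a simplification for the sketch.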
CN202310667541.9A 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning Active CN116415201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310667541.9A CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310667541.9A CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Publications (2)

Publication Number Publication Date
CN116415201A true CN116415201A (en) 2023-07-11
CN116415201B CN116415201B (en) 2023-08-15

Family

ID=87059672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310667541.9A Active CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Country Status (1)

Country Link
CN (1) CN116415201B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117710118A (en) * 2023-12-28 2024-03-15 中保金服(深圳)科技有限公司 Intelligent claim settlement analysis method and system


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200294401A1 (en) * 2017-09-04 2020-09-17 Nng Software Developing And Commercial Llc. A Method and Apparatus for Collecting and Using Sensor Data from a Vehicle
CN113532866A (en) * 2020-04-16 2021-10-22 中国船舶重工集团公司第七一一研究所 Diesel engine abnormal state detection method and system and computer storage medium
CN112165464A (en) * 2020-09-15 2021-01-01 江南大学 Industrial control hybrid intrusion detection method based on deep learning
WO2022210281A1 (en) * 2021-04-01 2022-10-06 ジャパン マリンユナイテッド株式会社 Thrust clearance measuring device, thrust clearance measuring method, and marine vessel
CN113870264A (en) * 2021-12-02 2021-12-31 湖北全兴通管业有限公司 Tubular part port abnormity detection method and system based on image processing
CN115879505A (en) * 2022-11-15 2023-03-31 哈尔滨理工大学 Self-adaptive correlation perception unsupervised deep learning anomaly detection method
CN115905991A (en) * 2022-11-21 2023-04-04 国网河南省电力公司经济技术研究院 Time series data multivariate abnormal detection method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Gyeongsub Song et al.: "Fault Detection in Inertial Measurement Unit and Global Navigation Satellite System of an Unmanned Surface Vehicle", IEEE *
Feng Qiao: "Anomaly detection analysis of sensor network data based on hypersphere support vector machine", Microcomputer Applications, no. 10 *
Zhang Zhengyong, Zeng Qingwei: "Vibration monitoring and fault diagnosis of marine medium-voltage generator sets", Marine Electric &amp; Electrotechnology, no. 03 *

Also Published As

Publication number Publication date
CN116415201B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN103645249B (en) Online fault detection method for reduced set-based downsampling unbalance SVM (Support Vector Machine) transformer
CN110598851A (en) Time series data abnormity detection method fusing LSTM and GAN
CN110298235B (en) Hyperspectral anomaly detection method and system based on manifold constraint self-coding network
CN116415201B (en) Ship main power abnormality detection method based on deep concentric learning
CN113920400B (en) Metal surface defect detection method based on improvement YOLOv3
CN106326915B (en) A kind of Fault Diagnosis for Chemical Process method based on improvement core Fisher
Mechefske et al. Fault detection and diagnosis in low speed rolling element bearings Part II: The use of nearest neighbour classification
CN113949549B (en) Real-time traffic anomaly detection method for intrusion and attack defense
Song et al. Data and decision level fusion-based crack detection for compressor blade using acoustic and vibration signal
CN110458039A (en) A kind of construction method of industrial process fault diagnosis model and its application
CN115824519A (en) Valve leakage fault comprehensive diagnosis method based on multi-sensor information fusion
CN115052304A (en) GCN-LSTM-based industrial sensor network abnormal data detection method
CN116383747A (en) Anomaly detection method for generating countermeasure network based on multi-time scale depth convolution
CN107220475A (en) A kind of bearing features data analysing method based on linear discriminant analysis
CN116543538B (en) Internet of things fire-fighting electrical early warning method and early warning system
CN111474476B (en) Motor fault prediction method
CN113553319A (en) LOF outlier detection cleaning method, device and equipment based on information entropy weighting and storage medium
JP5178471B2 (en) Optimal partial waveform data generation apparatus and method, and rope state determination apparatus and method
CN117407816A (en) Multi-element time sequence anomaly detection method based on contrast learning
CN116842358A (en) Soft measurement modeling method based on multi-scale convolution and self-adaptive feature fusion
CN116611184A (en) Fault detection method, device and medium for gear box
CN114548555B (en) Axial flow compressor stall surge prediction method based on deep autoregressive network
CN115690677A (en) Abnormal behavior identification method based on local intuition fuzzy support vector machine
Zuo et al. Bearing fault dominant symptom parameters selection based on canonical discriminant analysis and false nearest neighbor using GA filtering signal
CN111695634A (en) Data abnormal mutation point detection algorithm based on limited accompanying censoring mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant