CN116415201A - Ship main power abnormality detection method based on deep concentric learning

Info

Publication number
CN116415201A
Authority
CN
China
Prior art keywords: concentric, learning, sample, abnormal, samples
Legal status: Granted
Application number
CN202310667541.9A
Other languages: Chinese (zh)
Other versions: CN116415201B (en)
Inventor
Zhong Baihong (钟百鸿)
Zhao Minghang (赵明航)
Zhong Shisheng (钟诗胜)
Fu Xuyun (付旭云)
Zhang Yongjian (张永健)
Current Assignee
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Application filed by Harbin Institute of Technology Weihai
Priority to CN202310667541.9A
Publication of CN116415201A
Application granted
Publication of CN116415201B
Status: Active

Classifications

    • G06F18/2433 - Pattern recognition; classification techniques; single-class perspective, e.g. one-against-all classification; novelty detection; outlier detection
    • G06F18/10 - Pattern recognition; pre-processing; data cleansing
    • G06F18/213 - Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/217 - Validation; performance evaluation; active pattern learning techniques
    • G06N3/0455 - Auto-encoder networks; encoder-decoder networks
    • G06N3/09 - Supervised learning
    • Y02T90/00 - Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation


Abstract

The invention relates to the technical field of ship propulsion power detection, in particular to a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.

Description

Ship main power abnormality detection method based on deep concentric learning
Technical Field
The invention relates to the technical field of ship propulsion power detection, in particular to a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.
Background
The ship is important strategic equipment for guaranteeing national defense safety, developing the national economy, and building a strong maritime nation. The ship main engine is the core equipment of the ship power system, and whether it can run reliably directly determines the navigation safety of the ship. According to civil ship statistics, mechanical faults account for 22% of the causes of marine accidents, ranking first among all causes; within mechanical failures, faults of the ship main engine account for 45% of accidents. An investigation of marine failure risk during 2015-2017 by the maritime insurer The Swedish Club shows that the share of claims attributable to marine mechanical equipment rose from 35% to 48% of all marine claims, with main engine failure claims accounting for about one third of all mechanical equipment claims. Therefore, detecting abnormal states of the ship main engine in time makes it possible to discover potential faults in advance, take corresponding maintenance measures, and ensure the navigation safety of the ship.
The sensor data collected from the ship main engine provide a large amount of information about its operating state. By performing anomaly detection on the main engine sensor data, the health state of the main engine can be grasped in a timely manner. Existing anomaly detection algorithms can be broadly divided into three categories according to label availability: supervised, semi-supervised, and unsupervised anomaly detection algorithms. In supervised anomaly detection, labels are necessary. A typical supervised-learning strategy is to train a binary classifier under label guidance to achieve anomaly detection; in this case, existing classification algorithms such as support vector machines and neural network classification models of various advanced structures can be designed as anomaly detectors. However, because the abnormal modes of the ship main engine state are numerous, a supervised anomaly detection method can effectively identify only those abnormal modes that have already appeared; it cannot effectively identify unknown abnormal modes, since a binary classifier trained under supervised learning cannot learn discriminative features of unknown abnormal samples. Further, the main engine state anomaly detection task often suffers from a severe class imbalance problem: the data set contains a large amount of state monitoring data from normal operation but very little abnormal-state data. In this case, many classifiers tend to learn the characteristics of the normal data and fail to capture the characteristics of the abnormal data effectively, so abnormal main engine states are misclassified as normal states.
Most existing anomaly detection algorithms still have limitations when facing ship main engine state monitoring data that are high-dimensional and structurally complex, with imbalanced classes, few abnormal samples, and many abnormal modes. In recent studies, deep learning methods have demonstrated excellent performance in handling anomalies in high-dimensional data with complex distributions. Among them, deep autoencoder networks (DAE) are often used for deep representation learning. Typically, the intermediate hidden-layer features of the DAE are used as a new latent representation for other anomaly detection techniques, or the latent representation is combined with other features to boost anomaly detection performance. The latent representation obtained by the DAE is typically low-dimensional and can assist a simple classifier in identifying anomalies. However, the distribution into which the DAE maps the input data in the latent space is generally of arbitrary shape, which can hardly be applied directly to the ship main engine state anomaly detection task and may also be detrimental to the stability of the anomaly detection algorithm. Another key issue is that most DAE-based representation learning methods are separated from the anomaly detector into two independent processes, which may produce sub-optimal or task-irrelevant representations. Therefore, there is a need to develop new representation learning methods to address the challenges of current ship main engine state anomaly detection.
Disclosure of Invention
Aiming at the defects and shortcomings in the prior art, the invention provides a ship main power abnormality detection method based on deep concentric learning, which can accurately detect the running state of a ship main engine.
The invention is achieved by the following measures:
the ship main power abnormality detection method based on deep concentric learning is characterized by comprising the following steps of:
step 1: data preprocessing: the monitoring data acquired from the ship main engine state sensors are divided into a training data set and a testing data set;
step 2: constructing a deep concentric learning DCL model and using it for representation learning; in the training stage, nonlinear feature learning is performed on the training data set using DCL, with reconstruction learning and concentric learning executed synchronously;
step 3: optimizing the network weights to obtain the optimized deep concentric learning DCL: first, the reconstruction loss and the concentric loss are calculated respectively; then the weights are optimized by stochastic gradient descent SGD to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to step 2;
step 4: anomaly detection: first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection result is output according to the anomaly score of the sample.
In step 1 of the invention, the monitoring data obtained from the ship main engine state sensors are divided into a training data set and a testing data set, wherein the training data set is used to train the model and the testing data set is used to evaluate the performance of the network.
The construction of the deep concentric learning DCL model in step 2 comprises the following steps:
step 2-1: reconstruction learning is used to obtain a low-dimensional latent representation and comprises an encoding stage and a decoding stage; in the encoding stage, the encoder encodes the input data and maps it into a low-dimensional latent space, obtaining the low-dimensional latent representation:

$$\mathbf{z} = f_e(\mathbf{x}; \mathbf{W}_e) \qquad (1)$$

where $f_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $\mathbf{x} \in \mathbb{R}^D$ is the encoder input data; $\mathbf{z} \in \mathbb{R}^d$ ($d \ll D$) is the encoder output data, a low-dimensional latent representation of the input data; and $\mathbf{W}_e$ is the learnable weight parameter of the encoder;

in the decoding stage, the decoder dominates and works to reconstruct the input data from the low-dimensional latent representation:

$$\hat{\mathbf{x}} = f_d(\mathbf{z}; \mathbf{W}_d) \qquad (2)$$

where $f_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{\mathbf{x}}$ is the reconstructed data; and $\mathbf{W}_d$ is the learnable weight parameter of the decoder; the weight parameters of the deep autoencoder network are obtained by minimizing the reconstruction error, with the mean square error MSE taken as the reconstruction loss function:

$$L_{rec}(\mathbf{W}) = \frac{1}{N} \sum_{i=1}^{N} \left\| \mathbf{x}_i - \hat{\mathbf{x}}_i \right\|_2^2 \qquad (3)$$

where $L_{rec}$ represents the reconstruction loss; $\mathbf{W} = \{\mathbf{W}_e, \mathbf{W}_d\}$ are the weight parameters that the deep autoencoder network learns through training; $\mathbf{x}_i$ is the encoder input data; and $\hat{\mathbf{x}}_i$ is the reconstructed data;
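For illustration, a minimal PyTorch sketch of the reconstruction-learning part (formulas (1)-(3)) is given below; the class name, layer sizes, and tensors are illustrative assumptions, not taken from the patent (the patent's actual layer stack, with BN and LeakyReLU, is reported later in the description):

```python
import torch
import torch.nn as nn

class DAE(nn.Module):
    """Sketch of a deep autoencoder for reconstruction learning."""
    def __init__(self, input_size: int, latent_dim: int = 2):
        super().__init__()
        # Encoder f_e: maps x in R^D to a low-dimensional latent z (Eq. 1)
        self.encoder = nn.Sequential(
            nn.Linear(input_size, 60), nn.LeakyReLU(),
            nn.Linear(60, latent_dim),
        )
        # Decoder f_d: reconstructs x_hat from z (Eq. 2)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 60), nn.LeakyReLU(),
            nn.Linear(60, input_size),
        )

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        x_hat = self.decoder(z)
        return z, x_hat

x = torch.randn(128, 84)                       # batch of 84-dimensional samples
model = DAE(input_size=84)
z, x_hat = model(x)
rec_loss = nn.functional.mse_loss(x_hat, x)    # MSE reconstruction loss (Eq. 3)
```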
step 2-2: constructing the concentric loss for guiding concentric learning; in the concentric latent space, given the concentric center $\mathbf{c}$, the distance from a low-dimensional latent representation $\mathbf{z}_i$ to the concentric center is measured by the Euclidean distance:

$$d_i(\mathbf{x}_i; \mathbf{W}) = \left\| f_e(\mathbf{x}_i; \mathbf{W}) - \mathbf{c} \right\|_2 \qquad (4)$$

where $\mathbf{W}$ is the learnable network weight parameter and $\mathbf{z}_i = f_e(\mathbf{x}_i; \mathbf{W})$ is the low-dimensional latent representation of the $i$-th input data $\mathbf{x}_i$ in the latent space; $d_i(\mathbf{x}_i; \mathbf{W})$ is abbreviated as $d_i$. With the concentric center $\mathbf{c}$ set by default to the origin of the latent space coordinates, the general form of the concentric loss is defined as:

$$L_{con} = \frac{1}{N} \sum_{i=1}^{N} \left[ (1 - y_i)\, \max(0,\; d_i - R_1)^2 + y_i\, \max(0,\; R_2 - d_i)^2 \right], \quad 0 < R_1 < R_2 \qquad (5)$$

where $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, with normal samples labeled 0 and abnormal samples labeled 1; and $R_1$ and $R_2$ denote the concentric inner boundary radius and the concentric outer boundary radius, respectively, with default settings $R_1 = 0.5$ and $R_2 = 1$;
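Under these definitions, formula (5) is a squared-hinge penalty on the latent distance. The following is a minimal sketch, assuming the max-squared form reconstructed above; the function name and tensors are illustrative:

```python
import torch

def concentric_loss(z, y, r1=0.5, r2=1.0):
    """Concentric loss (Eq. 5): normal samples (y=0) are pulled inside
    radius r1, abnormal samples (y=1) are pushed outside radius r2.
    The concentric center c defaults to the latent-space origin."""
    d = torch.linalg.norm(z, dim=1)                       # distance to center (Eq. 4)
    loss_normal = (1 - y) * torch.clamp(d - r1, min=0) ** 2
    loss_abnormal = y * torch.clamp(r2 - d, min=0) ** 2
    return (loss_normal + loss_abnormal).mean()

z = torch.randn(8, 2)                                     # latent representations
y = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1], dtype=torch.float32)
print(concentric_loss(z, y))
```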
when the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is:

$$L_{con}^{nor} = \frac{1}{N} \sum_{i:\, y_i = 0} \max(0,\; d_i - R_1)^2 \qquad (6)$$

as can be seen from the above, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $L_{con}^{nor} > 0$, and a large constraint is imposed on it, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $L_{con}^{nor} = 0$ and no additional constraint is imposed on it; minimizing $L_{con}^{nor}$ ensures that normal samples are mapped into the concentric inner boundary of latent-space radius $R_1$, i.e. the latent representations of normal samples are distributed within the concentric inner boundary; the gradient of $L_{con}^{nor}$ is:

$$\frac{\partial L_{con}^{nor}}{\partial \mathbf{W}} = \frac{2}{N} \sum_{i:\, y_i = 0} \max(0,\; d_i - R_1)\, \frac{\partial d_i}{\partial \mathbf{W}} \qquad (7)$$

where $\partial L_{con}^{nor} / \partial \mathbf{W}$ denotes the partial derivative of the concentric loss caused by normal samples with respect to the learnable network weight parameter $\mathbf{W}$; $\mathbf{x}_i$ denotes the $i$-th encoder input sample; $\mathbf{c}$ denotes the concentric center position; $d_i$ is the distance from the low-dimensional latent representation $\mathbf{z}_i$ corresponding to $\mathbf{x}_i$ to the concentric center $\mathbf{c}$; $\partial d_i / \partial \mathbf{W}$ denotes the partial derivative of $d_i$ with respect to $\mathbf{W}$; and $R_1$ denotes the concentric inner boundary radius;
when the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is:

$$L_{con}^{abn} = \frac{1}{N} \sum_{i:\, y_i = 1} \max(0,\; R_2 - d_i)^2 \qquad (8)$$

from the above, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $L_{con}^{abn} > 0$, and a constraint is imposed on it until it is pushed out of the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $L_{con}^{abn} = 0$ and no additional constraint is imposed on it; minimizing $L_{con}^{abn}$ ensures that abnormal samples are mapped outside the concentric outer boundary of latent-space radius $R_2$, i.e. the latent representations of abnormal samples are distributed outside the concentric outer boundary; the gradient of $L_{con}^{abn}$ is:

$$\frac{\partial L_{con}^{abn}}{\partial \mathbf{W}} = -\frac{2}{N} \sum_{i:\, y_i = 1} \max(0,\; R_2 - d_i)\, \frac{\partial d_i}{\partial \mathbf{W}} \qquad (9)$$

where $\partial L_{con}^{abn} / \partial \mathbf{W}$ denotes the partial derivative of the concentric loss caused by abnormal samples with respect to the learnable network weight parameter $\mathbf{W}$; $\mathbf{x}_i$ denotes the $i$-th encoder input sample; $\mathbf{c}$ denotes the concentric center position; $d_i$ is the distance from the low-dimensional latent representation $\mathbf{z}_i$ corresponding to $\mathbf{x}_i$ to the concentric center $\mathbf{c}$; $\partial d_i / \partial \mathbf{W}$ denotes the partial derivative of $d_i$ with respect to $\mathbf{W}$; and $R_2$ denotes the concentric outer boundary radius;
the concentric loss overcomes the adverse effect of class imbalance through two weight factors:

$$L_{con}^{w} = \frac{1}{N} \sum_{i=1}^{N} \left[ \lambda_{nor} (1 - y_i)\, \max(0,\; d_i - R_1)^2 + \lambda_{abn}\, y_i\, \max(0,\; R_2 - d_i)^2 \right] \qquad (10)$$

where $N_{nor}$ is the number of normal samples, $N_{abn}$ is the number of abnormal samples, and $N = N_{nor} + N_{abn}$; $\lambda_{nor}$ is the normal-sample weight factor and $\lambda_{abn}$ is the abnormal-sample weight factor; in actual use, $\lambda_{nor}$ is set to the number of abnormal samples, i.e. $\lambda_{nor} = N_{abn}$, and $\lambda_{abn}$ is set to the number of normal samples, i.e. $\lambda_{abn} = N_{nor}$;
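A hedged sketch of the class-balanced variant of formula (10), assuming the practical setting $\lambda_{nor} = N_{abn}$ and $\lambda_{abn} = N_{nor}$ described above; names are illustrative, and the sketch assumes both classes are present in the batch:

```python
import torch

def weighted_concentric_loss(z, y, r1=0.5, r2=1.0):
    """Class-balanced concentric loss (Eq. 10): the normal-sample term is
    weighted by the abnormal count and vice versa, so the rare class is
    not drowned out."""
    d = torch.linalg.norm(z, dim=1)
    n_abn = y.sum()
    n_nor = y.numel() - n_abn
    lam_nor, lam_abn = n_abn, n_nor           # swapped counts as weight factors
    loss = (lam_nor * (1 - y) * torch.clamp(d - r1, min=0) ** 2
            + lam_abn * y * torch.clamp(r2 - d, min=0) ** 2)
    return loss.mean()
```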
Step 2-3: setting an anomaly score and a concentric decision boundary: the anomaly score is used to measure the anomaly degree of the sample, the higher the anomaly score, the greater the anomaly probability of the sample, and in deep concentric learning DCL, the low-dimensional potential characterization obtained through characterization learning retains key information of the input data, and therefore, the anomaly score can be obtained by calculating the distance from the low-dimensional potential characterization to the concentric center, as shown in the following formula:
$$s_i = \left\| \mathbf{z}_i - \mathbf{c} \right\|_2 = d_i \qquad (11)$$

where $s_i$ denotes the anomaly score corresponding to the $i$-th sample;

after the anomaly score of each sample is obtained, an anomaly threshold is also required to identify whether each sample is a normal or an abnormal sample; in the concentric latent space, this anomaly threshold is referred to as the concentric decision boundary;

the concentric decision boundary can be obtained after training by the P-quantile method and is abbreviated $d_{th}$; the specific flow is as follows: the distances from all sample depth features to the concentric center are calculated according to formula (4); the distances of all samples to the concentric center are arranged from small to large as $D = \{ d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)} \}$; assuming the number of abnormal samples in the training set is $M$, the P-quantile method applied to $D$ yields the sample concentric distance $d_{th} = d_{(N-M)}$, i.e. the concentric distance of the sample at position $N - M$ in $D$;

after the anomaly score of each sample is obtained, samples whose anomaly score is higher than $d_{th}$ are regarded as abnormal samples, thereby realizing anomaly detection:

$$\hat{y}_i = \begin{cases} 0, & s_i \le d_{th} \\ 1, & s_i > d_{th} \end{cases} \qquad (12)$$

where $\hat{y}_i = 0$ indicates a normal sample and $\hat{y}_i = 1$ indicates an abnormal sample;
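A minimal sketch of the P-quantile concentric decision boundary and the decision rule of formulas (11)-(12), assuming the distances have already been computed; the helper names are illustrative:

```python
import torch

def concentric_decision_boundary(d_train: torch.Tensor, m_abnormal: int) -> float:
    """Sort all training distances ascending and take the value at
    position N - M (1-based), i.e. index N - M - 1 (0-based)."""
    d_sorted, _ = torch.sort(d_train)
    return d_sorted[d_sorted.numel() - m_abnormal - 1].item()

def detect(d_test: torch.Tensor, d_th: float) -> torch.Tensor:
    # the anomaly score is the distance itself; above threshold -> abnormal (1)
    return (d_test > d_th).long()
```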
step 2-4: constructing an optimization objective function: given a data set comprising
$N_{nor}$ normal samples and $N_{abn}$ abnormal samples, the optimization objective function of supervised deep concentric learning DCL training is constructed as:

$$L = L_{rec}^{nor} + L_{con}^{nor} + L_{con}^{abn} \qquad (13)$$

under the condition of considering class imbalance, the optimization objective function of supervised deep concentric learning DCL training is constructed as:

$$L = L_{rec}^{nor} + \lambda_{nor} L_{con}^{nor} + \lambda_{abn} L_{con}^{abn} \qquad (14)$$

the optimization objective function of deep concentric learning (DCL) includes three components: (1) $L_{rec}^{nor}$: characterizes the reconstruction error caused by the deep autoencoder network; $L_{rec}^{nor}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples; (2) $L_{con}^{nor}$: characterizes the concentric loss of normal samples mapped outside the concentric inner boundary; the smaller $L_{con}^{nor}$, the better; (3) $L_{con}^{abn}$: characterizes the concentric loss of abnormal samples mapped within the concentric outer boundary; the smaller $L_{con}^{abn}$, the better;
step 3: network weight optimization: first, the reconstruction loss and the concentric loss are calculated respectively; then the weights are optimized by stochastic gradient descent SGD to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to step 2;
step 4: anomaly detection: in the testing stage, first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection result is output according to the anomaly score of the sample.
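Putting the steps together, a minimal end-to-end training and scoring sketch is given below; it assumes the DAE model and the weighted_concentric_loss / concentric_decision_boundary helpers sketched earlier, the data tensors and abnormal proportion are placeholders, and the optimizer settings follow the hyperparameters reported later in the description:

```python
import torch

model = DAE(input_size=84)
opt = torch.optim.SGD(model.parameters(), lr=0.001,
                      momentum=0.9, weight_decay=0.0001)

x_train = torch.randn(512, 84)                 # placeholder training data
y_train = (torch.rand(512) < 0.1).float()      # ~10% abnormal, illustrative

for epoch in range(200):                       # set number of iterations
    z, x_hat = model(x_train)
    normal = y_train == 0
    # reconstruction loss over normal samples only (Eq. 13/14)
    rec_loss = torch.nn.functional.mse_loss(x_hat[normal], x_train[normal])
    con_loss = weighted_concentric_loss(z, y_train)
    loss = rec_loss + con_loss                 # joint minimization
    opt.zero_grad()
    loss.backward()
    opt.step()

# Testing stage: score test samples by distance to the concentric center
with torch.no_grad():
    z_train, _ = model(x_train)
    d_th = concentric_decision_boundary(
        torch.linalg.norm(z_train, dim=1), m_abnormal=int(y_train.sum()))
    z_test, _ = model(torch.randn(64, 84))     # placeholder test data
    scores = torch.linalg.norm(z_test, dim=1)  # anomaly scores (Eq. 11)
    pred = (scores > d_th).long()              # 1 = abnormal (Eq. 12)
```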
According to the invention, a new deep representation learning method, deep concentric learning (DCL), is constructed so that a new latent representation is learned to effectively separate samples of different classes and thereby improve ship main engine state anomaly detection performance; this addresses the problem that detection performance is unsatisfactory because the optimization objectives of traditional DAE-based representation learning and the anomaly detection task are inconsistent. DCL uses the DAE for representation learning, mapping high-dimensional, structurally complex ship main engine state input data into a low-dimensional latent representation through its strong nonlinear mapping capability. To ensure that the obtained low-dimensional latent representation is sufficiently discriminative to separate samples of different classes, a concentric loss is constructed, and training of the DAE is supervised jointly with the reconstruction loss. Considering the diversity of ship main engine state abnormal samples, in the latent space DCL maps normal samples of the main engine state inside the concentric inner boundary and abnormal samples outside the concentric outer boundary, which provides a larger mapping area for unknown abnormal samples of the ship main engine state; at the same time, DCL allows a significant concentric separation between the concentric inner and outer boundaries to separate the latent representations of different classes. Further, by setting two weight factors on the concentric loss to balance the concentric losses caused by normal and abnormal samples of the ship main engine state, DCL overcomes the class imbalance problem. DCL can not only be trained in an end-to-end manner, but the hyperparameters related to the concentric loss can also be dynamically adjusted during training (or preset). More importantly, DCL gives an interpretable anomaly score by measuring the distance from a sample to the concentric center in the latent space, thereby realizing the ship main engine state anomaly detection task.
Drawings
FIG. 1 is a block diagram of a deep concentric learning DCL model in accordance with the present invention.
FIG. 2 is a schematic diagram of concentric loss build-up in the present invention.
FIG. 3 is a flow chart of anomaly detection based on deep concentric learning DCL in the present invention.
Fig. 4 is an experimental result of 13 anomaly detection data sets in the embodiment of the present invention, wherein (a) in fig. 4 is an experimental result under an AUC-ROC evaluation index, and (b) in fig. 4 is an experimental result under an AUC-PR evaluation index.
Fig. 5 is a schematic diagram showing a change of a loss curve on different anomaly detection data sets according to an embodiment of the present invention, where (a) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD1, (b) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD2, (c) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD3, and (d) in fig. 5 is a loss curve of DCL on a test sample of an anomaly detection data set of AD 4.
FIG. 6 is a schematic diagram of a visual representation of the potential characterization of a test sample on different anomaly detection datasets in an embodiment of the present invention, where (a) in FIG. 6 is a visual representation of the potential characterization of an AD1 dataset test sample, (b) in FIG. 6 is a visual representation of the potential characterization of an AD2 dataset test sample, (c) in FIG. 6 is a visual representation of the potential characterization of an AD3 dataset test sample, (d) in FIG. 6 is a visual representation of the potential characterization of an AD4 dataset test sample, and (e) in FIG. 6 is a visual representation of the potential characterization of an AD5 dataset test sample, and (f) in FIG. 6 is a visual representation of the potential characterization of an AD6 dataset test sample.
Detailed Description
The invention will be further described with reference to the drawings and examples.
In order to obtain discriminative latent representations from the high-dimensional ship main engine state input data and so improve anomaly detection performance, the invention constructs a novel deep representation learning method, deep concentric learning (DCL). The flow of detecting abnormal states of the ship main engine with DCL is shown in fig. 3 and mainly comprises two stages: a training stage and a testing stage. The training stage optimizes the weights of the DCL, and the testing stage uses the optimized DCL to execute the ship main engine state anomaly detection task. The specific flow is as follows:
first, data preprocessing: the monitoring data acquired from the ship main engine state sensors are divided into a training data set and a test data set, where the training data set is used to train the model and the test data set is used to evaluate network performance; second, representation learning: in the training stage, nonlinear feature learning is performed on the training data set using DCL, with reconstruction learning and concentric learning executed synchronously;
third, network weight optimization: first, the reconstruction loss and the concentric loss are calculated respectively; the weights are then optimized by stochastic gradient descent (SGD) to jointly minimize the reconstruction loss and the concentric loss; finally, it is judged whether the number of iterations has reached the set number; if yes, the optimized DCL is obtained; if not, return to the second step;
fourth, anomaly detection: in the testing stage, first, the ship main engine state test data set is input into the optimized DCL to obtain low-dimensional latent representations; then, the distance from each low-dimensional latent representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship main engine state anomaly detection task is realized according to the anomaly score of the sample.
The overall framework of deep concentric learning (DCL) of the present invention is shown in fig. 1 and includes two important representation learning processes: reconstruction learning and concentric learning. Reconstruction learning is directed at learning a low-dimensional latent representation from the input data, while concentric learning is directed at promoting the obtained low-dimensional latent representation to be sufficiently discriminative for samples of different classes. Notably, in DCL the two processes of reconstruction learning and concentric learning are performed simultaneously; together they promote the distinguishability of samples between different classes, which is very promising for improving anomaly detection performance. In fig. 1, the upper part is reconstruction learning, where the obtained low-dimensional representation is of arbitrary shape and can hardly be used directly for the anomaly detection task; the lower part is deep concentric learning, in which the obtained low-dimensional representation is sufficiently discriminative to execute the anomaly detection task directly.
In reconstruction learning, a deep autoencoder is used, meaning that high-dimensional, structurally complex input data can be nonlinearly mapped into a low-dimensional latent representation, and the obtained latent representation retains as much of the key information in the input data as possible while the reconstruction loss is minimized. However, the low-dimensional latent representation obtained by reconstruction learning alone is of arbitrary shape, as shown in the upper part of fig. 1, and can hardly be used directly for anomaly detection; it may even impair anomaly detection performance. One straightforward reason is that the reconstruction loss is designed to compress the data dimensions, which is inconsistent with the optimization objective of the anomaly detection task.
Concentric learning solves the above problems of reconstruction learning by guiding the low-dimensional latent representations obtained by reconstruction learning to form a concentric distribution in the latent space, as shown in the lower part of fig. 1. Considering the diversity of abnormal samples, in the concentric latent space normal samples are guided to map inside the concentric inner boundary while abnormal samples are mapped outside the concentric outer boundary, which provides a larger mapping area for diversified, unpredictable abnormal patterns. Under the concentric distribution, normal and abnormal samples can be identified directly by measuring the distance of a sample from the concentric center. Therefore, to obtain a low-dimensional latent representation that can be used directly for anomaly detection, a loss that guides concentric learning must be designed, referred to herein as the concentric loss.
Reconstruction learning is used to obtain a low-dimensional latent representation, which includes two stages of encoding and decoding, as shown in the upper part of fig. 1.
In the encoding stage, the encoder encodes the input data and maps it to a low-dimensional latent space, obtaining the low-dimensional latent representation of formula (1): $\mathbf{z} = f_e(\mathbf{x}; \mathbf{W}_e)$, where $f_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $\mathbf{x} \in \mathbb{R}^D$ is the encoder input data; $\mathbf{z} \in \mathbb{R}^d$ is the encoder output data, a low-dimensional latent representation used to characterize the input data, typically with $d \ll D$; and $\mathbf{W}_e$ is the learnable weight parameter of the encoder.

In the decoding stage, the decoder dominates and works to reconstruct the input data from the low-dimensional latent representation, as in formula (2): $\hat{\mathbf{x}} = f_d(\mathbf{z}; \mathbf{W}_d)$, where $f_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{\mathbf{x}}$ is the reconstructed data; and $\mathbf{W}_d$ is the learnable weight parameter of the decoder. The weight parameters of the deep autoencoder network may be obtained by minimizing the reconstruction error, with the mean square error (MSE) taken as the reconstruction loss function, as in formula (3): $L_{rec}(\mathbf{W}) = \frac{1}{N} \sum_{i=1}^{N} \| \mathbf{x}_i - \hat{\mathbf{x}}_i \|_2^2$, where $L_{rec}$ represents the reconstruction loss, $\mathbf{W}$ the weight parameters that the deep autoencoder network learns through training, $\mathbf{x}_i$ the encoder input data, and $\hat{\mathbf{x}}_i$ the reconstructed data.
The concentric loss is used to guide concentric learning. The latent space obtained by conventional DAE-based representation learning is arbitrarily shaped, as shown in part (a) of fig. 2. To guide the low-dimensional latent representations into a concentric distribution in the latent space, the concentric loss not only applies a large penalty to normal-sample latent representations mapped outside the concentric inner boundary so that they fall within the concentric inner boundary, but also applies a large penalty to abnormal-sample latent representations mapped within the concentric outer boundary so that they fall outside the concentric outer boundary, as shown in part (b) of fig. 2. By minimizing the concentric loss, normal samples are forced to map inside the concentric inner boundary and abnormal samples outside the concentric outer boundary, which creates a significant concentric separation gap between the low-dimensional latent representations of normal and abnormal samples, as shown in part (c) of fig. 2.
In the concentric latent space, given the concentric center $\mathbf{c}$, the distance from a low-dimensional latent representation $\mathbf{z}_i$ to the concentric center may be measured using the Euclidean distance, as in formula (4): $d_i(\mathbf{x}_i; \mathbf{W}) = \| f_e(\mathbf{x}_i; \mathbf{W}) - \mathbf{c} \|_2$, where $\mathbf{W}$ is the learnable network weight parameter and $\mathbf{z}_i = f_e(\mathbf{x}_i; \mathbf{W})$ is the low-dimensional latent representation of the $i$-th input data $\mathbf{x}_i$ in the latent space; $d_i(\mathbf{x}_i; \mathbf{W})$ may be abbreviated as $d_i$. The invention sets the concentric center $\mathbf{c}$ by default to the origin of the latent space coordinates.
The general form of the concentric loss can be defined as in formula (5): $L_{con} = \frac{1}{N} \sum_{i=1}^{N} [ (1 - y_i) \max(0, d_i - R_1)^2 + y_i \max(0, R_2 - d_i)^2 ]$, where $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, with normal samples labeled 0 and abnormal samples labeled 1; and $R_1$ and $R_2$ denote the concentric inner boundary radius and concentric outer boundary radius respectively, generally with $0 < R_1 < R_2$. The invention's default setting is $R_1 = 0.5$ and $R_2 = 1$, which achieves desirable results in practice.
When the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is that of formula (6): $L_{con}^{nor} = \frac{1}{N} \sum_{i: y_i = 0} \max(0, d_i - R_1)^2$. As can be seen, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $L_{con}^{nor} > 0$, and a large constraint is imposed on it, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $L_{con}^{nor} = 0$ and no additional constraint is imposed on it. Minimizing $L_{con}^{nor}$ ensures that normal samples are mapped into the concentric inner boundary of latent-space radius $R_1$, i.e. the latent representations of normal samples are distributed within the concentric inner boundary. The gradient of $L_{con}^{nor}$ is given by formula (7): $\partial L_{con}^{nor} / \partial \mathbf{W} = \frac{2}{N} \sum_{i: y_i = 0} \max(0, d_i - R_1)\, \partial d_i / \partial \mathbf{W}$, where the symbols are as defined for formulas (4)-(6): $\mathbf{x}_i$ is the $i$-th encoder input sample, $\mathbf{c}$ the concentric center position, $d_i$ the distance from the latent representation $\mathbf{z}_i$ of $\mathbf{x}_i$ to $\mathbf{c}$, $\partial d_i / \partial \mathbf{W}$ the partial derivative of $d_i$ with respect to the learnable network weight parameter $\mathbf{W}$, and $R_1$ the concentric inner boundary radius.
When the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is that of formula (8): $L_{con}^{abn} = \frac{1}{N} \sum_{i: y_i = 1} \max(0, R_2 - d_i)^2$. As can be seen, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $L_{con}^{abn} > 0$, and a constraint is imposed on it until it is pushed out of the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $L_{con}^{abn} = 0$ and no additional constraint is imposed on it. Minimizing $L_{con}^{abn}$ ensures that abnormal samples are mapped outside the concentric outer boundary of latent-space radius $R_2$, i.e. the latent representations of abnormal samples are distributed outside the concentric outer boundary. The gradient of $L_{con}^{abn}$ is given by formula (9): $\partial L_{con}^{abn} / \partial \mathbf{W} = -\frac{2}{N} \sum_{i: y_i = 1} \max(0, R_2 - d_i)\, \partial d_i / \partial \mathbf{W}$, with the symbols as defined above and $R_2$ the concentric outer boundary radius.
consider that in most anomaly detection tasks, there is typically a problem of class imbalance, i.e., there are far more normal samples than anomalous samples. In such an unbalanced case, the deep learning method may have insufficient feature capturing ability for the abnormal sample, causing the abnormal sample to be misclassified as a normal sample. To address this problem, the concentricity penalty overcomes the adverse effects of class imbalance by two weight factors, as shown in the following equation:
$L_{con}^{w} = \frac{1}{N} \sum_{i=1}^{N} [ \lambda_{nor} (1 - y_i) \max(0, d_i - R_1)^2 + \lambda_{abn}\, y_i \max(0, R_2 - d_i)^2 ]$ (formula (10)), where $N_{nor}$ is the number of normal samples, $N_{abn}$ the number of abnormal samples, and $N = N_{nor} + N_{abn}$; $\lambda_{nor}$ is the normal-sample weight factor and $\lambda_{abn}$ the abnormal-sample weight factor. For ease of setting, in actual use $\lambda_{nor}$ can be set to the number of abnormal samples, i.e. $\lambda_{nor} = N_{abn}$, and $\lambda_{abn}$ can be set to the number of normal samples, i.e. $\lambda_{abn} = N_{nor}$.
The anomaly score is used to measure the degree of abnormality of a sample. The higher the anomaly score, the greater the probability that the sample is abnormal. In deep concentric learning (DCL), the low-dimensional latent representation obtained by representation learning preserves the key information of the input data. Thus, the anomaly score can be obtained by calculating the distance from the low-dimensional latent representation to the concentric center, as in formula (11): $s_i = \| \mathbf{z}_i - \mathbf{c} \|_2 = d_i$, where $s_i$ denotes the anomaly score of the $i$-th sample.

After the anomaly score of each sample is obtained, an anomaly threshold is also required to identify whether each sample is a normal or an abnormal sample. In the concentric latent space, this anomaly threshold is referred to as the concentric decision boundary. The concentric decision boundary can be obtained by the P-quantile method after training is finished and is abbreviated $d_{th}$. The specific flow is as follows: calculate the distances from all sample depth features to the concentric center according to formula (4); arrange the distances of all samples to the concentric center from small to large as $D = \{ d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)} \}$; assuming the number of abnormal samples in the training set is $M$, the P-quantile method applied to $D$ gives the sample concentric distance $d_{th} = d_{(N-M)}$, i.e. the concentric distance of the sample at position $N - M$ in $D$.

After the anomaly score of each sample is obtained, samples whose anomaly score is higher than $d_{th}$ are regarded as abnormal samples, thereby realizing anomaly detection as in formula (12): $\hat{y}_i = 0$ when $s_i \le d_{th}$ indicates a normal sample, and $\hat{y}_i = 1$ when $s_i > d_{th}$ indicates an abnormal sample.
Given a data set comprising $N_{nor}$ normal samples and $N_{abn}$ abnormal samples, the optimization objective function for supervised deep concentric learning (DCL) training is constructed as in formula (13): $L = L_{rec}^{nor} + L_{con}^{nor} + L_{con}^{abn}$. Taking class imbalance into account, the optimization objective function is constructed as in formula (14): $L = L_{rec}^{nor} + \lambda_{nor} L_{con}^{nor} + \lambda_{abn} L_{con}^{abn}$.

The optimization objective function of deep concentric learning (DCL) includes three components:

(1) $L_{rec}^{nor}$: characterizes the reconstruction error caused by the deep autoencoder network. For the obtained low-dimensional latent representation to preserve the key information in the input data well, the smaller the reconstruction error, the better. In practice, normal samples are expected to be well reconstructed while abnormal samples are not; thus $L_{rec}^{nor}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples.

(2) $L_{con}^{nor}$: characterizes the concentric loss resulting from normal samples being mapped outside the concentric inner boundary. For the latent representations of normal samples to be distributed within the concentric inner boundary, the smaller $L_{con}^{nor}$, the better.

(3) $L_{con}^{abn}$: characterizes the concentric loss resulting from abnormal samples being mapped within the concentric outer boundary. For the low-dimensional latent representations of abnormal samples to be distributed outside the concentric outer boundary, the smaller $L_{con}^{abn}$, the better.

By minimizing $L$, a latent space in which the low-dimensional latent representations exhibit a concentric distribution can be guaranteed. In this latent space, the low-dimensional latent representations not only preserve well the important information contained in the normal-sample input data, but normal samples are also mapped inside the concentric inner boundary and abnormal samples outside the concentric outer boundary.
The present invention uses an 84-dimensional feature vector as input.
The data set details are shown in table 1.
Gaussian white noise of 60 dB was added to the dataset, which contains 3500 samples covering 14 health states: normal (no fault), intake manifold pressure reduction, compression ratio reduction in cylinders No. 1 to No. 6, and fuel injection quantity reduction in cylinders No. 1 to No. 6, with 250 samples per health state.
Except for the normal state, all other states are abnormal states.
The invention combines the normal state with each of the remaining abnormal states in turn, forming 13 anomaly detection data sets denoted AD1, AD2, ..., AD13 (see below for a construction sketch). Each dataset is randomly divided into a training data set and a test data set at a ratio of 7:3.
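A sketch of this dataset construction under stated assumptions (random placeholder features standing in for the real monitoring data; scikit-learn's train_test_split for the 7:3 division; variable names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: 3500 x 84 feature matrix; state labels 0..13, where 0 is normal.
X = np.random.randn(3500, 84)
state = np.repeat(np.arange(14), 250)

datasets = {}
for k in range(1, 14):                        # AD1 ... AD13
    mask = (state == 0) | (state == k)        # normal + one abnormal state
    X_k = X[mask]
    y_k = (state[mask] != 0).astype(int)      # 0 = normal, 1 = abnormal
    datasets[f"AD{k}"] = train_test_split(
        X_k, y_k, test_size=0.3, random_state=0, stratify=y_k)
```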
DCL mainly introduces concentric learning (i.e. joint supervision with the concentric loss) on top of DAE-based representation learning, and overcomes the class imbalance problem through the two weight factors in the concentric loss. The invention uses three methods, DAE, DCL_N, and DCL, in comparative experiments. DAE: only the mean square error loss is used as the optimization target; DCL_N: the mean square error loss and a concentric loss that does not consider class imbalance are used as optimization targets; DCL: the mean square error loss and the concentric loss that considers class imbalance are used as optimization targets.
TABLE 1 Ship main engine health state types and corresponding sample numbers
State No. 0: normal (no fault) - 250 samples
State No. 1: intake manifold pressure reduction - 250 samples
States No. 2-7: compression ratio reduction, cylinders No. 1-6 - 250 samples each
States No. 8-13: fuel injection quantity reduction, cylinders No. 1-6 - 250 samples each
Total: 14 health states, 3500 samples
The hyperparameters related to the network structure follow the prior art and are set to FC(input_size, 60, BN, LeakyReLU) - FC(60, 30, BN, LeakyReLU) - FC(30, 10, BN, LeakyReLU) - FC(10, 2, None, None) - FC(2, 10, BN, LeakyReLU) - FC(10, 30, BN, LeakyReLU) - FC(30, 60, BN, LeakyReLU) - FC(60, output_size, None, None). Here FC denotes a fully connected layer, BN a batch normalization layer, and LeakyReLU a LeakyReLU activation function layer; input_size and output_size are the feature dimensions of the DCL input and output vectors respectively, and are equal to each other. The parameters in brackets denote, respectively, the FC input neuron count, the output neuron count, the batch normalization following the FC, and the activation function following the BN. It can be seen that the encoder output is 2-dimensional, i.e. the latent representation dimension is 2. The hyperparameters for network weight optimization follow the existing literature and are set as follows: the number of iterations is 200; an SGD optimizer is adopted with momentum 0.9 and weight decay 0.0001; the initial learning rate is 0.001 and then decays by a factor of 0.1 every 66 iterations; the batch size is set to 128.
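For illustration, the FC stack described above could be written in PyTorch roughly as follows (a sketch; the fc_block helper and variable names are assumptions, not from the patent):

```python
import torch
import torch.nn as nn

def fc_block(n_in, n_out, bn=True, act=True):
    # FC(n_in, n_out, BN?, LeakyReLU?) following the notation above
    layers = [nn.Linear(n_in, n_out)]
    if bn:
        layers.append(nn.BatchNorm1d(n_out))
    if act:
        layers.append(nn.LeakyReLU())
    return layers

input_size = output_size = 84
encoder = nn.Sequential(
    *fc_block(input_size, 60), *fc_block(60, 30), *fc_block(30, 10),
    *fc_block(10, 2, bn=False, act=False),     # 2-D latent representation
)
decoder = nn.Sequential(
    *fc_block(2, 10), *fc_block(10, 30), *fc_block(30, 60),
    *fc_block(60, output_size, bn=False, act=False),
)

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=66, gamma=0.1)
```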
The present invention uses two commonly used evaluation indices, AUC-ROC (Area Under the Receiver Operating Characteristic Curve) and AUC-PR (Area Under the Precision-Recall Curve), to comprehensively evaluate the detection performance of the considered methods. AUC-ROC focuses on the relationship between two indicators, the true positive rate (TPR) and the false positive rate (FPR): the model performs best when AUC-ROC = 1, while AUC-ROC = 0.5 means the model has no discriminating ability. AUC-PR focuses on the relationship between precision and recall rather than the true and false positive rates. The higher the AUC-PR, the better the detection performance; the model performs best when AUC-PR = 1.
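A small sketch of computing the two indices with scikit-learn, assuming 0/1 labels and distance-based anomaly scores; note that average_precision_score is a common estimate of AUC-PR, an assumption rather than the patent's stated implementation:

```python
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = [0, 0, 0, 1, 1]            # 0 = normal, 1 = abnormal (illustrative)
scores = [0.2, 0.4, 0.3, 1.3, 0.9]  # distance-based anomaly scores
auc_roc = roc_auc_score(y_true, scores)
auc_pr = average_precision_score(y_true, scores)
print(f"AUC-ROC={auc_roc:.3f}, AUC-PR={auc_pr:.3f}")
```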
To verify the validity of the algorithm, multiple sets of experiments were organized. First, the performance of the constructed method is evaluated on the 13 anomaly detection data sets; then, the detection performance of the constructed method under insufficient abnormal samples is explored by changing the proportion of abnormal samples in the training data set; finally, convergence analysis is performed on the constructed method. The DCL constructed by the invention was implemented in PyTorch 1.8.0, and all experiments were performed on a computer configured with an Intel(R) Core(TM) i9-9900K CPU @ 3.60 GHz.
The performance on the different data sets is compared as follows: Table 2 gives the performance of the considered methods on the 13 anomaly detection datasets.
To avoid the influence of random factors, each result is the average of 5 repeated trials. From the experimental results in Table 2, the following conclusions can be drawn.
(1) The performance of DCL_N is better than that of DAE on both AUC-ROC and AUC-PR. More specifically, DCL_N achieved a 0.89% improvement over DAE in average AUC-ROC and a 0.76% improvement in average AUC-PR. DCL_N differs from DAE in that DCL_N uses the concentric loss; its better detection performance indicates that the concentric loss enables DCL_N to obtain a discriminative latent representation that separates normal and abnormal samples.
(2) Of the three methods considered, DCL gave the best detection performance of 100% in terms of both average AUC-ROC and average AUC-PR. In contrast to DCL_N, DCL uses the concentric loss that accounts for class imbalance. Although the ratio of normal to abnormal samples is the same in the 13 data sets considered, the two weight factors in the concentric loss used by DCL produce a greater loss and play a stronger regularization role, thus achieving a performance improvement over DCL_N.
The influence of different training anomaly ratios (namely under the conditions of insufficient anomaly samples and unbalanced categories) on the DCL detection performance is examined below. Here, the training anomaly ratio refers to a ratio of an anomaly sample used in the training process to an anomaly sample in the training data set. In training, experimental verification was performed by varying the proportion of abnormal samples on the training dataset.
Fig. 4 shows the experimental results of the considered methods on the 13 anomaly detection datasets. Each result is the average test performance of each method over 5 repeated trials on each dataset. It can be seen that as the training anomaly proportion increases, the performances of DAE, DCL_N, and DCL show a gradually increasing trend, which indicates that appropriately adding abnormal samples to the training data set helps to learn the discriminative information between normal and abnormal samples and improves the separability of samples between different classes. However, when the training anomaly proportion is small, the detection performance of DAE is relatively poor, while DCL_N and DCL maintain higher detection performance, indicating that the constructed method still has excellent feature learning capability under class imbalance; DCL in particular, since it considers the class imbalance problem, achieves better detection performance than DCL_N. The performance improvement obtained by DCL suggests that it is necessary to consider the class imbalance problem in anomaly detection tasks, and the concentric loss can prompt representation learning to obtain more discriminative latent representations to distinguish samples between different classes.
TABLE 2 Detection performance (%) of the different methods on the test sample sets (table image not reproduced)
In DCL, two losses are used: the reconstruction loss (MSE loss) and the concentric loss, whose sum constitutes the total loss. Fig. 5 shows the loss curves of DCL on the test samples of the AD1, AD2, AD3, and AD4 anomaly detection datasets. It can be seen that the total loss gradually converges to a small value as the network weights are optimized. The total loss is initially large, mainly due to the concentric loss; thereafter, as the concentric loss decreases, the reconstruction loss dominates the total loss. Observing the change of the loss curves on the different data sets shows that the concentric loss has good convergence.
To further examine whether the latent representations obtained by DCL are sufficiently discriminative to distinguish samples between different classes, fig. 6 shows the latent representation visualization results of DCL on the test samples of the AD1 through AD6 datasets. It can be seen that, on the different data sets, the latent representations of the vast majority of normal samples fall within the concentric inner boundary (i.e. the concentric circle of radius R1) and the latent representations of the vast majority of abnormal samples fall outside the concentric outer boundary (i.e. the concentric circle of radius R2), with significant separation between normal and abnormal samples. The visualization results on the different data sets indicate that the latent representations obtained by DCL carry sufficient discriminative information to separate normal and abnormal samples.
The invention constructs a novel deep characterization learning method, namely Deep Concentric Learning (DCL), which promotes learning of effective potential characterization to improve the state abnormality detection performance of the ship host. The main novelty of DCL is that a concentric loss-inducing potential representation is constructed to form a concentric potential space. In this concentric potential space, normal samples are mapped inside the concentric inner boundary and abnormal samples are mapped outside the concentric outer boundary. And the normal sample and the abnormal sample can be well separated by setting the concentric inner boundary radius and the concentric outer boundary radius to form a concentric interval in the training process, so that the separability among different categories is remarkably improved.
The effectiveness of the DCL in improving the abnormality detection performance is verified through experimental results on a ship host data set. Specifically, dcl_n achieves a detection performance improvement of 0.89%, 0.76% for the mean AUC-ROC and the mean AUC-PR, respectively, over the 13 anomaly detection datasets considered, compared to the traditional DAE-based approach; further, among the 3 methods considered (i.e., DAE, dcl_ N, DCL), DCL that considers the class imbalance problem achieves the highest detection performance of 100%. At different training anomaly ratios, both dcl_n and DCL achieved better detection performance than DAE. The performance improvement obtained by the DCL shows that the problem of unbalanced category is necessary to be considered in the abnormal detection task, and the DCL can obviously promote the characteristic learning to obtain the discriminative potential characteristic so as to improve the abnormal detection performance of the ship host state.
Finally, DCL can be applied not only to the task of detecting abnormal states of the ship host, but also to other pattern recognition tasks that require improved separability between different classes.

Claims (5)

1. A ship main power abnormality detection method based on deep concentric learning, characterized by comprising the following steps:
step 1: data preprocessing: monitoring data acquired from the ship host state sensors are divided into a training data set and a testing data set;
step 2: constructing a deep concentric learning DCL model and performing representation learning with it: in the training stage, the DCL performs nonlinear feature learning on the training data set, executing reconstruction learning and concentric learning synchronously;
step 3: optimizing the network weights to obtain the optimized deep concentric learning DCL: first, the reconstruction loss and the concentric loss are calculated separately; then the weights are optimized by stochastic gradient descent SGD so as to minimize the reconstruction loss and the concentric loss; finally, it is judged whether the set number of iterations has been reached; if yes, the optimized DCL is obtained; if not, the method returns to step 2;
step 4: detecting the abnormal state: first, the ship host state test data set is input into the optimized DCL to obtain the low-dimensional potential representations; then, the distance from each low-dimensional potential representation to the concentric center is calculated as the anomaly score of the test sample; finally, the ship host state abnormality detection result is output according to the anomaly scores of the samples.
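As a reading aid, the detection stage of step 4 can be sketched as follows; `encode` stands for the trained DCL encoder, the concentric center is assumed to be the latent-space origin (the default in claim 3), and `d_p` is the decision boundary obtained as in claim 4. All names are illustrative, not part of the claim language.

```python
# Sketch of claim 1, step 4: encode test data, score by distance to the
# concentric center, and threshold the scores with the decision boundary d_p.
import numpy as np

def detect_anomalies(x_test: np.ndarray, encode, d_p: float) -> np.ndarray:
    z = encode(x_test)                     # low-dimensional potential representations
    scores = np.linalg.norm(z, axis=1)     # distance to concentric center = anomaly score
    return (scores > d_p).astype(int)      # 1 = abnormal host state, 0 = normal
```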
2. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 1, wherein the construction of the deep concentric learning DCL model in step 2 comprises the following steps:
step 2-1: reconstruction learning is used to obtain a low-dimensional potential representation and comprises two stages, encoding and decoding; in the encoding stage, the encoder encodes the input data and maps it into a low-dimensional potential space, obtaining the low-dimensional potential representation, as shown in the following formula:

$$Z = \phi_e(X;\, W_e) \tag{1}$$

wherein $\phi_e(\cdot)$ is the encoder feature mapping function, composed of nonlinear transformation functions; $X$ is the encoder input data; $Z$ is the encoder output data, a low-dimensional potential representation of the input data, $Z \in \mathbb{R}^{N \times d}$ with $d$ smaller than the input dimension; $W_e$ is a learnable weight parameter of the encoder;
in the decoding stage, the decoder dominates, working to reconstruct the low-dimensional potential representation back into the input data, as shown in the following formula:

$$\hat{X} = \phi_d(Z;\, W_d) \tag{2}$$

wherein $\phi_d(\cdot)$ is the decoder feature mapping function, composed of nonlinear transformation functions; $\hat{X}$ is the reconstruction data; $W_d$ is a learnable weight parameter of the decoder; the weight parameters of the depth self-encoder network are obtained by minimizing the reconstruction error, with the mean square error MSE taken as the reconstruction loss function, as follows:

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left\| x_i - \hat{x}_i \right\|^2 \tag{3}$$

wherein $\mathcal{L}_{\mathrm{MSE}}$ denotes the reconstruction loss; $W = \{W_e, W_d\}$ are the weight parameters that the depth self-encoder network learns through training;
step 2-2: constructing the concentricity loss for guiding concentric learning;
step 2-3: setting the anomaly score and the concentric decision boundary;
step 2-4: constructing the optimization objective function.
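A minimal sketch of the reconstruction-learning stage of step 2-1, assuming a fully connected depth self-encoder implemented in PyTorch; the layer widths and latent dimension are illustrative choices, not values specified in the claim.

```python
# Sketch of step 2-1: encoder phi_e maps input X to latent Z (formula (1)),
# decoder phi_d reconstructs X_hat from Z (formula (2)), and the MSE between
# X and X_hat serves as the reconstruction loss (formula (3)).
import torch
import torch.nn as nn

class DeepAutoEncoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(       # phi_e: nonlinear transformations
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(       # phi_d: nonlinear transformations
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, in_dim),
        )

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)                 # low-dimensional potential representation
        x_hat = self.decoder(z)             # reconstruction of the input
        return z, x_hat
```

In the training stage, the latent representation z feeds the concentric learning branch while x_hat feeds the reconstruction loss, so the two learning tasks share a single encoder.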
3. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 2, wherein in the concentric potential space in step 2-2, given a concentric center $c$, the distance from a low-dimensional potential representation $z_i$ to the concentric center is measured by the Euclidean distance, as follows:

$$d_i = d(z_i, c) = \left\| z_i - c \right\|_2 \tag{4}$$

wherein $z_i$ is the low-dimensional potential representation obtained by mapping the $i$-th input data $x_i$ into the potential space, and $d(z_i, c)$ is abbreviated as $d_i$; the concentric center $c$ is set by default to the coordinate origin of the potential space; the general form of the concentricity loss is defined as follows:

$$\mathcal{L}_{\mathrm{C}} = \frac{1}{N}\sum_{i=1}^{N}\Big[(1-y_i)\,\ell_{\mathrm{n}}(d_i) + y_i\,\ell_{\mathrm{a}}(d_i)\Big] \tag{5}$$

$$\ell_{\mathrm{n}}(d_i) = \max\left(0,\; d_i - R_1\right)^2, \qquad \ell_{\mathrm{a}}(d_i) = \max\left(0,\; R_2 - d_i\right)^2$$

wherein $N$ is the number of samples; $y_i$ is the label of the $i$-th sample, a normal sample being labeled 0 and an abnormal sample being labeled 1; $R_1$ and $R_2$ denote the concentric inner boundary radius and the concentric outer boundary radius, respectively, with $R_1 < R_2$;
considering the case where the input data is a normal sample, i.e. $y_i = 0$, the concentric loss is:

$$\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} = \frac{1}{N}\sum_{i:\, y_i = 0} \max\left(0,\; d_i - R_1\right)^2 \tag{6}$$

As can be seen from the above formula, if a normal sample is mapped outside the concentric inner boundary, i.e. $d_i > R_1$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} > 0$, and a large constraint is imposed on the sample, drawing it toward the concentric center until it falls within the concentric inner boundary; if a normal sample is mapped within the concentric inner boundary, i.e. $d_i \le R_1$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} = 0$, and no additional constraint needs to be imposed; minimizing $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$ therefore ensures that normal samples are mapped into the concentric inner boundary with potential-space radius $R_1$, i.e. that the potential representations of normal samples are distributed within the concentric inner boundary; the gradient of $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$ is computed as shown in the following formula:

$$\frac{\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{n}}}{\partial W} = \frac{2}{N}\sum_{i:\, y_i = 0} \max\left(0,\; d_i - R_1\right)\,\frac{\partial\, d\!\left(\phi_e(x_i; W),\, c\right)}{\partial W} \tag{7}$$

wherein $\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{n}} / \partial W$ denotes the partial derivative of the concentric loss caused by normal samples with respect to the learnable network weight parameters $W$; $x_i$ denotes the $i$-th encoder input sample; $c$ denotes the concentric center position; $d_i$ denotes the distance from the low-dimensional potential representation $z_i$ corresponding to $x_i$ to the concentric center $c$; $\partial d_i / \partial W$ denotes the partial derivative of $d_i$ with respect to the learnable network weight parameters $W$; $R_1$ denotes the concentric inner boundary radius;
considering the case where the input data is an abnormal sample, i.e. $y_i = 1$, the concentric loss is:

$$\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} = \frac{1}{N}\sum_{i:\, y_i = 1} \max\left(0,\; R_2 - d_i\right)^2 \tag{8}$$

From the above formula, if an abnormal sample is mapped within the concentric outer boundary, i.e. $d_i < R_2$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} > 0$, and a constraint is imposed on the sample until it is pushed outside the concentric outer boundary; if an abnormal sample is mapped outside the concentric outer boundary, i.e. $d_i \ge R_2$, then $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} = 0$, and no additional constraint is imposed; minimizing $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$ ensures that abnormal samples are mapped outside the concentric outer boundary with potential-space radius $R_2$, i.e. that the potential representations of abnormal samples are distributed outside the concentric outer boundary; the gradient of $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$ is computed as shown in the following formula:

$$\frac{\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{a}}}{\partial W} = -\frac{2}{N}\sum_{i:\, y_i = 1} \max\left(0,\; R_2 - d_i\right)\,\frac{\partial\, d\!\left(\phi_e(x_i; W),\, c\right)}{\partial W} \tag{9}$$

wherein $\partial \mathcal{L}_{\mathrm{C}}^{\mathrm{a}} / \partial W$ denotes the partial derivative of the concentric loss caused by abnormal samples with respect to the learnable network weight parameters $W$; the remaining symbols are as defined for formula (7); $R_2$ denotes the concentric outer boundary radius;
the concentricity loss overcomes the adverse effect caused by class imbalance through two weight factors, as shown in the following formula:

$$\mathcal{L}_{\mathrm{C}} = \frac{1}{N}\sum_{i=1}^{N}\Big[\lambda_{\mathrm{n}}\,(1-y_i)\,\max\left(0,\; d_i - R_1\right)^2 + \lambda_{\mathrm{a}}\, y_i\,\max\left(0,\; R_2 - d_i\right)^2\Big] \tag{10}$$

wherein $N_{\mathrm{n}}$ is the number of normal samples and $N_{\mathrm{a}}$ is the number of abnormal samples, with $N = N_{\mathrm{n}} + N_{\mathrm{a}}$; $\lambda_{\mathrm{n}}$ is the normal sample weight factor and $\lambda_{\mathrm{a}}$ is the abnormal sample weight factor; in actual use, $\lambda_{\mathrm{n}}$ is set to the number of abnormal samples, i.e. $\lambda_{\mathrm{n}} = N_{\mathrm{a}}$, and $\lambda_{\mathrm{a}}$ is set to the number of normal samples, i.e. $\lambda_{\mathrm{a}} = N_{\mathrm{n}}$.
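The weighted concentricity loss of claim 3 might be implemented as sketched below; the squared-hinge form follows formulas (5)-(10) as reconstructed above, and the PyTorch-specific details are assumptions.

```python
# Sketch of the class-imbalance-weighted concentricity loss (formula (10)):
# normal latents (y=0) are penalized outside radius R1, abnormal latents (y=1)
# are penalized inside radius R2; lam_n = N_a and lam_a = N_n.
import torch

def concentric_loss(z, y, r1, r2, lam_n, lam_a, center=None):
    if center is None:
        center = torch.zeros(z.shape[1], device=z.device)    # default: origin
    d = torch.norm(z - center, dim=1)                        # d_i, formula (4)
    loss_n = (1.0 - y) * torch.clamp(d - r1, min=0.0) ** 2   # normal-sample term
    loss_a = y * torch.clamp(r2 - d, min=0.0) ** 2           # abnormal-sample term
    return (lam_n * loss_n + lam_a * loss_a).mean()
```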
4. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 3, wherein the anomaly score in step 2-3 is obtained by calculating the distance from the low-dimensional potential representation to the concentric center, as shown in the following formula:

$$s_i = d(z_i, c) = \left\| z_i - c \right\|_2 \tag{11}$$

wherein $s_i$ denotes the anomaly score corresponding to the $i$-th sample;
the concentric decision boundary, abbreviated $d_P$, is obtained by the P-quantile method after training is finished; the specific flow is as follows: the distances between the depth features of all samples and the concentric center are calculated according to formula (4); all samples are arranged in ascending order of their distance to the concentric center, giving $D = \{d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)}\}$; assuming that the number of abnormal samples in the training set is $M$, the P-quantile method takes the sample concentric distance at position $N - M$ in $D$ as the decision boundary, i.e.:

$$d_P = d_{(N-M)} \tag{12}$$

after the anomaly score of each sample is obtained, a sample whose anomaly score is higher than $d_P$ is regarded as an abnormal sample, thereby realizing anomaly detection, as shown in the following formula:

$$\hat{y}_i = \begin{cases} 0\ (\text{normal sample}), & s_i \le d_P \\ 1\ (\text{abnormal sample}), & s_i > d_P \end{cases} \tag{13}$$
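A sketch of the P-quantile decision boundary of claim 4; `distances` is assumed to hold the concentric distances of all N training samples, M of which are abnormal, and the choice of the (N-M)-th order statistic follows the reconstruction above.

```python
# Sketch of claim 4: the decision boundary d_P is the sample concentric
# distance at position N-M in the ascending arrangement D, and samples whose
# anomaly score exceeds d_P are flagged as abnormal.
import numpy as np

def p_quantile_threshold(distances: np.ndarray, n_abnormal: int) -> float:
    d_sorted = np.sort(distances)                            # D = {d(1) <= ... <= d(N)}
    return float(d_sorted[len(d_sorted) - n_abnormal - 1])   # 0-based index N-M-1

def detect(scores: np.ndarray, d_p: float) -> np.ndarray:
    return (scores > d_p).astype(int)                        # 1 = abnormal, 0 = normal
```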
5. The method for detecting abnormal main power of a ship based on deep concentric learning according to claim 2, wherein step 2-4 specifically comprises the following steps:
given a data set comprising $N_{\mathrm{n}}$ normal samples and $N_{\mathrm{a}}$ abnormal samples, and taking class imbalance into account, the optimization objective function for supervised deep concentric learning DCL training is constructed as follows:

$$\min_{W}\ \mathcal{L} = \mathcal{L}_{\mathrm{MSE}} + \lambda_{\mathrm{n}}\,\mathcal{L}_{\mathrm{C}}^{\mathrm{n}} + \lambda_{\mathrm{a}}\,\mathcal{L}_{\mathrm{C}}^{\mathrm{a}} \tag{14}$$

wherein the weight factors $\lambda_{\mathrm{n}} = N_{\mathrm{a}}$ and $\lambda_{\mathrm{a}} = N_{\mathrm{n}}$ compensate for class imbalance as in formula (10); the optimization objective function of deep concentric learning DCL comprises three components: (1) $\mathcal{L}_{\mathrm{MSE}}$: characterizes the reconstruction error caused by the depth self-encoder network; $\mathcal{L}_{\mathrm{MSE}}$ considers only the reconstruction loss generated by normal samples, excluding that generated by abnormal samples; (2) $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$: characterizes the concentric loss of normal samples mapped outside the concentric inner boundary; the smaller $\mathcal{L}_{\mathrm{C}}^{\mathrm{n}}$, the better; (3) $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$: characterizes the concentric loss of abnormal samples mapped within the concentric outer boundary; the smaller $\mathcal{L}_{\mathrm{C}}^{\mathrm{a}}$, the better.
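Putting the pieces together, one optimization step on the objective of claim 5 might look as follows; this reuses the `DeepAutoEncoder` and `concentric_loss` sketches given after claims 2 and 3, and the radii, batch handling, and optimizer settings are illustrative assumptions.

```python
# Sketch of one SGD step on the DCL objective (claim 5): reconstruction loss
# over normal samples only, plus the weighted concentricity loss over all
# samples. `model` and `concentric_loss` come from the sketches above.
import torch

def dcl_training_step(model, optimizer, x, y, r1=1.0, r2=4.0):
    optimizer.zero_grad()
    z, x_hat = model(x)
    normal = y == 0
    recon = ((x[normal] - x_hat[normal]) ** 2).sum(dim=1).mean()  # L_MSE, normal only
    n_a = float(y.sum())                   # abnormal samples in the batch
    n_n = float(y.numel()) - n_a           # normal samples in the batch
    conc = concentric_loss(z, y.float(), r1, r2, lam_n=n_a, lam_a=n_n)
    loss = recon + conc                    # total loss = MSE loss + concentric loss
    loss.backward()
    optimizer.step()                       # stochastic gradient descent update
    return float(loss)
```

In practice the weight factors would more likely be computed once from the full training set rather than per batch; the per-batch counts above are a simplification for the sketch.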
CN202310667541.9A 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning Active CN116415201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310667541.9A CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310667541.9A CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Publications (2)

Publication Number Publication Date
CN116415201A true CN116415201A (en) 2023-07-11
CN116415201B CN116415201B (en) 2023-08-15

Family

ID=87059672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310667541.9A Active CN116415201B (en) 2023-06-07 2023-06-07 Ship main power abnormality detection method based on deep concentric learning

Country Status (1)

Country Link
CN (1) CN116415201B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117710118A (en) * 2023-12-28 2024-03-15 中保金服(深圳)科技有限公司 Intelligent claim settlement analysis method and system


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200294401A1 (en) * 2017-09-04 2020-09-17 Nng Software Developing And Commercial Llc. A Method and Apparatus for Collecting and Using Sensor Data from a Vehicle
CN113532866A (en) * 2020-04-16 2021-10-22 中国船舶重工集团公司第七一一研究所 Diesel engine abnormal state detection method and system and computer storage medium
CN112165464A (en) * 2020-09-15 2021-01-01 江南大学 Industrial control hybrid intrusion detection method based on deep learning
WO2022210281A1 (en) * 2021-04-01 2022-10-06 ジャパン マリンユナイテッド株式会社 Thrust clearance measuring device, thrust clearance measuring method, and marine vessel
CN113870264A (en) * 2021-12-02 2021-12-31 湖北全兴通管业有限公司 Tubular part port abnormity detection method and system based on image processing
CN115879505A (en) * 2022-11-15 2023-03-31 哈尔滨理工大学 Self-adaptive correlation perception unsupervised deep learning anomaly detection method
CN115905991A (en) * 2022-11-21 2023-04-04 国网河南省电力公司经济技术研究院 Time series data multivariate abnormal detection method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Gyeongsub Song et al.: "Fault Detection in Inertial Measurement Unit and Global Navigation Satellite System of an Unmanned Surface Vehicle", IEEE *
Feng Qiao: "Anomaly detection analysis of sensor network data based on hypersphere support vector machine", Microcomputer Applications, no. 10 *
Zhang Zhengyong, Zeng Qingwei: "Vibration monitoring and fault diagnosis of marine medium-voltage generator sets", Marine Electric &amp; Electrotechnology, no. 03 *

Also Published As

Publication number Publication date
CN116415201B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN103645249B (en) Online fault detection method for reduced set-based downsampling unbalance SVM (Support Vector Machine) transformer
CN110598851A (en) Time series data abnormity detection method fusing LSTM and GAN
CN110298235B (en) Hyperspectral anomaly detection method and system based on manifold constraint self-coding network
CN116415201B (en) Ship main power abnormality detection method based on deep concentric learning
CN113920400B (en) Metal surface defect detection method based on improvement YOLOv3
CN106326915B (en) A kind of Fault Diagnosis for Chemical Process method based on improvement core Fisher
Mechefske et al. Fault detection and diagnosis in low speed rolling element bearings Part II: The use of nearest neighbour classification
CN113949549B (en) Real-time traffic anomaly detection method for intrusion and attack defense
Song et al. Data and decision level fusion-based crack detection for compressor blade using acoustic and vibration signal
CN110458039A (en) A kind of construction method of industrial process fault diagnosis model and its application
CN115824519A (en) Valve leakage fault comprehensive diagnosis method based on multi-sensor information fusion
CN115052304A (en) GCN-LSTM-based industrial sensor network abnormal data detection method
CN116383747A (en) Anomaly detection method for generating countermeasure network based on multi-time scale depth convolution
CN107220475A (en) A kind of bearing features data analysing method based on linear discriminant analysis
CN116543538B (en) Internet of things fire-fighting electrical early warning method and early warning system
CN111474476B (en) Motor fault prediction method
CN113553319A (en) LOF outlier detection cleaning method, device and equipment based on information entropy weighting and storage medium
JP5178471B2 (en) Optimal partial waveform data generation apparatus and method, and rope state determination apparatus and method
CN117407816A (en) Multi-element time sequence anomaly detection method based on contrast learning
CN116842358A (en) Soft measurement modeling method based on multi-scale convolution and self-adaptive feature fusion
CN116611184A (en) Fault detection method, device and medium for gear box
CN114548555B (en) Axial flow compressor stall surge prediction method based on deep autoregressive network
CN115690677A (en) Abnormal behavior identification method based on local intuition fuzzy support vector machine
Zuo et al. Bearing fault dominant symptom parameters selection based on canonical discriminant analysis and false nearest neighbor using GA filtering signal
CN111695634A (en) Data abnormal mutation point detection algorithm based on limited accompanying censoring mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant