CN113609933B - Fault detection method, system, device and storage medium based on suffix tree

Fault detection method, system, device and storage medium based on suffix tree

Info

Publication number
CN113609933B
CN113609933B (application CN202110823447.9A)
Authority
CN
China
Prior art keywords
fault
repeated
time
waveform
suffix tree
Prior art date
Legal status
Active
Application number
CN202110823447.9A
Other languages
Chinese (zh)
Other versions
CN113609933A (en
Inventor
岳夏
王亚东
张春良
翁润庭
朱厚耀
李植鑫
陆凤清
Current Assignee
Guangzhou University
Original Assignee
Guangzhou University
Priority date
Filing date
Publication date
Application filed by Guangzhou University filed Critical Guangzhou University
Priority to CN202110823447.9A priority Critical patent/CN113609933B/en
Publication of CN113609933A publication Critical patent/CN113609933A/en
Application granted granted Critical
Publication of CN113609933B publication Critical patent/CN113609933B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a fault detection method, system, device and storage medium based on a suffix tree, wherein the method comprises the following steps: acquiring a first fault signal, and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence; determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram; inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model; and acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through the suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault recognition model, and outputting a fault type recognition result. The method improves the accuracy and reliability of the training samples, and thereby the precision of the fault recognition model and the accuracy of fault detection; it can be widely applied in the technical field of fault detection.

Description

Fault detection method, system, device and storage medium based on suffix tree
Technical Field
The invention relates to the technical field of fault detection, and in particular to a fault detection method, system, device and storage medium based on a suffix tree.
Background
At present, most fault diagnosis and analysis of rolling bearings is based on vibration signals, which are nonlinear and non-stationary yet carry information that fully expresses the characteristics of the machine state. In the prior art, vibration signals are generally processed by time-frequency transform methods such as the Fourier transform, for example the rolling bearing fault feature extraction method based on the Daubechies wavelet transform. Such methods introduce errors through aliasing during processing; these errors follow from the principle of the Fourier algorithm and are unavoidable. Because of the large errors in feature extraction from the vibration signal, existing fault detection methods based on vibration signals are often inaccurate.
Disclosure of Invention
The present invention aims to solve at least to some extent one of the technical problems existing in the prior art.
Therefore, an object of an embodiment of the present invention is to provide a fault detection method based on a suffix tree, which decomposes a fault signal into two pieces of information at different scales, a fault repeated waveform and a repeated time sequence, thereby avoiding the errors caused by the aliasing of Fourier-based algorithms, improving the accuracy and reliability of the training samples, and in turn improving the precision of the fault recognition model and the accuracy of fault detection.
Another object of the embodiments of the present invention is to provide a fault detection system based on a suffix tree.
In order to achieve the technical purpose, the technical scheme adopted by the embodiment of the invention comprises the following steps:
in a first aspect, an embodiment of the present invention provides a fault detection method based on a suffix tree, including the following steps:
acquiring a first fault signal, and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram;
inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model;
and acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault recognition model, and outputting to obtain a fault type recognition result.
Further, in an embodiment of the present invention, the decomposing the first fault signal by using a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence specifically includes:
coding the first fault signal through average distribution or Gaussian distribution to obtain a first time domain signal;
decomposing the first time domain signal through a suffix tree algorithm to obtain a plurality of fault waveform information and corresponding time information, and constructing a first suffix tree according to the fault waveform information and the time information;
and traversing each node of the first suffix tree, acquiring repeated fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform.
Further, in an embodiment of the present invention, the step of traversing each node of the first suffix tree, acquiring repeatedly occurring fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform specifically includes:
traversing each node of the first suffix tree by a depth-first nested traversal algorithm from a root node of the first suffix tree;
acquiring repeated fault waveform information as a fault repeated waveform, and determining a plurality of time information corresponding to the fault repeated waveform;
and determining the repeated time sequence of the fault repeated waveform according to a plurality of time information corresponding to the fault repeated waveform.
Further, in an embodiment of the present invention, the fault detection method further includes the following steps:
acquiring the maximum repetition length among fault repeated waveforms whose repetition count exceeds a preset repetition-count threshold, and, when the maximum repetition length is greater than or equal to a preset length threshold and the number of residual coding passes is less than or equal to a preset coding-pass threshold, updating the residual coding parameters and then performing residual coding and signal decomposition.
Further, in an embodiment of the present invention, the step of determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram specifically includes:
according to a preset repetition length range, carrying out normalization processing on the fault repeated waveform and the repeated time sequence to obtain a first time-frequency characteristic diagram of the first fault signal;
determining a training sample according to the first time-frequency characteristic diagram;
acquiring the fault type of the first fault signal, and generating a fault type label according to the fault type;
and constructing a training picture set according to the training samples and the fault type labels.
Further, in an embodiment of the present invention, the step of inputting the training picture set into a pre-constructed convolutional neural network for training specifically includes:
inputting the training picture set into the convolutional neural network to obtain a fault prediction result;
determining a loss value of training according to the fault prediction result and the fault type label;
and updating the parameters of the convolutional neural network according to the loss value.
Further, in one embodiment of the present invention, the convolutional neural network comprises an input layer, a low hidden layer, a fully-connected layer and an output layer, wherein the low hidden layer is composed of a plurality of convolutional layers and a plurality of pooling layers alternately.
In a second aspect, an embodiment of the present invention provides a fault detection system based on a suffix tree, including:
the signal decomposition module is used for acquiring a first fault signal and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
the training picture set construction module is used for determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence and constructing a training picture set according to the first time-frequency characteristic diagram;
the fault recognition model training module is used for inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model;
and the fault type identification module is used for acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault identification model, and outputting to obtain a fault type identification result.
In a third aspect, an embodiment of the present invention provides a fault detection apparatus based on a suffix tree, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement a suffix tree based fault detection method as described above.
In a fourth aspect, the present invention further provides a computer-readable storage medium, in which a processor-executable program is stored, and the processor-executable program is used for executing the suffix tree based fault detection method when executed by a processor.
Advantages and benefits of the present invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention:
according to the embodiment of the invention, a first fault signal with a known fault type is obtained, the first fault signal is decomposed through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence, then a first time-frequency characteristic diagram is determined according to the fault repeated waveform and the repeated time sequence, a training picture set for training a convolutional neural network model is constructed according to the first time-frequency characteristic diagram, and a fault recognition model is obtained through training, so that a second fault signal to be detected can be detected and recognized according to the fault recognition model. According to the embodiment of the invention, the fault signal is decomposed to obtain the information of the fault repeated waveform and the repeated time sequence with two different scales, so that the error caused by the aliasing phenomenon of a Fourier algorithm is avoided, the accuracy and the reliability of the training sample are improved, the precision of the fault identification model is further improved, and the accuracy of fault detection is further improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required by the embodiments are described below. It should be understood that the following drawings only illustrate some embodiments of the technical solutions of the present invention for convenience and clarity, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating steps of a suffix tree based fault detection method according to an embodiment of the present invention;
FIG. 2 is an exploded view of a suffix tree algorithm provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a leaf node recursive call sequence according to an embodiment of the present invention;
fig. 4 is a schematic diagram of time-frequency characteristics provided in an embodiment of the present invention;
FIG. 5 is a block diagram of a suffix tree based fault detection system according to an embodiment of the present invention;
fig. 6 is a block diagram of a fault detection apparatus based on a suffix tree according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, 'a plurality' means two or more. Where 'first' and 'second' are used to distinguish technical features, they are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their precedence. Furthermore, unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the art.
Referring to fig. 1, an embodiment of the present invention provides a fault detection method based on a suffix tree, which specifically includes the following steps:
s101, acquiring a first fault signal, and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
specifically, a sensor is used for collecting a first fault signal f (n) of a known fault type, wherein n is the number of sampling points. The data volume requirement of single processing at least comprises 2 complete cycles of the attention signal, and generally more than 3-5 times of the attention characteristic cycle is required to obtain more fault repetition characteristics.
As a further optional implementation, the step of decomposing the first fault signal by using a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence specifically includes:
a1, coding the first fault signal through average distribution or Gaussian distribution to obtain a first time domain signal;
a2, decomposing the first time domain signal through a suffix tree algorithm to obtain a plurality of fault waveform information and corresponding time information, and constructing a first suffix tree according to the fault waveform information and the time information;
and A3, traversing each node of the first suffix tree, acquiring repeated fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform.
Specifically, the first fault signal is encoded according to its amplitude interval and a preset number of code levels, following either a uniform distribution or a Gaussian distribution.
The formula for encoding the first fault signal according to the uniform distribution is as follows:
C_1(n) = Int[(f(n) - f_{1,min}) * L_code / (f_{1,max} - f_{1,min})]
the formula for encoding the first fault signal according to a gaussian distribution is as follows:
C_1(n) = Int[IGD((f(n) - μ_1) / σ_1) * L_code]
where C_1(n) is the coded value of the first code band corresponding to the n-th sampled datum, i.e. the first time domain signal; Int() is a rounding function; L_code is the preset number of code levels; f_{1,max} is the maximum of the value range of the first fault signal, i.e. the maximum of the value range used when encoding the first code band; f_{1,min} is the corresponding minimum; μ_1 is the Gaussian mean of the first code band; σ_1 is the Gaussian variance of the first code band; and IGD is a look-up function returning the standard normal cumulative (integral) probability.
A residual signal R_1(n) is obtained at the same time: R_1(n) = f(n) - C_1(n).
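For illustration only, the following minimal Python sketch reproduces the two coding schemes; the function names are this description's own, and scipy's norm.cdf stands in for the IGD look-up table:

import numpy as np
from scipy.stats import norm

def encode_uniform(f, l_code):
    # C_1(n) = Int[(f(n) - f_min) * L_code / (f_max - f_min)]
    f_min, f_max = f.min(), f.max()
    return np.floor((f - f_min) * l_code / (f_max - f_min)).astype(int)

def encode_gaussian(f, l_code):
    # C_1(n) = Int[IGD((f(n) - mu) / sigma) * L_code]; norm.cdf plays IGD
    mu, sigma = f.mean(), f.std()
    return np.floor(norm.cdf((f - mu) / sigma) * l_code).astype(int)

f = np.sin(np.linspace(0, 20 * np.pi, 1000))  # stand-in fault signal f(n)
c1 = encode_uniform(f, l_code=16)
r1 = f - c1                                   # residual signal R_1(n)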
The coded sequence C_1(n) is then used to construct a suffix tree. Three rules are followed during suffix tree construction:
relu 1: used when inserting a new suffix to the root node root. active _ node remains root, active _ edge is set to the first character of the new suffix to be inserted, active _ length minus 1.
Relu 2: when an edge is Split (Split) and a new node is inserted (Insert), if the new node is not the first node created in the current step, the previously inserted node is connected to the new node by a special pointer, called Suffix Link (Suffix Link), which is usually drawn with dotted lines in the figure.
Relu 3: when the slave _ node is not a node splitting edge of a root node root, searching a node along the direction of Suffix connection (Suffix Link), and if one node exists, setting the node as the slave _ node; if not, active _ node is set to root. active _ edge and active _ length remain unchanged.
The suffix tree algorithm employed in the embodiments of the present invention is described below. Given an original data sequence T = t_1 t_2 ... t_n, with elements t_i (1 ≤ i ≤ n+1) and data length n, the original data is decomposed in order from t_1 to t_n into n+1 non-repeating subsequences, the (n+1)-th subsequence being a designated terminator denoted '#'. For ease of expression, the relevant symbols are defined below:
O: (root) the root node and sequence starting point; it has no specific meaning;
P: (active_point) the activity point, specifying the active starting point;
N: (active_node) the active node, designating a child node;
E: (active_edge) the active edge, specifying the sequence connection direction;
L: (active_length) the active length, specifying the amount of data moved by the sequence;
R: (remaining) the number of remaining suffixes, indicating the number of unconnected suffixes;
#: the terminator;
STree(T): the final decomposition result.
The original data sequence is decomposed from left to right in order, starting from the root node O, until the (n+1)-th sequence is generated, according to the following formula:
STree(T) = (F_i, f_i, g_i), i ∈ [1, n]
where F_i denotes the main-edge sequence, f_i denotes the sub-edge sequence, and g_i denotes the connection mode of datum i, including the values of the activity point P and the remaining-suffix count R. When processing the datum at position t_i (1 ≤ i ≤ n), the connection of each edge is completed through the following transmission modes.
1) When i = 1:
P_1 = (O, 'F_1', 1), R = 1. The root node O is selected as the initial position; the active edge E is set to 'F_1'; the active length L and the number of remaining suffixes R are set to 1, indicating that only one datum needs to be transferred in. STree(T_1) = (F_1, g_1).
2) When i > 1:
① t_i ∉ T_{i-1}, i.e. t_i is data newly appearing after STree(T_{i-1}). Set P_i = (O, 'F_1', i), R = i. The new datum is connected directly after all main edges: starting from F_1, iteratively update P_ik = (O, 'F_k', i-k+1), where k denotes the main-edge index, k ∈ (0, i], k = k+1, R = R-1, and stop updating P_ik when k = i. After the existing edges have been extended, a new main sequence F_{k+1} is created starting from the root node O, with t_i as its first edge. STree(T_i) = (F_i, g_i).
② t_i ∈ T_{i-1}, i.e. t_i already appears in the prefix and is treated as duplicate data. Starting from the first datum t_0 of the main chain F_1, find the data repeated with t_i, and let j, the position where the repeated data appears, represent the edge length L. Set P_i = (O, 'F_1', j), R = 1. Since t_{i+1} is unknown, only the fixed activity point P_i and the number of remaining suffixes R can be held temporarily, without giving a specific direction for the data sequence. From j it follows that all of the first j main edges contain data repeated with t_i, so starting from j = 1, iteratively update P_ij = (O, 'F_j', i-j), L ∈ (0, j], L = L-1, R = R-1; during each update the main edge is continued by t_i, and R stays constant at 1 because only the single datum t_i is operated on. This step follows Rule 1: the active point is the root node, the active edge is set to the first datum of the new suffix, and after each operation the active edge length is reduced by 1. No new edge is created in this process; the update of the active point is used to concatenate suffix data and cannot serve as the criterion for creating new edges.
③ On the premise of ②, subsequences f_i are created. Each f_i takes the activity point P_ij as its starting point, separates from the main edge F_j, and follows the prefix data of F_j. The leading data of the subsequence f_i matches the main edge F_j, and the datum t_i is spliced onto the end of the sequence.
STree(T_i) = (F_i, f_i, g_i), i < n.
④ Establish suffix links. When duplicate data t_i appears, each main edge generates one or more sub-edge nodes P_Fi; if such a node is not the first node P_F1 created during the current data insertion, the previously inserted node is connected to the new node by a special pointer called a suffix link.
⑤ When the node N from which an edge is split is not the root node O, search for a node along the suffix-link direction; if one exists, set it as N; if not, set N to O. E and L remain unchanged.
⑥ Repeat steps ② to ⑤ until i = n, completing the decomposition of all data.
Taking the string 'abcabxabcd' as an example, its suffix tree decomposition is completed as shown in fig. 2: the serial numbers are the numbers of the main and sub edges, '#' is the end marker, open circles represent parent nodes, numbered triangles represent leaf nodes, and arrows indicate the search order.
After adding the terminator, a total of 11 substrings are obtained, as shown in Table 1 below:

No.   Substring      Type
1     abcabxabcd#    Main edge
2     bcabxabcd#     Main edge
3     cabxabcd#      Main edge
4     abxabcd#       Sub-edge
5     bxabcd#        Sub-edge
6     xabcd#         Main edge
7     abcd#          Sub-edge
8     bcd#           Sub-edge
9     cd#            Sub-edge
10    d#             Main edge
11    #              Sub-edge

TABLE 1
It is clear from Table 1 that the original string can be decomposed into several non-repeating substrings, which carry different amounts of information. Suffix tree decomposition processes the original data and completes the rearrangement of the data coding: it trades memory space for the reading time of characteristic data segments, while completely preserving all of the signal data.
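To make the decomposition concrete, the following Python sketch builds a suffix tree for the example string by naive repeated suffix insertion with edge splitting, an O(n^2) stand-in for the linear-time construction described above; the class and field names are this description's own:

class Node:
    def __init__(self):
        self.children = {}   # first character -> (edge label, child node)
        self.leaf_ids = []   # start positions of suffixes ending here

def insert(root, suffix, start):
    node = root
    while suffix:
        head = suffix[0]
        if head not in node.children:
            leaf = Node()
            leaf.leaf_ids.append(start)
            node.children[head] = (suffix, leaf)
            return
        label, child = node.children[head]
        k = 0                               # common prefix of label and suffix
        while k < min(len(label), len(suffix)) and label[k] == suffix[k]:
            k += 1
        if k == len(label):                 # whole edge matched: descend
            node, suffix = child, suffix[k:]
        else:                               # split the edge at position k
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[head] = (label[:k], mid)
            node, suffix = mid, suffix[k:]

def build_suffix_tree(text):
    text += '#'                             # terminator, as in the patent
    root = Node()
    for i in range(len(text)):
        insert(root, text[i:], i)
    return root

tree = build_suffix_tree('abcabxabcd')      # the example string above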
As a further optional implementation, the step a3 of traversing each node of the first suffix tree, acquiring repeatedly occurring fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform specifically includes:
a31, traversing each node of the first suffix tree by a depth-first nested traversal algorithm from the root node of the first suffix tree;
a32, acquiring repeated fault waveform information as a fault repeated waveform, and determining a plurality of time information corresponding to the fault repeated waveform;
and A33, determining the repeated time sequence of the repeated fault waveform according to the plurality of time information corresponding to the repeated fault waveform.
As a further optional implementation manner, the step of traversing each node of the first suffix tree by using a depth-first nested traversal algorithm specifically includes:
b1, creating a repeated time storage array with the length equal to that of the first time domain signal;
b2, creating a repeated feature record array consistent with the number of the non-leaf nodes of the first suffix tree;
b3, running the Depth_First nested function, which takes a node number and the parent node's repeated waveform length as input, and outputs the repeated-time starting position, the repeated waveform length, the number of waveform repetitions, the node repeated string length and the individual repeated times;
wherein the repeated feature record array stores the repeated-time starting position, the repeated waveform length, the number of waveform repetitions and the node repeated string length, and the repeated time storage array stores the repeated times.
Specifically, the algorithm pseudo-code is as follows:
Depth_First nested function (abbreviated DF):
Input: node number nNodeID; parent-node repeated waveform length nFatherNodeRepeatLength;
Output: repeated-time starting position; repeated waveform length nNodeRepeatLength; number of waveform repetitions nWRIndex; repeated time storage array (global variable);
1: repeated waveform length = current node string length + parent-node repeated waveform length
2: record the starting position of this node's repeated times
3: access the first child node of this node by node number
4: number of waveform repetitions = 0
5: repeat
6: if the child node is a non-leaf node
7: recursively call the child node's DF function to obtain the child's number of waveform repetitions
8: number of waveform repetitions = number of waveform repetitions + child's number of waveform repetitions
9: else // the child node is a leaf node
10: number of waveform repetitions = number of waveform repetitions + 1
11: fill the leaf node number into the repeated time storage array
12: end if
13: until all child nodes have been visited
14: return the number of waveform repetitions
Depth-first algorithm:
1: create a repeated time storage array with the same length as the signal to be processed;
2: create a repeated feature record array consistent with the number of non-leaf nodes, containing the following information: {repeated waveform string end address, node repeated string length, repeated-time starting position, number of waveform repetitions};
3: run DF(root node number, 0) // the parent-node repeated waveform length is 0;
4: obtain the characteristic fault waveforms and the corresponding repeated time sequence features.
The time-frequency feature extraction algorithm traverses the nodes created by the suffix tree algorithm only once. Since the suffix tree algorithm has complexity O(n), the time-frequency feature extraction algorithm also has complexity O(n).
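Rendered as runnable Python against the naive tree sketched after Table 1, the depth-first extraction reads as follows; this is a sketch, with times 0-indexed here whereas Table 2 counts from 1:

def depth_first(node, parent_repeat_len, edge_len, repeat_times, features):
    repeat_len = parent_repeat_len + edge_len    # pseudocode line 1
    start_pos = len(repeat_times)                # line 2: repeated-time start
    repetitions = 0                              # line 4
    for label, child in node.children.values():  # lines 5-13
        if child.children:                       # non-leaf child: recurse
            repetitions += depth_first(child, repeat_len, len(label),
                                       repeat_times, features)
        else:                                    # leaf: record its moment
            repetitions += 1
            repeat_times.extend(child.leaf_ids)
    if repeat_len > 0:                           # skip the root's empty record
        features.append({'ReLen': repeat_len, 'Start_RTV': start_pos,
                         'RepTimes': repetitions})
    return repetitions

repeat_times, features = [], []                  # the two global arrays
depth_first(tree, 0, 0, repeat_times, features)  # DF(root node number, 0)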
The basic principle of the time-frequency feature extraction algorithm in the embodiment of the invention is further described below by taking the suffix tree structure of the abcabxabcd character string as an example.
The recursive calling sequence of the DF function is shown in fig. 3: DF(0,0), DF(1,0), DF(3,2), DF(4,1), DF(2,0), DF(5,0) and DF(0,0) in turn. Within each main edge, after a leaf node has been visited, control returns to its parent node to await the next node search.
The subsequence feature array is completed in the order of the non-leaf nodes in fig. 2, and the decomposition results are shown in Table 2 and Table 3.

Position:   1  2  3  4  5  6  7  8  9  10  11
Character:  a  b  c  a  b  x  a  b  c  d   #
Sequence:   1  7  4  2  8  5  3  9  6  10  11

TABLE 2
In Table 2, row 1 gives the position of each character of the original string 'abcabxabcd', row 2 gives the original string itself, and row 3 gives the extracted repetitive-feature decomposition sequence, whose values correspond to the leaf node numbers in fig. 2 and likewise to the moments of the sampling points they indicate. The storage size of the array equals the length of the original data; the last element always corresponds to the string terminator and may be omitted.
TABLE 3 (reproduced as an image in the original publication)
Table 3 records the repetitive feature array information of each non-leaf node, including the repetitive feature waveform end point W_End, the repetitive feature waveform length ReLen, the repeated-time feature vector start point Start_RTV, and the repetition count RepTimes. For ease of understanding, the node number NUM and the node string Node_str are also listed in Table 3. The node numbers and node strings in Table 3 match the labels in fig. 2 and are obtained directly from the suffix tree algorithm.
As a further optional implementation, the fault detection method further includes the following steps:
and acquiring the maximum repetition length among fault repeated waveforms whose repetition count exceeds a preset repetition-count threshold, and, when the maximum repetition length is greater than or equal to the preset length threshold and the number of residual coding passes is less than or equal to a preset threshold, updating the residual coding parameters and then performing residual coding and signal decomposition.
Specifically, the maximum repetition length among waveforms whose repetition count exceeds the preset value is checked, and the analysis terminates if this length is smaller than a preset value or the residual analysis count i is larger than a preset value. If the length is greater than or equal to the preset value and the residual analysis count i is less than or equal to the preset value, the i-th residual signal R_i(n) continues to be decomposed.
The rule of parameter modification in residual coding is as follows:
If the encoding is performed according to the uniform distribution, the upper and lower limits are updated using the following formula:
(update formula reproduced as an image in the original publication)
If the encoding is performed according to a Gaussian distribution:
a) residual coding can be performed again according to the uniform distribution, with the upper and lower limits updated by the formula above;
b) still encoding according to the Gaussian distribution, the parameters are updated using the following formulas:
μ_i = 0
(variance update formula reproduced as an image in the original publication)
and then residual coding is performed using the following formula:
C_i(n) = Int[(IGD((R_{i-1}(n) - μ_i) / σ_i) - 0.5) * 2 * L_code]
s102, determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram.
Specifically, the output fault features comprise two pieces of information: a fault repeated waveform and a repeated time sequence. As shown in Table 3, the maximum repetitive characteristic waveform length ReLen is 3 and the corresponding waveform end point W_End is 3, so the waveform of this feature is 'abc', shown in the first 3 positions of row 2 of Table 2, and the corresponding repeated time series feature consists of RepTimes = 2 entries read from row 3 of Table 2 starting at Start_RTV = 1, i.e. '1, 7'. Therefore, the fault feature of a sampling point with repeated waveform length 3 is:
{“abc”,“1,7”}
Similarly, the features with repetition length 2 are:
{“ab”,“1,7,4”}、{“bc”,“2,8”}
the repetitive signature and repetitive time series may be used individually or together as a fault signature for subsequent fault diagnosis. And by combining a dynamic model, the set operation results of intersection, union set, complement set and the like of a plurality of repeated time sequences can be used as fault characteristics. Wherein a longer maximum waveform repetition length at a time instant means a better stability of the data structure for a longer time around the time instant. When the interference is approximately white, the instantaneous frequency at that instant is also relatively low. Conversely, a shorter maximum waveform repetition length at a certain time means that the data structure at that time is less stable and the transient signal at that time is closer to the transient or impulsive signal.
As a further optional implementation, the step S102 of determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram, specifically includes:
s1021, normalizing the fault repeated waveform and the repeated time sequence according to a preset repeated length range to obtain a first time-frequency characteristic diagram of the first fault signal;
s1022, determining a training sample according to the first time-frequency characteristic diagram;
s1023, acquiring the fault type of the first fault signal, and generating a fault type label according to the fault type;
and S1024, constructing a training picture set according to the training samples and the fault type labels.
Specifically, the abscissa of the first time-frequency characteristic diagram represents time, the ordinate represents the repeated waveform length, and the colour value of each pixel represents the degree to which the datum at the corresponding moment participates in the fault repeated waveform of the corresponding length; the fault features of the first fault signal are thereby visualized.
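Continuing the sketches above, one way to assemble such a diagram from the extracted features is given below; the participation rule (each occurrence of a repeated waveform adds weight to the samples it covers) is one plausible reading of the description, not the patent's exact formula:

import numpy as np

def feature_image(features, repeat_times, n_samples, max_len):
    img = np.zeros((max_len, n_samples))   # rows: repeat length, cols: time
    for row in features:
        length = row['ReLen']
        if not 0 < length <= max_len:
            continue
        start, reps = row['Start_RTV'], row['RepTimes']
        for t0 in repeat_times[start:start + reps]:
            img[length - 1, t0:t0 + length] += 1.0   # samples covered by one repeat
    return img / max(img.max(), 1e-12)               # normalize to [0, 1]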
Fig. 4 is a schematic diagram of the time-frequency features provided in the embodiment of the present invention. The string data stands in for a fault signal, and the shades in fig. 4 represent the activity of the data at each moment for the current repetition length. The lighter colour value of the 'b' character in row 2 indicates that a periodic signal of lower frequency is likely present. The low values of the 'x' and 'd' characters in row 1 indicate that the signal at those moments is extremely unstable, and abrupt signals such as impacts may exist. The 'a' character also takes a low value in row 1 while the corresponding position in row 2 is larger, indicating that the signal corresponding to 'a' belongs to the subsequent pattern 'ab': in the processed data, 'a' never appears independently of 'ab'.
After the suffix tree decomposition, an image result carrying time-frequency features is obtained. The positions of low-frequency pits in the image can be regarded as the moments when the faulty part is struck, and the high-frequency highlight areas as the natural vibration signals of the system. For a rotating system the rotation period is small and the vibration frequency high, so differences between the colour blocks of the image (which characterize the signal's time-frequency features) are hard to judge directly. The embodiment of the invention therefore introduces a convolutional neural network and identifies the time-frequency characteristic diagram of each fault type by image recognition, so as to determine the specific fault type.
The suffix tree decomposition completes the processing of the original fault signal; the time-frequency information of the signal is preserved in picture format and stored completely in local files, which is convenient for direct input to a convolutional neural network. On the TensorFlow platform, after the pictures are read in, the image information is converted into calculation data in matrix form for subsequent computation, training and recognition.
The pictures processed by the suffix tree are organized by fault type into a training picture set (used to train the model and determine parameter values) and a test set (used to estimate model accuracy), with the corresponding fault types as label values (expressing the fault category); the data are finally fed into the convolutional network model to complete the identification of specific fault types. In subsequent fault detection and recognition, digital signals acquired in real time are put directly into the trained network model for the corresponding fault judgment.
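As an illustration of this organization, a minimal TensorFlow input pipeline might look as follows, with one sub-directory per fault type whose name serves as the label; the directory layout and image size are assumptions of this sketch:

import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    'stree_features/train',     # suffix-tree feature pictures, one folder per fault type
    label_mode='categorical',   # one-hot labels, matching categorical_crossentropy
    image_size=(64, 64),
    batch_size=32)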
S103, inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model.
In the embodiment of the invention, a convolutional neural network is used to complete the training of the fault diagnosis model. A convolutional neural network is a multi-layer supervised-learning neural network, and the convolutional and pooling layers of its hidden part are the core modules realizing feature extraction. The lower hidden layers consist of alternating convolutional and pooling layers, while the upper hidden layers are fully connected, corresponding to the hidden layer and logistic regression classifier of a traditional multilayer perceptron.
A basic convolutional neural network comprises five parts: an input layer, convolutional layers, pooling layers, a fully connected layer and an output layer. Each layer has multiple feature maps; each feature map extracts one feature of its input through a convolution filter and contains multiple neurons. The input layer takes in the pixels, each representing a feature node. A convolutional layer consists of several filters that extract features from the input data, and each element of a convolution kernel carries a weight coefficient and a bias value.
The convolutional layer parameters consist of a series of learnable filters. Each filter is small along width and height, while its depth is consistent with that of the input. As a filter slides along the width and height of the image it produces a two-dimensional activation map, and the whole set of filters forms multiple activation maps.
Z^{l+1}(i, j) = [Z^l ⊗ w^{l+1}](i, j) + b = Σ_{k=1}^{K} Σ_{x=1}^{f} Σ_{y=1}^{f} [Z_k^l(s_0*i + x, s_0*j + y) * w_k^{l+1}(x, y)] + b

(i, j) ∈ {0, 1, ..., L_{l+1}}

L_{l+1} = (L_l + 2p - f) / s_0 + 1
In the above formulas, b is the bias, Z^l and Z^{l+1} denote the convolution input and output of layer l+1, L_{l+1} is the size of Z^{l+1}, Z(i, j) indexes the pixels of the feature map, K is the number of channels of the feature map, f is the convolution kernel size, s_0 is the convolution stride, and p is the number of padding layers.
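As a quick numeric check of the output-size formula, under assumed values (a 64x64 input with a 3x3 kernel):

def conv_out_size(l_in, f, s0, p):
    # L_{l+1} = (L_l + 2p - f) / s_0 + 1
    return (l_in + 2 * p - f) // s0 + 1

assert conv_out_size(64, f=3, s0=1, p=1) == 64  # 'same' padding keeps the size
assert conv_out_size(64, f=3, s0=2, p=1) == 32  # stride 2 halves it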
The distribution of data is mostly nonlinear, and an excitation function is introduced to bring a nonlinear relation into the neural network and strengthen its learning capability. The excitation function subjects the network's input data to a specific distribution:
a. The data distribution is zero-mean, i.e. the computed mean of the distribution is approximately 0. A distribution that is not zero-mean may cause vanishing gradients or training jitter.
b. The data distribution is normal. A non-normal distribution may lead the algorithm to overfit.
c. During training, when batches of different scales are processed, the input distribution of every layer of the neural network should stay consistent; the failure to keep it consistent is called Internal Covariate Shift and seriously affects the training process.
The convolutional layer contains an excitation function to help express complex features, in the form:
A^l = f(Z^l)
In the above formula, f() denotes the activation function, Z^l is the output of the l-th layer, and A^l is its value after processing by the excitation function.
Although the ReLU function has some defects, it still achieves good results. Compared with other activation functions, ReLU has the lowest computational cost and the simplest code implementation. Activation functions that are able to produce a zero-mean distribution are preferred over the others; it should be noted, however, that training and inference with such functions are slower, because more complex exponential operations are required to obtain the activation values.
The pooling layer performs feature selection and information filtering on the convolution result; it effectively reduces the size of the parameter matrix and the number of parameters in the final connected layer, thereby speeding up computation and preventing overfitting. The pooling layer contains a preset pooling function, whose role is to replace the value of a single point in the feature map with a statistic of its neighbouring region. The pooling layer selects pooling regions in the same way a convolution kernel scans the feature map, controlled by the pooling size, stride and padding. The general representation of the pooling model is:
A_k^l(i, j) = [Σ_{x=1}^{f} Σ_{y=1}^{f} A_k^l(s_0*i + x, s_0*j + y)^p]^{1/p}
where the stride s_0 and the pixel (i, j) have the same meaning as in the convolutional layer, and p is a preset parameter. By analogy with the vector norm: when p = 1, pooling takes the mean over the pooled region, which is called average pooling; as p → ∞, pooling takes the maximum within the region, which is called max pooling.
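The Lp pooling model can be sketched directly in NumPy as below (an illustration, not the patent's code); p = 1 yields the sum over the region, i.e. average pooling up to a constant factor, while the maximum is approached as p grows large:

import numpy as np

def lp_pool(a, f, s0, p):
    h_out = (a.shape[0] - f) // s0 + 1
    w_out = (a.shape[1] - f) // s0 + 1
    out = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            window = a[i * s0:i * s0 + f, j * s0:j * s0 + f]
            out[i, j] = (np.abs(window) ** p).sum() ** (1.0 / p)
    return out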
The fully connected layer (FC) acts as the 'classifier' of the whole convolutional neural network. Operations such as convolution, pooling and activation map the raw data into a hidden-layer feature space, while the fully connected layer maps the learned 'distributed features' onto the sample labels. The last layer of the network serves as the input of the fully connected layer. The relevant formula is as follows:
Z_j = W_j X + b_j = ω_{j1} x_1 + ω_{j2} x_2 + ... + ω_{jn} x_n + b_j
W_j is regarded as the weight of the features under class j, i.e. the importance of each feature dimension; weighting and summing the features gives the score of each class, and the Softmax function maps the scores into probabilities. Through the fully connected layer, the scores Z_j of the K classes lie in (-∞, +∞). To obtain the probability of belonging to each class, each score is first mapped to (0, +∞) by the exponential e^{Z_j} and then normalized to (0, 1), as follows:

P_j = e^{Z_j} / Σ_{k=1}^{K} e^{Z_k}
Taking e to the power of every Z_j, summing the powers, and computing each value's share of the sum guarantees that the probabilities sum to 1; in this way softmax yields the classification probabilities.
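Written out as a small sketch (subtracting the maximum, as in step 1 of softmax_loss below, leaves the result unchanged while avoiding overflow in the exponential):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())     # shift for numerical stability
    return e / e.sum()          # P_j = e^{Z_j} / sum_k e^{Z_k}

probs = softmax(np.array([2.0, 1.0, 0.1]))   # probs.sum() == 1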
Dropout means temporarily discarding neural network units from the network with a certain probability during the training of a deep learning network. After Dropout is used, the transfer process changes, in the following specific form:
1) randomly (temporarily) delete hidden neurons in the network, keeping the input and output neurons unchanged;
2) propagate the input x forward through the modified network, and propagate the resulting loss backward through it; after a small batch of training samples has completed this process, update the corresponding parameters (w, b) of the non-deleted neurons by stochastic gradient descent;
3) then keep repeating the process:
a. restore the deleted neurons (the deleted neurons stay intact, while the non-deleted neurons have been updated);
b. randomly select another subset of hidden neurons to delete temporarily (backing up the parameters of the deleted neurons);
c. for each small batch of training samples, propagate forward, then propagate the loss backward, and update the parameters by stochastic gradient descent (only the non-deleted parameters are updated; the deleted neuron parameters keep their values from before deletion).
4) Repeat this process until the iterations are complete.
The corresponding calculation formula is as follows:
r_j^{(l)} ~ Bernoulli(p)

ỹ^{(l)} = r^{(l)} * y^{(l)}

z_i^{(l+1)} = w_i^{(l+1)} * ỹ^{(l)} + b_i^{(l+1)}

y_i^{(l+1)} = f(z_i^{(l+1)})
the Bernoulli function in the above formula is to generate a probability r vector, i.e. a vector of 0 and 1 is randomly generated.
Each neuron's weight is multiplied by the probability p, which 'on average' makes the test data and the training data behave approximately the same. A standard model has no Dropout layer; training 5 different neural networks on the same training data generally yields 5 different results, and a final result can be determined by averaging the 5 results or by a majority-vote strategy. Different networks produce different overfitting, and some of the opposite fits cancel each other, reducing overfitting overall. The weight updates no longer depend on the joint action of hidden nodes with fixed relationships, which prevents situations where some features are effective only in the presence of other specific features.
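A minimal forward-pass sketch of the classic Dropout scheme described above (mask with a Bernoulli(p) vector during training, scale by p at test time); modern 'inverted' dropout instead rescales during training, but the form below follows the text:

import numpy as np

def dropout_forward(y, p, training=True):
    if training:
        r = np.random.binomial(1, p, size=y.shape)  # Bernoulli(p) mask r^(l)
        return y * r                                # y~(l) = r^(l) * y^(l)
    return y * p                                    # test time: scale by p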
The calculation of the maximum-likelihood loss function softmax_loss comprises 2 steps:
1) compute the normalized softmax probability, using the formulas:

x_i = x_i - max(x_1, ..., x_n)

p_i = e^{x_i} / Σ_{k=1}^{n} e^{x_k}

2) compute the loss function, using the formula:

loss = -(1/N) Σ_{i=1}^{N} Σ_{k=1}^{K} y_{ik} * log(p_{ik})

where N is the number of samples and K is the number of labels.
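The two steps combine into the following sketch; encoding the targets as integer class indices is an assumption of this illustration:

import numpy as np

def softmax_loss(x, labels):
    x = x - x.max(axis=1, keepdims=True)            # step 1: stabilize
    p = np.exp(x) / np.exp(x).sum(axis=1, keepdims=True)
    n = x.shape[0]
    return -np.log(p[np.arange(n), labels]).mean()  # step 2: mean negative log-likelihood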
The Adam algorithm dynamically adjusts the learning rate of each parameter using first- and second-moment estimates of the gradient. The tf.train.AdamOptimizer provided by TensorFlow controls the learning speed: after bias correction, the learning rate of each iteration lies within a fixed range, making the parameters relatively stable. In TensorFlow's default initial setting, the step size is α = 0.001, the moment decay rates are β_1 = 0.9 and β_2 = 0.999, and ε = 10^-8 is a very small number that avoids division by zero. After the parameters α, β_1, β_2 and the stochastic objective function f(w) are determined, the parameter vector, first-moment vector, second-moment vector and time step need to be initialized.
In Adam, momentum is incorporated directly as the (exponentially weighted) estimate of the gradient's first moment m_t. Compared with RMSProp, where the lack of a correction factor means the second-moment estimate may be highly biased early in training, Adam includes bias corrections for both the first-moment (momentum) and (non-central) second-moment estimates initialized from the origin.
m_t = β_1 * m_{t-1} + (1 - β_1) * g_t
v_t = β_2 * v_{t-1} + (1 - β_2) * g_t^2

where m_t and v_t are the first-order and second-order momentum terms respectively, and

m̂_t = m_t / (1 - β_1^t), v̂_t = v_t / (1 - β_2^t)

are the corresponding bias-corrected values. w_t denotes the model parameters at the t-th iteration, and g_t = ∇f(w_{t-1}) denotes the gradient of the loss function with respect to w at the t-th iteration. While the parameter w has not converged, the loop updates each part iteratively: the time step t is incremented by 1, the gradient of the objective function over the parameter w at this time step is computed, the biased first-moment and second raw-moment estimates are updated, the bias-corrected first- and second-moment estimates are computed, and the model parameter w_t is then updated with the values computed above:

w_t = w_{t-1} - α * m̂_t / (√v̂_t + ε)
The functions of each network layer are selected according to actual needs, and the layers are connected to complete the overall structure of the convolutional neural network. The first network layer adds the dimensionality of the input data and uses the relu activation function; padding is chosen to keep the data dimensions consistent; max pooling with stride 2 is selected. Subsequent layers are adapted with reference to the first-layer settings. The Dropout layer parameter is set to 0.5, shielding half of the neurons during operation and speeding up the network model. The fully connected layers use the relu and softmax activation functions, suiting the multi-class problem. Adam is chosen as the optimizer, categorical_crossentropy as the loss function, and accuracy as the output metric. At this point the initial convolutional neural network model is built.
Table 4 below shows a model parameter table of a convolutional neural network model constructed according to an embodiment of the present invention.
TABLE 4 (reproduced as an image in the original publication)
In Table 4, the first column is the layer type, i.e. the layer name; the second column is the data size of that layer after processing, where None refers to the number of pictures, dimensions 2 and 3 are the vertical and horizontal extents of the image, and dimension 4 is the number of convolution kernels; the third column is the parameter count: in the whole network, parameters are trained only in the convolution operations and the fully connected layers, while the remaining layers perform data extraction and dimension adjustment.
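Since Table 4 survives only as an image, the following Keras sketch reconstructs a model of the described shape (alternating conv/pool layers, Dropout 0.5, relu and softmax dense layers, Adam with categorical_crossentropy); the filter counts, input size and number of fault classes are assumptions:

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.Flatten(),
    layers.Dropout(0.5),                        # shields half the neurons
    layers.Dense(128, activation='relu'),
    layers.Dense(4, activation='softmax'),      # one unit per fault type
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])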
As a further optional implementation, the step of inputting the training picture set into a pre-constructed convolutional neural network for training specifically includes:
c1, inputting the training picture set into a convolutional neural network to obtain a fault prediction result;
c2, determining a loss value of training according to the fault prediction result and the fault type label;
and C3, updating the parameters of the convolutional neural network according to the loss value.
Specifically, after the data in the training picture set are input into the initialized convolutional neural network model, the recognition result output by the model, i.e. the fault prediction result, is obtained, and the accuracy of the model's predictions can be evaluated against the label values in order to update the model parameters. For the fault recognition model, the accuracy of a prediction can be measured by a loss function (Loss Function), which is defined on a single training datum and measures its prediction error: the loss value is determined from the label of that datum and the model's prediction for it. Since a training set contains many training data, a cost function (Cost Function) defined on the whole training set is generally adopted to measure its overall error, i.e. the average prediction error over all training data, which better reflects the model's prediction performance. For a general machine learning model, the cost function plus a regularization term measuring model complexity can serve as the training objective function, from which the loss value of the whole training set is obtained. There are many common loss functions, such as the 0-1 loss, square loss, absolute loss, logarithmic loss and cross-entropy loss, all usable as the loss function of a machine learning model and not detailed here; in the embodiment of the invention, one of them is selected to determine the training loss. Based on the training loss, the model parameters are updated by the back-propagation algorithm, and after several rounds of iteration the trained fault diagnosis model is obtained. The number of iteration rounds may be preset, or training may be considered complete when the test set meets the accuracy requirement.
In the embodiment of the invention, after a convolution kernel has identified a specific repeated waveform and its corresponding time sequence, these are used as new training data to update the network in reverse. The identified fault features are set as convolution kernels; the network is continuously updated through forward and backward passes, and the new convolution kernels read the fault features directly, which greatly improves the recognition rate and prediction accuracy of the trained model.
Further, as an optional embodiment, the convolutional neural network comprises an input layer, low hidden layers, a fully connected layer and an output layer, where the low hidden layers consist of a plurality of convolutional layers and a plurality of pooling layers arranged alternately.
S104, obtaining a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into a fault recognition model, and outputting to obtain a fault type recognition result.
In the embodiment of the present invention, the processing procedure of the second fault signal to be detected is consistent with the processing procedure of the first fault signal, which is not described herein again. After the second time-frequency characteristic diagram of the second fault signal is obtained, the second time-frequency characteristic diagram is input into the fault recognition model trained in the step S103, and then a fault type recognition result can be obtained.
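The following is a hedged sketch of step S104. Here suffix_tree_time_frequency_map is a hypothetical placeholder for the encoding, suffix-tree decomposition and normalization pipeline described earlier (it returns random data purely so the snippet runs); only the model inference step is concrete, reusing the trained model from the sketches above:

```python
# Hedged sketch of step S104; the feature-extraction helper is hypothetical.
import torch

def suffix_tree_time_frequency_map(signal: torch.Tensor) -> torch.Tensor:
    """Hypothetical placeholder: encode the signal, extract the fault repeated
    waveforms and repeated time sequences with a suffix tree, and render them
    as a 1x64x64 time-frequency feature map."""
    return torch.randn(1, 64, 64)  # stand-in output for illustration only

second_signal = torch.randn(100_000)          # second fault signal to be detected
feature_map = suffix_tree_time_frequency_map(second_signal)

model.eval()                                  # trained FaultCNN from the sketches above
with torch.no_grad():
    logits = model(feature_map.unsqueeze(0))  # add batch dimension: (1, 1, 64, 64)
    fault_type = logits.argmax(dim=1).item()  # fault type recognition result
```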
The method steps of the present invention are described above. It can be appreciated that, in the embodiment of the invention, the fault signal is decomposed to obtain the fault repeated waveform and the repeated time sequence, i.e., information at two different scales, which avoids the error caused by the aliasing phenomenon of Fourier-based algorithms, improves the accuracy and reliability of the training samples, and thereby improves both the precision of the fault recognition model and the accuracy of fault detection.
Compared with the prior art, the embodiment of the invention also has the following advantages:
1) The embodiment of the invention decomposes the time-domain signal directly, obtaining information at two different scales, the fault repeated waveform and the repeated time sequence, so the extracted features have clear physical meaning and strong interpretability.
2) The computational complexity is O(n) and the running speed is high: the single-threaded processing capacity of an ordinary desktop computer reaches 1M sampling points per second, which makes the method particularly suitable for feature extraction in large-scale real-time processing such as high-sampling-frequency and multi-channel data.
3) The fault repeated waveform and the repeated time sequence can be compared directly with the analysis results of a dynamic model, so the method is applicable to non-stationary faults and the reliability of the fault features is greatly improved.
4) A neural network model is introduced to identify the fault type; the recognition speed reaches the millisecond level, and training data are fed back in real time to update the network model, achieving the goal of real-time diagnosis.
Referring to fig. 5, an embodiment of the present invention provides a fault detection system based on a suffix tree, including:
the signal decomposition module is used for acquiring a first fault signal and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
the training picture set construction module is used for determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence and constructing a training picture set according to the first time-frequency characteristic diagram;
the fault recognition model training module is used for inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model;
and the fault type identification module is used for acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault identification model, and outputting to obtain a fault type identification result.
The contents in the above method embodiments are all applicable to the present system embodiment, the functions specifically implemented by the present system embodiment are the same as those in the above method embodiment, and the beneficial effects achieved by the present system embodiment are also the same as those achieved by the above method embodiment.
Referring to fig. 6, an embodiment of the present invention provides a suffix tree-based fault detection apparatus, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement a suffix tree based fault detection method as described above.
The contents in the above method embodiments are all applicable to the present apparatus embodiment, the functions specifically implemented by the present apparatus embodiment are the same as those in the above method embodiments, and the advantageous effects achieved by the present apparatus embodiment are also the same as those achieved by the above method embodiments.
Embodiments of the present invention also provide a computer-readable storage medium, in which a processor-executable program is stored, and the processor-executable program is used for executing the above-mentioned suffix tree-based fault detection method when executed by a processor.
The computer-readable storage medium of the embodiment of the invention can execute the fault detection method based on the suffix tree provided by the embodiment of the method of the invention, can execute any combination of the implementation steps of the embodiment of the method, and has corresponding functions and beneficial effects of the method.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor, causing the computer device to perform the method illustrated in fig. 1.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise indicated to the contrary, one or more of the functions and/or features described above may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The above functions, if implemented in the form of software functional units and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Further, the computer readable medium could even be paper or another suitable medium upon which the above described program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following technologies, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (9)

1. A fault detection method based on a suffix tree is characterized by comprising the following steps:
acquiring a first fault signal, and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence, and constructing a training picture set according to the first time-frequency characteristic diagram;
inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model;
acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault identification model, and outputting to obtain a fault type identification result; the step of decomposing the first fault signal by a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence specifically comprises:
coding the first fault signal through average distribution or Gaussian distribution to obtain a first time domain signal;
decomposing the first time domain signal through a suffix tree algorithm to obtain a plurality of fault waveform information and corresponding time information, and constructing a first suffix tree according to the fault waveform information and the time information;
and traversing each node of the first suffix tree, acquiring repeated fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform.
2. The suffix tree based fault detection method according to claim 1, wherein the step of traversing each node of the first suffix tree, obtaining repeated fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform specifically comprises:
traversing each node of the first suffix tree by a depth-first nested traversal algorithm from a root node of the first suffix tree;
acquiring repeated fault waveform information as a fault repeated waveform, and determining a plurality of time information corresponding to the fault repeated waveform;
and determining the repeated time sequence of the fault repeated waveform according to a plurality of time information corresponding to the fault repeated waveform.
3. The suffix tree based fault detection method according to claim 1, further comprising the steps of:
acquiring the maximum repetition length among the fault repeated waveforms whose repetition counts exceed a preset repetition count threshold, and, when the maximum repetition length is greater than or equal to a preset length threshold and the number of residual coding passes is less than or equal to a preset coding count threshold, updating the parameters of residual coding and then performing residual coding and signal decomposition.
4. The suffix tree based fault detection method according to claim 1, wherein the step of determining a first time-frequency feature map of the first fault signal according to the fault repeated waveform and the repeated time sequence and constructing a training picture set according to the first time-frequency feature map specifically comprises:
according to a preset repetition length range, carrying out normalization processing on the fault repeated waveform and the repeated time sequence to obtain a first time-frequency characteristic diagram of the first fault signal;
determining a training sample according to the first time-frequency characteristic diagram;
acquiring the fault type of the first fault signal, and generating a fault type label according to the fault type;
and constructing a training picture set according to the training samples and the fault type labels.
5. The suffix tree based fault detection method according to claim 4, wherein the step of inputting the training picture set into a pre-constructed convolutional neural network for training specifically comprises:
inputting the training picture set into the convolutional neural network to obtain a fault prediction result;
determining a loss value of training according to the fault prediction result and the fault type label;
and updating the parameters of the convolutional neural network according to the loss value.
6. The suffix tree based fault detection method according to any one of claims 1 to 5, wherein: the convolutional neural network comprises an input layer, a low hidden layer, a full connection layer and an output layer, wherein the low hidden layer is formed by a plurality of convolutional layers and a plurality of pooling layers in an alternating mode.
7. A suffix tree based fault detection system comprising:
the signal decomposition module is used for acquiring a first fault signal and decomposing the first fault signal through a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence;
the training picture set construction module is used for determining a first time-frequency characteristic diagram of the first fault signal according to the fault repeated waveform and the repeated time sequence and constructing a training picture set according to the first time-frequency characteristic diagram;
the fault recognition model training module is used for inputting the training picture set into a pre-constructed convolutional neural network for training to obtain a trained fault recognition model;
the fault type identification module is used for acquiring a second fault signal to be detected, determining a second time-frequency characteristic diagram of the second fault signal through a suffix tree algorithm, inputting the second time-frequency characteristic diagram into the fault identification model, and outputting to obtain a fault type identification result;
the step of decomposing the first fault signal by a suffix tree algorithm to obtain a fault repeated waveform and a repeated time sequence specifically comprises:
coding the first fault signal through average distribution or Gaussian distribution to obtain a first time domain signal;
decomposing the first time domain signal through a suffix tree algorithm to obtain a plurality of fault waveform information and corresponding time information, and constructing a first suffix tree according to the fault waveform information and the time information;
and traversing each node of the first suffix tree, acquiring repeated fault waveform information as a fault repeated waveform, and determining a repeated time sequence of the fault repeated waveform.
8. A suffix tree based fault detection apparatus, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement a suffix tree based fault detection method as recited in any one of claims 1 to 6.
9. A computer readable storage medium in which a processor executable program is stored, wherein the processor executable program when executed by a processor is adapted to perform a suffix tree based fault detection method as claimed in any one of claims 1 to 6.
CN202110823447.9A 2021-07-21 2021-07-21 Fault detection method, system, device and storage medium based on suffix tree Active CN113609933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110823447.9A CN113609933B (en) 2021-07-21 2021-07-21 Fault detection method, system, device and storage medium based on suffix tree

Publications (2)

Publication Number Publication Date
CN113609933A CN113609933A (en) 2021-11-05
CN113609933B (en) 2022-09-16

Family

ID=78304988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110823447.9A Active CN113609933B (en) 2021-07-21 2021-07-21 Fault detection method, system, device and storage medium based on suffix tree

Country Status (1)

Country Link
CN (1) CN113609933B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156636A (en) * 2014-07-30 2014-11-19 中南大学 Suffix array based fuzzy tandem repeat recognition method
CN105095276A (en) * 2014-05-13 2015-11-25 华为技术有限公司 Method and device for mining maximum repetitive sequence
CN105588720A (en) * 2015-12-15 2016-05-18 广州大学 Fault diagnosis device and method for antifriction bearing based on analysis on morphological component of acoustic signal
CN105606360A (en) * 2015-11-24 2016-05-25 国网内蒙古东部电力有限公司电力科学研究院 Fault diagnosis method for condition-variable planetary gear box based on multi-sensor information fusion
CN110161343A (en) * 2019-06-12 2019-08-23 中南大学 A kind of non-intrusion type real-time dynamic monitoring method of intelligence train exterior power receiving device
CN110472587A (en) * 2019-08-19 2019-11-19 四川大学 Vibrating motor defect identification method and device based on CNN and sound time-frequency characteristics figure
CN112738088A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Behavior sequence anomaly detection method and system based on unsupervised algorithm

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017116627A1 (en) * 2016-01-03 2017-07-06 Presenso, Ltd. System and method for unsupervised prediction of machine failures

Also Published As

Publication number Publication date
CN113609933A (en) 2021-11-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant