CN117972447A

CN117972447A - Method, device and equipment for identifying mode of time sequence data

Info

Publication number: CN117972447A
Application number: CN202410148718.9A
Authority: CN
Inventors: 罗云铧; 安阳明; 金超
Original assignee: Beijing Cyberinsight Technology Co ltd
Current assignee: Beijing Cyberinsight Technology Co ltd
Priority date: 2024-02-02
Filing date: 2024-02-02
Publication date: 2024-05-03

Abstract

The application provides a method, a device and equipment for pattern recognition of time-series data. With the method, device and equipment provided by the application, first, sliding-window slicing gives better control over the time resolution and reduces information loss, and the image data preserves the temporal information of the time-series data. Second, converting the one-dimensional time-series data into image data solves the problem that the amplitudes of different types of data are not on the same scale; in other words, it avoids model failure caused by a mismatch between data amplitude and time scale when different types of data are recognized, which improves the recognition accuracy. The pattern of the time-series data can therefore be recognized efficiently and accurately, fault diagnosis and health management can be guided by the recognition result, early faults can be found, and the production efficiency and service life of chemical equipment can be further improved.

Description

Method, device and equipment for identifying mode of time sequence data

Technical Field

The present application relates to the field of fault diagnosis technologies, and in particular, to a method, an apparatus, and a device for pattern recognition of time-series data.

Background

With production demands rising and the level of industrialization increasing, industrial fields such as manufacturing, energy, chemicals, pharmaceuticals, food and metallurgy face increasingly complex operation and management challenges. Against this background, research on fault diagnosis and health management (Prognostics and Health Management, abbreviated as PHM) of industrial equipment has become one of the key means of ensuring safe industrial production and improving equipment reliability.

Currently, most industrial devices have basic automatic control and data monitoring capabilities; that is, most devices implement a distributed control system (Distributed Control System, abbreviated as DCS), so that fault diagnosis and health management can be performed by using the DCS data (time sequence data) provided by the distributed control system. In fault diagnosis and health management using DCS data, it is often necessary to identify the timing pattern of the DCS data. Therefore, it is desirable to provide a method for identifying patterns of time series data, so as to identify patterns of time series data efficiently and accurately.

Disclosure of Invention

In view of the above, the present application provides a method, apparatus and device for identifying a pattern of time series data, so as to identify the pattern of time series data efficiently and accurately, and further discover early fault symptoms based on the identification result, so as to implement predictive maintenance.

Specifically, the application is realized by the following technical scheme:

The first aspect of the present application provides a pattern recognition method of time series data, the method comprising:

taking the time sequence data to be identified as target data, and carrying out sliding window slicing processing on the target data according to the preset window size to obtain a plurality of window data;

Drawing a waveform diagram corresponding to each window data, and performing image processing on the waveform diagram according to a preset image processing mode to obtain image data corresponding to each window data;

Inputting a plurality of image data corresponding to the window data into an identification model corresponding to each pre-trained preset time sequence mode, identifying the similarity between each image data in the plurality of image data and the preset time sequence mode by the identification model, and determining the image data with the highest similarity as the image data in the preset time sequence mode;

Searching target image data with highest similarity from image data under various preset time sequence modes, and determining target window data corresponding to the target image data and a preset time sequence mode to which the target image data belongs as a recognition result of the current mode recognition;

dividing the target data by using the target window data to obtain a plurality of divided fragments;

and for the target segmentation fragments except the target window data in the plurality of segmentation fragments, when the length of the target segmentation fragments is larger than the preset window size, taking the target segmentation fragments as target data, and executing the process of sliding window slicing processing on the target data again.

The second aspect of the application provides a pattern recognition device of time sequence data, which comprises a processing module, a determining module and a dividing module; wherein,

The processing module is used for taking the time sequence data to be identified as target data, and carrying out sliding window slicing processing on the target data according to the size of a preset window to obtain a plurality of window data;

the processing module is further used for drawing a waveform diagram corresponding to each window data, and performing image processing on the waveform diagram according to a preset image processing mode to obtain image data corresponding to each window data;

The determining module is used for inputting a plurality of image data corresponding to the window data into an identification model corresponding to each pre-trained preset time sequence mode, identifying the similarity between each image data in the plurality of image data and the preset time sequence mode by the identification model, and determining the image data with the highest similarity as the image data in the preset time sequence mode;

The determining module is further configured to search target image data with highest similarity from image data under various preset time sequence modes, and determine target window data corresponding to the target image data and a preset time sequence mode to which the target image data belongs as a recognition result of the current mode recognition;

The segmentation module is used for segmenting the target data by utilizing the target window data to obtain a plurality of segmentation fragments;

The processing module is further configured to, for a target segment other than the target window data in the plurality of segments, execute a sliding window slicing process on the target data again with the target segment as target data when the length of the target segment is greater than the preset window size.

A third aspect of the application provides a pattern recognition device for time series data comprising a memory, a processor and a computer program stored on said memory and executable on the processor, said processor implementing the steps of any one of the methods provided in the first aspect of the application when said program is executed.

A fourth aspect of the application provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods provided in the first aspect of the application.

According to the pattern recognition method, device and equipment for time sequence data provided by the application, a recognition model is established for each preset time sequence mode, so that the recognition model of each mode can learn the characteristics or rules of that mode; this reduces the risk of a recognition model overfitting specific samples and improves the recognition accuracy. In addition, sliding window slicing processing is performed on the target data to obtain a plurality of window data, a waveform diagram corresponding to each window data is drawn, and image processing is performed on the waveform diagram according to a preset image processing mode to obtain the image data corresponding to each window data. In this way, first, the time resolution can be better controlled through sliding window slicing, information loss is reduced, and the image data retains the time sequence information of the time sequence data. Second, the problem that the amplitudes of different types of data in one-dimensional time sequence data are not on the same scale is solved; in other words, converting the one-dimensional time sequence data into image data avoids model failure caused by a mismatch between data amplitude and time scale when different types of data are recognized, which improves the recognition accuracy. The pattern of the time sequence data can therefore be recognized efficiently and accurately, fault diagnosis and health management can be guided based on the recognition result, early faults can be found, and the production efficiency and service life of chemical equipment can be further improved.

Drawings

FIG. 1 is a flowchart of a method for pattern recognition of time series data according to an embodiment of the present application;

FIG. 2 is a waveform diagram illustrating various timing patterns according to an exemplary embodiment of the present application;

FIG. 3 is a schematic diagram of an implementation of partitioning target data, according to an exemplary embodiment of the present application;

FIG. 4 is a flowchart of a second embodiment of a method for pattern recognition of time-series data according to the present application;

FIG. 5 is a flowchart of a third embodiment of a method for pattern recognition of time-series data according to the present application;

FIG. 6 is a schematic diagram of a generator according to an exemplary embodiment of the present application;

FIG. 7 is a schematic diagram of an arbiter (discriminator) according to an exemplary embodiment of the present application;

FIG. 8 is a schematic diagram of a twinning (Siamese) network according to an exemplary embodiment of the present application;

FIG. 9 is a hardware configuration diagram of a pattern recognition device for time-series data according to the present application;

FIG. 10 is a schematic structural diagram of a pattern recognition device for time-series data according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when", "upon" or "in response to a determination", depending on the context.

The chemical industry is a pillar industry of the national economy. With production demands rising and the level of industrialization increasing, the chemical industry faces increasingly complex operation and management challenges. Against this background, research on fault diagnosis and health management of chemical equipment has become one of the key means of ensuring safe industrial production and improving equipment reliability.

Currently, most chemical equipment has automatic control and data monitoring capabilities; that is, most equipment implements a distributed control system (Distributed Control System, abbreviated as DCS), so that fault diagnosis and health management can be performed by using the DCS data (time sequence data) provided by the distributed control system. For example, a mechanism model based on expert experience mainly uses high-frequency vibration data to extract fault characteristics of chemical equipment and then designs related alarm thresholds to perform fault diagnosis. This method depends heavily on the knowledge and experience of experts, and the model has to be designed and configured separately for different equipment. Moreover, high-frequency vibration data does not belong to the DCS data category and requires additional sensors and acquisition equipment, so the method suffers from high data acquisition cost, high modeling cost and complicated configuration. For another example, the data-driven approach builds a classification model or a detection model from a large amount of data in combination with machine learning technology, and it has been widely applied in the chemical industry. However, classification models such as SVM, random forest and KNN depend heavily on sample data, and an actual production environment almost never provides a complete set of negative samples to support classification modeling. Detection models such as 3-sigma, isolation forest and principal component analysis (PCA) are usually built on distance, correlation and similar information; these modeling methods place strict requirements on boundary conditions and data distribution, and missed alarms and false alarms often occur because of unreasonable threshold settings.

The above methods focus on the distribution, boundary or threshold of the data and ignore changes in the form or pattern of the data. In actual production, it often happens that some data features do not exceed the set threshold or boundary but deviate from the normal level in form or pattern, and such deviations carry fault information about the equipment. It is therefore necessary to perform time sequence pattern recognition on the data and then perform fault diagnosis based on the recognition result.

Common pattern recognition methods for time series data include symbolization-based methods such as PAA and SAX, distance-measurement-based methods such as 1NN and DTW, and similarity-analysis-based methods of the shapelet type. These methods usually place explicit requirements on the time scale and the numerical scale, can only analyze certain types of data, lack generality and are difficult to apply in actual production.

Therefore, it is desirable to provide a method, apparatus and device for identifying patterns of time series data, so as to identify patterns of time series data efficiently and accurately.

The application provides a method, a device and equipment for identifying a time sequence data mode, which are used for efficiently and accurately identifying the time sequence data mode, so that early fault symptoms are found based on an identification result, and predictive maintenance is realized.

According to the pattern recognition method, device and equipment for time sequence data provided by the application, a recognition model is established for each preset time sequence mode, so that the recognition model of each mode can learn the characteristics or rules of that mode; this reduces the risk of a recognition model overfitting specific samples and improves the recognition accuracy. In addition, sliding window slicing processing is performed on the target data to obtain a plurality of window data, a waveform diagram corresponding to each window data is drawn, and image processing is performed on the waveform diagram according to a preset image processing mode to obtain the image data corresponding to each window data. In this way, first, the time resolution can be better controlled through sliding window slicing, information loss is reduced, and the image data retains the time sequence information of the time sequence data. Second, the problem that the amplitudes of different types of data in one-dimensional time sequence data are not on the same scale is solved; in other words, converting the one-dimensional time sequence data into image data avoids model failure caused by a mismatch between data amplitude and time scale when different types of data are recognized, which improves the recognition accuracy. The pattern of the time sequence data can therefore be recognized efficiently and accurately, fault diagnosis and health management can be guided based on the recognition result, early faults can be found, and the production efficiency and service life of chemical equipment can be further improved.

Specific examples are given below to describe the technical solution of the present application in detail.

Fig. 1 is a flowchart of a pattern recognition method for time series data according to an embodiment of the present application. Referring to fig. 1, the method provided in this embodiment may include:

S101, taking time sequence data to be identified as target data, and carrying out sliding window slicing processing on the target data according to a preset window size to obtain a plurality of window data.

Specifically, the time sequence data refers to a data set arranged in time order, in which each data point carries time information. In this embodiment, the time sequence data may be DCS data of chemical equipment.

In specific implementation, sliding window slicing processing can be performed on target data according to a preset window size and a preset sliding step length.

It should be noted that the preset window size refers to the number of time steps contained in each window data, and the preset sliding step length refers to the number of time steps by which the window advances on each slide. The preset window size and the preset sliding step length are set according to actual needs, and their specific values are not limited in this embodiment. For example, in one embodiment, the preset window size is j and the preset sliding step length is k.

When the preset window size is j and the preset sliding step length is k, sliding window slicing processing is performed on the target data to obtain a plurality of window data. Each window data may be denoted as x_smooth_(i,j,k), where (i, j, k) is the window information of the window data, i is the sliding number (a non-negative integer, i.e., i = 0, 1, 2, 3, ...), j is the preset window size, and k is the preset sliding step length.
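The slicing in S101 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function name `slide_windows` is hypothetical, and the sketch assumes that window i covers the samples from position i×k to i×k+j and that a trailing partial window is dropped, neither of which is stated in the text.

```python
from typing import List, Sequence, Tuple

# One window: its window information (i, j, k) plus the sliced values.
Window = Tuple[Tuple[int, int, int], List[float]]

def slide_windows(target: Sequence[float], j: int, k: int) -> List[Window]:
    """Slice `target` into windows of size j, advancing k samples per slide."""
    if j <= 0 or k <= 0:
        raise ValueError("window size j and sliding step k must be positive")
    windows: List[Window] = []
    i = 0
    # Assumption: window i covers target[i*k : i*k + j]; partial windows are dropped.
    while i * k + j <= len(target):
        start = i * k
        windows.append(((i, j, k), list(target[start:start + j])))
        i += 1
    return windows

data = [float(v) for v in range(10)]
wins = slide_windows(data, j=4, k=2)   # windows start at samples 0, 2, 4 and 6
```

Keeping (i, j, k) next to the values is what later allows a piece of image data to be traced back to the window data it was drawn from.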

Optionally, in one possible implementation manner of the present application, before the sliding window slicing processing is performed on the target data, the method further includes:

smoothing the target data.

Specifically, the manner of smoothing is set according to actual needs, and this is not limited in this embodiment. For example, in one embodiment, the target data may be smoothed by locally weighted regression.

According to the method provided by the application, through smoothing the target data, noise, fluctuation or irregularity in the target data can be reduced or eliminated, so that the data becomes smoother and more stable.
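As an illustration of locally weighted regression, the sketch below fits a weighted straight line around every sample and keeps the fitted value at that sample. It is a simplified variant written for this explanation: the name `lowess_smooth`, the `frac` parameter, the tricube-style weights and the equally spaced sampling are all assumptions, and the robustness iterations of a full LOWESS procedure are omitted.

```python
from typing import List, Sequence

def lowess_smooth(y: Sequence[float], frac: float = 0.3) -> List[float]:
    """Smooth equally spaced samples with a locally weighted linear fit."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    span = max(3, min(n, round(frac * n)))        # samples used by each local fit
    smoothed: List[float] = []
    for t in range(n):
        lo = max(0, min(t - span // 2, n - span))  # neighbourhood start index
        xs = [s - t for s in range(lo, lo + span)]  # x centred on the sample t
        ys = [y[s] for s in range(lo, lo + span)]
        dmax = max(abs(x) for x in xs)
        # Tricube-style weights; dividing by dmax + 1 keeps every weight positive.
        w = [(1 - (abs(x) / (dmax + 1)) ** 3) ** 3 for x in xs]
        sw = sum(w)
        sx = sum(wi * x for wi, x in zip(w, xs))
        sy = sum(wi * v for wi, v in zip(w, ys))
        sxx = sum(wi * x * x for wi, x in zip(w, xs))
        sxy = sum(wi * x * v for wi, x, v in zip(w, xs, ys))
        denom = sw * sxx - sx * sx
        if denom == 0:
            smoothed.append(sy / sw)
        else:
            slope = (sw * sxy - sx * sy) / denom
            # Because x is centred on t, the intercept is the fitted value at t.
            smoothed.append((sy - slope * sx) / sw)
    return smoothed

spiky = [0.0] * 10
spiky[5] = 10.0
calm = lowess_smooth(spiky)   # the isolated spike at index 5 is damped
```

A local linear fit leaves genuinely linear data unchanged while damping isolated spikes, which suits trend-like patterns such as rising or falling segments.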

S102, drawing a waveform diagram corresponding to each window data, and performing image processing on the waveform diagram according to a preset image processing mode to obtain image data corresponding to each window data.

In specific implementation, for each window data, the data point count (that is, the position of each data point within the window) can be taken as the abscissa and the amplitude as the ordinate, and a waveform diagram corresponding to the window data is drawn.

Further, the preset image processing mode is set according to actual needs, and in this embodiment, this is not limited. For example, in one possible implementation manner, the preset image processing manner includes at least one of the following image processing manners: graying, binarizing, sharpening and smoothing.

Wherein, the graying can preserve the brightness information of the image, and reduce the complexity of the image without color information. Binarization can convert an image into a form with only two colors of black and white, and the edge information of the image is enhanced. The sharpening process may enhance the sharpness and detail of the image. The smoothing process may make the image smoother and more stable.

Specifically, after the waveform diagram is subjected to image processing, a processed image is obtained, and the image data corresponding to each window data can then be obtained from the processed image. It should be noted that the image data corresponding to each window data may be represented by a two-dimensional matrix. For example, when the size of the processed image is m×n, the image data corresponding to the window data may be represented by an m×n two-dimensional matrix T, where the element Tij of the matrix represents the pixel value of the pixel in the i-th row and j-th column of the processed image.

Further, referring to the foregoing description, for convenience of explanation, the image data corresponding to each window data is denoted as imag_(i,j,k), and the image data is stored in imag_group with its window information retained. After this processing, the one-dimensional time sequence data is converted into two-dimensional picture data, so that the form (shape) information of the time sequence data is retained, and the problem that the amplitudes of different types of data in the one-dimensional time sequence data are not on the same scale is solved.
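The conversion from window data to an m×n matrix can be illustrated without a plotting library. The sketch below is a stand-in for drawing a waveform diagram followed by graying and binarizing, not the processing chain of the application: the function name `window_to_image` is hypothetical, and it assumes that the vertical axis is scaled to the minimum and maximum of each window, as an auto-scaled plot would be.

```python
from typing import List, Sequence

def window_to_image(values: Sequence[float], m: int = 16, n: int = 32) -> List[List[int]]:
    """Rasterise one window into an m x n binary matrix T (1 = waveform pixel)."""
    if m < 2 or n < 2 or not values:
        raise ValueError("need m >= 2, n >= 2 and a non-empty window")
    lo, hi = min(values), max(values)
    image = [[0] * n for _ in range(m)]
    prev_row = None
    for col in range(n):
        # Position inside the window for this pixel column, linearly interpolated.
        pos = col * (len(values) - 1) / (n - 1)
        left = int(pos)
        right = min(left + 1, len(values) - 1)
        v = values[left] + (values[right] - values[left]) * (pos - left)
        # Assumption: per-window min-max scaling; a flat window sits mid-height.
        frac = (hi - v) / (hi - lo) if hi > lo else 0.5
        row = min(m - 1, max(0, round(frac * (m - 1))))   # row 0 is the top
        # Fill the rows between neighbouring columns so the line stays connected.
        first, last = (row, row) if prev_row is None else sorted((prev_row, row))
        for r in range(first, last + 1):
            image[r][col] = 1
        prev_row = row
    return image

wave = [0.0, 3.0, 1.0, 4.0, 2.0]
small = window_to_image(wave)
large = window_to_image([v * 1024 for v in wave])   # same shape, larger amplitude
```

In this example `small` and `large` are identical matrices, which illustrates the point made above: once a window is drawn as an image, only its shape is left and the amplitude scale of the original signal no longer matters.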

S103, inputting a plurality of image data corresponding to the window data into an identification model corresponding to each pre-trained preset time sequence mode, identifying the similarity between each image data in the plurality of image data and the preset time sequence mode by the identification model, and determining the image data with the highest similarity as the image data in the preset time sequence mode.

For example, in one embodiment, the obtained plurality of window data includes Q window data (for convenience of illustration, the Q window data are labeled A, B, C, ..., Q in sequence), and the image data corresponding to each window data is an m×n matrix, so that the plurality of image data corresponding to the plurality of window data form an m×n×Q matrix.

Specifically, the preset time sequence modes covered by the pre-trained recognition models are set according to actual needs, and this is not limited in this embodiment. In one possible implementation, the preset time sequence modes include at least two of the following: ascending trend, descending trend, periodic trend, upward step trend, downward step trend and steady trend. Correspondingly, the pre-trained recognition models corresponding to the preset time sequence modes include at least two of the following: a recognition model corresponding to the ascending trend, a recognition model corresponding to the descending trend, a recognition model corresponding to the periodic trend, a recognition model corresponding to the upward step trend, a recognition model corresponding to the downward step trend and a recognition model corresponding to the steady trend. The following description takes as an example the case in which the pre-trained recognition models include a recognition model 1 corresponding to the ascending trend, a recognition model 2 corresponding to the descending trend, a recognition model 3 corresponding to the periodic trend, a recognition model 4 corresponding to the upward step trend, a recognition model 5 corresponding to the downward step trend, and a recognition model 6 corresponding to the steady trend.

It should be noted that fig. 2 is a waveform diagram of various preset time sequence modes according to an exemplary embodiment of the present application. Referring to fig. 2, graph A in fig. 2 is a waveform diagram corresponding to the ascending trend; as can be seen from graph A, the ascending trend means that the data amplitude gradually increases with time. Graph B in fig. 2 is a waveform diagram corresponding to the descending trend; as can be seen from graph B, the descending trend means that the data amplitude gradually decreases with time. Graph C in fig. 2 is a waveform diagram corresponding to the upward step trend; as can be seen from graph C, the upward step trend means that the data amplitude increases rapidly after a brief period of stagnation. Graph D in fig. 2 is a waveform diagram corresponding to the downward step trend; as can be seen from graph D, the downward step trend means that the data amplitude decreases rapidly after a brief period of stagnation. Graph E in fig. 2 is a waveform diagram corresponding to the periodic trend; as can be seen from graph E, the periodic trend means that the data amplitude rises and falls periodically with time. Graph F in fig. 2 is a waveform diagram corresponding to the steady trend; as can be seen from graph F, the steady trend means that the data amplitude does not change significantly with time.
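To make the six descriptions concrete, the sketch below builds one idealised, noise-free series per mode. These formulas are illustrative assumptions only; they are not taken from fig. 2 and they are not the sample generation model of the application. The function name `make_patterns` and the dictionary keys are hypothetical.

```python
import math
from typing import Dict, List

def make_patterns(n: int = 50) -> Dict[str, List[float]]:
    """Return one idealised series of n points for each of the six modes."""
    t = [s / (n - 1) for s in range(n)]        # normalised time in [0, 1]
    return {
        "ascending": list(t),                                         # gradual rise
        "descending": [1.0 - x for x in t],                           # gradual fall
        "upward_step": [0.0 if x < 0.5 else 1.0 for x in t],          # flat, then jump up
        "downward_step": [1.0 if x < 0.5 else 0.0 for x in t],        # flat, then jump down
        "periodic": [math.sin(2 * math.pi * 3 * x) for x in t],       # three full cycles
        "steady": [0.5 for _ in t],                                   # no significant change
    }

patterns = make_patterns()
```

Series of this kind are convenient for checking a slicing or rendering step by eye before real DCS data is used.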

Specifically, in this step, in combination with the above example, the Q image data corresponding to the Q window data are input into the recognition model 1 corresponding to the ascending trend (i.e., an m×n×Q matrix is input into the recognition model 1). The recognition model 1 then outputs Q similarities, each of which characterizes the similarity between one image data and the ascending trend. The recognition model 1 also determines the image data with the highest of the Q similarities as the image data in the ascending trend. For example, in one embodiment, in combination with the foregoing example, the recognition model 1 identifies the image data A as the image data with the highest similarity among the Q image data, and at this time, the image data A is determined to be the image data in the ascending trend.

Similarly, after the recognition model 2, the recognition model 3, the recognition model 4, the recognition model 5 and the recognition model 6 respectively recognize the plurality of image data corresponding to the plurality of window data, the image data B is determined to be the image data in the descending trend, the image data C is determined to be the image data in the periodic trend, the image data D is determined to be the image data in the upward step trend, the image data E is determined to be the image data in the downward step trend, and the image data F is determined to be the image data in the steady trend.

S104, searching target image data with highest similarity from the image data under various preset time sequence modes, and determining target window data corresponding to the target image data and a preset time sequence mode to which the target image data belongs as a recognition result of the current mode recognition.

Specifically, in this step, in combination with the above example, the target image data having the highest similarity is searched for from among the image data A, the image data B, the image data C, the image data D, the image data E, and the image data F. For example, in one embodiment, if the found target image data is the image data B, the target window data x_smooth_(i1,j1,k1) corresponding to the image data B and the descending trend to which the image data B belongs are determined as the recognition result of the current pattern recognition. In other words, the recognition result of the current pattern recognition is: the time sequence mode of the window data x_smooth_(i1,j1,k1) is the descending trend.

It should be noted that, referring to the foregoing description, the image data corresponding to each window data retains the window information, and when the target image data is found, the corresponding target window data (the window information of the target window data and the window information of the target image data are identical) may be determined based on the window information of the target image data.
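The two-stage selection of S103 and S104 can be sketched as below. The recognition models are replaced by hypothetical scoring callables that map one image to a similarity, and the toy 1×1 "images" exist only to keep the example short; none of these names come from the application. The sketch also assumes, as the text implies, that similarities produced by different models are directly comparable.

```python
from typing import Callable, Dict, List, Tuple

WindowInfo = Tuple[int, int, int]            # (i, j, k) kept with every image
Image = List[List[int]]
Scorer = Callable[[Image], float]            # hypothetical model: image -> similarity

def recognise_once(imag_group: Dict[WindowInfo, Image],
                   models: Dict[str, Scorer]) -> Tuple[WindowInfo, str, float]:
    """Return (window information, mode, similarity) of the best overall match."""
    best_per_mode: Dict[str, Tuple[WindowInfo, float]] = {}
    # S103: each mode's model keeps only the image it scores highest.
    for mode, score in models.items():
        info = max(imag_group, key=lambda w: score(imag_group[w]))
        best_per_mode[mode] = (info, score(imag_group[info]))
    # S104: among those per-mode winners, keep the one with the highest similarity.
    mode = max(best_per_mode, key=lambda name: best_per_mode[name][1])
    info, similarity = best_per_mode[mode]
    return info, mode, similarity

# Toy stand-ins: 1x1 "images" and arithmetic scorers, for illustration only.
imag_group = {(0, 4, 2): [[1]], (1, 4, 2): [[5]], (2, 4, 2): [[9]]}
models = {
    "ascending": lambda img: img[0][0] / 10,
    "descending": lambda img: 1 - img[0][0] / 20,
}
result = recognise_once(imag_group, models)
```

Because the window information (i, j, k) is the dictionary key, the winning image maps straight back to its target window data.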

S105, dividing the target data by using the target window data to obtain a plurality of divided fragments.

Specifically, fig. 3 is a schematic diagram illustrating implementation of the split target data according to an exemplary embodiment of the present application. Referring to fig. 3, in the example shown in fig. 3, when dividing target data by using target window data, a start point and an end point of the target window data may be used as division points, and the target data may be divided by using the division points to obtain a plurality of division fragments. For example, in the example shown in fig. 3, after dividing the target data by the target window data, the target data is divided into three divided pieces, which are respectively: segment 1, target window data, and segment 2.
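A minimal sketch of this split is given below. It assumes that the target window with window information (i, j, k) starts at sample i×k of the target data, which is an indexing convention chosen for illustration; the function name `split_by_window` is hypothetical.

```python
from typing import List, Sequence, Tuple

def split_by_window(target: Sequence[float], i: int, j: int, k: int
                    ) -> Tuple[List[float], List[float], List[float]]:
    """Split `target` at the start and end points of the window (i, j, k)."""
    start = i * k            # assumed start point of the target window
    end = start + j          # end point (exclusive)
    # Segment 1, the target window data itself, and segment 2.
    return list(target[:start]), list(target[start:end]), list(target[end:])

data = [float(v) for v in range(10)]
segment_1, target_window, segment_2 = split_by_window(data, i=1, j=4, k=2)
```

When the target window lies at the very beginning or end of the target data, one of the two outer segments is simply empty.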

S106, regarding target segmentation fragments except for the target window data in the plurality of segmentation fragments, when the length of the target segmentation fragments is larger than the preset window size, taking the target segmentation fragments as target data, and executing the process of sliding window slicing processing on the target data again.

It should be noted that, referring to the foregoing description, among the plurality of segments, the timing pattern of the target window data has already been determined, so the timing patterns of the other segments still need to be determined. Therefore, in this step, the timing patterns are further determined for the target segments other than the target window data among the plurality of segments.

In a specific implementation, for each target segment, if the length of the target segment is less than or equal to the preset window size, the target segment is too short for its timing pattern to be analyzed and is not processed. When the length of the target segment is greater than the preset window size, the target segment is used as the target data, and the sliding window slicing processing is performed on it again to further determine the timing pattern.

Specifically, in this step, continuing the above example, of segment 1 and segment 2 only segment 1 has a length greater than the preset window size, so the sliding window slicing processing is performed again with segment 1 as the target data.
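Steps S105 and S106 together form a recursion, which can be sketched as follows. This is only an illustration under stated assumptions: the recognition models are replaced by a hypothetical stand-in, `toy_recognize`, that picks the window with the largest change, and the names `recognize_all`, `Recognizer`, and `Result` are not from the application.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical recognizer signature: given a data segment and the window size,
# return (start offset of the best window inside the segment, pattern label).
Recognizer = Callable[[Sequence[float], int], Tuple[int, str]]
Result = Tuple[int, int, str]  # (start, end exclusive, pattern label)


def recognize_all(data: Sequence[float], window_size: int,
                  recognize: Recognizer, offset: int = 0) -> List[Result]:
    """Label the best window, then recurse into segments longer than the window."""
    start, pattern = recognize(data, window_size)
    end = start + window_size
    results = [(offset + start, offset + end, pattern)]
    # Segment 1 precedes the target window; segment 2 follows it.
    for seg_offset, segment in ((0, data[:start]), (end, data[end:])):
        if len(segment) > window_size:  # segments not longer than the window are skipped
            results += recognize_all(segment, window_size, recognize,
                                     offset + seg_offset)
    return sorted(results)


def toy_recognize(segment: Sequence[float], window_size: int) -> Tuple[int, str]:
    """Stand-in for the recognition models: pick the window with the largest change."""
    best_start, best_change = 0, 0.0
    for s in range(len(segment) - window_size + 1):
        change = segment[s + window_size - 1] - segment[s]
        if abs(change) > abs(best_change):
            best_start, best_change = s, change
    if best_change > 0:
        return best_start, "upward trend"
    if best_change < 0:
        return best_start, "downward trend"
    return best_start, "steady trend"


series = [5.0] * 4 + [0.0, 4.0, 8.0] + [8.0] * 4
labels = recognize_all(series, 3, toy_recognize)
for item in labels:
    print(item)
```

The `offset` argument keeps the reported window positions relative to the original time series rather than to the segment currently being processed.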

In the pattern recognition method for time series data provided by this embodiment, a recognition model is established for each type of preset time sequence mode, so that the recognition models corresponding to the various preset time sequence modes can learn different characteristics or rules. This reduces the risk of a recognition model overfitting to specific samples and improves the recognition accuracy.

Fig. 4 is a flowchart of a second embodiment of the pattern recognition method for time series data according to the present application. Referring to fig. 4, in the method provided in this embodiment, on the basis of the foregoing embodiment, the training process of the recognition model corresponding to each type of preset time sequence mode includes:

S401, generating sample data corresponding to each type of preset time sequence mode by using a sample generation model.

Specifically, the sample generation model may generate sample data for each type of preset time sequence mode. For example, continuing the above example, the sample generation model may generate sample data corresponding to the upward trend to obtain sample data set 1; sample data corresponding to the downward trend to obtain sample data set 2; sample data corresponding to the upward step trend to obtain sample data set 3; sample data corresponding to the downward step trend to obtain sample data set 4; sample data corresponding to the periodic trend to obtain sample data set 5; and sample data corresponding to the steady trend to obtain sample data set 6.

For example, in one embodiment, the sample generation model may be a model built on a generative adversarial network, a variational autoencoder, an autoregressive model, an autoencoder, or the like. This is not limited in this embodiment.

Optionally, in one possible implementation of the present application, fig. 5 is a flowchart of a third embodiment of the pattern recognition method for time series data provided by the present application. Referring to fig. 5, in the method provided in this embodiment, on the basis of the foregoing embodiment, generating sample data corresponding to each type of preset time sequence pattern by using a sample generation model includes:

S501, for each type of preset time sequence mode, acquiring real data corresponding to that preset time sequence mode, where the time sequence mode of the real data is that preset time sequence mode.

Specifically, continuing the above example, the preset time sequence modes include an upward trend, a downward trend, a periodic trend, an upward step trend, a downward step trend, and a steady trend. In this step, the basic form of each preset time sequence mode is defined, and the corresponding real data is acquired. It should be noted that the time sequence mode of the real data is the preset time sequence mode, and that the real data may be actually collected data or virtual data produced by simulation software, which is not limited in this embodiment.

S502, training the generator in a generative adversarial network with the real data through an adversarial game mechanism, and, when the discriminator in the generative adversarial network determines that the data generated by the generator meets an evaluation standard, taking the generative adversarial network as the sample generation model for generating sample data corresponding to the preset time sequence mode.

Specifically, the generative adversarial network includes a generator and a discriminator. Fig. 6 is a schematic diagram of the structure of the generator shown in an exemplary embodiment of the present application, and fig. 7 is a schematic diagram of the structure of the discriminator shown in an exemplary embodiment of the present application. Referring to fig. 6, the generator includes an input layer (dense_1_input in fig. 6), a hidden layer 1 (dense_1 in fig. 6), a hidden layer 2 (dense_2 in fig. 6), and an output layer (dense_3 in fig. 6).

The input layer is a fully connected layer (Dense) with 1 neuron, which receives an input vector of dimension 1. Hidden layer 1 is a fully connected layer with 256 neurons that uses the LeakyReLU activation function to introduce nonlinearity. Hidden layer 2 is a fully connected layer with 512 neurons that also uses the LeakyReLU activation function. The output layer is a fully connected layer with 1 neuron that uses a linear activation function (not shown in the figure).

Further, referring to fig. 7, the discriminator includes an input layer (dense_4_input in fig. 7), a hidden layer 1 (dense_4 in fig. 7), a hidden layer 2 (dense_5 in fig. 7), and an output layer (dense_6 in fig. 7).

The input layer is a fully connected layer with 1 neuron, which receives an input vector of dimension 1. Hidden layer 1 is a fully connected layer with 512 neurons that uses the LeakyReLU activation function to introduce nonlinearity. Hidden layer 2 is a fully connected layer with 256 neurons that also uses the LeakyReLU activation function. The output layer is a fully connected layer with 1 neuron that uses a Sigmoid activation function (not shown in the figure).

In a specific implementation, the generator and the discriminator are stacked in sequence by linear stacking, with the discriminator set to be untrainable in the stacked network. The real data is then used as the real input samples, and noise data drawn from a normal distribution with a mean of 0 and a standard deviation of 0.05 is introduced as the noise samples. The number of training iterations is set to 100 and the training step size to 64; the discriminator and the generative adversarial network are compiled with a binary cross-entropy loss function and an Adam optimizer. During training, the generator learns to generate sample data from the noise samples, and the discriminator learns to distinguish the real data from the sample data generated by the generator; the generator is trained so that the sample data it generates is judged to be real data as often as possible.
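For reference, the binary cross-entropy objectives that such a setup typically optimizes can be written as below. This is the standard formulation for a generator trained through a frozen discriminator and is given here as an assumption; the application does not state the loss expressions explicitly. Here x denotes a real sample, z a noise sample, G the generator, and D the discriminator.

```latex
z \sim \mathcal{N}\left(0,\ 0.05^{2}\right)

% Discriminator: label real data as 1 and generated data as 0
\mathcal{L}_{D} = -\,\mathbb{E}_{x}\left[\log D(x)\right]
                  -\,\mathbb{E}_{z}\left[\log\left(1 - D(G(z))\right)\right]

% Generator (trained through the stacked network with D frozen):
% generated data is labelled as 1
\mathcal{L}_{G} = -\,\mathbb{E}_{z}\left[\log D(G(z))\right]
```

Minimizing the generator loss drives D(G(z)) toward 1, which corresponds to the generated sample data being judged as real data as often as possible.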

S503, generating sample data corresponding to the preset time sequence mode by using the sample generation model.

Specifically, after the model is trained, sample data can be generated based on the model. For example, in one possible implementation, for each type of preset timing pattern, 100 sets of sample data are generated, resulting in a sample data set corresponding to the preset timing pattern.

The method provided by the application uses sample data generated by a generative adversarial network, which not only ensures that the sample data and the real data follow the same distribution, but also improves the diversity of the data, expands the original data set, and reduces the labeling cost.

Optionally, in a possible implementation of the present application, the generative adversarial network may be replaced by another network structure, such as a variational autoencoder or an autoregressive model, which can likewise generate sample data corresponding to the preset time sequence mode.

S402, drawing a waveform diagram corresponding to the sample data, and performing image processing on the waveform diagram corresponding to the sample data according to the preset image processing mode to obtain training samples corresponding to each type of preset time sequence mode.

In a specific implementation, for each piece of sample data, a waveform diagram corresponding to the sample data may be drawn with the data amount as the abscissa and the amplitude as the ordinate.

Further, referring to the foregoing description, the preset image processing manner includes at least one of the following image processing manners: graying, binarizing, sharpening and smoothing.

Specifically, after image processing is performed on the waveform diagram corresponding to the sample data, a processed waveform diagram is obtained, and the corresponding training sample is obtained from the processed waveform diagram. Referring to the foregoing description, each training sample is a two-dimensional matrix whose elements are the pixel values of the corresponding pixels in the processed waveform diagram.
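The conversion from sample data to a two-dimensional pixel matrix can be illustrated with a simplified sketch. It assumes a binarized image with one column per data point and marks a single pixel per column instead of drawing connected line segments, so it is a stand-in for plotting a waveform diagram and then applying the preset image processing, not the procedure of the application itself. The function name `rasterize` and the image height are hypothetical.

```python
from typing import List, Sequence


def rasterize(series: Sequence[float], height: int = 8) -> List[List[int]]:
    """Render a series as a binary image: one column per point, row 0 at the top."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    image = [[0] * len(series) for _ in range(height)]
    for col, value in enumerate(series):
        # Larger amplitudes map to rows nearer the top of the image.
        row = round((hi - value) / span * (height - 1))
        image[row][col] = 1
    return image


image = rasterize([0.0, 1.0, 2.0, 3.0], height=4)
for row in image:
    print("".join("#" if px else "." for px in row))
```

Because each series is scaled to its own minimum and maximum, series with very different magnitudes yield images of the same size and value range.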

S403, for each type of preset time sequence mode, training the recognition model corresponding to that preset time sequence mode by using the training samples corresponding to that mode as positive samples and the training samples corresponding to the other preset time sequence modes as negative samples, to obtain the trained recognition model corresponding to that preset time sequence mode; wherein the number of training samples contained in the positive samples is consistent with the number of training samples contained in the negative samples.

Specifically, the number of training samples included in the positive samples and the negative samples is set according to actual needs, and in this embodiment, this is not limited. For example, in one embodiment, the positive and negative samples each include 100 training samples.

For example, in one possible implementation manner, in this step, for the downward trend, 100 training samples corresponding to the downward trend are taken as positive samples, and 100 training samples are randomly extracted from the training samples corresponding to the upward trend, the periodic trend, the upward step trend, the downward step trend, and the steady trend to be taken as negative samples.
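The construction of balanced positive and negative samples for one preset time sequence mode can be sketched as follows. This is a minimal sketch in which training samples are represented by placeholder strings; the function name `build_training_sets`, the fixed random seed, and the sample identifiers are hypothetical.

```python
import random
from typing import Dict, List, Tuple

PATTERNS = ["upward trend", "downward trend", "periodic trend",
            "upward step trend", "downward step trend", "steady trend"]


def build_training_sets(samples_by_pattern: Dict[str, List[str]], target: str,
                        n: int, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Take n positives from the target pattern and n random negatives from the rest."""
    positives = samples_by_pattern[target][:n]
    pool = [s for name, items in samples_by_pattern.items()
            if name != target for s in items]
    negatives = random.Random(seed).sample(pool, n)  # sampling without replacement
    return positives, negatives


# Placeholder identifiers stand in for the two-dimensional training samples.
samples = {name: [f"{name}#{i}" for i in range(100)] for name in PATTERNS}
positives, negatives = build_training_sets(samples, "downward trend", 100)
print(len(positives), len(negatives))
```

Repeating the call with each mode as `target` yields one balanced training set per recognition model.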

In a specific implementation, when the recognition model is trained with the positive samples and the negative samples, a twin network may be selected as the initial recognition model; for example, the SiameseNetwork model is selected as the initial recognition model for training. Fig. 8 is a schematic diagram of a twin network according to an exemplary embodiment of the present application. Referring to fig. 8, the network includes a feature extraction layer, a Euclidean distance calculation layer, and an output layer (not shown). The feature extraction layer uses a VGG16 model to extract features from the input; it maps an input positive sample to a one-dimensional feature vector f₀ and an input negative sample to a one-dimensional feature vector f₁. The Euclidean distance calculation layer calculates the Euclidean distance between f₁ and f₀, denoted distance, which represents the difference between the two vectors. The output layer is implemented by a fully connected layer, which processes distance with a Sigmoid function and outputs the similarity.

It should be noted that during training, a contrastive loss is used as the loss function. It pulls similar samples closer together in the feature space by minimizing the distance between them (for example, the Euclidean distance or the cosine distance), while maximizing the distance between dissimilar samples.
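The distance calculation and the loss can be illustrated with a short sketch. It uses one common margin-based form of the contrastive loss; the exact expression and the margin value of 1.0 are assumptions, since the application does not specify them, and the feature vectors are made-up values rather than VGG16 outputs.

```python
import math
from typing import Sequence


def euclidean_distance(f0: Sequence[float], f1: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f0, f1)))


def contrastive_loss(distance: float, similar: bool, margin: float = 1.0) -> float:
    """Margin-based contrastive loss for one sample pair (margin is an assumption)."""
    if similar:
        return distance ** 2                    # pull similar pairs together
    return max(margin - distance, 0.0) ** 2     # push dissimilar pairs past the margin


d = euclidean_distance([0.0, 0.0], [0.3, 0.4])
print(d, contrastive_loss(d, True), contrastive_loss(d, False))
```

A dissimilar pair whose distance already exceeds the margin contributes zero loss, so training concentrates on pairs that are still too close.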

The initial recognition model may also be a transfer learning model, a matching network, or the like, which is not limited in this embodiment.

In addition, training a recognition model for each type of preset time sequence mode by contrastive learning has the following advantages over training a single multi-classification time sequence mode recognition model. First, contrastive learning emphasizes the differences between categories by comparing the similarity between samples, whereas a multi-classification model may have decision boundaries between categories that are hard to determine, which makes it difficult to classify new samples accurately. Second, training a separate recognition model for each type of preset time sequence mode reduces the risk of overfitting, because each recognition model can learn different characteristics or rules during training rather than fitting specific samples; integrating the recognition results of the plurality of recognition models combines their advantages and improves generalization performance. Third, contrastive learning handles class-imbalanced data sets better: when some categories have few samples, a multi-classification time sequence mode recognition model trained on its own may perform poorly on these minority categories, whereas contrastive learning learns the distinction between categories by comparing similarities, thereby balancing the training sample distribution across categories. Finally, contrastive learning readily supports online learning: when a new time sequence mode is defined, only new sample pairs and the corresponding model need to be constructed, without retraining all of the models, which is useful for online learning or when existing models need to be applied in different fields.

The application also provides an embodiment of a pattern recognition device for time sequence data, corresponding to the embodiment of the pattern recognition method for time sequence data.

The embodiment of the pattern recognition device for time sequence data can be applied to pattern recognition equipment for time sequence data. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the pattern recognition equipment in which the device is located reading the corresponding computer program instructions from the nonvolatile memory into the memory. In terms of hardware, fig. 9 is a hardware structure diagram of the pattern recognition equipment for time sequence data provided by the present application; in addition to the processor, memory, network interface, and nonvolatile memory shown in fig. 9, the equipment in which the device is located generally also includes other hardware according to its actual functions, which is not described herein.

Fig. 10 is a schematic structural diagram of a pattern recognition device for time-series data according to an embodiment of the present application. Referring to fig. 10, the apparatus provided in this embodiment includes a processing module 1010, a determining module 1020, and a dividing module 1030; wherein,

The processing module 1010 is configured to take the time sequence data to be identified as target data, and perform sliding window slicing processing on the target data according to a preset window size to obtain a plurality of window data;

The processing module 1010 is further configured to draw a waveform diagram corresponding to each piece of window data, and perform image processing on the waveform diagram according to a preset image processing manner to obtain image data corresponding to each piece of window data;

The determining module 1020 is configured to input a plurality of image data corresponding to the plurality of window data into an identification model corresponding to each pre-trained preset time sequence mode, so that the identification model identifies a similarity between each image data in the plurality of image data and the preset time sequence mode, and determines the image data with the highest similarity as the image data in the preset time sequence mode;

The determining module 1020 is further configured to search for the target image data with the highest similarity among the image data under the various preset time sequence modes, and to determine the target window data corresponding to the target image data and the preset time sequence mode to which the target image data belongs as the recognition result of the current pattern recognition;

The dividing module 1030 is configured to divide the target data by using the target window data to obtain a plurality of segments;

The processing module 1010 is further configured to, for a target segment other than the target window data in the plurality of segments, execute a sliding window slicing process on the target data again with the target segment as target data when the length of the target segment is greater than the preset window size.

The apparatus provided in this embodiment may be used to execute the steps of the method embodiment shown in fig. 1, and the specific implementation principle and implementation process are similar, and are not described herein again.

Optionally, the training process of the recognition model corresponding to each type of preset time sequence mode includes:

generating sample data corresponding to each type of preset time sequence mode by using a sample generation model;

drawing a waveform diagram corresponding to the sample data, and performing image processing on the waveform diagram corresponding to the sample data according to the preset image processing mode to obtain training samples corresponding to each type of preset time sequence mode;

for each type of preset time sequence mode, training the recognition model corresponding to that preset time sequence mode by using the training samples corresponding to that mode as positive samples and the training samples corresponding to the other preset time sequence modes as negative samples, to obtain the trained recognition model corresponding to that preset time sequence mode; wherein the number of training samples contained in the positive samples is consistent with the number of training samples contained in the negative samples.

Optionally, the generating, using a sample generation model, sample data corresponding to each type of preset time sequence mode includes:

for each type of preset time sequence mode, acquiring real data corresponding to that preset time sequence mode, where the time sequence mode of the real data is that preset time sequence mode;

training the generator in a generative adversarial network with the real data through an adversarial game mechanism, and, when the discriminator in the generative adversarial network determines that the data generated by the generator meets an evaluation standard, taking the generative adversarial network as the sample generation model for generating sample data corresponding to the preset time sequence mode;

generating sample data corresponding to the preset time sequence mode by using the sample generation model.

Optionally, the processing module is further configured to perform smoothing processing on the target data before performing sliding window slicing processing on the target data.

Optionally, the preset image processing mode includes at least one of the following image processing modes: graying, binarizing, sharpening and smoothing.

Optionally, the recognition models corresponding to the pre-trained preset time sequence modes comprise at least two of the following recognition models: a recognition model corresponding to an upward trend, a recognition model corresponding to a downward trend, a recognition model corresponding to a periodic trend, a recognition model corresponding to an upward step trend, a recognition model corresponding to a downward step trend, and a recognition model corresponding to a steady trend.

With continued reference to fig. 9, the present application further provides a pattern recognition device for time series data, including a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of any one of the methods provided in the first aspect of the present application when executing the program.

Further, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of any of the methods provided in the first aspect of the present application.

The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.

For the device embodiments, since they essentially correspond to the method embodiments, reference is made to the description of the method embodiments for relevant details. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement the present application without creative effort.

The foregoing describes only preferred embodiments of the application and is not intended to limit it; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the application shall fall within its scope of protection.

Claims

1. A method for pattern recognition of time series data, the method comprising:

2. The method according to claim 1, wherein the training process of the recognition model corresponding to each type of preset time sequence pattern comprises:

3. The method according to claim 2, wherein generating sample data corresponding to each type of preset time sequence pattern using a sample generation model includes:

4. The method of claim 1, wherein prior to sliding window slicing the target data, the method further comprises:

smoothing the target data.

5. The method of claim 1, wherein the preset image processing mode comprises at least one of the following image processing modes: graying, binarizing, sharpening and smoothing.

6. The method according to claim 1, wherein the recognition models corresponding to the pre-trained preset time sequence patterns comprise at least two of the following recognition models: a recognition model corresponding to an upward trend, a recognition model corresponding to a downward trend, a recognition model corresponding to a periodic trend, a recognition model corresponding to an upward step trend, a recognition model corresponding to a downward step trend, and a recognition model corresponding to a steady trend.

7. A pattern recognition device for time series data, characterized by comprising a processing module, a determining module, and a dividing module; wherein,

8. The apparatus of claim 7, wherein the training process of the recognition model corresponding to each type of preset time sequence pattern comprises:

9. A pattern recognition device for time series data, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1-6 when executing the program.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any of claims 1-6.