CN109948646A

CN109948646A - Time series data similarity measurement method and measurement system

Info

Publication number: CN109948646A
Application number: CN201910067744.8A
Authority: CN
Inventors: 钱步月; 张先礼; 陆亮; 王谞动; 刘小彤; 李扬; 卫荣; 郑庆华
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2019-01-24
Filing date: 2019-01-24
Publication date: 2019-06-28

Abstract

The invention discloses a time series data similarity measurement method and measurement system. The method comprises the following steps: first, for the events in all time series data, a vector representation of each event is learned; second, the time at which each event occurs is mapped to a vector of the same dimension as the event vector and embedded into the event vector by vector addition; finally, the resulting event sequence representation is fed into a convolutional neural network for supervised learning, yielding a robust time series data similarity measurement model, which is then used to measure similarity. The invention represents time series data more reasonably and effectively, and can thereby improve the accuracy of time series data similarity measurement.

Description

Time series data similarity measurement method and measurement system

Technical field

The invention belongs to the technical field of time series data similarity, and in particular relates to a time series data similarity measurement method and measurement system.

Background technique

Data similarity measurement is a fundamental problem in data science, relevant to many application fields such as natural language processing, data retrieval and cohort analysis. Real-world scenarios contain large amounts of time series data, which are typically temporal, high-dimensional, heterogeneous, sparse, of unequal dimension and irregular.

At present, sequence representations based on one-hot vectors are commonly used. Because such representations are sparse and high-dimensional, they severely reduce the efficiency and accuracy of similarity computation. In addition, existing methods usually aggregate sequence events within specific time periods, ignoring both the relationships among the events in a sequence and the relationship between each event and its time of occurrence, which causes temporal information to be lost. In real-world scenarios, most events in a sequence change over time, and the correlations among events change accordingly; temporal information is therefore essential to the representation of an event sequence.

In summary, a new time series data similarity measurement method is needed.

Summary of the invention

The purpose of the present invention is to provide a time series data similarity measurement method and measurement system that solve the above problems. By representing time series data effectively, the invention remedies the defect of conventional methods, which ignore the relationships among events in the data and between each event and its time of occurrence, and can measure time series data similarity effectively.

In order to achieve the above objectives, the invention adopts the following technical scheme:

A time series data similarity measurement method, comprising the following steps:

Step 1, collect a preset number of sample time series; taking into account the relationships among the events in each sample time series and the relationship between each event and its time of occurrence, map the data from a high-dimensional space to a low-dimensional space and construct a representation of each sample time series;

Step 2, input the representations of all sample time series obtained in step 1 into a preset convolutional neural network model and perform feature extraction on each representation, obtaining the feature vector of each sample time series;

Step 3, from the feature vectors obtained in step 2, compute the similarity between sample time series based on a similarity matrix;

Step 4, train the preset convolutional neural network model using the feature vectors obtained in step 2 and the similarities obtained in step 3 until a preset convergence condition is met, obtaining a trained similarity measurement model;

Step 5, construct the representation of the time series data to be measured by the method of step 1 and input it into the trained similarity measurement model obtained in step 4, obtaining the similarity measurement result for the time series data to be measured.

Further, step 1 specifically includes:

Step 1.1, convert each sample time series data matrix into an event sequence in which events are arranged by their relative time and events occurring at the same time are ordered arbitrarily;

Step 1.2, map each event to a fixed-length vector using word2vec, obtaining for each event in the event sequence a vector representation that contains relationship information;

Step 1.3, use word2vec to map the time at which each event in the event sequence occurs to a vector of the same dimension as the event vector, obtaining a vector representation of the time of occurrence;

Step 1.4, embed the vector representation of each time of occurrence into the corresponding event representation by vector addition, obtaining the representation of the sample time series.

Further, in step 2, a feature vector of equal length is extracted from each sample time series by the convolutional neural network;

The convolutional neural network structure used includes:

a convolutional layer, for receiving the input data and outputting feature maps;

a sampling layer, for receiving the feature maps output by the convolutional layer and outputting the fixed-length feature vector of the time series data.

Further, in the convolutional neural network structure used:

The convolution of the convolutional layer is one-directional.

In the sampling layer, max sampling is used: the features of each feature map are sampled to a single value, finally giving a fixed-length vector representation of the time series data.

Further, step 4 specifically includes:

Step 4.1, merge the feature vectors with the computed similarity by concatenating them into a single vector.

Step 4.2, map the vector obtained in step 4.1 to a two-dimensional vector; the first dimension gives the magnitude for the similarity of the two time series being 1, and the second dimension gives the magnitude for the similarity of the two time series being 0;

Step 4.3, compute the similarity of the two time series by Softmax;

Step 4.4, construct a loss function and train the preset convolutional neural network, obtaining a trained similarity measurement model.

Further, in step 4.4, an objective function is first constructed and the loss of each iteration is computed from it; the partial derivative of the objective function with respect to each parameter is taken, and each parameter is updated in the direction of its negative derivative to reduce the loss, so that the parameters are continually optimized;

The loss function is formally expressed as:

L(S₁, S₂, y) = (y - M(S₁, S₂))²;

where S₁ and S₂ denote the input data pair.

Further, the similarity in step 3 is computed as follows: a matrix M is randomly initialized, and the similarity S of two time series is computed from their feature vectors X_a and X_b together with M.

A time series data similarity measurement system, comprising:

a time series data representation construction module, configured to collect a preset number of sample time series and, taking into account the relationships among the events in each sample time series and the relationship between each event and its time of occurrence, map the data from a high-dimensional space to a low-dimensional space and construct a representation of each sample time series;

a similarity measurement network module, configured to: perform feature extraction on the representation of each sample time series constructed by the time series data representation construction module, obtaining the feature vector of each sample time series; compute, from the obtained feature vectors and based on a similarity matrix, the similarity between sample time series; and train a preset convolutional neural network model using the obtained feature vectors and similarities until a preset convergence condition is met, obtaining a trained similarity measurement model;

Similarity measurement of the time series data to be measured is performed by the trained similarity measurement model.

Compared with prior art, the invention has the following advantages:

Existing methods consider only the features of the events themselves within specific time periods of a sequence and ignore their relationship with the time of occurrence. In contrast, the invention addresses the sparsity, high dimensionality, unequal dimension, temporal ordering and irregularity of time series data and provides a reasonable and effective time series data similarity measurement method. In the method of the invention: first, for the events in all time series data, a vector representation of each event is constructed, which uses distance in the vector space to express the relationships among a patient's events; second, the time at which each event occurs is mapped to a vector of the same dimension as the event vector and embedded into the event vector by vector addition; finally, the resulting event sequence representation is fed into a convolutional neural network for supervised learning, yielding a robust time series data similarity measurement model, which is then used to measure similarity. By representing time series data effectively, the invention remedies the defect of conventional methods, which ignore the relationships among events and between events and their times of occurrence, and solves the problem that time series data similarity could not be measured effectively; its representation of time series data is more reasonable and effective, so the accuracy of similarity measurement can be improved. In the invention, the obtained time series representation is dense and low-dimensional, which makes computation efficient; it embeds temporal information, which makes the representation more reasonable; and, on the basis of this representation, a convolutional neural network extracts features and computes similarity under end-to-end supervised training, achieving reasonable representation and efficient feature extraction together, so that measurement accuracy can be improved.

Further: 1) the sparse time series data matrix is converted into dense event vectors, removing sparsity; 2) high-dimensional event representations are mapped into a low-dimensional vector space by word2vec, achieving low dimensionality; 3) the final event sequence representation fuses the relationships among events with the relationships between events and their times of occurrence.

Further, unlike image analysis, where convolution is applied along both directions of the matrix, convolution on time series data is meaningful only along the time direction, so the convolution of the convolutional layer is one-directional.

Further, for the extracted feature vectors, the similarity between two feature vectors is computed based on a similarity matrix. Because swapping the positions of the two data items should leave the similarity unchanged, the similarity matrix is constrained to be symmetric.

Brief description of the drawings

Fig. 1 is a schematic flow diagram of the time series data similarity measurement method of the invention;

Fig. 2 is a schematic diagram of the dimension reduction of the event sequence matrix in the time series data similarity measurement method of the invention;

Fig. 3 is a schematic diagram of the convolutional neural network structure in the time series data similarity measurement method of the invention.

Specific embodiment

The invention is described in further detail below with reference to the drawings and specific embodiments.

Referring to Fig. 1, a time series data similarity measurement method of the invention comprises the following steps:

Step 1, construct an effective representation of the time series data. This requires making the sparse time series data dense, taking into account the relationships among the events in a sequence and the relationship between each event and its time of occurrence, and mapping the data from a high-dimensional space to a low-dimensional space.

Step 1 specifically includes the following steps:

Step 1.1, the time series data matrix is too sparse, and analysing it directly would require a very large amount of computation. The sparse matrix is therefore made dense and the high-dimensional matrix is reduced in dimension. Referring to Fig. 2, each time series data matrix is converted into an event sequence in which events are arranged by their relative time and events occurring at the same time are ordered arbitrarily;

Step 1.2, map each event to a fixed-length vector using word2vec, obtaining for each event in the sequence a vector representation that contains relationship information;

Step 1.3, use word2vec to map the time at which each event occurs to a vector of the same dimension as the event vector, obtaining a vector representation of the time of occurrence;

Step 1.4, embed the vector representation of each time of occurrence into the corresponding event representation by vector addition.

The time series data representation method of the invention has the following characteristics: 1) the sparse time series data matrix is converted into dense event vectors, so the representation is not sparse; 2) high-dimensional event representations are mapped into a low-dimensional vector space by word2vec, so the representation is low-dimensional; 3) the final event sequence representation fuses the relationships among events with the relationships between events and their times of occurrence.
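Step 1.1 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the sparse matrix has one row per event type and one column per time unit, with a 1 marking an occurrence; the function name `matrix_to_event_sequence`, the sample event names and this layout are hypothetical and not taken from the text.

```python
import random

def matrix_to_event_sequence(matrix, event_names, seed=0):
    """Flatten a sparse event-by-time 0/1 matrix into a time-ordered event list.

    matrix[i][t] == 1 means event i occurred at time unit t (assumed layout).
    Events sharing a time unit are shuffled, because their order is arbitrary.
    Returns a list of (event_name, time_unit) pairs.
    """
    rng = random.Random(seed)
    n_times = len(matrix[0]) if matrix else 0
    sequence = []
    for t in range(n_times):
        current = [event_names[i] for i, row in enumerate(matrix) if row[t]]
        rng.shuffle(current)  # arbitrary order among simultaneous events
        sequence.extend((name, t) for name in current)
    return sequence

# Toy matrix: 3 event types over 4 time units.
matrix = [
    [1, 0, 0, 1],  # "fever"
    [0, 0, 1, 0],  # "xray"
    [1, 0, 0, 0],  # "blood_test"
]
names = ["fever", "xray", "blood_test"]
H = matrix_to_event_sequence(matrix, names)
```

The result keeps only the occurrences, so its length equals the number of non-zero entries rather than the size of the matrix.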

Step 2, effectively extract the features of the time series data.

Features must be extracted from the representation of the time series data so that similarity can be measured effectively. In addition, because each time series contains a different number of events, the event sequence representations obtained in the previous step contain different numbers of vectors. To make pairwise similarity computation convenient, sequence representations of different lengths must be mapped to feature representations of equal length. The method uses an improved convolutional neural network to extract time series feature representations of equal length.

Referring to Fig. 3, the convolutional neural network structure of the invention includes:

1) Convolutional layer: unlike image analysis, where convolution is applied along both directions of the matrix, convolution on time series data is meaningful only along the time direction, so the convolution of the convolutional layer is one-directional. This layer has multiple kernel functions acting as filters, which extract different features and produce multiple feature maps.

2) Sampling layer: simple max sampling is used; the features of each feature map are sampled to a single value, finally giving a fixed-length vector representation of the time series data.
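The two layers above can be sketched in NumPy as follows. This is an illustrative sketch under assumptions: each kernel is taken to span the full embedding dimension so that it can slide only along time, no bias or activation is applied, and the names `conv_time` and `max_over_time` as well as all sizes are hypothetical.

```python
import numpy as np

def conv_time(E, kernels):
    """Convolve along the time axis only.

    E: (dim_v, L) sequence representation.
    kernels: (c, dim_v, w); each kernel covers the whole embedding dimension,
    so the only direction it can move in is time.
    Returns c feature maps with shape (c, L - w + 1).
    """
    c, dim_v, w = kernels.shape
    L = E.shape[1]
    out = np.empty((c, L - w + 1))
    for t in range(L - w + 1):
        window = E[:, t:t + w]  # (dim_v, w) slice of the sequence
        out[:, t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return out

def max_over_time(feature_maps):
    """Sample each feature map down to its single largest value."""
    return feature_maps.max(axis=1)

rng = np.random.default_rng(0)
kernels = rng.normal(size=(3, 4, 2))   # c=3 kernels, dim_v=4, width 2
shorter = rng.normal(size=(4, 5))      # sequence with 5 events
longer = rng.normal(size=(4, 9))       # sequence with 9 events
f_short = max_over_time(conv_time(shorter, kernels))
f_long = max_over_time(conv_time(longer, kernels))
```

Both sequences yield a vector with one entry per kernel, which is what makes sequences of different lengths comparable.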

Step 3, compute the similarity and train the network.

For the feature vectors extracted in the previous step, the similarity between two feature vectors is computed based on a similarity matrix. Because swapping the positions of the two data items should leave the similarity unchanged, the similarity matrix is constrained to be symmetric. The loss is computed from the resulting similarity, and the network is trained.
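The effect of the symmetry constraint can be checked with a small sketch. The text states only that the similarity is computed from the two feature vectors and a matrix M; the bilinear form x_a^T M x_b used here, and the symmetrization (W + W^T)/2, are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def symmetric(W):
    """Project an arbitrary square matrix onto the symmetric matrices."""
    return 0.5 * (W + W.T)

def similarity(x_a, x_b, M):
    """Bilinear similarity x_a^T M x_b (an assumed form, see lead-in)."""
    return float(x_a @ M @ x_b)

rng = np.random.default_rng(0)
p = 3                                    # feature vector length (arbitrary)
M = symmetric(rng.normal(size=(p, p)))   # randomly initialized, then symmetrized
x_a = rng.normal(size=p)
x_b = rng.normal(size=p)

s_ab = similarity(x_a, x_b, M)
s_ba = similarity(x_b, x_a, M)
```

With a symmetric M the two values coincide, which is the property the constraint is meant to guarantee.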

Step 3 specifically includes the following steps:

Step 3.1, compute the similarity between time series based on a similarity matrix; a matrix M is randomly initialized, and the similarity S is computed from the two feature vectors X_a and X_b obtained in the previous step together with M.

Step 3.2, merge the feature vectors with the computed similarity; the two feature vectors X_a and X_b and the similarity S computed in step 3.1 are concatenated into a single vector.

Step 3.3, output the classification through a fully connected layer; the vector obtained in step 3.2 is mapped to a two-dimensional vector, whose first dimension gives the magnitude for the similarity of the two data items being 1 and whose second dimension gives the magnitude for their similarity being 0.

Step 3.4, compute the similarity of the two data items by Softmax.

Step 3.5, construct a loss function and train the network.

An objective function is first constructed and the loss of each iteration is computed from it; the partial derivative of the objective function with respect to each parameter is taken, and each parameter is updated in the direction of its negative derivative (gradient) to reduce the loss, so that the parameters are continually optimized. The loss function can be formally expressed as:

L(S₁, S₂, y) = (y - M(S₁, S₂))²;

where S₁ and S₂ denote the input data pair.
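Steps 3.2 to 3.4 can be sketched as a small classification head. This is a sketch under assumptions: the fully connected layer is a single affine map with randomly drawn weights, and the names `similarity_head`, `W` and `b` and all sizes are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def similarity_head(x_a, x_b, s, W, b):
    v = np.concatenate([x_a, x_b, [s]])  # step 3.2: concatenate into one vector
    logits = W @ v + b                   # step 3.3: map to a two-dimensional vector
    return softmax(logits)               # step 3.4: normalize with Softmax

rng = np.random.default_rng(0)
p = 3
x_a = rng.normal(size=p)
x_b = rng.normal(size=p)
s = 0.7                                  # similarity from step 3.1 (made-up value)
W = rng.normal(size=(2, 2 * p + 1))      # untrained weights, for shape checking only
b = np.zeros(2)
probs = similarity_head(x_a, x_b, s, W, b)
```

The two outputs are non-negative and sum to 1, so the first entry can be read as the model's estimate that the pair is similar.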

The time-embedding-based data similarity measurement method of the invention remedies the shortcomings of existing representation methods, such as ignoring temporal information, high dimensionality and sparsity; it obtains an effective representation of time series data and computes similarity reasonably and accurately. The time series representation obtained by the invention is dense and low-dimensional, which makes computation efficient; it embeds temporal information, which makes the representation more reasonable; and, on the basis of this representation, a convolutional neural network extracts features and computes similarity under end-to-end supervised training, achieving reasonable representation and efficient feature extraction together and thereby improving measurement accuracy.

A time series data similarity measurement system, comprising:

a feature vector extraction module, configured to perform feature extraction on the representation of each sample time series constructed by the time series data representation construction module, obtaining the feature vector of each sample time series;

a similarity calculation module, configured to compute, from the feature vectors obtained by the feature vector extraction module and based on a similarity matrix, the similarity between sample time series;

a similarity measurement network module, configured to train a preset convolutional neural network model using the feature vectors obtained by the feature vector extraction module and the similarities obtained by the similarity calculation module until a preset convergence condition is met, obtaining a trained similarity measurement model;

an input/output module, configured to construct the representation of the time series data to be measured, extract its feature vector, input it into the similarity measurement network module, and output the similarity measurement result for the time series data to be measured.

Embodiment

Referring to Fig. 1, a time series data similarity measurement method according to an embodiment of the invention is applied to similarity measurement of electronic medical records and comprises the following steps:

S101. Construct an effective representation of the medical event sequences in the electronic medical records.

Step1. The electronic medical record (EMR) matrix is too sparse, so the first task is to make the sparse matrix dense and reduce the dimension of the high-dimensional matrix. Referring to Fig. 2, each EMR matrix is converted into an event sequence in which events are arranged by their relative time and events occurring on the same day are ordered arbitrarily, finally giving a vector H;

Step2. Use word2vec to map each medical event in the electronic medical record to a fixed-length vector, so as to capture the relationships among the medical events in the record. word2vec is an effective model that applies the ideas of deep learning to represent words as vectors. If words are regarded as features, it can be understood as mapping features into a K-dimensional vector space: by mapping each word to a K-dimensional vector, the processing of text is reduced to vector operations in that space, and the similarity of vectors can be used to express semantic similarity in the text. The medical event sequence in each electronic medical record is therefore treated as a sentence and each event in the sequence as a word. After passing through word2vec, each event is mapped to a fixed-length vector whose length is a parameter, denoted dim_v. Mapping each event of the vector H from Step1 to a vector yields a sequence matrix K;

Step3. Use word2vec to map the time at which each medical event in the EMR occurs to a vector of the same length as the medical event vector, obtaining a vector representation of the time of occurrence and finally a matrix T;

Step4. Embed the vector representation of each time of occurrence in the EMR medical event sequence into the corresponding medical event representation by vector addition, formally:

E = K + T

where E is the final time-embedded medical event sequence representation matrix, K is the sequence matrix generated in Step2, and T is the time sequence matrix generated in Step3.
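The addition E = K + T can be sketched with lookup tables. In this sketch the tables are filled with random numbers as stand-ins for vectors that word2vec would learn; the vocabularies, the table names and the helper `embed_sequence` are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_v = 5  # embedding length; the value is arbitrary

# Stand-in lookup tables. In the described method these vectors would be
# learned with word2vec; random values only keep the sketch runnable.
event_vocab = {"fever": 0, "blood_test": 1, "xray": 2}
time_vocab = {0: 0, 2: 1, 3: 2}  # one entry per distinct relative day
event_table = rng.normal(size=(len(event_vocab), dim_v))
time_table = rng.normal(size=(len(time_vocab), dim_v))

def embed_sequence(sequence):
    """Return E = K + T with shape (dim_v, L) for a list of (event, day) pairs."""
    K = np.stack([event_table[event_vocab[e]] for e, _ in sequence], axis=1)
    T = np.stack([time_table[time_vocab[d]] for _, d in sequence], axis=1)
    return K + T

seq = [("fever", 0), ("blood_test", 0), ("xray", 2), ("fever", 3)]
E = embed_sequence(seq)
```

Because the same event ("fever") appears on two different days, its two columns in E differ exactly by the difference of the two time vectors, which is how the time of occurrence enters the representation.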

Specifically, the above medical event sequence representation method has the following characteristics: 1) the sparse medical event matrix in the EMR is converted into dense medical event vectors, so the representation is not sparse; 2) high-dimensional event representations are mapped into a low-dimensional vector space by word2vec, so the representation is low-dimensional; 3) the final patient event sequence representation fuses the relationships among medical events with the relationships between medical events and their times of occurrence.

S102. Effectively extract the feature representation of the medical event sequences in the EMR.

Features must be extracted from the representation of a patient's medical event sequence so that similarity can be measured effectively. In addition, because each EMR contains a different number of medical events, the resulting medical event sequence representations contain different numbers of vectors. To make pairwise patient similarity computation convenient, patient event sequence representations of different lengths must be mapped to feature representations of equal length. The method of the invention uses an improved convolutional neural network to extract patient feature representations of equal length.
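To make the "sequence as sentence, event as word" idea from Step2 concrete, the sketch below lists the (center, context) pairs that a skip-gram style word2vec model would be trained on. The text does not say which word2vec variant is used, so skip-gram, the window size and the function name `skipgram_pairs` are assumptions for illustration only.

```python
def skipgram_pairs(sequence, window=2):
    """List (center, context) pairs for every event and its neighbours.

    Events that often appear near each other produce many shared pairs,
    which is what pulls their learned vectors close together.
    """
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs

# One medical record treated as one sentence (made-up events).
sentence = ["fever", "blood_test", "xray", "antibiotic"]
pairs = skipgram_pairs(sentence, window=1)
```

With a window of 1, only adjacent events form pairs; a larger window would also relate events that are further apart in the sequence.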

Specific network structure is as follows:

1) Convolutional layer: unlike image analysis, where convolution is applied along both directions of the matrix, convolution on EMR data is meaningful only along the time direction, so the convolution of the convolutional layer is one-directional. This layer has multiple kernel functions acting as filters, which extract different features and produce multiple feature maps;

As shown in Fig. 2, the input of the convolutional layer is two EMR sequence representation matrices of size dim_v × L. What the invention ultimately learns is the similarity of two EMRs, so the input is the information of two EMRs. After the convolution operation with c convolution kernels, c vectors of identical size are generated, i.e. multiple feature maps. The convolution is applied to both EMRs with the same convolution parameters. For any two EMRs A and B, sim(A, B) and sim(B, A) should be equal, so the positions of A and B are symmetric and the same convolutional layer parameters should be used for both. Likewise, the subsequent similarity matching layer uses a symmetric similarity matrix. Together these two points ensure that sim(A, B) and sim(B, A) yield equal values.

After the convolutional layer, two c × dim_m matrices are obtained. The matrix would originally be c × (T+8); for convenience of computation, the matrices of patient EMRs with fewer than L events are padded with 0 at the end when input to the convolutional layer. Consequently, part of the end of the resulting c × dim_m matrix may still be 0. Although the dimension is fixed, the matrices are in effect still of unequal dimension, and further processing is needed before similarity analysis; this is the job of the next step, the sampling layer.

2) Sampling layer: simple max sampling is used; the features of each feature map are sampled to a single value, so that each EMR is represented by a fixed-length vector. Unlike average sampling, taking the maximum of the sampling region finds the point in the region that best expresses the feature. After the sampling layer, each EMR yields a fixed-length vector, here of dimension p, which represents the features of that EMR.
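The interaction between zero padding and the two sampling choices can be sketched as follows. The helper `pad_to_length` and the numbers are hypothetical; the feature map is written out by hand to mimic a map whose tail is zero because of padding.

```python
import numpy as np

def pad_to_length(E, L):
    """Right-pad a (dim_v, n) matrix with zero columns up to L columns."""
    dim_v, n = E.shape
    if n > L:
        raise ValueError("sequence is longer than L")
    return np.hstack([E, np.zeros((dim_v, L - n))])

E = np.array([[1.0, 2.0, 3.0],
              [0.5, 0.1, 0.2]])
padded = pad_to_length(E, 5)

# A feature map whose last two entries are zero because of the padding.
fmap = np.array([0.2, 0.9, 0.4, 0.0, 0.0])
max_value = fmap.max()    # ignores trailing zeros when a positive value exists
mean_value = fmap.mean()  # shrinks as more zeros are appended
```

In this toy case the maximum is the same as for the unpadded map, whereas the average depends on how much padding was added, which is one practical reason to prefer max sampling for zero-padded sequences.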

S103. Compute the similarity and train the network.

For the EMR feature vectors extracted in the previous step, the similarity between two feature vectors is computed based on a similarity matrix. Because swapping the positions of the two patients should leave the similarity unchanged, the similarity matrix is constrained to be symmetric. The loss is computed from the resulting similarity, and the network is trained.

Specifically includes the following steps:

Step1. Compute patient similarity based on a similarity matrix;

A matrix M is randomly initialized, and the similarity S is computed from the two feature vectors X_a and X_b obtained in the previous step together with M.

Step2. Merge the feature vectors with the computed similarity;

The two feature vectors X_a and X_b and the similarity S computed in Step1 are concatenated into a single vector.

Step3. Output the classification through a fully connected layer;

The vector obtained in Step2 is mapped to a two-dimensional vector, whose first dimension gives the magnitude for the two patients belonging to the same cohort and whose second dimension gives the magnitude for the two patients belonging to different cohorts.

Step4. Compute by Softmax the probability that the two original records belong to the same cohort.

Step5. Construct a loss function and train the network;

L(S₁, S₂, y) = (y - M(S₁, S₂))².

When the network parameters converge, training stops and the final EMR similarity measurement model is obtained.
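The training rule (squared loss, update along the negative gradient, stop on convergence) can be sketched on a toy problem. The linear model below is only a stand-in for the network M, and y is assumed here to be the target label of a pair; the data, learning rate and tolerance are made-up values.

```python
import numpy as np

def squared_loss(y, m):
    """Per-pair loss (y - m)^2, matching the form of the loss above."""
    return (y - m) ** 2

def train(features, labels, lr=0.1, tol=1e-10, max_iter=10_000):
    """Gradient descent on a linear stand-in model until the loss stops changing."""
    w = np.zeros(features.shape[1])
    prev = np.inf
    loss = np.inf
    for _ in range(max_iter):
        preds = features @ w
        loss = squared_loss(labels, preds).mean()
        if abs(prev - loss) < tol:       # convergence condition (assumed form)
            break
        # d/dw of mean((y - Xw)^2) is -2 * mean((y - Xw) * X)
        grad = -2.0 * ((labels - preds)[:, None] * features).mean(axis=0)
        w -= lr * grad                   # step against the gradient
        prev = loss
    return w, loss

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy pair features
y = np.array([1.0, 0.0, 1.0])                        # toy labels
w, final_loss = train(X, y)
```

In the described method the same loop would update the convolution kernels, the similarity matrix and the fully connected weights by backpropagation rather than a single weight vector.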

In summary, the method of the invention is a time series data similarity measurement method based on time-embedded representation, and mainly solves the problem that similarity between sequences is difficult to measure reasonably and effectively over large amounts of heterogeneous, high-dimensional time series. It comprises the following steps: first, an effective representation of the time series data is constructed by mapping the high-dimensional, sparse temporal event sequence to a low-dimensional space, where word2vec yields a low-dimensional vector representation of each event into which temporal information is embedded; second, a customized convolutional neural network is built to extract effective features of the time series data, giving fixed-length feature representations of time series of unequal length; finally, the fixed-length representations are used to compute similarity, evaluate the objective function and train the network, giving the final time series data similarity measurement model. Existing methods consider only the features of the events themselves within specific time periods of a sequence and ignore their relationship with the time of occurrence. In contrast, the invention discloses a time series similarity measurement method based on time-embedded representation: an event vector and a time vector are constructed for each event by word2vec, the time vector is embedded into the event vector, and a convolutional neural network is trained by supervised learning, finally giving a robust patient similarity measurement model. The method remedies the failure of existing methods to account for the relationships among events and between events and their times of occurrence, and represents time series data more reasonably and effectively, thereby improving the accuracy of similarity measurement.

Those skilled in the art will readily understand that the foregoing is merely an embodiment of the method of the invention and is not intended to limit the invention; any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within the protection scope of the invention.

Claims

1. A time series data similarity measurement method, characterized by comprising the following steps:

Step 1, collect a preset number of sample time series; taking into account the relationships among the events in each sample time series and the relationship between each event and its time of occurrence, map the data from a high-dimensional space to a low-dimensional space and construct a representation of each sample time series;

Step 2, input the representations of all sample time series obtained in step 1 into a preset convolutional neural network model and perform feature extraction on each representation, obtaining the feature vector of each sample time series;

Step 3, from the feature vectors obtained in step 2, compute the similarity between sample time series based on a similarity matrix;

Step 4, train the preset convolutional neural network model using the feature vectors obtained in step 2 and the similarities obtained in step 3 until a preset convergence condition is met, obtaining a trained similarity measurement model;

Step 5, construct the representation of the time series data to be measured by the method of step 1 and input it into the trained similarity measurement model obtained in step 4, obtaining the similarity measurement result for the time series data to be measured.

2. The time series data similarity measurement method according to claim 1, characterized in that step 1 specifically comprises:

Step 1.1, converting the matrix of each sample time series data into an event sequence, in which events are arranged by their relative time and events occurring at the same time are placed in no particular order;

Step 1.2, mapping each event to a fixed-length vector using word2vec, to obtain, for each event in the event sequence, a vector representation that contains relative relationship information;

Step 1.3, mapping the time of occurrence of each event in the event sequence, using word2vec, to a vector of the same dimension as the event vector, to obtain a vector representation of the event occurrence time;

Step 1.4, embedding the vector representation of each event occurrence time in the time series data into the corresponding event representation by vector addition, to obtain the representation of the sample time series data.
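
Steps 1.1 to 1.4 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the lookup tables built by `random_table` are hypothetical stand-ins for vectors that would be learned with word2vec, and the event names, the dimension `DIM`, and the function `build_representation` are invented for this example.

```python
import random

def random_table(keys, dim, seed):
    # Hypothetical stand-in for word2vec-trained vectors (assumption):
    # every key gets a fixed-length random vector.
    rng = random.Random(seed)
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in sorted(keys)}

def build_representation(events, event_vecs, time_vecs):
    """events: list of (relative_time, event_id) pairs for one record."""
    # Step 1.1: order events by relative time; events sharing a time
    # stay in whatever order they arrived (no particular order).
    ordered = sorted(events, key=lambda e: e[0])
    representation = []
    for t, ev in ordered:
        e_vec = event_vecs[ev]  # Step 1.2: fixed-length event vector
        t_vec = time_vecs[t]    # Step 1.3: time vector of the same dimension
        # Step 1.4: embed the time into the event by vector addition
        representation.append([a + b for a, b in zip(e_vec, t_vec)])
    return representation

DIM = 4  # illustrative embedding size
record = [(2, "lab_test"), (0, "admission"), (2, "prescription"), (5, "discharge")]
event_vecs = random_table({ev for _, ev in record}, DIM, seed=0)
time_vecs = random_table({t for t, _ in record}, DIM, seed=1)
rep = build_representation(record, event_vecs, time_vecs)
```

Because the time vector has the same dimension as the event vector, the addition leaves the sequence with one `DIM`-dimensional vector per event, whatever the number of events.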

3. The time series data similarity measurement method according to claim 1, characterized in that, in step 2, feature vectors of equal length are extracted from the sample time series data by the convolutional neural network;

The convolutional neural network structure used comprises:

4. The time series data similarity measurement method according to claim 3, characterized in that, in the convolutional neural network structure used:

The convolution of the convolutional layer is unidirectional;

In the convolutional layer, max sampling is applied so that the features of each feature map are sampled into a single value, finally yielding a fixed-length vector representation of the time series data.
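
The effect of unidirectional convolution followed by max sampling can be sketched in plain Python. This is a minimal sketch under assumptions: the kernels, the ReLU activation, and the function name `conv1d_max_pool` are illustrative choices not specified in the claims. It shows only why sequences of different lengths end up as vectors of equal length.

```python
def conv1d_max_pool(sequence, filters):
    """sequence: list of d-dim vectors; filters: list of kernels,
    each kernel being a list of k d-dim weight rows."""
    features = []
    for kernel in filters:
        k = len(kernel)
        feature_map = []
        # The kernel spans the full vector width and slides along
        # the time axis only (one direction).
        for start in range(len(sequence) - k + 1):
            window = sequence[start:start + k]
            activation = sum(
                w * x
                for w_row, x_row in zip(kernel, window)
                for w, x in zip(w_row, x_row)
            )
            feature_map.append(max(0.0, activation))  # ReLU (assumption)
        # Max sampling: each feature map collapses to a single value.
        # default=0.0 covers sequences shorter than the kernel.
        features.append(max(feature_map, default=0.0))
    return features

short_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
long_seq = short_seq + [[0.5, 0.5], [0.0, 2.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],              # kernel covering 2 time steps
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # kernel covering 3 time steps
]
f_short = conv1d_max_pool(short_seq, filters)
f_long = conv1d_max_pool(long_seq, filters)
```

Both `f_short` and `f_long` have one entry per filter, so the output length depends on the number of filters and not on the number of events in the sequence.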

5. The time series data similarity measurement method according to claim 1, characterized in that the similarity matrix used in step 3 has a symmetric structure.

6. The time series data similarity measurement method according to claim 1, characterized in that step 4 specifically comprises:

Step 4.1, merging the feature vectors with the calculated similarity by splicing them into a single vector;

Step 4.2, mapping the vector obtained in step 4.1 to a two-dimensional vector, in which the first dimension represents the score for the two time series data having a similarity of 1, and the second dimension represents the score for the two time series data having a similarity of 0;

Step 4.4, constructing a loss function and training the preset convolutional neural network to obtain the trained similarity measurement model.

7. The time series data similarity measurement method according to claim 6, characterized in that, in step 4.4, an objective function is first constructed, the loss of each iteration is calculated according to the objective function, the partial derivative of the objective function with respect to each parameter is computed, and each parameter is updated in the direction of its negative derivative to reduce the loss, so that the parameters are continually optimized;

The loss function is formally expressed as:

L(S₁, S₂, y) = (y - M(S₁, S₂))²;

In the formula, S₁ and S₂ denote the input data pair.
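
The update rule described above can be illustrated with a small gradient descent sketch on the squared loss. The linear scoring function standing in for M(S₁, S₂), the pair feature values, and the learning rate are assumptions made only for this example; in the claimed method, M is produced by the convolutional neural network.

```python
def squared_loss(y, m):
    # L(S1, S2, y) = (y - M(S1, S2))^2
    return (y - m) ** 2

def gradient_step(w, pair_features, y, lr=0.1):
    # Linear stand-in for M(S1, S2) (assumption, not the patented model)
    m = sum(wi * fi for wi, fi in zip(w, pair_features))
    # Partial derivative of the loss with respect to each parameter:
    # dL/dw_i = -2 * (y - m) * f_i
    grad = [-2.0 * (y - m) * fi for fi in pair_features]
    # Move each parameter against its derivative
    new_w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return new_w, squared_loss(y, m)

w = [0.0, 0.0]
pair_features = [1.0, 0.5]  # hypothetical features of one (S1, S2) pair
y = 1.0                     # label: the pair is similar
losses = []
for _ in range(20):
    w, current_loss = gradient_step(w, pair_features, y)
    losses.append(current_loss)
```

With these values the recorded loss shrinks at every iteration, which is the behavior the negative-derivative update is meant to produce.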

8. The time series data similarity measurement method according to claim 1, characterized in that the similarity in step 3 is calculated as follows: a matrix M is randomly initialized, and the similarity S of two time series data is calculated from their obtained feature vectors X_a and X_b together with M.
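
One possible reading of this calculation is a bilinear form S = X_aᵀ M X_b, sketched below. The claim does not state the exact formula, so the bilinear form is an assumption; the symmetrization step follows the symmetric structure recited in claim 5, and the function names and vector values are illustrative.

```python
import random

def init_symmetric_matrix(dim, seed=0):
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    # Symmetrize the random matrix: M = (A + A^T) / 2
    return [[(a[i][j] + a[j][i]) / 2.0 for j in range(dim)] for i in range(dim)]

def similarity(x_a, x_b, m):
    # Assumed bilinear form: S = x_a^T M x_b
    return sum(
        x_a[i] * m[i][j] * x_b[j]
        for i in range(len(x_a))
        for j in range(len(x_b))
    )

x_a = [0.2, 0.7, 0.1]
x_b = [0.4, 0.3, 0.9]
M = init_symmetric_matrix(3)
s_ab = similarity(x_a, x_b, M)
s_ba = similarity(x_b, x_a, M)
```

With a symmetric M, swapping the two inputs leaves the similarity unchanged up to floating-point rounding, which is a natural property for a similarity measure.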

9. The time series data similarity measurement method according to claim 8, characterized in that step 4 specifically comprises:

Step 4.1, splicing the feature vectors X_a and X_b of the two time series data and the similarity S calculated in step 3 into a single vector;

Step 4.2, mapping the vector obtained in step 4.1 to a two-dimensional vector, in which the first dimension represents the score for the two time series data having a similarity of 1, and the second dimension represents the score for the two time series data having a similarity of 0;

Step 4.3, calculating the similarity of the two time series data by Softmax;
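
Steps 4.1 to 4.3 can be sketched as a splice, a linear map to two dimensions, and a Softmax. The weight and bias values below are arbitrary placeholders for parameters that would be learned during training, and `pair_probability` is a hypothetical helper name.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pair_probability(x_a, x_b, s, weights, bias):
    joined = x_a + x_b + [s]  # Step 4.1: splice into a single vector
    # Step 4.2: linear map to a two-dimensional vector
    logits = [
        sum(w * v for w, v in zip(row, joined)) + b
        for row, b in zip(weights, bias)
    ]
    # Step 4.3: Softmax turns the two scores into probabilities
    return softmax(logits)

x_a, x_b, s = [0.2, 0.7], [0.1, 0.9], 0.65
weights = [
    [0.3, 0.1, 0.2, 0.4, 1.0],      # scores "similarity is 1" (placeholder)
    [-0.2, 0.0, -0.1, -0.3, -1.0],  # scores "similarity is 0" (placeholder)
]
bias = [0.0, 0.0]
probs = pair_probability(x_a, x_b, s, weights, bias)
```

The first entry of `probs` can be read as the predicted probability that the two time series data are similar, and the two entries sum to 1.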

10. A time series data similarity measurement system, characterized by comprising:

A time series data representation construction module, configured to acquire a preset number of sample time series data and, taking into account the relative relationships among the events in each sample time series data and the relationship between each event and its time of occurrence, map the data from a high-dimensional space into a low-dimensional space to construct a representation of each sample time series data;

A similarity measurement network module, configured to perform feature extraction on the representation of each sample time series data constructed by the time series data representation construction module, to obtain a feature vector of each sample time series data; configured to calculate, based on a similarity matrix, the similarity between the sample time series data from the obtained feature vectors; and configured to train the preset convolutional neural network model with the feature vectors of the sample time series data obtained by the feature vector extraction module and the similarities between the sample time series data obtained by the similarity calculation module, until a preset convergence condition is reached, to obtain a trained similarity measurement model;