CN109299185A - Analysis method for extracting features from time-series stream data with a convolutional neural network - Google Patents

Info

Publication number
CN109299185A
Authority
CN
China
Prior art keywords
data
flow data
feature
neural networks
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811216349.3A
Other languages
Chinese (zh)
Other versions
CN109299185B (en)
Inventor
周同明
汪卫
邢宏岩
刁广州
杨勇
秦嘉岷
姜军
王旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute Of Shipbuilding Technology (11th Institute Of China Shipbuilding Industry Group Corporation)
Original Assignee
Shanghai Institute Of Shipbuilding Technology (11th Institute Of China Shipbuilding Industry Group Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute Of Shipbuilding Technology (11th Institute Of China Shipbuilding Industry Group Corporation) filed Critical Shanghai Institute Of Shipbuilding Technology (11th Institute Of China Shipbuilding Industry Group Corporation)
Priority to CN201811216349.3A priority Critical patent/CN109299185B/en
Publication of CN109299185A publication Critical patent/CN109299185A/en
Application granted granted Critical
Publication of CN109299185B publication Critical patent/CN109299185B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an analysis method for extracting features from time-series stream data with a convolutional neural network. The method comprises the following steps. First, the stream data is preprocessed through data cleaning, data integration, data transformation, data merging, and data reshaping, which ensures the accuracy of the subsequent stream data analysis. Next, the stream data is sampled, generally with a decaying window, to generate the analysis samples. The characteristics of the data and the relationships between its dimensions are then analyzed carefully; if the dimensions are uncorrelated or only weakly correlated, a dimension-wise convolutional neural network structure is tried for mining and analysis, which both preserves the temporal features of the stream data and discovers combined features across dimensions. This patent helps find a better way of applying convolutional neural networks to stream data problems.

Description

Analysis method for extracting features from time-series stream data with a convolutional neural network
Technical field
The present invention relates to the field of stream data analysis, and specifically to an analysis method for extracting features from time-series stream data with a convolutional neural network.
Background
Enormous volumes of data are produced at every moment, so data is growing explosively. This explosive growth, together with massive and stable data storage and the availability of data for industrial applications, provides abundant raw material for the era of artificial intelligence.
Precisely because we live in an era of data explosion, powerful data processing and analysis tools are urgently needed, so that from massive time-series stream data we can discover information that was previously overlooked, information that is valuable, and even knowledge that bears on human survival.
Summary of the invention
The technical problem to be solved by the present invention is that general models and methods fail to extract implicit features effectively, and that convolutional neural network models in deep learning struggle to capture temporal features and dimensional features at the same time. To solve these problems, the invention provides an analysis method for extracting features from time-series stream data with a convolutional neural network.
To solve the above technical problem, the present invention provides the following technical solution:
The present invention provides an analysis method for extracting features from time-series stream data with a convolutional neural network, comprising the following steps:
S1: preprocess the stream data;
S2: select samples with the decaying window method;
S3: design and build the convolutional neural network model architecture;
S4: extract features dimension by dimension with the convolution model;
S5: generate the deep learning logs and compare the displayed result plots;
S6: visualize the deep learning result plots;
To capture both the temporal features and the dimensional features of stream data, a dimension-wise convolutional neural network model is built and applied. It extracts the strong features and strong rules contained in the basic information of the data while taking the inherent temporal features of the stream data into account. After the features of the multi-dimensional data are extracted and strengthened, they are combined into a model that includes both temporal features and dimensional features.
In step S1, the stream data is preprocessed according to its characteristics. This includes identifying key information and redundant attributes in the data and manually selecting the factors that most influence the result. The stream data of all factors is preprocessed: data cleaning, data integration, data transformation, data merging, data reshaping, and similar techniques are used to fill in or correct abnormal, missing, redundant, and inconsistent items in the historical data. The sample data obtained from preprocessing is then checked, the important information is screened and described numerically, and the dimensions of the target features are established manually.
As a preferred technical solution of the present invention, in step S1 the handling of abnormal data includes:
Missing data: raise the level of the data while reducing the data volume, so that records with missing values are filtered out;
Abnormal data: preprocess by deleting the data, substituting a value derived from an analysis of the overall model, or treating the value as missing and filling it in, so that the deviation of the processed outlier from the other values is minimized;
Redundant data: if two attributes of the data are strongly correlated, remove the less important of the two;
Data normalization: the data of different dimensions do not lie in a common range, so in computations across dimensions the weights swing too widely, which hinders tuning and computation; normalization brings the dimensions into a common range;
Labels and timestamps: for supervised learning tasks such as classification, the data set is labeled, and it is also timestamped.
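The normalization and redundancy rules listed above can be illustrated with a minimal sketch in plain Python. The patent does not specify a normalization formula or a correlation threshold, so min-max scaling, Pearson correlation, the 0.95 threshold, the `priority` ordering, and all column names below are assumptions made for illustration only.

```python
from statistics import mean

def min_max_normalize(column):
    """Rescale one dimension to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def pearson(xs, ys):
    """Pearson correlation of two equally long columns (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def drop_redundant(columns, priority, threshold=0.95):
    """Keep an attribute only if it is not strongly correlated with a more important one.

    columns: dict of name -> values. priority: names ordered from most to
    least important (a hypothetical, manually chosen ranking).
    """
    kept = []
    for name in priority:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return {name: columns[name] for name in kept}

# Hypothetical sample: "close_cents" duplicates "close" in another unit.
raw = {
    "close": [10.0, 10.5, 11.0, 10.8],
    "close_cents": [1000.0, 1050.0, 1100.0, 1080.0],
    "volume": [300.0, 100.0, 250.0, 400.0],
}
reduced = drop_redundant(raw, priority=["close", "volume", "close_cents"])
normalized = {name: min_max_normalize(col) for name, col in reduced.items()}
```

After this step every remaining dimension lies in [0, 1], so no single dimension dominates a computation purely because of its unit.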
As a preferred technical solution of the present invention, step S2 proceeds as follows: stream data sample collection, stream data filtering, and stream data acquisition;
In stream data sample collection, for the general sampling problem, the stream consists of a series of tuples with n fields, and a subset of these fields is designated as the key fields. Suppose the desired sample fraction is a/b. The key value of each tuple is hashed to one of b buckets, and tuples whose hash value is less than a are placed in the sample. If there is more than one key field, the hash function combines the values of these fields into a single hash value. The resulting sample consists of all tuples with certain key values, and the selected key values make up about a/b of all key values in the stream;
In stream data filtering, a Bloom filter is used. A Bloom filter consists of an array of n bits, each initially 0; a set of hash functions h1, h2, ..., hk, each of which maps a key value to one of the n buckets (bit positions); and a set S of m key values. The Bloom filter lets through all stream elements whose key is in S and blocks most stream elements whose key is not in S;
In stream data acquisition, stream data generated close together in time is usually strongly related, and the earlier an element appeared in the stream, the weaker its relation to the present. The stream data is therefore acquired with a decaying window: a smoothed running sum is computed in which the weights decay continuously. This is called an exponentially decaying window and is written $\sum_{i=0}^{t-1} a_{t-i}(1-c)^{i}$, where $a_1$ is the first element to arrive, $a_t$ is the current element, and $c$ is a small constant such as $10^{-9}$.
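The key-based sampling rule described above (hash each key to one of b buckets and keep the tuples whose bucket is below a) can be sketched as follows. The patent does not name a hash function; SHA-256 from the standard library is used here only as a stable stand-in, and the stream contents and the function names `bucket_of` and `in_sample` are hypothetical.

```python
import hashlib

def bucket_of(key_fields, b):
    """Hash the combined key fields of a tuple to one of b buckets (0 .. b-1)."""
    combined = "\x1f".join(str(v) for v in key_fields)  # merge several key fields into one key
    digest = hashlib.sha256(combined.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % b

def in_sample(key_fields, a, b):
    """Keep a tuple when its key hashes to a bucket below a (about a/b of all keys)."""
    return bucket_of(key_fields, b) < a

# Hypothetical stream of (instrument, minute, price) tuples; the instrument is the key field.
stream = [(f"inst{i}", m, 100.0 + i) for m in range(3) for i in range(200)]
sample = [t for t in stream if in_sample((t[0],), a=1, b=10)]
sampled_keys = {t[0] for t in sample}
```

Because the decision depends only on the key, a sampled key contributes all of its tuples to the sample, which is what makes the sample usable for per-key analysis.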
As a preferred technical solution of the present invention, in step S3 the input data is fed into the model. The first layer of the model is a convolutional layer, whose input is the screened stream data. Unlike in a traditional fully connected layer, the input of each node in a convolutional layer is only a small patch of the previous layer. The convolutional layer analyzes each small patch in greater depth to obtain more abstract features. The node matrix produced by a convolutional layer generally becomes deeper, that is, its depth increases. The second layer is a pooling layer, which does not change the depth of the matrix but reduces its size. Pooling can be thought of as converting a high-resolution picture into a lower-resolution one (the data volume shrinks but the characteristics of the data are retained). Pooling further reduces the number of nodes in the final fully connected layers and therefore the number of parameters in the whole network. After several rounds of convolution and pooling, the convolutional neural network ends with one or two fully connected layers that produce the final classification result. By that point the information in the data has been abstracted into features with higher information content. The convolutional and pooling layers can thus be seen as an automatic feature extraction process; once feature extraction is complete, fully connected layers are used to carry out the classification task.
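The layer sequence described above (convolution, pooling, fully connected classification) can be sketched as a forward pass in plain Python. This is a toy illustration, not the patented model: the kernels and dense weights are fixed, untrained, made-up values, and the ReLU activation and softmax output are assumptions that the text does not mention.

```python
import math

def conv1d(signal, kernel):
    """Valid 1-D convolution as used in CNNs (cross-correlation, no kernel flip)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

def dense_softmax(features, weights, biases):
    """Fully connected layer followed by a softmax over the classes."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

signal = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0, 29.0]
kernels = [[-1.0, 1.0], [0.5, 0.5]]           # two filters: depth grows from 1 to 2
feature_maps = [relu(conv1d(signal, k)) for k in kernels]
pooled = [max_pool1d(fm, 2) for fm in feature_maps]   # depth unchanged, length reduced
flat = [v for fm in pooled for v in fm]

# Hypothetical untrained weights for a two-class output.
weights = [[0.1, 0.0, -0.1, 0.05, 0.0, -0.02], [-0.1, 0.0, 0.1, -0.05, 0.0, 0.02]]
biases = [0.0, 0.0]
probabilities = dense_softmax(flat, weights, biases)
```

The shapes follow the text: the convolution raises the depth from one channel to two, the pooling layer keeps the depth but shortens each channel from 7 values to 3, and the dense layer sees only 6 inputs.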
As a preferred technical solution of the present invention, in step S4 each dimension is convolved separately. The features of each dimension are extracted and strengthened individually, and the strongest features of all dimensions are then combined to determine the final classification result.
As a preferred technical solution of the present invention, in step S5, after deep learning has been carried out, the accuracy of the resulting classification or prediction is computed, and logs, accuracy figures, and similar results are generated for use in improving, modifying, and debugging the model.
The beneficial effects of the present invention are as follows: it provides a novel analysis method that, during data processing, extracts the strong features and strong rules contained in the basic information while also taking the inherent temporal features of the stream data into account. It thereby applies convolutional neural networks to time-series stream data in a way that extracts strong features and strong rules from the basic information and remains compatible with the temporal features of the stream data.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and form part of the specification. Together with the embodiments, they serve to explain the invention and do not limit it.
In the drawings:
Fig. 1 is the overall flowchart of fractal model construction in the present invention;
Fig. 2 is a structure diagram of the convolutional neural network of the present invention;
Fig. 3 is a structure diagram of the dimension-wise convolutional neural network of the present invention;
Fig. 4 is a flowchart of the dimension-wise convolutional neural network structure of the present invention;
Fig. 5 is an example data sample of the present invention, taking financial time-series stream data as an example;
Fig. 6 is an example of the sampling method for stream data filtering in the present invention, taking financial time-series stream data as an example;
Fig. 7 shows the ordinary neural network convolution scheme, taking financial time-series stream data as an example;
Fig. 8 shows the dimension-wise neural network convolution scheme of the present invention, taking financial time-series stream data as an example;
Fig. 9 is the accuracy plot of the dimension-wise convolutional neural network structure of the present invention, taking financial time-series stream data as an example;
Fig. 10 is the training plot of the dimension-wise convolutional neural network structure of the present invention, taking financial time-series stream data as an example;
Fig. 11 is a parameter comparison of the model algorithms of the present invention, taking financial time-series stream data as an example.
Detailed description of the embodiments
A preferred embodiment of the present invention is described below with reference to the accompanying drawings. The preferred embodiment described here serves only to illustrate and explain the present invention and does not limit it.
Embodiment: as shown in Figs. 1-11, the present invention provides an analysis method for extracting features from time-series stream data with a convolutional neural network, comprising the following steps:
S1: preprocess the stream data;
S2: select samples with the decaying window method;
S3: design and build the convolutional neural network model architecture;
S4: extract features dimension by dimension with the convolution model;
S5: generate the deep learning logs and compare the displayed result plots;
S6: visualize the deep learning result plots;
To capture both the temporal features and the dimensional features of stream data, a dimension-wise convolutional neural network model is built and applied. It extracts the strong features and strong rules contained in the basic information of the data while taking the inherent temporal features of the stream data into account. After the features of the multi-dimensional data are extracted and strengthened, they are combined into a model that includes both temporal features and dimensional features.
In step S1, the stream data is preprocessed according to its characteristics. This includes identifying key information and redundant attributes in the data and manually selecting the factors that most influence the result. The stream data of all factors is preprocessed: data cleaning, data integration, data transformation, data merging, data reshaping, and similar techniques are used to fill in or correct abnormal, missing, redundant, and inconsistent items in the historical data. The sample data obtained from preprocessing is then checked, the important information is screened and described numerically, and the dimensions of the target features are established manually.
Further, in step S1 the handling of abnormal data includes:
Missing data: raise the level of the data while reducing the data volume, so that records with missing values are filtered out;
Abnormal data: preprocess by deleting the data, substituting a value derived from an analysis of the overall model, or treating the value as missing and filling it in, so that the deviation of the processed outlier from the other values is minimized;
Redundant data: if two attributes of the data are strongly correlated, remove the less important of the two;
Data normalization: the data of different dimensions do not lie in a common range, so in computations across dimensions the weights swing too widely, which hinders tuning and computation; normalization brings the dimensions into a common range;
Labels and timestamps: for supervised learning tasks such as classification, the data set is labeled, and it is also timestamped.
Further, step S2 proceeds as follows: stream data sample collection, stream data filtering, and stream data acquisition;
In stream data sample collection, for the general sampling problem, the stream consists of a series of tuples with n fields, and a subset of these fields is designated as the key fields. Suppose the desired sample fraction is a/b. The key value of each tuple is hashed to one of b buckets, and tuples whose hash value is less than a are placed in the sample. If there is more than one key field, the hash function combines the values of these fields into a single hash value. The resulting sample consists of all tuples with certain key values, and the selected key values make up about a/b of all key values in the stream;
In stream data filtering, a Bloom filter is used. A Bloom filter consists of an array of n bits, each initially 0; a set of hash functions h1, h2, ..., hk, each of which maps a key value to one of the n buckets (bit positions); and a set S of m key values. The Bloom filter lets through all stream elements whose key is in S and blocks most stream elements whose key is not in S;
In stream data acquisition, stream data generated close together in time is usually strongly related, and the earlier an element appeared in the stream, the weaker its relation to the present. The stream data is therefore acquired with a decaying window: a smoothed running sum is computed in which the weights decay continuously. This is called an exponentially decaying window and is written $\sum_{i=0}^{t-1} a_{t-i}(1-c)^{i}$, where $a_1$ is the first element to arrive, $a_t$ is the current element, and $c$ is a small constant such as $10^{-9}$.
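A Bloom filter of the kind described in the filtering step can be sketched in a few lines. The patent does not say how h1, ..., hk are built; salting SHA-256 with the function index is an assumption, as are the class name `BloomFilter`, the array size, and the example keys.

```python
import hashlib

class BloomFilter:
    """Bit array of n bits plus k hash functions, as outlined in the text."""

    def __init__(self, n_bits, k_hashes):
        self.n_bits = n_bits
        self.k_hashes = k_hashes
        self.bits = [0] * n_bits  # every bit starts at 0

    def _positions(self, key):
        # Derive h1..hk by salting one hash with the function index (an assumption).
        for i in range(self.k_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        """Insert a key of the allowed set S by setting its k bits."""
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """True for every key in S; occasionally true for a key outside S."""
        return all(self.bits[pos] for pos in self._positions(key))

allowed = BloomFilter(n_bits=1024, k_hashes=3)
for key in ["600000", "600036", "601318"]:   # hypothetical key set S
    allowed.add(key)

stream = [("600000", 10.1), ("000001", 9.8), ("601318", 55.0)]
passed = [item for item in stream if allowed.might_contain(item[0])]
```

The filter never rejects a key that is in S. A key outside S is rejected unless all k of its bit positions happen to be set, which is why the text says that most, rather than all, such elements are blocked.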
Further, in step S3 the input data is fed into the model. The first layer of the model is a convolutional layer, whose input is the screened stream data. Unlike in a traditional fully connected layer, the input of each node in a convolutional layer is only a small patch of the previous layer. The convolutional layer analyzes each small patch in greater depth to obtain more abstract features. The node matrix produced by a convolutional layer generally becomes deeper, that is, its depth increases. The second layer is a pooling layer, which does not change the depth of the matrix but reduces its size. Pooling can be thought of as converting a high-resolution picture into a lower-resolution one (the data volume shrinks but the characteristics of the data are retained). Pooling further reduces the number of nodes in the final fully connected layers and therefore the number of parameters in the whole network. After several rounds of convolution and pooling, the convolutional neural network ends with one or two fully connected layers that produce the final classification result. By that point the information in the data has been abstracted into features with higher information content. The convolutional and pooling layers can thus be seen as an automatic feature extraction process; once feature extraction is complete, fully connected layers are used to carry out the classification task.
Further, in step S4 each dimension is convolved separately. The features of each dimension are extracted and strengthened individually, and the strongest features of all dimensions are then combined to determine the final classification result.
Further, in step S5, after deep learning has been carried out, the accuracy of the resulting classification or prediction is computed, and logs, accuracy figures, and similar results are generated for use in improving, modifying, and debugging the model.
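The accuracy computation and log generation of step S5 might look like the following sketch. The labels, the per-epoch predictions, and the logger name `cnn_eval` are invented for illustration; the patent does not describe its log format.

```python
import logging

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
logger = logging.getLogger("cnn_eval")  # hypothetical logger name

labels = ["up", "down", "up", "up"]
# Hypothetical predictions after each of three training epochs.
predictions_per_epoch = [
    ["down", "down", "up", "down"],
    ["up", "down", "up", "down"],
    ["up", "down", "up", "up"],
]

history = []
for epoch, predicted in enumerate(predictions_per_epoch, start=1):
    acc = accuracy(predicted, labels)
    history.append(acc)
    logger.info("epoch=%d accuracy=%.2f", epoch, acc)
```

The `history` list is the kind of series that the result plots of steps S5 and S6 would display.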
Specifically, in step S1 the stream data is preprocessed; the preprocessing covers missing data, abnormal data, redundant data, data normalization, and labels and timestamps.
In step S2, the decaying window method is the main technique used to draw samples from the stream data of step S1. The main flow is stream data acquisition, stream data filtering, and the decaying window operation, with the dimensions of the target features established manually according to the target data set and the characteristics of the project. The reason for selecting samples with a decaying window is that, in time-series stream data, recently generated data usually influences the current data and the data of the near future, with an influence factor that depends on the actual situation and the actual data. Stream data generated close together in time is normally strongly related, whereas stream data from far in the past has much less bearing on the data being generated now. Sample collection therefore generally uses stream data filtering and a decaying window. Taking financial time-series prediction as an example, the experiment uses 90 adjacent time windows, that is, the data of the past 270 minutes (roughly one day of trading time), as the basis for predicting the price over the next three minutes.
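The 90-window setup in the example above (90 three-minute windows, 270 minutes in total, used to predict the next three minutes) can be sketched as a sliding-window sample builder. The function name `make_windows` and the synthetic prices are hypothetical, and the sketch assumes one value per three-minute window.

```python
def make_windows(bars, lookback=90):
    """Pair each run of `lookback` consecutive bars with the bar that follows it."""
    samples = []
    for end in range(lookback, len(bars)):
        history = bars[end - lookback:end]   # the past 90 windows (270 minutes)
        target = bars[end]                   # the next three-minute window
        samples.append((history, target))
    return samples

# Hypothetical three-minute closing prices.
bars = [100.0 + 0.1 * i for i in range(100)]
samples = make_windows(bars)
```

With 100 bars and a lookback of 90, this yields 10 training samples, each consisting of a 90-element input and a single target value.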
The meaning of the decaying window can be illustrated with the exponential moving average (EMA) line used in financial time-series stream data: the weighting coefficient of each day's price shrinks exponentially, in a geometric ratio. The closer a time is to the present, the larger its weight. The exponential moving average thus gives more weight to recent prices and reflects recent price fluctuations more promptly, which makes it a more useful reference than the simple moving average.
By analogy, in time-series stream data the closer a time is to the present, the larger the weight assigned to it. Past time-series data and recent data are given different weights, and strengthening the weight of recent stream data reflects recent fluctuations in the values more promptly.
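The weighting idea described above can be made concrete with a small sketch of an exponentially decaying window. It assumes the standard definition of the decayed sum, $\sum_{i=0}^{t-1} a_{t-i}(1-c)^{i}$, and shows that the same value can be maintained incrementally with one multiplication and one addition per arriving element. The function names are hypothetical, and a large decay constant (c = 0.1) is used only so that the effect is visible on a short list.

```python
def decayed_sum_direct(elements, c):
    """Direct evaluation: the newest element has weight 1, older ones (1 - c)^i."""
    t = len(elements)
    return sum(elements[t - 1 - i] * (1 - c) ** i for i in range(t))

def decayed_sum_stream(elements, c):
    """Streaming update: scale the running sum by (1 - c), then add the new element."""
    total = 0.0
    for a in elements:
        total = total * (1 - c) + a
    return total

elements = [1.0, 2.0, 3.0]
direct = decayed_sum_direct(elements, c=0.1)     # 3 + 2*0.9 + 1*0.81
streamed = decayed_sum_stream(elements, c=0.1)
```

Both functions give 5.61 for this input (up to floating-point rounding). The streaming form is the practical one for stream data, because it never needs to store past elements.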
In the application experiment on financial time-series stream data, all K-line (candlestick) data is first timestamped. The 240 trading minutes of each day are then divided in chronological order into 80 buckets, and the transaction stream data is hashed into these 80 buckets. A Bloom filter is used whose array consists of 80 time-indexed bits; each hash function maps the key value (the time that a data sample represents) to one of the n buckets (the daily time intervals) that make up the set S. The Bloom filter lets through stream elements whose data sample falls in S and blocks most stream elements whose key is not in S. The effect is to filter out part of the data close to the market close and the market open, which avoids deviations that may arise when the market is illiquid. When selecting data sources such as moving averages that indicate the short-term trend, volume, and price of the data, the decaying window is introduced: because a time closer to the present receives a larger weight in time-series stream data, older and more recent data are given different weights, and strengthening the weight of recent stream data reflects recent fluctuations more promptly. Accordingly, the exponential moving average (EMA) is used for financial time-series stream data. The EMA effectively reduces the weight of data far from the present and increases the weight of the factors that most influence the result, which lets machine learning pick up the underlying rules more effectively.
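The bucketing and EMA steps of this experiment can be sketched as follows. Two simplifications are made and should be read as assumptions: the time filter is implemented as a direct bucket-range check rather than the Bloom filter the text describes, and the number of buckets dropped near the open and the close (`skip_edges=2`) is invented, since the patent does not state it. The EMA uses the common smoothing factor 2 / (span + 1).

```python
def bucket_index(minute_of_session, minutes_per_bucket=3):
    """Map a trading minute (0..239) to one of 80 three-minute buckets."""
    return minute_of_session // minutes_per_bucket

def keep_bucket(index, total=80, skip_edges=2):
    """Drop buckets right after the open and right before the close (assumed width)."""
    return skip_edges <= index < total - skip_edges

def ema(prices, span):
    """Exponential moving average with smoothing factor alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    for p in prices:
        out.append(p if not out else alpha * p + (1 - alpha) * out[-1])
    return out

kept = [i for i in range(80) if keep_bucket(i)]
smoothed = ema([1.0, 2.0, 3.0], span=3)   # alpha = 0.5
```

With alpha = 0.5 the three prices smooth to 1.0, 1.5, and 2.25: each new price contributes half of the new value, and every older price has its weight halved again.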
In step S3, the convolutional neural network is a variant of the multilayer perceptron and was the first learning algorithm to successfully train a multi-layer neural network structure in the true sense. Its weight-sharing network structure makes it more similar to a biological neural network, greatly reduces the complexity of the model, reduces the number of weights, and simplifies computation. Convolution is a mathematical operation on two functions of a real variable, usually written $s(t) = (x * w)(t)$, where $x$ is the input, the parameter $w$ is the kernel function, $t$ is the time axis, and the output is sometimes called the feature map. Here $w$ must be a valid probability density function, otherwise the output is no longer a weighted average. In discrete form: $s(t) = \sum_{a=-\infty}^{\infty} x(a)\, w(t-a)$. In machine learning, the input is usually a multi-dimensional array of data, and the kernel is usually a multi-dimensional array of parameters optimized by the learning algorithm. Convolution is often carried out over several dimensions at once: $s(i,j) = (x * w)(i,j) = \sum_{m} \sum_{n} x(m,n)\, w(i-m, j-n)$.
In step S4, each dimension is convolved separately. The features of each dimension are extracted and strengthened individually, and the strongest features of all dimensions are then combined to determine the final classification result. During data processing, every input column vector is taken out individually and convolved on its own dimension. This preserves the temporal features of each single dimension, captures the features of every dimension, and finally combines the most important features of all dimensions.
In step S5, after deep learning has been carried out, the accuracy of the resulting classification or prediction is computed, and logs, accuracy figures, and similar results are generated for use in improving, modifying, and debugging the model. If optimization is needed, the method proceeds to step S6 to optimize; otherwise it returns to step S3 and repeats.
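The dimension-wise convolution of step S4 can be sketched by splitting the input into its column vectors and convolving each one along the time axis on its own. The kernels below are fixed difference filters rather than learned parameters, and taking the maximum response as the "strongest feature" of a dimension is an assumption; the patent does not specify how the strongest feature is selected.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution as used in CNNs (cross-correlation, no kernel flip)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def dimension_wise_features(rows, kernels):
    """Convolve every dimension separately and keep its strongest response.

    rows: list of time steps, each holding one value per dimension.
    kernels: one 1-D kernel per dimension (hypothetical, not learned here).
    """
    columns = [list(col) for col in zip(*rows)]   # one time series per dimension
    features = []
    for column, kernel in zip(columns, kernels):
        response = conv1d(column, kernel)         # temporal order within the dimension is kept
        features.append(max(response))            # strongest feature of this dimension
    return features

# Hypothetical input: 4 time steps, 2 dimensions (for example price and volume).
rows = [[1.0, 10.0], [2.0, 10.0], [4.0, 9.0], [7.0, 11.0]]
kernels = [[-1.0, 1.0], [-1.0, 1.0]]
combined = dimension_wise_features(rows, kernels)
```

Here `combined` is `[3.0, 2.0]`: the largest step-to-step increase of each dimension. In a full model this combined vector would be passed to the fully connected layers for classification.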
Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing embodiment, a person skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. the analysis method that a kind of convolutional neural networks for timing flow data extract feature, which is characterized in that including as follows Step:
S1: fluxion Data preprocess;
S2: with the selection sample of decay window method;
S3: the design of convolutional neural networks model framework is built;
S4: fractional dimension extracts feature using convolution model;
S5: the display that the log of deep learning generates effect picture is compared;
S6: deep learning effect picture visualization;
For the temporal aspect and dimensional information feature in flow data, builds, using the convolutional neural networks model of fractional dimension, mention Include the strong feature and strong rule in basic information in taking-up data, combines the included temporal aspect of flow data;More After the feature extraction of dimension data, reinforcing, comprehensive not only included temporal aspect again including the model of dimensional characteristics out.
2. a kind of convolutional neural networks for timing flow data according to claim 1 extract the analysis method of feature, It is characterized in that, according to the characteristic of flow data, carrying out fluxion Data preprocess, including data critical information is known in the step S1 It is not identified with redundant attributes, artificially filtering out influences the maximum factor to result, the flow data of all factors is pre-processed, Using preprocessing means such as data cleansing, data integration, data transformation, data merging and data remodelings to the exception of historical data Item, missing item, redundancy, differences carry out polishing;It is examined according to the sample data that pretreatment obtains, screen important The digitized description of information, the artificial dimension for establishing screening target signature.
3. The analysis method for extracting features from time-series stream data with a convolutional neural network according to claim 2, characterized in that, in step S1, the abnormal data is preprocessed as follows:
Missing data: the level of the data is raised while the data volume is reduced, so that missing data is filtered out;
Abnormal data: outliers are handled by deleting the data, by substituting values obtained from a comprehensive analysis of the overall model, or by treating them as missing values and filling them in equivalently, so that the deviation between the processed outliers and the other values is minimized;
Redundant data: if two attributes of the data are strongly correlated, the less important of the two attributes is removed;
Data normalization: applied because the data of the individual dimensions do not lie in a uniform range, which makes the weights swing too widely in calculations across dimensions and hinders adjustment and calculation;
Labels and timestamps: for supervised learning such as classification problems, the data set is usually labeled, and the data set is also given timestamps.
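The preprocessing modes listed in claim 3 can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the patent's implementation: the function names `pearson` and `preprocess`, the correlation threshold of 0.95, the choice to drop the later of two correlated attributes (standing in for "the less important one"), and min-max scaling as the normalization are all hypothetical choices made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def preprocess(rows, corr_threshold=0.95):
    """rows: list of dicts mapping attribute name -> float or None (missing)."""
    # Missing data: filter out records that contain a missing value.
    complete = [r for r in rows if all(v is not None for v in r.values())]
    names = list(complete[0].keys())
    cols = {n: [r[n] for r in complete] for n in names}
    # Redundant data: if an attribute is strongly correlated with one already
    # kept, drop it (assumption: the later attribute is the less important one).
    kept = []
    for n in names:
        if all(abs(pearson(cols[n], cols[k])) < corr_threshold for k in kept):
            kept.append(n)
    # Normalization: min-max scale every kept attribute into [0, 1].
    scaled = {}
    for n in kept:
        lo, hi = min(cols[n]), max(cols[n])
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled[n] = [(v - lo) / span for v in cols[n]]
    return scaled
```

For instance, given records with attributes `temp`, `temp_f` (a linear function of `temp`) and `load`, the sketch drops the incomplete records, removes `temp_f` as redundant, and rescales the remaining two attributes to a common range.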
4. The analysis method for extracting features from time-series stream data with a convolutional neural network according to claim 1, characterized in that the detailed process of step S2 is: stream data sampling, stream data filtering, and stream data acquisition;
wherein, in stream data sampling, for the general sampling problem, the stream data consists of a series of tuples with n fields, and a subset of these fields is called the key fields; assuming the sample is to make up a fraction a/b of the stream, the key value of each tuple is hashed into one of b buckets, and the tuples whose hash value is less than a are put into the sample; if there is more than one key field, the hash function combines the values of these fields into a single hash value; the resulting sample consists of all tuples having certain key values; the selected key values make up about a/b of all key values in the stream;
in stream data filtering, a Bloom filter is used; the Bloom filter comprises an array of n bits, each initially 0, a set of hash functions h1, h2, ..., hk, each of which maps a key value to one of the n buckets, and a set S of m key values; the Bloom filter lets through all stream elements whose key value is in S and blocks most stream elements whose key value is not in S;
in stream data acquisition, stream data generated close together in time are usually strongly associated, and the earlier an element appeared in the stream, the weaker its association; the stream data is therefore extracted with the decaying-window method, which computes a smoothed aggregate in which the weights decay continuously, called an exponentially decaying window and denoted $\sum_{i=0}^{t-1} a_{t-i}(1-c)^{i}$, where $a_1$ is the first element to arrive, $a_t$ is the current element, and $c$ is a small constant, for example $10^{-9}$.
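The three stages of step S2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper `_hash` (built on SHA-256), the names `in_sample`, `BloomFilter` and `DecayingWindow`, and all parameter values are assumptions made for the example. The decaying window is assumed to take the standard form $\sum_{i=0}^{t-1} a_{t-i}(1-c)^{i}$, which can be maintained incrementally by multiplying the running total by $(1-c)$ and adding the new element.

```python
import hashlib

def _hash(value, seed, buckets):
    """Deterministic hash of `value` into range(buckets); `seed` picks the function."""
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def in_sample(key, a, b):
    """Key-based sampling: keep a tuple if its key hashes below a among b buckets."""
    return _hash(key, "sample", b) < a

class BloomFilter:
    def __init__(self, n_bits, k_hashes):
        self.n_bits = n_bits
        self.k_hashes = k_hashes
        self.bits = [0] * n_bits  # every bit is initially 0

    def add(self, key):
        """Insert a key of the set S by setting the k bits it hashes to."""
        for i in range(self.k_hashes):
            self.bits[_hash(key, i, self.n_bits)] = 1

    def might_contain(self, key):
        # False means the key is definitely not in S; True may be a false positive.
        return all(self.bits[_hash(key, i, self.n_bits)]
                   for i in range(self.k_hashes))

class DecayingWindow:
    """Maintains sum_{i=0}^{t-1} a_{t-i} * (1 - c)^i incrementally."""

    def __init__(self, c):
        self.c = c
        self.total = 0.0

    def update(self, element):
        # Age all earlier elements by (1 - c), then add the new one with weight 1.
        self.total = self.total * (1.0 - self.c) + element
        return self.total
```

Because the sampling decision depends only on the key, every tuple with a selected key lands in the sample, which matches the statement that the sample consists of all tuples having certain key values. With a very small `c` such as 1e-9 the window decays slowly; a large value like 0.5 is useful only for checking the arithmetic by hand.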
5. The analysis method for extracting features from time-series stream data with a convolutional neural network according to claim 1, characterized in that, in step S3, the input data is fed into the model; the first layer of the model is a convolutional layer, whose input is the screened stream data; unlike in a traditional fully connected layer, the input of each node in the convolutional layer is only a small patch of the previous layer of the neural network; the convolutional layer analyzes each small patch of the neural network in greater depth to obtain more abstract features; the node matrix becomes deeper after passing through the convolutional layer, that is, the depth of the node matrix increases; the second layer is a pooling layer, which does not change the depth of the matrix but reduces its size; the pooling operation can be viewed as converting a high-resolution picture into a lower-resolution one (the data volume shrinks while the data features are retained); the pooling layer further reduces the number of nodes in the final fully connected layers and thereby the number of parameters in the whole neural network; after several rounds of convolutional and pooling layers, the final classification result is generally produced by one or two fully connected layers at the end of the convolutional neural network; by then, the information in the data has been abstracted into features with higher information content; the convolutional and pooling layers constitute the automatic feature extraction process, and once feature extraction is complete, fully connected layers are used to complete the classification task.
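The layer sequence described in claim 5 (convolution, pooling, fully connected classification) can be illustrated with a one-dimensional toy model in plain Python. This is a sketch under assumptions rather than the patented model: the function names, the ReLU activation, the non-overlapping max pooling and the hand-picked kernels and weights are all hypothetical, and no training is shown.

```python
def conv1d(signal, kernels):
    """Valid 1-D convolution (cross-correlation) with ReLU; one channel per kernel."""
    out = []
    for kernel in kernels:
        k = len(kernel)
        channel = [
            max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)
        ]
        out.append(channel)
    return out

def max_pool1d(channels, size):
    """Non-overlapping max pooling: same number of channels, shorter channels."""
    return [
        [max(ch[i:i + size]) for i in range(0, len(ch) - size + 1, size)]
        for ch in channels
    ]

def fully_connected(channels, weights, biases):
    """Flatten all channels and compute one score per class."""
    flat = [v for ch in channels for v in ch]
    return [sum(w * v for w, v in zip(row, flat)) + b
            for row, b in zip(weights, biases)]

# Toy forward pass: 8 input values, 2 kernels, pooling window of 2, 2 classes.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
conv_out = conv1d(signal, [[1, -1], [0.5, 0.5]])  # depth grows from 1 to 2
pooled = max_pool1d(conv_out, 2)                  # depth stays 2, length shrinks
scores = fully_connected(
    pooled,
    weights=[[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]],
    biases=[0.0, 1.0],
)
predicted_class = scores.index(max(scores))
```

The shapes follow the claim's description: each convolution output depends only on a small patch of the input, the convolution increases the depth (number of channels), and pooling reduces the size without changing the depth, so the fully connected layer needs fewer weights.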
6. The analysis method for extracting features from time-series stream data with a convolutional neural network according to claim 1, characterized in that, in step S4, an independent convolution is performed for each dimension; by extracting the features of each dimension separately, the features of each dimension are strengthened individually; finally, the strongest features of all dimensions are integrated and jointly used to determine the final classification result.
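One way to read the per-dimension convolution of claim 6 is sketched below. It is an assumption-laden illustration, not the patent's method: each dimension gets its own hypothetical kernel, "strongest feature" is interpreted as the maximum convolution response (global max pooling), and the final decision is a simple weighted sum with made-up weights.

```python
def conv_valid(series, kernel):
    """Valid 1-D convolution (cross-correlation) of one dimension with one kernel."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def per_dimension_features(stream, kernels):
    """stream and kernels are dicts keyed by dimension name."""
    features = {}
    for name, series in stream.items():
        response = conv_valid(series, kernels[name])  # independent convolution
        features[name] = max(response)  # strongest response of this dimension
    return features

def classify(features, weights, bias=0.0):
    """Combine the strongest feature of every dimension into one binary decision."""
    score = sum(weights[name] * value for name, value in features.items()) + bias
    return 1 if score > 0 else 0
```

With a spike detector kernel such as `[-1, 2, -1]` on one dimension and a difference kernel such as `[1, 0, -1]` on another, a spike in the first dimension produces a large feature there while a constant second dimension contributes zero, and the combined score decides the class.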
7. The analysis method for extracting features from time-series stream data with a convolutional neural network according to claim 1, characterized in that, in step S5, after deep learning has been carried out, the accuracy of the classification or prediction is calculated, and results such as logs and accuracy are generated for improving, modifying and debugging the model.
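The accuracy calculation and log generation of step S5 might look like the following sketch. The function names `accuracy` and `log_epoch`, the logger name and the log message format are assumptions for illustration; the claim does not specify them.

```python
import logging

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def log_epoch(logger, epoch, predictions, labels):
    """Write one log record per evaluation round and return the accuracy."""
    acc = accuracy(predictions, labels)
    logger.info("epoch=%d accuracy=%.4f", epoch, acc)
    return acc
```

A caller would typically run `logging.basicConfig(level=logging.INFO)` once and then call `log_epoch(logging.getLogger("cnn_stream"), epoch, predictions, labels)` after each evaluation, so that the resulting log can be plotted or compared between model variants.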
CN201811216349.3A 2018-10-18 2018-10-18 Analysis method for convolutional neural network extraction features aiming at time sequence flow data Active CN109299185B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811216349.3A CN109299185B (en) 2018-10-18 2018-10-18 Analysis method for convolutional neural network extraction features aiming at time sequence flow data

Publications (2)

Publication Number Publication Date
CN109299185A true CN109299185A (en) 2019-02-01
CN109299185B CN109299185B (en) 2023-04-07

Family

ID=65157370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811216349.3A Active CN109299185B (en) 2018-10-18 2018-10-18 Analysis method for convolutional neural network extraction features aiming at time sequence flow data

Country Status (1)

Country Link
CN (1) CN109299185B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194404A (en) * 2017-04-13 2017-09-22 哈尔滨工程大学 Submarine target feature extracting method based on convolutional neural networks
WO2018028255A1 (en) * 2016-08-11 2018-02-15 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial network
CN108647834A (en) * 2018-05-24 2018-10-12 浙江工业大学 A kind of traffic flow forecasting method based on convolutional neural networks structure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王勇; 周慧怡; 俸皓; 叶苗; 柯文龙: "Network traffic classification method based on deep convolutional neural networks" (基于深度卷积神经网络的网络流量分类方法) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111967616A (en) * 2020-08-18 2020-11-20 深延科技(北京)有限公司 Automatic time series regression method and device
CN111967616B (en) * 2020-08-18 2024-04-23 深延科技(北京)有限公司 Automatic time series regression method and device
CN111966740A (en) * 2020-08-24 2020-11-20 安徽思环科技有限公司 Water quality fluorescence data feature extraction method based on deep learning
CN112232197A (en) * 2020-10-15 2021-01-15 武汉微派网络科技有限公司 Juvenile identification method, device and equipment based on user behavior characteristics
CN112184056A (en) * 2020-10-19 2021-01-05 中国工商银行股份有限公司 Data feature extraction method and system based on convolutional neural network
CN112184056B (en) * 2020-10-19 2024-02-09 中国工商银行股份有限公司 Data feature extraction method and system based on convolutional neural network
CN114385699A (en) * 2022-01-06 2022-04-22 云南电网有限责任公司信息中心 Abnormal analysis method for user price rate of power grid
CN115065560A (en) * 2022-08-16 2022-09-16 国网智能电网研究院有限公司 Data interaction leakage-prevention detection method and device based on service time sequence characteristic analysis

Also Published As

Publication number Publication date
CN109299185B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN109299185A (en) A kind of convolutional neural networks for timing flow data extract the analysis method of feature
CN109300121B (en) A kind of construction method of cardiovascular disease diagnosis model, system and the diagnostic device
US11074511B2 (en) System and method for graph pattern analysis
Khomenko et al. Accelerating recurrent neural network training using sequence bucketing and multi-gpu data parallelization
CN110442684A (en) A kind of class case recommended method based on content of text
CN112859822B (en) Equipment health analysis and fault diagnosis method and system based on artificial intelligence
CN109117864A (en) Coronary heart disease risk prediction technique, model and system based on heterogeneous characteristic fusion
CN110532996A (en) The method of visual classification, the method for information processing and server
CN107516110A (en) A kind of medical question and answer Semantic Clustering method based on integrated convolutional encoding
CN109165950A (en) A kind of abnormal transaction identification method based on financial time series feature, equipment and readable storage medium storing program for executing
CN109598387A (en) Forecasting of Stock Prices method and system based on two-way cross-module state attention network model
CN108985929A (en) Training method, business datum classification processing method and device, electronic equipment
CN110188653A (en) Activity recognition method based on local feature polymerization coding and shot and long term memory network
CN111292195A (en) Risk account identification method and device
CN108960264A (en) The training method and device of disaggregated model
CN106874963B (en) A kind of Fault Diagnosis Method for Distribution Networks and system based on big data technology
CN110046550A (en) Pedestrian's Attribute Recognition system and method based on multilayer feature study
Li et al. Multi-factor based stock price prediction using hybrid neural networks with attention mechanism
CN113807951A (en) Transaction data trend prediction method and system based on deep learning
CN114219096A (en) Training method and device of machine learning algorithm model and storage medium
CN110490333A (en) The professional content intelligent generation method write based on AI
CN109934352A (en) The automatic evolvement method of model of mind
CN113468203B (en) Financial user image drawing method based on recurrent neural network and attention mechanism
Wang et al. Transfer ensemble model for customer churn prediction with imbalanced class distribution
Wu A High-Performance Customer Churn Prediction System based on Self-Attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 200032 No. two, 851 South Road, Xuhui District, Shanghai, Zhongshan

Applicant after: Shanghai Shipbuilding Technology Research Institute (the 11th Research Institute of China Shipbuilding Corp.)

Address before: 200032 No. two, 851 South Road, Xuhui District, Shanghai, Zhongshan

Applicant before: SHIPBUILDING TECHNOLOGY Research Institute (NO 11 RESEARCH INSTITUTE OF CHINA STATE SHIPBUILDING Corp.,Ltd.)

GR01 Patent grant