CN110348624A - Sandstorm intensity classification prediction method based on a Stacking ensemble strategy - Google Patents
- Publication number
- CN110348624A (application number CN201910598794.9A)
- Authority
- CN
- China
- Prior art keywords
- sample
- data
- classification
- sandstorm
- classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/044 - Recurrent networks, e.g. Hopfield networks
- G06N3/045 - Combinations of networks
- G06N3/08 - Learning methods
- G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q50/26 - Government or public services
Abstract
A sandstorm intensity classification prediction method based on a Stacking ensemble strategy uses a recurrent neural network R and a convolutional neural network C as first-level classifiers: the original weather sample data are input into R and C respectively to obtain the corresponding first-level learned features. Following the Stacking ensemble strategy, a meta-classifier Q is introduced as the second-level classifier, and the first-level learned features are combined as its input; the output of the second-level classifier is taken as the finally predicted sandstorm intensity class. The invention fuses the time-series processing ability of the RNN with the high-dimensional feature-extraction ability of the CNN, giving a wider prediction perspective and better generalization ability. Choosing PReLU as the default activation function improves model flexibility and generalization; replacing fully connected layers with 1×1 convolution kernels integrates more features and provides better generalization performance; and applying L2 regularization together with Batch Normalization or Dropout improves the generalization ability of the classifiers at each level as well as the prediction accuracy and precision of the overall classifier.
Description
Technical field
The invention belongs to the field of computer technology, and in particular relates to a sandstorm intensity classification prediction method based on a Stacking ensemble strategy.
Background art
Sandstorms are a natural disaster that occurs frequently in arid and semi-arid regions; sandstorm phenomena existed on Earth as early as 70 million years ago. In modern times, owing to environmental causes such as soil erosion, land desertification and vegetation degradation, the number of sandstorms occurring in northern China, and especially in the northwest, has risen markedly, and the influence of sandstorms on people's production and daily life keeps growing.
Traditional weather forecasting makes qualitative or quantitative predictions about the weather conditions of a certain region or place over a coming period, based on meteorological observation data and on the principles and methods of synoptic meteorology, dynamic meteorology and statistics. Over the past decades, weather forecasting techniques and mechanisms have developed tremendously, but recently conventional methods have long failed to make a qualitative leap. As the informatization of weather services becomes more and more complete, improving the forecast accuracy for common and hazardous weather has gradually become a hot research direction in related fields. Because the causes of sandstorms are complex and the volume of meteorological data is enormous, an ordinary neural network either struggles to fit such data or struggles to generalize from it.
Summary of the invention
To overcome the above shortcomings of the prior art, the purpose of the present invention is to provide a sandstorm intensity classification prediction method based on a Stacking ensemble strategy, which models and predicts the sandstorm meteorological grade with an ensemble of deep neural networks. Specifically, a recurrent neural network R serves as one first-level classifier and a convolutional neural network C as another; the prediction performance of these first-level classifiers, i.e. their feature-extraction ability, is improved as far as possible; a second-level meta-classifier model is provided that can make good use of the temporal features extracted by first-level classifier R and the high-dimensional features extracted by first-level classifier C; and an effective ensemble method is provided that lets the extracted features fuse fully.
To achieve the above goals, the technical solution adopted by the present invention is as follows:
A sandstorm intensity classification prediction method based on a Stacking ensemble strategy comprises:
using a recurrent neural network R and a convolutional neural network C as first-level classifiers, inputting the original weather sample data into the recurrent neural network R and the convolutional neural network C respectively, and obtaining the corresponding first-level learned features;
following the Stacking ensemble strategy, introducing a meta-classifier Q as the second-level classifier, and combining the first-level learned features as the input of the second-level classifier;
taking the output of the second-level classifier as the finally predicted sandstorm intensity class.
The original weather sample data is obtained as follows:
the "China surface climate daily value data set" and the "China strong sandstorm sequence and its supporting data set" are merged by date into one whole data set;
the whole data set undergoes data preprocessing such as data cleaning and attribute screening;
the preprocessed data are arranged in time order, with attributes unfolded from left to right and time running from top to bottom, and each data item is given a sandstorm intensity grade label, finally yielding the original weather sample data.
The present invention uses the recurrent neural network R to extract the temporal features of the original weather sample data and the convolutional neural network C to extract its high-dimensional features, and fuses the advantages of the two deep neural networks with a meta-classifier to obtain good, well-generalizing prediction and classification performance.
Specifically, the m original weather samples can be randomly cut into k sample sets, each sample set S_i (i ≤ k) is enumerated in turn, and the remaining sample sets are used as the training set to train the two first-level classifiers separately; the trained first-level base models, denoted C_i and R_i, then perform sandstorm intensity classification prediction on their corresponding sample set S_i. The prediction of each base model for the i-th training sample set becomes one feature value of the i-th sample in a new sample set, all feature values are combined into new feature samples, and these feature samples finally serve as the training set of the second-level classifier. For the prediction process, all first-level classifiers first predict to form a feature sample set, which is finally predicted on once more, thereby obtaining a better prediction effect.
The feedforward neural networks of the first-level classifiers and the second-level classifier propagate information by the formula:

a^(l) = f_l(W^(l) · a^(l−1) + b^(l))

where a^(l) denotes the output of the layer-l neurons, f_l the activation function of the layer-l neurons, W^(l) the weight matrix from layer l−1 to layer l, and b^(l) the bias from layer l−1 to layer l.
The classification layers of the first-level classifiers and the second-level classifier use Softmax as the output function:

softmax(z_j) = e^(z_j) / Σ_{k=1..K} e^(z_k)

where j = 1, …, K, K is the number of classification categories, and z is the vector generated by the layer preceding the classification layer, i.e. the vector data fed into the Softmax function.
The probability that a sample vector x belongs to the j-th category is:

P(y = j | x) = e^(x^T w_j) / Σ_{k=1..K} e^(x^T w_k)
where x^T denotes the transpose of x, w denotes the weights, and k is the summation index running from 1 to K. The recurrent-unit activation function of the recurrent neural network R is tanh (the hyperbolic tangent), the classification layer of every classifier uses the Softmax activation function, and the remaining parts of every classifier (the fully connected layer of the second-level classifier; the convolutional layers, pooling layers and fully connected layer of first-level classifier C; and the fully connected layer of first-level classifier R) default to the parametric rectified linear unit (PReLU) activation function:

PReLU(x) = x, if x > 0; α·x, if x ≤ 0

where x is the input data and α is an adjustable coefficient obtained by neural network learning. If learning yields α = 0, PReLU degenerates into the rectified linear unit (ReLU); if α is a small fixed value, PReLU degenerates into the leaky rectified linear unit (Leaky ReLU, LReLU).
The first-level classifiers and the second-level classifier use cross entropy as the cost function for the overall training of the model:

H(P, Q) = −E_{x~P}[log Q(x)] = −Σ_x P(x) log Q(x)

where P and Q are two given probability distributions, namely the distributions of the predicted labels and the true labels; since sandstorm labels are discretely distributed, −E_{x~P} is equivalent to −Σ_x P(x); P(x) describes the true distribution of a sample, and Q(x) the predicted distribution.
In the neural networks of the first-level classifiers and the second-level meta-classifier the samples are independently and identically distributed, and the cross entropy follows the maximum-likelihood principle, i.e.

θ_ML = argmax_θ Σ_{i=1..n} log p(y^(i) | x^(i); θ)

where ŷ^(i) is the output for the i-th sample input data x^(i), i.e. the predicted label vector; n is the number of samples in each training batch, a subset of the m samples, each batch training on a part of them; y^(i) is the sandstorm label vector of the i-th sample; x^(i) is the input data of the i-th sample; θ denotes the distribution parameter of maximum-likelihood estimation, i.e. the parameter value estimated from the sample input data according to the sample label distribution; p(y^(i) | x^(i); θ) denotes the maximum likelihood of a single sample, which is accumulated into the overall maximum likelihood; and σ denotes the standard deviation of the sample label distribution to be estimated.
Accuracy, precision, recall and the F1 score are used together as the comprehensive performance metrics of the first-level and second-level classifier models, where the F1 score is the harmonic mean of precision and recall.
The recurrent neural network R is a multilayer deep RNN that uses gated recurrent units (GRU) to solve the long-term dependence problem of traditional RNNs; its fully connected layer is likewise replaced by a 1×1 convolutional layer for model stabilization and feature integration; the activation function inside the GRU units is tanh, the other layers of the model except the classification layer default to PReLU, and batch normalization (BN) together with the L2 regularization method is used to reduce overfitting and increase generalization.
The convolutional neural network C is a multilayer deep CNN that obtains local feature information through convolution kernels and performs down-sampling through pooling layers; down-sampling reduces the feature dimensionality, compresses the amount of data and parameters, reduces overfitting, and improves the fault tolerance of the model. Its fully connected layer is replaced by a 1×1 convolutional layer for model stabilization and feature integration, and batch normalization (BN) and the L2 regularization method are used to reduce overfitting and increase generalization.
The meta-classifier Q is a multilayer fully connected neural network that uses Dropout (DP) and the L2 regularization method to reduce overfitting; its fully connected layers are replaced by 1×1 convolutional layers for model stabilization and feature integration, i.e. the meta-classifier Q is in essence a convolutional neural network stacked from multiple convolutional layers whose kernel size is 1×1.
Replacing fully connected layers with 1×1 convolutional layers realizes cross-channel interaction and information integration and allows reducing or raising the number of convolution-kernel channels; the storage form of each sample is identical to a grayscale picture, i.e. each sample has one feature map.
The batch normalization (BN) method reduces overfitting by making the activation of each neuron follow a Gaussian distribution, i.e. a neuron is usually moderately active, sometimes somewhat active and rarely very active. The BN algorithm is:

μ_B = (1/m) Σ_{i=1..m} x_i
σ_B² = (1/m) Σ_{i=1..m} (x_i − μ_B)²
x̂_i = (x_i − μ_B) / sqrt(σ_B² + ε)
y_i = γ · x̂_i + β

where m is the number of samples in one batch; x_i represents an input sample; μ_B is the mean of the samples in this batch; σ_B² is the variance of the samples in this batch; x̂_i is the normalized sample data; γ is the scale factor; β is the shift factor; and y_i is the data finally obtained by the batch normalization (BN) operation.
The BN procedure thus divides into four steps:
(1) compute the mean of each training batch;
(2) compute the variance of each training batch;
(3) normalize the training data of the batch with the computed mean and variance, obtaining a standard normal (0-1) distribution;
(4) scale and shift: adjust the magnitude of x̂_i by multiplying by the scale factor γ, then add the shift factor β to obtain y_i. Because the normalized x̂_i is essentially confined to a standard normal distribution, the expressive power of the network would decline; to solve this problem, two new parameters γ and β are introduced, which the network learns by itself during training.
The L2 regularization method adds the squared sum of the weight parameters to the original loss function; the loss function after L2 regularization is expressed as:

L(w) = E_in(w) + λ · wᵀw

where w are the classifier network model parameters, E_in(w) is the training sample error without the regularization term, and λ is the regularization parameter.
From the above formula, the gradient of L(w) is expressed as:

∇L(w) = ∇E_in(w) + 2λw

The Dropout (DP) method lets the activation value of a neuron stop working with a certain probability p during the forward propagation of the neural network, so that in each training batch the model does not depend too heavily on particular local features of the batch data.
Compared with the prior art, the beneficial effects of the present invention are:
(1) Traditional meteorological forecasting methods use synoptic meteorology, meteorological dynamics, statistics and similar methods, and predict common weather such as precipitation and temperature rather well. But sandstorms are a special weather phenomenon that requires meteorological factors from many aspects to be considered, and predicting sandstorms with traditional weather forecasting methods consumes large computing and human resources. Since deep neural networks have advantages in feature extraction and time-series modeling, prediction by deep learning can use data and computing resources more flexibly and efficiently, and because its statistical prediction perspective is broader, it can also serve as an effective complement to traditional meteorological prediction.
(2) Compared with techniques using a single deep neural network, the present invention adopts the Stacking ensemble technique with an RNN and a CNN as its first-level classifiers, which fuses the time-series processing ability of the RNN with the high-dimensional feature-extraction ability of the CNN very well and gives a wider prediction perspective and better generalization ability.
(3) Using PReLU (the parametric rectified linear unit) as the default activation function, rather than ReLU, lets the network learn to either degenerate into ReLU automatically or retain a parameter α with relatively better classification effect, which improves the flexibility and generalization ability of the model.
(4) Using the technique of replacing fully connected layers with 1×1 convolution kernels deepens the network without increasing the receptive field and introduces more nonlinear neurons, so more features can be integrated, better generalization performance is provided, and the comprehensive prediction performance is improved.
(5) Using Batch Normalization and L2 regularization in the first-level classifiers and Dropout and L2 regularization in the second-level meta-classifier improves the generalization ability of the classifiers at each level and improves the prediction accuracy and precision of the overall classifier.
Brief description of the drawings
Fig. 1 is the flow chart of the Stacking-ensemble neural network of the present invention.
Fig. 2 is a schematic diagram of the grid-type time-series meteorological data.
Fig. 3 is a schematic diagram of a feedforward neural network.
Fig. 4 is a schematic diagram of the ReLU, PReLU and tanh activation functions.
Fig. 5 is the feature-extraction flow chart of first-level classifier R.
Fig. 6 is an unrolled view of an RNN recurrent unit.
Fig. 7 is a diagram of the GRU gating mechanism.
Fig. 8 is a diagram of the convolution and pooling operations.
Fig. 9 is a diagram of a 1×1 convolution kernel replacing a fully connected operation.
Fig. 10 is a comparison schematic of Dropout.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and embodiments.
The recurrent neural network (RNN) is one kind of deep learning model, commonly used for processing sequence data. Because meteorological data has spatiotemporal characteristics and periodicity, the present invention uses a recurrent neural network as one of its first-level classifiers, and uses gated recurrent units (GRU) to solve the long-term dependence problem of traditional RNNs, analyzing and predicting on the collected sandstorm meteorological sequence data.
Convolutional neural networks (CNN) generally perform better at high-dimensional feature extraction. A convolutional neural network is a neural network designed specifically to handle grid-like data, and thanks to its strong feature-extraction ability it performs excellently in many fields, such as image processing. Meteorological data as a whole is a time series, and each record contains multiple meteorological attributes, making it typical high-dimensional data. Facing meteorological data, the future sandstorm intensity grade can be predicted not only from the temporal context but also by extracting useful information from the high-dimensional meteorological factors. Based on the above advantages, the present invention uses a CNN as its second first-level classifier.
That is, the sandstorm intensity classification prediction method based on a Stacking ensemble neural network of the present invention uses the recurrent neural network to extract the temporal features of the original weather data and the convolutional neural network to extract its high-dimensional features, and, by the Stacking ensemble strategy, fuses the advantages of the two deep neural networks with a meta-classifier to obtain good, well-generalizing prediction and classification performance.
Performance metric requirement: for each classifier, the comprehensive performance composed of classification accuracy, recall, precision and F1 score should be as high as possible.
Specifically, as shown in Fig. 1, on the basis of the convolutional neural network C and the recurrent neural network R as first-level sandstorm classifiers, the present invention applies the Stacking ensemble strategy and introduces a second-level meta-classifier Q to obtain more generalizable and accurate prediction results.
The main function of the above first-level classifiers C and R is to take the original weather sample data, input it into their own networks respectively, obtain the first-level learned features, and combine these first-level features into the input of the second-level classifier Q; R focuses on the temporal features of the original weather sample data, and C on its high-dimensional features.
After data preprocessing, the raw sample data is grid-type meteorological data arranged by time order, with time running from top to bottom and attributes arranged from left to right, as shown in Fig. 2, where W represents the time span of the time series in days and L represents the number of attributes of the meteorological data; the figure therefore shows a time series of W days with L meteorological attributes.
The raw sample data is obtained specifically as follows:
the "China surface climate daily value data set" and the "China strong sandstorm sequence and its supporting data set" are merged by date into one whole data set;
the whole data set undergoes data preprocessing such as data cleaning and attribute screening, where attributes refer to meteorological attributes collected by the meteorological department, such as wind speed and sunshine duration;
the preprocessed data are arranged in time order, with attributes unfolded from left to right and time running from top to bottom, and each data item is given a sandstorm intensity grade label, finally yielding the original weather sample data.
Here, time order refers to the chronological order in which the meteorological attribute data were recorded under a unified metric. For example, if in some sample the data record starts on June 1 and ends on June 15, then the time order of this sample is the 15-day sequence from June 1 to June 15. The sandstorm intensity grade labels fall into five classes by visibility according to the national standard, plus the class in which no sandstorm occurs, and are defined as {0, 1, 2, 3, 4, 5}, where a smaller number means a higher grade: 0 is the most severe sandstorm class and 5 means no sandstorm occurred. For example, if in some sample the data record starts on June 1 and ends on June 15, its grade label is the sandstorm intensity grade of June 16; in this way the data of the preceding 15 days predict the sandstorm intensity grade of the 16th day (the label could also be set to the grade of a later date).
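To make the windowing concrete, the following Python sketch builds such samples. It is a minimal sketch under stated assumptions: the merged daily records are already in a NumPy array `data` of shape (days, L), `grade` holds each day's sandstorm grade, and the helper name `make_samples` and the 15-day window are illustrative choices, not part of the patent.

```python
import numpy as np

def make_samples(data: np.ndarray, grade: np.ndarray, W: int = 15):
    """Slide a W-day window over the series: each sample is the W x L grid of
    attributes (time top-down, attributes left-right) and its label is the
    sandstorm grade of the day right after the window."""
    X, y = [], []
    for t in range(len(data) - W):
        X.append(data[t:t + W])   # W days x L attributes
        y.append(grade[t + W])    # grade of the day following the window
    return np.stack(X), np.array(y)
```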
Specifically, the m original weather samples can be randomly cut into k sample sets, and each sample set S_i (i ≤ k) is enumerated in turn: the remaining sample sets serve as the training set on which the two first-level classifiers are trained separately, giving k base models {R} and {C} for the trained first-level classifiers R and C respectively, where each base model may be denoted R_i and C_i. Each base model thus has one sample set S_i that was not among its training samples and serves as its prediction samples; it performs sandstorm intensity classification prediction on this corresponding sample set S_i. The prediction of each base model for the i-th training sample set becomes one feature value of the i-th sample in the new sample set, and all feature values are combined into new feature samples. The combination keeps base-model feature values of the same type in the same row (base models of the same type means that all R_i are of one type and all C_i of the other), and unfolds the feature values predicted for the same sample set by column. These feature samples finally serve as the training set of the second-level meta-classifier.
For the prediction process, all first-level base models C_i and R_i first predict on the test set, which gives k prediction result sets; the prediction result sets are voted on (taking the majority class) to form the feature sample prediction set. The specific combination places the features voted by base models of the same type by row and unfolds the feature values predicted for the same samples by column, giving a feature sample prediction set of the same form as the above feature sample training set. Finally the second-level meta-classifier predicts on it, obtaining a better prediction and classification effect.
The feedforward neural networks of the first-level classifiers and the second-level classifier of the present invention are shown in Fig. 3; this structure has the following characteristics:
(1) the neurons of each layer are fully interconnected with the neurons of the next layer;
(2) there are no connections between neurons of the same layer;
(3) there are no cross-layer connections between neurons.
The units in the hidden layer and the output layer are neurons, and "feedforward" means that the network topology contains no rings or circuits.
The feedforward neural networks of the first-level classifiers and the second-level classifier propagate information by the formulas:

z^(l) = W^(l) · a^(l−1) + b^(l)
a^(l) = f_l(z^(l))

Merging the two formulas:

a^(l) = f_l(W^(l) · a^(l−1) + b^(l))

where a^(l) denotes the output of the layer-l neurons, f_l the activation function of the layer-l neurons, W^(l) the weight matrix from layer l−1 to layer l, and b^(l) the bias from layer l−1 to layer l.
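As an illustration of this propagation rule, here is a minimal NumPy sketch; the function name `forward` and the list-based parameter layout are assumptions made for illustration only.

```python
import numpy as np

def forward(a, weights, biases, activations):
    """Propagate a(0) through the layers: weights[l] and biases[l] map layer l-1
    to layer l, and activations[l] is the activation function f_l of layer l."""
    for W_l, b_l, f_l in zip(weights, biases, activations):
        z = W_l @ a + b_l      # z(l) = W(l) . a(l-1) + b(l)
        a = f_l(z)             # a(l) = f_l(z(l))
    return a
```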
The fully connected layer of the second-level meta-classifier, the convolutional layers, pooling layers and fully connected layer of first-level classifier C, and the fully connected layer of R all use PReLU (the parametric rectified linear unit) as their activation function; the recurrent units in R use tanh (the hyperbolic tangent); and the classification layer of every classifier uses the Softmax activation function. PReLU is defined as:

PReLU(x) = x, if x > 0; α·x, if x ≤ 0

where x is the input data and α is an adjustable coefficient obtained by neural network learning. If learning yields α = 0, PReLU degenerates into the rectified linear unit (ReLU); if α is a small fixed value (e.g. α = 0.01), PReLU degenerates into the leaky rectified linear unit (Leaky ReLU, LReLU). Compared with ReLU, the coefficient α of the negative part of PReLU is determined from the data rather than fixed at 0, which gives the model a higher fitting capability; compared with LReLU, PReLU obtains α more flexibly through training, while adding only a minimal number of parameters, which also means only a negligible extra amount of computation and risk of overfitting.
The tanh activation function is defined as:

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))

where x is the input data.
PReLU is chosen because, compared with tanh, it converges faster under gradient descent and costs less to compute; compared with ReLU, PReLU can, through network learning, automatically either degenerate into ReLU or retain a parameter α with relatively better classification effect, which improves the flexibility and generalization ability of the model. tanh is the activation function generally used in RNN recurrent units. PReLU and tanh are illustrated in Fig. 4.
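The two activation functions compared above can be written in a few lines; this is a minimal sketch in which α is treated as a plain scalar argument (in the invention it is a coefficient learned by the network).

```python
import numpy as np

def prelu(x, alpha):
    """PReLU(x) = x for x > 0 and alpha * x otherwise; alpha = 0 recovers ReLU,
    and a small fixed alpha (e.g. 0.01) recovers Leaky ReLU."""
    return np.where(x > 0, x, alpha * x)

def tanh(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x), used inside the GRU units."""
    return np.tanh(x)
```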
When the above first-level classifiers R and C perform feature extraction, the input layer takes the original weather data x as the first-layer input a^(0) and substitutes it into f(w, a, b); the multilayer hidden layers produce the output a^(l) as the output vector of the entire function; this output vector is then taken as the input vector of the Softmax activation function, and the Softmax function outputs the normalized prediction probabilities.
That is, the classification layers of the first-level classifiers and the second-level classifier use Softmax as the output function; unlike the cost function, the classification layer yields the prediction result. The Softmax function is:

softmax(z_j) = e^(z_j) / Σ_{k=1..K} e^(z_k)

where j = 1, …, K, K is the number of classification categories (6 in the present invention), and z is the vector generated by the layer preceding the classification layer, i.e. the vector data fed into the Softmax function.
The Softmax function is in fact the gradient-log-normalization of a finite discrete probability distribution. In particular, in the multinomial logistic regression and linear discriminant analysis of the present invention, the input of the function is the result of K different linear functions, and the probability that a sample vector x belongs to the j-th category is:

P(y = j | x) = e^(x^T w_j) / Σ_{k=1..K} e^(x^T w_k)

This can be regarded as the composition of K linear functions with the Softmax function; x^T denotes the transpose of x, w denotes the weights, and k is the summation index running from 1 to K.
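A minimal sketch of the Softmax classification layer follows; subtracting the row maximum before exponentiating is a standard numerical-stability step assumed here, not something stated in the patent.

```python
import numpy as np

def softmax(z):
    """Normalize a vector (or batch of vectors) of K = 6 class scores into
    prediction probabilities that sum to 1."""
    z = z - z.max(axis=-1, keepdims=True)   # stability shift, leaves result unchanged
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```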
The first-level classifiers and the second-level classifier use cross entropy as the cost function (also called the loss function) for the overall training of the model:

H(P, Q) = −E_{x~P}[log Q(x)] = −Σ_x P(x) log Q(x)

where P and Q are two given probability distributions, namely the distributions of the predicted labels and the true labels. Since sandstorm labels are discretely distributed, −E_{x~P} is equivalent to −Σ_x P(x); P(x) describes the true distribution of a sample, e.g. [1, 0, 0, 0] indicates that sample x belongs to the first class, while Q(x) describes the predicted distribution, e.g. [0.7, 0.1, 0.1, 0.1] means the predicted probability that sample x belongs to the first class is 0.7.
In the neural networks of the first-level classifiers and the second-level meta-classifier the samples are independently and identically distributed, and the cross entropy follows the maximum-likelihood principle, i.e.

θ_ML = argmax_θ Σ_{i=1..n} log p(y^(i) | x^(i); θ)

where ŷ^(i) is the output for the i-th sample input data x^(i), i.e. the predicted label vector (labels use one-hot coding); n is the number of samples in each training batch, a subset of the m samples, each batch training on a part of them; y^(i) is the sandstorm label vector of the i-th sample (one-hot coded); x^(i) is the input data of the i-th sample; θ represents the distribution parameter in maximum-likelihood estimation, i.e. the parameter value estimated from the sample input data according to the sample label distribution; p(y^(i) | x^(i); θ) represents the maximum likelihood of a single sample (taking the logarithm for convenience of calculation does not affect the result), which is accumulated into the overall maximum likelihood; and σ represents the standard deviation of the sample label distribution to be estimated, i.e. in this formula the distribution parameter θ to be estimated on the left is the σ on the right.
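The batch cross-entropy cost can be sketched as below; averaging over the n samples of the batch and the small epsilon guard inside the logarithm are conventional assumptions, not taken verbatim from the patent.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """y_true: (n, K) one-hot sandstorm labels; y_pred: (n, K) Softmax outputs.
    Returns the batch mean of -sum_x P(x) log Q(x)."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))
```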
For all the above classifiers, accuracy, precision, recall and the F1 score are used as the comprehensive performance metrics. Let:
TP (True Positive): judged a positive sample, and in fact a positive sample.
TN (True Negative): judged a negative sample, and in fact a negative sample.
FP (False Positive): judged a positive sample, but in fact a negative sample.
FN (False Negative): judged a negative sample, but in fact a positive sample.
The accuracy formula is:
Accuracy = (TP + TN) / (TP + TN + FN + FP)
The precision formula is:
Precision = TP / (TP + FP)
The recall formula is:
Recall = TP / (TP + FN)
The F1 score, also known as the balanced F score, is defined as the harmonic mean of precision and recall:
F1 = 2 · Precision · Recall / (Precision + Recall)
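These four metrics reduce to a few arithmetic lines; the sketch below treats one class against the rest, and the zero guards for empty denominators are an assumption added for robustness.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1
```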
For the first-level classifiers R and C, their own performance metric is not a measurement of the overall model performance but of their feature-extraction ability; that is, the performance levels of R and C do not directly reflect the performance of the whole classification model.
Training and prediction by the first-level classifiers yields the weather meteorological data features based on the first-level classifiers, which become the input of the second-level meta-classifier. The specific feature-extraction flow of first-level classifier R is shown in Fig. 5; the flow for first-level classifier C is analogous. The overall feature-extraction process can be described as follows (a sketch of this procedure is given after the list):
(a) For model 1 (first-level classifier R in the present invention), divide the training set into k parts; for each part, train the model with the remaining data and then predict on this part.
(b) Repeat the previous step until every part has been predicted, obtaining one part of the training set of the second-level classifier model.
(c) Obtain k sets of test-set predictions; after averaging and rounding, obtain one part of the test set of the meta-classifier model.
(d) Repeat the above steps for model 2 (first-level classifier C), obtaining the entire training set and test set of the meta-classifier model.
(e) Train the meta-classifier Q model and predict.
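The following Python sketch implements steps (a) to (d) for generic base models; it is a minimal sketch assuming scikit-learn-style `fit`/`predict` objects standing in for the trained networks R and C, and it combines the k test-set predictions by majority vote (averaging and rounding, as in step (c), would be an equally simple variant).

```python
import numpy as np

def stacking_features(models, X, y, X_test, k=5):
    """models: e.g. [R, C]. Returns the meta-classifier training features
    (out-of-fold predictions) and test features (voted fold predictions)."""
    folds = np.array_split(np.arange(len(X)), k)
    train_meta = np.zeros((len(X), len(models)))
    test_meta = np.zeros((len(X_test), len(models)))
    for m, model in enumerate(models):
        fold_test_preds = []
        for hold in folds:                          # steps (a)-(b)
            rest = np.setdiff1d(np.arange(len(X)), hold)
            model.fit(X[rest], y[rest])
            train_meta[hold, m] = model.predict(X[hold])
            fold_test_preds.append(model.predict(X_test))
        stacked = np.stack(fold_test_preds)         # step (c): k predictions per test sample
        test_meta[:, m] = [np.bincount(col.astype(int)).argmax() for col in stacked.T]
    return train_meta, test_meta                    # step (d)
```

Step (e) then amounts to fitting the meta-classifier Q on `train_meta` against y and predicting on `test_meta`.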
A traditional RNN unrolled is shown in Fig. 6. The first-level classifier recurrent neural network R of the present invention is a multilayer deep RNN that uses gated recurrent units (GRU) to solve the long-term dependence problem of traditional RNNs. The GRU is a variant of the traditional RNN that introduces a gating mechanism, namely an update gate and a reset gate. The gating mechanism of the GRU is shown in Fig. 7, where z_t and r_t denote the update gate and the reset gate respectively. The update gate controls the degree to which the state information of the previous moment is brought into the current state; the larger the value of the update gate, the more state information of the previous moment is brought in. The reset gate controls how much information of the previous state is written to the current candidate set h̃_t; the smaller the reset gate, the less information of the previous state is written in.
The forward-propagation formulas of the GRU are:

r_t = σ(W_r · [h_{t−1}, x_t])   (1)
z_t = σ(W_z · [h_{t−1}, x_t])   (2)
h̃_t = tanh(W_h · [r_t * h_{t−1}, x_t])   (3)
h_t = z_t * h_{t−1} + (1 − z_t) * h̃_t   (4)
y_t = σ(W_o · h_t)   (5)

where [ ] indicates that two vectors are concatenated and * represents the element-wise product. The recurrent neural network R replaces its fully connected layer with a 1×1 convolutional layer for model stabilization and feature integration. The tanh in (3) is the default activation function of the GRU unit and can be replaced by other activation functions, such as the rectified linear unit (ReLU). The other layers of the model except the classification layer default to PReLU, and batch normalization (BN) and the L2 regularization method are used to reduce overfitting and increase generalization.
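One GRU step following formulas (1) to (5) can be sketched as below; the bias terms are omitted and the weight shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wr, Wz, Wh, Wo):
    """x_t: input at time t; h_prev: previous hidden state h_{t-1}."""
    xh = np.concatenate([h_prev, x_t])                            # [h_{t-1}, x_t]
    r_t = sigmoid(Wr @ xh)                                        # (1) reset gate
    z_t = sigmoid(Wz @ xh)                                        # (2) update gate
    h_tilde = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]))   # (3) candidate state
    h_t = z_t * h_prev + (1.0 - z_t) * h_tilde                    # (4) larger z_t keeps more h_{t-1}
    y_t = sigmoid(Wo @ h_t)                                       # (5) output
    return h_t, y_t
```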
The first-level classifier convolutional neural network C of the present invention is a multilayer deep CNN. The function of its convolutional layers is to extract features from the input data; internally a layer contains multiple convolution kernels, and each element composing a kernel corresponds to a weight coefficient and a bias vector, similar to a neuron of a feedforward neural network. Every neuron in a convolutional layer is connected to multiple neurons in a nearby region of the previous layer; the size of this region depends on the size of the convolution kernel and is called the "receptive field". Local feature information is obtained through the convolution kernels, and down-sampling is performed through the pooling layers; down-sampling reduces the feature dimensionality, compresses the amount of data and parameters, reduces overfitting, and improves the fault tolerance of the model. The fully connected layer is replaced by a 1×1 convolutional layer for model stabilization and feature integration, and batch normalization (BN) and the L2 regularization method are used to reduce overfitting and increase generalization.
Specifically, a convolution kernel regularly sweeps the input features, multiplying element-wise with the input features inside the receptive field, summing, and adding the bias. After the convolutional layer extracts features, the output feature map is passed to the pooling layer for feature selection and information filtering. The pooling layer contains a preset pooling function whose role is to replace the result of a single point in the feature map with a statistic of its neighboring region. The pooling layer selects pooling regions in the same way a convolution kernel scans the feature map, controlled by pooling size, stride and padding. The convolution and pooling operations are shown in Fig. 8.
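A minimal single-channel sketch of these two operations follows; valid padding, stride 1 and max pooling are assumptions chosen for brevity (the patent leaves the pooling statistic, stride and padding configurable).

```python
import numpy as np

def conv2d(x, kernel, bias=0.0):
    """Sweep the kernel over x, multiplying element-wise inside the receptive
    field, summing, and adding the bias."""
    kh, kw = kernel.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel) + bias
    return out

def max_pool(x, size=2):
    """Replace each size x size region with its maximum (down-sampling)."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))
```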
The meta-classifier Q is a multilayer fully connected neural network that uses Dropout (DP) and the L2 regularization method to reduce overfitting and replaces its fully connected layers with 1×1 convolutional layers for model stabilization and feature integration; that is, the meta-classifier Q is in essence a convolutional neural network stacked from multiple convolutional layers whose kernel size is 1×1.
The first-level classifiers R and C and the second-level classifier of the present invention all need fully connected layers to synthesize the features extracted earlier, after their convolution or GRU operations respectively. In the present invention, 1×1 convolution kernels replace the fully connected layers, so as to realize cross-channel interaction and information integration and to reduce or raise the number of convolution channels. In each convolutional layer the data exist in three dimensions, which can be viewed as a stack of many two-dimensional maps, each of which is called a feature map. For image data at the input layer, a grayscale picture has exactly one feature map, while a color image generally has three (RGB). Convolving a single-channel feature map with a single kernel amounts to multiplying by one parameter; multi-kernel convolution over multiple channels requires linear combinations of multiple feature maps. In the present invention the storage form of each sample is identical to a grayscale picture, i.e. each sample has one feature map.
From the standpoint of numerical operations, convolution and full connection are both dot-product operations; the difference is that convolution acts on a local region while full connection acts on the entire input. A 1×1 convolution kernel expands the region the convolution acts on to the entire input, effectively replacing the fully connected layer while providing better generalization performance and improving the comprehensive prediction performance.
As shown in Fig. 9, the first layer of the network has 5 neurons, a1 to a5, which become 3 neurons, b1 to b3, after full connection; that is, the 5 neurons of the first layer are fully connected to the 3 that follow (the figure only draws the full connection of a1 to a5 with b1). In the fully connected layer, b1 is in fact the weighted sum of the 5 preceding neurons, with corresponding weights W1 to W5. When a 1×1 convolution kernel replaces the full connection, the 5 neurons of the first layer in fact correspond to the number of channels of the input features (5), the 3 neurons of the second layer correspond to the number of new feature channels after the 1×1 convolution (3), and W1 to W5 can be regarded as the weight coefficients of the convolution kernel; from these values a 1×1 convolution kernel replacing the fully connected operation can be constructed. A sketch of this equivalence follows the list below.
The 1×1 convolution kernel is also known from Network in Network; relative to full connection, it has the following characteristics:
(1) it deepens the network without increasing the receptive field, introducing more nonlinear neurons;
(2) it realizes cross-channel interaction and information integration, so that in the present invention more features can be integrated;
(3) it raises or reduces the number of convolution-kernel channels.
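The equivalence of Fig. 9 can be checked numerically; in this minimal sketch the random feature maps and shapes are illustrative, and the einsum expresses that every spatial position receives the same weighted sums W1 to W5 across the 5 input channels.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(5, 8, 8))   # 5 input channels (a1..a5) on an 8x8 map
W = rng.normal(size=(3, 5))                 # 3 output channels (b1..b3), kernel size 1x1

# Each output channel b is a linear combination of the 5 input channels: a fully
# connected layer applied independently at every pixel across the channel axis.
out = np.einsum('oc,chw->ohw', W, feature_maps)
print(out.shape)                            # (3, 8, 8)
```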
Here, batch normalization (BN) reduces overfitting by making the activation of each neuron follow a Gaussian distribution, i.e. a neuron is usually moderately active, sometimes somewhat active and rarely very active. Without BN's standardization, the covariate shift of layer inputs keeps changing, which fails the requirement of a Gaussian distribution, because subsequent layers must keep adapting to the changing distribution pattern. The essence of BN is to use optimization to alter the variance and mean location so that the new distribution better suits the true distribution of the data, guaranteeing the nonlinear expressive power of the model. The BN algorithm is as follows:

μ_B = (1/m) Σ_{i=1..m} x_i
σ_B² = (1/m) Σ_{i=1..m} (x_i − μ_B)²
x̂_i = (x_i − μ_B) / sqrt(σ_B² + ε)
y_i = γ · x̂_i + β

where m is the number of samples in one batch; x_i represents an input sample; μ_B is the mean of the samples in this batch; σ_B² is the variance of the samples in this batch; x̂_i is the normalized sample data; γ is the scale factor; β is the shift factor; and y_i is the data finally obtained by the batch normalization operation.
The BN procedure thus divides into four steps:
(1) compute the mean of each training batch;
(2) compute the variance of each training batch;
(3) normalize the training data of the batch with the computed mean and variance, obtaining a standard normal (0-1) distribution;
(4) scale and shift: adjust the magnitude of x̂_i by multiplying by the scale factor γ, then add the shift factor β to obtain y_i. Because the normalized x̂_i is essentially confined to a standard normal distribution, the expressive power of the network would decline; to solve this problem, two new parameters γ and β are introduced, which the network learns by itself during training.
The present invention applies batch normalization before the activation function of all hidden layers of the first-level classifiers (excluding the classification layer).
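The four BN steps for one training batch can be sketched as follows; the epsilon inside the square root is the standard guard against division by zero (an assumption, since the patent text omits it), and γ and β would be learned by the network during training.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (m, features) batch. Returns gamma * x_hat + beta."""
    mu = x.mean(axis=0)                     # (1) batch mean
    var = x.var(axis=0)                     # (2) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # (3) normalize to roughly N(0, 1)
    return gamma * x_hat + beta             # (4) scale and shift
```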
The L2 regularization method adds the squared sum of the weight parameters to the original loss function; the loss function after L2 regularization is expressed as:

L(w) = E_in(w) + λ · wᵀw

where w are the classifier network model parameters, E_in(w) is the training sample error without the regularization term, and λ is the regularization parameter.
From the above formula, the gradient of L(w) is expressed as:

∇L(w) = ∇E_in(w) + 2λw
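In code the penalty and its gradient are one line each; `loss_fn` and `grad_fn` below stand in for the unregularized error E_in and its gradient and are assumptions made for illustration.

```python
import numpy as np

def l2_loss(w, loss_fn, lam):
    """L(w) = E_in(w) + lambda * w^T w."""
    return loss_fn(w) + lam * np.dot(w, w)

def l2_grad(w, grad_fn, lam):
    """Gradient of L(w): the penalty contributes 2 * lambda * w."""
    return grad_fn(w) + 2.0 * lam * w
```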
The Dropout (DP) method, which solves the overfitting of the meta-classifier Q, lets the activation value of a neuron stop working with a certain probability p (the present invention takes p = 50%) during the forward propagation of the neural network, so that in each training batch the model does not depend too heavily on particular local features of the batch data; to a certain extent this reduces the interactions between neurons and makes the model generalize more strongly. The Dropout illustration is shown in Fig. 10: the left side is a neural network without the Dropout operation and the right side a neural network after the Dropout operation; it can be seen that some neurons of the network on the right are temporarily deactivated when the data of some batch pass through.
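A minimal sketch of dropout with p = 0.5 follows; the inverted 1/(1 − p) rescaling during training is a common convention assumed here (it keeps expected activations unchanged, so nothing needs rescaling at prediction time).

```python
import numpy as np

def dropout(a, p=0.5, training=True, rng=None):
    """During training, zero each activation with probability p; at test time
    all neurons stay active."""
    if not training:
        return a
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) >= p      # keep with probability 1 - p
    return a * mask / (1.0 - p)          # inverted-dropout rescaling
```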
The invention is further explained below with an embodiment.
The collected original weather data undergo data preprocessing and are integrated into grid-type time-series data as shown in Fig. 2, then divided into a training set and a test set. The training set is fed in order into the above first-level classifiers R and C to train the models; following the feature-generation scheme of Fig. 5, the temporal features and high-dimensional features of the original weather data are extracted and used as input to train the second-level meta-classifier. In model construction, Batch Normalization, Dropout, L2 regularization, replacing fully connected operations with 1×1 convolution kernels and other methods that increase the generalization ability of the model are used, finally yielding an ensemble classifier that shares the advantages of C and R and has better generalization properties.
For classification prediction, the test set is input into R and C within the ensemble classifier, their features are extracted and input into the meta-classifier Q, and finally the predicted sandstorm intensity grade is obtained, with which the classification performance of the ensemble classifier is measured.
In actual prediction, the relevant weather attributes are collected and integrated into the corresponding sample form, input into the first-level classifiers to extract features, and the features are then input into the meta-classifier to obtain the sandstorm forecast grade of the future moment.
Claims (10)
1. a kind of classification of sandstorm intensity prediction technique based on Stacking Integrated Strategy characterized by comprising
Using Recognition with Recurrent Neural Network R and convolutional neural networks C as first-level class device, original weather sample data is inputted respectively and is followed
Ring neural network R and convolutional neural networks C obtains corresponding one level learning feature;
Using Stacking Integrated Strategy, a meta classifier Q is introduced as secondary classifier, by the one level learning feature group
It is incorporated as the input of secondary classifier;
Using the output of secondary classifier as the classification of sandstorm intensity amount finally predicted.
2. The sandstorm intensity classification prediction method based on a Stacking ensemble strategy according to claim 1, characterized in that the original weather sample data is obtained as follows:
the "China surface climate daily value data set" and the "China strong sandstorm sequence and its supporting data set" are merged by date into one whole data set;
the whole data set undergoes data preprocessing such as data cleaning and attribute screening;
the preprocessed data are arranged in time order, with attributes unfolded from left to right and time running from top to bottom, and each data item is given a sandstorm intensity grade label, finally yielding the original weather sample data.
3. The sandstorm intensity classification prediction method based on a Stacking ensemble strategy according to claim 1, characterized in that the recurrent neural network R is used to extract the temporal features of the original weather sample data, and the convolutional neural network C is used to extract the high-dimensional features of the original weather sample data.
4. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 1, characterized in that the m raw weather samples are randomly split into k sample subsets; each subset S_i, i ≤ k, is enumerated in turn, and the remaining subsets serve as the training set on which the two first-level classifiers are trained separately, the trained first-level base models being denoted C_i and R_i; each base model then performs sandstorm intensity grade prediction on its corresponding held-out subset S_i, and the predicted value of each base model for the i-th sample subset becomes one feature value of the i-th samples in a new sample set; all feature values are assembled into a new feature sample set, which is finally used as the training set for the second-level classifier. For the prediction process, all first-level classifiers first predict to form a feature sample set, and the feature sample set is then predicted again, thereby obtaining a better prediction result.
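The k-fold feature construction of claim 4 can be sketched with scikit-learn's KFold as below; the `make_R`/`make_C` factories and the fit/predict interface are assumptions standing in for the trained networks.

```python
import numpy as np
from sklearn.model_selection import KFold

def stacking_features(X, y, make_R, make_C, k=5):
    """Build the meta-classifier training set as in claim 4.

    X, y: the m original weather samples and their grade labels.
    make_R, make_C: hypothetical factories returning fresh, untrained
    first-level classifiers with fit/predict methods.
    For each held-out fold S_i, base models trained on the remaining
    folds predict S_i; those predictions become new feature columns.
    """
    meta_X = np.zeros((len(X), 2))  # one feature column per base model
    for train_idx, hold_idx in KFold(n_splits=k, shuffle=True).split(X):
        for col, make in enumerate((make_R, make_C)):
            model = make()
            model.fit(X[train_idx], y[train_idx])
            meta_X[hold_idx, col] = model.predict(X[hold_idx])
    return meta_X, y  # train the secondary classifier on (meta_X, y)
```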
5. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 4, characterized in that the feedforward neural networks of the first-level classifiers and the second-level classifier propagate information by the following formula:

$$a^{(l)} = f_l\left(W^{(l)} a^{(l-1)} + b^{(l)}\right)$$

where $a^{(l)}$ denotes the output of the layer-l neurons, $f_l$ the activation function of the layer-l neurons, $W^{(l)}$ the weight matrix from layer l-1 to layer l, and $b^{(l)}$ the bias from layer l-1 to layer l;
the classification layers of the first-level classifiers and the second-level classifier use Softmax as the output function, given by:

$$\mathrm{Softmax}(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$$

where j = 1, ..., K, K is the number of classification categories, and $z_j$ is the j-th component of the vector produced by the layer preceding the classification layer, i.e. the vector data fed into the Softmax function;
the probability that a sample vector x belongs to the j-th category is:

$$P(y = j \mid x) = \frac{e^{x^{T} w_j}}{\sum_{k=1}^{K} e^{x^{T} w_k}}$$

where $x^{T}$ denotes the transpose of x, w denotes the weights, and k is the summation index running from 1 to K; the recurrent-unit activation function of the recurrent neural network R is tanh, the classification layer of each classifier uses the Softmax activation function, and the remaining parts of each classifier default to the parametric rectified linear unit (Parametric Rectified Linear Unit, PReLU), whose formula is:

$$\mathrm{PReLU}(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases}$$

where x is the input data and α is an adjustable parameter obtained through neural network learning; if learning yields α = 0, PReLU degenerates into the rectified linear unit (Rectified Linear Unit, ReLU); if α is a small fixed value, PReLU degenerates into the leaky rectified linear unit (Leaky ReLU, LReLU);
the first-level classifiers and the second-level classifier use cross entropy as the cost function for training the model as a whole; the cross entropy is described as:

$$H(P, Q) = -\mathbb{E}_{x \sim P}[\log Q(x)] = -\sum_x P(x) \log Q(x)$$

where P and Q are the two given probability distributions, i.e. the probability distributions of the true labels and the predicted labels; because the sandstorm labels are discretely distributed, $-\mathbb{E}_{x \sim P}$ is equivalent to $-\sum_x P(x)$, with P(x) describing the true distribution of the samples and Q(x) representing the predicted distribution;
the samples in the neural networks of the first-level classifiers and the second-level meta-classifier are independently and identically distributed, so the cross entropy follows the maximum-likelihood principle, i.e.

$$\theta^{*} = \arg\max_{\theta} \sum_{i=1}^{n} \log p\left(y^{(i)} \mid x^{(i)}; \theta\right)$$

where $\hat{y}^{(i)}$ is the output on the i-th sample input $x^{(i)}$, i.e. the predicted label vector; n is the number of samples in each training batch, each batch of n samples being a subset of the m samples; $y^{(i)}$ is the sandstorm label vector of the i-th sample; $x^{(i)}$ is the input data of the i-th sample; θ denotes the distribution parameters in the maximum-likelihood estimate, i.e. the parameter values estimated from the sample input data according to the sample label distribution; $p(y^{(i)} \mid x^{(i)}; \theta)$ denotes the likelihood of a single sample, and accumulating these terms gives the overall maximum likelihood; σ denotes the standard deviation of the sample label distribution to be estimated;
accuracy, precision, recall, and F1 score are used as comprehensive performance measures for the first-level classifiers and the second-level classifier model, where the F1 score is the harmonic mean of precision and recall.
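A NumPy sketch of the activation and loss functions named in claim 5 (Softmax, PReLU, and cross entropy) is given below; the four-grade example values are invented for illustration.

```python
import numpy as np

def softmax(z):
    """Softmax over K class scores, as used by the classification layers."""
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

def prelu(x, alpha):
    """Parametric ReLU: identity for x > 0, slope alpha for x <= 0."""
    return np.where(x > 0, x, alpha * x)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """-sum_x P(x) log Q(x) for one-hot true labels and predicted probs."""
    return -np.sum(y_true * np.log(y_pred + eps))

# Example: 4 sandstorm grades, one sample with invented scores.
scores = np.array([1.0, 2.0, 0.5, -1.0])
probs = softmax(scores)
label = np.array([0.0, 1.0, 0.0, 0.0])   # true grade is the second class
loss = cross_entropy(label, probs)
```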
6. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 1, characterized in that the recurrent neural network R is a multi-layer deep RNN that uses gated recurrent units (Gated Recurrent Unit, GRU) to alleviate the long-term dependency problem of traditional RNNs; its fully connected layers are likewise replaced by 1*1 convolutional layers for model stabilization and feature integration; the activation function inside the GRU units is tanh, the other layers apart from the classification layer default to PReLU, and batch normalization (Batch-Normalization, BN) together with the L2 regularization method are used to reduce overfitting and increase generalization;
the convolutional neural network C is a multi-layer deep CNN that obtains local feature information through convolution kernels and performs down-sampling through pooling layers; down-sampling reduces feature dimensionality and compresses the amount of data and parameters, which reduces overfitting while improving the fault tolerance of the model; its fully connected layers are replaced by 1*1 convolutional layers for model stabilization and feature integration, and batch normalization (Batch-Normalization, BN) and the L2 regularization method are used to reduce overfitting and increase generalization;
the meta-classifier Q is a multi-layer fully connected network in which Dropout (DP) and the L2 regularization method reduce overfitting, and 1*1 convolutional layers replace the fully connected layers for model stabilization and feature integration; that is, the meta-classifier Q is in essence a convolutional neural network stacked from multiple convolutional layers whose kernel size is 1*1.
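One possible PyTorch realization of the networks in claim 6 is sketched below; all layer widths and sequence sizes are illustrative assumptions, and the convolutional network C (a standard Conv2d/pooling stack) is omitted for brevity.

```python
import torch
import torch.nn as nn

N_ATTR, N_STEPS, N_GRADES = 8, 24, 4   # illustrative sizes, not from the claim

class RNet(nn.Module):
    """Multi-layer GRU network R: temporal features, 1*1 conv instead of FC."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(N_ATTR, 32, num_layers=2, batch_first=True)
        self.mix = nn.Conv1d(32, 16, kernel_size=1)  # 1*1 conv replaces FC layer
        self.bn = nn.BatchNorm1d(16)                 # batch normalization
        self.act = nn.PReLU()
    def forward(self, x):                 # x: (batch, N_STEPS, N_ATTR)
        h, _ = self.gru(x)                # (batch, N_STEPS, 32)
        h = self.act(self.bn(self.mix(h.transpose(1, 2))))
        return h.mean(dim=2)              # pooled feature vector (batch, 16)

class QNet(nn.Module):
    """Meta-classifier Q: stacked 1*1 convolutions with Dropout."""
    def __init__(self, n_feat=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_feat, 32, kernel_size=1), nn.PReLU(), nn.Dropout(0.5),
            nn.Conv1d(32, N_GRADES, kernel_size=1),
        )
    def forward(self, f):                  # f: (batch, n_feat)
        return self.net(f.unsqueeze(-1)).squeeze(-1)  # grade scores

scores = QNet()(RNet()(torch.randn(5, N_STEPS, N_ATTR)))  # (5, N_GRADES)
```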
7. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 6, characterized in that the 1*1 convolutional layers replacing the fully connected layers realize cross-channel interaction and information integration and perform dimensionality reduction and raising of the convolution-kernel channel number; each sample is stored in the same form as a grayscale picture, i.e. each sample has one feature map.
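The cross-channel effect of a 1*1 convolution can be seen in a few lines of PyTorch; the tensor sizes are illustrative. A 1*1 kernel mixes information across channels at each spatial position without looking at spatial neighbours, which is exactly what makes it usable for channel dimensionality reduction and raising.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 7, 7)               # one sample stored like a grey image
reduce = nn.Conv2d(64, 16, kernel_size=1)  # 64 -> 16 channels (dim reduction)
expand = nn.Conv2d(16, 64, kernel_size=1)  # 16 -> 64 channels (dim raising)
y = expand(reduce(x))                       # spatial shape preserved: (1, 64, 7, 7)
```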
8. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 6, characterized in that batch normalization (Batch-Normalization, BN) reduces overfitting by making the activation of each neuron follow a Gaussian distribution, i.e. the neuron is usually moderately active, sometimes somewhat active, and rarely very active; the BN algorithm is described as follows:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i,\qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2,\qquad \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}},\qquad y_i = \gamma\,\hat{x}_i + \beta$$

where m is the number of samples in one batch, $x_i$ represents an input sample, $\mu_B$ the mean of the samples in the batch, $\sigma_B^2$ the variance of the samples in the batch, $\hat{x}_i$ the sample data after normalization, γ the scale factor, β the shift factor, and $y_i$ the data finally obtained by the batch-normalization (BN) operation.
The BN procedure is thus broadly divided into 4 steps:
(1) compute the mean of each training batch;
(2) compute the variance of each training batch;
(3) normalize the training data of the batch with the computed mean and variance, obtaining a zero-mean, unit-variance distribution;
(4) scale and shift: $\hat{x}_i$ is multiplied by γ to adjust its magnitude and then shifted by adding β, giving $y_i$. Because the normalized $\hat{x}_i$ is essentially confined to a standard normal distribution, the expressive power of the network would decline; to solve this problem, two new parameters γ and β are introduced, which the network learns by itself during training.
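A NumPy sketch of the four BN steps of claim 8 follows; the epsilon term is a standard numerical-stability addition assumed here, and the example sizes are invented.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """The four BN steps of claim 8 for one training batch.

    x: (m, features) activations for a batch of m samples.
    gamma, beta: learned scale and shift, one per feature.
    """
    mu = x.mean(axis=0)                    # (1) batch mean
    var = x.var(axis=0)                    # (2) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # (3) zero-mean, unit-variance
    return gamma * x_hat + beta            # (4) scale and shift

x = np.random.randn(32, 16) * 3 + 5        # batch of 32 samples, 16 features
y = batch_norm(x, gamma=np.ones(16), beta=np.zeros(16))
```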
9. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 6, characterized in that the L2 regularization method adds the sum of squares of the weight parameters to the original loss function; the loss function after L2 regularization is expressed as:

$$L(w) = E_{in}(w) + \lambda \sum_j w_j^2$$

where w is the classifier network model parameter vector, $E_{in}(w)$ is the training-sample error without the regularization term, and λ is the regularization parameter;
according to the above formula, the gradient of L(w) is expressed as:

$$\nabla L(w) = \nabla E_{in}(w) + 2\lambda w$$
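The regularized loss and its gradient from claim 9 reduce to two lines of NumPy; the `e_in`/`grad_e_in` values are assumed to be supplied by the underlying network.

```python
import numpy as np

def l2_loss_and_grad(w, e_in, grad_e_in, lam):
    """Loss and gradient after L2 regularization (claim 9).

    e_in / grad_e_in: training error and its gradient at w, assumed
    to come from the underlying classifier network.
    """
    loss = e_in + lam * np.sum(w ** 2)     # L(w) = E_in(w) + lambda * sum(w^2)
    grad = grad_e_in + 2 * lam * w         # grad L(w) = grad E_in(w) + 2*lambda*w
    return loss, grad
```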
10. The sandstorm intensity grade prediction method based on the Stacking integration strategy according to claim 6, characterized in that the Dropout (DP) method stops the activation value of each neuron with a certain probability p during forward propagation of the neural network, so that in each training batch the model does not rely too heavily on particular local features of the batch data.
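A NumPy sketch of the forward-pass behaviour described in claim 10 follows; the inverted-dropout scaling by 1/(1-p) is an implementation choice assumed here, not stated in the claim.

```python
import numpy as np

def dropout_forward(a, p, training=True):
    """Inverted dropout: each activation is zeroed with probability p.

    Scaling the surviving activations by 1/(1-p) keeps their expected
    value unchanged between training and inference (an assumption of
    this sketch, not part of the claim).
    """
    if not training:
        return a
    mask = (np.random.rand(*a.shape) >= p) / (1.0 - p)
    return a * mask

activations = np.random.randn(4, 8)
dropped = dropout_forward(activations, p=0.5)
```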
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910598794.9A CN110348624B (en) | 2019-07-04 | 2019-07-04 | Sand storm grade prediction method based on Stacking integration strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110348624A true CN110348624A (en) | 2019-10-18 |
CN110348624B CN110348624B (en) | 2020-12-29 |
Family
ID=68178292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910598794.9A Expired - Fee Related CN110348624B (en) | 2019-07-04 | 2019-07-04 | Sand storm grade prediction method based on Stacking integration strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348624B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160372118A1 (en) * | 2015-06-19 | 2016-12-22 | Google Inc. | Context-dependent modeling of phonemes |
CN107330362A (en) * | 2017-05-25 | 2017-11-07 | 北京大学 | A kind of video classification methods based on space-time notice |
CN108596398A (en) * | 2018-05-03 | 2018-09-28 | 哈尔滨工业大学 | Time Series Forecasting Methods and device based on condition random field Yu Stacking algorithms |
CN109031421A (en) * | 2018-06-05 | 2018-12-18 | 广州海洋地质调查局 | A kind of stack velocity spectrum pick-up method and processing terminal based on deeply study |
Non-Patent Citations (1)
Title |
---|
Huang Jie et al.: "PM2.5 hourly concentration prediction based on an RNN-CNN ensemble deep learning model", Journal of Zhejiang University (Science Edition) * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111062410A (en) * | 2019-11-05 | 2020-04-24 | 复旦大学 | Star information bridge weather prediction method based on deep learning |
CN111062410B (en) * | 2019-11-05 | 2023-05-30 | 复旦大学 | Star information bridge weather prediction method based on deep learning |
CN111008604A (en) * | 2019-12-09 | 2020-04-14 | 上海眼控科技股份有限公司 | Prediction image acquisition method and device, computer equipment and storage medium |
CN111178304A (en) * | 2019-12-31 | 2020-05-19 | 江苏省测绘研究所 | High-resolution remote sensing image pixel level interpretation method based on full convolution neural network |
CN111291832A (en) * | 2020-03-11 | 2020-06-16 | 重庆大学 | Sensor data classification method based on Stack integrated neural network |
CN112418267A (en) * | 2020-10-16 | 2021-02-26 | 江苏金智科技股份有限公司 | Motor fault diagnosis method based on multi-scale visual and deep learning |
CN112418267B (en) * | 2020-10-16 | 2023-10-24 | 江苏金智科技股份有限公司 | Motor fault diagnosis method based on multi-scale visual view and deep learning |
CN112508060B (en) * | 2020-11-18 | 2023-08-08 | 哈尔滨工业大学(深圳) | Landslide body state judging method and system based on graph convolution neural network |
CN112508060A (en) * | 2020-11-18 | 2021-03-16 | 哈尔滨工业大学(深圳) | Landslide mass state judgment method and system based on graph convolution neural network |
CN114692817A (en) * | 2020-12-31 | 2022-07-01 | 合肥君正科技有限公司 | Method for dynamically adjusting quantized feature clip value |
CN112801233A (en) * | 2021-04-07 | 2021-05-14 | 杭州海康威视数字技术股份有限公司 | Internet of things equipment honeypot system attack classification method, device and equipment |
CN113096814A (en) * | 2021-05-28 | 2021-07-09 | 哈尔滨理工大学 | Alzheimer disease classification prediction method based on multi-classifier fusion |
CN113820079A (en) * | 2021-07-28 | 2021-12-21 | 中铁工程装备集团有限公司 | Hydraulic cylinder leakage fault diagnosis method based on cyclostationary theory and Stacking model |
CN113820079B (en) * | 2021-07-28 | 2024-05-24 | 中铁工程装备集团有限公司 | Hydraulic cylinder leakage fault diagnosis method based on cyclostationary theory and Stacking model |
CN114220024B (en) * | 2021-12-22 | 2023-07-18 | 内蒙古自治区气象信息中心(内蒙古自治区农牧业经济信息中心)(内蒙古自治区气象档案馆) | Static satellite sand storm identification method based on deep learning |
CN114220024A (en) * | 2021-12-22 | 2022-03-22 | 内蒙古自治区气象信息中心(内蒙古自治区农牧业经济信息中心)(内蒙古自治区气象档案馆) | Static satellite sandstorm identification method based on deep learning |
CN117408167A (en) * | 2023-12-15 | 2024-01-16 | 四川省能源地质调查研究所 | Debris flow disaster vulnerability prediction method based on deep neural network |
Also Published As
Publication number | Publication date |
---|---|
CN110348624B (en) | 2020-12-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348624A (en) | A kind of classification of sandstorm intensity prediction technique based on Stacking Integrated Strategy | |
CN108491970B (en) | Atmospheric pollutant concentration prediction method based on RBF neural network | |
CN109508360B (en) | Geographical multivariate stream data space-time autocorrelation analysis method based on cellular automaton | |
He et al. | Mining transition rules of cellular automata for simulating urban expansion by using the deep learning techniques | |
CN113537600B (en) | Medium-long-term precipitation prediction modeling method for whole-process coupling machine learning | |
CN106886846A (en) | A kind of bank outlets' excess reserve Forecasting Methodology that Recognition with Recurrent Neural Network is remembered based on shot and long term | |
CN109086799A (en) | A kind of crop leaf disease recognition method based on improvement convolutional neural networks model AlexNet | |
CN103489005B (en) | A kind of Classification of High Resolution Satellite Images method based on multiple Classifiers Combination | |
CN110135267A (en) | A kind of subtle object detection method of large scene SAR image | |
CN111665575B (en) | Medium-and-long-term rainfall grading coupling forecasting method and system based on statistical power | |
CN108171209A (en) | A kind of face age estimation method that metric learning is carried out based on convolutional neural networks | |
CN110648014A (en) | Regional wind power prediction method and system based on space-time quantile regression | |
CN107423820A (en) | The knowledge mapping of binding entity stratigraphic classification represents learning method | |
CN104252625A (en) | Sample adaptive multi-feature weighted remote sensing image method | |
CN102855486A (en) | Generalized image target detection method | |
CN113344045B (en) | Method for improving SAR ship classification precision by combining HOG characteristics | |
CN113032613B (en) | Three-dimensional model retrieval method based on interactive attention convolution neural network | |
CN114611608A (en) | Sea surface height numerical value prediction deviation correction method based on deep learning model | |
CN115099497A (en) | CNN-LSTM-based real-time flood forecasting intelligent method | |
CN113536373A (en) | Desensitization meteorological data generation method | |
Uğuz et al. | A hybrid CNN-LSTM model for traffic accident frequency forecasting during the tourist season | |
Zhang et al. | Atmospheric Environment Data Generation Method Based on Stacked LSTM-GRU | |
Yang et al. | Automatically adjustable multi-scale feature extraction framework for hyperspectral image classification | |
CN116778205A (en) | Citrus disease grade identification method, equipment, storage medium and device | |
CN110348311A (en) | A kind of intersection identifying system and method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20201229 |