CN107992447A

CN107992447A - A feature selection and decomposition method applied to river water level prediction data

Info

Publication number: CN107992447A
Application number: CN201711330726.1A
Authority: CN
Inventors: 杨拥军; 管杰
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2017-12-13
Filing date: 2017-12-13
Publication date: 2018-05-04
Anticipated expiration: 2037-12-13
Also published as: CN107992447B

Abstract

The invention discloses a feature selection and decomposition method applied to river water level prediction data. To obtain the features best suited as model input, the invention introduces LASSO regression to perform feature selection on the original input set, integrates MODWT to decompose the selected features into components, and tests the performance of LASSO-MODWT with a model based on multiple linear regression. The tests show that the LASSO-MODWT feature selection and decomposition method helps improve the performance and interpretability of river water level prediction models.

Description

A feature selection and decomposition method applied to river water level prediction data

Technical field

The invention belongs to the technical field of water level prediction, and specifically relates to the design of a feature selection and decomposition method applied to river water level prediction data.

Background art

Water level prediction plays a particularly important role in flood control and disaster mitigation and in the utilization, allocation and management of water resources. A robust water level prediction model gives decision makers the future trend of the water level, so that potential hydrological disasters can be identified in time and early-warning measures deployed sooner. In water level prediction, because the factors influencing water level are multi-dimensional and complex, the candidate inputs of the model often exhibit nonlinear dynamic relationships and multiple correlations with one another. The number of inputs is also generally large; in particular, once lagged versions of each variable are introduced, the feature dimension and the computational complexity rise sharply, even though these variables actually contain a large amount of duplicated information and noise. To reduce the computational complexity of the model and improve its flexibility and interpretability, effective features with minimal redundancy must be selected from the original high-dimensional data set, so as to build a scalable model that is more concise and better reflects the true behaviour of the water level.

LASSO was first proposed by Robert Tibshirani in 1996; its full name is least absolute shrinkage and selection operator. The method is a shrinkage estimator: by constructing a penalty function it obtains a more parsimonious model, shrinking some coefficients and setting others exactly to zero. It therefore retains the advantages of subset shrinkage and is a biased estimator suited to data with multicollinearity. The basic idea of LASSO is to minimize the residual sum of squares (RSS) subject to the constraint that the sum of the absolute values of the regression coefficients is less than or equal to a constant, which produces some regression coefficients that are exactly 0 and yields an interpretable model with a compressed feature set.
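For reference, the constrained problem described above is commonly written as follows. The notation ($y_i$ for the observed target, $x_{ij}$ for feature $j$ of sample $i$, $\beta_j$ for the regression coefficients, $t$ for the constraint constant and $\lambda$ for the penalty weight) is assumed here for illustration and does not appear in the patent text.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\; \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t

% Equivalent penalized (Lagrangian) form, with \lambda \ge 0 playing the role of t:
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\; \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2} + \lambda \sum_{j=1}^{p} |\beta_j|
```

A smaller $t$ (equivalently a larger $\lambda$) forces more coefficients to exactly zero, which is what makes the estimator usable for feature selection.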

The discrete wavelet transform (DWT) is widely used in models that integrate wavelets; it yields detailed spectral information about the data, such as periodicity, local variation, randomness and abrupt changes. However, DWT has a decimation (downsampling) effect, which can introduce information loss at the model construction stage and thus bias the prediction. In addition, the DWT coefficients depend on the starting position of the transform, which introduces a degree of arbitrariness.

To address these drawbacks of DWT, researchers further proposed the maximal overlap discrete wavelet transform (MODWT) as a feature decomposition method. MODWT is a linear filtering operation that avoids the decimation effect: it yields multi-level wavelet coefficients with the same dimension as the observations. In addition, the result of the transform does not depend on its starting position, and it can be applied to data of different sample sizes. In short, MODWT can extract the components of the input signal in different frequency bands, providing richer information and revealing the underlying patterns of variation in the data.

Summary of the invention

The purpose of the invention is to reduce the computational complexity of existing water level prediction models while improving their flexibility and interpretability; to this end, a feature selection and decomposition method applied to river water level prediction data is proposed.

The technical solution of the invention is a feature selection and decomposition method applied to river water level prediction data, comprising the following steps:

S1. Collect the hydrological factors that influence the water level at the target prediction station (the current water level at the target station, the water level in the upstream basin, rainfall along the river, and so on).

S2. Construct a feature set from the hydrological factors based on information theory.

S3. Introduce lags for each feature in the feature set based on correlation analysis, and build the original input set.

S4. Standardize the original input set.

S5. Perform feature selection on the standardized input set based on LASSO.

S6. Perform feature decomposition on the selected input set based on MODWT to obtain the LASSO-MODWT optimized input set.

The beneficial effects of the invention are: LASSO regression is used to perform feature selection on the original input set, and MODWT is integrated to decompose the selected features into components, which markedly improves the prediction performance for river water level and helps improve the performance and interpretability of river water level prediction models.

Further, step S2 is specifically: compute the maximal information coefficient (MIC) between each hydrological factor and the prediction target to assess the strength of their relationship, and take the hydrological factors whose MIC with the prediction target exceeds a set threshold as input features to construct the feature set.

The MIC is calculated as:

$$\mathrm{MIC}[X;Y] = \max_{|X||Y|<B} \frac{I[X;Y]}{\log_2\left(\min(|X|,|Y|)\right)} \qquad (1)$$

where X and Y are two random variables, $|X|$ and $|Y|$ are the numbers of bins into which X and Y are partitioned, B is the partition limit, taken as the 0.6 or 0.55 power of the total number of data points, MIC[X;Y] is the maximal information coefficient between X and Y, and I[X;Y] is the mutual information between X and Y, calculated as:

$$I[X;Y] = \sum_{X,Y} p(X,Y) \log_2 \frac{p(X,Y)}{p(X)\,p(Y)} \qquad (2)$$

where p(X) and p(Y) are the probability density functions of X and Y respectively, and p(X,Y) is the joint probability density function of X and Y.
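The MIC and mutual information definitions can be illustrated with a small sketch. It is a simplified approximation, not the patent's implementation: it assumes equal-frequency bins on each axis and simply tries every grid with $|X||Y| < B$, whereas a full MIC computation also optimizes the bin boundaries. The function names and the default exponent of 0.6 are choices made for this example.

```python
import bisect
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Map each value to one of k bins of roughly equal size; tied values share a bin."""
    ordered = sorted(values)
    n = len(values)
    return [bisect.bisect_left(ordered, v) * k // n for v in values]


def mutual_information(x_bins, y_bins):
    """Mutual information I[X;Y] in bits between two equally long lists of bin labels."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))
    px, py = Counter(x_bins), Counter(y_bins)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y)
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def approx_mic(x, y, exponent=0.6):
    """Approximate MIC: best normalized mutual information over grids with kx * ky < B."""
    b_limit = len(x) ** exponent  # B = n^0.6 (or n^0.55), as stated in the text
    best = 0.0
    for kx in range(2, int(b_limit) + 1):
        x_bins = equal_frequency_bins(x, kx)
        for ky in range(2, int(b_limit) + 1):
            if kx * ky >= b_limit:
                break
            mi = mutual_information(x_bins, equal_frequency_bins(y, ky))
            best = max(best, mi / math.log2(min(kx, ky)))
    return best
```

A factor would then be kept as an input feature when its score against the target water level exceeds the chosen threshold; the patent does not state the threshold value.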

The beneficial effect of this further scheme is: the MIC is used to analyze the strength of the relationship between each hydrological factor and the prediction target, and factors with a strong relationship to the prediction target are taken as input features to construct the feature set.

Further, step S3 is specifically: for the current water level of the target station in the feature set, the lags are determined with the partial autocorrelation function (PACF); for the other input features in the feature set, the lags are determined by cross-correlation analysis. Each lag that shows a clear statistical correlation with the prediction target, that is, reaches the 95% confidence interval, is added to the input set, thereby building the original input set.
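The cross-correlation screening can be sketched as below. This is a simplified illustration under stated assumptions: the correlation is a plain Pearson coefficient on the overlapping part of the two series, and the 95% level is approximated by the usual white-noise bound $1.96/\sqrt{N}$, which the patent does not spell out. The PACF step for the target station's own water level is not shown, and the function names are hypothetical.

```python
import math


def cross_correlation(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t] for a non-negative lag."""
    a = x[: len(y) - lag]
    b = y[lag:]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant segment carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def significant_lags(x, y, max_lag):
    """Lags 1..max_lag whose |correlation| exceeds the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(y))  # assumed white-noise confidence bound
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(cross_correlation(x, y, lag)) > bound
    ]
```

Each lag returned for a feature would become one lagged column of the original input set.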

The beneficial effect of this further scheme is: because the target river water level is a time series, the influence of lagged values should be considered when building the original input set.

Further, step S4 is specifically: the original input set is standardized with min-max scaling so that it is scaled into the interval [0,1], using the formula:

$$x_{i,\mathrm{norm}} = N_{\min} + \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times (N_{\max} - N_{\min}) \qquad (3)$$

where $x_{i,\mathrm{norm}}$ is the standardized value, $x_i$ is the i-th data item of the original input set to be standardized, $N_{\min}$ and $N_{\max}$ are the minimum and maximum of the target range, namely 0 and 1, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the original input set.
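Formula (3) translates directly into code. The sketch below is illustrative; the function name and the handling of a constant column (mapped to N_min to avoid division by zero) are assumptions not covered by the patent.

```python
def min_max_scale(values, n_min=0.0, n_max=1.0):
    """Scale a list of numbers into [n_min, n_max] following formula (3)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant feature is mapped to the lower bound
        return [n_min for _ in values]
    return [n_min + (x - x_min) / span * (n_max - n_min) for x in values]
```

Each feature column of the original input set would be scaled separately, so that water levels and rainfall end up on the same dimensionless scale.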

The beneficial effect of this further scheme is: because different input data have different dimensions (units), the original input set must be standardized so that it can be assessed on a common scale; standardization makes the data dimensionless and scales the original input set into the interval [0,1].

Further, step S5 specifically comprises the following sub-steps:

S51. Take the standardized input set as the model input and the water level data set of the prediction target station as the model output, and build a LASSO regression model.

S52. Train the LASSO regression model, and tune the LASSO regression parameter λ by grid search to find the optimal value.

S53. Use the LASSO regression model with the optimal parameter to score the features in the input set, the score being the LASSO regression coefficient; features with a positive LASSO regression coefficient are retained in the input set, and features whose coefficient is 0 or negative are removed, thereby performing feature selection on the input set.
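Sub-steps S51 to S53 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: LASSO is fitted by plain coordinate descent on the objective $(1/2n)\,\mathrm{RSS} + \lambda \sum_j |\beta_j|$, the grid search scores each candidate $\lambda$ by mean squared error on a held-out tail of the series (the patent does not say how candidates are scored), and all function names are hypothetical. In practice a library implementation such as scikit-learn's `Lasso` would normally be used.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n) * RSS + lam * sum|beta_j|; X is a list of rows."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centred features
    col_sq = [sum(r[j] ** 2 for r in Xc) / n for j in range(p)]
    beta = [0.0] * p
    resid = [v - y_mean for v in y]  # residual when all coefficients are zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes its own contribution
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_mean))
    return beta, intercept


def select_features(X, y, lambdas, valid_fraction=0.25):
    """Grid-search lambda on a held-out tail, then keep features with positive coefficients."""
    split = int(len(X) * (1 - valid_fraction))
    X_tr, y_tr, X_va, y_va = X[:split], y[:split], X[split:], y[split:]
    best = None
    for lam in lambdas:
        beta, b0 = lasso_fit(X_tr, y_tr, lam)
        mse = sum(
            (b0 + sum(b * v for b, v in zip(beta, row)) - t) ** 2
            for row, t in zip(X_va, y_va)
        ) / len(y_va)
        if best is None or mse < best[0]:
            best = (mse, lam, beta)
    _, lam, beta = best
    kept = [j for j, b in enumerate(beta) if b > 0]  # S53: drop zero or negative coefficients
    return lam, kept
```

The returned indices identify the columns of the standardized input set that survive feature selection.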

The beneficial effect of this further scheme is: after LASSO-based feature selection on the standardized input set, prediction accuracy can be improved while the number of model input parameters is greatly reduced.

Further, step S6 is specifically: the MODWT model is used to decompose the features of the input set after feature selection, and the set of wavelet coefficients obtained from all feature decompositions is used to build the optimized input set.

The feature decomposition formula is:

$$f(t) = \overline{W}(t) + \sum_{m=1}^{M} W_m(t) \qquad (4)$$

where f(t) denotes the wavelet coefficients obtained by the feature decomposition, $\overline{W}(t)$ is the smooth approximation sub-wave of the original signal after an M-level decomposition, $W_m(t)$ is the sub-wave of the original signal at decomposition level m, m = 1, 2, ..., M, and M is the minimum number of decomposition levels, calculated as:

$$M = \mathrm{int}[\log_{10}(N)] \qquad (5)$$

where N is the length of the input set after feature selection and int[ ] is the round-up (ceiling) function.
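The additive structure of formula (4) can be checked numerically with a small sketch. To stay short it uses the Haar filter (db1) rather than the Daubechies bases of higher order considered in the patent, with circular boundary handling; both are simplifications chosen for this example, and the function name is hypothetical. For the Haar filter the MODWT wavelet and scaling coefficients themselves sum back to the signal, which is what the code demonstrates.

```python
def haar_modwt(signal, levels):
    """Haar MODWT by the pyramid algorithm with circular boundaries.

    Returns (details, smooth): details[m - 1] holds the level-m wavelet
    coefficients W_m(t) and smooth holds the level-M scaling coefficients.
    Every output has the same length as the input (no decimation).
    """
    n = len(signal)
    smooth = list(signal)
    details = []
    for m in range(1, levels + 1):
        shift = 2 ** (m - 1)  # the filter is dilated by 2^(m-1) at level m
        prev = smooth
        smooth = [0.5 * (prev[t] + prev[(t - shift) % n]) for t in range(n)]
        details.append([0.5 * (prev[t] - prev[(t - shift) % n]) for t in range(n)])
    return details, smooth
```

Because nothing is downsampled, each level contributes one new column with as many rows as the original feature, which is why the decomposed input set keeps the same number of samples.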

The beneficial effect of this further scheme is: decomposing the features of the selected input set with the MODWT model significantly improves the accuracy of river water level prediction.

Further, the MODWT model uses the Daubechies wavelet basis.

The beneficial effect of this further scheme is: a suitable wavelet basis function must be chosen when building a MODWT-based model. There is currently no clear general criterion for choosing the basis function, nor does the literature indicate which basis function gives the best modelling results; in theory, different application scenarios suit different basis functions. Considering that irregular wavelet bases are suited to hydrological prediction, the invention uses the Daubechies wavelet basis, which is widely used in the field of hydrological prediction.

Brief description of the drawings

Fig. 1 is a flow chart of the feature selection and decomposition method applied to river water level prediction data provided by an embodiment of the invention.

Fig. 2 shows the result of applying MODWT to WL_CS with the db3 Daubechies wavelet basis, as provided by an embodiment of the invention.

Fig. 3 compares the predicted and actual values of the different input sets for the three-hour prediction, as provided by an embodiment of the invention.

Fig. 4 is a scatter plot of the predicted and actual values of the different input sets for the three-hour prediction, as provided by an embodiment of the invention.

Embodiment

Illustrative embodiments of the invention are now described in detail with reference to the drawings. It should be understood that the embodiments shown in the drawings and described here are merely exemplary; they are intended to explain the principle and spirit of the invention, not to limit its scope.

An embodiment of the invention provides a feature selection and decomposition method applied to river water level prediction data which, as shown in Fig. 1, comprises the following steps S1-S6:

S1. Collect the hydrological factors that influence the water level at the target prediction station (including the current water level at the target station, the water level in the upstream basin, rainfall along the river, and so on).

In this embodiment, the water level trend in the middle and lower reaches of the Chishui River is taken as an example, with the aim of predicting the water level at Chishui station 3 hours and 6 hours ahead. The data were collected by automatic monitoring stations along the banks of the middle and lower reaches of the Chishui River from May to October in 2015 and 2016; the stations involved are listed in Table 1. Because the data are collected and stored hourly, there are 8834 data points in total. Gaps are unavoidable during data acquisition and storage; analysis shows that the missing data are the WL_MT records from 2015-10-09 02:00 to 2015-10-14 07:00, 126 items in total, which were filled by interpolation using pandas.
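The gap filling can be illustrated with a short sketch. The patent states only that pandas was used and does not give the interpolation method; linear interpolation is assumed here (it is the default of pandas' `Series.interpolate`) and is written in plain Python so that the arithmetic is visible. The function name is hypothetical.

```python
def fill_gaps_linear(values):
    """Fill runs of None that lie between two known values by linear interpolation.

    Missing values before the first or after the last known value are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for k in range(1, gap):
            step = (filled[right] - filled[left]) * k / gap
            filled[left + k] = filled[left] + step
    return filled
```

For an hourly series such as WL_MT, every missing hour between two valid readings is replaced by a point on the straight line joining them.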

Table 1

Code	Parameter	Monitoring station	Data type	Sampling interval
WL_CS	Chishui station water level	Chishui station	Water level	Hourly
WL_EL	Erlang station water level	Erlang station	Water level	Hourly
WL_MT	Maotai station water level	Maotai station	Water level	Hourly
RF_CS	Chishui station rainfall	Chishui station	Rainfall	Hourly
RF_XS	Xishui station rainfall	Xishui station	Rainfall	Hourly

S2. Construct a feature set from the hydrological factors based on information theory.

Compute the maximal information coefficient (MIC) between each hydrological factor and the prediction target to assess the strength of their relationship, and take the hydrological factors whose MIC with the prediction target exceeds a set threshold (that is, the factors with a strong relationship to the prediction target) as input features to construct the feature set.

The MIC is calculated as:

$$\mathrm{MIC}[X;Y] = \max_{|X||Y|<B} \frac{I[X;Y]}{\log_2\left(\min(|X|,|Y|)\right)} \qquad (1)$$

where X and Y are two random variables, $|X|$ and $|Y|$ are the numbers of bins into which X and Y are partitioned, B is the partition limit, which sets the upper bound on the partitioning of X and Y and is generally taken as the 0.6 or 0.55 power of the total number of data points, MIC[X;Y] is the maximal information coefficient between X and Y, and I[X;Y] is the mutual information between X and Y, calculated as:

$$I[X;Y] = \sum_{X,Y} p(X,Y) \log_2 \frac{p(X,Y)}{p(X)\,p(Y)} \qquad (2)$$

In this embodiment, the feature set has 5 features in total: (1) water level data from three hydrological monitoring stations, Chishui station, Maotai station and Erlang station (codes WL_CS, WL_MT, WL_EL); (2) rainfall data from two weather monitoring stations, Chishui station and Xishui station (codes RF_CS, RF_XS).

S3. Introduce lags for each feature in the feature set based on correlation analysis, and build the original input set.

Because the target river water level is a time series, the influence of lagged values should be considered when building the original input set. In this embodiment, for the current water level of the target station in the feature set, the lags are determined with the partial autocorrelation function (PACF); for the other input features in the feature set, the lags are determined by cross-correlation analysis. Each lag that shows a clear statistical correlation with the prediction target (reaching the 95% confidence interval) is added to the input set, thereby building the original input set. PACF and cross-correlation analysis are correlation analysis methods commonly used in the field and are not described further here.

In this embodiment, after lags are introduced for each feature through correlation analysis, the original input set has 221 features for the 3-hour prediction and 229 for the 6-hour prediction.

S4. Standardize the original input set.

Because different input data have different dimensions (units), the original input set must be standardized so that it can be assessed on a common scale and made dimensionless. In this embodiment, the original input set is standardized with min-max scaling (Min-Max Scaler) so that it is scaled into the interval [0,1], using the formula:

$$x_{i,\mathrm{norm}} = N_{\min} + \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times (N_{\max} - N_{\min}) \qquad (3)$$

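S5. Perform feature selection on the standardized input set based on LASSO.

To simplify the input set and select the features best suited as input, this embodiment performs LASSO regression based feature selection on the standardized input set. Because LASSO introduces an L1 regularization term as the penalty, it can shrink the regression coefficients of redundant features to 0, so LASSO regression based feature selection is a sparse feature selection method.

Step S5 specifically comprises the following sub-steps S51-S53:

S51. Take the standardized input set as the model input and the water level data set of the prediction target station as the model output, and build a LASSO regression model.

S52. Train the LASSO regression model, and tune the LASSO regression parameter λ by grid search to find the optimal value.

S53. Use the LASSO regression model with the optimal parameter to score the features in the input set, the score being the LASSO regression coefficient; features with a positive LASSO regression coefficient are retained in the input set, and features whose coefficient is 0 or negative are removed, thereby performing feature selection on the input set.

In this embodiment, after LASSO-based feature selection the 3-hour prediction has 49 features and the 6-hour prediction has 88. The number of input features is greatly reduced in both prediction scenarios, which reduces the complexity of model construction.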

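S6. Perform feature decomposition on the selected input set based on MODWT to obtain the LASSO-MODWT optimized input set.

The MODWT model is used to decompose the features of the input set after feature selection, and the set of wavelet coefficients obtained from all feature decompositions is used to build the optimized input set.

The feature decomposition formula is:

$$f(t) = \overline{W}(t) + \sum_{m=1}^{M} W_m(t) \qquad (4)$$

where $\overline{W}(t)$ is the smooth approximation sub-wave of the original signal after an M-level decomposition, $W_m(t)$ is the sub-wave at decomposition level m, and M is the minimum number of decomposition levels:

$$M = \mathrm{int}[\log_{10}(N)] \qquad (5)$$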

The effective input set in this embodiment has 8678 samples, so the minimum number of MODWT decomposition levels is M = log(8678) = 3.93, which rounds up to M = 4. Both M = 4 and M = 5 are tested in this embodiment.
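The arithmetic behind this choice can be reproduced in a couple of lines. The sketch assumes a base-10 logarithm, which is the base that yields the stated value of 3.93 for N = 8678, and the round-up convention given for int[ ]; the function name is hypothetical.

```python
import math


def min_decomposition_level(n_samples):
    """Minimum MODWT decomposition level M = ceil(log10(N)), following formula (5)."""
    return math.ceil(math.log10(n_samples))
```

With the 8678 effective samples of this embodiment the function returns 4, matching the lower of the two levels that are tested.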

Although MODWT has been shown to have many advantages as a multi-resolution feature identification tool, one challenge in building a MODWT-based model is choosing a suitable wavelet basis function. There is currently no clear general criterion for choosing the basis function, nor does the literature indicate which basis function gives the best modelling results; in theory, different application scenarios suit different basis functions. Considering that irregular wavelet bases are suited to hydrological prediction, this embodiment uses the Daubechies wavelet basis, which is widely used in the field of hydrological prediction. Wavelet bases of three forms, db2, db3 and db4, are compared in this embodiment to find the one best suited to Chishui River water level prediction.

Fig. 2 shows the result of applying MODWT to WL_CS with the db3 Daubechies wavelet basis; the six panels from top to bottom are the original signal, the smooth approximation (A4) and the four levels of MODWT decomposition coefficients (d1, d2, d3, d4). To reduce computational complexity, this embodiment decomposes only WL_CS, the feature scored as most important by LASSO, and adds the resulting wavelet coefficients to the input set as new features (the 4-level and 5-level decompositions yield 5 and 6 dimensions respectively). The 3-hour prediction then has 53 features and the 6-hour prediction has 92.

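Because there is no single general index for assessing the performance of hydrological prediction models, this embodiment evaluates prediction performance jointly with three statistical indices: the Nash efficiency coefficient $E_{NS}$ (Nash-Sutcliffe efficiency), the root-mean-square error RMSE and the mean absolute error MAE.

(1) Nash efficiency coefficient $E_{NS}$:

$$E_{NS} = 1 - \frac{\sum_{i=1}^{N} \left(SWL_{OBS,i} - SWL_{FOR,i}\right)^2}{\sum_{i=1}^{N} \left(SWL_{OBS,i} - \overline{SWL}_{OBS}\right)^2} \qquad (6)$$

(2) Root-mean-square error RMSE:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(SWL_{OBS,i} - SWL_{FOR,i}\right)^2} \qquad (7)$$

(3) Mean absolute error MAE:

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left|SWL_{OBS,i} - SWL_{FOR,i}\right| \qquad (8)$$

where $SWL_{OBS}$ is the measured water level, $SWL_{FOR}$ is the water level predicted by the model, N is the number of data points, and $\overline{SWL}_{OBS}$ is the mean of the measured water levels.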
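The three indices can be computed as in the sketch below. It uses the standard definitions of the Nash-Sutcliffe efficiency, RMSE and MAE; the function and argument names are chosen for this example only.

```python
import math


def nash_sutcliffe(observed, predicted):
    """E_NS = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def rmse(observed, predicted):
    """Root-mean-square error between measured and predicted water levels."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between measured and predicted water levels."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n
```

A higher $E_{NS}$ and lower RMSE and MAE indicate a better model, so the three input sets can be ranked by evaluating these functions on the same test period.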

In this embodiment, the original input set obtained from correlation analysis, the input set after LASSO-based feature selection and the input set optimized by LASSO-MODWT are each used as input to a multiple linear regression model to predict the water level at Chishui station 3 hours and 6 hours ahead, in order to assess the performance of the LASSO-MODWT feature selection and decomposition method. Table 2 compares the performance of the different input sets in predicting the water level at Chishui station 3 hours and 6 hours ahead. As Table 2 shows, for both the 3-hour and the 6-hour prediction, LASSO-based feature selection improves prediction accuracy while greatly reducing the number of model input parameters; integrating MODWT then improves prediction accuracy significantly and performs well for both the 3-hour and the 6-hour prediction.

Table 2

Fig. 3 compares the 3-hour water level predictions of the different input sets with the actual values at Chishui station during August 2016, and Fig. 4 is a scatter plot of the predicted and actual values for the three input sets. It can be seen that, after LASSO-MODWT feature selection and decomposition, the predicted values of LASSO-W-MLR are closer to the actual values than the predictions from the original input set, and the model performance is more stable. This shows that the LASSO-MODWT feature selection and decomposition method can markedly improve the accuracy and stability of the Chishui River water level prediction model.

To further study the influence of different wavelet basis types on Chishui River water level prediction performance, this embodiment simulates three wavelets (db2, db3, db4) and two decomposition levels (level 4, level 5). Table 3 gives the performance of the 3-hour and 6-hour predictions for the different wavelet bases and decomposition levels. As Table 3 shows, the db2 wavelet basis with a 5-level wavelet decomposition gives the best prediction performance in the Chishui River water level prediction model. This result further illustrates that different application scenarios suit different wavelet bases; in actual modelling, trials should be carried out according to the specific requirements to find the most suitable wavelet basis and decomposition level and thereby improve model accuracy.

Table 3

Those of ordinary skill in the art will understand that the embodiments described here are intended to help the reader understand the principle of the invention, and that the scope of protection of the invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed by the invention, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and these variations and combinations remain within the scope of protection of the invention.

Claims

1. A feature selection and decomposition method applied to river water level prediction data, characterized by comprising the following steps:

S1. collecting the hydrological factors that influence the water level at the target prediction station;

S2. constructing a feature set from the hydrological factors based on information theory;

S3. introducing lags for each feature in the feature set based on correlation analysis, and building the original input set;

S4. standardizing the original input set;

S5. performing feature selection on the standardized input set based on LASSO;

S6. performing feature decomposition on the selected input set based on MODWT to obtain the LASSO-MODWT optimized input set.

2. The feature selection and decomposition method according to claim 1, characterized in that the hydrological factors influencing the water level at the target prediction station in step S1 comprise the current water level at the target station, the water level in the upstream basin and rainfall along the river.

3. The feature selection and decomposition method according to claim 1, characterized in that step S2 is specifically: computing the maximal information coefficient (MIC) between each hydrological factor and the prediction target to assess the strength of their relationship, and taking the hydrological factors whose MIC with the prediction target exceeds a set threshold as input features to construct the feature set.

4. The feature selection and decomposition method according to claim 3, characterized in that the maximal information coefficient MIC is calculated as:

$$\mathrm{MIC}[X;Y] = \max_{|X||Y|<B} \frac{I[X;Y]}{\log_2\left(\min(|X|,|Y|)\right)} \qquad (1)$$

where X and Y are two random variables, $|X|$ and $|Y|$ are the numbers of bins into which X and Y are partitioned, B is the partition limit, taken as the 0.6 or 0.55 power of the total number of data points, MIC[X;Y] is the maximal information coefficient between X and Y, and I[X;Y] is the mutual information between X and Y, calculated as:

$$I[X;Y] = \sum_{X,Y} p(X,Y) \log_2 \frac{p(X,Y)}{p(X)\,p(Y)} \qquad (2)$$

where p(X) and p(Y) are the probability density functions of X and Y respectively, and p(X,Y) is the joint probability density function of X and Y.

5. The feature selection and decomposition method according to claim 3, characterized in that step S3 is specifically: for the current water level of the target station in the feature set, determining the lags with the partial autocorrelation function (PACF); for the other input features in the feature set, determining the lags by cross-correlation analysis; and adding to the input set each lag that shows a clear statistical correlation with the prediction target, that is, reaches the 95% confidence interval, thereby building the original input set.

6. The feature selection and decomposition method according to claim 1, characterized in that step S4 is specifically: standardizing the original input set with min-max scaling so that it is scaled into the interval [0,1], using the formula:

$$x_{i,\mathrm{norm}} = N_{\min} + \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \times (N_{\max} - N_{\min}) \qquad (3)$$

where $x_{i,\mathrm{norm}}$ is the standardized value, $x_i$ is the i-th data item of the original input set to be standardized, $N_{\min}$ and $N_{\max}$ are the minimum and maximum of the target range, namely 0 and 1, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the original input set.

7. The feature selection and decomposition method according to claim 1, characterized in that step S5 specifically comprises the following sub-steps:

S51. taking the standardized input set as the model input and the water level data set of the prediction target station as the model output, and building a LASSO regression model;

S52. training the LASSO regression model, and tuning the LASSO regression parameter λ by grid search to find the optimal value;

S53. using the LASSO regression model with the optimal parameter to score the features in the input set, the score being the LASSO regression coefficient; retaining in the input set the features with a positive LASSO regression coefficient, and removing the features whose coefficient is 0 or negative, thereby performing feature selection on the input set.

8. The feature selection and decomposition method according to claim 1, characterized in that step S6 is specifically: using the MODWT model to decompose the features of the input set after feature selection, the set of wavelet coefficients obtained from all feature decompositions being used to build the optimized input set.

9. The feature selection and decomposition method according to claim 8, characterized in that the feature decomposition formula is:

$$f(t) = \overline{W}(t) + \sum_{m=1}^{M} W_m(t) \qquad (4)$$

where f(t) denotes the wavelet coefficients obtained by the feature decomposition, $\overline{W}(t)$ is the smooth approximation sub-wave of the original signal after an M-level decomposition, $W_m(t)$ is the sub-wave of the original signal at decomposition level m, m = 1, 2, ..., M, and M is the minimum number of decomposition levels, calculated as:

$$M = \mathrm{int}[\log_{10}(N)] \qquad (5)$$

10. The feature selection and decomposition method according to claim 8, characterized in that the MODWT model uses the Daubechies wavelet basis.