CN107729943A - Missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine, and its application
- Publication number
- CN107729943A (application number CN201710992778.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
Abstract
The invention relates to a fuzzy clustering algorithm for missing data in which the missing values are estimated and optimized by an information-feedback extreme learning machine (FELM), and to its application. The basic steps are: 1) use mutual information to select the data attributes with the highest correlation and, based on these attributes, select the complete records in the incomplete data set as training samples for the FELM network; 2) initialize the input weights ω and biases b of the FELM network; 3) pre-fill each missing attribute with the nearest-neighbour method, then adjust the pre-filled value according to the error obtained by training the FELM network on the training samples until a reasonable value is found and filled in, giving the restored complete data set; 4) initialize the parameters of the fuzzy C-means (FCM) algorithm: the number of clusters c, the fuzzification coefficient m, the threshold ε and the membership partition matrix $U^{(0)}$; 5) obtain the final clustering result by iteratively optimizing the membership partition matrix U and the cluster centres V of the FCM algorithm. The method makes full use of the relationships between data samples and between attributes, and of the distribution information of the complete and incomplete samples, to obtain more reasonable attribute estimates, so that the clustering result for the incomplete data set is more accurate.
Description
Technical field
The invention relates to a fuzzy clustering algorithm for missing data with estimates optimized by an information-feedback extreme learning machine, and to its application. It belongs to the field of industrial information technology.
Background
Steel is an indispensable material for China's construction and modernization, and the steel industry is a foundation of national development. In the more than sixty years since the founding of the nation, China's strip-steel industry has grown steadily and rapidly, and an industrialized strip-steel technology system has been established. China is currently at an important stage of industrial development and demand for steel remains large, so the steel industry faces a very large market. How to innovate and upgrade existing strip production lines so that they produce high-quality, high-benefit, high-grade steel with reduced material use and low carbon emissions is a problem of practical significance. At this stage, informatization is a strategic measure that covers modernization as a whole. To innovate and transform further, the iron and steel industry must make full use of information technology, integrate the advanced techniques of the information industry into the steel rolling process, and achieve the coordinated development of industrialization and informatization. Cluster analysis of strip data, with the results used to strengthen production reform, is therefore very important.
In recent years cluster analysis has been adapted to many different kinds of data sets and has been widely applied and developed in many research fields. Based on the attributes of the strip data themselves, mathematical methods can determine the affinity between strip data samples under some similarity or dissimilarity measure; clustering on that relationship and using the results to adjust the production line is worthwhile. In practice, however, data collection is affected by many factors, such as failure of data acquisition equipment, failure of storage media, faults in transmission media, human error, or the limits of measuring instruments. The collected data sets are therefore often incomplete, and traditional clustering methods cannot be applied directly to incomplete data sets. Choosing an appropriate way to handle incomplete data is thus particularly important for the analysis of the final results and for future industrial planning.
Summary of the invention
To solve the above problems, the invention provides a fuzzy clustering algorithm for missing data with estimates optimized by an information-feedback extreme learning machine, applies it to the analysis of strip data, and uses the analysis results to strengthen production reform.
The invention is realized by the following technical solution: a fuzzy clustering algorithm for missing data with estimates optimized by an information-feedback extreme learning machine, characterized by the following steps:
1) Use mutual information to select the data attributes with the highest correlation and, based on these attributes, select the complete records in the incomplete data set as training samples for the FELM network:
$$ I(X;Y) = \iint \mu_{XY}(x,y)\,\log\frac{\mu_{XY}(x,y)}{\mu_X(x)\,\mu_Y(y)}\,dx\,dy \tag{1} $$
where $\mu_X(x)$ is the marginal probability density function of variable X; $\mu_Y(y)$ is the marginal probability density function of variable Y; and $\mu_{XY}(x,y)$ is the joint probability density function of the two variables;
2) Determine the FELM network parameters: initialize the input weights ω and biases b. The initial values of ω and b lie in the interval [-1, 1] and are drawn at random from that interval to initialize the network; the number of hidden-layer nodes of the extreme learning machine is also determined;
3) Pre-fill the missing attributes with the nearest-neighbour method, then use the error search method to adjust each pre-filled value according to the error obtained by training the FELM network on the training samples, until a reasonable value is found and filled in, giving the restored complete data set;
4) Initialize the parameters of the fuzzy C-means algorithm: the number of clusters c, the fuzzification coefficient m, the threshold ε and the membership partition matrix $U^{(0)}$;
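5) Cluster the restored complete data set with fuzzy C-means. At iteration t = l, compute the cluster-centre matrix $V^{(l)}$ from formula (2) and the membership partition matrix $U^{(l-1)}$, then update $U^{(l)}$ from formula (3) and $V^{(l)}$. For the given threshold ε, if the change between successive membership matrices, $\lVert U^{(l)} - U^{(l-1)} \rVert$, is within ε, the algorithm terminates; otherwise set l = l + 1 and continue to update the membership partition matrix and the cluster centres iteratively.
$$ v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}, \quad i = 1, 2, \dots, c \tag{2} $$
$$ u_{ik} = \left[ \sum_{t=1}^{c} \left( \frac{\lVert x_k - v_i \rVert_2^2}{\lVert x_k - v_t \rVert_2^2} \right)^{\frac{1}{m-1}} \right]^{-1}, \quad i = 1, \dots, c;\ k = 1, \dots, n \tag{3} $$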
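The attribute selection in step 1) can be illustrated with a small sketch. The patent text does not say how the probability densities are estimated, so the equal-width histogram estimate below, the bin count, and the function names `mutual_information` and `rank_attributes` are all assumptions made for illustration.

```python
import math
from collections import Counter

def mutual_information(xs, ys, bins=8):
    """Histogram estimate of I(X;Y) in nats for two equal-length numeric sequences."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant column: everything lands in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over the occupied cells only.
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def rank_attributes(columns, target, bins=8):
    """Return attribute names sorted by decreasing mutual information with `target`."""
    scores = {name: mutual_information(col, target, bins)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Attributes at the front of the ranking would be the ones used to pick complete records as training samples; how many attributes to keep is left open here, as it is in the text.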
In step 3), pre-filling the missing attributes with the nearest-neighbour method, adjusting the pre-filled values by the error search method according to the error obtained by training the FELM network on the training samples, and thereby obtaining the restored complete data set, proceeds as follows:
1) Pre-fill the missing attributes with the nearest-neighbour method: select the k records nearest to the incomplete data sample, take the mean of those k samples at the position of each missing attribute, and use that mean as the pre-filled value of the incomplete datum. In formula (4), which measures the distance between two samples $x_a$ and $x_b$, $x_{ia}$ and $x_{ib}$ are their i-th attributes, and $I_i$ satisfies the condition given in formula (5).
2) Compute the hidden-layer output matrix of the FELM network. The output matrix H of the hidden layer is computed with formulas (6)-(8):
$$ \sum_{i=1}^{L} \beta_i\, g(\omega_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \dots, N \tag{6} $$
where $g(\omega_i \cdot x_j + b_i)$ is the output of the i-th hidden node; $\omega_i \cdot x_j$ is the inner product of $\omega_i$ and $x_j$; $\omega_i$ is the input weight connecting the input layer to the hidden layer; $\beta_i$ is the output weight connecting the hidden layer to the output layer; and $b_i$ is the bias of the i-th hidden node.
$$ H\beta = T \tag{7} $$
where H is the output matrix of the hidden-layer nodes, β is the output weight, and T is the desired output.
$$ H = \begin{bmatrix} g(\omega_1 \cdot x_1 + b_1) & \cdots & g(\omega_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(\omega_1 \cdot x_N + b_1) & \cdots & g(\omega_L \cdot x_N + b_L) \end{bmatrix}_{N \times L} \tag{8} $$
3) Compute the output weights of the FELM network. The output weights are calculated from the output matrix H obtained above and the desired output values according to formula (9):
$$ \hat{\beta} = H^{\dagger} T \tag{9} $$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of H, and the norm of $\hat{\beta}$ is minimal and unique.
4) Obtain the error between the predicted output and the true value, and feed the error back. Let the predicted value output by the extreme learning machine be y and the true value be Y; the error $e_0$ is:
$$ e_0 = Y - y \tag{10} $$
5) Compare the error obtained with the error obtained on the training samples. If the iteration stopping requirement is met, fill the missing attribute; otherwise take the error, readjust the pre-filled value, and return to step 1).
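The hidden-layer output and output-weight computations above follow the standard extreme learning machine recipe, which can be sketched with NumPy. The sigmoid activation, the default of 20 hidden nodes, and the names `train_elm` and `predict_elm` are assumptions; the text fixes only the random initialization in [-1, 1] and the generalized-inverse solution.

```python
import numpy as np

def train_elm(X, T, n_hidden=20, seed=0):
    """Fit a single-hidden-layer ELM: random input weights, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights in [-1, 1]
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases in [-1, 1]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                             # Moore-Penrose solution of H beta = T
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Apply the trained network to new inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

The mean of `abs(T - predict_elm(X, W, b, beta))` over the training samples would then play the role of the mean training error that the feedback step compares against.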
The error search method proceeds as follows:
Suppose the initial estimate of the missing attribute obtained by the k-nearest-neighbour rule is $E_k$, and the mean error of the FELM network on the training samples is $\bar{e}$. If FELM learning on a datum containing the missing attribute predicts the output y, and the true value of that datum is Y, then the error $e_0 = Y - y$ is obtained and e is computed [formula not preserved in this text]. The fill value of the missing attribute is then adjusted:
1) If e < 0, readjust the fill value of the missing attribute to $E_{new} = E_k + \rho e$, that is, increase the value by a random proportion, and feed it in as input for another round of FELM learning; here ρ ∈ [0, 1] is drawn by a random function;
2) If [condition not preserved in this text], readjust the fill value of the missing attribute to $E_{new} = E_k - \rho e$ and feed it in as input for another round of FELM learning;
3) If [condition not preserved in this text], the value predicted by the FELM network is very close to the true value and is acceptable, so the value is used to fill the missing attribute of the incomplete data set.
Application of the missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine to the clustering of strip data comprises the following process:
1) Collect experimental data: collect strip data acquired over a certain period as data samples;
2) Extract the following attributes from the collected data samples: rolling force of the rolling stand, roll gap between the rolls, roll-gap difference between the rolls, entry temperature, exit temperature, mill current, rolling speed and SONY value;
3) Use the attribute values collected in step 2) as the training data set;
4) Normalize the data set. Because the attributes differ in order of magnitude, all values in the data set are first mapped to the interval [0, 1] to remove the differences in scale;
5) Select and optimize the training samples. Use mutual information to select the data attributes with the highest correlation and, based on these attributes, select the complete records in the incomplete data set as training samples for the FELM network;
6) Determine the FELM network parameters. Initialize the input weights ω and biases b. The initial values of ω and b lie in the interval [-1, 1] and are drawn at random from that interval to initialize the network; the number of hidden-layer nodes of the extreme learning machine is also determined;
7) Estimate the missing attributes. Pre-fill the missing attributes with the nearest-neighbour method, then use the error search method to adjust each pre-filled value according to the error obtained in training on the training samples, until a reasonable value is found and filled in;
8) Use the FCM algorithm to cluster the restored complete data set.
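The normalization in step 4) is ordinary min-max scaling per attribute. A minimal sketch follows; representing a missing value as `None` and mapping a constant column to 0.0 are assumptions of this example, not details given in the text.

```python
def min_max_normalize(rows):
    """Scale each column of a list-of-rows table to [0, 1]; None marks a missing value."""
    n_cols = len(rows[0])
    scaled = [list(r) for r in rows]
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(present), max(present)
        span = hi - lo
        for r in scaled:
            if r[j] is not None:
                # A constant column has no spread, so map it to 0.0.
                r[j] = (r[j] - lo) / span if span else 0.0
    return scaled
```

For example, `min_max_normalize([[1.0, 10.0], [2.0, None], [3.0, 30.0]])` leaves the missing entry in place and scales the observed values column by column.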
Beneficial effects of the invention: traditional strategies consider either only the relationships between data samples or only the relationships between attributes. The invention combines both (the relationships between samples and between attributes), uses the FELM network to optimize the estimates of the missing values, and then performs fuzzy cluster analysis on the completed data set. Mutual information is used to compute the correlation between sample attributes, which provides a theoretical basis for the selection of training samples. A nearest-neighbour method based on local distance selects several nearest neighbours of each incomplete datum and prepares, for each missing value, the pre-filled value used in the FELM iterations. Multiple errors (the differences between the true output and the desired output) are obtained on the training sample set and averaged. With this mean error as the adjustment criterion, the error search method repeatedly increases or decreases the estimate to optimize it. Repeating this process yields better estimates of the missing values, so that the incomplete data set is completed reasonably and efficiently.
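The nearest-neighbour pre-filling described above can be sketched as follows. The distance formula itself is not legible in this text, so the example reads "local distance" as a partial distance taken over the attributes observed in both samples; that reading, the `None` marker for missing values, and the function names are assumptions.

```python
import math

def partial_distance(a, b):
    """Root-mean-square difference over the attributes present in both samples."""
    shared = [(u - v) ** 2 for u, v in zip(a, b) if u is not None and v is not None]
    return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

def knn_prefill(sample, candidates, k=3):
    """Fill each None in `sample` with the mean of that attribute over the k nearest candidates."""
    filled = list(sample)
    for j, value in enumerate(sample):
        if value is not None:
            continue
        # Only neighbours that actually observe attribute j can contribute a value.
        usable = [c for c in candidates if c[j] is not None]
        nearest = sorted(usable, key=lambda c: partial_distance(sample, c))[:k]
        if nearest:
            filled[j] = sum(c[j] for c in nearest) / len(nearest)
    return filled
```

The value returned for each missing position corresponds to the initial estimate that the FELM feedback iterations then refine.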
Brief description of the drawings
Fig. 1 is the topology diagram of the feedback extreme learning machine.
Fig. 2 is the flow chart of the algorithm of the invention.
Fig. 3 shows the signal acquisition of the strip rolling data.
Fig. 4 shows the objective function versus the number of iterations on the strip rolling data set.
Detailed description
First, the theoretical basis of the invention:
1. Information-feedback extreme learning machine
The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks (SLFNs), proposed by Huang Guangbin in 2004. In an ELM, the input weights connecting the input layer to the hidden layer and the biases of the hidden layer are chosen at random, and the output weights connecting the hidden layer to the output layer are determined analytically by the generalized-inverse method. ELM abandons gradient descent and instead uses the idea of least squares to find the optimal network, with great success. However, the traditional ELM does not exploit the predicted output value within the network structure, and learning relies only on the input information. The traditional ELM is therefore improved by borrowing the idea of Kalman filtering, giving the feedback extreme learning machine (FELM), which better estimates and fills the missing attributes of an incomplete data set.
The core idea of the feedback ELM is to use the error between the predicted output and the actual output to adjust the fill of each missing attribute so that the fill value becomes more reasonable, which improves the validity of the clustering. Fig. 1 shows a feedback ELM model.
As shown in Fig. 1, the FELM network consists of an input layer, a hidden layer and an output layer. Each circle represents a node. Data processing and computation are performed by the nodes of the hidden layer and the output layer; the specific number of hidden-layer nodes is determined in the experiments.
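2. Fuzzy C-means (FCM) clustering algorithm
The fuzzy C-means clustering algorithm (Bezdek, 1981) partitions the feature points of the feature space $X = (x_1, x_2, \dots, x_n)$ into c classes (1 < c ≤ n) with cluster centres $V = \{v_1, v_2, \dots, v_c\}$, where $v_i \in R^s$ is the centre of the i-th class. The membership of an arbitrary data point $x_k \in R^s$ in the i-th class is $u_{ik}$, the degree to which $x_k$ belongs to class i. The $u_{ik}$ satisfy the following conditions:
$$ u_{ik} \in [0, 1], \quad i = 1, 2, \dots, c;\ k = 1, 2, \dots, n \tag{11} $$
$$ \sum_{i=1}^{c} u_{ik} = 1, \quad k = 1, 2, \dots, n \tag{12} $$
$$ 0 < \sum_{k=1}^{n} u_{ik} < n, \quad i = 1, 2, \dots, c \tag{13} $$
The objective function is defined as follows:
$$ J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m}\, \lVert x_k - v_i \rVert_2^2 \tag{14} $$
where $x_k = [x_{1k}, x_{2k}, \dots, x_{sk}]^T$ is the k-th data sample and $x_{jk}$ is the j-th attribute value of $x_k$; $v_i$ is the i-th cluster centre; m (m > 1) is the weighting exponent that controls how fuzzy the membership matrix is; and $\lVert \cdot \rVert_2$ denotes the Euclidean distance.
The update formulas for the cluster centres and the memberships are as follows:
$$ v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}, \qquad u_{ik} = \left[ \sum_{t=1}^{c} \left( \frac{\lVert x_k - v_i \rVert_2^2}{\lVert x_k - v_t \rVert_2^2} \right)^{\frac{1}{m-1}} \right]^{-1} $$
Under the constraint of formula (12), U and V are iterated alternately so that formula (14) reaches its minimum.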
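The alternating iteration of U and V can be sketched with NumPy. The stopping norm is not legible in this text, so the largest element-wise change in U is used here as an assumption; the random initialization, the iteration cap, and the name `fcm` are likewise illustrative.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=200, seed=0):
    """Fuzzy C-means on complete data X (n x s); returns U (c x n) and V (c x s)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                      # memberships of each sample sum to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)             # centre update
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # squared distances, c x n
        d2 = np.maximum(d2, 1e-12)          # guard against a sample sitting on a centre
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)       # membership update
        converged = np.abs(U_new - U).max() <= eps
        U = U_new
        if converged:                       # memberships barely changed: stop
            break
    return U, V
```

Taking `U.argmax(axis=0)` gives a hard label per sample when one is needed for reporting.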
Second, the implementation process of the invention:
1) Use mutual information to select the data attributes with the highest correlation and, based on these attributes, select the complete records in the incomplete data set as training samples for the FELM network:
$$ I(X;Y) = \iint \mu_{XY}(x,y)\,\log\frac{\mu_{XY}(x,y)}{\mu_X(x)\,\mu_Y(y)}\,dx\,dy \tag{1} $$
where $\mu_X(x)$ is the marginal probability density function of variable X; $\mu_Y(y)$ is the marginal probability density function of variable Y; and $\mu_{XY}(x,y)$ is the joint probability density function of the two variables.
2) Determine the FELM network parameters. Initialize the input weights ω and biases b. The initial values of ω and b lie in the interval [-1, 1] and are drawn at random from that interval to initialize the network; the number of hidden-layer nodes of the extreme learning machine is also determined;
3) Pre-fill the missing attributes with the nearest-neighbour method, then adjust each pre-filled value according to the error obtained by training the FELM network on the training samples, until a reasonable value is found and filled in, giving the restored complete data set;
4) Initialize the parameters of the fuzzy C-means algorithm: the number of clusters c, the fuzzification coefficient m, the threshold ε and the membership partition matrix $U^{(0)}$;
5) Cluster the restored complete data set with fuzzy C-means. At iteration t = l, compute $V^{(l)}$ from formula (2) and $U^{(l-1)}$, then update $U^{(l)}$ from formula (3) and $V^{(l)}$. If the change between successive membership matrices, $\lVert U^{(l)} - U^{(l-1)} \rVert$, is within ε, the algorithm terminates; otherwise set l = l + 1 and continue to update the membership partition matrix and the cluster centres iteratively.
Error search algorithm: suppose the initial estimate of the missing attribute obtained by the k-nearest-neighbour rule is $E_k$, and the mean error of the ELM on the training samples is $\bar{e}$. If ELM learning on a datum containing the missing attribute predicts the output y, and the true value of that datum is Y, then the error $e_0 = Y - y$ is obtained and e is computed [formula not preserved in this text]. The fill value of the missing attribute is then adjusted:
(1) If e < 0, readjust the fill value of the missing attribute to $E_{new} = E_k + \rho e$, that is, increase the value by a random proportion, and feed it in as input for another round of ELM learning; here ρ ∈ [0, 1] is drawn by a random function;
(2) If [condition not preserved in this text], readjust the fill value of the missing attribute to $E_{new} = E_k - \rho e$ and feed it in as input for another round of ELM learning;
(3) If [condition not preserved in this text], the value predicted by the ELM is very close to the true value and is acceptable, so the value is used to fill the missing attribute of the incomplete data set.
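The feedback loop can be illustrated with a simplified stand-in. The exact conditions that choose between $E_k + \rho e$ and $E_k - \rho e$ are not legible in this text, so the sketch below tries both directions and keeps whichever reduces the prediction error, and it accepts the fill once the error is no larger than the mean training error. That acceptance rule, the both-directions trial, and the name `refine_fill` are assumptions and should not be read as the patented rule.

```python
import random

def refine_fill(predict, sample, j, target, initial_fill, mean_error,
                max_iter=200, seed=0):
    """Adjust sample[j] until |target - predict(sample)| <= mean_error or max_iter is hit."""
    rng = random.Random(seed)
    x = list(sample)
    x[j] = initial_fill
    best_err = target - predict(x)
    for _ in range(max_iter):
        if abs(best_err) <= mean_error:
            break                          # error is at the training-level error: accept
        rho = rng.random()                 # random step factor in [0, 1)
        # Try moving the fill value in both directions and keep whichever helps.
        for candidate in (x[j] + rho * best_err, x[j] - rho * best_err):
            trial = x[:j] + [candidate] + x[j + 1:]
            err = target - predict(trial)
            if abs(err) < abs(best_err):
                x, best_err = trial, err
                break
    return x[j], best_err
```

Here `predict` stands for any trained model mapping an attribute vector to the known output, for example a closure over the weights of a trained ELM.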
Third, the missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine is applied to the analysis of strip data, and the analysis results are used to strengthen production reform. The specific steps are as follows:
1. Collect experimental data: the strip data were collected during a certain period of one day at a domestic steel mill, and the data set contains 983 data samples in total. The following attributes are extracted from the collected data samples: rolling force of the rolling stand, roll gap between the rolls, roll-gap difference between the rolls, entry temperature, exit temperature, mill current, rolling speed and SONY value. These attributes are related, to different degrees, to the predicted strip exit thickness. These attribute values are used as the input of the FELM network. Fig. 3 shows the acquired data signals (the vertical axis is the parameter value and the horizontal axis is the acquisition time).
2. Analysis of experimental results: the experimental data are processed artificially to produce rolling data sets with values missing at random, and a training sample set is then selected for each missing attribute. To demonstrate the effectiveness of the proposed fuzzy clustering algorithm for incomplete data sets, its results are compared with those of classical methods: mean imputation, zero filling, k-nearest-neighbour imputation and the MBP-FCM algorithm. The estimation deviations are compared across algorithms and across missing ratios using three indices: the mean absolute deviation (ABS), the mean deviation between true values and estimates (Bias), and the root mean square error (RMSE). The smaller these values, the more accurate the estimates. As Tables 1 and 2 show, the proposed algorithm estimates more accurately than the other four algorithms, and its estimates are closer to the original data. Across missing ratios, the filling deviation grows as the number of missing values increases. Fig. 4 shows, for four missing ratios, how the objective function of the FELM-FCM algorithm changes with the number of iterations on the strip data set. As Fig. 4 shows, the function value of the proposed algorithm fluctuates noticeably in the initial stage, and after several optimization iterations the algorithm converges to a stable state.
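The three indices are not defined by formula in this text. The sketch below uses their common definitions over the imputed positions; the sign convention for Bias (estimate minus true value) and the name `estimation_deviation` are assumptions.

```python
import math

def estimation_deviation(true_values, estimates):
    """Return (ABS, Bias, RMSE) between true values and their imputed estimates."""
    diffs = [est - true for true, est in zip(true_values, estimates)]
    n = len(diffs)
    abs_dev = sum(abs(d) for d in diffs) / n      # mean absolute deviation
    bias = sum(diffs) / n                         # mean signed deviation
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return abs_dev, bias, rmse
```

Only the positions that were artificially removed should be passed in, so that observed values do not dilute the deviations.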
Table 1. Estimation deviation on the strip data set with missing values under different algorithms
Table 2. Estimation deviation on the strip data set with missing values under different missing ratios
Claims (4)
1. A missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine, characterized in that the steps are as follows:
1) Use mutual information to select the data attributes with the highest correlation and, based on these attributes, select the complete records in the incomplete data set as training samples for the FELM network:
$$ I(X;Y) = \iint \mu_{XY}(x,y)\,\log\frac{\mu_{XY}(x,y)}{\mu_X(x)\,\mu_Y(y)}\,dx\,dy \tag{1} $$
where $\mu_X(x)$ is the marginal probability density function of variable X; $\mu_Y(y)$ is the marginal probability density function of variable Y; and $\mu_{XY}(x,y)$ is the joint probability density function of the two variables;
2) Determine the FELM network parameters: initialize the input weights ω and biases b. The initial values of ω and b lie in the interval [-1, 1] and are drawn at random from that interval to initialize the network; the number of hidden-layer nodes of the extreme learning machine is also determined;
3) Pre-fill the missing attributes with the nearest-neighbour method, then use the error search method to adjust each pre-filled value according to the error obtained by training the FELM network on the training samples, until a reasonable value is found and filled in, giving the restored complete data set;
4) Initialize the parameters of the fuzzy C-means algorithm: the number of clusters c, the fuzzification coefficient m, the threshold ε and the membership partition matrix $U^{(0)}$;
5) Cluster the restored complete data set with fuzzy C-means. At iteration t = l, compute the cluster-centre matrix $V^{(l)}$ from formula (2) and the membership partition matrix $U^{(l-1)}$, then update $U^{(l)}$ from formula (3) and $V^{(l)}$. For the given threshold ε, if the change between successive membership matrices, $\lVert U^{(l)} - U^{(l-1)} \rVert$, is within ε, the algorithm terminates; otherwise set l = l + 1 and continue to update the membership partition matrix and the cluster centres iteratively.
$$ v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}, \quad i = 1, 2, \dots, c \tag{2} $$
$$ u_{ik} = \left[ \sum_{t=1}^{c} \left( \frac{\lVert x_k - v_i \rVert_2^2}{\lVert x_k - v_t \rVert_2^2} \right)^{\frac{1}{m-1}} \right]^{-1}, \quad i = 1, \dots, c;\ k = 1, \dots, n \tag{3} $$
2. The missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine according to claim 1, characterized in that in step 3), pre-filling the missing attributes with the nearest-neighbour method, adjusting the pre-filled values by the error search method according to the error obtained by training the FELM network on the training samples, and thereby obtaining the restored complete data set, proceeds as follows:
1) Pre-fill the missing attributes with the nearest-neighbour method: select the k records nearest to the incomplete data sample, take the mean of those k samples at the position of each missing attribute, and use that mean as the pre-filled value of the incomplete datum. In formula (4), which measures the distance between two samples $x_a$ and $x_b$, $x_{ia}$ and $x_{ib}$ are their i-th attributes, and $I_i$ satisfies the condition given in formula (5).
2) Compute the hidden-layer output matrix of the FELM network. The output matrix H of the hidden layer is computed with formulas (6)-(8):
$$ \sum_{i=1}^{L} \beta_i\, g(\omega_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \dots, N \tag{6} $$
where $g(\omega_i \cdot x_j + b_i)$ is the output of the i-th hidden node; $\omega_i \cdot x_j$ is the inner product of $\omega_i$ and $x_j$; $\omega_i$ is the input weight connecting the input layer to the hidden layer; $\beta_i$ is the output weight connecting the hidden layer to the output layer; and $b_i$ is the bias of the i-th hidden node.
$$ H\beta = T \tag{7} $$
where H is the output matrix of the hidden-layer nodes, β is the output weight, and T is the desired output.
$$ H = \begin{bmatrix} g(\omega_1 \cdot x_1 + b_1) & \cdots & g(\omega_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(\omega_1 \cdot x_N + b_1) & \cdots & g(\omega_L \cdot x_N + b_L) \end{bmatrix}_{N \times L} \tag{8} $$
3) Compute the output weights of the FELM network. The output weights are calculated from the output matrix H obtained above and the desired output values according to formula (9):
$$ \hat{\beta} = H^{\dagger} T \tag{9} $$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of H, and the norm of $\hat{\beta}$ is minimal and unique.
4) Obtain the error between the predicted output and the true value, and feed the error back. Let the predicted value output by the extreme learning machine be y and the true value be Y; the error $e_0$ is:
$$ e_0 = Y - y \tag{10} $$
5) Compare the error obtained with the error obtained on the training samples. If the iteration stopping requirement is met, fill the missing attribute; otherwise take the error, readjust the pre-filled value, and return to step 1).
3. The missing-data fuzzy clustering algorithm with estimates optimized by an information-feedback extreme learning machine according to claim 2, characterized in that the error search method proceeds as follows:
Suppose the initial estimate of the missing attribute obtained by the k-nearest-neighbour rule is $E_k$, and the mean error of the FELM network on the training samples is $\bar{e}$. If FELM learning on a datum containing the missing attribute predicts the output y, and the true value of that datum is Y, then the error $e_0 = Y - y$ is obtained and e is computed [formula not preserved in this text]. The fill value of the missing attribute is then adjusted:
1) If e < 0, readjust the fill value of the missing attribute to $E_{new} = E_k + \rho e$, that is, increase the value by a random proportion, and feed it in as input for another round of FELM learning; here ρ ∈ [0, 1] is drawn by a random function;
2) If [condition not preserved in this text], readjust the fill value of the missing attribute to $E_{new} = E_k - \rho e$ and feed it in as input for another round of FELM learning;
3) If [condition not preserved in this text], the value predicted by the FELM network is very close to the true value and is acceptable, so the value is used to fill the missing attribute of the incomplete data set.
4. Application of the missing data fuzzy clustering algorithm with information feedback extreme learning machine optimized estimation to strip steel data clustering statistics, characterized by comprising the following steps:
1) Collect experimental data: gather the data recorded for the strip steel over a certain period as the data sample;
2) Extract the following attributes from the collected data sample: the rolling force of the mill stand, the roll gap between the rolls, the roll gap difference between the rolls, the inlet temperature, the outlet temperature, the mill current, the rolling speed, and the SONY value;
3) Use the attribute values collected in step 2) as the training data set;
4) Normalize the data set. Because the data attributes differ in order of magnitude, all values in the data set are first transformed to corresponding values in the interval [0,1] to eliminate the differences between the data;
5) Select and optimize the training samples. Use mutual information to select the data attributes with the highest correlation and, according to these attributes, take the complete records in the incomplete data set as training samples for the FELM network;
6) Determine the FELM network parameters. Initialize the input weights ω and the bias b with random numbers drawn from the interval [-1,1], and determine the number of hidden-layer nodes of the extreme learning machine;
7) Estimate the missing attributes. Pre-fill the missing attributes with the nearest-neighbor method and, based on the error obtained by training on the training samples, adjust the pre-filled values with the error descriptor index method until reasonable values are found and filled in;
8) Perform cluster analysis on the recovered complete data set using the FCM algorithm.
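Steps 4), 7) and 8) can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the claimed method: the function names `min_max_normalize`, `knn_prefill` and `fcm` are hypothetical, the pre-fill uses the mean of the k nearest complete records, the fuzzifier m = 2 and the synthetic two-cluster data are illustrative choices, and the mutual-information selection and FELM adjustment steps are omitted.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column to [0, 1], ignoring NaN (missing) entries."""
    lo = np.nanmin(X, axis=0)
    hi = np.nanmax(X, axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span

def knn_prefill(X, k=3):
    """Pre-fill NaN entries with the mean of the k nearest complete rows."""
    X = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]
    for i in np.flatnonzero(np.isnan(X).any(axis=1)):
        miss = np.isnan(X[i])
        # Distance is measured over the observed attributes only
        d = np.linalg.norm(complete[:, ~miss] - X[i, ~miss], axis=1)
        nearest = complete[np.argsort(d)[:k]]
        X[i, miss] = nearest[:, miss].mean(axis=0)
    return X

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)        # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Synthetic two-cluster data with one missing attribute, for illustration only
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
data[3, 1] = np.nan
filled = knn_prefill(min_max_normalize(data), k=3)
centers, U = fcm(filled, c=2)
labels = U.argmax(axis=1)
```

In the claimed procedure the pre-filled values would be refined by the FELM error feedback before clustering; here the pre-filled data is passed to FCM directly to keep the sketch short.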
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710992778.9A CN107729943B (en) | 2017-10-23 | 2017-10-23 | Missing data fuzzy clustering algorithm for optimizing estimated value of information feedback extreme learning machine and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107729943A (en) | 2018-02-23 |
CN107729943B (en) | 2021-11-30 |
Family
ID=61212371
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710992778.9A Active CN107729943B (en) | 2017-10-23 | 2017-10-23 | Missing data fuzzy clustering algorithm for optimizing estimated value of information feedback extreme learning machine and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107729943B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102750286A (en) * | 2011-04-21 | 2012-10-24 | 常州蓝城信息科技有限公司 | Novel decision tree classifier method for processing missing data |
CN103488884A (en) * | 2013-09-12 | 2014-01-01 | 北京航空航天大学 | Wavelet neural network based degradation data missing interpolation method |
CN104751229A (en) * | 2015-04-13 | 2015-07-01 | 辽宁大学 | Bearing fault diagnosis method capable of recovering missing data of back propagation neural network estimation values |
JP2015184853A (en) * | 2014-03-24 | 2015-10-22 | Kddi株式会社 | Missing data complementing device, missing data complementing method, and program |
CN106127262A (en) * | 2016-06-29 | 2016-11-16 | 海南大学 | The clustering method of one attribute missing data collection |
CN106156260A (en) * | 2015-04-28 | 2016-11-23 | 阿里巴巴集团控股有限公司 | The method and apparatus that a kind of shortage of data is repaired |
CN106407464A (en) * | 2016-10-12 | 2017-02-15 | 南京航空航天大学 | KNN-based improved missing data filling algorithm |
CN106971205A (en) * | 2017-04-06 | 2017-07-21 | 哈尔滨理工大学 | A kind of embedded dynamic feature selection method based on k nearest neighbor Mutual Information Estimation |
EP3214874A1 (en) * | 2016-03-01 | 2017-09-06 | Gigaset Communications GmbH | Sensoric system energy limiter with selective attention neural network |
CN107274016A (en) * | 2017-06-13 | 2017-10-20 | 辽宁大学 | The strip exit thickness Forecasting Methodology of the random symmetrical extreme learning machine of algorithm optimization that leapfrogs |
Non-Patent Citations (3)
Title |
---|
LI ZHANG 等: "A hybrid clustering algorithm based on missing attribute interval estimation for incomplete data", 《PATTERN ANALYSIS AND APPLICATIONS》 * |
TIANHAO LI 等: "Interval kernel Fuzzy C-Means clustering of incomplete data", 《NEUROCOMPUTING》 * |
杨毅 等: "一种基于极限学习机的缺失数据填充方法", 《计算机应用与软件》 * |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019218263A1 (en) * | 2018-05-16 | 2019-11-21 | 深圳大学 | Extreme learning machine-based extreme ts fuzzy inference method and system |
CN109102021A (en) * | 2018-08-10 | 2018-12-28 | 聚时科技(上海)有限公司 | The mutual polishing multicore k- mean cluster machine learning method of core under deletion condition |
CN109214429B (en) * | 2018-08-14 | 2021-07-27 | 聚时科技(上海)有限公司 | Local deletion multi-view clustering machine learning method based on matrix-guided regularization |
CN109214429A (en) * | 2018-08-14 | 2019-01-15 | 聚时科技(上海)有限公司 | Localized loss multiple view based on matrix guidance regularization clusters machine learning method |
CN109195110A (en) * | 2018-08-23 | 2019-01-11 | 南京邮电大学 | Indoor orientation method based on hierarchical clustering technology and online extreme learning machine |
CN109195110B (en) * | 2018-08-23 | 2020-12-15 | 南京邮电大学 | Indoor positioning method based on hierarchical clustering technology and online extreme learning machine |
CN109783481A (en) * | 2018-12-19 | 2019-05-21 | 新华三大数据技术有限公司 | Data processing method and device |
CN109948715A (en) * | 2019-03-22 | 2019-06-28 | 杭州电子科技大学 | A kind of water monitoring data missing values complementing method |
CN109948715B (en) * | 2019-03-22 | 2021-07-02 | 杭州电子科技大学 | Water quality monitoring data missing value filling method |
CN110110447A (en) * | 2019-05-09 | 2019-08-09 | 辽宁大学 | It is a kind of to mix the feedback limit learning machine steel strip thickness prediction technique that leapfrogs |
CN110110447B (en) * | 2019-05-09 | 2023-04-18 | 辽宁大学 | Method for predicting thickness of strip steel of mixed frog leaping feedback extreme learning machine |
CN110378744A (en) * | 2019-07-25 | 2019-10-25 | 中国民航大学 | Civil aviaton's frequent flight passenger value category method and system towards incomplete data system |
CN112101457B (en) * | 2020-09-15 | 2023-11-17 | 湖南科技大学 | PMSM demagnetizing fault diagnosis method based on torque signal fuzzy intelligent learning |
CN112101457A (en) * | 2020-09-15 | 2020-12-18 | 湖南科技大学 | PMSM demagnetization fault diagnosis method based on torque signal fuzzy intelligent learning |
WO2022105907A1 (en) * | 2020-11-23 | 2022-05-27 | 维沃移动通信有限公司 | Method for processing partial input missing of ai network, and device |
CN112687349A (en) * | 2020-12-25 | 2021-04-20 | 广东海洋大学 | Construction method of model for reducing octane number loss |
WO2024040845A1 (en) * | 2022-08-22 | 2024-02-29 | 江苏大学 | Combine-harvester big data reconstruction method, and apparatus |
CN117557921A (en) * | 2023-09-25 | 2024-02-13 | 中国海洋大学 | Chlorophyll remote sensing data reconstruction method based on numerical simulation and deep learning |
CN117435870A (en) * | 2023-12-21 | 2024-01-23 | 国网天津市电力公司营销服务中心 | Load data real-time filling method, system, equipment and medium |
CN117435870B (en) * | 2023-12-21 | 2024-03-29 | 国网天津市电力公司营销服务中心 | Load data real-time filling method, system, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN107729943B (en) | 2021-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107729943A (en) | The missing data fuzzy clustering algorithm of feedback of the information extreme learning machine optimization valuation and its application | |
Luo et al. | Efficient and high-quality recommendations via momentum-incorporated parallel stochastic gradient descent-based learning | |
CN109492822B (en) | Air pollutant concentration time-space domain correlation prediction method | |
Zhang et al. | Forecasting box office revenue of movies with BP neural network | |
CN109063911A (en) | A kind of Load aggregation body regrouping prediction method based on gating cycle unit networks | |
CN107563422A (en) | A kind of polarization SAR sorting technique based on semi-supervised convolutional neural networks | |
Han et al. | Information-utilization-method-assisted multimodal multiobjective optimization and application to credit card fraud detection | |
CN100580698C (en) | Sparseness data process modeling approach | |
CN110378799A (en) | Aluminium oxide comprehensive production index decision-making technique based on multiple dimensioned depth convolutional network | |
CN109002917A (en) | Total output of grain multidimensional time-series prediction technique based on LSTM neural network | |
CN113407864B (en) | Group recommendation method based on mixed attention network | |
Zhang et al. | Self-organizing feature map for cluster analysis in multi-disease diagnosis | |
CN113298191B (en) | User behavior identification method based on personalized semi-supervised online federal learning | |
CN106845012A (en) | A kind of blast furnace gas system model membership function based on multiple target Density Clustering determines method | |
CN107153837A (en) | Depth combination K means and PSO clustering method | |
CN107274016A (en) | The strip exit thickness Forecasting Methodology of the random symmetrical extreme learning machine of algorithm optimization that leapfrogs | |
CN114777192B (en) | Secondary network heat supply autonomous optimization regulation and control method based on data association and deep learning | |
Wang et al. | A new approach of obtaining reservoir operation rules: Artificial immune recognition system | |
CN110110447B (en) | Method for predicting thickness of strip steel of mixed frog leaping feedback extreme learning machine | |
Qi et al. | A modularized case adaptation method of case-based reasoning in parametric machinery design | |
CN108122173A (en) | A kind of conglomerate load forecasting method based on depth belief network | |
CN109408896A (en) | A kind of anerobic sowage processing gas production multi-element intelligent method for real-time monitoring | |
Zhang | Research on precision marketing based on consumer portrait from the perspective of machine learning | |
CN113657678A (en) | Power grid power data prediction method based on information freshness | |
Liu et al. | Wheel hub customization with an interactive artificial immune algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | | Effective date of registration: 20231226. Address after: 905, Building G, Huangjin Times Square, No. 9999 Jingshi Road, Lixia District, Jinan City, Shandong Province, 250000. Patentee after: Zhongchangxing (Shandong) Information Technology Co.,Ltd. Address before: 110136 58 Shenbei New Area Road South, Shenyang, Liaoning. Patentee before: LIAONING University |