CN106055525B - A kind of big data processing method based on stepwise regression analysis - Google Patents

Info

Publication number
CN106055525B
Authority
CN
China
Prior art keywords
parameter
independent variable
variable
regression analysis
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610479051.6A
Other languages
Chinese (zh)
Other versions
CN106055525A (en)
Inventor
魏亚玲
李东
张学梅
苗泽凯
程实
马青华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yinchuan College China of CUMT
Original Assignee
Yinchuan College China of CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yinchuan College China of CUMT
Priority to CN201610479051.6A
Publication of CN106055525A
Application granted
Publication of CN106055525B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The present invention relates to a big data processing method based on stepwise regression analysis, carried out in the following steps: first, data on plant operating parameters are collected and the collected operating parameters are numbered; then, some of the collected operating parameters are taken as dependent variables and the other operating parameters as independent variables, a linear relationship between the parameters is assumed, and equations are listed; next, the equations and the corresponding data are imported into Matlab one by one and stepwise regression analysis is performed to calculate the coefficients of the independent variables and the intercept of each equation; finally, the results are analyzed to obtain the optimal values of the corresponding operating parameters. By collecting a large amount of data, processing it with stepwise regression analysis and using Matlab in combination, the method can quantitatively determine the influence between the operating parameters of a plant DCS, quantitatively determine how changing the magnitude of one parameter affects the other parameters, and determine the optimal values of the plant DCS operating parameters.

Description

A kind of big data processing method based on stepwise regression analysis
Technical field:
The present invention relates to methods for processing big data in industrial production, and in particular to a big data processing method based on stepwise regression analysis.
Background art:
Regression analysis is a mathematical method for handling correlations among multiple variables. A correlation differs from a functional relationship: the latter reflects a strict dependence between variables, whereas the former shows a degree of fluctuation or randomness, so that for each value of the independent variable the dependent variable may take several corresponding values. When the independent variable is non-random and the dependent variable is random, the analysis of their relationship is called regression analysis; when both are random variables, it is called correlation analysis. In statistics, correlations can be studied with both regression analysis and correlation analysis. Although correlated variables carry some uncertainty, the statistical regularity between them can be explored by continued observation of the phenomenon; this kind of statistical regularity is called a regression relationship. In a multiple linear regression model, not every independent variable has a significant relationship with the dependent variable, and the effect of some independent variables is sometimes negligible. This raises the problem of how to pick, from a large number of possibly relevant independent variables, the subset that significantly affects the dependent variable. When the full set of candidate independent variables has many elements, an "optimal" subset algorithm may be impractical, but an automatic search method that builds up, step by step, the subset of X variables to include in the regression model may be effective. This method, developed to save computation compared with fitting all possible regressions when seeking a reasonably good subset of independent variables, is stepwise regression.
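The step-by-step search described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab routine the patent relies on, and it makes several assumptions: it only adds variables (forward selection), it admits a variable when the residual sum of squares drops by more than a fraction `min_gain` of the total sum of squares instead of using the significance tests that stepwise regression normally applies, and the function names and the toy data are hypothetical.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting.
    Assumes A is non-singular (no constant or collinear columns)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit(X, y, cols):
    """Least-squares fit of y on the chosen columns plus an intercept.
    Returns (intercept, coefficients, residual sum of squares)."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    # Normal equations: (R^T R) beta = R^T y
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(rtr, rty)
    rss = sum((yi - sum(bv * v for bv, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta[0], beta[1:], rss


def forward_stepwise(X, y, min_gain=1e-3):
    """Add one variable at a time while the fit keeps improving noticeably.
    Variables that are never selected keep a coefficient of zero."""
    n_vars = len(X[0])
    selected = []
    _, _, total_ss = fit(X, y, selected)      # intercept-only model
    best_rss = total_ss
    while len(selected) < n_vars:
        trials = [(fit(X, y, selected + [c])[2], c)
                  for c in range(n_vars) if c not in selected]
        rss, col = min(trials)
        if best_rss - rss <= min_gain * total_ss:
            break                              # no candidate helps enough
        selected.append(col)
        best_rss = rss
    intercept, coefs, _ = fit(X, y, selected)
    full = [0.0] * n_vars
    for c, a in zip(selected, coefs):
        full[c] = a
    return intercept, full


# Hypothetical toy data: y depends on columns 0 and 2 only (y = 2 + 3*x0 - 2*x2).
X = [[1, 5, 4], [2, 3, 1], [3, 8, 3], [4, 1, 0],
     [5, 9, 2], [6, 2, 5], [7, 7, 1], [8, 4, 3]]
y = [2 + 3 * r[0] - 2 * r[2] for r in X]
intercept, coefs = forward_stepwise(X, y)
print(intercept, coefs)
```

In this toy run the middle column is never selected, so its coefficient stays at zero, which mirrors the patent's reading that a zero coefficient means no influence on the dependent variable.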
In modern industrial production all stages are closely and organically linked, and its most important characteristics are automation and big data. In the chemical industry, modern automation technology links production processes, equipment, control and management into one organic whole and at the same time generates massive amounts of data. From the application point of view these data fall into three classes: adjustable parameters, parameters that must be controlled, and reference parameters. Adjustable parameters can be adjusted by operators, whether in automatic, semi-automatic or fully manual mode; they include valve opening, voltage, current, resistance, frequency and so on, and their sole purpose is to keep production safe and stable and product quality up to standard. Parameters that must be controlled are process parameters that have to be held at prespecified operating conditions: in chemical production, the various process parameters must run under prespecified conditions for production to proceed safely and efficiently. For example, the liquid level of tanks and vessels (including oil tanks, water tanks, boiler drums and so on) must be kept within a specified range, and the temperature, pressure and pH of a fermenter in a biochemical process must meet the process requirements. Reference parameters are all parameters other than the two classes above. Every piece of equipment and every stage on a process line is closely connected with the equipment and stages before and after it, and the whole process involves numerous controlled and manipulated variables. To guarantee product quality, raise output, save energy and keep operation stable, the connections and mutual influences between all processes, equipment and stages must be considered comprehensively so that the systems can be arranged to cooperate effectively, and this requires making good use of the reference parameters. Therefore, in order to (1) speed up production, reduce production costs and improve product yield and quality; (2) reduce labor intensity and improve working conditions; (3) guarantee production safety, prevent accidents from occurring or spreading, extend equipment service life and improve equipment utilization; and (4) automate the production process, which can fundamentally change the mode of labor, raise workers' cultural and technical level and create conditions for gradually eliminating the difference between manual and mental labor, there is an urgent need for a method of processing these big data that solves the above practical problems by collecting a large amount of data and applying stepwise regression analysis.
Summary of the invention:
The present invention provides a big data processing method based on stepwise regression analysis that can effectively analyze and process massive data, quantitatively determine the influence between the operating parameters of a plant DCS and the effect that changing the magnitude of one parameter has on the other parameters, and thereby determine the optimal values of the plant DCS operating parameters.
To solve the above technical problem, the present invention adopts the following technical solution:
A big data processing method based on stepwise regression analysis, carried out in the following steps:
S1: collect data on the plant operating parameters and number the collected operating parameters 1, 2, 3, ..., n;
S2: data processing: take each of the collected operating parameters in turn as the dependent variable and the remaining operating parameters as independent variables; assuming a linear relationship between the parameters, list the following equation:
where x is an independent variable, y is the dependent variable, a is the coefficient of an independent variable, and b is the intercept;
S3: import the above equations and the corresponding data into Matlab one by one and perform stepwise regression analysis to calculate the coefficients of the independent variables and the intercept of each equation;
S4: analysis of results:
(1) when the coefficient of an independent variable is zero, that independent variable has no influence on the corresponding dependent variable; the sign of a coefficient reflects the direction of the influence and its magnitude reflects the size of the influence, so the operating parameter with the greatest influence on the dependent variable, and its parameter number, can be found from the above results;
(2) the above system of equations, with its coefficients now determined, is rearranged as shown below:
The system of equations is then written in matrix form, as in the following formula:
The above formula is solved with Matlab; the parameter value ranges must be added as constraints in the calculation, and the resulting solution is the optimal value of the corresponding operating parameters.
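The equations referred to in S2 and S4 are not reproduced in this text. A plausible reconstruction, offered only as an assumption consistent with the symbols defined in S2 and with the fitted equations of Embodiment 1, is given here. The double index on $a_{ij}$, the matrix $A$, the vectors $\mathbf{x}$, $\mathbf{b}$ and the bounds $\mathbf{l}$, $\mathbf{u}$ are notation introduced for this sketch, not taken from the patent.

$$
y_i = b_i + \sum_{j \neq i} a_{ij}\, x_j, \qquad i = 1, 2, \dots, n
$$

Because the dependent variable $y_i$ is operating parameter $i$ itself, writing $y_i = x_i$ and moving the independent variables to the left-hand side gives one linear equation per parameter:

$$
x_i - \sum_{j \neq i} a_{ij}\, x_j = b_i, \qquad i = 1, 2, \dots, n
$$

In matrix form, with $A = (a_{ij})$ having a zero diagonal:

$$
(I - A)\,\mathbf{x} = \mathbf{b}, \qquad \mathbf{l} \le \mathbf{x} \le \mathbf{u}
$$

Here $\mathbf{l}$ and $\mathbf{u}$ stand for the parameter value ranges that the text says must be added as constraints.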
By collecting a large amount of data, processing it with stepwise regression analysis and using Matlab in combination, the method of the invention can quantitatively determine the influence between the operating parameters of a plant DCS, quantitatively determine how changing the magnitude of one parameter affects the other parameters, and determine the optimal values of the plant DCS operating parameters. It thereby addresses problems that arise in actual production, such as the difficulty of analyzing large volumes of data, high labor consumption, low production efficiency, poor production conditions and low equipment utilization; it speeds up production, reduces production costs, improves product yield and quality, reduces labor intensity and improves working conditions. In addition, wider application of the method can guarantee production safety, prevent accidents from occurring or spreading, extend equipment service life and improve equipment utilization, and it makes it possible to automate the production process, change the mode of labor and raise workers' cultural and technical level.
Specific embodiment:
The technical solution of the present invention is described in detail below.
Embodiment 1
Using the big data processing method based on stepwise regression analysis, data from the mill of a pulverized coal plant were first collected. The mill has 19 parameters in total, numbered as shown in Table 1 below:
No.  Parameter                                No.  Parameter                    No.  Parameter
1    Secondary air fan frequency              2    End flue temperature         3    Combustion chamber draft
4    Furnace tail temperature                 5    Furnace outlet temperature   6    Distribution plenum outlet temperature
7    Feeder frequency                         8    Actual flow                  9    Mill inlet temperature
10   Mill inlet pressure                      11   Mill inlet oxygen content    12   Mill inlet oxygen content
13   Mill outlet pressure                     14   Cyclone temperature          15   Bag filter inlet temperature
16   Bag filter outlet temperature            17   Fan frequency                18   No. 1 powder storage tower temperature
19   No. 2 powder storage tower temperature
Taking each of parameters 1-19 in Table 1 in turn as the dependent variable and the remaining parameters as independent variables gives the following equation:
where the subscripts of y and x denote the parameter numbers and b is the intercept.
The above equations and the corresponding data are imported into Matlab one by one and stepwise regression analysis is performed to find the coefficients of the independent variables; if an independent variable has no influence on the dependent variable, its coefficient is zero. The final results are as follows:
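$$
\begin{aligned}
y_{19} &= 9.01 + 0.7459x_{18} \\
y_{18} &= -110.699 - 0.1x_{2} - 0.007x_{4} - 0.013x_{6} + 0.373x_{8} + 0.527x_{15} - 0.02x_{16} + 4.91x_{17} + 0.25x_{19} \\
y_{17} &= 19.427 - 0.0013x_{3} - 0.0007x_{4} + 0.1567x_{7} + 0.0195x_{8} + 0.0021x_{9} + 0.0179x_{14} + 0.0132x_{15} + 0.0009x_{16} + 0.0203x_{18} \\
y_{16} &= 5.77 + 0.0356x_{6} + 0.8161x_{14} \\
y_{15} &= 0.481 + 0.0061x_{9} + 0.7972x_{12} + 0.0659x_{14} + 0.1041x_{18} \\
y_{14} &= 29.407 - 0.3469x_{7} + 0.0097x_{9} + 0.6419x_{12} + 0.1659x_{15} + 1.8141x_{17} \\
y_{13} &= -1185.61 - 34.212x_{7} - 25.249x_{8} \\
y_{12} &= 13.3586 + 0.0395x_{1} - 0.0047x_{3} + 0.0029x_{4} + 0.0046x_{6} - 0.28031x_{7} + 0.0052x_{9} + 0.2043x_{14} + 0.5627x_{15} \\
y_{11} &= 1.8995 - 0.0303x_{14} \\
y_{10} &= -35.0614 - 0.5407x_{3} - 0.0570x_{6} + 2.2395x_{7} + 0.5132x_{18} \\
y_{9} &= -522.62 + 0.1531x_{4} + 0.0437x_{6} - 6.7397x_{7} + 2.5759x_{8} + 2.0845x_{16} + 19.4179x_{17} + 1.8276x_{12} - 0.005x_{13} - 0.0655x_{16} \\
y_{8} &= -16.355 - 0.0063x_{4} - 0.0009x_{5} + 0.0145x_{6} + 0.5751x_{7} + 0.0177x_{9} - 0.0005x_{13} - 0.0636x_{15} + 1.1788x_{17} + 0.0819x_{18} \\
y_{7} &= -44.0176 + 0.0992x_{1} + 0.0082x_{4} + 0.0003x_{5} + 0.0023x_{6} + 0.1533x_{8} - 0.0133x_{9} - 0.0761x_{12} + 0.0526x_{14} + 2.4382x_{17} + 0.0246x_{18} \\
y_{6} &= -459.437 - 2.9828x_{1} + 0.0072x_{2} + 0.2244x_{3} - 0.1105x_{4} + 0.0278x_{5} + 7.155x_{7} + 13.47x_{8} + 0.26x_{9} - 0.302x_{10} + 6.302x_{12} + 2.771x_{15} - 3.235x_{18} \\
y_{5} &= -429.54 + 1.1742x_{6} + 41.7394x_{7} - 35.5218x_{8} \\
y_{4} &= 1244.5 - 5.695x_{1} + 5.5625x_{3} - 0.2278x_{6} + 46.7195x_{7} - 8.4274x_{8} + 1.7503x_{9} + 3.3865x_{12} - 56.1883x_{17} - 4.4394x_{18} \\
y_{3} &= 194.78 + 2.5697x_{1} + 0.1021x_{4} - 1.0547x_{10} - 2.7895x_{12} + 1.6788x_{15} - 12.0286x_{17} \\
y_{2} &= -0.596x_{4} + 1.0683x_{6} \\
y_{1} &= -1.779 + 0.0273x_{3} - 0.0099x_{4} - 0.0101x_{6} + 1.0875x_{7} + 0.1942x_{8} + 0.2186x_{12} - 0.0872x_{18}
\end{aligned}
$$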
Each of the above equations reflects how the parameter taken as the dependent variable is influenced by the independent-variable parameters; the direction and size of each influence are judged from the sign and magnitude of the coefficient of the corresponding independent variable.
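As an illustration of this reading, the sketch below ranks the independent variables of one fitted equation by the magnitude of their coefficients and reports the direction of each influence. The coefficients are those of the equation for y10 above; the function name `rank_influences` is hypothetical, and Python stands in for Matlab here. Ranking raw coefficients is the patent's criterion; it takes no account of the different units of the parameters.

```python
# Coefficients of the fitted equation for y10 in Embodiment 1
# (parameter number -> coefficient of that independent variable).
y10_coefficients = {3: -0.5407, 6: -0.0570, 7: 2.2395, 18: 0.5132}


def rank_influences(coefficients):
    """Return (parameter number, direction, magnitude) tuples,
    sorted from the largest to the smallest absolute coefficient.
    Zero coefficients are dropped, since they mean no influence."""
    ranked = sorted(coefficients.items(),
                    key=lambda item: abs(item[1]), reverse=True)
    return [(num, "positive" if a > 0 else "negative", abs(a))
            for num, a in ranked if a != 0]


ranking = rank_influences(y10_coefficients)
for num, direction, magnitude in ranking:
    print(f"x{num}: {direction} influence, |a| = {magnitude}")
```

By this criterion the parameter with the greatest influence on parameter 10 (mill inlet pressure) is parameter 7 (feeder frequency), with a positive direction.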
The above system of equations is converted into the following form:
The system of equations is then converted into a matrix:
The matrix is solved with Matlab; the parameter value ranges must be added as constraints in the calculation, and the resulting solution is the optimal value of the corresponding operating parameters.
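The patent does not say which Matlab routine performs this constrained solve. One way to picture the step, under the assumption that the system has the form M x = b with M = I - A and that the parameter ranges are simple lower and upper bounds, is a bounded least-squares problem. The sketch below solves it by projected gradient descent in plain Python; the function name `solve_bounded`, the two-parameter system and the bounds are all hypothetical and are not data from the patent.

```python
def solve_bounded(M, b, lower, upper, iterations=2000):
    """Minimise ||M x - b||^2 subject to lower <= x <= upper
    by projected gradient descent (gradient step, then clip to the box)."""
    n = len(b)
    # 1 / squared Frobenius norm never exceeds 1 / largest eigenvalue of M^T M,
    # so this fixed step size is safe.
    step = 1.0 / sum(v * v for row in M for v in row)
    x = [min(max(0.0, lo), hi) for lo, hi in zip(lower, upper)]  # start in the box
    for _ in range(iterations):
        residual = [sum(M[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        gradient = [sum(M[i][j] * residual[i] for i in range(n)) for j in range(n)]
        x = [min(max(xj - step * gj, lo), hi)
             for xj, gj, lo, hi in zip(x, gradient, lower, upper)]
    return x


# Hypothetical two-parameter system:
#   x1 = 1 + 0.5  * x2
#   x2 = 2 + 0.25 * x1
# rearranged as (I - A) x = b.
M = [[1.0, -0.5],
     [-0.25, 1.0]]
b = [1.0, 2.0]

free = solve_bounded(M, b, lower=[0.0, 0.0], upper=[5.0, 5.0])
capped = solve_bounded(M, b, lower=[0.0, 0.0], upper=[2.0, 5.0])
print(free, capped)
```

When the exact solution lies inside the allowed ranges the bounds have no effect; when a range cuts it off (the second call caps the first parameter at 2), the result is the closest fit to the equations that still respects the ranges.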
Analysis of results: as can be seen from Table 2 above, this method first collects the 19 operating parameters of the mill, finds the coefficients of the independent variables by stepwise regression analysis, and obtains the optimal values of the corresponding operating parameters by solving the matrix with Matlab. This addresses problems that arise in actual production, such as the difficulty of analyzing large volumes of data, high labor consumption, low production efficiency, poor production conditions and low equipment utilization.
Embodiment 2
Using the big data processing method based on stepwise regression analysis, operating data from the pulverized coal boiler of a heating company were collected. The boiler has 65 parameters in total, numbered as shown in Table 2 below:
Taking each of parameters 1-65 above in turn as the dependent variable and the remaining parameters as independent variables gives the following equation:
where the subscripts of y and x denote the parameter numbers and b is the intercept.
The above equations and the corresponding data are imported into Matlab one by one and stepwise regression analysis is performed to find the coefficients of the independent variables; if an independent variable has no influence on the dependent variable, its coefficient is zero. The final results are as follows:
Each of the above equations reflects how the parameter taken as the dependent variable is influenced by the independent-variable parameters; the direction and size of each influence are judged from the sign and magnitude of the coefficient of the corresponding independent variable.
The above system of equations is converted into the following form:
The system of equations is then converted into a matrix:
The matrix is solved with Matlab; the parameter value ranges must be added as constraints in the calculation, and the resulting solution is the optimal value of the corresponding operating parameters.

Claims (1)

1. A big data processing method based on stepwise regression analysis, characterized in that it is carried out in the following steps:
S1: collect data on the plant operating parameters and number the collected operating parameters 1, 2, 3, ..., n;
S2: data processing: take each of the collected operating parameters in turn as the dependent variable and the remaining operating parameters as independent variables; assuming a linear relationship between the parameters, list the following equation:
where x is an independent variable, y is the dependent variable, a is the coefficient of an independent variable, and b is the intercept;
S3: import the above equations and the corresponding data into Matlab one by one and perform stepwise regression analysis to calculate the coefficients of the independent variables and the intercept of each equation;
S4: analysis of results:
(1) when the coefficient of an independent variable is zero, that independent variable has no influence on the corresponding dependent variable; the sign of a coefficient reflects the direction of the influence and its magnitude reflects the size of the influence, so the operating parameter with the greatest influence on the dependent variable, and its parameter number, can be found from the above results;
(2) the above system of equations, with its coefficients now determined, is rearranged as shown below:
The system of equations is then written in matrix form, as in the following formula:
The above formula is solved with Matlab, and the resulting solution is the optimal value of the corresponding operating parameters.
CN201610479051.6A 2016-06-27 2016-06-27 A kind of big data processing method based on stepwise regression analysis Expired - Fee Related CN106055525B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610479051.6A CN106055525B (en) 2016-06-27 2016-06-27 A kind of big data processing method based on stepwise regression analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610479051.6A CN106055525B (en) 2016-06-27 2016-06-27 A kind of big data processing method based on stepwise regression analysis

Publications (2)

Publication Number Publication Date
CN106055525A CN106055525A (en) 2016-10-26
CN106055525B CN106055525B (en) 2019-06-14

Family

ID=57165763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610479051.6A Expired - Fee Related CN106055525B (en) 2016-06-27 2016-06-27 A kind of big data processing method based on stepwise regression analysis

Country Status (1)

Country Link
CN (1) CN106055525B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107222551B (en) * 2017-06-23 2020-03-17 东软集团股份有限公司 Data transmission and processing method, equipment and information processing center
CN111144850A (en) * 2019-12-30 2020-05-12 江西服装学院 Intelligent clothing data analysis and management system
CN112132185B (en) * 2020-08-26 2023-07-18 上海大学 Method for rapidly predicting double perovskite oxide band gap based on data mining
CN114398734A (en) * 2022-01-17 2022-04-26 华侨大学 Method, device and equipment for optimizing parameters of micro-feeding and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719195A (en) * 2009-12-03 2010-06-02 上海大学 Inference method of stepwise regression gene regulatory network
CN103761420A (en) * 2013-12-31 2014-04-30 湖南大唐先一科技有限公司 Evaluation method for stepwise regression of thermal power equipment performances
CN104593540A (en) * 2015-01-30 2015-05-06 冶金自动化研究设计院 Method for evaluating energy efficiency in converter steelmaking process

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Symbolic and numerical regression: experiments and applications;J.W.Davidson等;《Information Sciences》;20030331;第150卷(第1-2期);95-117 *
模糊集合度量配煤方法在数据处理及昆钢煤场管理改进的应用;李东等;《煤质技术》;20150331(第2期);33-38 *
采煤工作面瓦斯涌出量预测逐步回归方法;郭德勇;《北京科技大学学报》;20090930;第31卷(第9期);1095-1099 *

Also Published As

Publication number Publication date
CN106055525A (en) 2016-10-26

Similar Documents

Publication Publication Date Title
CN106055525B (en) A kind of big data processing method based on stepwise regression analysis
CN103544273B (en) Method for assessing integral states of furnace conditions by aid of pattern recognition technology
CN102809928B (en) Control optimizing method for energy consumption of thermal equipment of industrial enterprise
CN111077869B (en) Big data intelligent control bag-type dust collector optimization control method and system
CN105205327B (en) A kind of ethylene production efficiency dynamic assessment method based on operating mode
CN109886471A (en) Fired power generating unit load distribution method based on neural network and intelligent optimization algorithm
CN109185917B (en) Boiler combustion state online diagnosis method and system based on flame intensity signal
CN101893877A (en) Optimization operational method based on energy consumption analysis for power plant and system thereof
CN109507961B (en) Semiconductor production line dynamic load balancing feeding control method
CN103631140B (en) Based on the coke oven heating-combustion process fire path temperature Automatic adjustment method of Performance Evaluation
CN105787271B (en) The adjustable output Interval evaluation method of thermal power plant unit based on big data analytical technology
CN106845012A (en) A kind of blast furnace gas system model membership function based on multiple target Density Clustering determines method
CN113515049A (en) Operation regulation and control system and method for gas-steam combined cycle generator set
CN105868867A (en) Method and system for optimized operation of heating boiler cluster
CN106355272A (en) Sintering intelligent data optimization method
CN105654240A (en) Machine tool manufacturing system energy efficiency analysis method
CN105631545A (en) Photovoltaic power station generation capacity prediction method based on similar day analysis and prediction system thereof
CN111767677A (en) GA algorithm-based cascade pump station group lift optimal distribution method
CN104611000B (en) Dispatch control method is criticized in a kind of production improving Large Scale Ethylene Cracking Furnace operating efficiency
CN114688010A (en) Energy-saving and consumption-reducing control method for water pump
CN104077489A (en) Method and system for analyzing energy efficiency of energy consumption device
CN103455003A (en) Monitoring method and system for production in petrochemical industry
CN112613693A (en) Coal-fired power plant flue gas purification island operation health evaluation system and method
CN103233332B (en) Curve approximation control method for cheese dyeing process
WO2022233101A1 (en) Intelligent optimization control device for low-temperature thermal system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190614

Termination date: 20210627