CN111242345A - Nuclear power unit electric power prediction method based on cluster analysis and random forest regression - Google Patents

Info

Publication number
CN111242345A
Authority
CN
China
Prior art keywords
random forest
forest regression
electric power
regression model
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911364412.2A
Other languages
Chinese (zh)
Inventor
李蔚
吴恺逾
盛德仁
陈坚红
鲍旭东
胡跃华
蔡超
骆雪莲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201911364412.2A
Publication of CN111242345A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Primary Health Care (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of power systems, and relates to a nuclear power unit electric power prediction method based on cluster analysis and random forest regression. The method carries out feature extraction on the historical operation data of the unit through cluster analysis to train a random forest regression model, so that the electric power of the unit can be predicted according to the real-time operation data of the unit. Because the data dimensionality reduction is realized through the cluster analysis, and the establishment of a mechanism model is avoided through the establishment of a random forest regression model, the method has the advantages of high prediction precision, high prediction speed and strong generalization capability.

Description

Nuclear power unit electric power prediction method based on cluster analysis and random forest regression
Technical Field
The invention belongs to the field of power systems, and relates to a nuclear power unit electric power prediction method based on cluster analysis and random forest regression.
Background
Nuclear energy plays an important role in the world energy structure because it is clean and generates power stably. A number of developed countries, including France, the United States, and Germany, have long developed nuclear power technologies. Under growing resource and environmental pressure, the energy situation in China is increasingly severe, and the adjustment of the energy structure is not easy. In 2020, the installed capacity of nuclear power is expected to reach 58 million kilowatts, and this continuously growing installed capacity draws attention to the problems that arise in the operation of nuclear power units.
In recent years, because of the arrangement of the water intake and discharge ports at a certain domestic nuclear power station, the average temperature rise over a full tidal cycle exceeds the design temperature rise when the unit runs in summer. The change in seawater temperature changes the unit back pressure, so the output electric power of the unit is lower under summer conditions; this shows up in particular as rapid changes in the condenser seawater inlet temperature during rising and falling tides. To keep the thermal power of the reactor from exceeding 100% while the seawater temperature rises to its peak, the output of the unit is limited during daily operation. To address this problem, the operating data of the unit are monitored and the historical operating data are used to establish how the operating parameters influence the electric power of the unit, so that changes in electric power are accurately predicted and operating guidance is provided for adjusting the opening of the high-pressure regulating valve of the steam turbine.
When a nuclear power unit is actually operating, many parameters influence its electric power, and these parameters are strongly correlated and collinear. It is therefore difficult to analyze theoretically how each parameter acts on the electric power independently, and a traditional multiple linear regression model can hardly achieve high fitting accuracy. Compared with traditional regression models, the random forest regression model offers strong generalization capability, fast training, simple implementation, and accurate prediction. It has been applied to prediction in the wind power field but not yet to nuclear power units, so studying its application to the conventional island of a nuclear power station is of considerable significance.
Disclosure of Invention
The invention provides a nuclear power unit electric power prediction method based on cluster analysis and random forest regression. Features are extracted from the historical operating data of the unit through cluster analysis and used to train a random forest regression model, so that the electric power of the unit can be predicted from its real-time operating data. Because the cluster analysis reduces the data dimensionality and the random forest regression model removes the need to establish a mechanism model, the method has the advantages of high prediction precision, high prediction speed and strong generalization capability.
The invention adopts the following technical scheme:
The nuclear power unit electric power prediction method based on cluster analysis and random forest regression comprises the following steps:
(1) Acquire historical operating data of the unit.
(2) Clean the historical operating data of the unit and remove abnormal data.
(3) Perform feature extraction on the cleaned historical operating data by cluster analysis to obtain characteristic values.
(4) Non-dimensionalize the characteristic values and the target value (electric power) by standardization, thereby obtaining a data set for random forest regression.
The standardization formula can be expressed as:
$$X' = \frac{X - \bar{X}}{s}$$
In the formula, $X'$ is the standardized sample value, $X$ is the original sample value, $\bar{X}$ is the mean of the original samples, and $s$ is the standard deviation of the original samples.
(5) Split the data set obtained in step (4) into a training set and a test set. The training set is used for training the random forest regression model, and the test set is used for evaluating the prediction effect of the random forest regression model.
(6) Establish a random forest regression model and train it with the training set.
(7) Test the random forest regression model with the test set, and evaluate its prediction effect with three evaluation indexes: Mean Absolute Error (MAE), Mean Square Error (MSE) and R-squared value.
(8) Optimize the random forest regression model by adjusting the minimum leaf number and the number of decision trees of the random forest (a smaller minimum leaf number makes the model more prone to capturing noise in the training set; a larger number of decision trees gives the model better performance but slows down training), and repeat steps (6) to (8) until the R-squared value of the random forest regression model is greater than 0.999.
(9) Predict the electric power of the unit with the random forest regression model according to the real-time operating data of the unit.
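Steps (4) and (5) can be illustrated with a minimal sketch in standard-library Python. It is not the applicants' implementation. The function names, the sample temperatures and the random seed are hypothetical, the use of the sample standard deviation (n - 1 denominator) is an assumption because the text only says "standard deviation", and the 30% test fraction is the value reported for the embodiment in this document.

```python
import random
import statistics

def standardize(values):
    """Return X' = (X - mean) / s for every value in the list."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(x - mean) / s for x in values]

def shuffle_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows, then split them into (training set, test set)."""
    rows = list(rows)                  # copy so the caller's data keeps its order
    random.Random(seed).shuffle(rows)  # random ordering before the split
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

# Hypothetical circulating-water inlet temperatures in degrees Celsius.
inlet_temp = [18.0, 20.0, 22.0, 24.0, 26.0]
z = standardize(inlet_temp)

# Ten hypothetical sample indices split 70/30 after shuffling.
train, test = shuffle_split(range(10), test_fraction=0.3, seed=42)
```

After standardization each column has zero mean and unit standard deviation, so features measured in different units (temperature, pressure, flow) enter the regression on a common scale.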
Further, in step (1), the historical operating data of the unit are historical operating data of the conventional island of the nuclear power station, including main steam parameters, condenser operating parameters, regenerator and reheater operating parameters, and the like.
Further, in step (2), the abnormal data are measuring-point abnormal data and operation process data, such as data generated when the unit is started up and shut down.
Further, in step (5), the data set should be shuffled into random order before it is split.
Further, in step (7), the R-squared value is also called the coefficient of determination; it reflects the proportion of the total variation of the dependent variable that can be explained by the independent variables through the regression relationship.
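The three evaluation indexes of step (7) can be written out as a short sketch using their standard textbook definitions. The function names and the sample values below are hypothetical and are not data from the application.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical standardized electric power values and model predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
```

Because MAE and MSE are computed on standardized values, they are dimensionless; an R-squared value close to 1 means the predictions explain almost all of the variation in the target.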
Owing to the above technical scheme, the invention has the following advantages over the prior art:
The invention introduces a statistical learning method. Cluster analysis is used to extract features from the historical operating data of the unit, which reduces the data dimensionality, and the characteristic values and target value are then used to train a random forest regression, which avoids establishing a mechanism model. Predicting the unit power with a random forest regression model greatly improves the prediction precision and shortens the prediction time. The random forest regression model has strong generalization capability and can adapt to different nuclear power units.
Drawings
FIG. 1 is a flow chart of a nuclear power generating unit electric power prediction method based on cluster analysis and random forest regression.
FIG. 2 is a partial view of the cluster analysis results in an embodiment of the present invention.
FIG. 3 is a partial view comparing the values predicted by the random forest regression model with the actual unit operating values in the embodiment of the invention.
Detailed Description
The invention is further explained by an embodiment of predicting the electric power of a 1000MW nuclear power plant unit in combination with the attached drawings.
As shown in FIG. 1, the invention provides a nuclear power unit electric power prediction method based on cluster analysis and random forest regression, which specifically comprises the following steps:
(1) Acquire historical operating data of the unit. In this embodiment, historical operating data of the conventional island of a 1000 MW nuclear power station from 2016 to August 2019 are obtained, covering 200 operating data measuring points at a sampling interval of 10 min.
(2) Clean the historical operating data of the unit and remove abnormal data. In this embodiment, 174410 samples remain after the abnormal data are removed.
(3) Perform feature extraction on the cleaned historical operating data by cluster analysis to obtain characteristic values. FIG. 2 is a partial view of the cluster analysis results in an embodiment of the present invention. According to the figure, the measuring points of the condenser circulating water inlet temperature and outlet temperature are similar, so the averages of the inlet temperature measuring points and of the outlet temperature measuring points are each taken as characteristic values. For the same reason, the average of the condenser inlet pressure measuring points is taken as a characteristic value.
Combining theory with practical power plant operating experience, the characteristic values finally selected for this embodiment are shown in Table 1.
TABLE 1 List of characteristic values (each the average of multiple measuring points of the same feature)
[Table 1 is provided as an image in the original publication: Figure BDA0002338032520000041]
(4) Non-dimensionalize the characteristic values and the target value (electric power) by standardization, thereby obtaining a data set for random forest regression. The standardization formula can be expressed as:
$$X' = \frac{X - \bar{X}}{s}$$
In the formula, $X'$ is the standardized sample value, $X$ is the original sample value, $\bar{X}$ is the mean of the original samples, and $s$ is the standard deviation of the original samples.
(5) Randomly split the data set obtained in the previous step into a training set and a test set. The training set is used for training the random forest regression model, and the test set is used for evaluating the prediction effect of the random forest regression model. In this embodiment, the test set contains 30% of the total number of samples in the data set.
(6) Establish a random forest regression model and train it with the training set. In this embodiment, the minimum leaf number is 1 and the number of decision trees is 100.
(7) Test the random forest regression model with the test set and evaluate its prediction effect. In this embodiment, the Mean Absolute Error (MAE) of the random forest regression model is 0.01066, the Mean Square Error (MSE) is 0.00043, and the R-squared value is 0.99957.
(8) Optimize the random forest regression model by adjusting the minimum leaf number and the number of decision trees of the random forest. In this embodiment, since the R-squared value of the random forest regression model is already greater than 0.999, the parameters of the random forest regression model are not adjusted.
(9) Predict the electric power of the unit with the random forest regression model according to the real-time operating data of the unit. FIG. 3 is a partial view comparing the values predicted by the random forest regression model with the actual unit operating values in the embodiment of the invention. In the figure, the actual electric power values are plotted as scattered points and the predicted electric power values as a solid line. The predicted and actual values almost coincide, which shows that the random forest regression model can accurately predict the electric power of the unit.
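The feature extraction of step (3), in which similar measuring points are grouped and averaged into one characteristic value, can be sketched as follows. The text does not name the clustering algorithm, so the greedy correlation-threshold grouping used here is an assumption chosen for brevity, not the method of the application. The sensor names, readings and the 0.95 threshold are likewise hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def group_points(series, threshold=0.95):
    """Greedy grouping: a point joins the first group whose first member
    it correlates with at or above the threshold, else starts a new group."""
    groups = []
    for name in series:
        for group in groups:
            if pearson(series[name], series[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

def average_group(series, group):
    """Average the grouped measuring points sample by sample into one feature."""
    n = len(group)
    return [sum(vals) / n for vals in zip(*(series[name] for name in group))]

# Hypothetical measuring points: two inlet-temperature sensors, one pressure sensor.
series = {
    "inlet_temp_A": [18.0, 19.0, 21.0, 24.0],
    "inlet_temp_B": [18.2, 19.1, 21.3, 24.1],
    "inlet_pressure": [5.0, 4.0, 5.5, 4.2],
}
groups = group_points(series)
features = [average_group(series, g) for g in groups]
```

In this toy input the two temperature sensors fall into one group and are averaged, while the pressure sensor stays on its own, so three measuring points reduce to two characteristic values. This is the kind of dimensionality reduction the description attributes to the cluster analysis.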
In addition to the random forest regression model, other regression models were also tried; the regression models of this example and their prediction performance are listed in Table 2.
TABLE 2 Regression models and their prediction performance
[Table 2 is provided as an image in the original publication: Figure BDA0002338032520000051]
As can be seen from the table, the random forest regression model has the smallest MAE and MSE and the R-squared value closest to 1, so among the regression models tried it predicts the electric power of the unit most accurately and can better guide the operation of the unit.
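Steps (5) to (8) can be sketched with scikit-learn, under several assumptions. The application does not state which software was used. Mapping "minimum leaf number" to `min_samples_leaf` and "number of decision trees" to `n_estimators` is an interpretation. The synthetic data, the candidate parameter list and the random seeds are hypothetical stand-ins and do not reproduce the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the standardized data set; not the plant data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.01 * rng.normal(size=500)

# Step (5): shuffle, then hold out 30% of the samples as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, random_state=0
)

def fit_and_score(n_trees, min_leaf):
    """Steps (6) and (7): train one forest and evaluate it on the test set."""
    model = RandomForestRegressor(
        n_estimators=n_trees, min_samples_leaf=min_leaf, random_state=0
    )
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return model, {
        "mae": mean_absolute_error(y_test, pred),
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }

# Step (8): start from the embodiment's settings (100 trees, minimum leaf 1)
# and try further hypothetical candidates only if R-squared is not above 0.999.
candidates = [(100, 1), (200, 1), (200, 2)]
for n_trees, min_leaf in candidates:
    model, scores = fit_and_score(n_trees, min_leaf)
    if scores["r2"] > 0.999:
        break
```

On this synthetic data the threshold may never be reached, in which case the loop simply ends with the last candidate. A real tuning loop would need its own stopping rule for that case, which the text does not specify.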

Claims (7)

1. The nuclear power unit electric power prediction method based on cluster analysis and random forest regression comprises the following steps:
(1) acquiring historical operating data of the unit;
(2) cleaning historical operating data of the unit, and eliminating abnormal data;
(3) performing feature extraction on the cleaned historical operating data by adopting clustering analysis to obtain a feature value;
(4) carrying out non-dimensionalization processing on the characteristic value and the target value by adopting a standardized method, thereby obtaining a data set for random forest regression;
the normalized formula is expressed as:
$$X' = \frac{X - \bar{X}}{s}$$
wherein $X'$ is a normalized sample value, $X$ is an original sample value, $\bar{X}$ is the mean value of the original sample, and $s$ is the standard deviation of the original sample;
(5) splitting the data set obtained in the step (4) to obtain a training set and a test set; the training set is used for training a random forest regression model, and the testing set is used for evaluating the prediction effect of the random forest regression model;
(6) establishing a random forest regression model, and training the random forest regression model by using a training set;
(7) testing the random forest regression model by using a test set, and evaluating the prediction effect of the random forest regression model by using three evaluation indexes, namely Mean Absolute Error (MAE), Mean Square Error (MSE) and R square value;
(8) optimizing a random forest regression model, and adjusting the minimum leaf number and the decision tree number of a random forest; repeating the steps (6) to (8) until the R square value of the random forest regression model is more than 0.999;
(9) predicting the electric power of the unit by using the random forest regression model according to the real-time operation data of the unit.
2. The nuclear power plant electric power prediction method of claim 1, characterized in that: in the step (1), the historical operating data of the unit is historical operating data of a conventional island of the nuclear power station, and the historical operating data comprises main steam parameters, condenser operating parameters, and regenerator and reheater operating parameters.
3. The nuclear power plant electric power prediction method of claim 1, characterized in that: in the step (2), the abnormal data are measuring-point abnormal data and operation process data.
4. The nuclear power plant electric power prediction method of claim 1, characterized in that: in the step (3), the characteristic value comprises one or more of main feed water flow, high-pressure regulating valve opening, main steam main pipe pressure, main steam main pipe temperature, condenser vacuum degree, low-pressure cylinder exhaust steam temperature, circulating water inlet temperature, circulating water outlet temperature, condenser inlet pressure and condenser outlet pressure.
5. The nuclear power plant electric power prediction method of claim 1, characterized in that: in the step (3), the average value of a plurality of measuring points in the same feature is taken as a characteristic value.
6. The nuclear power plant electric power prediction method of claim 1, characterized in that: before the data set is split in the step (5), the data set is shuffled into random order, and then the splitting is carried out.
7. The nuclear power plant electric power prediction method of claim 1, characterized in that: in step (7), the R-squared value is also referred to as a coefficient of determination, which reflects the proportion of the total variation of the dependent variable that can be explained by the independent variable through a regression relationship.
CN201911364412.2A 2019-12-26 2019-12-26 Nuclear power unit electric power prediction method based on cluster analysis and random forest regression Pending CN111242345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911364412.2A CN111242345A (en) 2019-12-26 2019-12-26 Nuclear power unit electric power prediction method based on cluster analysis and random forest regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911364412.2A CN111242345A (en) 2019-12-26 2019-12-26 Nuclear power unit electric power prediction method based on cluster analysis and random forest regression

Publications (1)

Publication Number Publication Date
CN111242345A (en) 2020-06-05

Family

ID=70863926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911364412.2A Pending CN111242345A (en) 2019-12-26 2019-12-26 Nuclear power unit electric power prediction method based on cluster analysis and random forest regression

Country Status (1)

Country Link
CN (1) CN111242345A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112091A (en) * 2021-05-06 2021-07-13 云南电力技术有限责任公司 Nuclear power unit power prediction method based on PCA and LSTM

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180175790A1 (en) * 2015-06-23 2018-06-21 Qatar Foundation For Education, Science And Community Development Method of forecasting for solar-based power systems
CN108197752A (en) * 2018-01-25 2018-06-22 国网福建省电力有限公司 Wind turbine output power short term prediction method based on random forest
CN108830411A (en) * 2018-06-07 2018-11-16 苏州工业职业技术学院 A kind of wind power forecasting method based on data processing
CN110363354A (en) * 2019-07-16 2019-10-22 上海交通大学 Wind field wind power prediction method, electronic device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU D等: "Random forest solar power forecast based on classification optimization", 《ENERGY》 *

Similar Documents

Publication Publication Date Title
CN111159844B (en) Abnormity detection method for exhaust temperature of gas turbine of power station
CN108446529B (en) Organic Rankine cycle system fault detection method based on generalized mutual entropy-DPCA algorithm
CN103850726B (en) Method for quickly determining stationary sliding pressing optimization curve of steam turbine
CN111581597A (en) Wind turbine generator gearbox bearing temperature state monitoring method based on self-organizing kernel regression model
CN109538311B (en) Real-time monitoring method for control performance of steam turbine in high-end power generation equipment
CN110987494A (en) Method for monitoring cavitation state of water turbine based on acoustic emission
CN115294671A (en) Air compressor outlet pressure prediction method and prediction system
CN108182553B (en) Coal-fired boiler combustion efficiency online measurement method
CN111242345A (en) Nuclear power unit electric power prediction method based on cluster analysis and random forest regression
CN110646193B (en) Test method for obtaining flow characteristic of high-pressure regulating valve of steam turbine
CN110928248A (en) Method for determining performance degradation degree of gas turbine
CN112348696B (en) BP neural network-based heating unit peak regulation upper limit evaluation method and system
CN115541227A (en) Wind power gear box fault diagnosis method based on time-shifted cosine similar entropy
CN111624979B (en) Industrial closed-loop control loop multi-oscillation detection and tracing method based on slow characteristic analysis
Qiao et al. Research on SCADA data preprocessing method of Wind Turbine
CN111581787B (en) Method and system for screening heat rate analysis data of steam turbine in real time
CN110032791B (en) Turbine low-pressure cylinder efficiency real-time calculation method based on generalized regression neural network
Peng et al. Accuracy research on the modeling methods of the gas turbine components characteristics
Tang et al. Computer Prediction Model of Heat Consumption in Thermal System of Coal-Fired Power Station Based on Big Data Analysis and Information Sorting
CN111307493B (en) Knowledge-based fault diagnosis method for tower type solar molten salt heat storage system
CN109241573B (en) Steam turbine last stage blade model selection method
CN113112091A (en) Nuclear power unit power prediction method based on PCA and LSTM
Chi et al. Turbine blade fault detection based on feature extraction
Yang et al. Running State Assessment for Induced Draft Fans Using Auto Encoder Model Combined With Fuzzy Synthetic
CN116242613A (en) Method for measuring and calculating turbine regulating stage through-flow efficiency characteristics and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200605
