CN109710601A - An intelligent hydroelectric power plant operation data cleaning method - Google Patents

An intelligent hydroelectric power plant operation data cleaning method

Publication number
CN109710601A
Authority
CN
China
Prior art keywords
data
equipment
value
power plant
hydroelectric power
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811589543.6A
Other languages
Chinese (zh)
Inventor
杨忠伟
蔡杰
葛嘉
彭放
朱传古
何亚东
曹灿
周洪宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Guodian Dadu River Dagangshan Hydropower Development Co Ltd
Original Assignee
China Guodian Dadu River Dagangshan Hydropower Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-12-25
Filing date
2018-12-25
Publication date
2019-05-03
Application filed by China Guodian Dadu River Dagangshan Hydropower Development Co Ltd
Priority to CN201811589543.6A
Publication of CN109710601A

Landscapes

  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses an intelligent hydroelectric power plant operation data cleaning method and relates to the technical field of hydroelectric power plant operation data processing. The method comprises: S1, data acquisition; S2, data cleaning; S21, accuracy judgment; S22, measuring point screening; S23, data sampling time; S24, equipment state association; S25, characteristic value screening; S3, data storage. The data are classified in several passes, and characteristic values are calculated and stored, which reduces the amount of data an operator must search when extracting data, so that the operator can consult the data quickly. The accuracy and validity judgment also removes part of the abnormal data, which reduces the interference caused by abnormal data, improves data quality, and makes the data better suited to mining data characteristics.

Description

An intelligent hydroelectric power plant operation data cleaning method
Technical field
The present invention relates to the technical field of hydroelectric power plant operation data processing, and in particular to an intelligent hydroelectric power plant operation data cleaning method.
Background
Electric energy is an indispensable energy source in today's society and is mainly supplied by power stations. The raw operation data of a hydroelectric power plant are important for analyzing the running status of the plant and are key to equipment running status analysis and equipment fault diagnosis. However, the massive raw operation data contain a large amount of incomplete, inconsistent, and abnormal data, which seriously affects the execution efficiency of data mining and modeling and may even bias the results of analysis and judgment. Cleaning the raw equipment operation data is therefore particularly important.
Summary of the invention
The object of the invention is to provide an intelligent hydroelectric power plant operation data cleaning method that has the advantage of reducing the large amount of incomplete, inconsistent, and abnormal data.
To achieve the above object, the technical solution adopted by the invention is an intelligent hydroelectric power plant operation data cleaning method, characterized by the following steps:
S1. Raw data acquisition: raw data are acquired and stored in a data center;
S2. Data cleaning;
S21. Accuracy judgment: the validity and correctness of the acquired data are judged against limit values for the raw data;
S22. Measuring point screening: the raw data in the data center are screened by type;
S23. Data sampling time: raw data of the same type generated by the same equipment are classified by time;
S24. Equipment state association: the operation data of the equipment are classified by the start/stop state of the equipment;
S25. Characteristic value screening: the maximum and minimum values of the equipment operation data are found and the average value is calculated, and the maximum, minimum, and average values are defined as the characteristic values of equipment operation;
S3. Data storage: a history library is established and the characteristic values of equipment operation are stored in it.
With the above technical solution, the start/stop state of the equipment is determined, and only data recorded after the equipment has started normally and is running stably are used as effective operation data for analysis. This excludes the large amount of data with no analytical value recorded while the equipment is shut down, as well as the data recorded during start-up, which change abruptly and have little reference value. The raw data of the equipment running in the power plant are collected, and the validity and correctness of the raw data are judged against limit values, which determines the valid range of the raw data and divides the data into valid and invalid data. The valid data are then sorted by equipment and by data type. Raw data of the same type generated by the same equipment vary in a broadly consistent way and can be further aggregated by equipment category; the screening is mainly divided according to data types such as equipment temperature and flow. The data are then organized by time node: the maximum and minimum values of the screened data of the same type from the same equipment within each period are extracted, the average value is calculated, and the maximum, minimum, and average values are set as characteristic values and stored in the history library for operators to look up. Because the data are classified in several passes and the characteristic values are calculated and stored, operators can consult the data quickly. The accuracy and validity judgment also removes part of the abnormal data, which reduces the interference caused by abnormal data, improves data quality, and makes the data better suited to mining data characteristics.
Further, in S1 the raw data include monitoring system data, auxiliary control system data, condition monitoring data, and auxiliary equipment operation data, and the raw data are aggregated into the data center using different communication protocols.
With the above technical solution, the different communication protocols break down the barriers between the data of the individual systems so that the data are integrated, which facilitates later classification of data of the same type.
Further, in S21 whether the data are online is judged from the quality status of the monitoring system data.
With the above technical solution, judging data integrity reduces the deviation introduced during data mining and modeling and avoids collecting incomplete data in the first place. Only online data are real and valid collected data; if data are judged to be non-online, the data required for the non-online period are exchanged from other systems and backfilled.
Further, in S22 the jump values generated while the power station equipment is running are processed in the data center by the "mean filter method" and rejected as abnormal values.
With the above technical solution, rejecting jump values reduces the invalid data generated by factors such as vibration while the equipment is running and improves the accuracy of the data as a whole. In the calculation, the mean of the previous N-1 data values is taken and compared with the most recently collected value to obtain the difference; if the difference exceeds a certain range, the value is discarded, otherwise it is retained.
Further, in S21 the raw data are backed up with multi-machine redundancy by the power station equipment.
With the above technical solution, the multi-machine redundant backup allows transmission to resume from the point of interruption, ensuring that the raw data from the interruption period can be obtained after a communication interruption. Using the data redundancy mechanism between the monitoring system and the data center, the data center receives and stores the monitoring system data in real time. During data cleaning, whether the data are online is first determined from the data center; if they are not online, the historical data of the monitoring system are exchanged offline, and the data of the non-online period are then cleaned.
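The backfill of a non-online period described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the fixed sampling period, the dictionary used as the data center store, and the `fetch_history` callback standing in for the offline exchange with the monitoring system are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


def find_gaps(timestamps: List[int], period: int) -> List[Tuple[int, int]]:
    """Return (first_missing, last_missing) timestamp pairs for each gap.

    `timestamps` must be sorted; `period` is the assumed sampling interval.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > period:
            gaps.append((prev + period, cur - period))
    return gaps


def backfill(center: Dict[int, float], period: int,
             fetch_history: Callable[[int, int], List[Sample]]) -> Dict[int, float]:
    """Fill gaps in the data center copy from the monitoring system history.

    `fetch_history(start, end)` is a hypothetical callback that returns the
    samples the monitoring system recorded in [start, end].
    """
    merged = dict(center)
    for start, end in find_gaps(sorted(center), period):
        for ts, value in fetch_history(start, end):
            merged.setdefault(ts, value)  # never overwrite data already received
    return merged
```

Cleaning would then run on the merged series, so that the interruption period is cleaned together with the data that arrived in real time.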
Compared with the prior art, the advantages of the present invention are as follows:
1. Setting characteristic values and a history library reduces the total amount of data queried when extracting data, shortens the time needed to extract data, and improves extraction efficiency;
2. Whether the equipment operation data are online is judged from the monitoring system data, which avoids collecting incomplete data in the first place and reduces the deviation introduced during data mining and modeling;
3. Rejecting jump values reduces the invalid data generated by factors such as vibration while the equipment is running and improves the accuracy of the data as a whole;
4. The multi-machine redundant backup allows transmission to resume from the point of interruption, ensuring that the raw data from the interruption period can be obtained after a communication interruption.
Brief description of the drawings
Fig. 1 is a flowchart of the data cleaning method in an embodiment of the present invention.
Detailed description of the embodiments
The invention is further described below.
Embodiment:
As shown in Fig. 1, an intelligent hydroelectric power plant operation data cleaning method comprises:
S1. Raw data acquisition. The raw data include monitoring system data, auxiliary control system data, condition monitoring data, and auxiliary equipment operation data. The raw data are aggregated into the data center using different communication protocols and stored there to await cleaning.
S2. Data cleaning.
S21. Accuracy judgment. The raw data are backed up with multi-machine redundancy by the power station equipment. Using the data redundancy mechanism between the monitoring system and the data center, the data center receives and stores the monitoring system data in real time, and whether the data are online is determined from the quality status of the system data in the data center. If the data are not online, the multi-machine redundant backup allows the raw data from the communication interruption to be obtained: the historical data of the monitoring system are first exchanged offline, and the data of the non-online period are then cleaned. This improves the completeness of the raw data and reduces the occurrence of non-online data. Only online data are real and valid collected data; if data are judged to be non-online, the data required for the non-online period are exchanged from other systems and backfilled. After the raw data have been acquired, the validity and correctness of the acquired data are checked against the limit values of the raw data.
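The accuracy judgment in S21 can be illustrated with a short sketch that splits readings into valid and invalid sets. The `Reading` record, its `online` flag, and the per-point limit table are hypothetical names chosen for this example; the patent does not specify a data layout or concrete limit values.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Reading:
    point: str      # measuring point name (assumed identifier)
    timestamp: int  # seconds
    value: float
    online: bool    # quality status delivered with the sample (assumed field)


def check_accuracy(readings: Iterable[Reading],
                   limits: Dict[str, Tuple[float, float]]
                   ) -> Tuple[List[Reading], List[Reading]]:
    """Split readings into (valid, invalid) using online status and limits."""
    valid, invalid = [], []
    for r in readings:
        bounds = limits.get(r.point)
        # A reading is valid only if it is online and within known limits.
        if bounds is not None and r.online and bounds[0] <= r.value <= bounds[1]:
            valid.append(r)
        else:
            invalid.append(r)
    return valid, invalid
```

In this sketch a reading with no configured limits is treated as invalid; a real system might instead route such readings to manual review.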
S22. Measuring point screening. The raw data in the data center are screened by type, and data of the same type from the same equipment are grouped together to make the data easier to look up. The screening is mainly divided according to data types such as equipment temperature and flow. The jump values generated while the power station equipment is running are then processed in the data center by the "mean filter method" and rejected as abnormal values: the mean of the previous N-1 data values is taken and compared with the most recently collected value to obtain the difference; if the difference exceeds a certain range, the value is discarded, otherwise it is retained. This reduces the interference of jump values with data analysis.
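A minimal sketch of the "mean filter method" described in S22 follows. Two details are assumptions, since the text does not settle them: the window holds only values that were retained, and the first N-1 samples are accepted without a check because no full window exists yet. The parameter names `n` and `max_diff` are illustrative.

```python
from collections import deque
from typing import Iterable, List


def reject_jumps(values: Iterable[float], n: int, max_diff: float) -> List[float]:
    """Drop values that differ from the mean of the previous n-1 retained
    values by more than max_diff."""
    if n < 2:
        raise ValueError("n must be at least 2")
    window = deque(maxlen=n - 1)  # the n-1 most recently retained values
    kept = []
    for v in values:
        if len(window) == window.maxlen:
            mean = sum(window) / len(window)
            if abs(v - mean) > max_diff:
                continue  # jump value: discard, window stays unchanged
        window.append(v)
        kept.append(v)
    return kept
```

For example, with `n=4` and `max_diff=5.0`, the series `[50.0, 50.2, 49.9, 80.0, 50.1]` loses only the reading `80.0`.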
S23. Data sampling time. Raw data of the same type generated by the same equipment are classified by time period so that they can be looked up by period; that is, the data of each period are extracted by the hour so that the trend of subsequent data can be analyzed.
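The hourly classification in S23 amounts to bucketing samples by the hour in which they were taken. The sketch below assumes integer timestamps in seconds and hour boundaries aligned to the epoch; both are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

SECONDS_PER_HOUR = 3600


def group_by_hour(samples: Iterable[Tuple[int, float]]) -> Dict[int, List[float]]:
    """Map the start timestamp of each hour to the values sampled in it."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        hour_start = ts - ts % SECONDS_PER_HOUR
        buckets[hour_start].append(value)
    return dict(buckets)
```

The function would be applied separately to each series of one data type from one piece of equipment, so that every bucket holds comparable values.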
S24. Equipment state association. The operation data of the equipment are classified by its start/stop state. Data recorded in the running state differ considerably from data recorded in the shutdown state; for example, the temperature of a generator while running differs greatly from its temperature while shut down. Data recorded after the equipment has started normally and run stably for twenty minutes are used as effective operation data for analysis. This excludes the large amount of data with no analytical value recorded while the equipment is shut down, as well as the data recorded during start-up, which change abruptly and have little reference value, and so improves the validity and value of the data analysis and the accuracy of the data.
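One way to apply the twenty-minute rule in S24 is to derive running intervals from a log of start/stop events and keep only the samples that fall inside an interval after the settling time. The event log format and the function names are hypothetical; only the twenty-minute figure comes from the text.

```python
from typing import List, Tuple

Sample = Tuple[int, float]      # (timestamp in seconds, value)
StateChange = Tuple[int, bool]  # (timestamp in seconds, True if running)


def running_intervals(changes: List[StateChange],
                      end_of_data: int) -> List[Tuple[int, int]]:
    """Turn a time-ordered start/stop log into [start, stop) intervals."""
    intervals = []
    started = None
    for ts, running in changes:
        if running and started is None:
            started = ts
        elif not running and started is not None:
            intervals.append((started, ts))
            started = None
    if started is not None:  # still running at the end of the data
        intervals.append((started, end_of_data))
    return intervals


def effective_samples(samples: List[Sample], changes: List[StateChange],
                      settle_seconds: int = 20 * 60) -> List[Sample]:
    """Keep samples taken while running, at least settle_seconds after start."""
    if not samples:
        return []
    end_of_data = max(ts for ts, _ in samples) + 1
    intervals = running_intervals(changes, end_of_data)
    return [(ts, v) for ts, v in samples
            if any(start + settle_seconds <= ts < stop
                   for start, stop in intervals)]
```

Samples from shutdown periods and from the first twenty minutes after each start are dropped, which matches the two categories of low-value data named in S24.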
S25. Characteristic value screening. The maximum and minimum values of data of the same type generated by the equipment within the same period while it is running are found, and the average value is calculated. For example, for the temperature generated by a hydro-generator unit during one hour of operation, the maximum and minimum temperatures in that hour are found and the average temperature over the hour is calculated. The maximum, minimum, and average values are defined as the characteristic values of equipment operation. The characteristic values give the approximate operating data of the equipment in that period, and leaving out the bulk of the data makes lookup faster. The characteristic values of equipment operation obtained through data cleaning and screening have had invalid data removed and have been classified, counted, and aggregated. They can be used directly as the statistical source for calculating the monthly and yearly extreme values and averages of the equipment, and as the basis for judging equipment running status and diagnosing faults.
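The characteristic values in S25 reduce each period to three numbers. A minimal sketch, assuming the data have already been bucketed by hour into a dictionary (the `Feature` type and function name are illustrative):

```python
from typing import Dict, List, NamedTuple


class Feature(NamedTuple):
    maximum: float
    minimum: float
    average: float


def characteristic_values(buckets: Dict[int, List[float]]) -> Dict[int, Feature]:
    """Compute (max, min, average) for every non-empty period."""
    return {
        period: Feature(max(values), min(values), sum(values) / len(values))
        for period, values in buckets.items()
        if values  # skip periods with no effective samples
    }
```

An hour containing the temperatures 60.0, 62.0, and 64.0 is summarized as a maximum of 64.0, a minimum of 60.0, and an average of 62.0.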
S3. Data storage. After the characteristic values have been screened, a history library is established and the characteristic values of equipment operation are stored in it, which completes the data cleaning process. The characteristic values of equipment operation obtained through data cleaning and screening have had invalid data removed and have been classified, counted, and aggregated; they can be used directly as the statistical source for calculating the monthly and yearly extreme values and averages of the equipment, and as the basis for judging equipment running status and diagnosing faults. The history library stores the extracted characteristic values. Because it holds only characteristic values, it contains less data, so data can be retrieved quickly, which reduces the time needed to look up data and improves working efficiency. The characteristic values are valid data, which improves the execution efficiency of data mining and modeling.
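A history library of this kind could be sketched with Python's standard `sqlite3` module. The table name, column names, and key layout below are assumptions for illustration; the patent does not specify a storage engine or schema.

```python
import sqlite3
from typing import List, Tuple


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a history library with one row per period."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS history (
               equipment  TEXT    NOT NULL,
               point      TEXT    NOT NULL,
               hour_start INTEGER NOT NULL,
               max_value  REAL    NOT NULL,
               min_value  REAL    NOT NULL,
               avg_value  REAL    NOT NULL,
               PRIMARY KEY (equipment, point, hour_start)
           )"""
    )
    return conn


def store_feature(conn: sqlite3.Connection, equipment: str, point: str,
                  hour_start: int, maximum: float, minimum: float,
                  average: float) -> None:
    """Insert or overwrite the characteristic values for one period."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO history VALUES (?, ?, ?, ?, ?, ?)",
            (equipment, point, hour_start, maximum, minimum, average),
        )


def query_features(conn: sqlite3.Connection, equipment: str, point: str,
                   start: int, end: int) -> List[Tuple[int, float, float, float]]:
    """Return (hour_start, max, min, avg) rows with start <= hour_start < end."""
    cur = conn.execute(
        "SELECT hour_start, max_value, min_value, avg_value FROM history "
        "WHERE equipment = ? AND point = ? AND hour_start >= ? AND hour_start < ? "
        "ORDER BY hour_start",
        (equipment, point, start, end),
    )
    return cur.fetchall()
```

Monthly or yearly extreme values can be taken directly from the stored hourly maxima and minima. A monthly average computed from hourly averages is exact only when every hour holds the same number of samples, so a production schema might also store the sample count per period.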
Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiment is intended only to help in understanding the method of the invention and its core idea. For those skilled in the art, the specific implementation and scope of application may vary in accordance with the idea of the invention, and changes and improvements to the invention are possible without departing from the concept and scope defined by the appended claims. In summary, the content of this specification should not be construed as limiting the invention.

Claims (5)

1. An intelligent hydroelectric power plant operation data cleaning method, characterized by comprising:
S1. Raw data acquisition: raw data are acquired and stored in a data center;
S2. Data cleaning;
S21. Accuracy judgment: the validity and correctness of the acquired data are judged against limit values for the raw data;
S22. Measuring point screening: the raw data in the data center are screened by type;
S23. Data sampling time: raw data of the same type generated by the same equipment are classified by time;
S24. Equipment state association: the operation data of the equipment are classified by the start/stop state of the equipment;
S25. Characteristic value screening: the maximum and minimum values of the equipment operation data are found and the average value is calculated, and the maximum, minimum, and average values are defined as the characteristic values of equipment operation;
S3. Data storage: a history library is established and the characteristic values of equipment operation are stored in it.
2. The intelligent hydroelectric power plant operation data cleaning method according to claim 1, characterized in that in S1 the raw data include monitoring system data, auxiliary control system data, condition monitoring data, and auxiliary equipment operation data, and the raw data are aggregated into the data center using different communication protocols.
3. The intelligent hydroelectric power plant operation data cleaning method according to claim 2, characterized in that in S21 whether the data are online is judged from the quality status of the monitoring system data.
4. The intelligent hydroelectric power plant operation data cleaning method according to claim 1, characterized in that in S22 the jump values generated while the power station equipment is running are processed in the data center by the "mean filter method" and rejected as abnormal values.
5. The intelligent hydroelectric power plant operation data cleaning method according to claim 1, characterized in that in S21 the raw data are backed up with multi-machine redundancy by the power station equipment.
CN201811589543.6A, priority date 2018-12-25, filing date 2018-12-25: An intelligent hydroelectric power plant operation data cleaning method. Status: Pending. Publication: CN109710601A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811589543.6A CN109710601A (en) 2018-12-25 2018-12-25 A kind of intelligence hydroelectric power plant operation data cleaning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811589543.6A CN109710601A (en) 2018-12-25 2018-12-25 A kind of intelligence hydroelectric power plant operation data cleaning method

Publications (1)

Publication Number Publication Date
CN109710601A (en) 2019-05-03

Family

ID=66257396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811589543.6A Pending CN109710601A (en) 2018-12-25 2018-12-25 A kind of intelligence hydroelectric power plant operation data cleaning method

Country Status (1)

Country Link
CN (1) CN109710601A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436249A (en) * 2012-01-19 2012-05-02 四川谊田集群科技有限公司 Intelligent electric quantity management and control system and method
CN104750861A (en) * 2015-04-16 2015-07-01 中国电力科学研究院 Method and system for cleaning mass data of energy storage power station
CN106777150A (en) * 2016-12-19 2017-05-31 国网山东省电力公司电力科学研究院 A kind of cross-system data transfer device for merging operation of power networks environment and facility information
CN108491508A (en) * 2018-03-22 2018-09-04 安徽八六物联科技有限公司 A kind of big data cleaning code system
JP2018180759A (en) * 2017-04-07 2018-11-15 株式会社日立製作所 System analysis system and system analysis method
CN108846511A (en) * 2018-06-04 2018-11-20 国家电网公司 A kind of defect of transformer equipment trend analysis based on regulation big data platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
邢志刚 et al.: "Refined analysis of wind farm and wind turbine operation data", 《风能》 (Wind Energy) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111650345A (en) * 2020-07-14 2020-09-11 中科三清科技有限公司 Method, device, equipment and medium for processing atmospheric environmental pollution detection data
CN112069036A (en) * 2020-11-10 2020-12-11 南京信易达计算技术有限公司 Management and monitoring system based on cluster computing
CN112069036B (en) * 2020-11-10 2021-09-03 南京信易达计算技术有限公司 Management and monitoring system based on cluster computing
CN112579581A (en) * 2020-11-30 2021-03-30 贵州力创科技发展有限公司 Data access method and system of data analysis engine
CN112579581B (en) * 2020-11-30 2023-04-14 贵州力创科技发展有限公司 Data access method and system of data analysis engine
CN113012353A (en) * 2021-02-22 2021-06-22 广州好友数码科技有限公司 Water quantity real-time monitoring method and system suitable for Internet of things water meter

Similar Documents

Publication Publication Date Title
CN109710601A (en) A kind of intelligence hydroelectric power plant operation data cleaning method
CN110263078B (en) A kind of distribution line heavy-overload Intelligent statistical method
CN111738308A (en) Dynamic threshold detection method for monitoring index based on clustering and semi-supervised learning
CN109040320B (en) Multidimensional information acquisition system and method in textile production process
CN116992346A (en) Enterprise production data processing system based on artificial intelligence big data analysis
CN112801313A (en) Fully mechanized mining face fault judgment method based on big data technology
CN111230159B (en) Multi-sensor fusion turning tool state monitoring method and system
CN111443326B (en) Running beat diagnostic system for automatic verification assembly line of electric energy meter and working method thereof
CN116275643B (en) Intelligent recognition method for execution condition of welding process
CN116800586A (en) Method for diagnosing data communication faults of telecommunication network
CN111666978A (en) Intelligent fault early warning system for IT system operation and maintenance big data
CN111404756A (en) Fault diagnosis system for communication equipment
CN109523030B (en) Telemetering parameter abnormity monitoring system based on machine learning
CN114374600A (en) Network operation and maintenance method, device, equipment and product based on big data
CN109382702A (en) A kind of chain digital control gear hobbing machine rolling blade losing efficacy form automatic identifying method
CN117633468A (en) Information analysis-based power system fault judging method and device
CN112803587A (en) Intelligent inspection method for state of automatic equipment based on diagnosis decision library
CN116908137A (en) Abnormality detection and analysis method and device for near infrared data
CN116165939A (en) Remote supervision system and method for environmental protection equipment based on big data
CN116704729A (en) Industrial kiln early warning system and method based on big data analysis
CN113255204A (en) Method for calculating and counting steel-making steel material consumption by utilizing big data
CN115213735B (en) System and method for monitoring cutter state in milling process
CN110502553A (en) A kind of aid decision-making method based on big data
CN118432710B (en) Method for reporting fault of optical fiber communication link
Rao Fault Diagnosis of CNC Machine Tools Based on Neural Network Optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190503