CN106709035A - Preprocessing system for electric power multi-dimensional panoramic data - Google Patents

Publication number
CN106709035A
Authority
CN
China
Prior art keywords
data
electric power
attribute
module
panoramic view
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611247497.2A
Other languages
Chinese (zh)
Other versions
CN106709035B (en)
Inventor
黄�良
赵立进
吕黔苏
杨涛
吴建蓉
王波
陈思远
林刚
张亚茹
赵芳菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Guizhou Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Guizhou Power Grid Co Ltd
Priority to CN201611247497.2A
Publication of CN106709035A
Application granted
Publication of CN106709035B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention relates to a preprocessing system for electric power multi-dimensional panoramic data. The preprocessing system comprises a data cleaning module, a data storage and retrieval module and a data value extraction module which are connected in sequence. With this preprocessing system, massive transaction data, massive interaction data and massive processing data from an electric power system can be processed effectively, data types can be judged rapidly, and data value can be extracted rapidly.

Description

A preprocessing system for electric power multi-dimensional panoramic data
Technical field
The present invention relates to the technical field of data processing, and in particular to a preprocessing method for electric power multi-dimensional panoramic data.
Background
In recent years, the development and study of the smart grid has become a major focus. Smart meters, an important component of the smart grid, collect large volumes of detailed, multi-time-scale, multi-type basic input data. Compared with traditional basic data such as power flow data, the data has grown from a snapshot at a single time section to the complete data over a period of time, and more unstructured data has appeared. Existing structured data alone can no longer meet practical analysis needs.
As smart grid construction advances, electric power multi-dimensional panoramic data can be divided by owner into grid data, user data and social data, corresponding respectively to grid enterprises, power consumers, and government and third-party institutions. These data are usually presented through information integration platforms. Grid enterprise data mainly comes from distribution automation, GIS, SCADA, the power consumption information acquisition system, the customer marketing service system, and the user energy management system. Power consumer data mainly comes from distributed generation EMS, microgrid MG-EMS, home HEMS, building BEMS, and enterprise EMS. Government and third-party data mainly comes from weather monitoring systems, energy consumption supervision systems, smart city monitoring systems, energy public service platforms and similar social data sources. As heterogeneous data, these sources differ in design style and storage mode and mix structured with unstructured content; they are also massive, updated very quickly, and geographically widely distributed.
Because the data comes from many sources, is large in volume, is updated quickly, and has low value density, mining its value is difficult. On the one hand, useful information and regularities in the data are hard to find quickly; on the other hand, redundant data can distort judgment. A preprocessing method for electric power multi-dimensional panoramic data is therefore of great significance for data value mining.
Summary of the invention
To address the above problems, the present invention proposes a preprocessing method for electric power multi-dimensional panoramic data that improves data quality and reduces the difficulty of data value mining.
A preprocessing system for electric power multi-dimensional panoramic data comprises the following modules, connected in sequence:
Data cleaning module: corrects and denoises the electric power multi-dimensional panoramic data and fills in missing values, through attribute identification, bad data discrimination, data classification, data filling and smoothing. This module improves data quality, which helps improve the accuracy and efficiency of data mining. The data cleaning module uses time series models to identify the time series of each state quantity and thereby obtain the attribute values of the data. It detects abnormal patterns in the data and judges whether abnormal data is "useful data" from which equipment fault information can be extracted or "useless data" that can be cleaned away. The data is then classified, and each class is fitted with a time series intervention model to extract valid fault information. During cleaning, a different correction formula is selected according to the type of outlier in the sequence, so that noisy points are corrected and missing values are filled.
Data storage module: stores and manages the cleaned electric power multi-dimensional panoramic data, optimizing storage space and supporting massive homogeneous and heterogeneous power data. The data storage module takes full account of the correlation and spatio-temporal attributes of the data, uses a relational database together with a "key-value" non-relational database to support the storage and processing of massive data, applies storage optimization and MapReduce-based parallel analysis to the data, and implements the parallel analysis algorithms with the MapReduce framework.
Data integration module: after storage, processes the large number of distributed data sources under one unified structure and schema, bringing the scattered data together into a unified data set. Considering that the data sources reside in multiple databases, data warehouses or ordinary files, the data integration module first stores them by data type in a structured database and an unstructured database. To ease indexing and extraction, it uses a data aggregation method based on a data association matrix to establish the association between the two databases, then connects the two databases in parallel and builds a big data platform with a layered structure.
In the above preprocessing system, the data cleaning module corrects, denoises and fills missing values in the electric power multi-dimensional panoramic data as follows:
Step 1, attribute identification: the input is a data set sample S with n attributes, whose attribute set is X, |X| = n. Let J be the evaluation method for the data and GS the candidate attribute generation strategy.
Define L as the starting point of attribute set X, and Solution as the best attributes in L according to evaluation method J. Loop over the attribute set: when candidate attributes X′ from the attribute set have an evaluation value J(X′) that is at least the evaluation value J(Solution) of the best attributes found so far, i.e. J(X′) ≥ J(Solution), X′ becomes the best attributes.
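The loop in Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the generation strategy `all_subsets`, the scoring function `toy_score` and the `RELEVANCE` weights are hypothetical stand-ins for GS and J, which the text does not define.

```python
from itertools import combinations

# Hypothetical relevance weights, used only to give J something to score.
RELEVANCE = {"voltage": 9, "current": 8, "timestamp_ms": 1, "meter_color": 0}


def toy_score(subset):
    """Hypothetical evaluation method J: total relevance minus a size penalty."""
    return sum(RELEVANCE[name] for name in subset) - 3 * len(subset)


def all_subsets(attributes):
    """Hypothetical generation strategy GS: every non-empty subset, smallest first."""
    for size in range(1, len(attributes) + 1):
        yield from combinations(attributes, size)


def select_attributes(attributes, evaluate, generate_candidates, start=()):
    """Keep the candidate X' whenever J(X') >= J(Solution), as in Step 1."""
    solution = tuple(start)            # L: the starting point of the search
    best_score = evaluate(solution)
    for candidate in generate_candidates(attributes):
        score = evaluate(candidate)
        if score >= best_score:        # J(X') >= J(Solution)
            solution, best_score = candidate, score
    return solution, best_score


best, score = select_attributes(list(RELEVANCE), toy_score, all_subsets)
print(best, score)
```

An exhaustive generator is only practical for small n; a greedy forward or backward strategy could be passed as `generate_candidates` instead.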
Step 2, bad data discrimination: the input is the data set Solution containing n attribute samples, {x_1, x_2, …, x_n}. For each data point x_i in the data set, if x_i meets the discrimination condition [formula not reproduced in this text] (σ is the acceptable error range), x_i is considered bad data and is added to the bad data set BS.
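The discrimination condition of Step 2 is not reproduced in this text, so the sketch below substitutes an assumed one: a value is treated as bad when it deviates from the median of the data set by more than σ. The median reference and the function name `split_bad_data` are assumptions made for illustration only.

```python
from statistics import median


def split_bad_data(values, sigma):
    """Split values into (good, bad) using an ASSUMED rule: |x_i - median| > sigma.

    The patent's actual condition is missing from the source text; only the
    role of sigma (acceptable error range) and the bad data set BS are given.
    """
    reference = median(values)
    good, bad_set = [], []
    for x in values:
        if abs(x - reference) > sigma:
            bad_set.append(x)      # corresponds to adding x_i to BS
        else:
            good.append(x)
    return good, bad_set


readings = [220.1, 219.8, 220.3, 220.0, 400.0]   # hypothetical voltage samples
good, bad_set = split_bad_data(readings, sigma=5.0)
print(good, bad_set)
```

The median is used rather than the mean so that a single extreme reading does not shift the reference value it is compared against.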
Step 3, data classification: first select k initial center points, then assign each data object to the class whose center is nearest, forming k clusters; finally recompute the center of each cluster. Repeat this process until no cluster center changes.
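Step 3 describes the standard k-means iteration. A minimal pure-Python sketch follows; taking the first k points as initial centers and the `max_iter` safety limit are assumptions, since the text does not say how the initial centers are chosen.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster points (tuples of floats) into k clusters, as outlined in Step 3."""
    centers = [tuple(p) for p in points[:k]]    # assumption: first k points as initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assign each data object to the class with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute the center of each cluster (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:              # no center changed: stop
            break
        centers = new_centers
    return centers, clusters


samples = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centers, clusters = kmeans(samples, k=2)
print(centers)
```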
Step 4, data filling: the input is a data set D containing n objects and divided into k clusters. The processing steps are:
Step 4.1: divide data set D into two data subsets D_C and D_i. All records in D_C are complete records, with no attribute containing missing values; the records in D_i are incomplete records, with one or more missing values in their attributes.
Step 4.2: apply the k-means algorithm to data subset D_C.
Step 4.3: take records from data subset D_i in order; for each record, compute its similarity to each of the k classes of D_C, select the maximum similarity, and mark the record as class C_i (i = 1, 2, …, k). Continue until the data subset is empty.
Step 4.4: according to the class assigned to each record in D_i, handle the record's missing values as follows:
[formula not reproduced in this text] where A_i is the data in the class.
The D_i obtained after Steps 4.1 to 4.4 is the data set after data filling.
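Steps 4.1 to 4.4 can be sketched as below. Two points are assumptions, because the filling formula is not reproduced in this text: similarity is taken as the negative Euclidean distance over the attributes a record actually has, and a missing value is replaced by the corresponding coordinate of the assigned class center (the class mean of that attribute). The centers are expected to come from running k-means on the complete subset D_C.

```python
import math


def fill_missing(records, centers):
    """Fill None entries using the nearest class center (ASSUMED filling rule).

    records: list of lists, with None marking a missing attribute value.
    centers: class centers obtained by k-means on the complete records D_C.
    """
    filled = []
    for record in records:
        if None not in record:                 # Step 4.1: complete records pass through
            filled.append(list(record))
            continue
        observed = [j for j, v in enumerate(record) if v is not None]

        def distance(center, record=record, observed=observed):
            # Distance measured only over the attributes the record has.
            return math.sqrt(sum((record[j] - center[j]) ** 2 for j in observed))

        nearest = min(centers, key=distance)   # Step 4.3: most similar class
        # Step 4.4 (assumed rule): take the class mean for each missing attribute.
        filled.append([nearest[j] if v is None else v for j, v in enumerate(record)])
    return filled


class_centers = [(1.0, 10.0), (5.0, 50.0)]     # hypothetical k-means output for D_C
data = [[1.2, None], [None, 48.0], [5.0, 50.0]]
print(fill_missing(data, class_centers))
```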
Step 5, smoothing and denoising: apply a wavelet transform to the data set whose missing values have been filled, choosing a suitable wavelet basis function and number of decomposition levels; separate the noise data from the information data, delete the noise data, and reconstruct the signal, preserving the completeness and characteristics of the data.
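One way to picture Step 5 is a Haar wavelet decomposition with soft thresholding of the detail coefficients, followed by reconstruction. The Haar basis, the soft threshold and the default parameter values are assumptions chosen to keep the sketch dependency-free; the text only asks for a suitable basis and number of levels. The signal length must be divisible by 2 to the power of `levels`.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Return (approximation, [detail level 1, detail level 2, ...])."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest detail level."""
    signal = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for s, d in zip(signal, detail):
            rebuilt.extend(((s + d) / SQRT2, (s - d) / SQRT2))
        signal = rebuilt
    return signal


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; small (noise-like) ones become zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def denoise(signal, levels=2, threshold=1.0):
    """Decompose, suppress small detail coefficients, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    details = [[soft_threshold(d, threshold) for d in level] for level in details]
    return haar_reconstruct(approx, details)


noisy = [4.0, 4.2, 3.9, 4.1, 9.0, 4.0, 4.1, 3.9]   # hypothetical readings
print(denoise(noisy, levels=2, threshold=0.5))
```

With a threshold of zero the signal is reconstructed unchanged, and a very large threshold reduces each block of four samples to its mean, which gives a quick check that the transform pair is consistent.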
In the above preprocessing system, the data storage module stores and manages the cleaned electric power multi-dimensional panoramic data as follows:
For structured data: attach a label to each class of data, with a one-to-many relationship between labels and data, and store the data together with its labels in an existing MySQL database.
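The one-to-many label scheme for structured data can be sketched with two tables. The example uses Python's built-in `sqlite3` as a stand-in so that it runs without a server; the patent names MySQL, and the table names, column names and sample values here are hypothetical.

```python
import sqlite3


def build_store():
    """Create an in-memory database with a label table and a record table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE label (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE record (
            id          INTEGER PRIMARY KEY,
            label_id    INTEGER NOT NULL REFERENCES label(id),  -- many records per label
            device      TEXT NOT NULL,
            measured_at TEXT NOT NULL,
            value       REAL NOT NULL
        );
    """)
    return conn


def store(conn, label, device, measured_at, value):
    """Store one record under its label, creating the label on first use."""
    # SQLite syntax; the MySQL equivalent would be INSERT IGNORE.
    conn.execute("INSERT OR IGNORE INTO label (name) VALUES (?)", (label,))
    (label_id,) = conn.execute("SELECT id FROM label WHERE name = ?", (label,)).fetchone()
    conn.execute(
        "INSERT INTO record (label_id, device, measured_at, value) VALUES (?, ?, ?, ?)",
        (label_id, device, measured_at, value),
    )


def records_for(conn, label):
    """Return all records that carry the given label."""
    return conn.execute(
        "SELECT r.device, r.measured_at, r.value FROM record r "
        "JOIN label l ON l.id = r.label_id WHERE l.name = ? ORDER BY r.id",
        (label,),
    ).fetchall()


conn = build_store()
store(conn, "SCADA", "T1", "2016-12-29T00:00", 110.2)
store(conn, "SCADA", "T2", "2016-12-29T00:05", 109.8)
store(conn, "GIS", "T1", "2016-12-29T00:00", 1.0)
print(records_for(conn, "SCADA"))
```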
For unstructured data: store it as files in HDFS and, using the key-value mapping between data items, build a data matrix for storage. The index of the data matrix consists of the row key (Row Key), column family (Column Family), column qualifier (Column Qualifier) and timestamp (Timestamp), and can be written as (Row, Family:Column, Timestamp) → Value.
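The index (Row, Family:Column, Timestamp) → Value can be illustrated with an in-memory stand-in. The class `DataMatrix` and its methods are hypothetical and do not reproduce the API of HDFS or of any key-value store; they only show how the four index parts address one value.

```python
from collections import defaultdict


class DataMatrix:
    """In-memory sketch of a (row, family:qualifier, timestamp) -> value index."""

    def __init__(self):
        # (row key, column family, column qualifier) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, family, qualifier, timestamp, value):
        """Write one versioned cell."""
        self._cells[(row, family, qualifier)][timestamp] = value

    def get(self, row, family, qualifier, timestamp=None):
        """Read a cell; without a timestamp, return the most recent version."""
        versions = self._cells.get((row, family, qualifier))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)


matrix = DataMatrix()
matrix.put("meter-001", "reading", "voltage", 1, 220.1)   # hypothetical keys and values
matrix.put("meter-001", "reading", "voltage", 2, 219.7)
print(matrix.get("meter-001", "reading", "voltage"))       # latest version
print(matrix.get("meter-001", "reading", "voltage", 1))    # a specific version
```

Keeping the timestamp as part of the index lets several versions of the same cell coexist, which suits measurement data that is updated quickly.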
In the above preprocessing system, the data integration module processes the large number of distributed data sources under one unified structure and schema and brings the scattered data together into a unified data set as follows: the data is stored using the two storage methods described in claim 3 (relational data in the relational database, non-relational data in the non-relational database), and both databases are uploaded to the preprocessing system. MapReduce is used in the Hadoop runtime environment: in the map stage, all data is grouped and sorted on multiple nodes; afterwards, the TaskTracker nodes of the reduce stage pull the data by remote access.
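The map, group-and-sort, and reduce stages described above can be imitated in a single process to show the data flow. This sketch does not use Hadoop or its APIs; the function names, the record layout and the per-device sum are hypothetical, and the remote pull performed by TaskTracker nodes is replaced by an ordinary function call.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key and sort the keys, as done between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Run the three stages in order."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical records merged from the relational and key-value stores.
rows = [
    {"device": "T2", "energy_kwh": 1.0},
    {"device": "T1", "energy_kwh": 2.0},
    {"device": "T1", "energy_kwh": 3.0},
]
totals = run_job(
    rows,
    mapper=lambda r: [(r["device"], r["energy_kwh"])],
    reducer=lambda key, values: sum(values),
)
print(totals)
```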
The present invention can effectively process massive transaction data, massive interaction data and massive processing data from the power system, quickly determine data types, and extract data value.
Brief description of the drawings
Figure 1 is a flow chart of the preprocessing method for electric power multi-dimensional panoramic data.
Figure 2 is a flow chart of the data cleaning module.
Figure 3 is a flow chart of the data storage module.
Figure 4 is a flow chart of the data integration module.
Detailed description of embodiments
To further explain the process and beneficial effects of the present invention, it is described below with reference to the drawings.
To achieve the above purpose, the technical solution proposed by the present invention is a data preprocessing system made up of three modules (data cleaning, data storage and data integration) with the following functions:
(1) The data cleaning module corrects and denoises the electric power multi-dimensional panoramic data and fills in missing values, through attribute identification, bad data discrimination and data classification. This module improves data quality, which helps improve the accuracy and efficiency of data mining.
(2) The data storage module stores and manages the cleaned electric power multi-dimensional panoramic data, optimizing storage space and supporting massive homogeneous and heterogeneous power data.
(3) The data integration module, after storage, processes the large number of distributed data sources under one unified structure and schema, bringing the scattered data together into a unified data set.
The data cleaning module works as follows:
The module uses time series models to identify the time series of each state quantity, detects abnormal patterns in the data, and judges whether abnormal data is "useful data" from which equipment fault information can be extracted or "useless data" that can be cleaned away. It fits the data with a time series intervention model to extract valid fault information. During cleaning, a different correction formula is selected according to the type of outlier in the sequence, so that noisy points are corrected and missing values are filled.
The data storage module works as follows:
The module takes full account of the correlation and spatio-temporal attributes of the data, uses a relational database together with a "key-value" non-relational database to support the storage and processing of massive data, applies storage optimization and MapReduce-based parallel analysis to the data, and implements the parallel analysis algorithms with the MapReduce framework.
The data integration module works as follows:
Considering that the data sources reside in multiple databases, data warehouses, ordinary files and so on, the module first stores them by data type in a structured database and an unstructured database. To ease indexing and extraction, it uses a data aggregation method based on a data association matrix to establish the association between the two databases, then connects the two databases in parallel and builds a big data platform with a layered structure.
As shown in Fig. 2, the massive power data includes structured and unstructured data. After the data is imported into the data cleaning module, attributes are first identified using the data source and data time labels, and bad data that resembles isolated points is then identified and removed. After this preliminary processing, the data is divided into conventional structured data and unstructured data such as pictures and text, and different algorithms are applied to each to fill and denoise the data.
As shown in Fig. 3, the cleaned data has already been divided into structured and unstructured data; these are stored in the relational and non-relational databases respectively and processed and analyzed in parallel with the MapReduce framework.
As shown in Fig. 4, data association matrix cluster analysis is performed on the two databases that have been built to establish the association between them; finally they are placed in one data warehouse to build a big data platform.
The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the invention belongs may modify or supplement the described embodiments or substitute similar approaches without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
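The text does not define the data association matrix used in Fig. 4. One possible reading, offered purely as an assumption, is a 0/1 matrix whose rows are structured records and whose columns are unstructured objects, with a 1 wherever the two are related. The relation rule below (a shared device prefix in the object key) and all names are hypothetical.

```python
def association_matrix(structured_rows, unstructured_keys, related):
    """Build a 0/1 matrix: entry [i][j] is 1 when row i is related to object j."""
    return [
        [1 if related(row, key) else 0 for key in unstructured_keys]
        for row in structured_rows
    ]


def linked_keys(matrix, unstructured_keys, row_index):
    """List the unstructured objects associated with one structured row."""
    return [key for key, flag in zip(unstructured_keys, matrix[row_index]) if flag]


# Hypothetical contents of the two databases.
rows = [{"device": "T1"}, {"device": "T2"}]
keys = ["T1/infrared-001.jpg", "T2/inspection-note.txt", "T1/inspection-note.txt"]


def same_device(row, key):
    """Assumed relation: the object key starts with the row's device ID."""
    return key.startswith(row["device"] + "/")


matrix = association_matrix(rows, keys, same_device)
print(matrix)
print(linked_keys(matrix, keys, 0))
```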

Claims (4)

1. A preprocessing system for electric power multi-dimensional panoramic data, characterized in that it comprises, connected in sequence:
a data cleaning module, for correcting and denoising the electric power multi-dimensional panoramic data and filling in missing values, including attribute identification, bad data discrimination, data classification, data filling and smoothing, this module improving data quality and thereby helping to improve the accuracy and efficiency of data mining; the data cleaning module uses time series models to identify the time series of each state quantity and thereby obtain the attribute values of the data, detects abnormal patterns in the data, judges whether abnormal data is useful data from which equipment fault information can be extracted or useless data that can be cleaned away, then classifies the data and fits each class with a time series intervention model to extract valid fault information; during cleaning, a different correction formula is selected according to the type of outlier in the sequence, so that noisy points are corrected and missing values are filled;
a data storage module, for storing and managing the cleaned electric power multi-dimensional panoramic data, optimizing storage space and supporting massive homogeneous and heterogeneous power data; the data storage module takes full account of the correlation and spatio-temporal attributes of the data, uses a relational database together with a key-value non-relational database to support the storage and processing of massive data, applies storage optimization and MapReduce-based parallel analysis to the data, and implements the parallel analysis algorithms with the MapReduce framework;
a data integration module, for processing, after storage, the large number of distributed data sources under one unified structure and schema and bringing the scattered data together into a unified data set; considering that the data sources reside in multiple databases, data warehouses or ordinary files, the data integration module stores them by data type in a structured database and an unstructured database and then, to ease indexing and extraction, uses a data aggregation method based on a data association matrix to establish the association between the two databases, finally connects the two databases in parallel, and builds a big data platform with a layered structure.
2. The preprocessing system for electric power multi-dimensional panoramic data according to claim 1, characterized in that the data cleaning module corrects, denoises and fills missing values in the electric power multi-dimensional panoramic data as follows:
Step 1, attribute identification: the input is a data set sample S with n attributes, whose attribute set is X, |X| = n; the evaluation method for the data is J, and the candidate attribute generation strategy is GS;
L is defined as the starting point of attribute set X, and Solution as the best attributes in L according to evaluation method J; the attribute set is looped over, and when candidate attributes X′ from the attribute set have an evaluation value J(X′) that is at least the evaluation value J(Solution) of the best attributes found so far, i.e. J(X′) ≥ J(Solution), X′ becomes the best attributes;
Step 2, bad data discrimination: the input is the data set Solution containing n attribute samples, {x_1, x_2, …, x_n}; for each data point x_i in the data set, if x_i meets the discrimination condition [formula not reproduced in this text], where σ is the acceptable error range, x_i is considered bad data and is added to the bad data set BS;
Step 3, data classification: k initial center points are first selected, each data object is then assigned to the class whose center is nearest, forming k clusters, and the center of each cluster is finally recomputed; this process is repeated until no cluster center changes;
Step 4, data filling: the input is a data set D containing n objects and divided into k clusters, and the processing steps are:
Step 4.1: divide data set D into two data subsets D_C and D_i; all records in D_C are complete records, with no attribute containing missing values; the records in D_i are incomplete records, with one or more missing values in their attributes;
Step 4.2: apply the k-means algorithm to data subset D_C;
Step 4.3: take records from data subset D_i in order; for each record, compute its similarity to each of the k classes of D_C, select the maximum similarity, and mark the record as class C_i, i = 1, 2, …, k; continue until the data subset is empty;
Step 4.4: according to the class assigned to each record in D_i, handle the record's missing values as follows:
[formula not reproduced in this text] where A_i is the data in the class;
the D_i obtained after Steps 4.1 to 4.4 is the data set after data filling;
Step 5, smoothing and denoising: apply a wavelet transform to the data set whose missing values have been filled, choosing a suitable wavelet basis function and number of decomposition levels; separate the noise data from the information data, delete the noise data, and reconstruct the signal, preserving the completeness and characteristics of the data.
3. The preprocessing system for electric power multi-dimensional panoramic data according to claim 1, characterized in that the data storage module stores and manages the cleaned electric power multi-dimensional panoramic data as follows:
for structured data: a label is attached to each class of data, with a one-to-many relationship between labels and data, and the data is stored together with its labels in an existing MySQL database;
for unstructured data: the data is stored as files in HDFS and, using the mapping relationship between data items, a data matrix is built for storage; the index of the data matrix consists of the row key, column family, column qualifier and timestamp, and can be written as (Row, Family:Column, Timestamp) → Value.
4. The preprocessing system for electric power multi-dimensional panoramic data according to claim [number not given in this text], characterized in that the data integration module processes the large number of distributed data sources under one unified structure and schema and brings the scattered data together into a unified data set as follows: the data is stored using the two storage methods described in claim 3, and both databases are uploaded to the preprocessing system; MapReduce is used in the Hadoop runtime environment; in the map stage, all data is grouped and sorted on multiple nodes, after which the TaskTracker nodes of the reduce stage pull the data by remote access.
CN201611247497.2A 2016-12-29 2016-12-29 A kind of pretreatment system of electric power multidimensional panoramic view data Active CN106709035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611247497.2A CN106709035B (en) 2016-12-29 2016-12-29 A kind of pretreatment system of electric power multidimensional panoramic view data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611247497.2A CN106709035B (en) 2016-12-29 2016-12-29 A kind of pretreatment system of electric power multidimensional panoramic view data

Publications (2)

Publication Number Publication Date
CN106709035A true CN106709035A (en) 2017-05-24
CN106709035B CN106709035B (en) 2019-11-26

Family

ID=58904005

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611247497.2A Active CN106709035B (en) 2016-12-29 2016-12-29 A kind of pretreatment system of electric power multidimensional panoramic view data

Country Status (1)

Country Link
CN (1) CN106709035B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133355A (en) * 2017-05-25 2017-09-05 国网天津市电力公司 Route parameter calculation and data management-control method in the range of regional dispatching
CN108021935A (en) * 2017-11-27 2018-05-11 中国电力科学研究院有限公司 A kind of Dimensionality reduction method and device based on big data technology
CN108170825A (en) * 2018-01-05 2018-06-15 上海电气分布式能源科技有限公司 Distributed energy data monitoring cleaning method based on cloud platform
CN108335231A (en) * 2018-01-29 2018-07-27 国网福建省电力有限公司 A kind of power distribution network data diagnosis method of Auto-matching
CN108563770A (en) * 2018-04-20 2018-09-21 南京邮电大学 A kind of KPI and various dimensions network data cleaning method based on scene
CN109492877A (en) * 2018-10-16 2019-03-19 安徽医科大学第附属医院 Hospital Informatization evaluation method
CN109696316A (en) * 2017-10-20 2019-04-30 株洲中车时代电气股份有限公司 A kind of train remote supervision system
CN109801181A (en) * 2017-11-17 2019-05-24 中国电力科学研究院有限公司 A kind of switching data cleaning method for repairing and mending and system
CN109977107A (en) * 2019-04-02 2019-07-05 电子科技大学 A kind of electricity consumption acquisition data cleaning method
CN110287256A (en) * 2019-06-14 2019-09-27 南京邮电大学 A kind of electric network data parallel processing system (PPS) and its processing method based on cloud computing
CN110348716A (en) * 2019-06-28 2019-10-18 国网河北省电力有限公司电力科学研究院 A kind of the backstage control platform and method of controller switching equipment
CN110825798A (en) * 2019-10-29 2020-02-21 深圳供电局有限公司 Electric power application data maintenance method and device
CN110943983A (en) * 2019-11-22 2020-03-31 南京邮电大学 Network security prevention method based on security situation awareness and risk assessment
CN111901158A (en) * 2020-07-14 2020-11-06 广东科徕尼智能科技有限公司 Intelligent home distribution network fault data analysis method, equipment and storage medium
CN112069269A (en) * 2020-08-27 2020-12-11 黄天红 Big data and multidimensional feature-based data tracing method and big data cloud server
CN112104073A (en) * 2020-08-19 2020-12-18 厦门盈盛捷电力科技有限公司 Real-time information calibration method for power system
CN113159326A (en) * 2021-03-03 2021-07-23 国网山西省电力公司信息通信分公司 Intelligent business decision method based on artificial intelligence

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11238006A (en) * 1998-02-19 1999-08-31 Nippon Telegr & Teleph Corp <Ntt> Data cleaning method and device therefor and recording medium recording data cleaning processing program
CN104331435A (en) * 2014-10-22 2015-02-04 国家电网公司 Low-influence high-efficiency mass data extraction method based on Hadoop big data platform
CN104361110A (en) * 2014-12-01 2015-02-18 广东电网有限责任公司清远供电局 Mass electricity consumption data analysis system as well as real-time calculation method and data mining method
CN104820670A (en) * 2015-03-13 2015-08-05 国家电网公司 Method for acquiring and storing big data of power information
CN105069703A (en) * 2015-08-10 2015-11-18 国家电网公司 Mass data management method of power grid
CN105786864A (en) * 2014-12-24 2016-07-20 国家电网公司 Offline analysis method for massive data

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133355A (en) * 2017-05-25 2017-09-05 国网天津市电力公司 Line parameter calculation and data management and control method within a regional dispatching range
CN109696316A (en) * 2017-10-20 2019-04-30 株洲中车时代电气股份有限公司 Train remote supervision system
CN109801181A (en) * 2017-11-17 2019-05-24 中国电力科学研究院有限公司 Switching data cleaning and repair method and system
CN108021935A (en) * 2017-11-27 2018-05-11 中国电力科学研究院有限公司 Dimensionality reduction method and device based on big data technology
CN108021935B (en) * 2017-11-27 2024-01-23 中国电力科学研究院有限公司 Dimension reduction method and device based on big data technology
CN108170825A (en) * 2018-01-05 2018-06-15 上海电气分布式能源科技有限公司 Distributed energy data monitoring cleaning method based on cloud platform
CN108335231A (en) * 2018-01-29 2018-07-27 国网福建省电力有限公司 Automatic-matching power distribution network data diagnosis method
CN108563770A (en) * 2018-04-20 2018-09-21 南京邮电大学 Scene-based KPI and multi-dimensional network data cleaning method
CN108563770B (en) * 2018-04-20 2022-05-17 南京邮电大学 Scene-based KPI and multi-dimensional network data cleaning method
CN109492877A (en) * 2018-10-16 2019-03-19 安徽医科大学第附属医院 Hospital Informatization evaluation method
CN109977107A (en) * 2019-04-02 2019-07-05 电子科技大学 Method for cleaning electricity consumption collected data
CN109977107B (en) * 2019-04-02 2022-04-05 电子科技大学 Method for cleaning power utilization collected data
CN110287256A (en) * 2019-06-14 2019-09-27 南京邮电大学 Cloud computing-based power grid data parallel processing system and processing method thereof
CN110287256B (en) * 2019-06-14 2022-10-14 南京邮电大学 Cloud computing-based power grid data parallel processing system and processing method thereof
CN110348716A (en) * 2019-06-28 2019-10-18 国网河北省电力有限公司电力科学研究院 Background control platform and method for power distribution equipment
CN110825798A (en) * 2019-10-29 2020-02-21 深圳供电局有限公司 Electric power application data maintenance method and device
CN110943983B (en) * 2019-11-22 2020-10-30 南京邮电大学 Network security prevention method based on security situation awareness and risk assessment
CN110943983A (en) * 2019-11-22 2020-03-31 南京邮电大学 Network security prevention method based on security situation awareness and risk assessment
CN111901158A (en) * 2020-07-14 2020-11-06 广东科徕尼智能科技有限公司 Intelligent home distribution network fault data analysis method, equipment and storage medium
CN112104073A (en) * 2020-08-19 2020-12-18 厦门盈盛捷电力科技有限公司 Real-time information calibration method for power system
CN112069269A (en) * 2020-08-27 2020-12-11 黄天红 Big data and multidimensional feature-based data tracing method and big data cloud server
CN113159326A (en) * 2021-03-03 2021-07-23 国网山西省电力公司信息通信分公司 Intelligent business decision method based on artificial intelligence
CN113159326B (en) * 2021-03-03 2024-02-23 国网山西省电力公司信息通信分公司 Intelligent business decision method based on artificial intelligence

Also Published As

Publication number Publication date
CN106709035B (en) 2019-11-26

Similar Documents

Publication Publication Date Title
CN106709035B (en) Preprocessing system for electric power multi-dimensional panoramic data
CN104820670B (en) Method for acquiring and storing big data of power information
CN105956015A (en) Service platform integration method based on big data
CN102982097B (en) Domains for knowledge-based data quality solution
CN104881424A (en) Regular expression-based acquisition, storage and analysis method of power big data
CN103577605A (en) Data warehouse based on data fusion and data mining and application method of data warehouse
CN104318481A (en) Power-grid-operation-oriented holographic time scale measurement data extraction conversion method
CN102902752A (en) Method and system for monitoring log
CN111552813A (en) Power knowledge graph construction method based on power grid full-service data
CN106779219A (en) Electricity demand forecasting method and system
CN112395289B (en) Distributed photovoltaic data layered storage method and system
CN113064866A (en) Power business data integration system
CN107944465A (en) Unsupervised fast clustering method and system suitable for big data
CN106339451A (en) Data mining system based on large data
CN110287237B (en) Social network structure analysis based community data mining method
CN107590225A (en) Visualized management system based on distributed data mining algorithm
CN112200209A (en) Poor user identification method based on day-to-day power consumption
CN103902582B (en) Method and apparatus for reducing data warehouse data redundancy
CN113094448B (en) Analysis method and analysis device for residence empty state and electronic equipment
CN113254517A (en) Service providing method based on internet big data
CN108986113A (en) Block-parallel multi-scale segmentation algorithm based on LLTS framework
CN116662860A (en) User portrait and classification method based on energy big data
CN116011564A (en) Entity relationship completion method, system and application for power equipment
CN109739840A (en) Data processing empty value method, apparatus and terminal device
CN106652032B (en) Parallel DEM contour line generation method based on Linux cluster platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant