CN105069703A - Mass data management method of power grid - Google Patents

Mass data management method of power grid

Info

Publication number
CN105069703A
Authority
CN
China
Prior art keywords
data
power grid
decision tree
attribute
mapreduce
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510487734.1A
Other languages
Chinese (zh)
Other versions
CN105069703B (en)
Inventor
刘志刚
魏晓光
陈剑飞
刘小宝
戴昭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN201510487734.1A
Publication of CN105069703A
Application granted
Publication of CN105069703B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a mass data management method of a power grid. The method comprises the following steps: constructing a power grid user data management system, integrating data collected by each power grid subsystem, and utilizing a parallel computing frame to mine and analyze power grid user data; and on the basis of the data management system, utilizing a distributed load prediction algorithm to realize parallel load prediction. The invention provides the mass data management method of the power grid, the data of each system of the power grid user is fused and integrated, and a traditional data computation method is migrated to a distributed platform to meet the operation requirements of the mass data.

Description

A mass data management method for a power grid
Technical field
The present invention relates to smart grids, and in particular to a mass data management method for a power grid.
Background
Collecting, transmitting, and storing real-time data from power grid users, and rapidly analyzing the accumulated mass of multi-source historical data, can effectively improve demand management; managing and processing user data supports the secure, robust, and reliable operation of the smart grid. As the number of sensors and smart devices keeps increasing, the data that these devices acquire and transmit is growing exponentially. This data includes not only the power consumption collected by smart meters, but also temperature, weather, humidity, geographic information, wind speed, and other information that sensors collect at fixed frequencies. The complexity of user data is increasing accordingly.
China's power generation and transmission technology differs little from that of other countries, but power distribution and consumption, particularly on the user side, differ considerably. Because a suitable market mechanism has not yet formed, the conditions for deploying smart power-consumption technology in China are not yet mature, which makes it difficult to support effective integration of intelligent power distribution systems and user management systems. In general, mass data management for power grid users faces the following challenges. First, with the rapid development of smart meters and Internet of Things technology, the mass data they produce takes widely varying forms, and data definitions differ between organizations, so the data is difficult to process and integrate. How to build a model that gives mass data a standardized representation, and how to integrate data on the basis of that model, are problems in urgent need of solution. Second, because data is collected in many different ways and the quality of communication channels varies, the received data is of poor quality and the ability to manage and control it is insufficient; knowledge mined from such poor-quality data is therefore unreliable and cannot support accurate decisions. This has had adverse effects worldwide and is a serious problem for the information society. Third, data types are complex, and traditional relational databases and file storage formats cannot meet the demands of rapidly growing mass data.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a mass data management method for a power grid, comprising:
building a power grid user data management system, integrating the data collected by each power grid subsystem, and using a parallel computing framework to mine and analyze power grid user data; and, based on the data management system, using a distributed load prediction algorithm to realize parallel load prediction.
Preferably, the architecture of the power grid user data management system is divided into an application layer, a data analysis and computation layer, and a data management layer. The power grid user data management system is built with Hadoop; on this platform, HDFS and HBase are used to establish the data storage system, and a MapReduce parallel computing framework and a Storm in-memory parallel computing framework are built as the mass data computation and analysis system, which analyzes the mass data of power grid users. The data management layer collects and integrates data. Data collection covers data gathered from smart meters, data acquisition and monitoring systems, and various sensors; integration of this data comprises migrating it to cluster servers for management. During data integration, a data migration tool is used to extract and integrate the data: the data produced by each independent system, together with historical data, is extracted and integrated into HBase with the data migration tool; a Java persistence tool is used to operate on the column-oriented database; and the online data produced by applications based on distributed computing is written into HBase. The data analysis and computation layer stores, computes on, and analyzes the mass data. HBase stores the power load data and related data. The parallel computing module MapReduce performs parallel batch computation and analysis on the mass data, and the in-memory parallel computing module Storm is used for data-intensive iterative computation: the data required by the business is read into memory and is queried directly from memory when needed.
Preferably, using a distributed load prediction algorithm to realize parallel load prediction based on the data management system further comprises:
executing the training process of the algorithm with three MapReduce job classes, the output of each MapReduce job serving as the input of the next, and saving the decision model obtained when training finishes in the Hadoop distributed cluster, the training being divided into three parts: generating a data dictionary; generating decision trees; and forming the decision tree ensemble;
wherein generating the data dictionary comprises describing the training sample data: a file is produced that describes the condition attributes and the decision attribute of the samples, recording the types of the condition attribute values, the position of the decision attribute, and whether the model to be created performs classification or regression. This process is completed by the first MapReduce job; each Map process reads a part of the experimental data and records the attribute types of the data and the load value or class label. The resulting description file is stored in the Hadoop file system HDFS in key/value form;
wherein generating the decision trees comprises the following parallel procedure:
1) K sample data sets TS_1, TS_2, ..., TS_K, each the same size as the original sample data set, are drawn at random from the original data set with replacement; each sample data set is the training set of one decision tree, and the sample data sets differ from one another while being the same size as the original data set;
2) the number m of attributes randomly selected at each node is determined from the number M of attributes in the sample data, where m << M; in a classification model, m is the square root of M, and in a regression model, m is one third of M; the information content of each of the m attributes is computed, and the best attribute is selected for branching;
3) nodes are built recursively to generate a decision tree; the K decision trees are generated in parallel, with one Map generating one decision tree, and this process is completed by the second MapReduce job;
wherein forming the decision tree ensemble comprises combining the individual decision tree classifiers; each decision tree produces one result; if the ensemble is used for classification, the final result is chosen by voting; when it is used for regression prediction, the K trees give K values and the final value is the mean over the trees; this process is completed by the third MapReduce job.
Preferably, in the deployment architecture of the HBase system, a dispatch center acts as the manager of the entire distributed real-time database and stores metadata, including key information such as the division of labor among nodes, node state, data partitioning scheme, data block location, task scheduling, and security management; the dispatch centers keep their metadata consistent with one another through a synchronization mechanism; the nodes of the data analysis and computation layer are logical peers that deploy the same processes and perform the same logical operations, and a transaction-based redundant backup mechanism is used among them; the power grid user data management system uses HDFS as the underlying distributed file system, on which a time-series control component for power grid mass data is built to store the time series data of power grid services; the time-series control component builds a time series data model, collected time series data is received and stored uniformly according to this specific model, and a unified query interface is provided externally;
in terms of storage, data is stored in key-value form, that is, column-oriented, with the column family as the basic unit of storage and access control; empty columns occupy no space in actual storage, following a sparse-table design. In terms of deployment, the traditional client/server pattern of many clients and a single server is abandoned in favor of a distributed multi-server cluster, and all data is stored across multiple computers in the cluster according to a replication factor. The time-series control component depends on the column-oriented database at its lowest level; processing time series data is abstracted into basic operations on the HBase database: read, write, insert, delete, and modify. The top software layer consists of the client of the time-series control component and third-party application clients. All clients perform operations through a Java API; a type-parsing module decomposes each API call by function into one database operation or an ordered sequence of database operations; these sets of database operations are invoked through RPC inside the control component, and the data operations are finally completed uniformly through the asynchronous HBase operation API.
Compared with the prior art, the present invention has the following advantages:
The present invention proposes a mass data management method for a power grid that fuses and integrates the data of the various power grid user systems and migrates traditional data computation methods to a distributed platform, meeting the computational requirements of mass data.
Embodiments
The following is a detailed description of one or more embodiments of the present invention. The invention is described in conjunction with such embodiments, but is not limited to any embodiment. The scope of the invention is defined only by the claims, and the invention encompasses many alternatives, modifications, and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for exemplary purposes, and the invention may also be practiced according to the claims without some or all of these details.
One aspect of the present invention provides a method for processing the mass data of power grid users. A Hadoop cluster is used to build the basic management system for mass data; the data collected by each power grid subsystem is integrated into mass data storage, and a parallel computing framework is used to mine and analyze the mass data of power grid users quickly. For the power load prediction application, traditional load prediction is migrated to a distributed computing platform, and a decision-tree-based load prediction algorithm is used to realize parallel load prediction. In view of the practical needs of mass data analysis for power grid users, the invention builds a power grid user data management system centered on computation and analysis, whose basic architecture is divided into an application layer, a data analysis and computation layer, and a data management layer.
This architecture uses Hadoop to build the power grid user data management system. On the platform, HDFS and HBase are used to establish the mass data storage system, and a MapReduce parallel computing framework and a Storm in-memory parallel computing framework are built as the mass data computation and analysis system, which analyzes the mass data of power grid users.
The data management layer collects and integrates data. Data collection covers data gathered from smart meters, data acquisition and monitoring systems, and various sensors. This data includes not only data internal to the power grid but also a large amount of related data. It is produced by equipment from different vendors, takes widely varying forms, and follows data definitions that differ between organizations; it forms a massive data stream that is difficult to process and integrate. Integrating this data means migrating the data produced by legacy systems to cluster servers, where it is managed efficiently.
To address this integration difficulty, the platform uses a data migration tool to extract and integrate data: the data produced by each independent system, together with historical data, is extracted and integrated into HBase with the data migration tool. A Java persistence tool is used to operate on the column-oriented database, and the online data produced by applications based on distributed computing is written into HBase.
The data analysis and computation layer provides storage, computation, and analysis of mass data. The distributed computing layer is built with Hadoop; mass data is stored in the distributed file system HDFS, and HBase is used to manage the data.
The platform uses HBase to store power load data and related data. HBase uses the column as its storage unit, which makes it convenient to query entire columns; the prediction algorithm used later must repeatedly read entire columns during learning, so the storage characteristics of HBase meet these data access requirements.
The parallel computing module MapReduce performs parallel batch computation and analysis on mass data, and the in-memory parallel computing module Storm is used for data-intensive iterative computation. Storm provides an in-memory parallel computing framework: the framework reads the data required by the business into memory, and the data is queried directly from memory when needed. This is faster than the disk-based data access of MapReduce, and it reduces both the running time of the business and the number of I/O operations.
Load prediction is a key link in power grid planning and an important computational basis for substation and grid-structure planning; high-precision short-term load prediction can effectively reduce generation costs and plays a key role. The present invention uses an improved ensemble learning method with the decision tree as the basic learning unit. The ensemble contains multiple decision trees trained by the random subspace method; a sample to be classified is input, each decision tree produces a classification result, and the final classification result is chosen by a vote over the results of the individual trees. This overcomes some shortcomings of a single decision tree and offers good scalability and parallelism, so it can effectively address fast processing of mass data and has good application prospects for power load prediction in a mass data environment.
The whole load prediction process executes the training process of the algorithm with three MapReduce job classes, the output of each MapReduce job serving as the input of the next. The decision model obtained when training finishes is saved in the Hadoop distributed cluster. Training is divided into three parts: generating a data dictionary; generating decision trees; and forming the decision tree ensemble. Generating the data dictionary means describing the training sample data: a file is produced that describes the condition attributes and the decision attribute of the samples, recording the types of the condition attribute values, the position of the decision attribute, and whether the model to be created performs classification or regression. This process is completed by the first MapReduce job; each Map process reads a part of the experimental data and records the attribute types of the data and the load value or class label. The resulting description file is stored in the Hadoop file system HDFS in key/value form for use by the subsequent MapReduce jobs.
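The data dictionary step can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the patented implementation: the function names `map_describe` and `reduce_describe`, the dictionary layout, and the rule "a value that parses as a number is numeric" are all hypothetical, and the map and reduce steps are simulated in plain Python rather than run on Hadoop.

```python
from collections import defaultdict


def map_describe(rows):
    """Map step (simulated): for one split of samples, emit (column index, observed type)."""
    for row in rows:
        for idx, value in enumerate(row):
            try:
                float(value)
                yield idx, "numeric"
            except ValueError:
                yield idx, "categorical"


def reduce_describe(pairs, label_index, task):
    """Reduce step (simulated): merge the observed types into one description per column."""
    seen = defaultdict(set)
    for idx, kind in pairs:
        seen[idx].add(kind)
    # A column is numeric only if every observed value parsed as a number.
    attributes = {
        idx: ("numeric" if kinds == {"numeric"} else "categorical")
        for idx, kinds in seen.items()
    }
    # Hypothetical dictionary layout: attribute types, position of the
    # decision attribute, and whether the model classifies or regresses.
    return {"attributes": attributes, "label_index": label_index, "task": task}


# Two input splits, as if read by two Map processes; the last column is the load value.
split_1 = [["12.5", "sunny", "310.0"], ["14.1", "rain", "295.5"]]
split_2 = [["9.8", "cloudy", "330.2"]]
pairs = list(map_describe(split_1)) + list(map_describe(split_2))
dictionary = reduce_describe(pairs, label_index=2, task="regression")
```

In a real deployment the resulting dictionary would be written to HDFS in key/value form, as the description states; the sketch keeps it in memory.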
Generating the decision trees is the core of the whole parallel algorithm. Its parallelism lies in the following aspects: 1) K sample data sets TS_1, TS_2, ..., TS_K, each the same size as the original sample data set, are drawn at random from the original data set with replacement. Because sampling is with replacement, the draws from the original data set can run in parallel without affecting one another. Each TS is the training set of one decision tree; the TS differ from one another and are the same size as the original data set, which both guarantees diversity among the decision trees and preserves the scale of knowledge in the original data set.
2) The number m of attributes randomly selected at each node (m << M) is determined from the number M of attributes in the sample data; in a classification model, m is the square root of M, and in a regression model, m is one third of M. The information content of each of the m attributes is computed, and the best attribute is selected for branching;
3) Nodes are built recursively to generate a decision tree. The K decision trees are generated in parallel, with one Map generating one decision tree, which parallelizes the algorithm. This process is completed by the second MapReduce job, which has only a Map phase and no Reduce phase.
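Steps 1) and 2) above can be sketched as follows. This is a minimal illustration, assuming in-memory lists and Python's standard `random` module; the function names are hypothetical, and rounding m down to an integer (with a minimum of 1) is an assumption the description does not spell out.

```python
import math
import random


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) samples with replacement: one training set TS."""
    return [dataset[rng.randrange(len(dataset))] for _ in range(len(dataset))]


def attributes_per_node(total_attributes, task):
    """m = sqrt(M) for classification, M / 3 for regression (rounded down, at least 1)."""
    if task == "classification":
        return max(1, int(math.sqrt(total_attributes)))
    return max(1, total_attributes // 3)


def candidate_attributes(total_attributes, task, rng):
    """Randomly choose the m attribute indices considered at one node."""
    m = attributes_per_node(total_attributes, task)
    return sorted(rng.sample(range(total_attributes), m))


rng = random.Random(0)        # fixed seed so the sketch is repeatable
data = list(range(10))        # stand-in for the original sample data set
K = 3
training_sets = [bootstrap_sample(data, rng) for _ in range(K)]
```

Each element of `training_sets` would be handed to one Map task, which grows one tree; because the draws are independent, they need no coordination between tasks.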
Forming the decision tree ensemble means combining the individual decision tree classifiers. Each decision tree produces one result. If the ensemble is used for classification, the final result is chosen by voting; when it is used for regression prediction, the K trees give K values and the final value is the mean over the trees. This process is completed by the third MapReduce job.
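The combination rule can be written in a few lines. The sketch below assumes the per-tree results have already been collected into a list; the function name `aggregate` is hypothetical, and how ties in the vote are broken is not specified in the description (here the first label to reach the top count wins).

```python
from collections import Counter
from statistics import mean


def aggregate(predictions, task):
    """Combine the results of K trees: majority vote or arithmetic mean."""
    if task == "classification":
        # most_common(1) returns the label with the highest vote count.
        return Counter(predictions).most_common(1)[0][0]
    return mean(predictions)
```

For example, `aggregate(["high", "low", "high"], "classification")` picks the label that most trees voted for, and `aggregate([300.0, 310.0, 320.0], "regression")` averages the K predicted load values.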
The whole model is built on the Hadoop distributed cluster: mass data is stored in a distributed manner and the algorithm is parallelized with MapReduce, so that data mining and predictive computation over the full sample set S rely on the storage and computing capacity of the Hadoop cluster. The whole process executes in parallel, which can effectively improve prediction accuracy and the ability of the load prediction system to process mass data.
In the deployment architecture of the HBase system described above, the dispatch center acts as the manager of the entire distributed real-time database and stores metadata, including key information such as the division of labor among nodes, node state, data partitioning scheme, data block location, task scheduling, and security management. Two dispatch centers are generally deployed (more may be used); they keep their metadata consistent with one another through a synchronization mechanism, which eliminates the risk that a single point of failure at the dispatch center disables the whole system and also lays a foundation for load balancing of concurrent requests. The nodes of the data analysis and computation layer store the shards of the mass data and carry out the various computations; their number is limited only by hard constraints such as Ethernet bandwidth and the physical conditions of the machine room. The nodes are logical peers: they deploy the same processes and perform the same logical operations, and each stores only the data belonging to its own partition, according to the partitioning rules that the dispatch center applies to the data, which achieves distributed storage. Because node failures and faults occur frequently in a distributed architecture, a transaction-based redundant backup mechanism is used among the nodes: the same transactional operation is synchronized to one or several other nodes (the number depends on a configurable replication factor), which gives high data reliability and lays a foundation for load balancing of data access.
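One way to picture partitioned storage with a replication factor is the sketch below. The placement rule is an assumption made for illustration: the description does not say how the dispatch center maps data to nodes, so a CRC32 hash modulo the node count stands in for it, and replicas go to the following nodes in list order. The function name and node names are hypothetical.

```python
import zlib


def replica_nodes(key, nodes, replication_factor):
    """Return the nodes that hold a key: a primary plus its replicas."""
    # Assumed partition rule: a stable hash of the key selects the primary node.
    start = zlib.crc32(key.encode("utf-8")) % len(nodes)
    # Never place more copies than there are nodes.
    count = min(replication_factor, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


nodes = ["node-a", "node-b", "node-c"]
replicas = replica_nodes("meter-42", nodes, replication_factor=2)
```

Because the nodes are logical peers, any of them can serve as primary for some keys and as replica for others, which is what makes it possible to spread read load across the copies.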
The power grid user data management system uses HDFS as the underlying distributed file system, and on this basis a time-series control component for power grid mass data is built to store the time series data of power grid services. The time-series control component builds a time series data model; collected time series data is received and stored uniformly according to this specific model, and a unified query interface is provided externally.
In terms of storage, unlike the row-oriented table structure of a traditional relational database, data is stored in key-value form, that is, column-oriented, with the column family as the basic unit of storage and access control. Empty columns occupy no space in actual storage, following a sparse-table design. This solves the problem of wasted space caused by differing sampling periods. In terms of deployment, the traditional client/server pattern of many clients and a single server is abandoned in favor of a distributed multi-server cluster: all data is stored across multiple computers in the cluster according to a replication factor, which strengthens the storage security of the data and improves query efficiency.
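The sparse-table idea can be shown with a toy in-memory model. This is not the HBase API; the class `SparseTable` and its methods are hypothetical and only illustrate that a cell exists only when it has a value, so series sampled at different periods do not pad each other with empty columns.

```python
class SparseTable:
    """Toy key-value table: cells are keyed by (row, column family, qualifier)."""

    def __init__(self):
        self._cells = {}

    def put(self, row, family, qualifier, value):
        # An empty value is simply not stored, so it occupies no space.
        if value is None:
            return
        self._cells[(row, family, qualifier)] = value

    def get(self, row, family, qualifier):
        # Missing cells read back as None.
        return self._cells.get((row, family, qualifier))

    def cell_count(self):
        return len(self._cells)


table = SparseTable()
table.put("meter-1", "load", "t0", 3.2)
table.put("meter-1", "weather", "t0", None)   # no reading: nothing is stored
table.put("meter-2", "weather", "t0", 21.5)
```

Here three writes are attempted but only two cells are kept, which is the space saving that the sparse-table design is meant to provide.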
The time-series control component depends on the column-oriented database at its lowest level. Processing time series data can be abstracted into basic operations on the HBase database such as read, write, insert, delete, and modify. The top software layer consists of the client of the time-series control component and third-party application clients. All clients perform operations through a Java API. A type-parsing module decomposes each API call by function into one database operation or an ordered sequence of database operations. These sets of database operations are invoked through RPC inside the control component, and the data operations are finally completed uniformly through the asynchronous HBase operation API.
A time series record consists of four fields: measuring object, timestamp, measured value, and tags. The tags consist of one or more key/value pairs that further describe the measuring object; a measuring object combined with its tags forms a measurement item. The tag design makes it easy for users to query the values of the measurement items they care about. The control component uses a storage layer to store data; the storage layer is a distributed file storage system with a key/value structure. Storing time series data efficiently in the distributed storage layer, and easily holding more than ten billion data points with minimal memory and disk space, is the key problem that a good storage structure design must solve. To this end, the design of the column-oriented HBase tables on which the distributed real-time database management layer relies follows these principles. The primary key of the time-series control component has a fixed length and should contain as much retrieval information as possible. The stored data generally contains a large number of measuring objects and tags, and these fields are of variable length; therefore an ID table is set up to store this information under globally unique numbers, and the number is merged with the timestamp to form the primary key. Each row should store as much information as possible; for example, the data collected in a distributed manner during a given time period is combined and submitted as one row. This reduces the number of row keys in the whole table and thus speeds up row retrieval. Data is stored as it extends over time, and a stateless storage scheme is adopted, which provides system fault tolerance.
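The four-field record and the notion of a measurement item can be sketched as a small data structure. The class and field names are hypothetical, and the filter function `select` is only an illustration of how tags let a user pick out the measurement items of interest; it is not the query interface of the described system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TimeSeriesRecord:
    measuring_object: str               # what is measured, e.g. "load"
    timestamp: int                      # assumed here to be seconds since the epoch
    value: float                        # measured value
    tags: Tuple[Tuple[str, str], ...]   # one or more (key, value) pairs

    def item(self):
        """Measurement item: the measuring object combined with its tags."""
        return (self.measuring_object, tuple(sorted(self.tags)))


def select(records, measuring_object, **tags):
    """Return the records of one measuring object whose tags include all given pairs."""
    wanted = set(tags.items())
    return [
        r for r in records
        if r.measuring_object == measuring_object and wanted <= set(r.tags)
    ]


records = [
    TimeSeriesRecord("load", 1000, 3.2, (("feeder", "F1"), ("phase", "A"))),
    TimeSeriesRecord("load", 1000, 2.9, (("feeder", "F2"), ("phase", "A"))),
    TimeSeriesRecord("temperature", 1000, 21.5, (("station", "S1"),)),
]
```

For instance, `select(records, "load", feeder="F1")` returns only the first record, while `select(records, "load", phase="A")` returns both load records.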
The key and value of each measuring object and tag are all numbered by hash mapping. To improve query efficiency, this mapping information is stored in the ID table in two copies: one maps measuring objects, tag keys, and tag values to their hash numbers, and the other maps hash numbers back to measuring objects, tag keys, and tag values. All hash numbers have a fixed length of 3 bytes. The time series data of measuring objects is stored in another table, whose row key is: measuring object ID + base time + tag key ID + tag value ID, where the base time field is the hour-aligned time (the top of the hour) corresponding to the time series record being stored; the base time is 4 bytes and every other field is 3 bytes. The time series data within one hour is stored in one row of the table; each record is stored, by column, under the column corresponding to its offset Δt from the base time, where Δt = record timestamp - base time. When a row is full, the next row is opened and storage continues.
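The row key layout described above (measuring object ID, base time, tag key ID, tag value ID) can be sketched as byte packing. The field widths, the field order, and the offset Δt come from the description; the assumptions, which the description does not state, are that timestamps are in seconds, that integers are packed big-endian, and that the IDs have already been looked up in the ID table. The function names are hypothetical.

```python
import struct

ID_WIDTH = 3              # every ID field is 3 bytes
SECONDS_PER_HOUR = 3600   # one row holds one hour of data


def encode_id(number):
    """Pack an ID into a fixed 3-byte field (big-endian is an assumption)."""
    return number.to_bytes(ID_WIDTH, "big")


def row_key(object_id, timestamp, tag_key_id, tag_value_id):
    """Return (row key bytes, column offset delta_t) for one record."""
    # Base time: the timestamp rounded down to the top of the hour.
    base_time = timestamp - timestamp % SECONDS_PER_HOUR
    key = (
        encode_id(object_id)
        + struct.pack(">I", base_time)    # 4-byte base time
        + encode_id(tag_key_id)
        + encode_id(tag_value_id)
    )
    # delta_t = record timestamp - base time selects the column within the row.
    return key, timestamp - base_time


key, delta_t = row_key(object_id=1, timestamp=7265, tag_key_id=2, tag_value_id=3)
```

With these inputs the base time is 7200 and the record lands 65 seconds into the row, so the key is 13 bytes long (3 + 4 + 3 + 3). Every record of the same measurement item within that hour shares the key and differs only in its column.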
Obviously, those skilled in the art should understand that each module or step of the present invention described above can be implemented with a general-purpose computing system. The modules or steps can be concentrated on a single computing system or distributed over a network formed by multiple computing systems. Optionally, they can be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by a computing system. The present invention is thus not restricted to any specific combination of hardware and software.
It should be understood that the above embodiments of the present invention serve only to illustrate or explain the principles of the invention and do not limit the invention. Therefore, any modification, equivalent replacement, improvement, and the like made without departing from the spirit and scope of the present invention shall fall within the protection scope of the invention. In addition, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or within equivalents of such scope and boundaries.

Claims (4)

1. A mass data management method for a power grid, characterized by comprising:
building a power grid user data management system, integrating the data collected by each power grid subsystem, and using a parallel computing framework to mine and analyze power grid user data; and, based on the data management system, using a distributed load prediction algorithm to realize parallel load prediction.
2. The method according to claim 1, characterized in that the architecture of the power grid user data management system is divided into an application layer, a data analysis and computation layer, and a data management layer; the power grid user data management system is built with Hadoop; on the platform, HDFS and HBase are used to establish the data storage system, and a MapReduce parallel computing framework and a Storm in-memory parallel computing framework are built as the mass data computation and analysis system, which analyzes the mass data of power grid users; the data management layer collects and integrates data; the data collection covers data gathered from smart meters, data acquisition and monitoring systems, and various sensors, and integration of this data comprises migrating it to cluster servers for management; during data integration, a data migration tool is used to extract and integrate the data: the data produced by each independent system, together with historical data, is extracted and integrated into HBase with the data migration tool, a Java persistence tool is used to operate on the column-oriented database, and the online data produced by applications based on distributed computing is written into HBase; the data analysis and computation layer stores, computes on, and analyzes the mass data; HBase stores the power load data and related data; the parallel computing module MapReduce performs parallel batch computation and analysis on the mass data, and the in-memory parallel computing module Storm is used for data-intensive iterative computation, the data required by the business being read into memory and queried directly from memory when needed.
3. method according to claim 2, is characterized in that, described based on described data management system, utilizes distributed terminator prediction algorithm to realize parallel load prediction, comprises further:
Utilize the training process of 3 MapReduce service class execution algorithms, the output of each MapReduce is as the input of thereafter, the decision-making module obtained after training terminates is kept in the distributed type assemblies of Hadoop, and it is divided into three parts: generate data dictionary; Generate decision tree; Form decision tree set;
The sample data that wherein said generation data dictionary comprises training is described, produce a file and describe sample conditional attribute and decision attribute, the type of record condition property value and the position of decision attribute, and the module that will create carries out classifying or regressing calculation, this process is completed by first MapReduce, each Map process reads a part for experimental data, the attribute type of record data and load value or type identification; The description document produced is stored in the file system HDFS of Hadoop with the form of key/value;
wherein generating the decision trees comprises the following parallel procedure:
1) randomly drawing from the original data set, with replacement, K sample data sets TS_1, TS_2, ..., TS_K, each the same size as the original sample data set; each sample data set serves as the training set of one decision tree, and the sample data sets differ from one another while each has the same size as the original data set;
2) determining, from the number M of attributes in the sample data, the number m of attributes randomly selected at each node, where m << M; in a classification model m is the square root of M, and in a regression model m is 1/3 of M; computing the amount of information of each of the m attributes and selecting the best attribute for the split;
3) building nodes recursively to generate a decision tree; the K decision trees are generated in parallel, each Map generating one decision tree, and this process is completed by the second MapReduce job;
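Steps 1) and 2) above can be sketched as follows. The sketch makes several assumptions that the claim does not state: the "amount of information" is taken to be Shannon information gain over categorical attributes, the square root and the one-third fraction are rounded down, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap_samples(dataset, k, seed=0):
    """Step 1: draw k samples with replacement, each as large as the dataset."""
    rng = random.Random(seed)
    n = len(dataset)
    return [[dataset[rng.randrange(n)] for _ in range(n)] for _ in range(k)]


def attributes_per_node(total_attributes, task):
    """Step 2: m = sqrt(M) for classification, M/3 for regression (rounded down)."""
    if task == "classification":
        return max(1, int(math.sqrt(total_attributes)))
    if task == "regression":
        return max(1, total_attributes // 3)
    raise ValueError(f"unknown task: {task}")


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the categorical attribute attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, candidates):
    """Pick the candidate attribute with the highest information gain."""
    return max(candidates, key=lambda a: information_gain(rows, labels, a))
```

In a MapReduce setting each bootstrap sample would be handed to one Map task, which would apply `best_attribute` recursively on a fresh random subset of `attributes_per_node(M, task)` attributes at every node; the recursion itself is omitted here.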
wherein forming the decision tree ensemble comprises combining the individual decision tree classifiers; each decision tree produces a result; if the ensemble is used for classification, the final result is chosen by voting; when it is used for regression prediction, the K trees provide K values and the final value is the mean over the trees; this process is completed by the third MapReduce job.
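The combination rule of the third stage is simple enough to show directly. The following is a minimal sketch, assuming the per-tree predictions have already been collected into a list; the function name `aggregate` and the tie-breaking behavior (the label encountered first wins) are assumptions, since the claim does not specify how ties are resolved.

```python
from collections import Counter
from statistics import mean


def aggregate(predictions, task):
    """Combine the outputs of K trees: majority vote or arithmetic mean."""
    if not predictions:
        raise ValueError("at least one tree prediction is required")
    if task == "classification":
        # most_common(1) returns the label with the highest vote count.
        return Counter(predictions).most_common(1)[0][0]
    if task == "regression":
        return mean(predictions)
    raise ValueError(f"unknown task: {task}")
```

For example, three trees voting `"peak"`, `"off-peak"`, `"peak"` yield `"peak"`, and three regression trees predicting loads of 300.0, 310.0, and 320.0 yield 310.0.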
4. The method according to claim 3, characterized in that, in the deployment architecture of the HBase system, a dispatching center acts as the manager of the whole distributed real-time database and stores metadata information, including key information on the division of labor among nodes, node state, data partitioning scheme, data block location, task scheduling, and security management; the dispatching centers keep their metadata consistent with one another through a synchronization mechanism; the nodes of the data analysis and computation layer are logically peers, deploying the same processes and performing the same logical operations; the data analysis and computation layer adopts a transaction-based redundant backup mechanism; the power grid user data management system uses HDFS as the underlying distributed file system for storage; a time-series control component oriented to power grid mass data is built to store the time series data of the power grid services; a time series data model is built through the time-series control component, the collected time series data are received and stored uniformly according to that specific model, and a unified query interface is provided externally;
in terms of storage, data are stored in key-value form, that is, in column-oriented storage, with the column family as the basic unit of storage and access control; a sparse-table design is used, so that empty columns occupy no space in actual storage; in terms of data architecture deployment, the traditional client/server pattern of multiple clients and a single server is abandoned in favor of a distributed multi-server cluster, and all data are stored dispersed across multiple computers in the cluster according to the replication factor; the bottom layer of the time-series control component relies on the column-oriented database, and the concrete processing of time series data is abstracted into the basic operations of reading, writing, adding, deleting, and modifying on the HBase database; the top software layer consists of the client of the time-series control component and third-party application clients; all clients perform concrete operations through the Java API; each API call is parsed and decomposed by the type parsing module into a single database operation or an ordered sequence of database operations; these sets of database operations are invoked through RPC inside the control component, and the data operations are finally completed uniformly through the asynchronous HBase operation API.
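The sparse, column-family-oriented key-value layout described in this claim can be illustrated with a small in-memory model. This is only a conceptual sketch and does not use the HBase client API: the class `SparseTable`, its methods, and the row-key format in the usage note are all hypothetical.

```python
class SparseTable:
    """In-memory model of a sparse table keyed by (row key, column family, qualifier)."""

    def __init__(self):
        # Only non-empty cells are stored, so empty columns take no space.
        self._cells = {}

    def put(self, row_key, family, qualifier, value):
        """Write a cell; writing None removes the cell instead of storing it."""
        key = (row_key, family, qualifier)
        if value is None:
            self._cells.pop(key, None)
        else:
            self._cells[key] = value

    def get(self, row_key, family, qualifier):
        """Read a cell, returning None when the cell was never stored."""
        return self._cells.get((row_key, family, qualifier))

    def row(self, row_key, family):
        """Return all stored qualifiers of one column family for a row."""
        return {
            q: v
            for (r, f, q), v in self._cells.items()
            if r == row_key and f == family
        }

    def cell_count(self):
        """Number of physically stored cells."""
        return len(self._cells)
```

With a hypothetical row key such as `"meter42#2015-08-10T00:00"`, writing a load value under the `load` family and leaving the voltage empty stores exactly one cell, which is the property the sparse-table design relies on.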
CN201510487734.1A 2015-08-10 2015-08-10 A kind of electrical network mass data management method Expired - Fee Related CN105069703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510487734.1A CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510487734.1A CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Publications (2)

Publication Number Publication Date
CN105069703A (en) 2015-11-18
CN105069703B (en) 2018-08-28

Family

ID=54499061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510487734.1A Expired - Fee Related CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Country Status (1)

Country Link
CN (1) CN105069703B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820670B (en) * 2015-03-13 2018-11-06 华中电网有限公司 A kind of acquisition of power information big data and storage method

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105302500A (en) * 2015-11-24 2016-02-03 中国科学技术大学 Distributed type encoding method based on dynamic band configuration
CN105302500B (en) * 2015-11-24 2018-04-10 中国科学技术大学 A kind of distributed coding method based on dynamic banded structure
CN105608758A (en) * 2015-12-17 2016-05-25 山东鲁能软件技术有限公司 Big data analysis platform apparatus and method based on algorithm configuration and distributed stream computing
CN105608144A (en) * 2015-12-17 2016-05-25 山东鲁能软件技术有限公司 Big data analysis platform device and method based on multilayer model iteration
CN105608144B (en) * 2015-12-17 2019-02-26 山东鲁能软件技术有限公司 A kind of big data analysis stage apparatus and method based on multilayered model iteration
CN105608758B (en) * 2015-12-17 2018-03-27 山东鲁能软件技术有限公司 A kind of big data analysis platform device and method calculated based on algorithm configuration and distributed stream
CN106897306B (en) * 2015-12-21 2019-04-30 阿里巴巴集团控股有限公司 Database operation method and device
CN106897306A (en) * 2015-12-21 2017-06-27 阿里巴巴集团控股有限公司 database operation method and device
CN105678467A (en) * 2016-01-15 2016-06-15 国家电网公司 Regulation and control integrated data analysis and aid decision making system and method under ultrahigh-voltage alternating current and direct current networking
CN106021080A (en) * 2016-05-10 2016-10-12 国家电网公司 Method for intelligently predicting resource consumption trend of application middleware database connection pool
CN106021080B (en) * 2016-05-10 2018-10-19 国家电网公司 Using middleware database connection pool resource consumption trend intelligent Forecasting
CN106534251A (en) * 2016-09-23 2017-03-22 郑州云海信息技术有限公司 Task visual uploading and starting method based on Storm
CN106372256A (en) * 2016-09-30 2017-02-01 浙江大学 Distributed storage method for massive Argo data
CN106682106A (en) * 2016-12-05 2017-05-17 国网宁夏电力公司信息通信公司 Distributed management system based on massive electric power real-time data
CN106709035A (en) * 2016-12-29 2017-05-24 贵州电网有限责任公司电力科学研究院 Preprocessing system for electric power multi-dimensional panoramic data
CN106709035B (en) * 2016-12-29 2019-11-26 贵州电网有限责任公司电力科学研究院 A kind of pretreatment system of electric power multidimensional panoramic view data
CN106934014A (en) * 2017-03-10 2017-07-07 山东省科学院情报研究所 A kind of network data excavation based on Hadoop and analysis platform and its method
WO2018196631A1 (en) * 2017-04-26 2018-11-01 Midea Group Co., Ltd. Training machine learning models on a large-scale distributed system using a job server
CN107341084A (en) * 2017-05-16 2017-11-10 阿里巴巴集团控股有限公司 A kind of method and device of data processing
CN107341084B (en) * 2017-05-16 2021-07-06 创新先进技术有限公司 Data processing method and device
CN107391596A (en) * 2017-06-29 2017-11-24 中国电力科学研究院 A kind of power distribution network mass data fusion method and device
CN107391596B (en) * 2017-06-29 2023-09-22 中国电力科学研究院 Power distribution network mass data fusion method and device
CN107341241A (en) * 2017-07-05 2017-11-10 深圳市樊溪电子有限公司 A kind of wind-powered electricity generation big data analysis system based on cloud computing
WO2019006721A1 (en) * 2017-07-05 2019-01-10 深圳市樊溪电子有限公司 Wind power big data analysis system based on cloud computing
CN107330567A (en) * 2017-07-20 2017-11-07 云南电网有限责任公司电力科学研究院 Distribution switch-time load Forecasting Methodology based on big data technology
CN107483858A (en) * 2017-08-31 2017-12-15 益和电气集团股份有限公司 The distributed memory system and its distributed storage method of electricity consumption enterprise supervision video
CN107679133A (en) * 2017-09-22 2018-02-09 电子科技大学 A kind of method for digging for being practically applicable to the real-time PMU data of magnanimity
CN107679133B (en) * 2017-09-22 2020-01-17 电子科技大学 Mining method applicable to massive real-time PMU data
CN107943831B (en) * 2017-10-23 2022-05-13 国家电网公司西北分部 HBase-based power grid historical data centralized storage method
CN107943831A (en) * 2017-10-23 2018-04-20 国家电网公司西北分部 HBase-based power grid historical data centralized storage method
CN112970165A (en) * 2018-10-04 2021-06-15 沃尔塔利斯公司 Estimation of physical quantities by means of a distributed measuring system
CN109800271A (en) * 2019-02-23 2019-05-24 湖北理工学院 A kind of information collecting method based on big data
CN110232007A (en) * 2019-05-21 2019-09-13 昆明能讯科技有限责任公司 A kind of electric power enterprise information service monitoring method based on APM technology
CN110457330A (en) * 2019-08-21 2019-11-15 北京远舢智能科技有限公司 A kind of time series data management platform
CN110502517A (en) * 2019-08-23 2019-11-26 中国南方电网有限责任公司 It is a kind of for storing the distributed memory system of power grid real-time running data
CN110502517B (en) * 2019-08-23 2022-01-28 中国南方电网有限责任公司 Distributed storage system for storing real-time operation data of power grid
CN110837516A (en) * 2019-11-07 2020-02-25 恩亿科(北京)数据科技有限公司 Data cutting and connecting method and device, computer equipment and readable storage medium
CN111400129A (en) * 2020-03-06 2020-07-10 广东电网有限责任公司 Distributed application performance monitoring and bottleneck positioning system, method and equipment
CN111597415A (en) * 2020-05-13 2020-08-28 云南电网有限责任公司电力科学研究院 Neural network-based power distribution network account data communication method and device
CN111597415B (en) * 2020-05-13 2023-05-26 云南电网有限责任公司电力科学研究院 Neural network-based distribution network account data penetration method and device
CN112199421B (en) * 2020-12-04 2021-03-09 中国电力科学研究院有限公司 Multi-source heterogeneous data fusion and measurement data multi-source mutual verification method and system
CN112199421A (en) * 2020-12-04 2021-01-08 中国电力科学研究院有限公司 Multi-source heterogeneous data fusion and measurement data multi-source mutual verification method and system
CN112905573A (en) * 2021-01-29 2021-06-04 杭州市电力设计院有限公司余杭分公司 Mass power grid data management and storage system
CN112653771B (en) * 2021-03-15 2021-06-01 浙江贵仁信息科技股份有限公司 Water conservancy data fragment storage method, on-demand method and processing system
CN112653771A (en) * 2021-03-15 2021-04-13 浙江贵仁信息科技股份有限公司 Water conservancy data fragment storage method, on-demand method and processing system
CN116881200B (en) * 2023-09-07 2024-01-16 四川竺信档案数字科技有限责任公司 Multi-center distributed electronic archive data security management method and system
CN116881200A (en) * 2023-09-07 2023-10-13 四川竺信档案数字科技有限责任公司 Multi-center distributed electronic archive data security management method and system

Also Published As

Publication number Publication date
CN105069703B (en) 2018-08-28

Similar Documents

Publication Publication Date Title
CN105069703A (en) Mass data management method of power grid
CN103023970B (en) Method and system for storing mass data of Internet of Things (IoT)
CN102915347B (en) A kind of distributed traffic clustering method and system
CN109241161A (en) A kind of meteorological data management method
CN106372114A (en) Big data-based online analytical processing system and method
CN107832876B (en) Partition maximum load prediction method based on MapReduce framework
CN104391903A (en) Distributed storage and parallel calculation-based power grid data quality detection method
CN107590749A (en) A kind of processing method and system with electricity consumption data
CN109213752A (en) A kind of data cleansing conversion method based on CIM
CN107247799A (en) Data processing method, system and its modeling method of compatible a variety of big data storages
CN108595664A (en) A kind of agricultural data monitoring method under hadoop environment
CN103207920A (en) Parallel metadata acquisition system
CN115129795A (en) Data space-time storage method based on geospatial grid
CN112148578A (en) IT fault defect prediction method based on machine learning
Ren et al. Long-Term Preservation of Electronic Record Based on Digital Continuity in Smart Cities.
Wang et al. Research on parallelized real-time map matching algorithm for massive GPS data
CN111753034A (en) One-stop type geographical big data platform
CN103995828B (en) A kind of cloud storage daily record data analysis method
CN117372201A (en) Rapid construction method of intelligent water conservancy digital twin model applied to reservoir
CN107818106B (en) Big data offline calculation data quality verification method and device
CN206021244U (en) A kind of data collecting system under distributed computer cluster
CN112540987A (en) Big data management system of distribution and utilization electricity based on data mart
Wang et al. Block storage optimization and parallel data processing and analysis of product big data based on the hadoop platform
CN107908683A (en) Wireless city big data off-line processing system and its big data processed offline method
Menon et al. A spark™ based client for synchrophasor data stream processing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180828

Termination date: 20190810