CN109471837A - The distributed storage method of power infrastructures data - Google Patents

The distributed storage method of power infrastructures data

Info

Publication number
CN109471837A
Authority
CN
China
Prior art keywords
data
file
database
distributed storage
power infrastructures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811167120.5A
Other languages
Chinese (zh)
Inventor
袁兆祥
韩文军
张济勇
刘海波
孙小虎
陈颖
李晓军
张苏
张亚平
于高
蒲洁
赵雨
戴艳
穆伟光
姚春静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
State Grid Hubei Electric Power Co Ltd
State Grid Economic and Technological Research Institute
Original Assignee
Wuhan University WHU
State Grid Hubei Electric Power Co Ltd
State Grid Economic and Technological Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU, State Grid Hubei Electric Power Co Ltd, State Grid Economic and Technological Research Institute
Priority to CN201811167120.5A
Publication of CN109471837A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Electricity, gas or water supply

Abstract

The present invention provides a distributed storage method for power infrastructure data that stores basic file information in a non-relational database and stores the power infrastructure data files themselves in a distributed file system. Building on the distributed storage of the database, the invention uses HDFS to store the files in distributed form. A file no longer has to be converted to binary data before being written to the database and converted back to the source file when it is read out, which improves data throughput. The method thus provides distributed storage of power infrastructure data while allowing the required data to be found quickly by query conditions.

Description

Distributed storage method for power infrastructure data
Technical field
The present invention relates to the field of electric power technology, and in particular to a distributed storage method for power infrastructure data.
Background
To meet the needs of national economic and social development, the state has stepped up investment in power construction. Over the past 10 years, China's power construction has developed rapidly and achieved remarkable results: installed generating capacity has grown quickly and grid construction has advanced by leaps and bounds. In this process, the smart grid has become the key theme of power grid development.
Compared with the existing grid, the smart grid is characterized by a high degree of integration of power flow, information flow and business flow. Its advantages are mainly the following: (1) It has a robust grid infrastructure and technical support system, can withstand all kinds of external disturbances and attacks, and can accommodate the large-scale connection of clean and renewable energy, so that the robustness of the grid is consolidated and improved. (2) Information technology, sensor technology, automatic control technology and grid infrastructure are tightly integrated, so that a panoramic view of the grid can be obtained and possible faults can be detected and predicted in time. When a fault occurs, the grid can quickly isolate it and recover by itself, avoiding large-area blackouts. (3) The wide use of technologies such as flexible AC/DC transmission, grid-plant coordination, intelligent dispatching, power energy storage and distribution automation makes grid operation and control more flexible and economical, and accommodates the connection of large numbers of distributed generators, microgrids and electric vehicle charging and discharging facilities. (4) The combined use of communication, information and modern management technology greatly improves the utilization efficiency of power equipment and reduces energy losses, making grid operation more economical and efficient. (5) Real-time and non-real-time information is highly integrated, shared and used, presenting operations management with a comprehensive, complete and detailed view of the grid's operating state, together with the corresponding decision support, control schemes and contingency plans. (6) A two-way interactive service model is established: users can learn about power supply capacity, power quality, electricity prices and outage information in real time and plan their use of electrical appliances accordingly, while power companies can obtain detailed consumption information from users and offer them more value-added services. The design of the power facility database is the foundation for efficient processing and analysis in the smart grid.
MongoDB is a database based on distributed document storage, written in C++. It is intended to provide a scalable, high-performance data storage solution for web applications. MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most similar to a relational database. It supports a rich query language whose query expressions use a JSON-like notation, so objects and arrays embedded in documents can be queried easily.
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware and to store data reliably even when failures occur. It has much in common with existing distributed file systems, but its differences from them are also significant. HDFS is highly fault-tolerant and is suited to deployment on inexpensive machines.
Files larger than 16 MB can be stored with GridFS, a facility built into MongoDB. Rather than storing a file directly as a single document, GridFS divides the file into multiple chunks, stores each chunk as a separate document, and keeps the chunks in order. By default each GridFS chunk is 256 KB. GridFS uses two collections to store files: one holds the chunks, that is, the actual file data, and the other holds the file metadata. What is stored in a chunk is binary data converted from the original data. Power infrastructure data consist mainly of remote sensing images and thematic maps, and the data volume is large, so accessing them in this way is time-consuming. HDFS provides high-throughput data access and is well suited to applications on large data sets, but it cannot retrieve data by attribute information. The present invention therefore combines MongoDB with HDFS, so that power infrastructure data are stored in distributed form and the required data can still be found quickly.
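To make the cost of chunked storage concrete, the short Python sketch below counts how many chunk documents a single file would occupy at the 256 KB chunk size stated above. The function name and the 1 GiB example file are illustrative assumptions, not part of the patent.

```python
# Chunk size as stated in the text above (256 KB), expressed in bytes.
GRIDFS_CHUNK_SIZE = 256 * 1024


def gridfs_chunk_count(file_size_bytes: int, chunk_size: int = GRIDFS_CHUNK_SIZE) -> int:
    """Return how many chunk documents are needed to hold one file."""
    if file_size_bytes < 0 or chunk_size <= 0:
        raise ValueError("file size must be >= 0 and chunk size must be > 0")
    # Integer ceiling division: a partially filled last chunk still counts.
    return (file_size_bytes + chunk_size - 1) // chunk_size


# Hypothetical example: a 1 GiB remote sensing image.
image_size = 1024 ** 3
print(gridfs_chunk_count(image_size))  # 4096
```

Every one of those chunk documents has to be written on upload and read back in order on download, which is the overhead the combined MongoDB and HDFS design is meant to avoid.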
Summary of the invention
In view of the deficiencies of the prior art, and taking into account the basic characteristics of power infrastructure data and the requirements for accessing them, the present invention provides a distributed storage method for power infrastructure data.
The distributed storage method for power infrastructure data provided by the invention stores basic file information in a non-relational database and stores the power infrastructure data files in a distributed file system. It specifically comprises:
Step 1: Set up a Hadoop Distributed File System (HDFS) environment. Four nodes, one master node and three slave nodes, are first virtualized on a high-performance server, and the HDFS environment is then built on them;
Step 2: Install MongoDB, a database based on distributed document storage, and create a database for storing basic file information;
Step 3: Store the basic information of the power infrastructure data in the database created in Step 2, and store the power infrastructure data in HDFS;
Step 4: Query and download the power infrastructure data.
Preferably, Step 1 specifically comprises:
Step 1.1: Set the IP address of each virtual node and then configure the hosts file. The hosts file records the IP address of each node so that the master node can later find and access every node quickly; the hosts file must be configured on every virtual node;
Step 1.2: Create a dedicated user group and user for the Hadoop cluster and configure passwordless SSH login, so that the master node can access the three slave nodes securely over SSH without a password;
Step 1.3: Download and unpack the Hadoop installation package and configure it. Once jps shows that every background process has started successfully, or the state of the cluster can be viewed through the web page, the Hadoop Distributed File System environment is complete.
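The hosts file described in Step 1.1 maps each node name to its IP address. As a minimal sketch, the Python snippet below renders the entries that would be appended to the hosts file on every node. The IP addresses are hypothetical placeholders (the patent does not give them); the node names master, slave1, slave2 and slave3 follow the naming used in the text.

```python
from ipaddress import ip_address

# Hypothetical addresses on a NAT network; replace with the real ones.
CLUSTER_NODES = {
    "master": "192.168.1.100",
    "slave1": "192.168.1.101",
    "slave2": "192.168.1.102",
    "slave3": "192.168.1.103",
}


def render_hosts_entries(nodes: dict) -> str:
    """Return hosts-file lines of the form '<ip><TAB><hostname>'."""
    lines = []
    for hostname, addr in nodes.items():
        ip_address(addr)  # raises ValueError if the address is malformed
        lines.append(f"{addr}\t{hostname}")
    return "\n".join(lines) + "\n"


print(render_hosts_entries(CLUSTER_NODES), end="")
```

The same four lines go into the hosts file of all four virtual nodes, so that every node can resolve every other node by name.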
Preferably, the basic file information in Step 2 includes the time, the region the data belong to, the data type, and any other information intended to serve as a query condition.
Preferably, Step 3 first queries the MongoDB database with the basic file information obtained. If the record already exists, the procedure terminates and reports "This record already exists in the database". If no document with fully matching fields is found in the database, the data are first stored in HDFS, and the basic information together with the location of the data in HDFS is then inserted into the MongoDB database as one document.
Preferably, Step 4 queries the MongoDB database for documents that satisfy the query conditions. If the selected data need to be downloaded, they are downloaded to the local machine using the HDFS path of the file recorded in the document's fields.
Preferably, the Hadoop Distributed File System is deployed on four nodes virtualized on one high-performance server, comprising one master node and three slave nodes. The operating system is CentOS 6.8 and the network connection mode is NAT. The master node has 16 GB of memory and a 200 GB hard disk; the slave1, slave2 and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
Building on the distributed storage of the database, the invention uses HDFS to store files in distributed form. A file no longer has to be converted to binary data before being written to the database and converted back to the source file when it is read out, which improves data throughput.
Brief description of the drawings
To illustrate the technical solution of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. A person of ordinary skill in the art can derive other drawings from these drawings without creative effort.
Fig. 1 is a flow diagram of the distributed storage method for power infrastructure data provided by the invention;
Fig. 2 is a diagram of the HDFS cluster of the embodiment of the invention.
Detailed description of the embodiments
The features and exemplary embodiments of the various aspects of the invention are described in detail below. To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the invention, not to limit it. To those skilled in the art, the invention can be practiced without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the invention by showing examples of it.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element introduced by the phrase "including ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes it.
As shown in Fig. 1, this embodiment provides a distributed storage method for power infrastructure data, comprising the following steps:
Step 1: Build the Hadoop Distributed File System environment. The HDFS cluster of this embodiment is shown in Fig. 2. Four nodes, one master node and three slave nodes, are first virtualized on a high-performance server, and the HDFS environment is then built on them.
The operating system is CentOS 6.8 and the network connection mode is NAT. The master node has 16 GB of memory and a 200 GB hard disk; the slave1, slave2 and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
The specific implementation of this step in the embodiment is as follows:
First, the IP address of each virtual node is set and the hosts file is configured. The hosts file records the IP address of each node so that the master node can later find and access every node quickly; the hosts file must be configured on every virtual node. A dedicated user group and user are then created for the Hadoop cluster and passwordless SSH login is configured, so that the master node can access the three slave nodes securely over SSH without a password. Finally, the Hadoop installation package is downloaded, unpacked and configured. Once jps shows that every background process has started successfully, or the state of the cluster can be viewed through the web page, the Hadoop Distributed File System environment is complete.
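The jps check at the end of this step can be automated. The Python sketch below parses text in the format printed by jps (one "<pid> <process name>" pair per line) and reports which expected daemons are missing. Which daemons run on which node depends on the cluster configuration; the sets used here (NameNode and SecondaryNameNode on the master, DataNode on a slave) are an assumption about a typical HDFS layout, and the sample output is made up for illustration.

```python
def running_java_processes(jps_output: str) -> set:
    """Extract process names from jps output ('<pid> <name>' per line)."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            names.add(parts[1].strip())
    return names


def missing_daemons(jps_output: str, expected: set) -> set:
    """Return the expected daemons that do not appear in the jps output."""
    return set(expected) - running_java_processes(jps_output)


# Assumed daemon layout for an HDFS-only cluster.
MASTER_DAEMONS = {"NameNode", "SecondaryNameNode"}
SLAVE_DAEMONS = {"DataNode"}

# Made-up sample of what jps might print on the master node.
sample_master_output = """2817 NameNode
3021 SecondaryNameNode
3342 Jps
"""

print(missing_daemons(sample_master_output, MASTER_DAEMONS))  # set()
print(missing_daemons(sample_master_output, SLAVE_DAEMONS))   # {'DataNode'}
```

On a real node, the text to parse could be captured with the standard library, for example `subprocess.run(["jps"], capture_output=True, text=True).stdout`.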
Step 2: Install MongoDB and create the MongoDB database for the power infrastructure data. In this embodiment, the database that stores the basic data information is named MultiSourceData, and all basic data information is stored in the dataInfo collection of MultiSourceData.
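As an illustration of what one document in the dataInfo collection might look like, the Python sketch below builds a record from the kinds of basic information the patent lists (file name, time, region, data type) plus the HDFS path. The database and collection names come from the text; the field names and the sample values are hypothetical, since the patent does not specify a schema.

```python
DB_NAME = "MultiSourceData"    # database name given in the embodiment
COLLECTION_NAME = "dataInfo"   # collection name given in the embodiment


def build_basic_info(filename: str, data_time: str, region: str, data_type: str) -> dict:
    """Basic information used as query conditions (field names are assumed)."""
    return {
        "filename": filename,
        "time": data_time,
        "region": region,
        "type": data_type,
    }


def build_record(basic_info: dict, hdfs_path: str) -> dict:
    """Full document: basic information plus where the file lives in HDFS."""
    record = dict(basic_info)  # copy so the caller's dict is not modified
    record["hdfs_path"] = hdfs_path
    return record


info = build_basic_info("substation_2018.tif", "2018-10-08", "Hubei", "remote_sensing")
print(build_record(info, "/power_data/substation_2018.tif"))
```

Keeping the HDFS path in the same document as the searchable attributes is what lets a MongoDB query lead directly to the file in HDFS.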
Step 3: Store the basic information of the power infrastructure data in MongoDB and store the power infrastructure data in HDFS. In this embodiment the step is implemented in Python. The specific implementation is as follows:
The MongoDB database is first queried with the basic file information obtained. If the record already exists, the procedure terminates and reports "This record already exists in the database". If no document with fully matching fields is found in the database, the data are first stored in HDFS, and the basic information together with the location of the data in HDFS is then inserted into the MongoDB database as one document.
The required Python packages are imported: pymongo, hdfs and os. First, the basic information of the power infrastructure data is entered, including the file name, the time of the data, the region the data belong to, the data type and the storage path of the data in HDFS. This basic information is used to query the dataInfo collection of the MultiSourceData database for a fully matching document. If one exists, the message "This record already exists in the database" is returned, indicating that the data are already in the database and need not be stored again. If no fully matching document exists, the data are stored in HDFS, the path of the file in HDFS is obtained, and the path is stored together with the basic information of the data in the dataInfo collection of the MultiSourceData database.
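The storage procedure described above can be sketched in Python roughly as follows. This is not the patent's actual code: the function names, the field names, the target directory /power_data, the WebHDFS URL and the user name are all assumptions. The sketch uses `find_one` and `insert_one` from pymongo and `upload` from the hdfs package, and it queries by the basic information without the HDFS path, which is one possible reading of the text.

```python
import os
import posixpath


def connect(mongo_uri="mongodb://localhost:27017",
            namenode_url="http://master:50070", user="hadoop"):
    """Open the dataInfo collection and an HDFS client (all values assumed)."""
    from pymongo import MongoClient   # imported here so the sketch loads
    from hdfs import InsecureClient   # even where the packages are absent
    collection = MongoClient(mongo_uri)["MultiSourceData"]["dataInfo"]
    return collection, InsecureClient(namenode_url, user=user)


def store_file(collection, hdfs_client, local_path, basic_info,
               hdfs_dir="/power_data"):
    """Store one file in HDFS and its basic information in MongoDB.

    Returns the HDFS path, or None if a matching record already exists.
    """
    # 1. Look for a document whose fields all match the basic information.
    if collection.find_one(basic_info) is not None:
        print("This record already exists in the database")
        return None
    # 2. No match: write the file itself to HDFS first.
    hdfs_path = posixpath.join(hdfs_dir, os.path.basename(local_path))
    hdfs_client.upload(hdfs_path, local_path)
    # 3. Then insert the basic information plus the HDFS path as one document.
    record = dict(basic_info, hdfs_path=hdfs_path)
    collection.insert_one(record)
    return hdfs_path
```

A call might look like `collection, client = connect()` followed by `store_file(collection, client, "/data/substation_2018.tif", {"filename": "substation_2018.tif", "time": "2018-10-08", "region": "Hubei", "type": "remote_sensing"})`, with every value here being a placeholder. Uploading before inserting means a document is written only after the upload call has returned without error.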
Step 4: Query and download the power infrastructure data. In this embodiment the step is implemented in Python. The specific implementation is as follows:
The basic information of the required power infrastructure data is entered, the database is searched for data records that match it, and the HDFS path of the data is obtained. The data are then downloaded from the Hadoop Distributed File System to the local machine using the HDFS path.
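The query and download step can be sketched in Python in the same spirit. The sketch assumes each document stores the file location under a field named hdfs_path, which is a hypothetical name; the `collection` argument is expected to behave like a pymongo collection (`find`) and `hdfs_client` like a client from the hdfs package (`download`).

```python
def find_records(collection, conditions: dict) -> list:
    """Return all documents whose fields match the query conditions."""
    return list(collection.find(conditions))


def download_records(hdfs_client, records: list, local_dir: str) -> list:
    """Download the file behind each record into local_dir.

    Returns the HDFS paths that were downloaded, in order.
    """
    downloaded = []
    for record in records:
        hdfs_path = record["hdfs_path"]  # assumed field name
        hdfs_client.download(hdfs_path, local_dir)
        downloaded.append(hdfs_path)
    return downloaded
```

For example, `download_records(client, find_records(collection, {"region": "Hubei", "type": "remote_sensing"}), "./downloads")` would fetch every remote sensing file recorded for that region, with the condition values being placeholders. Only the small metadata documents are searched in MongoDB; the large files are read from HDFS directly.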
Compared with the prior art, the invention builds on the distributed storage of the database and uses HDFS to store files in distributed form. A file no longer has to be converted to binary data before being written to the database and converted back to the source file when it is read out, which improves data throughput. Power infrastructure data include text data, remote sensing data and all kinds of power thematic data.
It should be clear that the invention is not limited to the specific configurations and processing described above and shown in the figures. For brevity, detailed descriptions of known methods are omitted here. In the embodiments above, several specific steps are described and shown as examples. The method of the invention is not limited to the specific steps described and shown, however; after understanding the spirit of the invention, those skilled in the art may make various changes, modifications and additions, or change the order of the steps.
It should be noted that the exemplary embodiments mentioned in the invention describe certain methods or systems in terms of a series of steps or devices. The invention is not limited to the order of the steps above: the steps may be performed in the order given in the embodiments, in a different order, or with several steps performed at the same time.
The above is only a specific embodiment of the invention, and the scope of protection of the invention is not limited to it. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the invention, and these modifications or substitutions shall fall within the scope of protection of the invention.

Claims (6)

1. A distributed storage method for power infrastructure data, characterized in that the method stores basic file information in a non-relational database and stores the power infrastructure data files in a distributed file system, and specifically comprises:
Step 1: Set up a Hadoop Distributed File System (HDFS) environment. Four nodes, one master node and three slave nodes, are first virtualized on a high-performance server, and the HDFS environment is then built on them;
Step 2: Install MongoDB, a database based on distributed document storage, and create a database for storing basic file information;
Step 3: Store the basic information of the power infrastructure data in the database created in Step 2, and store the power infrastructure data in HDFS;
Step 4: Query and download the power infrastructure data.
2. The distributed storage method for power infrastructure data according to claim 1, characterized in that Step 1 specifically comprises:
Step 1.1: Set the IP address of each virtual node and then configure the hosts file. The hosts file records the IP address of each node so that the master node can later find and access every node quickly; the hosts file must be configured on every virtual node;
Step 1.2: Create a dedicated user group and user for the Hadoop cluster and configure passwordless SSH login, so that the master node can access the three slave nodes securely over SSH without a password;
Step 1.3: Download and unpack the Hadoop installation package and configure it. Once jps shows that every background process has started successfully, or the state of the cluster can be viewed through the web page, the Hadoop Distributed File System environment is complete.
3. The distributed storage method for power infrastructure data according to claim 1, characterized in that the basic file information in Step 2 includes the time, the region the data belong to, the data type, and any other information intended to serve as a query condition.
4. The distributed storage method for power infrastructure data according to claim 1, characterized in that Step 3 first queries the MongoDB database with the basic file information obtained; if the record already exists, the procedure terminates and reports "This record already exists in the database"; if no document with fully matching fields is found in the database, the data are first stored in HDFS, and the basic information together with the location of the data in HDFS is then inserted into the MongoDB database as one document.
5. The distributed storage method for power infrastructure data according to claim 1 or 4, characterized in that Step 4 queries the MongoDB database for documents that satisfy the query conditions; if the selected data need to be downloaded, they are downloaded to the local machine using the HDFS path of the file recorded in the document's fields.
6. The distributed storage method for power infrastructure data according to any one of claims 1 to 5, characterized in that the Hadoop Distributed File System is deployed on four nodes virtualized on one high-performance server, comprising one master node and three slave nodes; the operating system is CentOS 6.8 and the network connection mode is NAT; the master node has 16 GB of memory and a 200 GB hard disk, and the slave1, slave2 and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
CN201811167120.5A 2018-10-08 2018-10-08 The distributed storage method of power infrastructures data Pending CN109471837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811167120.5A CN109471837A (en) 2018-10-08 2018-10-08 The distributed storage method of power infrastructures data

Publications (1)

Publication Number Publication Date
CN109471837A true CN109471837A (en) 2019-03-15

Family

ID=65664733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811167120.5A Pending CN109471837A (en) 2018-10-08 2018-10-08 The distributed storage method of power infrastructures data

Country Status (1)

Country Link
CN (1) CN109471837A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317899A (en) * 2014-10-24 2015-01-28 西安未来国际信息股份有限公司 Big-data analyzing and processing system and access method
CN104820670A (en) * 2015-03-13 2015-08-05 国家电网公司 Method for acquiring and storing big data of power information
CN105354250A (en) * 2015-10-16 2016-02-24 浪潮(北京)电子信息产业有限公司 Data storage method and device for cloud storage
CN105763667A (en) * 2016-01-13 2016-07-13 杭州华三通信技术有限公司 Method and device for realizing Hadoop host automatic discovery
KR20180056038A (en) * 2016-11-18 2018-05-28 조선대학교산학협력단 Data distribution storage apparatus and method using relative difference set generated from the group having the two-dimensional element

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110413571A (en) * 2019-07-01 2019-11-05 中国科学院遥感与数字地球研究所 Based on the extensive remote sensing image data distributed storage method of MongoDB
CN111026706A (en) * 2019-10-21 2020-04-17 武汉神库小匠科技有限公司 Method, device, equipment and medium for warehousing power system data
CN111026706B (en) * 2019-10-21 2023-10-13 武汉神库小匠科技有限公司 Warehouse entry method, device, equipment and medium for power system data

Similar Documents

Publication Publication Date Title
CN101505550B (en) Method, terminal, apparatus and system for device management
CN109379420B (en) Comprehensive energy service platform system based on distributed architecture
CN102202087B (en) Method for identifying storage equipment and system thereof
CN103167041A (en) System and method for supporting cloud environment application cluster automation deployment
CN107888666A (en) A kind of cross-region data-storage system and method for data synchronization and device
CN103546572A (en) Cloud storage device and multi-cloud storage networking system and method
CN108848132A (en) A kind of distribution scheduling station system based on cloud
CN103034541A (en) Distributing type information system and equipment and method thereof
CN109683910A (en) Big data platform dispositions method and device
CN109471837A (en) The distributed storage method of power infrastructures data
CN109215326A (en) A kind of parallel meter register method and device
CN113127526A (en) Distributed data storage and retrieval system based on Kubernetes
CN102624932A (en) Index-based remote cloud data synchronizing method
Malik et al. A common data architecture for energy data analytics
Smidt et al. Smart application development for IoT asset management using graph database modeling and high-availability web services
Lee et al. A big data management system for energy consumption prediction models
Chen et al. An efficient data storage method of NoSQL database for HEM mobile applications in IoT
CN105847364A (en) Public cloud object storage method based on uniform domain name and public cloud object storage system based on uniform domain name
CN102571418A (en) Method, terminal, device and system for equipment management
CN102970375A (en) Cluster configuration method and device
Benhaddou et al. Big data processing for smart grids
CN109710263A (en) Compilation Method, device, storage medium and the electronic equipment of code
CN114866416A (en) Multi-cluster unified management system and deployment method
CN105607594A (en) Method for searching equipment by server memory based on smart home appliance
CN106844058B (en) Management method and device for virtualized resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190315