CN109471837A - Distributed storage method for power infrastructure data
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
Abstract
The present invention provides a distributed storage method for power infrastructure data. A non-relational database stores the basic information of each file, and a distributed file system stores the power infrastructure data files themselves in a distributed manner. Building on distributed database storage, the invention uses HDFS for distributed file storage, so files no longer need to be converted to binary data for storage in the database and converted back into source files on retrieval. This improves data throughput and, while providing distributed storage of power infrastructure data, allows the required data to be found quickly by query conditions.
Description
Technical field
The present invention relates to the field of electric power technology, and in particular to a distributed storage method for power infrastructure data.
Background art
To meet the needs of national economic and social development, the state has increased its investment in power construction. Over the past ten years, China's power construction has developed rapidly and achieved remarkable results: installed generating capacity has grown quickly, and grid construction has advanced by leaps and bounds. In this process, the smart grid has become the keyword of grid development.
Compared with the existing grid, the smart grid is distinguished by a high degree of integration of power flow, information flow, and business flow. Its advantages are mainly the following: (1) It has a robust grid infrastructure and technical support system that can withstand all kinds of external disturbances and attacks and can accommodate large-scale clean and renewable energy, so the robustness of the grid is consolidated and improved. (2) Information technology, sensor technology, automatic control technology, and grid infrastructure are tightly integrated, so a panoramic view of the grid can be obtained and possible faults can be detected and predicted in time. When a fault occurs, the grid can quickly isolate it and recover on its own, avoiding large-area outages. (3) The wide use of technologies such as flexible AC/DC transmission, grid-plant coordination, intelligent dispatch, power energy storage, and distribution automation makes grid operation and control more flexible and economical, and accommodates large numbers of distributed generators, microgrids, and electric vehicle charging and discharging facilities. (4) The combined use of communication, information, and modern management technologies greatly improves the utilization of power equipment, reduces energy losses, and makes grid operation more economical and efficient. (5) Real-time and non-real-time information is highly integrated, shared, and used, presenting operations management with a comprehensive, complete, and detailed picture of grid operating status while providing decision support, control schemes, and contingency plans. (6) A two-way interactive service model is established: users can learn about supply capacity, power quality, electricity prices, and outage information in real time and schedule their appliance use sensibly, while power companies can obtain detailed consumption information and offer users more value-added services. The foundation for efficient processing and analysis in the smart grid is the design of the power facility database.
MongoDB is a database based on distributed document storage, written in C++. It aims to provide a scalable, high-performance data storage solution for web applications. MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most similar to a relational database. It supports rich query expressions; queries use JSON-style syntax and can easily reach objects and arrays embedded in documents.
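As an illustration of the JSON-style queries described above, the following Python sketch builds filter documents that reach into an embedded object and an array. The field names (region.province, tags) and the sample record are hypothetical and are not taken from the invention. With pymongo, such a filter would be passed to Collection.find, which is what the small helper does for any collection-like object.

```python
# Hypothetical document with an embedded object and an array (illustrative only).
sample_doc = {
    "filename": "substation_a.tif",
    "region": {"province": "Hubei", "city": "Wuhan"},
    "tags": ["remote_sensing", "thematic"],
}

# Dot notation addresses a field inside an embedded object.
by_province = {"region.province": "Hubei"}

# Matching a scalar against an array field selects documents
# whose array contains that value.
by_tag = {"tags": "remote_sensing"}

# Conditions in one filter document are combined with an implicit AND.
combined = {**by_province, **by_tag}


def find_matching(collection, query):
    """Run the filter against a pymongo-style collection and return a list."""
    return list(collection.find(query))
```

Both filters would match sample_doc if it were stored in a MongoDB collection.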
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware and to store data reliably even when failures occur. It has much in common with existing distributed file systems, but its differences from them are also significant. HDFS is highly fault-tolerant and is suitable for deployment on inexpensive machines.
Files larger than 16 MB can be stored with GridFS, a built-in MongoDB facility. Rather than storing a file directly as a single document, GridFS splits the file into multiple chunks, stores each chunk as a separate document, and keeps the chunks in order. By default, each GridFS chunk is 256 KB (255 KB in more recent MongoDB releases). GridFS uses two collections to store files: one holds the chunks, that is, the actual file data, and the other holds the file metadata. What the chunks store is binary data converted from the original file. Because power infrastructure data consists mainly of remote sensing images and thematic maps and the data volume is large, reading and writing it this way is time-consuming. HDFS provides high-throughput data access and is well suited to applications on large data sets, but it cannot retrieve data by attribute information. The present invention therefore combines MongoDB with HDFS, so that the required data can be queried quickly while power infrastructure data is stored in a distributed manner.
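To give a feel for the chunking overhead described above, the following sketch computes how many chunk documents GridFS would need for one file. It is plain arithmetic, not a call into MongoDB; the 1 GiB image size is a hypothetical example, and the chunk size is the 256 KB default stated in the text.

```python
CHUNK_SIZE = 256 * 1024  # bytes; the default chunk size stated in the text


def gridfs_chunk_count(file_size_bytes, chunk_size=CHUNK_SIZE):
    """Number of chunk documents needed for a file of the given size."""
    # Integer ceiling division: a partially filled last chunk still counts.
    return (file_size_bytes + chunk_size - 1) // chunk_size


# Hypothetical 1 GiB remote sensing image.
image_size = 1024 ** 3
chunks_needed = gridfs_chunk_count(image_size)  # 4096 chunk documents
```

Every one of these chunk documents has to be written on storage and read back and reassembled on retrieval, which is the cost the invention avoids by keeping the file itself in HDFS.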
Summary of the invention
In view of the deficiencies of the prior art, and taking into account the basic characteristics and access requirements of power infrastructure data, the present invention provides a distributed storage method for power infrastructure data.
The distributed storage method for power infrastructure data provided by the invention uses a non-relational database to store basic file information and a distributed file system to store the power infrastructure data files in a distributed manner. It specifically includes:
Step 1: establish the Hadoop Distributed File System (HDFS) environment: first virtualize four nodes on a high-performance server, comprising one master node and three slave nodes, and then build the HDFS environment;
Step 2: install MongoDB, a database based on distributed document storage, and create a database for storing basic file information;
Step 3: store the basic information of the power infrastructure data in the database created in Step 2, and store the power infrastructure data itself in HDFS;
Step 4: query and download the power infrastructure data.
Preferably, Step 1 specifically includes:
Step 1.1: set the IP address of each virtual node, then configure the hosts file. The hosts file records the IP address of each node so that the master node can quickly locate and access every node; the hosts file must be configured on every virtual node;
Step 1.2: create a dedicated user group and user for the Hadoop cluster and configure passwordless SSH login, so that the master node can access the three slave nodes securely over SSH without a password;
Step 1.3: download and unpack the Hadoop installation package and configure it. The Hadoop Distributed File System environment is complete once jps shows that every daemon has started successfully, or once the cluster status can be viewed through the web interface.
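The hosts-file configuration of Step 1.1 can be sketched as follows. The node names master, slave1, slave2, and slave3 come from the description; the IP addresses are hypothetical placeholders, since the source does not give them. The sketch only renders the lines; appending them to the hosts file on each node is left to the administrator.

```python
# Node names from the description; the IP addresses are hypothetical.
NODES = {
    "master": "192.168.56.10",
    "slave1": "192.168.56.11",
    "slave2": "192.168.56.12",
    "slave3": "192.168.56.13",
}


def render_hosts_entries(nodes):
    """Return one 'IP<TAB>hostname' line per node, in insertion order."""
    return "\n".join(f"{ip}\t{name}" for name, ip in nodes.items())


hosts_block = render_hosts_entries(NODES)
```

The same block would be added on all four virtual nodes so that each node resolves every other node by name.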
Preferably, the basic file information in Step 2 includes the time, the region the data belongs to, the data type, and any other information intended to serve as a query condition.
Preferably, Step 3 first queries the MongoDB database with the basic file information obtained. If the record already exists, the operation terminates with the message "This record already exists in the database". If no document with fully identical fields is found in the database, the data is first stored in HDFS, and the basic information together with the data's location in HDFS is then inserted into the MongoDB database as one document.
Preferably, Step 4 queries the MongoDB database for documents that satisfy the query conditions. If selected data needs to be downloaded, it is downloaded to the local machine using the HDFS path recorded in the document.
Preferably, the Hadoop Distributed File System is deployed on four nodes virtualized on one high-performance server, comprising one master node and three slave nodes. The operating system is CentOS 6.8 and the network connection mode is NAT. The master node has 16 GB of memory and a 200 GB hard disk; the slave1, slave2, and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
Building on distributed database storage, the present invention uses HDFS to store files in a distributed manner. Files no longer need to be converted to binary data to be stored in the database and converted back into source files on retrieval, which improves data throughput.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. Those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flow diagram of the distributed storage method for power infrastructure data provided by the invention;
Fig. 2 is a diagram of the HDFS cluster of the embodiment of the present invention.
Detailed description of the embodiments
The features and exemplary embodiments of various aspects of the invention are described in detail below. To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the invention, not to limit it. To those skilled in the art, the invention can be practiced without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the invention by showing examples of it.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include" and "comprise" and any variants of them are intended to be non-exclusive, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element introduced by the phrase "including ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes it.
As shown in Fig. 1, this embodiment provides a distributed storage method for power infrastructure data, including the following steps:
Step 1: build the Hadoop Distributed File System environment. The HDFS cluster of this embodiment is shown in Fig. 2. Four nodes are first virtualized on a high-performance server, comprising one master node and three slave nodes, and the HDFS environment is then built.
The operating system is CentOS 6.8 and the network connection mode is NAT. The master node has 16 GB of memory and a 200 GB hard disk; the slave1, slave2, and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
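The disk figures above allow a rough estimate of how much data the cluster can hold. The sketch below is arithmetic only and rests on assumptions the source does not state: DataNodes run only on the three slave nodes, the whole disk of each slave is given to HDFS, and the replication factor is 3, which is the HDFS default for dfs.replication.

```python
# Disk sizes from the embodiment, in GB. Assumption: only the slave
# nodes run DataNodes and their whole disks are available to HDFS.
SLAVE_DISKS_GB = {"slave1": 400, "slave2": 400, "slave3": 400}

# Assumption: the HDFS default replication factor; the source gives none.
REPLICATION = 3


def raw_capacity_gb(disks):
    """Total raw disk space across all DataNodes."""
    return sum(disks.values())


def usable_capacity_gb(disks, replication):
    """Upper bound on unique data, since every block is stored `replication` times."""
    return raw_capacity_gb(disks) / replication


raw = raw_capacity_gb(SLAVE_DISKS_GB)
usable = usable_capacity_gb(SLAVE_DISKS_GB, REPLICATION)
```

Under these assumptions the cluster offers 1200 GB of raw space and at most 400 GB of unique data; the real figure would be lower once the operating system and non-HDFS files are accounted for.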
The specific implementation is as follows:
First set the IP address of each virtual node, then configure the hosts file. The hosts file records the IP address of each node so that the master node can quickly locate and access every node; the hosts file must be configured on every virtual node. Create a dedicated user group and user for the Hadoop cluster and configure passwordless SSH login, so that the master node can access the three slave nodes securely over SSH without a password. Download and unpack the Hadoop installation package and configure it. The Hadoop Distributed File System environment is complete once jps shows that every daemon has started successfully, or once the cluster status can be viewed through the web interface.
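The jps check mentioned above can be automated with a small parser. jps prints one "PID ClassName" pair per line; the sample output below is hypothetical, and the assumption that the master runs NameNode and SecondaryNameNode while the slaves run DataNode reflects a common HDFS layout rather than anything stated in the source.

```python
def running_daemons(jps_output):
    """Extract the class names from 'PID ClassName' lines printed by jps."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected):
    """Return the expected daemons that do not appear in the jps output."""
    return sorted(set(expected) - running_daemons(jps_output))


# Hypothetical jps output captured on the master node.
sample_master_output = "2481 NameNode\n2670 SecondaryNameNode\n3012 Jps"

# Assumed daemon layout (not specified in the source).
master_missing = missing_daemons(
    sample_master_output, ["NameNode", "SecondaryNameNode"]
)
```

An empty list means every expected daemon is running on that node; the same function can be applied to the jps output of each slave with ["DataNode"] as the expected list.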
Step 2: install MongoDB and create the MongoDB database for the power infrastructure data. In this embodiment, the database that stores the basic information of the data is named MultiSourceData, and all basic information is stored in the dataInfo collection of MultiSourceData.
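A minimal sketch of how the collection named above might be obtained in Python. The database and collection names come from the embodiment; the helper itself is illustrative. It accepts any client object that supports subscript access, which a pymongo MongoClient does (client[name] returns a database and database[name] returns a collection).

```python
DB_NAME = "MultiSourceData"    # database name from the embodiment
COLLECTION_NAME = "dataInfo"   # collection name from the embodiment


def get_data_info_collection(client):
    """Return the dataInfo collection of the MultiSourceData database.

    `client` is expected to behave like pymongo.MongoClient.
    """
    return client[DB_NAME][COLLECTION_NAME]
```

With pymongo, the client would be created with something like pymongo.MongoClient("mongodb://localhost:27017"); the connection URI is an assumption. MongoDB creates the database and the collection on the first insert, so no separate creation call is needed.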
Step 3: store the basic information of the power infrastructure data in MongoDB and the power infrastructure data itself in HDFS. This embodiment implements the step in Python. The specific implementation is as follows:
First query the MongoDB database with the basic file information obtained. If the record already exists, terminate with the message "This record already exists in the database". If no document with fully identical fields is found in the database, first store the data in HDFS, then insert the basic information together with the data's location in HDFS into the MongoDB database as one document.
Import the required Python packages, including pymongo, hdfs, and os. First, enter the basic information of the power infrastructure data, which includes the file name, the data time, the region the data belongs to, the data type, and the storage path of the data in HDFS. Use this basic information to check whether the dataInfo collection of the MultiSourceData database contains a fully matching document. If it does, return the message "This record already exists in the database", indicating that the data is already stored and does not need to be stored again. If there is no exact match, store the data in HDFS, obtain the file's path in HDFS, and store the path together with the basic information in the dataInfo collection of the MultiSourceData database.
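The storage procedure just described can be sketched as follows. This is not the inventors' code, which the source does not reproduce. The field names (filename, hdfs_path, and the keys of info), the target directory /power_data, and the function name are hypothetical. The collection argument is expected to behave like a pymongo collection (find_one, insert_one) and hdfs_client like a client from the hdfs package (upload(hdfs_path, local_path)).

```python
import os

EXISTS_MESSAGE = "This record already exists in the database"


def store_data(collection, hdfs_client, local_path, info, hdfs_dir="/power_data"):
    """Store a file in HDFS and its basic information in MongoDB.

    `info` holds the basic information, e.g. data time, region, data type
    (hypothetical keys). Returns the HDFS path, or None for a duplicate.
    """
    filename = os.path.basename(local_path)
    hdfs_path = f"{hdfs_dir}/{filename}"
    record = {"filename": filename, **info, "hdfs_path": hdfs_path}

    # An equality filter on every field finds a fully matching document.
    if collection.find_one(record) is not None:
        print(EXISTS_MESSAGE)
        return None

    # Store the file first, then record where it lives. The intended path
    # is stored so that later duplicate checks compare like with like.
    hdfs_client.upload(hdfs_path, local_path)
    collection.insert_one(record)
    return hdfs_path
```

Uploading before inserting means a failed upload never leaves a metadata record that points at a missing file, which matches the order given in the description.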
Step 4: query and download the power infrastructure data. This embodiment implements the step in Python. The specific implementation is as follows:
Enter the basic information of the power infrastructure data that is needed, search the database for data records that satisfy it, and obtain the HDFS path of the data. Then use the HDFS path to download the data from the Hadoop Distributed File System to the local machine.
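The query-and-download procedure can be sketched in the same style. The function names and the hdfs_path field are hypothetical; collection is expected to behave like a pymongo collection (find) and hdfs_client like a client from the hdfs package (download(hdfs_path, local_path)).

```python
def find_records(collection, conditions):
    """Return all metadata documents that satisfy the query conditions."""
    return list(collection.find(conditions))


def download_record(hdfs_client, record, local_dir):
    """Download the file referenced by one metadata document.

    Uses the HDFS path stored in the document (hypothetical field name)
    and returns whatever the client reports as the local path.
    """
    return hdfs_client.download(record["hdfs_path"], local_dir)


def query_and_download(collection, hdfs_client, conditions, local_dir):
    """Query by basic information, then download every matching file."""
    return [
        download_record(hdfs_client, record, local_dir)
        for record in find_records(collection, conditions)
    ]
```

Only the small metadata documents pass through MongoDB here; the file content is streamed directly from HDFS, which is the source of the throughput gain the description claims.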
Compared with the prior art, the present invention builds on distributed database storage and uses HDFS to store files in a distributed manner. Files no longer need to be converted to binary data to be stored in the database and converted back into source files on retrieval, which improves data throughput. Power infrastructure data includes text data, remote sensing data, and various kinds of electric power thematic data.
It should be clear that the invention is not limited to the specific configurations and processes described above and shown in the figures. For brevity, detailed descriptions of known methods are omitted here. In the embodiments above, several specific steps are described and shown as examples. The method of the invention is not limited to these specific steps, however; after grasping the spirit of the invention, those skilled in the art can make various changes, modifications, and additions, or change the order of the steps.
It should also be noted that the exemplary embodiments mentioned in the present invention describe certain methods or systems in terms of a series of steps or devices. The invention is not limited to the order of those steps, however: the steps may be performed in the order given in the embodiments, in a different order, or with several steps performed simultaneously.
The above is merely a specific embodiment of the invention, and the protection scope of the invention is not limited to it. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the invention, and these modifications or substitutions shall fall within the protection scope of the invention.
Claims (6)
1. A distributed storage method for power infrastructure data, characterized in that the method uses a non-relational database to store basic file information and a distributed file system to store the power infrastructure data files in a distributed manner, and specifically includes:
Step 1: establish the Hadoop Distributed File System (HDFS) environment: first virtualize four nodes on a high-performance server, comprising one master node and three slave nodes, and then build the HDFS environment;
Step 2: install MongoDB, a database based on distributed document storage, and create a database for storing basic file information;
Step 3: store the basic information of the power infrastructure data in the database created in Step 2, and store the power infrastructure data itself in HDFS;
Step 4: query and download the power infrastructure data.
2. The distributed storage method for power infrastructure data according to claim 1, characterized in that Step 1 specifically includes:
Step 1.1: set the IP address of each virtual node, then configure the hosts file, which records the IP address of each node so that the master node can quickly locate and access every node, the hosts file being configured on every virtual node;
Step 1.2: create a dedicated user group and user for the Hadoop cluster and configure passwordless SSH login, so that the master node can access the three slave nodes securely over SSH without a password;
Step 1.3: download and unpack the Hadoop installation package and configure it, the Hadoop Distributed File System environment being complete once jps shows that every daemon has started successfully or once the cluster status can be viewed through the web interface.
3. The distributed storage method for power infrastructure data according to claim 1, characterized in that the basic file information in Step 2 includes the time, the region the data belongs to, the data type, and any other information intended to serve as a query condition.
4. The distributed storage method for power infrastructure data according to claim 1, characterized in that Step 3 first queries the MongoDB database with the basic file information obtained; if the record already exists, the operation terminates with the message "This record already exists in the database"; if no document with fully identical fields is found in the database, the data is first stored in HDFS, and the basic information together with the data's location in HDFS is then inserted into the MongoDB database as one document.
5. The distributed storage method for power infrastructure data according to claim 1 or 4, characterized in that Step 4 queries the MongoDB database for documents that satisfy the query conditions, and, if selected data needs to be downloaded, downloads it to the local machine using the HDFS path recorded in the document.
6. The distributed storage method for power infrastructure data according to any one of claims 1 to 5, characterized in that the Hadoop Distributed File System is deployed on four nodes virtualized on one high-performance server, comprising one master node and three slave nodes; the operating system is CentOS 6.8 and the network connection mode is NAT; the master node has 16 GB of memory and a 200 GB hard disk; and the slave1, slave2, and slave3 nodes each have 16 GB of memory and a 400 GB hard disk.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811167120.5A CN109471837A (en) | 2018-10-08 | 2018-10-08 | The distributed storage method of power infrastructures data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109471837A true CN109471837A (en) | 2019-03-15 |
Family
ID=65664733
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811167120.5A Pending CN109471837A (en) | 2018-10-08 | 2018-10-08 | The distributed storage method of power infrastructures data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109471837A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190315 |