Storage system for storing electric power big data
Technical Field
The utility model belongs to the technical field of electric power big data processing, and specifically relates to a storage system for storing electric power big data.
Background
Because the data sources accessed by an electric power big data platform are huge in volume and non-uniform in structure, the data retrieval and computing performance of each service module can be affected, so a power distribution and utilization big data application system needs a dedicated data storage structure.
The utility model with publication number CN 208477511U discloses a big data mining distributed storage system comprising a management network. The management network is connected to a cloud management server, a client and a memory device respectively; the memory device is connected to a cloud host; the cloud host is connected to a storage device; the storage device consists of a ROM memory, a mobile memory, a network memory and a hard disk; the storage device is connected to a storage network; and the storage network is connected to terminal equipment.
The big data mining distributed storage system has the following defects:
1. the storage device consists of a ROM (read-only memory), a mobile memory, a network memory and a hard disk, so its storage capacity is limited and it is difficult to meet the storage requirements of electric power big data;
2. data storage is controlled by the cloud management server, which places extremely high demands on the computing performance of that server, so the computing requirements of large-volume, non-uniformly structured electric power big data cannot be met.
Summary of the Utility Model
The purpose of the utility model is to overcome the above defects of the prior art by providing a storage system for storing electric power big data that meets the storage service demands of electric power big data with a huge data volume.
The purpose of the utility model can be achieved through the following technical solution:
A storage system for storing electric power big data comprises a distributed data acquisition cluster, a big data management cluster, a big data visualization cluster, a power distribution and utilization big data hybrid parallel computing and storage cluster, a power distribution and utilization big data high-performance computing cluster, a gigabit switch and a ten-gigabit switch. The distributed data acquisition cluster and the big data visualization cluster are both connected to the gigabit switch, and the distributed data acquisition cluster, the big data management cluster, the big data visualization cluster, the power distribution and utilization big data hybrid parallel computing and storage cluster and the power distribution and utilization big data high-performance computing cluster are all connected to the ten-gigabit switch.
Further, the distributed data acquisition cluster comprises four two-way servers.
Further, the big data management cluster comprises two-way servers.
Further, the big data visualization cluster comprises two four-way servers.
Further, the power distribution and utilization big data hybrid parallel computing and storage cluster comprises twenty-eight two-way servers.
Further, the power distribution and utilization big data high-performance computing cluster comprises two eight-way servers, three four-way servers and a GPU server.
Furthermore, the number of the gigabit switches is two, and the two gigabit switches are respectively connected with the distributed data acquisition cluster and the big data visualization cluster.
Further, the gigabit switch is a 48-port gigabit switch.
Further, the number of the ten-gigabit switches is two, and the two ten-gigabit switches are respectively connected with the distributed data acquisition cluster, the big data management cluster, the big data visualization cluster, the power distribution and utilization big data hybrid parallel computing and storage cluster and the power distribution and utilization big data high-performance computing cluster.
Further, the ten-gigabit switch is a 48-port ten-gigabit switch.
Compared with the prior art, the utility model has the following advantages:
(1) The storage system for storing electric power big data provided by the utility model is equipped with a power distribution and utilization big data hybrid parallel computing and storage cluster and a power distribution and utilization big data high-performance computing cluster, and opens communication between the two through the ten-gigabit switch, which ensures data exchange speed and improves data processing capacity and efficiency. The storage system is also equipped with a distributed data acquisition cluster, a big data management cluster and a big data visualization cluster to meet the demands of data acquisition and data services.
(2) The power distribution and utilization big data hybrid parallel computing and storage cluster of the utility model comprises twenty-eight two-way servers and can meet the storage requirements of large-volume electric power big data. The power distribution and utilization big data high-performance computing cluster comprises two eight-way servers, three four-way servers and a GPU server; the eight-way and four-way servers provide the processing capacity for electric power big data, and the GPU server provides the machine learning capacity for storage resource management, further improving the management efficiency of the stored data.
(3) The storage system for storing electric power big data provided by the utility model is equipped with a big data visualization cluster comprising two four-way servers, which meets the visual display and throughput demands of the data.
(4) The utility model connects the distributed data acquisition cluster and the big data visualization cluster through the gigabit switch, and connects every cluster in the storage system through the ten-gigabit switch, which ensures data exchange and sharing among the clusters, reflects a reasonable distribution of resources, and offers advantages such as stability and reliability.
Drawings
Fig. 1 is a schematic structural diagram of the storage system for storing electric power big data according to the utility model;
In the figure: 1, distributed data acquisition cluster; 2, big data management cluster; 3, big data visualization cluster; 4, power distribution and utilization big data hybrid parallel computing and storage cluster; 5, power distribution and utilization big data high-performance computing cluster; 6, gigabit switch; 7, ten-gigabit switch; 8, two-way server; 9, four-way server; 10, eight-way server; 11, GPU server.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. The embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
Example 1
As shown in Fig. 1, the present embodiment provides a storage system for storing electric power big data, including a distributed data acquisition cluster 1, a big data management cluster 2, a big data visualization cluster 3, a power distribution and utilization big data hybrid parallel computing and storage cluster 4, a power distribution and utilization big data high-performance computing cluster 5, a gigabit switch 6 and a ten-gigabit switch 7. The distributed data acquisition cluster 1 and the big data visualization cluster 3 are both connected to the gigabit switch 6, and the distributed data acquisition cluster 1, the big data management cluster 2, the big data visualization cluster 3, the power distribution and utilization big data hybrid parallel computing and storage cluster 4 and the power distribution and utilization big data high-performance computing cluster 5 are all connected to the ten-gigabit switch 7.
The distributed data acquisition cluster 1 comprises four two-way servers 8. The big data management cluster 2 includes two-way servers 8. The big data visualization cluster 3 comprises two four-way servers 9. The power distribution and utilization big data hybrid parallel computing and storage cluster 4 comprises twenty-eight two-way servers 8. The power distribution and utilization big data high-performance computing cluster 5 comprises two eight-way servers 10, three four-way servers 9 and a GPU server 11.
The number of the gigabit switches 6 is two, and the two gigabit switches 6 are respectively connected with the distributed data acquisition cluster 1 and the big data visualization cluster 3. The gigabit switch 6 is a 48-port gigabit switch.
The number of the ten-gigabit switches 7 is two, and the two ten-gigabit switches 7 are respectively connected with the distributed data acquisition cluster 1, the big data management cluster 2, the big data visualization cluster 3, the power distribution and utilization big data hybrid parallel computing and storage cluster 4 and the power distribution and utilization big data high-performance computing cluster 5. The ten-gigabit switch 7 is a 48-port ten-gigabit switch.
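The cluster composition and switch connections of this embodiment can be summarized as a small data model. The following Python sketch is illustrative only and is not part of the utility model: the names `Cluster`, `CLUSTERS`, `SWITCH_TIERS`, `shared_tiers` and `stated_server_count` are hypothetical, the two switches of each type are modeled as a single tier, and the server count of the big data management cluster 2 is left as `None` because the text does not state it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Cluster:
    ref: int                            # reference numeral used in Fig. 1
    name: str
    servers: Dict[str, Optional[int]]   # server type -> count (None = not stated in the text)


# Cluster composition as given in the embodiment.
CLUSTERS: Dict[int, Cluster] = {
    1: Cluster(1, "distributed data acquisition", {"two-way": 4}),
    2: Cluster(2, "big data management", {"two-way": None}),
    3: Cluster(3, "big data visualization", {"four-way": 2}),
    4: Cluster(4, "hybrid parallel computing and storage", {"two-way": 28}),
    5: Cluster(5, "high-performance computing",
               {"eight-way": 2, "four-way": 3, "GPU": 1}),
}

# Switch tier -> reference numerals of the clusters attached to it.
# Assumption: both physical switches of a tier serve the same clusters.
SWITCH_TIERS: Dict[str, frozenset] = {
    "gigabit": frozenset({1, 3}),
    "ten-gigabit": frozenset({1, 2, 3, 4, 5}),
}


def shared_tiers(a: int, b: int) -> List[str]:
    """Return the switch tiers to which clusters a and b are both attached."""
    return sorted(tier for tier, members in SWITCH_TIERS.items()
                  if a in members and b in members)


def stated_server_count(cluster_refs: Iterable[int]) -> int:
    """Sum the server counts the text states explicitly, skipping unstated ones."""
    return sum(count
               for ref in cluster_refs
               for count in CLUSTERS[ref].servers.values()
               if count is not None)
```

For example, `shared_tiers(4, 5)` shows that the storage cluster 4 and the high-performance computing cluster 5 communicate only over the ten-gigabit tier, while `shared_tiers(1, 3)` shows that the acquisition and visualization clusters are reachable over both tiers.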
The storage system for storing electric power big data provided by this embodiment is oriented to multi-source heterogeneous data, has hybrid parallel computing and storage capabilities, and is highly available and expandable. At the same time, communication between the distributed computing and storage cluster and the high-performance computing cluster is opened at the network layer, which ensures the data exchange speed.
The storage system for storing electric power big data provided by this embodiment can be integrated with distributed in-memory computing technology to design a big data software architecture oriented to power distribution and utilization. It can effectively support unified distributed cluster resource management, integrates various data mining and machine learning algorithms, provides good support for upper-layer applications such as power utilization behavior analysis and peak-shifting scheduling, and improves the execution efficiency of those applications.
Data sources are accessed to the big data platform through the distributed data acquisition cluster 1 in four interface modes to form source data, and the source data is then stored in the power distribution and utilization big data hybrid parallel computing and storage cluster 4 in the form of various basic data tables to form a data warehouse. The power distribution and utilization big data hybrid parallel computing and storage cluster 4 extracts data from the data warehouse and stores it multidimensionally to form data marts; in general, as many data marts are established as there are service applications. Data in both the data warehouse and the data marts can provide data services for each service application, and this embodiment provides two interface modes, jdbc and hbase api, to supply that data. For the data warehouse, a basic data table and a service data table can be associated through a common field, and data in a specific direction, such as a user's power consumption and power savings, can be queried by combining several tables. For the data marts, each service application uses its corresponding data mart and only needs to search the data in that mart directly.
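The difference between the two access paths described above can be illustrated with a small example. The following is a minimal sketch under stated assumptions, not the embodiment's implementation: it uses Python's built-in `sqlite3` module as a stand-in for the jdbc interface, and the table names (`base_user`, `svc_consumption`, `mart_district_month`), column names and sample rows are all hypothetical.

```python
import sqlite3


def build_demo(conn: sqlite3.Connection) -> None:
    """Create a toy warehouse (two tables) and one data mart derived from it."""
    cur = conn.cursor()
    # Warehouse: a basic data table and a service data table (hypothetical schemas).
    cur.execute("CREATE TABLE base_user (user_id INTEGER PRIMARY KEY, district TEXT)")
    cur.execute("CREATE TABLE svc_consumption (user_id INTEGER, month TEXT, kwh REAL)")
    cur.executemany("INSERT INTO base_user VALUES (?, ?)",
                    [(1, "north"), (2, "south")])
    cur.executemany("INSERT INTO svc_consumption VALUES (?, ?, ?)",
                    [(1, "2020-01", 120.0), (1, "2020-02", 100.0), (2, "2020-01", 80.0)])
    # Data mart: data extracted from the warehouse and stored along two
    # dimensions (district, month) for one service application.
    cur.execute("""
        CREATE TABLE mart_district_month AS
        SELECT b.district, s.month, SUM(s.kwh) AS total_kwh
        FROM base_user AS b
        JOIN svc_consumption AS s ON b.user_id = s.user_id
        GROUP BY b.district, s.month
    """)
    conn.commit()


def warehouse_query(conn: sqlite3.Connection, user_id: int) -> list:
    """Warehouse path: associate the two tables through the user_id field."""
    return conn.execute("""
        SELECT b.district, s.month, s.kwh
        FROM base_user AS b
        JOIN svc_consumption AS s ON b.user_id = s.user_id
        WHERE b.user_id = ?
        ORDER BY s.month
    """, (user_id,)).fetchall()


def mart_query(conn: sqlite3.Connection, district: str, month: str):
    """Mart path: search the pre-built mart directly, with no join."""
    row = conn.execute(
        "SELECT total_kwh FROM mart_district_month WHERE district = ? AND month = ?",
        (district, month),
    ).fetchone()
    return row[0] if row else None
```

After `build_demo(sqlite3.connect(":memory:"))`, the warehouse query joins two tables on every call, whereas the mart query is a single-table lookup, which is why each service application only needs to search its own mart.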
The data source formats collected by the distributed data acquisition cluster 1 include dmp, CIM and excel/txt. The power distribution and utilization big data hybrid parallel computing and storage cluster 4 processes metadata through web service, span, ftp and java analysis, and intercommunicates with the power distribution and utilization big data high-performance computing cluster 5. The big data management cluster 2 and the big data visualization cluster 3 are connected through the jdbc (interrupt) and hbase api interface modes to provide data services.
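As an illustration of how incoming files in the formats listed above might be recognized before processing, the following Python sketch classifies a file by its suffix. It is a hypothetical helper, not part of the embodiment: the suffix-to-format mapping is an assumption (in particular, treating `.xml` files as CIM), and the names `SUFFIX_TO_FORMAT` and `classify_source` do not come from the original text.

```python
from pathlib import PurePath

# Assumed suffix -> source format mapping (illustrative, not from the original text).
SUFFIX_TO_FORMAT = {
    ".dmp": "dmp",
    ".xml": "CIM",     # assumption: CIM models arrive as XML files
    ".xls": "excel",
    ".xlsx": "excel",
    ".txt": "txt",
}


def classify_source(filename: str) -> str:
    """Return the source format for a file name, or raise ValueError if unknown."""
    suffix = PurePath(filename).suffix.lower()
    try:
        return SUFFIX_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"unsupported source format: {filename!r}") from None
```

Rejecting unknown suffixes with an explicit error, rather than guessing, keeps unrecognized data out of the basic data tables until a format handler exists for it.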
The power distribution and utilization big data hybrid parallel computing and storage cluster 4 has an optimized storage function based on HDFS 2.5 with a built-in Inspur Erasure Code; a batch-processing framework function, for which this embodiment adopts the Map/Reduce2 framework; and a coordination service function, for which this embodiment adopts Zookeeper 3.4.5.
The power distribution and utilization big data high-performance computing cluster 5 has a resource management function, realized in this embodiment by YARN 2.5 with a built-in input Extension; a graph computation function; a machine learning function; a workflow processing function; a data integration function; a log collection function; and a full-text search function.
The big data management cluster 2 is provided with an interactive analysis engine, for which this embodiment adopts CC Hive including Apache Spark; a NoSQL database, for which this embodiment adopts CC Hbase; and a data mining function, for which this embodiment adopts the R language.
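The functions and components named in this embodiment can be collected into a simple lookup structure. The sketch below is illustrative only: the dictionary `STACK` and the helpers `clusters_providing` and `unnamed_functions` are hypothetical names, component strings are copied from the text, and a value of `None` marks a function for which the text names no specific component.

```python
from typing import Dict, List, Optional

# Cluster -> function -> component named in the embodiment (None = none named).
STACK: Dict[str, Dict[str, Optional[str]]] = {
    "hybrid parallel computing and storage cluster": {
        "optimized storage": "HDFS 2.5",
        "batch processing framework": "Map/Reduce2",
        "coordination service": "Zookeeper 3.4.5",
    },
    "high-performance computing cluster": {
        "resource management": "YARN 2.5",
        "graph computation": None,
        "machine learning": None,
        "workflow processing": None,
        "data integration": None,
        "log collection": None,
        "full-text search": None,
    },
    "big data management cluster": {
        "interactive analysis engine": "CC Hive",
        "NoSQL database": "CC Hbase",
        "data mining": "R",
    },
}


def clusters_providing(function: str) -> List[str]:
    """Return the clusters that list the given function."""
    return [cluster for cluster, functions in STACK.items() if function in functions]


def unnamed_functions() -> List[str]:
    """Return, in sorted order, the functions for which no component is named."""
    return sorted(name
                  for functions in STACK.values()
                  for name, component in functions.items()
                  if component is None)
```

A call such as `clusters_providing("resource management")` returns the cluster responsible for that function, and `unnamed_functions()` lists the functions whose concrete component would still need to be chosen in a deployment.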
The foregoing has described in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions that can be obtained by a person skilled in the art through logic analysis, reasoning or limited experiments based on the prior art according to the concepts of the present invention should be within the scope of protection defined by the claims.